Playing around with privately hosted LLM models

February 17, 2025 math and physics play No comments , , , ,

I have a personal use copy of Mathematica, and wanted to play with the new LLM functionality (LLMConfiguration, LLMFunction, …)

It turns out that this functionality is doubly useless.  Wolfram wants to charge you extra to use it with their models, and it fails with the chatGPT models, because openAI also requires a subscription payment tier.

However, I found that you can install and run local LLM models fairly easily.  Using ollama, it was fairly trivial to install and run the llama2 and mixtral models.  I was surprised just how easy it is to get an LLM running on your own hardware.

The steps, if I remember them properly, was just:

  • Install ollama
  • ollama pull llama2
  • ollama run llama2

Once you have the model’s installed, you can check that the model(s) are installed, and how much storage they consume:

# ollama list
NAME              ID              SIZE      MODIFIED     
mixtral:latest    a3b6bef0f836    26 GB     31 hours ago    
llama2:latest     78e26419b446    3.8 GB    32 hours ago    

Here’s an example of a command line session with the mixtral model:

# ollama run mixtral
>>> Show me Mathematica code to solve the linear system: a x + b y = c, d x + e y = f
Here is some Mathematica code that solves the linear system you have given:

```
a = 2; b = 3; c = 4;
d = 5; e = 6; f = 7;

sol = Solve[{a x + b y == c, d x + e y == f}, {x, y}]

(* {{x -> -(1/2) - (3 f)/14, y -> 2/7 + (5 c)/14}} *)
```

Incidentally, the llama2 model gets this wrong, and suggests:

solve {a*x + b*y == c, d*x + e*y == f}

which is malformed. The mixtral model needlessly includes specific values for all the matrix elements, but can be coached not to do so.

One nice thing about ollama is that it provides a rest server interface, so you can query it programatically. Here’s some sample python code cobbled together to do just that:

#! /usr/bin/env python3

import requests
import json
import sys
import os
import argparse
import re
from pathlib import Path

#MODEL = 'mixtral'
MODEL = 'llama2'
CACHE_FILE = ''

def load_cache():
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, 'r') as f:
            return json.load(f)
    return []

def save_cache(context):
    with open(CACHE_FILE, 'w') as f:
        json.dump(context, f)

def query_ollama(prompt, context):
    response = requests.post(
        'http://localhost:11434/api/generate',
        json={'model': MODEL, 'prompt': prompt, 'context': context, 'stream': False}
    ).json()

    return response

def main():
    global MODEL, CACHE_FILE

    # Set up argument parser
    parser = argparse.ArgumentParser(description="Query the Ollama Mixtral LLM model.")
    parser.add_argument('--clean', action='store_true', help="Start with an empty context list (no cache).")
    parser.add_argument('--model', type=str, default=MODEL, help="Specify an alternative model.")
    parser.add_argument('--cache', type=str, default='context_cache', help="context cache path component.")
    parser.add_argument('--query', type=str, default='', help="multiline query in a file")

    args = parser.parse_args()

    MODEL = args.model
    print(f"Using model: {MODEL}")

    cache = args.cache;
    CACHE_FILE = os.path.expanduser(f'~/.llmq.{cache}.{MODEL}.json')
    print(f"Context cache file: {CACHE_FILE}")

    # Initialize context based on command-line options
    if args.clean:
        context = []
    else:
        context = load_cache()

    while True:
        if args.query:
            prompt = Path(args.query).read_text().strip()
        else:
            # Read input from stdin (interactive prompt)
            prompt = input("Enter your query: ")

            # Check if the prompt matches exit or quit
            if re.match(r'^(exit|quit)$', prompt.lower()):
                print("Exiting...")
                break

        # Query the model and get the response
        response = query_ollama(prompt, context)
        rr = response.get('response', 'No response key in API output')
        print(f"Response: {rr}")

        # Update context (for context propagation)
        context = response.get('context', [])
        save_cache(context)

        if args.query:
            break

if __name__ == "__main__":
    main()

# vim: et ts=4 sw=4

This code initially queries the rest server with an empty list context, but has a mechanism to save context once returned. Here is a sample session with the script:

# llmq.py --model mixtral --cache mathematica 
Using model: mixtral
Context cache file: /Users/pjoot/.llmq.mathematica.mixtral.json
Enter your query: show me code to plot sin(x), and 2 sin(x) cos(x) in a single plot with different colors
Response: 	To plot both `Sin[x]` and `2 Sin[x] Cos[x]` in the same plot with different colors, you can use the `Plot` function in Mathematica as follows:
```mathematica
Plot[{Sin[x], 2*Sin[x]*Cos[x]}, {x, -Pi, Pi}, PlotStyle -> {Blue, Red}]
```
This will create a plot with `Sin[x]` in blue and `2 Sin[x] Cos[x]` in red. The x-axis ranges from `-Pi` to `Pi`. The `Plot` function automatically chooses appropriate tick marks and axis labels, but you can customize these using additional options if desired.
Note that the option `PlotStyle -> {Blue, Red}` specifies that the first curve should be plotted in blue and the second curve in red. You can change the colors by modifying this option.[br]
You can also add labels and title to your plot with the following code:
```mathematica
Plot[{Sin[x], 2*Sin[x]*Cos[x]}, {x, -Pi, Pi}, PlotStyle -> {Blue, Red}, PlotLabel -> "Plots of sin(x) and 2 sin(x) cos(x)", AxesLabel -> {"x", ""}]
```
This will create a plot with the title "Plots of sin(x) and 2 sin(x) cos(x)" and labels "x" on the x-axis.[br]
You can also add a legend to your plot by adding the option `PlotLegends -> {"sin(x)", "2 sin(x)cos(x)"}`:
```mathematica
Plot[{Sin[x], 2*Sin[x]*Cos[x]}, {x, -Pi, Pi}, PlotStyle -> {Blue, Red}, PlotLabel -> "Plots of sin(x) and 2 sin(x) cos(x)", AxesLabel -> {"x", ""}, PlotLegends -> {"sin(x)", "2 sin(x)cos(x)"}]
```
Enter your query: Now I'd like to plot the gamma function in the complex plane.  What is a good way to visualize that?
Response: 	Visualizing the Gamma function on the complex plane can be done using a 3D surface plot, where the real and imaginary parts of the input argument are the x-axis and y-axis, respectively, and the output value (the magnitude of the Gamma function) is the z-axis. Here's an example code in Mathematica to create such a plot:
```mathematica
ComplexPlot3D[Gamma[x + I*y], {x, -5, 5}, {y, -5, 5}]
```
This creates a 3D surface plot of the Gamma function on the complex plane for the range `-5 <= Re(z) <= 5` and `-5 <= Im(z) <= 5`. The resulting plot shows the magnitude of the Gamma function as a surface in 3D space, with contour lines indicating the values of the function.

Alternatively, you can use the following code to create a density plot of the absolute value of the Gamma function on the complex plane:
```mathematica
ComplexPlot[Abs[Gamma[x + I*y]], {x, -5, 5}, {y, -5, 5}]
```
This creates a 2D density plot of the absolute value of the Gamma function on the complex plane for the same range as before. The resulting plot shows the magnitude of the Gamma function using a color scale, with darker colors indicating higher values and lighter colors indicating lower values. This can be useful to visualize the behavior of the Gamma function in different regions of the complex plane.

Note that the Gamma function has poles at non-positive integers, so the plot will show singularities at those points. You may want to exclude these points from your plot by using `Exclusions -> {Re[z] <= 0, Im[z] == 0, Re[z] == Integer}` in the `ComplexPlot3D` function or `Exclusions -> {Re[z] <= 0, Im[z] == 0}` in the `ComplexPlot` function.
Enter your query: that produced the error: Options expected (instead of {y,-5,5}) beyond position 2 in ...
Response: 	I apologize for the mistake in my previous response. Here's the corrected code to create a density plot of the absolute value of the Gamma function on the complex plane:
```mathematica
ComplexPlot[Abs[Gamma[x + I*y]], {x, -5, 5}, {y, -5, 5}]
```
This creates a 2D density plot of the absolute value of the Gamma function on the complex plane for the range `-5 <= Re(z) <= 5` and `-5 <= Im(z) <= 5`. The resulting plot shows the magnitude of the Gamma function using a color scale, with darker colors indicating higher values and lighter colors indicating lower values. This can be useful to visualize the behavior of the Gamma function in different regions of the complex plane.

Again, I apologize for any confusion my previous response may have caused.
Enter your query: from the manual it appears that the plot range should be complex: like this : {z, -2 - 2 I, 2 + 2 I}
Response: 	You're absolutely right! Thank you for bringing that up. When creating a `ComplexPlot` or `ComplexPlot3D` plot in Mathematica, it is indeed necessary to specify the range of complex values as a pair of complex numbers. Here's an example code for a 3D surface plot of the Gamma function on the complex plane using the correct syntax:
```mathematica
ComplexPlot3D[Gamma[z], {z, -2 - 2 I, 2 + 2 I}]
```
This creates a 3D surface plot of the Gamma function on the complex plane for the range `-2 - 2 I <= z <= 2 + 2 I`. The resulting plot shows the magnitude of the Gamma function as a surface in 3D space, with contour lines indicating the values of the function.

Thank you for pointing out this important distinction!

Testing the LLM output: Complex plots of the gamma function (2D and 3D).

Like chatGPT, the model can get things wrong, and needs coaching, but it is still impressive to me that you can run this locally on generic hardware (although, this was done on a well specced macbook pro.) I don’t really intend to use this extensively — I was really just playing around.  Next I’d like to try on my windows laptop (equipped with a little Nivida 20XX model GPU) and see how that compares to the mac.  Is there really a good use case for me to do this on my own hardware — probably not, but it’s really cool!

Just for fun: a little C program to print Pascal’s triangle

February 13, 2025 C/C++ development and debugging. No comments , , , , , , ,

Using the recurrence relation derived in the previous post, here’s a little C program to print Pascal’s triangle.  Why?  Because I felt like it.

/*
 * Print Pascal's triangle up to a given size.
 *
 * Usage: pascalsTriangle [n]
 *
 * n = 10 by default.
 *
 * Example:
 * ./pascalsTriangle 6
         1
        1  1
      1  2  1
     1  3  3  1
   1  4  6  4  1
  1  5 10 10  5  1
 */
#include <stdio.h>
#include <stdlib.h>

/** For the last row of Pascal's triangle that we will print, figure out how many digits is required for one of the central values.
 *
 * This function and printrow both use a recurrence relationship to compute the binomial coefficient:
 *
 * \binom{n,k} = \binom{n,k-1} (n - k + 1)/k
 *
 */
static int biggest( int n ) {
   int binom = 1;

   for ( int k = 1 ; k < n/2 ; k++ ) {
      binom *= (n - k + 1);
      binom /= k;
   }

   int len = snprintf( NULL, 0, "%d", binom );
   return len;
}

/// Print a row of Pascal's triangle
static void printrow( int n, int m, int spaces ) {
   int binom = 1;
   int leading = (m - n - 1) * spaces/2;

   printf( "%*s%*d", leading, "", spaces, 1 );

   for ( int k = 1 ; k < n ; k++ ) {
      binom *= (n - k + 1);
      binom /= k;

      printf( "%*d", spaces, binom );
   }

   if ( n > 0 ) {
      printf( "%*d", spaces, 1 );
   }
   printf( "\n" );
}

int main( int argc, char ** argv ) {
   int m = 10;

   if ( argc == 2 ) {
      m = atoi( argv[1] );
   }

   if ( m <= 0 ) {
      fprintf( stderr, "invalid size: %d\n", m );
      return 1;
   }

   int spaces = biggest( m ) + 1;

   for ( int n = 0 ; n < m ; n++ ) {
      printrow( n, m, spaces );
   }

   return 0;
}

// vim: et ts=3 sw=3

This was just for fun, but I was pleased with the way that I easily computed the required spacing to make it all line up nicely (at least when using a monospaced font.)

 

I didn’t thoroughly test the output for correctness, but it looks right, and I didn’t find any errors spot checking using “sum of elements above”.

Pascal’s triangle: binomial coefficients, value and recurrence relation

February 12, 2025 math and physics play No comments , , ,

[Click here for a PDF version of this post]

I saw an interesting solution of an introductory programming source sample problem, showing how to display Pascal’s triangle up to a given size. For example, given input \( n = 5 \), the output should be like:
\begin{equation*}
\begin{array}{c}
1 \\
1 \quad 1 \\
1 \quad 2 \quad 1 \\
1 \quad 3 \quad 3 \quad 1 \\
1 \quad 4 \quad 6 \quad 4 \quad 1 \\
1 \quad 5 \quad 10 \quad 10 \quad 5 \quad 1
\end{array}
\end{equation*}

One kind of neat property of Pascal’s triangle is that any non-unity number on the triangle, is the sum of the two numbers above it. That’s one of the ways that I’d imagine a student would solve the programming problem of displaying this triangle.

Another way would be to use a formula for the binomial coefficients. Recall that

Theorem 1.1: Binomial theorem for integer powers.

Given integer \( n \), and any values \( a, b \) that commute under multiplication
\begin{equation*}
\lr{ a + b }^n = \sum_{k = 0}^n \binom{n}{k} a^k b^{n-k},
\end{equation*}
where
\begin{equation*}
\binom{n}{k} = \frac{n!}{k!\lr{n-k}!}.
\end{equation*}

These \( \binom{n}{k} \) values are the binomial coefficients, and are, in fact the values in Pascal’s triangle (we will show this in a bit.)

What surprised me about the programming solution that I saw was that it didn’t use the binomial coefficients. Instead if appeared to use a recurrence relationship, which was interesting, because I didn’t remember any such relationship. We’ll also find that recurrence relationship below, which allows for evaluation of the binomial coefficients without evaluating any of the factorials.

Combinatoric argument for the binomial coefficients

Let’s look at an explicit expansion of the LHS of the binomial theorem for the first few values of \( n \), starting at \(2\).
\begin{equation}\label{eqn:pascals:20}
\lr{ H + T }^2 = \lr{ H + T }\lr{ H + T} = H H + H T + T H + T T.
\end{equation}
We see that we have all possible combinations of \( H, T \) in the final expression, but because of assumed commutation, there is some redundancy, and we can write
\begin{equation}\label{eqn:pascals:40}
\lr{ H + T }^2 = H^2 + 2 H T + T T.
\end{equation}
If we enumerate all possible values for two flips of a coin, there can only be one \( H, H \) pair, there will be two \( H, T \) pairs, and one \( T, T \) pair. These numbers are the second row of Pascal’s triangle: \( 1, 2, 1 \).

Similarly, for \( n = 3 \), we have
\begin{equation}\label{eqn:pascals:60}
\begin{aligned}
\lr{ H + T }^3
&= \lr{ H H + H T + T H + T T } \lr{ H + T } \\
&= H H H + H T H + T H H + T T H + H H T + H T T + T H T + T T T.
\end{aligned}
\end{equation}
This time, after grouping, we have
\begin{equation}\label{eqn:pascals:80}
\lr{ H + T }^3 = H^3 + 3 H^2 T + 3 H T^2 + T^3.
\end{equation}
If we enumerate all possible values for three flips of a coin, there can only be one \( H, H, H \) triplet, there will be three \( H, H, T \) triplets, three \( T, T, H \) triplets, and one \( T, T, T \) triplet. These numbers are the third row of Pascal’s triangle: \( 1, 3, 3, 1 \).

To understand the binomial coefficient, we have to consider how to pick indistinguishable items from a set. Suppose that we want to pick three items from a set of 7, in a specific order. There are 7 ways to pick the first item, 6 ways to pick the second, and 5 ways to pick the third. We can express that as

\begin{equation}\label{eqn:pascals:120}
P(7,3) = 7 \times 6 \times 5 = \frac{7!}{4!} = \frac{7!}{\lr{7-3}!},
\end{equation}
or more generally, to pick \( k \) distinguishable items from a set of \( n \)
\begin{equation}\label{eqn:pascals:140}
P(n,k) = \frac{n!}{\lr{n-k}!}.
\end{equation}
In our \( \lr{H + T}^n \) expansion, we cannot distinguish any of the elements. For each \( k \) items we pick, there are \( k! \) ways of arranging those, so the binomial coefficient is
\begin{equation}\label{eqn:pascals:160}
\binom{n}{k} = \frac{P(n,k)}{k!} = \frac{n!}{k!\lr{n-k}!},
\end{equation}
as stated earlier.

Recurrence relation

For the recurrence relation, let’s assume that we have already found \( \binom{n}{k} \), let’s see if we can use that to find the next on the line. That is, for \( k < n \)
\begin{equation}\label{eqn:pascals:100}
\begin{aligned}
\binom{n}{k + 1}
&= \frac{n!}{\lr{k+1}!\lr{n-(k+1)}!} \\
&= \frac{n!}{\lr{k+1} k!\lr{n – k – 1)}!} \\
&= \frac{n! \lr{ n – k }}{\lr{k+1} k!\lr{n – k)}!} \\
&= \binom{n}{k} \frac{n – k}{k + 1}.
\end{aligned}
\end{equation}

This is a kind of cool result, and is the method that I saw used in the solution to the Pascal’s triangle program. We can start with the 1 value at the left of any row, and with a single multiply and divide, figure out the next entry in the triangle, and proceed from there.

Pascal’s triangle.

For the Pascal’s triangle property to hold, we must have, for any \( k \ne 0 \)
\begin{equation}\label{eqn:pascals:180}
\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k},
\end{equation}
as illustrated in fig. 1.

fig. 1. Pascal’s triangle, sum of elements above.

It’s fairly straightforward to verify that this property holds.
\begin{equation}\label{eqn:pascals:200}
\begin{aligned}
\binom{n-1}{k-1} + \binom{n-1}{k}
&=
\frac{\lr{n-1}!}{\lr{k-1}!\lr{n-1-\lr{k-1}}!}
+
\frac{\lr{n-1}!}{\lr{k}!\lr{n-1-k}!} \\
&=
\frac{\lr{n-1}!}{\lr{k-1}!} \lr{ \inv{\lr{n-k}!} + \inv{k \lr{n-1-k}!} } \\
&=
\frac{\lr{n-1}!}{\lr{k-1}!\lr{n-1-k}!} \lr{ \inv{n-k} + \inv{k} } \\
&=
\frac{\lr{n-1}!}{\lr{k-1}!\lr{n-1-k}!} \frac{k + n – k }{k \lr{ n – k } } \\
&=
\frac{n!}{k!\lr{n-k}!} \\
&=
\binom{n}{k},
\end{aligned}
\end{equation}
as expected.

Electric fields for ring and disk charge distributions (along the central axis.)

February 5, 2025 math and physics play , , , , , , , ,

[Click here for a PDF version of this post]

Motivation: explain field formulas in an engineering problem set.

The following equations were given in a problem set for the magnitude of the electric field strength at a point \( z \) due to symmetric ring and disk charge distributions respectively:
\begin{equation}\label{eqn:ringAndDiskField:20}
E = \frac{q z}{4 \pi \epsilon_0 \lr{ z^2 + R^2 }^{3/2} },
\end{equation}
\begin{equation}\label{eqn:ringAndDiskField:40}
E = \frac{\sigma}{2 \epsilon_0} \lr{ 1 – \frac{ z }{\sqrt{ z^2 + R^2 }} }.
\end{equation}

While these were given as black box results to use in a few superposition problems, there isn’t actually any magic required to find these, provided one knows how to perform some line and area integrals.

Electric field due to charge distributions.

Assuming that we are talking about stationary charges, we have only an electric field. Recall that the field, measured at a point \( \Bx \) for a single stationary point situated at point \( \Bx’ \) is
\begin{equation}\label{eqn:ringAndDiskField:60}
\BE(\Bx) = \inv{4 \pi \epsilon_0} \frac{ q \lr{ \Bx – \Bx’ }}{\Norm{\Bx – \Bx’}^3 }.
\end{equation}
If we want to find the field due to charges \( q_k \) at points \( \Bx_k \), then we just sum over all those charges
\begin{equation}\label{eqn:ringAndDiskField:80}
\BE(\Bx) = \inv{4 \pi \epsilon_0} \sum_k \frac{ q_k \lr{ \Bx – \Bx_k }}{\Norm{\Bx – \Bx_k}^3 }.
\end{equation}
This sum can be extended to a continuous charge density. If \( \rho(\Bx) \) represents the charge per unit volume situated at point \( \Bx \), then the field due to that continuous charge density is
\begin{equation}\label{eqn:ringAndDiskField:100}
\BE(\Bx) = \inv{4 \pi \epsilon_0} \int dV’ \frac{ \rho(\Bx’) \lr{ \Bx – \Bx’ }}{\Norm{\Bx – \Bx’}^3 }.
\end{equation}
Similarly, if \( \sigma(\Bx) \) represents the charge per unit area for at point \( \Bx \), then the field due to that continuous surface charge density is
\begin{equation}\label{eqn:ringAndDiskField:120}
\BE(\Bx) = \inv{4 \pi \epsilon_0} \int dA’ \frac{ \sigma(\Bx’) \lr{ \Bx – \Bx’ }}{\Norm{\Bx – \Bx’}^3 }.
\end{equation}
Finally, if \( \lambda(\Bx) \) represents the charge per unit length situated at point \( \Bx \), then the field due to that continuous linear charge density is
\begin{equation}\label{eqn:ringAndDiskField:140}
\BE(\Bx) = \inv{4 \pi \epsilon_0} \int dl’ \frac{ \lambda(\Bx’) \lr{ \Bx – \Bx’ }}{\Norm{\Bx – \Bx’}^3 }.
\end{equation}
At least theoretically, given any set of charge distributions, it’s just a matter of integration to find the corresponding electric field at any point. It happens that these integrals can be tricky, but we can get lucky when the charge configurations have special symmetries, as in the problem set formulas.

Ring charge distribution.

Let’s compute the electric field at a point along the axis of a ring charge distribution. I’ll use \( \setlr{ \Be_1, \Be_2, \Be_3 } \) to represent the standard basis direction vectors. We are considering the geometry of fig. 1.

fig. 1. Circular ring charge distribution.

The point on the central axis can be written as
\begin{equation}\label{eqn:ringAndDiskField:160}
\Bx = z \Be_3,
\end{equation}
and the points on the ring are
\begin{equation}\label{eqn:ringAndDiskField:180}
\Bx'(\theta) = R \lr{ \Be_1 \cos\theta + \Be_2 \sin\theta },
\end{equation}
then
\begin{equation}\label{eqn:ringAndDiskField:200}
\Bx – \Bx’ = z \Be_3 – R \lr{ \Be_1 \cos\theta + \Be_2 \sin\theta },
\end{equation}
and
\begin{equation}\label{eqn:ringAndDiskField:220}
\Norm{\Bx – \Bx’}^2 = z^2 + R^2.
\end{equation}
Since \( R d\theta \) is the length of a segment of the ring, the charge on that segment is \( R d\theta \lambda \), and the field is
\begin{equation}\label{eqn:ringAndDiskField:240}
\begin{aligned}
\BE(\Bx)
&= \inv{4 \pi \epsilon_0} \int_{\theta =0}^{2 \pi} R d\theta \lambda \frac{z \Be_3 – R \lr{ \Be_1 \cos\theta + \Be_2 \sin\theta }}{\lr{ z^2 + R^2 }^{3/2} } \\
&= \inv{4 \pi \epsilon_0} \frac{2 \pi R z \Be_3 }{ \lr{ z^2 + R^2 }^{3/2} }.
\end{aligned}
\end{equation}
Observe that the integral of the \( \sin\theta \) and \( \cos\theta \) components are zero, since we are integrating over a full period. Finally, since the total charge is \( q = 2 \pi R \lambda \), we have
\begin{equation}\label{eqn:ringAndDiskField:260}
\BE(0, 0, z) = \inv{4 \pi \epsilon_0} \frac{q z \Be_3 }{ \lr{ z^2 + R^2 }^{3/2} }.
\end{equation}
This recovers the equation from the problem set, with the only difference being the explicit inclusion of the direction vector, which was implied in the problem statement.

Field of a disk.

The setup above can be used to state the integral to solve for the field of a disk. The area of a segment of the disk is \( r dr d\theta \), so the charge of that fragment is \( r dr d\theta \sigma \), so the field is
\begin{equation}\label{eqn:ringAndDiskField:280}
\begin{aligned}
\BE(\Bx)
&= \inv{4 \pi \epsilon_0} \int_{r = 0}^R \int_{\theta =0}^{2 \pi} r dr d\theta \sigma \frac{z \Be_3 – r \lr{ \Be_1 \cos\theta + \Be_2 \sin\theta }}{\lr{ z^2 + r^2 }^{3/2} } \\
&= \frac{\sigma z \Be_3}{2 \epsilon_0} \int_{r = 0}^R dr \frac{r}{\lr{ z^2 + r^2 }^{3/2} }.
\end{aligned}
\end{equation}
As before, when evaluating the \( \theta \) integral, the sinusoidal components are killed. Since \( \int_0^{2 \pi} d\theta = 2 \pi \), we have a nice cancellation of \( 2 \pi \) downstairs.
Finally, provided \( z \ne 0 \),
\begin{equation}\label{eqn:ringAndDiskField:300}
\int_{r = 0}^R dr \frac{r}{\lr{ z^2 + r^2 }^{3/2} } = \inv{\Abs{z}} – \inv{\sqrt{ z^2 + R^2 }},
\end{equation}
so the electric field is
\begin{equation}\label{eqn:ringAndDiskField:320}
\BE(0, 0, z) = \frac{ \sigma }{2 \epsilon_0 } \lr{ \mathrm{sgn}(z) – \frac{z}{\sqrt{z^2 + R^2}} },
\end{equation}
where \( \mathrm{sgn}(z) \) is the sign of \( z \). We see that the equation provided in the problem set is actually only true for \( z > 0 \), and needs a sign correction otherwise.

A Green’s function solution to falling with resistance problem.

January 30, 2025 math and physics play , , , , , , , ,

[Click here for a PDF version of this post]

Motivation.

In a fun twitter/x post, we have a Green’s function solution to a constant acceleration problem with drag. The post is meant to be a joke, as the stated problem is: “A boy drops a ball from a height \( h \). What is the speed of the ball when it reaches the floor (no drag)?”

The joke is that nobody would solve this problem using Green’s functions, and nobody would solve this function using Green’s functions for the more general case, allowing for drag. Instead, you’d just solve this using energy balance, which makes the problem trivial.

That said, there are actually lots of cool ideas in the Green’s function method on the joke side of the solution.

So let’s play along with the joke and solve the general damped problem with Green’s functions. Along the way, we can fill in the missing details, and also explore some supplemental ideas that are worth understanding.

Setup.

The equation of motion is
\begin{equation}\label{eqn:greensDropWithResistance:20}
m \frac{d^2 \Bx}{dt^2} = – \gamma \frac{d \Bx}{dt} – m \Bg,
\end{equation}
where \( \Bg \) is a constant (positively oriented) force. The first detail that needs to be included, is that this isn’t the differential equation for the stated problem, and will become problematic should we attempt to apply Green’s function methods. We have to account for the “boy drops” part of the problem statement, and solve with a different forcing function, namely
\begin{equation}\label{eqn:greensDropWithResistance:40}
m \frac{d^2 \Bx}{dt^2} = – \gamma \frac{d \Bx}{dt} – m \Bg \Theta(t).
\end{equation}
This revised model of the system begins the application of the constant (gravitational) force, at time \( t = 0 \). This is now a system that will yield to Green’s function methods.

Fourier transform solution.

The joke solution has strong hints that Fourier transform methods were part of the story. In particular, it appears that the following definitions of the transform pair were used
\begin{equation}\label{eqn:greensDropWithResistance:60}
\begin{aligned}
\hatU(\omega) = F(u(t)) &= \int_{-\infty}^\infty u(t) e^{-i\omega t} dt \\
u(t) = F^{-1}(\hatU(\omega)) &= \inv{2\pi} \int_{-\infty}^\infty \hatU(\omega) e^{i\omega t} d\omega.
\end{aligned}
\end{equation}
However, if we are using Fourier transforms, why bother with Green’s functions? Instead, we can just solve for the system response using Fourier transforms. When looking for the system response, we usually pose the problem with more generality. For example, instead of the specific theta-weighted constant gravitational forcing function above, we seek to find the solution of
\begin{equation}\label{eqn:greensDropWithResistance:80}
m \frac{d^2 \Bx}{dt^2} + \gamma \frac{d \Bx}{dt} = \BF(t).
\end{equation}
We start by assuming that the Fourier transforms of \( \Bx(t), \BF(t) \) are \( \hat{\BX}(\omega), \hat{\BF}(\omega) \) so
\begin{equation}\label{eqn:greensDropWithResistance:100}
\Bx(t) = \inv{2\pi} \int_{-\infty}^\infty e^{i\omega t} \hat{\BX}(\omega) d\omega.
\end{equation}
Derivatives of this presumed Fourier representation are trivial
\begin{equation}\label{eqn:greensDropWithResistance:120}
\begin{aligned}
\Bx'(t) &= \inv{2\pi} \int_{-\infty}^\infty \lr{ i\omega } e^{i\omega t} \hat{\BX}(\omega) d\omega \\
\Bx”(t) &= \inv{2\pi} \int_{-\infty}^\infty \lr{ i\omega }^2 e^{i\omega t} \hat{\BX}(\omega) d\omega,
\end{aligned}
\end{equation}
so the frequency representation of our system is
\begin{equation}\label{eqn:greensDropWithResistance:140}
\inv{2\pi} \int_{-\infty}^\infty \lr{ m \lr{ i\omega }^2 + \gamma \lr{ i\omega} } e^{i\omega t} \hat{\BX}(\omega) d\omega
=
\inv{2\pi} \int_{-\infty}^\infty e^{i\omega t} \hat{\BF}(\omega) d\omega,
\end{equation}
or
\begin{equation}\label{eqn:greensDropWithResistance:160}
\hat{\BX}(\omega) = \frac{\hat{\BF}(\omega)}{-m \omega^2 + i \omega \gamma}.
\end{equation}
We now only have to inverse Fourier transform to find a solution, namely
\begin{equation}\label{eqn:greensDropWithResistance:180}
\begin{aligned}
\Bx(t)
&= \inv{2\pi} \int_{-\infty}^\infty e^{i\omega t} \frac{\hat{\BF}(\omega)}{-m \omega^2 + i \omega \gamma} d\omega \\
&= \inv{2\pi} \int_{-\infty}^\infty e^{i\omega t} \frac{1}{-m \omega^2 + i \omega \gamma} d\omega
\int_{-\infty}^\infty \BF(t’) e^{-i \omega t’} dt’ \\
&= \int_{-\infty}^\infty \lr{ -\inv{2\pi} \int_{-\infty}^\infty \frac{ e^{i\omega (t-t’)} }{m \omega^2 – i \omega \gamma} d\omega }F(t’) dt’,
\end{aligned}
\end{equation}
or
\begin{equation}\label{eqn:greensDropWithResistance:200}
\Bx(t) = \int_{-\infty}^\infty G(t – t’) \BF(t’) dt’,
\end{equation}
where
\begin{equation}\label{eqn:greensDropWithResistance:220}
G(\tau) = -\inv{2\pi} \int_{-\infty}^\infty \frac{ e^{i\omega \tau} }{\omega\lr{ m \omega – i \gamma}} d\omega.
\end{equation}

We’ve been fast and loose above, swapping order of integration without proper justification, and have assumed that all Fourier transforms and inverse transforms exist. Given all those assumptions, we now have a general solution for the system, requiring only the convolution of our driving force \( F(t) \) with the system response function \( G(t) \). The only caveat is that we have to be able to perform the integral for the system response function, and that integral does not exist.

There are lots of integrals that do not strictly exist when playing the fast and loose physicist game with Fourier transforms. One such example can be found by looking at any transform pair. For example, given \( u(t) = F^{-1}(\hatU(\omega)) \), we have
\begin{equation}\label{eqn:greensDropWithResistance:240}
\begin{aligned}
u(t)
&= \inv{2\pi} \int_{-\infty}^\infty \hatU(\omega) e^{i\omega t} d\omega \\
&= \inv{2\pi} \int_{-\infty}^\infty \lr{ \int_{-\infty}^\infty u(t’) e^{-i\omega t’} dt’ } e^{i\omega t} d\omega \\
&= \int_{-\infty}^\infty u(t’) \lr{ \inv{2\pi} \int_{-\infty}^\infty e^{i\omega (t-t’)} d\omega } dt’.
\end{aligned}
\end{equation}
This is exactly the sort of integration order swapping that we did to find the system response function above, and we are left with a statement that \( f(t) \) is the convolution of \( f(t) \), with another, also non-integrable, convolution kernel. Any physics student will recognize that kernel as a representation of the Dirac delta function, and without blinking, would just write
\begin{equation}\label{eqn:greensDropWithResistance:260}
\delta(\tau) = \inv{2\pi} \int_{-\infty}^\infty e^{i\omega \tau} d\omega,
\end{equation}
without worrying that it is not possible to evaluate this integral. Somebody who is trying to use the right mathematical language, would say that this isn’t a function, but is, instead a distribution. Just like this delta function distribution, our system response integral, something that we also cannot actually evaluate in a strict sense, is a distribution. It’s a beastie that has delta function like characteristics, and if we want to try to integrate it, we have to play sneaky games.

Let’s put off evaluating that integral for now, and return to the Green’s function description of the story.

The Green’s function picture.

Using Fourier transforms, we found that it theoretically possible to find a convolution solution to the system, and found the convolution kernel for the system. The rough idea behind Green’s functions is to assume that such a convolution exists, say
\begin{equation}\label{eqn:greensDropWithResistance:280}
\Bx(t) = \Bx_0(t) + \int_{-\infty}^\infty G(t,t’) \BF(t’) dt’,
\end{equation}
where \( \Bx_0(t) \) is any solution of the homogeneous problem satisfying, in this case,
\begin{equation}\label{eqn:greensDropWithResistance:300}
m \frac{d^2}{dt^2} \Bx_0(t) + \gamma \frac{d}{dt} \Bx_0(t) = 0,
\end{equation}
and \( G(t,t’) \) is a convolution kernel, representing the system response, to be determined.
If we plug this presumed solution into our differential equation, we find
\begin{equation}\label{eqn:greensDropWithResistance:320}
\int_{-\infty}^\infty \lr{
m \frac{\partial^2}{\partial t^2} G(t,t’)
+ \gamma \frac{\partial}{\partial t} G(t,t’)
} \BF(t’) dt’
=
\BF(t),
\end{equation}
but
\begin{equation}\label{eqn:greensDropWithResistance:340}
\BF(t) = \int_{-\infty}^\infty \BF(t’) \delta(t – t’) dt’,
\end{equation}
so, if we can find \( G \) satisfying
\begin{equation}\label{eqn:greensDropWithResistance:360}
m \frac{\partial^2}{\partial t^2} G(t,t’) + \gamma \frac{\partial}{\partial t} G(t,t’) = \delta(t – t’),
\end{equation}
then we have solved the system. We can simplify this slightly by presuming that the \( t,t’ \) dependence is always a difference, and seek \( G(\tau) \) such that
\begin{equation}\label{eqn:greensDropWithResistance:380}
m G”(\tau) + \gamma G'(\tau) = \delta(\tau).
\end{equation}
We now pull the Fourier transform out of our toolbox again, assuming that
\begin{equation}\label{eqn:greensDropWithResistance:400}
G(\tau) = \inv{2 \pi} \int_{-\infty}^\infty \hat{G}(\omega) e^{i\omega\tau} d\omega,
\end{equation}
for which
\begin{equation}\label{eqn:greensDropWithResistance:420}
\inv{2 \pi} \int_{-\infty}^\infty \lr{ m \lr{ i \omega }^2 + \gamma \lr{ i \omega } } \hat{G}(\omega) e^{i\omega \tau} d\omega
=
\inv{2 \pi } \int_{-\infty}^\infty e^{i\omega \tau} d\omega,
\end{equation}
or
\begin{equation}\label{eqn:greensDropWithResistance:440}
\hat{G}(\omega) = \inv{ m \lr{ i \omega }^2 + \gamma \lr{ i \omega } }.
\end{equation}
This is the Fourier transform of the Green’s function, and is exactly what we found earlier using pure Fourier transforms. Our starting point was different this time, as we just blatantly assumed that the solution had a convolution structure. We then found a differential equation for that convolution kernel, the Green’s function. Only then did we pull the Fourier transform out of the toolbox to attempt to find the structure of that Green’s function.

Evaluating the Green’s function integral.

We can’t go any further without figuring out what to do with our nasty little divergent integral \ref{eqn:greensDropWithResistance:220}. We may coerce this into something that we can evaluate using standard contour integration, if we offset the pole at the origin slightly. Given \( \epsilon > 0 \), let’s evaluate
\begin{equation}\label{eqn:greensDropWithResistance:460}
G(\tau, \epsilon) = -\inv{2\pi} \oint \frac{ e^{i z \tau} }{\lr{ z – i \epsilon } \lr{ m z – i \gamma}} dz.
\end{equation}
We can evaluate this integral using infinite semicircular contours, using an upper half plane contour for \( \tau > 0 \) and a lower half plane contour for \( \tau < 0 \), as illustrated in fig. 1, and fig. 2.

 

fig. 1. Contour for tau > 0.

 

 

fig. 2. Contour for tau < 0.

By Jordan’s lemma, that upper half plane infinite semicircular part of the contour integral is zero for the \( \tau > 0 \) case, and for the \( \tau < 0 \) case, the lower half plane infinite semicircular part of the contour integral is zero. We can proceed with the residue calculations. In the upper half plane, we have both of the enclosed poles, so \begin{equation}\label{eqn:greensDropWithResistance:480} \begin{aligned} G(\tau > 0, \epsilon)
&= -\inv{2\pi m } \int_{-\infty}^\infty \frac{ e^{i \omega \tau} }{\lr{ \omega – i \epsilon } \lr{ \omega – i \gamma/m}} d\omega \\
&= -\frac{ 2 \pi i }{ 2 \pi m} \lr{
\evalbar{ \frac{ e^{i z \tau} }{ z – i \gamma/m} }{z = i \epsilon}
+
\evalbar{ \frac{ e^{i z \tau} }{ z – i \epsilon } }{ z = i \gamma/m}
} \\
&=
-\frac{i}{m} \lr{
\frac{ e^{-\epsilon \tau} }{ i \epsilon – i \gamma/m}
+
\frac{ e^{-\gamma\tau/m} }{ i \gamma/m – i \epsilon }
} \\
&=
-\lr{
\frac{e^{-\epsilon \tau}}{ m \epsilon – \gamma }
+
\frac{ e^{-\gamma\tau/m} }{ \gamma – m \epsilon }
},
\end{aligned}
\end{equation}
and for the lower half plane, where there are no enclosed poles we have \( G(\tau < 0, \epsilon) = 0 \). In the \( \epsilon \rightarrow 0 \) limit, we are left with
\begin{equation}\label{eqn:greensDropWithResistance:500}
G(\tau) = \inv{\gamma} \lr{ 1 – e^{-\gamma \tau/m} } \Theta(\tau).
\end{equation}

Back to the original problem.

We may now go and find the specific solution for the original problem where \( F(t) = – m g \Be_2 \Theta(t) \). That solution is
\begin{equation}\label{eqn:greensDropWithResistance:520}
\begin{aligned}
\Bx(t)
&= \Bx(0) + \int_{-\infty}^\infty G(t – t’) \lr{ – m g \Be_2 \Theta(t’) } dt’ \\
&= \Bx(0) – m g \Be_2 \int_{-\infty}^\infty \frac{\Theta(t – t’)}{\gamma} \lr{ 1 – e^{-\gamma \lr{ t – t’}/m } } \Theta(t’) dt’ \\
&= \Bx(0) – m g \Be_2 \int_{0}^\infty \frac{\Theta(t – t’)}{\gamma} \lr{ 1 – e^{-\gamma \lr{ t – t’}/m } } dt’ \\
&= \Bx(0) – \frac{m g}{\gamma} \Be_2 \int_{0}^t \lr{ 1 – e^{-\gamma \lr{ t – t’}/m } } dt’ \\
&= \Bx(0) – \frac{m g}{\gamma} \Be_2 \int_0^t \lr{ 1 – e^{-\gamma u/m } } du \\
&= \Bx(0) – \frac{m g}{\gamma} \Be_2 \evalrange{ \lr{ t’ – \frac{e^{-\gamma u/m } }{-\gamma/m} } }{u=0}{t} \\
&= \Bx(0) – \frac{m g}{\gamma} \Be_2 \lr{ t + \frac{m e^{-\gamma t/m }}{\gamma} – \frac{m}{\gamma} } \\
&= \Bx(0) – \frac{m g t}{\gamma} \Be_2 – \frac{m^2 g}{\gamma^2} \lr{ 1 – e^{-\gamma t/m } }.
\end{aligned}
\end{equation}

Ignoring the missing factor of \( g \) on the last term in the twitter slide, this is the final result before the limiting argument on that slide.

Having found the Green’s function for this system, we could then, fairly trivially, use it to solve similar systems with different forcing functions. For example, suppose we have a mass on a table, with friction, and a forcing function (perhaps sinusoidal) moving that mass. We could then figure out the time response for that particular forcing function, and would only have a convolution integral to evaluate. That general applicability is one of the beauties of these transform or Green’s function methods.