math and physics play

Weighted geometric series

March 23, 2025 math and physics play No comments , , ,

[Click here for a PDF version of this post]

Karl needed to evaluate the sum:

\begin{equation}\label{eqn:weightedGeometric:20}
S = \sum_{k = 0}^9 \frac{a + b k}{\lr{ 1 + i }^k}
\end{equation}

He ended up using a spreadsheet, which was a quick and effective way to deal with the problem. I was curious about this sum, since he asked me how to sum it symbolically, and I didn’t know.

Mathematica doesn’t have any problem with it, as seen in fig. 1.

fig. 1. A funky sum.

How can we figure this out?

Let’s write \( r = 1/(1+i) \) to start with, and break up the sum into constituent parts
\begin{equation}\label{eqn:weightedGeometric:40}
\begin{aligned}
S_n
&= \sum_{k = 0}^n \frac{a + b k}{\lr{ 1 + i }^k} \\
&= a \sum_{k = 0}^n r^k + b \sum_{k = 0}^n k r^k.
\end{aligned}
\end{equation}
We can evaluate the geometric part of this easily using the usual trick. Let
\begin{equation}\label{eqn:weightedGeometric:60}
T_n = \sum_{k = 0}^n r^k,
\end{equation}
then
\begin{equation}\label{eqn:weightedGeometric:80}
r T_n – T_n = r^{n+1} – 1,
\end{equation}
so
\begin{equation}\label{eqn:weightedGeometric:100}
T_n = \frac{r^{n+1} – 1}{r – 1}.
\end{equation}
Now we just have to figure out how to sum
\begin{equation}\label{eqn:weightedGeometric:120}
G_n = \sum_{k = 0}^n k r^k = \sum_{k = 1}^n k r^k.
\end{equation}
This looks suspiciously like the derivative of a geometric series. Let’s evaluate such a derivative, as a function of r:
\begin{equation}\label{eqn:weightedGeometric:140}
\begin{aligned}
\frac{d T_n(r)}{dr}
&= \sum_{k = 0}^n \frac{d}{dr} r^k \\
&= \sum_{k = 0}^n k r^{k-1} \\
&= \sum_{k = 1}^n k r^{k-1} \\
&= \inv{r} \sum_{k = 1}^n k r^k.
\end{aligned}
\end{equation}
Having summed the geometric series, we may also take the derivative of that summed result
\begin{equation}\label{eqn:weightedGeometric:160}
\begin{aligned}
\frac{d T_n(r)}{dr}
&= \frac{d}{dr} \lr{ \frac{r^{n+1} – 1}{r – 1} } \\
&= \frac{(n+1)r^n}{r – 1} – \frac{r^{n+1} – 1}{\lr{r -1}^2} \\
&= \frac{(n+1)r^n(r-1) – \lr{r^{n+1} – 1}}{\lr{r -1}^2}.
\end{aligned}
\end{equation}
Putting the pieces together, we have
\begin{equation}\label{eqn:weightedGeometric:180}
T_n = \frac{r}{\lr{r -1}^2} \lr{ (n+1)r^n(r-1) – \lr{r^{n+1} – 1} }.
\end{equation}

This means that our sum is
\begin{equation}\label{eqn:weightedGeometric:200}
S_n = a \frac{r^{n+1} – 1}{r – 1} + b \frac{r}{\lr{r -1}^2} \lr{ (n+1)r^n(r-1) – \lr{r^{n+1} – 1} }.
\end{equation}
Putting back \( r = 1/(1+i) \), and subsequent simplification, gives the Mathematica result. It’s not pretty, but at least we can do it if we want to.

Shout out to Grok that pointed out the derivative trick for the second series.  I’d forgotten that one.

Impedance refresher.

March 21, 2025 math and physics play No comments , , , , , , , , , , , ,

[Click here for a PDF version of this post]

Karl is taking his circuits course right now, which means that I get a chance to field some questions. I don’t get an excuse to think about this stuff any more. It’s fun material, since most of the ideas are all really simple, and you can figure out everything from first principles.

Karl just started sinusoidal circuits, which I think is a bit exciting. They are such a nice special case, as complex calculations are all effectively reduced to \( V = I R \) style computations.

Solving RLC circuits for general time dependent sources.

To contrast the simple case of sinusoidal sources, let’s consider what we have to do in order to solve a general case RLC circuit. The simple basic RC circuit sketched in fig. 1 provides a good illustrative example, even though it does not include any inductance.

With \( v \) as the voltage at the capacitor, the equations that describe the circuit are
\begin{equation}\label{eqn:impedance:20}
\begin{aligned}
v_s – v &= i R \\
i &= C \frac{dv}{dt}.
\end{aligned}
\end{equation}

We can combine these into one equation for \( v \). Letting \( \tau = RC \), that is
\begin{equation}\label{eqn:impedance:40}
v + \tau \frac{dv}{dt} = v_s.
\end{equation}
Here \( v_s = v_s(t) \) can be an arbitrary function of time. This is a simple enough differential equation, and can probably be solved in various ways (integrating factors, Fourier transforms, Laplace transforms, …)

For illustration purposes, let’s tackle this little equation with Fourier transforms, a method logically equivalent to the computation of the Green’s function for the system.

Let’s use a symmetric representation of the Fourier transform
\begin{equation}\label{eqn:impedance:60}
\begin{aligned}
F(\omega) &= \inv{\sqrt{2 \pi}} \int_{-\infty}^\infty e^{-j\omega t} f(t) dt \\
f(t) &= \inv{\sqrt{2 \pi}} \int_{-\infty}^\infty e^{j\omega t} F(\omega) d\omega \\
\end{aligned}
\end{equation}
Recall that the Fourier transform of the derivative is just a \( j \omega \) scaled frequency domain function, which we show with integration by parts
\begin{equation}\label{eqn:impedance:80}
\begin{aligned}
\inv{\sqrt{2 \pi}} \int_{-\infty}^\infty e^{-j\omega t} \frac{df}{dt} dt
&= \inv{\sqrt{2 \pi}} \int_{-\infty}^\infty \lr{ \frac{d}{dt} \lr{ f(t) e^{-j \omega t} } – f(t) \frac{d}{dt}\lr{ e^{-j\omega t}} } dt \\
&= j \omega F(\omega).
\end{aligned}
\end{equation}
That means that the frequency domain equivalent of our system is
\begin{equation}\label{eqn:impedance:100}
V + j \omega \tau V = V_s,
\end{equation}
or
\begin{equation}\label{eqn:impedance:120}
V(\omega) = \frac{V_s(\omega)}{1 + j \omega \tau}.
\end{equation}
Inverse transformation yields
\begin{equation}\label{eqn:impedance:140}
\begin{aligned}
v(t)
&= \inv{\sqrt{2 \pi}} \int_{-\infty}^\infty e^{j\omega t} \frac{V_s(\omega)}{1 + j \omega \tau} d\omega \\
&= \inv{2 \pi} \iint_{-\infty}^\infty e^{j\omega t} \frac{1}{1 + j \omega \tau} d\omega e^{-j\omega t’} v_s(t’) dt’ \\
&= \int_{-\infty}^\infty dt’ v_s(t’) \inv{2\pi} \int_{-\infty}^\infty \frac{e^{j\omega(t-t’)}}{1 + j \omega \tau} d\omega,
\end{aligned}
\end{equation}
or with
\begin{equation}\label{eqn:impedance:160}
G(u) = \inv{2\pi} \int_{-\infty}^\infty \frac{e^{j\omega u}}{1 + j \omega \tau} d\omega,
\end{equation}
\begin{equation}\label{eqn:impedance:180}
v(t) = \int_{-\infty}^\infty v_s(t’) G(t – t’) dt’.
\end{equation}
We just need to evaluate the Green’s function \( G(u) \) to proceed, which we can do with standard contour integration, first writing:

\begin{equation}\label{eqn:impedance:200}
\begin{aligned}
G(u)
&= \inv{2\pi j \tau} \int_{-\infty}^\infty \frac{e^{j\omega u}}{\inv{j\tau} + \omega} d\omega \\
&= \inv{2\pi j \tau} \oint \frac{e^{j z u}}{\inv{j\tau} + z} dz.
\end{aligned}
\end{equation}
This has a single pole at \( z = j/\tau \). We need an infinite semicircular contour in the lower half plane for \( u < 0 \), and can use the upper half plane infinite semicircular contour (surrounding the pole) for \( u > 0 \). That gives
\begin{equation}\label{eqn:impedance:220}
\begin{aligned}
G(u)
&= \Theta(u) \frac{2 \pi j}{2\pi j \tau} \evalbar{e^{j z u}}{z = j/\tau} \\
&= \frac{\Theta(u)}{\tau} e^{- u/\tau}.
\end{aligned}
\end{equation}

The solution to the problem, for any Fourier integrable source \( v_s(t) \), is
\begin{equation}\label{eqn:impedance:240}
\boxed{
v(t) = \int_{-\infty}^t v_s(t’) \frac{e^{- \lr{t – t’}/\tau}}{\tau} dt’.
}
\end{equation}

As a check, let’s evaluate this convolution integral for a step source \( v_s(t) = V \Theta(t) \), to find
\begin{equation}\label{eqn:impedance:260}
\begin{aligned}
v(t)
&= \frac{V e^{-t/tau}}{\tau} \int_0^t e^{t’/\tau} dt’ \\
&= V e^{-t/tau} \evalrange{e^{t’/\tau} }{0}{t} \\
&= V e^{-t/tau} \lr{ e^{t/\tau} – 1 } \\
&= V \lr{ 1 – e^{-t/\tau} }.
\end{aligned}
\end{equation}
This is the damped time domain response that we remember for an RC circuit. In Karl’s first year engineering notes, this was presented as a given (without the step factor), and he had to verify that it worked by differentiation (for \( t > 0 \).)

Solving this exactly, even for arbitrary sources, as we’ve done above, is not strictly hard, if you have all the required tools. But the first year engineering student doesn’t have all those tools to start with. This is where the beauty of the phasor techniques for sinusoidal sources comes in. Let’s now see how that works.

Phasor approach.

Let’s consider the three simplest RLC circuits, each with just a single element, and a variable voltage source. I’ll depict that element with a box as in fig. 2.

Resistor case.

If the element is a resistor with value \( R \), our equations are simple
\begin{equation}\label{eqn:impedance:280}
v_s = i R.
\end{equation}
Clearly \( i \) is directly proportional to the source voltage. In particular, if \( v_s(t) \) has a sinusoidal character, such as
\begin{equation}\label{eqn:impedance:300}
v_s(t) = V \cos(\omega t),
\end{equation}
then
\begin{equation}\label{eqn:impedance:320}
i(t) = \frac{V}{R} \cos\lr{ \omega t }.
\end{equation}
In particular, if we let \( i(t) = I \cos\lr{ \omega t } \), then we have
\begin{equation}\label{eqn:impedance:340}
I = \frac{V}{R},
\end{equation}
or \( V = I R \).

Capacitor case.

If the load element is a capacitor with capacitance \( C \), then the equation for the system is
\begin{equation}\label{eqn:impedance:360}
i = C \frac{dv_s(t)}{dt}.
\end{equation}
If we just plug in \( v_s(t) = V \cos(\omega t) \), as before, we get a bit of a mess
\begin{equation}\label{eqn:impedance:380}
i = -C \omega V \sin\lr{ \omega t }.
\end{equation}
We no longer have a nice simple proportionality relationship between the current and the voltage source, as the capacitor has introduced a phase shift into the mix.
We can figure out that phase factor by solving the equation
\begin{equation}\label{eqn:impedance:400}
-\sin x = \cos\lr{ x + \phi }.
\end{equation}
The easiest way to solve this is to express the sinusoids in complex exponential form
\begin{equation}\label{eqn:impedance:420}
\textrm{Re} \lr{ e^{j + \phi} } = \Real \lr{ j e^{j x} } = \Real \lr{ e^{j \pi/2} e^{j x} }.
\end{equation}
We see that the phase factor is a \( \pi/2 \) shift. However, even better, we have a strong hint that working with complex exponentials may be a better approach to formulating the problem.

Let’s write
\begin{equation}\label{eqn:impedance:440}
v_s(t) = V \cos\lr{ \omega t } = \textrm{Re} \lr{ V e^{j \omega t} }.
\end{equation}
Then we have
\begin{equation}\label{eqn:impedance:460}
i(t) = C V \textrm{Re} \lr{ \frac{d}{dt} e^{j \omega t} }.
\end{equation}
If we also assume that we can write
\begin{equation}\label{eqn:impedance:480}
i(t) = \textrm{Re} \lr{ I e^{j \omega t} },
\end{equation}
then if the real parts are equal, we must also have
\begin{equation}\label{eqn:impedance:500}
I e^{j \omega t} = j \omega C V e^{j \omega t},
\end{equation}
or
\begin{equation}\label{eqn:impedance:520}
I = j \omega C V.
\end{equation}
We have a \( V = I R \) relationship, which we write as
\begin{equation}\label{eqn:impedance:540}
V = I Z,
\end{equation}
where
\begin{equation}\label{eqn:impedance:560}
Z = \inv{j \omega C}.
\end{equation}
This is the phasor description of the circuit.

Inductive case.

If the circuit has an inductive load, then the system equation is
\begin{equation}\label{eqn:impedance:580}
v_s(t) = L \frac{di}{dt}.
\end{equation}
Again, we can write \( v_s(t) = \textrm{Re} \lr{ V e^{j\omega t} } \), and assume that \( i = \Real \lr{ I e^{j\omega t} } \). We then require
\begin{equation}\label{eqn:impedance:600}
V e^{j \omega t} = L \frac{d}{dt} \lr{ I e^{j \omega t} },
\end{equation}
or
\begin{equation}\label{eqn:impedance:620}
V = j \omega L I.
\end{equation}
We write
\begin{equation}\label{eqn:impedance:640}
Z = j \omega L,
\end{equation}
so once again \( V = I Z \).

Solving a more complex RLC configuration.

An example of a more complicated RLC circuit is sketched in fig. 3.

Here we have two impedances in parallel
\begin{equation}\label{eqn:impedance:660}
\begin{aligned}
Z_C &= \inv{j \omega C} \\
Z_L &= j \omega L.
\end{aligned}
\end{equation}
The parallel impedance through that reactive load is
\begin{equation}\label{eqn:impedance:680}
\begin{aligned}
Z
&= \lr{ \inv{Z_C} + \inv{Z_L} }^{-1} \\
&= \lr{ j \omega C + \inv{ j \omega L } }^{-1}.
\end{aligned}
\end{equation}
We can compute the current through \( R \) now
\begin{equation}\label{eqn:impedance:700}
I = \frac{V_s}{ R + \lr{ j \omega C + \inv{ j \omega L } }^{-1} }.
\end{equation}
We also have
\begin{equation}\label{eqn:impedance:720}
\frac{V_s – V}{R} = I,
\end{equation}
or
\begin{equation}\label{eqn:impedance:740}
\begin{aligned}
V
&= V_s – I R \\
&= V_s \lr{ 1 – \frac{R}{ R + \lr{ j \omega C + \inv{ j \omega L } }^{-1} } } \\
&= V_s \frac{\lr{ j \omega C + \inv{ j \omega L } }^{-1} }{ R + \lr{ j \omega C + \inv{ j \omega L } }^{-1} } \\
&= \frac{V_s}{ R \lr{ j \omega C + \inv{ j \omega L } } + 1 }.
\end{aligned}
\end{equation}
The complicated time response for this system is reduced to a trivial voltage divider calculation. We see that it was kind of pointless to run an inductor and capacitor in parallel, as they are both purely reactive (imaginary). That’s a detail that I didn’t remember, since it’s been decades since I did any practical circuits applications. However, the point is, by using a complex exponential source representation, these types of systems are reduced from systems of coupled differential equations to simple linear systems. Imagine how messy it would be to try to solve this system using the Green’s function methods that we used above!

Playing around with privately hosted LLM models

February 17, 2025 math and physics play , , , ,

I have a personal use copy of Mathematica, and wanted to play with the new LLM functionality (LLMConfiguration, LLMFunction, …)

It turns out that this functionality is doubly useless.  Wolfram wants to charge you extra to use it with their models, and it fails with the chatGPT models, because openAI also requires a subscription payment tier.

However, I found that you can install and run local LLM models fairly easily.  Using ollama, it was fairly trivial to install and run the llama2 and mixtral models.  I was surprised just how easy it is to get an LLM running on your own hardware.

The steps, if I remember them properly, was just:

  • Install ollama
  • ollama pull llama2
  • ollama run llama2

Once you have the model’s installed, you can check that the model(s) are installed, and how much storage they consume:

# ollama list
NAME              ID              SIZE      MODIFIED     
mixtral:latest    a3b6bef0f836    26 GB     31 hours ago    
llama2:latest     78e26419b446    3.8 GB    32 hours ago    

Here’s an example of a command line session with the mixtral model:

# ollama run mixtral
>>> Show me Mathematica code to solve the linear system: a x + b y = c, d x + e y = f
Here is some Mathematica code that solves the linear system you have given:

```
a = 2; b = 3; c = 4;
d = 5; e = 6; f = 7;

sol = Solve[{a x + b y == c, d x + e y == f}, {x, y}]

(* {{x -> -(1/2) - (3 f)/14, y -> 2/7 + (5 c)/14}} *)
```

Incidentally, the llama2 model gets this wrong, and suggests:

solve {a*x + b*y == c, d*x + e*y == f}

which is malformed. The mixtral model needlessly includes specific values for all the matrix elements, but can be coached not to do so.

One nice thing about ollama is that it provides a rest server interface, so you can query it programatically. Here’s some sample python code cobbled together to do just that:

#! /usr/bin/env python3

import requests
import json
import sys
import os
import argparse
import re
from pathlib import Path

#MODEL = 'mixtral'
MODEL = 'llama2'
CACHE_FILE = ''

def load_cache():
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, 'r') as f:
            return json.load(f)
    return []

def save_cache(context):
    with open(CACHE_FILE, 'w') as f:
        json.dump(context, f)

def query_ollama(prompt, context):
    response = requests.post(
        'http://localhost:11434/api/generate',
        json={'model': MODEL, 'prompt': prompt, 'context': context, 'stream': False}
    ).json()

    return response

def main():
    global MODEL, CACHE_FILE

    # Set up argument parser
    parser = argparse.ArgumentParser(description="Query the Ollama Mixtral LLM model.")
    parser.add_argument('--clean', action='store_true', help="Start with an empty context list (no cache).")
    parser.add_argument('--model', type=str, default=MODEL, help="Specify an alternative model.")
    parser.add_argument('--cache', type=str, default='context_cache', help="context cache path component.")
    parser.add_argument('--query', type=str, default='', help="multiline query in a file")

    args = parser.parse_args()

    MODEL = args.model
    print(f"Using model: {MODEL}")

    cache = args.cache;
    CACHE_FILE = os.path.expanduser(f'~/.llmq.{cache}.{MODEL}.json')
    print(f"Context cache file: {CACHE_FILE}")

    # Initialize context based on command-line options
    if args.clean:
        context = []
    else:
        context = load_cache()

    while True:
        if args.query:
            prompt = Path(args.query).read_text().strip()
        else:
            # Read input from stdin (interactive prompt)
            prompt = input("Enter your query: ")

            # Check if the prompt matches exit or quit
            if re.match(r'^(exit|quit)$', prompt.lower()):
                print("Exiting...")
                break

        # Query the model and get the response
        response = query_ollama(prompt, context)
        rr = response.get('response', 'No response key in API output')
        print(f"Response: {rr}")

        # Update context (for context propagation)
        context = response.get('context', [])
        save_cache(context)

        if args.query:
            break

if __name__ == "__main__":
    main()

# vim: et ts=4 sw=4

This code initially queries the rest server with an empty list context, but has a mechanism to save context once returned. Here is a sample session with the script:

# llmq.py --model mixtral --cache mathematica 
Using model: mixtral
Context cache file: /Users/pjoot/.llmq.mathematica.mixtral.json
Enter your query: show me code to plot sin(x), and 2 sin(x) cos(x) in a single plot with different colors
Response: 	To plot both `Sin[x]` and `2 Sin[x] Cos[x]` in the same plot with different colors, you can use the `Plot` function in Mathematica as follows:
```mathematica
Plot[{Sin[x], 2*Sin[x]*Cos[x]}, {x, -Pi, Pi}, PlotStyle -> {Blue, Red}]
```
This will create a plot with `Sin[x]` in blue and `2 Sin[x] Cos[x]` in red. The x-axis ranges from `-Pi` to `Pi`. The `Plot` function automatically chooses appropriate tick marks and axis labels, but you can customize these using additional options if desired.
Note that the option `PlotStyle -> {Blue, Red}` specifies that the first curve should be plotted in blue and the second curve in red. You can change the colors by modifying this option.[br]
You can also add labels and title to your plot with the following code:
```mathematica
Plot[{Sin[x], 2*Sin[x]*Cos[x]}, {x, -Pi, Pi}, PlotStyle -> {Blue, Red}, PlotLabel -> "Plots of sin(x) and 2 sin(x) cos(x)", AxesLabel -> {"x", ""}]
```
This will create a plot with the title "Plots of sin(x) and 2 sin(x) cos(x)" and labels "x" on the x-axis.[br]
You can also add a legend to your plot by adding the option `PlotLegends -> {"sin(x)", "2 sin(x)cos(x)"}`:
```mathematica
Plot[{Sin[x], 2*Sin[x]*Cos[x]}, {x, -Pi, Pi}, PlotStyle -> {Blue, Red}, PlotLabel -> "Plots of sin(x) and 2 sin(x) cos(x)", AxesLabel -> {"x", ""}, PlotLegends -> {"sin(x)", "2 sin(x)cos(x)"}]
```
Enter your query: Now I'd like to plot the gamma function in the complex plane.  What is a good way to visualize that?
Response: 	Visualizing the Gamma function on the complex plane can be done using a 3D surface plot, where the real and imaginary parts of the input argument are the x-axis and y-axis, respectively, and the output value (the magnitude of the Gamma function) is the z-axis. Here's an example code in Mathematica to create such a plot:
```mathematica
ComplexPlot3D[Gamma[x + I*y], {x, -5, 5}, {y, -5, 5}]
```
This creates a 3D surface plot of the Gamma function on the complex plane for the range `-5 <= Re(z) <= 5` and `-5 <= Im(z) <= 5`. The resulting plot shows the magnitude of the Gamma function as a surface in 3D space, with contour lines indicating the values of the function.

Alternatively, you can use the following code to create a density plot of the absolute value of the Gamma function on the complex plane:
```mathematica
ComplexPlot[Abs[Gamma[x + I*y]], {x, -5, 5}, {y, -5, 5}]
```
This creates a 2D density plot of the absolute value of the Gamma function on the complex plane for the same range as before. The resulting plot shows the magnitude of the Gamma function using a color scale, with darker colors indicating higher values and lighter colors indicating lower values. This can be useful to visualize the behavior of the Gamma function in different regions of the complex plane.

Note that the Gamma function has poles at non-positive integers, so the plot will show singularities at those points. You may want to exclude these points from your plot by using `Exclusions -> {Re[z] <= 0, Im[z] == 0, Re[z] == Integer}` in the `ComplexPlot3D` function or `Exclusions -> {Re[z] <= 0, Im[z] == 0}` in the `ComplexPlot` function.
Enter your query: that produced the error: Options expected (instead of {y,-5,5}) beyond position 2 in ...
Response: 	I apologize for the mistake in my previous response. Here's the corrected code to create a density plot of the absolute value of the Gamma function on the complex plane:
```mathematica
ComplexPlot[Abs[Gamma[x + I*y]], {x, -5, 5}, {y, -5, 5}]
```
This creates a 2D density plot of the absolute value of the Gamma function on the complex plane for the range `-5 <= Re(z) <= 5` and `-5 <= Im(z) <= 5`. The resulting plot shows the magnitude of the Gamma function using a color scale, with darker colors indicating higher values and lighter colors indicating lower values. This can be useful to visualize the behavior of the Gamma function in different regions of the complex plane.

Again, I apologize for any confusion my previous response may have caused.
Enter your query: from the manual it appears that the plot range should be complex: like this : {z, -2 - 2 I, 2 + 2 I}
Response: 	You're absolutely right! Thank you for bringing that up. When creating a `ComplexPlot` or `ComplexPlot3D` plot in Mathematica, it is indeed necessary to specify the range of complex values as a pair of complex numbers. Here's an example code for a 3D surface plot of the Gamma function on the complex plane using the correct syntax:
```mathematica
ComplexPlot3D[Gamma[z], {z, -2 - 2 I, 2 + 2 I}]
```
This creates a 3D surface plot of the Gamma function on the complex plane for the range `-2 - 2 I <= z <= 2 + 2 I`. The resulting plot shows the magnitude of the Gamma function as a surface in 3D space, with contour lines indicating the values of the function.

Thank you for pointing out this important distinction!

Testing the LLM output: Complex plots of the gamma function (2D and 3D).

Like chatGPT, the model can get things wrong, and needs coaching, but it is still impressive to me that you can run this locally on generic hardware (although, this was done on a well specced macbook pro.) I don’t really intend to use this extensively — I was really just playing around.  Next I’d like to try on my windows laptop (equipped with a little Nivida 20XX model GPU) and see how that compares to the mac.  Is there really a good use case for me to do this on my own hardware — probably not, but it’s really cool!

Pascal’s triangle: binomial coefficients, value and recurrence relation

February 12, 2025 math and physics play , , ,

[Click here for a PDF version of this post]

I saw an interesting solution of an introductory programming source sample problem, showing how to display Pascal’s triangle up to a given size. For example, given input \( n = 5 \), the output should be like:
\begin{equation*}
\begin{array}{c}
1 \\
1 \quad 1 \\
1 \quad 2 \quad 1 \\
1 \quad 3 \quad 3 \quad 1 \\
1 \quad 4 \quad 6 \quad 4 \quad 1 \\
1 \quad 5 \quad 10 \quad 10 \quad 5 \quad 1
\end{array}
\end{equation*}

One kind of neat property of Pascal’s triangle is that any non-unity number on the triangle, is the sum of the two numbers above it. That’s one of the ways that I’d imagine a student would solve the programming problem of displaying this triangle.

Another way would be to use a formula for the binomial coefficients. Recall that

Theorem 1.1: Binomial theorem for integer powers.

Given integer \( n \), and any values \( a, b \) that commute under multiplication
\begin{equation*}
\lr{ a + b }^n = \sum_{k = 0}^n \binom{n}{k} a^k b^{n-k},
\end{equation*}
where
\begin{equation*}
\binom{n}{k} = \frac{n!}{k!\lr{n-k}!}.
\end{equation*}

These \( \binom{n}{k} \) values are the binomial coefficients, and are, in fact the values in Pascal’s triangle (we will show this in a bit.)

What surprised me about the programming solution that I saw was that it didn’t use the binomial coefficients. Instead if appeared to use a recurrence relationship, which was interesting, because I didn’t remember any such relationship. We’ll also find that recurrence relationship below, which allows for evaluation of the binomial coefficients without evaluating any of the factorials.

Combinatoric argument for the binomial coefficients

Let’s look at an explicit expansion of the LHS of the binomial theorem for the first few values of \( n \), starting at \(2\).
\begin{equation}\label{eqn:pascals:20}
\lr{ H + T }^2 = \lr{ H + T }\lr{ H + T} = H H + H T + T H + T T.
\end{equation}
We see that we have all possible combinations of \( H, T \) in the final expression, but because of assumed commutation, there is some redundancy, and we can write
\begin{equation}\label{eqn:pascals:40}
\lr{ H + T }^2 = H^2 + 2 H T + T T.
\end{equation}
If we enumerate all possible values for two flips of a coin, there can only be one \( H, H \) pair, there will be two \( H, T \) pairs, and one \( T, T \) pair. These numbers are the second row of Pascal’s triangle: \( 1, 2, 1 \).

Similarly, for \( n = 3 \), we have
\begin{equation}\label{eqn:pascals:60}
\begin{aligned}
\lr{ H + T }^3
&= \lr{ H H + H T + T H + T T } \lr{ H + T } \\
&= H H H + H T H + T H H + T T H + H H T + H T T + T H T + T T T.
\end{aligned}
\end{equation}
This time, after grouping, we have
\begin{equation}\label{eqn:pascals:80}
\lr{ H + T }^3 = H^3 + 3 H^2 T + 3 H T^2 + T^3.
\end{equation}
If we enumerate all possible values for three flips of a coin, there can only be one \( H, H, H \) triplet, there will be three \( H, H, T \) triplets, three \( T, T, H \) triplets, and one \( T, T, T \) triplet. These numbers are the third row of Pascal’s triangle: \( 1, 3, 3, 1 \).

To understand the binomial coefficient, we have to consider how to pick indistinguishable items from a set. Suppose that we want to pick three items from a set of 7, in a specific order. There are 7 ways to pick the first item, 6 ways to pick the second, and 5 ways to pick the third. We can express that as

\begin{equation}\label{eqn:pascals:120}
P(7,3) = 7 \times 6 \times 5 = \frac{7!}{4!} = \frac{7!}{\lr{7-3}!},
\end{equation}
or more generally, to pick \( k \) distinguishable items from a set of \( n \)
\begin{equation}\label{eqn:pascals:140}
P(n,k) = \frac{n!}{\lr{n-k}!}.
\end{equation}
In our \( \lr{H + T}^n \) expansion, we cannot distinguish any of the elements. For each \( k \) items we pick, there are \( k! \) ways of arranging those, so the binomial coefficient is
\begin{equation}\label{eqn:pascals:160}
\binom{n}{k} = \frac{P(n,k)}{k!} = \frac{n!}{k!\lr{n-k}!},
\end{equation}
as stated earlier.

Recurrence relation

For the recurrence relation, let’s assume that we have already found \( \binom{n}{k} \), let’s see if we can use that to find the next on the line. That is, for \( k < n \)
\begin{equation}\label{eqn:pascals:100}
\begin{aligned}
\binom{n}{k + 1}
&= \frac{n!}{\lr{k+1}!\lr{n-(k+1)}!} \\
&= \frac{n!}{\lr{k+1} k!\lr{n – k – 1)}!} \\
&= \frac{n! \lr{ n – k }}{\lr{k+1} k!\lr{n – k)}!} \\
&= \binom{n}{k} \frac{n – k}{k + 1}.
\end{aligned}
\end{equation}

This is a kind of cool result, and is the method that I saw used in the solution to the Pascal’s triangle program. We can start with the 1 value at the left of any row, and with a single multiply and divide, figure out the next entry in the triangle, and proceed from there.

Pascal’s triangle.

For the Pascal’s triangle property to hold, we must have, for any \( k \ne 0 \)
\begin{equation}\label{eqn:pascals:180}
\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k},
\end{equation}
as illustrated in fig. 1.

fig. 1. Pascal’s triangle, sum of elements above.

It’s fairly straightforward to verify that this property holds.
\begin{equation}\label{eqn:pascals:200}
\begin{aligned}
\binom{n-1}{k-1} + \binom{n-1}{k}
&=
\frac{\lr{n-1}!}{\lr{k-1}!\lr{n-1-\lr{k-1}}!}
+
\frac{\lr{n-1}!}{\lr{k}!\lr{n-1-k}!} \\
&=
\frac{\lr{n-1}!}{\lr{k-1}!} \lr{ \inv{\lr{n-k}!} + \inv{k \lr{n-1-k}!} } \\
&=
\frac{\lr{n-1}!}{\lr{k-1}!\lr{n-1-k}!} \lr{ \inv{n-k} + \inv{k} } \\
&=
\frac{\lr{n-1}!}{\lr{k-1}!\lr{n-1-k}!} \frac{k + n – k }{k \lr{ n – k } } \\
&=
\frac{n!}{k!\lr{n-k}!} \\
&=
\binom{n}{k},
\end{aligned}
\end{equation}
as expected.

Electric fields for ring and disk charge distributions (along the central axis.)

February 5, 2025 math and physics play , , , , , , , ,

[Click here for a PDF version of this post]

Motivation: explain field formulas in an engineering problem set.

The following equations were given in a problem set for the magnitude of the electric field strength at a point \( z \) due to symmetric ring and disk charge distributions respectively:
\begin{equation}\label{eqn:ringAndDiskField:20}
E = \frac{q z}{4 \pi \epsilon_0 \lr{ z^2 + R^2 }^{3/2} },
\end{equation}
\begin{equation}\label{eqn:ringAndDiskField:40}
E = \frac{\sigma}{2 \epsilon_0} \lr{ 1 – \frac{ z }{\sqrt{ z^2 + R^2 }} }.
\end{equation}

While these were given as black box results to use in a few superposition problems, there isn’t actually any magic required to find these, provided one knows how to perform some line and area integrals.

Electric field due to charge distributions.

Assuming that we are talking about stationary charges, we have only an electric field. Recall that the field, measured at a point \( \Bx \) for a single stationary point situated at point \( \Bx’ \) is
\begin{equation}\label{eqn:ringAndDiskField:60}
\BE(\Bx) = \inv{4 \pi \epsilon_0} \frac{ q \lr{ \Bx – \Bx’ }}{\Norm{\Bx – \Bx’}^3 }.
\end{equation}
If we want to find the field due to charges \( q_k \) at points \( \Bx_k \), then we just sum over all those charges
\begin{equation}\label{eqn:ringAndDiskField:80}
\BE(\Bx) = \inv{4 \pi \epsilon_0} \sum_k \frac{ q_k \lr{ \Bx – \Bx_k }}{\Norm{\Bx – \Bx_k}^3 }.
\end{equation}
This sum can be extended to a continuous charge density. If \( \rho(\Bx) \) represents the charge per unit volume situated at point \( \Bx \), then the field due to that continuous charge density is
\begin{equation}\label{eqn:ringAndDiskField:100}
\BE(\Bx) = \inv{4 \pi \epsilon_0} \int dV’ \frac{ \rho(\Bx’) \lr{ \Bx – \Bx’ }}{\Norm{\Bx – \Bx’}^3 }.
\end{equation}
Similarly, if \( \sigma(\Bx) \) represents the charge per unit area for at point \( \Bx \), then the field due to that continuous surface charge density is
\begin{equation}\label{eqn:ringAndDiskField:120}
\BE(\Bx) = \inv{4 \pi \epsilon_0} \int dA’ \frac{ \sigma(\Bx’) \lr{ \Bx – \Bx’ }}{\Norm{\Bx – \Bx’}^3 }.
\end{equation}
Finally, if \( \lambda(\Bx) \) represents the charge per unit length situated at point \( \Bx \), then the field due to that continuous linear charge density is
\begin{equation}\label{eqn:ringAndDiskField:140}
\BE(\Bx) = \inv{4 \pi \epsilon_0} \int dl’ \frac{ \lambda(\Bx’) \lr{ \Bx – \Bx’ }}{\Norm{\Bx – \Bx’}^3 }.
\end{equation}
At least theoretically, given any set of charge distributions, it’s just a matter of integration to find the corresponding electric field at any point. It happens that these integrals can be tricky, but we can get lucky when the charge configurations have special symmetries, as in the problem set formulas.

Ring charge distribution.

Let’s compute the electric field at a point along the axis of a ring charge distribution. I’ll use \( \setlr{ \Be_1, \Be_2, \Be_3 } \) to represent the standard basis direction vectors. We are considering the geometry of fig. 1.

fig. 1. Circular ring charge distribution.

The point on the central axis can be written as
\begin{equation}\label{eqn:ringAndDiskField:160}
\Bx = z \Be_3,
\end{equation}
and the points on the ring are
\begin{equation}\label{eqn:ringAndDiskField:180}
\Bx'(\theta) = R \lr{ \Be_1 \cos\theta + \Be_2 \sin\theta },
\end{equation}
then
\begin{equation}\label{eqn:ringAndDiskField:200}
\Bx – \Bx’ = z \Be_3 – R \lr{ \Be_1 \cos\theta + \Be_2 \sin\theta },
\end{equation}
and
\begin{equation}\label{eqn:ringAndDiskField:220}
\Norm{\Bx – \Bx’}^2 = z^2 + R^2.
\end{equation}
Since \( R d\theta \) is the length of a segment of the ring, the charge on that segment is \( R d\theta \lambda \), and the field is
\begin{equation}\label{eqn:ringAndDiskField:240}
\begin{aligned}
\BE(\Bx)
&= \inv{4 \pi \epsilon_0} \int_{\theta =0}^{2 \pi} R d\theta \lambda \frac{z \Be_3 – R \lr{ \Be_1 \cos\theta + \Be_2 \sin\theta }}{\lr{ z^2 + R^2 }^{3/2} } \\
&= \inv{4 \pi \epsilon_0} \frac{2 \pi R z \Be_3 }{ \lr{ z^2 + R^2 }^{3/2} }.
\end{aligned}
\end{equation}
Observe that the integral of the \( \sin\theta \) and \( \cos\theta \) components are zero, since we are integrating over a full period. Finally, since the total charge is \( q = 2 \pi R \lambda \), we have
\begin{equation}\label{eqn:ringAndDiskField:260}
\BE(0, 0, z) = \inv{4 \pi \epsilon_0} \frac{q z \Be_3 }{ \lr{ z^2 + R^2 }^{3/2} }.
\end{equation}
This recovers the equation from the problem set, with the only difference being the explicit inclusion of the direction vector, which was implied in the problem statement.

Field of a disk.

The setup above can be used to state the integral to solve for the field of a disk. The area of a segment of the disk is \( r dr d\theta \), so the charge of that fragment is \( r dr d\theta \sigma \), so the field is
\begin{equation}\label{eqn:ringAndDiskField:280}
\begin{aligned}
\BE(\Bx)
&= \inv{4 \pi \epsilon_0} \int_{r = 0}^R \int_{\theta =0}^{2 \pi} r dr d\theta \sigma \frac{z \Be_3 – r \lr{ \Be_1 \cos\theta + \Be_2 \sin\theta }}{\lr{ z^2 + r^2 }^{3/2} } \\
&= \frac{\sigma z \Be_3}{2 \epsilon_0} \int_{r = 0}^R dr \frac{r}{\lr{ z^2 + r^2 }^{3/2} }.
\end{aligned}
\end{equation}
As before, when evaluating the \( \theta \) integral, the sinusoidal components are killed. Since \( \int_0^{2 \pi} d\theta = 2 \pi \), we have a nice cancellation of \( 2 \pi \) downstairs.
Finally, provided \( z \ne 0 \),
\begin{equation}\label{eqn:ringAndDiskField:300}
\int_{r = 0}^R dr \frac{r}{\lr{ z^2 + r^2 }^{3/2} } = \inv{\Abs{z}} – \inv{\sqrt{ z^2 + R^2 }},
\end{equation}
so the electric field is
\begin{equation}\label{eqn:ringAndDiskField:320}
\BE(0, 0, z) = \frac{ \sigma }{2 \epsilon_0 } \lr{ \mathrm{sgn}(z) – \frac{z}{\sqrt{z^2 + R^2}} },
\end{equation}
where \( \mathrm{sgn}(z) \) is the sign of \( z \). We see that the equation provided in the problem set is actually only true for \( z > 0 \), and needs a sign correction otherwise.