
Deriving the nth Fibonacci number formula.

November 20, 2020


My last three posts:

  1. The nth term of a Fibonacci series.
  2. More on that cool Fibonacci formula.
  3. Guessing the nth Fibonacci number formula.

were all about a cool formula for the n-th term of the Fibonacci series.  Here’s the final chapter of the story of that play.  A recap:

Definition 1.1: Fibonacci series.

With \( F_0 = 0 \), and \( F_1 = 1 \), the nth term \( F_n \) in the Fibonacci series is the sum of the previous two terms
\begin{equation*}
F_n = F_{n-2} + F_{n-1}.
\end{equation*}

Theorem 1.1: Nth term of the Fibonacci series.

\begin{equation*}
F_n = \frac{ \lr{ 1 + \sqrt{5} }^n - \lr{ 1 - \sqrt{5} }^n }{ 2^n \sqrt{5} }.
\end{equation*}

Deriving the nth Fibonacci formula.

There was a particularly unsatisfactory aspect of the guess that we made in the last post: we didn’t have any reason to guess the form of that solution, except that we already knew the answer. Now we will attempt to attack this in a more systematic fashion, so that each step along the way seems logical. First, we need to put a couple of goodies in our toolbox.

Definition 1.1: Discrete sum.

Given a set of discrete values \( \setlr{G_a, G_{a+1}, \cdots, G_n } \) we define a discrete sum of \( n - a + 1 \) of these terms as \( F_n \)
\begin{equation}\label{eqn:fibonacciblog:1540}
F_n = \sum_{k = a}^n G_k + C,
\end{equation}
where \( C \) is an arbitrary boundary value constant.

Definition 1.2: Difference operators.

Define a backwards difference operator \( \Delta \), operating on \( X_n \) as
\begin{equation}\label{eqn:fibonacciblog:1560}
\Delta X_n = X_n - X_{n-1}.
\end{equation}

The difference operator is a discrete analogue of a differential operator. It is also possible to define a (forward) difference operator as \( \Delta X_n = X_{n+1} - X_{n} \), but the choice is arbitrary, and we can find the same results either way.
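
To make the operator concrete, here is a minimal Python sketch of my own (not from the original post): the backward difference of the triangular numbers \( n(n+1)/2 \) recovers \( n \), which is exactly one of the antidifference identities tabulated below.

# Backward difference operator, Delta X_n = X_n - X_{n-1}, acting on a callable sequence.
def delta(X, n):
    return X(n) - X(n - 1)

# Triangular numbers T(n) = n (n + 1) / 2; their backward difference is n.
T = lambda n: n * (n + 1) // 2
print([delta(T, n) for n in range(1, 6)])  # prints [1, 2, 3, 4, 5]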

Lemma 1.1: Antidifference of discrete sum.

Given a sum \( F_n \) of the form \ref{eqn:fibonacciblog:1540}, the difference operation is just the highest \( n \) term of the sum
\begin{equation}\label{eqn:fibonacciblog:1580}
\Delta F_n = G_n.
\end{equation}

Start proof:

\begin{equation}\label{eqn:fibonacci:960}
\Delta F_n =
\sum_{k = a}^n G_k + C
- \lr{ \sum_{k = a}^{n-1} G_k + C }
= G_n.
\end{equation}

End proof.

Computing differences is pretty easy. What we want to do is the inverse operation (analogous to integration), where we find a closed form representation of \( F_n \) given a difference equation \( \Delta F_n = G_n \). Just as we can compute antiderivatives for \( x^n \), we may do the same for \( n^k \) antidifferences, but the results are messier. The first few such antidifferences are

Theorem 1.1: Antidifferences for powers of \(n\).

\begin{equation}\label{eqn:fibonacciblog:1600}
\begin{aligned}
1 &= \Delta n \\
n &= \Delta \lr{ \frac{n}{2}\lr{ n + 1} } \\
n^2 &= \Delta \lr{ \frac{n}{6}\lr{2 n + 1}\lr{n + 1} } \\
n^3 &= \Delta \lr{ \frac{n^2}{4}\lr{n + 1}^2 }.
\end{aligned}
\end{equation}

Start proof:

The \( \Delta n \) identity is easily verified
\begin{equation}\label{eqn:fibonacci:1000}
\Delta n = n - (n-1) = 1.
\end{equation}
For higher orders it is a bit tedious to verify directly, but we can iteratively build up those results by evaluating the difference operator on each of the powers of \( n \).
\begin{equation}\label{eqn:fibonacci:660}
\begin{aligned}
\Delta n^2
&= n^2 - (n-1)^2 \\
&= n^2 - (n^2 - 2 n + 1) \\
&= 2 n - 1 \\
&= 2 n - \Delta n.
\end{aligned}
\end{equation}
Because the difference operator is linear, we can rearrange to find
\begin{equation}\label{eqn:fibonacci:1020}
\Delta \lr{ n^2 + n } = 2 n.
\end{equation}
Dividing through by \( 2 \) and factoring out an \( n \) recovers the desired result.

For the next power, we have
\begin{equation}\label{eqn:fibonacci:680}
\begin{aligned}
\Delta n^3
&= n^3 - (n-1)^3 \\
&= n^3 - (n^3 - 3 n^2 + 3 n - 1) \\
&= 3 n^2 - 3 n + 1 \\
&= 3 n^2 - 3 \Delta \frac{n}{2}\lr{ n + 1 } + \Delta n,
\end{aligned}
\end{equation}
or
\begin{equation}\label{eqn:fibonacci:1040}
\begin{aligned}
3 n^2
&=
\Delta \lr{n} \lr{ n^2 + \frac{3}{2}\lr{ n + 1} - 1 } \\
&=
\Delta \frac{n}{2} \lr{ 2 n^2 + 3\lr{ n + 1} - 2} \\
&=
\Delta \frac{n}{2} \lr{ 2 n^2 + 3 n + 1 } \\
&=
\Delta \frac{n}{2} \lr{ 2 n + 1}\lr{ n + 1 }.
\end{aligned}
\end{equation}
Dividing through by \( 3 \) recovers the desired result.

The final result is left to the reader. It can be derived or verified easily with a couple of lines of Mathematica code.

End proof.
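
In lieu of the Mathematica verification mentioned above, here is a minimal symbolic cross-check of all four antidifference identities, using Python’s sympy as an assumed stand-in for that Mathematica code:

from sympy import symbols, simplify

n = symbols('n')

def delta(expr):
    # Backward difference: Delta X_n = X_n - X_{n-1}.
    return simplify(expr - expr.subs(n, n - 1))

# The four antidifference identities of the theorem.
assert delta(n) == 1
assert simplify(delta(n*(n + 1)/2) - n) == 0
assert simplify(delta(n*(2*n + 1)*(n + 1)/6) - n**2) == 0
assert simplify(delta(n**2*(n + 1)**2/4) - n**3) == 0
print("all antidifference identities verified")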

Problem: Sum some series.

Find the sums \( \sum_{k = 1}^n k^m \), for \( m = 1, 2, 3 \).

Answer

  • \( m = 1 \). This is the (probably apocryphal) sum of Gauss’s grade school classroom:
    \begin{equation}\label{eqn:fibonacci:1060}
    F_n = \sum_{k = 1}^n k = 1 + 2 + \cdots + n,
    \end{equation}
    satisfying
    \begin{equation}\label{eqn:fibonacci:1080}
    \begin{aligned}
    \Delta F_n
    &= F_n - F_{n-1} \\
    &=
    (n + (n-1) + \cdots + 1)
    -
    ((n-1) + \cdots + 1) \\
    &= n \\
    &= \Delta \frac{n}{2}(n + 1).
    \end{aligned}
    \end{equation}
    We must have
    \begin{equation}\label{eqn:fibonacci:1100}
    F_n = \frac{n}{2}\lr{ n + 1} + C.
    \end{equation}
    To fix \( C \) consider \( F_1 \)
    \begin{equation}\label{eqn:fibonacci:1180}
    F_1 = \inv{2}(1 + 1) + C = 1,
    \end{equation}
    so \( C = 0 \), and we find Gauss’s summation formula
    \begin{equation}\label{eqn:fibonacci:1200}
    \sum_{k = 1}^n k = \frac{n}{2}\lr{ n + 1},
    \end{equation}
    as expected.
  • \( m = 2 \). Now let’s do the sum of squares
    \begin{equation}\label{eqn:fibonacci:1120}
    F_n = \sum_{k = 1}^n k^2,
    \end{equation}
    for which we have
    \begin{equation}\label{eqn:fibonacci:1140}
    \Delta F_n = n^2 = \Delta \frac{n}{6}( 2 n + 1 )(n+1),
    \end{equation}
    so
    \begin{equation}\label{eqn:fibonacci:1160}
    F_n = \frac{n}{6}( 2 n + 1 )(n+1) + C.
    \end{equation}
    Clearly \( C = 0 \) satisfies the boundary condition, leaving
    \begin{equation}\label{eqn:fibonacci:1220}
    \sum_{k = 1}^n k^2 =
    \frac{n}{6}( 2 n + 1 )(n+1).
    \end{equation}
  • \( m = 3 \). We see the pattern, so for the sum of cubes, we can just write down the answer
    \begin{equation}\label{eqn:fibonacci:1240}
    \sum_{k = 1}^n k^3 =
    \frac{n^2}{4}\lr{n + 1}^2
    .
    \end{equation}
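
As a quick numeric sanity check (a small Python sketch of my own, not part of the original post), all three closed forms above agree with brute-force summation:

# Compare the closed forms against brute-force sums for small n.
def power_sum(n, m):
    return sum(k**m for k in range(1, n + 1))

for n in range(1, 20):
    assert power_sum(n, 1) == n*(n + 1)//2
    assert power_sum(n, 2) == n*(2*n + 1)*(n + 1)//6
    assert power_sum(n, 3) == n**2*(n + 1)**2//4
print("power sum formulas check out")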

Now that we have some basic comfort with the ideas of difference equations, and their solutions,
let’s get back to the Fibonacci problem. In that case, we have
\begin{equation}\label{eqn:fibonacci:1260}
F_n = F_{n-1} + F_{n-2}.
\end{equation}
Stated as a difference equation, this is
\begin{equation}\label{eqn:fibonacci:1280}
\Delta F_n = F_{n-2}.
\end{equation}
Before tackling the Fibonacci problem, let’s try one that is slightly simpler.

Problem: A simpler problem.

Solve \( \Delta F_n = F_{n-1} \), where \( F_0 = 0, F_1 = 1 \).

Answer

The problem to solve is just
\begin{equation}\label{eqn:fibonacci:1300}
F_n = 2 F_{n-1}.
\end{equation}
This sequence is \( \setlr{ 1, 2, 4, 8, \cdots } \), so we can solve it by inspection, and the answer is just \( F_n = 2^{n-1} \). We want inspiration for the Fibonacci problem, so let’s pretend that we can’t see the answer, but that we can guess something close, and see if it works. Namely, let’s guess:
\begin{equation}\label{eqn:fibonacci:1320}
F_n = \alpha a^n + C.
\end{equation}
If we plug this trial solution into our difference equation, we get
\begin{equation}\label{eqn:fibonacci:1340}
\begin{aligned}
\alpha a^{n-1} + C
&=
\Delta F_n \\
&= \alpha \lr{ a^n - a^{n-1} } \\
&= \alpha a^{n-1} \lr{ a - 1 }.
\end{aligned}
\end{equation}
This can be satisfied by setting \( C = 0 \) and \( a - 1 = 1 \), or \( a = 2 \), as we already knew. To fix the constant \( \alpha \) we utilize our boundary constraints, namely
\begin{equation}\label{eqn:fibonacci:1400}
F_1 = 1 = 2 \alpha,
\end{equation}
so \(\alpha = 1/2 \).

Compared to just seeing the answer, the procedure above was a lot of work. However, a side effect of this work is discovery of a guessing strategy that is somewhat like using \( f(t) = e^{s t} \) to generate a characteristic equation when solving a differential equation. For a difference equation of this form, it appears we can substitute \( F_n = \alpha a^n + C \) and use the differences to determine the values of \( \alpha, a, C \). Now let’s try this with the Fibonacci difference equation.

Problem: Find a solution to the Fibonacci difference equation.

Without worrying about boundary constraints, find the solutions to \( \Delta F_n = F_{n-2} \), using a trial solution of \( F_n = \alpha a^n \).

Answer

Inserting our trial solution, we have
\begin{equation}\label{eqn:fibonacci:1420}
\begin{aligned}
\alpha a^n
&=
F_n \\
&= F_{n-1} + F_{n-2} \\
&= \alpha \lr{ a^{n-1} + a^{n-2} } \\
&= \alpha a^{n-2} \lr{ a + 1 },
\end{aligned}
\end{equation}
so our “characteristic equation” is
\begin{equation}\label{eqn:fibonacci:1440}
a + 1 = a^2.
\end{equation}
Completing the square yields
\begin{equation}\label{eqn:fibonacci:1460}
\lr{ a - \inv{2} }^2 = 1 + \inv{4},
\end{equation}
or
\begin{equation}\label{eqn:fibonacci:1480}
a = \inv{2} \pm \frac{\sqrt{5}}{2}.
\end{equation}
Bam! There’s our golden ratio, and its buddy!
We find that
\begin{equation}\label{eqn:fibonacci:1500}
F_n = \alpha \lr{\frac{1 \pm \sqrt{5}}{2} }^n,
\end{equation}
are solutions to the difference equation \ref{eqn:fibonacci:1280}.

Since we have a second order difference equation, we need a superposition of both solutions to try to satisfy the boundary conditions. In particular, we want to find the constants
\begin{equation}\label{eqn:fibonacci:1520}
F_n =
\alpha_{+} \lr{\frac{1 + \sqrt{5}}{2} }^n
+
\alpha_{-} \lr{\frac{1 - \sqrt{5}}{2} }^n + C.
\end{equation}
However, we already did this when we used \( F_n = \alpha a^n + \beta b^n \) as a trial solution. When we did that, it was just to see if we could find the end result, knowing only the structure of the solution, but none of the specific constants. Now we have justified why that was a reasonable trial solution, since exactly this structure follows naturally from the difference equation itself.
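
As a final numeric check (my own Python sketch, not from the post), the closed form built from these roots and boundary conditions reproduces the recurrence exactly:

from math import sqrt

# Theorem 1.1: F_n = ((1 + sqrt(5))^n - (1 - sqrt(5))^n) / (2^n sqrt(5)),
# i.e. the superposition above with alpha_+ = -alpha_- = 1/sqrt(5) and C = 0.
def fib_closed(n):
    s5 = sqrt(5)
    return ((1 + s5)**n - (1 - s5)**n) / (2**n * s5)

def fib_recursive(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

for n in range(30):
    assert round(fib_closed(n)) == fib_recursive(n)
print("closed form matches the recurrence")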

This train of thought makes me want to dig out my little Dover book on difference equations [1] that I’ve had since I was a kid. I think I only worked through the first chapter of that book. I have a lot of little sad neglected Dover books on mathematics and physics that I bought super cheap at the World’s Biggest Bookstore when I was back in school. It will be interesting to see how to tackle problems such as this, in a still more systematic fashion.

References

[1] Hyman Levy and Freda Lessman. Finite difference equations. Courier Corporation, 1992.

Guessing the nth Fibonacci number formula

November 17, 2020


My last two posts:

  1. The nth term of a Fibonacci series.
  2. More on that cool Fibonacci formula.

were both about a cool formula for the n-th term of the Fibonacci series.  Looks like I’m not done playing with this beastie.  A recap:

Definition 1.1: Fibonacci series.

With \( F_0 = 0 \), and \( F_1 = 1 \), the nth term \( F_n \) in the Fibonacci series is the sum of the previous two terms
\begin{equation*}
F_n = F_{n-2} + F_{n-1}.
\end{equation*}

Theorem 1.1: Nth term of the Fibonacci series.

\begin{equation*}
F_n = \frac{ \lr{ 1 + \sqrt{5} }^n - \lr{ 1 - \sqrt{5} }^n }{ 2^n \sqrt{5} }.
\end{equation*}

 

The guess.

We can rearrange the formula for the nth Fibonacci number as a difference equation
\begin{equation}\label{eqn:fibonacci:260}
F_n - F_{n-1} = F_{n-2}.
\end{equation}
This is a second order difference equation, so my naive expectation is that there are two particular solutions involved. We know the answer, so it’s not too hard to guess that the solution has the following form
\begin{equation}\label{eqn:fibonacci:280}
F_n = \alpha a^n + \beta b^n.
\end{equation}
Given this guess, can we take some of the magic out of the formula, by just solving for \( \alpha, \beta, a, b \)? Let’s try that
\begin{equation}\label{eqn:fibonacci:300}
F_0 = \alpha + \beta = 0,
\end{equation}
so \( \beta = -\alpha \), and
\begin{equation}\label{eqn:fibonacci:320}
\begin{aligned}
F_1 &= \alpha a + \beta b \\
&= \alpha \lr{ a - b } \\
&= 1,
\end{aligned}
\end{equation}
and
\begin{equation}\label{eqn:fibonacci:340}
\begin{aligned}
F_n
&= F_{n-1} + F_{n-2} \\
&=
\alpha \lr{ a^{n-1} + a^{n-2} }
-\alpha \lr{ b^{n-1} + b^{n-2} } \\
&=
\alpha a^{n-2} \lr{ 1 + a }
-\alpha b^{n-2} \lr{ 1 + b },
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:fibonacci:360}
\begin{aligned}
a^2 &= a + 1 \\
b^2 &= b + 1.
\end{aligned}
\end{equation}
If we complete the square we find
\begin{equation}\label{eqn:fibonacci:380}
\lr{ a - \inv{2} }^2 = 1 + \inv{4} = \frac{5}{4},
\end{equation}
or
\begin{equation}\label{eqn:fibonacci:400}
a, b = \inv{2} \pm \frac{\sqrt{5}}{2}.
\end{equation}
Out pop the golden ratio and its complement. Clearly we need to pick different roots for \( a \) and \( b \), or else we’d have zero for every value of \( n > 0 \). Suppose we pick the positive root for \( a \); then to find the scaling constant \( \alpha \), we just compute
\begin{equation}\label{eqn:fibonacci:420}
\begin{aligned}
1
&=
\alpha \lr{ \frac{ 1 + \sqrt{5}}{2} - \frac{ 1 - \sqrt{5} }{2} } \\
&= \alpha \sqrt{5},
\end{aligned}
\end{equation}
so our system \ref{eqn:fibonacci:280} has the solution:
\begin{equation}\label{eqn:fibonacci:520}
\begin{aligned}
a &= \frac{1 + \sqrt{5}}{2} \\
b &= \frac{1 - \sqrt{5}}{2} \\
\alpha &= \inv{\sqrt{5}} \\
\beta &= -\inv{\sqrt{5}}.
\end{aligned}
\end{equation}

We now see a path that will systematically lead us from the Fibonacci difference equation to the final result, and have only to fill in a few missing steps to understand how this could be discovered from scratch.

Motivating the root-fives.

I showed this to Sofia, and she came up with a neat, very direct way to motivate the \( \sqrt{5} \). It follows naturally (again, knowing the answer) by assuming the Fibonacci formula has the following form:
\begin{equation}\label{eqn:fibonacci:440}
F_n = \inv{x} \lr{
\lr{ \frac{1 + x}{2}}^n
-
\lr{ \frac{1 - x}{2}}^n
}.
\end{equation}
We have only to plug in \( n = 3 \) to find
\begin{equation}\label{eqn:fibonacci:460}
\begin{aligned}
2 x
&= \inv{8} \lr{ 1 + 3 x + 3 x^2 + x^3 - \lr{ 1 - 3 x + 3 x^2 - x^3 } } \\
&= \inv{4} \lr{ 3 x + x^3 },
\end{aligned}
\end{equation}
or
\begin{equation}\label{eqn:fibonacci:480}
8 = 3 + x^2,
\end{equation}
so
\begin{equation}\label{eqn:fibonacci:500}
x = \pm \sqrt{5}.
\end{equation}
Again the \( \sqrt{5} \)’s pop out naturally, taking away some of the mystery of the cool formula.
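
Sofia’s \( n = 3 \) trick is also a one-liner to check symbolically; here is a sympy sketch of my own (the spurious \( x = 0 \) root comes from having multiplied through by \( x \)):

from sympy import symbols, solve, Eq

x = symbols('x')
# F_3 = 2 with the assumed form F_n = (((1 + x)/2)^n - ((1 - x)/2)^n)/x.
eq = Eq(((1 + x)/2)**3 - ((1 - x)/2)**3, 2*x)
print(solve(eq, x))  # roots: 0 and +/- sqrt(5)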

More on that cool Fibonacci formula

November 15, 2020


In my previous post, I explored the following cool formula for the nth term of the Fibonacci series. In this post, I’ll show why there are no square root fives after evaluation. A recap:

Definition 1.1: Fibonacci series.

With \( F_0 = 0 \), and \( F_1 = 1 \), the nth term \( F_n \) in the Fibonacci series is the sum of the previous two terms
\begin{equation*}
F_n = F_{n-2} + F_{n-1}.
\end{equation*}

Theorem 1.1: Nth term of the Fibonacci series.

\begin{equation*}
F_n = \frac{ \lr{ 1 + \sqrt{5} }^n - \lr{ 1 - \sqrt{5} }^n }{ 2^n \sqrt{5} }.
\end{equation*}

How the square root fives cancel out.

One of the interesting things about this Fibonacci formula is the \( \sqrt{5} \)’s that are all over the place, even though the formula represents only integer values. Expanding the formula in binomial series shows us exactly why those terms all vanish. Consider the first few values of \( n \) explicitly.
\begin{equation}\label{eqn:fibonacci:160}
\begin{aligned}
F_1
&= \frac{ 1 + \sqrt{5} - \lr{ 1 - \sqrt{5} } }{ 2^1 \sqrt{5} } \\
&= \frac{ 2 \sqrt{5} }{ 2^1 \sqrt{5} } \\
&= 1,
\end{aligned}
\end{equation}
\begin{equation}\label{eqn:fibonacci:180}
\begin{aligned}
F_2
&= \frac{ 1 + 2 \sqrt{5} + 5 - \lr{ 1 - 2 \sqrt{5} + 5 } }{ 2^2 \sqrt{5} } \\
&= \frac{ 4 \sqrt{5} }{ 2^2 \sqrt{5} } \\
&= 1,
\end{aligned}
\end{equation}
\begin{equation}\label{eqn:fibonacci:200}
\begin{aligned}
F_3
&= \frac{ 1 + 3 \sqrt{5} + 3 (5) + 5 \sqrt{5} - \lr{ 1 - 3 \sqrt{5} + 3(5) - 5 \sqrt{5} } }{ 2^3 \sqrt{5} } \\
&= \frac{ 2 \lr{ 3 \sqrt{5} + 5 \sqrt{5} } }{ 2^3 \sqrt{5} } \\
&= \frac{ 3 + 5 }{ 2^2 } \\
&= 2.
\end{aligned}
\end{equation}
In the general case, we have
\begin{equation}\label{eqn:fibonacci:220}
\begin{aligned}
2^n \sqrt{5} F_n
&=
\sum_{k = 0}^n
\binom{n}{k}
{\sqrt{5}}^k
-
\sum_{k = 0}^n \binom{n}{k} (-\sqrt{5})^k \\
&=
2 \sum_{1 \le k \le n, \mbox{$k$ is odd}} \binom{n}{k} (\sqrt{5})^k \\
&=
2 \sqrt{5} \sum_{m = 0}^{\lfloor (n-1)/2 \rfloor} \binom{n}{2 m + 1} 5^m,
\end{aligned}
\end{equation}

so (for any \( n > 0 \)),
\begin{equation}\label{eqn:fibonacci:240}
F_n =
\inv{2^{n-1}} \sum_{m = 0}^{\lfloor (n-1)/2 \rfloor } \binom{n}{2 m + 1} 5^m.
\end{equation}
Since only the odd powers of \( \sqrt{5} \) in the binomial expansions survive, the root in the basement is obliterated every time, leaving only integers upstairs, and a power of two factor downstairs. It still seems somewhat remarkable that there is always a perfect cancellation of all the factors of two in the basement as well.
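
Here is a minimal Python check of my own (not part of the post) that this binomial form, with the \( \sqrt{5} \)’s eliminated, still reproduces the integer Fibonacci numbers, and that the power of two downstairs always cancels:

from math import comb
from fractions import Fraction

def fib_binomial(n):
    # F_n = (1/2^(n-1)) sum_m binom(n, 2m+1) 5^m, for n > 0.
    total = sum(comb(n, 2*m + 1) * 5**m for m in range((n - 1)//2 + 1))
    return Fraction(total, 2**(n - 1))

def fib_recursive(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

for n in range(1, 40):
    value = fib_binomial(n)
    assert value.denominator == 1  # the 2^(n-1) always cancels
    assert value == fib_recursive(n)
print("binomial form reproduces the Fibonacci numbers")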

The nth term of a Fibonacci series.

November 13, 2020


I’ve just started reading [1], but already got distracted from the plot by a fun math fact. Namely, a cute formula for the nth term of a Fibonacci series. Recall

Definition 1.1: Fibonacci series.

With \( F_0 = 0 \), and \( F_1 = 1 \), the nth term \( F_n \) in the Fibonacci series is the sum of the previous two terms
\begin{equation*}
F_n = F_{n-2} + F_{n-1}.
\end{equation*}

We can quickly find that the series has values \( 0, 1, 1, 2, 3, 5, 8, 13, \cdots \). What’s really cool is that there’s a closed form expression for the nth term in the series that doesn’t require calculation of all the previous terms.

Theorem 1.1: Nth term of the Fibonacci series.

\begin{equation*}
F_n = \frac{ \lr{ 1 + \sqrt{5} }^n - \lr{ 1 - \sqrt{5} }^n }{ 2^n \sqrt{5} }.
\end{equation*}

This is a rather miraculous and interesting looking equation. Other than the \(\sqrt{5}\) scale factor, this is exactly the difference of the nth powers of the golden ratio \( \phi = (1+\sqrt{5})/2 \), and \( 1 - \phi = (1-\sqrt{5})/2 \). That is:
\begin{equation}\label{eqn:fibonacci:60}
F_n = \frac{\phi^n - (1-\phi)^n}{\sqrt{5}}.
\end{equation}

How on Earth would somebody figure this out? According to Tattersall [2], this relationship was discovered by Kepler.

Understanding this from the ground up looks like it’s a pretty deep rabbit hole to dive into. Let’s save that game for another day, but try the more pedestrian task of proving that this formula works.

Start proof:

\begin{equation}\label{eqn:fibonacci:80}
\begin{aligned}
\sqrt{5} F_n
&=
\sqrt{5} \lr{ F_{n-2} + F_{n-1} } \\
&=
\phi^{n-2} - \lr{ 1 - \phi}^{n-2}
+ \phi^{n-1} - \lr{ 1 - \phi}^{n-1} \\
&=
\phi^{n-2} \lr{ 1 + \phi }
-\lr{1 - \phi}^{n-2} \lr{ 1 + 1 - \phi } \\
&=
\phi^{n-2}
\frac{ 3 + \sqrt{5} }{2}
-\lr{1 - \phi}^{n-2}
\frac{ 3 - \sqrt{5} }{2}.
\end{aligned}
\end{equation}
However,
\begin{equation}\label{eqn:fibonacci:100}
\begin{aligned}
\phi^2
&= \lr{ \frac{ 1 + \sqrt{5} }{2} }^2 \\
&= \frac{ 1 + 2 \sqrt{5} + 5 }{4} \\
&= \frac{ 3 + \sqrt{5} }{2},
\end{aligned}
\end{equation}
and
\begin{equation}\label{eqn:fibonacci:120}
\begin{aligned}
(1-\phi)^2
&= \lr{ \frac{ 1 - \sqrt{5} }{2} }^2 \\
&= \frac{ 1 - 2 \sqrt{5} + 5 }{4} \\
&= \frac{ 3 - \sqrt{5} }{2},
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:fibonacci:140}
\sqrt{5} F_n = \phi^n - (1-\phi)^n.
\end{equation}

End proof.

References

[1] Steven Strogatz and Don Joffray. The calculus of friendship: What a teacher and a student learned about life while corresponding about math. Princeton University Press, 2009.

[2] James J Tattersall. Elementary number theory in nine chapters. Cambridge University Press, 2005.

More satisfying editing of classical mechanics notes.

November 3, 2020

I’ve purged about 30 pages of material related to field Lagrangian densities and Maxwell’s equation, replacing it with about 8 pages of new, less incoherent material.

As before, I’ve physically ripped out all the pages that have been replaced, which is satisfying, and makes it easier to see what is left to review.

The new version is now reduced to 333 pages, close to a 100 page reduction from the original mess.  I may print myself a new physical copy, as I’ve moved things around so much that I have to search the LaTeX source to figure out where to make changes.

Maxwell’s equation Lagrangian (geometric algebra and tensor formalism)

November 1, 2020


Maxwell’s equation using geometric algebra Lagrangian.

Motivation.

In my classical mechanics notes, I’ve got computations of Maxwell’s equation (singular in its geometric algebra form) from a Lagrangian in various ways (using tensor, scalar and multivector Lagrangians), but all of these seem more convoluted than they should be.
Here we do this from scratch, starting with the action principle for field variables, covering:

  • Derivation of the relativistic form of the Euler-Lagrange field equations from the covariant form of the action,
  • Derivation of Maxwell’s equation (in its STA form) from the Maxwell Lagrangian,
  • Relationship of the STA Maxwell Lagrangian to the tensor equivalent,
  • Relationship of the STA form of Maxwell’s equation to its tensor equivalents,
  • Relationship of the STA Maxwell’s equation to its conventional Gibbs form,
  • Demonstration that we may use a multivector valued Lagrangian with all of \( F^2 \), not just its scalar part.

It is assumed that the reader is thoroughly familiar with the STA formalism, and if that is not the case, there is no better reference than [1].

Field action.

Theorem 1.1: Relativistic Euler-Lagrange field equations.

Let \( \phi \rightarrow \phi + \delta \phi \) be any variation of the field, such that the variation
\( \delta \phi \) vanishes at the boundaries of the action integral
\begin{equation}\label{eqn:maxwells:2120}
S = \int d^4 x \LL(\phi, \partial_\nu \phi).
\end{equation}
The extreme value of the action is found when the Euler-Lagrange equations
\begin{equation}\label{eqn:maxwells:2140}
0 = \PD{\phi}{\LL} - \partial_\nu \PD{(\partial_\nu \phi)}{\LL},
\end{equation}
are satisfied. For a Lagrangian with multiple field variables, there will be one such equation for each field.

Start proof:

To ease the visual burden, designate the variation of the field by \( \delta \phi = \epsilon \), and perform a first order expansion of the varied Lagrangian
\begin{equation}\label{eqn:maxwells:20}
\begin{aligned}
\LL
&\rightarrow
\LL(\phi + \epsilon, \partial_\nu (\phi + \epsilon)) \\
&=
\LL(\phi, \partial_\nu \phi)
+
\PD{\phi}{\LL} \epsilon +
\PD{(\partial_\nu \phi)}{\LL} \partial_\nu \epsilon.
\end{aligned}
\end{equation}
The variation of the Lagrangian is
\begin{equation}\label{eqn:maxwells:40}
\begin{aligned}
\delta \LL
&=
\PD{\phi}{\LL} \epsilon +
\PD{(\partial_\nu \phi)}{\LL} \partial_\nu \epsilon \\
&=
\PD{\phi}{\LL} \epsilon +
\partial_\nu \lr{ \PD{(\partial_\nu \phi)}{\LL} \epsilon }
-
\epsilon \partial_\nu \PD{(\partial_\nu \phi)}{\LL},
\end{aligned}
\end{equation}
which we may plug into the action integral to find
\begin{equation}\label{eqn:maxwells:60}
\delta S
=
\int d^4 x \epsilon \lr{
\PD{\phi}{\LL}
-
\partial_\nu \PD{(\partial_\nu \phi)}{\LL}
}
+
\int d^4 x
\partial_\nu \lr{ \PD{(\partial_\nu \phi)}{\LL} \epsilon }.
\end{equation}
The last integral can be evaluated along the \( dx^\nu \) direction, leaving
\begin{equation}\label{eqn:maxwells:80}
\int d^3 x
\evalbar{ \PD{(\partial_\nu \phi)}{\LL} \epsilon }{\Delta x^\nu},
\end{equation}
where \( d^3 x = dx^\alpha dx^\beta dx^\gamma \) is the product of differentials that does not include \( dx^\nu \). By construction, \( \epsilon \) vanishes on the boundary of the action integral so \ref{eqn:maxwells:80} is zero. The action takes its extreme value when
\begin{equation}\label{eqn:maxwells:100}
0 = \delta S
=
\int d^4 x \epsilon \lr{
\PD{\phi}{\LL}
-
\partial_\nu \PD{(\partial_\nu \phi)}{\LL}
}.
\end{equation}
The proof is complete after noting that this must hold for all variations of the field \( \epsilon \), which means that we must have
\begin{equation}\label{eqn:maxwells:120}
0 =
\PD{\phi}{\LL}
-
\partial_\nu \PD{(\partial_\nu \phi)}{\LL}.
\end{equation}

End proof.
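
Before tackling the Maxwell Lagrangian, a standard warm-up example (not part of the original post) shows the machinery in action. For a free real scalar field with
\begin{equation*}
\LL = \inv{2} \partial_\nu \phi \, \partial^\nu \phi - \inv{2} m^2 \phi^2,
\end{equation*}
we have \( \PD{\phi}{\LL} = -m^2 \phi \) and \( \PD{(\partial_\nu \phi)}{\LL} = \partial^\nu \phi \), so \ref{eqn:maxwells:2140} reduces to the Klein-Gordon equation \( \partial_\nu \partial^\nu \phi + m^2 \phi = 0 \).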

Armed with the Euler-Lagrange equations, we can apply them to the Maxwell’s equation Lagrangian, which we will claim has the following form.

Theorem 1.2: Maxwell’s equation Lagrangian.

Application of the Euler-Lagrange equations to the Lagrangian
\begin{equation}\label{eqn:maxwells:2160}
\LL = - \frac{\epsilon_0 c}{2} F \cdot F + J \cdot A,
\end{equation}
where \( F = \grad \wedge A \), yields the vector portion of Maxwell’s equation
\begin{equation}\label{eqn:maxwells:2180}
\grad \cdot F = \inv{\epsilon_0 c} J,
\end{equation}
which implies
\begin{equation}\label{eqn:maxwells:2200}
\grad F = \inv{\epsilon_0 c} J.
\end{equation}
This is Maxwell’s equation.

Start proof:

We wish to apply all of the Euler-Lagrange equations simultaneously (i.e. once for each of the four \(A_\mu\) components of the potential), and cast them into four-vector form
\begin{equation}\label{eqn:maxwells:140}
0 = \gamma_\nu \lr{ \PD{A_\nu}{} - \partial_\mu \PD{(\partial_\mu A_\nu)}{} } \LL.
\end{equation}
Since our Lagrangian splits nicely into kinetic and interaction terms, this gives us
\begin{equation}\label{eqn:maxwells:160}
0 = \gamma_\nu \lr{ \PD{A_\nu}{(A \cdot J)} + \frac{\epsilon_0 c}{2} \partial_\mu \PD{(\partial_\mu A_\nu)}{ (F \cdot F)} }.
\end{equation}
The interaction term above is just
\begin{equation}\label{eqn:maxwells:180}
\gamma_\nu \PD{A_\nu}{(A \cdot J)}
=
\gamma_\nu \PD{A_\nu}{(A_\mu J^\mu)}
=
\gamma_\nu J^\nu
=
J,
\end{equation}
but the kinetic term takes a bit more work. Let’s start with evaluating
\begin{equation}\label{eqn:maxwells:200}
\begin{aligned}
\PD{(\partial_\mu A_\nu)}{ (F \cdot F)}
&=
\PD{(\partial_\mu A_\nu)}{ F } \cdot F
+
F \cdot \PD{(\partial_\mu A_\nu)}{ F } \\
&=
2 \PD{(\partial_\mu A_\nu)}{ F } \cdot F \\
&=
2 \PD{(\partial_\mu A_\nu)}{ (\partial_\alpha A_\beta) } \lr{ \gamma^\alpha \wedge \gamma^\beta } \cdot F \\
&=
2 \lr{ \gamma^\mu \wedge \gamma^\nu } \cdot F.
\end{aligned}
\end{equation}
We hit this with the \(\mu\)-partial and expand as a scalar selection to find
\begin{equation}\label{eqn:maxwells:220}
\begin{aligned}
\partial_\mu \PD{(\partial_\mu A_\nu)}{ (F \cdot F)}
&=
2 \lr{ \partial_\mu \gamma^\mu \wedge \gamma^\nu } \cdot F \\
&=
- 2 (\gamma^\nu \wedge \grad) \cdot F \\
&=
- 2 \gpgradezero{ (\gamma^\nu \wedge \grad) F } \\
&=
- 2 \gpgradezero{ \gamma^\nu \grad F - \gamma^\nu \cdot \grad F } \\
&=
- 2 \gamma^\nu \cdot \lr{ \grad \cdot F }.
\end{aligned}
\end{equation}
Putting all the pieces together yields
\begin{equation}\label{eqn:maxwells:240}
0
= J - \epsilon_0 c \gamma_\nu \lr{ \gamma^\nu \cdot \lr{ \grad \cdot F } }
= J - \epsilon_0 c \lr{ \grad \cdot F },
\end{equation}
but
\begin{equation}\label{eqn:maxwells:260}
\begin{aligned}
\grad \cdot F
&=
\grad F - \grad \wedge F \\
&=
\grad F - \grad \wedge (\grad \wedge A) \\
&=
\grad F,
\end{aligned}
\end{equation}
so the multivector field equations for this Lagrangian are
\begin{equation}\label{eqn:maxwells:280}
\grad F = \inv{\epsilon_0 c} J,
\end{equation}
as claimed.

End proof.

Problem: Correspondence with tensor formalism.

Cast the Lagrangian of \ref{eqn:maxwells:2160} into the conventional tensor form
\begin{equation}\label{eqn:maxwells:300}
\LL = \frac{\epsilon_0 c}{4} F_{\mu\nu} F^{\mu\nu} + A^\mu J_\mu.
\end{equation}
Also show that the four-vector component of Maxwell’s equation \( \grad \cdot F = J/(\epsilon_0 c) \) is equivalent to the conventional tensor form of the Gauss-Ampere law
\begin{equation}\label{eqn:maxwells:320}
\partial_\mu F^{\mu\nu} = \inv{\epsilon_0 c} J^\nu,
\end{equation}
where \( F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu \) as usual. Also show that the trivector component of Maxwell’s equation \( \grad \wedge F = 0 \) is equivalent to the tensor form of the Gauss-Faraday law
\begin{equation}\label{eqn:maxwells:340}
\partial_\alpha \lr{ \epsilon^{\alpha \beta \mu \nu} F_{\mu\nu} } = 0.
\end{equation}

Answer

To show the Lagrangian correspondence we must expand \( F \cdot F \) in coordinates
\begin{equation}\label{eqn:maxwells:360}
\begin{aligned}
F \cdot F
&=
( \grad \wedge A ) \cdot
( \grad \wedge A ) \\
&=
\lr{ (\gamma^\mu \partial_\mu) \wedge (\gamma^\nu A_\nu) }
\cdot
\lr{ (\gamma^\alpha \partial_\alpha) \wedge (\gamma^\beta A_\beta) } \\
&=
\lr{ \gamma^\mu \wedge \gamma^\nu } \cdot \lr{ \gamma_\alpha \wedge \gamma_\beta }
(\partial_\mu A_\nu )
(\partial^\alpha A^\beta ) \\
&=
\lr{
{\delta^\mu}_\beta
{\delta^\nu}_\alpha
-
{\delta^\mu}_\alpha
{\delta^\nu}_\beta
}
(\partial_\mu A_\nu )
(\partial^\alpha A^\beta ) \\
&=
- \partial_\mu A_\nu \lr{
\partial^\mu A^\nu
-
\partial^\nu A^\mu
} \\
&=
- \partial_\mu A_\nu F^{\mu\nu} \\
&=
- \inv{2} \lr{
\partial_\mu A_\nu F^{\mu\nu}
+
\partial_\nu A_\mu F^{\nu\mu}
} \\
&=
- \inv{2} \lr{
\partial_\mu A_\nu
-
\partial_\nu A_\mu
}
F^{\mu\nu} \\
&=
-
\inv{2}
F_{\mu\nu}
F^{\mu\nu}.
\end{aligned}
\end{equation}
With a substitution of this and \( A \cdot J = A_\mu J^\mu \) back into the Lagrangian, we recover the tensor form of the Lagrangian.

To recover the tensor form of Maxwell’s equation, we first split it into vector and trivector parts
\begin{equation}\label{eqn:maxwells:1580}
\grad \cdot F + \grad \wedge F = \inv{\epsilon_0 c} J.
\end{equation}
Now the vector component may be expanded in coordinates by dotting both sides with \( \gamma^\nu \) to find
\begin{equation}\label{eqn:maxwells:1600}
\inv{\epsilon_0 c} \gamma^\nu \cdot J = \inv{\epsilon_0 c} J^\nu,
\end{equation}
and
\begin{equation}\label{eqn:maxwells:1620}
\begin{aligned}
\gamma^\nu \cdot
\lr{ \grad \cdot F }
&=
\partial_\mu \gamma^\nu \cdot \lr{ \gamma^\mu \cdot \lr{ \gamma_\alpha \wedge \gamma_\beta } \partial^\alpha A^\beta } \\
&=
\lr{
{\delta^\mu}_\alpha
{\delta^\nu}_\beta
-
{\delta^\nu}_\alpha
{\delta^\mu}_\beta
}
\partial_\mu
\partial^\alpha A^\beta \\
&=
\partial_\mu
\lr{
\partial^\mu A^\nu
-
\partial^\nu A^\mu
} \\
&=
\partial_\mu F^{\mu\nu}.
\end{aligned}
\end{equation}
Equating \ref{eqn:maxwells:1600} and \ref{eqn:maxwells:1620} finishes the first part of the job. For the trivector component, we have
\begin{equation}\label{eqn:maxwells:1640}
0
= \grad \wedge F
= (\gamma^\mu \partial_\mu) \wedge \lr{ \gamma^\alpha \wedge \gamma^\beta } \partial_\alpha A_\beta
= \inv{2} (\gamma^\mu \partial_\mu) \wedge \lr{ \gamma^\alpha \wedge \gamma^\beta } F_{\alpha \beta}.
\end{equation}
Wedging with \( \gamma^\tau \) and then multiplying by \( -2 I \) we find
\begin{equation}\label{eqn:maxwells:1660}
0 = - \lr{ \gamma^\mu \wedge \gamma^\alpha \wedge \gamma^\beta \wedge \gamma^\tau } I \partial_\mu F_{\alpha \beta},
\end{equation}
but
\begin{equation}\label{eqn:maxwells:1680}
\gamma^\mu \wedge \gamma^\alpha \wedge \gamma^\beta \wedge \gamma^\tau = -I \epsilon^{\mu \alpha \beta \tau},
\end{equation}
which leaves us with
\begin{equation}\label{eqn:maxwells:1700}
\epsilon^{\mu \alpha \beta \tau} \partial_\mu F_{\alpha \beta} = 0,
\end{equation}
as expected.

Problem: Correspondence of tensor and Gibbs forms of Maxwell’s equations.

Given the identifications

\begin{equation}\label{eqn:lorentzForceCovariant:1500}
F^{k0} = E^k,
\end{equation}
and
\begin{equation}\label{eqn:lorentzForceCovariant:1520}
F^{rs} = -\epsilon^{rst} B^t,
\end{equation}
and
\begin{equation}\label{eqn:maxwells:1560}
J^\mu = \lr{ c \rho, \BJ },
\end{equation}
the reader should satisfy themselves that the traditional Gibbs form of Maxwell’s equations can be recovered from \ref{eqn:maxwells:320}.

Answer

The reader is referred to Exercise 3.4 “Electrodynamics, variational principle.” from [2].

Problem: Correspondence with grad and curl form of Maxwell’s equations.

With \( J = c \rho \gamma_0 + J^k \gamma_k \) and \( F = \BE + I c \BB \), show that Maxwell’s equation, as stated in \ref{eqn:maxwells:2200}, expands to the conventional div and curl expressions for Maxwell’s equations.

Answer

To obtain Maxwell’s equations in their traditional vector forms, we pre-multiply both sides with \( \gamma_0 \)
\begin{equation}\label{eqn:maxwells:1720}
\gamma_0 \grad F = \inv{\epsilon_0 c} \gamma_0 J,
\end{equation}
and then select each grade separately. First observe that the RHS above has scalar and bivector components, as
\begin{equation}\label{eqn:maxwells:1740}
\gamma_0 J
=
c \rho + J^k \gamma_0 \gamma_k.
\end{equation}
In terms of the spatial bivector basis \( \Be_k = \gamma_k \gamma_0 \), for which \( \gamma_0 \gamma_k = -\Be_k \), and using \( 1/(\epsilon_0 c) = \mu_0 c \), the RHS of \ref{eqn:maxwells:1720} is
\begin{equation}\label{eqn:maxwells:1760}
\gamma_0 \frac{J}{\epsilon_0 c} = \frac{\rho}{\epsilon_0} - \mu_0 c \BJ.
\end{equation}
For the LHS, first note that
\begin{equation}\label{eqn:maxwells:1780}
\begin{aligned}
\gamma_0 \grad
&=
\gamma_0
\lr{
\gamma_0 \partial^0 +
\gamma_k \partial^k
} \\
&=
\partial_0 - \gamma_0 \gamma_k \partial_k \\
&=
\inv{c} \PD{t}{} + \spacegrad.
\end{aligned}
\end{equation}
We can express all of the LHS of \ref{eqn:maxwells:1720} in the bivector spatial basis, so that Maxwell’s equation in multivector form is
\begin{equation}\label{eqn:maxwells:1800}
\lr{ \inv{c} \PD{t}{} + \spacegrad } \lr{ \BE + I c \BB } = \frac{\rho}{\epsilon_0} - \mu_0 c \BJ.
\end{equation}
Selecting the scalar, vector, bivector, and trivector grades of both sides (in the spatial basis) gives the following set of respective equations
\begin{equation}\label{eqn:maxwells:1840}
\spacegrad \cdot \BE = \frac{\rho}{\epsilon_0}
\end{equation}
\begin{equation}\label{eqn:maxwells:1860}
\inv{c} \partial_t \BE + I c \spacegrad \wedge \BB = - \mu_0 c \BJ
\end{equation}
\begin{equation}\label{eqn:maxwells:1880}
\spacegrad \wedge \BE + I \partial_t \BB = 0
\end{equation}
\begin{equation}\label{eqn:maxwells:1900}
I c \spacegrad \cdot \BB = 0,
\end{equation}
After some duality transformations (and noting that \( \mu_0 \epsilon_0 c^2 = 1 \)), these can be rewritten as
\begin{equation}\label{eqn:maxwells:1940}
\spacegrad \cdot \BE = \frac{\rho}{\epsilon_0}
\end{equation}
\begin{equation}\label{eqn:maxwells:1960}
\spacegrad \cross \BB - \mu_0 \epsilon_0 \PD{t}{\BE} = \mu_0 \BJ
\end{equation}
\begin{equation}\label{eqn:maxwells:1980}
\spacegrad \cross \BE + \PD{t}{\BB} = 0
\end{equation}
\begin{equation}\label{eqn:maxwells:2000}
\spacegrad \cdot \BB = 0,
\end{equation}
which are Maxwell’s equations in their traditional form.

Problem: Alternative multivector Lagrangian.

Show that a scalar+pseudoscalar Lagrangian of the following form
\begin{equation}\label{eqn:maxwells:2220}
\LL = - \frac{\epsilon_0 c}{2} F^2 + J \cdot A,
\end{equation}
which, unlike \ref{eqn:maxwells:2160}, does not restrict the field term to its scalar selection, also represents Maxwell’s equation. Discuss the scalar and pseudoscalar components of \( F^2 \), and show why the pseudoscalar inclusion is irrelevant.

Answer

The quantity \( F^2 = F \cdot F + F \wedge F \) has both scalar and pseudoscalar
components. Note that, unlike a vector, the wedge of a 4D bivector with itself need not be zero (for example, \( \gamma_0 \gamma_1 + \gamma_2 \gamma_3 \) wedged with itself is nonzero).
We can see this multivector nature nicely by expansion in terms of the electric and magnetic fields
\begin{equation}\label{eqn:maxwells:2020}
\begin{aligned}
F^2
&= \lr{ \BE + I c \BB }^2 \\
&= \BE^2 - c^2 \BB^2 + I c \lr{ \BE \BB + \BB \BE } \\
&= \BE^2 - c^2 \BB^2 + 2 I c \BE \cdot \BB.
\end{aligned}
\end{equation}
Both the scalar and pseudoscalar parts of \( F^2 \) are Lorentz invariant, a requirement of our Lagrangian, but most Maxwell equation Lagrangians only include the scalar \( \BE^2 - c^2 \BB^2 \) component of the field square. If we allow the Lagrangian to be multivector valued, and evaluate the Euler-Lagrange equations, we quickly find the same results
\begin{equation}\label{eqn:maxwells:2040}
\begin{aligned}
0
&= \gamma_\nu \lr{ \PD{A_\nu}{} - \partial_\mu \PD{(\partial_\mu A_\nu)}{} } \LL \\
&= \gamma_\nu \lr{ J^\nu + \frac{\epsilon_0 c}{2} \partial_\mu
\lr{
(\gamma^\mu \wedge \gamma^\nu) F
+
F (\gamma^\mu \wedge \gamma^\nu)
}
}.
\end{aligned}
\end{equation}
Here some steps are skipped, building on our previous scalar Euler-Lagrange evaluation experience. We have a symmetric product of two bivectors, which we can express as a 0,4 grade selection, since
\begin{equation}\label{eqn:maxwells:2060}
\gpgrade{ X F }{0,4} = \inv{2} \lr{ X F + F X },
\end{equation}
for any two bivectors \( X, F \). This leaves
\begin{equation}\label{eqn:maxwells:2080}
\begin{aligned}
0
&= J + \epsilon_0 c \gamma_\nu \gpgrade{ (\grad \wedge \gamma^\nu) F }{0,4} \\
&= J + \epsilon_0 c \gamma_\nu \gpgrade{ -\gamma^\nu \grad F + (\gamma^\nu \cdot \grad) F }{0,4} \\
&= J + \epsilon_0 c \gamma_\nu \gpgrade{ -\gamma^\nu \grad F }{0,4} \\
&= J - \epsilon_0 c \gamma_\nu
\lr{
\gamma^\nu \cdot \lr{ \grad \cdot F } + \gamma^\nu \wedge \grad \wedge F
}.
\end{aligned}
\end{equation}
However, since \( \grad \wedge F = \grad \wedge \grad \wedge A = 0 \), we see that there is no contribution from the \( F \wedge F \) pseudoscalar component of the Lagrangian, and we are left with
\begin{equation}\label{eqn:maxwells:2100}
\begin{aligned}
0
&= J - \epsilon_0 c (\grad \cdot F) \\
&= J - \epsilon_0 c \grad F,
\end{aligned}
\end{equation}
which is Maxwell’s equation, as before.

References

[1] C. Doran and A.N. Lasenby. Geometric algebra for physicists. Cambridge University Press New York, Cambridge, UK, 1st edition, 2003.

[2] Peeter Joot. Quantum field theory. Kindle Direct Publishing, 2018.

Some nice positive feedback for my book.

October 31, 2020

Here’s a fun congratulatory email that I received today for my Geometric Algebra for Electrical Engineers book

Peeter ..
I had to email to congratulate you on your geometric algebra book. Like yourself, when I came across it, I was totally blown away and your book, being written from the position of a discoverer rather than an expert, answers most of the questions I was confronted by when reading Doran and Lasenby’s book.
You’re a C++ programmer and from my perspective, when using natural world math, you are constructing a representation of a problem (like code does) except many physicists do not recognize this. They’re doing physics with COBOL (or C with classes!).
congratulations
.. Reader
I couldn’t resist pointing out the irony of his COBOL comment, as my work at LzLabs is now heavily focused on COBOL (and PL/I) compilers and compiler runtimes.  You could say that my work, at work or at play, is all an attempt to transition people away from the evils of legacy COBOL.
For reference, the Doran and Lasenby book is a phenomenal work, but it is really hard material.  To attempt to read this, you’ll need a thorough understanding of electromagnetism, relativity, tensor algebra, quantum mechanics, advanced classical mechanics, and field theory.  I’m still working on this book, and it’s probably been 12 years since I bought it.  I managed to teach myself some of this material as I went, but also took most of the 4th year UofT undergrad physics courses (and some grad courses) to fill in some of the gaps.
When I titled my book, I included “for Electrical Engineers” in the title.  That titling choice was somewhat derivative, as there were already geometric algebra books “for physicists”, and “for computer science”.  However, I thought it was also a good indication of the prerequisites required for the book, as “for Electrical Engineers” seemed to be good shorthand for “for a student that has seen electromagnetism in its div, grad, curl form, and doesn’t know special relativity, field theory, differential forms, tensor algebra, or other topics from more advanced physics.”
The relativistic presentation of electromagnetism in Doran and Lasenby, using the Dirac algebra (aka Space Time Algebra (STA)), is much more beautiful than the form that I have used in my book.  However, I was hoping to present the subject in a way that was accessible, and provided a stepping stone for the STA approach when the reader was ready to tackle a next interval of the “learning curve.”

Lagrangian for the Lorentz force equation.

October 24, 2020


Motivation.

In my old classical mechanics notes it appears that I did covariant derivations of the Lorentz force equations a number of times, using different trial Lagrangians (relativistic and non-relativistic), and using both geometric algebra and tensor methods. However, none of these appear to have been done concisely, and a number not even coherently.

The following document has been drafted as replacement text for those incoherent classical mechanics notes. I’ll attempt to cover

  • a lightning review of the geometric algebra STA (Space Time Algebra),
  • relations between Dirac matrix algebra and STA,
  • derivation of the relativistic form of the Euler-Lagrange equations from the covariant form of the action,
  • relationship of the STA form of the Euler-Lagrange equations to their tensor equivalents,
  • derivation of the Lorentz force equation from the STA Lorentz force Lagrangian,
  • relationship of the STA Lorentz force equation to its equivalent in the tensor formalism,
  • relationship of the STA Lorentz force equation to the traditional vector form.

Note that some of the prerequisite ideas and auxiliary details are presented as problems with solutions. If the reader has sufficient background to attempt those problems themselves, they are encouraged to do so.

The STA and geometric algebra ideas used here are not developed completely enough to learn from in isolation. The reader is referred to [1] for a more complete exposition of both STA and geometric algebra.

Conventions.

Definition 1.1: Index conventions.

Latin indexes \( i, j, k, r, s, t, \cdots \) are used to designate values in the range \( \setlr{ 1,2,3 } \). Greek indexes \( \alpha, \beta, \mu, \nu, \cdots \) are used for indexes of spacetime quantities, taking values in \( \setlr{0,1,2,3} \).
The Einstein convention of implied summation for mixed upper and lower Greek indexes will be used, for example
\begin{equation*}
x^\alpha x_\alpha \equiv \sum_{\alpha = 0}^3 x^\alpha x_\alpha.
\end{equation*}

Space Time Algebra (STA.)

In the geometric algebra literature, the Dirac algebra of quantum field theory has been rebranded Space Time Algebra (STA). The differences between STA and the Dirac theory that uses matrices (\( \gamma_\mu \)) are as follows

  • STA completely omits any representation of the Dirac basis vectors \( \gamma_\mu \). In particular, any possible matrix representation is irrelevant.
  • STA provides a rich set of fundamental operations (grade selection, generalized dot and wedge products for multivector elements, rotation and reflection operations, …)
  • Matrix trace, and commutator and anticommutator operations are nowhere to be found in STA, as geometrically grounded equivalents are available instead.
  • The “slashed” quantities from Dirac theory, such as \( \gamma_\mu p^\mu \), are nothing more than vectors in their entirety in STA (where the basis is no longer implicit, as is the case for coordinates).

Our basis vectors have the following properties.

Definition 1.2: Standard basis.

Let the four-vector standard basis be designated \( \setlr{\gamma_0, \gamma_1, \gamma_2, \gamma_3 } \), where the basis vectors satisfy
\begin{equation}\label{eqn:lorentzForceCovariant:1540}
\begin{aligned}
\gamma_0^2 &= -\gamma_i^2 = 1 \\
\gamma_\alpha \cdot \gamma_\beta &= 0, \forall \alpha \ne \beta.
\end{aligned}
\end{equation}

Problem: Commutator properties of the STA basis.

In Dirac theory, the anticommutation properties of the Dirac matrices are considered fundamental, namely
\begin{equation*}
\symmetric{\gamma_\mu}{\gamma_\nu} = 2 \eta_{\mu\nu}.
\end{equation*}

Show that this follows from the axiomatic assumptions of geometric algebra, and describe how the dot and wedge products are related to the anticommutator and commutator products of Dirac theory.

Answer

The anticommutator is defined as the symmetric sum of products
\begin{equation}\label{eqn:lorentzForceCovariant:1040}
\symmetric{\gamma_\mu}{\gamma_\nu}
\equiv
\gamma_\mu \gamma_\nu
+
\gamma_\nu \gamma_\mu,
\end{equation}
but this is just twice the dot product, whose geometric algebra form is the symmetric part of the product, \( a \cdot b = (a b + b a)/2 \). Observe that the properties of the basis vectors defined in \ref{eqn:lorentzForceCovariant:1540} may be summarized as
\begin{equation}\label{eqn:lorentzForceCovariant:1060}
\gamma_\mu \cdot \gamma_\nu = \eta_{\mu\nu},
\end{equation}
where \( \eta_{\mu\nu} = \text{diag}(+,-,-,-)
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1 \\
\end{bmatrix}
\) is the conventional metric tensor. This means
\begin{equation}\label{eqn:lorentzForceCovariant:1080}
\symmetric{\gamma_\mu}{\gamma_\nu} = 2 \gamma_\mu \cdot \gamma_\nu = 2 \eta_{\mu\nu},
\end{equation}
as claimed.

Similarly, observe that the commutator, defined as the antisymmetric sum of products
\begin{equation}\label{eqn:lorentzForceCovariant:1100}
\antisymmetric{\gamma_\mu}{\gamma_\nu} \equiv
\gamma_\mu \gamma_\nu
-
\gamma_\nu \gamma_\mu,
\end{equation}
is twice the wedge product \( a \wedge b = (a b - b a)/2 \). This provides geometric identifications for the anticommutator and commutator products respectively
\begin{equation}\label{eqn:lorentzForceCovariant:1120}
\begin{aligned}
\symmetric{\gamma_\mu}{\gamma_\nu} &= 2 \gamma_\mu \cdot \gamma_\nu \\
\antisymmetric{\gamma_\mu}{\gamma_\nu} &= 2 \gamma_\mu \wedge \gamma_\nu.
\end{aligned}
\end{equation}

Definition 1.3: Pseudoscalar.

The pseudoscalar for the space is denoted \( I = \gamma_0 \gamma_1 \gamma_2 \gamma_3 \).

Problem: Pseudoscalar.

Show that the STA pseudoscalar \( I \) defined by \ref{eqn:lorentzForceCovariant:1540} satisfies
\begin{equation*}
\tilde{I} = I,
\end{equation*}
where the tilde operator designates reversion. Also show that \( I \) has the properties of an imaginary number
\begin{equation*}
I^2 = -1.
\end{equation*}
Finally, show that, unlike the spatial pseudoscalar that commutes with all grades, \( I \) anticommutes with any vector or trivector, and commutes with any bivector.

Answer

Since \( \gamma_\alpha \gamma_\beta = -\gamma_\beta \gamma_\alpha \) for any \( \alpha \ne \beta \), each interchange of adjacent factors of \( I \) changes the sign once. In particular
\begin{equation}\label{eqn:lorentzForceCovariant:680}
\begin{aligned}
I &=
\gamma_0
\gamma_1
\gamma_2
\gamma_3 \\
&=
-
\gamma_1
\gamma_2
\gamma_3
\gamma_0 \\
&=
-
\gamma_2
\gamma_3
\gamma_1
\gamma_0 \\
&=
+
\gamma_3
\gamma_2
\gamma_1
\gamma_0
= \tilde{I}.
\end{aligned}
\end{equation}
Using this, we have
\begin{equation}\label{eqn:lorentzForceCovariant:700}
\begin{aligned}
I^2
&= I \tilde{I} \\
&=
(
\gamma_0
\gamma_1
\gamma_2
\gamma_3
)(
\gamma_3
\gamma_2
\gamma_1
\gamma_0
) \\
&=
\lr{\gamma_0}^2
\lr{\gamma_1}^2
\lr{\gamma_2}^2
\lr{\gamma_3}^2 \\
&=
(+1)
(-1)
(-1)
(-1) \\
&= -1.
\end{aligned}
\end{equation}
To illustrate the anticommutation property with any vector basis element, consider the following two examples:
\begin{equation}\label{eqn:lorentzForceCovariant:720}
\begin{aligned}
I \gamma_0 &=
\gamma_0
\gamma_1
\gamma_2
\gamma_3
\gamma_0 \\
&=
-
\gamma_0
\gamma_0
\gamma_1
\gamma_2
\gamma_3 \\
&=
-
\gamma_0 I,
\end{aligned}
\end{equation}
\begin{equation}\label{eqn:lorentzForceCovariant:740}
\begin{aligned}
I \gamma_2
&=
\gamma_0
\gamma_1
\gamma_2
\gamma_3
\gamma_2 \\
&=
-
\gamma_0
\gamma_1
\gamma_2
\gamma_2
\gamma_3 \\
&=
-
\gamma_2
\gamma_0
\gamma_1
\gamma_2
\gamma_3 \\
&= -\gamma_2 I.
\end{aligned}
\end{equation}
A total of three sign swaps is required to “percolate” any given \(\gamma_\alpha\) through the factors of \( I \), resulting in an overall sign change of \( -1 \).

For any bivector basis element \( \alpha \ne \beta \)
\begin{equation}\label{eqn:lorentzForceCovariant:760}
\begin{aligned}
I \gamma_\alpha \gamma_\beta
&=
-\gamma_\alpha I \gamma_\beta \\
&=
+\gamma_\alpha \gamma_\beta I.
\end{aligned}
\end{equation}

Similarly for any trivector basis element \( \alpha \ne \beta \ne \sigma \)
\begin{equation}\label{eqn:lorentzForceCovariant:780}
\begin{aligned}
I \gamma_\alpha \gamma_\beta \gamma_\sigma
&=
-\gamma_\alpha I \gamma_\beta \gamma_\sigma \\
&=
+\gamma_\alpha \gamma_\beta I \gamma_\sigma \\
&=
-\gamma_\alpha \gamma_\beta \gamma_\sigma I.
\end{aligned}
\end{equation}
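
Although STA deliberately avoids any matrix representation, these pseudoscalar properties are easy to cross-check numerically with the standard Dirac-representation matrices. Here is a minimal numpy sketch of my own (the particular matrix representation is an assumption for the check only, not something the STA development depends on):

import numpy as np

# Dirac-representation gamma matrices, built from the Pauli matrices.
I2 = np.eye(2)
Z2 = np.zeros((2, 2))
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

g0 = np.block([[I2, Z2], [Z2, -I2]])
g1, g2, g3 = (np.block([[Z2, s], [-s, Z2]]) for s in (sx, sy, sz))

# Pseudoscalar I = gamma_0 gamma_1 gamma_2 gamma_3.
I = g0 @ g1 @ g2 @ g3

assert np.allclose(I @ I, -np.eye(4))             # I^2 = -1
for g in (g0, g1, g2, g3):
    assert np.allclose(I @ g, -g @ I)             # anticommutes with any vector
assert np.allclose(I @ (g1 @ g2), (g1 @ g2) @ I)  # commutes with any bivector
print("pseudoscalar checks pass")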

Definition 1.4: Reciprocal basis.

The reciprocal basis \( \setlr{ \gamma^0, \gamma^1, \gamma^2, \gamma^3 } \) is defined such that the property \( \gamma^\alpha \cdot \gamma_\beta = {\delta^\alpha}_\beta \) holds.

Observe that \( \gamma^0 = \gamma_0 \), and \( \gamma^i = -\gamma_i \).

Theorem 1.1: Coordinates.

Coordinates are defined in terms of dot products with the standard basis, or reciprocal basis
\begin{equation*}
\begin{aligned}
x^\alpha &= x \cdot \gamma^\alpha \\
x_\alpha &= x \cdot \gamma_\alpha.
\end{aligned}
\end{equation*}

Start proof:

Suppose that a coordinate representation of the following form is assumed
\begin{equation}\label{eqn:lorentzForceCovariant:820}
x = x^\alpha \gamma_\alpha = x_\beta \gamma^\beta.
\end{equation}
We wish to determine the representation of the \( x^\alpha \) or \( x_\beta \) coordinates in terms of \( x\) and the basis elements. Taking the dot product with any standard basis element, we find
\begin{equation}\label{eqn:lorentzForceCovariant:840}
\begin{aligned}
x \cdot \gamma_\mu
&= (x_\beta \gamma^\beta) \cdot \gamma_\mu \\
&= x_\beta {\delta^\beta}_\mu \\
&= x_\mu,
\end{aligned}
\end{equation}
as claimed. Similarly, dotting with a reciprocal frame vector, we find
\begin{equation}\label{eqn:lorentzForceCovariant:860}
\begin{aligned}
x \cdot \gamma^\mu
&= (x^\beta \gamma_\beta) \cdot \gamma^\mu \\
&= x^\beta {\delta_\beta}^\mu \\
&= x^\mu.
\end{aligned}
\end{equation}

End proof.

Observe that raising or lowering a spatial index toggles the sign of the coordinate, but a timelike index is left unchanged.
\begin{equation}\label{eqn:lorentzForceCovariant:880}
\begin{aligned}
x^0 &= x_0 \\
x^i &= -x_i.
\end{aligned}
\end{equation}

Definition 1.5: Spacetime gradient.

The spacetime gradient operator is
\begin{equation*}
\grad = \gamma^\mu \partial_\mu = \gamma_\nu \partial^\nu,
\end{equation*}
where
\begin{equation*}
\partial_\mu = \PD{x^\mu}{},
\end{equation*}
and
\begin{equation*}
\partial^\mu = \PD{x_\mu}{}.
\end{equation*}

This definition of gradient is consistent with the Dirac gradient (sometimes denoted as a slashed \(\partial\)).

Definition 1.6: Timelike and spacelike components of a four-vector.

Given a four vector \( x = \gamma_\mu x^\mu \), that would be designated \( x^\mu = \setlr{ x^0, \Bx} \) in conventional special relativity, we write
\begin{equation*}
x^0 = x \cdot \gamma_0,
\end{equation*}
and
\begin{equation*}
\Bx = x \wedge \gamma_0,
\end{equation*}
or
\begin{equation*}
x = (x^0 + \Bx) \gamma_0.
\end{equation*}

The spacetime split of a four-vector \( x \) is relative to the frame. In the relativistic lingo, one would say that it is “observer dependent”, as the same operations with \( {\gamma_0}’ \), the timelike basis vector for a different frame, would yield a different set of coordinates.

While the dot and wedge products above provide an effective mechanism to split a four vector into a set of timelike and spacelike quantities, the spatial component of a vector has a bivector representation in STA. Consider the following coordinate expansion of a spatial vector
\begin{equation}\label{eqn:lorentzForceCovariant:1000}
\Bx =
x \wedge \gamma_0
=
\lr{ x^\mu \gamma_\mu } \wedge \gamma_0
=
\sum_{k = 1}^3 x^k \gamma_k \gamma_0.
\end{equation}

Definition 1.7: Spatial basis.

We designate
\begin{equation}\label{eqn:lorentzForceCovariant:1560}
\Be_i = \gamma_i \gamma_0,
\end{equation}
as the standard basis vectors for \(\mathbb{R}^3\).

In the literature, this bivector representation of the spatial basis may be designated \( \sigma_i = \gamma_i \gamma_0 \), as these bivectors have the properties of the Pauli matrices \( \sigma_i \). Because I intend to expand these notes to include purely non-relativistic applications, I won’t use the Pauli notation here.

Problem: Orthonormality of the spatial basis.

Show that the spatial basis \( \setlr{ \Be_1, \Be_2, \Be_3 } \), defined by \ref{eqn:lorentzForceCovariant:1560}, is orthonormal.

Answer

\begin{equation}\label{eqn:lorentzForceCovariant:620}
\begin{aligned}
\Be_i \cdot \Be_j
&= \gpgradezero{ \gamma_i \gamma_0 \gamma_j \gamma_0 } \\
&= -\gpgradezero{ \gamma_i \gamma_j } \\
&= - \gamma_i \cdot \gamma_j.
\end{aligned}
\end{equation}
This is zero for all \( i \ne j \), and unity for any \( i = j \).

Problem: Spatial pseudoscalar.

Show that the STA pseudoscalar \( I = \gamma_0 \gamma_1 \gamma_2 \gamma_3 \) equals the spatial pseudoscalar \( I = \Be_1 \Be_2 \Be_3 \).

Answer

The spatial pseudoscalar, expanded in terms of the STA basis vectors, is
\begin{equation}\label{eqn:lorentzForceCovariant:1020}
\begin{aligned}
I
&= \Be_1 \Be_2 \Be_3 \\
&= \lr{ \gamma_1 \gamma_0 }
\lr{ \gamma_2 \gamma_0 }
\lr{ \gamma_3 \gamma_0 } \\
&= \lr{ \gamma_1 \gamma_0 } \gamma_2 \lr{ \gamma_0 \gamma_3 } \gamma_0 \\
&= \lr{ -\gamma_0 \gamma_1 } \gamma_2 \lr{ -\gamma_3 \gamma_0 } \gamma_0 \\
&= \gamma_0 \gamma_1 \gamma_2 \gamma_3 \lr{ \gamma_0 \gamma_0 } \\
&= \gamma_0 \gamma_1 \gamma_2 \gamma_3,
\end{aligned}
\end{equation}
as claimed.

Problem: Characteristics of the Pauli matrices.

The Pauli matrices obey the following anticommutation relations:
\begin{equation}\label{eqn:lorentzForceCovariant:660}
\symmetric{ \sigma_a}{\sigma_b } = 2 \delta_{a b},
\end{equation}
and commutation relations:
\begin{equation}\label{eqn:lorentzForceCovariant:640}
\antisymmetric{ \sigma_a}{ \sigma_b } = 2 i \epsilon_{a b c}\,\sigma_c.
\end{equation}
Show how these relate to the geometric algebra dot and wedge products, and determine the geometric algebra representation of the imaginary \( i \) above.

Euler-Lagrange equations.

I’ll start at ground zero, with the derivation of the relativistic form of the Euler-Lagrange equations from the action. A relativistic action for a single particle system has the form
\begin{equation}\label{eqn:lorentzForceCovariant:20}
S = \int d\tau L(x, \dot{x}),
\end{equation}
where \( x \) is the spacetime coordinate, \( \dot{x} = dx/d\tau \) is the four-velocity, and \( \tau \) is proper time.

Theorem 1.2: Relativistic Euler-Lagrange equations.

Let \( x \rightarrow x + \delta x \) be any variation of the Lagrangian four-vector coordinates, where \( \delta x = 0 \) at the boundaries of the action integral. The variation of the action is
\begin{equation}\label{eqn:lorentzForceCovariant:1580}
\delta S = \int d\tau \delta x \cdot \delta L(x, \dot{x}),
\end{equation}
where
\begin{equation}\label{eqn:lorentzForceCovariant:1600}
\delta L = \grad L - \frac{d}{d\tau} (\grad_v L),
\end{equation}
where \( \grad = \gamma^\mu \partial_\mu \), and where we construct a similar velocity-gradient with respect to the proper-time derivatives of the coordinates \( \grad_v = \gamma^\mu \partial/\partial \dot{x}^\mu \). The action is extremized when \( \delta S = 0 \), or when \( \delta L = 0 \). This latter condition is called the Euler-Lagrange equations.

Start proof:

Let \( \epsilon = \delta x \), and expand the Lagrangian in Taylor series to first order
\begin{equation}\label{eqn:lorentzForceCovariant:60}
\begin{aligned}
S &\rightarrow S + \delta S \\
&= \int d\tau L( x + \epsilon, \dot{x} + \dot{\epsilon}) \\
&=
\int d\tau \lr{
L(x, \dot{x}) + \epsilon \cdot \grad L + \dot{\epsilon} \cdot \grad_v L
}.
\end{aligned}
\end{equation}
Subtracting off \( S \) and integrating by parts, leaves
\begin{equation}\label{eqn:lorentzForceCovariant:80}
\delta S =
\int d\tau \epsilon \cdot \lr{
\grad L – \frac{d}{d\tau} \grad_v L
}
+
\int d\tau \frac{d}{d\tau} \lr{ (\grad_v L ) \cdot \epsilon }.
\end{equation}
The boundary integral
\begin{equation}\label{eqn:lorentzForceCovariant:100}
\int d\tau \frac{d}{d\tau} \lr{ (\grad_v L ) \cdot \epsilon }
=
\evalbar{(\grad_v L ) \cdot \epsilon}{\Delta \tau} = 0,
\end{equation}
is zero since the variation \( \epsilon \) is required to vanish on the boundaries. So, if \( \delta S = 0 \), we must have
\begin{equation}\label{eqn:lorentzForceCovariant:120}
0 =
\int d\tau \epsilon \cdot \lr{
\grad L – \frac{d}{d\tau} \grad_v L
},
\end{equation}
for all variations \( \epsilon \). Clearly, this requires that
\begin{equation}\label{eqn:lorentzForceCovariant:140}
\delta L = \grad L – \frac{d}{d\tau} (\grad_v L) = 0,
\end{equation}
or
\begin{equation}\label{eqn:lorentzForceCovariant:145}
\grad L = \frac{d}{d\tau} (\grad_v L),
\end{equation}
which is the coordinate free statement of the Euler-Lagrange equations.

End proof.

Problem: Coordinate form of the Euler-Lagrange equations.

Working in coordinates, use the action argument to show that the Euler-Lagrange equations have the form
\begin{equation*}
\PD{x^\mu}{L} = \frac{d}{d\tau} \PD{\dot{x}^\mu}{L}.
\end{equation*}
Observe that this is identical to the statement of \ref{eqn:lorentzForceCovariant:1600} after contraction with \( \gamma^\mu \).

Answer

In terms of coordinates, the first order Taylor expansion of the action is
\begin{equation}\label{eqn:lorentzForceCovariant:180}
\begin{aligned}
S &\rightarrow S + \delta S \\
&= \int d\tau L( x^\alpha + \epsilon^\alpha, \dot{x}^\alpha + \dot{\epsilon}^\alpha) \\
&=
\int d\tau \lr{
L(x^\alpha, \dot{x}^\alpha) + \epsilon^\mu \PD{x^\mu}{L} + \dot{\epsilon}^\mu \PD{\dot{x}^\mu}{L}
}.
\end{aligned}
\end{equation}
As before, we integrate by parts to separate out a pure boundary term
\begin{equation}\label{eqn:lorentzForceCovariant:200}
\delta S =
\int d\tau \epsilon^\mu
\lr{
\PD{x^\mu}{L} – \frac{d}{d\tau} \PD{\dot{x}^\mu}{L}
}
+
\int d\tau \frac{d}{d\tau} \lr{
\epsilon^\mu \PD{\dot{x}^\mu}{L}
}.
\end{equation}
The boundary term is killed since \( \epsilon^\mu = 0 \) at the end points of the action integral. We conclude that extremization of the action (\( \delta S = 0 \), for all \( \epsilon^\mu \)) requires
\begin{equation}\label{eqn:lorentzForceCovariant:220}
\PD{x^\mu}{L} – \frac{d}{d\tau} \PD{\dot{x}^\mu}{L} = 0.
\end{equation}

Lorentz force equation.

Theorem 1.3: Lorentz force.

The relativistic Lagrangian for a charged particle is
\begin{equation}\label{eqn:lorentzForceCovariant:1640}
L = \inv{2} m v^2 + q A \cdot v/c.
\end{equation}
Application of the Euler-Lagrange equations to this Lagrangian yields the Lorentz-force equation
\begin{equation}\label{eqn:lorentzForceCovariant:1660}
\frac{dp}{d\tau} = q F \cdot v/c,
\end{equation}
where \( p = m v \) is the proper momentum, \( F \) is the Faraday bivector \( F = \grad \wedge A \), and \( c \) is the speed of light.

Start proof:

To make life easier, let's take advantage of the fact that the variation \( \delta L \) is linear in the Lagrangian, and break the Lagrangian into the free particle term \( L_0 = (1/2) m v^2 \) and a potential term \( L_1 = q A \cdot v/c \). For the free particle case we have
\begin{equation}\label{eqn:lorentzForceCovariant:240}
\begin{aligned}
\delta L_0
&= \grad L_0 – \frac{d}{d\tau} (\grad_v L_0) \\
&= – \frac{d}{d\tau} (m v) \\
&= – \frac{dp}{d\tau}.
\end{aligned}
\end{equation}
For the potential contribution we have
\begin{equation}\label{eqn:lorentzForceCovariant:260}
\begin{aligned}
\delta L_1
&= \grad L_1 – \frac{d}{d\tau} (\grad_v L_1) \\
&= \frac{q}{c} \lr{ \grad (A \cdot v) – \frac{d}{d\tau} \lr{ \grad_v (A \cdot v)} } \\
&= \frac{q}{c} \lr{ \grad (A \cdot v) – \frac{dA}{d\tau} }.
\end{aligned}
\end{equation}
The proper time derivative can be evaluated using the chain rule
\begin{equation}\label{eqn:lorentzForceCovariant:280}
\frac{dA}{d\tau}
=
\frac{\partial x^\mu}{\partial \tau} \partial_\mu A
= (v \cdot \grad) A.
\end{equation}
Putting all the pieces back together we have
\begin{equation}\label{eqn:lorentzForceCovariant:300}
\begin{aligned}
0
&= \delta L \\
&=
-\frac{dp}{d\tau} + \frac{q}{c} \lr{ \grad (A \cdot v) – (v \cdot \grad) A } \\
&=
-\frac{dp}{d\tau} + \frac{q}{c} \lr{ \grad \wedge A } \cdot v.
\end{aligned}
\end{equation}

End proof.
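
The final step of the proof above (and the same rearrangement in the tensor form proof below) uses the standard expansion of a vector dotted with a bivector, with \( v \) treated as constant under the gradient
\begin{equation*}
v \cdot \lr{ \grad \wedge A } = \lr{ v \cdot \grad } A - \grad \lr{ v \cdot A },
\end{equation*}
so that
\begin{equation*}
\lr{ \grad \wedge A } \cdot v
= -v \cdot \lr{ \grad \wedge A }
= \grad \lr{ A \cdot v } - \lr{ v \cdot \grad } A.
\end{equation*}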

Problem: Gradient of a squared position vector.

Show that
\begin{equation*}
\grad (a \cdot x) = a,
\end{equation*}
and
\begin{equation*}
\grad x^2 = 2 x.
\end{equation*}
It should be clear that the same ideas can be used for the velocity gradient, where we obtain \( \grad_v (v^2) = 2 v \), and \( \grad_v (A \cdot v) = A \), as used in the derivation above.

Answer

The first identity follows easily by expansion in coordinates
\begin{equation}\label{eqn:lorentzForceCovariant:320}
\begin{aligned}
\grad (a \cdot x)
&=
\gamma^\mu \partial_\mu a_\alpha x^\alpha \\
&=
\gamma^\mu a_\alpha \delta_\mu^\alpha \\
&=
\gamma^\mu a_\mu \\
&=
a.
\end{aligned}
\end{equation}
The second identity follows from the product rule, treating each factor of \( x \) as constant in turn
\begin{equation}\label{eqn:lorentzForceCovariant:340}
\begin{aligned}
\grad x^2
&=
\grad (x \cdot x) \\
&=
\evalbar{\lr{\grad (x \cdot a)}}{a = x}
+
\evalbar{\lr{\grad (b \cdot x)}}{b = x} \\
&=
\evalbar{a}{a = x}
+
\evalbar{b}{b = x} \\
&=
2x.
\end{aligned}
\end{equation}
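
Since these identities are really just coordinate statements, they can also be spot checked symbolically. Here is a small sympy sketch (my own, not from the post) verifying the component form \( \partial (a \cdot x)/\partial x^\mu = a_\mu \) and \( \partial x^2/\partial x^\mu = 2 x_\mu \):

import sympy as sp

eta = sp.diag(1, -1, -1, -1)                 # (+,-,-,-) metric
x_up = sp.symbols('x0:4', real=True)         # x^mu
a_up = sp.symbols('a0:4', real=True)         # a^mu
lower = lambda comps, mu: sum(eta[mu, nu] * comps[nu] for nu in range(4))

a_dot_x = sum(lower(a_up, mu) * x_up[mu] for mu in range(4))
x_sq = sum(lower(x_up, mu) * x_up[mu] for mu in range(4))

for mu in range(4):
    assert sp.expand(sp.diff(a_dot_x, x_up[mu]) - lower(a_up, mu)) == 0
    assert sp.expand(sp.diff(x_sq, x_up[mu]) - 2 * lower(x_up, mu)) == 0
print("grad (a . x) = a and grad x^2 = 2x hold component by component")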

It is desirable to put this relativistic Lorentz force equation into the usual vector and tensor forms for comparison.

Theorem 1.4: Tensor form of the Lorentz force equation.

The tensor form of the Lorentz force equation is
\begin{equation}\label{eqn:lorentzForceCovariant:1620}
\frac{dp^\mu}{d\tau} = \frac{q}{c} F^{\mu\nu} v_\nu,
\end{equation}
where the antisymmetric Faraday tensor is defined as \( F^{\mu\nu} = \partial^\mu A^\nu – \partial^\nu A^\mu \).

Start proof:

We have only to dot both sides with \( \gamma^\mu \). On the left we have
\begin{equation}\label{eqn:lorentzForceCovariant:380}
\gamma^\mu \cdot \frac{dp}{d\tau}
=
\frac{dp^\mu}{d\tau}.
\end{equation}
On the right, we have
\begin{equation}\label{eqn:lorentzForceCovariant:400}
\begin{aligned}
\gamma^\mu \cdot \lr{ \frac{q}{c} F \cdot v }
&=
\frac{q}{c} (( \grad \wedge A ) \cdot v ) \cdot \gamma^\mu \\
&=
\frac{q}{c} ( \grad ( A \cdot v ) – (v \cdot \grad) A ) \cdot \gamma^\mu \\
&=
\frac{q}{c} \lr{ (\partial^\mu A^\nu) v_\nu – v_\nu \partial^\nu A^\mu } \\
&=
\frac{q}{c} F^{\mu\nu} v_\nu.
\end{aligned}
\end{equation}

End proof.

Problem: Tensor expansion of \(F\).

An alternate way to demonstrate \ref{eqn:lorentzForceCovariant:1620} is to first expand \( F = \grad \wedge A \) in terms of coordinates, an expansion that can be expressed in terms of a second rank antisymmetric tensor \( F^{\mu\nu} \). Find that expansion, and re-evaluate the dot products of \ref{eqn:lorentzForceCovariant:400} using it.

Answer

\begin{equation}\label{eqn:lorentzForceCovariant:900}
\begin{aligned}
F &=
\grad \wedge A \\
&=
\lr{ \gamma_\mu \partial^\mu } \wedge \lr{ \gamma_\nu A^\nu } \\
&=
\lr{ \gamma_\mu \wedge \gamma_\nu } \partial^\mu A^\nu.
\end{aligned}
\end{equation}
To this we can use the usual tensor trick (add self to self, change indexes, and divide by two), to give
\begin{equation}\label{eqn:lorentzForceCovariant:920}
\begin{aligned}
F &=
\inv{2} \lr{
\lr{ \gamma_\mu \wedge \gamma_\nu } \partial^\mu A^\nu
+
\lr{ \gamma_\nu \wedge \gamma_\mu } \partial^\nu A^\mu
} \\
&=
\inv{2}
\lr{ \gamma_\mu \wedge \gamma_\nu } \lr{
\partial^\mu A^\nu
-
\partial^\nu A^\mu
},
\end{aligned}
\end{equation}
which is just
\begin{equation}\label{eqn:lorentzForceCovariant:940}
F =
\inv{2} \lr{ \gamma_\mu \wedge \gamma_\nu } F^{\mu\nu}.
\end{equation}
Now, let’s expand \( (F \cdot v) \cdot \gamma^\mu \) to compare to the earlier expansion in terms of \( \grad \) and \( A \).
\begin{equation}\label{eqn:lorentzForceCovariant:960}
\begin{aligned}
(F \cdot v) \cdot \gamma^\mu
&=
\inv{2}
F^{\alpha\nu}
\lr{ \lr{ \gamma_\alpha \wedge \gamma_\nu } \cdot \lr{ \gamma^\beta v_\beta } } \cdot \gamma^\mu \\
&=
\inv{2}
F^{\alpha\nu} v_\beta
\lr{
{\delta_\nu}^\beta {\delta_\alpha}^\mu
-
{\delta_\alpha}^\beta {\delta_\nu}^\mu
} \\
&=
\inv{2}
\lr{
F^{\mu\beta} v_\beta
-
F^{\beta\mu} v_\beta
} \\
&=
F^{\mu\nu} v_\nu.
\end{aligned}
\end{equation}
This alternate expansion illustrates some of the connectivity between the geometric algebra approach and the traditional tensor formalism.

Problem: Lorentz force direct tensor derivation.

Instead of using the geometric algebra form of the Lorentz force equation as a stepping stone, we may derive the tensor form from the Lagrangian directly, provided the Lagrangian is put into tensor form
\begin{equation*}
L = \inv{2} m v^\mu v_\mu + q A^\mu v_\mu /c.
\end{equation*}
Evaluate the Euler-Lagrange equations in coordinate form and compare to \ref{eqn:lorentzForceCovariant:1620}.

Answer

Let \( \delta_\mu L = \gamma_\mu \cdot \delta L \), so that we can write the Euler-Lagrange equations as
\begin{equation}\label{eqn:lorentzForceCovariant:460}
0 = \delta_\mu L = \PD{x^\mu}{L} – \frac{d}{d\tau} \PD{\dot{x}^\mu}{L}.
\end{equation}
Operating on the kinetic term of the Lagrangian, we have
\begin{equation}\label{eqn:lorentzForceCovariant:480}
\delta_\mu L_0 = – \frac{d}{d\tau} m v_\mu.
\end{equation}
For the potential term
\begin{equation}\label{eqn:lorentzForceCovariant:500}
\begin{aligned}
\delta_\mu L_1
&=
\frac{q}{c} \lr{
v_\nu \PD{x^\mu}{A^\nu} – \frac{d}{d\tau} A_\mu
} \\
&=
\frac{q}{c} \lr{
v_\nu \PD{x^\mu}{A^\nu} – \frac{dx_\alpha}{d\tau} \PD{x_\alpha}{ A_\mu }
} \\
&=
\frac{q}{c} v^\nu \lr{
\partial_\mu A_\nu – \partial_\nu A_\mu
} \\
&=
\frac{q}{c} v^\nu F_{\mu\nu}.
\end{aligned}
\end{equation}
Putting the pieces together gives
\begin{equation}\label{eqn:lorentzForceCovariant:520}
\frac{d}{d\tau} (m v_\mu) = \frac{q}{c} v^\nu F_{\mu\nu},
\end{equation}
which is identical\footnote{Some minor index raising and lowering gymnastics are required.} to the tensor form that we found by expanding the geometric algebra form of the Lorentz force equation in coordinates.
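
This coordinate form is also easy to verify symbolically. The following sketch (not from the post) feeds the tensor form Lagrangian to sympy's euler_equations, using a hypothetical potential (a constant magnetic field along \( \Be_3 \)) chosen only to exercise the result:

import sympy as sp
from sympy.calculus.euler import euler_equations

tau, m, q, c, B0 = sp.symbols('tau m q c B0', real=True)
eta = sp.diag(1, -1, -1, -1)                        # (+,-,-,-) metric
x = [sp.Function(f'x{mu}')(tau) for mu in range(4)]
v = [sp.diff(x[mu], tau) for mu in range(4)]        # v^mu = dx^mu/dtau
lower = lambda comps, mu: sum(eta[mu, nu] * comps[nu] for nu in range(4))

A_up = [sp.Integer(0), -B0 * x[2], sp.Integer(0), sp.Integer(0)]   # hypothetical A^mu
A_lo = [lower(A_up, mu) for mu in range(4)]

L = sp.Rational(1, 2) * m * sum(v[mu] * lower(v, mu) for mu in range(4)) \
    + (q / c) * sum(A_up[mu] * lower(v, mu) for mu in range(4))

# dL/dx^mu - d/dtau dL/dv^mu = 0, one equation per coordinate.
eqs = euler_equations(L, x, tau)

for mu in range(4):
    F_lo = [sp.diff(A_lo[nu], x[mu]) - sp.diff(A_lo[mu], x[nu]) for nu in range(4)]
    expected = (q / c) * sum(F_lo[nu] * v[nu] for nu in range(4)) \
        - sp.diff(m * lower(v, mu), tau)
    assert sp.simplify(eqs[mu].lhs - expected) == 0
print("Euler-Lagrange equations reproduce d(m v_mu)/dtau = (q/c) F_{mu nu} v^nu")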

Theorem 1.5: Vector Lorentz force equation.

Relative to a fixed observer’s frame, the Lorentz force equation of \ref{eqn:lorentzForceCovariant:1660} splits into a spatial rate of change of momentum, and (timelike component) rate of change of energy, as follows
\begin{equation}\label{eqn:lorentzForceCovariant:1680}
\begin{aligned}
\ddt{(\gamma m \Bv)} &= q \lr{ \BE + \Bv \cross \BB } \\
\ddt{(\gamma m c^2)} &= q \Bv \cdot \BE,
\end{aligned}
\end{equation}
where \( F = \BE + I c \BB \), and \( \gamma = 1/\sqrt{1 - \Bv^2/c^2 }\).

Start proof:

The first step is to eliminate the proper time dependencies in the Lorentz force equation. Consider first the coordinate representation of an arbitrary position four-vector \( x \)
\begin{equation}\label{eqn:lorentzForceCovariant:1140}
x = c t \gamma_0 + x^k \gamma_k.
\end{equation}
The corresponding four-vector velocity is
\begin{equation}\label{eqn:lorentzForceCovariant:1160}
v = \ddtau{x} = c \ddtau{t} \gamma_0 + \ddtau{t} \ddt{x^k} \gamma_k.
\end{equation}
By construction, \( v^2 = c^2 \) is a Lorentz invariant quantity (this is one of the relativistic postulates), so the square of the RHS of \ref{eqn:lorentzForceCovariant:1160} must also equal \( c^2 \). That is
\begin{equation}\label{eqn:lorentzForceCovariant:1240}
c^2 = \lr{ \ddtau{t} }^2 \lr{ c^2 – \Bv^2 },
\end{equation}
where \( \Bv = \ddt{x^k} \Be_k \) is the usual relative velocity (so that \( v \wedge \gamma_0 = \ddtau{t} \Bv \)). This shows that we may make the identification
\begin{equation}\label{eqn:lorentzForceCovariant:1260}
\gamma = \ddtau{t} = \inv{\sqrt{ 1 - \Bv^2/c^2 }},
\end{equation}
and
\begin{equation}\label{eqn:lorentzForceCovariant:1280}
\ddtau{} = \ddtau{t} \ddt{} = \gamma \ddt{}.
\end{equation}
We may now factor the four-velocity \( v \) into its spacetime split
\begin{equation}\label{eqn:lorentzForceCovariant:1300}
v = \gamma \lr{ c + \Bv } \gamma_0.
\end{equation}
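As a quick consistency check, expanding this factorization, with \( \Bv = \ddt{x^k} \Be_k \), recovers \ref{eqn:lorentzForceCovariant:1160}
\begin{equation*}
\gamma \lr{ c + \Bv } \gamma_0
= \gamma c \gamma_0 + \gamma \ddt{x^k} \gamma_k \gamma_0 \gamma_0
= c \ddtau{t} \gamma_0 + \ddtau{t} \ddt{x^k} \gamma_k.
\end{equation*}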
In particular the LHS of the Lorentz force equation can be rewritten as
\begin{equation}\label{eqn:lorentzForceCovariant:1320}
\ddtau{p} = \gamma \ddt{}\lr{ m \gamma \lr{ c + \Bv } } \gamma_0,
\end{equation}
and the RHS of the Lorentz force equation can be rewritten as
\begin{equation}\label{eqn:lorentzForceCovariant:1340}
\frac{q}{c} F \cdot v
=
\frac{\gamma q}{c} F \cdot \lr{ (c + \Bv) \gamma_0 }.
\end{equation}
Equating timelike and spacelike components leaves us with
\begin{equation}\label{eqn:lorentzForceCovariant:1380}
\ddt{ (m \gamma c) } = \frac{q}{c} \lr{ F \cdot \lr{ (c + \Bv) \gamma_0 } } \cdot \gamma_0,
\end{equation}
\begin{equation}\label{eqn:lorentzForceCovariant:1400}
\ddt{ (m \gamma \Bv) } = \frac{q}{c} \lr{ F \cdot \lr{ (c + \Bv) \gamma_0 } } \wedge \gamma_0.
\end{equation}
Evaluating these products requires some care, but is an essentially manual process. The reader is encouraged to do so once, but the end result may also be obtained easily using software (see lorentzForce.nb in [2]). One finds
\begin{equation}\label{eqn:lorentzForceCovariant:1440}
F = \BE + I c \BB
=
E^1 \gamma_{10}
+ E^2 \gamma_{20}
+ E^3 \gamma_{30}
- c B^1 \gamma_{23}
- c B^2 \gamma_{31}
- c B^3 \gamma_{12},
\end{equation}
\begin{equation}\label{eqn:lorentzForceCovariant:1460}
\frac{q}{c} \lr{ F \cdot \lr{ (c + \Bv) \gamma_0 } } \cdot \gamma_0
= \frac{q}{c} \BE \cdot \Bv,
\end{equation}
\begin{equation}\label{eqn:lorentzForceCovariant:1480}
\frac{q}{c} \lr{ F \cdot \lr{ (c + \Bv) \gamma_0 } } \wedge \gamma_0
= q \lr{ \BE + \Bv \cross \BB }.
\end{equation}

End proof.
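
As an alternative to the Mathematica worksheet mentioned above, here is a numerical spot check of \ref{eqn:lorentzForceCovariant:1460} and \ref{eqn:lorentzForceCovariant:1480} (a sketch of my own, assuming the standard Dirac representation of the \( \gamma_\mu \), with randomly chosen field and velocity components):

import numpy as np

rng = np.random.default_rng(0)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z, I2 = np.zeros((2, 2), dtype=complex), np.eye(2, dtype=complex)
g0 = np.block([[I2, Z], [Z, -I2]])
g = [g0] + [np.block([[Z, s], [-s, Z]]) for s in (s1, s2, s3)]
e = [g[k] @ g0 for k in (1, 2, 3)]                  # spatial basis e_k
Ips = g[0] @ g[1] @ g[2] @ g[3]                     # pseudoscalar I

c, q = 3.0, 1.5                                     # arbitrary test values
E, B, vv = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)

Evec = sum(E[k] * e[k] for k in range(3))
Bvec = sum(B[k] * e[k] for k in range(3))
vvec = sum(vv[k] * e[k] for k in range(3))
F = Evec + c * Ips @ Bvec                           # F = E + I c B
u = (c * np.eye(4) + vvec) @ g0                     # (c + v) gamma_0, a four-vector

Fdotu = 0.5 * (F @ u - u @ F)                       # bivector . vector, a vector
timelike = (q / c) * 0.25 * np.trace(0.5 * (Fdotu @ g0 + g0 @ Fdotu))
spacelike = (q / c) * 0.5 * (Fdotu @ g0 - g0 @ Fdotu)   # vector ^ gamma_0

assert np.isclose(timelike, (q / c) * np.dot(E, vv))
EvB = E + np.cross(vv, B)
assert np.allclose(spacelike, q * sum(EvB[k] * e[k] for k in range(3)))
print("spacetime split of the Lorentz force verified numerically")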

Problem: Algebraic spacetime split of the Lorentz force equation.

Derive the results of \ref{eqn:lorentzForceCovariant:1440} through \ref{eqn:lorentzForceCovariant:1480} algebraically.

Problem: Spacetime split of the Lorentz force tensor equation.

Show that \ref{eqn:lorentzForceCovariant:1680} also follows from the tensor form of the Lorentz force equation (\ref{eqn:lorentzForceCovariant:1620}) provided we identify
\begin{equation}\label{eqn:lorentzForceCovariant:1500}
F^{k0} = E^k,
\end{equation}
and
\begin{equation}\label{eqn:lorentzForceCovariant:1520}
F^{rs} = -\epsilon^{rst} B^t.
\end{equation}

Also verify that the identifications of \ref{eqn:lorentzForceCovariant:1500} and \ref{eqn:lorentzForceCovariant:1520} are consistent with the geometric algebra Faraday bivector \( F = \BE + I c \BB \), and the associated coordinate expansion of the field \( F = (1/2) (\gamma_\mu \wedge \gamma_\nu) F^{\mu\nu} \).

References

[1] C. Doran and A.N. Lasenby. Geometric algebra for physicists. Cambridge University Press New York, Cambridge, UK, 1st edition, 2003.

[2] Peeter Joot. Mathematica modules for Geometric Algebra’s GA(2,0), GA(3,0), and GA(1,3), 2017. URL https://github.com/peeterjoot/gapauli. [Online; accessed 24-Oct-2020].

Classical mechanics notes on Amazon in paperback (but don’t buy a copy!)

October 13, 2020 math and physics play No comments ,

I have a fairly monstrous set of classical mechanics notes that I accumulated when I was learning all about the theory of Lagrangians, Hamiltonians, and Noether’s theorem.

I also audited a few of the classes from the 2012 session of PHY354H1S, Advanced Classical Mechanics, taught by Prof. Erich Poppitz, at the University of Toronto, and have some notes and problems from those classes in this set of notes.

These notes are not self contained.  In particular, there is fairly heavy use of geometric algebra in many of the problems, with assumptions that the reader is proficient with that algebra.

These notes (436 pages, 6″x9″) are available in the following formats:

  • for free in PDF format (colour),
  • on Amazon in paperback (black and white),
  • as latex sources.

I’ve pressed the publish button on kindle-direct-publishing so that I could get a paper copy of these notes for myself.  An extremely vicious edit is required.  Until I do that editing (assuming I do), the price is set to the absolute minimum no-commission price that Amazon lets me offer (i.e. printing cost plus profit for Amazon.)  I wouldn’t actually recommend that anybody buy this in its current form; download the pdf if you are interested.

I’m actually toying with the idea of rewriting these notes from scratch, creating an “Advanced Classical Mechanics, with Geometric Algebra” book out of some of the ideas.  I could flesh out many of the details that I explored originally, but add some actual structure and coherence to this mess of write-once-read-none junk.  Tying things to a geometric algebra theme would be the value-add proposition that could distinguish things from all the other classical mechanics books in the universe.

That said, this idea would be a very tough book project (for me), as I’d have to understand all the material enough to present it in a coherent fashion.  I’d want to include and explore both Euclidean and relativistic Lagrangians, which would make the material tougher, but comprehensive.  I don’t like the idea of assuming the reader is familiar with special relativity, but the thought of me having to include a self contained introduction to that topic that isn’t complete garbage is pretty intimidating.  Especially if you consider that I’d also want to introduce STA, and help the reader understand the connections between all that material.  There’s a lot of ideas that would all have to come together!

Notes for ece1228 (Electromagnetic Theory) now in book form on Amazon.

September 26, 2020 math and physics play No comments ,

My notes for ece1228 (Electromagnetic Theory) are now available in book form on Amazon.

This version omits all assigned problem solutions (but includes some self-generated problem solutions.)  As such, it is very short.  I published it so that I could get a copy (of the non-redacted version) for myself, but in the unlikely chance that somebody else is interested I’ve left the redacted version in published state (available very cheaply.)  Feel free to contact me for the full (pdf) version if you are not taking the course (and don’t intend to.)

The official course description at the time was:

Fundamentals: Maxwell’s equations, constitutive relations and boundary conditions, wave polarization. Field representations: potentials, Green’s functions and integral equations. Theorems and concepts: duality, uniqueness, images, equivalence, reciprocity and Babinet’s principles. Plane, cylindrical and spherical waves and waveguides. Radiation and scattering.

New material (for me) in this course was limited to:

  • dispersion relations.
  • Drude-Lorentz model
  • quadrupole moments
  • magnetic moments, magnetostatic force, and torque (mentioned in class without details, but studied from Jackson)
  • matrix representation of transmission and reflection through multiple interfaces