As your T.A., I have to punish you …

December 19, 2020 C/C++ development and debugging. No comments

Back in university, I had to implement a reverse Polish notation calculator in a software engineering class.  Overall the assignment was pretty stupid, and I entertained myself by writing a very compact implementation.  It worked perfectly, but I got a 25/40 (62.5%) grade on it.  That mark was well deserved, although I did not think so at the time.

The grading remarks were actually some of the best feedback that I ever received, and also really funny to boot.  I no longer remember the name of that TA, but I took his advice to heart, and kept his grading remarks on my wall in my IBM office for years.  That served as an excellent reminder not to write overcomplicated code.

Today, I found those remarks again, and am posting them for posterity.  Enjoy!

Transcription for easy reading

  • It is obvious that you are a very clever person, but this program is like a big puzzle, and in understanding it, I appreciated it and enjoyed it, because of your cleverness. However much I enjoyed it, it is none the less a very poorly designed program.
  • A program should be constructed in the easiest and simplest to understand manner because when you construct very large programs the “complexity” of them will increase greatly.
  • A program should not be an intricate puzzle, where you show off how clever you are.
  • Your string class is an elephant gun trying to kill a mouse.
  • macros Build_binary_op and Binary_op are the worst examples of programming style I have ever seen in my entire life!  Very clever, but a cardinal sin of programming style.
  • Your binary_expr constructor does all the computation.  Not good style.
  • Your “expr” class is a baroque mess.
  • Although I enjoyed your program, Never write a program like this in your life again.  As your T.A., I have to punish you so that you do not develop bad habits in the future.  I hate to do it, but I can only give you 25/40 for this “clever puzzle”.

Reflection.

The only part of this feedback that I would dispute was the comment about the string class.  That was actually a pretty good string implementation.  I didn’t write it because I was a vicious mouse hunter, but because I hit a porting issue with pre-std:: C++.  In particular, we had two sets of Solaris machines available to us, and I was using one that had a compiler that included a nice C++ string class.  So, naturally I used it.  For submission, our code had to compile and run on a different Solaris machine, and lo and behold, the string class that all my code was based on was not available.

What I should have done (20/20 hindsight) was throw out my horrendous code, and start over from scratch.  However, I took the more fun approach, and wrote my own string class so that my code would compile on either machine.

Amusingly, when I worked on IBM LUW, there was a part of the query optimizer code that seemed to have learned all its tricks from the ugly macros and token pasting that I did in this assignment.  It was truly gross, but there was 10000x more of it than in my assignment.  Having been thoroughly punished for my atrocities, I easily recognized this code for the evil it was.  The only way that you could debug that optimizer code was by running it through the preprocessor, cutting and pasting the results, and filtering that cut and paste through something like cindent (these days you would probably use clang-format.)  That code was brutal, and I always wished that its authors had had the good luck of having a TA like mine.  That code is probably still part of LUW, terrorizing developers.  Apparently the justification for it was that it was originally written by an IBM researcher using templates, but templates couldn’t be used in DB2 code because we didn’t have compilers on all platforms that supported them at the time.

I have used token pasting macros very judiciously and sparingly in the 26 years since I originally used them in this assignment, and I do think that there are a few good uses for that sort of generative code.  However, if you do have to write that sort of code, I think it’s better to write perl (or some other language) code that generates understandable code that can be debugged, instead of relying on token pasting.
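As a small illustration of that last point, here is a hypothetical sketch of a generator script, written in Python rather than Perl. The operator table, the `binary_*` function names, and the shape of the generated C++ are all made up for illustration; they are not taken from the original assignment or from any IBM code.

```python
# Hypothetical sketch: emit plain, debuggable C++ instead of relying on token pasting.
# The operator table and the generated function names are illustrative assumptions.

OPERATORS = [
    ("add", "+"),
    ("sub", "-"),
    ("mul", "*"),
    ("div", "/"),
]

TEMPLATE = """\
// Generated code: do not edit by hand.
double binary_{name}(double lhs, double rhs)
{{
    return lhs {symbol} rhs;
}}
"""


def generate_binary_ops(operators=OPERATORS):
    """Return C++ source text with one ordinary function per operator."""
    return "\n".join(
        TEMPLATE.format(name=name, symbol=symbol) for name, symbol in operators
    )


if __name__ == "__main__":
    print(generate_binary_ops())
```

The generated functions are ordinary source text, so a debugger can step through them and a reader can grep for them, which is exactly what the preprocessor-expanded optimizer code did not allow.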

Fundamental theorem of geometric calculus for line integrals (relativistic.)

December 16, 2020 math and physics play 1 comment

[This post is best viewed in PDF form, due to latex elements that I could not format with wordpress mathjax.]

Background for this particular post can be found in

  1. Curvilinear coordinates and gradient in spacetime, and reciprocal frames,
  2. Lorentz transformations in Space Time Algebra (STA), and
  3. A couple more reciprocal frame examples.

Motivation.

I’ve been slowly working my way towards a statement of the fundamental theorem of integral calculus, where the functions being integrated are elements of the Dirac algebra (space time multivectors in the geometric algebra parlance.)

This is interesting because we want to be able to do line, surface, 3-volume and 4-volume space time integrals. We have many \(\mathbb{R}^3\) integral theorems
\begin{equation}\label{eqn:fundamentalTheoremOfGC:40a}
\int_A^B d\Bl \cdot \spacegrad f = f(B) - f(A),
\end{equation}
\begin{equation}\label{eqn:fundamentalTheoremOfGC:60a}
\int_S dA\, \ncap \cross \spacegrad f = \int_{\partial S} d\Bx\, f,
\end{equation}
\begin{equation}\label{eqn:fundamentalTheoremOfGC:80a}
\int_S dA\, \ncap \cdot \lr{ \spacegrad \cross \Bf} = \int_{\partial S} d\Bx \cdot \Bf,
\end{equation}
\begin{equation}\label{eqn:fundamentalTheoremOfGC:100a}
\int_S dx dy \lr{ \PD{y}{P} - \PD{x}{Q} }
=
\int_{\partial S} P dx + Q dy,
\end{equation}
\begin{equation}\label{eqn:fundamentalTheoremOfGC:120a}
\int_V dV\, \spacegrad f = \int_{\partial V} dA\, \ncap f,
\end{equation}
\begin{equation}\label{eqn:fundamentalTheoremOfGC:140a}
\int_V dV\, \spacegrad \cross \Bf = \int_{\partial V} dA\, \ncap \cross \Bf,
\end{equation}
\begin{equation}\label{eqn:fundamentalTheoremOfGC:160a}
\int_V dV\, \spacegrad \cdot \Bf = \int_{\partial V} dA\, \ncap \cdot \Bf,
\end{equation}
and want to know how to generalize these to four dimensions and also make sure that we are handling the relativistic mixed signature correctly. If our starting point was the mess of equations above, we’d be in trouble, since it is not obvious how these generalize. All the theorems with unit normals have to be handled completely differently in four dimensions since we don’t have a unique normal to any given spacetime plane.
What comes to our rescue is the Fundamental Theorem of Geometric Calculus (FTGC), which has the form
\begin{equation}\label{eqn:fundamentalTheoremOfGC:40}
\int F d^n \Bx\, \lrpartial G = \int F d^{n-1} \Bx\, G,
\end{equation}
where \(F,G\) are multivector functions (i.e. sums of products of vectors.) We’ve seen ([2], [1]) that all the identities above are special cases of the fundamental theorem.

Do we need any special care to state the FTGC correctly for our relativistic case? It turns out that the answer is no! Tangent and reciprocal frame vectors do all the heavy lifting, and we can use the fundamental theorem as is, even in our mixed signature space. The only real change that we need to make is to use spacetime gradient and vector derivative operators instead of their spatial equivalents. We will see how this works below. Note that instead of starting with \ref{eqn:fundamentalTheoremOfGC:40} directly, I will attempt to build up to that point in a progressive fashion that hopefully does not require the reader to make too many unjustified mental leaps.

Multivector line integrals.

We want to define multivector line integrals to start with. Recall that in \(\mathbb{R}^3\) we would say that for scalar functions \( f\), the integral
\begin{equation}\label{eqn:fundamentalTheoremOfGC:180b}
\int d\Bx\, f = \int f d\Bx,
\end{equation}
is a line integral. Also, for vector functions \( \Bf \) we call
\begin{equation}\label{eqn:fundamentalTheoremOfGC:200}
\int d\Bx \cdot \Bf = \inv{2} \int \lr{ d\Bx\, \Bf + \Bf d\Bx },
\end{equation}
a line integral. In order to generalize line integrals to multivector functions, we will allow our multivector functions to be placed on either or both sides of the differential.

Definition 1.1: Line integral.

Given a single variable parameterization \( x = x(u) \), we write \( d^1\Bx = \Bx_u du \), and call
\begin{equation}\label{eqn:fundamentalTheoremOfGC:220a}
\int F d^1\Bx\, G,
\end{equation}
a line integral, where \( F,G \) are arbitrary multivector functions.

We must be careful not to reorder any of the factors in the integrand, since the differential may not commute with either \( F \) or \( G \). Here is a simple example where the integrand has a product of a vector and differential.

Problem: Circular parameterization.

Given a circular parameterization \( x(\theta) = \gamma_1 e^{-i\theta} \), where \( i = \gamma_1 \gamma_2 \) is the unit bivector for the \(x,y\) plane, compute the line integral
\begin{equation}\label{eqn:fundamentalTheoremOfGC:100}
\int_0^{\pi/4} F(\theta)\, d^1 \Bx\, G(\theta),
\end{equation}
where \( F(\theta) = \Bx^\theta + \gamma_3 + \gamma_1 \gamma_0 \) is a multivector valued function, and \( G(\theta) = \gamma_0 \) is vector valued.

Answer

The tangent vector for the curve is
\begin{equation}\label{eqn:fundamentalTheoremOfGC:60}
\Bx_\theta
= -\gamma_1 \gamma_1 \gamma_2 e^{-i\theta}
= \gamma_2 e^{-i\theta},
\end{equation}
with reciprocal vector \( \Bx^\theta = e^{i \theta} \gamma^2 \). The differential element is \( d^1 \Bx = \gamma_2 e^{-i\theta} d\theta \), so the integral is
\begin{equation}\label{eqn:fundamentalTheoremOfGC:80}
\begin{aligned}
\int_0^{\pi/4} \lr{ \Bx^\theta + \gamma_3 + \gamma_1 \gamma_0 } d^1 \Bx\, \gamma_0
&=
\int_0^{\pi/4} \lr{ e^{i\theta} \gamma^2 + \gamma_3 + \gamma_1 \gamma_0 } \gamma_2 e^{-i\theta} d\theta\, \gamma_0 \\
&=
\frac{\pi}{4} \gamma_0 + \lr{ \gamma_{32} + \gamma_{102} } \inv{-i} \lr{ e^{-i\pi/4} - 1 } \gamma_0 \\
&=
\frac{\pi}{4} \gamma_0 + \lr{ \gamma_{32} + \gamma_{102} } \gamma_{120} \lr{ e^{-i\pi/4} - 1 } \\
&=
\frac{\pi}{4} \gamma_0 + \lr{ \gamma_{310} + 1 } \lr{ \inv{\sqrt{2}} \lr{ 1 - \gamma_{12} } - 1 }.
\end{aligned}
\end{equation}
Observe how care is required not to reorder any terms. This particular end result is a multivector with scalar, vector, bivector, and trivector grades, but no pseudoscalar component. The grades in the end result depend on both the function in the integrand and on the path. For example, had we integrated all the way around the circle, the end result would have been the vector \( 2 \pi \gamma_0 \) (i.e. a \( \gamma_0 \) weighted unit circle circumference), as all the other grades would have been killed by the complex exponential integrated over a full period.
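To make the full-circle remark concrete, here is a sketch of the same calculation with the upper limit replaced by \( 2 \pi \), reusing only the quantities defined above
\begin{equation}\label{eqn:fundamentalTheoremOfGC:fullCircle}
\begin{aligned}
\int_0^{2\pi} \lr{ \Bx^\theta + \gamma_3 + \gamma_1 \gamma_0 } d^1 \Bx\, \gamma_0
&=
2 \pi \gamma_0 + \lr{ \gamma_{32} + \gamma_{102} } \inv{-i} \lr{ e^{-2 \pi i} - 1 } \gamma_0 \\
&=
2 \pi \gamma_0,
\end{aligned}
\end{equation}
since \( e^{-2 \pi i} = 1 \). Only the \( \Bx^\theta \) term, for which the integrand is the constant \( \gamma_0 \), survives.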

Problem: Line integral for boosted time direction vector.

Let \( x = e^{\vcap \alpha/2} \gamma_0 e^{-\vcap \alpha/2} \) represent the spacetime curve of all the boosts of \( \gamma_0 \) along a specific velocity direction vector, where \( \vcap = (v \wedge \gamma_0)/\Norm{v \wedge \gamma_0} \) is a unit spatial bivector for any constant vector \( v \). Compute the line integral
\begin{equation}\label{eqn:fundamentalTheoremOfGC:240}
\int x\, d^1 \Bx.
\end{equation}

Answer

Observe that \( \vcap \) and \( \gamma_0 \) anticommute, so we may write our boost as a one sided exponential
\begin{equation}\label{eqn:fundamentalTheoremOfGC:260}
x(\alpha) = \gamma_0 e^{-\vcap \alpha} = e^{\vcap \alpha} \gamma_0 = \lr{ \cosh\alpha + \vcap \sinh\alpha } \gamma_0.
\end{equation}
The tangent vector is just
\begin{equation}\label{eqn:fundamentalTheoremOfGC:280}
\Bx_\alpha = \PD{\alpha}{x} = e^{\vcap\alpha} \vcap \gamma_0.
\end{equation}
Let’s get a bit of intuition about the nature of this vector. Its square is
\begin{equation}\label{eqn:fundamentalTheoremOfGC:300}
\begin{aligned}
\Bx_\alpha^2
&=
e^{\vcap\alpha} \vcap \gamma_0
e^{\vcap\alpha} \vcap \gamma_0 \\
&=
-e^{\vcap\alpha} \vcap e^{-\vcap\alpha} \vcap (\gamma_0)^2 \\
&=
-1,
\end{aligned}
\end{equation}
so we see that the tangent vector is a spacelike unit vector. As the vector representing points on the curve is necessarily timelike (due to Lorentz invariance), these two must be orthogonal at all points. Let’s confirm this algebraically
\begin{equation}\label{eqn:fundamentalTheoremOfGC:320}
\begin{aligned}
x \cdot \Bx_\alpha
&=
\gpgradezero{ e^{\vcap \alpha} \gamma_0 e^{\vcap \alpha} \vcap \gamma_0 } \\
&=
\gpgradezero{ e^{-\vcap \alpha} e^{\vcap \alpha} \vcap (\gamma_0)^2 } \\
&=
\gpgradezero{ \vcap } \\
&= 0.
\end{aligned}
\end{equation}
Here we used \( e^{\vcap \alpha} \gamma_0 = \gamma_0 e^{-\vcap \alpha} \), and \( \gpgradezero{A B} = \gpgradezero{B A} \). Geometrically, we have the curious fact that the direction vectors to points on the curve are perpendicular (with respect to our relativistic dot product) to the tangent vectors on the curve, as illustrated in fig. 1.

fig. 1. Tangent perpendicularity in mixed metric.
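The problem statement asks for the integral itself, which the discussion above stops short of evaluating. Here is a sketch of that last step, assuming integration limits \( \alpha \in [\alpha_A, \alpha_B] \) (the problem does not specify any.) Using \( \gamma_0 e^{\vcap \alpha} = e^{-\vcap \alpha} \gamma_0 \), the integrand reduces to a constant bivector
\begin{equation}\label{eqn:fundamentalTheoremOfGC:boostIntegrand}
\begin{aligned}
x\, \Bx_\alpha
&=
e^{\vcap \alpha} \gamma_0 e^{\vcap \alpha} \vcap \gamma_0 \\
&=
e^{\vcap \alpha} e^{-\vcap \alpha} \gamma_0 \vcap \gamma_0 \\
&=
-\vcap,
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:fundamentalTheoremOfGC:boostIntegral}
\int_{\alpha_A}^{\alpha_B} x\, d^1 \Bx = -\vcap \lr{ \alpha_B - \alpha_A }.
\end{equation}
The result has no scalar part, which is consistent with \( x \cdot \Bx_\alpha = 0 \): only the wedge product \( x \wedge \Bx_\alpha \) contributes.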

Perfect differentials.

Having seen a couple examples of multivector line integrals, let’s now move on to figure out the structure of a line integral that has a “perfect” differential integrand. We can take a hint from the \(\mathbb{R}^3\) vector result that we already know, namely
\begin{equation}\label{eqn:fundamentalTheoremOfGC:120}
\int_A^B d\Bl \cdot \spacegrad f = f(B) - f(A).
\end{equation}
It seems reasonable to guess that the relativistic generalization of this is
\begin{equation}\label{eqn:fundamentalTheoremOfGC:140}
\int_A^B dx \cdot \grad f = f(B) - f(A).
\end{equation}
Let’s check that, by expanding in coordinates
\begin{equation}\label{eqn:fundamentalTheoremOfGC:160}
\begin{aligned}
\int_A^B dx \cdot \grad f
&=
\int_A^B d\tau \frac{dx^\mu}{d\tau} \partial_\mu f \\
&=
\int_A^B d\tau \frac{dx^\mu}{d\tau} \PD{x^\mu}{f} \\
&=
\int_A^B d\tau \frac{df}{d\tau} \\
&=
f(B) - f(A).
\end{aligned}
\end{equation}
If we drop the dot product, will we have such a nice result? Let’s see:
\begin{equation}\label{eqn:fundamentalTheoremOfGC:180}
\begin{aligned}
\int_A^B dx \grad f
&=
\int_A^B d\tau \frac{dx^\mu}{d\tau} \gamma_\mu \gamma^\nu \partial_\nu f \\
&=
\int_A^B d\tau \frac{dx^\mu}{d\tau} \PD{x^\mu}{f}
+
\int_A^B
d\tau
\sum_{\mu \ne \nu} \gamma_\mu \gamma^\nu
\frac{dx^\mu}{d\tau} \PD{x^\nu}{f}.
\end{aligned}
\end{equation}
The scalar component of this integrand is a perfect differential, but the bivector part of the integrand is a complete mess that we have no hope of generally integrating. It happens that if we consider one of the simplest parameterization examples, we can get a strong hint of how to generalize the differential operator to one that ends up providing a perfect differential. In particular, let’s integrate over a straight line path, such as \( x(\tau) = \tau \gamma_0 \). For this path, we have
\begin{equation}\label{eqn:fundamentalTheoremOfGC:200a}
\begin{aligned}
\int_A^B dx \grad f
&=
\int_A^B \gamma_0 d\tau \lr{
\gamma^0 \partial_0 +
\gamma^1 \partial_1 +
\gamma^2 \partial_2 +
\gamma^3 \partial_3 } f \\
&=
\int_A^B d\tau \lr{
\PD{\tau}{f} +
\gamma_0 \gamma^1 \PD{x^1}{f} +
\gamma_0 \gamma^2 \PD{x^2}{f} +
\gamma_0 \gamma^3 \PD{x^3}{f}
}.
\end{aligned}
\end{equation}
Just because the path does not have any \( x^1, x^2, x^3 \) component dependencies does not mean that these last three partials are necessarily zero. For example \( f = f(x(\tau)) = \lr{ x^0 }^2 \gamma_0 + x^1 \gamma_1 \) will have a non-zero contribution from the \( \partial_1 \) operator. In that particular case, we can easily integrate \( f \), but we have to know the specifics of the function to do the integral. However, if we had a differential operator that did not include any component off the integration path, we would have a perfect differential. That is, if we were to replace the gradient with the projection of the gradient onto the tangent space, we would have a perfect differential. We see that the dot product in \ref{eqn:fundamentalTheoremOfGC:140} has the same effect, as it rejects any component of the gradient that does not lie in the tangent space.
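To see the off-path contribution explicitly, here is a sketch of that example worked through along \( x(\tau) = \tau \gamma_0 \), with the endpoints taken to be the scalar parameter values \( \tau = A, B \). The gradient of the example function is
\begin{equation}\label{eqn:fundamentalTheoremOfGC:offPathGradient}
\grad f
= \gamma^0 \partial_0 \lr{ \lr{x^0}^2 \gamma_0 } + \gamma^1 \partial_1 \lr{ x^1 \gamma_1 }
= 2 x^0 + 1,
\end{equation}
so, with \( x^0 = \tau \) on the path,
\begin{equation}\label{eqn:fundamentalTheoremOfGC:offPathIntegral}
\int_A^B dx \grad f
=
\gamma_0 \int_A^B d\tau \lr{ 2 \tau + 1 }
=
\lr{ B^2 - A^2 } \gamma_0 + \lr{ B - A } \gamma_0.
\end{equation}
The first term is \( f(B) - f(A) \), since \( x^1 = 0 \) everywhere on the path. The second term comes entirely from the \( \partial_1 \) contribution, and is exactly the piece that the dot product \( dx \cdot \grad = d\tau\, \partial_0 \) discards.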

Definition 1.2: Vector derivative.

Given a spacetime manifold parameterized by \( x = x(u^0, \cdots u^{N-1}) \), with tangent vectors \( \Bx_\mu = \PDi{u^\mu}{x} \), and reciprocal vectors \( \Bx^\mu \in \textrm{Span}\setlr{\Bx_\nu} \), such that \( \Bx^\mu \cdot \Bx_\nu = {\delta^\mu}_\nu \), the vector derivative is defined as
\begin{equation}\label{eqn:fundamentalTheoremOfGC:240a}
\partial = \sum_{\mu = 0}^{N-1} \Bx^\mu \PD{u^\mu}{}.
\end{equation}
Observe that if this is a full parameterization of the space (\(N = 4\)), then the vector derivative is identical to the gradient. The vector derivative is the projection of the gradient onto the tangent space at the point of evaluation. Furthermore, we designate \( \lrpartial \) as the vector derivative allowed to act bidirectionally, as follows
\begin{equation}\label{eqn:fundamentalTheoremOfGC:260a}
R \lrpartial S
=
R \Bx^\mu \PD{u^\mu}{S}
+
\PD{u^\mu}{R} \Bx^\mu S,
\end{equation}
where \( R, S \) are multivectors, and summation convention is implied. In this bidirectional action, the vector factors of the vector derivative must stay in place (as they do not necessarily commute with \( R,S\)), but the derivative operators apply in a chain rule like fashion to both functions.

Noting that \( \Bx_u \cdot \grad = \Bx_u \cdot \partial \), we may rewrite the scalar line integral identity \ref{eqn:fundamentalTheoremOfGC:140} as
\begin{equation}\label{eqn:fundamentalTheoremOfGC:220}
\int_A^B dx \cdot \partial f = f(B) - f(A).
\end{equation}
However, as our example hinted at, the fundamental theorem for line integrals has a multivector generalization that does not rely on a dot product to do the tangent space filtering, and is more powerful. That generalization has the following form.

Theorem 1.1: Fundamental theorem for line integrals.

Given multivector functions \( F, G \), and a single parameter curve \( x(u) \) with line element \( d^1 \Bx = \Bx_u du \), then
\begin{equation}\label{eqn:fundamentalTheoremOfGC:280a}
\int_A^B F d^1\Bx \lrpartial G = F(B) G(B) - F(A) G(A).
\end{equation}

Start proof:

Writing out the integrand explicitly, we find
\begin{equation}\label{eqn:fundamentalTheoremOfGC:340}
\int_A^B F d^1\Bx \lrpartial G
=
\int_A^B \lr{
\PD{\alpha}{F} d\alpha\, \Bx_\alpha \Bx^\alpha G
+
F d\alpha\, \Bx_\alpha \Bx^\alpha \PD{\alpha}{G }
}
\end{equation}
However for a single parameter curve, we have \( \Bx^\alpha = 1/\Bx_\alpha \), so we are left with
\begin{equation}\label{eqn:fundamentalTheoremOfGC:360}
\begin{aligned}
\int_A^B F d^1\Bx \lrpartial G
&=
\int_A^B d\alpha\, \PD{\alpha}{(F G)} \\
&=
\evalbar{F G}{B}
-
\evalbar{F G}{A}.
\end{aligned}
\end{equation}

End proof.

More to come.

In the next installment we will explore surface integrals in spacetime, and the generalization of the fundamental theorem to multivector space time integrals.

References

[1] Peeter Joot. Geometric Algebra for Electrical Engineers. Kindle Direct Publishing, 2019.

[2] A. Macdonald. Vector and Geometric Calculus. CreateSpace Independent Publishing Platform, 2012.

A couple more reciprocal frame examples.

December 14, 2020 math and physics play No comments

[If mathjax doesn’t display properly for you, click here for a PDF of this post]

This post logically follows both of the following:

  1. Curvilinear coordinates and gradient in spacetime, and reciprocal frames, and
  2. Lorentz transformations in Space Time Algebra (STA)

The PDF linked above contains all the content from this post plus (1.) above [to be edited later into a more logical sequence.]

More examples.

Here are a few additional examples of reciprocal frame calculations.

Problem: Unidirectional arbitrary functional dependence.

Let
\begin{equation}\label{eqn:reciprocal:2540}
x = a f(u),
\end{equation}
where \( a \) is a constant vector and \( f(u)\) is some arbitrary differentiable function with a non-zero derivative in the region of interest. Find the tangent vector and its reciprocal.

Answer

Here we have just a single tangent space direction (a line in spacetime) with tangent vector
\begin{equation}\label{eqn:reciprocal:2400}
\Bx_u = a \PD{u}{f} = a f_u,
\end{equation}
so we see that the tangent space vectors are just rescaled values of the direction vector \( a \).
This is a simple enough parameterization that we can compute the reciprocal frame vector explicitly using the gradient. We expect that \( \Bx^u = 1/\Bx_u \). Dotting both sides of the parameterization with \( 1/a \), we find
\begin{equation}\label{eqn:reciprocal:2420}
\inv{a} \cdot x = f(u),
\end{equation}
but for constant \( a \), we know that \( \grad \lr{ a \cdot x } = a \), so taking gradients of both sides we find
\begin{equation}\label{eqn:reciprocal:2440}
\inv{a} = \grad f = \PD{u}{f} \grad u,
\end{equation}
so the reciprocal vector is
\begin{equation}\label{eqn:reciprocal:2460}
\Bx^u = \grad u = \inv{a f_u},
\end{equation}
as expected.

Problem: Linear two variable parameterization.

Let \( x = a u + b v \), where \( x \wedge a \wedge b = 0 \) represents a spacetime plane (also the tangent space.) Find the curvilinear coordinates and their reciprocals.

Answer

The frame vectors are easy to compute, as they are just
\begin{equation}\label{eqn:reciprocal:1960}
\begin{aligned}
\Bx_u &= \PD{u}{x} = a \\
\Bx_v &= \PD{v}{x} = b.
\end{aligned}
\end{equation}
This is an example of a parametric equation that we can easily invert, as we have
\begin{equation}\label{eqn:reciprocal:1980}
\begin{aligned}
x \wedge a &= - v \lr{ a \wedge b } \\
x \wedge b &= u \lr{ a \wedge b },
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:reciprocal:2000}
\begin{aligned}
u
&= \inv{ a \wedge b } \cdot \lr{ x \wedge b } \\
&= \inv{ \lr{a \wedge b}^2 } \lr{ a \wedge b } \cdot \lr{ x \wedge b } \\
&=
\frac{
\lr{b \cdot x} \lr{ a \cdot b }
-
\lr{a \cdot x} \lr{ b \cdot b }
}{ \lr{a \wedge b}^2 }
\end{aligned}
\end{equation}
\begin{equation}\label{eqn:reciprocal:2020}
\begin{aligned}
v &= -\inv{ a \wedge b } \cdot \lr{ x \wedge a } \\
&= -\inv{ \lr{a \wedge b}^2 } \lr{ a \wedge b } \cdot \lr{ x \wedge a } \\
&=
-\frac{
\lr{b \cdot x} \lr{ a \cdot a }
-
\lr{a \cdot x} \lr{ a \cdot b }
}{ \lr{a \wedge b}^2 }
\end{aligned}
\end{equation}
Recall that \( \grad \lr{ a \cdot x} = a \), if \( a \) is a constant, so our gradients are just
\begin{equation}\label{eqn:reciprocal:2040}
\begin{aligned}
\grad u
&=
\frac{
b \lr{ a \cdot b }
-
a \lr{ b \cdot b }
}{ \lr{a \wedge b}^2 } \\
&=
b \cdot \inv{ a \wedge b },
\end{aligned}
\end{equation}
and
\begin{equation}\label{eqn:reciprocal:2060}
\begin{aligned}
\grad v
&=
-\frac{
b \lr{ a \cdot a }
-
a \lr{ a \cdot b }
}{ \lr{a \wedge b}^2 } \\
&=
-a \cdot \inv{ a \wedge b }.
\end{aligned}
\end{equation}
Expressed in terms of the frame vectors, this is just
\begin{equation}\label{eqn:reciprocal:2080}
\begin{aligned}
\Bx^u &= \Bx_v \cdot \inv{ \Bx_u \wedge \Bx_v } \\
\Bx^v &= -\Bx_u \cdot \inv{ \Bx_u \wedge \Bx_v },
\end{aligned}
\end{equation}
so we were able to show, for this special two parameter linear case, that the explicit evaluation of the gradients has the exact structure that we intuited that the reciprocals must have, provided they are constrained to the spacetime plane \( a \wedge b \). It is interesting to observe how this structure falls out of the linear system solution so directly.
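As a quick numerical sanity check of the expanded gradient formulas, here is a minimal Python sketch. It assumes the \( (+,-,-,-) \) signature, stores vectors as tuples of components with respect to \( \gamma_0, \cdots, \gamma_3 \), and uses made-up values for \( a \) and \( b \). The helper names `dot` and `reciprocal_frame` are hypothetical.

```python
# Sketch only: vectors are 4-tuples of components with respect to gamma_0..gamma_3,
# and the metric signature (+,-,-,-) is an assumption.
METRIC = (1.0, -1.0, -1.0, -1.0)


def dot(p, q):
    """Minkowski dot product of two four-vectors."""
    return sum(g * pi * qi for g, pi, qi in zip(METRIC, p, q))


def reciprocal_frame(a, b):
    """Reciprocals of the frame {a, b}, using the expanded forms of grad u, grad v."""
    aa, ab, bb = dot(a, a), dot(a, b), dot(b, b)
    wedge_sq = ab * ab - aa * bb  # (a ^ b)^2, assumed non-zero
    xu = tuple((bi * ab - ai * bb) / wedge_sq for ai, bi in zip(a, b))
    xv = tuple(-(bi * aa - ai * ab) / wedge_sq for ai, bi in zip(a, b))
    return xu, xv


a = (1.0, 0.5, 0.0, 0.0)  # illustrative values only
b = (0.0, 1.0, 2.0, 0.0)
xu, xv = reciprocal_frame(a, b)
print(dot(xu, a), dot(xu, b))  # should be 1 and 0, up to rounding
print(dot(xv, a), dot(xv, b))  # should be 0 and 1, up to rounding
```

The check only exercises the defining property \( \Bx^\mu \cdot \Bx_\nu = {\delta^\mu}_\nu \); it says nothing about degenerate planes where \( \lr{a \wedge b}^2 = 0 \).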

Problem: Quadratic two variable parameterization.

Now consider a variation of the previous problem, with \( x = a u^2 + b v^2 \). Find the curvilinear coordinates and their reciprocals.

Answer

\begin{equation}\label{eqn:reciprocal:2100}
\begin{aligned}
\Bx_u &= \PD{u}{x} = 2 u a \\
\Bx_v &= \PD{v}{x} = 2 v b.
\end{aligned}
\end{equation}
Our tangent space is still the \( a \wedge b \) plane (as is the surface itself), but the spacing of the cells starts getting wider in proportion to \( u, v \).
Utilizing the work from the previous problem, we have
\begin{equation}\label{eqn:reciprocal:2120}
\begin{aligned}
2 u \grad u &=
b \cdot \inv{ a \wedge b } \\
2 v \grad v &=
-a \cdot \inv{ a \wedge b }.
\end{aligned}
\end{equation}
A bit of rearrangement can show that this is equivalent to the reciprocal frame identities. This is a second demonstration that the gradient and the algebraic formulations for the reciprocals match, at least for these special cases of non-coupled parameterizations. Note that these reciprocals are not defined where \( u = 0 \) or \( v = 0 \), since the corresponding tangent vector vanishes there.
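The rearrangement is short. Here is a sketch, substituting the tangent vectors \ref{eqn:reciprocal:2100} into the reciprocal frame identities \ref{eqn:reciprocal:2080}, and assuming \( u, v \ne 0 \)
\begin{equation}\label{eqn:reciprocal:quadraticCheck}
\begin{aligned}
\Bx_v \cdot \inv{ \Bx_u \wedge \Bx_v }
&=
2 v b \cdot \inv{ 4 u v \lr{ a \wedge b } }
=
\inv{2 u} b \cdot \inv{ a \wedge b }
=
\grad u \\
-\Bx_u \cdot \inv{ \Bx_u \wedge \Bx_v }
&=
-2 u a \cdot \inv{ 4 u v \lr{ a \wedge b } }
=
-\inv{2 v} a \cdot \inv{ a \wedge b }
=
\grad v.
\end{aligned}
\end{equation}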

Problem: Reciprocal frame for generalized cylindrical parameterization.

Let the vector parameterization be \( x(\rho,\theta) = \rho e^{-i\theta/2} x(\rho_0, \theta_0) e^{i \theta/2} \), where \( i \) is a unit bivector with \( i^2 = \pm 1 \) (\(+1\) for a boost, and \(-1\) for a rotation), and where \(\theta, \rho\) are scalars. Find the tangent space vectors and their reciprocals.

fig. 1. “Cylindrical” boost parameterization.

Note that this is a cylindrical parameterization for the rotation case, and traces out hyperbolic regions for the boost case. The boost case is illustrated in fig. 1, where the hyperbolas in the light cone are boosts of \( \gamma_0 \) for various values of \( \rho \), and the spacelike hyperbolas are boosts of \( \gamma_1 \), again for various values of \( \rho \).

Answer

The tangent space vectors are
\begin{equation}\label{eqn:reciprocal:2480}
\Bx_\rho = \frac{x}{\rho},
\end{equation}
and

\begin{equation}\label{eqn:reciprocal:2500}
\begin{aligned}
\Bx_\theta
&= -\frac{i}{2} x + x \frac{i}{2} \\
&= x \cdot i.
\end{aligned}
\end{equation}
Recall that \( x \cdot i \) lies perpendicular to \( x \) (in the plane \( i \)), as illustrated in fig. 2. This means that \( \Bx_\rho \) and \( \Bx_\theta \) are orthogonal, so we can find the reciprocal vectors by just inverting them
\begin{equation}\label{eqn:reciprocal:2520}
\begin{aligned}
\Bx^\rho &= \frac{\rho}{x} \\
\Bx^\theta &= \frac{1}{x \cdot i}.
\end{aligned}
\end{equation}

fig. 2. Projection and rejection geometry.
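The orthogonality of the two tangent vectors can also be checked algebraically. Here is a sketch, using the identity \( a \cdot \lr{ b \cdot B } = \lr{ a \wedge b } \cdot B \) for vectors \( a, b \) and a bivector \( B \)
\begin{equation}\label{eqn:reciprocal:cylindricalOrthogonality}
\Bx_\rho \cdot \Bx_\theta
=
\inv{\rho} x \cdot \lr{ x \cdot i }
=
\inv{\rho} \lr{ x \wedge x } \cdot i
= 0.
\end{equation}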

Parameterization of a general linear transformation.

Given \( N \) parameters \( u^0, u^1, \cdots u^{N-1} \), a general linear transformation from the parameter space to the vector space has the form
\begin{equation}\label{eqn:reciprocal:2160}
x =
{a^\alpha}_\beta \gamma_\alpha u^\beta,
\end{equation}
where \( \beta \in [0, \cdots, N-1] \) and \( \alpha \in [0,3] \).
For such a general transformation, observe that the curvilinear basis vectors are
\begin{equation}\label{eqn:reciprocal:2180}
\begin{aligned}
\Bx_\mu
&= \PD{u^\mu}{x} \\
&= \PD{u^\mu}{}
{a^\alpha}_\beta \gamma_\alpha u^\beta \\
&=
{a^\alpha}_\mu \gamma_\alpha.
\end{aligned}
\end{equation}
We find an interpretation of \( {a^\alpha}_\mu \) by dotting \( \Bx_\mu \) with the reciprocal frame vectors of the standard basis
\begin{equation}\label{eqn:reciprocal:2200}
\begin{aligned}
\Bx_\mu \cdot \gamma^\nu
&=
{a^\alpha}_\mu \lr{ \gamma_\alpha \cdot \gamma^\nu } \\
&=
{a^\nu}_\mu,
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:reciprocal:2220}
x = \Bx_\mu u^\mu.
\end{equation}
We are able to reinterpret \ref{eqn:reciprocal:2160} as a contraction of the tangent space vectors with the parameters, scaling and summing these direction vectors to characterize all the points in the tangent plane.

Theorem 1.1: Projecting onto the tangent space.

Let \( T \) represent the tangent space. The projection of a vector onto the tangent space has the form
\begin{equation}\label{eqn:reciprocal:2560}
\textrm{Proj}_{\textrm{T}} y = \lr{ y \cdot \Bx^\mu } \Bx_\mu = \lr{ y \cdot \Bx_\mu } \Bx^\mu.
\end{equation}

Start proof:

Let’s designate \( a \) as the portion of the vector \( y \) that lies outside of the tangent space
\begin{equation}\label{eqn:reciprocal:2260}
y = y^\mu \Bx_\mu + a.
\end{equation}
If we knew the coordinates \( y^\mu \), we would have a recipe for the projection.
Algebraically, requiring that \( a \) lies outside of the tangent space, is equivalent to stating \( a \cdot \Bx_\mu = a \cdot \Bx^\mu = 0 \). We use that fact, and then take dot products
\begin{equation}\label{eqn:reciprocal:2280}
\begin{aligned}
y \cdot \Bx^\nu
&= \lr{ y^\mu \Bx_\mu + a } \cdot \Bx^\nu \\
&= y^\nu,
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:reciprocal:2300}
y = \lr{ y \cdot \Bx^\mu } \Bx_\mu + a.
\end{equation}
Similarly, the tangent space projection can be expressed as a linear combination of reciprocal basis elements
\begin{equation}\label{eqn:reciprocal:2320}
y = y_\mu \Bx^\mu + a.
\end{equation}
Dotting with \( \Bx_\mu \), we have
\begin{equation}\label{eqn:reciprocal:2340}
\begin{aligned}
y \cdot \Bx_\mu
&= \lr{ y_\alpha \Bx^\alpha + a } \cdot \Bx_\mu \\
&= y_\mu,
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:reciprocal:2360}
y = \lr{ y \cdot \Bx_\mu } \Bx^\mu + a.
\end{equation}
We find the two stated ways of computing the projection.

Observe that, for the special case that all of \( \setlr{ \Bx_\mu } \) are orthogonal, the equivalence of these two projection methods follows directly, since
\begin{equation}\label{eqn:reciprocal:2380}
\begin{aligned}
\lr{ y \cdot \Bx^\mu } \Bx_\mu
&=
\lr{ y \cdot \inv{\Bx_\mu} } \inv{\Bx^\mu} \\
&=
\lr{ y \cdot \frac{\Bx_\mu}{\lr{\Bx_\mu}^2 } } \frac{\Bx^\mu}{\lr{\Bx^\mu}^2} \\
&=
\lr{ y \cdot \Bx_\mu } \Bx^\mu.
\end{aligned}
\end{equation}

End proof.
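Here is a minimal numerical sketch of the theorem for a two dimensional tangent space. It assumes the \( (+,-,-,-) \) signature, represents vectors as tuples of components, builds the reciprocals of a frame \( \setlr{a, b} \) from the expanded two parameter formulas found earlier, and uses made-up values for \( a, b, y \). All the helper names are hypothetical.

```python
# Sketch only: signature (+,-,-,-) and the sample vectors are assumptions.
METRIC = (1.0, -1.0, -1.0, -1.0)


def dot(p, q):
    """Minkowski dot product of two four-vectors."""
    return sum(g * pi * qi for g, pi, qi in zip(METRIC, p, q))


def reciprocal_frame(a, b):
    """Reciprocals of the frame {a, b}, constrained to the a ^ b plane."""
    aa, ab, bb = dot(a, a), dot(a, b), dot(b, b)
    wedge_sq = ab * ab - aa * bb  # (a ^ b)^2, assumed non-zero
    xu = tuple((bi * ab - ai * bb) / wedge_sq for ai, bi in zip(a, b))
    xv = tuple(-(bi * aa - ai * ab) / wedge_sq for ai, bi in zip(a, b))
    return xu, xv


def project(y, dot_with, expand_on):
    """Sum over mu of (y . p_mu) q_mu, for paired vectors p_mu, q_mu."""
    out = (0.0, 0.0, 0.0, 0.0)
    for p, q in zip(dot_with, expand_on):
        c = dot(y, p)
        out = tuple(o + c * qi for o, qi in zip(out, q))
    return out


a = (1.0, 0.5, 0.0, 0.0)  # illustrative values only
b = (0.0, 1.0, 2.0, 0.0)
y = (2.0, -1.0, 0.5, 3.0)
frame, recip = (a, b), reciprocal_frame(a, b)

proj_1 = project(y, recip, frame)  # (y . x^mu) x_mu
proj_2 = project(y, frame, recip)  # (y . x_mu) x^mu
rejection = tuple(yi - pi for yi, pi in zip(y, proj_1))
print(proj_1, proj_2)
print(dot(rejection, a), dot(rejection, b))  # both should vanish, up to rounding
```

Both forms of the projection should agree, and what is left over after subtracting the projection should be orthogonal to both tangent vectors.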

Lorentz transformations in Space Time Algebra (STA)

December 12, 2020 math and physics play No comments

[If mathjax doesn’t display properly for you, click here for a PDF of this post]

Motivation.

One of the remarkable features of geometric algebra is the complex exponential sandwich, which can be used to encode rotations in any dimension, or rotation-like operations such as Lorentz transformations in Minkowski spaces. In this post, we show some examples that unpack the geometric algebra expressions for Lorentz transformation operations of this sort. In particular, we will look at the exponential sandwich operations for spatial rotations and Lorentz boosts in the Dirac algebra, known as Space Time Algebra (STA) in geometric algebra circles, and demonstrate that these sandwiches do have the desired effects.

Lorentz transformations.

Theorem 1.1: Lorentz transformation.

The transformation
\begin{equation}\label{eqn:lorentzTransform:580}
x \rightarrow e^{B} x e^{-B} = x',
\end{equation}
where \( B = a \wedge b \) is an STA 2-blade for any two linearly independent four-vectors \( a, b \), is norm preserving, that is
\begin{equation}\label{eqn:lorentzTransform:600}
x^2 = {x'}^2.
\end{equation}

Start proof:

The proof is disturbingly trivial in this geometric algebra form
\begin{equation}\label{eqn:lorentzTransform:40}
\begin{aligned}
{x'}^2
&=
e^{B} x e^{-B} e^{B} x e^{-B} \\
&=
e^{B} x x e^{-B} \\
&=
x^2 e^{B} e^{-B} \\
&=
x^2.
\end{aligned}
\end{equation}

End proof.
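For readers who like to see numbers, here is a rough Python sketch that checks the norm preservation numerically. It is not a general purpose geometric algebra library: multivectors are stored as dictionaries keyed by a basis-blade bitmask (bit \( k \) set means \( \gamma_k \) is a factor), the signature \( (+,-,-,-) \) is assumed, the exponential is a truncated power series, and the vectors \( a, b, x \) are made-up values. All function names are hypothetical.

```python
# Sketch only: multivectors are {bitmask: coefficient} dictionaries, where bit k of
# the mask marks gamma_k as a factor. Signature (+,-,-,-) is an assumption.
METRIC = (1.0, -1.0, -1.0, -1.0)


def blade_product(a, b):
    """Product of two basis blades (bitmasks): returns (sign, resulting bitmask)."""
    swaps, t = 0, a >> 1
    while t:  # count the transpositions needed to reach canonical order
        swaps += bin(t & b).count("1")
        t >>= 1
    sign = -1.0 if swaps % 2 else 1.0
    for k, g in enumerate(METRIC):  # repeated factors square to the metric
        if ((a & b) >> k) & 1:
            sign *= g
    return sign, a ^ b


def gp(x, y):
    """Geometric product of two multivectors."""
    out = {}
    for ma, ca in x.items():
        for mb, cb in y.items():
            sign, m = blade_product(ma, mb)
            out[m] = out.get(m, 0.0) + sign * ca * cb
    return out


def add(x, y, scale=1.0):
    """Return x + scale * y."""
    out = dict(x)
    for m, c in y.items():
        out[m] = out.get(m, 0.0) + scale * c
    return out


def mv_exp(x, terms=30):
    """Truncated power series for the multivector exponential."""
    result, term = {0: 1.0}, {0: 1.0}
    for k in range(1, terms):
        term = {m: c / k for m, c in gp(term, x).items()}
        result = add(result, term)
    return result


def vector(x0, x1, x2, x3):
    return {1: x0, 2: x1, 4: x2, 8: x3}


a, b = vector(1.0, 0.0, 0.3, 0.0), vector(0.0, 1.0, 0.0, 0.0)  # illustrative
B = {m: c / 2 for m, c in add(gp(a, b), gp(b, a), -1.0).items()}  # B = a ^ b
x = vector(2.0, -1.0, 0.5, 3.0)
x_prime = gp(gp(mv_exp(B), x), mv_exp(add({}, B, -1.0)))  # e^B x e^{-B}

print(gp(x, x).get(0, 0.0), gp(x_prime, x_prime).get(0, 0.0))
```

The two printed values should agree to within rounding, and `x_prime` should have only grade one components of any significance.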

In particular, observe that we did not need to construct the usual infinitesimal representations of rotation and boost transformation matrices or tensors in order to demonstrate that we have spacetime invariance for the transformations. The rough idea of such a transformation is that the bivector \( B \) commutes with components of the four-vector that lie off the spacetime plane specified by \( B \), and anticommutes with components of the four-vector that lie in the plane, so that \( e^{B} x_\perp = x_\perp e^{B} \) and \( e^{B} x_\parallel = x_\parallel e^{-B} \). The end result is that the sandwich operation simplifies to
\begin{equation}\label{eqn:lorentzTransform:60}
x' = x_\parallel e^{-2B} + x_\perp,
\end{equation}
where \( x = x_\perp + x_\parallel \) and \( x_\perp \cdot B = 0 \), and \( x_\parallel \wedge B = 0 \). In particular, using \( x = x B B^{-1} = \lr{ x \cdot B + x \wedge B } B^{-1} \), we find that
\begin{equation}\label{eqn:lorentzTransform:80}
\begin{aligned}
x_\parallel &= \lr{ x \cdot B } B^{-1} \\
x_\perp &= \lr{ x \wedge B } B^{-1}.
\end{aligned}
\end{equation}
When \( B \) is a spacetime plane \( B = b \wedge \gamma_0 \), then this exponential has a hyperbolic nature, and we end up with a Lorentz boost. When \( B \) is a spatial bivector, we end up with a single complex exponential, encoding our plain old 3D rotation. More general \( B \)’s that encode composite boosts and rotations are also possible, but \( B \) must be invertible (it should have no lightlike factors.) The rough geometry of these projections is illustrated in fig 1, where the spacetime plane is represented by \( B \).

fig 1. Projection and rejection geometry.
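The norm preservation claim, and the split into in-plane and off-plane parts, can be checked numerically. What follows is only a sketch, not part of the original derivation: it assumes a toy STA implementation in which a multivector is a dictionary from basis-blade bitmasks to coefficients (bit \( k \) stands for \( \gamma_k \)), and a truncated power series for the exponential. All of the function names (`gp`, `grade`, `exp`, `check`) are hypothetical helpers introduced for this illustration.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # gamma_0^2 = 1, gamma_k^2 = -1

def reorder_sign(a, b):
    """Sign picked up when sorting the basis vectors of blade a followed by blade b."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1.0 if swaps & 1 else 1.0

def gp(A, B):
    """Geometric product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in A.items():
        for b, cb in B.items():
            sign = reorder_sign(a, b)
            for k in range(4):
                if ((a & b) >> k) & 1:
                    sign *= METRIC[k]  # repeated basis vectors contract to their squares
            out[a ^ b] = out.get(a ^ b, 0.0) + sign * ca * cb
    return out

def add(A, B):
    out = dict(A)
    for blade, coeff in B.items():
        out[blade] = out.get(blade, 0.0) + coeff
    return out

def scale(A, s):
    return {blade: s * coeff for blade, coeff in A.items()}

def grade(A, k):
    """Keep only the grade-k part (k = number of basis vectors in the blade)."""
    return {blade: coeff for blade, coeff in A.items() if bin(blade).count("1") == k}

def exp(A, terms=40):
    """Truncated power series for e^A (adequate for the small arguments used here)."""
    term, total = {0: 1.0}, {0: 1.0}
    for n in range(1, terms):
        term = scale(gp(term, A), 1.0 / n)
        total = add(total, term)
    return total

def max_diff(A, B):
    return max(abs(A.get(k, 0.0) - B.get(k, 0.0)) for k in set(A) | set(B))

g0, g1, g2, g3 = ({1 << k: 1.0} for k in range(4))

def check(B, x):
    """Return x^2, x'^2, and the mismatch between the sandwich and the split form."""
    sandwich = gp(gp(exp(B), x), exp(scale(B, -1.0)))
    B_inv = scale(B, 1.0 / gp(B, B)[0])      # valid for a 2-blade with B^2 != 0
    xB = gp(x, B)
    x_par = gp(grade(xB, 1), B_inv)          # (x . B) B^{-1}
    x_perp = gp(grade(xB, 3), B_inv)         # (x ^ B) B^{-1}
    split = add(gp(x_par, exp(scale(B, -2.0))), x_perp)
    return gp(x, x)[0], gp(sandwich, sandwich)[0], max_diff(sandwich, split)

x = add(add(scale(g0, 2.0), scale(g1, 0.5)), add(scale(g2, -1.5), scale(g3, 0.25)))
boost_plane = scale(gp(g1, g0), 0.4)      # spacetime plane, B^2 > 0
rotation_plane = scale(gp(g1, g2), 0.9)   # spatial plane, B^2 < 0
results = [check(B, x) for B in (boost_plane, rotation_plane)]
```

Each entry of `results` holds \( x^2 \), \( {x'}^2 \), and the largest coefficient difference between \( e^{B} x e^{-B} \) and the projection/rejection form in which the in-plane part absorbs both exponential factors. The bitmask representation was chosen only because it keeps the product rule short; any geometric algebra library would serve equally well.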

What is not so obvious is how to pick \( B \)’s that correspond to specific rotation axes or boost directions. Let’s consider each of those cases in turn.

Theorem 1.2: Boost.

The boost along a direction vector \( \vcap \) and rapidity \( \alpha \) is given by
\begin{equation}\label{eqn:lorentzTransform:620}
x' = e^{-\vcap \alpha/2} x e^{\vcap \alpha/2},
\end{equation}
where \( \vcap = \gamma_{k0} \cos\theta^k \) is an STA bivector representing a spatial direction with direction cosines \( \cos\theta^k \).

Start proof:

We want to demonstrate that this is equivalent to the usual boost formulation. We can start with decomposition of the four-vector \( x \) into components that lie in and off of the spacetime plane \( \vcap \).
\begin{equation}\label{eqn:lorentzTransform:100}
\begin{aligned}
x
&= \lr{ x^0 + \Bx } \gamma_0 \\
&= \lr{ x^0 + \Bx \vcap^2 } \gamma_0 \\
&= \lr{ x^0 + \lr{ \Bx \cdot \vcap} \vcap + \lr{ \Bx \wedge \vcap} \vcap } \gamma_0,
\end{aligned}
\end{equation}
where \( \Bx = x \wedge \gamma_0 \). The first two components lie in the boost plane, whereas the last is the spatial component of the vector that lies perpendicular to the boost plane. Observe that \( \vcap \) anticommutes with the dot product term and commutes with the wedge product term, so we have
\begin{equation}\label{eqn:lorentzTransform:120}
\begin{aligned}
x'
&=
\lr{ x^0 + \lr{ \Bx \cdot \vcap } \vcap } \gamma_0
e^{\vcap \alpha/2 }
e^{\vcap \alpha/2 }
+
\lr{ \Bx \wedge \vcap } \vcap \gamma_0
e^{-\vcap \alpha/2 }
e^{\vcap \alpha/2 } \\
&=
\lr{ x^0 + \lr{ \Bx \cdot \vcap } \vcap } \gamma_0
e^{\vcap \alpha }
+
\lr{ \Bx \wedge \vcap } \vcap \gamma_0.
\end{aligned}
\end{equation}
Noting that \( \vcap^2 = 1 \), we may expand the exponential in hyperbolic functions, and find that the boosted portion of the vector expands as
\begin{equation}\label{eqn:lorentzTransform:260}
\begin{aligned}
\lr{ x^0 + \lr{ \Bx \cdot \vcap} \vcap } \gamma_0 e^{\vcap \alpha}
&=
\lr{ x^0 + \lr{ \Bx \cdot \vcap} \vcap } \gamma_0 \lr{ \cosh\alpha + \vcap \sinh \alpha} \\
&=
\lr{ x^0 + \lr{ \Bx \cdot \vcap} \vcap } \lr{ \cosh\alpha - \vcap \sinh \alpha} \gamma_0 \\
&=
\lr{ x^0 \cosh\alpha - \lr{ \Bx \cdot \vcap} \sinh \alpha} \gamma_0
+
\lr{ -x^0 \sinh \alpha + \lr{ \Bx \cdot \vcap} \cosh \alpha } \vcap \gamma_0.
\end{aligned}
\end{equation}
We are left with
\begin{equation}\label{eqn:lorentzTransform:320}
\begin{aligned}
x'
&=
\lr{ x^0 \cosh\alpha - \lr{ \Bx \cdot \vcap} \sinh \alpha} \gamma_0
+
\lr{ \lr{ \Bx \cdot \vcap} \cosh \alpha -x^0 \sinh \alpha } \vcap \gamma_0
+
\lr{ \Bx \wedge \vcap} \vcap \gamma_0 \\
&=
\begin{bmatrix}
\gamma_0 & \vcap \gamma_0
\end{bmatrix}
\begin{bmatrix}
\cosh\alpha & - \sinh\alpha \\
-\sinh\alpha & \cosh\alpha
\end{bmatrix}
\begin{bmatrix}
x^0 \\
\Bx \cdot \vcap
\end{bmatrix}
+
\lr{ \Bx \wedge \vcap} \vcap \gamma_0,
\end{aligned}
\end{equation}
which has the desired Lorentz boost structure. Of course, this is usually seen with \( \vcap = \gamma_{10} \) so that the components in the coordinate column vector are \( (ct, x) \).

End proof.
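As a quick numeric illustration of the matrix form above, here is a minimal sketch that applies the hyperbolic \( 2 \times 2 \) matrix to the pair \( (x^0, \Bx \cdot \vcap) \). The relation \( \tanh\alpha = v/c \) between rapidity and velocity is an assumption brought in from the standard treatment of boosts, not something derived in this post, and the function names are hypothetical.

```python
from math import atanh, cosh, sinh

def boost(x0, x_par, alpha):
    """Apply the 2x2 hyperbolic matrix to the pair (x^0, x . v)."""
    return (x0 * cosh(alpha) - x_par * sinh(alpha),
            -x0 * sinh(alpha) + x_par * cosh(alpha))

def interval(x0, x_par):
    """In-plane part of x^2 (the rejection term is untouched by the boost)."""
    return x0 ** 2 - x_par ** 2

alpha = atanh(0.6)      # assumption: rapidity for v/c = 0.6, via tanh(alpha) = v/c
ct, x = 5.0, 3.0        # an event on the worldline x = 0.6 ct
ct_b, x_b = boost(ct, x, alpha)

# Two successive boosts along the same direction should add their rapidities.
twice = boost(*boost(ct, x, 0.3), 0.5)
once = boost(ct, x, 0.8)
```

With these numbers the boosted event lands at the spatial origin of the moving frame, and `interval` returns the same value before and after the transformation.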

Theorem 1.3: Spatial rotation.

Given two linearly independent spatial bivectors \( \Ba = a^k \gamma_{k0}, \Bb = b^k \gamma_{k0} \), a rotation of \(\theta\) radians in the plane of \( \Ba, \Bb \) from \( \Ba \) towards \( \Bb \), is given by
\begin{equation}\label{eqn:lorentzTransform:640}
x' = e^{-i\theta/2} x e^{i\theta/2},
\end{equation}
where \( i = (\Ba \wedge \Bb)/\Abs{\Ba \wedge \Bb} \), is a unit (spatial) bivector.

Start proof:

Without loss of generality, we may pick \( i = \acap \bcap \), where \( \acap^2 = \bcap^2 = 1 \), and \( \acap \cdot \bcap = 0 \). With such an orthonormal basis for the plane, we can decompose our four-vector into portions that lie in and off the plane
\begin{equation}\label{eqn:lorentzTransform:400}
\begin{aligned}
x
&= \lr{ x^0 + \Bx } \gamma_0 \\
&= \lr{ x^0 + \Bx i i^{-1} } \gamma_0 \\
&= \lr{ x^0 + \lr{ \Bx \cdot i } i^{-1} + \lr{ \Bx \wedge i } i^{-1} } \gamma_0.
\end{aligned}
\end{equation}
The projective term lies in the plane of rotation, whereas the timelike and spatial rejection terms are perpendicular. That is
\begin{equation}\label{eqn:lorentzTransform:420}
\begin{aligned}
x_\parallel &= \lr{ \Bx \cdot i } i^{-1} \gamma_0 \\
x_\perp &= \lr{ x^0 + \lr{ \Bx \wedge i } i^{-1} } \gamma_0,
\end{aligned}
\end{equation}
where \( x_\parallel \wedge i = 0 \), and \( x_\perp \cdot i = 0 \). The plane pseudoscalar \( i \) anticommutes with \( x_\parallel \), and commutes with \( x_\perp \), so
\begin{equation}\label{eqn:lorentzTransform:440}
\begin{aligned}
x'
&= e^{-i\theta/2} \lr{ x_\parallel + x_\perp } e^{i\theta/2} \\
&= x_\parallel e^{i\theta} + x_\perp.
\end{aligned}
\end{equation}
However
\begin{equation}\label{eqn:lorentzTransform:460}
\begin{aligned}
\lr{ \Bx \cdot i } i^{-1}
&=
\lr{ \Bx \cdot \lr{ \acap \wedge \bcap } } \bcap \acap \\
&=
\lr{\Bx \cdot \acap} \bcap \bcap \acap
-\lr{\Bx \cdot \bcap} \acap \bcap \acap \\
&=
\lr{\Bx \cdot \acap} \acap
+\lr{\Bx \cdot \bcap} \bcap,
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:lorentzTransform:480}
\begin{aligned}
x_\parallel e^{i\theta}
&=
\lr{
\lr{\Bx \cdot \acap} \acap
+
\lr{\Bx \cdot \bcap} \bcap
}
\gamma_0
\lr{
\cos\theta + \acap \bcap \sin\theta
} \\
&=
\acap \lr{
\lr{\Bx \cdot \acap} \cos\theta
-
\lr{\Bx \cdot \bcap} \sin\theta
}
\gamma_0
+
\bcap \lr{
\lr{\Bx \cdot \acap} \sin\theta
+
\lr{\Bx \cdot \bcap} \cos\theta
}
\gamma_0,
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:lorentzTransform:500}
x'
=
\begin{bmatrix}
\acap & \bcap
\end{bmatrix}
\begin{bmatrix}
\cos\theta & - \sin\theta \\
\sin\theta & \cos\theta
\end{bmatrix}
\begin{bmatrix}
\Bx \cdot \acap \\
\Bx \cdot \bcap \\
\end{bmatrix}
\gamma_0
+
\lr{ x^0 + \lr{ \Bx \wedge i} i^{-1} } \gamma_0.
\end{equation}
Observe that this rejection term can be explicitly expanded to
\begin{equation}\label{eqn:lorentzTransform:520}
\lr{ x^0 + \lr{ \Bx \wedge i} i^{-1} } \gamma_0 =
x -
\lr{ \Bx \cdot \acap } \acap \gamma_0
-
\lr{ \Bx \cdot \bcap } \bcap \gamma_0.
\end{equation}
This is the timelike component of the vector, plus the spatial component that is normal to the plane. This exponential sandwich transformation rotates only the portion of the vector that lies in the plane, and leaves the rest (timelike and normal) untouched.

End proof.
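The in-plane part of this result is just a planar rotation of the coordinates \( (\Bx \cdot \acap, \Bx \cdot \bcap) \). As a small sanity check, and only as an analogy, the sketch below compares the rotation matrix with ordinary complex multiplication, identifying \( \acap \) with \( 1 \) and \( \bcap \) with the imaginary unit. That identification, and the function names, are assumptions made for this illustration.

```python
import cmath
from math import cos, pi, sin

def rotate_matrix(xa, xb, theta):
    """The 2x2 rotation matrix acting on the in-plane coordinates (x . a, x . b)."""
    return (cos(theta) * xa - sin(theta) * xb,
            sin(theta) * xa + cos(theta) * xb)

def rotate_complex(xa, xb, theta):
    """Same map, treating xa + j xb as a complex number multiplied by e^{j theta}."""
    z = complex(xa, xb) * cmath.exp(1j * theta)
    return z.real, z.imag

def interval(x0, xa, xb, xn):
    """x^2 with signature (+,-,-,-), for orthonormal spatial directions a, b, n."""
    return x0 ** 2 - xa ** 2 - xb ** 2 - xn ** 2

x0, xa, xb, xn = 2.0, 0.5, -1.5, 0.25    # timelike, in-plane (a, b), and normal parts
theta = 0.7
ra, rb = rotate_matrix(xa, xb, theta)
ca, cb = rotate_complex(xa, xb, theta)
quarter = rotate_matrix(1.0, 0.0, pi / 2)   # a quarter turn carries a onto b
```

The timelike and normal coordinates `x0` and `xn` are not touched, so `interval` is unchanged by the rotation, and the quarter turn confirms the sense of rotation is from \( \acap \) towards \( \bcap \).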

Problems.

Problem: Verify components relative to boost direction.

In the proof of thm. 1.2, the vector \( x \) was expanded in terms of the spacetime split. An alternate approach is to expand as
\begin{equation}\label{eqn:lorentzTransform:340}
\begin{aligned}
x
&= x \vcap^2 \\
&= \lr{ x \cdot \vcap + x \wedge \vcap } \vcap \\
&= \lr{ x \cdot \vcap } \vcap + \lr{ x \wedge \vcap } \vcap.
\end{aligned}
\end{equation}
Show that
\begin{equation}\label{eqn:lorentzTransform:360}
\lr{ x \cdot \vcap } \vcap
=
\lr{ x^0 + \lr{ \Bx \cdot \vcap} \vcap } \gamma_0,
\end{equation}
and
\begin{equation}\label{eqn:lorentzTransform:380}
\lr{ x \wedge \vcap } \vcap
=
\lr{ \Bx \wedge \vcap} \vcap \gamma_0.
\end{equation}

Answer

Let \( x = x^\mu \gamma_\mu \), so that
\begin{equation}\label{eqn:lorentzTransform:160}
\begin{aligned}
x \cdot \vcap
&=
\gpgradeone{ x^\mu \gamma_\mu \cos\theta^b \gamma_{b 0} } \\
&=
x^\mu \cos\theta^b \gpgradeone{ \gamma_\mu \gamma_{b 0} }
.
\end{aligned}
\end{equation}
The \( \mu = 0 \) component of this grade selection is
\begin{equation}\label{eqn:lorentzTransform:180}
\gpgradeone{ \gamma_0 \gamma_{b 0} }
=
-\gamma_b,
\end{equation}
and for \( \mu = a \ne 0 \), we have
\begin{equation}\label{eqn:lorentzTransform:200}
\gpgradeone{ \gamma_a \gamma_{b 0} }
=
-\delta_{a b} \gamma_0,
\end{equation}
so we have
\begin{equation}\label{eqn:lorentzTransform:220}
\begin{aligned}
x \cdot \vcap
&=
x^0 \cos\theta^b (-\gamma_b)
+
x^a \cos\theta^b (-\delta_{ab} \gamma_0 ) \\
&=
-x^0 \vcap \gamma_0
-
x^b \cos\theta^b \gamma_0 \\
&=
- \lr{ x^0 \vcap + \Bx \cdot \vcap } \gamma_0,
\end{aligned}
\end{equation}
where \( \Bx = x \wedge \gamma_0 \) is the spatial portion of the four vector \( x \) relative to the stationary observer frame. Since \( \vcap \) anticommutes with \( \gamma_0 \), the component of \( x \) in the spacetime plane \( \vcap \) is
\begin{equation}\label{eqn:lorentzTransform:240}
\lr{ x \cdot \vcap } \vcap =
\lr{ x^0 + \lr{ \Bx \cdot \vcap} \vcap } \gamma_0,
\end{equation}
as expected.

For the rejection term, we have
\begin{equation}\label{eqn:lorentzTransform:280}
x \wedge \vcap
=
x^\mu \cos\theta^s \gpgradethree{ \gamma_\mu \gamma_{s 0} }.
\end{equation}
The \( \mu = 0 \) term clearly contributes nothing, leaving us with:
\begin{equation}\label{eqn:lorentzTransform:300}
\begin{aligned}
\lr{ x \wedge \vcap } \vcap
&=
\lr{ x \wedge \vcap } \cdot \vcap \\
&=
x^r \cos\theta^s \cos\theta^t \lr{ \lr{ \gamma_r \wedge \gamma_{s}} \gamma_0 } \cdot \lr{ \gamma_{t0} } \\
&=
x^r \cos\theta^s \cos\theta^t \gpgradeone{
\lr{ \gamma_r \wedge \gamma_{s} } \gamma_0 \gamma_{t0}
} \\
&=
-x^r \cos\theta^s \cos\theta^t \lr{ \gamma_r \wedge \gamma_{s}} \cdot \gamma_t \\
&=
-x^r \cos\theta^s \cos\theta^t \lr{ -\gamma_r \delta_{st} + \gamma_s \delta_{rt} } \\
&=
x^r \cos\theta^t \cos\theta^t \gamma_r
-
x^t \cos\theta^s \cos\theta^t \gamma_s \\
&=
\Bx \gamma_0
- (\Bx \cdot \vcap) \vcap \gamma_0 \\
&=
\lr{ \Bx \wedge \vcap} \vcap \gamma_0,
\end{aligned}
\end{equation}
as expected. Is there a clever way to demonstrate this without resorting to coordinates?

Problem: Rotation transformation components.

Given a unit spatial bivector \( i = \acap \bcap \), where \( \acap \cdot \bcap = 0 \) and \( i^2 = -1 \), show that
\begin{equation}\label{eqn:lorentzTransform:540}
\lr{ x \cdot i } i^{-1}
=
\lr{ \Bx \cdot i } i^{-1} \gamma_0
=
\lr{\Bx \cdot \acap } \acap \gamma_0
+
\lr{\Bx \cdot \bcap } \bcap \gamma_0,
\end{equation}
and
\begin{equation}\label{eqn:lorentzTransform:560}
\lr{ x \wedge i } i^{-1}
=
\lr{ x^0 + \lr{ \Bx \wedge i } i^{-1} } \gamma_0
=
x -
\lr{\Bx \cdot \acap } \acap \gamma_0
-
\lr{\Bx \cdot \bcap } \bcap \gamma_0.
\end{equation}
Also show that \( i \) anticommutes with \( \lr{ x \cdot i } i^{-1} \) and commutes with \( \lr{ x \wedge i } i^{-1} \).

Answer

This problem is left for the reader, as I don’t feel like typing out my solution.

The first part of this problem can be done in the tedious coordinate approach used above, but hopefully there is a better way.

For the last (commutation) part of the problem, here is a hint. Let \( x \wedge i = n i \), where \( n \cdot i = 0 \). The result then follows easily.

Curvilinear coordinates and gradient in spacetime, and reciprocal frames.

December 1, 2020 math and physics play 2 comments

Motivation.

I started pondering some aspects of spacetime integration theory, and found that there were some aspects of the concepts of reciprocal frames that were not clear to me. In the process of sorting those ideas out for myself, I wrote up the following notes.

In the notes below, I will introduce many of the prerequisite ideas that are needed to express and apply the fundamental theorem of geometric calculus in a 4D relativistic context. The focus will be the Dirac algebra of special relativity, known as STA (Space Time Algebra) in geometric algebra parlance. If desired, it should be clear how to apply these ideas to lower or higher dimensional spaces, and to plain old Euclidean metrics.

On notation.

In Euclidean space we use bold face reciprocal frame vectors \( \Bx^i \cdot \Bx_j = {\delta^i}_j \), which nicely distinguishes them from the generalized coordinates \( x_i, x^j \) associated with the basis or the reciprocal frame, that is
\begin{equation}\label{eqn:reciprocalblog:640}
\Bx = x^i \Bx_i = x_j \Bx^j.
\end{equation}
On the other hand, it is conventional to use non-bold face for both the four-vectors and their coordinates in STA, such as the following standard basis decomposition
\begin{equation}\label{eqn:reciprocalblog:660}
x = x^\mu \gamma_\mu = x_\mu \gamma^\mu.
\end{equation}
If we use non-bold face \( x^\mu, x_\nu \) for the coordinates with respect to a specified frame, then we cannot also use non-bold face for the curvilinear basis vectors.

To resolve this notational ambiguity, I’ve chosen to use bold face \( \Bx^\mu, \Bx_\nu \) symbols as the curvilinear basis elements in this relativistic context, as we do for Euclidean spaces.

Basis and coordinates.

Definition 1.1: Standard Dirac basis.

The Dirac basis elements are \(\setlr{ \gamma_0, \gamma_1, \gamma_2, \gamma_3 } \), satisfying
\begin{equation}\label{eqn:reciprocalblog:1940}
\gamma_0^2 = 1 = -\gamma_k^2, \quad \forall k = 1,2,3,
\end{equation}
and
\begin{equation}\label{eqn:reciprocalblog:740}
\gamma_\mu \cdot \gamma_\nu = 0, \quad \forall \mu \ne \nu.
\end{equation}

A conventional way of summarizing these orthogonality relationships is \( \gamma_\mu \cdot \gamma_\nu = \eta_{\mu\nu} \), where \( \eta_{\mu\nu} \) are the elements of the metric \( G = \text{diag}(+,-,-,-) \).

Definition 1.2: Reciprocal basis for the standard Dirac basis.

We define a reciprocal basis \( \setlr{ \gamma^0, \gamma^1, \gamma^2, \gamma^3} \) satisfying \( \gamma^\mu \cdot \gamma_\nu = {\delta^\mu}_\nu, \forall \mu,\nu \in \setlr{0,1,2,3} \).

Theorem 1.1: Reciprocal basis uniqueness.

This reciprocal basis is unique, and for our choice of metric has the values
\begin{equation}\label{eqn:reciprocalblog:1960}
\gamma^0 = \gamma_0, \quad \gamma^k = -\gamma_k, \quad \forall k = 1,2,3.
\end{equation}

Proof is left to the reader.

Definition 1.3: Coordinates.

We define the coordinates of a vector with respect to the standard basis as \( x^\mu \) satisfying
\begin{equation}\label{eqn:reciprocalblog:1980}
x = x^\mu \gamma_\mu,
\end{equation}
and define the coordinates of a vector with respect to the reciprocal basis as \( x_\mu \) satisfying
\begin{equation}\label{eqn:reciprocalblog:2000}
x = x_\mu \gamma^\mu.
\end{equation}

Theorem 1.2: Coordinates.

Given the definitions above, we may compute the coordinates of a vector, simply by dotting with the basis elements
\begin{equation}\label{eqn:reciprocalblog:2020}
x^\mu = x \cdot \gamma^\mu,
\end{equation}
and
\begin{equation}\label{eqn:reciprocalblog:2040}
x_\mu = x \cdot \gamma_\mu.
\end{equation}

Start proof:

This follows by straightforward computation
\begin{equation}\label{eqn:reciprocalblog:840}
\begin{aligned}
x \cdot \gamma^\mu
&=
\lr{ x^\nu \gamma_\nu } \cdot \gamma^\mu \\
&=
x^\nu \lr{ \gamma_\nu \cdot \gamma^\mu } \\
&=
x^\nu {\delta_\nu}^\mu \\
&=
x^\mu,
\end{aligned}
\end{equation}
and
\begin{equation}\label{eqn:reciprocalblog:860}
\begin{aligned}
x \cdot \gamma_\mu
&=
\lr{ x_\nu \gamma^\nu } \cdot \gamma_\mu \\
&=
x_\nu \lr{ \gamma^\nu \cdot \gamma_\mu } \\
&=
x_\nu {\delta^\nu}_\mu \\
&=
x_\mu.
\end{aligned}
\end{equation}

End proof.
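The coordinate extraction in thm. 1.2 is easy to check with plain arithmetic. This is a minimal sketch that assumes vectors are stored as tuples of their \( x^\mu \) coordinates on the \( \gamma_\mu \) basis, with the metric signature \( (+,-,-,-) \) from the definitions above. The names `dot`, `gamma_lower`, and `gamma_upper` are hypothetical.

```python
ETA = (1.0, -1.0, -1.0, -1.0)   # diagonal of the metric, eta_{mu mu}

def dot(a, b):
    """Minkowski dot product of vectors given by their coordinates on gamma_mu."""
    return sum(e * p * q for e, p, q in zip(ETA, a, b))

# gamma_mu as coordinate tuples, and the reciprocals gamma^0 = gamma_0, gamma^k = -gamma_k.
gamma_lower = [tuple(1.0 if i == mu else 0.0 for i in range(4)) for mu in range(4)]
gamma_upper = [tuple(ETA[mu] * c for c in gamma_lower[mu]) for mu in range(4)]

x = (2.0, 0.5, -1.5, 0.25)
x_up = [dot(x, g) for g in gamma_upper]   # x^mu = x . gamma^mu
x_dn = [dot(x, g) for g in gamma_lower]   # x_mu = x . gamma_mu

# Reciprocity table gamma^mu . gamma_nu, which should be the identity.
delta = [[dot(gu, gl) for gl in gamma_lower] for gu in gamma_upper]
```

Here `x_up` recovers the original coordinates, while `x_dn` differs from them by a sign flip of the spatial entries, as expected for this metric.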

Derivative operators.

We’d like to determine the form of the (spacetime) gradient operator. The gradient can be defined in terms of coordinates directly, but we choose an implicit definition, in terms of the directional derivative.

Definition 1.4: Directional derivative and gradient.

Let \( F = F(x) \) be a four-vector parameterized multivector. The directional derivative of \( F \) with respect to the (four-vector) direction \( a \) is denoted
\begin{equation}\label{eqn:reciprocalblog:2060}
\lr{ a \cdot \grad } F = \lim_{\epsilon \rightarrow 0} \frac{ F(x + \epsilon a) - F(x) }{ \epsilon },
\end{equation}
where \( \grad \) is called the space time gradient.

Theorem 1.3: Gradient.

The standard basis representation of the gradient is
\begin{equation}\label{eqn:reciprocalblog:2080}
\grad = \gamma^\mu \partial_\mu,
\end{equation}
where
\begin{equation}\label{eqn:reciprocalblog:2100}
\partial_\mu = \PD{x^\mu}{}.
\end{equation}

Start proof:

The Dirac gradient pops naturally out of the coordinate representation of the directional derivative, as we can see by expanding \( F(x + \epsilon a) \) in Taylor series
\begin{equation}\label{eqn:reciprocalblog:900}
\begin{aligned}
F(x + \epsilon a)
&= F(x) + \epsilon \frac{dF(x + \epsilon a)}{d\epsilon} + O(\epsilon^2) \\
&= F(x) + \epsilon \PD{\lr{x^\mu + \epsilon a^\mu}}{F} \PD{\epsilon}{\lr{x^\mu + \epsilon a^\mu}} \\
&= F(x) + \epsilon \PD{\lr{x^\mu + \epsilon a^\mu}}{F} a^\mu.
\end{aligned}
\end{equation}
The directional derivative is
\begin{equation}\label{eqn:reciprocalblog:920}
\begin{aligned}
\lim_{\epsilon \rightarrow 0}
\frac{F(x + \epsilon a) - F(x)}{\epsilon}
&=
\lim_{\epsilon \rightarrow 0}\,
a^\mu
\PD{\lr{x^\mu + \epsilon a^\mu}}{F} \\
&=
a^\mu
\PD{x^\mu}{F} \\
&=
\lr{a^\nu \gamma_\nu} \cdot \gamma^\mu \PD{x^\mu}{F} \\
&=
a \cdot \lr{ \gamma^\mu \partial_\mu } F.
\end{aligned}
\end{equation}

End proof.
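To see the directional derivative and the coordinate form \( a^\mu \partial_\mu \) agree in practice, here is a small finite-difference sketch. It assumes the scalar test function \( F(x) = x \cdot x \), for which \( \lr{a \cdot \grad} F = 2 a \cdot x \). That test function, the step sizes, and the helper names are all choices made for this illustration.

```python
ETA = (1.0, -1.0, -1.0, -1.0)

def dot(a, b):
    """Minkowski dot product of coordinate tuples on the gamma_mu basis."""
    return sum(e * p * q for e, p, q in zip(ETA, a, b))

def F(x):
    """Scalar test function F(x) = x . x (an assumption for this example)."""
    return dot(x, x)

def directional(F, x, a, eps=1e-6):
    """Forward-difference approximation of the limit defining (a . grad) F."""
    shifted = [xi + eps * ai for xi, ai in zip(x, a)]
    return (F(shifted) - F(x)) / eps

def partials(F, x, h=1e-6):
    """Central-difference approximations of d F / d x^mu."""
    out = []
    for mu in range(4):
        up, dn = list(x), list(x)
        up[mu] += h
        dn[mu] -= h
        out.append((F(up) - F(dn)) / (2.0 * h))
    return out

x = (2.0, 0.5, -1.5, 0.25)
a = (1.0, 2.0, 0.0, -1.0)
from_limit = directional(F, x, a)
from_coords = sum(am * dm for am, dm in zip(a, partials(F, x)))   # a^mu d_mu F
exact = 2.0 * dot(a, x)
```

All three numbers agree to within the finite-difference error, which is the content of \( a \cdot \grad = a^\mu \partial_\mu \) for this particular \( F \).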

Curvilinear bases.

Curvilinear bases are the foundation of the fundamental theorem of multivector calculus. This form of integral calculus is defined over parameterized surfaces (called manifolds) that satisfy some specific non-degeneracy and continuity requirements.

A parameterized vector \( x(u,v, \cdots w) \) can be thought of as tracing out a hypersurface (curve, surface, volume, …), where the dimension of the hypersurface depends on the number of parameters. At each point, a basis can be constructed from the differentials of the parameterized vector. The span of such a basis is called the tangent space to the surface at the point in question. Our curvilinear bases will be related to these differentials. We will also be interested in a dual basis that is restricted to the span of the tangent space. This dual basis will be called the reciprocal frame, and like the basis of the tangent space itself, also varies from point to point on the surface.

Fig 1a. One parameter curve, with illustration of tangent space along the curve.

Fig 1b. Two parameter surface, with illustration of tangent space along the surface.

One and two parameter spaces are illustrated in fig. 1a, and 1b.  The tangent space basis at a specific point of a two parameter surface, \( x(u^0, u^1) \), is illustrated in fig. 2. The differential directions that span the tangent space are
\begin{equation}\label{eqn:reciprocalblog:1040}
\begin{aligned}
d\Bx_0 &= \PD{u^0}{x} du^0 \\
d\Bx_1 &= \PD{u^1}{x} du^1,
\end{aligned}
\end{equation}
and the tangent space itself is \( \mbox{Span}\setlr{ d\Bx_0, d\Bx_1 } \). We may form an oriented surface area element \( d\Bx_0 \wedge d\Bx_1 \) over this surface.

Fig 2. Two parameter surface.

Tangent spaces associated with 3 or more parameters cannot be easily visualized in three dimensions, but the idea generalizes algebraically without trouble.

Definition 1.5: Tangent basis and space.

Given a parameterization \( x = x(u^0, \cdots, u^N) \), where \( N < 4 \), the span of the vectors
\begin{equation}\label{eqn:reciprocalblog:2120}
\Bx_\mu = \PD{u^\mu}{x},
\end{equation}
is called the tangent space for the hypersurface associated with the parameterization, and its basis is
\( \setlr{ \Bx_\mu } \).

Later we will see that parameterization constraints must be imposed, as not all surfaces generated by a set of parameterizations are useful for integration theory. In particular, degenerate parameterizations for which the wedge products of the tangent space basis vectors are zero, or those wedge products cannot be inverted, are not physically meaningful. Properly behaved surfaces of this sort are called manifolds.

Having introduced curvilinear coordinates associated with a parameterization, we can now determine the form of the gradient with respect to a parameterization of spacetime.

Theorem 1.4: Gradient, curvilinear representation.

Given a spacetime parameterization \( x = x(u^0, u^1, u^2, u^3) \), the gradient with respect to the parameters \( u^\mu \) is
\begin{equation}\label{eqn:reciprocalblog:2140}
\grad = \sum_\mu \Bx^\mu
\PD{u^\mu}{},
\end{equation}
where
\begin{equation}\label{eqn:reciprocalblog:2160}
\Bx^\mu = \grad u^\mu.
\end{equation}
The vectors \( \Bx^\mu \) are called the reciprocal frame vectors, and the ordered set \( \setlr{ \Bx^0, \Bx^1, \Bx^2, \Bx^3 } \) is called the reciprocal basis. It is convenient to define \( \partial_\mu \equiv \PDi{u^\mu}{} \), so that the gradient can be expressed in mixed index representation
\begin{equation}\label{eqn:reciprocalblog:2180}
\grad = \Bx^\mu \partial_\mu.
\end{equation}
This introduces some notational ambiguity, since we used \( \partial_\mu = \PDi{x^\mu}{} \) for the standard basis derivative operators too, but we will be careful to be explicit when there is any doubt about what is intended.

Start proof:

The proof follows by application of the chain rule.
\begin{equation}\label{eqn:reciprocalblog:960}
\begin{aligned}
\grad F
&=
\gamma^\alpha \PD{x^\alpha}{F} \\
&=
\gamma^\alpha
\PD{x^\alpha}{u^\mu}
\PD{u^\mu}{F} \\
&=
\lr{ \grad u^\mu } \PD{u^\mu}{F} \\
&=
\Bx^\mu \PD{u^\mu}{F}.
\end{aligned}
\end{equation}

End proof.

Theorem 1.5: Reciprocal relationship.

The vectors \( \Bx^\mu = \grad u^\mu \), and \( \Bx_\mu = \PDi{u^\mu}{x} \) satisfy the reciprocal relationship
\begin{equation}\label{eqn:reciprocalblog:2200}
\Bx^\mu \cdot \Bx_\nu = {\delta^\mu}_\nu.
\end{equation}

Start proof:

\begin{equation}\label{eqn:reciprocalblog:1020}
\begin{aligned}
\Bx^\mu \cdot \Bx_\nu
&=
\grad u^\mu \cdot
\PD{u^\nu}{x} \\
&=
\lr{
\gamma^\alpha \PD{x^\alpha}{u^\mu}
}
\cdot
\lr{
\PD{u^\nu}{x^\beta} \gamma_\beta
} \\
&=
{\delta^\alpha}_\beta \PD{x^\alpha}{u^\mu}
\PD{u^\nu}{x^\beta} \\
&=
\PD{x^\alpha}{u^\mu} \PD{u^\nu}{x^\alpha} \\
&=
\PD{u^\nu}{u^\mu} \\
&=
{\delta^\mu}_\nu
.
\end{aligned}
\end{equation}

End proof.

It is instructive to consider an example. Here is a parameterization that scales the proper time parameter, and uses polar coordinates in the \(x-y\) plane.

Problem: Compute the curvilinear and reciprocal basis.

Given
\begin{equation}\label{eqn:reciprocalblog:2360}
x(t,\rho,\theta,z) = c t \gamma_0 + \gamma_1 \rho e^{i \theta} + z \gamma_3,
\end{equation}
where \( i = \gamma_1 \gamma_2 \), compute the curvilinear frame vectors and their reciprocals.

Answer

The frame vectors are all easy to compute
\begin{equation}\label{eqn:reciprocalblog:1180}
\begin{aligned}
\Bx_0 &= \PD{t}{x} = c \gamma_0 \\
\Bx_1 &= \PD{\rho}{x} = \gamma_1 e^{i \theta} \\
\Bx_2 &= \PD{\theta}{x} = \rho \gamma_1 \gamma_1 \gamma_2 e^{i \theta} = - \rho \gamma_2 e^{i \theta} \\
\Bx_3 &= \PD{z}{x} = \gamma_3.
\end{aligned}
\end{equation}
The \( \Bx_1 \) vector is radial, and \( \Bx_2 \) is perpendicular to that, tangent to the same unit circle, as plotted in fig 3.

Fig3: Tangent space direction vectors.

All of these particular frame vectors happen to be mutually perpendicular, something that will not generally be true for a more arbitrary parameterization.

To compute the reciprocal frame vectors, we must express our parameters in terms of \( x^\mu \) coordinates, and use implicit differentiation techniques to deal with the coupling of the rotational terms. First observe that
\begin{equation}\label{eqn:reciprocalblog:1200}
\gamma_1 e^{i\theta}
= \gamma_1 \lr{ \cos\theta + \gamma_1 \gamma_2 \sin\theta }
= \gamma_1 \cos\theta - \gamma_2 \sin\theta,
\end{equation}
so
\begin{equation}\label{eqn:reciprocalblog:1220}
\begin{aligned}
x^0 &= c t \\
x^1 &= \rho \cos\theta \\
x^2 &= -\rho \sin\theta \\
x^3 &= z.
\end{aligned}
\end{equation}
We can easily evaluate the \( t, z \) gradients
\begin{equation}\label{eqn:reciprocalblog:1240}
\begin{aligned}
\grad t &= \frac{\gamma^0 }{c} \\
\grad z &= \gamma^3,
\end{aligned}
\end{equation}
but the \( \rho, \theta \) gradients are not as easy. First writing
\begin{equation}\label{eqn:reciprocalblog:1260}
\rho^2 = \lr{x^1}^2 + \lr{x^2}^2,
\end{equation}
we find
\begin{equation}\label{eqn:reciprocalblog:1280}
\begin{aligned}
2 \rho \grad \rho = 2 \lr{ x^1 \grad x^1 + x^2 \grad x^2 }
&= 2 \rho \lr{ \cos\theta \gamma^1 - \sin\theta \gamma^2 } \\
&= 2 \rho \gamma^1 \lr{ \cos\theta - \gamma_1 \gamma^2 \sin\theta } \\
&= 2 \rho \gamma^1 e^{i\theta},
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:reciprocalblog:1300}
\grad \rho = \gamma^1 e^{i\theta}.
\end{equation}
For the \( \theta \) gradient, we can write
\begin{equation}\label{eqn:reciprocalblog:1320}
\tan\theta = -\frac{x^2}{x^1},
\end{equation}
so
\begin{equation}\label{eqn:reciprocalblog:1340}
\begin{aligned}
\inv{\cos^2 \theta} \grad \theta
&= -\frac{\gamma^2}{x^1} - x^2 \frac{-\gamma^1}{\lr{x^1}^2} \\
&= \inv{\lr{x^1}^2} \lr{ - \gamma^2 x^1 + \gamma^1 x^2 } \\
&= \frac{\rho}{\rho^2 \cos^2\theta } \lr{ - \gamma^2 \cos\theta - \gamma^1 \sin\theta } \\
&= -\frac{1}{\rho \cos^2\theta } \gamma^2 \lr{ \cos\theta + \gamma_2 \gamma^1 \sin\theta } \\
&= -\frac{\gamma^2 e^{i\theta} }{\rho \cos^2\theta },
\end{aligned}
\end{equation}
or
\begin{equation}\label{eqn:reciprocalblog:1360}
\grad\theta = -\inv{\rho} \gamma^2 e^{i\theta}.
\end{equation}
In summary,
\begin{equation}\label{eqn:reciprocalblog:1380}
\begin{aligned}
\Bx^0 &= \frac{\gamma^0}{c} \\
\Bx^1 &= \gamma^1 e^{i\theta} \\
\Bx^2 &= -\inv{\rho} \gamma^2 e^{i\theta} \\
\Bx^3 &= \gamma^3.
\end{aligned}
\end{equation}
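These gradients can be cross-checked by brute force. The sketch below differentiates \( \rho \) and \( \theta \) numerically with respect to the \( x^\mu \) coordinates and assembles \( \grad u = \gamma^\mu \partial_\mu u \), storing the result as coordinates on the \( \gamma_\mu \) basis (so each partial is multiplied by \( \eta_{\mu\mu} \)). The use of `atan2` to recover \( \theta \), the sample point, and the helper names are assumptions made for this example.

```python
from math import atan2, cos, hypot, sin

ETA = (1.0, -1.0, -1.0, -1.0)

def rho(x):
    """rho^2 = (x^1)^2 + (x^2)^2."""
    return hypot(x[1], x[2])

def theta(x):
    """From x^1 = rho cos(theta), x^2 = -rho sin(theta)."""
    return atan2(-x[2], x[1])

def grad(u, x, h=1e-6):
    """Coordinates of grad u = gamma^mu d_mu u on the gamma_mu basis."""
    comps = []
    for mu in range(4):
        up, dn = list(x), list(x)
        up[mu] += h
        dn[mu] -= h
        partial = (u(up) - u(dn)) / (2.0 * h)
        comps.append(ETA[mu] * partial)     # gamma^mu = eta_{mu mu} gamma_mu
    return comps

r, th = 2.0, 0.7                             # arbitrary sample point
x = (1.0, r * cos(th), -r * sin(th), 0.5)
C, S = cos(th), sin(th)

grad_rho = grad(rho, x)
grad_theta = grad(theta, x)

# Closed forms: gamma^1 e^{i theta} = -C gamma_1 + S gamma_2,
# and -(gamma^2 / rho) e^{i theta} = (S gamma_1 + C gamma_2) / rho.
expected_rho = [0.0, -C, S, 0.0]
expected_theta = [0.0, S / r, C / r, 0.0]
```

The numerical gradients match the closed forms to within the finite-difference error at this sample point, which is a useful guard against sign slips in the \( \gamma^k = -\gamma_k \) bookkeeping.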

Despite being a fairly simple parameterization, it was still fairly difficult to solve for the gradients when the parameterization introduced coupling between the coordinates. In this particular case, we could have solved for the parameters in terms of the coordinates (but it was easier not to), but that will not generally be true. We want a less labor intensive strategy to find the reciprocal frame. When we have a full parameterization of spacetime, then we can do this with nothing more than a matrix inversion.

Theorem 1.6: Reciprocal frame matrix equations.

Given a spacetime basis \( \setlr{\Bx_0, \cdots \Bx_3} \), let \( [\Bx_\mu] \) and \( [\Bx^\nu] \) be column matrices with the coordinates of these vectors and their reciprocals, with respect to the standard basis \( \setlr{\gamma_0, \gamma_1, \gamma_2, \gamma_3 } \). Let
\begin{equation}\label{eqn:reciprocalblog:2220}
A =
\begin{bmatrix}
[\Bx_0] & \cdots & [\Bx_{3}]
\end{bmatrix}
,\qquad
X =
\begin{bmatrix}
[\Bx^0] & \cdots & [\Bx^{3}]
\end{bmatrix}.
\end{equation}
The coordinates of the reciprocal frame vectors can be found by solving
\begin{equation}\label{eqn:reciprocalblog:2240}
A^\T G X = 1,
\end{equation}
where \( G = \text{diag}(1,-1,-1,-1) \) and the RHS is a \( 4 \times 4 \) identity matrix.

Start proof:

Let \( \Bx_\mu = {a_\mu}^\alpha \gamma_\alpha, \Bx^\nu = b^{\nu\beta} \gamma_\beta \), so that
\begin{equation}\label{eqn:reciprocalblog:140}
A =
\begin{bmatrix}
{a_\nu}^\mu
\end{bmatrix},
\end{equation}
and
\begin{equation}\label{eqn:reciprocalblog:160}
X =
\begin{bmatrix}
b^{\nu\mu}
\end{bmatrix},
\end{equation}
where \( \mu \in [0,3]\) are the row indexes and \( \nu \in [0,3]\) are the column indexes. The reciprocal frame satisfies \( \Bx_\mu \cdot \Bx^\nu = {\delta_\mu}^\nu \), which has the coordinate representation of
\begin{equation}\label{eqn:reciprocalblog:180}
\begin{aligned}
\Bx_\mu \cdot \Bx^\nu
&=
\lr{
{a_\mu}^\alpha \gamma_\alpha
}
\cdot
\lr{
b^{\nu\beta} \gamma_\beta
} \\
&=
{a_\mu}^\alpha
\eta_{\alpha\beta}
b^{\nu\beta} \\
&=
{[A^\T G X]_\mu}^\nu,
\end{aligned}
\end{equation}
where \( \mu \) is the row index and \( \nu \) is the column index.

End proof.

Problem: Matrix inversion reciprocals.

For the parameterization of \ref{eqn:reciprocalblog:2360}, find the reciprocal frame vectors by matrix inversion.

Answer

We expanded \( \Bx_1 \) explicitly in \ref{eqn:reciprocalblog:1200}. Doing the same for \( \Bx_2 \), we have
\begin{equation}\label{eqn:reciprocalblog:1201}
\Bx_2 =
-\rho \gamma_2 e^{i\theta}
= -\rho \gamma_2 \lr{ \cos\theta + \gamma_1 \gamma_2 \sin\theta }
= - \rho \lr{ \gamma_2 \cos\theta + \gamma_1 \sin\theta}.
\end{equation}
Reading off the coordinates of our frame vectors, we have
\begin{equation}\label{eqn:reciprocalblog:1400}
A =
\begin{bmatrix}
c & 0 & 0 & 0 \\
0 & C & -\rho S & 0 \\
0 & -S & -\rho C & 0 \\
0 & 0 & 0 & 1 \\
\end{bmatrix},
\end{equation}
where \( C = \cos\theta \) and \( S = \sin\theta \). We want \( X = \lr{ A^\T G }^{-1} \), which is
\begin{equation}\label{eqn:reciprocalblog:1420}
X =
{\begin{bmatrix}
c & 0 & 0 & 0 \\
0 & -C & S & 0 \\
0 & \rho S & \rho C & 0 \\
0 & 0 & 0 & -1 \\
\end{bmatrix}}^{-1}
=
\begin{bmatrix}
\inv{c} & 0 & 0 & 0 \\
0 & -C & \frac{S}{\rho} & 0 \\
0 & S & \frac{C}{\rho} & 0 \\
0 & 0 & 0 & -1 \\
\end{bmatrix}.
\end{equation}
We can read off the coordinates of the reciprocal frame vectors
\begin{equation}\label{eqn:reciprocalblog:1440}
\begin{aligned}
\Bx^0 &= \inv{c} \gamma_0 \\
\Bx^1 &= -\cos\theta \gamma_1 + \sin\theta \gamma_2 \\
\Bx^2 &= \inv{\rho} \lr{ \sin\theta \gamma_1 + \cos\theta \gamma_2 } \\
\Bx^3 &= -\gamma_3.
\end{aligned}
\end{equation}
Factoring out \( \gamma^1 \) from the \( \Bx^1 \) terms, we find
\begin{equation}\label{eqn:reciprocalblog:1460}
\begin{aligned}
\Bx^1
&= -\cos\theta \gamma_1 + \sin\theta \gamma_2 \\
&= \gamma^1 \lr{ \cos\theta + \gamma_1 \gamma_2 \sin\theta } \\
&= \gamma^1 e^{i\theta}.
\end{aligned}
\end{equation}
Similarly for \( \Bx^2 \),
\begin{equation}\label{eqn:reciprocalblog:1480}
\begin{aligned}
\Bx^2
&= \inv{\rho} \lr{ \sin\theta \gamma_1 + \cos\theta \gamma_2 } \\
&= \frac{\gamma^2}{\rho} \lr{ \sin\theta \gamma_2 \gamma_1 - \cos\theta } \\
&= -\frac{\gamma^2}{\rho} e^{i\theta}.
\end{aligned}
\end{equation}
This matches \ref{eqn:reciprocalblog:1380}, as expected, but required only algebraic work to compute.
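The matrix inversion can also be left to a few lines of code. What follows is a sketch under stated assumptions: frame vectors are stored as rows of coordinates on the \( \gamma_\alpha \) basis, the inverse is computed with a plain Gauss-Jordan elimination (any linear algebra library would do the same job), and the values of \( c, \rho, \theta \) are arbitrary. The function names are hypothetical.

```python
from math import cos, sin

ETA = (1.0, -1.0, -1.0, -1.0)

def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (assumes M is invertible)."""
    n = len(M)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def reciprocal_frame(frame):
    """frame[mu] holds the coordinates of x_mu; returns the coordinates of each x^nu."""
    # Row mu of A^T G is the coordinate row of x_mu with column alpha scaled by eta_alpha.
    AtG = [[ETA[al] * frame[mu][al] for al in range(4)] for mu in range(4)]
    X = inverse(AtG)
    # Column nu of X holds the coordinates of the reciprocal vector x^nu.
    return [[X[al][nu] for al in range(4)] for nu in range(4)]

c, r, th = 3.0, 2.0, 0.7          # arbitrary sample values
C, S = cos(th), sin(th)
frame = [(c, 0.0, 0.0, 0.0),
         (0.0, C, -S, 0.0),
         (0.0, -r * S, -r * C, 0.0),
         (0.0, 0.0, 0.0, 1.0)]
recip = reciprocal_frame(frame)
expected = [[1.0 / c, 0.0, 0.0, 0.0],
            [0.0, -C, S, 0.0],
            [0.0, S / r, C / r, 0.0],
            [0.0, 0.0, 0.0, -1.0]]
```

The rows of `recip` reproduce the coordinates read off above, without any of the implicit differentiation that the gradient method required.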

There will be circumstances where we parameterize only a subset of spacetime, and are interested in calculating quantities associated with such a surface. For example, suppose that
\begin{equation}\label{eqn:reciprocalblog:1500}
x(\rho,\theta) = \gamma_1 \rho e^{i \theta},
\end{equation}
where \( i = \gamma_1 \gamma_2 \) as before. We are now parameterizing only the \(x-y\) plane. We will still find
\begin{equation}\label{eqn:reciprocalblog:1520}
\begin{aligned}
\Bx_1 &= \gamma_1 e^{i \theta} \\
\Bx_2 &= -\gamma_2 \rho e^{i \theta}.
\end{aligned}
\end{equation}
We can compute the reciprocals of these vectors using the gradient method. It’s possible to state matrix equations representing the reciprocal relationship of \ref{eqn:reciprocalblog:2200}, which, in this case, is \( A^\T G X = 1 \), where the RHS is a \( 2 \times 2 \) identity matrix, and \( A, X\) are \( 4\times 2\) matrices of coordinates, with
\begin{equation}\label{eqn:reciprocalblog:1540}
A =
\begin{bmatrix}
0 & 0 \\
C & -\rho S \\
-S & -\rho C \\
0 & 0
\end{bmatrix}.
\end{equation}
We no longer have a square matrix problem to solve, and our solution set is multivalued. In particular, this matrix equation has solutions
\begin{equation}\label{eqn:reciprocalblog:1560}
\begin{aligned}
\Bx^1 &= \gamma^1 e^{i\theta} + \alpha \gamma^0 + \beta \gamma^3 \\
\Bx^2 &= -\frac{\gamma^2}{\rho} e^{i\theta} + \alpha' \gamma^0 + \beta' \gamma^3,
\end{aligned}
\end{equation}
where \( \alpha, \alpha', \beta, \beta' \) are arbitrary constants. In the example we considered, we saw that our \( \rho, \theta \) parameters were functions of only \( x^1, x^2 \), so taking gradients could not introduce any \( \gamma^0, \gamma^3 \) dependence in \( \Bx^1, \Bx^2 \). It seems reasonable to assert that we seek an algebraic method of computing a set of vectors that satisfies the reciprocal relationships, where that set of vectors is restricted to the tangent space. We will need to figure out how to prove that this reciprocal construction is identical to the parameter gradients, but let’s start with figuring out what such a tangent space restricted solution looks like.

Theorem 1.7: Reciprocal frame for two parameter subspace.

Given two vectors, \( \Bx_1, \Bx_2 \), the vectors \( \Bx^1, \Bx^2 \in \mbox{Span}\setlr{ \Bx_1, \Bx_2 } \) such that \( \Bx^\mu \cdot \Bx_\nu = {\delta^\mu}_\nu \) are given by
\begin{equation}\label{eqn:reciprocalblog:2260}
\begin{aligned}
\Bx^1 &= \Bx_2 \cdot \inv{\Bx_1 \wedge \Bx_2} \\
\Bx^2 &= -\Bx_1 \cdot \inv{\Bx_1 \wedge \Bx_2},
\end{aligned}
\end{equation}
provided \( \Bx_1 \wedge \Bx_2 \ne 0 \) and
\( \lr{ \Bx_1 \wedge \Bx_2 }^2 \ne 0 \).

Start proof:

The most general set of vectors that satisfy the span constraint are
\begin{equation}\label{eqn:reciprocalblog:1580}
\begin{aligned}
\Bx^1 &= a \Bx_1 + b \Bx_2 \\
\Bx^2 &= c \Bx_1 + d \Bx_2.
\end{aligned}
\end{equation}
We can use wedge products with either \( \Bx_1 \) or \( \Bx_2 \) to eliminate the other from the RHS
\begin{equation}\label{eqn:reciprocalblog:1600}
\begin{aligned}
\Bx^1 \wedge \Bx_2 &= a \lr{ \Bx_1 \wedge \Bx_2 } \\
\Bx^1 \wedge \Bx_1 &= - b \lr{ \Bx_1 \wedge \Bx_2 } \\
\Bx^2 \wedge \Bx_2 &= c \lr{ \Bx_1 \wedge \Bx_2 } \\
\Bx^2 \wedge \Bx_1 &= - d \lr{ \Bx_1 \wedge \Bx_2 },
\end{aligned}
\end{equation}
and then dot both sides with \( \Bx_1 \wedge \Bx_2 \) to produce four scalar equations
\begin{equation}\label{eqn:reciprocalblog:1640}
\begin{aligned}
a \lr{ \Bx_1 \wedge \Bx_2 }^2
&= \lr{ \Bx^1 \wedge \Bx_2 } \cdot \lr{ \Bx_1 \wedge \Bx_2 } \\
&=
\lr{ \Bx_2 \cdot \Bx_1 } \lr{ \Bx^1 \cdot \Bx_2 }
-
\lr{ \Bx_2 \cdot \Bx_2 } \lr{ \Bx^1 \cdot \Bx_1 } \\
&=
\lr{ \Bx_2 \cdot \Bx_1 } (0)
-
\lr{ \Bx_2 \cdot \Bx_2 } (1) \\
&= - \Bx_2 \cdot \Bx_2
\end{aligned}
\end{equation}
\begin{equation}\label{eqn:reciprocalblog:1660}
\begin{aligned}
- b \lr{ \Bx_1 \wedge \Bx_2 }^2
&=
\lr{ \Bx^1 \wedge \Bx_1 } \cdot \lr{ \Bx_1 \wedge \Bx_2 } \\
&=
\lr{ \Bx^1 \cdot \Bx_2 } \lr{ \Bx_1 \cdot \Bx_1 }
-
\lr{ \Bx^1 \cdot \Bx_1 } \lr{ \Bx_1 \cdot \Bx_2 } \\
&=
(0) \lr{ \Bx_1 \cdot \Bx_1 }
-
(1) \lr{ \Bx_1 \cdot \Bx_2 } \\
&= - \Bx_1 \cdot \Bx_2
\end{aligned}
\end{equation}
\begin{equation}\label{eqn:reciprocalblog:1680}
\begin{aligned}
c \lr{ \Bx_1 \wedge \Bx_2 }^2
&= \lr{ \Bx^2 \wedge \Bx_2 } \cdot \lr{ \Bx_1 \wedge \Bx_2 } \\
&=
\lr{ \Bx_2 \cdot \Bx_1 } \lr{ \Bx^2 \cdot \Bx_2 }
-
\lr{ \Bx_2 \cdot \Bx_2 } \lr{ \Bx^2 \cdot \Bx_1 } \\
&=
\lr{ \Bx_2 \cdot \Bx_1 } (1)
-
\lr{ \Bx_2 \cdot \Bx_2 } (0) \\
&= \Bx_2 \cdot \Bx_1
\end{aligned}
\end{equation}
\begin{equation}\label{eqn:reciprocalblog:1700}
\begin{aligned}
- d \lr{ \Bx_1 \wedge \Bx_2 }^2
&= \lr{ \Bx^2 \wedge \Bx_1 } \cdot \lr{ \Bx_1 \wedge \Bx_2 } \\
&=
\lr{ \Bx_1 \cdot \Bx_1 } \lr{ \Bx^2 \cdot \Bx_2 }
-
\lr{ \Bx_1 \cdot \Bx_2 } \lr{ \Bx^2 \cdot \Bx_1 } \\
&=
\lr{ \Bx_1 \cdot \Bx_1 } (1)
-
\lr{ \Bx_1 \cdot \Bx_2 } (0) \\
&= \Bx_1 \cdot \Bx_1.
\end{aligned}
\end{equation}
Putting the pieces together we have
\begin{equation}\label{eqn:reciprocalblog:1740}
\begin{aligned}
\Bx^1
&= \frac{ - \lr{ \Bx_2 \cdot \Bx_2 } \Bx_1 + \lr{ \Bx_1 \cdot \Bx_2 } \Bx_2
}{\lr{\Bx_1 \wedge \Bx_2}^2} \\
&=
\frac{
\Bx_2 \cdot \lr{ \Bx_1 \wedge \Bx_2 }
}{\lr{\Bx_1 \wedge \Bx_2}^2} \\
&=
\Bx_2 \cdot \inv{\Bx_1 \wedge \Bx_2}
\end{aligned}
\end{equation}
\begin{equation}\label{eqn:reciprocalblog:1760}
\begin{aligned}
\Bx^2
&=
\frac{ \lr{ \Bx_1 \cdot \Bx_2 } \Bx_1 - \lr{ \Bx_1 \cdot \Bx_1 } \Bx_2
}{\lr{\Bx_1 \wedge \Bx_2}^2} \\
&=
\frac{ -\Bx_1 \cdot \lr{ \Bx_1 \wedge \Bx_2 } }
{\lr{\Bx_1 \wedge \Bx_2}^2} \\
&=
-\Bx_1 \cdot \inv{\Bx_1 \wedge \Bx_2}
\end{aligned}
\end{equation}

End proof.
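
As a sanity check of Theorem 1.7, here is a minimal Python sketch (not part of the original derivation) that evaluates the expanded forms \ref{eqn:reciprocalblog:1740} and \ref{eqn:reciprocalblog:1760} for the polar tangent vectors of \ref{eqn:reciprocalblog:1520}. It assumes a \( (+,-,-,-) \) metric, represents each vector by its coordinates with respect to \( \gamma_0, \gamma_1, \gamma_2, \gamma_3 \), and uses the scalar expansion \( \lr{\Bx_1 \wedge \Bx_2}^2 = \lr{\Bx_1 \cdot \Bx_2}^2 - \Bx_1^2 \Bx_2^2 \). The helper names `dot` and `reciprocal_pair`, and the sample values of \( \rho, \theta \), are hypothetical.

```python
import math

# Assumed metric signature (+,-,-,-): gamma_0^2 = 1, gamma_k^2 = -1.
METRIC = (1.0, -1.0, -1.0, -1.0)


def dot(a, b):
    """Spacetime dot product of two vectors given as coordinate 4-tuples."""
    return sum(g * p * q for g, p, q in zip(METRIC, a, b))


def reciprocal_pair(x1, x2):
    """Reciprocals of x1, x2 that lie in Span{x1, x2}."""
    x11, x22, x12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    wedge_sq = x12 * x12 - x11 * x22  # scalar value of (x1 ^ x2)^2
    if wedge_sq == 0:
        raise ValueError("degenerate tangent space: (x1 ^ x2)^2 = 0")
    r1 = tuple((-x22 * p + x12 * q) / wedge_sq for p, q in zip(x1, x2))
    r2 = tuple((x12 * p - x11 * q) / wedge_sq for p, q in zip(x1, x2))
    return r1, r2


rho, theta = 2.0, 0.3  # arbitrary sample point
C, S = math.cos(theta), math.sin(theta)
x1 = (0.0, C, -S, 0.0)               # gamma_1 e^{i theta}
x2 = (0.0, -rho * S, -rho * C, 0.0)  # -gamma_2 rho e^{i theta}
r1, r2 = reciprocal_pair(x1, x2)

# The four dot products should form a 2x2 identity (up to rounding).
print([[round(dot(r, x), 12) for x in (x1, x2)] for r in (r1, r2)])

# Adding gamma^0 and gamma^3 components (coordinates (1,0,0,0) and (0,0,0,-1))
# preserves the reciprocal relations, but leaves the tangent space.
r1_alt = (r1[0] + 5.0, r1[1], r1[2], r1[3] - 7.0)
print(round(dot(r1_alt, x1), 12), round(dot(r1_alt, x2), 12))
```

The second print illustrates why the span constraint is needed to make the solution unique: the unrestricted matrix problem admits a whole family of solutions.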

Lemma 1.1: Distribution identity.

Given k-vectors \( B, C \) and a vector \( a \), where the grade of \( C \) is greater than that of \( B \), then
\begin{equation}\label{eqn:reciprocalblog:2280}
\lr{a \wedge B} \cdot C = a \cdot \lr{ B \cdot C }.
\end{equation}

See [1] for a proof.

Theorem 1.8: Higher order tangent space reciprocals.

Given an \(N\) parameter tangent space with basis \( \setlr{ \Bx_0, \Bx_1, \cdots \Bx_{N-1} } \), the reciprocals are given by
\begin{equation}\label{eqn:reciprocalblog:2300}
\Bx^\mu = (-1)^\mu
\lr{ \Bx_0 \wedge \cdots \check{\Bx_\mu} \cdots \wedge \Bx_{N-1} } \cdot I_N^{-1},
\end{equation}
where the checked term (\(\check{\Bx_\mu}\)) indicates that all terms are included in the wedges except the \( \Bx_\mu \) term, and \( I_N = \Bx_0 \wedge \cdots \Bx_{N-1} \) is the pseudoscalar for the tangent space.

Start proof:

I’ll outline the proof for the three parameter tangent space case, from which the pattern will be clear. The motivation for this proof is a reexamination of the algebraic structure of the two vector solution. Suppose we have a tangent space basis \( \setlr{\Bx_0, \Bx_1} \), for which we’ve shown that
\begin{equation}\label{eqn:reciprocalblog:1860}
\begin{aligned}
\Bx^0
&= \Bx_1 \cdot \inv{\Bx_0 \wedge \Bx_1} \\
&= \frac{\Bx_1 \cdot \lr{\Bx_0 \wedge \Bx_1} }{\lr{ \Bx_0 \wedge \Bx_1}^2 }.
\end{aligned}
\end{equation}
If we dot with \( \Bx_0 \), we find
\begin{equation}\label{eqn:reciprocalblog:1800}
\begin{aligned}
\Bx_0 \cdot \Bx^0
&=
\Bx_0 \cdot \frac{ \Bx_1 \cdot \lr{ \Bx_0 \wedge \Bx_1 } }{\lr{ \Bx_0 \wedge \Bx_1}^2 } \\
&=
\lr{ \Bx_0 \wedge \Bx_1 } \cdot \frac{ \Bx_0 \wedge \Bx_1 }{\lr{ \Bx_0 \wedge \Bx_1}^2 }.
\end{aligned}
\end{equation}
We end up with unity as expected. Here the
“factored” out vector is reincorporated into the pseudoscalar using the distribution identity \ref{eqn:reciprocalblog:2280}.
Similarly, dotting with \( \Bx_1 \), we find
\begin{equation}\label{eqn:reciprocalblog:0810}
\begin{aligned}
\Bx_1 \cdot \Bx^0
&=
\Bx_1 \cdot \frac{ \Bx_1 \cdot \lr{ \Bx_0 \wedge \Bx_1 } }{\lr{ \Bx_0 \wedge \Bx_1}^2 } \\
&=
\lr{ \Bx_1 \wedge \Bx_1 } \cdot \frac{ \Bx_0 \wedge \Bx_1 }{\lr{ \Bx_0 \wedge \Bx_1}^2 }.
\end{aligned}
\end{equation}
This is zero, since wedging a vector with itself is zero. We can perform such an operation in reverse, taking the square of the tangent space pseudoscalar, and factoring out one of the basis vectors. After this, division by that squared pseudoscalar will normalize things.

For a three parameter tangent space with basis \( \setlr{ \Bx_0, \Bx_1, \Bx_2 } \), we can factor out any of the tangent vectors like so
\begin{equation}\label{eqn:reciprocalblog:1880}
\begin{aligned}
\lr{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 }^2
&= \Bx_0 \cdot \lr{ \lr{ \Bx_1 \wedge \Bx_2 } \cdot \lr{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 } } \\
&= (-1) \Bx_1 \cdot \lr{ \lr{ \Bx_0 \wedge \Bx_2 } \cdot \lr{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 } } \\
&= (-1)^2 \Bx_2 \cdot \lr{ \lr{ \Bx_0 \wedge \Bx_1 } \cdot \lr{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 } }.
\end{aligned}
\end{equation}
The toggling of sign reflects the number of permutations required to move the vector of interest to the front of the wedge sequence. Having factored out any one of the vectors, we can rearrange to find the vector that is its reciprocal, and that is perpendicular to all the others.
\begin{equation}\label{eqn:reciprocalblog:1900}
\begin{aligned}
\Bx^0 &= (-1)^0 \lr{ \Bx_1 \wedge \Bx_2 } \cdot \inv{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 } \\
\Bx^1 &= (-1)^1 \lr{ \Bx_0 \wedge \Bx_2 } \cdot \inv{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 } \\
\Bx^2 &= (-1)^2 \lr{ \Bx_0 \wedge \Bx_1 } \cdot \inv{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 }.
\end{aligned}
\end{equation}

End proof.

In the same fashion, should we want the reciprocal frame for all of spacetime, given a four dimensional tangent space, we can state it trivially
\begin{equation}\label{eqn:reciprocalblog:1920}
\begin{aligned}
\Bx^0 &= (-1)^0 \lr{ \Bx_1 \wedge \Bx_2 \wedge \Bx_3 } \cdot \inv{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 \wedge \Bx_3 } \\
\Bx^1 &= (-1)^1 \lr{ \Bx_0 \wedge \Bx_2 \wedge \Bx_3 } \cdot \inv{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 \wedge \Bx_3 } \\
\Bx^2 &= (-1)^2 \lr{ \Bx_0 \wedge \Bx_1 \wedge \Bx_3 } \cdot \inv{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 \wedge \Bx_3 } \\
\Bx^3 &= (-1)^3 \lr{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 } \cdot \inv{ \Bx_0 \wedge \Bx_1 \wedge \Bx_2 \wedge \Bx_3 }.
\end{aligned}
\end{equation}
This is probably not an efficient way to compute all these reciprocals, since we can utilize a single matrix inversion to solve them in one shot. However, there are theoretical advantages to this construction that will be useful when we get to integration theory.
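
The “single matrix inversion” remark can be sketched as follows. This is an illustration under stated assumptions, not the wedge product construction of Theorem 1.8: it assumes a \( (+,-,-,-) \) metric, coordinates with respect to \( \gamma_\mu \), and uses the standard fact that the tangent space restricted reciprocals are \( \Bx^\mu = \sum_\nu \lr{G^{-1}}^{\mu\nu} \Bx_\nu \), where \( G_{\mu\nu} = \Bx_\mu \cdot \Bx_\nu \) is the Gram matrix. The helper names and the sample three vector frame are hypothetical.

```python
from fractions import Fraction

# Assumed metric signature (+,-,-,-).
METRIC = (1, -1, -1, -1)


def dot(a, b):
    """Spacetime dot product of two coordinate 4-tuples."""
    return sum(g * p * q for g, p, q in zip(METRIC, a, b))


def invert(matrix):
    """Exact Gauss-Jordan inverse of a small square matrix."""
    n = len(matrix)
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular Gram matrix: degenerate tangent space")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def reciprocal_frame(frame):
    """Reciprocals x^mu = sum_nu (G^{-1})^{mu nu} x_nu, all in Span{frame}."""
    gram = [[dot(a, b) for b in frame] for a in frame]
    gram_inv = invert(gram)
    size, dim = len(frame), len(frame[0])
    return [
        tuple(sum(gram_inv[mu][nu] * frame[nu][k] for nu in range(size))
              for k in range(dim))
        for mu in range(size)
    ]


# Hypothetical three parameter tangent space basis.
frame = [(1, 0, 0, 0), (1, 1, 0, 0), (0, 1, 1, 0)]
recip = reciprocal_frame(frame)
deltas = [[dot(r, x) for x in frame] for r in recip]
print(deltas == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # True
```

One inversion yields every reciprocal at once, and a singular Gram matrix flags the same degeneracy that a non-invertible pseudoscalar does.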

On degeneracy.

Degeneracy was mentioned briefly above. Regardless of metric, \( \Bx_0 \wedge \Bx_1 = 0 \) means that this pair of vectors is colinear. A tangent space with such a pseudoscalar is clearly undesirable, and we must construct parameterizations for which the area element is non-zero in all regions of interest.

Things get more interesting in mixed signature spaces where we can have vectors that square to zero (i.e. lightlike). If the tangent space pseudoscalar has a lightlike factor, then that pseudoscalar will not be invertible. Such a degeneracy will likely lead to many other troubles, and parameterizations of this sort should be avoided.

The following problem illustrates an example of this sort of degenerate parameterization.

Problem: Degenerate surface parameterization.

Given a spacetime plane parameterization \( x(u,v) = u a + v b \), where
\begin{equation}\label{eqn:reciprocalblog:480}
a = \gamma_0 + \gamma_1 + \gamma_2 + \gamma_3,
\end{equation}
\begin{equation}\label{eqn:reciprocalblog:500}
b = \gamma_0 - \gamma_1 + \gamma_2 - \gamma_3,
\end{equation}
show that this is a degenerate parameterization, and find the bivector that represents the tangent space. Are these vectors lightlike, spacelike, or timelike? Comment on whether this parameterization represents a physically relevant spacetime surface.

Answer

To characterize the vectors, we square them
\begin{equation}\label{eqn:reciprocalblog:1080}
a^2 = b^2 =
\gamma_0^2 +
\gamma_1^2 +
\gamma_2^2 +
\gamma_3^2
=
1 - 3
= -2,
\end{equation}
so \( a, b \) are both spacelike vectors. The tangent space is clearly just \( \mbox{Span}\setlr{ a, b } = \mbox{Span}\setlr{ e, f }\) where
\begin{equation}\label{eqn:reciprocalblog:1100}
\begin{aligned}
e &= \gamma_0 + \gamma_2 \\
f &= \gamma_1 + \gamma_3.
\end{aligned}
\end{equation}
Observe that \( a = e + f, b = e - f \), and \( e \) is lightlike (\( e^2 = 0 \)), whereas \( f \) is spacelike (\( f^2 = -2 \)), and \( e \cdot f = 0 \), so \( e f = - f e \). The bivector for the tangent plane is
\begin{equation}\label{eqn:reciprocalblog:1120}
\gpgradetwo{
a b
}
=
\gpgradetwo{
(e + f) (e - f)
}
=
\gpgradetwo{
e^2 - f^2 - 2 e f
}
= -2 e f,
\end{equation}
where
\begin{equation}\label{eqn:reciprocalblog:1140}
e f = \gamma_{01} + \gamma_{21} + \gamma_{23} + \gamma_{03}.
\end{equation}
Because \( e \) is lightlike (zero square), and \( e f = - f e \),
the bivector \( e f \) squares to zero
\begin{equation}\label{eqn:reciprocalblog:1780}
\lr{ e f }^2
= -e^2 f^2
= 0,
\end{equation}
which shows that the parameterization is degenerate.

This parameterization can also be expressed as
\begin{equation}\label{eqn:reciprocalblog:1160}
x(u,v)
= u ( e + f ) + v ( e – f )
= (u + v) e + (u - v) f,
\end{equation}
a linear combination of a lightlike and a spacelike vector. Intuitively, we expect that a physically meaningful spacetime surface involves linear combinations of spacelike vectors, or combinations of a timelike vector with spacelike vectors. This beastie is something entirely different.
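
The arithmetic in this answer is easy to confirm numerically. The following is a small Python sketch, assuming a \( (+,-,-,-) \) metric and coordinates with respect to \( \gamma_0, \gamma_1, \gamma_2, \gamma_3 \). It relies on the fact that, for a 2-blade, \( \lr{a \wedge b}^2 = \lr{a \cdot b}^2 - a^2 b^2 \). The function names are hypothetical.

```python
# Assumed metric signature (+,-,-,-).
METRIC = (1, -1, -1, -1)


def dot(a, b):
    """Spacetime dot product of two coordinate 4-tuples."""
    return sum(g * p * q for g, p, q in zip(METRIC, a, b))


def wedge_squared(a, b):
    """Scalar value of (a ^ b)^2 = (a . b)^2 - a^2 b^2 for a 2-blade."""
    return dot(a, b) ** 2 - dot(a, a) * dot(b, b)


a = (1, 1, 1, 1)    # gamma_0 + gamma_1 + gamma_2 + gamma_3
b = (1, -1, 1, -1)  # gamma_0 - gamma_1 + gamma_2 - gamma_3
e = (1, 0, 1, 0)    # gamma_0 + gamma_2
f = (0, 1, 0, 1)    # gamma_1 + gamma_3

print(dot(a, a), dot(b, b))             # -2 -2 (both spacelike)
print(dot(e, e), dot(f, f), dot(e, f))  # 0 -2 0 (e lightlike, f spacelike)
print(wedge_squared(a, b))              # 0 (pseudoscalar is not invertible)
```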

Final notes.

There are a few loose ends above. In particular, we haven’t conclusively proven that the set of reciprocal vectors \( \Bx^\mu = \grad u^\mu \) are exactly those obtained through algebraic means. For a full parameterization of spacetime, they are necessarily the same, since both are unique. So we know that \ref{eqn:reciprocalblog:1920} must equal the reciprocals obtained by evaluating the gradient for a full parameterization (and this must also equal the reciprocals that we can obtain through matrix inversion.) We have also not proved explicitly that the three parameter construction of the reciprocals in \ref{eqn:reciprocalblog:1900} is in the tangent space, but that is a fairly trivial observation, so that can get the “left as an exercise for the reader” dismissal. Some additional thought about this is probably required, but it seems reasonable to put that on the back burner and move on to some applications.

References

[1] Peeter Joot. Geometric Algebra for Electrical Engineers. Kindle Direct Publishing, 2019.

More anti-lockdown signage

November 28, 2020 Incoherent ramblings No comments

Not everybody wants to live in fear, or have their jobs, savings and livelihood taken away by zealous authoritarian tyrants that are desperate to appear as if they are doing something:

I haven’t checked out the website above, but like the sentiment.  If you want to be driven to substance abuse, or live life with all privileges doled out to you like a slave, then maybe lockdowns are for you.

Danger due to: Lockdown #2

November 27, 2020 Incoherent ramblings No comments

Somebody appears to have left a blank “Danger due to…” sign at a local construction site.

There are two captions, both apt. The second is likely also a true harbinger of danger, since I’m sure that not enough sleep is a significant source of on the job injury and death.

A minimally configured Windows laptop

November 22, 2020 Windows No comments

I’ve now installed enough that my new Windows machine is minimally functional (LaTeX, Linux, and Mathematica), with enough installed that I can compile any of my latex based books, or standalone content for blog posts.  My list of installed extras includes:

  • Brother HL-2170W (printer driver)
  • Windows Terminal
  • GPL Ghostscript (for MaTeX, latex labels in Mathematica figures.)
  • Wolfram Mathematica
  • Firefox
  • Chrome
  • Visual Studio
  • Python
  • Julia
  • Adobe Acrobat Reader
  • Discord
  • OBS Studio
  • MiKTeX
  • SumatraPDF
  • GVim
  • Git
  • PowerShell (7)
  • Ubuntu
  • Dropbox

Some notes:

  • On Windows, for my LaTeX work, I used to use MiKTeX + cygwin.  The cygwin dependency was for my makefile dependencies (gnu-make+perl).  With this new machine, I tried WSL2.  I’m running my bash shells within the new Windows Terminal, which is far superior to the old cmd.
  • Putty is no longer required.  Windows Terminal does the job very nicely.  It does terminal emulation well enough that I can even ssh into a Linux machine and use screen within my Linux session, and my .screenrc just works.  Very nice.
  • SumatraPDF is for latex reverse tex lookup.  i.e. I can double click on pdf content, and up pops the editor with the latex file.  Last time I used Sumatra, I had to configure it to use GVim (notepad used to be the default I think.)  Now it seems to be the default (to my surprise.)
  • I will probably uninstall Git, as it seems superfluous given all the repos I want to access are cloned within my bash file system.
  • I used to use GVim extensively on Windows, but most of my editing has been in vim in the bash shell.  I expect I’ll now only use it for reverse tex (--synctex) lookup editing.

WSL2 has very impressive integration.  A really nice demo of that was access of synctex lookup.  Here’s a screenshot that shows it in action:

I invoked the windows pdf viewer within a bash shell in the Ubuntu VM, using the following:

 
pjoot@DESKTOP-6J7L1NS:~/project/blogit$ alias pdfview
alias pdfview='/mnt/c/Users/peete/AppData/Local/SumatraPDF/SumatraPDF.exe'
pjoot@DESKTOP-6J7L1NS:~/project/blogit$ pdfview fibonacci.pdf

The Ubuntu filesystem directory has the fibonacci.synctex.gz reverse lookup index that Sumatra is able to read. Note that this file, after unzipping, has only Linux paths (/home/pjoot/…), but Sumatra is able to use those without any trouble, and pops up the (Windows executable) editor on the files after I double click on the file. This sequence is pretty convoluted:

  • Linux bash ->
  • invoke Windows pdf viewer ->
  • that program reading Linux files ->
  • it invokes a windows editor (presumably using the Linux path), and that editor magically knows the path to the Linux file that it has to edit.

Check out the very upper corner of that GVim window, where it shows the \\wsl$\Ubuntu\home\pjoot\project\blogit\fibonacci.tex path.

As well as full Linux access to the Windows filesystem, we have full Windows access to the Linux filesystem.

Not all applications know how to access files with UNC paths (for example, the old crappy cmd.exe cannot), but so far all the ones I have cared about have been able to do so.

Public service announcement graffiti?

November 21, 2020 Incoherent ramblings No comments

On Wednesday night’s Tessa walk, I encountered a very large piece of public service announcement graffiti:

This is on a very wide pedestrian bridge, spanning the DVP in Riverdale.  I assume that bitchute is one of the uncensorable platforms like Dtube (a blockchain based video platform), but I haven’t actually checked it out.  I guess I’m not the only one that is annoyed that the big social media platforms have made it their policy to regulate the set of allowed opinions.  I wouldn’t have expressed myself with graffiti, but I guess that I’m not alone in my distaste for being told what I am allowed to think, or that some information is too dangerous for me to be allowed to look at it.

Deriving the nth Fibonacci number formula.

November 20, 2020 math and physics play No comments

[If mathjax doesn’t display properly for you, click here for a PDF of this post]

My last three posts:

  1. The nth term of a Fibonacci series.
  2. More on that cool Fibonacci formula.
  3. Guessing the nth Fibonacci number formula.

were all about a cool formula for the n-th term of the Fibonacci series.  Here’s the final chapter of the story of that play.  A recap:

Definition 1.1: Fibonacci series.

With \( F_0 = 0 \), and \( F_1 = 1 \), the nth term \( F_n \) in the Fibonacci series is the sum of the previous two terms
\begin{equation*}
F_n = F_{n-2} + F_{n-1}.
\end{equation*}

Theorem 1.1: Nth term of the Fibonacci series.

\begin{equation*}
F_n = \frac{ \lr{ 1 + \sqrt{5} }^n - \lr{ 1 - \sqrt{5} }^n }{ 2^n \sqrt{5} }.
\end{equation*}

Deriving the nth Fibonacci formula.

There was a particularly unsatisfactory aspect of the guess that we made in the last post. In particular, we didn’t have any reason to guess the form of that solution, except for the fact that we already knew the answer. Now we will attempt to attack this in a more systematic fashion, so that each step along the way seems logical. First, we need to put a couple goodies in our toolbox.

Definition 1.1: Discrete sum.

Given a set of discrete values \( \setlr{G_a, G_{a+1}, \cdots, G_n } \) we define a discrete sum of \( n - a + 1 \) of these terms as \( F_n \)
\begin{equation}\label{eqn:fibonacciblog:1540}
F_n = \sum_{k = a}^n G_k + C,
\end{equation}
where \( C \) is an arbitrary boundary value constant.

Definition 1.2: Difference operators.

Define a backwards difference operator \( \Delta \), operating on \( X_n \) as
\begin{equation}\label{eqn:fibonacciblog:1560}
\Delta X_n = X_n - X_{n-1}.
\end{equation}

The difference operator is a discrete analogue of a differential operator. It is also possible to define a (forward) difference operator as \( \Delta X_n = X_{n+1} - X_{n} \), but the choice is arbitrary, and we can find the same results either way.

Lemma 1.1: Antidifference of discrete sum.

Given a sum \( F_n \) of the form \ref{eqn:fibonacciblog:1540}, applying the difference operator yields just the highest index (\( k = n \)) term of the sum
\begin{equation}\label{eqn:fibonacciblog:1580}
\Delta F_n = G_n.
\end{equation}

Start proof:

\begin{equation}\label{eqn:fibonacci:960}
\Delta F_n =
\sum_{k = a}^n G_k + C
- \lr{ \sum_{k = a}^{n-1} G_k + C }
= G_n.
\end{equation}

End proof.

Computing differences is pretty easy. What we want to do is the inverse operation (analogous to integration), where we find a closed form representation of \( F_n \) given a difference equation \( \Delta F_n = G_n \). Just as we can compute antiderivatives for \( x^n \), we may do the same for \( n^k \) antidifferences, but the results are messier. The first few such antidifferences are

Theorem 1.1: Antidifferences for powers of \(n\).

\begin{equation}\label{eqn:fibonacciblog:1600}
\begin{aligned}
1 &= \Delta n \\
n &= \Delta \lr{ \frac{n}{2}\lr{ n + 1} } \\
n^2 &= \Delta \lr{ \frac{n}{6}\lr{2 n + 1}\lr{n + 1} } \\
n^3 &= \Delta \lr{ \frac{n^2}{4}\lr{n + 1}^2 }.
\end{aligned}
\end{equation}

Start proof:

The \( \Delta n \) identity is easily verified
\begin{equation}\label{eqn:fibonacci:1000}
\Delta n = n - (n-1) = 1.
\end{equation}
For higher orders it is a bit tedious to verify directly, but we can iteratively build up those results by evaluating the difference operator on each of the powers of \( n \).
\begin{equation}\label{eqn:fibonacci:660}
\begin{aligned}
\Delta n^2
&= n^2 - (n-1)^2 \\
&= n^2 - (n^2 - 2 n + 1) \\
&= 2 n - 1 \\
&= 2 n - \Delta n.
\end{aligned}
\end{equation}
Because the difference operator is linear, we can rearrange to find
\begin{equation}\label{eqn:fibonacci:1020}
\Delta \lr{ n^2 + n } = 2 n.
\end{equation}
Dividing through by \( 2 \) and factoring out an \( n \), recovers the desired result.

For the next power, we have
\begin{equation}\label{eqn:fibonacci:680}
\begin{aligned}
\Delta n^3
&= n^3 - (n-1)^3 \\
&= n^3 - (n^3 - 3 n^2 + 3 n - 1) \\
&= 3 n^2 - 3 n + 1 \\
&= 3 n^2 - 3 \Delta \frac{n}{2}\lr{ n + 1 } + \Delta n,
\end{aligned}
\end{equation}
or
\begin{equation}\label{eqn:fibonacci:1040}
\begin{aligned}
3 n^2
&=
\Delta n \lr{ n^2 + \frac{3}{2}\lr{ n + 1} - 1 } \\
&=
\Delta \frac{n}{2} \lr{ 2 n^2 + 3\lr{ n + 1} - 2} \\
&=
\Delta \frac{n}{2} \lr{ 2 n^2 + 3 n + 1 } \\
&=
\Delta \frac{n}{2} \lr{ 2 n + 1}\lr{ n + 1 }.
\end{aligned}
\end{equation}
Dividing through by \( 3 \) recovers the desired result.

The final result is left to the reader. It can be derived or verified easily with a couple lines of Mathematica code.

End proof.
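
The post suggests Mathematica for the verification. As an alternative, here is a minimal Python sketch that checks all four antidifference identities of \ref{eqn:fibonacciblog:1600} with exact rational arithmetic over a range of integers. The names `backward_difference` and `ANTIDIFFERENCES` are hypothetical, and a finite range check is of course a spot check, not a proof.

```python
from fractions import Fraction


def backward_difference(f, n):
    """Backward difference: (Delta f)(n) = f(n) - f(n - 1)."""
    return f(n) - f(n - 1)


# Candidate antidifferences of n^k for k = 0, 1, 2, 3.
ANTIDIFFERENCES = {
    0: lambda n: Fraction(n),
    1: lambda n: Fraction(n * (n + 1), 2),
    2: lambda n: Fraction(n * (2 * n + 1) * (n + 1), 6),
    3: lambda n: Fraction(n * n * (n + 1) ** 2, 4),
}


def identities_hold(n_values=range(-10, 51)):
    """True if Delta(antidifference_k)(n) == n^k at every sampled n."""
    return all(
        backward_difference(ANTIDIFFERENCES[k], n) == n ** k
        for k in ANTIDIFFERENCES
        for n in n_values
    )


print(identities_hold())  # True
```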

Problem: Sum some series.

Find the sums \( \sum_{k = 1}^n k^m \), for \( m = 1, 2, 3 \).

Answer

  • \( m = 1 \). This is the (probably apocryphal) sum of Gauss’s grade school classroom:
    \begin{equation}\label{eqn:fibonacci:1060}
    F_n = \sum_{k = 1}^n k = 1 + 2 + \cdots + n,
    \end{equation}
    satisfying
    \begin{equation}\label{eqn:fibonacci:1080}
    \begin{aligned}
    \Delta F_n
    &= F_n - F_{n-1} \\
    &=
    (n + (n-1) + \cdots + 1)
    -
    ((n-1) + \cdots + 1) \\
    &= n \\
    &= \Delta \frac{n}{2}(n + 1).
    \end{aligned}
    \end{equation}
    We must have
    \begin{equation}\label{eqn:fibonacci:1100}
    F_n = \frac{n}{2}\lr{ n + 1} + C.
    \end{equation}
    To fix \( C \) consider \( F_1 \)
    \begin{equation}\label{eqn:fibonacci:1180}
    F_1 = \inv{2}(1 + 1) + C = 1,
    \end{equation}
    so \( C = 0 \), and we find Gauss’s summation formula
    \begin{equation}\label{eqn:fibonacci:1200}
    \sum_{k = 1}^n k = \frac{n}{2}\lr{ n + 1},
    \end{equation}
    as expected.
  • \( m = 2 \). Now let’s do the sum of squares
    \begin{equation}\label{eqn:fibonacci:1120}
    F_n = \sum_{k = 1}^n k^2,
    \end{equation}
    for which we have
    \begin{equation}\label{eqn:fibonacci:1140}
    \Delta F_n = n^2 = \Delta \frac{n}{6}( 2 n + 1 )(n+1),
    \end{equation}
    so
    \begin{equation}\label{eqn:fibonacci:1160}
    F_n = \frac{n}{6}( 2 n + 1 )(n+1) + C.
    \end{equation}
    Clearly \( C = 0 \) satisfies the boundary condition, leaving
    \begin{equation}\label{eqn:fibonacci:1220}
    \sum_{k = 1}^n k^2 =
    \frac{n}{6}( 2 n + 1 )(n+1).
    \end{equation}
  • \( m = 3 \). We see the pattern, so for the sum of cubes, we can just write down the answer
    \begin{equation}\label{eqn:fibonacci:1240}
    \sum_{k = 1}^n k^3 =
    \frac{n^2}{4}\lr{n + 1}^2
    .
    \end{equation}
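
The boundary value step can be made explicit in code. This Python sketch (hypothetical names, exact arithmetic) fixes \( C \) from the \( n = 1 \) boundary condition for each of \( m = 1, 2, 3 \), and then compares the closed form against a brute force sum.

```python
from fractions import Fraction

# Antidifferences of n^m from the theorem, for m = 1, 2, 3.
CLOSED_FORMS = {
    1: lambda n: Fraction(n * (n + 1), 2),
    2: lambda n: Fraction(n * (2 * n + 1) * (n + 1), 6),
    3: lambda n: Fraction(n * n * (n + 1) ** 2, 4),
}


def boundary_constant(m):
    """Solve F_1 = closed_form(1) + C for C, where F_1 = 1^m = 1."""
    return 1 - CLOSED_FORMS[m](1)


def brute_force_sum(n, m):
    """Direct evaluation of sum_{k=1}^n k^m."""
    return sum(k ** m for k in range(1, n + 1))


for m in (1, 2, 3):
    C = boundary_constant(m)
    matches = all(
        brute_force_sum(n, m) == CLOSED_FORMS[m](n) + C for n in range(1, 30)
    )
    print(m, C, matches)  # prints C = 0 and True for each m
```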

Now that we have some basic comfort with the ideas of difference equations, and their solutions,
let’s get back to the Fibonacci problem. In that case, we have
\begin{equation}\label{eqn:fibonacci:1260}
F_n = F_{n-1} + F_{n-2}.
\end{equation}
Stated as a difference equation, this is
\begin{equation}\label{eqn:fibonacci:1280}
\Delta F_n = F_{n-2}.
\end{equation}
Before tackling the Fibonacci problem, let’s try one that is slightly simpler.

Problem: A simpler problem.

Solve \( \Delta F_n = F_{n-1} \) for \( n > 1 \), where \( F_1 = 1 \).

Answer

The problem to solve is just
\begin{equation}\label{eqn:fibonacci:1300}
F_n = 2 F_{n-1}.
\end{equation}
This sequence is \( \setlr{ 1, 2, 4, 8, \cdots } \), so we can solve it by inspection, and the answer is just \( F_n = 2^{n-1} \). We want inspiration for the Fibonacci problem, so let’s pretend that we can’t see the answer, but that we can guess something close, and see if it works. Namely, let’s guess:
\begin{equation}\label{eqn:fibonacci:1320}
F_n = \alpha a^n + C.
\end{equation}
If we plug this trial solution into our difference equation, we get
\begin{equation}\label{eqn:fibonacci:1340}
\begin{aligned}
\alpha a^{n-1} + C
&=
\Delta F_n \\
&= \alpha \lr{ a^n - a^{n-1} } \\
&= \alpha a^{n-1} \lr{ a - 1 }.
\end{aligned}
\end{equation}
This can be satisfied by setting \( C = 0 \) and \( a - 1 = 1 \), or \( a = 2 \), as we already knew. To fix the constant \( \alpha \) we utilize our boundary constraints, namely
\begin{equation}\label{eqn:fibonacci:1400}
F_1 = 1 = 2 \alpha,
\end{equation}
so \(\alpha = 1/2 \).

Compared to just seeing the answer, the procedure above was a lot of work. However, a side effect of this work is discovery of a guessing strategy that is somewhat like using \( f(t) = e^{s t} \) to generate a characteristic equation when solving a differential equation. For a difference equation of this form, it appears we can substitute \( F_n = \alpha a^n + C \) and use the differences to determine the values of \( \alpha, a, C \). Now let’s try this with the Fibonacci difference equation.

Problem: Find a solution to the Fibonacci difference equation.

Without worrying about boundary constraints, find the solutions to \( \Delta F_n = F_{n-2} \), using a trial solution of \( F_n = \alpha a^n \).

Answer

Inserting our trial solution, we have
\begin{equation}\label{eqn:fibonacci:1420}
\begin{aligned}
\alpha a^n
&=
F_n \\
&= F_{n-1} + F_{n-2} \\
&= \alpha \lr{ a^{n-1} + a^{n-2} } \\
&= \alpha a^{n-2} \lr{ a + 1 },
\end{aligned}
\end{equation}
so our “characteristic equation” is
\begin{equation}\label{eqn:fibonacci:1440}
a + 1 = a^2.
\end{equation}
Completing the square yields
\begin{equation}\label{eqn:fibonacci:1460}
\lr{ a - \inv{2} }^2 = 1 + \inv{4},
\end{equation}
or
\begin{equation}\label{eqn:fibonacci:1480}
a = \inv{2} \pm \frac{\sqrt{5}}{2}.
\end{equation}
Bamn. There’s our golden ratio, and its buddy!
We find that
\begin{equation}\label{eqn:fibonacci:1500}
F_n = \alpha \lr{\frac{1 \pm \sqrt{5}}{2} }^n,
\end{equation}
are solutions to the difference equation \ref{eqn:fibonacci:1280}.

Since we have a second order difference equation, we need a superposition of both solutions to try to satisfy the boundary conditions. In particular, we want to find the constants in
\begin{equation}\label{eqn:fibonacci:1520}
F_n =
\alpha_{+} \lr{\frac{1 + \sqrt{5}}{2} }^n
+
\alpha_{-} \lr{\frac{1 - \sqrt{5}}{2} }^n + C.
\end{equation}
However, we already did this when we used \( F_n = \alpha a^n + \beta b^n \) as a trial solution. When we did that, it was just to see if we could find the end result, knowing only the structure of the solution, but none of the specific constants. Now we have justified why that was a reasonable trial solution, since exactly this structure follows naturally from the difference equation itself.
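
To see the superposition at work, here is a short Python sketch that solves for \( \alpha_{+}, \alpha_{-} \) using the boundary values \( F_0 = 0, F_1 = 1 \) from the definition, and compares the result against the recurrence. It assumes \( C = 0 \) (a non-zero constant cannot satisfy \( F_n = F_{n-1} + F_{n-2} \)), uses floating point arithmetic, so the comparison rounds to the nearest integer, and all function names are hypothetical.

```python
import math


def fibonacci_by_recurrence(count):
    """First `count` Fibonacci numbers from F_n = F_{n-1} + F_{n-2}."""
    values = [0, 1]
    while len(values) < count:
        values.append(values[-1] + values[-2])
    return values[:count]


def fibonacci_closed_form(n):
    """Superposition alpha_+ a_+^n + alpha_- a_-^n, assuming C = 0."""
    sqrt5 = math.sqrt(5.0)
    a_plus = (1.0 + sqrt5) / 2.0   # roots of a^2 = a + 1
    a_minus = (1.0 - sqrt5) / 2.0
    # F_0 = 0 gives alpha_+ + alpha_- = 0.
    # F_1 = 1 gives alpha_+ (a_+ - a_-) = 1.
    alpha_plus = 1.0 / (a_plus - a_minus)
    alpha_minus = -alpha_plus
    return alpha_plus * a_plus ** n + alpha_minus * a_minus ** n


expected = fibonacci_by_recurrence(40)
print(all(round(fibonacci_closed_form(n)) == expected[n] for n in range(40)))
```

Since \( a_{+} - a_{-} = \sqrt{5} \), these constants are \( \alpha_{\pm} = \pm 1/\sqrt{5} \), which is consistent with the formula quoted in the recap at the top of the post.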

This train of thought makes me want to dig out my little Dover book on difference equations [1] that I’ve had since I was a kid. I think I only worked through the first chapter of that book. I have a lot of little sad neglected Dover books on mathematics and physics that I bought super cheap at the World’s Biggest Bookstore when I was back in school. It will be interesting to see how to tackle problems such as this, in a still more systematic fashion.

References

[1] Hyman Levy and Freda Lessman. Finite difference equations. Courier Corporation, 1992.
