
Simplifying the previous adjoint matrix results.

January 17, 2024 math and physics play

[Click here for a PDF version of this (and the previous) post]

We previously found determinant expressions for the matrix elements of the adjoint for 2D and 3D matrices \( M \). However, we can extract additional structure from each of those results.

2D case.

Given a matrix expressed in block matrix form in terms of its columns
\begin{equation}\label{eqn:adjoint:500}
M =
\begin{bmatrix}
\Bm_1 & \Bm_2
\end{bmatrix},
\end{equation}
we found that the adjoint \( A \) satisfying \( M A = \Abs{M} I \) had the structure
\begin{equation}\label{eqn:adjoint:520}
A =
\begin{bmatrix}
\begin{vmatrix} \Be_1 & \Bm_2 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_2 \end{vmatrix} \\
& \\
\begin{vmatrix} \Bm_1 & \Be_1 \end{vmatrix} & \begin{vmatrix} \Bm_1 & \Be_2 \end{vmatrix}
\end{bmatrix}.
\end{equation}
We initially had wedge product expressions for each of these matrix elements, and can discover additional structure by putting back those wedge products. Modulo sign, each of these matrix elements has the form
\begin{equation}\label{eqn:adjoint:540}
\begin{aligned}
\begin{vmatrix} \Be_i & \Bm_j \end{vmatrix}
&=
\lr{ \Be_i \wedge \Bm_j } i^{-1} \\
&=
\gpgradezero{
\lr{ \Be_i \wedge \Bm_j } i^{-1}
} \\
&=
\gpgradezero{
\lr{ \Be_i \Bm_j - \Be_i \cdot \Bm_j } i^{-1}
} \\
&=
\gpgradezero{
\Be_i \Bm_j i^{-1}
} \\
&=
\Be_i \cdot \lr{ \Bm_j i^{-1} },
\end{aligned}
\end{equation}
where \( i = \Be_{12} \). The adjoint matrix is
\begin{equation}\label{eqn:adjoint:560}
A =
\begin{bmatrix}
-\lr{ \Bm_2 i } \cdot \Be_1 & -\lr{ \Bm_2 i } \cdot \Be_2 \\
\lr{ \Bm_1 i } \cdot \Be_1 & \lr{ \Bm_1 i } \cdot \Be_2 \\
\end{bmatrix}.
\end{equation}
If we use a column vector representation of the vectors \( \Bm_j i \), we can write the adjoint in a compact hybrid geometric-algebra matrix form
\begin{equation}\label{eqn:adjoint:640}
A =
\begin{bmatrix}
-\lr{ \Bm_2 i }^\T \\
\lr{ \Bm_1 i }^\T
\end{bmatrix}.
\end{equation}

Check:

Let’s see if this works, by multiplying with \( M \)
\begin{equation}\label{eqn:adjoint:580}
\begin{aligned}
A M &=
\begin{bmatrix}
-\lr{ \Bm_2 i }^\T \\
\lr{ \Bm_1 i }^\T
\end{bmatrix}
\begin{bmatrix}
\Bm_1 & \Bm_2
\end{bmatrix} \\
&=
\begin{bmatrix}
-\lr{ \Bm_2 i }^\T \Bm_1 & -\lr{ \Bm_2 i }^\T \Bm_2 \\
\lr{ \Bm_1 i }^\T \Bm_1 & \lr{ \Bm_1 i }^\T \Bm_2
\end{bmatrix}.
\end{aligned}
\end{equation}
Those dot products have the form
\begin{equation}\label{eqn:adjoint:600}
\begin{aligned}
\lr{ \Bm_j i }^\T \Bm_i
&=
\lr{ \Bm_j i } \cdot \Bm_i \\
&=
\gpgradezero{ \lr{ \Bm_j i } \Bm_i } \\
&=
\gpgradezero{ -i \Bm_j \Bm_i } \\
&=
-i \lr{ \Bm_j \wedge \Bm_i },
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:adjoint:620}
\begin{aligned}
A M &=
\begin{bmatrix}
i \lr{ \Bm_2 \wedge \Bm_1 } & 0 \\
0 & -i \lr { \Bm_1 \wedge \Bm_2 }
\end{bmatrix} \\
&=
\Abs{M} I.
\end{aligned}
\end{equation}
We find the determinant weighted identity that we expected. Our method switches fluidly between matrix and geometric algebra representations, but provided we are careful enough, this isn’t problematic.
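As an additional sanity check of \ref{eqn:adjoint:640}, note that right multiplication by \( i \) just rotates a vector a quarter turn, \( \lr{ x \Be_1 + y \Be_2 } i = -y \Be_1 + x \Be_2 \), so the rows of the adjoint are trivial to construct. Here’s a minimal C++ sketch of that check (all the names here are mine, purely illustrative):

#include <array>
#include <cstdio>

using Vec2 = std::array<double, 2>;

// Right multiplication by the pseudoscalar i = e_{12}:
// (x e_1 + y e_2) i = -y e_1 + x e_2.
Vec2 timesI(const Vec2& v) { return { -v[1], v[0] }; }
Vec2 negate(const Vec2& v) { return { -v[0], -v[1] }; }
double dot(const Vec2& a, const Vec2& b) { return a[0]*b[0] + a[1]*b[1]; }

int main() {
    // Columns of M.
    Vec2 m1{ 2, 1 };
    Vec2 m2{ 1, 3 };

    // Rows of the adjoint: -(m2 i), (m1 i).
    Vec2 row1 = negate(timesI(m2));
    Vec2 row2 = timesI(m1);

    // A M should be det(M) I, with det(M) = 2*3 - 1*1 = 5 here.
    std::printf("%g %g\n", dot(row1, m1), dot(row1, m2));
    std::printf("%g %g\n", dot(row2, m1), dot(row2, m2));
}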

3D case.

Now, let’s look at the 3D case, where we assume a column vector representation of the matrix of interest
\begin{equation}\label{eqn:adjoint:660}
M =
\begin{bmatrix}
\Bm_1 & \Bm_2 & \Bm_3
\end{bmatrix},
\end{equation}
and try to simplify the expression we found for the adjoint
\begin{equation}\label{eqn:adjoint:680}
A =
\begin{bmatrix}
\begin{vmatrix} \Be_1 & \Bm_2 & \Bm_3 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_2 & \Bm_3 \end{vmatrix} & \begin{vmatrix} \Be_3 & \Bm_2 & \Bm_3 \end{vmatrix} \\
& & \\
\begin{vmatrix} \Be_1 & \Bm_3 & \Bm_1 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_3 & \Bm_1 \end{vmatrix} & \begin{vmatrix} \Be_3 & \Bm_3 & \Bm_1 \end{vmatrix} \\
& & \\
\begin{vmatrix} \Be_1 & \Bm_1 & \Bm_2 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_1 & \Bm_2 \end{vmatrix} & \begin{vmatrix} \Be_3 & \Bm_1 & \Bm_2 \end{vmatrix}
\end{bmatrix}.
\end{equation}
As with the 2D case, let’s re-express these determinants in wedge product form. We’ll write \( I = \Be_{123} \), and find
\begin{equation}\label{eqn:adjoint:700}
\begin{aligned}
\begin{vmatrix} \Be_i & \Bm_j & \Bm_k \end{vmatrix}
&=
\lr{ \Be_i \wedge \Bm_j \wedge \Bm_k } I^{-1} \\
&=
\gpgradezero{ \lr{ \Be_i \wedge \Bm_j \wedge \Bm_k } I^{-1} } \\
&=
\gpgradezero{ \lr{
\Be_i \lr{ \Bm_j \wedge \Bm_k }
- \Be_i \cdot \lr{ \Bm_j \wedge \Bm_k }
} I^{-1} } \\
&=
\gpgradezero{
\Be_i \lr{ \Bm_j \wedge \Bm_k }
I^{-1} } \\
&=
\gpgradezero{
\Be_i \lr{ \Bm_j \cross \Bm_k } I
I^{-1} } \\
&=
\Be_i \cdot \lr{ \Bm_j \cross \Bm_k }.
\end{aligned}
\end{equation}
We see that we can put the adjoint in block matrix form
\begin{equation}\label{eqn:adjoint:720}
A =
\begin{bmatrix}
\lr{ \Bm_2 \cross \Bm_3 }^\T \\
\lr{ \Bm_3 \cross \Bm_1 }^\T \\
\lr{ \Bm_1 \cross \Bm_2 }^\T \\
\end{bmatrix}.
\end{equation}

Check:

\begin{equation}\label{eqn:adjoint:740}
\begin{aligned}
A M
&=
\begin{bmatrix}
\lr{ \Bm_2 \cross \Bm_3 }^\T \\
\lr{ \Bm_3 \cross \Bm_1 }^\T \\
\lr{ \Bm_1 \cross \Bm_2 }^\T \\
\end{bmatrix}
\begin{bmatrix}
\Bm_1 & \Bm_2 & \Bm_3
\end{bmatrix} \\
&=
\begin{bmatrix}
\lr{ \Bm_2 \cross \Bm_3 }^\T \Bm_1 & \lr{ \Bm_2 \cross \Bm_3 }^\T \Bm_2 & \lr{ \Bm_2 \cross \Bm_3 }^\T \Bm_3 \\
\lr{ \Bm_3 \cross \Bm_1 }^\T \Bm_1 & \lr{ \Bm_3 \cross \Bm_1 }^\T \Bm_2 & \lr{ \Bm_3 \cross \Bm_1 }^\T \Bm_3 \\
\lr{ \Bm_1 \cross \Bm_2 }^\T \Bm_1 & \lr{ \Bm_1 \cross \Bm_2 }^\T \Bm_2 & \lr{ \Bm_1 \cross \Bm_2 }^\T \Bm_3
\end{bmatrix} \\
&=
\Abs{M} I.
\end{aligned}
\end{equation}

The diagonal elements of this product are scalar triple products that each equal \( \Abs{M} \), and the off diagonal elements all vanish, being triple products with a repeated vector. Essentially, we found that the rows of the adjoint matrix are each parallel to the reciprocal frame vectors of the columns of \( M \). This makes sense, as the reciprocal frame encodes a generalized inverse of sorts.
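The cross product form \ref{eqn:adjoint:720} also makes for a direct implementation: one cross product per adjoint row. Here’s a little C++ sketch along those lines (names are mine, not from any library), checking \( A M = \Abs{M} I \) numerically:

#include <array>
#include <cstdio>

using Vec3 = std::array<double, 3>;

Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a[1]*b[2] - a[2]*b[1],
             a[2]*b[0] - a[0]*b[2],
             a[0]*b[1] - a[1]*b[0] };
}

double dot(const Vec3& a, const Vec3& b) {
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
}

int main() {
    // Columns of M.
    Vec3 m1{ 2, 1, 0 }, m2{ 1, 3, 1 }, m3{ 0, 1, 4 };
    std::array<Vec3, 3> m{ m1, m2, m3 };

    // Rows of the adjoint are cross products of the other two columns (cyclically).
    std::array<Vec3, 3> a{ cross(m2, m3), cross(m3, m1), cross(m1, m2) };

    // (A M)_{ij} = a_i . m_j: expect det(M) on the diagonal, zero off diagonal.
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j)
            std::printf("%6.1f ", dot(a[i], m[j]));
        std::printf("\n");
    }
}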

Computing the adjoint matrix

January 16, 2024 math and physics play

[Click here for a PDF version of this post]

I started reviewing a book draft that mentions the adjoint in passing, but I’ve forgotten what I knew about the adjoint (not counting self-adjoint operators, which are a different beast.) I do recall that adjoint matrices were covered in high school linear algebra (now 30+ years ago!), but I never really used them after that.

It appears that the basic property of the adjoint \( A \) of a matrix \( M \), when it exists, is
\begin{equation}\label{eqn:adjoint:20}
M A = \Abs{M} I,
\end{equation}
so it’s proportional to the inverse, where the numerical factor is the determinant of that matrix. Let’s try to compute this beastie for 1D, 2D, and 3D cases.

Simplest case: \(1 \times 1\) matrix.

For a one by one matrix, say
\begin{equation}\label{eqn:adjoint:40}
M =
\begin{bmatrix}
m_{11}
\end{bmatrix},
\end{equation}
the determinant is just \( \Abs{M} = m_{11} \), so our adjoint is the identity matrix
\begin{equation}\label{eqn:adjoint:60}
A =
\begin{bmatrix}
1
\end{bmatrix}.
\end{equation}
Not too interesting. Let’s try the 2D case.

Less trivial case: \(2 \times 2\) matrix.

For the 2D case, let’s define our matrix as a pair of column vectors
\begin{equation}\label{eqn:adjoint:80}
M =
\begin{bmatrix}
\Bm_1 & \Bm_2
\end{bmatrix},
\end{equation}
and let’s write the adjoint out in full in coordinates as
\begin{equation}\label{eqn:adjoint:100}
A =
\begin{bmatrix}
a_{11} & a_{12} \\
a_{21} & a_{22}
\end{bmatrix}.
\end{equation}
Expanding out the columns of \( M A = \Abs{M} I \), we seek solutions to a pair of vector equations
\begin{equation}\label{eqn:adjoint:120}
\begin{aligned}
\Bm_1 a_{11} + \Bm_2 a_{21} &= \Abs{M} \Be_1 \\
\Bm_1 a_{12} + \Bm_2 a_{22} &= \Abs{M} \Be_2.
\end{aligned}
\end{equation}
We can immediately solve either of these, by taking wedge products, yielding
\begin{equation}\label{eqn:adjoint:140}
\begin{aligned}
\lr{ \Bm_1 \wedge \Bm_2 } a_{11} + \lr{ \Bm_2 \wedge \Bm_2 } a_{21} &= \Abs{M} \lr{ \Be_1 \wedge \Bm_2 } \\
\lr{ \Bm_1 \wedge \Bm_1 } a_{11} + \lr{ \Bm_1 \wedge \Bm_2 } a_{21} &= \Abs{M} \lr{ \Bm_1 \wedge \Be_1 } \\
\lr{ \Bm_1 \wedge \Bm_2 } a_{12} + \lr{ \Bm_2 \wedge \Bm_2 } a_{22} &= \Abs{M} \lr{ \Be_2 \wedge \Bm_2 } \\
\lr{ \Bm_1 \wedge \Bm_1 } a_{12} + \lr{ \Bm_1 \wedge \Bm_2 } a_{22} &= \Abs{M} \lr{ \Bm_1 \wedge \Be_2}.
\end{aligned}
\end{equation}
Any wedge with a repeated vector is zero.
Provided the determinant is non-zero, we can divide both sides by \( \Bm_1 \wedge \Bm_2 = \Abs{M} \Be_{12} \) to find a single determinant for each element in the adjoint
\begin{equation}\label{eqn:adjoint:160}
\begin{aligned}
a_{11} &= \begin{vmatrix} \Be_1 & \Bm_2 \end{vmatrix} \\
a_{21} &= \begin{vmatrix} \Bm_1 & \Be_1 \end{vmatrix} \\
a_{12} &= \begin{vmatrix} \Be_2 & \Bm_2 \end{vmatrix} \\
a_{22} &= \begin{vmatrix} \Bm_1 & \Be_2 \end{vmatrix}
\end{aligned}
\end{equation}
or
\begin{equation}\label{eqn:adjoint:400}
A =
\begin{bmatrix}
\begin{vmatrix} \Be_1 & \Bm_2 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_2 \end{vmatrix} \\
& \\
\begin{vmatrix} \Bm_1 & \Be_1 \end{vmatrix} & \begin{vmatrix} \Bm_1 & \Be_2 \end{vmatrix}
\end{bmatrix},
\end{equation}
or
\begin{equation}\label{eqn:adjoint:440}
A_{ij} =
\epsilon_{ir}
\begin{vmatrix}
\Be_j & \Bm_r
\end{vmatrix},
\end{equation}
where \( \epsilon_{ir} \) is the completely antisymmetric tensor, and the Einstein summation convention is in effect (summation implied over any repeated indexes.)
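For example, for the \( i = 2, j = 1 \) element, only the \( r = 1 \) term survives
\begin{equation}
A_{21} = \epsilon_{21} \begin{vmatrix} \Be_1 & \Bm_1 \end{vmatrix} = -\begin{vmatrix} \Be_1 & \Bm_1 \end{vmatrix} = \begin{vmatrix} \Bm_1 & \Be_1 \end{vmatrix},
\end{equation}
recovering \( a_{21} \) of \ref{eqn:adjoint:160}.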

Check:

We should verify that expanding these determinants explicitly reproduces the usual representation of the 2D adjoint:
\begin{equation}\label{eqn:adjoint:420}
\begin{aligned}
\begin{vmatrix} \Be_1 & \Bm_2 \end{vmatrix} &= \begin{vmatrix} 1 & m_{12} \\ 0 & m_{22} \end{vmatrix} = m_{22} \\
\begin{vmatrix} \Bm_1 & \Be_1 \end{vmatrix} &= \begin{vmatrix} m_{11} & 1 \\ m_{21} & 0 \end{vmatrix} = -m_{21} \\
\begin{vmatrix} \Be_2 & \Bm_2 \end{vmatrix} &= \begin{vmatrix} 0 & m_{12} \\ 1 & m_{22} \end{vmatrix} = -m_{12} \\
\begin{vmatrix} \Bm_1 & \Be_2 \end{vmatrix} &= \begin{vmatrix} m_{11} & 0 \\ m_{21} & 1 \end{vmatrix} = m_{11},
\end{aligned}
\end{equation}
or
\begin{equation}\label{eqn:adjoint:180}
A =
\begin{bmatrix}
m_{22} & -m_{12} \\
-m_{21} & m_{11}
\end{bmatrix}.
\end{equation}

Multiplying everything out should give us the determinant weighted identity
\begin{equation}\label{eqn:adjoint:200}
\begin{aligned}
M A
&=
\begin{bmatrix}
m_{11} & m_{12} \\
m_{21} & m_{22}
\end{bmatrix}
\begin{bmatrix}
m_{22} & -m_{12} \\
-m_{21} & m_{11}
\end{bmatrix} \\
&=
\lr{ m_{11} m_{22} - m_{12} m_{21} }
\begin{bmatrix}
1 & 0 \\
0 & 1
\end{bmatrix} \\
&= \Abs{M} I,
\end{aligned}
\end{equation}
as expected.

3D case: \(3 \times 3\) matrix.

For the 3D case, let’s also define our matrix as column vectors
\begin{equation}\label{eqn:adjoint:220}
M =
\begin{bmatrix}
\Bm_1 & \Bm_2 & \Bm_3
\end{bmatrix},
\end{equation}
and let’s write the adjoint out in full in coordinates as
\begin{equation}\label{eqn:adjoint:240}
A =
\begin{bmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{bmatrix}.
\end{equation}
This time, we seek solutions to three vector equations
\begin{equation}\label{eqn:adjoint:260}
\begin{aligned}
\Bm_1 a_{11} + \Bm_2 a_{21} + \Bm_3 a_{31} &= \Abs{M} \Be_1 \\
\Bm_1 a_{12} + \Bm_2 a_{22} + \Bm_3 a_{32} &= \Abs{M} \Be_2 \\
\Bm_1 a_{13} + \Bm_2 a_{23} + \Bm_3 a_{33} &= \Abs{M} \Be_3,
\end{aligned}
\end{equation}
and can immediately solve, once again, by taking wedge products, yielding
\begin{equation}\label{eqn:adjoint:280}
\begin{aligned}
\lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_3 }a_{11} + \lr{ \Bm_2 \wedge \Bm_2 \wedge \Bm_3 }a_{21} + \lr{ \Bm_3 \wedge \Bm_2 \wedge \Bm_3 }a_{31} &= \Abs{M} \Be_1 \wedge \Bm_2 \wedge \Bm_3 \\
\lr{ \Bm_1 \wedge \Bm_1 \wedge \Bm_3 }a_{11} + \lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_3 }a_{21} + \lr{ \Bm_1 \wedge \Bm_3 \wedge \Bm_3 }a_{31} &= \Abs{M} \Bm_1 \wedge \Be_1 \wedge \Bm_3 \\
\lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_1 }a_{11} + \lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_2 }a_{21} + \lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_3 }a_{31} &= \Abs{M} \Bm_1 \wedge \Bm_2 \wedge \Be_1 \\
\lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_3 }a_{12} + \lr{ \Bm_2 \wedge \Bm_2 \wedge \Bm_3 }a_{22} + \lr{ \Bm_3 \wedge \Bm_2 \wedge \Bm_3 }a_{32} &= \Abs{M} \Be_2 \wedge \Bm_2 \wedge \Bm_3 \\
\lr{ \Bm_1 \wedge \Bm_1 \wedge \Bm_3 }a_{12} + \lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_3 }a_{22} + \lr{ \Bm_1 \wedge \Bm_3 \wedge \Bm_3 }a_{32} &= \Abs{M} \Bm_1 \wedge \Be_2 \wedge \Bm_3 \\
\lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_1 }a_{12} + \lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_2 }a_{22} + \lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_3 }a_{32} &= \Abs{M} \Bm_1 \wedge \Bm_2 \wedge \Be_2 \\
\lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_3 }a_{13} + \lr{ \Bm_2 \wedge \Bm_2 \wedge \Bm_3 }a_{23} + \lr{ \Bm_3 \wedge \Bm_2 \wedge \Bm_3 }a_{33} &= \Abs{M} \Be_3 \wedge \Bm_2 \wedge \Bm_3 \\
\lr{ \Bm_1 \wedge \Bm_1 \wedge \Bm_3 }a_{13} + \lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_3 }a_{23} + \lr{ \Bm_1 \wedge \Bm_3 \wedge \Bm_3 }a_{33} &= \Abs{M} \Bm_1 \wedge \Be_3 \wedge \Bm_3 \\
\lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_1 }a_{13} + \lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_2 }a_{23} + \lr{ \Bm_1 \wedge \Bm_2 \wedge \Bm_3 }a_{33} &= \Abs{M} \Bm_1 \wedge \Bm_2 \wedge \Be_3.
\end{aligned}
\end{equation}
Any wedge with a repeated vector is zero.
Like before, provided the determinant is non-zero, we can divide both sides by \( \Bm_1 \wedge \Bm_2 \wedge \Bm_3 = \Abs{M} \Be_{123} \) to find a single determinant for each element in the adjoint
\begin{equation}\label{eqn:adjoint:360}
\begin{aligned}
A &=
\begin{bmatrix}
\begin{vmatrix} \Be_1 & \Bm_2 & \Bm_3 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_2 & \Bm_3 \end{vmatrix} & \begin{vmatrix} \Be_3 & \Bm_2 & \Bm_3 \end{vmatrix} \\
& & \\
\begin{vmatrix} \Bm_1 & \Be_1 & \Bm_3 \end{vmatrix} & \begin{vmatrix} \Bm_1 & \Be_2 & \Bm_3 \end{vmatrix} & \begin{vmatrix} \Bm_1 & \Be_3 & \Bm_3 \end{vmatrix} \\
& & \\
\begin{vmatrix} \Bm_1 & \Bm_2 & \Be_1 \end{vmatrix} & \begin{vmatrix} \Bm_1 & \Bm_2 & \Be_2 \end{vmatrix} & \begin{vmatrix} \Bm_1 & \Bm_2 & \Be_3 \end{vmatrix}
\end{bmatrix} \\
&=
\begin{bmatrix}
\begin{vmatrix} \Be_1 & \Bm_2 & \Bm_3 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_2 & \Bm_3 \end{vmatrix} & \begin{vmatrix} \Be_3 & \Bm_2 & \Bm_3 \end{vmatrix} \\
& & \\
\begin{vmatrix} \Be_1 & \Bm_3 & \Bm_1 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_3 & \Bm_1 \end{vmatrix} & \begin{vmatrix} \Be_3 & \Bm_3 & \Bm_1 \end{vmatrix} \\
& & \\
\begin{vmatrix} \Be_1 & \Bm_1 & \Bm_2 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_1 & \Bm_2 \end{vmatrix} & \begin{vmatrix} \Be_3 & \Bm_1 & \Bm_2 \end{vmatrix}
\end{bmatrix},
\end{aligned}
\end{equation}
or
\begin{equation}\label{eqn:adjoint:380}
A_{ij} = \frac{\epsilon_{irs}}{2!} \begin{vmatrix} \Be_j & \Bm_r & \Bm_s \end{vmatrix}.
\end{equation}
Observe that the inclusion of the \( \Be_j \) column vector in this determinant means that we really need only compute a \( 2 \times 2 \) determinant for each adjoint matrix element. That is

\begin{equation}\label{eqn:adjoint:480}
A_{ij} = \frac{(-1)^j \epsilon_{irs}\epsilon_{jab}}{(2!)^2}
\begin{vmatrix}
m_{ar} & m_{as} \\
m_{br} & m_{bs}
\end{vmatrix}
.
\end{equation}
This looks a lot like the usual minor/cofactor recipe, but written out explicitly for each element, using the antisymmetric tensor to encode the index alternation. It’s worth noting that there may be an error or subtle difference from the usual in my formulation, since Wikipedia defines the adjoint as the transpose of the cofactor matrix, see: [1].
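To convince myself that this index gymnastics is at least self consistent, here’s a small C++ sketch of \ref{eqn:adjoint:380} (helper names are mine), building each \( \begin{vmatrix} \Be_j & \Bm_r & \Bm_s \end{vmatrix} \) as an explicit \( 3 \times 3 \) determinant, then checking \( M A \) against \( \Abs{M} I \):

#include <array>
#include <cstdio>

using Mat3 = std::array<std::array<double, 3>, 3>; // m[row][col]

double det3(const Mat3& m) {
    return m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
         - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
         + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]);
}

// Levi-Civita symbol for 0-based index triples.
int eps(int i, int j, int k) { return (j - i) * (k - i) * (k - j) / 2; }

// The determinant | e_j m_r m_s |, with e_j and columns r, s of m as columns.
double colDet(const Mat3& m, int j, int r, int s) {
    Mat3 d{};
    for (int row = 0; row < 3; ++row) {
        d[row][0] = (row == j) ? 1.0 : 0.0;
        d[row][1] = m[row][r];
        d[row][2] = m[row][s];
    }
    return det3(d);
}

int main() {
    Mat3 m{{{2, 1, 0}, {1, 3, 1}, {0, 1, 4}}};

    // A_{ij} = (1/2!) eps_{irs} | e_j m_r m_s |
    Mat3 a{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int r = 0; r < 3; ++r)
                for (int s = 0; s < 3; ++s)
                    a[i][j] += eps(i, r, s) * colDet(m, j, r, s) / 2.0;

    // M A should print as det(M) I.
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            double sum = 0;
            for (int k = 0; k < 3; ++k)
                sum += m[i][k] * a[k][j];
            std::printf("%8.3f ", sum);
        }
        std::printf("\n");
    }
    std::printf("det(M) = %g\n", det3(m));
}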

General case: \(n \times n\) matrix.

It appears that if we wanted an induction hypothesis for the general \( n > 1 \) case, the \( ij \) element of the adjoint matrix is likely
\begin{equation}\label{eqn:adjoint:460}
\begin{aligned}
A_{ij} &= \frac{\epsilon_{i s_1 s_2 \cdots s_{n-1}}}{(n-1)!} \begin{vmatrix} \Be_j & \Bm_{s_1} & \Bm_{s_2} & \cdots & \Bm_{s_{n-1}} \end{vmatrix} \\
&= \frac{(-1)^j \epsilon_{i r_1 r_2 \cdots r_{n-1}} \epsilon_{j s_1 s_2 \cdots s_{n-1}} }{\lr{(n-1)!}^2}
\begin{vmatrix}
m_{r_1 s_1} & \cdots & m_{r_1 s_{n-1}} \\
\vdots & & \vdots \\
m_{r_{n-1} s_{1}} & \cdots & m_{r_{n-1} s_{n-1}}
\end{vmatrix}.
\end{aligned}
\end{equation}
I’m not going to try to prove this, inductively or otherwise.
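It does at least collapse correctly to the low dimensional results: for \( n = 2, 3 \) the first form above reduces to
\begin{equation}
A_{ij} = \epsilon_{ir} \begin{vmatrix} \Be_j & \Bm_r \end{vmatrix}, \qquad
A_{ij} = \frac{\epsilon_{irs}}{2!} \begin{vmatrix} \Be_j & \Bm_r & \Bm_s \end{vmatrix},
\end{equation}
which are exactly \ref{eqn:adjoint:440} and \ref{eqn:adjoint:380}.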

References

[1] Wikipedia contributors. Minor (linear algebra) — Wikipedia, the free encyclopedia, 2023. URL https://en.wikipedia.org/w/index.php?title=Minor_(linear_algebra)&oldid=1182988311. [Online; accessed 16-January-2024].

An absurd COBOL library: 2D Euclidean GA

December 31, 2023 COBOL, math and physics play

I’ve achieved a new pinnacle of obscurity, and have now written a rudimentary COBOL implementation of a geometric algebra library for \( \mathbb{R}^2 \) calculations.

Who will use this?  Absolutely nobody.  Effectively, nobody knows geometric algebra.  Nobody wants to know COBOL, but some do.  The intersection of those two groups is vanishingly small (probably one: argued below.)

I understand that some Opus Dei members have taught themselves COBOL, as looking at COBOL has been found to be as painful as a course of self flagellation.

Figure 0. A flagellation representation of COBOL.

Assuming that no Opus Dei practitioners know geometric algebra, that means that there is exactly one person in the world who knows both COBOL and geometric algebra.  Me.

Why did I write this little library?  Well, I was tickled to write something so completely stupid, and I’ve been laughing at the absurdity of it. I also thought I might learn a few things about COBOL in the process of trying to use it for something slightly non-trivial.  I’m adept at writing simple test programs that exercise various obscure compiler features, but those are usually fairly small.  On the flip side of complexity, I have to debug through a number of horribly complicated customer programs as part of my compiler validation work.  A simple real life test scenario might run 100+ COBOL programs in a set of CICS transactions, executing thousands of EXEC DLI and EXEC CICS statements as well as all of the rest of the COBOL language statements!  Despite having gained familiarity with COBOL from that sort of observational use, walking through stuff in the debugger doesn’t provide the same level of comfort with the language as writing code from scratch.  Since I have no interest in simulating a boring business application, why not do something just for fun as a learning game.

The compiler I am using does not seem to support object-COBOL (which would have been nicely suited for this project), so I’ve written my little toy in conventional COBOL, using one external procedure for each type of mathematical operation.  In the huge set of customer COBOL code that I’ve examined and done test compilations of, none of it has used object-COBOL.  I am guessing that the object-COBOL community is as large as the user base for my little toy COBOL geometric algebra library will ever be.

I’ve implemented methods to construct multivectors with scalar, vector and pseudoscalar components, or a general multivector with all of the above.  I’ve also implemented multiply, add, subtract, scalar multiplication, grade selection, and a DISPLAY function to write a multivector to SYSOUT (stdout equivalent.)

The multivector “type”

Figure 1 shows the implementation of my multivector type, implemented in copybook (include file) named MVI.  I have an alternate MV copybook that doesn’t have the VALUE (initialization) clauses, as you don’t want initialization for LINKAGE-SECTION values (i.e.: program parameters.)

Figure 1. Copybook with multivector declaration and initialization.

If you are wondering what the hell a ‘PIC S9(9) USAGE IS COMP-5’ is, well, that’s the “easy to remember” way to declare a 32-bit signed integer in COBOL.  A COMP-2, on the other hand, is a floating point value.

Figure 2 shows an example of the use of this copybook:

Figure 2. Using the multivector copybook.

Figure 3 shows these two copybook declarations after preprocessor expansion

Figure 3. Multivector global variable examples after preprocessing.

The global variable declarations above are roughly equivalent to the following pseudo C++ code (pretending that we can have anonymous unions that match the COBOL declarations above):

#include <complex>

using complex = std::complex<double>;

struct ga20{
   int grade{};   // -1 appears to denote a general (mixed grade) multivector
   union {
      struct { double sc{}; double ps{}; };   // scalar and pseudoscalar (e_{12}) components
      complex g02{};                          // ... viewed together as a single complex number
   };
   union { 
      struct { double x{}; double y{}; };     // e_1 and e_2 components
      complex g1{};                           // ... also viewed as a complex number
   };
};

ga20 a;
ga20 b;

COBOL is inherently untyped, but requires matching types for CALL parameters, or else all hell ensues, so you have to rely on naming conventions and other mechanisms to enforce the required type equivalences.  In this toy GA library, I’ve used copybooks to enforce the types required for everything.  Global variable declarations like these A-MV and B-MV variables are declared only using a copybook that knows the representation required, and all the uses in sub-programs of the effective -MV “type” use a matching copybook for their declarations.  However, I’ve also made use of the lack of typing to treat A-G02, B-G02, A-G1, and B-G1 as if they were complex numbers, and pass those “variables” off to complex number sub-programs, knowing that I’ve constructed the parameters to those programs in a way that is bit compatible with the MV field values.  You can screw things up really nicely doing stuff like this, especially because all COBOL sub-program parameters are (generally) passed by reference.  If you don’t match up the types right “fun ensues.”

Also observe that the nested level specifiers are optional in COBOL.  For nested fields in C++, we might write a.g1.x.  With a nested variable like this in COBOL, we could write something equivalent to that, like:

A-X OF A-G1 OF A-MV

but we can leave out any of the intermediate “level” specifications if we want.  This gets really confusing in complicated real-life COBOL code.  If you are looking to see where something is modified, you have to not only look for the variable of interest, but also any of the higher level fields, since any of those could have been passed off to other code, which implicitly wrote the value you are interested in.

Here’s what one of these multivectors looks like in memory on my (Linux x86-64) system

(lldb) c
Process 3903259 resuming
Process 3903259 stopped
* thread #10, name = 'GA20', stop reason = breakpoint 7.1
    frame #0: 0x00007fffd9189a02 PJOOT.GA20V01.LOADLIB(MULT).ec73dc4b`MULT at MULT.cob:50:1
   47              CALL GA-MKVECTOR-MODIFY USING C-MV, A-X, A-Y
   48              CALL GA-MKPSEUDO-MODIFY USING D-MV, A-PS
   49  
-> 50              MOVE 'A' TO WS-DISPPARM-N
   51              CALL GA-DISPLAY USING
   52                WS-DISPPARM-N,
   53                A-MV
(lldb) p A-MV
(A-MV) A-MV = {
  A-GRADE = -1
  A-G02 = (A-SC = 1, A-PS = 4)
  A-G1 = (A-X = 2, A-Y = 3)
}

i.e.: this has the value \( 1 + 2 \mathbf{e}_1 + 3 \mathbf{e}_2 + 4 \mathbf{e}_{12} \).

Looking at the multivector in its hex representation:

(lldb) fr v -format x A-MV
(A-MV) A-MV = {
  A-GRADE = 0xffffffff
  A-G02 = {
    A-SC = 0x3ff0000000000000
    A-PS = 0x4010000000000000
  }
  A-G1 = {
    A-X = 0x4000000000000000
    A-Y = 0x4008000000000000
  }
}

we see that the debugger is showing an underlying IEEE floating point representation for the COMP-2 variables in the program as it was compiled.

I have a multivector print routine that prints multivectors to SYSOUT:

Figure 4. Calling the multivector DISPLAY function.

where WS-DISPPARM-N is a PIC X(20).  (i.e.: a fixed size character array.)  Output for the A-MV value showing in the debug session above looks like:

A                     ( .10000000000000000E 01)                                                                         
                    + ( .20000000000000000E 01) e_1 + ( .30000000000000000E 01) e_2                                     
                    + ( .40000000000000000E 01) e_{12}            

End of sentence required for nested IFs?

I encountered a curious language issue in my multivector multiply function.  Here’s an example of how I’ve been coding IF statements

Figure 5. An IF END-IF pair without a period to terminate the sentence.

Notice that I don’t do anything special between the END-IF and the statement that follows it.  However, if I have an IF statement that includes nested IF END-IFs, then it appears that I need a period after the final END-IF, like so:

Figure 6. An IF with nested conditions that seems to require a period to terminate the sentence.

If I don’t include that period after the final END-IF (ending the COBOL sentence), then in some circumstances, I was seeing the program exit after the last interior basic block within this nested IF was executed.  In COBOL parlance, it seems as if a GOBACK (i.e.: return) was implicitly executed once we fell out of the big nested IF.  Why is that period required for a nested IF, but not for a simple IF?

In my “Murach’s mainframe COBOL”, the author ends ALL IF statements with a period, even simple IFs.  I don’t see a rationale for that in the book anywhere, but it’s a ~700 page book, so perhaps he says why at some point.

I’ve asked our compiler guys if this is a bug or expected behaviour, but I am guessing the latter…. I just don’t know why.

The multiplication kernel for this library

The workhorse of this GA(2,0) implementation is a multivector multiplication operation, which can be implemented in two lines in Mathematica (or C++)

multivector /: multivector[_, m1_, m2_] ** multivector[_, n1_, n2_] := 
   multivector[-1, m1 n1 + Conjugate[m2] n2, n1 m2 + Conjugate[m1] n2 ]
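For comparison, here’s roughly what that same kernel looks like in C++ (a sketch, assuming the two-complex-number packing of the ga20 struct above; the conjugations account for commuting \( \Be_1 \) past the \( \Be_{12} \) factors):

#include <complex>

using complex = std::complex<double>;

// A GA(2,0) multivector packed as two complex numbers, as in the ga20
// struct above: g02 = scalar + e12 * pseudoscalar, g1 = e1 * (x + e12 * y).
struct mv {
    complex g02;
    complex g1;
};

// The same product as the Mathematica kernel:
// g02 = m1 n1 + Conjugate[m2] n2, g1 = n1 m2 + Conjugate[m1] n2.
mv operator*(const mv& m, const mv& n) {
    return { m.g02 * n.g02 + std::conj(m.g1) * n.g1,
             n.g02 * m.g1 + std::conj(m.g02) * n.g1 };
}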

In COBOL, it takes a lot more, and as usual, COBOL verbosity obfuscates things considerably. Here’s the equivalent code in my library:

Figure 7. GA(2,0) multiplication kernel in COBOL.

The library and a little test program.

If you are curious, you can poke around in the code for this library and the test program on github.  The sample/test program is src/MULT.cob, and running the job gives the following SYSOUT:

Figure 8. Sample SYSOUT for MULT.cob

Potentials for multivector Maxwell’s equation (again.)

December 8, 2023 math and physics play

[Click here for the PDF version of this post.]

Motivation.

This revisits my last blog post, where I covered this content in a meandering fashion. This is an attempt to re-express it in a more compact form; in particular, in a form that is amenable to inclusion in my book. When I wrote the potential section of my book, I cheated, and didn’t try to motivate the results. My cheat was figuring out the multivector potential representation starting with STA, where things are simpler, and then translating back to a multivector representation, instead of figuring out a reasonable way to motivate things from the foundation already laid.

I’d like to eventually have a less rushed treatment of potentials in my book, where the results are not pulled out of a magic hat. Here is an attempted step in that direction. I’ve opted to put some of the motivational material in problems (with solutions at the chapter end.)

Multivector potentials.

We know from conventional electromagnetism (given no fictitious magnetic sources) that we can represent the six components of the electric and magnetic fields in terms of four scalar fields
\begin{equation}\label{eqn:mvpotentials:80}
\begin{aligned}
\BE &= -\spacegrad \phi - \PD{t}{\BA} \\
\BH &= \inv{\mu} \spacegrad \cross \BA.
\end{aligned}
\end{equation}
The conventional way of constructing these potentials makes use of the identities
\begin{equation}\label{eqn:mvpotentials:60}
\begin{aligned}
\spacegrad \cdot \lr{ \spacegrad \cross \BA } &= 0 \\
\spacegrad \cross \lr{ \spacegrad \phi } &= 0,
\end{aligned}
\end{equation}
applying those to the source free Maxwell’s equations to find representations of \( \BE, \BH \) that automatically satisfy those equations. For that conventional analysis, see section 18-6 [2] (available online), or section 10.1 [3], or section 6.4 [4]. We can also find such a potential representation using geometric algebra methods that are cross product free (problem 1.)

For Maxwell’s equations with fictitious magnetic sources, it can be shown that a potential representation of the field
\begin{equation}\label{eqn:mvpotentials:100}
\begin{aligned}
\BH &= -\spacegrad \phi_m - \PD{t}{\BF} \\
\BE &= -\inv{\epsilon} \spacegrad \cross \BF
\end{aligned}
\end{equation}
satisfies the source-free grades of Maxwell’s equation.
See [1], and [5] for such derivations. As with the conventional source potentials, we can also apply our geometric algebra toolbox to easily find these results (problem 2.)

We have a mix of time partials and curls that is reminiscent of Maxwell’s equation itself. It’s natural to wonder whether there is a more coherent integrated form for the potential. This is in fact the case.

Lemma 1.1: Multivector potentials.

For Maxwell’s equation with electric sources, the total field \( F \) can be expressed in multivector potential form
\begin{equation}\label{eqn:mvpotentials:520}
F = \gpgrade{ \lr{ \spacegrad - \inv{c} \PD{t}{} } \lr{ -\phi + c \BA } }{1,2}.
\end{equation}
For Maxwell’s equation with only fictitious magnetic sources, the total field \( F \) can be expressed in multivector form
\begin{equation}\label{eqn:mvpotentials:540}
F = \gpgrade{ \lr{ \spacegrad - \inv{c} \PD{t}{} } I \eta \lr{ -\phi_m + c \BF } }{1,2}.
\end{equation}

The reader should try to verify this themselves (problem 3.)

Using superposition, we can form a multivector potential that includes all grades.

Definition 1.1: Multivector potential.

We call \( A \), a multivector with all grades, the multivector potential, defining the total field as
\begin{equation}\label{eqn:mvpotentials:600}
\begin{aligned}
F
&=
\gpgrade{ \lr{ \spacegrad - \inv{c} \PD{t}{} } A }{1,2} \\
&=
\lr{ \spacegrad - \inv{c} \PD{t}{} } A
-
\gpgrade{ \lr{ \spacegrad - \inv{c} \PD{t}{} } A }{0,3}.
\end{aligned}
\end{equation}
Imposition of the constraint
\begin{equation}\label{eqn:mvpotentials:680}
\gpgrade{ \lr{ \spacegrad - \inv{c} \PD{t}{} } A }{0,3} = 0,
\end{equation}
is called the Lorentz gauge condition, and allows us to express \( F \) in terms of the potential without any grade selection filters.

Lemma 1.2: Conventional multivector potential.

Let
\begin{equation}\label{eqn:mvpotentials:620}
A = -\phi + c \BA + I \eta \lr{ -\phi_m + c \BF }.
\end{equation}
This results in the conventional potential representation of the electric and magnetic fields
\begin{equation}\label{eqn:mvpotentials:640}
\begin{aligned}
\BE &= -\spacegrad \phi - \PD{t}{\BA} - \inv{\epsilon} \spacegrad \cross \BF \\
\BH &= -\spacegrad \phi_m - \PD{t}{\BF} + \inv{\mu} \spacegrad \cross \BA.
\end{aligned}
\end{equation}
In terms of potentials, the Lorentz gauge condition \ref{eqn:mvpotentials:680} takes the form
\begin{equation}\label{eqn:mvpotentials:660}
\begin{aligned}
0 &= \inv{c} \PD{t}{\phi} + \spacegrad \cdot (c \BA) \\
0 &= \inv{c} \PD{t}{\phi_m} + \spacegrad \cdot (c \BF).
\end{aligned}
\end{equation}

Start proof:

See problem 4.

End proof.
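Note that, after dividing through by \( c \), the first of the gauge conditions \ref{eqn:mvpotentials:660} is just the conventional Lorentz gauge condition
\begin{equation}
0 = \spacegrad \cdot \BA + \inv{c^2} \PD{t}{\phi} = \spacegrad \cdot \BA + \mu \epsilon \PD{t}{\phi},
\end{equation}
and the second is the analogous condition on the fictitious source potentials \( \phi_m, \BF \).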

Problems.

Problem 1: Potentials for no-fictitious sources.

Starting with Maxwell’s equation with only conventional electric sources
\begin{equation}\label{eqn:mvpotentials:120}
\lr{ \spacegrad + \inv{c}\PD{t}{} } F = \gpgrade{J}{0,1},
\end{equation}
show that this may be split by grade into three equations
\begin{equation}\label{eqn:mvpotentials:140}
\begin{aligned}
\gpgrade{ \lr{ \spacegrad + \inv{c}\PD{t}{} } F}{0,1} &= \gpgrade{J}{0,1} \\
\spacegrad \wedge \BE + \inv{c}\PD{t}{} \lr{ I \eta \BH } &= 0 \\
\spacegrad \wedge \lr{ I \eta \BH } &= 0.
\end{aligned}
\end{equation}
Then use the identities \( \spacegrad \wedge \spacegrad \wedge \BA = 0 \), for vector \( \BA \) and \( \spacegrad \wedge \spacegrad \phi = 0 \), for scalar \( \phi \) to find the potential representation.

Answer

Taking grade(0,1) and (2,3) selections of Maxwell’s equation, we split our equations into source dependent and source free equations
\begin{equation}\label{eqn:mvpotentials:200}
\gpgrade{ \lr{ \spacegrad + \inv{c} \PD{t}{} } F }{0,1} = \gpgrade{J}{0,1},
\end{equation}
\begin{equation}\label{eqn:mvpotentials:220}
\gpgrade{ \lr{ \spacegrad + \inv{c} \PD{t}{} } F }{2,3} = 0.
\end{equation}

In terms of \( F = \BE + I \eta \BH \), the source free equation expands to
\begin{equation}\label{eqn:mvpotentials:240}
\begin{aligned}
0
&=
\gpgrade{
\lr{ \spacegrad + \inv{c} \PD{t}{} } \lr{ \BE + I \eta \BH }
}{2,3} \\
&=
\gpgradetwo{\spacegrad \BE}
+ \gpgradethree{I \eta \spacegrad \BH} + I \eta \inv{c} \PD{t}{\BH} \\
&=
\spacegrad \wedge \BE
+ \spacegrad \wedge \lr{ I \eta \BH }
+ I \eta \inv{c} \PD{t}{\BH},
\end{aligned}
\end{equation}
which can be further split into a bivector and trivector equation
\begin{equation}\label{eqn:mvpotentials:260}
0 = \spacegrad \wedge \BE + I \eta \inv{c} \PD{t}{\BH}
\end{equation}
\begin{equation}\label{eqn:mvpotentials:280}
0 = \spacegrad \wedge \lr{ I \eta \BH }.
\end{equation}
It’s clear that we want to write the magnetic field as a (bivector) curl, so we let
\begin{equation}\label{eqn:mvpotentials:300}
I \eta \BH = I c \BB = c \spacegrad \wedge \BA,
\end{equation}
or, multiplying by \( I^{-1} \) and using \( \spacegrad \wedge \BA = I \lr{ \spacegrad \cross \BA } \),
\begin{equation}\label{eqn:mvpotentials:301}
\BH = \inv{\mu} \spacegrad \cross \BA.
\end{equation}

\Cref{eqn:mvpotentials:260} is reduced to
\begin{equation}\label{eqn:mvpotentials:320}
\begin{aligned}
0
&= \spacegrad \wedge \BE + I \eta \inv{c} \PD{t}{\BH} \\
&= \spacegrad \wedge \BE + \inv{c} \PD{t}{} \spacegrad \wedge \lr{ c \BA } \\
&= \spacegrad \wedge \lr{ \BE + \PD{t}{\BA} }.
\end{aligned}
\end{equation}
We can now let
\begin{equation}\label{eqn:mvpotentials:340}
\BE + \PD{t}{\BA} = -\spacegrad \phi.
\end{equation}
We sneakily adjust the sign of the gradient so that the result matches the conventional representation.

Problem 2: Potentials for fictitious sources.

Starting with Maxwell’s equation with only fictitious magnetic sources
\begin{equation}\label{eqn:mvpotentials:160}
\lr{ \spacegrad + \inv{c}\PD{t}{} } F = \gpgrade{J}{2,3},
\end{equation}
show that this may be split by grade into three equations
\begin{equation}\label{eqn:mvpotentials:180}
\begin{aligned}
\gpgrade{ \lr{ \spacegrad + \inv{c}\PD{t}{} } I F}{0,1} &= I \gpgrade{J}{2,3} \\
-\eta \spacegrad \wedge \BH + \inv{c}\PD{t}{(I \BE)} &= 0 \\
\spacegrad \wedge \lr{ I \BE } &= 0.
\end{aligned}
\end{equation}
Then use the identities \( \spacegrad \wedge \spacegrad \wedge \BF = 0 \), for vector \( \BF \) and \( \spacegrad \wedge \spacegrad \phi_m = 0 \), for scalar \( \phi_m \) to find the potential representation \ref{eqn:mvpotentials:100}.

Answer

We multiply \ref{eqn:mvpotentials:160} by \( I \) to find
\begin{equation}\label{eqn:mvpotentials:360}
\lr{ \spacegrad + \inv{c}\PD{t}{} } I F = I \gpgrade{J}{2,3},
\end{equation}
which can be split into
\begin{equation}\label{eqn:mvpotentials:380}
\begin{aligned}
\gpgrade{ \lr{ \spacegrad + \inv{c}\PD{t}{} } I F }{0,1} &= I \gpgrade{J}{2,3} \\
\gpgrade{ \lr{ \spacegrad + \inv{c}\PD{t}{} } I F }{2,3} &= 0.
\end{aligned}
\end{equation}
We expand the source free equation in terms of \( I F = I \BE - \eta \BH \), to find
\begin{equation}\label{eqn:mvpotentials:400}
\begin{aligned}
0
&= \gpgrade{ \lr{ \spacegrad + \inv{c}\PD{t}{} } \lr{ I \BE - \eta \BH } }{2,3} \\
&= \spacegrad \wedge \lr{ I \BE } + \inv{c} \PD{t}{(I \BE)} - \eta \spacegrad \wedge \BH,
\end{aligned}
\end{equation}
which has the respective bivector and trivector grades
\begin{equation}\label{eqn:mvpotentials:420}
0 = \spacegrad \wedge \lr{ I \BE }
\end{equation}
\begin{equation}\label{eqn:mvpotentials:440}
0 = \inv{c} \PD{t}{(I \BE)} - \eta \spacegrad \wedge \BH.
\end{equation}
We can clearly satisfy \ref{eqn:mvpotentials:420} by setting
\begin{equation}\label{eqn:mvpotentials:460}
I \BE = -\inv{\epsilon} \spacegrad \wedge \BF,
\end{equation}
or
\begin{equation}\label{eqn:mvpotentials:461}
\BE = -\inv{\epsilon} \spacegrad \cross \BF.
\end{equation}
Here, once again, the sneaky inclusion of a constant factor \( -1/\epsilon \) is to make the result match the conventional one. Inserting this value for \( I \BE \) into our bivector equation yields
\begin{equation}\label{eqn:mvpotentials:480}
\begin{aligned}
0
&= -\inv{\epsilon} \inv{c} \PD{t}{} (\spacegrad \wedge \BF) - \eta \spacegrad \wedge \BH \\
&= -\eta \spacegrad \wedge \lr{ \PD{t}{\BF} + \BH },
\end{aligned}
\end{equation}
so we set
\begin{equation}\label{eqn:mvpotentials:500}
\PD{t}{\BF} + \BH = -\spacegrad \phi_m,
\end{equation}
and have a field representation that automatically satisfies the source free equations.

Problem 3: Total field in terms of potentials.

Prove lemma 1.1, either by direct expansion, or by trying to discover the multivector form of the field by construction.

Answer

Proof by expansion is straightforward, and left to the reader. We form the respective total electromagnetic fields \( F = \BE + I \eta \BH \) for each case.

We find
\begin{equation}\label{eqn:mvpotentials:560}
\begin{aligned}
F
&= \BE + I \eta \BH \\
&= -\spacegrad \phi - \PD{t}{\BA} + I \frac{\eta}{\mu} \spacegrad \cross \BA \\
&= -\spacegrad \phi - \inv{c} \PD{t}{(c \BA)} + \spacegrad \wedge (c\BA) \\
&= \gpgrade{ -\spacegrad \phi - \inv{c} \PD{t}{(c \BA)} + \spacegrad \wedge (c\BA) }{1,2} \\
&= \gpgrade{ -\spacegrad \phi - \inv{c} \PD{t}{(c \BA)} + \spacegrad (c\BA) }{1,2} \\
&= \gpgrade{ \spacegrad \lr{ -\phi + c \BA } - \inv{c} \PD{t}{(c \BA)} }{1,2} \\
&= \gpgrade{ \lr{ \spacegrad -\inv{c} \PD{t}{} } \lr{ -\phi + c \BA } }{1,2}.
\end{aligned}
\end{equation}

For the field of the fictitious source case, we compute the result in the same way, inserting a no-op grade selection to allow us to simplify, finding
\begin{equation}\label{eqn:mvpotentials:580}
\begin{aligned}
F
&= \BE + I \eta \BH \\
&= -\inv{\epsilon} \spacegrad \cross \BF + I \eta \lr{ -\spacegrad \phi_m - \PD{t}{\BF} } \\
&= \inv{\epsilon c} I \lr{ \spacegrad \wedge (c \BF)} + I \eta \lr{ -\spacegrad \phi_m - \inv{c} \PD{t}{(c \BF)} } \\
&= I \eta \lr{ \spacegrad \wedge (c \BF) + \lr{ -\spacegrad \phi_m - \inv{c} \PD{t}{(c \BF)} } } \\
&= I \eta \gpgrade{ \spacegrad \wedge (c \BF) + \lr{ -\spacegrad \phi_m - \inv{c} \PD{t}{(c \BF)} } }{1,2} \\
&= I \eta \gpgrade{ \spacegrad (c \BF) - \spacegrad \phi_m - \inv{c} \PD{t}{(c \BF)} }{1,2} \\
&= I \eta \gpgrade{ \spacegrad (-\phi_m + c \BF) - \inv{c} \PD{t}{(c \BF)} }{1,2} \\
&= I \eta \gpgrade{ \lr{ \spacegrad -\inv{c} \PD{t}{} } (-\phi_m + c \BF) }{1,2}.
\end{aligned}
\end{equation}

Problem 4: Fields in terms of potentials.

Prove lemma 1.2.

Answer

Let’s expand and then group by grade
\begin{equation}\label{eqn:mvpotentials:n}
\begin{aligned}
\lr{ \spacegrad - \inv{c} \PD{t}{} } A
&=
\lr{ \spacegrad - \inv{c} \PD{t}{} } \lr{ -\phi + c \BA + I \eta \lr{ -\phi_m + c \BF }} \\
&=
-\spacegrad \phi + c \spacegrad \BA + I \eta \lr{ -\spacegrad \phi_m + c \spacegrad \BF }
+ \inv{c} \PD{t}{\phi} - c \inv{c} \PD{t}{ \BA } + I \eta \lr{ \inv{c} \PD{t}{\phi_m} - c \inv{c} \PD{t}{\BF} } \\
&=
- \spacegrad \phi
+ I \eta c \spacegrad \wedge \BF
- c \inv{c} \PD{t}{\BA}
\quad + c \spacegrad \wedge \BA
-I \eta \spacegrad \phi_m
- c I \eta \inv{c} \PD{t}{\BF} \\
&\quad + c \spacegrad \cdot \BA
+\inv{c} \PD{t}{\phi}
\quad + I \eta \lr{ c \spacegrad \cdot \BF
+ \inv{c} \PD{t}{\phi_m} } \\
&=
- \spacegrad \phi
- \inv{\epsilon} \spacegrad \cross \BF
- \PD{t}{\BA}
\quad + I \eta \lr{
\inv{\mu} \spacegrad \cross \BA
- \spacegrad \phi_m
- \PD{t}{\BF}
} \\
&\quad + c \spacegrad \cdot \BA
+\inv{c} \PD{t}{\phi}
\quad + I \eta \lr{ c \spacegrad \cdot \BF
+ \inv{c} \PD{t}{\phi_m} }.
\end{aligned}
\end{equation}
Observing that \( F = \gpgrade{ \lr{ \spacegrad -(1/c) \partial_t } A }{1,2} = \BE + I \eta \BH \) completes the problem. If the Lorentz gauge condition is assumed, the scalar and pseudoscalar components above are obliterated, leaving just \( F = \lr{ \spacegrad -(1/c) \partial_t } A \).

References

[1] Constantine A Balanis. Antenna theory: analysis and design. John Wiley & Sons, 3rd edition, 2005.

[2] R.P. Feynman, R.B. Leighton, and M.L. Sands. Feynman lectures on physics, Volume II.[Lectures on physics], chapter The Maxwell Equations. Addison-Wesley Publishing Company. Reading, Massachusetts, 1963. URL https://www.feynmanlectures.caltech.edu/II_18.html.

[3] David Jeffrey Griffiths and Reed College. Introduction to electrodynamics. Prentice hall Upper Saddle River, NJ, 3rd edition, 1999.

[4] JD Jackson. Classical Electrodynamics. John Wiley and Sons, 2nd edition, 1975.

[5] David M Pozar. Microwave engineering. John Wiley & Sons, 2009.

Derivatives of spherical polar vector representation.

December 6, 2023 math and physics play

[Click here for a PDF version of this post]

On Discord, on the bivector server, ‘stationaryactionprinciple’ asked a question that I really liked.
It’s a question that nagged me before too, but I hadn’t taken the time to puzzle through it properly.

The main character in this question is the spherical polar form of a radial vector, which has the form
\begin{equation}\label{eqn:dexpquestion:20}
\begin{aligned}
i &= \Be_{12} \\
j &= \Be_{31} e^{i\phi} \\
\Bx(r,\theta,\phi) &= r \Be_3 e^{j \theta},
\end{aligned}
\end{equation}
as illustrated in Fig. 1

Fig. 1. Spherical polar conventions.

Notice that all the \( \phi \) dependence comes from the bivector \( j = j(\phi) \), which makes life a bit tricky. We can take \( r, \theta \) or \( \phi \) partials of \( \Bx \), but need to be particularly careful how we do this for the \( \phi \) partials of the exponential factor.

One correct way to compute such a partial is to first expand the exponential in its trig constituents, as
\begin{equation}\label{eqn:dexpquestion:120}
e^{j \theta} = \cos\theta + j \sin\theta,
\end{equation}
and then take the derivative with respect to \(\phi\). If we do so, we get
\begin{equation}\label{eqn:dexpquestion:140}
\PD{\phi}{} e^{j\theta} = \PD{\phi}{j} \sin\theta.
\end{equation}
On the other hand, should we just directly take derivatives of the exponential, one might think that the result is
\begin{equation}\label{eqn:dexpquestion:160}
\PD{\phi}{} e^{j\theta} = \PD{\phi}{(j\theta)} e^{j\theta} = \theta \PD{\phi}{j} e^{j\theta},
\end{equation}
but this is not correct, for a subtle reason. To understand why, we can step back to the power series representation of the exponential, and compute
\begin{equation}\label{eqn:dexpquestion:60}
\begin{aligned}
\PD{\phi}{e^{j\theta}}
&= \sum_{k = 0}^\infty \PD{\phi}{} \frac{ (j \theta)^k }{k!} \\
&= \sum_{k = 1}^\infty \PD{\phi}{j^k} \frac{ \theta^k }{k!}.
\end{aligned}
\end{equation}
If you treat \( j \) as a complex number, this then reduces to
\begin{equation}\label{eqn:dexpquestion:80}
\begin{aligned}
\PD{\phi}{e^{j\theta}}
&= \sum_{k = 1}^\infty k \PD{\phi}{j} j^{k-1} \frac{ \theta^k }{k!} \\
&=
\theta \PD{\phi}{j} \sum_{k = 1}^\infty \frac{ (j\theta)^{k-1} }{(k-1)!} \\
&=
\theta \PD{\phi}{j} e^{j\theta}.
\end{aligned}
\end{equation}
But, as we have said, this is wrong. The reason is that \( \PDi{\phi}{j} \) does not commute with \( j \), so
\begin{equation}\label{eqn:dexpquestion:100}
\PD{\phi}{j^k} = \PD{\phi}{j} j^{k-1} + j \PD{\phi}{j} j^{k-2} + \cdots,
\end{equation}
not \( k (\PDi{\phi}{j}) j^{k-1} \).

This non-commutativity, sneakily hiding in the power series for the exponential, messes us up. If we are careful, though, we should still be able to compute the correct result using the power series representation of the exponential. To do so, we need to understand the commutation relations for \( j \) and \( j' \). Writing \( j' = \PDi{\phi}{j} \), those two bivectors are
\begin{equation}\label{eqn:dexpquestion:180}
\begin{aligned}
j &= \Be_{31} e^{i\phi} \\
j' &= \Be_{32} e^{i\phi},
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:dexpquestion:200}
\begin{aligned}
j j'
&= \Be_{31} e^{i\phi} \Be_{32} e^{i\phi} \\
&= \Be_{3132} e^{-i\phi} e^{i\phi} \\
&= -\Be_{12},
\end{aligned}
\end{equation}
and
\begin{equation}\label{eqn:dexpquestion:220}
\begin{aligned}
j' j
&= \Be_{32} e^{i\phi} \Be_{31} e^{i\phi} \\
&= \Be_{3231} e^{-i\phi} e^{i\phi} \\
&= \Be_{12}.
\end{aligned}
\end{equation}
We find that \( j \) and \( j' \), in this case, anticommute
\begin{equation}\label{eqn:dexpquestion:240}
j j' = -j' j.
\end{equation}
We can now compute
\begin{equation}\label{eqn:dexpquestion:260}
\begin{aligned}
\PD{\phi}{j^k}
&= j' j^{k-1} + j j' j^{k-2} + j^2 j' j^{k-3} \cdots \\
&= j' j^{k-1} - j' j^{k-1} + (-1)^2 j' j^{k-1} \cdots
\end{aligned}
\end{equation}
This is zero for any even \( k \) and \( j' j^{k-1} \) for odd \( k \).
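For example, for \( k = 2, 3 \)
\begin{equation}
\begin{aligned}
\PD{\phi}{j^2} &= j' j + j j' = j' j - j' j = 0 \\
\PD{\phi}{j^3} &= j' j^2 + j j' j + j^2 j' = j' j^2 - j' j^2 + j' j^2 = j' j^2.
\end{aligned}
\end{equation}
The \( k = 2 \) case is consistent with \( j^2 = -1 \) being a constant.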

Plugging this back into our Taylor series for the derivative (before we messed it up), we find
\begin{equation}\label{eqn:dexpquestion:280}
\begin{aligned}
\PD{\phi}{e^{j\theta}}
&= \sum_{k = 1, k \in \mathrm{odd}}^\infty j' j^{k-1} \frac{ \theta^k }{k!} \\
&= j' \inv{j}
\sum_{k = 1,\, k \in \mathrm{odd}}^\infty \frac{ (j\theta)^k }{k!} \\
&= j' \inv{j} \sinh( j \theta ) \\
&= j' \inv{j} j \sin( \theta ) \\
&= j' \sin( \theta ).
\end{aligned}
\end{equation}
This is exactly the result that we had when we expanded \( e^{j\theta} \) in its cis form, and then took derivatives, so we have now reconciled the two different approaches.

Observe that, as a side effect of this exploration, we now also know how to compute the derivative of \( e^{j\theta} \) for the special case where \( j j' = -j' j \), which will be the case for any \( j \) where \( j^2 = \mathrm{constant} \).
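Since everything above lives in the even subalgebra of GA(3,0), spanned by \( \{ 1, \Be_{23}, \Be_{31}, \Be_{12} \} \), this result is also easy to verify numerically. Here’s a little C++ sketch (all names are mine) that sums the exponential as a power series, with no cis shortcut, and compares a central difference in \( \phi \) against \( j' \sin\theta \):

#include <cmath>
#include <cstdio>

// An element of the even subalgebra of GA(3,0):
// s + b[0] e23 + b[1] e31 + b[2] e12.
struct Rotor {
    double s;
    double b[3];
};

Rotor add(const Rotor& x, const Rotor& y) {
    return { x.s + y.s, { x.b[0] + y.b[0], x.b[1] + y.b[1], x.b[2] + y.b[2] } };
}

Rotor scale(double k, const Rotor& x) {
    return { k * x.s, { k * x.b[0], k * x.b[1], k * x.b[2] } };
}

// Geometric product, using e23 e31 = -e12 (and cyclic), (e23)^2 = -1, etc.
Rotor mul(const Rotor& x, const Rotor& y) {
    Rotor r;
    r.s = x.s * y.s - (x.b[0]*y.b[0] + x.b[1]*y.b[1] + x.b[2]*y.b[2]);
    for (int i = 0; i < 3; ++i) {
        int j = (i + 1) % 3, k = (i + 2) % 3;
        r.b[i] = x.s*y.b[i] + y.s*x.b[i] - (x.b[j]*y.b[k] - x.b[k]*y.b[j]);
    }
    return r;
}

// exp(X), summed as a power series.
Rotor expSeries(const Rotor& x) {
    Rotor sum{ 1, { 0, 0, 0 } }, term{ 1, { 0, 0, 0 } };
    for (int k = 1; k < 30; ++k) {
        term = scale(1.0 / k, mul(term, x));
        sum = add(sum, term);
    }
    return sum;
}

// j(phi) = e31 e^{i phi} = cos(phi) e31 - sin(phi) e23.
Rotor j(double phi) { return { 0, { -std::sin(phi), std::cos(phi), 0 } }; }

// j'(phi) = e32 e^{i phi} = -cos(phi) e23 - sin(phi) e31.
Rotor jp(double phi) { return { 0, { -std::cos(phi), -std::sin(phi), 0 } }; }

int main() {
    double phi = 0.7, theta = 0.4, h = 1e-5;

    // Central difference of e^{j(phi) theta} in phi, vs. j'(phi) sin(theta).
    Rotor dNum = scale(1 / (2 * h),
        add(expSeries(scale(theta, j(phi + h))),
            scale(-1.0, expSeries(scale(theta, j(phi - h))))));
    Rotor dExact = scale(std::sin(theta), jp(phi));

    std::printf("numeric: %+.6f %+.6f e23 %+.6f e31 %+.6f e12\n",
                dNum.s, dNum.b[0], dNum.b[1], dNum.b[2]);
    std::printf("exact:   %+.6f %+.6f e23 %+.6f e31 %+.6f e12\n",
                dExact.s, dExact.b[0], dExact.b[1], dExact.b[2]);
}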