The quaternion form of Maxwell’s equations as stated in [2] is nearly indecipherable. The modern quaternionic form of these equations can be found in [1]. The search for this representation was driven by the question of whether the compact geometric algebra representation of Maxwell’s equations, \( \grad F = J \), is possible using a quaternion representation of the fields.

As quaternions may be viewed as the even subalgebra of GA(3,0), it is possible to derive the quaternion representation of Maxwell’s equations using only geometric algebra, including source terms, and independent of the heat considerations discussed in [1]. Such a derivation will be performed here. Examination of the results appears to answer the question about the compact representation in the negative.

Quaternions as multivectors.

Quaternions are vector plus scalar sums, where the vector basis \( \setlr{ \Bi, \Bj, \Bk } \) is subject to the complex-like multiplication rules
\Bi^2 &= \Bj^2 = \Bk^2 = -1 \\
\Bi \Bj &= \Bk = -\Bj \Bi \\
\Bj \Bk &= \Bi = -\Bk \Bj \\
\Bk \Bi &= \Bj = -\Bi \Bk.

We can represent these basis vectors in terms of the \(\mathbb{R}^{3}\) unit bivectors
\Bi &= \Be_{3} \Be_{2} = -I \Be_1 \\
\Bj &= \Be_{1} \Be_{3} = -I \Be_2 \\
\Bk &= \Be_{2} \Be_{1} = -I \Be_3,
where \( I = \Be_1 \Be_2 \Be_3 \) is the ordered product of the \(\mathbb{R}^{3}\) basis elements. Within geometric algebra, the quaternion basis “vectors” are more properly viewed as a bivector space basis that happens to have dimension three.
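
As a quick sanity check of this bivector identification, here is a minimal Python sketch. It assumes the Pauli matrices as a matrix representation of \( \Be_1, \Be_2, \Be_3 \) (my choice for illustration, not something taken from [1]), and all of the helper names are hypothetical.

```python
# 2x2 complex matrices stored as nested tuples.
# Assumption: the Pauli matrices stand in for the R^3 basis vectors e1, e2, e3.

def mul(a, b):
    """Matrix product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def scale(s, a):
    """Scalar multiple of a 2x2 matrix."""
    return tuple(tuple(s * entry for entry in row) for row in a)

ONE = ((1, 0), (0, 1))
E1 = ((0, 1), (1, 0))
E2 = ((0, -1j), (1j, 0))
E3 = ((1, 0), (0, -1))

PSEUDOSCALAR = mul(mul(E1, E2), E3)  # I = e1 e2 e3

# Quaternion units written as unit bivectors.
QI = mul(E3, E2)
QJ = mul(E1, E3)
QK = mul(E2, E1)

def check_quaternion_rules():
    """True if i, j, k obey the quaternion rules and equal -I e1, -I e2, -I e3."""
    minus_one = scale(-1, ONE)
    return all([
        mul(QI, QI) == minus_one,
        mul(QJ, QJ) == minus_one,
        mul(QK, QK) == minus_one,
        mul(QI, QJ) == QK, mul(QJ, QI) == scale(-1, QK),
        mul(QJ, QK) == QI, mul(QK, QJ) == scale(-1, QI),
        mul(QK, QI) == QJ, mul(QI, QK) == scale(-1, QJ),
        QI == scale(-1, mul(PSEUDOSCALAR, E1)),
        QJ == scale(-1, mul(PSEUDOSCALAR, E2)),
        QK == scale(-1, mul(PSEUDOSCALAR, E3)),
    ])

print(check_quaternion_rules())
```

All the matrix entries are small integers or integer multiples of the imaginary unit, so the comparisons are exact and no floating point tolerance is needed.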

Similar to [1] (which used \(d/dr\), whereas \(d/dX\) is used here to invoke the connection to a relativistic four vector \(X = (c t, \mathbf{x})\)), we may introduce a quaternionic spacetime gradient, and express that in terms of geometric algebra
\frac{d}{dX} &= \inv{c} \PD{t}{}
+ \Bi \PD{x}{}
+ \Bj \PD{y}{}
+ \Bk \PD{z}{} \\
&= \inv{c}\PD{t}{} -I \spacegrad.

Of particular interest is how to write the curl, divergence and time partials in terms of the quaternionic spacetime gradient or its components. Like [1], we will use modern commutator notation for an antisymmetric difference of products
\antisymmetric{a}{b} = a b - b a,
and anticommutator notation for a symmetric sum of products
\symmetric{a}{b} = a b + b a.
The curl of a vector \( \Bf \) in terms of vector products with the gradient is
\spacegrad \cross \Bf
&= -I(\spacegrad \wedge \Bf) \\
&= -\frac{I}{2} \lr{ \spacegrad \Bf - \Bf \spacegrad } \\
&= \frac{1}{2} \lr{ (-I \spacegrad) \Bf - \Bf (-I\spacegrad) } \\
&= \inv{2} \antisymmetric{ -I \spacegrad }{ \Bf } \\
&= \inv{2} \antisymmetric{ \frac{d}{dX} }{ \Bf },
where the last step takes advantage of the fact that the timelike contribution of the spacetime gradient commutes with any vector \( \Bf \) due to its scalar nature, so cancels out of the commutator. In a similar fashion, the dot product may be written as an anticommutator
\spacegrad \cdot \Bf
&= \inv{2} \lr{ \spacegrad \Bf + \Bf \spacegrad } \\
&= \inv{2} \symmetric{ \spacegrad}{ \Bf },
as can the scalar time derivative
\PD{t}{\Bf} = \inv{2} \symmetric{ \inv{c} \PD{t}{} } { c \Bf }.
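
The algebra underlying the curl and divergence identities can be spot checked numerically for constant vectors, where \( \Ba \cross \Bf = \inv{2} \antisymmetric{ -I \Ba }{ \Bf } \) and \( \Ba \cdot \Bf = \inv{2} \symmetric{ \Ba }{ \Bf } \). The following is only a sketch: it assumes a Pauli matrix representation of the basis vectors (so that \( I \) is the imaginary unit times the identity matrix), uses hypothetical helper names, and replaces the gradient with an ordinary vector \( \Ba \), so it checks the algebra and not the calculus.

```python
# Assumption: Pauli matrices represent e1, e2, e3, so I = e1 e2 e3 acts as 1j.
PAULI = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
ONE = ((1, 0), (0, 1))

def mul(a, b):
    """Matrix product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def add(a, b):
    return tuple(tuple(a[r][c] + b[r][c] for c in range(2)) for r in range(2))

def scale(s, a):
    return tuple(tuple(s * entry for entry in row) for row in a)

def vector(x, y, z):
    """Matrix for the vector x e1 + y e2 + z e3."""
    out = scale(0, ONE)
    for coord, basis in zip((x, y, z), PAULI):
        out = add(out, scale(coord, basis))
    return out

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def half_commutator(p, q):
    return scale(0.5, add(mul(p, q), scale(-1, mul(q, p))))

def half_anticommutator(p, q):
    return scale(0.5, add(mul(p, q), mul(q, p)))

a, f = (1, 2, 3), (-2, 5, 4)
A, F = vector(*a), vector(*f)
minus_I_A = scale(-1j, A)  # (-I) a

print(half_commutator(minus_I_A, F) == vector(*cross(a, f)))
print(half_anticommutator(A, F) == scale(dot(a, f), ONE))
```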

Quaternionic form of Maxwell’s equations.

Using geometric algebra as an intermediate transformation, let’s see directly how to express Maxwell’s equations in terms of this quaternionic operator. Our starting point is Maxwell’s equations in their standard macroscopic form

\spacegrad \cross \BH = \BJ + \PD{t}{\BD}
\spacegrad \cdot \BD = \rho
\spacegrad \cross \BE = - \PD{t}{\BB}
\spacegrad \cdot \BB = 0.

Inserting these into Maxwell-Faraday and into Gauss’s law for magnetism we have
\inv{2} \antisymmetric{ \frac{d}{dX} }{ \BE } &= - \inv{2} \symmetric{ \inv{c}\PD{t}{} }{ c \BB } \\
\inv{2} \symmetric{ \spacegrad }{ c \BB } &= 0,
or
\inv{2} \antisymmetric{ \frac{d}{dX} }{ -I \BE } + \inv{2} \symmetric{ \inv{c}\PD{t}{} }{ -I c \BB } &= 0 \\
\inv{2} \symmetric{ -I \spacegrad }{ -I c \BB } &= 0.
We can introduce quaternionic electric and magnetic field “vectors” (really bivectors)
\boldsymbol{\mathcal{E}} &= -I \BE = \Bi E_x + \Bj E_y + \Bk E_z \\
\boldsymbol{\mathcal{B}} &= -I \BB = \Bi B_x + \Bj B_y + \Bk B_z,
and substitute these and sum to find the quaternionic representation of the two source free Maxwell’s equations
\inv{2} \antisymmetric{ \frac{d}{dX} }{ \boldsymbol{\mathcal{E}} } + \inv{2} \symmetric{ \frac{d}{dX} }{ c \boldsymbol{\mathcal{B}} } = 0.

Inserting the quaternion curl, div and time derivative representations into Ampere-Maxwell’s law and Gauss’s law, gives
\inv{2} \antisymmetric{ \frac{d}{dX} }{ \BH } &= \BJ + \inv{2} \symmetric{ \inv{c} \PD{t}{} } { c \BD } \\
\inv{2} \symmetric{ \spacegrad }{ c \BD } &= c \rho,
or
\inv{2} \antisymmetric{ \frac{d}{dX} }{ -I \BH } - \inv{2} \symmetric{ \inv{c} \PD{t}{} } { -I c \BD } &= -I \BJ \\
-\inv{2} \symmetric{ -I \spacegrad }{ -I c \BD } &= c \rho.
Introducing quaternionic displacement, magnetic field, and current density “vectors”
\boldsymbol{\mathcal{D}} &= -I \BD = \Bi D_x + \Bj D_y + \Bk D_z \\
\boldsymbol{\mathcal{H}} &= -I \BH = \Bi H_x + \Bj H_y + \Bk H_z \\
\boldsymbol{\mathcal{J}} &= -I \BJ = \Bi J_x + \Bj J_y + \Bk J_z,
and summing, yields the remaining two Maxwell equations in their quaternionic form
\inv{2} \antisymmetric{ \frac{d}{dX} }{ \boldsymbol{\mathcal{H}} } - \inv{2} \symmetric{ \frac{d}{dX} } { c \boldsymbol{\mathcal{D}} } = c \rho + \boldsymbol{\mathcal{J}}.


Maxwell’s equations in the quaternion representation have a structure that is not apparent in the Heaviside-Gibbs notation. There is some elegance to this result, but it comes at the cost of having to use commutator and anticommutator operators, which are arguably non-intuitive. The compact geometric algebra representation of Maxwell’s equation does not appear possible with a quaternion representation, as an additional complex degree of freedom would be required (biquaternions?). Such a degree of freedom may also allow the (fictitious) magnetic sources that are useful in antenna theory to be incorporated into a quaternion model. Magnetic sources are easily incorporated into the current multivector in geometric algebra, but if done so in the derivation above, yield an odd grade multivector source, which has no quaternion representation.


In the geometric algebra formulation of Maxwell’s equation (singular in GA), the Green’s function for the spacetime derivative ends up with terms like

\frac{d}{dr} \delta( -r/c + t - t' ),

\frac{d}{dt} \delta( -r/c + t - t' ),

where \( t' \) is the integration variable of the test function that the delta function will be applied to. If these were derivatives with respect to the integration variable, then we could use the well known formula

\int_{-\infty}^\infty \lr{ \frac{d}{dt'} \delta(t') } \phi(t') dt' = -\phi'(0),

which follows from integration by parts, and an assumption that \( \phi(t') \) is well behaved at the points at infinity. It’s not obvious to me that this can be applied to either of our delta function derivatives.

Let’s go back to square one, and figure out the meaning of these delta functions by their action on a test function. We wish to compute

\int_{-\infty}^\infty \frac{d}{du} \delta( a u + b - t' ) f(t') dt'.

Let’s start with a change of variables \( t'' = a u + b - t' \), for which we find

t' &= a u + b - t'' \\
dt'' &= - dt' \\
\frac{d}{du} &= \frac{dt''}{du} \frac{d}{dt''} = a \frac{d}{dt''}.

Back substitution gives

\int_{-\infty}^\infty \frac{d}{du} \delta( a u + b - t' ) f(t') dt'
&= a \int_{\infty}^{-\infty} \lr{ \frac{d}{dt''} \delta( t'' ) } f( a u + b - t'' ) (-dt'') \\
&= a \int_{-\infty}^{\infty} \lr{ \frac{d}{dt''} \delta( t'' ) } f( a u + b - t'' ) dt'' \\
&= \evalrange{a \delta(t'') f( a u + b - t'')}{-\infty}{\infty}
- a \int_{-\infty}^{\infty} \delta( t'' ) \frac{d}{dt''} f( a u + b - t'' ) dt'' \\
&= - \evalbar{ a \frac{d}{dt''} f( a u + b - t'' ) }{t'' = 0} \\
&= \evalbar{ a \frac{d}{ds} f( s ) }{s = a u + b}.

This shows that the action of the derivative of the delta function (with respect to a non-integration variable parameter \( u \)) is

\frac{d}{du} \delta( a u + b - t' )
= \evalbar{a \frac{d}{ds}}{s = a u + b}.
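
This result can be spot checked numerically. The sketch below is not part of the derivation: it assumes a narrow Gaussian as a stand-in for the delta function, a midpoint rule for the integral, and a central difference for \( d/du \), and every function name and parameter value in it is my own choice for illustration.

```python
import math

def nascent_delta(x, eps):
    """Narrow Gaussian that approximates delta(x) as eps -> 0 (an assumed regularization)."""
    return math.exp(-0.5 * (x / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def smeared(f, u, a, b, eps=0.05, half_width=1.0, n=4000):
    """Midpoint-rule estimate of the integral of delta_eps(a u + b - t') f(t') dt'."""
    center = a * u + b
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        t = center - half_width + (k + 0.5) * h
        total += nascent_delta(center - t, eps) * f(t)
    return total * h

def ddu_action(f, u, a, b, du=1e-3):
    """Central-difference estimate of d/du acting on the smeared integral."""
    return (smeared(f, u + du, a, b) - smeared(f, u - du, a, b)) / (2.0 * du)

a, b, u = 2.0, 0.3, 0.4
numeric = ddu_action(math.sin, u, a, b)
expected = a * math.cos(a * u + b)  # a f'(s) evaluated at s = a u + b
print(numeric, expected)
```

The two numbers should agree to a few parts in a thousand; the residual is set mostly by the finite width `eps` of the Gaussian, which slightly smooths the test function.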

Calculate the field due to a spherical shell. The field is

[katex display="true"]\mathbf{E} = \frac{\sigma}{4 \pi \epsilon_0} \int \frac{(\mathbf{r} - \mathbf{r}')}{{{\left\lvert{{\mathbf{r} - \mathbf{r}'}}\right\rvert}}^3} da',[/katex]

where [katex]\mathbf{r}'[/katex] is the position to the area element on the shell. For the test position, let [katex]\mathbf{r} = z \mathbf{e}_3[/katex].


We need to parameterize the area integral. A complex-number like geometric algebra representation works nicely.

[katex display="true"]\begin{aligned}\mathbf{r}' &= R \left( \sin\theta \cos\phi, \sin\theta \sin\phi, \cos\theta \right) \\ &= R \left( \mathbf{e}_1 \sin\theta \left( \cos\phi + \mathbf{e}_1 \mathbf{e}_2 \sin\phi \right) + \mathbf{e}_3 \cos\theta \right) \\ &= R \left( \mathbf{e}_1 \sin\theta e^{i\phi} + \mathbf{e}_3 \cos\theta \right).\end{aligned}[/katex]

Here [katex]i = \mathbf{e}_1 \mathbf{e}_2[/katex] has been used to represent the horizontal rotation plane.

The difference in position between the test vector and area-element is

[katex display="true"]\mathbf{r} - \mathbf{r}' = \mathbf{e}_3 {\left({ z - R \cos\theta }\right)} - R \mathbf{e}_1 \sin\theta e^{i \phi},[/katex]

with an absolute squared length of

[katex display="true"]\begin{aligned}{{\left\lvert{{\mathbf{r} - \mathbf{r}' }}\right\rvert}}^2 &= {\left({ z - R \cos\theta }\right)}^2 + R^2 \sin^2\theta \\ &= z^2 + R^2 - 2 z R \cos\theta.\end{aligned}[/katex]

As a side note, this is a kind of fun way to prove the old “cosine-law” identity. With that done, the field integral can now be expressed explicitly

[katex display="true"]\begin{aligned} \mathbf{E} &= \frac{\sigma}{4 \pi \epsilon_0} \int_{\phi = 0}^{2\pi} \int_{\theta = 0}^\pi R^2 \sin\theta d\theta d\phi \frac{\mathbf{e}_3 {\left({ z - R \cos\theta }\right)} - R \mathbf{e}_1 \sin\theta e^{i \phi}} { {\left({z^2 + R^2 - 2 z R \cos\theta}\right)}^{3/2} } \\ &= \frac{2 \pi R^2 \sigma \mathbf{e}_3}{4 \pi \epsilon_0} \int_{\theta = 0}^\pi \sin\theta d\theta \frac{z - R \cos\theta} { {\left({z^2 + R^2 - 2 z R \cos\theta}\right)}^{3/2} } \\ &= \frac{2 \pi R^2 \sigma \mathbf{e}_3}{4 \pi \epsilon_0} \int_{\theta = 0}^\pi \sin\theta d\theta \frac{ R( z/R - \cos\theta) } { (R^2)^{3/2} {\left({ (z/R)^2 + 1 - 2 (z/R) \cos\theta}\right)}^{3/2} } \\ &= \frac{\sigma \mathbf{e}_3}{2 \epsilon_0} \int_{u = -1}^{1} du \frac{ z/R - u} { {\left({1 + (z/R)^2 - 2 (z/R) u}\right)}^{3/2} }. \end{aligned}[/katex]

Observe that all the azimuthal contributions get killed. We expect that due to the symmetry of the problem. We are left with an integral that submits to Mathematica, but doesn’t look fun to attempt manually. Specifically

[katex display="true"]\int_{-1}^1 \frac{a-u}{{\left({1 + a^2 - 2 a u}\right)}^{3/2}} du = \frac{2}{a^2},[/katex]

if [katex]a > 1[/katex], and zero if [katex]0 \le a < 1[/katex], so

[katex display="true"]\boxed{ \mathbf{E} = \frac{\sigma (R/z)^2 \mathbf{e}_3}{\epsilon_0} }[/katex]

for [katex]z > R[/katex], and zero for [katex]0 \le z < R[/katex].
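
The Mathematica result for the integral can be cross checked with a few lines of Python. This is just a sketch using a plain midpoint rule, with a function name and step count that are my own choices; it avoids [katex]a = 1[/katex], where the integrand is singular at the endpoint [katex]u = 1[/katex].

```python
def shell_integral(a, n=20000):
    """Midpoint-rule estimate of the integral of (a - u)/(1 + a^2 - 2 a u)^(3/2) over u in [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        u = -1.0 + (k + 0.5) * h
        total += (a - u) / (1.0 + a * a - 2.0 * a * u) ** 1.5
    return total * h

# Outside the shell (a = z/R > 1) the estimate should be close to 2/a^2,
# and inside the shell (0 <= a < 1) it should be close to zero.
for a in (3.0, 2.0, 0.5, 0.25):
    print(a, shell_integral(a), 2.0 / a ** 2 if a > 1 else 0.0)
```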

The problem statement warns to be careful of the sign when evaluating [katex]\sqrt{ R^2 + z^2 - 2 R z }[/katex]; however, I don’t see where that is even useful.
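
One candidate place for that sign to matter is a hand evaluation of the integral, in place of Mathematica. As a sketch, assuming [katex]a > 0[/katex] and using the substitution [katex]s = 1 + a^2 - 2 a u[/katex] (my notation, not that of the problem statement), the integral becomes

[katex display="true"]\begin{aligned}\int_{-1}^1 \frac{a-u}{{\left({1 + a^2 - 2 a u}\right)}^{3/2}} du &= \frac{1}{4 a^2} \int_{(a-1)^2}^{(a+1)^2} \frac{a^2 - 1 + s}{s^{3/2}} ds \\ &= \frac{1}{2 a^2} {\left. {\left({ \sqrt{s} - \frac{a^2 - 1}{\sqrt{s}} }\right)} \right\vert}_{(a-1)^2}^{(a+1)^2} \\ &= \frac{1}{2 a^2} {\left({ 2 - {\left({ \left\lvert a - 1 \right\rvert - \frac{a^2 - 1}{\left\lvert a - 1 \right\rvert} }\right)} }\right)}.\end{aligned}[/katex]

The inner bracket is [katex]-2[/katex] for [katex]a > 1[/katex] and [katex]+2[/katex] for [katex]a < 1[/katex], which gives [katex]2/a^2[/katex] and zero respectively. With [katex]a = z/R[/katex], the [katex]\left\lvert a - 1 \right\rvert[/katex] here is [katex]\sqrt{ R^2 + z^2 - 2 R z }/R[/katex], so taking the wrong sign of that root would swap the inside and outside results.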

The notation I prefer for relativistic geometric algebra uses Hestenes’ space time algebra (STA) [2], where the basis is a four dimensional space \( \setlr{ \gamma_\mu } \), subject to Dirac matrix like relations \( \gamma_\mu \cdot \gamma_\nu = \eta_{\mu \nu} \).

In this formalism a four vector is just the sum of the products of coordinates and basis vectors, for example, using summation convention

x = x^\mu \gamma_\mu.

The invariant for a four-vector in STA is just the square of that vector

x^2
&= (x^\mu \gamma_\mu) \cdot (x^\nu \gamma_\nu) \\
&= \sum_\mu (x^\mu)^2 (\gamma_\mu)^2 \\
&= (x^0)^2 - \sum_{k = 1}^3 (x^k)^2 \\
&= (ct)^2 - \Bx^2.

Recall that a four-vector is time-like if this squared-length is positive, spacelike if negative, and light-like when zero.

Time-like projections are possible by dotting with the “lab-frame” time like basis vector \( \gamma_0 \)

ct = x \cdot \gamma_0 = x^0,

and space-like projections are wedges with the same

\Bx = x \wedge \gamma_0 = x^k \sigma_k,

where sums over Latin indexes \( k \in \setlr{1,2,3} \) are implied, and where the elements \( \sigma_k \)

\sigma_k = \gamma_k \gamma_0,

which are bivectors in STA, can be viewed as a Euclidean vector basis \( \setlr{ \sigma_k } \).

Rotations in STA involve exponentials of space-like bivectors \( \theta = a_{ij} \gamma_i \wedge \gamma_j \)

x' = e^{ \theta/2 } x e^{ -\theta/2 }.

Boosts, on the other hand, have exactly the same form, but the exponentials are with respect to spacetime bivector arguments, such as \( \theta = a \wedge \gamma_0 \), where \( a \) is any four-vector.

Observe that both boosts and rotations necessarily conserve the space-time length of a four vector (or any multivector with a scalar square).

\lr{x'}^2
&= \lr{ e^{ \theta/2 } x e^{ -\theta/2 } } \lr{ e^{ \theta/2 } x e^{ -\theta/2 } } \\
&= e^{ \theta/2 } x \lr{ e^{ -\theta/2 } e^{ \theta/2 } } x e^{ -\theta/2 } \\
&= e^{ \theta/2 } x^2 e^{ -\theta/2 } \\
&= x^2 e^{ \theta/2 } e^{ -\theta/2 } \\
&= x^2.


Paravectors, as used by Baylis [1], represent four-vectors using a Euclidean multivector basis \( \setlr{ \Be_\mu } \), where \( \Be_0 = 1 \). The conversion between STA and paravector notation requires only multiplication with the timelike basis vector for the lab frame \( \gamma_0 \)

X
&= x \gamma_0 \\
&= \lr{ x^0 \gamma_0 + x^k \gamma_k } \gamma_0 \\
&= x^0 + x^k \gamma_k \gamma_0 \\
&= x^0 + \Bx \\
&= c t + \Bx.

We need a different structure for the invariant length in paravector form. That invariant length is
x^2
&= \lr{ \lr{ ct + \Bx } \gamma_0 }
\lr{ \lr{ ct + \Bx } \gamma_0 } \\
&= \lr{ \lr{ ct + \Bx } \gamma_0 }
\lr{ \gamma_0 \lr{ ct - \Bx } } \\
&= \lr{ ct + \Bx }
\lr{ ct - \Bx }.

Baylis introduces an involution operator \( \overline{{M}} \) which toggles the sign of any vector or bivector grades of a multivector. For example, if \( M = a + \Ba + I \Bb + I c \), where \( a,c \in \mathbb{R} \) and \( \Ba, \Bb \in \mathbb{R}^3 \) is a multivector with all grades \( 0,1,2,3 \), then the involution of \( M \) is

\overline{{M}} = a - \Ba - I \Bb + I c.

Utilizing this operator, the invariant length for a paravector \( X \) is \( X \overline{{X}} \).

Let’s consider how boosts and rotations can be expressed in the paravector form. The half angle operator for a boost along the spacelike \( \Bv = v \vcap \) direction has the form

L = e^{ -\vcap \phi/2 },

so the boosted paravector is

X'
&= c t' + \Bx' \\
&= x' \gamma_0 \\
&= L x L^\dagger \gamma_0 \\
&= e^{ -\vcap \phi/2 } x^\mu \gamma_\mu
e^{ \vcap \phi/2 } \gamma_0 \\
&= e^{ -\vcap \phi/2 } x^\mu \gamma_\mu \gamma_0
e^{ -\vcap \phi/2 } \\
&= e^{ -\vcap \phi/2 } \lr{ x^0 + \Bx } e^{ -\vcap \phi/2 } \\
&= L X L.

Because the involution operator toggles the sign of vector grades, so that \( L \overline{{L}} = 1 \), it is easy to see that the required invariance is maintained

X' \overline{{X'}}
&= L X L \overline{{ L X L }} \\
&= L X L \overline{{ L }} \overline{{ X }} \overline{{ L }} \\
&= L X \overline{{ X }} \overline{{ L }} \\
&= X \overline{{ X }} L \overline{{ L }} \\
&= X \overline{{ X }}.

Let’s explicitly expand the boost transformation \( X' = L X L \), so we can relate the rapidity angle \( \phi \) to the magnitude of the velocity. This is most easily done by splitting the spacelike component \( \Bx \) of the four vector into its projective and rejective components

\Bx
&= \vcap \vcap \Bx \\
&= \vcap \lr{ \vcap \cdot \Bx + \vcap \wedge \Bx } \\
&= \vcap \lr{ \vcap \cdot \Bx } + \vcap \lr{ \vcap \wedge \Bx } \\
&= \Bx_\parallel + \Bx_\perp.

The exponential

e^{-\vcap \phi/2}
= \cosh\lr{ \phi/2 }
- \vcap \sinh\lr{ \phi/2 },

commutes with any scalar grades and with \( \Bx_\parallel \), but anticommutes with \( \Bx_\perp \), so

L X L
&= \lr{ c t + \Bx_\parallel } e^{ -\vcap \phi/2 } e^{ -\vcap \phi/2 }
+ \Bx_\perp e^{ \vcap \phi/2 } e^{ -\vcap \phi/2 } \\
&= \lr{ c t + \Bx_\parallel } e^{ -\vcap \phi }
+ \Bx_\perp \\
&= \lr{ c t + \vcap \lr{ \vcap \cdot \Bx } } \lr{ \cosh \phi - \vcap \sinh \phi }
+ \Bx_\perp \\
&= \Bx_\perp
+ \lr{ c t \cosh\phi - \lr{ \vcap \cdot \Bx} \sinh \phi }
+ \vcap \lr{ \lr{ \vcap \cdot \Bx } \cosh\phi - c t \sinh \phi } \\
&= \Bx_\perp
+ \cosh\phi \lr{ c t - \lr{ \vcap \cdot \Bx} \tanh \phi }
+ \vcap \cosh\phi \lr{ \vcap \cdot \Bx - c t \tanh \phi }.

Employing the argument from [3], we want \( \phi \) defined so that this has the structure of a Galilean transformation in the limit where \( \phi \rightarrow 0 \). This means we equate

\tanh \phi = \frac{v}{c},

so that for small \(\phi\)

\Bx' = \Bx - \Bv t.

We can solve for \( \sinh^2 \phi \) and \( \cosh^2 \phi \) in terms of \( v/c \) using

\tanh^2 \phi
= \frac{v^2}{c^2}
= \frac{ \sinh^2 \phi }{1 + \sinh^2 \phi}
= \frac{ \cosh^2 \phi - 1 }{\cosh^2 \phi},

which after picking the positive root required for Galilean equivalence gives
\cosh \phi &= \frac{1}{\sqrt{1 - (\Bv/c)^2}} \equiv \gamma \\
\sinh \phi &= \frac{v/c}{\sqrt{1 - (\Bv/c)^2}} = \gamma v/c.

The Lorentz boost, written out in full, is

ct' + \Bx'
= \gamma \lr{ c t - \frac{\Bv}{c} \cdot \Bx }
+ \Bx_\perp
+ \gamma \lr{ \vcap \lr{ \vcap \cdot \Bx } - \Bv t }.
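
As a numerical sanity check of this boost, the sketch below applies it in components and confirms that \( (ct)^2 - \Bx^2 \) is unchanged. It assumes the perpendicular component \( \Bx_\perp \) passes through the boost untouched, as in the exponential expansion above, and takes the velocity as \( \Bv/c \); the function and parameter names are hypothetical.

```python
import math

def boost(ct, x, beta_vec):
    """Boost the event (ct, x) by velocity v = c * beta_vec; returns (ct', x')."""
    beta = math.sqrt(sum(b * b for b in beta_vec))
    if beta == 0.0:
        return ct, tuple(x)
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    vhat = tuple(b / beta for b in beta_vec)
    x_par = sum(n * xi for n, xi in zip(vhat, x))              # vhat . x
    x_perp = tuple(xi - n * x_par for n, xi in zip(vhat, x))   # rejection of x from vhat
    ct_new = gamma * (ct - beta * x_par)
    x_new = tuple(p + n * gamma * (x_par - beta * ct) for p, n in zip(x_perp, vhat))
    return ct_new, x_new

def interval(ct, x):
    """Invariant (ct)^2 - x^2."""
    return ct * ct - sum(xi * xi for xi in x)

ct, x = 2.0, (1.0, -0.5, 0.25)
beta_vec = (0.3, 0.4, 0.0)  # |v|/c = 0.5
ct_b, x_b = boost(ct, x, beta_vec)
print(interval(ct, x), interval(ct_b, x_b))

# Boosting back with the opposite velocity should recover the original event.
ct_r, x_r = boost(ct_b, x_b, tuple(-b for b in beta_vec))
print(ct_r, x_r)
```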

Authors like Chappelle, et al., who also use paravectors [4], specify the form of the Lorentz transformation for the electromagnetic field, but for that transformation reversion is used instead of involution. I plan to explore that in a later post, starting from the STA formalism that I already understand, and see if I can make sense of the underlying rationale.


