
Green’s function for the spacetime gradient (and solution of Maxwell’s equation)

October 28, 2025 math and physics play

[Click here for a PDF version of this post]

Motivation

I’ve been assembling a table of all the Green’s functions that can be used in electrodynamics. There’s one set of those Green’s functions left to fill in, the Green’s functions for the spacetime gradient:
\begin{equation}\label{eqn:spacetimeGradientGreens:20}
\lr{\spacegrad + \inv{c}\PD{t}{}} G(\Bx, \Bx’, t, t’) = \delta(\Bx – \Bx’)\delta(t – t’).
\end{equation}
I’d like to compute the retarded and advanced Green’s function for this operator for the 1D, 2D and 3D cases.

In [2] I used the retarded time Green’s function for the spacetime gradient to derive Jefimenko’s equations. However, in retrospect, my handling of that material was sloppy. The starting point was the retarded wave equation Green’s function, which I didn’t even derive, instead just lazily pointing to other authors who did.
I also never actually stated the spacetime gradient Green’s function itself, just a sequence of intermediate results from what would have been its derivation. Even worse, all of that is scattered roughshod across chapters II and III, as well as the appendix.

The idea.

Suppose that we know the Green’s functions for the wave equation
\begin{equation}\label{eqn:spacetimeGradientGreens:40}
\lr{\spacegrad^2 – \inv{c^2}\frac{\partial^2}{\partial t^2}} G_r(\Bx, \Bx’, t, t’) = \delta(\Bx – \Bx’)\delta(t – t’).
\end{equation}
The wave equation operator factors into a product of first order spacetime gradient operators, so this Green’s function also satisfies
\begin{equation}\label{eqn:spacetimeGradientGreens:60}
\lr{\spacegrad + \inv{c}\frac{\partial}{\partial t}} \lr{\spacegrad – \inv{c}\frac{\partial}{\partial t}} G_r(\Bx, \Bx’, t, t’) = \delta(\Bx – \Bx’)\delta(t – t’).
\end{equation}
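This factorization is easily verified, since \( \spacegrad \) and \( \partial_t \) commute
\begin{equation*}
\lr{\spacegrad + \inv{c}\frac{\partial}{\partial t}} \lr{\spacegrad - \inv{c}\frac{\partial}{\partial t}}
=
\spacegrad^2
- \inv{c} \spacegrad \frac{\partial}{\partial t}
+ \inv{c} \frac{\partial}{\partial t} \spacegrad
- \inv{c^2}\frac{\partial^2}{\partial t^2}
=
\spacegrad^2 - \inv{c^2}\frac{\partial^2}{\partial t^2}.
\end{equation*}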
This means that the Green’s function for the spacetime gradient, a multivector valued entity, satisfying \ref{eqn:spacetimeGradientGreens:20}, is
\begin{equation}\label{eqn:spacetimeGradientGreens:80}
G(\Bx, \Bx’, t, t’) = \lr{\spacegrad – \inv{c}\frac{\partial}{\partial t}} G_r(\Bx, \Bx’, t, t’).
\end{equation}
So if we have a Green’s function for the wave equation, it’s just a matter of taking derivatives to figure out the Green’s function for the spacetime gradient.

Why do we care? Recall that the multivector form of Maxwell’s equations is just
\begin{equation}\label{eqn:spacetimeGradientGreens:100}
\lr{\spacegrad + \inv{c}\frac{\partial}{\partial t}} F = J,
\end{equation}
so, if we know the Green’s function for this non-homogeneous problem, we may simply invert this equation for \( F \) with a convolution. This is how we can obtain Jefimenko’s equations in one fell swoop.

Now let’s evaluate these derivatives.

3D case.

Retarded case.

I’m going to start with the 3D retarded case, since I know the answer for that, and at least nominally, have all the composite parts of that derivation at hand. Then we can move on and compute the same for the advanced case, and then the 2D and 1D variants for fun. It’s not clear to me that we necessarily care about the 1D and 2D cases. I can imagine that there are circumstances where weird geometries or constraints force 1D and 2D solutions, but perhaps the 1D and 2D solutions will be academic and not practical.

Recall that the 3D retarded Green’s function for the wave equation was found to be
\begin{equation}\label{eqn:spacetimeGradientGreens:120}
G_r = -\inv{4 \pi r} \delta\lr{ t – t’ – r/c },
\end{equation}
where \( \Br = \Bx – \Bx’, r = \Abs{\Br} \).

Lemma 1.1: Gradient of \(\Abs{\Bx – \Bx’} \).

The gradient of the scalar \( r = \Abs{\Bx – \Bx’} \) is
\begin{equation*}
\spacegrad \Abs{\Bx – \Bx’} = \frac{\Br}{r}.
\end{equation*}
This will be written as \( \spacegrad r = \rcap \), with \( \rcap = \Br/r \).

Start proof:

\begin{equation}\label{eqn:spacetimeGradientGreens:140}
\begin{aligned}
\spacegrad \Abs{\Bx – \Bx’}
&=
\sum_m \Be_m \partial_m \sqrt{ \sum_n (x_n – x_n’)^2 } \\
&=
\sum_m \Be_m \inv{2} \frac{ 2 (x_m - x_m') }{ \sqrt{ \sum_n (x_n - x_n')^2 } } \\
&=
\sum_m \Be_m \frac{x_m - x_m'}{r} \\
&= \frac{\Br}{r}.
\end{aligned}
\end{equation}

End proof.
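This lemma is easy to spot check numerically (a minimal sketch using numpy central differences, with an arbitrarily chosen pair of points):

```python
import numpy as np

# Check grad |x - x'| = (x - x')/|x - x'| with central differences.
x = np.array([1.0, -2.0, 0.5])    # arbitrary field point
xp = np.array([0.2, 0.3, -1.0])   # arbitrary source point x'
h = 1e-6

r = lambda y: np.linalg.norm(y - xp)
grad = np.array([(r(x + h * e) - r(x - h * e)) / (2 * h) for e in np.eye(3)])

rcap = (x - xp) / r(x)
assert np.allclose(grad, rcap, atol=1e-8)
```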

This means, suppressing the arguments of the delta function, that
\begin{equation}\label{eqn:spacetimeGradientGreens:160}
\begin{aligned}
\lr{ \spacegrad -(1/c) \partial_t } G_r
&= -\inv{4 \pi} \lr{
(\spacegrad r) \frac{\partial_r \delta}{r} + (\spacegrad r) \lr{ -\frac{1}{r^2}}\delta
– \inv{c r} \partial_t \delta
} \\
&= -\inv{4 \pi} \lr{ \frac{\rcap}{r} \partial_r \delta -\frac{\rcap}{r^2} \delta – \inv{c r} \partial_t \delta} \\
&= -\inv{4 \pi r} \lr{ \rcap \partial_r \delta – \frac{\rcap}{r} \delta – \inv{c} \partial_t \delta} \\
\end{aligned}
\end{equation}

Lemma 1.2: Derivatives of the delta function.

The derivative of the delta function (with respect to a non-integration variable parameter \( u \)) is
\begin{equation*}
\frac{d}{du} \delta( a u + b – t’ ) = a \delta( a u + b – t’ ) \frac{d}{dt’},
\end{equation*}
where \( t’ \) is the integration parameter for the delta function.

Observe that this is different from the usual identity
\begin{equation}\label{eqn:spacetimeGradientGreens:200}
\frac{d}{dt’} \delta(t’) = -\delta(t’) \frac{d}{dt’}.
\end{equation}

Start proof:

As usual, we figure out the meaning of these delta function derivatives by their action on a test function in a convolution.
\begin{equation}\label{eqn:spacetimeGradientGreens:220}
\int_{-\infty}^\infty \frac{d}{du} \delta( a u + b – t’ ) f(t’) dt’.
\end{equation}

Let’s start with a change of variables \( z = a u + b – t’ \), for which we find
\begin{equation}\label{eqn:spacetimeGradientGreens:240}
\begin{aligned}
t’ &= a u + b – z \\
dz &= – dt’ \\
\frac{d}{du} &= \frac{dz}{du} \frac{d}{dz} = a \frac{d}{dz}.
\end{aligned}
\end{equation}

Substitution back into \ref{eqn:spacetimeGradientGreens:220} gives
\begin{equation}\label{eqn:spacetimeGradientGreens:260}
\begin{aligned}
\int_{-\infty}^\infty \frac{d}{du} \delta( a u + b – t’ ) f(t’) dt’
&=
a \int_{\infty}^{-\infty} \lr{ \frac{d}{dz} \delta( z ) } f( a u + b – z ) (-dz) \\
&=
a \int_{-\infty}^{\infty} \lr{ \frac{d}{dz} \delta( z ) } f( a u + b – z ) dz \\
&=
\evalrange{a \delta(z) f( a u + b – z)}{-\infty}{\infty} \\
&\qquad –
a \int_{-\infty}^{\infty} \delta( z ) \frac{d}{dz} f( a u + b – z ) dz \\
&=
– \evalbar{ a \frac{d}{dz} f( a u + b – z ) }{z = 0} \\
&=
– \evalbar{ a \frac{d}{d(au + b – t’)} f( t’ ) }{t’ = a u + b} \\
&=
+ \evalbar{ a \frac{d}{d(t’ -(au + b))} f( t’ ) }{t’ = a u + b} \\
&=
\evalbar{ a \frac{dt’}{d(t’ – (a u + b))} \frac{d}{dt’} f( t’ ) }{t’ = a u + b} \\
&=
\evalbar{ a \frac{d}{dt’} f( t’ ) }{t’ = a u + b} \\
&=
\int_{-\infty}^\infty a \delta(a u + b – t’) \frac{df(t’)}{dt’} dt’.
\end{aligned}
\end{equation}

End proof.
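This lemma can also be spot checked numerically, using a narrow Gaussian as a nascent delta function (a sketch; the values of \( a, b, u \) and the test function are arbitrary):

```python
import numpy as np

eps = 1e-3
delta = lambda s: np.exp(-s**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

a, b, u = 3.0, 0.25, 0.4                # arbitrary parameters
f, fp = np.cos, lambda t: -np.sin(t)    # test function and its derivative

tp = np.linspace(-10.0, 10.0, 400001)   # t' integration grid
dt = tp[1] - tp[0]

# LHS: d/du of the convolution, by central difference in u.
I = lambda uu: np.sum(delta(a * uu + b - tp) * f(tp)) * dt
h = 1e-5
lhs = (I(u + h) - I(u - h)) / (2 * h)

# RHS: the derivative transferred onto the test function.
rhs = np.sum(a * delta(a * u + b - tp) * fp(tp)) * dt

print(lhs, rhs, a * fp(a * u + b))      # all three should agree closely
```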

In particular, this means that
\begin{equation}\label{eqn:spacetimeGradientGreens:280}
\begin{aligned}
\partial_r \delta(t – t’ – r/c) &= -\frac{1}{c} \delta(t – t’ – r/c) \PD{t’}{} \\
\partial_t \delta(t – t’ – r/c) &= \delta(t – t’ – r/c) \PD{t’}{} \\
\end{aligned}
\end{equation}

Application to \ref{eqn:spacetimeGradientGreens:160} gives
\begin{equation}\label{eqn:spacetimeGradientGreens:300}
\begin{aligned}
\lr{ \spacegrad -(1/c) \partial_t } G_r
&=
\inv{4 \pi r} \delta(t – t’ – r/c)
\lr{
\frac{\rcap}{r}
+
\lr{ \rcap + 1} \inv{c} \PD{t’}{}
} \\
\end{aligned}
\end{equation}
With \( t_r = t – r/c \), \ref{eqn:spacetimeGradientGreens:80} is found to be
\begin{equation}\label{eqn:spacetimeGradientGreens:320}
G(\Bx, \Bx’, t, t’) = \inv{4 \pi r} \delta(t_r – t’)
\lr{
\frac{\rcap}{r}
+
\lr{ \rcap + 1} \inv{c} \PD{t_r}{}
}
\end{equation}

Advanced case.

The advanced Green’s function for the wave equation is
\begin{equation}\label{eqn:spacetimeGradientGreens:340}
G_a(\Bx, \Bx’, t, t’) = -\inv{4 \pi r} \delta\lr{ t’ – t – r/c },
\end{equation}
so with \( t_a = t + r/c \), we must evaluate the delta function derivatives. Since the delta function is even, we can write \( \delta\lr{ t' - t - r/c } = \delta\lr{ t + r/c - t' } \), which matches the form assumed in the lemma, so
\begin{equation}\label{eqn:spacetimeGradientGreens:360}
\begin{aligned}
\partial_r \delta\lr{ t' - t - r/c } &= \inv{c} \delta\lr{ t' - t_a } \frac{d}{dt_a} \\
\partial_t \delta\lr{ t' - t - r/c } &= \delta\lr{ t' - t_a } \frac{d}{dt_a}.
\end{aligned}
\end{equation}
The Green's function for the spacetime gradient is then
\begin{equation}\label{eqn:spacetimeGradientGreens:380}
\begin{aligned}
G(\Bx, \Bx', t, t')
&= -\inv{4 \pi r} \lr{ \rcap \partial_r \delta - \frac{\rcap}{r} \delta - \inv{c} \partial_t \delta} \\
&= \inv{4 \pi r} \delta\lr{t' - t_a} \lr{ \frac{\rcap}{r} + \lr{ 1 - \rcap} \inv{c} \frac{d}{d t_a}}.
\end{aligned}
\end{equation}

Application: Maxwell’s equation.

Let’s use this to solve Maxwell’s equation. Finding a specific solution is now trivial. The retarded solution is
\begin{equation}\label{eqn:spacetimeGradientGreens:400}
\begin{aligned}
F(\Bx, t)
&= \int d^3 \Bx' dt' \gpgrade{
G(\Bx, \Bx’, t, t’) J(\Bx’, t’)
}{1,2} \\
&= \inv{ 4 \pi } \int d^3 \Bx’ dt’
\delta(t_r – t’)
\gpgrade{
\inv{r}
\lr{
\frac{\rcap}{r}
+
\lr{ \rcap + 1} \inv{c} \PD{t_r}{}
}
J(\Bx’, t’)
}{1,2} \\
&=
\inv{ 4 \pi } \int d^3 \Bx’
\gpgrade{
\inv{r}
\lr{
\frac{\rcap}{r} J(\Bx’, t_r)
+
\lr{ \rcap + 1} \inv{c} J'(\Bx’, t_r)
}
}{1,2},
\end{aligned}
\end{equation}
where \( J'(\Bx’, t_r) = \PDi{t_r}{J(\Bx’, t_r)} \).
Similarly, the advanced solution is
\begin{equation}\label{eqn:spacetimeGradientGreens:520}
F(\Bx, t) =
\inv{ 4 \pi } \int d^3 \Bx’
\gpgrade{
\inv{r}
\lr{
\frac{\rcap}{r} J(\Bx’, t_a)
+
\lr{ 1 - \rcap} \inv{c} J'(\Bx', t_a)
}
}{1,2},
\end{equation}
where derivatives are with respect to \( t_a \). In general, we are free to form a superposition of both the retarded and advanced solutions, as well as any solution of the homogeneous equation for charge and current free space \( \lr{ \spacegrad + (1/c) \partial_t } F = 0 \).

There’s a lot of abstraction baked into these solutions. One of those abstractions is the multivector charge and current density \( J \)
\begin{equation}\label{eqn:spacetimeGradientGreens:420}
J = \eta \lr{ c \rho – \BJ } + I \lr{ c \rho_\txtm – \BM },
\end{equation}
where \( \rho_\txtm, \BM \) are the fictitious magnetic sources that are used in engineering antenna and microwave circuit theory. We can ignore those if we choose. We also have the abstraction of the multivector field \( F = \BE + I \eta \BH = \BE + I c \BB \) itself on the LHS.

Let’s unpack this solution into its constituent electric and magnetic field components, to see if the result looks more familiar. First note that
\begin{equation}\label{eqn:spacetimeGradientGreens:440}
\begin{aligned}
\gpgrade{\rcap J}{1}
&=
\gpgrade{
\rcap \eta \lr{ c \rho – \BJ } + \rcap I \lr{ c \rho_\txtm – \BM }
}{1} \\
&=
\eta c \rho \rcap
– I \rcap \wedge \BM \\
&=
\frac{\rho}{\epsilon} \rcap
+ \rcap \cross \BM,
\end{aligned}
\end{equation}
and
\begin{equation}\label{eqn:spacetimeGradientGreens:460}
\begin{aligned}
\gpgrade{\rcap J}{2}
&=
\gpgrade{
\rcap \eta \lr{ c \rho – \BJ } + \rcap I \lr{ c \rho_\txtm – \BM }
}{2} \\
&=
I \lr{
– \eta \rcap \cross \BJ
+ \rcap c \rho_\txtm
} \\
&=
I \eta \lr{
\BJ \cross \rcap
+ \rcap \frac{\rho_\txtm}{\mu}
}.
\end{aligned}
\end{equation}
Selecting the vector and bivector components of the field \( F = \BE + I \eta \BH \), we have
\begin{equation}\label{eqn:spacetimeGradientGreens:480}
\BE(\Bx, t)
=
\inv{4 \pi \epsilon}
\int d^3 \Bx’
\lr{
\frac{\rho}{r^2} \rcap
\pm \frac{\rho'}{c r} \rcap
+ \epsilon \frac{\rcap}{r^2} \cross \BM
\pm \frac{\epsilon \rcap}{c r} \cross \BM'
- \frac{1}{c^2 r} \BJ'
}
\end{equation}
and
\begin{equation}\label{eqn:spacetimeGradientGreens:500}
\BH(\Bx, t)
=
\inv{4 \pi \mu}
\int d^3 \Bx’
\lr{
\frac{\rho_\txtm}{r^2} \rcap
\pm \frac{\rho_\txtm'}{c r} \rcap
+ \mu \BJ \cross \frac{\rcap}{r^2}
\pm \mu \BJ' \cross \frac{\rcap}{c r}
- \inv{c^2 r} \BM'
},
\end{equation}
where the upper signs are for the retarded solution, with times and derivatives with respect to the retarded time \( t_r = t - \Abs{\Bx - \Bx'}/c \), and the lower signs are for the advanced solution, where times are evaluated at the advanced time \( t_a = t + \Abs{\Bx - \Bx'}/c \).
For the retarded case, if we zero the fictitious sources, setting \( \rho_\txtm = 0, \BM = 0 \), these are Jefimenko’s equations, as seen in [1]. Griffiths derives them by first solving for the potential functions that solve the 2nd order scalar wave equation problem, and then computing all the derivatives.

1D case.

The Green’s function for the 1D spacetime gradient is easy to compute, since the 1D wave equation Green’s functions are just scaled and shifted step functions \( G_{r,a} = -(c/2) \Theta(\pm (t - t') - r/c) \), with the upper sign for the retarded case, and the lower for the advanced case. Taking derivatives,
\begin{equation}\label{eqn:spacetimeGradientGreens:540}
\begin{aligned}
G
&= -\frac{c}{2} \lr{ \spacegrad – \inv{c} \partial_t } \Theta(\pm (t – t’) – r/c) \\
&=
-\frac{c}{2} \lr{
-\inv{c} \rcap – \inv{c} (\pm 1)
}
\delta(\pm (t – t’) – r/c) \\
&=
\inv{2} \lr{ \rcap \pm 1 } \delta(\pm (t – t’) – r/c).
\end{aligned}
\end{equation}

2D case.

The Green’s function for the 2D spacetime gradient is computed from the 2D wave equation Green’s function the same way
\begin{equation}\label{eqn:spacetimeGradientGreens:560}
G = -\inv{2 \pi}
\lr{ \spacegrad - \inv{c} \partial_t }
\frac{\Theta(\pm (t - t') - r/c) }{
\sqrt{ \tau^2 - r^2/c^2 }
},
\end{equation}
where \( \tau = t - t' \).

The derivatives of the step are
\begin{equation}\label{eqn:spacetimeGradientGreens:580}
\begin{aligned}
\lr{ \spacegrad – \inv{c} \partial_t } \Theta(\pm (t – t’) – r/c)
&=
\lr{
-\inv{c} \rcap -\inv{c} (\pm 1)
}
\delta(\pm (t – t’) – r/c) \\
&=
-\inv{c} \lr{ \rcap \pm 1 }
\delta(\pm \tau – r/c).
\end{aligned}
\end{equation}
and the derivative of the denominator is
\begin{equation}\label{eqn:spacetimeGradientGreens:600}
\begin{aligned}
\lr{ \spacegrad – \inv{c} \partial_t }
\lr{(t – t’)^2 – r^2/c^2}^{-1/2}
&=
-\inv{2}(2) \lr{ -\inv{c^2} r \rcap -\inv{c} (t – t’) }
\lr{(t – t’)^2 – r^2/c^2}^{-3/2} \\
&=
\inv{c^2} \lr{ \Br + c \tau }
\lr{\tau^2 – r^2/c^2}^{-3/2}.
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:spacetimeGradientGreens:620}
G(r, \tau) =
\frac{
\lr{\tau^2 – r^2/c^2}^{-3/2}
}{2 \pi c^2}
\lr{
c \lr{ \rcap \pm 1 }
\lr{\tau^2 – r^2/c^2}
\delta(\pm \tau – r/c)
-\lr{ \Br + c \tau }
\Theta(\pm \tau – r/c)
}.
\end{equation}

References

[1] David J. Griffiths. Introduction to Electrodynamics. Prentice Hall, Upper Saddle River, NJ, 3rd edition, 1999.

[2] Peeter Joot. Geometric Algebra for Electrical Engineers. Kindle Direct Publishing, Toronto, 2019.

Weighted geometric series

March 23, 2025 math and physics play

[Click here for a PDF version of this post]

Karl needed to evaluate the sum:

\begin{equation}\label{eqn:weightedGeometric:20}
S = \sum_{k = 0}^9 \frac{a + b k}{\lr{ 1 + i }^k}
\end{equation}

He ended up using a spreadsheet, which was a quick and effective way to deal with the problem. I was curious about this sum, since he asked me how to sum it symbolically, and I didn’t know.

Mathematica doesn’t have any problem with it, as seen in fig. 1.

fig. 1. A funky sum.

How can we figure this out?

Let’s write \( r = 1/(1+i) \) to start with, and break up the sum into constituent parts
\begin{equation}\label{eqn:weightedGeometric:40}
\begin{aligned}
S_n
&= \sum_{k = 0}^n \frac{a + b k}{\lr{ 1 + i }^k} \\
&= a \sum_{k = 0}^n r^k + b \sum_{k = 0}^n k r^k.
\end{aligned}
\end{equation}
We can evaluate the geometric part of this easily using the usual trick. Let
\begin{equation}\label{eqn:weightedGeometric:60}
T_n = \sum_{k = 0}^n r^k,
\end{equation}
then
\begin{equation}\label{eqn:weightedGeometric:80}
r T_n – T_n = r^{n+1} – 1,
\end{equation}
so
\begin{equation}\label{eqn:weightedGeometric:100}
T_n = \frac{r^{n+1} – 1}{r – 1}.
\end{equation}
Now we just have to figure out how to sum
\begin{equation}\label{eqn:weightedGeometric:120}
G_n = \sum_{k = 0}^n k r^k = \sum_{k = 1}^n k r^k.
\end{equation}
This looks suspiciously like the derivative of a geometric series. Let’s evaluate such a derivative, as a function of r:
\begin{equation}\label{eqn:weightedGeometric:140}
\begin{aligned}
\frac{d T_n(r)}{dr}
&= \sum_{k = 0}^n \frac{d}{dr} r^k \\
&= \sum_{k = 0}^n k r^{k-1} \\
&= \sum_{k = 1}^n k r^{k-1} \\
&= \inv{r} \sum_{k = 1}^n k r^k.
\end{aligned}
\end{equation}
Having summed the geometric series, we may also take the derivative of that summed result
\begin{equation}\label{eqn:weightedGeometric:160}
\begin{aligned}
\frac{d T_n(r)}{dr}
&= \frac{d}{dr} \lr{ \frac{r^{n+1} – 1}{r – 1} } \\
&= \frac{(n+1)r^n}{r – 1} – \frac{r^{n+1} – 1}{\lr{r -1}^2} \\
&= \frac{(n+1)r^n(r-1) – \lr{r^{n+1} – 1}}{\lr{r -1}^2}.
\end{aligned}
\end{equation}
Putting the pieces together, since \( G_n = r \, dT_n(r)/dr \), we have
\begin{equation}\label{eqn:weightedGeometric:180}
G_n = \frac{r}{\lr{r -1}^2} \lr{ (n+1)r^n(r-1) - \lr{r^{n+1} - 1} }.
\end{equation}

This means that our sum is
\begin{equation}\label{eqn:weightedGeometric:200}
S_n = a \frac{r^{n+1} – 1}{r – 1} + b \frac{r}{\lr{r -1}^2} \lr{ (n+1)r^n(r-1) – \lr{r^{n+1} – 1} }.
\end{equation}
Putting back \( r = 1/(1+i) \), and subsequent simplification, gives the Mathematica result. It’s not pretty, but at least we can do it if we want to.
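The closed form is also easy to double check against a brute force evaluation (a Python sketch; the values of \( a, b, i, n \) are arbitrary stand-ins, not Karl's actual numbers):

```python
a, b, i, n = 1.0, 2.0, 0.05, 9
r = 1 / (1 + i)

brute = sum((a + b * k) * r**k for k in range(n + 1))

T = (r**(n + 1) - 1) / (r - 1)                               # geometric sum
G = r / (r - 1)**2 * ((n + 1) * r**n * (r - 1) - (r**(n + 1) - 1))
closed = a * T + b * G

print(brute, closed)   # should agree to machine precision
```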

Shout out to Grok, which pointed out the derivative trick for the second series. I’d forgotten that one.

New version of Geometric Algebra for Electrical Engineers published.

December 9, 2023 Geometric Algebra for Electrical Engineers

A new version of my book is now published.  The free PDF and the leanpub versions are available now.  The paperback and hardcover versions should be available on Amazon within the week.

What has changed:

  • V0.3.2 (Dec 8, 2023)
    • Add to helpful formulas: Determinant form of triple wedge.
    • Add figure showing the spherical polar conventions picked.
    • Add a problem showing that \( (e^x)’ = x’ e^x \) only when \( x \) and \( x’ \) commute, which is true for scalars and complex numbers, but not necessarily true for abstract entities, such as multivectors and square matrices.
    • Spherical polar coordinates: do not skip steps for \( \mathbf{x}_\phi \) computation.
    • Rewrite the Multivector potentials section. No longer pulling the ideas out of a magic hat, instead trying to motivate them.  Compromised on the strategy to do so, leaving some of the details to problems.

I’ve been working on this potentials rewrite indirectly for the last month, and have published two blog posts about the topic, as well as another that I wrote and discarded, but which helped me form and sequence some of the ideas.

The exponential derivative topic was also covered on my blog recently.  I’ve reworked that so that it is independent of the specific application to spherical polar coordinates, and set it as a problem for the reader (with a solution at the end of chapter I in case I didn’t give enough hints in the problem statement).

Radial vector representation, momentum, and angular momentum.

September 8, 2023 math and physics play

[Click here for a PDF version of this post], and here for a video version of this post.

 

Motivation.

In my last couple GA YouTube videos, circular and spherical coordinates were examined.

This post is a text representation of a new video that follows up on those two videos.

We found the form of the unit vector derivatives in both of those specific cases. Here we start instead from the general radial representation of the position vector

\begin{equation}\label{eqn:radialderivatives:20}
\Bx = r \mathbf{\hat{r}},
\end{equation}
leaving the angular dependence of \( \mathbf{\hat{r}} \) unspecified. We want to find both \( \Bv = \Bx’ \) and \( \mathbf{\hat{r}}’\).

Derivatives.

Lemma 1.1: Radial length derivative.

The derivative of a spherical length \( r \) can be expressed as
\begin{equation*}
\frac{dr}{dt} = \mathbf{\hat{r}} \cdot \frac{d\Bx}{dt}.
\end{equation*}

Start proof:

We write \( r^2 = \Bx \cdot \Bx \), and take derivatives of both sides, to find
\begin{equation}\label{eqn:radialderivatives:60}
2 r \frac{dr}{dt} = 2 \Bx \cdot \frac{d\Bx}{dt},
\end{equation}
or
\begin{equation}\label{eqn:radialderivatives:80}
\frac{dr}{dt} = \frac{\Bx}{r} \cdot \frac{d\Bx}{dt} = \mathbf{\hat{r}} \cdot \frac{d\Bx}{dt}.
\end{equation}

End proof.

Application of the chain rule to \ref{eqn:radialderivatives:20} is straightforward
\begin{equation}\label{eqn:radialderivatives:100}
\Bx’ = r’ \mathbf{\hat{r}} + r \mathbf{\hat{r}}’,
\end{equation}
but we don’t know the form for \( \mathbf{\hat{r}}' \). We could proceed with a naive expansion of
\begin{equation}\label{eqn:radialderivatives:120}
\frac{d}{dt} \lr{ \frac{\Bx}{r} },
\end{equation}
but we can be sneaky, and perform a projective and rejective split of \( \Bx’ \) with respect to \( \mathbf{\hat{r}} \). That is
\begin{equation}\label{eqn:radialderivatives:140}
\begin{aligned}
\Bx’
&= \mathbf{\hat{r}} \mathbf{\hat{r}} \Bx’ \\
&= \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \Bx’ } \\
&= \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \cdot \Bx’ + \mathbf{\hat{r}} \wedge \Bx’} \\
&= \mathbf{\hat{r}} \lr{ r’ + \mathbf{\hat{r}} \wedge \Bx’}.
\end{aligned}
\end{equation}
We used our lemma in the last step above, and after distribution, find
\begin{equation}\label{eqn:radialderivatives:160}
\Bx’ = r’ \mathbf{\hat{r}} + \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bx’ }.
\end{equation}
Comparing to \ref{eqn:radialderivatives:100}, we see that
\begin{equation}\label{eqn:radialderivatives:180}
r \mathbf{\hat{r}}’ = \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bx’ }.
\end{equation}
We see that the radial unit vector derivative is proportional to the rejection of \( \mathbf{\hat{r}} \) from \( \Bx’ \)
\begin{equation}\label{eqn:radialderivatives:200}
\mathbf{\hat{r}}’ = \inv{r} \mathrm{Rej}_{\mathbf{\hat{r}}}(\Bx’) = \inv{r^3} \Bx \lr{ \Bx \wedge \Bx’ }.
\end{equation}
The vector \( \mathbf{\hat{r}}' \) is perpendicular to \( \mathbf{\hat{r}} \) for any parameterization of its orientation, or in symbols
\begin{equation}\label{eqn:radialderivatives:220}
\mathbf{\hat{r}} \cdot \mathbf{\hat{r}}’ = 0.
\end{equation}
We saw this for the circular and spherical parameterizations, and see now that this also holds more generally.
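This is easy to spot check numerically, computing the rejection component-wise as \( \Bx' - \lr{ \mathbf{\hat{r}} \cdot \Bx' } \mathbf{\hat{r}} \) (a numpy sketch, with an invented path \( \Bx(t) \)):

```python
import numpy as np

# A path x(t) with varying length and direction (arbitrarily invented).
x = lambda t: (2 + np.sin(t)) * np.array([np.cos(3 * t), np.sin(2 * t), t])

t, h = 0.7, 1e-6
xp = (x(t + h) - x(t - h)) / (2 * h)                     # x'
unit = lambda v: v / np.linalg.norm(v)
rcap_p = (unit(x(t + h)) - unit(x(t - h))) / (2 * h)     # rhat' by central difference

r = np.linalg.norm(x(t))
rcap = x(t) / r
rej = (xp - (rcap @ xp) * rcap) / r                      # (1/r) Rej_rhat(x')

assert np.allclose(rcap_p, rej, atol=1e-6)
```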

Angular momentum.

Let’s now write out the momentum \( \Bp = m \Bv \) for a point particle with mass \( m \), and determine the kinetic energy \( m \Bv^2/2 = \Bp^2/2m \) for that particle.

The momentum is
\begin{equation}\label{eqn:radialderivatives:320}
\begin{aligned}
\Bp
&= m r’ \mathbf{\hat{r}} + m \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bv } \\
&= m r’ \mathbf{\hat{r}} + \inv{r} \mathbf{\hat{r}} \lr{ \Br \wedge \Bp }.
\end{aligned}
\end{equation}
Observe that \( p_r = m r’ \) is the radial component of the momentum. It is natural to introduce a bivector valued angular momentum operator
\begin{equation}\label{eqn:radialderivatives:340}
L = \Br \wedge \Bp,
\end{equation}
splitting the momentum into a component that is strictly radial and a component that is tangential to the spherical surface of radius \( r \). That is
\begin{equation}\label{eqn:radialderivatives:360}
\Bp = p_r \mathbf{\hat{r}} + \inv{r} \mathbf{\hat{r}} L.
\end{equation}
Making use of the fact that \( \mathbf{\hat{r}} \) and \( \mathrm{Rej}_{\mathbf{\hat{r}}}(\Bx’) \) are perpendicular (so there are no cross terms when we square the momentum), the
kinetic energy is
\begin{equation}\label{eqn:radialderivatives:380}
\begin{aligned}
\inv{2m} \Bp^2
&= \inv{2m} \lr{ p_r \mathbf{\hat{r}} + \inv{r} \mathbf{\hat{r}} L }^2 \\
&= \inv{2m} p_r^2 + \inv{2 m r^2 } \mathbf{\hat{r}} L \mathbf{\hat{r}} L \\
&= \inv{2m} p_r^2 – \inv{2 m r^2 } \mathbf{\hat{r}} L^2 \mathbf{\hat{r}} \\
&= \inv{2m} p_r^2 – \inv{2 m r^2 } L^2 \mathbf{\hat{r}}^2,
\end{aligned}
\end{equation}
where we’ve used the anticommutative nature of \( \mathbf{\hat{r}} \) and \( L \) (i.e.: a sign swap is needed to swap them), and used the fact that \( L^2 \) is a scalar, allowing us to commute \( \mathbf{\hat{r}} \) with \( L^2 \). This leaves us with
\begin{equation}\label{eqn:radialderivatives:400}
E = \inv{2m} \Bp^2 = \inv{2m} p_r^2 – \inv{2 m r^2 } L^2.
\end{equation}
Observe that the radial momentum term and the angular momentum term are each non-negative, since \( L \) is a bivector, for which \( L^2 \le 0 \).
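Since \( -L^2 = \Abs{\Br \wedge \Bp}^2 = \Abs{\Br \cross \Bp}^2 \) in 3D, this energy split can also be spot checked with ordinary vector algebra (a numpy sketch with made up values):

```python
import numpy as np

m = 2.0
x = np.array([1.0, 2.0, -0.5])           # arbitrary position
p = m * np.array([0.3, -1.0, 0.7])       # arbitrary momentum

r = np.linalg.norm(x)
rcap = x / r
p_r = p @ rcap                           # radial momentum m r'
L2 = np.linalg.norm(np.cross(x, p))**2   # -L^2

E = p @ p / (2 * m)
E_split = p_r**2 / (2 * m) + L2 / (2 * m * r**2)
assert np.isclose(E, E_split)
```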

Problems.

Problem:

Find \ref{eqn:radialderivatives:200} without being sneaky.

Answer

\begin{equation}\label{eqn:radialderivatives:280}
\begin{aligned}
\mathbf{\hat{r}}’
&= \frac{d}{dt} \lr{ \frac{\Bx}{r} } \\
&= \inv{r} \Bx’ – \inv{r^2} \Bx r’ \\
&= \inv{r} \Bx’ – \inv{r} \mathbf{\hat{r}} r’ \\
&= \inv{r} \lr{ \Bx’ – \mathbf{\hat{r}} r’ } \\
&= \inv{r} \lr{ \mathbf{\hat{r}} \mathbf{\hat{r}} \Bx’ – \mathbf{\hat{r}} r’ } \\
&= \inv{r} \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \Bx’ – r’ } \\
&= \inv{r} \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \Bx’ – \mathbf{\hat{r}} \cdot \Bx’ } \\
&= \inv{r} \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bx’ }.
\end{aligned}
\end{equation}

Problem:

Show that \ref{eqn:radialderivatives:200} can be expressed as a triple vector cross product
\begin{equation}\label{eqn:radialderivatives:230}
\mathbf{\hat{r}}' = \inv{r^3} \lr{ \Bx \cross \Bx' } \cross \Bx.
\end{equation}

Answer

While this may be familiar from elementary calculus, such as in [1], we can show that it follows easily from our GA result
\begin{equation}\label{eqn:radialderivatives:300}
\begin{aligned}
\mathbf{\hat{r}}’
&= \inv{r} \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bx’ } \\
&= \inv{r} \gpgradeone{ \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bx’ } } \\
&= \inv{r} \gpgradeone{ \mathbf{\hat{r}} I \lr{ \mathbf{\hat{r}} \cross \Bx’ } } \\
&= \inv{r} \gpgradeone{ I \lr{ \mathbf{\hat{r}} \cdot \lr{ \mathbf{\hat{r}} \cross \Bx’ } + \mathbf{\hat{r}} \wedge \lr{ \mathbf{\hat{r}} \cross \Bx’ } } } \\
&= \inv{r} \gpgradeone{ I^2 \mathbf{\hat{r}} \cross \lr{ \mathbf{\hat{r}} \cross \Bx’ } } \\
&= \inv{r} \lr{ \mathbf{\hat{r}} \cross \Bx’ } \cross \mathbf{\hat{r}}.
\end{aligned}
\end{equation}

References

[1] S.L. Salas and E. Hille. Calculus: one and several variables. Wiley New York, 1990.

Jacobian and Hessian matrices

January 15, 2017 ece1505

[Click here for a PDF of this post with nicer formatting]

Motivation

In class this Friday the Jacobian and Hessian matrices were introduced, but I did not find the treatment terribly clear. Here is an alternate treatment, beginning with the gradient construction from [2], which uses a nice trick to frame the multivariable derivative operation as a single variable Taylor expansion.

Multivariable Taylor approximation

The Taylor series expansion for a scalar function \( g : {\mathbb{R}} \rightarrow {\mathbb{R}} \) about the origin is just

\begin{equation}\label{eqn:jacobianAndHessian:20}
g(t) = g(0) + t g'(0) + \frac{t^2}{2} g”(0) + \cdots
\end{equation}

In particular

\begin{equation}\label{eqn:jacobianAndHessian:40}
g(1) = g(0) + g'(0) + \frac{1}{2} g”(0) + \cdots
\end{equation}

Now consider \( g(t) = f( \Bx + \Ba t ) \), where \( f : {\mathbb{R}}^n \rightarrow {\mathbb{R}} \), \( g(0) = f(\Bx) \), and \( g(1) = f(\Bx + \Ba) \). The multivariable Taylor expansion now follows directly

\begin{equation}\label{eqn:jacobianAndHessian:60}
f( \Bx + \Ba)
= f(\Bx)
+ \evalbar{\frac{df(\Bx + \Ba t)}{dt}}{t = 0} + \frac{1}{2} \evalbar{\frac{d^2f(\Bx + \Ba t)}{dt^2}}{t = 0} + \cdots
\end{equation}

The first order term is

\begin{equation}\label{eqn:jacobianAndHessian:80}
\begin{aligned}
\evalbar{\frac{df(\Bx + \Ba t)}{dt}}{t = 0}
&=
\sum_{i = 1}^n
\frac{d( x_i + a_i t)}{dt}
\evalbar{\PD{(x_i + a_i t)}{f(\Bx + \Ba t)}}{t = 0} \\
&=
\sum_{i = 1}^n
a_i
\PD{x_i}{f(\Bx)} \\
&= \Ba \cdot \spacegrad f.
\end{aligned}
\end{equation}

Similarly, for the second order term

\begin{equation}\label{eqn:jacobianAndHessian:100}
\begin{aligned}
\evalbar{\frac{d^2 f(\Bx + \Ba t)}{dt^2}}{t = 0}
&=
\evalbar{\lr{
\frac{d}{dt}
\lr{
\sum_{i = 1}^n
a_i
\PD{(x_i + a_i t)}{f(\Bx + \Ba t)}
}
}
}{t = 0} \\
&=
\evalbar{
\lr{
\sum_{j = 1}^n
\frac{d(x_j + a_j t)}{dt}
\sum_{i = 1}^n
a_i
\frac{\partial^2 f(\Bx + \Ba t)}{\partial (x_j + a_j t) \partial (x_i + a_i t) }
}
}{t = 0} \\
&=
\sum_{i,j = 1}^n a_i a_j \frac{\partial^2 f}{\partial x_i \partial x_j} \\
&=
(\Ba \cdot \spacegrad)^2 f.
\end{aligned}
\end{equation}

The complete Taylor expansion of a scalar function \( f : {\mathbb{R}}^n \rightarrow {\mathbb{R}} \) is therefore

\begin{equation}\label{eqn:jacobianAndHessian:120}
f(\Bx + \Ba)
= f(\Bx) +
\Ba \cdot \spacegrad f +
\inv{2} \lr{ \Ba \cdot \spacegrad}^2 f + \cdots,
\end{equation}

so the Taylor expansion has an exponential structure

\begin{equation}\label{eqn:jacobianAndHessian:140}
f(\Bx + \Ba) = \sum_{k = 0}^\infty \inv{k!} \lr{ \Ba \cdot \spacegrad}^k f = e^{\Ba \cdot \spacegrad} f.
\end{equation}
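This exponential structure is easy to verify for a function whose directional derivatives are available in closed form. For the (arbitrarily chosen) function \( f(\Bx) = e^{\Bc \cdot \Bx} \), we have \( \lr{ \Ba \cdot \spacegrad }^k f = \lr{ \Ba \cdot \Bc }^k f \), so a truncated series can be compared directly against \( f(\Bx + \Ba) \):

```python
import numpy as np
from math import factorial

c = np.array([0.3, -0.2, 0.1])        # fixed vector defining f
f = lambda x: np.exp(c @ x)           # (a . grad)^k f = (a . c)^k f

x = np.array([1.0, 2.0, -1.0])
a = np.array([0.5, -0.4, 0.25])

series = sum((a @ c)**k / factorial(k) * f(x) for k in range(20))
assert np.isclose(series, f(x + a))
```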

Should an approximation of a vector valued function \( \Bf : {\mathbb{R}}^n \rightarrow {\mathbb{R}}^m \) be desired, it is only required to form a matrix of the components

\begin{equation}\label{eqn:jacobianAndHessian:160}
\Bf(\Bx + \Ba)
= \Bf(\Bx) +
[\Ba \cdot \spacegrad f_i]_i +
\inv{2} [\lr{ \Ba \cdot \spacegrad}^2 f_i]_i + \cdots,
\end{equation}

where \( [.]_i \) denotes a column vector over the rows \( i \in [1,m] \), and \( f_i \) are the coordinates of \( \Bf \).

The Jacobian matrix

In [1] the Jacobian \( D \Bf \) of a function \( \Bf : {\mathbb{R}}^n \rightarrow {\mathbb{R}}^m \) is defined in terms of the limit, as \( \Bz \rightarrow \Bx \), of the \( l_2 \) norm ratio

\begin{equation}\label{eqn:jacobianAndHessian:180}
\frac{\Norm{\Bf(\Bz) – \Bf(\Bx) – (D \Bf) (\Bz – \Bx)}_2 }{ \Norm{\Bz – \Bx}_2 },
\end{equation}

with the statement that the function \( \Bf \) has a derivative if a matrix \( D \Bf \) can be found for which this ratio tends to zero in the limit. Here the Jacobian \( D \Bf \in {\mathbb{R}}^{m \times n} \) must be matrix valued.

Let \( \Bz = \Bx + \Ba \), so the first order expansion of \ref{eqn:jacobianAndHessian:160} is

\begin{equation}\label{eqn:jacobianAndHessian:200}
\Bf(\Bz)
= \Bf(\Bx) + [\lr{ \Bz – \Bx } \cdot \spacegrad f_i]_i
.
\end{equation}

With the (unproven) assumption that this Taylor expansion satisfies the norm limit criteria of \ref{eqn:jacobianAndHessian:180}, it is possible to extract the structure of the Jacobian by comparison

\begin{equation}\label{eqn:jacobianAndHessian:220}
\begin{aligned}
(D \Bf)
(\Bz – \Bx)
&=
{\begin{bmatrix}
\lr{ \Bz – \Bx } \cdot \spacegrad f_i
\end{bmatrix}}_i \\
&=
{\begin{bmatrix}
\sum_{j = 1}^n (z_j – x_j) \PD{x_j}{f_i}
\end{bmatrix}}_i \\
&=
{\begin{bmatrix}
\PD{x_j}{f_i}
\end{bmatrix}}_{ij}
(\Bz – \Bx),
\end{aligned}
\end{equation}

so
\begin{equation}\label{eqn:jacobianAndHessian:240}
\boxed{
(D \Bf)_{ij} = \PD{x_j}{f_i}
}
\end{equation}

Written out explicitly as a matrix, the Jacobian is

\begin{equation}\label{eqn:jacobianAndHessian:320}
D \Bf
=
\begin{bmatrix}
\PD{x_1}{f_1} & \PD{x_2}{f_1} & \cdots & \PD{x_n}{f_1} \\
\PD{x_1}{f_2} & \PD{x_2}{f_2} & \cdots & \PD{x_n}{f_2} \\
\vdots & \vdots & & \vdots \\
\PD{x_1}{f_m} & \PD{x_2}{f_m} & \cdots & \PD{x_n}{f_m} \\
\end{bmatrix}
=
\begin{bmatrix}
(\spacegrad f_1)^\T \\
(\spacegrad f_2)^\T \\
\vdots \\
(\spacegrad f_m)^\T
\end{bmatrix}.
\end{equation}

In particular, when the function is scalar valued
\begin{equation}\label{eqn:jacobianAndHessian:261}
D f = (\spacegrad f)^\T.
\end{equation}

With this notation, the first Taylor expansion, in terms of the Jacobian matrix is

\begin{equation}\label{eqn:jacobianAndHessian:260}
\boxed{
\Bf(\Bz)
\approx \Bf(\Bx) + (D \Bf) \lr{ \Bz – \Bx }.
}
\end{equation}

The Hessian matrix

For scalar valued functions, the text expresses the second order expansion of a function in terms of the Jacobian and Hessian matrices

\begin{equation}\label{eqn:jacobianAndHessian:271}
f(\Bz)
\approx f(\Bx) + (D f) \lr{ \Bz – \Bx }
+ \inv{2} \lr{ \Bz – \Bx }^\T (\spacegrad^2 f) \lr{ \Bz – \Bx }.
\end{equation}

Because \( \spacegrad^2 \) is the usual notation for a Laplacian operator, this \( \spacegrad^2 f \in {\mathbb{R}}^{n \times n}\) notation for the Hessian matrix is not ideal in my opinion. Ignoring that notational objection for this class, the structure of the Hessian matrix can be extracted by comparison with the coordinate expansion

\begin{equation}\label{eqn:jacobianAndHessian:300}
\Ba^\T (\spacegrad^2 f) \Ba
=
\sum_{r,s = 1}^n a_r a_s \frac{\partial^2 f}{\partial x_r \partial x_s}
\end{equation}

so
\begin{equation}\label{eqn:jacobianAndHessian:280}
\boxed{
(\spacegrad^2 f)_{ij}
=
\frac{\partial^2 f}{\partial x_i \partial x_j}.
}
\end{equation}

In explicit matrix form the Hessian is

\begin{equation}\label{eqn:jacobianAndHessian:340}
\spacegrad^2 f
=
\begin{bmatrix}
\frac{\partial^2 f}{\partial x_1 \partial x_1} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots &\frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2 \partial x_2} & \cdots &\frac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots &\frac{\partial^2 f}{\partial x_n \partial x_n}
\end{bmatrix}.
\end{equation}
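The same sort of numerical check applies to the second order expansion, here with an invented scalar function and its hand computed gradient and Hessian:

```python
import numpy as np

f = lambda x: x[0]**2 * x[1] + np.cos(x[1])
grad = lambda x: np.array([2 * x[0] * x[1], x[0]**2 - np.sin(x[1])])
hess = lambda x: np.array([
    [2 * x[1],  2 * x[0]],
    [2 * x[0], -np.cos(x[1])],
])

x = np.array([0.8, 0.3])
d = np.array([1e-2, -2e-2])          # z - x

second_order = f(x) + grad(x) @ d + 0.5 * d @ hess(x) @ d
print(f(x + d) - second_order)       # residual is O(|d|^3)
```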

Is there a similar nice matrix structure for the Hessian of a function \( f : {\mathbb{R}}^n \rightarrow {\mathbb{R}}^m \)?

References

[1] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge university press, 2004.

[2] D. Hestenes. New Foundations for Classical Mechanics. Kluwer Academic Publishers, 1999.