dot product

Equation of a hyperplane, and shortest distance between two hyperplanes.

December 13, 2024 math and physics play

[Click here for a PDF version of this post]

Scalar equation for a hyperplane.

In our last post, we found, in a roundabout way, that

Theorem 1.1:

The equation of a \(\mathbb{R}^N\) hyperplane, with distance \( d \) from the origin, and normal \( \mathbf{\hat{n}} \) is
\begin{equation*}
\Bx \cdot \mathbf{\hat{n}} = d.
\end{equation*}

Start proof:

Let \( \beta = \setlr{ \mathbf{\hat{f}}_1, \cdots, \mathbf{\hat{f}}_{N-1} } \) be an orthonormal basis for the subspace normal to \( \mathbf{\hat{n}} \), and let \( \Bd = d \mathbf{\hat{n}} \) be the vector in the hyperplane closest to the origin, as illustrated in fig. 1.

 

fig 1. R^3 plane with normal n-cap

The hyperplane \( d \) distant from the origin with normal \( \mathbf{\hat{n}} \) has the parametric representation
\begin{equation}\label{eqn:hyperplaneGeometry:40}
\Bx(a_1, \cdots, a_{N-1}) = d \mathbf{\hat{n}} + \sum_{i = 1}^{N-1} a_i \mathbf{\hat{f}}_i.
\end{equation}
Equivalently, suppressing the parameterization, with \( \Bx = \Bx(a_1, \cdots, a_{N-1}) \), representing any vector in that hyperplane, by dotting with \( \mathbf{\hat{n}} \), we have
\begin{equation}\label{eqn:hyperplaneGeometry:60}
\Bx \cdot \mathbf{\hat{n}} = d \mathbf{\hat{n}} \cdot \mathbf{\hat{n}},
\end{equation}
where all the \( \mathbf{\hat{f}}_i \cdot \mathbf{\hat{n}} \) dot products are zero by construction. Since \( \mathbf{\hat{n}} \cdot \mathbf{\hat{n}} = 1 \), the proof is complete.

End proof.
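As a quick numerical sanity check of the theorem, here is a minimal Python sketch for \(\mathbb{R}^3\). The specific normal, the basis vectors `f1`, `f2`, the distance `d`, and all the helper names are assumptions picked for illustration; they are not from the post.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(s, a):
    return tuple(s * c for c in a)

# Assumed sample data: a unit normal and an orthonormal basis for the plane.
n_hat = scale(1.0 / math.sqrt(6.0), (1.0, -1.0, 2.0))
f1 = scale(1.0 / math.sqrt(2.0), (1.0, 1.0, 0.0))  # perpendicular to n_hat
f2 = cross(n_hat, f1)                               # completes the basis

def plane_point(d, a1, a2):
    """Parametric point x(a1, a2) = d n_hat + a1 f1 + a2 f2."""
    return tuple(d * n + a1 * u + a2 * w for n, u, w in zip(n_hat, f1, f2))

d = 2.5
residuals = [dot(plane_point(d, a1, a2), n_hat) - d
             for a1 in (-3.0, 0.0, 1.5) for a2 in (-1.0, 4.0)]
print(max(abs(r) for r in residuals))  # expected to be at rounding-error level
```

Every sampled point of the parametric representation satisfies \( \Bx \cdot \mathbf{\hat{n}} = d \) to within floating point error, whatever the parameters are.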

Incidentally, observe we can also write the hyperplane equation in dual form, as
\begin{equation}\label{eqn:hyperplaneGeometry:220}
\Bx \wedge (\mathbf{\hat{n}} I) = d I,
\end{equation}
where \( I \) is an \(\mathbb{R}^N\) pseudoscalar (such as \( I = \mathbf{\hat{n}} \mathbf{\hat{f}}_1 \cdots \mathbf{\hat{f}}_{N-1} \)).

Our previous parallel plane separation problem.

The standard \(\mathbb{R}^3\) scalar form for an equation of a plane is
\begin{equation}\label{eqn:hyperplaneGeometry:80}
a x + b y + c z = d,
\end{equation}
where \( d \) loses its geometrical meaning. If we form \( \Bn = (a,b,c) \), then we can rewrite this as
\begin{equation}\label{eqn:hyperplaneGeometry:100}
\Bx \cdot \Bn = d,
\end{equation}
and in this representation we see that \( d/\Norm{\Bn} \) is the (signed) shortest distance from the origin to the plane. This means that if we have a pair of parallel plane equations
\begin{equation}\label{eqn:hyperplaneGeometry:120}
\begin{aligned}
\Bx \cdot \Bn &= d_1 \\
\Bx \cdot \Bn &= d_2,
\end{aligned}
\end{equation}
then the distance between those planes, by inspection, is
\begin{equation}\label{eqn:hyperplaneGeometry:140}
\Abs{ \frac{d_2}{\Norm{\Bn}} - \frac{d_1}{\Norm{\Bn}} },
\end{equation}
which reduces to just \( \Abs{d_2 - d_1} \) if \( \Bn \) is a unit normal for the plane. In our previous post, the problem to solve was to find the shortest distance between the parallel planes given by
\begin{equation}\label{eqn:hyperplaneGeometry:160}
\begin{aligned}
x - y + 2 z &= -3 \\
3 x - 3 y + 6 z &= 1.
\end{aligned}
\end{equation}
The more natural geometrical form for these plane equations is
\begin{equation}\label{eqn:hyperplaneGeometry:180}
\begin{aligned}
\Bx \cdot \mathbf{\hat{n}} &= -\frac{3}{\sqrt{6}} \\
\Bx \cdot \mathbf{\hat{n}} &= \inv{3 \sqrt{6}},
\end{aligned}
\end{equation}
where \( \mathbf{\hat{n}} = (1,-1,2)/\sqrt{6} \), as illustrated in fig. 2.

fig. 2. The two planes.

 

Given that representation, we can find the distance between the planes just by taking the absolute difference of the respective distances to the origin
\begin{equation}\label{eqn:hyperplaneGeometry:200}
\begin{aligned}
\Abs{ -\frac{3}{\sqrt{6}} - \inv{3 \sqrt{6}} }
&= \frac{\sqrt{6}}{6} \lr{ 3 + \inv{3} } \\
&= \frac{10}{18} \sqrt{6} \\
&= \frac{5}{9} \sqrt{6}.
\end{aligned}
\end{equation}

Shortest distance between two parallel planes.

December 13, 2024 math and physics play

[Click here for a PDF version of this post]

The problem.

While I was helping Karl with his linear algebra exam prep, he asked me about this problem

Problem:

Find the shortest distance between the two parallel planes, \( P_1 \), and \( P_2 \), with respective equations:
\begin{equation*}
\begin{aligned}
x - y + 2 z &= -3 \\
3 x - 3 y + 6 z &= 1.
\end{aligned}
\end{equation*}

A numerical way to tackle the problem.

A fairly straightforward way to tackle this problem is illustrated in the sketch of fig. 1. If we can find a point in the first plane, we can follow the normal to the plane to the next, and compute the length of that connecting vector.

fig. 1. Distance between two planes.

For this problem, let
\begin{equation}\label{eqn:distanceBetweenPlanes:20}
\Bn = (1,-1,2),
\end{equation}
and rescale the two plane equations to use the same normal. That is
\begin{equation}\label{eqn:distanceBetweenPlanes:40}
\begin{aligned}
\Bx_1 \cdot \Bn &= -3 \\
\Bx_2 \cdot \Bn &= \inv{3},
\end{aligned}
\end{equation}
where \( \Bx_1 \) are vectors in the first plane, and \( \Bx_2 \) are vectors in the second plane. Finding a vector in one of the planes isn’t hard. Suppose, for example, that \( \Bx_0 = (\alpha, \beta, \gamma) \) is a vector in the first plane, then
\begin{equation}\label{eqn:distanceBetweenPlanes:60}
\alpha - \beta + 2 \gamma = -3.
\end{equation}
One solution is \( \alpha = -3, \beta = 0, \gamma = 0 \), or \( \Bx_0 = (-3, 0, 0) \). We can follow the normal from that point to the closest point in the second plane by forming
\begin{equation}\label{eqn:distanceBetweenPlanes:80}
\By_0 = \Bx_0 + k \Bn,
\end{equation}
where \( k \) is to be determined. If \( \By_0 \) is a point in the second plane, we must have
\begin{equation}\label{eqn:distanceBetweenPlanes:100}
\begin{aligned}
\inv{3}
&=
\By_0 \cdot \Bn \\
&=
\lr{ \Bx_0 + k \Bn } \cdot \Bn \\
&=
(-3, 0, 0 ) \cdot (1,-1,2) + k (1,-1,2) \cdot (1,-1,2) \\
&=
-3 + 6 k,
\end{aligned}
\end{equation}
or
\begin{equation}\label{eqn:distanceBetweenPlanes:120}
k = \frac{10}{18} = \frac{5}{9}.
\end{equation}
This means the point in plane two closest to \( \Bx_0 = (-3,0,0) \) is
\begin{equation}\label{eqn:distanceBetweenPlanes:140}
\begin{aligned}
\By_0
&= (-3, 0, 0 ) + \frac{5}{9} (1,-1,2) \\
&= \inv{9} (-27 + 5, -5, 10) \\
&= \inv{9} (-22, -5, 10),
\end{aligned}
\end{equation}
and the vector distance between the planes is
\begin{equation}\label{eqn:distanceBetweenPlanes:160}
\begin{aligned}
\By_0 - \Bx_0
&= \inv{9} (-22, -5, 10) - (-3, 0, 0 ) \\
&= \inv{9} (-22 + 27, -5, 10) \\
&= \inv{9} (5, -5, 10).
\end{aligned}
\end{equation}
This vector’s length is \( \sqrt{150}/9 = (5/9) \sqrt{6} \), which is the shortest distance between the planes.
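The steps above can be replayed numerically. This is only a sketch of the same calculation in plain Python; the helper `dot` and the variable names are mine, and the starting point \( \Bx_0 = (-3, 0, 0) \) is the one picked above.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

n = (1.0, -1.0, 2.0)          # common (non-unit) normal
d1, d2 = -3.0, 1.0 / 3.0      # rescaled right-hand sides of the two planes

x0 = (d1, 0.0, 0.0)           # a point in the first plane (works because n[0] == 1)
k = (d2 - dot(x0, n)) / dot(n, n)                # solves (x0 + k n) . n = d2
y0 = tuple(x + k * c for x, c in zip(x0, n))     # closest point in the second plane
gap = tuple(y - x for x, y in zip(x0, y0))       # vector joining the planes
distance = math.sqrt(dot(gap, gap))

print(k, y0, distance)
```

Any other point of the first plane could be used for `x0`; only `y0` changes, not `k` or the distance.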

A symbolic approach.

Generally, we get more clarity if we avoid plugging in numbers until the very end, so let’s try a generalization of this problem.

Problem:

Find the shortest distance between the two parallel planes, \( P_1 \), and \( P_2 \), with respective equations:
\begin{equation*}
\begin{aligned}
\Bx_1 \cdot \Bn_1 &= d_1 \\
\Bx_2 \cdot \Bn_2 &= d_2.
\end{aligned}
\end{equation*}

We can use the same approach, but first, let’s rescale the two normals. Let
\begin{equation}\label{eqn:distanceBetweenPlanes:180}
\Bn_2 = t \Bn_1,
\end{equation}
or
\begin{equation}\label{eqn:distanceBetweenPlanes:200}
\Bn_1 \cdot \Bn_2 = t \Bn_1^2,
\end{equation}
so
\begin{equation}\label{eqn:distanceBetweenPlanes:220}
\Bn_2 = \frac{\Bn_1 \cdot \Bn_2}{\Bn_1^2} \Bn_1,
\end{equation}
which means that our plane equations are
\begin{equation}\label{eqn:distanceBetweenPlanes:240}
\begin{aligned}
\Bx_1 \cdot \Bn_1 &= d_1 \\
\Bx_2 \cdot \Bn_1 &= \frac{\Bn_1^2}{\Bn_1 \cdot \Bn_2} d_2.
\end{aligned}
\end{equation}
We can further streamline our plane equation representation, setting \( \ncap = \Bn_1/\Norm{\Bn_1} \), which gives us
\begin{equation}\label{eqn:distanceBetweenPlanes:260}
\begin{aligned}
\Bx_1 \cdot \ncap &= \frac{d_1}{\Norm{\Bn_1}} \\
\Bx_2 \cdot \ncap &= \frac{d_2}{\ncap \cdot \Bn_2}.
\end{aligned}
\end{equation}

This time, let’s assume that we can find a point \( \Bx_0 \) in the first plane, but not actually try to find it. We can still follow the normal to the second plane from that point
\begin{equation}\label{eqn:distanceBetweenPlanes:280}
\By_0 = \Bx_0 + k \ncap,
\end{equation}
but since we only care about the vector distance between the planes, we seek
\begin{equation}\label{eqn:distanceBetweenPlanes:300}
\By_0 -\Bx_0 = k \ncap.
\end{equation}
Now, the constant \( k \), once we find it, is (up to sign) exactly the distance between the planes that we seek. Plugging \( \By_0 \) into the \( P_2 \) equation, we find
\begin{equation}\label{eqn:distanceBetweenPlanes:320}
\begin{aligned}
\frac{d_2}{\ncap \cdot \Bn_2}
&=
\lr{ \Bx_0 + k \ncap } \cdot \ncap \\
&=
\Bx_0 \cdot \ncap + k \\
&=
\frac{d_1}{\Norm{\Bn_1}} + k,
\end{aligned}
\end{equation}
or
\begin{equation}\label{eqn:distanceBetweenPlanes:340}
\boxed{
\Abs{k} = \Norm{\By_0 - \Bx_0} = \Abs{ \frac{d_2}{\ncap \cdot \Bn_2} - \frac{d_1}{\ncap \cdot \Bn_1} }.
}
\end{equation}
If \( \Bn_2 = \Bn_1 = \Bn \), then we have
\begin{equation}\label{eqn:distanceBetweenPlanes:360}
\begin{aligned}
\Norm{\By_0 - \Bx_0} &=
\Abs{
\frac{d_2}{\Bn_1^2/\Norm{\Bn_1}} - \frac{d_1}{\Norm{\Bn_1}}
} \\
&=
\frac{\Abs{d_2 - d_1}}{\Norm{\Bn}},
\end{aligned}
\end{equation}
and if \( \Bn \) is a unit normal, this further reduces to just \( \Abs{d_2 - d_1} \).

Let’s try this for the specific problem originally given. After rescaling the second equation, we have \( \Bn_1 = \Bn_2 = (1,-1,2) \), \( d_1 = -3 \), and \( d_2 = 1/3 \), so the distance between the planes is
\begin{equation}\label{eqn:distanceBetweenPlanes:380}
\begin{aligned}
\Norm{\By_0 - \Bx_0}
&= \frac{\Abs{1/3 + 3}}{\sqrt{6}} \\
&= \frac{10}{3 \times 6} \sqrt{6} \\
&= \frac{5}{9} \sqrt{6},
\end{aligned}
\end{equation}
as previously calculated.
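The boxed result doesn't require rescaling the equations first, which is easy to check numerically. Here is a minimal sketch, assuming the two normals really are parallel (the function does not verify that); the name `parallel_plane_distance` is hypothetical.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def parallel_plane_distance(n1, d1, n2, d2):
    """|d2/(n_hat . n2) - d1/(n_hat . n1)| with n_hat = n1/|n1|.

    Assumes n2 is a nonzero multiple of n1; this is not checked.
    """
    norm1 = math.sqrt(dot(n1, n1))
    n_hat = tuple(c / norm1 for c in n1)
    return abs(d2 / dot(n_hat, n2) - d1 / dot(n_hat, n1))

# The original, unscaled plane equations.
as_given = parallel_plane_distance((1.0, -1.0, 2.0), -3.0, (3.0, -3.0, 6.0), 1.0)

# The same second plane, written with its equation multiplied by -1.
flipped = parallel_plane_distance((1.0, -1.0, 2.0), -3.0, (-3.0, 3.0, -6.0), -1.0)

print(as_given, flipped)
```

Both calls describe the same pair of planes, so they should agree with each other and with \( (5/9) \sqrt{6} \).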

Area within closed boundary

August 11, 2024 math and physics play

[Click here for a PDF version of this post]

Motivation.

On vacation I was reading some more of [1]. It was mentioned in passing that the area contained within a closed parameterized curve is given by
\begin{equation}\label{eqn:containedArea:20}
A = \inv{2} \int_{t_0}^{t_1} \lr{x y' - y x'} dt,
\end{equation}
where \( x = x(t), y = y(t), t \in [t_0, t_1] \). This has the look of a Stokes’ theorem coordinate expansion (specifically, the Green’s theorem special case of Stokes’), but with a somewhat mysterious looking factor of one half out in front. My aim in this post is to understand the origins of this area relationship, and play with it a bit.

Circular coordinates example.

The book suggests that the reader verify this for a circular parameterization, so we’ll do that here too.

Let
\begin{equation}\label{eqn:containedArea:40}
\begin{aligned}
x(t) &= r \cos t \\
y(t) &= r \sin t,
\end{aligned}
\end{equation}
where \( t \in [0, 2 \pi] \). Plugging this in, we have
\begin{equation}\label{eqn:containedArea:60}
\begin{aligned}
A
&= \inv{2} \int_0^{2 \pi} \lr{ r \cos t \lr{ r \cos t } - r \sin t \lr{ - r \sin t } } dt \\
&= \frac{r^2}{2} \int_0^{2 \pi} \lr{ \cos^2 t + \sin^2 t } dt \\
&= \frac{2 \pi r^2}{2} \\
&= \pi r^2.
\end{aligned}
\end{equation}
This simple example works out.
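The same check can be done numerically. The sketch below approximates the area integral with a midpoint rule; the quadrature rule, the step count, and the function name `enclosed_area` are my own choices, not anything from the book.

```python
import math

def enclosed_area(x, y, dx, dy, t0, t1, steps=10_000):
    """Midpoint-rule estimate of (1/2) * integral over [t0, t1] of (x y' - y x') dt."""
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        t = t0 + (i + 0.5) * h
        total += x(t) * dy(t) - y(t) * dx(t)
    return 0.5 * total * h

r = 2.0
area = enclosed_area(
    lambda t: r * math.cos(t),    # x(t)
    lambda t: r * math.sin(t),    # y(t)
    lambda t: -r * math.sin(t),   # x'(t)
    lambda t: r * math.cos(t),    # y'(t)
    0.0, 2.0 * math.pi)

print(area, math.pi * r ** 2)
```

For the circle the integrand is the constant \( r^2 \), so the quadrature is essentially exact. Passing other closed curves, along with their derivatives, lets us experiment with the formula.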

Piecewise linear parametrization example.

One parameterization of the unit parallelogram depicted in fig. 1 is

\begin{equation}\label{eqn:containedArea:340}
\begin{aligned}
(x,y) &= (t, 0),\quad t \in [0,1] \\
&= (t, t - 1),\quad t \in [1,2] \\
&= (4 - t, 1),\quad t \in [2,3] \\
&= (4 - t, 4 - t),\quad t \in [3,4]
\end{aligned}
\end{equation}

fig. 1. Parallelogram with unit area.

Evaluating \( x y' - y x' \) on each of these segments in turn gives
\begin{equation}\label{eqn:containedArea:360}
\begin{aligned}
(t) (0) - (0)(1) &= 0 \\
(t) (1) - (t-1)(1) &= 1 \\
(4-t)(0) - (1)(-1) &= 1 \\
(4-t)(-1) - (4-t)(-1) &= 0,
\end{aligned}
\end{equation}
and integrating
\begin{equation}\label{eqn:containedArea:380}
\inv{2} \int_0^4 \lr{ x y' - y x'} dt = \frac{2}{2} = 1,
\end{equation}
as expected. In this example, the derivatives \( x', y' \) are not continuous at the corners of the parallelogram, but continuity is not a requirement (nor should it be, since the area is well defined despite any corners).
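For any polygon traversed edge by edge with a linear parameterization over a unit interval, \( x y' - y x' \) is constant on each edge, equal to \( x_0 y_1 - y_0 x_1 \) for an edge from \( (x_0, y_0) \) to \( (x_1, y_1) \). Summing those terms gives what is commonly known as the shoelace formula. A minimal sketch, with a hypothetical function name and the vertex list of the parallelogram above:

```python
def polygon_area(vertices):
    """Signed area of a polygon from (1/2) * sum over edges of (x0 y1 - y0 x1).

    Positive when the vertices are listed counterclockwise.
    """
    total = 0.0
    count = len(vertices)
    for i in range(count):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % count]   # wrap around to close the curve
        total += x0 * y1 - y0 * x1
    return 0.5 * total

parallelogram = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (1.0, 1.0)]
area = polygon_area(parallelogram)
print(area)
```

The four edge contributions are \( 0, 1, 1, 0 \), matching the four regions evaluated above.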

Can we discover this relationship using the Jacobian?

Graphically, I can imagine that we could find this area relationship, by considering a parameterization of a family of nested closed curves, as depicted in fig. 2.

fig. 2. Family of nested closed curves.

For such a parameterization, calculating the area is just a Jacobian evaluation
\begin{equation}\label{eqn:containedArea:80}
\begin{aligned}
A
&= \iint \frac{\partial(x, y)}{\partial(u,t)} du dt \\
&= \iint \lr{ \PD{u}{x} \PD{t}{y} - \PD{u}{y} \PD{t}{x} } du dt \\
&= \iint \lr{ \PD{u}{x} y' - \PD{u}{y} x' } du dt.
\end{aligned}
\end{equation}
Let’s try to eliminate the \( u \) derivatives using integration by parts, and see what we get.
\begin{equation}\label{eqn:containedArea:100}
\begin{aligned}
A
&= \iint \lr{ \PD{u}{x} y' - \PD{u}{y} x' } du dt \\
&= \iint \frac{\partial}{\partial u} \lr{ x y' - y x' } du dt - \iint \lr{ x \PD{u}{y'} - y \PD{u}{x'} } du dt \\
&= \int \lr{ x y' - y x' } dt - \iint \lr{ x \PD{u}{y'} - y \PD{u}{x'} } du dt.
\end{aligned}
\end{equation}
This is interesting, as we find the area equation that we are interested in (times two), but we also have a strange new area integral. Essentially, we have found, assuming we trust the claim in the book, that
\begin{equation}\label{eqn:containedArea:120}
A = 2 A - \iint \lr{ x \PD{u}{y'} - y \PD{u}{x'} } du dt,
\end{equation}
so it seems that the area can also be expressed as
\begin{equation}\label{eqn:containedArea:140}
A = \iint \lr{ x \frac{\partial^2 y}{\partial u \partial t} - y \frac{\partial^2 x}{\partial u \partial t} } du dt.
\end{equation}
Let’s again use the circular parameterization to verify that this works. I won’t try to prove this directly, but instead, we’ll use Stokes’ theorem to prove the stated result, from which we get this second derivative area formula as a side effect by virtue of our integration by parts expansion above.

For the circular parameterization, we have
\begin{equation}\label{eqn:containedArea:160}
\begin{aligned}
A
&= \int_{r = 0}^R dr \int_{t = 0}^{2 \pi} dt \lr{ x \frac{\partial^2 y}{\partial r \partial t} - y \frac{\partial^2 x}{\partial r \partial t} } \\
&= \int_{r = 0}^R dr \int_{t = 0}^{2 \pi} dt \lr{ r \cos t \frac{\partial \sin t}{\partial t} - r \sin t \frac{\partial \cos t}{\partial t} } \\
&= \int_{r = 0}^R r dr \int_{t = 0}^{2 \pi} dt \lr{ \cos^2 t + \sin^2 t } \\
&= \frac{R^2}{2} 2 \pi \\
&= \pi R^2.
\end{aligned}
\end{equation}
This checks out, at least for this one specific circular parameterization.

Area formula derivation using Stokes’ theorem.

Theorem 1.1: Green’s theorem.

\begin{equation}\label{eqn:containedArea:260}
\iint dx dy \lr{ \PD{x}{M} - \PD{y}{L} } = \oint L dx + M dy.
\end{equation}

Start proof:

We start with the general two parameter integration theorem
\begin{equation}\label{eqn:containedArea:180}
\iint F d^2 \Bx \lrpartial G = -\oint F d\Bx G,
\end{equation}
set \( F = 1, G = \Bf \), and apply scalar selection
\begin{equation}\label{eqn:containedArea:200}
\iint \gpgradezero{ d^2 \Bx \lrpartial \Bf } = -\oint d\Bx \cdot \Bf,
\end{equation}
to find the two parameter form of Stokes’ theorem
\begin{equation}\label{eqn:containedArea:220}
\iint d^2 \Bx \cdot \lr{ \spacegrad \wedge \Bf } = -\oint d\Bx \cdot \Bf.
\end{equation}

With a planar parameterization, say \( \Bf = L \Be_1 + M \Be_2 \), we have \( d\Bx \cdot \Bf = L dx + M dy \), and for the LHS
\begin{equation}\label{eqn:containedArea:240}
\begin{aligned}
\iint d^2 \Bx \cdot \lr{ \spacegrad \wedge \Bf }
&=
\iint dx dy \Be_{12}^2
\begin{vmatrix}
\partial_1 & \partial_2 \\
L & M
\end{vmatrix} \\
&=
-\iint dx dy \lr{ \PD{x}{M} - \PD{y}{L} }.
\end{aligned}
\end{equation}

End proof.

Parameterized area equation.

If we wish to evaluate an elementary area, we can pick \( L, M \) such that \( \PDi{x}{M} - \PDi{y}{L} = 1 \). One such selection is
\begin{equation}\label{eqn:containedArea:280}
\begin{aligned}
M &= \frac{x}{2} \\
L &= -\frac{y}{2},
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:containedArea:300}
A = \inv{2} \oint -y dx + x dy = \inv{2} \int \lr{ x y' - y x' } dt.
\end{equation}
Clearly, there are other possible choices of \( L, M \) that we could use to find alternate area equations, but this choice seems to be independent of the shape of the region.

References

[1] F.W. Byron and R.W. Fuller. Mathematics of Classical and Quantum Physics. Dover Publications, 1992.

More on time derivatives of integrals.

June 9, 2024 math and physics play

[Click here for a PDF version of this post]

Motivation.

I was asked about geometric algebra equivalents for a couple identities found in [1], one for line integrals
\begin{equation}\label{eqn:more_feynmans_trick:20}
\ddt{} \int_{C(t)} \Bf \cdot d\Bx =
\int_{C(t)} \lr{
\PD{t}{\Bf} + \spacegrad \lr{ \Bv \cdot \Bf } - \Bv \cross \lr{ \spacegrad \cross \Bf }
}
\cdot d\Bx,
\end{equation}
and one for area integrals
\begin{equation}\label{eqn:more_feynmans_trick:40}
\ddt{} \int_{S(t)} \Bf \cdot d\BA =
\int_{S(t)} \lr{
\PD{t}{\Bf} + \Bv \lr{ \spacegrad \cdot \Bf } - \spacegrad \cross \lr{ \Bv \cross \Bf }
}
\cdot d\BA.
\end{equation}

Both of these look questionable at first glance, because neither has a boundary term. However, they can be transformed with Stokes’ theorem to
\begin{equation}\label{eqn:more_feynmans_trick:60}
\ddt{} \int_{C(t)} \Bf \cdot d\Bx
=
\int_{C(t)} \lr{
\PD{t}{\Bf} - \Bv \cross \lr{ \spacegrad \cross \Bf }
}
\cdot d\Bx
+
\evalbar{\Bv \cdot \Bf }{\Delta C},
\end{equation}
and
\begin{equation}\label{eqn:more_feynmans_trick:80}
\ddt{} \int_{S(t)} \Bf \cdot d\BA =
\int_{S(t)} \lr{
\PD{t}{\Bf} + \Bv \lr{ \spacegrad \cdot \Bf }
}
\cdot d\BA
-
\oint_{\partial S(t)} \lr{ \Bv \cross \Bf } \cdot d\Bx.
\end{equation}
The area integral derivative is now seen to be a variation of one of the special cases of the Leibniz integral rule, see for example [2]. The author admits that the line integral relationship is not well used, and it doesn’t show up in the Wikipedia page.

My end goal will be to evaluate the derivative of a general multivector line integral
\begin{equation}\label{eqn:more_feynmans_trick:100}
\ddt{} \int_{C(t)} F d\Bx G,
\end{equation}
and area integral
\begin{equation}\label{eqn:more_feynmans_trick:120}
\ddt{} \int_{S(t)} F d^2\Bx G.
\end{equation}
We’ve derived that line integral result in a different fashion previously, but it’s interesting to see a different approach. Perhaps this approach will lend itself nicely to non-scalar integrands?

Prerequisites.

Definition 1.1: Convective derivative.

The convective derivative of \( \phi(t, \Bx(t)) \) is defined as
\begin{equation*}
\frac{D \phi}{D t} = \lim_{\Delta t \rightarrow 0} \frac{ \phi(t + \Delta t, \Bx + \Delta t \Bv) - \phi(t, \Bx)}{\Delta t},
\end{equation*}
where \( \Bv = d\Bx/dt \).

Theorem 1.1: Convective derivative.

The convective derivative operator may be written
\begin{equation*}
\frac{D}{D t} = \PD{t}{} + \Bv \cdot \spacegrad.
\end{equation*}

Start proof:

Let’s write
\begin{equation}\label{eqn:more_feynmans_trick:140}
\begin{aligned}
v_0 &= 1 \\
u_0 &= t + v_0 h \\
u_k &= x_k + v_k h, k \in [1,3] \\
\end{aligned}
\end{equation}

The limit, if it exists, must equal the sum of the individual limits, one for each displaced argument \( u_\alpha \) (with \( h = \Delta t \))
\begin{equation}\label{eqn:more_feynmans_trick:160}
\frac{D \phi}{D t} = \sum_{\alpha = 0}^3 \lim_{h \rightarrow 0} \frac{ \phi(u_\alpha) - \phi(t, \Bx)}{h},
\end{equation}
but that is just a sum of derivatives, which can be evaluated by the chain rule
\begin{equation}\label{eqn:more_feynmans_trick:180}
\begin{aligned}
\frac{D \phi}{D t}
&= \sum_{\alpha = 0}^{3} \evalbar{ \PD{u_\alpha}{\phi(u_\alpha)} \PD{h}{u_\alpha} }{h = 0} \\
&= \PD{t}{\phi} + \sum_{k = 1}^3 v_k \PD{x_k}{\phi} \\
&= \lr{ \PD{t}{} + \Bv \cdot \spacegrad } \phi.
\end{aligned}
\end{equation}

End proof.
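A finite difference check makes the theorem concrete. In this sketch the scalar field `phi`, the sample point, the constant velocity, and the step size are all assumptions chosen for illustration. The forward difference mirrors the limit in the definition, while the second function evaluates \( \PDi{t}{\phi} + \Bv \cdot \spacegrad \phi \) analytically.

```python
import math

def phi(t, x):
    """Hypothetical scalar field used only for this check."""
    return math.sin(t) * x[0] * x[1] + x[2] ** 2

def convective_derivative_fd(t, x, v, h=1e-6):
    """Forward difference version of the limit in the definition."""
    shifted = tuple(xi + h * vi for xi, vi in zip(x, v))
    return (phi(t + h, shifted) - phi(t, x)) / h

def convective_derivative_exact(t, x, v):
    """Partial time derivative plus v . grad(phi), worked out by hand for this phi."""
    dphi_dt = math.cos(t) * x[0] * x[1]
    grad = (math.sin(t) * x[1], math.sin(t) * x[0], 2.0 * x[2])
    return dphi_dt + sum(vi * gi for vi, gi in zip(v, grad))

t, x, v = 0.3, (1.0, 2.0, -0.5), (0.5, -1.0, 2.0)
fd = convective_derivative_fd(t, x, v)
exact = convective_derivative_exact(t, x, v)
print(fd, exact)
```

The two values should agree to roughly the size of the step `h`, which is the expected accuracy of a forward difference.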

Definition 1.2: Hestenes overdot notation.

We may use a dot or a tick with a derivative operator, to designate the scope of that operator, allowing it to operate bidirectionally, or in a restricted fashion, holding specific multivector elements constant. This is called the Hestenes overdot notation. Illustrating by example, with multivectors \( F, G \), and allowing the gradient to act bidirectionally, we have
\begin{equation*}
\begin{aligned}
F \spacegrad G
&=
\dot{F} \dot{\spacegrad} G
+
F \dot{\spacegrad} \dot{G} \\
&=
\sum_i \lr{ \partial_i F } \Be_i G + \sum_i F \Be_i \lr{ \partial_i G }.
\end{aligned}
\end{equation*}
The last step is a precise statement of the meaning of the overdot notation, showing that we hold the position of the vector elements of the gradient constant, while the (scalar) partials are allowed to commute, acting on the designated elements.

We will need one additional identity

Lemma 1.1: Gradient of dot product (one constant vector.)

Given vectors \( \Ba, \Bb \) the gradient of their dot product is given by
\begin{equation*}
\spacegrad \lr{ \Ba \cdot \Bb }
= \lr{ \Bb \cdot \spacegrad } \Ba - \Bb \cdot \lr{ \spacegrad \wedge \Ba }
+ \lr{ \Ba \cdot \spacegrad } \Bb - \Ba \cdot \lr{ \spacegrad \wedge \Bb }.
\end{equation*}
If \( \Bb \) is constant, this reduces to
\begin{equation*}
\spacegrad \lr{ \Ba \cdot \Bb }
=
\dot{\spacegrad} \lr{ \dot{\Ba} \cdot \Bb }
= \lr{ \Bb \cdot \spacegrad } \Ba - \Bb \cdot \lr{ \spacegrad \wedge \Ba }.
\end{equation*}

Start proof:

The \( \Bb \) constant case is trivial to prove. We use \( \Ba \cdot \lr{ \Bb \wedge \Bc } = \lr{ \Ba \cdot \Bb} \Bc - \Bb \lr{ \Ba \cdot \Bc } \), and simply expand the vector, curl dot product
\begin{equation}\label{eqn:more_feynmans_trick:200}
\Bb \cdot \lr{ \spacegrad \wedge \Ba }
=
\Bb \cdot \lr{ \dot{\spacegrad} \wedge \dot{\Ba} }
= \lr{ \Bb \cdot \dot{\spacegrad} } \dot{\Ba} - \dot{\spacegrad} \lr{ \dot{\Ba} \cdot \Bb }.
\end{equation}
Rearrangement proves the constant \( \Bb \) identity. The more general statement follows from a chain rule evaluation of the gradient, holding each vector constant in turn
\begin{equation}\label{eqn:more_feynmans_trick:320}
\spacegrad \lr{ \Ba \cdot \Bb }
=
\dot{\spacegrad} \lr{ \dot{\Ba} \cdot \Bb }
+
\dot{\spacegrad} \lr{ \dot{\Bb} \cdot \Ba }.
\end{equation}

End proof.

Time derivative of a line integral of a vector field.

We now have all our tools assembled, and can proceed to evaluate the derivative of the line integral. We want to show that

Theorem 1.2:

Given a path parameterized by \( \Bx(\lambda) \), where \( d\Bx = (\PDi{\lambda}{\Bx}) d\lambda \), with points along the curve \( C(t) \) moving through space at a velocity \( \Bv(\Bx(\lambda)) \), and a vector function \( \Bf = \Bf(t, \Bx(\lambda)) \),
\begin{equation*}
\ddt{} \int_{C(t)} \Bf \cdot d\Bx =
\int_{C(t)} \lr{
\PD{t}{\Bf} + \spacegrad \lr{ \Bf \cdot \Bv } + \Bv \cdot \lr{ \spacegrad \wedge \Bf}
} \cdot d\Bx
\end{equation*}

Start proof:

I’m going to avoid thinking about the rigorous details, like any requirements for curve continuity and smoothness. We will, however, specify that the end points of the curve correspond to the parameter values \( \lambda_1, \lambda_2 \). Expanding out the parameterization, we seek to evaluate
\begin{equation}\label{eqn:more_feynmans_trick:240}
\int_{C(t)} \Bf \cdot d\Bx
=
\int_{\lambda_1}^{\lambda_2} \Bf(t, \Bx(\lambda) ) \cdot \frac{\partial \Bx}{\partial \lambda} d\lambda.
\end{equation}
The parametric form nicely moves all the boundary time dependence into the integrand, allowing us to write
\begin{equation}\label{eqn:more_feynmans_trick:260}
\begin{aligned}
\ddt{} \int_{C(t)} \Bf \cdot d\Bx
&=
\lim_{\Delta t \rightarrow 0}
\inv{\Delta t}
\int_{\lambda_1}^{\lambda_2}
\lr{ \Bf(t + \Delta t, \Bx(\lambda) + \Delta t \Bv(\Bx(\lambda) )) \cdot \frac{\partial}{\partial \lambda} \lr{ \Bx + \Delta t \Bv(\Bx(\lambda)) } - \Bf(t, \Bx(\lambda)) \cdot \frac{\partial \Bx}{\partial \lambda} } d\lambda \\
&=
\lim_{\Delta t \rightarrow 0}
\inv{\Delta t}
\int_{\lambda_1}^{\lambda_2}
\lr{ \Bf(t + \Delta t, \Bx(\lambda) + \Delta t \Bv(\Bx(\lambda) )) - \Bf(t, \Bx)} \cdot \frac{\partial \Bx}{\partial \lambda} d\lambda \\
&\quad+
\lim_{\Delta t \rightarrow 0}
\int_{\lambda_1}^{\lambda_2}
\Bf(t + \Delta t, \Bx(\lambda) + \Delta t \Bv(\Bx(\lambda) )) \cdot \PD{\lambda}{}\Bv(\Bx(\lambda)) d\lambda \\
&=
\int_{\lambda_1}^{\lambda_2}
\frac{D \Bf}{Dt} \cdot \frac{\partial \Bx}{\partial \lambda} d\lambda +
\lim_{\Delta t \rightarrow 0}
\int_{\lambda_1}^{\lambda_2}
\Bf(t + \Delta t, \Bx(\lambda) + \Delta t \Bv(\Bx(\lambda) )) \cdot \frac{\partial}{\partial \lambda} \Bv(\Bx(\lambda)) d\lambda \\
&=
\int_{\lambda_1}^{\lambda_2}
\lr{ \PD{t}{\Bf} + \lr{ \Bv \cdot \spacegrad } \Bf } \cdot \frac{\partial \Bx}{\partial \lambda} d\lambda
+
\int_{\lambda_1}^{\lambda_2}
\Bf \cdot \frac{\partial \Bv}{\partial \lambda} d\lambda
\end{aligned}
\end{equation}
At this point, we have a \( d\Bx \) in the first integrand, and a \( d\Bv \) in the second. We can expand the second integrand, evaluating the derivative using the chain rule to find
\begin{equation}\label{eqn:more_feynmans_trick:280}
\begin{aligned}
\Bf \cdot \PD{\lambda}{\Bv}
&=
\sum_i \Bf \cdot \PD{x_i}{\Bv} \PD{\lambda}{x_i} \\
&=
\sum_{i,j} f_j \PD{x_i}{v_j} \PD{\lambda}{x_i} \\
&=
\sum_{j} f_j \lr{ \spacegrad v_j } \cdot \PD{\lambda}{\Bx} \\
&=
\sum_{j} \lr{ \dot{\spacegrad} f_j \dot{v_j} } \cdot \PD{\lambda}{\Bx} \\
&=
\dot{\spacegrad} \lr{ \Bf \cdot \dot{\Bv} } \cdot \PD{\lambda}{\Bx}.
\end{aligned}
\end{equation}
Substitution gives
\begin{equation}\label{eqn:more_feynmans_trick:300}
\begin{aligned}
\ddt{} \int_{C(t)} \Bf \cdot d\Bx
&=
\int_{C(t)}
\lr{ \PD{t}{\Bf} + \lr{ \Bv \cdot \spacegrad } \Bf + \dot{\spacegrad} \lr{ \Bf \cdot \dot{\Bv} } } \cdot \frac{\partial \Bx}{\partial \lambda} d\lambda \\
&=
\int_{C(t)}
\lr{ \PD{t}{\Bf}
+ \spacegrad \lr{ \Bf \cdot \Bv }
+ \lr{ \Bv \cdot \spacegrad } \Bf
- \dot{\spacegrad} \lr{ \dot{\Bf} \cdot \Bv }
} \cdot d\Bx \\
&=
\int_{C(t)}
\lr{ \PD{t}{\Bf}
+ \spacegrad \lr{ \Bf \cdot \Bv }
+ \Bv \cdot \lr{ \spacegrad \wedge \Bf }
} \cdot d\Bx,
\end{aligned}
\end{equation}
where the last simplification utilizes lemma 1.1.

End proof.

Since \( \Ba \cdot \lr{ \Bb \wedge \Bc } = -\Ba \cross \lr{ \Bb \cross \Bc } \), observe that we have also recovered \ref{eqn:more_feynmans_trick:20}.
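The identity that connects the wedge and cross product forms is easy to spot check in \(\mathbb{R}^3\). The sketch below evaluates \( \Ba \cdot \lr{ \Bb \wedge \Bc } \) through the expansion \( \lr{ \Ba \cdot \Bb} \Bc - \Bb \lr{ \Ba \cdot \Bc } \) used in lemma 1.1, and compares it with the triple cross product. The helper names and the sample vectors are assumptions for illustration only.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def vector_dot_bivector(a, b, c):
    """a . (b ^ c), expanded as (a . b) c - b (a . c)."""
    ab, ac = dot(a, b), dot(a, c)
    return tuple(ab * ci - ac * bi for bi, ci in zip(b, c))

a, b, c = (1.0, 2.0, 3.0), (-1.0, 0.0, 2.0), (4.0, 1.0, -2.0)
lhs = vector_dot_bivector(a, b, c)
rhs = tuple(-component for component in cross(a, cross(b, c)))
print(lhs, rhs)
```

Both sides come out equal for these vectors, as the identity requires.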

Time derivative of a line integral of a bivector field.

For a bivector line integral, we have

Theorem 1.3:

Given a path parameterized by \( \Bx(\lambda) \), where \( d\Bx = (\PDi{\lambda}{\Bx}) d\lambda \), with points along the curve \( C(t) \) moving through space at a velocity \( \Bv(\Bx(\lambda)) \), and a bivector function \( B = B(t, \Bx(\lambda)) \),
\begin{equation*}
\ddt{} \int_{C(t)} B \cdot d\Bx =
\int_{C(t)}
\PD{t}{B} \cdot d\Bx + \lr{ d\Bx \cdot \spacegrad } \lr{ B \cdot \Bv } + \lr{ \lr{ \Bv \wedge d\Bx } \cdot \spacegrad } \cdot B.
\end{equation*}

Start proof:

Skipping the steps that follow our previous procedure exactly, we have
\begin{equation}\label{eqn:more_feynmans_trick:340}
\ddt{} \int_{C(t)} B \cdot d\Bx =
\int_{C(t)}
\PD{t}{B} \cdot d\Bx + \lr{ \Bv \cdot \spacegrad } B \cdot d\Bx + B \cdot d\Bv.
\end{equation}
Since
\begin{equation}\label{eqn:more_feynmans_trick:360}
\begin{aligned}
B \cdot d\Bv
&= B \cdot \PD{\lambda}{\Bv} d\lambda \\
&= B \cdot \PD{x_i}{\Bv} \PD{\lambda}{x_i} d\lambda \\
&= B \cdot \lr{ \lr{ d\Bx \cdot \spacegrad } \Bv },
\end{aligned}
\end{equation}
we have
\begin{equation}\label{eqn:more_feynmans_trick:380}
\ddt{} \int_{C(t)} B \cdot d\Bx
=
\int_{C(t)}
\PD{t}{B} \cdot d\Bx + \lr{ \Bv \cdot \spacegrad } B \cdot d\Bx + B \cdot \lr{ \lr{ d\Bx \cdot \spacegrad } \Bv }.
\end{equation}
Let’s reduce the last two terms in this integrand
\begin{equation}\label{eqn:more_feynmans_trick:400}
\begin{aligned}
\lr{ \Bv \cdot \spacegrad } B \cdot d\Bx + B \cdot \lr{ \lr{ d\Bx \cdot \spacegrad } \Bv }
&=
\lr{ \Bv \cdot \spacegrad } B \cdot d\Bx -
\lr{ d\Bx \cdot \dot{\spacegrad} } \lr{ \dot{\Bv} \cdot B } \\
&=
\lr{ \Bv \cdot \spacegrad } B \cdot d\Bx
- \lr{ d\Bx \cdot \spacegrad} \lr{ \Bv \cdot B }
+ \lr{ d\Bx \cdot \dot{\spacegrad} } \lr{ \Bv \cdot \dot{B} } \\
&=
\lr{ d\Bx \cdot \spacegrad} \lr{ B \cdot \Bv }
+ \lr{ \Bv \cdot \dot{\spacegrad} } \dot{B} \cdot d\Bx
+ \lr{ d\Bx \cdot \dot{\spacegrad} } \lr{ \Bv \cdot \dot{B} } \\
&=
\lr{ d\Bx \cdot \spacegrad} \lr{ B \cdot \Bv }
+ \lr{ \Bv \lr{ d\Bx \cdot \spacegrad } - d\Bx \lr{ \Bv \cdot \spacegrad } } \cdot B \\
&=
\lr{ d\Bx \cdot \spacegrad} \lr{ B \cdot \Bv }
+ \lr{ \lr{ \Bv \wedge d\Bx } \cdot \spacegrad } \cdot B.
\end{aligned}
\end{equation}
Back substitution finishes the job.

End proof.

Time derivative of a multivector line integral.

Theorem 1.4: Time derivative of multivector line integral.

Given a path parameterized by \( \Bx(\lambda) \), where \( d\Bx = (\PDi{\lambda}{\Bx}) d\lambda \), with points along the curve \( C(t) \) moving through space at a velocity \( \Bv(\Bx(\lambda)) \), and multivector functions \( M = M(t, \Bx(\lambda)), N = N(t, \Bx(\lambda)) \),
\begin{equation*}
\ddt{} \int_{C(t)} M d\Bx N =
\int_{C(t)}
\frac{D}{D t} M d\Bx N + M \lr{ \lr{ d\Bx \cdot \dot{\spacegrad} } \dot{\Bv} } N.
\end{equation*}

It is useful to write this out explicitly for clarity
\begin{equation}\label{eqn:more_feynmans_trick:420}
\ddt{} \int_{C(t)} M d\Bx N =
\int_{C(t)}
\PD{t}{M} d\Bx N + M d\Bx \PD{t}{N}
+ \dot{M} \lr{ \Bv \cdot \dot{\spacegrad} } d\Bx N
+ M d\Bx \lr{ \Bv \cdot \dot{\spacegrad} } \dot{N}
+ M \lr{ \lr{ d\Bx \cdot \dot{\spacegrad} } \dot{\Bv} } N.
\end{equation}

Proof is left to the reader, but follows the patterns above.

It’s not obvious whether there is a nice way to reduce this, as we did for the scalar valued line integral of a vector function, and the vector valued line integral of a bivector function. In particular, our vector and bivector results had \( \spacegrad \lr{ \Bf \cdot \Bv } \), and \( \spacegrad \lr{ B \cdot \Bv } \) terms respectively, which allows for the boundary term to be evaluated using Stokes’ theorem. Is such a manipulation possible here?

Coming later: surface integrals!

References

[1] Nicholas Kemmer. Vector Analysis: A physicist’s guide to the mathematics of fields in three dimensions. CUP Archive, 1977.

[2] Wikipedia contributors. Leibniz integral rule — Wikipedia, the free encyclopedia. https://en.wikipedia.org/w/index.php?title=Leibniz_integral_rule&oldid=1223666713, 2024. [Online; accessed 22-May-2024].

Simplifying the previous adjoint matrix results.

January 17, 2024 math and physics play

[Click here for a PDF version of this (and the previous) post]

We previously found determinant expressions for the matrix elements of the adjoint for 2D and 3D matrices \( M \). However, we can extract additional structure from each of those results.

2D case.

Given a matrix expressed in block matrix form in terms of its columns
\begin{equation}\label{eqn:adjoint:500}
M =
\begin{bmatrix}
\Bm_1 & \Bm_2
\end{bmatrix},
\end{equation}
we found that the adjoint \( A \) satisfying \( M A = \Abs{M} I \) had the structure
\begin{equation}\label{eqn:adjoint:520}
A =
\begin{bmatrix}
\begin{vmatrix} \Be_1 & \Bm_2 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_2 \end{vmatrix} \\
& \\
\begin{vmatrix} \Bm_1 & \Be_1 \end{vmatrix} & \begin{vmatrix} \Bm_1 & \Be_2 \end{vmatrix}
\end{bmatrix}.
\end{equation}
We initially had wedge product expressions for each of these matrix elements, and can discover our structure by putting back those wedge products. Modulo sign, each of these matrix elements has the form
\begin{equation}\label{eqn:adjoint:540}
\begin{aligned}
\begin{vmatrix} \Be_i & \Bm_j \end{vmatrix}
&=
\lr{ \Be_i \wedge \Bm_j } i^{-1} \\
&=
\gpgradezero{
\lr{ \Be_i \wedge \Bm_j } i^{-1}
} \\
&=
\gpgradezero{
\lr{ \Be_i \Bm_j - \Be_i \cdot \Bm_j } i^{-1}
} \\
&=
\gpgradezero{
\Be_i \Bm_j i^{-1}
} \\
&=
\Be_i \cdot \lr{ \Bm_j i^{-1} },
\end{aligned}
\end{equation}
where \( i = \Be_{12} \). The adjoint matrix is
\begin{equation}\label{eqn:adjoint:560}
A =
\begin{bmatrix}
-\lr{ \Bm_2 i } \cdot \Be_1 & -\lr{ \Bm_2 i } \cdot \Be_2 \\
\lr{ \Bm_1 i } \cdot \Be_1 & \lr{ \Bm_1 i } \cdot \Be_2 \\
\end{bmatrix}.
\end{equation}
If we use a column vector representation of the vectors \( \Bm_j i^{-1} \), we can write the adjoint in a compact hybrid geometric-algebra matrix form
\begin{equation}\label{eqn:adjoint:640}
A =
\begin{bmatrix}
-\lr{ \Bm_2 i }^\T \\
\lr{ \Bm_1 i }^\T
\end{bmatrix}.
\end{equation}

Check:

Let’s see if this works, by multiplying with \( M \)
\begin{equation}\label{eqn:adjoint:580}
\begin{aligned}
A M &=
\begin{bmatrix}
-\lr{ \Bm_2 i }^\T \\
\lr{ \Bm_1 i }^\T
\end{bmatrix}
\begin{bmatrix}
\Bm_1 & \Bm_2
\end{bmatrix} \\
&=
\begin{bmatrix}
-\lr{ \Bm_2 i }^\T \Bm_1 & -\lr{ \Bm_2 i }^\T \Bm_2 \\
\lr{ \Bm_1 i }^\T \Bm_1 & \lr{ \Bm_1 i }^\T \Bm_2
\end{bmatrix}.
\end{aligned}
\end{equation}
Those dot products have the form
\begin{equation}\label{eqn:adjoint:600}
\begin{aligned}
\lr{ \Bm_j i }^\T \Bm_k
&=
\lr{ \Bm_j i } \cdot \Bm_k \\
&=
\gpgradezero{ \lr{ \Bm_j i } \Bm_k } \\
&=
\gpgradezero{ -i \Bm_j \Bm_k } \\
&=
-i \lr{ \Bm_j \wedge \Bm_k },
\end{aligned}
\end{equation}
so
\begin{equation}\label{eqn:adjoint:620}
\begin{aligned}
A M &=
\begin{bmatrix}
i \lr{ \Bm_2 \wedge \Bm_1 } & 0 \\
0 & -i \lr { \Bm_1 \wedge \Bm_2 }
\end{bmatrix} \\
&=
\Abs{M} I.
\end{aligned}
\end{equation}
We find the determinant weighted identity that we expected. Our methods are a bit schizophrenic, switching fluidly between matrix and geometric algebra representations, but provided we are careful enough, this isn’t problematic.
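For a concrete instance of this check, here is a small Python sketch. It is not part of the original derivation, and the helper names `times_i`, `adjoint_2d`, and `dot` are made up for illustration. It assumes the coordinate form of right multiplication by \( i = \Be_{12} \), which maps \( (x, y) \) to \( (-y, x) \), since \( \Be_1 i = \Be_2 \) and \( \Be_2 i = -\Be_1 \).

```python
def times_i(m):
    """Right-multiply a 2D vector by i = e1 e2: (x, y) -> (-y, x)."""
    x, y = m
    return (-y, x)


def dot(a, b):
    """Euclidean dot product of two equal-length tuples."""
    return sum(p * q for p, q in zip(a, b))


def adjoint_2d(m1, m2):
    """Rows of the adjoint of M = [m1 m2]: -(m2 i)^T and (m1 i)^T."""
    row1 = tuple(-c for c in times_i(m2))
    row2 = times_i(m1)
    return (row1, row2)


# Columns of an example matrix M, with |M| = 2*5 - 1*4 = 6.
m1, m2 = (2, 4), (1, 5)

A = adjoint_2d(m1, m2)

# Matrix product A M, formed row-by-column with dot products.
AM = [[dot(row, col) for col in (m1, m2)] for row in A]
print(AM)  # [[6, 0], [0, 6]]
```

The rows of `A` here are `(5, -1)` and `(-4, 2)`, which is the familiar swap-and-negate form of the 2x2 adjoint.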

3D case.

Now, let’s look at the 3D case, where we assume a column vector representation of the matrix of interest
\begin{equation}\label{eqn:adjoint:660}
M =
\begin{bmatrix}
\Bm_1 & \Bm_2 & \Bm_3
\end{bmatrix},
\end{equation}
and try to simplify the expression we found for the adjoint
\begin{equation}\label{eqn:adjoint:680}
A =
\begin{bmatrix}
\begin{vmatrix} \Be_1 & \Bm_2 & \Bm_3 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_2 & \Bm_3 \end{vmatrix} & \begin{vmatrix} \Be_3 & \Bm_2 & \Bm_3 \end{vmatrix} \\
& & \\
\begin{vmatrix} \Be_1 & \Bm_3 & \Bm_1 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_3 & \Bm_1 \end{vmatrix} & \begin{vmatrix} \Be_3 & \Bm_3 & \Bm_1 \end{vmatrix} \\
& & \\
\begin{vmatrix} \Be_1 & \Bm_1 & \Bm_2 \end{vmatrix} & \begin{vmatrix} \Be_2 & \Bm_1 & \Bm_2 \end{vmatrix} & \begin{vmatrix} \Be_3 & \Bm_1 & \Bm_2 \end{vmatrix}
\end{bmatrix}.
\end{equation}
As with the 2D case, let’s re-express these determinants in wedge product form. We’ll write \( I = \Be_{123} \), and find
\begin{equation}\label{eqn:adjoint:700}
\begin{aligned}
\begin{vmatrix} \Be_i & \Bm_j & \Bm_k \end{vmatrix}
&=
\lr{ \Be_i \wedge \Bm_j \wedge \Bm_k } I^{-1} \\
&=
\gpgradezero{ \lr{ \Be_i \wedge \Bm_j \wedge \Bm_k } I^{-1} } \\
&=
\gpgradezero{ \lr{
\Be_i \lr{ \Bm_j \wedge \Bm_k }
- \Be_i \cdot \lr{ \Bm_j \wedge \Bm_k }
} I^{-1} } \\
&=
\gpgradezero{
\Be_i \lr{ \Bm_j \wedge \Bm_k }
I^{-1} } \\
&=
\gpgradezero{
\Be_i \lr{ \Bm_j \cross \Bm_k } I
I^{-1} } \\
&=
\Be_i \cdot \lr{ \Bm_j \cross \Bm_k }.
\end{aligned}
\end{equation}
We see that we can put the adjoint in block matrix form
\begin{equation}\label{eqn:adjoint:720}
A =
\begin{bmatrix}
\lr{ \Bm_2 \cross \Bm_3 }^\T \\
\lr{ \Bm_3 \cross \Bm_1 }^\T \\
\lr{ \Bm_1 \cross \Bm_2 }^\T \\
\end{bmatrix}.
\end{equation}

Check:

\begin{equation}\label{eqn:adjoint:740}
\begin{aligned}
A M
&=
\begin{bmatrix}
\lr{ \Bm_2 \cross \Bm_3 }^\T \\
\lr{ \Bm_3 \cross \Bm_1 }^\T \\
\lr{ \Bm_1 \cross \Bm_2 }^\T \\
\end{bmatrix}
\begin{bmatrix}
\Bm_1 & \Bm_2 & \Bm_3
\end{bmatrix} \\
&=
\begin{bmatrix}
\lr{ \Bm_2 \cross \Bm_3 }^\T \Bm_1 & \lr{ \Bm_2 \cross \Bm_3 }^\T \Bm_2 & \lr{ \Bm_2 \cross \Bm_3 }^\T \Bm_3 \\
\lr{ \Bm_3 \cross \Bm_1 }^\T \Bm_1 & \lr{ \Bm_3 \cross \Bm_1 }^\T \Bm_2 & \lr{ \Bm_3 \cross \Bm_1 }^\T \Bm_3 \\
\lr{ \Bm_1 \cross \Bm_2 }^\T \Bm_1 & \lr{ \Bm_1 \cross \Bm_2 }^\T \Bm_2 & \lr{ \Bm_1 \cross \Bm_2 }^\T \Bm_3
\end{bmatrix} \\
&=
\Abs{M} I.
\end{aligned}
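\end{equation}
The diagonal elements are all equal to the triple product \( \Bm_1 \cdot \lr{ \Bm_2 \cross \Bm_3 } = \Abs{M} \), up to a cyclic permutation of the factors, and the off-diagonal elements are zero, since each cross product is perpendicular to both of its factors.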
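As a numerical sanity check of the cross product block form, here is a minimal Python sketch. The helper names `cross`, `dot`, and `adjoint_3d` are illustrative and the example columns are arbitrary, so treat this as a spot check under those assumptions rather than a proof.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Euclidean dot product of two equal-length tuples."""
    return sum(p * q for p, q in zip(a, b))


def adjoint_3d(m1, m2, m3):
    """Rows of the adjoint of M = [m1 m2 m3], as cyclic cross products."""
    return (cross(m2, m3), cross(m3, m1), cross(m1, m2))


# Columns of an example matrix M.
m1, m2, m3 = (1, 0, 2), (0, 1, 1), (1, 1, 0)

# The determinant, as the triple product m1 . (m2 x m3).
det_M = dot(m1, cross(m2, m3))

A = adjoint_3d(m1, m2, m3)

# Matrix product A M, formed row-by-column with dot products.
AM = [[dot(row, col) for col in (m1, m2, m3)] for row in A]
print(det_M)  # -3
print(AM)     # [[-3, 0, 0], [0, -3, 0], [0, 0, -3]]
```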

Essentially, we found that the rows of the adjoint matrix are each parallel to the reciprocal frame vectors of the columns of \( M \). This makes sense, as the reciprocal frame encodes a generalized inverse of sorts.
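
To spell that relationship out, here is a short worked statement. It assumes \( \Abs{M} \ne 0 \), and the upper-index notation \( \Bm^i \) for the reciprocal frame vectors is introduced here for illustration. With \( \Abs{M} = \Bm_1 \cdot \lr{ \Bm_2 \cross \Bm_3 } \), the reciprocal frame is
\begin{equation*}
\Bm^1 = \frac{\Bm_2 \cross \Bm_3}{\Abs{M}}, \quad
\Bm^2 = \frac{\Bm_3 \cross \Bm_1}{\Abs{M}}, \quad
\Bm^3 = \frac{\Bm_1 \cross \Bm_2}{\Abs{M}},
\qquad
\Bm^i \cdot \Bm_j = \delta^i_j,
\end{equation*}
so the adjoint and the inverse can be written as
\begin{equation*}
A =
\Abs{M}
\begin{bmatrix}
\lr{ \Bm^1 }^\T \\
\lr{ \Bm^2 }^\T \\
\lr{ \Bm^3 }^\T
\end{bmatrix},
\qquad
M^{-1} = \frac{A}{\Abs{M}} =
\begin{bmatrix}
\lr{ \Bm^1 }^\T \\
\lr{ \Bm^2 }^\T \\
\lr{ \Bm^3 }^\T
\end{bmatrix}.
\end{equation*}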