
New version of Geometric Algebra for Electrical Engineers published.

December 9, 2023 Geometric Algebra for Electrical Engineers

A new version of my book is now published. The free PDF and Leanpub versions are available now. The paperback and hardcover versions should be available on Amazon within the week.

What has changed:

  • V0.3.2 (Dec 8, 2023)
    • Add to helpful formulas: Determinant form of triple wedge.
    • Add figure showing the spherical polar conventions picked.
    • Add a problem showing that \( (e^x)' = x' e^x \) only when \( x \) and \( x' \) commute, which is true for scalars and complex numbers, but not necessarily true for abstract entities, such as multivectors and square matrices (see the sketch following this list).
    • Spherical polar coordinates: do not skip steps for \( \mathbf{x}_\phi \) computation.
    • Rewrite the Multivector potentials section. No longer pulling the ideas out of a magic hat, instead trying to motivate them.  Compromised on the strategy to do so, leaving some of the details to problems.
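
For a flavour of that exponential derivative problem, here is a condensed sketch of the argument (my summary, not the problem statement as published). Differentiating the power series for \( e^x \) term by term gives
\begin{equation*}
\lr{ e^x }' = \sum_{k=1}^\infty \inv{k!} \lr{ x' x^{k-1} + x x' x^{k-2} + \cdots + x^{k-1} x' },
\end{equation*}
and only when \( x \) commutes with \( x' \) do the \( k \) terms of each summand collapse to \( k \, x' x^{k-1} \), recovering the familiar \( \lr{e^x}' = x' e^x \).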

I’ve been working on this potentials rewrite indirectly for the last month, and have published two blog posts about the topic, as well as another that I wrote and discarded, but which helped me form and sequence some of the ideas.

The exponential derivative topic was also covered on my blog recently. I’ve reworked that so that it is independent of the specific application to spherical polar coordinates, and set it as a problem for the reader (with solution at the end of chapter I, in case I didn’t give enough hints in the problem statement).

Radial vector representation, momentum, and angular momentum.

September 8, 2023 math and physics play

[Click here for a PDF version of this post], and here for a video version of this post.

 

Motivation.

In my last couple of GA YouTube videos, circular and spherical coordinates were examined.

This post is a text representation of a new video that follows up on those two videos.

We found the form of the unit vector derivatives in both cases. Here we strip the problem down to its essentials, writing the position vector in purely radial form

\begin{equation}\label{eqn:radialderivatives:20}
\Bx = r \mathbf{\hat{r}},
\end{equation}
leaving the angular dependence of \( \mathbf{\hat{r}} \) unspecified. We want to find both \( \Bv = \Bx' \) and \( \mathbf{\hat{r}}' \).

Derivatives.

Lemma 1.1: Radial length derivative.

The derivative of the radial length \( r \) can be expressed as
\begin{equation*}
\frac{dr}{dt} = \mathbf{\hat{r}} \cdot \frac{d\Bx}{dt}.
\end{equation*}

Start proof:

We write \( r^2 = \Bx \cdot \Bx \), and take derivatives of both sides, to find
\begin{equation}\label{eqn:radialderivatives:60}
2 r \frac{dr}{dt} = 2 \Bx \cdot \frac{d\Bx}{dt},
\end{equation}
or
\begin{equation}\label{eqn:radialderivatives:80}
\frac{dr}{dt} = \frac{\Bx}{r} \cdot \frac{d\Bx}{dt} = \mathbf{\hat{r}} \cdot \frac{d\Bx}{dt}.
\end{equation}

End proof.

Application of the chain rule to \ref{eqn:radialderivatives:20} is straightforward
\begin{equation}\label{eqn:radialderivatives:100}
\Bx' = r' \mathbf{\hat{r}} + r \mathbf{\hat{r}}',
\end{equation}
but we don’t know the form for \( \mathbf{\hat{r}}' \). We could proceed with a naive expansion of
\begin{equation}\label{eqn:radialderivatives:120}
\frac{d}{dt} \lr{ \frac{\Bx}{r} },
\end{equation}
but we can be sneaky, and perform a projective and rejective split of \( \Bx' \) with respect to \( \mathbf{\hat{r}} \). That is
\begin{equation}\label{eqn:radialderivatives:140}
\begin{aligned}
\Bx'
&= \mathbf{\hat{r}} \mathbf{\hat{r}} \Bx' \\
&= \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \Bx' } \\
&= \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \cdot \Bx' + \mathbf{\hat{r}} \wedge \Bx'} \\
&= \mathbf{\hat{r}} \lr{ r' + \mathbf{\hat{r}} \wedge \Bx'}.
\end{aligned}
\end{equation}
We used our lemma in the last step above, and after distribution, find
\begin{equation}\label{eqn:radialderivatives:160}
\Bx' = r' \mathbf{\hat{r}} + \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bx' }.
\end{equation}
Comparing to \ref{eqn:radialderivatives:100}, we see that
\begin{equation}\label{eqn:radialderivatives:180}
r \mathbf{\hat{r}}' = \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bx' }.
\end{equation}
We see that the radial unit vector derivative is proportional to the rejection of \( \mathbf{\hat{r}} \) from \( \Bx' \)
\begin{equation}\label{eqn:radialderivatives:200}
\mathbf{\hat{r}}' = \inv{r} \mathrm{Rej}_{\mathbf{\hat{r}}}(\Bx') = \inv{r^3} \Bx \lr{ \Bx \wedge \Bx' }.
\end{equation}
The vector \( \mathbf{\hat{r}}' \) is perpendicular to \( \mathbf{\hat{r}} \) for any parameterization of its orientation, or in symbols
\begin{equation}\label{eqn:radialderivatives:220}
\mathbf{\hat{r}} \cdot \mathbf{\hat{r}}' = 0.
\end{equation}
We saw this for the circular and spherical parameterizations, and see now that this also holds more generally.
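
As a quick cross check, here is a small numerical experiment (a sketch of mine, not part of the original post; the curve is an arbitrary choice) that tests the lemma and \ref{eqn:radialderivatives:200} (using the cross product form derived in the problems below) against finite differences:

import numpy as np

def x_of_t(t):
    # an arbitrary smooth curve with varying length and orientation
    return np.array([(2 + t) * np.cos(t), np.sin(t), t ** 2])

def deriv(f, t, h=1e-6):
    # central difference approximation of df/dt
    return (f(t + h) - f(t - h)) / (2 * h)

t = 0.7
x = x_of_t(t)
xp = deriv(x_of_t, t)                       # x'
r = np.linalg.norm(x)
rhat = x / r

# lemma: r' = rhat . x'
rp_numeric = deriv(lambda s: np.linalg.norm(x_of_t(s)), t)
print(np.isclose(rp_numeric, rhat @ xp))    # True

# rhat' = (1/r^3) (x cross x') cross x, the cross product form of eqn 200
rhatp_numeric = deriv(lambda s: x_of_t(s) / np.linalg.norm(x_of_t(s)), t)
rhatp_formula = np.cross(np.cross(x, xp), x) / r ** 3
print(np.allclose(rhatp_numeric, rhatp_formula))   # True
print(np.isclose(rhat @ rhatp_formula, 0.0))       # rhat . rhat' = 0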

Angular momentum.

Let’s now write out the momentum \( \Bp = m \Bv \) for a point particle with mass \( m \), and determine the kinetic energy \( m \Bv^2/2 = \Bp^2/2m \) for that particle.

The momentum is
\begin{equation}\label{eqn:radialderivatives:320}
\begin{aligned}
\Bp
&= m r' \mathbf{\hat{r}} + m \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bv } \\
&= m r' \mathbf{\hat{r}} + \inv{r} \mathbf{\hat{r}} \lr{ \Br \wedge \Bp }.
\end{aligned}
\end{equation}
Observe that \( p_r = m r' \) is the radial component of the momentum. It is natural to introduce a bivector valued angular momentum
\begin{equation}\label{eqn:radialderivatives:340}
L = \Br \wedge \Bp,
\end{equation}
splitting the momentum into a component that is strictly radial and a component that lies purely on the surface of a sphere in momentum space. That is
\begin{equation}\label{eqn:radialderivatives:360}
\Bp = p_r \mathbf{\hat{r}} + \inv{r} \mathbf{\hat{r}} L.
\end{equation}
Making use of the fact that \( \mathbf{\hat{r}} \) and \( \mathrm{Rej}_{\mathbf{\hat{r}}}(\Bx') \) are perpendicular (so there are no cross terms when we square the momentum), the
kinetic energy is
\begin{equation}\label{eqn:radialderivatives:380}
\begin{aligned}
\inv{2m} \Bp^2
&= \inv{2m} \lr{ p_r \mathbf{\hat{r}} + \inv{r} \mathbf{\hat{r}} L }^2 \\
&= \inv{2m} p_r^2 + \inv{2 m r^2 } \mathbf{\hat{r}} L \mathbf{\hat{r}} L \\
&= \inv{2m} p_r^2 - \inv{2 m r^2 } \mathbf{\hat{r}} L^2 \mathbf{\hat{r}} \\
&= \inv{2m} p_r^2 - \inv{2 m r^2 } L^2 \mathbf{\hat{r}}^2,
\end{aligned}
\end{equation}
where we’ve used the anticommutative nature of \( \mathbf{\hat{r}} \) and \( L \) (a sign swap is needed to interchange them, since \( \mathbf{\hat{r}} \) lies in the plane of the bivector \( L \)), and used the fact that \( L^2 \) is a scalar, allowing us to commute \( \mathbf{\hat{r}} \) with \( L^2 \). This leaves us with
\begin{equation}\label{eqn:radialderivatives:400}
E = \inv{2m} \Bp^2 = \inv{2m} p_r^2 - \inv{2 m r^2 } L^2.
\end{equation}
Observe that the radial momentum term and the angular momentum term are both non-negative, since \( L \) is a bivector and \( L^2 \le 0 \).
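
Here is a small numerical confirmation of this energy split (my own sketch, with arbitrarily chosen position, velocity and mass; in 3D the square of the bivector \( L \) may be computed as \( L^2 = -\lr{ \Bx \cross \Bp } \cdot \lr{ \Bx \cross \Bp } \)):

import numpy as np

m = 2.5
x = np.array([1.0, -2.0, 0.5])             # position
v = np.array([0.3, 0.8, -1.1])             # velocity
p = m * v
r = np.linalg.norm(x)
rhat = x / r

p_r = m * (rhat @ v)                       # radial momentum p_r = m r'
L_sq = -np.cross(x, p) @ np.cross(x, p)    # bivector square, L^2 <= 0

E_direct = p @ p / (2 * m)
E_split = p_r ** 2 / (2 * m) - L_sq / (2 * m * r ** 2)
print(np.isclose(E_direct, E_split))       # True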

Problems.

Problem:

Find \ref{eqn:radialderivatives:200} without being sneaky.

Answer

\begin{equation}\label{eqn:radialderivatives:280}
\begin{aligned}
\mathbf{\hat{r}}'
&= \frac{d}{dt} \lr{ \frac{\Bx}{r} } \\
&= \inv{r} \Bx' - \inv{r^2} \Bx r' \\
&= \inv{r} \Bx' - \inv{r} \mathbf{\hat{r}} r' \\
&= \inv{r} \lr{ \Bx' - \mathbf{\hat{r}} r' } \\
&= \inv{r} \lr{ \mathbf{\hat{r}} \mathbf{\hat{r}} \Bx' - \mathbf{\hat{r}} r' } \\
&= \inv{r} \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \Bx' - r' } \\
&= \inv{r} \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \Bx' - \mathbf{\hat{r}} \cdot \Bx' } \\
&= \inv{r} \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bx' }.
\end{aligned}
\end{equation}

Problem:

Show that \ref{eqn:radialderivatives:200} can be expressed as a triple vector cross product
\begin{equation}\label{eqn:radialderivatives:230}
\mathbf{\hat{r}}' = \inv{r^3} \lr{ \Bx \cross \Bx' } \cross \Bx.
\end{equation}

Answer

While this may be familiar from elementary calculus, such as in [1], it follows easily from our GA result
\begin{equation}\label{eqn:radialderivatives:300}
\begin{aligned}
\mathbf{\hat{r}}'
&= \inv{r} \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bx' } \\
&= \inv{r} \gpgradeone{ \mathbf{\hat{r}} \lr{ \mathbf{\hat{r}} \wedge \Bx' } } \\
&= \inv{r} \gpgradeone{ \mathbf{\hat{r}} I \lr{ \mathbf{\hat{r}} \cross \Bx' } } \\
&= \inv{r} \gpgradeone{ I \lr{ \mathbf{\hat{r}} \cdot \lr{ \mathbf{\hat{r}} \cross \Bx' } + \mathbf{\hat{r}} \wedge \lr{ \mathbf{\hat{r}} \cross \Bx' } } } \\
&= \inv{r} \gpgradeone{ I^2 \mathbf{\hat{r}} \cross \lr{ \mathbf{\hat{r}} \cross \Bx' } } \\
&= \inv{r} \lr{ \mathbf{\hat{r}} \cross \Bx' } \cross \mathbf{\hat{r}}.
\end{aligned}
\end{equation}

References

[1] S.L. Salas and E. Hille. Calculus: one and several variables. Wiley New York, 1990.

Jacobian and Hessian matrices

January 15, 2017 ece1505

[Click here for a PDF of this post with nicer formatting]

Motivation

In class this Friday the Jacobian and Hessian matrices were introduced, but I did not find the treatment terribly clear. Here is an alternate treatment, beginning with the gradient construction from [2], which uses a nice trick to frame the multivariable derivative operation as a single variable Taylor expansion.

Multivariable Taylor approximation

The Taylor series expansion for a scalar function \( g : {\mathbb{R}} \rightarrow {\mathbb{R}} \) about the origin is just

\begin{equation}\label{eqn:jacobianAndHessian:20}
g(t) = g(0) + t g'(0) + \frac{t^2}{2} g''(0) + \cdots
\end{equation}

In particular

\begin{equation}\label{eqn:jacobianAndHessian:40}
g(1) = g(0) + g'(0) + \frac{1}{2} g''(0) + \cdots
\end{equation}

Now consider \( g(t) = f( \Bx + \Ba t ) \), where \( f : {\mathbb{R}}^n \rightarrow {\mathbb{R}} \), \( g(0) = f(\Bx) \), and \( g(1) = f(\Bx + \Ba) \). The multivariable Taylor expansion now follows directly

\begin{equation}\label{eqn:jacobianAndHessian:60}
f( \Bx + \Ba)
= f(\Bx)
+ \evalbar{\frac{df(\Bx + \Ba t)}{dt}}{t = 0} + \frac{1}{2} \evalbar{\frac{d^2f(\Bx + \Ba t)}{dt^2}}{t = 0} + \cdots
\end{equation}

The first order term is

\begin{equation}\label{eqn:jacobianAndHessian:80}
\begin{aligned}
\evalbar{\frac{df(\Bx + \Ba t)}{dt}}{t = 0}
&=
\sum_{i = 1}^n
\frac{d( x_i + a_i t)}{dt}
\evalbar{\PD{(x_i + a_i t)}{f(\Bx + \Ba t)}}{t = 0} \\
&=
\sum_{i = 1}^n
a_i
\PD{x_i}{f(\Bx)} \\
&= \Ba \cdot \spacegrad f.
\end{aligned}
\end{equation}

Similarly, for the second order term

\begin{equation}\label{eqn:jacobianAndHessian:100}
\begin{aligned}
\evalbar{\frac{d^2 f(\Bx + \Ba t)}{dt^2}}{t = 0}
&=
\evalbar{\lr{
\frac{d}{dt}
\lr{
\sum_{i = 1}^n
a_i
\PD{(x_i + a_i t)}{f(\Bx + \Ba t)}
}
}
}{t = 0} \\
&=
\evalbar{
\lr{
\sum_{j = 1}^n
\frac{d(x_j + a_j t)}{dt}
\sum_{i = 1}^n
a_i
\frac{\partial^2 f(\Bx + \Ba t)}{\partial (x_j + a_j t) \partial (x_i + a_i t) }
}
}{t = 0} \\
&=
\sum_{i,j = 1}^n a_i a_j \frac{\partial^2 f}{\partial x_i \partial x_j} \\
&=
(\Ba \cdot \spacegrad)^2 f.
\end{aligned}
\end{equation}

The complete Taylor expansion of a scalar function \( f : {\mathbb{R}}^n \rightarrow {\mathbb{R}} \) is therefore

\begin{equation}\label{eqn:jacobianAndHessian:120}
f(\Bx + \Ba)
= f(\Bx) +
\Ba \cdot \spacegrad f +
\inv{2} \lr{ \Ba \cdot \spacegrad}^2 f + \cdots,
\end{equation}

so the Taylor expansion has an exponential structure

\begin{equation}\label{eqn:jacobianAndHessian:140}
f(\Bx + \Ba) = \sum_{k = 0}^\infty \inv{k!} \lr{ \Ba \cdot \spacegrad}^k f = e^{\Ba \cdot \spacegrad} f.
\end{equation}
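
As a sanity check of this exponential structure, here is a small symbolic computation (my own sketch, using sympy; the sample polynomial is an arbitrary choice) that applies \( \lr{ \Ba \cdot \spacegrad } \) repeatedly, and confirms that the terminating series equals the shifted function exactly:

import sympy as sp

x1, x2, a1, a2 = sp.symbols('x1 x2 a1 a2')
f = x1**2 * x2 + 3 * x1 * x2**2            # sample polynomial f : R^2 -> R

def directional(expr):
    # one application of (a . grad)
    return a1 * sp.diff(expr, x1) + a2 * sp.diff(expr, x2)

series, term = sp.Integer(0), f
for k in range(6):                          # degree 3, so terms vanish for k >= 4
    series += term / sp.factorial(k)
    term = directional(term)

shifted = f.subs({x1: x1 + a1, x2: x2 + a2}, simultaneous=True)
print(sp.expand(series - shifted))          # 0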

Should an approximation of a vector valued function \( \Bf : {\mathbb{R}}^n \rightarrow {\mathbb{R}}^m \) be desired, it is only required to form a column vector of the components

\begin{equation}\label{eqn:jacobianAndHessian:160}
\Bf(\Bx + \Ba)
= \Bf(\Bx) +
[\Ba \cdot \spacegrad f_i]_i +
\inv{2} [\lr{ \Ba \cdot \spacegrad}^2 f_i]_i + \cdots,
\end{equation}

where \( [.]_i \) denotes a column vector over the rows \( i \in [1,m] \), and \( f_i \) are the coordinates of \( \Bf \).

The Jacobian matrix

In [1] the Jacobian \( D \Bf \) of a function \( \Bf : {\mathbb{R}}^n \rightarrow {\mathbb{R}}^m \) is defined in terms of the limit of the \( l_2 \) norm ratio

\begin{equation}\label{eqn:jacobianAndHessian:180}
\frac{\Norm{\Bf(\Bz) - \Bf(\Bx) - (D \Bf) (\Bz - \Bx)}_2 }{ \Norm{\Bz - \Bx}_2 },
\end{equation}

with the statement that the function \( \Bf \) has a derivative if this ratio goes to zero in the limit \( \Bz \rightarrow \Bx \). Here the Jacobian \( D \Bf \in {\mathbb{R}}^{m \times n} \) must be matrix valued.

Let \( \Bz = \Bx + \Ba \), so the first order expansion of \ref{eqn:jacobianAndHessian:160} is

\begin{equation}\label{eqn:jacobianAndHessian:200}
\Bf(\Bz)
= \Bf(\Bx) + [\lr{ \Bz - \Bx } \cdot \spacegrad f_i]_i.
\end{equation}

With the (unproven) assumption that this Taylor expansion satisfies the norm limit criterion of \ref{eqn:jacobianAndHessian:180}, it is possible to extract the structure of the Jacobian by comparison

\begin{equation}\label{eqn:jacobianAndHessian:220}
\begin{aligned}
(D \Bf)
(\Bz - \Bx)
&=
{\begin{bmatrix}
\lr{ \Bz - \Bx } \cdot \spacegrad f_i
\end{bmatrix}}_i \\
&=
{\begin{bmatrix}
\sum_{j = 1}^n (z_j - x_j) \PD{x_j}{f_i}
\end{bmatrix}}_i \\
&=
{\begin{bmatrix}
\PD{x_j}{f_i}
\end{bmatrix}}_{ij}
(\Bz - \Bx),
\end{aligned}
\end{equation}

so
\begin{equation}\label{eqn:jacobianAndHessian:240}
\boxed{
(D \Bf)_{ij} = \PD{x_j}{f_i}
}
\end{equation}

Written out explicitly as a matrix, the Jacobian is

\begin{equation}\label{eqn:jacobianAndHessian:320}
D \Bf
=
\begin{bmatrix}
\PD{x_1}{f_1} & \PD{x_2}{f_1} & \cdots & \PD{x_n}{f_1} \\
\PD{x_1}{f_2} & \PD{x_2}{f_2} & \cdots & \PD{x_n}{f_2} \\
\vdots & \vdots & & \vdots \\
\PD{x_1}{f_m} & \PD{x_2}{f_m} & \cdots & \PD{x_n}{f_m} \\
\end{bmatrix}
=
\begin{bmatrix}
(\spacegrad f_1)^\T \\
(\spacegrad f_2)^\T \\
\vdots \\
(\spacegrad f_m)^\T
\end{bmatrix}.
\end{equation}

In particular, when the function is scalar valued
\begin{equation}\label{eqn:jacobianAndHessian:261}
D f = (\spacegrad f)^\T.
\end{equation}

With this notation, the first order Taylor expansion, in terms of the Jacobian matrix, is

\begin{equation}\label{eqn:jacobianAndHessian:260}
\boxed{
\Bf(\Bz)
\approx \Bf(\Bx) + (D \Bf) \lr{ \Bz - \Bx }.
}
\end{equation}
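
To make the boxed result concrete, here is a finite difference sketch (my own illustration; the sample map and step sizes are arbitrary assumptions) that builds the Jacobian column by column, \( (D \Bf)_{ij} = \PD{x_j}{f_i} \), and tests the first order approximation:

import numpy as np

def f(x):
    # a sample map f : R^3 -> R^2
    return np.array([x[0] * x[1], np.sin(x[2]) + x[0] ** 2])

def jacobian(f, x, h=1e-6):
    fx = f(x)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)   # column j holds df_i/dx_j
    return J

x = np.array([1.0, 2.0, 0.3])
z = x + 1e-3 * np.array([1.0, -1.0, 0.5])
J = jacobian(f, x)
print(f(z) - (f(x) + J @ (z - x)))   # second order residual, ~1e-6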

The Hessian matrix

For scalar valued functions, the text expresses the second order expansion of a function in terms of the Jacobian and Hessian matrices

\begin{equation}\label{eqn:jacobianAndHessian:271}
f(\Bz)
\approx f(\Bx) + (D f) \lr{ \Bz - \Bx }
+ \inv{2} \lr{ \Bz - \Bx }^\T (\spacegrad^2 f) \lr{ \Bz - \Bx }.
\end{equation}

Because \( \spacegrad^2 \) is the usual notation for a Laplacian operator, this \( \spacegrad^2 f \in {\mathbb{R}}^{n \times n}\) notation for the Hessian matrix is not ideal in my opinion. Ignoring that notational objection for this class, the structure of the Hessian matrix can be extracted by comparison with the coordinate expansion

\begin{equation}\label{eqn:jacobianAndHessian:300}
\Ba^\T (\spacegrad^2 f) \Ba
=
\sum_{r,s = 1}^n a_r a_s \frac{\partial^2 f}{\partial x_r \partial x_s}
\end{equation}

so
\begin{equation}\label{eqn:jacobianAndHessian:280}
\boxed{
(\spacegrad^2 f)_{ij}
=
\frac{\partial^2 f}{\partial x_i \partial x_j}.
}
\end{equation}

In explicit matrix form the Hessian is

\begin{equation}\label{eqn:jacobianAndHessian:340}
\spacegrad^2 f
=
\begin{bmatrix}
\frac{\partial^2 f}{\partial x_1 \partial x_1} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots &\frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2 \partial x_2} & \cdots &\frac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots &\frac{\partial^2 f}{\partial x_n \partial x_n}
\end{bmatrix}.
\end{equation}
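
A matching sketch (again my own, with an arbitrary sample function and step sizes) computes the Hessian by differencing a finite difference gradient, and checks the second order expansion above:

import numpy as np

def f(x):
    # a sample scalar function on R^3
    return x[0] ** 2 * x[1] + np.cos(x[2])

def grad(f, x, h=1e-5):
    return np.array([(f(x + h * e) - f(x - h * e)) / (2 * h)
                     for e in np.eye(x.size)])

def hessian(f, x, h=1e-5):
    # second derivatives by differencing the gradient; symmetric up to rounding
    return np.array([(grad(f, x + h * e) - grad(f, x - h * e)) / (2 * h)
                     for e in np.eye(x.size)])

x = np.array([1.0, 0.5, 0.2])
a = 1e-2 * np.array([1.0, -2.0, 0.5])
H = hessian(f, x)
second_order = f(x) + grad(f, x) @ a + 0.5 * a @ H @ a
print(f(x + a) - second_order)       # third order residual, ~1e-6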

Is there a similar nice matrix structure for the Hessian of a vector valued function \( \Bf : {\mathbb{R}}^n \rightarrow {\mathbb{R}}^m \)?

References

[1] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge university press, 2004.

[2] D. Hestenes. New Foundations for Classical Mechanics. Kluwer Academic Publishers, 1999.