Taylor series

Jacobian and Hessian matrices

January 15, 2017 ece1505


Motivation

In class this Friday the Jacobian and Hessian matrices were introduced, but I did not find the treatment terribly clear. Here is an alternate treatment, beginning with the gradient construction from [2], which uses a nice trick to frame the multivariable derivative operation as a single variable Taylor expansion.

Multivariable Taylor approximation

The Taylor series expansion for a scalar function \( g : {\mathbb{R}} \rightarrow {\mathbb{R}} \) about the origin is just

\begin{equation}\label{eqn:jacobianAndHessian:20}
g(t) = g(0) + t g'(0) + \frac{t^2}{2} g''(0) + \cdots
\end{equation}

In particular

\begin{equation}\label{eqn:jacobianAndHessian:40}
g(1) = g(0) + g'(0) + \frac{1}{2} g''(0) + \cdots
\end{equation}

Now consider \( g(t) = f( \Bx + \Ba t ) \), where \( f : {\mathbb{R}}^n \rightarrow {\mathbb{R}} \), \( g(0) = f(\Bx) \), and \( g(1) = f(\Bx + \Ba) \). The multivariable Taylor expansion now follows directly

\begin{equation}\label{eqn:jacobianAndHessian:60}
f( \Bx + \Ba)
= f(\Bx)
+ \evalbar{\frac{df(\Bx + \Ba t)}{dt}}{t = 0} + \frac{1}{2} \evalbar{\frac{d^2f(\Bx + \Ba t)}{dt^2}}{t = 0} + \cdots
\end{equation}

The first order term is

\begin{equation}\label{eqn:jacobianAndHessian:80}
\begin{aligned}
\evalbar{\frac{df(\Bx + \Ba t)}{dt}}{t = 0}
&=
\sum_{i = 1}^n
\frac{d( x_i + a_i t)}{dt}
\evalbar{\PD{(x_i + a_i t)}{f(\Bx + \Ba t)}}{t = 0} \\
&=
\sum_{i = 1}^n
a_i
\PD{x_i}{f(\Bx)} \\
&= \Ba \cdot \spacegrad f.
\end{aligned}
\end{equation}

Similarly, for the second order term

\begin{equation}\label{eqn:jacobianAndHessian:100}
\begin{aligned}
\evalbar{\frac{d^2 f(\Bx + \Ba t)}{dt^2}}{t = 0}
&=
\evalbar{\lr{
\frac{d}{dt}
\lr{
\sum_{i = 1}^n
a_i
\PD{(x_i + a_i t)}{f(\Bx + \Ba t)}
}
}
}{t = 0} \\
&=
\evalbar{
\lr{
\sum_{j = 1}^n
\frac{d(x_j + a_j t)}{dt}
\sum_{i = 1}^n
a_i
\frac{\partial^2 f(\Bx + \Ba t)}{\partial (x_j + a_j t) \partial (x_i + a_i t) }
}
}{t = 0} \\
&=
\sum_{i,j = 1}^n a_i a_j \frac{\partial^2 f}{\partial x_i \partial x_j} \\
&=
(\Ba \cdot \spacegrad)^2 f.
\end{aligned}
\end{equation}

The complete Taylor expansion of a scalar function \( f : {\mathbb{R}}^n \rightarrow {\mathbb{R}} \) is therefore

\begin{equation}\label{eqn:jacobianAndHessian:120}
f(\Bx + \Ba)
= f(\Bx) +
\Ba \cdot \spacegrad f +
\inv{2} \lr{ \Ba \cdot \spacegrad}^2 f + \cdots,
\end{equation}

so the Taylor expansion has an exponential structure

\begin{equation}\label{eqn:jacobianAndHessian:140}
f(\Bx + \Ba) = \sum_{k = 0}^\infty \inv{k!} \lr{ \Ba \cdot \spacegrad}^k f = e^{\Ba \cdot \spacegrad} f.
\end{equation}
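As a quick sanity check of the second order truncation of \ref{eqn:jacobianAndHessian:120} (my own addition, not part of the course treatment), here is a minimal Python sketch using an arbitrarily chosen scalar function \( f(x, y) = x^2 y + y^3 \); the function names are mine.

```python
import numpy as np

# Check f(x + a) ~= f(x) + a . grad f + (1/2) (a . grad)^2 f
# for the (arbitrarily chosen) scalar function f(x, y) = x^2 y + y^3.
# Note that (a . grad)^2 f = a^T H a, with H the matrix of second partials.

def f(v):
    x, y = v
    return x**2 * y + y**3

def grad_f(v):
    x, y = v
    return np.array([2 * x * y, x**2 + 3 * y**2])

def second_partials(v):
    x, y = v
    return np.array([[2 * y, 2 * x],
                     [2 * x, 6 * y]])

x = np.array([1.0, 2.0])
a = np.array([0.01, -0.02])

exact = f(x + a)
taylor2 = f(x) + a @ grad_f(x) + 0.5 * a @ second_partials(x) @ a
print(exact, taylor2)   # the two values agree to O(|a|^3)
```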

Should an approximation of a vector valued function \( \Bf : {\mathbb{R}}^n \rightarrow {\mathbb{R}}^m \) be desired, it is only required to form a column of the component expansions

\begin{equation}\label{eqn:jacobianAndHessian:160}
\Bf(\Bx + \Ba)
= \Bf(\Bx) +
[\Ba \cdot \spacegrad f_i]_i +
\inv{2} [\lr{ \Ba \cdot \spacegrad}^2 f_i]_i + \cdots,
\end{equation}

where \( [.]_i \) denotes a column vector over the rows \( i \in [1,m] \), and \( f_i \) are the coordinates of \( \Bf \).

The Jacobian matrix

In [1] the Jacobian \( D \Bf \) of a function \( \Bf : {\mathbb{R}}^n \rightarrow {\mathbb{R}}^m \) is defined in terms of the limit, as \( \Bz \rightarrow \Bx \), of the \( l_2 \) norm ratio

\begin{equation}\label{eqn:jacobianAndHessian:180}
\frac{\Norm{\Bf(\Bz) - \Bf(\Bx) - (D \Bf) (\Bz - \Bx)}_2 }{ \Norm{\Bz - \Bx}_2 },
\end{equation}

with the statement that the function \( \Bf \) is differentiable if this ratio goes to zero in that limit. Here the Jacobian \( D \Bf \in {\mathbb{R}}^{m \times n} \) must be matrix valued.

Let \( \Bz = \Bx + \Ba \), so the first order expansion of \ref{eqn:jacobianAndHessian:160} is

\begin{equation}\label{eqn:jacobianAndHessian:200}
\Bf(\Bz)
= \Bf(\Bx) + [\lr{ \Bz - \Bx } \cdot \spacegrad f_i]_i
.
\end{equation}

With the (unproven) assumption that this Taylor expansion satisfies the norm limit criteria of \ref{eqn:jacobianAndHessian:180}, it is possible to extract the structure of the Jacobian by comparison

\begin{equation}\label{eqn:jacobianAndHessian:220}
\begin{aligned}
(D \Bf)
(\Bz - \Bx)
&=
{\begin{bmatrix}
\lr{ \Bz - \Bx } \cdot \spacegrad f_i
\end{bmatrix}}_i \\
&=
{\begin{bmatrix}
\sum_{j = 1}^n (z_j - x_j) \PD{x_j}{f_i}
\end{bmatrix}}_i \\
&=
{\begin{bmatrix}
\PD{x_j}{f_i}
\end{bmatrix}}_{ij}
(\Bz - \Bx),
\end{aligned}
\end{equation}

so
\begin{equation}\label{eqn:jacobianAndHessian:240}
\boxed{
(D \Bf)_{ij} = \PD{x_j}{f_i}
}
\end{equation}

Written out explicitly as a matrix the Jacobian is

\begin{equation}\label{eqn:jacobianAndHessian:320}
D \Bf
=
\begin{bmatrix}
\PD{x_1}{f_1} & \PD{x_2}{f_1} & \cdots & \PD{x_n}{f_1} \\
\PD{x_1}{f_2} & \PD{x_2}{f_2} & \cdots & \PD{x_n}{f_2} \\
\vdots & \vdots & & \vdots \\
\PD{x_1}{f_m} & \PD{x_2}{f_m} & \cdots & \PD{x_n}{f_m} \\
\end{bmatrix}
=
\begin{bmatrix}
(\spacegrad f_1)^\T \\
(\spacegrad f_2)^\T \\
\vdots \\
(\spacegrad f_m)^\T
\end{bmatrix}.
\end{equation}

In particular, when the function is scalar valued
\begin{equation}\label{eqn:jacobianAndHessian:261}
D f = (\spacegrad f)^\T.
\end{equation}

With this notation, the first order Taylor expansion, in terms of the Jacobian matrix, is

\begin{equation}\label{eqn:jacobianAndHessian:260}
\boxed{
\Bf(\Bz)
\approx \Bf(\Bx) + (D \Bf) \lr{ \Bz - \Bx }.
}
\end{equation}
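Here is a similar numerical sketch (again my own, for an arbitrarily chosen \( \Bf : {\mathbb{R}}^3 \rightarrow {\mathbb{R}}^2 \)) that builds the Jacobian from the boxed partial derivatives and checks the first order approximation above.

```python
import numpy as np

# Check f(z) ~= f(x) + (D f)(z - x) for a vector valued f : R^3 -> R^2,
# with Jacobian entries (D f)_{ij} = d f_i / d x_j.

def f(v):
    x1, x2, x3 = v
    return np.array([x1 * x2, x1**2 + np.sin(x3)])

def jacobian(v):
    x1, x2, x3 = v
    return np.array([[x2,     x1,  0.0],
                     [2 * x1, 0.0, np.cos(x3)]])

x = np.array([1.0, 2.0, 0.5])
z = x + np.array([1e-3, -2e-3, 1e-3])

print(f(z))                              # exact
print(f(x) + jacobian(x) @ (z - x))      # first order; error is O(|z - x|^2)
```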

The Hessian matrix

For scalar valued functions, the text expresses the second order expansion of a function in terms of the Jacobian and Hessian matrices

\begin{equation}\label{eqn:jacobianAndHessian:271}
f(\Bz)
\approx f(\Bx) + (D f) \lr{ \Bz - \Bx }
+ \inv{2} \lr{ \Bz - \Bx }^\T (\spacegrad^2 f) \lr{ \Bz - \Bx }.
\end{equation}

Because \( \spacegrad^2 \) is the usual notation for a Laplacian operator, this \( \spacegrad^2 f \in {\mathbb{R}}^{n \times n}\) notation for the Hessian matrix is not ideal in my opinion. Ignoring that notational objection for this class, the structure of the Hessian matrix can be extracted by comparison with the coordinate expansion

\begin{equation}\label{eqn:jacobianAndHessian:300}
\Ba^\T (\spacegrad^2 f) \Ba
=
\sum_{r,s = 1}^n a_r a_s \frac{\partial^2 f}{\partial x_r \partial x_s}
\end{equation}

so
\begin{equation}\label{eqn:jacobianAndHessian:280}
\boxed{
(\spacegrad^2 f)_{ij}
=
\frac{\partial^2 f}{\partial x_i \partial x_j}.
}
\end{equation}

In explicit matrix form the Hessian is

\begin{equation}\label{eqn:jacobianAndHessian:340}
\spacegrad^2 f
=
\begin{bmatrix}
\frac{\partial^2 f}{\partial x_1 \partial x_1} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots &\frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2 \partial x_2} & \cdots &\frac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots &\frac{\partial^2 f}{\partial x_n \partial x_n}
\end{bmatrix}.
\end{equation}
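As a simple example (my own, not from the text), consider the symmetric quadratic form

\begin{equation}\label{eqn:jacobianAndHessian:360}
f(\Bx) = \inv{2} \Bx^\T A \Bx, \qquad A = A^\T \in {\mathbb{R}}^{n \times n},
\end{equation}

for which

\begin{equation}\label{eqn:jacobianAndHessian:380}
D f = (\spacegrad f)^\T = \Bx^\T A, \qquad \spacegrad^2 f = A,
\end{equation}

so the second order expansion \ref{eqn:jacobianAndHessian:271} is not just an approximation, but exact.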

Is there a similar nice matrix structure for the Hessian of a function \( \Bf : {\mathbb{R}}^n \rightarrow {\mathbb{R}}^m \)?

References

[1] Stephen Boyd and Lieven Vandenberghe. Convex optimization. Cambridge University Press, 2004.

[2] D. Hestenes. New Foundations for Classical Mechanics. Kluwer Academic Publishers, 1999.

Magnetic moment for a localized magnetostatic current

October 13, 2016 math and physics play


Motivation.

I was once again reading my Jackson [2]. This time I found that his presentation of magnetic moment didn’t really make sense to me. Here’s my own pass through it, filling in a number of details. As I did last time, I’ll also translate into SI units as I go.

Vector potential.

The Biot-Savart expression for the magnetic field can be factored into a curl expression using the usual tricks

\begin{equation}\label{eqn:magneticMomentJackson:20}
\begin{aligned}
\BB
&= \frac{\mu_0}{4\pi} \int \frac{\BJ(\Bx') \cross (\Bx - \Bx')}{\Abs{\Bx - \Bx'}^3} d^3 x' \\
&= -\frac{\mu_0}{4\pi} \int \BJ(\Bx') \cross \spacegrad \inv{\Abs{\Bx - \Bx'}} d^3 x' \\
&= \frac{\mu_0}{4\pi} \spacegrad \cross \int \frac{\BJ(\Bx')}{\Abs{\Bx - \Bx'}} d^3 x',
\end{aligned}
\end{equation}

so the vector potential, which defines the magnetic field through its curl, \( \BB = \spacegrad \cross \BA \), is given by

\begin{equation}\label{eqn:magneticMomentJackson:40}
\BA(\Bx) = \frac{\mu_0}{4 \pi} \int \frac{\BJ(\Bx')}{\Abs{\Bx - \Bx'}} d^3 x'.
\end{equation}

If the current source is localized (zero outside of some finite region), then there will always be a region for which \( \Abs{\Bx} \gg \Abs{\Bx’} \), so the denominator yields to Taylor expansion

\begin{equation}\label{eqn:magneticMomentJackson:60}
\begin{aligned}
\inv{\Abs{\Bx - \Bx'}}
&=
\inv{\Abs{\Bx}} \lr{1 + \frac{\Abs{\Bx'}^2}{\Abs{\Bx}^2} - 2 \frac{\Bx \cdot \Bx'}{\Abs{\Bx}^2} }^{-1/2} \\
&\approx
\inv{\Abs{\Bx}} \lr{ 1 + \frac{\Bx \cdot \Bx'}{\Abs{\Bx}^2} } \\
&=
\inv{\Abs{\Bx}} + \frac{\Bx \cdot \Bx'}{\Abs{\Bx}^3},
\end{aligned}
\end{equation}

so the vector potential, far enough away from the current source, is
\begin{equation}\label{eqn:magneticMomentJackson:80}
\BA(\Bx)
=
\frac{\mu_0}{4 \pi} \int \frac{\BJ(\Bx')}{\Abs{\Bx}} d^3 x'
+\frac{\mu_0}{4 \pi} \int \frac{(\Bx \cdot \Bx')\BJ(\Bx')}{\Abs{\Bx}^3} d^3 x'.
\end{equation}

Jackson uses a sneaky trick to show that the first integral is killed for a localized source. That trick appears to be based on evaluating the following divergence

\begin{equation}\label{eqn:magneticMomentJackson:100}
\begin{aligned}
\spacegrad \cdot (\BJ(\Bx) x_i)
&=
(\spacegrad \cdot \BJ) x_i
+
(\spacegrad x_i) \cdot \BJ \\
&=
(\Be_k \partial_k x_i) \cdot\BJ \\
&=
\delta_{ki} J_k \\
&=
J_i.
\end{aligned}
\end{equation}

Note that this made use of the fact that \( \spacegrad \cdot \BJ = 0 \) for magnetostatics. This provides a way to rewrite the current density as a divergence

\begin{equation}\label{eqn:magneticMomentJackson:120}
\begin{aligned}
\int \frac{\BJ(\Bx')}{\Abs{\Bx}} d^3 x'
&=
\Be_i \int \frac{\spacegrad' \cdot (x_i' \BJ(\Bx'))}{\Abs{\Bx}} d^3 x' \\
&=
\frac{\Be_i}{\Abs{\Bx}} \int \spacegrad' \cdot (x_i' \BJ(\Bx')) d^3 x' \\
&=
\frac{1}{\Abs{\Bx}} \oint \Bx' (d\Ba' \cdot \BJ(\Bx')).
\end{aligned}
\end{equation}

When \( \BJ \) is localized, this is zero provided we pick the integration surface for the volume outside of that localization region.
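As a concrete illustration (my own), for a current confined to a thin closed loop this first integral is \( \int \BJ(\Bx') d^3 x' = I \oint d\Bl' = 0 \), since the displacements around a closed loop sum to zero.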

It is now desired to rewrite \( \int (\Bx \cdot \Bx') \BJ \) as a triple cross product, since the expansion of such a triple cross product contains exactly this term

\begin{equation}\label{eqn:magneticMomentJackson:140}
\begin{aligned}
- \Bx \cross \int \Bx' \cross \BJ
&=
\int (\Bx \cdot \Bx') \BJ
-
\int (\Bx \cdot \BJ) \Bx' \\
&=
\int (\Bx \cdot \Bx') \BJ
-
\Be_k x_i \int J_i x_k',
\end{aligned}
\end{equation}

so
\begin{equation}\label{eqn:magneticMomentJackson:160}
\int (\Bx \cdot \Bx') \BJ
=
- \Bx \cross \int \Bx' \cross \BJ
+
\Be_k x_i \int J_i x_k'.
\end{equation}

To get rid of this second term, the next sneaky trick is to consider the following divergence

\begin{equation}\label{eqn:magneticMomentJackson:180}
\begin{aligned}
\oint d\Ba' \cdot (\BJ(\Bx') x_i' x_j')
&=
\int dV' \spacegrad' \cdot (\BJ(\Bx') x_i' x_j') \\
&=
\int dV' (\spacegrad' \cdot \BJ) x_i' x_j'
+
\int dV' \BJ \cdot \spacegrad' (x_i' x_j') \\
&=
\int dV' J_k \lr{ x_i' \partial_k' x_j' + x_j' \partial_k' x_i' } \\
&=
\int dV' \lr{J_k x_i' \delta_{kj} + J_k x_j' \delta_{ki}} \\
&=
\int dV' \lr{J_j x_i' + J_i x_j'}.
\end{aligned}
\end{equation}

The \( \spacegrad' \cdot \BJ \) term is zero by magnetostatics, and the surface integral is once again zero, which means that we have an antisymmetric relationship in integrals of the form

\begin{equation}\label{eqn:magneticMomentJackson:200}
\int J_j x_i' = -\int J_i x_j'.
\end{equation}

Now we can use the tensor algebra trick of writing \( y = (y + y)/2 \),

\begin{equation}\label{eqn:magneticMomentJackson:220}
\begin{aligned}
\int (\Bx \cdot \Bx') \BJ
&=
- \Bx \cross \int \Bx' \cross \BJ
+
\Be_k x_i \int J_i x_k' \\
&=
- \Bx \cross \int \Bx' \cross \BJ
+
\inv{2} \Be_k x_i \int \lr{ J_i x_k' + J_i x_k' } \\
&=
- \Bx \cross \int \Bx' \cross \BJ
+
\inv{2} \Be_k x_i \int \lr{ J_i x_k' - J_k x_i' } \\
&=
- \Bx \cross \int \Bx' \cross \BJ
+
\inv{2} \Be_k x_i \int (\BJ \cross \Bx')_j \epsilon_{ikj} \\
&=
- \Bx \cross \int \Bx' \cross \BJ
-
\inv{2} \epsilon_{kij} \Be_k x_i \int (\BJ \cross \Bx')_j \\
&=
- \Bx \cross \int \Bx' \cross \BJ
-
\inv{2} \Bx \cross \int \BJ \cross \Bx' \\
&=
- \Bx \cross \int \Bx' \cross \BJ
+
\inv{2} \Bx \cross \int \Bx' \cross \BJ \\
&=
-\inv{2} \Bx \cross \int \Bx' \cross \BJ,
\end{aligned}
\end{equation}

so

\begin{equation}\label{eqn:magneticMomentJackson:240}
\BA(\Bx) \approx \frac{\mu_0}{4 \pi \Abs{\Bx}^3} \lr{ -\frac{\Bx}{2} } \cross \int \Bx' \cross \BJ(\Bx') d^3 x'.
\end{equation}

Letting

\begin{equation}\label{eqn:magneticMomentJackson:260}
\boxed{
\Bm = \inv{2} \int \Bx' \cross \BJ(\Bx') d^3 x',
}
\end{equation}

the far field approximation of the vector potential is
\begin{equation}\label{eqn:magneticMomentJackson:280}
\boxed{
\BA(\Bx) = \frac{\mu_0}{4 \pi} \frac{\Bm \cross \Bx}{\Abs{\Bx}^3}.
}
\end{equation}
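As a numerical check of this far field form (my own, not from Jackson), the following Python sketch integrates the exact thin-loop potential \( \BA = \frac{\mu_0 I}{4 \pi} \oint \frac{d\Bl'}{\Abs{\Bx - \Bx'}} \) for a small circular loop and compares it with the dipole approximation, using the loop moment \( \Bm = I \pi R^2 \Be_3 \) derived for a circular loop below; the loop radius, current, and field point values are arbitrary.

```python
import numpy as np

# Compare the exact thin-loop vector potential
#   A(x) = (mu_0 I / 4 pi) \oint dl' / |x - x'|
# with the far field dipole approximation
#   A(x) ~= (mu_0 / 4 pi) (m x x) / |x|^3,  m = I pi R^2 zhat,
# for a small circular loop of radius R in the x-y plane.

mu0 = 4e-7 * np.pi
I, R = 2.0, 0.01                     # current and loop radius (arbitrary)
x = np.array([0.3, 0.4, 0.5])        # field point with |x| >> R

phi = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
dphi = phi[1] - phi[0]
xp = R * np.stack([np.cos(phi), np.sin(phi), np.zeros_like(phi)], axis=1)
dl = R * np.stack([-np.sin(phi), np.cos(phi), np.zeros_like(phi)], axis=1) * dphi

A_exact = (mu0 * I / (4 * np.pi)) * np.sum(
    dl / np.linalg.norm(x - xp, axis=1)[:, None], axis=0)

m = I * np.pi * R**2 * np.array([0.0, 0.0, 1.0])
A_dipole = (mu0 / (4 * np.pi)) * np.cross(m, x) / np.linalg.norm(x)**3

print(A_exact)
print(A_dipole)    # agreement to O((R/|x|)^2)
```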

Note that when the current is restricted to an infinitesimally thin loop, the magnetic moment reduces to

\begin{equation}\label{eqn:magneticMomentJackson:300}
\Bm = \frac{I}{2} \oint \Bx' \cross d\Bl'.
\end{equation}

Referring to [1] (pr. 1.60), this can be seen to be \( I \) times the “vector-area” integral.
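For example (a check of my own), for a circular loop of radius \( R \) in the x-y plane, with \( \Bx' = R \lr{ \Be_1 \cos\phi + \Be_2 \sin\phi } \) and \( d\Bl' = R \lr{ -\Be_1 \sin\phi + \Be_2 \cos\phi } d\phi \), the moment is

\begin{equation}\label{eqn:magneticMomentJackson:320}
\Bm
= \frac{I}{2} \int_0^{2\pi} R^2 \lr{ \Be_1 \cos\phi + \Be_2 \sin\phi } \cross \lr{ -\Be_1 \sin\phi + \Be_2 \cos\phi } d\phi
= \frac{I}{2} \int_0^{2\pi} R^2 \Be_3 d\phi
= I \pi R^2 \Be_3,
\end{equation}

the current times the enclosed area times the unit normal, consistent with the vector-area interpretation.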

References

[1] David Jeffrey Griffiths. Introduction to Electrodynamics. Prentice Hall, Upper Saddle River, NJ, 3rd edition, 1999.

[2] JD Jackson. Classical Electrodynamics. John Wiley and Sons, 2nd edition, 1975.

Final notes for ECE1254, Modelling of Multiphysics Systems

December 27, 2014 ece1254


I’ve now finished my first grad course, Modelling of Multiphysics Systems, taught by Prof Piero Triverio.

I’ve posted notes for lectures and other material as I was taking the course, but now have an aggregated set of notes for the whole course posted.
This is now updated with all my notes from the lectures, solved problems, additional notes on auxiliary topics I wanted to explore (like SVD), plus the notes from the Harmonic Balance report that Mike and I will be presenting in January.

This version of my notes also includes all the Matlab figures, regenerated using http://www.mathworks.com/matlabcentral/fileexchange/23629-export-fig, which allows a save-as pdf that rescales much better than Matlab saveas() png’s when embedded in latex.  I’m not sure if that’s the best way to include Matlab figures in latex, but they are at least not fuzzy looking now.

All in all, I’m pretty pleased with my notes for this course.  They are a lot more readable than any of the ones I’ve done for the physics undergrad courses I was taking (https://peeterjoot.com/writing/).  While there was quite a lot covered in this course, the material really only requires an introductory circuits course and some basic math (linear algebra and intro calculus), so is pretty accessible.

This was a fun course.  I recall, back in ancient times when I was a first year student, being unsatisfied with all the ad-hoc strategies we used to solve circuits problems.  This finally answers the questions of how to tackle things more systematically.

Here’s the contents outline for these notes:

Preface
Lecture notes
1 nodal analysis
1.1 In slides
1.2 Mechanical structures example
1.3 Assembling system equations automatically. Node/branch method
1.4 Nodal Analysis
1.5 Modified nodal analysis (MNA)
2 solving large systems
2.1 Gaussian elimination
2.2 LU decomposition
2.3 Problems
3 numerical errors and conditioning
3.1 Strict diagonal dominance
3.2 Exploring uniqueness and existence
3.3 Perturbation and norms
3.4 Matrix norm
4 singular value decomposition, and conditioning number
4.1 Singular value decomposition
4.2 Conditioning number
5 sparse factorization
5.1 Fill ins
5.2 Markowitz product
5.3 Markowitz reordering
5.4 Graph representation
6 gradient methods
6.1 Summary of factorization costs
6.2 Iterative methods
6.3 Gradient method
6.4 Recap: Summary of Gradient method
6.5 Conjugate gradient method
6.6 Full Algorithm
6.7 Order analysis
6.8 Conjugate gradient convergence
6.9 Gershgorin circle theorem
6.10 Preconditioning
6.11 Symmetric preconditioning
6.12 Preconditioned conjugate gradient
6.13 Problems
7 solution of nonlinear systems
7.1 Nonlinear systems
7.2 Richardson and Linear Convergence
7.3 Newton’s method
7.4 Solution of N nonlinear equations in N unknowns
7.5 Multivariable Newton’s iteration
7.6 Automatic assembly of equations for nonlinear system
7.7 Damped Newton’s method
7.8 Continuation parameters
7.9 Singular Jacobians
7.10 Struts and Joints, Node branch formulation
7.11 Problems
8 time dependent systems
8.1 Assembling equations automatically for dynamical systems
8.2 Numerical solution of differential equations
8.3 Forward Euler method
8.4 Backward Euler method
8.5 Trapezoidal rule (TR)
8.6 Nonlinear differential equations
8.7 Analysis, accuracy and stability (Δt → 0)
8.8 Residual for LMS methods
8.9 Global error estimate
8.10 Stability
8.11 Stability (continued)
8.12 Problems
9 model order reduction
9.1 Model order reduction
9.2 Moment matching
9.3 Model order reduction (cont).
9.4 Moment matching
9.5 Truncated Balanced Realization (1000 ft overview)
9.6 Problems
Final report
10 harmonic balance
10.1 Abstract
10.2 Introduction
10.2.1 Modifications to the netlist syntax
10.3 Background
10.3.1 Discrete Fourier Transform
10.3.2 Harmonic Balance equations
10.3.3 Frequency domain representation of MNA equations
10.3.4 Example. RC circuit with a diode.
10.3.5 Jacobian
10.3.6 Newton’s method solution
10.3.7 Alternative handling of the non-linear currents and Jacobians
10.4 Results
10.4.1 Low pass filter
10.4.2 Half wave rectifier
10.4.3 AC to DC conversion
10.4.4 Bridge rectifier
10.4.5 Cpu time and error vs N
10.4.6 Taylor series non-linearities
10.4.7 Stiff systems
10.5 Conclusion
10.6 Appendices
10.6.1 Discrete Fourier Transform inversion
Appendices
a singular value decomposition
b basic theorems and definitions
c norton equivalents
d stability of discretized linear differential equations
e laplace transform refresher
f discrete fourier transform
g harmonic balance, rough notes
g.1 Block matrix form, with physical parameter ordering
g.2 Block matrix form, with frequency ordering
g.3 Representing the linear sources
g.4 Representing non-linear sources
g.5 Newton’s method
g.6 A matrix formulation of Harmonic Balance non-linear currents
h matlab notebooks
i mathematica notebooks
Index
Bibliography