Numerical LU example where pivoting is required

October 9, 2014

I’m having trouble understanding how to apply the LU procedure when pivoting is required. Proceeding with such a factorization doesn’t produce a factorization that is the product of a lower triangular matrix and an upper triangular matrix. What we get instead looks like the product of a permuted lower triangular matrix with an upper triangular matrix. As an example, consider the LU reduction of

\begin{equation}\label{eqn:luExample2:20}
M \Bx =
\begin{bmatrix}
0 & 0 & 1 \\
2 & 0 & 4 \\
1 & 1 & 1
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
x_3 \\
\end{bmatrix}.
\end{equation}

We want \( r_2 \) as the pivot row, since it has the largest entry in the first column. Because a row exchange permutes the equations, and not the unknowns, the vector \( \Bx \) is left untouched, and we can write

\begin{equation}\label{eqn:luExample2:40}
M \Bx \rightarrow M' \Bx
=
\begin{bmatrix}
2 & 0 & 4 \\
0 & 0 & 1 \\
1 & 1 & 1
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
x_3 \\
\end{bmatrix}.
\end{equation}

We can express this permutation algebraically as a row permutation matrix operation

\begin{equation}\label{eqn:luExample2:60}
M' =
\begin{bmatrix}
0 & 1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1
\end{bmatrix}
M.
\end{equation}

We can now proceed with the Gaussian reduction operations
\begin{equation}\label{eqn:luExample2:80}
\begin{aligned}
r_2 & \rightarrow r_2 - \frac{0}{2} r_1 \\
r_3 & \rightarrow r_3 - \frac{1}{2} r_1 \\
\end{aligned},
\end{equation}

which gives

\begin{equation}\label{eqn:luExample2:100}
M_1 =
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
-1/2 & 0 & 1
\end{bmatrix}
M'
=
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
-1/2 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
2 & 0 & 4 \\
0 & 0 & 1 \\
1 & 1 & 1
\end{bmatrix}
=
\begin{bmatrix}
2 & 0 & 4 \\
0 & 0 & 1 \\
0 & 1 & -1
\end{bmatrix}.
\end{equation}

Application of one more permutation operation gives us the desired upper triangular matrix

\begin{equation}\label{eqn:luExample2:120}
U = M_1' =
\begin{bmatrix}
1 & 0 & 0 \\
0 & 0 & 1 \\
0 & 1 & 0 \\
\end{bmatrix}
\begin{bmatrix}
2 & 0 & 4 \\
0 & 0 & 1 \\
0 & 1 & -1
\end{bmatrix}
=
\begin{bmatrix}
2 & 0 & 4 \\
0 & 1 & -1 \\
0 & 0 & 1
\end{bmatrix}.
\end{equation}

Note that this exchange, like the first, permutes the equations (and any right hand side), not the unknowns, so \( U \) still acts on the original vector \( \Bx \). The cumulative permutation applied to the rows of \( M \) is \( 2, 3, 1 \).

We’ve constructed \( U \) by the following row operations

\begin{equation}\label{eqn:luExample2:160}
U =
\begin{bmatrix}
1 & 0 & 0 \\
0 & 0 & 1 \\
0 & 1 & 0 \\
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
-1/2 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
0 & 1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}
M.
\end{equation}

Seeking \( L U = M \), we want

\begin{equation}\label{eqn:luExample2:180}
L
\begin{bmatrix}
1 & 0 & 0 \\
0 & 0 & 1 \\
0 & 1 & 0 \\
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
-1/2 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
0 & 1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}
M
= M,
\end{equation}

or

\begin{equation}\label{eqn:luExample2:200}
L
=
\begin{bmatrix}
0 & 1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
1/2 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 0 & 1 \\
0 & 1 & 0 \\
\end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 \\
1 & 0 & 0 \\
1/2 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 0 & 1 \\
0 & 1 & 0 \\
\end{bmatrix}
=
\begin{bmatrix}
0 & 0 & 1 \\
1 & 0 & 0 \\
1/2 & 1 & 0
\end{bmatrix}.
\end{equation}

The \( L U \) factorization that we’ve attempted does not produce a lower triangular factor, but a permutation of a lower triangular factor.

When such pivoting is required, it isn’t obvious, at least to me, how to do the clever \( L U \) algorithm that we outlined in class. How can we pack the operations into the lower triangle when we actually have to apply permutation matrices to the results of the last iteration?

It seems that while we cannot perform an \( L U \) decomposition of \( M \), we can perform an \( L U \) factorization of \( P M \), where \( P \) is the permutation matrix for the permutation \( 2,3,1 \) that we applied to the rows during the Gaussian operations.

Let’s do that \( L U \) factorization to check

\begin{equation}\label{eqn:luExample2:220}
P M =
\begin{bmatrix}
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & 0 & 0 \\
\end{bmatrix}
\begin{bmatrix}
0 & 0 & 1 \\
2 & 0 & 4 \\
1 & 1 & 1
\end{bmatrix}
=
\begin{bmatrix}
2 & 0 & 4 \\
1 & 1 & 1 \\
0 & 0 & 1 \\
\end{bmatrix}.
\end{equation}

We want to apply the elementary row operation

\begin{equation}\label{eqn:luExample2:240}
r_2 \rightarrow r_2 - \frac{1}{2} r_1,
\end{equation}

which gives

\begin{equation}\label{eqn:luExample2:260}
\lr{P M}_1 =
\begin{bmatrix}
2 & 0 & 4 \\
0 & 1 & -1 \\
0 & 0 & 1 \\
\end{bmatrix}.
\end{equation}

The \( L U \) factorization is therefore

\begin{equation}\label{eqn:luExample2:280}
P M =
L U =
\begin{bmatrix}
1 & 0 & 0 \\
1/2 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}
\begin{bmatrix}
2 & 0 & 4 \\
0 & 1 & -1 \\
0 & 0 & 1 \\
\end{bmatrix}.
\end{equation}

Observe that we can also write this as

\begin{equation}\label{eqn:luExample2:300}
M = \lr{ P^{-1} L } U.
\end{equation}

In our case the inverse permutation is a \( 3,1,2 \) permutation matrix

\begin{equation}\label{eqn:luExample2:320}
P^{-1} =
\begin{bmatrix}
0 & 0 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0 \\
\end{bmatrix}.
\end{equation}

We see that \( P^{-1} L \) produces the not-lower-triangular matrix factor found earlier in \ref{eqn:luExample2:200}.
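A quick numeric check of all of this is easy in Matlab (or Octave). This is just a sketch, but note that the three output form of the builtin lu returns a permutation matrix \( P \) with \( P M = L U \) (which, for this matrix, happens to match the \( 2,3,1 \) row order chosen above), while the two output form returns exactly the sort of permuted lower triangular factor found in \ref{eqn:luExample2:200}:

m = [ 0 0 1 ; 2 0 4 ; 1 1 1 ] ;
[l, u, p] = lu(m) ;   % p * m = l * u, with l truly lower triangular
norm(p * m - l * u)   % expect 0
[l2, u2] = lu(m) ;    % m = l2 * u2, with l2 a permuted lower triangular factor
norm(m - l2 * u2)     % expect 0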

Numeric LU factorization example

October 9, 2014

To get a better feel for LU factorization before attempting a numeric implementation, let’s look at a numeric example in detail. This will strip some of the abstraction away.

Let’s compute the LU factorization for

\begin{equation}\label{eqn:luExample:20}
M =
\begin{bmatrix}
5 & 1 & 1 \\
2 & 3 & 4 \\
3 & 1 & 2 \\
\end{bmatrix}.
\end{equation}

This matrix was picked to avoid having to think about selecting the right pivot
row. The first two elimination operations are

\begin{equation}\label{eqn:luExample:41}
\begin{aligned}
r_2 &\rightarrow r_2 - \frac{2}{5} r_1 \\
r_3 &\rightarrow r_3 - \frac{3}{5} r_1
\end{aligned},
\end{equation}

and produce
\begin{equation}\label{eqn:luExample:40}
\begin{bmatrix}
5 & 1 & 1 \\
0 &13/5 & 18/5 \\
0 &2/5 & 7/5 \\
\end{bmatrix}.
\end{equation}

The row operations (left multiplication) that produce this matrix are

\begin{equation}\label{eqn:luExample:60}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
-3/5 & 0 & 1 \\
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
-2/5 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 \\
-2/5 & 1 & 0 \\
-3/5 & 0 & 1 \\
\end{bmatrix}.
\end{equation}

These two operations happen to commute, and each inverts simply (just negate the multiplier). The inverse operations are
\begin{equation}\label{eqn:luExample:80}
\begin{bmatrix}
1 & 0 & 0 \\
2/5 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
3/5 & 0 & 1 \\
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 \\
2/5 & 1 & 0 \\
3/5 & 0 & 1 \\
\end{bmatrix}.
\end{equation}

In matrix form the elementary matrix operations that take us to the first stage of the Gaussian reduction are

\begin{equation}\label{eqn:luExample:100}
\begin{bmatrix}
1 & 0 & 0 \\
-2/5 & 1 & 0 \\
-3/5 & 0 & 1 \\
\end{bmatrix}
\begin{bmatrix}
5 & 1 & 1 \\
2 & 3 & 4 \\
3 & 1 & 2 \\
\end{bmatrix}
=
\begin{bmatrix}
5 & 1 & 1 \\
0 &13/5 & 18/5 \\
0 &2/5 & 7/5 \\
\end{bmatrix}.
\end{equation}

Inverting, that is

\begin{equation}\label{eqn:luExample:120}
\begin{bmatrix}
5 & 1 & 1 \\
2 & 3 & 4 \\
3 & 1 & 2 \\
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 \\
2/5 & 1 & 0 \\
3/5 & 0 & 1 \\
\end{bmatrix}
\begin{bmatrix}
5 & 1 & 1 \\
0 &13/5 & 18/5 \\
0 &2/5 & 7/5 \\
\end{bmatrix}.
\end{equation}

This is the first stage of the LU decomposition, although the U matrix is not yet in upper triangular form. Again, with our pivot row in the desired position already, the last row operation to perform is

\begin{equation}\label{eqn:luExample:140}
r_3 \rightarrow r_3 - \frac{2/5}{13/5} r_2 = r_3 - \frac{2}{13} r_2.
\end{equation}

The final stage of this Gaussian reduction is

\begin{equation}\label{eqn:luExample:160}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & -2/13 & 1 \\
\end{bmatrix}
\begin{bmatrix}
5 & 1 & 1 \\
0 &13/5 & 18/5 \\
0 &2/5 & 7/5 \\
\end{bmatrix}
=
\begin{bmatrix}
5 & 1 & 1 \\
0 &13/5 & 18/5 \\
0 & 0 & 11/13 \\
\end{bmatrix}
= U,
\end{equation}

and our desired lower triangular matrix factor is
\begin{equation}\label{eqn:luExample:180}
\begin{bmatrix}
1 & 0 & 0 \\
2/5 & 1 & 0 \\
3/5 & 0 & 1 \\
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 2/13 & 1 \\
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 \\
2/5 & 1 & 0 \\
3/5 & 2/13 & 1 \\
\end{bmatrix}
= L.
\end{equation}

A bit of matlab code easily verifies that the above manual computation recovers \( M = L U \):

l = [ 1 0 0 ; 2/5 1 0 ; 3/5 2/13 1 ] ;    % lower triangular factor found above
u = [ 5 1 1 ; 0 13/5 18/5 ; 0 0 11/13 ] ; % upper triangular factor
m = l * u                                 % recovers the original matrix
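We can also cross check against the builtin factorization (a sketch; for this matrix partial pivoting selects the rows in their original order, so the permutation returned should be the identity):

[ll, uu, pp] = lu(m) ;   % pp * m = ll * uu
norm(pp * m - ll * uu)   % expect 0
pp                       % expect the 3x3 identity for this matrix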

Singular Value Decomposition

October 2, 2014

We’ve been presented with the definition of the singular value decomposition or SVD, but not how to compute it.

Recall that the definition was

Singular value decomposition (SVD)

Given \( M \in \mathbb{R}^{m \times n} \), we can find a representation of \( M \)

\begin{equation}\label{eqn:svdNotes:81}
M = U \Sigma V^\T,
\end{equation}

where \( U \) and \( V\) are orthogonal matrices such that \( U^\T U = 1 \), and \( V^\T V = 1 \), and

\begin{equation}\label{eqn:svdNotes:101}
\Sigma =
\begin{bmatrix}
\sigma_1 & & & & & &\\
& \sigma_2 & & & & &\\
& & \ddots & & & &\\
& & & \sigma_r & & &\\
& & & & 0 & & \\
& & & & & \ddots & \\
& & & & & & 0 \\
\end{bmatrix}
\end{equation}

The values \( \sigma_i, i \le \min(n,m) \) are called the singular values of \( M \). The singular values are subject to the ordering

\begin{equation}\label{eqn:svdNotes:261}
\sigma_{1} \ge \sigma_{2} \ge \cdots \ge 0
\end{equation}

If \(r\) is the rank of \( M \), then the \( \sigma_r \) above is the minimum non-zero singular value (but the zeros are also called singular values).

Here I’ll review some of the key ideas from the MIT OCW SVD lecture by Prof. Gilbert Strang. This is largely to avoid having to rewatch this again in a few years as I just did.

The idea behind the SVD is to find an orthonormal basis for the image space of the transformation, along with an orthonormal basis for the vectors that the transformation acts on. That is, a relation of the form

\begin{equation}\label{eqn:svdNotes:281}
\sigma_i \Bu_i = M \Bv_i,
\end{equation}

where \( \Bv_i \) are orthonormal basis vectors, \( \Bu_i \) are orthonormal basis vectors for the image space, and \( \sigma_i \) are the singular values.

Let’s suppose that such a set of vectors can be computed and that \( M \) has a representation of the desired form

\begin{equation}\label{eqn:svdNotes:301}
M = U \Sigma V^\conj
\end{equation}

where
\begin{equation}\label{eqn:svdNotes:321}
U =
\begin{bmatrix}
\Bu_1 & \Bu_2 & \cdots & \Bu_m
\end{bmatrix},
\end{equation}

and

\begin{equation}\label{eqn:svdNotes:341}
V =
\begin{bmatrix}
\Bv_1 & \Bv_2 & \cdots & \Bv_n
\end{bmatrix}.
\end{equation}

By left or right multiplication of \( M \) with its (conjugate) transpose, we will see that the decomposed forms of these products are very simple

\begin{equation}\label{eqn:svdNotes:361}
M^\conj M
=
V \Sigma^\conj U^\conj U \Sigma V^\conj
=
V \Sigma^\conj \Sigma V^\conj,
\end{equation}
\begin{equation}\label{eqn:svdNotes:381}
M M^\conj
=
U \Sigma V^\conj V \Sigma^\conj U^\conj
=
U \Sigma \Sigma^\conj U^\conj.
\end{equation}

Because \( \Sigma \) is diagonal, the products \( \Sigma^\conj \Sigma \) and \( \Sigma \Sigma^\conj \) are also both diagonal, and are populated with the absolute squares of the singular values that we have presumed to exist. \( \Sigma \Sigma^\conj \) is an \( m \times m \) matrix, whereas \( \Sigma^\conj \Sigma \) is an \( n \times n \) matrix, so the number of zeros in each will differ, but each will have the structure

\begin{equation}\label{eqn:svdNotes:401}
\Sigma^\conj \Sigma \sim \Sigma \Sigma^\conj \sim
\begin{bmatrix}
\Abs{\sigma_1}^2 & & & & & &\\
& \Abs{ \sigma_2 }^2 & & & & &\\
& & \ddots & & & &\\
& & & \Abs{ \sigma_r }^2 & & &\\
& & & & 0 & & \\
& & & & & \ddots & \\
& & & & & & 0 \\
\end{bmatrix}
\end{equation}

This shows us one method of computing the singular value decomposition, because we need only solve for the eigensystem of either \( M M^\conj \) or \( M^\conj M \) to find the singular values, along with one of \( U \) or \( V \). To form \( U \) we can find \( r \) orthonormal eigenvectors of \( M M^\conj \), and then supplement those with a mutually orthonormal set of vectors from the null space of \( M^\conj \). We can form \( \Sigma \) by taking the square roots of the eigenvalues of \( M M^\conj \). With both \( \Sigma \) and \( U \) computed, and provided the rank is full, we can then compute \( V \) by inversion. We could similarly solve for \( \Sigma \) and \( V \) by computing the eigensystem of \( M^\conj M \), supplementing those eigenvectors with vectors from the null space of \( M \), and then compute \( U \) by inversion.

Let’s work the examples from the lecture to see how this works.

Example: Full rank 2×2 matrix

In the lecture the SVD is computed for

\begin{equation}\label{eqn:svdNotes:421}
M =
\begin{bmatrix}
4 & 4 \\
3 & -3
\end{bmatrix}
\end{equation}

For this we have

\begin{equation}\label{eqn:svdNotes:441}
M M^\conj
=
\begin{bmatrix}
32 & 0 \\
0 & 18
\end{bmatrix}
\end{equation}
\begin{equation}\label{eqn:svdNotes:461}
M^\conj M
=
\begin{bmatrix}
25 & 7 \\
7 & 25
\end{bmatrix}
\end{equation}

The first is already diagonal, so \( U = I \), and the singular values are found by inspection, \( \{ \sqrt{32}, \sqrt{18} \} \), or

\begin{equation}\label{eqn:svdNotes:541}
\Sigma =
\begin{bmatrix}
\sqrt{32} & 0 \\
0 & \sqrt{18}
\end{bmatrix}
\end{equation}

Because the system is full rank we can solve for \( V \) by inversion

\begin{equation}\label{eqn:svdNotes:481}
\Sigma^{-1} U^\conj M = V^\conj,
\end{equation}

or
\begin{equation}\label{eqn:svdNotes:501}
V = M^\conj U \lr{ \Sigma^{-1} }^\conj.
\end{equation}

In this case that is

\begin{equation}\label{eqn:svdNotes:521}
V = \inv{\sqrt{2}}
\begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix}
\end{equation}

We could alternately compute \( V \) directly by diagonalizing \( M^\conj M \). We see that the eigenvectors are \( \lr{ 1, \pm 1 }/\sqrt{2} \), with respective eigenvalues \( \{ 32, 18 \} \).

This gives us \ref{eqn:svdNotes:541} and \ref{eqn:svdNotes:521}. Again, because the system is full rank, we can compute \( U \) by inversion. That is

\begin{equation}\label{eqn:svdNotes:561}
U = M V \Sigma^{-1}.
\end{equation}

Carrying out this calculation recovers \( U = I \) as expected. It looks like I used a different matrix than Prof. Strang used in his lecture (alternate signs on the 3’s). He had some trouble that arose from independent calculation of the respective eigenspaces. Calculating one from the other avoids that trouble, since there are different signs that can be chosen for the eigenvectors, and we are looking for the specific mappings between the eigenspaces that satisfy the \( \sigma_i \Bu_i = M \Bv_i \) constraints encoded by the relationship \( M = U \Sigma V^\conj \).
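Here is a small Matlab sketch of this compute-one-factor-from-the-other procedure for the matrix above (assuming full rank, so that the inversion for \( V \) is valid):

m = [ 4 4 ; 3 -3 ] ;
[u, d] = eig(m * m') ;                     % eigenvectors of M M^T
[lambda, idx] = sort(diag(d), 'descend') ; % order so that sigma_1 >= sigma_2
u = u(:, idx) ;
sigma = diag(sqrt(lambda)) ;               % the singular values
v = m' * u / sigma ;                       % V = M^T U Sigma^{-1}
norm(u * sigma * v' - m)                   % expect 0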

Let’s work a non-full rank example, as in the lecture.

Example: 2×2 matrix without full rank

How about

\begin{equation}\label{eqn:svdNotes:581}
M =
\begin{bmatrix}
1 & 1 \\
2 & 2
\end{bmatrix}.
\end{equation}

For this we have

\begin{equation}\label{eqn:svdNotes:601}
M^\conj M =
\begin{bmatrix}
5 & 5 \\
5 & 5
\end{bmatrix}
\end{equation}
\begin{equation}\label{eqn:svdNotes:621}
M M^\conj =
\begin{bmatrix}
2 & 4 \\
4 & 8
\end{bmatrix}
\end{equation}

The non-zero eigenvalue of \( M^\conj M \) is \( 10 \), and the corresponding eigenvector is
\begin{equation}\label{eqn:svdNotes:641}
\Bv =
\inv{\sqrt{2}}
\begin{bmatrix}
1 \\
1
\end{bmatrix}.
\end{equation}

This gives us

\begin{equation}\label{eqn:svdNotes:661}
\Sigma =
\begin{bmatrix}
\sqrt{10} & 0 \\
0 & 0
\end{bmatrix}
\end{equation}

Since we require \( V \) to be orthonormal, there is only one choice (up to a sign) for the vector from the null space.

Let’s try

\begin{equation}\label{eqn:svdNotes:681}
V =
\inv{\sqrt{2}}
\begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix}
\end{equation}

We find that \( M M^\conj \) has eigenvalues \( \{ 10, 0 \} \) as expected. The eigenvector for the non-zero eigenvalue is found to be

\begin{equation}\label{eqn:svdNotes:701}
\Bu = \inv{\sqrt{5}}
\begin{bmatrix}
1 \\
2
\end{bmatrix}.
\end{equation}

It’s easy to expand this to an orthonormal basis, but do we have to pick a specific sign relative to the choice that we’ve made for \( V \)?

Let’s try

\begin{equation}\label{eqn:svdNotes:721}
U = \inv{\sqrt{5}}
\begin{bmatrix}
1 & 2 \\
2 & -1
\end{bmatrix}.
\end{equation}

Multiplying out \( U \Sigma V^\conj \) we have

\begin{equation}\label{eqn:svdNotes:741}
\begin{aligned}
\inv{\sqrt{5}}
\begin{bmatrix}
1 & 2 \\
2 & -1
\end{bmatrix}
\begin{bmatrix}
\sqrt{10} & 0 \\
0 & 0
\end{bmatrix}
\inv{\sqrt{2}}
\begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix}
&=
\begin{bmatrix}
1 & 2 \\
2 & -1
\end{bmatrix}
\begin{bmatrix}
1 & 0 \\
0 & 0
\end{bmatrix}
\begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & 0 \\
2 & 0
\end{bmatrix}
\begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & 1 \\
2 & 2
\end{bmatrix}.
\end{aligned}
\end{equation}

It appears that this works, although we haven’t demonstrated why that should be, and we could have gotten lucky with signs. There’s some theoretical work to do here, but let’s leave that for another day (or rely on software to do this computational task).
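Relying on software, a Matlab check of this rank deficient example is immediate (the builtin svd may make different sign choices for the singular vectors than we did above, but the reconstruction is the same):

m = [ 1 1 ; 2 2 ] ;
[u, s, v] = svd(m)     % expect s = diag(sqrt(10), 0)
norm(u * s * v' - m)   % expect 0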

ECE1254H Modeling of Multiphysics Systems. Lecture 6: Singular value decomposition, and conditioning number. Taught by Prof. Piero Triverio

October 1, 2014

Disclaimer

Peeter’s lecture notes from class. These may be incoherent and rough.

Matrix norm

We’ve defined the matrix norm of \( M \), for the system \( \overline{{y}} = M \overline{{x}} \), as

\begin{equation}\label{eqn:multiphysicsL6:21}
\Norm{M} = \max_{\Norm{\overline{{x}}} = 1} \Norm{ M \overline{{x}} }.
\end{equation}

We will typically use the \( L_2 \) norm, so that the matrix norm is
\begin{equation}\label{eqn:multiphysicsL6:41}
\Norm{M}_2 = \max_{\Norm{\overline{{x}}}_2 = 1} \Norm{ M \overline{{x}} }_2.
\end{equation}

It can be shown that

\begin{equation}\label{eqn:multiphysicsL6:61}
\Norm{M}_2 = \max_i \sigma_i(M),
\end{equation}

where \( \sigma_i(M) \) are the (SVD) singular values.
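This is easy to spot check numerically (a sketch, using the Matlab norm and svd builtins, with an arbitrary test matrix):

m = [ 5 1 1 ; 2 3 4 ; 3 1 2 ] ;
norm(m, 2)    % the induced L2 norm
max(svd(m))   % the largest singular value; should agree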

Singular value decomposition (SVD)

Given \( M \in \mathbb{R}^{m \times n} \), we can find a representation of \( M \)

\begin{equation}\label{eqn:multiphysicsL6:81}
M = U \Sigma V^\T,
\end{equation}

where \( U \) and \( V\) are orthogonal matrices such that \( U^\T U = 1 \), and \( V^\T V = 1 \), and

\begin{equation}\label{eqn:multiphysicsL6:101}
\Sigma =
\begin{bmatrix}
\sigma_1 & & & & & &\\
& \sigma_2 & & & & &\\
& & \ddots & & & &\\
& & & \sigma_r & & &\\
& & & & 0 & & \\
& & & & & \ddots & \\
& & & & & & 0 \\
\end{bmatrix}
\end{equation}

The values \( \sigma_i, i \le \min(n,m) \) are called the singular values of \( M \). The singular values are subject to the ordering

\begin{equation}\label{eqn:multiphysicsL6:261}
\sigma_{1} \ge \sigma_{2} \ge \cdots \ge 0
\end{equation}

If \(r\) is the rank of \( M \), then the \( \sigma_r \) above is the minimum non-zero singular value (but the zeros are also called singular values).

Observe that the condition \( U^\T U = 1 \) is a statement of orthonormality. In terms of column vectors \( \overline{{u}}_i \), such a product written out explicitly is

\begin{equation}\label{eqn:multiphysicsL6:301}
\begin{bmatrix}
\overline{{u}}_1^\T \\ \overline{{u}}_2^\T \\ \vdots \\ \overline{{u}}_m^\T
\end{bmatrix}
\begin{bmatrix}
\overline{{u}}_1 & \overline{{u}}_2 & \cdots & \overline{{u}}_m
\end{bmatrix}
=
\begin{bmatrix}
1 & & & \\
& 1 & & \\
& & \ddots & \\
& & & 1
\end{bmatrix}.
\end{equation}

This is both normality, \( \overline{{u}}_i^\T \overline{{u}}_i = 1 \), and orthogonality, \( \overline{{u}}_i^\T \overline{{u}}_j = 0, i \ne j \).

Example: 2 x 2 case

(for column vectors \( \overline{{u}}_i, \overline{{v}}_j \)).

\begin{equation}\label{eqn:multiphysicsL6:281}
M =
\begin{bmatrix}
\overline{{u}}_1 & \overline{{u}}_2
\end{bmatrix}
\begin{bmatrix}
\sigma_1 & \\
& \sigma_2
\end{bmatrix}
\begin{bmatrix}
\overline{{v}}_1^\T \\
\overline{{v}}_2^\T
\end{bmatrix}
\end{equation}

Consider \( \overline{{y}} = M \overline{{x}} \), and take an \( \overline{{x}} \) with \( \Norm{\overline{{x}}}_2 = 1 \).

Note: I’ve chosen not to sketch what was drawn on the board. See instead the animated gif of the same in \citep{wiki:svd}.

A very nice video treatment of SVD by Prof Gilbert Strang can be found in \citep{ocw:svd}.

Conditioning number

Given a perturbation of \( M \overline{{x}} = \overline{{b}} \) to

\begin{equation}\label{eqn:multiphysicsL6:121}
\lr{ M + \delta M }
\lr{ \overline{{x}} + \delta \overline{{x}} } = \overline{{b}},
\end{equation}

or
\begin{equation}\label{eqn:multiphysicsL6:141}
\underbrace{ M \overline{{x}} - \overline{{b}} }_{=0} + \delta M \overline{{x}} + M \delta \overline{{x}} + \delta M \delta \overline{{x}} = 0.
\end{equation}

This gives

\begin{equation}\label{eqn:multiphysicsL6:161}
M \delta \overline{{x}} = - \delta M \overline{{x}} - \delta M \delta \overline{{x}},
\end{equation}

or
\begin{equation}\label{eqn:multiphysicsL6:181}
\delta \overline{{x}} = - M^{-1} \delta M \lr{ \overline{{x}} + \delta \overline{{x}} }.
\end{equation}

Taking norms

\begin{equation}\label{eqn:multiphysicsL6:201}
\Norm{ \delta \overline{{x}}}_2 = \Norm{
M^{-1} \delta M \lr{ \overline{{x}} + \delta \overline{{x}} }
}_2
\le
\Norm{ M^{-1} }_2 \Norm{ \delta M }_2 \Norm{ \overline{{x}} + \delta \overline{{x}} }_2,
\end{equation}

or
\begin{equation}\label{eqn:multiphysicsL6:221}
\underbrace{ \frac{\Norm{ \delta \overline{{x}}}_2 }{ \Norm{ \overline{{x}} + \delta \overline{{x}}}_2 } }_{\text{relative error of solution}}
\le
\underbrace{ \Norm{M}_2 \Norm{M^{-1}}_2 }_{\text{conditioning number of \(M\)}}
\underbrace{ \frac{ \Norm{ \delta M}_2 } {\Norm{M}_2} }_{\text{relative perturbation of \( M \)}}.
\end{equation}

The conditioning number can be shown to be

\begin{equation}\label{eqn:multiphysicsL6:241}
\text{cond}(M) =
\frac
{\sigma_{\mathrm{max}}}
{\sigma_{\mathrm{min}}}
\ge 1
\end{equation}

FIXME: justify.
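Deferring that justification, the relation is also easy to check numerically (a sketch, for an arbitrary invertible test matrix):

m = [ 5 1 1 ; 2 3 4 ; 3 1 2 ] ;
s = svd(m) ;
cond(m)                        % builtin 2-norm condition number
s(1) / s(end)                  % sigma_max / sigma_min; should agree
norm(m, 2) * norm(inv(m), 2)   % the product above; should agree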

Sensitivity to conditioning number

Double precision relative rounding errors can be of the order \( 10^{-16} \approx 2^{-53} \), which allows us to gauge the relative error of the solution.

ECE1254H Modeling of Multiphysics Systems. Lecture 4: Modified nodal analysis and solutions of large systems. Taught by Prof. Piero Triverio

September 26, 2014

Disclaimer

Peeter’s lecture notes from class. These may be incoherent and rough.

Modified nodal analysis

We add extra unknowns for

  • branch currents for voltage sources
  • all elements for which it is impossible or inconvenient to write \( i = f(v_1, v_2) \).

    Imagine, for example, that we have a component illustrated in fig. 1.

    fig. 1. Variable voltage device

    \begin{equation}\label{eqn:multiphysicsL4:20}
    v_1 – v_2 = 3 i^2
    \end{equation}

  • any current which is controlling dependent sources, as in fig. 2.
    fig. 2. Current controlled device

  • Inductors
    \begin{equation}\label{eqn:multiphysicsL4:40}
    v_1 – v_2 = L \frac{di}{dt}.
    \end{equation}

Solving large systems

We are interested in solving linear systems of the form

\begin{equation}\label{eqn:multiphysicsL4:60}
M \overline{{x}} = \overline{{b}},
\end{equation}

possibly with thousands of elements.

Gaussian elimination

\begin{equation}\label{eqn:multiphysicsL4:80}
\begin{bmatrix}
M_{11} & M_{12} & M_{13} \\
M_{21} & M_{22} & M_{23} \\
M_{31} & M_{32} & M_{33} \\
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
x_3 \\
\end{bmatrix}
=
\begin{bmatrix}
b_1 \\
b_2 \\
b_3 \\
\end{bmatrix}
\end{equation}

It’s claimed for now, to be seen later, that back substitution is the fastest way to arrive at the solution, less computationally complex than completing the diagonalization.

Steps

\begin{equation}\label{eqn:multiphysicsL4:100}
(1) \cdot \frac{M_{21}}{M_{11}} \Rightarrow
\begin{bmatrix}
M_{21} & \frac{M_{21}}{M_{11}} M_{12} & \frac{M_{21}}{M_{11}} M_{13} \\
\end{bmatrix}
\end{equation}

\begin{equation}\label{eqn:multiphysicsL4:120}
(1) \cdot \frac{M_{31}}{M_{11}} \Rightarrow
\begin{bmatrix}
M_{31} & \frac{M_{31}}{M_{11}} M_{12} & \frac{M_{31}}{M_{11}} M_{13} \\
\end{bmatrix}
\end{equation}

This gives

\begin{equation}\label{eqn:multiphysicsL4:140}
\begin{bmatrix}
M_{11} & M_{12} & M_{13} \\
0 & M_{22} - \frac{M_{21}}{ M_{11} } M_{12} & M_{23} - \frac{M_{21}}{M_{11}} M_{13} \\
0 & M_{32} - \frac{M_{31}}{M_{11}} M_{12} & M_{33} - \frac{M_{31}}{M_{11}} M_{13} \\
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
x_3 \\
\end{bmatrix}
=
\begin{bmatrix}
b_1 \\
b_2 - \frac{M_{21}}{M_{11}} b_1 \\
b_3 - \frac{M_{31}}{M_{11}} b_1
\end{bmatrix}.
\end{equation}

Here the \( M_{11} \) element is called the pivot. Each of the \( M_{j1}/M_{11} \) elements is called a multiplier. This operation can be written as

\begin{equation}\label{eqn:multiphysicsL4:160}
\begin{bmatrix}
M_{11} & M_{12} & M_{13} \\
0 & M_{22}^{(2)} & M_{23}^{(2)} \\
0 & M_{32}^{(2)} & M_{33}^{(2)} \\
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
x_3 \\
\end{bmatrix}
=
\begin{bmatrix}
b_1 \\
b_2^{(2)} \\
b_3^{(2)} \\
\end{bmatrix}.
\end{equation}

Using \( M_{22}^{(2)} \) as the pivot this time, we form
\begin{equation}\label{eqn:multiphysicsL4:180}
\begin{bmatrix}
M_{11} & M_{12} & M_{13} \\
0 & M_{22}^{(2)} & M_{23}^{(2)} \\
0 & 0 & M_{33}^{(2)} - \frac{M_{32}^{(2)}}{M_{22}^{(2)}} M_{23}^{(2)} \\
\end{bmatrix}
\begin{bmatrix}
x_1 \\
x_2 \\
x_3 \\
\end{bmatrix}
=
\begin{bmatrix}
b_1 \\
b_2 - \frac{M_{21}}{M_{11}} b_1 \\
b_3 - \frac{M_{31}}{M_{11}} b_1
- \frac{M_{32}^{(2)}}{M_{22}^{(2)}} b_{2}^{(2)} \\
\end{bmatrix}.
\end{equation}
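A bare bones Matlab sketch of this elimination procedure (a hypothetical helper, assuming the pivots encountered are all non-zero, so that no row exchanges are needed):

function [m, b] = naivegauss(m, b)
  % Reduce m to upper triangular form in place, applying the same row
  % operations to b.
  n = size(m, 1) ;
  for i = 1:n-1
    for j = i+1:n
      mult = m(j, i) / m(i, i) ;                 % the multiplier M_ji / M_ii
      m(j, i:n) = m(j, i:n) - mult * m(i, i:n) ;
      b(j) = b(j) - mult * b(i) ;
    end
  end
end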

LU decomposition

Through Gaussian elimination, we have transformed the system from

\begin{equation}\label{eqn:multiphysicsL4:200}
M x = b
\end{equation}

to
\begin{equation}\label{eqn:multiphysicsL4:220}
U x = y.
\end{equation}

Writing out our Gaussian transformation in the form \( U \overline{{x}} = \overline{{y}} \), we have

\begin{equation}\label{eqn:multiphysicsL4:240}
U \overline{{x}} =
\begin{bmatrix}
1 & 0 & 0 \\
-\frac{M_{21}}{M_{11}} & 1 & 0 \\
\frac{M_{32}^{(2)}}{M_{22}^{(2)}} \frac{M_{21}}{M_{11}} - \frac{M_{31}}{M_{11}} & -\frac{M_{32}^{(2)}}{M_{22}^{(2)}} & 1
\end{bmatrix}
\begin{bmatrix}
b_1 \\
b_2 \\
b_3
\end{bmatrix}.
\end{equation}

We can verify that the operation matrix \( K^{-1} \), where \( K^{-1} U = M \), that takes us back to the original system is

\begin{equation}\label{eqn:multiphysicsL4:260}
\begin{bmatrix}
1 & 0 & 0 \\
\frac{M_{21}}{M_{11}} & 1 & 0 \\
\frac{M_{31}}{M_{11}} &
\frac{M_{32}^{(2)}}{M_{22}^{(2)}} &
1
\end{bmatrix}
\begin{bmatrix}
U_{11} & U_{12} & U_{13} \\
0 & U_{22} & U_{23} \\
0 & 0 & U_{33} \\
\end{bmatrix}
\overline{{x}} = \overline{{b}}
\end{equation}

Using this LU decomposition is generally superior to standard Gaussian elimination, since we can reuse the factorization for many different \(\overline{{b}}\) vectors with the same amount of work.

Our steps are

\begin{equation}\label{eqn:multiphysicsL4:280}
b
= M x
= L \lr{ U x}
\equiv L y.
\end{equation}

We can now solve \( L y = b \), using substitution for \( y_1 \), then \( y_2 \), and finally \( y_3 \). This is called forward substitution.

Finally, we can now solve

\begin{equation}\label{eqn:multiphysicsL4:300}
U x = y,
\end{equation}

using back substitution.

Note that we produced the vector \( y \) as a side effect of performing the Gaussian elimination process.
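Sketches of these two triangular solves (hypothetical helpers; for a unit lower triangular \( L \) the division by the diagonal is by one, but keeping it makes the routine work for any non-singular triangular matrix):

function y = forwardsubst(l, b)
  % Solve l y = b for lower triangular l, working from the top row down.
  n = length(b) ;
  y = zeros(n, 1) ;
  for i = 1:n
    y(i) = ( b(i) - l(i, 1:i-1) * y(1:i-1) ) / l(i, i) ;
  end
end

function x = backsubst(u, y)
  % Solve u x = y for upper triangular u, working from the bottom row up.
  n = length(y) ;
  x = zeros(n, 1) ;
  for i = n:-1:1
    x(i) = ( y(i) - u(i, i+1:n) * x(i+1:n) ) / u(i, i) ;
  end
end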