eigenvalue

Update to old phy356 (Quantum Mechanics I) notes.

February 12, 2015 math and physics play , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,

It’s been a long time since I took QM I. My notes from that class were pretty rough, but I’ve cleaned them up a bit.

The main value to these notes is that I worked a number of introductory Quantum Mechanics problems.

These were my personal lecture notes for the Fall 2010, University of Toronto Quantum mechanics I course (PHY356H1F), taught by Prof. Vatche Deyirmenjian.

The official description of this course was:

The general structure of wave mechanics; eigenfunctions and eigenvalues; operators; orbital angular momentum; spherical harmonics; central potential; separation of variables, hydrogen atom; Dirac notation; operator methods; harmonic oscillator and spin.

This document contains a few things

• My lecture notes.
Typos, if any, are probably mine(Peeter), and no claim nor attempt of spelling or grammar correctness will be made. The first four lectures had chosen not to take notes for since they followed the text very closely.
• Notes from reading of the text. This includes observations, notes on what seem like errors, and some solved problems. None of these problems have been graded. Note that my informal errata sheet for the text has been separated out from this document.
• Some assigned problems. I have corrected some the errors after receiving grading feedback, and where I have not done so I at least recorded some of the grading comments as a reference.
• Some worked problems associated with exam preparation.

Singular Value Decomposition

October 2, 2014 ece1254 , , , ,

[Click here for a PDF of this post with nicer formatting]

We’ve been presented with the definition of the singular value decomposition or SVD, but not how to compute it.

Recall that the definition was

Singular value decomposition (SVD)

Given \( M \in \text{R}^{m \times n} \), we can find a representation of \( M \)

\begin{equation}\label{eqn:svdNotes:81}
M = U \Sigma V^\T,
\end{equation}

where \( U \) and \( V\) are orthogonal matrices such that \( U^\T U = 1 \), and \( V^\T V = 1 \), and

\begin{equation}\label{eqn:svdNotes:101}
\Sigma =
\begin{bmatrix}
\sigma_1 & & & & & &\\
& \sigma_2 & & & & &\\
& & \ddots & & & &\\
& & & \sigma_r & & &\\
& & & & 0 & & \\
& & & & & \ddots & \\
& & & & & & 0 \\
\end{bmatrix}
\end{equation}

The values \( \sigma_i, i \le \min(n,m) \) are called the singular values of \( M \). The singular values are subject to the ordering

\begin{equation}\label{eqn:svdNotes:261}
\sigma_{1} \ge \sigma_{2} \ge \cdots \ge 0
\end{equation}

If \(r\) is the rank of \( M \), then the \( \sigma_r \) above is the minimum non-zero singular value (but the zeros are also called singular values).

Here I’ll review some of the key ideas from the MIT OCW SVD lecture by Prof. Gilbert Strang. This is largely to avoid having to rewatch this again in a few years as I just did.

The idea behind the SVD is to find an orthogonal basis that relates the image space of the transformation, as well as the basis for the vectors that the transformation applies to. That is a relation of the form

\begin{equation}\label{eqn:svdNotes:281}
\Bu_i = M \Bv_j
\end{equation}

Where \( \Bv_j \) are orthonormal basis vectors, and \( \Bu_i \) are orthonormal basis vectors for the image space. %Strang’s lectures call these the row and column spaces (or reverse?)

Let’s suppose that such a set of vectors can be computed and that \( M \) has a representation of the desired form

\begin{equation}\label{eqn:svdNotes:301}
M = U \Sigma V^\conj
\end{equation}

where
\begin{equation}\label{eqn:svdNotes:321}
U =
\begin{bmatrix}
\Bu_1 & \Bu_2 & \cdots & \Bu_m
\end{bmatrix},
\end{equation}

and

\begin{equation}\label{eqn:svdNotes:341}
V =
\begin{bmatrix}
\Bv_1 & \Bv_2 & \cdots & \Bv_n
\end{bmatrix}.
\end{equation}

By left or right multiplication of \( M \) with its (conjugate) transpose, we will see that the decomposed form of these products have a very simple form

\begin{equation}\label{eqn:svdNotes:361}
M^\conj M
=
V \Sigma^\conj U^\conj U \Sigma V^\conj
=
V \Sigma^\conj \Sigma V^\conj,
\end{equation}
\begin{equation}\label{eqn:svdNotes:381}
M M^\conj
=
U \Sigma V^\conj V \Sigma^\conj U^\conj
=
U \Sigma \Sigma^\conj U^\conj.
\end{equation}

Because \( \Sigma \) is diagonal, the products \( \Sigma^\conj \Sigma \) and \( \Sigma \Sigma^\conj \) are also both diagonal, and populated with the absolute squares of the singular values that we have presumed to exist. Because \( \Sigma \Sigma^\conj \) is an \( m \times m \) matrix, whereas \( \Sigma^\conj \Sigma \) is an \( n \times n \) matrix, so the numbers of zeros in each of these will differ, but each will have the structure

\begin{equation}\label{eqn:svdNotes:401}
\Sigma^\conj \Sigma \sim \Sigma \Sigma^\conj \sim
\begin{bmatrix}
\Abs{\sigma_1}^2 & & & & & &\\
& \Abs{ \sigma_2 }^2 & & & & &\\
& & \ddots & & & &\\
& & & \Abs{ \sigma_r }^2 & & &\\
& & & & 0 & & \\
& & & & & \ddots & \\
& & & & & & 0 \\
\end{bmatrix}
\end{equation}

This shows us one method of computing the singular value decomposition (for full rank systems), because we need only solve for the eigensystem of either \( M M^\conj \) or \( M^\conj M \) to find the singular values, and one of \( U \) or \( V \). To form \( U \) we will be able to find \( r \) orthonormal eigenvectors of \( M M^\conj \) and then supplement that with a mutually orthonormal set of vectors from the Null space of \( M \). We can form \( \Sigma \) by taking the square roots of the eigenvalues of \( M M^\conj \). With both \( \Sigma \) and \( U \) computed, we can then compute \( V \) by inversion. We could similarly solve for \( \Sigma \) and \( V \) by computing the eigensystem of \( M^\conj M \) and similarly supplementing those eigenvectors with vectors from the null space, and then compute \( U \) by inversion.

Let’s work the examples from the lecture to see how this works.

Example: Full rank 2×2 matrix

In the lecture the SVD decomposition is computed for

\begin{equation}\label{eqn:svdNotes:421}
M =
\begin{bmatrix}
4 & 4 \\
3 & -3
\end{bmatrix}
\end{equation}

For this we have

\begin{equation}\label{eqn:svdNotes:441}
M M^\conj
=
\begin{bmatrix}
32 & 0 \\
0 & 18
\end{bmatrix}
\end{equation}
\begin{equation}\label{eqn:svdNotes:461}
M^\conj M
=
\begin{bmatrix}
25 & 7 \\
7 & 25
\end{bmatrix}
\end{equation}

The first is already diagonalized so \( U = I \), and the singular values are found by inspection \( \{ \sqrt{32}, \sqrt{18} \} \), or

\begin{equation}\label{eqn:svdNotes:541}
\Sigma =
\begin{bmatrix}
\sqrt{32} & 0 \\
0 & \sqrt{18}
\end{bmatrix}
\end{equation}

Because the system is full rank we can solve for \( V \) by inversion

\begin{equation}\label{eqn:svdNotes:481}
\Sigma^{-1} U^\conj M = V^\conj,
\end{equation}

or
\begin{equation}\label{eqn:svdNotes:501}
V = M^\conj U \lr{ \Sigma^{-1} }^\conj.
\end{equation}

In this case that is

\begin{equation}\label{eqn:svdNotes:521}
V = \inv{\sqrt{2}}
\begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix}
\end{equation}

We could alternately compute \( V \) directly by diagonalizing \( M^\conj M \). We see that the eigenvectors are \( \lr{ 1, \pm 1 }/\sqrt{2} \), with respective eigenvalues \( \{ 32, 18 \} \).

This gives us \ref{eqn:svdNotes:541} and \ref{eqn:svdNotes:521}. Again, because the system is full rank, we can compute \( U \) by inversion. That is

\begin{equation}\label{eqn:svdNotes:561}
U = M V \Sigma^{-1}.
\end{equation}

Carrying out this calculation recovers \( U = I \) as expected. Looks like I used a different matrix than Prof. Strang used in his lecture (alternate signs on the 3’s). He had some trouble that arrived from independent calculation of the respective eigenspaces. Calculating one from the other avoids that trouble since there are different signed eigenvalues that can be chosen, and we are looking for specific mappings between the eigenspaces that satisfy the \( \Bu_i = M \Bv_j \) constraints encoded by the relationship \( M = U \Sigma V^\conj \).

Let’s work a non-full rank example, as in the lecture.

Example: 2×2 matrix without full rank

How about

\begin{equation}\label{eqn:svdNotes:581}
M =
\begin{bmatrix}
1 & 1 \\
2 & 2
\end{bmatrix}.
\end{equation}

For this we have

\begin{equation}\label{eqn:svdNotes:601}
M^\conj M =
\begin{bmatrix}
5 & 5 \\
5 & 5
\end{bmatrix}
\end{equation}
\begin{equation}\label{eqn:svdNotes:621}
M M^\conj =
\begin{bmatrix}
2 & 4 \\
4 & 8
\end{bmatrix}
\end{equation}

For which the non-zero eigenvalue is \( 10 \) and the corresponding eigenvalue is
\begin{equation}\label{eqn:svdNotes:641}
\Bv =
\inv{\sqrt{2}}
\begin{bmatrix}
1 \\
1
\end{bmatrix}.
\end{equation}

This gives us

\begin{equation}\label{eqn:svdNotes:661}
\Sigma =
\begin{bmatrix}
\sqrt{10} & 0 \\
0 & 0
\end{bmatrix}
\end{equation}

Since we require \( V \) to be orthonormal, there is only one choice (up to a sign) for the vector from the null space.

Let’s try

\begin{equation}\label{eqn:svdNotes:681}
V =
\inv{\sqrt{2}}
\begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix}
\end{equation}

We find that \( M M^\conj \) has eigenvalues \( \{ 10, 0 \} \) as expected. The eigenvector for the non-zero eigenvalue is found to be

\begin{equation}\label{eqn:svdNotes:701}
\Bu = \inv{\sqrt{5}}
\begin{bmatrix}
1 \\
2
\end{bmatrix}.
\end{equation}

It’s easy to expand this to an orthonormal basis, but do we have to pick a specific sign relative to the choice that we’ve made for \( V \)?

Let’s try

\begin{equation}\label{eqn:svdNotes:721}
U = \inv{\sqrt{5}}
\begin{bmatrix}
1 & 2 \\
2 & -1
\end{bmatrix}.
\end{equation}

Multiplying out \( U \Sigma V^\conj \) we have

\begin{equation}\label{eqn:svdNotes:741}
\begin{aligned}
\inv{\sqrt{5}}
\begin{bmatrix}
1 & 2 \\
2 & -1
\end{bmatrix}
\begin{bmatrix}
\sqrt{10} & 0 \\
0 & 0
\end{bmatrix}
\inv{\sqrt{2}}
\begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix}
&=
\begin{bmatrix}
1 & 2 \\
2 & -1
\end{bmatrix}
\begin{bmatrix}
1 & 0 \\
0 & 0
\end{bmatrix}
\begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & 0 \\
2 & 0
\end{bmatrix}
\begin{bmatrix}
1 & 1 \\
1 & -1
\end{bmatrix} \\
&=
\begin{bmatrix}
1 & 1 \\
2 & 2
\end{bmatrix}.
\end{aligned}
\end{equation}

It appears that this works, although we haven’t demonstrated why that should be, and we could have gotten lucky with signs. There’s some theoretical work to do here, but let’s leave that for another day (or rely on software to do this computational task).