
I pulled [1], one of too many lonely Dover books, off my shelf and started reading the review chapter. It posed the following question, which I thought had an interesting subquestion.

Variational principle with two by two matrix.

Consider a \( 2 \times 2 \) real symmetric matrix operator \(\BO \), with an arbitrary normalized trial vector

\begin{equation}\label{eqn:variationalMatrix:20}
\Bc =
\begin{bmatrix}
\cos\theta \\
\sin\theta
\end{bmatrix}.
\end{equation}

The variational principle requires that the minimum value of \( \omega(\theta) = \Bc^\dagger \BO \Bc \) is greater than or equal to the lowest eigenvalue. If that minimum occurs at \( \theta = \theta_0 \), show that \( \omega(\theta_0) \) is exactly equal to the lowest eigenvalue and explain why this is expected.

Why this is expected is the part of the question that I thought was interesting.

Finding the minimum.

If the operator representation is

\begin{equation}\label{eqn:variationalMatrix:40}
\BO =
\begin{bmatrix}
a & b \\
b & d
\end{bmatrix},
\end{equation}

then the variational product is

\begin{equation}\label{eqn:variationalMatrix:80}
\begin{aligned}
\omega(\theta)
&=
\begin{bmatrix}
\cos\theta & \sin\theta
\end{bmatrix}
\begin{bmatrix}
a & b \\
b & d
\end{bmatrix}
\begin{bmatrix}
\cos\theta \\
\sin\theta
\end{bmatrix} \\
&=
\begin{bmatrix}
\cos\theta & \sin\theta
\end{bmatrix}
\begin{bmatrix}
a \cos\theta + b \sin\theta \\
b \cos\theta + d \sin\theta
\end{bmatrix} \\
&=
a \cos^2\theta + 2 b \sin\theta \cos\theta
+ d \sin^2\theta \\
&=
a \cos^2\theta + b \sin( 2 \theta )
+ d \sin^2\theta.
\end{aligned}
\end{equation}

The minimum is given by

\begin{equation}\label{eqn:variationalMatrix:60}
\begin{aligned}
0
&=
\frac{d\omega}{d\theta} \\
&=
-2 a \sin\theta \cos\theta + 2 b \cos( 2 \theta )
+ 2 d \sin\theta \cos\theta \\
&=
2 b \cos( 2 \theta )
+ (d -a)\sin( 2 \theta )
\end{aligned}
,
\end{equation}

so the extreme values will be found at

\begin{equation}\label{eqn:variationalMatrix:100}
\tan(2\theta_0) = \frac{2 b}{a - d}.
\end{equation}

Solving for \( \cos(2\theta_0) \), with \( \alpha = 2b/(a-d) \), we have

\begin{equation}\label{eqn:variationalMatrix:120}
1 - \cos^2(2\theta_0) = \alpha^2 \cos^2(2 \theta_0),
\end{equation}

or

\begin{equation}\label{eqn:variationalMatrix:140}
\begin{aligned}
\cos^2(2\theta_0)
&= \frac{1}{1 + \alpha^2} \\
&= \frac{1}{1 + 4 b^2/(a-d)^2 } \\
&= \frac{(a-d)^2}{(a-d)^2 + 4 b^2 }.
\end{aligned}
\end{equation}

So,

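\begin{equation}\label{eqn:variationalMatrix:200}
\begin{aligned}
\cos(2 \theta_0) &= \frac{ \pm (a-d) }{\sqrt{ (a-d)^2 + 4 b^2 }} \\
\sin(2 \theta_0) &= \frac{ \pm 2 b }{\sqrt{ (a-d)^2 + 4 b^2 }},
\end{aligned}
\end{equation}

where the same sign must be taken in both expressions, since their ratio has to reproduce \( \tan(2\theta_0) = 2b/(a-d) \).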

Substituting these back into \( \omega(\theta) \) is a bit tedious.
I did it once on paper, then confirmed with Mathematica (quantumchemistry/twoByTwoSymmetricVariation.nb). The end result is

\begin{equation}\label{eqn:variationalMatrix:160}
\omega(\theta_0)
=
\inv{2} \lr{ a + d \pm \sqrt{ (a-d)^2 + 4 b^2 } }.
\end{equation}
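
As a sketch of that substitution (one possible route, not necessarily the one used on paper or in the notebook), the double-angle identities \( \cos^2\theta = (1 + \cos(2\theta))/2 \) and \( \sin^2\theta = (1 - \cos(2\theta))/2 \) shorten the algebra, provided the same sign is chosen for \( \cos(2\theta_0) \) and \( \sin(2\theta_0) \):

\begin{equation}\label{eqn:variationalMatrix:180}
\begin{aligned}
\omega(\theta_0)
&= \frac{a + d}{2} + \frac{a - d}{2} \cos(2\theta_0) + b \sin(2\theta_0) \\
&= \frac{a + d}{2} \pm \frac{ (a-d)^2/2 + 2 b^2 }{\sqrt{ (a-d)^2 + 4 b^2 }} \\
&= \inv{2} \lr{ a + d \pm \sqrt{ (a-d)^2 + 4 b^2 } }.
\end{aligned}
\end{equation}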

The eigenvalues of the operator are given by

\begin{equation}\label{eqn:variationalMatrix:220}
\begin{aligned}
0
&= (a-\lambda)(d-\lambda) - b^2 \\
&= \lambda^2 - (a+d) \lambda + a d - b^2 \\
&= \lr{\lambda - \frac{a+d}{2}}^2 -\lr{ \frac{a+d}{2}}^2 + a d - b^2 \\
&= \lr{\lambda - \frac{a+d}{2}}^2 - \inv{4} \lr{ (a-d)^2 + 4 b^2 },
\end{aligned}
\end{equation}

so the eigenvalues are exactly the values in \ref{eqn:variationalMatrix:160}, and the minimum of \( \omega \) is the lowest eigenvalue, as claimed in the problem statement.
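
Spelled out, the completed square in the characteristic equation gives the roots directly (this is only the last line of that factorization rearranged):

\begin{equation}\label{eqn:variationalMatrix:230}
\lambda = \frac{a+d}{2} \pm \inv{2} \sqrt{ (a-d)^2 + 4 b^2 },
\end{equation}

with the lower sign giving the lowest eigenvalue.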

Why should this have been anticipated?

Since \( \BO \) is real and symmetric, its eigenvectors \( \Be_1, \Be_2 \) can be chosen orthonormal, and the operator can be diagonalized as

\begin{equation}\label{eqn:variationalMatrix:240}
\BO = U D U^\T,
\end{equation}

where \( U = \begin{bmatrix} \Be_1 & \Be_2 \end{bmatrix} \), and \( D \) has the eigenvalues along the diagonal. The energy function \( \omega \) can now be written

\begin{equation}\label{eqn:variationalMatrix:260}
\begin{aligned}
\omega
&= \Bc^\T U D U^\T \Bc \\
&= (U^\T \Bc)^\T D U^\T \Bc.
\end{aligned}
\end{equation}

We can show that the transformed vector \( U^\T \Bc \) is still a unit vector. Expanding the product gives

\begin{equation}\label{eqn:variationalMatrix:280}
\begin{aligned}
U^\T \Bc
&=
\begin{bmatrix}
\Be_1^\T \\
\Be_2^\T \\
\end{bmatrix}
\Bc \\
&=
\begin{bmatrix}
\Be_1^\T \Bc \\
\Be_2^\T \Bc \\
\end{bmatrix},
\end{aligned}
\end{equation}

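so, using the completeness relation \( \Be_1 \Be_1^\T + \Be_2 \Be_2^\T = I \) for the orthonormal eigenvectors,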
\begin{equation}\label{eqn:variationalMatrix:300}
\begin{aligned}
\Abs{
U^\T \Bc
}^2
&=
\Bc^\T \Be_1
\Be_1^\T \Bc
+
\Bc^\T \Be_2
\Be_2^\T \Bc \\
&=
\Bc^\T \lr{ \Be_1 \Be_1^\T
+
\Be_2
\Be_2^\T } \Bc \\
&=
\Bc^\T \Bc \\
&= 1,
\end{aligned}
\end{equation}

so the transformed vector can be written as

\begin{equation}\label{eqn:variationalMatrix:320}
U^\T \Bc =
\begin{bmatrix}
\cos\phi \\
\sin\phi
\end{bmatrix},
\end{equation}

for some \( \phi \). With such a representation we have
\begin{equation}\label{eqn:variationalMatrix:340}
\begin{aligned}
\omega
&=
\begin{bmatrix}
\cos\phi & \sin\phi
\end{bmatrix}
\begin{bmatrix}
\lambda_1 & 0 \\
0 & \lambda_2
\end{bmatrix}
\begin{bmatrix}
\cos\phi \\
\sin\phi
\end{bmatrix} \\
&=
\begin{bmatrix}
\cos\phi & \sin\phi
\end{bmatrix}
\begin{bmatrix}
\lambda_1 \cos\phi \\
\lambda_2 \sin\phi
\end{bmatrix} \\
&=
\lambda_1 \cos^2\phi + \lambda_2 \sin^2\phi.
\end{aligned}
\end{equation}
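
One way to read the bounds off this expression without differentiating, sketched here as a supplementary step, is to eliminate \( \cos^2\phi \):

\begin{equation}\label{eqn:variationalMatrix:350}
\omega = \lambda_1 + \lr{ \lambda_2 - \lambda_1 } \sin^2\phi.
\end{equation}

If the eigenvalues are labelled so that \( \lambda_1 \le \lambda_2 \) (an ordering assumed here, not fixed by the text), then \( \lambda_1 \le \omega \le \lambda_2 \), with the lower bound reached when \( \sin\phi = 0 \) and the upper bound when \( \cos\phi = 0 \).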

This has its extrema where \( 0 = d\omega/d\phi = \sin(2 \phi)( \lambda_2 - \lambda_1 ) \). For the non-degenerate case, the zeros are at \( \phi = n \pi/2 \) for integer \( n \). For \( \phi = 0, \pi/2 \), we have

\begin{equation}\label{eqn:variationalMatrix:360}
U^\T \Bc =
\begin{bmatrix}
1 \\
0
\end{bmatrix},
\begin{bmatrix}
0 \\
1
\end{bmatrix},
\end{equation}

or \( \Bc = \Be_1, \Be_2 \) respectively, since \( \Bc = U (U^\T \Bc) \).

We see that the extreme values of \( \omega \) occur when the trial vectors \( \Bc \) are eigenvectors of the operator.
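
As a concrete check, consider the illustrative values \( a = 3, b = 2, d = 0 \) (chosen here only because the square root comes out as an integer; they are not from the text). Then \( \sqrt{ (a-d)^2 + 4 b^2 } = 5 \) and \( \tan(2\theta_0) = 4/3 \). Taking the lower signs, \( \cos(2\theta_0) = -3/5 \) and \( \sin(2\theta_0) = -4/5 \), which corresponds to \( \cos\theta_0 = 1/\sqrt{5} \), \( \sin\theta_0 = -2/\sqrt{5} \):

\begin{equation}\label{eqn:variationalMatrix:380}
\begin{aligned}
\BO &=
\begin{bmatrix}
3 & 2 \\
2 & 0
\end{bmatrix},
\qquad
\lambda = \inv{2} \lr{ 3 \pm 5 } = 4, -1 \\
\BO \Bc(\theta_0)
&=
\inv{\sqrt{5}}
\begin{bmatrix}
3 & 2 \\
2 & 0
\end{bmatrix}
\begin{bmatrix}
1 \\
-2
\end{bmatrix}
=
\inv{\sqrt{5}}
\begin{bmatrix}
-1 \\
2
\end{bmatrix}
= - \Bc(\theta_0) \\
\omega(\theta_0) &= \Bc(\theta_0)^\T \BO \Bc(\theta_0) = -1.
\end{aligned}
\end{equation}

In this example the minimizing trial vector is the eigenvector for the lowest eigenvalue \( \lambda = -1 \), and the minimum of \( \omega \) equals that eigenvalue.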

References

[1] Attila Szabo and Neil S Ostlund. Modern quantum chemistry: introduction to advanced electronic structure theory. Dover publications, 1989.