A curious proof of the Baker-Campbell-Hausdorff formula

Equation (39) of [1] states the Baker-Campbell-Hausdorff formula for two operators \( a, b\) that commute with their commutator \( \antisymmetric{a}{b} \)

\begin{equation}\label{eqn:bakercambell:20}
e^a e^b = e^{a + b + \antisymmetric{a}{b}/2},
\end{equation}

and provides the outline of an interesting method of proof. That method is to consider the derivative of

\begin{equation}
f(\lambda) = e^{\lambda a} e^{\lambda b} e^{-\lambda (a + b)}.
\end{equation}

That derivative is

\begin{equation}
\begin{aligned}
\frac{df}{d\lambda}
&= e^{\lambda a} a e^{\lambda b} e^{-\lambda (a + b)}
+ e^{\lambda a} b e^{\lambda b} e^{-\lambda (a + b)}
- e^{\lambda a} e^{\lambda b} (a + b) e^{-\lambda (a + b)} \\
&= e^{\lambda a} \lr{
a e^{\lambda b}
+ b e^{\lambda b}
- e^{\lambda b} (a+b)
}
e^{-\lambda (a + b)} \\
&= e^{\lambda a} \lr{
\antisymmetric{a}{e^{\lambda b}}
+ \antisymmetric{b}{e^{\lambda b}}
}
e^{-\lambda (a + b)} \\
&= e^{\lambda a}
\antisymmetric{a}{e^{\lambda b}}
e^{-\lambda (a + b)},
\end{aligned}
\end{equation}

where \( \antisymmetric{b}{e^{\lambda b}} = 0 \), since \( b \) commutes with any power series in \( b \).

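The commutator above is proportional to \( \antisymmetric{a}{b} \)

\begin{equation}
\begin{aligned}
\antisymmetric{a}{e^{\lambda b}}
&= \sum_{k=0}^\infty \frac{\lambda^k}{k!} \antisymmetric{a}{ b^k } \\
&= \sum_{k=1}^\infty \frac{\lambda^k}{k!} k b^{k-1} \antisymmetric{a}{b} \\
&= \lambda \sum_{k=1}^\infty \frac{\lambda^{k-1}}{(k-1)!} b^{k-1}
\antisymmetric{a}{b} \\
&= \lambda e^{\lambda b} \antisymmetric{a}{b},
\end{aligned}
\end{equation}

so, because \( \antisymmetric{a}{b} \) commutes with both \( a \) and \( b \) and can be moved to the front,

\begin{equation}\label{eqn:bakercambell:100}
\frac{df}{d\lambda} = \lambda \antisymmetric{a}{b} f.
\end{equation}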

To get the above, we should also demonstrate by induction that \( \antisymmetric{a}{ b^k } = k b^{k-1} \antisymmetric{a}{b} \).

This clearly holds for \( k = 0,1 \). Assuming that it holds for some \( k \), the \( k + 1 \) case is

\begin{equation}
\begin{aligned}
\antisymmetric{a}{ b^{k+1} }
&= a b^{k+1} - b^{k+1} a \\
&= \lr{ \antisymmetric{a}{b^{k}} + b^k a } b - b^{k+1} a \\
&= k b^{k-1} \antisymmetric{a}{b} b
+ b^k \lr{ \antisymmetric{a}{b} + b a }
- b^{k+1} a \\
&= k b^{k} \antisymmetric{a}{b}
+ b^k \antisymmetric{a}{b} \\
&= (k+1) b^k \antisymmetric{a}{b}.
\end{aligned}
\end{equation}
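The induction result can be spot-checked in a concrete representation. The sketch below is only an illustration and is not taken from [1]: it assumes the representation \( a = d/dx \), \( b = x \) acting on polynomials (stored as coefficient lists), for which \( \antisymmetric{a}{b} = 1 \) commutes with everything. All the helper names are hypothetical.

```python
def derivative(p):
    """a = d/dx acting on a polynomial with coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


def times_x(p):
    """b = multiplication by x."""
    return [0] + list(p)


def power_of_b(p, k):
    """Apply b to the polynomial p a total of k times."""
    for _ in range(k):
        p = times_x(p)
    return list(p)


def subtract(p, q):
    """Coefficient-wise p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [x - y for x, y in zip(p, q)]


def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def commutator_a_bk(p, k):
    """[a, b^k] p = a b^k p - b^k a p."""
    return trim(subtract(derivative(power_of_b(p, k)), power_of_b(derivative(p), k)))


def rhs(p, k):
    """k b^{k-1} [a, b] p, using [a, b] = 1 in this representation."""
    if k == 0:
        return [0]
    return trim([k * c for c in power_of_b(p, k - 1)])


p = [3, -1, 4, 2]  # 3 - x + 4 x^2 + 2 x^3, an arbitrary test polynomial
for k in range(6):
    print(k, commutator_a_bk(p, k) == rhs(p, k))
```

Integer coefficients keep the comparison exact, so no floating point tolerance is needed here.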

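Observe that \ref{eqn:bakercambell:100}, with the initial condition \( f(0) = 1 \), is solved by

\begin{equation}
f = e^{\lambda^2\antisymmetric{a}{b}/2},
\end{equation}

which gives

\begin{equation}
e^{\lambda^2 \antisymmetric{a}{b}/2} =
e^{\lambda a} e^{\lambda b} e^{-\lambda (a + b)}.
\end{equation}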

Right multiplication by \( e^{\lambda (a + b)} \) which commutes with \( e^{\lambda^2 \antisymmetric{a}{b}/2} \) and setting \( \lambda = 1 \) recovers \ref{eqn:bakercambell:20} as desired.
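A numerical sanity check of the result is easy for matrices whose commutator commutes with both of them. The sketch below is my own illustration under an assumed example: two strictly upper triangular \( 3 \times 3 \) matrices (a Heisenberg algebra representation), with hypothetical helper names and a truncated power series standing in for the matrix exponential (exact here, since these matrices are nilpotent).

```python
def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]


def scale(c, x):
    return [[c * p for p in row] for row in x]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def expm(x, terms=20):
    """Matrix exponential by truncated power series."""
    result = identity(len(x))
    term = identity(len(x))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, x))  # x^k / k!
        result = add(result, term)
    return result


def commutator(x, y):
    return add(matmul(x, y), scale(-1.0, matmul(y, x)))


def max_abs_diff(x, y):
    return max(abs(p - q) for r, s in zip(x, y) for p, q in zip(r, s))


# Assumed example matrices: [a, b] is proportional to E_13, which commutes with a and b.
a = [[0.0, 2.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
b = [[0.0, 0.0, 0.0], [0.0, 0.0, 3.0], [0.0, 0.0, 0.0]]
c = commutator(a, b)


def f(lam):
    """f(lambda) = e^{lambda a} e^{lambda b} e^{-lambda (a + b)}."""
    left = matmul(expm(scale(lam, a)), expm(scale(lam, b)))
    return matmul(left, expm(scale(-lam, add(a, b))))


lhs = matmul(expm(a), expm(b))
rhs = expm(add(add(a, b), scale(0.5, c)))
print(max_abs_diff(lhs, rhs))
print(max_abs_diff(f(0.7), expm(scale(0.7 ** 2 / 2, c))))
```

Both printed differences should be at the level of floating point roundoff for this example.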

What I wonder looking at this, is what thought process led to trying this in the first place? This is not what I would consider an obvious approach to demonstrating this identity.


More on (SHO) coherent states

[1] pr. 2.19(c)

Show that \( \Abs{f(n)}^2 \) for a coherent state written as

\ket{z} = \sum_{n=0}^\infty f(n) \ket{n}

has the form of a Poisson distribution, and find the most probable value of \( n\), and thus the most probable energy.


The Poisson distribution has the form

\begin{equation}
P(n) = \frac{\mu^{n} e^{-\mu}}{n!}.
\end{equation}

Here \( \mu \) is the mean of the distribution

\begin{equation}
\begin{aligned}
\expectation{n}
&= \sum_{n=0}^\infty n P(n) \\
&= \sum_{n=1}^\infty n \frac{\mu^{n} e^{-\mu}}{n!} \\
&= \mu e^{-\mu} \sum_{n=1}^\infty \frac{\mu^{n-1}}{(n-1)!} \\
&= \mu e^{-\mu} e^{\mu} \\
&= \mu.
\end{aligned}
\end{equation}

We found that the coherent state had the form

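\begin{equation}
\ket{z} = c_0 \sum_{n=0}^\infty \frac{z^n}{\sqrt{n!}} \ket{n},
\end{equation}

so the probability coefficients for \( \ket{n} \) are

\begin{equation}
\begin{aligned}
\Abs{f(n)}^2
&= c_0^2 \frac{\Abs{z^n}^2}{n!} \\
&= e^{-\Abs{z}^2} \frac{\Abs{z^n}^2}{n!}.
\end{aligned}
\end{equation}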
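As a quick numerical illustration (a sketch only, with an arbitrarily assumed sample value of \( z \) and a hypothetical helper name), these coefficients can be tabulated to confirm that they sum to one and have mean \( \Abs{z}^2 \). The sum is truncated at an assumed cutoff where the tail is negligible.

```python
import math


def coherent_probabilities(z, n_max):
    """|f(n)|^2 = exp(-|z|^2) |z|^(2n) / n! for n = 0..n_max."""
    mu = abs(z) ** 2
    probs = []
    p = math.exp(-mu)  # n = 0 term
    for n in range(n_max + 1):
        probs.append(p)
        p *= mu / (n + 1)  # P(n+1) = P(n) mu / (n+1), avoids large factorials
    return probs


z = 1.5 + 2.0j  # assumed sample value, |z|^2 = 6.25
probs = coherent_probabilities(z, 80)
total = sum(probs)
mean = sum(n * p for n, p in enumerate(probs))
print(total, mean)
```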

This has the structure of the Poisson distribution with mean \( \mu = \Abs{z}^2 \). The most probable value of \( n \) is that for which \( \Abs{f(n)}^2 \) is the largest. This is, in general, hard to compute, since we have a maximization problem in the integer domain that falls outside the normal toolbox. If we assume that \( n \) is large, so that Stirling’s approximation can be used to approximate the factorial, and also seek a non-integer value that maximizes the distribution, the most probable value will be the closest integer to that, and this can be computed. Let

\begin{equation}
\begin{aligned}
g(n)
&= \Abs{f(n)}^2 \\
&= \frac{e^{-\mu} \mu^n}{n!} \\
&= \frac{e^{-\mu} \mu^n}{e^{\ln n!}} \\
&\approx e^{-\mu - n \ln n + n } \mu^n \\
&= e^{-\mu - n \ln n + n + n \ln \mu }.
\end{aligned}
\end{equation}

This is maximized when

\begin{equation}
0
= \frac{dg}{dn}
= \lr{ - \ln n - 1 + 1 + \ln \mu } g(n),
\end{equation}

which is satisfied at \( n = \mu \). One of the integers \( n = \lfloor \mu \rfloor \) or \( n = \lceil \mu \rceil \) that brackets this value \( \mu = \Abs{z}^2 \) is the most probable. So, if an energy measurement is made of a coherent state \( \ket{z} \), the most probable value will be one of

\begin{equation}
E = \Hbar \omega \lr{
\lfloor \Abs{z}^2 \rfloor
+ \inv{2} },
\end{equation}

or

\begin{equation}
E = \Hbar \omega \lr{
\lceil \Abs{z}^2 \rceil
+ \inv{2} }.
\end{equation}
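For any specific \( \mu \) the integer maximization can also be done by brute force. The sketch below is an illustration with hypothetical function names and an assumed search cutoff. Since \( P(n+1)/P(n) = \mu/(n+1) \) exceeds one only while \( n + 1 < \mu \), I'd expect the search to land on \( \lfloor \mu \rfloor \) when \( \mu \) is not an integer (for integer \( \mu \) the values \( \mu - 1 \) and \( \mu \) tie).

```python
import math


def most_probable_n(mu, n_max=None):
    """Brute-force argmax of P(n) = exp(-mu) mu^n / n! over n = 0..n_max."""
    if n_max is None:
        n_max = int(4 * mu) + 20  # assumed cutoff, well past the peak
    best_n, best_p = 0, math.exp(-mu)
    p = best_p
    for n in range(1, n_max + 1):
        p *= mu / n  # P(n) = P(n-1) mu / n
        if p > best_p:
            best_n, best_p = n, p
    return best_n


def most_probable_energy(z):
    """Most probable energy in units of hbar omega, E = n + 1/2."""
    return most_probable_n(abs(z) ** 2) + 0.5


for mu in (0.4, 6.25, 7.3):
    print(mu, most_probable_n(mu), math.floor(mu))
```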


Determining the rotation angle and normal for a rotation through Euler angles

[1] pr. 3.9 poses the problem to determine the total rotation angle for a set of Euler rotations given by

\begin{equation}
\mathcal{D}^{1/2}(\alpha, \beta, \gamma)
=
\begin{bmatrix}
e^{-i(\alpha+\gamma)/2} \cos \frac{\beta}{2} & -e^{-i(\alpha-\gamma)/2} \sin \frac{\beta}{2} \\
e^{i(\alpha-\gamma)/2} \sin \frac{\beta}{2} & e^{i(\alpha+\gamma)/2} \cos \frac{\beta}{2}
\end{bmatrix}.
\end{equation}

Compare this to the matrix for a rotation (again double sided) about a normal, given by

\begin{equation}\label{eqn:eulerAngleRotationAngleAndNormal:40}
\mathcal{R}
= e^{-i \Bsigma \cdot \ncap \theta/2}
= \cos \frac{\theta}{2} I - i \Bsigma \cdot \ncap \sin \frac{\theta}{2}.
\end{equation}

With \( \ncap = \lr{ \sin \Theta \cos\Phi, \sin \Theta \sin\Phi, \cos\Theta} \), the normal direction in its Pauli basis is

\begin{equation}
\begin{aligned}
\Bsigma \cdot \ncap
&=
\begin{bmatrix}
\cos\Theta & \sin \Theta \cos\Phi - i \sin \Theta \sin\Phi \\
\sin \Theta \cos\Phi + i \sin \Theta \sin\Phi & -\cos\Theta
\end{bmatrix} \\
&=
\begin{bmatrix}
\cos\Theta & \sin \Theta e^{-i \Phi} \\
\sin \Theta e^{i \Phi} & -\cos\Theta
\end{bmatrix},
\end{aligned}
\end{equation}

so

\begin{equation}
\mathcal{R} =
\begin{bmatrix}
\cos \frac{\theta}{2} -i \sin \frac{\theta}{2} \cos\Theta & -i \sin \Theta e^{-i \Phi} \sin \frac{\theta}{2} \\
-i \sin \Theta e^{i \Phi} \sin \frac{\theta}{2} & \cos \frac{\theta}{2} +i \sin \frac{\theta}{2} \cos\Theta
\end{bmatrix}.
\end{equation}

It’s not obvious how to put this into correspondence with the matrix for the Euler rotations. Doing so certainly doesn’t look fun. To solve this problem, let’s go the opposite direction, and put the matrix for the Euler rotations into the form of \ref{eqn:eulerAngleRotationAngleAndNormal:40}.

That is

\begin{equation}
\begin{aligned}
\mathcal{D}^{1/2}(\alpha, \beta, \gamma)
&=
\begin{bmatrix}
e^{-i(\alpha+\gamma)/2} \cos \frac{\beta}{2} & -e^{-i(\alpha-\gamma)/2} \sin \frac{\beta}{2} \\
e^{i(\alpha-\gamma)/2} \sin \frac{\beta}{2} & e^{i(\alpha+\gamma)/2} \cos \frac{\beta}{2}
\end{bmatrix} \\
&=
\begin{bmatrix}
\cos\frac{\alpha+\gamma}{2} \cos \frac{\beta}{2} & - \cos\frac{\alpha-\gamma}{2} \sin \frac{\beta}{2} \\
\cos\frac{\alpha-\gamma}{2} \sin \frac{\beta}{2} & \cos\frac{\alpha+\gamma}{2} \cos \frac{\beta}{2}
\end{bmatrix} \\
&\quad +
i
\begin{bmatrix}
- \sin\frac{\alpha+\gamma}{2} \cos \frac{\beta}{2} & \sin\frac{\alpha-\gamma}{2} \sin \frac{\beta}{2} \\
\sin\frac{\alpha-\gamma}{2} \sin \frac{\beta}{2} & \sin\frac{\alpha+\gamma}{2} \cos \frac{\beta}{2}
\end{bmatrix} \\
&=
\cos\frac{\alpha+\gamma}{2} \cos \frac{\beta}{2} I
+ i \sin\frac{\alpha-\gamma}{2} \sin \frac{\beta}{2} \sigma_x
- i \cos\frac{\alpha-\gamma}{2} \sin \frac{\beta}{2} \sigma_y
- i \sin\frac{\alpha+\gamma}{2} \cos \frac{\beta}{2} \sigma_z.
\end{aligned}
\end{equation}

This gives us

\begin{equation}
\begin{aligned}
\cos\frac{\theta}{2} &= \cos\frac{\alpha+\gamma}{2} \cos \frac{\beta}{2} \\
\ncap \sin\frac{\theta}{2} &= \lr{ -\sin\frac{\alpha-\gamma}{2} \sin \frac{\beta}{2}, \cos\frac{\alpha-\gamma}{2} \sin \frac{\beta}{2}, \sin\frac{\alpha+\gamma}{2} \cos \frac{\beta}{2} }.
\end{aligned}
\end{equation}

The angle is

\begin{equation}
\theta
= 2 \textrm{arctan} \frac{
\sqrt{\sin^2\frac{\beta}{2} + \sin^2\frac{\alpha+\gamma}{2} \cos^2\frac{\beta}{2}}
}{\cos\frac{\alpha+\gamma}{2} \cos \frac{\beta}{2}},
\end{equation}

or

\begin{equation}
\theta = 2 \textrm{arctan} \frac{
\sqrt{\tan^2\frac{\beta}{2} + \sin^2\frac{\alpha+\gamma}{2}}
}{\cos\frac{\alpha+\gamma}{2}},
\end{equation}

and the normal direction is

\begin{equation}
\ncap =
\inv{\sqrt{1 - \cos^2\frac{\alpha+\gamma}{2} \cos^2\frac{\beta}{2} }}
\lr{ -\sin\frac{\alpha-\gamma}{2} \sin \frac{\beta}{2}, \cos\frac{\alpha-\gamma}{2} \sin \frac{\beta}{2}, \sin\frac{\alpha+\gamma}{2} \cos \frac{\beta}{2} }.
\end{equation}
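These results can be checked numerically by building both matrices and comparing them. The following is a sketch with hypothetical function names, not something from [1]. It assumes the convention \( \sin(\theta/2) \ge 0 \), and uses a two-argument arctangent in place of the plain arctan above so that the quadrant of \( \theta/2 \) is handled when the cosine factor is negative.

```python
import cmath
import math


def euler_matrix(alpha, beta, gamma):
    """Spin one half Euler rotation matrix D^{1/2}(alpha, beta, gamma)."""
    c, s = math.cos(beta / 2), math.sin(beta / 2)
    return [
        [cmath.exp(-1j * (alpha + gamma) / 2) * c, -cmath.exp(-1j * (alpha - gamma) / 2) * s],
        [cmath.exp(1j * (alpha - gamma) / 2) * s, cmath.exp(1j * (alpha + gamma) / 2) * c],
    ]


def axis_angle(alpha, beta, gamma):
    """Rotation angle theta and unit normal for the given Euler angles."""
    c, s = math.cos(beta / 2), math.sin(beta / 2)
    cos_half = math.cos((alpha + gamma) / 2) * c
    v = (  # ncap sin(theta/2)
        -math.sin((alpha - gamma) / 2) * s,
        math.cos((alpha - gamma) / 2) * s,
        math.sin((alpha + gamma) / 2) * c,
    )
    sin_half = math.sqrt(sum(x * x for x in v))
    theta = 2 * math.atan2(sin_half, cos_half)
    if sin_half == 0.0:
        return theta, (0.0, 0.0, 1.0)  # no rotation plane: the axis is arbitrary
    return theta, tuple(x / sin_half for x in v)


def rotation_matrix(theta, n):
    """cos(theta/2) I - i sigma.n sin(theta/2)."""
    ch, sh = math.cos(theta / 2), math.sin(theta / 2)
    nx, ny, nz = n
    return [
        [ch - 1j * nz * sh, (-1j * nx - ny) * sh],
        [(-1j * nx + ny) * sh, ch + 1j * nz * sh],
    ]


def max_abs_diff(x, y):
    return max(abs(p - q) for r, s in zip(x, y) for p, q in zip(r, s))


angles = (0.3, 1.1, -0.7)  # assumed sample Euler angles
theta, n = axis_angle(*angles)
print(theta, n, max_abs_diff(euler_matrix(*angles), rotation_matrix(theta, n)))
```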


Ensembles for spin one half

Mixed ensemble averages

In [1], Sakurai leaves it to the reader to verify that knowledge of the three ensemble averages \( [S_x], [S_y], [S_z] \) is sufficient to reconstruct the density operator for a spin one half system.

I’ll do this in two parts, the first using a spin-up/down ensemble to see what form this has, then the general case. The general case is a bit messy algebraically. After first attempting it the hard way, I did the grunt work portion of that calculation in Mathematica, but then realized it’s not so bad to do it manually.

Consider first an ensemble with density operator

\begin{equation}
\rho =
w_{+} \ket{+}\bra{+} + w_{-} \ket{-}\bra{-},
\end{equation}

where these are the \( \BS \cdot (\pm \zcap) \) eigenstates. The traces are

\begin{equation}
\begin{aligned}
\textrm{Tr}( \rho \sigma_x )
&=
\bra{+} \rho \sigma_x \ket{+}
+ \bra{-} \rho \sigma_x \ket{-} \\
&=
\bra{+} \rho \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix} \ket{+}
+ \bra{-} \rho \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix} \ket{-} \\
&=
\bra{+} \lr{ w_{+} \ket{+}\bra{+} + w_{-} \ket{-}\bra{-} } \ket{-}
+ \bra{-} \lr{ w_{+} \ket{+}\bra{+} + w_{-} \ket{-}\bra{-} } \ket{+} \\
&=
\bra{+} w_{-} \ket{-}
+ \bra{-} w_{+} \ket{+} \\
&= 0,
\end{aligned}
\end{equation}

\begin{equation}
\begin{aligned}
\textrm{Tr}( \rho \sigma_y )
&=
\bra{+} \rho \sigma_y \ket{+}
+ \bra{-} \rho \sigma_y \ket{-} \\
&=
\bra{+} \rho \begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix} \ket{+}
+ \bra{-} \rho \begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix} \ket{-} \\
&=
i \bra{+} \lr{ w_{+} \ket{+}\bra{+} + w_{-} \ket{-}\bra{-} } \ket{-}
- i \bra{-} \lr{ w_{+} \ket{+}\bra{+} + w_{-} \ket{-}\bra{-} } \ket{+} \\
&=
i \bra{+} w_{-} \ket{-}
- i \bra{-} w_{+} \ket{+} \\
&= 0,
\end{aligned}
\end{equation}

and

\begin{equation}
\begin{aligned}
\textrm{Tr}( \rho \sigma_z )
&=
\bra{+} \rho \sigma_z \ket{+}
+ \bra{-} \rho \sigma_z \ket{-} \\
&=
\bra{+} \rho \ket{+}
- \bra{-} \rho \ket{-} \\
&=
\bra{+} \lr{ w_{+} \ket{+}\bra{+} + w_{-} \ket{-}\bra{-} } \ket{+}
- \bra{-} \lr{ w_{+} \ket{+}\bra{+} + w_{-} \ket{-}\bra{-} } \ket{-} \\
&=
\bra{+} w_{+} \ket{+}
- \bra{-} w_{-} \ket{-} \\
&=
w_{+} - w_{-}.
\end{aligned}
\end{equation}

Since \( w_{+} + w_{-} = 1 \), this gives

\begin{equation}
\begin{aligned}
w_{+} &= \frac{1 + \textrm{Tr}( \rho \sigma_z )}{2} \\
w_{-} &= \frac{1 - \textrm{Tr}( \rho \sigma_z )}{2}.
\end{aligned}
\end{equation}

Attempting to do a similar set of trace expansions this way for a more general spin basis turns out to be a really bad idea and horribly messy, so much so that I resorted to Mathematica to do this symbolic work. However, it’s not so bad if the trace is done completely in matrix form.

Using the basis

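\begin{equation}
\begin{aligned}
\ket{\BS \cdot \ncap ; + } &=
\begin{bmatrix}
\cos(\theta/2) \\
\sin(\theta/2) e^{i \phi}
\end{bmatrix} \\
\ket{\BS \cdot \ncap ; - } &=
\begin{bmatrix}
\sin(\theta/2) e^{-i \phi} \\
-\cos(\theta/2) \\
\end{bmatrix},
\end{aligned}
\end{equation}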

the projector matrices are

\begin{equation}
\begin{aligned}
\ket{\BS \cdot \ncap ; + } \bra{\BS \cdot \ncap ; + }
&=
\begin{bmatrix}
\cos(\theta/2) \\
\sin(\theta/2) e^{i \phi}
\end{bmatrix}
\begin{bmatrix}
\cos(\theta/2) &
\sin(\theta/2) e^{-i \phi}
\end{bmatrix} \\
&=
\begin{bmatrix}
\cos^2(\theta/2) & \cos(\theta/2) \sin(\theta/2) e^{-i \phi} \\
\sin(\theta/2) \cos(\theta/2) e^{i \phi} & \sin^2(\theta/2)
\end{bmatrix} \\
\ket{\BS \cdot \ncap ; - } \bra{\BS \cdot \ncap ; - }
&=
\begin{bmatrix}
\sin(\theta/2) e^{-i \phi} \\
-\cos(\theta/2) \\
\end{bmatrix}
\begin{bmatrix}
\sin(\theta/2) e^{i \phi} & -\cos(\theta/2) \\
\end{bmatrix} \\
&=
\begin{bmatrix}
\sin^2(\theta/2) & -\cos(\theta/2) \sin(\theta/2) e^{-i \phi} \\
-\cos(\theta/2) \sin(\theta/2) e^{i \phi} & \cos^2(\theta/2)
\end{bmatrix}.
\end{aligned}
\end{equation}

With \( C = \cos(\theta/2), S = \sin(\theta/2) \), a general density operator in this basis has the form

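\begin{equation}
\begin{aligned}
\rho
&=
w_{+}
\begin{bmatrix}
C^2 & C S e^{-i \phi} \\
S C e^{i \phi} & S^2
\end{bmatrix}
+ w_{-}
\begin{bmatrix}
S^2 & -C S e^{-i \phi} \\
-C S e^{i \phi} & C^2
\end{bmatrix} \\
&=
\begin{bmatrix}
w_{+} C^2 + w_{-} S^2 & (w_{+} - w_{-})C S e^{-i \phi} \\
(w_{+} -w_{-} ) S C e^{i \phi} & w_{+} S^2 + w_{-} C^2
\end{bmatrix}.
\end{aligned}
\end{equation}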

The products with the Pauli matrices are

\begin{equation}
\begin{aligned}
\rho \sigma_x
&=
\begin{bmatrix}
w_{+} C^2 + w_{-} S^2 & (w_{+} - w_{-})C S e^{-i \phi} \\
(w_{+} -w_{-} ) S C e^{i \phi} & w_{+} S^2 + w_{-} C^2
\end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix} \\
&=
\begin{bmatrix}
(w_{+} - w_{-})C S e^{-i \phi} & w_{+} C^2 + w_{-} S^2 \\
w_{+} S^2 + w_{-} C^2 & (w_{+} -w_{-} ) S C e^{i \phi} \\
\end{bmatrix},
\end{aligned}
\end{equation}

\begin{equation}
\begin{aligned}
\rho \sigma_y
&=
\begin{bmatrix}
w_{+} C^2 + w_{-} S^2 & (w_{+} - w_{-})C S e^{-i \phi} \\
(w_{+} -w_{-} ) S C e^{i \phi} & w_{+} S^2 + w_{-} C^2
\end{bmatrix}
\begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix} \\
&=
i
\begin{bmatrix}
(w_{+} - w_{-})C S e^{-i \phi} & -w_{+} C^2 - w_{-} S^2 \\
w_{+} S^2 + w_{-} C^2 & -(w_{+} -w_{-} ) S C e^{i \phi} \\
\end{bmatrix},
\end{aligned}
\end{equation}

\begin{equation}
\begin{aligned}
\rho \sigma_z
&=
\begin{bmatrix}
w_{+} C^2 + w_{-} S^2 & (w_{+} - w_{-})C S e^{-i \phi} \\
(w_{+} -w_{-} ) S C e^{i \phi} & w_{+} S^2 + w_{-} C^2
\end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix} \\
&=
\begin{bmatrix}
w_{+} C^2 + w_{-} S^2 & -(w_{+} - w_{-})C S e^{-i \phi} \\
(w_{+} -w_{-} ) S C e^{i \phi} & - (w_{+} S^2 + w_{-} C^2)
\end{bmatrix}.
\end{aligned}
\end{equation}

The respective traces can be read right off the matrices

\begin{equation}
\begin{aligned}
\textrm{Tr}( \rho \sigma_x ) &= (w_{+} - w_{-}) \sin\theta \cos\phi \\
\textrm{Tr}( \rho \sigma_y ) &= (w_{+} - w_{-}) \sin\theta \sin\phi \\
\textrm{Tr}( \rho \sigma_z ) &= (w_{+} - w_{-}) \cos\theta.
\end{aligned}
\end{equation}

This gives

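\begin{equation}
(w_{+} - w_{-}) \ncap = \lr{ \textrm{Tr}( \rho \sigma_x ), \textrm{Tr}( \rho \sigma_y ), \textrm{Tr}( \rho \sigma_z ) },
\end{equation}

or, since \( w_{+} + w_{-} = 1 \) and \( \ncap \) is a unit vector (labeling the states so that \( w_{+} \ge w_{-} \)),

\begin{equation}
w_{\pm} = \frac{1 \pm \sqrt{ \textrm{Tr}^2( \rho \sigma_x ) + \textrm{Tr}^2( \rho \sigma_y ) + \textrm{Tr}^2( \rho \sigma_z )} }{2} .
\end{equation}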

So, as claimed, it’s possible to completely describe the ensemble weight factors using the ensemble averages of \( [S_x], [S_y], [S_z] \). I used the Pauli matrices instead, but the difference is just an \( \Hbar/2 \) scaling adjustment.
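Here is a small numerical sketch of that reconstruction. It is my own illustration with hypothetical function names and assumed sample values: it builds \( \rho \) from chosen weights and angles, computes the three traces, then recovers the weights and direction from the traces alone, assuming \( w_{+} \ge w_{-} \).

```python
import cmath
import math

# Pauli matrices sigma_x, sigma_y, sigma_z
SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)


def density_matrix(w_plus, theta, phi):
    """rho = w_+ |n;+><n;+| + w_- |n;-><n;-| in matrix form, with w_- = 1 - w_+."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    w_minus = 1.0 - w_plus
    d = w_plus - w_minus
    return [
        [w_plus * c * c + w_minus * s * s, d * c * s * cmath.exp(-1j * phi)],
        [d * s * c * cmath.exp(1j * phi), w_plus * s * s + w_minus * c * c],
    ]


def trace_product(rho, sigma):
    """Tr(rho sigma) for 2x2 matrices."""
    return sum(rho[i][k] * sigma[k][i] for i in range(2) for k in range(2))


def reconstruct(averages):
    """Recover (w_plus, w_minus, ncap) from the three traces, assuming w_plus >= w_minus."""
    r = math.sqrt(sum(t * t for t in averages))
    w_plus, w_minus = (1 + r) / 2, (1 - r) / 2
    # For r = 0 the ensemble is completely random and no direction is singled out.
    ncap = tuple(t / r for t in averages) if r > 0 else None
    return w_plus, w_minus, ncap


rho = density_matrix(0.8, 1.0, 2.0)  # assumed sample weights and angles
averages = [trace_product(rho, s).real for s in SIGMA]
w_plus, w_minus, ncap = reconstruct(averages)
print(w_plus, w_minus, ncap)
```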

Pure ensemble

It turns out that doing the above is also pr. 3.10(b). Part (a) of that problem is to show how the expectation values \( \expectation{S_x}, \expectation{S_y}, \expectation{S_z} \) fully determine the spin orientation for a pure ensemble.

Suppose that the system is in the state \( \ket{\BS \cdot \ncap ; + } \) as defined above, then the expectation values of \( \sigma_x, \sigma_y, \sigma_z \) with respect to this state are

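\begin{equation}
\begin{aligned}
\expectation{\sigma_x}
&=
\begin{bmatrix}
\cos(\theta/2) &
\sin(\theta/2) e^{-i \phi}
\end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 1 & 0 \\ \end{bmatrix}
\begin{bmatrix}
\cos(\theta/2) \\
\sin(\theta/2) e^{i \phi}
\end{bmatrix} \\
&=
\begin{bmatrix}
\cos(\theta/2) &
\sin(\theta/2) e^{-i \phi}
\end{bmatrix}
\begin{bmatrix}
\sin(\theta/2) e^{i \phi} \\
\cos(\theta/2) \\
\end{bmatrix} \\
&=
\sin\theta \cos\phi,
\end{aligned}
\end{equation}

\begin{equation}
\begin{aligned}
\expectation{\sigma_y}
&=
\begin{bmatrix}
\cos(\theta/2) &
\sin(\theta/2) e^{-i \phi}
\end{bmatrix}
\begin{bmatrix} 0 & -i \\ i & 0 \\ \end{bmatrix}
\begin{bmatrix}
\cos(\theta/2) \\
\sin(\theta/2) e^{i \phi}
\end{bmatrix} \\
&=
i
\begin{bmatrix}
\cos(\theta/2) &
\sin(\theta/2) e^{-i \phi}
\end{bmatrix}
\begin{bmatrix}
-\sin(\theta/2) e^{i \phi} \\
\cos(\theta/2) \\
\end{bmatrix} \\
&=
\sin\theta \sin\phi,
\end{aligned}
\end{equation}

\begin{equation}
\begin{aligned}
\expectation{\sigma_z}
&=
\begin{bmatrix}
\cos(\theta/2) &
\sin(\theta/2) e^{-i \phi}
\end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & -1 \\ \end{bmatrix}
\begin{bmatrix}
\cos(\theta/2) \\
\sin(\theta/2) e^{i \phi}
\end{bmatrix} \\
&=
\begin{bmatrix}
\cos(\theta/2) &
\sin(\theta/2) e^{-i \phi}
\end{bmatrix}
\begin{bmatrix}
\cos(\theta/2) \\
-\sin(\theta/2) e^{i \phi}
\end{bmatrix} \\
&=
\cos\theta.
\end{aligned}
\end{equation}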

So we have

\begin{equation}
\ncap = \lr{ \expectation{\sigma_x}, \expectation{\sigma_y}, \expectation{\sigma_z} }.
\end{equation}

The spin direction is completely determined by this vector of expectation values (or equivalently, the expectation values of \( S_x, S_y, S_z \)).
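This too is easy to spot-check numerically. The sketch below uses hypothetical function names and assumed sample angles: it evaluates the three expectation values for \( \ket{\BS \cdot \ncap ; + } \) and then recovers the polar angles from them.

```python
import cmath
import math

# Pauli matrices sigma_x, sigma_y, sigma_z
SIGMA = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)


def spin_up_along(theta, phi):
    """Two-component representation of |S.n; +>."""
    return [math.cos(theta / 2), math.sin(theta / 2) * cmath.exp(1j * phi)]


def expectation(state, sigma):
    """<state| sigma |state> for a normalized two-component state."""
    total = sum(
        state[i].conjugate() * sigma[i][j] * state[j]
        for i in range(2)
        for j in range(2)
    )
    return total.real  # sigma is Hermitian, so the imaginary part is roundoff


def direction_angles(n):
    """Polar angles (theta, phi) of a unit vector n."""
    nx, ny, nz = n
    return math.acos(nz), math.atan2(ny, nx)


state = spin_up_along(0.9, -1.3)  # assumed sample orientation
n = tuple(expectation(state, s) for s in SIGMA)
print(n, direction_angles(n))
```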


