## Motivation

The conventional form for the Pauli matrices is

\begin{equation}\label{eqn:pauliMatrixXYgeometry:20}

\begin{aligned}

\sigma_x &=

\begin{bmatrix}

0 & 1 \\

1 & 0 \\

\end{bmatrix} \\

\sigma_y &=

\begin{bmatrix}

0 & -i \\

i & 0 \\

\end{bmatrix} \\

\sigma_z &=

\begin{bmatrix}

1 & 0 \\

0 & -1 \\

\end{bmatrix}

\end{aligned}.

\end{equation}

In [1] these forms are derived based on the commutation relations

\begin{equation}\label{eqn:pauliMatrixXYgeometry:40}

\antisymmetric{\sigma_r}{\sigma_s} = 2 i \epsilon_{r s t} \sigma_t,

\end{equation}

by defining raising and lowering operators \( \sigma_{\pm} = \sigma_x \pm i \sigma_y \) and figuring out what form the matrix must take. I noticed an interesting geometrical relation hiding in that derivation if \( \sigma_{+} \) is not assumed to be real.
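As a quick numerical sanity check of \ref{eqn:pauliMatrixXYgeometry:40} against the conventional matrices \ref{eqn:pauliMatrixXYgeometry:20}, here is a minimal pure-Python sketch. The helper names (`matmul`, `commutator`, `scale`, `close`) are illustrative choices, not anything taken from [1].

```python
# Check [sigma_r, sigma_s] = 2i * epsilon_{rst} * sigma_t for the cyclic
# index orderings, using 2x2 matrices stored as nested tuples.
# All helper names are illustrative assumptions.

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return tuple(tuple(AB[i][j] - BA[i][j] for j in range(2)) for i in range(2))

def scale(s, A):
    """Multiply every entry of A by the scalar s."""
    return tuple(tuple(s * x for x in row) for row in A)

def close(A, B, tol=1e-12):
    """Entrywise comparison with a small tolerance."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

sigma_x = ((0, 1), (1, 0))
sigma_y = ((0, -1j), (1j, 0))
sigma_z = ((1, 0), (0, -1))

# (r, s, t) run over the cyclic permutations of (x, y, z), where epsilon = +1.
cyclic_ok = all(
    close(commutator(a, b), scale(2j, c))
    for a, b, c in [
        (sigma_x, sigma_y, sigma_z),
        (sigma_y, sigma_z, sigma_x),
        (sigma_z, sigma_x, sigma_y),
    ]
)
print(cyclic_ok)
```

The anticyclic orderings follow from the antisymmetry of the commutator, so only the three cyclic cases are checked.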

## Derivation

For completeness, I’ll repeat the argument of [1], which builds on the commutation relations of the raising and lowering operators. Those are

\begin{equation}\label{eqn:pauliMatrixXYgeometry:60}

\begin{aligned}

\antisymmetric{\sigma_z}{\sigma_{\pm}}

&=

\sigma_z \lr{ \sigma_x \pm i \sigma_y }

-\lr{ \sigma_x \pm i \sigma_y } \sigma_z \\

&=

\antisymmetric{\sigma_z}{\sigma_x} \pm i \antisymmetric{\sigma_z}{\sigma_y} \\

&=

2 i \sigma_y \pm i (-2 i) \sigma_x \\

&= \pm 2 \lr{ \sigma_x \pm i \sigma_y } \\

&= \pm 2 \sigma_{\pm},

\end{aligned}

\end{equation}

and

\begin{equation}\label{eqn:pauliMatrixXYgeometry:80}

\begin{aligned}

\antisymmetric{\sigma_{+}}{\sigma_{-}}

&=

\lr{ \sigma_x + i \sigma_y } \lr{ \sigma_x - i \sigma_y }

-\lr{ \sigma_x - i \sigma_y } \lr{ \sigma_x + i \sigma_y } \\

&=

-i \sigma_x \sigma_y + i \sigma_y \sigma_x

- i \sigma_x \sigma_y + i \sigma_y \sigma_x \\

&= 2 i \antisymmetric{ \sigma_y }{\sigma_x} \\

&= 2 i (-2i) \sigma_z \\

&= 4 \sigma_z

\end{aligned}

\end{equation}
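The two ladder-operator commutators above can also be checked numerically for the conventional matrices. This is only a sketch: the helpers (`mul`, `add`, `comm`, `scaled`, `close`) are hypothetical names introduced for illustration.

```python
# Check [sigma_z, sigma_pm] = +-2 sigma_pm and [sigma_+, sigma_-] = 4 sigma_z
# for the conventional Pauli matrices. Helper names are illustrative.

def mul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, s=1):
    """Return A + s*B, entry by entry."""
    return [[A[i][j] + s * B[i][j] for j in range(2)] for i in range(2)]

def comm(A, B):
    """[A, B] = AB - BA."""
    return add(mul(A, B), mul(B, A), -1)

def scaled(s, A):
    """Multiply every entry of A by the scalar s."""
    return [[s * x for x in row] for row in A]

def close(A, B, tol=1e-12):
    """Entrywise comparison with a small tolerance."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

sx = [[0, 1], [1, 0]]
sy = [[0, -1j], [1j, 0]]
sz = [[1, 0], [0, -1]]

s_plus = add(sx, sy, 1j)    # sigma_x + i sigma_y
s_minus = add(sx, sy, -1j)  # sigma_x - i sigma_y

print(close(comm(sz, s_plus), scaled(2, s_plus)))
print(close(comm(sz, s_minus), scaled(-2, s_minus)))
print(close(comm(s_plus, s_minus), scaled(4, sz)))
```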

From these a matrix representation containing unknown values can be assumed. Let

\begin{equation}\label{eqn:pauliMatrixXYgeometry:100}

\sigma_{+} =

\begin{bmatrix}

a & b \\

c & d

\end{bmatrix}.

\end{equation}

Taking \( \sigma_z \) in the diagonal form of \ref{eqn:pauliMatrixXYgeometry:20}, the commutator with \( \sigma_z \) can be computed

\begin{equation}\label{eqn:pauliMatrixXYgeometry:120}

\begin{aligned}

\antisymmetric{\sigma_z}{\sigma_{+}}

&=

\begin{bmatrix}

1 & 0 \\

0 & -1 \\

\end{bmatrix}

\begin{bmatrix}

a & b \\

c & d

\end{bmatrix}

-

\begin{bmatrix}

a & b \\

c & d

\end{bmatrix}

\begin{bmatrix}

1 & 0 \\

0 & -1 \\

\end{bmatrix}

\\

&=

\begin{bmatrix}

a & b \\

-c & -d

\end{bmatrix}

-

\begin{bmatrix}

a & -b \\

c & -d

\end{bmatrix} \\

&=

2

\begin{bmatrix}

0 & b \\

-c & 0

\end{bmatrix}

\end{aligned}

\end{equation}

Now compare this with \ref{eqn:pauliMatrixXYgeometry:60}

\begin{equation}\label{eqn:pauliMatrixXYgeometry:140}

2

\begin{bmatrix}

0 & b \\

-c & 0

\end{bmatrix}

=

2 \sigma_{+}

=

2

\begin{bmatrix}

a & b \\

c & d

\end{bmatrix}.

\end{equation}

This shows that \( a = 0 \) and \( d = 0 \). Similarly, the \( \sigma_z \) commutator with the lowering operator is

\begin{equation}\label{eqn:pauliMatrixXYgeometry:160}

\begin{aligned}

\antisymmetric{\sigma_z}{\sigma_{-}}

&=

\begin{bmatrix}

1 & 0 \\

0 & -1 \\

\end{bmatrix}

\begin{bmatrix}

0 & -c^\conj \\

b^\conj & 0

\end{bmatrix}

-

\begin{bmatrix}

0 & -c^\conj \\

b^\conj & 0

\end{bmatrix}

\begin{bmatrix}

1 & 0 \\

0 & -1 \\

\end{bmatrix}

\\

&=

\begin{bmatrix}

0 & -c^\conj \\

-b^\conj & 0

\end{bmatrix}

-

\begin{bmatrix}

0 & c^\conj \\

b^\conj & 0

\end{bmatrix} \\

&=

-2

\begin{bmatrix}

0 & c^\conj \\

b^\conj & 0

\end{bmatrix}

\end{aligned}

\end{equation}

Again comparing to \ref{eqn:pauliMatrixXYgeometry:60}, we have

\begin{equation}\label{eqn:pauliMatrixXYgeometry:180}

-2

\begin{bmatrix}

0 & c^\conj \\

b^\conj & 0

\end{bmatrix}

= - 2 \sigma_{-}

= - 2

\begin{bmatrix}

0 & -c^\conj \\

b^\conj & 0

\end{bmatrix},

\end{equation}

so \( c = 0 \). Computing the commutator of the raising and lowering operators fixes the magnitude of \( b \)

\begin{equation}\label{eqn:pauliMatrixXYgeometry:200}

\begin{aligned}

\antisymmetric{\sigma_{+}}{\sigma_{-}}

&=

\begin{bmatrix}

0 & b \\

0 & 0 \\

\end{bmatrix}

\begin{bmatrix}

0 & 0 \\

b^\conj & 0 \\

\end{bmatrix}

-

\begin{bmatrix}

0 & 0 \\

b^\conj & 0 \\

\end{bmatrix}

\begin{bmatrix}

0 & b \\

0 & 0 \\

\end{bmatrix} \\

&=

\begin{bmatrix}

\Abs{b}^2 & 0 \\

0 & 0

\end{bmatrix}

-

\begin{bmatrix}

0 & 0 \\

0 & \Abs{b}^2 \\

\end{bmatrix} \\

&=

\Abs{b}^2

\begin{bmatrix}

1 & 0 \\

0 & -1 \\

\end{bmatrix}

\\

&=

\Abs{b}^2 \sigma_z.

\end{aligned}

\end{equation}

From \ref{eqn:pauliMatrixXYgeometry:80} it must be that \( \Abs{b}^2 = 4\), so the most general form of the raising operator is

\begin{equation}\label{eqn:pauliMatrixXYgeometry:220}

\sigma_{+}

=

2

\begin{bmatrix}

0 & e^{i \phi} \\

0 & 0

\end{bmatrix}.

\end{equation}
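To see that the phase really is unconstrained, the ladder commutators can be evaluated for a few arbitrary values of \( \phi \). The sketch below assumes \( \sigma_{-} \) is the Hermitian conjugate of \( \sigma_{+} \) (which holds when \( \sigma_x \) and \( \sigma_y \) are Hermitian); the function names are illustrative only.

```python
import cmath

# Check that sigma_+ = 2 [[0, e^{i phi}], [0, 0]] satisfies the ladder
# commutators for any phase phi. Assumption: sigma_- = (sigma_+)^dagger.

def mul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comm(A, B):
    """[A, B] = AB - BA."""
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison with a small tolerance."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

SIGMA_Z = [[1, 0], [0, -1]]

def sigma_plus(phi):
    """General raising operator with an arbitrary phase phi."""
    return [[0, 2 * cmath.exp(1j * phi)], [0, 0]]

def ladder_relations_hold(phi):
    sp = sigma_plus(phi)
    sm = dagger(sp)
    return (
        close(comm(SIGMA_Z, sp), [[2 * x for x in row] for row in sp])
        and close(comm(SIGMA_Z, sm), [[-2 * x for x in row] for row in sm])
        and close(comm(sp, sm), [[4 * x for x in row] for row in SIGMA_Z])
    )

print(all(ladder_relations_hold(phi) for phi in (0.0, 0.3, 1.0, 2.5, -1.7)))
```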

## Observation

The conventional choice is to set \( \phi = 0 \), but I found it interesting to see the form of \( \sigma_x, \sigma_y \) without that choice. That is

\begin{equation}\label{eqn:pauliMatrixXYgeometry:240}

\begin{aligned}

\sigma_x

&= \inv{2} \lr{ \sigma_{+} + \sigma_{-} } \\

&=

\begin{bmatrix}

0 & e^{i \phi} \\

e^{-i \phi} & 0 \\

\end{bmatrix}

\end{aligned}

\end{equation}

\begin{equation}\label{eqn:pauliMatrixXYgeometry:260}

\begin{aligned}

\sigma_y

&= \inv{2 i} \lr{ \sigma_{+} - \sigma_{-} } \\

&=

\begin{bmatrix}

0 & -i e^{i \phi} \\

i e^{-i \phi} & 0 \\

\end{bmatrix} \\

&=

\begin{bmatrix}

0 & e^{i (\phi - \pi/2) } \\

e^{-i (\phi - \pi/2)} & 0 \\

\end{bmatrix}.

\end{aligned}

\end{equation}

Notice that \( \sigma_y \) has exactly the same form as \( \sigma_x \), with \( \phi \) replaced by \( \phi - \pi/2 \): the phases of the off-diagonal entries of the two matrices differ by \(90^\circ\). That \( 90^\circ \) separation isn’t obvious in the standard form \ref{eqn:pauliMatrixXYgeometry:20}.

It’s a small detail, but I thought it was kind of cool that the orthogonality of the x and y unit vectors that these matrices represent is built directly into the structure of the matrices themselves.
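The phase-shift relation between \( \sigma_x \) and \( \sigma_y \) can be checked numerically. In this minimal sketch, `sigma_y(phi)` is defined as `sigma_x(phi - pi/2)`, compared against \( (\sigma_{+} - \sigma_{-})/2i \), and then all three commutation relations are tested for several phases. The function names are hypothetical, chosen only for this illustration.

```python
import cmath
import math

# sigma_x and sigma_y for an arbitrary phase phi, with sigma_y obtained
# from sigma_x by a -pi/2 phase shift. Names here are illustrative.

def mul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comm(A, B):
    """[A, B] = AB - BA."""
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison with a small tolerance."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

SIGMA_Z = [[1, 0], [0, -1]]

def sigma_x(phi):
    return [[0, cmath.exp(1j * phi)], [cmath.exp(-1j * phi), 0]]

def sigma_y(phi):
    # Same matrix form as sigma_x, with the phase shifted by -90 degrees.
    return sigma_x(phi - math.pi / 2)

def sigma_y_from_ladder(phi):
    """(sigma_+ - sigma_-) / (2i), with sigma_+ = 2 [[0, e^{i phi}], [0, 0]]."""
    b = 2 * cmath.exp(1j * phi)
    return [[0, b / 2j], [-b.conjugate() / 2j, 0]]

def pauli_relations_hold(phi):
    sx, sy, sz = sigma_x(phi), sigma_y(phi), SIGMA_Z
    two_i = lambda M: [[2j * x for x in row] for row in M]
    return (
        close(comm(sx, sy), two_i(sz))
        and close(comm(sy, sz), two_i(sx))
        and close(comm(sz, sx), two_i(sy))
    )

phases = (0.0, 0.4, 1.3, -2.0)
print(all(close(sigma_y(p), sigma_y_from_ladder(p)) for p in phases))
print(all(pauli_relations_hold(p) for p in phases))
```

With `phi = 0` this reduces to the conventional \( \sigma_x \) and \( \sigma_y \).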

## References

[1] BR Desai. *Quantum mechanics with basic field theory*. Cambridge University Press, 2009.