Motivation

The conventional form for the Pauli matrices is

\label{eqn:pauliMatrixXYgeometry:20}
\begin{aligned}
\sigma_x &=
\begin{bmatrix}
0 & 1 \\
1 & 0 \\
\end{bmatrix} \\
\sigma_y &=
\begin{bmatrix}
0 & -i \\
i & 0 \\
\end{bmatrix} \\
\sigma_z &=
\begin{bmatrix}
1 & 0 \\
0 & -1 \\
\end{bmatrix}
\end{aligned}.

In [1] these forms are derived based on the commutation relations

\label{eqn:pauliMatrixXYgeometry:40}
\antisymmetric{\sigma_r}{\sigma_s} = 2 i \epsilon_{r s t} \sigma_t,

by defining raising and lowering operators $$\sigma_{\pm} = \sigma_x \pm i \sigma_y$$ and figuring out what form the matrix must take. I noticed an interesting geometrical relation hiding in that derivation if $$\sigma_{+}$$ is not assumed to be real.

Derivation

For completeness, I’ll repeat the argument of [1], which builds on the commutation relations of the raising and lowering operators. Those are

\label{eqn:pauliMatrixXYgeometry:60}
\begin{aligned}
\antisymmetric{\sigma_z}{\sigma_{\pm}}
&=
\sigma_z \lr{ \sigma_x \pm i \sigma_y }
-\lr{ \sigma_x \pm i \sigma_y } \sigma_z \\
&=
\antisymmetric{\sigma_z}{\sigma_x} \pm i \antisymmetric{\sigma_z}{\sigma_y} \\
&=
2 i \sigma_y \pm i (-2 i) \sigma_x \\
&= \pm 2 \lr{ \sigma_x \pm i \sigma_y } \\
&= \pm 2 \sigma_{\pm},
\end{aligned}

and

\label{eqn:pauliMatrixXYgeometry:80}
\begin{aligned}
\antisymmetric{\sigma_{+}}{\sigma_{-}}
&=
\lr{ \sigma_x + i \sigma_y } \lr{ \sigma_x - i \sigma_y }
-\lr{ \sigma_x - i \sigma_y } \lr{ \sigma_x + i \sigma_y } \\
&=
-i \sigma_x \sigma_y + i \sigma_y \sigma_x
- i \sigma_x \sigma_y + i \sigma_y \sigma_x \\
&= 2 i \antisymmetric{ \sigma_y }{\sigma_x} \\
&= 2 i (-2i) \sigma_z \\
&= 4 \sigma_z.
\end{aligned}
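As a quick sanity check (not part of the argument in [1]), the ladder operator relations above can be verified numerically against the conventional matrices. This is a minimal sketch in plain Python; the helper names `matmul`, `add`, `commutator` and `close` are my own, introduced only for illustration.

```python
# 2x2 complex matrices stored as nested tuples; standard library only.

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def add(a, b, scale=1):
    """Return a + scale * b."""
    return tuple(tuple(a[i][j] + scale * b[i][j] for j in range(2)) for i in range(2))

def commutator(a, b):
    """Return a b - b a."""
    return add(matmul(a, b), matmul(b, a), scale=-1)

def close(a, b, tol=1e-12):
    """Entrywise comparison within a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Conventional Pauli matrices.
sigma_x = ((0, 1), (1, 0))
sigma_y = ((0, -1j), (1j, 0))
sigma_z = ((1, 0), (0, -1))
zero = ((0, 0), (0, 0))

sigma_plus = add(sigma_x, sigma_y, scale=1j)    # sigma_x + i sigma_y
sigma_minus = add(sigma_x, sigma_y, scale=-1j)  # sigma_x - i sigma_y

# [sigma_z, sigma_pm] = pm 2 sigma_pm, and [sigma_+, sigma_-] = 4 sigma_z.
check_raise = close(commutator(sigma_z, sigma_plus), add(zero, sigma_plus, scale=2))
check_lower = close(commutator(sigma_z, sigma_minus), add(zero, sigma_minus, scale=-2))
check_pm = close(commutator(sigma_plus, sigma_minus), add(zero, sigma_z, scale=4))
print(check_raise, check_lower, check_pm)
```

All three checks should come out true for the conventional matrices, which is only a consistency check on the algebra, not a derivation.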

From these relations, the matrix representation can be found by starting with unknown entries. Let

\label{eqn:pauliMatrixXYgeometry:100}
\sigma_{+} =
\begin{bmatrix}
a & b \\
c & d
\end{bmatrix}.

The commutator with $$\sigma_z$$ can be computed

\label{eqn:pauliMatrixXYgeometry:120}
\begin{aligned}
\antisymmetric{\sigma_z}{\sigma_{+}}
&=
\begin{bmatrix}
1 & 0 \\
0 & -1 \\
\end{bmatrix}
\begin{bmatrix}
a & b \\
c & d
\end{bmatrix}
-
\begin{bmatrix}
a & b \\
c & d
\end{bmatrix}
\begin{bmatrix}
1 & 0 \\
0 & -1 \\
\end{bmatrix}
\\
&=
\begin{bmatrix}
a & b \\
-c & -d
\end{bmatrix}
-
\begin{bmatrix}
a & -b \\
c & -d
\end{bmatrix} \\
&=
2
\begin{bmatrix}
0 & b \\
-c & 0
\end{bmatrix}
\end{aligned}

Now compare this with \ref{eqn:pauliMatrixXYgeometry:60}:

\label{eqn:pauliMatrixXYgeometry:140}
2
\begin{bmatrix}
0 & b \\
-c & 0
\end{bmatrix}
=
2 \sigma_{+}
=
2
\begin{bmatrix}
a & b \\
c & d
\end{bmatrix}.

This shows that $$a = 0$$ and $$d = 0$$. Similarly, since the Pauli matrices are Hermitian, $$\sigma_{-} = \sigma_{+}^\dagger$$, and the $$\sigma_z$$ commutator with the lowering operator is

\label{eqn:pauliMatrixXYgeometry:160}
\begin{aligned}
\antisymmetric{\sigma_z}{\sigma_{-}}
&=
\begin{bmatrix}
1 & 0 \\
0 & -1 \\
\end{bmatrix}
\begin{bmatrix}
0 & c^\conj \\
b^\conj & 0
\end{bmatrix}
-
\begin{bmatrix}
0 & c^\conj \\
b^\conj & 0
\end{bmatrix}
\begin{bmatrix}
1 & 0 \\
0 & -1 \\
\end{bmatrix}
\\
&=
\begin{bmatrix}
0 & c^\conj \\
-b^\conj & 0
\end{bmatrix}
-
\begin{bmatrix}
0 & -c^\conj \\
b^\conj & 0
\end{bmatrix} \\
&=
-2
\begin{bmatrix}
0 & -c^\conj \\
b^\conj & 0
\end{bmatrix}.
\end{aligned}

Again comparing to \ref{eqn:pauliMatrixXYgeometry:60}, we have

\label{eqn:pauliMatrixXYgeometry:180}
-2
\begin{bmatrix}
0 & -c^\conj \\
b^\conj & 0
\end{bmatrix}
= - 2 \sigma_{-}
= - 2
\begin{bmatrix}
0 & c^\conj \\
b^\conj & 0
\end{bmatrix},

so $$c = 0$$. Computing the commutator of the raising and lowering operators fixes $$b$$:

\label{eqn:pauliMatrixXYgeometry:200}
\begin{aligned}
\antisymmetric{\sigma_{+}}{\sigma_{-}}
&=
\begin{bmatrix}
0 & b \\
0 & 0 \\
\end{bmatrix}
\begin{bmatrix}
0 & 0 \\
b^\conj & 0 \\
\end{bmatrix}
-
\begin{bmatrix}
0 & 0 \\
b^\conj & 0 \\
\end{bmatrix}
\begin{bmatrix}
0 & b \\
0 & 0 \\
\end{bmatrix} \\
&=
\begin{bmatrix}
\Abs{b}^2 & 0 \\
0 & 0
\end{bmatrix}
-
\begin{bmatrix}
0 & 0 \\
0 & \Abs{b}^2 \\
\end{bmatrix} \\
&=
\Abs{b}^2
\begin{bmatrix}
1 & 0 \\
0 & -1 \\
\end{bmatrix}
\\
&=
\Abs{b}^2 \sigma_z.
\end{aligned}

From \ref{eqn:pauliMatrixXYgeometry:80} it must be that $$\Abs{b}^2 = 4$$, so the most general form of the raising operator is

\label{eqn:pauliMatrixXYgeometry:220}
\sigma_{+}
=
2
\begin{bmatrix}
0 & e^{i \phi} \\
0 & 0
\end{bmatrix}.
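To convince myself that the phase really is unconstrained, here is a small numerical sketch, assuming $$\sigma_{-}$$ is the conjugate transpose of $$\sigma_{+}$$. It checks the three ladder commutation relations for a few arbitrarily chosen values of $$\phi$$. The function names (`raising`, `satisfies_relations`, and the matrix helpers) are hypothetical, chosen only for this illustration.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def commutator(a, b):
    """Return a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(ab[i][j] - ba[i][j] for j in range(2)) for i in range(2))

def scaled(s, m):
    """Return s * m."""
    return tuple(tuple(s * m[i][j] for j in range(2)) for i in range(2))

def dagger(m):
    """Conjugate transpose."""
    return tuple(
        tuple(complex(m[j][i]).conjugate() for j in range(2)) for i in range(2)
    )

def close(a, b, tol=1e-12):
    """Entrywise comparison within a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

sigma_z = ((1, 0), (0, -1))

def raising(phi):
    """General raising operator 2 [[0, e^{i phi}], [0, 0]]."""
    return ((0, 2 * cmath.exp(1j * phi)), (0, 0))

def satisfies_relations(phi):
    """Check [s_z, s_pm] = pm 2 s_pm and [s_+, s_-] = 4 s_z for this phase."""
    sp = raising(phi)
    sm = dagger(sp)  # assumption: lowering operator is the conjugate transpose
    return (
        close(commutator(sigma_z, sp), scaled(2, sp))
        and close(commutator(sigma_z, sm), scaled(-2, sm))
        and close(commutator(sp, sm), scaled(4, sigma_z))
    )

# Sample phases are arbitrary; any real value should behave the same way.
results = [satisfies_relations(phi) for phi in (0.0, 0.3, cmath.pi / 2, 2.5)]
print(results)
```

A handful of sample phases is of course no proof, but it agrees with the algebra: only $$\Abs{b}$$ is fixed by the commutators.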

Observation

The conventional choice is to set $$\phi = 0$$, but I found it interesting to see the form of $$\sigma_x, \sigma_y$$ without that choice. That is

\label{eqn:pauliMatrixXYgeometry:240}
\begin{aligned}
\sigma_x
&= \inv{2} \lr{ \sigma_{+} + \sigma_{-} } \\
&=
\begin{bmatrix}
0 & e^{i \phi} \\
e^{-i \phi} & 0 \\
\end{bmatrix}
\end{aligned}

\label{eqn:pauliMatrixXYgeometry:260}
\begin{aligned}
\sigma_y
&= \inv{2 i} \lr{ \sigma_{+} - \sigma_{-} } \\
&=
\begin{bmatrix}
0 & -i e^{i \phi} \\
i e^{-i \phi} & 0 \\
\end{bmatrix} \\
&=
\begin{bmatrix}
0 & e^{i (\phi - \pi/2) } \\
e^{-i (\phi - \pi/2)} & 0 \\
\end{bmatrix}.
\end{aligned}

Notice that $$\sigma_x$$ and $$\sigma_y$$ have exactly the same form: $$\sigma_y$$ is just $$\sigma_x$$ with the phase $$\phi$$ replaced by $$\phi - \pi/2$$, so the two differ only by a $$90^\circ$$ shift of that phase. That $$90^\circ$$ separation isn’t obvious in the standard form \ref{eqn:pauliMatrixXYgeometry:20}.
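The phase relation can also be checked numerically. The sketch below builds $$\sigma_x$$ and $$\sigma_y$$ from $$\sigma_{\pm}$$ for an arbitrary phase, then tests that $$\sigma_y(\phi)$$ equals $$\sigma_x(\phi - \pi/2)$$ and that the pair still satisfies $$\antisymmetric{\sigma_x}{\sigma_y} = 2 i \sigma_z$$. The function `sigma_xy` and the sample phase `0.7` are my own illustrative choices.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def commutator(a, b):
    """Return a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(ab[i][j] - ba[i][j] for j in range(2)) for i in range(2))

def close(a, b, tol=1e-12):
    """Entrywise comparison within a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def sigma_xy(phi):
    """Build sigma_x and sigma_y from sigma_+ = 2 [[0, e^{i phi}], [0, 0]]."""
    e = cmath.exp(1j * phi)
    sp = ((0, 2 * e), (0, 0))
    sm = ((0, 0), (2 * e.conjugate(), 0))  # conjugate transpose of sp
    sx = tuple(tuple((sp[i][j] + sm[i][j]) / 2 for j in range(2)) for i in range(2))
    sy = tuple(tuple((sp[i][j] - sm[i][j]) / 2j for j in range(2)) for i in range(2))
    return sx, sy

phi = 0.7  # arbitrary sample phase
sx, sy = sigma_xy(phi)
sx_shifted, _ = sigma_xy(phi - math.pi / 2)

two_i_sigma_z = ((2j, 0), (0, -2j))

same_form = close(sy, sx_shifted)                    # sigma_y(phi) == sigma_x(phi - pi/2)
algebra = close(commutator(sx, sy), two_i_sigma_z)   # [sigma_x, sigma_y] = 2 i sigma_z
standard = close(sigma_xy(0.0)[1], ((0, -1j), (1j, 0)))  # phi = 0 gives the usual sigma_y
print(same_form, algebra, standard)
```

Setting `phi = 0.0` recovers the conventional matrices, so the last check ties this back to the standard form.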

It’s a small detail, but I thought it was kind of cool that the orthogonality of these matrix unit vector representations is built directly into the structure of their matrix representations.

References

[1] BR Desai. Quantum mechanics with basic field theory. Cambridge University Press, 2009.