Chapter 5: Relativistic Quantum Mechanics

This chapter covers relativistic quantum mechanics, including the Klein-Gordon equation and the Dirac equation.

Natural Units: In this chapter, we set \(\hbar = c = k_B = 1\) for simplicity. As a consequence, length and time both have the dimension of inverse energy, and the three constants are dimensionless:

\([L] = [T] = [E]^{-1}\)

\([k_B] = [\hbar] = [c] = 1\)

Some basic equations can be rewritten in natural units for cleaner expressions:

Schrödinger equation: \(i \frac{\partial}{\partial t} \ket{\psi} = H \ket{\psi}\)

Energy-momentum relation: \(E^2 = p^2 + m^2\)
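To turn a natural-unit result back into ordinary units, one restores the appropriate power of \(\hbar c\). The sketch below illustrates this for a length given in \(\mathrm{MeV}^{-1}\). The function name `inverse_mev_to_fm` is an illustrative choice, and the constants are rounded reference values rather than part of these notes.

```python
# Conversion constant (rounded reference value): hbar * c in MeV * fm.
HBAR_C_MEV_FM = 197.3269804

def inverse_mev_to_fm(length_in_inverse_mev: float) -> float:
    """Convert a length given in MeV^-1 (natural units) to femtometres."""
    return length_in_inverse_mev * HBAR_C_MEV_FM

# Electron mass in MeV (rounded); in natural units 1/m is its reduced Compton wavelength.
ELECTRON_MASS_MEV = 0.51099895
compton_fm = inverse_mev_to_fm(1.0 / ELECTRON_MASS_MEV)
print(f"reduced Compton wavelength of the electron: {compton_fm:.1f} fm")
```

The same pattern works for times (divide the length by \(c\)) since \([L]=[T]\) in these units.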

Klein-Gordon Equation

The Klein-Gordon (K-G) equation arises from a naive relativistic generalization of the Schrödinger equation. We start with the relativistic energy-momentum relation:

\[ E^2 = p^2 + m^2 \]

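Taking the positive root \(H=\sqrt{m^2+\hat{p}^2}\) as the Hamiltonian in the Schrödinger equation gives: \[ i \frac{\partial}{\partial t} \ket{\psi} = \sqrt{m^2 + \hat{p}^2} \ket{\psi} \]

The square root of an operator is awkward to work with, so we apply \(i\partial_t\) to both sides once more, which removes it: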

\[ - \partial_t^2 \ket{\psi(t)} = (m^2 + \hat{p}^2) \ket{\psi(t)} \]

In the position representation \(\hat{p}\to -i\nabla\), so \(\hat{p}^2\to-\nabla^2\) and this becomes:

\[ \left(-\partial_t^2 + \nabla^2 - m^2\right) \psi(t,\vec{x}) = 0 \]
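As a quick numerical sanity check, the sketch below applies the position-space operator \(-\partial_t^2+\partial_x^2-m^2\) (which follows from the squared equation with \(\hat p^2\to-\nabla^2\)) to a plane wave in one spatial dimension, using central finite differences. The helper names, the sample point, and the step size `h` are illustrative assumptions. The residual should vanish only when \(E^2=p^2+m^2\).

```python
import cmath
import math

def plane_wave(t, x, E, p):
    """Plane wave exp(-i(E t - p x)) in 1+1 dimensions."""
    return cmath.exp(-1j * (E * t - p * x))

def kg_residual(E, p, m, t=0.3, x=-0.7, h=1e-4):
    """Central-difference estimate of (-d_t^2 + d_x^2 - m^2) psi at (t, x)."""
    psi = lambda tt, xx: plane_wave(tt, xx, E, p)
    d2t = (psi(t + h, x) - 2 * psi(t, x) + psi(t - h, x)) / h**2
    d2x = (psi(t, x + h) - 2 * psi(t, x) + psi(t, x - h)) / h**2
    return -d2t + d2x - m**2 * psi(t, x)

m, p = 1.0, 0.75
E_on_shell = math.sqrt(p**2 + m**2)
print(abs(kg_residual(E_on_shell, p, m)))   # tiny: limited by finite-difference error
print(abs(kg_residual(2.0, p, m)))          # order one: E = 2 is off shell
```

Both signs of \(E\) pass the on-shell check, which is the origin of the negative-energy solutions discussed later.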

Tensor / Index Notation (Conventions)

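To write the Klein–Gordon equation in Lorentz-covariant form, we introduce index (tensor) notation for four-dimensional spacetime. This section uses the mostly-plus metric signature: \[ \eta_{\mu\nu} = \mathrm{diag}(-1,+1,+1,+1). \] (The Dirac-equation sections later in the chapter use \(\mathrm{diag}(1,-1,-1,-1)\) instead, as stated there.)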

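1) Indices and the Einstein summation convention

Index ranges: Greek letters \(\mu,\nu,\rho,\dots=0,1,2,3\); Latin letters \(i,j,k=1,2,3\).
Einstein summation convention: an index that appears once as a superscript and once as a subscript is summed over, for example \[ a_\mu b^\mu \equiv \sum_{\mu=0}^3 a_\mu b^\mu. \]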

2) Four-vectors and the inner product

In natural units (\(c=1\)) the position four-vector is \[ x^\mu=(t,\,\vec{x}),\qquad x^0=t,\ x^i\ (i=1,2,3). \] Lowering the index with the metric: \[ x_\mu = \eta_{\mu\nu}x^\nu\quad\Rightarrow\quad x_0=-t,\ x_i=x^i. \]

The Lorentz-invariant inner product of any two four-vectors \(a^\mu=(a^0,\vec a)\) and \(b^\mu=(b^0,\vec b)\) is defined as \[ a\cdot b \equiv \eta_{\mu\nu}a^\mu b^\nu = a_\mu b^\mu = -a^0 b^0 + \vec a\cdot\vec b. \]

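3) Raising and lowering indices

Lowering: \(a_\mu=\eta_{\mu\nu}a^\nu\)
Raising: \(a^\mu=\eta^{\mu\nu}a_\nu\), where \(\eta^{\mu\nu}\) is the inverse metric; its components equal those of \(\eta_{\mu\nu}\) because this diagonal matrix is its own inverse.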

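4) Four-momentum and the covariant energy-momentum relation

Define the four-momentum \[ p^\mu=(E,\,\vec p). \] Then \[ p_\mu p^\mu = \eta_{\mu\nu}p^\mu p^\nu = -E^2+\vec p^{\,2}. \] The relativistic energy-momentum relation \(E^2=\vec p^{\,2}+m^2\) is therefore equivalent to \[ p_\mu p^\mu + m^2 = 0. \]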
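The mostly-plus inner product and the relation \(p_\mu p^\mu=-m^2\) can be checked numerically. The sketch below is a minimal NumPy illustration; `ETA`, `minkowski_dot`, and `boost_x` are names chosen here, and the boost along \(x\) with velocity \(v=0.6\) is an arbitrary example.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # mostly-plus metric eta_{mu nu}

def minkowski_dot(a, b):
    """a . b = eta_{mu nu} a^mu b^nu for contravariant components a^mu, b^mu."""
    return a @ ETA @ b

def boost_x(v):
    """Lorentz boost along x with velocity v (|v| < 1, c = 1)."""
    g = 1.0 / np.sqrt(1.0 - v**2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = g
    L[0, 1] = L[1, 0] = -g * v
    return L

m = 2.0
p_vec = np.array([0.3, -1.2, 0.5])
E = np.sqrt(p_vec @ p_vec + m**2)
p = np.concatenate(([E], p_vec))      # p^mu = (E, p_vec)
p_boosted = boost_x(0.6) @ p

# Both values should equal -m^2 up to rounding: the inner product is frame independent.
print(minkowski_dot(p, p), minkowski_dot(p_boosted, p_boosted))
```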

5) Four-derivative and the d’Alembertian

Define the four-derivative \[ \partial_\mu \equiv \frac{\partial}{\partial x^\mu}=(\partial_t,\,\nabla), \] and, after raising the index, \[ \partial^\mu \equiv \eta^{\mu\nu}\partial_\nu = (-\partial_t,\,\nabla). \]

The d’Alembertian operator is defined as \[ \Box \equiv \partial_\mu\partial^\mu = -\partial_t^2+\nabla^2. \]

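6) Covariant form of the Klein–Gordon equation

Apply the quantization rule \(p_\mu\to -i\partial_\mu\) (equivalently \(p^\mu\to -i\partial^\mu\), that is, \(E\to i\partial_t\) and \(\vec p\to -i\nabla\)) to \(p_\mu p^\mu + m^2 = 0\). This gives \[ \left(-\partial_\mu\partial^\mu + m^2\right)\psi(x)=0, \] that is, \[ \left(\Box - m^2\right)\psi(x)=0. \] In coordinate form this reads \[ \left(-\partial_t^2+\nabla^2-m^2\right)\psi(t,\vec x)=0, \] in agreement with the squared equation \(-\partial_t^2\psi=(m^2+\hat p^2)\psi\) once \(\hat p^2\to-\nabla^2\).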

Probability Density and Current

To construct a continuity equation \(\partial_t \rho + \nabla\cdot\vec{j} = 0\), we repeat the procedure used for the Schrödinger equation, now applied to the Klein-Gordon equation.

Start with the KG equation and its complex conjugate:

\[ (-\partial_t^2 + \nabla^2 - m^2)\psi = 0, \quad (-\partial_t^2 + \nabla^2 - m^2)\psi^* = 0 \]

Multiply the first by \(\psi^*\) and the second by \(\psi\), then subtract:

\[ \psi^*(\partial_t^2\psi) - \psi(\partial_t^2\psi^*) = \psi^*\nabla^2\psi - \psi\nabla^2\psi^* \]

The left side is \(\partial_t(\psi^*\dot\psi - \psi\dot\psi^*)\) and the right side is \(\nabla\cdot(\psi^*\nabla\psi - \psi\nabla\psi^*)\). So we obtain the continuity equation \(\partial_t\rho + \nabla\cdot\vec{j} = 0\) with:

\[ \boxed{ \rho = \frac{i}{2m}(\psi^*\dot\psi - \psi\dot\psi^*), \quad \vec{j} = \frac{1}{2mi}(\psi^*\nabla\psi - \psi\nabla\psi^*) } \]

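(The \(1/(2m)\) factor is chosen so that \(\rho\) reduces to \(|\psi|^2\) in the non-relativistic limit.) In covariant notation, with the signature \(\eta=\mathrm{diag}(1,-1,-1,-1)\) so that \(\partial^\mu=(\partial_t,-\nabla)\):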

\[ j^\mu = \frac{i}{2m}(\psi^*\partial^\mu\psi - \psi\partial^\mu\psi^*), \quad \partial_\mu j^\mu = 0 \]

The problem with \(\rho\): For a plane wave solution \(\psi = N e^{-i(Et - \vec{p}\cdot\vec{x})}\):

\[ \rho = \frac{E}{m}|N|^2 \]

Since the KG equation admits both positive-energy (\(E = +\sqrt{p^2 + m^2}\)) and negative-energy (\(E = -\sqrt{p^2 + m^2}\)) solutions, \(\rho\) can be negative. This means \(\rho\) cannot be interpreted as a probability density.
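The sign problem is easy to reproduce numerically. The sketch below evaluates \(\rho=\frac{i}{2m}(\psi^*\dot\psi-\psi\dot\psi^*)\) for a one-dimensional plane wave with either sign of \(E\); the function names and the sample point \((t,x)\) are illustrative assumptions.

```python
import cmath
import math

def kg_density(psi, dpsi_dt, m):
    """rho = (i / 2m) (psi* dpsi/dt - psi dpsi*/dt); the result is real."""
    value = (1j / (2 * m)) * (psi.conjugate() * dpsi_dt - psi * dpsi_dt.conjugate())
    return value.real

def plane_wave_density(E, p, m, N=1.0, t=0.2, x=1.5):
    psi = N * cmath.exp(-1j * (E * t - p * x))
    dpsi_dt = -1j * E * psi  # exact time derivative of the plane wave
    return kg_density(psi, dpsi_dt, m)

m, p = 1.0, 0.5
E = math.sqrt(p**2 + m**2)
rho_pos = plane_wave_density(+E, p, m)   # equals +E/m for N = 1
rho_neg = plane_wave_density(-E, p, m)   # equals -E/m for N = 1
print(rho_pos, rho_neg)
```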

This was historically seen as a fatal flaw of the Klein-Gordon equation and motivated Dirac to seek a first-order relativistic wave equation. In modern QFT, the Klein-Gordon equation is reinterpreted as a classical field equation: \(\rho\) becomes a charge density (which can be negative for antiparticles), and probability is recovered through the quantum field theory framework.

Dirac Equation

Limitations of the Klein-Gordon equation: the K-G equation is second order in time, so \(\psi\) and \(\partial_t\psi\) must both be specified as initial data and the conserved density \(\rho\) is not positive definite. This makes it difficult to interpret \(\psi\) as a probability amplitude.

Dirac therefore looked for a first-order differential equation whose square gives the Klein-Gordon equation. Assuming the equation is linear in \(\partial_\mu\) leads directly to the gamma-matrix structure.

Linearization

Let’s make a guess: \[ (i \gamma^\mu \partial_\mu - m)\psi(x, t) = 0 \] where \(\gamma^\mu\) are some as-yet-unknown constant matrices (independent of \(x\)). Here the superscript \(\mu=0,1,2,3\) is a Lorentz index, so
- \(\gamma^0\) is one matrix,
- \(\gamma^1,\gamma^2,\gamma^3\) are three further matrices.

Together they form a set of four matrices acting on a multi-component “spinor” \(\psi\) (later we will see that the minimal representation is 4-dimensional, so \(\psi\) has 4 components).

This is just a “linearization” assumption: we assume there exist matrices \(\gamma^\mu\) such that the square of the linear differential operator \(i\gamma^\mu\partial_\mu-m\) gives the second-order operator corresponding to the Klein–Gordon equation. The mass \(m\) is generally identified with the particle’s mass (which is already evident from the Klein–Gordon equation).

Act from the left with the operator \((-i \gamma^\nu \partial_\nu - m)\): \[ (-i \gamma^\nu \partial_\nu - m)(i \gamma^\mu \partial_\mu - m)\psi(x, t) = 0 \]

The cross terms linear in \(m\) cancel, leaving: \[ (\gamma^\nu \partial_\nu \gamma^\mu \partial_\mu + m^2) \psi(x, t) = 0 \]

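Compare this with the K-G equation, which in the signature \(\eta=\mathrm{diag}(1,-1,-1,-1)\) used from here on reads \((\partial_\mu\partial^\mu + m^2)\psi(x)=0\). The two agree if \[ \gamma^\nu \gamma^\mu \partial_\nu \partial_\mu = \partial_\mu\partial^\mu = \eta^{\mu\nu}\partial_\nu\partial_\mu. \] Since \(\partial_\nu\partial_\mu\) is symmetric in its indices, only the symmetric part of \(\gamma^\nu\gamma^\mu\) contributes, \(\gamma^\nu\gamma^\mu\partial_\nu\partial_\mu=\tfrac12\{\gamma^\nu,\gamma^\mu\}\partial_\nu\partial_\mu\), so the condition is \[ \gamma^\nu \gamma^\mu + \gamma^\mu \gamma^\nu = 2\eta^{\mu\nu} I. \]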

This is the defining relation of a Clifford algebra. Written out component by component, it says \[ \begin{cases} (\gamma^0)^2 = I \\ (\gamma^i)^2 = -I \quad (i=1,2,3) \\ \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 0 \quad (\mu\ne \nu) \end{cases} \]

Can these relations be satisfied by matrices smaller than \(4\times4\)? No: in \(3+1\) dimensions (over \(\mathbb C\)) the Clifford algebra generated by \(\{\gamma^\mu,\gamma^\nu\}=2\eta^{\mu\nu}\) has a minimal faithful matrix representation of dimension \(4\).

Now we can write the Dirac equation in a form that looks like a Schrödinger equation by separating time and space components (using \(\partial_0\equiv \partial_t\)): \[ i\gamma^0\partial_t\psi+i\gamma^i\partial_i\psi-m\psi=0. \] Multiply from the left by \(\gamma^0\) and use \((\gamma^0)^2=I\): \[ i\partial_t\psi+ i\gamma^0\gamma^i\partial_i\psi-m\gamma^0\psi=0. \] Move the spatial and mass terms to the right-hand side: \[ i\partial_t\psi=\left(-i\,\gamma^0\gamma^i\partial_i+m\gamma^0\right)\psi. \]

This motivates the definitions \[ \boxed{ \alpha^i\equiv \gamma^0\gamma^i,\qquad \beta\equiv \gamma^0. } \] Then the Dirac equation becomes \[ \boxed{ i\partial_t\psi=\left(-i\,\alpha^i\partial_i+\beta m\right)\psi =\left(-i\,\vec\alpha\cdot\nabla+\beta m\right)\psi. } \]

From the Clifford algebra \(\{\gamma^\mu,\gamma^\nu\}=2\eta^{\mu\nu}\) (with \(\eta=\mathrm{diag}(1,-1,-1,-1)\)), we can derive the relations among \(\alpha^i\) and \(\beta\).

\(\beta^2=I\): \[ \beta^2=(\gamma^0)^2=I. \]
\(\{\alpha^i,\beta\}=0\): \[ \alpha^i\beta+\beta\alpha^i =\gamma^0\gamma^i\gamma^0+\gamma^0\gamma^0\gamma^i =\gamma^0\left(\gamma^i\gamma^0+\gamma^0\gamma^i\right) =\gamma^0\{\gamma^i,\gamma^0\}=0. \]
\(\{\alpha^i,\alpha^j\}=2\delta^{ij}\): Use \(\{\gamma^0,\gamma^i\}=0\) to note \(\gamma^0\gamma^i\gamma^0=-\gamma^i\), then \[ \alpha^i\alpha^j+\alpha^j\alpha^i =\gamma^0\gamma^i\gamma^0\gamma^j+\gamma^0\gamma^j\gamma^0\gamma^i =-\gamma^i\gamma^j-\gamma^j\gamma^i =-\{\gamma^i,\gamma^j\}. \] But \(\{\gamma^i,\gamma^j\}=2\eta^{ij}=-2\delta^{ij}\), hence \[ \{\alpha^i,\alpha^j\}=2\delta^{ij}. \]

So the \(\alpha^i\) and \(\beta\) matrices satisfy \[ \boxed{ \beta^2=I,\qquad \{\alpha^i,\beta\}=0,\qquad \{\alpha^i,\alpha^j\}=2\delta^{ij}. } \]

By definition \(\gamma^0=\beta\). Also, since \(\alpha^i=\gamma^0\gamma^i=\beta\gamma^i\) and \(\beta^2=I\), \[ \gamma^i=\beta\alpha^i. \] Equivalently, \[ \gamma^\mu=(\gamma^0,\gamma^i)=(\beta,\,\beta\alpha^i). \]

Dirac Representation

So we have derived the relations between the \(\gamma\), \(\alpha\) and \(\beta\). However, we still need some additional conditions to fix the matrix representations.

The key physical requirement is that the Dirac equation should define a sensible quantum time evolution: the Hamiltonian must be Hermitian. This is ensured if we assume \[ (\alpha^i)^\dagger=\alpha^i,\qquad \beta^\dagger=\beta. \]

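However, this constraint alone does not fully determine the matrix representation. A standard result (Pauli's fundamental theorem) states that any two sets of \(4\times4\) matrices satisfying the same Clifford algebra are related by a similarity transformation, which can be chosen unitary when both sets satisfy the Hermiticity conditions above.

Gamma Matrix Fundamental Theorem.
Any two sets of \(4\times4\) matrices \(\{\gamma^\mu\}\) and \(\{\gamma'^\mu\}\) that satisfy the same Clifford algebra
\[ \{\gamma^\mu,\gamma^\nu\} = 2\eta^{\mu\nu}I, \qquad \{\gamma'^\mu,\gamma'^\nu\} = 2\eta^{\mu\nu}I \] are related by a similarity transformation. That is, there exists an invertible matrix \(S\), unique up to an overall factor, such that
\[ \gamma'^\mu = S\,\gamma^\mu\,S^{-1}, \qquad \forall\,\mu. \] If both sets satisfy \((\gamma^0)^\dagger=\gamma^0\) and \((\gamma^i)^\dagger=-\gamma^i\) (equivalently, \(\alpha^i\) and \(\beta\) are Hermitian), then \(S\) can be chosen unitary.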

The remaining freedom is therefore only a choice of basis. The Dirac representation is the basis in which \(\beta\) is diagonal.

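Since \(\beta^2=I\), the eigenvalues of \(\beta\) are \(\pm1\). Moreover, using \((\alpha^i)^2=I\), \(\{\alpha^i,\beta\}=0\), and cyclicity of the trace (no sum over \(i\)), \[ \mathrm{tr}\,\beta=\mathrm{tr}(\alpha^i\alpha^i\beta)=-\mathrm{tr}(\alpha^i\beta\alpha^i)=-\mathrm{tr}(\beta\alpha^i\alpha^i)=-\mathrm{tr}\,\beta, \] so \(\mathrm{tr}\,\beta=0\) and the eigenvalues \(+1\) and \(-1\) occur equally often. In four dimensions we therefore have: \[ \boxed{ \beta=\begin{pmatrix} I_2 & 0\\ 0 & -I_2 \end{pmatrix} } \]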

Now write \(\alpha^i\) in \(2\times2\) block form, \[ \alpha^i=\begin{pmatrix} A^i & B^i\\ C^i & D^i \end{pmatrix}. \] The anticommutation \(\{\alpha^i,\beta\}=0\) gives \[ \beta\alpha^i+\alpha^i\beta=0 \quad\Longrightarrow\quad A^i=D^i=0, \] so each \(\alpha^i\) must be off-diagonal: \[ \alpha^i=\begin{pmatrix} 0 & B^i\\ C^i & 0 \end{pmatrix}. \] Hermiticity \((\alpha^i)^\dagger=\alpha^i\) then implies \(C^i=(B^i)^\dagger\).

Next impose \(\{\alpha^i,\alpha^j\}=2\delta^{ij}\). Using the block form, \[ \alpha^i\alpha^j =\begin{pmatrix} B^iC^j & 0\\ 0 & C^iB^j \end{pmatrix}, \] so the upper-left block condition reads \[ B^i(B^j)^\dagger+B^j(B^i)^\dagger=2\delta^{ij}I_2. \] Up to unitary redefinitions inside each \(2\times2\) block, the simplest choice is \[ B^i=C^i=\sigma^i, \] where \(\sigma^i\) are the Pauli matrices. Therefore \[ \boxed{ \alpha^i=\begin{pmatrix} 0 & \sigma^i\\ \sigma^i & 0 \end{pmatrix} } \]

Finally, the corresponding \(\gamma\) matrices are \[ \gamma^0=\beta,\qquad \gamma^i=\beta\alpha^i =\begin{pmatrix} 0 & \sigma^i\\ -\sigma^i & 0 \end{pmatrix}. \]
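The algebra above can be verified numerically. The following sketch builds \(\beta\), \(\alpha^i\), and \(\gamma^\mu\) in the Dirac representation with NumPy and checks the Clifford relation for \(\eta=\mathrm{diag}(1,-1,-1,-1)\) together with the \(\alpha\), \(\beta\) relations. Variable names such as `SIGMA` and `anticommutator` are illustrative choices.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # signature used in the Dirac sections

beta = np.block([[I2, Z2], [Z2, -I2]])
alpha = [np.block([[Z2, s], [s, Z2]]) for s in SIGMA]
gamma = [beta] + [beta @ a for a in alpha]  # gamma^0 = beta, gamma^i = beta alpha^i

def anticommutator(a, b):
    return a @ b + b @ a

I4 = np.eye(4)
clifford_ok = all(
    np.allclose(anticommutator(gamma[mu], gamma[nu]), 2 * ETA[mu, nu] * I4)
    for mu in range(4) for nu in range(4)
)
alpha_beta_ok = all(np.allclose(anticommutator(a, beta), 0) for a in alpha) and all(
    np.allclose(anticommutator(alpha[i], alpha[j]), 2 * (i == j) * I4)
    for i in range(3) for j in range(3)
)
hermitian_ok = all(np.allclose(M, M.conj().T) for M in alpha + [beta])
print(clifford_ok, alpha_beta_ok, hermitian_ok)
```

All three flags should come out `True`; changing a sign in any block makes at least one of them fail, which is a convenient way to catch transcription errors.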

Majorana Representation

In the Majorana representation, one chooses a basis in which all \(\gamma^\mu\) are purely imaginary (for the metric \(\eta=\mathrm{diag}(1,-1,-1,-1)\)). Then the Dirac operator \(i\gamma^\mu\partial_\mu-m\) is real, so it is consistent to impose a reality condition on the spinor (a Majorana spinor).

One explicit Majorana basis (written using Pauli matrices) is \[ \gamma^0=\begin{pmatrix} 0 & \sigma^2\\ \sigma^2 & 0 \end{pmatrix},\qquad \gamma^1=\begin{pmatrix} i\sigma^3 & 0\\ 0 & i\sigma^3 \end{pmatrix},\qquad \gamma^2=\begin{pmatrix} 0 & -\sigma^2\\ \sigma^2 & 0 \end{pmatrix},\qquad \gamma^3=\begin{pmatrix} -i\sigma^1 & 0\\ 0 & -i\sigma^1 \end{pmatrix}. \] It is straightforward to check that these satisfy \[ \{\gamma^\mu,\gamma^\nu\}=2\eta^{\mu\nu},\qquad (\gamma^0)^2=I,\qquad (\gamma^i)^2=-I, \] and all entries are purely imaginary.
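The "straightforward check" can be delegated to a few lines of NumPy. This sketch enters the four matrices exactly as written above and tests the Clifford relation, the purely imaginary entries, and the reality of \(i\gamma^\mu\); the variable names are illustrative.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
ETA = np.diag([1.0, -1.0, -1.0, -1.0])

gamma_majorana = [
    np.block([[Z2, s2], [s2, Z2]]),
    np.block([[1j * s3, Z2], [Z2, 1j * s3]]),
    np.block([[Z2, -s2], [s2, Z2]]),
    np.block([[-1j * s1, Z2], [Z2, -1j * s1]]),
]

clifford_ok = all(
    np.allclose(
        gamma_majorana[mu] @ gamma_majorana[nu] + gamma_majorana[nu] @ gamma_majorana[mu],
        2 * ETA[mu, nu] * np.eye(4),
    )
    for mu in range(4) for nu in range(4)
)
purely_imaginary = all(np.allclose(g.real, 0) for g in gamma_majorana)
dirac_operator_real = all(np.allclose((1j * g).imag, 0) for g in gamma_majorana)
print(clifford_ok, purely_imaginary, dirac_operator_real)
```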

Examples

Free Particle with Zero Momentum

At zero momentum the Dirac equation \(i\partial_t\psi=(-i\,\vec\alpha\cdot\nabla+\beta m)\psi\) reduces to \[ i \partial_t \ket{\psi(t)} = \beta m \ket{\psi(t)}, \quad \text{where } \beta = \begin{pmatrix} I_2 & 0 \\ 0 & -I_2 \end{pmatrix}.\]

The Hamiltonian \(\beta m\) is diagonal with eigenvalues \(\pm m\), so the stationary states are its eigenvectors multiplied by the phases \(e^{\mp imt}\):

\[ \left\{ \begin{aligned} \ket{\psi_1(t)} &= e^{-imt} (1 \quad 0 \quad 0 \quad 0)^T \\ \ket{\psi_2(t)} &= e^{-imt} (0 \quad 1 \quad 0 \quad 0)^T \\ \ket{\psi_3(t)} &= e^{imt} (0 \quad 0 \quad 1 \quad 0)^T \\ \ket{\psi_4(t)} &= e^{imt} (0 \quad 0 \quad 0 \quad 1)^T \end{aligned} \right. \]

Free Particle with z-axis Momentum

Consider a free particle with momentum \(\vec{p} = (0,0,p)\) along the z-axis (\(p > 0\)). We seek plane-wave solutions \(\psi(x) = u\, e^{-ip\cdot x}\) with \(p^\mu = (E, 0, 0, p)\) and \(E = +\sqrt{p^2 + m^2} > 0\).

The Dirac equation in momentum space requires:

\[ (\gamma^\mu p_\mu - m)u = 0 \]

In the Dirac representation, writing \(u = \begin{pmatrix} \phi \\ \chi \end{pmatrix}\) (each a 2-component spinor), the equation becomes:

\[ \begin{pmatrix} (E - m)I_2 & -\vec{\sigma}\cdot\vec{p} \\ \vec{\sigma}\cdot\vec{p} & -(E+m)I_2 \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} = 0 \]

From the lower block: \(\chi = \frac{\vec{\sigma}\cdot\vec{p}}{E + m}\phi\). For \(\vec{p} = p\hat{z}\), \(\vec{\sigma}\cdot\vec{p} = p\sigma_z = p\begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix}\).

Choosing \(\phi\) to be spin-up or spin-down along \(z\):

Positive-energy (particle) solutions \(u_s(p)e^{-ip\cdot x}\):

\[ u_1(p) = N\begin{pmatrix} 1 \\ 0 \\ \frac{p}{E+m} \\ 0 \end{pmatrix}, \quad u_2(p) = N\begin{pmatrix} 0 \\ 1 \\ 0 \\ \frac{-p}{E+m} \end{pmatrix} \]

Negative-energy (antiparticle) solutions: write \(\psi = v\, e^{+ip\cdot x}\) with the same \(p^\mu=(E,0,0,p)\), \(E>0\). Substituting into the Dirac equation gives \((\gamma^\mu p_\mu + m)v = 0\), whose upper block yields \(\phi = \frac{\vec{\sigma}\cdot\vec{p}}{E+m}\chi\):

\[ v_1(p) = N\begin{pmatrix} \frac{p}{E+m} \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad v_2(p) = N\begin{pmatrix} 0 \\ \frac{-p}{E+m} \\ 0 \\ 1 \end{pmatrix} \]

Normalization: The conventional covariant normalization is \(\bar{u}_s u_{s'} = 2m\,\delta_{ss'}\) (where \(\bar{u} = u^\dagger\gamma^0\)). This fixes:

\[ N = \sqrt{E + m} \]

so the explicit normalized spinors are:

\[ \boxed{ u_1(p) = \begin{pmatrix} \sqrt{E+m} \\ 0 \\ \sqrt{E-m} \\ 0 \end{pmatrix}, \quad u_2(p) = \begin{pmatrix} 0 \\ \sqrt{E+m} \\ 0 \\ -\sqrt{E-m} \end{pmatrix} } \]

where we used \(\frac{p\sqrt{E+m}}{E+m} = \frac{p}{\sqrt{E+m}} = \sqrt{E-m}\) (since \(p^2 = E^2 - m^2\)). In the non-relativistic limit (\(p \ll m\)), \(\sqrt{E-m} \to 0\) and the lower components vanish, recovering the two-component Pauli spinor. In the ultra-relativistic limit (\(p \gg m\)), the upper and lower components become equal in magnitude.
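A numerical spot check of the boxed spinors, as a minimal sketch: it uses the Dirac-representation \(\gamma^0\) and \(\gamma^3\), picks the illustrative values \(m=1\), \(p=0.75\) (so \(E=1.25\)), and tests both \((\gamma^\mu p_\mu-m)u=0\) and \(\bar u_s u_{s'}=2m\,\delta_{ss'}\). The helper `bar` and the variable names are assumptions made for this example.

```python
import numpy as np

SIGMA_Z = np.diag([1.0, -1.0])
I2, Z2 = np.eye(2), np.zeros((2, 2))
GAMMA0 = np.block([[I2, Z2], [Z2, -I2]])
GAMMA3 = np.block([[Z2, SIGMA_Z], [-SIGMA_Z, Z2]])

m, p = 1.0, 0.75
E = np.sqrt(p**2 + m**2)
p_slash = GAMMA0 * E - GAMMA3 * p    # gamma^mu p_mu for momentum along z

a, b = np.sqrt(E + m), np.sqrt(E - m)
u1 = np.array([a, 0.0, b, 0.0])
u2 = np.array([0.0, a, 0.0, -b])

def bar(u):
    """Dirac adjoint u^dagger gamma^0, returned as a row vector."""
    return u.conj() @ GAMMA0

for u in (u1, u2):
    # Each line should report True and a value close to 2m.
    print(np.allclose((p_slash - m * np.eye(4)) @ u, 0.0), bar(u) @ u)
```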

Completeness and spin sums: With \(N=\sqrt{E+m}\) for both \(u_s\) and \(v_s\), the spinors satisfy:

\[ \sum_{s=1}^2 u_s(p)\bar{u}_s(p) = \gamma^\mu p_\mu + m, \quad \sum_{s=1}^2 v_s(p)\bar{v}_s(p) = \gamma^\mu p_\mu - m \]

These relations are essential for computing cross sections in quantum field theory.
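The spin sums can be confirmed for momentum along \(z\) with a short NumPy sketch. It assumes the normalization \(N=\sqrt{E+m}\) for both \(u_s\) and \(v_s\) and the Dirac representation; `spin_sum` and the numerical values of \(m\) and \(p\) are illustrative.

```python
import numpy as np

SIGMA_Z = np.diag([1.0, -1.0])
I2, Z2 = np.eye(2), np.zeros((2, 2))
GAMMA0 = np.block([[I2, Z2], [Z2, -I2]])
GAMMA3 = np.block([[Z2, SIGMA_Z], [-SIGMA_Z, Z2]])

m, p = 1.0, 0.75
E = np.sqrt(p**2 + m**2)
p_slash = GAMMA0 * E - GAMMA3 * p
a, b = np.sqrt(E + m), np.sqrt(E - m)

u = [np.array([a, 0.0, b, 0.0]), np.array([0.0, a, 0.0, -b])]
v = [np.array([b, 0.0, a, 0.0]), np.array([0.0, -b, 0.0, a])]

def spin_sum(spinors):
    """Sum over s of w_s (w_s^dagger gamma^0), a 4x4 matrix."""
    return sum(np.outer(w, w.conj() @ GAMMA0) for w in spinors)

u_sum_ok = np.allclose(spin_sum(u), p_slash + m * np.eye(4))
v_sum_ok = np.allclose(spin_sum(v), p_slash - m * np.eye(4))
print(u_sum_ok, v_sum_ok)
```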

Helicity Operator

Definition: Helicity is the projection of the (spin) angular momentum along the direction of momentum.

For a two-component Pauli spinor, a common convention is \[ \hat h \equiv \frac{\vec\sigma\cdot\vec p}{|\vec p|},\qquad (\vec p\neq 0) \] whose eigenvalues are \(\pm 1\).
For a 4-component Dirac spinor, the spin operator is built from \[ \vec\Sigma\equiv\begin{pmatrix} \vec\sigma & 0\\ 0 & \vec\sigma \end{pmatrix}, \] so the helicity operator is \[ \hat h \equiv \frac{\vec\Sigma\cdot\vec p}{|\vec p|}. \]

For momentum along the \(z\)-axis, \(\vec p=(0,0,p_z)\), we have \[ \hat h = \frac{\Sigma^3 p_z}{|p_z|}=\operatorname{sgn}(p_z)\,\Sigma^3, \] so the helicity eigenvalues are still \(\pm 1\), and the eigenstates are simply “spin up/down along \(z\)”.
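The sketch below builds the helicity operator for an arbitrary momentum and checks two facts used here: its eigenvalues are \(\pm1\) (each twice), and it commutes with the free Dirac Hamiltonian \(H=\vec\alpha\cdot\vec p+\beta m\), so energy eigenstates can also be chosen as helicity eigenstates. The function names and the sample momentum are illustrative assumptions.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def helicity(p_vec):
    """h = Sigma . p / |p| as a 4x4 matrix (Dirac representation)."""
    p_hat = p_vec / np.linalg.norm(p_vec)
    s = sum(c * sig for c, sig in zip(p_hat, SIGMA))
    return np.block([[s, Z2], [Z2, s]])

def hamiltonian(p_vec, m):
    """H = alpha . p + beta m in the Dirac representation."""
    sp = sum(c * sig for c, sig in zip(p_vec, SIGMA))
    return np.block([[m * I2, sp], [sp, -m * I2]])

p_vec, m = np.array([0.3, -0.4, 1.2]), 1.0
h, H = helicity(p_vec), hamiltonian(p_vec, m)
print(np.linalg.eigvalsh(h))          # two eigenvalues near -1 and two near +1
print(np.allclose(h @ H - H @ h, 0))  # helicity commutes with H
```

For \(\vec p=(0,0,p_z)\) the same function returns \(\operatorname{sgn}(p_z)\,\Sigma^3\), matching the formula above.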

Helicity is a property of the spin state relative to the momentum direction; it is not determined by whether the solution is a particle (\(u_s\)) or antiparticle (\(v_s\)), nor by the sign of the energy. Both \(u\) and \(v\) solutions can be chosen as helicity eigenstates.

Finally:

Massive case (\(m\neq 0\)): helicity is not Lorentz invariant; a boost to a frame that overtakes the particle reverses \(\vec p\) while leaving the spin direction unchanged, which flips the helicity.
Massless case (\(m=0\)): helicity becomes Lorentz invariant (for proper Lorentz transformations), and helicity eigenstates coincide with chirality eigenstates (related to \(\gamma^5\)).
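The massless statement can be illustrated with the \(z\)-axis spinors from the earlier example. The sketch assumes the Dirac representation, the convention \(\gamma^5=i\gamma^0\gamma^1\gamma^2\gamma^3\), and \(p_z>0\) so that the helicity operator is \(\Sigma^3\); the function names are illustrative. It compares \(\gamma^5u\) with \(\Sigma^3u\) for \(m=0\) and for \(m\neq0\).

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
gamma = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
SIGMA3_4 = np.block([[SIGMA[2], Z2], [Z2, SIGMA[2]]])  # helicity operator for p_z > 0

def u_spinors(p, m):
    """Positive-energy spinors for momentum p > 0 along z, with N = sqrt(E + m)."""
    E = np.sqrt(p**2 + m**2)
    a, b = np.sqrt(E + m), np.sqrt(E - m)
    return [np.array([a, 0, b, 0], dtype=complex), np.array([0, a, 0, -b], dtype=complex)]

def chirality_equals_helicity(p, m):
    return all(np.allclose(gamma5 @ u, SIGMA3_4 @ u) for u in u_spinors(p, m))

print(chirality_equals_helicity(0.75, 0.0))  # massless: gamma^5 acts like helicity
print(chirality_equals_helicity(0.75, 1.0))  # massive: the two operators differ
```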