Optimization, Game Theory, and Quantum Mechanics, Part II
This post is the second in what might be three-post sequence about framing things in quantum mechanics through ideas in computer science and mathematics. These were inspired by things I heard while sitting with my friend in Charman's OH for Physics 112, Statistical Mechanics, and some things I learned in CS 270 with Rao. I would highly recommend both classes; I haven't actually taken Physics 112 yet, but from what my friend says, it seems amazing. This blog post is independent of the first blog post. I would say that readers should be comfortable with
- Some basic Quantum Mechanics,
- Some background in game theory and optimization,
- Comfort with linear algebra and calculus,
A Brief Review of Mirror Descent
We begin with definitions of Bregman divergence and distance generating functions.Definition: Distance Generating Function (DGF)
A distance generating function $\phi$ on a convex set $X$ is strictly convex on $X$ and is continuously differentiable on $int(X)$.
Definition: Bregman Divergence
The Bregman Divergence is given by $D_{\phi}(x, y) = \phi(x) - \phi(y) - \langle \nabla \phi(y), x - y\rangle$ for $x \in X$ and $y \in int(X)$, where $\phi$ is a distance generating function on $X$.
MD might not be enough to describe QM
Let's examine time-evolution in quantum mechanics: \[\begin{align} i\hbar \partial_t \rho = [H, \rho ] \Rightarrow \partial_t \rho = -(i/\hbar) [H, \rho]\end{align}.\] To first order, a time $\delta t$ later, \[\begin{align} \rho' = \rho + \delta t \cdot \partial_t \rho = \rho - \frac{i \delta t}{\hbar} [H, \rho]. \end{align}\] Applying $\nabla\phi$ to both sides, we have \[ \nabla \phi (\rho') = \nabla \phi\left(\rho - \frac{i \delta t}{\hbar} [H, \rho]\right).\] Again, to first order in $\delta t$, we have \[ \nabla \phi(\rho') = \nabla\phi(\rho) - \frac{i \delta t}{\hbar} \nabla^2 \phi(\rho) \{ [H, \rho]\},\] where $\nabla^2 \phi$ is the Hessian and is hence a superoperator. Comparing this to the MD update, we must have \[ \lambda \nabla f(\rho) = \frac{i \delta t}{\hbar} \nabla^2 \phi(\rho) \{ [H, \rho]\}. \] Note that this is impossible, or at least very hard to do. Unitary time-evolution enforces that $\rho$ moves along a constant contour of $H$; hence the update, or the RHS of our equation, is more of a rotation step, and from QM, we know that it is possible to return to a prior state. However, if we keep following the MD update step, we cannot return to a prior state unless we are already at the minimum and all updates are nonexistent. This is a fundamental shortcoming of using MD to describe all of QM; as the updates are in fundamentally different directions, we cannot possibly capture unitary evolution.Note: Alternative Derivation to MD Update Condition
We can also derive the MD and QM update condition another way; if we say that the MD updates are
dependent on $\delta t$ (ignoring higher-order terms, and noting that we cannot have a constant term): \[ \begin{align} \nabla \phi(\rho(t + \delta t)) = \nabla \phi(\rho) - \lambda \delta t \nabla f(\rho) \end{align},\]
and we take the limit of $\delta t \rightarrow 0$, then we can rewrite the update equation as \[ \frac{d (\nabla \phi(\rho))}{d t} = \nabla^2 \phi (\rho) \{ \dot{\rho}\} =- \lambda \nabla f(\rho).\]
Using the equation for time-evolution yields the same equation. Note that we included the $\delta t$ implicity in $\lambda$ before.
You're spinnin' me around
My feet are off the ground
I don't know where I stand
"Sunday" by The Cranberries, 1993
God's playing dice: QM as a game
So where else do you have rotation? One place is in games. Let's try to get some intuition as to how this can occur. Suppose we have two players, $x$ and $y$, with a payoff function $L(x, y) = xy$, with $x$ minimizing and $y$ maximizing. $x$ updates $x \rightarrow x - \lambda \nabla_x L$, while $y$ updates $y \rightarrow y + \lambda \nabla_y L$. Making this a continuous-time process and evaluating the gradients, we have \[\begin{align} \dot{x} &= - \lambda y \\ \dot{y} &= \lambda x \end{align}\] The solution to this, $\textbf{r} = [x, y]^T$, traces a circle in the $xy$ plane. Wow! Now, framing QM, I thought, should be similar, just with different updates.Relativistic QM: Pigheadedness
There are some hints that suggest QM can be framed as a game. Initially, I had thought that each point in space was a player, and $\ket{\Psi}$ was the strategy vector of all players. But $\ket{\Psi}$ has to satisfy some constraints, and it's hard to encode that into a game.If all of $\ket{\Psi}$ is a strategy for a player, then we come to the question: who is the second player? In order to have that rotation, we need a second player. So, let's call the first player's strategy $\ket{\psi_1}$ and the second player's strategy $\ket{\psi_2}$. We'll say, without loss of generality, that $\psi_1$ aims to minimize ($\psi_2$ aims to maximize) a payoff function $L$; in the spirit of QM, this payoff function will be some sort of complex map ($L : L^2(\mathbb{R}) \times L^2(\mathbb{R}) \rightarrow \mathbb{C}$. Hence, the update rules are \[ \begin{align}\partial_t \psi_1 &= - \nabla_{\psi_1^{\dagger}} L \\ \partial_t \psi_2 &= + \nabla_{\psi_2^{\dagger}} L \end{align}\] We then make the following ansatze:
- Furthermore, $L$ can be written as \[ L = \sum_{i, j} \alpha_{ij} \psi_i^{\dagger} \partial_x \psi_j + \sum_{ij} \beta_{ij} \psi_i^{\dagger} \psi_j \]
- In the limit of $m >> 0$, $\psi_1$ becomes the wavefunction in the free particle TDSE: $\partial_t \psi_1 = (i\hbar/2m) \partial_x^2 \psi_1$.
The Dirac Equation in 1+1
In 1+1, the Dirac Equation can be written as the system of equations:
\[ \begin{align} \partial_t \psi_1 &= - \partial_x \psi_2 -im \psi_1 \\ \partial_t \psi_2 &= - \partial_x \psi_1 + im \psi_2 \end{align} \]
where $\psi_1$ and $\psi_2$ are the components of the spinor $\psi$. If we define $\psi_1 = e^{-imt} \phi_1$ and $\psi_2 = e^{imt} \phi_2$, the equations become \[\begin{align} \partial_t \phi_1 &= -e^{2imt} \, \,\, \partial_x \phi_2 \\ \partial_t \phi_2 &= -e^{-2imt} \partial_x \phi_1 \end{align}.\]
I pulled a sort of fast one. A game with a complex-valued payoff function doesn't really make sense; there's no ordering on $\mathbb{C}$. We can resolve this if we switch to the rotating frame, where our strategies become $\phi_1$ and $\phi_2$ instead of $\psi_1$ and $\psi_2$.
Ralph: I’m not a bad guy, Felix. I’m just... tired of being the guy who lives in the garbage.
Felix: But you’re the breaker! I’m the fixer! That’s how it works! It’s the natural order of things!
Wreck-It Ralph, 2012
...to be continued