Training Distributions with KL Divergence

A short tutorial on how to form objectives like the ones used in this paper.

First some imports, and a word of advice: use tq.numpy for functions that enter objectives (to avoid issues with automatic differentiation) and the regular numpy for everything else (to avoid issues with jax where you don't want them).

import tequila as tq
import numpy

First we define the \(\max(x,\epsilon)\) function that we will need further down the road, together with some global variables for this example.

# global variables, change here if you want
eps = 1.e-8
n_qubits = 5
n_layers = 1

def my_max(x):
    # clip x from below at eps so that we never take log(0) later on
    if x < eps:
        return eps
    else:
        return x
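
A quick sanity check of the clipping behaviour (plain Python, nothing tequila-specific):

print(my_max(0.0))   # prints 1e-08, i.e. eps
print(my_max(0.25))  # prints 0.25, values above eps pass through unchanged
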
In this example we will train for the generation of a GHZ state (as in the paper above).
The ansatz template consists of Rx gates and Mølmer-Sørensen gates, which are just rotations around XX.
We give the option of using multiple layers here, but one layer actually suffices for this example.
# create ansatz
U = tq.QCircuit()
for layer in range(n_layers):
    for q in range(n_qubits):
        # name can be any non-numeric hashable type
        # scaling variables with pi in this example
        variable = tq.Variable(name=(layer, q))
        U += tq.gates.Rx(angle=variable*tq.numpy.pi, target=q)
    for q in range(n_qubits):
        for p in range(q+1, n_qubits):
            U += tq.gates.ExpPauli(angle=tq.numpy.pi / 2.0, paulistring="X({})X({})".format(q, p))

Here are some demonstrations of how to extract information from your template circuit U

print(U)
print("Circuit depends on variables: ", U.extract_variables())
The circuit is parametrized by variables, so if we want to simulate it we have to choose the values for which the wavefunction should be simulated. Here is an example where we set all variables to 1.0, which corresponds to rotation angles of \(\pi\) since the circuit scales each variable by \(\pi\).
Variables are passed down to the simulator as a dictionary holding the variable names and values
variables = {k : 1.0 for k in U.extract_variables()}
wfn = tq.simulate(U, variables=variables)
print("wfn with variables = ", variables)
print(wfn)

Instead of simulating it directly, the circuit can also be compiled and afterwards used like a function

compiledU = tq.compile(U)
wfn = compiledU(variables=variables)
print(wfn)

Next we will define the target distribution which we want to optimize for. It is the distribution of a GHZ state, i.e. \(P(00..0) = 0.5\) and \(P(11..1) = 0.5\), where in the following we use the integer representation of the bitstrings for easier coding, i.e. \(00..0 = 0\) and \(11..1 = 2^{n-1}+2^{n-2}+..+1\).

one = sum([2**i for i in range(n_qubits)])
target_distribution = {0:0.5, one:0.5}
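
For n_qubits = 5 this gives one = 0b11111 = 31, so the target assigns probability 0.5 to each of the integers 0 and 31.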

For the tequila objective, Born's rule will be used. In order to make this work we need to reformulate it as an expectation value. This can be achieved as

\[\displaystyle \lvert\langle ijk..l \vert \Psi \rangle\rvert^2 = \langle \Psi \rvert H \lvert \Psi \rangle\]

where \(i,j,k,..,l \in \left\{0,1 \right\}\) and the Hamiltonian is
\[\displaystyle H = \lvert ijk..l \rangle \langle ijk..l \rvert = \otimes_{m \in (ijk..l)} \frac{1}{2}\left( 1 + (-1)^{m}\sigma_Z \right)\]

Tequila has a convenience shortcut for the operator

\[Q_{\pm} = \frac{1}{2}\left( 1 \pm \sigma_Z \right)\]
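
As a small illustration, the projector onto \(\lvert 11..1 \rangle\) can be assembled from these operators qubit by qubit. This is only a sketch: it assumes the shortcut is exposed as tq.paulis.Qp and tq.paulis.Qm, with Qp projecting onto \(\lvert 0 \rangle\) and Qm onto \(\lvert 1 \rangle\).

# sketch: assumes tq.paulis.Qm(q) implements 1/2 (1 - sigma_z) on qubit q
H_ones = tq.paulis.Qm(0)
for q in range(1, n_qubits):
    H_ones = H_ones * tq.paulis.Qm(q)
print(H_ones)

Below we will use the equivalent, more convenient tq.paulis.Projector shortcut instead.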

Let's build the Kullback–Leibler divergence between the target distribution Q and the distribution P generated by our template U

\[\displaystyle D(Q\,\|\,P) = \frac{1}{2^n} \sum_{x} Q(x) \ln\left(\frac{Q(x)}{P(x)}\right)\]

where we use the same safety barrier as in the paper to prevent taking logarithms of zero (this is where the my_max function from above is needed). In the end we will square the objective to avoid negative values and ensure that zero is the minimum.

objective = tq.Objective()

for k, Q in target_distribution.items():
    Q = my_max(Q) # here we apply the mymax function to avoid having zeros
    wfn = tq.QubitWaveFunction.from_int(k, n_qubits=n_qubits)
    H = tq.paulis.Projector(wfn=wfn)
    P = tq.ExpectationValue(H=H, U=U)  # this is the born rule expectation value from above
    P = P.apply(my_max) # here we apply the mymax function to avoid having zeros
    objective += Q*(Q / P).apply(tq.numpy.log) # here we take the logarithm and sum up

objective = 1.0 / 2.0**n_qubits * objective  # here we apply the normalization 1/2^n
objective = objective**2 # here we take the square

Let's see what we created here. Tequila created an objective with 4 expectation values (we had 2 values in the target_distribution, and P entered twice for each of them) but only 2 unique expectation values (meaning the simulator will only evaluate those two).

print(objective)
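
If you want to verify these counts programmatically, something along the following lines should work (a sketch: the count_expectationvalues method and its unique flag are assumed here, check the Objective documentation of your tequila version):

# assumed helper: Objective.count_expectationvalues(unique=...)
print("total expectation values : ", objective.count_expectationvalues(unique=False))
print("unique expectation values: ", objective.count_expectationvalues(unique=True))
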
Objectives can be compiled and simulated in the same way as the circuits above. Here is a small demonstration.
Since our objective is parametrized we need to pass the variables down again. We will use the same values as in the example above.
value1 = tq.simulate(objective, variables=variables)
compiled = tq.compile(objective)
value2 = compiled(variables=variables)
variables2 = {k:1.1 for k in objective.extract_variables()}
value3 = compiled(variables=variables2)
print("value1 = ", value1)
print("value2 = ", value2)
print("value3 = ", value3)
Let's optimize our objective with one of the built-in optimizers in tequila.
If no initial_values are passed down, random initialization will be used.
Check the OptimizerSciPy Tutorial for more information.
result = tq.optimizer_scipy.minimize(objective=objective, method="Cobyla")
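
If you do want to control the starting point, a dictionary in the same format as the variables dictionary above can be passed. This is just a sketch with an arbitrary starting value of 0.1 for every variable:

# initial_values: dictionary mapping variable names to starting values
initial_values = {k: 0.1 for k in objective.extract_variables()}
result = tq.optimizer_scipy.minimize(objective=objective, method="Cobyla", initial_values=initial_values)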

Let's compute our wavefunction with the optimized angles

final_wfn = tq.simulate(U, variables=result.angles)
print(final_wfn)
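
As a final check we can also evaluate the objective at the optimized angles; since the objective was squared it should end up close to zero. This reuses only calls introduced above:

# evaluate the (squared, normalized) divergence at the optimized angles
final_value = tq.simulate(objective, variables=result.angles)
print("objective at the optimized angles: ", final_value)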

And plot some information about the optimization

result.history.plot("energies")
result.history.plot("angles")