Single-qubit classifier with data re-uploading tutorial
=======================================================
This tutorial shows: - How to construct a fidelity cost function for an
optimization problem - How to construct a quantum classifier with one
qubit
Based on `Data re-uploading for a universal quantum
classifier `__, A.
Pérez-Salinas, A. Cervera-Lierta, E. Gil-Fuster, and J. I. Latorre,
*Quantum **4**, 226 (2020)*.
.. code:: ipython3
import tequila as tq
import numpy as np
import matplotlib.pyplot as plt
from numpy import random
import time
Introduction
------------
Single-qubit operations are just rotations on the Bloch sphere. A
collection of single-qubit rotations can be reduced to a single one by
combining their angles: you can always move from one point to another on
the Bloch sphere with a single operation.
Because of this property, a single-qubit classifier with simple
parameterized rotations can not treat complex data. Even if it succeeds
to classify one data point, it will probably misclassify the others
since rotational angles have been only adapted to one particular point.
To circumvent this limitation one can introduce the data points into
these angles, so each rotation will be data point-dependent. This
methodology is called data re-uploading and it can be shown that a
single-qubit classifier can be universal using this technique.
The strategy to train this single-qubit classifier is the following.
Given a problem with 𝑛n classes, we choose 𝑛n vectors on the Bloch
sphere. Then, we train the classifier by constructing a cost function
that adds penalties if the final state of the classifier is far from the
target state that corresponds to its class.
The single-qubit classifier circuit is divided into layers. Each layer
comprises single-qubit rotations that encode a data training point and
parameters to be optimized.
.. math:: L\left(\vec{x};\vec{\theta}_{i}\right) = U\left(\vec{x}\right)U\left(\vec{\theta}_{i}\right)
By considering more layers, the final state of the classifier will have
a richer structure in terms of the data point :math:`\vec{x}`.
.. math:: \mathcal{U}_{class}\left(\vec{x};\vec{\theta}_{1},\vec{\theta}_{2},...,\vec{\theta}_{l}\right) = L\left(\vec{x};\vec{\theta}_{1}\right)L\left(\vec{x};\vec{\theta}_{2}\right)\cdots L\left(\vec{x};\vec{\theta}_{l}\right)
We will run a :math:`\mathcal{U}_{class}` for each training point
:math:`\vec{x}`, but the parameters :math:`\vec{\theta}` are the same.
These are the variables to be optimized classically through the cost
function.
Model
-----
Let's define the model that we would like to classify. Let's start with
a simple model with two classes: a circle of radius
:math:`r =\sqrt{2/\pi}` centered at (0,0). Data points will be
distributed in a square of length 2 centered at (0,0). The particular
choice of the circle radius implies that a random classification will
have a :math:`\sim 50`\ % accuracy.
``circle`` function will have two parts: - ``random = True``: generate
and label random points according to their position inside or outside
the circle (used for training) - ``random = False``: computes the label
of a given point ``x_input`` (used for testing)
.. code:: ipython3
np.random.seed(42)
def circle(samples = False, random = True, x_input = False, center=[0.0, 0.0], radius=np.sqrt(2 / np.pi)):
"""
Args:
samples (int): number of samples to generate
random = True: generates and labels sample random points
random = False: labels x_input point
x_input (array[tuple]): given point to be labeled if random = False
center (tuple): center of the circle
radius (float): radius of the circle
Returns:
if random = True:
xvals (array[tuple]): data points coordinates
yvals (array[int]): corresponding data labels
if random = False:
y (int): label of x_input point
"""
xvals, yvals = [], []
if random == True:
for i in range(samples):
x = 2 * (np.random.rand(2)) - 1
y = 0
if np.linalg.norm(x - center) < radius:
y = 1
xvals.append(x)
yvals.append(y)
return np.array(xvals), np.array(yvals)
if random == False:
y = 0
if np.linalg.norm(x_input - center) < radius:
y = 1
return y
Target states
-------------
Next step is to fix the classes states. The classifier will be trained
to return these states depending on the data point. To reduce the
uncertanty, target states should be as much distanced as possible. With
a single qubit, that means to choose points in the Bloch sphere as much
separated as possible. For a two-class problem, an easy choice is
:math:`|0\rangle` and :math:`|1\rangle` states.
.. code:: ipython3
# |0> : points inside the circle
# |1> : points outside the circle
def targ_wfn(y, nclass):
"""
Arg:
- y (int): class label
- nclass: number of classes
Returns:
- wfn: wavefunction of the target state
"""
if nclass == 2:
if y==0:
wfn = tq.QubitWaveFunction.from_array(np.asarray([1,0]))
if y==1:
wfn = tq.QubitWaveFunction.from_array(np.asarray([0,1]))
else:
raise Exception("nclass = {} is not considered".format(nclass))
return wfn
Single-qubit classifier circuit
-------------------------------
The single-qubit classifier has a layer structure, i.e. we have to
decide how to design one layer and then how many layers we would like to
consider. We will consider the following structure: each layer is a
collection of rotational gates which angles are a linear function of a
data point with the free parameters to be optimized. In particular,
.. math:: L\left(\vec{x};\vec{\theta}_{i}\right) = R_{z}\left(x^{2}+\theta_{i}^{2}\right) R_{y}\left(x^{1}+\theta_{i}^{1}\right).
Then, each layer adds 2 parameters to be optimized. The data points
:math:`(x^1,x^2)` are re-uploaded in each layer.
.. code:: ipython3
# single-qubit quantum classifier circuit
def qcircuit(xval, param):
"""
Arg:
- xval (array[tuple]): data point
- param (dict): parameters dictionary
Returns:
- qc: quantum circuit
"""
layers = int((len(param))/2) # 2 parameters/layer
# initialize the circuit
qc = tq.gates.Rz(0.0,0)
for p in range(0,2*layers-1):
# add layers to the circuit
qc += tq.gates.Ry(xval[0] + param[p],0) + tq.gates.Rz(xval[1] + param[p+1],0)
return qc
Cost function
-------------
The cost function for this quantum classifier model will be constructed
from the fidelity of the classifier final state respect to the target
state of its corresponding class. It will penalize that the output state
is far from its label state.
First, we define the fidelity between two states as an objective (see
State Preparation Tutorial). Then, we construct the simplest cost
function of this kind: average of squared infidelities for all training
points :math:`M`:
.. math:: \chi^2 = \sum_{i=1}^{M}\left(1-|\langle\psi_{target}|\psi_{circuit}\rangle|^2\right)^2
.. code:: ipython3
# Fidelity objective
def fid(wfn_targ, qc):
"""
Arg:
- wfn_targ: target wavefunction
- qc : quantum circuit
Returns:
- O: objective
"""
rho_targ = tq.paulis.Projector(wfn=wfn_targ)
O = tq.Objective.ExpectationValue(U=qc, H=rho_targ)
# fidelity = tq.simulate(O)
return O
# cost function: sum of all infidelities for each data point respect the label state
def cost(x, y, param, nclass):
"""
Arg:
- x (array[tuple]): training points
- y (array[int]): labels of training points
- param (dict): parameters dictionary
- nclass (int): number of classes
Returns:
- loss/ len(x): loss objective
"""
loss = 0.0
# M = len(y): number of training points
for i in range(len(y)):
# state generated by the classifier
qc = qcircuit(x[i], param)
# fidelity objective respect to the label state
f = fid(targ_wfn(y[i],nclass), qc)
loss = loss + (1 - f)**2
return loss / len(x)
Training
--------
| We have now all the ingredients to train a single-qubit classifier
with data re-uploading.
| If a gradient based optimization is chosen for this type of
optimization problems, numerical gradients are adviced since
analytical ones become quite expensive.
.. code:: ipython3
layers = 3
nclass = 2
training_set = 400
# generate the training set and its corresponding labels
xdata, ydata = circle(training_set)
# generate the variational parameters
param = [tq.Variable(name='th_{}'.format(i)) for i in range(0,2*layers)]
# initialize the variational parameters
# note that due to the random initialization the result can be different from time to time
# With gradient based optimization you might get stuck
inval = {key : random.uniform(0, 2*np.pi) for key in param}
grad = '2-point' # numerical gradient (= None: analytical gradient)
mthd = 'rmsprop' # scipy minimization method
mthd_opt = {'eps':1.e-4} # method options (that's the stepsize for the gradients)
obj = cost(xdata, ydata, param, nclass) # objective to be optimized: cost function
t0 = time.time()
# depending on the optimizer this will take a while
test = tq.minimize(objective=obj, initial_values=inval, method = mthd,
gradient = grad, method_options = mthd_opt, silent=False)
t1 = time.time()
Extract the results:
.. code:: ipython3
print("loss = ", test.energy)
print("method: ", mthd)
print("method parameters: ", mthd_opt)
print("execution time = ", (t1-t0)/60, " min")
print(test.history.plot('energies', label='loss'))
print(test.history.plot('angles', label=""))
Test
----
Once trained, we have the optimal parameters for the classifier stored
in ``test.angles``. We run again the classifier with the test data set
and with these parameters fixed.
.. code:: ipython3
test_set = 1000
# initialize
xval_test, yval_test, yval_rand, yval_real = [], [], [], []
suc = 0 # success
suc_rand = 0 # random success
for i in range(test_set):
# random test point
x = 2 * (np.random.rand(2)) - 1
# state generated by the trained classifier
qc = qcircuit(x, param)
wfn_qc = tq.simulate(qc, variables=test.angles)
if nclass == 2:
# compute the fidelity respect to one of the label states, the |0>
f = abs(wfn_qc.inner(targ_wfn(0,nclass)))**2
y = 1
# if fidelity is >= 0.5, we conclude that this state belongs to |0> class
# (|1> class otherwise)
if f >= 0.5:
y = 0
# check the real class of the data point
y_real = circle(random=False, x_input=x)
else:
raise Exception("nclass = {} is not considered".format(nclass))
# compute success rate
if y == y_real:
suc = suc + 1
# compute random success rate
yrand = np.random.randint(0, nclass-1)
if yrand == y_real:
suc_rand = suc_rand + 1
xval_test.append(x)
yval_test.append(y)
yval_real.append(y_real)
print("success %: ", 100*suc/test_set,"%")
print("random success %: ", 100*suc_rand/test_set,"%")
Print results:
.. code:: ipython3
def plot_data(x, y, nclass, fig=None, ax=None):
"""
Arg:
- x (array[tuple]): data points
- y (array[int]): data labels
- nclass (int): number of classes
Returns:
- Plot
"""
if fig == None:
fig, ax = plt.subplots(1, 1, figsize=(5, 5))
# colors and labels
col = ["red","blue","green","yellow"]
lab = [0,1,2,3]
for i in range(nclass):
ax.scatter(x[y == lab[i], 0], x[y == lab[i], 1], c=col[i], s=20, edgecolor="k")
ax.set_xlabel("$x_1$")
ax.set_ylabel("$x_2$")
.. code:: ipython3
xval_test = np.array(xval_test)
yval_test = np.array(yval_test)
yval_real = np.array(yval_real)
fig, axes = plt.subplots(1, 2, figsize=(6, 3))
plot_data(xval_test, yval_test, nclass, fig, axes[0])
plot_data(xval_test, yval_real, nclass, fig, axes[1])
axes[0].set_title("Single-qubit class. {} layers".format(layers))
axes[1].set_title("True test data")
fig.tight_layout(pad=0.5)
plt.show()
t2 = time.time()
print("Total execution time: ", (t2-t0)/60," min.")
Improvements and customization
------------------------------
This tutorial just shows a simple classification example. It is
constructed in a way that one can easily change the classification model
and try more sophisticated problems comprising more classes. To do so,
one should define the target states for >2 classes, i.e. include more
vectors in the Bloch sphere.
The single-qubit classifier circuit can also be modified to include
higher dimensional data or to increase/reduce the number of parameters
per layer.
The cost function can also be improved. See for instance the weigthed
fidelity cost function proposed in the main reference.
Finally, the core of any variational algorithm is the minimization
method. Tequila provides many methods besides the scipy ones. See the
`optimizers
tutorial `__
for more information. Notice also that the algorithm starts with a
random initialization. It is well-known that random initialization in
variational circuits leads to a barren-plateaus problem when computing
the gradients. This problem can be avoided by providing a good
initialization guess.
Another possibility to play with expoloiting other Tequila modules is to
use Phoenics for initial exploration and a gradient based optimizer with
the best Phoenics results as starting point (see Phoenics
documentation).