Tequila Tutorial:
=================

Single-qubit classifier with data re-uploading
----------------------------------------------

This tutorial shows:

- How to construct a fidelity cost function for an optimization problem
- How to construct a quantum classifier with one qubit

Based on `Data re-uploading for a universal quantum classifier
<https://quantum-journal.org/papers/q-2020-02-06-226/>`__, A. Pérez-Salinas,
A. Cervera-Lierta, E. Gil-Fuster, and J. I. Latorre, *Quantum* **4**, 226
(2020).

.. code:: ipython3

    import tequila as tq
    import numpy as np
    import matplotlib.pyplot as plt
    from numpy import random
    import time

Introduction
~~~~~~~~~~~~

Single-qubit operations are just rotations on the Bloch sphere. A collection
of single-qubit rotations can be reduced to a single one by combining their
angles: you can always move from one point to another on the Bloch sphere
with a single operation. Because of this property, a single-qubit classifier
built from simple parameterized rotations cannot treat complex data. Even if
it succeeds in classifying one data point, it will probably misclassify the
others, since the rotation angles have been adapted to that particular point
only. To circumvent this limitation, one can introduce the data points into
these angles, so that each rotation becomes data-dependent. This methodology
is called data re-uploading, and it can be shown that a single-qubit
classifier becomes universal with this technique.

The strategy to train this single-qubit classifier is the following. Given a
problem with :math:`n` classes, we choose :math:`n` vectors on the Bloch
sphere. Then, we train the classifier by constructing a cost function that
adds penalties if the final state of the classifier is far from the target
state that corresponds to its class.

The single-qubit classifier circuit is divided into layers. Each layer
comprises single-qubit rotations that encode a data training point together
with parameters to be optimized.

.. math:: L\left(\vec{x};\vec{\theta}_{i}\right) = U\left(\vec{x}\right)U\left(\vec{\theta}_{i}\right)

By considering more layers, the final state of the classifier acquires a
richer structure in terms of the data point :math:`\vec{x}`.

.. math:: \mathcal{U}_{class}\left(\vec{x};\vec{\theta}_{1},\vec{\theta}_{2},...,\vec{\theta}_{l}\right) = L\left(\vec{x};\vec{\theta}_{1}\right)L\left(\vec{x};\vec{\theta}_{2}\right)\cdots L\left(\vec{x};\vec{\theta}_{l}\right)

We will run a :math:`\mathcal{U}_{class}` for each training point
:math:`\vec{x}`, but the parameters :math:`\vec{\theta}` are the same for all
points. These are the variables to be optimized classically through the cost
function.

Model
~~~~~

Let's define the model that we would like to classify. We start with a simple
model with two classes: a circle of radius :math:`r = \sqrt{2/\pi}` centered
at (0,0). Data points will be distributed in a square of side length 2
centered at (0,0). This particular choice of radius makes the circle cover
exactly half of the square, so a random classification will have a
:math:`\sim 50`\ % accuracy (a quick numerical check follows below).

The ``circle`` function has two modes:

- ``random = True``: generates and labels random points according to their
  position inside or outside the circle (used for training)
- ``random = False``: computes the label of a given point ``x_input`` (used
  for testing)
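As a sanity check of the :math:`\sim 50`\ % claim (this short snippet is an
added illustration, not part of the original workflow): the circle area is
:math:`\pi r^2 = 2`, exactly half of the :math:`2\times2` square's area of 4.

.. code:: ipython3

    # illustrative check: the circle of radius sqrt(2/pi) covers half the square
    r = np.sqrt(2 / np.pi)
    print("circle area / square area =", np.pi * r**2 / 4)  # -> 0.5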
.. code:: ipython3

    np.random.seed(42)

    def circle(samples=False, random=True, x_input=False,
               center=[0.0, 0.0], radius=np.sqrt(2 / np.pi)):
        """
        Args:
            samples (int): number of samples to generate
            random (bool):
                True: generates and labels `samples` random points
                False: labels the given point `x_input`
            x_input (array[tuple]): point to be labeled if random = False
            center (tuple): center of the circle
            radius (float): radius of the circle

        Returns:
            if random = True:
                xvals (array[tuple]): data point coordinates
                yvals (array[int]): corresponding data labels
            if random = False:
                y (int): label of the x_input point
        """
        xvals, yvals = [], []

        if random:
            for i in range(samples):
                # sample uniformly from the square [-1, 1] x [-1, 1]
                x = 2 * np.random.rand(2) - 1
                y = 0
                if np.linalg.norm(x - center) < radius:
                    y = 1
                xvals.append(x)
                yvals.append(y)
            return np.array(xvals), np.array(yvals)

        if not random:
            y = 0
            if np.linalg.norm(x_input - center) < radius:
                y = 1
            return y

Target states
~~~~~~~~~~~~~

The next step is to fix the class states. The classifier will be trained to
return these states depending on the data point. To reduce the uncertainty,
the target states should be as far apart as possible. With a single qubit,
that means choosing points on the Bloch sphere that are maximally separated.
For a two-class problem, an easy choice is the :math:`|0\rangle` and
:math:`|1\rangle` states.

.. code:: ipython3

    # |0> : points inside the circle
    # |1> : points outside the circle

    def targ_wfn(y, nclass):
        """
        Args:
            y (int): class label
            nclass (int): number of classes

        Returns:
            wfn: wavefunction of the target state
        """
        if nclass == 2:
            if y == 0:
                wfn = tq.QubitWaveFunction.from_array(np.asarray([1, 0]))
            if y == 1:
                wfn = tq.QubitWaveFunction.from_array(np.asarray([0, 1]))
        else:
            raise Exception("nclass = {} is not considered".format(nclass))
        return wfn

Single-qubit classifier circuit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The single-qubit classifier has a layer structure, i.e. we have to decide how
to design one layer and then how many layers we would like to consider. We
will use the following structure: each layer is a collection of rotation
gates whose angles are a linear function of a data point, with the free
parameters to be optimized entering as offsets. In particular,

.. math:: L\left(\vec{x};\vec{\theta}_{i}\right) = R_{z}\left(x^{2}+\theta_{i}^{2}\right) R_{y}\left(x^{1}+\theta_{i}^{1}\right).

Each layer therefore adds 2 parameters to be optimized. The data point
:math:`(x^1,x^2)` is re-uploaded in each layer.

.. code:: ipython3

    # single-qubit quantum classifier circuit
    def qcircuit(xval, param):
        """
        Args:
            xval (array[tuple]): data point
            param (list): variational parameters

        Returns:
            qc: quantum circuit
        """
        layers = int(len(param) / 2)  # 2 parameters per layer

        # initialize the circuit
        qc = tq.gates.Rz(0.0, 0)

        # add one Ry/Rz pair per layer, consuming two parameters per layer
        for p in range(0, 2 * layers, 2):
            qc += tq.gates.Ry(xval[0] + param[p], 0)
            qc += tq.gates.Rz(xval[1] + param[p + 1], 0)
        return qc

Cost function
~~~~~~~~~~~~~

The cost function for this quantum classifier model is constructed from the
fidelity of the classifier's final state with respect to the target state of
its corresponding class: it penalizes output states that are far from their
label state. First, we define the fidelity between two states as an objective
(see the State Preparation tutorial). Then, we construct the simplest cost
function of this kind, the average of the squared infidelities over all
:math:`M` training points:

.. math:: \chi^2 = \frac{1}{M}\sum_{i=1}^{M}\left(1-|\langle\psi_{target}|\psi_{circuit}\rangle|^2\right)^2
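The ``fid`` objective below uses the fact that this fidelity equals the
expectation value of the projector onto the target state evaluated over the
circuit state:

.. math:: \langle\psi_{circuit}|\left(|\psi_{target}\rangle\langle\psi_{target}|\right)|\psi_{circuit}\rangle = |\langle\psi_{target}|\psi_{circuit}\rangle|^{2}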
.. code:: ipython3

    # Fidelity objective
    def fid(wfn_targ, qc):
        """
        Args:
            wfn_targ: target wavefunction
            qc: quantum circuit

        Returns:
            O: fidelity objective
        """
        # projector onto the target state: |psi_target><psi_target|
        rho_targ = tq.paulis.Projector(wfn=wfn_targ)
        # its expectation value over the circuit state is the fidelity
        O = tq.Objective.ExpectationValue(U=qc, H=rho_targ)
        # to evaluate it directly: fidelity = tq.simulate(O)
        return O

    # cost function: average squared infidelity of the data points
    # with respect to their label states
    def cost(x, y, param, nclass):
        """
        Args:
            x (array[tuple]): training points
            y (array[int]): labels of the training points
            param (list): variational parameters
            nclass (int): number of classes

        Returns:
            loss / len(x): loss objective
        """
        loss = 0.0
        # len(y) = M: number of training points
        for i in range(len(y)):
            # state generated by the classifier
            qc = qcircuit(x[i], param)
            # fidelity objective with respect to the label state
            f = fid(targ_wfn(y[i], nclass), qc)
            loss = loss + (1 - f)**2
        return loss / len(x)

Training
~~~~~~~~

We now have all the ingredients to train a single-qubit classifier with data
re-uploading. If a gradient-based optimizer is chosen for this type of
optimization problem, numerical gradients are advised, since the analytical
ones become quite expensive.

.. code:: ipython3

    layers = 3
    nclass = 2
    training_set = 400

    # generate the training set and its corresponding labels
    xdata, ydata = circle(training_set)

    # generate the variational parameters
    param = [tq.Variable(name='th_{}'.format(i)) for i in range(0, 2 * layers)]

    # initialize the variational parameters
    # note that due to the random initialization the result can differ from
    # run to run; with gradient-based optimization you might get stuck
    inval = {key: random.uniform(0, 2 * np.pi) for key in param}

    grad = '2-point'           # numerical gradient (None: analytical gradient)
    mthd = 'L-BFGS-B'          # scipy minimization method
    mthd_opt = {'eps': 1.e-4}  # method options (that's the step size for the gradients)

    # objective to be optimized: the cost function
    obj = cost(xdata, ydata, param, nclass)

    t0 = time.time()
    # depending on the optimizer this will take a while
    test = tq.minimize(objective=obj, initial_values=inval, method=mthd,
                       gradient=grad, method_options=mthd_opt, silent=False)
    t1 = time.time()

Extract the results:

.. code:: ipython3

    print("loss = ", test.energy)
    print("method: ", mthd)
    print("method parameters: ", mthd_opt)
    print("execution time = ", (t1 - t0) / 60, " min")
    print(test.history.plot('energies', label='loss'))
    print(test.history.plot('angles', label=""))

Test
~~~~

Once trained, the optimal parameters of the classifier are stored in
``test.angles``. We run the classifier again on a test data set with these
parameters fixed.
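Before the full test loop, here is a minimal single-point sketch of the
decision rule (``x_probe`` is a hypothetical name introduced only for this
illustration): the point is assigned to the :math:`|0\rangle` class if the
fidelity with :math:`|0\rangle` is at least 0.5, and to the
:math:`|1\rangle` class otherwise.

.. code:: ipython3

    # classify a single illustrative point with the trained parameters
    x_probe = np.array([0.1, -0.2])  # hypothetical test point
    wfn_probe = tq.simulate(qcircuit(x_probe, param), variables=test.angles)
    f0 = abs(wfn_probe.inner(targ_wfn(0, nclass)))**2
    print("predicted class:", 0 if f0 >= 0.5 else 1)
    print("true class:     ", circle(random=False, x_input=x_probe))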
.. code:: ipython3

    test_set = 1000

    # initialize
    xval_test, yval_test, yval_real = [], [], []
    suc = 0        # success
    suc_rand = 0   # random success

    for i in range(test_set):
        # random test point in the square [-1, 1] x [-1, 1]
        x = 2 * np.random.rand(2) - 1

        # state generated by the trained classifier
        qc = qcircuit(x, param)
        wfn_qc = tq.simulate(qc, variables=test.angles)

        if nclass == 2:
            # compute the fidelity with respect to one of the label states, the |0>
            f = abs(wfn_qc.inner(targ_wfn(0, nclass)))**2
            y = 1
            # if the fidelity is >= 0.5, we conclude that this state belongs
            # to the |0> class (|1> class otherwise)
            if f >= 0.5:
                y = 0
            # check the real class of the data point
            y_real = circle(random=False, x_input=x)
        else:
            raise Exception("nclass = {} is not considered".format(nclass))

        # compute the success rate
        if y == y_real:
            suc = suc + 1

        # compute the random success rate
        # (randint excludes the upper bound, so nclass draws all classes)
        yrand = np.random.randint(0, nclass)
        if yrand == y_real:
            suc_rand = suc_rand + 1

        xval_test.append(x)
        yval_test.append(y)
        yval_real.append(y_real)

    print("success %: ", 100 * suc / test_set, "%")
    print("random success %: ", 100 * suc_rand / test_set, "%")

Plot the results:

.. code:: ipython3

    def plot_data(x, y, nclass, fig=None, ax=None):
        """
        Args:
            x (array[tuple]): data points
            y (array[int]): data labels
            nclass (int): number of classes

        Returns:
            Plot
        """
        if fig is None:
            fig, ax = plt.subplots(1, 1, figsize=(5, 5))
        # colors and labels
        col = ["red", "blue", "green", "yellow"]
        lab = [0, 1, 2, 3]
        for i in range(nclass):
            ax.scatter(x[y == lab[i], 0], x[y == lab[i], 1],
                       c=col[i], s=20, edgecolor="k")
        ax.set_xlabel("$x_1$")
        ax.set_ylabel("$x_2$")

.. code:: ipython3

    xval_test = np.array(xval_test)
    yval_test = np.array(yval_test)
    yval_real = np.array(yval_real)

    fig, axes = plt.subplots(1, 2, figsize=(6, 3))
    plot_data(xval_test, yval_test, nclass, fig, axes[0])
    plot_data(xval_test, yval_real, nclass, fig, axes[1])
    axes[0].set_title("Single-qubit class. {} layers".format(layers))
    axes[1].set_title("True test data")
    fig.tight_layout(pad=0.5)
    plt.show()

    t2 = time.time()
    print("Total execution time: ", (t2 - t0) / 60, " min.")

Improvements and customization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This tutorial shows only a simple classification example. It is constructed
in a way that one can easily change the classification model and try more
sophisticated problems comprising more classes. To do so, one should define
the target states for more than two classes, i.e. include more vectors on the
Bloch sphere (a minimal sketch for three classes is given below). The
single-qubit classifier circuit can also be modified to include
higher-dimensional data or to increase/reduce the number of parameters per
layer.

The cost function can also be improved. See for instance the weighted
fidelity cost function proposed in the main reference.

Finally, the core of any variational algorithm is the minimization method.
Tequila provides many methods besides the scipy ones; see the optimizers
tutorial for more information. Note that the random initialization affects
the outcome. One possibility to play with is to use Phoenics for an initial
exploration and a gradient-based optimizer with the best Phoenics results as
the starting point.
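As an example of the multi-class extension mentioned above, here is a minimal
sketch (an added illustration, not part of the original tutorial) of target
states for three classes, chosen as maximally separated vectors on a great
circle of the Bloch sphere, i.e. at polar angles :math:`0, 2\pi/3, 4\pi/3`:

.. code:: ipython3

    # hypothetical extension of targ_wfn to three classes: three states
    # evenly spaced on a great circle of the Bloch sphere
    def targ_wfn_3class(y):
        theta = y * 2 * np.pi / 3  # polar angle of class y
        # |psi(theta)> = cos(theta/2)|0> + sin(theta/2)|1>
        return tq.QubitWaveFunction.from_array(
            np.asarray([np.cos(theta / 2), np.sin(theta / 2)]))

With more than two classes, the test decision rule also changes: one assigns
to each point the class whose target state has the highest fidelity with the
circuit output.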