Classical encoders encode or compress classical data into smaller-sized data via a deterministic algorithm. For example, JPEG is essentially an algorithm that compresses images into smaller-sized images.
In a similar fashion to the classical counterpart, a quantum autoencoder compresses quantum data stored initially on $n$ qubits into a smaller quantum register of $m<n$ qubits via a variational circuit. However, quantum computing is reversible; therefore, qubits cannot be “erased”. Instead, a quantum autoencoder tries to achieve the following transformation from an uncoded quantum register of size $n$ to a coded one of size $m$:

$$|\psi\rangle_n \rightarrow |\psi'\rangle_m |0\rangle_{n-m}$$

Namely, we try to decouple the initial state into a product state of a smaller register of size $m$ and a register of size $n-m$ that is in the zero state. The former is usually called the coded state and the latter the trash state.
To train a quantum autoencoder, we define a proper cost function. Below are two common approaches, one using a swap test and the other using Hamiltonian measurements. We focus on the swap test case, and comment on the other approach at the end of this notebook.
The swap test is a quantum function that checks the overlap between two quantum states. The inputs of the function are two quantum registers of the same size, $|\psi_1\rangle, |\psi_2\rangle$, and it returns as output a single “test” qubit whose state encodes the overlap between the two inputs: $|q\rangle_{\rm test} = \alpha|0\rangle + \sqrt{1-\alpha^2}|1\rangle$, with

$$\alpha^2 = \frac{1}{2}\left(1+|\langle \psi_1|\psi_2\rangle|^2\right).$$

Thus, the probability of measuring the test qubit in state 0 is 1 if the states are identical (up to a global phase) and 1/2 if the states are orthogonal to each other. The quantum model starts with an H gate on the test qubit, followed by swapping between the two states controlled on the test qubit and a final H gate on the test qubit.
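The expression for $\alpha^2$ can be checked numerically without any quantum SDK. The following is a minimal sketch in plain Python and is not part of the original notebook: the helper names `kron` and `swap_test_p0` are made up for illustration, and states are assumed to be given as lists of amplitudes.

```python
import math


def kron(a, b):
    """Tensor product of two state vectors given as lists of amplitudes."""
    return [x * y for x in a for y in b]


def swap_test_p0(psi1, psi2):
    """Probability of measuring the test qubit in state 0 after H, controlled-SWAP, H.

    After the circuit, the |0> branch of the test qubit carries the
    unnormalized state (|psi1>|psi2> + |psi2>|psi1>) / 2.
    """
    ab = kron(psi1, psi2)
    ba = kron(psi2, psi1)
    return sum(abs((x + y) / 2) ** 2 for x, y in zip(ab, ba))


zero = [1, 0]
one = [0, 1]
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]

print(swap_test_p0(zero, zero))  # identical states
print(swap_test_p0(zero, one))   # orthogonal states
print(swap_test_p0(zero, plus))  # squared overlap of 1/2
```

Under these assumptions the three calls evaluate to 1, 1/2, and approximately 3/4, as the formula for $\alpha^2$ predicts for squared overlaps of 1, 0, and 1/2.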
@qfunc
def encoder_ansatz(
    exe_params: CArray[CReal],
    coded: QArray,
    trash: QArray,
) -> None:
    """
    This is a parametric model that acts on num_qubits=trash.len+coded.len qubits.
    It contains trash.len layers, each composed of RY gates and CX gates with a linear connectivity,
    and a final layer with RY gate on each of the trash qubits is applied.
    """
    num_qubits = trash.len + coded.len
    x = QArray()
    within_apply(
        lambda: bind([coded, trash], x),
        lambda: repeat(
            trash.len,
            lambda r: (
                repeat(num_qubits, lambda i: RY(exe_params[r * num_qubits + i], x[i])),
                repeat(num_qubits - 1, lambda i: CX(x[i], x[i + 1])),
            ),
        ),
    )
    repeat(trash.len, lambda i: RY(exe_params[(trash.len) * num_qubits + i], trash[i]))
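The indexing into `exe_params` fixes how many parameters the ansatz needs. The sketch below, in plain Python with hypothetical helper names and register sizes chosen only for illustration, mirrors that indexing: `trash.len` layers with one RY angle per qubit, followed by one RY angle per trash qubit.

```python
def ansatz_param_count(num_coded: int, num_trash: int) -> int:
    """Total number of RY angles read from exe_params by the ansatz."""
    num_qubits = num_coded + num_trash
    # num_trash layers acting on all qubits, plus a final layer on the trash qubits
    return num_trash * num_qubits + num_trash


def layer_param_indices(layer: int, num_coded: int, num_trash: int) -> list:
    """Indices into exe_params used by the RY gates of one repeated layer."""
    num_qubits = num_coded + num_trash
    return [layer * num_qubits + i for i in range(num_qubits)]


def final_layer_indices(num_coded: int, num_trash: int) -> list:
    """Indices into exe_params used by the final RY gates on the trash qubits."""
    num_qubits = num_coded + num_trash
    return [num_trash * num_qubits + i for i in range(num_trash)]


# Hypothetical sizes: 2 coded qubits and 2 trash qubits
print(ansatz_param_count(2, 2))
print(layer_param_indices(1, 2, 2))
print(final_layer_indices(2, 2))
```

For the hypothetical sizes above, the ansatz reads 10 angles: indices 0 to 7 for the two repeated layers and indices 8 and 9 for the final trash layer.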
The network for training contains only a quantum layer. The corresponding quantum program was already defined above, so what remains is to define the execution preferences and the classical postprocess. The classical output is defined as $1-\alpha^2$, with $\alpha^2$ being the probability of measuring the test qubit in state 0.
The cost function to minimize is $|1-\alpha^2|$ for all our training data. Looking at the Qlayer output, this means that we should define the corresponding labels as 0:
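To make the relation between the measured test qubit, the Qlayer output, and the zero labels concrete, here is a minimal sketch in plain Python. It is not the notebook's training code: the function names and the measurement counts are hypothetical, and the mean absolute error stands in for whatever loss object the training loop actually uses.

```python
def loss_from_counts(counts: dict, num_shots: int) -> float:
    """Postprocess: 1 - alpha^2, where alpha^2 is the fraction of '0' outcomes."""
    alpha_squared = counts.get("0", 0) / num_shots
    return 1 - alpha_squared


def l1_loss(outputs, labels) -> float:
    """Mean absolute error between the layer outputs and their labels."""
    return sum(abs(o - l) for o, l in zip(outputs, labels)) / len(outputs)


# Hypothetical test-qubit counts for three training samples
num_shots = 1000
sample_counts = [{"0": 1000}, {"0": 900, "1": 100}, {"0": 750, "1": 250}]

outputs = [loss_from_counts(c, num_shots) for c in sample_counts]
labels = [0.0] * len(outputs)  # a perfectly decoupled trash register gives output 0
print(l1_loss(outputs, labels))
```

With all labels equal to 0, the mean absolute error reduces to the average of $|1-\alpha^2|$ over the samples, which is the quantity the training minimizes.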
In this demo we initialize the network with trained parameters and run only one epoch for demonstration purposes. Reasonable training with the above hyperparameters can be achieved with about 40 epochs. To train the network from the beginning, uncomment the following code line:
Once we have trained the network, we can build a new network with the trained variables. We verify our encoder by taking only the encoding block, changing the postprocess, etc.
Below, we verify our quantum autoencoder by comparing the input with the output of an encoder-decoder network. We create the following network containing three quantum blocks:
1. The first two blocks of the previous network: a block for loading the inputs followed by our quantum encoder.
2. A reset of the trash qubits, which explicitly sets them to the zero state.
3. The inverse of the quantum encoder.
The network weights are set to the trained ones.
For the validator postprocessing, we take the output with the maximum counts. We run the validator quantum program with the trained weights, and compare each input with its output.
trained_w = encoder_train_network.qlayer.weight.tolist()
input_data = train_dataset.data.tolist()
batch_data = [{"w": trained_w, "input_data": data} for data in input_data]
with ExecutionSession(qprog_validator) as es:
    results_validator = es.batch_sample(batch_data)
Now we can compare the input with the output of the validator for different data:
for data, res in zip(input_data, results_validator):
    df = res.dataframe
    output = df.loc[df["probability"].idxmax(), "decoded"]
    print("input =", data, ", output =", output)
We can use our trained network for anomaly detection. Let’s see what happens to the trash qubits when we insert an anomaly; namely, non-domain-wall data:
import random

input_anomaly_data = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 1, 1],
]
random.shuffle(input_anomaly_data)
batch_data = [{"w": trained_w, "input_data": data} for data in input_anomaly_data]
with ExecutionSession(qprog_ae_network) as es:
    results_anomaly = es.batch_sample(batch_data)
We print the loss for each input and flag as an anomaly every input whose loss exceeds a predefined tolerance:
tolerance = 1e-2
for data, res in zip(input_anomaly_data, results_anomaly):
    # The probability of measuring the test qubit in state 0
    alpha_squared = res.counts_of_output("test").get("0", 0) / num_shots
    output = 1 - alpha_squared
    if abs(output) > tolerance:
        print(f"input= {data}, loss= {output} ----> ANOMALY DETECTED")
    else:
        print(f"input= {data}, loss= {output}")
Alternative Network for Training a Quantum Autoencoder
Another way to introduce a cost function is by estimating Hamiltonians. The expectation value of the Pauli $Z$ matrix on a qubit in the general state $|q\rangle = a|0\rangle + b|1\rangle$ is $\langle q|Z|q\rangle = |a|^2 - |b|^2$. Therefore, a cost function can be defined by taking expectation values on the trash output (without a swap test) as follows:

$$\text{Cost} = \frac{1}{2}\sum_{k=1}^{\text{num of trash qubits}} \left(1 - \langle Z_k \rangle\right).$$

Below we show how to define the corresponding Qlayer: the quantum program and postprocessing.
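Before moving to the Qlayer, the cost formula can be sanity-checked in plain Python. This is a minimal sketch with hypothetical helper names, assuming each trash qubit is described by its two amplitudes; it is independent of the quantum program and postprocessing of the notebook.

```python
import math


def z_expectation(a: complex, b: complex) -> float:
    """Expectation value of Pauli Z for the single-qubit state a|0> + b|1>."""
    return abs(a) ** 2 - abs(b) ** 2


def trash_cost(z_expectations) -> float:
    """Cost = 1/2 * sum over trash qubits of (1 - <Z_k>)."""
    return 0.5 * sum(1 - z for z in z_expectations)


s = 1 / math.sqrt(2)

# Every trash qubit in |0>: <Z> = 1 for each, so the cost vanishes
print(trash_cost([z_expectation(1, 0), z_expectation(1, 0)]))

# One trash qubit in |0> and one in an equal superposition of |0> and |1>
print(trash_cost([z_expectation(1, 0), z_expectation(s, s)]))
```

Each term $(1-\langle Z_k\rangle)/2$ equals the probability of finding trash qubit $k$ in state 1, so the cost is zero exactly when every trash qubit is in the zero state.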