December

2022

Making academic quantum algorithms automatically executable

Quantum modeling constructs and strong synthesis abilities close the gap between the intent of quantum algorithm designers as published in academic papers, and actual implementations that can be executed on quantum computers, simulators, and resource estimators

Visualization of optimizing quantum circuits from academic papers to implementations using the Classiq Platform's Synthesis Engine and Microsoft's Azure Quantum Resource Estimator.

Introduction

Quantum algorithms have largely been an academic endeavor in the past thirty years. Theoreticians have published well-defined and proven algorithms and have described them in a sound scientific language. This description includes figures, captions, text, equations, and other tools available to scientists to formally describe and prove correctness of their work. Unfortunately, such description relies on mutual understandings between scholars and are based on a formal scientific language. Such language is not formal enough for compilers to understand and produce a quantum-computer-executable description of the quantum algorithm.

In this paper we present new modeling constructs and idioms that result from a thorough review of many quantum algorithms and implementations published over the years. Even in such singular algorithms as Shor’s, Grover’s, or HHL’s – all of them can be stated relatively simply by abstracting out all internal structure – a wealth of complexities arises when looking into the next level of details – the modular exponentiation in Shor’s algorithm, the oracle in Grover’s, and the form of the linear equations in HHL’s. We will follow below the example of Shor’s algorithm, as it is the simplest to comprehend for anyone taking an introductory course to quantum algorithms. Using Microsoft’s Azure Quantum Resource Estimator, we show that performing Shor’s algorithm with four modular additions requires 354,562 physical qubits, and we identify the specific gate counts required. In this example, we will show only the modeling constructs which we found relevant in a wide range of quantum algorithms.

The overall merit of our method is determined by how closely the model resembles the quantum circuit as described by the designer, e.g., in an academic paper, and by the ability of the model to unambiguously capture all formal requirements of the circuit.

The modeling constructs, as well as the synthesis engine able to formally understand the model and produce a concrete, optimized quantum circuit implementation, have all been implemented in Classiq’s quantum algorithm design platform, hence reaching for the first time the ability to create complex quantum algorithms from thought to execution. The quantum algorithms (e.g., implementations of Shor’s algorithm) created by the Classiq platform are by far the most complex and largest quantum algorithms ever created.

Implementation of Shor’s Algorithm

We go over the main concepts and tools used in Classiq’s graphical modeling to generate a circuit implementing Shor’s algorithm factoring a number N represented by n bits. The circuit follows the implementation of Beauregard (https://arxiv.org/pdf/quant-ph/0205095 [1]) and requires 4n+2 qubits. The difference between the two circuits is that the original circuit of Beauregard implements the final inverse QFT circuit on a single qubit by measuring and re-preparing a single qubit 2n times (hence requiring 2n+3 qubits), whereas in our implementation the “regular” QFT on 2n qubits is used.

The circuit implements the quantum part of Shor’s algorithm. Recall that in Shor’s algorithm to find a factor of a given n-qubit number N, a random number a which is smaller than N is selected and gcd(a,N) is calculated, where gcd denotes the greatest common denominator. If gcd(a,N) >1, then we are done - gcd(a,N) is a factor of N. If not, we carry out the quantum part of the algorithm – the main part of which is calculating $|a^xmodN\rangle$ (the quantum part is followed by additional classical post processing of the measured outcome).

To construct such a circuit for any given n, one has to be able to express the circuit hierarchically starting from the basic quantum building blocks at the lowest level (the quantum building blocks are basic gates and functions which are recognized by the synthesis engine) up to the entire circuit at the top level. Additionally, one needs to be able to work with registers rather than single qubits - that is, apply gates to registers while keeping the ability to slice registers and apply gates to selected qubits. The tools which allow such modeling of a circuit are demonstrated in each of the main hierarchies/layers of the circuit which are:

· A modular adder in the Fourier space constructed from basic gates (mainly phase, controlled-phase and doubly controlled-phase gates) and QFT.

· Modular multiplication constructed from n modular adders.

· Modular exponentiation constructed from n modular multiplication circuits.

The Controlled Modular Adder Circuit

In the modular adder, the number a is added to an (n+1) qubit quantum register (the b register). The input state of the register is where the most significant bit (MSB) of b has value zero. This qubit is used as an overflow bit. Additionally, the circuit includes two control bits and an auxiliary bit which is zero at the input and output. The modular adder is given by the following circuit. The thick black bar distinguishes between the adder (bar on the right) and its inverse implementing subtraction (bar on the left):

Fig. 1: Controlled modular adder. Taken from [1].

For c₁=c₂=1, the circuit operation is the following (for other values of c_i, the controlled operations are not activated and the rest of the operations cancel out).

o Sub-circuit A: Adds a to the (n+1) qubit b register carrying $\Phi(b)$.

o Sub-circuit B: Subtracts N from a+b and then adds N again if N < a+b – this is done by checking the MSB of a+b-N.

$$|1,1,\Phi (b),0\rangle \rightarrow |1,1,\Phi (b+a),0\rangle$$

Note that at this point the b register is in the state (a+b)mod N, however, in general, it is entangled with the auxiliary qubit.

o Sub-circuit C: Resets the last/bottom auxiliary qubit to $|0\rangle$(disentangling it from the b-register) by subtracting a from the register and checking the most significant bit of the b-register (flipping the auxiliary bit if necessary) and adding a back to the register.

The operation of the entire circuit on the ‘legal’ input states (where the MSB of b is zero) is the required:

$$|1,1,\Phi (b),0\rangle \rightarrow |1,1,\Phi ((b+a) mod N),0\rangle .$$

The main building block of the modular adder is an adder circuit (in the Fourier basis), which adds the number a (smaller than N) to a (n+1) qubit quantum register, and its inverse (with and without controls). The (uncontrolled) adder is constructed solely from single qubit phase gates (here - $|\Phi(b)\rangle = QFT|b\rangle $):

Fig. 2: Uncontrolled adder in Fourier space. Taken from [1].

The number a here is “classical” in the sense that it is not carried by a quantum register but encoded in the phases. In our platform, the adder is realized straightforwardly by the cascade modeling tool with which an operation is duplicated vertically: it is applied to all qubits in a register, as shown below (the reason it is called cascade and not, e.g., forall will become apparent later).

Fig. 3: A phase gate applied to all qubits in register.

The input and output above are W-qubit registers with W, a parameter set from above in the hierarchy, and the cascade instruction applies the single-qubit phase gate to each qubit in the register. A different phase is introduced to each of the qubits. These phases are introduced into the circuit by the user as part of the modeling. The formal port names of the gates are target_in and target_out.

Another building block used in the modular adder subcircuit (Fig. 1) is the controlled and doubly controlled adder, where the phase gates are replaced by controlled-phase (CPhase) gates. In this gate, a single qubit controls the phases of the target qubits. To apply a gate repeatedly to the same register, we use the instances instruction. In implementing the controlled adder, however the phases are introduced sequentially to all the qubits in the target register, therefore one must use both cascade (repetition on qubits) and instances (repetition of gates) together.

Fig. 4: Phase gates controlled by the same qubit.

The single qubit in the control register takes part in all the of W CPhase gates in the sequence, whereas each of the W qubits in the target register takes part in exactly one gate.

The entire scheme realizing the modular adder circuit of Fig. 1 is shown below. Notice how similar this circuit is to the academic drawing of Fig. 1, with the main difference, of course, that this circuit is fully executable and optimizable to the best concrete implementation through the synthesis engine.

Fig. 5: Fully editable, executable, and optimizable version of the controlled modular adder of Fig. 1. Some of the blocks (e.g., the multi-controlled/multi-target phase gate - MCPhase) are not described but are analogous to the ones described above.

The circuit consists of five adders (including controlled and inverse adders) and two “twoQFT” functions, which implement the two constructions in the circuit above, checking the MSB qubit in the b register and inverting the auxiliary qubit when necessary (each such construction includes two QFT functions taking the b register to and from the Fourier space). This naming convention is demonstrated in Fig. 5 and explained in the legend below.

The Controlled Modular Multiplier Circuit

The circuit implementing modular multiplication can be constructed from repeating the modular adder function n times as shown below:

Fig. 6: Controlled modular multiplier. Taken from [1].

Note that in Fig. 6, the controls on the $\Phi$ADD(2^j )mod N are not completely ‘legal’ (although it is graphically well defined in [1] as the circuit in Fig. 1), since the control is not applied to all the gates within the $\Phi$ADD(2^j )mod N subcircuit and the correct controlled evolution is obtained only for ‘legal’ inputs where the overflow bit is 0.

The entire operation of the n modular adders is carried out in Fourier space, therefore additional QFT and inverse QFT are required in and out of the Fourier space. This circuit can be straightforwardly implemented by the graphical modeler as shown below:

Note that the modular adder circuit (Fig. 5) receives two single qubit registers as controls, while at the upper multiplication layer (Fig. 7) the c2 register includes n qubits which are then cascaded at the lower modular addition layer.

The modular exponentiation is implemented by repeating modular multiplication 2n times for different powers of a. In the modular multiplication circuit, register b is initially at state 0, and in order to reuse it in each of the 2n modular multiplications, it should be sent back to 0. This can be done by applying a controlled swap between registers b and x and then applying the inverse modular multiplication for the inverse value of the suitable power of a.

Modular Exponentiation

Modular exponentiation is carried out by inputting the number 1 in register b, preparing a top 2n qubit register (by a Hadamard transform), and directly applying the pair of modular multiplication construction above 2n times - cascading the top 2n qubit register such that each qubit in the register controls the suitable modular multiplication operation and applying an inverse QFT to the top register (the outcome of the quantum part is obtained by measuring this register). The overall Shor’s algorithm, for which its main modeling part is modular exponentiation, is shown in Fig. 8.

Synthesis

Beyond the scope of this writeup is the synthesis engine which takes the model described above as input and produces an optimal implementation circuit conforming to the functional model. We only mention here that for each functional block in the model, many functionally-equivalent implementations exist (e.g., manifesting the tradeoffs between number of auxiliary qubits implementing the block, its depth, the number of 2-qubit gates it contains, and its overall accuracy). Finding a best implementation for the overall circuit under a budget of qubits, depth, runtime, overall fidelity, algorithmic accuracy, and overall objective is the task of the synthesis engine. We also note that optimization at this functional level is much more powerful than optimization at the transpiler level, as the transpiler must keep the low-level semantics of the circuit intact – as opposed to preserving only the functional semantics of the circuit when having access to these semantics. A final implementation of Shor’s algorithm is shown in Fig. 9.

Resource Estimation for a Fault-Tolerant Realization of Shor’s Circuit

Circuits implementing modular exponentiation and Shor’s algorithm, as is the above implementation, require significant resources in terms of the number of gates and the depth of the circuit. Consequently, a realization of such circuits would necessarily require the use of error correcting codes, which would increase the required resources by orders of magnitude.

A recently available tool for resource estimation of fault-tolerant implementation of quantum algorithms – the Azure Quantum Resource Estimator – was applied to the above circuit allowing us to compare between the resources required for ‘regular’ noisy realization of the circuit and its fault-tolerant one. A surface code was used in this analysis.

Resources Required for the Regular Implementation

Our implementation of the Beauregard circuit [1] requires 4n+2 qubits. The circuit includes 4n² modular adder subcircuits (including the inverse circuits). Each modular adder includes n+1 phase gates, n+1 cphase gates, 3(n+1) ccphase gates and additionally 4 QFT (and inverse QFT) functions. Since each QFT includes n(n+1)/2 cphase gates and n Hadamard gates, the entire circuit includes:

o $O(n^4)$ cphase gates

o $O(n^3)$ ccphase, phase, and Hadamard gates

o $O(n^2)$ ccx, cx, and x gates.

Azure Resource Estimator Results for n=4

Using the Classiq graphical modeling and synthesis engine, the Shor implementation was generated for n=4 (requiring 18 qubits in total) in QIR format. The circuit included the following phase gates:

o 3,244 cphase gates

o 920 ccphase gates

o 320 phase gates.

Azure Resource Estimator was applied to the circuit with default arguments – assuming 0.001 error rate for one and two qubit gates as well as T-gates and one qubit measurements. The required total overall error of the circuit (total error budget) was 0.33.

According to the resource estimator, to meet this error budget, the required surface-code distance is d=13 and the required error rate for logical qubits is $5.1 * 10^{-9}$. Additionally, 25 T-state factories (working in parallel) are required. The error rate required for the T-states is $1.35 * 10^{-7}$.

The number of physical qubits for a logical qubit is given by 2d² = 338. The total number of logical qubits in a fault-tolerant circuit is 16,562 (a planar layout of the original circuit requires 49 qubits each requiring 338 physical qubits). These qubits carry the information throughout the computation. In addition, many more qubits are required for generating and distilling T-states by the T-factories. These T-states are used whenever a T-gate is applied in the fault-tolerant circuit (that is the T-states interact and get entangled with the logical circuit qubits and then measured disentangling them from the logical circuit qubits). To distill T-states with the required error rate, 13,520 physical qubits are required per T-state. Thus, the total number of physical qubits required for the T-states is 338,000.

The total number of physical qubits (T-state qubits + circuit qubits) is 354,562. Note that the T-state qubits are reused many times during the application of the circuit.

An important point according to the resource estimator results is that a fault tolerant circuit would include 50,471 single qubit rotations, originating from the various phase gates (including cphase and ccphase) in the original circuit. These single qubit rotations would consume almost all of the T-states required for the entire circuit. This results suggests that it would be beneficial to reduce the number of the various phase gates in the circuit. This can be achieved for, example, by using the approximate QFT functions (instead of the exactQFT) in the circuit. Alternatively, a different implementation of Shor’s algorithm, which is not based on addition in the Fourier base but on other types of Adder circuits, may also be advantageous.

Summary

We have shown how academic descriptions of quantum algorithms can be seamlessly modeled in a formal way, while keeping the structure and way of thinking of the scientifically presented algorithm. In this writeup we discussed two main modeling constructs which we found ubiquitous in most quantum algorithms: cascade and instances. We found that an order of around ten such constructs, all simple to understand, cover almost any algorithm published to date. Having such modeling constructs allows us to close the gap between scientific presentation of algorithms and actual execution on hardware and simulators - coupling algorithmic tools, such as synthesizers for building concrete circuits out of the quantum algorithm model, with resource estimators, such as the recently presented Microsoft tool.

Quantum modeling constructs and strong synthesis abilities close the gap between the intent of quantum algorithm designers as published in academic papers, and actual implementations that can be executed on quantum computers, simulators, and resource estimators

Introduction