Unary Iteration

David Wierichs

Compilation/
Unary Iteration

Unary Iteration

Here we detail the construction of a unary iterator by building it up from three pieces: computing AND operations of a collection of control qubits using auxiliary qubits, cancelling neighbouring operations between repeated computations of AND between similar control values to effectively create a cache, and a cheaper decomposition for qml.TemporaryAND, or elbow, gates.

Compute AND values of controls on auxiliary qubits

The first step towards constructing the unary iterator is to compute AND between multiple control qubits reversibly, storing the output on auxiliary qubits. For this, we can start by computing AND between the first two control qubits, say ["c_0", "c_1"], and storing the result on a first zeroed auxiliary qubit, say "aux_0". This can be done using a qml.Toffoli gate:

  c_0: ─╭●─┤
  c_1: ─├●─┤
aux_0: ─╰X─┤.

We can check that the auxiliary qubit now stores a 1 exactly if both control qubits were in state $|1\rangle$ , as required for an AND gate.

For subsequent control qubits, we combine each of them with the output of the previous chain of AND operations by applying a Toffoli to the output ("aux_i-1"), the new control qubit ("c_i+1"), and a new auxiliary qubit ("aux_i"):

         ─:─────┤
aux_i-1: ─╰X─╭●─┤
  c_i+1: ────├●─┤
  aux_i: ────╰X─┤.

Chaining these Toffoli operators together in this way, we get the following structure for $c=5$ control qubits:

  c_0: ─╭●──────────┤
  c_1: ─├●──────────┤
aux_0: ─╰X─╭●───────┤
  c_2: ────├●───────┤
aux_1: ────╰X─╭●────┤
  c_3: ───────├●────┤
aux_2: ───────╰X─╭●─┤
  c_4: ──────────├●─┤
aux_3: ──────────╰X─┤.

Afterwards, the final auxiliary qubit "aux_3" carries the value $\prod_{i=0}^{c-1} x_i$ , where $x_i$ are the values of the control qubits, so that we can control a target operator on this single qubit to realize a multi-controlled operation $U$ . Afterwards, we may uncompute the chain of AND operations. Overall, this leads to the following decomposition

   c_0: ─╭●─┤        c_0: ─╭●──────────────────────╭●─┤
   c_1: ─├●─┤        c_1: ─├●──────────────────────├●─┤
 aux_0: ─│──┤      aux_0: ─╰X─╭●────────────────╭●─╰X─┤
   c_2: ─├●─┤        c_2: ────├●────────────────├●────┤
 aux_1: ─│──┤      aux_1: ────╰X─╭●──────────╭●─╰X────┤
   c_3: ─├●─┤  =     c_3: ───────├●──────────├●───────┤
 aux_2: ─│──┤      aux_2: ───────╰X─╭●────╭●─╰X───────┤
   c_4: ─├●─┤        c_4: ──────────├●────├●──────────┤
 aux_3: ─│──┤      aux_3: ──────────╰X─╭●─╰X──────────┤
target: ─╰U─┤     target: ─────────────╰U─────────────┤.

For control nodes controlling on the $|0\rangle$ state (○) instead of the $|1\rangle$ state (●), the corresponding nodes of the Toffoli gates are flipped.

Cache AND values

The next step is to simplify the circuit structure when very similar multi-control operators are executed in sequence. As an example, consider the first two operators appearing in a Select pattern with $c=3$ :

   c_0: ─╭○──╭○──┤          c_0: ─╭○───────────╭○─╭○───────────╭○─┤
   c_1: ─├○──├○──┤          c_1: ─├○───────────├○─├○───────────├○─┤
 aux_0: ─│───│───┤   =    aux_0: ─╰X─╭●─────╭●─╰X─╰X─╭●─────╭●─╰X─┤
   c_2: ─├○──├●──┤          c_2: ────├○─────├○───────├●─────├●────┤
 aux_1: ─│───│───┤        aux_1: ────╰X─╭●──╰X───────╰X─╭●──╰X────┤
target: ─╰U0─╰U1─┤       target: ───────╰U0─────────────╰U1───────┤.

As the Toffoli gate, and its variant with flipped control nodes, are self-inverse, we may remove pairs acting on the same qubits (potentially in an iterative manner) between the uncomputation for $U_0$ and the re-computation for $U_1$ :

   c_0: ─╭○──╭○──┤         c_0: ─╭○─────────────────────╭○─┤
   c_1: ─├○──├○──┤         c_1: ─├○─────────────────────├○─┤
 aux_0: ─│───│───┤   =   aux_0: ─╰X─╭●─────╭●─╭●─────╭●─╰X─┤
   c_2: ─├○──├●──┤         c_2: ────├○─────├○─├●─────├●────┤
 aux_1: ─│───│───┤       aux_1: ────╰X─╭●──╰X─╰X─╭●──╰X────┤
target: ─╰U0─╰U1─┤      target: ───────╰U0───────╰U1───────┤.

Now we may use the following identity to simplify the two Toffoli gates with differing control nodes:

─╭●─╭●─   ─╭●─
─├○─├●─ = ─│──   (1),
─╰X─╰X─   ─╰X─

which follows from the lazy Select rule. For all control qubits after "c_1", this is the only combination we will ever see, because the control nodes on the auxiliary wires are always controlling on $|1\rangle$ . For control qubits "c_0" and "c_1", however, we will also see combinations where the first node controls on $|0\rangle$ . It will thus also be handy to know the rules

─╭○─╭○─   ─╭○─         ─╭○─╭●─   ─╭○─╭●─╭●─╭●─   ────╭●─
─├○─├●─ = ─│──         ─├●─├○─ = ─├●─├●─├●─├○─ = ─╭●─│──
─╰X─╰X─   ─╰X─,  and   ─╰X─╰X─   ─╰X─╰X─╰X─╰X─   ─╰X─╰X─,

where the first is analogous to (1) above, and the second inserts an identity in the form of two Toffoli gates and then applies (1) to the two resulting pairs of gates.

For a full Select pattern with $c=3$ and $K=8$ , i.e.,

   c_0: ─╭○──╭○──╭○──╭○──╭●──╭●──╭●──╭●──┤
   c_1: ─├○──├○──├●──├●──├○──├○──├●──├●──┤
 aux_0: ─│───│───│───│───│───│───│───│───┤
   c_2: ─├○──├●──├○──├●──├○──├●──├○──├●──┤
 aux_1: ─│───│───│───│───│───│───│───│───┤
target: ─╰U0─╰U1─╰U2─╰U3─╰U4─╰U5─╰U6─╰U7─┤,

we find the reduced decomposition

   c_0: ─╭○──────────────────╭○─────────────────────╭●──────────────────╭●──────────────────╭●─┤
   c_1: ─├○──────────────────│───────────────────╭●─│───────────────────│───────────────────├●─┤
 aux_0: ─╰X─╭●─────╭●─────╭●─╰X─╭●─────╭●─────╭●─╰X─╰X─╭●─────╭●─────╭●─╰X─╭●─────╭●─────╭●─╰X─┤
   c_2: ────├○─────│──────├●────├○─────│──────├●───────├○─────│──────├●────├○─────│──────├●────┤
 aux_1: ────╰X─╭●──╰X─╭●──╰X────╰X─╭●──╰X─╭●──╰X───────╰X─╭●──╰X─╭●──╰X────╰X─╭●──╰X─╭●──╰X────┤
target: ───────╰U0────╰U1──────────╰U2────╰U3─────────────╰U4────╰U5──────────╰U6────╰U7───────┤.

As we can see, many Toffoli gates cancelled and the savings grow proportionally for larger Select patterns.

Decompose elbow gates into fewer `T` gates

In this last step, we apply a special decomposition to the Toffoli gates in the control structure that we obtained above. This decomposition exploits the fact that these gates are in fact TemporaryAND, or elbow, gates, which have cheaper decompositions due to the fixed input/output state on the target qubit. For the left elbow, the decomposition is given by

    ─╭●─┤     ─╭●─┤         ──────────────╭●──────────────────┤
    ─├●─┤  =  ─├●─┤  =      ───────╭●─────│─────╭●────────────┤
|0> ─╰X─┤     ─╰──┤     |0> ──H──T─╰X──T†─╰X──T─╰X──T†──H──S†─┤,

with $4$ T gates and for the right elbow we don't need any T gates at all:

─╭●─┤        ─●╮─┤     ─────────╭●────┤
─├●─┤    =   ─●┤─┤  =  ─────────╰Z────┤
─╰X─┤|0>     ──╯─┤     ──H──┤↗├──║──X─┤ |0>.
                             ╚═══╩══╝

With this, we are done constructing the unary iterator. The cheaper decomposition for elbow gates is typically not applied to the full circuit to maintain readability, but the Toffoli gates may be replaced by the elbow symbol (without X on the target) to indicate the reduced cost upon decomposition:

   c_0: ─╭○──────────────────╭○─────────────────────╭●──────────────────╭●──────────────────●╮─┤
   c_1: ─├○──────────────────│───────────────────╭●─│───────────────────│───────────────────●┤─┤
 aux_0: ─╰──╭●─────╭●─────●╮─╰X─╭●─────╭●─────●╮─╰X─╰X─╭●─────╭●─────●╮─╰X─╭●─────╭●─────●╮──╯─┤
   c_2: ────├○─────│──────●┤────├○─────│──────●┤───────├○─────│──────●┤────├○─────│──────●┤────┤
 aux_1: ────╰──╭●──╰X─╭●───╯────╰──╭●──╰X─╭●───╯───────╰──╭●──╰X─╭●───╯────╰──╭●──╰X─╭●───╯────┤
target: ───────╰U0────╰U1──────────╰U2────╰U3─────────────╰U4────╰U5──────────╰U6────╰U7───────┤.

About

Software & Documentation

Resources

Topics

Community & Support

Unary Iteration

Compute AND values of controls on auxiliary qubits

Cache AND values

Decompose elbow gates into fewer `T` gates

Unary Iteration

Compute AND values of controls on auxiliary qubits

Cache AND values

Decompose elbow gates into fewer T gates

Decompose elbow gates into fewer `T` gates