- Compilation/
Unary Iteration
Unary Iteration
Here we detail the construction of a unary iterator by building it up from three pieces:
computing AND
operations of a collection of control qubits using auxiliary qubits,
cancelling neighbouring operations between repeated computations of AND
between
similar control values to effectively create a cache, and a cheaper decomposition for
qml.TemporaryAND
, or elbow, gates.
Compute AND values of controls on auxiliary qubits
The first step towards constructing the unary iterator is to compute AND
between multiple control
qubits reversibly, storing the output on auxiliary qubits. For this, we can start by computing
AND
between the first two control qubits, say ["c_0", "c_1"]
, and storing the result on
a first zeroed auxiliary qubit, say "aux_0"
. This can be done using a
qml.Toffoli
gate:
c_0: ─╭●─┤
c_1: ─├●─┤
aux_0: ─╰X─┤.
We can check that the auxiliary qubit now stores a 1
exactly if both control qubits were
in state |1\rangle, as required for an AND
gate.
For subsequent control qubits, we combine each of them with the output of the previous chain
of AND
operations by applying a Toffoli
to the output ("aux_i-1"
), the new control qubit ("c_i+1"
), and a new auxiliary qubit ("aux_i"
):
─:─────┤
aux_i-1: ─╰X─╭●─┤
c_i+1: ────├●─┤
aux_i: ────╰X─┤.
Chaining these Toffoli
operators together in this way, we get the following structure for
c=5 control qubits:
c_0: ─╭●──────────┤
c_1: ─├●──────────┤
aux_0: ─╰X─╭●───────┤
c_2: ────├●───────┤
aux_1: ────╰X─╭●────┤
c_3: ───────├●────┤
aux_2: ───────╰X─╭●─┤
c_4: ──────────├●─┤
aux_3: ──────────╰X─┤.
Afterwards, the final auxiliary qubit "aux_3"
carries the value \prod_{i=0}^{c-1} x_i, where
x_i are the values of the control qubits, so that we can control a target operator on this
single qubit to realize a multi-controlled operation U. Afterwards, we may uncompute the chain
of AND
operations. Overall, this leads to the following decomposition
c_0: ─╭●─┤ c_0: ─╭●──────────────────────╭●─┤
c_1: ─├●─┤ c_1: ─├●──────────────────────├●─┤
aux_0: ─│──┤ aux_0: ─╰X─╭●────────────────╭●─╰X─┤
c_2: ─├●─┤ c_2: ────├●────────────────├●────┤
aux_1: ─│──┤ aux_1: ────╰X─╭●──────────╭●─╰X────┤
c_3: ─├●─┤ = c_3: ───────├●──────────├●───────┤
aux_2: ─│──┤ aux_2: ───────╰X─╭●────╭●─╰X───────┤
c_4: ─├●─┤ c_4: ──────────├●────├●──────────┤
aux_3: ─│──┤ aux_3: ──────────╰X─╭●─╰X──────────┤
target: ─╰U─┤ target: ─────────────╰U─────────────┤.
For control nodes controlling on the |0\rangle state (○
) instead of the |1\rangle
state (●
), the corresponding nodes of the Toffoli
gates are flipped.
Cache AND values
The next step is to simplify the circuit structure when very similar multi-control operators are
executed in sequence. As an example, consider the first two operators appearing in a Select
pattern with c=3:
c_0: ─╭○──╭○──┤ c_0: ─╭○───────────╭○─╭○───────────╭○─┤
c_1: ─├○──├○──┤ c_1: ─├○───────────├○─├○───────────├○─┤
aux_0: ─│───│───┤ = aux_0: ─╰X─╭●─────╭●─╰X─╰X─╭●─────╭●─╰X─┤
c_2: ─├○──├●──┤ c_2: ────├○─────├○───────├●─────├●────┤
aux_1: ─│───│───┤ aux_1: ────╰X─╭●──╰X───────╰X─╭●──╰X────┤
target: ─╰U0─╰U1─┤ target: ───────╰U0─────────────╰U1───────┤.
As the Toffoli
gate, and its variant with flipped control nodes, are self-inverse,
we may remove pairs acting on the same qubits (potentially in an iterative manner)
between the uncomputation for U_0 and the re-computation for U_1:
c_0: ─╭○──╭○──┤ c_0: ─╭○─────────────────────╭○─┤
c_1: ─├○──├○──┤ c_1: ─├○─────────────────────├○─┤
aux_0: ─│───│───┤ = aux_0: ─╰X─╭●─────╭●─╭●─────╭●─╰X─┤
c_2: ─├○──├●──┤ c_2: ────├○─────├○─├●─────├●────┤
aux_1: ─│───│───┤ aux_1: ────╰X─╭●──╰X─╰X─╭●──╰X────┤
target: ─╰U0─╰U1─┤ target: ───────╰U0───────╰U1───────┤.
Now we may use the following identity to simplify the two Toffoli
gates with differing
control nodes:
─╭●─╭●─ ─╭●─
─├○─├●─ = ─│── (1),
─╰X─╰X─ ─╰X─
which follows from the lazy Select rule.
For all control qubits after "c_1"
, this is the only combination we will ever see, because the
control nodes on the auxiliary wires are always controlling on |1\rangle.
For control qubits "c_0"
and "c_1"
, however, we will also see combinations where the first
node controls on |0\rangle. It will thus also be handy to know the rules
─╭○─╭○─ ─╭○─ ─╭○─╭●─ ─╭○─╭●─╭●─╭●─ ────╭●─
─├○─├●─ = ─│── ─├●─├○─ = ─├●─├●─├●─├○─ = ─╭●─│──
─╰X─╰X─ ─╰X─, and ─╰X─╰X─ ─╰X─╰X─╰X─╰X─ ─╰X─╰X─,
where the first is analogous to (1) above, and the second inserts an identity in the form
of two Toffoli
gates and then applies (1) to the two resulting pairs of gates.
For a full Select
pattern with c=3 and K=8, i.e.,
c_0: ─╭○──╭○──╭○──╭○──╭●──╭●──╭●──╭●──┤
c_1: ─├○──├○──├●──├●──├○──├○──├●──├●──┤
aux_0: ─│───│───│───│───│───│───│───│───┤
c_2: ─├○──├●──├○──├●──├○──├●──├○──├●──┤
aux_1: ─│───│───│───│───│───│───│───│───┤
target: ─╰U0─╰U1─╰U2─╰U3─╰U4─╰U5─╰U6─╰U7─┤,
we find the reduced decomposition
c_0: ─╭○──────────────────╭○─────────────────────╭●──────────────────╭●──────────────────╭●─┤
c_1: ─├○──────────────────│───────────────────╭●─│───────────────────│───────────────────├●─┤
aux_0: ─╰X─╭●─────╭●─────╭●─╰X─╭●─────╭●─────╭●─╰X─╰X─╭●─────╭●─────╭●─╰X─╭●─────╭●─────╭●─╰X─┤
c_2: ────├○─────│──────├●────├○─────│──────├●───────├○─────│──────├●────├○─────│──────├●────┤
aux_1: ────╰X─╭●──╰X─╭●──╰X────╰X─╭●──╰X─╭●──╰X───────╰X─╭●──╰X─╭●──╰X────╰X─╭●──╰X─╭●──╰X────┤
target: ───────╰U0────╰U1──────────╰U2────╰U3─────────────╰U4────╰U5──────────╰U6────╰U7───────┤.
As we can see, many Toffoli
gates cancelled and the savings grow proportionally
for larger Select
patterns.
Decompose elbow gates into fewer T
gates
In this last step, we apply a special decomposition to the Toffoli
gates in the control structure
that we obtained above. This decomposition exploits the fact that these gates are in fact TemporaryAND
, or
elbow, gates, which have cheaper decompositions due to the fixed input/output state on the target
qubit. For the left elbow, the decomposition is given by
─╭●─┤ ─╭●─┤ ──────────────╭●──────────────────┤
─├●─┤ = ─├●─┤ = ───────╭●─────│─────╭●────────────┤
|0> ─╰X─┤ ─╰──┤ |0> ──H──T─╰X──T†─╰X──T─╰X──T†──H──S†─┤,
with 4 T
gates and for the right elbow we don't need any T
gates at all:
─╭●─┤ ─●╮─┤ ─────────╭●────┤
─├●─┤ = ─●┤─┤ = ─────────╰Z────┤
─╰X─┤|0> ──╯─┤ ──H──┤↗├──║──X─┤ |0>.
╚═══╩══╝
With this, we are done constructing the unary iterator. The cheaper decomposition for elbow gates
is typically not applied to the full circuit to maintain readability, but the Toffoli
gates
may be replaced by the elbow symbol (without X
on the target) to indicate the reduced cost
upon decomposition:
c_0: ─╭○──────────────────╭○─────────────────────╭●──────────────────╭●──────────────────●╮─┤
c_1: ─├○──────────────────│───────────────────╭●─│───────────────────│───────────────────●┤─┤
aux_0: ─╰──╭●─────╭●─────●╮─╰X─╭●─────╭●─────●╮─╰X─╰X─╭●─────╭●─────●╮─╰X─╭●─────╭●─────●╮──╯─┤
c_2: ────├○─────│──────●┤────├○─────│──────●┤───────├○─────│──────●┤────├○─────│──────●┤────┤
aux_1: ────╰──╭●──╰X─╭●───╯────╰──╭●──╰X─╭●───╯───────╰──╭●──╰X─╭●───╯────╰──╭●──╰X─╭●───╯────┤
target: ───────╰U0────╰U1──────────╰U2────╰U3─────────────╰U4────╰U5──────────╰U6────╰U7───────┤.