Hello once more to the Coder’s Errand. Today’s post returns to the details of Quadratic Arithmetic Programs, highlighting the role of the vanishing polynomial.
In this endless journey through maths, crypto, coding and whatnot, I delve once again among the ZK-SNARK hills. This post is the result of about a week of frustrating debugging, trying to make DIZK correctly compute the Quadratic Arithmetic Program (QAP). It turned out to be a red herring, but directed me to study the concept of vanishing polynomial, which I did not understand in detail. This post makes clear what it is, what is its role in the ZK-SNARK construction, how it is computed in DIZK and how that computation differs so much from the simplified interpretation you’re likely to get if you read about it in Vitalik Buterin’s paper on Quadratic Arithmetic Programs (that I have often cited here).
Why the Vanishing Polynomial?
This mini-quest started with the following difficulty. The test for the porting of PGHR13 to DIZK started with a fixed Rank-1 Constraint System for a very simple proof (“show I know two factors of a third number”) and a corresponding witness evaluated by hand, created a QAP with the polynomials evaluated at a fixed point (for easy comparison between different libraries), generated the keys, proof and verified it. It failed in the last step, where the following evaluation simply did not succeed, for a very specific t
:
<A(t).s> * <B(t).s> - <C(t).s> = H(t) * Z(t).
A Brief Recap
You may remember this form from earlier posts, but let me recap if not. In a ZK-SNARK proof, we verify a computation step by step. To do so, this computation is first turned into a circuit, then each of its wires is assigned a value that results from feeding a specific input to the circuit. Each computing node of the circuit (called a “gate” in analogy to the nomenclature of electronic circuits) is transformed into a constraint, that verifies the output wire has the value it should have for the values assigned to the input wires.
For example, if in the assignment to your circuit you had the values 2 and 3 coming to a multiplication gate and the value 10 assigned to the output wire, you’d have found a fishy step in the proof. That computation would not be correct, and you, as a verifier, would not be convinced and should reject the proof.
Now, it is very expensive in general to verify the constraints for all the gates individually, and so we encode all of them in 3 vectors of polynomials A
,B
and C
. The constraints can then all be verified at once if a linear combination of these three vectors can be evenly divided by a vanishing polynomial Z
.
The above expression shows precisely this test. The letter s
is the computation’s witness, that represents a full assignment of all the circuit’s wires, and the left-hand side just shows it being mixed with all the constraints at the same time, defining the linear combination of polynomials. The equality to the right hand side then expresses that this is a multiple of Z
.
Hunting the Error
Given that the process had all been correct and there were several other tests validating each individual step, we were all baffled why this particular test would fail. There could be several points causing the error:
- we could have the wrong values in
A(t)
,B(t)
orC(t)
- we could be adding too many or too few terms in the dot products
- we might have computed the wrong
H(t)
- or
Z(t)
could simply be wrong.
None of these things should be failing, to be honest. We were using DIZK to
- evaluate
A(t)
,B(t)
,C(t)
andZ(t)
; - to compute
H(t)
from these known values; and - to perform the inner products.
Unless we had missed something in the arguments, every thing should work out. But because we didn’t know where the problem was, I focused my attention on specific details.
One thing that surprised me is that in DIZK Z(t)
is computed so easily and the result is so simple, so different from what Libsnark does. For DIZK,
Z(t) = td - 1.
Impressively simple. When looking at the complicated Libsnark code for the same function, we looked at each other and said: it simply can’t be just this, can it?
FFT and the Roots of Unity
This, dear reader, was when I felt like going into a sub-quest inside the sub-quest. After the twists and turns to find DIZK, to then implement each section of the pipeline, inspect the intricacies of calculating a QAP keeping the FFT monster as much away as possible (because none of us really knew what it was doing), I finally had to peep into its lair and understand the rationale for this minute calculation. I managed to get in without disturbing the FFT’s slumber and steal the secret treasure. In other words, I understood the calculation but I still do not understand the FFT. I did not try, it was not needed. I will when I absolutely must. For now, I’ll just explain what I found, so bear with me.
FFT stands for Fast Fourier Transform. It has loads of applications in engineering, but in algebra it is used for example to multiply polynomials. As you can tell from the above expression, we have just the right problem for it.
The FFT Object in DIZK
The first thing DIZK does is to create an FFT object with a domain where the FFT is going to be evaluated. This is close in size to the degree of our polynomial QAP. In particular, this is the smallest power of two greater or equal to that degree. This domain is the set of all the points at which the QAP polynomials generate the Rank-1 constraints. To finish the initialization of its FFT object, DIZK computes a d
-th root of unity. This sounds almost like algebraic jargon, so let me express it in lighter words:
a
d
-th root of unity is a valuer
such thatrd = 1.
Remember we are working in a finite field, and so, unlike the reals, this does not mean that r = 1
. In fact, by a standard algebra result, we are guaranteed to have d
such roots, and all easily related! If b
is a d
-th root of unity, the other roots are
{bi}, i = {0, ..., d-1}.
The proof is not difficult, and in turn will give us insight into the computation of Z(t)
, so let me give it.
All roots of unity are powers of each other
Because b
is a d
-th root of unity, we know that bd = 1
, and that bi != 1
for every 0 <= i < d
. Otherwise, b
would be a root of some order smaller than d
instead.
Also, all the roots must be distinct, for if that would not be the case, we would have two roots bi
and bj
, with i < j < d
, such that
bi = bj.
But then, bj / bi = 1
, and since bj / bi = b(j-i)
, we would have that b
was a root of order j - i < d
, which contradicts our initial assumption.
Computing the Vanishing Polynomial
So, we just found we have exactly d
solutions to the equation:
xd = 1
What that means, is that these solutions (b1, ..., bd
) are all the zeroes of the polynomial
xd -1
and so we can write
xd - 1 = (x - b1) (x - b2) ... (x - bd)
This is the explanation of why DIZK computes Z(t) = td - 1
. Implicitly, they simply find some d
-th root of unity, then compute the others and multiply their monomials. In fact, they just apply the above fact to shortcut the computation, while making sure the polynomials in A
,B
and C
are computed over the same domain. Very clever!
Incidentally, when I read Vitalik’s post, I had come away with the simplified impression that the points where the polynomial produces the constraints were simply 1, 2, 3,
etc. It could be, we only require any d
different points, and any library author is free to choose them as seems most suitable. But the fact is that in my previous mind map, there just was no way to marry a vanishing polynomial zero-ing at just the first integers with the simple formula above.
In the End
It was very satisfying to understand such an elegant relation to compute the vanishing polynomial. And I was able to follow up with tests that showed the rest of the QAP was correctly computed, and rule out all of the above possible justifications for the failure in the final validation.
But this test was run in special circumstances. I had fixed all the randomness, to always return predictable results. And I did this inside a test where I wanted to demonstrate that the polynomials correctly encoded each of the constraints. By definition, these polynomials only define constraints at the roots of unity, and so the test only evaluated the polynomials at precisely these places, which happened to special code when computing Lagrange polynomials.
After some persistent debugging I found a bug in DIZK that used a wrong equality test (for the developers among you, ==
instead of .equals()
) and so the calculation of the Lagrange coefficients was wrong, thereby destroying the rest of the computation. And so it was that a spurious error, caused by an error that exercises values we’ll never hit in a normal random test (the probability is just infinitesimally small) ended up revealing a new aspect of algebra to me and enlivening my day. I hope it has done the same for you.
This is enough for today. Hopefully you walk away with some more understanding of the Vanishing polynomial. Please, ‘Like’ if you liked this post. If you have particular doubts, or disagreements, why not share them below?
Until next time.
Have fun.