In the last post, I made an introduction to the concept of Zero Knowledge Proofs, and talked a bit about ZK-STARKs. Today, I’ll dive into the maths in Vitalik Buterin’s first post in a series about the subject. Vitalik does a good job of explaining them, but I feel the need to synthesise the information in there. This post is a personal sketch of reflections on that post, but hopefully it may be helpful for other people trying to understand this concept. I’ll write this post in the role of the Prover, because that makes the exposition clearer.
Purpose: ZK-STARKs are used to prove knowledge of some secret information. This can be seen as a witness that a known public function can have a given value. I’ll fix these for the rest of the post:
- A function
f
is publicly known (for example, authentication to access a given system) - This function could have one or several interesting values, which are publicly known. Let
y
denote one of them. - The verifier does not know if or where
f
can have valuey
. - I want to prove I know a value
x
, without revealing it, such thaty = f(x)
. - The verifier should be able to check
x
has the above property in much less time than it takes to computef(x)
.
Succinctness: The focus of the post is on making proofs verifiable quickly, which is a harder problem than providing the Zero-Knowledge property.
Problem Encoding: The original example in Vitalik’s post, which he calls “somewhat simple”, is a stepping stone for more realistic cases. Instead of demonstrating one value for which f
takes a certain value, the example wants to prove knowledge of a function f
where 1 million known points have a limited range. That is:
- I want to prove that for all
x
betweenx0
andx1
, the value off(x)
is betweeny0
andy1
.
It is crucial, in this example, that the range is limited.
A more plausible example is to prove one knows the 1-millionth Fibonacci number. Although the Fibonacci numbers are not secret and can be computed, computation still expends valuable resources so its knowledge may be valuable for someone who can’t compute it. And of course, you could easily imagine that this statement referred to proving knowledge of Satoshi Nakamoto’s secret key, which would be worth some billions of dollars outright.
The connection between these is that the Fibonacci problem can be encoded by a polynomial, which will have to satisfy some conditions: the problem’s constraints. In this case, we use Fib(x)
to represent the x’th Fibonacci polynomial, and then build a constraint polynomial
C(x1, x2, x3) = x3 - x2 - x1.
Finally, we apply a single constraint here: C(Fib(x), Fib(x-1), Fib(x-2)) = 0
. This is equivalent to the recursive definition of the Fibonacci series:
Fib(x) = Fib(x-1) + Fib(x-2).
Proving knowledge of a given Fib(x)
is a sub-case of proving knowledge of Fib
, provided that this can be efficiently computed.
Vitalik uses the first example to show how one can replace the original problem by an easier one.
Alternative Problem Statement
Proving the original statement may be hard. If the function f
can be represented by a constraint checking polynomial, we can use the following methodology.
In Vitalik’s simple example, the constraints are easy. For all points x
between 1
and 1000000
, we want to prove f(x)
is between 0 and 9. I’ll use a simpler example, and suppose f(x)
must be 1
, 3
or 7
.
We can easily create a polynomial C(x)
that takes the value 0
exactly when x
is one of these values:
C(x) = (x-1) * (x-3) * (x-7)
.
Then, the problem reduces to proving that C(f(x)) = 0
for all x
between 1
and 1000000
.
Notice what we did here:
- we started from the expression
f(x) = 1 OR f(x) = 3 OR f(x) = 7
- we created a polynomial
C(x)
that is0
if and only ifx = 1 OR x = 3 OR x = 7
- we state the original problem by the feeding f to C:
C(f(x)) = 0
for1 <= x <= 1000000
.
The Fibonacci problem follows the same approach, although the constraints are different.
- we start from the expression defining the Fibonacci sequence:
f(x) = f(x-1) +f(x-2)
- we create a polynomial
C(x1,x2,x3)
such thatC(x1,x2,x3) = 0 <=> x3 - x2 - x1
- we now recast the original problem to
C(f(x), f(x-1), f(x-2)) = 0
for allx
.
But because we are specifically interested in the 1000000th term of the series, we only require C to hold up to that element, that is:
C(f(x), f(x-1), f(x-2)) = 0
for 1 <= x <= 1000000f(2) = 1
f(1) = 1
There is a fact I used above without explanation, but that is worth detailing a bit more now. Every polynomial can be written as a product of a scalar and linear factors, (x - a)
, one for each root (a
) of the polynomial. If a polynomial counts x1
, x2
, x3
among its roots, then it must be a product of
(x - x1) * (x - x2) * (x - x3) * D(x)
, where D(x)
is some other polynomial.
When we have a domain-limited constraint as above, we can use this fact and create a polynomial Z(x)
that encodes that domain and write:
C(f(x)) = Z(x) * D(x)
Z(x) = (x - 1) * (x - 2) * ... * (x - 1000000)
Because C
and Z
are public, the proof now amounts to proving knowledge of D
and, in case it is not public, of f
, such that
C(f(x)) = Z(x) * D(x)
Interactive Proof of Polynomial Equality
So far, there does not seem to be much gain in doing all these transformations. We went from proving knowledge of x without revealing it to proving equality of polynomials without evaluating them. This, in itself, is not a trivial problem, since we can’t evaluate the polynomials at all points and we want to be sure they are exactly equal in a large number of them.
To do this, we make an interactive proof between a Prover and a Verifier. The proof will be a repetition of challenge-response rounds. In each round, the Verifier requests that the Prover sends the evaluation of f(x)
and D(x)
for a given point x
.Crucially, the value of x is chosen from a much larger interval than the constraint set above. Why?
There are two properties an interactive proof like this must satisfy:
- Completeness: the Prover must be able to convince the Verifier if the statement is valid
- Soundness: if the statement is false, that is the Prover does not actually know the secret information, then it should not be able to convince the Verifier with more than negligible property.
If the Prover does know f(x)
for all relevant x
, then it will be able to create the proof and convince the Verifier. Otherwise, how far can it go in fooling its counterpart?
If the Prover knew in advance the challenges posed by the Verifier, it’s quite possible it could adapt its answers in order to always succeed. To prevent this, the Prover must bind to its answer to all possible challenges before the “questioning” starts. One way to do this, as used by Vitalik, is to put them all in a Merkle tree and send the root to the Verifier. If the Prover does not know a function f(x)
that satisfies the problem statement, it must nevertheless commit to some values, thus implicitly defining a function f'(x)
in a large domain.
This is a bit of a shaky part in Vitalik’s paper, because he basically postpones a full solution to this problem for later posts of the series. The fundamental issue is the reliance on f
being a polynomial of a particular degree, in the example 1000000. Vitalik then reasons that a malicious Prover would forge a second polynomial, say f'
, of the same degree and would try to compute a polynomial D(x)
such that C(f'(x)) = Z(x) * D(x)
. But the Prover can’t do it, because that would also be a solution for the original statement. So, the Prover will just pick some other polynomial of the appropriate degree, with the knowledge that C(f'(x))
and Z(x) * D(x)
will not be equal at all points.
In fact, two different polynomials of degree k
can only agree on at most k
points, and so the Prover will be able to convince the Verifier for at most 10 000 000 points. But because the Verifier can send challenges from a domain 100 times larger, there is a 99% probability that it will hit on one of the points where polynomials disagree and the Prover will fail the proof.
Criticism
The reasoning above is ok, but there a number of assumptions here that make it weak. Namely, expecting the malicious Prover to stick to a fixed D(x)
and a representation of f'
as a polynomial of the same degree as the actual statement solution. Vitalik address this at the close of the post, leading into the second post.
In fact, the above is not a reasonable behaviour for the Prover. If the Prover must forge f'
, then it will at the same time create a function D(x)
by computing D(x) = f'(x) / Z(x)
such that the verification C(f'(x)) = Z(x) * D(x)
would always succeed. The trick is that D
(and possibly f'
) will much likely not be polynomials of the relatively low-level expected degree.
Can we expect them to be polynomials at all? Yes, in a way. The Verifier makes all his challenges from inside a well defined domain, and the Prover has to commit to values for all those points to define its functions f
and D
. Using polynomial interpolation, anyone can create a polynomial that matches each of those functions exactly at the points used to interpolate it. That is, inside the domain, the function will be equal to that polynomial. It does not matter what happens outside this domain, because the functions will never have to be evaluated there. However, notice that if these functions are chosen at random, they will likely have a very high degree, near size of the domain. It is then the Verifier’s job to check not only that the matching equation holds, but also, and very importantly, that f'
and D
are of a low-level degree. That is the job left for the second part of Vitalik’s series, and for a future post of mine.