Currently at Artos, we are researching ZK SNARKs in depth. One of the tasks we’re handling is to convert a library written in Java (DIZK) to work with the same ZK SNARK scheme used by ZoKrates and libsnark, which are written in C++. Parts of this have already been detailed somewhere else in this blog.
The work is almost done, but there have been some significant difficulties along the way, as is to be expected in any adventure. These are the obstacles that make us grow, after all. One of the biggest so far has been focused on the proper encoding of the function that we want to prove statements for. This post addresses this particular difficulty at a mathematical level, and covers similar ground to Vitalik Buterin’s article on Quadratic Arithmetic Programs (QAPs). In the first part, I will focus on the first step, the Rank-1 Constraint System (R1CS). On the follow-up, I will show how to create a QAP from the R1CS.
Although Vitalik does an excellent job of making the technical paper more accessible, I am still left with many questions at the time of coding, and seeing different libraries doing things in different ways, I feel the need to explain things to myself in my own words. In doing so, it is my objective to do it as plainly and succinctly as possible, and hope that will make this particular step more accessible to others doing the same journey. Because my objective is to solve a particular implementation pain, I may go into some low-level details.
Arithmetic Circuits for ZK SNARKs
The first step in the ZK SNARK end-to-end journey is to create a function to write proofs about. This could of course be specified in any language, but that wouldn’t do much good. ZK SNARKs must work with specific constructs. Therefore, these functions have to be specified as sequences of very simple terms, namely, additions and multiplications of only two terms in a particular field. Let me give you an example.
One of the oldest cryptographic systems, RSA, is based on the difficulty of factorization. If you’ve never heard about RSA before, it had a huge impact in cryptography as it was the first public-key encryption algorithm publicly available. Despite being over 40 years old, it is still widely used as a key-exchange and authentication algorithm in TLS (the protocol supporting https:// sites). If you want a good introductory text to it and to number theory, in an accessible way, this is a good book, despite being a bit old by now. For a short summary, you could always try the RSA Wikipedia page.
For my purposes, here is a small summary. Given a very large number n = pq
, where p
and q
are prime numbers, it is very difficult to compute either factor of n
. And so, in RSA, n
becomes part of the public key while (p,q)
are part of the private key. One way to prove knowledge of a key is to show p
, but of course this reveals your secret. So, factorization is a natural problem to use in a Zero Knowledge proof. For the sake of illustration of the capabilities of R1CS, I am going to use a more complex version of this.
An Example
Prove that you know
(p,q)
such that(p+3)(q+2) = n+1
. The values(p,q)
are private to you, while the valuen
is publicly known.
This problem is naturally defined in terms of additions and multiplications, so it is not difficult to write a program describing it in a way in which all mathematical expressions can only take two terms of the form
<variable> = <value 1> op <value 2> where op can only be + or *.
We do have the restriction that this language does not have an expression parser, and cannot understand parentheses or chains of operations. Every intermediate step has to be stored in some variable.
Let’s give it a try.
def main(private p, private q, n): v0 = p + 3 v1 = q + 2 v2 = v0 * v1 v3 = n + 1 assert (v2 == v3) return 1
The above program is practically written in ZoKrates syntax. I only introduced the assert
call to make the behaviour obvious and removed some other bits.
The program does a series of assignments. Then, a condition is tested: if it fails the program stops and never returns. Otherwise, it returns 1.
This is good, but not yet a constraint system that can be handled by a ZK SNARK.
How to Encode a Circuit into an R1CS
Let’s transform the above slightly.
v0 == p + 3 v1 == q + 2 v2 == v0 * v1 v3 == n + 1 v2 == v3 output == 1
At first sight, there are few differences:
- function header removed
- single ‘=’ replaced by double ‘==’
- return replaced by output ==
It almost reads naturally, does it not? We apparently set successive variables v0
to v3
with the next step in the computation until we are ready to make the final check. But that is not quite what is happening. No! The important thing to understand is that a R1CS is not a computer program, you are not asking it to produce a value from certain inputs. Instead, a R1CS is a verifier, it shows that an already complete computation is correct .
To do so, the R1CS must receive as input an assignment of the full state of the computation. This is defined by the value of all the variables used in the process, and because each was set only once, its final value is equal to what it was when it was used. Therefore, by getting a snapshot of the values of all variables we can verify they have all been properly computed and guarantee the computation was correct one step at a time.
So, what do we include as variables in our R1CS?
-
the constant 1 (whose purpose will be clear later)
-
all the public inputs to our original function
-
the outputs of our original function
-
the private inputs to the original function
-
all the auxiliary variables created during the computation
An assignment is just a vector where each entry assigns a value to each of the computation variables. It can be divided in two parts:
-
Primary assignment: the constant 1, public inputs and outputs
-
Auxiliary assignment: private inputs and all auxiliary variables
In total, in this example, we have 9 variables, divided in this way:
-
Primary:
[1, n, output]
-
Auxiliary:
[p, q, v0, v1, v2, v3]
How Constraints Work
Multiplication Gates
The most important thing to understand about R1CS, in my view, is the assignment vector. Take the two vectors at the end of the previous section and put them together like this.
V = [1, n, output, p, q, v0, v1, v2, v3]
An assignment is just a vector with these variable names replaced by values. For example for p =3 , q = 5
, a successful computation will look like this (n = 41):
s = [1, 41, 1, 3, 5, 6, 7, 42, 42]
Now for example, to encode the third constraint, we want to express v2 = v0 * v1.
We can do this by taking two copies of V, filtering out the only terms we care about, and multiplying them. We do that using the dot product. If you’ve never seen a dot product, it’s an operation that takes two vectors of the same size, multiplies the elements of both vectors one position at a time, and sums them. We can represent it like this: <A.B>
, meaning the product of vectors A and B. So, if I want to isolate v0
only, I do:
<V . [0, 0, 0, 0, 0, 1, 0, 0 ,0]> = v0 + a bunch of 0s
and to isolate v1
I do
<V . [0, 0, 0, 0, 0, 0, 1, 0 ,0]> = v1 + a bunch of 0s
The product of these will be, naturally, v0 * v1.
Now, to test for equality, I can isolate v2
, and subtract both terms. They are equal only if the result is 0.
<V . [0, 0, 0, 0, 0, 0, 0, 1 ,0]> = v2
I’ve been taking some liberties here just to make clear what is going on. In reality, we cannot add variable names, so it does not make sense to multiply by vector V. That is where the assignments come in. If I use s
instead of V
, I will have actual values on either side of the equality sign, and the condition will make perfect mathematical sense.
Then, defining
A3 = [0, 0, 0, 0, 0, 1, 0, 0 ,0] B3 = [0, 0, 0, 0, 0, 0, 1, 0 ,0] C3 = [0, 0, 0, 0, 0, 0, 0, 1 ,0],
third constraint can be written
<A3.s> * <B3.s> - <C3.s> == 0
Addition Gates
Addition gates make use of the same machinery used for multiplication gates. Indeed, the dot product is a sum of multiplications which means we can encode several additions in just one constraint. We just have to include more non-zero entries in one of the linear combinations.
All the constraints in the R1CS specify a term is equal to the product of two other terms. Each term is a linear combination of the system’s variables. This means we can add any multiple of each variable before multiplying them. It also means that even if we have an addition gate we have to craft the constraint as a product of something. In the previous example, our first constraint is v0 == p + 3
. The righthand side can be encoded as the product 1 * (p + 3)
A1 = [1, 0, 0, 0, 0, 0, 0, 0, 0] B1 = [3, 0, 0, 1, 0, 0, 0, 0, 0] C1 = [0, 0, 0, 0, 0, 1, 0, 0, 0]
Notice how I created the summand 3
. Since 1
is one of the variables, it is possible to create a constant term simply by multiplying 1
by a suitable coefficient (in vector B
). In the same way, I created the term 1
that multiplies by the whole sum in vector A
.
This turns out to be a very simple example. A single constraint at full power could express something like this in one go:
(2 * p + 7v0 - 1/2 q) * (3p + 5q + n - 2) = (v0 + 3v1 + v2 + 100)
The adequate vectors would be:
A = [0, 0, 0, 2, -1/2, 7, 0, 0, 0] B = [-2, 1, 0, 3, 5, 0, 0, 0, 0] C = [100, 0, 0, 0, 0, 1, 3, 1, 0]
This is much more than can be expressed in a single gate. If you don’t have any addition gates, this will not provide any saving: you cannot multiply different variables inside a single linear combination. But the additions can be easily agglutinated to a multiplication gate and save several constraints. Remember the real input to the ZK SNARK process is the R1CS, and not the circuit itself, so your front-end should be able to use this trick to optimize the circuit before creating the proof.
This is all for today. I hope that you now understand Rank-1 Constraint Systems better and how to use them to specify the function for a ZK SNARK. Keep tuned in for more about this fascinating technology. Meanwhile, comment, like and share to your heart’s content. I’d be interested in knowing if you’re using this technology, what were your difficulties. And if you’re in need of technical advice, you can contact me as well.
See you next time.