In the beginning, there was just an idea: Artos wanted to implement ZK Snarks in the Aventus protocol. No one of us had much of an inkling about how they worked, so I took the chance and volunteered to explore them. The first goal was simple: create a ZK Snark proof and implement an off-chain verifier in whatever language would be convenient.
Today, our goal is very different. We now have a small team dedicated to implementing the full journey with the following requirements for the Aventus ticketing protocol:
-
create a ZK Snark that guarantees a user has a legitimate ticket
-
implement ZK Snark capability in our SDK to enable:
-
-
generation of proving and verification keys;
-
generation of a proof in the back end and in a smartphone;
-
verification of a proof in the back end and on-chain;
-
a single back-end verification should be fast: under 100ms.
-
That journey is not yet complete, but we already have a long story behind us, with ups and downs, and many turning points. This post is about its early days. The rest will be detailed in forthcoming episodes.
ZK STARKs or ZK SNARKs?
The very first idea was actually to use ZK Starks. They have some notable advantages over ZK Snarks, namely weaker cryptographic assumptions and post-quantum resistance.
Weaker assumptions are a good thing. In cryptography, we always start from some assumptions (a bit like axioms) that are believed, but not known, to be true. A strong assumption means that you must believe more things than a weaker one, and therefore your position is weaker, since you are putting your faith in more things that may not actually be true.
On the other hand, a scheme is said to be post-quantum resistant if it is not immediately broken when quantum computers eventually become available. Most of the more established public-key cryptography is vulnerable to quantum computers, namely that based on factorization (like RSA) or the discrete logarithm problem (too numerous to count). That includes all the pairing-based cryptosystems, which power ZK Snarks.
However, there has been steady research in replacing these by, for example, lattice-based cryptosystems, which still are post-quantum resistant and the hope is that we’ll be able to do ZK Snarks with them as well.
Nevertheless, quantum computers are still not viable and for now other criteria took precedence. In particular, ZK STARKs generate proofs that are too long for our needs (100 to 1000 times larger than ZK Snarks), even though the proving key can be very large, and so we chose to use ZK Snarks.
Libsnark
My initial approach was to read what I could about ZK Snarks. I started with a few references but my reference one was the documentation for EIP197 proposal that introduces support for pairing calculations in Ethereum pre-compiled contracts. This is a valuable resource that clearly lists the parameters of the elliptic-curve and pairing supported by Ethereum. It is also helpful for the links to libraries that perform algebraic computations with 256-bit long numbers and elliptic curve operations in both Python and C++.
Then, I quickly found about libsnark as well, which is almost the default library for implementing ZK Snarks. It is quite complete, supporting several types of Snark as well as many gadgets to help in function specification.
ZoKrates
But the key advance in this process was when I found ZoKrates. You see, it’s pretty easy to find references to libsnark in this field, but the more I looked at it, the more I felt the learning curve was too steep for the speed we needed. We wanted to get to our goal as fast as possible.
ZoKrates provides just that. In only a couple of hours, you can setup your system, generate a circuit, keys, proof and a Solidity verifier. It is ready-made for a Unix-like environment, and to support itself on other systems, like Windows, it provides its own Docker image.
Its greatest contribution is an easy language to specify the statement to be proven by the ZK Snark. It can compile a program in this language and convert it into an adequate input for libsnark, which it uses to compute all the ZK Snark artifacts. In a morning, I had coded our Snark and generates keys, proofs and verified the whole thing in a Solidity contract. Pretty solid experience, but I have to thank the authors of these two posts that were my real introduction to the tool.
The First Prototype
The Solidity verifier generated by ZoKrates provided a blueprint for verifiers in other languages. The contract itself makes use of two libraries that provide necessary calculation abilities that are not supported natively by Ethereum yet: there is a pre-compiled contract to perform pairing operations, but we need more.
The first library is available here, and provides an interface into the world of elliptic curves (see my posts on pairings and elliptic curves if the below does not make sense to you):
- a representation of the source elements of a pairing (both in
G1
andG2
); - a function to calculate the additive inverse of a curve point (in
G1
); - an API to access Ethereum pre-compiled contracts to execute curve operations on
G1
and verifying that a product of several pairings cancels out.
The same file also includes a contract to verify a Snark of the default type produce by ZoKrates.
The second library is here, and provides the capability to calculate in G2
. The code used by ZoKrates seems to include updated versions of these two files.
A First Experiment in Javascript
Given that the languages I used at my work were Solidity and JavaScript, my first attempt was to port this code to JavaScript. That did not end well. At this time, I was very green and was focused on understanding how the different libraries acted together, what part of the abstraction layer was implemented where and how.
There was no Solidity implementation of the pairing logic, and I could not find a JavaScript library for the algebraic big number calculations. I started to port it manually until I had to concede I should look for other alternatives. I am not skilled in numerical methods and I could make important mistakes porting a codebase I did not fully understand.
So I looked at the Python pyPairing library. I was glad to identify all the bits were there: specific support for the Ethereum curve (here called bn256) in both an optimized and non-optimized version; support for finite field arithmetics in the underlying field; and support for the verification of pairings on that particular curve.
Early Successes
There were a few challenges that at the time seemed hard, but are so basic that they formed the groundwork for understanding all that would be to follow. I highlight these:
- understanding the textual representation of the verification key and proof output by ZoKrates;
- understanding how to properly instantiate a G2 point from their textual representation.
ZoKrates outputs both the proof and the verification key to text files. These are made up of several points that are inputs to a cryptographic pairing function. Without going much into details, let me just say that a pairing function takes two inputs over sets (technically groups) called G1
and G2
, and outputs a point in another set (group) called GT
. All three groups have different structures:
-
Elements of
G1
are points with two coordinates on the fieldFp
for the Ethereum primep = 21888242871839275222246405745257275088696311157297823662689037894645226208583
. Each element of this field of which is a number about 254 bits long -
Elements of
G2
are still points with two coordinates, but each of which can be seen as a degree-1 polynomial over the fieldFp.
For that reason, each of these coordinates is represented by two 254-bit long elements.
The group GT
is only used for verification, but its representation would take 12 Fp
elements. Fortunately, the ZK SNARK used by default by ZoKrates (PGHR13) only outputs elements of G1
or G2
in the keys or the proof (this is not true of the alternative snark GM17).
Representation of Elements in G1 and G2
The individual elements of the key and proof are adequately identified: they use names similar to those in the paper. It’s also easy to identify which elements are points of G1
and G2
: the former appear as a pair of two long strings, the latter as a pair of arrays. All of these are represented in the hexadecimal base.
But there is a quirk to how these elements are represented textually. When I copied the values of G2
points to Python, the verification did not work. Everything was as it should be apparently: there were no typos and the values were correctly copied. But for some reason, the maths failed.
It turned out the vectors are in reverse order to what you’d expect. Remember I said elements of G2
are polynomials? Yes, that is they key : they have form a_1 x + a_0
. And for human-readability, it makes sense to present the higher order coefficients first, on the left side, because that is just how we normally write polynomials. For a computer, however, the standard way of specifying a vector is to give the lowest-order coefficient first. And that was the key. I just had to invert the vectors when creating G2
elements and everything began to work. Voilà!
I found it pleasant to work with pyPairing. It is a small and simple library that goes straight to the point. But this was only the first step.
C++ Verifier
The Python verifier was an important milestone, as it was proof we could do an independent verification of a proof generated by ZoKrates with a ZoKrates-issued verification key.
The drawback was the speed. Even with the optimized library, the time for a single verification was around 4 seconds. Solidity seemed to be around 2 seconds, although only ad-hoc measurements were taken when running inside Ganache. It was way too slow.
The effort then focused on doing the same thing in C++. This time, I took the library libff. This was much more complex to understand and to tame.
Libff
Python is an extremely benevolent language: it’s easy to setup and run on Windows, you get immediate feedback via an easy to use console, it’s got an effective command line debugger and, let’s face it, it’s a breeze to write code in it.
By comparison, C++ is a nightmare. Although I had previous experience of it, and precisely in coding a cryptographic system, I still don’t like the language. Every time I look at a C++ listing, I cringe at the layout. It just seems ugly and cluttered.
Nevertheless, it was the best choice for speed and there was a calculation library ready for it. So, after coming to grips with libff, I was quickly able to essentially port the Python code to C++, adding some functionality that I had not included in the previous one. For example, the C++ verifier can read a verification key file instead of having it hard-coded, like the Python (and the Solidity) verifiers.
There are always lessons and challenges in any new project. Libff supported different curves, not just the Ethereum one, and due to that it is an extremely templated code. Understanding how to deal with them was the first challenge, and it added greatly to the confusion.
Another big obstacle was that it had to run in a Docker instance. There is no way I would embark in the adventure of trying to make a C++ code-base targetting a Unix-like environment run on a Windows machine. So I had to learn how to use Docker effectively and how to properly compile a big C++ project.
The biggest challenge was, maybe, understanding a whole new numerical library and knowing where not to go into the details. The types were different, the method names were different, even the internal representation of the pairing target group was different, so I had to adapt my mental map. But since both libraries were meant to target the same curve, the port was relatively easy. There would be later stages in this story where that turned out not to be the case.
The reward was amazing. The verification time went down to an average around 40 ms and we now could verify a batch of hundreds of proofs in only a few minutes, if we so wished.
In the Next Episode…
This was only the start. Once I had the verifiers running, we could verify SNARK proofs in Python, Solidity and C++. But that changed the game completely, and new goals were set. Stay tuned, as I reveal what happened in future posts in later posts.
Great article. Is the python & C++ code publicly available?