Posts in this series:
- The Case for zk-SNARKs
- The Case Against zk-SNARKs
- Alternatives to zk-SNARKs
In the first part of this series, I explored the strengths of zk-SNARKs. Those strengths led to an explosion of interest in zk-SNARKs, and I’m sure that Vitalik Buterin’s public interest in them had a great deal to do with it.
This second part is the counterpart of the first, and explores the weaknesses of zk-SNARKs instead.
The Case Against zk-SNARKs
Trusted Setup
The most damaging arguments against zk-SNARKs are related to security. QAP-based zk-SNARKs rely on a long Common Reference String (CRS) that allows the Prover to construct a proof and the Verifier to check it with several queries evaluated at a random point.
In traditional ZK proofs, the Prover wants to hide its private input from the Verifier, so it chooses some randomness to scramble, or disguise, its input in a way that can later be reversed, or at least checked.
The Verifier, on the other hand, wants to reduce as much as possible the chances that the Prover can fool it and prove something false. Therefore, it issues a random challenge to the Prover. The Prover must then compute its answer as a function of its private input, the Verifier’s challenge and its own randomness.
If the Prover knew in advance exactly what the challenge was going to be, it could choose its randomness so as to satisfy that challenge even without knowing a correct solution for the instance (that is, it could fake the proof).
So, the Verifier must only issue the challenge after the Prover has already fixed its randomness. This is why the Prover first commits to its randomness, and implicitly reveals it only after the challenge, when it uses that value to compute the proof (the sketch after the list below makes this flow concrete). That ensures two things:
- the Verifier cannot guess what value the Prover committed to;
- the Prover cannot change the value it committed to.
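To make this commit/challenge/response flow concrete, here is a minimal Python sketch of a Schnorr-style identification protocol. It is not a zk-SNARK, and the tiny group parameters and variable names are illustrative choices of mine, but it follows exactly the discipline described above, and it shows why a Prover that learns the challenge before committing can forge an accepting transcript.

```python
# Toy Schnorr-style identification: commit, then challenge, then respond.
# Parameters are deliberately tiny and for illustration only; real systems
# use large groups (e.g. 256-bit elliptic curves). Requires Python 3.8+.
import secrets

p = 1019   # small prime, p = 2q + 1
q = 509    # prime order of the subgroup we work in
g = 4      # generator of the order-q subgroup of Z_p^*

x = secrets.randbelow(q)   # the Prover's secret
y = pow(g, x, p)           # public value: the statement is "I know x with y = g^x"

# 1. The Prover commits to fresh randomness r by publishing t = g^r.
r = secrets.randbelow(q)
t = pow(g, r, p)

# 2. The Verifier issues its challenge only after seeing the commitment.
c = secrets.randbelow(q)

# 3. The Prover answers with a function of its secret, its randomness and the challenge.
s = (r + c * x) % q

# 4. The Verifier checks consistency: g^s must equal t * y^c.
assert pow(g, s, p) == (t * pow(y, c, p)) % p

# If the Prover knew c *before* committing, it could fake the proof without x:
# pick any s_fake and publish t_fake = g^s_fake * y^(-c), which passes the same check.
s_fake = secrets.randbelow(q)
t_fake = (pow(g, s_fake, p) * pow(y, -c, p)) % p   # negative exponent = modular inverse
assert pow(g, s_fake, p) == (t_fake * pow(y, c, p)) % p
```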
In zk-SNARKs, we have a further complication: non-interactivity. So, the Prover must be able to generate a proof that respects all of the above even though the Verifier never sends a challenge.
One way to do this, and the one usually used with QAP-based schemes, is to use the CRS. The CRS includes an encoding of several building blocks that enable the Prover to construct the answer to several queries at a fixed, hidden, random point that serves as the Verifier’s challenge.
These values are produced in the setup stage, but in and of themselves, they do not constitute proof queries: they just furnish the building blocks.
The queries are only constructed at proof-generation time, and verified offline at the Verifier’s leisure.
This verification is achieved by using bilinear pairings, which allow both parties to compute over the hidden (i.e. committed) values. If the Prover is honest and the proof is valid, the Verifier will be able to compute the answer to each query in two different ways and check that the Prover’s answer is consistent.
There is an important caveat in all of the above: the CRS material enables these constructions only at specific random values that represent the challenge. If those values are known, the whole proof can be forged.
For that reason, these values must be completely destroyed or forgotten before the scheme goes public, and so must not be part of the CRS. It is not easy at all to convince others that a particular piece of information has been duly forgotten and is not stored somewhere, like a rogue backup of sensitive material that someone forgot to destroy. And this is before you even get into conspiracy theories.
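To give a flavour of what these hidden values look like, here is a toy Python sketch of a “powers of tau” style CRS. It is my own simplification, not ZCash’s actual construction: the CRS publishes encodings of the powers of a secret point tau, which let the Prover evaluate any polynomial of bounded degree “in the exponent” at tau without ever learning tau, while anyone who secretly kept tau can reproduce such encodings at will.

```python
# Toy "powers of tau" CRS (illustration only, same tiny group as before).
# The CRS contains g^(tau^0), g^(tau^1), ..., g^(tau^d) for a secret tau.
import secrets

p, q, g = 1019, 509, 4   # small prime p, subgroup order q, generator g
d = 4                    # maximum polynomial degree the CRS supports

tau = secrets.randbelow(q)                               # the hidden point: toxic waste
crs = [pow(g, pow(tau, i, q), p) for i in range(d + 1)]  # published building blocks
# ... tau is supposed to be destroyed here; only `crs` is made public.

def commit(poly):
    """Compute g^(P(tau)) from the CRS alone, given P by its coefficient list."""
    acc = 1
    for coeff, base in zip(poly, crs):
        acc = (acc * pow(base, coeff, p)) % p
    return acc

P = [3, 0, 5, 1, 0]        # P(X) = 3 + 5X^2 + X^3
honest = commit(P)         # built from public material only

# Whoever secretly kept tau does not need the CRS at all: they can evaluate P(tau)
# directly, and more generally fabricate encodings that pass the Verifier's checks.
val = sum(coeff * pow(tau, i, q) for i, coeff in enumerate(P)) % q
assert honest == pow(g, val, p)
```

In the real schemes the consistency checks are done with bilinear pairings over encodings like these, but the asymmetry is the same: the published material is harmless on its own, while knowledge of tau breaks everything.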
That makes this the Achilles’ heel of these schemes, as users may not believe the setup was honestly generated. The history of cryptography is rich with stories of the NSA tainting well-known algorithms by installing or proposing back-doors, which has led to the failure of, or mistrust in, certain schemes.
That explains why the promoters of ZCash, the highest profile blockchain using zk-SNARKs, had to produce two elaborate ceremonies to convince users the secrets had been destroyed before going live with Sprout and then Sapling.
Other schemes relying on a trusted setup, where in essence you have to believe the originator of the scheme was and is fully honest, will face similar problems.
Very Long Keys
As explained above, the CRS provides the material to produce all the queries. Among other things, the queries probe each wire of the circuit, and there can be thousands or millions of them, which makes the CRS extremely long.
This sounds like a lot, and indeed it is. For circuits of this size, the CRS can run into the gigabytes which, despite the familiarity we have today with Big Data, is still a lot and severely limits the ability to produce proofs on constrained devices (the proving key is usually much larger than the verification key).
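As a rough back-of-envelope calculation (the element sizes and per-constraint counts below are assumptions of mine for illustration, not measurements of any particular system):

```python
# Back-of-envelope proving-key size. All figures are assumptions for illustration.
constraints = 10_000_000     # a large circuit (assumed)
elements_per_constraint = 6  # rough number of group elements per constraint (assumed)
bytes_per_element = 48       # e.g. a compressed BLS12-381 G1 point

size_bytes = constraints * elements_per_constraint * bytes_per_element
print(f"{size_bytes / 2**30:.1f} GiB")   # about 2.7 GiB under these assumptions
```

Even if the real numbers differ by some factor, the point stands: the proving material grows with the circuit, while the proof itself stays tiny.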
Strong Security Assumptions
The need for a Trusted Setup is not the only security issue of zk-SNARKs. Following the good practices of cryptography, the existing schemes come with a proof of security, which reduces the difficulty of an attack against the zk-SNARK to that of an attack against some fundamental problem.
So far so good, that’s what makes modern cryptography a science with solid foundations. But just as in Mathematics we begin constructing our proofs from axioms, and in Physics we leave some explanations to postulates, in Cryptography we base our security proofs on some hardness assumptions.
A hardness assumption is a belief that some problem is hard to solve. These assumptions come originally from Complexity Theory, whose main purpose is to classify problems according to their difficulty and place them in correspondingly named classes.
Complexity Theory defines a precise, quantitative notion of efficiency. In Cryptography, we have functions that are invertible in principle but are believed to admit only inefficient inversion algorithms, which means that for a large enough instance of the problem the computation would take literally astronomical time (several times the age of the universe). These are called, in simplistic terms, hard functions.
The two most famous assumptions in cryptography are that factoring a large number is hard, and that computing the discrete log of a group element is hard. These functions have been studied and tested for decades now, and barring the emergence of practical quantum computers, we have never found a way to solve them efficiently for well-chosen parameters.
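For a feeling of what “hard” means here, this small Python sketch (a toy group, chosen by me only for illustration) solves a discrete log by exhaustive search. Smarter generic attacks exist, but they still take time exponential in the bit length of the group order, which is exactly what the assumption relies on.

```python
# Naive discrete-log attack in a toy group: try exponents one by one.
# Feasible only because the subgroup order q is tiny; for a 256-bit group
# the same search would take astronomically long.
p, q, g = 1019, 509, 4   # same toy parameters as the earlier sketches

def brute_force_dlog(target):
    """Find x with g^x = target (mod p) by exhaustive search over the subgroup."""
    acc = 1
    for x in range(q):
        if acc == target:
            return x
        acc = (acc * g) % p
    raise ValueError("no discrete log found")

x_secret = 321
y = pow(g, x_secret, p)
assert brute_force_dlog(y) == x_secret
```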
Still, let’s not get too excited about these. It is important to read that word correctly: “assumptions” are not guarantees, they are merely empirical observations, and they could be broken or falsified by further developments in science.
The problem is that these assumptions are not sufficient to prove the security of all the schemes we want to construct, and as such, a veritable hierarchy of assumptions has been developed. These assumptions are never unreasonable: either they’re chosen because we have a long history of trying to prove them wrong, or they are a generalization of similar well-tested assumptions.
This is the case for the assumptions underlying the construction of pairing-based zk-SNARKs, namely a Knowledge of Exponent Assumption. I will not detail it in this post, except to add that this is a relatively recent assumption that has not been studied or tested enough.
Researchers are still debating whether these assumptions hold, and one version of them has been shown to be false. On top of that, these assumptions have a complicated structure of logical quantifiers, and because of that are hard to falsify, which makes them even more uncomfortable to use within the scientific method.
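Without going into the formal statement, here is a rough flavour, in a toy group of my choosing, of the algebraic relation a Knowledge of Exponent Assumption talks about. The essential part of the assumption, the claim that whoever produces such a pair must “know” the exponent, is a statement about knowledge extraction and cannot be demonstrated by any code snippet, which is part of why it is so hard to test.

```python
# Flavour of the Knowledge-of-Exponent setting (toy parameters, illustration only).
# A challenger publishes the pair (g, g^a) for a secret a. Anyone can produce a
# matching pair (C, C^a) by picking an exponent c and raising both elements to c.
# The assumption asserts this is essentially the only way: whoever outputs such a
# pair must "know" some c with C = g^c.
import secrets

p, q, g = 1019, 509, 4      # same toy subgroup as before

a = secrets.randbelow(q)    # challenger's secret exponent
g_a = pow(g, a, p)          # published alongside g

c = secrets.randbelow(q)    # the exponent the honest party "knows"
C = pow(g, c, p)
C_a = pow(g_a, c, p)        # equals C^a, computed without ever learning a

# The challenger, who knows a, can verify the pair is well-formed:
assert C_a == pow(C, a, p)
```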
Science lives by testing hypotheses and discarding them when they are falsified, and the nature of experimentation is that we can prove the existence of anomalies when one shows up, but cannot prove their total absence.
It may well be that we have simply been unable to detect them, or that they did not occur in a particular trial. Accordingly, science favours hypotheses and assumptions that could be proven wrong by data, and some researchers dislike zk-SNARKs’ assumptions precisely because of this.
This is all for today. Let me know if you think there are other disadvantages I should have listed here, or whether I have been unfair in any of my comments. And stay tuned for the next, and final, part of this series.