ball, http, www

ZK Snarks and Their Algebraic Structure – Part 5

Groups in the Tate Pairing

Welcome. We are finally approaching the inner sanctum of this series. In this post, I look at the set structure for the most popular elliptic curve pairing, and all the concepts covered in the previous posts will finally fall into place. This has been the hardest part to write, because this is where my own knowledge falls short and it wasn’t easy for me to select find the right material, to study it and synthesize it in the best way.

When I set out to write this series, I did it mostly for my benefit and that of my team at Artos, so that we would understand the sets we were dealing with when implementing our ZK Snarks project. It was a perfect example of a long quest in my adventure world: a gradual advance through mathematical lands that were progressively more alien to me; the continual search for more examples, articles and books that could help me overcome the difficulties I faced, and make sense of the many strange algorithms and coordinate representations; the drawing of a map of what I should study and ignore; and finally the writing of these posts to synthesize what I had learned, in a similar role to what Joseph Campbell calls the return to the ordinary world with the ultimate boon or the magic elixir.

My aim is to give an overview of the algebraic structure of pairings (since those dictate how each element of a ZK Snark is represented) in a way that is understandable by someone who is not familiar with algebra. But this is hard maths, and simplifying what is complex does not come easily. I’ll do my best, but let me know if anything is not clear or would benefit from a different explanation. I fear this time I could find no way to make it any simpler.

The Tate Pairing

As we have seen in a previous post, bilinear pairings work on 3 groups. Nowadays, we almost exclusively use Type 3 pairings, which means all three groups are different. They are all related to a specific elliptic curve, but in what way?

When I introduced pairings in this series, I talked of them at a high level, describing them as functions and specifying the main properties they should have. But I have not actually said how to define those functions. If you have to do calculations with pairings in your code, and have to manipulate elements in any of the pairing’s groups, it is useful to understand why they’re represented the way they are. All pairings used in practice today are based on just two types: the Weil pairing and the Tate pairing. Although others exist, I will focus only on BN curves and on the Tate pairing, since Ethereum uses a particular example of this family.

Point Representation

Grab a pairing library if you can (I find pyPairing / pyEcc comparatively accessible), and look at the code. You’ll soon find out that a pairing on a BN curve has elements of G1 represented as points with two (or three) elements of an underlying scalar field. The elements of G2,  instead, are represented as points with two (or three) pairs of elements of the scalar field. And you may be dismayed to find out that elements of GT are represented in one of a few different ways, all coming down to eventually twelve (!) elements of the scalar field. Now, where do these differences come from, you may ask?

This is what I want to explain in this post. But I warn you now, this is at a higher level of complexity than the previous posts, and will require having a clear idea of the material covered in them. If you still want to go on, which I do hope you do, grab your seat, hold tight, and let’s go.

The Tate Pairing in Libraries

As I said above, I will only focus on the Tate pairing, and keep this at as much of a high level as I can, going into just enough detail to explain the representation of group elements. I will not go into the actual computation itself: as usual, I advise you to leave the specific implementation details (of elliptic curves, pairing functions, even cryptographic algorithms) to libraries specifically designed for them. All I want is to explain to you what the structure of the inputs and results of those operations is, so that you can interact with those libraries if, for example, you need to build interfaces for them.

In your library of choice, you may quickly find the code for computing the pairing, and may locate a Miller loop and a final exponentiation step somewhere. In some libraries, you will even see that these are part of something called reduced Tate pairing. As for a full Tate pairing, you may not find it at all.

Domain and Range of the Tate Pairing Function

The full Tate pairing is based on an elliptic curve , but its inputs are not taken directly from it. If you recall the definition of elliptic curves, given in an earlier post, you’ll remember it can be described by just 2 coordinates, or maybe 3 if you use homogeneous coordinates. But the full Tate pairing does not take its elements from this set. Instead, they are drawn from two distinct subsets of E(𝔽pk). As you can see, both inputs of the pairing are taken from an elliptic curve over an extension field.

The output, though, is not a point on any elliptic curve. Instead, it is a member of a related group, a specific sub-set of the multiplicative group of the extension field. This is where it begins to sound complicated, and indeed it is. Just look at the type definition below:

τ : E(𝔽pk)[r] x E(𝔽pk)/rE(𝔽pk)--> 𝔽*pk / (𝔽*pk)r.

The Reduced Tate Pairing

This is just too much cognitive overload for my brain, and so I’m very happy that researchers have found a way to simplify this with the reduced pairing. The corresponding type definition is leaner, although still by no means elementary.

 
e:E(𝔽p)[r] x E(𝔽pk)--> µr.

The above notation indicates the function e:

  • takes its first argument from the r-torsion sub-group of E(𝔽p), the original elliptic curve;
  • takes its second argument from the group defined over the extension field E(𝔽pk), with embedding degree k.
  • returns a value in the sub-group of E(𝔽*pk) that only contains the r-roots of unity, where E(𝔽*pk) is the multiplicative group of the field E(𝔽pk).

There are quite a lot of notions packed in the above few lines. Let me try to explain what they’re all about.

Embedding Degree in Elliptic Curves

There is a tension when designing cryptographic algorithms with elliptic curves. Typically, such schemes are based on the difficulty to solve a certain problem that would be easy if we could solve the discrete logarithm problem on the curve’s group.

Note: when we talk about elliptic curves in the cryptographic setting, we are in reality talking about a prime-order cyclic group inside that curve. Every point is the product of a generator point, G, by a scalar a. The Discrete Logarithm Problem is the task of finding a if given X = aG, which is considered to be very hard in prime groups.

On the one hand, we want the curve’s order to be small, so that we can represent elements in as few bits as possible (to save on bandwidth). On the other hand, if the order is too small, the Discrete Logarithm Problem may become solvable, and the curve won’t be secure at all.

In pairings settings, we have a further compromise to make. The DLP must remain difficult in two groups, E(𝔽p) and E(𝔽pk). However, we also must be able to efficiently compute in the field 𝔽pk , which demands that k not be too large. We thus have to make a compromise on the size of k. This parameter is called the embedding degree of the curve.

An Elliptic Curve equipped with addition and the identity O is always a finite group, but it does not necessarily have a prime order. Its size can be written as n = p + 1 - t, where t could be positive or negative and is not too far from p. The pairing requires a sub-group of this curve with prime order r that is as large as possible. In the optimal case, n is prime and r = n. It also requires an appropriate extension field, and for that we want the degree k to be:

  • as small as possible
  • such that r divides pk - 1.

The embedding degree k is then defined as the smallest positive non-zero integer that satisfies this division.

The Tate Pairing’s Target Group: GT

The target group of the Reduced Tate Pairing, that is to say its range, is denoted above as µr. This unassuming notation hides quite a few concepts: it is the sub-group of the multiplicative group 𝔽*pk that includes only and exactly the r-order roots of unity. Let’s break this down.

First of all, what is the multiplicative group?

Multiplicative Group

Any field 𝔽n = (S, +, ×) can be broken down in two groups, resulting from associating its set with each of the operations:

  • the additive group (S, +)
  • the multiplicative group 𝔽*n = (S \ {0}, ×)

Note: I am talking about any field in here, so the set S does not have any particular structure. What is constant is that to build the multiplicative group we must remove the 0 element. There is a simple reason for this: in a group, every element must have an inverse with respect to the group operation, but we can’t divide by 0, so no matter how much we try, there is no other element that multiplied by 0 will produce the identity element 1.

The multiplicative group is compactly denoted as the field name marked with an *. Very often this notation stands both for the group or instead just its set.

For a simple example,

𝔽*7 = (S', ×)
S' = ({1, 2, 3, 4, 5, 6}:
Inverse pairs:
(1, 1): 1 * 1 (mod 7) = 1
(2, 4): 2 * 4 (mod 7) = 1
(3, 5): 3 * 5 (mod 7) = 1
(6, 6): 6 * 6 (mod 7) = 1

Roots of Unity

I said above that the target group is a specific subgroup of the multiplicative group. This means it is defined only over a subset of that group’s set, but conserving the same operation and identity element. The elements of this subgroup are exactly the roots of unity of order r. What this means is that every element X of this group satisfies:

Xr = X × X ... × X (r times) = 1

Written another way, Xr - 1 = 0, and according to algebra’s fundamental theorem this means there are exactly r elements that satisfy this property. In other words, the group µr must have exactly r elements. This is why the embedding degree is defined to be an integer k as a function of (p,r).

Because the order of every sub-group must be a divisor of the order of the parent group, the value r must divide the order of 𝔽*pk. And because that does not include the element 0, it is

Order of 𝔽*pk = pk - 1.

Given the above, the elements of this group are also elements of the extension field with embedded degree k, which for Barreto-Naehrig curves is k=12. Therefore, these elements can ultimately be encoded as a vector of 12 elements of 𝔽p. This is the approach taken by the pyPairing library, but other libraries like libff, Milagro and DIZK take alternative approaches, representing them instead as a tower of groups whose degrees multiply to equal 12. For example, a group with 3 elements in an extension field with degree 4; or a group with 2 elements of a group whose elements take 3 coordinates that are themselves in an extension field with degree 2. I see no favourite implementation (some libraries offer more than one alternative) and so I prefer to ignore these differences.

The Tate Pairing’s G1 Group

The first input of the pairing function is taken from group G1 . In the reduced Tate pairing, this is

E(𝔽p)[r]

This is simpler than the other two groups, but it is still not the vanilla elliptic curve underlying the pairing. One property that must hold in a pairing is that all three groups must have the same prime order, r. Remember that an elliptic curve is a finite group, but its order (n) may not be prime. Instead, we must find a sub-group of prime order (r). The way we do this is to use, instead, the r-torsion of the curve.

Note: In Barreto-Naehrig curves we don’t have to worry about this, since these are already of prime order and so r = n. The rest of this discussion applies to the general case, where we cannot guarantee this.

A sub-group of a group inherits the operation and identity element, but only a subset of the original group’s set. For example, if we have a group G = (X, +), a sub-group could be G' = (Y, +), where Y is a subset of X. And in this case, it is guaranteed that the order of G', which is equal to the size of Y, will evenly divide the order of G, which is the size of X.

Order of a Group Element

It is easier to explain with an example, and the most familiar one from everyday life is probably the group (Z12,+), where the set of elements is the numbers {0, 1, ..., 11}, and addition is computed modulo 12. This is just how we do calculations with hours on an analog clock. Remember the definition of group element order:

For an element X, its order is the smallest positive (non-zero ) integer k such that kX = 0, if the group is in additive notation, or Xk = 1, if the group is in multiplicative notation.

In group (Z12,+), the elements can be sorted in this way:

  • Elements with order 1: {0}
  • Elements with order 2: {6}
  • Elements with order 3: {4, 8}
  • Elements with order 4: {3, 9}
  • Elements with order 6: {2, 10}
  • Elements with order 12: {1, 5, 7, 11}

Torsions

Above, I showed the order of each element of the group, by finding the smallest multiplier that turns it into the identity. The idea of a torsion is more or less the opposite. For a given multiplier, m, the torsion includes only those elements whose order is a divisor of m. Note that these divisors include m itself.

For example, the 2-torsion of the group above must include only elements of order 1 and 2, while the 3-torsion must include those of order 1 and 3.

1-torsion: {0}

2-torsion: {0, 6}

3-torsion: {0, 4, 8}

4-torsion: {0, 3, 6, 9}

6-torsion: {0, 2, 4, 6, 8, 10}

12-torsion: the whole group

You may have noticed that all elements in the m-torsion are separated by multiples of a base element (different for each torsion): the whole torsion is a sequence of this base number, say g, multiplied by integers 0 to m-1, that is, the m-torsion has exactly m elements in it.

Given all the above, it should not be surprising that all the elements of G1 are also elements of the base curve E(𝔽p).

The Tate Pairing’s G2 pair

The final piece of the pairing puzzle is the G2 group. In the original Tate function, this is a sub-group of E(𝔽pk) which as we’ve seen requires each element to be represented by k elements of 𝔽p . The reduced pairing does not offer much of a difference in here, taking elements from the whole E(𝔽pk) instead of one of its subgroups.

This is where BN curves are interesting, since they allow us to compress G2 points in such a way that they can be represented by points over 𝔽p2 . It is worth noting that the pairing function is only a small part of a cryptographic algorithm (but usually the most expensive one). The points that we feed into the pairing function are themselves computed from other quantities, which means we must perform several operations in G1 and G2 before executing the actual pairing. For example, in the PGHR13 Snark, we have to execute 12 pairings to verify a proof. However, each of the arguments to these pairings is typically a linear combination of elements in G1 or G2 such as pc = <c, pkc> (the angle bracket notation represents the inner product of two vectors).

The consequence of this is that compression of points in G2 not only makes points cheaper to represent in storage, but especially allows the use of cheaper G2 operations before the pairing is executed, leading to significant savings. Let’s now take a brief look at the technique used for this.

Twists and Element Compression

This section applies only to Barreto-Naehrig curves, and is based on the original paper BN06. In these curves, the embedding degree is always 12, which means that members of G2 are in general represented as polynomials of degree smaller than 12. The technique developed by Barreto and Naehrig enables us to reduce such representation to points of 𝔽p2 . Since each point is represented by 2 coordinates, this takes four 𝔽p elements, that is one third of the original requirements.

To do this, they define a twist of the original curve E. A twist is a modified curve, related to the first one, that allows to reduce the representation of a point from a field with k coordinates (where this is the embedding degree) to just k/d, depending on the type of twist. For BN curves, we have sextic twists (d=6) bringing the representation of points down to 𝔽p2.

The essential property of a twist function is that it represents another elliptic curve that is isomorphic to the first one in a specific field. What this means is that there is a function that turns an element of E'(𝔽pk/d)element into an element of E(𝔽pk). Moreover, this function has an inverse and turns every element of E(𝔽pk) into an element of E'(𝔽pk/d) such that the two groups are equivalent.

As an example, a BN curve has the general form

E(𝔽p) : y2 = x3 + b

and its twisted curve is, for a new parameter ξ

E'(𝔽p2) : y'2 = x'3 + b/ξ.

There is an invertible map that turns a point (x', y') of E'(𝔽p2) into a point (x, y) of E(𝔽p12) such that the two curves are isomorphic. In this case, there is a well-defined z such that:

(x',y') -> (z2x', z3y')

This is about all I wanted to say in this post. I feel it has covered a lot of ground, and explained a number of jargon that for a long time puzzled me. I hope this can help you understand pairing libraries better, but let me know if there is anything you’d like to see expanded or improved upon. Meanwhile, have fun.

References

In writing this post I made heavy use of the following references. I thank the authors in earnest for having written them:

Leave a Reply