business, computer, security

Logging in Ethereum – Part 2

The EVM’s Log Opcodes

  1. Logging and Reporting Events
  2. The EVM’s Log Opcodes
  3. Event Topics

The previous post of this series introduced the notion of Solidity Events, and why they are important for Ethereum Apps. It also showed how they are stored in transactions and can be accessed programmaticaly.

This post shows how events are represented and created by the Ethereum Virtual Machine (EVM)

As with everything in Solidity, events can be represented in the EVM bytecode. The emit operation is very specific and has a dedicated type of opcode: LOG.

Snow Covered Log – Lisa McCarty

There are 5 variants, written LOG0 to LOG4, that write an event with 0 to 4 topics. The corresponding gas cost is relatively high, since logs effectively store information on the blockchain that is permanently accessible and so has to be maintained by every node. The gas cost is equal to 375 * (n + 1) + 8 * b, where n is the number of topics, and b is the number of bytes of its data.

Take a look at these opcodes in ethervm.io, and you’ll see that they take a variable number of arguments from the stack and from memory. What does this mean?

What is in an Event?

In Solidity, you can log an event by defining its type first and then emitting it, like this.

event Logged(uint valueOfA);

function testEvent(uint a) public {
  emit Logged(12);
}

This defines an event of type Logged with a single argument. You can take the result of the transaction and get the data for this event with

receipt.events.Logged

which gets you something like this:

Event: { logIndex: 0,
transactionIndex: 0,
transactionHash:
'0x61bd01120ee03c2b1b12d0bba40d3fe169bc0471ea20c877304d6ffe90b21003',
blockHash:
'0x61a5967064da0fcf0bd1bbe9bc650feef2d28ce064310ae4e4f41a1db5b02980',
blockNumber: 8,
address: '0xB44201e65ABBc6f618406B6d91d355C4102Ec919',
type: 'mined',
id: 'log_62cd521b',
returnValues: Result { '0': '12', valueOfA: '12' },
event: 'Logged',
signature:
'0x2adf2e2e5f5d116e0d0e1f0231d3f5b75916c3d37d9a16ce060f634cbcc61430',
raw:
{ data:
'0x000000000000000000000000000000000000000000000000000000000000000c',
topics:
[ '0x2adf2e2e5f5d116e0d0e1f0231d3f5b75916c3d37d9a16ce060f634cbcc61430',
[length]: 1 ] } }

There are a few interesting fields in this, that I have highlighted. In returnValues, we have a summary of the Event’s arguments. In this case, we have only one, which is clearly listed in the field returnValues, first with a number indicating the order in which it appears in the event’s argument list, and then with its actual name.

The value is 12, the argument we passed in emit Logged(12). This value is also present in raw.data, but now encoded in a hexadecimal number of 256 bits.

Then we have raw.topics and signature, which also show the same data. You may be wondering how they are related, but I’ll leave that to the next section.

The LOG Opcodes

If you look at the bytecode for this (try it in Remix), you will see a call to LOG1 at some point, that is what actually creates the event. Let’s take a quick look at a table of EVM Op Codes and see what this does:

Ethereum has 5 opcodes dedicated to emitting events.

The first two columns of this table show you the bytecode associated to this operation, and the opcode that describes it. Then, there is a graph indicating what values of the stack the operation uses. The columns further to the right explain how they are used.

All of these variants take the first two values in the stack to address a part of memory and use the value in there as the first argument to the event operation. This memory segment starts at the address indicated at the top of the stack (the cell marked offset above) and can have a virtually unlimited size, since this is determined by the second cell in the stack.

And this is where the operations differ. Each one can take one argument more, called topic, than the previous one. In this case, I called LOG1, which must have taken a value from memory and a topic.

If you could venture a guess now, you may say that the number of topics is related to the number of event arguments. We have one argument, we have one topic. Seems possible, although you may want to think why the only value in raw.topics above has nothing to do with the value 12, that I passed to the argument. So, let’s leave topics for the next post and instead focus on raw.data.

The Raw Data of Event Logs

Let’s try a few different event types.

event LogEmptyEvent();

function testEmptyEvent() public {
 emit LogEmptyEvent();
}

This still generates a call to LOG1, event though it has no argument. The stack shows this:

The layout of the stack before calling LOG1 for an empty event
The layout of the stack before calling LOG1 for an empty event

You can see that the length of the data now is 0, but that there is still a topic. Let’s try something different:

event LogEventData(uint firstArgument, uint8 secondArgument);
event LogEventData(uint8 one, uint16 two, uint32 three, uint64 four, uint128 five, uint256 six);

function testEmptyEvent() public {
 emit LogEventData(5252, 200);
 emit LogEventData(10, 20, 30, 40, 50, 60);
}

Ok, so this time I’m throwing two events at the same time. Let’s see what is different here. In terms of opcodes, we stubbornly have two calls to LOG1 still, despite having 2 and 6 arguments in the event. That means in both cases we only care about the first 3 cells of the stack (refer to the opcode table above). The next pictures show the contents in both cases.

Layout of stack for an event with 2 arguments
Layout of stack for an event with 2 arguments
Layout of the stack for an event with 6 arguments
Layout of the stack for an event with 6 rguments


There are two differences here. The first one is the value of length (the second cell), which in the first case is 0x40 (64 bytes) and in the second case is 0xc0 (192 bytes). This gives us, in the first case, the size of two 32-byte integers, and in the second case that of six such integers.

Memory-Bound Data

Hmmm… You may have noticed that all the arguments were of different types, and that indeed few of them were 32-byte integers. Let’s take a look at the memory layout. Instead of copying it from Remix, I copied it into a table instead. For the first case:

0x0 00000000000000000000000000000000
0x10 00000000000000000000000000000000
0x20 00000000000000000000000000000000
0x30 00000000000000000000000000000000
0x40 00000000000000000000000000000000
0x50 00000000000000000000000000000080
0x60 00000000000000000000000000000000
0x70 00000000000000000000000000000000

0x80 00000000000000000000000000000000
0x90 00000000000000000000000000001484
0xa0 00000000000000000000000000000000
0xb0 000000000000000000000000000000c8

And for the second:

0x0 00000000000000000000000000000000
0x10 00000000000000000000000000000000
0x20 00000000000000000000000000000000
0x30 00000000000000000000000000000000
0x40 00000000000000000000000000000000
0x50 00000000000000000000000000000080
0x60 00000000000000000000000000000000
0x70 00000000000000000000000000000000

0x80 00000000000000000000000000000000
0x90 0000000000000000000000000000000a
0xa0 00000000000000000000000000000000
0xb0 00000000000000000000000000000014
0xc0 00000000000000000000000000000000
0xd0 0000000000000000000000000000001e
0xe0 00000000000000000000000000000000
0xf0 00000000000000000000000000000028
0x100 00000000000000000000000000000000
0x110 00000000000000000000000000000032
0x120 00000000000000000000000000000000
0x130 0000000000000000000000000000003c

Remember that our offset is 0x80 in both cases, so I coloured only the memory cells that are after that address. I also highlighted only the non-zero bytes. In the first case, the highlighted numbers convert to, in decimal: 5252, 200. In the second case, the values are respectively: 10, 20, 30, 40, 50, 60. Ahah! So these are exactly the values that we passed as arguments.

To confirm, let’s look at the values extracted from the transaction receipts. In the first instance, we get:

Event: { logIndex: 0,
transactionIndex: 0,
transactionHash:
'0x8f3ed89f4c85c0c59f750681de27a19065cf2fe317d0eba5b030f548017fb547',
blockHash:
'0x27b5e24d81aeb06018b4fdf7266191f5c6d6b46c8deaf845fbfac550b4a772fd',
blockNumber: 22,
address: '0xEd9D5dc2eC72B4428000860507762DA7E201B54D',
type: 'mined',
id: 'log_170c9985',
returnValues:
Result {
'0': '5252',
'1': '200',
firstArgument: '5252',
secondArgument: '200' },
event: 'LogEventData',
signature:
'0x8604f2dc82d77e997bcc9f6b4a10c9175a9be5fb69a9ef44bbb5c4eaa66ca364',
raw:
{ data:
'0x000000000000000000000000000000000000000000000000000000000000148400000000000000000000000000000000000000000000000000000000000000c8',
topics:
[ '0x8604f2dc82d77e997bcc9f6b4a10c9175a9be5fb69a9ef44bbb5c4eaa66ca364',
[length]: 1 ] } }

And in the second:

Event: { logIndex: 1,
transactionIndex: 0,
transactionHash:
'0x8f3ed89f4c85c0c59f750681de27a19065cf2fe317d0eba5b030f548017fb547',
blockHash:
'0x27b5e24d81aeb06018b4fdf7266191f5c6d6b46c8deaf845fbfac550b4a772fd',
blockNumber: 22,
address: '0xEd9D5dc2eC72B4428000860507762DA7E201B54D',
type: 'mined',
id: 'log_d02b4bc7',
returnValues:
Result {
'0': '10',
'1': '20',
'2': '30',
'3': '40',
'4': '50',
'5': '60',
one: '10',
two: '20',
three: '30',
four: '40',
five: '50',
six: '60' },
event: 'LogEventData',
signature:
'0x99986dab9136c96a7db07b9393684e3a152ce628162b6ebbdb6fbdca3ae2d581',
raw:
{ data:
'0x000000000000000000000000000000000000000000000000000000000000000a0000000000000000000000000000000000000000000000000000000000000014
000000000000000000000000000000000000000000000000000000000000001e0000000000000000000000000000000000000000000000000000000000000028
0000000000000000000000000000000000000000000000000000000000000032000000000000000000000000000000000000000000000000000000000000003c',
topics:
[ '0x99986dab9136c96a7db07b9393684e3a152ce628162b6ebbdb6fbdca3ae2d581',
[length]: 1 ] } }

We can see here that the memory fragment selected by the values in the stack is wholly copied into the raw.data field. Less obvious is that the single topic here, which corresponds also the signature field, is also the third value in the stack (see the figures above).

Also, we can see that even though an event argument may fit in less than 32 bytes, it still uses a full memory word of that size. Dynamic types, like strings and arrays, are more complicated, and I leave them out of this post.


We’ve seen that the amount of arguments in an event does not influence what type of LOG function the EVM calls. And although we have only 4 varieties of LOGs, we don’t seem to actually be limited in the number of arguments (But we probably are. See my post on the Stack Too Deep for some thoughts on that).

It’s now time to explore Event Topics. That is the theme of my next post, and the last in this series.

2 thoughts on “Logging in Ethereum – Part 2”

  1. Thanks for this greatly informative post. I would be keen to read your observations on logging arrays.

    I have found that Solidity and the EVM log arrays inefficiently, using 32 bytes for every entry in the logged array. This is my impression anyway, given the number of zeroes present in the raw data I get from my transaction receipts. Stranger still, attempting to reduce memory usage by logging a uint32 array uses nearly twice the amount of gas as 256.

    Would be curious to know what’s going on under the hood, and if there are any ways to more efficiently log this data.

    Best wishes.

    1. Hello,

      thank you for your interesting comment. I have never analysed the logging of an array, as you said. As for why it would be more efficient to use u256 instead than u32, though, I can venture a guess. The EVM is natively a 256-bit machine, not 32 or 64. That means that most of the operations are designed to work on words of 32 bytes, and operations over this type can be very cheap. Truncating, then, to work with smaller types, would need some extra operations and incur more gas.

      I don’t know the details, though, and cannot confirm. It is just my intuition.

      Cheers

Leave a Reply