How Do Ethereum Smart Contracts Work?
- How Do Ethereum Smart Contracts Work?
- Writing Smart Contracts in Remix
- Setting Up Truffle to Deploy Smart Contracts
- Truffle Console and Truffle Tests
- Ethereum Apps with Web3.js
Welcome to another log in the quest of this errant explorer of blockchains, smart contracts and their technologies. I have been through some eerie places of late, ascending up to the airy reaches of Zero-Knowledge. Many of my posts have been dedicated to zk-SNARKs and algebra, which seem so esoteric when you first meet them that those posts remind me of vertiginous passes in lofty mountains. Today, I want to climb down from those heights, and come back to far more accessible matters.
This is the beginning of a new series into the basics, examining some different options on how to set up the environment to write, test and execute Ethereum smart contracts. I’ll visit three main options: executing contracts in Remix, automating tests with Truffle and creating command line applications with Node.js and Web3. If you want to start writing smart contracts but don’t know the basics of what they are and how to interact with them, hop along and come see some new sights.
The First Steps
It is not easy to write Ethereum contracts for the first time, and I face the same difficulties when I spend a few months without doing it and then have to come back. I keep looking for reference material every time that happens, and repeatedly think I should have written some notes the last time it happened. Well, here they are; they may be useful to you as well.
My emphasis here is not on how to write Solidity code. Rather, I focus on how to test a smart contract: invoking functions and methods, changing the contract’s state and getting results back.
The most basic way to do this is to open a console, call a function and read the result. This can be a quick-fire test for some easy function you’re writing, but at some point you will need more stable and expansive tests. At that stage, you will need some tool to write and run them all automatically.
Finally, you may want to develop an interactive prototype that lets you query your contract at will. This allows more unstructured testing of your code and helps expose the blind spots that you, as the programmer, may have developed about it.
I will go through all three of these goals in this series, but in this first post I give an introduction to Ethereum and to what happens when a contract method is executed. As a prerequisite for the series, you should have a development environment set up. I have some notes on that to help you.
What is Ethereum?
If you’ve been around this blog for a while, you will probably be very familiar with Ethereum, but if not, let me give a couple of details. Ethereum is the second largest blockchain by market capitalization, and the most popular among smart contract developers. Its most original feature (since copied by many other chains) is the support and execution of immutable pieces of code that reside on the blockchain, called smart contracts, whose execution can be publicly seen and verified by anyone.
While the main purpose of Bitcoin is to serve as a replacement for fiat currencies, it can be argued the main purpose of Ether is as a token to power the execution of smart contracts.
Ethereum’s smart contracts are very expressive and can compute any function that any other programming language can. But they cannot run forever. Ethereum imposes a resource limit on computations, and after a certain number of operations the contract will be forcefully stopped if it has not finished yet.
Storage of Value in Ethereum
Ethereum is like a large database of accounts, each identified by an address. All accounts have an address and an Ether balance, and a transfer simply means deducting an amount from the sender account’s balance and crediting it to the recipient account’s balance.
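To make the "database of accounts" picture concrete, here is a minimal sketch in Python. It is a toy model, not how a real node stores state: the addresses and amounts are made up, and balances are kept in wei (1 Ether = 10^18 wei), a unit this post does not otherwise discuss.

```python
# Toy model of the account state: address -> balance in wei.
# Addresses and amounts are invented for illustration.
WEI_PER_ETHER = 10**18

def transfer(balances, sender, recipient, amount):
    """Return a new state with `amount` wei moved from sender to recipient."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    if balances.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    updated = dict(balances)  # leave the previous state untouched
    updated[sender] = updated.get(sender, 0) - amount
    updated[recipient] = updated.get(recipient, 0) + amount
    return updated

state = {"0xAlice": 2 * WEI_PER_ETHER, "0xBob": 0}
state = transfer(state, "0xAlice", "0xBob", WEI_PER_ETHER // 2)
print(state)
```

The total amount of Ether is the same before and after the transfer; only the two balances change.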
In Ethereum, an address can have code associated with it or not. If it has, then that code can be executed when a transaction is sent to it, with or without some Ether value attached. Either way, whoever sends the transaction has to pay for the execution in Ether. Each smart contract can access the Ether balance of its own account. It can also store its private data and change it according to the execution of its own methods.
Only externally owned accounts, i.e. accounts without code, can start a transaction, since a transaction must, ultimately, be signed by a person (or their digital wallet). A contract can still transfer Ether, but only while executing a transaction that an externally owned account initiated.
Where is the Blockchain?
You may have a number of fundamental questions about the nature of blockchain when you start. Questions like:
Where is the blockchain? What gives it substance? How can I access it?
The blockchain is a series of blocks connected by their hashes. Each block is linked to the previous one, forming a sequential chain. In this way, the blocks record the evolution of a simple database that is a lot like a dictionary: a state of many keys (eg a smart contract’s fields) and their values.
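The idea of blocks linked by their hashes can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions: the block fields are invented, and SHA-256 from the standard library stands in for the hash function that Ethereum actually uses (Keccak-256).

```python
import hashlib
import json

def block_hash(block):
    """Hash a block's contents deterministically (SHA-256 as a stand-in)."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def append_block(chain, state_changes):
    """Add a block that records the hash of the block before it."""
    parent = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"number": len(chain), "parent_hash": parent, "changes": state_changes})

def is_valid(chain):
    """Check that every block points at the hash of its predecessor."""
    return all(
        chain[i]["parent_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
append_block(chain, {"counter": 1})
append_block(chain, {"counter": 2})
append_block(chain, {"counter": 3})
print(is_valid(chain))               # the untouched chain is consistent
chain[1]["changes"]["counter"] = 99  # tamper with an old block
print(is_valid(chain))               # the next block's parent hash no longer matches
```

Changing any old block changes its hash, which breaks the link stored in the following block. That is what makes the recorded history hard to rewrite.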
A blockchain resides on a network. Normally, there is only one network that is considered to host the real blockchain for each token. There is only one main-net for Ether, although there are several other test networks.
To access a smart contract, you have to access your target network. You’ll also need your account’s signing key to transact Ether. This could be stored in a cold wallet, but has to be connected to your application to sign transactions on your behalf.
Having a wallet with your own keys can give you the illusion that your coins are stored in the wallet and that some piece of the blockchain can be held in your pocket. But instead, your coins are recorded directly on the blockchain, and your wallet just keeps track of your account’s balance, history of transactions, and the signing keys.
The blockchain is maintained by a series of nodes that are collectively responsible for creating the blocks that compose the blockchain and for managing its growth by checking their correctness.
These so-called “full nodes” form the network, and can be seen as the inner sanctum of the ecosystem. Clients can access any of these nodes by sending messages according to a specific protocol (JSON-RPC), possibly using helpers like Infura or Metamask.
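For a feel of what a JSON-RPC message looks like, the sketch below builds two request bodies in Python without sending them anywhere. The methods eth_blockNumber and eth_getBalance are part of the standard Ethereum JSON-RPC interface; the helper name rpc_request and the all-zero address are only placeholders of mine.

```python
import json

def rpc_request(method, params, request_id=1):
    """Build the JSON body of a JSON-RPC 2.0 request (nothing is sent)."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )

# Ask a node for the number of the latest block.
print(rpc_request("eth_blockNumber", []))

# Ask for an account's balance at the latest block (placeholder address).
placeholder = "0x" + "00" * 20
print(rpc_request("eth_getBalance", [placeholder, "latest"], request_id=2))
```

A client would POST such a body to a node's HTTP endpoint and receive a JSON response carrying the same id.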
We can therefore have a scheme with several tiers:
- your wallet, containing your keys, connects to your application
- the application uses the wallet’s secret keys to create and sign transactions
- the application sends transactions to a full node
- which propagates them to all the other nodes in the network
The whole magic happens in the last step. If your transaction triggers some code to be executed, then this will be run by all the full nodes in the network. But that is only the smart contract logic. The main body of your application, the front end and the user interaction will be executed at the fringe of the network: on your desktop computer, laptop, browser or even phone. It will only use the smart contract for its particular needs, the public logic and specialized storage.
How Code is Executed
In Ethereum, smart contracts are public, and every time a transaction is executed everyone can see all the steps it goes through. The execution happens “on the blockchain”, but what does this mean?
The Blockchain itself is an abstract term, an ethereal concept, if you’ll allow me the pun. Think of it as a unique thing that exists up there in the sky, a database you can query and change the state of. How does this happen?
Obviously, the blockchain is not fully abstract; it exists somewhere. Actually, it exists everywhere, in every node in the network. Every change to its state must occur in every node.
To make this clearer, every time you submit a transaction to execute a piece of code, that transaction is propagated to every node and executed in each one of them, so that all nodes update their state to the same value.
Smart contracts can be accessed in two ways:
- query (read from the state) or
- change (write to state).
Each of these actions corresponds to a different kind of message. In the first case, we send a call message, which basically is the same as calling a function that returns a value. All you get back is that value, but nothing else.
You can’t change the state of the contract, even if your function does include instructions to change it. This is because a call does not create a transaction. It runs only in the node that receives it, but any node could do it, since every node should contain the same state.
Write actions are totally different. In this case, we send a transaction, and this does have the power to change the contract’s state. It returns a full receipt of the transaction, including the total gas cost (see more below), the logs emitted by the contract, the transaction and block hash and some more technical information. Crucially, it does not return the function’s return value, even if the function has one.
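The difference between the two kinds of message can be mimicked with a small Python simulation. Everything in it is hypothetical: the Counter class stands in for a contract, Node for a full node, and the receipt is reduced to two made-up fields.

```python
import copy

class Counter:
    """Stand-in for a contract with a single state field."""
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

class Node:
    """Stand-in for a full node holding its own copy of the contract state."""
    def __init__(self):
        self.contract = Counter()

def call(node, method):
    """Read-only: run on a throwaway copy in one node and return the value."""
    scratch = copy.deepcopy(node.contract)
    return getattr(scratch, method)()

def send_transaction(nodes, method):
    """State-changing: every node executes it; the sender gets a receipt."""
    for node in nodes:
        getattr(node.contract, method)()
    return {"status": 1, "method": method}  # simplified receipt, no return value

nodes = [Node() for _ in range(3)]
print(call(nodes[0], "increment"))         # a value comes back...
print([n.contract.count for n in nodes])   # ...but no node's state changed
receipt = send_transaction(nodes, "increment")
print(receipt)                             # a receipt, not the function's result
print([n.contract.count for n in nodes])   # every node now holds the new state
```

Note that the call returns the function's result even though increment tries to write, while the transaction changes the state everywhere and hands back only the receipt.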
The EVM and Gas
As I said above, when a transaction is executed (because it is included in a block that has just been mined), all the nodes execute the respective functions in exactly the same way. This may be surprising when you consider that the network is heterogeneous: we cannot mandate a particular operating system or programming language for every node.
Ethereum’s solution is that every full node must be able to interpret code written in a specific standardized form, Ethereum bytecode. Not only that, it must be executed inside a specific environment with its own rules, called the Ethereum Virtual Machine (EVM).
The EVM is like a computer with its own quirks. It has a stack of limited size (1024 slots), and each slot is a 256-bit word. It also has a memory with no fixed size limit, although using more of it costs more gas. The EVM only understands very simple operations that act on data located in the memory and the stack. Each operation is represented by an opcode, which is encoded as a single byte and may be followed by its data. The sequence of opcodes with their data forms a binary string, which we call the bytecode. The list of all opcodes is available here.
As an example, this is the bytecode
608060405234801561001057600080fd5b5060b88061001f6000396000f3fe6080604052348015600f57600080fd5b506004361060285760003560e01c8063771602f714602d575b600080fd5b606060048036036040811015604157600080fd5b8101908080359060200190929190803590602001909291905050506076565b6040518082815260200191505060405180910390f35b600081830190509291505056fea265627a7a72315820ee2b85e93aab2a776c18c9679e6dcce9c03b90cdec53a4ff715d09781f7abe1864736f6c634300050b0032
for the following very simple contract (bytecode was generated by Remix):
pragma solidity ^0.5.0;

contract Demo {
    function add(uint a, uint b) public pure returns (uint) {
        return a + b;
    }
}
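To see how such a string breaks down into opcodes, here is a rough disassembler sketch in Python, applied only to the first 16 bytes of the bytecode above. The opcode table is deliberately partial (it lists just the opcodes that occur in that prefix), so treat it as an illustration rather than a complete tool.

```python
# Partial opcode table: only the opcodes present in the prefix below.
OPCODES = {
    0x15: "ISZERO",
    0x34: "CALLVALUE",
    0x52: "MSTORE",
    0x57: "JUMPI",
    0x80: "DUP1",
    0xFD: "REVERT",
}

def disassemble(hex_code):
    """Turn a hex string of EVM bytecode into a list of readable instructions."""
    code = bytes.fromhex(hex_code)
    instructions = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 are followed by 1..32 bytes of immediate data.
            size = op - 0x5F
            data = code[pc + 1 : pc + 1 + size]
            instructions.append(f"PUSH{size} 0x{data.hex()}")
            pc += 1 + size
        else:
            instructions.append(OPCODES.get(op, f"UNKNOWN_0x{op:02x}"))
            pc += 1
    return instructions

prefix = "608060405234801561001057600080fd"
for instruction in disassemble(prefix):
    print(instruction)
```

The listing starts with PUSH1 0x80, PUSH1 0x40, MSTORE, and then checks CALLVALUE: if any Ether was sent along, execution ends in REVERT, since this contract has no function that accepts payments.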
Another property of the EVM is that it does not allow a program to run for an unlimited amount of time. Each operation has a cost which is measured in gas units. Gas is paid for in Ether, which means contract developers have an incentive to make the functions as efficient as possible.
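The interplay between the stack, 256-bit words and gas can be sketched with a tiny interpreter in Python. It is a toy with only two instructions of my own naming, PUSH and ADD, each charged 3 gas (the cost of PUSH1 and ADD on mainnet); real execution involves many more opcodes and rules.

```python
WORD = 2**256         # stack slots are 256-bit words
STACK_LIMIT = 1024    # maximum stack depth
GAS_COST = {"PUSH": 3, "ADD": 3}

class OutOfGas(Exception):
    pass

def run(program, gas_limit):
    """Execute (op, arg) pairs and return (stack, gas_used).

    Raises OutOfGas as soon as the next operation would exceed gas_limit.
    """
    stack, gas_used = [], 0
    for op, arg in program:
        gas_used += GAS_COST[op]
        if gas_used > gas_limit:
            raise OutOfGas(f"needed more than {gas_limit} gas")
        if op == "PUSH":
            if len(stack) >= STACK_LIMIT:
                raise OverflowError("stack overflow")
            stack.append(arg % WORD)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) % WORD)  # arithmetic wraps around at 2**256
    return stack, gas_used

add_program = [("PUSH", 2), ("PUSH", 3), ("ADD", None)]
print(run(add_program, gas_limit=100))  # ([5], 9)

try:
    run(add_program, gas_limit=8)
except OutOfGas as err:
    print("halted:", err)
```

With a limit of 100 gas the program finishes using 9 gas; with a limit of 8 it is stopped before the ADD can run, which is the forced halt described above.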
The sender of a transaction specifies how much gas they are willing to spend and what price they are willing to pay per unit of gas. You can choose whatever values you wish, but keep in mind that the gas fee is the reward for the miners. If the gas price you select is too low, miners have little incentive to execute the transaction, and it may wait a long time or never be included in a block.
Instead of guessing a value for the gas price, try Gas Station, which estimates the price you need for certain waiting times.
At the time of writing, with Ether priced at around $170, the recommended gas price for an average waiting time of 5 minutes is 10 Gwei. A minimal transaction that only transfers Ether between accounts would cost around $0.035. At the other extreme, a transaction costing 10M gas (the current maximum allowed by the mainnet) would cost about $17.
And as a reminder, if you simply query the contract to read a value without modifying its state, you get to do it for free, without paying any gas.
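These figures follow from a simple formula: fee = gas used × gas price, converted from Gwei to Ether (1 Ether = 10^9 Gwei) and then to dollars. The sketch below reproduces them in Python, assuming that a plain Ether transfer uses 21,000 gas, a number not stated above; the function name fee_usd is my own.

```python
from decimal import Decimal

GWEI_PER_ETHER = Decimal(10) ** 9

def fee_usd(gas_used, gas_price_gwei, ether_price_usd):
    """Transaction fee in US dollars for a given gas usage and gas price."""
    fee_ether = Decimal(gas_used) * Decimal(gas_price_gwei) / GWEI_PER_ETHER
    return fee_ether * Decimal(ether_price_usd)

# Plain Ether transfer: 21,000 gas at 10 Gwei, Ether at $170.
print(f"{fee_usd(21_000, 10, 170):.4f}")       # 0.0357

# A transaction using 10M gas at the same prices.
print(f"{fee_usd(10_000_000, 10, 170):.4f}")   # 17.0000
```

Using Decimal instead of floats keeps the small amounts exact, which matters when fees are fractions of a cent.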
If you got down to this point, you’re ready for the next posts in this series. It was just a gentle stroll among the hills where the villages of our quest lie waiting. You now have the weapons and armours to tackle the next challenges. I hope to see you then, in my roving adventuring band.