This article introduces the concepts behind what is commonly referred to as “blockchain”.
The functionality offered by a blockchain is introduced and its workings are described. Subsequently, blockchain-based solutions are briefly discussed.
The term ‘blockchain’ is used as a broad catch-all term for the implementation of a distributed ledger based on cryptographic hash functions. Both distributed ledgers and blockchains have a variety of implementations. The first widespread implementation of a blockchain was the Bitcoin cryptocurrency. The novelty in Bitcoin was that it used a combination of well-known cryptographic techniques to solve the double spending problem of a virtual currency. This ‘double spending’ problem refers to the difficulty of representing value electronically while preventing it from being spent more than once. Paper-based money and traditional payment systems each have their own solutions to the double spending problem. However, these do not work for a virtual currency, where a coin is just a series of bits that can be copied.
These cryptographic techniques include linked timestamping for verifiable logs, which goes back to the concept of the Merkle tree and the timestamping concepts from Haber and Stornetta. Regarding digital cash, the seminal work was done by Chaum, and fault-tolerant consensus protocols were proposed by, among others, Lamport. The idea of using public keys as identities is also due to Chaum, and smart contracts were proposed by Szabo.
After Bitcoin, many variations appeared, aiming to solve other problems, or using a different technical implementation. The most popular one is Ethereum, which comes with its own currency, “Ether”.
A blockchain consists of a set of protected information blocks chained sequentially to one another. Together they form an immutable ledger, distributed over the participating nodes. These nodes are computing platforms that interact with the end users. The purpose of the blockchain is to share information amongst all parties that access it via an application. Access to this ledger in terms of reading and writing may be unrestricted (‘permissionless’), or restricted (‘permissioned’). The shared information is protected against modification, meaning that any alteration would be easily and immediately detectable. For that reason, once information is recorded on the blockchain, it is considered immutable.
There is no such thing as ‘the blockchain’. Many different blockchains exist today; some are operated in public, some in private. Without the ambition of being exhaustive, the following are well-known blockchain implementations. The seminal example is Bitcoin, from which 667 other cryptocurrencies were derived. Other cryptocurrency families are based on Bytecoin, NXT and XRP. Ethereum, which allows logic to be executed in a distributed way, and which includes its own currency (Ether), is also a blockchain. Venezuela was the first country to issue a government-backed cryptocurrency. It was challenged for its lack of technical clarity and governance.
Distributed ledgers that are not based on a blockchain include Hyperledger Fabric, Corda, BigchainDB and RChain.
The main building blocks of a Blockchain system are its data structure (i.e. the Blockchain) and its nodes, where the logic and computations take place. There are two types of nodes: full-function nodes and partial nodes. Each full-function node maintains a complete copy of the Blockchain, is capable of executing transactions and contributes to extending the chain. All full-function nodes are equivalent in terms of functionality, and are connected in a peer-to-peer network. As such, there’s no hierarchy amongst nodes, and all nodes can communicate with one another. A partial node is also connected to the network in a peer-to-peer fashion, but doesn’t contain a full copy of the Blockchain. It needs the services of a full-function node to execute transactions, and it doesn’t extend the chain.
A Blockchain starts from its genesis block and new blocks are appended periodically. Each block records executed transactions. The nodes collaborate to connect the blocks into a Blockchain, creating a ledger that cannot be changed retroactively without redoing a proof of work (POW).
Each block contains two types of information: application-specific information (‘payload’) that records transactions or smart contracts (consisting of a combination of data and code executable by the nodes), and internal information that secures the block and specifies how it’s chained to another. Blocks are automatically propagated across the network, verified and linked via hash values.
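The split between payload and internal information can be sketched as a small data structure. This is a minimal illustration only: the class name `Block`, its field names and the JSON encoding are assumptions made for this example and do not correspond to the format of any specific Blockchain.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    # Application-specific payload: here simply a list of transaction strings.
    payload: List[str]
    # Internal information: the hash of the predecessor (None for the
    # genesis block) and the nonce that secures the block.
    prev_hash: Optional[str]
    nonce: int = 0

    def hash(self) -> str:
        """SHA-256 over a canonical JSON encoding of all fields."""
        encoded = json.dumps(
            {"payload": self.payload, "prev_hash": self.prev_hash, "nonce": self.nonce},
            sort_keys=True,
        ).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()


genesis = Block(payload=["genesis"], prev_hash=None)
second = Block(payload=["alice pays bob 5"], prev_hash=genesis.hash())
```

Because `second` stores the hash of `genesis`, any change to the genesis payload changes that hash and breaks the link between the two blocks.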
There are two protection mechanisms inherent to each Blockchain, and a third that’s optional.
The first concerns linking each block with its predecessor in a way that is computationally hard to undo. This is achieved via two combined techniques. The first is a hash chain: a hash is calculated for each new block over contents that include the hash value of the previous block (the only exception is the first block, the ‘genesis’ block, which has no predecessor). The second is the inclusion of a special number in each block, the block’s ‘nonce’. When the right nonce is inserted in the proper location, calculating the hash function over the entire block yields a hash value of a specific form (one that starts with a specified number of zeroes). Finding such a nonce is computationally hard, which is why it is referred to as a POW.
Since the nonce is hard to calculate, replacing one block by another would mean redoing the nonce computations of all blocks subsequently linked to it. With the current state of algorithms and computing power, such a replacement is generally believed to be infeasible once the chain has been extended by approximately six blocks.
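The nonce search and its protective effect can be sketched in a few lines. The following is a toy model under stated assumptions: `DIFFICULTY`, the `|`-separated block encoding and all function names are hypothetical choices for this example, and the difficulty is set far lower than in any real system so that the search finishes quickly.

```python
import hashlib

DIFFICULTY = 3  # hypothetical: required number of leading hex zeroes


def block_hash(prev_hash: str, payload: str, nonce: int) -> str:
    data = f"{prev_hash}|{payload}|{nonce}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def mine(prev_hash: str, payload: str) -> int:
    """Brute-force search for a nonce that yields a hash with the required prefix."""
    nonce = 0
    while not block_hash(prev_hash, payload, nonce).startswith("0" * DIFFICULTY):
        nonce += 1
    return nonce


def build_chain(payloads):
    chain, prev = [], "0" * 64  # the genesis block has no real predecessor
    for payload in payloads:
        nonce = mine(prev, payload)
        digest = block_hash(prev, payload, nonce)
        chain.append({"prev_hash": prev, "payload": payload, "nonce": nonce, "hash": digest})
        prev = digest
    return chain


def is_valid(chain) -> bool:
    """Check every link and every proof of work from the genesis block onward."""
    prev = "0" * 64
    for block in chain:
        digest = block_hash(block["prev_hash"], block["payload"], block["nonce"])
        if block["prev_hash"] != prev or digest != block["hash"]:
            return False
        if not digest.startswith("0" * DIFFICULTY):
            return False
        prev = digest
    return True


chain = build_chain(["tx-a", "tx-b", "tx-c"])
```

Changing the payload of an early block invalidates its stored hash. Even if that block is mined again, the next block still points to the old hash, so every later block would have to be mined again as well.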
A second protection mechanism is a peer-to-peer built-in consensus mechanism, whereby a majority of nodes need to agree on the next block that extends the chain. This means there’s no central point of control that can be compromised. A Blockchain system functions without a central trusted entity, in a peer-to-peer mode, where all nodes are equal.
There’s no trust between nodes; they rely on a consensus mechanism to confirm transactions. The consensus mechanism is based on verification by every node that the received information complies with a set of rules, and by verification of the nonce. The rules confirm that the proposed transaction complies with the application functionality, which is application-specific.
For example, in the case of a virtual currency, it’s first verified that the transaction adheres to the required structure and that the payer has ownership over the coins they want to spend (demonstrated by a signature using the private key of a Public Key Infrastructure (PKI) key pair and where the signature is successfully verifiable). Verification of the POW then has to demonstrate that a node invested the required computational power to participate in the extension of the chain.
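The ownership rule can be illustrated with a toy ledger that tracks who currently owns each coin. This sketch deliberately omits signature verification: the payer's name stands in for a verified signature. The names `Transaction`, `Ledger` and `coin_id` are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Transaction:
    coin_id: str  # identifier of the coin being spent
    payer: str    # stands in for a verified signature of the payer
    payee: str


class Ledger:
    def __init__(self, initial_owners: Dict[str, str]) -> None:
        self.owners = dict(initial_owners)  # coin_id -> current owner

    def is_valid(self, tx: Transaction) -> bool:
        # Rule: the payer must currently own the coin they want to spend.
        return self.owners.get(tx.coin_id) == tx.payer

    def apply(self, tx: Transaction) -> bool:
        if not self.is_valid(tx):
            return False
        self.owners[tx.coin_id] = tx.payee
        return True


ledger = Ledger({"coin-1": "alice"})
first = ledger.apply(Transaction("coin-1", "alice", "bob"))     # accepted
second = ledger.apply(Transaction("coin-1", "alice", "carol"))  # rejected: already spent
```

Once the first payment is recorded, Alice no longer owns the coin, so her second attempt to spend it fails the rule check. This is the double spending protection in its simplest form.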
If two nodes broadcast different versions of the next block at the same time, some nodes may receive one or the other first. Each node would work on the first block received, but save the other branch in case it becomes longer. The tie will be broken when the next nonce is found and one branch becomes longer; the nodes that were working on the other branch will then switch to the longer one.
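The tie-breaking behaviour described above can be sketched as a simple selection rule. Chains are modelled here as plain lists of block identifiers, and the function name `choose_branch` is an assumption for this example.

```python
from typing import List, Sequence

Chain = List[str]  # a chain is represented as a list of block identifiers


def choose_branch(current: Chain, received: Sequence[Chain]) -> Chain:
    """Switch only to a strictly longer branch; on a tie, stay on the current one."""
    best = current
    for branch in received:
        if len(branch) > len(best):
            best = branch
    return best


shared = ["genesis", "b1"]
branch_a = shared + ["a2"]  # received first, so the node works on it
branch_b = shared + ["b2"]  # competing block, saved in case it becomes longer

active = choose_branch(branch_a, [branch_b])      # tie: stay on branch_a
branch_b = branch_b + ["b3"]                      # next nonce found on branch_b
active_after = choose_branch(active, [branch_b])  # switch to the longer branch
```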
The third (optional) protection mechanism stems from the fact that Blockchains come in two different flavours: permissionless and permissioned. The public, Bitcoin-like systems where every node can participate (read, add entries or extend the Blockchain by finalising a candidate block with the correct nonce) are denoted as permissionless.
Permissioned Blockchains allow only a limited set of known and accepted nodes to process transactions and extend the chain. As this type of chain is typically set up by known and consenting organisations with an assumed level of trust, the consensus mechanism can be based on a less computationally intensive process (the nodes don’t need to prove to each other that they’ve invested a sufficient amount of computational power in confirming the transactions).
Virtual coins are a popular family of applications built on Blockchain. A coin consists of data (representing value) and code (rules on how to spend the value). Figure 1 illustrates the main components of a coin system such as Bitcoin (a virtual currency) or Namecoin (a repository where DNS-names and their corresponding IP addresses are stored):
To make a payment, an end user installs a wallet application and generates an account and an address to interact with the Blockchain. First, they pay to receive coins at that address using a traditional payment method. Once the coins have been received, the user can create their own payment transactions from the wallet. The transaction contains data (identifying payer, payee and amount) and code (a script defining how to unlock the value the payer wants to transfer to the payee and how to lock the value subsequently to the payee). Performing the transaction requires interaction with a full-function node to execute the script code. Upon successful execution, the transaction output is broadcast to peer nodes, which relay the output to further peers.
Mining is Bitcoin’s protection against users who try to double spend. Upon reception, nodes insert the transaction output they received in the payload of their new candidate block. In the payload, there’s room for this output along with two reserved locations, one to be filled by the nonce and the other by a value that represents the creation and allocation of a benefit. Each full-function node inserts the benefit value of its choice (typically a transaction that makes a payment to itself) and starts ‘mining’ (searching for the nonce that, when combined with the rest of the information, yields a valid hash value). This is referred to as the POW. Alternatives to mining exist, the most common of which is referred to as proof of stake (POS).
With POW, the first node to find a hash value that meets the specified condition broadcasts the newly completed block to all other nodes to verify it. This new block contains the benefit value for the miner that was the first to successfully find the required nonce. If this new block is successfully verified by the network, the originating miner sees its efforts rewarded by the benefit, which can be used in future transactions. The results included in the payload of the new block are available in all full-function nodes. A competing miner may broadcast its block just after the first miner, and also link its block to the Blockchain. However, the nodes will notice the time difference and its block will become an orphan block that no longer participates in the active chain.
Partial nodes do not mine and may store the entire Blockchain, or only parts thereof (blocks that contain transactions relevant to them). Partial nodes can interact with end users, but are dependent upon full-function nodes to commit transactions to the Blockchain. A wallet can be implemented on a mobile device as a partial node, maintaining only information about the coins its owner can spend. The mobile device would not have to store the full Blockchain, but would still be able to offer its user wallet functionality.
A smart contract is essentially a computer protocol to digitally agree, verify or enforce the negotiation or performance of terms between parties, without third parties. These transactions are trackable and irreversible. For example, consider a smart contract between two parties, Alice and Bob, about the price of publicly quoted stock S. Our imaginary contract specifies that Alice pays Bob a certain amount if, on an agreed date, a condition holds. This condition could be, for example, that the price of S is equal to or above 100 euros. Otherwise, Bob pays Alice the same amount.
This contract can be encoded in a smart contract programming language such as Solidity, which can then be activated on a Blockchain. When the time arrives, the contract will use an oracle to fetch the value of the stock S, and the payment will be made according to the condition. Obviously, the oracle must be trustworthy.
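The logic of the Alice and Bob contract can be sketched as follows. This is a plain Python simulation, not Solidity, and it runs on no Blockchain: the function name `settle_bet`, the balance dictionary and the oracle callback are all assumptions made for this example.

```python
from typing import Callable, Dict


def settle_bet(balances: Dict[str, int], stake: int, threshold: float,
               oracle: Callable[[], float]) -> str:
    """Settle the bet on the agreed date and return the name of the party that is paid."""
    price = oracle()  # the oracle supplies the stock price and must be trusted
    if price >= threshold:
        payer, payee = "alice", "bob"
    else:
        payer, payee = "bob", "alice"
    balances[payer] -= stake
    balances[payee] += stake
    return payee


balances = {"alice": 50, "bob": 50}
winner = settle_bet(balances, stake=10, threshold=100.0, oracle=lambda: 101.5)
```

With a quoted price of 101.5 the condition holds, so Alice pays Bob. The outcome depends entirely on the value returned by the oracle, which is why its trustworthiness matters.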
Smart contracts are based on the mechanism explained in the preceding section. The underlying idea is to make a breach of a contract expensive. Smart contracts define rules and consequences in the same way as traditional legal documents. They take information as input and perform specific actions as a result. They also contain a combination of data and code, but rather than being coded in a dedicated cryptocurrency script language, smart contracts are written in a richer programming language. A contract layout consists of:
|Variables||the data part, where public variables maintain the state|
|[Events]||optionally, a list of events the contract listens for|
|Functions||the code part|
|Constructor||the part of the code that creates the contract on the blockchain|
|Other functions||other application logic|
Contracts are created by a function called the constructor. Upon execution of its constructor, the contract is inserted into the Blockchain, where it receives an address. When the relevant event happens, a Blockchain transaction is sent to that address and the smart contract is executed. The execution typically consumes some cryptocurrency value.
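The life cycle of a contract (construction, insertion at an address, execution at a cost) can be mimicked in a toy model. Everything here is hypothetical: `Counter`, `ToyChain`, the flat `CALL_FEE` and the way the address is derived are assumptions for illustration and do not reflect how any real platform computes addresses or fees. Events are left out for brevity.

```python
import hashlib
from typing import Dict


class Counter:
    """Hypothetical contract with state variables, a constructor and one function."""

    def __init__(self, owner: str) -> None:
        # Constructor: runs once, when the contract is created.
        self.owner = owner  # variables: the state of the contract
        self.count = 0

    def increment(self) -> None:
        # Other function: application logic triggered by a transaction.
        self.count += 1


class ToyChain:
    CALL_FEE = 1  # hypothetical flat fee charged per contract call

    def __init__(self, balances: Dict[str, int]) -> None:
        self.balances = dict(balances)
        self.contracts: Dict[str, Counter] = {}

    def deploy(self, sender: str) -> str:
        """Run the constructor and store the contract under a new address."""
        contract = Counter(sender)
        seed = f"{sender}:{len(self.contracts)}".encode("utf-8")
        address = hashlib.sha256(seed).hexdigest()[:16]
        self.contracts[address] = contract
        return address

    def call_increment(self, sender: str, address: str) -> None:
        """Charge the fee to the sender, then execute the contract function."""
        if self.balances.get(sender, 0) < self.CALL_FEE:
            raise ValueError("insufficient balance to pay the call fee")
        self.balances[sender] -= self.CALL_FEE
        self.contracts[address].increment()


chain = ToyChain({"alice": 3})
address = chain.deploy("alice")
chain.call_increment("alice", address)
```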
Today the most popular implementation of smart contracts is probably Ethereum, a public Blockchain-based platform.
Bitcoin can be seen as the original Blockchain. This Blockchain was used to implement a cryptocurrency to create the first purely peer-to-peer version of electronic cash without central authority. Bitcoin was created by an unknown (group of) person(s) who invented the Blockchain. Its development is driven by a core group of Open Source developers.
The trust within the Blockchain, and thus in the public ledger it represents, is created thanks to collective agreement of the nodes within the network on a set of updates to the state of the Bitcoin ledger. This is referred to as consensus. This Blockchain is the most mature of all public Blockchains, but it has also suffered the most attacks in recent years.
Ethereum is the first of the second-generation Blockchains, which focus on smart contracts: applications that run exactly as programmed without possibility of downtime, censorship, fraud or third-party interference. To achieve this, Ethereum’s creator, Vitalik Buterin, enhanced Bitcoin’s scripting mechanism to give Ethereum contracts a state and Turing-completeness.
Ethereum contracts encode arbitrary state transitions, making it possible to write systems by simply writing the logic in a few lines of code. Ethereum has its own cryptocurrency, called Ether.
Today’s most active Blockchain technologies include the pioneers such as Bitcoin, Ethereum and Ripple (a clearing and settlement technology), as well as Ethereum-based follow-ups that apply new approaches for scalability, such as Tendermint, Hydrachain and Hyperledger. There’s also a broad category of more scalable designs such as the Lightning Network, Raiden, BigchainDB, RChain and Aeternity. There are also Superchains, which connect multiple Blockchains together. These include Interledger and Cosmos. And there are others, such as Corda, which come with separated ‘fact’ databases, where the data is kept consistent, but not everyone has a copy of everything.
A functional summary is given in the following figures. Assuming a blockchain has been put in place, the following interactions typically take place. Users perform transactions through their application, every node broadcasts its transaction outputs, and every node can create its candidate block with the transaction outputs it selected. Every node then tries to satisfy the conditions that would allow its candidate block to be promoted to the next block in the shared chain. The nodes jointly agree on which candidate block is promoted through what is referred to as the consensus.
There are many approaches that define this consensus and its conditions. The best-known ones are Proof Of Work (POW) and Proof Of Stake (POS). POW originated as a solution to fight spam emails, by requiring senders to demonstrate that they performed some calculations before their emails are accepted for sending onward. These calculations consist of solving a problem which is moderately hard but feasible to execute, and easy to check. A popular problem is a partial hash inversion, i.e. finding the input to a hash function that satisfies conditions on the output such as containing a number of consecutive zeros. The latter is moderately hard (depending on the hash chosen) because a good hash’s output appears random, so a minimum number of calculations will have to be done to find a hash that contains the specified number of zeros. In Bitcoin, POW is used to prevent the ‘double spending’ of coins. In POS, the next valid block in the blockchain is selected on the basis of account holders’ stakes. Many schemes exist for POS, including those based on the concept of "coin age", a number derived from the product of the number of coins times the number of days the coins have been held.
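The coin age definition translates directly into code. The selection rule below, which picks an account at random with a probability proportional to its coin age, is only one possible way to use that number and is an assumption of this sketch, as are the function names.

```python
import random
from typing import Dict, Tuple


def coin_age(coins: float, days_held: float) -> float:
    """Coin age: the number of coins multiplied by the number of days held."""
    return coins * days_held


def select_validator(stakes: Dict[str, Tuple[float, float]], rng: random.Random) -> str:
    """Pick one account, weighted by its coin age (assumed selection rule)."""
    accounts = list(stakes)
    weights = [coin_age(*stakes[account]) for account in accounts]
    return rng.choices(accounts, weights=weights, k=1)[0]


# Each entry maps an account to (coins, days held).
stakes = {"alice": (10, 30), "bob": (40, 5), "carol": (1, 2)}
chosen = select_validator(stakes, random.Random(2024))
```

Here Alice has a coin age of 300 and Bob of 200, so Alice is more likely to be selected even though Bob holds more coins.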
One might wonder whether a candidate block might never be included, for example if the node that formed it is not powerful enough to compete successfully in a POW scheme. First, it should be realised that transactions are sent out by nodes for inclusion in multiple candidate blocks, so a transaction will normally make it onto the chain over time, via one candidate block or another. Second, techniques can be used that offer an incentive for a node to include a transaction in a candidate block, e.g. by paying a small transaction fee.
6 An ICO is the initial period when a cryptocurrency is made available to anybody willing to buy. It can be compared to an Initial Public Offering (IPO), when a security is first offered for trading on a stock exchange.
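The transaction fee incentive can be sketched as a selection rule that a node might apply when filling its candidate block. The dictionary layout of a transaction and the function name `select_transactions` are assumptions for this example. Real nodes may use other criteria, such as transaction size.

```python
from typing import Dict, List


def select_transactions(pending: List[Dict], max_count: int) -> List[Dict]:
    """Fill a candidate block with the pending transactions that pay the highest fees."""
    return sorted(pending, key=lambda tx: tx["fee"], reverse=True)[:max_count]


pending = [
    {"id": "tx-1", "fee": 2},
    {"id": "tx-2", "fee": 9},
    {"id": "tx-3", "fee": 5},
]
candidate_payload = select_transactions(pending, max_count=2)
```

Under this rule a transaction with a higher fee is picked up sooner, while a transaction with a low fee waits until there is room for it in a later candidate block.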