This post is an updated and expanded version of our 2017 cryptocurrency primer.
When Bitcoin (BTC) first appeared in 2009, few people had a clear idea of what it was, let alone the waves it would generate both financially and technologically. The underlying blockchain technology was more or less a new concept, and like most new concepts was poorly understood in general. In 2018, blockchain remains a hot topic: while it is tied in many people’s minds to cryptocurrencies, it is actually a standalone concept on which cryptocurrencies can be based. This article will clarify how blockchains work and, just as importantly, where blockchain ends and technologies based on it begin.
The purpose of blockchain is to create a ledger; that is, a record of historical transactions (be those financial transactions, messages, etc.).
Fundamentally, the blockchain is aptly named: it is a chain of blocks of data. At the most basic level, and at least in most current implementations, it can be pictured as something similar to the diagram below, which is based on the blockchain as famously implemented by Bitcoin.
Any given block of data in this implementation contains four pieces of information:
Timestamp – The time at which the block was created.
Transaction Root – A digest of the transactions contained in this block – i.e. this section of the ledger. In Bitcoin, this is the root of a Merkle tree built over the block’s transactions. The number of transactions covered can vary significantly: in Bitcoin, it will be approximately ten minutes’ worth. Other implementations use shorter windows.
Previous Hash – The hash of the last block in the chain – this is how the chain is linked together. When any given block has been processed, its hash becomes the Previous Hash of the next block in the chain, thus allowing historical records to be linked together and traversed.
Nonce – A cryptographic term referring to an arbitrary value used only once in a transaction. The purpose of this will be discussed in more detail later on.
The hash of the block – which becomes the Previous Hash value in the following block – is the hashed value of all the data held in these four chunks taken together.
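The structure described above can be sketched in a few lines of Python. This is a simplified illustration rather than Bitcoin's actual block format: the field names, the JSON serialisation, the single round of SHA-256 and the placeholder values are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from typing import Sequence


@dataclass(frozen=True)
class Block:
    timestamp: int          # creation time (Unix seconds)
    transaction_root: str   # digest summarising this block's transactions
    previous_hash: str      # hash of the preceding block in the chain
    nonce: int              # arbitrary value, discussed later in the article

    def hash(self) -> str:
        # Serialise the four fields deterministically, then hash them together.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def chain_is_linked(chain: Sequence[Block]) -> bool:
    # Each block's previous_hash must equal the hash of the block before it.
    return all(
        later.previous_hash == earlier.hash()
        for earlier, later in zip(chain, chain[1:])
    )


# Placeholder values purely for illustration.
genesis = Block(1500000000, "root-0", "0" * 64, 0)
second = Block(1500000600, "root-1", genesis.hash(), 0)
chain = [genesis, second]
```

Because the hash covers all four fields, changing any field of `genesis` after the fact changes its hash, so `second.previous_hash` no longer matches and `chain_is_linked` returns `False`.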
Perhaps a truism of security is that nothing is inherently tamper-proof: it has to be designed to make tampering difficult and then protected by as many anti-tampering controls as possible. It shouldn’t, therefore, come as a surprise that blockchains are not themselves tamper-proof without some additional controls.
The first of these controls is distribution and decentralisation: by ensuring that all interested parties have access to the ledger and any new transactions which are supposed to be added to it, tampering should become much more evident. If all of the parties involved have access to the same information, an attempt by anything less than a majority of stakeholders to incorrectly report a transaction will be noticed by all of the other parties who are processing the data honestly.
Without distribution and decentralisation – and therefore equal access to data for all interested parties – blockchain is no more tamper-proof than any other data storage mechanism. A blockchain owned and exclusively processed by one individual, regardless of how many nodes they operate and how many people can read the data stored in the blockchain, could be tampered with by virtue of the fact that one individual controls all of the processing.
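As a toy illustration of why a dishonest minority stands out, suppose each party reports the hash it computed for the latest block. The function name, node names and hash values below are hypothetical, and real networks do not literally count votes like this; the sketch only shows that a report disagreeing with the majority is immediately visible.

```python
from collections import Counter
from typing import Dict, List, Tuple


def find_dissenters(reported: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return the majority hash and the nodes that reported something else."""
    counts = Counter(reported.values())
    majority_hash, votes = counts.most_common(1)[0]
    # Require a strict majority; otherwise no version can be called "agreed".
    if votes * 2 <= len(reported):
        raise ValueError("no strict majority agrees on a block hash")
    dissenters = sorted(
        node for node, block_hash in reported.items()
        if block_hash != majority_hash
    )
    return majority_hash, dissenters


# Hypothetical reports: three honest parties and one falsified record.
reports = {"alice": "ab12", "bob": "ab12", "carol": "ab12", "mallory": "ff00"}
majority_hash, dissenters = find_dissenters(reports)
```

If a single party controlled every node, all of the reports would come from the same source and this comparison would reveal nothing.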
At this point, we need a method whereby the interested parties can communicate with each other and check the validity of a new block submitted to the chain. This is where implementations diverge. Historically, there have been three common approaches:
Proof of Work – Make calculating a valid hash for a block difficult to do, but easy for other parties to verify. The first person to calculate a valid hash submits it to the network and it is validated by the other parties prior to adding it to their chain.
Well-Known Uses: Bitcoin (cryptocurrency); Monero (cryptocurrency)
Proof of Stake – Block creators are determined pseudo-randomly based on their ‘stake’ in the blockchain. This is primarily used by cryptocurrencies as the stake is easily calculated based on the amount of currency held by each member.
Well-Known Uses: DASH (cryptocurrency); Ethereum (cryptocurrency, hybrid PoS/PoW)
Practical Byzantine Fault Tolerance – Something of a mouthful, PBFT is a consensus-based method of ‘tolerating’ faults in the data and recovering automatically. The specifics of the system are beyond the scope of this article.
Well-Known Uses: Hyperledger Fabric
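Of the three approaches, Proof of Work is the easiest to sketch. The example below is a minimal illustration under stated assumptions: "valid" is taken to mean that the SHA-256 hash begins with a given number of zero hex digits, and the function names and block data are hypothetical. Bitcoin itself uses double SHA-256 and compares the hash against a numeric target, but the asymmetry is the same: finding a nonce takes many attempts, while checking one takes a single hash.

```python
import hashlib


def block_hash(block_data: str, nonce: int) -> str:
    # Hash the block contents together with the candidate nonce.
    return hashlib.sha256(f"{block_data}:{nonce}".encode("utf-8")).hexdigest()


def mine(block_data: str, difficulty: int) -> int:
    """Search for a nonce whose hash starts with `difficulty` zero hex digits."""
    target_prefix = "0" * difficulty
    nonce = 0
    while not block_hash(block_data, nonce).startswith(target_prefix):
        nonce += 1  # try the next candidate value
    return nonce


def verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Checking a claimed nonce costs a single hash computation."""
    return block_hash(block_data, nonce).startswith("0" * difficulty)


# A low difficulty keeps the search short for demonstration purposes.
nonce = mine("example block", difficulty=3)
is_valid = verify("example block", nonce, difficulty=3)
```

Each additional zero digit required multiplies the expected search effort by sixteen, while the cost of `verify` stays constant.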
As an aside: all of the above are solutions to the risk of what is known as a Byzantine Fault – that is, a fault in which information may be imperfect or incomplete, so that the fault presents differently to the various parties involved. Consider that each party does not know whether there are malicious participants or how many, and that some of those who are ‘in’ on a scheme to falsify data may only be malicious ‘approvers’ of bad data, not generators of it.
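Returning to Proof of Stake, the pseudo-random, stake-weighted selection it describes can be sketched as follows. The function name, the member names and the choice to seed the draw from the previous block's hash are assumptions made for illustration; production systems use considerably more robust sources of randomness.

```python
import hashlib
import random
from typing import Dict


def select_creator(stakes: Dict[str, int], previous_hash: str) -> str:
    """Pick a block creator with probability proportional to stake."""
    # Seeding from shared data lets every node reproduce the same draw.
    seed = int(hashlib.sha256(previous_hash.encode("utf-8")).hexdigest(), 16)
    rng = random.Random(seed)
    members = sorted(stakes)  # fixed ordering so all nodes agree
    weights = [stakes[member] for member in members]
    return rng.choices(members, weights=weights, k=1)[0]


# Hypothetical holdings: carol holds most of the currency, so she is
# selected most often, but alice and bob still have a chance.
stakes = {"alice": 10, "bob": 30, "carol": 60}
creator = select_creator(stakes, previous_hash="ab12")
```

Because the draw is deterministic for a given seed, any party can recompute it and confirm that the claimed block creator was the one actually selected.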