Crypto Crash Course: Bitcoin, the Blockchain, and Proof of Work

12/29/2023

With the upcoming halving and expected Exchange-Traded Fund (ETF) approval, 2024 promises to be a landmark year for Bitcoin. The 'halving' is an event that occurs every 210,000 blocks, or roughly every four years, and cuts in half the rate at which new bitcoins are created. This mechanism, built into Bitcoin's design, mimics the scarcity of precious metals like gold. Currently, roughly 1.4 million bitcoins remain to be mined out of the total possible 21 million. Under this halving schedule, the last bitcoin is expected to be mined around the year 2140. This gradual reduction in new issuance keeps the supply predictable and capped, which is why Bitcoin is often described as deflationary.
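The halving schedule is simple enough to check with a few lines of arithmetic. The sketch below is in Python and is not taken from any Bitcoin client. It relies on three well-known protocol parameters (a 50 BTC starting reward, a halving every 210,000 blocks, and 100,000,000 satoshis per bitcoin) and, for the date estimate only, on the assumption that blocks arrive exactly every 10 minutes starting in 2009.

```python
HALVING_INTERVAL = 210_000            # blocks between halvings
SATOSHIS_PER_BTC = 100_000_000        # smallest unit: 1 BTC = 10^8 satoshis
INITIAL_SUBSIDY = 50 * SATOSHIS_PER_BTC
MINUTES_PER_YEAR = 365.25 * 24 * 60


def block_subsidy(height: int) -> int:
    """Newly created satoshis in the block at `height` (fees not included)."""
    halvings = height // HALVING_INTERVAL
    # Integer right shift halves the reward and rounds down to whole satoshis.
    return INITIAL_SUBSIDY >> halvings if halvings < 64 else 0


def first_zero_subsidy_height() -> int:
    """Height of the first block whose subsidy has been halved down to zero."""
    height = 0
    while block_subsidy(height) > 0:
        height += HALVING_INTERVAL
    return height


def total_supply() -> int:
    """Total satoshis ever created, summed one halving era at a time."""
    return sum(
        HALVING_INTERVAL * block_subsidy(start)
        for start in range(0, first_zero_subsidy_height(), HALVING_INTERVAL)
    )


def estimated_year(height: int, genesis_year: float = 2009.0) -> float:
    """Rough calendar year of a block, assuming one block every 10 minutes."""
    return genesis_year + height * 10 / MINUTES_PER_YEAR


if __name__ == "__main__":
    print(total_supply() / SATOSHIS_PER_BTC)            # slightly under 21 million
    print(estimated_year(first_zero_subsidy_height()))  # lands in the 2140s
```

Because every halving rounds down to a whole number of satoshis, the total comes out slightly below 21 million rather than exactly on it.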

For a variety of reasons, many otherwise technical people have not fully explored the world of Bitcoin or other cryptocurrencies, and until a few years ago, I was part of this group. When I decided to learn, the resources I found were either geared towards non-technical audiences or presupposed too much existing knowledge. My goal here is to provide a developer-friendly introduction to Bitcoin and lay the groundwork for further exploration.

Genesis and Motivations

The Bitcoin whitepaper first appeared online in October 2008, followed by the software and its source code in January 2009, both posted by a person or persons using the pseudonym Satoshi Nakamoto. Nakamoto's subsequent disappearance has added to Bitcoin's mystique, and theories abound about their identity and ultimate fate. Nakamoto envisioned a decentralized digital currency that is not controlled by any one group or government. This decentralization addresses key criticisms of fiat money, such as inflation, manipulation, and control.

Centralization, however, has its benefits. Without a central authority, questions arise: Who decides how much currency is in circulation, if and when more will be created, and how it is distributed? Who verifies that currency is legitimate and oversees digital transactions? Nakamoto's solution was the blockchain, a public ledger of every Bitcoin transaction. It stands in for a central authority by establishing network-wide consensus on the state of the ledger through a distributed mechanism.

The abstract of the Bitcoin whitepaper published by Nakamoto further clarifies:

A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial institution. Digital signatures provide part of the solution, but the main benefits are lost if a trusted third party is still required to prevent double-spending. We propose a solution to the double-spending problem using a peer-to-peer network. The network timestamps transactions by hashing them into an ongoing chain of hash-based proof-of-work, forming a record that cannot be changed without redoing the proof-of-work. The longest chain not only serves as proof of the sequence of events witnessed, but proof that it came from the largest pool of CPU power. As long as a majority of CPU power is controlled by nodes that are not cooperating to attack the network, they'll generate the longest chain and outpace attackers. The network itself requires minimal structure. Messages are broadcast on a best effort basis, and nodes can leave and rejoin the network at will, accepting the longest proof-of-work chain as proof of what happened while they were gone.

The double-spending problem refers to the risk that a unit of digital currency can be copied and spent more than once. It is unique to digital media because digital data can be perfectly reproduced. With physical currency, double spending is not an issue: once you hand a bill or coin to someone else, you no longer have it and can't spend it again.

Bitcoin addresses the double-spending problem with a consensus mechanism and an incentive to participate. Nodes that take part in the network's Proof of Work (PoW) system, the consensus mechanism, have a chance of earning bitcoin in the form of block rewards and transaction fees. To participate, a node selects transactions from the pool of unprocessed transactions and assembles them into a block. Once a valid block has been assembled, it is submitted to the rest of the network for inclusion in the network state: the chain of existing blocks, or blockchain. Building a valid block is computationally difficult and requires trial and error, so the first node to succeed receives the block reward, but only once the rest of the network has accepted its block.

Fool Proof

The blockchain is composed of linked sequential blocks and forms a public ledger of all Bitcoin transactions to date. At any given point, the nodes in the network share a consensus on the state of the blockchain. As new transactions accumulate in the pool, nodes continuously work to generate the next valid block, ensuring that transactions are processed. The mechanism through which the winning node receives the block reward is simple: the first transaction in every block, called the coinbase transaction, awards the miner of the block a fixed number of new coins plus the fees from every transaction in the block. The coinbase transaction is unique because it creates new coins from nothing, unlike regular transactions, which only transfer existing coins. Once the final bitcoin has been minted, the coinbase transaction will pay out fees alone. Through this competitive process, participating nodes simultaneously secure the network and prevent double-spending: every node can check a transaction against the blockchain, so a double spend is quickly shown to be invalid, and the network rejects any submitted block that includes one.
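To make the rejection rule concrete, here is a deliberately simplified sketch in Python. The `Tx` type, the string labels standing in for coins, and the `block_is_valid` function are hypothetical names invented for this illustration. Real nodes also verify signatures, amounts, and scripts, all of which this model ignores; it only tracks which outputs are still unspent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """Toy transaction: consumes some unspent outputs and creates new ones."""
    txid: str
    inputs: tuple[str, ...]    # labels of the outputs being spent
    outputs: tuple[str, ...]   # labels of the outputs being created


def block_is_valid(unspent: set[str], block: list[Tx]) -> bool:
    """Reject the block if any transaction spends a missing or already-spent output."""
    available = set(unspent)   # work on a copy; the caller's ledger is untouched
    for tx in block:
        for label in tx.inputs:
            if label not in available:
                return False   # unknown output, or a double spend
            available.remove(label)
        available.update(tx.outputs)
    return True


if __name__ == "__main__":
    ledger = {"alice-coin"}
    pay_bob = Tx("tx1", inputs=("alice-coin",), outputs=("bob-coin",))
    pay_carol = Tx("tx2", inputs=("alice-coin",), outputs=("carol-coin",))
    print(block_is_valid(ledger, [pay_bob]))             # single spend
    print(block_is_valid(ledger, [pay_bob, pay_carol]))  # same coin spent twice
```

The second block is rejected because the second transaction tries to spend an output that the first one already consumed.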

A Bitcoin block consists of a header and a body of transactions. The header contains metadata about the transactions in the block's body and is itself derived from them. Below is a real Bitcoin block, Block #286819, mined on 2014-02-20, as returned by the BlockCypher API (the response lists only the first 20 of the block's 99 transaction IDs):

{
  "hash": "0000000000000000e067a478024addfecdc93628978aa52d91fabd4292982a50",
  "height": 286819,
  "chain": "BTC.main",
  "total": 390068936365,
  "fees": 3049675,
  "size": 152509,
  "vsize": 152509,
  "ver": 2,
  "time": "2014-02-20T04:57:25Z",
  "received_time": "2014-02-20T04:57:25Z",
  "relayed_by": "",
  "bits": 419520339,
  "nonce": 856192328,
  "n_tx": 99,
  "prev_block": "000000000000000117c80378b8da0e33559b5997f2ad55e2f7d18ec1975b9717",
  "mrkl_root": "871714dcbae6c8193a2bb9b2a69fe1c0440399f38d94b3a0f1b447275a29978a",
  "txids": [
    "00baf6626abc2df808da36a518c69f09b0d2ed0a79421ccfde4f559d2e42128b",
    "91c5e9f288437262f218c60f986e8bc10fb35ab3b9f6de477ff0eb554da89dea",
    "46685c94b82b84fa05b6a0f36de6ff46475520113d5cb8c6fb060e043a0dbc5c",
    "ba7ed2544c78ad793ef5bb0ebe0b1c62e8eb9404691165ffcb08662d1733d7a8",
    "b8dc1b7b7ed847c3595e7b02dbd7372aa221756b718c5f2943c75654faf48589",
    "25074ef168a061fcc8663b4554a31b617683abc33b72d2e2834f9329c93f8214",
    "0fb8e311bffffadc6dc4928d7da9e142951d3ba726c8bde2cf1489b62fb9ebc5",
    "c67c79204e681c8bb453195db8ca7d61d4692f0098514ca198ccfd1b59dbcee3",
    "bd27570a6cbd8ad026bfdb8909fdae9321788f0643dea195f39cd84a60a1901b",
    "41a06e53ffc5108358ddcec05b029763d714ae9f33c5403735e8dee78027fe74",
    "cc2696b44cb07612c316f24c07092956f7d8b6e0d48f758572e0d611d1da6fb9",
    "8fc508772c60ace7bfeb3f5f3a507659285ea6f351ac0474a0a9710c7673d4fd",
    "62fed508c095446d971580099f976428fc069f32e966a40a991953b798b28684",
    "928eadbc39196b95147416eedf6f635dcff818916da65419904df8fde977d5db",
    "b137e685df7c1dffe031fb966a0923bb5d0e56f381e730bc01c6d5244cfe47c1",
    "b92207cee1f9e0bfbd797b05a738fab9de9c799b74f54f6b922f20bd5ec23dd6",
    "29d6f37ada0481375b6903c6480a81f8deaf2dcdba03411ed9e8d3e5684d02dd",
    "48158deb116e4fd0429fbbbae61e8e68cb6d0e0c4465ff9a6a990037f88c489c",
    "be64ea86960864cc0a0236bbb11f232faf5b19ae6e2c85518628f5fae37ec1ca",
    "081363552e9fff7461f1fc6663e1abd0fb2dd1c54931e177479a18c4c26260e8"
  ],
  "depth": 537699,
  "prev_block_url": "https://api.blockcypher.com/v1/btc/main/blocks/000000000000000117c80378b8da0e33559b5997f2ad55e2f7d18ec1975b9717",
  "tx_url": "https://api.blockcypher.com/v1/btc/main/txs/",
  "next_txids": "https://api.blockcypher.com/v1/btc/main/blocks/0000000000000000e067a478024addfecdc93628978aa52d91fabd4292982a50?txstart=20\u0026limit=20"
}

In the mining process, each component of the block header plays a crucial role. For Block #286819, these components were:

  • Version: 00000002
  • Previous Block Hash: 000000000000000117c80378b8da0e33559b5997f2ad55e2f7d18ec1975b9717
  • Merkle Root: 871714dcbae6c8193a2bb9b2a69fe1c0440399f38d94b3a0f1b447275a29978a
  • Timestamp: 1392872245
  • Bits (target in compact format): 419520339
  • Nonce: 856192328

Several of these fields are fixed by the protocol and cannot be set arbitrarily. The version indicates the version of the block format and rules that the block follows. The previous block hash is the hash of the preceding block in the blockchain, which is what links the blocks into a chain. The Merkle root is derived from all of the transactions included in the block. The bits field encodes the current target, which is calculated according to the rules of the Bitcoin protocol.
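As a rough illustration of how a Merkle root is derived from a block's transactions, the Python sketch below pairs up transaction IDs and hashes each pair with double SHA-256 until a single hash remains, duplicating the last hash whenever a level has an odd count. The function names are my own. The listing above shows only 20 of the block's 99 transaction IDs, so it cannot be used to reproduce that block's Merkle root; the sample IDs in the demo are made up.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, as Bitcoin does for transaction and block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids_hex: list[str]) -> str:
    """Fold a list of transaction IDs (in display order) into one Merkle root."""
    if not txids_hex:
        raise ValueError("a block contains at least the coinbase transaction")
    # Explorers display hashes byte-reversed; hashing uses the internal order.
    level = [bytes.fromhex(txid)[::-1] for txid in txids_hex]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])   # odd count: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0][::-1].hex()       # back to display order


if __name__ == "__main__":
    sample = ["11" * 32, "22" * 32, "33" * 32]   # made-up IDs for illustration
    print(merkle_root(sample))
```

Changing, adding, or reordering any transaction changes the root, which is why the header commits to the exact contents of the block.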

The puzzle that participating nodes in the Bitcoin network are attempting to solve is simple to state: select a set of unprocessed transactions from the transaction pool, organize them into a Merkle tree, compute the Merkle root, construct a block header containing that root, set the 32-bit nonce to an arbitrary value, and hash the 80-byte serialized header twice with SHA-256. If the resulting hash, read as a number, is at or below the target threshold, the block is valid; otherwise, they must modify the nonce and try again. If all of the roughly 4 billion nonce values have been checked without producing a valid block hash, miners can slightly adjust the timestamp, within certain network-enforced limits, and if all else fails, they can change the set of transactions (which changes the Merkle root) and try again.
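The hashing step can be tried directly on the header fields of Block #286819 listed earlier. The following Python sketch uses only the standard library; the function names and the `BLOCK_286819` dictionary are my own, not part of any Bitcoin library. It assumes the standard header layout: each integer field is packed as 4 little-endian bytes, and the two 32-byte hashes are stored in the reverse of the order in which they are displayed.

```python
import hashlib
import struct


def serialize_header(version: int, prev_block_hex: str, merkle_root_hex: str,
                     timestamp: int, bits: int, nonce: int) -> bytes:
    """Pack the six header fields into the 80-byte form that gets hashed."""
    return (
        struct.pack("<i", version)                      # 4 bytes, little-endian
        + bytes.fromhex(prev_block_hex)[::-1]           # 32 bytes, display order reversed
        + bytes.fromhex(merkle_root_hex)[::-1]          # 32 bytes, display order reversed
        + struct.pack("<III", timestamp, bits, nonce)   # 3 x 4 bytes, little-endian
    )


def block_hash(header: bytes) -> str:
    """Double SHA-256 of the header, returned in the usual display order."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()


def search_nonce(fields: dict, target: int, start: int, stop: int):
    """Try each nonce in [start, stop); return the first valid one, else None."""
    for nonce in range(start, stop):
        candidate = block_hash(serialize_header(nonce=nonce, **fields))
        if int(candidate, 16) <= target:
            return nonce
    return None


# Header fields of Block #286819 (everything except the nonce).
BLOCK_286819 = {
    "version": 2,
    "prev_block_hex": "000000000000000117c80378b8da0e33559b5997f2ad55e2f7d18ec1975b9717",
    "merkle_root_hex": "871714dcbae6c8193a2bb9b2a69fe1c0440399f38d94b3a0f1b447275a29978a",
    "timestamp": 1392872245,
    "bits": 419520339,
}

if __name__ == "__main__":
    header = serialize_header(nonce=856192328, **BLOCK_286819)
    print(len(header), block_hash(header))
```

With the nonce recorded in the block, this should reproduce the block's published hash. `search_nonce` is the same check wrapped in a loop, which is all that mining amounts to; in pure Python it is far too slow to be more than a demonstration.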

The 'bits' field in the header is a compact representation of the target: written as eight hexadecimal digits, the first two digits are the exponent and the remaining six are the coefficient. For Block #286819, the bits value 419520339 is 0x19015f53 in hexadecimal, so the exponent is 0x19 (25 in decimal) and the coefficient is 0x015f53. The target is calculated as coefficient × 2^(8 × (exponent - 3)). A block hash must be less than or equal to the resulting number to be deemed valid.
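The decoding rule translates almost word for word into code. This is a minimal Python sketch with function names of my own choosing. It ignores the sign bit that the full compact format reserves inside the coefficient, which is an assumption that holds for the bits values discussed here.

```python
def bits_to_target(bits: int) -> int:
    """Expand the compact 'bits' encoding into the full 256-bit target."""
    exponent = bits >> 24           # first byte (first two hex digits)
    coefficient = bits & 0xFFFFFF   # remaining three bytes (six hex digits)
    if exponent <= 3:
        # Very small targets shift right instead of left.
        return coefficient >> (8 * (3 - exponent))
    return coefficient << (8 * (exponent - 3))


def meets_target(block_hash_hex: str, bits: int) -> bool:
    """True if the displayed block hash, read as a number, is within the target."""
    return int(block_hash_hex, 16) <= bits_to_target(bits)


if __name__ == "__main__":
    target = bits_to_target(419520339)   # 0x19015f53
    print(format(target, "064x"))
    print(meets_target(
        "0000000000000000e067a478024addfecdc93628978aa52d91fabd4292982a50",
        419520339,
    ))
```

Shifting left by 8 × (exponent - 3) bits is the same as multiplying by 2^(8 × (exponent - 3)), so for this block the coefficient is simply followed by 22 zero bytes.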

In the header example above, the Merkle root of 871714dcbae6c8193a2bb9b2a69fe1c0440399f38d94b3a0f1b447275a29978a, combined with the other header fields and the nonce value of 856192328, produces a block hash of 0000000000000000e067a478024addfecdc93628978aa52d91fabd4292982a50. (Bitcoin reads the raw digest as a little-endian number, so the hexadecimal shown here is the digest with its bytes reversed.) That value is less than the target of 00000000000000015f5300000000000000000000000000000000000000000000.

The Bitcoin protocol does not guarantee that any particular header can be made valid, but in practice a valid hash can always be found. SHA-256, the hash function used in Bitcoin mining, is designed so that its outputs are indistinguishable from random: they are effectively uniformly distributed over the output space, so every possible 256-bit number is equally likely, and changing even a single bit of the input (such as the nonce or timestamp) results in an entirely different, unpredictable hash. The output space contains 2^256 possible hashes, and miners can vary the nonce, the timestamp, and the set of transactions, so it's practically certain that for any given difficulty target some input produces a valid hash.
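If hashes really are uniformly distributed, the odds of a single attempt succeeding follow directly from the target. The Python sketch below works this out for the target of Block #286819 (bits 0x19015f53). The function names are mine, and `MAX_TARGET` is the easiest target the protocol allows (bits 0x1d00ffff), against which "difficulty" is conventionally measured.

```python
HASH_SPACE = 2**256                        # number of possible SHA-256 outputs
TARGET = 0x015F53 << (8 * (0x19 - 3))      # target for bits 0x19015f53
MAX_TARGET = 0x00FFFF << (8 * (0x1D - 3))  # easiest allowed target (difficulty 1)


def success_probability(target: int) -> float:
    """Chance that one uniformly random hash is at or below the target."""
    return (target + 1) / HASH_SPACE


def expected_hashes(target: int) -> float:
    """Average number of attempts needed before one hash is valid."""
    return HASH_SPACE / (target + 1)


def difficulty(target: int) -> float:
    """How many times harder this target is than the easiest allowed one."""
    return MAX_TARGET / target


if __name__ == "__main__":
    print(f"per-hash success probability: {success_probability(TARGET):.3e}")
    print(f"expected hashes per block:    {expected_hashes(TARGET):.3e}")
    print(f"difficulty:                   {difficulty(TARGET):,.0f}")
```

At this target a single sweep of all 2^32 nonce values almost never contains a winner, which is why miners also have to vary the timestamp or the transactions.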

The Bitcoin protocol adjusts the difficulty level every 2016 blocks (approximately every two weeks) to maintain an average block time of about 10 minutes. This is done through a decentralized, automated process and ensures that as the total computational power of the network changes, the difficulty of finding a new block adjusts to keep the rate of block discovery stable.
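The adjustment itself is a proportional rescaling of the target. The Python sketch below captures the idea under simplifying assumptions: the real implementation measures the time span slightly differently and re-encodes the result into the compact bits format, both of which are skipped here. The names are my own.

```python
RETARGET_INTERVAL = 2016                      # blocks between adjustments
BLOCK_SPACING_SECONDS = 600                   # the 10-minute goal
EXPECTED_TIMESPAN = RETARGET_INTERVAL * BLOCK_SPACING_SECONDS   # two weeks
MAX_TARGET = 0x00FFFF << (8 * (0x1D - 3))     # easiest target the protocol allows


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last 2016 blocks actually took."""
    # A single adjustment is limited to a factor of 4 in either direction.
    clamped = min(max(actual_timespan, EXPECTED_TIMESPAN // 4),
                  EXPECTED_TIMESPAN * 4)
    new_target = old_target * clamped // EXPECTED_TIMESPAN
    return min(new_target, MAX_TARGET)        # never easier than difficulty 1


if __name__ == "__main__":
    old = 0x015F53 << 176
    one_week = EXPECTED_TIMESPAN // 2
    # Blocks arrived twice as fast as intended, so the target is halved.
    print(retarget(old, one_week) == old // 2)
```

A lower target means fewer hashes qualify, so mining gets harder; when blocks arrive too slowly, the target rises and mining gets easier.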

The Journey of a Bitcoin Transaction

Every Bitcoin transaction goes through a series of steps, from initiation by the user to confirmation on the blockchain. It starts when a user decides to send bitcoin and creates a transaction specifying the amount to send, the recipient's wallet address, their own public key, and the transaction fee they are willing to pay. The user then signs the transaction with their private key, proving to the network that they are authorized to spend the funds.

The signed transaction is broadcast to the network, where it is added to the pool of transactions waiting to be processed. The participating nodes on the network select transactions from the pool based on the fees offered, verify that the transactions are valid, and include them in the blocks they are attempting to generate.
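One plausible way to picture fee-based selection is a greedy pass over the pool, taking the transactions that pay the most per byte until the block is full. The Python sketch below is a hypothetical model, not how any particular mining software works: `PendingTx`, `select_transactions`, and the 1,000,000-byte limit are illustrative assumptions, and real miners also account for dependencies between transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingTx:
    """Toy pool entry: just an ID, the fee offered, and the serialized size."""
    txid: str
    fee: int    # satoshis
    size: int   # bytes


def select_transactions(pool: list[PendingTx],
                        max_block_size: int = 1_000_000) -> list[PendingTx]:
    """Greedily pick the highest fee-per-byte transactions that still fit."""
    chosen, used = [], 0
    for tx in sorted(pool, key=lambda t: t.fee / t.size, reverse=True):
        if used + tx.size <= max_block_size:
            chosen.append(tx)
            used += tx.size
    return chosen


if __name__ == "__main__":
    pool = [
        PendingTx("a", fee=1_000, size=500),
        PendingTx("b", fee=5_000, size=500),
        PendingTx("c", fee=100, size=500),
    ]
    # With room for only two transactions, the lowest bidder waits.
    print([tx.txid for tx in select_transactions(pool, max_block_size=1_000)])
```

This is why the fee a user sets affects how long a transaction waits: when the pool is crowded, low bidders are left for a later block.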

Once a miner successfully mines a new block containing the transaction, the transaction is considered confirmed. This process can take anywhere from a few minutes to several hours, depending on the state of the network, the transactions in the pool, and the fee set by the user.

The Vastness of the Bitcoin Address Space

Bitcoin wallet addresses are derived from Bitcoin private keys. The private key space is significantly larger than the address space, with about 2^256 (approximately 1.16×10^77) possible private keys compared to 2^160 (approximately 1.46×10^48) possible addresses, so many different private keys necessarily map to the same address. Even so, actually finding such a collision is astronomically unlikely, because of the immense size of both spaces and the mechanics behind wallet address generation.
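Python's arbitrary-precision integers make these magnitudes easy to check. The short sketch below uses 2^256 as the size of the private key space, as the text does; the number of keys that are actually valid is very slightly smaller, which makes no difference at this scale.

```python
PRIVATE_KEY_SPACE = 2**256   # upper bound on the number of private keys
ADDRESS_SPACE = 2**160       # addresses are built from 160-bit hashes

# On average, this many private keys land on each address.
keys_per_address = PRIVATE_KEY_SPACE // ADDRESS_SPACE

if __name__ == "__main__":
    print(f"private keys:     {PRIVATE_KEY_SPACE:.2e}")
    print(f"addresses:        {ADDRESS_SPACE:.2e}")
    print(f"keys per address: {keys_per_address:.2e}")
```

Each address is therefore shared, in principle, by about 2^96 private keys, yet locating even one of them means searching a space of 2^160 addresses.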

Your Bitcoin private key, corresponding public key, and corresponding wallet address are all derived from a single number. If someone else could guess this number, they could reproduce your private key, sign transactions, and control your funds. The only thing protecting your Bitcoin is the sheer size of the private key space, which is incredibly large. As humans, it's very difficult for us to reason about numbers beyond a certain size, so I find comparisons and practical applications helpful.

The entire Bitcoin address space is around 2^160, or approximately 1.46×10^48 unique addresses. Assuming each address takes an average of 34 bytes to store, the total size of a file listing every address would be 34 bytes × 1.46×10^48 addresses, or approximately 4.96×10^49 bytes.

To store this data on hypothetical 1 PB (1,000 TB, or 1×10^15 bytes) hard drives, we would need about 4.96×10^34 hard drives.

With the dimensions of a standard 3.5-inch HDD being roughly 14.7 cm × 10.16 cm × 2.032 cm, the footprint of each hard drive (the area of its largest face) is approximately 149.35 square centimeters. Therefore, 4.96×10^34 hard drives laid flat would cover 149.35 square centimeters × 4.96×10^34, or approximately 7.41×10^36 square centimeters.
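The arithmetic so far can be reproduced in a few lines of Python. The inputs are the ones used in the text, with two choices of my own spelled out: the drive capacity is taken as 1×10^15 bytes, and the footprint as 14.7 cm × 10.16 cm. Because the script keeps full precision instead of rounding at every step, its results can differ from the figures above in the last digit.

```python
ADDRESS_COUNT = 2**160                  # every possible address
BYTES_PER_ADDRESS = 34                  # assumed average storage per address
DRIVE_CAPACITY_BYTES = 10**15           # hypothetical drive size
DRIVE_FOOTPRINT_CM2 = 14.7 * 10.16      # largest face of a 3.5-inch drive

total_bytes = ADDRESS_COUNT * BYTES_PER_ADDRESS
drives_needed = total_bytes / DRIVE_CAPACITY_BYTES
total_area_cm2 = drives_needed * DRIVE_FOOTPRINT_CM2

if __name__ == "__main__":
    print(f"bytes to store every address: {total_bytes:.2e}")
    print(f"drives needed:                {drives_needed:.2e}")
    print(f"area covered (cm^2):          {total_area_cm2:.2e}")
```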

Compare this to the surface area of the Sun, which is about 6.09×10^22 square centimeters: these hard drives would cover the surface of the Sun roughly 1.22×10^14 times. If the hard drives were stacked in layers, with each layer covering an area equal to the Sun's surface and standing 2.032 centimeters tall, the total height of the stack would be 2.032 centimeters × 1.22×10^14, or approximately 2.48×10^14 centimeters. Given that the average distance from the Earth to the Sun is about 1.496×10^13 centimeters, this stack would be roughly 16.6 times as tall as the distance between the Earth and the Sun.

The strength of Bitcoin's security model lies not only in the large number of possible keys and addresses but also in the robustness of the cryptography involved: elliptic-curve keys on the secp256k1 curve, ECDSA (Elliptic Curve Digital Signature Algorithm) for signatures, and SHA-256 (along with RIPEMD-160 for addresses) for hashing. These algorithms make it computationally infeasible to recover a private key from a public key or address, thereby safeguarding against reverse engineering. It's also worth noting that while the probability of address collisions is incredibly low due to the size of the key spaces, it is not zero. However, the likelihood is so minuscule that it doesn't practically affect the integrity or functionality of the Bitcoin network.

So You're Saying There's a Chance...

Even with the thought experiment above, it is difficult to grasp just how unlikely it is to randomly generate a number that has been used to create a wallet before. It certainly was for me, so I wrote a little program to prove it to myself. In the btc_lottery repository you'll find a Go program that generates thousands of wallets and compares them to a list of all wallets that hold a bitcoin balance. Every person on the planet could run this program continuously for hundreds of years without ever generating a match.
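A back-of-the-envelope estimate points the same way. The Python sketch below is separate from the btc_lottery program and rests entirely on made-up inputs, chosen to be generous to the guesser: 50 million funded addresses, 8 billion people, one million addresses generated per person per second, and 500 years of continuous running. All four figures are assumptions for illustration, not measurements.

```python
import math

ADDRESS_SPACE = 2**160
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def expected_matches(funded_addresses: int, addresses_per_second: float,
                     people: int, years: float) -> float:
    """Average number of funded addresses hit by pure random guessing."""
    attempts = addresses_per_second * people * years * SECONDS_PER_YEAR
    return attempts * funded_addresses / ADDRESS_SPACE


def probability_of_any_match(expected: float) -> float:
    """Chance of at least one hit, treating every guess as independent."""
    return -math.expm1(-expected)   # 1 - e^(-expected), accurate for tiny values


if __name__ == "__main__":
    # Hypothetical inputs, all deliberately generous to the guesser.
    e = expected_matches(
        funded_addresses=50_000_000,
        addresses_per_second=1_000_000,
        people=8_000_000_000,
        years=500,
    )
    print(f"expected matches:      {e:.2e}")
    print(f"chance of any match:   {probability_of_any_match(e):.2e}")
```

Even with inputs this generous, the expected number of matches stays many orders of magnitude below one.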

The Bitcoin ETF

The approval of a Bitcoin ETF by regulators in the United States, as is much anticipated in 2024, would represent a significant milestone in the integration of cryptocurrency into mainstream financial markets. Such an approval would be seen as a vote of confidence in the cryptocurrency market and would make it much easier for a broader range of investors to gain exposure to Bitcoin. Unlike buying Bitcoin directly, which requires dealing with cryptocurrency exchanges and secure storage solutions, an ETF can be bought and sold like any other stock through traditional brokerage accounts. This lowers the barrier to entry for individual and institutional investors who may have been hesitant to invest in cryptocurrencies due to perceived complexity or security concerns. The ETF's presence on major stock exchanges adds a layer of legitimacy to Bitcoin, positioning it as a more established and accepted financial asset.

Beyond legitimizing Bitcoin and opening it up to a flood of new potential capital from pension funds, mutual funds, and other large-scale investment entities, an ETF is subject to regulatory oversight, which brings a level of security and transparency that direct cryptocurrency investments lack. For investors concerned about the unregulated nature of the cryptocurrency market, an ETF offers a regulated alternative. A Bitcoin ETF is a significant step towards the institutional adoption of Bitcoin. It provides a familiar and regulated framework for institutional investors to gain exposure to Bitcoin's potential returns without the complexities and risks of direct cryptocurrency handling.

Beyond the Basics

While it's important to have a sense of the theory and technology underpinning Bitcoin, it is only a start. For one, Bitcoin may be the first and most famous cryptocurrency, but it most certainly is not the only one. What are the altcoins, and which ones are commonly used? Where do you purchase cryptocurrencies, and how do you store them? Why are there so many different options for wallets, and which ones should you consider? What is a seed passphrase, and how is it different from a private key? I plan to dive into these and other practical questions in an upcoming article.