Building Bitcoin in JavaScript

I figure the best way to learn about something is to try to build it from scratch and see what happens.

For this year’s JS@PayPal conference, I thought it would be an interesting exercise to live-code Bitcoin in a 30 minute talk. I’ve never attempted a live-coding talk, so I figured I would throw myself in the deep end!

Here’s the full-length video I recorded before the conference, with in-depth explanations of everything I did:

First, I decided to cover the core Bitcoin concepts in a simplified way:

  • The block chain
  • The block reward, and block reward schedule
  • The block difficulty, and block difficulty adjustments
  • The mempool of pending transactions
  • The structure of a completed block
  • Communicating new transactions and blocks over the network

To do this, I decided to vastly simplify everything:

  • It wouldn’t be production-ready
  • It wouldn’t be identical to Bitcoin in every way
  • It wouldn’t be secure in any way
  • It wouldn’t be especially performant

A picture paints a thousand words here:

Bitcoin is the Mona Lisa, with laser eyes. What I’m building is… that thing on the right.

It’s good to understand some Bitcoin Basics before we get started.

  • Bitcoin is powered by a ‘block chain’
  • A ‘block’ is just like an object containing various data, which is propagated across the Bitcoin network so everyone has the same state and source of truth, including the same transactions.
  • Each block is cryptographically tied to the previous block, forming a long chain of blocks each with new data and new transactions.
  • Each block contains transactions, which means coins can be transferred between different actors on the network. Bob sends Jane 5 coins, Jane sends Harry 3 coins, and so on. We can keep track of the balances by looking at all of the historical blocks.
  • Blocks need to be ‘mined’ to be generated, which takes a certain amount of computational power and luck. This secures the network: to hack or attack Bitcoin, you need more computational power than everyone else on the network.
  • Anyone who successfully mines a block, earns a reward of a fixed number of coins. This encourages miners to keep mining.
  • Transactions are also sent with optional fees. These fees are also collected by the miner of a block, along with the block reward. So there is a double incentive to mine new blocks.
  • Mining blocks gets incrementally harder or easier over time, to make sure each new block is mined at an approximately constant rate over time. Keeping the block rate constant is important: too high a block time, and the network will be unusable. Too low a block time, and the network can easily be spammed with conflicting blocks.
  • Each node and miner in the network agrees on all of these rules, and on the format of the blocks. This makes it extremely difficult to create fake blocks, or to send transactions which spend the same coins multiple times — even if you do have a vast amount of computing power.

To learn more about these fundamentals, I strongly recommend starting here.

First, I came up with some types for the public interface of the block chain.

The important elements here are:

  • Transactions, which require a sender, receiver, amount, and a user-defined fee. The fee will determine the priority for the transaction on the network.
  • Blocks, which will have a bunch of important fields.

Blocks will contain:

  • An id, to identify the current block
  • A miner id (public key), to identify who gets the reward for each block
  • A parent block id, to allow the blocks to form a linear chain
  • An index, to easily keep count of the blocks
  • A time, to represent when the block was mined
  • An elapsed time, to represent the time between this block and the last
  • An array of transactions, to track sending and receiving coins
  • A difficulty setting, to keep the block time constant
  • A block reward, to incentivize miners.

Next, let’s define some constants for how the block chain will work.

We have some really important stuff here:

  • BLOCK_TIME is the time we’re aiming for between each block being generated. Bitcoin aims for 10 minutes between each block; we’re going to go for 1 second, so we can quickly run this locally and try it out.
  • INITIAL_REWARD is the reward of coins, for each block. This reward goes to whoever mined the block. This reward will halve over time, and will encourage more miners to try to mine blocks, process transactions, and keep the network secure and immune from double-spends or other attacks.
  • REWARD_HALVING_SCHEDULE is the number of blocks before this reward halves. Bitcoin typically halves every four years. We’ll aim for every twenty blocks, equivalent to every 20 seconds.
  • BLOCK_SIZE_LIMIT is the maximum number of transactions that can be included in any block. We’ll set this to 10, but the real limit in Bitcoin is much larger and is based around the MB size of the block.
  • GENESIS_BLOCK defines the initial hard-coded block. All other blocks will have this genesis block as their ancestor. We’ll award this block to SATOSHI for now, and we’ll define a default difficulty and the initial block reward.

The meaning of each of these should hopefully become clearer as we use them in our code.

Next, let’s stub out the block chain. This will be our ‘database’ or system-of-record for all blocks and transactions.

Now we can start actually implementing some of these functions!

Defining the block chain data structure

First, the data structure for the blockchain itself:

You might think a block chain like Bitcoin is a linked-list, where each block is a node in the list.

Actually, a block chain is more like a tree of blocks.

There can be many competing branches of this tree. How do we know which branch to trust? Easy! We pick the longest branch, since that is the branch that took the most computational effort to create. The longer the branch, the more confidence we can have that no other branch can overtake it.

So, we use a tree data structure in memory to represent our block chain, because that allows us to easily calculate the longest branch at any given time.

We pass in GENESIS_BLOCK as the root of this tree, since that will always be the first block. Hard-coding this makes things easier and offers no particular disadvantage.

Next, let’s write some code to mine a new block. This will be the most challenging exercise of the day.

To mine a block, we need three things to start:

  1. The existing blockchain, represented here as root. We need to call getLongestBranchNode() here to find the current ‘true’ block chain, which we want to mine another block on top of.
  2. The publicKey of the miner. This needs to be part of the block, so everyone else on the network knows who will get the reward for mining the block.
  3. A list of transactions. There’s not much point in mining a block if we don’t have any transactions to publish to the network.

Most of this is easy. We insert the miner public key, the id, the parent id, the time, the index, and the transactions into the new block.

We’re going to re-validate the transactions here to make sure they’re correctly signed. Every other node will do the same before they accept the new block, and we want to maximize our chances of our block being accepted without giving anyone a good excuse to reject it.

The next tricky part is dealing with the reward.

We want to halve the reward every 20 blocks. So we calculate what the current reward will be, for the current block index, assuming the reward starts at 1024 and halves each 20 blocks. Some quick math here to achieve that, by checking if the current block index is divisible by 20, and if so, halving the reward. Otherwise we keep the same reward.

Next, we need to calculate the new difficulty for the block.

In reality, Bitcoin recalculates difficulty once every 2016 blocks, or approximately every 2 weeks. For simplicity, we’re going to recalculate the difficulty for every single new block so we can demo this in real time.

This is an extremely naive way of calculating the difficulty, but it works for this example.

  • If the block took too long to generate, we lower the difficulty by 1.
  • If the block took too little time to generate, we increase the difficulty by 1.

This means that we’re constantly re-calibrating how difficult is is to create the next block, based on how long it took us to create the previous block. This keeps blocks being created at an approximately-constant rate, on average, no matter how many miners there are trying to mine new blocks.

The more miners you have, the higher the ‘hash rate’, which is a measure of how much computational power is currently being used across the world to mine new coins.

So either you have:

  • A small number of miners, and a low hash rate, with a low difficulty
  • A large number of miners, and a high hash rate, with a large difficulty

This means that Hash-Rate / Difficulty should always be a constant number, which keeps the average time for each new block relatively constant.

Once we’ve created a new block, we’re still not done . The block will only be accepted by the network if the difficulty is correct, based on the previous block.

We check this difficulty in a pretty naive way here. We hash the block, turn that hash into a number, and if the number is evenly divisible by the difficulty, with no remainder, we allow it to pass.

This means as the difficulty number gets higher, there is a smaller and smaller chance that the block will pass the difficulty check, and we will have to make more and more guesses to generate a new block. This takes more computational power, and more time.

The opposite is also true. If the difficulty number gets lower, there is a higher chance that a new block will pass the difficulty check, and it will take fewer guesses to generate a new block. This takes less computational power, and less time.

This simple version of a difficulty check works because it is more likely that a number is divisible by a low number like 2 than by a higher number like 17. So it is easier to generate a block whose hash happens to be divisible by 2, and harder to generate a block whose hash happens to be divisible by 17.

When a new block is propagated to the network, we need to be able to add it to our tree data structure.

First, we want to verify the block matches its hash, and all of the transactions within the block are correctly signed.

Then, we want to verify the block passed all of the correct rules. For now, we’re just checking the difficulty matches. In reality, we would want to check every single field of the block and validate it completely. We can’t trust any other node on the network, so we want to validate every single thing.

If a block is not valid, at this point it will be rejected by the node. Since the entire network follows the exact same set of decentralized rules, every node will accept the exact same valid blocks, and reject the exact same invalid blocks, without needing a central authority to tell them which blocks are valid.

So not only do attackers need to spend a large amount of money for electricity to run the computations to create new blocks — they also would have an extremely hard time bending the rules that are set by every other node on the network for the integrity of a block.

Finally, we just need to find the parent block specified by the new block, and add the new block as a child node to that parent. This extends the existing tree with the new block, so we can use it later to calculate everyone’s current balances.

The final thing we need to do is have some way to take all of the following:

  • Transactions
  • Fees
  • Block Rewards

And calculate the final balances for each wallet on the chain based on those values in each block.

We do this by finding the longest chain at any given time, and adding up all of the balances, fees, and block rewards. All of the data we need to do this is contained in each block.

  • The miner gets the block reward for each block
  • The miner gets the fee for each transaction
  • The receiver gets the amount of the transaction
  • The sender loses the amount of the transaction
  • The sender loses the fee of the transaction

You’ll notice transactions are a zero-sum-game. Each amount is subtracted from one account and added to another account. The same goes for each fee. So transactions can only use coins that are already present on the network, they can not create or destroy coins. (Of course, you can send coins to a dead address or lose your private key in a boating accident. This does not destroy coins though, it only makes them permanently inaccessible).

Rewards, however, are not a zero-sum-game — we’re adding new coins to the network, without ever taking them away. This is actually the source of all new Bitcoins. This is also how Bitcoins eventually reach their fixed cap. Eventually, the block reward halves and halves and halves down to zero. At which point no new Bitcoins can ever be added or subtracted from the system, and we end up with a fixed cap of 21 million.

This halving schedule gives Bitcoin its S2F value, which programs its price to rise like clockwork every four years. Just kidding.

We’re now done with the data-structure / database side of things. We have a block chain, we’re able to create new blocks, add them to the tree, and calculate the final balance of each wallet based on the longest chain of blocks in the tree.

But we’re not done quite yet!

Bitcoin doesn’t just run on a single computer or server. We need a way to connect to a peer-to-peer network, share these blocks with others, and publish new transactions to the network. So we need a node!

Again, we need to define some types and stubs for our Node, so we have something to code against.

There are a few concepts here:

  • We want the ability to send Bitcoins from our wallet to another receiver address
  • And we want the ability to continually mine new blocks and earn block rewards

Now we can start implementing these functions!

First, we’re going to instantiate the stuff we need to run the node:

  • We need a cryptographic key pair. We’ll use the public key to identify the node/wallet/miner, and the private key to sign any new transactions. Each miner, node, wallet, or any other actor uses this public key to identify themselves on the network. The private key helps them prove their identity.
  • We need to connect to a peer-to-peer network of other nodes. We’ve used a fake Network here for convenience, so we can broadcast and listen for events. In reality, Bitcoin will connect to a real peer-to-peer network of nodes across the internet, without any centralized servers required.
  • Of course we need the block-chain itself, which we already implemented above. We’ll instantiate that block-chain here, so the node can use it to store new blocks and mine new blocks and get current wallet balances.
  • We need a mempool. This is going to store incoming transactions, until they can be added to a block — since it will take some time to generate a block, we need somewhere to keep the transactions in the mean time.

We’ll need to be able to listen for transactions being sent from any other nodes on the network:

At this point we just push the transaction into the mempool. We could and probably should validate it at this point — but we’re already doing that before including the transactions into a block, so we’ll skip it for now.

The mempool is a staging area for new transactions. They aren’t final or ‘real’ yet and nobody should rely on these mempool transactions to see if they got paid. To become final, the transactions will need to be included in a newly mined block.

We of course want to be able to send our own new transactions:

To do this, we create a signed message using our private key. This contains the sender address, the receiver address, the amount we want to send, and the fee we want to attach to the transaction (to give it priority on the network).

Only the person with the corresponding private key to the public key (public address) can sign a transaction like this. Without your private key, nobody can send transactions on your behalf. And of course the private key is never shared to anyone else on the network.

Once we’ve created and signed this transaction, all that remains to do is to broadcast it to the network, so it can be picked up by any other node that is listening, and added to their mempool.

Blocks can be created by any other node on the network, and this is where most blocks will come from in the real world. So we need to be able to listen for new blocks from others, and add them to our blockchain.

We’ve already defined the logic to validate the block’s integrity in addBlock, so we’re just going to invoke that method here. If the block gets rejected, no big deal, we’ll just continue on as usual.

We also are going to blank out the current mempool. In reality, we probably want to only filter out the transactions from the mempool that actually appeared in the new block, but we’re going to take a shortcut here for simplicity.

Finally, we actually want to run the node, and attempt to mine new blocks:

To avoid killing my browser, I’m running this in an asynchronous loop which runs every 500ms. In the loop, we pull out a limited number of transactions from the mempool (based on the block size limit we defined earlier), and we attempt to mine a new block which includes those transactions.

In reality, we would normally search for the transactions with the highest fees, so we get the maximum reward for mining the block. Here, for simplicity, we’re just taking them in the order they are added to the mempool.

If we’re successful in mining a new block, we publish that block to the network to be consumed by other nodes.

If someone else mines a new block before we do, we start over, and try mining a new block on top of that block instead — since createBlock will always try to mine on top of the longest chain of blocks at any given time.

You can find the code for this and run it here:

The most important files are:

When you run this project, my partner Jill has created an awesome UI to help you visualize what is going on, as nodes mine new blocks and send transactions back and forth:

I hope this was useful, to help you understand some of the principles used to power Bitcoin!

Remember, this is a vast simplification of the real Bitcoin node and mining software. It’s intended for learning purposes only, and nothing more.

Cheers! — Daniel

works for PayPal, as a lead engineer in Checkout. Opinions expressed herein belong to him and not his employer.