
What is a consensus algorithm, and why does Bitcoin need it?

Welcome to Prime Number. Today I'd like to talk about the consensus algorithm that Bitcoin uses. But before we get to it, let's step back a little and talk about general database design, because Bitcoin is just a public database that anybody can write into.

For a traditional database design, we come up with a master/slave architecture. There's only one single node, the master, that can be used to save a transaction, and the other nodes can be used for reading. The master node can be used for reading as well, for sure, but you cannot write in parallel from two nodes. Assuming we write a transaction into the master node, the slaves always trust the master. Even if a slave goes down, the database still functions, because when the slave comes back up it pulls, and always trusts, whatever final state the master node has.

But with this design you have to have a master node that is always available for writes. What if the master node is down? So later designs added fault tolerance: if the master node is down, an election happens to elect the next master node, or leader node. Protocols such as Raft elect the next leader, and once a node becomes the leader, write transactions go through it. That increases the availability of your database.

But that brings another problem. Since any node can be elected leader, there's no longer a single node whose state is the source of truth. Think of a scenario where we have five nodes and three of them are down, so only two are still alive. Should we still allow transactions to be written to the two nodes? You may say, why not, the database can still function. But what if the three nodes are not down, and instead a firewall has come up between them and the other two nodes, and the nodes on both sides think they are still fully functioning? The three nodes become one small cluster and the two nodes become another cluster. Then all of a sudden the firewall disappears. Which side makes the call?

When things like this happen, that's when we need consensus. Consensus is a fault-tolerant mechanism for agreeing on a single final value among distributed, unreliable participants.

Since Bitcoin is still a database, it should still follow the CAP theorem. Consistency means every read returns the most recent write or an error. Availability means every request gets a response, even if the response doesn't contain the latest data. Partition tolerance means the system keeps functioning even though a number of messages have been dropped or a number of nodes have dropped off the network. The CAP theorem says that in a distributed environment it is impossible to support more than two of these three guarantees simultaneously. While we discuss the consensus algorithm, keep in mind which two of the guarantees Bitcoin supports and which one it doesn't.

Even though Bitcoin is a database, it differs from a traditional or centralized database in that every transaction has to be signed by the sender. That means nobody has to vouch for a transaction; you just validate it through its signature.

For example, let's do a threat analysis. If you are an attacker, are you able to forge a transaction that pays coins to yourself? No, because every transaction is signed by whoever holds the private key. So what can you do? Well, if you are a malicious node you can, for example, just ignore some transactions, similar to a denial-of-service attack. You can ignore some known public key hash and never pick up its transactions to put into a block. But it doesn't hurt, because any miner, any node in the network, can pick the transaction up and put it into a block, since the transaction is valid and signed by the sender. So the denial-of-service attack doesn't work either, and none of this is a job for consensus, right?

So what is the most important thing in Bitcoin? It's the double-spend attack. This is the only reason the consensus algorithm exists in Bitcoin.

Consensus here is hard. Not only is there no identity for any node in the Bitcoin network, there is also no known total number of nodes, so there is no way to count a majority of nodes. Even worse, you cannot trust, or even use, a global time to decide which transaction came first. Given all these limitations, how do you design a consensus that prevents the double spend? I think this is the major contribution Bitcoin made to the whole cryptocurrency world. It opened the door to building real decentralized applications: any node can join and drop out, nothing depends on a global time, and still the system as a whole can be trusted.

Since there is no master node in the Bitcoin network, no leader, no golden state that all the other nodes can follow, the Bitcoin consensus algorithm is fairly simple: always extend the longest blockchain (strictly speaking, the chain with the most accumulated proof of work). That means when the chain temporarily forks, you always append your block to the longest chain. It may sound much simpler than other consensus algorithms used in the database industry, such as Raft, but it's not. The Bitcoin nodes are spread across different geographic locations, each node has different network latency, and each may have its own clock, so just finding the longest blockchain is not as easy as you might imagine.

So Bitcoin introduced proof of work, which slows the whole process down and lets things settle, and that makes it easier to achieve consensus. With proof of work, in order to produce a block, the miner needs to guess a nonce, put the nonce into the block, and calculate the block's hash value. The hash value has to meet some criterion, such as having a certain number of leading zeros. The difficulty is set so that producing a block takes roughly ten minutes. Ten minutes is long enough for the majority of the Bitcoin nodes to receive the transactions and the blocks. Using this trick to let everything settle makes it much easier for the whole network to achieve consensus. It may sound brutal, and you may think it can't work, but in reality it works pretty well.

Let's take a look. Let's simulate how an attacker would do a double spend. This is the current state of the blockchain: block number one, number two, number three. An attacker named Alice wants to purchase something online from Bob, but at the same time she wants to spend the same money to herself. What would she do? She would create a transaction, Alice to Bob, and send it to the Bitcoin network. At the same time she would send another transaction, Alice to another address that also belongs to Alice.

Assume both transactions land on the same Bitcoin node. The node only trusts the first transaction it receives, so no matter which one comes first, there's no double spend. Either Alice pays Bob or Alice pays herself.

The other situation is that one transaction is sent to one node and the other transaction to another node, and both nodes are lucky enough to come up with a block, which means each guessed a nonce that meets the hash criterion. So both append their block to the blockchain: one node appends "Alice to Bob" to its copy, and the other appends "Alice to herself" to its copy. At this point neither node knows about the other transaction. As we discussed, a transaction's input has to reference a previous output, and both of these transactions reference the same previous output. That's why, if both transactions are sent to the same node, the later one gets dropped. But since they were sent to different nodes, each one is still valid where it landed, so both blocks are valid.

Did Alice successfully double spend? Well, at this point we don't know. The consensus rule is that the next block should always be appended to the longest chain. So which one is the longest chain? We don't know. Then who decides which chain is the valid chain? It's the next block. Whoever comes up with the next block adds it to one chain or the other. If it's added to this chain, Alice paying herself becomes the valid transaction and Alice paying Bob is ignored by all the following blocks. But if it's appended to the other one, then Alice paying Bob is the valid transaction and Alice paying herself is the invalid one.

You may say it's still possible that two different nodes come up with two different blocks and append one to each chain at almost the same time. Well, in reality the probability that new blocks keep being appended to both chains at the same time drops exponentially as more blocks are generated. As soon as one more block is appended to one of the chains, that chain wins.

You may feel that even though I described Alice paying Bob as the valid transaction and Alice paying herself as the malicious one, they're identical from the Bitcoin network's point of view. Both transactions are valid and both are treated equally. The only thing the network wants to prevent is the double spend; it makes no judgment about which transaction is the good one. So if the left-hand chain gets picked as the blockchain, Alice paying herself is the valid one and Alice paying Bob is treated as the malicious one.

You may ask: Alice already paid Bob, that payment was recorded in the blockchain, and Bob may already have sent the product to Alice. That's true. That's why the convention in Bitcoin is that even though your transaction is recorded in the blockchain, if no more blocks have been appended after it, the transaction is not confirmed. The convention is that after six confirmations your transaction is treated as successful. That's six blocks, and each block takes about ten minutes to produce, so about one hour.

The Bitcoin database is not a consistent database. That means after a successful write, or a successful commit, there's no guarantee your transaction can be read from another node; it may not even exist on the other node. But the Bitcoin database is eventually consistent. So how long does it take to reach eventual consistency? The convention is one hour: after one hour your transaction is, for practical purposes, guaranteed to have reached the majority of the Bitcoin network.

Bitcoin has been criticized for using too much energy for this proof-of-work consensus algorithm. Its followers have built new consensus algorithms such as proof of stake and delegated proof of stake. We can talk about them in a following episode.

Thanks for watching. Comment below what you want to see on my channel. See you next time.
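As a footnote to the double-spend walkthrough above, here is a minimal Python sketch of the two ideas from the talk: guessing a nonce until the hash has enough leading zeros, and letting the next block decide which fork wins. It is a toy under stated assumptions, not how Bitcoin is actually implemented. The names `mine`, `extend`, `is_valid`, `longest_chain`, `DIFFICULTY` and `GENESIS` are all made up for illustration, a "block" here carries a single transaction as plain text, and the hash is a single SHA-256 over a text string. Real Bitcoin hashes a binary block header twice with SHA-256, compares the result against a numeric target, and follows the chain with the most accumulated work rather than simply the most blocks.

```python
import hashlib

DIFFICULTY = 3      # toy criterion (assumption): leading hex zeros required in the hash
GENESIS = "0" * 64  # hypothetical placeholder for "the hash before block 1"

def block_hash(prev_hash, tx, nonce):
    """SHA-256 over the previous hash, the transaction text, and the nonce."""
    return hashlib.sha256(f"{prev_hash}|{tx}|{nonce}".encode()).hexdigest()

def mine(prev_hash, tx):
    """Proof of work: guess nonces until the hash meets the leading-zero criterion."""
    nonce = 0
    while True:
        h = block_hash(prev_hash, tx, nonce)
        if h.startswith("0" * DIFFICULTY):
            return {"prev": prev_hash, "tx": tx, "nonce": nonce, "hash": h}
        nonce += 1

def extend(chain, tx):
    """Return a new chain with one freshly mined block appended to the tip."""
    tip = chain[-1]["hash"] if chain else GENESIS
    return chain + [mine(tip, tx)]

def is_valid(chain):
    """Every block must link to its parent, and its stored hash must be correct and meet the criterion."""
    prev = GENESIS
    for b in chain:
        ok_link = b["prev"] == prev
        ok_work = (block_hash(b["prev"], b["tx"], b["nonce"]) == b["hash"]
                   and b["hash"].startswith("0" * DIFFICULTY))
        if not (ok_link and ok_work):
            return False
        prev = b["hash"]
    return True

def longest_chain(chains):
    """Consensus rule from the talk: follow the longest valid chain."""
    return max((c for c in chains if is_valid(c)), key=len)

# Blocks 1 to 3 are shared by every node.
shared = []
for tx in ["block 1", "block 2", "block 3"]:
    shared = extend(shared, tx)

# Two miners each find block 4, carrying conflicting spends of the same coin.
fork_bob = extend(shared, "Alice -> Bob")
fork_self = extend(shared, "Alice -> Alice (second address)")

# Whoever mines the next block decides; in this run it lands on Alice's own branch.
fork_self = extend(fork_self, "block 5")

winner = longest_chain([fork_bob, fork_self])
print([b["tx"] for b in winner])
```

In this run the branch that pays Alice back to herself wins, so the payment to Bob drops out of the chain. That is exactly why Bob should wait for more confirmations before shipping. If you raise `DIFFICULTY`, each block takes more guesses on average, which is the knob that slows block production down.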