RapidChain's Synchronized Byzantine Consensus


#1

We have been studying the consensus mechanism in RapidChain and wanted to start the conversation around it.

Consensus algorithms in blockchain need to have the properties of:

  1. Consistency: all honest nodes accept the same non-conflicting transactions.
  2. Liveness: transactions will be confirmed eventually by the network.

The three broad types of consensus in terms of network assumption are:

  1. Synchronous: messages are delivered within known time bound T. It tolerates at most ½ malicious nodes.
  2. Partially synchronous: messages are delivered within an unknown time bound T’. It tolerates at most ⅓ malicious nodes.
  3. Asynchronous: messages will be delivered but there is no time bound. It tolerates at most ⅓ malicious nodes.

Rapidchain uses practical synchronous byzantine consensus based on Ren et al. 2017 paper [link]. It assumes a message will always be delivered within a known time bound T. The consensus can tolerate up to 1/2 nodes being malicious in the network

The consensus mechanism follows these four rounds:

  1. Leader gossip <H-i, “propose”> to other nodes, where H-i is the block header of i-th block.
  2. Each node gossip <H-i, “echo”> to other nodes (to ensure all versions of the header are seen by all nodes, so equivocating leader can be detected)
  3. If more than one versions of Header is received, the node gossips <H-i’, “pending”>, where H-i’ contains a empty block merkle root.
  4. If >½ echos on a unique header is received, the node gossips <Hi, “accept”, all_echos>, where all_echos are the received echos.

The synchrony assumption of the network is fairly strong for a blockchain network. It’s hard to guarantee message delivery within a time bound. One solution is to make the time bound fairly large. For example, Bitcoin’s consensus inherently assumes the block data can be delivered to all nodes in the network within 10 minutes, thus the 10 minutes block interval in Bitcoin’s protocol. Given that, a synchronous consensus protocol may not be faster than PBFT (partially synchronous) in real-world deployment because the time bound for message delivery has to be relatively large.

Besides, RapidChain’s synchronous consensus requires so-called “locked step execution”, which means all nodes need to enter new rounds of the consensus roughly at the same time. This requirement demands a clock synchronization mechanism to synchronize the clock of each node in the consensus, which again is a hard problem to solve by itself. Overall, we believe the synchronous consensus protocol used in RapidChain is not quite practical in a real-world blockchain system.

We’d love to hear your thoughts!

Further reading: https://docs.google.com/presentation/u/1/d/1S3hG6RLJFwaVik5pv7pCtTVf11QrIQKBbP2A1nYEQTM/edit


#2

I think one of the key take away I had reading this part of Rapidchain’s paper is that a new leader is elected for each round. One of the benefits is that this avoid the complex recovery path of traditional PBFT when the leader is malicious.

I know that Algorand also has a consensus where the leader changes for each round. I would be curious to understand the difference between this consensus and the one from Algorand.