Instant Finality in Byzantine Generals With Unknown and Dynamic Participation

August 28, 2022 8 min read

Author: Chainlink Labs Research

Authors: Dahlia Malkhi (Chainlink Labs), Atsuki Momose (UIUC), Ling Ren (UIUC)

Synopsis

We present a remarkably simple solution for the Byzantine Generals problem with deterministic and unconditional Safety guarantees in a setting that has Unknown and Dynamic Participation, but without the energy consumption of proof-of-work or the long latency of longest chain protocols. With this, we remove the biggest obstacles of Nakamoto Consensus, which are probabilistic finality, high latency, and energy consumption of proof-of-work, while retaining key tenets of the permissionless model.

On first encounter, the above appears daunting: How can participants reach agreement when they come and go and there isn’t even a known count of active participants?

Removing proof-of-work, the “Sleepy” model introduced by Pass and Shi borrows two core pillars of the permissionless setting. One is continuity, requiring Synchronous message propagation among currently active participants, in order to enable transfer of knowledge in face of churn. The second is that at any moment in time, there is an Unknown and Dynamic set of active (and synchronously connected) parties with an honest majority. Under both assumptions (described more precisely below), Sleepy Consensus solves Byzantine agreement, but inherits the probabilistic Safety and high latency from longest-chain protocols.

Under a similar Unknown and Dynamic Participation model (with a more practical assumption on synchrony, namely without infinite buffering), a recent result by Atsuki and Ren broke new grounds by demonstrating a solution for Byzantine agreement with deterministic Safety guarantees and expected constant latency. However, liveness is only guaranteed when participation stops being dynamic for long enough periods. Moreover, even during such periods, the constant factor in the latency is somewhat large.

By requiring a slightly higher threshold of honest participation – two-thirds instead of an honest majority – we demonstrate a remarkably simple solution for Byzantine agreement in the Unknown and Dynamic Participation model. It has deterministic and unconditional Safety guarantees and a (small) constant expected latency, overcoming the above-mentioned drawbacks. Moreover, did we mention it is remarkably simple? As evidenced below, the algorithm doesn’t look much harder or different than an asynchronous agreement protocol in the classical setting with a static and known set of participants!

Here, we present our paradigm solving a one-time binary Byzantine agreement. In future posts, we will (i) extend to multi-valued Consensus, (ii) extend to a chain of Consensus decisions (i.e., the atomic broadcast problem), and (iii) harness a leader-based paradigm to improve average case round complexity.

Unknown and Dynamic Participation Model

For ease of exposition, we will think of time as divided into discrete “rounds”. In reality, the model can be defined with and our solution can be extended to continuous time.

In order to provide continuity of knowledge in the protocol, permissionless protocols actually make a strong synchrony assumption: there is a known upper bound on channel propagation delays among every pair of honest participants (called ‘nodes’), encapsulated in our round model.

Synchronous Communication. If an honest node p sends a message in round r, then every honest node q that is active in round r+1 receives the message.

Permissionless consensus protocols commonly assume an honest majority among active participants. Here, we require a higher threshold of honest participants. This turns out to simplify the construction considerably.

Honest participation. In each round, there is an unknown number of active participants, more than two-thirds of which are honest.

The participation assumption merits additional clarification. In a dynamic setting, participants churn between rounds, including faulties. We must assume that if a faulty node goes down, its messages do not appear in future rounds. This is a much weaker assumption than prior works, which assume all faulty parties are active throughout the entire execution.

Unknown and Dynamic Participation model in a nutshell. To summarize, we have the following model. The set of active nodes at each moment is unknown, their count is unknown, and at each round, they may be replaced entirely, subject to the following assumptions:

PKI: Participating nodes are taken from a finite universe, each can be identified by a public key for which it owns the private key.
- Note, PKI is needed only for VRF and liveness; messages do not need to be signed by their senders
Active nodes: Each round r has an unknown set of active nodes, whose (unknown) count n_r satisfies n_r ≥ 3f_r+ 1.
Synchronous communication: In round r, honest and active nodes receive all the messages broadcast by honest nodes in round r-1.

The Byzantine Agreement Problem

This problem has been described many times. Briefly, in the binary agreement variant, a group of participants initially hold an input value ∈ {0,1} (a bit) and come to agreement on an output value satisfying:

Safety: The output of two honest parties is the same
Validity: If all honest parties start with the same input value, the output is that value
Liveness: Eventually all honest parties output a value

Binary Agreement in the Unknown and Dynamic Participation Model

We introduce a solution to a one-time binary agreement that guarantees Safety and an expected (small) constant number of rounds to termination.

We first provide insight into the key mechanism of transferability we work with.

Unknown and Dynamic Quorum property (UDQ). Each node receives in round (r+1) a set of messages sent during round r. We observe the following key facts:

Denote the unknown number of messages received by an honest node in round (r+1) by R. We have R ≤ n_r
Due to synchrony, among the R messages, at least ⅔ n_r≥ ⅔ Rmessages are received from honest round-r nodes.
An unknown number t_r≤ ⅓ R messages out of R are from faulty round-r nodes.

Assume that protocol messages carry a single value each. The following guarantees hold:

UDQ-unique: In a round, if a node p receives a super-majority of messages carrying a value b, and another node q receives a super-majority carrying a value b’, then b = b’. Note that UDQ-unique provides similar guarantees to quorum-intersection in a static setting, but without the ability to transfer a set of signed messages as a certificate of receiving a super-majority.

UDQ-unique holds because if an honest node p receives in round-(r+1) a super-majority of round-r messages carrying a value b, then a majority of honest round-r messages carry b. This holds because:
⅔ R – t_r = ⅔ (n_r – f_r + t_r) – t_r = ½ (n_r – f_r) + ⅙ (n_r – f_r – 2t_r) > ½ (n_r – f_r)

UDQ-valid: If an honest node receives in round-(r+1) more than one-third of round-r messages carrying b, then some honest node has sent a message carrying b. This holds by a similar reasoning to the above:
⅓ (R – t_r) = ⅓ (n_r – f_r + t_r) – t_r = ⅓ (n_r – f_r – 2t_r) > 0

UDQ-graded-agreement: If an honest node p receives in round-(r+1) a super-majority of round-r messages containing a value b, then more than a third of round-r messages received by another honest node in round (r+1) contain b.

Given the UDQ, it is easy to construct in two broadcast steps a single-shot, binary agreement protocol. It works as follows.

The protocol starts with a round where nodes advertise their inputs, and then proceeds to alternate between collection-GA rounds and decision-GA rounds. In a round, active nodes listen to incoming messages (which were broadcast in the previous round), process them, and broadcast a message.

The protocol by node p

Initialization-round 0:

A node broadcasts to all nodes (0, collect, input)

Collection-GA r+1:

Let S-collect be the set of received messages (r, collect,*)
A node broadcasts to all nodes:
(r+1, propose, b) if more than two-thirds of S-collect are (r, collect, b)
(r+1, propose, “empty”) otherwise
Simultaneously, a node broadcasts to all nodes:
(r+1, VRF, c) where c is a random bit

Decision-GA r+2:

Let S-propose be the set of received messages (r+1, propose,*)
A node p processes S-propose as follows:
Decide b: If more than two-thirds of S-propose are (r+1, propose, b)
Adopt vp ← b: If more than one-third of S-propose are (r+1, propose, b)
Adopt vp ← value v in (r+1, VRF, v) with the highest VRF: otherwise
A node p broadcasts to all nodes: (r+2, collect, vp)

In collection-GA rounds, nodes collect messages containing other nodes’ adopted values. At the end of a collection-GA round, if a node receives more than two-thirds (collect, b) for some value b, then it broadcasts (propose, b); otherwise, it broadcasts (propose, “empty”). Simultaneously, a node broadcasts (VRF, c) where VRF is a unique pseudo random value and is verifiably using the node’s public key and the round number, and c is a local coin toss. Note, at the very first round of the execution, nodes send collect-messages with their own input.

In decision-GA rounds, nodes collect (propose, *). Every node that receives more than two-thirds (propose, b) messages decides b. Every node that receives more than one-third (propose, b) adopts b as its current value. Other nodes adopt the random value b’ in a message (VRF, b’) whose (verified) VRF value is the highest among all VRFs they receive. At the end of a decision-GA round, every node p broadcast (collect, vp) carrying its adopted value.

Correctness (Sketch)

Claim-1 [Unique-adopt]. In decision-GAs, at most one value b may be forced to be adopted. In order to be adopted, a value has to be received by more than one-third of a preceding collection-GA’s messages. By UDQ-valid, it is sent by one honest sender. However, by UDQ-unique, at most one value b may be broadcast by honest nodes in a collection-GA .

Claim-2 [Safety]. If an honest node decides b in a decision-GA , then by UDQ-GA, every other honest node receives more than one-third of a preceding collection-GA ’s messages carrying (propose, b). By Unique-adopt, only b may be forced to be adopted. Hence, every honest party adopts b in the decision-GA . It is easy to see that an iteration of collection-GA and decision-GA starting with all honest nodes adopting b ends with all honest nodes deciding b.

Claim-3 [Termination]. Once again, termination is easy if all honest nodes start with b. If no decision is made, by UDQ-adopt, at most one value b may be forced to be adopted. With probability ½, b will be a winning VRF value and adopted by everyone.

Solutions with Unknown and Dynamic Participation

	Authentication	Safety guarantee	Latency	Threshold honest
Nakamoto Consensus 2008	proof-of-work	probabilistic	high	majority
Sleepy Consensus 2016	PKI	probabilistic	high	majority under permanent failures
Atsuki&Ren 2022	PKI	deterministic	(somewhat large) constant	majority under permanent failures
This work 2022	authenticated channels for safety and PKI for liveness	deterministic and unconditional	(small) constant	two-thirds

Read more about this work in our recently published whitepaper: https://eprint.iacr.org/2022/1448.

Acknowledgements

We benefited from discussions about this topic with Lorenzo Alvisi, Ittay Eyal, Jacob Leshno, Kartik Nayak, Youer Pu.

Instant Finality in Byzantine Generals With Unknown and Dynamic Participation

Synopsis

Unknown and Dynamic Participation Model

The Byzantine Agreement Problem

Binary Agreement in the Unknown and Dynamic Participation Model

The protocol by node p

Correctness (Sketch)

Solutions with Unknown and Dynamic Participation

Acknowledgements

Suggested Reading

Developers

Products

Use Cases

Community

Resources

Chainlink

Contact

Social

Instant Finality in Byzantine Generals With Unknown and Dynamic Participation

Synopsis

Unknown and Dynamic Participation Model

The Byzantine Agreement Problem

Binary Agreement in the Unknown and Dynamic Participation Model

The protocol by node p

Correctness (Sketch)

Solutions with Unknown and Dynamic Participation

Acknowledgements

Suggested Reading

You might also be interested in

Strengthening U.S. Federal Agencies With Blockchain Technology: 10+ Real-World Use Cases

Empirical Evidence in AI Oracle Development

Introducing Chainlink Runtime Environment (CRE): A Major Upgrade to the Chainlink Platform

Stay updated on the latest Chainlink news