Fault-tolerant consensus algorithms, represented by Paxos and Raft, have been extensively studied and implemented.
In recent years, Raft has been implemented in etcd as the control plane for Kubernetes. Similarly, consensus mechanisms such as Proof of Work (PoW) and Proof of Stake (PoS) have gained attention in the blockchain space. Many may have at least heard these terms. In distributed computing, consensus has been a long-standing area of research. Some may recall learning about the Byzantine Generals Problem in computer science courses.
For those who learned through practical operations, setting up an Elasticsearch cluster to avoid the Split Brain Problem or configuring an Amazon DynamoDB Accelerator (DAX) cluster with at least three nodes may sound familiar.
Application developers may not frequently interact with databases implementing consensus algorithms like ZooKeeper or etcd. However, ZooKeeper powers systems like HBase, Hadoop YARN, OpenStack Nova, and Kafka. Even if indirectly, many developers use these tools.
Consensus algorithms are notoriously difficult to implement. Paxos, in particular, is considered especially challenging to understand or prove. However, by using analogical thinking, I can explore these concepts in a more accessible way.
Have you ever played the board game "Are You a Werewolf?"?
I played Werewolf a few times, especially during my university days when I taught programming to middle and high school students. It was quite popular among young people at the time.
For detailed rules, please look them up, but in summary, it is a psychological battle between the Villager team and the Werewolf team. According to Wikipedia, the game originated in the United States as a board game:
"Are You a Werewolf?" is a party game focused on conversation and deduction, released in 2001 by the American game manufacturer Looney Labs. The game's rules are based on the 1986 game "Mafia" created by Dimitry Davidoff. ... Wikipedia "Are You a Werewolf?"
Although it is said to be playable with 10 or more players, it can be played with as few as three. The number three should be a familiar figure to those who have studied consensus algorithms.
Werewolves know who the other werewolves are, while villagers do not know who their allies are. This creates an information asymmetry. The Werewolf team exploits this gap to create confusion among the villagers.
The game progresses in day-night cycles. During the day, werewolves pretend to be villagers. Discussions take place to deduce who the werewolves might be. Of course, werewolves lie about their identity. Since werewolves know each other, they can manipulate the discussion to frame an innocent villager.
In the evening phase, a vote determines who to "execute" (yes, an "election"). The person receiving the most votes is eliminated from the game.
At night, the werewolves reveal their true nature and attack a villager. The attacked villager is removed from the game (just like a node being excluded from a cluster due to network failure).
This cycle repeats until the game reaches an endpoint. The Villager team wins if they execute all werewolves. The Werewolf team wins if they gain a majority (consider why this is the winning condition after reading this article).
Let's explore the idea that the Werewolf game can serve as an analogy for consensus algorithms. While not a perfect analogy, it helps capture fundamental concepts.
The Werewolf team can be seen as "faulty nodes." The execution vote in the evening phase resembles a leader election process. In Raft terms, every villager is a Candidate, sending out RequestVote RPCs to elect a leader. Each cycle eliminates a leader, requiring the remaining players to hold another election.
Werewolf requires at least three players. Why?
Consider etcd's documentation on recommended cluster sizes. If a cluster has only two nodes, a single failure prevents reaching a majority, making leader election impossible. Thus, at least three nodes are needed for fault tolerance. Similarly, the Werewolf game requires at least three players to function.
Cluster Size | Majority | Failure Tolerance |
---|---|---|
1 | 1 | 0 |
2 | 2 | 0 |
3 | 2 | 1 |
4 | 3 | 1 |
5 | 3 | 2 |
6 | 4 | 2 |
7 | 4 | 3 |
8 | 5 | 3 |
9 | 5 | 4 |
For consensus algorithms, a five-node cluster is often recommended because it tolerates up to two failures. Likewise, Werewolf is more engaging with at least five players.
In the game of Werewolf, the victory condition for the werewolves is defined as "when the number of surviving werewolves equals or exceeds the number of surviving villagers." In other words, they need a majority. This concept becomes clearer when considered alongside the earlier diagram.
The werewolves know who their fellow werewolves are. This means that once they secure a majority vote, they can unilaterally dictate which villagers to eliminate. At this point, the villagers can no longer achieve victory. The balance of the village has been broken, and it is no longer possible to protect it from the werewolves.
This phenomenon is also well-known in the blockchain space as a majority attack or 51% attack.
Similarly, in distributed computing, if more than half of the nodes fail at once, consensus formation ceases to function, and the system loses its fault tolerance.
It's often said that when setting up a cluster in a database that implements a consensus algorithm, "a cluster of five nodes is a good choice." But why is that? As seen in the earlier table, when Cluster Size == 5, the Failure Tolerance == 2. The reasoning behind this recommendation is based on the assumption that the likelihood of more than two nodes failing simultaneously is quite low.
In a way, this is similar to how, although Werewolf can technically be played with just three people, the game becomes significantly more interesting with at least five players.
In the Byzantine Generals Problem, the key challenge is that no one knows who the traitor is. Because of this, any messages sent—whether ordering an attack or a retreat—may contain deliberate misinformation.
Interestingly, Werewolf has an optional rule set that introduces additional roles such as "Seer," "Medium," and even a "Traitor." While you can look up the details of these roles separately, the Traitor in Werewolf corresponds directly to the traitor general in the Byzantine Generals Problem. This player is technically counted as part of the villager faction but secretly collaborates with the werewolves, acting in their interest. A truly insidious role.
During the daytime phase, everyone appears to be a villager. In the context of the Byzantine Generals Problem, everyone appears to be a general of the Byzantine Empire. However, hidden among them is a traitor. Because of this uncertainty, discussions among the villagers (or generals) become highly chaotic and convoluted.
Human conflicts are inherently complex. In contrast, maintaining trust in a cluster of multiple nodes might actually be a much simpler task. Or not.
Through this thought experiment, I explored consensus algorithms using the Werewolf game analogy. If you're familiar with Werewolf, perhaps Paxos and Raft now seem more intuitive—(or do they?).