TIL #2: CAP Theorem

I recently did some research to find and subscribe to various web development and software engineering podcasts. Here’s that list:

To be fair, I already knew about the last two, Security Now and FLOSS Weekly, because I regularly watch the TWiT netcast network. If you aren’t aware of TWiT, I highly recommend you browse all of their netcasts and watch or listen to those that intrigue you: https://twit.tv/.

Distributed Systems

One of my goals for 2017 is to increase my understanding of distributed systems. I want to become more familiar with distributed computing paradigms and principles such as SOA, message queues, clustering, CAP Theorem, shared-nothing, and parallelism vs concurrency.

I had never heard of the CAP Theorem, so when I spotted a talk with Eric Brewer, who introduced the CAP Theorem, in the archives of Software Engineering Radio, I knew I had to listen to that episode[1].

What is CAP Theorem?

The CAP Theorem states that it is impossible for a distributed system to simultaneously provide all three of the following guarantees[2][3]:

Consistency (C)
All nodes see the same data at the same time.
Every read receives the most recent write.

Availability (A)
Every request receives a non-error response, without the guarantee that it contains the most recent write.

Partition tolerance (P)
The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.

Effectively, a distributed system can only guarantee 2 of these 3. So you could have a CP system, an AP system, or an AC system.

A CP system, which prioritizes consistency and partition tolerance, will typically have a master-slave setup, where one particular node is elected leader. The leader handles all writes, and the other nodes handle reads.
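The CP trade-off can be sketched in a few lines of Python. This is a toy model with hypothetical names (`Leader`, `Follower`), not any real database's API: the leader synchronously replicates every write, and a partitioned follower refuses to answer reads rather than risk serving stale data, which is exactly the availability sacrifice a CP system makes.

```python
# Toy CP sketch (hypothetical classes): one leader takes all writes and
# replicates them synchronously; followers serve reads, but refuse to
# answer while partitioned from the leader, trading availability for
# consistency.

class Leader:
    def __init__(self):
        self.data = {}
        self.followers = []

    def write(self, key, value):
        self.data[key] = value
        # Replicate to every reachable follower before acknowledging.
        for f in self.followers:
            f.replicate(key, value)

class Follower:
    def __init__(self, leader):
        self.data = {}
        self.partitioned = False
        leader.followers.append(self)

    def replicate(self, key, value):
        if not self.partitioned:
            self.data[key] = value

    def read(self, key):
        # A CP node would rather return an error than stale data.
        if self.partitioned:
            raise RuntimeError("partitioned from leader; refusing to serve")
        return self.data[key]

leader = Leader()
follower = Follower(leader)
leader.write("x", 1)
print(follower.read("x"))   # 1
follower.partitioned = True
leader.write("x", 2)        # replication to the follower is lost
# follower.read("x") now raises instead of returning the stale value 1
```

During the partition, the follower becomes unavailable for reads; every answer it does give is guaranteed current.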

An AP system, which prioritizes availability and partition tolerance, will typically have a master-master setup, where every node can accept both writes and reads. In this system, nodes copy data between themselves so that if any single node goes down, others can serve in its place. This setup sacrifices absolute consistency because responses could contain old data while the new data propagates through the system. However, it does maintain eventual consistency[4].
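The AP behavior can be sketched the same way. Again this is a toy model with hypothetical names (`Node`, `sync`), not a real replication protocol: every node answers reads and writes at all times, replication to a partitioned peer is queued, and the nodes converge once the queued updates are delivered, illustrating eventual consistency.

```python
# Toy AP sketch (hypothetical class): every node accepts both reads and
# writes; replication is asynchronous, so a read during a partition may
# return stale data, but nodes converge once updates flow again.

class Node:
    def __init__(self):
        self.data = {}
        self.peers = []
        self.pending = []          # updates queued while a peer is unreachable
        self.partitioned = False

    def write(self, key, value):
        self.data[key] = value
        for peer in self.peers:
            if self.partitioned or peer.partitioned:
                self.pending.append((peer, key, value))  # replicate later
            else:
                peer.data[key] = value

    def read(self, key, default=None):
        # Always answers, even if the value is stale.
        return self.data.get(key, default)

    def sync(self):
        # Anti-entropy: push queued updates once the partition heals.
        for peer, key, value in self.pending:
            peer.data[key] = value
        self.pending.clear()

a, b = Node(), Node()
a.peers, b.peers = [b], [a]
a.write("x", 1)
b.partitioned = True
a.write("x", 2)
print(b.read("x"))   # 1 -- stale, but the node still answers
b.partitioned = False
a.sync()
print(b.read("x"))   # 2 -- the nodes have converged
```

Note that `b` never returns an error; it simply answers with whatever it has, which is the availability-over-consistency choice an AP system makes.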

AC systems are possible, but much rarer, because it’s difficult to forfeit P. Because of this, you will typically only see references to the other two kinds of systems. AC systems are feasible in local-only environments, or in situations where the probability of a partition is far lower than that of other system failures[5].

Final Thoughts

It seems the CAP Theorem boils down to a single decision: do you value consistency over availability, or availability over consistency? This is a question that must be answered on a case-by-case basis.

However, generally speaking, it is my opinion that in today’s society availability should be prioritized over consistency. By that I mean, it is more important that you provide a response in a timely manner than that you provide the most up-to-date response. Of course, there are exceptions to every rule.

This really brings the whole NoSQL movement to mind, which seems to follow this mindset of preferring availability above all else. Unfortunately, my experience with NoSQL is somewhat limited, so I suppose that will be a good topic for another day.

[1] R. Blumen, Host, “Episode 227: Eric Brewer: The CAP Theorem, Then and Now,” Software Engineering Radio, May 27, 2015.
[2] E. Brewer, “CAP Twelve Years Later: How the ‘Rules’ Have Changed,” InfoQ, May 30, 2012.