Architecture and Design 101: CAP Theorem

6 min read · Jan 27, 2024

The CAP theorem (a.k.a. Brewer’s theorem) is integral to designing distributed architectures. System design is a vast subject: while designing a system, you choose among various options based on the requirements and note the trade-offs, so that the result is efficient and aligned with the expected outcomes. Understanding the CAP theorem is essential to designing robust and efficient distributed systems. It influences the architecture and decision-making process when designing databases, storage systems, and other distributed applications, and understanding its trade-offs helps you make informed choices based on the specific requirements and goals of your systems.

In this article, we take a deep dive into the CAP theorem.

Simple Use Case to Understand the CAP Theorem

Let us take the simple use case of a content management system. In our use case, we use MongoDB as a distributed database to store the content. When the author publishes the content, users should be able to view it. In our scenario, when the author publishes the content, there is a network failure that results in a network partition between the nodes in the system.

In the above use case, the author's write request and a reader's read request both arrive while the nodes are partitioned, and the system has two options.

It can fail one of the requests, which hurts availability.
It can execute both requests, which breaks the system’s consistency: when the read request goes to a secondary node, it returns a stale value.

The system cannot simultaneously fulfill both read and write requests successfully while guaranteeing that the read operation returns the most recent value written by the write. This limitation arises due to the inability to propagate the results of the write operation from the primary node to the secondary node caused by a network partition.
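The two options above can be sketched with a toy model. This is only an illustration under stated assumptions: the `TinyCluster` class, its `partitioned` flag, and the `prefer_consistency` parameter are hypothetical names invented for this example, and the code does not model how MongoDB actually replicates data.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.value = None


class TinyCluster:
    """Toy primary/secondary pair (hypothetical, not MongoDB's real behavior)."""

    def __init__(self):
        self.primary = Node("primary")
        self.secondary = Node("secondary")
        self.partitioned = False

    def write(self, value):
        # Writes always land on the primary.
        self.primary.value = value
        # Replication only succeeds while the nodes can talk to each other.
        if not self.partitioned:
            self.secondary.value = value

    def read_from_secondary(self, prefer_consistency):
        # Option 1: fail the read rather than risk returning stale data.
        if self.partitioned and prefer_consistency:
            raise RuntimeError("secondary cannot confirm it has the latest write")
        # Option 2: answer anyway, possibly with a stale value.
        return self.secondary.value


cluster = TinyCluster()
cluster.write("draft v1")        # replicated to the secondary
cluster.partitioned = True       # the network failure happens here
cluster.write("published v2")    # reaches the primary only

stale = cluster.read_from_secondary(prefer_consistency=False)
try:
    cluster.read_from_secondary(prefer_consistency=True)
    failed = False
except RuntimeError:
    failed = True

print(stale, failed)
```

With the partition in place, the read either returns the older value or fails; in this sketch there is no third path that returns the latest write.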

Let us understand each component of the CAP theorem.

Consistency

In a distributed system, consistency means that all the nodes in the system should have the same data at the same time. If we perform a read operation on a consistent system, it should return the most recent value of the write operation. When a data update occurs, all subsequent reads should reflect that update. Achieving consistency ensures that all nodes have a synchronized view of the data.

Availability

In a distributed system, availability is the guarantee that every request made to the system receives a response, irrespective of the status of individual nodes; i.e., the system remains operational even when a couple of nodes are down. Unlike in a consistent system, there is no guarantee that the response reflects the most recent write operation.

Partition Tolerance

A partition is a communication failure or break between nodes within a distributed system: if a node cannot receive any messages from another node in the system, there is a partition between the two nodes. A partition can be caused by a network failure, a server crash, or any other reason.

Partition tolerance deals with the system’s ability to continue functioning and providing services even when network partitions occur.

If a system is partition-tolerant, it does not fail, regardless of whether messages are dropped or delayed between nodes within the system. To achieve partition tolerance, the system must replicate records across combinations of nodes and networks. A distributed system needs partition tolerance to overcome malfunctioning nodes and continue read/write operations.
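One way to picture a node that keeps working while messages to its peer are being dropped is an outbox that holds undelivered updates until the link returns. The sketch below is a simplified assumption, not a description of any particular database: `ReplicatingNode`, `outbox`, `peer_data`, and `link_up` are hypothetical names, and the remote replica is represented by a plain dictionary.

```python
from collections import deque


class ReplicatingNode:
    """Hypothetical node that buffers replication messages during a partition."""

    def __init__(self):
        self.data = {}
        self.outbox = deque()   # updates not yet delivered to the peer
        self.peer_data = {}     # stands in for the remote replica
        self.link_up = True

    def put(self, key, value):
        # The local write succeeds whether or not the peer is reachable.
        self.data[key] = value
        self.outbox.append((key, value))
        self.flush()

    def flush(self):
        # Deliver queued updates in order, but only while the link is up.
        while self.link_up and self.outbox:
            key, value = self.outbox.popleft()
            self.peer_data[key] = value


node = ReplicatingNode()
node.put("a", 1)            # delivered immediately
node.link_up = False        # partition begins
node.put("b", 2)            # accepted locally, queued for the peer
pending = len(node.outbox)
node.link_up = True         # partition heals
node.flush()

print(pending, node.peer_data)
```

The node never stops accepting writes in this sketch; the cost is that the peer lags behind until the queued updates are flushed.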

The CAP theorem states that a distributed database system has to make a tradeoff between consistency and availability when a partition occurs.

There are three possible combinations of properties that CAP allows you to work with:

CA System (Consistency and Availability)
CP System (Consistency and Partition Tolerance)
AP System (Availability and Partition Tolerance)

CA System (Consistency and Availability)

A system that prioritizes consistency and availability sacrifices Partition Tolerance. Such systems ensure that all nodes have a consistent view of the data and remain available, but they might not be able to function properly during network partitions.

CP System (Consistency and Partition Tolerance)

A system that prioritizes consistency and partition tolerance sacrifices availability. This means that if a partition or node failure occurs, the affected nodes (a.k.a. inconsistent nodes) are taken out of service.

This is done to maintain the consistency of written data across all working nodes. Data is generally replicated from the primary node to the secondary nodes, so that a secondary can step in if the primary fails. However, since availability is not prioritized, write operations are restricted until the primary node is restored.
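A common way to get CP-style behavior is to accept a write only when a majority of replicas can acknowledge it. The article does not prescribe this mechanism; the majority rule here is an assumption chosen for illustration, and `QuorumStore` with its `reachable` flags is a hypothetical, in-memory stand-in for real replicas.

```python
class QuorumStore:
    """Hypothetical store that rejects writes without a reachable majority."""

    def __init__(self, replica_count):
        self.replicas = [None] * replica_count
        self.reachable = [True] * replica_count

    def write(self, value):
        live = [i for i, ok in enumerate(self.reachable) if ok]
        # Refuse the write unless strictly more than half the replicas respond.
        if len(live) <= len(self.replicas) // 2:
            raise RuntimeError("no majority reachable; write rejected")
        for i in live:
            self.replicas[i] = value
        return len(live)  # number of acknowledgements


store = QuorumStore(3)
store.reachable[2] = False          # one replica is cut off
acks = store.write("v1")            # still a majority: accepted

store.reachable[1] = False          # a second replica is cut off
try:
    store.write("v2")
    rejected = False
except RuntimeError:
    rejected = True

print(acks, rejected, store.replicas)
```

Rejecting the second write is the availability cost: the store refuses to answer rather than let a minority of replicas diverge from the rest.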

AP System (Availability and Partition Tolerance)

A system that prioritizes Availability and Partition Tolerance sacrifices Consistency. These systems prioritize responsiveness and can continue operating during network partitions, but there might be variations in the data seen by different nodes.
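The "variations in the data seen by different nodes" can be illustrated with two replicas that both accept writes during a partition and reconcile afterwards. The reconciliation rule used here, last-write-wins on a logical version number, is an assumption picked for simplicity and is only one of several possible strategies; `Replica` and `reconcile` are hypothetical names.

```python
class Replica:
    """Hypothetical replica that always accepts writes."""

    def __init__(self):
        self.value = None
        self.version = 0  # logical timestamp of the stored value

    def write(self, value, version):
        self.value, self.version = value, version


def reconcile(a, b):
    # Last-write-wins: the value with the higher version overwrites the other.
    winner = a if a.version >= b.version else b
    for replica in (a, b):
        replica.value, replica.version = winner.value, winner.version


left, right = Replica(), Replica()

# During the partition, each side accepts a different write.
left.write("title A", version=1)
right.write("title B", version=2)
diverged = left.value != right.value

# After the partition heals, the replicas converge on one value.
reconcile(left, right)

print(diverged, left.value, right.value)
```

Both writes succeed, so the system stays available, but readers on different sides of the partition see different titles until reconciliation runs, and the losing write is discarded.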

Although the CAP theorem is often discussed in the context of database systems, its principles extend beyond database selection. The theorem addresses fundamental challenges and trade-offs in the design of distributed systems, and these considerations are relevant in various aspects of distributed computing. Here are some areas where the CAP theorem’s principles and trade-offs are applicable:

Database Systems: The most common context where the CAP theorem is discussed is in the selection and design of distributed databases. Different database systems prioritize different combinations of Consistency, Availability, and Partition Tolerance based on their use cases and goals.
Distributed Systems Architecture: The CAP theorem influences the overall architecture of distributed systems. Designing a highly available and fault-tolerant system may involve sacrifices in terms of consistency under certain conditions.
Microservices and Service-Oriented Architectures: In microservices or service-oriented architectures, where multiple services communicate with each other over a network, the principles of the CAP theorem come into play. Designing communication protocols and ensuring the overall system’s resilience requires consideration of the trade-offs presented by CAP.
Message Brokers and Event-Driven Architectures: Systems relying on message brokers or event-driven architectures often face challenges related to consistency, availability, and partition tolerance. The design of these systems may be influenced by the need to balance these trade-offs.
Cloud Computing: In cloud computing environments, where applications and services are deployed across multiple servers and data centers, the CAP theorem is relevant. Choosing the right cloud services, designing for fault tolerance, and ensuring resilience against network partitions all involve considerations related to CAP.
Blockchain and Distributed Ledgers: In distributed ledger technologies like blockchain, where decentralization is a key principle, the CAP theorem influences design decisions. Achieving consensus among nodes while tolerating potential network partitions is a complex challenge.
Real-Time Systems: Real-time systems that require low-latency responses and continuous availability must carefully balance consistency against availability when partitions occur. Systems that prioritize real-time responsiveness may sacrifice strong consistency in certain scenarios.
Internet of Things (IoT): In IoT environments, where numerous devices communicate with each other over a network, the principles of CAP become important. Ensuring data integrity, availability, and resilience against network failures are crucial considerations.

In summary, while the CAP theorem originated from discussions about distributed databases, its principles have broader applications in the design and implementation of distributed systems. Understanding the trade-offs between consistency, availability, and partition tolerance is essential in various distributed computing scenarios.

That’s all for today!

Thank you for taking the time to read this article. I hope you have enjoyed it. If you enjoyed it and would like to stay updated on various technology topics, please consider following and subscribing for more insightful content.


Written by Anji…
