Overview of Consistency Ranges in Database Techniques
An summary of well-known consistency ranges
The notion of a consistency degree originated by researchers of shared-memory multi-processor programs. The aim of the early work was to cause about how and when totally different threads of execution, which can be operating concurrently and accessing overlapping units of knowledge in shared reminiscence, ought to see the writes carried out by one another. As such, a lot of the preliminary work targeted on reads and writes of particular person knowledge gadgets, reasonably than on the degree of teams of reads and writes inside a transaction. We are going to start our dialogue utilizing the identical terminology as the unique analysis and fashions, after which proceed to debate the right way to apply these concepts to transactions.
In sequential consistency, all writes — regardless of which thread made the write, and regardless of which knowledge merchandise was written to — are globally ordered. Each single thread of execution should see the writes occurring on this order. For instance, if one thread noticed knowledge X being up to date to five, after which later noticed Y being up to date to 10, each single thread should see the replace of X taking place earlier than the replace of Y. If any thread sees the brand new worth of Y however the previous worth of X, sequential consistency can be violated. This instance is proven in Determine 1. On this determine, time will get later as you progress to the appropriate within the determine, and there are 4 threads of execution: P1, P2, P3, and P4. Each thread (that reads X and Y) sees the replace of X from 0 to five taking place earlier than the replace of Y from 0 to 10. Threads: P1 and P2 write X and Y respectively, however don’t learn both one. Thread P3 sees the brand new worth of X and subsequently sees the previous worth of Y. That is solely doable if the replace to X occurred earlier than the replace to Y. Thread P4 solely sees the brand new values of X and Y, so it doesn’t see which one occurred first. Thus, all threads agree that it’s doable that the replace of X occurred earlier than the replace to Y. Distinction this to Determine 2, beneath, by which P3 and P4 see clearly totally different orders of the updates to X and Y — P3 sees the brand new worth of X (5) and subsequently the previous worth of Y (0), whereas P4 sees the brand new worth of Y (10) and subsequently the previous worth of X (0). Determine 2 is thus not a sequentially constant schedule.
Usually, sequential consistency doesn’t place any necessities on the right way to order the writes. In our instance, the write to X occurred in actual time earlier than the write to Y. Nonetheless, so long as each thread agrees to see the write to Y as taking place earlier than the write to X, sequential consistency permits the official historical past to be totally different that what occurred in line with actual time (the one restriction is that writes and reads originating from the identical thread of execution can’t be reordered). See Determine 3 for an instance of this.
In distinction to sequential consistency, strict consistency does place actual time necessities on the right way to order the writes. It assumes that it’s at all times doable to know what time it presently is with zero error — i.e. that each thread of execution agree on the exact present time. The order of the writes within the sequential order should be equal to the actual time that these writes had been issued. Moreover, each learn operation should learn the worth of the latest write in actual time — regardless of which thread of execution initiated that write. In a distributed system (and even multi-processor single-server programs), it’s not possible in observe to have world settlement on exact present time, which renders strict consistency to be largely of theoretical curiosity.
None of Figures 1, 2, or 3 above fulfill strict consistency as a result of all of them comprise both a learn of x=0 or a learn of y=0 after the worth of x or y has been written to a brand new worth. Nevertheless, Determine 4 beneath satisfies strict consistency since all reads replicate the latest write in actual time:
In a distributed/replicated system, the place writes and reads can originate wherever, the very best degree of consistency obtained in observe is linearizability (often known as “atomic consistency” which is what it’s known as within the CAP theorem). Linearizability is similar to strict consistency: each are extensions of sequential consistency that impose actual time constraints on writes. The distinction is that the linearizability mannequin acknowledges that there’s a time frame that happens between when an operation is submitted to the system, and when the system responds with an acknowledgement that it was accomplished. In a distributed system, the sending of the write request to the right location(s) — which can embrace replication — can happen throughout this time interval. A linearizability assure doesn’t place any ordering constraints on operations that happen with overlapping begin and finish occasions. The one ordering constraint is for operations that don’t overlap in time — solely in these instances does the sooner write should be seen earlier than the later write.
Determine 5 above exhibits an instance of a schedule that’s linearizable, however not strictly constant. It isn’t strictly consistency for the reason that learn of X by P3 is initiated (and returns) barely after the write of X by P1, however nonetheless sees the previous worth. Nonetheless, it’s linearizable as a result of this learn of X by P3 and write of X by P1 overlap in time, and subsequently linearizability doesn’t require the learn of X by P3 to see the results of the write of X by P1.
Whereas linearizability and strict consistency are stronger than sequential consistency, sequential consistency is by itself a really excessive degree of consistency, and there exist many consistency ranges beneath it.
Causal consistency is a well-liked and helpful consistency degree that’s barely beneath sequential consistency. In sequential consistency, all writes should be globally ordered — even when they’re completely unrelated to one another. Causal consistency doesn’t implement orderings of unrelated writes. Nevertheless, if a thread of execution performs a learn of some knowledge merchandise (name it X) after which writes that knowledge merchandise or a unique one (name it Y), it assumes that the next write could have been brought on by the learn. Subsequently, it enforces the order of X and Y — particularly all threads of execution should observe the write of Y after the write of X.
For instance, examine Determine 6 (above) with Determine 2. In Determine 2, P3 noticed the write to X taking place earlier than the write to Y, however P4 noticed the write to Y taking place earlier than the write to X. This violates sequential consistency, however not causal consistency. Nevertheless, in Determine 6, P2 learn the write to X earlier than performing the write to Y. That locations a causal constraint between the write to X and Y — Y should occur after X. Subsequently, when P4 sees the write to Y with out the write to X, causal consistency is violated.
Eventual consistency is even weaker — even causally dependent writes could change into seen out of order. For instance, regardless of violating each different consistency assure that we’ve got mentioned to this point, Determine 6 doesn’t essentially violate eventual consistency. The one assure in eventual consistency is that if there aren’t any writes for a “lengthy” time frame (the place the definition of “lengthy” is system dependent), each thread of execution will agree on the worth of the final write. So so long as P4 finally sees the brand new worth of X (5) at some later time limit (not proven in Determine 6), then eventual consistency is maintained.
As we mentioned above, consistency ranges have traditionally been outlined by way of particular person reads or writes of an information merchandise. This makes the dialogue of consistency ranges onerous to use to the context of database programs by which teams of learn and write operations happen collectively in atomic transactions. Certainly, each the analysis literature and vendor documentation (for these distributors that provide a number of consistency ranges) are extraordinarily complicated and don’t take a uniform strategy in terms of making use of consistency ranges within the context of database programs.
I believe that the best option to cause about consistency ranges within the presence of database transactions is to make solely minor changes to the consistency fashions we mentioned earlier. We nonetheless view consistency as threads of execution sending learn and write requests to a shared knowledge retailer. The one distinction is that we annotate every learn and write request with the transaction identifier of the transaction that initiated every request. If every thread of execution can solely course of one transaction at a time, and transactions cannot be processed by a couple of thread of execution, then the standard timeline consistency diagrams want solely be supplemented with rectangular boundaries indicating the beginning and finish level of every transaction inside a thread of execution, as proven within the determine beneath.
The presence of a transaction in a consistency diagram provides extra constraints equivalent to AID of ACID: the entire reads and writes inside the transaction succeed or fail collectively (atomicity), they’re remoted from different concurrently operating transactions (the diploma of isolation will rely upon the isolation degree), and writes of dedicated transactions will outlive all types of system failure (sturdiness).
The atomicity and sturdiness ensures of transactions are fairly straightforward to cognitively separate from consistency ensures, as a result of they cope with basically totally different ideas. The issue is isolation. Consistency ensures specify how and when writes are seen to threads of execution that learn database state. Isolation ensures additionally place limitations on when writes change into seen. This conceptual overlap of isolation and consistency ensures is the supply of a lot confusion. Within the subsequent publish of this sequence I plan to provide a tutorial for understanding the distinction between isolation ranges and consistency ranges.