Strict Consistency Models
by srikanth on 11/04/2011

I’ve had some trouble understanding memory consistency models and related terminology. So I thought I’d write a blog post about what I’ve learnt.

Sequential Consistency – This is a very common term. Sequential consistency means that, for each thread/processor, the instructions of a program appear to execute in the order in which they are given. In uniprocessor systems this is straightforward and matches how we naturally think about programs (see Strict Consistency). In multiprocessor systems, we can reason about sequentially consistent executions by considering every possible interleaving of the instructions from the different threads. This is under the assumption that each instruction is atomic.

To understand this, let’s look at a simplified Dekker’s algorithm. There are two threads P1 and P2, with shared variables x and y (both initially 0); r1 and r2 are thread-local.

P1:          P2:
x = 1;       y = 1;
r1 = y;      r2 = x;

Suppose these two threads start around the same time. We can then consider any interleaving of instructions between the two threads, such that the order of instructions within each thread is preserved. In a sequentially consistent environment, no possible interleaving gives r1 = r2 = 0.
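This can be checked by brute force. Here is a small sketch (the helper names and op format are my own, not from the post) that enumerates every interleaving of the two threads and confirms the outcome r1 = r2 = 0 never occurs:

```python
from itertools import permutations  # not strictly needed; shown for symmetry with later checks

# Thread programs as (op, variable, value-or-register) tuples.
P1 = [("write", "x", 1), ("read", "y", "r1")]
P2 = [("write", "y", 1), ("read", "x", "r2")]

def interleavings(a, b):
    """Yield every interleaving of a and b that preserves each thread's order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(schedule):
    """Execute a schedule atomically, one instruction at a time."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for op, var, arg in schedule:
        if op == "write":
            mem[var] = arg
        else:
            regs[arg] = mem[var]
    return regs

results = {(run(s)["r1"], run(s)["r2"]) for s in interleavings(P1, P2)}
print(results)  # (0, 0) is never among the outcomes
```

Only (0, 1), (1, 0), and (1, 1) show up: at least one thread always sees the other’s write.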

One possible interleaving:

P1:                  W(x)1   R(y)1
P2:  W(y)1   R(x)0

Thus the core idea in Sequential Consistency is that there is a total order over the instructions. The result of the execution should be describable by some total order formed by interleaving the per-thread instruction sequences.

Legal execution for Sequential Consistency (not linearizable):

P1:  W(x)1
P2:         R(x)0  R(x)1

It can be cool to note that Sequential Consistency is a global property and that it’s not composable. The following execution is sequentially consistent if we consider the x and y objects separately, but taken together the execution becomes inconsistent. So we can’t take objects that are individually sequentially consistent and use them together assuming that they will behave correctly.

Illegal execution for Sequential Consistency (not linearizable, obviously):

P1:  W(x)1               R(y)0
P2:         W(y)1               R(x)0

It’s important to note that the wall-clock time of execution is irrelevant: what matters is only the relative order of operations and whether the execution behaves as if some total order exists over it. For the legal execution shown earlier, the system behaves as if W(x)1 occurred after the R(x)0, even though the actual execution in real time may be as drawn in the diagram. It’s important to separate the behavioral idea of sequential consistency from the actual implementation and execution.
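The non-composability point can also be checked mechanically. Below is a brute-force sketch (the `is_sc` helper and history format are my own) that searches for a total order satisfying program order and read-latest-write semantics, applied to a Dekker-style history in which both reads return 0:

```python
from itertools import permutations

def is_sc(history):
    """history: list of per-process op lists; an op is (kind, var, value)."""
    ops = [(p, i) for p, plist in enumerate(history) for i in range(len(plist))]
    for order in permutations(ops):
        nxt = [0] * len(history)   # next expected op index per process
        mem = {}
        ok = True
        for p, i in order:
            if i != nxt[p]:        # program order violated
                ok = False
                break
            nxt[p] += 1
            kind, var, val = history[p][i]
            if kind == "w":
                mem[var] = val
            elif mem.get(var, 0) != val:  # read must see the latest write (init 0)
                ok = False
                break
        if ok:
            return True
    return False

# Dekker-style history where both reads return 0:
P1 = [("w", "x", 1), ("r", "y", 0)]
P2 = [("w", "y", 1), ("r", "x", 0)]

x_only = [[op for op in p if op[1] == "x"] for p in (P1, P2)]
y_only = [[op for op in p if op[1] == "y"] for p in (P1, P2)]
print(is_sc(x_only), is_sc(y_only))  # True True: each variable alone is fine
print(is_sc([P1, P2]))               # False: no total order explains both reads
```

Each variable’s sub-history admits a total order, but no single order works for the whole history, which is exactly the non-composability claim.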

Strict Consistency – This is the strictest memory model. It’s the same as sequential consistency, except that it additionally guarantees that every read sees the most recent write across all processors.

One way to imagine it is that there is a global clock for Strict consistency. Every operation occurs at some point on this global clock and all writes are immediately accessible to every other processor. In other words, this behaves like sequential consistency on a uniprocessor.

Causal Consistency – Relaxing Sequential Consistency further, we get this model. Here, concurrent writes may appear in different orders on different processors. Only causally related operations need to be seen in the same order by all processors. Two operations are causally related if the result of one can affect the other. So a write of x is causally related to subsequent reads of x, and the result of such a read must be consistent on all processors. But a write of x and an independent write of y are not causally related, so their order may appear different on different processors.

A valid causally consistent execution:

P1:  W(x)1
P2:  W(x)0
P3:        R(x)0  R(x)1
P4:        R(x)1  R(x)0

Since there is no causal dependency between the two writes, they may appear to have happened in different orders on different processors.

Not possible under causal consistency:

P1:         R(x)0   W(x)1
P2:  W(x)0
P3:                        R(x)0  R(x)1
P4:                        R(x)1  R(x)0

Here, it can be reasoned that the R(x)0 in P1 is possibly causally related to the W(x)0 (the read may have returned the value that write produced). Thus the order of W(x)0 and W(x)1 must appear the same to all other threads. Note that both P3 and P4 may read R(x)0 followed by R(x)1, but neither may observe the writes in the reverse order.
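One common way systems track this distinction in practice is with vector clocks. This is an illustrative sketch (vector clocks and these clock values are my own addition, not from the post): two writes are causally ordered when one clock dominates the other componentwise, and concurrent otherwise.

```python
def leq(a, b):
    """Componentwise <= on vector clocks."""
    return all(x <= y for x, y in zip(a, b))

def causally_ordered(a, b):
    return leq(a, b) or leq(b, a)

# First example: P1 and P2 write x independently. Neither clock dominates,
# so the writes are concurrent and may be applied in different orders.
print(causally_ordered((1, 0), (0, 1)))  # False

# Second example: P1 read P2's W(x)0 before issuing W(x)1, merging P2's
# clock into its own. Now P1's clock dominates, the writes are causally
# ordered, and every processor must see W(x)0 before W(x)1.
print(causally_ordered((2, 1), (0, 1)))  # True
```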

What processors and languages do

Most programming languages provide Sequential Consistency as long as the program is Data Race Free. Threads protect accesses to shared memory with some kind of synchronization, such as mutual exclusion locks, thereby avoiding data races and ensuring that critical sections are executed by one thread at a time.
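A minimal sketch of the data-race-free pattern, using Python’s `threading` module (the counter example is my own, not from the post):

```python
import threading

counter = 0
lock = threading.Lock()

def worker():
    global counter
    for _ in range(100_000):
        with lock:          # critical section: one thread at a time
            counter += 1    # read-modify-write is now atomic w.r.t. other workers

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # always 400000 with the lock; unpredictable without it
```

Because every access to `counter` is inside the lock, the program is data race free, and we can reason about it as if the increments occurred in some sequential order.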

Microprocessors do not provide sequential consistency directly, because it is typically expensive to implement. Compilers generate code that preserves the program’s overall behavior by taking advantage of control flow and read/write dependences. Microprocessors may also perform a load earlier than program order allows, while still maintaining dependence order. Numerous other hardware optimizations are performed to churn out as much speed as possible.

Linearizability – This is Sequential Consistency with the added caveat that there is a timestamp associated with each operation, and every thread sees the operations in the order given by the timestamps. This is mainly a tool to reason about executions relative to absolute time. Every operation is said to have occurred at its assigned time and can see all writes that have occurred in all processes with smaller timestamps.

Non linearizable, but sequentially consistent execution:

P1:  W(x)1  R(x)0
P2:               W(x)0

This particular execution is not valid for linearizability, but it’s sequentially consistent.

Linearizable and sequentially consistent execution:

P1:  W(x)1         R(x)0
P2:         W(x)0

So, Linearizability is stronger than Sequential Consistency because linearizability requires the real-time order of operations to be preserved. This constraint makes linearizable objects composable: a collection of linearizable objects used together will also give linearizable executions.
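Linearizability of a single-register history can also be checked by brute force. A sketch, assuming my own (start, end, kind, value) interval format for operations:

```python
from itertools import permutations

def respects_time(order):
    """If b finished before a started, a must not precede b in the order."""
    for i, a in enumerate(order):
        for b in order[i + 1:]:
            if b[1] < a[0]:
                return False
    return True

def legal(order):
    """Sequential register semantics: each read sees the latest write (init 0)."""
    val = 0
    for _, _, kind, v in order:
        if kind == "w":
            val = v
        elif val != v:
            return False
    return True

def linearizable(ops):
    return any(respects_time(o) and legal(o) for o in permutations(ops))

# W(x)1 completes, then R(x)0 completes, then W(x)0 starts: the read
# returns a value not yet written, so no real-time-respecting order works.
h1 = [(0, 1, "w", 1), (2, 3, "r", 0), (4, 5, "w", 0)]
print(linearizable(h1))  # False

# Move W(x)0 between the first write and the read: now linearizable.
h2 = [(0, 1, "w", 1), (2, 3, "w", 0), (4, 5, "r", 0)]
print(linearizable(h2))  # True
```

The first history matches the non-linearizable (but sequentially consistent) diagram above; the second matches the linearizable one.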

Serializability – The context of serializability is typically databases. Here, a group of statements forms a transaction. An execution is serializable if it is equivalent to the transactions running in some sequential order. An execution is strictly serializable if that sequential order is also consistent with the real-time order of the transactions. So linearizability is the same as strict serializability with each transaction consisting of just one operation. These transactions can run in different threads or the same thread.
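A toy illustration of what serializability rules out (my own example, not a real database): T1 adds 10 to x, T2 doubles x, and a lost-update interleaving produces a result that matches neither serial order.

```python
def t1(db):
    db["x"] = db["x"] + 10  # transaction T1

def t2(db):
    db["x"] = db["x"] * 2   # transaction T2

# The only two serial outcomes starting from x = 1.
serial_outcomes = set()
for first, second in [(t1, t2), (t2, t1)]:
    db = {"x": 1}
    first(db)
    second(db)
    serial_outcomes.add(db["x"])
print(sorted(serial_outcomes))  # [12, 22]

# A lost-update interleaving: both transactions read before either writes.
db = {"x": 1}
r1 = db["x"]           # T1 reads 1
r2 = db["x"]           # T2 reads 1
db["x"] = r1 + 10      # T1 writes 11
db["x"] = r2 * 2       # T2 writes 2, clobbering T1's update
print(db["x"], db["x"] in serial_outcomes)  # 2 False: not serializable
```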

Disclaimer: I may not be 100% right on everything!

  • Anonymous

At what point will a developer dig deeper into memory consistency models, i.e. if the model is dictated by the compiler/framework?

  • Srikanth Raju

    Yup. I suspect in the future that some frameworks will start providing weak memory models such as Release Consistency. Parallel and Distributed Applications using shared memory may be able to take advantage of some of these models and show significant performance improvement along with better expressibility.
However, compiler writers will have to be aware of these because there tend to be many differences between the memory models of microprocessors. For example, SPARC processors have a much weaker model compared to x86 and allow significant instruction reordering.

  • fredfsh

Clearest explanation of linearizability and sequential consistency I’ve ever seen.
I think the example under the paragraph “It can be cool…” shows no problem of consistency, whether we consider x and y separately or not. Maybe it should be R(x)1 and R(y)1 according to your hints, I think.

    • Srikanth Raju

      Thanks! You’re right. I’ve fixed it.