Multithreading: load and store reordering
Until now, we’ve been busy taking a look at several interesting topics associated with multithreaded programming. As we’ve seen, one of the most problematic areas in multithreaded programs is sharing state across multiple threads. In several posts of this series, we saw that we can use critical regions to ensure that shared state is accessed by one thread at a time. In other words, using a critical region serializes access to shared state, which is what keeps the program correct.
One thing that surprises many people is learning that accessing a single variable isn’t always safe. Understanding why leads us to a new concept: the reordering of memory loads and stores. Even though most of us have grown used to thinking that a program executes precisely in the order we wrote it, that isn’t really guaranteed. In practice, memory operations like loads (i.e., reading a variable) and stores (i.e., writing a value into a variable) can be reordered under the optimization “banner”:
- the first “optimization” you might get will be performed by the compiler. Compilers might move, eliminate (or even add) memory loads and stores in your program. However, compilers will always preserve the sequential (single-threaded) behavior of the code, though their reordering can still break multithreaded code;
- processors might also change the way compiled code gets executed. For instance, modern processors execute instructions out of order and speculatively (branch prediction is one example of this) to improve the performance of your program. These are just some of the optimizations that may break your code when it runs in parallel (i.e., after adding multithreading to your app);
- caches may give you unexpected results too. Most modern processor architectures employ several levels of caches. Some are shared between all processors, while others are processor specific. Caches tend to break the “memory as a big array” vision, leading to an effective reordering of loads and stores (at least, this is the perception you get when caching starts breaking your code).
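To make the list above more concrete, here is a minimal sketch of a classic “store then load” experiment. It is written in Java purely for illustration (this post isn’t tied to any particular language), and the class, method and field names are hypothetical. Two threads each write one shared variable and then read the other one. If every operation ran in program order, at least one thread would have to see the other’s write, so getting 0 in both reads can only happen when a store has effectively been moved after the load that follows it.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical litmus test; all names here are illustrative only.
class StoreLoadLitmus {
    // Plain (non-volatile) fields: the compiler/JIT and the processor
    // are free to reorder accesses to them across threads.
    static int x, y;
    static int r1, r2;

    // Runs the two-thread experiment `iterations` times and returns how
    // often each (r1, r2) pair was observed, indexed as r1 * 2 + r2.
    static int[] run(int iterations) {
        int[] seen = new int[4];
        for (int i = 0; i < iterations; i++) {
            x = 0; y = 0; r1 = 0; r2 = 0;
            CountDownLatch start = new CountDownLatch(1);
            Thread a = new Thread(() -> {
                awaitQuietly(start);
                x = 1;   // store to x...
                r1 = y;  // ...followed by a load from y
            });
            Thread b = new Thread(() -> {
                awaitQuietly(start);
                y = 1;   // store to y...
                r2 = x;  // ...followed by a load from x
            });
            a.start();
            b.start();
            start.countDown(); // release both threads at roughly the same time
            joinQuietly(a);
            joinQuietly(b);
            seen[r1 * 2 + r2]++;
        }
        return seen;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `StoreLoadLitmus.run(100000)` and inspecting index 0 of the result tells you how many times both threads read 0. Whether that count is ever non-zero depends on the hardware, the JIT and plain timing (thread start-up overhead makes the interesting interleaving rare in such a simple harness), so a run that shows no (0, 0) pairs doesn’t prove the reordering can’t happen.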
After this small introduction, you should be worried about the code you write because it seems like nothing is safe (if even a simple variable access smaller than the “current” processor word size isn’t safe, then what can we do?). Fortunately, there are some guarantees which we can use to write safe multithreaded programs:
- instruction reordering cannot break the sequential evaluation of the code. What this means is that your code should always run safely and correctly if you run it on a single thread (meaning that we only need to worry about reordering when we write multithreaded code);
- data dependencies will always be respected. As an example, this means that if you have something like x = 10; y = x; then these memory accesses won’t be reordered (i.e., you will never get y = x; x = 10) because the value stored in y depends on the value of x;
- finally, all platforms conform to a specific memory model which defines the rules that must be followed when reordering memory loads and stores.
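As an illustration of the last point, here is a small sketch that relies on one rule of one specific memory model: Java’s. The assumption (specific to Java, other platforms have different rules) is that a write to a `volatile` field happens-before any later read of that field that sees the written value, so everything the writer stored before setting the flag is visible to a reader that observes the flag. The class and member names are hypothetical.

```java
// Hypothetical example; names are illustrative only.
class Publication {
    static int data;               // plain field, published through the flag below
    static volatile boolean ready; // volatile: its write cannot be moved before the store to data

    // Starts a writer and a reader thread and returns the value the reader saw.
    static int publishAndRead() {
        data = 0;
        ready = false;
        int[] observed = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {       // volatile read: spin until the flag is seen
                Thread.yield();
            }
            observed[0] = data;    // guaranteed to see the writer's store to data
        });
        Thread writer = new Thread(() -> {
            data = 42;             // ordinary store...
            ready = true;          // ...published by the volatile store
        });

        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return observed[0];
    }
}
```

Under Java’s rules, `Publication.publishAndRead()` returns 42. If `ready` were a plain field, the memory model would no longer give that guarantee: the reader might see a stale `data`, or might never see the flag change at all.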
I guess that the main thing you should take from these points is that what runs isn’t always what you’ve written. In the next post, we’ll keep looking at these issues and see how critical regions help with memory reordering issues. Stay tuned!