Concurrency and Parallelism

If you have a good general-purpose introduction to concurrency and parallelism, let me know. All the links below either presuppose a decent amount of background or don't quite add up to a real introduction.

General

Data Races vs. Race Conditions

They're related, but not the same, even though people use them interchangeably. I originally thought this was an excessively pedantic distinction, then had to remember it when a friend said he thought it was impossible for Rust to prevent data races (it's not!).1 So this isn't pedantic, or at least it's the best kind of pedantic.

I'm honestly confused about whether the Boehm paper uses the terminology in the same way. It cites the C11/C++11 standards, which give what seems like an inconsistent definition by requiring one of the accesses to be non-atomic.
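To make the difference concrete, here's a small Rust sketch (my own, not from the linked posts). Every access is atomic, so there is no data race, but the read-modify-write is split into two operations, so increments can still be lost: a race condition with no data race.

    use std::sync::atomic::{AtomicU64, Ordering};
    use std::thread;

    // All accesses are atomic, so by the C11/C++11 (and Rust) definition there
    // is no data race. There is still a race condition: two threads can read
    // the same value and one of the increments gets lost.
    static COUNTER: AtomicU64 = AtomicU64::new(0);

    fn main() {
        let handles: Vec<_> = (0..4)
            .map(|_| {
                thread::spawn(|| {
                    for _ in 0..100_000 {
                        let current = COUNTER.load(Ordering::Relaxed);
                        COUNTER.store(current + 1, Ordering::Relaxed);
                    }
                })
            })
            .collect();
        for h in handles {
            h.join().unwrap();
        }
        // Usually prints something well below 400000. Replacing the load/store
        // pair with COUNTER.fetch_add(1, Ordering::Relaxed) removes the race
        // condition too.
        println!("count = {}", COUNTER.load(Ordering::Relaxed));
    }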

15 Years of Concurrency

Joe Duffy's reflections on working on concurrency in C# and Midori. Contains a lot of ideas about searching for inspiration on new architectures and language features for concurrency, but no overarching thesis.

Threads, Locks and Shared Memory

There's a pervasive feeling that threads and shared memory are the wrong abstraction level for parallel or concurrent programming for most developers, which leaves the question of what a better replacement is, with candidates like actors, CSP, or threads plus language-level guarantees that rule out data races.

These are great ideas, but it's valuable to understand threading and shared memory--the other approaches are typically implemented in terms of those primitives.
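To see what "implemented in terms of those primitives" means in practice, here's a toy unbounded channel built from nothing but a mutex and a condition variable (a sketch of mine, not production code):

    use std::collections::VecDeque;
    use std::sync::{Arc, Condvar, Mutex};
    use std::thread;

    // A toy message-passing channel built directly on the shared memory
    // primitives: a lock-protected queue plus a condition variable.
    struct Channel<T> {
        queue: Mutex<VecDeque<T>>,
        ready: Condvar,
    }

    impl<T> Channel<T> {
        fn new() -> Self {
            Channel { queue: Mutex::new(VecDeque::new()), ready: Condvar::new() }
        }

        fn send(&self, value: T) {
            self.queue.lock().unwrap().push_back(value);
            self.ready.notify_one();
        }

        fn recv(&self) -> T {
            let mut queue = self.queue.lock().unwrap();
            loop {
                if let Some(value) = queue.pop_front() {
                    return value;
                }
                // Releases the lock while waiting, reacquires it on wakeup.
                queue = self.ready.wait(queue).unwrap();
            }
        }
    }

    fn main() {
        let chan = Arc::new(Channel::new());
        let receiver = {
            let chan = Arc::clone(&chan);
            thread::spawn(move || {
                for _ in 0..3 {
                    println!("got {}", chan.recv());
                }
            })
        };
        for i in 0..3 {
            chan.send(i);
        }
        receiver.join().unwrap();
    }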

Deadlock Empire

An interactive GitHub page that presents a series of challenges: make two or more program threads simultaneously enter the critical section, deadlock, fail to terminate, etc. The first few examples are easy, but they quickly get trickier.

The examples are in C#, but hopefully will be readable for someone with experience with any curly brace language. I wish the authors would add a more explicit goal for each exercise.
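For a taste of the kind of bug the exercises have you trigger, here's the classic lock-ordering deadlock, in Rust rather than C#:

    use std::sync::{Arc, Mutex};
    use std::thread;
    use std::time::Duration;

    fn main() {
        let a = Arc::new(Mutex::new(0));
        let b = Arc::new(Mutex::new(0));

        let t1 = {
            let (a, b) = (Arc::clone(&a), Arc::clone(&b));
            thread::spawn(move || {
                let _a = a.lock().unwrap(); // t1 holds a...
                thread::sleep(Duration::from_millis(10));
                let _b = b.lock().unwrap(); // ...and waits for b.
            })
        };
        let t2 = {
            let (a, b) = (Arc::clone(&a), Arc::clone(&b));
            thread::spawn(move || {
                let _b = b.lock().unwrap(); // t2 holds b...
                thread::sleep(Duration::from_millis(10));
                let _a = a.lock().unwrap(); // ...and waits for a: deadlock.
            })
        };

        // Both joins hang. Always taking the locks in the same order
        // (a, then b) fixes it.
        t1.join().unwrap();
        t2.join().unwrap();
    }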

Position Paper: Nondeterminism is Unavoidable but Data Races are Pure Evil

Data races invalidate core semantic guarantees provided by the programming language and compiler. In the presence of data races, it becomes impossible to reason about program behavior. Simply removing data races from programs that happen to work with them, typically does not worsen scalability, defined informally as the number of processor cores we can effectively take advantage of. And, in the case of C11 and C++11, the absolute performance cost of removing data races from racy programs is also essentially zero.

I confess I found the argument of the scalability section difficult to follow.

Concurrency vs. Parallelism

This is a distinction that took me a little while to wrap my head around. The three posts below take slightly different tacks in presenting it.

Concurrency is Not Parallelism, It's Better

Rob Pike's talk presenting the distinction in a very intuitive way, and also explaining how Go supports concurrency. His view is that by supporting concurrency well, you can easily write parallel programs. The PLT folks agree about the distinction but are otherwise somewhat critical, and they back their criticism up with several citations.

Parallelism is not Concurrency

Bob Harper making much the same distinction in more technical terms. Harper differs, though, in stressing that parallelism is an important first-class notion that should be understood in its own terms, not as a byproduct of concurrency.

Parallelism /= Concurrency

Simon Marlow argues that Haskell has the best support for both parallelism and concurrency, and that languages with pervasive side effects encourage conflating the two.

Performance

Single Writer Principle

Having multiple threads write to the same data creates coordination overhead, so high-performance multithreaded programs should usually avoid it.
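One way to apply it (my sketch, not the article's code): give a single thread exclusive ownership of the data and have every other thread send it messages, so the hot path needs no lock and the data itself sees no write contention.

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel::<u64>();

        // Single writer: only this thread ever touches `total`, so it needs no
        // lock and its cache line never bounces between writing cores.
        let writer = thread::spawn(move || {
            let mut total = 0u64;
            for delta in rx {
                total += delta;
            }
            total
        });

        // Many producers, each with its own clone of the sender.
        let producers: Vec<_> = (0..4)
            .map(|_| {
                let tx = tx.clone();
                thread::spawn(move || {
                    for _ in 0..100_000 {
                        tx.send(1).unwrap();
                    }
                })
            })
            .collect();

        for p in producers {
            p.join().unwrap();
        }
        drop(tx); // close the channel so the writer's loop ends

        println!("total = {}", writer.join().unwrap()); // 400000
    }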

False Sharing
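Roughly: false sharing is when logically independent variables sit on the same cache line, so writes from different cores still contend even though no data is actually shared. A minimal sketch of the usual mitigation, with the 64-byte line size as an assumption (it varies by CPU):

    use std::sync::atomic::{AtomicU64, Ordering};
    use std::thread;

    // Without the alignment, two adjacent counters would typically share a
    // cache line, so threads hammering "independent" counters still bounce
    // that line between cores. Padding each counter out to an (assumed)
    // 64-byte line keeps them apart.
    #[repr(align(64))]
    struct PaddedCounter(AtomicU64);

    static COUNTERS: [PaddedCounter; 2] = [
        PaddedCounter(AtomicU64::new(0)),
        PaddedCounter(AtomicU64::new(0)),
    ];

    fn main() {
        let handles: Vec<_> = (0..2usize)
            .map(|i| {
                thread::spawn(move || {
                    for _ in 0..1_000_000 {
                        COUNTERS[i].0.fetch_add(1, Ordering::Relaxed);
                    }
                })
            })
            .collect();
        for h in handles {
            h.join().unwrap();
        }
        println!(
            "{} {}",
            COUNTERS[0].0.load(Ordering::Relaxed),
            COUNTERS[1].0.load(Ordering::Relaxed)
        );
    }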

Heap

The Art of Multiprocessor Programming

Is Parallel Programming Hard, And If So, What Can You Do About It?

Textbook on shared memory parallel programming, focusing on the primitives.

The Problem With Threads

Rather than pruning nondeterminism, we should build from essentially deterministic, composable components. Nondeterminism should be explicitly and judiciously introduced where needed, rather than removed where not needed. The consequences of this principle are profound. I argue for the development of concurrent coordination languages based on sound, composable formalisms. I believe that such languages will yield much more reliable, and more concurrent programs.
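A tiny illustration of the "deterministic by construction" style (my sketch, not Lee's): give each thread a disjoint slice of the input, so there's no shared mutable state and the result doesn't depend on scheduling at all.

    use std::thread;

    // Each scoped thread sums its own disjoint chunk of the input, so there is
    // no interleaving to reason about: the output is fully determined by the
    // input, however the threads are scheduled.
    fn parallel_sum(data: &[u64], num_chunks: usize) -> u64 {
        let chunk_size = (data.len() + num_chunks - 1) / num_chunks.max(1);
        thread::scope(|s| {
            let handles: Vec<_> = data
                .chunks(chunk_size.max(1))
                .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).sum()
        })
    }

    fn main() {
        let data: Vec<u64> = (1..=1_000).collect();
        println!("{}", parallel_sum(&data, 4)); // always 500500
    }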

Threads Cannot Be Implemented As A Library

An Introduction to Programming With Threads

Actors Are Overly Nondeterministic

We could almost do a search-and-replace to turn The Problem with Threads into an argument against using actors!

Any decision affecting a program's asymptotic runtime or space complexity should be under explicit programmer control, using a clear and composable programming model. Actors violate this principle by making a key decision affecting memory usage (when to sequence a message send, and what sort of nondeterminism to allow) an emergent runtime property, and by forcing us to encode these decisions using ad hoc protocols, which aren't really first class or composable.
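To make the memory point concrete, here's a hedged sketch that uses a plain unbounded channel as a stand-in for an actor mailbox. The producer outruns the consumer, and the backlog (and therefore the program's memory footprint) is decided by relative thread speeds rather than by anything written in the code:

    use std::sync::atomic::{AtomicU64, Ordering};
    use std::sync::mpsc;
    use std::thread;
    use std::time::Duration;

    static PROCESSED: AtomicU64 = AtomicU64::new(0);

    fn main() {
        // Unbounded queue standing in for an actor's mailbox.
        let (tx, rx) = mpsc::channel::<Vec<u8>>();

        // "Actor": handles roughly a thousand messages per second.
        let consumer = thread::spawn(move || {
            for msg in rx {
                thread::sleep(Duration::from_millis(1));
                drop(msg);
                PROCESSED.fetch_add(1, Ordering::Relaxed);
            }
        });

        // The producer sends as fast as it can. Nothing in the programming
        // model pushes back, so the backlog size is an emergent property of
        // thread speeds, not an explicit decision anywhere in the program.
        const SENT: u64 = 10_000;
        for _ in 0..SENT {
            tx.send(vec![0u8; 1024]).unwrap();
        }
        println!(
            "backlog: roughly {} messages queued",
            SENT - PROCESSED.load(Ordering::Relaxed)
        );

        // A bounded channel (mpsc::sync_channel) makes the decision explicit:
        // send blocks once the queue is full.
        drop(tx);
        consumer.join().unwrap();
    }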

Concurrency Freaks


  1. Despite the glib assurance in the Rust Nomicon that it's impossible to prevent arbitrary race conditions, I half-recall that there are type-theoretic ways to prevent "a lot" of them, and I wouldn't be surprised if, by the time I retire, there are mainstream languages using those ideas--the same way that Rust brought the concept of ownership into a mainstream language.