Saturday, September 02, 2006

Hell is a multi-threaded C++ program.

What are threads?

Every modern operating system has support for threads, and most programming environments provide some level of support for threading. What threads give you is the ability for your program to do more than one thing at once. The problem with threads is the way that they can dramatically increase the complexity of your program.

First, a little background, so we're all on the same page. In Computer Science, as in the physical sciences, using a simplified model makes it easier to discuss complex phenomena without getting bogged down in insignificant details. The trick of course, is in knowing where your simplifications deviate from reality in a way that affects the validity of the results. While spherical cows on an infinite frictionless plane do make the calculations easier, sometimes the details matter.

When Real Computer Scientists (tm) are discussing problems in concurrent programming (like the Dining Philosophers), they'll sometimes refer to a Process, which is kind of abstract ideal of a computer program. Multiple Processes can be running at the same time in the same system, and can also interact and communicate in various ways.

The threads provided by your favorite operating system and programming language are something basically similar to this theoretical concept of a Process, with a few unfortunate details of implementation.

The New Jersey approach

I couldn't find a definitive reference to the history of the development of threads as we know them today, but the model most people are familiar with arose out of POSIX, which was largely an attempt to formalize existing practice in UNIX implementations.

It turns out that POSIX Threads, Mach Threads, Windows Threads, Java Threads, and C# Threads all work very much the same, since they're all implemented in more or less the same way. The object-oriented environments wrap a thin veneer of objects around a group of extremely low-level functions, but you've got your basic operations of create(), join(), and exit(), as well as operations on condition variables and mutexes. For the rest of this rant, I'll refer to these as "Pthreads", for convenience.

Pthreads are an example of the Worse is better philosophy of software design, as applied to the problems of concurrent programming. The POSIX threading model is just about the simplest possible implementation of multi-threading you could have. When you want to create a new thread, you call pthread_create(), and a new thread is created, starting execution with some function you provide. The newly-created thread is created by allocating some memory for a stack for the new thread, loading up a couple of machine registers, and jumping to an address.

Shared state - two models

In the Pthreads model, all of your threads share the same address space. This makes sharing data between threads very simple and efficient. On the other hand, the fact that all of the state in the program is accessible and changeable from every thread can make it very difficult to ensure that access to all this shared state is managed correctly. Race conditions, where one thread attempts to update a data structure at the same time that another thread is trying to access or change that same structure, are common.

The problem with the all state is shared model is that it doesn't match up very well with what you're generally trying to accomplish when you spawn a thread. You'll normally create a new thread because you want that thread to do something different than what the main thread is already doing. This implies that not all of the state in the parent thread needs to be available to be modified in the second thread. But because of the way threads are created in this model, it's easier (for the OS or language implementor) to share everything rather than a well-defined subset, so that's what you get.

The other major model for multi-threading is known as message-passing multiprocessing. Unless you're familiar with the Occam or Erlang programming languages, you might not have encountered this model for concurrency before.

There are a number of variations on the message-passing model, but they all have one thing in common: In the message-passing model, your threads don't share any state by default. If you want some information to go from one thread to another, you need to do it by having one thread send a message to the other thread, typically by calling a function provided by the system for just this purpose. Two popular variants of the message-passing model are "Communicating Sequential Processes" and the "Actor model".

You can get a nice introduction to the message-passing model by reading the first couple chapters of the Occam Reference Manual, which is apparently available online these days (I got mine by digging around in a pile of unwanted technical books at a former employer). Occam is of course the native language of the Transputer, a very inventive but commercially unsuccessful parallel processor architecture from the UK which made a big splash in the mid-80's before vanishing without a trace.

Why would you want to learn about this alternative model, when Pthreads have clearly won the battle for the hearts and minds of the programming public? Well, besides the sheer joy of learning something new, you might develop a different way of looking at problems, that'll help you top make better use of the tools that you do use regularly. In addition, as I'll explain in Part II of this rant, there's good reason to believe that message-passing concurrency is going to be coming back in a big way in the near future.

Enough rope to hang yourself with

As I mentioned earlier, the Pthreads model implies that all of your program's address space is shared between all threads. Most (all?) implementations allow you to allocate some amount of thread-local storage, but in general, the vast majority of your program's state is shared by every thread. This implies that every thread has the ability to modify the value of any variable, and call any arbitrary function, at any time. This is a really powerful tool, but like all powerful tools, it can be dangerous if misused.

It's extremely difficult to predict what the behavior of even fairly simple code will be, when multiple threads can run it simultaneously. For more complex code, the problem rapidly becomes intractable. In a low-level language like C, you need to know the intimate details of how the compiler will optimize your code, which operations are guaranteed to be completed atomically, what the register allocation policy is, etc, etc. In a JIT-compiled language like Java or C#, it's impossible to even know what machine code will be used at runtime, so analyzing runtime behavior in detail just isn't possible.

An unlocked gun cabinet

I think one of the major problems with Pthreads is that it's too easy to make something that almost works. This then leads to an unwarranted belief that multi-threaded programming is simple. For example, say you've got a simple interactive GUI application, and you think that the application takes too long to calculate something after the user presses a button. The "obvious" solution to this problem is to have your button-press handler spawn off a new thread to perform the long-running operation.

So you try this, and it works perfectly on the first try - the child thread launches, and then calculates away while the main thread goes back to handling the UI. You still need to notify the main thread when the calculation is complete, but there any number of easy ways to do this, and you probably won't have much trouble figuring that out. Gee, that wasn't so difficult, I wonder why people say that multi-threaded programming is difficult?

It's this sort of ad-hoc approach to creating threads that gets people into trouble. They create a new thread to solve one problem, and then another, and then they suddenly realize that thread A and thread M are interacting in a bad way. So they protect some critical data structures with mutexes, and before they know it, they're trying to debug a deadlock situation where they don't even understand how those two pieces of code could interact.

You need to really think about why you're creating a thread, before you spawn it. I'm not going to go so far as to say that creating threads while your program is running (rather than at startup) is de-facto proof that you're doing something wrong, but it's definitely a strong indication that you're not thinking about what your threads are for with any great rigor.

How to not shoot yourself in the foot

... is going to be the subject of Part II. Sorry for the cliff-hanger ending, but I wanted to get at least a little of this published, and potentially get some comments on it, before finishing the rest.


leeg said...

Maybe you could clarify something for me: I think that your second model of threading can be implemented in Mach, with some resource issues; rather than simply spawning a new thread, you need to spawn a new task containing your thread and tell vm_inherit() that you don't want any of the memory regions to be inherited. This effectively gives you a poor man's fork(), so that you get a new task but without it even having a shadow object pointing to the first task's data. It's not as simple as the native Mach model (which is indeed pthread-esque) but I don't think it's mad hard to do.

Does that implement what you're talking about, or are me and the point on different continents? :-)

Anonymous said...

Great to see some appreciation and awareness of the Occam language!

I hope the ideas embodied in it will see the light of day again.

(Erlang is cool too, but not exactly the same..)

Reimer Mellin said...

nice writeup, but if memory does not trick me then CSP was indeed the granddaddy of message passing models and predated ERlang and definitly Occam.

Perry The Cynic said...

It seems to me that most thread programming is done for three reasons:
(1) Take advantage of multiple CPU cores for a single task.
(2) Work with (or around) synchronous, blocking interfaces that would otherwise idle the processor.
(3) Avoid asynchronous/state model programming and continue to use sequential programming techniques instead.

Only (1) is a good reason to use threads of any kind (and then only if you're doing a CPU-crunching job).

(2) is sometimes necessary to keep your program from freezing up, but the right answer is to use (and demand) asynchronous/nonblocking APIs to get your work done. (Sadly, AIO-style UNIX I/O, to give an example, is often implemented with threads, propagating the problem right into the new API.)

And then there's (3), the source (in my estimate) of a good 80-90% of all thread programming. State/event programming seems so unnatural to many programmers that they'll go to incredible lengths to avoid it. One reason is that by the time the problem becomes obvious, programs are already established (often shipped), and rewriting them as state machines is a major architectural change. Adding just one thread (or two, or ten) seems so much easier, a more gradual, manageable change. Many programs slip into multi-threading that way, almost imperceptively. The result is almost invariably mush.

-- perry

Mike said...

I really think the "threads are bad" mentality comes from the poor (as in non-existent) support for concurrent programming in C/C++. "Threads" are a poor concurrency model compared to (for example) protected objects and tasks in Ada. I have done concurrent programming in Ada83 and Ada95 for years and the built-in support for concurrency in these languages makes development of multithreaded code much easier, IMO, than using threads, condition variables, mutexes etc in C. Sure, you could argue that on Unix systems Ada tasks are in fact implemented as threads and that rendezvous mechanisms use condition variables and mutexes, but so what? As an Ada programmer, I don't have to deal with all those low-level objects, all of which have different APIs. Instead I can work with the single tasking model API in the language.

Other languages might have concurrent programming support that is nice too (Occam per article), but Ada is the only language I've worked in that had it. I hope Java and Ruby eventually add true concurrent programming support as well and ditch the C/C++ "thread" model.

Anne said...

Hi, this is a great post and sums up basically how I feel about threads (and a lot of other programming idioms too).

I came here in order to find a blog or message board that makes fun of bad c++ code. Not so much to critique it, but to say "Can you believe someone did this in production code? Ha ha funny!" Do you know of such a place?

Anonymous said...

""Can you believe someone did this in production code? Ha ha funny!" Do you know of such a place?"

The Daily WTF is your source for that kind of humor:

Anonymous said...

Coming from the JavaScript world and it's way too insulated WebWorker model, I appreciate threads so much more. There you have to send both the source and the data that you wish to run.

I have to fault the statement that the threads have access to everything in your application. Yes address space wise they do, and if you have a lot of singletons/globals on the threaded path, then you have trouble. But if you build and insulate the data well, then it's not as awedul as you would have readers believe.

Libdispatch atop a system thread pool balanced across all running apps is a really great threading experience. And threads aren't that bad if you name and track what they do.

Anonymous said...

you want that thread to do something different than what the main thread is already doing...

Maybe isolate the two things into separate programs/processes and have them interact using shared memory?

This is just as fast as the thread model but you only share exactly the state needed for the two to communicate.

Mark Bessey said...

The communication part is just as fast, but creating a thread is much faster than creating a process. But yes - this is definitely a way to do it.