fork() without exec() is dangerous in large programs (evanjones.ca)

[ 2016-August-16 20:42 ]

The UNIX fork() system call seems like an elegant way to process tasks in parallel. It creates a copy of the current process, allowing tasks to read the same memory, without needing fine-grained synchronization. Unfortunately, you need to use it extremely carefully in large programs because fork in a multi-threaded program can easily cause deadlocks in the child process. In the child process, only the thread that called fork continues running. Other threads no longer exist. If one of those threads was holding a lock, it will remain locked forever [1, 2, 3]. The next call to a function that needs a lock, like malloc, can deadlock waiting on a thread that is not running.
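
To make the failure mode concrete, here is a minimal Python sketch, assuming a POSIX system where os.fork is available. The function name lock_stays_locked_in_child and its timeout parameter are made up for illustration. A background thread takes a lock, the main thread forks, and the child tries to take the same lock with a timeout, so the demo reports the problem instead of hanging forever.

```python
import os
import threading


def lock_stays_locked_in_child(timeout=1.0):
    """Fork while another thread holds a lock.

    Returns True if the forked child could NOT acquire the lock.
    (Hypothetical helper, for illustration only.)
    """
    lock = threading.Lock()
    lock_taken = threading.Event()
    release = threading.Event()

    def holder():
        # Hold the lock until the parent tells us to let go.
        with lock:
            lock_taken.set()
            release.wait()

    thread = threading.Thread(target=holder)
    thread.start()
    lock_taken.wait()  # make sure the lock is held before forking

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread exists here. The holder thread
        # was not copied, so nothing will ever release this copy of the lock.
        exit_code = 2
        try:
            acquired = lock.acquire(timeout=timeout)
            exit_code = 0 if acquired else 1
        finally:
            os._exit(exit_code)  # never return into the caller's code

    # Parent: collect the child's verdict, then let the holder finish.
    _, status = os.waitpid(pid, 0)
    release.set()
    thread.join()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 1
```

Without the timeout, the child's acquire call would simply block forever, which is the hang described above.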

Most large programs today use threads, either intentionally or accidentally, since many libraries use threads without making it obvious. This makes it very easy for someone to add threads to some part of the program, which can then cause rare hangs in forked child processes in some other part, far removed from where the threads were added. Chrome, which intentionally uses multiple processes and threads, has run into this issue, which I think is good evidence that it is hard to get right, even if you are aware this is a potential problem.

At Bluecore, we ran into this accidentally. We don't explicitly use threads or fork anywhere, but some libraries and tools that we use do. The original problem was that our test suite started hanging on Mac OS X. It turns out that many Mac OS X system libraries use libdispatch (a.k.a. Grand Central Dispatch) for inter-process communication with system processes. In an attempt to avoid rare and hard-to-reproduce bugs, libdispatch explicitly "poisons" the child process when forking, causing the next use to crash. This caused our unit test suite to hang, waiting for children that were dead. In this article, I'll describe how you can safely use fork if you really must, then dive into this specific Python crash and explain why it really has no workaround.

How to use fork safely

I have three suggestions, in priority order:

1. Only use fork to immediately call exec (or just use posix_spawn). This is the least error-prone. However, you really do need to immediately call exec. Most other functions, including malloc, are unsafe. If you do need to do some complex work, you must do it in the parent, before the fork (example). Others support this opinion. Using posix_spawn is even better, since it is more efficient and more explicit.
2. Fork a worker at the beginning of your program, before there can be other threads. You then tell this worker to fork additional processes. You must ensure that nothing accidentally starts a thread before this worker is started. This is actually complicated in large C++ programs where static constructors run before the main function (e.g. the Chrome bug I mentioned above). To me, this doesn't seem to have many advantages over calling exec() on the same binary.
3. Only use fork in toy programs. The challenge is that successful toy programs grow into large ones, and large programs eventually use threads. It might be best just to not bother.
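
As a rough illustration of the first suggestion in Python, the sketch below starts a fresh program with os.posix_spawn instead of continuing to run in a forked copy of the parent. It assumes Python 3.9 or later on a POSIX system. The helper names spawn_and_wait and run_python_snippet are invented for this example.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start argv[0] as a brand new program and return its exit code.

    (Hypothetical helper.) The child never runs any of the parent's
    Python code, so it cannot trip over locks inherited from the parent.
    """
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


def run_python_snippet(code):
    """Run a short piece of Python in a separate, freshly started interpreter."""
    return spawn_and_wait([sys.executable, "-c", code])
```

For example, run_python_snippet("import sys; sys.exit(3)") returns 3. In everyday Python code, subprocess.run is the usual higher-level way to get the same start-a-fresh-program behaviour.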

Hanging Python unit tests

When I ran into this problem, I was just trying to run all of Bluecore's unit tests on my Mac laptop. We use nose's multiprocess mode, which uses Python's multiprocessing module to utilize multiple CPUs. Unfortunately, the tests hung, even though they passed on our Linux test server. I figured we had some trivial bug, so I ran subsets of our tests until I isolated the problem to a single directory. Strangely, each test worked when run by itself. It was only when I ran the whole directory with nose that it got stuck. It was the --parallel=4 option, which uses multiple processes, that caused the hang. The parent process was waiting for a message from a child worker. Looking at the child processes using ps showed that the worker had already exited. After adding a ton of print messages, I found that the child process was actually crashing.
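
This kind of hang, a parent waiting for a message from a child that is already dead, can be avoided if the parent is able to notice the child's death. The sketch below is not how nose or multiprocessing work internally. It is a single-threaded toy, assuming Python 3.9 or later on a POSIX system, with a made-up helper name wait_for_child_message. It relies on the fact that reading from a pipe returns end-of-file once every copy of the write end is closed, which happens automatically when the child dies.

```python
import os
import signal


def wait_for_child_message(child_main):
    """Fork a child, read everything it writes, and return (message, exit_code).

    (Hypothetical helper.) child_main receives the pipe's write descriptor.
    A child killed by a signal is reported as a negative exit code.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        exit_code = 1
        try:
            os.close(read_fd)
            child_main(write_fd)
            exit_code = 0
        finally:
            os._exit(exit_code)  # never return into the caller's code

    # The parent must close its own copy of the write end. Otherwise the
    # read below would never see end-of-file, even after the child died.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        message = reader.read()  # returns at EOF, i.e. when the child is gone
    _, status = os.waitpid(pid, 0)
    return message, os.waitstatus_to_exitcode(status)


def crash(_fd):
    """Simulate a crashing worker by killing the current process."""
    os.kill(os.getpid(), signal.SIGKILL)
```

Calling wait_for_child_message(crash) returns an empty message and a negative exit code instead of blocking, so the parent can report the dead worker rather than wait on it.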

Python crash

It should not be possible to crash the Python interpreter, so now I was really interested. I didn't really know where to start, so I decided to use "brute force" and simplify the crash as much as possible. I removed pieces of the test and checked if it still crashed. After a few hours, I managed to reduce it to a program which crashes Python 2.7 and 3.4 on Mac OS X, by using sqlite3 in the parent and urllib2 in the child. For future Google searchers, the OS X crash report contains "crashed on child side of fork pre-exec" and it crashes in _dispatch_async_f_slow in libdispatch.dylib.

Searching for fork, libdispatch and the crash message eventually led me to an article about a similar bug, which gave me the clue that not calling urllib2 in the parent avoided the crash. This gave me the workaround I needed to fix the unit tests, but did not really explain why it was crashing. I eventually ended up digging through the source to libdispatch, which Apple doesn't make easy to find. I searched the code for "fork", which led me to the following function (with minor reformatting for conciseness):

void dispatch_atfork_child(void) {
  void *crash = (void *)0x100;

  /* ... unnecessary stuff removed ... */
  _dispatch_child_of_unsafe_fork = true;

  _dispatch_main_q.dq_items_head = crash;
  _dispatch_main_q.dq_items_tail = crash;
  /* ... more stuff set to = crash ... */
}

This shows that libdispatch registers a function to be called after fork, and uses it to explicitly crash if the child uses it. Presumably the authors thought it was better to reliably crash rather than run into extremely rare hangs or other unreliable behaviour.
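
The same "poison the child" pattern can be sketched in Python with os.register_at_fork (available since Python 3.7), which plays a role similar to an at-fork hook in C. This is only an analogy and not libdispatch's code. The names _child_of_fork, use_resource and try_in_forked_child are invented for the example, and it raises an exception where libdispatch deliberately crashes.

```python
import os

# Set to True only in a process created by fork (hypothetical flag).
_child_of_fork = False


def _mark_child():
    global _child_of_fork
    _child_of_fork = True


# Ask Python to run _mark_child in the child after every os.fork().
os.register_at_fork(after_in_child=_mark_child)


def use_resource():
    """Stand-in for a library call that is unsafe in a forked child."""
    if _child_of_fork:
        raise RuntimeError("used on child side of fork without exec")
    return "ok"


def try_in_forked_child():
    """Fork, call use_resource() in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        exit_code = 1
        try:
            use_resource()
            exit_code = 0
        except RuntimeError:
            exit_code = 42  # the poisoned call failed loudly, as intended
        finally:
            os._exit(exit_code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

In the parent, use_resource() keeps returning "ok", while try_in_forked_child() returns 42 because the hook has marked the child. Failing on every use in the child is easier to diagnose than a hang that only shows up occasionally.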

So why are the Python urllib2 and sqlite3 modules using libdispatch? The stack trace from lldb shows that urllib2 reads the Mac OS X system proxy settings using SCDynamicStoreCopyProxies, which calls _CFPrefsWithDaemonConnection, which then calls a library named libxpc.so and finally libdispatch. I'm guessing from the names that the preferences are loaded by communicating with another process. For SQLite, the problem only occurs when using libsqlite3.dylib that comes with Mac OS X. If you build your own version of SQLite, it doesn't happen. I'm not exactly sure what Apple's version is doing, but poking around with lldb shows that it calls dispatch_async from sqlite3_initialize, using the libdispatch main queue. I suspect this is because Apple's Core Data API uses SQLite, so this may be something to support UI applications.

Fixing the problem in Python

I filed a bug on the Python bug tracker, in an attempt to fix this in Python itself. Unfortunately, there doesn't appear to be a good way to solve it. We could make some changes to avoid the issue specifically with urllib2 and sqlite3, but searching the Internet shows there are other ways to cause this crash, such as using fork in a Python program that has a GUI. Unfortunately, this means it is going to go unresolved, and anything that uses multiprocessing on Mac OS X could cause this crash. In general, it seems best to avoid fork entirely.