Factor Language Blog

Two things every Unix developer should know

Friday, September 3, 2010

Unix programming can be tricky. There are many subtleties many developers are not aware of. In this post, I will describe just two of them… my favorite Unix quirks, if you will.

Interruptible system calls

On Unix, any system call which blocks can potentially fail with an errno of EINTR, which indicates that the caller must retry the system call. The EINTR error can be raised at any time for any reason, so essentially every I/O operation on a Unix system must be prepared to handle this error properly. Surprisingly to some, this includes the C standard library functions such as fread(), fwrite(), and so on.

For example, if you are writing a network server, then most of the time, you want to ignore the SIGPIPE signal which is raised when the client closes its end of a socket. However, this ignored signal can cause some pending I/O in the server to return EINTR.

A commonly-held belief is that setting the SA_RESTART flag with the sigaction() system call means that if that signal is delivered, system calls are restarted for you and EINTR doesn’t need to be handled. Unfortunately this is not true. The reason is that certain signals are unmaskable. For instance, on Mac OS X, if your process is blocking reading on standard input, and the user suspends the program by sending it SIGSTOP (usually by pressing ^Z in the terminal), then upon resumption, your read() call will immediately fail with EINTR.

Don’t believe me? The Mac OS X cat program is not actually interrupt-safe, and has this bug. Run cat with no arguments in a terminal, press ^Z, then type %1, and you’ll get an error from cat!

$ cat
^Z
[1]+  Stopped                 cat
$ %1
cat
cat: stdin: Interrupted system call
$

As far as I’m aware, Factor properly handles interruptible system calls, and has for a while now, thanks to Doug Coleman explaining the issue to me 4 years ago.

Subprocesses inherit semi-random things from the parent process

When you fork() your process, various things are copied from the child to the parent; environment variables, file descriptors, the ignored signal mask, and so on. Less obvious is the fact that exec() doesn’t reset everything. If shared file descriptors such as stdin and stdout were set to non-blocking in the parent, the child will start with these descriptors non-blocking also, which will most likely break most programs. I’ve blogged about this problem before.

A similar issue is that if you elect to ignore certain signals with the SIG_IGN action using sigaction(), then subprocesses will inherit this behavior. Again, this can break processes. Until yesterday, Factor would ignore SIGPIPE using this mechanism, and child processes spawned with the io.launcher vocabulary that expected to receive SIGPIPE would not work properly. There are various workarounds; you can reset the signal mask before the exec() call, or you can do what I did in Factor, and ignore the signal by giving it an empty signal handler instead of a SIG_IGN action.