Improved process launcher
Monday, November 12, 2007
The io.launcher vocabulary now supports passing environment variables to the child process.
Launching processes is a complex and inherently OS-specific task. For
example, until version 1.5, Java had a bunch of methods in the Runtime
class for this, each one taking a different set of arguments:
Process exec(String command)
Process exec(String[] cmdarray)
Process exec(String[] cmdarray, String[] envp)
Process exec(String[] cmdarray, String[] envp, File dir)
Process exec(String command, String[] envp)
Process exec(String command, String[] envp, File dir)
The fundamental limitation here was that it would always launch processes with stdin/stdout rebound to a new pipe, so there was no easy way to launch a process that reads and writes the JVM’s own stdin/stdout file descriptors; you had to spawn two threads to copy streams. In 1.5, they added a new ProcessBuilder class, but it doesn’t seem to offer any new functionality except for merging stderr into stdout. It still always opens pipes.
The approach I decided to take in Factor is similar to the ProcessBuilder approach, but more lightweight. The run-process word takes one of the following types:
- A string – in which case it is passed to the system shell as a single command line
- A sequence of strings – which becomes the command line itself
- An association – a “process descriptor”, containing keys from the below set.
The last of these is the most general. The allowed keys are all symbols in the io.launcher vocabulary:
- +command+ – the command to spawn, a string, or f
- +arguments+ – the command to spawn, a sequence of strings, or f. Only one of +command+ and +arguments+ can be specified.
- +detached+ – t or f. If t, run-process won’t wait for completion.
- +environment+ – an assoc of additional environment variables to pass along.
- +environment-mode+ – one of prepend-environment, append-environment, or replace-environment. This value controls the function used to combine the current environment with the value of +environment+ before passing it to the child process.
While run-process spawns a process which inherits the current one’s stdin/stdout, <process-stream> spawns a process reading and writing on a pipe.
Idiomatic usage of run-process uses make-assoc to build the assoc:

    [
        { "ls" "/etc" } +arguments+ set
        H{ { "PATH" "/bin:/sbin" } } +environment+ set
    ] H{ } make-assoc run-process
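For comparison, here is a minimal sketch of the same command in each of the three forms that run-process accepts, with the descriptor form also asking for a detached launch. It uses only words mentioned in this post; the USING: line is an assumption about which vocabularies need to be in the search path.

```factor
! Assumption: set and make-assoc live in namespaces.
USING: kernel io.launcher namespaces ;

! 1. A string: handed to the system shell as a single command line.
"ls /etc" run-process

! 2. A sequence of strings: used as the command line itself.
{ "ls" "/etc" } run-process

! 3. A process descriptor built with make-assoc. With +detached+ set
!    to t, run-process does not wait for ls to complete.
[
    { "ls" "/etc" } +arguments+ set
    t +detached+ set
] H{ } make-assoc run-process
```

The first two forms are shorthand; anything beyond the command line itself (detaching, environment variables) needs the descriptor form.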
I’m happy with the new interface. With two words it achieves more than Java’s process launcher API, which consists of a number of methods together with a class.
Not only is the interface simple, but so is the implementation. The run-process word first converts strings and arrays into full-fledged descriptors, then calls run-process*, which is an OS-specific hook.
In the implementation of this hook, the fact that any assoc can be treated as a dynamically scoped namespace really comes into play. The io.launcher implementation has a pair of words:
    : default-descriptor
        H{
            { +command+ f }
            { +arguments+ f }
            { +detached+ f }
            { +environment+ H{ } }
            { +environment-mode+ append-environment }
        } ;

    : with-descriptor ( desc quot -- )
        default-descriptor [ >r clone r> bind ] bind ; inline
Passing a descriptor and a quotation to with-descriptor calls the quotation in a scope where the various +foo+ keys can be read, assuming their default values if they’re not explicitly set in the descriptor.
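As a small illustration of that fallback, the sketch below binds a descriptor that sets only +arguments+ and then reads three keys. It relies on with-descriptor and default-descriptor exactly as defined above; the use of . from the prettyprint vocabulary to print each value is an addition for illustration only.

```factor
! Assumption: . (print the top of the stack) comes from prettyprint.
USING: kernel io.launcher namespaces prettyprint ;

! Only +arguments+ is set explicitly in this descriptor.
H{ { +arguments+ { "ls" "/etc" } } }
[
    +arguments+ get .          ! found in the descriptor itself
    +detached+ get .           ! not set, so read from default-descriptor (f)
    +environment-mode+ get .   ! not set, so read from default-descriptor
] with-descriptor
```

Because the descriptor is bound inside the scope of default-descriptor, a lookup that misses in the descriptor continues outward to the defaults.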
So, if we look at the Unix implementation for example,
    M: unix-io run-process* ( desc -- )
        [
            +detached+ get [
                spawn-detached
            ] [
                spawn-process wait-for-process
            ] if
        ] with-descriptor ;
It calls various words, all of which simply use get to read a dynamically scoped variable. These are resolved in the namespace set up by with-descriptor. For example, there is a word:
    : get-environment ( -- env )
        +environment+ get
        +environment-mode+ get {
            { prepend-environment [ os-envs union ] }
            { append-environment [ os-envs swap union ] }
            { replace-environment [ ] }
        } case ;
It retrieves the environment set by the user, together with the environment of the current process, and composes them according to the function in +environment-mode+. There is no need to pass anything around on the stack; any word can call get-environment from inside a with-descriptor.
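From the caller’s side, choosing a mode is just one more key in the descriptor. The sketch below asks for replace-environment, which in get-environment above is the branch that passes +environment+ through untouched, so the child should see only the variables listed. It assumes that replace-environment is a symbol that pushes itself when executed, and that /usr/bin/env exists on the system; the absolute path is used so the launch does not depend on how the command is looked up.

```factor
! Assumption: replace-environment is a symbol exported by io.launcher.
USING: kernel io.launcher namespaces ;

! Run env with nothing but the variables given in +environment+.
[
    { "/usr/bin/env" } +arguments+ set
    H{ { "PATH" "/bin:/sbin" } } +environment+ set
    replace-environment +environment-mode+ set
] H{ } make-assoc run-process
```

Leaving +environment-mode+ out falls back to append-environment, the default in default-descriptor.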
Notice that constructing a namespace with make-assoc, and then passing it to a word which binds to this namespace and builds some kind of object, is similar to the “builder design pattern” in OOP. However, it is simpler. Instead of defining a new data type with methods, we essentially just construct a namespace then run code in this namespace.