Improvements to io.monitors; faster refresh-all
Friday, April 11, 2008
Factor’s io.monitors library previously supported Mac OS X, Windows and
Linux.
Now it also supports BSD, but in a much more restricted fashion than the
other platforms. Basically you cannot monitor directories, just
individual files. This is because kqueue() only provides very limited
functionality in this regard. However, having something is better than
nothing, and the functionality provided on BSDs is still useful for
monitoring log files and such.
On Linux, inotify doesn’t directly support monitoring recursive
directory hierarchies so Factor’s monitors didn’t support recursive
monitoring, but a mailing list post by Robert Love
discusses how to solve this issue in user-space. I used his solution to
implement recursive monitors on Linux.
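The idea behind the user-space approach is to add one watch per directory and to add further watches as new subdirectories show up. Here is a rough sketch of that idea in Python rather than Factor; it is not the code in io.monitors. The `add_watch` callback is a hypothetical stand-in for the real inotify call, and the class and method names are invented for illustration.

```python
import os

class RecursiveWatcher:
    """Tracks one (simulated) watch per directory under a root."""

    def __init__(self, root, add_watch):
        # add_watch is a hypothetical callback standing in for the real
        # inotify call; it takes a directory path and returns a watch ID.
        self.add_watch = add_watch
        self.watches = {}  # watch ID -> directory path
        self.watch_tree(root)

    def watch_tree(self, root):
        # Watches are per-directory, so walk the hierarchy and
        # register every directory that is not watched yet.
        for dirpath, _dirnames, _filenames in os.walk(root):
            if dirpath not in self.watches.values():
                self.watches[self.add_watch(dirpath)] = dirpath

    def on_create(self, watch_id, name):
        # Called when a watched directory reports a new entry. If the
        # entry is a directory, watch it and anything already inside it.
        path = os.path.join(self.watches[watch_id], name)
        if os.path.isdir(path):
            self.watch_tree(path)
        return path
```

Re-walking the new directory in `on_create` matters because files and subdirectories can be created inside it before its watch is in place.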
Another oddity relating to inotify is that if you add the same
directory twice to the same inotify, you get the same watch ID both
times, and events are only reported once. This means that the previous
implementation where there was one global inotify instance shared by all
monitors wasn’t really as general as one would hope, because you
couldn’t run two programs that monitor overlapping portions of the file
system. I thought of several possible fixes but in the end just changed
the monitors API to accommodate this case. All monitor operations must
now be wrapped in a with-monitors combinator. On Linux, it creates an
inotify instance and stores it in a dynamically-scoped variable, so that
subsequent calls to <monitor> use this inotify. Independent inotifies
in different threads don’t interact at all. On Mac OS X, BSD and
Windows, with-monitors just calls the quotation without doing any
special setup.
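To make the scoping concrete, here is a small Python model of the Linux behavior. It is only an analogy, not Factor code: `FakeInotify`, `with_monitors` and `make_monitor` are invented names, and a context variable plays the role of the dynamically-scoped variable.

```python
import contextvars
from contextlib import contextmanager

# Holds the notification instance for the current dynamic scope.
_current = contextvars.ContextVar("current_inotify")

class FakeInotify:
    """Stand-in for an inotify instance; hands out watch IDs."""

    def __init__(self):
        self.watch_ids = {}  # path -> watch ID

    def add_watch(self, path):
        # As with inotify, adding the same path twice to the same
        # instance returns the same watch ID.
        return self.watch_ids.setdefault(path, len(self.watch_ids) + 1)

@contextmanager
def with_monitors():
    # Bind a fresh instance for this scope; restore the old binding on exit.
    token = _current.set(FakeInotify())
    try:
        yield
    finally:
        _current.reset(token)

def make_monitor(path):
    # Uses whichever instance the enclosing with_monitors bound.
    # Raises LookupError when called outside of with_monitors.
    inotify = _current.get()
    return inotify, inotify.add_watch(path)
```

Two monitors on the same path inside one `with_monitors` block share a watch ID, while a separate block gets its own instance and so receives its own events.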
Another issue I fixed was that on Mac OS X, monitors would only work when used from the UI because no run loop was running otherwise. I made a run loop run all of the time and this allows monitors to work in the terminal listener.
Now that monitors are working better, I was able to use them to make
refresh-all faster. This word finds all changed source files in the
vocabulary roots and reloads them. It does this by comparing cached
CRC32 checksums with the actual checksum of the file. Previously it
would also compare modification times, but I took that code out because
filesystem meta-data queries got moved out of the VM and into the native
I/O code, which isn’t available during bootstrap. A side-effect of this
is that refresh-all became much slower, because it had to read all
files. Using monitors I was able to make this faster than it has ever
been. A thread waiting on a monitor is started on startup. Then, the
source tree only has to be checksummed in its entirety the first time
refresh-all is used in a session. Subsequently, only files for which
the monitor reported changes have to be scanned. So refresh-all runs
instantly if there are no changes.
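The caching scheme can be modeled in a few lines of Python. This is a sketch of the idea only, not the Factor implementation: the `Refresher` class and its method names are made up, deleted files are ignored, and a real version would need locking because the monitor thread and refresh-all both touch the set of changed paths.

```python
import zlib

def crc32_of(path):
    # CRC32 checksum of a file's contents.
    with open(path, "rb") as f:
        return zlib.crc32(f.read())

class Refresher:
    def __init__(self):
        self.checksums = {}       # path -> checksum cached at load time
        self.changed = set()      # paths reported by the monitor thread
        self.scanned_all = False  # has the full scan happened yet?

    def loaded(self, path):
        # Remember the checksum of a source file as it is loaded.
        self.checksums[path] = crc32_of(path)

    def monitor_reported(self, path):
        # Would be called from the thread waiting on the monitor.
        self.changed.add(path)

    def refresh_all(self):
        # First call: checksum every known file. Later calls: only
        # the files the monitor flagged since the previous call.
        if self.scanned_all:
            candidates = self.changed & set(self.checksums)
        else:
            candidates = set(self.checksums)
            self.scanned_all = True
        self.changed = set()
        stale = [p for p in sorted(candidates)
                 if crc32_of(p) != self.checksums[p]]
        for p in stale:
            self.loaded(p)  # stands in for actually reloading the file
        return stale
```

The checksum comparison is still done for flagged files, so a file that was touched but ends up with identical contents is not reloaded.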