Generating minimal images; deploying stand-alone applications

Saturday, June 9, 2007

First of all, an update on the new module system. The porting of the core library is almost complete, and in fact the only remaining module I still need to update is the Cocoa UI backend! In the last few days I’ve been using the X11 UI backend to test things on my Mac. There’s no technical difficulty preventing the Cocoa binding from being ported; it is just a matter of priorities.

I still haven’t pushed the module system patches to the main repository. For now, you can grab the experimental repository:

http://factorcode.org/experimental/ - darcs repository
http://factorcode.org/images/poo/ - boot images

Now that the module system is relatively stable, I’ve been able to attack one of the major features planned for Factor 1.0.

Factor’s images are quite large these days. A fully-loaded 0.90 image with the compiler, UI and tools is 9.4 Mb on Mac OS X/PowerPC. This has two undesirable consequences:

On resource-constrained systems, such as cellphones running Windows CE, a full image might use too much memory, or not even start at all.
Stand-alone application deployment becomes impractical. This is “the Java problem”: the user must endure a lengthy download and install a bulky runtime environment before being able to use any software written in that language.

In the latest sources, I have addressed the first issue by adding a couple of new command line switches to the bootstrap process. The two switches are -include and -exclude; each takes a list of component names, and together they determine what is loaded into the final image. The default value for -include is

compiler help tools ui ui.tools io

The default value for -exclude is empty. During bootstrap, all components appearing in the included list but not in the excluded list are loaded. So for example, an image with the compiler only can be bootstrapped as follows:

./factor -i=boot.image.x86 -include=compiler

If you would like all of the default components except native I/O:

./factor -i=boot.image.x86 -exclude=io
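
The selection rule described above amounts to a set difference that keeps the order of the included list. Here is a minimal sketch of that rule in Python; it is only a model of the behaviour as described, not the bootstrap code itself, and the function name components_to_load is made up for illustration.

```python
# Default -include list, as given in the post.
DEFAULT_INCLUDE = ["compiler", "help", "tools", "ui", "ui.tools", "io"]

def components_to_load(include=None, exclude=None):
    """Return components in the included list that are not excluded.

    Hypothetical helper: models the rule described in the post,
    with -exclude defaulting to empty.
    """
    included = DEFAULT_INCLUDE if include is None else include
    excluded = set(exclude or [])
    # Keep the order of the included list.
    return [name for name in included if name not in excluded]

# Mirrors -include=compiler
print(components_to_load(include=["compiler"]))
# Mirrors -exclude=io
print(components_to_load(exclude=["io"]))
```

The first call yields just the compiler; the second yields every default component except io.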

Here are some image sizes on PowerPC:

Options	Size
Minimal	1.2 Mb
Compiler only	4.4 Mb
Compiler, tools	4.6 Mb
Everything except for the UI	6.0 Mb
Everything	9.4 Mb

This additional flexibility at bootstrap time allows one to develop code in resource-constrained environments. However, it won’t do you any good if you want to deploy a graphical application written in Factor; the UI image is 9.4 Mb.

Factor images include a lot of information that is only useful for developers, such as cross-referencing data. Furthermore, a typical application only uses a fraction of the functionality in the image; most would never need to invoke the parser at run time, for example.

The standard solution for deployment in the Lisp world is the “tree shaker”, which creates an image containing only those functions referenced from a specified “main” entry point. I decided to give this approach a go. The tree shaker clears out most global variables; for example, the variable holding the dictionary of words, as well as cross-referencing data. It also clears out word names and other such data. It constructs a startup routine which initializes Factor and then executes the main entry point of a specified module. Then, the garbage collector is run. This has the effect of reclaiming all words that are not reachable, directly or indirectly, from the main entry point. After this operation completes, the image is saved and Factor exits. The result is a stripped-down image which is much smaller than the default development image.
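
To make the idea concrete, here is a toy model of tree shaking in Python. It is a sketch under stated assumptions, not how the Factor tool is written: the real tool clears global roots and lets the garbage collector do the work, whereas this model walks the reference graph explicitly. The shake function and the word names in the example image are invented for illustration.

```python
def shake(words, main):
    """Keep only the words reachable from the main entry point.

    `words` maps each word name to the list of word names it references.
    Hypothetical model: once the dictionary no longer acts as a root,
    only words reachable from `main` survive collection.
    """
    reachable = set()
    stack = [main]
    while stack:
        word = stack.pop()
        if word in reachable:
            continue
        reachable.add(word)
        # Follow references transitively, not just direct ones.
        stack.extend(words.get(word, []))
    return {name: refs for name, refs in words.items() if name in reachable}

# A made-up image: the parser words are never referenced from main.
image = {
    "main": ["draw-bunny"],
    "draw-bunny": ["gl-begin"],
    "gl-begin": [],
    "parse-file": ["scan-token"],
    "scan-token": [],
}

stripped = shake(image, "main")
print(sorted(stripped))  # ['draw-bunny', 'gl-begin', 'main']
```

Note that gl-begin survives even though main does not reference it directly; reachability is transitive, which is why the parser words disappear only if nothing in the application calls them.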

The tree-shaker is used as follows:

USE: tools.deploy
deploy-ui? on
"bunny.image" "demos.bunny" deploy

There are various variables one can set (see the source) before calling the deploy word, which takes an image name and a module name as input.

Here are some figures for the deployed image size of various demonstration programs shipped with Factor:

Program	Image size
Hello world (interpreted)	81 Kb
Hello world (compiled)	283 Kb
Hello world (graphical)	680 Kb
Maze demo	687 Kb
Bunny demo	824 Kb
Contributors	557 Kb

Note that in addition to the image size, there is also the 150 Kb Factor VM.

All these programs are quite trivial; however, some of them do pull in non-trivial dependencies (UI, XML, HTTP). Also, the tree shaker is not as good as it could be; in the future these images will become smaller.

The tree shaker is also configurable; if you want to, you can leave the parser and dictionary intact, allowing runtime source modification and interactive inspection of your application. However, this negates most of the space savings.

This code is not really in a usable state right now, and needs a lot of polishing and debugging. However, it is a good first step.

So far, the deploy tool doesn’t directly address the issue of generating stand-alone, double-clickable binaries. On Mac OS X, creating a .app bundle consisting of the Factor VM and image is very easy, and I will write a tool in Factor which does this; it will also be able to emit the XML property list file and therefore allow you to customize the application name, icon, etc. On other platforms, I’m still not sure how to proceed; it might be good enough to spit out a shell script for Unix and a batch file for Windows.
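
As a rough illustration of what such a bundling tool would have to do, here is a sketch in Python rather than Factor. It lays out a standard .app directory and writes an XML Info.plist with the standard library’s plistlib. The function name make_app_bundle is hypothetical, and placing the image under Contents/Resources is an assumption; how the VM would locate its image inside the bundle is not settled by anything above.

```python
import plistlib
import shutil
from pathlib import Path

def make_app_bundle(dest_dir, app_name, vm_path, image_path, icon_file=None):
    """Create <app_name>.app containing a VM binary and an image.

    Hypothetical sketch; the location of the image is an assumption.
    """
    bundle = Path(dest_dir) / f"{app_name}.app"
    macos = bundle / "Contents" / "MacOS"
    resources = bundle / "Contents" / "Resources"
    macos.mkdir(parents=True, exist_ok=True)
    resources.mkdir(parents=True, exist_ok=True)

    # The VM becomes the bundle executable; shutil.copy keeps its mode bits.
    shutil.copy(vm_path, macos / app_name)
    shutil.copy(image_path, resources / f"{app_name}.image")

    # Standard property list keys for name, executable and bundle type.
    info = {
        "CFBundleName": app_name,
        "CFBundleExecutable": app_name,
        "CFBundlePackageType": "APPL",
    }
    if icon_file is not None:
        icon_name = Path(icon_file).name
        shutil.copy(icon_file, resources / icon_name)
        info["CFBundleIconFile"] = icon_name

    # plistlib.dump writes the XML property list format by default.
    with open(bundle / "Contents" / "Info.plist", "wb") as f:
        plistlib.dump(info, f)
    return bundle
```

Given a VM binary named factor and a deployed image named bunny.image in the current directory, make_app_bundle(".", "Bunny", "factor", "bunny.image") would produce Bunny.app with the layout above.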