Better static safety for higher-order code

Friday, March 20, 2009

Recently Daniel Ehrenberg and I implemented a new language feature which speeds up code calling that calls quotations which the compiler cannot inline.

Thew new language feature takes the form of syntax which is based on the call and execute words from the kernel vocabulary. These words call and execute a quotation, respectively. The new syntax allows for a stack effect declaration to be made at the call site. Suppose you have a hashtable of quotations in the dynamic variable foo, and you want to call the quotation stored at a given key. The traditional way is to use the call word:

"A key" foo get at call

Using the new call( syntax, you can call it but also assert that the quotation has a certain stack effect. For example,

"A key" foo get at call( a -- b )

The execute( syntax generalizes execute in a similar way. For example, here is some code from the parser which invokes parsing words. First it checks that the word was not defined in the same file (because if it was, it wouldn’t be compiled yet); then it executes it using execute( to ensure it has exactly one input and one output:

ERROR: staging-violation word ;

: execute-parsing ( accum word -- accum )
    dup changed-definitions get key? [ staging-violation ] when
    execute( accum -- accum ) ;

The basic idea is that the code that uses call( and execute( is statically checked to ensure that the right number of parameters are passed around between words; and at runtime, the actual quotation or word is checked and if its stack effect is not what the code assumed statically, an error is thrown.

The call( and execute( words are parsing words which call parse-effect to read a stack effect object. This is the same underlying facility that the ( word uses. After parsing an effect, call( and execute( are transformed into calls to call-effect and execute-effect. So the following,

"A key" foo get at call( a -- b )

parses as, and is equivalent to

"A key" foo get at (( a -- b )) call-effect

Where call-effect should be thought of as “call a quotation with this effect”.

These words complement call and execute. The call and execute words benefit from the compiler’s lambda lifting optimization. If a literal quotation, the result of curry and compose, or a word, is only ever passed “downward”, then the compiler will inline it at the call site of call and execute, effectively converting your program into a first-order program. This means that all kinds of iteration patterns become simple loops. Even if the iteration combinators are expressed as compositions of higher-order functions, where the quotation might be passed between several words and be operated on with curry and compose, things still work out pretty well.

However, not all code is first-order, and call( and execute( allow more code to be written in a style where Factor’s stack checker can infer the stack effect. Whereas before, some 80% of the words in the library would have an inferred stack effect, now the figure is closer to 100%, after introducing the new language feature and refactoring code to use it.

The concept of compiler warnings always confused Factor beginners. In Factor, a compiler warning indicates that the compiler was unable to infer the stack effect of a word; this inhibits most optimizations because no assumptions about the positions of values on the stack can be made. So if you care about the performance of your code, it was good to write it in a way such that the stack effect would infer, but it wasn’t possible to write code like this in all situations. Now it is, and compiler warnings are very rare; loading the Furnace web framework now produces less than 10 whereas before it would be a few hundred.

The implementation is pretty interesting. The call( parsing word expands into a call to the call-effect macro. The macro expects that the stack effect is a literal compile-time value. The macro expansion is specialized for calling a quotation with this effect. The quotation itself is only known at runtime, and so the code generated by the macro receives it as a parameter. It uses three strategies:

Inline cache hit: If the quotation is the same as last time, call it without any further runtime checks using the call-effect-unsafe primitive.
Inline cache miss: If the quotation is different from last time, then it checks the stack effect of the quotation against the declaration. The stack effect is computed lazily and stored in a slot in the quotation. If the declaration matches, the quotation is called using call-effect-unsafe.
Slow case: if the quotation’s stack effect cannot be inferred, a snapshot of the data stack is taken, the quotation is called, and the stack is compared to make sure the quotation didn’t use more parameters than declared.

The execute( parsing word expands into a call of the execute-effect macro, which uses three similar strategies:

Inline cache hit: If the word is the same as last time, call it without any further runtime checks using the execute-effect-unsafe primitive.
Inline cache miss: If the word is different from last time, then it checks the stack effect of the word against the declaration. Stack effects of words are always available, because they are computed by the compiler. If the declaration matches, the quotation is called using execute-effect-unsafe.
Slow case: if the word’s stack effect is not known, then it creates a quotation with the word in it, and calls call-effect on it.

In most cases, the slow case should never be hit, and even if it is, the runtime checks are quite efficient. The inline caching ensures that if call( is used to invoke a quotation in a loop, it only has to check its effect once.

Factor is heading in an interesting direction. It generally offers more static checking and optimization than other dynamic languages. Here are the main differences; all of these things that Factor does at compile time, other dynamic languages leave until runtime:

In Factor, calls to undefined words are parse-time errors. In most dynamic languages, a typo in a method name would not be caught until run time.
In Factor, most code passes the stack effect checker. This means that calling a word with the wrong number of parameters is impossible.
In Factor, most uses of higher-order functions are inlined by the compiler, and escape analysis eliminates most closure allocations. So there are less blocks being passed around at runtime, and less allocation.

However, the language is still dynamically typed; the types of values are not known until runtime (except when the compiler can infer them), and there is a lot of generic dispatch and ad-hoc polymorphism in the library.

Now call( and execute( will speed up code which cannot take advantage of the last optimization. Indeed, with inline caching, calling quotations a loop with call( is almost as fast as if everything has been inlined.