Better static safety for higher-order code
Friday, March 20, 2009
Recently Daniel Ehrenberg and I implemented a new language feature which speeds up code calling that calls quotations which the compiler cannot inline.
Thew new language feature takes the form of syntax which is based on the
call
and execute
words from the kernel
vocabulary. These words
call and execute a quotation, respectively. The new syntax allows for a
stack effect declaration to be made at the call site. Suppose you have a
hashtable of quotations in the dynamic variable foo
, and you want to
call the quotation stored at a given key. The traditional way is to use
the call
word:
"A key" foo get at call
Using the new call(
syntax, you can call it but also assert that the
quotation has a certain stack effect. For example,
"A key" foo get at call( a -- b )
The execute(
syntax generalizes execute
in a similar way. For
example, here is some code from the parser which invokes parsing words.
First it checks that the word was not defined in the same file (because
if it was, it wouldn’t be compiled yet); then it executes it using
execute(
to ensure it has exactly one input and one output:
ERROR: staging-violation word ;
: execute-parsing ( accum word -- accum )
dup changed-definitions get key? [ staging-violation ] when
execute( accum -- accum ) ;
The basic idea is that the code that uses call(
and execute(
is
statically checked to ensure that the right number of parameters are
passed around between words; and at runtime, the actual quotation or
word is checked and if its stack effect is not what the code assumed
statically, an error is thrown.
The call(
and execute(
words are parsing words which call
parse-effect
to read a stack effect object. This is the same
underlying facility that the (
word uses. After parsing an effect,
call(
and execute(
are transformed into calls to call-effect
and
execute-effect
. So the following,
"A key" foo get at call( a -- b )
parses as, and is equivalent to
"A key" foo get at (( a -- b )) call-effect
Where call-effect
should be thought of as “call a quotation with this
effect”.
These words complement call
and execute
. The call
and execute
words benefit from the compiler’s lambda lifting optimization. If a
literal quotation, the result of curry
and compose
, or a word, is
only ever passed “downward”, then the compiler will inline it at the
call site of call
and execute
, effectively converting your program
into a first-order program. This means that all kinds of iteration
patterns become simple loops. Even if the iteration combinators are
expressed as compositions of higher-order functions, where the quotation
might be passed between several words and be operated on with curry
and compose
, things still work out pretty well.
However, not all code is first-order, and call(
and execute(
allow
more code to be written in a style where Factor’s stack checker can
infer the stack effect. Whereas before, some 80% of the words in the
library would have an inferred stack effect, now the figure is closer to
100%, after introducing the new language feature and refactoring code to
use it.
The concept of compiler warnings always confused Factor beginners. In Factor, a compiler warning indicates that the compiler was unable to infer the stack effect of a word; this inhibits most optimizations because no assumptions about the positions of values on the stack can be made. So if you care about the performance of your code, it was good to write it in a way such that the stack effect would infer, but it wasn’t possible to write code like this in all situations. Now it is, and compiler warnings are very rare; loading the Furnace web framework now produces less than 10 whereas before it would be a few hundred.
The implementation is pretty interesting. The call(
parsing word
expands into a call to the call-effect
macro. The macro expects that
the stack effect is a literal compile-time value. The macro expansion is
specialized for calling a quotation with this effect. The quotation
itself is only known at runtime, and so the code generated by the macro
receives it as a parameter. It uses three strategies:
- Inline cache hit: If the quotation is the same as last time,
call it without any further runtime checks using the
call-effect-unsafe
primitive. - Inline cache miss: If the quotation is different from last time,
then it checks the stack effect of the quotation against the
declaration. The stack effect is computed lazily and stored in a
slot in the quotation. If the declaration matches, the quotation is
called using
call-effect-unsafe
. - Slow case: if the quotation’s stack effect cannot be inferred, a snapshot of the data stack is taken, the quotation is called, and the stack is compared to make sure the quotation didn’t use more parameters than declared.
The execute(
parsing word expands into a call of the execute-effect
macro, which uses three similar strategies:
- Inline cache hit: If the word is the same as last time, call it
without any further runtime checks using the
execute-effect-unsafe
primitive. - Inline cache miss: If the word is different from last time, then
it checks the stack effect of the word against the declaration.
Stack effects of words are always available, because they are
computed by the compiler. If the declaration matches, the quotation
is called using
execute-effect-unsafe
. - Slow case: if the word’s stack effect is not known, then it
creates a quotation with the word in it, and calls
call-effect
on it.
In most cases, the slow case should never be hit, and even if it is, the
runtime checks are quite efficient. The inline caching ensures that if
call(
is used to invoke a quotation in a loop, it only has to check
its effect once.
Factor is heading in an interesting direction. It generally offers more static checking and optimization than other dynamic languages. Here are the main differences; all of these things that Factor does at compile time, other dynamic languages leave until runtime:
- In Factor, calls to undefined words are parse-time errors. In most dynamic languages, a typo in a method name would not be caught until run time.
- In Factor, most code passes the stack effect checker. This means that calling a word with the wrong number of parameters is impossible.
- In Factor, most uses of higher-order functions are inlined by the compiler, and escape analysis eliminates most closure allocations. So there are less blocks being passed around at runtime, and less allocation.
However, the language is still dynamically typed; the types of values are not known until runtime (except when the compiler can infer them), and there is a lot of generic dispatch and ad-hoc polymorphism in the library.
Now call(
and execute(
will speed up code which cannot take
advantage of the last optimization. Indeed, with inline caching, calling
quotations a loop with call(
is almost as fast as if everything has
been inlined.