GC bug
Wednesday, June 14, 2006
I found and fixed a bug in the garbage collector today. It seems to have been introduced when quotations became arrays. I triggered it with the following steps:
- Running ’tests’ in the UI
- Redefining a core word
- Recompiling everything
- Invoking full-gc
At this point the runtime crashed with a memory corruption error. Further investigation revealed that some code was holding on to the address of an object which appeared to have been moved by the GC. Several hours later, I uncovered the suspect code.
When a callback is called, the current interpreter state is saved in a stack_state struct, and these structures are chained. This includes a
copy of all stacks, and the currently executing quotation. This is
because each nested callback runs with its own isolated data and call
stacks. When leaving the callback, the topmost entry in the linked list
is removed, and the saved state is restored. The top-level stack_state
struct does not contain a valid saved current quotation field, since
there was no current quotation when the top level object was created. So
the garbage collector would not consider the saved quotation there a
valid root, and would not copy this object. However, the test for this
case was wrong: it ignored the saved current quotation in the first
stack_state in the list rather than the last. The result was that if callbacks were
never used, then the first and last elements of the linked list would
coincide and there would be no problem. However, when the garbage
collector was invoked from within a nested callback, the correct pointer
would not be updated, and thus the interpreter would continue executing
at an address which was no longer valid.
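
The difference between the two tests can be illustrated with a minimal sketch in C. This is not the runtime's actual code: only the stack_state name comes from the description above, while the quot and next fields, relocate, MOVE_OFFSET and the function names are hypothetical, and heap addresses are modelled as plain integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified saved interpreter state; the real struct also
   holds copies of the data and call stacks. */
typedef struct stack_state {
    long quot;                  /* saved current quotation, as a heap address */
    struct stack_state *next;   /* older saved state; NULL at the top level */
} stack_state;

enum { MOVE_OFFSET = 1000 };    /* made-up distance the collector moves objects */

/* Stand-in for a copying collector relocating an object. */
long relocate(long addr) { return addr + MOVE_OFFSET; }

/* Buggy test: skips the first (newest) entry instead of the top-level one. */
void update_roots_buggy(stack_state *newest) {
    for (stack_state *s = newest; s != NULL; s = s->next)
        if (s != newest)
            s->quot = relocate(s->quot);
}

/* Fixed test: skips the last entry, the top-level state whose saved
   quotation field was never valid. */
void update_roots_fixed(stack_state *newest) {
    for (stack_state *s = newest; s != NULL; s = s->next)
        if (s->next != NULL)
            s->quot = relocate(s->quot);
}

/* Simulates a collection inside one nested callback and returns the saved
   quotation address the interpreter would resume at afterwards. */
long resume_address_after_gc(void (*update_roots)(stack_state *)) {
    stack_state top_level = { 0, NULL };        /* 0: no valid quotation */
    stack_state nested = { 100, &top_level };   /* live quotation at 100 */
    update_roots(&nested);
    return nested.quot;
}

/* With no callbacks the list has one entry, so both tests skip it. */
void check_single_entry_agrees(void) {
    stack_state a = { 0, NULL }, b = { 0, NULL };
    update_roots_buggy(&a);
    update_roots_fixed(&b);
    assert(a.quot == b.quot);
}
```

In this model, calling resume_address_after_gc with update_roots_buggy leaves the nested entry pointing at the old address, while update_roots_fixed moves it along with the object. With a single entry the two versions behave identically, which would explain why the bug stayed hidden until a collection happened inside a callback.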