Crash while running unit tests caused by unit tests
Monday, October 16, 2006
I spent an hour last night on a weird crashing bug that manifested
itself on Windows: if you ran test-modules from the UI a few times, it
would eventually crash Factor in unpredictable ways. I narrowed this
down to the test for I/O buffers, which are low-level character queues,
allocated in malloc() space, for when the Factor implementation needs
to call native I/O functions with a pointer which will not be moved by
the GC.
Now one of the unit tests created two buffers from a pair of strings,
appended one to the other, then converted the result to a string, and
compared the result against the expected string. Unfortunately, the word
to append them was written rather carelessly. It used memcpy to copy
the contents of one buffer to another, without checking bounds and
growing the buffer first. The result? Random crashes, yet amazingly they
never appeared on Linux or Mac OS X.
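To make the failure mode concrete, here is a minimal sketch in C of a bounds-checked append. This is not the Factor code from the test: the `buffer` struct and the `buffer_from_string`, `buffer_reserve`, and `buffer_append` names are hypothetical stand-ins for a malloc()-backed character queue. The careless version amounts to calling memcpy without the `buffer_reserve` step, which writes past the end of the destination allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an I/O buffer: a malloc()-backed byte queue. */
typedef struct {
    char  *data;  /* malloc() storage, so the address is stable */
    size_t size;  /* allocated capacity in bytes */
    size_t fill;  /* number of bytes currently stored */
} buffer;

/* Create a buffer holding a copy of s (without the trailing NUL). */
static int buffer_from_string(buffer *b, const char *s)
{
    size_t n = strlen(s);
    size_t cap = n ? n : 1;          /* avoid a zero-sized allocation */

    b->data = malloc(cap);
    if (b->data == NULL)
        return -1;
    memcpy(b->data, s, n);
    b->size = cap;
    b->fill = n;
    return 0;
}

/* Grow the buffer, if needed, so that `extra` more bytes will fit. */
static int buffer_reserve(buffer *b, size_t extra)
{
    size_t needed = b->fill + extra;
    char *p;

    if (needed <= b->size)
        return 0;                    /* already large enough */
    p = realloc(b->data, needed);
    if (p == NULL)
        return -1;                   /* old block is still valid */
    b->data = p;
    b->size = needed;
    return 0;
}

/* Append the contents of src to dst, growing dst first.
   Skipping buffer_reserve() here is the bug described above:
   memcpy would then write past the end of dst->data. */
static int buffer_append(buffer *dst, const buffer *src)
{
    if (buffer_reserve(dst, src->fill) != 0)
        return -1;
    memcpy(dst->data + dst->fill, src->data, src->fill);
    dst->fill += src->fill;
    return 0;
}
```

An overflow like this is undefined behavior, so whether it crashes depends on what the allocator happens to place after the block. That is one plausible reason the same bug could stay hidden on some platforms and surface on another.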
Lessons learned? Well, I knew all of these already, but this incident underscores them:
- Pointer arithmetic is dangerous. Fortunately I don’t believe in “hair shirt” programming, so Factor makes minimal use of unsafe constructs, only resorting to direct memory manipulation when calling C code which wants us to deal with C data. User programs never have to step outside the memory safe sandbox.
- When unit tests fail, it doesn’t necessarily imply the code being tested is buggy. The unit tests could be broken too! The reason it took me a long time to track down the bug is simply that I did not think to check the unit tests themselves, since after all, the buffers test passed most of the time.
- Testing on multiple platforms helps weed out bugs.