Converting a table to a tree
Thursday, April 3, 2008
I found an interesting Haskell snippet in a reddit comment.
From the comment:
*Here’s a slightly longer function I created a while ago:
*listToForest :: Eq a => [[a]] -> Forest a listToForest = map toTree . groupBy ((==) `on` head) . filter (/= []) where toTree = Node . (head . head) <*> (listToForest . map tail)
The function turns a table of results, such as that which might come back from a database query, into a tree. e.g.
[[A, B, C], [A, B, D], [A, E, F]]
Becomes:
A / \ B E / \ \ C D F
I decided to try implementing this in Factor, and I came up with three versions.
All versions represent a tree as its root node, and each node is a hashtable mapping the key of the child to the child node.
The first version does not use any utility words other than what’s in the standard library; it consists of a single, rather long word:
: list>forest ( list -- forest )
[ empty? not ] subset
H{ } clone [ [ >r unclip r> [ ?push ] change-at ] curry each ] keep
[ list>forest ] assoc-map ;
The second version uses a couple of utilities:
: push-at ( value key assoc -- )
[ ?push ] change-at ;
: classify ( seq classifier -- assoc )
H{ } clone [ [ slip push-at ] 2curry each ] keep ;
: list>forest ( list -- forest )
[ empty? not ] subset [ unclip ] classify [ list>forest ] assoc-map ;
The third version has a different implementation of classify
which
uses extra/fry
syntax sugar:
: push-at ( value key assoc -- )
[ ?push ] change-at ;
: classify ( seq classifier -- assoc )
H{ } clone [ '[ @ , push-at ] each ] keep
: list>forest ( list -- forest )
[ empty? not ] subset [ unclip ] classify [ list>forest ] assoc-map ;
If this was a one-off thing I’d use the first version. If I was writing
production code, I would use one of the second two, and I might even put
the utilities in a shared vocabulary since they’re generally useful. In
the last version, the only stack manipulation operator is keep
, and
everything else just happens to be on the stack in the correct order.
This is how good stack code is written.
Some explanation:
?push
is likepush
except it makes a new vector if the input isf
slip
calls a quotation underneath the top of the stack.push-at
adds a value to the sequence stored at a key in a hashtable.classify
takes a sequence and a quotation which decomposes a sequence element into a key and a value. It builds an association mapping all the unique keys to corresponding values.unclip
takes a sequence and outputs the rest of the sequence followed by the first element.- Haskell’s
groupBy
is a generalization of myclassify
; it works with an equivalence relation rather than a “fibration” into a set with an implied equivalence relation (wrong term, but I’m not sure what the correct term is). - My
classify
runs in linear time and I suspectgroupBy
is quadratic, but I could be wrong.
I’d be interested in seeing a version of this function in idiomatic Common Lisp or Scheme. I suspect it would be somewhat longer, but not horribly so.