My Trees can now have arbitrary roots. It is time to grow a Forest.
When I create a new tree, which might or might not be a clone of an existing one, I need to give the tree the root node id as an initializer argument. This is awkward, since I don't know what it will be...
I think the following is a better design:
- if root_id is None, the tree picks the next id when it creates the root node
- if root_id is not None, it is the root of the tree we clone from
- when the tree updates nodes, it never overwrites the root node, instead it creates a new node all the time
- when things are committed to disk, the forest queries all trees for their current root node and stores that list in the metadata
Next big thing: trees currently generate node ids themselves. This will end badly... Fixed.
The final big thing: removing a tree. I haven't used refcounts yet at all, that's going to be needed for this. Otherwise I can't remove nodes.
In fact, refcounting is the only interesting part of this. So, the changes that need making are:
- when creating a node, set its refcount to 1
- when creating an index node, additionally increment the refcount of each of its children
- instead of removing a node, decrement its refcount and if it drops to zero, only then remove it
- when removing an index node, additionally decrement the refcount of each of its children
- node removal happens also when root node is replaced
Hmm. This is not quite the same as what Rodeh suggests, but he's planning for a filesystem where disk blocks need to be re-used, where I do not have that.
Actually, my scheme above can be simplified:
- when creating a leaf node, set its refcount to 0
- when creating an index node, increment refcount of each child
- when removing an inex node, decrement refcount of each child
- root nodes will always have a refcount of 1
How should I deal with the root that gets replaced? It might belong to another tree, after all. I guess I should not force roots to have refcounts of 1. Instead, when I clone a tree, the refcount for a root should grow. Alternatively, I could always create a new root node when cloning a tree.
Also, even when inserting, some nodes might end up getting destroyed. This requires some careful thinking, now.
Algorithm for decrementing refcount for a node:
decrement refcount for node if refcount dropped to 0: if node is index node: decrement refcounts for children remove node
Calling this algorithm at judicious locations when inserting and removing should work.
I think I got it work. Didn't write extensive tests for the refcounting, but we'll see.
Next step should probably to do some benchmarking.