Trees in SQL
Here is a cool DB trick that a friend of mine showed me a while back. The problem is basically how to efficiently access a tree stored in a relational/SQL database.
The solution where you store the parent of a node in the node/row itself works, but can't be retrieved in-depth without using some kind of a loop or reccursion.
Joe Celko seems to one of the most mentioned DB experts on this matter. Here is an intro he wrote on reprensenting trees as "nested sets".
A double numbering or indexing is done globally on the tree, by giving each node/row a left index and right index. These indexes are set sequentially starting with left=1 on the root, and walking through the tree in-depth. The left index is the value of the sequence when you first reach a node (on the left side, if you do a drawing) and the right index is the value of the counter when you reach the node the second time.
Below is an illustration of this numbering:
Some of the nice properties of this schema are that non-reccursive queries can be used to retrieve some meaningful parts of the tree:
- a leaf has (right=left+1),
- a subbranch (under a node with leftbase and rightbase as indexes) verifies (leftbase < left AND right < rightbase),
- the path back to the root starting at a node (with leftnode, rightnode as indexes) can be found similarly with (left < leftnode AND rightnode < right).
One of the problem is that each time a new node is inserted, approximately half of the existing nodes/rows need to be updated, to maintain the consistency of this indexing. This could be a problem for trees that are updated frequently.
If you know of any large scale implementation of trees in SQL using this system or any other, please let me know.
Here is another bunch of links on the topic:
First, second and third articles by Celko on dbmsmag.com.
A comparison with other methods.
Celko wrote an article for O'Reilly Networks: Hierarchical SQL.
Found another article that explains this technique, in PHP: http://www.sitepoint.com/article/1105Posted by: Dumky at June 19, 2003 04:11 PM
http://www.codeproject.com/cs/database/Trees_in_SQL_databases.aspPosted by: dev@ at October 26, 2004 02:47 PM
Also see Teodor Sigaev and Oleg Bartunov's use of GiST indexes at http://www.sai.msu.su/~megera/postgres/gist/