Sunday, January 30, 2011

Ternary Search Trees

I looked online and (in two books) but did not find very good explanations of ternary search trees. Following is what I combined together from various sources:

Ternary Search Trees are specialized structures used to store strings so they are easily searchable.
  1. The traversal of the tree proceeds character by character
  2.  Each node can have up to three children ( as opposed to two in binary trees)
  3. Middle node connection is dotted.
  4. Nodes that form the end of a word are colored.
These are the rules for insertion:
An insert compares the current character in the string to be inserted (insert character) with the current character at the node (node character):
    1. If the insert character is less than the node’s character then the insertion is made to the left (the line between nodes is solid)
    2. If the insert character is greater than the node’s character then the insertion is made to the right(the line between nodes is solid)
    3. If the insert character is equal to the node’s character then the insertion is made to the middle child and proceeds to the next character in the string to be inserted (or the next insert character).(the line between nodes is dotted)
Suppose you had the following words AB, ABBA, ABCD, and BCD and you wished to store them and then search for them.
We begin with AB. We insert A and then B is a child so a dotted line is drawn between the two :
The yellow circle indicates that the search string has terminated.
AB, ABBA, ABCD, and BCD
Now we wish to store string ABBA. We try to insert A. We find it is already there so we move to B in and notice that B is also already in the tree. Then we insert the rest of the string "ABBA" which is B and A :

AB, ABBA, ABCD, and BCD
Then we insert the next string: ABCD. We try to insert A and we find A. So we move to the next node in the tree B. Then we try to insert B and we find B in the tree so we move to the next node in the tree, again a B. Then we try to insert C and we find that we are on the second B or the third node. So to insert C we take a right branch and insert C and then insert D.



AB, ABBA, ABCD, and BCD
Now we try to insert BCD and as we enter the tree we encounter the first node character A which is not the same as the first character of the string BCD which is B. We insert B to the right branch of A so:
 A search compares the current character in the search string (search character) with the character at the node (node character):
    1. If the search character is less than the node’s character then the search goes to the left
    2. If the search character is greater than the node’s character then the search goes to the right
    3. If the search character is equal to the node’s character then the search goes down to the middle child and proceeds to the next character in the search string (or the next search character).
If we searched for the string ABBA we would see the following path followed in blue:

If we searched for ABD we would be unable to find it:

Another example of a ternary search tree is the following:

In the above don’t be surprised by the far left location of the first word “as” it is merely shifted left.

No comments:

Total Pageviews

Popular Posts