When I was first learning how to program, I had a class that wanted a fast way of checking the spelling of the words in a file. The language we were using was called APL (Another Programming Language), which was based on atomic structures that were very dynamic and powerful. Unfortunately it was a language that I only used in college. Anyway, the data structure that I used was called a trie, even though I didn’t know it at the time. The way that I built it was a beast. I sacrificed memory for speed. I could find a word on the first look. O(1).
I later built it in C++ for a friend who wanted a fast way of finding key words for some project or another. It was also memory intensive and extremely fast. A couple of weeks ago I was trying a coding challenge on HackerRank.com, and low and behold they wanted a dictionary application. I have been wanting to gain a little C# experience and jumped right into it.
I’m only going to give the classes for the Trie, because the other code would be completely useless to anyone. But the essence of the code breaks down the word into characters, and places them into a tree. The tree is designed to grow for as many characters that are needed. Which means that it can hold any word no matter it’s length. The Node class is based off the SortedDictionary because a Dictionary will hold a key, and value. The key is the letter or character, and the value is the next Node. The Dictionary is Sorted to increase the speed in which the keys can be found.
To add words into the Trie, use the AddWord function. The reason that I liked the Dictionary data structure so much is because it won’t grow unless you need it to. So if I need enough Nodes for 20 letters them all I make space for is 20 Nodes. It uses exactly what is needed. If there is a similar spelling to a word, then the word gets compressed. You only add what you need. The property End in the Node class is there to tell you that you have a complete word. For instance, if I add the word ‘ant’ and then add ‘antiquate’, the Node with the first “t” will set the End property to true so you can find the ‘ant’ later if you need to.
When you check the Trie for a word, pass a string into the function Contains. It will burrow down into the Trie until one of two things happens; it runs into a dead end, or runs out of letters from the word you gave it. If a dead end is found then the word wasn’t added into the Trie. If it runs out of letters, then the function checks to see if the End property is true. If it’s true, then the search has made a match. The function will then return a true of false on the results of the search.
The data structure can be modified easily to match your circumstances. In another challenge, I needed to parse numeric values bit by bit. Changing the number into a string of bits is not best way to solve this problem, so change the code to do it for you.
If you find any problems with what I wrote, please feel free to leave a comment so I can update the example.
public class Node: SortedDictionary<char, Node>
{
public bool End { get; set; }
}
public class Trie
{
private Node Root = new Node();
public void AddWord(string wd)
{
Node t = Root;
foreach (char c in wd)
{
if (!t.ContainsKey(c))
{
t.Add(c, new Node());
}
t = t[c];
}
t.End = true;
}
public bool Contains(string wd)
{
Node t = Root;
foreach (char c in wd)
{
if (t.ContainsKey(c))
{
t = t[c];
}
else
return false;
}
return t.End;
}
}
