Maps and Unique Sets

In This Topic

In NOV data structures a map is a dynamic set of key-value pairs, where the keys in the set are unique - i.e. a mapping of distinct keys to values. In other frameworks the map is also known as dictionary (.NET CLR) or association (Java). In a map, the key is used to uniquely identify the pair, while the mapped value is some sort of value associated to this key. Types of key and mapped value may differ - hence the INMap<TKey, TValue> generic interface has two types parameters - for the keys and for the values. A typical example of a map is a language dictionary - say English-German one - each English word maps to a German word, however there may be multiple English words that map to the same German one.

Maps implement the INMap<TKey, TValue> interface, which is declared as follows:

INMap<TKey, TValue> Interface	Copy Code
/// <summary> /// Represents a dynamic set of key-value pairs, where each key is unique for the set. /// </summary> /// <typeparam name="TKey"></typeparam> /// <typeparam name="TValue"></typeparam> public interface INMap<TKey, TValue> : INDynamicSet<NKeyValuePair<TKey, TValue>>, INContains<TKey>, INRemovable<TKey> { /// <summary> /// Gets the value that corresponds to the specified key. /// Throws an exception, if entry for this key does not exist. /// </summary> /// <param name="key"></param> /// <returns></returns> TValue Get(TKey key); /// <summary> /// Tries to get the value for the specified key. /// </summary> /// <param name="key">key to look for</param> /// <param name="value">value corresponding to key. Valid only if true is returned.</param> /// <returns>true if the map contained a value for the specified key, otherwise false</returns> bool TryGet(TKey key, out TValue value); /// <summary> /// Sets the value for the specified key. /// If key-value association already exists, the value is replaced. /// </summary> /// <param name="key"></param> /// <param name="value"></param> void Set(TKey key, TValue value); /// <summary> /// Adds a new entry with the specified key and value. /// Throws an exception, if entry for this key already exists. /// </summary> /// <param name="key"></param> /// <param name="value"></param> void Add(TKey key, TValue value); /// <summary> /// Gets/sets the value associated with the specified key. /// The getter implementation throws an exception, if entry for the key does not exist. /// The setter implementation replaces the value, if entry for the key exists. /// </summary> /// <param name="key"></param> /// <returns></returns> TValue this[TKey key] { get; set; } /// <summary> /// Gets the set of keys /// </summary> INSet<TKey> Keys { get; } /// <summary> /// Gets the set of values /// </summary> INSet<TValue> Values { get; } }

INMap<TKey, TValue> Interface

Copy Code

/// <summary>
/// Represents a dynamic set of key-value pairs, where each key is unique for the set.
/// </summary>
/// <typeparam name="TKey"></typeparam>
/// <typeparam name="TValue"></typeparam>
public interface INMap<TKey, TValue> :
    INDynamicSet<NKeyValuePair<TKey, TValue>>,
    INContains<TKey>,
    INRemovable<TKey>
    {
        /// <summary>
        /// Gets the value that corresponds to the specified key.
        /// Throws an exception, if entry for this key does not exist.
        /// </summary>
        /// <param name="key"></param>
        /// <returns></returns>
        TValue Get(TKey key);
        /// <summary>
        /// Tries to get the value for the specified key.
        /// </summary>
        /// <param name="key">key to look for</param>
        /// <param name="value">value corresponding to key. Valid only if true is returned.</param>
        /// <returns>true if the map contained a value for the specified key, otherwise false</returns>
        bool TryGet(TKey key, out TValue value);
        /// <summary>
        /// Sets the value for the specified key. 
        /// If key-value association already exists, the value is replaced.
        /// </summary>
        /// <param name="key"></param>
        /// <param name="value"></param>
        void Set(TKey key, TValue value);
        /// <summary>
        /// Adds a new entry with the specified key and value.
        /// Throws an exception, if entry for this key already exists.
        /// </summary>
        /// <param name="key"></param>
        /// <param name="value"></param>
        void Add(TKey key, TValue value);
        /// <summary>
        /// Gets/sets the value associated with the specified key.
        /// The getter implementation throws an exception, if entry for the key does not exist.
        /// The setter implementation replaces the value, if entry for the key exists.
        /// </summary>
        /// <param name="key"></param>
        /// <returns></returns>
        TValue this[TKey key] { get; set; }
        /// <summary>
        /// Gets the set of keys
        /// </summary>
        INSet<TKey> Keys { get; }
        /// <summary>
        /// Gets the set of values
        /// </summary>
        INSet<TValue> Values { get; }
}

The most common implementation of this interface is provided by the NMap<TKey, TValue> class and is based on a prime hash table (see below).

As a general rule in NOV data structures, the default implementation for a specific interface has a class named after it. For example: INList - NList, INStack - NStack, INQueue - NQueue etc.

The following code example shows how to make a simple reverse telephone dictionary (i.e. get a name from a telephone number). Note that in the reverse telephone dictionary the numbers are unique, while the returned names can be duplicate, because a single person can have multiple telephones:

Map Example	Copy Code
void TestMap() { // make a map and fill it with items NMap<string, string> map = new NMap<string, string>(); map.Add("345-234-3456", "John Smith"); map.Add("452-345-4667", "John Smith"); map.Add("783-352-6573", "Susan Smith"); map.Add("896-134-8365", "Susan Smith"); // make a sample query - the first one succeeds, the second does not string[] phones = new string[] { "783-352-6573", "234-235-2345" }; for (int i = 0; i < 2; i++) { string phone = phones[i]; string name; if (map.TryGet(phone, out name)) { Console.WriteLine("Phone: " + phone + " maps to name: " + name); } else { Console.WriteLine("Phone: " + phone + " has no mapped name"); } } // output is: // Phone: 783-352-6573 maps to name: Susan Smith // Phone: 234-235-2345 has no mapped name // output the content of the map INIterator<NKeyValuePair<string, string>> it = map.GetIterator(); while (it.MoveNext()) { NKeyValuePair<string ,string> cur = it.Current; Console.WriteLine("Phone: " + cur.Key + " maps to name: " + cur.Value); } // output is: // Phone: 345-234-3456 maps to name: John Smith // Phone: 452-345-4667 maps to name: John Smith // Phone: 783-352-6573 maps to name: Susan Smith // Phone: 896-134-8365 maps to name: Susan Smith }

Map Example

Copy Code

void TestMap()
{
    // make a map and fill it with items
    NMap<string, string> map = new NMap<string, string>();
    map.Add("345-234-3456", "John Smith");
    map.Add("452-345-4667", "John Smith");
    map.Add("783-352-6573", "Susan Smith");
    map.Add("896-134-8365", "Susan Smith");
 
    // make a sample query - the first one succeeds, the second does not
    string[] phones = new string[] { "783-352-6573", "234-235-2345" };
    for (int i = 0; i < 2; i++)
    {
        string phone = phones[i];
        string name;
        if (map.TryGet(phone, out name))
        {
            Console.WriteLine("Phone: " + phone + " maps to name: " + name);
        }
        else
        {
            Console.WriteLine("Phone: " + phone + " has no mapped name");
        }
    }
    // output is:
    // Phone: 783-352-6573 maps to name: Susan Smith
    // Phone: 234-235-2345 has no mapped name
    // output the content of the map
    INIterator<NKeyValuePair<string, string>> it = map.GetIterator();
    while (it.MoveNext())
    {
        NKeyValuePair<string ,string> cur = it.Current;
        Console.WriteLine("Phone: " + cur.Key + " maps to name: " + cur.Value);
    }
    // output is:
    // Phone: 345-234-3456 maps to name: John Smith
    // Phone: 452-345-4667 maps to name: John Smith
    // Phone: 783-352-6573 maps to name: Susan Smith
    // Phone: 896-134-8365 maps to name: Susan Smith
}

In many cases it is necessary to only use the fact that the keys in the map are unique - for example, if you have to build a set that contain only the distinct items of a list. Indeed if the TValue generic argument is equal to the TKey generic argument and you always associate a key with itself (or a constant value), you can achieve this behavior with a map. NOV however provides a slightly optimized way to achieve this, which is implemented by the NUniqueSet<T> class. It only stores the distinct items that you add to it. Internally it is also based on a prime hash table (see below).

The following code example shows how to make a distinct list of the items contained in a list.

UniqueSet Example	Copy Code
void TestUniqueSet() { // create a simple list that contains duplicate items. NList<string> list = new NList<string>(); list.Add("NOV"); list.Add("World"); list.Add("World"); list.Add("Hello"); list.Add("Hello"); // create a set that contains the distict items of the list. NUniqueSet<string> set = new NUniqueSet<string>(list); // output the set items INIterator<string> it = set.GetIterator(); while (it.MoveNext()) { Console.WriteLine(it.Current); } // output is: // NOV // World // Hello // add a dummy constant value several times, // just to see that it is actually added only once for (int i = 0; i < 5; i++) { set.Add("Nevron"); } // output the set items it = set.GetIterator(); while (it.MoveNext()) { Console.WriteLine(it.Current); } // output is: // NOV // World // Hello // Nevron }

UniqueSet Example

Copy Code

void TestUniqueSet()
{
    // create a simple list that contains duplicate items.
    NList<string> list = new NList<string>();
    list.Add("NOV");
    list.Add("World");
    list.Add("World");
    list.Add("Hello");
    list.Add("Hello");   
    // create a set that contains the distict items of the list.
    NUniqueSet<string> set = new NUniqueSet<string>(list);
    // output the set items
    INIterator<string> it = set.GetIterator();
    while (it.MoveNext())
    {
        Console.WriteLine(it.Current);
    }
    // output is:
    // NOV
    // World
    // Hello
    // add a dummy constant value several times, 
    // just to see that it is actually added only once
    for (int i = 0; i < 5; i++)
    {
        set.Add("Nevron");
    }
    // output the set items
    it = set.GetIterator();
    while (it.MoveNext())
    {
        Console.WriteLine(it.Current);
    }
    // output is:
    // NOV
    // World
    // Hello
    // Nevron
}

Hash Table Implementation

The NMap<TKey, TValue> and NUniqueSet<T> implementations, are based on a prime hash table, which means that the internal hash table managed by the map is sized only to prime numbers. This is a well known strategy for protecting a general hash-table from not very good hashing functions. The hash table uses the build in ability of each .NET object to provide a hash-code via its int GetHashCode() virtual method. If two key objects have the same hash-code, they are said to reside in the same bucket, so the uniqueness of the keys with the same hash-code is further checked by calling the .NET object bool Equals(object other) virtual method. This directly leads to the general limitation of the hash table implementation, which is that the key object cannot be null.

The hash table implementation is a very good performer, that has nearly constant query times and update times. Addition of new entries is also performed in constant time, except in the cases when the hash table needs to be resized (such cases are performed in O(N) time). The hash table aims to grow in a geometric progression with a common ratio of 2, but because of the prime number size constraint it actually picks the next good prime number, that is closest to the required capacity.