Реферат Курсовая Конспект
Hash Functions - раздел Образование, Unit 4. Sorting, Searching, and Complexity · 4.1 Sorting and Searching · 4.2 Complexity Assessments · Multiple-Choice Quiz Operations Using Hash Tables Are Efficient Because The Position Of A Stored V...
|
Operations using hash tables are efficient because the position of a stored value can be calculated using the key. Hash tables are implemented typically as an array of values. A hash function is used to map a key to an index within this array. This index is where the key's corresponding value is stored. Other search algorithms, such as a linear or binary search, cannot map a key to its value's position as quickly as hash tables. These algorithms must perform more comparisons to find the index of the stored value. The time this search takes to complete increases as the number of stored items increases.
Figure 1 A sample hash function
A mapping of keys to indexes is demonstrated in Figure 1. In this figure, the hash function generates an index based on the rightmost digit of the ASCII value of the second letter of the person's last name. For example, in the name "Hanson, Bob", the second letter is an "a". The ASCII value of "a" is 97. The rightmost digit of 97 is 7. Therefore, the record for "Hanson, Bob" is stored at index 7 of the hash table.
Figure 3 A sample hash table
The hash table in Figure 2 shows each name in its mapped position. The advantage of a hash table is that to find an entry, one only has to apply the hash function to the key.
A popular method used in implementing a hash function is the division method. Like all hash functions, the division method maps a key to an index within the hash table. The division method implementation involves the conversion of a key to a variable of type unsigned int. Then, this value is divided by the size of the hash table. The remainder of this division is used as the index.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: | class hash_function { public: unsigned int mm; hash_function(unsigned int m = 6151) : mm(m) {} unsigned int operator()(const string& s) const { unsigned int res = 0; for (unsigned int i = 0; i < s.size(); i++) { res = res * mm + s[i]; } return res; } }; |
Listing 1A hash function based on the division method |
A problem arises when we consider that there are typically many more keys than there are locations within a hash table. This means that a hash function can potentially map two or more distinct keys to the same index. Consider if we tried to add the name "Anderson, Jeff" to the hash table pictured in Figure 2. Applying the hash function yields an index of 0. The name "Adderly, Maria", however, already exists at index 0. When a hash function maps more than one key to the same index a collision occurs.
Hash table implementations are complicated by the fact that they must handle potential collisions. These mechanisms to handle collisions are discussed in detail in chapter 20 of the Weiss textbook. Basically, collisions decrease the performance of hash table operations. The best way to reduce the number of collisions, and thus increase the efficiency of a hash table, is to use a good hash function. A good hash function evenly distributes the mapping of keys throughout the positions in the hash table.
It is good software engineering practice to separate the hash function from the implementation of the hash table. This allows a programmer to customize a hash function to the data stored by a particular instance of a hash table. Listing 1 defines a hash table as a class that overloads operator ().
Memoizing: An Application of Hash Tables
When it is computationally expensive to calculate a function value y = f(x), it may be a good idea to store the value for future use. In other words, we calculate f(x) only the first time the function value on this particular input x is required, and then store ( x, f(x) ) in a hash table. Any future requests for f(x) will then result in a table lookup and be much faster than a recalculation. This technique is called memoizing.
Note that we have to modify the function a bit to take advantage of the stored values. Here is the structure of a memoized function.
int f_memo(int x) {
if ( there is an entry (x,y) in hash table ) {
return y;
}
else {
compute y = f(x) directly
store (x,y) in table
return y;
}
}
Example 1 A typical memoized function
Since the hash table has to persist between function calls, it would be natural to organize this as a class that overloads operator ().
– Конец работы –
Эта тема принадлежит разделу:
На сайте allrefs.net читайте: "Unit 4. Sorting, Searching, and Complexity · 4.1 Sorting and Searching · 4.2 Complexity Assessments · Multiple-Choice Quiz"
Если Вам нужно дополнительный материал на эту тему, или Вы не нашли то, что искали, рекомендуем воспользоваться поиском по нашей базе работ: Hash Functions
Если этот материал оказался полезным ля Вас, Вы можете сохранить его на свою страничку в социальных сетях:
Твитнуть |
Новости и инфо для студентов