Data Structures Dictionaries Andres Mendez-Vazquez May 6, 2015 1 / 127
Page 1: Preparation Data Structures 09 hash tables

Data Structures: Dictionaries

Andres Mendez-Vazquez

May 6, 2015


Page 2: Preparation Data Structures 09 hash tables

Outline

1 Introduction
2 Operations in an ADT dictionary
    Add
    Remove
    GetValue
    Contains
    Iterators
3 Example
4 Implementation
5 Hash Tables
    Number of Keys
    Hash Functions
    Hashing By Division
6 Overflow Handling
    Open Addressing
    Chaining

Page 3: Preparation Data Structures 09 hash tables

Dictionaries

Definition
The ADT dictionary (also called a map, table, or associative array) contains entries that each have two parts:

- A keyword, usually called a search key, such as an English word or a person's name.
- A value, such as a definition, an address, or a telephone number, associated with that key.


Page 6: Preparation Data Structures 09 hash tables

Example

Dictionary with Duplicates
Pairs are of the form (word, meaning).

More than a single meaning
- (bolt, a threaded pin)
- (bolt, a crash of thunder)
- (bolt, to shoot forth suddenly)
- (bolt, a gulp)
- (bolt, a standard roll of cloth)
- etc.
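One possible way to hold several meanings under the same word is to keep a list of values per key. The following is only a minimal sketch under that assumption; the class name MultiMeaningDictionary and its methods are hypothetical and do not come from the slides.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the slides): a dictionary that tolerates
// duplicate search keys by storing a list of values for each key.
class MultiMeaningDictionary {
    private final Map<String, List<String>> entries = new HashMap<>();

    // Adds one (word, meaning) pair; earlier meanings of the word are kept.
    void add(String word, String meaning) {
        entries.computeIfAbsent(word, w -> new ArrayList<>()).add(meaning);
    }

    // Returns every meaning stored for the word (an empty list if none).
    List<String> getMeanings(String word) {
        return entries.getOrDefault(word, new ArrayList<>());
    }
}
```

With this layout, adding (bolt, a threaded pin) and then (bolt, a gulp) leaves both meanings reachable through getMeanings("bolt").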


Page 13: Preparation Data Structures 09 hash tables

Thus

A dictionary may have
- Sorted keys
- Duplicate keys

Page 14: Preparation Data Structures 09 hash tables

Operations in the ADT dictionary

Common operations, shared with most databases
- insert: adds a new entry to the dictionary, given a search key and its associated value.
- delete: removes an entry, given its associated search key.
- retrieve: retrieves the value associated with a given search key.
- search: checks whether the dictionary contains a given search key.
- traverse:
    - It traverses all the search keys in the dictionary.
    - It traverses all the values in the dictionary.


Page 21: Preparation Data Structures 09 hash tables

In addition

We have the following extra operations
- Detect whether a dictionary is empty.
- Get the number of entries in the dictionary.
- Remove all entries from the dictionary.


Page 25: Preparation Data Structures 09 hash tables

Specifications: Add

Pseudocode
add(key, value)

Task
It adds the pair (key, value) to the dictionary.

Input and Output
Input: key is an object search key; value is an associated object.
Output: None.


Page 28: Preparation Data Structures 09 hash tables

Example

Adding something

[Figure: the pair (k, item) is sent through h(k) to a position in the hash table.]


Page 30: Preparation Data Structures 09 hash tables

Specifications: Remove

Pseudocode
remove(key)

Task
It removes from the dictionary the entry that corresponds to a given search key.

Input and Output
Input: key is an object search key.
Output: Returns either the value that was associated with the search key or null if no such object exists.


Page 33: Preparation Data Structures 09 hash tables

Example

Removing something

[Figure: the pair (k, item) is located through h(k) and taken out of the hash table.]


Page 35: Preparation Data Structures 09 hash tables

Specifications: GetValue

Pseudocode
getValue(key)

Task
It retrieves from the dictionary the value that corresponds to a given search key.

Input and Output
Input: key is an object search key.
Output: Returns either the value associated with the search key or null if no such object exists.


Page 39: Preparation Data Structures 09 hash tables

Specifications: Contains

Pseudocode
contains(key)

Task
It checks whether any entry in the dictionary has a given search key.

Input and Output
Input: key is an object search key.
Output: Returns true if an entry in the dictionary has key as its search key, and false otherwise.


Page 43: Preparation Data Structures 09 hash tables

Specifications: GetKeyIterator

Pseudocode
getKeyIterator()

Task
It creates an iterator that traverses all search keys in the dictionary.

Input and Output
Input: None.
Output: Returns an iterator that provides sequential access to the search keys in the dictionary.


Page 46: Preparation Data Structures 09 hash tables

Specifications: GetValueIterator

Pseudocode
getValueIterator()

Task
It creates an iterator that traverses all values in the dictionary.

Input and Output
Input: None.
Output: Returns an iterator that provides sequential access to the values in the dictionary.
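To illustrate what the two iterators give you, the sketch below uses java.util.TreeMap as a stand-in dictionary (an assumption; the slides do not prescribe it). The class IteratorSketch and its methods drain and samplePhoneBook are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: key and value iterators over a stand-in map.
class IteratorSketch {
    // Collects everything the iterator yields, in iteration order.
    static <T> List<T> drain(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    // A small phone book; TreeMap iterates its keys in sorted order.
    static Map<String, String> samplePhoneBook() {
        Map<String, String> book = new TreeMap<>();
        book.put("Suzanne Nouveaux", "401-555-1234");
        book.put("Andres Mendez-Vazquez", "301-123-2345");
        return book;
    }
}
```

Here book.keySet().iterator() plays the role of getKeyIterator() and book.values().iterator() plays the role of getValueIterator(); with a TreeMap both walk the entries in key order.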


Page 49: Preparation Data Structures 09 hash tables

Other Operations

isEmpty()
It checks whether the dictionary is empty.

getSize()
It gets the number of entries in the dictionary.

clear()
It removes all entries from the dictionary.


Page 52: Preparation Data Structures 09 hash tables

However, we have two scenarios

Distinct search keys
When add receives a key that is already in the dictionary:
- Case 1: You can refuse to add another key-value entry.
- Case 2: You can change the existing value associated with key to the new value. Then, you return the old value.

Duplicate search keys
If the method add adds every given key-value entry to the dictionary:
- The methods remove and getValue must deal with multiple entries that have the same search key.
- Which entry do you remove or return?
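The two policies for distinct search keys can be sketched with java.util.Map, whose put and putIfAbsent methods happen to behave like Case 2 and Case 1 respectively. This is only an illustration under the assumption that values are never null; the class DistinctKeyPolicies and its method names are hypothetical.

```java
import java.util.Map;

// Hypothetical illustration of the two policies for distinct search keys.
class DistinctKeyPolicies {
    // Case 1: refuse to add when the key already exists.
    // Returns true only if the entry was actually added.
    static boolean addIfAbsent(Map<String, String> dict, String key, String value) {
        return dict.putIfAbsent(key, value) == null;
    }

    // Case 2: replace the existing value and return the old one
    // (null when the key was not present before).
    static String addOrReplace(Map<String, String> dict, String key, String value) {
        return dict.put(key, value);
    }
}
```

Case 2 matches an add that returns a value, as in an interface where add is declared to return the previous Value.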


Page 57: Preparation Data Structures 09 hash tables

Interface

We have the following interface

    import java.util.Iterator;

    public interface DictionaryInterface<Key, Value> {
        public Value add(Key k, Value Item);
        public Value remove(Key k);
        public Value getValue(Key k);
        public boolean contains(Key k);
        public Iterator<Key> getKeyIterator();
        public Iterator<Value> getValueIterator();
        public boolean isEmpty();
        public int getSize();
        public void clear();
    }

Page 58: Preparation Data Structures 09 hash tables

Where can we use this ADT?

In the phone directory problem
It is a directory that uses a name as the key and adds and returns a phone number.

For example

Name                     Number
Suzanne Nouveaux         401-555-1234
Andres Mendez-Vazquez    301-123-2345
...                      ...


Page 60: Preparation Data Structures 09 hash tables

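Thus, we have the following diagram

A class Diagram

[Figure: class diagram showing TelephoneDirectory, with the operations readFile(data) and getPhoneNumber(name), connected through PhoneDirectory to Dictionary, which relates Name and String; the associations carry multiplicities 1 and *.]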

Page 61: Preparation Data Structures 09 hash tables

Basic Code

We have something like this

    import java.util.Scanner;

    public class TelephoneDirectory {
        private DictionaryInterface<Name, String> phoneBook;

        public TelephoneDirectory() {
            phoneBook = new SortedDictionary<Name, String>();
        } // end default constructor

        /** Reads a text file of names and telephone numbers. */
        public void readFile(Scanner data) { ... }

        /** Gets the phone number of a given person. */
        public String getPhoneNumber(Name personName) { ... } // end getPhoneNumber
    } // end TelephoneDirectory

Page 62: Preparation Data Structures 09 hash tables

Now, the Big Question

It is a big one
How do we implement this data structure?

Possible ways
- Linear List
- Skip List
- Hash Tables
- ...


Page 67: Preparation Data Structures 09 hash tables

First: Represent As A Linear List

You have
L = (e_0, e_1, ..., e_{n-1})

Where
Each e_i is a pair (key, element).

We can use the following representations
Array or linked representation.


Page 70: Preparation Data Structures 09 hash tables

Array Representation

Our classic array

[Figure: an unsorted array holding the keys b, c, d, e, a]

We have then

Operation in Array Representation    Complexity
getValue(theKey)                     O(size)
add(theKey, theItem)                 O(size) to find a duplicate, O(1) to add at the right end
remove(theKey)                       O(size)
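The costs in the table can be read off a small implementation. The sketch below is one possible unsorted array dictionary, assuming distinct search keys and the replace-on-duplicate policy; the class name UnsortedArrayDictionary is hypothetical, and java.util.ArrayList stands in for the raw array.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an unsorted, array-backed dictionary.
class UnsortedArrayDictionary<K, V> {
    private final List<K> keys = new ArrayList<>();
    private final List<V> values = new ArrayList<>();

    // Linear scan over the keys: O(size).
    private int indexOf(K key) {
        for (int i = 0; i < keys.size(); i++) {
            if (keys.get(i).equals(key)) {
                return i;
            }
        }
        return -1;
    }

    // O(size) because of the scan.
    V getValue(K key) {
        int i = indexOf(key);
        return i < 0 ? null : values.get(i);
    }

    // O(size) to look for a duplicate, then an amortized O(1) append at the right end.
    V add(K key, V value) {
        int i = indexOf(key);
        if (i >= 0) {
            return values.set(i, value); // replace and return the old value
        }
        keys.add(key);
        values.add(value);
        return null;
    }

    // O(size) to find the entry; the last entry then fills the hole,
    // which is fine because the array is unsorted.
    V remove(K key) {
        int i = indexOf(key);
        if (i < 0) {
            return null;
        }
        int last = keys.size() - 1;
        V old = values.get(i);
        keys.set(i, keys.get(last));
        values.set(i, values.get(last));
        keys.remove(last);
        values.remove(last);
        return old;
    }

    int getSize() {
        return keys.size();
    }
}
```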


Page 72: Preparation Data Structures 09 hash tables

What if we sort the array?

Sorted Keys

[Figure: a sorted array holding the keys a, b, c, d, e]

We have then

Operation in Array Representation    Complexity
getValue(theKey)                     O(log size)
add(theKey, theItem)                 O(log size) to find a duplicate, O(size) to add
remove(theKey)                       O(size)
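The logarithmic lookup comes from binary search, while add and remove still pay O(size) for shifting entries. A minimal sketch follows, assuming comparable keys and the replace-on-duplicate policy; the class name SortedArrayDictionary and the helper keysInOrder are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a dictionary kept as a sorted array of keys.
class SortedArrayDictionary<K extends Comparable<K>, V> {
    private final List<K> keys = new ArrayList<>();
    private final List<V> values = new ArrayList<>();

    // Binary search: O(log size). Returns the index of key when present,
    // otherwise -(insertionPoint) - 1.
    private int search(K key) {
        int lo = 0;
        int hi = keys.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = keys.get(mid).compareTo(key);
            if (cmp < 0) {
                lo = mid + 1;
            } else if (cmp > 0) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }

    V getValue(K key) {
        int i = search(key);
        return i < 0 ? null : values.get(i);
    }

    // O(log size) to find a duplicate, O(size) to shift entries on insertion.
    V add(K key, V value) {
        int i = search(key);
        if (i >= 0) {
            return values.set(i, value);
        }
        int at = -(i + 1);
        keys.add(at, key);
        values.add(at, value);
        return null;
    }

    // O(size): entries to the right of the removed one are shifted left.
    V remove(K key) {
        int i = search(key);
        if (i < 0) {
            return null;
        }
        keys.remove(i);
        return values.remove(i);
    }

    // Copy of the keys in their stored (sorted) order.
    List<K> keysInOrder() {
        return new ArrayList<>(keys);
    }
}
```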


Page 74: Preparation Data Structures 09 hash tables

Unsorted Chain

Our Structure

[Figure: an unsorted linked chain holding the keys b, c, d, e, a]

Complexity

Operation in Chain Representation    Complexity
getValue(theKey)                     O(size)
add(theKey, theItem)                 O(size) to find a duplicate, O(1) to add
remove(theKey)                       O(size)
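A linked version has the same costs: every operation walks the chain, and only the final link step is constant time. The sketch below assumes distinct search keys with replace-on-duplicate and inserts new nodes at the front; the class name ChainDictionary is hypothetical.

```java
// Hypothetical sketch of a dictionary stored as an unsorted linked chain.
class ChainDictionary<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private Node<K, V> head;
    private int size;

    // O(size): walk the chain until the key is found.
    V getValue(K key) {
        for (Node<K, V> n = head; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    // O(size) to look for a duplicate, O(1) to link a new node at the front.
    V add(K key, V value) {
        for (Node<K, V> n = head; n != null; n = n.next) {
            if (n.key.equals(key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        head = new Node<>(key, value, head);
        size++;
        return null;
    }

    // O(size) to find the node, O(1) to unlink it.
    V remove(K key) {
        Node<K, V> prev = null;
        for (Node<K, V> n = head; n != null; prev = n, n = n.next) {
            if (n.key.equals(key)) {
                if (prev == null) {
                    head = n.next;
                } else {
                    prev.next = n.next;
                }
                size--;
                return n.value;
            }
        }
        return null;
    }

    int getSize() {
        return size;
    }
}
```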


Page 76: Preparation Data Structures 09 hash tables

Sorted Chain

Our Structure

[Figure: a sorted linked chain holding the keys a, b, c, d, e]

Complexity

Operation in Chain Representation    Complexity
getValue(theKey)                     O(size)
add(theKey, theItem)                 O(size) to find a duplicate, O(1) to add
remove(theKey)                       O(size)


Page 78: Preparation Data Structures 09 hash tables

Skip Lists: we will skip them; they are for an advanced class on analysis of algorithms

We have something like this

Complexity

Operation               Complexity - Worst Case    Complexity - Expected
getValue(theKey)        O(size)                    O(log size)
add(theKey, theItem)    O(size)                    O(log size)
remove(theKey)          O(size)                    O(log size)


Page 80: Preparation Data Structures 09 hash tables

We will concentrate our efforts on Hash Tables

Definition
A hash table or hash map T is a data structure, most commonly an array, that uses a hash function to efficiently map certain identifiers or keys (e.g. person names) to associated values.

Why?

Operation               Complexity - Worst Case    Complexity - Expected
getValue(theKey)        O(size)                    O(1 + C)
add(theKey, theItem)    O(size)                    O(1 + C)
remove(theKey)          O(size)                    O(1 + C)


Page 82: Preparation Data Structures 09 hash tables

Then

Advantages
They have the advantage of an expected complexity of O(1 + C) per operation.
- Still, be aware of C, because it changes depending on which overflow policy you use...


Page 85: Preparation Data Structures 09 hash tables

You have two cases for this data structure

First
Small universe of keys.

Second
Large universe of keys.


Page 87: Preparation Data Structures 09 hash tables

When you have a small universe of keys, U

We can do the following
- Key values are direct addresses in the array.
- Direct implementation or direct-address tables.

Operations
1 Direct-Address-Search(Table, key)
    - return Table[key]
2 Direct-Address-Insert(Table, key, value)
    - Table[key] = value
3 Direct-Address-Delete(Table, key)
    - Table[key] = null
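The three direct-address operations translate almost line by line into code. This is a minimal sketch, assuming the keys are the integers 0 to |U| - 1; the class name DirectAddressTable and the constructor parameter universeSize are hypothetical.

```java
// Hypothetical sketch of a direct-address table for integer keys in [0, universeSize).
class DirectAddressTable<V> {
    private final Object[] table;

    DirectAddressTable(int universeSize) {
        table = new Object[universeSize]; // one slot per possible key
    }

    // Direct-Address-Search: O(1), the key is the array index.
    @SuppressWarnings("unchecked")
    V search(int key) {
        return (V) table[key];
    }

    // Direct-Address-Insert: O(1).
    void insert(int key, V value) {
        table[key] = value;
    }

    // Direct-Address-Delete: O(1), the slot is cleared.
    void delete(int key) {
        table[key] = null;
    }
}
```

Every operation is a single array access, which is why the approach only pays off when the table of size |U| is small enough to allocate.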


Page 95: Preparation Data Structures 09 hash tables

Images/cinvestav-1.jpg

When you have a large universe of keys, U

Then
It is impractical to store a table of size |U|.

You can use a special function for the mapping

$h : U \to \{0, 1, \ldots, m - 1\}$ (1)


Example

Imagine that you have
A 1D array (or table) table[0 : m − 1].

Thus
h(k) is the home bucket for key k.

Then
Every dictionary pair (key, item) is stored in its home bucket table[h(key)].


Example

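Insert the following pairs into a hash table of size m = 8:
(22,a), (33,c), (3,d), (73,e), (85,f).

The hash function is key/11 (integer division).

Then, we have that
3 goes to bucket 0, 22 to bucket 2, 33 to bucket 3, 73 to bucket 6, and 85 to bucket 7.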
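The placement of the five pairs can be checked with a short Java sketch. It assumes that key/11 means integer division and that no two of the given keys share a bucket; the class and method names are hypothetical.

```java
final class HomeBucketExample {
    static final int M = 8; // table size from the example

    // Hash function from the example: integer division by 11.
    static int h(int key) { return key / 11; }

    // Store each item in its home bucket; assumes no two keys collide.
    static String[] build(int[] keys, String[] items) {
        String[] table = new String[M];
        for (int i = 0; i < keys.length; i++) {
            table[h(keys[i])] = items[i];
        }
        return table;
    }

    public static void main(String[] args) {
        int[] keys = {22, 33, 3, 73, 85};
        String[] items = {"a", "c", "d", "e", "f"};
        System.out.println(java.util.Arrays.toString(build(keys, items)));
        // prints [d, null, a, c, null, null, e, f]
    }
}
```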


What Can Go Wrong?

What if we add another pair?
Where does (26,g) go?

Then

PROBLEM!!!
Keys that have the same home bucket are synonyms.
22 and 26 are synonyms with respect to the hash function in use, since 22/11 = 26/11 = 2.
This is known as a collision or overflow.
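A small Java sketch of the synonym test, assuming the key/11 hash function (integer division) from the example; the names are hypothetical.

```java
final class SynonymCheck {
    // Hash function from the example: integer division by 11.
    static int h(int key) { return key / 11; }

    // Two distinct keys are synonyms when they share a home bucket.
    static boolean synonyms(int k1, int k2) {
        return k1 != k2 && h(k1) == h(k2);
    }

    public static void main(String[] args) {
        System.out.println(synonyms(22, 26)); // true: both map to bucket 2
        System.out.println(synonyms(22, 33)); // false: buckets 2 and 3
    }
}
```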


Collisions

This is a problem
We might try to avoid it by using a suitable hash function h.

Idea
Make h appear to be “random” enough to avoid collisions altogether (highly improbable) or to minimize their probability.

You still have the problem of collisions
Possible solutions to the problem:

1. Chaining
2. Open Addressing


Other Issues

First Issue
The choice of the hash function.

Second
The collision handling method.

Third
The size (number of buckets) of the hash table.


Hash Functions

They have two parts
1. The conversion of the key into an integer, in case the key is not an integer.
2. The mapping of that integer to the home bucket.


String To Integer

Something Notable
Each Java character is 2 bytes long.
An int is 4 bytes long.

We could have the following string s = “pt”
We then do the following:

    int answer = s.charAt(0);
    answer = (answer << 16) + s.charAt(1);
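As a runnable sketch of the two statements above: the first character ends up in the high 16 bits of the int and the second in the low 16 bits. The class and method names are hypothetical.

```java
final class TwoCharKey {
    // Pack a two-character string into one int: first char in the high 16 bits.
    static int toInt(String s) {
        int answer = s.charAt(0);
        answer = (answer << 16) + s.charAt(1);
        return answer;
    }

    public static void main(String[] args) {
        // 'p' is 112 and 't' is 116
        System.out.println(toInt("pt")); // prints 7340148 = (112 << 16) + 116
    }
}
```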


What about longer Strings?

We can do the following process

    public static int integer(String s) {
        int length = s.length(); // number of characters in s
        int answer = 0;
        if (length % 2 == 1) { // length is odd
            answer = s.charAt(length - 1);
            length--;
        }

        // length is now even
        for (int i = 0; i < length; i += 2) { // do two characters at a time
            answer += s.charAt(i);
            answer += ((int) s.charAt(i + 1)) << 16;
        }

        // Caveat: if answer is Integer.MIN_VALUE, -answer is still negative.
        return (answer < 0) ? -answer : answer;
    }


Analysis of hashing: Which hash function?

Consider that:
Good hash functions should maintain the property of simple uniform hashing!

Each key is equally likely, with probability 1/m, to hash to any of the m buckets.
A uniform hash function minimizes the likelihood of an overflow when keys are selected at random.


Possible hash function when the keys are natural numbers

The division method
h(k) = k mod m.
Good choices for m are primes not too close to a power of 2.
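A minimal Java sketch of the division method. The slide assumes natural-number keys; the use of Math.floorMod so that negative ints also land in [0, m) is an added assumption, and m = 701 is only an illustrative prime.

```java
final class DivisionHash {
    // h(k) = k mod m; floorMod keeps the result in [0, m) even for negative k.
    static int h(int k, int m) { return Math.floorMod(k, m); }

    public static void main(String[] args) {
        int m = 701; // a prime, chosen here only for illustration
        System.out.println(h(1000, m)); // 1000 = 1 * 701 + 299, so prints 299
        System.out.println(h(-1, m));   // prints 700, whereas -1 % 701 is -1 in Java
    }
}
```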


What if...

Question:
What happens when the keys follow a normal distribution?


Hashing By Division

Universe of keys
keySpace = all integers (here, all 32-bit ints).

Thus, we have that
For every m, the number of integers that get mapped (hashed) into bucket i is approximately $2^{32}/m$.

Properties
The division method results in a uniform hash function when keySpace = all integers.


However

Problem
In practice, keys tend to be correlated.

Thus
The choice of the divisor m affects the distribution of home buckets.

Then
Because of this correlation, applications tend to have a bias towards keys that map into odd integers (or into even ones).


Examples

Odd key and m an even number
Odd integers hash into odd home buckets:

15 % 14 = 1, 3 % 14 = 3, 23 % 14 = 9

Even key and m an even number
Even integers hash into even home buckets:

20 % 14 = 6, 30 % 14 = 2, 8 % 14 = 8

Properties
The bias in the keys results in a bias toward either the odd or even home buckets.


What if we use an Odd number?

Odd key and m an odd number
Odd integers may hash into any home bucket:

15 % 15 = 0, 3 % 15 = 3, 23 % 15 = 8

Even key and m an odd number
Even integers may hash into any home bucket:

20 % 15 = 5, 30 % 15 = 0, 8 % 15 = 8

Thus
The bias in the keys does not result in a bias toward either the odd or even home buckets.
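The difference between an even and an odd divisor can be checked with a short Java sketch that counts how many keys land in odd-numbered buckets. It assumes non-negative keys; the class and method names are hypothetical.

```java
final class ParityBias {
    // Count how many of the given non-negative keys land in an odd-numbered home bucket.
    static int oddBuckets(int[] keys, int m) {
        int count = 0;
        for (int k : keys) {
            if ((k % m) % 2 == 1) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] oddKeys = {15, 3, 23};
        System.out.println(oddBuckets(oddKeys, 14)); // 3: buckets 1, 3, 9 are all odd
        System.out.println(oddBuckets(oddKeys, 15)); // 1: buckets 0, 3, 8
    }
}
```

With m = 14 every odd key stays in an odd bucket, while with m = 15 the same keys spread over buckets of both parities.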


Bias

Something Notable
With an odd divisor, the bias in the keys does not result in a bias toward either the odd or even home buckets.

Then, we have
A better chance of uniformly distributed home buckets.

Thus
Do not use an even divisor.


Selecting The Divisor

Another Problem
A similarly biased distribution of home buckets is seen, in practice, when the divisor is a multiple of small prime numbers such as 3, 5, 7, ...

However
The effect of each prime divisor p of m decreases as p gets larger.

Rules for Choosing m
Ideally, choose m so that it is a prime number.
Not too close to a power of 2.


However

Something Notable
Even with this hash function, we can have problems.

Remember
The Gaussian keys...


java.util.Hashtable

Quite simple
It simply uses a divisor that is an odd number.

Simplify the implementation
This simplifies the implementation because we must be able to resize the hash table as more pairs are put into the dictionary.

Why?
Array doubling, for example, requires you to go from a 1D array table whose length is m (which is odd) to an array whose length is 2m + 1 (which is also odd).
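The growth rule m to 2m + 1 can be sketched as below. This only illustrates the arithmetic described on the slide; the starting length 11 and the names are assumptions made for the example, not a description of the library's source code.

```java
final class OddResize {
    // Growth rule from the slide: m -> 2m + 1 keeps an odd length odd.
    static int grow(int m) { return 2 * m + 1; }

    public static void main(String[] args) {
        int m = 11; // illustrative odd starting length
        for (int i = 0; i < 4; i++) {
            System.out.print(m + " ");
            m = grow(m);
        }
        // prints the lengths 11 23 47 95
    }
}
```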


A Better Solution: Universal Hashing

Issues
In practice, keys are not randomly distributed.
Any fixed hash function might yield O(n) retrieval time.

Goal
To find hash functions that produce uniform random table indexes irrespective of the keys.

Idea
To select a hash function at random from a designed class of functions at the beginning of the execution.


Hashing methods: Universal hashing

Example

[Figure: a hash function is chosen at random from a set of hash functions (at the beginning of the execution) and is then used to index the hash table.]


Example of Universal Hash

Proceed as follows:

Choose a prime number p large enough so that every possible key k is in the range [0, ..., p − 1]

$Z_p = \{0, 1, \ldots, p - 1\}$ and $Z_p^* = \{1, \ldots, p - 1\}$

Define the following hash function:

$h_{a,b}(k) = ((ak + b) \bmod p) \bmod m, \quad \forall a \in Z_p^* \text{ and } b \in Z_p$

The family of all such hash functions is:

$H_{p,m} = \{h_{a,b} : a \in Z_p^* \text{ and } b \in Z_p\}$

Important
a and b are chosen randomly at the beginning of execution.
The class $H_{p,m}$ of hash functions is universal.
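A minimal Java sketch of one member of this family, assuming non-negative integer keys smaller than p. The class name, the factory method and the use of java.util.Random are illustrative choices, not taken from the slides.

```java
final class UniversalHash {
    final long a, b, p; // long arithmetic so that a * k does not overflow for moderate p
    final int m;

    UniversalHash(long a, long b, long p, int m) {
        this.a = a; this.b = b; this.p = p; this.m = m;
    }

    // Draw a from {1, ..., p-1} and b from {0, ..., p-1} at random.
    static UniversalHash random(int p, int m, java.util.Random rng) {
        return new UniversalHash(1 + rng.nextInt(p - 1), rng.nextInt(p), p, m);
    }

    // h_{a,b}(k) = ((a*k + b) mod p) mod m, for keys 0 <= k < p.
    int hash(int k) {
        return (int) (((a * k + b) % p) % m);
    }

    public static void main(String[] args) {
        UniversalHash fixed = new UniversalHash(3, 4, 977, 50);
        System.out.println(fixed.hash(500)); // (1504 mod 977) mod 50 = 527 mod 50 = 27
        UniversalHash h = random(977, 50, new java.util.Random());
        System.out.println(h.hash(500));     // some bucket in [0, 50)
    }
}
```

The random draw happens once, when the table is created; every later lookup must reuse the same a and b.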


Example: Universal hash functions

Example
p = 977, m = 50, a and b random numbers

$h_{a,b}(k) = ((ak + b) \bmod p) \bmod m$


Example of key distribution

Example, mean = 488.5 and dispersion = 5


Example with 10 keys

Universal Hashing Vs Division Method


Example with 50 keys

Universal Hashing Vs Division Method


Example with 100 keys

Universal Hashing Vs Division Method


Example with 200 keys

Universal Hashing Vs Division Method


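Another Example: Matrix Method

Then
Let us say keys are u bits long.
Say the table size M is a power of 2.
An index is b bits long, with $M = 2^b$.

The h function
Pick h to be a random b-by-u 0/1 matrix, and define h(x) = hx, where each inner product is taken mod 2.

Example (b = 3, u = 4)

$$\underbrace{\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}}_{h} \underbrace{\begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}}_{x} = \underbrace{\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}}_{h(x)}$$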
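The matrix-vector product mod 2 can be sketched in Java by storing each matrix row as a bit mask. This sketch assumes u is at most 32 and uses the library method Integer.bitCount for the population count; the class and method names are hypothetical.

```java
final class MatrixHash {
    // rows[i] holds row i of the b-by-u 0/1 matrix as a bit mask (u <= 32).
    static int hash(int[] rows, int x) {
        int result = 0;
        for (int row : rows) {
            int bit = Integer.bitCount(row & x) & 1; // inner product mod 2
            result = (result << 1) | bit;            // first row becomes the top bit
        }
        return result;
    }

    public static void main(String[] args) {
        int[] h = {0b1000, 0b0111, 0b1110}; // the 3-by-4 matrix from the example
        int x = 0b1010;                     // the key from the example
        System.out.println(Integer.toBinaryString(hash(h, x))); // prints 110
    }
}
```

In a real table the rows would be drawn at random once, when the table is created.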


Implementation of the row*vector product mod 2

Code - SWAR-Popcount - Divide and Conquer

    // This works only for 32-bit ints
    int product(int row, int vector) {
        int i = row & vector; // bits set in both the row and the key

        i = i - ((i >>> 1) & 0x55555555);                         // popcount of each 2-bit field
        i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);          // sums per 4-bit field
        i = (((i + (i >>> 4)) & 0x0F0F0F0F) * 0x01010101) >>> 24; // total in the top byte

        return i & 0x00000001; // parity of the popcount = inner product mod 2
    }


Explanation

Counting Duo-Bits
A bit-duo (two neighboring bits) can be interpreted, with bit 0 = a and bit 1 = b, as

duo := 2b + a

The duo population is

popcnt(duo) := b + a

This can be achieved by
(2b + a) − (2b + a) ÷ 2, or equivalently (2b + a) − b, where ÷ is integer division.


We have then

Something Notable

i    i div 2   popcnt(i)
00   00        00
01   00        01
10   01        01
11   01        10

We have then
Only the lower bit is needed from i div 2, and one does not have to worry about borrows from neighboring duos.

We need to clear the upper bits
Remember that 5 in binary is 0101 (granularity problem): we use the 0s of the mask to remove the upper bit of each duo.

Thus, we have that

Clearing Bits
SWAR-wise, one needs to clear the upper bit of every duo in the div 2 subtrahend (mask 0x55555555) to perform a 32-bit subtraction of all 16 duos:

    x = x - ((x >> 1) & 0x55555555);

Note
The popcount result of each bit-duo still takes two bits.

Now what?

Counting Nibble-Bits

We have
The next step is to add the duo-counts to get the populations of four neighboring bits, the 8 nibble-counts, each of which may range from zero to four.

We can do this using 3 in binary
0x3 = 0011

How?

How many ones can you have in 4 bits?
From 0 to $2^2 = 4$.

Thus, we only need 4 bits to store the result
Each nibble already holds two 2-bit duo counters, and we add them together.

This is done using the mask
After all, how many ones can you have in two bits?
    - From 0 to 2, so the lower 2 bits are enough to represent each duo count!!!

This is the instruction
    i = (i & 0x33333333) + ((i >> 2) & 0x33333333);
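The first two SWAR steps can be sketched as separate functions, which makes it easy to inspect the packed counters after each step. This is only an illustration under the 32-bit assumption of the slides; the names `duo_counts` and `nibble_counts` are hypothetical.

```c
#include <stdint.h>

/* Step 1: every 2-bit field of the result holds the popcount of that duo. */
uint32_t duo_counts(uint32_t x)
{
    return x - ((x >> 1) & 0x55555555u);
}

/* Step 2: every 4-bit field of the result holds the popcount of that nibble (0..4). */
uint32_t nibble_counts(uint32_t x)
{
    uint32_t i = duo_counts(x);
    return (i & 0x33333333u) + ((i >> 2) & 0x33333333u);
}
```

For instance, the nibble 1111 becomes 1010 after the first step (two duos with count 2 each) and 0100 after the second (count 4).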

Example

We have (mask 3 = 0011)

Old i   new i   new i & 3   new i >> 2   (new i >> 2) & 3   Final i
0000    0000    0000        0000         0000               0000
0001    0001    0001        0000         0000               0001
0010    0001    0001        0000         0000               0001
0011    0010    0010        0000         0000               0010
0100    0100    0000        0001         0001               0001
0101    0101    0001        0001         0001               0010
1111    1010    0010        0010         0010               0100

We have the following

Now what?
You already got the idea? Now it is about getting the byte-populations from pairs of nibble-populations (8 bits).

Do you remember your hexadecimal notation?
0x0f = 00001111

How many ones can you have in 8 positions?
From 0 to $2^3 = 8$, so you only require 4 bits for this!!!

Thus the next step

It will get the counter for the number of ones in 8 bits
    (i + (i >> 4)) & 0x0F0F0F0F;

Example

We have

Old x      2 bits     4 bits
11110000   10100000   01000000
11111111   10101010   01000100

Now we get
1. i = 01000000: i >> 4 = 00000100, i + (i >> 4) = 01000100, and 01000100 & 00001111 = 00000100
2. i = 01000100: i >> 4 = 00000100, i + (i >> 4) = 01001000, and 01001000 & 00001111 = 00001000

We can keep doing this

It is possible to keep going
    i = (i + (i >> 8)) & 0x00ff00ff;

Then
    i = (i + (i >> 16)) & 0x000000ff;
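Putting the steps together without any multiplication gives the following sketch of a full 32-bit population count. The function name `popcount32_shift_add` is an assumption made for this illustration.

```c
#include <stdint.h>

/* Population count of a 32-bit word using only shifts, masks and additions. */
unsigned popcount32_shift_add(uint32_t i)
{
    i = i - ((i >> 1) & 0x55555555u);                 /* 16 duo counts     */
    i = (i & 0x33333333u) + ((i >> 2) & 0x33333333u); /* 8 nibble counts   */
    i = (i + (i >> 4)) & 0x0F0F0F0Fu;                 /* 4 byte counts     */
    i = (i + (i >> 8)) & 0x00FF00FFu;                 /* 2 half-word sums  */
    i = (i + (i >> 16)) & 0x000000FFu;                /* final count 0..32 */
    return i;
}
```

Each line doubles the width of the fields being summed, which is the divide-and-conquer pattern: 2, 4, 8, 16 and finally 32 bits.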

With today's fast 64-bit multiplications

Something Notable
In the 64-bit version, one can multiply the vector of 8 byte-counts by 0x0101010101010101 to get the final result in the most significant byte. In the 32-bit code used here, the 4 byte-counts are multiplied by 0x01010101 instead.

For this the code
    (((i + (i >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24
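For comparison, a 64-bit variant of the multiplication trick might look like the sketch below. It assumes `uint64_t` is available; the function name `popcount64` is illustrative, and the shift becomes 56 because the result sits in the top byte of a 64-bit word.

```c
#include <stdint.h>

/* Population count of a 64-bit word; the multiply sums the 8 byte counts. */
unsigned popcount64(uint64_t x)
{
    x = x - ((x >> 1) & 0x5555555555555555ull);
    x = (x & 0x3333333333333333ull) + ((x >> 2) & 0x3333333333333333ull);
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0Full;
    /* Each byte holds at most 8, so the sum (at most 64) fits in the top byte. */
    return (unsigned)((x * 0x0101010101010101ull) >> 56);
}
```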

Divide and Conquer

This is the natural way we do many things
We always attack smaller versions of the large problem first!!!

Overflow Handling

Overflow
An overflow occurs when the home bucket for a new pair (key, element) is full.

One strategy to handle overflow, small universe of keys
Search the hash table in some systematic fashion for a bucket that is not full.
    - Linear probing (linear open addressing).
    - Quadratic probing.
    - Random probing.

The other strategy, a large universe of keys
Eliminate overflows by permitting each bucket to keep a list of all pairs for which it is the home bucket.
    - Array linear list.
    - Chain.


Open addressing

Definition
All the elements occupy the hash table itself.

What is it?
We systematically examine table slots until either we find the desired element or we have ascertained that the element is not in the table.

Advantages
The advantage of open addressing is that it avoids pointers altogether.

Insert in Open addressing

Extended hash function to probe
Instead of the probe order being fixed as 0, 1, 2, ..., m − 1, which gives Θ(n) search time, extend the hash function to
    h : U × {0, 1, ..., m − 1} → {0, 1, ..., m − 1}
This gives the probe sequence ⟨h(k, 0), h(k, 1), ..., h(k, m − 1)⟩
    - A permutation of ⟨0, 1, 2, ..., m − 1⟩

Hashing methods in Open Addressing

HASH-INSERT(T, k)
    i = 0
    repeat
        j = h(k, i)
        if T[j] == NIL
            T[j] = k
            return j
        else i = i + 1
    until i == m
    error "Hash Table Overflow"

HASH-SEARCH(T, k)
    i = 0
    repeat
        j = h(k, i)
        if T[j] == k
            return j
        i = i + 1
    until T[j] == NIL or i == m
    return NIL
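The two procedures HASH-INSERT and HASH-SEARCH translate almost line by line into C. The sketch below makes several assumptions that are not in the slides: integer keys that are non-negative, a fixed table size `TABLE_SIZE`, the value `NIL_KEY` to mark empty slots, and a placeholder probe function `probe` (linear probing here, but any h(k, i) that yields a permutation of the slots would do).

```c
#define TABLE_SIZE 8     /* m, chosen only for this sketch */
#define NIL_KEY (-1)     /* marks an empty slot; keys must be non-negative */

/* Placeholder for h(k, i); here simply (k mod m + i) mod m. */
static int probe(int k, int i)
{
    return (k % TABLE_SIZE + i) % TABLE_SIZE;
}

/* Marks every slot as empty. */
void hash_init(int T[TABLE_SIZE])
{
    for (int j = 0; j < TABLE_SIZE; j++)
        T[j] = NIL_KEY;
}

/* Returns the slot where k was stored, or -1 on hash table overflow. */
int hash_insert(int T[TABLE_SIZE], int k)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        int j = probe(k, i);
        if (T[j] == NIL_KEY) {
            T[j] = k;
            return j;
        }
    }
    return -1;
}

/* Returns the slot holding k, or -1 if k is not in the table. */
int hash_search(const int T[TABLE_SIZE], int k)
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        int j = probe(k, i);
        if (T[j] == k)
            return j;
        if (T[j] == NIL_KEY)
            break;          /* an empty slot ends the probe sequence */
    }
    return -1;
}
```

Returning -1 instead of raising an error or returning NIL is a choice made for the sketch, since C has no exceptions and slot 0 is a valid index.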

Linear probing: Definition and properties

Hash function
Given an ordinary hash function h′ : U → {0, 1, ..., m − 1}, for i = 0, 1, ..., m − 1 we get the extended hash function

$$h(k, i) = \left(h'(k) + i\right) \bmod m \qquad (2)$$

Sequence of probes
Given key k, we first probe T[h′(k)], then T[h′(k) + 1], and so on until T[m − 1]. Then we wrap around to T[0] and continue up to T[h′(k) − 1].

Distinct probes
Because the initial probe determines the entire probe sequence, there are only m distinct probe sequences.

Linear probing: Definition and properties

Disadvantages
Linear probing suffers from primary clustering.
Long runs of occupied slots build up, increasing the average search time.
    - Clusters arise because an empty slot preceded by i full slots gets filled next with probability $\frac{i+1}{m}$.
Long runs of occupied slots tend to get longer, and the average search time increases.

Example: Linear Probing – Get And Put

Constraints
divisor = m (number of buckets) = 17.
Home bucket = key % 17.

Then
Put in pairs whose keys are 6, 12, 34, 29, 28, 11, 23, 7, 0, 33, 30, 45

We have
[Figure: the hash table after the insertions]
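The insertions of this example can be replayed with a few lines of C. The sketch assumes non-negative integer keys, uses `EMPTY_SLOT` as a hypothetical marker for an unused bucket, and the function name `build_table` is illustrative.

```c
#define BUCKETS 17
#define EMPTY_SLOT (-1)

/* Fills T by inserting keys[0..n-1] with h(k, i) = (k % BUCKETS + i) % BUCKETS. */
void build_table(int T[BUCKETS], const int *keys, int n)
{
    for (int j = 0; j < BUCKETS; j++)
        T[j] = EMPTY_SLOT;
    for (int t = 0; t < n; t++) {
        for (int i = 0; i < BUCKETS; i++) {
            int j = (keys[t] % BUCKETS + i) % BUCKETS;
            if (T[j] == EMPTY_SLOT) {
                T[j] = keys[t];
                break;
            }
        }
    }
}
```

Working the probes by hand for the key sequence above gives bucket 0: 34, 1: 0, 2: 45, 6: 6, 7: 23, 8: 7, 11: 28, 12: 12, 13: 29, 14: 11, 15: 30, 16: 33, while buckets 3 to 5, 9 and 10 stay empty. Note how key 45, whose home bucket is 11, ends up in bucket 2 after wrapping around the end of the table.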

Linear Probing – Remove

Examples
remove(0), remove(34), remove(29)
[Figures: the table before and after each removal]

Compact Cluster
After each removal, search the cluster for a pair (if any) to fill the vacated bucket.

Code for Removing

We have the following

    public void remove(int key) {
        int positionsChecked = 1;
        int i = FindSlot(key);
        if (Table[i] == null)
            return; // key is not in the table
        int j = i;
        while (positionsChecked <= Table.length) {
            j = (j + 1) % Table.length;
            if (Table[j] == null) break;
            int k = Hashing(Table[j].key); // home bucket of the record at j
            // Move the record at j into the vacancy at i,
            // unless k lies cyclically in (i, j]
            if ((i < j && (k <= i || k > j)) ||
                (j < i && (k <= i && k > j))) {
                Table[i] = Table[j];
                i = j;
            }
            positionsChecked++;
        }
        Table[i] = null;
    }

Explanation

First
For all records in a cluster, there must be no vacant slots between their natural hash position and their current position (else lookups will terminate before finding the record).

Second
k is the raw hash, that is, the slot where the record at j would naturally land in the hash table if there were no collisions.

Thus
The test asks whether the record at j is invalidly positioned with respect to the required properties of a cluster now that i is vacant.

Case 1

We have i < j
[Figure: positions of i, j and k in the table]
If i < k ≤ j, then moving the record at j to position i would be incorrect... Why? A search for that record starts at k, which lies after i, so it would reach the now-vacant slot j and stop before ever visiting i.

Case 2

We have j < i
[Figure: positions of i, j and k in the table, with the cluster wrapping around the end]
If k ≤ j or i < k, then moving the record at j to position i would be incorrect... Why? For the same reason: a search starting at k would hit the now-vacant slot j before visiting i.
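The removal with cluster compaction can also be sketched as self-contained C, which makes the two cases easy to test. The assumptions here are not from the slides: non-negative integer keys, home bucket key % 17, `FREE_SLOT` as the empty marker, and the helper names `home`, `find_slot`, `lp_insert` and `lp_remove`.

```c
#define SLOTS 17
#define FREE_SLOT (-1)

static int home(int key) { return key % SLOTS; }

/* Returns the slot of key, or the empty slot where the search stopped;
   -1 only if the table is full and the key is absent. */
static int find_slot(const int T[SLOTS], int key)
{
    int j = home(key);
    for (int n = 0; n < SLOTS; n++) {
        if (T[j] == FREE_SLOT || T[j] == key)
            return j;
        j = (j + 1) % SLOTS;
    }
    return -1;
}

void lp_insert(int T[SLOTS], int key)
{
    int j = find_slot(T, key);
    if (j >= 0)
        T[j] = key;
}

void lp_remove(int T[SLOTS], int key)
{
    int i = find_slot(T, key);
    if (i < 0 || T[i] == FREE_SLOT)
        return;                       /* key is not in the table */
    int j = i;
    for (int checked = 1; checked < SLOTS; checked++) {
        j = (j + 1) % SLOTS;
        if (T[j] == FREE_SLOT)
            break;                    /* end of the cluster */
        int k = home(T[j]);
        /* Move T[j] back unless its home k lies cyclically in (i, j]. */
        if ((i < j && (k <= i || k > j)) || (j < i && (k <= i && k > j))) {
            T[i] = T[j];
            i = j;                    /* the vacancy moves to j */
        }
    }
    T[i] = FREE_SLOT;
}
```

On the 17-bucket example, removing 0 from slot 1 lets 45 (home bucket 11) move from slot 2 back to slot 1, so that a later search for 45 still succeeds.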

Complexity

We have then
Worst-case get/put/remove time is Θ(n), where n is the number of pairs in the table.

Something Notable
This happens when all pairs are in the same cluster.

Explaining α

We have that
α = loading density = (number of pairs)/m.
In the example α = 12/17.

We have the following terms to explain complexity in Open Addressing
$S_n$ = expected number of buckets examined in a successful search when n is large.
$U_n$ = expected number of buckets examined in an unsuccessful search when n is large.

Expected Performance in Open Addressing, 0 < α < 1

We have for unsuccessful search

$$U_n = O\left(1 + \frac{1}{1-\alpha}\right) \qquad (3)$$

We have for successful search

$$S_n = O\left(1 + \frac{1}{\alpha}\ln\frac{1}{1-\alpha}\right) \qquad (4)$$

Thus, for different values of α

We have using uniform keys!!!

α      Sn     Un
0.50   1.5    2.5
0.75   2.5    8.5
0.90   5.5    50.5

Recommendation
α ≤ 0.75 is recommended.
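The entries of this table agree with the classical large-n estimates for linear probing, $S_n \approx \frac{1}{2}\left(1 + \frac{1}{1-\alpha}\right)$ and $U_n \approx \frac{1}{2}\left(1 + \frac{1}{(1-\alpha)^2}\right)$. That these are the formulas behind the table is an assumption, since the slides do not say so. Under that assumption the values can be recomputed with the sketch below; the function names are illustrative.

```c
/* Large-n estimate for a successful search under linear probing
   (assumed formula, not stated on the slides). Requires 0 <= alpha < 1. */
double expected_successful(double alpha)
{
    return 0.5 * (1.0 + 1.0 / (1.0 - alpha));
}

/* Large-n estimate for an unsuccessful search under linear probing
   (assumed formula, not stated on the slides). Requires 0 <= alpha < 1. */
double expected_unsuccessful(double alpha)
{
    double d = 1.0 - alpha;
    return 0.5 * (1.0 + 1.0 / (d * d));
}
```

The steep growth of the unsuccessful-search estimate as α approaches 1 is what motivates the recommendation α ≤ 0.75.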

Hash Table Design

For example
If performance requirements are given, determine the maximum permissible loading density.

We want an unsuccessful search to make no more than 5 compares (expected). Using the bound in (3):

$$5 = 1 + \frac{1}{1-\alpha} \quad\Rightarrow\quad \alpha = 3/4$$

We want a successful search to make no more than 6 compares (expected). Using the bound in (4):

$$6 = 1 + \frac{1}{\alpha}\log\frac{1}{1-\alpha} \quad\Rightarrow\quad \alpha \approx 0.964$$

Here log is taken base 2; with the natural logarithm written in (4), the solution is α ≈ 0.993.

Use the minimum α for your trigger of doubling the array!!!

Thus, we have

$$\alpha_f \le \min\{3/4,\ 0.964\} = 3/4$$

Hash Table Design

Dynamic resizing of table
Whenever the loading density exceeds the threshold, rehash into a table of approximately twice the current size.
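A resizing policy of this kind might be sketched as follows for a linear-probing table of non-negative integer keys. Everything specific here is an assumption made for illustration: the threshold 3/4, the new size 2m + 1 as "approximately twice", the `Table` struct, and the names `table_init`, `table_insert` and `place`.

```c
#include <stdlib.h>

#define VACANT (-1)

typedef struct {
    int *slots;   /* VACANT or a non-negative key */
    int m;        /* number of buckets */
    int n;        /* number of stored keys */
} Table;

/* Linear-probing placement; the caller guarantees a vacant slot exists. */
static void place(int *slots, int m, int key)
{
    int j = key % m;
    while (slots[j] != VACANT)
        j = (j + 1) % m;
    slots[j] = key;
}

/* Returns 0 on success, -1 if memory could not be allocated. */
int table_init(Table *t, int m)
{
    t->slots = malloc((size_t)m * sizeof *t->slots);
    if (t->slots == NULL)
        return -1;
    for (int j = 0; j < m; j++)
        t->slots[j] = VACANT;
    t->m = m;
    t->n = 0;
    return 0;
}

/* Inserts key (assumed absent); rehashes first if alpha would exceed 3/4. */
int table_insert(Table *t, int key)
{
    if (4 * (t->n + 1) > 3 * t->m) {
        int new_m = 2 * t->m + 1;
        int *bigger = malloc((size_t)new_m * sizeof *bigger);
        if (bigger == NULL)
            return -1;
        for (int j = 0; j < new_m; j++)
            bigger[j] = VACANT;
        for (int j = 0; j < t->m; j++)      /* rehash every stored key */
            if (t->slots[j] != VACANT)
                place(bigger, new_m, t->slots[j]);
        free(t->slots);
        t->slots = bigger;
        t->m = new_m;
    }
    place(t->slots, t->m, key);
    t->n++;
    return 0;
}
```

Every key must be rehashed because its home bucket depends on m; copying slots over verbatim would break later searches.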

But, even with a nice division method!!!

Example using keys uniformly distributed
It was generated using the division method.

Then
[Figure: result for uniformly distributed keys]

Example

Example using Gaussian keys
It was generated using the division method.

Then
[Figure: result for Gaussian keys]


Linear List Of Synonyms

Thus
Each bucket keeps a linear list of all pairs for which it is the home bucket.
The linear list may or may not be sorted by key.
The linear list may be an array linear list or a chain.

Collision Handling: Chaining

A Possible Solution
Insert the elements that hash to the same slot into a linked list.

[Figure: U (Universe of Keys)]

Example Sorted Chains

Add to a hash table with m = 11
Put in pairs whose keys are 6, 17, 12, 23, 28, 5, 16, 3, 8

So, we have
Home bucket = key % 11.

Example

The Table
Keys are placed one at a time, taken from the end of the list: 8, 3, 16, 5, 28, 23, 12, 17, 6. Each chain stays sorted by key.

Bucket   Chain
0        (empty)
1        12 -> 23
2        (empty)
3        3
4        (empty)
5        5 -> 16
6        6 -> 17 -> 28
7        (empty)
8        8
9        (empty)
10       (empty)

Do You Remember This?

Universal Hashing Vs Division Method

Expected Complexity of Hash Table under Chaining

We have for unsuccessful search

U_n = O(1 + α)    (5)

We have for successful search

S_n = O(1 + α)    (6)
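
To get a feel for these bounds, the sketch below measures chain lengths for the nine-key example with m = 11, taking α = n/m as the loading density. It counts only chain nodes examined, and it is a single concrete instance rather than the expectation that the bounds describe; the variable names are assumptions for illustration.

```python
M = 11
keys = [6, 17, 12, 23, 28, 5, 16, 3, 8]

chains = [[] for _ in range(M)]
for key in keys:
    chains[key % M].append(key)

alpha = len(keys) / M  # loading density n / m

# Unsuccessful search: if every bucket is equally likely, the whole chain of
# that bucket is walked, so the average cost is the average chain length.
avg_unsuccessful = sum(len(c) for c in chains) / M

# Successful search: average position of a stored key within its chain.
avg_successful = sum(
    pos for c in chains for pos in range(1, len(c) + 1)
) / len(keys)

print(round(alpha, 3))             # 0.818
print(round(avg_unsuccessful, 3))  # 0.818
print(round(avg_successful, 3))    # 1.556
```

Both averages stay below 1 + α ≈ 1.818 for this table, which is consistent with (5) and (6).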

Now, Back to the real world

java.util.Hashtable
It uses unsorted chains.
It uses a default initial m = divisor = 11 (early JDK releases used 101).
It uses a default α ≤ 0.75.
When the loading density exceeds the maximum permissible threshold, it rehashes with new m = 2m + 1.
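
The growth policy can be sketched as follows. This is a Python illustration of the rule described above, not the JDK source: the class name RehashingTable, the exact moment at which the threshold is checked, and the use of Python lists as chains are all assumptions.

```python
class RehashingTable:
    """Chained table that grows to 2m + 1 buckets when it gets too full."""

    def __init__(self, m=11, max_load=0.75):
        self.m = m
        self.max_load = max_load
        self.n = 0  # number of stored pairs
        self.buckets = [[] for _ in range(m)]

    def _index(self, key):
        return hash(key) % self.m

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # existing key: no growth needed
                return
        # Assumed policy: grow first if this insertion would push the
        # loading density above the permitted maximum.
        if (self.n + 1) / self.m > self.max_load:
            self._rehash()
            bucket = self.buckets[self._index(key)]
        bucket.append((key, value))
        self.n += 1

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None

    def _rehash(self):
        old = self.buckets
        self.m = 2 * self.m + 1
        self.buckets = [[] for _ in range(self.m)]
        for bucket in old:
            for key, value in bucket:  # every pair gets a new home bucket
                self.buckets[self._index(key)].append((key, value))


t = RehashingTable()
for key in [6, 17, 12, 23, 28, 5, 16, 3]:
    t.put(key, str(key))
print(t.m)   # 11, since 8 / 11 is about 0.73
t.put(8, "8")
print(t.m)   # 23, since 9 / 11 would exceed 0.75
```

Rehashing touches every stored pair, so it is expensive when it happens, but because the table roughly doubles each time the cost is spread over many insertions.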
