Hashing in C CSCI2100A Data Structures Tutorial Jiani,ZHANG [email protected]

Hashing in C - cse.cuhk.edu.hk fileContents •Hash function •Collision resolutions –Separate Chaining (Open hashing) –Open addressing (Closed Hashing) •Linear probing •Quadratic

Aug 28, 2019


Hashing in C

CSCI2100A Data Structures Tutorial

Jiani,ZHANG

[email protected]


Contents

• Hash function

• Collision resolutions

  – Separate Chaining (Open hashing)

  – Open addressing (Closed Hashing)

    • Linear probing

    • Quadratic probing

    • Random probing

    • Double hashing

3/7/2016 2


Hashing in C

• One of the biggest drawbacks of a language like C is that it has no built-in keyed arrays.

  – Arrays can only be accessed by integer index, e.g. city[5];

  – Values cannot be accessed directly by key, e.g. city["California"];


Hashing - hash function

• Hash function

  – A mapping function that maps a key to a number in the range 0 to TableSize - 1

/* Hash function for ints (assumes integer_key >= 0) */
int hashfunc(int integer_key)
{
    return integer_key % HASHTABLESIZE;
}

• However, collisions cannot be avoided: whenever there are more possible keys than table slots, two different keys can map to the same number.
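
To make the collision problem concrete, here is a minimal sketch. The table size of 10 and the helper `collides` are assumptions made for illustration; they are not part of the tutorial's code.

```c
#include <assert.h>

#define HASHTABLESIZE 10   /* assumed table size, chosen for illustration */

/* Map a non-negative integer key to a slot index in 0..HASHTABLESIZE-1. */
int hashfunc(int integer_key)
{
    /* In C, % can give a negative result for a negative left operand. */
    assert(integer_key >= 0);
    return integer_key % HASHTABLESIZE;
}

/* Hypothetical helper: 1 if two distinct keys land in the same slot. */
int collides(int a, int b)
{
    return a != b && hashfunc(a) == hashfunc(b);
}
```

With this table size, `hashfunc(22)` and `hashfunc(62)` both give 2, so `collides(22, 62)` reports a collision, while 22 and 84 land in different slots.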


Hashing - separate chaining

• If two keys map to the same index, the elements are chained together in a linked list stored at that index


Hashing - example

• Insert the following four keys 22 84 35 62 into hash table of size 10 using separate chaining.

• The hash function is key % 10

  – Initial hash table

  – After insert 22: 22 % 10 = 2

  – After insert 84: 84 % 10 = 4

  – After insert 35: 35 % 10 = 5

  – After insert 62: 62 % 10 = 2 (same slot as 22, so 62 joins the chain in slot 2)
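
The four insertions above could be sketched in C roughly as follows. The type and function names (`chained_table`, `chain_insert`, and so on) are hypothetical, and inserting at the head of each list is an assumption; the slides do not say where a new element is placed in the chain.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLESIZE 10

typedef struct node {
    int key;
    struct node *next;
} node;

typedef struct {
    node *bucket[TABLESIZE];   /* each slot heads a linked list; NULL = empty */
} chained_table;

void chain_init(chained_table *t)
{
    for (int i = 0; i < TABLESIZE; i++)
        t->bucket[i] = NULL;
}

/* Insert key at the head of its slot's list. Returns 0, or -1 if malloc fails. */
int chain_insert(chained_table *t, int key)
{
    assert(key >= 0);
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    int slot = key % TABLESIZE;
    n->key = key;
    n->next = t->bucket[slot];
    t->bucket[slot] = n;
    return 0;
}

/* Return 1 if key is stored in the table, 0 otherwise. */
int chain_contains(const chained_table *t, int key)
{
    assert(key >= 0);
    for (const node *n = t->bucket[key % TABLESIZE]; n != NULL; n = n->next)
        if (n->key == key)
            return 1;
    return 0;
}

/* Number of elements chained in one slot. */
int chain_length(const chained_table *t, int slot)
{
    int len = 0;
    for (const node *n = t->bucket[slot]; n != NULL; n = n->next)
        len++;
    return len;
}

/* Release every list node and leave all slots empty. */
void chain_free(chained_table *t)
{
    for (int i = 0; i < TABLESIZE; i++) {
        while (t->bucket[i] != NULL) {
            node *next = t->bucket[i]->next;
            free(t->bucket[i]);
            t->bucket[i] = next;
        }
    }
}
```

After inserting 22, 84, 35 and 62, slot 2 holds a chain of two elements (22 and 62), and slots 4 and 5 hold one element each.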


Hashing

• Open addressing

– Open addressing hash tables store the records directly within the array.

– A hash collision is resolved by probing, or searching through alternate locations in the array.

• Linear probing

• Quadratic probing

• Random probing

• Double hashing


Hashing - Open addressing

#define HASHTABLESIZE 51

typedef struct
{
    int key[HASHTABLESIZE];
    char state[HASHTABLESIZE];   /* -1=lazy delete, 0=empty, 1=occupied */
} hashtable;

/* The hash function */
int hash(int input)
{
    return input % HASHTABLESIZE;
}


Hashing - Open addressing

• Open addressing

  – If a collision occurs, alternative cells h0(X), h1(X), h2(X), ... are tried, where

    hk(X) = (Hash(X) + F(k)) mod TableSize

  – Linear probing: F(k) = k

  – Quadratic probing: F(k) = k^2

  – Double hashing: F(k) = k * Hash2(X)
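
As an illustration of the general form hk(X) = (Hash(X) + F(k)) mod TableSize, the sketch below passes F in as a function pointer. The names `offset_fn`, `probe`, `f_linear` and `f_quadratic`, and the table size of 17, are assumptions made for this example. Double hashing is left out here because its F also depends on X.

```c
#include <assert.h>

#define TABLESIZE 17   /* assumed table size */

/* F(k): offset added to the home slot on the k-th probe. */
typedef int (*offset_fn)(int k);

static int hash(int x) { return x % TABLESIZE; }

int f_linear(int k)    { return k; }
int f_quadratic(int k) { return k * k; }

/* hk(X) = (Hash(X) + F(k)) mod TableSize */
int probe(int x, int k, offset_fn F)
{
    assert(x >= 0 && k >= 0);
    return (hash(x) + F(k)) % TABLESIZE;
}
```

For example, `probe(24, 1, f_linear)` gives 8 and `probe(73, 2, f_quadratic)` gives 9.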


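Hashing - Open addressing

/* Needs <stdio.h> and <stdlib.h>; uses hashtable, hash() and the probe function h(). */
void open_addressing_insert(int item, hashtable *ht)
{
    int hash_value = hash(item);
    int i = hash_value;
    int k = 1;

    /* state: -1=lazy delete, 0=empty, 1=occupied */
    while (ht->state[i] != 0) {
        /* only an occupied slot holds a live key */
        if (ht->state[i] == 1 && ht->key[i] == item) {
            fprintf(stderr, "Duplicate entry\n");
            exit(1);
        }
        i = h(k++, item);
        if (i == hash_value) {
            fprintf(stderr, "The table is full\n");
            exit(1);
        }
    }
    ht->key[i] = item;
    ht->state[i] = 1;   /* mark the slot as occupied */
}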


Linear probing

• F(k) = k

  – hk(X) = (Hash(X) + k) mod TableSize

    • h0(X) = (Hash(X) + 0) mod TableSize,

    • h1(X) = (Hash(X) + 1) mod TableSize,

    • h2(X) = (Hash(X) + 2) mod TableSize,

    • ......


Hashing - Open addressing

• Linear probing example (table size 17)

  – Initial hash table

  – Insert 7 at h0(7) = (7 mod 17) = 7

  – Insert 36 at h0(36) = (36 mod 17) = 2

  – Insert 24: h0(24) = (24 mod 17) = 7 is occupied, so we call h1(24) = ((24 + 1) mod 17) = 8

  – Insert 75: h0(75) = (75 mod 17) = 7 and h1(75) = ((75 + 1) mod 17) = 8 are occupied, so 75 goes to h2(75) = ((75 + 2) mod 17) = 9

  – Delete 24 -> lazy deletion technique


Lazy Deletion

• We need to be careful when removing elements from the table: emptying a cell leaves a hole, and a later search that reaches the hole stops early and misses elements stored further along the probe sequence.

• Lazy Deletion:

  – Do not delete the element; instead, place a marker in its cell to indicate that an element that was there is now removed.

  – When we are looking for things, we jump over the “dead bodies” until we find the element or we run into an empty cell.

• Drawback

  – Space cost: marked cells still take up room in the table.


Hashing - Open addressing

• Linear probing example

– Find 75: h0(75) = (75 mod 17) = 7 (occupied by another key), 8 (lazy delete, skip), 9 (Get it!)
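
The insert, delete and find steps of this example can be traced with a small sketch like the one below. The names (`lp_table`, `lp_insert`, `lp_find`, `lp_delete`) are hypothetical. The sketch assumes distinct, non-negative keys and, like the insert routine shown earlier, does not reuse lazily deleted slots.

```c
#include <assert.h>
#include <string.h>

#define TABLESIZE 17

enum { DELETED = -1, EMPTY = 0, OCCUPIED = 1 };

typedef struct {
    int key[TABLESIZE];
    signed char state[TABLESIZE];   /* -1=lazy delete, 0=empty, 1=occupied */
} lp_table;

/* All-zero state means every slot is EMPTY. */
void lp_init(lp_table *t) { memset(t, 0, sizeof *t); }

/* Insert with linear probing. Returns the slot used, or -1 if no empty slot exists. */
int lp_insert(lp_table *t, int key)
{
    assert(key >= 0);
    for (int k = 0; k < TABLESIZE; k++) {
        int i = (key % TABLESIZE + k) % TABLESIZE;
        if (t->state[i] == EMPTY) {
            t->key[i] = key;
            t->state[i] = OCCUPIED;
            return i;
        }
    }
    return -1;
}

/* Return the slot holding key, or -1. Deleted markers are stepped over;
   an empty slot ends the search. */
int lp_find(const lp_table *t, int key)
{
    assert(key >= 0);
    for (int k = 0; k < TABLESIZE; k++) {
        int i = (key % TABLESIZE + k) % TABLESIZE;
        if (t->state[i] == EMPTY)
            return -1;
        if (t->state[i] == OCCUPIED && t->key[i] == key)
            return i;
    }
    return -1;
}

/* Lazy deletion: mark the slot instead of emptying it. Returns 1 if key was present. */
int lp_delete(lp_table *t, int key)
{
    int i = lp_find(t, key);
    if (i < 0)
        return 0;
    t->state[i] = DELETED;
    return 1;
}
```

Inserting 7, 36, 24 and 75 places them in slots 7, 2, 8 and 9. After `lp_delete` on 24, slot 8 carries the marker, and `lp_find` on 75 still reaches slot 9 because the marker does not stop the search.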


Hashing - Open addressing

• Linear probing

/* The h function */
int h(int k, int input)
{
    return (hash(input) + k) % HASHTABLESIZE;
}

/* Probing loop from open_addressing_insert */
while (ht->state[i] != 0) {
    /* only an occupied slot holds a live key */
    if (ht->state[i] == 1 && ht->key[i] == item) {
        fprintf(stderr, "Duplicate entry\n");
        exit(1);
    }
    i = h(k++, item);   /* call the function */
    if (i == hash_value) {
        fprintf(stderr, "The table is full\n");
        exit(1);
    }
}


Hashing - Open addressing

• Quadratic probing

– F(k) = k^2

hk(X) = (Hash(X) + k^2) mod TableSize

h0(X) = (Hash(X) + 0^2) mod TableSize,

h1(X) = (Hash(X) + 1^2) mod TableSize,

h2(X) = (Hash(X) + 2^2) mod TableSize, ...


Hashing - Open addressing

• Quadratic probing

/* The h function */
int h(int k, int input)
{
    return (hash(input) + k * k) % HASHTABLESIZE;
}


Hashing - Open addressing

• Quadratic probing example (table size 17)

  – Initial hash table

  – Insert 5 at h0(5) = (5 mod 17) = 5

  – Insert 56: h0(56) = (56 mod 17) = 5 is occupied, so we call h1(56) = ((56 + 1*1) mod 17) = 6

  – Insert 73: h0(73) = (73 mod 17) = 5 and h1(73) = ((73 + 1*1) mod 17) = 6 are occupied, so 73 goes to h2(73) = ((73 + 2*2) mod 17) = 9

  – Insert 124: h0(124) = (124 mod 17) = 5, h1(124) = ((124 + 1*1) mod 17) = 6 and h2(124) = ((124 + 2*2) mod 17) = 9 are occupied, so 124 goes to h3(124) = ((124 + 3*3) mod 17) = 14
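
The quadratic probing example above can be checked with a short sketch. The names `qp_table`, `qp_init` and `qp_insert` are hypothetical, and the sketch assumes distinct, non-negative keys and no deletions.

```c
#include <assert.h>
#include <string.h>

#define TABLESIZE 17

typedef struct {
    int key[TABLESIZE];
    int used[TABLESIZE];   /* 0 = empty, 1 = occupied */
} qp_table;

void qp_init(qp_table *t) { memset(t, 0, sizeof *t); }

/* Insert with quadratic probing: try (Hash(X) + k*k) mod TableSize for k = 0, 1, 2, ...
   Returns the slot used, or -1 if TABLESIZE probes all hit occupied slots
   (which can happen before the table is completely full). */
int qp_insert(qp_table *t, int key)
{
    assert(key >= 0);
    for (int k = 0; k < TABLESIZE; k++) {
        int i = (key % TABLESIZE + k * k) % TABLESIZE;
        if (!t->used[i]) {
            t->key[i] = key;
            t->used[i] = 1;
            return i;
        }
    }
    return -1;
}
```

Inserting 5, 56, 73 and 124 in that order returns slots 5, 6, 9 and 14, matching the steps in the example.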


Hashing - Open addressing

• Random probing

  – Randomize(X)

    • h0(X) = Hash(X),

    • h1(X) = (h0(X) + RandomGen()) mod TableSize,

    • h2(X) = (h1(X) + RandomGen()) mod TableSize,

    • ......

  – Use Randomize(X) to ‘seed’ the random number generator using X

  – Each call of RandomGen() will return the next random number in the random sequence for seed X
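
One possible reading of this scheme in C uses `srand()` as Randomize(X) and `rand()` as RandomGen(). The sketch below is only an illustration under several assumptions: the names (`rp_table`, `rp_locate`, `rp_insert`, `rp_find`) are hypothetical, the step is forced into the range 1..TableSize-1 so that a probe never stays on the same slot, the probe count is capped at an arbitrary `MAXPROBES`, and there are no deletions. Note that calling `srand()` resets the program-wide generator state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLESIZE 17
#define MAXPROBES (4 * TABLESIZE)   /* arbitrary cap chosen for this sketch */

typedef struct {
    int key[TABLESIZE];
    int used[TABLESIZE];   /* 0 = empty, 1 = occupied */
} rp_table;

void rp_init(rp_table *t) { memset(t, 0, sizeof *t); }

/* Follow the random probe sequence for x and return the first slot that is
   empty or already holds x; -1 if none is found within MAXPROBES steps. */
static int rp_locate(const rp_table *t, int x)
{
    assert(x >= 0);
    srand((unsigned int)x);    /* Randomize(X): same key, same sequence */
    int i = x % TABLESIZE;     /* h0(X) = Hash(X) */
    for (int k = 0; k < MAXPROBES; k++) {
        if (!t->used[i] || t->key[i] == x)
            return i;
        /* next slot = (current slot + RandomGen()) mod TableSize, step in 1..TABLESIZE-1 */
        i = (i + 1 + rand() % (TABLESIZE - 1)) % TABLESIZE;
    }
    return -1;
}

/* Returns the slot where x is stored, or -1 if no slot was found. */
int rp_insert(rp_table *t, int x)
{
    int i = rp_locate(t, x);
    if (i >= 0) {
        t->key[i] = x;
        t->used[i] = 1;
    }
    return i;
}

/* Returns the slot holding x, or -1 if x is not in the table. */
int rp_find(const rp_table *t, int x)
{
    int i = rp_locate(t, x);
    return (i >= 0 && t->used[i]) ? i : -1;
}
```

Because the generator is reseeded with the key on every lookup, `rp_find` retraces exactly the slots that `rp_insert` visited for that key. The actual slot numbers after a collision depend on the platform's `rand()` implementation.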


Hashing - Open addressing

• Implement random probing using the random number generator in C

  – Pseudo-random number generator: rand()

  – Returns an integer between 0 and RAND_MAX

  – ‘Seed’ the randomizer

    • srand(unsigned int);

  – Use time as a ‘seed’

    • time(time_t *);

    • time(NULL);


Hashing - Open addressing

– Random number generation in C

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    int i;

    /* srand() should only be called once */
    srand((unsigned int)time(NULL));

    for (i = 0; i < 10; i++) {
        printf("%d\n", rand());
    }
    return 0;
}


Hashing - Open addressing

• Double hashing: F(k) = k * Hash2(X)

hk(X) = (Hash(X) + k * Hash2(X)) mod TableSize

h0(X) = (Hash(X) + 0 * Hash2(X)) mod TableSize,

h1(X) = (Hash(X) + 1 * Hash2(X)) mod TableSize,

h2(X) = (Hash(X) + 2 * Hash2(X)) mod TableSize, ...
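
The slides do not define Hash2, so the sketch below assumes one common choice, Hash2(X) = R - (X mod R) with R a prime smaller than the table size, which never evaluates to 0. The names (`dh_table`, `dh_insert`, `HASH2_PRIME`) and the table size of 17 are also assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

#define TABLESIZE   17
#define HASH2_PRIME 13   /* assumed: a prime smaller than TABLESIZE */

typedef struct {
    int key[TABLESIZE];
    int used[TABLESIZE];   /* 0 = empty, 1 = occupied */
} dh_table;

void dh_init(dh_table *t) { memset(t, 0, sizeof *t); }

static int hash1(int x) { return x % TABLESIZE; }

/* Assumed second hash: result is in 1..HASH2_PRIME, so the step is never 0. */
static int hash2(int x) { return HASH2_PRIME - (x % HASH2_PRIME); }

/* Insert with double hashing: try (Hash(X) + k * Hash2(X)) mod TableSize.
   Returns the slot used, or -1 if every probe hits an occupied slot. */
int dh_insert(dh_table *t, int x)
{
    assert(x >= 0);
    for (int k = 0; k < TABLESIZE; k++) {
        int i = (hash1(x) + k * hash2(x)) % TABLESIZE;
        if (!t->used[i]) {
            t->key[i] = x;
            t->used[i] = 1;
            return i;
        }
    }
    return -1;
}
```

With these choices, 7, 24 and 75 all hash to slot 7, but their step sizes differ (Hash2(24) = 2, Hash2(75) = 3), so 24 ends up in slot 9 and 75 in slot 10 instead of queuing up in neighbouring cells.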


Review

• Hash function

• Collision resolutions

  – Separate Chaining (Open hashing)

  – Open addressing (Closed Hashing)

    • Linear probing

    • Quadratic probing

    • Random probing

    • Double hashing


Thank you!