DATA STRUCTURES AND ALGORITHMS LAB 11 Bianca Tesila FILS, May 2014
Page 1: Data structures and algorithms lab11

DATA STRUCTURES AND ALGORITHMS

LAB 11

Bianca Tesila

FILS, May 2014

OBJECTIVES

Dictionaries

Hash tables

DICTIONARIES: WHAT ARE THEY?

An ADT made of a collection of keys and a collection of values, in which each key has a value associated with it

A dictionary is also called an associative array

Useful for searching

DICTIONARIES: OPTIMAL SEARCH

The keys must be unique

The range of the keys must be severely bounded

Otherwise, if the keys are not unique: construct a set of m lists (m = number of distinct keys) and store the heads of these lists in the associative array, indexed by key

DICTIONARIES: DUPLICATE KEYS

If we have a high number of duplicates (a lot of elements with the same key), the search time will increase severely

Solution: define a function h that optimizes the search criterion; collisions between keys must then be resolved

We will search for T[h(k)] rather than T[k], where T is our associative array, k is a key and h is a mapping function, so that h(k) is the index used in T
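A minimal sketch of the T[h(k)] idea, assuming unsigned integer keys and a hypothetical mapping h(k) = k mod m, where m is the table size. The slides do not prescribe a particular h; the names h, store and load are illustrative only, and collisions are deliberately ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mapping function: folds any key into the range [0, m).
std::size_t h(unsigned long k, std::size_t m) {
    assert(m > 0);  // a table needs at least one slot
    return static_cast<std::size_t>(k % m);
}

// Stores value under key k at slot T[h(k)] instead of T[k].
// No collision handling: a later key with the same h(k) overwrites the slot.
void store(std::vector<int>& T, unsigned long k, int value) {
    T[h(k, T.size())] = value;
}

// Reads the slot that key k maps to.
int load(const std::vector<int>& T, unsigned long k) {
    return T[h(k, T.size())];
}
```

With a table of 10 slots, the key 1234567 lands in slot 7, so T never needs 1234567 entries. Keys such as 7 and 17 map to the same slot, which is exactly the collision case that still has to be handled.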

DICTIONARIES: IMPLEMENTATION

Hash-tables

Self-balancing binary search trees

Radix-tree

Prefix-tree

Judy arrays

DICTIONARIES: BASIC OPERATIONS

put(key, value)
Inserts the pair (key, value) in the hash table
If a pair (key, value’) (with the same key) already exists, then value’ is replaced by value
We say that the value value is associated with the key key

get(key)
Returns the value associated with the key key
If no value is associated with key, then an error occurs

hasKey(key)
Returns 1 if the key key exists in the hash table, and 0 otherwise
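For comparison, the same three operations can be expressed on top of the standard library's std::unordered_map. This is only an illustration of the interface, not the lab's own implementation: the alias Dict, the wrapper functions and dictionaryDemo are hypothetical names, and "an error occurs" is modelled here by the std::out_of_range exception that at() throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>

using Dict = std::unordered_map<std::string, int>;

// put: inserts (key, value), or replaces the old value if key already exists.
void put(Dict& d, const std::string& key, int value) {
    d[key] = value;
}

// get: returns the value for key; at() throws std::out_of_range if absent.
int get(const Dict& d, const std::string& key) {
    return d.at(key);
}

// hasKey: 1 if key is present, 0 otherwise.
int hasKey(const Dict& d, const std::string& key) {
    return d.count(key) > 0 ? 1 : 0;
}

// Small usage example mirroring the three operations.
void dictionaryDemo() {
    Dict ages;
    put(ages, "ana", 20);
    put(ages, "ana", 21);  // same key: 20 is replaced by 21
    assert(get(ages, "ana") == 21);
    assert(hasKey(ages, "ana") == 1);
    assert(hasKey(ages, "bob") == 0);
}
```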

HASH-TABLES: INTRODUCTION

Data structure with an optimized lookup function (average search time is constant, O(1)).

How? By turning the key into a hash (code), using a hash function

The hash function must be wisely chosen in order to minimize the number of collisions (risk: different values produce the same hash).

We cannot avoid all collisions: they occur inherently, as the hash length is fixed while the stored objects can have arbitrary length and content.

In the event of a collision, the values are stored in the same position (the same bucket). In this case, the search is reduced to comparing the actual values in the bucket.

HASH-TABLES: EXAMPLE

HASH TABLE: HASH FUNCTIONS

Deterministic: if called twice with the same key, they should return the same value

Low collision rate: the buckets stay small

Good dispersion between “buckets”
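To make these properties concrete, here is a sketch of two hash functions for string keys. Both are assumptions chosen for illustration, not functions taken from the lab: hashString is a polynomial hash with multiplier 31, and weakHash simply adds up the characters. Both are deterministic, but weakHash disperses poorly because any two strings with the same letters in a different order always share a bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Polynomial hash: the position of each character influences the result.
// Unsigned arithmetic wraps around on overflow, which is well defined.
std::size_t hashString(const std::string& key, std::size_t hmax) {
    assert(hmax > 0);
    std::size_t h = 0;
    for (unsigned char c : key) {
        h = h * 31 + c;
    }
    return h % hmax;  // bucket index in [0, hmax)
}

// Weak hash: sums the characters, so "ab" and "ba" always collide.
std::size_t weakHash(const std::string& key, std::size_t hmax) {
    assert(hmax > 0);
    std::size_t h = 0;
    for (unsigned char c : key) {
        h += c;
    }
    return h % hmax;
}
```

With 16 buckets, hashString sends "ab" and "ba" to different buckets, while weakHash puts both in the same one.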

HASH TABLE: IMPLEMENTATION WITH LINKED LISTS

A hash table implementation that resolves collisions by chaining the colliding elements together is called direct chaining

For each bucket, we use a linked list: every list is associated with a hash-coded key

Inserting into the hash table means finding the correct index (the hash-coded key) and adding the element to the list that corresponds to it

Deleting means searching for the element and removing it from the list
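Insertion and deletion with direct chaining might look as follows. This is a sketch under simplifying assumptions: keys are unsigned integers, the hash is key modulo the number of buckets, and insert assumes the key is not yet present. The names Bucket, bucketIndex, insert and removeKey are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <utility>
#include <vector>

using Bucket = std::list<std::pair<unsigned, int>>;  // (key, value) pairs

// Assumed hash: key modulo the number of buckets.
std::size_t bucketIndex(unsigned key, std::size_t bucketCount) {
    assert(bucketCount > 0);
    return key % bucketCount;
}

// Insert: find the bucket for the key and append the pair to its list.
// Assumes the key is not already stored (no duplicate check).
void insert(std::vector<Bucket>& table, unsigned key, int value) {
    table[bucketIndex(key, table.size())].push_back({key, value});
}

// Delete: search the bucket's list and unlink the matching element.
// Returns true if an element was removed, false if the key was absent.
bool removeKey(std::vector<Bucket>& table, unsigned key) {
    Bucket& bucket = table[bucketIndex(key, table.size())];
    for (auto it = bucket.begin(); it != bucket.end(); ++it) {
        if (it->first == key) {
            bucket.erase(it);
            return true;
        }
    }
    return false;
}
```

With 4 buckets, the keys 1 and 5 collide and end up in the same list; removing key 1 leaves key 5 in place.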

HASH TABLE: ADVANTAGES AND DISADVANTAGES

Advantage: the delete operation is simple and the table resizing can be postponed for a long time because, even when all positions of the hash table are used, performance is still good.

Disadvantage: for small amounts of data, the overhead is quite large and “browsing” the data can be time consuming (the same disadvantage as in linked lists)

HASH TABLE: EXAMPLE

• hmax is the maximum number of linked lists in our hash-table

• the function hash will be passed as an argument (actually, a pointer to the function will be passed)

• the key does not have to be a number (think of a real dictionary!!!): that is why we use templates

HASH TABLE: ASSIGNMENT

!!Exercise: Using the previous header, implement the hash table data structure and test it with a custom hash function

HASH TABLE: ASSIGNMENT

Hint: Maintain an array H[HMAX] of linked lists

The info field of each element of a list consists of a struct containing a key and a value

Each key is mapped to a value hkey=hash(key), such that 0≤hkey≤HMAX-1
hash is called the hash function and hkey is the index, in the array H, of the linked list that holds the key

put(k, v)
Searches for the key k in the list H[hkey=hash(k)]
If the key is found, then we replace the value by v
If the key is not found, then we insert the pair (k,v) in H[hkey]

get(k)
Searches for the key k in H[hkey=hash(k)]
If it finds the key, then it returns its associated value; otherwise, an error occurs

hasKey(k)
Searches for the key k in H[hkey=hash(k)]
If it finds the key, then it returns 1; otherwise, it returns 0
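One possible way to put the hint together is sketched below. It is not the lab's reference solution: the class layout, the member names and the example function stringHash are assumptions. The sketch takes the hash function as a function pointer and uses templates for the key and value types, as the notes describe, stores the lists in a std::vector, reduces the hash modulo hmax so that 0≤hkey≤hmax-1 always holds, and reports a missing key in get by throwing std::out_of_range.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <stdexcept>
#include <string>
#include <vector>

template <typename Tkey, typename Tvalue>
class Hashtable {
public:
    // hash points to a user-supplied hash function (assumed signature).
    Hashtable(std::size_t hmax, unsigned (*hash)(const Tkey&))
        : H(hmax), hmax(hmax), hash(hash) {
        assert(hmax > 0 && hash != nullptr);
    }

    // Replaces the value if the key exists, otherwise appends (key, value).
    void put(const Tkey& key, const Tvalue& value) {
        std::list<Entry>& bucket = H[index(key)];
        for (Entry& e : bucket) {
            if (e.key == key) { e.value = value; return; }
        }
        bucket.push_back(Entry{key, value});
    }

    // Returns the value for key; throws std::out_of_range if it is absent.
    const Tvalue& get(const Tkey& key) const {
        for (const Entry& e : H[index(key)]) {
            if (e.key == key) return e.value;
        }
        throw std::out_of_range("key not found");
    }

    // Returns 1 if the key is stored, 0 otherwise.
    int hasKey(const Tkey& key) const {
        for (const Entry& e : H[index(key)]) {
            if (e.key == key) return 1;
        }
        return 0;
    }

private:
    struct Entry { Tkey key; Tvalue value; };  // info field of a list element

    // hkey: index of the list in H, forced into [0, hmax).
    std::size_t index(const Tkey& key) const { return hash(key) % hmax; }

    std::vector<std::list<Entry>> H;  // the array of hmax linked lists
    std::size_t hmax;
    unsigned (*hash)(const Tkey&);
};

// Example custom hash function for string keys (an assumption for testing).
unsigned stringHash(const std::string& s) {
    unsigned h = 0;
    for (unsigned char c : s) h = h * 31 + c;
    return h;
}
```

A table for string keys would then be created as Hashtable<std::string, int> t(8, stringHash); after t.put("one", 1) and t.put("one", 11), t.get("one") yields 11, because the second put replaces the first value.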
