Fundamentals of Data Structure - Niraj Agarwal

Fundamentals of data structures

Dec 03, 2014


Niraj Agarwal

 
Page 1: Fundamentals of data structures

Fundamentals of Data Structure

- Niraj Agarwal

Page 2: Fundamentals of data structures

Data Structures

• "Once you succeed in writing the programs for complicated algorithms, they usually run extremely fast. The computer doesn't need to understand the algorithm; its task is only to run the programs."

• There are a number of facets to good programs: they must

– run correctly
– run efficiently
– be easy to read and understand
– be easy to debug and
– be easy to modify.

Page 3: Fundamentals of data structures

Data Structure (Cont.)

What is Data Structure ?

• A scheme for organizing related pieces of information

• A way in which sets of data are organized in a particular system

• An organised aggregate of data items

• A computer-interpretable format used for storing, accessing, transferring and archiving data

• The way data is organised to ensure efficient processing: this may be in lists, arrays, stacks, queues or trees

A data structure is a specialized format for organizing and storing data so that it can be accessed and worked with in appropriate ways to make a program efficient.

Page 4: Fundamentals of data structures

Data Structures (Cont.)

• Data Structure = Organised Data + Allowed Operations

There are two design aspects to every data structure:

• The interface part: the publicly accessible functions of the type, such as creation and destruction of the object, inserting and removing elements (if it is a container), assigning values, etc.

• The implementation part: the internal implementation should be independent of the interface, so the details of the implementation should be hidden from users.

Page 5: Fundamentals of data structures

Collections

• Programs often deal with collections of items.

• These collections may be organised in many ways and use many different program structures to represent them, yet, from an abstract point of view, there will be a few common operations on any collection.

create Create a new collection

add Add an item to a collection

delete Delete an item from a collection

find Find an item matching some criterion in the collection

destroy Destroy the collection

Page 6: Fundamentals of data structures

Analyzing an Algorithm

• Simple statement sequence

s1; s2; …. ; sk

– Complexity is O(1) as long as k is constant

• Simple loops

for(i=0;i<n;i++) { s; } where s is O(1)

– Complexity is n O(1) or O(n)

• Loop index doesn’t vary linearly

h = 1;while ( h <= n ) { s; h = 2 * h;}

– Complexity O(log n)

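• Nested loops

for(i=0;i<n;i++) {
    for(j=0;j<n;j++) { s; }
}

– Complexity is n O(n) or O(n²)

– If the inner loop index depends on the outer loop index (e.g. j<i), the body runs 0 + 1 + … + (n-1) = n(n-1)/2 times, which is still O(n²)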
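The loop patterns on this slide can be checked empirically by counting how often the body s runs. The sketch below is illustrative only: the function names are made up for this example, and each function simply counts iterations instead of doing real work.

```c
/* Count how many times the loop body runs for each pattern.
   All function names are hypothetical, chosen for this sketch. */

/* Simple loop: the body runs n times, so O(n). */
unsigned long count_linear(unsigned long n) {
    unsigned long steps = 0;
    for (unsigned long i = 0; i < n; i++) steps++;
    return steps;
}

/* Index doubles each pass: floor(log2 n) + 1 passes for n >= 1, so O(log n).
   Assumes n is small enough that 2 * h does not overflow. */
unsigned long count_doubling(unsigned long n) {
    unsigned long steps = 0;
    for (unsigned long h = 1; h <= n; h = 2 * h) steps++;
    return steps;
}

/* Nested loops with independent bounds: n * n passes, so O(n^2). */
unsigned long count_nested(unsigned long n) {
    unsigned long steps = 0;
    for (unsigned long i = 0; i < n; i++)
        for (unsigned long j = 0; j < n; j++) steps++;
    return steps;
}

/* Inner bound depends on the outer index: n(n-1)/2 passes, still O(n^2). */
unsigned long count_triangular(unsigned long n) {
    unsigned long steps = 0;
    for (unsigned long i = 0; i < n; i++)
        for (unsigned long j = 0; j < i; j++) steps++;
    return steps;
}
```

For n = 8 the doubling loop runs for h = 1, 2, 4, 8, i.e. four times, while the nested loop runs 64 times.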

Page 7: Fundamentals of data structures

Arrays

An Array is the simplest form of implementing a collection

• Each object in an array is called an array element
• Each element has the same data type (although they may have different values)
• Individual elements are accessed by index using a consecutive range of integers

One Dimensional Array or vector

int A[10];

for ( i = 0; i < 10; i++)

A[i] = i +1;

[Figure: array cells A[0] = 1, A[1] = 2, A[2] = 3, …, A[n-2] = n-1, A[n-1] = n]

Page 8: Fundamentals of data structures

Arrays (Cont.)

Multi-dimensional Array

A multi-dimensional array of dimension n (i.e., an n-dimensional array or simply n-D array) is a collection of items which is accessed via n subscript expressions. For example, in a language that supports it, the (i,j)th element of the two-dimensional array x is accessed by writing x[i,j].

[Figure: two-dimensional array x with rows indexed by i (0 to m) and columns indexed by j (0 to n)]

Page 9: Fundamentals of data structures

Arrays (Cont.)

Page 10: Fundamentals of data structures

Array : Limitations

• Simple and fast, but the size must be specified during construction

• To insert or remove an element at a fixed position in the list, you must move the elements already in the list: up to make room for an inserted element, or down to close the gap left by a removed one.

• Thus, on average, you copy about half the elements.

• In the worst case, inserting into the first position requires moving all the elements.

• Copying elements can result in longer running times if insert/remove operations are frequent, especially when the cost of copying each element is high (as when we copy strings).

• An array cannot be extended dynamically: one has to allocate a new array of the appropriate size and copy the old array to the new array.
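The cost of inserting into a fixed position can be seen in a minimal sketch. The function name array_insert and its parameters are assumptions made for this example, and the array is assumed to have a fixed capacity chosen by the caller.

```c
#include <string.h>

/* Insert value at index pos in a[0..*n-1], shifting later elements up one slot.
   capacity is the allocated size of a. Returns 1 on success, 0 if the array
   is full or pos is out of range. (Hypothetical helper for illustration.) */
int array_insert(int a[], int *n, int capacity, int pos, int value) {
    if (*n >= capacity || pos < 0 || pos > *n) return 0;
    /* Move the (*n - pos) elements at and after pos: O(n) in the worst case. */
    memmove(&a[pos + 1], &a[pos], (size_t)(*n - pos) * sizeof a[0]);
    a[pos] = value;
    (*n)++;
    return 1;
}
```

Inserting at index 0 moves every element, while inserting at index *n moves none, which is the worst case and best case described above.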

Page 11: Fundamentals of data structures

Linked Lists

• The linked list is a very flexible dynamic data structure: items may be added to it or deleted from it at will

– Dynamically allocate space for each element as needed
– Include a pointer to the next item
– The number of items that may be added to a list is limited only by the amount of memory available

A linked list can be perceived as connected (linked) nodes. Each node of the list contains

• the data item

• a pointer to the next node

• The last node in the list contains a NULL pointer to indicate that it is the end or tail of the list.

[Figure: a node with Data and Next fields; Data points to the object]

Page 12: Fundamentals of data structures

Linked Lists (Cont.)

• Collection structure has a pointer to the list head

– Initially NULL

• Add first item

– Allocate space for node

– Set its data pointer to object

– Set Next to NULL

– Set Head to point to new node

[Figure: Collection with Head pointing to a single node; the node's Data points to the object]

The variable (or handle) which represents the list is simply a pointer to the node at the head of the list.

Page 13: Fundamentals of data structures

Linked Lists (Cont.)

• Add a node

– Allocate space for node

– Set its data pointer to object

– Set Next to current Head

– Set Head to point to new node

[Figure: Collection with Head pointing to a chain of two nodes, holding object and object2]

Page 14: Fundamentals of data structures

Linked Lists - Add implementation

• Implementation

struct t_node {
    void *item;
    struct t_node *next;   /* Recursive type definition - C allows it! */
};
typedef struct t_node *Node;

struct collection {
    Node head;
    ……
};
typedef struct collection *Collection;

int AddToCollection( Collection c, void *item ) {
    /* Error checking, asserts omitted for clarity! */
    Node new = malloc( sizeof( struct t_node ) );
    new->item = item;
    new->next = c->head;
    c->head = new;
    return TRUE;
}

Page 15: Fundamentals of data structures

Linked Lists - Find implementation

• Implementation

void *FindinCollection( Collection c, void *key ) {
    Node n = c->head;
    while ( n != NULL ) {
        if ( KeyCmp( ItemKey( n->item ), key ) == 0 ) {
            return n->item;
        }
        n = n->next;
    }
    return NULL;
}

Add time: constant, independent of n
Search time: worst case n

• A recursive implementation is also possible!

Page 16: Fundamentals of data structures

Linked Lists - Delete implementation

• Implementation

void *DeleteFromCollection( Collection c, void *key ) {
    Node n, prev;
    n = c->head;
    prev = NULL;
    while ( n != NULL ) {
        if ( KeyCmp( ItemKey( n->item ), key ) == 0 ) {
            if ( prev == NULL ) c->head = n->next;   /* deleting the head node */
            else prev->next = n->next;
            return n;   /* the unlinked node */
        }
        prev = n;
        n = n->next;
    }
    return NULL;
}

Page 17: Fundamentals of data structures

Linked Lists - Variations

• Simplest implementation

– Add to head

– Last-In-First-Out (LIFO) semantics

• Modifications

– First-In-First-Out (FIFO)

– Keep a tail pointer

struct t_node {
    void *item;
    struct t_node *next;
};
typedef struct t_node *Node;
struct collection { Node head, tail; };

By ensuring that the tail of the list always points to the head, we can build a circularly linked list

– head is tail->next
– LIFO or FIFO using ONE pointer
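A circular list that keeps only a tail pointer can be sketched as follows. The names cnode, push_front, push_back and pop_front are assumptions for this example and do not come from the slides; an empty list is represented by a NULL tail.

```c
#include <stdlib.h>

struct cnode { void *item; struct cnode *next; };

/* The list handle is just the tail pointer; the head is tail->next. */

/* Add at the head. Returns 1 on success, 0 if allocation fails. */
int push_front(struct cnode **tail, void *item) {
    struct cnode *n = malloc(sizeof *n);
    if (n == NULL) return 0;
    n->item = item;
    if (*tail == NULL) {          /* first node points to itself */
        n->next = n;
        *tail = n;
    } else {                      /* splice in between tail and old head */
        n->next = (*tail)->next;
        (*tail)->next = n;
    }
    return 1;
}

/* Add at the tail: insert at the head, then advance the tail onto the new node. */
int push_back(struct cnode **tail, void *item) {
    if (!push_front(tail, item)) return 0;
    *tail = (*tail)->next;
    return 1;
}

/* Remove from the head. Returns the item, or NULL if the list is empty. */
void *pop_front(struct cnode **tail) {
    if (*tail == NULL) return NULL;
    struct cnode *head = (*tail)->next;
    void *item = head->item;
    if (head == *tail) *tail = NULL;       /* last node removed */
    else (*tail)->next = head->next;
    free(head);
    return item;
}
```

Pairing push_front with pop_front gives LIFO behaviour and pairing push_back with pop_front gives FIFO behaviour, all in constant time with a single pointer.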

Page 18: Fundamentals of data structures

Linked Lists - Doubly linked

• Doubly linked lists

– Can be scanned in both directions

struct t_node {
    void *item;
    struct t_node *prev, *next;
};
typedef struct t_node *Node;
struct collection { Node head, tail; };

[Figure: doubly linked list with head and tail pointers; each node has a prev link as well as a next link]

Applications requiring both way search

Eg. Name search in telephone directory

Page 19: Fundamentals of data structures

Binary Tree

• The simplest form of tree is a binary tree
– A binary tree consists of
  • a node (called the ROOT node)
  • left and right sub-trees
  • both sub-trees are binary trees
  • the nodes at the lowest levels of the tree (the ones with no sub-trees) are called leaves

Note the recursive definition! Each sub-tree is itself a binary tree.

In an ordered binary tree
• the keys of all the nodes in the left sub-tree are less than that of the root,
• the keys of all the nodes in the right sub-tree are greater than that of the root,
• the left and right sub-trees are themselves ordered binary trees.

Page 20: Fundamentals of data structures

Binary Tree (Cont.)

[Figure: binary tree with root A and nodes B, C, D, E, F, G]

• If A is the root of a binary tree and B is the root of its left/right subtree then

o A is the father of B
o B is the left/right son of A

• Two nodes are brothers if they are left and right sons of the same father

• Node n1 is an ancestor of n2 (and n2 is descendant of n1) if n1 is either the father of n2 or the father of some ancestor of n2

• Strictly Binary Tree: If every nonleaf node in a binary tree has non empty left and right subtrees

• Level of a node: Root has level 0. Level of any node is one more than the level of its father

• Depth: Maximum level of any leaf in the tree
– A binary tree can contain at most 2^l nodes at level l
– A binary tree of depth d contains at most 2^(d+1) - 1 nodes in total

Page 21: Fundamentals of data structures

Binary Tree - Implementation

struct t_node { void *item; struct t_node *left; struct t_node *right; };

typedef struct t_node *Node;

struct t_collection { Node root; …… };

Page 22: Fundamentals of data structures

Binary Tree - Implementation

• Find

extern int KeyCmp( void *a, void *b );
/* Returns -1, 0, 1 for a < b, a == b, a > b */

void *FindInTree( Node t, void *key ) {
    if ( t == (Node)0 ) return NULL;
    switch( KeyCmp( key, ItemKey(t->item) ) ) {
        case -1 : return FindInTree( t->left, key );    /* Less, search left */
        case 0  : return t->item;
        case +1 : return FindInTree( t->right, key );   /* Greater, search right */
    }
    return NULL;   /* not reached if KeyCmp returns only -1, 0 or 1 */
}

void *FindInCollection( collection c, void *key ) {
    return FindInTree( c->root, key );
}

Page 23: Fundamentals of data structures

Binary Tree - Performance

• Find
– Complete tree
– Height, h
  • Nodes traversed in a path from the root to a leaf
– Number of nodes, n
  • n = 1 + 2^1 + 2^2 + … + 2^h = 2^(h+1) - 1
  • h = floor( log2 n )
– Since we need at most h+1 comparisons, find is O(h+1) or O(log n)

Page 24: Fundamentals of data structures

Binary Tree - Traversing

Traverse: Pass through the tree, enumerating each node once

• PreOrder (also known as depth-first order)
1. Visit the root
2. Traverse the left subtree in preorder
3. Traverse the right subtree in preorder

• InOrder (also known as symmetric order)
1. Traverse the left subtree in inorder
2. Visit the root
3. Traverse the right subtree in inorder

• PostOrder
1. Traverse the left subtree in postorder
2. Traverse the right subtree in postorder
3. Visit the root
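The three traversal orders translate almost line for line into recursive functions. This is a minimal sketch: the node type tnode with a single-character key, and the convention of writing visited keys into a caller-supplied buffer, are assumptions made for the example.

```c
#include <stddef.h>

struct tnode { char key; struct tnode *left, *right; };

/* Each function appends the visited keys to out and returns the next free
   slot, so the caller can terminate the string afterwards. */

char *preorder(const struct tnode *t, char *out) {
    if (t == NULL) return out;
    *out++ = t->key;                    /* 1. visit the root */
    out = preorder(t->left, out);       /* 2. left subtree   */
    return preorder(t->right, out);     /* 3. right subtree  */
}

char *inorder(const struct tnode *t, char *out) {
    if (t == NULL) return out;
    out = inorder(t->left, out);        /* 1. left subtree   */
    *out++ = t->key;                    /* 2. visit the root */
    return inorder(t->right, out);      /* 3. right subtree  */
}

char *postorder(const struct tnode *t, char *out) {
    if (t == NULL) return out;
    out = postorder(t->left, out);      /* 1. left subtree   */
    out = postorder(t->right, out);     /* 2. right subtree  */
    *out++ = t->key;                    /* 3. visit the root */
    return out;
}
```

For a tree with root A, left child B (which has a left child D) and right child C, the three functions produce ABDC, DBAC and DBCA respectively.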

Page 25: Fundamentals of data structures

Binary Tree - Applications

• A binary tree is a useful data structure when two-way decisions must be

made at each point in a process

– Example: Finding duplicates in a list of numbers

• A binary tree can be used for representing an expression containing

operands (leaf) and operators (nonleaf node).

Traversal of the tree will result in infix, prefix or postfix forms of expression

Two binary trees are MIRROR SIMILAR if they are both empty or if they are nonempty and the left subtree of each is mirror similar to the right subtree of the other

Page 26: Fundamentals of data structures

General Tree

A Hierarchical Tree

• A tree is a finite nonempty set of elements in which one element is called the ROOT and the remaining elements are partitioned into m >= 0 disjoint subsets, each of which is itself a tree

• Different types of trees – binary tree, n-ary tree, red-black tree, AVL tree

Page 27: Fundamentals of data structures

Heaps

Heaps are based on the notion of a complete tree

A binary tree is completely full if it is of height h and has 2^(h+1) - 1 nodes.

• A binary tree of height h is complete iff
– it is empty or
– its left subtree is complete of height h-1 and its right subtree is completely full of height h-2 or
– its left subtree is completely full of height h-1 and its right subtree is complete of height h-1.

• A complete tree is filled from the left:
– all the leaves are on the same level or two adjacent ones and
– all nodes at the lowest level are as far to the left as possible.

• A binary tree has the heap property iff
– it is empty or
– the key in the root is larger than that in either child and both subtrees have the heap property.

Page 28: Fundamentals of data structures

Heaps (Cont.)

• A heap can be used as a priority queue:

• the highest priority item is at the root and is trivially extracted. But if the root is deleted, we are left with two sub-trees and we must efficiently re-create a single tree with the heap property.

• The value of the heap structure is that we can both extract the highest priority item and insert a new one in O(logn) time.

Example:

A deletion will remove the T at the root

Page 29: Fundamentals of data structures

Heaps (Cont.)

To work out how we're going to maintain the heap property, use the fact that a complete tree is filled from the left, so the position which must become empty is the one occupied by the M. Put the M in the vacant root position.

This has violated the condition that the root must be greater than each of its children. So interchange the M with the larger of its children.

The left subtree has now lost the heap property. So again interchange the M with the larger of its children.

We need to make at most h interchanges of a root of a subtree with one of its children to fully restore the heap property.

O(h) or O(log n)

Page 30: Fundamentals of data structures

Heaps (Cont.)

Addition to a Heap

To add an item to a heap, we follow the reverse procedure. Place it in the next leaf position and move it up. Again, we require O(h) or O(logn) exchanges.
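Both heap operations can be sketched on an array-based max-heap. The array layout (children of index i at 2i+1 and 2i+2) is an assumption of this example and is not stated on the slides; the function names are likewise hypothetical, and the caller is assumed to provide enough capacity.

```c
#include <stddef.h>

static void swap_ints(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Addition: place the item in the next leaf position and move it up
   while it is larger than its parent. At most h exchanges. */
void heap_add(int heap[], size_t *n, int value) {
    size_t i = (*n)++;
    heap[i] = value;
    while (i > 0 && heap[(i - 1) / 2] < heap[i]) {
        swap_ints(&heap[(i - 1) / 2], &heap[i]);
        i = (i - 1) / 2;
    }
}

/* Deletion: remove the root, put the last leaf in the vacant root position,
   then interchange it with the larger of its children until the heap
   property is restored. Requires *n > 0. */
int heap_extract_max(int heap[], size_t *n) {
    int top = heap[0];
    heap[0] = heap[--(*n)];
    size_t i = 0;
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, big = i;
        if (l < *n && heap[l] > heap[big]) big = l;
        if (r < *n && heap[r] > heap[big]) big = r;
        if (big == i) break;
        swap_ints(&heap[i], &heap[big]);
        i = big;
    }
    return top;
}
```

Storing the complete tree in an array keeps the "next leaf position" at index *n, so neither operation needs explicit child or parent pointers.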

Page 31: Fundamentals of data structures

Comparisons

          Arrays                  Linked List         Trees
          Simple, fast            Simple              Still simple
          Inflexible              Flexible            Flexible
Add       O(1)                    O(1)                O(log n)
          O(n) inc sort           sort -> no adv
Delete    O(n)                    O(1) - any          O(log n)
                                  O(n) - specific
Find      O(n)                    O(n)                O(log n)
          O(logn) binary search   (no bin search)

Page 32: Fundamentals of data structures

Queues

Queues are dynamic collections which have some concept of order

• FIFO queue

– A queue in which the first item added is always the first one out.

• LIFO queue

– A queue in which the item most recently added is always the first one out.

• Priority queue

– A queue in which the items are sorted so that the highest priority item is always the next one to be extracted.

Queues can be implemented by Linked Lists

Page 33: Fundamentals of data structures

Stacks

• Stacks are a special form of collection with LIFO semantics

• Two methods
– int push( Stack s, void *item );
  add item to the top of the stack
– void *pop( Stack s );
  remove most recently pushed item from the top of the stack

• Like a plate stacker

• Other methods
– int IsEmpty( Stack s );
  determines whether the stack has anything in it
– void *Top( Stack s );
  return the item at the top without deleting it

• Stacks are implemented by arrays or linked lists
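A minimal array-based sketch of the four stack methods is shown below. The struct layout, the fixed capacity and the constructor name ConsStack are assumptions for this example; only the signatures of push, pop, IsEmpty and Top come from the slide.

```c
#include <stdlib.h>

struct stack {
    void **items;    /* array of item pointers */
    int top;         /* number of items currently stored */
    int capacity;    /* allocated size of items */
};
typedef struct stack *Stack;

/* Hypothetical constructor; capacity is assumed to be positive. */
Stack ConsStack(int capacity) {
    Stack s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->items = malloc((size_t)capacity * sizeof *s->items);
    if (s->items == NULL) { free(s); return NULL; }
    s->top = 0;
    s->capacity = capacity;
    return s;
}

int IsEmpty(Stack s) { return s->top == 0; }

/* Returns 1 on success, 0 if the stack is full. */
int push(Stack s, void *item) {
    if (s->top == s->capacity) return 0;
    s->items[s->top++] = item;
    return 1;
}

/* Returns the most recently pushed item, or NULL if the stack is empty. */
void *pop(Stack s) { return IsEmpty(s) ? NULL : s->items[--s->top]; }

/* Returns the top item without removing it, or NULL if the stack is empty. */
void *Top(Stack s) { return IsEmpty(s) ? NULL : s->items[s->top - 1]; }
```

A linked-list version would push and pop at the head of the list instead, trading the fixed capacity for a per-node allocation.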

Page 34: Fundamentals of data structures

Stack (Cont.)

• Stack very useful for Recursions

• Key to call / return in functions & procedures

function f( int x, int y ) {
    int a;
    if ( term_cond ) return …;
    a = ….;
    return g( a );
}

function g( int z ) {
    int p, q;
    p = …. ;
    q = …. ;
    return f(p,q);
}

[Figure annotation: context for execution of f]

Page 35: Fundamentals of data structures

Searching

Computer systems are often used to store large amounts of data from which individual records must be retrieved according to some search criterion. Thus the efficient storage of data to facilitate fast searching is an important issue

Things to consider

– the average time

– the worst-case time and

– the best possible time.

• Sequential Searches

– Time is proportional to n

– We call this time complexity O(n)

– Both arrays (unsorted) and linked lists

Page 36: Fundamentals of data structures

Binary Search

• Sorted array on a key

• first compare the key with the item in the middle position of the array

• If there's a match, we can return immediately.

• If the key is less than the middle key, then the item sought must lie in the lower half of the array

• if it's greater then the item sought must lie in the upper half of the array

• Repeat the procedure on the lower (or upper) half of the array - RECURSIVE

Time complexity O(log n)

Page 37: Fundamentals of data structures

Binary Search Implementation

static void *bin_search( collection c, int low, int high, void *key ) {
    int mid, cmp;
    if (low > high) return NULL;   /* Termination check */
    mid = low + (high-low)/2;      /* avoids overflow of high+low */
    /* memcmp returns a negative, zero or positive value, not just -1, 0, 1 */
    cmp = memcmp(key, ItemKey(c->items[mid]), c->size);
    if (cmp == 0) return c->items[mid];                     /* Match, return item found */
    if (cmp < 0) return bin_search( c, low, mid-1, key );   /* key is smaller: search lower half */
    return bin_search( c, mid+1, high, key );               /* key is larger: search upper half */
}

void *FindInCollection( collection c, void *key ) {
/* Find an item in a collection
   Pre-condition: c is a collection created by ConsCollection
                  c is sorted in ascending order of the key
                  key != NULL
   Post-condition: returns an item identified by key if one exists, otherwise returns NULL */
    int low, high;
    low = 0;
    high = c->item_cnt-1;
    return bin_search( c, low, high, key );
}

Page 38: Fundamentals of data structures

Binary Search vs Sequential Search

• Find method
– Sequential search
  • Worst case time: c1 n
– Binary search
  • Worst case time: c2 log2 n

Compare n with log n

[Figure: plot of Time against n (both axes 0 to 60), showing the curves n and 4 log n]

• Logs: base 2 is by far the most common in this course. Assume base 2 unless otherwise noted!
• Small problems: we're not interested!
• Large problems: we're interested in this gap!
• Binary search (log2 n) is more complex and has a higher constant factor

Page 39: Fundamentals of data structures

Sorting

A file is said to be SORTED on the key if i < j implies that k[i] precedes k[j] in some ordering of the keys

Different types of Sorting

• Exchange Sorts
  • Bubble Sort
  • Quick Sort
• Insertion Sorts
• Selection Sorts
  • Binary Tree Sort
  • Heap Sort
• Merge and Radix Sorts

Page 40: Fundamentals of data structures

Insertion Sort

• First card is already sorted
• With all the rest,
– Scan back from the end until you find the first card larger than the new one: O(n)
– Move all the lower ones up one slot: O(n)
– Insert it: O(1)
• For n cards, complexity is O(n²)

[Figure: a hand of playing cards being sorted]
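The card-sorting procedure corresponds to the usual array insertion sort. The sketch below sorts integers into ascending order, so it scans back past the elements larger than the new one and shifts them up; the function name is an assumption for this example.

```c
/* Insertion sort for integers, ascending order. */
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {      /* a[0..i-1] is already sorted */
        int card = a[i];               /* the new card to place */
        int j = i - 1;
        /* Scan back from the end of the sorted part, moving larger
           elements up one slot: O(n) in the worst case. */
        while (j >= 0 && a[j] > card) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = card;               /* insert it: O(1) */
    }
}
```

The scan and the shift are done in the same loop here, which does not change the O(n²) bound for n cards.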

Page 41: Fundamentals of data structures

Bubble Sort

• From the first element
– Exchange pairs if they're out of order
– Repeat from the first to n-1
– Stop when you have only one element to check

/* Bubble sort for integers */
#define SWAP(a,b) { int t; t=a; a=b; b=t; }

void bubble( int a[], int n ) {
    int i, j;
    for(i=0;i<n;i++) {             /* Outer loop: n passes thru the array */
        /* From start to the end of unsorted part */
        for(j=1;j<(n-i);j++) {     /* Inner loop: n-1, n-2, n-3, … , 1 iterations */
            /* If adjacent items out of order, swap: an O(1) statement */
            if( a[j-1]>a[j] ) SWAP(a[j-1],a[j]);
        }
    }
}

Overall O(n²)

Page 42: Fundamentals of data structures

Partition Exchange or Quicksort

• Example of Divide and Conquer algorithm
• Two phases
– Partition phase
  • Divides the work in half
– Sort phase
  • Conquers the halves!

quicksort( void *a, int low, int high ) {
    int pivot;
    if ( high > low ) {   /* Termination condition! */
        pivot = partition( a, low, high );
        quicksort( a, low, pivot-1 );
        quicksort( a, pivot+1, high );
    }
}

[Figure: the array is partitioned into items < pivot, the pivot, and items > pivot; each part is then partitioned again around its own pivot (p', p")]
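The slide does not show partition itself. One common choice is the Lomuto scheme with the last element as pivot, sketched below for an int array; this is an assumption for illustration and not necessarily the partition the author had in mind.

```c
/* Lomuto partition: uses a[high] as the pivot, moves smaller items to the
   left, and returns the pivot's final index. */
int partition(int a[], int low, int high) {
    int pivot = a[high];
    int i = low;                       /* next slot for an item < pivot */
    for (int j = low; j < high; j++) {
        if (a[j] < pivot) {
            int t = a[i]; a[i] = a[j]; a[j] = t;
            i++;
        }
    }
    int t = a[i]; a[i] = a[high]; a[high] = t;   /* put the pivot in place */
    return i;
}

/* Same structure as the quicksort on the slide, specialised to int. */
void quicksort(int a[], int low, int high) {
    if (high > low) {                  /* termination condition */
        int p = partition(a, low, high);
        quicksort(a, low, p - 1);
        quicksort(a, p + 1, high);
    }
}
```

With this pivot choice an already sorted array produces maximally unbalanced partitions, which is the O(n²) case for quicksort.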

Page 43: Fundamentals of data structures

Heap Sort

Heaps also provide a means of sorting:

• construct a heap,

• add each item to it (maintaining the heap property!),

• when all items have been added, remove them one by one (restoring the heap property as each one is removed).

• Addition and deletion are both O(logn) operations. We need to perform n additions and deletions, leading to an O(nlogn) algorithm

• Generally slower than quicksort in practice

Page 44: Fundamentals of data structures

Comparisons of Sorting

• The Sorting Repertoire

– Insertion  O(n²)       Guaranteed
– Bubble     O(n²)       Guaranteed
– Heap       O(n log n)  Guaranteed
– Quick      O(n log n)  Most of the time! O(n²) in the worst case
– Bin        O(n)        Keys in small range; O(n+m)
– Radix      O(n)        Bounded keys/duplicates; O(n log n)

Page 45: Fundamentals of data structures

Hashing

• A Hash Table is a data structure that associates each element (e) to be stored in our table with a particular value known as a key (k)

• We store items (k,e) in our tables

• Simplest form of a Hash Table is an Array

• A bucket array for a hash table is an array A of size N, where each cell of A is thought of as a bucket and the integer N defines the capacity of the array.

Page 46: Fundamentals of data structures

Bucket Arrays

• If the keys (k) associated with each element (e) are well distributed in the range [0, N-1], this bucket array is all that is needed.

• An element (e) with key (k) is simply inserted into bucket A[k].

• So A[k] = (Item)(k, e);

• Any bucket cell associated with a key that is not present stores a NO_SUCH_KEY object.

• If keys are not unique, that is, there exist element-key pairs (e1, k) and (e2, k), we will have two different elements mapped to the same bucket.

• This is known as a collision, and we will discuss this later.

• We generally want to avoid such collisions.

Page 47: Fundamentals of data structures

Direct Access Table

• If we have a collection of n elements whose keys are unique integers in (1,m), where m >= n, then we can store the items in a direct address table, T[m], where T[i] is either empty or contains one of the elements of our collection.

• Searching a direct address table is clearly an O(1) operation: for a key, k, we access T[k],

• if it contains an element, return it,

• if it doesn't, then return NULL.

Page 48: Fundamentals of data structures

Analysis of Bucket Arrays

• Drawback 1: The Hash Table uses O(N) space, which is not necessarily related to the number of elements n actually present in our set.

• If N is large relative to n, then this approach is wasteful of space.

• Drawback 2: The bucket array implementation of Hash Tables requires key values (k) associated with elements (e) to be unique and in the range [0, N-1], which is often not the case.

Page 49: Fundamentals of data structures

Hash Functions

• Associated with each Hash Table is a function h, known as a Hash Function.

• This Hash Function maps each key in our set to an integer in the range [0, N-1], where N is the capacity of the bucket array.

• The idea is to use the hash function value, h(k) as an index into our bucket array.

• So we store the item (k, e) in our bucket at A[h(k)]. That is A[h(k)] = (Item)(k, e);

• Unfortunately, finding a perfect hashing function is not always possible. Let's say that we can find a hash function, h(k), which maps most of the keys onto unique integers, but maps a small number of keys on to the same integer. If the number of collisions (cases where multiple keys map onto the same integer), is sufficiently small, then hash tables work quite well and give O(1) search times.
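As an illustration of a function h that maps keys into [0, N-1], here is a simple multiplicative hash for string keys. The function name, the multiplier 31 and the choice of string keys are all assumptions for this example; the slides do not prescribe a particular hash function.

```c
#include <stddef.h>

/* Map a string key to a bucket index in [0, N-1]. Requires N > 0.
   Different keys can produce the same index, which is a collision. */
size_t hash_string(const char *key, size_t N) {
    size_t h = 0;
    for (; *key != '\0'; key++)
        h = h * 31 + (unsigned char)*key;   /* unsigned wrap-around is well defined */
    return h % N;
}
```

An item (k, e) would then be stored at A[hash_string(k, N)], with one of the collision-handling schemes applied when that bucket is already occupied.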

Page 50: Fundamentals of data structures

Handling the collisions

• In the small number of cases, where multiple keys map to the same integer, then elements with different keys may be stored in the same "slot" of the hash table. It is clear that when the hash function is used to locate a potential match, it will be necessary to compare the key of that element with the search key. But there may be more than one element which should be stored in a single slot of the table. Various techniques are used to manage this problem:

• chaining,

• overflow areas,

• re-hashing,

• using neighbouring slots (linear probing),

• quadratic probing,

• random probing,

Page 51: Fundamentals of data structures

Chaining

One simple scheme is to chain all collisions in lists attached to the appropriate slot. This allows an unlimited number of collisions to be handled and doesn't require a priori knowledge of how many elements are contained in the collection. The tradeoff is the same as with linked lists versus array implementations of collections: linked list overhead in space and, to a lesser extent, in time.
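Chaining can be sketched with an array of list heads. In this example the keys are integers, the hash is key mod N, and the names entry, table, table_put and table_get are hypothetical; duplicate keys are not checked for.

```c
#include <stdlib.h>

#define N_BUCKETS 8   /* assumed capacity N of the bucket array */

struct entry { int key; void *elem; struct entry *next; };
struct table { struct entry *bucket[N_BUCKETS]; };   /* NULL = empty slot */

static unsigned hash_key(int key) { return (unsigned)key % N_BUCKETS; }

/* Add (key, elem) at the head of the chain for its slot.
   Returns 1 on success, 0 if allocation fails. */
int table_put(struct table *t, int key, void *elem) {
    struct entry *e = malloc(sizeof *e);
    if (e == NULL) return 0;
    e->key = key;
    e->elem = elem;
    e->next = t->bucket[hash_key(key)];
    t->bucket[hash_key(key)] = e;
    return 1;
}

/* Walk the chain, comparing keys, since several keys may share a slot. */
void *table_get(const struct table *t, int key) {
    for (const struct entry *e = t->bucket[hash_key(key)]; e != NULL; e = e->next)
        if (e->key == key) return e->elem;
    return NULL;
}
```

Keys 1 and 9 both hash to slot 1 here, so they end up on the same chain and are told apart by the key comparison in table_get.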

Page 52: Fundamentals of data structures

Rehashing

• Re-hashing schemes use a second hashing operation when there is a collision. If there is a further collision, we re-hash until an empty "slot" in the table is found. The re-hashing function can either be a new function or a re-application of the original one. As long as the functions are applied to a key in the same order, then a sought key can always be located.

• Example: h(j) = h(k), so the next hash function, h1, is used. A second collision occurs, so h2 is used.

Page 53: Fundamentals of data structures

Overflow

• Divide the pre-allocated table into two sections: the primary area to which keys are mapped and an area for collisions, normally termed the overflow area.

• When a collision occurs, a slot in the overflow area is used for the new element and a link from the primary slot established as in a chained system. This is essentially the same as chaining, except that the overflow area is pre-allocated and thus possibly faster to access. As with re-hashing, the maximum number of elements must be known in advance, but in this case, two parameters must be estimated: the optimum size of the primary and overflow areas.

• It is also possible to design systems with multiple overflow tables

Page 54: Fundamentals of data structures

Comparisons

Page 55: Fundamentals of data structures

Graph

A graph consists of a set of nodes (or vertices) and a set of arcs (or edges)

Graph G = Nodes {A,B, C} Arcs {(A,C), (B,C)}

Terminology :

• V = Set of vertices (or nodes)

• |V| = # of vertices or cardinality of V     (in usual terminology |V| = n)

• E = Set of edges, where an edge is defined by two vertices

• |E| = # of edges or cardinality of E

• A Graph, G is a pair     G = (V, E)

Labeled Graphs: We may give edges and vertices labels. Graphing applications often require the labeling of vertices. Edges might also be numerically labeled. For instance, if the vertices represent cities, the edges might be labeled to represent distances.

Page 56: Fundamentals of data structures

Graph Terminology

Directed (or digraph) & Undirected Graphs

A directed graph is one in which every edge (u, v) has a direction,

so that (u, v) is different from (v, u). In an undirected graph, there is no distinction between (u, v) and (v, u). There are two possible situations that can arise in a directed graph between vertices u and v.

• i) only one of (u, v) and (v, u) is present. • ii) both (u, v) and (v, u) are present.

•An edge (u, v) is said to be directed from u to v if the pair (u, v) is ordered with u preceding v.

E.g. A Flight Route

•An edge (u, v) is said to be undirected if the pair (u, v) is not ordered

E.g. Road Map

Page 57: Fundamentals of data structures

Graph Terminology

• Two vertices joined by an edge are called the end vertices or endpoints of the edge.

• If an edge is directed its first endpoint is called the origin and the other is called the destination.

• Two vertices are said to be adjacent if they are endpoints of the same edge.

• The degree of a vertex v, denoted deg(v), is the number of incident edges of v.

• The in-degree of a vertex v, denoted indeg(v) is the number of incoming edges of v.

• The out-degree of a vertex v, denoted outdeg(v) is the number of outgoing edges of v.

Page 58: Fundamentals of data structures

Graph Terminology


Page 59: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices A, B, C, D, E, F and edges a to j]

Page 60: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices A, B, C, D, E, F and edges a to j]

Vertices A and B are endpoints of edge a

Page 61: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices A, B, C, D, E, F and edges a to j]

Vertex A is the origin of edge a

Page 62: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices A, B, C, D, E, F and edges a to j]

Vertex B is the destination of edge a

Page 63: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices A, B, C, D, E, F and edges a to j]

Vertices A and B are adjacent as they are endpoints of edge a

Page 64: Fundamentals of data structures

Graph Terminology

• An edge is said to be incident on a vertex if the vertex is one of the edge's endpoints.

• The outgoing edges of a vertex are the directed edges whose origin is that vertex.

• The incoming edges of a vertex are the directed edges whose destination is that vertex.

Page 65: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices U, V, W, X, Y, Z and edges a to j]

Edge 'a' is incident on vertex V
Edge 'h' is incident on vertex Z
Edge 'g' is incident on vertex Y

Page 66: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices U, V, W, X, Y, Z and edges a to j]

The outgoing edges of vertex W are the edges with vertex W as origin: {d, e, f}

Page 67: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices U, V, W, X, Y, Z and edges a to j]

The incoming edges of vertex X are the edges with vertex X as destination: {b, e, g, i}

Page 68: Fundamentals of data structures

Graph Terminology

• The degree of a vertex v, denoted deg(v), is the number of incident edges of v.

• The in-degree of a vertex v, denoted indeg(v), is the number of incoming edges of v.

• The out-degree of a vertex v, denoted outdeg(v), is the number of outgoing edges of v.

Page 69: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices U, V, W, X, Y, Z and edges a to j]

The degree of vertex X is the number of incident edges on X.
deg(X) = ?

Page 70: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices U, V, W, X, Y, Z and edges a to j]

The degree of vertex X is the number of incident edges on X.
deg(X) = 5

Page 71: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices U, V, W, X, Y, Z and edges a to j]

The in-degree of vertex X is the number of edges that have vertex X as a destination.
indeg(X) = ?

Page 72: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices U, V, W, X, Y, Z and edges a to j]

The in-degree of vertex X is the number of edges that have vertex X as a destination.
indeg(X) = 4

Page 73: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices U, V, W, X, Y, Z and edges a to j]

The out-degree of vertex X is the number of edges that have vertex X as an origin.
outdeg(X) = ?

Page 74: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices U, V, W, X, Y, Z and edges a to j]

The out-degree of vertex X is the number of edges that have vertex X as an origin.
outdeg(X) = 1

Page 75: Fundamentals of data structures

Graph Terminology

• Path:

» Sequence of alternating vertices and edges

» begins with a vertex

» ends with a vertex

» each edge is preceded and followed by its endpoints

• Simple Path:

» A path where all its edges and vertices are distinct

Page 76: Fundamentals of data structures

Graph Terminology

[Figure: directed graph with vertices U, V, W, X, Y, Z and edges a to j, with path P1 highlighted]

We can see that P1 is a simple path.
P1 = {U, a, V, b, X, h, Z}

Page 77: Fundamentals of data structures

Graph Terminology

[Figure: graph with vertices U, V, W, X, Y, Z and edges a, b, c, d, e, f, g, h, i, j]

P2 is not a simple path, as not all its edges and vertices are distinct (vertex W appears twice).

P2 = {U, c, W, e, X, g, Y, f, W, d, V}

Page 78: Fundamentals of data structures

Graph Terminology

• Cycle:

» Circular sequence of alternating vertices and edges.

» Each edge is preceded and followed by its endpoints.

• Simple Cycle:

» A cycle such that all its vertices and edges are unique.

Page 79: Fundamentals of data structures

Graph Terminology

Simple cycle: {U, a, V, b, X, g, Y, f, W, c}

[Figure: graph with vertices U, V, W, X, Y, Z and edges a, b, c, d, e, f, g, h, i, j]

Page 80: Fundamentals of data structures

Graph Terminology

[Figure: graph with vertices U, V, W, X, Y, Z and edges a, b, c, d, e, f, g, h, i, j]

Non-simple cycle: {U, c, W, e, X, g, Y, f, W, d, V, a}

Page 81: Fundamentals of data structures

Graph Properties

Page 82: Fundamentals of data structures

Graph Representation

Adjacency Matrix Implementation

A |V| × |V| matrix of 0's and 1's. A 1 represents a connection or an edge.

Storage = |V|² entries (this is huge for large graphs!)

For an undirected graph the matrix is always symmetric about the top-left to bottom-right diagonal. If the graph has no self-loops, this diagonal is filled with zeros. This simplifies coding.

Coding is concerned with storing the graph in an efficient manner. One way is to take all the bits from the adjacency matrix and concatenate them to form a binary string.

For undirected graphs, it suffices to concatenate the bits of the upper-right triangle of the adjacency matrix. Read as a binary number, this string gives each graph a number; graph number zero is the graph with no edges.
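The following sketch shows one way the matrix and the upper-triangle coding described above might look in Python. It assumes vertices are numbered 0 to n-1 and that the graph is undirected with no self-loops. The function names and the row-by-row bit order are assumptions made for illustration.

```python
def adjacency_matrix(n, edges):
    """Build an n x n 0/1 matrix for an undirected graph on vertices 0..n-1."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1  # symmetry: an undirected edge is stored twice
    return matrix

def upper_triangle_code(matrix):
    """Concatenate the bits above the main diagonal, row by row."""
    n = len(matrix)
    return "".join(str(matrix[i][j])
                   for i in range(n)
                   for j in range(i + 1, n))

m = adjacency_matrix(4, [(0, 1), (0, 2), (2, 3)])
code = upper_triangle_code(m)
print(code)          # 110001
print(int(code, 2))  # 49

empty = upper_triangle_code(adjacency_matrix(4, []))
print(int(empty, 2))  # 0, the graph with no edges
```

The upper triangle holds |V|(|V| - 1)/2 bits, so for 4 vertices the code is 6 bits long instead of the 16 entries of the full matrix.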

Page 83: Fundamentals of data structures

Graph Applications

Some of the applications of graphs are:

• Networks (computer, cities ....)

• Maps (any geographic databases )

• Graphics : Geometrical Objects

• Neighborhood graphs

• Flow Problem

• Workflow

Page 84: Fundamentals of data structures

Reachability

• Reachability: Given two vertices 'u' and 'v' of a directed graph G, we say that 'u' reaches 'v' if G has a directed path from 'u' to 'v'.

• That is 'v' is reachable from 'u'.

• A directed graph is said to be strongly connected if for any two vertices 'u' and 'v' of G, 'u' reaches 'v' and 'v' reaches 'u'.
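A minimal sketch of these definitions, assuming the directed graph is stored as a dictionary that maps each vertex to a list of the vertices its outgoing edges lead to. The function names are hypothetical, and the strong-connectivity check is the straightforward one that searches from every vertex, not an optimised algorithm.

```python
def reachable_from(graph, start):
    """Return the set of vertices reachable from start along directed edges."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def reaches(graph, u, v):
    """True if the graph has a directed path from u to v."""
    return v in reachable_from(graph, u)

def is_strongly_connected(graph):
    """True if every vertex reaches every other vertex."""
    vertices = set(graph) | {v for targets in graph.values() for v in targets}
    return all(reachable_from(graph, u) == vertices for u in vertices)

cycle = {"A": ["B"], "B": ["C"], "C": ["A"]}
chain = {"A": ["B"], "B": ["C"], "C": []}

print(reaches(chain, "A", "C"))      # True
print(reaches(chain, "C", "A"))      # False
print(is_strongly_connected(cycle))  # True
print(is_strongly_connected(chain))  # False
```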

Page 85: Fundamentals of data structures

Graphs

• Depth First Search

• Breadth First Search

• Directed Graphs (Reachability)

• Application to Garbage Collection in Java

• Shortest Paths

• Dijkstra's Algorithm

Page 86: Fundamentals of data structures

Depth First Search

Algorithm DFS(G)
    Input: graph G
    Output: labeling of the edges of G as discovery edges and back edges

    for all u in G.vertices()
        setLabel(u, Unexplored)

    for all e in G.edges()
        setLabel(e, Unexplored)

    for all v in G.vertices()
        if getLabel(v) = Unexplored
            DFS(G, v)

Page 87: Fundamentals of data structures

Algorithm DFS(G, v)
    Input: graph G and a start vertex v of G
    Output: labeling of the edges of G as discovery edges and back edges

    setLabel(v, Visited)

    for all e in G.incidentEdges(v)
        if getLabel(e) = Unexplored
            w <-- opposite(v, e)
            if getLabel(w) = Unexplored
                setLabel(e, Discovery)
                DFS(G, w)
            else
                setLabel(e, BackEdge)
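The pseudocode above translates fairly directly into Python. The sketch below assumes an undirected graph given as a list of vertices plus a dictionary from edge name to its two endpoints. That representation, the function name `dfs_label_edges` and the small example graph are all assumptions made for illustration.

```python
def dfs_label_edges(vertices, edges):
    """Label each edge of an undirected graph as Discovery or BackEdge."""
    # incident[v] lists the names of the edges touching v.
    incident = {v: [] for v in vertices}
    for name, (u, v) in edges.items():
        incident[u].append(name)
        incident[v].append(name)

    vertex_label = {v: "Unexplored" for v in vertices}
    edge_label = {e: "Unexplored" for e in edges}

    def opposite(v, e):
        a, b = edges[e]
        return b if a == v else a

    def dfs(v):
        vertex_label[v] = "Visited"
        for e in incident[v]:
            if edge_label[e] == "Unexplored":
                w = opposite(v, e)
                if vertex_label[w] == "Unexplored":
                    edge_label[e] = "Discovery"
                    dfs(w)
                else:
                    edge_label[e] = "BackEdge"

    # Restart from every unexplored vertex so all components are covered.
    for v in vertices:
        if vertex_label[v] == "Unexplored":
            dfs(v)
    return edge_label

# Hypothetical graph: a triangle A-B-C with an extra vertex D attached to C.
labels = dfs_label_edges(
    ["A", "B", "C", "D"],
    {"ab": ("A", "B"), "bc": ("B", "C"), "ca": ("C", "A"), "cd": ("C", "D")},
)
print(labels)
# {'ab': 'Discovery', 'bc': 'Discovery', 'ca': 'BackEdge', 'cd': 'Discovery'}
```

Because the sketch is recursive, very deep graphs may hit Python's recursion limit; an explicit stack would avoid that at the cost of a little more bookkeeping.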

Page 88: Fundamentals of data structures

Depth First Search

[Legend: unexplored vertex, visited vertex, unexplored edge, discovery edge, back edge]

Page 89: Fundamentals of data structures

[Figure sequence, pages 89 to 100: DFS on a graph with vertices A, B, C, D, E. The slide captions, in order:]

• Start at vertex A
• Discovery edge
• Visited vertex B
• Discovery edge
• Visited vertex C
• Back edge
• Discovery edge
• Visited vertex D
• Back edge
• Discovery edge
• Visited vertex E
• Discovery edge

Page 101: Fundamentals of data structures

[Figure sequence, pages 101 to 110: graph with vertices A through P]

Page 111: Fundamentals of data structures

Breadth First Search

Algorithm BFS(G)
    Input: graph G
    Output: labeling of the edges and a partitioning of the vertices of G

    for all u in G.vertices()
        setLabel(u, Unexplored)

    for all e in G.edges()
        setLabel(e, Unexplored)

    for all v in G.vertices()
        if getLabel(v) = Unexplored
            BFS(G, v)

Page 112: Fundamentals of data structures

Algorithm BFS(G, v)
    L0 <-- new empty list
    L0.insertLast(v)
    setLabel(v, Visited)
    i <-- 0

    while (¬Li.isEmpty())
        Li+1 <-- new empty list
        for all v in Li.elements()
            for all e in G.incidentEdges(v)
                if getLabel(e) = Unexplored
                    w <-- opposite(v, e)
                    if getLabel(w) = Unexplored
                        setLabel(e, Discovery)
                        setLabel(w, Visited)
                        Li+1.insertLast(w)
                    else
                        setLabel(e, Cross)
        i <-- i + 1
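A possible Python rendering of the level-by-level search is sketched below. It covers only the search from a single start vertex, and it assumes an undirected graph given as a vertex list plus a dictionary from edge name to its two endpoints. The function name `bfs_levels` and the example graph are hypothetical.

```python
def bfs_levels(vertices, edges, start):
    """Return (levels, edge_label) for a BFS from start.

    levels[i] plays the role of the list Li: the vertices i edges from start.
    """
    incident = {v: [] for v in vertices}
    for name, (u, v) in edges.items():
        incident[u].append(name)
        incident[v].append(name)

    vertex_label = {v: "Unexplored" for v in vertices}
    edge_label = {e: "Unexplored" for e in edges}

    def opposite(v, e):
        a, b = edges[e]
        return b if a == v else a

    levels = [[start]]
    vertex_label[start] = "Visited"
    while levels[-1]:
        next_level = []
        for v in levels[-1]:
            for e in incident[v]:
                if edge_label[e] == "Unexplored":
                    w = opposite(v, e)
                    if vertex_label[w] == "Unexplored":
                        edge_label[e] = "Discovery"
                        vertex_label[w] = "Visited"
                        next_level.append(w)
                    else:
                        edge_label[e] = "Cross"
        levels.append(next_level)
    levels.pop()  # the loop always ends by appending one empty level
    return levels, edge_label

# Hypothetical graph: a triangle A-B-C with an extra vertex D attached to C.
levels, labels = bfs_levels(
    ["A", "B", "C", "D"],
    {"ab": ("A", "B"), "bc": ("B", "C"), "ca": ("C", "A"), "cd": ("C", "D")},
    "A",
)
print(levels)  # [['A'], ['B', 'C'], ['D']]
print(labels)
# {'ab': 'Discovery', 'bc': 'Cross', 'ca': 'Discovery', 'cd': 'Discovery'}
```

On the same triangle, DFS would mark one edge as a back edge, while BFS marks the edge between the two level-1 vertices as a cross edge.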

Page 113: Fundamentals of data structures

[Figure sequence, pages 113 to 142: BFS from start vertex A on a graph with vertices A, B, C, D, E, F, with the level lists L0, L1 and L2 shown as they are built. The slide captions, in order:]

• Create a sequence L0 and insert(A) into L0.
• While L0 is not empty, create a new empty list L1.
• For each v in L0, get the incident edges of v.
• If the first incident edge is unexplored, get the opposite of v, say w. If w is unexplored, set the edge as discovery.
• Set vertex w as visited and insert(w) into L1.
• Get the next incident edge.
• If the edge is unexplored, get the vertex opposite v, say w.
• If w is unexplored, set the edge as discovery.
• Set w as visited and add w to L1.
• Continue in this fashion until we have visited all incident edges of v.
• As L0 is now exhausted, we continue with list L1 and build L2.

Page 143: Fundamentals of data structures

Weighted Graphs

• In a weighted graph G, each edge 'e' of G has an associated numerical value, known as its weight.

• Edge weights may represent distances, costs etc.

• Example: In a flight route graph the weights associated with each graph edge could represent the distances between airports.

Page 144: Fundamentals of data structures

Shortest Paths

• Given a weighted graph G and two vertices 'u' and 'v' of G, we want to find a path between 'u' and 'v' that has minimum total weight, also known as a shortest path.

• The length of a path is the sum of the weights of the path's edges.
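To make the definition of path length concrete, here is a small sketch. The three-vertex graph and its weights are made up for illustration, and storing each undirected edge under a `frozenset` of its endpoints is just one convenient choice.

```python
def path_length(weights, path):
    """Sum the weights of the edges between consecutive vertices of path."""
    return sum(weights[frozenset(pair)] for pair in zip(path, path[1:]))

# Hypothetical undirected weighted graph on vertices u, x, v.
weights = {
    frozenset(("u", "x")): 4,
    frozenset(("x", "v")): 3,
    frozenset(("u", "v")): 9,
}

print(path_length(weights, ["u", "x", "v"]))  # 7
print(path_length(weights, ["u", "v"]))       # 9
```

Here the two-edge path through x has length 7, so it is shorter than the direct edge of weight 9, which shows that a shortest path need not use the fewest edges.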

Page 145: Fundamentals of data structures

Dijkstra's Algorithm

The distance of a vertex v from a vertex s is the length of a shortest path between s and v

Dijkstra’s algorithm computes the distances of all the vertices from a given start vertex s

Assumptions:

the graph is connected

the edges are undirected

the edge weights are nonnegative

Page 146: Fundamentals of data structures

Dijkstra's Algorithm

We grow a “cloud” of vertices, beginning with s and eventually covering all the vertices.

We store with each vertex v a label d(v) representing the distance of v from s in the subgraph consisting of the cloud and its adjacent vertices.

At each step:

We add to the cloud the vertex u outside the cloud with the smallest distance label, d(u).

We update the labels of the vertices adjacent to u.

Page 147: Fundamentals of data structures

Edge Relaxation

Consider an edge e = (u, z) such that

u is the vertex most recently added to the cloud

z is not in the cloud

The relaxation of edge e updates distance d(z) as follows:

d(z) ← min{d(z), d(u) + weight(e)}
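The cloud-growing procedure and the relaxation rule can be combined into a short Python sketch. Using a binary heap (`heapq`) to pick the vertex with the smallest label is an implementation choice assumed here, not something the slides prescribe. The example graph is a reconstruction: its edge endpoints were chosen to be consistent with the edge weights and distance labels shown on the example slides that follow, so treat them as an assumption.

```python
import heapq

def dijkstra(graph, s):
    """Return the distance d(v) from s to every vertex v.

    graph maps each vertex to a dict {neighbour: edge weight}; weights must
    be nonnegative.
    """
    dist = {v: float("inf") for v in graph}
    dist[s] = 0
    cloud = set()
    heap = [(0, s)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if u in cloud:
            continue  # stale heap entry for a vertex already in the cloud
        cloud.add(u)
        for z, weight in graph[u].items():
            # Relaxation: d(z) <- min{d(z), d(u) + weight(e)}
            if z not in cloud and d_u + weight < dist[z]:
                dist[z] = d_u + weight
                heapq.heappush(heap, (dist[z], z))
    return dist

# Assumed reconstruction of the six-vertex example graph.
graph = {
    "A": {"B": 8, "C": 2, "D": 4},
    "B": {"A": 8, "C": 7, "E": 2},
    "C": {"A": 2, "B": 7, "D": 1, "E": 3, "F": 9},
    "D": {"A": 4, "C": 1, "F": 5},
    "E": {"B": 2, "C": 3},
    "F": {"C": 9, "D": 5},
}

print(dijkstra(graph, "A"))
# {'A': 0, 'B': 7, 'C': 2, 'D': 3, 'E': 5, 'F': 8}
```

Instead of decreasing a key inside the heap, the sketch pushes a new entry whenever a label improves and skips outdated entries when they are popped, which keeps the code short.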

Page 148: Fundamentals of data structures

[Figure sequence, pages 148 to 159: Dijkstra's algorithm from start vertex A on a weighted graph with vertices A, B, C, D, E, F and edge weights 8, 4, 2, 1, 7, 5, 9, 3, 2. The distance labels shown on successive slides are listed below.]

Add the starting vertex to the cloud: A(0).

We store with each vertex v a label d(v) representing the distance of v from s in the subgraph consisting of the cloud and its adjacent vertices: B(8), C(2), D(4).

At each step we add to the cloud the vertex v outside the cloud with the smallest distance label d(v). We then update each vertex z adjacent to v:

d(z) ← min{d(z), d(v) + weight(e)}

• Labels after the first update: B(8), C(2), D(3), E(5), F(11)
• Labels after the second update: B(8), C(2), D(3), E(5), F(8)
• Labels after the third update: B(7), C(2), D(3), E(5), F(8)

Final labels: A(0), B(7), C(2), D(3), E(5), F(8)

Page 160: Fundamentals of data structures

[Figure sequence, pages 160 to 175: Dijkstra's algorithm from start vertex A on a weighted graph with vertices A through H. The slides alternate between Insert (a vertex joins the cloud) and Update (the labels of its adjacent vertices are relaxed). Labels introduced or changed at each step:]

• Insert: A(0)
• Update: B(2), G(6)
• Insert, then Update: E(4), C(9)
• Insert, then Update: G(5), F(6)
• Insert, then Update: H(9)
• Insert, then Update: H(8)
• Insert, then Update: D(10)
• The remaining Insert and Update slides leave the labels unchanged.

Final labels: A(0), B(2), E(4), G(5), F(6), H(8), C(9), D(10)

Page 176: Fundamentals of data structures

Thank You