

14.09.17

Advanced Algorithmics (6EAP) MTAT.03.238

Linear structures, sorting, searching, etc.

Jaak Vilo, 2017 Fall

Big-Oh notation classes

Class              Informal                        Intuition        Analogy
f(n) ∈ o(g(n))     f is dominated by g             Strictly below   <
f(n) ∈ O(g(n))     Bounded from above              Upper bound      ≤
f(n) ∈ Θ(g(n))     Bounded from above and below    "equal to"       =
f(n) ∈ Ω(g(n))     Bounded from below              Lower bound      ≥
f(n) ∈ ω(g(n))     f dominates g                   Strictly above   >

Conclusions

• Algorithm complexity deals with the behavior in the long term
  – worst case: typical
  – average case: quite hard
  – best case: bogus, "cheating"

• In practice, the long term is sometimes not necessary
  – E.g. for sorting 20 elements, you don't need fancy algorithms…

Linear, sequential, ordered, list…

Memory, disk, tape etc. are ordered, sequentially addressed media.

Physical ordered list ~ array

• Memory /address/
  – Garbage collection
• Files (character/byte list/lines in text file, …)
• Disk
  – Disk fragmentation

Linear data structures: Arrays

Array, Bidirectional map, Bit array, Bit field, Bitboard, Bitmap, Circular buffer, Control table, Image, Dynamic array, Gap buffer, Hashed array tree, Heightmap, Lookup table, Matrix, Parallel array, Sorted array, Sparse array, Sparse matrix, Iliffe vector, Variable-length array


Linear data structures: Lists

Doubly linked list, Array list, Linked list, Self-organizing list, Skip list, Unrolled linked list, VList, Xor linked list, Zipper, Doubly connected edge list, Difference list

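Lists: Array

[Figure: array L holding 3 6 7 5 2, followed by unused cells up to the capacity MAX_SIZE; size marks the first unused cell]

L = int[MAX_SIZE]

• 0-based indexing (cells 0 … MAX_SIZE-1): L[2] = 7; append with L[size++] = new
• 1-based indexing (cells 1 … MAX_SIZE): L[3] = 7; append with L[++size] = new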

Multiple lists, 2-D arrays, etc…

2D array

&A[i,j] = A + i*(nr_el_in_row*el_size) + j*el_size
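The address formula above can be checked against what a compiler does for a built-in 2-D array. This is a minimal sketch, assuming row-major storage (as in C and C++); the function names row_major_offset and formula_matches_compiler are illustrative and not part of the lecture.

```cpp
#include <cstddef>

// Byte offset of element (i, j) in a row-major 2-D array.
// row_len = number of elements per row, el_size = bytes per element.
std::size_t row_major_offset(std::size_t i, std::size_t j,
                             std::size_t row_len, std::size_t el_size) {
    return i * (row_len * el_size) + j * el_size;
}

// Compare the formula with the addresses the compiler computes for int A[3][4].
bool formula_matches_compiler() {
    int A[3][4] = {};
    const char* base = reinterpret_cast<const char*>(&A[0][0]);
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            if (base + row_major_offset(i, j, 4, sizeof(int)) !=
                reinterpret_cast<const char*>(&A[i][j]))
                return false;
    return true;
}
```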

Linear Lists

• Operations which one may want to perform on a linear list of n elements include:
  – gain access to the kth element of the list to examine and/or change the contents
  – insert a new element before or after the kth element
  – delete the kth element of the list

Reference: Knuth, The Art of Computer Programming, Vol 1, Fundamental Algorithms, 3rd Ed., p. 238-9


Abstract Data Type (ADT)

• High-level definition of data types
• An ADT specifies
  – A collection of data
  – A set of operations on the data or subsets of the data
• ADT does not specify how the operations should be implemented
• Examples
  – vector, list, stack, queue, deque, priority queue, table (map), associative array, set, graph, digraph

ADT

• A data type is a set of values and an associated set of operations
• A data type is abstract iff it is completely described by its set of operations regardless of its implementation
• This means that it is possible to change the implementation of the data type without changing its use
• The data type is thus described by a set of procedures
• These operations are the only thing that a user of the abstraction can assume

Primitive & composite types

Primitive types
• Boolean (for boolean values True/False)
• Char (for character values)
• int (for integral or fixed-precision values)
• Float (for storing real number values)
• Double (a larger size of type float)
• String (for string of chars)
• Enumerated type

Composite types
• Array
• Record (also called tuple or struct)
• Union
• Tagged union (also called a variant, variant record, discriminated union, or disjoint union)
• Plain old data structure

Abstract Data Types (ADT)

• Some common ADTs, which have proved useful in a great variety of applications, are:
  Container, List, Set, Multiset, Map, Multimap, Stack, Graph, Queue, Priority queue, Double-ended queue, Double-ended priority queue
• Each of these ADTs may be defined in many ways and variants, not necessarily equivalent.

Abstract data types:

• Dictionary (key, value)
• Stack (LIFO)
• Queue (FIFO)
• Queue (double-ended)
• Priority queue (fetch highest-priority object)
• ...

Dictionary

• Container of key-element (k, e) pairs
• Required operations:
  – insert(k, e),
  – remove(k),
  – find(k),
  – isEmpty()
• May also support (when an order is provided):
  – closestKeyBefore(k),
  – closestElemAfter(k)
• Note: No duplicate keys


Some data structures for Dictionary ADT

• Unordered
  – Array
  – Sequence/list
• Ordered
  – Array
  – Sequence (Skip Lists)
  – Binary Search Tree (BST)
  – AVL trees, red-black trees
  – (2;4) Trees
  – B-Trees
• Valued
  – Hash Tables
  – Extendible Hashing

Linear data structures

Arrays: Array, Bidirectional map, Bit array, Bit field, Bitboard, Bitmap, Circular buffer, Control table, Image, Dynamic array, Gap buffer, Hashed array tree, Heightmap, Lookup table, Matrix, Parallel array, Sorted array, Sparse array, Sparse matrix, Iliffe vector, Variable-length array

Lists: Doubly linked list, Linked list, Self-organizing list, Skip list, Unrolled linked list, VList, Xor linked list, Zipper, Doubly connected edge list

Trees…

Binary trees: AA tree, AVL tree, Binary search tree, Binary tree, Cartesian tree, Pagoda, Randomized binary search tree, Red-black tree, Rope, Scapegoat tree, Self-balancing binary search tree, Splay tree, T-tree, Tango tree, Threaded binary tree, Top tree, Treap, Weight-balanced tree

B-trees: B-tree, B+ tree, B*-tree, B sharp tree, Dancing tree, 2-3 tree, 2-3-4 tree, Queap, Fusion tree, Bx-tree

Heaps: Heap, Binary heap, Binomial heap, Fibonacci heap, AF-heap, 2-3 heap, Soft heap, Pairing heap, Leftist heap, Treap, Beap, Skew heap, Ternary heap, D-ary heap

Tries: Trie, Radix tree, Suffix tree, Suffix array, Compressed suffix array, FM-index, Generalised suffix tree, B-trie, Judy array, X-fast trie, Y-fast trie, Ctrie

Multiway trees: Ternary search tree, And–or tree, (a,b)-tree, Link/cut tree, SPQR-tree, Spaghetti stack, Disjoint-set data structure, Fusion tree, Enfilade, Exponential tree, Fenwick tree, Van Emde Boas tree

Space-partitioning trees: Segment tree, Interval tree, Range tree, Bin, Kd-tree, Implicit kd-tree, Min/max kd-tree, Adaptive k-d tree, Kdb tree, Quadtree, Octree, Linear octree, Z-order, UB-tree, R-tree, R+ tree, R* tree, Hilbert R-tree, X-tree, Metric tree, Cover tree, M-tree, VP-tree, BK-tree, Bounding interval hierarchy, BSP tree, Rapidly-exploring random tree

Application-specific trees: Syntax tree, Abstract syntax tree, Parse tree, Decision tree, Alternating decision tree, Minimax tree, Expectiminimax tree, Finger tree

Hashes, Graphs, Other

Hashes: Bloom filter, Distributed hash table, Hash array mapped trie, Hash list, Hash table, Hash tree, Hash trie, Koorde, Prefix hash tree

Graphs: Adjacency list, Adjacency matrix, Graph-structured stack, Scene graph, Binary decision diagram, Zero suppressed decision diagram, And-inverter graph, Directed graph, Directed acyclic graph, Propositional directed acyclic graph, Multigraph, Hypergraph

Other: Lightmap, Winged edge, Quad-edge, Routing table, Symbol table

Lists: Array

[Figure: array 3 6 7 5 2 in cells 0 … size-1 of a table with cells 0 … MAX_SIZE-1; the list is described by *array (memory address), size and MAX_SIZE]

• Insert 8 after L[2]: 3 6 7 8 5 2 (the elements 5 2 are shifted one cell to the right)
• Delete last: 3 6 7 8 5

• Access i: O(1)
• Insert to end: O(1)
• Delete from end: O(1)
• Insert: O(n)
• Delete: O(n)
• Search: O(n)
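The costs listed above can be read off a small fixed-capacity array list. The sketch below is illustrative only: the type name ArrayList, its member names and the capacity of 8 are assumptions, not taken from the slides.

```cpp
#include <cstddef>
#include <stdexcept>

struct ArrayList {
    static constexpr std::size_t MAX_SIZE = 8;  // illustrative capacity
    int data[MAX_SIZE];
    std::size_t size = 0;

    // Insert to end: O(1).
    void push_back(int x) {
        if (size == MAX_SIZE) throw std::overflow_error("list is full");
        data[size++] = x;
    }
    // Insert x so that it ends up at index pos: O(n), shifts the tail right.
    void insert_at(std::size_t pos, int x) {
        if (size == MAX_SIZE) throw std::overflow_error("list is full");
        if (pos > size) throw std::out_of_range("bad position");
        for (std::size_t k = size; k > pos; --k) data[k] = data[k - 1];
        data[pos] = x;
        ++size;
    }
    // Delete the element at index pos: O(n), shifts the tail left.
    void erase_at(std::size_t pos) {
        if (pos >= size) throw std::out_of_range("bad position");
        for (std::size_t k = pos + 1; k < size; ++k) data[k - 1] = data[k];
        --size;
    }
    // Delete from end: O(1).
    void pop_back() {
        if (size == 0) throw std::underflow_error("list is empty");
        --size;
    }
};
```

Inserting 8 after L[2] corresponds to insert_at(3, 8); only the elements behind the insertion point are moved, which is why inserting at the end costs O(1).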


Linear Lists

• Other operations on a linear list may include:
  – determine the number of elements
  – search the list
  – sort a list
  – combine two or more linear lists
  – split a linear list into two or more lists
  – make a copy of a list

Stack

• push(x) -- add to end (add to top)
• pop() -- fetch from end (top)
• O(1) in all reasonable cases :)
• LIFO – Last In, First Out

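Linked lists

[Figure: a singly linked list and a doubly linked list, each with head and tail pointers]

Linked lists: add

[Figure: adding a node to a linked list with head, tail and size]

Linked lists: delete (+ garbage collection?)

[Figure: deleting a node from a linked list with head, tail and size]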

Operations

• Array indexed from 0 to n – 1:

                                          k = 1    1 < k < n    k = n
access/change the kth element             O(1)     O(1)         O(1)
insert before or after the kth element    O(n)     O(n)         O(1)
delete the kth element                    O(n)     O(n)         O(1)

• Singly-linked list with head and tail pointers:

                                          k = 1    1 < k < n    k = n
access/change the kth element             O(1)     O(n)         O(1)
insert before the kth element             O(1)     O(n)         O(n)
insert after the kth element              O(1)     O(1)¹        O(1)
delete the kth element                    O(1)     O(n)         O(n)

¹ under the assumption we have a pointer to the kth node, O(n) otherwise


Improving Run-Time Efficiency

• We can improve the run-time efficiency of a linked list by using a doubly-linked list:

[Figure: a singly-linked list and a doubly-linked list]

  – Improvements at operations requiring access to the previous node
  – Increases memory requirements...

Improving Efficiency

Singly-linked list:

                                          k = 1    1 < k < n    k = n
access/change the kth element             O(1)     O(n)         O(1)
insert before the kth element             O(1)     O(n)         O(n)
insert after the kth element              O(1)     O(1)¹        O(1)
delete the kth element                    O(1)     O(n)         O(n)

Doubly-linked list:

                                          k = 1    1 < k < n    k = n
access/change the kth element             O(1)     O(n)         O(1)
insert before or after the kth element    O(1)     O(1)¹        O(1)
delete the kth element                    O(1)     O(1)¹        O(1)

¹ under the assumption we have a pointer to the kth node, O(n) otherwise

Summary of the three representations (columns: k = 1, 1 < k < n, k = n)

• Array indexed from 0 to n – 1:
  access/change the kth element             O(1)     O(1)     O(1)
  insert before or after the kth element    O(n)     O(n)     O(1)
  delete the kth element                    O(n)     O(n)     O(1)

• Singly-linked list with head and tail pointers:
  access/change the kth element             O(1)     O(n)     O(1)
  insert before the kth element             O(1)     O(n)     O(n)
  insert after the kth element              O(1)     O(1)¹    O(1)
  delete the kth element                    O(1)     O(n)     O(n)

• Doubly linked list:
  access/change the kth element             O(1)     O(n)     O(1)
  insert before or after the kth element    O(1)     O(1)¹    O(1)
  delete the kth element                    O(1)     O(1)¹    O(1)

¹ under the assumption we have a pointer to the kth node, O(n) otherwise

Introduction to linked lists

• Consider the following struct definition

struct node {
    string word;
    int num;
    node *next;   // pointer for the next node
};
node *p = new node;

[Figure: p points to a new node whose fields num, word and next are still uninitialized (?)]

Introduction to linked lists: inserting a node

• node *p;
• p = new node;
• p->num = 5;
• p->word = "Ali";
• p->next = NULL;

[Figure: p points to the node (num = 5, word = "Ali", next = NULL)]

Introduction to linked lists: adding a new node

• How can you add another node that is pointed to by p->next?

• node *p;
• p = new node;
• p->num = 5;
• p->word = "Ali";
• p->next = NULL;
• node *q;

[Figure: p points to the node (5, "Ali"); q is declared but does not point to any node yet (?)]


Introduction to linked lists: adding a new node, step by step

Step 1: allocate a second node; its fields are still uninitialized.

node *p, *q;
p = new node;
p->num = 5;
p->word = "Ali";
p->next = NULL;

q = new node;

[Figure: p points to (5, "Ali"); q points to a node with num = ?, word = ?, next = ?]

Step 2: fill in the new node.

q->num = 8;
q->word = "Veli";

[Figure: p points to (5, "Ali"); q points to (8, "Veli") with next = ?]

Step 3: link the first node to the second and terminate the list.

p->next = q;
q->next = NULL;

[Figure: p points to (5, "Ali"), whose next points to (8, "Veli"); q points to (8, "Veli"), whose next is NULL]
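The steps above can be collected into a small compilable program fragment. This is a sketch under stated assumptions: the helper names build_example, collect_nums and destroy are illustrative, nullptr is used in place of NULL, and the struct matches the slide's node definition.

```cpp
#include <string>
#include <vector>

struct node {
    std::string word;
    int num;
    node* next;  // pointer to the next node
};

// Build the two-node list from the slides: (5, "Ali") -> (8, "Veli") -> nullptr.
node* build_example() {
    node* p = new node{"Ali", 5, nullptr};
    node* q = new node{"Veli", 8, nullptr};
    p->next = q;
    return p;
}

// Walk the list from head and collect the num fields: O(n).
std::vector<int> collect_nums(const node* head) {
    std::vector<int> out;
    for (const node* cur = head; cur != nullptr; cur = cur->next)
        out.push_back(cur->num);
    return out;
}

// Every node obtained with new must eventually be released with delete.
void destroy(node* head) {
    while (head != nullptr) {
        node* next = head->next;
        delete head;
        head = next;
    }
}
```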

Pointers in C/C++

p = new node; delete p;
p = new node[20];

p = malloc(sizeof(node)); free(p);

p = malloc(sizeof(node) * 20);
(p+10)->next = NULL;   /* 11th element */

Book-keeping

• malloc, new: "remember" what has been created; release it with free(p), delete (C/C++)
• When you need many small areas to be allocated, reserve a big chunk (array) and maintain your own set of free objects
• Elements of an array of objects can be pointed to by a pointer to an object.

Object

• Object = new object_type;
• This equals creating a new object with the necessary size of allocated memory (delete can free it)


Some links

• http://en.wikipedia.org/wiki/Pointer
• Pointer basics: http://cslibrary.stanford.edu/106/
• C++ Memory Management: What is the difference between malloc/free and new/delete?
  – http://www.codeguru.com/forum/showthread.php?t=401848

Alternative: arrays and integers

• If you want to test pointers, linked lists and similar data structures, but are not familiar with pointers (yet)
• Use arrays and indexes to array elements instead…

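Replacing pointers with array index

The list 8, 4, 7 stored in three parallel arrays, head = 3 ("/" marks the end of the list; unused cells are left blank):

index   1   2   3   4   5   6   7
next    /       5       1
key     7       8       4
prev    5       /       3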

Maintaining list of free objects

The same list (8, 4, 7; head = 3), with the unused cells chained through next[] into a free list (free = 6, then 7, 2, 4):

index   1   2   3   4   5   6   7
next    /   4   5   /   1   7   2
key     7       8       4
prev    5       /       3

free = -1 => array is full

allocate object:
    new = free;
    free = next[free];

free object x:
    next[x] = free
    free = x
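The allocate and free steps above can be sketched with three parallel arrays. The version below is an illustration, not the lecture's code: it uses 0-based cells, -1 as the "/" marker, and the hypothetical names IndexList, allocate, release, push_front and remove.

```cpp
#include <vector>

// Doubly linked list stored in three parallel arrays; -1 plays the role of "/".
struct IndexList {
    std::vector<int> next, key, prev;
    int head = -1;
    int free_head;  // first cell of the free list, -1 when the array is full

    explicit IndexList(int capacity)
        : next(capacity), key(capacity), prev(capacity),
          free_head(capacity > 0 ? 0 : -1) {
        // Initially every cell is free; chain all cells through next[].
        for (int i = 0; i < capacity; ++i)
            next[i] = (i + 1 < capacity) ? i + 1 : -1;
    }
    // allocate object: take the first cell of the free list (-1 if none).
    int allocate() {
        int cell = free_head;
        if (cell != -1) free_head = next[cell];
        return cell;
    }
    // free object x: push the cell back onto the free list.
    void release(int x) {
        next[x] = free_head;
        free_head = x;
    }
    // Insert k at the front of the list; returns false when no cell is free.
    bool push_front(int k) {
        int x = allocate();
        if (x == -1) return false;
        key[x] = k;
        prev[x] = -1;
        next[x] = head;
        if (head != -1) prev[head] = x;
        head = x;
        return true;
    }
    // Unlink cell x from the list and recycle it.
    void remove(int x) {
        if (prev[x] != -1) next[prev[x]] = next[x]; else head = next[x];
        if (next[x] != -1) prev[next[x]] = prev[x];
        release(x);
    }
};
```

Because a freed cell goes to the front of the free list, the most recently released cell is the first one to be reused.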

Multiple lists, single free list

index   1   2   3   4   5   6   7
next    /   4   5   /   1   7   /
key     7       8       4   3   9
prev    5       /       3   /   6

head1 = 3 => 8, 4, 7
head2 = 6 => 3, 9
free = 2 (2), i.e. the free list is 2, 4

Hack: allocate more arrays…

[Figure: AA is an array of arrays; its rows hold cells 1–7, 8–14 and 15–21]

LIST(i) = AA[ (i-1)/7 ][ (i-1) % 7 ]   (use integer division and mod)

LIST(10) = AA[ 1 ][ 2 ]

LIST(19) = AA[ 2 ][ 4 ]


Queue

• enqueue(x) - add to end
• dequeue() - fetch from beginning
• FIFO – First In First Out
• O(1) in all reasonable cases :)

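Queue (FIFO)

[Figure: array holding 3 6 7 5 2; F marks the first element, L the cell after the last one]

Queue (basic idea, does not contain all controls!)

[Figure: the queue 3 6 7 5 2 in a table with cells 0 … MAX_SIZE-1; after two Pop_first calls it holds 7 5 2 and F has advanced]

First = List[F]
Last = List[L-1]

Pop_first : { return List[F++] }
Pop_last : { return List[--L] }

Full: return ( L == MAX_SIZE )
Empty: F < 0 or F >= L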

Circular buffer

• A circular buffer or ring buffer is a data structure that uses a single, fixed-size buffer as if it were connected end-to-end. This structure lends itself easily to buffering data streams.

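Circular Queue

[Figure: table with cells 0 … MAX_SIZE-1; the queue 7 5 2 3 6 starts near the end of the table (F) and wraps around to the beginning, where 3 6 are stored and L marks the next free cell]

First = List[F]
Last = List[ (L-1+MAX_SIZE) % MAX_SIZE ]

Full: return ( (L+1) % MAX_SIZE == F )
Empty: F == L

Add_to_end( x ) : { List[L] = x ; L = (L+1) % MAX_SIZE }   // % = modulo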
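The circular queue rules above (Full, Empty, Add_to_end) can be put together in a short sketch. The struct name CircularQueue, the capacity of 8 and the bool return convention are assumptions made for illustration; the index arithmetic follows the formulas on the slide.

```cpp
#include <cstddef>

// Fixed-size ring buffer queue. One cell is always left unused so that
// "full" and "empty" can be told apart from F and L alone.
struct CircularQueue {
    static constexpr std::size_t MAX_SIZE = 8;  // illustrative capacity
    int List[MAX_SIZE];
    std::size_t F = 0;  // index of the first element
    std::size_t L = 0;  // index one past the last element

    bool empty() const { return F == L; }
    bool full() const { return (L + 1) % MAX_SIZE == F; }

    // Add_to_end: returns false if the queue is full.
    bool enqueue(int x) {
        if (full()) return false;
        List[L] = x;
        L = (L + 1) % MAX_SIZE;
        return true;
    }
    // Pop_first: stores the first element in out; returns false if empty.
    bool dequeue(int& out) {
        if (empty()) return false;
        out = List[F];
        F = (F + 1) % MAX_SIZE;
        return true;
    }
};
```

With this convention a table of MAX_SIZE cells holds at most MAX_SIZE - 1 elements; keeping an explicit element count instead would allow all cells to be used.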



Stack based languages

• Implement a postfix calculator
  – Reverse Polish notation
• 5 4 3 * 2 - +  =>  5 + ((4 * 3) - 2)
• Very simple to parse and interpret
• FORTH, Postscript are stack-based languages
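A postfix calculator along these lines needs little more than a stack. The sketch below is one possible version, assuming space-separated tokens, non-negative integer operands and the four basic operators; the function name eval_postfix is illustrative.

```cpp
#include <cctype>
#include <sstream>
#include <stack>
#include <stdexcept>
#include <string>

// Evaluate a space-separated postfix (RPN) expression over integers.
long eval_postfix(const std::string& expr) {
    std::stack<long> st;
    std::istringstream in(expr);
    std::string tok;
    while (in >> tok) {
        if (std::isdigit(static_cast<unsigned char>(tok[0]))) {
            st.push(std::stol(tok));  // operand: push
            continue;
        }
        if (tok.size() != 1 || st.size() < 2)
            throw std::invalid_argument("malformed expression");
        long b = st.top(); st.pop();  // right operand is on top
        long a = st.top(); st.pop();
        switch (tok[0]) {
            case '+': st.push(a + b); break;
            case '-': st.push(a - b); break;
            case '*': st.push(a * b); break;
            case '/':
                if (b == 0) throw std::domain_error("division by zero");
                st.push(a / b);
                break;
            default: throw std::invalid_argument("unknown operator");
        }
    }
    if (st.size() != 1) throw std::invalid_argument("malformed expression");
    return st.top();
}
```

For the slide's example, eval_postfix("5 4 3 * 2 - +") computes 4 * 3 = 12, then 12 - 2 = 10, then 5 + 10 = 15.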

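Array based stack

• How to know how big a stack shall ever be?
• When full, allocate a bigger table dynamically, and copy all previous values there
• O(n)?

[Figure: a full table 3 6 7 5 is copied into a larger table, which then holds 3 6 7 5 2]

• When full, create a 2x bigger table, copy the previous n elements:
• After every 2^k insertions, perform an O(n) copy
• O(n) for the n individual insertions, plus
• n/2 + n/4 + n/8 + … < n element copies
• Total: O(n) effort!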
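The doubling argument can be checked by counting element copies directly. This is a minimal sketch, assuming a table that starts with capacity 1; the struct name GrowingStack and the copies counter are illustrative additions used only for the bookkeeping.

```cpp
#include <cstddef>

// Array-based stack of ints that doubles its table when full and
// counts how many element copies the resizing has cost in total.
struct GrowingStack {
    int* table = new int[1];
    std::size_t capacity = 1;
    std::size_t size = 0;
    std::size_t copies = 0;  // bookkeeping for the analysis only

    GrowingStack() = default;
    GrowingStack(const GrowingStack&) = delete;
    GrowingStack& operator=(const GrowingStack&) = delete;
    ~GrowingStack() { delete[] table; }

    void push(int x) {
        if (size == capacity) {  // full: allocate a 2x bigger table
            int* bigger = new int[2 * capacity];
            for (std::size_t i = 0; i < size; ++i) bigger[i] = table[i];
            copies += size;
            delete[] table;
            table = bigger;
            capacity *= 2;
        }
        table[size++] = x;
    }
    int pop() { return table[--size]; }  // caller must ensure size > 0
};
```

After 1000 pushes the table has been copied at sizes 1, 2, 4, …, 512, that is 1023 element copies in total, which stays below 2n and matches the O(n) total effort.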

What about deletions?

• when n = 32 -> 33 (copy 32, insert 1)
• delete: 33 -> 32
  – should you shrink the table immediately?
  – Shrink only when the table becomes less than 1/4 full
  – Have to delete at least n/2 to decrease size
  – Have to add at least n to increase size
  – Most operations: O(1) effort
  – But a few operations take O(n) to copy
  – For any m operations: O(m) time

Amortized analysis

• Analyze the time complexity over the entire "lifespan" of the algorithm
• Some operations that cost more will be "covered" by many other operations taking less


Lists and dictionary ADT…

• How to maintain a dictionary using (linked) lists?
• Is k in D?
  – go through all elements d of D, test if d == k: O(n)
  – If sorted: d = first(D); while (d exists and d < k) d = next(D); then test if d == k
  – on average n/2 tests…
• Add(k, D) => insert(k, D) = O(1) or O(n)
  – test for uniqueness

Array based sorted list

• is d in D?
• Binary search in D

[Figure: sorted array with markers low, mid and high]

Binary search – recursive

BinarySearch(A[0..N-1], value, low, high) {
    if (high < low)
        return -1   // not found
    mid = (low + high) / 2
    if (A[mid] > value)
        return BinarySearch(A, value, low, mid-1)
    else if (A[mid] < value)
        return BinarySearch(A, value, mid+1, high)
    else
        return mid   // found
}

Binary search – recursive (overflow-safe midpoint)

BinarySearch(A[0..N-1], value, low, high) {
    if (high < low)
        return -1   // not found
    mid = low + ((high - low) / 2)   // Note: not (low + high) / 2 !!
    if (A[mid] > value)
        return BinarySearch(A, value, low, mid-1)
    else if (A[mid] < value)
        return BinarySearch(A, value, mid+1, high)
    else
        return mid   // found
}

Binary search – iterative

BinarySearch(A[0..N-1], value) {
    low = 0;
    high = N - 1;
    while (low <= high) {
        mid = low + ((high - low) / 2)   // Note: not (low + high) / 2 !!
        if (A[mid] > value)
            high = mid - 1
        else if (A[mid] < value)
            low = mid + 1
        else
            return mid   // found
    }
    return -1   // not found
}
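The iterative pseudocode translates almost line by line into C++. In this sketch the function name binary_search_index and the use of std::vector<int> are assumptions for illustration; the midpoint is computed as low + (high - low) / 2 because low + high can overflow a fixed-width integer for very large arrays.

```cpp
#include <vector>

// Iterative binary search in a sorted vector.
// Returns the index of value, or -1 if it is not present.
int binary_search_index(const std::vector<int>& A, int value) {
    int low = 0;
    int high = static_cast<int>(A.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  // avoids overflow of low + high
        if (A[mid] > value)
            high = mid - 1;
        else if (A[mid] < value)
            low = mid + 1;
        else
            return mid;  // found
    }
    return -1;  // not found
}
```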

Work performed

[Figure: sorted array with positions 1 … 36, first probe at position 18]

• x <=> A[18] ?  <
• x <=> A[9] ?   >
• x <=> A[13] ?  ==

• O(lg n)


Sorting

• given a list, arrange values so that L[1] <= L[2] <= … <= L[n]
• n elements => n! possible orderings
• One test L[i] <= L[j] can split the n! orderings into two parts (at best, two halves)
  – Make a binary tree and calculate the depth
• log(n!) = Ω(n log n)
• Hence, the lower bound for sorting is Ω(n log n)
  – using comparisons…

Decision tree model

• n! orderings (leaves)
• Height of such a tree?

Proof: log(n!) = Ω(n log n)

• log(n!) = log n + log(n-1) + log(n-2) + … + log(1)
          >= n/2 * log(n/2)     (half of the terms are at least log(n/2))
          = Ω(n log n)


The divide-and-conquer design paradigm

1. Divide the problem (instance) into subproblems.
2. Conquer the subproblems by solving them recursively.
3. Combine subproblem solutions.

Merge sort

Merge-Sort(A, p, r)
    if p < r
        then q = (p + r) / 2   // floor
             Merge-Sort(A, p, q)
             Merge-Sort(A, q+1, r)
             Merge(A, p, q, r)

It was invented by John von Neumann in 1945.

Example

• Applying the merge sort algorithm:

[Figure: worked example of merge sort]

Merge of two lists: Θ(n)

A, B – lists to be merged
L = new list;   // empty
while (A not empty and B not empty)
    if A.first() <= B.first()
        then append(L, A.first()); A = rest(A);
        else append(L, B.first()); B = rest(B);
append(L, A);   // all remaining elements of A
append(L, B);   // all remaining elements of B
return L

[Figure: Wikipedia visualization of merge sort; axes: position in array vs. value]
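The merge step and the recursive sort can be sketched together. This version works on std::vector<int> and returns new vectors rather than merging in place, which is a simplification chosen for readability; the names merge_lists and merge_sort are illustrative.

```cpp
#include <cstddef>
#include <vector>

// Merge two sorted vectors into one sorted vector: Θ(n) for n elements in total.
std::vector<int> merge_lists(const std::vector<int>& A, const std::vector<int>& B) {
    std::vector<int> L;
    L.reserve(A.size() + B.size());
    std::size_t i = 0, j = 0;
    while (i < A.size() && j < B.size()) {
        if (A[i] <= B[j]) L.push_back(A[i++]);  // <= keeps the sort stable
        else              L.push_back(B[j++]);
    }
    while (i < A.size()) L.push_back(A[i++]);  // all remaining elements of A
    while (j < B.size()) L.push_back(B[j++]);  // all remaining elements of B
    return L;
}

// Top-down merge sort: split, sort both halves recursively, merge.
std::vector<int> merge_sort(const std::vector<int>& A) {
    if (A.size() <= 1) return A;
    std::size_t q = A.size() / 2;
    std::vector<int> left(A.begin(), A.begin() + q);
    std::vector<int> right(A.begin() + q, A.end());
    return merge_lists(merge_sort(left), merge_sort(right));
}
```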

Run-time Analysis of Merge Sort

• Thus, the time required to sort an array of size n > 1 is:
  – the time required to sort the first half,
  – the time required to sort the second half, and
  – the time required to merge the two lists
• That is:

T(n) = \begin{cases} \Theta(1) & n = 1 \\ 2\,T(n/2) + \Theta(n) & n > 1 \end{cases}


Merge sort

• Worst case, average case, best case… Θ(n log n)
• Common wisdom:
  – Requires additional space for merging (in case of arrays)
• Homework*: develop an in-place merge of two lists implemented in arrays /compare speed/

[Figure: in-place merge with two positions a and b in the array; the cases L[a] <= L[b] and L[a] > L[b]]

Quicksort

• Proposed by C.A.R. Hoare in 1962.
• Divide-and-conquer algorithm.
• Sorts "in place" (like insertion sort, but not like merge sort).
• Very practical (with tuning).


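Partitioning version 2

pivot = A[R];
i = L; j = R-1;
while (i <= j) {
    while (A[i] < pivot) i++;              // will stop at pivot at the latest
    while (i <= j and A[j] >= pivot) j--;
    if (i < j) { swap(A[i], A[j]); i++; j--; }
}
A[R] = A[i];
A[i] = pivot;
return i;

[Figure: region L … R with the pivot in the last position; i scans from the left and j from the right until they cross; elements <= pivot end up on the left, elements > pivot on the right, and the pivot is moved between them. Wikipedia "video" of the partitioning]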
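The partitioning scheme above can be turned into a complete quicksort. This sketch keeps the slide's logic and the last element as pivot; the names partition_last and quicksort and the use of std::vector<int> are assumptions for illustration.

```cpp
#include <utility>
#include <vector>

// Partition A[L..R] around the pivot A[R].
// Returns the final index of the pivot.
int partition_last(std::vector<int>& A, int L, int R) {
    int pivot = A[R];
    int i = L, j = R - 1;
    while (i <= j) {
        while (A[i] < pivot) i++;             // stops at the pivot at the latest
        while (i <= j && A[j] >= pivot) j--;
        if (i < j) { std::swap(A[i], A[j]); i++; j--; }
    }
    A[R] = A[i];  // A[i] is the first element that is >= pivot
    A[i] = pivot;
    return i;
}

// Sort A[L..R] in place.
void quicksort(std::vector<int>& A, int L, int R) {
    if (L >= R) return;
    int p = partition_last(A, L, R);
    quicksort(A, L, p - 1);
    quicksort(A, p + 1, R);
}
```

Choosing the last element as pivot gives quadratic behavior on already sorted input, which is one reason for the pivot selection strategies that follow.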


Choice of pivot in Quicksort

• Select median of three…
• Select random – the opponent cannot choose the winning strategy against you!

http://ocw.mit.edu/OcwWeb/Electrical-Engineering-and-Computer-Science/6-046JFall-2005/VideoLectures/detail/embed04.htm

Random pivot

• Select the pivot randomly from the region (blue) and swap it with the last position
• Select the pivot as a median of 3 [or more] random values from the region
• Apply a non-recursive sort for arrays of fewer than 10-20 elements

[Figure: region L … R; i and j scan towards each other, leaving elements <= pivot on the left and elements > pivot on the right]

• 2-pivot version of Quicksort
  – (split in 3 regions!)
• Handle equal values (equal to pivots)


Alternative materials

• Quicksort average case analysis: http://eid.ee/10z
• https://secweb.cs.odu.edu/~zeil/cs361/web/website/Lectures/quick/pages/ar01s05.html
• http://eid.ee/10y - MIT OpenCourseWare - Asymptotic notation, Recurrences, Substitution, Master Method

[Figure: recursion tree for the master method; a node of cost f(n) has children of cost f(n/b); compare n^(log_b a) vs f(n)]


Back to sorting


We can sort in O(n log n)

• Is that the best we can do?
• Remember: using comparisons <, >, <=, >= we cannot do better than O(n log n)

How fast can we sort n integers?

• Sort people by their sex? (F/M, 0/1)
• Sort people by year of birth?


Radix sort

Radix-Sort(A, d)
    for i = 1 to d   /* least significant to most significant */
        use a stable sort to sort A on digit i


Radix sort using lists (stable)

Input: bba bbb adb aad aac ccc ccb aba cca

1. Distribute by the last letter, appending to the list of that letter:
   a: bba aba cca
   b: bbb adb ccb
   c: aac ccc
   d: aad
   Concatenated: bba aba cca bbb adb ccb aac ccc aad

2. Distribute by the middle letter:
   a: aac aad
   b: bba aba bbb
   c: cca ccb ccc
   d: adb
   Concatenated: aac aad bba aba bbb cca ccb ccc adb

3. Distribute by the first letter:
   a: aac aad aba adb
   b: bba bbb
   c: cca ccb ccc
   d: (empty)
   Sorted: aac aad aba adb bba bbb cca ccb ccc
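Radix sort with per-letter lists can be sketched in a few lines. The version below assumes strings of equal length d over the lowercase alphabet 'a' to 'z'; the function name radix_sort and the 26-bucket layout are illustrative choices.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// LSD radix sort for strings of equal length d over the alphabet 'a'..'z'.
// Each pass distributes the strings into per-letter lists and concatenates
// the lists in order; appending to a list keeps every pass stable.
void radix_sort(std::vector<std::string>& A, std::size_t d) {
    for (std::size_t pos = d; pos-- > 0; ) {  // least significant position first
        std::vector<std::vector<std::string>> bucket(26);
        for (const std::string& s : A) bucket[s[pos] - 'a'].push_back(s);
        A.clear();
        for (const auto& b : bucket)
            for (const std::string& s : b) A.push_back(s);
    }
}
```

Each pass costs O(n + alphabet size), so sorting n strings of length d takes O(d · n) for a fixed alphabet.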

Why not from left to right?

• Swap '0' with first '1'
• Idea 1: recursively sort first and second half
  – Exercise?

01011001001010111100010010010010010100100101010000010000

01011000010010111100010010011001010100100101010000010000

01011000010010010100010010011001010100100111110000010000

01011000010010010100000100001001010100100111110001001001


Bitwise sort left to right

• Idea 2:
  – swap elements only if the prefixes match…
  – For all bits from most significant
    • advance when 0
    • when 1 -> look for next 0
      – if prefix matches, swap
      – otherwise keep advancing on 0's and look for next 1

Bitwise left to right sort

/* Historical sorting – was used in Univ. of Tartu using assembler…. */
/* C implementation – Jaak Vilo, 1989 */
/* SORTTYPE is assumed to be an unsigned integer type. */

void bitwisesort(SORTTYPE *ARRAY, int size)
{
    int i, j, nrbits;
    SORTTYPE tmp;
    register SORTTYPE mask, curbit, group;

    nrbits = sizeof(SORTTYPE) * 8;
    curbit = (SORTTYPE)1 << (nrbits - 1);   /* set most significant bit 1 */
    mask = 0;                               /* mask of the already sorted area */

    do {   /* For each bit */
        i = 0;
    new_mask:
        for ( ; (i < size) && (!(ARRAY[i] & curbit)); i++)
            ;   /* Advance while bit == 0 */
        if (i >= size) goto array_end;
        group = ARRAY[i] & mask;   /* Save current prefix snapshot */
        j = i;                     /* memorize location of 1 */
        for (;;) {
            if (++i >= size) goto array_end;                  /* reached end of array */
            if ((ARRAY[i] & mask) != group) goto new_mask;    /* new prefix */
            if (!(ARRAY[i] & curbit)) {
                /* bit is 0 – need to swap with previous location of 1, A[i] <-> A[j] */
                tmp = ARRAY[i]; ARRAY[i] = ARRAY[j]; ARRAY[j] = tmp;
                j += 1;   /* swap and increase j to the next possible 1 */
            }
        }
    array_end:
        mask = mask | curbit;   /* area under mask is now sorted */
        curbit >>= 1;           /* next bit */
    } while (curbit);   /* until all bits have been sorted… */
}

Bitwise from left to right

• Swap '0' with first '1'

00100000010010010100001011001001010100100110010011111000

Bucket sort

• Assume uniform distribution
• Allocate O(n) buckets
• Assign each value to pre-assigned bucket
• Sort small buckets with insertion sort

Input: .78 .17 .39 .26 .72 .94 .21 .12 .23 .68

Bucket   Contents after sorting ("/" = empty)
0        /
1        .12 .17
2        .21 .23 .26
3        .39
4        /
5        /
6        .68
7        .72 .78
8        /
9        .94
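The bucket scheme above can be sketched as follows, assuming input values in the range [0, 1) and n buckets for n values; the function name bucket_sort is illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Bucket sort for values in [0, 1), assumed to be roughly uniformly distributed.
void bucket_sort(std::vector<double>& A) {
    const std::size_t n = A.size();
    if (n < 2) return;
    std::vector<std::vector<double>> bucket(n);  // O(n) buckets
    for (double x : A) {
        std::size_t b = static_cast<std::size_t>(x * n);  // pre-assigned bucket
        bucket[std::min(b, n - 1)].push_back(x);
    }
    A.clear();
    for (auto& b : bucket) {
        // Insertion sort: cheap because each bucket is expected to be small.
        for (std::size_t i = 1; i < b.size(); ++i) {
            double key = b[i];
            std::size_t j = i;
            while (j > 0 && b[j - 1] > key) { b[j] = b[j - 1]; --j; }
            b[j] = key;
        }
        A.insert(A.end(), b.begin(), b.end());
    }
}
```

If the values are not spread evenly, many of them land in the same bucket and the insertion sort inside that bucket dominates the running time.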


http://sortbenchmark.org/

• The sort input records must be 100 bytes in length, with the first 10 bytes being a random key
• Minutesort – max amount sorted in 1 minute
  – 116GB in 58.7 sec (Jim Wyllie, IBM Research)
  – 40-node 80-Itanium cluster, SAN array of 2,520 disks
• 2009, 500 GB Hadoop, 1406 nodes x (2 Quadcore Xeons, 8 GB memory, 4 SATA), Owen O'Malley and Arun Murthy, Yahoo Inc.
• Performance/Price Sort and PennySort

Sort Benchmark

• http://sortbenchmark.org/
• Sort Benchmark Home Page
• We have a new benchmark called GraySort, in memory of the father of the sort benchmarks, Jim Gray. It replaces TeraByteSort which is now retired.
• Unlike 2010, we will not be accepting early entries for the 2011 year. The deadline for submitting entries is April 1, 2011.
  – All hardware used must be off-the-shelf and unmodified.
  – For Daytona cluster sorts where input sampling is used to determine the output partition boundaries, the input sampling must be done evenly across all input partitions.

New rules for GraySort:

• The input file size is now minimum ~100TB or 1T records. Entries with larger input sizes also qualify.
• The winner will have the fastest SortedRecs/Min.
• We now provide a new input generator that works in parallel and generates binary data. See below.
• For the Daytona category, we have two new requirements. (1) The sort must run continuously (repeatedly) for a minimum 1 hour. (This is a minimum reliability requirement). (2) The system cannot overwrite the input file.


Order statistics

• Minimum – the smallest value
• Maximum – the largest value
• In general, the i'th value
• Find the median of the values in the array
• Median in sorted array A:
  – n is odd: A[(n+1)/2]
  – n is even: A[⌊(n+1)/2⌋] or A[⌈(n+1)/2⌉]

Order statistics

• Input: A set A of n numbers and i, 1 ≤ i ≤ n
• Output: x from A that is larger than exactly i-1 elements of A

Q: Find i'th value in unsorted data

A. O(n)
B. O(n log log n)
C. O(n log n)
D. O(n log² n)

Minimum

Minimum(A)
    min = A[1]
    for i = 2 to length(A)
        if min > A[i]
            then min = A[i]
    return min

n-1 comparisons.

Min and max together

• compare every two elements A[i], A[i+1]
• Compare the larger against the current max
• Compare the smaller against the current min
• About 3n/2 comparisons in total (3 per pair of elements)
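The pairwise scheme above can be sketched with an explicit comparison counter. The function name min_max and the comparisons out-parameter are assumptions added for illustration; the counter exists only to make the 3n/2 bound observable.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Find minimum and maximum together, processing the elements in pairs.
// comparisons is set to the number of element comparisons performed.
// Precondition: A is not empty.
std::pair<int, int> min_max(const std::vector<int>& A, std::size_t& comparisons) {
    comparisons = 0;
    int lo = A[0], hi = A[0];
    std::size_t i = 1;
    for (; i + 1 < A.size(); i += 2) {
        int a = A[i], b = A[i + 1];
        ++comparisons;
        if (a > b) std::swap(a, b);  // compare the pair: a is smaller, b is larger
        ++comparisons;
        if (a < lo) lo = a;          // smaller against current min
        ++comparisons;
        if (b > hi) hi = b;          // larger against current max
    }
    if (i < A.size()) {              // one element left over
        ++comparisons;
        if (A[i] < lo) lo = A[i];
        ++comparisons;
        if (A[i] > hi) hi = A[i];
    }
    return {lo, hi};
}
```

Comparing every element against both the current min and the current max would need about 2n comparisons; pairing first brings this down to about 3n/2.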

Selection in expected O(n)

Randomised-Select(A, p, r, i)
    if p = r then return A[p]
    q = Randomised-Partition(A, p, r)
    k = q – p + 1   // nr of elements in subarray A[p..q]
    if i <= k
        then return Randomised-Select(A, p, q, i)
        else return Randomised-Select(A, q+1, r, i-k)
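Randomized selection can be sketched iteratively. Note that this version is a variant, not a transcription of the pseudocode above: it uses a partition that places the pivot at its final index q, so it has an extra case for i == k. The function name randomized_select and the fixed seed are assumptions for illustration.

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Return the i-th smallest element (1-based) of A in expected O(n) time.
// A is taken by value because the selection rearranges it.
// Precondition: 1 <= i <= A.size().
int randomized_select(std::vector<int> A, std::size_t i) {
    std::mt19937 rng(12345);  // fixed seed, chosen only to make the sketch reproducible
    std::size_t p = 0, r = A.size() - 1;
    while (p < r) {
        // Pick a random pivot in A[p..r] and move it to the last position.
        std::uniform_int_distribution<std::size_t> pick(p, r);
        std::swap(A[pick(rng)], A[r]);
        int pivot = A[r];
        std::size_t q = p;  // invariant: A[p..q-1] < pivot
        for (std::size_t j = p; j < r; ++j)
            if (A[j] < pivot) std::swap(A[q++], A[j]);
        std::swap(A[q], A[r]);       // pivot lands at its final index q
        std::size_t k = q - p + 1;   // rank of the pivot within A[p..r]
        if (i == k) return A[q];
        if (i < k) r = q - 1;        // the answer is left of the pivot
        else { p = q + 1; i -= k; }  // the answer is right of the pivot
    }
    return A[p];
}
```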


Conclusion

• Sorting in general: O(n log n)
• Quicksort is rather good
• Linear time sorting is achievable when one does not assume only direct comparisons
• Find i'th value in unsorted data: expected O(n)
• Find i'th value: worst case O(n) – see CLRS

Ok…

• Lists – a versatile data structure for various purposes
• Sorting – a typical algorithm (many ways)
• Which sorting methods for array/list?
• Array: most of the important (e.g. update) tasks seem to be O(n), which is bad

Q: search for a value X in a linked list?

A. O(1)
B. O(log n)
C. O(n)

Can we search faster in linked lists?

• Why sort linked lists if search is O(n) anyway?
• Linked lists:
  – what is the "mid-point" of any sublist?
  – Therefore, binary search cannot be used…
• Or can it?

Skip lists

• Build several lists at different "skip" steps
• O(n) list
• Level 1: ~n/2
• Level 2: ~n/4
• …
• Level log n: ~2-3 elements…


Skip Lists

[Figure: skip list with four levels]

S3: -∞ +∞
S2: -∞ 15 +∞
S1: -∞ 15 23 +∞
S0: -∞ 10 15 23 36 +∞

Skip List

typedef struct nodeStructure *node;
typedef struct nodeStructure {
    keyType key;
    valueType value;
    node forward[1];   /* variable sized array of forward pointers */
};

What is a Skip List

• A skip list for a set S of distinct (key, element) items is a series of lists S0, S1, …, Sh such that
  – Each list Si contains the special keys +∞ and -∞
  – List S0 contains the keys of S in nondecreasing order
  – Each list is a subsequence of the previous one, i.e., S0 ⊇ S1 ⊇ … ⊇ Sh
  – List Sh contains only the two special keys
• We show how to use a skip list to implement the dictionary ADT

S3: -∞ +∞
S2: -∞ 31 +∞
S1: -∞ 23 31 34 64 +∞
S0: -∞ 12 23 26 31 34 44 56 64 78 +∞

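Illustration of lists

[Figure: skip list between -inf and inf; the keys are not fully recoverable from the extracted text (25, 30, 47, 99 are legible)]

• Key
• Right[ .. ] - right links in array
• Left[ .. ] - left links in array; array size – how high is the list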

9/14/171:33PM SkipLists 179

Search

• We search for a key x in a skip list as follows:
  – We start at the first position of the top list
  – At the current position p, we compare x with y ← key(after(p))
      x = y: we return element(after(p))
      x > y: we "scan forward"
      x < y: we "drop down"
  – If we try to drop down past the bottom list, we return NO_SUCH_KEY
• Example: search for 78

[Figure: search for 78 in a skip list with S3: −∞, +∞; S2: −∞, 31, +∞; S1: −∞, 23, 31, 34, 64, +∞; S0: −∞, 12, 23, 26, 31, 34, 44, 56, 64, 78, +∞.]
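The search steps can be sketched in C as follows. This is an illustration under stated assumptions, not the lecture's implementation: keys and values are ints, every node carries a fixed array of `MAX_LEVEL` forward pointers, the head node stands in for −∞, and a NULL pointer stands in for +∞. All identifiers (`sl_node`, `skiplist`, `sl_search`, `MAX_LEVEL`) are hypothetical.

```c
#include <stddef.h>

#define MAX_LEVEL 16   /* assumption: cap on the number of lists */

struct sl_node {
    int key, value;
    struct sl_node *forward[MAX_LEVEL];  /* forward[i] is after(p) in list S_i */
};

struct skiplist {
    struct sl_node head;   /* plays the role of the -inf sentinel */
    int height;            /* number of lists currently in use */
};

/* Return a pointer to the value stored under key x,
   or NULL when there is no such key. */
int *sl_search(struct skiplist *s, int x)
{
    struct sl_node *p = &s->head;
    for (int i = s->height - 1; i >= 0; i--) {          /* drop down */
        while (p->forward[i] != NULL && p->forward[i]->key < x)
            p = p->forward[i];                           /* scan forward */
        if (p->forward[i] != NULL && p->forward[i]->key == x)
            return &p->forward[i]->value;                /* x = y */
    }
    return NULL;   /* dropped below the bottom list */
}
```

The outer loop corresponds to the drop-down steps and the inner loop to the scan-forward steps, so the running time is the sum of the two counts.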


Randomized Algorithms

• A randomized algorithm performs coin tosses (i.e., uses random bits) to control its execution

• It contains statements of the type

    b ← random()
    if b = 0
      do A …
    else { b = 1 }
      do B …

• Its running time depends on the outcomes of the coin tosses

• We analyze the expected running time of a randomized algorithm under the following assumptions
  – the coins are unbiased, and
  – the coin tosses are independent

• The worst-case running time of a randomized algorithm is often large but has very low probability (e.g., it occurs when all the coin tosses give "heads")

• We use a randomized algorithm to insert items into a skip list
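For skip lists, the coin tossing reduces to counting heads before the first tails. A minimal sketch, assuming the standard library's `rand()` is an acceptable stand-in for a fair, independent coin and that a cap `MAX_LEVEL` (hypothetical) bounds the result:

```c
#include <stdlib.h>

#define MAX_LEVEL 16   /* assumption: upper bound so the loop always terminates */

/* Toss a coin until it comes up tails and return the number of heads seen,
   capped at MAX_LEVEL - 1. */
int random_level(void)
{
    int heads = 0;
    while (heads < MAX_LEVEL - 1 && rand() % 2 == 1)   /* 1 counts as heads */
        heads++;
    return heads;
}
```

The result is i with probability about 1/2^(i+1), so large values are possible but rare, which matches the remark that the worst case has very low probability.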


Insertion

• To insert an item (x, o) into a skip list, we use a randomized algorithm:
  – We repeatedly toss a coin until we get tails, and we denote with i the number of times the coin came up heads
  – If i ≥ h, we add to the skip list new lists Sh+1, …, Si+1, each containing only the two special keys
  – We search for x in the skip list and find the positions p0, p1, …, pi of the items with largest key less than x in each list S0, S1, …, Si
  – For j ← 0, …, i, we insert item (x, o) into list Sj after position pj

• Example: insert key 15, with i = 2

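[Figure: before the insertion, S0: −∞, 10, 23, 36, +∞; S1: −∞, 23, +∞; S2: −∞, +∞. After inserting 15 with i = 2, S0: −∞, 10, 15, 23, 36, +∞; S1: −∞, 15, 23, +∞; S2: −∞, 15, +∞; S3: −∞, +∞. The positions p0, p1, p2 mark where 15 is inserted in S0, S1, S2.]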
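The insertion steps can be sketched as follows, under the same kind of simplifying assumptions as before: int keys and values, a fixed `MAX_LEVEL` array of forward pointers per node, the head node as −∞ and NULL as +∞. The empty top list Sh is implicit here (unused levels of the head are NULL), and overwriting the value of an existing key is a choice of this sketch, since the slides assume distinct keys. All identifiers are hypothetical.

```c
#include <stdlib.h>

#define MAX_LEVEL 16   /* assumption: cap on the number of lists */

struct sl_node {
    int key, value;
    struct sl_node *forward[MAX_LEVEL];
};

struct skiplist {
    struct sl_node head;   /* -inf sentinel; NULL stands for +inf */
    int height;            /* number of lists in use */
};

/* Number of heads before the first tails, capped at MAX_LEVEL - 1. */
static int coin_tosses(void)
{
    int i = 0;
    while (i < MAX_LEVEL - 1 && rand() % 2 == 1)
        i++;
    return i;
}

/* Insert (x, o); if x is already present, overwrite its value.
   Returns 0 on success, -1 if memory allocation fails. */
int sl_insert(struct skiplist *s, int x, int o)
{
    struct sl_node *pos[MAX_LEVEL];   /* pos[j] = p_j: last node with key < x in S_j */
    struct sl_node *p = &s->head;
    for (int j = MAX_LEVEL - 1; j >= 0; j--) {
        while (p->forward[j] != NULL && p->forward[j]->key < x)
            p = p->forward[j];
        pos[j] = p;                   /* unused levels simply yield the head */
    }
    struct sl_node *hit = pos[0]->forward[0];
    if (hit != NULL && hit->key == x) {
        hit->value = o;
        return 0;
    }
    int i = coin_tosses();
    struct sl_node *n = calloc(1, sizeof *n);
    if (n == NULL)
        return -1;
    n->key = x;
    n->value = o;
    for (int j = 0; j <= i; j++) {    /* splice into S_0 .. S_i after p_j */
        n->forward[j] = pos[j]->forward[j];
        pos[j]->forward[j] = n;
    }
    if (i + 1 > s->height)
        s->height = i + 1;            /* new top lists came into use */
    return 0;
}
```

Giving every node `MAX_LEVEL` pointers trades memory for brevity; a variable-sized forward array avoids that overhead.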


Deletion

• To remove an item with key x from a skip list, we proceed as follows:
  – We search for x in the skip list and find the positions p0, p1, …, pi of the items with key x, where position pj is in list Sj
  – We remove positions p0, p1, …, pi from the lists S0, S1, …, Si
  – We remove all but one list containing only the two special keys
• Example: remove key 34

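[Figure: before the removal, S0: −∞, 12, 23, 34, 45, +∞; S1: −∞, 23, 34, +∞; S2: −∞, 34, +∞; S3: −∞, +∞, with p0, p1, p2 marking key 34 in S0, S1, S2. After removing 34, S0: −∞, 12, 23, 45, +∞; S1: −∞, 23, +∞; S2: −∞, +∞.]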
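Deletion mirrors insertion: find the predecessor of x in every list, unlink x where it occurs, then shrink the height. The sketch below assumes int keys, a fixed `MAX_LEVEL` array of forward pointers, heap-allocated nodes, and `height >= 1`; the empty top list is implicit. All identifiers are hypothetical.

```c
#include <stdlib.h>

#define MAX_LEVEL 16   /* assumption: cap on the number of lists */

struct sl_node {
    int key, value;
    struct sl_node *forward[MAX_LEVEL];
};

struct skiplist {
    struct sl_node head;   /* -inf sentinel; NULL stands for +inf */
    int height;            /* number of lists in use, at least 1 */
};

/* Remove the item with key x. Returns 1 if it was removed, 0 if absent.
   Nodes are assumed to have been allocated with malloc/calloc. */
int sl_delete(struct skiplist *s, int x)
{
    struct sl_node *pos[MAX_LEVEL];
    struct sl_node *p = &s->head;
    for (int j = s->height - 1; j >= 0; j--) {
        while (p->forward[j] != NULL && p->forward[j]->key < x)
            p = p->forward[j];
        pos[j] = p;                    /* predecessor of x in S_j, if x is there */
    }
    struct sl_node *victim = pos[0]->forward[0];
    if (victim == NULL || victim->key != x)
        return 0;
    for (int j = 0; j < s->height && pos[j]->forward[j] == victim; j++)
        pos[j]->forward[j] = victim->forward[j];   /* unlink from S_j */
    free(victim);
    while (s->height > 1 && s->head.forward[s->height - 1] == NULL)
        s->height--;                   /* drop lists left with only sentinels */
    return 1;
}
```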


Implementation v2

• We can implement a skip list with quad-nodes

• A quad-node stores:
  – item
  – link to the node before
  – link to the node after
  – link to the node below
  – link to the node above

• Also, we define special keys PLUS_INF and MINUS_INF, and we modify the key comparator to handle them

[Figure: a quad-node holding item x]
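A quad-node might be declared as below. This is a sketch assuming int keys, with `INT_MIN` and `INT_MAX` reserved as the special keys so that the ordinary `<` comparison already handles them; with other key types a custom comparator would be needed, as the slide says. The helper names `link_after` and `link_above` are illustrative.

```c
#include <limits.h>
#include <stddef.h>

#define MINUS_INF INT_MIN   /* assumption: int keys, extremes reserved as sentinels */
#define PLUS_INF  INT_MAX

struct quad_node {
    int key;
    void *element;
    struct quad_node *before, *after, *below, *above;
};

/* Link b directly after a within one list. */
void link_after(struct quad_node *a, struct quad_node *b)
{
    b->before = a;
    b->after = a->after;
    if (a->after != NULL)
        a->after->before = b;
    a->after = b;
}

/* Stack upper on top of lower (same key, adjacent lists). */
void link_above(struct quad_node *lower, struct quad_node *upper)
{
    lower->above = upper;
    upper->below = lower;
}
```

Compared with an array of forward pointers, the four explicit links make it possible to move in every direction from any node, at the cost of one node per list in which a key appears.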


Space Usage

• The space used by a skip list depends on the random bits used by each invocation of the insertion algorithm

• We use the following two basic probabilistic facts:
    Fact 1: The probability of getting i consecutive heads when flipping a coin is 1/2^i
    Fact 2: If each of n items is present in a set with probability p, the expected size of the set is np

• Consider a skip list with n items
  – By Fact 1, we insert an item in list Si with probability 1/2^i
  – By Fact 2, the expected size of list Si is n/2^i

• The expected number of nodes used by the skip list is

$$\sum_{i=0}^{h} \frac{n}{2^i} = n \sum_{i=0}^{h} \frac{1}{2^i} < 2n$$

• Thus, the expected space usage of a skip list with n items is O(n)


Height

• The running time of the search and insertion algorithms is affected by the height h of the skip list

• We show that with high probability, a skip list with n items has height O(log n)

• We use the following additional probabilistic fact:
    Fact 3: If each of n events has probability p, the probability that at least one event occurs is at most np

• Consider a skip list with n items
  – By Fact 1, we insert an item in list Si with probability 1/2^i
  – By Fact 3, the probability that list Si has at least one item is at most n/2^i

• By picking i = 3 log n, we have that the probability that S(3 log n) has at least one item is at most

    n/2^(3 log n) = n/n^3 = 1/n^2

• Thus a skip list with n items has height at most 3 log n with probability at least 1 - 1/n^2
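To get a feel for the bound, here is a worked instance, assuming n = 1024 and base-2 logarithms (both chosen only for illustration):

```latex
% Worked instance of the height bound, assuming n = 1024 = 2^{10}.
\[
  3\log_2 n = 30, \qquad
  \Pr[\,S_{30} \text{ has at least one item}\,]
  \le \frac{n}{2^{30}} = \frac{2^{10}}{2^{30}} = 2^{-20} = \frac{1}{n^2}
  \approx 9.5 \times 10^{-7}.
\]
```

So with about a thousand items, a skip list taller than 30 levels is expected less than once in a million constructions.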


Search and Update Times

• The search time in a skip list is proportional to
  – the number of drop-down steps, plus
  – the number of scan-forward steps

• The drop-down steps are bounded by the height of the skip list and thus are O(log n) with high probability

• To analyze the scan-forward steps, we use yet another probabilistic fact:
    Fact 4: The expected number of coin tosses required in order to get tails is 2

• When we scan forward in a list, the destination key does not belong to a higher list
  – A scan-forward step is associated with a former coin toss that gave tails

• By Fact 4, in each list the expected number of scan-forward steps is 2

• Thus, the expected number of scan-forward steps is O(log n)

• We conclude that a search in a skip list takes O(log n) expected time

• The analysis of insertion and deletion gives similar results
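Fact 4 is stated without proof; one standard way to justify it, for a fair coin with independent tosses, is sketched below. The symbol T (number of tosses up to and including the first tails) is introduced here for the derivation only, and the final bound treats the height h as O(log n).

```latex
% T = number of tosses of a fair coin up to and including the first tails.
\[
  \Pr[T = k] = \frac{1}{2^{k}}, \qquad
  \mathbb{E}[T] = \sum_{k \ge 1} \frac{k}{2^{k}} = 2 .
\]
% Shortcut: one toss is always made; with probability 1/2 it is heads
% and the process starts over.
\[
  \mathbb{E}[T] = 1 + \tfrac{1}{2}\,\mathbb{E}[T]
  \;\Longrightarrow\; \mathbb{E}[T] = 2,
  \qquad
  \mathbb{E}[\text{scan-forward steps}] \le 2\,(h + 1) = O(\log n).
\]
```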


Summary

• A skip list is a data structure for dictionaries that uses a randomized insertion algorithm

• In a skip list with n items
  – The expected space used is O(n)
  – The expected search, insertion and deletion time is O(log n)

• Using a more complex probabilistic analysis, one can show that these performance bounds also hold with high probability

• Skip lists are fast and simple to implement in practice

Conclusions

• Abstract data types hide implementations
• What matters is the functionality of the ADT
• Data structures and algorithms determine the speed of the operations on data

• Linear data structures provide good versatility
• Sorting: a most typical need/algorithm
• Sorting in O(n log n): Merge Sort, Quicksort
• Solving recurrences: a means to analyse
• Skip lists: a randomised data structure with O(log n) expected operations