ArrayStack (ArrayList), [ArrayDeque, and DualArrayDeque] implement the List interface using one or two arrays Review o get(i), set(i,x) take constant time.

• ArrayStack (ArrayList), [ArrayDeque, and DualArrayDeque] implement the List interface using one or two arrays

Review

o get(i), set(i,x) take constant time

o add(size(),x), remove(size()-1) [add(0,x), remove(0)] take constant amortized time

o Can waste a lot of space

− 2/3 of the array positions can be empty

o Not suitable for real-time applications

− grow(), shrink(), and balance() take O(size()) time.

• RootishArrayStack: A list implementation with

− get(i) and set(i,x) in constant time

− add(i,x) and remove(i) in O(1 + size()-i) time

− no more than O(size()^{1/2}) wasted space

− Suitable for real-time applications

o in some languages (not Java)

Coming up

• DualRootishArrayDeque

− A 2-ended version

• Store the stack as a List of blocks (arrays)

− block k has size (k+1), for k=0,1,2,...,r-1

− at most 2 blocks not full

RootishArrayStack

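[Figure: blocks 0 through 4 holding a | b c | d e f | g h i j | k l; the last block is only partly full]

public class RootishArrayStack<T> extends AbstractList<T> {
    List<T[]> blocks;   // block k is an array of length k+1
    int n;              // number of elements stored
    ...
}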

• How much space is wasted?

− r blocks have room for r(r+1)/2 elements

− To store n elements we need

− r(r+1)/2 ≥ n

− r ≥ (2n)^{1/2} blocks are sufficient

− We only waste O(n^{1/2}) space keeping track of the blocks

• The size of the last 2 blocks is at most 2r + 3

− Only waste O(n^{1/2}) space on non-full blocks

− Wasted space is only O(n^{1/2})

Space analysis

[Figure: the last two blocks, of sizes r+1 and r]
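As a worked version of the bound above (a sketch; the closed form for the smallest r is derived here and is not stated on the slides), solving r(r+1)/2 ≥ n for r gives:

```latex
\[
\frac{r(r+1)}{2} \ge n
\iff r^2 + r - 2n \ge 0
\iff r \ge \frac{-1 + \sqrt{1 + 8n}}{2},
\qquad\text{and}\qquad
\frac{-1 + \sqrt{1 + 8n}}{2} \le \sqrt{2n}.
\]
```

For example, n = 10 needs r = 4 blocks, whose total capacity is 4·5/2 = 10.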

• As usual:

− grow() if necessary

− shift elements i,...,size()-1 right by one position

RootishArrayStack – add(i,x)

public void add(int i, T x) {
    int r = blocks.size();
    if (r*(r+1)/2 < n + 1) grow();   // no room for one more element
    n++;
    for (int j = n-1; j > i; j--)    // shift elements i,...,n-2 right by one
        set(j, get(j-1));
    set(i, x);
}

• Also as usual:

− shift elements i+1,...,size()-1 left by one position

− shrink() if necessary

RootishArrayStack – remove(i)

public T remove(int i) {
    T x = get(i);
    for (int j = i; j < n-1; j++)    // shift elements i+1,...,n-1 left by one
        set(j, get(j+1));
    n--;
    shrink();
    return x;
}

• Add another block of size r

− runs in constant time in languages not requiring array initialization

− otherwise, takes O(r) = O(size()^{1/2}) time.

RootishArrayStack – grow()

protected void grow() {
    blocks.add(f.newArray(blocks.size()+1));   // the new block is one longer than the last
}

• Remove blocks until there are at most 2 partially empty blocks

RootishArrayStack – shrink()

protected void shrink() {
    int r = blocks.size();
    while (r > 0 && (r-2)*(r-1)/2 >= n) {   // the first r-2 blocks already hold n elements
        blocks.remove(blocks.size()-1);
        r--;
    }
}

• Find the block index b that contains element i (function i2b(i) )

• access element i - b(b+1)/2 within that block

RootishArrayStack get(i) and set(i,x)

public T get(int i) {
    int b = i2b(i);              // block containing element i
    int j = i - b*(b+1)/2;       // position of element i within block b
    return blocks.get(b)[j];
}

public T set(int i, T x) {
    int b = i2b(i);
    int j = i - b*(b+1)/2;
    T y = blocks.get(b)[j];
    blocks.get(b)[j] = x;
    return y;                    // the value that was replaced
}

• Converting the List index i into a block number b

− 0,...,i consists of i+1 elements

− Blocks 0,...,b can store (b+1)(b+2)/2 elements

− We want to find minimum integer b such that

o (b+1)(b+2)/2 ≥ i + 1

− Solve (b+1)(b+2)/2 = i + 1 using the quadratic equation

− quadratic equation gives a non-integer solution b’

o actually two solutions, but only one is positive

− set b = ⌈b'⌉

RootishArrayStack - i2b(i)
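The steps above can be sketched in Java. Expanding (b+1)(b+2)/2 = i + 1 gives b^2 + 3b - 2i = 0, whose positive root is b' = (-3 + sqrt(9 + 8i))/2. The class name RootishIndex and the helper offset are hypothetical names for illustration, and the two correction loops are an assumption added here to guard against floating-point rounding; this is a sketch, not necessarily how the course code implements i2b.

```java
class RootishIndex {
    /** Block index b such that b(b+1)/2 <= i < (b+1)(b+2)/2. */
    static int i2b(int i) {
        // Positive root of b^2 + 3b - 2i = 0, rounded up.
        double bPrime = (-3.0 + Math.sqrt(9.0 + 8.0 * i)) / 2.0;
        int b = (int) Math.ceil(bPrime);
        // Assumed safeguard: correct b if rounding pushed it off by one.
        while (b > 0 && (long) b * (b + 1) / 2 > i) b--;
        while ((long) (b + 1) * (b + 2) / 2 <= i) b++;
        return b;
    }

    /** Position of list index i inside its block. */
    static int offset(int i) {
        int b = i2b(i);
        return (int) (i - (long) b * (b + 1) / 2);
    }
}
```

Block 2, for instance, holds list indices 3, 4, and 5, so index 4 maps to block 2 at offset 1.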

• Theorem: A RootishArrayStack

− can perform get(i) and set(i,x) in constant time

− can perform add(i,x) and remove(i) in O(1+size()-i) time

− uses only O(size()^{1/2}) memory in addition to what is required to store its elements

RootishArrayStack - summary

• Key points:

− Real-time

o no amortization

− Low memory overhead

o O(n^{1/2}) versus O(n) for other array-based stacks

• Theorem: Any data structure that allows insertions will, at some point during a sequence of n insertions, be wasting at least n^{1/2} space.

Optimality of RootishArrayStack

• Proof: If the data structure uses more than n^{1/2} blocks

− n^{1/2} pointers (references) are being wasted just keeping track of blocks

− Otherwise, the data structure uses k ≤ n^{1/2} blocks

o some block B has size at least n/k ≥ n^{1/2}

o when B was allocated, it was empty and therefore was a waste of n^{1/2} space

• The use of many arrays to store data means that we can't do shifting with one call to System.arraycopy()

− Slower than other implementations when i is small

Practical Considerations

• Solving the quadratic equation in i2b(i) requires the square root operation

− This can be slow

− This can lead to rounding errors

o can be corrected by checking that b(b+1)/2 < i + 1 ≤ (b+1)(b+2)/2

− Lookup tables can speed things up

o we only want an integer square root

• Using a RootishArrayStack for the internal stacks within a DualArrayDeque we obtain:

−Theorem: A DualRootishArrayDeque

• can perform get(i), set(i,x) in constant time

• can perform add(i,x), remove(i) in O(1 + min{i,size()-i}) amortized time

• uses only O(size()^{1/2}) memory in addition to what is required to store its elements

DualRootishArrayDeque

• A real-time version is possible, see

−Brodnik, Carlsson, Demaine, Munro, and Sedgewick. Resizable arrays in optimal time and space. Proceedings of WADS 1999

• Array-based implementations of Lists, Queues, Stacks, and Deques have many advantages

−Constant-time access by position [get(i), set(i,x)]

−Constant amortized time addition and removal at the ends

−Space-efficient versions use only O(n^{1/2}) extra space

Review

• Big disadvantage

−Additions and removals in the interior are slow

• Running time is at least Ω(min{i, size()-i})

−An array-based list might use an array of length 2n to store n elements of data

• Lists and queues based on (singly and doubly) linked lists

–get(i), set(i,x) are slow

–add(), remove() with an iterator take constant time

• Space-efficient linked lists

Coming up…

• Singly-linked lists

–Efficient stacks and queues

• Doubly-linked lists

–Efficient deques

• Space-efficient doubly-linked lists

–Time/space tradeoff

Coming up…

• A list is a sequence of Nodes

• A Node contains

–a data value x

–a pointer next to the next node in the list

Singly-linked lists

[Figure: a → b → c → d → e → null]

protected class Node {
    T x;         // data value
    Node next;   // next node in the list, or null
}

• We keep track of the first node in the list (head)

• We might also keep track of the last node (tail)

Singly-linked lists (cont'd)

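public class SLList<T> extends AbstractQueue<T> {
    Node head;   // first node, or null if the list is empty
    Node tail;   // last node, or null if the list is empty
    int n;       // number of nodes
    ...
}

[Figure: a → b → ... → y → z → null, with head pointing to a and tail pointing to z]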

• A singly-linked list can implement a queue

− enqueue at the tail

− dequeue at the head

Queues as singly-linked lists

• Requires special care to manage head and tail correctly

− when adding to empty queue

− when removing last element from queue

[Figure: a → b → ... → y → z → null; head points to a (front of the line) and tail points to z (back of the line)]

Dequeuing (removing) an element

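[Figure: dequeuing from a → b → ... → y → z, and dequeuing from a list whose only node is e]

public T poll() {
    if (n == 0) return null;       // nothing to dequeue
    T x = head.x;
    head = head.next;
    if (--n == 0) tail = null;     // the queue just became empty
    return x;
}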

[Figure: enqueuing x at the tail of a → b]

public boolean offer(T x) {
    Node u = new Node();
    u.x = x;
    if (n == 0) {          // adding to an empty queue
        head = u;
        tail = u;
    } else {
        tail.next = u;     // link the old tail to u first
        tail = u;
    }
    n++;
    return true;
}

Enqueuing
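The two special cases named earlier (adding to an empty queue and removing the last element) can be exercised with a small self-contained sketch. The class name SLQueueSketch is hypothetical, and returning null from poll() on an empty queue is an assumption that follows the java.util.Queue convention.

```java
class SLQueueSketch<T> {
    private class Node {
        T x;
        Node next;
    }

    private Node head;   // front of the line
    private Node tail;   // back of the line
    private int n;       // number of queued elements

    /** Enqueue x at the tail. */
    boolean offer(T x) {
        Node u = new Node();
        u.x = x;
        if (n == 0) {
            head = u;          // empty queue: u is also the head
        } else {
            tail.next = u;     // link the old tail to u before moving tail
        }
        tail = u;
        n++;
        return true;
    }

    /** Dequeue from the head; returns null when the queue is empty. */
    T poll() {
        if (n == 0) return null;
        T x = head.x;
        head = head.next;
        if (--n == 0) tail = null;   // removed the last element
        return x;
    }

    int size() {
        return n;
    }
}
```

Offering a and then b and polling twice returns a and then b, after which the queue is empty again and can be reused.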

[Figure: a single node x, with head and tail both pointing to it]

Delicateness

public boolean offer(T x) {
    Node u = new Node();
    u.x = x;
    if (n == 0) {
        head = u;
        tail = u;
    } else {
        tail = u;
        tail.next = u;
    }
    n++;
    return true;
}

• This code is wrong

−can you see why?

• A singly-linked list can also be used as a stack

−push and pop are done by manipulating head

Stacks as singly-linked lists

[Figure: a → b → ... → y → z → null; head points to a (top of the stack) and tail points to z (bottom of the stack)]

Stack - push

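[Figure: pushing a onto the stack b → c → d → e, and pushing e onto an empty stack]

public T push(T x) {
    Node u = new Node();
    u.x = x;
    u.next = head;          // the new node goes in front of the old head
    head = u;
    if (n == 0) tail = u;   // first node is also the tail
    n++;
    return x;
}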

• In a singly-linked list, we can even do arbitrary insertions/deletions

− if we are given a pointer to the preceding element

• Getting a pointer to the ith node takes O(i+1) time

Arbitrary insertion and deletions

protected Node getNode(int i) {
    Node u = head;
    for (int j = 0; j < i; j++)   // follow i next pointers
        u = u.next;
    return u;
}

[Figure: a → b → ... → y → z → null, with a pointer u into the list]

• Does not work for first node

−no preceding node u!

Deleting a node

[Figure: deleting the node d that follows u, so that u.next becomes e]

protected void deleteNext(Node u) {
    if (u.next == tail) tail = u;   // deleting the last node
    u.next = u.next.next;
}

• Does not work for first node

−no preceding node u!

Adding a node

[Figure: adding node v after u, between u and e]

protected void addAfter(Node u, Node v) {
    v.next = u.next;
    u.next = v;
    if (u == tail) tail = v;   // v is the new last node
}

• Write code for

–add(i,x)

–remove(i)

• Code should run in O(1+i) time

In-Class Exercise

• Singly-linked lists support:

−push(x), pop(), enqueue(x), dequeue() in constant time (in the worst case)

−add(i,x), remove(i) in O(1+i) time

Singly-linked list summary

• One Node is created per list item

−Memory allocation overhead

−Node contains data + 1 pointer/reference (next)

−At least n pointers for a list of size n

• Singly-linked lists fall just short of being able to implement a deque

−No way to remove elements from the tail

Doubly-linked lists

[Figure: a → b → ... → y → z → null; the node y just before the tail can't be accessed except through head]

• Doubly-linked lists maintain two pointers (references) per node

−next - points to next node in the list

−prev - points to previous node in the list

Doubly-linked lists

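[Figure: null ← a ⇄ b ⇄ ... ⇄ y ⇄ z → null, with head pointing to a and tail pointing to z]

protected class Node {
    Node next, prev;   // neighbours in the list
    T x;               // data value
}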

• This code is incorrect – Why?

Removing a node (incorrect)

protected void remove(Node p) {
    p.prev.next = p.next;
    p.next.prev = p.prev;
    n--;
}

[Figure: node p between p.prev and p.next]

• Doesn't correctly handle boundary cases

− p == head (so p.prev == null)

− p == tail (so p.next == null)

− p == head and p == tail (so p.prev == p.next == null)

Removing a node (incorrect)

[Figure: the three boundary cases: p is the head, p is the tail, and p is the only node]

Versus

protected void remove(Node p) {
    if (p == head && p == tail) {   // p is the only node
        head = null;
        tail = null;
    } else if (p == head) {         // p is the first node
        head = p.next;
        p.next.prev = null;
    } else if (p == tail) {         // p is the last node
        tail = p.prev;
        p.prev.next = null;
    } else {                        // p is an interior node
        p.prev.next = p.next;
        p.next.prev = p.prev;
    }
    n--;
}


• Code for boundary cases is troublesome

− hard to write correctly

− lots of cases

− slow to execute (on some architectures)

• We would like to get rid of boundary cases

− need to get rid of head and tail

Code is error prone


• Replace head and tail with a dummy Node

− dummy.next replaces head

− dummy.prev replaces tail

− dummy is always present, even in an empty list

The dummy node technique

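[Figure: circular list dummy ⇄ a ⇄ b ⇄ ... ⇄ y ⇄ z ⇄ dummy]

public class DLList<T> extends AbstractSequentialList<T> {
    protected Node dummy;   // stands in for both head and tail
    protected int n;        // number of data nodes (dummy not counted)
    ...
}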

Creating a new (empty) list

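public DLList() {
    dummy = new Node();
    dummy.next = dummy;   // an empty list is the dummy linked to itself
    dummy.prev = dummy;
    n = 0;
}

[Figure: an empty list, where dummy.next and dummy.prev both point to dummy]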

• Now removing a node is easy

Removing a node

[Figure: node p between p.prev and p.next]

Removing a node

[Figure: a list containing only dummy and p]

• The same code works even when removing the last node

− p.prev == p.next == dummy
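A minimal sketch of the dummy-node technique, assuming the Node fields (next, prev, x) and the dummy and n members introduced above. The class name DLListSketch, the helper addBefore, and the bounds checks are hypothetical additions for illustration. Note that remove(Node) needs no special cases, even when p is the only data node.

```java
class DLListSketch<T> {
    private class Node {
        T x;
        Node prev, next;
    }

    private final Node dummy = new Node();
    private int n;

    DLListSketch() {
        dummy.next = dummy;   // empty list: dummy linked to itself
        dummy.prev = dummy;
    }

    /** Walk from whichever end is closer; getNode(n) returns dummy. */
    private Node getNode(int i) {
        Node p;
        if (i < n / 2) {
            p = dummy.next;
            for (int j = 0; j < i; j++) p = p.next;
        } else {
            p = dummy;
            for (int j = n; j > i; j--) p = p.prev;
        }
        return p;
    }

    /** Insert a new node holding x just before p. */
    private void addBefore(Node p, T x) {
        Node u = new Node();
        u.x = x;
        u.prev = p.prev;      // read p.prev before overwriting it
        u.next = p;
        u.next.prev = u;
        u.prev.next = u;
        n++;
    }

    /** Unlink p; no boundary cases because dummy is always present. */
    private void remove(Node p) {
        p.prev.next = p.next;
        p.next.prev = p.prev;
        n--;
    }

    void add(int i, T x) {
        if (i < 0 || i > n) throw new IndexOutOfBoundsException();
        addBefore(getNode(i), x);
    }

    T remove(int i) {
        if (i < 0 || i >= n) throw new IndexOutOfBoundsException();
        Node p = getNode(i);
        remove(p);
        return p.x;
    }

    T get(int i) {
        if (i < 0 || i >= n) throw new IndexOutOfBoundsException();
        return getNode(i).x;
    }

    int size() {
        return n;
    }
}
```

Removing elements until the list is empty and then adding again works without any check for head or tail, since dummy.next and dummy.prev simply point back at dummy.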

• Add the new Node u just before Node p

Adding a node

protected Node add(Node u, Node p) {
    u.next = p;
    u.prev = p.prev;
    u.next.prev = u;
    u.prev.next = u;
    n++;
    return u;
}

[Figure: node u inserted between p.prev and p]

• This code is not correct. Why?

Exercise

protected Node add(Node u, Node p) {
    u.next = p;
    u.next.prev = u;
    u.prev = p.prev;
    u.prev.next = u;
    n++;
    return u;
}

[Figure: node u inserted between p.prev and p]

• To find the ith node search

− from the front if i < size()/2

− from the back otherwise

• O(1+min{i, size()-i}) time

• Fast

− when i~0 (head)

− when i~size() (tail)

Finding a node

protected Node getNode(int i) {
    Node p = null;
    if (i < n/2) {                     // closer to the front
        p = dummy.next;
        for (int j = 0; j < i; j++)
            p = p.next;
    } else {                           // closer to the back
        p = dummy;
        for (int j = n; j > i; j--)
            p = p.prev;
    }
    return p;
}

• add(i,x) and remove(i) are now easy

−Find the appropriate node p

−Add x before p (or remove p)

• Takes O(1 + min{i, size()-i}) time

Removing and Adding

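public T remove(int i) {
    Node p = getNode(i);
    remove(p);
    return p.x;
}

public void add(int i, T x) {
    Node u = new Node();   // wrap x in a new node
    u.x = x;
    add(u, getNode(i));    // insert u just before the node currently at index i
}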

• get(i) and set(i,x) are easy too

−and take O(1 + min{i, size()-i}) time

Getting and setting

public T get(int i) {
    return getNode(i).x;
}

public T set(int i, T x) {
    Node u = getNode(i);
    T y = u.x;
    u.x = x;
    return y;   // the value that was replaced
}

• Doubly-linked lists support

−add(i,x), remove(i) in O(1 + min{i,size()-i}) time

• deque operations run in constant time per operation

−get(i), set(i,x) in O(1+min{i,size()-i}) time

−insertion/removal of any node in constant time

• given a reference to the node being deleted or

• a reference to the node after the insertion

Doubly-linked lists - summary

• Linked lists are great, except

−Each value is stored in its own list node

• Each insertion requires allocating a new node

• Each node stores 2 pointers

−Wasted space is at least

2 × size() × sizeof(pointer)

• If data values are small (e.g., Integer) then wasted space can exceed the space for data

Memory-efficient linked lists

• Idea:

−group list elements into blocks (arrays)

−blocks have size b+1

−each block stores b-1, b, or b+1 values

• except the last block, which can be more empty

−store the blocks in a linked list

Memory-efficient linked lists

[Figure: the elements a through j grouped into blocks, with b = 3; the last block is only partly full]

• The number of blocks is at most

−1 + size()/(b-1)

−each block wastes a constant [O(1)] amount of space

−wasted space is O(b+n/(b-1))

−By making b larger we can reduce the wasted space

• limit is b ~ n^{1/2}

Space analysis

• We represent each block as a BoundedArrayDeque

−an ArrayDeque whose backing array a has a fixed size

−a.length = b+1

−no grow() or shrink() operations

Block data structure

• Sometimes we will want to

−move the last element in node u to the front of u.next

−move the first element in block i to the back of block i-1

−These operations take constant time in a BoundedArrayDeque

• To find the ith element we find the block that contains it

− Takes time O(1 + min{i, size()-i}/b)

− faster than a standard linked list

Finding an element

public T get(int i) {
    if (i < n/2) {
        Node u = first;
        int b = 0;   // number of elements in the blocks before u
        while (b + u.x.size() < i + 1) {
            b += u.x.size();
            u = u.next;
        }
        return u.x.get(i-b);
    } else { ...

[Figure: blocks a b c | d e f g | h i | j, whose starting offsets are b=0, b=3, b=7, and b=9]
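The front-to-back walk described above can be tried in isolation. This sketch replaces the linked blocks with a plain array of block sizes, which is an assumption made only to keep the example self-contained; the class name BlockLocator and the method locate are hypothetical. The sizes {3, 4, 2, 1} used in the note below are read off the offsets b = 0, 3, 7, 9, with the last block assumed to hold a single element.

```java
class BlockLocator {
    /**
     * Returns {block index, offset inside that block} for list index i,
     * walking from the first block. Assumes 0 <= i < sum of blockSizes.
     */
    static int[] locate(int[] blockSizes, int i) {
        int b = 0;   // number of elements in the blocks before block u
        int u = 0;   // current block
        while (b + blockSizes[u] < i + 1) {
            b += blockSizes[u];
            u++;
        }
        return new int[] { u, i - b };
    }
}
```

With block sizes {3, 4, 2, 1}, index 5 lands in block 1 at offset 2, and index 9 lands in block 3 at offset 0. The walk visits one block per step rather than one element, which is where the division by b in the running time comes from.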

• To insert into block j

– check if any of blocks j, j+1, j+2,...,j+b-1 are not full

• if yes, then there is space, so do shifting to make room

• requires at most 2b deque operations

• requires shifting at most b elements in one of the deques ( O(b) time )

Insertion - easy case O(b) time

[Figure: the list before and after inserting x]

• If blocks j, j+1, ..., j+b-1 are all full

– these b blocks contain a total of b(b+1) elements

– repartition them into b+1 blocks each containing b elements

– then insert into the (now not full) block

Insertion – full case O(b^2) time

[Figure: the list before and after repartitioning and inserting x; the repartitioning takes O(b^2) time]

• To remove an element from block j

– if any of blocks j, j+1, j+2,... contain more than b-1 elements

• do shifting so that block j contains at least b elements

• remove element from block j

Removal - easy case O(b) time

[Figure: the list before shifting, after shifting, and after deleting the marked element]

Removal – hard case O(b^2) time

• If blocks j,...,j+b-1 each contain b-1 elements

– we have b blocks each containing b-1 elements

– redistribute so that we have b-1 blocks each containing b elements

– delete the element

[Figure: the list before and after redistributing and deleting; the redistribution takes O(b^2) time]

• A CompactDLList has the following properties

–wasted space is O(size()/b)

–get(i) and set(i,x) each take

•O(1 + min{i, size()-i}/b) time

– remove(i) and add(i,x) take time

•O(b + min{i, size()-i}/b) usually

•O(b^2 + min{i, size()-i}/b) occasionally

• What do we mean by usually and occasionally?

Space-Efficient Linked List (SElist)

• We use a credit scheme

– A block with b+1 elements or b-1 elements has 1 credit

– A block with b elements has 0 credits

• Main idea:

– When insertion and removal take b^2 time, we will take away b spare credits

– With every insertion and removal we create at most one new credit

• Conclusion: At most one out of every b insertions/removals takes b^2 time

Amortized analysis of CompactDLLists

• A hamburger costs 8 credits

– [analogous to: operation that takes O(b^2) time]

Weight-watchers analogy

• Every hour of workout gives you one credit

– [analogous to: operation that takes O(b) time]

• The maximum number of hamburgers you are allowed to eat is

– (# hours spent working out)/8

• At most one credit is created by insertion

– (maybe none)

Analysis of insertion (not full case)

[Figure: the list before and after inserting x, with the credits (₡) held by each block]

• b credits are freed up and one credit is added

Analysis of insertion (full case)



[Figure: the list after repartitioning and inserting x, with b = 3; 3 credits are freed and 1 credit is added]

• At most one new credit is added

Analysis of removal (easy case)

[Figure: the list before shifting, after shifting, and after deleting, with the credits (₡) held by each block]

• b credits are freed and one credit is added.

Analysis of removal (sparse case)

[Figure: the list before and after redistributing and deleting, with b = 3; 3 credits are freed and 1 credit is added]

• In a sequence of n add/remove operations

– At most n/b take O(b^2) time

– Others take O(b) time

Analysis wrap up

• Total time is

– O(nb + (n/b)b^2) = O(nb)

– O(b) amortized time per operation
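Written out term by term (a sketch of the same calculation, with T(n) introduced here as a name for the total time of n add/remove operations):

```latex
\[
T(n) \;\le\; \underbrace{n \cdot O(b)}_{\text{every operation}}
      \;+\; \underbrace{\frac{n}{b} \cdot O(b^2)}_{\text{expensive operations}}
      \;=\; O(nb) + O(nb) \;=\; O(nb),
\qquad
\frac{T(n)}{n} = O(b).
\]
```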

Compact doubly-linked list (summary)

• Theorem: A CompactDLList is an implementation of the List interface with the following properties:

– get(i) and set(i,x) each take

•O(1 + min{i, size()-i}/b) time

– remove(i) and add(i,x) take

•O(b + min{i, size()-i}/b) amortized time

– The amount of memory used beyond that needed to store the data is O(n/b)

– The number of memory allocations/deallocations during a sequence of n add/remove operations is O(n/b)
