Data Structures/Containers Overviews Standard Containers plus properties

Dec 21, 2015

Page 1: Data Structures/Containers Overviews Standard Containers plus properties.

Data Structures/Containers Overviews

Standard Containers

plus properties

Page 2: Data Structures/Containers Overviews Standard Containers plus properties.

Consumers vs Producers

• Intelligent Consumers of Data Structures know

– What operations are supported

– Complexity of operations

– Memory costs of operations

– In your code documentation, include costs if not O(1)

• Isn’t this enough?

• Can’t we let the theoreticians build great data structures and algorithms and use them?

• Sometimes - but

– may need to adapt the algorithms

– Or reuse the ideas

– Or even, be a producer

Page 3: Data Structures/Containers Overviews Standard Containers plus properties.

Review

Data Structure             Retrieve/add       Complexity
Stack                      Get youngest       O(1)
                           Put                O(1)
Queue                      Get oldest         O(1)
                           Put                O(1)
Linked List (unordered)    Get any            O(N)
                           Put                O(1)
Linked List (ordered)      Get any            O(N)
                           Put                O(N)
Search Tree                Get or put         O(log N)
Hash Table                 Get or put         Usually O(1)
Priority Queue             Get minimum        O(1)
                           Delete minimum     O(log N)
                           Put                O(log N)

Page 4: Data Structures/Containers Overviews Standard Containers plus properties.

Data Structures

• Why these data structures?

– experience shows these are generally useful building blocks

– Different classes of programs have different building blocks

– Maybe more building blocks should be discovered.

• Composition/hybrid data structures

– can compose data structures

– e.g. list of trees, hashtable of binary trees, trees can be implemented as list of lists

– Hybrid algorithms also useful, e.g. quicksort+bubblesort.

Page 5: Data Structures/Containers Overviews Standard Containers plus properties.

Selecting a Data Structure

• In TSP, suppose we move a city in a tour?

– How should tour be represented?

• In keeping a personal address book, add/delete a person

• If managing a telephone directory that needs to print names in order

• add/delete bank transactions

• Spell checker vs spell corrector

Page 6: Data Structures/Containers Overviews Standard Containers plus properties.

Selecting a Data Structure

• In TSP, suppose we add/delete a city to a tour?

– How should tour be represented?

– linked list

• In keeping a personal address book, add/delete a person

– hash table

• If managing a telephone directory that needs to print names in order

– sorted tree

• add/delete bank transactions

– queue, to maintain time-order (single point)

– priority queue, multiple entry points

Page 7: Data Structures/Containers Overviews Standard Containers plus properties.

Selecting a Data Structure

• Polynomials

– methods: add, multiply, solve, factor, differentiate, integrate, find extrema,...

– representation:

• dense entries: array

– position implicitly encodes degree

– implicit information is more efficient

• sparse entries: list of pairs (degree, coefficient)

– information explicit

– explicit information is more comprehensible
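The two representations above can be sketched as follows. This is a minimal illustration, not the course's implementation: the class names DensePoly and SparsePoly and their methods are assumptions made for the example, and only evaluation is shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names; the slides only describe the two representations.
class DensePoly {
    // coeffs[i] is the coefficient of x^i, so the degree is implicit in the index.
    final int[] coeffs;

    DensePoly(int... coeffs) { this.coeffs = coeffs; }

    long eval(long x) {
        long result = 0;
        // Horner's rule, working from the highest degree down.
        for (int i = coeffs.length - 1; i >= 0; i--) {
            result = result * x + coeffs[i];
        }
        return result;
    }
}

class SparsePoly {
    // Each entry is an explicit {degree, coefficient} pair.
    final List<int[]> terms = new ArrayList<>();

    SparsePoly term(int degree, int coeff) {
        terms.add(new int[] {degree, coeff});
        return this;
    }

    long eval(long x) {
        long result = 0;
        for (int[] t : terms) {
            long power = 1;
            for (int i = 0; i < t[0]; i++) power *= x;  // x^degree
            result += t[1] * power;
        }
        return result;
    }
}
```

For 3*x^3 + x^2 + 1, the dense form stores four coefficients (including the zero for x), while the sparse form stores three pairs. For something like x^1000 + 1 the sparse form stores two pairs and the dense form 1001 entries.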

Page 8: Data Structures/Containers Overviews Standard Containers plus properties.

Class Polynomial

• Constructor Polynomial(String s)

e.g. new Polynomial("3*x^3 + x^2 + 1")

Methods:

void add(Polynomial p)

void mult(Polynomial p)

public String toString()

Non-obvious

private void simplify()

private void sort()

Theorem: guaranteed, absolute simplification is impossible.

Page 9: Data Structures/Containers Overviews Standard Containers plus properties.

Polynomial

• Representation/implementation

Array: size = maximum degree+1

Linked list: size = number of terms, where

term = pair(coeff, degree)

Useful: class Term implements Comparable

just define int compareTo(Object o)

Page 10: Data Structures/Containers Overviews Standard Containers plus properties.

Term

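class Term implements Comparable
{ int coeff, exp;

  Term(int c, int e)
  { coeff = c;
    exp = e;
  }

  // Orders terms by descending exponent, then by descending coefficient.
  public int compareTo(Object o)
  {
    Term t = (Term)o;
    // Integer.compare avoids the overflow that plain subtraction can cause.
    if (t.exp != exp) return Integer.compare(t.exp, exp);
    else return Integer.compare(t.coeff, coeff);
  } }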

Page 11: Data Structures/Containers Overviews Standard Containers plus properties.

Polynomial with collections

• Collections.sort(LinkedList l)

– sorts the entries in O(n log n) time using their natural ordering (i.e. the entries in l implement Comparable)

• Collections.sort(ArrayList a)

– same complexity

• Collections.sort(LinkedList l, Comparator c)

– you can change the ordering by defining a new object, a comparator.

– Comparator is an interface whose key method is

– int compare(Object o1, Object o2)

– Comparators and Comparables can be used to sort and to find extrema (minimum or maximum)
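The calls above might be used as in the sketch below. Monomial is a hypothetical, generics-based stand-in for a polynomial term, and SortDemo and its method names are assumptions made for the example. The library calls (Collections.sort, Collections.max, Comparator.comparingInt) are standard java.util.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a polynomial term, written with generics.
class Monomial implements Comparable<Monomial> {
    final int coeff, exp;

    Monomial(int coeff, int exp) { this.coeff = coeff; this.exp = exp; }

    // Natural ordering: descending exponent.
    @Override
    public int compareTo(Monomial other) {
        return Integer.compare(other.exp, exp);
    }
}

class SortDemo {
    // An alternative ordering supplied as a separate object: ascending coefficient.
    static final Comparator<Monomial> BY_COEFF = Comparator.comparingInt(m -> m.coeff);

    // Sorts in place using the natural ordering (Comparable).
    static void sortByExponent(List<Monomial> terms) {
        Collections.sort(terms);
    }

    // Sorts in place using the comparator instead of the natural ordering.
    static void sortByCoefficient(List<Monomial> terms) {
        Collections.sort(terms, BY_COEFF);
    }

    // Finds an extremum without sorting; this is a single O(n) pass.
    static Monomial largestCoefficient(List<Monomial> terms) {
        return Collections.max(terms, BY_COEFF);
    }
}
```

The same list can be sorted either way without touching the Monomial class, which is the point of keeping the comparator separate.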

Page 12: Data Structures/Containers Overviews Standard Containers plus properties.

Linked List

• Methods

– boolean isEmpty()

– void insert(Object o) O(1); O(N) if ordered

– void delete(Object o) O(N) even if ordered

– … find(Object o) O(N) even if ordered

• Uses:

– languages like LISP, Scheme, CLOS are based on lists

– everything can be done with lists

– one-size fits all => expensive

Page 13: Data Structures/Containers Overviews Standard Containers plus properties.

Ordered Linked List

• Methods

– boolean isEmpty()

– void insert(Object o) O(N)

– void delete(Object o) O(N)

– … find(Object o) O(N)

• Not great performance for work done.

• OK if list short.

Page 14: Data Structures/Containers Overviews Standard Containers plus properties.

Lists

• Types of lists: circular, singly-linked, doubly-linked, ordered lists, list of lists = trees

• Implementable as dynamic arrays

– if insert(o) overflows the array, allocate a new array that is twice as large.

• In Collections, LinkedList is a doubly linked list

– boolean contains(Object o)

– boolean add(Object o)

– boolean remove(Object o)

– Iterator iterator()

• supports hasNext(), next() and remove()

• What type of list do you need?

Page 15: Data Structures/Containers Overviews Standard Containers plus properties.

Dynamic Arrays

• What’s the problem with ordinary arrays?

– Overflow

– Replace array by new class DynamicArray.

– When array overflows, allocate twice as much space and copy old values into new array.

• Comparison with linked list

– Storage: depends on size of objects

– For primitives, dynamic arrays require less storage.

– Time: depends on operations

• adding at head bad, at end good.

– Know your domain. What operations occur? Frequency?
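The doubling strategy described above can be sketched like this. DynamicArray is the slide's name; the int element type, the initial capacity of 2, and the method names are assumptions made for the example.

```java
import java.util.Arrays;

// Minimal sketch of a growable int array; members are illustrative assumptions.
class DynamicArray {
    private int[] data = new int[2];  // small initial capacity, chosen arbitrarily
    private int size = 0;

    // Appends at the end: O(1) amortized, O(N) when a copy is triggered.
    void add(int value) {
        if (size == data.length) {
            // Overflow: allocate twice as much space and copy the old values.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    int size() { return size; }

    int capacity() { return data.length; }
}
```

Because the capacity doubles, N appends cause fewer than 2N element copies in total, which is why adding at the end is cheap on average. Adding at the head would shift every element on each insert.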

Page 16: Data Structures/Containers Overviews Standard Containers plus properties.

Splay Lists

• Splaying is a new idea: Probabilistic ordering

• No moving of elements on inserts, but on finds.

• The goal is to have good average (amortized) performance for finding elements.

• Insert(object o) O(1) just add to front

• Remove(object o) O(N) no change

• Find(object o) O(N) worst case

– p*N on average, where p is the probability of o

– Action: When you find o, move it to the front

– General: if p1 > p2 > … > pn are the probabilities of o1 … on, then the list will (on average) look like o1 -> o2 -> … -> on.

– Or: the expected rank of oi is i.
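A minimal sketch of the move-to-front action described above, built on java.util.LinkedList. The class name MoveToFrontList and its methods are assumptions made for the example.

```java
import java.util.LinkedList;

// Hypothetical self-organizing list: finds move the found item to the front.
class MoveToFrontList<T> {
    private final LinkedList<T> items = new LinkedList<>();

    // O(1): just add to the front.
    void insert(T item) {
        items.addFirst(item);
    }

    // O(N) worst case: linear scan, then move the hit to the front.
    boolean find(T item) {
        if (items.removeFirstOccurrence(item)) {
            items.addFirst(item);
            return true;
        }
        return false;
    }

    // Returns the current front element, or null if the list is empty.
    T first() {
        return items.peekFirst();
    }
}
```

Items that are looked up often tend to sit near the front, so later finds for them scan only a few elements.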

Page 17: Data Structures/Containers Overviews Standard Containers plus properties.

Stack

• Stack: Main Methods

– void push(object) O(1)

– void pop() O(1)

– Object top() O(1)

– boolean isEmpty() O(1)

• Uses

– hold function calls (recursion)

– test for balanced parentheses

– operator parsing

• Easily implemented as a singly linked list
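One way the singly linked implementation could look, using the method names from the slide. The class name LinkedStack, the generic element type, and the exception thrown on an empty stack are assumptions made for the example.

```java
import java.util.NoSuchElementException;

// Sketch of a stack backed by a singly linked list; every operation is O(1).
class LinkedStack<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;

        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    private Node<T> head;  // the youngest element, or null when empty

    void push(T value) {
        head = new Node<>(value, head);
    }

    void pop() {
        if (head == null) throw new NoSuchElementException("stack is empty");
        head = head.next;
    }

    T top() {
        if (head == null) throw new NoSuchElementException("stack is empty");
        return head.value;
    }

    boolean isEmpty() {
        return head == null;
    }
}
```

Only the head of the list is ever touched, which is why a singly linked list is enough.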

Page 18: Data Structures/Containers Overviews Standard Containers plus properties.

Stack Applications

• Syntax checker

– if next token is a paren, e.g. one of ( { [ ] } )

• if open-paren, push on stack

• if closed-paren, check if it matches the top of the stack

– if it matches, pop, else return error

• Evaluation of Postfix (or build a tree)

– If token is operand, push on stack

– If token is operator, let k be its arity

• do k pops

• apply operator to those elements

• push result

• Search trees (depth-first search: later in course)

• Backtracking algorithms (later in course)
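The syntax checker and the postfix evaluation described above can be sketched with java.util.ArrayDeque used as a stack. The class and method names are assumptions made for the example, and the evaluator handles only space-separated integer tokens and the four binary operators (arity 2).

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackApps {
    // Returns true if every closing paren matches the most recent open paren.
    static boolean isBalanced(String text) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : text.toCharArray()) {
            if (c == '(' || c == '[' || c == '{') {
                stack.push(c);
            } else if (c == ')' || c == ']' || c == '}') {
                if (stack.isEmpty()) return false;
                char open = stack.pop();
                boolean matches = (open == '(' && c == ')')
                        || (open == '[' && c == ']')
                        || (open == '{' && c == '}');
                if (!matches) return false;
            }
        }
        return stack.isEmpty();  // leftover open parens are an error
    }

    // Evaluates a postfix expression such as "3 4 + 2 *".
    static int evalPostfix(String expr) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : expr.trim().split("\\s+")) {
            switch (token) {
                case "+": case "-": case "*": case "/": {
                    if (stack.size() < 2) {
                        throw new IllegalArgumentException("too few operands");
                    }
                    int right = stack.pop();  // popped first, so it is the right operand
                    int left = stack.pop();
                    stack.push(apply(token, left, right));
                    break;
                }
                default:
                    stack.push(Integer.parseInt(token));  // operand
            }
        }
        if (stack.size() != 1) throw new IllegalArgumentException("malformed expression");
        return stack.pop();
    }

    private static int apply(String op, int left, int right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default:  return left / right;
        }
    }
}
```

Building an expression tree instead of a value follows the same pattern: push leaf nodes for operands, and for an operator pop k subtrees and push a new node that holds them.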

Page 19: Data Structures/Containers Overviews Standard Containers plus properties.

Queues: FIFO

• Methods

– void enqueue(Object o) adds object

– Object dequeue() returns oldest object

– boolean isEmpty()

– void makeEmpty()

• If no O notation, assume O(1) (time and memory)

• Uses

– model transactions

– model requests

• Implementable as doubly linked list easily

• As an array, it is a little tricky

Page 20: Data Structures/Containers Overviews Standard Containers plus properties.

Queues as Array

• Assume that we have a large enough array

• Otherwise we can use dynamic arrays

• Idea: wrap around

– front points to first entry stored (initialize to 0)

• dequeue: remove and increment front (mod array size)

– back points to last entry stored (initialize to -1)

• enqueue: increment (mod array size) and insert

– front and back alone cannot tell an empty queue from a full one (in both cases back sits one position behind front, mod array size), so..

– Keep a count of number of elements stored.
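A sketch of the wrap-around idea with an explicit element count. The class name ArrayQueue, the fixed capacity, and the exceptions thrown are assumptions made for the example. Both indices advance modulo the array size.

```java
import java.util.NoSuchElementException;

// Fixed-capacity circular queue; all operations are O(1) except makeEmpty.
class ArrayQueue {
    private final Object[] items;
    private int front = 0;   // index of the oldest entry
    private int back = -1;   // index of the newest entry
    private int count = 0;   // distinguishes empty from full

    ArrayQueue(int capacity) {
        items = new Object[capacity];
    }

    void enqueue(Object o) {
        if (count == items.length) throw new IllegalStateException("queue is full");
        back = (back + 1) % items.length;  // wrap around
        items[back] = o;
        count++;
    }

    Object dequeue() {
        if (count == 0) throw new NoSuchElementException("queue is empty");
        Object o = items[front];
        items[front] = null;                 // drop the reference
        front = (front + 1) % items.length;  // wrap around
        count--;
        return o;
    }

    boolean isEmpty() {
        return count == 0;
    }

    void makeEmpty() {
        java.util.Arrays.fill(items, null);
        front = 0;
        back = -1;
        count = 0;
    }
}
```

With a dynamic array underneath, the full case would copy the entries, oldest first, into a larger array instead of throwing.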

Page 21: Data Structures/Containers Overviews Standard Containers plus properties.

Queue Applications

• Simulations:

– whenever multiple lines of customers and servers, e.g. at a bank, grocery store etc.

• Search

– breadth first search (later in course)

• Topological Sorting (later in course)

• File-Servers or printers in a network

– Policy: first-come, first-served

– Other Policies: (Priority queues)

• smallest job first

• most important job first

• ….

Page 22: Data Structures/Containers Overviews Standard Containers plus properties.

Basic Trees

• Main Methods

– boolean isEmpty()

– void makeEmpty()

– insert(Object o)/delete(Object o) O(log n) if balanced

– boolean find(Object o) O(log n) if balanced

• Uses

– sorted record keeping, reporting and updating

– dictionary, telephone directory,...

– internal representation for programs in compilation

– Language PROLOG based on trees

– Everything can be done with trees

– Game trees
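A minimal sketch of the tree methods listed above, as an unbalanced binary search tree over String keys. The class name SimpleBst and the String key type are assumptions made for the example. It does no rebalancing, so the O(log n) bounds hold only when insertions happen to keep the tree balanced.

```java
import java.util.ArrayList;
import java.util.List;

class SimpleBst {
    private static final class Node {
        final String key;
        Node left, right;

        Node(String key) { this.key = key; }
    }

    private Node root;

    boolean isEmpty() { return root == null; }

    void makeEmpty() { root = null; }

    void insert(String key) { root = insert(root, key); }

    private Node insert(Node node, String key) {
        if (node == null) return new Node(key);
        int cmp = key.compareTo(node.key);
        if (cmp < 0) node.left = insert(node.left, key);
        else if (cmp > 0) node.right = insert(node.right, key);
        return node;  // duplicates are ignored
    }

    boolean find(String key) {
        Node node = root;
        while (node != null) {
            int cmp = key.compareTo(node.key);
            if (cmp == 0) return true;
            node = cmp < 0 ? node.left : node.right;
        }
        return false;
    }

    // In-order traversal yields the keys in sorted order, e.g. for a directory listing.
    List<String> inOrder() {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private void collect(Node node, List<String> out) {
        if (node == null) return;
        collect(node.left, out);
        out.add(node.key);
        collect(node.right, out);
    }
}
```

In practice java.util.TreeMap or TreeSet provides a balanced tree with the same sorted-traversal property.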

Page 23: Data Structures/Containers Overviews Standard Containers plus properties.

Applications

• Sorting e.g. heapsort and treesort

• Expression Tree: evaluation of expression

• Parse Tree: Compiler has 3 main steps

– Parse into tokens

– Organize tokens into a Parse Tree

– Generate Code

• Decision Tree

– each internal node is a query

– leaf nodes are conclusions

– e.g. medicine, botany, etc

– can be built automatically from data

Page 24: Data Structures/Containers Overviews Standard Containers plus properties.

Hash Tables

• Main Methods

– void insert(Object key, Object o) O(1) on average

– void remove(Object key, Object o) O(1) on average

– Object retrieve(Object key) O(1) on average

• Amazing!

• Uses

– Whenever add/delete, but don’t care if sorted

– dictionary but not employee records

• problem with weekly/monthly reports

– Symbol tables in compilers

• what does a variable/function name refer to
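As an illustration of the symbol-table use, here is a small sketch on top of java.util.HashMap. The class name SymbolTable, the String value type, and the one-argument remove are assumptions made for the example; a real compiler's table would also track scopes.

```java
import java.util.HashMap;
import java.util.Map;

// Maps a variable/function name to what it refers to (here just a description).
class SymbolTable {
    private final Map<String, String> symbols = new HashMap<>();

    // Expected O(1): hash the name and store the entry.
    void insert(String name, String meaning) {
        symbols.put(name, meaning);
    }

    // Expected O(1).
    void remove(String name) {
        symbols.remove(name);
    }

    // Expected O(1); returns null if the name is unknown.
    String retrieve(String name) {
        return symbols.get(name);
    }
}
```

The table gives no cheap way to list the names in order, which is the trade-off behind "don't care if sorted".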

Page 25: Data Structures/Containers Overviews Standard Containers plus properties.

Priority Queues

• Not a Queue

• Main Methods

– void insert(Object o) O(log n)

– Object findMin() O(1)

– void deleteMin() O(log n)

• Uses

– bank with multiple tellers

• events: customers arrive, depart

• process next event (min)

– How many tellers needed to give good service

– With few events, theory (queueing theory) works

– With many events, simulate.
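The "process next event (min)" step of such a simulation might look like the sketch below, using java.util.PriorityQueue ordered by event time. The BankEvent and EventLoop names and the log format are assumptions made for the example. A real simulation would also schedule new events (such as a departure) while handling each one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical simulation event: something that happens at a given time.
class BankEvent implements Comparable<BankEvent> {
    final int time;
    final String kind;  // e.g. "arrive" or "depart"

    BankEvent(int time, String kind) {
        this.time = time;
        this.kind = kind;
    }

    // Earlier events have higher priority.
    @Override
    public int compareTo(BankEvent other) {
        return Integer.compare(time, other.time);
    }
}

class EventLoop {
    // Processes the events in time order, regardless of insertion order.
    static List<String> run(List<BankEvent> events) {
        PriorityQueue<BankEvent> pending = new PriorityQueue<>(events);
        List<String> log = new ArrayList<>();
        while (!pending.isEmpty()) {
            BankEvent next = pending.poll();  // remove the minimum: O(log n)
            log.add(next.time + ":" + next.kind);
        }
        return log;
    }
}
```

This also shows why a priority queue is "not a queue": the order of removal depends on the priority, not on the order of arrival.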

Page 26: Data Structures/Containers Overviews Standard Containers plus properties.

Graphs

• Game Trees are often graphs

– aids checkers and chess

• State-space search (general planning) is a graph

– states (representation of model of world)

– operators: map states into next states

• Path finding is searching through a graph

– 1,000,000 queens problem (solved easily)

– job scheduling

– class/ta/room scheduling

– critical-path analysis

– flow analysis: traffic/water/electric/money/work flow
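To make the states-and-operators view concrete, here is a small sketch of path finding by breadth-first search over a graph of named states. The class name StateGraph, the String state labels, and the method names are assumptions made for the example. Breadth-first search is only one of several possible search strategies.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class StateGraph {
    // For each state, the states reachable by applying one operator.
    private final Map<String, List<String>> next = new HashMap<>();

    void addOperator(String from, String to) {
        next.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Returns the number of steps on a shortest path, or -1 if goal is unreachable.
    int shortestPathLength(String start, String goal) {
        Map<String, Integer> dist = new HashMap<>();
        Queue<String> frontier = new ArrayDeque<>();  // FIFO queue drives the search
        dist.put(start, 0);
        frontier.add(start);
        while (!frontier.isEmpty()) {
            String state = frontier.remove();
            if (state.equals(goal)) return dist.get(state);
            List<String> successors = next.getOrDefault(state, Collections.emptyList());
            for (String succ : successors) {
                if (!dist.containsKey(succ)) {  // visit each state once
                    dist.put(succ, dist.get(state) + 1);
                    frontier.add(succ);
                }
            }
        }
        return -1;
    }
}
```

The sketch combines two earlier containers: a hash table for the successor lists and visited set, and a queue for the frontier.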

Page 27: Data Structures/Containers Overviews Standard Containers plus properties.

Summary

• Data Structures are the foundation of programs

• Wrong choice of data structure degrades program significantly.

• Be Data Structure smart.

• Data structures are the engines underlying programs

– a small part of the code

– But major determining factor for performance