CS216: Program and Data Representation, University of Virginia Computer Science, Spring 2006, David Evans. Lecture 1: Introduction

Feb 02, 2016

Transcript
Page 1: CS216: Program and Data Representation University of Virginia Computer Science

CS216: Program and Data Representation
University of Virginia Computer Science

Spring 2006, David Evans

Lecture 1: Introduction

Page 2: CS216: Program and Data Representation University of Virginia Computer Science

UVa CS216 Spring 2006 - Lecture 1: Analyzing Algorithms

Menu

• Motivating Problem
• Course Structure, Expectations, Goals
• Analyzing Algorithms

Page 3: CS216: Program and Data Representation University of Virginia Computer Science


Phylogeny

from http://www.shef.ac.uk/language/quantling/

Page 4: CS216: Program and Data Representation University of Virginia Computer Science

Language Phylogeny

tree, house
baum, haus
treow, hus
tre, hus
strom, domovni
arbol, casa
albero, casa
arbore, casa
arbre, maison
Latin: arbor; domus, casa

Page 5: CS216: Program and Data Representation University of Virginia Computer Science

Language Phylogeny

English
German
Anglo-Saxon
Norwegian
Czech
Spanish
Italian
Romanian
French

Page 6: CS216: Program and Data Representation University of Virginia Computer Science


Tree of Life

From http://tolweb.org/

Page 7: CS216: Program and Data Representation University of Virginia Computer Science

Finding a Phylogeny

• Speculate on history based on current evidence
  – Not guaranteed to be correct
• Find the “most likely” history
  – Parsimony: find the evolutionary tree that explains the observations with the fewest possible changes

Page 8: CS216: Program and Data Representation University of Virginia Computer Science

Measuring Changes

• Natural Languages
  – Grammatical Rules
  – Lexicon
  – Hard to quantify how similar two languages are
• Species
  – Genomes (only recently)
  – Easy to quantify: genome differences are measurable

Page 9: CS216: Program and Data Representation University of Virginia Computer Science

How Species Evolve

• Point Mutations (Substitution): one base is replaced with another

…CAT… → …CTT… (UV ray)

With only point mutations, it is easy to tell how close two genomes are: just count the differing bases.
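Counting the differing bases can be sketched in a few lines of Python. This is only an illustration: the function name count_differing_bases is made up here, and it assumes the two sequences have the same length (which holds when only point mutations have occurred).

```python
def count_differing_bases(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        # Assumption: only point mutations occurred, so lengths must match.
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


# The slide's example: CAT became CTT through one substitution.
print(count_differing_bases("CAT", "CTT"))  # 1
```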

Page 10: CS216: Program and Data Representation University of Virginia Computer Science

How Species Evolve 2

• Insertions: one or more bases are inserted
  …GCATG… → …GCACATG…
• Deletions: one or more bases are removed
  …GCATCATG… → …GCATG…

Caused by copying errors (enzymes slipping, etc.)

Page 11: CS216: Program and Data Representation University of Virginia Computer Science

Measuring Genome Similarity

• Insertions and deletions make this hard

ACATCATCATCAT
CATCATCATCATC

are more “similar” than

ACATCATCATCAT
 | |   |   |
TCGTTCGCGAAAA
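A quick check of why position-by-position comparison misleads here, as a sketch (the helper name positional_matches is hypothetical): the shifted pair shares no bases in the same position, while the unrelated-looking pair happens to share four.

```python
def positional_matches(a: str, b: str) -> int:
    """Count positions where both sequences hold the same base."""
    return sum(1 for x, y in zip(a, b) if x == y)


# Same repeating pattern, shifted by one base (a single deletion at the front).
shifted = positional_matches("ACATCATCATCAT", "CATCATCATCATC")
# A sequence with no obvious relationship to the first.
unrelated = positional_matches("ACATCATCATCAT", "TCGTTCGCGAAAA")

print(shifted, unrelated)  # 0 4
```

So a naive count ranks the second pair as closer, even though one deletion explains nearly all of the first pair.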

Page 12: CS216: Program and Data Representation University of Virginia Computer Science

Sequence Alignment

• Align sequences by inserting gaps:

ACATCATCATCAT
 ||||||||||||
-CATCATCATCAT

• Find best alignment inserting gaps given:
  – value of matching bases (point mutations) = c
  – cost of a gap (insertion/deletion) = g

We use c = 10, g = 2: goodness = 12 * c – g = 118
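The goodness arithmetic above can be reproduced with a small scoring function. This is a sketch under stated assumptions: the name goodness and the gap symbol "-" are chosen here for illustration, and a column holding two different bases is assumed to score 0, since the slide gives no mismatch penalty.

```python
GAP = "-"  # assumed gap symbol, as drawn in the slide's alignment


def goodness(aligned_u: str, aligned_v: str, c: int, g: int) -> int:
    """Score an already-aligned pair: +c per matching column, -g per gap column."""
    if len(aligned_u) != len(aligned_v):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    for x, y in zip(aligned_u, aligned_v):
        if x == GAP or y == GAP:
            score -= g      # one side is a gap (insertion/deletion)
        elif x == y:
            score += c      # matching bases
        # differing bases: assumed to contribute 0
    return score


# 12 matching columns and 1 gap column: 12 * 10 - 2
print(goodness("ACATCATCATCAT", "-CATCATCATCAT", c=10, g=2))  # 118
```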

Page 13: CS216: Program and Data Representation University of Virginia Computer Science

Brute Force Alignment

To find the best alignment of sequences U and V with match value c and gap penalty g:

if U or V is empty
    U, V is the best alignment
otherwise,
    [ next slide ]

Page 14: CS216: Program and Data Representation University of Virginia Computer Science

Brute Force Alignment: Otherwise…

Try three possibilities:

1. First elements of U and V are aligned:
   score of best alignment of U[1:] and V[1:], + c if U[0] == V[0]
2. First element of U is aligned with a gap in V:
   score of best alignment of U[1:] and V, - g
3. First element of V is aligned with a gap in U:
   score of best alignment of U and V[1:], - g

Pick the choice with the highest score

U[1:] means U with the first element removed
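The three possibilities translate almost directly into a recursive function. The sketch below computes only the best score, not the alignment itself; the name best_score is hypothetical, and it assumes (following the slide's base case) that leftover bases score 0 once either sequence is empty. It is exponentially slow, so only try it on short inputs.

```python
def best_score(u: str, v: str, c: int, g: int) -> int:
    """Best alignment score of u and v by trying all three possibilities."""
    if not u or not v:
        return 0  # assumed: nothing left to align contributes no score

    # 1. Align the first elements; earn c only if they match.
    aligned = best_score(u[1:], v[1:], c, g)
    if u[0] == v[0]:
        aligned += c

    # 2. First element of u is aligned with a gap in v.
    gap_in_v = best_score(u[1:], v, c, g) - g

    # 3. First element of v is aligned with a gap in u.
    gap_in_u = best_score(u, v[1:], c, g) - g

    return max(aligned, gap_in_v, gap_in_u)


# One gap, then three matches: 3 * 10 - 2
print(best_score("ACAT", "CAT", c=10, g=2))  # 28
```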

Page 15: CS216: Program and Data Representation University of Virginia Computer Science


Is this a “good” solution?

Page 16: CS216: Program and Data Representation University of Virginia Computer Science


Course Structure, Expectations, Goals

Page 17: CS216: Program and Data Representation University of Virginia Computer Science

Staff

• Me: David Evans (Call me “Dave” or “Coach”)
  – Office Hours posted on course website
  – Always available by email; if I don’t reply in 24 hours, send again and complain
• Assistant Coaches: Erika Chin, David Faulkner, Erin Golub, Sam Guarnieri, Katherine Jogerst, and Pitchaya (“Yam”) Sitthi-Amorn
  – Will lead Monday and Tuesday sections
  – Available in Small Hall lab at posted times (only)

Page 18: CS216: Program and Data Representation University of Virginia Computer Science

Meetings

• Lectures: 2 per week
  – Will include material not in the book
  – Most lectures will use slides and notes
• Section meetings: 1 per week
  – You must sign up for one of the sections
  – Classroom work, group exercises, review, quizzes, …
• Staffed time in Small Hall
  – Take advantage of help from the ACs and your classmates

Page 19: CS216: Program and Data Representation University of Virginia Computer Science

Problem Sets

• 8 total, 1-2½ weeks each
• Work on them when and where you want (but take advantage of staffed lab time in Small Hall)
• Usually will work with partners
• Mix of programming and analysis
• Main way most will learn
• Turn in on paper at beginning of class (first is due Wednesday)

Page 20: CS216: Program and Data Representation University of Virginia Computer Science

My Teaching Philosophy: Drinking from a Firehose

It may hurt a little bit, and a lot of water will go by you, but you won’t go away thirsty!

Don’t be overwhelmed! You will do fine.

Page 21: CS216: Program and Data Representation University of Virginia Computer Science

Expectations: Programming Background

• You understand basic programming:
  – Can write a program longer than a screenful
  – Can understand multi-file programs
  – Familiar with common control structures, procedures, recursive definitions
• You don’t freak out when you are expected to learn a new language on your own

Page 22: CS216: Program and Data Representation University of Virginia Computer Science

Expectations: Math and Logic Background

• You remember some things from CS202 (or will learn/re-learn them when you need them):
  – Arithmetic, logarithms, sets, graphs
  – Symbolic logic, implication
  – Proof techniques (induction, contradiction)
• The textbook is quite mathematical – you may need to read things more than once

Page 23: CS216: Program and Data Representation University of Virginia Computer Science

Expectations: Honor

• Everyone will be expected to follow what is on the course pledge
• Exact content will be determined before next class based on your survey responses
  – I hope to place a large burden on the honor system, but won’t if students think this puts honest students at a disadvantage

Page 24: CS216: Program and Data Representation University of Virginia Computer Science


Course Goals

Page 25: CS216: Program and Data Representation University of Virginia Computer Science

Course Goal 1

Learn to write delightful programs.

correct, readable, elegant, economical, efficient, scalable, maintainable, secure, dependable

Page 26: CS216: Program and Data Representation University of Virginia Computer Science

Course Goal 2

Be able to predict how decisions about data representation will impact properties of an implementation.

running time, memory use, ease of implementation, scalability, …

Page 27: CS216: Program and Data Representation University of Virginia Computer Science


Course Goal 3

Understand how a program executes at levels of abstraction ranging from a high-level programming language to machine memory.

We will talk about what this means in Monday’s class.

Page 28: CS216: Program and Data Representation University of Virginia Computer Science

if U or V is empty
    U, V is the best alignment
otherwise,
    Try three possibilities:
    1. First elements of U and V are aligned:
       score of best alignment of U[1:] and V[1:], + c if U[0] == V[0]
    2. First element of U is aligned with a gap in V:
       score of best alignment of U[1:] and V, - g
    3. First element of V is aligned with a gap in U:
       score of best alignment of U and V[1:], - g
    Pick the choice with the highest score

Is this a “good” solution?

Page 29: CS216: Program and Data Representation University of Virginia Computer Science

Algorithm Properties

• Implementable – can be readily expressed as a program
• Termination – always finishes
• Correctness – always gives the correct answer
• Efficient – uses resources wisely

Note: Chapter 2 of text has a similar list but separates “Implementable” into Effectiveness and Program Complexity

Page 30: CS216: Program and Data Representation University of Virginia Computer Science

Is it Implementable?

def bestAlignment (U, V, c, g):
    if len(U) == 0 or len(V) == 0:
        return U, V
    else:
        (U0, V0) = bestAlignment (U[1:], V[1:], c, g)
        scoreNoGap = goodnessScore (U0, V0, c, g)
        if U[0] == V[0]:
            scoreNoGap += c
        # try inserting a gap in U (no match for V[0])
        (U1, V1) = bestAlignment (U, V[1:], c, g)
        scoreGapU = goodnessScore (U1, V1, c, g) - g
        # try inserting a gap in V (no match for U[0])
        (U2, V2) = bestAlignment (U[1:], V, c, g)
        scoreGapV = goodnessScore (U2, V2, c, g) - g
        …

Page 31: CS216: Program and Data Representation University of Virginia Computer Science

Is it Implementable?

# goodnessScore and GAP are not defined on the slide; they are assumed to be provided elsewhere
def bestAlignment (U, V, c, g):
    if len(U) == 0 or len(V) == 0:
        return U, V
    else:
        (U0, V0) = bestAlignment (U[1:], V[1:], c, g)
        scoreNoGap = goodnessScore (U0, V0, c, g)
        if U[0] == V[0]:
            scoreNoGap += c
        # try inserting a gap in U (no match for V[0])
        (U1, V1) = bestAlignment (U, V[1:], c, g)
        scoreGapU = goodnessScore (U1, V1, c, g) - g
        # try inserting a gap in V (no match for U[0])
        (U2, V2) = bestAlignment (U[1:], V, c, g)
        scoreGapV = goodnessScore (U2, V2, c, g) - g

        # pick the choice with the highest score
        if scoreNoGap >= scoreGapU and scoreNoGap >= scoreGapV:
            return U[0] + U0, V[0] + V0
        elif scoreGapU >= scoreGapV:
            return GAP + U1, V[0] + V1
        else:
            return U[0] + U2, GAP + V2

Page 32: CS216: Program and Data Representation University of Virginia Computer Science

Algorithm Properties

✓ Implementable – can be readily expressed as a program
• Termination – always finishes
• Correctness – always gives the correct answer
• Efficient – uses resources wisely

Page 33: CS216: Program and Data Representation University of Virginia Computer Science

Termination?

if U or V is empty
    U, V is the best alignment
otherwise,
    Try three possibilities:
    1. First elements of U and V are aligned:
       score of best alignment of U[1:] and V[1:], + c if U[0] == V[0]
    2. First element of U is aligned with a gap in V:
       score of best alignment of U[1:] and V, - g
    3. First element of V is aligned with a gap in U:
       score of best alignment of U and V[1:], - g
    Pick the choice with the highest score

Page 34: CS216: Program and Data Representation University of Virginia Computer Science

if U or V is empty
    U, V is the best alignment
otherwise,
    Try three possibilities:
    1. First elements of U and V are aligned:
       score of best alignment of U[1:] and V[1:], + c if U[0] == V[0]
    2. First element of U is aligned with a gap in V:
       score of best alignment of U[1:] and V, - g
    3. First element of V is aligned with a gap in U:
       score of best alignment of U and V[1:], - g
    Pick the choice with the highest score

Every attempt, at least one element is removed (and none added). Initial length is finite, so must terminate.
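The termination argument can be checked empirically with a sketch that follows the same three-way recursion and records how deep it goes. The function name max_depth is hypothetical. Because every call removes at least one element, the depth should never exceed len(U) + len(V).

```python
def max_depth(u: str, v: str, depth: int = 0) -> int:
    """Deepest level reached by the three-way recursion on u and v."""
    if not u or not v:
        return depth  # base case: recursion stops here
    return max(
        max_depth(u[1:], v[1:], depth + 1),  # removes one element from each
        max_depth(u[1:], v, depth + 1),      # removes one element from u
        max_depth(u, v[1:], depth + 1),      # removes one element from v
    )


u, v = "ACAT", "CAT"
# Each level shrinks the combined length, so depth is bounded by it.
assert max_depth(u, v) <= len(u) + len(v)
print(max_depth(u, v))  # 6
```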

Page 35: CS216: Program and Data Representation University of Virginia Computer Science

Algorithm Properties

✓ Implementable – can be readily expressed as a program
✓ Termination – always finishes
• Correctness – always gives the correct answer
  – Very informally: it tries all possibilities and picks the best one
• Efficient – uses resources wisely

Page 36: CS216: Program and Data Representation University of Virginia Computer Science

Efficiency?

• What resources do we care about?
  – Programmer Time
  – Running Time
  – Space Use

Page 37: CS216: Program and Data Representation University of Virginia Computer Science

Measuring Resource Use

• Space
  – Fundamental unit: bit
• Running Time
  – No fundamental unit
    • Number of steps?
    • How much can you do in one step?
    • How long does a step take?
• How does it scale with the size of the input?

Answering for this algorithm is a PS1 question
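One way to start exploring the scaling question is to count recursive calls instead of timing anything. The sketch below is not the PS1 answer: count_calls is a hypothetical helper that treats one call as one "step" and depends only on the input lengths, since the brute-force recursion explores all three branches regardless of the bases.

```python
def count_calls(m: int, n: int) -> int:
    """Number of calls the three-way recursion makes for lengths m and n."""
    if m == 0 or n == 0:
        return 1  # the base case is a single call
    return (
        1                            # this call
        + count_calls(m - 1, n - 1)  # first elements aligned
        + count_calls(m - 1, n)      # gap in V
        + count_calls(m, n - 1)      # gap in U
    )


# Tabulate calls for two sequences of equal length n.
for n in range(1, 9):
    print(n, count_calls(n, n))
```

Comparing successive rows of the table gives a feel for how quickly the work grows as the input gets longer.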

Page 38: CS216: Program and Data Representation University of Virginia Computer Science

Charge

• Registration Survey
  – Linked from course web site
  – Submit by Friday 5pm
• Text: Read chapters 1-3
• PS1: Out now, due in 1 week
  – Start now – the section time is not for doing PSs
• Monday: Levels of Abstraction, Order Notation