CS 111 Green: Program Design I Lecture 14: BLAST, methods, encodings & text, more text files Robert H. Sloan (CS) & Rachel Poretsky(Bio) University of Illinois, Chicago October 13, 2017


Sep 21, 2020


CS 111 Green: Program Design I, Lecture 14:

BLAST, methods, encodings & text, more text files

Robert H. Sloan (CS) & Rachel Poretsky (Bio), University of Illinois, Chicago

October 13, 2017


MORE FUNCTIONS, MAINLY BUILT-IN CLASS METHODS


COLLABORATION POLICY (AGAIN)


From the syllabus

• Consulting with your classmates on assignments is encouraged, except where noted. However, submissions are individual, and copying code from your classmates is considered plagiarism.

• To avoid suspicion of plagiarism, you must specify your sources together with all submitted materials. List the classmates you discussed your assignment with and the webpages from which you got inspiration or copied (short) code snippets. All students are expected to understand and be able to explain their submitted materials. For example, given the question "How did you do X?", a great response would be "I used function Y, with W as the second argument. I tried Z first, but it doesn't work." An inappropriate response would be "Here is my code, look for yourself."


Meaning (portions from Brown CS 17, Fall 2016)

• You are encouraged to discuss lab/homework assignments with other students in the class. You may even work out solutions together. However, you are not allowed to take away any written notes, diagrams, or code from joint work sessions. Emails, IM conversations, and the like all constitute "notes".
  ◦ Must include comments stating whom you worked with in your turned-in code.

• We expect you to fully comprehend everything you hand in. To that end, you must write up your solutions entirely on your own, and you must debug your code entirely on your own. Learning to independently implement and debug solutions, possibly developed with your classmates, is a key CS 111 goal.

• Important: after participating in a joint work session, you must pause before writing up your solutions; a pause long enough to grab a cup of coffee with a friend is sufficient.


Enforcement

• Violations will both
  ◦ be reported to the Dean of Students as academic misconduct (cheating and plagiarism)
  ◦ receive grade penalties for a first violation, and anything from failure in the course to suspension/expulsion from UIC for a second violation


How do I know which functions exist? Python documentation
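Besides the online Python documentation, the interpreter itself can list what is available. The following is a minimal sketch using the built-in dir() function and docstrings; the choice of str and upper is only for illustration.

```python
# dir() lists the names (including methods) attached to an object or type.
# Names starting with an underscore are filtered out to keep the list short.
string_methods = [name for name in dir(str) if not name.startswith("_")]
print(string_methods[:5])

# Each function or method carries a short description in its docstring.
# (At the interactive prompt, help(str.upper) shows the same information.)
print(str.upper.__doc__)
```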


Additional built-in functions from modules

• Functions useful for certain kinds of things (e.g., math, internet, making graphs, random numbers) are available in modules that must be imported before they can be used

• We will discuss a few soon, as needed
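As a small illustration of importing before use, here is a sketch with two standard-library modules, math and random. The particular functions called are chosen for the example and are not necessarily the ones the course will cover.

```python
import math
import random

# Functions inside a module are reached with dot notation: module.function()
root = math.sqrt(16)          # square root, from the math module
roll = random.randint(1, 6)   # random integer between 1 and 6, inclusive

print(root)
print(roll)
```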


BUILT-IN functions for strings

• Strings and lists are examples of "built-in classes", and each comes with some built-in functions (these class functions are also called methods).

• Same as other built-in functions, except that the calling syntax is .fn_name
  ◦ st = "gtgcgagggtcg"
  ◦ st.upper() → "GTGCGAGGGTCG"
  ◦ st.find("a") → 5
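The two calls on this slide can be tried directly. The sketch below also shows two details the slide does not spell out: find() returns -1 when nothing matches, and string methods return new values rather than changing the original string.

```python
st = "gtgcgagggtcg"

upper_st = st.upper()      # new string with every letter uppercased
first_a = st.find("a")     # index of the first "a", counting from 0
missing = st.find("x")     # find() returns -1 when the substring is absent

print(upper_st)
print(first_a)
print(missing)
print(st)                  # the original string is unchanged
```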


OBJECTS AND DOT NOTATION


Objects

• (Implicit in Chapters 2.1 Objects and variables, 3.2 List Basics, 7.3 String Methods, 8.2 List Methods, but not explicit anywhere we'll assign, so pay attention!)

• Everything in Python is an object
• An object combines
  ◦ data (e.g., number, string, list) with
  ◦ methods that can act on that object


Methods

• Methods: like (a special case of) functions, but not globally accessible

• Cannot call a method just by giving its name, the way we call print(), open(), abs(), type(), range(), etc.

• Method: function that can be accessed only through an object
  ◦ Using dot notation


Dot notation

• To call a method, use dot notation:
  ◦ object_name.method()

• String example:

>>> test = "This is my test string"
>>> test.upper()
'THIS IS MY TEST STRING'


If o is an object of a type having method do_it, where do_it needs an input in addition to o, and x is defined, what is the proper way to call do_it?

A. do_it(x)
B. do_it(o, x)
C. o.do_it(x)
D. o.do_it(o, x)


Recall that str converts to type string

>>> x = 42
>>> x == "42"
False
>>> str(x) == "42"
True

• Is str() a method of strings?

A. No, it's not a method
B. No, it's a method of something else
C. Yes
D. I have no clue


Methods depend on type of object

• answers = [52, 17, 43]
• answers.append(42)
• answers is now [52, 17, 43, 42]
• "test string".append("s") gives back an error because append is not a method of strings
• ["cat", "dog"].count("cat") → 1
• "catactcat".count("cat") → 2
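The examples on this slide can be run as a short script. The try/except wrapper is an addition so that the failing call does not stop the program; the error raised for a missing method is an AttributeError.

```python
answers = [52, 17, 43]
answers.append(42)            # append is a list method; it changes the list in place
print(answers)

# Strings have no append method, so this raises AttributeError.
try:
    "test string".append("s")
except AttributeError as err:
    error_message = str(err)
    print("AttributeError:", error_message)

# count exists for both types, but counts different things.
list_count = ["cat", "dog"].count("cat")   # matching list elements
str_count = "catactcat".count("cat")       # non-overlapping substring matches
print(list_count, str_count)
```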


Methods' importance

• Understanding key data types depends on understanding their methods

• We have already seen the append method for lists, and you will probably want to use it in the current lab and in the project

• Will come back to more list methods
• File reference methods: write(), read(), readline(), readlines()
  ◦ But open is not a method
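To see the four file reference methods side by side, here is a minimal sketch. The file name demo.txt and its contents are made up for the example, and the file is written to a temporary directory so that nothing existing is touched.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, created just for this demo.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

out = open(path, "w")          # open is a built-in function, not a method
out.write("first line\n")      # write is a method of the file reference
out.write("second line\n")
out.close()

fileref = open(path, "r")
first = fileref.readline()     # one line, including its "\n"
rest = fileref.readlines()     # list of all remaining lines
fileref.close()

fileref = open(path, "r")
everything = fileref.read()    # the whole file as one string
fileref.close()

print(repr(first))
print(rest)
print(repr(everything))
```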


When you get to CS 341 & 342

• Or if you know Java or C++ now
• Methods are an Object Oriented (OO) concept
• In our CS 111
  ◦ We do need to know the basics of dot notation and methods
  ◦ We will otherwise be ignoring OO, and taking primarily a procedural approach (built on functions, also called functional decomposition)


ENCODINGS AGAIN


Encodings again

• Recall that the smallest unit in a computer is the bit
• One bit can take on 2 possible values: 0 or 1
• Two bits can take on 4 possible values: 00, 01, 10, or 11
• Three bits can take on 8 possible values: 000, 001, 010, 011, 100, 101, 110, 111


How many distinct values can 4 bits take on?

A. 4
B. 8
C. 9
D. 13
E. 16


Ben Bitdiddle says

• n bits can take on 2 times as many values as n-1 bits, i.e., 2^n values
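The doubling rule can be checked by brute force for small n. This sketch uses itertools.product from the standard library to list every bit pattern; the variable names are chosen for the example.

```python
from itertools import product

# Enumerate every bit pattern of length n for n = 1, 2, 3.
counts = {}
for n in range(1, 4):
    patterns = ["".join(bits) for bits in product("01", repeat=n)]
    counts[n] = len(patterns)
    print(n, patterns)

# Each extra bit doubles the count, so n bits give 2**n values.
for n in counts:
    assert counts[n] == 2 ** n
```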


How many distinct values can a byte take on?

• (Recall that a byte = 8 bits)

A. 2
B. 8
C. 64
D. 128
E. 256


Encoding characters in bytes: 1960s

• ASCII: Use 1 byte to encode 95 printing characters

• The ones on every computer keyboard to this day

• Pretty much all encodings agree with ASCII on those 95 characters

• ASCII also has some nonprinting characters like newline and tab
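Python's built-in ord() and chr() functions convert between a character and its numeric code, which makes the ASCII table easy to explore. The sketch below relies on the fact that the 95 printing characters occupy codes 32 through 126.

```python
# ord() gives a character's numeric code; chr() goes the other way.
code_for_A = ord("A")
char_for_97 = chr(97)

# The ASCII printing characters occupy codes 32 (space) through 126 (~).
printing = [chr(code) for code in range(32, 127)]

# Newline and tab are nonprinting ASCII characters with small codes.
newline_code = ord("\n")
tab_code = ord("\t")

print(code_for_A, char_for_97)
print(len(printing))
print("".join(printing))
print(newline_code, tab_code)
```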


Communicating common non-printing characters

• \n is used to denote the newline in a string literal
• \t is used to denote the tab in a string literal
• And so a double backslash is used to denote a backslash in a string literal.
• What is len("\\")?

A. 0
B. 1
C. 2
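One way to explore escape sequences is to compare what a string looks like when printed with what len() and repr() report. The example strings here are made up for the demonstration.

```python
two_lines = "gene\nprotein"   # \n is a single newline character
columns = "name\tlength"      # \t is a single tab character

print(two_lines)
print(columns)

# len() counts characters in the string, not keystrokes in the literal.
print(len("\n"))
print(len("\t"))

# repr() shows the string the way it would be typed as a literal.
print(repr(two_lines))
```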


But

• What about René Antoine Ferchault de Réaumur?
• А что насчет Aрабского? (Russian: "And what about Arabic?")


Encoding more characters

• Unicode: over 128,000 characters covering 135 modern and historical scripts, as well as symbols

• Python uses Unicode
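Because Python strings are Unicode, accented and non-Latin characters work with the same tools as ASCII ones. The sketch below writes the special characters as \u escapes so the code is unambiguous. UTF-8 is used as one common byte encoding; the slides do not name a particular one, so treat that choice as an assumption.

```python
name = "R\u00e9aumur"          # "Réaumur", with the accented e written as an escape

# Every character has a Unicode code point, including accented letters.
e_acute_code = ord("\u00e9")   # Latin small letter e with acute accent
cyrillic_code = ord("\u0410")  # Cyrillic capital A, not the Latin letter A

# len() counts characters; encoding to UTF-8 may use more than one byte each.
char_count = len(name)
utf8_bytes = name.encode("utf-8")

print(name)
print(e_acute_code, cyrillic_code)
print(char_count, len(utf8_bytes))
```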


FILES: REVIEW & A BIT MORE DETAIL


(Text) File reading, a little more slowly

• Recall text file = sequence of lines
• Line = sequence of characters up to and including the special newline character \n
  ◦ (Special case: probably the last set of characters at the end of the file will work okay even if the text file doesn't end with a newline as it should.)
  ◦ (How could we find out?)
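One way to find out is to run an experiment: write a file whose last line lacks a newline, read it back, and look at each line with repr(). This is a minimal sketch; the file name no_newline.txt and its contents are made up for the test.

```python
import os
import tempfile

# Hypothetical file name; written to a temporary directory for the experiment.
path = os.path.join(tempfile.mkdtemp(), "no_newline.txt")

out = open(path, "w")
out.write("first\nlast")       # deliberately no "\n" after the last line
out.close()

fileref = open(path, "r")
lines = fileref.readlines()
fileref.close()

# repr() makes the newline characters visible.
for line in lines:
    print(repr(line), line.endswith("\n"))
```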


Speaking of text

• afile.txt:

1234
Can I have a little more?
5678910
I love you!
ABCD
Can I bring my friend to tea?

fileref = open("afile.txt", "r")
line = fileref.readline()

What is len(line)?

A. 0
B. 1
C. 4
D. 5
E. 6


Speaking of files and programming with them

• Your execution environment (i.e., the console, the lower right panel of Spyder) needs to be working in the directory that holds the file you want to open

• The working directory button is in the upper right corner


Can iterate over text file reference (not in book)

fileref = open("afile.txt", "r")

for line in fileref:
    # process each line as we wish in this block
    ...

# rest of program

fileref.close()

• Typically the easiest way to read a text file, all other things being equal
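A runnable version of the loop pattern might look like the following. It is only a sketch: the file is a shortened stand-in for afile.txt created in a temporary directory, and stripping the newline is just one possible way to process each line.

```python
import os
import tempfile

# Create a small stand-in for afile.txt in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "afile.txt")
out = open(path, "w")
out.write("1234\nCan I have a little more?\n5678910\nI love you!\n")
out.close()

fileref = open(path, "r")

stripped_lines = []
for line in fileref:                            # one line per pass through the loop
    stripped_lines.append(line.rstrip("\n"))    # drop the trailing newline

fileref.close()

print(len(stripped_lines))
print(stripped_lines)
```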


Strategies for read_seq(fname, n)

• Can't just use the method from the previous slide and process each line in the same way

• Because we need to process the ">" comment lines differently from the following content lines

• Could use:
  ◦ Previous slide method with a nested conditional
  ◦ for loop over the number n
  ◦ readlines() and then deal with the list it returns
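The first strategy (the file loop with a nested conditional) could be sketched as follows. This is not the course's solution. The slides do not say what n means, so the sketch assumes that the file is FASTA-style (each record is a ">" comment line followed by content lines) and that read_seq(fname, n) should return the n-th sequence, counting from 1. The test file demo.fasta is also made up.

```python
import os
import tempfile


def read_seq(fname, n):
    """Return the n-th sequence (counting from 1) in a FASTA-style file.

    ASSUMPTION: the slides do not say what n means; this reading is a guess.
    """
    fileref = open(fname, "r")
    seq_number = 0        # how many ">" lines have been seen so far
    sequence = ""
    for line in fileref:
        if line.startswith(">"):
            seq_number = seq_number + 1      # comment line: start of a record
        else:
            if seq_number == n:              # content line of the wanted record
                sequence = sequence + line.strip()
    fileref.close()
    return sequence


# Made-up two-record file for trying the function out.
path = os.path.join(tempfile.mkdtemp(), "demo.fasta")
out = open(path, "w")
out.write(">seq one\ngtgc\ngagg\n>seq two\nttaa\n")
out.close()

print(read_seq(path, 1))
print(read_seq(path, 2))
```

Under these assumptions the function returns an empty string when the file has fewer than n records; a different reading of n would call for a different loop.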