Introduction to Computational Thinking Vicky Chen.

Post on 26-Dec-2015

Fundamental Theorem of Informatics

Friedman CP. J Am Med Inform Assoc 2009;16:169-170

What Informatics Is Not

Friedman CP. J Am Med Inform Assoc 2009;16:169-170

Computational Thinking

Computational thinking is a way of solving problems, designing systems, and understanding human behavior that draws on concepts fundamental to computer science. To flourish in today's world, computational thinking has to be a fundamental part of the way people think and understand the world.

http://www.cs.cmu.edu/~CompThink/

Computational Thinking

• Analyzing and logically organizing data
• Data modeling, data abstractions, and simulations
• Formulating problems so computers may assist
• Identifying, testing, and implementing possible solutions
• Automating solutions via algorithmic thinking
• Generalizing and applying this process to other problems

Algorithm

• A finite list of instructions that describe all required steps to perform a computation, written in general language
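As an illustration that is not taken from the slides, the task "find the largest number in a list" can be written first as steps in general language and then as Python. The function name `largest` and the numbers are made up for this sketch.

```python
# In general language:
#   1. Take the first number as the largest seen so far.
#   2. Compare each remaining number with it.
#   3. If a number is larger, it becomes the largest seen so far.
#   4. After the last number, report the largest seen so far.

def largest(numbers):
    """Return the largest value in a non-empty list."""
    best = numbers[0]
    for value in numbers[1:]:
        if value > best:
            best = value
    return best

print(largest([3, 17, 8, 5]))  # prints 17
```

The list of steps is finite and covers every case for a non-empty list, which is what makes it an algorithm rather than a loose description.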

Programming Steps

• Specification
  – What the code should do
• Design
  – Pseudocode
• Implement
  – Programming
• Test
  – Debugging

Data Type / Data Structure

• Integer
• Floating point
• Boolean
• Character
• String

• List
• Dictionary
• Hash Table

Data Types
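A minimal sketch of the basic data types, assuming Python as the language (the transcript does not name one). All variable names and values except the sample count are hypothetical.

```python
sample_count = 779        # integer
mutation_rate = 0.035     # floating point (hypothetical value)
is_mutated = True         # Boolean
base = "A"                # character (in Python, a string of length 1)
gene_symbol = "TP53"      # string

# Print the type name Python assigns to each value.
for value in (sample_count, mutation_rate, is_mutated, base, gene_symbol):
    print(type(value).__name__, value)
```

Python has no separate character type, so a single character is simply a one-letter string.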

List
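A short sketch of common list operations in Python, using hypothetical sample IDs. A list keeps its elements in order and is indexed by position, starting at 0.

```python
samples = ["sample_A", "sample_B", "sample_C"]  # hypothetical sample IDs

print(samples[0])    # first element: sample_A
print(samples[-1])   # last element: sample_C

samples.append("sample_D")  # add an element to the end
print(len(samples))         # number of elements: 4
print(samples[1:3])         # slice of positions 1 and 2: ['sample_B', 'sample_C']
```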

Dictionary / Hash Table
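A short sketch of dictionary operations in Python, where the built-in `dict` is implemented as a hash table. The sample IDs and gene lists are hypothetical.

```python
# Keys are sample IDs, values are lists of mutated genes.
mutated_genes = {"sample_A": ["TP53", "KRAS"], "sample_B": ["EGFR"]}

print(mutated_genes["sample_A"])         # look up a value by key
mutated_genes["sample_C"] = ["TP53"]     # add a new key/value pair
print("sample_B" in mutated_genes)       # membership test on keys: True
print(mutated_genes.get("sample_Z", [])) # default for a missing key: []

# Loop over every key/value pair.
for sample, genes in mutated_genes.items():
    print(sample, genes)
```

Looking up a key takes roughly the same time no matter how many entries the dictionary holds, which is the main reason to prefer it over searching a list.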

Exercise 1

We have a matrix with mutation information for different tumor samples.

How can this data be represented?

List of Lists

• Data is a sparse matrix
• Stores a lot of extra uninformative information
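A toy version of the list-of-lists representation, assuming rows are samples, columns are genes, and 1 marks a mutation. The three samples and four genes are made up; the real matrix is far larger.

```python
genes = ["TP53", "KRAS", "EGFR", "BRAF"]
samples = ["sample_A", "sample_B", "sample_C"]

# One inner list per sample, one entry per gene (1 = mutated, 0 = not mutated).
matrix = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
]

# Look up whether KRAS is mutated in sample_C.
row_index = samples.index("sample_C")
column_index = genes.index("KRAS")
print(matrix[row_index][column_index])  # 1

# Count how many stored cells are zeros.
total_cells = len(samples) * len(genes)
zero_cells = sum(sample_row.count(0) for sample_row in matrix)
print(zero_cells, "of", total_cells, "cells are zeros")  # 8 of 12 cells are zeros
```

Even in this tiny example most of the stored values are zeros, which is the sense in which the representation carries uninformative entries.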

Dictionary
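One possible dictionary representation, sketched under the assumption that only the mutated genes of each sample need to be stored. The sample IDs and genes are hypothetical, and using a set for each value is a choice made here, not something the slides specify.

```python
# Each sample maps to the set of genes mutated in it; zeros are never stored.
mutations = {
    "sample_A": {"TP53"},
    "sample_B": {"EGFR"},
    "sample_C": {"TP53", "KRAS"},
}

print("KRAS" in mutations["sample_C"])  # True
print("KRAS" in mutations["sample_A"])  # False

# Only the informative entries are kept.
stored_entries = sum(len(gene_set) for gene_set in mutations.values())
print(stored_entries)  # 4
```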

Opening Files

• Mutation matrix contains data on 2337 genes and 779 samples

• Inputting data by hand is not feasible
• Data usually read in and processed from files
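A minimal sketch of reading a mutation matrix from a file. The file layout is an assumption (tab-delimited, genes in the header row, one sample per line), and the file name and contents are made up; the example writes its own small file so that it can run anywhere.

```python
import os
import tempfile

# Hypothetical file contents: header of gene names, then one row per sample.
example_text = (
    "sample\tTP53\tKRAS\tEGFR\n"
    "sample_A\t1\t0\t0\n"
    "sample_B\t0\t0\t1\n"
)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "mutation_matrix.txt")
    with open(path, "w") as handle:
        handle.write(example_text)

    mutations = {}
    with open(path) as handle:  # the file is closed automatically
        header = handle.readline().rstrip("\n").split("\t")
        genes = header[1:]      # skip the "sample" column label
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            sample, values = fields[0], fields[1:]
            # Keep only the genes whose entry is "1".
            mutations[sample] = [g for g, v in zip(genes, values) if v == "1"]

print(mutations)  # {'sample_A': ['TP53'], 'sample_B': ['EGFR']}
```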

Input and print
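A small sketch of reading a value from the user and printing a response, assuming Python's built-in `input` and `print`. The function `ask_for_gene` and its `prompt_fn` parameter are hypothetical; the parameter only exists so the example can run without a keyboard.

```python
def ask_for_gene(prompt_fn=input):
    """Ask for a gene symbol and return it trimmed and upper-cased."""
    answer = prompt_fn("Enter a gene symbol: ")
    return answer.strip().upper()

# Interactive use would be:  gene = ask_for_gene()
# Here a stand-in for the keyboard supplies the answer instead.
gene = ask_for_gene(lambda prompt: " tp53 ")
print("You asked about", gene)  # You asked about TP53
```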

For Loops
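A brief sketch of `for` loops in Python, which repeat a block once for each item in a collection. The gene list is hypothetical.

```python
genes = ["TP53", "KRAS", "EGFR"]

# Visit each element in order.
for gene in genes:
    print(gene)

# enumerate() supplies a running position alongside each element.
for position, gene in enumerate(genes, start=1):
    print(position, gene)

# range(1, 6) yields 1, 2, 3, 4, 5.
total = 0
for number in range(1, 6):
    total += number
print(total)  # 15
```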

While Loops
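A brief sketch of a `while` loop, which repeats as long as its condition holds. The example, invented for illustration, reads a made-up DNA sequence three letters at a time and stops at the stop codon TAA.

```python
sequence = "ATGGCGTAAGCT"  # hypothetical sequence
position = 0
codons = []

# Continue while a full three-letter codon is still available.
while position + 3 <= len(sequence):
    codon = sequence[position:position + 3]
    if codon == "TAA":   # stop codon: leave the loop early
        break
    codons.append(codon)
    position += 3

print(codons)  # ['ATG', 'GCG']
```

A `while` loop fits here because the number of repetitions is not known in advance; it depends on where the stop codon appears.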

Conditional Statements

• If, else if, else
• and
• or
• not
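A small sketch combining these keywords in Python, where "else if" is spelled `elif`. The function `describe`, its labels, and the threshold of 100 are invented for illustration.

```python
def describe(mutation_count, is_tumor):
    """Return a rough label for a sample (thresholds are hypothetical)."""
    if mutation_count == 0:
        return "no mutations"
    elif mutation_count > 100 and is_tumor:
        return "heavily mutated tumor"
    elif mutation_count > 100 or not is_tumor:
        return "unusual sample"
    else:
        return "typical tumor"

print(describe(0, True))     # no mutations
print(describe(150, True))   # heavily mutated tumor
print(describe(5, False))    # unusual sample
print(describe(5, True))     # typical tumor
```

The branches are checked from top to bottom and only the first one whose condition is true runs.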

Exercise 2

We have a dictionary that contains tumor sample mutation information.

We want to print out a list of tumor samples after receiving a mutated gene of interest from the user.
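One possible approach to this exercise, sketched with a hypothetical three-sample dictionary. The function name `samples_with_mutation` is made up, and the gene of interest is hard-coded here where an interactive script would read it with `input`.

```python
# Hypothetical data: sample ID -> list of mutated genes.
mutations = {
    "sample_A": ["TP53", "KRAS"],
    "sample_B": ["EGFR"],
    "sample_C": ["TP53"],
}

def samples_with_mutation(mutations, gene):
    """Return the sorted sample IDs in which the given gene is mutated."""
    return sorted(sample for sample, genes in mutations.items() if gene in genes)

# Interactive version:  gene = input("Gene of interest: ").strip().upper()
gene = "TP53"
matches = samples_with_mutation(mutations, gene)

if matches:
    for sample in matches:
        print(sample)
else:
    print("No samples with a mutation in", gene)
```

With the data above this prints sample_A and sample_C.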

Opening Files Revisited

Data Extraction from Files

• Many files will contain extra information
• Focus on extracting only pertinent data
• Applicable to many types of data
  – Natural language documents (e.g. articles)
  – Sequence data (e.g. FASTA files)
  – Files from databases (e.g. NCBI Gene, TCGA)
  – Etc.
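As one illustration of keeping only the pertinent data, the sketch below reads FASTA-formatted text and stores each sequence under its identifier, discarding the rest of the header line. The function name `read_fasta` and the sequences are made up, and `io.StringIO` stands in for an opened file.

```python
import io

# Hypothetical FASTA text: ">" starts a header, sequence lines follow.
fasta_text = (
    ">seq1 hypothetical example\n"
    "ATGGCG\n"
    "TTAA\n"
    ">seq2 another made-up record\n"
    "GGCC\n"
)

def read_fasta(handle):
    """Return a dictionary mapping each sequence ID to its full sequence."""
    sequences = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the ID, drop the description
            sequences[name] = ""
        elif name is not None:
            sequences[name] += line     # join sequence lines split across rows
    return sequences

print(read_fasta(io.StringIO(fasta_text)))
# {'seq1': 'ATGGCGTTAA', 'seq2': 'GGCC'}
```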

Regular Expressions
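A minimal sketch of regular expressions using Python's `re` module. It assumes human Ensembl gene IDs look like "ENSG" followed by 11 digits; the sentence being searched is invented for the example.

```python
import re

text = "Mutations were found in ENSG00000141510 (TP53) and ENSG00000133703 (KRAS)."

# \d{11} matches exactly eleven digits.
ensembl_pattern = re.compile(r"ENSG\d{11}")
print(ensembl_pattern.findall(text))
# ['ENSG00000141510', 'ENSG00000133703']

# Parentheses capture groups; \( and \) match literal brackets.
pair_pattern = re.compile(r"(ENSG\d{11}) \((\w+)\)")
pairs = [(m.group(1), m.group(2)) for m in pair_pattern.finditer(text)]
for ensembl_id, symbol in pairs:
    print(ensembl_id, symbol)
```

A pattern describes the shape of the text to find, so the same two lines work on a sentence or on a file with thousands of IDs.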

Reusing Code

• Some code can be useful in multiple situations
• It is possible to just rewrite (or copy) the code each time
  – Less efficient
  – Multiple locations to fix when debugging

Functions
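A short sketch of a function that is written once and reused on two datasets. The name `count_mutated` and both cohorts are hypothetical.

```python
def count_mutated(genes_by_sample, gene):
    """Count the samples in which the given gene is mutated."""
    count = 0
    for genes in genes_by_sample.values():
        if gene in genes:
            count += 1
    return count

cohort_1 = {"sample_A": ["TP53", "KRAS"], "sample_B": ["EGFR"], "sample_C": ["TP53"]}
cohort_2 = {"sample_X": ["KRAS"], "sample_Y": ["KRAS", "TP53"]}

# The same code serves both cohorts; a bug would need fixing in one place only.
print(count_mutated(cohort_1, "TP53"))  # 2
print(count_mutated(cohort_2, "KRAS"))  # 2
```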

Exercise 3

We have a document containing human gene information downloaded from NCBI.

We want to extract and store the Ensembl ID of each gene with its corresponding gene symbol.
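One possible approach to this exercise, sketched under assumptions about the file: it is taken to be tab-delimited, with the gene symbol in the third column and cross-references such as "Ensembl:ENSG..." in the sixth, which may not match the actual download. The function name `ensembl_to_symbol` is made up, the rows are illustrative rather than copied from NCBI, and the last row is an invented gene with no Ensembl cross-reference.

```python
import io
import re

# Illustrative rows only; io.StringIO stands in for the opened file.
example = (
    "#tax_id\tGeneID\tSymbol\tLocusTag\tSynonyms\tdbXrefs\n"
    "9606\t7157\tTP53\t-\tP53\tMIM:191170|Ensembl:ENSG00000141510\n"
    "9606\t3845\tKRAS\t-\tKRAS2\tMIM:190070|Ensembl:ENSG00000133703\n"
    "9606\t0\tMADEUP1\t-\t-\t-\n"
)

ENSEMBL_XREF = re.compile(r"Ensembl:(ENSG\d+)")

def ensembl_to_symbol(handle):
    """Return a dictionary mapping Ensembl gene ID to gene symbol."""
    mapping = {}
    for line in handle:
        if line.startswith("#"):      # skip the header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:           # skip malformed rows
            continue
        symbol, xrefs = fields[2], fields[5]
        match = ENSEMBL_XREF.search(xrefs)
        if match:                     # genes without an Ensembl ID are left out
            mapping[match.group(1)] = symbol
    return mapping

print(ensembl_to_symbol(io.StringIO(example)))
# {'ENSG00000141510': 'TP53', 'ENSG00000133703': 'KRAS'}
```

This combines file reading, a regular expression, a dictionary, and a function, the topics covered in the preceding slides.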
