Shell Scripting Basics Arun Sethuraman

Shell Scripting Basics Arun Sethuraman. What’s a shell? Command line interpreter for Unix Bourne (sh), Bourne-again (bash), C shell (csh, tcsh), etc Handful.

Dec 23, 2015


Osborn Miller

Shell Scripting Basics

Arun Sethuraman


What’s a shell?

• Command line interpreter for Unix
• Bourne (sh), Bourne-again (bash), C shell (csh, tcsh), etc.
• Handful of commands
• Text mining made easy!


Before we get started

• Unix/Mac users: open a terminal
• Windows users: you should have installed VMware Player and downloaded the virtual machine with Unix pre-loaded on it (if not, do it now!)


VMware Player Basics

• Allows creating/playing virtual machines
• We will use a standalone version of GNU/Linux called SliTaz, which is very minimalist (< 40 MB) but should work for all our exercises.
• Download all example files from my website, www.sites.google.com/site/arunsethuraman1/teaching, instead of from Blackboard.
• Save state of virtual machine, suspend, restart, etc.
• Switch environments using CTRL+ALT
• File sharing is a little complicated, so before you submit your assignment for next week, VMware users please email me and stop by my office with your laptop to submit it (unless you can get Gmail to work without any glitches inside Midori).


Working at the prompt

• The ‘prompt’ refers to Unix’s native command line interface.
• Your prompt should look something like: username@prompt:~$
• Prompt commands are similar to Python scripts: you can specify variables, run one-liner commands, specify entire program flows, etc.
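A minimal sketch of what "variables" and "program flow" look like at the prompt. The variable names (name, greeting, count) are made up for illustration.

```shell
# Assign a variable (no spaces around "=") and use it with "$".
name="world"
greeting="hello, $name"
echo "$greeting"

# A small program flow: loop over three items and count them.
count=0
for item in a b c; do
    count=$((count + 1))
done
echo "$count"
```

Each line can be typed directly at the prompt, or the whole block can be saved in a file and run as a script.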


Unix 101

Try:
• man • ls • pwd • clear • Ctrl+C • echo • ps • cat • tail • head
• cd • mkdir • rm • cp • mv • cal • kill • vi/vim • find • set • who
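A short practice session using a few of the commands listed above. It is only a sketch: it works in a scratch directory created with mktemp, and the file and directory names (practice, notes.txt, copy.txt, renamed.txt) are made up.

```shell
# Work in a scratch directory so nothing of yours is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir practice            # make a new directory
cd practice               # move into it
pwd                       # print where we are

echo "some text" > notes.txt   # create a small file
cp notes.txt copy.txt          # copy it
mv copy.txt renamed.txt        # rename (move) the copy
ls                             # list both files
cat renamed.txt                # show the contents

rm notes.txt                   # remove the original
remaining=$(ls)                # only the renamed copy is left
echo "$remaining"
```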


Piping

• Piping (|) chains commands: the output of one command becomes the input of the next.
• E.g., to read a file and then print only its last line, try:

cat example1.txt | tail -n 1
ls | grep "exam"
cat example4.txt | head

• Important: each piped command works only on the output of the previous command!
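A runnable sketch of the first two pipes. The workshop files are not reproduced here, so the sketch builds a small stand-in example1.txt in a scratch directory; its contents are made up.

```shell
# Work in a scratch directory so nothing in your folder is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the workshop file (contents are hypothetical).
printf 'first line\nsecond line\nlast line\n' > example1.txt

# Read the file, then keep only its last line.
last_line=$(cat example1.txt | tail -n 1)
echo "$last_line"

# List the directory, then keep only names containing "exam".
matches=$(ls | grep "exam")
echo "$matches"
```

Note that the flag is a plain hyphen (-n) and the quotes are plain straight quotes; the "smart" dashes and quotes that slide software inserts will not work in a terminal.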


Regular Expressions

• Describe a pattern (sequence of characters)
• [A-Z]*, [a-z]*
• [0-9]*, [0-9]\{n\}
• Escape (special) characters – start with \
• ^ - start of a line
• $ - end of a line
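A small sketch of these building blocks used with grep. The file sample.txt and its contents are made up for illustration. Keep in mind that * means "zero or more", so a pattern such as [0-9]* on its own also matches lines with no digits at all.

```shell
workdir=$(mktemp -d)
printf 'ABC\nabc 123\n2015\nend.\n' > "$workdir/sample.txt"

# ^ anchors the pattern to the start of the line:
# keep lines that begin with a digit.
starts_with_digit=$(grep '^[0-9]' "$workdir/sample.txt")
echo "$starts_with_digit"

# [0-9]\{4\} matches a run of four digits somewhere on the line.
four_digits=$(grep '[0-9]\{4\}' "$workdir/sample.txt")
echo "$four_digits"

# \. is an escaped (literal) dot; $ anchors it to the end of the line.
ends_with_dot=$(grep '\.$' "$workdir/sample.txt")
echo "$ends_with_dot"
```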


Examples

• Eg. {bicycle, bidirectional, biology, binary, bigotry, bill, big, bin, bionic, …}

• Eg. {Sunday, Monday, …, Saturday}

• Eg. {121, 123, 124, …, 129}
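One possible set of patterns for the three lists above, sketched with grep on a made-up word list (words.txt). These are not the only answers. The third list as written skips 122, so the sketch uses 12[13-9]; use 12[1-9] if 122 was meant to be included.

```shell
workdir=$(mktemp -d)
printf 'bicycle\nrabbit\nbinary\nMonday\ndaylight\nSunday\n121\n122\n129\n1234\n' > "$workdir/words.txt"

# Words that start with "bi".
bi_words=$(grep '^bi' "$workdir/words.txt")
echo "$bi_words"

# Words that end with "day".
day_names=$(grep 'day$' "$workdir/words.txt")
echo "$day_names"

# Exactly "12" followed by 1 or by 3 to 9, and nothing else on the line.
numbers=$(grep '^12[13-9]$' "$workdir/words.txt")
echo "$numbers"
```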


Examples

• TATAAA – TATA box, 25 bases upstream of transcription start site

• Telomeric repeat - (TTAGGG)n
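A sketch of how these two motifs could be written as patterns for grep. The sequences in seqs.txt are made up, and "two or more copies" is an assumption for the n in (TTAGGG)n.

```shell
workdir=$(mktemp -d)
printf 'GGCTATAAAGGC\nCCTTAGGGTTAGGGTTAGGGCC\nACGTACGT\n' > "$workdir/seqs.txt"

# A fixed motif is just a literal pattern.
tata_lines=$(grep 'TATAAA' "$workdir/seqs.txt")
echo "$tata_lines"

# \( \) groups the repeat unit; \{2,\} asks for two or more copies in a row.
telomere_lines=$(grep '\(TTAGGG\)\{2,\}' "$workdir/seqs.txt")
echo "$telomere_lines"
```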


Example 1 – grep

Syntax: grep 'pattern' <filename>

• Create a new directory.
• Copy file “example1.txt” from /usr/home/shellbasics to your folder
• Explore contents of the file using cat/head/tail/vi
• Explore grep - copy first line of the file into another file (use -n flag)
• Copy 14th line/last line/last 4 lines into another file
• Look for the word “Poe” in example1.txt, paste all instances into another file (name it <yourname1.txt>)
• Look for all numbers in the file – what’s wrong?
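One way these steps might look, sketched on a four-line stand-in for example1.txt (the real file is not reproduced here, so the contents and the output file names other than yourname1.txt are made up). The last two commands show one common pitfall when searching for numbers, which may or may not be the one the exercise has in mind.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for example1.txt.
printf 'The Raven\nby Edgar Allan Poe\npublished 1845\nPoe wrote it in New York\n' > example1.txt

head -n 1 example1.txt > first_line.txt              # first line
tail -n 1 example1.txt > last_line.txt               # last line
head -n 3 example1.txt | tail -n 1 > third_line.txt  # a single line by number (use 14 for the 14th)

# Every line containing "Poe", saved to another file.
grep 'Poe' example1.txt > yourname1.txt

# "*" means zero or more, so [0-9]* matches every line, digits or not.
all_lines=$(grep -c '[0-9]*' example1.txt)
# Requiring at least one digit keeps only lines that really contain a number.
digit_lines=$(grep -c '[0-9][0-9]*' example1.txt)
echo "$all_lines $digit_lines"
```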


Example 2 – sed

Syntax: sed 's/<find>/<replace>/g'

• Stream Editor – substituting text
• Substitute all words that are “old” with “new” in example1.txt.
• Substitute all “a” with “A”, and all “b” with “B” in one line.
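A sketch of both substitutions on a made-up one-line file (sample.txt). Note that s/old/new/g replaces the letters "old" wherever they occur, including inside longer words such as "bold".

```shell
workdir=$(mktemp -d)
printf 'an old book about old boats\n' > "$workdir/sample.txt"

# One substitution: every "old" becomes "new" (g = all matches on each line).
old_to_new=$(sed 's/old/new/g' "$workdir/sample.txt")
echo "$old_to_new"

# Two substitutions in one command, each introduced with -e.
caps=$(sed -e 's/a/A/g' -e 's/b/B/g' "$workdir/sample.txt")
echo "$caps"
```

sed prints the result and leaves the input file unchanged; redirect with > to save the output to a new file.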


Example 3 – Your first shell script!

• Copy example3.sh to your folder.
• Explore its contents:

#!/bin/sh
sed '
s/a/A/g
s/b/B/g
' example2.txt > example3.txt

• Execute this script using ./example3.sh
• Oops – what happened here?


Permissions in Unix

• Unix has three permission/file access modes for all files – read (r), write (w), and execute (x).

• Need to specify permissions explicitly for executables.

• Try chmod +x example3.sh, then try ./example3.sh
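A sketch of the same experiment with a tiny made-up script (hello.sh) in a scratch directory, so it can be tried without the workshop files.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A two-line script; the name and contents are hypothetical.
printf '#!/bin/sh\necho "hello from the script"\n' > hello.sh

# A newly created file normally has no execute (x) permission.
if [ -x hello.sh ]; then before=yes; else before=no; fi
echo "executable before chmod: $before"

# Add execute permission, then look at the permission string.
chmod +x hello.sh
ls -l hello.sh

# Now the script can be run by path.
output=$(./hello.sh)
echo "$output"
```

In the ls -l output, the leading characters (for example -rwxr-xr-x) show the r, w and x modes for the owner, the group and everyone else.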


Example 3 – contd.

• Add a script to change all small letters to capital letters in example2.txt and save the result as a new file, example3.txt
• Execute it in the command line.
• Write a script to find all numbers and replace them with “[ref]”.
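One possible approach, sketched on a made-up line of text (sample.txt). The uppercase step uses tr, which is not on the command list above; with sed alone you would need one substitution per letter.

```shell
workdir=$(mktemp -d)
printf 'see page 12 and note 7\n' > "$workdir/sample.txt"

# Translate every lowercase letter to its uppercase counterpart.
upper=$(tr '[:lower:]' '[:upper:]' < "$workdir/sample.txt")
echo "$upper"

# [0-9][0-9]* = one or more digits, so each whole number becomes one "[ref]".
refs=$(sed 's/[0-9][0-9]*/[ref]/g' "$workdir/sample.txt")
echo "$refs"
```

To turn either command into a script, put it in a file under a #!/bin/sh line, redirect the output to the new file name, and make the file executable.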


Example 4 – awk

Syntax: awk '{<action>}'

• Used to mine column formatted data.
• Columns denoted by $<column number>
• Copy example4.txt to your folder
• Use awk to print only the third column of the file and save it to <yourname5.txt>
• Use awk to print the 4th and 5th columns, separated by a tab character, to a new file <yourname6.txt>
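A sketch of both awk commands on a two-line stand-in for example4.txt (the column values are made up; yourname5.txt and yourname6.txt stand for your own file names).

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical five-column file, columns separated by spaces.
printf 'g1 chr1 100 200 +\ng2 chr2 300 400 -\n' > example4.txt

# $3 is the third column of each line.
awk '{print $3}' example4.txt > yourname5.txt

# Print columns 4 and 5 with a tab ("\t") between them.
awk '{print $4 "\t" $5}' example4.txt > yourname6.txt

cat yourname5.txt yourname6.txt
```

By default awk splits columns on any run of spaces or tabs, so the same commands work on tab-separated files.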


Example 5 – a FASTA file

• Copy example5.fasta from /usr/home to your folder
• Explore its contents – what is the FASTA file format? What does it contain? Do you see a pattern?
• Now use any of the commands we just learned to extract only the gene-ID from the FASTA file. Print it.
• Count the number of “AC” repeats, save to a file <yourname7.txt>
• Save only the first 5 lines in example5.fasta to <yourname8.fasta>
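A sketch of one way to do these steps, using a made-up two-record FASTA file in place of example5.fasta. Two assumptions: the gene ID is taken to be the first word after ">", and "AC repeats" is read as every occurrence of the two letters AC within a sequence line (matches that span a line break are not counted).

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical FASTA file: a ">" header line, then sequence lines.
printf '>gene1 sample record\nACACGTAC\nGGACTT\n>gene2 another record\nTTACAC\nGGGG\n' > example5.fasta

# Header lines start with ">": drop the ">" and keep the first word.
gene_ids=$(grep '^>' example5.fasta | sed 's/^>//' | awk '{print $1}')
echo "$gene_ids"

# Skip headers, print each "AC" match on its own line (-o), count the lines.
grep -v '^>' example5.fasta | grep -o 'AC' | wc -l | tr -d ' ' > yourname7.txt
cat yourname7.txt

# Keep only the first 5 lines.
head -n 5 example5.fasta > yourname8.fasta
```

grep -c would count matching lines rather than matches, which is why the sketch uses grep -o together with wc -l.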


Example 6 – Executing commands in Shell

• What is BLAST?
• Write a shell script to:
• BLAST <yourname8.fasta> against all nucleotide BLAST databases.
• Save output of BLAST to a separate file – call it <yourname9.txt>
• What hits do you get?
• Explore the BLAST output, pull out only gene IDs for all your hits with ‘e’ value = 0.0 and with GenBank accessions (gb), and save them to a new file <yourname10.txt>
• HINT: You’ll notice that there are multiple IDs, separated by “|” – to tell awk to use this as a delimiter, use awk 'BEGIN { FS="|"};…'
• HINT: To sort a list, use the “sort” command
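A sketch of the two hints applied together. It does not run BLAST; instead it uses four made-up hit lines in a simplified layout (IDs separated by "|", with the e-value as the last item on the line). Real BLAST output depends on the version and options used, so the field numbers here are assumptions to adapt.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical hit lines standing in for the BLAST output file.
printf '%s\n' \
  'gi|444|gb|ZZ000009.1| made-up hit four  510  0.0' \
  'gi|111|gb|AB000001.1| made-up hit one  500  0.0' \
  'gi|222|emb|X00002.1| made-up hit two  480  0.0' \
  'gi|333|gb|AB000003.1| made-up hit three  90  2e-10' > yourname9.txt

# FS="|" splits each line on "|". Keep lines whose third field is "gb"
# and whose last field ends in " 0.0", print the accession (field 4),
# then sort the resulting list.
awk 'BEGIN { FS="|" } $3 == "gb" && $NF ~ / 0\.0$/ { print $4 }' yourname9.txt | sort > yourname10.txt

cat yourname10.txt
```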


Example 7 – Advanced scripts (Assignment)

• Write a python script to pull all gene ID’s from <yourname10.txt>, look for these gene ID’s against NCBI and obtain all hits, save it to a file.

• Execute this python script, then parse out only protein id’s (gene/protein=) values from it using a shell script into a separate file.

• Copy all these protein ID’s (they should be Genbank accession ID’s), paste into the query at www.pantherdb.org, select all species on the list, add PANTHER-GO-Slim Biological Process to your columns.


Assignment (contd.)

• Save the output of PANTHER as a file. Now parse this file using grep/sed/awk to print only the GO terms – they should be separated by ;

• Make a unique list of these GO terms by using the ‘uniq’ command (sort the list first, since uniq only removes adjacent duplicate lines), and save this to a final assignment submission file.

• HINT: Prior to pulling unique values, try replacing the “;” values with something else, say a newline character “\n”.
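A sketch of the last two steps on made-up GO terms (go_terms.txt and final_terms.txt are hypothetical file names). It uses tr to turn each ";" into a newline; a sed substitution with "\n" in the replacement works in GNU sed but not in every sed.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical GO-term column: terms separated by ";".
printf 'metabolic process;transport\ntransport;cell cycle\n' > go_terms.txt

# One term per line, sorted so duplicates are adjacent, then de-duplicated.
tr ';' '\n' < go_terms.txt | sort | uniq > final_terms.txt

cat final_terms.txt
```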