last modified 05/07/16
Getting Started with the BDSG Login Service
Introduction and Learning Objectives
This document comes in two parts – a short introduction to a number of necessary concepts,
and a set of annotated practical exercises to work through.
This tutorial will introduce the concept of the Unix operating system and then some of the
commonly used inbuilt commands. Basic programs for editing files are shown, and then
some command-line syntax useful for (re)directing input and output to programs and other
file manipulations. A short glossary/summary of commands is given at the end of the
document. By the end of the practical, you should be comfortable moving around your
account, manipulating directories and files, and running simple commands.
Requirements
To work through the exercises in this practical, you will need login access to a machine
running linux – either a server or a linux workstation. Here we give instructions assuming
that you will have a user account and password on the bioinformatics server
codon.bioinformatics.ic.ac.uk, which you will access via a local machine (PC, Mac or linux
workstation) with connection software already installed – such as a College teaching
machine. You can work through the tutorial using any local machine (PC, Mac or linux
machine) that is connected to the college network.
If you are working from anywhere else, e.g. from home, you will need to use a VPN
connection (see below), as the server will only accept connections from the ic.ac.uk domain
for security reasons.
Supplementary information on how to install the connection software you will need on your
local machine (if not already installed), how to log in using different combinations (e.g.
from a Mac or a linux box to Codon) and how to configure the necessary connections is
all available from http://www.imperial.ac.uk/bioinformatics-data-science-group/support/help/
(under ‘connecting to codon’). You will need your standard college username and password
to access these help pages. If you don’t already have an account with us (Bioinformatics
Support Service/ Bioinformatics Data Science Group), you will need to apply for one via our
web-form at http://www.imperial.ac.uk/bioinformatics-data-science-group/support/apply-for-account
Help on setting up VPN on a private machine is available from the main ICT web-site at
http://www.imperial.ac.uk/admin-services/ict/self-service/connect-communicate/remote-access/method/set-up-vpn/
[...] Bioinformatics Data Science Group’s server codon.bioinformatics.ic.ac.uk for the
remainder of the tutorial. You will need a username and password specific for this machine
– your standard College username/password WILL NOT work here.
If you are logging in from other machines, specific instructions for doing so are available
from our web site as listed on the previous page.
Log in to your PC using your standard college username and password. You will need to
use the Putty program to log in to our server, where you will run the remainder of the
practical. To allow the server to display graphics on your screen, you will also need to use
X11 software on the PC. Your teaching machine has either Exceed or XMing installed for
this purpose.
You will first need to configure and save a session inside Putty:
Find and double click on the PuTTY icon. If it is not on the desktop, look in the Start Menu,
following All Programs:
You should see a screen that looks rather like this: (but the ‘Saved Sessions’ field may be
empty)
Type codon.bioinformatics.ic.ac.uk into the Host Name (or IP address) box, and ensure
that the Protocol is set to SSH (shown by the ring above). Then click on
SSH in the left-hand pane to open its additional options – select X11, as indicated by the
arrow. You will then see the following:
Make sure that the box next to Enable X11 forwarding is ticked, as shown. Then click
Session in the Category list (on the left). This will take you back to the original screen,
where you should save the settings you have made as follows:
Type codon.bioinformatics in the Saved Sessions box, and click on Save. You have now
made a shortcut to enable you to log in to the server next time without having to do any
configuration.
TOP TIP (optional): the session you have created will generate a screen for you to work in
that has white text on a black background. If you prefer alternative colours, you can change
them inside the Putty configuration. Make sure that your ‘codon.bioinformatics’ saved
session is loaded by selecting it and clicking on ‘Load’, then go to the menu on the left side of
the Putty screen and click on the ‘Window -> Colours’ option as below.
Select ‘Default Background’ and then ‘Modify’. You can now select a suitable background
colour. Now select a ‘Default Foreground’ colour, to produce text that is visible against the
background. When you are done, you can go back to the ‘Session’ menu at the top of the
left-hand menu and click on Save.
Now we need to start an X11 emulator program. Here we will assume that your PC has
Exceed installed. Some machines may have XMing installed instead; alternative notes for
using XMing are shown in a boxed section at the end of the Exceed notes.
Look for the Exceed icon on the desktop, which looks a bit like this:
If you can’t find it, search for Exceed under the ‘All programs’ windows menu, and launch by
clicking on the icon you find there. NOTE: you may not see any new program window
appear on your desktop. Now you have X11 running, you can connect to the server, by
going back to Putty, clicking on your codon.bioinformatics saved session to select it, and
clicking on the ‘Load’ button followed by ‘Open’.
You will see a new window appear, that will look something like this (colour may differ
depending on your Putty configuration):
You will need the codon.bioinformatics username and password we have sent to you
earlier by email. You cannot log into this server using your standard college username and
password. Type in your username and password, pressing return each time (you won’t see
any characters on the screen when you type the password).
The first information to appear on the screen once you have logged in is the location and
date/time that you last logged into the server, followed by a banner telling you which
machine you are connected to and a help email address ([email protected]). After this is a
section where the administrator of the server can add any new messages about the service,
for instance warning of scheduled maintenance sessions (not shown in the example
above). This is called the Message of the Day (MOTD for short). On codon, you will then see
some horizontal bars that show a summary of how much space your account is using (more
on this in a later section).
The Prompt
The line that appears on the screen after you have typed in your password and pressed
return has the form
[sarahb@codon ~]$
All of this together is called the prompt.
This reminds you of your username (e.g. sarahb), the short name of the machine you are logged in to (codon), and the directory you are currently in (~, here your home directory; more about this later). The prompt reminds you that the machine is waiting for you to give a command.
We can open as many terminal windows at once as are wanted or needed (there is a limit but you won’t ever need that many). You can start another terminal by going back to Putty and starting another codon.bioinformatics session, the same as before. Terminal windows are maximised, minimised and moved around the screen in the same way as normal PC or Mac windows. Their size can also be adjusted by dragging on a corner while holding the left mouse button. Please do not close them by clicking on the X in the top right-hand corner – this is not a safe way to log out.
N.B. When you have finished and are ready to log out, you can close a terminal window by typing exit at the prompt, or <Ctrl> d (hold down the control key and type d).
Now we can check that X11 forwarding is working by typing the command:
xeyes
After a short pause, you should see a pair of googly eyes appear somewhere on your
screen:
If you can see them, go back to your putty session and stop
the xeyes program by typing <Ctrl> c (i.e. hold the control key down, while typing the
letter c)
If you see an error message and no eyes appear, please go back to your Putty configuration
and check that you have the Enable X11 forwarding box ticked. Save any changes,
restart Putty and try xeyes again.
Using XMing For X11 on a PC (instead of Exceed)
Look on your desktop, quick launch bar or under the programs menu for the Xming icon, which looks a bit like this:
Start the program by selecting it from the menu or double-clicking the shortcut. You won’t see very much happening at this point: XMing will add an icon to the notification panel at the bottom right of your screen. Ideally, you should start the X11 program BEFORE starting your Putty connection.
Terminals – copying and pasting
When X11 was originally designed, the assumption was that everyone would have a three-
button mouse, using the left mouse button to highlight and copy, and the middle button to
paste – so, what do you do if your mouse has fewer buttons?
Many mice only have two buttons, or perhaps 2 buttons and a central scroll wheel. To
emulate the third button (needed for pasting inside a terminal window), there are 2
possibilities, depending on how the mouse has been configured. Where there is a scroll-
wheel, pressing down on the scroll wheel (i.e. into the body of the mouse, rather than turning
the wheel) will paste, or if there are only 2 buttons and no scroll wheel, pressing the 2
buttons simultaneously will paste. On a Mac, you may only find one button on the mouse, or
two. There, holding down the Apple command key on the keyboard and c while
highlighting text should copy, while using the command key with v should allow you to
paste. Remember, whatever you are selecting to copy and paste will get pasted where your
cursor is. To select something for copying, press the left button and select the text, then
paste using whatever is designated as the middle mouse button – as above.
Exercise:
Go back to Putty and open another terminal window. Practice selecting some text in one
window and pasting it into the other terminal. You can also copy and paste between the
terminal and other programs on your PC e.g. Notepad. When you are happy, you can shut
the extra terminal window by typing exit
Basic commands
It is possible to achieve a great deal with only a basic set of Unix commands. The server can be
thought of as a very large filing cabinet, containing files within file folders, within file folders, etc.
Folders are commonly referred to as Directories and Subdirectories within Unix. Carrying on
with the filing cabinet simile, imagine how hard it would be to find anything if you just threw all
your documents in the drawer without any folders, or dividers – chaos! The same thing will
happen to your Unix account if you choose to keep all your documents in your home directory,
instead of creating subdirectories (file folders) to store associated data together.
Your home directory is the directory you automatically start off in every time you log in to your
account.
There is a short-cut name for your home directory when you are typing – which is the ~ (tilde)
symbol.
First we will make a directory called course in which to store the files you will generate today,
type:
mkdir course
To move into this new directory, type: (note that the prompt changes to show the new directory)
[jbloggs@codon ~]$ cd course
[jbloggs@codon course]$
cd stands for change directory
course is a subdirectory of jbloggs, this person’s home directory. To show the fully qualified
pathname for your current directory type:
pwd
/home/jbloggs/course (typical reply)
pwd stands for ‘print working directory’ and will return the full path to where you are on the
machine relative to a fixed point – the Root of the machine. Here, this tells you that course is a
directory, within the directory jbloggs, which is a directory within home - we are using a
‘hierarchical file system’ which means that we can have directories within directories.
N.B.
Knowing the full path for a particular file is important when you need to tell the machine where to
find files you want to work on, which may reside in directories other than the one you are
currently working in. There are 3 ways of specifying the location of a file or directory:
1. Absolute address from the root of the machine (like the one shown above)
2. Relative to your home directory
3. Relative to the current directory you are working in at the time
Use the one that is easiest for you at the time. A location relative to the root of the machine
always starts with a /
When you type pwd, you will see an absolute path from the root of the machine. Other
forward slashes are added to delineate between directories. Relative paths do not start with
a /. There are various shortcut symbols to help you move around as well. We will explore
paths shortly in the exercises, but here is an example first.
Let us assume we are currently in /usr and we want to move the file fred (which is in the
subdirectory users) into the directory /software. We would type:
mv users/fred /software
or, alternatively, we could type
mv users/fred ../software
The symbol “..” stands for one directory backwards, towards the root of the machine.
A single dot “.” stands for the directory that you are in at the time, i.e. your current directory.
The tilde symbol “~” stands for your home directory.
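The example above can be tried out safely in a throwaway directory. This is only a minimal sketch: the scratch directory stands in for the root of the machine, the names usr, users, software and fred are created purely for illustration, and the mktemp command is assumed to be installed.

```shell
# Work in a throwaway directory so nothing in your real account is touched.
start=$PWD
scratch=$(mktemp -d) || exit 1      # mktemp is assumed to be installed
mkdir -p "$scratch/usr/users" "$scratch/software"
touch "$scratch/usr/users/fred"

cd "$scratch/usr"                   # stands in for /usr in the example above

# Relative paths: "users/fred" is below the current directory,
# "../software" is one step back towards the root and then into software.
mv users/fred ../software

# Absolute path: begins with / and means the same thing from any directory.
moved=$(ls "$scratch/software")
echo "software now contains: $moved"

# "." is the current directory, so these two commands list the same place.
ls
ls .

cd "$start"                         # return to where we began
rm -r "$scratch"                    # tidy up the scratch tree
```

Because the destination is written relative to the current directory, the same mv command would work wherever the usr and software directories happened to live.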
Now you can try this out in the exercise below. Type the following commands:
pwd (this will return your current directory, in this case course)
cd .. (this will move you one directory backwards, to your home directory)
cd /usr/biosoft (this will change your directory to one called /usr/biosoft)
ls (this will list the contents of this directory)
cd ~ (this returns you to your home directory)
A note on filenames
Unix makes use of many of the character keys on your keyboard. Some of them have
special attributes, which means that they cannot be used in standard filenames, as they are
interpreted to mean something specific. There are ways of wrapping them so that they are
not interpreted by the operating system (e.g. by using an escape character such as “\” in
front of a space in a file name) but it is generally a GOOD IDEA TO AVOID using the
following characters in your file and directory names `¬!$%&*():;~#?/><,|\{ } [ ] to stop
unexpected effects.
Spaces are also not expected in a file name. The shell treats an unescaped space as a
separator between arguments, so a file name typed as my filecalledfred will be read as
two separate names, “my” and “filecalledfred”, whereas my\ filecalledfred will be correctly
seen as a single name.
Hyphens, underscores and full stops in file and directory names are fine.
NOTE – a full stop used as the first character of a file or directory name will create what is
known as a hidden file (one that is not seen when you list the contents of a directory). These
are generally used to tidy away configuration files that affect the way your account works.
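The effect of a space in a file name can be seen in a scratch directory. The sketch below is illustrative only: the file names are made up, and mktemp is assumed to be available.

```shell
start=$PWD
scratch=$(mktemp -d) || exit 1      # throwaway directory (mktemp assumed available)
cd "$scratch"

# Unescaped space: the shell passes two separate names to touch.
touch my filecalledfred
unescaped_count=$(ls | wc -l | tr -d ' ')

# Escaped or quoted space: each of these is a single file name.
touch my\ otherfile
touch "my thirdfile"
total_count=$(ls | wc -l | tr -d ' ')

ls -1                               # one name per line makes the result easy to read
echo "files after the unescaped touch: $unescaped_count, in total: $total_count"

cd "$start"
rm -r "$scratch"
```

Quoting the whole name does the same job as the backslash, which is often easier to read when a name contains several spaces.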
Now we can copy some files into the course directory that you made earlier. These files are
currently sitting in a directory called intro_course. Type:
cd course
cp /home/biotrain/intro_course/* .
cp (short for copy) requires the name of the file or directory to copy and then the place to
put the copy.
NOTE the full stop, which comes after a space. Yes, you do need to type it, as it
specifies the place to put the copies! Here, the full stop is short-hand for ‘the directory I am
currently in’.
This copies all files (*) in the directory intro_course, which is a sub-directory of the biotrain
home directory, to the current directory (.), but not the directory intro_course itself. The * is known
as a wildcard (more about this later).
TIP – to copy a directory and all of its contents (including other subdirectories and their
contents, if present), we have to copy recursively, e.g.
cp -R /home/biotrain/intro_course .
This would copy the directory intro_course AND all of its contents to your current
directory.
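The difference between the wildcard copy and the recursive copy can be checked on a small made-up tree. This is a sketch under stated assumptions: the intro_course directory here is a stand-in built in a scratch area (not the real /home/biotrain/intro_course), and the names flat, tree, extra, a.pep and b.pep are hypothetical.

```shell
start=$PWD
scratch=$(mktemp -d) || exit 1      # throwaway directory (mktemp assumed available)
mkdir -p "$scratch/intro_course/extra" "$scratch/flat" "$scratch/tree"
touch "$scratch/intro_course/a.pep" "$scratch/intro_course/extra/b.pep"

# Wildcard copy: only the files directly inside intro_course are copied.
# cp complains about the subdirectory "extra" and skips it; the message is discarded here.
cd "$scratch/flat"
cp "$scratch"/intro_course/* . 2>/dev/null
flat_contents=$(ls)

# Recursive copy: the directory itself and everything below it.
cd "$scratch/tree"
cp -R "$scratch/intro_course" .
tree_contents=$(ls)
nested_contents=$(ls intro_course/extra)

echo "wildcard copy gave:  $flat_contents"
echo "recursive copy gave: $tree_contents (containing extra/$nested_contents)"

cd "$start"
rm -r "$scratch"
```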
As with most UNIX commands, if this command has worked, there will be no output to tell
you so. If anything is printed (except the usual prompt), the command has not worked; go
back and check you have typed it in EXACTLY as above. If you receive an error
message, it may be informative, for instance:
cp: cannot stat fred: No such file or directory
(this suggests that the copy command cannot find the file you are trying to copy – in this case fred)
To list the files now present in your current (working) directory, type:
ls (if this is empty, your copy command hasn’t worked - try again)
To list all the files in your home directory, (the one with the same name as your
username) type:
ls ~ (the tilde or ~ symbol is an abbreviated name for your home directory)
Command line arguments
There are two ways that a program can be given additional information - either
1) It can ask you questions on the commandline (prompt) – that you type answers to
2) You can offer the information without being prompted
The drawback of the program asking a question is that if it can do 20 different things, then
being asked 20 questions each time you run it can be very tedious. By convention most
UNIX programs don’t ask for information, they expect you to supply it. This is achieved by
using “command line arguments” sometimes also called flags. By convention, something
is indicated as an “argument” by placing a dash in front of it. Some bioinformatics programs
will ask a basic range of questions but expect additional information to be given via the
command line. We will look more closely at command line arguments, using ls as an
example case.
Try typing:
ls -l
To list all of your files using a different combination of command line arguments that
influence the output, try typing:
ls -Rl ~ (This is a dash, a capital R and then a small L, not the number one)
The -R flag causes ls to search recursively through all directories below, in this case, your home
directory, which is indicated by using the ~ symbol.
The -l flag causes a long listing of the information including sizes, ownership and last
modification times.
On this machine, files and directories listed by ls are shown coloured by their type:
Blue: Directory
Green: Executable or recognized data file
Sky Blue: Linked file
Pink: Graphic image file
Red: Archive file
This can make things a little hard to read sometimes. We have set an alias on the ls
command so that when it is run, it automatically and silently adds the option to show
colours. To see this alias type
alias ls
and you will see the following:
alias ls='ls --color=tty'
In other words, if someone types ls, you actually run ls with the optional flag “--color=tty”, which colours output by type when run in a terminal.
Note: You can turn this colour-coding off in the terminal window (for this session only) by
typing unalias ls
Now try to sort all your files according to their age (newest last):
ls -lrt
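To see how the flags combine, the ordering can be reproduced with two files whose ages are set by hand. This is a small sketch only: the file names and dates are invented, and touch -t is used simply to give the files known modification times.

```shell
start=$PWD
scratch=$(mktemp -d) || exit 1      # throwaway directory (mktemp assumed available)
cd "$scratch"

# Give two files known modification times (format: YYYYMMDDhhmm).
touch -t 202001010000 older.txt
touch -t 202101010000 newer.txt

ls -lt                              # -t sorts by modification time, newest first
ls -lrt                             # adding -r reverses the order, so the newest is last

first_listed=$(ls -rt | head -n 1)
last_listed=$(ls -rt | tail -n 1)

cd "$start"
rm -r "$scratch"
```

Flags can be written together (-lrt) or separately (-l -r -t); ls treats both forms the same way.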
Finally, we can take a look at some files you don’t normally see when you list with ls:
ls -la ~
This makes visible so-called hidden files and directories, whose names start with a full stop, e.g.
drwx------  8 train17 training  4096 May 17 13:49 .
drwxr-xr-x 23 root    root      4096 May  5 13:43 ..
-rw-------  1 train17 training 17208 May 16 15:33 .bash_history
-rw-r--r--  1 train17 training    18 Nov 20 05:02 .bash_logout
-rw-r--r--  1 train17 training   193 Nov 20 05:02 .bash_profile
-rw-r--r--  1 train17 training   231 Nov 20 05:02 .bashrc
drwx------  3 train17 training    20 May 12 12:10 .cache
Here the top line is returning information about your current directory (.) and the second line about the directory one step further back towards the root of the machine (..). If you were to type the command inside /homes/train99, for instance, the top line would refer to train99 and the second line to homes.
NOTE – hidden or dot files (e.g. .bashrc) are generally doing useful work inside your account, influencing your environment. DO NOT DELETE THEM. If you delete them by mistake, your account may not look the same next time you log in, or certain programs may no longer work as expected. If so, contact [email protected] for help.
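If you would like to see the effect of a leading full stop without touching the real configuration files in your account, a scratch directory works well. The names below are hypothetical and mktemp is assumed to be available.

```shell
start=$PWD
scratch=$(mktemp -d) || exit 1      # throwaway directory (mktemp assumed available)
cd "$scratch"

touch visible.txt .hidden_config    # the second name starts with a full stop

plain_listing=$(ls)                 # hidden names are left out
full_listing=$(ls -a)               # -a adds ".", ".." and the hidden names

echo "ls shows:    $plain_listing"
echo "ls -a shows:"
echo "$full_listing"

cd "$start"
rm -r "$scratch"
```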
Ownership and Permissions
Example of file information returned by the command ls -l:
drwx------ 20 sarahb system 8192 Sep 12  2002 www_data/
-rw-r--r--  1 johnp  system 1200 Sep 24 17:30 tape.txt
The first character of each line (as below) indicates the type of the file. For example, a d in this position indicates a directory, - indicates a regular data file.
-rw-r--r-- 1 sarah system 1200 Sep 24 17:30 tape_change.txt
^^^
The next three characters define the permissions afforded to the owner of the file. In this case,
they should be set to rw- for all the listed files. This indicates that the file owner can read, write
to, but not execute the files.
Write permission is required in order to edit or delete a file. Execute permission is required if the
file is a program file or a file containing a list of textual UNIX commands (a script). Without
execute permission, a program or script file cannot be made to run, i.e. be executed. Execute
permission is also required for directories in order to gain full access to the files stored within.
-rw-r----- 1 sarah system 1200 Sep 24 17:30 tape_change.txt
^^^
It is possible to divide the users of a system into groups. This allows users to set their file
permissions such that members of their group have greater access to their files than do other
users of the system. The next three characters define the permissions that the members of the
user’s group have. The group name is given in the fourth column (in this case, “system”).
These three characters should be set r--, indicating that members of your user group may read
your files but may not write to (i.e. amend or delete) or execute them.
The next three positions refer to the access that everyone else (world) would have (in this case none – as shown by the dashes).
-rw-r----- 1 sarah system 1200 Sep 24 17:30 tape_change.txt
       ^^^
The second column reports the number of links to the file (you can ignore this figure).
-rw-r----- 1 sarah system 1200 Sep 24 17:30 tape_change.txt
           ^
The next columns report the owner of the file (sarah) and the group (system) to which the file
belongs. The figures following this are the size of the file in bytes (characters if you prefer), the
date and time that the file was last modified, and finally the name of the file (or directory).
Changing file permissions
There may be a situation where you want someone else to be able to copy or read one of your
files. You will have to change the permissions on the files to allow them to do so. You must also
change the permissions of the parent directories, as these override those of individual files. It is a
very common error to forget to do this. The command to change permissions is chmod. You
have to specify who you are modifying permissions for, and what permissions you are changing,
and for what file/directory.
N.B. Unless you have a specific need to share a specific file-set, you should not normally
need to modify the permissions in your account.
u means user, and refers to the owner of the file
g means group, and refers to the group the file belongs to
o means others, everyone apart from those above
a means all, i.e. user, group and others
Also, as we have seen above, r means read permission, w means write permission and x means execute permission.
So, for example, to give read permission to someone in the same group for a file called “filename” in ~/course:
ls -l ~
chmod g+r ~ (allow people in the group to read my home directory)
chmod g+r ~/course (allow people in the group to read the directory course)
chmod g+r ~/course/filename (allow the group to read the file called filename)
chmod a+r ~ (allow everyone to read your home directory)
As noted above, a directory also needs execute permission (e.g. chmod g+x ~/course) before the group can reach the files stored within it.
If you wanted to remove the permissions, use - instead of +:
chmod g-r ~ (stop your group from reading your home directory)
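The effect of adding and removing a permission can be watched on a scratch file rather than on your home directory. This is a minimal sketch: the file is created only for the demonstration, and cut is used to keep just the ten permission characters from the ls -l output.

```shell
start=$PWD
scratch=$(mktemp -d) || exit 1      # throwaway directory (mktemp assumed available)
cd "$scratch"
touch filename

# Start from a known state: owner may read and write, nobody else may do anything.
chmod u=rw,g=,o= filename
before=$(ls -l filename | cut -c1-10)

chmod g+r filename                  # add read permission for the group
after_add=$(ls -l filename | cut -c1-10)

chmod g-r filename                  # take it away again
after_remove=$(ls -l filename | cut -c1-10)

echo "before:    $before"
echo "after g+r: $after_add"
echo "after g-r: $after_remove"

cd "$start"
rm -r "$scratch"
```

Only the group's three characters change between the listings; the owner and world positions are untouched because the commands name g alone.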
Looking at text-based files
There are a number of commands you can use to look at files that contain text. Sometimes
you may want to just send the contents to the screen all in one stream without stopping but
more generally you may want to be able to look at the content a screen-full at a time.
Two of the most useful are:
more filename
less filename
These two commands are very similar, but less has greater flexibility – (less does more than
more does – silly pun). Both will present information from the file to the screen one page at a
time, (as opposed to other commands that scroll down the document too quickly to read
such as cat). However, less will allow you to scroll back up the document using the arrow
keys, whereas more only allows you to scroll down. To exit out of a document you are
reading using more or less, type
q
Note: more is a standard UNIX command, whereas less may or may not be available on
other UNIX systems you may encounter.
For example, try the following:
cat cd4_human.pep
more cd4_human.pep
The more command shows you the contents of a file one page at a time and tells you to hit
the space bar to continue. Now try:
less cd4_human.pep
You should be able to use the arrow keys to scroll up and down the document.
When using less there are a number of keystrokes you can use for navigation:
h            help
q            quit program
space bar    next page
return key   next line
f            forward one page (same as pressing the space bar)
b            back one page
G            go to the end of the file
g            go to the start of the file
j            moves you forward a line
k            moves you back one line
/xxx         search for the characters xxx in the file and highlight matches
n            find next occurrence of search pattern above
?xxx         search for xxx in the opposite direction (backwards)
Now Try looking at one of your files using less, and navigate around the document
using some of the commands shown above.
Looking at Other Files
If files have been compressed using the gzip command, they will usually have a filename
which ends in .gz. If they are text-based files, you will be able to read the contents without
uncompressing them by using the command zcat. This sends the output to the screen all in
one go, so you might want to redirect it into the less command so you can read it one
screen-full at a time:
zcat myfile.gz | less (more on redirection later…)
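A quick way to convince yourself that zcat leaves the compressed file alone is to make a small one and read it back. This is a sketch on made-up content: myfile is created in a scratch directory, and gzip -dc is shown alongside as the spelled-out equivalent of zcat on Linux systems.

```shell
start=$PWD
scratch=$(mktemp -d) || exit 1      # throwaway directory (mktemp assumed available)
cd "$scratch"

printf 'first line\nsecond line\n' > myfile
gzip myfile                         # replaces myfile with the compressed myfile.gz

# zcat writes the uncompressed text to the screen; here only the first line is kept.
zcat myfile.gz | head -n 1

# gzip -dc is the same operation spelled out (decompress, write to standard output).
contents=$(gzip -dc myfile.gz)

# The file on disk is still the compressed one.
if [ -f myfile.gz ] && [ ! -e myfile ]; then still_compressed=yes; else still_compressed=no; fi
echo "still compressed on disk: $still_compressed"

cd "$start"
rm -r "$scratch"
```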
If you really need to look inside a binary file, you will either need to use a program designed
to work specifically with its exact format (and this will depend on which program created it),
or you can extract readable strings out of it using the strings command.
Wild cards
The * character is a ‘wildcard’. That means it can stand for any character or sequence of characters.
Thus:
*.seq means all files ending in .seq
c*.pep means all files whose names begin with c and end with .pep
* means all files
Wild cards allow us to specify alternative filenames with a minimum number of keystrokes.
We can also use the ? character, meaning “any single character”, so
more cd4_?????.pep
will display any file beginning with cd4_, followed by any 5 characters, and then .pep. So,
here this would match the cd4_mouse.pep and cd4_human.pep files, but not cd4_rat.pep.
We can also use square brackets to denote a range of letters [a-z] or a selection of
letters [abrh], so
more cd4_[abrh]????.pep
will match cd4_human.pep and cd4_rabit.pep, but not cd4_mouse.pep or cd4_rat.pep.
Now try these commands on the files in your current directory, and also:
more *.pep
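A handy way to check what a wildcard pattern will match before using it with a real command is to hand it to echo, which simply prints the expanded file names. The sketch below uses empty stand-in files with the names mentioned above, plus a hypothetical notes.seq, in a scratch directory.

```shell
start=$PWD
scratch=$(mktemp -d) || exit 1      # throwaway directory (mktemp assumed available)
cd "$scratch"
touch cd4_human.pep cd4_mouse.pep cd4_rat.pep cd4_rabit.pep notes.seq

star_match=$(echo *.pep)                   # every name ending in .pep
five_match=$(echo cd4_?????.pep)           # exactly five characters after cd4_
bracket_match=$(echo cd4_[abrh]????.pep)   # first of the five must be a, b, r or h

echo "*.pep              -> $star_match"
echo "cd4_?????.pep      -> $five_match"
echo "cd4_[abrh]????.pep -> $bracket_match"

cd "$start"
rm -r "$scratch"
```

With these four stand-in files, the five-character pattern also picks up cd4_rabit.pep, since rabit is five letters long, while cd4_rat.pep is left out by both of the last two patterns.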
Copying and deleting files and directories
Here, we will carry out a number of basic file manipulations using UNIX commands. A
summary of the commands we use and what they do is provided at the end of these notes.
We are going to:
• make a new subdirectory under the one we are in at the moment
• move some files into it
• rename a few files
• delete the directory we have made (and its contents)
These are all functions that you will need in order to be able to organise and navigate within
your own account.
Type:
cd ~/course (moves you into the directory course, under your home directory)
mkdir test (makes a new directory called test)
ls test (lists all the files in the directory called test; the directory should be empty)
Now, we are going to copy a file into that directory.
cp cd4_human.pep test (copies the file cd4_human.pep into the directory test)
cp stands for copy, and is an important command. An important point to note is that you can
copy files or, if you add certain flags, directories. Notice that above, we are copying the file
cd4_human.pep to the directory test. If we had not previously created the directory called
test, the computer would have assumed that what we wanted to do was to copy the file
“cd4_human.pep” and call the copy “test”. If you wanted to be sure, you could write the
following, but it does the same as the command above:
cp cd4_human.pep test/cd4_human.pep
Remember, to the computer, files and directories are two different things. A directory is
something you can store other things in. But you do have to TELL the computer if you intend
something to be a directory or just a file. That is why you have special commands, like
mkdir, to make a directory.
Now, try the following:
cd test (move into the directory “test”)
mv cd4_human.pep newname.pep (rename the file cd4_human.pep to newname.pep)
cp newname.pep second.pep (make a copy of newname.pep called second.pep)
mv is short for move, and is the command used for either moving files to new locations, or
purely renaming them (a similar act!).
Now we want to delete, or remove, newname.pep. Type:
rm newname.pep
Now, let’s move up a directory, to course, and then delete the test directory completely:
cd ..
rm -r test
The .. is a shortcut, meaning ‘go back up one directory from where you currently are’ – in
this case back from test to course.
The flag -r is required to delete directories and will delete a directory recursively along with
all of its contents – including other subdirectories so BE CAREFUL.
On this machine, you will be prompted to examine contents of a non-empty directory and
asked if you want to delete each subdirectory and contents individually (type Y or N when
prompted). To blindly delete without examining, again insert a backslash in front of the rm to
unalias it.
Empty directories are more usually deleted using the rmdir command.
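As a rough illustration of the difference between rmdir and rm -r, the sketch below uses made-up directory names (empty_dir, full_dir) inside a scratch directory created by mktemp. The backslash form is used so that any alias on rm is bypassed for that one call.

```shell
# Throwaway directory, so the recursive delete cannot touch real data.
scratch=$(mktemp -d)
cd "$scratch"
mkdir empty_dir full_dir
echo "data" > full_dir/a.pep

rmdir empty_dir                     # works: the directory is empty
rmdir full_dir 2>/dev/null || echo "rmdir refuses: full_dir is not empty"

# \rm skips any alias (for example one that adds prompting) for this call only.
\rm -r full_dir
ls -A                               # prints nothing: both directories are gone
```

Because rmdir refuses to remove a directory that still has contents, it is the safer habit when you believe a directory is already empty.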
Note how the destination affects both the mv and cp commands. If the destination you name
is an existing directory, the source file(s) are moved (mv) or copied (cp) into that directory. If
the destination is not already known to the computer as a directory, then the file is copied
(cp) or renamed (mv) under that name.
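The two behaviours of mv can be seen side by side in a short sketch. The file and directory names here (one.pep, renamed.pep, archive) are hypothetical and live in a scratch directory.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "dummy" > one.pep
mkdir archive

mv one.pep renamed.pep     # renamed.pep is not a directory, so this is a rename
mv renamed.pep archive     # archive is a directory, so the file moves into it
ls archive                 # lists renamed.pep
```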
If the target (destination) is a file which already exists, then the program will ask you to first
confirm the action (this is not standard; most UNIX systems will immediately overwrite the
original file).
ls (returns 2 files that exist)
normal_1_1_fastqc.html
normal_1_2_fastqc.html
cp normal_1_1_fastqc.html normal_1_2_fastqc.html
cp: overwrite normal_1_2_fastqc.html? n (user is prompted if existing file
should be over-written - file is not overwritten as answer no is given)
If you add a -f flag to the copy command, it forces the action to be done silently, but take
care if you choose to do this!
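Prompting of this kind is commonly set up by aliasing cp to cp -i. That is an assumption here, not something these notes state about the server. The sketch below calls cp -i explicitly in a scratch directory and pipes in the answer n instead of typing it.

```shell
scratch=$(mktemp -d)
cd "$scratch"
echo "first"  > normal_1_1_fastqc.html
echo "second" > normal_1_2_fastqc.html

# -i asks before overwriting an existing file; the answer "n" declines.
# "|| true" is there because the exit status after declining varies between systems.
echo n | cp -i normal_1_1_fastqc.html normal_1_2_fastqc.html || true

cat normal_1_2_fastqc.html     # still prints: second
```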
The file system has been set up to try to stop you copying over files and directories
accidentally. However, NOT ALL PROGRAMS ARE AS NICE. Although the more
dangerous commands (cp, rm, mv) have been modified so that they will at least ask first,
most bioinformatics programs won’t. So if you repeat an analysis, the results of the
second analysis may overwrite those of the first unless you give the program a new
destination name for the output. Other programs may just silently fail to run.
The moral here is to make copies of important files before you start manipulating them:
cp file file.orig
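A slightly fuller sketch of the same habit, using a hypothetical file called results.txt in a scratch directory: make the backup, detect that the working copy has changed, and restore it.

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'line1\nline2\n' > results.txt      # hypothetical important file

cp -p results.txt results.txt.orig         # -p keeps timestamps and permissions
echo "oops" > results.txt                  # simulate an accidental overwrite

# cmp -s is silent and just reports through its exit status whether the files differ.
cmp -s results.txt results.txt.orig || echo "results.txt has changed since the backup"

\cp results.txt.orig results.txt           # restore; \cp bypasses any alias on cp
cat results.txt
```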
Using an editor to create a file
There are several text editors available to you on our servers. The most universal UNIX
editor is vi (or its slightly more helpful version vim) but it isn’t the simplest editor to use.
Today we will be looking at two editors, a simple editor called pico, and a windows-based
editor called gedit.
Pico
[Pico is available on Codon but not currently on training.medbio.]
We will create a simple text file. We will make a file of filenames called cd4.list. So, we
start up the editor pico, telling it to edit or create the file cd4.list. If we gave it no filename
pico would start a new file and ask you for a name to save it under when you exit the
program.
pico -w cd4.list (the -w flag tells pico not to linewrap long lines)
At the bottom of the screen you will see the standard pico commands in reverse video like
this: (I have started to type test into the editor pane)
e.g. cntrl-x exit, cntrl-u undo, cntrl-w search.
You need to use the arrow keys to move around inside your document. Now type the
following lines into the file:
cd4_cerae.pep
cd4_erypa.pep
cd4_human.pep
When you have inserted the three lines, quit from the editor by pressing <Cntrl> x. Pico will
ask whether to save your changes; answer y and press Enter to confirm the filename.
The list you created consists of the names of three files in your current directory. Some programs
can take as input a list file like this (i.e. a file containing the names of other files to input). If you
wanted to use files in any other directory, you will need to tell the machine where to look for
them, either relative to the place where you are when using the listfile, or the absolute path from
the root of the machine. To do this, you need to specify the full path of your files.
e.g: /home/jbloggs/course/cd4_cerae.pep
Remember - The pwd command can be very useful if you are not sure what the full path to your
file is!
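If you need a list file with full paths, you do not have to type them by hand. The sketch below is one possible approach: it uses empty stand-in files in a scratch directory, and the output name cd4_fullpath.list is made up for illustration.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch cd4_cerae.pep cd4_erypa.pep cd4_human.pep    # empty stand-ins for the course files

# Prefix each matching filename with the current directory reported by pwd.
for f in cd4_*.pep; do
    echo "$(pwd)/$f"
done > cd4_fullpath.list

cat cd4_fullpath.list
```

Each line of the resulting file is an absolute path, so a program reading the list can find the files no matter which directory it is run from.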
Now let’s look at the gedit editor (the GNOME desktop's text editor; GNOME began as part of GNU,
which is a free software project rather than an ungulate).
Gedit requires an X11 connection, by the way, while pico does not, and will work in a simple
terminal – one reason to be familiar with both.
Start it by typing
gedit cd4.list &
The & symbol runs the program in the background so you can carry on working in the
terminal as well, if you wish – more about this later.
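The effect of & can be tried without a graphical program. In this small sketch, sleep stands in for a long-running program such as gedit.

```shell
sleep 1 &                 # stand-in for a long-running program, started in the background
pid=$!                    # $! holds the process ID of the most recent background job
echo "the terminal is free again; the background job has PID $pid"
jobs                      # lists the jobs running in the background of this shell
wait "$pid"               # pause here until that background job has finished
echo "background job finished"
```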
This is a slightly more friendly-looking editor. Here we have several menus, selected by using the
right mouse button. At the bottom of the window, a menu currently showing a Plain Text option,
allows you to select auto-syntax prompting, for a large number of possible programming
languages, including Python and C++.
Now try editing the text file you have loaded. When you have seen enough, save the file and
then exit gedit.
Finding Help
There are a number of places you can go to find programs you need, or find out about
programs or commands. Here are a few options:
The man command
man more
The UNIX command for getting help is man (because it brings up manual pages). These
pages provide information on a number of programs on the system, including many of the
UNIX commands you may have cause to need. If you type:
man ls
you can now read all about the ls command, including what extra information you can give
the program to get it to do particular things. Of course, you need to already know the name
of the program to get help this way. If you don't know this however, you can type either of
the following to try and find out what commands exist for what you want to do:
man -k keyword
or
apropos keyword
You can now look at the man page for any command you think is appropriate.
Other help
A few bioinformatics programs (e.g. Hmmer) have man pages, but most don’t. Often help
files are distributed in html (web) format or as pdf files and can be found by searching our
web site by software name or by looking in our software database