Scripting Languages
Course 3
Diana Trandabăț
Master in Computational Linguistics - 1st year, 2013-2014
Today’s lecture
• What is Perl?
• How to install Perl?
• How to write Perl programs?
• How to run a Perl program?
– perl program.pl
• Scalars
About programming
• Working with algorithms
• Program needs to contain exact commands
– (Mostly) not: Go buy some bread
– But: Put on your coat and shoes, open the door, go through it, close the door, go down the stairs…
• A program has a certain input
• Processes it
• Produces a certain output
Why Perl?
• PERL = Practical Extraction and Report Language
• Easy to learn
• Simple syntax
• Open source, available for different platforms: Unix, Mac, Windows
• Good at manipulating text
– Good at dealing with regular expressions
• TMTOWTDI - “There’s more than one way to do it”
• Widely used for CGI (web) programming, and also usable for GUI programming.
Getting started…
• For Windows: install ActivePerl http://www.activestate.com/Products/ActivePerl
• You may use your university account (PuTTY), and then you don’t have to install anything.
• Most Linux distributions come with Perl. To find out whether you already have it installed, open a terminal and type
perl -v
which should print the version of Perl that you have installed on your computer.
Before you start using Perl…
· Make sure Perl exists, find out what version it is
  · perl -v
· How do I get help?
  · perldoc perl (general info, TOC)
  · perldoc perlop (operators like +, *)
  · perldoc perlfunc (functions like chomp: > 200!)
  · perldoc perlretut (regular expressions: /ABC/)
  · perldoc perlreref (regular expression reference)
  · perldoc -f chomp (what does the chomp function do?)
  · perldoc IO::File (find out about a Perl module)
· Type q to quit when viewing help pages, or press the space bar for the next page.
How to write a Perl program
• Perl programs can be written in any text editor
– Notepad, vim, even Word…
– Recommended: a simple text editor with syntax highlighting
• Write the program code
• Save the file as xxx.pl
– The .pl extension is not necessary, but useful
What is a Perl program like?
#!/usr/bin/perl -w
# This *very* simple program prints "Hello World!"
print "Hello World!";
• The first line, #!/usr/bin/perl -w, is needed on Linux and not mandatory on Windows, but it does no harm, so you may leave it in your code.
• The -w option tells Perl to produce extra warning messages about potential dangers. This is similar to
#!/usr/bin/perl
use warnings;
• White space (mostly) doesn't matter in Perl.
• All Perl statements end in a semicolon ;
• The content of a line after the # is a comment. It is ignored by Perl, with the exception of the first line, #!/usr/bin/perl
• What are comments for, then?
– They are for you, and for others who will have to read the code
– Imagine looking at a complex program in a few months and trying to figure out what it does
• Write as many comments as you can
• print "Hello World!"; is a Perl command
– In this case, for printing text on the screen
• Every command should start on a new line
– Not a Perl requirement, but crucial for readability
• Every command should end with a semicolon ;
• Many commands take arguments
– Here: "Hello World!"
What to do with the program?
• Perl works from the command line
• Windows: Start → Run… → cmd
• Go to the directory where you saved the program
– E.g.: cd C:\Perl\MyPrograms
• Run the program:
– perl program.pl
• See the results of your labours!
Exercise
• Create a folder for your Perl programs
• Open the editor of your choice and write the "Hello World" program
– The command is print "Hello World!";
– Don't forget the comment!
• Save the program
• Run it!
• What happens if you misspell the print command?
More on the first program
· Perl is case sensitive!
  · print is not the same as Print
  · $bio is not the same as $Bio
· print is a function which prints to the screen
  · print("Hi") is (usually) the same as print "Hi"
  · Inside "double quotes", \n starts a new line, \t prints a tab
· A function is called with zero or more arguments
  · Arguments are separated by commas
  · print takes as many arguments as you give it
print ""; # legal, prints nothing, not even \n
print("Hi", "There"); # prints HiThere
print(Hi); # wrong: the bareword Hi is not a string (Perl takes it as a filehandle name)
print(1+1, 2+2, "\n"); # prints 24 and a newline
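The points about print and its arguments can be tried out with a short script. This is only a minimal sketch; the variable name $joined is made up for illustration and is not part of the lecture.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Several arguments, separated by commas, are printed one after another
print "Hi", "There", "\n";

# Parentheses around the arguments are optional; \t is a tab, \n a new line
print("Name:\tPerl\n");

# Arguments are evaluated first: 1+1 gives 2, 2+2 gives 4
my $joined = (1+1) . (2+2);   # the text print(1+1, 2+2) would write
print $joined, "\n";
```

Running it should show HiThere, then Name: and Perl separated by a tab, then 24, each on its own line.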
Variables
• The "Hello World" program always has the same output
– Not a very useful program, as such
• We need to be able to change the output
• Variables are named containers that can hold different values
Variables
• Names in Perl:
– Start with a letter
– Contain letters, numbers, and underscores “_”
– Case sensitive
• Three major types:
– $ Scalars (single value)
– @ Arrays (lists)
– % Hash tables
Scalars
• Start with a dollar sign “$”
• Can be of type:
– Integer
– Floating point
– String/text
– Binary data
– Reference (like a pointer)
• Perl is not a strongly typed language: the same scalar can hold a number or a string, and (unless you ask for it) there is no need to declare a variable beforehand
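A small sketch of what a scalar can hold, and of how Perl converts between numbers and strings as needed. All variable names here are invented for the example, and the script uses my and use strict, which are common practice but optional.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = 42;                 # integer
my $ratio = 3.14;               # floating point
my $word  = "text";             # string
my $ref   = \$count;            # reference to $count (like a pointer)

my $sum   = "10" + 5;           # the string "10" is used as a number
my $label = $count . " items";  # the number 42 is used as a string

print "$sum\n";                 # 15
print "$label\n";               # 42 items
print "${$ref}\n";              # 42, read through the reference
print "$word has ratio $ratio\n";
```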
Defining variables
• To define a variable, write a dollar sign followed by the variable’s name
– Names should consist of letters, numbers and the underscore
– They should start with a letter
– Variable names are case-sensitive!
• $a and $A are different variables!
– Generally, a variable’s name should tell you what the variable is for
# We define a variable "a" and assign it the value 42
$a = 42;
• Variables can be assigned values
– String: text (character sequence) in single or double quotes
– Numbers
• $a = 42;
• $a = "some text";
Declaring Variables
• Variables can also be declared with my
– Tells the program there's a variable with that name
– my $value = 1;
– Use my the first time you use a variable
– You don't have to give a value (the default is undef, which acts like "" or 0, but -w may warn)
• Avoid typos
– use strict; will force you to declare all variables you use with my
– Put this at the top of (almost) any program
– Now Perl will complain if you use an undeclared variable
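The effect of use strict can be seen in a short sketch. The names $count and $cuont (a deliberate typo) are invented for this example, and the faulty line is compiled inside a string eval only so that the rest of the script can still run and report the error.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $value = 1;      # declared and given a value
my $count;          # declared only; it is undef until assigned
$count = $value + 1;
print "count = $count\n";

# A misspelled variable name is rejected under strict.
# The string eval lets us catch the error instead of stopping the script.
my $ok = eval 'use strict; $cuont = 5; 1';
print "strict caught the typo: $@" unless $ok;
```

Without use strict, the misspelled $cuont would silently become a new, separate variable.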
Changing variables
• Arithmetic operations
– $a = 42 / 2; # division
– $a = 42 + 5; # addition
– $a = $b * 2; # multiplication
– $a = $a - $b; # subtraction
• Also useful:
– $a += 42; # the same as $a = $a + 42;
– The same for -, *, /
• String operations
– $a = "some" . " text"; # concatenation
– $a = $a . " more text";
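The operations above, put together in one runnable sketch. The names $x, $y and $text are chosen only for this example; $a and $b are avoided here because Perl reserves them for its sort function, so declaring them with my is discouraged.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $x = 42 / 2;     # division: 21
my $y = 5;
$x += $y;           # same as $x = $x + $y; now 26
$x *= 2;            # same as $x = $x * 2;  now 52

my $text = "some" . " text";   # concatenation
$text .= " more text";         # same as $text = $text . " more text";

print "$x\n";       # 52
print "$text\n";    # some text more text
```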
Data flow
• Unless you say otherwise:
– Data comes in through STDIN (Standard In)
– Data goes out through STDOUT (Standard Out)
– Errors go to STDERR (Standard Error)
• The reason for the most recent system error is contained in a ‘magic’ variable $!
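A minimal sketch of the two output streams and of $!. It assumes that no file with the made-up name below exists in the current directory. The open function and the lexical filehandle $fh go slightly beyond this slide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

print STDOUT "This goes to standard output\n";   # same as a plain print
print STDERR "This goes to standard error\n";

# Opening a file that does not exist fails and sets $!
my $missing = "no_such_file_for_this_demo.txt";  # hypothetical name, assumed absent
my $opened  = open(my $fh, "<$missing");
my $error   = "$!";                               # copy $! right after the failure
print STDERR "Could not open $missing: $error\n" unless $opened;
```

Both messages normally appear on the screen, but STDOUT and STDERR can be redirected separately on the command line.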
Basic output
• We have already seen an output command
– print "text";
– print $a;
– print "text $a";
– print "text " . ($a+$b) . " more text.";
– Special characters:
• \n – new line
• \t – tabulator
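A short sketch of the output forms listed above. The variable names are invented for the example. Note that + and . have the same precedence in Perl, so a sum inside a concatenation needs parentheses.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $apples = 2;
my $pears  = 3;

# Variables inside double quotes are replaced by their values
my $interpolated = "I have $apples apples";

# Parentheses make sure the addition happens before the concatenation
my $summed = "Total: " . ($apples + $pears) . " fruits.";

print "$interpolated\n";   # I have 2 apples
print "$summed\n";         # Total: 5 fruits.
```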
Exercise
• Define a variable
• Assign it a value of 15
• Print it
• Double the value
• Print it again
• Define another variable with the string "apples"
• Print both variables
• Change the first variable to its square and the second to "pears"
• Print both variables
Basic input
• The <> operator returns a line of input from the standard source (usually, the keyboard)
• Syntax:
– $a = <>;
• Don’t forget to tell the user what he’s supposed to enter!
• Try the following program:
# This program asks the user for his name and greets him
print "What is your name? ";
$name = <>;
print "Hello $name!";
Input, output and new lines
• As the user input is followed by the [Enter] key, the string in $name ends in a new line
• The chomp function deletes the new line at the end of a string
• Try the following, modified program:
# This program asks the user for his name and greets him
print "What is your name? ";
$name = <>;
chomp($name);
print "Hello $name!";
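To see what chomp changes without typing anything, the sketch below uses a fixed string in place of keyboard input. The value "Diana\n" is just a stand-in for a line read with <>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name = "Diana\n";             # stands in for a line typed by the user
my $before = "Hello $name!";      # the new line ends up before the "!"
chomp($name);                     # removes the trailing new line
my $after = "Hello $name!";

print "Before chomp: [$before]\n";
print "After chomp:  [$after]\n";
```

The square brackets make the stray line break in the first message easy to spot.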
If, else
• Until now, the course the program takes is fixed
• The if clause allows us to take different actions in different circumstances
# Let's try out a conditional clause
print "Please enter password: ";
$password = <>;
if ($password == 42) {
    print "Correct password! Welcome.";
} else {
    print "Wrong password! Access denied.";
}
• Note: = is the assignment operator, == is the (numeric) comparison operator
• else is an optional branch that is executed if the if condition fails
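A small sketch of comparisons inside if and else. It goes slightly beyond the slide by also showing eq, Perl's comparison operator for strings. All names are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# == compares numbers: "10" and 10.0 are the same number
my $numeric_equal = ("10" == 10.0) ? 1 : 0;

# eq compares strings: "10" and "10.0" are different texts
my $string_equal = ("10" eq "10.0") ? 1 : 0;

my $number = 7;
my $verdict;
if ($number == 10) {
    $verdict = "equal to 10";
} else {
    $verdict = "different from 10";
}

print "numeric: $numeric_equal, string: $string_equal\n";
print "$number is $verdict\n";
```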
Exercise
• Try out the password program.
– Why doesn't it work correctly? Fix it.
– Tell the user if the number he entered is too large or too small
• Hint: The comparison operators you’ll need are < and >
While
• What if we want to do checks until something happens?
• The while loop repeats its commands as long as its condition is true
• Note: in the example below, $password has no value at first, so it specifically doesn’t have the value 42
# Now on to a "while" loop
while ($password != 42) {
    print "Access denied.\n";
    print "Please enter password: ";
    $password = <>;
    chomp($password);
}
print "Correct password! Welcome.";
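A while loop that needs no keyboard input can make the mechanics easier to follow. This is only an illustrative sketch with invented names: it doubles a value until it reaches at least 100 and counts the steps.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $value = 1;
my $steps = 0;

# The condition is checked before every pass through the loop
while ($value < 100) {
    $value *= 2;    # 2, 4, 8, 16, 32, 64, 128
    $steps++;
}

print "Reached $value after $steps steps\n";   # Reached 128 after 7 steps
```

If the condition is already false the first time it is checked, the loop body does not run at all.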
Exercise
• Write a small game: take a number, and make the user guess it. Tell him if it's too high or too low. If the user gets it right, the program terminates.
– If you like, you can take a random number:
$random = int(rand(10)); # a whole number from 0 to 9
Filehandles
• A filehandle is a way to interact with input or output
– ‘<>’ reads from the files named on the command line (or from STDIN if there are none)
• Filehandle names are simple strings with no symbols
– I usually use all caps (SEQFILE), but that isn’t necessary
• You must open your filehandle before using it
Reading files
• What if we want to have input from a file, not from the user?
• Open file for reading:
– open(INPUT, "<file.ext");
• Reading is the default behavior, so you don’t actually need the ‘<’
• Read a line:
– $line = <INPUT>;
– $line = <>; # is just a special case
Writing files
• What if we want to print to a file, not to the screen?
• Open file for writing:
– open(OUTPUT, ">file.ext"); # open new file
• Warning: If the file already exists, it is overwritten!!
• Write:
– print OUTPUT "Some text...";
• Appending:
– open(NAME, ">>filename"); # append to old file
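Writing, appending and reading back can be combined in one sketch. To avoid touching real files, it creates a scratch file with the core module File::Temp, which is an addition not covered in the lecture. The handle names follow the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file for the demonstration; it is deleted when the script ends
my ($tmp_fh, $file_name) = tempfile(UNLINK => 1);
close($tmp_fh);

open(OUTPUT, ">$file_name") or die "Cannot write $file_name: $!";
print OUTPUT "first line\n";          # replaces any old content
close(OUTPUT);

open(OUTPUT, ">>$file_name") or die "Cannot append to $file_name: $!";
print OUTPUT "second line\n";         # added after the existing content
close(OUTPUT);

open(INPUT, "<$file_name") or die "Cannot read $file_name: $!";
my $content = join("", <INPUT>);      # read all lines at once
close(INPUT);

print $content;
```

Opening the file with > a second time, instead of >>, would have discarded the first line.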
Reading files
• Perl Magic! <>
– Opens the file (or files) given as arguments on the command line
– Brings in one line of data at a time
open(INPUT, "<test.txt");
while ($line = <INPUT>) {
    chomp $line;
    $line_id++;
    print "$line_id:\t$line\n";
}
close(INPUT);
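The "magic" of <> can be tried without typing a command line by filling @ARGV, the list of command-line arguments, by hand. This is a sketch under assumptions: the scratch file comes from the core module File::Temp, and @ARGV is an array, a variable type this lecture only mentions in passing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Prepare a small scratch file with three lines
my ($fh, $file_name) = tempfile(UNLINK => 1);
print $fh "alpha\nbeta\ngamma\n";
close($fh);

# Pretend the script was started as: perl script.pl <that file>
@ARGV = ($file_name);

my $line_id  = 0;
my $numbered = "";
while (my $line = <>) {       # <> opens the files listed in @ARGV
    chomp $line;
    $line_id++;
    $numbered .= "$line_id:\t$line\n";
}
print $numbered;
```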
Filehandle
• Flexible coding
– I want to specify the file to open on the command line, rather than hard coding it
$in_name = shift;
$out_name = shift;
open FILE, "<$in_name" or die "Couldn’t open $in_name for reading: $!\n";
open OUT, ">$out_name" or die "Couldn’t open $out_name for writing: $!\n";
while ($line = <FILE>) {
    chomp $line;
    print OUT "Something about $line\n";
}
close OUT;
close FILE;
• Usage: perl myscript.pl inputfile outputfile
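When open is written without parentheses, it matters whether the error check uses or or ||. The sketch below shows the difference by setting a flag instead of calling die. It assumes that the made-up file name does not exist, and the lexical handles $fh_a and $fh_b are a variation on the bareword handles used in the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $missing   = "no_such_file_for_this_demo.txt";  # hypothetical, assumed absent
my $or_ran    = 0;
my $pipes_ran = 0;

# "or" has very low precedence: it tests the result of the whole open call
open my $fh_a, "<$missing" or $or_ran = 1;

# "||" binds tighter than the comma: it only tests the string "<$missing",
# which is always true, so the right-hand side never runs
open my $fh_b, "<$missing" || ($pipes_ran = 1);

print "or branch ran: $or_ran\n";      # 1
print "|| branch ran: $pipes_ran\n";   # 0
```

Writing open(...) with parentheses makes both operators behave the same, which is one reason many programmers prefer that form.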
Pipelining
• The STDOUT of one script can serve as the STDIN of another script.
– Use the pipe (‘|’) symbol to chain scripts together
• Nothing goes to the screen in between scripts
– Instead, what would normally go to the screen is redirected and made the STDIN of the next script
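A script meant for a pipeline simply reads from one filehandle and prints to another. The sketch below puts that work in a subroutine with the invented name shout_lines. So that it runs on its own, it is fed from a string through an in-memory filehandle, a Perl feature beyond this lecture; in a real pipeline the call would use STDIN and STDOUT, as the comment shows.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read every line from $in and write it in upper case to $out
sub shout_lines {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        print $out uc($line);
    }
}

# In a pipeline you would call:  shout_lines(\*STDIN, \*STDOUT);
# Here a string plays the role of the previous script's output.
my $fake_input = "first line\nsecond line\n";
my $result     = "";
open(my $in,  "<", \$fake_input) or die "Cannot open in-memory input: $!";
open(my $out, ">", \$result)     or die "Cannot open in-memory output: $!";
shout_lines($in, $out);
close($out);
close($in);

print $result;
```

With the STDIN/STDOUT call in place and the script saved under a name of your choice, a usage such as perl first.pl | perl shout.pl would chain two scripts (both file names are hypothetical).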
Exercise
• Make a text file and fill it with a Wikipedia article
– Count the number of definite and indefinite articles (the and a)
– Count the number of numbers and digits
– Insert a <number!> tag before every number
Great!
See you next time!