Welcome to
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
&
Malcolm Newey
Australian National University
Semester 1, 2006
COMP 1100 — Introduction & Outline 1
What is this course about?
• Introduction to the basic principles of programming
• First part of a sequence COMP1100 + COMP1110
• About 2/3rds programming concepts using Haskell,
a functional programming language, followed by . . .
• About 1/3rd introduction to Java,
an object-oriented programming language, leading on to . . .
• COMP1110 using Java.
COMP 1100 — Introduction & Outline 2
Core Topics
• Types and data structures
• Control structures
• Abstraction
• Modularisation
Philosophy
• Data-directed program design
• Programming as a human activity
COMP 1100 — Introduction & Outline 3
Learning to program is a lot like learning a foreign language.
You must
practice, practice, practice . . .
COMP 1100 — Introduction & Outline 4
COMP1100 Web Site
http://cs.anu.edu.au/student/comp1100/
The main information resource and communication tool for the course.
Lecture notes, lab exercises, assignments, announcements, discussion
forums, etc.
Linked from the WebCT pages.
COMP 1100 — Introduction & Outline 5
Lectures
There will be at least 30 lectures, 3 each week:
• Monday 4–5pm
• Thursday 9–10am
• Friday 2–3pm
Attend all lectures — check web site regularly for schedule.
Lecture slides and sample programs on the web site.
Lecture recordings on WebCT pages.
COMP 1100 — Introduction & Outline 6
Laboratory Classes
10 two-hour weekly laboratory classes, beginning in week 2.
You must register in a lab class as soon as possible.
Registration is on-line at http://cs.anu.edu.au/streams/
Check your timetable to avoid clashes.
Logging on to StReaMS will automatically create an account for you on the
DCS Student System.
COMP 1100 — Introduction & Outline 7
DCS Student Computing Environment
Different to other labs and InfoPlace at ANU – (Linux and KDE).
Handouts:
• Student Computing Environment User Guide
• Student Computing Environment Familiarisation Exercises
Once you have a DCS student account, work through the familiarisation
exercises sometime this week.
This will help prepare you for the first supervised lab classes in week 2.
COMP 1100 — Introduction & Outline 8
Textbooks
Main textbook:
Haskell: The Craft of Functional Programming (2nd edition) – Simon
Thompson (Addison-Wesley)
Later in the semester:
Big Java (2nd edition) — Cay Horstmann (Wiley)
This is also the textbook for COMP1110 in second semester.
COMP 1100 — Introduction & Outline 9
Assessment
Assignments: 35% – three assignments
10% + 15% + 10% due near weeks 6, 9, 12
Lab participation: 5% – satisfactory completion of exercises
Mid-semester quiz: 10% – week 7, open book
redeemable against final exam
Final Exam: 50% – exam period, two A4 sheets of notes allowed
COMP 1100 — Introduction & Outline 10
Learning how to program
• There will seem to be an endless number of minor details to be
remembered. You can be a successful programmer without knowing
them all.
• There will be many frustrations.
• Computers won’t handle ambiguity.
• Abstraction: at different times in the process, different aspects are
(temporarily) irrelevant.
• Practice and experiment.
COMP 1100 — Introduction & Outline 11
What to do now:
• Make sure you have the four handouts:
– COMP1100 General Course Information
– Student Environment User Guide
– Student Environment Familiarisation Exercises
– DCS Student Handbook
• Check your timetable and register in lab classes at
http://cs.anu.edu.au/streams/
• Get the textbook and read Chapter 1
• Go to the DCS labs
Work through the familiarisation exercises sometime this week
COMP 1100 — Introduction & Outline 12
Questions?
COMP 1100 — Introduction & Outline 13
Computers, Programs, Programming Languages
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Computers, Programs, Programming Languages 1
Reminders from last lecture:
• Register in prac groups at http://cs.anu.edu.au/streams/.
• Make sure you have a copy of the Student Computing Environment User
Guide.
• Work through the Student Computing Environment familiarisation
exercises before your first prac class (week 2).
• Read chapter 1 of the textbook.
COMP 1100 — Computers, Programs, Programming Languages 2
What is a computer?
1940s: A human being — job description.
1950s: John von Neumann — stored program computer.
Not a calculator: Before von Neumann there were calculators which had a
store and a processor. The instructions to calculators are externally
controlled (by a human).
COMP 1100 — Computers, Programs, Programming Languages 3
Von Neumann’s brilliant idea:
• Store + processor
• Data and instructions in store: the instructions are the program.
• Two aspects to the processor:
– Fetch instructions, decode and execute the instructions
– Do the calculations to modify the data in the store
Once the instructions are loaded into the computer,
it can complete the calculation independently.
In contrast, calculators require separate control and supervision.
COMP 1100 — Computers, Programs, Programming Languages 4
Binary representation
In 1697, Leibniz discovered the binary numeral system and binary
computation.
(Binary: base-2 number representation. Digits 0 and 1 only.)
Convenient for electronic devices:
charge versus no charge; current versus no current.
All data is represented using binary digits (bits). E.g. in the ASCII encoding,
the character 'A' has the same representation as the integer 65.
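A small sketch of this idea in Haskell, assuming the standard function ord from Data.Char. The names toBits and bitsOfChar are made up for illustration, and toBits assumes a non-negative argument.

```haskell
import Data.Char (ord)

-- Binary digits of a non-negative integer, most significant bit first.
-- (Assumption: the argument is never negative.)
toBits :: Int -> [Int]
toBits 0 = [0]
toBits n = reverse (go n)
  where
    go 0 = []
    go k = k `mod` 2 : go (k `div` 2)

-- The bits behind a character: look up its integer code, then convert.
bitsOfChar :: Char -> [Int]
bitsOfChar c = toBits (ord c)
```

In GHCi, ord 'A' gives 65 and bitsOfChar 'A' gives [1,0,0,0,0,0,1], which is 65 written in base 2.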
COMP 1100 — Computers, Programs, Programming Languages 5
What is a programming language?
Writing programs as sequences of machine instructions in binary notation is
obviously very inconvenient for human programmers.
A programming language is a notation for expressing a program.
Fortran: The first widely used high-level programming language (late 1950s). Based closely on the
operations of a particular machine (IBM 704).
LISP: The second-oldest high-level language still in wide use (late 1950s). Based on the
lambda calculus: programs consist of evaluations of expressions and
applications of functions.
COMP 1100 — Computers, Programs, Programming Languages 6
Modern programming languages
Fortran and LISP are still widely used.
Programs are now much, much more complex —
languages are now (supposedly) designed to help manage that complexity.
There are hundreds of different programming languages – some general
purpose, most oriented towards a particular problem domain.
What is programming?
The human activity of designing and constructing the instructions to make a
computer achieve some particular task.
COMP 1100 — Computers, Programs, Programming Languages 7
Kinds of programming languages
Imperative: A sequence of commands —
Step-by-step description of intended changes to the computer store.
Declarative: Focus on what is to be computed rather than the details of
how this is achieved.
Functional: Subset of the declarative languages —
Computation is expressed as functions from input to output values.
Haskell is a functional language.
Object-oriented: Usually imperative, but not necessarily —
Structure programs around the idea of objects and messages.
Java is an imperative object-oriented language.
COMP 1100 — Computers, Programs, Programming Languages 8
Translation
A computer can only execute its own set of instructions.
No matter what language we use to write our programs, we eventually want
to run them on a computer. So our programs must be translated to the
instructions of the machine.
We use a program called a compiler to do the translation.
The input to the compiler is a program written in a high-level language. The
output is the ‘same’ program as machine instructions.
For Haskell we will be using the Glasgow Haskell Compiler, GHC.
COMP 1100 — Computers, Programs, Programming Languages 9
A Haskell program
HelloYou.hs is a little Haskell program. Don’t worry about trying to
understand it for now.
We can use GHC to translate HelloYou.hs to machine instructions. Type:
ghc HelloYou.hs
This will create some new files, including a.out which is the machine
language version of HelloYou.hs.
Run this program by typing a.out
(or possibly ./a.out).
COMP 1100 — Computers, Programs, Programming Languages 10
Why Haskell?
• It is easy to get started writing Haskell programs.
The simplest Java program requires you to know about I/O, libraries and
many syntactic details.
• GHC has an interactive interface that works more like a calculator. We
don’t need to write whole programs.
• It is easy to understand how Haskell programs work — they evaluate
expressions. To understand Java programs, first you need to understand
the underlying machine model.
COMP 1100 — Computers, Programs, Programming Languages 11
Why Haskell? (ctd)
• Problem-oriented data structures are easy to define in Haskell.
• Types are a fundamental organising principle in programming
languages. Haskell’s type system is among the most advanced and
well-designed of any programming language.
• The aim is not to become highly skilled Haskell programmers.
The aim is to learn some fundamental principles of programming.
COMP 1100 — Computers, Programs, Programming Languages 12
Introducing GHCi
Reading: Thompson Ch.2
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Introducing GHCi 1
GHCi
GHC has an interactive mode that we will use a lot in this course.
Start it up by typing ghci in a terminal window.
Your computer responds with some messages as it starts up, ending with:
Prelude>
This is a prompt. You can type in an expression at the prompt.
GHCi will print its value and then prompt for another expression.
The ‘language’ or notation in which you write the expression is Haskell.
COMP 1100 — Introducing GHCi 2
The expressions can be simple:
Prelude> 1+3-2
2
Prelude>
or not so simple:
Prelude> scanl (*) 1 [1..]
[1,1,2,6,24,120,720,5040,40320,362880,3628800, ...
GHCi knows about lots of functions and operators that can be used in
expressions. They are defined in the Standard Prelude which is just a
Haskell module (Prelude.hs) containing a collection of definitions.
COMP 1100 — Introducing GHCi 3
Example: Resting metabolic rates
Use GHCi to calculate your resting metabolic rate.
(How much energy you use being totally idle.)
From the CSIRO Total Wellbeing Diet book:
women: (655.1+9.56×weight+1.85×height−4.68×age)×4.2
men: (66.47+13.75×weight+5×height−6.76×age)×4.2
Weight in kg, height in cm, age in years. Result in kJ per day.
COMP 1100 — Introducing GHCi 4
Haskell scripts
A GHCi session is like a calculator, but using it is not really programming.
Programming in Haskell centres around defining functions in a script.
[A program is a particular kind of script. We’ll get to that later.]
A script is a file that contains definitions, declarations and comments.
By loading a script into GHCi, you can use the session to evaluate
expressions containing functions and operators that are defined in that
script, as well as those in the Prelude.
COMP 1100 — Introducing GHCi 5
Resting metabolic rate, revisited. . .
Write a Haskell script with functions to calculate resting metabolic rates.
An algebraic expression with unknowns corresponds directly to a function.
In the formula for males:
(66.47+13.75×weight+5×height−6.76×age)×4.2
weight, height and age represent values that we substitute into the formula
to do the calculation.
That is, they are supplied as arguments to a function which calculates
resting metabolic rate.
COMP 1100 — Introducing GHCi 6
If we call the function maleRestRate we can write its definition in Haskell
as an equation:
maleRestRate weight height age =
  (66.47+13.75*weight+5*height-6.76*age)*4.2
Create a Haskell script containing this definition.
If we load the script into GHCi, we can evaluate expressions that apply the
function maleRestRate to arguments:
*Main> maleRestRate 85 190 21
8581.691
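By analogy, the formula for women from the earlier slide can be written the same way. This is only a sketch: the name femaleRestRate and the Float type signature are assumptions, not taken from the lecture script.

```haskell
-- Resting metabolic rate for women, in kJ per day.
-- Arguments: weight in kg, height in cm, age in years.
-- (femaleRestRate is a hypothetical name chosen to match maleRestRate.)
femaleRestRate :: Float -> Float -> Float -> Float
femaleRestRate weight height age =
  (655.1 + 9.56*weight + 1.85*height - 4.68*age) * 4.2
```

For example, femaleRestRate 60 165 21 evaluates to roughly 6030 kJ per day.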
COMP 1100 — Introducing GHCi 7
A note on the form of a Haskell function definition:
The right-hand side of the definition is called the body and is just the formula
with unknowns:
(66.47+13.75*weight+5*height-6.76*age)*4.2
The left-hand side is called the head and consists of the names of the
function and its arguments:
maleRestRate weight height age
[You might have preferred maleRestRate(weight,height,age)
but you’ll get used to it . . . ]
COMP 1100 — Introducing GHCi 8
Note:
The textbook refers to another Haskell system called Hugs.
It is almost identical to GHCi but does not include some features of Haskell
that we will be using later in the course.
COMP 1100 — Introducing GHCi 9
Values, Functions, Types
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Values, Functions, Types 1
• Lab classes commence this week.
You should have registered by now.
• Lab class exercises are available on the comp1100 web site.
– Hard copies will not be handed out in lectures.
– You should at least read them and think about them before attending
your lab session.
– We expect you to come prepared.
• Workload: average 10 hours per week for 16 weeks.
– Attending lectures and labs is not enough.
– Doing (writing bits of programs) is much more effective than just
reading about it.
• Extra assistance — drop-in labs will commence soon.
COMP 1100 — Values, Functions, Types 2
Values
Values are things such as 42 (an integer), "Hello, it’s 2pm" (a string of
characters) and 3.1412 (a floating point number).
Values may be passed to functions which return other values, eg:
• (*) takes two numbers and produces a number: 13 * 5 ⇒ 65
• ++ takes two strings and produces another string by concatenating them.
• length takes a string and produces a number: the length of the string.
We can combine values and functions by using the result of a function
application as input to another function, e.g.
• length ("Hello, " ++ "it’s 2pm")
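The combination above can be tried as a tiny script. The names greeting and greetingLength are made up for this sketch.

```haskell
-- Join two strings, then pass the result on to length.
greeting :: String
greeting = "Hello, " ++ "it's 2pm"

greetingLength :: Int
greetingLength = length greeting
```

Here (++) produces "Hello, it's 2pm" first, and length then counts its 15 characters (spaces and punctuation included).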
COMP 1100 — Values, Functions, Types 3
Types
What’s the difference between 42 and "Hello, it’s 2pm"?
What does 2 + "abc" mean?
Values may naturally be grouped together in different categories.
We call these categories types. For example:
Type     Example values
Int      ..., -3, -2, -1, 0, 1, 2, 3, ...
Float    1.0, 3.1412, ...
String   "Hello", "The time is 11 o'clock!", ...
Char     'a', 'A', 'x', 'S', '%', '8', ...
Bool     True, False
COMP 1100 — Values, Functions, Types 4
GHCi can tell you the type of any Haskell expression with the command
:type (or just :t)
Prelude> :type 7 + 3 < 5
7 + 3 < 5 :: Bool
In Haskell we can also define our own types, and build compound types
by collecting types together into data structures.
Much more on this later . . .
COMP 1100 — Values, Functions, Types 5
Functions — canned computations
The central activity in writing Haskell programs is defining functions.
In Mathematics, a function f associates each member of a set A (the domain
of f) with a single member of a set B (the codomain of f) and we write
f : A → B
If a ∈ A then f(a) is the associated member of B.
Haskell uses this same concept and similar notation.
COMP 1100 — Values, Functions, Types 6
How to define our own functions?
Let’s start simple: a function whose result is the input number times 2.
We’ll call it double.
The algebraic expression x + x computes double x for any x, so our
function definition looks like:
double x = x + x
On the left of the = sign, double is the name of the function and x is the
name of the argument.
On the right of the = sign is the body of the function definition.
COMP 1100 — Values, Functions, Types 7
The result of applying the function double to an argument value is
computed by replacing all occurrences of x in the body by that value.
For example:
double 5 ⇒ 5 + 5 ⇒ 10
The arrows indicate steps in the computation.
Another example:
double (double 3) ⇒ double (3 + 3) ⇒ double 6
⇒ 6 + 6 ⇒ 12
COMP 1100 — Values, Functions, Types 8
Parentheses around arguments
One notational difference to be aware of is that in Haskell we can write f x
instead of f(x)
The brackets serve no real purpose. Leaving them out makes for less
cluttered notation, but it can take some getting used to . . .
Suppose you write: double 3+1
The spacing may suggest: double (3+1)
but in fact it means: (double 3)+1
The best way to remember this is that function application is just like any
other operator, but it has higher priority than all other operators and it is left
associative.
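A short script makes the difference concrete. The names applyFirst, bracketed and nested are hypothetical; double is the function defined earlier, given here with an Int type.

```haskell
double :: Int -> Int
double x = x + x

applyFirst :: Int
applyFirst = double 3 + 1      -- read as (double 3) + 1

bracketed :: Int
bracketed = double (3 + 1)     -- the argument is 3 + 1

nested :: Int
nested = double (double 3)     -- brackets are needed around the inner application
```

applyFirst is 7, bracketed is 8 and nested is 12. Writing double double 3 without brackets is rejected, because left associativity makes it (double double) 3.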
COMP 1100 — Values, Functions, Types 9
Type signatures
Functions take arguments of certain types and give results of certain types.
For example, double takes argument values of type Int and returns values
of type Int.
So functions have types, too. The type of double is Int -> Int.
The complete definition of the function is:
double :: Int -> Int
double x = x + x
[Without the type declaration, double 3.1412 also works, even though 3.1412 is not an Int. We’ll get to that later.]
COMP 1100 — Values, Functions, Types 10
Multiple arguments
In last week’s lectures we wrote function definitions for resting metabolic rate
which took 3 arguments: weight, height and age:
maleRestRate weight height age = ...
The type signature for functions with more than one argument separates the
argument types with (->):
maleRestRate :: Float -> Float -> Float -> Float
and we write applications: maleRestRate 85 190 21
rather than: maleRestRate(85,190,21)
COMP 1100 — Values, Functions, Types 11
Reading type signatures
maleRestRate :: Float -> Float -> Float -> Float
                weight   height   age      rate
                arguments (inputs)         result (output)
COMP 1100 — Values, Functions, Types 12
Another simple function definition:
add :: Int -> Int -> Int
add x y = x+y
We write applications: add 3 4
Haskell functions take their arguments one at a time.
It is legal to write an expression like add 3. (Try it.)
The value of the expression add 3 is a function of type Int -> Int
which adds 3 to its argument. (Try it.)
So add 3 4 is the same as (add 3) 4.
(Function application is left-associative.)
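One way to experiment with this is to give the partially applied function a name. addThree is a hypothetical name used only in this sketch.

```haskell
add :: Int -> Int -> Int
add x y = x + y

-- Supplying only the first argument leaves a function of type Int -> Int.
addThree :: Int -> Int
addThree = add 3
```

Both addThree 4 and (add 3) 4 evaluate to 7, the same as add 3 4. Typing add 3 on its own in GHCi reports an error, because GHCi has no way to print a function, but :type add 3 shows Int -> Int.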
COMP 1100 — Values, Functions, Types 13
Type checking
GHC will work out the type of every expression and every function from its
definition.
If you declare the type of a function, GHC will check whether you are right.
You should always declare the type of every function you define in your
programs:
• The type of a function is a basic part of its design.
• Type declarations are an important part of program documentation.
• Type declarations help you to find errors.
If GHC figures that the type of a function is different to what you expect,
then you have made an error.
COMP 1100 — Values, Functions, Types 14
Example
Suppose you want to design a function isEven such that isEven x
returns True if x is divisible by 2 and False otherwise.
What’s wrong with this:
isEven :: Int -> Bool
isEven x = x `div` 2
(What is a correct definition?)
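One possible answer, offered as a sketch rather than the official solution: the body above has type Int, but the declared result type is Bool, so GHC rejects it. Comparing the remainder with 0 gives a Bool.

```haskell
-- True exactly when x is divisible by 2.
isEven :: Int -> Bool
isEven x = x `mod` 2 == 0
```

With this definition isEven 4 is True and isEven 7 is False. The Prelude also provides a function even that does the same job.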
COMP 1100 — Values, Functions, Types 15
Introducing overloading
Values of type Int can be added using the (+) operator.
Values of type Float can be added using the (+) operator.
Using the same symbol or name for different operations is called
overloading.
The function (+) simultaneously has types:
(+) :: Int -> Int -> Int
(+) :: Float -> Float -> Float
But every Haskell expression has one type.
COMP 1100 — Values, Functions, Types 16
If we had type variables (and we do) we might say:
(+) has type a -> a -> a where a is either Int or Float
In fact Haskell allows (+) to be applied to any numeric type
(which includes Double and Integer as well as Int and Float)
The set of numeric types forms a type class called Num
If we ask GHCi for the type of (+) we get:
(+) :: (Num a) => a -> a -> a
which means (+) has type a -> a -> a
for any type a in class Num
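The same kind of type can be declared for our own functions. The following is a sketch; doubleAny is a hypothetical name, chosen so that it does not clash with the Int-only double.

```haskell
-- Works for every type a in class Num, because (+) does.
doubleAny :: Num a => a -> a
doubleAny x = x + x
```

doubleAny 5 gives 10 and doubleAny 1.5 gives 3.0, but doubleAny "abc" is a type error, since String is not in class Num.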
COMP 1100 — Values, Functions, Types 17
Comments in scripts
Haskell allows us to annotate scripts with comments. There are two kinds:
• everything on a line following the symbol --
• everything between the symbols {- and -}
The computer ignores comments but they may help humans reading our
programs — including ourselves — to understand them.
Programming is a human activity
COMP 1100 — Values, Functions, Types 18
Write your comments as you code
Write your comments at the program design phase and maintain them
through the development phase.
• Do not add them to your scripts later
• They are for your benefit, too
Always include an identifying banner comment in every script, including:
• author’s name
• date
• the purpose of the script
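A sketch of what such a banner might look like, using both kinds of comment. The file name, author and date are placeholders, not real details.

```haskell
{-
  Script:  RestRate.hs                 (placeholder file name)
  Author:  A. Student                  (placeholder)
  Date:    March 2006                  (placeholder)
  Purpose: Calculate resting metabolic rates.
-}

-- Resting metabolic rate for men, in kJ per day.
-- Arguments: weight in kg, height in cm, age in years.
maleRestRate :: Float -> Float -> Float -> Float
maleRestRate weight height age =
  (66.47 + 13.75*weight + 5*height - 6.76*age) * 4.2
```

The block comment identifies the script; the line comments record what the arguments and result mean, which the code alone does not say.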
COMP 1100 — Values, Functions, Types 19
Choosing names for functions, variables, etc.
People, not just computers, read programs.
During development, maintenance, testing, review, etc. you and other
programmers read your programs.
metabolicRestRate sex weight height age
is much better than:
rate sx wt ht yr
is much better than:
f x y z u
(The computer doesn’t care but humans do.)
COMP 1100 — Values, Functions, Types 20
Haskell lexical rules
• Haskell is case sensitive so restRATE is different from restrate.
• Names of functions, variables, type variables must begin with a lower
case letter
• Names can contain letters (upper and lower case), digits, underscores (_)
and apostrophes (')
• Names of types must begin with an upper case letter
• Names of “data constructors” must begin with an upper case letter.
So far we have only seen data constructors in enumerations. For
example, True and False are the data constructors of type Bool.
COMP 1100 — Values, Functions, Types 21
Suggestions
• The name of a function should describe what is being calculated (a
noun), rather than how it does it (a verb).
• Often names are made up of several words (or obvious contractions).
Make the name by stringing the words together, starting each new word
with a capital.
• Look (on the web) for some coding standards to see what others do.
• Whatever you do, develop a good style, and be consistent.
COMP 1100 — Values, Functions, Types 22
Conditionals and Tuples
Reading: Thompson Ch.3
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Conditionals and Tuples 1
Conditional Expressions
So far, all our example scripts have unconditionally performed the same
computation.
How would we implement a function: max :: Int -> Int -> Int
which returns the greater of its two arguments?
We want max x y to return x if x >= y, otherwise it should return y.
Haskell has conditional expressions to allow us to make such a choice:
if 〈condition〉 then 〈value if true〉 else 〈value if false〉
so we can write:
max x y = if x >= y then x else y
COMP 1100 — Conditionals and Tuples 2
Example evaluation:
max 5 3 ⇒ if 5 >= 3 then 5 else 3 ⇒ if True then 5 else 3 ⇒ 5
Sometimes the choice is not so simple:
signum :: Int -> Int
signum x = if x<0 then -1 else if x==0 then 0 else 1
Nested conditionals can be hard to read, but layout can help:
signum x = if x<0 then -1
           else if x==0 then 0
           else 1
COMP 1100 — Conditionals and Tuples 3
Guards
Some languages (Haskell included) have guarded expressions as an
alternative notation for conditionals:
signum :: Int -> Int
signum x
  | x < 0 = -1
  | x == 0 = 0
  | x > 0 = 1
GHC evaluates each guard in turn, first to last until it finds one that equals
True. The right hand side that corresponds to that guard is chosen.
If none of the guards are true, GHC will report an error (at run time).
COMP 1100 — Conditionals and Tuples 4
Haskell provides a special guard otherwise that is always true.
(Experiment: evaluate otherwise in GHCi.)
otherwise is used as a catch-all at the end of a sequence of alternatives:
min x y
  | x <= y = x
  | otherwise = y
Another example: How many premiership points does a team get, given the
score at the end of the match?
points :: Int -> Int -> Int
points for against
  | for > against = 2
  | for == against = 1
  | otherwise = 0
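To see why the catch-all matters, here is a sketch that sets a complete definition beside a deliberately incomplete one. signNoZero is a made-up name for illustration only.

```haskell
-- Every case is covered: the last guard catches whatever is left.
points :: Int -> Int -> Int
points for against
  | for > against  = 2
  | for == against = 1
  | otherwise      = 0

-- Deliberately incomplete: no guard is True when x is 0,
-- so signNoZero 0 fails with an error at run time.
signNoZero :: Int -> Int
signNoZero x
  | x < 0 = -1
  | x > 0 = 1
```

points 3 1 is 2, points 2 2 is 1 and points 0 4 is 0. signNoZero 5 is 1, but signNoZero 0 stops with an error. Evaluating otherwise in GHCi shows that it is simply True.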
COMP 1100 — Conditionals and Tuples 5
Tuples — Combining Different Data
To represent real-world data, we often want to combine types. For example,
an item in a supermarket may need a bar code, a name and a price.
Haskell lets us combine any n types into an ordered n-tuple.
(723476, "Peanut Butter", 375)
has type:
(Int, String, Int)
Notice that the type is written in a way that corresponds to the way we
write the expressions.
COMP 1100 — Conditionals and Tuples 6
Tuple Patterns
Functions on tuples are usually defined using pattern matching.
For example, here is a function to add a pair of Ints:
addPair :: (Int, Int) -> Int
addPair (x,y) = x + y
The pattern (x,y) matches any pair and sets x to be the first element
of the pair and y to be the second element.
COMP 1100 — Conditionals and Tuples 7
For example, representing cartesian coordinates:
type Point = (Int, Int)
-- The origin of the coordinate system
origin :: Point
origin = (0,0)
-- Move a point a distance to the right
moveRight :: Point -> Int -> Point
moveRight (x,y) distance = (x+distance, y)
-- Move a point upwards
moveUp :: Point -> Int -> Point
moveUp (x,y) distance = (x,y+distance)
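These functions can be combined, since each one returns a Point that the next can take as its argument. In this sketch the name destination is hypothetical; the other definitions repeat the script above so that it loads on its own.

```haskell
type Point = (Int, Int)

origin :: Point
origin = (0, 0)

moveRight :: Point -> Int -> Point
moveRight (x, y) distance = (x + distance, y)

moveUp :: Point -> Int -> Point
moveUp (x, y) distance = (x, y + distance)

-- Start at the origin, move 3 to the right, then 4 upwards.
destination :: Point
destination = moveUp (moveRight origin 3) 4
```

destination evaluates to (3,4). The brackets are needed so that the result of moveRight origin 3 is passed to moveUp as a single argument.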
COMP 1100 — Conditionals and Tuples 8
Definitions with where clauses
It is often possible to simplify an expression by extracting some part and
naming it. This can help in two ways: putting a name to a sub-expression
can make it easier to understand; a repeated sub-expression need only be
evaluated once.
Suppose we wanted to compute the real roots of a quadratic:
$ax^2 + bx + c = 0$
The standard formula is
$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
COMP 1100 — Conditionals and Tuples 9
The formula $b^2 - 4ac$ is the discriminant.
The discriminant must be ≥ 0 for the roots to be real.
roots :: Float -> Float -> Float -> (Float, Float)
roots a b c
  | discrim >= 0 = ((-b + (sqrt discrim))/(2*a),
                    (-b - (sqrt discrim))/(2*a))
  | otherwise = error "No real roots"
  where discrim = b^2 - 4*a*c
(This isn’t a particularly satisfactory design: a quadratic may have 0, 1 or 2 real
roots. We may revisit this example later.)
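A quick check of the definition, as a sketch. The name example is hypothetical, and the quadratic is chosen so that the arithmetic comes out exactly.

```haskell
roots :: Float -> Float -> Float -> (Float, Float)
roots a b c
  | discrim >= 0 = ((-b + sqrt discrim) / (2*a),
                    (-b - sqrt discrim) / (2*a))
  | otherwise    = error "No real roots"
  where
    discrim = b^2 - 4*a*c   -- named once, used three times

-- x^2 - 3x + 2 factorises as (x - 1)(x - 2).
example :: (Float, Float)
example = roots 1 (-3) 2
```

example evaluates to (2.0,1.0). The negative coefficient must be bracketed: roots 1 -3 2 would be read as a subtraction.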
COMP 1100 — Conditionals and Tuples 10
Layout of function definitions
In any programming language, the layout of your program is important for
its readability.
In Haskell, layout rules help to get rid of the annoying punctuation used in
many other languages (semicolons, braces, etc.).
Haskell uses indentation to decide the ends of definitions, expressions and
so on.
Once you get into good habits, it will be very natural.
When you are learning, you might need to be a bit careful.
(Emacs Haskell mode helps with indentation. Hit the TAB key a few times. . . )
COMP 1100 — Conditionals and Tuples 11
The Off-Side Rule
A definition ends when a (non-space) symbol appears in the same column
as the first symbol of the definition.
See textbook pages 47 & 48.
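A minimal sketch of the rule in action. The names total and quarter are made up for this example.

```haskell
total :: Int
total = 1 + 2
      + 3 + 4        -- indented, so still part of the definition of total

quarter :: Int       -- back in the first column: a new definition starts here
quarter = total `div` 4
```

total is 10 and quarter is 2. If the continuation line "+ 3 + 4" started in the first column instead, GHC would treat it as the start of a new definition and report a syntax error.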
COMP 1100 — Conditionals and Tuples 12
Recommended layout
Something like:
fun p_1 p_2 .. p_n
  | guard_1 = e_1
  | guard_2 = e_2
  . . .
  | guard_k = e_k
  where
    local_1 a_1 .. a_m = r_1
    local_2 = r_2
    . . .
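A concrete instance of this layout, as a sketch. The function ringArea and its local names are invented for illustration; piApprox is a rough constant used instead of the Prelude's pi.

```haskell
-- Area of the ring between two circles, given their radii in either order.
ringArea :: Float -> Float -> Float
ringArea r1 r2
  | r1 >= r2  = circle r1 - circle r2
  | otherwise = circle r2 - circle r1
  where
    circle r = piApprox * r * r   -- local function with one argument
    piApprox = 3.1416             -- local constant
```

ringArea 2 1 and ringArea 1 2 both give approximately 9.4248. The guards line up under each other, and everything in the where clause is indented further than the guards.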
COMP 1100 — Conditionals and Tuples 13
How To . . .
Reading: Thompson Ch.4
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — How To . . . 1
How To Design a Function
We will go through some very basic ideas about how to come up with a
function definition to solve a particular problem.
When learning to program, the biggest hurdle is knowing how and where to
start.
The running example will be to design and define a function to convert
integers to English phrases corresponding to how we say numerals.
(The script we are going to develop will be different to NumWords.hs used in
Week 2 lab classes.)
COMP 1100 — How To . . . 2
Understand the Problem
Analyse the problem.
Think about all the details. Is the problem fully specified?
For example, we want our program to say numerals properly, not like
telephone numbers etc.
Do we want it to work for all integers? No matter how big? Negative as well
as positive?
There is no right answer to these questions. It’s up to the person specifying
the requirements — that is, the “customer.”
For this exercise, we will only deal with whole numbers between 0 and 100.
(It will be fairly easy to extend, once we have finished.)
COMP 1100 — How To . . . 3
What is a good name for the function?
I chose convert.
Not a very informative name — can you suggest a better one?
What is the Function’s Type?
What are its inputs and outputs?
convert :: Int -> String
COMP 1100 — How To . . . 4
Abstraction — Reducing Complexity
You can’t figure it out all at once!
This is the KEY ACTIVITY in software development!
1. Does the problem break down into smaller parts?
2. Is there a simpler version you can do first?
3. Have you seen a similar or related problem?
COMP 1100 — How To . . . 5
For our exercise, thinking about (1) and (2) in the list, we can start by
working on the problem of converting single digit numerals: 0 to 9.
(Why do I think that will help? Common sense and experience . . . )
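A first attempt at the simpler problem might look like the following sketch. The name convertDigit is hypothetical and the script developed in lectures may well differ; it uses only guards, which we have already seen.

```haskell
-- Convert a single digit (0 to 9) to its English word.
-- convertDigit is a made-up helper name, not taken from the lecture script.
convertDigit :: Int -> String
convertDigit n
  | n == 0    = "zero"
  | n == 1    = "one"
  | n == 2    = "two"
  | n == 3    = "three"
  | n == 4    = "four"
  | n == 5    = "five"
  | n == 6    = "six"
  | n == 7    = "seven"
  | n == 8    = "eight"
  | n == 9    = "nine"
  | otherwise = error "convertDigit: not a single digit"
```

convertDigit 7 gives "seven". Once this part works, a version of convert for larger numbers can use it without worrying about how it works.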
COMP 1100 — How To . . . 6
What Tools Do We Have?
• What does the programming language give you that might be
useful?
– In Haskell, the Prelude functions and other libraries.
– (This knowledge improves with practice.)
• Do you know of any other programs or functions that may be
similar or otherwise useful?
– I routinely cut and paste code from programs I have written earlier.
– I routinely look at old programs for clues and details.
COMP 1100 — How To . . . 7
Test As You Go!
Writing a whole program means writing lots of different self-contained bits of
code (functions) . . .
because we have broken the problem down into simpler parts, or . . .
because we have tackled a simpler problem first.
Make sure those parts work correctly before proceeding!
COMP 1100 — How To . . . 8
ABSTRACTION
The key to reducing complexity is to focus on a well-defined and well-understood sub-problem . . .
while temporarily ignoring the rest of the problem.
When we have solved that sub-problem, we can forget about how it works and only think about what it does.
COMP 1100 — How To . . . 9
Lists
Reading: Thompson Ch.5
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Lists 1
Lists
Tuples allow us to combine a fixed number of values of various types.
Lists allow us to combine a varying number of values of the same type.
1, 2, 3 are all of type Int, so [1, 2, 3] is of type [Int].
(Say [Int] as “list of Int.”)
String is a synonym for [Char].
"dog" is another way to write ['d', 'o', 'g'].
[[1,2,3],[1,99],[0],[]] has type [[Int]].
That is, “list of list of Int”.
We can have lists of ANY type, so long as all elements are the same type.
COMP 1100 — Lists 2
Which of the following are valid lists?
What are their types?
[1,'2']
Invalid because the elements of a list must all be of the same type.
1 :: Int but '2' :: Char
[[1,2],3,4]
Invalid because the first element is [1,2] :: [Int] and the other
elements have type Int. This is not a valid list because the elements do not
all have the same type.
["cat","sat","on","mat"]
Valid and has type [String] or equivalently, [[Char]].
COMP 1100 — Lists 3
[(1,2),(1,2,3)]
Invalid because the elements are of different types: (Int,Int) and
(Int,Int,Int) respectively.
[(10,'a'),(3,'x'),(42,'m')]
Valid and has type [(Int,Char)].
[True,[False]]
Invalid because Bool and [Bool] are different types.
[[1],[]]
Valid expression of type [[Int]].
Try some experiments for yourself.
COMP 1100 — Lists 4
Constructing Lists
The cons function
Lists are constructed by adding elements to the front:
42:[7,65,3] =⇒ [42,7,65,3]
EVERY LIST is constructed by a sequence of elements cons-ed onto the
front of the empty list: [].
[42,7,65,3] is another notation for 42:7:65:3:[]
The cons operator can only add elements at the front.
[7,65,3]:42 =⇒ error!
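The points above can be checked with a short script. The names numbers, spelledOut and atEnd are made up for this sketch.

```haskell
numbers :: [Int]
numbers = 42 : [7, 65, 3]          -- cons onto the front of an existing list

spelledOut :: [Int]
spelledOut = 42 : 7 : 65 : 3 : []  -- the same list, built up from the empty list

-- To add at the end, put the element in a list of its own and join with (++).
atEnd :: [Int]
atEnd = [7, 65, 3] ++ [42]
```

numbers and spelledOut are both [42,7,65,3], and atEnd is [7,65,3,42]. The expression [7,65,3]:42 is rejected because the right-hand argument of (:) must be a list.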
COMP 1100 — Lists 5
Other Useful List Functions
Concatenate: Join lists together using (++):
[3,5,7] ++ [7,8,9] =⇒ [3,5,7,7,8,9]
Index: Select elements from a list using (!!):
['a','b','c','d'] !! 2 =⇒ 'c'
Indexes start from 0.
Head and tail: Together, the dual of (:)
head ['a','b','c','d'] =⇒ 'a'
tail ['a','b','c','d'] =⇒ ['b','c','d']
Remember "abcd" is the same as ['a','b','c','d'].
COMP 1100 — Lists 6
More Useful List Functions
Length: The length function:
length [42,7,65,3] =⇒ 4
Sum, product: Add or multiply the elements of a list of numbers:
sum [2,3,4] =⇒ 9
product [2,3,4] =⇒ 24
Reverse: Reverse the order of list elements:
reverse [42,7,65,3] =⇒ [3,65,7,42]
Lots more in the Prelude and the List library.
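The functions on the last two slides can be tried out together. In this sketch only the Prelude functions are standard; the names given to the results are illustrative.

```haskell
-- Each definition applies a Prelude list function to a sample argument.
letters :: String
letters = "abcd"

third :: Char
third = letters !! 2          -- indexes start from 0

firstLetter :: Char
firstLetter = head letters

restLetters :: String
restLetters = tail letters

-- head and tail undo (:): consing them back together rebuilds the list.
rebuilt :: String
rebuilt = head letters : tail letters

total, multiplied :: Int
total      = sum [2, 3, 4]
multiplied = product [2, 3, 4]
```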
COMP 1100 — Lists 7
Arithmetic Progressions
We sometimes want a list of values in an arithmetic progression.
Haskell provides a special syntax:
[5..10] =⇒ [5, 6, 7, 8, 9, 10]
To specify step sizes other than 1, give the first 2 numbers. For example:
[1, 3..10] =⇒ [1, 3, 5, 7, 9]
[0, 10..50] =⇒ [0, 10, 20, 30, 40, 50]
Remember that only arithmetic progressions work.
[2, 4, 8..128] is illegal.
The notation works for other types in Haskell such as Float and Char.
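A few more progressions, as a sketch with illustrative names. The last one shows one possible way to obtain a geometric sequence: it uses map, a Prelude function that applies a function to every element of a list.

```haskell
-- A descending progression: the first two numbers fix the step (-2).
countdown :: [Int]
countdown = [10, 8 .. 1]

-- The notation also works for characters.
lowerAtoE :: String
lowerAtoE = ['a' .. 'e']

-- A geometric progression is not a range, but it can be built by
-- transforming each element of an arithmetic one.
powersOfTwo :: [Int]
powersOfTwo = map (2 ^) [1 .. 7]
```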
COMP 1100 — Lists 8
Example: roots of a quadratic, revisited
A quadratic may have 0, 1 or 2 real roots.
In the earlier version, we raised an error if there were no real roots, and
returned a pair containing the same value twice if there was one.
Alternatively, we can return a list of 0, 1 or 2 elements, depending on the
discriminant.
COMP 1100 — Lists 9
roots :: Float -> Float -> Float -> [Float]
roots a b c
  | discrim == 0 = [ -b/(2*a) ]
  | discrim > 0  = [ (-b + sqrt discrim)/(2*a),
                     (-b - sqrt discrim)/(2*a) ]
  | otherwise    = []
  where
    discrim = b^2 - 4*a*c
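The definition can be exercised with one quadratic for each of the three cases. This sketch repeats roots, using the single name discrim for the discriminant, so that it loads on its own. The example coefficients and result names are illustrative.

```haskell
roots :: Float -> Float -> Float -> [Float]
roots a b c
  | discrim == 0 = [ -b / (2*a) ]
  | discrim > 0  = [ (-b + sqrt discrim) / (2*a)
                   , (-b - sqrt discrim) / (2*a) ]
  | otherwise    = []
  where
    discrim = b^2 - 4*a*c

-- x^2 - 3x + 2 = (x - 1)(x - 2): two real roots.
twoRoots :: [Float]
twoRoots = roots 1 (-3) 2

-- x^2 + 2x + 1 = (x + 1)^2: one repeated root.
oneRoot :: [Float]
oneRoot = roots 1 2 1

-- x^2 + 1: no real roots.
noRoots :: [Float]
noRoots = roots 1 0 1
```

A caller can use length on the result to find out how many real roots there are.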
COMP 1100 — Lists 10
Polymorphism
What are the types of (:), (++), (!!), length . . . ?
Since a list can have elements of any type, (:) has to work for any type.
(Infinitely many of them.)
Notice that all these functions change or inspect the structure of the list
without touching the elements.
In other words, you don’t need to know what the list elements are to work out
its length:
length [ □, □, □, □, □ ] =⇒ 5
COMP 1100 — Lists 11
Functions like these are generic as they work on lists of any type. We call
them polymorphic functions, and we write their types using type variables.
A single definition of a polymorphic function is sufficient for all types.
For example, a function to return the first element of a pair can be defined
like this:
fst (x,y) = x
and works in exactly the same way, no matter what the types of x and y are.
(In languages without polymorphism, you would have to write separate definitions for
fst for each different pair of element types that you wanted to handle. Each time the
code would be identical, possibly apart from a type declaration.)
COMP 1100 — Lists 12
Polymorphic Types
How do we write the types of polymorphic functions? With type variables:
(:) :: a -> [a] -> [a]
(++) :: [a] -> [a] -> [a]
length :: [a] -> Int
fst :: (a,b) -> a
We say length has type [a] -> Int for all types a.
When we apply a polymorphic function, all occurrences of a type variable are
instantiated to the same type, so (++) can have types like:
(++) :: [Int] -> [Int] -> [Int]
(++) :: [Char] -> [Char] -> [Char]
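As a sketch of instantiation, here is a small polymorphic function with two type variables. swapPair is a hypothetical example, not a Prelude function.

```haskell
-- One definition serves every choice of the type variables a and b.
swapPair :: (a, b) -> (b, a)
swapPair (x, y) = (y, x)

-- Here a is instantiated to Int and b to Char.
swappedNumChar :: (Char, Int)
swappedNumChar = swapPair (1, 'x')

-- Here a is instantiated to String and b to Bool.
swappedStrBool :: (Bool, String)
swappedStrBool = swapPair ("yes", True)
```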
COMP 1100 — Lists 13
Polymorphism or Overloading?
Recall that (+) works on different types: Int, Float, etc.
We said (+) was overloaded and had type
(+) :: Num a => a -> a -> a
What’s the difference between polymorphism and overloading?
A polymorphic function has a single definition that works on all
instances.
An overloaded function has different definitions for different types.
COMP 1100 — Lists 14
Polymorphism or Overloading? (ctd)
The definition of fst above works for any types a and b.
In contrast, the equality operator (==) has a different definition for different
types. Equality on Int is a built in operation, but equality on pairs depends
on there being an equality operation on its component types, and has a
definition:
(x,y) == (a,b) = (x == a) && (y == b)
There are further, different definitions of (==) for lists, triples, and so on.
Not every type has an equality operation.
There is a class of types that have equality operations. The type of (==) is:
(==) :: Eq a => a -> a -> Bool
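A function that uses (==) on its arguments inherits the Eq constraint in its own type. The sketch below uses a hypothetical helper called member (the Prelude provides the same behaviour under the name elem).

```haskell
-- member compares elements with (==), so its type needs the Eq a
-- constraint: it works for any element type that has equality.
member :: Eq a => a -> [a] -> Bool
member _ []     = False
member y (x:xs) = x == y || member y xs

-- pairsEqual relies on the definition of (==) for pairs, which in turn
-- uses (==) on Int and on Char.
pairsEqual :: Bool
pairsEqual = (1 :: Int, 'a') == (1, 'a')
```

Applying member to a list of functions would be rejected by the type checker, because function types have no equality operation.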
COMP 1100 — Lists 15
Repetition, Recursion, Induction
Reading: Thompson Ch.7
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Repetition, Recursion, Induction 1
Repetition
So far all our functions have used a fixed number of operations.
There are some functions that require a different number of operations,
depending on the input.
An obvious example is the factorial function:
factorial n = 1 × 2 × ··· × n
Another is the standard sum function that sums the elements of a list of
numbers:
sum [x1, x2, ..., xn] =⇒ x1 + x2 + ··· + xn
How can we define such functions in Haskell?
COMP 1100 — Repetition, Recursion, Induction 2
Observations:
• The number of operations to calculate both factorial and sum depends
on the input (n in this case);
• A certain computation (× in factorial, + in sum) is repeatedly used.
Recursion
The fundamental way to implement this kind of repetition is by a
programming technique called recursion.
It is intimately related to the mathematical idea of induction.
COMP 1100 — Repetition, Recursion, Induction 3
Calculating factorial n
The ellipsis (. . .) in the description of factorial gives us the general idea, but
how do we write a precise definition?
Looking for clues:
fact 0 = 1
fact 1 = 1
fact 2 = 1 * 2
fact 3 = 1 * 2 * 3
fact 4 = 1 * 2 * 3 * 4
fact 5 = 1 * 2 * 3 * 4 * 5
...
COMP 1100 — Repetition, Recursion, Induction 4
See the pattern?
fact 5 = 5 * fact 4
fact 4 = 4 * fact 3
fact 3 = 3 * fact 2
fact 2 = 2 * fact 1
fact 1 = 1 * fact 0
fact 0 = 1
...
Generalising all except the fact 0 case, they are of the form:
fact n = n * fact (n-1)
COMP 1100 — Repetition, Recursion, Induction 5
Combining the special case and the general case:
fact 0 = 1
fact n = n * fact (n-1)
Which is a correct Haskell definition — Too easy!
Notice the correspondence to mathematical induction:
• The definition has a base case, fact 0, and
• a step case, fact n, defined in terms of fact (n-1)
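A sketch of the same definition with a type signature added. The choice of Integer and the explicit check for negative arguments are assumptions beyond what the slide shows.

```haskell
-- Base case and step case, as on the slide. Integer is an assumption
-- here; it avoids overflow for large n. Negative arguments are
-- rejected explicitly, since otherwise the recursion would never
-- reach the base case.
fact :: Integer -> Integer
fact 0 = 1
fact n
  | n > 0     = n * fact (n - 1)
  | otherwise = error "fact: negative argument"
```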
COMP 1100 — Repetition, Recursion, Induction 6
Another example
The standard prelude has a function
replicate :: Int -> a -> [a]
where replicate n x gives us a list of n occurrences of x:
replicate 0 x = [] = []
replicate 1 x = [x] = x : []
replicate 2 x = [x,x] = x : x : []
replicate 3 x = [x,x,x] = x : x : x : []
replicate 4 x = [x,x,x,x] = x : x : x : x : []
...
COMP 1100 — Repetition, Recursion, Induction 7
See the pattern?
replicate 4 x = x : replicate 3 x
replicate 3 x = x : replicate 2 x
replicate 2 x = x : replicate 1 x
replicate 1 x = x : replicate 0 x
replicate 0 x = []
So the definition is
replicate 0 x = []
replicate n x = x : replicate (n-1) x
(Again, notice the induction on n.)
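With the two equations above, a negative count never reaches the base case. The sketch below is one possible variant that stops for any count of zero or less, which matches how the Prelude's replicate behaves. The name copies is hypothetical, chosen to avoid clashing with replicate.

```haskell
-- copies n x builds a list of n occurrences of x.
-- Counts of zero or less give the empty list.
copies :: Int -> a -> [a]
copies n x
  | n <= 0    = []
  | otherwise = x : copies (n - 1) x
```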
COMP 1100 — Repetition, Recursion, Induction 8
Lists are Recursive Structures
Every list is constructed by a sequence of elements cons-ed onto the
empty list.
[x1, x2, x3, ..., xn] = x1 : (x2 : (x3 : ... : (xn : [])))
A list is either:
1. the empty list []
2. constructed by cons-ing an element onto the front of another list
COMP 1100 — Repetition, Recursion, Induction 9
List Patterns
Every list has either of the forms:
• []
• (x:xs) where x is an element and xs is a list
Last week we used pattern matching to define functions on tuples:
fst :: (a,b) -> a
fst (x,y) = x
When we evaluate an expression like:
fst (10,66)
10 is bound to x, and 66 is bound to y in the definition of fst.
COMP 1100 — Repetition, Recursion, Induction 10
We can similarly use list patterns to define functions on lists:
head :: [a] -> a
head (x:xs) = x
tail :: [a] -> [a]
tail (x:xs) = xs
When we evaluate
tail [11,12,13,4,5]
11 is bound to x, and [12,13,4,5] is bound to xs,
so the result (xs) is [12,13,4,5]
(Notice that neither head nor tail has a case for []. They are partial functions.)
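One way to avoid partiality is to add a clause for [] and decide what it should return. The two helpers below are hypothetical sketches of that idea, not Prelude functions.

```haskell
-- headOr is a total variant of head: it returns a caller-supplied
-- default when the list is empty.
headOr :: a -> [a] -> a
headOr def []    = def
headOr _   (x:_) = x

-- tailOrEmpty treats the tail of the empty list as empty.
tailOrEmpty :: [a] -> [a]
tailOrEmpty []     = []
tailOrEmpty (_:xs) = xs
```

Both list patterns are covered, so neither function can fail with a pattern match error.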
COMP 1100 — Repetition, Recursion, Induction 11
There are two basic list patterns, so in general functions over lists will
use both of them.
null :: [a] -> Bool
null [] = True
null (x:xs) = False
If we evaluate
null [11,12,13,4,5]
Haskell will first attempt to match [11,12,13,4,5] to the pattern [] in
the first clause of the definition and fail. Haskell will then attempt to match it
to the pattern (x:xs) in the second clause and succeed (as in the
previous slide). Hence the result will be the right hand side of the successful
clause: False.
COMP 1100 — Repetition, Recursion, Induction 12
Recursive List Functions
Return to our initial question: how to define a function to sum the elements
of a list of numbers.
Compare the sum calculation with the structure of the list:
x1 + (x2 + (x3 + ... (xn + 0 )))
x1 : (x2 : (x3 : ... (xn : [])))
(I admit choosing the parentheses and including the “+ 0” to make this look right.)
Notice the direct correspondence:
• instead of [] we have 0
• instead of (:) we have (+)
COMP 1100 — Repetition, Recursion, Induction 13
The list patterns immediately suggest a starting point:
sum [] = ...
sum (x:xs) = ...
The correspondence on the previous slide suggests the right hand sides.
The sum function substitutes 0 for [] and (+) for (:) to give
sum :: Num a => [a] -> a
sum [] = 0
sum (x:xs) = x + sum xs
COMP 1100 — Repetition, Recursion, Induction 14
The evaluation of a call of sum proceeds as follows:
sum [2,5,7,1] =⇒ 2+sum [5,7,1]
=⇒ 2+(5+sum [7,1])
=⇒ 2+(5+(7+sum [1]))
=⇒ 2+(5+(7+(1+sum [])))
=⇒ 2+(5+(7+(1+0)))
=⇒ 15
COMP 1100 — Repetition, Recursion, Induction 15
Another Example
Define a function to multiply every element of a list of numbers by 2.
Compare the double calculation with the structure of the list:
x1*2 : x2*2 : x3*2 : ... : xn*2 : []
x1 : x2 : x3 : ... : xn : []
Notice the direct correspondence:
• instead of each xi we have xi*2
• where we had [] we still have []
COMP 1100 — Repetition, Recursion, Induction 16
The list patterns suggest a starting point:
double [] = ...
double (x:xs) = ...
The correspondence gives us the right hand sides:
double [] = []
double (x:xs) = x*2 : double xs
Applying the same function to every element of a list is a very common
activity.
This pattern of recursion is coded up as the map function. (More later.)
COMP 1100 — Repetition, Recursion, Induction 17
Modules and Assignment 1
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Modules & PPM 1
Announcements
• Assignment 1 is now available.
Due date: 12 noon, Monday 3rd April 2006.
See the COMP1100 web site.
• Reminder — Drop-in lab sessions.
For help with course material, practical exercises, assignments, etc.
– Monday 10-11am in N116
– Thursday 10-11am in N116
COMP 1100 — Modules & PPM 2
Modules in Haskell
Programs can grow to hundreds of thousands of lines of code.
How can we manage such complexity?
ABSTRACTION!
Decompose the system into smaller strongly related modules.
Aim: details of each module are only relevant within that module.
Separation of concerns: HOW versus WHAT.
All(?) programming languages provide a mechanism for splitting a program
into separate modules.
COMP 1100 — Modules & PPM 3
Benefits of Good Modularisation
• Structuring tool to simplify program design
• Simultaneous development of modules by multiple teams
• Parts of the program can be understood in isolation
• Easier to change parts of the program without affecting others
• Code can be re-used — e.g. the Haskell library modules
COMP 1100 — Modules & PPM 4
Modular Design
The fundamental activity in the software design process. But how?
• Each module should correspond to a single abstraction
• Each module should have a clearly defined purpose
• It should be possible to understand each module in isolation
• Modules should be easy to test individually
In most cases, modules should correspond to an abstract data type.
That is, a type that models some aspect of the problem domain.
COMP 1100 — Modules & PPM 5
Modules in Haskell
A module in Haskell is a script where the first line of code is:
module Pixel where
The name of this module is Pixel. Module names begin with a capital letter.
In general, modules should be in files with the same name.
The Pixel module should be in a file called Pixel.hs.
To use the definitions of one module in another, it must be imported:
module PPM where
import Pixel
COMP 1100 — Modules & PPM 6
Example – Assignment 1
Assignment 1: build some computer image manipulation tools:
rotate, flip, desaturate . . .
Step 1: Analyse the Problem
How are images represented on computers?
Computers represent images by a grid of coloured dots, called pixels.
The pixels are so small that, when displayed on a computer screen, they
appear to merge into a smooth image.
COMP 1100 — Modules & PPM 7
Abstract Data Types
Immediately from a simple problem analysis we can infer:
• The problem involves a type of things called Pixel
• The problem involves a type of things called Image
• Pixels have a type of things called Colour
How do we represent these abstract data types in Haskell?
COMP 1100 — Modules & PPM 8
Colours
There are many ways to represent colours. A simple one is RGB.
To specify a colour, you say how much red, how much green, and how much
blue is in the colour, and the usual range is 0 . . . 255 for each value.
Black is 0 red, 0 green, 0 blue.
White is 255, 255, 255.
Red is 255, 0, 0.
COMP 1100 — Modules & PPM 9
Pixel Module
To represent the pixel abstract data type, what goes in module Pixel?
• A type for representing the values of pixel type
• All the core functions operating on pixels
COMP 1100 — Modules & PPM 10
Representing Images
A “grid” of pixels . . . ?
In Haskell:
Each row can be a list of pixels.
Each image can be a list of rows.
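A minimal sketch of how these ideas might look as Haskell types. Every name and representation here is an assumption made for illustration; the modules supplied with the assignment may well differ.

```haskell
-- Assumed representations, for illustration only.
type Colour = (Int, Int, Int)   -- (red, green, blue), each 0..255
type Pixel  = Colour
type Row    = [Pixel]
type Image  = [Row]

black, white, red :: Colour
black = (0, 0, 0)
white = (255, 255, 255)
red   = (255, 0, 0)

-- A 2x2 image: a list of two rows, each a list of two pixels.
tiny :: Image
tiny = [ [red,   white]
       , [black, red  ] ]

-- Flipping top-to-bottom reverses the order of the rows.
flipVertical :: Image -> Image
flipVertical img = reverse img

-- Height and width follow from the list structure.
height :: Image -> Int
height img = length img

width :: Image -> Int
width []      = 0
width (row:_) = length row
```

With this representation, many image operations reduce to ordinary list functions.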
COMP 1100 — Modules & PPM 11
PPM File Format
As well as our internal Haskell image representation, we need to deal with
the external representation of images in files.
There are many file formats (jpeg, gif, eps, psd, pdf, . . . )
We will use a very (very) simple one: Portable Pixel Map (.ppm)
COMP 1100 — Modules & PPM 12
PPM Module
The PPM module will include the (internal) representation of images.
Since images are made from pixels, module PPM will import Pixel
Core operations on images include:
• decoding PPM file format to values of type Image
• encoding values of type Image to PPM file format
COMP 1100 — Modules & PPM 13
Image Transformation Modules
Each image transformation tool will be developed in a separate module.
User Interaction Module
Another module (module Main) contains the user interaction:
get an input file; apply the transformation; produce an output file.
(A whole program at last!)
Test Harness
module TestBed allows us to test our image transformation tools more
interactively and more conveniently than the Main module allows . . .
COMP 1100 — Modules & PPM 14
Understanding Recursive Definitions
Through Interpretive Dance
(and other techniques)
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Understanding Recursive Definitions 1
Recursion
Recursion is ubiquitous in programming, including Haskell.
Recursion is the fundamental technique for implementing repetition in any
computation, any programming language.
Any other techniques (loops) can always be transformed to recursion.
General recursion cannot be directly implemented using loops.
Recursive definitions can be confusing at first, so here are some ideas that
might help you understand them.
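As an illustration of transforming a loop into recursion, consider adding up the numbers from 1 to n. The function sumTo and its helper loop are hypothetical examples.

```haskell
-- In a procedural language one might write:
--   total = 0; for i = 1 to n: total = total + i
-- The loop variables i and total become parameters of a recursive helper.
sumTo :: Int -> Int
sumTo n = loop 1 0
  where
    loop i total
      | i > n     = total                      -- loop exit
      | otherwise = loop (i + 1) (total + i)   -- next iteration
```

Each recursive call plays the role of one trip around the loop.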
COMP 1100 — Understanding Recursive Definitions 2
Understanding recursion by . . .
correspondence with the list structure
(We did this last week.)
Compare the sum calculation with the structure of the list:
x1 + (x2 + (x3 + ... (xn + 0 )))
x1 : (x2 : (x3 : ... (xn : [])))
Notice the direct correspondence:
• instead of [] we have 0
• instead of (:) we have (+)
COMP 1100 — Understanding Recursive Definitions 3
The sum function traverses the list:
• replacing [] with 0
• replacing (:) with (+)
sum [] = 0
sum (x:xs) = x + sum xs
COMP 1100 — Understanding Recursive Definitions 4
How about the length of a list?
1 + (1 + (1 + ... (1 + 0 )))
x1 : (x2 : (x3 : ... (xn : [])))
• instead of [] we have 0
• instead of (:) we have (+)
• instead of each xi we have 1
so:
length [] = 0
length (x:xs) = 1 + length xs
COMP 1100 — Understanding Recursive Definitions 5
Understanding recursion by . . .
interpretive dance
• Get some friends together
• Each play a role in the computation
length [] = 0
length (x:xs) = 1 + length xs
(This is no fun by yourself.)
COMP 1100 — Understanding Recursive Definitions 6
Another one?
sum [] = 0
sum (x:xs) = x + sum xs
COMP 1100 — Understanding Recursive Definitions 7
Accumulating Parameters
Forget programming for a moment.
How would you count the elements in a list?
like this?
a : (b : (c : (d : [])))
4 3 2 1 0
or like this?
a : (b : (c : (d : [])))
0 1 2 3 4
COMP 1100 — Understanding Recursive Definitions 8
Length of a list — accumulating parameter
Use an extra parameter to keep a running total:
count :: [a] -> Int -> Int
count [] total = total
count (x:xs) total = count xs (total+1)
Start the count at zero:
length :: [a] -> Int
length xs = count xs 0
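Tracing a call shows the running total growing as the list shrinks. In this sketch the wrapper is called lengthAcc, a hypothetical name used so that the Prelude's length is not redefined.

```haskell
count :: [a] -> Int -> Int
count []     total = total
count (_:xs) total = count xs (total + 1)

lengthAcc :: [a] -> Int
lengthAcc xs = count xs 0

-- Evaluation trace for a four-element list:
--   lengthAcc "abcd"
--     = count "abcd" 0
--     = count "bcd"  1
--     = count "cd"   2
--     = count "d"    3
--     = count ""     4
--     = 4
```

This corresponds to the second way of counting on the previous slide: the numbers run from 0 at the front to 4 at the end.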
COMP 1100 — Understanding Recursive Definitions 9
Understanding recursion by . . .
trusting the recursive call
• Pick out the recursive call
• Suppose it works
• Think about the definition
length [] = 0
length (x:xs) = 1 + length xs
(This is like solo interpretive dance.)
COMP 1100 — Understanding Recursive Definitions 10
Another one?
sum [] = 0
sum (x:xs) = x + sum xs
COMP 1100 — Understanding Recursive Definitions 11
Sum a list of numbers — accumulator
addUp :: Num a => [a] -> a -> a
addUp [] total = total
addUp (x:xs) total = addUp xs (total+x)
sum :: Num a => [a] -> a
sum xs = addUp xs 0
COMP 1100 — Understanding Recursive Definitions 12
Patterns of Recursion
Higher Order Functions
Reading: Thompson Ch.9
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Higher Order Functions 1
Patterns of Recursion
Polymorphism allows us to reuse code because the same function can be
applied to different types.
Another mechanism for reuse is to recognise patterns of computation and
design functions that embody those patterns.
For example, we often need to:
• transform every list element in some uniform way
• select out the list elements that satisfy some property
• combine the elements of a list using some operator
COMP 1100 — Higher Order Functions 2
Mapping
We often want to transform each element of a list in some way, eg:
• double every element of a list of numbers
• increment each element of a list of numbers
• convert each character of a string to upper case
For the PPM example, to select the red channel for an image, we apply a
function that selects the red channel of a pixel to every pixel in a row, then
apply that process to every row in the image.
We call this pattern of computation mapping.
COMP 1100 — Higher Order Functions 3
Three examples:
double :: [Int] -> [Int]
double [] = []
double (x:xs) = 2*x : double xs
incr :: [Int] -> [Int]
incr [] = []
incr (x:xs) = x+1 : incr xs
upperCase :: [Char] -> [Char]
upperCase [] = []
upperCase (x:xs) = toUpper x : upperCase xs
COMP 1100 — Higher Order Functions 4
How do they differ?
double [] = []
double (x:xs) = 2*x : double xs
incr [] = []
incr (x:xs) = x+1 : incr xs
upperCase [] = []
upperCase (x:xs) = toUpper x : upperCase xs
• They all have the same form (pattern)
• Only differ in function applied to the elements
• Obviously, other functions could be used in the same way
COMP 1100 — Higher Order Functions 5
Change all the names to something meaningless to emphasise
the pattern even more:
f [] = []
f (x:xs) = 2*x : f xs
g [] = []
g (x:xs) = x+1 : g xs
h [] = []
h (x:xs) = toUpper x : h xs
COMP 1100 — Higher Order Functions 6
Idea!
Make those transformation functions arguments to a function representing the
form of the definition. Such a mapping function will take two arguments:
• a function to transform the elements
• a list
This is exactly the function map defined in the prelude:
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
Functions that take other functions as arguments are called higher-order.
Notice that map is also polymorphic.
COMP 1100 — Higher Order Functions 7
Now we can use map to define double, incr and upperCase:
double xs = map (2*) xs
incr xs = map (+1) xs
upperCase xs = map toUpper xs
Sample trace of computation:
map (2*) [3, 9] =
map (2*) (3 : (9 : []))
=⇒ 2*3 : (map (2*) (9 : []))
=⇒ 2*3 : 2*9 : (map (2*) [])
=⇒ 2*3 : 2*9 : []
=⇒ [6, 18]
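The red channel example mentioned earlier for PPM images uses map twice: once over the pixels in a row and once over the rows of the image. The sketch below assumes a pixel is a triple of Ints; that representation and all the names are assumptions for illustration, and the assignment's Pixel module may differ.

```haskell
type Pixel = (Int, Int, Int)    -- (red, green, blue), an assumed representation
type Image = [[Pixel]]          -- a list of rows

-- Keep the red component of one pixel and zero the others.
redOnly :: Pixel -> Pixel
redOnly (r, _, _) = (r, 0, 0)

-- Apply redOnly to every pixel in a row, and do that to every row.
redChannel :: Image -> Image
redChannel img = map (map redOnly) img
```

The inner map transforms one row; the outer map applies that row transformation to the whole image.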
COMP 1100 — Higher Order Functions 8
Advantages
What are the advantages of this approach?
• The map function is a good example of abstraction and generalisation.
• If you understand WHAT map does, definitions that use it are easier to
understand. You don’t need to think about HOW it works.
• Code re-use is a cornerstone of modern software engineering practices.
map can be re-used for all functions of this form.
(Similar advantage to polymorphism.)
COMP 1100 — Higher Order Functions 9
Filtering
We may want to select the elements of a list with some property in common.
For example, we may wish to select the digits in a string:
getDigits [] = []
getDigits (x:xs)
| isDigit x = x : getDigits xs
| otherwise = getDigits xs
or the negative numbers in a list:
getNegs [] = []
getNegs (x:xs)
| x < 0 = x : getNegs xs
| otherwise = getNegs xs
COMP 1100 — Higher Order Functions 10
How do they differ?
getDigits [] = []
getDigits (x:xs)
| isDigit x = x : getDigits xs
| otherwise = getDigits xs
getNegs [] = []
getNegs (x:xs)
| x < 0 = x : getNegs xs
| otherwise = getNegs xs
COMP 1100 — Higher Order Functions 11
Extract a function to represent the common form of the selection algorithm:
filter :: (a -> Bool) -> [a] -> [a]
filter p [] = []
filter p (x:xs)
| p x = x : filter p xs
| otherwise = filter p xs
filter takes a predicate (a Bool-valued function) and a list as arguments,
and returns the sub-list of elements that satisfy the predicate.
Now we can write:
getDigits xs = filter isDigit xs
getNegs xs = filter (< 0) xs
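The two definitions need isDigit from the Char library to load. The sketch below adds the import and type signatures, plus a hypothetical function, shoutLetters, that combines filtering with mapping.

```haskell
import Data.Char (isDigit, toUpper)

getDigits :: String -> String
getDigits xs = filter isDigit xs

getNegs :: [Int] -> [Int]
getNegs xs = filter (< 0) xs

-- shoutLetters first selects the non-digit characters,
-- then converts each of them to upper case.
shoutLetters :: String -> String
shoutLetters xs = map toUpper (filter notDigit xs)
  where
    notDigit c = not (isDigit c)
```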
COMP 1100 — Higher Order Functions 12
Folding
We often want to combine all the elements of a list into a single value in a
uniform way.
For example, the standard function sum adds together all the elements of a
list. concat combines a list of lists by the operator (++).
Recall the sum computation:
x1 + (x2 + (x3 + ... (xn + 0 )))
x1 : (x2 : (x3 : ... (xn : [])))
which has the same pattern as concat:
x1 ++ (x2 ++ (x3 ++ ... (xn ++ [])))
x1 : (x2 : (x3 : ... (xn : [])))
COMP 1100 — Higher Order Functions 13
foldr
The pattern on the previous slide is fold right:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
which is a bit harder to grasp than map or filter.
Think of what it does, rather than the definition.
foldr (+) 0 [x1, ..., xn] = x1 + (x2 + ... (xn + 0))
foldr (++) [] [x1, ..., xn] = x1 ++ (x2 ++ ... (xn ++ []))
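A few functions written with foldr, as a sketch. The names sumR, concatR and lengthR are hypothetical, chosen to avoid clashing with the Prelude.

```haskell
sumR :: [Int] -> Int
sumR xs = foldr (+) 0 xs

concatR :: [[a]] -> [a]
concatR xss = foldr (++) [] xss

-- length also fits the pattern: ignore each element and add 1.
lengthR :: [a] -> Int
lengthR xs = foldr countOne 0 xs
  where
    countOne _ n = 1 + n

-- Evaluation trace:
--   foldr (+) 0 [2,5,7]
--     = 2 + foldr (+) 0 [5,7]
--     = 2 + (5 + foldr (+) 0 [7])
--     = 2 + (5 + (7 + foldr (+) 0 []))
--     = 2 + (5 + (7 + 0))
--     = 14
```

In each case the second argument replaces [] and the first argument replaces (:).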
COMP 1100 — Higher Order Functions 14
foldl
There is also a fold left:
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
In fact, the standard Prelude definition of sum is in terms of foldl:
sum xs = foldl (+) 0 xs
For example, sum [x1, x2, ..., xn] is:
foldl (+) 0 [x1, x2, ..., xn] = (((0 + x1) + x2) + ... + xn)
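For (+) the grouping makes no difference to the result, but for an operator like (-) the two folds give different answers. A small sketch with illustrative names:

```haskell
-- foldl groups to the left, starting from the initial value.
leftResult :: Int
leftResult = foldl (-) 0 [1, 2, 3]    -- ((0 - 1) - 2) - 3

-- foldr groups to the right, ending with the initial value.
rightResult :: Int
rightResult = foldr (-) 0 [1, 2, 3]   -- 1 - (2 - (3 - 0))
```

Working the brackets through by hand gives -6 for the left fold and 2 for the right fold.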
COMP 1100 — Higher Order Functions 15
The fold functions are extremely general patterns of recursion (even map can
be defined using fold) . . .
. . . and they correspond nicely to some loop structures in procedural
languages . . .
. . . but I think they are harder to understand . . .
. . . so they might be a bit difficult for an introductory programming course.
Look for some other useful standard higher-order functions, e.g.
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
takeWhile :: (a -> Bool) -> [a] -> [a]
dropWhile :: (a -> Bool) -> [a] -> [a]
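To make the remark about map concrete, here is one possible definition of map as a right fold, under the hypothetical name mapR, followed by sample uses of the three functions listed above. The result names are illustrative.

```haskell
-- map written as a right fold: each element is transformed and
-- consed onto the result for the rest of the list.
mapR :: (a -> b) -> [a] -> [b]
mapR f xs = foldr step [] xs
  where
    step x rest = f x : rest

-- zipWith combines two lists element by element.
pairSums :: [Int]
pairSums = zipWith (+) [1, 2, 3] [10, 20, 30]

-- takeWhile keeps the leading elements that satisfy the predicate.
leadingSmall :: [Int]
leadingSmall = takeWhile (< 3) [1, 2, 3, 4, 1]

-- dropWhile discards them and keeps everything from the first failure on.
afterSmall :: [Int]
afterSmall = dropWhile (< 3) [1, 2, 3, 4, 1]
```

Unlike filter, takeWhile and dropWhile stop examining the list at the first element that fails the test, so the final 1 is treated differently by each.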
COMP 1100 — Higher Order Functions 16
Data Directed Design
Reading: Thompson Ch.6
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Data Directed Design 1
Announcements
• Mid-semester exam:
– Melville Hall 6 to 7pm Monday 3 April 2006
– Open-book (except ANU library books)
• Drop-in labs in N116:
– Week 5:
∗ Thursday 10am – 12 noon
– Week 6:
∗ Monday 10am – 12 noon
∗ Thursday 10am – 12 noon
– Week 7:
∗ Monday 10am – 12 noon
COMP 1100 — Data Directed Design 2
Program Design
In earlier lectures we looked at how to think about designing functions.
The key idea was abstraction — breaking the problem into smaller parts to
reduce the complexity of components we need to manage at one time.
Now that we know a little about modules we should look again at how we
might go about analysing the problem to identify the manageable parts.
The approach we will take is called Data Directed Design.
A key step is to identify the types of data the problem involves.
COMP 1100 — Data Directed Design 3
Example: Supermarket Docket
We will work through an example based on a part of a supermarket checkout
system.
The full system involves hardware such as the scanner and till, software to
take the sequence of scans (barcodes) and produce the final bill, and lots of
other activities, organisational procedures such as the operation of the
checkout, management of the database of products and so on. This is part
of a larger integrated system to deal with stock management, bookkeeping,
and all the other matters to do with running a business.
We will look at a small part of the overall system:
• the scanner produces a sequence of barcodes
• we produce an itemised docket with a total cost
COMP 1100 — Data Directed Design 4
For example, the scanner produces a sequence of barcodes, like:
4719 1112 1113 3814 1234
and our program will print:
Alonzo’s Mega-Mart
Frozen Pizza..............6.49
Mars Bar..................1.60
Unknown Item..............0.00
Hokkien Noodles...........2.05
Chianti, 1lt.............17.95
Total...................$28.09
COMP 1100 — Data Directed Design 5
The strategy we will follow is Data Directed Design.
I strongly recommend that you follow this strategy. It consists of a
sequence of well-defined steps.
1. Understand the Problem
• Thoroughly analyse the aims and requirements of the problem.
• Work some examples by hand.
• Ask questions (of yourself and others) and look for special cases.
For example, what should be done if the barcode does not appear in the
database?
COMP 1100 — Data Directed Design 6
2. Identify the Objects
What are the “things” that the problem involves?
Abstract Data Types (ADTs) are “types” related to the problem domain
rather than the programming language.
For the supermarket problem, I identified:
• barcodes, names and prices of items
• sequences of barcodes from the scanner
• the database linking barcodes to items and prices
• the docket itself
COMP 1100 — Data Directed Design 7
These ADTs are not all independent concepts, and some are much simpler
than others.
The hierarchy and relations between the ADTs:
[Diagram: the hierarchy relating the ADTs: barcode sequence, barcode, item name, price, database and docket.]
COMP 1100 — Data Directed Design 8
Choose some names for the ADTs
TillType : sequence of barcodes
BarCode : bar codes
Name : name of item
Price : price of item
Database : database of supermarket stock
Docket : printed bill
COMP 1100 — Data Directed Design 9
3. Identify Basic Operations on Objects
All data types are completely determined by the operations on that
type.
TillType: Process in sequence — extract the first bar code, then the
second etc.
BarCode: Compare for equality. Depending on Database representation,
we may want other relational operations.
Price: Arithmetic and reading.
Database: Look up information about an item, using BarCode as an index.
Realistically, we would want other operations to maintain and update the
database, too.
Name and Docket: No operations in program, but read by system users.
COMP 1100 — Data Directed Design 10
There is never only one correct answer. For example, I keep talking about
“items” so they should probably be included as another ADT.
COMP 1100 — Data Directed Design 11
4. Choose Representations of Objects
Based on the operations identified above, choose appropriate types to
represent the ADTs.
TillType: Only needing sequential processing suggests that a simple list
would be sufficient:
type TillType = [BarCode]
BarCode: I don’t know enough about bar codes to make an informed
judgement. Why an integer? Do we ever do arithmetic on a bar code?
Nevertheless:
type BarCode = Int
COMP 1100 — Data Directed Design 12
Name: A string of characters is sufficient.
type Name = String
Price: The obvious choices are Float (number of dollars) or Int (number
of cents). In general, Int is preferred, because whole numbers of cents are
exact whereas Float arithmetic can introduce rounding errors.
type Price = Int
Docket: A string, a sequence of lines (strings), a “file” or such like.
type Docket = String
COMP 1100 — Data Directed Design 13
Database: There are various standard representations for look-up tables.
The simplest is an association list: an unordered sequence of records,
each with a key and other data fields. This representation would be far too
inefficient for anything other than a tiny database. For this exercise we will
take the simple approach.
The key is BarCode. The records need to contain the bar code, the name
and the price. (We can infer that from the problem requirements.)
type Database = [ (BarCode ,Name ,Price) ]
COMP 1100 — Data Directed Design 14
Modules
Two observations about the Database ADT:
• The operations are not so straightforward;
• It is easy to imagine the database ADT being used for purposes other
than this specific exercise.
Generally, for all but the simplest ADTs, it is appropriate to build them in their
own module.
COMP 1100 — Data Directed Design 15
Modules
There are several advantages to packaging ADTs in modules:
• While developing that module we can concentrate on HOW and ignore
WHY it is needed;
• In the main application we can ignore how the ADT is implemented as
long as we understand WHAT it does;
• Modifications to the ADT are confined to that module, rather than being
scattered through a program;
• Module systems allow the programmer to control misuse of an ADT
through export restrictions. That is, only certain operations are
permitted.
COMP 1100 — Data Directed Design 16
5. Implement Basic Operations on Objects
TillType = [BarCode]
The operations (sequential processing) are simple enough that the standard
functions (list patterns) correspond.
BarCode = Int
No operations other than equality comparison (==).
Price = Int
Standard arithmetic operations. We also need to display the price as dollars
and cents. This formatting could also be considered a Docket operation.
formatCents :: Price -> String
formatTotal :: Price -> String
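One possible implementation of formatCents is sketched below. It assumes prices are non-negative, and the local helper twoDigits is hypothetical.

```haskell
type Price = Int   -- number of cents

-- Split the price into dollars and cents, then join them with a
-- decimal point. Assumes the price is not negative.
formatCents :: Price -> String
formatCents p = show dollars ++ "." ++ twoDigits cents
  where
    dollars = p `div` 100
    cents   = p `mod` 100
    -- Pad single-digit cent amounts with a leading zero.
    twoDigits c
      | c < 10    = '0' : show c
      | otherwise = show c
```

For example, a price of 649 is displayed as 6.49 and a price of 5 as 0.05.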
COMP 1100 — Data Directed Design 17
Docket = String
Name = String
Standard string processing operations.
Database = [ (BarCode, Name, Price) ]
A lookup function: given a bar code, return the record with that key.
find :: Database -> BarCode -> (Name ,Price)
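A possible find, written with the two list patterns. The sample database is read off the example docket shown earlier, and returning an "Unknown Item" with price 0 for a missing bar code is one way to handle that special case, again following the example docket.

```haskell
type BarCode  = Int
type Name     = String
type Price    = Int
type Database = [(BarCode, Name, Price)]

-- Sample data matching the example docket (for illustration).
sampleDb :: Database
sampleDb =
  [ (4719, "Frozen Pizza",     649)
  , (1112, "Mars Bar",         160)
  , (3814, "Hokkien Noodles",  205)
  , (1234, "Chianti, 1lt",    1795)
  ]

-- Search the records in order; fall back to an unknown item
-- costing nothing when the bar code is absent.
find :: Database -> BarCode -> (Name, Price)
find [] _ = ("Unknown Item", 0)
find ((code, name, price) : rest) wanted
  | code == wanted = (name, price)
  | otherwise      = find rest wanted
```

Searching a list record by record is simple but slow for large databases, as noted when the representation was chosen.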
COMP 1100 — Data Directed Design 18
6. Factor Process into Manageable Parts
We now have some objects (types) and basic operations with which to
construct our solution.
We need to break the overall solution process down into manageable parts.
The top-level function is:
printBill :: TillType -> Docket
How can we break that down into smaller parts?
COMP 1100 — Data Directed Design 19
One possible approach is to factor it into two parts:
1. look up the items
2. create the docket from the name and price information
This suggests the following operations:
makeBill :: TillType -> BillType
formatBill :: BillType -> Docket
where the list of item information is a new type:
type BillType = [(Name ,Price)]
Design is an iterative process. We have introduced a new data type into
the design, so steps 2, 3, 4 and 5 must be repeated for that type.
COMP 1100 — Data Directed Design 20
The main operation (printBill) is just makeBill and formatBill in
sequence.
“Sequencing” corresponds to function composition, so we have:
printBill codes = formatBill (makeBill codes)
Now we can concentrate on the two simpler functions.
Do they need to be factored down further?
Can they be easily implemented (in terms of the basic operations on the
ADTs)?
COMP 1100 — Data Directed Design 21
makeBill obviously operates on TillType and the Database, so its
definition will involve the look-up function and operations to retrieve bar
codes from lists of bar codes.
formatBill needs to construct a heading, a body and a total line. The body
and the total line will depend on the names and prices of the items
purchased (BillType):
formatLines :: BillType -> String
totalLine :: BillType -> String
totalLine will be further factored into a function to compute the total and a
function to display that value:
makeTotal :: BillType -> Price
formatTotal :: Price -> String
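The factoring above can be sketched end to end as follows. This is only one possible realisation: the sample database codeIndex, the "Unknown Item" default, the heading text and the very plain line layout are all assumptions made for illustration.

```haskell
type BarCode  = Int
type Name     = String
type Price    = Int
type TillType = [BarCode]
type BillType = [(Name, Price)]
type Docket   = String
type Database = [(BarCode, Name, Price)]

-- Hypothetical sample database, for illustration only.
codeIndex :: Database
codeIndex = [(1234, "Dry Sherry", 540), (5643, "Nappies", 1010)]

find :: Database -> BarCode -> (Name, Price)
find db code =
  case [(n, p) | (c, n, p) <- db, c == code] of
    (record : _) -> record
    []           -> ("Unknown Item", 0)   -- assumed default

-- Step 1: look up every scanned bar code.
makeBill :: TillType -> BillType
makeBill codes = map (find codeIndex) codes

makeTotal :: BillType -> Price
makeTotal bill = sum (map snd bill)

formatCents :: Price -> String
formatCents price = show d ++ "." ++ (if c < 10 then "0" else "") ++ show c
  where (d, c) = price `divMod` 100

formatTotal :: Price -> String
formatTotal total = "Total: " ++ formatCents total ++ "\n"

formatLine :: (Name, Price) -> String
formatLine (name, price) = name ++ ": " ++ formatCents price ++ "\n"

formatLines :: BillType -> String
formatLines bill = concat (map formatLine bill)

totalLine :: BillType -> String
totalLine bill = formatTotal (makeTotal bill)

-- Step 2: heading, body and total line.
formatBill :: BillType -> Docket
formatBill bill = "Receipt\n" ++ formatLines bill ++ totalLine bill

printBill :: TillType -> Docket
printBill codes = formatBill (makeBill codes)
```

Evaluating putStr (printBill [1234, 5643]) in GHCi would show the heading, one line per item and the total.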
COMP 1100 — Data Directed Design 22
The design process is never that straightforward.
We learn as we design, so we make mistakes and omissions.
You should always expect to review , revisit and redesign .
Software development is an iterative process, gradually refining to a final
solution.
COMP 1100 — Data Directed Design 23
Process (Function) Hierarchy
printBill
  makeBill
    find
  formatBill
    formatLines
      formatLine
    totalLine
      makeTotal
      formatTotal
        formatCents
COMP 1100 — Data Directed Design 24
Summary of Data Directed Design Process
1. Understand the Problem
2. Identify the Objects
3. Identify the Basic Operations on Objects
4. Choose Representations of the Objects
5. Implement the Basic Operations on Objects
6. Factor the Process into Manageable Parts
COMP 1100 — Data Directed Design 25
Input and Output
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Input and Output 1
Assignment 1 hint, clarification, advice
Tutors and I have been asked by lots of students for help with transformation
functions that take more than one argument, such as
threshold :: Int -> Image -> Image
In TestBed.hs, you can’t pass threshold to the transform test function
because it’s the wrong type.
Partial application is the key idea.
(threshold 100) is a function of type Image -> Image, which is the
right type for transform. You should evaluate
transform (threshold 100)
COMP 1100 — Input and Output 2
The same advice holds for the function arguments to map.
If you have a threshold function for pixels, like:
thresholdPixel :: Int -> Pixel -> Pixel
which you want to map over every element of a list of pixels (i.e. a Row),
again you need to partially apply that function to get the right type:
(thresholdPixel 100) is a function of type Pixel -> Pixel.
I don’t want to give too much away, but:
thresholdRow :: Int -> Row -> Row
thresholdRow level rows = map (thresholdPixel level) rows
COMP 1100 — Input and Output 3
What’s a Program?
So far, we have concentrated on pure functions — they take arguments,
perform a computation , and return a result.
To use such functions we have used the program GHCi which interacts
with the user.
A program running on a computer interacts with its environment .
The most basic kinds of actions are input and output :
• input from the keyboard or a file (e.g. ppm image files)
• output to the screen or a file (e.g. ppm image files)
(Other interactions include communication with devices, networks etc.)
COMP 1100 — Input and Output 4
A Haskell Program
• A module called Main, containing
• a function main of type IO ()
module Main where
main :: IO ()
main = print (map factorial [0..10])
factorial :: Int -> Int
factorial 0 = 1
factorial n = n * factorial (n-1)
COMP 1100 — Input and Output 5
Compiling Programs
We use the GHC compiler to translate the Haskell program to machine
language:
ghc Factorials.hs -o factorials
This produces a machine-language program (in the file factorials) which
can be run independently:
./factorials
[1,1,2,6,24,120,720,5040,40320,362880,3628800]
COMP 1100 — Input and Output 6
print is a function which prints any value which has a text representation.
What happens if we apply print to a string?
print "Hello World"
If we want to print the contents of a string we use
putStr "Hello World"
or
putStrLn "Hello World"
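The difference can be seen in a small sketch. print writes the text representation of its argument, and the text representation of a string includes the quotation marks. The names shown and demo are introduced here for illustration.

```haskell
-- The text representation of a string, which is what print writes.
shown :: String
shown = show "Hello World"

demo :: IO ()
demo = do
  print "Hello World"      -- writes "Hello World", quotes included, then a newline
  putStr "Hello World"     -- writes Hello World with no newline
  putStrLn "Hello World"   -- writes Hello World followed by a newline
```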
COMP 1100 — Input and Output 7
Interaction and Computation
Programs are generally composed of an inner computational kernel and an
outer interaction shell:
[Figure: an outer interaction shell, which handles input and output, wrapped around an inner computational kernel.]
COMP 1100 — Input and Output 8
Haskell Separates Interaction from Computation
Most languages don’t clearly distinguish these two aspects.
Haskell does it by means of types .
All the types we have seen so far are “computation” types.
For example, the expression
"Hello " ++ "World!"
denotes a computation, and the function
(++) :: [a] -> [a] -> [a]
is a pure function — it only computes a result string from two input strings.
COMP 1100 — Input and Output 9
IO types
Input/output actions always have a type of the form IO a
where a is the result of the I/O operation.
For example, the expression
putStr "Hello World!"
has type IO () where the () indicates that no result is returned.
The type of putStr is
putStr :: String -> IO ()
The result of the putStr function is not only a value (),
but also an IO interaction.
COMP 1100 — Input and Output 10
Combining Actions
Unlike pure functions, we cannot combine IO functions by composing them.
putStr getLine
doesn’t work, because getLine has type IO String but putStr expects a
pure String, not an IO action that produces a String.
Haskell distinguishes between computations and interactions by types.
Order of execution
In pure computations, there is some freedom for the system to choose the
order in which subcomputations are performed. That is not the case with IO
actions. We must specify the sequence.
COMP 1100 — Input and Output 11
The do Notation
When we want to combine IO actions we use the do notation:
do
<statement #1>
<statement #2>
...
<statement #n>
where each statement is an IO action.
Whenever we want the result of an action, we can use a statement of the
form:
v <- <action>
to bind the result of the action to some variable v .
COMP 1100 — Input and Output 12
Now we can combine getLine and putStr as follows:
do
  input <- getLine
  putStr input
Since getLine has type IO String, the variable input has type String,
which can be passed to putStr.
Of course, IO types can be used in function definitions:
echoLine :: IO ()
echoLine = do
  input <- getLine
  putStr input
COMP 1100 — Input and Output 13
Defining Interaction Functions
As with computations, we can define functions that encapsulate useful
interaction patterns. The following ask function poses a question and waits
for the user’s response.
The question is passed as a String argument, and the result is an IO
action that produces a String:
ask :: String -> IO String
ask question = do
  putStr question
  getLine
An example using ask is in HelloYou.hs
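A small sketch of how ask might be used follows. It is not the contents of HelloYou.hs; the names greeting and greet are hypothetical. The pure part (building the reply) is kept apart from the interaction part.

```haskell
ask :: String -> IO String
ask question = do
  putStr question
  getLine

-- Pure part: build the reply (hypothetical helper).
greeting :: String -> String
greeting name = "Hello, " ++ name ++ "!"

-- Interaction part: pose the question, then print the reply.
greet :: IO ()
greet = do
  name <- ask "What is your name? "
  putStrLn (greeting name)
```

In a compiled program the prompt may not appear before the input is read, because standard output is usually buffered; flushing it with hFlush stdout from System.IO is the usual remedy.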
COMP 1100 — Input and Output 14
Showing and Reading Values
IO actions getLine and putStr allow us to read and write Strings.
The functions readLn and print read and write lots of other types of
values.
askNumber :: String -> IO Float
askNumber question = do
  putStr question
  readLn
Examples using askNumber include Sum2Nums.hs and
MaleMetabolic.hs
COMP 1100 — Input and Output 15
Conditions and Repetition in IO Actions
We want to write a program that reads a line of text, then outputs it after
converting all characters to upper case (see week 4 lab exercises).
The program should repeat the process until an empty line is entered.
The computational part should be familiar:
stringToUpper :: String -> String
stringToUpper chs = map toUpper chs
The whole program is Upper.hs
Notice how the computational part (stringToUpper) is quarantined from
the interaction part.
COMP 1100 — Input and Output 16
The upperLine Function
The repeated interaction is:
1. get a line
line <- getLine
2. if it’s empty, return (that is, escape the repetition)
if line == "" then return ()
3. otherwise convert to upper case, print, and repeat the process (recurse)
else do
putStrLn (stringToUpper line)
upperLine -- recursive call
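Assembled into one definition, the three steps might look like the sketch below. It may differ in detail from Upper.hs; it uses stringToUpper as defined on the previous slide.

```haskell
import Data.Char (toUpper)

-- Computational part: a pure function.
stringToUpper :: String -> String
stringToUpper chs = map toUpper chs

-- Interaction part: echo lines in upper case until an empty line is entered.
upperLine :: IO ()
upperLine = do
  line <- getLine
  if line == ""
    then return ()                      -- escape the repetition
    else do
      putStrLn (stringToUpper line)
      upperLine                         -- recursive call
```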
COMP 1100 — Input and Output 17
Notes on upperLine syntax
• The then and else branches are indented because they belong to the
statement starting with if.
(Remember the off-side rule.)
• As there are two statements in the else branch, we need another do to
put them together.
SumNums.hs is another program, similar to Upper.hs, which adds up a
sequence of numbers entered at the keyboard.
COMP 1100 — Input and Output 18
File I/O
All examples so far have been about reading from the keyboard and writing
to the screen.
Programs must also be able to access and modify files on disk, etc.
(For example, assignment 1 manipulates ppm files.)
The three simplest standard functions are:
type FilePath = String
writeFile :: FilePath -> String -> IO ()
appendFile :: FilePath -> String -> IO ()
readFile :: FilePath -> IO String
COMP 1100 — Input and Output 19
• Values of type FilePath denote files using the usual path syntax, e.g.
"/home/clem/words.txt"
• writeFile path str creates a new file called path and writes str
into the file.
If a file with that name already exists, it is overwritten.
• appendFile path str adds str to the end of an already existing file
called path.
• readFile path reads the entire contents of the file called path and
returns it as a string.
COMP 1100 — Input and Output 20
Example
A simple program to read the name of a file from the keyboard, read the file,
then print it to the screen:
main :: IO ()
main = do
  putStr "Name of file: "     -- to screen
  path <- getLine             -- from keyboard
  contents <- readFile path   -- from file
  putStrLn contents           -- to screen
Example: Cat.hs
COMP 1100 — Input and Output 21
File contents as a single string. . .
readFile returns a single string being the contents of the file.
Demonstration: Contents.hs
Often we want to deal with each line separately. The standard, pure function:
lines :: String -> [String]
breaks a string into a list of lines at each newline character '\n'.
The dual of lines is
unlines :: [String] -> String
Example: PickLine.hs
COMP 1100 — Input and Output 22
There is also a pure function that breaks a string into separate “words” by
looking for space or newline characters:
words :: String -> [String]
unwords :: [String] -> String
Example: Alpha.hs
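The behaviour of these four functions can be seen on a small string. The sketch below is not taken from PickLine.hs or Alpha.hs; sample and numberLines are names invented for illustration.

```haskell
sample :: String
sample = "the quick\nbrown fox\n"

sampleLines :: [String]
sampleLines = lines sample    -- one string per line, newlines removed

sampleWords :: [String]
sampleWords = words sample    -- split at spaces and newlines

-- A hypothetical use of lines and unlines together: number each line.
numberLines :: String -> String
numberLines text = unlines (zipWith label [1 :: Int ..] (lines text))
  where label n line = show n ++ ": " ++ line
```

Because these are pure functions, they can be applied to the string returned by readFile inside a do block without any further IO.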
COMP 1100 — Input and Output 23
Showing and Reading Values . . . Revisited
Recall the functions readLn and print that can read and write non-string
values of various types.
In fact they are defined in terms of string I/O functions and two standard
pure functions, show and read.
• show converts values to their string representation.
e.g. show 42 ⇒ "42"
• read is the dual of show: it converts string representations to values.
e.g. read "42" ⇒ 42
COMP 1100 — Input and Output 24
Not every type of value has a string representation so the types of show and
read are not:
show :: a -> String
read :: String -> a
because not every type can be substituted for a. show and read are not
polymorphic. They are overloaded, so their types are:
show :: Show a => a -> String
read :: Read a => String -> a
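Because read is overloaded, the type of its result has to be determined from the context, usually by a type signature or annotation. A short sketch follows; the names answer, ratio and roundTrips are illustrative.

```haskell
-- The signature tells read which type to produce.
answer :: Int
answer = read "42"

ratio :: Float
ratio = read "0.5"

-- show followed by read gets back the original value.
roundTrips :: Int -> Bool
roundTrips n = read (show n) == n
```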
COMP 1100 — Input and Output 25
print is defined using show and putStrLn:
print :: Show a => a -> IO ()
print value = putStrLn (show value)
readLn is defined using read and getLine:
readLn :: Read a => IO a
readLn = do
  line <- getLine
  return (read line)
Advice: Don’t always try to write programs from scratch.
Re-use code that you have written, I have written, from textbooks, etc.
COMP 1100 — Input and Output 26
Algebraic Data Types
COMP1100 — Introduction to Programming and Algorithms
Reading: Thompson Ch.14
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Algebraic Data Types 1
Assignment 1 Submission and Marking
• Assignment 1 is due on Monday 3rd April at 12 noon
• Submission is on-line using the command:
submit comp1100 Asst1 <files>
• Assessment is in two parts:
1. Demonstrate your programs to your tutor during week 7 lab
classes.
YOU MUST ATTEND WEEK 7 LABS TO OBTAIN THESE MARKS.
2. Your tutor will mark your submitted scripts for correctness, style
and technique.
COMP 1100 — Algebraic Data Types 2
Defining Our Own Data Types
So far we have only used the pre-defined and constructed types of Haskell.
For representing “real-world” data types, we can do a lot with just those
types, possibly re-naming them with synonyms.
However, some “real-world” data types do not naturally correspond to
standard types and require some kind of coding.
For example, to represent the months of the year or the days of the week, we
could use the integers 1–12 and 1–7 respectively.
In that case, it is not clear whether a literal 3 represents March, Tuesday,
Wednesday, the value 3, . . .
COMP 1100 — Algebraic Data Types 3
Other examples where data type “coding” is inconvenient, unclear and
unnatural:
• A type which is either a number or a string (for example, in some areas
houses may have names rather than numbers).
• A tree data structure: [Figure: a binary tree whose nodes hold the strings "Freak", "Bongo", "Allures", "Fury", "Out" and "Zoot".]
Any type can be represented using the pre-defined types, but the
representation may not be very natural.
Haskell’s Algebraic Data Types let us define new types to more naturally
model types like those above.
COMP 1100 — Algebraic Data Types 4
Defining Algebraic Types
The simplest definition of an algebraic type:
• begins with the word data
• followed by the name of the new type
• followed by =
• followed by one or more alternatives separated by |
The alternatives each introduce a new constructor function which takes 0
or more arguments.
It is also possible to define polymorphic algebraic types where the
constructor functions are polymorphic.
COMP 1100 — Algebraic Data Types 5
Enumerated Types
The simplest algebraic type definitions are just an enumeration of the
elements or values of that new type.
For example, the months of the year and the days of the week could be
defined as follows:
data Day = Sun|Mon|Tue|Wed|Thu|Fri|Sat
data Month = Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec
COMP 1100 — Algebraic Data Types 6
Another example:
data Temp = Cold | Hot
data Season = Spring | Summer | Autumn | Winter
These are new types.
Type Temp consists of only two values: Cold and Hot.
Cold and Hot are called constructors of type Temp.
Note: Constructors must begin with a capital letter (to distinguish them from
variables). Remember that type names also begin with a capital letter.
COMP 1100 — Algebraic Data Types 7
Constructors May Appear in Patterns
That is the standard way to define functions over algebraic types. For
example:
weather :: Season -> Temp
weather Summer = Hot
weather other = Cold
weekend :: Day -> Bool
weekend Sat = True
weekend Sun = True
weekend other = False
COMP 1100 — Algebraic Data Types 8
Algebraic types do not automatically have operators such as equality,
ordering, show and so on. For example, the following definition:
weather :: Season -> Temp
weather s
  | s == Summer = Hot
  | otherwise = Cold
gives the error message:
AlgDT.hs:37:8:
No instance for (Eq Season)
arising from use of ‘==’ at AlgDT.hs:37:8-9
...
COMP 1100 — Algebraic Data Types 9
Defining Instances
If you need some overloaded operator such as (==) to apply to an algebraic
type, you can declare it as an instance of class Eq and define the equality
function yourself:
instance Eq Temp where
  Hot == Hot = True
  Cold == Cold = True
  _ == _ = False
This can get tedious. (Try defining Month as an instance of Ord.)
COMP 1100 — Algebraic Data Types 10
Deriving Instances
Haskell has a standard way of deriving instances for algebraic types. In most
cases it is satisfactory but sometimes you need to define your own operators.
data Season = Spring | Summer | Autumn | Winter
  deriving Eq
data Temp = Cold | Hot
  deriving Eq
The clause deriving Eq makes the type an instance of type class Eq in
the “obvious” way.
COMP 1100 — Algebraic Data Types 11
If we want to show the new values as strings, we also need to derive an
instance of class Show. Otherwise:
> weather Autumn
No instance for (Show Temp)
Probable fix: add an instance declaration for (Show Temp)
We need:
data Season = Spring | Summer | Autumn | Winter
  deriving (Eq, Show)
data Temp = Cold | Hot
  deriving (Eq, Show)
Often we also want to derive an instance of classes Ord and Enum so that
we can use the relational operators and progressions, such as [Mon..Fri].
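A brief sketch of what the derived Ord and Enum instances provide is given below. The names weekdays and isWeekday are illustrative. The derived ordering follows the order in which the constructors are listed.

```haskell
data Day = Sun | Mon | Tue | Wed | Thu | Fri | Sat
  deriving (Eq, Ord, Enum, Show)

-- A progression over an enumerated type.
weekdays :: [Day]
weekdays = [Mon .. Fri]

-- Relational operators come from the derived Ord instance.
isWeekday :: Day -> Bool
isWeekday day = day >= Mon && day <= Fri
```

The spaces around .. are deliberate: written without them, GHC may try to read Mon..Fri as a qualified name and reject it.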
COMP 1100 — Algebraic Data Types 12
Union Types
Example: A program may be concerned with various geometrical shapes.
Using an enumerated type we can distinguish between different kinds of
shapes, but we cannot carry other information such as dimensions.
data Shape = Circle | Rectangle
The alternatives in a data definition can include other types, rather than
being simple constants like Hot and Cold.
data Shape = Circle Float | Rectangle Float Float
  deriving (Eq, Show)
COMP 1100 — Algebraic Data Types 13
Now Circle is a constructor function .
> :type Circle
Circle :: Float -> Shape
In other words, Circle constructs a Shape from any Float value.
Similarly, Rectangle constructs a Shape from two Float values:
> :type Rectangle
Rectangle :: Float -> Float -> Shape
COMP 1100 — Algebraic Data Types 14
Constructor Functions May Appear in Patterns
Constructor functions are the only functions that can appear in patterns.
For example:
isRound :: Shape -> Bool
isRound (Circle r) = True
isRound (Rectangle l b) = False
area :: Shape -> Float
area (Circle r) = pi * r^2
area (Rectangle l b) = l * b
COMP 1100 — Algebraic Data Types 15
Recursive Algebraic Types
It is possible to use the algebraic type being defined in a data definition
within its own definition. That means the type itself is recursive .
Lists are a common example of a recursive type. A list is either:
• empty , or
• it consists of a head and a tail where the tail is also a list .
An algebraic type declaration that mirrors the built-in type of lists:
data List = Empty | Cons Int List
  deriving (Eq, Ord, Show)
COMP 1100 — Algebraic Data Types 16
Another standard example of a recursive data type is the Tree structure
shown at the beginning of this section.
A Binary Tree is either:
• empty , or
• it consists of a node containing some value and left and right subtrees
where the subtrees are also binary trees .
An example where the node values are strings:
data Tree = Null | Node String Tree Tree
COMP 1100 — Algebraic Data Types 17
Polymorphic Algebraic Types
The last two examples are constrained in the sense that type List
represents only lists of integers and Tree only represents binary trees of
strings.
It is possible to define lists and trees of any component type as follows:
data List a = Empty | Cons a (List a)
  deriving (Eq, Ord, Show)
data Tree a = Null | Node a (Tree a) (Tree a)
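Functions over a recursive type are naturally recursive themselves, with one clause per constructor. The sketch below defines two such functions on the polymorphic Tree. The names size, flatten and example are illustrative, and the shape chosen for the example tree is an assumption.

```haskell
data Tree a = Null | Node a (Tree a) (Tree a)
  deriving (Eq, Show)

-- Number of nodes in a tree.
size :: Tree a -> Int
size Null = 0
size (Node _ left right) = 1 + size left + size right

-- All node values, left subtree first, then the node, then the right subtree.
flatten :: Tree a -> [a]
flatten Null = []
flatten (Node x left right) = flatten left ++ [x] ++ flatten right

-- A hypothetical tree of strings; the arrangement is assumed.
example :: Tree String
example =
  Node "Freak"
    (Node "Bongo" (leaf "Allures") Null)
    (Node "Out" (leaf "Fury") (leaf "Zoot"))
  where leaf x = Node x Null Null
```

The same two definitions work unchanged for Tree Int, Tree Bool and so on, which is the point of making the type polymorphic.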
COMP 1100 — Algebraic Data Types 18
Algebraic Data Types Case Study:
A Graphics Package
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Algebraic Data Types Case Study 1
Algebraic Data Types Case Study
This section studies the development of a graphics package written in
Haskell. The focus is on the use of algebraic data types in the program.
A Graphics Package
There are two important initial design questions:
• How to represent pictures in Haskell
• How to display those pictures
Our focus will be on the first question.
COMP 1100 — Algebraic Data Types Case Study 2
The Graphics package is available at
/dept/dcs/comp1100/public/ANUPlot/. It provides simple 2-D
graphics and animation.
It is built on top of the Haskell OpenGL binding. OpenGL is widely used in
CAD, computer games, visualisation etc.
The week 7 practical exercises give basic directions on using ANUPlot.
There are also a number of examples in
/dept/dcs/comp1100/public/ANUPlot/Examples/
COMP 1100 — Algebraic Data Types Case Study 3
Representing Pictures in Haskell
The system will be based on plane co-ordinate geometry .
Each point will be represented as a pair of Float values:
type Point = (Float ,Float)
Each unit will correspond to a single pixel, so we could have chosen to
represent points as (Int, Int). There are advantages and disadvantages
but such a choice would be less extensible.
COMP 1100 — Algebraic Data Types Case Study 4
Picture Components
The package will allow for the drawing of lines, text and polygons.
A picture will consist of a collection of such things, each of type Component.
A picture will be a list of Components. The order of the list indicates the
layering of components, if they overlap.
type Picture = [Component]
COMP 1100 — Algebraic Data Types Case Study 5
Components
Lines are specified by a sequence of points, representing the beginning and
end of each segment.
Polygons are similarly represented by a sequence of points being the
vertices of the polygon.
We call a sequence of points a Path:
type Path = [Point]
Text components consist of a string.
By default, text components always appear at the origin, which is the central
point of the window opened by the drawing function.
COMP 1100 — Algebraic Data Types Case Study 6
Since pictures consist of different kinds of objects , an algebraic type
seems appropriate.
data Component = Line Path
               | Polygon Path
               | Text String
An example of a Line component:
square :: Component
square = Line [(72,72), (144,72), (144,144),
               (72,144), (72,72)]
COMP 1100 — Algebraic Data Types Case Study 7
An example of a Text component:
title :: Component
title = Text "A square"
Combining square and title into a picture:
picture :: Picture
picture = [square , title]
(SimpleSquare.hs)
COMP 1100 — Algebraic Data Types Case Study 8
Transformations
Notice that the text component is at the origin, horizontal, and a
pre-determined size.
Obviously we need to be able to move things around in the window when
composing pictures.
We could transform individual Components, but it’s more convenient to
transform collections of components.
But a collection of components is a Picture, so we may as well transform
Pictures.
Often we want to apply several transformations at the same time
(in sequence).
COMP 1100 — Algebraic Data Types Case Study 9
Which Transformations?
The most general notion of a transformation of the plane is an (onto) function
of type Point -> Point but that’s not really workable.
Design Influence
Since our graphics package will use OpenGL, it might be convenient for our
transformations to correspond to those in OpenGL:
• Translate
• Rotate
• Scale
COMP 1100 — Algebraic Data Types Case Study 10
Data Type Transform
data Transform = Translate Float Float
               | Rotate Float
               | Scale Float Float
Details:
• Translate x-distance y-distance
• Rotate degrees clockwise
• Scale x-scale y-scale
(Why clockwise? Because the OpenGL default is that the z-axis is into the screen.)
COMP 1100 — Algebraic Data Types Case Study 11
Applying Transformations
We want to apply sequences of transformations to Pictures.
The result of the transformation is a Component of a Picture, so we add it to
the Component Algebraic Data Type.
data Component = Line Path
               | Polygon Path
               | Text String
               | Transform [Transform] Picture
Note: in OpenGL transformations are represented by matrices, and there is
a notion of the Current Transformation Matrix.
Consequently Transform [ t1, t2, . . . tn ] applies tn first and t1 last .
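The ordering rule can be modelled on single points. The sketch below is not how ANUPlot or OpenGL implement transformations; applyTransform and applyAll are hypothetical names, and the rotation formula assumes a clockwise angle in degrees with the y-axis pointing up.

```haskell
type Point = (Float, Float)

data Transform = Translate Float Float
               | Rotate Float
               | Scale Float Float

-- Apply one transformation to one point (illustrative model only).
applyTransform :: Transform -> Point -> Point
applyTransform (Translate dx dy) (x, y) = (x + dx, y + dy)
applyTransform (Scale sx sy)     (x, y) = (x * sx, y * sy)
applyTransform (Rotate degrees)  (x, y) =
  (x * cos t + y * sin t, y * cos t - x * sin t)   -- clockwise, assumed
  where t = degrees * pi / 180

-- Apply a list [t1, ..., tn]: tn is applied first and t1 last.
applyAll :: [Transform] -> Point -> Point
applyAll ts p = foldr applyTransform p ts
```

With this model, applyAll [Translate 10 0, Scale 2 2] (1,1) scales first and then translates, giving (12,2), whereas reversing the list gives (22,2).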
COMP 1100 — Algebraic Data Types Case Study 12
Finally, we also want to colour picture components, so:
data Component = Line Path
               | Polygon Path
               | Text String
               | Transform [Transform] Picture
               | Color Color Picture
where Color is another algebraic data type:
data Color = RGBA8 Int Int Int Int
           | RGBA Float Float Float Float
(See Plot/Picture.hs and Plot/Color.hs for details.)
COMP 1100 — Algebraic Data Types Case Study 13
Notice we are using the names Transform and Color to mean two different
things:
• the names of types
• the names of constructor functions
Haskell won’t get confused, but you might . . .
(Square.hs)
(Parabola.hs)
(The library also has simple animation facilities.)
COMP 1100 — Algebraic Data Types Case Study 14
Graphics Package Application:
Fractals
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Fractals 1
A cool application of the graphics package, and an interesting case of
general recursion.
Fractals
Fractals are naturally recursive shapes — they are defined by rules that
specify how to make the shape from smaller and simpler copies of itself.
Fractals are widely used in computer animation (games, movies, . . . )
For example, see www.speedtree.com
Convincing images of clouds, topography, trees, swarms etc. can be created
using fractals.
COMP 1100 — Fractals 3
A Simple Tree Fractal
An elementary tree can be a trapezium:
trunk :: Picture
trunk = [Polygon [(30,0), (15,300), (-15,300), (-30,0)]]
Which just looks like a rudimentary trunk.
(Stump.hs or trunk in Tree.hs)
COMP 1100 — Fractals 4
We can make a more interesting tree by attaching limbs to the trunk.
Each limb is just a smaller version of the trunk:
where limb = [Transform [Scale 0.5 0.5] trunk]
We put one limb on top of the trunk:
trunk ++
[Transform [Translate 0 300] limb]
and 4 others sticking out the sides of the trunk:
[Transform [Translate 0 240, Rotate 20] limb] ++
[Transform [Translate 0 180, Rotate (-20)] limb] ++
[Transform [Translate 0 120, Rotate 40] limb] ++
[Transform [Translate 0 60, Rotate (-40)] limb]
COMP 1100 — Fractals 6
Notice that this is constructed from several smaller versions of the original
shape — the essential idea of a fractal.
Putting it together:
limbs :: Picture
limbs =
  trunk ++
  [Transform [Translate 0 300] limb] ++
  [Transform [Translate 0 240, Rotate 20] limb] ++
  [Transform [Translate 0 180, Rotate (-20)] limb] ++
  [Transform [Translate 0 120, Rotate 40] limb] ++
  [Transform [Translate 0 60, Rotate (-40)] limb]
  where limb = [Transform [Scale 0.5 0.5] trunk]
COMP 1100 — Fractals 7
Carrying on . . .
An even more tree-like shape . . .
Key idea:
Instead of small versions of trunk, attach small versions of limbs to the
trunk, using the same rules.
The definition of the tree picture is identical to limbs, except that we use a
half-size version of limbs instead of trunk:
where branch = [Transform [Scale 0.5 0.5] limbs]
COMP 1100 — Fractals 9
Putting it together again
tree :: Picture
tree =
  trunk ++
  [Transform [Translate 0 300] branch] ++
  [Transform [Translate 0 240, Rotate 20] branch] ++
  [Transform [Translate 0 180, Rotate (-20)] branch] ++
  [Transform [Translate 0 120, Rotate 40] branch] ++
  [Transform [Translate 0 60, Rotate (-40)] branch]
  where branch = [Transform [Scale 0.5 0.5] limbs]
COMP 1100 — Fractals 10
Generalising
We could make more and more interesting pictures by continuing this
process. For example, the next step would be to add half-size trees to the
trunk in the same arrangement, and so on.
What is the process? Can we generalise?
The trunk image is a fractal of degree 0
The limbs image is a fractal of degree 1
The tree image is a fractal of degree 2
How would we make a fractal of degree 3?
degree 4? degree 5?. . .
COMP 1100 — Fractals 12
Generalisation Idea:
Make the degree a parameter to the tree fractal function:
tree :: Int -> Picture
tree 0 will be the same as trunk.
tree 1 will be the same as limbs.
tree 2 will be the same as tree.
tree 3 will be more complex, and so on.
COMP 1100 — Fractals 13
Putting it together:
tree :: Int -> Picture
tree 0 = trunk
tree n =
  trunk ++
  [Transform [Translate 0 300] prev] ++
  [Transform [Translate 0 240, Rotate 20] prev] ++
  [Transform [Translate 0 180, Rotate (-20)] prev] ++
  [Transform [Translate 0 120, Rotate 40] prev] ++
  [Transform [Translate 0 60, Rotate (-40)] prev]
  where prev = [Transform [Scale 0.5 0.5] (tree (n-1))]
-- ^^^^^^^^^^
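To get a feel for how quickly the picture grows with the degree, the sketch below counts the polygons in tree n. It uses a cut-down copy of the picture types (only the constructors needed here), not the ANUPlot library itself, and countPolygons is a hypothetical helper.

```haskell
type Point   = (Float, Float)
type Path    = [Point]
type Picture = [Component]

-- Cut-down copies of the library types, for illustration only.
data Component = Polygon Path
               | Transform [Transform] Picture

data Transform = Translate Float Float
               | Rotate Float
               | Scale Float Float

trunk :: Picture
trunk = [Polygon [(30, 0), (15, 300), (-15, 300), (-30, 0)]]

-- The degree must be non-negative, otherwise the recursion never ends.
tree :: Int -> Picture
tree 0 = trunk
tree n =
  trunk ++
  [Transform [Translate 0 300] prev] ++
  [Transform [Translate 0 240, Rotate 20] prev] ++
  [Transform [Translate 0 180, Rotate (-20)] prev] ++
  [Transform [Translate 0 120, Rotate 40] prev] ++
  [Transform [Translate 0 60, Rotate (-40)] prev]
  where prev = [Transform [Scale 0.5 0.5] (tree (n - 1))]

-- Count every polygon, including those nested inside transformations.
countPolygons :: Picture -> Int
countPolygons picture = sum (map count picture)
  where
    count (Polygon _)       = 1
    count (Transform _ sub) = countPolygons sub
```

Each degree uses one trunk plus five copies of the previous degree, so the count for degree n is 1 plus 5 times the count for degree n-1: 1, 6, 31, 156 and so on.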
COMP 1100 — Fractals 14
Playing around. . .
• Colour shading adds to the effect (TreeColour.hs)
• Making small changes to the parameters (limb locations and rotations)
gives a different looking tree (TreeColour2.hs)
• There are other simple fractals: Koch (Snowflake) curve, Sierpinski
triangle, Dragon Curve, etc. See Week 7 lab exercises. (Snowflake.hs,
Sierpinski.hs)
COMP 1100 — Fractals 15
greyTree 5
COMP 1100 — Fractals 16
Reasoning About Programs
Reading: Thompson Ch.8
COMP1100 — Introduction to Programming and Algorithms
Malcolm Newey
Australian National University
Semester 1, 2006
COMP 1100 — Proofs 1
Before we Start:-
Standard Postmortem Questions
• Did I pass the mid-semester exam?
Your marks are available through STREAMS
• Where does my score place me in the class?
Vital statistics are:
Mean = 69;
Distribution: 42% HD, 13% D, 15% Cr, 9% P, 22% Fail
• Did I do well?
Marks must be interpreted by YOU
• Are the results for the class acceptable?
Yes! The staff are very happy.
COMP 1100 — Proofs 2
The Need for Proof
• Who can program without making errors?
(Errors can be subtle ... and they can be expensive.)
• You need to be able to demonstrate correctness.
(All cases must be covered correctly.)
• Properties of a program must be validated.
But tests can rarely be exhaustive.
You must remember this:
Testing can only show the presence of bugs.
COMP 1100 — Proofs 3
Proof and Haskell
• Proof is easier in functional languages.
• Clauses in a function definition specify computations
• They can also be used as mathematical properties.
• Symbolic execution can be used to show general properties.
• Induction is the flip side of recursion.
This week is mostly about Induction. It is a powerful technique .
COMP 1100 — Proofs 4
Typical Proof by Induction
Theorem: For all natural numbers n, the number n^2 + n is even.
Proof:
• Base Case
0^2 + 0 = 0 is clearly even (by evaluation).
• Step Case
Assume n^2 + n is even.
(n+1)^2 + (n+1) = n^2 + 2n + 1 + n + 1 = (n^2 + n) + 2(n+1)
which is the sum of two even numbers and so is even.
(We know the first is even by the induction hypothesis.)
In summary, n^2 + n is even implies (n+1)^2 + (n+1) is too.
COMP 1100 — Proofs 5
Induction for Natural Numbers
To prove a property P(n) for all natural numbers n we must do these two
things:
• The Base Case: Prove P(0).
• The Step Case: Prove P(n +1) on the assumption that P(n) holds.
In the step case P(n) is referred to as the induction hypothesis.
COMP 1100 — Proofs 6
Proving Properties of Recursive Functions
Suppose that a function t is defined recursively on positive numbers as
follows
t(1) = 1
t(n) = t(n-1) + n*n! (for n > 1)
Theorem: t(n) = (n+1)!-1
Proof:
• The Base Case:
LHS = t(1) = 1
RHS = 2!-1 = 1 = LHS
Why is n=1 in the base case (and not 0)?
Because t is only defined for +ve integers.
COMP 1100 — Proofs 7
• The Step Case:
The induction hypothesis is t(n) = (n+1)!-1
The goal for proof is t(n+1) = ((n+1)+1)!-1
LHS = t(n+1)
= t((n+1)-1) + (n+1)*(n+1)!
= t(n) + (n+1)*(n+1)!
= (n+1)! - 1 + (n+1)*(n+1)!
= (1 + (n+1))*(n+1)! - 1
= (n+2)*(n+1)! - 1
= (n+2)! - 1
RHS = ((n+1)+1)! - 1 = LHS
QED
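The theorem can also be spot-checked by running the definition. The sketch below transcribes t into Haskell and tests the property for the first few values. This is testing rather than proof, so it only supports the result for the cases tried. The names factorial and holdsUpTo are illustrative.

```haskell
factorial :: Integer -> Integer
factorial n = product [1 .. n]

-- t is only defined for positive integers, as in the definition above.
t :: Integer -> Integer
t 1 = 1
t n = t (n - 1) + n * factorial n

-- Does t(n) = (n+1)! - 1 hold for every n from 1 up to the limit?
holdsUpTo :: Integer -> Bool
holdsUpTo limit = all (\n -> t n == factorial (n + 1) - 1) [1 .. limit]
```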
COMP 1100 — Proofs 8
Proving Properties of Recursive Functions
Suppose we were to define multiplication in terms of addition, coding it as a
Haskell function:
(*) :: Int -> Int -> Int
m * n
  | n == 0 = 0               -- base case
  | n > 0  = m * (n-1) + m   -- recursive case
Lemmas:
m*0 = 0 prop. 1
m*(n+1) = m*n + m (for n>=0) prop. 2
(These re-express the clauses of the function definition.)
Theorem: 0*n = 0 (for all natural numbers n)
COMP 1100 — Proofs 9
Proof:
• The Base Case:
LHS = 0*0 = 0 = RHS
• The Step Case:
The induction hypothesis is 0*n = 0 (for n >= 0)
The goal for proof is 0*(n+1) = 0
LHS = 0*(n+1)
= 0*n + 0 (using prop. 2)
= 0 + 0 (using ind. hyp.)
= 0 = RHS
QED
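The definition can be run under another name to avoid clashing with the built-in operator. In the sketch below the name mul and the error clause for negative arguments are additions made for illustration; the first two guards follow the definition given earlier.

```haskell
-- Multiplication by repeated addition, defined for n >= 0.
mul :: Int -> Int -> Int
mul m n
  | n == 0    = 0                    -- base case (prop. 1)
  | n > 0     = mul m (n - 1) + m    -- recursive case (prop. 2)
  | otherwise = error "mul: negative second argument"   -- added for illustration

-- Spot check of the theorem for small n; a test, not a proof.
zeroTimes :: Bool
zeroTimes = all (\n -> mul 0 n == 0) [0 .. 50]
```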
COMP 1100 — Proofs 10
Proving Properties of List Functions
The length of a list can be calculated using this function definition:
length :: [a] -> Int
length [] = 0
length (x:xs) = 1 + length(xs)
The most basic property of the function is that the value returned can never
be negative. This can be stated formally and proven as follows:
Theorem: length(xs) >= 0
COMP 1100 — Proofs 11
Proof:
• The Base Case:
(length []) >= 0
= 0>=0
= True
• The Step Case:
The induction hypothesis is length(xs) >= 0
The goal to be proved is length(x:xs) >= 0
and its proof is this:
length(x:xs) >= 0
= 1+length(xs) >= 0
= True (by the ind. hyp., length(xs) >= 0, so 1+length(xs) >= 1)
QED
COMP 1100 — Proofs 12
Structural Induction for Lists
To prove a property P(xs) for all finite lists xs we must do these two things:
• The Base Case: Prove P( [] ).
• The Step Case: Prove P( x:xs ) on the assumption that P( xs ) holds.
In the step case P( xs ) is referred to as the induction hypothesis.
COMP 1100 — Proofs 13
Induction on Lists
• Note that the induction principle for lists has identical structure to that for
natural numbers.
• As with mathematical induction, structural induction for lists is a good
proof technique for recursively defined functions.
• In later courses you will encounter other structural induction principles:
induction on trees, for example.
COMP 1100 — Proofs 14
Another Example
• If we append a list of length 9 to one of length 7,
the result will be a list of length 16.
• That illustrates a fundamental characteristic of the length function:
length(xs++ys) = (length xs) + (length ys)
• It doesn’t depend on the type of the elements.
COMP 1100 — Proofs 15
Proof of ‘length-of-append’ Theorem
Here is the definition of append:
(++) :: [a] -> [a] -> [a]
[] ++ ys = ys
(x:xs) ++ ys = x:(xs ++ ys)
Theorem: length(xs ++ ys) = length(xs) + length(ys)
Proof:
• The Base Case:
LHS = length([] ++ ys )
= length(ys)
RHS = length([]) + length(ys)
= 0 + length(ys) = LHS
COMP 1100 — Proofs 16
• The Step Case:
The induction hypothesis is
length(xs ++ ys) = length(xs) + length(ys)
The goal for proof is
length((x:xs) ++ ys) = length(x:xs) + length(ys)
LHS = length((x:xs) ++ ys)
= length(x:(xs ++ ys)) (definition of ++)
= 1 + length(xs ++ ys) (definition of length)
= 1 + length(xs) + length(ys) (using ind. hyp.)
RHS = length(x:xs) + length(ys)
= 1 + length(xs) + length(ys) = LHS
QED
COMP 1100 — Proofs 17
What is the Proof Strategy?
See page 144 of Thompson’s text.
• State clearly the goal of the induction and the main subgoals, namely
the base case and the step case.
• If any confusion is possible, change the variable names in the definitions
to be used.
• Use clauses from the function definitions to simplify the subgoals. If a
subgoal is an equation, simplify its left and right hand sides separately.
• You should expect to use the induction hypothesis in your proof.
COMP 1100 — Proofs 18
Yet Another Example
We now make use of the even-ness predicate which is standard in Haskell
and also the map function.
• (map even [1,2,3,4]) = [False,True,False,True]
• (map even [1,2]) = [False,True] and
(map even [3,4]) = [False,True]
• map even ([1,2] ++ [3,4])
= (map even [1,2]) ++ (map even [3,4])
A generalization should be obvious. It is:
map even (xs ++ ys) = (map even xs) ++ (map even ys)
COMP 1100 — Proofs 19
Proof of map Example
Here is the definition of map:
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = (f x):(map f xs)
Theorem: map f (xs ++ ys) = (map f xs) ++ (map f ys)
Proof:
• The Base Case:
LHS = map f ([] ++ ys)
= map f ys
RHS = (map f []) ++ (map f ys)
= [] ++ (map f ys)
= map f ys = LHS
COMP 1100 — Proofs 20
• The Step Case:
The induction hypothesis is
map f (xs ++ ys) = (map f xs) ++ (map f ys)
The goal for proof is
map f ((x:xs) ++ ys) = (map f (x:xs)) ++ (map f ys)
LHS = map f ((x:xs) ++ ys)
= map f (x:(xs ++ ys))
= (f x):(map f (xs ++ ys))
= (f x):((map f xs) ++ (map f ys))
RHS = (map f (x:xs)) ++ (map f ys)
= ((f x):(map f xs)) ++ (map f ys)
= (f x):((map f xs) ++ (map f ys)) = LHS
QED
COMP 1100 — Proofs 21
Surely not Another Example
• We’ve done three simple ones quite successfully
• What about harder ones?
No. They make for boring lecture material.
• The Thompson text book (page 146) proves that:
reverse(xs++ys) = (reverse ys) ++ (reverse xs)
COMP 1100 — Proofs 22
Time for Perspective
• The use of induction is clearly a very solid basis for making an argument
that some property holds.
• We have seen some examples that you might say are ‘almost automatic’.
• There are decision procedures which will produce many such proofs
completely mechanically.
• They can’t prove everything.
COMP 1100 — Proofs 23
A Hard Problem
• It is a theorem about the reverse function that you may well say is
‘obvious’:
reverse(reverse(xs)) = xs
• The next slide shows why it is hard.
• The text book gives a way of proving it (page 149), but it relies on
redefining the reverse function and using another mathematical
technique: generalization.
• This section of the text is not examinable in COMP1100.
COMP 1100 — Proofs 24
Attempted Proof of the Rev-Rev Property
Here is the definition of reverse:
reverse :: [a] -> [a]
reverse [] = []
reverse (x:xs) = (reverse xs) ++ [x]
Theorem to be proved: reverse(reverse(xs)) = xs
Proof: The base case is easy;
reverse(reverse []) = reverse [] = []
so we focus on the step case:
COMP 1100 — Proofs 25
• The Induction Hypothesis:
reverse(reverse(xs)) = xs
• The Goal in the Step Case:
reverse(reverse(x:xs)) = x:xs
• The attempt at simplification:
LHS = reverse(reverse(x:xs))
= reverse((reverse xs) ++ [x])
= ????
• So we find the RHS is in its simplest form and there are no definitions
that simplify the left side.
We are apparently at a dead end.
COMP 1100 — Proofs 26
Important Lessons So Far
– Review –
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Important Lessons 1
About the mid-semester exam:
• How hard was it?
• Will the final be that easy?
• Some advice . . .
COMP 1100 — Important Lessons 2
About Assignment 2:
• Due Friday 12th May
• Windows users be warned!
– DOS puts both carriage return and line feed at the end of the line.
When you read a line, GHC only looks for line feed, so the carriage
return becomes part of the string!
– If you save the Marks.txt file in a Windows editor, your program will
seem to overwrite lines of output!
– Solution: Don’t edit or save Marks.txt! If this problem occurs,
either:
1. Get a fresh copy of Marks.txt
2. Stop using Windows . . .
COMP 1100 — Important Lessons 3
Does everything seem a bit hazy?
Yes? That’s okay — this is complicated and we’ve got all year. . .
(The nature of university education?)
No? Either you’re not trying or you’re kidding yourself.
COMP 1100 — Important Lessons 4
Can’t See the Woods for the Trees?
This course is about programming.
Fundamental concepts: what programming is.
It is not about Haskell or any other programming language.
It is not utilitarian. It is not skills-based.
But . . . to learn about programming, you need to experiment, so a
programming language is a necessary vehicle.
COMP 1100 — Important Lessons 5
Programming Languages
Programming languages are complicated things.
Like human languages, some are more complicated than others.
Complexity is caused by irregular special cases.
Programming languages seem to have masses of details (special cases)
that can obfuscate the basic concepts. Some programming languages are
more orthogonal (less irregular) than others.
Why Haskell?
• Not so complicated;
• Orthogonal design; not so many details; Few (no?) special cases.
• Express basic concepts more directly and therefore more clearly.
COMP 1100 — Important Lessons 6
Important Lessons So Far
Mostly, I have tried to focus on some fundamental concepts of programming.
I have also talked about some details and some less important things. Why?
• Because programming is complicated and you need a certain amount of
stuff to put the basic concepts into practice;
For example, input-output is difficult in every useful language . . .
but you can’t write a whole program without it.
• Because some students are interested in more than the basics;
For example, writing programs to draw fractals.
• Because we need some examples to reinforce the important lessons.
COMP 1100 — Important Lessons 7
What have been the important lessons so far?
Types and Values
What’s a value? What’s a type? Why are types important?
Conditionals
Making choices is a necessary part of any non-trivial algorithm.
Different languages may have different ways of expressing choices, but the
fundamental idea is the same.
Structured Data
Tuples, lists, user-defined (algebraic) data types.
COMP 1100 — Important Lessons 8
Lists
An extremely important data structure.
Lists are built-in in some languages (e.g. Haskell).
Lists are provided in libraries in some languages (e.g. Java).
Lists are DIY (do it yourself) in some languages.
Recursion
The fundamental technique for expressing repetition in computation.
Repetition is the essence of computation.
All other means of expressing repetition (e.g. loops) are special cases —
patterns of recursion.
COMP 1100 — Important Lessons 9
Data Directed Design
Program design is largely independent of the implementation language.
Bottom Up and Top Down design focus on the problem. When we are
writing the implementation we work with the same ideas, but express them
differently depending on the language.
Algebraic Data Types
Not many programming languages have features corresponding to Haskell’s
algebraic data types, but they can be represented by classes, objects and
inheritance in Object-Oriented languages (like Java).
COMP 1100 — Important Lessons 10
Where Now? Java!
Why?
Java is an object-oriented programming language.
(Flavour of the month?)
Java is a widely used programming language. (Why? Libraries and hype.)
How?
Mostly, we will repeat the important lessons we have seen in Haskell.
The emphasis is on the concepts rather than the individual language.
Down-side?
I’m afraid so — another bunch of details to overcome.
COMP 1100 — Important Lessons 11
The Functional Model of Computation (Haskell)
• Everything is a value
• Every value has an explicit type
• We can give names to values
• Functions take values as arguments and return new values as results
• Computations are defined by composing functions
COMP 1100 — Important Lessons 12
The Functional Model of Computation
[Figure: diagram of functions taking values as input and producing new values as output.]
COMP 1100 — Important Lessons 13
The Imperative Model of Computation (Java)
• There is a single store consisting of a collection of locations which can
contain values
• We can give names to locations and say what types of values they may
contain
• Computations are defined by a sequence of commands
• Each command may change the store in some way
Fortunately for us, imperative programs can have functions, too . . .
COMP 1100 — Important Lessons 14
The Imperative Model of Computation
[Figure: diagram of command 1, command 2 and command 3 acting on a store of locations, each location holding a value.]
COMP 1100 — Important Lessons 15
Introduction to Java
COMP1100 — Introduction to Programming and Algorithms
Reading: Big Java Ch.2
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Java Introduction 1
Where Are We Going?
We have looked at some key ideas about programming, using Haskell.
These key ideas also occur in Java.
The rest of the course will be structured around these correspondences.
The idea is to emphasise the concepts by seeing them from another angle.
Important Correspondences (Haskell: Java):
types: types, classes
values: values, objects
functions, type signatures: functions, type declarations
COMP 1100 — Java Introduction 2
lists: various list library classes
data-directed program design: data-directed program design
modules: classes
polymorphic types: generic classes
recursion: recursion, loops
map, fold, etc: for-each loops
show function: toString() method
read function: java.util.Scanner class
algebraic data types: abstract classes, inheritance
libraries: libraries
COMP 1100 — Java Introduction 3
Defining Functions in Java
Remember this Haskell function definition?
restRate :: Double -> Double -> Double -> Double
restRate weight height age =
  (66.47 + 13.75*weight + 5*height - 6.76*age) * 4.2
COMP 1100 — Java Introduction 4
How do we write it in Java?
Double restRate(Double weight, Double height, Double age)
{
    return (66.47 + 13.75*weight + 5*height - 6.76*age) * 4.2;
}
(Java has a Float type, but usually stops you using it. . . )
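A minimal sketch of how this function could sit inside a complete program. The class name RestRateSketch, the static modifier and the sample arguments are assumptions made for illustration only.

```java
class RestRateSketch {
    // Same formula as the Haskell restRate.
    // "static" lets main call the function without first constructing an object.
    static Double restRate(Double weight, Double height, Double age) {
        return (66.47 + 13.75 * weight + 5 * height - 6.76 * age) * 4.2;
    }

    public static void main(String[] args) {
        // Sample values chosen only for illustration.
        System.out.println(restRate(70.0, 175.0, 30.0));
    }
}
```

The arithmetic operators unbox each Double argument automatically, and the double result is boxed again when it is returned.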
COMP 1100 — Java Introduction 5
Type Signatures Correspond
The type signature and the left side of the defining equation in Haskell
contain exactly the same information as the function definition header in
Java:
restRate :: Double -> Double -> Double -> Double
Double restRate(Double weight, Double height, Double age)
restRate weight height age = ...
(In Java, we always write the type before the identifier we are describing.)
COMP 1100 — Java Introduction 6
Java Commands Describe Actions
In Haskell, the body of a function is just an expression which is evaluated to
give the result value:
(66.47 + 13.75*weight + 5*height - 6.76*age) * 4.2
In Java, the body of a function is a (sequence of) commands (statements).
So we have to instruct the function to return a value — it is an action.
Hence the return statement:
return (66.47 + 13.75*weight + 5*height - 6.76*age) * 4.2;
COMP 1100 — Java Introduction 7
Java is not Layout Sensitive
Some languages (like Haskell) use indentation to indicate where things like
function definitions end.
Other languages (like Java) use punctuation to indicate where things like
function definitions end.
The end of a statement (command) is indicated by a semicolon:
return ... ;
Things (e.g. statements) are grouped together using braces:
Double restRate (...) {
...
}
COMP 1100 — Java Introduction 8
People Reading Java Programs are Layout Sensitive!
How readable is this?
public class Point{Double x; Double y; public Point()
{x = 0.0;y =0.0;} public Point(Double xVal , Double
yVal){x = xVal;y = yVal;} public Double xCoord
(){ return x;} public Double yCoord (){ return y;}
public void moveRight(Double distance ){x=x+distance
;} public void moveUp(Double distance ){y=y +
distance; }}
COMP 1100 — Java Introduction 9
A Java Program?
Haskell programs all have a function:
main :: IO ()
which handles the input-output interactions.
Similarly, Java programs have a function:
public static void main(String[] args)
which handles the input-output interactions.
Warning: If you thought Haskell I/O was difficult, wait until you see Java I/O!
[Example: RestRate1P.java]
COMP 1100 — Java Introduction 10
Compiling Java Programs
The simplest way to compile a Java program is on the command line:
> javac RestRate1P.java
That is, javac is the Java compiler program.
A successful compilation produces a .class file with the same name
(RestRate1P.class in this case).
COMP 1100 — Java Introduction 11
Running Java Programs
Unlike ghc and many other compilers, javac does not produce a machine
executable version of the program.
Javac produces a version of the program that runs on the JVM (Java Virtual
Machine). The JVM is another program.
To run your (compiled) program on the JVM, use the command:
> java RestRate1P
(Make sure to leave off the .class suffix.)
COMP 1100 — Java Introduction 12
Other ways to Experiment with Java Programs
Unfortunately, there isn’t anything as convenient as GHCi for Java.
You might like to try DrJava, which does allow some experimentation with
parts of programs.
Eclipse is a fully featured integrated development environment available on
the student system. The COMP1110/1510 lecturer uses Eclipse quite a lot.
(Eclipse also supports Haskell. . . )
COMP 1100 — Java Introduction 13
Classes and Objects
COMP1100 — Introduction to Programming and Algorithms
Reading: Big Java Ch.2 & 3
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Classes & Objects 1
Types, Values and “Variables” in Java
As in Haskell, every value in Java has a type. For example:
• 42 is of type Integer
• "Hello, sailor!" is of type String
But the concept of a variable in Java is different from Haskell and
Mathematics:
In Java, a variable is a location in the store.
Locations contain values.
COMP 1100 — Classes & Objects 2
• We give names to variables by declaring them:
String greeting;
and say that greeting has type String.
• We put values into variables by assignment:
greeting = "Hello, World!";
• Usually, we initialise variables when we declare them:
String greeting = "Hello, World!";
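The following sketch shows a variable behaving as a location: a second assignment overwrites the value stored there. The class name GreetingSketch and the method currentGreeting are hypothetical names used only for this illustration.

```java
class GreetingSketch {
    // Returns whatever the variable holds after two assignments.
    static String currentGreeting() {
        String greeting;               // declaration: names a location of type String
        greeting = "Hello, World!";    // assignment: puts a value into the location
        greeting = "Hello, sailor!";   // a later assignment replaces the stored value
        return greeting;
    }

    public static void main(String[] args) {
        System.out.println(currentGreeting()); // prints Hello, sailor!
    }
}
```

Nothing like the second assignment exists in Haskell, where a name is bound to one value for good.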
COMP 1100 — Classes & Objects 3
What does “Object-Oriented” Mean?
Three fundamental constructs:
• Classes
• Objects
• Methods
Object-oriented programs manipulate objects.
In a pure object-oriented language, every value is an object.
Java is not purely OO, but we’ll try to pretend that it is. . .
COMP 1100 — Classes & Objects 4
What’s a Class?
A class in Java is like a data type in Haskell.
Haskell has various types provided by the Prelude (e.g. Int, Bool, lists,
tuples) and others provided in the libraries.
There are lots of predefined classes in the Java libraries (e.g. Integer,
Boolean, about a dozen different kinds of lists, and so on.)
(Java doesn’t have any specific tuple classes. In a sense, every class corresponds to
a tuple.)
COMP 1100 — Classes & Objects 5
What’s a Class? — Part 2
A class in Java is also like a module in Haskell.
In Haskell, a common use of modules is to group together a user-defined
data type with the functions that operate on that data type.
For example, in Assignment 1 the Pixel module contained the data type
representing pixels and the functions that operate on pixels. The PPM
module contained the data type representing ppm images and the functions
that operate on those images.
(Of course, we expect your Assignment 2 submissions to be structured this
way, too!)
COMP 1100 — Classes & Objects 6
Java forces you to structure your programs that way!
A Java class groups together the data needed to represent things of that
type with the operations defined for values of that type.
The class definition will include instance fields to store the data
representing things of that type.
The operations are just functions.
In OO terminology, these functions are called methods.
Classes also contain constructors, which directly correspond to the
constructor functions for algebraic data types in Haskell.
Java constructors create new objects that are instances of that class.
COMP 1100 — Classes & Objects 7
What’s an Object?
If a class is like a type, an object (of a class) is like a value (of that type).
In Haskell, a value is just the data.
In Java, an object also comes with its own copy of the methods of the class.
In Haskell, the result of functions is always new data.
In Java, methods may change the values stored in the instance fields.
Another viewpoint:
A class is like a template. An object is like an instance of that template.
COMP 1100 — Classes & Objects 8
An Example
Java allows classes to be grouped together and organised in packages.
The standard Java libraries include a package called java.awt
(awt stands for abstract windowing toolkit).
One of the classes in that package is java.awt.Rectangle which allows
us to represent rectangles in 2D coordinate geometry.
Objects of type Rectangle allow us to represent rectangles in Java
programs.
(Exercise: How would you represent rectangles in Haskell?)
COMP 1100 — Classes & Objects 9
java.awt.Rectangle Fields:
The position and the dimensions of the rectangle are specified:
int height;
int width;
int x; // x coord of top left corner
int y; // y coord of top left corner
(Sadly, we’re already seeing Java is not purely OO . . . )
COMP 1100 — Classes & Objects 10
java.awt.Rectangle Constructors:
In fact, the class provides 7 different constructors!
The most natural is this one:
Rectangle(int x, int y, int width, int height)
which just sets the instance fields to the values passed in as arguments to
the constructor.
Note: since this class is part of the Java libraries, we don’t see the
implementation of the constructors. That will have to wait until we build our
own classes.
COMP 1100 — Classes & Objects 11
How do we Use Constructors?
The expression:
new Rectangle(5, 10, 20, 30)
constructs a new Rectangle object at (5,10) with width 20 and height 30.
Usually, the resulting object is stored in a variable:
Rectangle box = new Rectangle(5, 10, 20, 30);
This statement declares a new variable box of type Rectangle, and stores
the newly constructed Rectangle object in that variable.
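As a small sketch, the constructor call can be tried in a complete program. The class name ConstructSketch and the helper makeBox are assumptions for illustration; Rectangle and its public fields x, y, width and height come from the Java library.

```java
import java.awt.Rectangle;

class ConstructSketch {
    // Builds the rectangle used in the slides.
    static Rectangle makeBox() {
        return new Rectangle(5, 10, 20, 30);
    }

    public static void main(String[] args) {
        Rectangle box = makeBox();
        // The instance fields of Rectangle are public, so they can be read directly.
        System.out.println(box.x + " " + box.y + " " + box.width + " " + box.height);
        // prints 5 10 20 30
    }
}
```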
COMP 1100 — Classes & Objects 12
java.awt.Rectangle Methods:
There are dozens of them!
(Again, since this is a library class, we don’t see the implementation of the
methods. Soon, we will look at building our own classes.)
COMP 1100 — Classes & Objects 13
Interlude — What’s a Method?
Methods are just the operators (functions) that allow us to manipulate
objects of this class (or equivalently, values of this type).
Methods come in two flavours:
Accessor Methods
Accessor methods return some information about the object without
changing it.
Mutator Methods
Mutator methods change the state (the instance fields) of the object.
COMP 1100 — Classes & Objects 14
(Back to the example. . . )
Some java.awt.Rectangle Accessor Methods:
It is common to provide accessor methods to return the values in each of
the instance fields:
double getX()
double getY()
double getHeight()
double getWidth()
(Returning double rather than int is a decision I’ll leave to the implementers to
explain.)
COMP 1100 — Classes & Objects 15
Some java.awt.Rectangle Mutator Methods:
To change the dimensions of the rectangle:
void setSize(int width, int height)
To move the rectangle a distance to the right and a distance down:
void translate(int x, int y)
Notice these methods do not return a result. They change the object.
This is a fundamental difference from the functional approach.
COMP 1100 — Classes & Objects 16
Calling Methods
Suppose we have constructed a new Rectangle object as before:
Rectangle box = new Rectangle(5, 10, 20, 30);
Remember that the methods are associated with the object.
If we want to access the x field of box we use the getX() method.
The syntax is like this:
box.getX()
which is an expression that will return the double value 5.0.
For example:
Double area = box.getHeight() * box.getWidth();
COMP 1100 — Classes & Objects 17
Calling Methods — ctd
How do we call mutator methods? For example, how do we use
void translate(int x, int y)
to change box?
Notice (again) from the type declaration that translate doesn’t return
another Rectangle.
To change box using the translate method we use the statement:
box.translate(15, 25);
(See the example test program: MoveTester.java.)
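A sketch contrasting the two kinds of call, assuming a hypothetical class MoveSketch with a helper method area (neither is taken from the course's own test program):

```java
import java.awt.Rectangle;

class MoveSketch {
    // Uses only accessor methods, so the rectangle passed in is left unchanged.
    static double area(Rectangle r) {
        return r.getHeight() * r.getWidth();
    }

    public static void main(String[] args) {
        Rectangle box = new Rectangle(5, 10, 20, 30);

        // Accessor calls are expressions: they return a value.
        System.out.println(area(box));      // prints 600.0

        // A mutator call is a statement: it returns nothing and changes box itself.
        box.translate(15, 25);
        System.out.println(box.getX());     // prints 20.0
        System.out.println(box.getY());     // prints 35.0
    }
}
```

After the call to translate there is still only one Rectangle object; its x and y fields have simply been overwritten.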
COMP 1100 — Classes & Objects 18
Tricky . . .
In Java, variables whose type is a class don’t actually hold an object — they
hold a reference to the object.
So What?
Consider:
Rectangle r1 = new Rectangle(5, 10, 20, 30);
Rectangle r2 = new Rectangle(5, 10, 20, 30);
Is r1 == r2 true or false?
COMP 1100 — Classes & Objects 19
A picture helps:
[Figure: r1 and r2 each refer to a separate Rectangle object with x = 5, y = 10, width = 20, height = 30.]
The rectangles are identical, but r1 and r2 refer to separate objects.
So r1 == r2 evaluates to false!
This is called reference equality.
COMP 1100 — Classes & Objects 20
How can we test whether two objects are equal?
There is a standard method equals(...)
In class Rectangle we have:
boolean equals(Object obj)
Now we can ask:
r1.equals(r2)
which evaluates to true.
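The difference between the two tests can be checked with a short sketch. The class EqualitySketch and its helper methods sameObject and sameContents are hypothetical names introduced for this example.

```java
import java.awt.Rectangle;

class EqualitySketch {
    // Reference equality: do both variables refer to the very same object?
    static boolean sameObject(Rectangle a, Rectangle b) {
        return a == b;
    }

    // Value equality: do the two objects hold the same field values?
    static boolean sameContents(Rectangle a, Rectangle b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Rectangle r1 = new Rectangle(5, 10, 20, 30);
        Rectangle r2 = new Rectangle(5, 10, 20, 30);
        Rectangle r3 = r1;   // copies the reference, not the object

        System.out.println(sameObject(r1, r2));    // prints false
        System.out.println(sameContents(r1, r2));  // prints true
        System.out.println(sameObject(r1, r3));    // prints true
    }
}
```

Because r3 holds the same reference as r1, a mutator call through r3 would also be visible through r1.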
COMP 1100 — Classes & Objects 21
Writing Our Own Classes
The Java libraries contain lots of useful classes.
But mostly they are general purpose in some way — they are a programming
toolkit.
When designing programs, we begin by identifying and representing the
“things” (the abstract data types) that our program needs to
manipulate.
O-O Ideas Come From Data-Directed Design
which we looked at a few weeks ago . . .
COMP 1100 — Classes & Objects 22
Recall the Data Directed Design Process:
1. Understand the Problem
2. Identify the (Classes of) Objects
3. Identify the Basic Operations on Objects
4. Choose Representations of the Objects
5. Implement the Basic Operations on Objects
6. Factor the Process into Manageable Parts
COMP 1100 — Classes & Objects 23
In O-O languages, steps 2–6 correspond to designing and
implementing classes.
1. Understand the Problem
2. Identify the (Classes of) Objects
3. Identify the Basic Operations on Objects
4. Choose Representations of the Objects
5. Implement the Basic Operations on Objects
6. Factor the Process into Manageable Parts
COMP 1100 — Classes & Objects 24
Example: Bank Account
Suppose that in the process of developing a particular system using the
DDD method, one of the abstract data types identified in step 2 is a bank
account.
(There are many different kinds of accounts, but let’s keep it simple.)
Step 3: Identify the Basic Operations
• Deposit money
• Withdraw money
• Check the current balance
(We will leave the matter of interest to the lab exercises.)
COMP 1100 — Classes & Objects 25
Making a deposit changes a bank account, so it will be a mutator method.
The argument to the deposit method will be the amount of money being
deposited.
Similarly withdraw is also a mutator method and its argument is the amount
of money being withdrawn.
Checking the current balance doesn’t change the account, so that will be an
accessor method.
COMP 1100 — Classes & Objects 26
These observations give us the following skeleton for the BankAccount
class:
class BankAccount {
    void withdraw(Double amount) ...
    void deposit(Double amount) ...
    Double getBalance() ...
}
COMP 1100 — Classes & Objects 27
Constructors
Setting up a new bank account is also a basic operation, so we also need to
consider the BankAccount constructors at this point.
Suppose we decide that a bank account can be set up with an initial deposit
of money, or (by default) with a balance of 0.
That suggests two constructors, one taking an argument being the initial
amount, the other taking no argument.
BankAccount(Double initialBalance) ...
BankAccount() ...
Notice that constructors never have a result type. Notice that constructors
always have the same name as the class.
COMP 1100 — Classes & Objects 28
The outline of the class now looks like:
class BankAccount {
    BankAccount(Double initialBalance) ...
    BankAccount() ...
    void withdraw(Double amount) ...
    void deposit(Double amount) ...
    Double getBalance() ...
}
We have now identified the basic operations on bank accounts and
determined their types.
COMP 1100 — Classes & Objects 29
The next step is . . .
Step 4: Choose a Representation
That is, what data do we need to represent a bank account?
For our simplified example, the only thing we need to keep track of is the
balance, which we have already decided will be a Double.
In OO languages, the data representation appears as the instance fields of
the class:
class BankAccount {
Double balance;
...
}
COMP 1100 — Classes & Objects 30
Finally . . .
Step 5: Implement the Basic Operations
That is, fill in the bodies of the methods and constructors we have identified.
The mutator methods (deposit, withdraw) will change the balance
field.
The accessor method (getBalance) will not change the balance field.
The constructor methods will initialise the balance field.
(BankAccount.java)
(BankAccountTester.java)
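One possible way to fill in the bodies is sketched below. It is an illustration built from the outline above, not the course's own BankAccount.java; the main method, the sample amounts and the absence of any overdraft check are assumptions.

```java
class BankAccount {
    Double balance;   // Step 4: the chosen representation

    // Constructors initialise the balance field.
    BankAccount(Double initialBalance) {
        balance = initialBalance;
    }

    BankAccount() {
        balance = 0.0;
    }

    // Mutator methods change the balance field.
    void deposit(Double amount) {
        balance = balance + amount;
    }

    void withdraw(Double amount) {
        balance = balance - amount;   // this sketch does not check for overdrafts
    }

    // The accessor method reports the balance without changing it.
    Double getBalance() {
        return balance;
    }

    public static void main(String[] args) {
        BankAccount account = new BankAccount(100.0);
        account.deposit(50.0);
        account.withdraw(30.0);
        System.out.println(account.getBalance()); // prints 120.0
    }
}
```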
COMP 1100 — Classes & Objects 31
Lists in Java
COMP1100 — Introduction to Programming and Algorithms
Reading: Big Java Ch.8.2 – 8.4
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Java Lists 1
Lists
Recall that lists allow us to combine a varying number of values of the
same type.
Lists are an extremely important data structure in programming, so we spent
a lot of time working with lists in Haskell.
In Haskell, lists are built in so we have some nice notation like [3,5,7] and
list comprehensions.
In Java, lists are supplied by the Java Class Libraries in the package
java.util.
COMP 1100 — Java Lists 2
Java Lists
The Java API has several different kinds of lists.
(For example, ArrayList, LinkedList, Vector, Stack . . . )
They all have a large collection of methods in common — see the
java.util.List interface.
So how do they differ?
The main difference is in terms of their implementations.
How do we know which version to choose for a particular application?
Wait until COMP1110/1510!
COMP 1100 — Java Lists 3
java.util.ArrayList
In this course we will choose to use ArrayList and not worry about its
implementation.
Some of the basic ArrayList methods:
add(elem) adds an element to the end of this list.
get(index) returns the element at position index in the list (positions start at 0).
isEmpty() tests if the list has no elements.
size() returns the number of elements in the list.
and so on.
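A short sketch exercising these four methods. The class name ListBasicsSketch, the helper sampleList and the sample strings are made up for illustration; the <String> part of the type says the list holds Strings.

```java
import java.util.ArrayList;

class ListBasicsSketch {
    // Builds a three-element list by adding to the end each time.
    static ArrayList<String> sampleList() {
        ArrayList<String> names = new ArrayList<String>();
        names.add("ann");
        names.add("bob");
        names.add("cat");
        return names;
    }

    public static void main(String[] args) {
        ArrayList<String> empty = new ArrayList<String>();
        System.out.println(empty.isEmpty());   // prints true

        ArrayList<String> names = sampleList();
        System.out.println(names.size());      // prints 3
        System.out.println(names.get(0));      // prints ann
        System.out.println(names.get(2));      // prints cat
    }
}
```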
COMP 1100 — Java Lists 4
Notice that the basic Java list operations have a different style from those we are
used to in Haskell.
In Haskell, we built lists by cons-ing elements on to the front of a list, like
(x:xs).
Java adds them to the end.
In Haskell we took lists apart by breaking them into their head and tail.
Java accesses list elements with the get method.
Haskell has an equivalent operator (!!) but we didn’t use it much.
COMP 1100 — Java Lists 5
java.util.Stack
The Stack version of lists in Java does have methods more like the
operations we are used to in Haskell.
peek is like head.
pop is like tail.
push is like cons.
It also has all the other standard Java list methods.
But Stacks are “special purpose” lists, so we won’t use them here.
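For comparison only, here is a sketch of the three Stack operations. The class name StackSketch and the helper sample are hypothetical. One difference from Haskell's tail is worth noting: pop changes the stack itself and returns the element it removed.

```java
import java.util.Stack;

class StackSketch {
    // Pushes 7, then 5, then 3, so 3 ends up on top (like 3:5:7:[] in Haskell).
    static Stack<Integer> sample() {
        Stack<Integer> s = new Stack<Integer>();
        s.push(7);
        s.push(5);
        s.push(3);
        return s;
    }

    public static void main(String[] args) {
        Stack<Integer> s = sample();
        System.out.println(s.peek());   // prints 3 (like head)
        s.pop();                        // removes the top element (like taking the tail)
        System.out.println(s.peek());   // prints 5
        System.out.println(s.size());   // prints 2
    }
}
```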
COMP 1100 — Java Lists 6
Generic Classes
The ArrayList class has heading ArrayList<E>.
What’s the <E> mean?
Since Java 1.5, classes may be generic, which is a similar concept to
Haskell’s polymorphic types.
In Haskell, if we want a list of Ints we write [Int].
If we want a list of Strings we write [String].
In Java, if we want a list of Integers we write ArrayList<Integer>.
If we want a list of Strings we write ArrayList<String>.
COMP 1100 — Java Lists 7
DrJava Demonstration . . .
COMP 1100 — Java Lists 8
Traversing Lists in Haskell
In Haskell many of the list algorithms we studied were structured around
traversing the list.
That is, the algorithm visited each element of the list (in order), processing
the elements in some way.
Sometimes we used a recursive definition, e.g.
sum [] = 0
sum (x:xs) = x + sum xs
Other times we used some of the higher-order functions that represent
patterns of recursion, like map, fold, filter and so on.
COMP 1100 — Java Lists 9
Traversing Lists in Java
In Java, we can also define recursive methods to traverse lists.
Java also has patterns of recursion, but they don’t occur as higher-order
functions.
Java’s patterns of recursion occur as language features.
In particular, Java has a variety of loop statements.
In this course, we will concentrate on a single loop statement: the “for
each” loop.
This loop corresponds exactly to a simple list traversal.
COMP 1100 — Java Lists 10
For-each Loop Structure
Suppose we have a list whose elements are objects of some class Elem.
That is, we have:
ArrayList<Elem> list = ...
To traverse list the loop structure is:
for (Elem elt : list) {
...
}
Inside the loop, we can refer to the current element of the list as elt.
COMP 1100 — Java Lists 11
Example: Summing a List of Integers
Suppose we have a list of integers:
ArrayList<Integer> list
To sum them, we add each one in turn to a running total, like this:
Integer sum() {
Integer total = 0;
for (Integer x: list) {
total = total + x;
}
return total;
}
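To run the loop, the method needs a class around it and a list to work on. The sketch below assumes a hypothetical class SumSketch holding the list as an instance field; the sample numbers are arbitrary.

```java
import java.util.ArrayList;

class SumSketch {
    // The list being summed, stored as an instance field.
    ArrayList<Integer> list = new ArrayList<Integer>();

    Integer sum() {
        Integer total = 0;
        for (Integer x : list) {   // visits each element in order
            total = total + x;     // add the current element to the running total
        }
        return total;
    }

    public static void main(String[] args) {
        SumSketch s = new SumSketch();
        s.list.add(3);
        s.list.add(5);
        s.list.add(7);
        System.out.println(s.sum()); // prints 15
    }
}
```

On an empty list the loop body never runs, so sum returns the initial total 0, matching the base case sum [] = 0 in Haskell.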
COMP 1100 — Java Lists 12
Accumulating Parameters in Haskell
Loops like the one in sum above correspond closely to the approach of using
accumulating parameters in Haskell definitions.
For example, a few weeks ago we looked at this version of a function to sum
the elements of a list:
addUp [] total = total
addUp (x:xs) total = addUp xs (total+x)
sum xs = addUp xs 0
COMP 1100 — Java Lists 13
Accumulating Parameters (ctd)
Corresponding to the parameter total in the Haskell function,
the Java version has a variable total.
In the Haskell version, we change the total parameter at each step to
(total+x). That is, we add the current element x to total.
In the Java version, we change the total variable at each step with the
statement, total = total+x. That is, we add the current element x to
total.
In the Haskell version, we initialise the parameter total to 0 in the
“wrapper” function, sum.
In the Java version, we initialise the variable total to 0 with the assignment
total = 0.
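The correspondence can be made more direct by writing addUp itself as a recursive Java method. This is only a sketch: the class name AddUpSketch and the extra index parameter i (standing in for "the rest of the list") are assumptions, since an ArrayList is not taken apart into head and tail.

```java
import java.util.ArrayList;

class AddUpSketch {
    // Mirrors the Haskell definition:
    //   addUp []     total = total
    //   addUp (x:xs) total = addUp xs (total+x)
    // The elements from position i onwards play the role of the remaining list.
    static Integer addUp(ArrayList<Integer> list, int i, Integer total) {
        if (i == list.size()) {
            return total;                               // nothing left: return the accumulator
        }
        return addUp(list, i + 1, total + list.get(i)); // move on with an updated accumulator
    }

    // The "wrapper" that starts the accumulator at 0.
    static Integer sum(ArrayList<Integer> list) {
        return addUp(list, 0, 0);
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<Integer>();
        list.add(3);
        list.add(5);
        list.add(7);
        System.out.println(sum(list)); // prints 15
    }
}
```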
COMP 1100 — Java Lists 14
Other examples in ListInt.java
COMP 1100 — Java Lists 15
The next day . . .
COMP 1100 — Java Lists 16
Comparing Java Classes to Haskell Modules
In this lecture we will compare typical (good) Haskell and Java
implementations of:
• Cartesian coordinate points
• Triangles (determined by their 3 vertices)
• Paths (i.e. sequences of points)
The aim is to emphasise their commonalities and their differences.
COMP 1100 — Java Lists 17
Points
In Haskell, a good module structure will gather together the representation of
points with the operations on points into a module.
A natural representation of a point is as a pair:
module Point where
type Point = (Double, Double)
...
COMP 1100 — Java Lists 18
In Java, we will develop a class to represent points and the operations
(methods) on points.
Since the class “is” the type, it will include fields to represent the data – i.e.
the x and y components:
class Point {
    Double x;
    Double y;
    ...
COMP 1100 — Java Lists 19
Constructors
In Haskell we can just write values and expressions of type Point. For
example, (0.0,3.0).
In Java we must explicitly construct objects of class Point. The class has a
constructor:
Point(Double xVal, Double yVal) {
    x = xVal;
    y = yVal;
}
which we use like this:
Point p = new Point(0.0, 3.0);
COMP 1100 — Java Lists 20
Accessing Fields
In Haskell, we can access the x and y components of values of type Point
using patterns, e.g.:
distance (x1,y1) (x2,y2) = ...
In Java, we provide explicit accessor methods:
public Double xCoord() {
    return x;
}
public Double yCoord() {
    return y;
}
COMP 1100 — Java Lists 21
Functions vs. Methods
In Haskell, we write functions to operate on points. For example, to
compute an x,y translation of a point:
translatePt :: Double -> Double -> Point -> Point
translatePt xDist yDist (x,y) = (x+xDist, y+yDist)
Notice that there is an input Point and an output Point, which is the
result of the translation.
In Java we write methods to manipulate points.
Point objects may be changed by mutator methods.
COMP 1100 — Java Lists 22
The translate method is part of each Point object.
The translate method changes the point.
The point is not passed as an argument to the method — the point being
affected is this point.
The method does not return a result — it affects this point.
public void translate(Double xDist, Double yDist) {
    x = x + xDist;
    y = y + yDist;
}
The method changes the x and y fields of this object.
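Putting the pieces from the last few slides together gives a small, runnable sketch of the Point class. The driver class PointDemo is invented here to show the mutation happening; everything inside Point is assembled from the fragments shown above.

```java
class PointDemo {
    public static void main(String[] args) {
        Point p = new Point(0.0, 3.0);
        p.translate(1.5, -1.0);  // changes p itself; nothing is returned
        System.out.println(p.xCoord() + ", " + p.yCoord());  // prints 1.5, 2.0
    }
}

class Point {
    Double x;
    Double y;

    // Constructor: builds a point from its two components.
    Point(Double xVal, Double yVal) {
        x = xVal;
        y = yVal;
    }

    // Accessor methods.
    public Double xCoord() {
        return x;
    }

    public Double yCoord() {
        return y;
    }

    // Mutator method: moves this point by (xDist, yDist).
    public void translate(Double xDist, Double yDist) {
        x = x + xDist;
        y = y + yDist;
    }
}
```

Unlike translatePt in Haskell, there is no second point here: after the call, the original coordinates of p are gone.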
COMP 1100 — Java Lists 23
Paths
A path is a sequence of points, so in Haskell a natural representation is as a
list of points:
module Path where
type Path = [Point]
...
In Java, the data field in the Path class will be a list of points:
class Path {
    ArrayList<Point> path;
    ...
COMP 1100 — Java Lists 24
Constructing Paths
In Haskell we may simply write values of type Path, e.g.
[(0.0,0.0),(1.0,0.0),(0.0,1.0)]
In Java we must explicitly construct new Path objects:
Path() {
    path = new ArrayList<Point>();
}
which gives us an empty path.
COMP 1100 — Java Lists 25
We also need to provide a mutator method to build paths point by point:
public void extend(Point point) {
    path.add(point);
}
To build a path equivalent to the Haskell one
[(0.0,0.0),(1.0,0.0),(0.0,1.0)]:
Path myPath = new Path();
myPath.extend(new Point(0.0,0.0));
myPath.extend(new Point(1.0,0.0));
myPath.extend(new Point(0.0,1.0));
(Whew!)
COMP 1100 — Java Lists 26
Functions vs. Methods
In the Haskell Point module we wrote a function to translate a single point.
To translate a Path we want to apply the same translation to each point. The
obvious approach is to use the map higher-order function:
translate :: Double -> Double -> Path -> Path
translate xDist yDist path =
    map (translatePt xDist yDist) path
The corresponding “pattern of recursion” in Java is the for-each loop.
COMP 1100 — Java Lists 27
A loop beginning with:
for (Point point: path)
will traverse the path list. At each step, the current element of the list will be
referenced by the Point point variable.
So applying the translate method to point inside the loop will have the same
effect as map in Haskell: to apply the method to every element of the list:
void translate(Double xDist, Double yDist) {
    for (Point point : path)
        point.translate(xDist, yDist);
}
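The Path fragments above can be assembled into a runnable sketch like the following. The PathDemo driver and the describe method are hypothetical additions, included only so the result of the traversal can be printed; Point is a cut-down copy of the class from the earlier slides.

```java
import java.util.ArrayList;

class PathDemo {
    public static void main(String[] args) {
        Path myPath = new Path();
        myPath.extend(new Point(0.0, 0.0));
        myPath.extend(new Point(1.0, 0.0));
        myPath.extend(new Point(0.0, 1.0));
        myPath.translate(2.0, 3.0);
        System.out.println(myPath.describe());  // prints (2.0,3.0) (3.0,3.0) (2.0,4.0)
    }
}

class Point {
    Double x;
    Double y;

    Point(Double xVal, Double yVal) { x = xVal; y = yVal; }

    public Double xCoord() { return x; }
    public Double yCoord() { return y; }

    public void translate(Double xDist, Double yDist) {
        x = x + xDist;
        y = y + yDist;
    }
}

class Path {
    ArrayList<Point> path;

    Path() {
        path = new ArrayList<Point>();
    }

    public void extend(Point point) {
        path.add(point);
    }

    // The for-each loop visits every point, as map does in Haskell.
    void translate(Double xDist, Double yDist) {
        for (Point point : path)
            point.translate(xDist, yDist);
    }

    // Hypothetical helper (not from the lecture code): renders the path as text.
    String describe() {
        String out = "";
        for (Point point : path) {
            out = out + "(" + point.xCoord() + "," + point.yCoord() + ") ";
        }
        return out.trim();
    }
}
```

One difference from map is worth noticing: the Haskell function returns a new path and leaves the old one alone, while this loop changes the points of the existing path in place.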
COMP 1100 — Java Lists 28
Assignment 3
The representation of PPM images in Assignment 3 uses lists of lists of
pixels, just as we did in Assignment 1. In Java it looks like this:
class PPM {
    int width;
    int height;
    ArrayList<ArrayList<Pixel>> image;
Most of the manipulations are list traversals, so the for-each loop is a good
choice.
COMP 1100 — Java Lists 29
On the other hand, sometimes we want more control.
For example, in the flip manipulations we may want to go through the rows
or pixels in reverse order.
In that case, the more general (and standard) version of the for-loop can be
more appropriate.
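As an illustration of the more general for-loop, here is a minimal sketch that builds a reversed copy of one row by counting an index down from the last position. Integers stand in for pixels, since the Pixel class is not shown in these slides, and the names FlipDemo and flipRow are invented for the example rather than taken from Assignment 3.

```java
import java.util.ArrayList;

class FlipDemo {
    // Returns a new list holding the elements of row in reverse order.
    static ArrayList<Integer> flipRow(ArrayList<Integer> row) {
        ArrayList<Integer> flipped = new ArrayList<Integer>();
        // Start at the last index and step backwards to index 0.
        for (int i = row.size() - 1; i >= 0; i--) {
            flipped.add(row.get(i));
        }
        return flipped;
    }

    public static void main(String[] args) {
        ArrayList<Integer> row = new ArrayList<Integer>();
        row.add(1);
        row.add(2);
        row.add(3);
        System.out.println(flipRow(row));  // prints [3, 2, 1]
    }
}
```

The three parts of the loop header (start, test, step) give the control over traversal order that the for-each loop hides.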
COMP 1100 — Java Lists 30
Public, Private, Static
and all that . . .
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Public, Private, Static . . . 1
The Interface to a Class
In the data directed design approach, a key step was:
Identify the Basic Operations on Objects
These basic operations become the methods of the class.
Users of such objects are often called client classes.
Client classes access and manipulate objects through the methods. . .
. . . but they may also be able to directly access and modify the fields of
objects.
COMP 1100 — Public, Private, Static . . . 2
Controlling Client Access
Often we want to limit the ways in which clients can access and manipulate
objects. For example, some methods of a class may be intended for use only by
other methods of the class — that is, they are “helper” functions.
As an example in Assignment 3, you may want a function which doubles a
row of an image:
ArrayList<Pixel> doubleRow(ArrayList<Pixel> row)
This may be used internally to help define the method to double the scale of
an image, but you don’t want to make this function available to the client.
COMP 1100 — Public, Private, Static . . . 3
Controlling Client Access
Almost always we want to stop clients directly accessing the fields of an
object.
Why?
• Because how we choose to represent the data is our choice, not the
clients’.
• In the design process we identified the operations that were needed
(and made sense) on objects of this class. That is, the methods of the
class must be sufficient for all proper uses of the object.
• Giving clients access to the fields means they can circumvent the proper
use of the object (via the methods).
• Sometimes, controlling access to fields is a security issue.
COMP 1100 — Public, Private, Static . . . 4
An Example
In a recent lecture and prac class we looked at a simple bank account class,
BankAccount.java.
The operations we identified were
• Create a new account, BankAccount(Double initialBalance)
• Get the current balance, Double getBalance()
• Make a deposit, void deposit(Double amount)
• Make a withdrawal, void withdraw(Double amount)
so we expect that clients will use only those methods to work with bank
accounts.
COMP 1100 — Public, Private, Static . . . 5
However. . .
There is nothing stopping clients accessing the balance field directly!
What’s wrong with that?
Unscrupulous programmers can defraud the bank by changing their account
balance without going through the proper channels (deposit and withdrawal).
COMP 1100 — Public, Private, Static . . . 6
Still not convinced?
Let’s extend the BankAccount class to include a record of all transactions.
Each time a deposit is made, record the amount, and each withdrawal is
recorded as a negative quantity.
ArrayList<Double> transactions;
We also want a method to check that the current balance is consistent with
the transaction record.
(See OpenAccount.java)
Will that stop the fraud? No, if those naughty programmers can change the
balance field, they can also change the transactions field.
(See Naughty.java)
COMP 1100 — Public, Private, Static . . . 7
Public and Private
Most programming languages provide a means of controlling which
aspects of a module or class are accessible to client classes.
Java has “access modifiers”: public and private.
If a field or method declaration is preceded by private, it cannot be
accessed or modified by client classes.
If a field or method declaration is preceded by public, it can be accessed
and modified by all other classes.
(If we don’t specify public or private, they may be accessed and modified by all other
classes in the same package; for our programs, that means classes in the same directory.)
COMP 1100 — Public, Private, Static . . . 8
Make All Fields Private
It is considered good practice to make all fields private. This ensures that the
objects of the class are only accessed and manipulated by the methods
provided.
Reworking the bank account example (again), we see that making the
balance and transactions fields private (and the methods public)
prevents improper access and manipulation.
(See SecureAccount.java)
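A minimal sketch of the idea follows. It is not the course's SecureAccount.java: the class name Account, the method isConsistent and the choice to record the opening balance as the first transaction are all assumptions made for this illustration.

```java
import java.util.ArrayList;

class AccountDemo {
    public static void main(String[] args) {
        Account acct = new Account(100.0);
        acct.deposit(2000.0);
        acct.withdraw(500.0);
        System.out.println(acct.getBalance());    // prints 1600.0
        System.out.println(acct.isConsistent());  // prints true
        // acct.balance = 1000000.0;  // would not compile: balance is private
    }
}

class Account {
    private Double balance;
    private ArrayList<Double> transactions;

    public Account(Double initialBalance) {
        balance = initialBalance;
        transactions = new ArrayList<Double>();
        // Assumption: the opening balance counts as the first deposit.
        transactions.add(initialBalance);
    }

    public Double getBalance() {
        return balance;
    }

    public void deposit(Double amount) {
        balance = balance + amount;
        transactions.add(amount);
    }

    public void withdraw(Double amount) {
        balance = balance - amount;
        transactions.add(-amount);  // withdrawals are recorded as negative
    }

    // Checks that the balance agrees with the sum of the transaction record.
    public boolean isConsistent() {
        Double total = 0.0;
        for (Double t : transactions) {
            total = total + t;
        }
        return Math.abs(total - balance) < 1e-9;
    }
}
```

Because both fields are private, the only way a client can change the balance is through deposit and withdraw, and those always update the transaction record as well.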
COMP 1100 — Public, Private, Static . . . 9
Make “Local” Methods Private
In the PPM example mentioned above, we don’t want clients to have access
to the doubleRow method because it’s a helper function only intended for
internal use.
Note also that it is not one of the operations that would be identified as a
basic operation on PPM images.
By making the method private we prevent clients from calling it:
private ArrayList<Pixel> doubleRow(ArrayList<Pixel> row)
Also, when client programmers see that the method is private, they know
that they can ignore it completely.
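Here is a small sketch of a public operation built on a private helper. It is not the Assignment 3 code: integers stand in for pixels, the class TinyImage is hypothetical, and "doubling the scale" is assumed to mean that every pixel becomes a 2 by 2 block.

```java
import java.util.ArrayList;

class ScaleDemo {
    public static void main(String[] args) {
        ArrayList<Integer> row = new ArrayList<Integer>();
        row.add(1);
        row.add(2);
        ArrayList<ArrayList<Integer>> rows = new ArrayList<ArrayList<Integer>>();
        rows.add(row);
        TinyImage img = new TinyImage(rows);
        img.doubleScale();
        System.out.println(img);  // prints [[1, 1, 2, 2], [1, 1, 2, 2]]
        // img.doubleRow(row);    // would not compile: doubleRow is private
    }
}

class TinyImage {
    private ArrayList<ArrayList<Integer>> image;

    public TinyImage(ArrayList<ArrayList<Integer>> rows) {
        image = rows;
    }

    // Public operation: each row is widened by the helper, then added twice.
    public void doubleScale() {
        ArrayList<ArrayList<Integer>> result = new ArrayList<ArrayList<Integer>>();
        for (ArrayList<Integer> row : image) {
            result.add(doubleRow(row));
            result.add(doubleRow(row));
        }
        image = result;
    }

    // Private helper: repeats every element of the row once.
    private ArrayList<Integer> doubleRow(ArrayList<Integer> row) {
        ArrayList<Integer> doubled = new ArrayList<Integer>();
        for (Integer p : row) {
            doubled.add(p);
            doubled.add(p);
        }
        return doubled;
    }

    public String toString() {
        return image.toString();
    }
}
```

Clients see only the constructor, doubleScale and toString; the helper can be renamed or rewritten later without affecting them.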
COMP 1100 — Public, Private, Static . . . 10
Static
(Treat this as an advanced topic — it will not be in the final exam.)
In earlier lectures I stressed the point that in Java:
An object has its own copies of the methods and fields of the class.
For example,
BankAccount jimsSaving = new BankAccount(0.0);
jimsSaving.deposit(2000.0);
jimsSaving.withdraw(500.0);
System.out.println(jimsSaving.getBalance());
COMP 1100 — Public, Private, Static . . . 11
I lied. . .
We have written things like
Math.sqrt((x - p.xCoord())*(x - p.xCoord()) ...
Math.cos(theta) ...
but Math is a class, not an object. (See java.lang.Math.)
We didn’t say
Math thing = new Math();
thing.sqrt(...
What’s going on?
COMP 1100 — Public, Private, Static . . . 12
The Truth
(According to Clem. . . )
In an earlier lecture I also made the point that Java classes are like Haskell
types and also like Haskell modules.
Sometimes we use modules to group together logically related functions
and operations. In that case they don’t correspond to abstract types in any
sense.
Since Java conflates (confuses?) the two ideas, we have to use classes for
both purposes.
COMP 1100 — Public, Private, Static . . . 13
The java.lang.Math library is a good example.
It doesn’t represent a type — it is a collection of mathematical operators.
In that case, it doesn’t make sense to construct an object of class Math.
We want a way to refer to the methods of these module-like classes.
Putting static in front of the declaration of methods and fields of a
class means that they are associated with the class, not an object.
To refer to a static method, we use the class name, not an object.
COMP 1100 — Public, Private, Static . . . 14
All the methods in java.lang.Math are declared static so we write
things like
Math.sqrt(...)
We never write
Math thing = new Math();
(In fact java.lang.Math doesn’t have any public constructors, which makes sense.)
Find some other Java library classes like this — there are plenty of them.
(In tomorrow’s lecture, we will develop our own module-style class.)
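A minimal sketch of a module-style class of our own might look like this. The class Geometry and its two methods are invented for illustration; the private constructor is one common way to prevent anyone from constructing objects of such a class.

```java
class StaticDemo {
    public static void main(String[] args) {
        // Static methods are called through the class name, not an object.
        System.out.println(Geometry.distance(0.0, 0.0, 3.0, 4.0));  // prints 5.0
        System.out.println(Geometry.circleArea(1.0) == Math.PI);    // prints true
    }
}

class Geometry {
    // No objects of this class are ever needed, so hide the constructor.
    private Geometry() { }

    // Straight-line distance between (x1,y1) and (x2,y2).
    static double distance(double x1, double y1, double x2, double y2) {
        double dx = x2 - x1;
        double dy = y2 - y1;
        return Math.sqrt(dx * dx + dy * dy);
    }

    // Area of a circle with the given radius.
    static double circleArea(double radius) {
        return Math.PI * radius * radius;
    }
}
```

Geometry groups related functions in the way a Haskell module would; it does not describe a type of object.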
COMP 1100 — Public, Private, Static . . . 15
Structuring Java Programs
— Data Directed Design —
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Structuring Java Programs 1
Announcements
• If you’re still having trouble installing Java on your Windows machine, I
have put some more detailed instructions on the course web site, linked
from the Java page.
• We need a 1st year representative for the Department of Computer
Science Forum.
– Solicit views of students (e.g. on 1100.talk and 1200.talk phorums),
then . . .
– Offer feedback on DCS educational issues (teaching infrastructure,
policies etc.)
– Next meeting: 4–5pm, 31st May, N101 (One per semester)
– Contact me or Peter Strazdins.
COMP 1100 — Structuring Java Programs 2
The Supermarket Docket again . . . but in Java
In an attempt to bring all these Java threads together, today we will go
through the development of the supermarket docket program we wrote in
Haskell, weeks ago.
Reminder: The checkout scanner produces a sequence of bar codes. Our
program is to look up each of those codes in a database to retrieve the
information about the product, then generate an itemised docket including
the total price.
COMP 1100 — Structuring Java Programs 3
For example, if the scanner produces a sequence of bar codes like:
4719 1112 1113 3814 1234
our program will print something like:
Gosling’s Sunny-Mart
Frozen Pizza..............6.49
Mars Bar..................1.60
Unknown Item..............0.00
Hokkien Noodles...........2.05
Chianti, 1lt.............17.95
Total...................$28.09
COMP 1100 — Structuring Java Programs 4
Summary of Data Directed Design Process
Here we go again. . .
1. Understand the Problem
2. Identify the Abstract Data Types
3. Identify the Basic Operations on the ADTs
4. Choose Representations of the ADTs
5. Implement the Basic Operations on the ADTs
6. Factor the Process into Manageable Parts
COMP 1100 — Structuring Java Programs 5
2. Identify the ADTs
• Products
• Bar codes
• Prices
• Names (of products)
• Database
• Docket
• Sequence of bar codes from scanner
COMP 1100 — Structuring Java Programs 6
3. Identify the Basic Operations on the ADTs
Products:
• What is its bar code? Name? Price?
• Construct a new product record.
Bar codes
• Equality comparison.
• Anything else?
Prices
• Arithmetic.
COMP 1100 — Structuring Java Programs 7
Names of products
• Print or display them.
Database
• Retrieve a product, given a bar code.
• Add a new product to the database.
• Delete a product, etcetera, etcetera!!
Docket
• Print or display.
Bar code sequences
• Traverse.
COMP 1100 — Structuring Java Programs 8
4. Choose Representations of the ADTs
Let’s do the easy ones first:
• Dockets and Names can just be strings.
• Prices can just be integers.
• Bar codes?
Some context information: the UPC (Universal Product Code) is a
sequence of 12 digits. It is not a number, so representing them as
integers is not a good idea, especially on a computer.
But this is just a demonstration exercise, so we will ignore reality and
represent bar codes as integers.
COMP 1100 — Structuring Java Programs 9
Products:
Considering the operations we have identified, a simple class with fields for
the product’s bar code, name and price is an obvious choice.
Database:
There are lots of choices, and lots of advantages and disadvantages of
different representations. But we don’t know about that yet. . .
Choose something we do know about. We know how to search through a
list, so we will represent the database as a list of products.
(In next week’s lab, you are invited to use a hash table instead.)
So, we need to implement two Java classes:
Product.java
DB.java
COMP 1100 — Structuring Java Programs 10
Sequence of bar codes:
Since all we want to do is traverse the sequence, a list of bar codes suffices.
An interesting question is whether we really need to physically represent
these sequences; that is, do they need to be stored in the program?
From one viewpoint, the answer is no — the scanner gives us one bar code
of the sequence at a time and we process them as they arrive. The
sequence is purely transitory.
However, for the purpose of testing (during development of the program) it
may be convenient to set up fixed dummy lists of bar codes, rather than have
to worry about generating them dynamically.
That’s what we’ll do today.
COMP 1100 — Structuring Java Programs 11
5. Implement the Basic Operations on the ADTs
We have identified the operations and the representations for each ADT.
This part is just writing code. . .
COMP 1100 — Structuring Java Programs 12
6. Factor the Process into Manageable Parts
For each bar code:
1. Look it up in the database to retrieve the product information;
2. Format (and print?) a line of the docket
3. Add the product price to the total
After processing all of the bar codes, format and print the total cost line of
the docket.
We also want to print a header for the docket first.
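The steps above can be sketched as a loop over a dummy list of bar codes. This is an illustration under stated assumptions, not the lecture's Product.java, DB.java or Test.java: the method names, the choice to store prices as whole cents, and the "Unknown Item" placeholder returned for missing codes are all made up here. The product data is taken from the sample docket, and formatting is left out.

```java
import java.util.ArrayList;

class DocketSketch {
    // Processes each bar code in turn and returns the total price in cents.
    static int processCodes(DB db, ArrayList<Integer> codes) {
        int total = 0;
        for (Integer code : codes) {
            Product p = db.lookup(code);                           // 1. look it up
            System.out.println(p.getName() + " " + p.getPrice());  // 2. docket line (unformatted)
            total = total + p.getPrice();                          // 3. add to the total
        }
        return total;
    }

    public static void main(String[] args) {
        DB db = new DB();
        db.add(new Product(4719, "Frozen Pizza", 649));
        db.add(new Product(1112, "Mars Bar", 160));
        db.add(new Product(3814, "Hokkien Noodles", 205));
        db.add(new Product(1234, "Chianti, 1lt", 1795));

        ArrayList<Integer> codes = new ArrayList<Integer>();
        codes.add(4719);
        codes.add(1112);
        codes.add(1113);  // not in the database
        codes.add(3814);
        codes.add(1234);
        System.out.println("Total " + processCodes(db, codes));  // prints Total 2809
    }
}

class Product {
    private int barCode;
    private String name;
    private int price;  // assumption: price in cents

    public Product(int barCode, String name, int price) {
        this.barCode = barCode;
        this.name = name;
        this.price = price;
    }

    public int getBarCode() { return barCode; }
    public String getName() { return name; }
    public int getPrice() { return price; }
}

class DB {
    private ArrayList<Product> products = new ArrayList<Product>();

    public void add(Product p) {
        products.add(p);
    }

    // Linear search through the list; unknown codes get a zero-priced placeholder.
    public Product lookup(int barCode) {
        for (Product p : products) {
            if (p.getBarCode() == barCode) {
                return p;
            }
        }
        return new Product(barCode, "Unknown Item", 0);
    }
}
```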
COMP 1100 — Structuring Java Programs 13
The sequence of actions making up the process will appear in the driver
class. Since we are only going as far as testing, we’ll call it Test.java.
Since it’s a test program, we’ll construct a small sample database, and a
dummy list of bar codes representing the sequence of scanner inputs.
The repeated actions for each bar code will be in a loop traversing the list of
bar codes.
But first...
COMP 1100 — Structuring Java Programs 14
A Docket Module
Notice that in the process sequence we had three activities related to
formatting stuff for the docket:
• Format each line representing a purchase
• Format the total cost line
• Format a header for the docket
It makes sense (to me at least) to collect these operations together, but they
aren’t really operations on an ADT — we want a module, but all we have in
Java is classes.
COMP 1100 — Structuring Java Programs 15
In yesterday’s lecture we saw how the Math library class is this kind of
module: a class where all methods are static, so they are related to the
class, and we don’t construct objects of such a class.
We can build a module Docket.java with all methods declared static.
It also has some helper methods for internal use, so they are declared
private.
(As is often the case with formatting stuff, it gets a bit messy and tedious, so don’t
worry too much about the code. . . )
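For a rough idea of what such a class could look like, here is a hedged sketch; it is not the lecture's Docket.java. The method names, the line width of 30 characters (read off the sample docket) and the representation of prices as whole, non-negative cents are all assumptions.

```java
class DocketDemo {
    public static void main(String[] args) {
        System.out.println(Docket.header());
        System.out.println(Docket.itemLine("Frozen Pizza", 649));
        System.out.println(Docket.itemLine("Mars Bar", 160));
        System.out.println(Docket.totalLine(809));
    }
}

class Docket {
    private static final int WIDTH = 30;  // assumed width of a docket line

    public static String header() {
        return "Gosling's Sunny-Mart";
    }

    // One purchase: name on the left, price on the right, dots in between.
    public static String itemLine(String name, int cents) {
        return pad(name, money(cents));
    }

    public static String totalLine(int cents) {
        return pad("Total", "$" + money(cents));
    }

    // Helper: renders non-negative cents as dollars, e.g. 649 becomes "6.49".
    private static String money(int cents) {
        int c = cents % 100;
        return (cents / 100) + "." + (c < 10 ? "0" : "") + c;
    }

    // Helper: joins left and right with enough dots to fill WIDTH characters.
    private static String pad(String left, String right) {
        StringBuilder line = new StringBuilder(left);
        while (line.length() + right.length() < WIDTH) {
            line.append('.');
        }
        return line.append(right).toString();
    }
}
```

The three public methods match the three formatting activities listed earlier; money and pad are private because only the class itself needs them.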
COMP 1100 — Structuring Java Programs 16
A Haiku . . .
Nice little program.
We thought about its structure.
I feel satisfied.
COMP 1100 — Structuring Java Programs 17
Algebraic Data Types in Java
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Algebraic Data Types in Java 1
Algebraic Types in Haskell
Recall that one of the ways we can define our own types in Haskell is as
algebraic data types.
The simplest kind of algebraic types are enumerations. For example:
data Day = Sun|Mon|Tue|Wed|Thu|Fri|Sat
data Temp = Cold|Hot
data Season = Spring|Summer|Autumn|Winter
Can we do that in Java?
COMP 1100 — Algebraic Data Types in Java 2
Enumerations in Java
enum Day {SUN, MON, TUE, WED, THU, FRI, SAT};
enum Temp {HOT, COLD};
enum Season {SPRING, SUMMER, AUTUMN, WINTER};
(It is a convention to write constants in upper case. It is not a rule.)
COMP 1100 — Algebraic Data Types in Java 3
Java enum actually defines a new class, so you have to write
Season.SPRING.
Temp weather(Season s) {
    if (s.equals(Season.SUMMER))
        return Temp.HOT;
    else
        return Temp.COLD;
}
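A small runnable variation on the weather method is sketched below. It uses a switch statement, which reads a little more like a case-by-case Haskell definition, and the values() method that every Java enum provides to visit all the seasons. The wrapper class EnumDemo is invented for the example.

```java
class EnumDemo {
    enum Season { SPRING, SUMMER, AUTUMN, WINTER }
    enum Temp { HOT, COLD }

    // One case per alternative we care about, with a default for the rest.
    static Temp weather(Season s) {
        switch (s) {
            case SUMMER:
                return Temp.HOT;
            default:
                return Temp.COLD;
        }
    }

    public static void main(String[] args) {
        // Season.values() lists the constants in declaration order.
        for (Season s : Season.values()) {
            System.out.println(s + " -> " + weather(s));
        }
    }
}
```

Inside the switch the constants are written without the Season prefix; everywhere else the full name, such as Season.SUMMER, is needed.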
COMP 1100 — Algebraic Data Types in Java 4
Union Types
Haskell’s algebraic data types can be more interesting than simple
enumerations: the alternatives can be structured.
Here is a type of different shapes, together with their dimensions:
data Shape = Circle Float
           | Rectangle Float Float
           | Ellipse Float Float
           | Square Float
For example, (Rectangle 12.0 8.0) represents a rectangle of length
12.0 and height 8.0.
COMP 1100 — Algebraic Data Types in Java 5
In Haskell we define functions on union types, case by case:
area :: Shape -> Float
area (Circle rad) = pi * rad^2
area (Rectangle l b) = l * b
area (Ellipse maj min) = pi * maj * min
area (Square side) = side^2
COMP 1100 — Algebraic Data Types in Java 6
Can we do that in Java?
Yes, but of course it looks a bit different.
Pick out just one alternative first — Rectangle
Based on the Haskell code on previous slides, rectangles alone can be
represented as:
data Rectangle = Rectangle Float Float
and we can define functions like this:
area :: Rectangle -> Float
area (Rectangle l b) = l * b
(See Rectangle.hs)
COMP 1100 — Algebraic Data Types in Java 7
How to Represent Rectangles Alone in Java?
One of the first Java classes we looked at was java.awt.Rectangle.
Here we will build our own version:
class Rectangle {
    Double length, height;
    public Rectangle(Double l, Double h) {
        length = l;
        height = h;
    }
    ...
COMP 1100 — Algebraic Data Types in Java 8
Notice that in Haskell we refer to Rectangle (in the expression Rectangle
12.0 8.0) as a constructor function, since it constructs a rectangle.
This corresponds exactly to the constructor (method) in the Java
rectangle class. The expression new Rectangle(12.0,8.0) constructs
the same rectangle.
Corresponding to the functions we define in Haskell, we have methods like
public Double area() {
    return length * height;
}
and so on.
COMP 1100 — Algebraic Data Types in Java 9
Lesson:
We can build a Java class corresponding to each of the alternatives in
a Haskell algebraic type.
(Circle.java, Rectangle.java, Square.java, Ellipse.java)
In Haskell, that’s like
data Circle = Circle Float
data Rectangle = Rectangle Float Float
data Ellipse = Ellipse Float Float
data Square = Square Float
which is not quite what we want.
COMP 1100 — Algebraic Data Types in Java 10
In Haskell we can have variables of type Shape which may be circles,
rectangles, ellipses or squares.
How can we do that in Java?
Abstract Classes
Abstract classes don’t have objects.
Abstract classes can be “extended” by concrete classes of the same
“signature.” That is, the concrete classes implement all the methods of the
abstract class.
(See Shape.java)
COMP 1100 — Algebraic Data Types in Java 11
The methods in Shape.java are declared abstract and don’t have bodies.
Each of the concrete classes Circle.java, Rectangle.java,
Square.java and Ellipse.java starts like this:
class Rectangle extends Shape {
to indicate that it is related to Shape.
We can have variables, parameters and function results of type Shape.
For example:
public Shape normalise() {
which will return an object of class Circle, Rectangle, Square or
Ellipse.
So we have achieved something corresponding to the Haskell algebraic data
type.
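The arrangement described above can be sketched as follows. This is a cut-down illustration rather than the course's Shape.java and friends: only area and isRound are included, the fields use double, and the ShapeDemo driver is invented to show a single variable of type Shape holding different kinds of shape.

```java
import java.util.ArrayList;

class ShapeDemo {
    public static void main(String[] args) {
        ArrayList<Shape> shapes = new ArrayList<Shape>();
        shapes.add(new Rectangle(12.0, 8.0));
        shapes.add(new Square(2.0));
        shapes.add(new Circle(1.0));
        // Each element is a Shape; the right area method is chosen per object.
        for (Shape s : shapes) {
            System.out.println(s.area() + " " + s.isRound());
        }
    }
}

// The abstract class: method signatures only, no bodies, no objects.
abstract class Shape {
    public abstract double area();
    public abstract boolean isRound();
}

class Circle extends Shape {
    double radius;
    Circle(double r) { radius = r; }
    public double area() { return Math.PI * radius * radius; }
    public boolean isRound() { return true; }
}

class Rectangle extends Shape {
    double length, height;
    Rectangle(double l, double h) { length = l; height = h; }
    public double area() { return length * height; }
    public boolean isRound() { return false; }
}

class Ellipse extends Shape {
    double major, minor;
    Ellipse(double maj, double min) { major = maj; minor = min; }
    public double area() { return Math.PI * major * minor; }
    public boolean isRound() { return true; }
}

class Square extends Shape {
    double side;
    Square(double s) { side = s; }
    public double area() { return side * side; }
    public boolean isRound() { return false; }
}
```

Where Haskell's area function has one clause per constructor, the Java version has one area method per concrete class.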
COMP 1100 — Algebraic Data Types in Java 12
A more complex example of an algebraic data type was used to represent
pictures in the ANUPlot library. Corresponding Java classes are given on the
Lectures page of the course website.
COMP 1100 — Algebraic Data Types in Java 13
UML Class Diagrams
An aspect of UML (the Unified Modeling Language) is class diagrams to
graphically represent the relationship between classes in a system. Classes
are represented by boxes and the relations are represented by lines or
arrows.
One of the main arrows represents the “is-a” relationship:
In the shapes example, each of Circle, Rectangle, Square and Ellipse
“is-a” Shape.
COMP 1100 — Algebraic Data Types in Java 14
COMP 1100 — Algebraic Data Types in Java 15
Diagram: UML class diagram for the shapes example.
Shape
  + area() : double
  + perimeter() : double
  + isRound() : boolean
Circle
  + radius : double
  + area() : double
  + perimeter() : double
  + isRound() : boolean
Square
  + side : double
  + area() : double
  + perimeter() : double
  + isRound() : boolean
Rectangle
  + length : double
  + height : double
  + area() : double
  + perimeter() : double
  + isRound() : boolean
Ellipse
  + major : double
  + minor : double
  + area() : double
  + perimeter() : double
  + isRound() : boolean
COMP 1100 — Algebraic Data Types in Java 16
The End.
• Summary lecture Thursday 25 May 9.30am
• Sample exam questions in the next week or so. . .
• Watch course website for drop-in labs schedule between now and
exam.
• Pre-exam massed tutorials/lectures in week prior to exam. Watch
website for announcement.
• Thanks and good luck.
COMP 1100 — Algebraic Data Types in Java 17
— Summary —
Core Themes & Unifying Principles
COMP1100 — Introduction to Programming and Algorithms
Clem Baker-Finch
Australian National University
Semester 1, 2006
COMP 1100 — Core Themes & Unifying Principles 1
Types, Types, Types!
• Types classify program behaviour according to the kinds of values they
compute.
• Types are a basic organising principle of Computer Science.
None of this is language specific.
COMP 1100 — Core Themes & Unifying Principles 2
Types are good for:
• Detecting errors
• Documentation
Since types “classify program behaviour,” reading the type signature of a
function sets a framework for understanding its detailed behaviour.
• Abstraction
The fundamental activity in Data Directed Design is to identify the
“types” of things involved in the problem domain.
• Other stuff you might find out about later, e.g. efficient implementations.
COMP 1100 — Core Themes & Unifying Principles 3
Data Directed Design
Designing and constructing programs is about identifying and characterising
abstract data types in which to express the problem or application.
(Data Directed Design steps 2 and 3.)
That word again: Abstraction.
COMP 1100 — Core Themes & Unifying Principles 4
Data Structures Suggest Algorithms
For example, if we choose to represent an ADT using a tuple then we should
expect the algorithms on that ADT to be in terms of selection.
If we are using a list then we should expect the algorithms to use traversal.
If we are using an algebraic data type then we should expect to have
separate cases for the alternatives and to select the components from each
alternative.
COMP 1100 — Core Themes & Unifying Principles 5
Lists
Lists consist of a variable number of elements of the same type.
So. . .
Algorithms on lists consist of doing the same thing with each of the elements.
Since there is a variable number of elements, that process will occur a
variable number of times.
We have repetition!
COMP 1100 — Core Themes & Unifying Principles 6
Repetition
Lists consist of a number of elements of the same type,
so. . .
List algorithms are structured around repetition.
Recursion is a fundamental way of expressing repetition.
Recursion is all we ever need to express any repetitive algorithm.
COMP 1100 — Core Themes & Unifying Principles 7
Patterns of Repetition
With experience, we observe the same patterns of repetition occur again
and again in different algorithms.
We can abstract and identify the pattern.
By using the pattern (instead of its implementation using recursion)
programs are easier to develop and understand.
In Haskell such patterns appear as higher order functions.
In Java such patterns appear as (built-in) loops.
COMP 1100 — Core Themes & Unifying Principles 8
Reasoning: safety critical systems
In some applications there is a need for formal proof of properties of
programs — “correctness.”
(For examples, see Peter G. Neumann: Computer-Related Risks.)
COMP 1100 — Core Themes & Unifying Principles 9
Reasoning: your understanding
More generally, when we think about programs (e.g. when we are writing
them) we are reasoning about them, perhaps informally.
There is a very close relationship between inductive proof, recursive data
structures (such as lists) and recursive algorithms:
• The clauses of a recursive algorithm reflect the form of the recursive
data structure
• The cases of an inductive proof reflect the form of the recursive data
structure
• The cases of an inductive proof reflect the clauses of the recursive
algorithm
COMP 1100 — Core Themes & Unifying Principles 10
Modularisation
The problem-domain data types identified in the Data Directed Design
process suggest a clean and coherent way to break the overall program into
modules — conceptually independent components of manageable size and
complexity.
This is the fundamental motivation behind object oriented languages such as
Java.
Classes correspond directly to the problem-domain data types.
COMP 1100 — Core Themes & Unifying Principles 11
Interlude
COMP 1100 — Core Themes & Unifying Principles 12
Perspective
We mostly used Haskell to present the core concepts of the course.
We reinforced the concepts by expressing them in Java.
This was the main reason for looking at Java in this course.
The other reason: to provide a bridge to COMP1110/1510.
Don’t forget about Haskell!
Don’t stop using it!
COMP 1100 — Core Themes & Unifying Principles 13
Study Focus
• Don’t over-emphasise Java just because we have been using it for the
last few weeks.
• Most of the course used Haskell and most of the final exam will be in
terms of Haskell, or perhaps not language specific.
• Don’t be overwhelmed by the intricacies of Java — you have all next
semester for that.
• Try to focus on the core concepts.
COMP 1100 — Core Themes & Unifying Principles 14
Aiming for a Particular Result?
(Be realistic.)
Pass or Credit: Focus on the material and concepts in (***) lectures.
For example, in Java concentrate on the concept of a class and an
object, and on the alternative view of list processing.
Distinction: Material and concepts in (***) and (**) lectures.
High Distinction: Material and concepts in (***), (**), some (*) lectures. . .
. . . and cross your fingers.
COMP 1100 — Core Themes & Unifying Principles 15
Study Advice:
• Prioritise your focus based on your aims.
• Re-work prac exercises.
• Experiment with sample code from lectures.
• Study groups can be much more effective than a solitary effort.
• Study groups can provide extra motivation.
COMP 1100 — Core Themes & Unifying Principles 16
Good Luck!
COMP 1100 — Core Themes & Unifying Principles 17