COMP60431 Machine Learning Advanced Computer Science MSc Lecturers: Magnus Rattray & Gavin Brown

Nov 30, 2014

Page 1: COMP60431 Machine Learning Advanced Computer Science MSc

COMP60431 Machine Learning

Advanced Computer Science MSc

Lecturers: Magnus Rattray & Gavin Brown

Page 2: COMP60431 Machine Learning Advanced Computer Science MSc

What is Machine Learning?

1. Software that adapts to (“learns” from) data

2. Concerned with creating and using mathematical “data structures” that allow a computer to exhibit behaviour that would normally require a human.

Page 3: COMP60431 Machine Learning Advanced Computer Science MSc

Applications

Speech and hand-writing recognition
Autonomous robot control
Data mining and bioinformatics
Playing games
Fault detection
Clinical diagnosis
Spam email detection
Inverse kinematics

Applications are diverse, algorithms are generic.

Page 4: COMP60431 Machine Learning Advanced Computer Science MSc

What will you be doing?

Introduce the concepts and details behind various ML methods, including how they work, and use existing software packages to illustrate how they are used on data.

Projects – explore the field, reinvent if you want

Page 5: COMP60431 Machine Learning Advanced Computer Science MSc

Machine Learning Methods

Learning from labelled data (supervised learning) (e.g. trying to predict the weather from a dataset of historical patterns)

Learning from unlabelled data (unsupervised learning) (e.g. trying to identify natural patterns in sales of books on Amazon.com)

Learning from sequential data (e.g. Speech recognition, DNA sequence analysis)

Page 6: COMP60431 Machine Learning Advanced Computer Science MSc

Statistical Learning

Different machine learning methods can be unified within a statistical framework:

Data is considered to be from a probability distribution.

Typically, we don’t expect perfect learning but only “probably correct” learning.

Statistical concepts are the key to measuring our future expected performance.
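One common way to measure expected future performance is to hold back some labelled examples and count mistakes on them. The sketch below is illustrative only: the function name `holdout_error`, the threshold classifier and the synthetic data with 10% label noise are assumptions made for this example, not part of the course material.

```python
import math
import random

def holdout_error(predict, test_set):
    """Return (error rate, approximate standard error) on held-out examples."""
    mistakes = sum(1 for x, y in test_set if predict(x) != y)
    n = len(test_set)
    err = mistakes / n
    # Binomial standard error: each test example is treated as an independent trial.
    std_err = math.sqrt(err * (1.0 - err) / n)
    return err, std_err

# Hypothetical data: x is uniform on [0, 1); the clean label is 1 when x > 0.5,
# and 10% of labels are flipped to simulate noise.
rng = random.Random(0)

def sample(n):
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:
            y = 1 - y
        data.append((x, y))
    return data

test_set = sample(1000)
err, std_err = holdout_error(lambda x: int(x > 0.5), test_set)
print(f"estimated error = {err:.3f} +/- {std_err:.3f}")
```

Even the best possible rule here makes mistakes on roughly one example in ten, which is the sense in which learning is only "probably correct".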

Important: If you’re not prepared to get into a bit of maths (linear algebra, calculus, statistics), don’t take this course.

Page 7: COMP60431 Machine Learning Advanced Computer Science MSc

Example 1: Hand-written digits

Data: Greyscale images

Task: Classification (0, 1, 2, 3, …, 9)

Problem features: Highly variable inputs from the same class, including some “weird” inputs.

Page 8: COMP60431 Machine Learning Advanced Computer Science MSc

                                                

US Postal Service Digits

Methods: K-Nearest Neighbour or Support Vector Machines
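As a rough illustration of K-Nearest Neighbour classification, a query is labelled by majority vote among the k closest training examples. The tiny 2x2 "images" and the class names below are made up for this sketch; they are not the US Postal Service data, and `knn_predict` is a hypothetical helper name.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training examples by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda ex: math.dist(ex[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2x2 images flattened to 4 grey levels in [0, 1].
train = [
    ((0.9, 0.1, 0.9, 0.1), "left stroke"),
    ((0.8, 0.2, 1.0, 0.0), "left stroke"),
    ((1.0, 0.0, 0.7, 0.2), "left stroke"),
    ((0.1, 0.9, 0.1, 0.9), "right stroke"),
    ((0.2, 0.8, 0.0, 1.0), "right stroke"),
    ((0.0, 1.0, 0.3, 0.7), "right stroke"),
]
print(knn_predict(train, (0.7, 0.3, 0.8, 0.1)))
```

Real digit images would simply be longer vectors of grey levels; the voting rule stays the same.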

Page 9: COMP60431 Machine Learning Advanced Computer Science MSc

Example 2: Predicting heart disease

1. age
2. sex
3. chest pain type (4 values)
4. resting blood pressure
5. serum cholesterol in mg/dl
6. fasting blood sugar > 120 mg/dl
7. resting electrocardiographic results (values 0, 1, 2)
8. maximum heart rate achieved
9. exercise induced angina
10. oldpeak = ST depression induced by exercise relative to rest
11. the slope of the peak exercise ST segment
12. number of major vessels (0-3) colored by fluoroscopy

Page 10: COMP60431 Machine Learning Advanced Computer Science MSc

Example 2: Predicting heart disease

age sex ch bp  sc  fb ele mx  ang old slo maj typ r
67  0   3  115 564 0  2   160 0   1.6 2   0   7   1
57  1   2  124 261 0  0   141 0   0.3 1   0   7   2
64  1   4  128 263 0  0   105 1   0.2 2   1   7   1
74  0   2  120 269 0  2   121 1   0.2 1   1   3   1
65  1   4  120 177 0  0   140 0   0.4 1   0   7   1
56  1   3  130 256 1  2   142 1   0.6 2   1   6   2
59  1   4  110 239 0  2   142 1   1.2 2   1   7   2
……
63  0   4  150 407 0  2   154 0   4   2   3   7   2
59  1   4  135 234 0  0   161 0   0.5 2   0   7   1
53  1   4  142 226 0  2   111 1   0   1   0   7   1
44  1   3  140 235 0  2   180 0   0   1   0   3   1
61  1   1  134 234 0  0   145 0   2.6 2   2   3   2
57  0   4  128 303 0  2   159 0   0   1   1   3   1
71  0   4  112 149 0  0   125 0   1.6 2   0   3   1

(2% of full dataset shown)
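Rows like these can be turned into (features, label) pairs before a classifier is applied. The sketch below copies the first four rows of the table and assumes, without confirmation from the slides, that the last column `r` is the class to predict; the helper name `parse` is hypothetical.

```python
header = "age sex ch bp sc fb ele mx ang old slo maj typ r".split()

# First four rows of the table, copied verbatim.
rows = """\
67 0 3 115 564 0 2 160 0 1.6 2 0 7 1
57 1 2 124 261 0 0 141 0 0.3 1 0 7 2
64 1 4 128 263 0 0 105 1 0.2 2 1 7 1
74 0 2 120 269 0 2 121 1 0.2 1 1 3 1
"""

def parse(text):
    """Split each row into a dict of named features and an integer label."""
    records = []
    for line in text.splitlines():
        values = [float(v) for v in line.split()]
        features = dict(zip(header[:-1], values[:-1]))
        label = int(values[-1])  # assumption: last column `r` is the class
        records.append((features, label))
    return records

records = parse(rows)
for features, label in records:
    print(label, features["age"], features["bp"])
```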

Page 11: COMP60431 Machine Learning Advanced Computer Science MSc

Example 2: Predicting heart disease

“Heuristics that make us smart”

Page 12: COMP60431 Machine Learning Advanced Computer Science MSc

Example 3: DNA microarrays

DNA from ~10,000 genes attached to a glass slide called a “microarray”.

Green and red labels attached to mRNA from two different sample tissues.

Page 13: COMP60431 Machine Learning Advanced Computer Science MSc

DNA microarrays

Tasks: Sample classification, gene classification, visualisation and clustering of genes/samples.

Problem features:
Very high-dimensional data (many features) but relatively small number of examples (samples)
Extremely noisy data (noise ~ signal)
Lack of good domain knowledge

Page 14: COMP60431 Machine Learning Advanced Computer Science MSc

DNA microarrays

Projection of 10,000-dimensional data onto 2D using PCA effectively separates cancer subtypes.
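A PCA projection onto 2D can be sketched with power iteration, which avoids forming the full feature-by-feature covariance matrix when there are far more features than samples. Everything below is an assumption for illustration: the data are synthetic (10 samples, 50 "genes", two groups), not microarray measurements, `pca_2d` is a hypothetical helper, and the code assumes the centred data have rank of at least 2.

```python
import random

def pca_2d(data, iters=200, seed=0):
    """Project the rows of `data` onto their top two principal components."""
    n, d = len(data), len(data[0])
    # Centre each feature so the components capture variance, not the mean.
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in data]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def normalise(v):
        norm = dot(v, v) ** 0.5
        return [a / norm for a in v]

    rng = random.Random(seed)
    components = []
    for _ in range(2):
        w = normalise([rng.gauss(0, 1) for _ in range(d)])
        for _ in range(iters):
            # Deflation: remove the directions that were already found.
            for c in components:
                proj = dot(w, c)
                w = [a - proj * b for a, b in zip(w, c)]
            # Apply X^T X to w without building the d x d matrix.
            scores = [dot(row, w) for row in centred]
            w = [sum(s * row[j] for s, row in zip(scores, centred))
                 for j in range(d)]
            w = normalise(w)
        components.append(w)
    return [[dot(row, c) for c in components] for row in centred]

# Synthetic stand-in for expression data: group A has 10 shifted features.
rng = random.Random(1)
n_genes = 50

def sample(shift):
    return [rng.gauss(0, 1) + (shift if j < 10 else 0.0) for j in range(n_genes)]

data = [sample(5.0) for _ in range(5)] + [sample(0.0) for _ in range(5)]
projected = pca_2d(data)
for label, (pc1, pc2) in zip("AAAAABBBBB", projected):
    print(label, round(pc1, 2), round(pc2, 2))
```

In this toy setting the two groups end up on opposite sides of the first component, which is the kind of separation the slide describes for cancer subtypes.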

Page 15: COMP60431 Machine Learning Advanced Computer Science MSc

Relevant disciplines

Algorithms
Artificial intelligence
Control
Physics
Information theory
Dynamical systems
Neurobiology
Signal processing
Statistics
Linear algebra
etc.

Researchers in ML come from a variety of different backgrounds.

Page 16: COMP60431 Machine Learning Advanced Computer Science MSc

Prerequisites

Need: Reasonable knowledge of calculus and matrix/vector algebra.

Don’t need: Previous experience of Matlab programming – this will be learned during the course.

Page 17: COMP60431 Machine Learning Advanced Computer Science MSc

Module structure

Assessed exercises (20%)
Project (30%)
January examination (50%)

Period 1 (Tuesdays) 28th Sept – 3rd Nov

Page 18: COMP60431 Machine Learning Advanced Computer Science MSc

Resources

We’ll provide full slides and notes.

If you want a book, this is a suggestion:

E. Alpaydin

“Introduction to Machine Learning”

Page 19: COMP60431 Machine Learning Advanced Computer Science MSc

What now?

Web page: http://intranet.cs.man.ac.uk/mlo/comp60431/

The course begins on Tuesday 29th Sept.

If you want to take the course: check the primer tutorial on the required maths and practise with Matlab (tutorial on website).

Page 20: COMP60431 Machine Learning Advanced Computer Science MSc

Questions?

Page 21: COMP60431 Machine Learning Advanced Computer Science MSc

Example: Speech recognition

Data: features from spectral analysis of speech signals (two in this simple example).

Task: Classification of vowel sounds in words of the form “h-?-d”, e.g. head, hid, had etc.

Problem features:
Highly variable data with same classification
Good feature selection is very important
This task is a small part of a larger task

Page 22: COMP60431 Machine Learning Advanced Computer Science MSc

Method: Multilayer neural network
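To give a feel for what a multilayer neural network computes, the sketch below runs one forward pass from two input features to class probabilities. The layer sizes, the tanh and softmax choices and the random, untrained weights are all assumptions for illustration; training (for example by backpropagation) is left out, and nothing here reproduces the network used for the vowel data.

```python
import math
import random

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: tanh hidden layer, then softmax over classes."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w_out, b_out)]
    # Subtract the largest logit before exponentiating, for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
n_in, n_hidden, n_classes = 2, 4, 3  # illustrative sizes only
w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_classes)]
b_out = [0.0] * n_classes

# Two hypothetical spectral features for one utterance.
probs = forward([0.3, -1.2], w_hidden, b_hidden, w_out, b_out)
print(probs)
```

The predicted class would be the index of the largest probability; with trained weights each index would correspond to one vowel.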