Data Comes in Shapes. July 16, 5th Elephant. Tim Poston, Chief Scientist. http://forushealth.com http://geometeer.com [email protected]
Transcript
Page 1: Data Comes in Shapes

Data Comes in Shapes

July 16, 5th Elephant

Tim Poston

Chief Scientist

http://forushealth.com

http://geometeer.com

[email protected]

Page 2: Data Comes in Shapes

What are data?

Mostly numbers.

Are numbers only numbers?

Numbers come in patterns: that is what ‘big data’ is all about.

Patterns are shapes.

Page 3: Data Comes in Shapes

Patterns are shapes.

Studying data shapes is geometry.

… but not the geometry of high school.

Page 4: Data Comes in Shapes

Studying data shapes is not the geometry of high school.

It is not replacing the 3D minds of children

with flattened (though intricate) teen imagination.

Page 5: Data Comes in Shapes

If we have three variables, we have three dimensions.

If we have n variables, we have n dimensions.

To think about n dimensions, we have two choices:

Practice thinking in 3D

Turn it all into algebra

We have to do both.
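
A small sketch of "turning it all into algebra", assuming each record is a tuple of n numbers: the familiar 3D distance formula carries over unchanged to n dimensions. The function name `distance` and the sample points are illustrative, not from the talk.

```python
import math

def distance(p, q):
    """Euclidean distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# Three variables: a point in 3D.
print(distance((0, 0, 0), (1, 2, 2)))              # 3.0
# Five variables: the same algebra, five dimensions.
print(distance((0, 0, 0, 0, 0), (1, 1, 1, 1, 0)))  # 2.0
```

The 3D case is the one to picture; the code never needs to know how many coordinates there are.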

Page 6: Data Comes in Shapes

What does a matrix

\[
\begin{bmatrix} a & b & c \\ c & d & e \\ f & g & h \end{bmatrix}
\]

even mean?

Page 7: Data Comes in Shapes

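A matrix

\[
\begin{bmatrix} a & b & c \\ c & d & e \\ f & g & h \end{bmatrix}
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
=
\begin{bmatrix} a \\ c \\ f \end{bmatrix}
\]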

describes a transformation

by listing how a few things change.

Page 8: Data Comes in Shapes

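A matrix

\[
\begin{bmatrix} a & b & c \\ c & d & e \\ f & g & h \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
=
\begin{bmatrix} b \\ d \\ g \end{bmatrix}
\]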

describes a transformation

by listing how a few things change.

Page 9: Data Comes in Shapes

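A matrix

\[
\begin{bmatrix} a & b & c \\ c & d & e \\ f & g & h \end{bmatrix}
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
=
\begin{bmatrix} c \\ e \\ h \end{bmatrix}
\]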

describes a transformation

by listing how a few things change.

Page 10: Data Comes in Shapes

A matrix

\[
\begin{bmatrix} a & b & c \\ c & d & e \\ f & g & h \end{bmatrix}
\]

is just a list of where (1,0,0), (0,1,0) and (0,0,1) go.

Remember that, and you can always clarify how the algebra works.

Remember that, and you can always clarify how the code should work.
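
A minimal sketch of that reading, in plain Python with a matrix stored as a list of rows. The helper names `apply` and `column` and the numeric entries are assumptions chosen for illustration. It checks that each basis vector is sent to the matching column, and that any other vector goes to the corresponding mix of those columns.

```python
def apply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def column(matrix, j):
    """Return column j of the matrix as a list."""
    return [row[j] for row in matrix]

M = [[2, 0, 1],
     [1, 3, 0],
     [0, 1, 4]]

# Each basis vector lands on the matching column.
assert apply(M, [1, 0, 0]) == column(M, 0)
assert apply(M, [0, 1, 0]) == column(M, 1)
assert apply(M, [0, 0, 1]) == column(M, 2)

# Any other vector goes where linearity says: 2*col0 + 3*col1 + 5*col2.
mixed = [2 * a + 3 * b + 5 * c
         for a, b, c in zip(column(M, 0), column(M, 1), column(M, 2))]
assert apply(M, [2, 3, 5]) == mixed
print(apply(M, [2, 3, 5]))  # [9, 11, 23]
```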

Page 11: Data Comes in Shapes

Principal component analysis (PCA)

just finds a rotation (matrix) so that the data points lie as close as possible to the coordinate axes.

In n dimensions.
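
A sketch of the 2D case only, where the rotation can be written down as a single angle. It assumes the data are centred on their mean first (the usual PCA convention, not stated on the slide), and the function names are illustrative. In n dimensions the same job is normally done with an eigen-decomposition or SVD of the covariance matrix rather than this closed form.

```python
import math

def principal_angle_2d(points):
    """Return the mean and the angle of the direction of greatest spread."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed form for the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mx, my), theta

def rotate_to_axes(points):
    """Centre the points, then rotate by -theta so the main spread lies along x."""
    (mx, my), theta = principal_angle_2d(points)
    c, s = math.cos(theta), math.sin(theta)
    return [((x - mx) * c + (y - my) * s,
             -(x - mx) * s + (y - my) * c) for x, y in points]

# Points on the diagonal y = x end up on the x axis after the rotation.
diagonal = [(0, 0), (1, 1), (2, 2), (3, 3)]
rotated = rotate_to_axes(diagonal)
assert all(abs(y) < 1e-9 for _, y in rotated)
```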

Page 12: Data Comes in Shapes

The simplex method (“Linear Programming”) looks at points constrained by inequalities

\[
a_1 x_1 + a_2 x_2 + \dots + a_n x_n + c \ge 0
\]

which just means ‘lying on one side of a line/plane/hyperplane, in 2D/3D/nD’.

A convex polygon/polyhedron/polytope.
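
As a sketch, each inequality can be stored as a pair (a, c), with a the list of coefficients. A point lies in the polytope exactly when it satisfies every inequality. The function names and the unit-square example are assumptions for illustration.

```python
def on_allowed_side(a, c, x):
    """True if a1*x1 + ... + an*xn + c >= 0."""
    return sum(ai * xi for ai, xi in zip(a, x)) + c >= 0

def in_polytope(constraints, x):
    """True if x satisfies every (a, c) inequality."""
    return all(on_allowed_side(a, c, x) for a, c in constraints)

# The unit square in 2D, written as four half-planes.
unit_square = [
    ((1, 0), 0),    # x >= 0
    ((0, 1), 0),    # y >= 0
    ((-1, 0), 1),   # 1 - x >= 0
    ((0, -1), 1),   # 1 - y >= 0
]

print(in_polytope(unit_square, (0.5, 0.5)))  # True
print(in_polytope(unit_square, (1.5, 0.5)))  # False
```

The same two functions work unchanged for any number of variables, since nothing in them depends on the length of `a`.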

Page 13: Data Comes in Shapes

The simplex method (“Linear Programming”) looks at a convex polytope, and seeks the highest point.

Find a genuine corner (any corner).

Go up the most vertical edge, till you meet another face.

Do that again. And again.

And again. And again. And reach the top.

All the matrix ‘pivoting’, degenerate case handling, etc., is just implementing that.
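
The climb can be sketched in 2D without any pivoting, purely as geometry. This is not a simplex implementation. It assumes the polygon is convex and already given as a list of corners in order around the boundary, that "highest" means the largest y coordinate, and that "most vertical" means the greatest rise per unit length. The function name is illustrative.

```python
import math

def climb_polygon(corners, start=0):
    """Walk corner to corner along upward edges until no neighbour is higher."""
    n = len(corners)
    i = start
    path = [corners[i]]
    while True:
        best, best_steepness = None, 0.0
        # A corner of a polygon has exactly two edges: to the previous and next corner.
        for j in ((i - 1) % n, (i + 1) % n):
            rise = corners[j][1] - corners[i][1]
            if rise > 0:
                steepness = rise / math.dist(corners[i], corners[j])
                if steepness > best_steepness:
                    best, best_steepness = j, steepness
        if best is None:
            # No edge goes up. On a convex polygon this corner is the top.
            return corners[i], path
        i = best
        path.append(corners[i])

polygon = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 2)]
top, path = climb_polygon(polygon)
print(top)   # (2, 5)
print(path)  # [(0, 0), (-1, 2), (2, 5)]
```

Each step strictly gains height, so the walk cannot loop. Convexity is what guarantees that a corner with no upward edge is the highest point overall.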

Page 14: Data Comes in Shapes

Support vector machine explanations (like this from Wikipedia) tend to skimp on the geometry.

What is a support line / plane / hyperplane?

How do you find one? (Very like simplex method.)
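
One reading of the first question, as a sketch: in the standard convex-geometry sense, a support line / plane / hyperplane of a set of points touches at least one point and has all the others on the same side. The slide does not spell out its definition, so this is an assumption, and the function name and tolerance are illustrative. The hyperplane is written as a1x1 + … + anxn + c = 0.

```python
def is_support_hyperplane(a, c, points, tol=1e-9):
    """True if the hyperplane a.x + c = 0 touches the points and has them all on one side."""
    values = [sum(ai * xi for ai, xi in zip(a, p)) + c for p in points]
    touches = any(abs(v) <= tol for v in values)
    one_side = all(v >= -tol for v in values) or all(v <= tol for v in values)
    return touches and one_side

square = [(0, 0), (1, 0), (1, 1), (0, 1)]

print(is_support_hyperplane((0, 1), 0, square))     # True:  y = 0 runs along the bottom edge
print(is_support_hyperplane((1, 1), 0, square))     # True:  x + y = 0 touches one corner
print(is_support_hyperplane((0, 1), -0.5, square))  # False: y = 0.5 cuts through the square
print(is_support_hyperplane((0, 1), 1, square))     # False: y = -1 misses the square
```

This only tests a candidate. Finding one is the search problem that the slide compares to the simplex method.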

Page 15: Data Comes in Shapes

Geometry organises what algebra needs to do.

Algebra (often linear) organises what code needs to do.

Planning code needs algebra, which needs geometry.

Some bugs come from coding wrong.

Some bugs come from coding the wrong algebra.

Some bugs come from algebraising the wrong geometry.

Try to think at all levels!

Page 16: Data Comes in Shapes

Thank you!

Tim Poston

http://forushealth.com

http://geometeer.com

[email protected]