Math Literate Computers. Dorothea Blostein, School of Computing, Queen’s University. CICM 2009. Math Literacy: The ability to read and write math notation.

Dec 14, 2015


Page 1

Math Literate Computers

Dorothea Blostein School of Computing, Queen’s University

CICM 2009

Math Literacy: The ability to read and write math notation.

Page 2

In people, understanding precedes literacy. Computers are fairly literate, but with shallow understanding.

People learn to read before they learn to write. Computers are better at writing than reading.

Math literacy relates to literacy in other diagram notations: two-dimensional, domain-specific, natural languages.

Page 3

Freedom to think with paper and pencil.

Computer support for typesetting, search, automated reasoning.

Goal: Smooth conversion between paper and electronic documents

Four Color Theorem, Appel and Haken, 1976

Math Notation - A Tool to Support Reasoning
• Evolved over centuries
• Additional notation is invented as needed
• Many dialects

Page 4

Topics

Page 5

Notational conventions map between information and ink.

Writing (Generation)

Reading (Recognition)

Page 6

Difficult: create an aesthetically appealing diagram

A solved problem

Difficult. An active research area.

Difficult: handle symbol recognition errors and variable layout.

Writing (Generation)

Reading (Recognition)

Conventions geared toward generation

Conventions geared toward recognition

Page 7

Many Diagrams Represent the Same Information

Same use of hard conventions

Varying use of soft conventions

Recognition: All the diagrams lead to the same information

Generation: One path (chosen according to user preferences) from information to diagram

Hard conventions: how to encode information. Soft conventions: how to make it readable.
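The distinction above can be illustrated with a small generation sketch. This is only an illustration under stated assumptions: the tuple-based expression tree, the function name `generate`, the `prefer_stacked` preference, and the choice of LaTeX as output are all hypothetical and not taken from the talk. The same information yields two different diagrams; the parentheses in the inline form act as a hard convention (they are needed to encode the grouping), while the choice between the two forms is a soft convention.

```python
# Hypothetical expression tree: a leaf is a string, an inner node is (op, left, right).
def generate(expr, prefer_stacked=True):
    """Write an expression as LaTeX; the user preference selects one of several valid layouts."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    l = generate(left, prefer_stacked)
    r = generate(right, prefer_stacked)
    if op == "add":
        return f"{l}+{r}"
    if op == "div":
        if prefer_stacked:
            # Soft convention: a stacked fraction, grouping shown by the fraction bar.
            return rf"\frac{{{l}}}{{{r}}}"
        # Inline layout: compound operands need parentheses to keep the same meaning.
        wrap = lambda sub, text: text if isinstance(sub, str) else f"({text})"
        return f"{wrap(left, l)}/{wrap(right, r)}"
    raise ValueError(f"unknown operator: {op}")


expr = ("div", ("add", "a", "b"), "c")
print(generate(expr))                        # stacked layout
print(generate(expr, prefer_stacked=False))  # inline layout
```

A recognizer has to map both outputs back to the same tree, which is the "all the diagrams lead to the same information" direction.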

Page 8

Topics

Page 9

Sources of Information about Math Notation

Sample Documents: Math notation is defined by use in society.

Introspection.

Written Descriptions:
• Geared toward manual typesetting. By example. People use their judgment. Chaundy, Barrett, Batey, The Printing of Mathematics, 1957. Wick, Rules for Typesetting Mathematics, 1965. Higham, Handbook of Writing for the Math. Sciences, 1993.
• Geared toward computational typesetting. Knuth, “Mathematical Typography,” Bulletin of the AMS, 1979.

Program Code: for recognizing and generating math notation.

Recognition Contests: define datasets and evaluation metrics. Contests at ICDAR and GREC: arc segmentation, symbol recognition, segmenting text and graphics, raster to vector conversion, signature verification, document binarization, page segmentation.

Page 10

Statistics about Math Notation: An Example

Gather statistics from training data.

Almost matches human performance in labeling bounding boxes.

Spatial relations for pairs of bounding boxes.

Top labels: most likely, based on statistics.

Ambiguity due to unknown baseline

[Wang&Faure, ICPR 1988]
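The idea on this page (gather statistics from training data, then rank the likely spatial relations for a pair of bounding boxes) might be sketched as follows. This is not the method of the cited paper: the single feature (relative vertical offset of box centers), the five bins, the label names, and the functions `offset_bin`, `train`, and `top_labels` are all assumptions made for illustration. Boxes are `(xmin, ymin, xmax, ymax)` in image coordinates, with y increasing downward.

```python
from collections import Counter, defaultdict


def offset_bin(parent, child, n_bins=5):
    """Quantize how far the child's center sits above the parent's center.

    The offset is normalized by the parent's height (assumed non-zero)."""
    p_center = (parent[1] + parent[3]) / 2
    c_center = (child[1] + child[3]) / 2
    height = parent[3] - parent[1]
    rel = (p_center - c_center) / height      # positive: child is higher on the page
    rel = max(-1.0, min(1.0, rel))            # clamp extreme offsets
    return min(n_bins - 1, int((rel + 1) / 2 * n_bins))


def train(samples):
    """samples: iterable of (parent_box, child_box, label). Returns per-bin label counts."""
    counts = defaultdict(Counter)
    for parent, child, label in samples:
        counts[offset_bin(parent, child)][label] += 1
    return counts


def top_labels(counts, parent, child):
    """Labels seen in training for this offset bin, most frequent first."""
    return [label for label, _ in counts[offset_bin(parent, child)].most_common()]


base = (0, 0, 10, 10)
counts = train([
    (base, (10, -2, 14, 2), "superscript"),
    (base, (10, -1, 13, 3), "superscript"),
    (base, (10, -2, 14, 4), "inline"),      # a tall neighbour: same bin, different label
    (base, (10, 1, 18, 9), "inline"),
    (base, (10, 8, 14, 12), "subscript"),
])
print(top_labels(counts, base, (11, -2.5, 15, 2.5)))
```

A ranked list rather than a single answer reflects the ambiguity noted above: without a known baseline, one offset can be consistent with more than one relation.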

Page 11

Topics

Page 12

Challenges in Math Recognition

Symbol recognition: easily confused symbols, e.g. ( and C, O and 0, 7 and >, S and 5, / and 1 and l

Several roles for symbols

Spatial relationships

Little redundancy

Handwritten notation is particularly difficult

Compilers easily handle math notation in programming languages.

2D math notation is harder:
– Noise causes errors in segmenting and identifying symbols.
– Can’t blame the user for mistakes.
– Hard to capture 2D relationships effectively in a string.

Page 13

Math-Recognition Approaches [Survey by Blostein and Grbavec, 1997]

• Procedurally-coded math syntax
• Coordinate grammar
• Projection profile cutting
• Stochastic grammars & HMMs
• Graph rewriting
• Tree rewriting

How can these approaches be evaluated and compared?

The choice of software architecture is difficult to make and defend.

Page 14

Math-Recognition Approaches: Procedurally-coded math syntax

No explicit definition of math syntax.

Update code in response to recognition errors.

Can get good recognition performance.

Page 15

Math-Recognition Approaches: Coordinate grammar

Apply a rule to a set of symbols: create subsets with syntactic subgoals.

A clear, well-structured representation of notational conventions.

[Anderson 1969; in Fu 77]

Attributes: xmin, ymin, xmax, ymax, xcenter; m encodes meaning
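A rough sketch of how a coordinate-grammar rule might partition a symbol set into syntactic subgoals, using the attributes listed above. This is a loose illustration, not a reconstruction of Anderson's grammar: the `Sym` class, the single fraction rule, the fallback concatenation rule, and the string form of the meaning `m` are assumptions. Coordinates follow the image convention, with y increasing downward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    name: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def xcenter(self):
        return (self.xmin + self.xmax) / 2


def parse(symbols):
    """Return m, the meaning of a set of symbols, as a string."""
    if not symbols:
        raise ValueError("empty subgoal")
    if len(symbols) == 1:
        return symbols[0].name
    # Fraction rule: a bar whose x-extent covers the xcenter of every other symbol
    # splits the set into a numerator subgoal (above) and a denominator subgoal (below).
    for bar in symbols:
        if bar.name != "bar":
            continue
        rest = [s for s in symbols if s is not bar]
        if all(bar.xmin <= s.xcenter <= bar.xmax for s in rest):
            above = [s for s in rest if s.ymax <= bar.ymin]
            below = [s for s in rest if s.ymin >= bar.ymax]
            if above and below and len(above) + len(below) == len(rest):
                return f"({parse(above)})/({parse(below)})"
    # Fallback rule: horizontal concatenation, left to right.
    return " ".join(s.name for s in sorted(symbols, key=lambda s: s.xmin))


symbols = [
    Sym("a", 0, 0, 1, 1), Sym("+", 1.5, 0, 2.5, 1), Sym("b", 3, 0, 4, 1),
    Sym("bar", 0, 1.4, 4, 1.6),
    Sym("c", 1.5, 2, 2.5, 3),
]
print(parse(symbols))
```

Each rule tests coordinate conditions on the whole set and then hands the resulting subsets to further rules, which is what keeps the notational conventions explicit and readable.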

Page 16

Math-Recognition Approaches: Projection profile cutting [Okamoto and Miao, 1992]

Figure labels: horizontal cut, vertical cut.

The order of cuts provides the tree-structure of the expression.

A simple and efficient technique. Can be applied prior to OCR.

Overlapping symbols require special handling.
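The cutting procedure described on this page can be sketched as a recursive split on gaps in the projection profiles. As a simplifying assumption, the sketch works on labeled bounding boxes rather than on image pixels (the technique itself can run before symbols are identified); the function names, the tuple-based tree, and the `"uncut"` marker are hypothetical. Boxes are `(xmin, ymin, xmax, ymax)` in image coordinates, with y increasing downward. The sketch does not implement any special handling: groups that no cut can separate, such as a radical enclosing its argument, are only reported.

```python
def split(boxes, axis):
    """Group boxes separated by empty gaps in the projection onto one axis (0 = x, 1 = y)."""
    groups, reach = [], None
    for box in sorted(boxes, key=lambda b: b[1][axis]):
        lo, hi = box[1][axis], box[1][axis + 2]
        if reach is None or lo > reach:      # empty gap in the profile: cut here
            groups.append([box])
            reach = hi
        else:
            groups[-1].append(box)
            reach = max(reach, hi)
    return groups


def cut(boxes, axis=0):
    """boxes: list of (label, (xmin, ymin, xmax, ymax)). Returns a nested tree of cuts."""
    if len(boxes) == 1:
        return boxes[0][0]
    groups = split(boxes, axis)
    if len(groups) == 1:                     # no gap on this axis, try the other one
        axis = 1 - axis
        groups = split(boxes, axis)
        if len(groups) == 1:                 # overlapping symbols: no cut separates them
            return {"uncut": sorted(label for label, _ in boxes)}
    kind = "v" if axis == 0 else "h"         # vertical cuts split left/right, horizontal top/bottom
    return (kind, [cut(group, 1 - axis) for group in groups])


boxes = [
    ("x", (0, 4, 2, 6)), ("=", (3, 4.5, 5, 5.5)),
    ("a", (6, 0, 7, 2)), ("+", (7.5, 0.5, 8.5, 1.5)), ("b", (9, 0, 10, 2)),
    ("bar", (6, 4.8, 10, 5.2)),
    ("c", (7.5, 7, 8.5, 9)),
]
print(cut(boxes))
```

The nesting of `"v"` and `"h"` nodes in the result is the order of cuts, which gives the tree structure of the expression.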

Page 17

Math-Recognition Approaches: Stochastic grammars & HMMs

Hidden Markov Model [Kopec, Chou 1994]

An explicit image-generation model, to drive recognition.

Applied to yellow pages & music notation.

2D stochastic context-free grammar [Chou 1989]

Find the most likely parse of the image, without segmentation.

Page 18

Math-Recognition Approaches: Graph rewriting

Rewrite rules replace one subgraph by another

PROGRES language: a mix of textual and visual notation

Write a graph schema to define the structure of valid graphs.

The PROGRES execution environment flags violations.

Figure labels: Build, Constrain, Parse.

[Blostein, Schürr, Software Practice and Experience, 1999]
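To make "rewrite rules replace one subgraph by another" concrete, here is a minimal sketch in plain Python. It is not PROGRES and does not reflect its syntax or execution model; the graph encoding (a label dictionary plus a set of `(source, relation, target)` edges), the relation names, and the single fraction rule are assumptions chosen for illustration.

```python
def rewrite_fractions(labels, edges):
    """Apply one rewrite rule until it no longer matches.

    Left-hand side:  top --above--> bar --above--> low   (bar is labeled "bar")
    Right-hand side: bar relabeled "frac", with numerator and denominator edges.
    Edges from the rest of the graph to the bar node are kept unchanged.
    """
    labels, edges = dict(labels), set(edges)     # work on copies
    while True:
        match = next(
            ((top, bar, low)
             for (top, r1, bar) in sorted(edges)
             for (b, r2, low) in sorted(edges)
             if r1 == r2 == "above" and b == bar and labels[bar] == "bar"),
            None,
        )
        if match is None:
            return labels, edges
        top, bar, low = match
        edges -= {(top, "above", bar), (bar, "above", low)}
        edges |= {(bar, "numerator", top), (bar, "denominator", low)}
        labels[bar] = "frac"


labels = {1: "a", 2: "bar", 3: "b", 4: "="}
edges = {(1, "above", 2), (2, "above", 3), (4, "left_of", 2)}
new_labels, new_edges = rewrite_fractions(labels, edges)
print(new_labels[2], sorted(new_edges))
```

A graph schema, as mentioned above, would additionally constrain which labels and relations are allowed; this sketch performs no such validation.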

Page 19

Math-Recognition Approaches

Compiler-inspired approach, using tree rewriting [Zanibbi, Blostein, Cordy: ICPR 2002 and PAMI 2002]

Separate analysis of layout, lexical, syntactic, and semantic aspects.

Get partial results even if there are syntax errors.

Find linear structures in the input, and create a tree from them.

Operation of a compiler

Recognition of math notation
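One pass of such a compiler-style pipeline might look like the sketch below: a lexical pass written as a tree rewrite over a layout tree. This is an illustration only, not the cited authors' implementation; the tree encoding (a baseline is a list of nodes, each node a dictionary with a `"sym"` entry and optional nested baselines such as `"super"`) and the single rule (merge adjacent digits into one number) are assumptions.

```python
def lex(baseline):
    """Rewrite one baseline: merge runs of adjacent digits into a single number node.

    Nested baselines (for example superscripts) are rewritten recursively."""
    out = []
    for node in baseline:
        # Copy the node, rewriting any nested baselines first.
        node = {"sym": node["sym"],
                **{region: lex(sub) for region, sub in node.items() if region != "sym"}}
        prev = out[-1] if out else None
        if (prev is not None and prev["sym"].isdigit() and node["sym"].isdigit()
                and len(prev) == 1):             # previous digit carries no scripts
            out[-1] = {**node, "sym": prev["sym"] + node["sym"]}
        else:
            out.append(node)
    return out


layout = [
    {"sym": "1"}, {"sym": "2"},
    {"sym": "x", "super": [{"sym": "3"}, {"sym": "4"}]},
]
print(lex(layout))
```

Because each pass maps a tree to a tree, later passes (syntactic, semantic) can still run on whatever structure was recovered, which is one way to get partial results when part of the input fails to parse.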

Page 20

Topics

Page 21

Goal: seamless transition between
- real world (stylus and paper)
- electronic world

Many paper documents are produced from electronic sources. Eventually include digitally-encoded contents?

Methods used in digital watermarking are relevant.

Electronic → Paper is more advanced than Paper → Electronic

Page 22

Entering math expressions

• How much user time?

• How many residual errors?

• How much frustration?

Method 1: Use recognition software. Scan a document image or write on a data tablet.

Method 2: Enter information directly. Type the information (e.g. LaTeX) or use a structure-based editor.

User proofreads and corrects

Generate math notation

Recognition software

Information

Page 23

User Frustration

People eventually feel comfortable with irritating interfaces.

The Argh is a unit of frustration. Kilarghs. Megarghs… Arghometers need to be developed.

Document recognition is frustrating because:

1. Users don’t like to correct errors made by the “stupid computer”. Better to correct errors they made themselves.

2. Users don’t like to think about the marks on the paper. They would rather think about the document contents.

3. Users don’t like unpredictable systems. Better to adapt themselves (even if inconvenient) to achieve predictability.

[Talk at ICDAR 2001]

Page 24

Possible research directions

Precisely define math literacy tasks.

Use soft conventions in recognition.

Use statistics: know about likely versus unlikely expressions.

Exploit the advanced state of generation, to improve recognition.
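The direction above on using statistics could be explored with something as simple as symbol-pair counts. The sketch below is an assumption-laden illustration, not a method from the talk: it treats expressions as flat strings, counts adjacent symbol pairs in a hypothetical training set, and picks the most likely reading when the symbol recognizer offers several candidates per position (for example `1` versus `l`, or `0` versus `O`). The function names and the add-one smoothing are choices made for this sketch.

```python
from collections import Counter
from itertools import product
from math import prod


def train_bigrams(expressions):
    """Count adjacent symbol pairs in a collection of known expressions."""
    counts = Counter()
    for expr in expressions:
        counts.update(zip(expr, expr[1:]))
    return counts


def most_likely(candidates, counts):
    """candidates: one list of alternative symbols per position.

    Scores every combination by the product of (pair count + 1) and returns the best."""
    def score(seq):
        return prod(counts[pair] + 1 for pair in zip(seq, seq[1:]))
    return "".join(max(product(*candidates), key=score))


counts = train_bigrams(["x+10", "20", "5+x"])
print(most_likely([["l", "1"], ["O", "0"]], counts))
print(most_likely([["S", "5"], ["+"], ["x"]], counts))
```

Enumerating every combination is exponential in the number of ambiguous positions, so anything beyond short expressions would need a dynamic-programming search, and real 2D notation would need statistics over layout as well as symbol order.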

Topics:
• Notational Conventions
• What is Math Notation, anyway?
• Math Recognition Approaches
• User Interface Issues

Conclusion

A group effort is required.