Quality and Consistency in Text Alignment James R. Covington [email protected] Miklal Software Solutions
Page 1: Quality and consistency in text alignment

Quality and Consistency in Text Alignment

James R. Covington

[email protected]

Miklal Software Solutions

Page 2: Quality and consistency in text alignment

Text alignment: utility as a function of quality and consistency

Page 3: Quality and consistency in text alignment

Text alignment: utility

Machine translation

Text comparison

Preaching

Education in biblical languages

Textual criticism

Translation technique

Lexicography

Biblical interpretation

James R. Covington | Miklal Software Solutions | [email protected]

Page 4: Quality and consistency in text alignment

Text alignment: quality and consistency

Machine translation: lots of data

Text comparison: big picture

Preaching

Education in biblical languages

Textual criticism

Translation technique

Lexicography

Biblical interpretation

Page 5: Quality and consistency in text alignment

Text alignment: quality and consistency

Machine translation: lots of data

Text comparison: big picture

Preaching: bad sermon

Education in biblical languages: bad exam

Textual criticism: bad research

Translation technique

Lexicography

Biblical interpretation

Page 6: Quality and consistency in text alignment

Text alignment: quality and consistency

Part 1: writing consistency standards

Part 2: designing a software tool to promote consistency

Part 3: post-processing quality control

Page 7: Quality and consistency in text alignment

Writing consistency standards: guidelines for evaluating quality and consistency

Page 8: Quality and consistency in text alignment

Writing consistency standards

Step 1: Engineering (our focus today)

Step 2: Proofing

Step 3: Revising

Page 9: Quality and consistency in text alignment

Engineering: principles

Principle 1: as small as possible

“Each set of tokens being linked should be as small as possible.”

Principle 2: as large as necessary

“Each set of tokens being linked should be as large as necessary.”

Principle 1 > Principle 2

Page 10: Quality and consistency in text alignment

Engineering: principles

Principle 1: as small as possible

Gen 12:4

Page 11: Quality and consistency in text alignment

Engineering: principles

Principle 2: as large as necessary

Ex 34:6

Page 12: Quality and consistency in text alignment

Engineering: principles

Principle 2: as large as necessary

Greek (lemma, form): ἐν, ἐν; γαστήρ, γαστρὶ; ἔχω, ἔχουσα

English (lemma, form): to, to; be, be; with, with; child, child

Matt 1:18

Page 13: Quality and consistency in text alignment

Engineering: case-specific rules

Step 1: Identify grammatical structures in source language.

Step 2: Identify grammatical structures in target language used to translate structures from Step 1.

Step 3: Write a rule for each pair of grammatical structures.
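
The three steps above amount to building a lookup table keyed by pairs of structures. The following is a minimal sketch of that idea, not the author's tool: the names `RULES`, `lookup_rule`, and `undefined_cases` are hypothetical, and the two entries are taken from the Hebrew ל + infinitive construct example that appears later in this deck.

```python
# Hypothetical rule table: one rule per (source structure, target structure) pair.
# The two entries come from the deck's Hebrew example; the layout is an assumption.
RULES = {
    ("ל + infinitive construct", "to + infinitive"): "link separately",
    ("ל + infinitive construct", "infinitive"): "group ל and infinitive; infinitive is primary",
}


def lookup_rule(source_structure, target_structure):
    """Return the rule text, or None when the pair is an undefined case."""
    return RULES.get((source_structure, target_structure))


def undefined_cases(observed_pairs):
    """List observed structure pairs that have no rule yet, in first-seen order."""
    missing = []
    for pair in observed_pairs:
        if pair not in RULES and pair not in missing:
            missing.append(pair)
    return missing


if __name__ == "__main__":
    print(lookup_rule("ל + infinitive construct", "to + infinitive"))
    print(undefined_cases([("circumstantial participle", "main clause")]))
```

Keeping the table explicit makes undefined cases easy to list, which fits the revising step of noting undefined cases and expanding the rules.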

Page 14: Quality and consistency in text alignment

Engineering: case-specific rules

Step 1: Identify grammatical structures in source language.

Function words: Articles, Univ. Quantifier, Prepositions, Conjunctions

Substantives: Nouns, Pronouns, Adjectives

Verbs: Auxiliaries, Subjects, Objects, Finite, Volitional, Infinitives, Participles

Punctuation: Quotation mark, Question mark

Page 15: Quality and consistency in text alignment

Engineering: case-specific rules

Step 1: Identify grammatical structures in source language.

Function words: Articles, Univ. Quantifier, Prepositions, Conjunctions

Substantives: Nouns, Pronouns, Adjectives

Pronouns: Personal, Reflexive, Possessive, Reciprocal, Demonstrative, Relative, Interrogative, Indefinite, Correlative

Page 16: Quality and consistency in text alignment

Engineering: case-specific rules

Step 1: Hebrew structure

ל + infinitive construct

Gen 2:15

Page 17: Quality and consistency in text alignment

Engineering: case-specific rules

Step 2: English structures

Case 1: English to + infinitive

Case 2: English infinitive

Gen 2:15

Page 18: Quality and consistency in text alignment

Engineering: case-specific rules

Step 3: Rules

Case 1: English to + infinitive

Rule 1: link separately

Case 2: English infinitive

Rule 2: group ל and infinitive; infinitive is primary

Gen 2:15
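
Rules 1 and 2 can be illustrated with a small data structure. This is a sketch under stated assumptions: `LinkGroup` and `align_l_infinitive` are hypothetical names, the test for Case 1 is simply whether the English rendering starts with "to", and the sample tokens assume the ESV wording of Gen 2:15 ("to work it and keep it"), with the pronoun suffix left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LinkGroup:
    """One alignment link: a set of source tokens tied to a set of target tokens."""
    source: Tuple[str, ...]
    target: Tuple[str, ...]
    primary: Optional[str] = None  # source token with primary status, if any


def align_l_infinitive(infinitive: str, english_tokens: List[str]) -> List[LinkGroup]:
    """Apply Rule 1 or Rule 2 to a Hebrew ל + infinitive construct."""
    if english_tokens and english_tokens[0] == "to":
        # Case 1 / Rule 1: English "to + infinitive", so link the two parts separately.
        return [
            LinkGroup(("ל",), ("to",)),
            LinkGroup((infinitive,), tuple(english_tokens[1:])),
        ]
    # Case 2 / Rule 2: bare English infinitive, so group ל with the infinitive
    # and mark the infinitive as primary.
    return [LinkGroup(("ל", infinitive), tuple(english_tokens), primary=infinitive)]


if __name__ == "__main__":
    print(align_l_infinitive("עבד", ["to", "work"]))  # assumed Case 1 example
    print(align_l_infinitive("שמר", ["keep"]))        # assumed Case 2 example
```

Rule 1 keeps each linked set as small as possible; Rule 2 enlarges the set only because English has no separate word for ל.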

Page 19: Quality and consistency in text alignment

Engineering: case-specific rules

Step 1: Greek structure

circumstantial participle: ἔχων ὑπʼ ἐμαυτὸν στρατιώτας

Step 2: English structures

participle phrase

subordinate clause

main clause

prepositional phrase

preposition

Luke 7:8

Page 20: Quality and consistency in text alignment

Engineering: case-specific rules

ἔχων ὑπʼ ἐμαυτὸν στρατιώτας

Step 2: English structures

participle phrase: having soldiers under myself

subordinate clause: since I have soldiers under myself

main clause: and I have soldiers under myself

prepositional phrase: in having soldiers under myself

preposition: with soldiers under me (ESV)

Luke 7:8

Page 21: Quality and consistency in text alignment

Engineering: case-specific rules

Step 3: Rules

Case 1: participle phrase (Gal 2:1)

συμπαραλαβὼν καὶ Τίτον

taking Titus along with me

Case 2: subordinate clause (Gal 2:3)

Ἕλλην ὤν

though he was a Greek

Page 22: Quality and consistency in text alignment

Proofing and Revising

Proofing: multiple readers; time; consult work of other alignments

Revising: begin alignment; note problem spots; note undefined cases; revise and expand cases/rules

Page 24: Quality and consistency in text alignment

Designing a software tool: an environment to facilitate accuracy and consistency

Page 25: Quality and consistency in text alignment

Designing a software tool: goals

clarity: understand alignment correctly; find errors easily

speed: make changes quickly; dig deeper quickly

comparison: find parallels to check for consistency

Page 26: Quality and consistency in text alignment

Designing a software tool: demo

[demo tool]

Page 27: Quality and consistency in text alignment

Post-processing: checking for accuracy and consistency

Page 28: Quality and consistency in text alignment

Post-processing: philosophy

Find as many algorithmically detectable mistakes as possible.

Recall > Precision

Precision (low): % of hits that are real mistakes; many hits may be false

Recall (high): % of mistakes caught
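
The trade-off can be made concrete with the standard definitions of precision and recall. The function name `precision_recall` and the sample link IDs below are hypothetical; the numbers are illustrative, not results from the author's checkers.

```python
def precision_recall(flagged, actual_mistakes):
    """Standard precision and recall over sets of link IDs.

    flagged: links a checker reported as suspicious ("hits").
    actual_mistakes: links a human reviewer confirmed as wrong.
    """
    flagged, actual = set(flagged), set(actual_mistakes)
    true_hits = flagged & actual
    # Precision: share of hits that are real mistakes.
    precision = len(true_hits) / len(flagged) if flagged else 0.0
    # Recall: share of real mistakes that were caught.
    recall = len(true_hits) / len(actual) if actual else 1.0
    return precision, recall


if __name__ == "__main__":
    # Illustrative only: 10 hits, 3 of them real, out of 4 real mistakes in total.
    print(precision_recall(range(1, 11), [1, 2, 3, 11]))
```

In this made-up case a reviewer wades through seven false hits, but three of the four real mistakes surface, which is the balance the slide favors.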

Page 29: Quality and consistency in text alignment

Post-processing: techniques

1. Natural Language Processing: conformity to consistency rules

uncommon links

improbable links

consistent treatment of n-grams

2. Graph theory: consistent primary status

Page 30: Quality and consistency in text alignment

Natural language processing: rules

Articles

Gen 1:27

Zech 1:10

2 Sam 15:6

Page 31: Quality and consistency in text alignment

Natural language processing: rules

Verbs: Are auxiliaries grouped with main verbs?

Do main verbs receive primary status?

Of: Is “of” grouped with nomen regens (construct)?

Waw: Is waw grouped with conjunctions that follow it?

Page 32: Quality and consistency in text alignment

Natural language processing: rules

Hebrew definite direct object marker (את)

always unlinked (unless interpreted as preposition)

Jer 10:1
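
A rule like this one is easy to check mechanically. The sketch below is an assumption about how such a checker might look, not the tool shown in the demo: the `Token` fields, the `"object_marker"` and `"preposition"` tags, and the unpointed spelling את are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    ref: str                               # where the token occurs
    surface: str                           # unpointed surface form (assumption)
    pos: str                               # hypothetical tag: "object_marker" or "preposition"
    linked_to: Optional[List[str]] = None  # target tokens, or None/[] if unlinked


def linked_object_markers(tokens):
    """Return every את tagged as object marker that has been linked to something."""
    return [
        t for t in tokens
        if t.surface == "את" and t.pos == "object_marker" and t.linked_to
    ]


if __name__ == "__main__":
    sample = [
        Token("example-1", "את", "object_marker"),            # unlinked: conforms
        Token("example-2", "את", "object_marker", ["the"]),   # linked: flagged
        Token("example-3", "את", "preposition", ["with"]),    # preposition: allowed
    ]
    for hit in linked_object_markers(sample):
        print(hit.ref)
```

Tokens read as prepositions are skipped on purpose, matching the exception stated on the slide.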

Page 33: Quality and consistency in text alignment

Natural language processing: rules

[demo Hebrew definite direct object checker]

Page 34: Quality and consistency in text alignment

Natural language processing: context

Uncommon link checker: global context

common tokens

uncommon link

Improbable link checker: local context

more probable link

(“unstable marriage”)
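
Both checkers can be approximated with corpus-wide link counts. This is a rough sketch, assuming links are stored as (source lemma, target lemma) pairs; the function names and thresholds are hypothetical. The second function reads "unstable marriage" as the blocking-pair test from the stable marriage problem, which is an interpretation of the slide rather than something it spells out.

```python
from collections import Counter


def uncommon_links(links, min_token_count=5, max_pair_share=0.1):
    """Global context: flag rare pairings of tokens that are themselves common."""
    links = list(links)
    pair_counts = Counter(links)
    source_counts = Counter(s for s, _ in links)
    target_counts = Counter(t for _, t in links)
    flagged = []
    for (s, t), n in pair_counts.items():
        common = source_counts[s] >= min_token_count and target_counts[t] >= min_token_count
        if common and n / source_counts[s] <= max_pair_share:
            flagged.append((s, t))
    return sorted(flagged)


def blocking_pairs(verse_links, pair_counts):
    """Local context: within one verse, find unlinked pairs that are more
    probable (by corpus count) than both of the links they would replace.

    pair_counts must be a collections.Counter so unseen pairs count as 0.
    """
    flagged = []
    for s1, t1 in verse_links:
        for s2, t2 in verse_links:
            if (s1, t1) == (s2, t2):
                continue
            alternative = pair_counts[(s1, t2)]
            if alternative > pair_counts[(s1, t1)] and alternative > pair_counts[(s2, t2)]:
                flagged.append((s1, t2))
    return flagged


if __name__ == "__main__":
    corpus = [("a", "x")] * 19 + [("a", "y")] + [("b", "y")] * 20
    print(uncommon_links(corpus))
    print(blocking_pairs([("a", "y"), ("b", "x")], Counter(corpus)))
```

Both functions favor recall: they will flag legitimate rare renderings as well as mistakes, and a reviewer decides which is which.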

Page 35: Quality and consistency in text alignment

Natural language processing: context

[demo uncommon and improbable link checker]

Page 36: Quality and consistency in text alignment

N-grams: consistent alignment

4-gram (Hebrew)
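
One way to check n-grams for consistent alignment is to group every occurrence of the same source n-gram and see whether it was always aligned the same way. The sketch below assumes each occurrence is recorded as a (reference, source n-gram, alignment description) triple; that record shape, the function name, and the placeholder tokens are hypothetical.

```python
from collections import defaultdict


def inconsistent_ngrams(occurrences):
    """Group occurrences by source n-gram and keep those aligned in more than one way.

    occurrences: iterable of (ref, source_ngram, alignment), where source_ngram is a
    tuple of source tokens and alignment is any hashable description of the links.
    Returns {source_ngram: {alignment: [refs]}} for n-grams with 2+ distinct alignments.
    """
    by_ngram = defaultdict(lambda: defaultdict(list))
    for ref, ngram, alignment in occurrences:
        by_ngram[ngram][alignment].append(ref)
    return {
        ngram: dict(alignments)
        for ngram, alignments in by_ngram.items()
        if len(alignments) > 1
    }


if __name__ == "__main__":
    four_gram = ("w1", "w2", "w3", "w4")  # placeholder tokens, not real Hebrew
    sample = [
        ("ref-1", four_gram, "grouped"),
        ("ref-2", four_gram, "grouped"),
        ("ref-3", four_gram, "linked separately"),
    ]
    print(inconsistent_ngrams(sample))
```

The minority alignment is not necessarily the wrong one, so the output lists every variant with its references for a reviewer to compare.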

Page 37: Quality and consistency in text alignment

Graph theory: primary status of שם “name”

Example groups linked to שם “name”

name

a/the name

the name of

a name for

renown

was named

she named

he called … name

Page 38: Quality and consistency in text alignment

Graph theory: primary status of שם “name”

Goal: simple directed graph (i.e. no loops)
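
A check for this goal might look like the sketch below. It rests on two assumptions that the slide does not state: that each edge runs from the token marked primary in a group to a secondary token in that group, and that "no loops" can be read as "no directed cycles" (including self-loops). The function name `find_cycle` and the sample edges are hypothetical.

```python
def find_cycle(edges):
    """Return a list of nodes forming a directed cycle, or None if there is none.

    edges: iterable of (primary, secondary) pairs (an assumed encoding).
    Uses depth-first search with the usual white/gray/black coloring.
    """
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        color[node] = GRAY
        stack.append(node)
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                # Back edge: nxt is still on the stack, so we have a cycle.
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    consistent = [("name", "the"), ("name", "of")]
    conflicting = consistent + [("called", "name"), ("name", "called")]
    print(find_cycle(consistent))
    print(find_cycle(conflicting))
```

Under this reading, a cycle means two groups disagree about which token is primary, so each cycle found points at a pair of alignments to reconcile.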

Page 39: Quality and consistency in text alignment

Graph theory: שוב (qal) “return”

Some graphs get complicated.

Page 40: Quality and consistency in text alignment

Graph theory: שוב (qal) “return”

Page 41: Quality and consistency in text alignment

Conclusions

1. Text alignment is useful inasmuch as it is accurate and consistent.

2. Achieving quality and consistency requires multiple strategies:

a. writing consistency standards (before)

b. software design (during)

c. post-processing (after)
