2003 Exchange PROGRESS

BP1110: Close Enough

Indexed Record Retrieval In Progress Using Sound-alikes and Near Matches

Steve Southwell (ses@bravepointdallas.com)
Senior Consultant, BravePoint, Inc.

Jan 03, 2016

Steve Southwell

Employee of BravePoint

Consultant specializing in Progress web-enablement

Business systems analyst

Dallas, Texas based

Just my day job until I get my record contract. Yeah, Baby!!!

The Problem - User Perspective

Users expect intuitive text searches.

Google and other consumer-oriented web sites have raised the bar.

Find what I'm looking for – not what I typed.

It's not my problem if I'm a bad speller

Oh yeah... Put the most interesting results at the top of the list.

Users do not know “contains” syntax.

More users know about quotes and the use of “and” or “or”.

Scope of this Talk

Various tools for making searches work better

General Techniques

Examples

Specific code

Technical Analysis

Disclaimers!

There is no “one-size-fits-all”.

You may trade performance for results.

Some techniques are incompatible with each other.

It all depends on the nature of the data.

This talk is more about theory and methods.

Your mileage may vary.

Batteries not included.

Do not remove this tag under penalty of law.

Questions?

• Feel free to ask questions anytime.

Types of Searches Where Close Counts

Product Searches

Target Smart Searching Example

User Can't Spell!

Amazon Smart Searching Example

User Can't Spell!

Types of Searches Where Close Counts

Product Searches

Searches for Proper Names

Yellow Pages Smart Searching

User Can't Spell!

Google Smart Searching Example

User Can't Spell!

Types of Searches Where Close Counts

Product Searches

Searches for Proper Names

Full-text Searches

AltaVista Smart Searching Example

User Can't Spell!

The Problem – Developer Perspective

Internal users need quick results. Time is money.

If customers want to buy, I'll help them find it.

If they can't spell it, we still sell it.

A widget by any other name... It's still for sale.

List the good stuff first.

Technical Issues

How can Progress store what a word sounds like?

How do I search for sound-alikes or similar words?

How can I rank search results?

Determining What a Word Sounds Like

Soundex

Used to index US Census records going back to the 1880 census

Intended to index surnames

Only codes starting letter and 3 sounds

Had to be simple enough to do by hand.

1 = B, P, F, V
2 = C, S, K, G, J, Q, X, Z
3 = D, T
4 = L
5 = M, N
6 = R

Soundex Examples

Last Name: Southwell

Soundex: S340

First letter = S

Next consonant = T = 3

H & W not represented.

Next consonant = L = 4

Next L is a double – skip

Pad with 0

Other S340 Names: Seidl, Steele, Staley, Stahl, Stahley, Seidel, Settle, Shadle, Shotwell, Shuttle, Sidwell, Southall, Stall, Steel, Steely, Stell, Still, Stoll, Stowell, Stull, Sudlow, Suttle

src/samples/soundex.p

DEFINE INPUT  PARAMETER name AS CHARACTER NO-UNDO.
DEFINE OUTPUT PARAMETER code AS CHARACTER NO-UNDO.

DEFINE VARIABLE e AS INTEGER   NO-UNDO.  /* letter position: A = 1 ... Z = 26 */
DEFINE VARIABLE i AS INTEGER   NO-UNDO.
DEFINE VARIABLE k AS CHARACTER NO-UNDO.  /* digit for the current letter */
DEFINE VARIABLE l AS CHARACTER NO-UNDO.  /* digit for the previous letter */

ASSIGN
  l    = ""
  name = CAPS(name)
  code = SUBSTRING(name,1,1).

DO i = 2 TO LENGTH(name):
  e = ASC(SUBSTRING(name,i,1)) - 64.
  IF e >= 1 AND e <= 26 THEN DO:
    /* Look up the digit for this letter; "0" means not coded. */
    k = SUBSTRING("01230120022455012623010202",e,1).
    IF k <> l AND k <> "0" THEN code = code + k.
    IF LENGTH(code) > 3 THEN LEAVE.
  END.
  l = k.
END.

/* Pad with zeros to one letter plus three digits. */
code = SUBSTRING(code + "000",1,4).
RETURN.
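A hedged usage sketch: assuming soundex.p can be found on the PROPATH under the path shown on the slide, it can be called with RUN and an OUTPUT parameter. The variable names are illustrative only. Per the Southwell walkthrough and the list of other S340 names, both calls should return S340.

```progress
/* Sketch only: assumes src/samples/soundex.p is reachable via PROPATH. */
DEFINE VARIABLE cCode1 AS CHARACTER NO-UNDO.
DEFINE VARIABLE cCode2 AS CHARACTER NO-UNDO.

RUN src/samples/soundex.p (INPUT "Southwell", OUTPUT cCode1).
RUN src/samples/soundex.p (INPUT "Shotwell",  OUTPUT cCode2).

/* Equal codes mean the two names are treated as sound-alikes. */
MESSAGE cCode1 cCode2 (cCode1 = cCode2) VIEW-AS ALERT-BOX.
```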

Soundey

More sound codes

Indexes vowel positions

Codes the entire word

Makes phonetic substitutions

0 = aehiouwy
1 = bp
2 = ckqx
3 = dt
4 = l
5 = mn
6 = r
7 = fv
8 = gj
9 = sz

Soundey – Continued

Soundeylib.i available free at www.FreeFrameWork.org

More sophisticated than Soundex

Steps in Soundey Conversion

Pre-token

Mark word boundaries

“Anywhere” translations

“Ends” translations

“Begins” translations

Eliminate silent E

Unmark word boundaries

Translate characters to digits

Eliminate double digits

Soundey Example

Word: Telephone
Soundey: 3040705

Replace 'ph' with 'f': telefone

Eliminate silent 'e' on the end: telefon

Translate characters to digits:

T = 3, E = 0, L = 4, E = 0, F = 7, O = 0, N = 5

3040705
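The Telephone walkthrough can be sketched as a small function. This is a minimal illustration, not the code in soundeylib.i: the name soundeySketch is hypothetical, only the single 'ph' translation from the example is included, and the silent-e rule is a naive "drop a trailing e". The digit string is taken from the Soundey table.

```progress
/* Hypothetical sketch, NOT the implementation in soundeylib.i. */
FUNCTION soundeySketch RETURNS CHARACTER (INPUT pcWord AS CHARACTER):
  /* Digit for each letter a..z, from the Soundey table. */
  DEFINE VARIABLE cDigits AS CHARACTER NO-UNDO
    INITIAL "01230780082455012693070209".
  DEFINE VARIABLE cOut  AS CHARACTER NO-UNDO.
  DEFINE VARIABLE cChar AS CHARACTER NO-UNDO.
  DEFINE VARIABLE cPrev AS CHARACTER NO-UNDO.
  DEFINE VARIABLE iPos  AS INTEGER   NO-UNDO.
  DEFINE VARIABLE i     AS INTEGER   NO-UNDO.

  pcWord = LC(TRIM(pcWord)).

  /* "Anywhere" translation: ph sounds like f. */
  pcWord = REPLACE(pcWord, "ph", "f").

  /* Naive silent-e rule: drop one trailing e. */
  IF LENGTH(pcWord) > 1 AND SUBSTRING(pcWord, LENGTH(pcWord), 1) = "e" THEN
    pcWord = SUBSTRING(pcWord, 1, LENGTH(pcWord) - 1).

  DO i = 1 TO LENGTH(pcWord):
    iPos = ASC(SUBSTRING(pcWord, i, 1)) - 96.   /* a = 1 ... z = 26 */
    IF iPos < 1 OR iPos > 26 THEN NEXT.          /* skip non-letters */
    cChar = SUBSTRING(cDigits, iPos, 1).
    /* Eliminate double digits: skip a digit equal to the previous one. */
    IF cChar <> cPrev THEN cOut = cOut + cChar.
    cPrev = cChar.
  END.

  RETURN cOut.
END FUNCTION.

/* telephone -> telefone -> telefon -> 3040705 */
MESSAGE soundeySketch("Telephone") VIEW-AS ALERT-BOX.
```

Whether the real library collapses repeated zeros the same way is an assumption here; check soundeylib.i before relying on it.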

Soundey – Disadvantages

Not as good as Metaphone

Causes problems when the search target or the search string may itself contain digits.

Metaphone

Published in 1990 by Lawrence Philips

Reduces alphabet to 16 consonant sounds

0 = th sound
b = b
x = ch, sh sounds
s = s, some c
k = k, some c, g
j = j, some g
t = t, d
f = f, v
h = h*
l = l
m = m
n = n
p = p
r = r
w = w*
y = y*

* mostly silent

Metaphone – Continued

Less fuzzy than Soundex or Soundey

Uses many English spelling heuristics to convert odd spellings to correct sounds.

Progress version available at http://www.freeframework.org/downloads/new/wordnet/

Not a strict standard

Have a look at metaphonerules.d

Technical Issues

How can Progress determine what a word sounds like?

How do I search for sound-alikes or similar words?

How can I rank search results?

Storing the Sound-like Value

Add field(s) to your target table – one or two per target field

For example: If searching against Item.ItemName

Add Item.MetaphoneCode.

Add Item.MetaphoneFragments.

Both word-indexed.

Metaphone and Fragments

You can use triggers to keep your fragment list up-to-date.

WordChop() fragments single words.

SuperWordChop() does sentences.

Searching for “ball*” would now find both baseball and balloon.

Storing fragments in metaphone allows for fuzzy partial matches!

Fragments?

Standard Progress word indexing only matches against the beginning of words.

CONTAINS “*ball*” is a syntax error.

How would you match “ball” with “baseball”?

The fragment field contains this:

Baseball aseball seball eball ball

Don't store fragments under 4 characters.
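A minimal sketch of the fragment idea, assuming a fragment is simply every suffix of the word that is at least 4 characters long. The function name wordChopSketch is hypothetical; the real WordChop() in the FreeFrameWork source may differ in signature and behavior.

```progress
/* Hypothetical sketch, NOT the library's WordChop(). */
FUNCTION wordChopSketch RETURNS CHARACTER (INPUT pcWord AS CHARACTER):
  DEFINE VARIABLE cOut AS CHARACTER NO-UNDO.
  DEFINE VARIABLE i    AS INTEGER   NO-UNDO.

  /* Emit each suffix of 4 or more characters, separated by spaces.
     Words shorter than 4 characters produce no fragments at all. */
  DO i = 1 TO LENGTH(pcWord) - 3:
    cOut = cOut + (IF cOut = "" THEN "" ELSE " ") + SUBSTRING(pcWord, i).
  END.

  RETURN cOut.
END FUNCTION.

/* Shows: baseball aseball seball eball ball */
MESSAGE wordChopSketch("baseball") VIEW-AS ALERT-BOX.
```

With the fragments stored in a word-indexed field, a begins-style search such as “ball*” can match the "ball" fragment of "baseball".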

Populating Metaphone Fields in DB

{lib/metaphone.i}
...
FOR EACH Item EXCLUSIVE-LOCK:
  ASSIGN Item.MetaphoneCode = toMetaphone(Item.ItemName + " " + Item.CatDescription).
  Item.MetaphoneFragments = superWordChop(Item.MetaphoneCode).
END.
...
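To keep the two fields current after the initial load, the same assignments could live in a WRITE trigger. The following is only a sketch: it assumes the trigger procedure is registered for the Item table in the Data Dictionary, and that both toMetaphone() and superWordChop() are available through lib/metaphone.i, which the slides do not confirm.

```progress
/* Hypothetical WRITE trigger for Item; the file name and registration
   in the Data Dictionary are up to you. */
TRIGGER PROCEDURE FOR WRITE OF Item.

/* Assumption: this include supplies toMetaphone() and superWordChop(). */
{lib/metaphone.i}

/* Recompute the sound-alike fields whenever the record is written. */
Item.MetaphoneCode      = toMetaphone(Item.ItemName + " " + Item.CatDescription).
Item.MetaphoneFragments = superWordChop(Item.MetaphoneCode).
```

Recomputing on every write costs a little on each update; that may be acceptable when items change rarely and are searched often.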

Using Metaphone in 4gl Queries

MySearch = toMetaphone(MySearch).

FOR EACH ITEM WHERE ITEM.MetaphoneCode CONTAINS mySearch NO-LOCK:

...

END.

Metaphone Use in 4gl

Demo of Sports2000 item search with Soundey: itemsearch.w

General Metaphone Query Tips

Try regular “contains” search first.

Convert search string to Metaphone code, and do “contains” search on MetaphoneCode field.

Try Split and Rejoin

Other alternatives:

Synonym and Related word searches

Neural Networks with User Feedback

Forced Ranking
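The first two tips can be combined into a two-pass search. This sketch assumes Item.ItemName and Item.MetaphoneCode are both word-indexed and that toMetaphone() comes from lib/metaphone.i; the search text and variable names are illustrative.

```progress
{lib/metaphone.i}

DEFINE VARIABLE cSearch AS CHARACTER NO-UNDO INITIAL "basball".
DEFINE VARIABLE cSound  AS CHARACTER NO-UNDO.
DEFINE VARIABLE lFound  AS LOGICAL   NO-UNDO.

/* Pass 1: regular word-index search on the text as typed. */
FOR EACH Item NO-LOCK WHERE Item.ItemName CONTAINS cSearch:
  lFound = TRUE.
  DISPLAY Item.ItemName.
END.

/* Pass 2: only when pass 1 found nothing, search the sound-alike field. */
IF NOT lFound THEN DO:
  cSound = toMetaphone(cSearch).
  FOR EACH Item NO-LOCK WHERE Item.MetaphoneCode CONTAINS cSound:
    DISPLAY Item.ItemName.
  END.
END.
```

Running the exact search first keeps correctly spelled queries fast and precise; the fuzzy pass only costs time when the user would otherwise see nothing.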

Metaphone Extensibility

Can make it replace known words or fragments:

Anywhere

Beginning of words

Ending of words

GUI demonstration – FunctionTester.w

Other Search Issues

Unexpected Boolean operators. “and” is default

Users want to use the words “and” and “or”

Use booleanConvert() on the query string.

Hyphens / Compound Words

dehyphenize()

Word Synonyms

See thesaurus.i and itemsearch.w for examples
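As an illustration of the boolean issue, here is a minimal sketch that rewrites the words “and” and “or” into the & and | operators that CONTAINS understands. The name booleanSketch is hypothetical and this is not the library's booleanConvert(); it does not handle quotes, parentheses, or a query that starts or ends with an operator.

```progress
/* Hypothetical sketch, NOT the library's booleanConvert(). */
FUNCTION booleanSketch RETURNS CHARACTER (INPUT pcQuery AS CHARACTER):
  DEFINE VARIABLE cOut  AS CHARACTER NO-UNDO.
  DEFINE VARIABLE cWord AS CHARACTER NO-UNDO.
  DEFINE VARIABLE i     AS INTEGER   NO-UNDO.

  DO i = 1 TO NUM-ENTRIES(pcQuery, " "):
    cWord = ENTRY(i, pcQuery, " ").
    IF cWord = "" THEN NEXT.               /* collapse repeated spaces */
    IF cWord = "and" THEN cWord = "&".     /* comparison ignores case */
    ELSE IF cWord = "or" THEN cWord = "|".
    cOut = cOut + (IF cOut = "" THEN "" ELSE " ") + cWord.
  END.

  RETURN cOut.
END FUNCTION.

/* Shows: bats | gloves & leather */
MESSAGE booleanSketch("bats or gloves and leather") VIEW-AS ALERT-BOX.
```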

Numbers and Ordinals

29 Palms / Twentynine Palms

5th Inning / Fifth Inning

Abbreviations / Slang

Ft. Worth, TX / Fort Worth, Texas
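One possible way to handle these cases is to normalize both the stored text and the search string through the same lookup list before indexing or searching. The sketch below is an assumption, not code from the presentation: the function name normalizeSketch and the tiny hard-coded lists are hypothetical stand-ins for a real table of equivalents.

```progress
/* Hypothetical sketch: map known numbers, ordinals and abbreviations
   to one canonical spelling. */
FUNCTION normalizeSketch RETURNS CHARACTER (INPUT pcText AS CHARACTER):
  /* Parallel lists: token as typed, and the canonical form to search on. */
  DEFINE VARIABLE cFrom AS CHARACTER NO-UNDO INITIAL "29,5th,ft.,tx".
  DEFINE VARIABLE cTo   AS CHARACTER NO-UNDO
    INITIAL "twentynine,fifth,fort,texas".
  DEFINE VARIABLE cOut  AS CHARACTER NO-UNDO.
  DEFINE VARIABLE cWord AS CHARACTER NO-UNDO.
  DEFINE VARIABLE iHit  AS INTEGER   NO-UNDO.
  DEFINE VARIABLE i     AS INTEGER   NO-UNDO.

  DO i = 1 TO NUM-ENTRIES(pcText, " "):
    /* Strip commas attached to the token, e.g. "Worth," */
    cWord = TRIM(ENTRY(i, pcText, " "), ",").
    IF cWord = "" THEN NEXT.
    iHit = LOOKUP(cWord, cFrom).           /* lookup ignores case */
    IF iHit > 0 THEN cWord = ENTRY(iHit, cTo).
    cOut = cOut + (IF cOut = "" THEN "" ELSE " ") + LC(cWord).
  END.

  RETURN cOut.
END FUNCTION.

/* Both lines show: fort worth texas */
MESSAGE normalizeSketch("Ft. Worth, TX") SKIP
        normalizeSketch("Fort Worth, Texas") VIEW-AS ALERT-BOX.
```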

Ranking Search Results

Not an exact science

Can use many criteria:

Number of word matches

Similarity to key words

“Preferred” results – upsells, recent additions, etc.

Requires use of temp-table for results.

All results must be analyzed, so keep set small. (MAX-ROWS?)
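A minimal sketch of ranking through a temp-table, using only the first criterion (number of word matches). The candidate names are a hard-coded stand-in so the example runs without a database; in a real search they would come from the CONTAINS query. All names here are illustrative, and this is not the scoring used in itemsearch.w.

```progress
DEFINE TEMP-TABLE ttResult NO-UNDO
  FIELD ItemName AS CHARACTER
  FIELD Score    AS INTEGER
  INDEX idxScore Score DESCENDING.

/* Stand-in candidate list; a real search would fill this from the database. */
DEFINE VARIABLE cNames  AS CHARACTER NO-UNDO
  INITIAL "Baseball Glove,Golf Ball,Leather Glove Oil,Tennis Racket".
DEFINE VARIABLE cSearch AS CHARACTER NO-UNDO INITIAL "leather glove".
DEFINE VARIABLE cName   AS CHARACTER NO-UNDO.
DEFINE VARIABLE iScore  AS INTEGER   NO-UNDO.
DEFINE VARIABLE i       AS INTEGER   NO-UNDO.
DEFINE VARIABLE j       AS INTEGER   NO-UNDO.

DO i = 1 TO NUM-ENTRIES(cNames):
  ASSIGN cName  = ENTRY(i, cNames)
         iScore = 0.
  /* One point for each search word that appears in the name. */
  DO j = 1 TO NUM-ENTRIES(cSearch, " "):
    IF LOOKUP(ENTRY(j, cSearch, " "), cName, " ") > 0 THEN
      iScore = iScore + 1.
  END.
  IF iScore > 0 THEN DO:
    CREATE ttResult.
    ASSIGN ttResult.ItemName = cName
           ttResult.Score    = iScore.
  END.
END.

/* Best matches first: Leather Glove Oil (2), then Baseball Glove (1). */
FOR EACH ttResult BY ttResult.Score DESCENDING:
  DISPLAY ttResult.ItemName FORMAT "x(20)" ttResult.Score.
END.
```

Other criteria, such as a bonus for "preferred" items, could be added to the score before the final sort.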

Search Ranking Demonstration

Itemsearch.w

Source Code Availability

All source code used in this presentation can be found at the FreeFrameWork website: http://www.freeframework.org

Up-to-date copy of this presentation available with the source code at the FreeFrameWork site.

All questions answered...

Stump the Chump