Statistical Models of Semantics and Unsupervised Language Discovery Lecture #18 Introduction to Natural Language Processing CMPSCI 585, Fall 2007 Andrew McCallum Computer Science Department University of Massachusetts Amherst Including slides from Chris Manning, Dan Klein, Rion Snow & Patrick Pantel.

Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Aug 17, 2020



Attachment Ambiguity

• Where to attach a phrase in the parse tree?
• “I saw the man with the telescope.”
  – What does “with the telescope” modify?
  – Is the problem AI-complete? Yes, but…
  – Proposed simple structural factors
    • Right association [Kimball 1973] = ‘low’ or ‘near’ attachment = ‘early closure’ of NP
    • Minimal attachment [Frazier 1978] (depends on grammar) = ‘high’ or ‘distant’ attachment = ‘late closure’ (of NP)

Attachment Ambiguity

• “The children ate the cake with a spoon.”
• “The children ate the cake with frosting.”

• “Joe included the package for Susan.”
• “Joe carried the package for Susan.”

• Ford, Bresnan and Kaplan (1982): “It is quite evident, then, that the closure effects in these sentences are induced in some way by the choice of the lexical items.”

Lexical acquisition, semantic similarity

• Previous models give the same estimate to all unseen events.
• Unrealistic: we could hope to refine that based on semantic classes of words.
• Examples
  – “Susan ate the cake with a durian.”
  – “Susan had never eaten a fresh durian before.”
  – Although never seen, “eating pineapple” should be more likely than “eating holograms” because pineapple is similar to apples, and we have seen “eating apples”.

An application: selectional preferences

• Most verbs prefer arguments of a particular type. Such regularities are called selectional preferences or selectional restrictions.
• “Bill drove a…” Mustang, car, truck, jeep
• Selectional preference strength: how strongly does a verb constrain its direct objects?
• “see” versus “unknotted”

Measuring selectional preference strength

• Assume we are given a clustering of (direct object) nouns. Resnik (1993) uses WordNet.
• Selectional association between a verb and a class: the proportion that its summand contributes to preference strength.
• For nouns in multiple classes, disambiguate as most likely sense:

Selectional preference strength (made-up data)

Noun class c   P(c)   P(c|eat)   P(c|see)   P(c|find)
people         0.25   0.01       0.25       0.33
furniture      0.25   0.01       0.25       0.33
food           0.25   0.97       0.25       0.33
action         0.25   0.01       0.25       0.01
SPS S(v)              1.76       0.00       0.35

A(eat, food) = 1.08
A(find, action) = -0.13
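The formulas behind these numbers are not in the extracted text. The sketch below assumes the usual definitions attributed to Resnik (1993): S(v) is the KL divergence (in bits) between P(C|v) and P(C), and A(v, c) is class c's summand divided by S(v). Under that assumption it reproduces the values in the made-up table. Function and variable names are illustrative.

```python
import math

# Class priors and class-given-verb distributions copied from the made-up table.
P_C = {"people": 0.25, "furniture": 0.25, "food": 0.25, "action": 0.25}
P_C_GIVEN_V = {
    "eat":  {"people": 0.01, "furniture": 0.01, "food": 0.97, "action": 0.01},
    "see":  {"people": 0.25, "furniture": 0.25, "food": 0.25, "action": 0.25},
    "find": {"people": 0.33, "furniture": 0.33, "food": 0.33, "action": 0.01},
}


def summand(verb, cls):
    """One term of the KL divergence: P(c|v) * log2(P(c|v) / P(c))."""
    p = P_C_GIVEN_V[verb][cls]
    return p * math.log2(p / P_C[cls]) if p > 0 else 0.0


def preference_strength(verb):
    """S(v) = D(P(C|v) || P(C)), measured in bits (assumed definition)."""
    return sum(summand(verb, c) for c in P_C)


def association(verb, cls):
    """A(v, c): the share of S(v) contributed by class c.

    Undefined when S(v) is 0 (as for "see" here), so callers should
    check the strength first.
    """
    return summand(verb, cls) / preference_strength(verb)


if __name__ == "__main__":
    for v in ("eat", "see", "find"):
        print(v, round(preference_strength(v), 2))
    print(round(association("eat", "food"), 2))
    print(round(association("find", "action"), 2))
```

A verb such as “see”, whose class distribution matches the prior, gets strength 0: it tells us nothing about its object.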

Selectional Preference Strength example (Resnik, Brown corpus)

But how might we measure word similarity for word classes?

• Vector spaces: word-by-word matrix B


Similarity measures for binary vectors
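The slide's table of measures is not in the extracted text. As a hedged illustration, here are four measures commonly listed for binary vectors (matching coefficient, Dice, Jaccard, overlap), treating each vector as the set of dimensions where it is 1. Which measures the slide actually showed is an assumption, and the example context sets are made up.

```python
def matching(x, y):
    """Matching coefficient: number of shared non-zero dimensions."""
    return len(x & y)


def dice(x, y):
    """Dice coefficient: 2|X ∩ Y| / (|X| + |Y|)."""
    total = len(x) + len(y)
    return 2 * len(x & y) / total if total else 0.0


def jaccard(x, y):
    """Jaccard coefficient: |X ∩ Y| / |X ∪ Y|."""
    union = len(x | y)
    return len(x & y) / union if union else 0.0


def overlap(x, y):
    """Overlap coefficient: |X ∩ Y| / min(|X|, |Y|)."""
    smaller = min(len(x), len(y))
    return len(x & y) / smaller if smaller else 0.0


if __name__ == "__main__":
    # Hypothetical context sets for two words.
    apple = {"eat", "juice", "tree"}
    pineapple = {"juice", "tree", "tropical", "can"}
    for measure in (matching, dice, jaccard, overlap):
        print(measure.__name__, measure(apple, pineapple))
```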


Cosine measure
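A minimal sketch of the cosine measure for real-valued vectors, such as rows of a word-by-word co-occurrence matrix. The counts in the example are made up for illustration.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 if either vector is all zeros (a convention chosen here,
    since the measure is undefined in that case).
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


if __name__ == "__main__":
    # Hypothetical co-occurrence counts with the context words
    # ("eat", "drive", "juice").
    apple = [10, 0, 4]
    pineapple = [6, 0, 5]
    car = [0, 12, 0]
    print(cosine(apple, pineapple))  # close to 1: similar contexts
    print(cosine(apple, car))        # 0: no shared contexts
```

Because the vectors are length-normalized, a frequent word and a rare word with the same relative context profile score as highly similar.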

Example of cosine measure on word-by-word matrix on NYT


Probabilistic measures
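The slide's formulas are not in the extracted text. As an assumption about what was shown, the sketch below implements three measures often used to compare probability distributions over contexts: KL divergence, information radius (the sum of each distribution's KL divergence to their average), and L1 norm.

```python
import math


def kl_divergence(p, q):
    """D(p || q) in bits; infinite if q is 0 somewhere p is not."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log2(pi / qi)
    return total


def information_radius(p, q):
    """IRad(p, q) = D(p || m) + D(q || m), where m is the average of p and q."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return kl_divergence(p, m) + kl_divergence(q, m)


def l1_norm(p, q):
    """Sum of absolute differences between the two distributions."""
    return sum(abs(pi - qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    p = [0.5, 0.5]
    q = [1.0, 0.0]
    print(kl_divergence(q, p))       # finite
    print(kl_divergence(p, q))       # infinite: q gives zero mass to an event p allows
    print(information_radius(p, q))  # symmetric and always finite
    print(l1_norm(p, q))
```

KL divergence is asymmetric and blows up on zeros, which is one reason the symmetric, bounded alternatives are attractive for sparse co-occurrence data.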

Neighbors of word “company” [Lee]

Learning syntactic patterns for automatic hypernym discovery

Rion Snow, Daniel Jurafsky, and Andrew Y. Ng.

VERBOCEAN: Mining the Web for Fine-Grained Semantic Verb Relations

Timothy Chklovski and Patrick Pantel


http://semantics.isi.edu/ocean/

Demo

Topic Models

Unsupervised Models of Word Co-occurrences

A Probabilistic Approach

• Define a probabilistic generative model for documents.
• Learn the parameters of this model by fitting them to the data and a prior.

Clustering words into topics with Latent Dirichlet Allocation

[Blei, Ng, Jordan 2003]

Generative Process:

For each document:
  Sample a distribution over topics, θ
  For each word in doc:
    Sample a topic, z
    Sample a word from the topic, w

Example:

θ: 70% Iraq war, 30% US election
z: Iraq war
w: “bombing”
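The generative process can be sketched in a few lines. This is an illustration only: the two topics, their word probabilities, and the Dirichlet parameter `alpha` are made up, and the topic-word distributions are fixed by hand rather than drawn from a prior.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw an index in proportion to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(topic_word, vocab, alpha, n_words, rng):
    """Generate one document: sample θ, then (z, w) for each word position."""
    theta = sample_dirichlet([alpha] * len(topic_word), rng)
    tokens = []
    for _ in range(n_words):
        z = sample_index(theta, rng)          # sample a topic
        w = sample_index(topic_word[z], rng)  # sample a word from that topic
        tokens.append((z, vocab[w]))
    return theta, tokens


if __name__ == "__main__":
    # Hypothetical vocabulary and topics ("Iraq war", "US election").
    vocab = ["bombing", "troops", "vote", "campaign"]
    topic_word = [
        [0.5, 0.4, 0.05, 0.05],  # Iraq war
        [0.05, 0.05, 0.5, 0.4],  # US election
    ]
    theta, tokens = generate_document(topic_word, vocab, alpha=0.5,
                                      n_words=8, rng=random.Random(0))
    print(theta)
    print(tokens)
```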

Example topics induced from a large collection of text

[Tenenbaum et al.]

STORY STORIES TELL CHARACTER CHARACTERS AUTHOR READ TOLD SETTING TALES PLOT TELLING SHORT FICTION ACTION TRUE EVENTS TELLS TALE NOVEL

MIND WORLD DREAM DREAMS THOUGHT IMAGINATION MOMENT THOUGHTS OWN REAL LIFE IMAGINE SENSE CONSCIOUSNESS STRANGE FEELING WHOLE BEING MIGHT HOPE

WATER FISH SEA SWIM SWIMMING POOL LIKE SHELL SHARK TANK SHELLS SHARKS DIVING DOLPHINS SWAM LONG SEAL DIVE DOLPHIN UNDERWATER

DISEASE BACTERIA DISEASES GERMS FEVER CAUSE CAUSED SPREAD VIRUSES INFECTION VIRUS MICROORGANISMS PERSON INFECTIOUS COMMON CAUSING SMALLPOX BODY INFECTIONS CERTAIN

FIELD MAGNETIC MAGNET WIRE NEEDLE CURRENT COIL POLES IRON COMPASS LINES CORE ELECTRIC DIRECTION FORCE MAGNETS BE MAGNETISM POLE INDUCED

SCIENCE STUDY SCIENTISTS SCIENTIFIC KNOWLEDGE WORK RESEARCH CHEMISTRY TECHNOLOGY MANY MATHEMATICS BIOLOGY FIELD PHYSICS LABORATORY STUDIES WORLD SCIENTIST STUDYING SCIENCES

BALL GAME TEAM FOOTBALL BASEBALL PLAYERS PLAY FIELD PLAYER BASKETBALL COACH PLAYED PLAYING HIT TENNIS TEAMS GAMES SPORTS BAT TERRY

JOB WORK JOBS CAREER EXPERIENCE EMPLOYMENT OPPORTUNITIES WORKING TRAINING SKILLS CAREERS POSITIONS FIND POSITION FIELD OCCUPATIONS REQUIRE OPPORTUNITY EARN ABLE


Collocations

• An expression consisting of two or more words that corresponds to some conventional way of saying things.
• Characterized by limited compositionality.
  – compositional: meaning of expression can be predicted by meaning of its parts.
  – “dynamic programming”, “hidden Markov model”
  – “weapons of mass destruction”
  – “kick the bucket”, “hear it through the grapevine”

Topics Modeling Phrases

• Topics based only on unigrams are often difficult to interpret.
• Topic discovery itself is confused because important meaning / distinctions are carried by phrases.
• Significant opportunity to provide improved language models to ASR, MT, IR, etc.

Topical N-gram Model

[Figure: plate diagram with variables z1…z4, y1…y4, w1…w4; parameters θ, α, β, γ1, γ2, φ1, φ2, ψ; plates T, D, W, TW]

[Wang, McCallum 2005]

LDA Topic

LDA: algorithms, algorithm, genetic, problems, efficient

Topical N-grams: genetic algorithms, genetic algorithm, evolutionary computation, evolutionary algorithms, fitness function

Topic Comparison

LDA: learning, optimal, reinforcement, state, problems, policy, dynamic, action, programming, actions, function, markov, methods, decision, rl, continuous, spaces, step, policies, planning

Topical N-grams (2): reinforcement learning, optimal policy, dynamic programming, optimal control, function approximator, prioritized sweeping, finite-state controller, learning system, reinforcement learning rl, function approximators, markov decision problems, markov decision processes, local search, state-action pair, markov decision process, belief states, stochastic policy, action selection, upright position, reinforcement learning methods

Topical N-grams (1): policy, action, states, actions, function, reward, control, agent, q-learning, optimal, goal, learning, space, step, environment, system, problem, steps, sutton, policies

Topic Comparison

LDA: motion, visual, field, position, figure, direction, fields, eye, location, retina, receptive, velocity, vision, moving, system, flow, edge, center, light, local

Topical N-grams (2): receptive field, spatial frequency, temporal frequency, visual motion, motion energy, tuning curves, horizontal cells, motion detection, preferred direction, visual processing, area mt, visual cortex, light intensity, directional selectivity, high contrast, motion detectors, spatial phase, moving stimuli, decision strategy, visual stimuli

Topical N-grams (1): motion, response, direction, cells, stimulus, figure, contrast, velocity, model, responses, stimuli, moving, cell, intensity, population, image, center, tuning, complex, directions

Topic Comparison

LDA: word, system, recognition, hmm, speech, training, performance, phoneme, words, context, systems, frame, trained, speaker, sequence, speakers, mlp, frames, segmentation, models

Topical N-grams (2): speech recognition, training data, neural network, error rates, neural net, hidden markov model, feature vectors, continuous speech, training procedure, continuous speech recognition, gamma filter, hidden control, speech production, neural nets, input representation, output layers, training algorithm, test set, speech frames, speaker dependent

Topical N-grams (1): speech, word, training, system, recognition, hmm, speaker, performance, phoneme, acoustic, words, context, systems, frame, trained, sequence, phonetic, speakers, mlp, hybrid

Unsupervised learning of topic hierarchies

(Blei, Griffiths, Jordan & Tenenbaum, NIPS 2003)

Joint models of syntax and semantics (Griffiths, Steyvers, Blei & Tenenbaum, NIPS 2004)

• Embed a topic model inside an nth-order Hidden Markov Model:

Document-specific distribution over topics

Semantic classes

FOOD FOODS BODY NUTRIENTS DIET FAT SUGAR ENERGY MILK EATING FRUITS VEGETABLES WEIGHT FATS NEEDS CARBOHYDRATES VITAMINS CALORIES PROTEIN MINERALS

MAP NORTH EARTH SOUTH POLE MAPS EQUATOR WEST LINES EAST AUSTRALIA GLOBE POLES HEMISPHERE LATITUDE PLACES LAND WORLD COMPASS CONTINENTS

DOCTOR PATIENT HEALTH HOSPITAL MEDICAL CARE PATIENTS NURSE DOCTORS MEDICINE NURSING TREATMENT NURSES PHYSICIAN HOSPITALS DR SICK ASSISTANT EMERGENCY PRACTICE

BOOK BOOKS READING INFORMATION LIBRARY REPORT PAGE TITLE SUBJECT PAGES GUIDE WORDS MATERIAL ARTICLE ARTICLES WORD FACTS AUTHOR REFERENCE NOTE

GOLD IRON SILVER COPPER METAL METALS STEEL CLAY LEAD ADAM ORE ALUMINUM MINERAL MINE STONE MINERALS POT MINING MINERS TIN

BEHAVIOR SELF INDIVIDUAL PERSONALITY RESPONSE SOCIAL EMOTIONAL LEARNING FEELINGS PSYCHOLOGISTS INDIVIDUALS PSYCHOLOGICAL EXPERIENCES ENVIRONMENT HUMAN RESPONSES BEHAVIORS ATTITUDES PSYCHOLOGY PERSON

CELLS CELL ORGANISMS ALGAE BACTERIA MICROSCOPE MEMBRANE ORGANISM FOOD LIVING FUNGI MOLD MATERIALS NUCLEUS CELLED STRUCTURES MATERIAL STRUCTURE GREEN MOLDS

PLANTS PLANT LEAVES SEEDS SOIL ROOTS FLOWERS WATER FOOD GREEN SEED STEMS FLOWER STEM LEAF ANIMALS ROOT POLLEN GROWING GROW

Syntactic classes

GOOD SMALL NEW IMPORTANT GREAT LITTLE LARGE * BIG LONG HIGH DIFFERENT SPECIAL OLD STRONG YOUNG COMMON WHITE SINGLE CERTAIN

THE HIS THEIR YOUR HER ITS MY OUR THIS THESE A AN THAT NEW THOSE EACH MR ANY MRS ALL

MORE SUCH LESS MUCH KNOWN JUST BETTER RATHER GREATER HIGHER LARGER LONGER FASTER EXACTLY SMALLER SOMETHING BIGGER FEWER LOWER ALMOST

ON AT INTO FROM WITH THROUGH OVER AROUND AGAINST ACROSS UPON TOWARD UNDER ALONG NEAR BEHIND OFF ABOVE DOWN BEFORE

SAID ASKED THOUGHT TOLD SAYS MEANS CALLED CRIED SHOWS ANSWERED TELLS REPLIED SHOUTED EXPLAINED LAUGHED MEANT WROTE SHOWED BELIEVED WHISPERED

ONE SOME MANY TWO EACH ALL MOST ANY THREE THIS EVERY SEVERAL FOUR FIVE BOTH TEN SIX MUCH TWENTY EIGHT

HE YOU THEY I SHE WE IT PEOPLE EVERYONE OTHERS SCIENTISTS SOMEONE WHO NOBODY ONE SOMETHING ANYONE EVERYBODY SOME THEN

BE MAKE GET HAVE GO TAKE DO FIND USE SEE HELP KEEP GIVE LOOK COME WORK MOVE LIVE EAT BECOME

Corpus-specific factorization (NIPS)

[Figure panels: Semantics, Syntax]

Syntactic classes in PNAS

5: IN FOR ON BETWEEN DURING AMONG FROM UNDER WITHIN THROUGHOUT THROUGH TOWARD INTO AT INVOLVING AFTER ACROSS AGAINST WHEN ALONG

8: ARE WERE WAS IS WHEN REMAIN REMAINS REMAINED PREVIOUSLY BECOME BECAME BEING BUT GIVE MERE APPEARED APPEAR ALLOWED NORMALLY EACH

14: THE THIS ITS THEIR AN EACH ONE ANY INCREASED EXOGENOUS OUR RECOMBINANT ENDOGENOUS TOTAL PURIFIED TILE FULL CHRONIC ANOTHER EXCESS

25: SUGGEST INDICATE SUGGESTING SUGGESTS SHOWED REVEALED SHOW DEMONSTRATE INDICATING PROVIDE SUPPORT INDICATES PROVIDES INDICATED DEMONSTRATED SHOWS SO REVEAL DEMONSTRATES SUGGESTED

26: LEVELS NUMBER LEVEL RATE TIME CONCENTRATIONS VARIETY RANGE CONCENTRATION DOSE FAMILY SET FREQUENCY SERIES AMOUNTS RATES CLASS VALUES AMOUNT SITES

30: RESULTS ANALYSIS DATA STUDIES STUDY FINDINGS EXPERIMENTS OBSERVATIONS HYPOTHESIS ANALYSES ASSAYS POSSIBILITY MICROSCOPY PAPER WORK EVIDENCE FINDING MUTAGENESIS OBSERVATION MEASUREMENTS

33: BEEN MAY CAN COULD WELL DID DOES DO MIGHT SHOULD WILL WOULD MUST CANNOT REMAINED ALSO THEY BECOME MAG LIKELY

Semantic highlighting

Darker words are more likely to have been generated from the topic-based “semantics” module:

Social Network Analysis with Links and Text

Role Discovery
Group Discovery
Trend Discovery
Community Discovery
Impact Measurement

From LDA to Author-Recipient-Topic (ART)

Inference and Estimation

Gibbs Sampling:
- Easy to implement
- Reasonably fast
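To illustrate why Gibbs sampling is considered easy to implement, here is a minimal collapsed Gibbs sampler for plain LDA. It is not the ART sampler from the lecture, which would also resample a recipient for each token. The hyperparameters `alpha` and `beta`, the function name, and the toy corpus are assumptions for illustration.

```python
import random


def gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
              n_iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA over documents given as lists of word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]               # tokens in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # times word w is assigned to topic k
    topic_total = [0] * n_topics                              # tokens assigned to topic k
    assign = []

    # Random initialization of topic assignments.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights, k=1)[0]
                # Record the new assignment.
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return assign, doc_topic, topic_word


if __name__ == "__main__":
    # Toy corpus: word ids 0-2 and 3-5 tend to occur in different documents.
    docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 2, 1, 1], [5, 4, 3, 3]]
    assign, doc_topic, topic_word = gibbs_lda(docs, n_topics=2, vocab_size=6)
    print(doc_topic)
```

Each step only decrements counts, computes one weight per topic, samples, and increments counts, which is why the sampler stays short.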

Enron Email Corpus

• 250k email messages
• 23k people

Date: Wed, 11 Apr 2001 06:56:00 -0700 (PDT)
From: [email protected]
To: [email protected]
Subject: Enron/TransAlta Contract dated Jan 1, 2001

Please see below. Katalin Kiss of TransAlta has requested an electronic copy of our final draft? Are you OK with this? If so, the only version I have is the original draft without revisions.

DP

Debra Perlingiere
Enron North America Corp.
Legal Department
1400 Smith Street, EB 3885
Houston, Texas
[email protected]

Topics, and prominent senders / receivers discovered by ART

Topic names, by hand

Topics, and prominent senders / receivers discovered by ART

Beck = “Chief Operations Officer”
Dasovich = “Government Relations Executive”
Shapiro = “Vice President of Regulatory Affairs”
Steffes = “Vice President of Government Affairs”

Comparing Role Discovery

connection strength (A,B) =

Traditional SNA: distribution over recipients
Author-Topic: distribution over authored topics
ART: distribution over authored topics

Comparing Role Discovery: Tracy Geaconne ⇔ Dan McCarty

Traditional SNA    Author-Topic       ART
Similar roles      Different roles    Different roles

Geaconne = “Secretary”
McCarty = “Vice President”

Comparing Role Discovery: Lynn Blair ⇔ Kimberly Watson

Traditional SNA    Author-Topic      ART
Different roles    Very different    Very similar

Blair = “Gas pipeline logistics”
Watson = “Pipeline facilities planning”

McCallum Email Corpus 2004

• January - October 2004
• 23k email messages
• 825 people

From: [email protected]
Subject: NIPS and ....
Date: June 14, 2004 2:27:41 PM EDT
To: [email protected]

There is pertinent stuff on the first yellow folder that is completed either travel or other things, so please sign that first folder anyway. Then, here is the reminder of the things I'm still waiting for:

NIPS registration receipt.
CALO registration receipt.

Thanks,
Kate


McCallum Email Blockstructure

Four most prominent topics in discussions with ____?

Two most prominent topics in discussions with ____?

Topic 1                 Topic 2
Words      Prob         Words      Prob
love       0.030514     today      0.051152
house      0.015402     tomorrow   0.045393
[blank]    0.013659     time       0.041289
time       0.012351     ll         0.039145
great      0.011334     meeting    0.033877
hope       0.011043     week       0.025484
dinner     0.00959      talk       0.024626
saturday   0.009154     meet       0.023279
left       0.009154     morning    0.022789
ll         0.009009     monday     0.020767
[blank]    0.008282     back       0.019358
visit      0.008137     call       0.016418
evening    0.008137     free       0.015621
stay       0.007847     home       0.013967
bring      0.007701     won        0.013783
weekend    0.007411     day        0.01311
road       0.00712      hope       0.012987
sunday     0.006829     leave      0.012987
kids       0.006539     office     0.012742
flight     0.006539     tuesday    0.012558


Role-Author-Recipient-Topic Models

Results with RART: People in “Role #3” in Academic Email

• olc        lead Linux sysadmin
• gauthier   sysadmin for CIIR group
• irsystem   mailing list CIIR sysadmins
• system     mailing list for dept. sysadmins
• allan      Prof., chair of “computing committee”
• valerie    second Linux sysadmin
• tech       mailing list for dept. hardware
• steve      head of dept. I.T. support

Roles for allan (James Allan)

• Role #3 I.T. support
• Role #2 Natural Language researcher

Roles for pereira (Fernando Pereira)

• Role #2 Natural Language researcher
• Role #4 SRI CALO project participant
• Role #6 Grant proposal writer
• Role #10 Grant proposal coordinator
• Role #8 Guests at McCallum’s house

ART: Roles but not Groups

Traditional SNA     Author-Topic    ART
Block structured    Not             Not

Enron TransWestern Division

Social Network Analysis with Links and Text

Role Discovery
Group Discovery
Trend Discovery
Community Discovery
Impact Measurement

Groups and Topics

• Input:
  – Observed relations between people
  – Attributes on those relations (text, or categorical)
• Output:
  – Attributes clustered into “topics”
  – Groups of people, varying depending on topic

Adjacency Matrix Representing Relations

[Figure: adjacency matrices over students A to F. In roster order A B C D E F the group labels are G1 G2 G1 G2 G3 G3; reordered as A C B D E F they are G1 G1 G2 G2 G3 G3.]

Student Roster

Adams
Bennett
Carter
Davis
Edwards
Frederking

Academic Admiration

Acad(A, B) Acad(C, B)
Acad(A, D) Acad(C, D)
Acad(B, E) Acad(D, E)
Acad(B, F) Acad(D, F)
Acad(E, A) Acad(F, A)
Acad(E, C) Acad(F, C)
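A small sketch of how the listed relations become an adjacency matrix, and how reordering rows and columns by group exposes the block structure. The relation pairs and group labels are taken from the slide; the function names are illustrative.

```python
# Academic admiration relations from the slide: (admirer, admired).
ACAD = [
    ("A", "B"), ("C", "B"), ("A", "D"), ("C", "D"),
    ("B", "E"), ("D", "E"), ("B", "F"), ("D", "F"),
    ("E", "A"), ("F", "A"), ("E", "C"), ("F", "C"),
]
# Group labels as read off the slide's matrix axes.
GROUP = {"A": "G1", "C": "G1", "B": "G2", "D": "G2", "E": "G3", "F": "G3"}


def adjacency(order, relations):
    """Build a 0/1 matrix with rows and columns in the given entity order."""
    index = {name: i for i, name in enumerate(order)}
    matrix = [[0] * len(order) for _ in order]
    for source, target in relations:
        matrix[index[source]][index[target]] = 1
    return matrix


def block_counts(relations, group):
    """Count relations between each ordered pair of groups."""
    counts = {}
    for source, target in relations:
        key = (group[source], group[target])
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    for order in (list("ABCDEF"), list("ACBDEF")):
        print(" ".join(order))
        for row in adjacency(order, ACAD):
            print(" ".join(str(cell) for cell in row))
        print()
    print(block_counts(ACAD, GROUP))
```

In roster order the ones look scattered. Grouped as A C B D E F, they form solid 2×2 blocks: G1 admires G2, G2 admires G3, and G3 admires G1.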

Page 77: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

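Group Model: Partitioning Entities into Groups

Stochastic Blockstructures for Relations [Nowicki, Snijders 2001]

S: number of entities

G: number of groups

Enhanced with arbitrary number of groups in [Kemp, Griffiths, Tenenbaum 2004]

[Figure: plate diagram. Group assignment g (Multinomial, with Dirichlet prior) in a plate of size S; relation v (Binomial) in a plate of size S²; Beta-distributed parameter in a plate of size G².]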
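
As a rough illustration of the generative story behind this kind of block model, here is a minimal Python sketch. It is an assumption-laden toy, not the Nowicki and Snijders implementation: the function names, the symmetric Dirichlet parameter `alpha`, and the Beta parameters `beta` are all hypothetical, and each relation is a single binary draw (a Binomial with one trial).

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_blockstructure(num_entities, num_groups,
                          alpha=1.0, beta=(1.0, 1.0), seed=0):
    """Sample group assignments and a binary relation matrix."""
    rng = random.Random(seed)
    # Mixing proportions over groups (Dirichlet prior).
    theta = sample_dirichlet([alpha] * num_groups, rng)
    # One group assignment per entity: S multinomial draws.
    groups = rng.choices(range(num_groups), weights=theta, k=num_entities)
    # One link probability per ordered pair of groups: G^2 Beta draws.
    gamma = [[rng.betavariate(*beta) for _ in range(num_groups)]
             for _ in range(num_groups)]
    # One binary relation per ordered pair of entities: S^2 draws,
    # each depending only on the two entities' groups.
    relations = [[1 if rng.random() < gamma[groups[i]][groups[j]] else 0
                  for j in range(num_entities)]
                 for i in range(num_entities)]
    return groups, gamma, relations


groups, gamma, relations = sample_blockstructure(6, 3, seed=1)
```

Inference runs this story in reverse: given only `relations`, it estimates `groups` and `gamma`.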

Page 78: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Two Relations with Different Attributes

[Figure: two adjacency matrices over entities A to F. For Academic Admiration the entities are reordered into three groups (A, C in G1; B, D in G2; E, F in G3). For Social Admiration they are reordered into two groups (A, C, E in G1; B, D, F in G2).]

Student Roster

Adams, Bennett, Carter, Davis, Edwards, Frederking

Academic Admiration

Acad(A, B) Acad(C, B)
Acad(A, D) Acad(C, D)
Acad(B, E) Acad(D, E)
Acad(B, F) Acad(D, F)
Acad(E, A) Acad(F, A)
Acad(E, C) Acad(F, C)

Social Admiration

Soci(A, B) Soci(A, D) Soci(A, F)
Soci(B, A) Soci(B, C) Soci(B, E)
Soci(C, B) Soci(C, D) Soci(C, F)
Soci(D, A) Soci(D, C) Soci(D, E)
Soci(E, B) Soci(E, D) Soci(E, F)
Soci(F, A) Soci(F, C) Soci(F, E)

Page 79: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

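The Group-Topic Model: Discovering Groups and Topics Simultaneously

[Figure: plate diagram extending the group model with topics. Words w (Multinomial, with Dirichlet prior) in a plate of size N_b; topic t (Uniform prior) in a plate of size B; T topics. Relation v (Binomial) in a plate of size S²; Beta-distributed parameter in a plate of size G²; group assignment g (Multinomial, with Dirichlet prior) in a plate of size S, nested in a plate of size T.]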

Page 80: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Dataset #1: U.S. Senate

• 16 years of voting records in the US Senate (1989 – 2005)

• a Senator may respond Yea or Nay to a resolution

• 3423 resolutions with text attributes (index terms)

• 191 Senators in total across 16 years

S.543
Title: An Act to reform Federal deposit insurance, protect the deposit insurance funds, recapitalize the Bank Insurance Fund, improve supervision and regulation of insured depository institutions, and for other purposes.
Sponsor: Sen Riegle, Donald W., Jr. [MI] (introduced 3/5/1991) Cosponsors (2)
Latest Major Action: 12/19/1991 Became Public Law No: 102-242.
Index terms: Banks and banking, Accounting, Administrative fees, Cost control, Credit, Deposit insurance, Depressed areas, and 110 other terms

Adams (D-WA), Nay; Akaka (D-HI), Yea; Bentsen (D-TX), Yea; Biden (D-DE), Yea; Bond (R-MO), Yea; Bradley (D-NJ), Nay; Conrad (D-ND), Nay; …

Page 81: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topics Discovered (U.S. Senate)

Mixture of Unigrams

• Economic: federal, labor, insurance, aid, tax, business, employee, care
• Military Misc.: government, military, foreign, tax, congress, aid, law, policy
• Energy: energy, power, water, nuclear, gas, petrol, research, pollution
• Education: education, school, aid, children, drug, students, elementary, prevention

Group-Topic Model

• Education + Domestic: education, school, federal, aid, government, tax, energy, research
• Foreign: foreign, trade, chemicals, tariff, congress, drugs, communicable, diseases
• Economic: labor, insurance, tax, congress, income, minimum, wage, business
• Social Security + Medicare: social, security, insurance, medical, care, medicare, disability, assistance

Page 82: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Groups Discovered (US Senate)

Groups from topic Education + Domestic

Page 83: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Senators Who Change Coalition the Most, Dependent on Topic

e.g. Senator Shelby (D-AL) votes
• with the Republicans on Economic
• with the Democrats on Education + Domestic
• with a small group of maverick Republicans on Social Security + Medicare

Page 84: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Dataset #2: The UN General Assembly

• Voting records of the UN General Assembly (1990 - 2003)

• A country may choose to vote Yes, No or Abstain

• 931 resolutions with text attributes (titles)

• 192 countries in total

• Also experiments later with resolutions from 1960-2003

Vote on Permanent Sovereignty of Palestinian People, 87th plenary meeting

The draft resolution on permanent sovereignty of the Palestinian people in the occupied Palestinian territory, including Jerusalem, and of the Arab population in the occupied Syrian Golan over their natural resources (document A/54/591) was adopted by a recorded vote of 145 in favour to 3 against with 6 abstentions:

In favour: Afghanistan, Argentina, Belgium, Brazil, Canada, China, France, Germany, India, Japan, Mexico, Netherlands, New Zealand, Pakistan, Panama, Russian Federation, South Africa, Spain, Turkey, and 126 other countries.
Against: Israel, Marshall Islands, United States.
Abstain: Australia, Cameroon, Georgia, Kazakhstan, Uzbekistan, Zambia.

Page 85: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topics Discovered (UN)

Mixture of Unigrams

• Everything Nuclear: nuclear, weapons, use, implementation, countries
• Human Rights: rights, human, palestine, situation, israel
• Security in Middle East: occupied, israel, syria, security, calls

Group-Topic Model

• Nuclear Non-proliferation: nuclear, states, united, weapons, nations
• Nuclear Arms Race: nuclear, arms, prevention, race, space
• Human Rights: rights, human, palestine, occupied, israel

Page 86: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Groups Discovered (UN)

The country lists for each group are ordered by 2005 GDP (PPP), and only 5 countries are shown for groups that have more than 5 members.

Page 87: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Groups and Topics, Trends over Time (UN)

Page 88: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Social Network Analysis with Links and Text

• Role Discovery
• Group Discovery
• Trend Discovery
• Community Discovery
• Impact Measurement

Page 89: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Groups and Topics, Trends over Time (UN)

Page 90: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Want to Model Trends over Time

• Pattern appears only briefly
– Capture its statistics in focused way
– Don’t confuse it with patterns elsewhere in time

• Is prevalence of topic growing or waning?

• How do roles, groups, influence shift over time?

Page 91: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topics over Time (TOT)

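[Figure: plate diagram of the TOT model. For each of D documents, θ is a multinomial over topics with Dirichlet prior α. For each of the N_d tokens in a document, z is the topic index, w is the word, and t is the time stamp. For each of T topics, φ is a multinomial over words with Dirichlet prior β, and there is a Beta distribution over time with uniform prior γ.]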

[Wang, McCallum, KDD 2006]
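
To make the roles of the variables concrete, here is a minimal Python sketch of a TOT-style generative process for one document. It is a toy under stated assumptions, not the authors' code: the topic-word distributions `phi`, the per-topic Beta parameters `psi`, and all numeric values are hypothetical, and time stamps are assumed to be normalized to the interval [0, 1].

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(num_tokens, phi, psi, alpha, rng):
    """Generate (topic, word, time stamp) triples for one document.

    phi[z] is a list of word probabilities for topic z.
    psi[z] is an (a, b) pair of Beta parameters over normalized time.
    """
    theta = sample_dirichlet(alpha, rng)  # multinomial over topics
    tokens = []
    for _ in range(num_tokens):
        z = rng.choices(range(len(phi)), weights=theta, k=1)[0]     # topic index
        w = rng.choices(range(len(phi[z])), weights=phi[z], k=1)[0]  # word
        t = rng.betavariate(*psi[z])                                 # time stamp
        tokens.append((z, w, t))
    return theta, tokens


# Hypothetical toy parameters: two topics over a three-word vocabulary.
phi = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
psi = [(2.0, 8.0), (8.0, 2.0)]  # topic 0 peaks early, topic 1 peaks late
theta, tokens = generate_document(20, phi, psi, [1.0, 1.0], random.Random(0))
```

Because each topic carries its own Beta distribution over time, a topic that occurs only briefly gets a narrow peak instead of being blended with patterns elsewhere in the timeline.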

Page 92: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

State of the Union Address

208 Addresses delivered between January 8, 1790 and January 29, 2002. To increase the number of documents, we split the addresses into paragraphs and treated them as ‘documents’. One-line paragraphs were excluded. Stopword removal was applied.

• 17156 ‘documents’

• 21534 words

• 669,425 tokens

Our scheme of taxation, by means of which this needless surplus is taken from the people and put into the public Treasury, consists of a tariff or duty levied upon importations from abroad and internal-revenue taxes levied upon the consumption of tobacco and spirituous and malt liquors. It must be conceded that none of the things subjected to internal-revenue taxation are, strictly speaking, necessaries. There appears to be no just complaint of this taxation by the consumers of these articles, and there seems to be nothing so well able to bear the burden without hardship to any portion of the people.

1910

Page 93: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Comparing TOT against LDA

Page 94: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

TOT versus LDA on my email

Page 95: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topic Distributions Conditioned on Time in NIPS conference papers

[Figure: x-axis: time; y-axis: topic mass (in vertical height).]

Page 96: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Social Network Analysis with Links and Text

• Role Discovery
• Group Discovery
• Trend Discovery
• Community Discovery
• Impact Measurement

Page 97: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

How do new links form in social networks?

1) Randomly (Poisson graph)
2) Pick someone popular (Preferential attachment)
3) Pick someone with mutual friends (Adamic & Adar, Liben-Nowell & Kleinberg)
4) Pick someone from one of your “communities” (Mimno, Wallach & McCallum 2007)

Can we find communities that help predict links?
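
Two of the baselines named above are simple to compute from the graph alone. The sketch below, in Python, scores a candidate link by preferential attachment (product of degrees) and by the Adamic/Adar measure (mutual neighbors weighted by inverse log degree). The toy graph and its node names are hypothetical and exist only to show the calculation.

```python
import math

# Hypothetical toy co-authorship graph (undirected), for illustration only.
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"),
         ("cat", "dan"), ("dan", "eve")]


def neighbors(edges):
    """Build a node -> set-of-neighbors map from an undirected edge list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def preferential_attachment(graph, u, v):
    """Score grows with the product of the two nodes' degrees."""
    return len(graph[u]) * len(graph[v])


def adamic_adar(graph, u, v):
    """Sum over mutual neighbors, down-weighting high-degree ones.

    A mutual neighbor always has degree >= 2, so log(degree) > 0.
    """
    shared = graph[u] & graph[v]
    return sum(1.0 / math.log(len(graph[z])) for z in shared)


graph = neighbors(edges)
pa_score = preferential_attachment(graph, "ann", "dan")
aa_score = adamic_adar(graph, "ann", "dan")
```

Neither score looks at text or at latent communities, which is the gap the community-based model on the following slides is meant to fill.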

Page 98: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

A Community-based Generative Model for Text and Co-authorships

1) To generate a document, we first pick a community.

2) The community then determines the choice of authors and topics.

3) From topics, we pick words.

[Diagram: Community → Authors; Community → Topics → Words]
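
The three generative steps can be sketched as a small sampling routine. This Python toy is only a reading of the slide's description, not the model from the paper: every distribution, community name, author name, and word below is made up for illustration, and authors are drawn independently, so the same author can be drawn twice.

```python
import random

# Hypothetical toy parameters; every name and number here is illustrative.
community_prior = {"rl": 0.5, "graphical_models": 0.5}
authors_given_community = {
    "rl": {"author_a": 0.6, "author_b": 0.4},
    "graphical_models": {"author_c": 0.7, "author_a": 0.3},
}
topics_given_community = {
    "rl": {"control": 0.8, "inference": 0.2},
    "graphical_models": {"control": 0.1, "inference": 0.9},
}
words_given_topic = {
    "control": {"policy": 0.5, "reward": 0.5},
    "inference": {"belief": 0.5, "variational": 0.5},
}


def draw(dist, rng):
    """Draw one key from a {key: probability} dictionary."""
    names = list(dist)
    return rng.choices(names, weights=[dist[n] for n in names], k=1)[0]


def generate_document(num_authors, num_words, rng):
    # 1) Pick a community for the document.
    community = draw(community_prior, rng)
    # 2) The community determines the authors and the topics.
    authors = [draw(authors_given_community[community], rng)
               for _ in range(num_authors)]
    words = []
    for _ in range(num_words):
        topic = draw(topics_given_community[community], rng)
        # 3) Each word is picked from its topic.
        words.append(draw(words_given_topic[topic], rng))
    return community, authors, words


community, authors, words = generate_document(2, 8, random.Random(0))
```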

Page 99: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

A Community-based Generative Model for Text and Co-authorships

Graphical Model can answer various queries!

• P(author3 | author1, author2)
• P(author3 | author1, author2, text)
• P(community | authors)
• P(authors | community)
• P(text | community)
• P(text | authors)

[Diagram: Community → Authors; Community → Topics → Words]

Page 100: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Link Prediction

Probability of NIPS 2004-6 Co-authorships

(Preferential attachment is much worse, at -40,121.)

Page 101: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Community-Author View

Topics:
• features, feature, markov, sequence, models, conditional, label, function, set
• number, results, paper, based, function, previous, resulting, introduction, general
• policy, learning, action, states, function, reward, actions, optimal, mdp
• control, controller, model, helicopter, system, neural, forward, learning, systems
• model, models, press, shows, figure, related, journal, underlying, correspond
• present, effect, figure, references, important, increase, similar, addition, increased
• learning, control, reinforcement, sutton, action, space, task, trajectory, methods

Authors: Ng_A, Koller_D, Parr_R, Abbeel_P, Jordan_M, Merzenich_M, Mel_B

Topics:
• propagation, belief, tree, nodes, node, approximation, variational, networks, bound
• number, results, paper, based, function, previous, resulting, introduction, general
• theorem, case, proof, function, assume, set, section, algorithm, bound
• field, boltzmann, approximations, exact, jordan, parameters, set, step, network
• log, models, inference, variables, model, distribution, variational, parameters, matrix
• problem, algorithm, optimization, methods, solution, method, problems, proposed, optimal
• clustering, spectral, graph, matrix, cut, data, clusters, eigenvectors, normalized

Authors: Jordan_M, Jaakkola_T, Saul_L, Bach_F_R, Singh_S, Wainwright_M, Nguyen_X

Page 102: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Community-Author-Topic View

Topics:
• words, model, word, documents, document, text, topic, distribution, mixture
• suffix, algorithm, feature, adaptor, space, model, kernels, strings, natural
• learning, category, naive, definition, estimation, single, figure, applied, obtain
• set, labels, analysis, adclus, pmm, function, evaluation, problem, alphabet
• number, results, paper, based, function, previous, resulting, introduction, general
• prior, posterior, distribution, bayesian, likelihood, data, models, probability, model
• target, task, visual, figure, contrast, attention, search, orientation, discrimination

Authors: Griffiths_T_L, Singer_Y, Blei_D, Goldwater_S, Jordan_M, Johnson_M, Campbell_W

Topics:
• propagation, belief, tree, nodes, node, approximation, variational, networks, bound
• field, boltzmann, approximations, exact, jordan, parameters, set, step, network
• log, models, inference, variables, model, distribution, variational, parameters, matrix
• network, variables, node, inference, distribution, nodes, algorithm, message, tree
• number, results, paper, based, function, previous, resulting, introduction, general
• theorem, case, proof, function, assume, set, section, algorithm, bound
• mixture, data, gaussian, density, likelihood, parameters, distribution, model, function

Authors: Jordan_M, Willsky_A, Jaakkola_T, Saul_L, Wiegerinck_W, Kappen_H, Wainwright_M

Topics:
• control, motor, learning, arm, model, movement, feedback, movements, hand
• eye, vor, visual, desired, field, controller, force, cerebellum, vestibular
• neural, data, activity, figure, firing, movement, motor, speech, dynamics
• present, effect, figure, references, important, increase, similar, addition, increased
• finger, data, learning, shift, rbfs, pattern, manipulated, scaling, modules
• visual, corrective, performance, generalization, neural, figure, neurons, network, learning
• model, models, press, shows, figure, related, journal, underlying, correspond

Authors: Kawato_M, Jordan_M, Barto_A, Vatikiotis, Shadmehr_R, Hirayama_M, Wolpert_D

Page 103: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Social Network Analysis with Links and Text

• Role Discovery
• Group Discovery
• Trend Discovery
• Community Discovery
• Impact Measurement

Page 104: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Our Data

• Over 1.6 million research papers, gathered as part of Rexa.info portal.

• Cross linked references / citations.

Page 105: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Previous Systems

Page 107: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Previous Systems

[Diagram: Research Paper entity with a Cites relation.]

Page 108: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

More Entities and Relations

[Diagram: Research Paper (Cites), Person, University, Venue, Grant, Groups, Expertise.]

Page 122: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topical Transfer

Citation counts from one topic to another.

Map “producers and consumers”
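
Counting citations from one topic to another needs only a topic label per paper and a list of citation pairs. The Python sketch below is a simplification under stated assumptions: each paper carries exactly one topic label, same-topic citations are skipped, and the paper ids and citation pairs are hypothetical.

```python
from collections import Counter

# Hypothetical data: one topic label per paper and (citing, cited) pairs.
paper_topic = {
    "p1": "Digital Libraries",
    "p2": "Web Pages",
    "p3": "Web Pages",
    "p4": "Computer Vision",
}
citations = [("p2", "p1"), ("p3", "p1"), ("p4", "p1"), ("p3", "p2")]


def topical_transfer(paper_topic, citations):
    """Count citations from each citing topic to each different cited topic."""
    counts = Counter()
    for citing, cited in citations:
        consumer, producer = paper_topic[citing], paper_topic[cited]
        if consumer != producer:  # within-topic citations are not transfer
            counts[(consumer, producer)] += 1
    return counts


transfer = topical_transfer(paper_topic, citations)
```

Reading the counts by cited topic shows which topics act as producers of ideas, and reading them by citing topic shows the consumers.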

Page 123: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topical Bibliometric Impact Measures

• Topical Citation Counts

• Topical Impact Factors

• Topical Longevity

• Topical Precedence

• Topical Diversity

• Topical Transfer

[Mann, Mimno, McCallum, 2006]

Page 124: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topical Transfer

Transfer from Digital Libraries to other topics

Other topic | Cit’s | Paper Title
Web Pages | 31 | Trawling the Web for Emerging Cyber-Communities, Kumar, Raghavan,... 1999.
Computer Vision | 14 | On being ‘Undigital’ with digital cameras: extending the dynamic...
Video | 12 | Lessons learned from the creation and deployment of a terabyte digital video libr..
Graphs | 12 | Trawling the Web for Emerging Cyber-Communities
Web Pages | 11 | WebBase: a repository of Web pages

Page 125: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topical Diversity

Papers that had the most influence across many other fields...

Page 126: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topical Diversity

Entropy of the topic distribution among papers that cite this paper (this topic).

[Figure: example distributions labeled High Diversity and Low Diversity.]
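
The entropy definition above translates directly into a few lines of Python. This is a minimal sketch that assumes each citing paper has a single topic label; the function name and the example labels are illustrative.

```python
import math
from collections import Counter


def topical_diversity(citing_topics):
    """Shannon entropy (in bits) of the topic labels of the citing papers."""
    counts = Counter(citing_topics)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Citations spread evenly over four topics: high diversity.
high = topical_diversity(["vision", "speech", "retrieval", "graphs"])
# All citations from a single topic: lowest possible diversity.
low = topical_diversity(["vision", "vision", "vision", "vision"])
```

A paper cited evenly from k topics scores log2(k) bits, while a paper cited from only one topic scores zero.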

Page 127: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topical Bibliometric Impact Measures

• Topical Citation Counts

• Topical Impact Factors

• Topical Longevity

• Topical Precedence

• Topical Diversity

• Topical Transfer

[Mann, Mimno, McCallum, 2006]

Page 128: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topical Precedence

Within a topic, what are the earliest papers that received more than n citations?

“Early-ness”

Speech Recognition:

• Some experiments on the recognition of speech, with one and two ears, E. Colin Cherry (1953)
• Spectrographic study of vowel reduction, B. Lindblom (1963)
• Automatic Lipreading to enhance speech recognition, Eric D. Petajan (1965)
• Effectiveness of linear prediction characteristics of the speech wave for..., B. Atal (1974)
• Automatic Recognition of Speakers from Their Voices, B. Atal (1976)

Page 129: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topical Precedence

Within a topic, what are the earliest papers that received more than n citations?

“Early-ness”

Information Retrieval:

• On Relevance, Probabilistic Indexing and Information Retrieval, Kuhns and Maron (1960)
• Expected Search Length: A Single Measure of Retrieval Effectiveness Based on the Weak Ordering Action of Retrieval Systems, Cooper (1968)
• Relevance feedback in information retrieval, Rocchio (1971)
• Relevance feedback and the optimization of retrieval effectiveness, Salton (1971)
• New experiments in relevance feedback, Ide (1971)
• Automatic Indexing of a Sound Database Using Self-organizing Neural Nets, Feiten and Gunzel (1982)
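
The precedence question ("earliest papers in a topic with more than n citations") is a filter followed by a sort. The Python sketch below assumes a simple record layout with `topic`, `year`, and `citations` fields; the field names, paper ids, and counts are hypothetical and are not the Rexa data.

```python
def topical_precedence(papers, topic, min_citations, top_k=5):
    """Return the earliest papers in `topic` with more than `min_citations`."""
    eligible = [p for p in papers
                if p["topic"] == topic and p["citations"] > min_citations]
    return sorted(eligible, key=lambda p: p["year"])[:top_k]


# Hypothetical records, for illustration only.
papers = [
    {"id": "p1", "topic": "speech", "year": 1953, "citations": 40},
    {"id": "p2", "topic": "speech", "year": 1950, "citations": 3},
    {"id": "p3", "topic": "speech", "year": 1974, "citations": 120},
    {"id": "p4", "topic": "retrieval", "year": 1960, "citations": 80},
    {"id": "p5", "topic": "speech", "year": 1963, "citations": 25},
]

earliest = [p["id"] for p in topical_precedence(papers, "speech", 20)]
```

The citation threshold matters: without it the list would be dominated by old papers that nobody in the topic actually built on, such as `p2` here.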

Page 130: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topical Transfer Through Time

• Can we predict which research topics will be “hot” at ICML next year?

• ...based on
– the hot topics in “neighboring” venues last year
– learned “neighborhood” distances for venue pairs

Page 131: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

How do Ideas Progress Through Social Networks?

Hypothetical Example:

[Diagram: the idea “ADA Boost” and the venues COLT, ICML, ACL (NLP), ICCV (Vision), SIGIR (Info. Retrieval).]


Page 134: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topic Prediction Models

Static Model

Transfer Model

Linear Regression and Ridge Regression Used for Coefficient Training.
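
The slide names linear and ridge regression for coefficient training, but its model equations did not survive in this transcript. As a rough sketch only, assume a hypothetical linear setup in which a topic's share at a target venue is predicted from its share at neighboring venues the year before. Ridge coefficients for such a setup can be fit in closed form by solving (XᵀX + λI) w = Xᵀy. The function names and the toy numbers below are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ridge_fit(features, targets, lam):
    """Return w minimizing sum((y - w.x)^2) + lam * |w|^2."""
    d = len(features[0])
    xtx = [[sum(row[i] * row[j] for row in features) + (lam if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [sum(row[i] * y for row, y in zip(features, targets))
           for i in range(d)]
    return solve(xtx, xty)


# Toy data: each row holds one topic's share at two neighboring venues last
# year; each target is that topic's share at the target venue this year.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [1.0, 2.0, 3.0]
w_plain = ridge_fit(features, targets, lam=0.0)  # ordinary least squares
w_ridge = ridge_fit(features, targets, lam=1.0)  # coefficients shrink toward 0
```

With `lam=0.0` this reduces to ordinary linear regression; a positive `lam` shrinks the coefficients, which helps when many venues are used as predictors and few years of data are available.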

Page 135: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Preliminary Results

Transfer Model with Ridge Regression is a good predictor.

[Figure: mean squared prediction error (smaller is better) versus number of venues used for prediction; curve labeled Transfer Model.]

Page 137: Statistical Models of Semantics and Unsupervised Language …mccallum/courses/inlp2007/lect18-unsup.ppt.pdf · Statistical Models of Semantics and Unsupervised Language Discovery

Topic Model Musings

• 3 years ago Latent Dirichlet Allocation appeared as a complex innovation... but now these methods & mechanics are well-understood.

• Innovation now is to understand data and modeling needs, how to structure a new model to capture these.