
Motivation/Introduction Related Work Dataset Creation Algorithm Experiments Conclusion

Weakly Supervised User Profile Extraction from Twitter

Jiwei Li, Alan Ritter and Eduard Hovy

School of Computer Science, Carnegie Mellon University

June 22nd, 2014

Jiwei Li, Alan Ritter and Eduard Hovy User Profile Extraction


User Profile


Main Contribution

Automatic extraction of attributes from Twitter:

Spouse

Education

Job


We use Google Plus as distant supervision for user attribute extraction.

We present a large-scale dataset for this task.

We demonstrate the benefit of jointly reasoning about users' social network structure.


Outline

Motivation/Introduction

Related Work

Dataset Creation

Algorithm

Experiments

Conclusion


Motivation/Introduction

Why Profile Extraction?

Friend Recommendation

Targeted Advertising (movies, books, ...)


Twitter serves as a wonderful source:


Text Level Evidence

Network Information

Homophily: People sharing more attributes have a higher chance of becoming friends in social media


Question

Unstructured Twitter data → Structured User Profile?


Related Work

User Attribute Extraction/Identification

Gender (Ciot et al., 2013; Liu and Ruths, 2013)

Age (Rao et al., 2010)

Relying on Amazon Mechanical Turk

Political Polarity (Pennacchiotti et al., 2011)

Relying on external political websites


Dataset Creation

Challenge:

Lack of Training Data


Distant Supervision

Relation Extraction (Mintz et al., 2009)

Paris is the capital and most populous city of France

The capital of France is Paris
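The labeling idea behind distant supervision can be sketched in a few lines. This is only an illustration under stated assumptions: the toy knowledge base `KB`, the relation name `capital_of`, the function `label_sentences`, and the plain substring matching are all hypothetical and are not taken from the slides or from Mintz et al. (2009).

```python
# Hypothetical toy knowledge base of (entity1, relation, entity2) facts.
KB = [("Paris", "capital_of", "France")]


def label_sentences(sentences, kb):
    """Pair each sentence with every KB relation whose two entities it mentions."""
    labeled = []
    for sentence in sentences:
        for entity1, relation, entity2 in kb:
            # Assumption: naive substring matching stands in for entity linking.
            if entity1 in sentence and entity2 in sentence:
                labeled.append((sentence, relation))
    return labeled


sentences = [
    "Paris is the capital and most populous city of France",
    "The capital of France is Paris",
    "Berlin is the capital of Germany",
]
labeled = label_sentences(sentences, KB)
```

Both Paris/France sentences become positive training examples for the relation without any manual annotation, while the Berlin sentence is left unlabeled because the toy knowledge base has no matching fact.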


What is the knowledge base for our task?


Attributes we focus on:

Education

Job

Spouse


Education: Positive Examples


Education: Negative Examples


Education: Data Expansion


Spouse


                            Education         Job      Spouse
#Users                          7,208       1,806       1,636
#Users Connected                6,295       1,407       1,108
#Edges                         11,167       3,565         554
#Pos Entities                     451         380       3,121
#Pos Tweets                   124,801      65,031     135,466
#Avg Pos Tweets per User         17.3        36.6        82.8
#Neg Entities               6,987,186   4,405,530   8,840,722
#Neg Tweets                16,150,600  10,687,403  12,872,695

Table 1: Statistics for our dataset


Algorithm


Potential Function

Given an entity $e$ recognized by Twitter NER (Ritter et al., 2011).

$\Psi(z_i, e)$: potential function scoring whether entity $e$ constitutes the corresponding attribute of user $i$

$$\Psi(z_i, e) = \frac{1}{Z}\, \Psi_{\text{Text}}(z_i, e)\, \Psi_{\text{Network}}(z_i, e)$$


Learning

Text-Level Evidence $\Psi_{\text{Text}}(z_i^k, e)$

Entity-level: number of tokens, capital letter, length

Token-level: identity, shape, POS, NER

Window-level: tokens, POS

Tweet-level: tokens

External Sources: list of universities/companies
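The feature groups listed above might be computed along the following lines. This is a minimal sketch, not the authors' feature extractor: the function names, feature keys, window size, and the one-entry `KNOWN_UNIVERSITIES` list are hypothetical, and the POS and NER features are left out because they would require an external tagger.

```python
import re

# Hypothetical external list; the slides only say "list of universities/companies".
KNOWN_UNIVERSITIES = {"carnegie mellon university"}


def word_shape(token):
    """Map characters to a coarse shape: uppercase -> X, lowercase -> x, digit -> d."""
    shape = re.sub(r"[A-Z]", "X", token)
    shape = re.sub(r"[a-z]", "x", shape)
    return re.sub(r"[0-9]", "d", shape)


def extract_features(tweet_tokens, start, end, window=2):
    """Build a feature dict for the entity spanning tweet_tokens[start:end]."""
    entity = tweet_tokens[start:end]
    entity_text = " ".join(entity)
    features = {
        # Entity-level features.
        "entity:num_tokens": len(entity),
        "entity:all_capitalized": all(tok[:1].isupper() for tok in entity),
        "entity:char_length": len(entity_text),
        # External-source feature.
        "external:in_university_list": entity_text.lower() in KNOWN_UNIVERSITIES,
    }
    # Token-level features: identity and shape of each entity token.
    for tok in entity:
        features["token:id=" + tok.lower()] = 1
        features["token:shape=" + word_shape(tok)] = 1
    # Window-level features: tokens just before and after the entity.
    context = tweet_tokens[max(0, start - window):start] + tweet_tokens[end:end + window]
    for tok in context:
        features["window:" + tok.lower()] = 1
    # Tweet-level features: every token in the tweet.
    for tok in tweet_tokens:
        features["tweet:" + tok.lower()] = 1
    return features


tokens = "I graduated from Carnegie Mellon University last year".split()
features = extract_features(tokens, 3, 6)
```

A sparse dictionary like this can be fed to a maximum-entropy (logistic regression) classifier after vectorization.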


Neighboring Effect

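Education, Job (Homophily)

$$\Psi_{\text{Network}}(z_i, e) = \prod_{j \in \text{Neigh}(i)} \exp\left(\lambda\, I(Z_{j,e} = 1) / N\right)$$

Spouse

$$\Psi_{\text{Network}}(z_i, e) = \exp\left(\lambda\, I(Z_{e,\text{user}_i} = 1)\right)$$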

Max-Ent for training.
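The two network potentials above can be evaluated numerically as follows. This is a sketch under assumptions: the slides do not define $N$, so it is taken here to be the number of neighbors of user $i$; the function names are hypothetical; and the normalizer $Z$ is omitted, so `combined_score` returns an unnormalized value.

```python
import math


def homophily_potential(neighbor_labels, lam):
    """Product over neighbors j of exp(lam * I(Z_{j,e} = 1) / N).

    neighbor_labels holds one 0/1 value per neighbor, where 1 means the
    neighbor has entity e as the attribute value.
    Assumption: N is the number of neighbors.
    """
    n = len(neighbor_labels)
    if n == 0:
        return 1.0  # an empty product: no neighbors, no network evidence
    potential = 1.0
    for label in neighbor_labels:
        indicator = 1 if label == 1 else 0
        potential *= math.exp(lam * indicator / n)
    return potential


def spouse_potential(entity_names_user_i, lam):
    """exp(lam * I(Z_{e,user_i} = 1)): entity e in turn names user i as spouse."""
    return math.exp(lam * (1 if entity_names_user_i else 0))


def combined_score(psi_text, psi_network):
    """Unnormalized product of the text and network potentials (1/Z omitted)."""
    return psi_text * psi_network


# Two of four neighbors share the attribute, with lam = 2.
psi_net = homophily_potential([1, 1, 0, 0], lam=2.0)
score = combined_score(0.5, psi_net)
```

Under the assumption that $N$ is the neighbor count, the product collapses to $\exp(\lambda \cdot f)$, where $f$ is the fraction of neighbors that have the attribute, so the homophily term grows with the share of agreeing neighbors rather than with their raw number.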


Inference

Observed: Neighboring Information is already given (Education, Job).

Latent: Neighboring Information is unknown (Joint Inference)


Initializing only based on text-level information $\Psi_{\text{Text}}(z_i, e)$

Infer each individual given its neighbors
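The two steps for the latent setting (initialize from text alone, then re-infer each user given its neighbors) could be sketched as a simple iterative procedure. The slides do not give the update rule, so everything specific here is an assumption made for illustration: the text scores are treated as probabilities, the positive label is weighted by $\exp(\lambda \cdot f)$ with $f$ the fraction of neighbors currently labeled positive, the negative label gets no network weight, and the names `joint_inference`, `threshold`, and `max_iters` are hypothetical.

```python
import math


def joint_inference(text_scores, neighbors, lam=1.0, threshold=0.5, max_iters=10):
    """Iteratively relabel users for one candidate entity e.

    text_scores: user -> assumed text-only probability that e is the user's attribute.
    neighbors:   user -> list of neighboring users.
    """
    # Step 1: initialize every label from text-level evidence only.
    labels = {user: int(score >= threshold) for user, score in text_scores.items()}
    # Step 2: re-infer each user given the current labels of its neighbors.
    for _ in range(max_iters):
        changed = False
        for user, score in text_scores.items():
            neigh = neighbors.get(user, [])
            frac = sum(labels[v] for v in neigh) / len(neigh) if neigh else 0.0
            positive = score * math.exp(lam * frac)  # text times network evidence
            negative = 1.0 - score                   # assumption: no network term
            new_label = int(positive / (positive + negative) >= threshold)
            if new_label != labels[user]:
                labels[user] = new_label
                changed = True
        if not changed:
            break  # labels are stable
    return labels


text_scores = {"a": 0.9, "b": 0.8, "c": 0.4}
neighbors = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
labels = joint_inference(text_scores, neighbors)
```

In this toy graph, user `c` starts out negative on text evidence alone and flips to positive once both of its neighbors are labeled positive, which is the effect joint inference is meant to capture.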


Experiments

Baselines

Only-Text: Text-level evidence Ψ_Text(z_i, e)

NELL: Bag-of-words matching against the list of universities or companies borrowed from NELL
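
One possible reading of the NELL baseline is sketched below. The slide does not specify the matching rule, so everything here is an assumption: `tokenize`, `match_entity`, and the scoring (an entity matches when all of its name tokens occur in the user's tweets, ranked by the count of its rarest token) are hypothetical, and `entity_list` stands in for the university or company names taken from NELL.

```python
import re
from collections import Counter
from typing import Iterable, List, Optional


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_entity(tweets: Iterable[str], entity_list: Iterable[str]) -> Optional[str]:
    """Return the best-matching entity name, or None if nothing matches."""
    # Bag of words over all of the user's tweets.
    bag = Counter(tok for tweet in tweets for tok in tokenize(tweet))

    best, best_score = None, 0
    for name in entity_list:
        toks = tokenize(name)
        # An entity matches only if every token of its name appears in the bag.
        if toks and all(tok in bag for tok in toks):
            score = min(bag[tok] for tok in toks)
            if score > best_score:
                best, best_score = name, score
    return best


if __name__ == "__main__":
    tweets = [
        "Back on campus at Carnegie Mellon today",
        "carnegie mellon homecoming!",
        "I love Stanford weather",
    ]
    print(match_entity(tweets, ["Carnegie Mellon", "Stanford", "MIT"]))
```

Because it ignores word order and context, a matcher like this cannot tell a user's own school or employer from one they merely mention.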

Results

Conclusion

We present a framework to extract user attributes from Twitter.

We present a large-scale dataset for this task.

We demonstrate the benefit of jointly reasoning about users’ social network structure.

Future Work

Facebook:

Thank you

Dataset: http://aclweb.org/aclwiki/title=Profile_data

Questions, suggestions?
