Study concept drift in political ontologies
Post on 03-Jul-2015
What is the problem? How can we deal with concept drift? Open questions
Studying Concept Drift in Political Ontologies
Shenghui Wang (1,2), Stefan Schlobach (1), Janet Takens (2), Wouter van Atteveldt (2)
(1) Department of Computer Science
(2) Department of Communication Science
Vrije Universiteit Amsterdam
Workshop on Matching and Meaning 2010
Content analysis in Communication Science
Communication scientists study all sorts of media content related to human communication
Content analysis based on the NET method
concepts: political actors and issues
relations: associations, opinions, or actions
Example
Het Openbaar Ministerie (OM) wil de komende vier jaar mensenhandel uitroeien. (English: The Public Prosecution Service (OM) wants to eradicate human trafficking in the next four years.)
Coded as: om -> human trafficking (-1)
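As an illustration only, the coded example above could be stored as a small record. This is a minimal sketch, not the NET tooling itself: the `Statement` class, its field names, and the reading of the trailing -1 as a negative relation value on an assumed -1 to +1 scale are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One coded relation between two concepts (hypothetical structure)."""
    subject: str  # concept that acts or speaks
    obj: str      # concept that is acted upon
    value: float  # assumed scale: -1 (negative) to +1 (positive)


# "The OM wants to eradicate human trafficking": a negative relation
# from the actor "om" to the issue "human trafficking".
statement = Statement(subject="om", obj="human trafficking", value=-1.0)
print(statement)
```

A corpus coded this way is a list of such records, which is the edge list of a semantic network.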
Semantic network analysis
[Figure: semantic network graph; nodes are labelled with numeric identifiers]
Network-based communication science study
Politicians are networking
Politics is perceived by citizens via media
Media study by semantic network analysis
Who is determining the subjects?
Who is teaming up?
Who is more credible?
Who owns which topic?
Before network analysis
We first need to extract networks!
How do we extract such networks?
Requires: large corpora with annotated textual content
Manual coding against coding books (ontologies)
Automated content analysis in progress
What is the problem?
Problems with constructing annotated content
Data from different time periods or genres
Coded by different teams at different moments
Manifesto Research Group: 25 countries, from 1945 to 2006
Comparative Policy Agendas project: media content, manifestos, legislative texts, government press statements, etc.
Election campaign coverage from 1994 to 2006
What is the problem?
Problem 1: Interoperability when sharing information
Different coding books should be merged or at least connected
illegal immigration
labour migrants
What is the problem, again?
Everything changes, quickly or slowly ...
Follow the Fashion?
Women’s role?
Suffragettes said that women’s role in society is unacceptable
Pope says that women’s role in society is unacceptable
Concept drift is a problem
Problem 2: Concept drift
Meaning of concepts changes over time
Analysis based on evolving concepts must consider temporal locality
Studying concept drift is useful in itself
Datasets
Five political ontologies that were used to annotate newspaper articles
23,639 manually annotated newspaper articles from five recent Dutch national election campaigns
Manual mappings between the ontologies exist, but most of the mapped concepts are lexically very similar
What are the main issues?
What is concept drift?
How do we detect concept drift?
How do we represent concept drift?
How do we evaluate concept drift?
How do we use concept drift?
What is concept drift?
Definition
The meaning of a concept
Label
Intension
Extension
Questions:
Can all three of them change at the same time?
Should there be a rigid part of a concept that stays constant all the time?
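The three components listed above can be pictured as one record per concept per time period. The following is a minimal sketch under stated assumptions: the `ConceptVersion` class, the use of plain sets for intension and extension, and the toy values (including the article identifiers) are hypothetical and not taken from the datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptVersion:
    """A concept as it appears in one ontology version (hypothetical)."""
    label: str
    intension: frozenset  # related concepts, e.g. parents or co-occurring concepts
    extension: frozenset  # identifiers of the instances annotated with the concept


def changed_components(old: ConceptVersion, new: ConceptVersion) -> list:
    """Return the names of the components that differ between two versions."""
    changed = []
    if old.label != new.label:
        changed.append("label")
    if old.intension != new.intension:
        changed.append("intension")
    if old.extension != new.extension:
        changed.append("extension")
    return changed


# Toy data, invented for illustration only.
v_old = ConceptVersion("bouwfraude", frozenset({"corruptie"}),
                       frozenset({"article-1", "article-2"}))
v_new = ConceptVersion("bouwfraude", frozenset({"fraude_en_corruptie"}),
                       frozenset({"article-3"}))
print(changed_components(v_old, v_new))
```

In this toy case the label is the only rigid part: intension and extension both change while the label stays the same.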
Detecting concept drift
Detecting concept drift in terms of
its labels
its extension: instance-based mapping between different time periods
its intension: using its hierarchical information and the co-occurrence links to other concepts of the same time period
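The three detection routes above can be sketched as three drift scores between two versions of a concept. This is an illustrative sketch, not the method used in the study: string similarity for labels, cosine similarity over aggregated word counts for the extension (articles from different campaigns are disjoint, so shared instance identifiers cannot be compared directly), and cosine similarity over co-occurrence counts for the intension are all assumptions, as are the function names and the toy data.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def extension_profile(texts: list) -> Counter:
    """Aggregate word counts over the instances annotated with a concept."""
    words = Counter()
    for text in texts:
        words.update(text.lower().split())
    return words


def drift_scores(old: dict, new: dict) -> dict:
    """Return a drift score in [0, 1] per component; 0 means no change."""
    label_sim = SequenceMatcher(None, old["label"].lower(), new["label"].lower()).ratio()
    ext_sim = cosine(extension_profile(old["instances"]),
                     extension_profile(new["instances"]))
    int_sim = cosine(Counter(old["cooccurrence"]), Counter(new["cooccurrence"]))
    return {"label": 1 - label_sim, "extension": 1 - ext_sim, "intension": 1 - int_sim}


# Toy data, invented for illustration only.
old = {"label": "politie",
       "instances": ["politie pakt criminelen op"],
       "cooccurrence": {"criminaliteit": 2, "justitie": 1}}
new = {"label": "politie",
       "instances": ["reorganisatie van de politie"],
       "cooccurrence": {"criminaliteitsbestrijding": 3}}
scores = drift_scores(old, new)
print(scores)
```

Keeping the three scores separate, rather than merging them into one number, makes it possible to see which component of a concept has moved.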
Representing concept drift
Representing concept drift: Builder’s fraud
[Figure: the concept bouwfraude (builder’s fraud) in 2002, 2003 and 2006, linked by weighted edges to related concepts. Concept labels shown: klokkenluider, belangenverstrengeling, jusititie, parlementaire_enquete_bouwfrau, corruptie, companies,_coroprations,_business, fiod, fraude_en_corruptie, criminaliteit, parlementaire_enquete_algemeen, bestuurlijke_vernieuwing. Edge weights range from 0.002 to 0.198.]
Representing concept drift: Police
[Figure: the concept politie (police) in 1994, 1998, 2002, 2003 and 2006, linked by weighted edges to related concepts. Concept labels shown: politie, rpolitie, jusititie, justitie, xjustitie, criminaliteit, rcriminaliteit, criminelen, criminaliteitsbestrijding, sjo_creawetsto. Edge weights range from 0.030 to 0.151.]
What kinds of concept drift can we detect?
Association shifting
[Figure: association shifting between 1994 and 1998. Concept labels shown: best_vernieuwin, hbestuurlijke_vernieuwinsoc_vernieuwing, ontwikkelingshu, hervorming_politiebestel, hgekozen_burgemeester. Edge weights: 0.082, 0.074, 0.018, 0.033.]
What kinds of concept drift can we detect?
Generalising or specialising
[Figure: generalising or specialising between 2003 and 2006. Concepts shown: builder’s fraud, corruption, fraud and corruption, criminality. Edge weights: 0.043, 0.038.]
What kinds of concept drift can we detect?
Overlapping
[Figure: overlapping concepts between 1994 and 1998. Concept labels shown: rwerkgevers, rvno, lkoppeling, wgv_werkgevers, vno_ncw. Edge weights: 0.013, 0.015, 0.037, 0.077, 0.077.]
Open questions
What other types of concept drift can we identify and automatically detect?
What is an appropriate (formal) representation for the detected drift?
How can we evaluate the detected concept drift, both qualitatively and quantitatively?
Thank you