When visibility means power Tracking political races from the Internet Stephane Gauvin Université Laval ECIG: October 2009

Nov 29, 2014


Talk given at ECIG 2009. Uses web-based count data to build six visibility indexes, applied to the US presidential race (with one slide on the Canadian federal election). News visibility comes very close to voting behavior (less than 1% difference). Several indexes provide sensitive and timely tracking, in contrast to earlier web-based metrics, which were considered unreliable until recently.
Transcript
Page 1: When visibility means power

When visibility means power Tracking political races from the Internet

Stephane Gauvin Université Laval ECIG: October 2009

Page 6: When visibility means power

What can we learn from search engine count data?

  Webometrics
  The 2004 experience
  The 2009 experience (USA)
  The 2009 experience (Canada)
  Building a measurement scale
  Meaning and sentimentalism

Page 7: When visibility means power

Webometrics – painful beginnings

  1996: WordOfNet measures visibility
    Closely aligned with SEO
    Highly unreliable; disappears in 2001

  2003: Factiva’s visibility index
    Venture owned by Dow Jones & Reuters
    Tracks media mentions of Democrats

Page 8: When visibility means power

2004 Democratic convention

Page 9: When visibility means power

2004 consensus

Web-based metrics unreliable

  See:
    Bar-Ilan, 2001, 2008
    Björneborn & Ingwersen, 2001
    Clarke & Willett, 1997
    Cothey, 2004
    Ingwersen & Björneborn, 2005
    Lawrence & Giles, 1999
    Mettrop & Nieuwenhuysen, 2001
    Oppenheim, Morris, McKnight, & Lowley, 2000
    Shafi & Rather, 2005
    Snyder & Rosenbaum, 1999
    Vaughan & Thelwall, 2004

  That was before the social www
    2004: 1M blogs / 2009: 200M blogs
    2004: 4G URLs / 2009: 1T URLs

Page 10: When visibility means power

Measurement issues

  Latency
    Unlike stars in the sky, visibility doesn’t reveal itself
    Document centric (URL seed(s) and follow links)
    •  Amounts to convenience sampling
    •  Bias is shown in Vaughan & Thelwall 2004
    Concept centric (rely on extensive generic crawling/indexing, i.e. Google)

  Domain definition
    Narrow (Senator John McCain)
    Wide (McCain)
    Variants (typos, nicknames)

  Variance
    Ex: Yahoo! doesn’t agree with Google (next slide)

  Partitions
    Digital space is not homogeneous: news, blogs, social, images, videos, www

Page 11: When visibility means power

Raw scores all over the place

Page 12: When visibility means power

Building a measurement scale

  Identify independent instruments

  Harvest data

  Weed out using Cronbach’s alpha
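The weeding step can be sketched as follows. This is a minimal illustration of Cronbach's alpha, not the author's actual procedure: the function names and the sample share values are hypothetical, and the rule "drop an instrument if alpha rises without it" is an assumed reading of "weed out".

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k instruments measured over the same n days.

    `items` is a list of k equal-length lists (one list per instrument).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two instruments")
    n = len(items[0])
    if any(len(col) != n for col in items):
        raise ValueError("all instruments need the same number of observations")
    # Sum of the individual instrument variances.
    item_var = sum(variance(col) for col in items)
    # Variance of the per-day total across instruments.
    totals = [sum(day) for day in zip(*items)]
    total_var = variance(totals)
    return k / (k - 1) * (1 - item_var / total_var)


def alpha_without(items, index):
    """Alpha recomputed with one instrument left out."""
    kept = [col for i, col in enumerate(items) if i != index]
    return cronbach_alpha(kept)


# Hypothetical visibility shares for one candidate over five days.
engine_a = [0.40, 0.45, 0.50, 0.55, 0.60]
engine_b = [0.42, 0.44, 0.52, 0.54, 0.61]
engine_c = [0.60, 0.40, 0.55, 0.45, 0.50]  # does not track the others
instruments = [engine_a, engine_b, engine_c]
```

Comparing `cronbach_alpha(instruments)` with `alpha_without(instruments, 2)` shows whether the third instrument lowers the scale's internal consistency; under the assumed rule it would be a candidate for removal if alpha rises without it.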

Page 13: When visibility means power

Independent instruments

Page 14: When visibility means power

Harvest data

  Every day, a script mimics a user (not an API using a sub-index)

  Up to 6 trials if the engine fails to return a result (dropped connection, busy, etc.)

  Machine parsed to extract count data

  Compute visibility shares to alleviate extreme outliers (engines may return results several orders of magnitude larger than they should be, which makes correlations unreliable; visibility shares are always in the 0..1 interval)
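The harvesting steps above can be sketched as follows. This is an illustration under stated assumptions, not the script used in the talk: the `fetch` callable stands in for whatever retrieves a results page, the "N results" page format is hypothetical, and all function names are invented for the example.

```python
import re
import time

# Hypothetical results-page format such as "About 1,230,000 results".
COUNT_RE = re.compile(r"(\d[\d,]*)\s+results", re.IGNORECASE)


def fetch_with_retries(fetch, max_trials=6, delay=0.0):
    """Call fetch() up to max_trials times; return the page text or None."""
    for trial in range(max_trials):
        try:
            page = fetch()
        except OSError:  # dropped connection, busy engine, etc.
            page = None
        if page:
            return page
        if delay and trial < max_trials - 1:
            time.sleep(delay)
    return None


def parse_count(page):
    """Extract the hit count from a results page, or None if absent."""
    match = COUNT_RE.search(page or "")
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def visibility_shares(counts):
    """Convert raw counts per candidate into shares that sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("no positive counts to share out")
    return {name: count / total for name, count in counts.items()}
```

Because a share is a count divided by the total for the same engine and day, an engine that inflates every count by the same factor leaves the shares unchanged, and a single inflated count can push a share no further than 1.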

Page 15: When visibility means power

  High reliability

  Google often low

Page 16: When visibility means power

2008 US presidential

Page 17: When visibility means power

2008 : Harper vs Dion

Page 18: When visibility means power

Blogs as early signal?

Page 19: When visibility means power

Blogs as early signal?

Page 20: When visibility means power

Visibility vs opinion polls

Page 21: When visibility means power

Correlations between signals

Absolute values above .18 are significant at p < 0.05
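The significance threshold depends on the number of observations, which the slide does not state. As a rough sketch, the two-tailed critical correlation can be approximated from the t-test for a correlation coefficient, using the normal quantile in place of the t quantile (a large-sample assumption). The function name and the sample size used below are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def critical_r(n, alpha=0.05):
    """Approximate two-tailed critical |r| for n paired observations.

    Solves t = r * sqrt(n - 2) / sqrt(1 - r**2) for r, with the normal
    quantile z standing in for the t quantile (large-sample approximation).
    """
    if n < 3:
        raise ValueError("need at least three observations")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / sqrt(n - 2 + z * z)


# Hypothetical sample size; the slide does not report n.
threshold = critical_r(120)
```

With roughly 120 daily observations this approximation gives a threshold close to .18; shorter series would require larger correlations to reach p < 0.05.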

Page 22: When visibility means power

Summary

  Web metrics are highly reliable
  They appear to be valid indicators
    French presidential
    US presidential
    Canadian elections

  But
    Anecdotal (only 2-3 instances)
    In the political realm (what about brands or social themes?)
    Questionable (what about sentiment?)

Page 23: When visibility means power

Sentiment

  Mere visibility works because it embodies sentiment, i.e. a rotten politician will soon become « invisible »

  Changes in visibility may convey sentiment (steady gains signal positive sentiment, an explosion signals negative sentiment)

  Sentiment analysis is difficult for several reasons:
    Volume makes human analysis impractical
    Complexity makes machine analysis difficult
    Conceptually not clear what is good or bad (pro-life?)

Page 24: When visibility means power

Next

  Apply to other concepts
  Investigate metric properties

  Consider simple sentiment analysis (SA)
    Goal is to call turning points (ex: Gore gets Nobel, Spitzer gets prostitute)
    When there is a news storm, sentiment is usually obvious, making SA pointless
    But some events are ambiguous (ex: Sarkozy-Bruni)
    And other events are unsentimental (ex: H1N1)