Influence of Search Engines Christina Pong cs349
Dec 31, 2015
Introduction
The web is meant to be the ideal tool of a democratic society: anyone can say anything
We find information on the web through search engines, directories, or by clicking on links on other web pages
Search engines define what we see on the web, but in 2000 their total coverage added up to only roughly 42% of the web. Today the number is even lower.
How do search engines decide what users will see?
Search Engines and Directories
Search Engines index using keywords
Robots search URLs
Backlink Metric
PageRank Metric
Location Metric
Directories
Human gatekeepers
Yahoo’s “Criteria of relevancy”
Logos taken from their respective websites
Ranking Concerns
Top 10-20 spots are the most coveted
People use various techniques to improve their rankings
Web spamming: keyword stuffing
Paid advertisements
Buying keywords
Idealism vs Cynicism
No one owns the web
Public space
Supports democracy
Conveys information
Narrows gap between “haves” and “have-nots”?
Web leads to increased isolation, division, and reduced discourse
“Freedom of the press is guaranteed only to those that own one”
-A.J. Liebling, journalist
Introducing the Googlearchy Study (2003)
In 2000 a survey reported that 30% of users looked for political information on the web
Scholars assumed two things:
The web will generate lots of new, easily accessible content, which lowers the cost of information
The web will make it easier for people to see information from lesser-known sources, thereby increasing equality
Inbound Links
Finding information by web surfing
The amount of traffic web sites receive through surfing is still based on inbound links
Using Search Engines
Google’s PageRank system is based on inbound links and 85% of users reported using Google as their search engine
HITS algorithm uses “hubs” and “authorities”, but still produces similar ranking results
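To make the role of inbound links concrete, here is a minimal sketch of a PageRank-style power iteration. The toy link graph, the page names, the damping factor of 0.85, and the iteration count are all illustrative assumptions; this is a simplified textbook version, not Google's production ranking system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page name to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical five-page web: every page links to "hub"; "hub" links to "a" and "b".
toy_web = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["hub"],
}
ranks = pagerank(toy_web)
top_page = max(ranks, key=ranks.get)
```

In this toy graph the page with the most inbound links ends up ranked first, and pages with no inbound links keep only the small baseline share.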
Concept of Googlearchy
The cost of information and the variety of content on the web are determined by link structure
In general, a small handful of sites get a disproportionately large share of inbound links and therefore a larger amount of traffic
Power-law relationship
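As a rough illustration of what a power-law relationship implies, the sketch below builds synthetic inbound-link counts that fall off with a site's rank and measures how much of the total the top few sites hold. The rank-based form, the exponent of 1.0, and the count for the top site are assumptions chosen for illustration; they are not figures from the Googlearchy study.

```python
def power_law_counts(num_sites, exponent=1.0, top_count=10000.0):
    """Synthetic inbound-link counts: the site at rank r gets top_count / r**exponent."""
    return [top_count / (rank ** exponent) for rank in range(1, num_sites + 1)]


def top_share(counts, k):
    """Fraction of all inbound links held by the k most-linked sites."""
    ordered = sorted(counts, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical category of 200 sites; look at the 10 most-linked ones (the top 5%).
counts = power_law_counts(200)
share_of_top_10 = top_share(counts, 10)
```

Under these assumptions the top 5% of sites hold far more than 5% of the links, which is the kind of concentration the Googlearchy idea describes.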
Methods
Created lists of top 200 political websites in six different categories from Yahoo and Google
Used web crawlers to download all pages to a depth of four and classified them as “positive” or “negative”
Support Vector Machine (SVM) Classifier
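The crawl step in the methods above can be sketched as a breadth-first traversal that stops at a fixed depth. The in-memory link graph and page names are made up and stand in for real page fetching and link extraction. Reading "a depth of four" as four clicks from a seed page is an assumption, and the study's actual crawler is not described in these slides.

```python
from collections import deque


def crawl_to_depth(seeds, get_links, max_depth=4):
    """Breadth-first crawl; returns each reached page mapped to its click depth."""
    depth_of = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        if depth_of[page] >= max_depth:
            continue  # Do not follow links out of pages already at the depth limit.
        for target in get_links(page):
            if target not in depth_of:
                depth_of[target] = depth_of[page] + 1
                queue.append(target)
    return depth_of


# Hypothetical chain of pages, each one click further from the seed.
toy_links = {
    "seed": ["p1"],
    "p1": ["p2"],
    "p2": ["p3"],
    "p3": ["p4"],
    "p4": ["p5"],
    "p5": [],
}
found = crawl_to_depth(["seed"], lambda page: toy_links.get(page, []))
```

Pages more than four clicks from the seed never enter the result, so they would never reach the classifier either.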
SVM
Results
Other Findings
Significant amount of overlap in the pages found by Yahoo and Google
Generally a small handful of sites at the top get more backlinks than all of the other sites put together
Top sites are from long established interest groups
Also…
Given the small diameter of the web, the shallow depth of most searches, the high degree of overlap, and the fact that search engines are designed to copy user behavior, the authors believe that their study covered almost all sites that matter for their categories
Any site that is more than 3 clicks away from the top 200 on a Google or Yahoo search probably would not have much of an impact on politics
Study Conclusions
People expected the web to lower the cost of political information, thereby reducing inequality
It lowers the cost, but due to Googlearchy, people focus on only a few heavily linked sources
Food for Thought
The web is valuable because there is so much information, but its size makes it hard to study
An even less organized structure would be harder to navigate and it would be harder to find information
While anyone can post to the web, for most people it’s like having the online equivalent of “their own talk show on public access television at 3:30 in the morning”
-Matthew Hindman et al.
The web as a meritocracy: the idea that a site must be good if everyone’s referencing it
Backlinks are the currency of the web
http://www.daily-seo.com/images/link-building-campaign.jpg
http://www.usagold.com/images/gold-coins-images.jpeg
http://blog.kir.com/archives/images/money.jpg
As a Result
What to Do about the Monopoly?
Market Dynamics
Yahoo argues that if users don’t like search engine results, they can just switch engines, which will encourage competitors to improve their own ranking systems
But few people understand how engines work, so they wouldn’t know to switch
More Recently
Does Googleopoly mean that Google is Evil? (Nov. 2008 NY Times article)
A study conducted in 2006 (Filippo Menczer et al.) found that search engine results are becoming less influential because people are searching with more specific queries, which surface more obscure pages.
Sources
Hindman, Matthew et al. Googlearchy: How a Few Heavily-Linked Sites Dominate Politics on the Web. March 31, 2003. All charts and tables were taken from the Googlearchy paper
Introna, Lucas and Helen Nissenbaum. Defining the Web: The Politics of Search Engines. IEEE. 2000.
Hansell, Saul. “Debating the Vices and Virtues of Google”. Nytimes.com. November 20, 2008.
Menczer, Filippo et al. Googlearchy or Googlocracy? IEEE Spectrum. February 2006. http://www.spectrum.ieee.org/feb06/2787 (accessed 12/4/08).