Data Feed SEO A4uexpo London, October 2010 Will Critchlow
May 13, 2015
Data Feeds Are Not Unique
The “Affiliate” Penalty
Unique Content Matrix (axes: Uniqueness vs. Site Strength)
• Unique content, low trust
• Strong site, unique content
• Non-unique content, low trust
• Strong site, non-unique content
Case Study
“Welcome visitor, please find our selection of [insert product] below, we have [number of products] items. We think you’ll like them!”
User Generated Content
“User” Generated Content
User “Generated” Content
Mozenda
Building quick & dirty SEO Tools
A Cheat Sheet & Inspiration
By Will Critchlow, www.distilled.co.uk. First published: www.seomoz.org
APIs (more on programmable web)
• AdWords – Keywords
• Alchemy – Structured data & text
• Bing – Search, news, spelling
• Evri – Sentiment and popularity
• Face.com – Face detection
• Facebook – Social graph
• Google Analytics – Visitor data
• Hostip – Geo data
• LinkedIn – Professional data
• Pingdom – Website uptime
• Postrank (1, 2, 3) – Real-time & influence
• Rapleaf – Social media profiles
• Twitter – Real time and social
... And of course:
• Linkscape – Links
YQL – Yahoo! Query Language
select * from html where url="<url>" and xpath="<xpath>"
select * from html where url="<url>"
select * from feed where url="<url>"
select * from search.web where query="<query>"
xpath (more examples)
• /foo – the root element ‘foo’
• //bar – all elements ‘bar’
• foo/bar – all bar elements that are children of foo
• foo//bar – bar at arbitrary levels below foo
• foo/*/bar – bar grandchildren of foo
• foo/* – all child elements of foo
• foo/@bar – the bar attribute on foo
• foo[@bar] – foo elements that have a bar attribute
• foo[@bar='baz'] – foo elements where the bar attribute equals baz
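To try the path expressions above without any external service, here is a minimal sketch using Python's standard-library ElementTree, which supports only a subset of XPath. The sample document and variable names are invented for illustration. ElementTree has no attribute axis and rejects absolute paths on an element, so foo/@bar and //bar are approximated as noted in the comments.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
SAMPLE = """
<root>
  <foo bar="baz">
    <bar>child</bar>
    <wrap><bar>grandchild</bar></wrap>
  </foo>
  <foo>
    <bar>other</bar>
  </foo>
</root>
"""

root = ET.fromstring(SAMPLE)

# foo/bar: bar elements that are direct children of foo
direct = [e.text for e in root.findall("foo/bar")]

# foo//bar: bar elements at any depth below foo
nested = [e.text for e in root.findall("foo//bar")]

# foo/*/bar: bar grandchildren of foo
grand = [e.text for e in root.findall("foo/*/bar")]

# foo/*: all child elements of foo
child_tags = [e.tag for e in root.findall("foo/*")]

# //bar: ElementTree needs the relative form .//bar on an element
all_bars = [e.text for e in root.findall(".//bar")]

# foo[@bar] and foo[@bar='baz']: attribute predicates
with_attr = root.findall("foo[@bar]")
with_value = root.findall("foo[@bar='baz']")

# foo/@bar: no attribute axis in ElementTree, so read the attribute instead
bar_values = [e.get("bar") for e in with_attr]
```

A full XPath engine would return attribute nodes directly for foo/@bar; reading the attribute from the matched elements is the closest equivalent here.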
Python
Since Python is the language of Google App Engine, here is how you can use YQL easily within Python:
Download source – extract to yql folder within your application
import yql
y = yql.Public()
result = y.execute("<yql query>")
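The query templates from the cheat sheet can be filled in with a small helper before being passed to y.execute(). This is only a sketch: the function names are hypothetical, it builds strings and calls no service, and it assumes the URL and xpath contain no double quotes (use single quotes inside the xpath).

```python
def yql_html_query(url, xpath=None):
    """Build a 'select * from html' query, optionally narrowed by xpath."""
    query = 'select * from html where url="{}"'.format(url)
    if xpath is not None:
        query += ' and xpath="{}"'.format(xpath)
    return query


def yql_feed_query(url):
    """Build a 'select * from feed' query for an RSS/Atom feed URL."""
    return 'select * from feed where url="{}"'.format(url)


# Example with a placeholder URL
example = yql_html_query("http://example.com", "//h1")
```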
Crawlers / Scrapers
• Mozenda
• 80legs
• Google App Engine
• Amazon Web Services
Human Touch
• Amazon Mechanical Turk
• Smartsheet (interface to Mechanical Turk)
• oDesk
Sources Magic Horsepower
Data (more on infochimps)
• Data.gov – US government data
• Data.gov.uk – UK government data
• Delicious list – from Peter Skomoroch
• Google Public Data – Directory
• Guardian – content and data
• World Bank – finance, health, etc.
• 80legs – prepackaged crawl data
User Generated “Content”
• External search queries
• Internal search queries
• Tags
• Testimonials
• FAQs/Support emails
Tracking # of Reviews
_gaq.push(['_setCustomVar',
  1,                    // This custom var is set to slot #1.
  'Number of Reviews',  // The top level name for the variable
  '1',                  // The Number of Reviews
  3                     // Page level variable
]);
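On a real site the review count differs per page, so the snippet above would normally be rendered server-side. A minimal sketch in Python, assuming a hypothetical helper name and that the page template inserts the returned string inside its existing Google Analytics script tag:

```python
import json


def review_count_snippet(num_reviews, slot=1):
    """Render the _setCustomVar call for a page with num_reviews reviews."""
    # Classic Google Analytics custom variables use slots 1 to 5.
    if not 1 <= slot <= 5:
        raise ValueError("slot must be between 1 and 5")
    # The value is sent as a string; 3 is the page-level scope.
    args = ["_setCustomVar", slot, "Number of Reviews", str(num_reviews), 3]
    # json.dumps yields a valid JavaScript array literal with escaped strings.
    return "_gaq.push({});".format(json.dumps(args))


snippet = review_count_snippet(12)
```

Using json.dumps rather than string concatenation keeps the quoting consistent, which is the kind of error the hand-typed snippet is prone to.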
Context Is Key
• Google News: Google likes alternative facts
• Lyrics: Never considered duplicate content
Context is key
Look to stand out from your competitors
“Use a source of content that’s not unique, but that no-one else in your space is using”
Manipulate & Clean Your Data
“Kingston DataTraveler 101 USB flash drive - 4 GB – Cyan”
vs
“Kingston USB memory stick 4gb”
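One way to get from the first title to the second is a handful of rewrite rules applied to every feed title. The sketch below is rule-based and deliberately small. The rule tables (colours, phrases to drop, synonyms) are invented for illustration and presumably would need to be built per feed and per product category.

```python
import re

# Hypothetical rule tables, invented for illustration.
COLOURS = {"cyan", "black", "white", "red", "blue"}
DROP_PHRASES = ["DataTraveler 101"]  # model names searchers rarely type
SYNONYMS = {"USB flash drive": "USB memory stick"}  # feed wording -> search wording


def clean_title(title):
    """Rewrite a feed product title into a shorter, search-style title."""
    # Split on hyphen or en dash separators that are surrounded by spaces.
    parts = [p.strip() for p in re.split(r"\s+[-–]\s+", title)]
    # Drop segments that are nothing but a colour name.
    parts = [p for p in parts if p.lower() not in COLOURS]
    text = " ".join(parts)
    # Collapse capacities such as "4 GB" to "4gb".
    text = re.sub(r"(\d+)\s*GB\b", lambda m: m.group(1) + "gb", text,
                  flags=re.IGNORECASE)
    # Swap feed phrasing for the wording people search for.
    for feed_phrase, search_phrase in SYNONYMS.items():
        text = re.sub(re.escape(feed_phrase), search_phrase, text,
                      flags=re.IGNORECASE)
    # Remove phrases that add length but little search value.
    for phrase in DROP_PHRASES:
        text = re.sub(re.escape(phrase), "", text, flags=re.IGNORECASE)
    # Tidy any doubled spaces left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_title("Kingston DataTraveler 101 USB flash drive - 4 GB – Cyan")
```

With these rules the slide's feed title comes out as the slide's cleaned title; titles that match none of the rules pass through with only whitespace tidied.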
Of Course, Links Always Win
http://www.seobook.com/black-hat-seo-case-study
Manual Reviews – aka “Hand Jobs”
Check out the quality rater guidelines
“Add value to users”
“Relevant”
These are subjective!!
Resources
• http://www.seomoz.org/blog/whiteboard-friday-flat-site-architecture
• http://seogadget.co.uk/solving-site-architecture-issues/
• http://www.seomoz.org/blog/api-and-dataset-cheatsheet-building-quick-dirty-tools
• http://www.mozenda.com
• http://www.seomoz.org/blog/leveraging-mechanical-turk-odesk-elance-craigslist-for-seo
• http://www.seochat.com/c/a/Google-Optimization-Help/Googles-Quality-Rater-Guidelines-Leaked/
• http://www.flickr.com/photos/rosaydani/77371897/
Thanks!
Will Critchlow
Director, Distilled
> [email protected]
> twitter.com/WillCritchlow
> www.distilled.co.uk