www.seoresearchlabs.com keyword research – corporate training – private coaching Argh! We’ve Been Duped! Dan Thies, SEO Research Labs

Dec 13, 2015

Hannah Spencer
Transcript
Page 1

Argh! We’ve Been Duped!

Dan Thies, SEO Research Labs

Page 2

A (little) about me...

• 10 years of SEO…
• Once held the #1 ranking on Infoseek for “sex” – for 18 minutes
• Make up your own joke
• Published “SEO Fast Start” in 2001
• Started SEO Research Labs in Jan. 2003
• Author, SitePoint Search Engine Marketing Kit

Page 3

Topics For Today

• Getting Duped vs. Duping Yourself
• Impacts on Traffic
• Reverse Cloaking & Spider Validation
• Changing & Rotating Content
• DMCA & Dupes
• Challenges to search engines

Page 4

Defining The Problem

• Duplicate Content
  – The same content, presented on more than one URL
  – Most web sites do this to an extent
    • http://www.example.com vs. http://example.com
    • www.example.com/ vs. www.example.com/index.html
• Near-Duplicate
  – “Nearly the same…”
  – Search engines look for uniqueness
• Filtered from index vs. filtered from SERPs
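
The two URL pairs above can be collapsed into one form before comparing or linking to pages. The following is a minimal sketch, not something from the slides: it assumes the www host is the preferred one and that index.html (or index.htm) is the default document, and the function name canonical_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these file names are served for a bare directory URL.
DEFAULT_DOCUMENTS = ("index.html", "index.htm")


def canonical_url(url: str) -> str:
    """Collapse common duplicate forms of a URL into one canonical form."""
    if "://" not in url:
        url = "http://" + url  # accept bare "www.example.com/" style input
    parts = urlsplit(url)

    # Assumption: the www host is the preferred one.
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host

    # An empty path and "/" are the same page.
    path = parts.path or "/"
    for doc in DEFAULT_DOCUMENTS:
        if path.endswith("/" + doc):
            path = path[: -len(doc)]  # keep the trailing slash
            break

    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))
```

On a live site the same rule would normally be enforced with a permanent redirect to the canonical form, so that only one URL per page is ever crawled.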

Page 5

Getting Duped vs. Duping Yourself

• Duping Yourself – See Other Sessions
  – Duplicate URLs
  – Shopping sites w/ duplicate product descriptions
  – Near-empty pages
• Getting Duped – You Are Here
  – Screen scrapers & “borrowing”
  – RSS Feeds (or did you do it to yourself?)
  – Proxy URLs

Page 6

Impacts on Traffic

• Specific site: (omitted…)
• Duped: 10-15% of traffic is organic search
• De-Duped: 20-25% from organic search
• Revenue drop… “feelable.”
• This client is very good at PPC and other marketing; many sites would suffer far worse from a 50% drop in SEO referrals

Page 7

Reverse Cloaking vs. Scrapers

• Simple user agent detection: if the user-agent is NOT a major SE spider, insert:

<meta name="robots" content="noindex">

  – Screen scrapers that steal an entire page’s HTML get a page that will not be indexed.
  – Easily thwarted by someone who cares to, but it substantially reduces duplication caused by scraping.
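
The user-agent check described above might look roughly like this on the server side. This is a sketch under stated assumptions: the token list, the function names, and the render_head helper are all hypothetical, and a real site would plug the check into whatever templating it already uses.

```python
import html

# Assumption: these lowercase substrings identify the major spiders
# (Googlebot, MSNBot, Yahoo! Slurp, Ask/Teoma). Adjust to taste.
SPIDER_TOKENS = ("googlebot", "msnbot", "slurp", "teoma")

NOINDEX_TAG = '<meta name="robots" content="noindex">'


def is_major_spider(user_agent: str) -> bool:
    """True if the User-Agent header claims to be a major search engine spider."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in SPIDER_TOKENS)


def render_head(user_agent: str, title: str) -> str:
    """Build the <head> section; everyone except the spiders gets noindex."""
    lines = ["<head>", f"<title>{html.escape(title)}</title>"]
    if not is_major_spider(user_agent):
        # Browsers ignore this tag, but a scraped copy of the HTML carries it along.
        lines.append(NOINDEX_TAG)
    lines.append("</head>")
    return "\n".join(lines)
```

The check trusts the User-Agent header, which is exactly why a scraper that fakes a spider user-agent defeats it.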

Page 8

Links By Proxy – An Old Trick

Fun With Spam:

Hack someone else’s site to create a link or redirect to one of your sites – either create a page or craft a URL using an XSS attack… then link to it using a proxy URL. Woo-hoo!

Page 9

Public Proxies

Page 10

Proxy URLs As Duplicates

• Thousands of public anonymous proxy servers
• Every URL on the web can be duplicated by them
• Proxy-based duplicates, when linked to, can affect duplicate content filtering
  – Search Engine Spiders access proxy URLs too!
• Public proxies pass along the user-agent
  – IE version of site vs. Mozilla vs. Opera etc.
  – Googlebot, MSNBot, Slurp, Ask…
• But proxies use their own IP address
  – Check logs – do any “Googlebot” IPs resolve to proxies (e.g. webwarper.net)?
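
The log check in the last bullet can be scripted. The sketch below assumes an access log in Apache "combined" format and uses a reverse DNS lookup to see whether a self-declared Googlebot request came from a known proxy host. The proxy list holds only the one domain named on the slide; the function names are hypothetical.

```python
import re
import socket

# Combined Log Format: ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"')

# Assumption: extend this with whatever proxy domains show up in your own logs.
PROXY_DOMAINS = ("webwarper.net",)


def reverse_dns(ip: str) -> str:
    """Return the host name for an IP, or an empty string if it does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ""


def proxied_spider_hits(log_lines, resolve=reverse_dns):
    """List (ip, host) pairs where a "Googlebot" request came from a proxy host."""
    hits = []
    cache = {}  # resolve each IP only once
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        ip, user_agent = match.groups()
        if "googlebot" not in user_agent.lower():
            continue
        if ip not in cache:
            cache[ip] = resolve(ip).lower()
        host = cache[ip]
        if any(host == d or host.endswith("." + d) for d in PROXY_DOMAINS):
            hits.append((ip, host))
    return hits
```

The resolve parameter exists so the function can be run against saved logs with a stubbed lookup instead of live DNS.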

Page 11

Spider Validation vs. Proxies

• When you get a request from a “search engine spider” user agent, check the requesting IP:
  – If the IP address is “owned” by the search engine, deliver the page
  – If the IP address is not owned by the search engine, deliver a different page, empty page, or 403 Forbidden
  – NSLookup is less reliable than checking ARIN’s WHOIS database
  – Store lists of good vs. bad IPs, to speed processing
• Yes, it’s really the SE’s bot, but coming to a proxy URL
  – So, you MUST block the request to avoid duplication
  – Warning: Danger – Danger – Danger! Use With Caution!

Page 12

But What If They Get Through?

• Changing & Rotating Content
  – Testimonials
  – News & Headlines
  – Brute Force
• The most important page on your site is probably the home page, yet it’s probably the least often changed.
• How much is unique? How often to change?
• If the page changes every 24 hours, a proxy can only duplicate you for 24 hours + indexing lead time
• Our client is changing one paragraph of copy every 4 hours – 42 variations per week.
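
The arithmetic behind the last bullet is 24 / 4 = 6 changes per day, times 7 days, giving 42 slots per week. One way to pick the paragraph for the current slot is sketched below; it is an illustration only, since the slides do not say how the client implemented the rotation, and the function names are hypothetical.

```python
import time

ROTATION_HOURS = 4
SLOTS_PER_WEEK = 7 * 24 // ROTATION_HOURS  # 6 per day * 7 days = 42


def variation_index(now=None):
    """Return which of the weekly slots is active at the given Unix time."""
    if now is None:
        now = time.time()
    slot = int(now) // (ROTATION_HOURS * 3600)  # count of 4-hour blocks since the epoch
    return slot % SLOTS_PER_WEEK


def current_paragraph(variations, now=None):
    """Pick the paragraph for the current slot; wraps if fewer than 42 are supplied."""
    return variations[variation_index(now) % len(variations)]
```

Because the index depends only on the clock, every server behind a load balancer shows the same variation without shared state.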

Page 13

Monitoring Dupes

• Set up monitoring for a “signature SERP”
  – Text that is unique to your page or pages
  – Home page duplication is the #1 issue
  – Use a second signature for internal pages
• Google Alerts
  – www.google.com/alerts
• Roll your own with the Google API
  – www.google.com/apis
  – or www.googlealert.com
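
A “signature SERP” is an exact-phrase search for text that should only appear on your own pages. The sketch below builds such a search URL and filters a list of result URLs down to hosts that are not yours. It deliberately does not call any search API: the result list is assumed to come from whichever alert or search tool is in use, and both function names are hypothetical.

```python
from urllib.parse import quote_plus, urlsplit


def signature_search_url(signature):
    """Build an exact-phrase search URL for a signature string."""
    return "https://www.google.com/search?q=" + quote_plus(f'"{signature}"')


def foreign_hosts(result_urls, own_hosts):
    """Return hosts from result_urls that are not in own_hosts, in first-seen order."""
    own = {host.lower() for host in own_hosts}
    found = []
    for url in result_urls:
        host = urlsplit(url).netloc.lower()
        if host and host not in own and host not in found:
            found.append(host)
    return found
```

Running one signature for the home page and a second for internal pages, as the slide suggests, keeps the two kinds of duplication separate in the reports.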

Page 14

Killing Dupes w/ DMCA

• DMCA, Digital Millennium Copyright Act
• I am NOT an attorney, lawyer, barrister, solicitor, etc. and this is NOT legal advice
• Ian McAnerin’s templates:
  – http://www.mcanerin.com/EN/articles/copyright-03.asp
  – Or Google McAnerin DMCA
• To Hosting Provider (ISP) to remove sites/pages
• To search engines to remove from index

Page 15

Challenging The Search Engines

• Duplication by proxy, by theft, etc. is a major issue for webmasters – a drain on resources, and a pain in the…

• Like search engine spam, much of it is paid for by search engines through contextual ad networks & PPC

• Identify the originals – is the page in DMOZ? Is it in the Y! Directory? It just might be the original!

• How many DMCA notices can a search engine afford to process?

• Why are any URLs from known proxies still indexed after all these years?

Page 16

Contact Information

Dan Thies, [email protected]

Free Training Videos:
www.seoresearchlabs.com/keywordvideo
www.seoresearchlabs.com/linkvideo

Free Tools:
www.seoresearchlabs.com/tools.php