How a Search Engine Works? Presented by Mohammed Azharuddin, Digital Marketing Trainer
Basics of search engines and algorithms (1)

Aug 12, 2015

kongara
Transcript
Page 1: Basics of search engines and algorithms (1)

How a Search Engine Works?

Presented by Mohammed Azharuddin

Digital Marketing Trainer

Page 2: Basics of search engines and algorithms (1)

History of Search

• 1990 – Archie Query Form – FTP-based file search engine

• Feb 1993 – Excite.com – General word-relation-based search

• Oct 1993 – AliWeb – Manual submission engine

• Dec 1995 – AltaVista – First natural language search engine

Page 3: Basics of search engines and algorithms (1)

• Jan 1996 – BackRub – Started by Larry Page and Sergey Brin

• Sep 15 1997 – Google.com – First search engine with PageRank technology

• 1997 – Yandex.com – Russian search engine

• 1998 – MSN Search – Microsoft's rival to Google

Page 4: Basics of search engines and algorithms (1)

• 2000 – Baidu.com – Chinese search engine

• 2008 – DuckDuckGo.com – Non-tracking search engine

• 2009 – Bing.com – Microsoft's rival to Google

• 2010 – Blekko.com – Spam- and virus-free search

http://www.searchenginehistory.com/

http://www.google.co.in/about/company/history/

http://www.wordstream.com/articles/internet-search-engines-history

Page 5: Basics of search engines and algorithms (1)

The Google Story

Page 6: Basics of search engines and algorithms (1)

Search Engine Architecture

• Every search engine is based on the following

– Crawling

– Indexing

– Algorithms

– Results

– Fighting spam
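
The indexing and results stages listed above can be illustrated with a toy inverted index. This is only a minimal sketch, not how any real search engine is built: the `PAGES` dictionary stands in for crawled pages, and the URLs, function names, and word-count scoring are all hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for crawled pages: URL -> page text.
PAGES = {
    "a.example/history": "history of search engines",
    "b.example/pagerank": "pagerank looks at links between pages",
    "c.example/spam": "search engines fight spam with algorithm updates",
}

def build_index(pages):
    """Indexing: map each word to the set of URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Results: rank URLs by how many query words they contain."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda u: (-scores[u], u))

index = build_index(PAGES)
print(search(index, "search engines spam"))
```

Real engines replace the word-count score with many ranking factors, but the shape (crawl, index, score, sort) is the same as in the list above.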

Page 7: Basics of search engines and algorithms (1)

Google Architecture

http://infolab.stanford.edu/~backrub/google.html

Page 8: Basics of search engines and algorithms (1)

Search Engine Architecture

[Diagram components: WWW (60 trillion pages, or 60 lakh crore), Crawler, Store, Indexer, indexes (100 million GB), Algorithms (programs), Search Interface; labels: "Sorted based on Content / Factors", "trash"]

Live Google Example

Page 9: Basics of search engines and algorithms (1)

Algorithms

• Programs and Formulas to get relevant results

– Page Rank

– Spelling Check

– Synonym check

– Auto complete

– Query Understanding

– Safe Search

– User Context
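
Two of the items above, spelling check and auto complete, can be sketched with the Python standard library. This is a toy illustration under stated assumptions: the `VOCABULARY` list and both function names are hypothetical, and production systems rely on query logs and far richer models.

```python
import difflib

# Hypothetical vocabulary; a real engine would derive this from its index and query logs.
VOCABULARY = ["search", "engine", "index", "crawler", "algorithm", "ranking"]

def suggest(word, vocabulary=VOCABULARY):
    """Spelling check: return the closest known word, or the input if nothing is close."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else word

def autocomplete(prefix, vocabulary=VOCABULARY):
    """Auto complete: return all known words that start with the typed prefix."""
    return sorted(w for w in vocabulary if w.startswith(prefix.lower()))

print(suggest("serch"))
print(autocomplete("in"))
```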

Page 10: Basics of search engines and algorithms (1)

PageRank Algorithm

• Google's first algorithm, which looks at links between pages to determine their relevance.

• PR is a number generated for each page in the Google index

• PR Toolbar range – NA to 10 (best rank): this is based on a log scale of 0 – 10

• Real PageRank is calculated based on the number of pages in the index and can range from 0.15 to trillions

Page 11: Basics of search engines and algorithms (1)

Toolbar Vs. Real PR

Toolbar PR    Real PR
0             0 – 10
1             100 – 1,000
2             1,000 – 10,000
3             10,000 – 100,000
4             100,000 – 1,000,000
5             1,000,000 – 10,000,000

http://www.webworkshop.net/pagerank_calculator.php3

Page 12: Basics of search engines and algorithms (1)

PR Formula

Updated Formula

Old Formula

D = damping factor; PR(N) = PageRank of linking page N; L(N) = number of outbound links on page N
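
The formula images from this slide are not present in the transcript. As a hedged reconstruction, the two most commonly cited forms of the PageRank formula are given below, written with the slide's symbols D, PR(N), and L(N). Which form the slide labeled "old" and which "updated" is an assumption, as are the symbols B(A) (the set of pages linking to page A) and T (the total number of pages in the index).

```latex
% Old formula (the form given in the original Brin and Page paper)
PR(A) = (1 - D) + D \sum_{N \in B(A)} \frac{PR(N)}{L(N)}

% Updated formula (normalized by the total number of pages T)
PR(A) = \frac{1 - D}{T} + D \sum_{N \in B(A)} \frac{PR(N)}{L(N)}
```

With the commonly used value D = 0.85, a page with no inbound links scores 1 - D = 0.15 under the first form, which is consistent with the 0.15 lower bound quoted for real PageRank.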

Page 13: Basics of search engines and algorithms (1)

Example

http://en.wikipedia.org/wiki/PageRank

http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm
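
The slide's worked example is an image that did not survive in this transcript. As a stand-in, here is a minimal sketch of the iterative calculation on a hypothetical three-page graph. It assumes the non-normalized form PR(A) = (1 - D) + D × Σ PR(N)/L(N) and a damping factor of 0.85; the graph, function name, and iteration count are illustrative only.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iteratively compute PageRank.

    links maps each page to the list of pages it links to.
    Uses the non-normalized form: PR(A) = (1 - d) + d * sum(PR(N) / L(N)).
    """
    pages = list(links)
    pr = {page: 1.0 for page in pages}  # start every page at 1.0
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum the share of rank passed on by every page that links here.
            incoming = sum(
                pr[src] / len(links[src]) for src in pages if page in links[src]
            )
            new_pr[page] = (1 - d) + d * incoming
        pr = new_pr
    return pr

# Hypothetical graph: A links to B and C, B links to C, C links to A.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(LINKS)
for page, score in sorted(ranks.items()):
    print(page, round(score, 3))
```

Page C ends up with the highest score because it receives links from both other pages, while B, linked only from A (which splits its vote two ways), ends up lowest.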

Page 14: Basics of search engines and algorithms (1)

Fighting Spam

• Spam refers to websites that use unethical practices to gain search rankings

• To fight spam, Google frequently releases updates called “Algorithm Updates”

• Google changes its search algorithm around 500 – 600 times every year.

• Some of these are major updates; most are minor

Page 15: Basics of search engines and algorithms (1)

Major Updates

Page 16: Basics of search engines and algorithms (1)
Page 17: Basics of search engines and algorithms (1)

• Panda Update - February 23, 2011

– This algorithm targets sites with thin content, content farms, duplicate content, high ad-to-content ratios, and a number of other quality issues.

– Affected 12% of queries on launch

– Recent update: Panda 4.0 – May 19, 2014

Page 18: Basics of search engines and algorithms (1)
Page 19: Basics of search engines and algorithms (1)

• Penguin Update – April 24, 2012

– This algorithm targets sites that are over-optimized or use excessive links.

– Affected 3% of queries on launch

– Recent update: Penguin 2.1 – Oct 4, 2013

Page 20: Basics of search engines and algorithms (1)
Page 21: Basics of search engines and algorithms (1)

Hummingbird Update – August 2013

• This algorithm understands the context of a query by analyzing the words in it

• It can automatically rewrite the query internally based on certain words such as “near”, “vs”, “how to”, “where”, “who is”, etc.

• Many queries are answered with “One Box Answers” to give quick answers.

Page 22: Basics of search engines and algorithms (1)

How It Works?

User Query → Query Translator → Modified Query → Index
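
The query translator step can be sketched as a small rule-based rewriter keyed on trigger words such as "vs", "how to", "who is", "where", and "near". This is a toy illustration only: the rule table, intent labels, and function name are hypothetical, and nothing here reflects how Google actually implements query rewriting.

```python
import re

# Hypothetical trigger patterns mapped to made-up intent labels.
RULES = [
    (re.compile(r"\bvs\b", re.I), "comparison"),
    (re.compile(r"^how to\b", re.I), "instructions"),
    (re.compile(r"^who is\b", re.I), "person"),
    (re.compile(r"^where\b", re.I), "location"),
    (re.compile(r"\bnear\b", re.I), "local"),
]

def translate(query):
    """Turn a raw user query into a modified query tagged with an intent."""
    normalized = " ".join(query.split()).lower()  # collapse whitespace, lowercase
    for pattern, intent in RULES:
        if pattern.search(normalized):
            return {"intent": intent, "query": normalized}
    return {"intent": "general", "query": normalized}

print(translate("Banana Vs. Apple"))
print(translate("Who is wife of Bill Gates"))
```

The modified query, together with its intent tag, is what would then be looked up in the index.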

Page 23: Basics of search engines and algorithms (1)

One Box Answers Queries

• When is Independence of India

• Time in India or Time in Toronto

• 1$ to INR

• 1 Mile to Kms

• Banana Vs. Apple

• Who is wife of Bill Gates

• What is my IP

• who invented www

• Show me pictures of Taj Mahal

Page 24: Basics of search engines and algorithms (1)

Search Engine Results Page (SERP)

Page 25: Basics of search engines and algorithms (1)
Page 26: Basics of search engines and algorithms (1)

Types of Results

Paid Results

PPC Ads

Comparison Ads

Shopping Ads

Non Paid Results

Organic Web

News Results

Image Results

Local Results

Video Results

Site Links

Schema Data

Page 27: Basics of search engines and algorithms (1)

Click Through Rate (CTR)

• CTR measures the share of users who click through to a site after seeing it on the SERP

• CTR helps to understand the user response

• The top four positions, which are “above the fold” for many desktop users, receive 83% of first-page organic clicks.

CTR = (No. of Clicks / No. of Impressions) × 100
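
The CTR formula translates directly into a small helper. This is a minimal sketch: the function name and the choice to return 0.0 when there are no impressions are assumptions, and the numbers in the usage line are made up for illustration.

```python
def ctr(clicks, impressions):
    """Click-through rate as a percentage: (clicks / impressions) x 100."""
    if impressions == 0:
        return 0.0  # assumption: treat "never shown" as 0% rather than raising an error
    return clicks * 100 / impressions

# Example: 50 clicks out of 1,000 impressions.
print(ctr(50, 1000))  # 5.0
```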

Page 28: Basics of search engines and algorithms (1)

2011

Page 29: Basics of search engines and algorithms (1)

2012 CTR Results

Page 30: Basics of search engines and algorithms (1)

Branded vs. Unbranded

Page 31: Basics of search engines and algorithms (1)

Thank you

Give us your feedback