Web Mining By: Shireen Fatima () Guide: Dr. Siddhartha Ghosh
Transcript
Page 1: Web mining

Web Mining

By: Shireen Fatima ()

Guide: Dr. Siddhartha Ghosh

Page 2: Web mining

References

Web Mining: Accomplishments & Future Directions by Jaideep Srivastava

Mining the Web: Discovering Knowledge from Hypertext Data by Soumen Chakrabarti

Web Mining Today and Tomorrow by Kavita Sharma and Vikas Kumar

Page 3: Web mining

Index

Introduction

Applications

Challenges

Web Mining taxonomy

Solution to Search Engine Problem

Web Mining through cloud computing

Conclusion

Page 4: Web mining

Introduction

Data mining: turn data into knowledge.

Web mining is the application of data mining techniques to find interesting and potentially useful knowledge from web data.

Page 5: Web mining

Cont.…

Web data includes:

Web content: text, images, records, etc.

Web structure: hyperlinks, tags, etc.

Web usage: HTTP logs, application server logs, etc.

Page 6: Web mining

Applications

Personalized customer experience in e-commerce - Amazon.com

Web search - Google

Web-wide tracking - DoubleClick

Understanding Web communities - AOL

Understanding auction behavior - eBay

Personalized Portal for the Web - My Yahoo

Page 7: Web mining

Benefits

Information filtering techniques try to learn about users’ interests based on their evaluation and actions, and then to use this information to analyze new documents.

It increases the value of each visitor and improves the visitor’s experience on the website.

Web mining is attractive to companies because of several advantages. In the most general sense, it can contribute to increased profit.

Page 8: Web mining

Challenges in Web Mining

Information is huge.

Information is diverse.

Information is redundant.

Page 9: Web mining

Web Mining taxonomy

Page 10: Web mining

Web Content Mining

Discovery of useful information from web contents / data / documents

The data mining techniques applied are:

Classification

Clustering

Associations

Page 11: Web mining

What is Clustering?

Given:

- A source of textual documents.

- A similarity measure, e.g., how many words are common to these documents?

Find:

Several clusters of documents that are relevant to each other
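The clustering setup above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the similarity measure is the Jaccard overlap of word sets, the grouping is a simple greedy pass, and the function names, the threshold of 0.25, and the sample documents are all hypothetical rather than taken from the slides.

```python
def word_set(text):
    """Lowercase a document and split it into a set of distinct words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Share of words two documents have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, threshold=0.25):
    """Greedy single-pass clustering: a document joins the first cluster
    whose first member is similar enough, otherwise it starts a new one."""
    clusters = []  # each cluster is a list of document indices
    words = [word_set(d) for d in docs]
    for i, w in enumerate(words):
        for members in clusters:
            if jaccard(w, words[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical sample documents.
docs = [
    "web mining finds patterns in web data",
    "mining web data finds useful patterns",
    "cheap flights and hotel deals",
    "hotel deals and cheap holiday flights",
]
print(cluster(docs))  # [[0, 1], [2, 3]]
```

Real systems would typically weight words (for example with TF-IDF) and use a more robust algorithm; the greedy pass is chosen here only because it keeps the idea visible.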

Page 12: Web mining

Association and Classification

Association rules: discover associations among sets of items across transactions.

X ⇒ Y, where X and Y are sets of items, with confidence P(Y | X) and support P(X ∧ Y).
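As a rough illustration of these two measures, the sketch below computes support and confidence over a handful of transactions. The transactions (sets of visited paths) and the function names are hypothetical, and the code uses the standard definitions: support is the fraction of transactions containing both X and Y, and confidence is P(Y | X).

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, x, y):
    """P(Y | X): support of X and Y together divided by support of X.
    Assumes X occurs in at least one transaction."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)


# Hypothetical transactions: the set of pages each visitor requested.
transactions = [
    {"/home", "/products", "/cart"},
    {"/home", "/products"},
    {"/home", "/about"},
    {"/products", "/cart"},
]

print(support(transactions, {"/products", "/cart"}))       # 0.5
print(confidence(transactions, {"/products"}, {"/cart"}))  # about 0.667
```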

Classification is the task of generalizing known structure to apply to new data.

For example, an e-mail program might attempt to classify an e-mail as "legitimate" or as "spam".
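One possible way to make the e-mail example concrete is a tiny naive Bayes classifier. The slides do not name a specific algorithm, so naive Bayes is an assumption here, as are the function names and the four training messages.

```python
import math
from collections import Counter


def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocab = set().union(*word_counts.values())
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total)
        n = sum(counts.values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (n + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled messages (the "known structure").
examples = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("project report attached for review", "legitimate"),
]
word_counts, label_counts = train(examples)
print(classify("free prize money", word_counts, label_counts))
print(classify("monday project meeting", word_counts, label_counts))
```

The first call should label the message as spam and the second as legitimate, because each message shares words only with the training examples of that class.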

Page 13: Web mining

Web Structure Mining

The structure of a typical Web graph consists of Web pages as nodes and hyperlinks as edges connecting two related pages.

Web Structure Mining is the process of discovering structure information from the Web.

Web-graph: A directed graph that represents the Web.

Node: Each Web page is a node of the Web-graph.

Link: Each hyperlink on the Web is a directed edge of the Web-graph.
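A Web-graph like the one described above could be represented as an adjacency structure. The sketch below is a minimal illustration: the page names and links are hypothetical, and counting in-links is just one simple piece of structure information one might extract.

```python
# Hypothetical hyperlinks: (source page, target page).
links = [
    ("/index.html", "/products.html"),
    ("/index.html", "/about.html"),
    ("/products.html", "/index.html"),
    ("/about.html", "/products.html"),
]


def build_graph(links):
    """Build a directed graph: each page maps to the set of pages it links to."""
    graph = {}
    for source, target in links:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set())  # pages with no out-links are still nodes
    return graph


def in_degree(graph):
    """Count how many pages link to each page."""
    counts = {page: 0 for page in graph}
    for targets in graph.values():
        for target in targets:
            counts[target] += 1
    return counts


graph = build_graph(links)
print(in_degree(graph))
# {'/index.html': 1, '/products.html': 2, '/about.html': 1}
```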

Page 14: Web mining

Web Usage Mining

It deals with understanding user behavior in interacting with the web or with a website.

The goal is to obtain information that may help web sites reorganize or adapt to better suit the user.

Page 15: Web mining

Web Usage Mining – 3 Phases

Page 16: Web mining

Web Usage Mining approaches

Clustering and classification:

Clients who often access /products/software/webminer.html tend to be from educational institutions.

Clients who placed an online order for software tend to be students in the 20-25 age group.

75% of clients who download software from /products/software/demos/ visit between 7:00 and 11:00 pm on weekends.
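A figure like the last one above could be derived from usage logs roughly as follows. This is a sketch under assumptions: the log records are hypothetical and already parsed into (client, path, timestamp) tuples, and the function names are made up for illustration.

```python
from datetime import datetime

# Hypothetical, already-parsed log records: (client id, requested path, timestamp).
records = [
    ("c1", "/products/software/demos/a.zip", datetime(2024, 6, 1, 20, 15)),  # Saturday
    ("c2", "/products/software/demos/b.zip", datetime(2024, 6, 2, 21, 40)),  # Sunday
    ("c3", "/products/software/demos/a.zip", datetime(2024, 6, 3, 10, 5)),   # Monday
    ("c4", "/index.html", datetime(2024, 6, 1, 20, 30)),                     # Saturday
    ("c5", "/products/software/demos/c.zip", datetime(2024, 6, 8, 19, 0)),   # Saturday
]


def is_weekend_evening(ts):
    """True for Saturday or Sunday between 7:00 pm and 11:00 pm."""
    return ts.weekday() >= 5 and 19 <= ts.hour < 23


def weekend_evening_share(records, prefix):
    """Share of clients requesting paths under `prefix` who did so on a weekend evening."""
    downloaders = {c for c, path, _ in records if path.startswith(prefix)}
    evening = {
        c for c, path, ts in records
        if path.startswith(prefix) and is_weekend_evening(ts)
    }
    return len(evening) / len(downloaders) if downloaders else 0.0


print(weekend_evening_share(records, "/products/software/demos/"))  # 0.75
```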

Page 17: Web mining

Cont.…

Sequential patterns: a set of items is followed by another item in time order.

Web usage examples:

30% of clients who visited /products/software/ had done a search in Yahoo using the keyword “software” before their visit.

60% of clients who placed an online order for WEBMINER placed another online order for software within 15 days.
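The second example above can be illustrated with a small time-ordered check. The order log, the product label "other-software", and the function name are hypothetical; the sketch simply measures how many clients who ordered one product placed a different order within a given number of days afterwards.

```python
from datetime import date, timedelta

# Hypothetical order log: (client id, product, order date).
orders = [
    ("c1", "WEBMINER", date(2024, 3, 1)),
    ("c1", "other-software", date(2024, 3, 10)),
    ("c2", "WEBMINER", date(2024, 3, 2)),
    ("c3", "WEBMINER", date(2024, 3, 5)),
    ("c3", "other-software", date(2024, 4, 20)),
    ("c4", "WEBMINER", date(2024, 3, 7)),
    ("c4", "other-software", date(2024, 3, 8)),
    ("c5", "WEBMINER", date(2024, 3, 9)),
    ("c5", "other-software", date(2024, 3, 20)),
]


def repeat_order_share(orders, product, window_days=15):
    """Share of clients who ordered `product` and then placed a different
    order within `window_days` days of their first `product` order."""
    window = timedelta(days=window_days)
    first_order = {}
    for client, item, day in orders:
        if item == product and (client not in first_order or day < first_order[client]):
            first_order[client] = day
    repeaters = {
        client
        for client, item, day in orders
        if client in first_order
        and item != product
        and first_order[client] < day <= first_order[client] + window
    }
    return len(repeaters) / len(first_order) if first_order else 0.0


print(repeat_order_share(orders, "WEBMINER"))  # 0.6
```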

Page 18: Web mining

Solution to the Search Engine Problems and How Web Mining Can Help in Improving the Business Decisions

Because search engines draw on the enormous amount of information in web sites and web pages, it is a challenging task to engineer, implement, and improve a search engine.

Web mining helps with the problem of how to deal effectively with an uncontrolled hypertext collection where anyone can publish anything they want.

Page 19: Web mining

Cont.…

Web mining applications are used by web sites for:

Web search, e.g., Google and Yahoo

Web recommendations, e.g., Amazon.com

Web advertising, e.g., Google and Yahoo

Web site design, e.g., landing page optimization

Page 20: Web mining

Web Mining through Cloud Computing

Cloud Computing is clearly one of today's most seductive technology areas due at least in part to its cost efficiency and flexibility.

Cloud mining is a new approach to a faceted search interface for your data. SaaS (Software-as-a-Service) is used to reduce the cost of web mining and to provide the security that comes with the cloud mining technique.

Page 21: Web mining

Conclusion

Web mining fills the information gap between web users and web designers.

Many successful techniques have been developed for mining the web.

Cloud mining is an improved method for web mining.

The need for discovering new methods and techniques to handle the amounts of data existing in this universe will always exist.
