MINING THE E-BUSINESS TO ENHANCE THE MARKET
STRATEGIES OF A COMPANY
ENROLMENT No. - 8103532 8103592
NAME OF THE STUDENT - SHRADDHA SINGH DHRUV GOEL
NAME OF THE SUPERVISOR - Mrs. ARTI GUPTA
May- 2012
Submitted in partial fulfilment of the Degree of
Bachelor of Technology
In
Computer Science Engineering
DEPARTMENT OF COMPUTER SCIENCE ENGINEERING &
INFORMATION TECHNOLOGY
JAYPEE INSTITUTE OF INFORMATION TECHNOLOGY, NOIDA
TABLE OF CONTENTS
Chapter No. Topics
Cover Page
Student Declaration
Certificate From The Supervisor
Acknowledgement
Abstract
List Of Figures
List Of Tables
List Of Symbols And Acronyms
Chapter 1 Introduction
1.1 General Introduction
1.2 Problem Statement
1.3 Empirical Study
1.4 Current And Open Problems
1.5 Approach To Problem In Terms Of Technology / Platform To Be Used
1.6 Support For Novelty / Significance Of Problem
1.7 Solution Approach
Chapter 2 Literature Survey
2.1 Summary Of Papers
2.2 Diagrammatic Integrated Summary Of The Literature Studied
Chapter 3 Analysis, Design And Modelling
3.1 Overall Description Of The Project
3.2 Specific Requirements
3.2.1 External Interfaces
3.2.2 Functions
3.2.3 Performance Requirements
3.2.4 Logical Database Requirements
3.2.5 Design Constraints
3.2.6 Software Attributes (H/W, S/W)
3.3 Design Diagrams
3.3.1 Use Case Diagrams
3.3.2 Class Diagrams / Control Flow Diagrams
3.3.3 Sequence Diagram / Activity Diagrams
3.4 Risk Analysis
3.5 Risk Mitigation Plan
Chapter 4 Implementation And Testing
4.1 Implementation Details And Issues
4.1.1 Implementation
4.1.2 Debugging
4.1.3 Error And Exception Handling
4.2 Risk Management
4.3 Testing
4.3.1 Testing Plan
4.3.2 Features To Be Tested
4.3.3 Features Not To Be Tested
4.3.4 Approach Taken For Testing
4.3.5 Item Pass/Fail Criteria
4.3.6 Test Cases: For All Features To Be Tested
Chapter 5 Conclusion
5.1 Conclusion
5.2 Future Work
References
Appendices
Appendix A Work Plan
Appendix B Description Of Tool
DECLARATION
We hereby declare that this submission is our own work and that, to the best of our
knowledge and belief, it contains no material previously published or written by another
person nor material which has been accepted for the award of any other degree or diploma of
the university or other institute of higher learning, except where due acknowledgment has
been made in the text.
Place: Signature
Name: Shraddha Singh Dhruv Goel
Date: Enrolment No: 8103532 8103592
CERTIFICATE
This is to certify that the work titled “Mining The E-Business To Enhance The Market
Strategies Of A Company” submitted by Dhruv Goel & Shraddha Singh in partial
fulfilment for the award of Degree of Bachelor of Technology of Jaypee Institute of
Information Technology, Noida has been carried out under my supervision. This work has
not been submitted partially or wholly to any other university or institute for the award of this
or any other degree or diploma.
Signature of Supervisor
Name of Supervisor Mrs. Arti Gupta
Designation Lecturer
Date
ACKNOWLEDGEMENT
A project is an attempt by a student to apply the best of his or her skills and arrive at
something productive or useful to the understanding of a field. This project, too, has given
us many ideas and much knowledge of the topics covered.
We express our deepest gratitude to our supervisor, Mrs. Arti Gupta, for her invaluable
guidance and blessings. We are very grateful to her for providing us with an environment in
which to work on this project successfully, and we thank her for her unwavering support
during the entire course of this project work.
Signature of the student :
Name of the student :
Dhruv Goel Shraddha Singh
Enrolment No. : 8103592 8103532
Date :
ABSTRACT
The rapid growth of the Internet is reshaping industries and bringing massive change to the
business market. Traditional business is undergoing a major transformation into e-business.
Unfortunately, the enormous volume of largely unstructured data on the web, even for a
single commodity, has become a source of ambiguity for consumers. Extracting valuable
information from this ever-increasing data is an extremely tedious task and is fast becoming
critical to the success of businesses. Data mining is an emerging technology that aims to
discover patterns in the underlying historical data and to identify trends that go beyond
simple analysis. Through the use of sophisticated algorithms, it gives users an opportunity
to identify key attributes of business processes and to target opportunities. A new dimension
has been added to data mining by extending it to the realm of e-business, which provides all
the right ingredients for successful data mining. Data mining techniques help e-businesses to
seek and retain the most profitable customers by analysing customer buying and site-traversal
patterns collected online or offline. Essentially, e-business companies can use data mining
techniques to improve product quality or sales by anticipating problems before they occur.
Data mining, in general, is the task of extracting implicit, previously unknown, valid and
potentially useful information from data. Web mining is the use of data mining techniques to
automatically discover and extract useful information from web documents and services. Web
content mining can be applied with encouraging results in areas such as customer relations
modelling, billing records, product cataloguing and quality management.
In our project we have therefore worked in the field of web technology and web mining. We
have developed an efficient e-business process management system, and we have studied and
implemented techniques from web content mining and examined their impact on needs specific
to business users, focusing on both the customer and the producer. Our system aims to apply
various data mining techniques to business data extracted from the web and to analyse that
data, which in turn will help to improve the company’s marketing strategies.
Signature of the student:
Name of the student: Shraddha Singh, Dhruv Goel
Date:
Signature of the supervisor:
Name of the supervisor: Mrs. Arti Gupta
Date:
LIST OF SYMBOLS AND ACRONYMS
HTML - HyperText Markup Language.
XML - Extensible Markup Language. XML is a markup language much like HTML, but it was
designed to carry data, not to display it.
ARFF - Attribute-Relation File Format. An ARFF file is an ASCII text file that describes a
list of instances sharing a set of attributes. ARFF files have two distinct sections. The first
section is the Header information, which is followed by the Data information.
E.g., the Header of an ARFF file looks like the following:
@RELATION iris
@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE sepalwidth NUMERIC
@ATTRIBUTE petallength NUMERIC
@ATTRIBUTE petalwidth NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
The Data of the ARFF file looks like the following:
@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
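The two-section layout described above can be illustrated with a short Python sketch. This is only a minimal reader written for illustration, not Weka's own loader: the function name parse_arff is hypothetical, and the sketch assumes unquoted attribute names, no missing values ("?") and no sparse rows.

```python
def parse_arff(text):
    """Parse a small ARFF document into (relation, attributes, rows)."""
    relation = None
    attributes = []   # (name, type) pairs in declaration order
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):   # skip blank lines and % comments
            continue
        if in_data:
            # Data section: one comma-separated instance per line.
            values = [v.strip() for v in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                numeric = kind.upper() in ("NUMERIC", "REAL")
                row[name] = float(value) if numeric else value
            rows.append(row)
            continue
        # Header section: @RELATION, @ATTRIBUTE and finally @DATA.
        keyword, _, rest = line.partition(" ")
        keyword = keyword.upper()
        if keyword == "@RELATION":
            relation = rest.strip()
        elif keyword == "@ATTRIBUTE":
            name, _, kind = rest.strip().partition(" ")
            attributes.append((name, kind.strip()))
        elif keyword == "@DATA":
            in_data = True
    return relation, attributes, rows


SAMPLE = """@RELATION iris
@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE sepalwidth NUMERIC
@ATTRIBUTE petallength NUMERIC
@ATTRIBUTE petalwidth NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
"""

relation, attributes, rows = parse_arff(SAMPLE)
```

Numeric attributes are converted to floats, while the nominal class attribute is kept as a string, so each row becomes a dictionary keyed by attribute name.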
INTRODUCTION
GENERAL INTRODUCTION
This project aims to develop an efficient business process management system and to apply
web content mining techniques, both to support the customer’s product search and to gather
useful information for analysing business data, which can then be used to improve the market
strategies of a company.
Web content mining aims to extract useful information or knowledge from the contents of web
pages. It is related to, but different from, data mining and text mining. It is related to
data mining because many data mining techniques can be applied to web content, and to text
mining because much of the web’s content is text. However, data mining deals primarily with
structured data, whereas web data is mainly semi-structured or unstructured; and text mining
focuses on unstructured text, whereas web pages have a semi-structured nature. Web content
mining thus requires creative applications of data mining and text mining techniques as well
as its own unique approaches.
The Internet is probably the world’s biggest database, and its data is available through
easily accessible techniques. Often it is important and detailed data that lets people achieve
their goals or that can be used in various realms. The data is held in various forms: text,
multimedia and databases. Web pages follow the HTML standard, which gives them a kind of
structure, but not enough to use them easily in data mining. A typical website contains, in
addition to its main content and links, extras such as advertisements and navigation items.
It is also widely known that most of the data on the Internet is redundant: a lot of
information appears on different sites in more or less the same form.
Within the web mining domain, web content mining is essentially an analogue of data mining
techniques for relational databases, since similar types of knowledge can be found in the
unstructured data residing in web documents. A web document usually contains several types
of data, such as text, images, audio, video, metadata and hyperlinks. Some of this data is
semi-structured, such as HTML documents, or more structured, such as the data in tables or
database-generated HTML pages, but most of it is unstructured text. This unstructured
character of web data forces web content mining towards a more complicated approach.
PROBLEM STATEMENT
E-commerce has changed the face of most business functions in competitive enterprises.
Web-enabled electronic business is generating massive amounts of data on customer
purchases, browsing patterns, usage times and preferences at an increasing rate.
Unfortunately, the enormous volume of largely unstructured data on the web, even for a
single commodity, has become a source of ambiguity for consumers. Gathering this
ever-increasing data from the web and then extracting from it the valuable information
needed to make proper decisions is an extremely tedious task, and it is fast becoming
critical to the success of businesses.
EMPIRICAL STUDY
Various tools and software packages are available for data mining and web content mining:
WEKA TOOL
Weka (Waikato Environment for Knowledge Analysis) is a popular suite of machine
learning software written in Java, developed at the University of Waikato, New
Zealand. It is free software available under the GNU General Public License. Weka
is a comprehensive set of advanced data mining and analysis tools. Its strength lies
in classification, where it covers many of the most current machine learning (ML)
approaches. At its simplest, it provides a quick and easy way to explore and analyse
data. Weka is also suitable for dealing with large data sets, where the resources of
many computers and/or multi-processor computers can be used in parallel. Weka also
allows data to be pulled directly from database servers as well as web servers. Its
native data format is known as ARFF.
WEKA consists of
• Explorer
• Experimenter
• Knowledge flow
• Simple Command Line Interface
• Java interface
Weka has a comprehensive set of classification tools. Many of these algorithms are
very new and reflect an area of active development. We examine only the tree-based
classifiers, which are a very small part of the classification methods available in
Weka: there are 11 tree algorithms, and 71 algorithms in all.
MOZENDA SOFTWARE
Mozenda offers intuitive software that allows you to mine data in just minutes. It is a
Software as a Service company that enables users of all types to extract and manage web
data easily and affordably. With Mozenda, users can set up agents that routinely extract
data, store data, and publish data to multiple destinations. Once information is in the
Mozenda systems, users can format, repurpose, and mash up the data for use in other
online or offline applications.
CURRENT AND OPEN PROBLEMS
In today’s era, the world has become a global village and the Internet, from e-business to
blogs to search engines, is its driving force. The major question facing business users is
how to retain existing customers while also understanding the patterns and trends in
customer behaviour, so that their decisions can be supported by facts, presented through the
visualizations and reports that web mining makes possible. There is also intense competition
among companies, so a company must stay ahead of others in the products it sells and
identify the strengths and weaknesses of its competitors.
Some relevant problems are listed as follows:
• Very high data volumes and data flow rates
• Complex structured, semi-structured and unstructured data
• A growing trend among companies, organizations and individuals alike to gather
information and use it in their own interest
• The need to unearth hidden relationships among various attributes of data and between
several snapshots of data over a period of time. These hidden patterns have enormous
potential in predictions and personalisation in e-commerce
• The need for organized data for analysis in order to improve market strategies
• Information extraction for catalogue creation and service discovery
APPROACH TO THE PROBLEM IN TERMS OF TECHNOLOGY USED
INTRODUCTION TO .NET FRAMEWORK
The .NET Framework is a computing platform that simplifies application development
in the highly distributed environment of the Internet. It is designed to fulfil the
following objectives:
• To provide a consistent object-oriented programming environment whether object code is
stored and executed locally, executed locally but Internet-distributed, or executed
remotely.
• To provide a code-execution environment that minimizes software deployment and
versioning conflicts.
• To provide a code-execution environment that guarantees safe execution of code,
including code created by an unknown or semi-trusted third party.
• To provide a code-execution environment that eliminates the performance problems of
scripted or interpreted environments.
• To make the developer experience consistent across widely varying types of applications,
such as Windows-based applications and Web-based applications.
• To build all communication on industry standards to ensure that code based on the .NET
Framework can integrate with any other code.
.NET FRAMEWORK CLASS LIBRARY
The .NET Framework class library is a collection of reusable types that tightly integrate with
the common language runtime. The class library is object oriented, providing types from
which your own managed code can derive functionality. This not only makes the .NET
Framework types easy to use, but also reduces the time associated with learning new features
of the .NET Framework. In addition, third-party components can integrate seamlessly with
classes in the .NET Framework. For example, the .NET Framework collection classes
implement a set of interfaces that you can use to develop your own collection classes. Your
collection classes will blend seamlessly with the classes in the .NET Framework.
As you would expect from an object-oriented class library, the .NET Framework types enable
you to accomplish a range of common programming tasks, including string management, data
collection, database connectivity, and file access. In addition to these common tasks, the
class library includes types that support a variety of specialized development scenarios.
For example, you can use the .NET Framework to develop the following types of applications
and services:
• Console applications.
• Scripted or hosted applications.
• Windows GUI applications (Windows Forms).
• ASP.NET applications.
• XML Web services.
• Windows services.
ACTIVE SERVER PAGES.NET
ASP.NET is a programming framework built on the common language runtime that can be
used on a server to build powerful Web applications. ASP.NET offers several important
advantages over previous Web development models:
• Enhanced Performance. ASP.NET is compiled common language runtime code running
on the server.
• World-Class Tool Support. The ASP.NET framework is complemented by a rich
toolbox and designer in the Visual Studio integrated development environment.
WYSIWYG editing, drag-and-drop server controls, and automatic deployment are just a
few of the features this powerful tool provides.
• Power and Flexibility. Because ASP.NET is based on the common language runtime,
the power and flexibility of that entire platform is available to Web application
developers. The .NET Framework class library, Messaging, and Data Access solutions
are all seamlessly accessible from the Web. ASP.NET is also language-independent, so
you can choose the language that best applies to your application or partition your
application across many languages.
• Simplicity. ASP.NET makes it easy to perform common tasks, from simple form
submission and client authentication to deployment and site configuration.
• Manageability. ASP.NET employs a text-based, hierarchical configuration system,
which simplifies applying settings to your server environment and Web applications.
• Scalability and Availability. ASP.NET has been designed with scalability in mind, with
features specifically tailored to improve performance in clustered and multiprocessor
environments.
• Customizability and Extensibility. ASP.NET delivers a well-factored architecture that
allows developers to "plug in" their code at the appropriate level.
• Security. Built-in Windows authentication and per-application configuration help to
keep your applications secure.
ASP.NET WEB FORMS
The ASP.NET Web Forms page framework is a scalable common language runtime
programming model that can be used on the server to dynamically generate Web pages.
ASP.NET Web Forms pages are text files with an .aspx file name extension. They can be
deployed throughout an IIS virtual root directory tree. When a browser client requests .aspx
resources, the ASP.NET runtime parses and compiles the target file into a .NET Framework
class. This class can then be used to dynamically process incoming requests. ASP.NET
provides syntax compatibility with existing ASP pages. This includes support for <% %>
code render blocks that can be intermixed with HTML content within an .aspx file. These
code blocks execute in a top-down manner at page render time.
INTRODUCTION TO ASP.NET SERVER CONTROLS
In addition to (or instead of) using <% %> code blocks to program dynamic content,
ASP.NET page developers can use ASP.NET server controls to program Web pages. Server
controls are declared within an .aspx file using custom tags or intrinsic HTML tags that
contain a runat="server" attribute value. Intrinsic HTML tags are handled by one of the
controls in the System.Web.UI.HtmlControls namespace. Any tag that doesn't explicitly
map to one of the controls is assigned the type of
System.Web.UI.HtmlControls.HtmlGenericControl. Server controls automatically
maintain any client-entered values between round trips to the server. This control state is not
stored on the server (it is instead stored within an <input type="hidden"> form field that is
round-tripped between requests). Note also that no client-side script is required.
ADO.NET
ADO.NET is an evolution of the ADO data access model that directly addresses user
requirements for developing scalable applications. It was designed specifically for the web
with scalability, statelessness, and XML in mind. ADO.NET uses some ADO objects, such as
the Connection and Command objects, and also introduces new objects. Key new
ADO.NET objects include the DataSet, DataReader, and DataAdapter. Some objects are:
• Connections. For connecting to and managing transactions against a database.
• Commands. For issuing SQL commands against a database.
• Data Readers. For reading a forward-only stream of data records from a SQL Server data
source.
• Datasets. For storing, remoting and programming against flat data, XML data and
relational data.
• Data Adapters. For pushing data into a Dataset, and reconciling data against a database.
When dealing with connections to a database, there are two different options: the SQL Server
.NET Data Provider (System.Data.SqlClient) and the OLE DB .NET Data Provider
(System.Data.OleDb). Here we use the SQL Server .NET Data Provider, which is written to
talk directly to Microsoft SQL Server.
SQL SERVER
SQL Server stores each data item in its own field. The fields relating to a particular
person, thing or event are bundled together to form a single complete unit of data, called
a record (also referred to as a row or an occurrence). Each record is made up of a number
of fields, and no two fields in a record can have the same field name. During a SQL Server
database design project, the analysis of your business needs identifies all the fields or
attributes of interest. If your business needs change over time, you can define additional
fields or change the definition of existing fields.
JAVA APPLET
Applets are used to provide interactive features to web applications that cannot be provided
by HTML alone. They can capture mouse input and also have controls like buttons or check
boxes. In response to the user action an applet can change the provided graphic content. This
makes applets well suited to demonstration, visualization and teaching. There are online
applet collections for studying various subjects. An applet can also be a text area only,
providing, for instance, a cross platform command-line interface to some remote system. If
needed, an applet can leave the dedicated area and run as a separate window. A Java applet
extends the class java.applet.Applet.
SUPPORT FOR THE NOVELTY OF THE PROBLEM
Why E-business?
E-commerce websites give a business the ability to sell, advertise and introduce different
kinds of services and products on the web. They have the advantage of reaching a large
number of customers regardless of distance and time limitations. A further advantage of
e-commerce over traditional business is the greater speed and lower cost, for both website
owners and customers, of completing customers’ transactions and orders. Retail websites aim
to inspire, to reflect a good image of the business and to improve that image online. An
important factor in running a successful retail website is knowing your competitors: on the
one hand, identifying their strengths and benefiting from them by improving on those points
and adopting powerful strategies; on the other hand, identifying their weaknesses and
avoiding them.
Web Mining versus Data Mining
Web mining is the use of data mining techniques to automatically discover and extract
information from Web documents and services. When comparing web mining with traditional
data mining, there are three main differences to consider:
1. Scale – In traditional data mining, processing 1 million records from a database would
be a large job. In web mining, even 10 million pages would not be a big number.
2. Access – When doing data mining of corporate information, the data is private and
often requires access rights to read. For web mining, the data is public and rarely
requires access rights.
3. Structure – A traditional data mining task gets information from a database, which
provides some level of explicit structure. A typical web mining task is processing
unstructured or semi-structured data from web pages. Even when the underlying
information for web pages comes from a database, this often is obscured by HTML
mark-up.
Thus, web mining can be used to support enterprises in creating marketable products.
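The point about structure can be made concrete with a small Python sketch that recovers database-like rows from an HTML table using the standard html.parser module. The class name TableRows and the sample catalogue page are hypothetical and serve only as an illustration; real pages are rarely this regular.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text counts only while a cell is open; mark-up around it is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Hypothetical catalogue page: the records are wrapped in presentation mark-up.
HTML = """
<table class="catalogue">
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td><b>Kettle</b></td><td>899</td></tr>
  <tr><td>Toaster</td><td>1250</td></tr>
</table>
"""

parser = TableRows()
parser.feed(HTML)
rows = parser.rows
```

After parsing, rows holds one list per table row, which is close to the record structure a traditional data mining task would have read directly from the database.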
SOLUTION APPROACH
• Developing Business Management Software
• Implementing Web Crawler
• Implementing Web Extractor
• Implementation of Data Mining Techniques
Developing Business Management Software
A business process model for the marketing team, used to inform target groups about the
company’s products and services and to let them place online orders that the business
managers can view; in other words, it automates a whole retail store for the sale of
products while providing customers with the best of services. Security standards and data
protection mechanisms have been given high priority. The application takes care of the
different modules and their associated reports, which are produced according to the
strategies and standards put forward by the administrative staff.
Implementing Web crawler for collection of web data
A web crawler based on path incremental crawling that applies breadth-first search to find
the pages linked from a URL. It starts searching as soon as the crawl button on its
interface is pressed. The crawler application is written in C# in Microsoft Visual Studio
2010, and its search algorithm, based on pseudocode provided in a research paper, is also
implemented in C#.
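The breadth-first traversal described above can be sketched in a few lines of Python. This is not the project's C# crawler and does not reproduce the pseudocode from the research paper; it shows plain breadth-first search only. The function name crawl_breadth_first, the max_pages limit and the in-memory SITE link graph (standing in for real HTTP fetches) are all assumptions made for illustration.

```python
from collections import deque


def crawl_breadth_first(start_url, get_links, max_pages=100):
    """Return URLs reachable from start_url in breadth-first visiting order.

    get_links is any callable that maps a URL to the URLs linked from it.
    """
    order = []
    seen = {start_url}            # URLs already queued, to avoid revisits
    queue = deque([start_url])
    while queue and len(order) < max_pages:
        url = queue.popleft()     # FIFO queue gives level-by-level traversal
        order.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


# Hypothetical link graph used instead of downloading pages.
SITE = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/c", "http://example.com/"],
    "http://example.com/b": ["http://example.com/c"],
    "http://example.com/c": [],
}

order = crawl_breadth_first("http://example.com/", lambda url: SITE.get(url, []))
first_two = crawl_breadth_first(
    "http://example.com/", lambda url: SITE.get(url, []), max_pages=2
)
```

Passing the link lookup as a function keeps the traversal separate from page fetching, so the same loop could be driven by a real downloader and link extractor.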
Implementing Web Extractor/Parser
One of the critical problems in building an extractor is defining a set of extraction rules that
precisely define how to locate the information on the page. For any given item to be extracted
from a page, one needs an extraction rule to locate both the beginning and end of that item.
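A begin/end extraction rule of this kind can be sketched in Python as a pair of marker strings. The function name extract_between, the markers and the sample page are hypothetical; the extraction rules actually used in the project are not given in this section.

```python
def extract_between(page, begin, end):
    """Return every substring found between a begin marker and the next end marker."""
    items = []
    pos = 0
    while True:
        start = page.find(begin, pos)
        if start == -1:
            break                      # no further begin marker
        start += len(begin)
        stop = page.find(end, start)
        if stop == -1:
            break                      # begin marker without a matching end
        items.append(page[start:stop].strip())
        pos = stop + len(end)
    return items


# Hypothetical product listing; each rule is a (begin, end) pair of markers.
PAGE = (
    '<li><span class="name">Kettle</span><span class="price">Rs. 899</span></li>'
    '<li><span class="name">Toaster</span><span class="price">Rs. 1,250</span></li>'
)

names = extract_between(PAGE, '<span class="name">', "</span>")
prices = extract_between(PAGE, '<span class="price">', "</span>")
```

Each item needs both markers: the begin marker locates the start of the item and the end marker tells the extractor where to stop.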
Implementation of Data Mining Techniques
Data mining techniques will be applied to the extracted data sets in order to retrieve
useful information and answer queries about customers, so as to enhance the market
strategies and meet the competition by comparing various products and services. The data
can be categorized on the basis of similarity and relationships. The categorization can be
obtained using classification techniques, while association is an exploratory method of
discovering previously unknown relationships. Applying data mining techniques to the
business data will thus help us to achieve the following:
• Build unique market segments identifying the attributes of high-value prospects
• Select promotional strategies that best reach the client’s Web customer segments
• Analyze online sales to improve targeting of the client’s high-value customers
• Test and determine which marketing activities have the greatest impact
• Identify client customers most likely to be interested in their new products
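Association analysis can be illustrated with the two standard measures for a rule A -> B: support (the share of transactions containing both A and B) and confidence (the share of transactions containing A that also contain B). The Python sketch below is a minimal illustration on made-up order data; the ORDERS list and the function names are assumptions, and the sketch is not the mining procedure used in the project.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, relative to the antecedent alone."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return both / base if base else 0.0


# Made-up online orders, one set of purchased products per order.
ORDERS = [
    {"kettle", "tea"},
    {"kettle", "tea", "sugar"},
    {"kettle", "toaster"},
    {"tea", "sugar"},
]

rule_support = support(ORDERS, {"kettle", "tea"})          # 2 of 4 orders
rule_confidence = confidence(ORDERS, {"kettle"}, {"tea"})  # 2 of the 3 kettle orders
```

A rule with high support and confidence, such as customers who buy a kettle also buying tea, is the kind of previously unknown relationship that can feed promotional strategies.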
LITERATURE SURVEY
SUMMARY OF PAPERS
TITLE: RESEARCH ON DATA MINING IN E-BUSINESS
AUTHOR: Luo Hanyang, Shenzhen Graduate School, Harbin Institute of Technology, Shenzhen,
P.R. China; Gao Jinling, Ji Wenli, College of Management, Shenzhen University, Shenzhen,
P.R. China
YEAR OF PUBLICATION: 22 December 2008
PUBLISHING DETAILS: Computer Science and Software Engineering, 2008 International
Conference
SUMMARY: Data mining is an emerging technology that can be applied to search for valuable
business information in e-business, since huge amounts of detail are available in a
website’s back-end database. The paper introduces the architecture of an e-business website
and its sources of information, such as server logs and customer registration information,
with a brief mention of the data mining techniques applicable in such a scenario. It
emphasises that the main goal of data mining in e-business is to mine customer visit
information, to understand customers’ browsing actions and modes, to find useful market
information and to provide personalized services. Data mining adopts many techniques; the
main methods are discrimination, association analysis, classification and prediction,
cluster analysis and evolution analysis.
WEB LINK: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4722629&tag=1
TITLE: DATA MINING ON SYMBOLIC KNOWLEDGE EXTRACTED FROM THE WEB
AUTHOR: Rayid Ghani, Rosie Jones, Dunja Mladenić, Kamal Nigam, Seán Slattery; School of
Computer Science, Carnegie Mellon University, Pittsburgh; Department for Intelligent
Systems, J. Stefan Institute
YEAR OF PUBLICATION: 2000
PUBLISHING DETAILS: Workshop on Text Mining at the Sixth ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (KDD-2000), 2000
SUMMARY: Since it is crucial in e-business to know about competitors, one needs to know
details such as the products and services they offer in various domains. The paper
discusses creating a dataset by spidering sources on the web and then applying data mining
techniques to it. It gives a brief overview of the data mining techniques applicable to
corporate databases. It discusses the need for a web crawler to extract information from
companies’ websites, and the need for a wrapper to extract information that augments the
crawler’s information. Only once a dataset is available can data mining techniques such as
clustering, classification and association be applied and interesting regularities be
discovered in a company’s dataset according to the requirements.
WEB LINK: http://www.kamalnigam.com/papers/shield-kddws00
TITLE: INTEGRATING E-COMMERCE AND DATA MINING: ARCHITECTURE AND CHALLENGES
AUTHOR: Suhail Ansari, Ron Kohavi, Llew Mason and Zijian Zheng
YEAR OF PUBLICATION: 2000
PUBLISHING DETAILS: Appeared in the WEBKDD'2000 workshop: Web Mining for E-Commerce --
Challenges and Opportunities. Appeared in ICDM'01: The 2001 IEEE International Conference
on Data Mining
SUMMARY: The paper discusses the integration of data mining and e-business, focusing mainly
on e-business as a killer domain for data mining. It proposes an architecture that
successfully integrates data mining with an e-commerce system, consisting of three main
parts: Business Data Definition, Customer Interaction and Analysis. Business Data
Definition covers the data and metadata associated with e-business and the ability to
define a rich set of attributes; for example, products can have attributes such as size and
colour. Customer interaction plays a major role in the success of a business, which gives
rise to the need for an efficient e-business website. The third component emphasises
analysis of the collected data, mainly customer data, using various data mining techniques.
Through an analysis tool, reports can be generated that give varied knowledge about
different points, such as the top-selling and worst-selling products. Finally several
challenging problems that need to be addressed for further enhancement of this architecture were