HAN 10-ch03-083-124-9780123814791 2011/6/1 3:16 Page 85 #3
3.1 Data Preprocessing: An Overview 85
have been deleted. Furthermore, the recording of the data
history or modifications may have been overlooked. Missing data,
particularly for tuples with missing values for some attributes, may
need to be inferred.
Recall that data quality depends on the intended use of the
data. Two different users may have very different assessments of the
quality of a given database. For example, a marketing analyst may
need to access the database mentioned before for a list of
customer addresses. Some of the addresses are outdated or
incorrect, yet overall, 80% of the addresses are accurate. The
marketing analyst considers this to be a large customer database for
target marketing purposes and is pleased with the database's
accuracy, although, as sales manager, you found the data
inaccurate.
Timeliness also affects data quality. Suppose that you are
overseeing the distribution of monthly sales bonuses to the top
sales representatives at AllElectronics. Several sales
representatives, however, fail to submit their sales records on
time at the end of the month. There are also a number of corrections
and adjustments that flow in after the month's end. For a period of
time following each month, the data stored in the database are
incomplete. However, once all of the data are received, they are
correct. The fact that the month-end data are not updated in a
timely fashion has a negative impact on the data quality.
Two other factors affecting data quality are believability and
interpretability. Believability reflects how much the data are
trusted by users, while interpretability reflects how easily the data
are understood. Suppose that a database, at one point, had
several errors, all of which have since been corrected. The past
errors, however, had caused many problems for sales department
users, and so they no longer trust the data. The data also use many
accounting codes, which the sales department does not know how
to interpret. Even though the database is now accurate, complete,
consistent, and timely, sales department users may regard it as of
low quality due to poor believability and interpretability.
3.1.2 Major Tasks in Data Preprocessing
In this section, we look at the major steps involved in data
preprocessing, namely, data cleaning, data integration, data
reduction, and data transformation.
Data cleaning routines work to clean the data by filling in
missing values, smoothing noisy data, identifying or removing
outliers, and resolving inconsistencies. If users believe the data
are dirty, they are unlikely to trust the results of any data
mining that has been applied. Furthermore, dirty data can cause
confusion for the mining procedure, resulting in unreliable output.
Although most mining routines have some procedures for dealing with
incomplete or noisy data, they are not always robust. Instead, they
may concentrate on avoiding overfitting the data to the function
being modeled. Therefore, a useful preprocessing step is to run your
data through some data cleaning routines. Section 3.2 discusses
methods for data cleaning.
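As a rough illustration of two of the cleaning tasks just named, the
following Python sketch fills in missing values and flags likely
outliers. It is not the method of Section 3.2: the function names, the
choice of the attribute mean as the fill value, and the interquartile
range (IQR) rule with k = 1.5 are all assumptions made for this
example.

```python
from statistics import mean, quantiles


def fill_missing(values):
    """Replace None entries with the mean of the observed values.

    Using the mean is an assumption of this sketch; other fill
    strategies are equally possible.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to infer from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def flag_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up data for illustration only.
ages = fill_missing([25, None, 35, 30, None, 30])
suspicious = flag_outliers([10, 11, 12, 12, 13, 11, 12, 95])
```

Whether a flagged value should be removed, corrected, or kept is a
separate decision that depends on the intended use of the data.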
Getting back to your task at AllElectronics, suppose that you
would like to include data from multiple sources in your analysis.
This would involve integrating multiple databases, data cubes, or
files (i.e., data integration). Yet some attributes representing a
given concept may have different names in different databases,
causing inconsistencies and redundancies. For example, the attribute
for customer identification may be referred to as customer id in one
data store and cust id in another. Naming inconsistencies may also
occur for attribute values. For example, the same first name could
be registered as Bill in one database, William in another, and B. in
a third. Furthermore, you suspect that some attributes may be
inferred from others (e.g., annual revenue). Having a large amount
of redundant data may slow down or confuse the knowledge discovery
process. Clearly, in addition to data cleaning, steps must be taken
to help avoid redundancies during data integration. Typically, data
cleaning and data integration are performed as a preprocessing step
when preparing data for a data warehouse. Additional data cleaning
can be performed to detect and remove redundancies that may
have resulted from data integration.
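The naming inconsistencies described above can be sketched in Python
as follows. Everything here is hypothetical: the alias tables, the
underscore spellings customer_id and cust_id, the first_name and city
attributes, and the decision to treat Bill and B. as William are
assumptions chosen only to mirror the example in the text. In practice
such mappings would have to come from metadata or domain knowledge.

```python
# Hypothetical alias tables, for illustration only.
ATTRIBUTE_ALIASES = {"cust_id": "customer_id"}
NAME_ALIASES = {"Bill": "William", "B.": "William"}


def harmonize(record):
    """Map attribute names and first-name values onto one convention."""
    out = {}
    for key, value in record.items():
        key = ATTRIBUTE_ALIASES.get(key, key)
        if key == "first_name":
            value = NAME_ALIASES.get(value, value)
        out[key] = value
    return out


def integrate(*sources):
    """Merge records from several sources, keyed on customer_id.

    Later sources overwrite earlier ones when both supply a value
    for the same attribute.
    """
    merged = {}
    for source in sources:
        for record in source:
            rec = harmonize(record)
            merged.setdefault(rec["customer_id"], {}).update(rec)
    return list(merged.values())


# Made-up records from two data stores.
store_a = [{"customer_id": 1, "first_name": "Bill"}]
store_b = [{"cust_id": 1, "first_name": "William", "city": "Vancouver"}]
customers = integrate(store_a, store_b)
```

After harmonization the two records collapse into one, which is the
kind of redundancy that integration is meant to avoid.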
"Hmmm," you wonder, as you consider your data even further. "The
data set I have selected for analysis is HUGE, which is sure to slow
down the mining process. Is there a way I can reduce the size of my
data set without jeopardizing the data mining results?" Data
reduction obtains a reduced representation of the data set that is
much smaller in volume, yet produces the same (or almost the same)
analytical results. Data reduction strategies include dimensionality
reduction and numerosity reduction.
In dimensionality reduction, data encoding schemes are applied
so as to obtain a reduced or compressed representation of the
original data. Examples include data compression techniques (e.g.,
wavelet transforms and principal components analysis), attribute
subset selection (e.g., removing irrelevant attributes), and
attribute construction (e.g., where a small set of more useful
attributes is derived from the original set).
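A very small Python sketch of attribute subset selection is given
below. It uses a deliberately crude stand-in for "irrelevant": an
attribute whose value never varies across the data set cannot help
distinguish tuples, so it is dropped. This criterion, the function
name, and the sample attributes are assumptions for illustration; the
selection methods actually used in practice are more sophisticated.

```python
def drop_constant_attributes(rows):
    """Keep only the attributes that take more than one distinct value.

    Assumes every row is a dict with the same keys and hashable values.
    """
    if not rows:
        return []
    keep = [attr for attr in rows[0]
            if len({row[attr] for row in rows}) > 1]
    return [{attr: row[attr] for attr in keep} for row in rows]


# Made-up data: "country" is identical for every tuple.
rows = [
    {"age": 25, "country": "CA", "salary": 40000},
    {"age": 40, "country": "CA", "salary": 90000},
]
reduced = drop_constant_attributes(rows)
```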
In numerosity reduction, the data are replaced by alternative,
smaller representations using parametric models (e.g., regression
or log-linear models) or nonparametric models (e.g., histograms,
clusters, sampling, or data aggregation). Data reduction is
the topic of Section 3.4.
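Two of the nonparametric options mentioned above, histograms and
sampling, can be sketched in a few lines of Python. The bucket width,
the fixed seed, and the sample prices are assumptions made for this
example, and the function names are hypothetical.

```python
import random
from collections import Counter


def equal_width_histogram(values, width):
    """Summarize values as counts per bucket, keyed by bucket lower bound."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


def sample_without_replacement(rows, n, seed=0):
    """Draw a simple random sample of n rows without replacement.

    The seed is fixed only so that the sketch is repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(rows, n)


# Made-up data for illustration only.
prices = [1, 5, 8, 12, 14, 15, 21, 25, 28, 30]
histogram = equal_width_histogram(prices, width=10)
subset = sample_without_replacement(prices, n=4)
```

Here ten values are replaced either by four bucket counts or by a
four-element sample; both are smaller representations of the same
data.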
Getting back to your data, you have decided, say, that you would
like to use a distance-based mining algorithm for your analysis,
such as neural networks, nearest-neighbor classifiers, or
clustering.1 Such methods provide better results if the data to be
analyzed have been normalized, that is, scaled to a smaller range
such as [0.0, 1.0]. Your customer data, for example, contain the
attributes age and annual salary. The annual salary attribute
usually takes much larger values than age. Therefore, if the
attributes are left unnormalized, the distance measurements taken on
annual salary will generally outweigh distance measurements taken on
age. Discretization and concept hierarchy generation can also be
useful, where raw data values for attributes are replaced by ranges
or higher conceptual levels. For example, raw values for age may be
replaced by higher-level concepts, such as youth, adult, or
senior.
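The effect of normalization on distance can be sketched in Python as
follows. The sketch uses min-max scaling to [0.0, 1.0], which is one
common way of scaling to a smaller range and is an assumption here, as
are the function name and the made-up age and salary values.

```python
import math


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so min maps to new_min and max to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # No spread to rescale: map every value to new_min.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min
            for v in values]


# Made-up customer data for illustration only.
ages = [20, 30, 60]
salaries = [30000, 50000, 110000]

norm_ages = min_max_normalize(ages)
norm_salaries = min_max_normalize(salaries)

# Euclidean distance between the first two customers,
# before and after scaling.
raw_distance = math.dist((ages[0], salaries[0]),
                         (ages[1], salaries[1]))
scaled_distance = math.dist((norm_ages[0], norm_salaries[0]),
                            (norm_ages[1], norm_salaries[1]))
```

Before scaling, the distance is almost entirely determined by the
salary difference. After scaling, age and salary contribute on the
same footing.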
1 Neural networks and nearest-neighbor classifiers are described
in Chapter 9, and clustering is discussed in Chapters 10 and 11.
Discretization and concept hierarchy generation are powerful
tools for data mining in that they allow data mining at multiple
abstraction levels. Normalization, data discretization, and concept
hierarchy generation are forms of data transformation. You soon
realize such data transformation operations are additional data
preprocessing procedures that would contribute toward the success of
the mining process. Data transformation and data discretization are
discussed in Section 3.5.
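Replacing raw age values by the higher-level concepts youth, adult,
and senior can be sketched in Python as below. The cut points 25 and
65 are hypothetical, since the text does not say where one concept
ends and the next begins, and the function name is likewise an
assumption.

```python
from bisect import bisect_right

# Hypothetical cut points: below 25 is youth, 25 to 64 is adult,
# 65 and above is senior.
AGE_CUTS = [25, 65]
AGE_LABELS = ["youth", "adult", "senior"]


def discretize_age(age):
    """Map a raw age onto one of the higher-level concepts."""
    return AGE_LABELS[bisect_right(AGE_CUTS, age)]


# Made-up ages for illustration only.
concepts = [discretize_age(a) for a in [18, 25, 40, 64, 65, 80]]
```

Mining can then be run on the concept labels instead of the raw
values, which is one way of working at a higher abstraction level.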
Figure 3.1 summarizes the data preprocessing steps described
here. Note that the previous categorization is not mutually
exclusive. For example, the removal of redundant data may be seen as
a form of data cleaning, as well as data reduction.
In summary, real-world data tend to be dirty, incomplete, and
inconsistent. Data preprocessing techniques can improve data
quality, thereby helping to improve the accuracy and efficiency of
the subsequent mining process. Data preprocessing is an important
step in the knowledge discovery process, because quality decisions
must be based on quality data. Detecting data anomalies,
rectifying them early, and reducing the data to be analyzed can lead
to huge payoffs for decision making.
Figure 3.1 Forms of data preprocessing. [Figure omitted. Panels: data
cleaning; data integration; data reduction (attributes A1, A2, A3,
..., A126 and transactions T1, T2, T3, T4, ..., T2000 reduced to
attributes A1, A3, ..., A115 and transactions T1, T4, ..., T1456);
data transformation (2, 32, 100, 59, 48 becomes 0.02, 0.32, 1.00,
0.59, 0.48).]