Page 1
KDD-98: A Comparison of Leading Data Mining Tools
John F. Elder IV & Dean W. Abbott
Elder Research
Fourth International Conference on Knowledge Discovery & Data Mining
Friday, August 28, 1998
New York, New York
Page 2
updated October 19, 1998 © 1998 Elder Research T8-2
Copyright © 1998, John F. Elder IV and Dean W. Abbott
All rights reserved
Manufactured in the United States of America
Page 3
Contacting Elder Research
Dr. John F. Elder IV
1006 Wildmere Place
Charlottesville, VA 22901
[email protected]
804-973-7673
Fax: 804-995-0064
Dean W. Abbott
3443 Villanova Avenue
San Diego, CA 92122-2310
[email protected]
619-450-0313
http://www.datamininglab.com
Page 4
Tutorial Goals
• Compare and Summarize Data Mining Tools which:
– Offer multiple modeling and classification algorithms
– Support project stages surrounding model construction
– Stand alone
– Are general-purpose
– Cost a lot
– We could get our hands on
• Include some (focused) Desktop Tools
Other Reports: Two Crows, Aberdeen Group, Elder Research (forthcoming), Data Mining Journal
Page 5
Topics
• Products covered
• Review of algorithms
• Comparative tables of properties
• Screen shots exemplifying qualities
• Summary of distinctives
Page 6
Caveats
• We don’t know every tool well (and are sure to have missed some!)
– Level of exposure noted for each tool
• Our background (biasing our perspective)
– Very technical, “early adopters”
– Emphasize solving real-world applications
– More classification than estimation
• Field of tools is quite dynamic
– New versions appear regularly
Page 7
Data Mining Products
[Figure: product logos; legible labels include PRW and Model 1]
Page 8
Tools Evaluated
Product | Company | URL | Version Tested | Our Experience
Clementine | Integral Solutions, Ltd. | http://www.isl.co.uk/clem.html | 4 | Moderate
Darwin | Thinking Machines, Corp. | http://www.think.com/html/products/products.htm | 3.0.1 | Moderate
DataCruncher | DataMind | http://www.datamindcorp.com | 2.1.1 | High
Enterprise Miner | SAS Institute | http://www.sas.com/software/components/miner.html | Beta | Moderate
GainSmarts | Urban Science | http://www.urbanscience.com/main/gainpage.htm | 4.0.3 | Low
Intelligent Miner | IBM | http://www.software.ibm.com/data/iminer/ | 2 | Low
MineSet | Silicon Graphics, Inc. | http://www.sgi.com/Products/software/MineSet/ | 2.5 | Low
Model 1 | Group 1/Unica Technologies | http://www.unica-usa.com/model1.htm | 3.1 | Moderate
ModelQuest | AbTech Corp. | http://www.abtech.com | 1 | Moderate
PRW | Unica Technologies, Inc. | http://www.unica-usa.com/prodinfo.htm | 2.1 | High
CART | Salford Systems | http://www.salford-systems.com | 3.5 | Moderate
NeuroShell | Ward Systems Group, Inc. | http://www.wardsystems.com/neuroshe.htm | 3 | Moderate
OLPARS | PAR Government Systems | mailto://[email protected] | 8.1 | High
Scenario | Cognos | http://www.cognos.com/busintell/products/index.html | 2 | Moderate
See5 | RuleQuest Research | http://www.rulequest.com/see5-info.html | 1.07 | Moderate
S-Plus | MathSoft | http://www.mathsoft.com/splus/ | 4 | High
WizWhy | WizSoft | http://www.wizsoft.com/why.html | 1.1 | Moderate
Page 9
Categories for Comparisons
• Platforms Supported
• Algorithms Included
– Decision Trees
– Neural Networks
– Other
• Data Input and Model Output Options
• Usability Ratings
• Visualization Capabilities
• Modeling Automation Methods
Page 10
Platforms
Columns (left to right): PC Standalone (95/NT); Unix Standalone; Unix Server / PC Client; NT Server / PC Client; Database Connectivity
[Table: marks listed per tool in column order; empty cells not shown]
Clementine: √ √+ √
Darwin: √ √
DataCruncher: √ √ √
Enterprise Miner: √ √+ √ √
GainSmarts: √ √ √
Intelligent Miner: √ √
MineSet: √ √
Model 1: √ √ √ √
ModelQuest: √ √ √
PRW: √ √
CART: √ √+
Scenario: √ √
NeuroShell: √
OLPARS: √ √
See5: √ √+
S-Plus: √ √−
WizWhy: √
Key
blank: no capability
√−: some capability
√: good capability
√+: excellent capability
Page 11
Tool Groupings
Desktop
• PC (standalone)
• Flat Files
• One or Two Algorithms
• Data Fits into RAM
High End
• Multiple Platforms, Client-Server
• Flat Files or Direct Database Access
• Multiple Algorithm Types
• Large Databases
Page 12
End User Perspectives
Business
• Intuitive Interface
– Clear steps in data mining process
– Non-technical terminology
– Familiar environment
• Descriptive Reporting
– Domain terminology
– Graphical representations
Technical
• Algorithm Options
– Knobs to enhance model performance
• Model Automation
– Simplify model design cycle
– Documentation of steps used in generating models (repeatability)
Page 13
OLPARS: Interface
Page 14
MineSet: Interface
Page 15
PRW: Experiment Manager
Page 16
Intelligent Miner: “Explorer” Interface
Page 17
Enterprise Miner: Visual Interface
Page 18
Clementine: Visual Interface
Page 19
DataCruncher: Process Flow Diagram
Page 20
Data Input & Model Output
Columns (left to right): Automatic Header; Save Data Format; ODBC; Native Database Drivers; Summary Reports; Output Source Code
[Table: marks listed per tool in column order; empty cells not shown]
Clementine: √ √ √
Darwin: √ √ √
DataCruncher: √ √ √ √ √
Enterprise Miner: √− √ √ √− √
GainSmarts: √ √ √ √ √
Intelligent Miner: √− √
MineSet: √ √
Model 1: √ √ √ √ √ √
ModelQuest: √ √ √ √
PRW: √ √ √ √ √
CART: √
Scenario: √ √
NeuroShell: √
OLPARS: √
See5: √−
S-Plus: √ √ √ √
WizWhy: √ √
Page 21
PRW: Row Splitting
Page 22
Model1: Variable Selection
Page 23
Model1: Data Definitions
Page 24
Scenario : Special Data Types
Page 25
Decision Trees
[Figure: example decision tree with yes/no splits on a > 4, b > 2, a > 1, and b > 3.5, ending in leaves labeled 0 to 3]
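A decision tree of this kind is a cascade of threshold tests on the inputs that ends in a leaf value. The branch layout of the slide's figure is not recoverable here, so the arrangement below is a hypothetical sketch: only the tests (a > 4, b > 2, a > 1, b > 3.5) and the set of leaf labels are taken from the slide, and the function name `classify` is an assumption.

```python
def classify(a: float, b: float) -> int:
    """Walk a small, hypothetical tree of threshold tests down to a leaf label."""
    if a > 4:            # root split (assumed to be the first test)
        if b > 3.5:
            return 2
        return 0
    # a <= 4 from here on
    if b > 2:
        return 3
    if a > 1:
        return 1
    return 2


if __name__ == "__main__":
    for point in [(5, 4), (5, 1), (3, 3), (2, 1), (0, 0)]:
        print(point, "->", classify(*point))
```

Each path from the root to a leaf reads as a rule, which is one reason trees are easy to interpret.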
Page 26
Polynomial Networks
[Figure: polynomial network with Layers 0 to 3. Normalizers N1, N4, N5, N6, N8, N9 map inputs a, d, e, f, h, k to z1, z4, z5, z6, z8, z9; nodes Single 14, Multi-Linear 15, Double 16, Triple 17, Double 19, Double 20, and Triple 21 produce z14, z15, z16, z17, z19, z20, z21; unitizers U2 and U7 give outputs Y1 and Y2.]
Example node: $z_{17} = 3.1 + 0.4a - 0.15b^2 + 0.9bc - 0.62abc + 0.5c^3$
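Each node in a polynomial network is a low-order polynomial of a few inputs. As a minimal sketch, the example node equation on this slide (a cubic in three inputs a, b, c) can be evaluated directly; the function name `z17` is chosen here for illustration, and the coefficients are the ones shown on the slide.

```python
def z17(a: float, b: float, c: float) -> float:
    """Evaluate the slide's example node:
    3.1 + 0.4a - 0.15b^2 + 0.9bc - 0.62abc + 0.5c^3
    """
    return (3.1
            + 0.4 * a
            - 0.15 * b ** 2
            + 0.9 * b * c
            - 0.62 * a * b * c
            + 0.5 * c ** 3)


if __name__ == "__main__":
    print(z17(1.0, 1.0, 1.0))
```

In a full network, outputs such as this one become inputs to nodes in later layers.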
Page 27
“Consensus” Models
Parametrically Summarize Data Points
Regression
Polynomial Networks (e.g., GMDH, ASPN)
Decision Trees (e.g., CART, CHAID, C5)
Logistic or Sigmoidal Networks (ANNs)
Hinging Hyperplanes, MARS
orders, terms
Page 28
“Consensus” Models (continued)
Histogram
Radial Basis Function
Wavelets
orientation, bin width
family, order
function
Page 29
“Contributory” Models
Retain data points; each potentially affects estimate at new point
Kernels
k-Nearest Neighbor
Delaunay Planes
shape, spread
k, distance metric
Projection Pursuit Regression
Spread, index
Goal, iterations
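To make the "contributory" idea concrete, here is a minimal k-nearest-neighbor sketch in which the stored training points themselves produce the estimate at a new point. The two knobs match the slide's "k, distance metric"; the Euclidean metric, the majority-vote rule, and all names are assumptions chosen for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    train: list of (features, label) pairs, where features is a tuple of floats.
    Distance is Euclidean (an assumed choice; other metrics could be used).
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    data = [((0, 0), "a"), ((0, 1), "a"),
            ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b")]
    print(knn_predict(data, (0.2, 0.1), k=3))
```

No parameters are fit in advance: the whole training set is retained and consulted at prediction time, which is why such methods tend to scale poorly to large databases.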
Page 30
Properties of Algorithms
Algorithm | Accurate | Scalable | Interpretable | Useable | Robust | Versatile | Fast | Hot
Classical (LR, LDA) | – | C | C– | C | – | – | C | D
Neural Networks | C | D | D | D | – | D | DD | C
Visualization | C | DD | C | C | CC | D | DDD | C–
Decision Trees | D | C | C | C– | C | C | C– | C–
Polynomial Networks | C | – | D | C– | –D | – | –D | –
K-Nearest Neighbors | D | DD | C– | – | –D | D | C | D
Kernels | C | DD | D | –D | D | D | C | D
Key
C: good
–: neutral
D: bad
Page 31
Algorithms
Columns (left to right): Decision Trees; Linear/Statistical; Multi-layer Perceptrons; Nearest Neighbor; Radial Basis Functions; Bayes; Rule Induction; Polynomial Networks; Generalized Linear Models; Time Series; Sequential Discovery; K Means; Association Rules; Kohonen
[Table: marks listed per tool in column order; empty cells not shown]
Clementine: √ √ √ √ √ √ √
Darwin: √ √ √
Datamind: √
Enterprise Miner: √ √ √ √ √ √ √ √
GainSmarts: √ √+
Intelligent Miner: √ √− √ √− √ √ √+ √
MineSet: √ √ √ √
Model 1: √+ √ √ √
ModelQuest: √ √ √ √ √−
PRW: √+ √ √ √ √ √
CART: √
Cognos: √
NeuroShell: √+ √ √−
OLPARS: √ √ √ √ √ √ √
See5: √ √
S-Plus: √ √+ √ √ √
WizWhy: √
Page 32
Multi-Layer Perceptrons
Columns (left to right): Learning Rate; Learning Rate Decay; Momentum; Multiple Activation Functions; Multiple Stop Criteria; Cross-Validation; Normalize Inputs; Advanced Learning Alg.; Other Cost Functions; Automatic Model Selection; Network Visual; Parameter Summary
[Table: marks listed per tool in column order; empty cells not shown]
Clementine: √ √ √ √
Darwin: √ √ √ √ √ √
Enterprise Miner: √ √ √ √ √ √ √ √ √ √
Intelligent Miner: √ √
Model 1: √ √ √ √ √ √ √
PRW: √ √ √ √ √ √ √ √
NeuroShell: √ √ √ √ √
OLPARS: √ √ √ √ √ √ √
Page 33
Enterprise Miner: Neural Network Training
Page 34
PRW: Neural Network Training
Page 35
Decision Trees
Columns (left to right): "CART"; C5 or C4.5; CHAID; Other; Priors; Classification Costs; Missing Data; Pruning Severity; Visual Trees
[Table: marks listed per tool in column order; empty cells not shown]
Clementine: √ √ √ √ √−
Darwin: √ √ √ √
Enterprise Miner: √ √− √ √+ √ √ √ √
GainSmarts: √ √ √ √ √
Intelligent Miner: √ √ √
MineSet: √ √ √ √ √ √
Model 1: √ √ √−
ModelQuest: √− √ √
CART: √+ √ √ √ √
Scenario: √ √
S-Plus: √ √ √ √
See5: √+ √ √ √
Page 36
CART: Tree Browsing
Page 37
Scenario: Tree Browsing
Page 38
Enterprise Miner: Tree Browsing
Page 39
Enterprise Miner: Tree Results
Page 40
MineSet: Tree Browser
Page 41
Regression / Stats
Columns (left to right): Linear; Logistic; Complexity Penalty; Cross-Validation; Input Selection; Factor Analysis
[Table: marks listed per tool in column order; empty cells not shown]
Clementine: √
Enterprise Miner: √+ √+ √ √ √ √
GainSmarts: √+ √+ √
Intelligent Miner: √− √ √
MineSet: √
Model 1: √ √ √ √+
ModelQuest Enterprise: √ √ √ √ √
PRW: √ √ √ √+
S-Plus: √ √+ √ √ √ √
S-Plus: √+ √+ √ √ √ √
Scenario: √
Page 42
MineSet: Bayes Distributions
Page 43
Enterprise Miner: Clustering Results
Page 44
Intelligent Miner: Clustering Results
Page 45
Intelligent Miner: Cluster Explode
Page 46
Usability
Tool | Data Loading and Manipulation | Model Building | Model Understanding | Technical Support | Overall
Clementine | √+ | √+ | √+ | √+ | √+
Darwin | √ | √ | √+ | √ | √
DataCruncher | √+ | √+ | √ | √ | √
Enterprise Miner | √ | √ | √ | √ | √
GainSmarts | √+ | √ | √ | √ | √
Intelligent Miner | √ | √ | √ | √ | √
MineSet | √ | √+ | √+ | √ | √+
Model 1 | √+ | √+ | √+ | √+ | √+
ModelQuest Enterprise | √ | √+ | √+ | √+ | √+
PRW | √+ | √+ | √+ | √+ | √+
CART | √− | √ | √ | √ | √
Scenario | √ | √+ | √+ | √ | √+
NeuroShell | √ | √ | √ | √ | √
OLPARS | √− | √ | √ | √ | √
See5 | √ | √ | √ | √ | √
S-Plus | √ | √ | √+ | √ | √
WizWhy | √ | √ | √+ | √ | √
Page 47
Intelligent Miner: Statistics Report
Page 48
Scenario: Result Reporting
Page 49
WizWhy: Reporting
Page 50
DataCruncher: Output Sensitivities
Page 51
Darwin: Predictions
Page 52
Darwin: Lift Chart (Excel)
Page 53
CART: Gains Chart
Page 54
Visualization
Columns (left to right): Histograms; Pie Charts; Scatter/Line Plots; Rotating Scatter; Conditional Plots; Classification Decision Regions; Correlation Plots
[Table: marks listed per tool in column order; empty cells not shown]
Clementine: √ √ √ √− √
Darwin: √− √− √−
DataCruncher: √ √ √ √
Enterprise Miner: √ √ √ √− √ √
GainSmarts: √− √−
Intelligent Miner: √ √ √ √
MineSet: √ √ √ √ √
Model 1: √ √ √
ModelQuest Enterprise: √ √
PRW: √ √ √
CART:
Scenario: √
NeuroShell: √
OLPARS: √ √ √ √− √ √
See5: √
S-Plus: √ √ √ √ √
WizWhy:
Page 55
Clementine: Visualization
User-created sub-region
Page 56
OLPARS: Visualization
Page 57
OLPARS: Decision Space
Page 58
Model 1: Target Sensitivities
Page 59
MineSet: Geographical Visualization
Page 60
Automation
Tool | Method of Automation | Free Text Annotation of Steps
Clementine | Visual Programming, Programming Language | √
Darwin | Programming Language | √
DataCruncher | (Task manager) |
Enterprise Miner | Visual Programming, Programming Language | √
GainSmarts | Macro Language, Wizards | √−
Intelligent Miner | (Wizards) |
MineSet | Data History, Log |
Model 1 | Model Wizard |
ModelQuest | Batch Agenda |
PRW | Experiment Manager; Macros | √
CART | Built-in Basic Scripting |
Scenario | |
NeuroShell | |
OLPARS | |
See5 | |
S-Plus | Scripting (S); C/C++ |
WizWhy | |
Page 61
PRW: Experiment Manager
Page 62
Model 1: Model Summary
Page 63
A Recent Breakthrough: Bundling
1) Construct varied models, and
2) Combine their estimates
Generate component models by varying:
• Case Weights
• Data Values
• Guiding Parameters
• Variable Subsets
Combine estimates using:
• Estimator Weights
• Voting
• Advisor Perceptrons
• Partitions of Design Space
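The two bundling steps can be sketched with one concrete pairing from the lists on this slide: component models are varied by resampling the cases (bootstrap sampling, which in effect changes the case weights), and their estimates are combined by voting. The one-variable threshold "stump" used as the component model, the toy data, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(rows):
    """Fit a one-variable threshold model: predict 1 when x > threshold.

    rows: list of (x, y) with y in {0, 1}. Returns the threshold with the
    fewest training errors (the first such threshold in ascending order).
    """
    best_t, best_err = None, None
    for t in sorted({x for x, _ in rows}):
        err = sum((x > t) != bool(y) for x, y in rows)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def build_bundle(rows, n_models=25, seed=0):
    """Step 1: construct varied models, each fit to a bootstrap resample."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_models)]


def bundle_predict(thresholds, x):
    """Step 2: combine the component estimates by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    data = [(x, int(x >= 5)) for x in range(10)]   # toy data: label is 1 for x >= 5
    bundle = build_bundle(data)
    print(bundle_predict(bundle, 0), bundle_predict(bundle, 9))
```

An odd number of component models avoids tied votes in the two-class case.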
Page 64
Example Bundling Techniques
• Bayes: sum estimates of possible models, weighted by priors
• GMDH (Ivakhnenko 68) -- multiple layers of quadratic polynomials, using two inputs each, fit by LR
• Stacking (Wolpert 92) -- train a 2nd-level (LR) model using leave-1-out estimates of 1st-level (neural net) models
• Bagging (Breiman 96) (bootstrap aggregating) -- bootstrap data (to build trees mostly); take majority vote or average
• Bumping (Tibshirani 97) -- bootstrap, select single best
• Boosting (Freund & Schapire 96) -- weight error cases by $\beta_t = (1 - e(t))/e(t)$, iteratively re-model; weight model t by $\ln(\beta_t)$
• Crumpling (Anderson & Elder 98) -- average cross-validations
• Born-Again (Breiman 98) -- invent new X data...
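The reweighting in the Boosting bullet can be illustrated with a single round. This is a sketch under stated assumptions, not a full boosting implementation: e(t) is taken to be the weighted error rate of model t, the case weights are renormalized to sum to 1 after each round (the slide does not say so), and the function name is hypothetical.

```python
import math


def boosting_round(weights, is_error):
    """Apply one boosting reweighting step.

    weights:  current case weights, assumed to sum to 1.
    is_error: one flag per case, True where model t misclassified the case.
    Returns (new_weights, model_weight), where model_weight = ln(beta_t).
    """
    e = sum(w for w, wrong in zip(weights, is_error) if wrong)  # weighted error e(t)
    if not 0 < e < 0.5:
        raise ValueError("sketch assumes a weighted error strictly between 0 and 0.5")
    beta = (1 - e) / e                      # beta_t = (1 - e(t)) / e(t)
    raw = [w * beta if wrong else w for w, wrong in zip(weights, is_error)]
    total = sum(raw)                        # renormalize (an assumption, see lead-in)
    return [w / total for w in raw], math.log(beta)


if __name__ == "__main__":
    new_w, alpha = boosting_round([0.25] * 4, [True, False, False, False])
    print(new_w, alpha)
```

After the step, the misclassified cases carry half of the total weight, so the next model is pushed toward the cases the current one got wrong.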
Page 65
Distinctives
Tool | Strengths | Weaknesses
Clementine | visual interface; algorithm breadth | scalability
Darwin | efficient client-server; intuitive interface options | no unsupervised; limited visualization
DataCruncher | ease of use | single algorithm
Enterprise Miner | depth of algorithms; visual interface | harder to use; new product issues
GainSmarts | data transformations, built on SAS; algorithm option depth | no unsupervised; limited visualization
Intelligent Miner | algorithm breadth; graphical tree/cluster output | few algorithm options; no automation
MineSet | data visualization | few algorithms; no model export
Model 1 | ease of use; automated model discovery | really a vertical tool
ModelQuest | breadth of algorithms | some non-intuitive interface options
PRW | extensive algorithms; automated model selection | limited visualization
CART | depth of tree options | difficult file I/O; limited visualization
Scenario | ease of use | narrow analysis path
NeuroShell | multiple neural network architectures | unorthodox interface; only neural networks
OLPARS | multiple statistical algorithms; class-based visualization | dated interface; difficult file I/O
See5 | depth of tree options | limited visualization; few data options
S-Plus | depth of algorithms; visualization; programmable/extendable | limited inductive methods; steep learning curve
WizWhy | ease of use; ease of model understanding | limited visualization
Page 66
Closing Observations
• Data Mining Tools Can:
– Enhance inference process
– Speed up design cycle
• Data Mining Tools Cannot:
– Substitute for statistical and domain expertise
• Users are advised to:
– Get training on tools
– Be alert for product upgrades
Page 67
Index to Tools
(Page numbers marked * refer to screen captures)
Clementine: 7, 8, 10, 18*, 20, 31, 32, 35, 41, 46, 54, 55*, 60, 65
Darwin: 7, 8, 10, 20, 31, 32, 35, 46, 51*, 52*, 54, 60, 65
DataCruncher: 7, 8, 10, 19*, 20, 31, 46, 50*, 54, 60, 65
Enterprise Miner: 7, 8, 10, 17*, 20, 31, 32, 33*, 35, 38*, 39*, 41, 43*, 46, 54, 60, 65
GainSmarts: 7, 8, 10, 20, 31, 35, 41, 46, 54, 60, 65
Intelligent Miner: 7, 8, 10, 16*, 20, 31, 32, 35, 41, 44*, 45*, 46, 47*, 54, 60, 65
MineSet: 7, 8, 10, 14*, 20, 31, 35, 40*, 41, 42*, 46, 54, 59*, 60, 65
Model 1: 7, 8, 10, 20, 22*, 23*, 31, 32, 35, 41, 46, 54, 58*, 60, 62*, 65
ModelQuest: 7, 8, 10, 20, 31, 35, 41, 46, 54, 60, 65
PRW: 7, 8, 10, 15*, 20, 21*, 31, 32, 34*, 41, 46, 54, 60, 61*, 65
CART: 7, 8, 10, 20, 31, 35, 36*, 46, 53*, 54, 60, 65
NeuroShell: 7, 8, 10, 20, 31, 32, 46, 54, 60, 65
pcOLPARS: 7, 8, 10, 13*, 20, 31, 32, 46, 54, 56*, 57*, 60, 65
Scenario: 7, 8, 10, 20, 24*, 31, 35, 37*, 41, 46, 48*, 54, 60, 65
See5: 7, 8, 10, 20, 31, 35, 46, 54, 60, 65
S-Plus: 7, 8, 10, 20, 31, 35, 41, 46, 54, 60, 65
WizWhy: 7, 8, 10, 20, 31, 46, 49*, 54, 60, 65
Page 68
Forthcoming Report
• Report provides detailed comparison of high-end data mining tools, including capabilities, ease of use, and practical tips.
• Available for $695 from Elder Research (http://www.datamininglab.com), Q4 1998.
• Purchasers receive a brief free consulting session to explore report findings in more detail, if desired.
Note: The analyses and reviews were performed completely independently, and were made possible by the cooperation of the vendors, for which Elder Research is very grateful. The companies, however, provided no financial support, and had no influence on its editorial content.