Instance Construction via Likelihood-Based Data Squashing
Madigan D., et al. (Ch. 12, Instance Selection and Construction for Data Mining (2001), Kluwer Academic Publishers)
Summarized by: Jinsan Yang, SNU Biointelligence Lab
Abstract
Data compression method: squashing
Some computational challenges
- Algorithms need multiple passes over the data
- Access to such data is 10^5 to 10^6 times slower than main memory
- Current solution: scale up existing algorithms
- Here: scale down the data
Data squashing: 750000 → 8443 data points (DuMouchel et al. (1999)); outperforms a random sample of size 7543 by a factor of 500 in MSE
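LDS Algorithm
Motivation: Bayes' rule
Given three data points $d_1, d_2, d_3$, estimate the parameter $\theta$:
$$p(\theta \mid d_1, d_2, d_3) \propto p(d_1 \mid \theta)\, p(d_2 \mid \theta)\, p(d_3 \mid \theta)\, p(\theta)$$
If $p(d_1 \mid \theta) \approx p(d_2 \mid \theta)$, squash $d_1, d_2$ by $d^*$ with
$$p(d^* \mid \theta)^2 = p(d_1 \mid \theta)\, p(d_2 \mid \theta)$$
Clusters by likelihood profile:
$$\big(p(d_i \mid \theta_1), \ldots, p(d_i \mid \theta_k)\big)$$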
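As a small check of the squashing idea, the sketch below assumes a Gaussian likelihood with unit variance and a flat prior on a grid of parameter values. Neither choice comes from the slides; they are illustrative assumptions, as are the function names and data values. Under this model, replacing d1 and d2 by their mean d* counted twice leaves the normalized posterior unchanged, because p(d1|θ) p(d2|θ) and p(d*|θ)^2 differ only by a factor that does not depend on θ.

```python
import math

def log_lik(d, theta):
    """Log-likelihood of one point d under N(theta, 1) (assumed model)."""
    return -0.5 * (d - theta) ** 2 - 0.5 * math.log(2 * math.pi)

def grid_posterior(points, weights, theta_grid):
    """Normalized posterior over theta_grid under a flat prior (assumption).

    Each point contributes weight * log-likelihood, so a pseudo point
    with weight 2 stands in for two original points.
    """
    log_post = [
        sum(w * log_lik(d, t) for d, w in zip(points, weights))
        for t in theta_grid
    ]
    peak = max(log_post)  # subtract the peak for numerical stability
    post = [math.exp(lp - peak) for lp in log_post]
    total = sum(post)
    return [p / total for p in post]

theta_grid = [-5.0 + 0.01 * i for i in range(1001)]
d1, d2, d3 = 0.9, 1.1, 3.0  # hypothetical data; d1 and d2 are close

# Posterior from the three original points.
full = grid_posterior([d1, d2, d3], [1, 1, 1], theta_grid)

# Squash d1 and d2 into one pseudo point d* carrying weight 2.
d_star = 0.5 * (d1 + d2)
squashed = grid_posterior([d_star, d3], [2, 1], theta_grid)

max_abs_diff = max(abs(a - b) for a, b in zip(full, squashed))
print(max_abs_diff)
```

For models where no exact pseudo point exists, the two posteriors would only agree approximately, which is why the likelihood profiles are compared at several parameter values.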
LDS Algorithm: Details
[Select] Values of $\theta$ by a central composite design
[Figure: central composite design for 3 factors]
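A central composite design for k factors combines the 2^k factorial corners, 2k axial points, and one center point, giving 15 points for 3 factors. The sketch below generates these in coded units and then maps them to parameter values. The slides do not say how the design is centered or scaled, so `theta_hat`, `scale`, the rotatable choice of `alpha`, and the function names are all assumptions made for illustration.

```python
from itertools import product

def central_composite(k, alpha=None):
    """Coded design points: 2^k corners, 2k axial points, 1 center."""
    if alpha is None:
        # Rotatable axial distance (number of corners)^(1/4); an assumed choice.
        alpha = (2 ** k) ** 0.25
    corners = [tuple(float(x) for x in p) for p in product((-1, 1), repeat=k)]
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(tuple(point))
    center = [tuple([0.0] * k)]
    return corners + axial + center

def to_theta(design, theta_hat, scale):
    """Map coded points to parameter values around a center estimate."""
    return [
        tuple(c + s * x for c, s, x in zip(theta_hat, scale, pt))
        for pt in design
    ]

design = central_composite(3)
# Hypothetical center and per-coordinate scale for a 3-parameter model.
thetas = to_theta(design, theta_hat=(0.5, -1.0, 2.0), scale=(0.1, 0.2, 0.3))
print(len(design))
```

Each row of `thetas` is one parameter vector at which a likelihood profile could be evaluated.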
[Profile] Evaluate the likelihood profiles
[Cluster] Cluster the mother data in a single pass
- Select n' random samples as initial cluster centers