Pruning and Dynamic Scheduling of Cost-sensitive Ensembles
Wei Fan, Haixun Wang, and Philip S. Yu
IBM T.J. Watson, Hawthorne, New York
Fang Chu
UCLA, Los Angeles, CA
• Charity donation: solicit people who will donate a large amount to charity.
  - It costs $0.68 to send a letter.
  - A(x): donation amount.
  - Only solicit if A(x) > $0.68; otherwise we lose money.
• Credit card fraud detection: detect frauds with a high transaction amount.
  - It costs $90 to challenge a potential fraud.
  - A(x): fraudulent transaction amount.
  - Only challenge if A(x) > $90; otherwise we lose money.
Scalability Issues of Data Mining
• Learning algorithms:
  - non-linear complexity in the size of the dataset n.
  - memory-bound, due to the random access pattern over records in the dataset.
  - significantly slower if the dataset is not held entirely in memory.
• State of the art:
  - many scalable solutions are algorithm-specific.
  - general algorithms are not very scalable and only work for cost-insensitive problems.
  - Charity donation: solicit people who will donate a lot.
  - Credit card fraud: detect frauds with a high transaction amount.
• Our solution: a general framework for both cost-sensitive and cost-insensitive problems.
Training
[Diagram: a large dataset D is partitioned into K subsets D1, D2, ..., Dk; learning algorithms ML1, ML2, ..., MLt are applied to generate K models C1, C2, ..., Ck.]

Testing
[Diagram: the test set D is sent to the k models C1, C2, ..., Ck, which compute k predictions P1, P2, ..., Pk; these are combined into one prediction P.]
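The training/testing diagrams above can be sketched in a few lines of Python. This is a minimal illustration with invented helper names (`partition`, `train_mean_model`, `ensemble_predict`) and a trivial stand-in learner, not the paper's implementation: the dataset is split into K disjoint subsets, one model is trained per subset, and the K predictions are averaged.

```python
# Partitioned-ensemble sketch (hypothetical helper names, toy learner).

def partition(data, k):
    """Split data into k roughly equal, disjoint subsets."""
    return [data[i::k] for i in range(k)]

def train_mean_model(subset):
    """Stand-in 'learner': always predicts the mean label of its subset."""
    mean = sum(y for _, y in subset) / len(subset)
    return lambda x: mean

def ensemble_predict(models, x):
    """Combine the k individual predictions by simple averaging."""
    preds = [m(x) for m in models]
    return sum(preds) / len(preds)

# Toy dataset of (feature, label) pairs.
D = [(i, float(i % 2)) for i in range(100)]
models = [train_mean_model(s) for s in partition(D, k=4)]
p = ensemble_predict(models, x=7)   # combined prediction P
```

The key property is that each subset fits in memory, so each base learner avoids the scalability problems listed earlier.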
Cost-sensitive Decision Making
• Assume that b[i, j] records the benefit received by predicting an example of class j to be an instance of class i.
• The expected benefit received to predict an example x to be an instance of class i (regardless of its true label) is
      E(i|x) = Σ_j p(j|x) · b[i, j]
• The optimal decision-making policy chooses the label that maximizes the expected benefit, i.e.,
      L(x) = argmax_i E(i|x)
• When b[i, i] = 1 and b[i, j] = 0 for i ≠ j, this is a
traditional accuracy-based problem.
• Total benefits: the sum of benefits received over all test examples.
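The expected-benefit rule above is direct to code. A minimal sketch, with toy numbers rather than the paper's benefit matrices: `b[i][j]` is the benefit of predicting class i when the true class is j, and `p` holds the estimated class probabilities p(j|x).

```python
# Expected-benefit decision making: E(i|x) = sum_j p(j|x) * b[i][j].

def expected_benefit(b, p, i):
    """Expected benefit of predicting class i for this example."""
    return sum(p[j] * b[i][j] for j in range(len(p)))

def optimal_label(b, p):
    """Choose the label i that maximizes the expected benefit."""
    return max(range(len(b)), key=lambda i: expected_benefit(b, p, i))

# Accuracy-based special case: b[i][i] = 1 and b[i][j] = 0 for i != j,
# so maximizing expected benefit reduces to picking the most likely class.
b_acc = [[1, 0], [0, 1]]
p = [0.3, 0.7]
label = optimal_label(b_acc, p)   # -> 1, the most probable class
```

With a non-diagonal benefit matrix the same two functions implement genuinely cost-sensitive decisions.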
Charity Donation Example
• It costs $0.68 to send a solicitation.
• Assume that y(x) is the best
estimate of the donation amount.
• The cost-sensitive decision making will solicit an individual x if and only if
      p(donate|x) · y(x) > $0.68
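This decision rule is a one-liner; the sketch below assumes hypothetical estimators `p_donate` (probability of donating) and `y_hat` (estimated donation amount), which are not named in the slides.

```python
# Charity decision rule: solicit iff the expected donation exceeds
# the $0.68 mailing cost.

MAILING_COST = 0.68

def should_solicit(p_donate, y_hat):
    """Solicit iff p(donate|x) * y(x) > $0.68."""
    return p_donate * y_hat > MAILING_COST

should_solicit(0.05, 20.0)   # expected $1.00 > $0.68 -> True
should_solicit(0.05, 10.0)   # expected $0.50 <= $0.68 -> False
```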
• For decision trees: if n is the number of examples in a node and k is the number of examples with class label c, then the probability estimate is k/n.
• More sophisticated methods: smoothing, early stopping, and early stopping plus smoothing.
• For rules, the probability is calculated in the same way as for decision trees.
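The raw k/n estimate is overconfident for small nodes. As an illustration of the smoothing idea (Laplace correction is one common choice; the slides do not specify which form is used):

```python
# Leaf probability estimates for a decision-tree node.

def raw_prob(k, n):
    """Raw estimate: fraction of the node's examples with label c."""
    return k / n

def laplace_prob(k, n, num_classes=2):
    """Laplace-smoothed estimate, pulled away from 0 and 1."""
    return (k + 1) / (n + num_classes)

raw_prob(0, 2)       # 0.0  -- overconfident for a 2-example node
laplace_prob(0, 2)   # 0.25 -- smoothing tempers the extreme
```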
• Always use a greedy method to choose the next classifier.
• Criteria:
  - directly use accuracy or total benefits: choose the most accurate
  - most diversified
  - most accurate
  - combinations of the above
• Result: directly using accuracy is the best.
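The greedy strategy above can be sketched as forward selection: at each step, add the classifier that most improves the ensemble's score (accuracy or total benefits) on a validation set. `score` is a hypothetical evaluator; the demo below scores an ensemble by its members' mean accuracy, purely for illustration.

```python
# Greedy pruning sketch: pick target_size classifiers one at a time,
# always adding the one that maximizes the ensemble score.

def greedy_prune(classifiers, score, target_size):
    chosen = []
    remaining = list(classifiers)
    while remaining and len(chosen) < target_size:
        best = max(remaining, key=lambda c: score(chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Demo: classifiers represented by their individual accuracies,
# ensemble scored by mean accuracy.
pruned = greedy_prune([0.6, 0.9, 0.7],
                      score=lambda ens: sum(ens) / len(ens),
                      target_size=2)   # -> [0.9, 0.7]
```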
Pruning Results
Dynamic Scheduling
• For a fixed number of classifiers, do we need every classifier to predict on every example? Not necessarily.
• Some examples are easier to predict than others. For easier examples, we don't need as many classifiers as for more difficult ones.
• Techniques:
  - Order the classifiers according to their accuracy into a pipeline; the most accurate classifier is always called first.
  - Each prediction generates a confidence that describes the likelihood that the current prediction is the same as the prediction by the fixed number of classifiers.
  - If the confidence is too low, more classifiers are employed.
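The pipeline logic can be sketched as follows. Each stage here is a stand-in returning a (prediction, confidence) pair, where the confidence models the likelihood described above; how the paper actually computes it is not spelled out in these slides.

```python
# Dynamic-scheduling sketch: consult classifiers in accuracy order and
# stop as soon as the confidence clears a threshold.

def pipeline_predict(classifiers, x, threshold=0.9):
    """Return (averaged prediction, number of classifiers used)."""
    preds = []
    for i, clf in enumerate(classifiers, start=1):
        pred, conf = clf(x)          # each stage yields (pred, conf)
        preds.append(pred)
        if conf >= threshold:        # confident enough: exit early
            return sum(preds) / len(preds), i
    return sum(preds) / len(preds), len(classifiers)

# Demo: the second stage is already confident, so only 2 of 3
# classifiers are evaluated for this example.
stages = [lambda x: (1.0, 0.50),
          lambda x: (1.0, 0.95),
          lambda x: (0.0, 0.99)]
result, used = pipeline_predict(stages, x=None)   # -> (1.0, 2)
```

Easy examples exit after one or two stages; only hard examples pay for the full ensemble, which is the source of the speedup.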
Dynamic Scheduling
[Diagram: the test set D flows through a pipeline of classifiers. After classifier C1, each example has a (pred, conf) pair; confident examples exit as predicted examples, while the rest pass to the next stage, whose (pred, conf) output combines C1 and C2, then C1, C2, and C3, and so on.]
Dynamic Scheduling Result