Jan 31, 2016
Predicting Mail-Order Repeat Buying: Which Variables Matter?
Group 2: 王祥義, 謝宜君
Agenda
• Abstract
• Introduction
• Research Questions
• RFM Variables
• Non-RFM Variables
• Methodology
• Data
• Empirical Findings
• Conclusion
Abstract
• Customer-oriented conceptual model of segmentation variables for mail-order repeat buying behavior.
• Two aims:
1) determine, from a theoretical perspective, which customer-related variables should be included in response models;
2) empirically validate how these variables perform for predictive purposes.
• Traditionally, three variables (RFM) are used. Which additional variables can be included?
Introduction
• The success of a database-driven (mail-order) marketing campaign mainly depends on the customer list to which it is targeted.
• Response modeling for database marketing is concerned with the task of modeling the customers’ purchasing behavior.
1. Direct-Mail Patronage Behavior
• A Conceptual Model of Segmentation Variables
Independent variable
Dependent variable
Within a fixed time interval
Overview of variables
• Company-specific, behavioral: Recency; Frequency; Monetary value; Length of relationship; Type/category of product; Source of customer; Customer/company interaction
• Company-specific, non-behavioral: Customer satisfaction
• Non-company-specific, behavioral: General mail-order buying behavior
• Non-company-specific, non-behavioral: Benefit segmentation; Socio-demographics
Non-company-specific variables generally have to be purchased from an external vendor.
Behavioral variables usually correlate more strongly with future purchase behavior.
2. Research Questions
• This study focuses on the issue of what variables to include in predicting repeat purchase behavior by mail-order.
• RQ1a & RQ1b focus on the traditional RFM variables.
• RQ2 addresses the issue of including other predictors in the response model.
RQ1a
• Addresses the question of how good a model performance can be achieved with RFM variables.
What is the total performance of the combined use of the three RFM variables in predicting repurchase behavior?
RQ1b
• The relative importance of the three components has never been thoroughly investigated.
• “Frequency” is the most important.
What is the relative importance of recency, frequency, and monetary value in predicting repurchase behavior?
RQ2
• Several variables have been added to RFM variables in specific implementations, but have never been systematically investigated.
How much predictive power do additional, i.e., non-RFM, variables offer in modeling mail-order repeat purchasing?
RFM variables
• Recency: Recency has been found to be inversely related to the probability of the next purchase.
• Frequency: Heavier buyers show greater loyalty, as measured by their repurchase probabilities.
• Monetary value: The volume of purchases a consumer makes with a particular mail-order company is a measure of usage, which has been an important behavioral segmentation variable in several studies.
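As a rough illustration of how the three RFM variables might be derived from an order history, the sketch below computes them per customer. The record layout `(customer_id, purchase_date, amount)`, the function name `rfm`, and the toy orders are all hypothetical; the study's exact operationalization of each variable may differ.

```python
from datetime import date

def rfm(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    transactions: iterable of (customer_id, purchase_date, amount).
    Recency is the number of days since the last purchase, frequency is
    the number of purchases, monetary is the total amount spent.
    These definitions are assumptions made for illustration.
    """
    summary = {}
    for customer_id, purchase_date, amount in transactions:
        last, count, total = summary.get(customer_id, (None, 0, 0.0))
        if last is None or purchase_date > last:
            last = purchase_date
        summary[customer_id] = (last, count + 1, total + amount)
    return {
        cid: ((as_of - last).days, count, total)
        for cid, (last, count, total) in summary.items()
    }

# Hypothetical toy order history
orders = [
    ("A", date(2015, 1, 10), 40.0),
    ("A", date(2015, 6, 1), 25.0),
    ("B", date(2014, 11, 20), 90.0),
]
scores = rfm(orders, as_of=date(2015, 7, 1))
```

Here customer A bought more recently and more often than customer B, while B spent more in total.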
Non-RFM variables
Classified along two dimensions:
1) company-specific or not
2) behavioral or non-behavioral
Company & Behavioral
Length of the relationship
Type/category of product
Source of the customer
Customer/company interaction
Company & Behavioral
1) Length of relationship (Social psychology / Economics / OB): The duration of a relationship may have predictive power with regard to the continuation of the relationship.
2) Type/Category of Product: Kestnbaum suggests replacing RFM with the new acronym FRAC (frequency, recency, amount, category of product).
Company & Behavioral
3) Source of the Customer
- Member introduces member
- Child from a member parent
- Internal mailing lists
- Rented mailing lists
- Spontaneous requests
4) Customer/Company Interaction
Contact information includes several different types: (1) information inquiries, (2) orders (purchasing), (3) complaints (post-purchase).
Higher probability of repurchase
Complaint management is a key element.
Company & Non-behavioral
Customer Satisfaction
• When applied to direct marketing, we can state that the probability of repeat behavior will increase if the total buying experience meets or exceeds the expectations of the consumer with respect to the performance.
• Purchasing behavior was positively reinforced by tracking customer satisfaction.
Non-company & Behavioral
General mail-order buying behavior
• When a person has only recently become a customer of a particular mail-order company, knowledge about the customer’s general mail-order buying behavior may be valuable in predicting future purchasing behavior.
Non-company & Non-behavioral
• Benefit segmentation
- The benefits people seek in products are the basic reasons for heterogeneity in their choice behavior. Therefore, benefits are relevant bases for segmentation.
- Other studies have shown that benefit segments are identifiable and substantial, and differ in brand purchase behavior.
- Convenience, credit line
• Socio-demographics: background, e.g., age, education, occupation, salary
Methodology
* To address RQ1a, RQ1b, and RQ2:
• Specific modeling technique for purchase incidence modeling
• Model structure & the level of parameterization
• Evaluation criteria → to assess “improvement” in predictive accuracy
• Procedure for variable introduction
Methodology
The binary logit model is used to approximate the probability of a repeat purchase.
Whereby:
P_i represents the a posteriori probability of a repeat purchase for customer i
X_ij represents independent variable j for customer i
b_j represents the parameters (to be estimated)
n represents the number of independent variables
* Purchasing or not is a binary decision problem (two-class classification); the estimated probability lies between 0 and 1.
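For reference, the symbols defined on this slide fit the standard binary logit form written out below. The intercept b_0 is an assumption here, since the slide lists only the parameters b_j.

```latex
P_i = \frac{1}{1 + \exp\left(-\left(b_0 + \sum_{j=1}^{n} b_j X_{ij}\right)\right)}
```

Whatever values the parameters take, the right-hand side stays strictly between 0 and 1, which is why it can be read as a purchase probability.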
Methodology
Evaluation Criteria
• Percentage correctly classified (PCC, i.e., accuracy) at the ‘economically optimal’ cutoff purchase probability
• Area under the receiver operating characteristic curve (AUC)
* Classification: customers are ranked by purchase likelihood, from buyer A (most likely) to buyer N (least likely). Customers above the cutoff value are classified as buyers, those below it as non-buyers.
Methodology
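(Confusion matrix)
• Accuracy
• Sensitivity
• Specificity
• Classification accuracy
Columns: correct, incorrect. Rows: predicted buyer (accuracy), predicted non-buyer (accuracy).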
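The confusion-matrix measures named on this slide can be computed from predicted probabilities and actual outcomes once a cutoff is chosen. The sketch below is a minimal illustration; the function name `classification_metrics` and the toy data are hypothetical and not taken from the study.

```python
def classification_metrics(probabilities, actual, cutoff):
    """Classify each customer as buyer if probability >= cutoff, then
    return accuracy (PCC), sensitivity, and specificity.

    Assumes the data contain at least one buyer and one non-buyer.
    """
    tp = fp = tn = fn = 0
    for p, y in zip(probabilities, actual):
        predicted_buyer = p >= cutoff
        if predicted_buyer and y:
            tp += 1          # buyer correctly predicted as buyer
        elif predicted_buyer and not y:
            fp += 1          # non-buyer wrongly predicted as buyer
        elif y:
            fn += 1          # buyer wrongly predicted as non-buyer
        else:
            tn += 1          # non-buyer correctly predicted as non-buyer
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical toy data: six customers, three of whom actually bought
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
bought = [1, 1, 0, 1, 0, 0]
metrics = classification_metrics(probs, bought, cutoff=0.5)
```

Changing the cutoff changes all three measures, which is why PCC depends on the chosen cutoff.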
Methodology
Cutoff value (threshold value) = minimal probability of purchase
When the objective is to maximize total profits, the optimal decision rule is to mail up until the point where the incremental revenue derived from the mailing equals the incremental cost incurred by sending this additional mailing.
• Cost: e.g., mailing cost, catalog production cost
• Disadvantages: cost and revenue are estimated values; heterogeneity is hidden in the averages
$5   $10
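The decision rule above implies a break-even purchase probability: mailing pays off when expected revenue covers the mailing cost. The sketch below uses a $5 cost per mailing and a $10 revenue per response purely as illustrative assumptions (the slide shows the two amounts without labels); the function names are hypothetical.

```python
def breakeven_cutoff(cost_per_mailing, revenue_per_response):
    """Probability at which expected revenue equals the mailing cost.

    From p * revenue = cost, the cutoff is cost / revenue.
    """
    if revenue_per_response <= 0:
        raise ValueError("revenue_per_response must be positive")
    return cost_per_mailing / revenue_per_response

def should_mail(purchase_probability, cost_per_mailing, revenue_per_response):
    """Mail only if the expected revenue covers the cost of the mailing."""
    return purchase_probability * revenue_per_response >= cost_per_mailing

# Assumed illustrative values: $5 cost, $10 revenue
cutoff = breakeven_cutoff(5.0, 10.0)
```

Under these assumed values the cutoff is 0.5, so a customer with an estimated purchase probability of 0.6 would be mailed and one with 0.4 would not.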
ROC (Receiver Operating Characteristic) Curve
[Figure: ROC plot. x-axis: False positive Rate (1-Specificity, false-alarm probability), 0.0 to 1.0; y-axis: True positive Rate (Sensitivity, hit percentage), 0.0 to 1.0.]
Methodology
AUC as a measure of accuracy: the larger the value, the better.
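One way to see what AUC measures: it equals the share of (buyer, non-buyer) pairs in which the buyer receives the higher score, with ties counted as half. The sketch below computes it by direct pair counting, which is simple but quadratic in the number of customers; the function name `auc` and the toy data are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of buyers
    (label truthy) against non-buyers (label falsy)."""
    buyers = [s for s, y in zip(scores, labels) if y]
    non_buyers = [s for s, y in zip(scores, labels) if not y]
    if not buyers or not non_buyers:
        raise ValueError("need at least one buyer and one non-buyer")
    wins = 0.0
    for b in buyers:
        for n in non_buyers:
            if b > n:
                wins += 1.0      # buyer ranked above non-buyer
            elif b == n:
                wins += 0.5      # tie counts as half
    return wins / (len(buyers) * len(non_buyers))

# Hypothetical toy data
toy_auc = auc([0.9, 0.8, 0.6, 0.4, 0.3, 0.1], [1, 1, 0, 1, 0, 0])
```

A model that scores every customer identically gets 0.5 under this definition, and a model that ranks every buyer above every non-buyer gets 1.0.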
Data
Figure 2: Summary of data sources
Internal data from mail-order company
• Past purchase behavior
– when purchased
– what quantity
– which product
– what price
Questionnaire data from households
• Benefit segmentation variable
• Customer satisfaction
• General mail-order purchasing
Both sources feed the database marketing data warehouse for response modeling.
Empirical Findings
RQ1a: performance of RFM in predicting
AUC performance: $\frac{0.754 - 0.5}{1 - 0.5} = 50.8\%$
PCC performance: $\frac{0.649 - 0.529}{1 - 0.529} = 25.5\%$
[Figure: ROC plot, True positive Rate (Sensitivity) against False positive Rate (1-Specificity), showing the null model (AUC = 0.5) and the perfect model (AUC = 1.0), with the remaining room for improvement marked.]
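The percentages on this slide rescale each score so that the null model sits at 0% and the perfect model at 100%. A minimal sketch of that arithmetic follows, using the null-model values that appear on the slide (0.5 for AUC, 0.529 for PCC); the function name `relative_improvement` is hypothetical.

```python
def relative_improvement(score, null_score, perfect_score=1.0):
    """Share of the gap between the null model and the perfect model
    that a given model closes."""
    return (score - null_score) / (perfect_score - null_score)

auc_gain = relative_improvement(0.754, 0.5)      # RFM model, AUC
pcc_gain = relative_improvement(0.649, 0.529)    # RFM model, PCC
```

Multiplying by 100 and rounding to one decimal gives the 50.8% and 25.5% reported for the RFM model.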
Empirical Findings
RQ1b: relative importance of RFM in predicting

<Single Predictor>
Type             AUC     PCC (accuracy)
Recency          0.625   0.417
Frequency        0.743   0.678   (most important)
Monetary value   0.708   0.592

<Multiple Predictors>
Num. of var.   R, F, or M   AUC     PCC (accuracy)
1              F            0.743   0.678
2              F & M        0.753   0.675
3              R, F, & M    0.754   0.650

→ F (1st); M (2nd); R (3rd)
→ Not sensitive once F is included
Empirical Findings
RQ2: How much predictive power do additional non-RFM variables offer?
• Financial convenience: credit usage
• Length of relationship: log (number of days)
• General mail-order buying behavior: frequency

Num. of var.   List of var.                          AUC     PCC
3              RFM                                   0.754   0.650
4              Best RFM & Credit                     0.764   0.687
5              Best RFM, Credit, & Length.           0.768   0.690
6              Best RFM, Credit, Length., & Gen.     0.769   0.688
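The slides do not spell out the procedure for variable introduction. One common choice, assumed here purely for illustration, is greedy forward selection: at each step, add the candidate variable that yields the best score for the enlarged model. In the sketch below, `score` stands for any evaluation function (for example, AUC on a test sample after refitting the model); the toy scorer and its weights are hypothetical and are not the study's results.

```python
def forward_selection(candidates, score):
    """Greedy forward selection.

    candidates: list of variable names.
    score: function taking a list of variable names and returning a
           number where higher is better (an assumption of this sketch).
    Returns a list of (selected_variables, score) after each step.
    """
    selected, history = [], []
    remaining = list(candidates)
    while remaining:
        # Pick the variable whose addition gives the best score
        best = max(remaining, key=lambda v: score(selected + [v]))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), score(selected)))
    return history

# Hypothetical toy scorer: each variable adds a fixed amount
TOY_GAIN = {"frequency": 3, "monetary": 2, "recency": 1}

def toy_score(variables):
    return sum(TOY_GAIN[v] for v in variables)

steps = forward_selection(["recency", "frequency", "monetary"], toy_score)
```

With a real scorer the gains are not additive, so each step requires refitting the model for every remaining candidate.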
Empirical Findings
[Figure: Cumulative AUC performance of predictor models. x-axis: Number of Variables in Response Model (1 to 6), added in the order Frequency, Monetary value, Recency, Credit, Length., Gen.; y-axis: AUC on Test Sample (0.73 to 0.775). AUC is 0.754 (+50.8%) with three variables and 0.769 (+53.8%) with six; all variables differ < 0.10.]
Conclusion
• The importance of RFM
• Adding more variables improves predictive performance
• Cutoff value is important
• Different industries may call for different variables
THANK YOU !