
EUROPEAN COMMISSION – NTTS 2019

Conference on New Techniques and Technologies for official Statistics

A paradigm for rating data models

Domenico Piccolo

University of Naples Federico II, Naples, ITALY

[email protected]

D. Piccolo (NA Federico II), A paradigm for rating data models, Brussels, 13 March 2019

Outline

1 Introduction

2 The classical paradigm

3 A generating process for rating data

4 The class of CUB models

5 Conclusions


1. Introduction


Ordinal variables and rating data

➤ In different fields, responses aimed to express subjective evaluations with

respect to events, people, sentences, attitudes, circumstances, etc. are

collected and investigated as ordinal data:

Psychology and Behavioural sciences

Educational assessment

Medicine

Sensory sciences

Marketing and Economic analysis

Evaluation studies and Quality control

Political sciences

Linguistics

Sports

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Ordinal variables in public surveys

➤ In Europe, several surveys are regularly organized to collect information

about opinions, judgements, perceptions, trust of citizens towards Institutions,

etc. in many different fields of interest.

European Economic Survey (EES)

European Opinion polls on Safety and Health at Work (EU-OSHA)

European Working Conditions Survey (EWCS)

European Quality of Life Survey (EQLS)

European Company Survey (ECS)

Survey of Health, Ageing and Retirement in Europe (SHARE)

Eurobarometer surveys

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

➤ In all of these surveys, questionnaires include several items where

respondents are asked to select an ordinal category.


How large should the scale be?

➤ A highly controversial issue: which scale is to be preferred?

➤ As many opinions as fields of interest . . . . . .

Sufficient categories to discriminate

Not so many categories as to confuse and/or puzzle

➤ “The larger the scale, the larger the indecision . . . ” ?

➤ The “true” problem is:

are we creating or disclosing uncertainty ?


Rating data as result of an experience

➤ Since sensation is a human experience, when interviewees are asked to select a level (category) to express their personal evaluation, the response cannot be exclusively considered as a (possibly meditated) reaction to a stimulus.

➤ Ordinal scores may express:

• opinions • agreement • judgements • perceptions

• worry • concern • pain • fear • anxiety . . . .

➤ A rating response involves the respondent’s history, local and time circumstances, mood, attitudes, emotions.


Several approaches

➤ Several approaches are available for the analysis of rating data, such as log-linear and marginal models, contingency tables inference, and so on.

➤ Latent variables and IRT are among the most widespread methods to deal with this kind of data; often, specific variants have been introduced to face and solve new problems.

➤ The main term of comparison is the class of cumulative models, which have been embedded into the GLM perspective.

➤ Currently, several variants of cumulative models are available to fit real problems.


2. The classical paradigm


Cumulative models . . . . . . . . . . . . . . . . . . . . . . . . . . . [1]

➤ An underlying (continuous) latent variable $Y_i^*$ is assumed such that, for the i-th subject,

$$\alpha_{j-1} < Y_i^* \le \alpha_j \iff R_i = j, \quad j = 1, 2, \dots, m,$$

where $-\infty = \alpha_0 < \alpha_1 < \dots < \alpha_m = +\infty$ are the thresholds (cutpoints) defined on the continuous scale of the latent variable $Y^*$.

| Interval on the latent variable | Observed rating |
|---|---|
| $-\infty = \alpha_0 < Y_i^* \le \alpha_1$ | $R_i = 1$ |
| $\alpha_1 < Y_i^* \le \alpha_2$ | $R_i = 2$ |
| ... | ... |
| $\alpha_{j-1} < Y_i^* \le \alpha_j$ | $R_i = j$ |
| ... | ... |
| $\alpha_{m-1} < Y_i^* \le \alpha_m = +\infty$ | $R_i = m$ |



➤ Assume that $p \ge 1$ covariates, whose values are included in a matrix $T$, are relevant for explaining the latent regression model by means of:

$$Y_i^* = t_i \beta + \epsilon_i, \quad i = 1, 2, \dots, n,$$

where $\epsilon_i \sim F_\epsilon(\cdot)$.

➤ Then, the probability mass function of $R_i$ is:

$$\Pr(R_i = j \mid C_i) = \Pr(\alpha_{j-1} < Y_i^* \le \alpha_j) = F_\epsilon(\alpha_j - t_i\beta) - F_\epsilon(\alpha_{j-1} - t_i\beta), \quad j = 1, 2, \dots, m,$$

where $C_i = (R_i, t_i)$ is the information set characterizing the i-th subject and

$$\Pr(R_i \le j \mid \theta, t_i) = F_\epsilon(\alpha_j - t_i\beta), \quad i = 1, 2, \dots, n; \; j = 1, 2, \dots, m.$$

➤ The parameter vector $\theta = (\alpha', \beta')'$ is split into:

intercept values (cutpoints or thresholds) $\alpha = (\alpha_1, \dots, \alpha_{m-1})'$;

covariate coefficients $\beta = (\beta_1, \dots, \beta_p)'$.



➤ Some common choices for $F_\epsilon(\cdot)$ are:

Gaussian distribution → probit models

Logistic distribution → logit models

Extreme value distribution → complementary log-log models

➤ Historical reasons and symmetry considerations.

➤ The logistic link receives increasing consideration since it combines simplicity and robustness properties when referred to ordinal responses.

➤ As a consequence of proportionality properties, the standard specification of logit models is known as the proportional odds model (POM).



➤ In case of logistic random variables $\epsilon_i$, the probability that the i-th subject selects a rating $r$ turns out to be:

$$\Pr(R_i = r \mid \theta, t_i) = \frac{1}{1 + \exp(-[\alpha_r - t_i\beta])} - \frac{1}{1 + \exp(-[\alpha_{r-1} - t_i\beta])},$$

for $i = 1, 2, \dots, n$ and $r = 1, 2, \dots, m$.
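The logit formula above can be sketched numerically as follows. This is a minimal illustration, not code from the talk: the function name `pom_probabilities`, the cutpoints and the value of the linear predictor are hypothetical choices made only to show how the category probabilities arise as differences of logistic distribution functions.

```python
import math


def pom_probabilities(alphas, eta):
    """Category probabilities of a proportional odds model for one subject.

    alphas: increasing interior cutpoints (alpha_1, ..., alpha_{m-1}).
    eta:    linear predictor t_i * beta of the subject (hypothetical value).
    """
    def logistic_cdf(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Cumulative probabilities Pr(R <= j); alpha_0 = -inf gives 0, alpha_m = +inf gives 1.
    cumulative = [0.0] + [logistic_cdf(a - eta) for a in alphas] + [1.0]
    # Pr(R = j) is the difference of two consecutive cumulative probabilities.
    return [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]


# Illustrative values only: m = 4 categories, three cutpoints.
probs = pom_probabilities(alphas=[-1.0, 0.0, 1.5], eta=0.4)
print([round(p, 4) for p in probs])
```

Because the covariates enter only through the shift `eta`, the same cutpoints are shared by all subjects, which is the proportionality property mentioned above.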


A feature of cumulative models

➤ If no covariate is specified, the cumulative model is a saturated one.

➤ Thus, empirical and estimated distribution functions strictly coincide: a perfect fit.

➤ In this situation there is no statistical model but only an arithmetic equivalence.

➤ This circumstance implies that, without the inclusion of covariates, those models cannot be used per se as statistical tools.

➤ Quite often, the interpretation of these models takes advantage of odds and log-odds measures, which are quantities easily manageable by medical and biomedical researchers.
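The arithmetic equivalence can be checked directly. The sketch below assumes a logit link and a hypothetical vector of observed frequencies (not data from the talk): choosing each cutpoint as the logit of the empirical cumulative proportion reproduces the observed relative frequencies exactly, so the m − 1 cutpoints simply re-encode the m − 1 free frequencies.

```python
import math


def saturated_cutpoints(frequencies):
    """Cutpoints of a logit cumulative model without covariates.

    Each cutpoint is the logit of the empirical cumulative proportion,
    so the first and the last category must have non-zero counts.
    """
    n = sum(frequencies)
    cutpoints, running = [], 0
    for count in frequencies[:-1]:
        running += count
        share = running / n
        cutpoints.append(math.log(share / (1.0 - share)))
    return cutpoints


def implied_probabilities(cutpoints):
    """Category probabilities implied by the cutpoints (no covariates)."""
    cumulative = [0.0] + [1.0 / (1.0 + math.exp(-a)) for a in cutpoints] + [1.0]
    return [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]


observed = [10, 25, 40, 20, 5]          # hypothetical frequencies, m = 5
alphas = saturated_cutpoints(observed)
fitted = implied_probabilities(alphas)
relative = [f / sum(observed) for f in observed]
print(fitted)
print(relative)
```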


Some difficulties with the classical paradigm

Data generating process refers to a latent variable whose unobservable

distribution defines the discrete distribution for the observable ratings.

Cumulative models are not parsimonious since they require cutpoints

estimates in addition to explicit parameters for significant covariates.

It is difficult to accept that subjects’ decisions consider ratings not greater

than a fixed one, whereas it is more common to consider choices as

determined by the “stimulus” associated to a single category and its

surrounding values.

Without covariates, the classical setting leads to a saturated model,

which implies an arithmetic equivalence between observed and

assumed distributions.

Interpretation of the effect of a single covariate on the probability of a

category is neither easy nor immediate.

Graphical representation is not so immediate since the log-odds are

linear functions of covariates; in general, log-odds are not so easy to

interpret.


Cumulative models in Big Data era

➤ When a lot of ordinal data are collected for several items in repeated

occasions, times, units which are differentiated with respect to the available

information set, and the objective is to compare the behaviour/opinions of

subjects in different circumstances, statistical models for the responses which

depend on covariates are not a solution.

➤ This situation is more and more frequent at a time when a huge mass of opinions, judgements and preferences is collected by mail, Internet and social media.

➤ In such cases, it seems more effective to concentrate the modelling step on the data generating process of ordinal observations and to discriminate among the clusters on the basis of the observed distributions (data-dependent approach) or an estimated structure (model-based approach).


3. A generating process for rating data


Mixture as a data generating mechanism for rating data

➤ In different contexts, mixture models have been introduced to appropriately fit data to probability mass functions.

➤ The novel paradigm insists on the psychological process which transforms a perception into a rating score.

➤ Experimental evidence supports that rating is the result of:

a primary component, generated by the sound impression of the respondent, related to awareness and full understanding of the problem. It is called feeling (agreement) since it is usually related to the subject’s motivation;

a secondary component, generated by the intrinsic indecision about the final choice. It is called uncertainty (fuzziness), and it is mostly dependent on circumstances that surround the evaluation process.


Mixture as a data generating mechanism for rating data

➤ Both components will be explicitly modelled by discrete random variables and, in first instances, they have been proposed as (shifted) Binomial and (discrete) Uniform, respectively.

➤ Literature confirms the usefulness and effectiveness of such choices, as proved by formal arguments and vast empirical evidence.


Latent classes and mixture

➤ In the logic of latent class models, the mixture we will introduce may be interpreted as requiring two clusters of respondents:

some people assume a responsible behaviour towards the survey (= their choice follows a shifted Binomial distribution)

some others adopt a totally random criterion (= their choice follows a discrete Uniform distribution).

➤ This situation is well captured by the models we are introducing.

➤ However, we support a different interpretation of the subjective behaviour of the respondents.


The reference scheme

➤ New models interpret the probability of a rating R as a mixture of:

a personal decision, motivated by attraction/repulsion, likeness/worry,

agreeableness, agreement towards the item and measured by 1 − ξ;

an inherent indecision in the choice among the categories whose weight

is measured by 1 − π.

[Diagram: the Feeling and Uncertainty components combine into the CUB random variable.]


Definition of a CUB model

➤ A CUB model (= Combination of discrete Uniform and shifted Binomial random variables) is defined by:

1 A stochastic component:

$$\Pr(R_i = r \mid x_i, w_i) = \pi_i \underbrace{\left[\binom{m-1}{r-1} (1-\xi_i)^{r-1}\, \xi_i^{m-r}\right]}_{\text{feeling}} + (1-\pi_i) \underbrace{\left[\frac{1}{m}\right]}_{\text{uncertainty}}$$

for $r = 1, 2, \dots, m$, where $\pi_i \in (0, 1]$ and $\xi_i \in [0, 1]$, $i = 1, 2, \dots, n$.

2 Two systematic components:

$$\text{logit}(\pi_i) = \log\left(\frac{\pi_i}{1-\pi_i}\right) = x_i\beta; \qquad \text{logit}(\xi_i) = \log\left(\frac{\xi_i}{1-\xi_i}\right) = w_i\gamma;$$

$$\iff \qquad \pi_i = \frac{1}{1+e^{-x_i\beta}}; \qquad \xi_i = \frac{1}{1+e^{-w_i\gamma}};$$

where $\beta$ and $\gamma$ are the parameters to be estimated, and $x_i$ and $w_i$ are the row vectors containing the values of the covariates of the i-th subject, suitable to explain $\pi_i$ and $\xi_i$, respectively.
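The two components of the definition can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, the covariate rows and the coefficient values are hypothetical and serve only to show how the logistic links feed the mixture probabilities.

```python
import math


def inverse_logit(x):
    """Logistic link: maps a linear predictor into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def cub_pmf(m, pi, xi):
    """Probabilities Pr(R = r), r = 1..m, of a CUB model with parameters (pi, xi)."""
    pmf = []
    for r in range(1, m + 1):
        # Feeling: shifted Binomial; uncertainty: discrete Uniform.
        feeling = math.comb(m - 1, r - 1) * (1.0 - xi) ** (r - 1) * xi ** (m - r)
        pmf.append(pi * feeling + (1.0 - pi) / m)
    return pmf


def cub_pmf_with_covariates(m, x_row, beta, w_row, gamma):
    """CUB probabilities for one subject with covariate rows x_i and w_i."""
    pi_i = inverse_logit(sum(x * b for x, b in zip(x_row, beta)))
    xi_i = inverse_logit(sum(w * g for w, g in zip(w_row, gamma)))
    return cub_pmf(m, pi_i, xi_i)


# Hypothetical subject: intercept plus one covariate in each systematic component.
pmf_i = cub_pmf_with_covariates(
    m=7, x_row=[1.0, 0.5], beta=[1.2, -0.4], w_row=[1.0, 1.0], gamma=[-0.8, 0.3]
)
print([round(p, 4) for p in pmf_i])
```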


Interpretation of a CUB model

➤ Each respondent acts with a propensity to adhere to a thoughtful and to a completely uncertain choice, which is measured by $\pi_i$ and $1 - \pi_i$, respectively.

➤ In case of a rating question/item with positive wording:

$1 - \xi_i$ may be interpreted as a measure of preference towards the item.

$1 - \pi_i$ is a weight of the uncertainty included in the responses.

➤ When the item concerns a negative (reverse) wording (e.g. worry, disagreement, stress, fear, effort, pain, etc.) the interpretation of $\xi_i$ and $1 - \xi_i$ must be reversed.


Explicit link between parameters and subjects’ covariates

➤ A noticeable aspect of CUB models is the direct link between subjects’

covariates and parameters.

➤ Since 1 − ξ i is a direct measure of agreement, feeling, likeness with the item

and 1 − πi is a direct measure of the weight of the uncertainty distribution in

the mixture, it is convenient to express those links by means of:

$$\text{logit}(1 - \pi_i) = -\beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} - \dots - \beta_p x_{ip};$$

$$\text{logit}(1 - \xi_i) = -\gamma_0 - \gamma_1 w_{i1} - \gamma_2 w_{i2} - \dots - \gamma_q w_{iq}.$$

➤ These expressions allow for an immediate interpretation of the effects of the

selected covariates on the feeling and uncertainty components, respectively.

➤ Any function creating a one-to-one monotone correspondence between (−∞, ∞) and (0, 1) is a legitimate link between subjects’ covariates and parameters. It has been proved that logit is a robust link when dealing with ordinal ratings.


CUB model distribution

➤ Although the CUB model has been introduced with covariates, one may specify a CUB distribution without such a constraint:

if $\pi = \text{aver}(\pi_i)$ and $\xi = \text{aver}(\xi_i)$ are some averages of the individual parameters, the parameters $(\pi, \xi)$ can be used to compare the responses to different items;

for a given i-th subject, the features of the implied CUB model conditional on $(x_i, w_i)$ may be investigated by letting $\pi_i = \pi$ and $\xi_i = \xi$.

$$\Pr(R = r) = \pi \underbrace{\left[\binom{m-1}{r-1} (1-\xi)^{r-1}\, \xi^{m-r}\right]}_{\text{feeling}} + (1-\pi) \underbrace{\left[\frac{1}{m}\right]}_{\text{uncertainty}},$$

for $r = 1, 2, \dots, m$, where $\pi \in (0, 1]$ and $\xi \in [0, 1]$ are defined over the unit square.


CUB models are highly flexible

[Figure: sixteen CUB probability distributions (horizontal axis: ratings; vertical axis: probability, from 0.0 to 0.4). Panels and modes: Model A (mode 9), Model B (mode 8), Model C (mode 9), Model D (mode 7), Model E (mode 7), Model F (mode 4), Model G (mode 5), Model H (mode 5), Model I (mode 5), Model J (mode 1), Model K (mode 4), Model L (mode 1), Model M (mode 2), Model N (mode 2), Model O (mode 2), Model P (mode 1).]


Visualization of CUB models

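[Figure, left panel: probability distributions of three CUB models A, B and C (horizontal axis: Ratings; vertical axis: Probability, from 0.0 to 0.4). Right panel: the same models A, B and C as points in the unit square, with Uncertainty on the horizontal axis and Feeling on the vertical axis.]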


Visualization and interpretation

[Figure, left: the sixteen CUB probability distributions, Model A to Model P, with modes 9, 8, 9, 7, 7, 4, 5, 5, 5, 1, 4, 1, 2, 2, 2, 1, respectively. Right: parameter space, with 1 − π on the horizontal axis and 1 − ξ on the vertical axis (both from 0.0 to 1.0), where each model A to P is represented as a point.]


Level of satisfaction of the personal relationships . . . . . . . . . . . [1]

[Figure: observed distributions of ratings on a 0 to 10 scale (vertical axis: relative frequencies, from 0.0 to 0.4) for four relationships: Family members (average = 8.588), Friends (average = 7.863), Neighbours (average = 5.782), Colleagues-Acquaintances (average = 6.542).]



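[Figure: CUB models visualizations. Estimated CUB models as points in the parameter space, with Uncertainty (1 − π) on the horizontal axis (0.0 to 0.6) and Level of Satisfaction (1 − ξ) on the vertical axis (0.65 to 0.90); labelled points: Family members, Friends, Neighbours.]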



Job satisfaction of Italian graduates

[Figure: Dynamic CUB models with respect to Age at degree. Horizontal axis from 0.00 to 0.25; vertical axis: Satisfaction (1 − ξ), from 0.82 to 0.94. Separate curves for Women and Men, with endpoints labelled Age=22 and Age=60.]


CUB models and bimodality

[Figure, left panel: bimodal observed distribution; relative frequencies (0.00 to 0.15) of the ordinal ratings 1 to 9. Right panel: CUB distributions, given csi-covariate = 0, 1; Prob(R | D = 0) and Prob(R | D = 1), with probabilities from 0.0 to 0.3.]

➤ Figures show simulated and estimated distributions (conditional on $D_i = 0, 1$, respectively) of the shifted Binomial model ($m = 9$):

$$\Pr(R_i = j) = \binom{8}{j-1}\, \xi_i^{9-j}\, (1-\xi_i)^{j-1}; \qquad \text{logit}(\xi_i) = -1.362 + 2.744\, D_i;$$

$j = 1, 2, \dots, 9$; $i = 1, 2, \dots, n$.
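How a dichotomous covariate on the feeling parameter can generate a bimodal marginal distribution may be sketched as follows. The coefficients −1.362 and 2.744 are those reported above; the equal split between the groups D = 0 and D = 1 and the helper names are assumptions made only for this illustration.

```python
import math


def inverse_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def shifted_binomial_pmf(m, xi):
    """Pr(R = j), j = 1..m, of a shifted Binomial with feeling parameter xi."""
    return [
        math.comb(m - 1, j - 1) * xi ** (m - j) * (1.0 - xi) ** (j - 1)
        for j in range(1, m + 1)
    ]


def local_modes(pmf):
    """Ratings whose probability is strictly larger than both neighbours."""
    modes = []
    for k, p in enumerate(pmf):
        left = pmf[k - 1] if k > 0 else -1.0
        right = pmf[k + 1] if k < len(pmf) - 1 else -1.0
        if p > left and p > right:
            modes.append(k + 1)
    return modes


m = 9
xi_d0 = inverse_logit(-1.362)            # feeling parameter when D = 0
xi_d1 = inverse_logit(-1.362 + 2.744)    # feeling parameter when D = 1
pmf_d0 = shifted_binomial_pmf(m, xi_d0)
pmf_d1 = shifted_binomial_pmf(m, xi_d1)

share_d1 = 0.5  # assumed proportion of subjects with D = 1 (not given in the slides)
marginal = [(1.0 - share_d1) * p0 + share_d1 * p1 for p0, p1 in zip(pmf_d0, pmf_d1)]

print("modes given D = 0:", local_modes(pmf_d0))
print("modes given D = 1:", local_modes(pmf_d1))
print("modes of the marginal distribution:", local_modes(marginal))
```

Each conditional distribution is unimodal, while pooling the two groups produces two separate peaks; the covariate therefore explains a bimodality that a single shifted Binomial could not reproduce.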


Expectation level curves for CUB models

[Figure, left panel: CUB models with expectation E(R) = 5.5 (m = 9); two distributions, A model and B model, with r = 1, 2, ..., m on the horizontal axis and Pr(R = r) on the vertical axis (0.00 to 0.30). Right panel: level curves of CUB models for given expectation (m = 9) in the parameter space, with 1 − π on the horizontal axis and 1 − ξ on the vertical axis; curves for E(R) = 1, 2, ..., 9, with models A and B marked.]
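The level curves can be traced from the closed form of the expectation, E(R) = (m + 1)/2 + π (m − 1)(1/2 − ξ), which is the expression reported later in these slides for the CUBE model and which does not involve φ. The sketch below uses hypothetical function names and two illustrative values of π (not necessarily the A and B models of the figure) to show that different (π, ξ) pairs share the same expectation.

```python
import math


def cub_pmf(m, pi, xi):
    """Pr(R = r), r = 1..m, of a CUB model."""
    return [
        pi * math.comb(m - 1, r - 1) * (1.0 - xi) ** (r - 1) * xi ** (m - r)
        + (1.0 - pi) / m
        for r in range(1, m + 1)
    ]


def xi_on_level_curve(m, pi, target):
    """Value of xi such that a CUB model with weight pi has E(R) = target."""
    xi = 0.5 - (target - (m + 1) / 2) / (pi * (m - 1))
    if not 0.0 <= xi <= 1.0:
        raise ValueError("no CUB model with this pi attains the target expectation")
    return xi


m, target = 9, 5.5
means = []
for pi in (0.9, 0.3):                      # illustrative uncertainty levels
    xi = xi_on_level_curve(m, pi, target)
    pmf = cub_pmf(m, pi, xi)
    mean = sum(r * p for r, p in zip(range(1, m + 1), pmf))
    means.append(mean)
    print(f"pi = {pi:.1f}, xi = {xi:.4f}, E(R) = {mean:.4f}")
```

Models on the same level curve have the same mean rating but different shapes, which is why the expectation alone is not sufficient to compare rating distributions.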


Omitting uncertainty causes expectation bias

➤ A possible uncertainty/heterogeneity in the specification of Binomial-type

models causes a bias in the estimation of the feeling (and average)

parameters.

➤ Although uncertainty is just a proportional displacement in the probability

distribution, the bias of the location parameter is proportional to the weight of

the uncertainty component and it decreases for almost symmetric

distributions.

➤ Since a priori researchers do not know the size of uncertainty, a convenient

strategy is: let data speak for themselves.

➤ This strategy is automatically accomplished by CUB models.
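A short derivation, given here as a sketch, makes the size of the bias explicit. It compares the expectation of a pure shifted Binomial with that of a CUB model having the same feeling parameter ξ; the symbols $R_{SB}$ and $R_{CUB}$ are introduced only for this comparison.

$$
\begin{aligned}
E(R_{SB}) &= 1 + (m-1)(1-\xi), \\
E(R_{CUB}) &= \pi\,[\,1 + (m-1)(1-\xi)\,] + (1-\pi)\,\frac{m+1}{2} = \frac{m+1}{2} + \pi\,(m-1)\left(\frac{1}{2} - \xi\right), \\
E(R_{SB}) - E(R_{CUB}) &= (1-\pi)\,(m-1)\left(\frac{1}{2} - \xi\right).
\end{aligned}
$$

The difference is proportional to the uncertainty weight $1 - \pi$ and vanishes when $\xi = 1/2$, that is, for a symmetric distribution, in agreement with the statements above.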


Origin of uncertainty

➤ Uncertainty is not the stochastic component related to the sampling experiment (so that different people generate different ratings).

➤ Uncertainty is the result of possible convergent and related factors:

Limited set of information, Knowledge/Ignorance of properties and/or

characteristics of the object/item to be evaluated.

Personal interest/Engagement in activities related to the specific or

related field of interest.

Amount of time devoted to the response.

Operational mode for responding: face-to-face, questionnaire form,

telephone, mobile, PC, mail, Email, etc.

Nature of the scale in terms of range and wording.

Tiredness or fatigue for a correct comprehension of the wording.

Willingness to joke and fake.

Lack of self-confidence of the respondent.

Laziness/Apathy/Boredom in the selection mechanism.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


What really is uncertainty?

➤ The measure of uncertainty conveyed by 1− πi includes at least three

points of view:

1 subjective indecision: when we examine 1 − πi , it is possible to consider it

as a measure of personal indecision of the i-th respondent as a function

of selected covariates.

2 heterogeneity: when we analyse a global CUB model for the given item, it

is possible to consider 1 − π as a measure of heterogeneity of the

respondents.

3 predictability: if we study a CUB model to predict ordinal outcomes, it is

possible to consider π as a direct measure of predictability of the model

with respect to two extremes:

minimum → responses follow a (pure) discrete Uniform distribution

maximum → responses follow a (pure) Binomial distribution


Comparison of POM and CUB models

➤ A comparison between POM and CUB models requires at least one explanatory variable, given that POM are saturated when applied without covariates.

➤ To make the comparison possible, a dichotomous covariate ($D_i = 0$ for $i = 1, \dots, 500$; $D_i = 1$ for $i = 501, \dots, 1000$) has been defined.

➤ Here, with $m = 7$, the results for data with high heterogeneity are presented.


Highly heterogeneous data

[Figure: relative frequencies (0.00 to 0.25) of the ratings 1 to 7 for the highly heterogeneous data.]


BIC comparison for POM and CUB models

[Figure, left panel: BIC values (range 3800 to 3920) of the CUB and POM models. Right panel: BIC of the POM plotted against BIC of the CUB model, both on the range 3800 to 3920.]


4. The family of CUB models


Using and applying CUB models

➤ After their introduction (Piccolo, 2003), to cope with different real situations, CUB models have been generalized in several directions.

➤ Family of CUB models.

➤ The program to perform estimation and testing of CUB models, originally coded in the GAUSS language, has been implemented in the R environment.

➤ Currently, the package CUB, latest version 1.1.2, is freely available on the CRAN web repository.

➤ Further programs for users of STATA and GRETL are forthcoming.


Family of CUB models

➤ Variants of univariate distributions:

CUB models with both subjects’ and objects’ covariates
Hierarchical and random effects CUB models (HCUB and RCUB)
CUB models with a shelter effect
Generalized CUB models (GeCUB)
CUSH models
Latent Class CUB models (LC-CUB)
CUB models with “don’t know” option (DK-CUB)
CUB model with MIMIC structure (CUB-MIMIC)
CUB time series model (CUB-TS)

➤ Variants of the probability distributions of components:

CUBE models without and with covariates
IHG models without and with covariates
CUB models with varying uncertainty (VCUB)
CAUB models
Non-linear CUB models
CUP models

➤ Joint modelling of items:

Multi-objects modelling approach
Multivariate CUB models via latent variables
Multivariate CUB models via copula functions
Multivariate mixtures (SCUB and CUSCUB)


CUB models with shelter effect

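[Diagram: a two-stage tree. With probability δ the response is the Shelter Choice; with probability 1 − δ it follows a CUB model, which in turn is a Shifted Binomial with probability π and a Discrete Uniform with probability 1 − π.]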


Definition of a GeCUB model

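➤ A GeCUB model with p covariates for uncertainty, q covariates for feeling and s covariates for shelter effect is specified by:

$$\Pr(R = r \mid \theta^*) = (1-\delta_i)\left[\pi_i\, b_r(\xi_i) + (1-\pi_i)\, U_r\right] + \delta_i\, D_r^{(c)},$$

where

$$\pi_i = \frac{1}{1+e^{-x_i\beta}}; \qquad \xi_i = \frac{1}{1+e^{-w_i\gamma}}; \qquad \delta_i = \frac{1}{1+e^{-z_i\omega}};$$

for $i = 1, 2, \dots, n$, and $x_i$, $w_i$ and $z_i$ are the subjects’ covariates for explaining $\pi_i$, $\xi_i$, and $\delta_i$, respectively. Here $b_r(\xi_i)$ is the shifted Binomial probability, $U_r = 1/m$ is the discrete Uniform probability, and $D_r^{(c)}$ is a degenerate distribution with unit mass at the shelter category $c$.

➤ These rows are included in $T$, an $n \times (k+1)$ matrix of the $k$ observed covariates related to the $n$ subjects.

➤ The columns of the $X$, $W$ and $Z$ matrices may be the same, partially coincide or completely differ.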
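The GeCUB probabilities can be sketched as a two-stage mixture. The function names, the covariate row and the coefficient values below are hypothetical; the point is only to show that the shelter weight δ adds probability mass at the category c while rescaling the underlying CUB distribution.

```python
import math


def inverse_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def gecub_pmf(m, pi, xi, delta, c):
    """Pr(R = r), r = 1..m, of a CUB model with shelter effect at category c."""
    pmf = []
    for r in range(1, m + 1):
        binomial = math.comb(m - 1, r - 1) * (1.0 - xi) ** (r - 1) * xi ** (m - r)
        cub = pi * binomial + (1.0 - pi) / m
        shelter = 1.0 if r == c else 0.0      # degenerate distribution at c
        pmf.append((1.0 - delta) * cub + delta * shelter)
    return pmf


# Hypothetical subject: the shelter weight depends on one covariate z_i1.
z_row, omega = [1.0, 2.0], [-3.0, 0.4]
delta_i = inverse_logit(sum(z * w for z, w in zip(z_row, omega)))

with_shelter = gecub_pmf(m=7, pi=0.8, xi=0.3, delta=delta_i, c=4)
without_shelter = gecub_pmf(m=7, pi=0.8, xi=0.3, delta=0.0, c=4)
print(round(delta_i, 4))
print([round(p, 4) for p in with_shelter])
```

Setting `delta=0.0` returns the plain CUB distribution, so the CUB model is the special case without shelter effect.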



[Figure (repeated): observed distributions of ratings on a 0 to 10 scale for Family members (average = 8.588), Friends (average = 7.863), Neighbours (average = 5.782) and Colleagues-Acquaintances (average = 6.542).]


Test for possible shelter effect at R = 0

➤ We test the significance of a possible shelter effect at the first category (that is, $R = 0$), by letting $c = 0$ in the model:

$$\Pr(R = r \mid \theta) = (1-\delta)\left[\pi\, b_r(\xi) + (1-\pi)\frac{1}{m}\right] + \delta\, D_r^{(c)}, \quad r = 1, 2, \dots, m$$

| Relationship with | log-lik(CUB) | log-lik(CUB+shelter) | $\hat\delta$ | p-value |
|---|---|---|---|---|
| Family members | −1972.1 | −1972.1 | 0.000 | 1.00000 |
| Friends | −2166.1 | −2164.2 | 0.006 | 0.068437 |
| Neighbours | −2605.7 | −2585.3 | 0.049 | $3 \times 10^{-8}$ |
| Colleagues-Acquaintances | −2343.4 | −2337.3 | 0.019 | $< 3 \times 10^{-12}$ |


Comparison of CUB and CUB +shelter models

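[Figure: CUB and CUB+shelter models visualizations. Estimated models as points in the parameter space, with the horizontal axis from 0.0 to 0.6 and Level of Satisfaction (1 − ξ) on the vertical axis (0.65 to 0.90); labelled points: Family members, Friends, Neighbours.]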



Motivations for overdispersion in rating data

➤ Overdispersion may be generated by variability among individual feelings. Personal characteristics and different response styles strongly support this claim.

➤ A Binomial random variable implies a very strong constraint between variance and mean value.

➤ Thus, a Beta-Binomial distribution has been introduced for the feeling component according to the data generating process of ordinal data.

➤ The specification of the new model is designed to preserve the parametric structure of CUB models.


CUBE model with covariates . . . . . . . . . . . . . . . . . . . . . . [3]

➤ A CUBE model with covariates is defined by:

$$\Pr(R = r_i) = \pi_i\, \beta e(\xi_i, \phi_i) + (1-\pi_i)\frac{1}{m};$$

$$\pi_i = \frac{1}{1+e^{-x_i\beta}}; \qquad \xi_i = \frac{1}{1+e^{-w_i\gamma}}; \qquad \phi_i = e^{z_i\alpha};$$

for $i = 1, 2, \dots, n$ and where the Beta-Binomial distribution is:

$$\beta e(\xi_i, \phi_i) = \binom{m-1}{r_i-1} \frac{\prod_{k=1}^{r_i}\left[1-\xi_i+\phi_i(k-1)\right]\; \prod_{k=1}^{m-r_i+1}\left[\xi_i+\phi_i(k-1)\right]}{\left[1-\xi_i+\phi_i(r_i-1)\right]\left[\xi_i+\phi_i(m-r_i)\right]\; \prod_{k=1}^{m-1}\left[1+\phi_i(k-1)\right]}.$$

➤ If $\phi_i \to 0$ the (shifted) Beta-Binomial tends to the (shifted) Binomial distribution.

➤ Thus, CUB models are nested into CUBE models.
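The nesting can be checked numerically. In the sketch below the two bracketed factors of the denominator are cancelled against the last factor of each product in the numerator, so the products run up to r − 1 and m − r; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def shifted_beta_binomial(r, m, xi, phi):
    """Pr(feeling = r) for the shifted Beta-Binomial with parameters (xi, phi)."""
    numerator = 1.0
    for k in range(1, r):                 # k = 1, ..., r - 1
        numerator *= 1.0 - xi + phi * (k - 1)
    for k in range(1, m - r + 1):         # k = 1, ..., m - r
        numerator *= xi + phi * (k - 1)
    denominator = 1.0
    for k in range(1, m):                 # k = 1, ..., m - 1
        denominator *= 1.0 + phi * (k - 1)
    return math.comb(m - 1, r - 1) * numerator / denominator


def cube_pmf(m, pi, xi, phi):
    """Pr(R = r), r = 1..m, of a CUBE model."""
    return [
        pi * shifted_beta_binomial(r, m, xi, phi) + (1.0 - pi) / m
        for r in range(1, m + 1)
    ]


def cub_pmf(m, pi, xi):
    """Pr(R = r), r = 1..m, of a CUB model."""
    return [
        pi * math.comb(m - 1, r - 1) * (1.0 - xi) ** (r - 1) * xi ** (m - r)
        + (1.0 - pi) / m
        for r in range(1, m + 1)
    ]


cube = cube_pmf(m=9, pi=0.9, xi=0.4, phi=0.2)
cube_without_overdispersion = cube_pmf(m=9, pi=0.9, xi=0.4, phi=0.0)
cub = cub_pmf(m=9, pi=0.9, xi=0.4)
cube_mean = sum(r * p for r, p in zip(range(1, 10), cube))
print([round(p, 4) for p in cube])
print(round(cube_mean, 4))
```

With `phi=0.0` the CUBE probabilities coincide with the CUB ones, and for any φ the mean is unchanged, since overdispersion affects the spread of the feeling component but not its location.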


The overdispersion effect

➤ The expectation and the variance of a CUBE model are:

$$E(R) = \frac{m+1}{2} + \pi\,(m-1)\left(\frac{1}{2} - \xi\right);$$

$$\text{Var}(R) = \text{Var}(Y) + \phi_m(\theta),$$

where $\text{Var}(Y)$ is the variance of a CUB model with the same $(\pi, \xi)$ parameters of the CUBE specification.

➤ The overdispersion effect is:

$$\phi_m(\theta) = \pi\, \xi\,(1-\xi)\,(m-1)\,(m-2)\, \frac{\phi}{1+\phi}.$$
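The overdispersion term can be recovered with a short argument, sketched here under the standard variance formulas for the Binomial and Beta-Binomial distributions with $m - 1$ trials. The symbols $F_B$, $F_{BB}$, $U$ and $\mu$ are introduced only for this derivation and denote the Binomial feeling component, the Beta-Binomial feeling component, the discrete Uniform component and the respective means.

$$
\begin{aligned}
\text{Var}(F_B) &= (m-1)\,\xi\,(1-\xi), \\
\text{Var}(F_{BB}) &= (m-1)\,\xi\,(1-\xi)\left[1 + (m-2)\,\frac{\phi}{1+\phi}\right], \\
\text{Var}(R) &= \pi\,\text{Var}(F) + (1-\pi)\,\text{Var}(U) + \pi\,(1-\pi)\,(\mu_F - \mu_U)^2 .
\end{aligned}
$$

Since $F_B$ and $F_{BB}$ have the same mean, moving from CUB to CUBE changes only the first term of the mixture variance, by

$$\pi\left[\text{Var}(F_{BB}) - \text{Var}(F_B)\right] = \pi\,\xi\,(1-\xi)\,(m-1)\,(m-2)\,\frac{\phi}{1+\phi} = \phi_m(\theta).$$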


CUB and CUBE models: a simulated comparison

➤ The frequency vector n = (28, 54, 88, 120, 148, 164, 163, 142, 93)′ has been generated by a sample of n = 1000 ratings from a known CUBE model with m = 9 and parameters π = 0.9, ξ = 0.4, φ = 0.2, respectively.

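| Model | $\hat\pi$ | $\hat\xi$ | $\hat\phi$ | Log-lik | BIC |
|---|---|---|---|---|---|
| True | 0.900 | 0.400 | 0.200 | | |
| CUB | 0.453 | 0.354 | | −2120.6 | 4255.1 |
| CUBE | 0.891 | 0.399 | 0.197 | −2099.1 | 4218.8 |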

[Figure: observed relative frequencies (0.00 to 0.20) of the ratings, with the fitted CUB model and CUBE model superimposed.]
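The BIC values in the comparison can be recomputed from the reported log-likelihoods. The sketch assumes the usual definition BIC = −2 log-lik + k log n, with k = 2 parameters for the CUB model and k = 3 for the CUBE model; small discrepancies with the reported values are to be expected because the log-likelihoods are rounded to one decimal.

```python
import math


def bic(log_likelihood, n_parameters, n_observations):
    """Bayesian Information Criterion: -2 log-lik + k log(n)."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_observations)


n = 1000
bic_cub = bic(-2120.6, n_parameters=2, n_observations=n)    # (pi, xi)
bic_cube = bic(-2099.1, n_parameters=3, n_observations=n)   # (pi, xi, phi)
print(round(bic_cub, 1), round(bic_cube, 1))
```

The CUBE model has the smaller BIC despite the extra parameter, consistent with data generated with overdispersion.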


Stochastic mechanism of ordinal choices

Feeling (Attractiveness, Satisfaction, Awareness, . . . ): $Y_i \sim F_Y(\cdot\,; \gamma, T_m)$

Uncertainty (Indecision, Fuzziness, Blurriness, . . . ): $V_i \sim F_V(\cdot)$

Ordinal choice: $R_i \sim F_R(\cdot\,; \theta)$


An inclusive perspective

➤ A GEneralized Mixture model with uncertainty (GEM) is defined as follows:

$$\Pr(R_i = j \mid \theta) = \pi_i \Pr\left(Y_i = j \mid t_i^{(\gamma)}, \Psi\right) + (1-\pi_i) \Pr(V_i = j),$$

for $i = 1, \dots, n$ and $j = 1, \dots, m$, where $\pi_i = \pi(t_i^{(\beta)}, \beta) \in (0, 1]$ are introduced to weight the two components and $t_i^{(\gamma)} \in T^{(\gamma)}$ and $t_i^{(\beta)} \in T^{(\pi)}$ include the values of the selected covariates for the i-th subject.

➤ The probability distribution of the feeling component $Y_i$ is

$\Pr\left(Y_i = j \mid \gamma, t_i^{(\gamma)}\right)$, if specified via a discrete distribution;

$F_{Y_i^*}(\tau_j; \gamma, t_i^{(\gamma)}) - F_{Y_i^*}(\tau_{j-1}; \gamma, t_i^{(\gamma)})$, if specified via a latent variable distribution;

where $F_{Y_i^*}(\tau_j; \gamma, t_i^{(\gamma)}) = \Pr\left(Y_i^* \le \tau_j \mid \gamma, t_i^{(\gamma)}\right)$ is the distribution function of the latent variable $Y_i^*$.


Some models encompassed by the GEneralized Mixture

| Typology of discretization | DGP | Models | Variants of models |
|---|---|---|---|
| Class I (no cutpoints), supervised discretization | Discrete random variables | IHG | |
| | | SB | |
| | | CUB | VCUB, HCUB, LC-CUB |
| | | CUBE | VCUBE |
| | | CUB+shelter | GeCUB |
| | | CUSH | CUB-DK |
| Class II (known cutpoints), supervised discretization | Continuous variables | CUN | |
| | | D-Beta | |
| Class III (estimable cutpoints), unsupervised discretization | Latent continuous variables | CUMULATIVE | Logit, Probit, C-log-log |
| | | CUP | (idem) |


5. Conclusions


Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . [1]

➤ We are not working with a single model, a collection of models, or a variant of existing models.

➤ Indeed, we are proposing and implementing a whole framework (that is, a “paradigm”) based on the generating process of rating data.

➤ This process includes covariates if and when their effects are significant to explain respondents’ behaviour.


Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . [2]

➤ In fact, although some of them are useful, “all models are substantially wrong”, and this aphorism is equally valid for a novel paradigm.

➤ The substantive problem is to establish the starting point for further advances in order to achieve better models, which in turn should always be improved.

➤ As statisticians, we must be aware of the role and importance of uncertainty in human decisions, and CUB models may be considered as building blocks of more complex statistical specifications.

➤ Above all, CUB models act as a benchmark for more refined analyses.


Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . [3]

➤ Probably, the time is not yet ripe for a paradigm shift.

➤ Nevertheless,

a comprehensive family of models with appealing interpretation and

parsimony features;

a number of published papers supporting the new approach in different

fields;

an increasing diffusion of models which include uncertainty with a

prominent role;

the availability of free software which effectively performs inferential

procedures and graphical analysis,

are convergent signals that the prospective paradigm is slowly emerging.


Essential references

• Piccolo, D. (2003). On the moments of a mixture of uniform and shifted

binomial random variables. Quaderni di Statistica, 5, 85–104.

• D’Elia, A. and Piccolo, D. (2005). A mixture model for preference data

analysis. Computational Statistics & Data Analysis, 49, 917–934.

• . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

• Piccolo, D. (2018). A new paradigm for rating data models. Proceedings of the XLIX Statistical Meeting of the Italian Statistical Society, in: Abbruzzo, A., Brentari, E., Chiodi, M. and Piacentino, D. (eds.), Book of Short Papers SIS 2018, Pearson, ISBN 9788891910233, pp. 19–30.

• Piccolo, D., Simone, R. and Iannario, M. (2018). Cumulative and CUB models for rating data: a comparative analysis. International Statistical Review, first published 01 October 2018, https://doi.org/10.1111/insr.12282.


Thank you for your attention!!!

