Andrew Smith Describing childhood diet with cluster analysis Young Statisticians’ meeting. 12th April 2011

Mar 28, 2015

Transcript
Page 1:

Andrew Smith

Describing childhood diet with cluster analysis

Young Statisticians’ meeting, 12th April 2011

Page 2:

Describing diet with cluster analysis

• Pauline M. Emmett

• P. Kirstin Newby

• Kate Northstone

• World Cancer Research Fund

• MRC, Wellcome Trust, University of Bristol

Page 3:

Outline

• Introductions

• ALSPAC

• Food frequency questionnaires

• Dietary patterns

• Cluster analysis

• k-means cluster analysis

• Results

• 3 cluster solution

• Associations with socio-demographic variables

Page 4:

ALSPAC

• Avon Longitudinal Study of Parents and Children

• Birth cohort study

• 14,541 pregnant women and their children

• www.bris.ac.uk/alspac

Page 5:

Food frequency questionnaires

Page 6:

Dietary patterns

• Examine diet as a whole

• Analyse multivariate FFQ data

• Use correlations between foods

• PCA

• Cluster analysis

Image: Paul / FreeDigitalPhotos.net

Page 7:

Cluster analysis

• Separate subjects into non-overlapping groups

• Based on ‘distances’ between individuals

• Unsupervised learning

Image: Boaz Yiftach / FreeDigitalPhotos.net

Page 8:

k-means cluster analysis

• Most widely used for dietary patterns

• Number of clusters, k, is specified beforehand

• Minimises

– Distance from each subject to his/her cluster mean

– Summed over all subjects in that cluster

– Summed over all clusters
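The quantity described above can be written out as follows. This is a sketch that assumes the "distance" is squared Euclidean distance, the usual choice for k-means; the slide does not say which distance was used. The symbols x_i (the vector of food frequencies for subject i), C_j (the set of subjects in cluster j) and \mu_j (the mean of cluster j) are notation introduced here for illustration.

```latex
\min_{C_1,\dots,C_k} \; \sum_{j=1}^{k} \sum_{i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{i \in C_j} x_i
```

The inner sum runs over the subjects in one cluster and the outer sum over the k clusters, matching the two "summed over" steps on the slide.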

Page 9:

k-means cluster analysis

Page 10:

Problems with the standard algorithm

• Short-sighted

• Tends to find solutions that are at a local minimum

– So run algorithm 100 times and choose solution that is minimum out of all minima
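The restart strategy above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: it assumes squared Euclidean distance, random starting centres drawn from the data, and NumPy; the function names `assign`, `kmeans_once` and `best_of_restarts` are hypothetical.

```python
import numpy as np

def assign(X, centres):
    """Index of the nearest centre (squared Euclidean distance) for each row of X."""
    dists = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans_once(X, k, rng, max_iter=100):
    """One run of the standard algorithm from a random start (assumed initialisation)."""
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        labels = assign(X, centres)
        # Recompute each cluster mean; keep the old centre if a cluster is empty.
        new_centres = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    labels = assign(X, centres)
    # Objective: squared distance to the cluster mean, summed over all subjects.
    objective = float(((X - centres[labels]) ** 2).sum())
    return labels, centres, objective

def best_of_restarts(X, k, n_starts=100, seed=0):
    """Run k-means n_starts times and keep the run with the smallest objective."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(X, k, rng) for _ in range(n_starts)]
    objectives = [obj for _, _, obj in runs]
    labels, centres, best = min(runs, key=lambda run: run[2])
    return labels, centres, best, objectives
```

Each run can stop at a different local minimum because the starting centres differ; keeping the smallest objective over all runs is the "minimum out of all minima" step. It is still not guaranteed to be the global minimum.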

Page 11:

Standardising the input variables
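The transcript does not record which standardisation was used, so the sketch below only illustrates two common choices, z-scores and range scaling. Both function names are hypothetical, and neither should be read as the method applied to the ALSPAC data. Standardising matters because k-means distances are otherwise dominated by the foods with the largest numeric spread.

```python
import numpy as np

def zscore(X):
    """Subtract each column's mean and divide by its sample standard deviation.

    Assumes no column is constant (a constant column would divide by zero).
    """
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def range_scale(X):
    """Subtract each column's minimum and divide by its range (max minus min).

    Assumes no column is constant (a constant column would divide by zero).
    """
    col_min = X.min(axis=0)
    return (X - col_min) / (X.max(axis=0) - col_min)
```

With either function, each food variable contributes on a comparable scale to the distance between two children.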

Page 12:

Reliability of the cluster solution

• Split sample in half

• Perform separate analyses on each half

• See how many children change clusters

• Repeat 5 times

– 32 out of 8,279 children changed cluster (0.4%)
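A split-half check along these lines could be sketched as follows. Several details are assumptions, because the slide does not spell them out: a child is counted as having "changed cluster" when the cluster from the half-sample analysis differs from the cluster in the full-sample analysis, cluster labels are matched by trying every relabelling, and the count is summed over both halves. The names `kmeans`, `n_changed` and `split_half_changes` are hypothetical, and `kmeans` is a compact stand-in for whatever clustering routine is actually used.

```python
import itertools
import numpy as np

def kmeans(X, k, seed=0, n_starts=10, max_iter=100):
    """Compact k-means with random restarts; returns the labels of the best run."""
    rng = np.random.default_rng(seed)
    best_labels, best_obj = None, np.inf
    for _start in range(n_starts):
        centres = X[rng.choice(len(X), size=k, replace=False)]
        for _step in range(max_iter):
            labels = ((X[:, None] - centres[None]) ** 2).sum(axis=2).argmin(axis=1)
            new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centres[j] for j in range(k)])
            if np.allclose(new, centres):
                break
            centres = new
        labels = ((X[:, None] - centres[None]) ** 2).sum(axis=2).argmin(axis=1)
        obj = ((X - centres[labels]) ** 2).sum()
        if obj < best_obj:
            best_labels, best_obj = labels, obj
    return best_labels

def n_changed(ref, other, k):
    """Number of disagreements between two labellings after the best relabelling.

    Cluster numbers are arbitrary, so every permutation of the labels in
    `other` is tried; this is only practical for small k.
    """
    return min(int((np.array(perm)[other] != ref).sum())
               for perm in itertools.permutations(range(k)))

def split_half_changes(X, k, n_repeats=5, seed=0):
    """For each repeat, count subjects whose half-sample cluster differs
    from their full-sample cluster (summed over the two halves)."""
    rng = np.random.default_rng(seed)
    full = kmeans(X, k, seed=seed)
    results = []
    for _repeat in range(n_repeats):
        idx = rng.permutation(len(X))
        halves = (idx[: len(X) // 2], idx[len(X) // 2:])
        changed = sum(n_changed(full[h], kmeans(X[h], k, seed=seed), k)
                      for h in halves)
        results.append(changed)
    return results
```

Dividing a count by the number of subjects gives the proportion that changed cluster, which is how a figure such as 32 out of 8,279 (0.4%) would be expressed.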

Page 13:

Processed

4,177 children

Image: Suat Eman, Rawich, Master Isolated Images / FreeDigitalPhotos.net

Page 14:

Plant-based

2,065 children

Image: Suat Eman, Paul, Rob Wiltshire, Simon Howden, winnond / FreeDigitalPhotos.net

Page 15:

Traditional British

2,037 children

Image: Suat Eman, Filomena Scalise, Maggie Smith / FreeDigitalPhotos.net

Page 16:

Associations with socio-demographic vars

            n      Processed vs Plant-based   Plant-based vs Traditional British   Traditional British vs Processed
Girls       3,115  1                          1                                    1
Boys        2,941  0.82 (0.72, 0.93)          1.03 (0.89, 1.20)                    1.18 (1.04, 1.34)

Page 17:

Associations with socio-demographic vars

Maternal age

            n      Processed vs Plant-based   Plant-based vs Traditional British   Traditional British vs Processed
< 21        130    1                          1                                    1
21-25       994    0.59 (0.33, 1.07)          1.07 (0.56, 2.05)                    1.57 (1.02, 2.43)
26-30       2,644  0.52 (0.29, 0.92)          1.20 (0.64, 2.28)                    1.60 (1.04, 2.46)
31+         2,288  0.37 (0.21, 0.67)          1.50 (0.79, 2.88)                    1.77 (1.13, 2.76)

Page 18:

Associations with socio-demographic vars

Maternal education

            n      Processed vs Plant-based   Plant-based vs Traditional British   Traditional British vs Processed
CSE         740    1                          1                                    1
Vocational  504    0.84 (0.60, 1.17)          1.19 (0.82, 1.72)                    1.01 (0.76, 1.32)
O level     2,163  0.65 (0.51, 0.83)          1.46 (1.10, 1.94)                    1.05 (0.86, 1.30)
A level     1,604  0.42 (0.33, 0.55)          2.01 (1.50, 2.69)                    1.18 (0.95, 1.48)
Degree      1,045  0.30 (0.23, 0.39)          2.75 (2.00, 3.76)                    1.22 (0.94, 1.57)

Page 19:

Associations with socio-demographic vars

Siblings

            n      Processed vs Plant-based   Plant-based vs Traditional British   Traditional British vs Processed
0 older     2,755  1                          1                                    1
1 older     2,317  1.21 (1.03, 1.42)          1.12 (0.94, 1.36)                    0.73 (0.62, 0.86)
2+ older    984    1.58 (1.28, 1.97)          0.99 (0.76, 1.27)                    0.64 (0.52, 0.80)

Page 20:

Associations with socio-demographic vars

Siblings

            n      Processed vs Plant-based   Plant-based vs Traditional British   Traditional British vs Processed
0 younger   2,946  1                          1                                    1
1 younger   2,490  1.01 (0.86, 1.19)          0.58 (0.48, 0.71)                    1.69 (1.44, 1.99)
2+ younger  620    1.21 (0.92, 1.57)          0.43 (0.33, 0.58)                    1.90 (1.50, 2.40)

Page 21:

Summary

• Multivariate methods to compress FFQ data into dietary patterns

• k-means cluster analysis is widespread but must be applied carefully

• Processed, Plant-based and Traditional British clusters in 7-year-old children

• Associated with various socio-demographic variables
