

BIOSTATS 640 Exam 3 – Spring 2021 Name ________________________________________________

Z:\bigelow\...\2021\...\BIOSTATS 640 Exam 3 2021.docx Page 1 of 23

BIOSTATS 640 Intermediate Biostatistics Spring 2021

Examination III

Units 6, 7, & 8 – Analysis of Variance, Logistic Regression & Introduction to Survival Analysis

Due: Friday May 7, 2021
Last date for submission for credit (-10 points): Monday May 10, 2021

Before you begin: This is a “take-home” exam. You are welcome to use any reference materials you wish, and you are welcome to use the computer as you wish, too. You are also welcome to contact me with questions. Otherwise, however, you MUST work this exam by yourself and you may not consult with anyone (except me, and that is fine…).

Instructions and Checklist:
__ 1. Start each problem on a new page.
__ 2. Name your file as instructed below.
__ 3. Complete the signature page.

How to name your exam submission:
__ a) Please be sure your name is somewhere on your submission.
__ b) Next, save it as a SINGLE FILE pdf (please do not submit a Word file).
__ c) Suggestion: Use the following naming convention: lastname_exam3.pdf.

How to submit your exam: In Blackboard, at left, click the ASSIGNMENTS tab. The link for submitting your exam is there.

Questions? Again, questions are always welcome. Please send an email to my UMass email. Thank you!

[email protected]


Signature

This is to confirm that in completing this exam, I worked independently and did not consult with anyone.

Name: ___________________________________________________________ Date: ___________________________

Thank you!


Two things, before you begin:

#1. This exam has a total number of points = 105, thus giving you 5 points of “wiggle room”. The maximum score possible is still 100.
#2. Several of the questions have 2 options: “A” or “B”. In these instances, please choose EXACTLY ONE option.


Question 1 (20 points) Choose EXACTLY ONE: Option A or B

Question 1 - OPTION A

1a. (4 points) Multiple choice. Pick ONE. The purpose of analysis of variance is to compare
__ a. the variances of several populations.
__ b. the proportions of successes in several populations.
__ c. the means of several populations.

1b. (4 points) Multiple choice. Pick ONE. A study of the effects of smoking classifies subjects as nonsmokers, moderate smokers, or heavy smokers. The investigators interview a sample of 200 persons in each group. Among the questions is “how many hours do you sleep on a typical night?” The degrees of freedom for the analysis of variance F statistic for comparing mean hours of sleep are
__ a. 2 and 197
__ b. 2 and 597
__ c. 3 and 597

The following setting pertains to Question 1 OPTION A parts c, d, and e. A dental study evaluated the effect of tooth etch time on resin bonding strength. A total of 78 undamaged, recently extracted first molars (baby teeth) were randomly assigned to be etched with phosphoric acid gel for either 15, 30, or 60 seconds. Composite resin cylinders of identical size were then bonded to the tooth enamel. The researchers examined the bond strength after 24 hours by finding the failure load (in megapascal) for each bond. Here are the summary data and ANOVA table for this experiment.

Etch time     n    Mean (x̄)   s
15 seconds    26   4.49       2.28
30 seconds    26   6.98       3.15
60 seconds    26   8.48       4.17

Source             DF   Sum of Squares   Mean Square   F-statistic   p-value
Etch time           2        211.208       105.604        9.745       0.0002
Error              75        812.745        10.837
Total, corrected   77       1023.953
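The sums of squares in this ANOVA table can be recovered from the summary statistics alone. The following is a minimal sketch of that arithmetic, assuming only the group sizes, means, and standard deviations shown in the summary table; the function name `anova_from_summary` and the dictionary layout are illustrative choices, not part of the exam.

```python
# Summary statistics from the etch-time table: (n, mean, standard deviation).
groups = {
    "15 seconds": (26, 4.49, 2.28),
    "30 seconds": (26, 6.98, 3.15),
    "60 seconds": (26, 8.48, 4.17),
}


def anova_from_summary(groups):
    """One-way ANOVA quantities computed from per-group (n, mean, sd)."""
    n_total = sum(n for n, _, _ in groups.values())
    k = len(groups)
    # Grand mean is the sample-size-weighted average of the group means.
    grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total
    # Between-group SS: spread of the group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
    # Within-group SS: pooled spread of observations around their own group mean.
    ss_within = sum((n - 1) * s ** 2 for n, _, s in groups.values())
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_statistic": ms_between / ms_within,
    }


table = anova_from_summary(groups)
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

The p-value is not computed here because the F distribution is not in the Python standard library; a statistics package would be needed for that step.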

1c. (4 points) Multiple choice. Pick ONE. The most striking conclusion from the numerical summaries for the three etch times is that
__ a. there appears to be little difference among the etch times.
__ b. on average, failure load increases with etch time.
__ c. on average, failure load decreases with etch time.

1d. (4 points) Multiple choice. Pick ONE. The conclusion of the analysis of variance test is that
__ a. there is quite strong evidence (p=.0002) that the mean failure loads are not the same in all three conditions.
__ b. there is quite strong evidence (p=.0002) that the mean failure load is much lower for 15 seconds etch time than for the other two etch times.
__ c. the data give no evidence (p=.0002) to suggest that mean failure loads differ for the three etch times.

1e. (4 points) Multiple choice. Pick ONE. If we used a series of two-sample t procedures to compare the three conditions, we would have to give three 95% confidence intervals to compare all three pairs of etch times. The weakness of doing this is that
__ a. we won’t be 95% confident that all 3 intervals cover the true differences in population means.
__ b. the conclusions from the three intervals might not agree.
__ c. the conditions for two-sample t inference are not met for all 3 pairs of etch times.


Question 1 - OPTION B

1a. (2 points) Multiple Choice. CHOOSE ONE. Analysis of variance compares the results from two or more groups when the response variable is
(a) Quantitative (b) Ordinal (c) Categorical

1b. (2 points) Fill in the blank. The null hypothesis in testing the means from four groups with one-way analysis of variance is ___________________________________.

1c. (2 points) Fill in the blank. The alternative hypothesis in testing the means from four groups with one-way analysis of variance is ___________________________________.

1d. (2 points) Select the best response. CHOOSE ONE. Evidence of nonrandom differences in group means occurs when the variance between groups is ____________________________ the variance within groups.
(a) much greater than (b) much less than (c) about equal to

1e. (2 points) Select the best response. CHOOSE ONE. This statistic quantifies the variability of the group means around the grand mean.
(a) mean difference (b) mean square between (c) mean square within

1f. (2 points) Select the best response. CHOOSE ONE. One-way analysis of variance, when restricted to the comparison of means from two groups, is equivalent to a(n) ______________________ test.
(a) Z (b) unequal variance independent t-test (c) equal variance independent t-test


1g. (2 points) Fill in the blank. The mean square within is equal to _________________ divided by the degrees of freedom within.

1h. (2 points) What is the null hypothesis addressed by Levene’s test of three groups?

1i. (2 points) List two graphical techniques that can be used to compare the shapes, locations, and spreads of data from two or more independent groups.

1j. (2 points) List the three statistical conditions required for analysis of variance.


Question 2 (15 points) Choose EXACTLY ONE: Option A or B

Question 2 - OPTION A

The following is a partial listing of data from a study of the effects of smoking during pregnancy on infant birthweight (grams). Not all of the data are shown, and you do not need them to answer this question. You may assume the three groups are independent and that the sample sizes are equal to n=12 in each group, for a total sample size of 12+12+12=36.

Group:
Non smoker   1 pack/day   > 1 pack/day
3515         3444         2608
3420         3827         2509
3175         3884         3600
…            …            …
4054         2750         2778
3459         3460         2466
3998         3340         3260

2a. (5 points) State the analysis of variance model using notation $\mu$ and $\tau_i$ and $\sigma_e^2$ as appropriate. Use deviation from means parameterization and define all terms and constraints on the parameters.

2b. (5 points) State the regression model using notation $\beta_0$ and $\beta_i$ and $\sigma_e^2$ as appropriate. Use reference cell coding parameterization and define all terms. Note to class: There is more than one correct answer here.


2c. (5 points) Using your answers to parts “a” and “b”, write down the expressions for the mean birthweight in each group in two ways: (i) as a function of the analysis of variance model terms; and (ii) as a function of the regression model terms. In reporting your answer, complete the following table:

Group           Mean                    analysis of variance    regression
Non smokers:    $E[Y_{1\cdot}] =$
1 pack/day:     $E[Y_{2\cdot}] =$
> 1 pack/day:   $E[Y_{3\cdot}] =$


Question 2 - OPTION B

Dear class – You do NOT need the actual data to answer this question.

Elevated cholesterol levels (certain types anyway) are an established risk factor for coronary artery disease. The following table is a summary of measurements of cholesterol values in three groups of study participants. Groups are defined by number of diseased vessels. You may assume equal variances.

Group, i   Sample size $n_i$   Mean cholesterol $\bar{Y}_i$   Standard deviation $S_i$
1          36                  260                            56.0
2          36                  289                            87.5
3          36                  295                            72.4

2a. (5 points) State the analysis of variance model using notation $\mu$ and $\tau_i$ and $\sigma_e^2$ as appropriate. Use deviation from means parameterization and define all terms and constraints on the parameters.

2b. (5 points) Make an analysis of variance table. In your analysis of variance table, provide entries for the following column headings.

Source   df   SSQ   MSQ = SSQ/df   F-Statistic   p-value

2c. (5 points) Carry out the appropriate hypothesis test to assess whether there are differences among the three groups with respect to cholesterol level. In your answer, please provide

- The null and alternative hypothesis (1 point)
- The name of the test statistic used and its formula (1 point)
- The value of the test statistic (1 point)
- The achieved significance (the p-value) (1 point)
- Interpretation of your findings (1 point)


Question 3 (20 points) Choose EXACTLY ONE: Option A or B

Question 3 - OPTION A

The following table shows the scores Y on an emotional maturity assessment test administered to 27 individuals cross-classified by age and the extent to which they use marijuana. For this problem, too, you may assume that the scores Y are normally distributed and that the variances in scores Y are the same in all groups.

             Marijuana Use Frequency
Age group    Never   Occasionally   Every Day
15-19        25      18             17
             28      23             24
             22      19             29
20-24        28      16             18
             32      24             22
             30      20             20
25-29        25      14             10
             35      16             8
             30      15             12

3a. (5 points) State what design was used and complete the analysis of variance table.

Source   df   SSQ   MSQ = SSQ/df   F-Statistic   p-value


3b. (10 points) Carry out the appropriate hypothesis tests to evaluate the roles of age and marijuana use on emotional maturity. Be sure to state all null and alternative hypotheses, compute the values of all test statistics, and determine all p-values.

3c. (5 points) Write a paragraph report of your findings and conclusions.


Question 3 - OPTION B

In a study of intelligence of children with heart disease that is either acyanotic or cyanotic and who have undergone surgery or not, the following changes (change = last – first) in IQ were obtained.

                      Factor II (Disease Type)
Factor I (Surgery)    Acyanotic                Cyanotic
No                    9, -1, -10, 3, -2        2, 1, -4, -5, 0
Yes                   -7, -7, -12, -13, -12    5, 10, 9, 2, 15

3a. (5 points) State the analysis of variance model. Use deviation from means parameterization. Define all terms and constraints on the parameters.

3b. (5 points) Now state the appropriate model, formulated as a regression using reference cell coding. Define all terms and constraints on the parameters. Note to class: There is more than one correct answer here.

3c. (5 points) Complete the following analysis of deviance and reference cell coding analysis of variance tables.

Analysis of Deviance Method
Source   DF   Sum of Squares   Mean Square   F-statistic   p-value

Reference Cell Coding Method
Source   DF   Sum of Squares   Mean Square   F-statistic   p-value

3d. (5 points) By any means you like (“deviation from means” or “reference cell coding”), test the null hypothesis that changes in IQ are not affected by an interaction between surgery and disease type. In 1-2 sentences, what do you conclude?


Question 4 (10 points) Required

One of the themes of BIOSTATS 640 was an understanding of effect modification. A term for this in analysis of variance parlance is “interaction”. A two way factorial design with replicate observations for each combination of factor I and II permits the discovery of “interaction”. We can get a visual sense of this by constructing a plot of the group means.

Consider a two way factorial design with factor I at two levels, designated “A1” and “A2”, and factor II at two levels, designated “B1” and “B2”. For each of the following three scenarios of main effects and interaction, provide the following:

(i) Identify the correct graphical summarization of the means.
(ii) Write down the correct model using notation $\mu$, $\alpha_i$, $\beta_j$, $(\alpha\beta)_{ij}$, and $\sigma_e^2$ as appropriate.

[Picture 1, Picture 2, Picture 3, Picture 4: plots of group means; images not reproduced in this transcript]

I’ve done one for you as an example: No effect of factor I, small effect of factor II, and no interaction.
(i) Picture 3
(ii) Model: $Y_{ijk} = \mu + \alpha_i + \beta_j + \varepsilon_{ijk}$
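What a plot of group means conveys can also be expressed numerically. The sketch below is illustrative only: the cell means are hypothetical (they do not come from the exam's pictures), and the function name `two_by_two_effects` is an assumption. It computes the two main effects and the interaction contrast for a 2x2 layout. An interaction contrast of zero corresponds to parallel lines in the means plot.

```python
def two_by_two_effects(means):
    """Main effects and interaction contrast from 2x2 cell means.

    means[(a, b)] is the mean for factor I level a (1 = A1, 2 = A2)
    and factor II level b (1 = B1, 2 = B2).
    """
    m11, m12 = means[(1, 1)], means[(1, 2)]
    m21, m22 = means[(2, 1)], means[(2, 2)]
    # Main effect of factor I: A2 minus A1, averaged over the levels of factor II.
    effect_I = ((m21 + m22) - (m11 + m12)) / 2
    # Main effect of factor II: B2 minus B1, averaged over the levels of factor I.
    effect_II = ((m12 + m22) - (m11 + m21)) / 2
    # Interaction: the B2-minus-B1 difference at A2 minus the same difference at A1.
    interaction = (m22 - m21) - (m12 - m11)
    return effect_I, effect_II, interaction


# Hypothetical cell means: parallel lines (no interaction).
parallel = {(1, 1): 10.0, (1, 2): 12.0, (2, 1): 10.0, (2, 2): 12.0}
# Hypothetical cell means: crossing lines (interaction with no main effects).
crossing = {(1, 1): 10.0, (1, 2): 14.0, (2, 1): 14.0, (2, 2): 10.0}

print(two_by_two_effects(parallel))
print(two_by_two_effects(crossing))
```

These are descriptive contrasts on the means only; judging whether an interaction is statistically significant still requires the replicate observations and the F test.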


4a. (3 points) Large effect of factor I, small effect of factor II, and no interaction.
(i) Picture:
(ii) Model:

4b. (3 points) No effect of factor I, no effect of factor II, and interaction.
(i) Picture:
(ii) Model:

4c. (4 points) Large effect of factor I, no effect of factor II, and interaction.
(i) Picture:
(ii) Model:


Question 5 (20 points) Choose EXACTLY ONE: Option A or B

Question 5 - OPTION A

A study was conducted to examine the relationship of survival status (defined as <10 years versus > 10 years) to type of surgery (defined as extensive versus not extensive). Data were obtained for 199 patients with cancer of the ovary. While other predictors of 10-year survival could be included (for example – stage of tumor, age, comorbidities), the first analysis considered only the one predictor variable: type of surgery (extensive versus not extensive). Following are the observed frequencies:

                    Survival
Type of Surgery     < 10 years   > 10 years
Extensive           20           28
Not Extensive       29           122

The initial logistic regression modeled the logit of survival <10 years in relationship to a 0/1 indicator of extensive surgery. Results of fitting yielded an estimated constant term equal to -1.4367 and an estimated regression coefficient for extensive surgery equal to 1.1003.
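As a consistency check on the reported estimates: in a logistic model with a single 0/1 predictor, the fitted intercept equals the sample log odds in the reference group, and the fitted coefficient equals the difference in log odds between the two groups. The sketch below applies that fact to the observed frequencies. The variable and function names are illustrative assumptions.

```python
import math

# Observed frequencies from the 2x2 table: (survival < 10 years, survival > 10 years).
extensive = (20, 28)
not_extensive = (29, 122)


def log_odds(n_event, n_no_event):
    """Sample log odds of the event (here, survival < 10 years)."""
    return math.log(n_event / n_no_event)


# Intercept: log odds in the reference group (surgery not extensive, indicator = 0).
intercept = log_odds(*not_extensive)
# Coefficient: log odds for extensive surgery minus log odds for the reference group.
slope = log_odds(*extensive) - intercept

print(f"intercept = {intercept:.4f}")
print(f"coefficient for extensive surgery = {slope:.4f}")
```

Both values agree with the estimates quoted above to four decimal places, which supports reading the indicator as 1 = extensive and 0 = not extensive.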

5a. (4 points) Write down the expression for the predicted logit using the information given. Define all terms.

5b. (4 points) Using your answer to question 5a, by hand, calculate the predicted probability of survival less than 10 years for a patient whose surgery is NOT extensive. Show your work.

5c. (4 points) Again using your answer to question 5a, by hand, calculate the predicted probability of survival less than 10 years for a patient whose surgery is extensive. Show your work.

5d. (4 points) Again using your answer to question 5a, by hand, what is the estimated relative odds (odds ratio, OR) of survival less than 10 years for a patient whose surgery is extensive, relative to that for a patient whose surgery is NOT extensive? Show your work.

5e. (4 points) Finally, using the 2x2 table approach (with its counts “a”, “b”, “c” and “d”), by hand, what is the estimated relative odds (odds ratio, OR) of survival less than 10 years for a patient whose surgery is extensive, relative to that for a patient whose surgery is NOT extensive? Show your work.


Question 5 - OPTION B

A logistic regression analysis of likelihood (π) of mortality considered several variables: shock (SHOCK: coded 1=shock, 0=NO shock), malnutrition (MALNUT: coded 1=malnourished, 0=NOT malnourished), alcoholism (ALC: coded 1=alcoholic, 0=NOT alcoholic), age (AGE: continuous), and bowel infarction (INFARCT: coded 1=infarction, 0=NO infarction). The following fitted model was obtained:

$\operatorname{logit}[\hat{\pi}] = -9.754 + 3.674\,[\text{SHOCK}] + 1.217\,[\text{MALNUT}] + 3.355\,[\text{ALC}] + 0.09215\,[\text{AGE}] + 2.798\,[\text{INFARCT}]$

5a. (8 points total) What is the estimated probability of death for a 60 year old malnourished patient with no evidence of shock, but with symptoms of alcoholism and prior bowel infarction? In developing your answer, write out the formula you use and provide the numeric estimate.

5b. (6 points total) Write out the expression for the predicted logit of mortality as a function of age for the sub-population for whom MALNUT=0, ALC=1, SHOCK=0, and INFARCT=1.

5c. (6 points total) Using your answer to question 5b as a start, by any means you like, construct a plot of the estimated probability of death versus age for a person who is NOT malnourished, has symptoms of alcoholism, and has a bowel infarction. A hand drawn plot is just fine! In 1-2 sentences, interpret your plot.
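Turning a fitted logit into a probability uses the inverse logit, π = 1 / (1 + exp(−logit)). The sketch below encodes the fitted coefficients quoted in this question and evaluates them for a hypothetical patient chosen for illustration (age 50, none of the four binary risk factors), not for any patient described in the exam. The function names are assumptions.

```python
import math

# Fitted coefficients as quoted in the question.
COEF = {
    "intercept": -9.754,
    "SHOCK": 3.674,
    "MALNUT": 1.217,
    "ALC": 3.355,
    "AGE": 0.09215,
    "INFARCT": 2.798,
}


def predicted_logit(shock, malnut, alc, age, infarct):
    """Linear predictor (log odds of death) for one covariate pattern."""
    return (COEF["intercept"]
            + COEF["SHOCK"] * shock
            + COEF["MALNUT"] * malnut
            + COEF["ALC"] * alc
            + COEF["AGE"] * age
            + COEF["INFARCT"] * infarct)


def predicted_probability(shock, malnut, alc, age, infarct):
    """Inverse logit of the linear predictor."""
    logit = predicted_logit(shock, malnut, alc, age, infarct)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical patient: 50 years old with no shock, malnutrition, alcoholism, or infarction.
logit_50 = predicted_logit(shock=0, malnut=0, alc=0, age=50, infarct=0)
prob_50 = predicted_probability(shock=0, malnut=0, alc=0, age=50, infarct=0)
print(f"logit = {logit_50:.4f}, probability = {prob_50:.4f}")
```

Calling `predicted_probability` over a range of ages with the binary covariates held fixed gives the points for a probability-versus-age curve.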


Question 6 (20 points)

Choose EXACTLY ONE: Option A or B

Question 6 - OPTION A

The Scottish Heart Health Study (Smith et al, 1987) examined risk factors for coronary heart disease (CHD). This question pertains to a logistic regression analysis that explored the influences of six risk factors: age, total cholesterol, body mass index, systolic blood pressure, smoking, and physical activity. The following table details the variable definitions and their code definitions.

Variable    Label                     Type/Code Definitions
CHD         Coronary Heart Disease    1=yes, 0=no
AGE         Age                       Continuous, years
TOTALCHOL   Total Cholesterol         Continuous, mg/dL
BMI         Body mass index           Continuous, weight/height²
SBP         Systolic blood pressure   mm Hg
SMOKING     Smoking status            1=never, 2=ex, 3=current
ACTIVITY    Self-reported activity    1=active, 2=average, 3=inactive

For purposes of modeling, two design variables were created to represent the three responses of SMOKING and two design variables were created to represent the three responses of ACTIVITY.

6a. (4 points) In one set of analyses, the author fit six separate one predictor models. The following table summarizes the values of the deviance statistic obtained. It also summarizes the assessment of the statistical significance of each crude association. In particular, there is a row for the “intercept only” model. In assessing the crude significance of each predictor, the one predictor model is compared to the “intercept only” model.

Details of Model                                   Details of Likelihood Ratio Test
Model                          Deviance   df       Δ Deviance (a)   p-value
“intercept only”               1569.37    4048     --               --
“intercept only” + AGE         1563.46    4047     5.91             0.015
“intercept only” + TOTALCHOL   1534.56    4047     34.81            < .0001
“intercept only” + BMI         1560.43    4047     8.94             0.003
“intercept only” + SBP         1528.01    4047     41.36            < .0001
“intercept only” + SMOKING     1556.22    4046     13.15            0.0014
“intercept only” + ACTIVITY    1569.06    4046     -                0.86

(a) Current versus “intercept only”

Reproduce the calculations that yielded the p-value of 0.86 for the significance of the crude association of CHD with ACTIVITY. Show all work.
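Every completed row of the one predictor table follows the same pattern: the likelihood ratio statistic is the drop in deviance relative to the “intercept only” model, its degrees of freedom are the drop in residual df, and the p-value is the upper tail of a chi-square distribution. The sketch below reproduces the AGE and SMOKING rows as a pattern to follow. It uses only the standard library, so the chi-square tail is coded in closed form for 1 and 2 degrees of freedom only; the function names are illustrative assumptions.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x), closed forms for df = 1 or 2."""
    if df == 1:
        # A chi-square(1) variable is a squared standard normal.
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        # A chi-square(2) variable is exponential with mean 2.
        return math.exp(-x / 2)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def likelihood_ratio_test(dev_reduced, df_reduced, dev_full, df_full):
    """Compare nested models using their deviances and residual degrees of freedom."""
    statistic = dev_reduced - dev_full
    df = df_reduced - df_full
    return statistic, df, chi2_upper_tail(statistic, df)


# "intercept only" versus "intercept only" + AGE
stat_age, df_age, p_age = likelihood_ratio_test(1569.37, 4048, 1563.46, 4047)
# "intercept only" versus "intercept only" + SMOKING (two design variables)
stat_smk, df_smk, p_smk = likelihood_ratio_test(1569.37, 4048, 1556.22, 4046)

print(f"AGE:     LR = {stat_age:.2f}, df = {df_age}, p = {p_age:.4f}")
print(f"SMOKING: LR = {stat_smk:.2f}, df = {df_smk}, p = {p_smk:.4f}")
```

A general-purpose chi-square routine from a statistics package would replace `chi2_upper_tail` when other degrees of freedom are needed.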

6b. (10 points) Based on the results of the one predictor models, in the next set of analyses, ACTIVITY was dropped from consideration and the investigators considered an initial five predictor model containing AGE, TOTALCHOL, BMI, SBP, and SMOKING. In a second set of model fits, the investigators deleted predictors one at a time. The following summary was obtained for this second set of model fits:

Details of Model                                                 Likelihood Ratio Test
Predictors in model in addition to the intercept   Deviance   df     Δ Deviance (a)   p-value
AGE, TOTALCHOL, BMI, SBP, SMOKING                  1482.47    4042   ---              ---
---, TOTALCHOL, BMI, SBP, SMOKING                  1484.00    4043   1.53             0.22
AGE, ---, BMI, SBP, SMOKING                        1507.72    4043   25.25            < .0001
AGE, TOTALCHOL, ---, SBP, SMOKING                  1486.09    4043   ---              0.057
AGE, TOTALCHOL, BMI, ---, SMOKING                  1509.03    4043   26.56            < .0001
AGE, TOTALCHOL, BMI, SBP, ---                      1496.34    4044   13.87            0.001

(a) Current versus the five predictor model in the first row (for example, 1.53 = 1484.00 - 1482.47)

Using this summary, carry out the test that produced the p-value of 0.057 for the row that reads AGE, TOTALCHOL, ---, SBP, SMOKING. In reporting your answer, provide: null hypothesis (1 point), alternative hypothesis (1 point), test statistic name and degrees of freedom, test statistic value (1 point), p-value (1 point), and interpretation (1 point).

6c. (6 points) Continuing, the investigators next considered a four predictor model in which AGE was dropped. BMI was retained because it was of borderline significance in the second set of analyses and highly significant in crude analysis. Then, a third set of model fits was done. In this set of model fits also, the investigators deleted predictors one at a time. The following summary was obtained for this third set of model fits:

Details of Model                                                 Likelihood Ratio Test
Predictors in model in addition to the intercept   Deviance   df     Δ Deviance (a)   p-value
TOTALCHOL, BMI, SBP, SMOKING                       1484.00    4043   ---              ---
---, BMI, SBP, SMOKING                             1509.00    4044   25.00            < .0001
TOTALCHOL, ---, SBP, SMOKING                       1484.48    4044   3.48             0.062
TOTALCHOL, BMI, ---, SMOKING                       1515.37    4044   31.37            < .0001
TOTALCHOL, BMI, SBP, ---                           1497.40    4045   13.40            0.001

(a) Current versus the four predictor model in the first row (for example, 25.00 = 1509.00 - 1484.00)

In at most 1-3 sentences, based on the information provided in this table, which model would you choose to report and why?


Question 6 - OPTION B

6a. (10 points) The following data record the number of minutes of play survived by each of a sample of shuttlecocks before damage occurred. Some shuttlecocks were still intact at the end of the test, hence the (right) censored values, which are denoted by *.

Time (mins): 2   2*   9   6*   2   13   1*   7*   6   3

Calculate the Kaplan–Meier estimate of the survivor function for this sample by hand. In developing your answer, complete the following three tables:

Table 1 – Chronology of events of shuttlecock damage or censoring

Minutes of play, t   |   # events of damage at minutes = t   |   # censored at minutes = t

Table 2 – Worksheet for Solution of Kaplan Meier Estimates of Survival

Occasion of Outcome, t   |   # At risk at instant before time t   |   # Surviving beyond day = t   |   Conditional % surviving beyond t, Pr [ S > t given S > (t-1) ]

Table 3 – Kaplan Meier Estimates

Minutes of play   |   Formula for % NOT Damaged = Pr [ S > t ]   |   Solution
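The three worksheets above mirror the product-limit calculation: at each distinct event time, count how many units are still at risk, compute the conditional proportion surviving, and multiply the running product. The sketch below shows that mechanic on a small hypothetical data set (not the shuttlecock data). The function name `kaplan_meier` is an assumption, and units censored at an event time are counted as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, observed):
    """Product-limit estimate of S(t) at each distinct event time.

    times:    follow-up time for each unit
    observed: True if the event occurred at that time, False if right-censored
    Returns a list of (event time, estimated survival just after that time).
    """
    data = list(zip(times, observed))
    survival = 1.0
    curve = []
    # Only times at which at least one event occurred change the estimate.
    for t in sorted({u for u, event in data if event}):
        # At risk: every unit whose follow-up time is at least t.
        at_risk = sum(1 for u, _ in data if u >= t)
        events = sum(1 for u, event in data if u == t and event)
        # Conditional probability of surviving beyond t, given at risk at t.
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: five units, with the second unit at time 2 and the unit at time 4 censored.
example_times = [1, 2, 2, 3, 4]
example_observed = [True, True, False, True, False]

for t, s in kaplan_meier(example_times, example_observed):
    print(f"t = {t}: S(t) = {s:.3f}")
```

In this hypothetical example the numbers at risk are 5, 4, and 2 at times 1, 2, and 3, so the estimate steps through 4/5, then (4/5)(3/4), then (4/5)(3/4)(1/2).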

6b. (10 points) The following data are the survival times in months of two groups of patients (A=untreated versus B = treated). Right censored values are denoted by *.

Part i (3 points): By any means you like, fit an appropriately defined Cox PH model with one predictor: group, defined = 1 if group is “A” (UNtreated) versus 0 if group is “B” (treated). Write out the expression for your fitted model.

Part ii (3 points): Using your fitted model, what is the estimated relative hazard of death for the comparison group defined as untreated, relative to the referent group defined as treated? Show your work.

Part iii (4 points): By any means you like, by hand, perform a likelihood ratio test of the null hypothesis that there is no benefit of treatment on survival. In 1-2 sentences, interpret.