Eugm 2012 pritchett - application of adaptive sample size re-estimation in event outcome confirmatory trials - 2012 eugm

The Application of Adaptive Sample Size Re-estimation in Event Outcome Confirmatory Clinical Trials

Yili L. Pritchett, PhD

East User Group Meeting

October 12, 2012

Company Confidential

Outline

•

“Learning” and “Confirmatory” clinical trials

•

A case study: predictivity of change in albuminuria to mortality, CV outcome, or renal outcome events

•

Adaptive sample size re-estimation (SSR) design for an event outcome study

•

Comparisons between adaptive SSR and group sequential design approach

•

Concluding remarks


Background

•

Clinical trials testing a New Molecular Entity (NME) can be classified as “learning” or “confirmatory” (Sheiner, 1997)

•

The ideal case: parameters (mostly on the efficacy side) are well learned during the learning phase, and they will guide the design of the confirmatory studies

•

E.g., development program for a new antidepressant

•

For this type of indication, study endpoints can be and should be consistent between learning and confirmatory trials


However, there are exceptions ……

•

Indications that require data demonstrating the test drug’s effect in reducing certain detrimental outcome events are likely to fall in this category

•

Because a sufficiently large sample size and a long duration of follow-up are needed to observe the occurrence of events, it is not viable to design such a study just to learn

•

Most event outcome studies are designed at Phase 3 level

•

How to learn about the NME at an early stage of development?

--- Use surrogates or biomarkers

•

How to make the Go/No Go decision for Phase 3?

--- Based on a hypothesis or evidence of predictivity that infers the treatment effect on event outcomes from its effect on biomarkers


The Meaning of “Confirmatory” Has Been Modified in Such Clinical Development Programs

•

Phase 3 is meant not to replicate and confirm a treatment effect that was observed in the early phase, but to observe the hypothesized treatment effect that has not yet been seen

•

As such, the NME might enter the confirmatory phase with a great deal of uncertainty and a high risk of failure

•

The uncertainties associated with confirmatory event outcome trials include but are not limited to:

-

Treatment effect, quantified by hazard ratio (HR)

-

Placebo group event rate


Example: Change in Albuminuria and the Outcomes of Cardiovascular and Renal Events

•

Albuminuria is a medical condition often diagnosed by elevation of the urinary albumin/creatinine ratio (UACR)

•

Data from two prospective trials on telmisartan (ONTARGET and TRANSCEND) were combined to assess the predictivity of changes in albuminuria on mortality, CV, and renal outcomes (Schmieder et al., 2011)

Subjects with UACR at both baseline and 2-year visit: N=23,480

Grouping by % change in UACR (cut points at -50% and +100%):

Albuminuria improved (subjects with ≥50% decrease): N=4994

Subjects with minor changes: N=11,968

Albuminuria worsened (subjects with ≥100% increase): N=6518


UACR Value at Baseline was Associated with Mortality

[Bar chart: annual mortality rate (y-axis, 0% to 9%) by UACR value at baseline in mg/g (x-axis categories: < 10, [10, 30), [30, 100), [100, 300), ≥ 300)]


Definitions and Data Analysis Methods

•

Events of interest: all-cause mortality, CV death, composite CV endpoint, and combined renal endpoint.

•

Composite CV endpoint: cardiovascular death, myocardial infarction, stroke, and hospitalization for heart failure

•

Combined renal endpoint: doubling of serum creatinine, or need for dialysis or renal transplant.

•

The group of subjects with minor changes in UACR was taken as the reference group

•

Hazard ratios were estimated from a Cox model adjusting for age, sex, body mass index, smoking, alcohol consumption, eGFR, plasma glucose, systolic and diastolic BP and heart rate at baseline, BP change and eGFR change within 2 years, treatment, and diagnosis at entry.


Change in UACR Predicted Event Outcomes

Figure 1. Schmieder et al. 2011


Relationship between UACR Reduction and Relative Risk of Renal Outcomes

Data source: Lambers Heerspink and De Zeeuw Nephron. Clin. Prac. 2010

© 2012 Abbott

Figure 1. Lambers Heerspink and De Zeeuw Nephron. Clin. Prac. 2010


Accumulated Data and Research Suggest ……

•

Albuminuria is a powerful predictor of the events of interest

•

More importantly, change in albuminuria predicted the outcome of these events

•

Thus, if a treatment can reduce albuminuria, it is likely to reduce the rates of these events as well


A hypothetical clinical program where albuminuria is used as a biomarker

[Timeline, in order along the time axis:]

POC study (8-12 weeks), then dose-finding study (8-12 weeks). Primary efficacy endpoint: change in UACR

Event outcome study (3-5 years). Primary efficacy endpoint: time-to-event


At the time of designing the event-driven confirmatory study, there are knowns and unknowns

•

It is known that the test drug has an effect on UACR reduction

•

It is predicted that the test drug can achieve a hazard rate reduction in the range between 25% and 30% (HR ≈ 0.75–0.70) in those who achieved albuminuria reduction under the treatment

•

It is also known that an HR of 0.80 is clinically meaningful and can make the test drug a viable treatment

•

If the predictive model is correct, one should power the study for an HR in the range of 0.70-0.75

•

Alternatively, one can take a conservative approach and power the study to detect a 20% event reduction




Assumptions for Initial Sample Size Calculation

•

Hazard Ratio = 0.75

•

Annual placebo event rate = 0.07

•

One-sided α=0.025, power = 90%

•

An interim analysis will be performed when 50% of the events have been collected

•

Futility stopping: Gamma (-6); no plan to stop for success

•

O’Brien-Fleming alpha-spending function is applied

•

Accrual: 2 years, follow-up: 4 years

→ Required number of events d = 509 → required total sample size N = 3123
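As a rough cross-check of the event count, the sketch below uses Schoenfeld's fixed-sample approximation for a log-rank test. It assumes 1:1 randomization (not stated on the slide) and ignores the interim analysis and the futility boundary, so it is not the East calculation behind d = 509. The function name `required_events` is illustrative only.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.025, power=0.90):
    """Schoenfeld approximation: events needed for a one-sided log-rank test.

    Assumes 1:1 randomization and a single final analysis (no interim look).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2)


events_hr_075 = required_events(0.75)  # design assumption on this slide
events_hr_080 = required_events(0.80)  # conservative, clinically meaningful HR
print(events_hr_075, events_hr_080)
```

Under these assumptions the approximation lands within a few events of the d = 509 quoted here for HR = 0.75 and of the d = 846 quoted later in the deck for HR = 0.80. Converting events to patients (N = 3123) additionally depends on the accrual, follow-up, and event-rate model, which this sketch does not attempt.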




The Adequacy of Sample Size and Potential Study Power Are Conditional on the True Effect Size

[Figure: power (y-axis, 40% to 100%) versus assumed true HR (x-axis, 0.70 to 0.80) for the original design. Power labels on the curve, decreasing as the assumed true HR increases: 98%, 94%, 90%, 80%, 71%. Sample sizes marked on the N axis: 3123, 4651, 5191.]

If the true HR is in the lower range, the sample size targeted on HR=0.75 is more than adequate

If the true HR is in the upper range, the original sample size is running short

A solution: use an adaptive sample size re-estimation design
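The dependence of power on the true HR can be approximated with a fixed-sample normal approximation for the log-rank statistic. This is a sketch under assumptions that are not on the slide: 1:1 randomization, no interim look, and no futility boundary. The helper name `approx_power` is illustrative.

```python
from math import log, sqrt
from statistics import NormalDist


def approx_power(true_hr, events, alpha=0.025):
    """Approximate power of a one-sided log-rank test with a fixed event count."""
    # Expected value of the final z-statistic under the true hazard ratio.
    drift = -log(true_hr) * sqrt(events) / 2
    return NormalDist().cdf(drift - NormalDist().inv_cdf(1 - alpha))


# Power of the original design (d = 509) across the plausible HR range.
for hr in (0.70, 0.73, 0.75, 0.78, 0.80):
    print(f"HR={hr:.2f}: approximate power {approx_power(hr, 509):.0%}")
```

With d = 509 this approximation gives about 90% power at HR = 0.75 and about 71% at HR = 0.80, in line with the values labeled on the curve.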


Adaptive Sample Size Re-estimation Using “Promising Zone” (Mehta and Pocock, 2011)

•

Conduct an interim analysis when 1/2 of the events (d=255) have been collected

•

If interim efficacy data show compelling treatment effect, no sample size adjustment; if interim efficacy data do not support a viable treatment effect, no sample size adjustment

•

If interim efficacy data are somewhat less than the original target but promising, increase the sample size to ensure adequate power for the final analysis

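[Diagram: the conditional power scale from 0% to 100% is split at CPmin and CPmax into the Unfavorable Zone (0 to CPmin), the Promising Zone (CPmin to CPmax), and the Favorable Zone (CPmax to 100%)]

CP = Conditional power: the power of the study calculated using the observed HR at interim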
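A minimal sketch of how conditional power under the observed trend could be computed and mapped to a zone. It uses the standard normal approximation for the log-rank statistic and assumes a final critical value of 1.96, whereas the actual design uses an O'Brien-Fleming-adjusted value. The function names and the treatment of values exactly on a boundary are illustrative assumptions.

```python
from math import log, sqrt
from statistics import NormalDist


def conditional_power(observed_hr, interim_events, planned_events, z_final=1.96):
    """Conditional power assuming the observed interim HR holds for the rest of the trial."""
    z1 = -log(observed_hr) * sqrt(interim_events) / 2  # interim z-statistic
    t = interim_events / planned_events                # information fraction
    return NormalDist().cdf((z1 / sqrt(t) - z_final) / sqrt(1 - t))


def zone(cp, cp_min=0.10, cp_max=0.90):
    """Classify an interim result; boundary handling here is an assumption."""
    if cp < cp_min:
        return "unfavorable"
    if cp < cp_max:
        return "promising"
    return "favorable"


for hr in (0.92, 0.85, 0.77):
    cp = conditional_power(hr, interim_events=255, planned_events=509)
    print(f"observed HR={hr:.2f}: CP={cp:.0%}, zone={zone(cp)}")
```

Only an interim result in the promising zone would trigger a sample size increase. The favorable and unfavorable zones leave the original plan unchanged.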




Operating Characteristics for Various Choices of CPmin

(original event number d=509 with N=3123, targeting HR=0.75)


Operating Characteristics for CPmin = 10% and CPmax = 90%

Assume true HR=0.80

[Figure: interim conditional power (CP) scale with the corresponding observed HR: CP=0% (HR=1.05), CP=10% (HR=0.92), CP=90% (HR=0.77), CP=100% (HR=0.67). Zone probabilities: 13% in Unfavorable Zone, 45% in Promising Zone, 40% in Favorable Zone. Original sample size (# events): 3123 (d=509). New sample size (new # events): 4651 (d=758); 5191 (d=846) is also marked. Power labels on the figure: 94%, 26%, 67%, 90%, 87%.]

Simulation by East Version 5.4.
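The zone probabilities can be approximated analytically by inverting the conditional power formula at CPmin and CPmax. This is only a sketch: it assumes a normal approximation for the interim log-rank statistic, a final critical value of 1.96, and no futility stopping, so it will not match the East simulation exactly. The function names are illustrative.

```python
from math import log, sqrt
from statistics import NormalDist

N01 = NormalDist()


def z1_threshold(cp, t, z_final=1.96):
    """Interim z-statistic at which conditional power (observed trend) equals cp."""
    return sqrt(t) * (z_final + sqrt(1 - t) * N01.inv_cdf(cp))


def zone_probabilities(true_hr, interim_events, planned_events,
                       cp_min=0.10, cp_max=0.90):
    """Probability that the interim result lands in each zone under true_hr."""
    t = interim_events / planned_events
    mean_z1 = -log(true_hr) * sqrt(interim_events) / 2
    p_unfavorable = N01.cdf(z1_threshold(cp_min, t) - mean_z1)
    p_favorable = 1 - N01.cdf(z1_threshold(cp_max, t) - mean_z1)
    return p_unfavorable, 1 - p_unfavorable - p_favorable, p_favorable


unfavorable, promising, favorable = zone_probabilities(0.80, 255, 509)
print(f"unfavorable {unfavorable:.0%}, promising {promising:.0%}, "
      f"favorable {favorable:.0%}")
```

Under these assumptions the split for a true HR of 0.80 is roughly 15% / 45% / 40%, close to the simulated 13% / 45% / 40%. The small gap in the unfavorable share may reflect the futility boundary, which this sketch leaves out.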


Choice of CPmin - Trade off between Area of Protection and Conditional Power in “Promising Zone” After Adjustment

Assume true HR=0.80

[Figure: interim conditional power (CP) scale with the corresponding observed HR: CP=0% (HR=1.05), CP=10% (HR=0.92), CP=30% (HR=0.92), CP=90% (HR=0.77), CP=100% (HR=0.67). Zone probabilities shown: 40% in Favorable Zone, 24% in Unfavorable Zone, and 13%. Original sample size (# events): 3123 (d=509). New sample size (# events): 4651 (d=758). Power labels on the figure: 87%, 90%, 35% (26%), 94%.]


Choice of CPmin - Trade off between Area of Protection and Conditional Power in “Promising Zone” (cont’d)

Assume true HR=0.80

[Figure: interim conditional power (CP) scale with the corresponding observed HR: CP=0% (HR=1.05), CP=90% (HR=0.77), CP=100% (HR=0.67). Original sample size (# events): 3123 (d=509). New sample size (# events): 4651 (d=758). Power labels on the figure: 78%, 94%, 57%.]


Choice of CPmax - Trade off between Incremental Investment and Conditional Power at Final Analysis

Assume true HR=0.80

[Figure: interim conditional power (CP) scale with the corresponding observed HR: CP=0% (HR=1.05), CP=10% (HR=0.92), CP=80% (HR=0.80), CP=90% (HR=0.77), CP=100% (HR=0.67). Zone probability shown: 13% in Unfavorable Zone. Original sample size (# events): 3123 (d=509). New sample size (new # events): 4651 (d=758). Power labels on the figure: 92% (94%), 26%, 87%, 84%.]


Simulations for Sensitivity of the Design (original event number d=509, CPmin=0.10, CPmax=0.90, new event number dnew=758)


Comments

•

Design operating characteristics are sensitive to the definition of the “Promising Zone”.

•

Simulations are essential to identify the optimal choice of boundaries so that the design can

-

allow the sample size to be adjusted upward when the treatment effect is promising;

-

ensure adequate conditional power at the end to detect the effect;

-

not spend unnecessary extra sample size.

•

Random variation in the observed effect size at interim could lead to a mistaken decision to increase, or not to increase, the sample size.

•

To avoid making a wrong decision, the DMC needs to examine the totality of interim data.


Adaptive SSR Design vs. Group Sequential Design (GSD) from Sample Size Perspective

GS design: sample size is determined to detect HR=0.80 with 90% power (N=5191); perform an interim analysis at 50% of the events and allow stopping for success; O’Brien-Fleming alpha-spending function is applied.

Probability of stopping for success if interim HR=0.80 is 26%; sample size due to early stopping = 4776

Average sample size using GS design: N = 5191*0.74 + 4776*0.26 = 5083

Average sample size using adaptive SSR (up-adjusted to have near 90% power for HR=0.80): N = 4062
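The average sample size of the GS design is a probability-weighted mean of the two possible end states. The sketch below reproduces that arithmetic from the numbers on this slide. The function name is illustrative, and the adaptive SSR average of 4062 is taken from the slide rather than recomputed.

```python
def expected_sample_size(n_full, n_early_stop, p_early_stop):
    """Average sample size when the trial either stops early or runs to completion."""
    return n_full * (1 - p_early_stop) + n_early_stop * p_early_stop


gsd_average = expected_sample_size(n_full=5191, n_early_stop=4776, p_early_stop=0.26)
ssr_average = 4062  # adaptive SSR average, as quoted on the slide

print(round(gsd_average))                 # average N for the GS design
print(round(gsd_average - ssr_average))   # average patients saved by adaptive SSR
```

Because early stopping saves only 415 patients and happens about a quarter of the time, the GS design average stays close to its maximum of 5191.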




Comparisons Between Adaptive SSR and GSD Conditioning on True Hazard Ratio


Methods for Final Analysis

•

Final nominal level will be adjusted using O'Brien-Fleming method regardless of whether the sample size is increased

•

If there is no sample size adjustment, the final statistical inference will be based on the standard statistic Z, calculated using all the events collected in the study

•

If there is a sample size increase, the CHW method (Cui, Hung, and Wang, 1999) will be used to calculate the combined statistic at final analysis

•

E.g., if 50% of the events are used for interim analysis, the final test statistic will be

$\sqrt{\tfrac{1}{2}}\,Z_1 + \sqrt{\tfrac{1}{2}}\,Z_2$

where $Z_1$ and $Z_2$ are two score tests calculated using events observed before and after the interim, respectively.
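A minimal sketch of the CHW combination for this example. The weights come from the pre-planned interim fraction (50% here) and stay fixed even if the number of post-interim events is increased. The function name, the default argument, and the example z-values are hypothetical.

```python
from math import sqrt


def chw_statistic(z1, z2, planned_information_fraction=0.5):
    """Weighted combination of the stage-wise score statistics (CHW method).

    z1 uses events observed before the interim, z2 uses events observed after it.
    """
    w1 = planned_information_fraction
    return sqrt(w1) * z1 + sqrt(1 - w1) * z2


# Hypothetical stage-wise statistics, for illustration only.
z_combined = chw_statistic(z1=1.2, z2=2.0)
print(f"{z_combined:.3f}")
```

The combined statistic is then compared with the final O'Brien-Fleming-adjusted critical value. Keeping the weights fixed is what lets the sample size increase leave the type I error rate intact.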


Concluding Remarks

•

For indications where the confirmatory trial endpoint was not measured in the learning phase, great uncertainty is associated with the outcome of confirmatory studies

•

To mitigate costly failure at a late stage, adaptive sample size re-estimation can be a novel design choice along with futility stopping

•

This approach allows the opportunity for the test drug to demonstrate a treatment effect using a smaller sample size

•

It also provides the opportunity to increase the probability of trial success when the sample size is falling short while a promising treatment effect is observed at interim

•

Careful selection of the “Promising Zone” is important for such a design


Acknowledgements

•

Cyrus Mehta, Ph.D., Cytel Inc.

•

Hui Tang, Ph.D., Abbott Laboratories


References

1. Sheiner L.B., Learning versus confirming in clinical drug development, Clinical Pharmacology and Therapeutics, 61:275-291, 1997.

2. Roland E. Schmieder, et al., Changes in albuminuria predict mortality and morbidity in patients with vascular disease, J Am Soc Nephrol 22:1353–1364, 2011.

3. Lambers Heerspink and De Zeeuw, Dual RAS therapy not on target, but fully alive, Nephron Clin. Prac. 2010.

4. Brenner et al., Effects of losartan on renal and cardiovascular outcomes in patients with Type 2 diabetes and nephropathy, N Engl J Med 345:861-869, 2001.

5. Mehta and Pocock, Adaptive increase in sample size when interim results are promising: A practical guide with examples, Statistics in Medicine 2011.

6. Lu Cui, H. M. James Hung, and Sue-Jane Wang, Modification of Sample Size in Group Sequential Clinical Trials, Biometrics 55:853-857, September 1999.


Financial Information Disclosure

Yili L. Pritchett is an employee of Abbott Laboratories. The research presented was financially supported by Abbott Laboratories.
