Early stopping for phase II cancer studies: a likelihood approach

Elizabeth Garrett-Mayer, PhD

Associate Professor of Biostatistics

The Hollings Cancer Center

The Medical University of South Carolina

garrettm@musc.edu

1

Motivation

Oncology Phase II studies

Single arm

Evaluation of efficacy

Historically, ‘clinical response’ is the outcome of interest

Evaluated within several months (cycles) of enrollment

Early stopping often incorporated for futility

More recently

 targeted agents halt growth but may or may not shrink the tumor

 ‘progression-free survival’ is the outcome of interest

 extensions exist for survival evaluation, but are not covered today

2

Early Stopping in Phase II studies:

Binary outcome (clinical response)

Attractive solutions exist for this setting

Common design is Simon’s two-stage (Simon, 1989)

Preserves type I and type II error

Procedure: Enroll N1 patients (stage 1).

If x or more respond, enroll N2 more (stage 2).

If fewer than x respond, stop.

Appropriate for binary responses

Bayesian approaches also implemented

 binomial likelihood, beta prior → beta-binomial model

 other forms possible

 requires a prior

Lee and Liu: predictive probability design (Clinical Trials, 2008)

3

Alternative approach for early stopping

Use likelihood-based approach

(Royall (1997),

Blume (2002))

Not that different from Bayesian

Parametric model-based

No “penalties” for early looks

But it is different

No prior information included

Early evaluations are relatively simple

“Probability of misleading evidence” controlled

Can make statements about probability of misleading evidence

4

Today’s talk

Likelihood approach

 principles

 multiple looks

Focus on binary outcome situation

Can be extended to the time-to-event outcome setting

 parametric survival distributions

 issues with length of follow-up

 issues with how often to “look”

5

Law of Likelihood

If hypothesis A implies that the probability of observing some data X is PA(X), and hypothesis B implies that the probability is PB(X), then the observation X = x is evidence supporting A over B if PA(x) > PB(x), and the likelihood ratio, PA(x)/PB(x), measures the strength of that evidence.

(Hacking 1965, Royall 1997)
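As a minimal numeric sketch of the Law of Likelihood (the data and rates here are illustrative, not from the talk), the likelihood ratio for two hypothesized binomial response rates can be computed directly:

```python
from math import comb

def binom_pmf(x, n, p):
    # P(X = x) for X ~ Binomial(n, p)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Illustrative observation: 10 responses in 37 patients.
# Hypothesis A: p = 0.2; hypothesis B: p = 0.4.
x, n = 10, 37
lr = binom_pmf(x, n, 0.2) / binom_pmf(x, n, 0.4)
print(round(lr, 2))  # 2.31: weak evidence supporting A over B
```

Note that the binomial coefficient cancels in the ratio, so only the two hypothesized rates matter; an LR near 2 is far below the usual strong-evidence benchmarks (k = 8, 10, or 32).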

6

Likelihood approach

Determine “what the data say” about the parameter of interest

Likelihood function: gives a picture of the data

Likelihood intervals (LI): gives range of reasonable values for parameter of interest

[Figure: likelihood function for λ, with 1/8 and 1/32 likelihood intervals marked; x-axis: Lambda, 0.01-0.05]

7

Likelihood approach

Likelihood ratios (LR)

Take ratio of heights of L for different values of λ

L( λ= 0.030)=0.78; L( λ= 0.035)=0.03.

LR = 26

[Figure: likelihood function for λ, showing the heights at λ = 0.030 and λ = 0.035; x-axis: Lambda, 0.01-0.05]

8

Likelihood-Based Approach

Use the likelihood ratio to determine if there is sufficient evidence in favor of one hypothesis over another

Error rates are bounded

Implications: Can look at data frequently without concern over mounting errors

“Evidence-based”

9

Key difference in likelihood versus frequentist paradigm

Consideration of the alternative hypothesis

Frequentist hypothesis testing:

H0: null hypothesis

H1: alternative hypothesis

Frequentist p-values:

 calculated assuming the null is true

 have no regard for the alternative hypothesis

Likelihood ratio:

Compares evidence for two hypotheses

Acceptance or rejection of null depends on the alternative

10

Example:

Assume H0: λ = 0.08 vs. H1: λ = 0.12

What if true λ = 0.10?

 Simulated data, N=300

 Frequentist:

 p = 0.01

Reject the null

 Likelihood

 LR = 1/4

 Weak evidence in favor of null

[Figure: likelihood function from the simulated data; x-axis: Lambda, 0.08-0.12]

11

Example:

 Why?

 P-value looks for evidence against null

 LR compares evidence for both hypotheses

When the “truth” is in the middle, which makes more sense?

[Figure: likelihood function with both hypothesized values of λ marked; x-axis: Lambda, 0.08-0.12]

12

Likelihood Inference

Weak evidence: at the end of the study, there is not sufficiently strong evidence in favor of either hypothesis

This can be controlled by choosing a large enough sample size

But, if neither hypothesis is correct, can end up with weak evidence even if N is seemingly large (appropriate)

Strong evidence

Correct evidence: strong evidence in favor of correct hypothesis

Misleading evidence: strong evidence in favor of the incorrect hypothesis.

This is our interest today: what is the probability of misleading evidence?

This is analogous to the alpha (type I) and beta (type II) errors that frequentists worry about

13

Operating Characteristics

Simon two-stage

[Figure: probability of accepting vs. rejecting H0 as a function of the true response rate p, 0.1-0.6]

14

Operating Characteristics

Likelihood Approach

[Figure: probability of accepting H0, accepting HA, or ending with weak evidence, as a function of the true response rate p, 0.1-0.6]

15

Misleading Evidence in Likelihood Paradigm

Universal bound: under H0, P(L1/L0 ≥ k) ≤ 1/k

(Birnbaum, 1962; Smith, 1953)

In words, the probability that the likelihood ratio exceeds k in favor of the wrong hypothesis can be no larger than 1/k.

In certain cases, an even lower bound applies (Royall, 2000):

Difference between normal means

Large sample size

Common choices for k are 8 (strong), 10, 32 (very strong).
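The universal bound is easy to check by simulation. The sketch below is my own illustration, using the binomial setting of the later examples (p0 = 0.2, p1 = 0.4, N = 37): it estimates how often data generated under H0 yield strong evidence for H1.

```python
import random
from math import log

random.seed(1)
p0, p1, k, n, sims = 0.2, 0.4, 10, 37, 20000

# Per-patient log-LR contributions (alternative vs. null)
resp = log(p1 / p0)                  # a responder
nonresp = log((1 - p1) / (1 - p0))   # a non-responder

misleading = 0
for _ in range(sims):
    y = sum(random.random() < p0 for _ in range(n))  # data generated under H0
    loglr = y * resp + (n - y) * nonresp
    if loglr >= log(k):  # strong evidence for the wrong hypothesis
        misleading += 1

print(misleading / sims)  # well below the universal bound 1/k = 0.10
```

For these particular hypotheses the simulated rate is far below 1/k, consistent with Royall (2000): the universal bound is often very conservative.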

16

Implications

Important result: For a sequence of independent observations, the universal bound still holds

(Robbins, 1970)

Implication: We can look at the data as often as desired and our probability of misleading evidence is bounded

That is, if k=10, the probability of misleading strong evidence is ≤ 10%

Reasonable bound, considering that β = 10-20% and α = 5-10% in most studies

17

Early stopping in phase II study: binary outcome

Motivating Example

Single arm cancer clinical trial

 outcome = clinical response

Early stopping for futility

Standard frequentist approach

Simon two-stage design

Only one look at the data

Determine “optimality” criterion

 minimax

 minimum E(N) under H0 (Simon’s optimal)

Likelihood approach

Use binomial likelihood

Can look at the data after each observation

18

Motivating Example

New cancer treatment agent

Anticipated response rate is 40%

Null response rate is 20%

 the standard of care yields 20%

 not worth pursuing new treatment with same response rate as current treatment

Using frequentist approach:

Simon two-stage with alpha = beta = 10%

Optimum criterion: smallest E(N)

First stage: enroll 17. If 4 or more respond, continue.

Second stage: enroll 20 more. If 11 or more respond in total, conclude success.

19

Likelihood Approach

Recall: we can look at the data after each patient

Use the binomial likelihood to compare two hypotheses.

Difference in the log-likelihoods provides the log likelihood ratio

This simplifies to:

log L1 - log L0 = (Σ yi) · [log(p1/(1-p1)) - log(p0/(1-p0))] + N · [log(1-p1) - log(1-p0)]

20

Implementation

Look at the data after each patient

Estimate the difference logL0 - logL1

Rules:

 if logL0 - logL1 > log(k): stop for futility

 if logL0 - logL1 < log(k): continue

21

Likelihood Approach

But, given discrete nature, only certain looks provide an opportunity to stop

Current example: stop the study if…

0 responses in 9 patients

1 response in 12 patients

2 responses in 15 patients

3 responses in 19 patients

4 responses in 22 patients

5 responses in 26 patients

6 responses in 29 patients

7 responses in 32 patients

Although total N can be as large as 37, there are only 8 thresholds for futility early stopping assessment
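The thresholds above follow directly from the stopping rule with k = 10. A short sketch (function name my own) that recovers them:

```python
from math import ceil, log

def futility_thresholds(p0, p1, k, n_max):
    """For each response count y, the smallest n at which
    logL0 - logL1 >= log(k), i.e. the trial stops for futility."""
    per_resp = log(p0 / p1)                  # a response shifts evidence toward H1
    per_nonresp = log((1 - p0) / (1 - p1))   # a non-response shifts it toward H0
    out, y = [], 0
    while True:
        # need y*per_resp + (n - y)*per_nonresp >= log(k)
        n = y + ceil((log(k) - y * per_resp) / per_nonresp)
        if n > n_max:
            break
        out.append((y, n))
        y += 1
    return out

print(futility_thresholds(0.20, 0.40, 10, 37)[:8])
# [(0, 9), (1, 12), (2, 15), (3, 19), (4, 22), (5, 26), (6, 29), (7, 32)]
```

The same function with p0 = 0.05, p1 = 0.20 returns the three stopping rules of the later low-response scenario (0 of 14, 1 of 23, 2 of 32).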

22

Design Performance Characteristics

How does the proposed approach compare to the optimal Simon two-stage design?

What are performance characteristics we would be interested in?

 small E(N) under the null hypothesis

 frequent stopping under the null (similar to above)

 infrequent stopping under the alternative

 acceptance of H1 under H1

 acceptance of H0 under H0

23

Example 1: Simon Designs

H0: p = 0.20 vs. H1: p = 0.40. Power ≥ 90% and alpha ≤ 0.10.

Optimal Design:

Stage 1: N1 = 17, r1 = 3

Stage 2: N = 37, r = 10

Enroll 17 in stage 1. Stop if 3 or fewer responses.

If more than three responses, enroll to a total N of 37.

Reject H0 if more than 10 responses observed in 37 patients

Minimax Design:

Stage 1: N1 = 22, r1 = 4

Stage 2: N = 36, r = 10

Enroll 22 in stage 1. Stop if 4 or fewer responses.

If more than four responses, enroll to a total N of 36.

Reject H0 if more than 10 responses observed in 36 patients

24

Simon Optimal vs. Likelihood (N=37)

[Figure: operating characteristics vs. true p (0.1-0.6): accept HA, accept H0, and weak evidence for the likelihood design, overlaid with accept HA / accept H0 for the Simon optimal design]

25

Simon Minimax vs. Likelihood (N=36)

[Figure: operating characteristics vs. true p (0.1-0.6): accept HA, accept H0, and weak evidence for the likelihood design, overlaid with accept HA / accept H0 for the Simon minimax design]

26

Probability of Early Stopping

[Figure: probability of early stopping vs. true p (0.1-0.6) for likelihood (optimal N), likelihood (minimax N), Simon optimal, and Simon minimax]

27

Expected Sample Size

[Figure: expected sample size vs. true p (0.2-0.8) for likelihood (optimal N), likelihood (minimax N), Simon optimal, and Simon minimax]

28

Another scenario

Lower chance of success

H0: p = 0.05 vs. H1: p = 0.20

Now, only 3 criteria for stopping:

0 out of 14

1 out of 23

2 out of 32

29

Simon Designs

H0: p = 0.05 vs. H1: p = 0.20. Power ≥ 90% and alpha ≤ 0.10.

Optimal Design:

Stage 1: N1 = 12, r1 = 0

Stage 2: N = 37, r = 3

Enroll 12 in stage 1. Stop if 0 responses.

If at least one response, enroll to a total N of 37.

Reject H0 if more than 3 responses observed in 37 patients

Minimax Design:

Stage 1: N1 = 18, r1 = 0

Stage 2: N = 32, r = 3

Enroll 18 in stage 1. Stop if 0 responses.

If at least one response, enroll to a total N of 32.

Reject H0 if more than 3 responses observed in 32 patients

30

Simon Optimal vs. Likelihood (N=37)

[Figure: operating characteristics vs. true p (0.0-0.4): accept HA, accept H0, and weak evidence for the likelihood design, overlaid with accept HA / accept H0 for the Simon optimal design]

31

Simon Minimax vs. Likelihood (N=32)

[Figure: operating characteristics vs. true p (0.0-0.4): accept HA, accept H0, and weak evidence for the likelihood design, overlaid with accept HA / accept H0 for the Simon minimax design]

32

Probability of Early Stopping

[Figure: probability of early stopping vs. true p (0.00-0.30) for likelihood (optimal N), likelihood (minimax N), Simon optimal, and Simon minimax]

33

Expected Sample Size

[Figure: expected sample size vs. true p (0.0-0.6) for likelihood (optimal N), likelihood (minimax N), Simon optimal, and Simon minimax]

34

Last scenario

Higher chance of success

H0: p = 0.40 vs. H1: p = 0.60

Now, 21 criteria for stopping:

0 out of 6

1 out of 8

2 out of 10

3 out of 12

4 out of 14

5 out of 16

6 out of 18

...

20 out of 46

35

Simon Designs

H0: p = 0.40 vs. H1: p = 0.60. Power ≥ 90% and alpha ≤ 0.10.

Optimal Design:

Stage 1: N1 = 18, r1 = 7

Stage 2: N = 46, r = 22

Enroll 18 in stage 1. Stop if 7 or fewer responses.

If more than 7 responses, enroll to a total N of 46.

Reject H0 if more than 22 responses observed in 46 patients

Minimax Design:

Stage 1: N1 = 28, r1 = 11

Stage 2: N = 41, r = 20

Enroll 28 in stage 1. Stop if 11 or fewer responses.

If more than 11 responses, enroll to a total N of 41.

Reject H0 if more than 20 responses observed in 41 patients

36

Simon Optimal vs. Likelihood (N=46)

[Figure: operating characteristics vs. true p (0.2-0.8): accept HA, accept H0, and weak evidence for the likelihood design, overlaid with accept HA / accept H0 for the Simon optimal design]

37

Simon Minimax vs. Likelihood (N=41)

[Figure: operating characteristics vs. true p (0.2-0.8): accept HA, accept H0, and weak evidence for the likelihood design, overlaid with accept HA / accept H0 for the Simon minimax design]

38

Probability of Early Stopping

[Figure: probability of early stopping vs. true p (0.2-0.8) for likelihood (optimal N), likelihood (minimax N), Simon optimal, and Simon minimax]

39

Expected Sample Size

[Figure: expected sample size vs. true p (0.2-0.8) for likelihood (optimal N), likelihood (minimax N), Simon optimal, and Simon minimax]

40

More on the predictive probability approach

Lee and Liu, Clinical Trials, 2008.

Bayesian but without loss function (no Bayes risk)

Searches for design parameters to ensure size and power

The prior is chosen so that its mean equals the null response rate, but it is weak

Predictive probability (PP) = probability that the end result of the trial is positive given current data and data to be observed

Based on the probability that the true response rate is greater than the null response rate.

Again, ignores the alternative

41

More on the predictive probability approach

Stopping:

 if PP < θL: stop the trial and reject the alternative

 if PP > θU: stop the trial and reject the null (but often θU = 1)

At pre-specified times, the predictive probability is calculated

Lee and Liu explore different frequencies of stopping

Comparisons here are for looking after every patient

θT is defined as the threshold for determining efficacy at the trial’s end

θT and θU do not have the same stringency

42

Example of Predictive Probability Design

43

Comparison with Predictive Probability

Minimax Sample Size

44

Comparison with Predictive Probability

Optimal Sample Size

45

Summary and Conclusions (1)

Likelihood based stopping provides another option for trial design in phase II single arm studies

We only considered 1 value of K

 chosen to be comparable to the frequentist approach

 other values will lead to more/less conservative results

 extension: different K for early stopping versus the final go/no-go decision

Overall, sample size is smaller

 especially marked when you want to stop for futility

 when early stopping is not expected, not much difference in sample size

For ‘ambiguous’ cases:

 likelihood approach stops early more often than Simon

In minimax designs, finds ‘weak’ evidence frequently

46

Summary and Conclusions (2)

‘r’ for final analysis is generally smaller.

 why?

the notion of comparing hypotheses instead of conditioning only on the null.

Comparison to the PP approach is favorable

 likelihood stopping is less time consuming and less computationally intensive

LS does not require specification of a prior

“search” for designs is relatively simple

47

Thank you for your attention!

garrettm@musc.edu

48

Early Stopping in Phase II studies: time-to-event outcomes

Disease stabilization

More common with novel treatments, targeted therapies

Example: targeting stem cells

If treatment works, cancer does not progress

But, “bulk” may still remain

Time-to-progression is relevant outcome

But, takes a long time to evaluate…

49

One suggested/common approach

Apply Simon’s two-stage design

Example:

1-year PFS of 0.30 versus 0.50 (α = β = 0.10)

Enroll 20 patients

If 6 or more are PF at 1 year, enroll an additional 22 for a total of 42 patients.

Study design

Assume trial will take 2 years to accrue (21 patients per year)

First 20 patients will be enrolled by end of year 1

The 20th patient should be evaluable for 1-year PFS at the end of year 2.

50

One suggested/common approach

So, what’s the problem?

Problem 1: By the end of year 2, almost all of the additional 22 patients will have been enrolled, yet the stage 1 patients have just become evaluable.

Problem 2: if the trial needs to be suspended after 20 patients (to wait for events), investigators may need to stop enrollment for 1 year.

51

Current approaches for early stopping with TTE outcomes

Bayesian approaches

(Thall et al., 2005)

Frequentist approaches

(Case and Morgan, 2003)

Ad hoc approaches

Use related outcome (e.g., clinical response)

Spend a little alpha early and evaluate:

At a prespecified time

When a prespecified number of patients have reached a landmark time (e.g. 1 year)

When a prespecified number of patients have been enrolled

52

Early stopping in phase II study: time to event outcome

Motivating Example

Single arm

Time-to-event outcome

Early stopping for futility

Standard frequentist approach

Non-parametric (i.e., no model)

Kaplan-Meier estimate of 6 mo. PFS

“Robust”, but not powerful!

Likelihood approach

Requires a parametric model (like the Bayesians!)

53

Model Choice Considerations

Trade-off: one-parameter vs. multi-parameter models

Parsimony versus fit

Bias versus variance

Small amount of data: cannot tolerate many parameters

Exponential (one-parameter) obvious choice

Some other options:

Weibull

Log-normal

Cure-rate

54

Critical Issue

Decision to stop must be robust to model misspecification

“Robustifying” likelihood

(Royall and Tsou, 2003)

Not appropriate here: exponential with censoring does not meet criteria

Further study needed to see early stopping behavior when model is misspecified

55

Exponential Model and Likelihood

probability density : f t

   e

  t survival function : ( )

 e

  t

Log - likelihood function : L

 t d

Maximum likelihood estimate:

  

 d i t i

 i

 N

1 d i log

  t i

56

Simulations

Need comparability across distributions of simulated data

Chose underlying distributions with same 6 month survival

Exponential

Weibull: one with larger variance, one with smaller

Log-normal: one with larger variance, one with smaller

Cure-rate

Working model: exponential distribution

Simulations: data generated assuming treatment lacks efficacy

57

Comparison of underlying distributions

[Figure: six panels, one per true distribution (black) with its best-fitting exponential (red); panels: Exponential (0.08), Weibull (1/10, 1.43), Weibull (1/17, 0.7), Log-Normal (2, 0.69), Log-Normal (2.4, 2.0), Cure Rate (0.13, 30%); x-axes: time (months), 0-35]

58

Simulation study 1

Null hypothesis is true: should stop early for futility in large fraction of trials

Three ways to characterize hypotheses:

H0: 6 mo PFS = 62% vs. H1: 6 mo PFS = 74%

H0: E(t) = 12.5 mo vs. H1: E(t) = 20 mo

H0: λ = 0.08 vs. H1: λ = 0.05

N = 100

Starting with the 25th patient, analyze the data after every 5th enrollment

Censoring is assumed to be administrative

24 months of enrollment (assuming no early stopping)

Total study time 36 months (24-month accrual, 12-month follow-up)

Use likelihood intervals of 1/10

59

Stopping Rule

Stop if the likelihood ratio < 1/10

That is, if the ratio of the likelihood for the NULL to the ALTERNATIVE is at least 10, then stop.

Note 1: ONLY considering stopping for futility!

Note 2: based on universal bound, we have a less than 10% chance of strong evidence in favor of the wrong hypothesis

Note 3: based on Royall (2000), probably have even less than that….
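A sketch of this rule for the exponential model (the interim data below are hypothetical): the likelihood ratio depends on the data only through the number of events d and the total follow-up time Σ ti.

```python
from math import exp, log

def lr_alt_vs_null(d, total_time, lam0=0.08, lam1=0.05):
    """L1/L0 for exponential data with right censoring:
    d = number of events, total_time = sum of observed times.
    Stop for futility when this ratio drops below 1/10."""
    return exp(d * log(lam1 / lam0) + (lam0 - lam1) * total_time)

# Hypothetical interim data: 20 progressions in 150 patient-months
lr = lr_alt_vs_null(20, 150.0)
print(lr < 1 / 10)  # True: stop for futility
```

With fewer events relative to follow-up, e.g. 10 events in the same 150 patient-months, the ratio stays above 1/10 and the trial continues.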

60

Simulated Data Examples

[Figure: two simulated trial examples: one with no stopping, one stopping at N=55; x-axes: number of patients enrolled, 25-95]

61

Frequentist Properties of Simulation Study

N=100, H0: λ = 0.08 vs. H1: λ = 0.05

Using exponential test and assuming exponential data:

Alpha = 5%

Power = 98%

Using non-parametric test, and assuming exponential data:

Alpha = 5%

Power = 78%

No interim analyses included

62

Why not look before 25 patients?

End of month | Total enrolled | ≥1 mo f.u. | ≥2 mo f.u. | ≥3 mo f.u. | ≥4 mo f.u. | ≥5 mo f.u. | ≥6 mo f.u.
1 | 4 | 0 | 0 | 0 | 0 | 0 | 0
2 | 8 | 4 | 0 | 0 | 0 | 0 | 0
3 | 12 | 8 | 4 | 0 | 0 | 0 | 0
4 | 16 | 12 | 8 | 4 | 0 | 0 | 0
5 | 21 | 16 | 12 | 8 | 4 | 0 | 0
6 | 25 | 21 | 16 | 12 | 8 | 4 | 0

63

Simulations

[Figure: six panels, one per true distribution: Exponential (0.08), Weibull (1/10, 1.43), Weibull (1/17, 0.7), Log-Normal (2, 0.69), Log-Normal (2.4, 2.0), Cure Rate (0.13, 30%); solid black: true distribution; dashed: hypotheses; blue: 12-month estimate; red: 60-month estimate; x-axes: time (months), 0-35]

64


Early Stopping

[Figure: histograms of total sample size (25-95) across simulated trials for each true distribution]

Distribution | Median N | % stopped early
Exponential | 60 | 87
Weibull 1 | 85 | 64
Weibull 2 | 35 | 97
Log-Normal 1 | 60 | 99
Log-Normal 2 | 55 | 62
Cure Rate 1 | 100 | 4

65

Likelihood ratios at the end of the trial (proportion of simulated trials in each range):

Distribution | <1/32 | [1/32, 1/10) | [1/10, 1) | [1, 10) | [10, 32) | >32
Exponential* | 0.20 | 0.76 | 0.03 | 0.01 | <0.01 | <0.01
Weibull 1 | 0.47 | 0.53 | <0.01 | <0.01 | <0.01 | <0.01
Log-Normal 1 | 0.27 | 0.73 | <0.01 | <0.01 | 0.01 | <0.01
Cure Rate | <0.01 | 0.04 | <0.01 | <0.01 | <0.01 | 0.96
Weibull 2 | 0.18 | 0.80 | 0.01 | <0.01 | <0.01 | 0.01
Log-Normal 2 | 0.06 | 0.55 | <0.01 | <0.01 | 0.01 | 0.37

66

Frequentist Approach: Exponential Data

Based on observed data (stopped and completed trials)

0.55% of trials showed a significant p-value (versus 0.45% with LR > 10)

Agreement of 99.6% for hypothesis testing decision

High agreement in inferences

67

Additional simulations

Early stopping is critical when we have a rate that is even WORSE than the null

Example:

We are testing 62% vs. 74% 6 month PFS

What if true 6 month PFS based on our regimen is only 55%? Or 49%?

What is the chance of early stopping in these cases?

Simple scenario: exponential data, exponential model

68

Early Stopping:

H0: 6 mo PFS = 62% vs. H1: 6 mo PFS = 74%

[Figure: histograms of total sample size (25-95). True 6 mo PFS = 55%: median N = 40, 99.8% stopped early. True 6 mo PFS = 49%: median N = 35, 99.9% stopped early.]

69


Likelihood Ratios

Proportion of simulated trials in each likelihood-ratio range:

True 6 mo PFS | <1/32 | [1/32, 1/10) | [1/10, 1) | [1, 10) | [10, 32) | >32
55% | 0.19 | 0.81 | <0.01 | <0.01 | <0.01 | <0.01
49% | 0.26 | 0.74 | <0.01 | <0.01 | <0.01 | <0.01

70

Summary and Conclusions (2)

Properties are consistent with what we expected

When we have exponential data and k=10:

We stop early OFTEN when we should

We RARELY stop early when we shouldn’t

But, we need to be careful…

We need a full understanding of the expected and observed survival distribution

If we have model misspecification, we could run into trouble

Not unrealistic: in breast cancer, a cure-rate model might be best-fitting

Quantifying simply by one point in time (e.g., 6 month PFS) could be dangerous

Should elicit PFS at several time points from the clinical investigator

71

Summary and Conclusions (3)

This is the perfect example of why we need to work in close collaboration with oncologists

Need to get a good appreciation for the anticipated distribution

Early stopping should be carefully considered based on observed data

Implementation issues

Probably will not be able to do this in an “off-the-shelf” way

High-maintenance for the statistician

Better for patients

Better for Cancer Center (resources)

72

Future work in TTE

Feasibility of 2-parameter models

In practice, can we fix one parameter?

Preliminary data should give us a sense of the shape

Interval censoring

Different censoring mechanisms

Larger deviations from exponential (how common?)

Looks: when to start and how often?

Study design guidelines (e.g. sample size)

73

References

Case and Morgan (2003) Design of Phase II cancer trials evaluating survival probabilities. BMC Medical Research Methodology, v. 3.

Birnbaum (1962) On the Foundations of Statistical Inference (with discussion). JASA, 53, 259-326.

Blume (2002) Likelihood Methods for Measuring Statistical Evidence. Statistics in Medicine, 21, 2563-2599.

Hacking (1965) Logic of Statistical Inference. New York: Cambridge Univ Press.

Royall (1997) Statistical Evidence: A Likelihood Paradigm. London: Chapman & Hall.

Royall (2000) On the Probability of Misleading Statistical Evidence. JASA, 95, 760-768.

Royall and Tsou (2003) Interpreting statistical evidence by using imperfect models: robust adjusted likelihood functions. JRSS-B, 65(2), 391-404.

Simon (1989) Optimal Two-Stage Designs for Phase II Clinical Trials. Controlled Clinical Trials, 10, 1-10.

Smith (1953) The Detection of Linkage in Human Genetics. JRSS-B, 15, 153-192.

Thall, Wooten and Tannir (2005) Monitoring event times in early phase clinical trials: some practical issues. Clinical Trials, 2, 467-478.

74
