# Journal of Bioanalysis & Biostatistics

**Research Article**

**Modeling Biomarker-Informed Adaptive Design**

**Jing Wang**^{3,4*}, Mark Chang^{1,2} and Shein-Chung Chow^{4}


^{1}Veristat, Southborough, USA

^{2}Department of Biostatistics, Boston University, USA

^{3}Gilead Sciences, Boston University, USA

^{4}Department of Biostatistics, Duke University, USA

**^{*}Address for Correspondence:** Jing Wang, Department of Biostatistics, Boston University, 303 Velocity Way, Foster City, CA 94404, USA; E-mail: jing.wang@gilead.com

**Citation:** Chang M, Wang J, Chow SC. Modeling Biomarker-Informed Adaptive Design. J Bioanal Biostat 2017;2(1): 6.

**Copyright:**© 2017 Chang M, et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Journal of Bioanalysis and Biostatistics | Volume: 1, Issue: 1

Submission: 6 April, 2017 | Accepted: 28 May, 2017 | Published: 5 June, 2017

## Abstract

Adaptive clinical trial designs incorporating biomarkers have gained much attention because of their potential benefits: shorter trial duration, smaller study sizes, higher probability of trial success, enhancement of the benefit-risk relationship, and mitigation of ever-escalating development costs. In the planning of a biomarker-informed adaptive design, it is important to perform clinical trial simulations in order to understand the operating characteristics of the design. This manuscript is concerned with simulating the trial data for a biomarker-informed adaptive design that uses a biomarker/surrogate endpoint for interim treatment selection.

We demonstrated that the correlation between the biomarker and the primary endpoint alone is not sufficient to describe their relationship. Instead, modeling the relationship between the biomarker and the primary endpoint is necessary, as we demonstrated using an example of a non-small-cell lung cancer trial, and we presented an alternative hierarchical model for the two endpoints. We studied how each parameter in the new model affects the power of a biomarker-informed two-stage winner design and proposed methods for estimating the parameters. R code for application of the new methodology is provided.

## Background

With the surge in advanced technology, especially in the “OMICS” space (e.g., genomics, proteomics, etc.), adaptive clinical trial designs that incorporate biomarker information have attracted significant attention.

Biomarkers are measurable biological indicators of the status of an organism in a particular health condition or disease state Chen et al. [1]. In drug development, biomarkers can be classified into four categories: prognostic biomarkers, predictive biomarkers, pharmacodynamic biomarkers, and surrogate endpoints BDWG, Lassere MN, Wang SJ [2-4].

Prognostic biomarkers identify patients with differing risks of an overall outcome of disease, regardless of treatment. Predictive biomarkers predict the likelihood of a patient’s response to a particular treatment. Pharmacodynamic biomarkers indicate the drug effect on the target in an organism; they are often used in earlier phases of drug development to demonstrate drug activity, to provide information on likely clinical benefit, and to support go/no-go decisions. A surrogate endpoint is a measure of the effect of a treatment that correlates well with a clinical endpoint Jenkins M, Buyse M [5,6]. Surrogate endpoints are mostly used as biomarkers intended to substitute for clinical endpoints, allowing faster and more sensitive evaluation of treatment effects.

Many types of adaptive clinical trial designs incorporating biomarkers have been proposed and discussed, including the biomarker-enrichment designs that use predictive biomarkers for interim study population selection (Freidlin et al., Jiang et al., Freidlin et al., Zhou et al. and Lee et al.) and the biomarker-informed adaptive designs that use surrogate endpoints for interim treatment selection (Todd and Stallard, Stallard, Shun et al., Di Scala and Glimm, Friede et al.) [7-16]. Focused clinical trials using a biomarker strategy have the potential to result in shorter trial duration, smaller study sizes, higher probability of trial success, enhancement of the benefit–risk relationship, and mitigation of ever-escalating development costs.

In the planning of a phase III or a phase II/III study that uses biomarker-informed adaptive procedures, a biomarker or set of biomarkers needs to be available and to have been well studied first in the phase II development stage or in other validation studies. Furthermore, as suggested in the FDA guidance on adaptive design clinical trials for drugs and biologics and in Chow and Chang, it is important to perform clinical trial simulations before conducting the study in order to evaluate the multiple trial design options and clinical scenarios that might occur when the study is actually conducted and to assess the operating characteristics of the design, including the sample size required for a target power [17,18].

In general, clinical trial simulations rely on a statistical model to generate the trial data. This manuscript is concerned with generating trial data for a biomarker-informed adaptive design that uses a biomarker/surrogate endpoint for interim treatment selection, that is, the statistical model for the relationship between biomarker and primary endpoint.

Friede et al. proposed a simulation model based on standardized test statistics that allows the generation of biomarker-informed adaptive trials [16]. The test statistics of the trial were simulated directly instead of trial data. To simulate individual patient data for the trial, on the other hand, the conventional statistical model used is a one-level correlation model. For example, when both endpoints follow normal distributions, Shun et al. used a bivariate normal distribution to model the biomarker and primary endpoint [14]. Wang et al. showed that the bivariate normal model, which only considers the individual-level correlation between the two endpoints, is inappropriate when little is known about how the means of the two endpoints are related [19]. Wang et al. further proposed a two-level correlation model (individual-level correlation and mean-level correlation) to describe the relationship between biomarker and primary endpoint [19]. The two-level correlation model incorporates a new variable that describes the mean-level correlation between the two endpoints. The new variable, together with its distribution, reflects the uncertainty about the mean-level relationship between the two endpoints due to the small sample size of historical data. It was shown that the two-level correlation model is a better choice for modeling the two endpoints.

In this manuscript, we demonstrate the necessity of considering the uncertainty about the mean level relationship between biomarker and primary endpoint using an example of non-small-cell lung cancer trial, and present an alternative hierarchical model for the relationship between biomarker and primary endpoint in Section 2 [20]. We investigate how each parameter in the hierarchical model affects the power of a biomarker informed two-stage winner design in Section 3 and discuss methods to estimate the parameters in the hierarchical model in Section 4. Conclusions are drawn in Section 5.

### A non-small-cell lung cancer trial that uses biomarker informed two-stage winner design.

For simplicity, we present our discussions and results in the context of a “biomarker informed two-stage winner design”. However, the proposed model and the conclusions drawn could be extended to other biomarker-informed adaptive designs that use a biomarker/surrogate endpoint for interim treatment selection.

A “biomarker informed two-stage winner design” Shun et al. combines a phase II and a phase III study [14]. It starts with several active treatment arms and a control arm with a planned interim analysis on biomarker. At interim, the inferior arms will be terminated based upon results of biomarker by ranking of observations, and only the most promising treatment (“winner”) will be retained and carried to the end of the study with the control arm. The final comparison between the winner arm and the control arm will be performed on data from both stages and on study primary endpoint. This design has the potential to shorten the duration of the trial for drug development and can be cost effective.

In this section, we demonstrate the necessity of considering the uncertainty about the mean level relationship between biomarker and primary endpoint by considering an example of non-small-cell lung cancer trial that uses biomarker informed two-stage winner design.

In cancer trials, early tumor size reduction allows early assessment of the activity of an experimental regimen, and can serve as an early biomarker for survival prediction and assist in early drug development decisions.

Wang et al. quantified the relationship between early tumor size reduction and patient survival in non-small-cell lung cancer patients, and developed a parametric model for survival times, utilizing data from four non-small-cell lung cancer registration trials [20]. The parametric survival model proposed includes baseline tumor size (centered at 8.5 cm), ECOG status (0/1/2/3 as a categorical variable) and percentage tumor reduction from baseline at week 8 ($PTR_{wk8}$) as predictors of time to death ($T$). The regression model is written as follows:

$$\log(T) = \alpha_0 + \alpha_1 \times \text{ECOG} + \alpha_2 \times (\text{Baseline} - 8.5) + \alpha_3 \times PTR_{wk8} + \varepsilon_{TD}$$

where $T$ is the time to death (day), $\alpha_0$ is the intercept, $\alpha_1$, $\alpha_2$ and $\alpha_3$ are the slopes for ECOG, centered baseline, and $PTR_{wk8}$, respectively, and $\varepsilon_{TD}$ is the residual variability following a normal distribution with a mean of 0 and variance of $\sigma_{TD}^2$. These models have been shown to have reasonably good predictive ability.

Given the above historical information, we evaluate the performance of a non-small-cell lung cancer trial with 3 experimental treatments and a control arm using the biomarker-informed two-stage winner strategy. It is expected that, in a biomarker-informed two-stage winner design, the more closely the biomarker and the primary endpoint are correlated, the better the performance of the design would be. Treating the individual-level correlation $\rho = Corr(PTR_{wk8}, \log(T))$ as the only measure of the relationship between biomarker and primary endpoint, we simulate the power of the design for different values of $\rho$.

Some assumptions for the design are as follows.

Assume the expected mean survival times for patients in the 3 active treatment arms are 8 months, 10 months and 12 months, respectively, and the expected mean survival time for patients in the control arm is around 6 months. For simplicity, we assume the patients to be enrolled in the study share the same baseline characteristics, with baseline tumor size 8.5 cm and ECOG status 1.

We consider the two-stage winner design with maximum sample size $N = 86$ for each treatment group, and the interim analysis is planned at information time 0.5 (that is, the interim sample size is $n_1 = 43$ per group). The same total sample size will yield 99% power for a non-adaptive design with three active treatment arms and a control arm with the family-wise error rate controlled at 0.05. (The sample size is chosen to ensure that the sample correlation coefficient in our simulations is not significantly different from the theoretical value.)

Two sets of measurements will be obtained for the $i$-th person in the $j$-th treatment group: the percentage tumor reduction from baseline at week 8 ($PTR_{wk8}$) and the time to death. Here $j = 0, 1, 2, 3$, where $j = 0$ represents the control group and $j = 1, 2, 3$ the 3 active treatment groups. For simplicity, censoring is not considered in our context. For each treatment group $j$, we compute the mean of the tumor reduction measurements at interim and the mean of the survival measurements at final.

At interim, if the mean tumor reduction in active treatment group $j$ is the largest among the active treatment groups, we select treatment $j$ as the most effective treatment and carry only treatment group $j$ and the control group to the end of the study. At the final stage, we compare the “winner” arm with the control arm on log survival time using a t-test.

Details on how to simulate the data that satisfies the models (1) and (2) while preserving the correlation coefficient ρ can be found in Appendix 1. Type I error rate of the trial is preserved at 0.025 level by adjusting critical rejection values of the final test statistic of the design Wang et al. [21].

Our simulations show that even for ρ = 0.1 (with average sample correlation coefficient around 0.05), the considered two-stage winner design has power over 95%, which violates the presumption that a biomarker informed two-stage winner design should have a better performance when the interim and final endpoints have a stronger correlation. Further, it suggests that the individual level correlation alone is not sufficient to describe the relationship between biomarker and primary endpoint.

The model we considered, with individual-level correlation alone, treats the mean responses of the two endpoints as fixed. In our context,

$$u_T = 5.57 + 0.42\,u_{PTR} \quad \text{for the treatment groups, and}$$

$$u_T = 5.42 + 0.38\,u_{PTR} \quad \text{for the placebo group.}$$

Therefore, once $u_T$ is assumed, $u_{PTR}$ is a fixed number, and a larger value of $u_T$ corresponds to a larger value of $u_{PTR}$. The power of the design is high even for small values of $\rho$ because the same rank order of the mean responses of the two endpoints is always preserved.

However, it is not true that the mean responses of the two endpoints always have the same rank order across treatment groups. When the parameters in the regression model are estimated, the estimates of $\alpha_0$, $\alpha_1$, $\alpha_2$ and $\alpha_3$ come with variance. The uncertainty of these estimates, which corresponds to the uncertainty of the mean-level relationship, should be considered when describing the relationship between biomarker and primary endpoint. In other words, instead of a fixed effects model, a random effects model should be used to describe the relationship between biomarker and primary endpoint.
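The fixed mean-level relations quoted above can be evaluated numerically. The following is a minimal sketch, assuming 30-day months and treating the log of the expected survival time as the mean of log(T); both are simplifications that are not stated in the text, and the function names are hypothetical.

```python
import math

# Mean-level relations quoted in the text (baseline tumor size 8.5 cm, ECOG 1).
TREATMENT = (5.57, 0.42)  # u_T = 5.57 + 0.42 * u_PTR
PLACEBO = (5.42, 0.38)    # u_T = 5.42 + 0.38 * u_PTR

DAYS_PER_MONTH = 30  # assumption, not stated in the text


def mean_log_survival(months):
    """Approximate mean of log(T), T in days, for a mean survival time in months."""
    return math.log(months * DAYS_PER_MONTH)


def implied_mean_ptr(u_t, intercept, slope):
    """Invert u_T = intercept + slope * u_PTR to get u_PTR."""
    return (u_t - intercept) / slope


active_months = [8, 10, 12]
u_t_active = [mean_log_survival(m) for m in active_months]
u_ptr_active = [implied_mean_ptr(u, *TREATMENT) for u in u_t_active]

for m, u_t, u_ptr in zip(active_months, u_t_active, u_ptr_active):
    print(f"{m} months: u_T = {u_t:.2f}, u_PTR = {u_ptr:.2f}")
```

Under these assumptions the implied biomarker means are about -0.21, 0.32 and 0.75, in the same rank order as the survival means. Because the relation is deterministic, the rank order can never be reversed, which is the limitation the random effects formulation is meant to address.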

For the case when both biomarker and primary endpoint follow normal distributions, a hierarchical (multilevel) model (MEM) has been proposed as an alternative to the two-level correlation model of Wang et al. In this model, $X_{ij}$ is the measurement of the biomarker for the $i$-th person in the $j$-th treatment group, $Y_{ij}$ is the corresponding measurement of the primary endpoint, and $\rho$ and $\rho_u$ are the common correlations between biomarker and primary endpoint at the individual and mean levels, respectively [21,22].

### Biomarker-informed two-stage winner design using the hierarchical model
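For orientation, a two-level normal model consistent with the description of the hierarchical model can be sketched as follows. This is an illustration under assumed notation, not a reproduction of the published formula: the arm-level random means $\mu_{xj}, \mu_{yj}$, their expectations $u_{xj}, u_{yj}$, and the per-endpoint standard deviations are introduced here only for the sketch, and the exact parameterization in the cited work may differ.

```latex
\begin{aligned}
\begin{pmatrix} X_{ij} \\ Y_{ij} \end{pmatrix}
\Bigm| \begin{pmatrix} \mu_{xj} \\ \mu_{yj} \end{pmatrix}
&\sim N\!\left( \begin{pmatrix} \mu_{xj} \\ \mu_{yj} \end{pmatrix},\ \Sigma \right),
&\quad
\Sigma &= \begin{pmatrix}
\sigma_x^2 & \rho\,\sigma_x \sigma_y \\
\rho\,\sigma_x \sigma_y & \sigma_y^2
\end{pmatrix}, \\[6pt]
\begin{pmatrix} \mu_{xj} \\ \mu_{yj} \end{pmatrix}
&\sim N\!\left( \begin{pmatrix} u_{xj} \\ u_{yj} \end{pmatrix},\ \Sigma_m \right),
&\quad
\Sigma_m &= \begin{pmatrix}
\sigma_{ux}^2 & \rho_u\,\sigma_{ux} \sigma_{uy} \\
\rho_u\,\sigma_{ux} \sigma_{uy} & \sigma_{uy}^2
\end{pmatrix}.
\end{aligned}
```

In this form, $\rho$ governs how an individual's biomarker and endpoint values move together within an arm, while $\rho_u$ and the mean-level standard deviations govern how reliably an arm that looks best on the biomarker is also best on the primary endpoint.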

To construct clinical trial simulations for a biomarker-informed two-stage winner design using the hierarchical model, the steps below could be followed:

1. Draw the first-stage data ($N_1$ samples per active arm) from the hierarchical model.
2. Perform the interim analysis based on biomarker $X$ and determine the winner based on the best response in $X$.
3. Draw additional $N_2 = N - N_1$ samples of the primary endpoint $Y$ from the normal distribution in the winner arm $w$ and $N$ samples of $Y$ for the placebo.
4. Test the hypothesis based on the primary endpoint $Y$ at the final analysis, which will be based on data of the winner arm from the two stages and all the data of $Y$ from placebo.

An R function for simulating the power of a biomarker-informed two-stage winner design with the hierarchical model can be found in Appendix 2. By specifying the null hypothesis and the worst-case scenario $\rho_u = \rho = 1$, this R function could also be used to determine the critical value of the test statistic that controls the type I error. The required sample size for the design could be obtained by invoking this R function with $H_a$ and the target power specified.

Simulation studies that investigate how each parameter in this hierarchical model affects the power of the design have been carried out. For the purpose of simulation, we borrow the data from the above non-small-cell lung cancer example and consider a control and three active arms with responses in the primary endpoint of 5.42, 5.48, 5.70, and 5.88, respectively. The responses in the biomarker are -0.21, 0.32 and 0.75 for the three active arms, respectively.
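The simulation steps listed above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the R function of Appendix 2: both endpoints share one standard deviation at each level (`sigma`, `sigma_u`), the control mean is treated as fixed, a pooled-variance t statistic is compared with the critical value, and the values chosen for `sigma`, `rho`, `sigma_u` and `rho_u` are purely illustrative.

```python
import math
import random
import statistics


def draw_bivariate(rng, mean_x, mean_y, sd_x, sd_y, rho):
    """Draw one (x, y) pair from a bivariate normal with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x = mean_x + sd_x * z1
    y = mean_y + sd_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x, y


def simulate_trial(rng, u_x, u_y, u_y_control, n, n1,
                   sigma, rho, sigma_u, rho_u, crit):
    """Simulate one two-stage winner trial; True if the final test rejects."""
    # Mean level: each active arm's true means vary around their expectations.
    arm_means = [draw_bivariate(rng, ux, uy, sigma_u, sigma_u, rho_u)
                 for ux, uy in zip(u_x, u_y)]
    # Stage 1: n1 (X, Y) pairs per active arm.
    stage1 = [[draw_bivariate(rng, mx, my, sigma, sigma, rho) for _ in range(n1)]
              for mx, my in arm_means]
    # Interim: the winner is the arm with the largest mean biomarker response.
    x_means = [statistics.fmean(x for x, _ in arm) for arm in stage1]
    w = max(range(len(x_means)), key=x_means.__getitem__)
    # Stage 2: n - n1 further Y values for the winner, n Y values for control.
    y_w = [y for _, y in stage1[w]]
    y_w += [rng.gauss(arm_means[w][1], sigma) for _ in range(n - n1)]
    y_c = [rng.gauss(u_y_control, sigma) for _ in range(n)]
    # Final analysis: pooled-variance two-sample t statistic on Y.
    pooled = (statistics.variance(y_w) + statistics.variance(y_c)) / 2.0
    t = (statistics.fmean(y_w) - statistics.fmean(y_c)) / math.sqrt(pooled * 2.0 / n)
    return t > crit


def simulate_power(n_sims=1000, seed=1, **design):
    """Fraction of simulated trials in which the final test rejects."""
    rng = random.Random(seed)
    return sum(simulate_trial(rng, **design) for _ in range(n_sims)) / n_sims


# Arm means and critical value from the text; the remaining values are assumed.
power = simulate_power(
    u_x=[-0.21, 0.32, 0.75], u_y=[5.48, 5.70, 5.88], u_y_control=5.42,
    n=86, n1=43, sigma=1.0, rho=0.5, sigma_u=0.2, rho_u=0.5, crit=2.4)
print(f"estimated power: {power:.3f}")
```

Rerunning `simulate_power` over a grid of `rho_u` and `sigma_u` values, and separately over `rho` and `sigma`, is one way to explore how sensitive the power is to the mean-level versus individual-level parameters.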

The critical value for the final test statistic that controls the type I error at 0.025 could be obtained by simulation and is equal to 2.4 in our case. The simulation results for power are summarized in Table 1. We can see that the mean-level parameters $\rho_u$ and $\sigma_u$ impact the power significantly, while $\rho$ and $\sigma$ only have a mild impact on power.

To estimate each parameter in the hierarchical model using historical data, the maximum likelihood method and Bayesian inference could be considered.

Assume that, for each treatment group $j$, a set of historical data on the biomarker and the primary endpoint is available.

The maximum likelihood estimator for each parameter could be obtained by maximizing the likelihood function. However, since there is no closed-form solution for each parameter, numerical iterative methods should be applied in order to obtain the values of the estimators.

Bayesian inference is an easier option in our case when an appropriate prior distribution is chosen. The Normal-inverse-Wishart distribution is a multivariate four-parameter family of continuous probability distributions. It is the conjugate prior of a multivariate normal distribution with unknown mean and covariance matrix.

Given the above, we propose to estimate the parameters in the hierarchical model as follows:

We estimate $\Sigma$ by calculating the sample covariance matrix between biomarker and primary endpoint for the pooled historical data, that is, $\Sigma = Cov(x_i, y_i)$. Both Chang and Wang et al. have shown that the individual-level variance does not have a significant impact on the power of a biomarker-informed adaptive design; therefore, errors in estimating $\Sigma$ are not serious [21,22].

We estimate the mean-level parameters in the $j$-th treatment group by taking the means of the posterior distribution of the parameters. Note that, instead of estimating a common $\Sigma_m$, we estimate $\Sigma_{mj}$ for each treatment group $j$. We feel that differentiating the mean-level covariance across treatment groups would benefit further simulations. But if a common mean-level covariance matrix is believed, $\Sigma_m$ could be estimated by taking the weighted mean of the $\Sigma_{mj}$ with weights $N_j$.

In a Normal-inverse-Wishart distribution $NIW(u_0, k_0, A_0, v_0)$, $u_0$ defines the mean, $A_0$ defines the covariance, and the two scalar values $k_0$ and $v_0$ define how confident we are in the estimation of the first two parameters, respectively. In order to specify a relatively noninformative prior, we want both $k_0$ and $v_0$ to be low, for example $k_0 = 0.001$ and $v_0 = 0.3$.

We developed R code for estimation of the parameters; see Appendix 3.
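The conjugate update behind this kind of estimation can be sketched as follows. This is a Python illustration of the standard Normal-inverse-Wishart posterior for bivariate normal data, not the R code of Appendix 3; the function names, the example data, and the prior values chosen for $u_0$ and $A_0$ are hypothetical, and how the posterior means map onto the hierarchical model's parameters is left to the original appendix.

```python
def niw_posterior(data, u0, k0, a0, v0):
    """Posterior NIW parameters for bivariate data under a NIW(u0, k0, A0, v0) prior.

    data: list of (x, y) pairs; u0: prior mean (length 2); a0: 2x2 prior scale.
    Returns (u_n, k_n, A_n, v_n) from the standard conjugate update.
    """
    n = len(data)
    mean = [sum(p[d] for p in data) / n for d in range(2)]
    # Scatter matrix S = sum_i (p_i - mean)(p_i - mean)^T.
    scatter = [[sum((p[r] - mean[r]) * (p[c] - mean[c]) for p in data)
                for c in range(2)] for r in range(2)]
    k_n = k0 + n
    v_n = v0 + n
    # Posterior mean location: precision-weighted average of prior and sample mean.
    u_n = [(k0 * u0[d] + n * mean[d]) / k_n for d in range(2)]
    shrink = k0 * n / k_n
    a_n = [[a0[r][c] + scatter[r][c]
            + shrink * (mean[r] - u0[r]) * (mean[c] - u0[c])
            for c in range(2)] for r in range(2)]
    return u_n, k_n, a_n, v_n


def posterior_means(data, u0, k0, a0, v0):
    """Posterior means of the mean vector and of the covariance matrix."""
    u_n, _, a_n, v_n = niw_posterior(data, u0, k0, a0, v0)
    dim = 2
    denom = v_n - dim - 1  # E[Sigma | data] = A_n / (v_n - dim - 1)
    if denom <= 0:
        raise ValueError("posterior mean of the covariance needs v0 + n > 3")
    cov = [[a_n[r][c] / denom for c in range(dim)] for r in range(dim)]
    return u_n, cov


# Hypothetical historical (biomarker, endpoint) pairs for one treatment group.
historical = [(0.1, 5.5), (0.4, 5.8), (-0.2, 5.3), (0.6, 5.9), (0.3, 5.6)]
mean_hat, cov_hat = posterior_means(
    historical, u0=[0.0, 0.0], k0=0.001, a0=[[1.0, 0.0], [0.0, 1.0]], v0=0.3)
print(mean_hat, cov_hat)
```

With $k_0$ as small as 0.001 the posterior mean vector is essentially the sample mean, which illustrates why low values of $k_0$ and $v_0$ make the prior relatively noninformative.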

## Summary

In this manuscript, we demonstrated the necessity of considering the mean-level uncertainty between biomarker and primary endpoint in a biomarker-informed adaptive design using the example of a non-small-cell lung cancer trial. We presented a hierarchical multilevel model for modeling the two endpoints and studied how each parameter in the model affects the power. Estimators for the parameters in the hierarchical model were proposed for the case in which both endpoints follow normal distributions, and an R function for calculating the estimators was developed.

For simplicity, our discussions and results were presented in the context of a “biomarker informed two-stage winner design”. However, the proposed model and the conclusions could be extended to other biomarker-informed adaptive designs that use a biomarker/surrogate endpoint for interim treatment selection.

In the biomarker-informed two-stage winner design we considered, the interim information time used was 0.5. An earlier interim time is possible but not recommended, because making critical decisions based on a biomarker with a limited number of subjects at interim might be a regulatory concern. We also considered only one biomarker for interim treatment selection; for the case where multiple biomarkers are available and collinearity is present, future research is needed.

In estimating the parameters, the Normal-inverse-Wishart distribution was chosen as the prior distribution because it is conjugate to a Gaussian likelihood: the posterior distribution is very easy to sample from, and its mean can be computed analytically. Gelman argued that Normal-inverse-Wishart priors do not have good non-informative properties [24]. An approach to overcome this drawback is to assign a prior to each parameter (the standard deviation and correlation parameters) in the covariance matrix $\Sigma_m$ separately, as in Huang and Wand, which would be an interesting topic for future studies [25].

## References

1. Chen JJ, Lu TP, Chen DT, Wang SJ (2014) Biomarker adaptive designs in clinical trials. Transl Cancer Res 3: 279-292.
2. Biomarkers Definitions Working Group (2001) Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clin Pharmacol Ther 69: 89-95.
3. Lassere MN, Johnson KR, Boers M, Tugwell P, Brooks P, et al. (2007) Definitions and validation criteria for biomarkers and surrogate endpoints: development and testing of a quantitative hierarchical levels of evidence schema. J Rheumatol 34: 607-615.
4. Wang SJ (2007) Biomarker as a classifier in pharmacogenomics clinical trials: a tribute to 30th anniversary of PSI. Pharm Stat 6: 283-296.
5. Jenkins M, Flynn A, Smart T, Harbron C, Sabin T, et al. (2011) A statistician’s perspective on biomarkers in drug development. Pharm Stat 10: 494-507.
6. Buyse M, Michiels S, Sargent DJ, Grothey A, Matheson A, et al. (2011) Integrating biomarkers in clinical trials. Expert Rev Mol Diagn 11: 171-182.
7. Freidlin B, Simon R (2005) Adaptive signature design: an adaptive clinical trial design for generating and prospectively testing a gene expression signature for sensitive patients. Clin Cancer Res 11: 7872-7878.
8. Jiang W, Freidlin B, Simon R (2007) Biomarker-adaptive threshold design: a procedure for evaluating treatment with possible biomarker-defined subset effect. J Natl Cancer Inst 99: 1036-1043.
9. Freidlin B, Jiang W, Simon R (2010) The cross-validated adaptive signature design. Clin Cancer Res 16: 691-698.
10. Zhou X, Liu S, Kim ES, Herbst RS, Lee JJ (2008) Bayesian adaptive design for targeted therapy development in lung cancer--a step toward personalized medicine. Clin Trials 5: 181-193.
11. Lee JJ, Gu X, Liu S (2010) Bayesian adaptive randomization designs for targeted agent development. Clin Trials 7: 584-596.
12. Todd S, Stallard N (2005) A new clinical trial design combining phases 2 and 3: sequential designs with treatment selection and a change of endpoint. Ther Innov Regul Sci 39: 109-118.
13. Stallard N (2010) A confirmatory seamless phase II/III clinical trial design incorporating short-term endpoint information. Stat Med 29: 959-971.
14. Shun Z, Lan KK, Soo Y (2008) Interim treatment selection using the normal approximation approach in clinical trials. Stat Med 27: 597-618.
15. Di Scala L, Glimm E (2011) Time-to-event analysis with treatment arm selection at interim. Stat Med 30: 3067-3081.
16. Friede T, Parsons N, Stallard N, Todd S, Marquez EV, et al. (2011) Designing a seamless phase II/III clinical trial using early outcomes for treatment selection: An application in multiple sclerosis. Stat Med 30: 1528-1540.
17. (2010) Guidance for Industry: Adaptive design clinical trials for drugs and biologics [excerpts]. Biotechnol Law Rep 29: 197-215.
18. Chow SC, Chang M (2011) Adaptive design methods in clinical trials, Second Edition. CRC Press, pp. 374.
19. Wang J, Chang M, Menon S (2014) Biomarker-informed adaptive design. In: Carini C, Menon S, Chang M (Eds), Clinical and statistical considerations in personalized medicine. CRC Press, pp. 129-148.
20. Wang Y, Sung C, Dartois C, Ramchandani R, Booth BP, et al. (2009) Elucidation of relationship between tumor size and survival in non-small-cell lung cancer patients can aid early decision making in clinical drug development. Clin Pharmacol Ther 86: 167-174.
21. Wang J, Menon S, Chang M (2014) Finding critical values to control type I error for a biomarker informed two-stage winner design. J Biomet Biostat 5: 207.
22. Chang M (2016) Adaptive design theory and implementation using SAS and R, Second Edition. CRC Press, pp. 706.
23. Murphy KP (2007) Conjugate Bayesian analysis of the Gaussian distribution. pp. 1-29.
24. Gelman A (2006) Prior distributions for variance parameters in hierarchical models (Comment on Article by Browne and Draper). Bayesian Anal 1: 515-534.
25. Huang A, Wand MP (2013) Simple marginally noninformative prior distributions for covariance matrices. Bayesian Anal 8: 439-452.