Abstract
Background/Aim: Progression-free survival (PFS), a common endpoint in oncology clinical trials, is determined from tumor progression evaluated according to an assessment schedule. Randomized controlled trials (RCTs) in patients with metastatic colorectal cancer suggest that the median PFS and the hazard ratio (HR) for PFS may be biased depending on the assessment schedule. Materials and Methods: We re-analyzed the PFS in the trifluridine and tipiracil hydrochloride (FTD/TPI) phase 2 trial after changing the assessment schedule. To assess biases in median PFS and HR for PFS resulting from different assessment schedules, we performed a computational simulation. Results: Both the reanalysis of the FTD/TPI phase 2 trial and the simulation showed biases in median PFS and HR for PFS. Conclusion: In RCTs for early progressive cancer, median PFS depends on the assessment schedule; however, HR for PFS can be assessed without clinically meaningful differences between assessment schedules, regardless of biomarker assumptions.
- Progression-free survival
- tumor assessment schedule
- later-line cancer treatment
- colorectal cancer
- biomarker
- simulation
The randomized controlled trial (RCT) is the established design for assessing the therapeutic effect of drugs. Overall survival, which is defined as the time from randomization to death by any cause, is used as a true endpoint in oncology clinical trials. Over the past decade, the use of progression-free survival (PFS), which is defined as the time from randomization to tumor progression or death by any cause, as a surrogate endpoint has become increasingly important. RCTs conducted for new drug applications are designed to meet the requirements of the regulatory authorities. The Food and Drug Administration (1) and the European Medicines Agency (2) have released documents on the definition and the basic nature of PFS. The antitumor effect of drugs is assessed according to the Response Evaluation Criteria in Solid Tumors (RECIST) guideline (version 1.1) (3).
Oncology clinical trials aimed at the development of new drugs are often conducted for later-line treatment of patients with early progressive cancer. We reviewed the RCTs in patients with metastatic colorectal cancer who were refractory to standard chemotherapy and questioned the relationship between median PFS and the tumor assessment schedule (4-8). In the four trials (4-7) in which the initial tumor assessment was performed 8 weeks after the start of study treatment, the shapes of the PFS curves were very similar: the curves of the treatment groups overlapped over the range of 50-100% in PFS, and the median PFS was similar between the treatment groups. In contrast, in the trial reported by Yoshino et al. (8), in which the initial tumor assessment was performed at 4 weeks, the PFS curves of the treatment groups diverged from 4 weeks onward. In general, partial overlap or intersection of Kaplan–Meier curves suggests the presence of a biomarker. The RAS (KRAS or NRAS) mutation has been established as a biomarker for assessing the response to cetuximab and panitumumab treatment in patients with metastatic colorectal cancer (9-11); however, there is currently no established biomarker for assessing the response to regorafenib or to trifluridine and tipiracil hydrochloride (FTD/TPI) treatment. We hypothesized that the absence of a difference in median PFS, despite a reduced risk of tumor progression or death, could be explained by the tumor assessment schedule as well as by the presence of biomarkers.
In actual RCTs, tumor assessment is generally scheduled every 6 or 8 weeks according to the RECIST guideline. Regarding the bias resulting from different assessment intervals, Freidlin et al. (12) described a potential event-time bias in unblinded RCTs that is caused by a higher number of unscheduled tumor assessments in the control group than in the experimental group. Panageas et al. (13) reported a simulation of median PFS and the bias of the Kaplan–Meier method under various tumor assessment schedules, showing that, depending on the schedule, median PFS may not be measured accurately. Carroll (14) proposed a formula to calculate an HR for PFS under the assumption of exponentially distributed event times and tumor assessment at regular intervals. However, whether the median PFS and the hazard ratio (HR) for PFS depend on the tumor assessment schedule remains to be evaluated in RCTs for patients with early progressive cancer. In this study, we performed a simulation using several tumor assessment schedules to assess their effect on median PFS and HR for PFS, both with and without the assumption of a biomarker that has an interaction effect with the study treatment.
Materials and Methods
We re-analyzed the PFS results of the randomized phase 2 trial of FTD/TPI (8), in which FTD/TPI was the experimental treatment (clinical trial registration ID: JapicCTI-090880). Although tumor assessment in this trial had originally been scheduled at 4, 8, and 12 weeks and every 8 weeks thereafter, we prepared a dataset assuming that the assessments were performed every 8 weeks. The median PFS was estimated using the Kaplan–Meier method, and the HR for PFS was estimated using the Cox proportional hazards model. Statistical analysis was performed using SAS 9.4.
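The re-gridding step described above can be illustrated with a minimal Python sketch. It is not the SAS program used for the reanalysis: the function names `regrid` and `km_median`, the example progression weeks, and the rule that a documented progression is moved to the first 8-weekly visit at or after it are all assumptions made for illustration, and the sketch handles progression events only (deaths would presumably keep their actual dates).

```python
from bisect import bisect_left
from fractions import Fraction

def regrid(progression_week, visit_weeks):
    """Return the first visit at or after the documented progression week."""
    i = bisect_left(visit_weeks, progression_week)
    if i == len(visit_weeks):
        raise ValueError("progression falls after the last scheduled visit")
    return visit_weeks[i]

def km_median(times, events):
    """Smallest time at which the Kaplan-Meier estimate is <= 0.5, else None."""
    surv = Fraction(1)          # exact arithmetic avoids rounding at S(t) = 0.5
    at_risk = len(times)
    for t in sorted(set(times)):
        n_events = sum(1 for x, e in zip(times, events) if x == t and e)
        n_leaving = sum(1 for x in times if x == t)
        if n_events:
            surv *= Fraction(at_risk - n_events, at_risk)
            if surv <= Fraction(1, 2):
                return t
        at_risk -= n_leaving    # events and censored patients leave the risk set
    return None

# Hypothetical progression weeks documented under the 4/8/12-week schedule.
original = [4, 4, 4, 4, 8, 8, 12, 20]
every_8_weeks = list(range(8, 105, 8))
regridded = [regrid(w, every_8_weeks) for w in original]
all_events = [True] * len(original)

print(km_median(original, all_events), km_median(regridded, all_events))
```

In this toy dataset, half of the patients progress at the 4-week visit, so moving their events to the 8-week visit shifts the estimated median from the first visit of one schedule to the first visit of the other. Statistical packages may use a slightly different convention when the estimate equals exactly 0.5.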
Statistical simulations for PFS were performed using three tumor assessment schedules based on the results of RCTs in patients with metastatic colorectal cancer refractory to standard chemotherapy (4-8). The simulated RCTs were designed as two-group parallel controlled trials. The time to tumor progression was assumed to follow either the exponential distribution or the Weibull distribution. Tumor assessment was scheduled according to one of the following three types of interval: every 8 weeks (Sc8), every 6 weeks (Sc6), or at 4, 8, and 12 weeks and every 8 weeks thereafter (Sc4). Each tumor assessment was assumed to be completed within 7 days before or after the scheduled visit date, with the deviation following a normal distribution. The same scenarios were also simulated under the assumption of an interaction between study treatment and biomarker status (biomarker-positive patients respond well to the experimental treatment). The number of patients was set to achieve a power of 80% at a two-sided significance level of 5% for each scenario. The median PFS, the HR for PFS, and the power were estimated by Monte Carlo simulation using R 3.2.3. The number of iterations was 10,000 for each scenario.
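The simulation design can be sketched in a few lines of Python; this is an illustration under stated assumptions, not the R program used in the study. The true medians (3 and 6 weeks), the arm size, the standard deviation of the visit-date deviation (3.5 days), the absence of censoring, and the one-step log-rank (Peto) estimate of the HR are all choices made here for brevity; the scenarios and the HR estimator of the actual simulation may differ.

```python
import math
import random
import statistics

def visit_weeks(schedule):
    """Yield nominal assessment weeks for Sc8, Sc6, or Sc4 indefinitely."""
    if schedule == "Sc4":
        yield from (4, 8, 12)
        week, step = 20, 8
    elif schedule == "Sc8":
        week, step = 8, 8
    elif schedule == "Sc6":
        week, step = 6, 6
    else:
        raise ValueError(f"unknown schedule: {schedule}")
    while True:
        yield week
        week += step

def observed_pfs(true_week, schedule, rng, sd_days=3.5):
    """Week of the first assessment held at or after the true progression."""
    for nominal in visit_weeks(schedule):
        # Normal deviation truncated to +/- 7 days (sd_days is an assumption).
        jitter = rng.gauss(0.0, sd_days)
        while abs(jitter) > 7.0:
            jitter = rng.gauss(0.0, sd_days)
        actual = nominal + jitter / 7.0
        if actual >= true_week:
            return actual

def simulate_arm(n, true_median, schedule, rng):
    """Observed PFS for n patients with exponential true progression times."""
    rate = math.log(2) / true_median
    return [observed_pfs(rng.expovariate(rate), schedule, rng) for _ in range(n)]

def peto_hr(control, experimental):
    """One-step log-rank HR (experimental vs control); all events, no ties."""
    data = sorted([(t, 0) for t in control] + [(t, 1) for t in experimental])
    n_exp, n_all = len(experimental), len(data)
    o_minus_e = var = 0.0
    for _, is_exp in data:
        p = n_exp / n_all            # share of the risk set in the experimental arm
        o_minus_e += is_exp - p
        var += p * (1 - p)
        n_exp -= is_exp
        n_all -= 1
    return math.exp(o_minus_e / var)

def run(schedule, n=2000, control_median=3.0, experimental_median=6.0, seed=1):
    """Return (control median, experimental median, HR) for one simulated trial."""
    rng = random.Random(seed)
    control = simulate_arm(n, control_median, schedule, rng)
    experimental = simulate_arm(n, experimental_median, schedule, rng)
    return (statistics.median(control), statistics.median(experimental),
            peto_hr(control, experimental))

for name in ("Sc8", "Sc6", "Sc4"):
    print(name, run(name))
```

With these hypothetical inputs, both observed medians fall near the first assessment under Sc8, whereas under Sc4 the control median falls near 4 weeks and the experimental median near 8 weeks. A full replication would repeat `run` over many iterations and count how often the log-rank test is significant in order to estimate power.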
Results
In the reanalysis of the FTD/TPI phase 2 trial under the modified tumor assessment schedule, the median PFS was 2.1 months (95% confidence interval=2.0-3.7) in the FTD/TPI group and 1.9 months (95%CI=1.9-1.9) in the placebo group, and the HR for PFS was 0.48 (95%CI=0.33-0.70; log-rank p<0.0001). The PFS curve is shown in Figure 1.
The simulation results assuming the exponential distribution are shown in Table I. Median PFS in the control group was dependent on the initial time of tumor assessment in scenarios 1, 10, and 16, and these scenarios reproduced the trial results for patients with metastatic colorectal cancer (4-8). Biases in HR for PFS emerged, and the power decreased to less than 75% in scenarios 1, 2, 4, 7, 10, 13, 16, 17, 19, and 20 for Sc8 or Sc6. A decrease of power to less than 75% was not observed in any scenario for Sc4. The HR for PFS was not affected under any tumor assessment schedule when the true median PFS in the experimental group was 12 weeks. The simulation results assuming the Weibull distribution are not shown because the results and their interpretation were nearly identical to those assuming the exponential distribution.
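To see why a modest attenuation of the HR can lower the power noticeably, the following Python sketch applies the widely used Schoenfeld approximation for a log-rank test with 1:1 allocation. It is an assumption that this approximation is suitable here; the study itself estimated power by simulation. The design HR of 0.50 and the attenuated HR of 0.55 are hypothetical values and are not taken from Table I.

```python
import math
from statistics import NormalDist

def required_events(hr, alpha=0.05, power=0.80):
    """Events needed for a two-sided log-rank test, 1:1 allocation (Schoenfeld)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return math.ceil(4 * z_sum ** 2 / math.log(hr) ** 2)

def approximate_power(events, hr, alpha=0.05):
    """Approximate power for a given number of events and an effective HR."""
    z = NormalDist()
    shift = math.sqrt(events) * abs(math.log(hr)) / 2
    return z.cdf(shift - z.inv_cdf(1 - alpha / 2))

events = required_events(0.50)                 # hypothetical design HR
print(events, approximate_power(events, 0.50), approximate_power(events, 0.55))
```

Under these assumptions, a trial sized for 80% power at the design HR loses several percentage points of power when the HR actually estimated is only slightly closer to 1.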
Discussion
We re-analyzed the PFS of the FTD/TPI phase 2 trial after modifying the tumor assessment schedule to every 8 weeks. In the reanalysis, there was little difference in median PFS between the treatment groups (2.1 vs. 1.9 months, compared with 2.0 vs. 1.0 months in the original analysis), which is consistent with the results of other studies (4-7). Regardless of biomarker status, the median PFS would be easily affected by the initial time of tumor assessment. The HR for PFS was estimated more conservatively than in the original analysis (reanalysis HR=0.48; original HR=0.41); however, the difference would not be clinically meaningful. These results suggest that median PFS and HR for PFS would be affected statistically by the tumor assessment schedule in actual trials for patients with early progressive cancer. If median PFS is close to the initial time of tumor assessment, it should not be considered as “the median of the time of disease progression or death” but as “the time when 50% or more patients have progressed disease or died”. Therefore, to summarize PFS, it might be useful to calculate the restricted mean survival time or the PFS rate (the number of patients with PFS divided by the number of all patients) at the initial time of tumor assessment as an alternative to median PFS.
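The two alternative summaries can be computed as in the Python sketch below. The data are hypothetical, the truncation time of 24 weeks is an arbitrary choice, and the PFS rate is implemented as the simple proportion defined above, which assumes that no patient is censored before the time of interest.

```python
def km_steps(times, events):
    """Kaplan-Meier estimate as a list of (time, survival after the drop)."""
    steps, surv, at_risk = [], 1.0, len(times)
    for t in sorted(set(times)):
        n_events = sum(1 for x, e in zip(times, events) if x == t and e)
        if n_events:
            surv *= 1.0 - n_events / at_risk
            steps.append((t, surv))
        at_risk -= sum(1 for x in times if x == t)
    return steps

def rmst(times, events, tau):
    """Restricted mean survival time: area under the KM curve from 0 to tau."""
    area, last_t, last_s = 0.0, 0.0, 1.0
    for t, s in km_steps(times, events):
        if t >= tau:
            break
        area += last_s * (t - last_t)
        last_t, last_s = t, s
    return area + last_s * (tau - last_t)

def pfs_rate(times, week):
    """Proportion of all patients still progression-free after `week`."""
    return sum(1 for x in times if x > week) / len(times)

# Hypothetical observed PFS in weeks under an every-8-weeks schedule.
experimental = [8, 8, 8, 8, 8, 8, 16, 24]
control = [8, 8, 8, 8, 8, 8, 8, 16]
all_events = [True] * 8

for arm in (experimental, control):
    print(rmst(arm, all_events, tau=24), pfs_rate(arm, 8))
```

Both hypothetical arms have a median PFS of 8 weeks, yet the restricted mean and the PFS rate at 8 weeks still separate them, which is the motivation for reporting these summaries when the median coincides with the first assessment.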
The simulation results reproduced the reanalysis results of PFS in the FTD/TPI phase 2 trial. The median PFS was affected by the initial time of tumor assessment when the true median PFS was less than or equal to the initial time of tumor assessment. Statistical biases in HR for PFS that affected the power (<75%, i.e., a reduction of more than 5 percentage points from 80%) were also observed when the true median PFS in a group was less than or equal to the initial time of tumor assessment and the true median PFS in a group was less than 12 weeks; however, the differences in HR for PFS between the tumor assessment schedules were not clinically meaningful (for example, 0.566 for Sc8, 0.543 for Sc6, and 0.530 for Sc4 in scenario 1). In terms of statistical power, these biases in HR for PFS would be small in RCTs that assess tumor progression every 8 or 6 weeks when the true median PFS in the experimental group is greater than or equal to 12 weeks. The simulation results indicated that the PFS curve depends on the tumor assessment schedule even if there is no potential biomarker that has an interaction effect with the study treatment. Therefore, we cannot infer from the PFS curve whether there are potential biomarkers for the experimental treatment. Potential biomarkers should be considered based on clinical hypotheses with reference to subgroup analysis, the p-value for interaction, or other exploratory analyses (15).
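The dependence of the observed median on the first assessment can be worked out without simulation. The Python sketch below assumes that visits take place exactly on schedule and that there is no censoring; under those assumptions the observed median is the first scheduled visit at or after the true median, because that is the first visit by which at least 50% of patients have progressed. The true medians of 4 and 8 weeks are hypothetical, and the probability calculation additionally assumes exponential progression times.

```python
def visit_weeks(schedule):
    """Yield nominal assessment weeks for Sc8, Sc6, or Sc4 indefinitely."""
    if schedule == "Sc4":
        yield from (4, 8, 12)
        week, step = 20, 8
    elif schedule == "Sc8":
        week, step = 8, 8
    elif schedule == "Sc6":
        week, step = 6, 6
    else:
        raise ValueError(f"unknown schedule: {schedule}")
    while True:
        yield week
        week += step

def observed_median(true_median, schedule):
    """First scheduled visit at or after the true median PFS (in weeks)."""
    for week in visit_weeks(schedule):
        if week >= true_median:
            return week

def progressed_by_first_visit(true_median, schedule):
    """Share of patients with progression by the first visit (exponential)."""
    first = next(visit_weeks(schedule))
    return 1 - 2 ** (-first / true_median)

# Hypothetical true medians: 4 weeks (control) and 8 weeks (experimental).
for name in ("Sc8", "Sc6", "Sc4"):
    print(name, observed_median(4, name), observed_median(8, name),
          progressed_by_first_visit(4, name))
```

With these inputs, the two arms share an observed median of 8 weeks under Sc8 although the true medians differ by a factor of two, while Sc4 recovers both true medians because they happen to fall on its visit grid.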
We conducted a systematic review of articles published on PubMed between Jan 1, 2007 and Dec 31, 2016. The reports selected for review were limited to RCTs using either placebo alone or best supportive care alone as the control group. We identified 38 RCTs with possible biases in the median PFS and the HR for PFS, because the median PFS was less than or equal to the initial time of tumor assessment, in the following cancers: adrenocortical cancer, colorectal cancer, gall bladder cancer, gastric cancer, gastrointestinal stromal tumor, hepatocellular cancer, head and neck cancer, malignant pleural mesothelioma, non-small-cell lung cancer, oesophageal cancer, pancreatic cancer, pancreatic neuroendocrine tumor, prostate cancer, renal cell cancer, soft tissue sarcoma, and urothelial cancer.
From another point of view, an initial tumor assessment at 4 weeks would affect the best overall response (BOR). For example, suppose that stable disease (SD) as the BOR requires a duration of at least 6 weeks. Patients who are evaluated as having non-progressive disease (non-PD) at 4 weeks and whose second assessment is unavailable for any reason are determined to be not evaluable (NE) as the BOR because they do not meet the criterion of SD. The same situation arises from discrepancies between independent review and investigator review. If the initial assessment is performed at 4 weeks and the patient is determined to have PD by an investigator at 4 weeks, followed by treatment discontinuation, no subsequent imaging results are available for evaluation. If the patient is then considered to have non-PD at 4 weeks in the independent review, the patient is determined to be NE because the criterion of SD is not met. This would result in an increase in the number of patients who are NE in independent reviews. In the FTD/TPI phase 2 trial, the proportion of patients determined to be NE in the investigator review was 0.9% (1/112) in the FTD/TPI group and none in the placebo group; in the independent review, it was 8.9% (10/112) in the FTD/TPI group and 12.3% (7/57) in the placebo group. The disease control rate (DCR), which was defined as the proportion of patients with a BOR of complete response, partial response, or SD, was lower in the independent review than in the investigator review because more patients were determined to be NE in the independent review. DCR in the independent review was 43.8% (49/112) in the FTD/TPI group and 10.5% (6/57) in the placebo group, and DCR in the investigator review was 54.5% (61/112) in the FTD/TPI group and 14.0% (8/57) in the placebo group.
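The effect of the minimum SD duration on the BOR can be expressed as a small classification rule. The Python sketch below is a deliberate simplification and not the RECIST algorithm: it ignores confirmation of complete or partial response, and the function names, the input format, and the example patients are hypothetical. Only the 6-week criterion for SD is taken from the example above.

```python
def best_overall_response(assessments, sd_min_weeks=6):
    """Simplified BOR from a list of (week, response) pairs.

    Responses are "CR", "PR", "SD", or "PD". SD counts only if it is
    documented at or after sd_min_weeks; otherwise the patient is "PD"
    if progression was documented and "NE" if not.
    """
    responses = [r for _, r in assessments]
    if "CR" in responses:
        return "CR"
    if "PR" in responses:
        return "PR"
    if any(r == "SD" and week >= sd_min_weeks for week, r in assessments):
        return "SD"
    if "PD" in responses:
        return "PD"
    return "NE"

def disease_control_rate(bors):
    """Proportion of patients whose BOR is CR, PR, or SD."""
    return sum(1 for b in bors if b in ("CR", "PR", "SD")) / len(bors)

# One hypothetical patient, assessed only at 4 weeks before discontinuation.
investigator_view = [(4, "PD")]   # investigator calls progression
independent_view = [(4, "SD")]    # independent review calls non-PD

print(best_overall_response(investigator_view),
      best_overall_response(independent_view))
```

The same single 4-week scan yields PD under one reading and NE under the other, and an NE patient counts against the DCR in the same way as a PD patient, which is consistent with the lower DCR reported for the independent review.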
In conclusion, we demonstrated the characteristics of PFS in patients with early progressive cancer. The median PFS would depend on the tumor assessment schedule, and the statistical power would decrease when the median PFS is less than or equal to the initial time of tumor assessment. The effects on the HR for PFS would not be clinically meaningful; however, attention should be paid to the decrease in statistical power.
Acknowledgements
The Authors would like to thank Natsuko Nabeoka (SunFlare, Tokyo, Japan) for medical writing services, which were funded by Taiho, and Masanobu Ito, Mirai Sugiyama, and Masashi Shimura (Taiho) for their support of this work.
Footnotes
Conflicts of Interest
TT is an employee of Taiho and owns Taiho stock. CH receives consulting fees from Taiho. TY receives honoraria from Taiho, Chugai, and Eli Lilly Japan, and research funding from Glaxo Smith Kline and Nippon Boehringer Ingelheim. AO has declared no conflicts of interest.
- Received August 11, 2017.
- Revision received August 30, 2017.
- Accepted August 31, 2017.
- Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved