Abstract
Background/Aim: Adverse events (AEs) in cancer trials may be caused by the investigational agents or the underlying disease. Determining the causality is challenging, especially in early cancer drug development when a control arm is lacking. Materials and Methods: We carried out a systematic literature review of AE frequencies in placebo arms of randomized trials for malignant solid tumors and hematologic malignancies reported in PubMed from 2016 to January 2022. Results: Among 148 placebo arms, the AEs with the highest reported mean frequencies among all publications were: Fatigue (20.1%), nausea (16.3%), diarrhea (14.3%), abdominal pain (12.4%), and anemia (10.9%); AEs resulting in drug discontinuation were reported in 5.6% of placebo-treated patients and serious AEs in 18.7% of placebo patients. Conclusion: The data presented here may be used as a benchmark to help assess drug causality in early development cancer studies without a control arm.
When drug development moves from preclinical assessment to human trials, the first goals are to describe the toxicity profile and to determine the recommended phase 2 dose and the optimal biologic dose of the investigational medicinal product. In oncological drug development, this is typically conducted initially with patients for whom multiple previous treatment attempts have failed. New symptoms might be caused by cancer progression, the investigational agent, or other causes such as pre-existing comorbidities. In the very first subjects of a first-in-human trial, temporality is one of the most helpful tools to identify whether adverse events (AEs) are caused by the drug. Drug causality is strongly suggested when an effect occurred after the drug was started (treatment-emergent), improved when the drug was stopped (dechallenge), and reoccurred when the drug was restarted (rechallenge). However, in real life, observations are typically more difficult to interpret. To separate the description of what occurred from the interpretation of why it occurred, the International Conference on Harmonisation defined an AE independent of causality (1). In randomized phase 3 trials with placebo arms, a significantly higher frequency in the treatment arm may support drug causality. However, in earlier phases, control arms and placebo arms are typically lacking. For those, a predetermined expected frequency of AEs in a patient population without the investigational agents can be a useful tool to put the reported AE frequency into context.
Clinical trial data are not identical to real-world data. The patient populations differ, mainly because of eligibility criteria. Moreover, AE documentation differs. Treatment-emergent AEs used in clinical trials describe only those AEs that occurred after the drug/investigational medicinal product was started, or that worsened thereafter. Furthermore, in clinical trials, AEs are typically coded following the Medical Dictionary for Regulatory Activities, a dictionary with a five-level hierarchy (2). Among those, preferred terms are typically used in AE tables. AEs are then graded by the Common Terminology Criteria for Adverse Events (3). For the purpose of creating benchmarks to assess AE frequencies in clinical trials, data which are generated in the same way need to be used as sources, and placebo arms of randomized clinical trials are probably the best source of information (4-7).
Recently, Cochrane methodology (3) and Bayesian Network Meta-Analysis (4, 5, 8, 9) have been increasingly used to summarize AE frequencies across various studies, and to identify findings that would not have been obvious in any of the individual studies. For instance, the incidence of acute myeloid leukemia or myelodysplastic syndrome occurring after exposure to a poly(ADP-ribose) polymerase inhibitor was higher than among placebo-treated patients (10). Anti-cytotoxic T-lymphocyte–associated antigen-4 plus anti-programmed death-1 was associated with a higher risk of renal AEs than anti-programmed death-1 alone (3, 4). However, the purpose of these studies is typically to support treatment choices from among various approved drugs, while frequency tables of AEs to create benchmarks supporting early oncological drug development remain an unmet need (11).
Materials and Methods
The PubQuant database was generated during several analyses, each focusing on one specific AE (4-7). For the current analysis of all AEs, the database was further updated to include publications until January 2022. The search algorithm was unchanged from previous work (7), combining three topics by AND, namely, “cancer”, “randomized”, and “adverse events”. Each of the topics was further described by its own list of Medical Subject Heading terms combined with OR. In January 2022, this search resulted in 554 hits for the publication years 2021 and 2022. Of these, 438 articles were excluded by manual review of the title and bibliographic data. The most common reasons for exclusion in this step were that the trial did not include patients with cancer or that the trial design was not randomized. The remaining 116 publications were reviewed in detail and compared to what was entered already in the database, resulting in the exclusion of a further 93 articles (placebo was not monotherapy but added to other anticancer therapy for 43, the trial was already included in existing data for 22; for further details, see Figure 1). In the following analyses, the specific questions determined the data selection. For evaluating the influence of covariates such as age or Eastern Cooperative Oncology Group (ECOG) status, all placebo arms of studies with cancer drugs were included. For comparison between placebo and treatment arms, all treatment cohorts of studies that had a placebo arm were added. The data selection was further restricted for the calculations determining the benchmarks for the frequencies of specific AEs, serious AEs (SAEs), and AEs leading to discontinuation. For these calculations, studies were only included for which a cancer diagnosis was part of the eligibility criteria, thereby excluding phase 1 studies with healthy volunteers, and cancer-prevention studies.
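The structure of the search and the screening counts described above can be sketched as follows. This is only an illustration: the term lists are hypothetical placeholders (the actual Medical Subject Heading lists are not reproduced in this report), and the function names are assumptions introduced for the example.

```python
# Hypothetical term lists for illustration only; the actual Medical Subject
# Heading lists used for the PubQuant search are not given in this report.
TOPICS = {
    "cancer": ["Neoplasms", "Hematologic Neoplasms"],
    "randomized": ["Random Allocation", "Placebos"],
    "adverse events": ["Drug-Related Side Effects and Adverse Reactions"],
}


def build_query(topics):
    """Join the terms of each topic with OR, then join the topics with AND."""
    groups = []
    for terms in topics.values():
        joined = " OR ".join(f'"{term}"[MeSH Terms]' for term in terms)
        groups.append(f"({joined})")
    return " AND ".join(groups)


def screening_flow(hits, excluded_by_title, excluded_in_detail):
    """Return (publications reviewed in detail, publications remaining)."""
    reviewed = hits - excluded_by_title
    return reviewed, reviewed - excluded_in_detail


query = build_query(TOPICS)
# Counts as reported in the text for the January 2022 search update.
reviewed, remaining = screening_flow(554, 438, 93)
print(query)
print(reviewed, remaining)
```

With the counts reported above, 116 publications reach detailed review, and subtracting the 93 further exclusions would leave 23 publications from this search update.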
Study flow diagram. The data collection grew over several published projects addressing specific adverse events (left side of the Figure). For this report, which combines all common adverse events, the publication year 2021 was added. The additional literature search is described on the right side of the Figure. TEAE: Treatment-emergent adverse event.
The data were analyzed using SPSS (SPSS Version 23.0, IBM, Armonk, NY, USA) as previously described (4-7). This included a multistep imputation process to fill in frequencies of Common Terminology Criteria for Adverse Events grades for those cohorts where they could be determined from original values (7). Pearson correlations were used to describe the relation of quantitative variables, and analysis of variance to describe the influence of categorical variables on AE frequencies. All p-values are considered exploratory. The complete method description and the items listed in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (12) are available from the corresponding author upon request and consist of the search algorithm and the complete list of included articles.
Results
After the database was updated with the new search, the combined PubQuant database contained 149 publications including cancer drugs that matched the criteria and described 160 cohorts of placebo monotherapy with 30,374 individuals. Among those were 11 publications (12 cohorts) describing 2,387 individuals without cancer diagnosis. Those were placebo-controlled randomized cancer-prevention studies and studies with healthy volunteers, which were of value when exploring the influence of covariates, but they were excluded from the benchmark calculations for cancer studies. The details of the demographic distribution are described in Table I. As an overview: The average reported median age was 57.1 years, 56.4% were male, 56.5% had an ECOG performance status of 0, and 66.9% were phase 3 studies. The most common diagnoses were hematological, colorectal, breast and lung cancer. Eligibility criteria included measurable cancer in 38.1% of the reported cohorts and relapsed/refractory malignancies in 44.4%. Placebo was given per os in 75.6%.
Demographic data of all cohorts included in the study of adverse events. All cohorts in placebo arms were treated with placebo monotherapy regardless of availability of adverse event data.
The most commonly reported treatment-emergent AE among placebo-treated cohorts of patients with cancer (both hematological and solid tumors) was fatigue. This preferred term was reported in 102 out of 148 cohorts, with an average [±standard deviation (SD)] of 20.1±14.1% per cohort. In the same data, the preferred term of ‘asthenia’, which is a medical concept very similar to fatigue (12), was reported less frequently. Most of the specific AEs reported were gastrointestinal symptoms: Nausea, diarrhea, abdominal pain, constipation, decreased appetite, vomiting, and stomatitis were reported with averages above 5% per cohort. All three of the hematological AEs of cytopenia were among the most commonly reported: Anemia, neutropenia, and thrombocytopenia. Pyrexia (fever) was more common than febrile neutropenia. The respiratory system (cough), musculoskeletal system (arthralgia, back pain), nervous system (headache), and skin (rash) were represented with one preferred term each among those >5%. For details, see Table II. The order of frequency was different among more severe AEs. Among AEs of grade 3 and higher, the most common were anemia (3.0±3.8%), abdominal pain (2.2±3.4%), and neutropenia (2.2±4.6%).
Frequency of adverse events (AEs) in randomized oncology trials. Data are absolute numbers of patients reported with the AE summed over all patient cohorts and total number of patients with the AE in these columns, with the mean frequency percentage (range) for each patient cohort. For this table, studies with healthy volunteers and cancer prevention studies were excluded. Using benchmarks for the assessment of AE frequencies in oncology studies without a control arm: Drug causality is not supported when the frequency of the AE is lower than the average percentage among placebo arms. Drug causality is likely when the observed frequency is higher than the highest reported frequency in placebo arms (upper limit of the range). For observations that are higher than the placebo average but lower than the placebo maximum, statistical methods to include the denominator are recommended, and the absolute numbers may be used as control values.
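The three-zone reading of the benchmarks described in this legend can be written as a small decision rule. The following is a minimal sketch, not part of the original analysis: the function name and return labels are assumptions, and the maximum placebo frequency used in the example is a hypothetical value, since only the mean for fatigue (20.1%) is quoted in the text.

```python
def assess_against_benchmark(observed_pct, placebo_mean_pct, placebo_max_pct):
    """Classify an observed AE frequency against placebo-arm benchmarks.

    Below the placebo mean: drug causality is not supported.
    Above the placebo maximum: drug causality is likely.
    In between: a statistical comparison including denominators is needed.
    """
    if observed_pct < placebo_mean_pct:
        return "causality not supported"
    if observed_pct > placebo_max_pct:
        return "causality likely"
    return "uncertain: compare statistically"


FATIGUE_MEAN = 20.1  # mean placebo frequency of fatigue, as reported in the text
FATIGUE_MAX = 60.0   # hypothetical upper limit of the range, for illustration only

for observed in (12.0, 35.0, 70.0):
    print(observed, assess_against_benchmark(observed, FATIGUE_MEAN, FATIGUE_MAX))
```

An observation exactly equal to the placebo mean falls into the intermediate zone in this sketch; that boundary choice is an assumption, as the legend does not address it.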
The treatment arms in general had significantly more AEs reported than the placebo arms among patients with cancer. This was particularly prominent among the AEs diarrhea (placebo: 14.3±9.8%, treatment: 29.9±21.0%; p=1.1×10⁻¹¹), thrombocytopenia (placebo: 5.0±6.9%, treatment: 16.8±15.4%; p=0.000001), rash (placebo: 6.5±7.0%, treatment: 16.5±14.4%; p=0.000009), and weight loss (placebo: 5.7±4.8%, treatment: 17.1±16.6%; p=0.000006).
SAEs were reported in 83 placebo-treated cohorts of cancer drug studies, with an average frequency of 16.5% (range=0-47%, SD=13.3%). Among patients enrolled after cancer diagnosis, the frequency was higher: 18.7% (range=0-47%, SD=12.7%). The SAE frequency was correlated with median age (Pearson correlation R=0.546, p=1.3×10⁻⁷, n=81), proportion of patients with ECOG status 0 (R=–0.516, p=0.00002, n=62), male sex (R=0.322, p=0.004, n=77), and previous lines of therapy (R=0.380, p=0.003, n=61). SAEs were less commonly reported among the eight randomized phase 1 studies (p=0.001). Among various diagnoses, SAEs were more commonly reported in liver, gastric, and thyroid malignancies, and less in uterine, breast, gastrointestinal stromal tumor, ovarian cancer, and among individuals without cancer diagnosis. The average SAE frequency was higher among studies that enrolled only patients with measurable disease (24.6±13.8%, n=30) than those enrolling only those without measurable disease (7.5±8.1%, n=26, p=0.000003). SAEs were 1.3-fold higher among patients with cancer receiving cancer drugs than those receiving placebo (24.7±14.6%, range=0-57%).
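The cohort-level correlations reported above were computed in SPSS. As a minimal sketch of the same statistic, the Pearson coefficient can be computed from one value pair per cohort as shown below. The cohort values are hypothetical and serve only to illustrate the calculation; they are not taken from the PubQuant database.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical cohorts: median age (years) and SAE frequency (% of patients).
median_age = [50, 55, 60, 65, 70]
sae_pct = [8.0, 12.0, 15.0, 22.0, 25.0]

r = pearson_r(median_age, sae_pct)
print(round(r, 3))
```

Each cohort contributes one data point regardless of its size, which mirrors the per-cohort averages used throughout this report; weighting by cohort size would be a different design choice.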
AEs leading to discontinuation were reported in 100 out of the 160 placebo-treated cohorts of cancer drug studies, at an average frequency of 5.3±5.2% (range=0-28.4%). The frequency was higher when the analysis was restricted to cohorts of patients enrolled after cancer diagnosis: 5.6±5.2% (range=0-28.4%). This variable was positively correlated with the median age of the cohorts (Pearson correlation R=0.321, p=0.001, n=106), as cohorts with older patient populations reported AEs leading to discontinuation more commonly. Discontinuation caused by AEs was reported more commonly among phase 3 studies (analysis of variance p=0.012). The frequency was also dependent on the diagnosis: AEs leading to discontinuation were more common among patients with gastric, liver, kidney, and hematological cancer, and less common in cancer-prevention studies and among patients with esophageal, brain, ovarian, and thyroid malignancies (p=0.0004, Figure 2). Other variables remained without significant influence. Among those were the year of publication, number of patients per cohort, phase of the trial, sex, ECOG status, median lines of previous treatment, tumor status (measurable disease yes/no), and treatment route. Among treatment arms, AEs leading to discontinuation were 2.9-fold more frequent (15.3±12.3%, range=0-74%) than among corresponding placebo arms.
Box plot of the frequency of adverse events resulting in drug discontinuation among patients with different cancer diagnoses. Only diagnoses with more than 20 cohorts of patients available are shown. The box size represents the second and third quartile, and the line within it represents the median. The whiskers represent the first and fourth quartile (excluding outliers). Circles and stars indicate the outliers of the datasets.
Discussion
This study combined published data of placebo arms of randomized clinical cancer trials and described the frequency of the most commonly reported treatment-emergent AEs (Table II), the overall mean frequency of SAEs (16.5%), and the overall mean frequency of AEs leading to discontinuation (5.3±5.2%).
Gastrointestinal AEs were common among placebo-treated patients with cancer. The effects of oral chemotherapeutic agents on the gastrointestinal tract are well known and several mechanisms have been identified. The mucosa as a rapidly proliferating tissue might be a direct target of anticancer agents and the loss of mucosal integrity may cause painful oral mucositis (stomatitis), or diarrhea (13). Microbiota dysbiosis and inflammatory processes (14) are additional mechanisms of toxicity. However, the data in this study focused on the placebo arms. The patients in these arms did not receive chemotherapy, yet stomatitis, diarrhea, constipation, nausea, and vomiting were still among the most commonly reported AEs. Other drugs provided as supportive care, cancer location in the intestinal tract, age-related comorbidities or resurgence of drug side-effects from previous treatments might explain such high rates. Of note, the specific AEs varied in their relation to treatment drugs. For instance, while both diarrhea and constipation were common, only diarrhea was more common in treatment arms, indicating this AE to be worthy of particular attention in oncology trials.
SAE reporting is one of the most commonly used approaches to establish safety in drug development. The term is defined by specific criteria, independent of causality, as any AE that led to death, disability or permanent damage, hospitalization, congenital anomaly/birth defect, or was life-threatening, or required intervention to prevent permanent impairment or damage (15). Within the context of cancer, hospitalization is the criterion that most commonly applies. The data provided here show that SAEs are very frequent events in cancer trials, with an overall average reported frequency of 16.5% even when the patient is only treated with placebo. The frequency was higher in studies with older patients, when measurable tumor was present at enrollment, and when the patients had been treated with more previous lines of therapy; all of these findings can be viewed as descriptions of more vulnerable patient populations. It is to be expected that these patients are also more commonly admitted to a hospital. Drug-related side-effects can add further SAEs. However, the average frequency observed in the treatment arms of the same studies was only higher by a factor of 1.3 than in placebo arms. This suggests that the SAE frequency in cancer trials is only a modestly effective tool to determine drug causality and optimal dose.
Drug discontinuations due to an AE imply the assessment of possible drug causality. This differentiates this parameter significantly from SAE frequencies. Consistent with this, the average frequency was lower (5.3%) and less dependent on the general health of the patient than were SAEs. Moreover, the frequency among treatment arms was 2.9-fold higher than in placebo arms. This suggests the frequency of AEs leading to drug discontinuation to be a more useful instrument for the detection of drug toxicity. However, the clinical investigator’s causality assessment is not always perfect. One in 20 patients discontinued the placebo treatment, suggesting also that one-third of those discontinuing active drugs do so because of an erroneous assumption of drug causality. This confirms findings of a recent review also concluding that the attribution process was more unreliable than expected (16), highlighting the need for more tools to support the process.
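The arithmetic behind the comparison of the two indicators can be reproduced from the mean frequencies reported in the Results. The sketch below is illustrative only; the function names are assumptions, and reading the placebo frequency as the background rate within treatment arms assumes that this rate applies equally to patients receiving active drug.

```python
def fold_ratio(treatment_pct, placebo_pct):
    """How many times more frequent an event is in treatment than placebo arms."""
    return treatment_pct / placebo_pct


def background_share(treatment_pct, placebo_pct):
    """Share of the treatment-arm frequency explained by the placebo frequency."""
    return placebo_pct / treatment_pct


# Mean frequencies (% of patients per cohort) as reported in this article.
sae_fold = fold_ratio(24.7, 18.7)                    # serious AEs
discontinuation_fold = fold_ratio(15.3, 5.3)         # AEs leading to discontinuation
discontinuation_share = background_share(15.3, 5.3)

print(round(sae_fold, 1), round(discontinuation_fold, 1),
      round(discontinuation_share, 2))
```

The larger fold ratio for discontinuations is what makes this indicator more sensitive to drug effects, while the background share of roughly one-third corresponds to the discontinuations that would be expected even without active drug.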
The frequency of treatment-emergent AEs provides one such tool. The most powerful version of this variable applies to randomized trials, when the frequencies can be compared directly in the same patient population reported by the same investigators through identical reporting mechanisms. Based upon the experience of the project (4-7), we recommend prioritizing two numerical values for each preferred term: The frequency of any reported AE, and the frequency of those of grade 3 or higher. The other grades did not add further insights (4-7). In first-in-human oncology trials, there is typically no control arm. This study aimed to develop benchmarks which may be used to fill this gap, to provide an exploratory tool for interpreting emerging safety data in first-in-human cancer trials. For those AEs listed in Table II, there should not be a concern when the observed frequency is lower than the average reported for placebo-treated arms. Conversely, a concern may be raised when the observed frequency is higher than the maximum frequency reported among placebo-treated arms, i.e., higher than the upper limit of the range reported in Table II. Unfortunately, the ranges are large, and the two limits leave a large area of uncertainty. For the remaining uncertain situations, statistical comparison will be necessary between the literature-reported placebo data and trial observations. A preferred term not listed in Table II which has an average frequency among placebo-treated arms of less than 5% should be evaluated in detail when observed in oncological patients treated with investigational drugs.
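For observations in the uncertain zone, one possible form of the statistical comparison is a one-sided Fisher's exact test of the trial counts against the pooled placebo counts. The choice of this test is an assumption made for illustration; the report does not prescribe a specific method. All counts in the example are hypothetical.

```python
from math import comb


def fisher_one_sided(trial_with_ae, trial_without_ae,
                     placebo_with_ae, placebo_without_ae):
    """One-sided Fisher's exact test for a 2x2 table.

    Returns the probability of observing at least `trial_with_ae` events in
    the trial if the AE were equally frequent in trial and placebo patients.
    """
    n_trial = trial_with_ae + trial_without_ae
    n_events = trial_with_ae + placebo_with_ae
    n_total = n_trial + placebo_with_ae + placebo_without_ae
    n_non_events = n_total - n_events
    upper = min(n_trial, n_events)
    # Sum the hypergeometric probabilities for k = observed count .. maximum.
    favorable = sum(
        comb(n_events, k) * comb(n_non_events, n_trial - k)
        for k in range(trial_with_ae, upper + 1)
    )
    return favorable / comb(n_total, n_trial)


# Hypothetical example: 12 of 30 trial patients with the AE, compared with
# 2,000 of 10,000 pooled placebo patients (illustrative counts only).
p_value = fisher_one_sided(12, 18, 2000, 8000)
print(p_value)
```

Using absolute counts rather than percentages keeps the denominator of the small early-phase cohort in the calculation. As elsewhere in this report, a p-value obtained this way would be exploratory.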
This study has limitations, several of which were described in the earlier analyses of this series (4-7) and by other groups (17). The frequency of AEs differed between placebo-treated arms of various trials, which might be associated with reporting diligence, visit frequencies, and observation times. Pooling such heterogeneous cohorts is susceptible to Simpson’s paradox, the phenomenon in which a trend appears in several groups of data but disappears or reverses when the groups are combined (18). In practice, it would be ill-advised to consider the provided numerical averages as a reflection of symptom frequencies in the real world of patient care. The provided data are the result of reporting rules and common practice of clinical trials, which may be viewed as a filter quite likely to alter average frequencies. However, the clinical trial data assessed have been produced by the same mechanism and were subject to the same filters as published clinical arms. Therefore, data from placebo-treated arms are the closest comparator available.
Further refinement of the methods is possible. For some cancer indications, such as lung and breast cancer, sufficient data might be available to create disease-specific benchmarks. We hypothesize that normalizing the observed frequency of the AE of interest by another frequently reported AE in the same trial will eliminate the bias caused by factors that influence the entire study. A similar technique is used successfully in post-market pharmacovigilance of approved drugs (19). In classical disproportionality analyses, the frequency of an AE reported for one drug is normalized by the frequency reported for other drugs, eliminating various influences on the reporting frequency, and making the indicator independent of the number of patients receiving the drug (20). In parallel, in the setting of early oncology trials, normalizing the frequency of a specific AE to other AEs within the same trial will eliminate reporting diligence and observation time as influencing variables. The hypothesis that such normalization will improve the detection of drug causality is testable with data of early oncology trials of drugs that have since completed their subsequent phase 3 studies. However, this will require additional data collection and goes well beyond the scope of this study.
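The two normalizations discussed above can be illustrated with a short sketch. The proportional reporting ratio shown here is one common disproportionality measure and is used as an example; the within-trial normalization is only one possible reading of the hypothesis, and its function name, the choice of fatigue as the reference AE, and the trial frequencies are assumptions.

```python
def proportional_reporting_ratio(drug_ae, drug_other, others_ae, others_other):
    """Proportional reporting ratio from a 2x2 table of report counts.

    drug_ae / drug_other: reports of the AE of interest / all other AEs
    for the drug of interest; others_ae / others_other: the same counts
    for all other drugs.
    """
    share_for_drug = drug_ae / (drug_ae + drug_other)
    share_for_others = others_ae / (others_ae + others_other)
    return share_for_drug / share_for_others


def normalized_ae_ratio(ae_pct, reference_ae_pct):
    """Frequency of the AE of interest relative to a reference AE in the same cohort."""
    return ae_pct / reference_ae_pct


# Placebo benchmark from this report: diarrhea 14.3%, fatigue 20.1%.
placebo_ratio = normalized_ae_ratio(14.3, 20.1)
# Hypothetical early-phase trial: diarrhea 30%, fatigue 25%.
trial_ratio = normalized_ae_ratio(30.0, 25.0)

print(round(trial_ratio / placebo_ratio, 2))
print(proportional_reporting_ratio(10, 90, 50, 950))
```

A trial-to-placebo quotient well above 1 would point to diarrhea being over-represented relative to fatigue in the trial, independent of how diligently AEs were reported overall. Whether this improves the detection of drug causality remains the untested hypothesis stated above.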
Conclusion
In oncology, the frequency of specific treatment-emergent AEs observed in placebo-treated arms provides a tool for assessing the emerging safety profile of drugs in development which otherwise may be difficult to interpret. The frequency of AEs leading to drug discontinuation may be a more helpful tool for causality assessments compared to the frequency of SAEs.
Acknowledgements
The Authors would like to thank Fabio Lievano, Elisa Cerri and Anjla Sood for helpful discussions and editorial suggestions.
Footnotes
Authors’ Contributions
JW: Concept, data collection, Excel, statistical programming, SPSS, code quality control, draft 1, draft 2, draft 3; BW: Concept, writing, SPSS, figures; MT: Data collection, data quality control, Excel, draft 3, writing quality control; HH: Concept, SPSS, code quality control, draft 2, tables, figures.
Conflicts of Interest
JW and MT are employees of AbbVie pharmaceuticals Inc., and may own stocks. However, this project was not part of their employment, and the data interpretation reflects the personal opinion of the Authors, not the company. HH is head of a pediatric palliative care team in Giessen, Germany and has no conflict of interest. BW is attending a Swedish covenant hospital and has no conflict of interest.
- Received March 6, 2022.
- Revision received March 31, 2022.
- Accepted April 12, 2022.
- Copyright © 2022 International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY-NC-ND) 4.0 international license (https://creativecommons.org/licenses/by-nc-nd/4.0).