Abstract
Background: Since the first reports (in 1979) suggesting an etiological role for human papillomavirus (HPV) in bronchial squamous cell carcinoma, literature reporting HPV detection in lung cancer has expanded rapidly, but a comprehensive meta-analysis has yet to be published. We performed a systematic review and formal meta-analysis of the literature reporting on HPV detection in lung cancer.
Materials and Methods: MEDLINE and Current Contents were searched through April 2012. The effect size was calculated as event rates and their 95% confidence intervals (CI), with homogeneity testing using Cochran's Q and I2 statistics. Meta-regression was used to test the impact of study-level co-variates (HPV detection method, geographical origin of study, cancer histology) on effect size, and potential publication bias was estimated using funnel plot symmetry (Begg and Mazumdar rank correlation, Egger's regression, and Duval and Tweedie's trim and fill method).
Results: One hundred studies were eligible, covering 7,381 lung cancer cases from different geographical regions. Altogether, 1,653 (22.4%) samples tested HPV-positive; effect size was 0.348 (95% CI=0.333-0.363; fixed-effects model), and 0.220 (95% CI=0.180-0.259; random effects model). There was significant heterogeneity between the studies stratified by HPV detection technique, but the between-strata comparison under the random effects model was not significant (p=0.193). When stratified by i) different geographical regions, and ii) different histological types, the between-strata comparison was significant (p=0.0001). However, in meta-regression, HPV detection method (p=0.473), geographical origin (p=0.298) and histological type (p=0.589) were not significant study-level co-variates. No evidence for significant publication bias was found in funnel plot symmetry testing. In sensitivity analysis, all meta-analytic results seemed robust to all one-by-one study removals.
Conclusion: These meta-analytic results imply that the reported variability in HPV detection rates in lung cancer is better explained by geographical study origin and histological types of cancer than by the HPV detection method itself. In formal meta-regression, however, none of these three factors were significant study-level co-variates accounting for the heterogeneity of the summary effect size estimates, i.e. HPV prevalence in lung cancer.
- Bronchus
- lung
- HPV
- squamous cell carcinoma
- adenocarcinoma
- meta-analysis
- meta-regression
- study heterogeneity
- publication bias
- detection method
- geographic region
- review
Lung cancer remains by far the leading cause of global cancer morbidity and mortality, with over 1.6 million new cases annually and almost 1.4 million deaths worldwide (both genders included) (1). Epidemiological and experimental data suggest that cigarette smoke, as well as occupational or environmental exposure to radon and asbestos, are the prime etiological agents of this malignancy. Other causal factors implicated include certain metals (chromium, arsenic, cadmium, silica and nickel), air pollution, coal smoke, hormones, previous lung disease, dietary factors, and genetic susceptibility (2, 3). It is well known, however, that i) fewer than 20% of smokers ever develop lung cancer, ii) a sizeable subset (25%) of lung carcinomas develop among never-smokers, and iii) lung cancer is a major cause of death (300,000 cases) among never-smokers (2, 3). This indicates that factors other than cigarette smoking must exist among the causative agents of this disease (2, 4-7).
It was suggested over 30 years ago that human papillomavirus (HPV) could be one of these unknown causal agents of lung cancer among non-smokers and even in smokers, acting synergistically with cigarette smoke (8-10). This ground-breaking hypothesis was based on original observations of morphological similarities between a subset of bronchial squamous cell carcinomas (SCCs) and the clinical manifestations of HPV in the female genital tract, characterized a few years earlier (11-14).
Following these primary reports, interest in HPV and lung cancer increased steadily until the early 2000s, when the published literature was subjected to the first systematic reviews (15, 16). By that time, 2,468 lung cancer samples had been analyzed in 44 separate studies, of which 536 (21.7%) were shown to test positive for HPV. Since then, the pace at which new studies appear has increased substantially, but for some obscure reason, the first two meta-analyses (17, 18) published in 2009 only identified a few additional studies that were not included in the author's review of 2002 (16). HPV prevalence in the included studies had increased to 24.5%, and both reviews emphasized a major heterogeneity between the published studies. Importantly, not only HPV DNA but also active transcription of the virus has been convincingly demonstrated in lung cancer by detecting the expression of HPV oncogenes E6 and E7 by reverse transcriptase PCR (RT-PCR) (19-24).
Although the role of HPV in lung carcinogenesis has been repeatedly reviewed in book chapters (15, 25) and other monographs (16), the first meta-analyses for this rapidly expanding literature were published only in 2009 (17, 18). This coincides with the fact that since the licensing of the first prophylactic HPV vaccines (Cervarix®, Gardasil®), interest in global disease burden due to the vaccine HPV types (HPV6, 11, 16, 18) has increased tremendously (26, 27), particularly for non-genital types of cancer potentially preventable using existing and future (second generation) HPV vaccines (28, 29).
Given the fact that since the appearance of these two meta-analyses (17, 18), more than 25 new large studies have been published (increasing the number of cases by almost three-fold), it was felt necessary to update the accumulated evidence by conducting a systematic review and formal meta-analysis covering this comprehensive literature, without any restrictions concerning HPV detection method, geographical origin of the study, and histological type of lung cancer.
Materials and Methods
Data extraction. Eligible studies were identified by searching MEDLINE (via PubMed), Current Contents, and reference lists from eligible original articles, book chapters and other reviews until April 2012. No language or date-of-publication limitations were imposed. The search terms included papillomavirus, HPV, condyloma, papilloma, bronchus, lung, cancer, carcinoma, lung cancer, and bronchial carcinoma. All publications appearing in peer-reviewed journals were considered eligible, irrespective of which method (see later) was used for HPV detection in human lung cancer, provided that the report included exact numbers of analyzed cases and of those testing/interpreted as being HPV-positive, making calculation of the event rates i.e. HPV prevalence and their 95% confidence intervals (95% CI) possible.
With these search terms, a total of 950 abstracts were retrieved from the databases, covering the years 1954 through 2012. For the present meta-analysis, a total of 110 original studies were deemed eligible, fulfilling the criteria defined above. At this stage, all studies reporting normal bronchial samples, benign squamous cell metaplasia (SQM), and benign squamous cell papillomas (SCP) were included, following the practice of recent reviews (15, 16, 25). The formal meta-analysis, however, was focused on lung cancer, where the different histological types, namely adenocarcinoma (AC), squamous cell carcinoma (SCC), adeno-squamous carcinoma (ASC), large cell carcinoma (LCC), and small cell carcinoma (SmCC), as well as SQM and SCP, were entered as subgroups within the study.
From the summaries (where available) and/or body texts of each eligible study, the following key information was extracted: HPV detection method, geographical region where the samples were derived from, HPV genotypes analyzed and detected, total number of cases analyzed, number testing (or otherwise interpreted) as being HPV-positive, percentage of HPV-positivity, authors, and publication year. In anecdotal instances, the authors were contacted for clarification of their missing data.
Statistical analyses. A specific software package Comprehensive Meta Analysis™ (Version 2.2.064, July 27, 2011; Biostat Inc., Englewood, NJ, USA), was used to perform the meta-analysis. The data input in the software includes all the above items taken from the 110 original studies. The software calculates the event rates (logit event rates, SE and variance) based on the events and sample size data. To assess overall heterogeneity in the event rates between the different studies, Cochran's Q (two-sided) homogeneity p-value as well as I2 statistics (for percentage of variation) were used (30). To explore the eventual publication bias, funnel plots were drawn by plotting the logit event rates by their precision (1/SE) (31). Funnel plots were evaluated for asymmetry using the following statistics: i) Begg and Mazumdar rank correlation (32), ii) Egger's test of the intercept (regression) (33), and iii) the Duval and Tweedie's trim and fill method (34), which imputes results that are hypothetically missing due to the publication bias. Funnel plot asymmetry analyses were also stratified by the histological type of cancer, HPV detection method, and geographical region of the studies.
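The logit event rate, its variance and the back-transformed 95% CI described above can be illustrated with a minimal sketch. This is not the code of the software package used in this study; the function names and the 0.5 continuity correction for studies with zero (or all) positive cases are assumptions made for illustration only.

```python
import math

def logit_event_rate(events, n):
    """Return (logit, variance, standard error) for the proportion events/n."""
    # Assumption: apply a 0.5 continuity correction when a study has 0 or n
    # events, because the logit is undefined at proportions of exactly 0 or 1.
    if events == 0 or events == n:
        events, n = events + 0.5, n + 1.0
    p = events / n
    logit = math.log(p / (1.0 - p))
    variance = 1.0 / events + 1.0 / (n - events)
    return logit, variance, math.sqrt(variance)

def inverse_logit(x):
    """Map a logit back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))

def event_rate_with_ci(events, n, z=1.959964):
    """Return (rate, lower, upper): the CI is built on the logit scale."""
    logit, _, se = logit_event_rate(events, n)
    return (inverse_logit(logit),
            inverse_logit(logit - z * se),
            inverse_logit(logit + z * se))

# Crude totals reported in the abstract: 1,653 HPV-positive of 7,381 cases.
rate, lower, upper = event_rate_with_ci(1653, 7381)
# Precision (1/SE) is the vertical axis of the funnel plot.
precision = 1.0 / logit_event_rate(1653, 7381)[2]
```

Building the interval on the logit scale and transforming back keeps the limits within 0 and 1, which a direct normal approximation on the proportion does not guarantee for small studies.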
To assess the variation in the event rates i.e. HPV prevalence due to the differences between the individual studies, the key study characteristics were evaluated using stratified random-effects meta-analysis and restricted maximum likelihood meta-regression. Stratified meta-analysis allows descriptive comparison of the summary event rates across the different categories of specified study characteristics, e.g. cancer histology and HPV detection method. Restricted maximum likelihood meta-regression formally compares these differences in event rates across the selected study-level covariates and estimates the among-study variance (35). Given the inherent differences in analytical sensitivities between the different HPV detection methods: histology, immunohistochemistry (IHC), Southern blot hybridization (SB), filter in situ hybridization (FISH), in situ hybridization (ISH), and polymerase chain reaction (PCR), meta-analyses were performed across these strata. Similarly, to distinguish true study-specific effects from random variation, all analyses were also stratified by the geographical regions of their origin because of the reported major differences in HPV prevalence between the distinct geographic regions (17, 18, 24, 36, 37). Together with cancer histology (AC, SCC, ASC, LCC, SmCC, AnCC), HPV detection method and geographical study origin were also tested as study-level co-variates in meta-regression.
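How a random-effects summary differs from a fixed-effects one can be sketched as follows. The sketch uses the DerSimonian-Laird method-of-moments estimator of the among-study variance, which is an assumption made for brevity; it is not the restricted maximum likelihood procedure used for the meta-regression in this study, and the function names are hypothetical. Inputs are per-study effects (e.g. logit event rates) and their variances.

```python
import math

def pool_fixed(effects, variances):
    """Inverse-variance fixed-effects summary: returns (estimate, SE)."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)

def pool_random_dl(effects, variances):
    """DerSimonian-Laird random-effects summary: returns (estimate, SE, tau2)."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed, _ = pool_fixed(effects, variances)
    # Cochran's Q around the fixed-effects estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = total - sum(w * w for w in weights) / total
    # Among-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to every within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    estimate = sum(w * y for w, y in zip(re_weights, effects)) / re_total
    return estimate, math.sqrt(1.0 / re_total), tau2
```

Because the among-study variance is added to every study's variance, the random-effects weights are more nearly equal across studies, so large studies dominate the summary less than under the fixed-effects model. This is one reason why the two models can give different point estimates on the same data.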
Finally, sensitivity analysis was performed to assess the influence of each individual study on the strength and stability of the meta-analytic results. Sensitivity analysis runs the analysis k (n=109) times, each time removing one study to show that study's impact on the combined effect size. The sensitivity of the results to these one-by-one study removals was evaluated by descriptively comparing the homogeneity p-value, funnel plots, and Begg and Egger's one-sided p-values, as well as the magnitude and precision of the random-effects summary event rates (point estimates).
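The one-by-one study removal can be sketched as a simple loop. The inverse-variance pooling function used here is a placeholder assumption (any summary function taking effects and variances could be passed instead), and all names are hypothetical.

```python
def inverse_variance_pool(effects, variances):
    """Fixed-effects inverse-variance summary of per-study effects."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)

def leave_one_out(effects, variances, pool=inverse_variance_pool):
    """Re-run the pooling once per study, each time omitting that study.

    Returns a list whose i-th entry is the summary without study i.
    """
    summaries = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        summaries.append(pool(kept_effects, kept_variances))
    return summaries

# Hypothetical example: three studies with equal variances.
print(leave_one_out([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))  # [2.5, 2.0, 1.5]
```

A result is considered robust when none of the leave-one-out summaries departs materially from the summary based on all studies.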
Results
Eligible studies. Using the specified selection criteria, a total of 110 studies were considered eligible for the present analysis (8-10, 19-24, 36, 37, 38-136), comprising 7,381 lung cancer cases analyzed by different HPV detection methods. In addition, these same studies included 82 cases of SCP, and 57 samples of SQM, as well as 483 samples of normal bronchial biopsies analysed concomitantly with the cancer samples. Both case reports and larger series are included, comprising up to 399 samples analyzed by PCR (24) and 166 lung carcinomas examined by ISH (37) (Table I). The methods used to evaluate HPV involvement include the following: light microscopic morphology (8-10), IHC (21, 39, 102), FISH (40, 49), HC2 (87), SB (38, 41, 50), ISH (37, 42, 44-48, 50-53, 56, 57, 93, 94, 98, 107, 117, 119, 131, 135), and PCR (19, 24, 54, 55, 58-86, 88-92, 95-97, 99-101, 103-106, 108-116, 118, 120-130, 132-134). Based on the available data on geographical regions with different HPV prevalence in lung cancer (17, 18, 24, 36, 37), the studies were categorized into the following regions of origin: China and Taiwan, Other Asia, South America, Australia, Europe, and North America. When all studies reporting only benign lesions were omitted, 100 studies remained that report on HPV detection in lung cancer (any histological type). These 100 studies comprise the target of this meta-analysis. Of all 7,381 lung cancer cases analyzed, 1,653 (22.4%) tested HPV-positive.
Analytical results.
Point estimates of event rates. In the entire set of 100 studies, the crude HPV-positivity rate (1,653/7,381) translates to event rates (i.e. effect size, HPV prevalence) of 0.348 (95% CI=0.333-0.363) using the fixed-effects model, and 0.220 (95% CI=0.180-0.259) using the random effects model. Table II depicts the meta-analysis of the 100 included studies, stratified by HPV detection technique. The random effects model results in lower point estimates than the fixed effects model. Irrespective of the HPV detection method, there is significant heterogeneity between the studies as measured by Cochran's Q and I2 homogeneity statistics, with p=0.0001 (except for biopsy and FISH studies, p=0.235 and p=0.064, respectively). The same applies to the overall comparison within strata (p=0.0001), but not for comparison between the strata (random effects model, p=0.193; fixed effects model, p=0.054). The percentage of variation (I2) is lowest (30.8%) for biopsy-based studies, and highest (92.7%) for IHC-based studies. Using the random effects model, studies based on biopsy alone give the highest point estimates of HPV prevalence (0.327, i.e. 32.7%), followed by SB- (27.3%), PCR- (22.0%) and ISH-based studies (20.9%).
All 100 studies were also subjected to meta-analysis stratified by the geographical origin of the study (Table III). There is significant heterogeneity (p=0.0001) between the studies from different geographical regions, except for Australia (p=0.123), with the percentage of variation from 57.9% (Australia) to 90.5% (Other Asia). There is three-fold higher effect size derived from studies carried out in China and Taiwan (HPV prevalence 37.7%) as compared with those from USA/Canada (12.5%), and the difference is more than two-fold as compared with the European studies (16.9%). Both the fixed effects and random effects models result in highly significant p-values for homogeneity (0.0001) both in the within-strata and between-strata summary comparisons, indicating substantial heterogeneity between studies from the same geographical region as well as between the different geographical regions, respectively.
The 100 studies were also subjected to meta-analysis stratified by the histological type (Table IV). There is significant heterogeneity between the studies analyzing AC, SCC and ASC, but not between the studies assessing HPV in LCC and SmCC (p=0.135 and p=0.574, respectively). The percentage of variation ranged from 36.8% (LCC) to 89.9% (SCC). With the random effects model, the point estimates for effect size are highest for SCC (HPV prevalence 25.1%), followed by LCC (20.3%), ASC (18.5%) and AC (15.1%). Both the fixed effects and random effects models result in highly significant p-values for homogeneity, both in the within-strata and between-strata summary comparisons, indicating substantial heterogeneity between the studies analyzing the same histological type of lung cancer (p=0.0001), and somewhat less between studies assessing different histological types (p=0.019).
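The within-strata and between-strata comparisons reported for the stratified analyses rest on a partition of the heterogeneity statistic. A minimal sketch of that partition under fixed-effects weights is given below; the weighting scheme, the data layout and the function names are assumptions for illustration, and the random-effects version used for the reported p-values would additionally include the among-study variance in the weights.

```python
def pooled_and_q(effects, variances):
    """Return (pooled estimate, total weight, Cochran's Q) for one set of studies."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * y for w, y in zip(weights, effects)) / total
    q = sum(w * (y - mean) ** 2 for w, y in zip(weights, effects))
    return mean, total, q

def partition_q(strata):
    """Split total heterogeneity into within- and between-strata parts.

    strata maps a stratum name to a pair (effects, variances).
    Returns (q_total, q_within, q_between).
    """
    all_effects, all_variances = [], []
    stratum_summaries = []
    q_within = 0.0
    for effects, variances in strata.values():
        mean, total, q = pooled_and_q(effects, variances)
        stratum_summaries.append((mean, total))
        q_within += q
        all_effects.extend(effects)
        all_variances.extend(variances)
    overall_mean, _, q_total = pooled_and_q(all_effects, all_variances)
    # Between-strata Q: weighted squared deviations of stratum means.
    q_between = sum(w * (m - overall_mean) ** 2 for m, w in stratum_summaries)
    return q_total, q_within, q_between
```

Under these weights the total Q equals the sum of the within-strata and between-strata parts, and the between-strata part is referred to a chi-square distribution with degrees of freedom equal to the number of strata minus one.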
Meta-regression. These stratified meta-analyses were followed by formal meta-regression to confirm the impact of the study-level co-variates (namely HPV detection methods, geographical origin, histological type) on the summary effect size. In Table V, all methods except SB (p=0.140) resulted in point estimates that were significantly different from the reference. Due to the large number of studies, the seemingly small effect size difference (−0.071) between ISH and PCR methods was also statistically significant (p=0.0001). However, the HPV detection method was not a significant study-level co-variate (p=0.473 for slope, i.e. regression coefficient β1 or effect parameter). The same was true when only the studies using ISH (n=17) or PCR (n=71) were included in this meta-regression (p=0.984 for slope).
Table VI gives the results of a similar meta-regression examining the impact of geographical origin as the study-level co-variate of the effect size. The effect size difference of 0.178 between the high-incidence regions (HIR: China and Taiwan, Other Asia, South America) and low-incidence regions (LIR: Australia, Europe, North America) was highly significant (p=0.0001). In meta-regression, geographical origin of the study had a significant impact on the effect size (p=0.020 for slope) only when used as dichotomized (HIR/LIR) variable, but not when all individual regions were included as separate categories (p=0.298).
A similar meta-regression was carried out using the histological type as the study-level co-variate (Table VII). Using AC as the reference, the effect size differences from the other histological types are all significant (p=0.003 to p=0.0001). In meta-regression, however, the histological type had no significant impact on the effect size (p=0.589) when all histological categories were included. When only AC and SCC categories were included, histological type is a significant study-level co-variate associated with HPV prevalence (p=0.035).
Publication bias. Potential publication bias was assessed by funnel plot asymmetry statistics, separately for the three major study characteristics. There was practically no evidence for publication bias among studies using different HPV detection methods (Begg p>0.05, Egger's p>0.05), except for PCR-based studies (Egger's p=0.0001). However, Duval and Tweedie's trim and fill method imputed no hypothetically missing studies, and thus had no effect on the adjusted point estimates for effect size. As to the studies stratified by geographical region, Begg and Egger's p-values did suggest some funnel plot asymmetry for studies from Other Asia (Begg p=0.012) and Europe (Egger's p=0.038). Duval and Tweedie's trim and fill method did not impute any hypothetically missing studies for these two regions, however, leaving the effect size estimates unchanged. Finally, both Begg and Egger's p-values suggested some funnel plot asymmetry for studies on AC (n=23) (p=0.013 and p=0.003, respectively), but again Duval and Tweedie's trim and fill method led to no adjustments in the effect size estimates for the AC studies. The same was true for the SCC studies (n=92) and ASC studies (n=8), despite Egger's p=0.0004, and p=0.014, respectively.
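Egger's test of the intercept can be sketched as an ordinary least-squares regression of the standardized effect (effect/SE) on precision (1/SE); an intercept far from zero suggests funnel plot asymmetry. The sketch below is an illustration with hypothetical names, not the routine of the software package used here, and it returns the t statistic only. The p-value would be taken from a t distribution with k-2 degrees of freedom.

```python
import math

def egger_regression(effects, standard_errors):
    """Regress effect/SE on 1/SE; return (intercept, slope, se_intercept, t)."""
    x = [1.0 / se for se in standard_errors]                  # precision
    z = [y / se for y, se in zip(effects, standard_errors)]   # standardized effect
    k = len(x)
    mean_x = sum(x) / k
    mean_z = sum(z) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z)) / sxx
    intercept = mean_z - slope * mean_x
    # Residual variance with k - 2 degrees of freedom.
    residual_ss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    s2 = residual_ss / (k - 2)
    se_intercept = math.sqrt(s2 * (1.0 / k + mean_x ** 2 / sxx))
    t = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, se_intercept, t
```

When every study estimates the same effect regardless of its size, the standardized effects fall on a line through the origin and the intercept is zero; a systematic tendency for small studies to report larger effects shifts the intercept away from zero. The function needs at least three studies with differing standard errors.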
Sensitivity analysis. Sensitivity analysis was performed to assess the influence of each individual study on the strength and stability of meta-analytic results. Meta-analytic results seemed robust to all (n=99) one-by-one study removals, with no change in the magnitude and precision of the fixed effects and random effects summary event rates.
Discussion
Since the first evidence in 1979 suggesting that HPV might be involved in the etiology of at least a subset of lung carcinomas (8-10), this topic has become a subject of increasing interest, with a widely expanded literature (15-24). To date, only two meta-analyses have been published (17, 18), and unfortunately, both have an incomplete coverage of the literature. The meta-analysis presented here is based on a systematic review, updating all the literature published since the author's own review of 2002 (16). Importantly, no restrictions were made according to the method used for HPV detection, even if some of the early DNA techniques are obsolete today, in order to test by formal meta-analysis the frequently presented concept that the wide variation in HPV prevalence in lung cancer is mainly due to the different detection techniques (16-18, 24, 36, 37). The other study-level co-variates with potential impact on the effect size considered in this meta-analysis are the geographical origin of the study and the histological type of lung cancer, also listed as potential causes of variation in HPV prevalence.
Assessing the heterogeneity in meta-analysis is crucial because the presence or absence of true heterogeneity (i.e. between-study variability) directly affects the statistical model that should be used to analyze the database (30, 137-139). The usual way of assessing whether true heterogeneity exists has been the Q test, originally introduced by Cochran (140). Non-significant homogeneity p-values in the Q test indicate that the homogeneity hypothesis should not be rejected, and justify the adoption of a fixed effects model, assuming that the estimated effect sizes only differ by sampling error (137). In contrast, significant p-values in the Q test imply true heterogeneity, warranting the use of a random effects model that includes both within- and between-study variability. The Q statistic has the shortcoming that it has poor power to detect true heterogeneity in meta-analyses including a small number of studies, but excessive power to detect even negligible variability when a large number of studies are available (30, 137-140). Furthermore, the Q statistic does not indicate the magnitude of true heterogeneity, only its statistical significance (137). To overcome these shortcomings, Higgins et al. (141) recently proposed three indices for assessing the heterogeneity in meta-analysis: the H2, R2, and I2 indices. Of the three, the I2 index measures the extent of true heterogeneity, interpreted as the percentage of the total variability in the effect sizes that is due to between-study variability. I2 values of 25%, 50% and 75% indicate a low, medium, and high degree of heterogeneity, respectively (137, 141). One of the major advantages of the I2 index is that the indices obtained from meta-analyses with different numbers of studies and different effect metrics are directly comparable (141).
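The relation between the Q statistic and the I2 index can be made concrete with a short sketch. It computes Cochran's Q from per-study effects and variances and derives I2 as the excess of Q over its degrees of freedom, expressed as a percentage of Q. The function name and the example data are hypothetical; the homogeneity p-value, which would come from a chi-square distribution with k-1 degrees of freedom, is omitted.

```python
def heterogeneity(effects, variances):
    """Return (Q, degrees of freedom, I2 in percent) for a set of studies."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is truncated at zero when Q is smaller than its degrees of freedom.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2

# Hypothetical example: two precise studies with clearly different effects.
q, df, i2 = heterogeneity([0.0, 2.0], [0.25, 0.25])
print(q, df, i2)  # 8.0 1 87.5
```

Because I2 is scaled by Q itself, it does not grow simply because more studies are added, which is why it is considered comparable across meta-analyses of different sizes.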
Given the above considerations, there is little doubt that marked heterogeneity exists between the studies within all HPV detection method categories (Table II), as indicated by the significant p-values for homogeneity (p=0.0001) for the Q test among most of the method categories, despite a markedly variable number of studies in each category (n=1 up to n=71). This is also concordant with the values of the I2 index, indicating that the percentage of the total variability within each method category is very high (up to 93.2%). This marked heterogeneity justifies the adoption of the random effects model to analyze the summary statistics for within- and between-strata heterogeneity (137-139). Using the random effects model, the most important conclusion from the data in Table II is that there is no true heterogeneity between the studies using different HPV detection techniques, as indicated by the non-significant p-value for homogeneity (p=0.193 for the random effects and p=0.054 for the fixed effects model) for the between-study comparison. In other words, we can revisit the concept raised in several recent reviews (15-18, 24), suggesting that the differences in HPV prevalence reported in the lung cancer literature would be explained by the different HPV detection techniques. In this meta-analysis, however, there are no formal grounds to reject the between-studies homogeneity hypothesis, thus precluding the role of HPV detection methods as being the main explanatory factor for the highly variable prevalence of HPV in lung cancer.
An alternative view suggests that this variable HPV prevalence in lung cancer is related to the different geographical regions of the study origin (15-18, 24, 36, 37). This has provoked a hypothesis that HPV plays a different role in lung cancer in the LIRs and in the HIRs (17, 18, 21, 24, 57, 112, 122, 123, 129). To validate this concept, we performed our meta-analyses stratified by the geographical origin of studies (Table III). Both the Q test and the I2 index demonstrate a marked heterogeneity between the published studies within all distinct geographical regions, irrespective of the number of studies included in each stratum. Using the above rationale, we adopted the random effects model to interpret the summary statistics for the between-strata heterogeneity. The highly significant summary p-value for homogeneity of the between-strata comparison leads to rejection of the homogeneity hypothesis. This implies that the major variation in HPV prevalence reported in the published literature is explained by the geographical origin of the studies; HPV prevalence is significantly higher in China and Taiwan, Other Asia and South America, as compared with the LIRs (Europe, Australia, North America) (15-18, 24, 36, 37).
The third potential source of variation in HPV prevalence is the histological type of lung cancer analyzed in different studies. To date, data on HPV detection have been provided for several histological types, most frequently for SCC and AC, but also for ASC, LCC and even SmCC (five studies) (Table IV). Major heterogeneity was found between studies of AC, SCC, and ASC, but not between those assessing LCC and SmCC. Using the random effects model to interpret the summary results, both the within-strata and between-strata comparisons are significant (p=0.0001 and p=0.019, respectively), confirming the substantial heterogeneity i) between the studies analyzing the same histological type of lung cancer, and (somewhat less) ii) between the studies assessing the different histological types. Thus, another source of variation in HPV prevalence seems to be the histological type of lung cancer, of which SCC and AC are the two clinically most important types.
In addition to these stratified meta-analyses that allow a descriptive comparison of the summary event rates across the different study characteristics, meta-regression was also performed here to formally compare these differences in summary effect sizes (35). In meta-regression with the HPV detection method as the co-variate, the regression coefficient for the effect parameter (β1, or slope) is not statistically significant (p=0.473). The same is true when the geographical origin of the study (p=0.298) and histological type of lung cancer (p=0.589) are tested for their impact as study-level co-variates. These data imply that despite the marked heterogeneity observed for these three study characteristics in stratified meta-analysis, no formal confirmation was obtained in meta-regression to indicate that any of the three are significant study-level co-variates accounting for the heterogeneity of the summary effect size estimates (i.e. HPV prevalence) of the lung cancer studies. Significant regression coefficients in meta-regression were only obtained if the geographical study origin was used as a dichotomized variable (HIR/LIR) (p=0.020), and if the histological type variable only included the two most important categories, AC and SCC (p=0.035). Including only ISH and PCR studies in the HPV detection method variable does not make the regression coefficient significant, however (p=0.984). Thus, unlike in another HPV-associated type of cancer, esophageal squamous cell cancer (ESCC), where the highly variable HPV prevalence across the geographical regions of study origin seems to indicate a different etiology in LIRs and HIRs (142), no similar conclusions can be drawn for lung cancer on the basis of the present meta-analysis.
However, as determined from the results of the stratified meta-analysis, the reported heterogeneity of HPV prevalence is attributed more to the geographical origin of the study and the histological type of lung cancer than to the HPV detection method itself. Considering only the studies based on ISH and PCR (representing the bulk of all studies; 88/100), it is obvious that the summary effect size is almost identical (20.9% and 22.0%, respectively). On the other hand, almost three-fold differences in effect size exist between the HIRs (e.g. China=0.377) and LIRs (e.g. North America=0.125), and similarly, even larger differences between the summary effect size of SCC (0.251) and SmCC (0.057).
Taken together, in the absence of any documented publication bias, and because of the robustness of all of our meta-analytic results in sensitivity analysis, it can be concluded that the reported wide variability in HPV detection rates in lung cancer is not mainly due to the HPV detection technique used, but is better explained by the geographical origin of the study and the histological type of lung cancer. Since this is not formally confirmed by the meta-regression, however, it seems premature to conclude that lung cancer has a different etiology in different geographical regions (2, 4-7, 15-18, 21, 24, 57, 112, 122, 123, 129). Similarly, failure to control for the smoking history and gender of the patients in the present meta-analysis precludes any speculation on the possibly different pathogenesis of lung cancer among smokers and non-smokers, as well as between the two genders (2, 4-7, 99, 136, 143-145). Prospective cohort studies are urgently needed to better evaluate the impact of HPV in the pathogenesis of lung cancer, as related to the other known risk factors.
- Received April 18, 2012.
- Revision received June 13, 2012.
- Accepted June 14, 2012.
- Copyright© 2012 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved