Abstract
Background: No reliable biomarker for metastatic potential in the risk stratification of papillary thyroid carcinoma exists. We aimed to develop a gene-expression classifier for metastatic potential. Materials and Methods: Genome-wide expression analyses were used. Development cohort: freshly frozen tissue from 38 patients was collected between the years 1986 and 2009. Validation cohort: formalin-fixed paraffin-embedded tissues were collected from 183 consecutively treated patients. Results: A 17-gene classifier was identified based on the expression values in patients with and without metastasis in the development cohort. The 17-gene classifier for regional/distant metastasis identified was tested against the clinical status in the validation cohort. Sensitivity for detection of metastases was 51.5% and specificity 61.6%. Log-rank testing failed to identify any significance (p=0.32) regarding the classifier's usefulness as a prognostic marker for recurrence. Conclusion: A 17-gene classifier for metastatic potential was developed, and the results showed a clear biological difference between groups. However, through validation, no prognostic significance of this classifier was shown.
Thyroid cancer is the most common endocrine malignancy, and a significant rise in incidence has been reported in several countries (1-5). The increase is predominantly due to the papillary sub-type – especially smaller tumors (6, 7). Papillary carcinomas (PTC) account for approximately 80% of patients, and most can expect a favorable prognosis. However, tumor aggressiveness differs significantly, and prognostic scoring systems, based on clinical and histopathological factors, have been proposed (8-11). These systems were designed for estimation of survival, but they are also used for treatment planning. In both situations, the presence of regional or distant metastases plays a very important role.
Based on the increasing incidence in especially smaller PTCs, the decision to undertake lobectomy versus total thyroidectomy will become relevant for an increasing number of patients. Hence, knowledge on tumor potential for seeding of regional or distant metastases will be pivotal.
Molecular-based management strategies hold promise for the development of biological markers that can accurately predict adverse outcome and help risk stratification of patients. Most studies in this field have been based on sporadic investigations of random markers, most notably the B-Raf proto-oncogene, serine/threonine kinase (BRAF) mutation (12). With the success of the human genome project and advances in bioinformatics, the focus of interest has turned to gene-expression analysis, where the DNA or RNA levels are used to identify classifier genes. In 2001, Huang et al. demonstrated that several genes were uniformly expressed in a cohort of 14 PTCs, the conclusion being that this cancer type is “characterized by constant and specific molecular changes” (13). Subsequent studies have identified mutations that have been related to tumor progression and tendency to relapse (14-16). In 2011, Nilubol et al. published a genome-wide expression analysis of 64 patients and identified a 100-gene signature that was able to separate patients with and without PTC-associated mortality (17). Relapse of disease, however, is the main risk in this group of patients, and relapse risk is often related to metastatic disease. No signature for recurrence or metastatic potential has been identified. Using a national consecutive cohort, we developed a gene-expression classifier for metastatic potential by measuring RNA expression in the primary tumor at the time of cancer surgery. Furthermore, we investigated the ability of the gene classifier to identify metastatic and recurrent cases. We hypothesize that a gene classifier can identify patients with no risk after initial treatment.
Materials and Methods
Patients and tissue. For both the development and validation of the gene classifier, the tissue was collected at the time of primary cancer surgery, either hemi- or total thyroidectomy, and all tissue was derived from the primary tumor.
For the development of the gene classifier, gene-expression profiles were obtained from freshly-frozen tissue collected during 1986 to 2009 at the Department of Pathology, Odense University Hospital. After sampling, the tissue was placed in a small aluminum foil tray, covered with Tissue-Tek O.C.T. Compound (Sakura Finetek Europe B.V., Alphen aan den Rijn, the Netherlands), snap-frozen in dry-ice cooled isopentane for approximately 1 minute and then transferred to a −80°C freezer for storage. Tissues from 39 patients were available, but RNA extraction failed in one case; thus, the material consisted of 16 cases without metastasis and 22 cases with metastasis. To transfer the classifier to formalin-fixed paraffin-embedded (FFPE) tissue, a training set consisting of the same cohort was used; however, two specimens were missing, and RNA extraction again failed in one case, leaving 15 cases without metastasis and 20 cases with metastasis. Characteristics for the cohort are shown in Table I.
Validation of the gene classifier was performed on a consecutive cohort of patients registered in the national prospective DATHYRCA database (18), from which clinical data were extracted. Tissue was fixed in formalin and embedded in paraffin following standard protocols at Odense and Aarhus University Hospitals at the time of surgery. FFPE tissue was stored at room temperature until the time of analysis. Patients from Aarhus were diagnosed during 2000 to 2009, and patients from Odense between 1996 and 2009. Tumors less than 5 mm in diameter were not included in order to secure sufficient tumor material for analyses. The selection process is shown in Figure 1.
Survival time was defined as the time from first cytological or histological verification until event or censoring, and follow-up was secured for all patients by review of medical history.
Recurrences were defined as persistent disease or occurrence of disease after the end of primary treatment. This was confirmed by the following modalities: histology, cytology or imaging. Follow-up ended on May 1, 2014. Patients were censored at death or emigration from Denmark.
To be classified as having metastatic disease, one or more of the following criteria had to be met: histological or cytological verification; identification by imaging, including computed tomography (CT), magnetic resonance imaging, ultrasonography, positron-emission tomography–CT or radioiodine scan. To be classified as not having metastatic disease, the patient had to fulfill all of the following criteria: no clinical suspicion of metastases; metastatic disease not suspected on any kind of imaging; at least 5 years of follow-up without evidence of metastases.
Patient and tumor characteristics for the included patients are shown in Table I. The validation cohort was not significantly different from the development cohort, apart from sex and a smaller proportion of multifocal cases.
Informed consent. The study was approved by the Danish Regional Ethics Committee and by the Danish Data Protection Agency. As stated in the approval from the Danish Regional Ethics Committee, an exemption from informed consent was granted (Ref. S-20090050).
Quantification of gene expression. Gene-expression profiles from freshly-frozen tissue were obtained using Affymetrix Human Genome U219 arrays (ArosAB, Aarhus, Denmark). The data are deposited in the NCBI Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) and are accessible through GEO Series accession number GSE65074.
Gene-expression analyses on FFPE tissue were performed according to the methods described by Toustrup et al. (19). Briefly, RNA was extracted from a 7-μm section of FFPE biopsies with a silica bead-based, fully automated isolation method for RNA on a robotic Tissue Preparation System using VERSANT Tissue Preparation Reagent (Siemens Healthcare Diagnostics, Tarrytown, NY, USA). cDNA was generated using the High Capacity cDNA Archive kit and pre-amplified using the Taqman PreAmp Master Mix Kit (Applied Biosystems, Life Technologies Europe BV (Denmark), Naerum, Denmark). Quantitative polymerase chain reaction (PCR) was performed on an ABI Prism 7900 HT Sequence Detector (for identifying reference genes) or the Fluidigm Biomark 96.96 dynamic gene-expression system using Taqman Gene Expression PCR mastermix (Applied Biosystems). Gene expression levels were calculated using RealTime Statminer (Integromics, Madison, WI, USA). The final signature included 20 genes (see ‘Results’) and four reference genes [calmodulin 2 (phosphorylase kinase, delta) (CALM2); glucuronidase, beta (GUSB); polymerase (RNA) II (DNA-directed) polypeptide A, 220 kDa (POLR2A); and ribosomal protein L37A (RPL37A)]. The four reference genes were identified using the GeNorm and Normfinder applications available in RealTime Statminer (Integromics) and were selected among 22 potential reference genes.
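The normalization of a target gene against the four reference genes can be illustrated with a minimal sketch. It assumes a plain delta-Ct calculation with the arithmetic mean of the reference Ct values and 100% PCR efficiency; the study itself followed Toustrup et al. (19) and used RealTime Statminer, so this is not the authors' exact procedure. The function name and all Ct values are hypothetical.

```python
from statistics import mean

REFERENCE_GENES = ("CALM2", "GUSB", "POLR2A", "RPL37A")

def delta_ct(ct_values, target, references=REFERENCE_GENES):
    """Return Ct(target) minus the mean Ct of the reference genes.

    A lower delta-Ct corresponds to higher relative expression.
    """
    ref_mean = mean(ct_values[g] for g in references)
    return ct_values[target] - ref_mean

# Hypothetical Ct values for one FFPE sample (not data from the study).
sample_ct = {
    "CALM2": 20.0, "GUSB": 24.0, "POLR2A": 25.0, "RPL37A": 19.0,
    "ZEB1": 26.5,
}

d = delta_ct(sample_ct, "ZEB1")
# Relative expression under the assumed 100% amplification efficiency.
relative_expression = 2 ** (-d)
```

Averaging Ct values across several reference genes is equivalent to taking the geometric mean of their quantities, which is why more than one reference gene is typically selected.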
Classification into groups with and without metastasis. Classification was based on the methods described by Toustrup et al. (19). The investigator performing the classification procedure was blinded to data on patient status until the classification had been performed.
Statistical analyses. Descriptive statistics were derived according to data type, i.e. categorical variables are reported as frequencies and respective percentages, whereas continuous variables are summarized as medians and ranges. The Kaplan–Meier method was used to evaluate survival. The outcome variable was time to recurrence. Fisher's exact test and the Chi-square test were used to compare categorical variables between groups. The level of accepted significance was 5% (two-sided). The database and analysis system Medlog (Information Analysis Corporation, Crystal Bay, NV, USA) was used for data registration, and STATA/IC 11 (StataCorp LP, College Station, TX, USA) was used for statistical analyses. Profiling data were analyzed using the R packages SAMR (cran.r-project.org/web/packages/samr) and PAMR (cran.r-project.org/web/packages/pamr).
Results
In the development cohort, gene-expression profiles were obtained from 38 patients, 22 of whom had been diagnosed with metastases. Initially, SAM analyses (two-class unpaired comparison) were performed using the true classification of patients and with 10 random classifications. In the random classifications, the number of patients with metastases in each classification was kept at 22. SAM analyses were performed with false-discovery rates (FDR) of either 0.01, 0.05 or 0.1. At the lowest FDR, 135 up-regulated probes were identified in patients with metastasis, and no down-regulated probes were identified. For the random classifications, a mean of 2 up-regulated genes were identified (range=0-72). At all three FDR levels, the true classification of patients consistently identified more probes than the random classifications.
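The logic of the random-classification control can be sketched as follows. This is an illustration only: a Welch t-statistic with a fixed cutoff stands in for the moderated statistic and FDR estimation of SAM, the expression matrix is synthetic, and the cutoff, probe count and effect size are arbitrary assumptions. What it does reproduce from the text is the design of 38 patients, 22 labeled as metastatic, and 10 shuffled labelings that keep the number of metastatic labels at 22.

```python
import random
from statistics import mean, stdev

def welch_t(a, b):
    """Welch t-statistic for two independent samples."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)

def count_up(expr, labels, cutoff):
    """Count probes whose t-statistic (metastatic vs. not) exceeds cutoff."""
    n = 0
    for probe in expr:
        pos = [x for x, lab in zip(probe, labels) if lab]
        neg = [x for x, lab in zip(probe, labels) if not lab]
        if welch_t(pos, neg) > cutoff:
            n += 1
    return n

rng = random.Random(0)
N_PATIENTS, N_MET, N_PROBES = 38, 22, 200
true_labels = [True] * N_MET + [False] * (N_PATIENTS - N_MET)

# Synthetic data: the first 20 probes are shifted upward in metastatic cases.
expr = [
    [rng.gauss(2.0 if (p < 20 and lab) else 0.0, 1.0) for lab in true_labels]
    for p in range(N_PROBES)
]

true_count = count_up(expr, true_labels, cutoff=3.5)

random_counts = []
for _ in range(10):
    shuffled = true_labels[:]
    rng.shuffle(shuffled)  # permuting keeps exactly 22 "metastatic" labels
    random_counts.append(count_up(expr, shuffled, cutoff=3.5))
```

If the true labeling carries signal, `true_count` should clearly exceed the counts obtained with shuffled labels, which is the comparison reported in the paragraph above.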
Next, PAM analysis was performed using the true classification. An optimal threshold was selected with respect to number of correctly classified samples and FDR. The analysis resulted in 110 probes (positive classification rate=74%, FDR=0.02). The final gene list consists of probes found by both SAM (FDR=0.01) and PAM (total=80 probes) and expressed in the upper 33% of all probes (30 probes). These 30 probes correspond to 20 unique genes, the majority of which are involved in signal transduction and have been associated with metastasis in solid tumors/epithelial–mesenchymal transition. The expression pattern is shown in Figure 2.
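The selection steps described above (overlap between the two methods, an expression filter, then collapsing probes to genes) can be sketched with set operations. The probe identifiers, expression values and gene names are hypothetical, and the expression filter is assumed to mean the top third of probes ranked by mean expression, which is one possible reading of the criterion.

```python
def top_third(mean_expr):
    """Return the probe IDs in the top third when ranked by mean expression."""
    ranked = sorted(mean_expr, key=mean_expr.get, reverse=True)
    return set(ranked[: len(ranked) // 3])

# Hypothetical probes and values (not study data).
mean_expr = {"p1": 9.1, "p2": 3.2, "p3": 8.4, "p4": 5.0, "p5": 7.7, "p6": 2.1}
sam_hits = {"p1", "p2", "p3", "p5"}
pam_hits = {"p1", "p3", "p4", "p5"}
probe_to_gene = {"p1": "GENE_A", "p3": "GENE_A", "p5": "GENE_B"}

shared = sam_hits & pam_hits                 # probes found by both methods
final_probes = shared & top_third(mean_expr)  # keep highly expressed probes
final_genes = {probe_to_gene[p] for p in final_probes}
```

Because several probes can target the same gene, the number of unique genes is smaller than the number of probes, as with the 30 probes and 20 genes reported here.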
Before expression could be analyzed in FFPE tissue, four reference genes (CALM2, GUSB, POLR2A, and RPL37A) were selected based on an analysis of 22 potential reference genes. The expression of the 20 genes was compared to the corresponding values from microarray measurements on the freshly frozen tissue samples from the same patients. Significant correlations were found for 17 out of the 20 genes [ADAM metallopeptidase with thrombospondin type 1 motif, 1 (ADAMTS1); anthrax toxin receptor 1 (ANTXR1); complement component 7 (C7); chemokine (C-X-C motif) ligand 12 (CXCL12); early B-cell factor 1 (EBF1); fibulin 2 (FBLN2); FOS-like antigen 2 (FOSL2); gamma-glutamyltransferase 5 (GGT5); G protein-coupled receptor 124 (GPR124); junctional adhesion molecule 3 (JAM3); leucine-rich repeats and immunoglobulin-like domains 1 (LRIG1); N-myc downstream-regulated 1 (NDRG1); paired related homeobox 1 (PRRX1); roundabout, axon guidance receptor, homolog 1 (Drosophila) (ROBO1); sortilin-related receptor, L (DLR class) A repeats-containing (SORL1); transcription factor 4 (TCF4), and zinc finger E-box-binding homeobox 1 (ZEB1)]. Based on the expression values of these genes in the two groups with and without metastasis, a classifier was developed for individual classification of FFPE samples.
Finally, the gene-expression signature was validated in a series of 183 patients on FFPE material. The median follow-up time for these patients was 8.0 years (range=0.003-17.2 years). The 17-gene classifier for regional/distant metastasis was tested against the clinical status, and the results are shown in Table II. Sensitivity for detection of metastasis was 51.5% [95% confidence interval (CI)=41.2%-61.8%], and specificity was 61.6% (95% CI=50.5%-71.9%).
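A per-gene comparison between the two platforms could look like the sketch below. The type of correlation is not stated in the text, so Pearson's coefficient is an assumption, significance testing is omitted, and the paired values are invented. The qPCR values are negated because a lower delta-Ct means higher expression.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired measurements for one gene in five patients.
microarray_log2 = [6.0, 7.0, 8.0, 9.0, 10.0]   # frozen tissue, log2 signal
qpcr_delta_ct = [5.1, 4.0, 3.2, 1.9, 1.0]      # FFPE tissue, delta-Ct

# Negate delta-Ct so that both scales increase with expression.
r = pearson_r(microarray_log2, [-d for d in qpcr_delta_ct])
```

A gene would be carried forward to the FFPE classifier only if its two measurements agree in this way across patients, which is the filter that reduced the list from 20 to 17 genes.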
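For readers who want to see how such estimates are obtained from a 2×2 table, the following sketch computes sensitivity and specificity with exact (Clopper-Pearson) binomial confidence intervals. The interval method is an assumption, since the text does not name it. The counts are hypothetical: they were chosen only so that the point estimates match the reported percentages in a cohort of 183, and they are not taken from Table II.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _bisect(f, lo=0.0, hi=1.0, iters=60):
    """Locate the zero of a function that decreases from positive to negative."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper

# Hypothetical counts (illustration only, not the published Table II).
tp, fn = 50, 47   # patients with metastasis: classified positive / negative
tn, fp = 53, 33   # patients without metastasis: classified negative / positive

sens = tp / (tp + fn)
spec = tn / (tn + fp)
sens_ci = clopper_pearson(tp, tp + fn)
spec_ci = clopper_pearson(tn, tn + fp)
```

With a sensitivity interval that includes 50%, a classifier of this kind cannot be distinguished from chance for detecting metastatic cases.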
The Kaplan–Meier method was used to estimate whether the classifier was useful as a prognostic marker for all recurrences, whether located at the T (thyroid bed), N (regional lymph nodes) or M (distant) site. Figure 3 shows a plot of recurrence-free survival dichotomized according to the classifier as ‘metastatic’ or ‘not metastatic’. The log-rank test showed no significant difference in recurrence-free survival between the two groups (p=0.32). When only N and M site recurrences were considered as events, the difference was again not significant (p=0.90).
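The two procedures named above can be sketched in a few lines. This is a textbook implementation of the Kaplan–Meier estimator and the two-group log-rank test, not the STATA routines used in the study, and the follow-up times and event indicators are invented.

```python
from math import erfc, sqrt

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up in years; events: 1 = recurrence, 0 = censored.
    """
    data = list(zip(times, events))
    surv, curve = 1.0, []
    for t in sorted({u for u, e in data if e}):
        at_risk = sum(1 for u, _ in data if u >= t)
        d = sum(1 for u, e in data if u == t and e)
        surv *= 1 - d / at_risk
        curve.append((t, surv))
    return curve

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p-value) on 1 df."""
    pooled = zip(times_a + times_b, events_a + events_b)
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({u for u, e in pooled if e}):
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n      # observed minus expected in A
        if n > 1:                                # hypergeometric variance
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp**2 / var
    return chi2, erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df

# Hypothetical follow-up data for two classifier groups.
t_met, e_met = [1.0, 2.0, 3.0, 4.0, 5.0], [1, 0, 1, 0, 0]
t_non, e_non = [1.5, 2.5, 3.5, 4.5, 5.5], [0, 1, 0, 0, 0]

curve_met = kaplan_meier(t_met, e_met)
chi2, p_value = logrank(t_met, e_met, t_non, e_non)
```

The test compares observed and expected events in one group at every event time, so a non-significant result means the two recurrence-free survival curves cannot be told apart over the whole follow-up period.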
Discussion
A 17-gene classifier for metastatic potential was developed, and the results showed a clear biological difference between groups. Through validation, however, no prognostic significance of this classifier was shown in identifying metastatic cases or in the ability of dichotomizing patients according to risk of recurrence after primary treatment. Therefore, the difference shown does not seem to be related to metastasis or recurrence of PTC.
The strengths of this study are related primarily to the patient population. The cohort was consecutive, and the classifying investigator was blinded to clinical outcome, which should have reduced selection biases. In Denmark, all patients are equipped with a unique 10-digit personal identification number (that also contains birth date and sex information). Using the data on the personal identification number, it is possible to trace the individual patient through all their contacts with the hospital and clinical outpatient services, thus ensuring the possibility of long-term follow-up. Furthermore, to our knowledge, this is the first study addressing the subject of identifying a gene profile for metastatic potential based on genome-wide expression analysis.
Some limitations need to be considered when interpreting the results of this study. No standardization of treatment protocols was performed and different treatment modalities were used, which may have had an influence on outcome in the included cases. Since 2001, however, all patients in Denmark with thyroid cancer have been treated according to the national guidelines, which reduces the influence from the treatment aspect.
Furthermore, a substantial proportion of the included cases did not have nodal surgery performed, and even when performed, not all levels were evaluated. One might argue that cases without metastases could in fact harbor silent metastases. This potential bias cannot be further evaluated. However, all patients in the group without metastasis were without clinical suspicion, and ultrasound evaluation was routinely performed. In addition, these patients were followed up for at least five years after diagnosis (median 8.2 years), and cases exhibiting signs of metastases during follow-up were classified as metastatic cases. Nevertheless, PTC is known to develop late recurrences (20), and prolonged follow-up would have been preferable.
Whether the identified metastases are certain to have developed from the evaluated tumor should also be addressed. In this study, we adopted the theory of metastatic dormancy, a concept in which micrometastatic lesions or individual cancer cells can survive in a quiescent state in a metastatic niche without progression; it has been suggested that this is the means by which differentiated thyroid carcinomas metastasize (21). When no further tumors are found in the thyroid gland, metastases should then stem from the evaluated tumor. For this reason, data were also analyzed where T site recurrences were not considered events, and here the profile was also not significant. With regard to this issue, multifocal cases pose another limitation; such cases harbor more than one tumor, and there are uncertainties as to which tumor seeded the metastatic cells. Bansal et al. have suggested that multifocal cases are in many instances actually multiple synchronous primary tumors (22). With this in mind, an argument could be made for the exclusion of multifocal cases. However, it was not possible to secure a sufficient number of cases in this study to allow this. A secondary analysis was performed in order to evaluate whether the classifier could dichotomize unifocal cases according to risk of recurrence. Even here, however, significance was not reached (results not presented).
In this study, the FFPE tumor tissue was routinely stored during the inclusion period; hence, the decay of mRNA might influence the results. However, Tramm et al. have shown that it was possible to analyze gene expression for 16 reference genes in material stored for 1 to 29 years, despite a half-life for mRNA of 4.6 years (23). In our study, the tissue was not stored beyond 20 years, such that mRNA extraction was considered feasible. Moreover, normalization was performed according to the method described in Tramm et al.'s study.
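To put the quoted half-life in perspective, the fraction of mRNA expected to remain after a given storage time can be estimated with a one-line calculation. The sketch assumes simple first-order (exponential) decay, which is an assumption implied by the notion of a half-life rather than a model taken from this study; the function name is hypothetical.

```python
def fraction_remaining(years, half_life_years=4.6):
    """Fraction of mRNA left after `years`, assuming first-order decay."""
    return 0.5 ** (years / half_life_years)

after_20 = fraction_remaining(20)  # upper bound on storage time in this study
after_29 = fraction_remaining(29)  # longest storage time in Tramm et al. (23)
```

Under this assumption roughly 5% of the original mRNA would remain after 20 years and just over 1% after 29 years, which illustrates why normalization against reference genes measured in the same aged material matters.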
A profile consisting of 17 up-regulated genes was identified using freshly frozen tissue. Previously, up-regulation of a number of these genes was associated with cancer progression: ADAMTS1, a metalloproteinase, was found to promote breast cancer progression (24); ANTXR1 was associated with angiogenesis in colorectal cancer (25); CXCL12 was found to induce tumor growth and metastasis (26); FOSL2 was associated with metastatic progression in breast cancer (27); GPR124 mediated endothelial cell survival and was related to resistance to therapy in non-small cell lung cancer (28); JAM3 promoted metastases in lung cancer and malignant melanomas (29); inhibition of TCF4 was found to inhibit cell proliferation and induce apoptosis in colorectal cancer cells (30); ZEB1 was found to correlate with metastasis in endometrial, colorectal, and prostate cancer (31). Two other genes, PRRX1 and TGFBR3, have been described both as tumor suppressors and as tumor promoters (32, 33).
Conversely, some genes were previously described as being down-regulated in tumor progression: C7 expression was reduced in esophageal carcinoma (34); EBF1 was associated with Hodgkin's lymphoma (35); FBLN2 was found to act as a tumor suppressor in nasopharyngeal cancer (36); LRIG1 was described as a tumor suppressor in several cancer types (37); NDRG1 was found to be down-regulated in colorectal cancer (38); NFIX was inversely related to metastasis of breast cancer (39); ROBO1 was described as a tumor suppressor, and its down-regulation was believed to promote tumor cell migration (40).
To our knowledge, the remaining genes: EPB41L2, GGT5, and SORL1, have not been described in relation to cancer.
In sum, the genes included in the gene profile have been found to play different roles in cancer progression. Even though the role of a gene may vary between cancer types, the failure of the gene profile to fulfill the aim of the study is consistent with current knowledge.
Conclusion
With the failure to validate a gene classifier for metastasis in this study, the prognostic scoring systems are still the best available tool for stratifying patients according to risk. A recently developed prognostic system for recurrence appears to be the most suitable option when estimating risk of recurrence (41). A biological marker that can accurately predict metastatic potential, and thus aid in the risk stratification of patients, is still needed. Such a marker would allow more aggressive treatment to be reserved for those patients who are at high risk of recurrence of this cancer, which generally carries a favorable prognosis. Furthermore, all currently available tools are dependent on post-surgical evaluation. Perhaps a biological marker could be applied prior to surgery, as seen in a commercially available gene classifier developed for cytologically indeterminate thyroid nodules (42, 43).
Acknowledgements
The work was supported by Odense University Hospital, University of Southern Denmark, DAHANCA, the Danish Cancer Society, Fabrikant Einer Willumsens Mindelegat, Becket-Fonden, and Else og Mogens Wedell-Wedellsborgs Fond.
Footnotes
* DATHYRCA Group (in alphabetic order): Andersen LJ, MDa; Andreassen N, MDb; Bastholt L, MDc; Bentzen J, MDd; Bülow I, MDa; Ebbehøj EV, MDb; Feldt-Rasmussen U, MD DMSc.e; Fisker RV, MDa; Hahn CH, MDe; Godballe C, MD Ph.D.c; Grupe P, MDc; Hegedüs L, MD DMSc.c; Larsen SR, MDc; Jespersen ML, MDb; Kiss K, MDe; Kristensen M, MDa; Lelkaitis G, MDa; Morsing A, MDb; Nygaard B, MD Ph.D.d; Oturai P, MDe; Pedersen HB, MDa; Schytte S, MDb. aAalborg University Hospital, Denmark; bAarhus University Hospital, Denmark; cOdense University Hospital, Denmark; dHerlev Hospital, Denmark; eCopenhagen University Hospital, Denmark.
Competing Interests
No competing financial or non-financial interests exist.
- Received December 7, 2015.
- Revision received January 18, 2016.
- Accepted January 19, 2016.
- Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved