Abstract
Aim: To identify differentially expressed genes (DEGs) between perineural invasion-positive (PP) and -negative (PN) cutaneous squamous cell cancers (CSCC). Materials/Methods: Forty CSCC samples with and without perineural invasion were processed for RNA isolation and hybridization to Affymetrix-U219 DNA microarrays. Raw gene expression data were normalized by Robust Multi-array Averaging (RMA) and log2 transformed. Gene expression-based classification models were created and accuracies evaluated using leave-one-out cross-validation. Results: At a stringent limma p-value (p<0.001), 24 genes were differentially expressed between PP and PN samples. The cross-validated performance of the eight classification models exhibited a mean accuracy of 85-95%. Diagonal linear discriminant was most accurate at 95%, followed by Bayesian compound covariate at 94%. The poorest accuracy (85%) was observed for 1-Nearest neighbor and Support vector machines. Conclusion: Gene expression may distinguish between PP and PN CSCC. Understanding these gene patterns may potentiate more timely diagnosis of perineural invasion and guide comprehensive therapies.
- Squamous cell skin cancer
- perineural invasion
- genetic markers
- molecular markers
Cutaneous squamous cell carcinoma accounts for approximately 20% of new cases of non-melanoma skin cancer diagnosed each year (1). Some 2.5-14% of these cancers have perineural invasion and with simple excision have been found to recur 46% of the time (2). Many who present with clinical signs of perineural invasion, such as paresthesias or facial weakness, have a history of previous cutaneous squamous cell carcinoma (CSCC) resection with reported negative margins (3, 4). These recurrences often require skull base or sub-cranial resections in order to obtain negative margins. The morbidity of such resections and the subsequent adjuvant radiotherapy is certainly much greater than a local excision at the primary site. Identifying lesions with perineural invasion at the time of primary excision could direct more extensive surgical management at that time or radiotherapy to limit recurrences and poor outcomes.
Molecular markers of perineural invasion have been investigated in multiple cancer types; however, no consensus has been determined for CSCC of the head and neck. Markers in the head and neck, including membrane type-1 matrix metalloproteinase (MT1-MMP) (5), p75 nerve growth factor receptor (6), tyrosine kinase receptor 4 (7-9), foxp3 (10), alpha heat shock protein (11), Ecad proteins (12), CD44 (13) and nerve growth factor (NGF), as well as tropomyosin receptor kinase A (TrkA) overexpression in adenoid cystic carcinoma (14) have all been identified to be associated with perineural invasion; however, there has been limited replicability. To date there have been no gene expression profiling studies aimed at profiling cancer tissue with perineural invasion to find specific genes implicated in the disease. This study aims to identify gene signatures linked to perineural invasion using microarray analysis in order to build an initial framework of genes with the hope of better honing these signatures to a more specific signature in future studies. This signature could potentially be used to create a diagnostic or screening assay to be used in the clinical setting for identifying samples that are high-risk for perineural invasion.
Materials and Methods
Sample acquisition. The contents of the Tumor Tissue Shared Resource (Comprehensive Cancer Center of Wake Forest University, Winston Salem, NC, USA) were searched for cases of CSCC of the head and neck after Institutional Review Board approval. All samples within the tumor bank had previously been placed into the bank after patient consent. Fresh frozen samples were utilized for microarray expression analysis. Additional samples were collected in a prospective fashion by identifying patients in our outpatient head and neck oncologic clinics, as well as a private dermatologic surgery clinic. These patients were consented for use of their tissue samples in this study.
Confirmation of tumor. All eligible samples underwent review by a faculty or fellow dermatopathologist in order to confirm the presence or absence of perineural invasion and to assess the adequacy of cancer tissue versus benign stromal tissue. Perineural invasion was determined histologically by using hematoxylin and eosin stain. Tumor samples were macrodissected (1 to 4 cores, approximately 2 mm2, were punched from each block) to obtain tissue with >80% cellularity of malignant epithelium. Tumors with sparse cancer cellularity were excluded from analysis.
RNA processing and quality analysis. Tumor tissue samples were disrupted under liquid nitrogen and homogenized according to the Qiagen QiaShredder protocol (Valencia, CA, USA). Total RNA (approximately 1-10 μg per sample) was isolated from each tumor sample according to the Qiagen RNeasy Microarray Tissue Mini Kit protocol. Once adequate mass was confirmed using an Eppendorf BioPhotometer (Hauppauge, NY, USA), the quality of the RNA was assessed using an Agilent 2100 Bioanalyzer (Santa Clara, CA, USA). RNAs suitable for microarray processing were selected using the following criteria: (i) RNA integrity number (RIN) >8.0; and (ii) absorbance ratio (A260/A280) between 1.8 and 2.1.
Expression microarray analysis. Forty samples (21 perineural-positive (PP) and 19 perineural-negative (PN)) that met tissue and RNA quality criteria were profiled on the Affymetrix GeneAtlas U219 human genome array strip in the Comprehensive Cancer Center's Cancer Genomics Shared Resource. For each sample, 250 ng of total RNA was used as a starting template for the reverse transcription and labeling reactions. Exogenous PolyA controls were spiked into each sample to monitor quality of amplification reaction. The amplified and biotinylated cRNA targets were hybridized to the microarrays, stained, washed and scanned according to standard Affymetrix protocols. Log intensity distributions and pair-wise correlations between arrays were examined to assess quality of each microarray hybridization. The resultant raw data (CEL files) were normalized using the Robust Multi-array Average (RMA) algorithm (15) as implemented in the Affymetrix GeneAtlas Instrument Control Software and log2 transformed for downstream statistical analyses.
Analytical plan. Differentially expressed genes (DEGs) between PN and PP samples were identified by using random variance model for univariate tests implemented in BRB-Array Tools (16). At two different significant thresholds of p-values 0.001 and 0.005, we obtained 24 and 130 probe sets, respectively. In addition, we used Significant Analysis of Microarray (SAM) with target proportion of false discoveries of 0.05 to obtain 134 probe sets. The above selected sets of genes were used in pathway enrichment analyses using the National Institute of Allergy and Infectious Diseases (NIAID's) DAVID microarray resource (17) and construction of classification models.
BRB-Array Tools software (16) was used to develop gene expression-based sample classification models. Using leave-one-out cross-validation, the resulting accuracies of eight different classification algorithms were evaluated. They are (i) The Prediction Analysis for Microarrays (PAM), using the shrunken centroid algorithm (18); (ii) The Compound Covariate Predictor, using a weighted linear combination of log-intensities for genes that are univariately significant at the specified level (19); (iii) The Diagonal Linear Discriminant Analysis (DLDA), using linear discriminant analysis but ignoring correlations among the genes to avoid over-fitting the data (20); (iv-v) The 1 or 3 Nearest Neighbor Predictor, using Euclidean distance as the distance metric to predict the class of test samples; (vi) Nearest Centroid Prediction, predicting a test sample belonging to a class corresponding to the nearest centroid; (vii) Support Vector Machine (SVM), with linear kernel functions, to separate the data subject to penalty costs on the number of specimens misclassified (21); (viii) The Bayesian compound covariate predictor, using weighted average of the log expression values of the selected genes, with the weights being the t statistics of differential expression in that training set (22). For each outcome of the eight methods, accuracy=((A+B)/n), sensitivity=(A/(A+C)) and specificity=(B/(D+B)) were calculated. N represents the total number of samples, A represents the number of perineural samples predicted as perineural, C represents the number of perineural samples predicted as non-perineural, D represents the number of non-perineural samples predicted as perineural and B presents the number of non-perineural samples predicted as non-perineural.
Results
Twenty-one PP and 19 PN samples were analyzed and deemed adequate for analysis. At a stringent p-value threshold of p<0.001, 24 genes were differentially expressed between PP and PN specimens (Table I). At a p-value threshold of p<0.005, 130 genes were differentially expressed between PP and PN specimens. Gene expression-based classification models were developed using both p-value thresholds, p<0.001 and p<0.005.
At the p-value threshold of p<0.001, the cross-validated performance of the eight classification models exhibited a mean accuracy of 85-95%. DLDA was most accurate at 95%, followed by Bayesian compound covariate at 94%. The poorest accuracy (85%) was observed for 1-Nearest neighbor and Support vector machines. For all eight methods, the sensitivities and specificities ranged from 79%-95% (Table II).
At the p-value threshold of p<0.005, the cross-validated performance of the eight classification models exhibited a mean accuracy of 82-92%. PAM was most accurate at 92%, followed by the Compound covariate predictor, DLDA and Bayesian compound covariate predictor 90%. The poorest accuracy (82%) was observed for 1-nearest neighbor, 3-nearest neighbors and Support vector machines. For all eight methods, the sensitivities and specificities ranged from 63%-90%.
We performed gene ontology/pathway enrichment analysis on selected DEGs at both p-value thresholds of p<0.001 and p<0.005; however, no significant enrichments were observed.
Discussion
Perineural invasion of the head and neck has been identified as a marker of poor outcomes, decreased survival, increase locoregional recurrence and shorter time-to-recurrence (4). In the head and neck, cancer cells may spread along the entire cranial nerve network in a retrograde or anterograde fashion in a relatively low resistance manner. Once patients become symptomatic, many times with facial paresthesias or facial paresis, the tumor has often spread into larger named nerves or to the skull base, making surgical options more morbid. Though new excision techniques, such as Mohs micrographic surgery, are being utilized to allow serial frozen sectioning to track margins, these techniques generally treat smaller and less advanced skin cancers with more subtle perineural invasion (1). The presence of inflammatory cells and peritumoral fibrosis, as well as artifacts from sectioning, have contributed to the underreporting of perineural invasion in Mohs surgery (23). Therefore, diagnostic or screening methods utilizing gene markers of perineural invasion or more aggressive subtypes would decrease underreporting and identify high risk cancers at an earlier stage.
To our knowledge, expression profiling studies, seeking to identify genes involved in perineural invasion, have not been reported. However, there is published work on markers of aggressive subtypes and progression of squamous cell carcinoma. Kivisaari et al. used tissue microarrays and immunohistochemistry in order to find MMP-7 up-regulation in epidermolysis bullosa-associated CSCC, known to be an aggressive subtype, suggesting a potential therapeutic effect of epidermal growth factor receptor antagonists in treatment of advanced CSCC (24). Farshchian et al. found elevated expression of serpin peptidase inhibitor clade A member 1 (Serpin A1) in more aggressive subtypes of CSCC (25). Nathan et al. investigated markers of the MEK/ERK pathway in metastasis in patients undergoing elective parotidectomy with CSCC. They found increased expression of pS6 in CSCC with parotid metastasis compared to those without parotid metastasis. Though this work is useful in helping to differentiate between aggressive subtypes, we wanted to look at specifically perineural invasion and the gene markers associated with this process.
This study is unique in that we used DNA microarrays to examine the differential expression patterns of over 20,000 genes in PP and PN tissues. By testing classification models whose gene features were selected based on multiple significance thresholds, we found that the threshold of p<0.001 (24-gene set) demonstrated the best predictive performance overall, likely owing to the statistical power of using the top 24 statistically significant genes versus the top 130. Of these 24 genes, none could be identified in the literature as having previously been associated with perineural disease in CSCC. By using leave-one-out cross validation to control for overfitting, we were able to develop classification models with high accuracy with a range of 85-95%. In developing these classification models we found that these 24 genes have a robust ability to distinguish PP and PN tissues regardless of the classification algorithm used, though some perform better than others.
A weakness of this study was the small sample size analyzed. As the goal of this project was to create pilot data that could serve as a groundwork for future studies, we estimated that forty samples would be adequate to identify a number of lead genes capable of discriminating PP and PN. We believe the findings presented here provide a strong rationale for subsequent and larger population-based studies aimed at honing and validating the prognostic performance of the 24-gene classifier towards identifying PP disease at the time of primary excision.
Conclusion
Our work shows that gene expression patterns can distinguish PP and PN SCSC. Internally cross-validated classification models based on these gene patterns distinguish PP and PN cancers with high sensitivity and specificity. Gene-based classification models may potentiate more timely and objective diagnosis of perineural invasion and may have utility in guiding more comprehensive adjuvant therapies.
Acknowledgements
John Albertini, MD - Role: Dermatologic Mohs surgeon responsible for collection of samples in the outpatient dermatology clinic. James Cappellari, MD - Role: Faculty dermatopathologist responsible for design of histologic plan for confirmation of perineural disease. Kyle Mills, MD - Role: Dermatopathology fellow responsible for screening samples for perineural disease and presence of malignant tissue. Sara E West, MD - Role: Dermatopathology fellow responsible for screening samples for perineural disease and presence of malignant tissue. Greg Kucera, PhD - Role: Director of Advanced tumor bank who assisted with tissue sample procurement .
Footnotes
Conflicts of Interest
The Authors declare that they have no competing interests.
- Received June 16, 2016.
- Revision received July 6, 2016.
- Accepted July 7, 2016.
- Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved