Abstract
Background/Aim: The epithelial–mesenchymal transition (EMT) program has been described as a driver of metastatic dissemination because it confers migratory and invasive capacity on cancer cells. Gastric cancer (GC) patients whose tumors express altered levels of EMT markers have poor survival. This study aimed to assess whether polymorphisms of the CDH1, TWIST1, SNAIL2, ZEB1 and ZEB2 genes are associated with survival in GC patients. Patients and Methods: A total of 153 individuals diagnosed with GC were recruited in Santiago, Chile. All patients were genotyped using the Infinium Global Screening Array (GSA). Twenty tag SNPs in the studied genes were retrieved. Results: Three SNPs were associated with survival: rs2526614 (TWIST1) (genotype CA + AA, adjusted HR=0.58, 95%CI=0.37-0.93), rs6953766 (TWIST1) (genotype GG, crude HR=2.02, 95%CI=1.06-3.82, adjusted HR=2.14, 95%CI=1.07-4.25), and rs431073 (ZEB1) (genotype AC + CC, crude HR=1.62, 95%CI=1.01-2.59, adjusted HR=1.96, 95%CI=1.18-3.25). Conclusion: To the best of our knowledge, this is the first study proposing a role for these SNPs in cancer prognosis. Their use as prognostic markers of GC survival warrants further investigation.
Gastric cancer (GC) is the sixth most common cancer and the third leading cause of death due to cancer worldwide (1). Mortality due to GC varies according to world region, from an age-standardized rate (ASR) of 15.2 per 100,000 in East Asia to 2.3 per 100,000 in North America. Chile has one of the highest GC mortality rates worldwide, with an ASR of 16.7 per 100,000 predicted for 2017 (1, 2).
More than half of patients are diagnosed at an advanced stage (TNM III-IV), where the 5-year overall survival (OS) rate is less than 50% (3). In Chile, it was estimated that 3 out of 4 cases are diagnosed at stage III or IV, with a 5-year survival of 10.6% (4). The stage of cancer at diagnosis is a key factor in defining prognosis and is therefore a critical element in determining appropriate treatment. The most widely recognized evidence-based GC staging system in practice is the tumor-node-metastasis (TNM) system of the American Joint Committee on Cancer (AJCC) (5). The area under the curve (AUC) of TNM staging for predicting GC survival is close to 0.8 (6). Because this value is below 1 (which would indicate perfect prediction), other factors such as host characteristics are likely to play a role in GC prognosis. A handful of single nucleotide polymorphisms (SNPs) have been described as risk factors for GC (7), and there is growing evidence associating SNPs with GC survival (8).
Metastasis is estimated to be responsible for over 90% of patient mortality associated with solid tumors (9). Epithelial–mesenchymal transition (EMT) is known as a driver of metastatic dissemination because it confers migratory and invasive capacity on cancer cells and allows them to acquire a cancer stem cell state (10). EMT is a program that transforms epithelial cells into mesenchymal ones. This program is involved not only in cancer, but also in embryonic development, wound healing and fibrosis (11). EMT is triggered by various signals, including fibroblast growth factors (FGFs) and the Wnt pathway (10). Canonical downstream effectors include EMT transcription factors belonging to the zinc finger or bHLH families, such as Snail, Zeb, and Twist. These factors suppress the expression of the prototypical adhesion protein E-cadherin at the transcriptional level (11). Interestingly, mutations in CDH1 (the gene encoding E-cadherin) are responsible for hereditary diffuse gastric cancer. EMT is closely related to GC: EMT markers are up-regulated in patients with dysplasia or early GC, whereas the level of E-cadherin is decreased in these patients (12). GC patients whose tumors express low levels of E-cadherin and high levels of Snail, Zeb and Twist have poor survival (13-15).
Thus far, no studies have investigated the prognostic value of polymorphisms in EMT genes in GC patients. The present study examined whether variants of the CDH1, TWIST1, SNAIL2, ZEB1 and ZEB2 genes are associated with OS in GC patients.
Patients and Methods
Patients. A total of 153 individuals with a preoperative diagnosis of gastric adenocarcinoma were recruited at the time of surgical resection between December 2010 and August 2014 from different hospitals located in Santiago de Chile: University of Chile Clinical Hospital, Salvador Hospital, Barros Luco Trudeau Hospital, and San Juan de Dios Hospital. Only cases with a postoperative histopathological diagnosis of gastric adenocarcinoma were included. In all cases, the tumor was located distal to the cardia. Passive follow-up was performed by obtaining death reports from the Civil Registry and Identification Service of Chile. The last follow-up was in January 2018. OS was considered as the endpoint. Patients alive at the last follow-up were considered censored. Tumor size, depth of invasion, and lymph node metastasis were obtained from the histopathological report. Lauren's criteria were used to classify tumors as intestinal or diffuse.
Ethical approval. This study was approved by the Ethical Committee of the following institutions: University of Chile School of Medicine (#023/2011), University of Chile Clinical Hospital (#029/2011), Metropolitan South-Santiago Public Health Agency (#MK523B-118), Metropolitan East-Santiago Public Health Agency (#24/01/2012), and Metropolitan West-Santiago Public Health Agency (#236/2009). All participants gave their written informed consent. The study was performed in accordance with the Declaration of Helsinki.
Genotyping and SNP selection. Genomic DNA was isolated from peripheral blood leukocytes using either the salting-out method with Proteinase K or the method described by Chomczynski and Sacchi (16). In both cases, genomic DNA was repurified using Monarch PCR and DNA Cleanup columns (NEB, MA, USA). DNA samples were genotyped using the Infinium Global Screening Array (Illumina, CA, USA) according to the manufacturer's instructions at the Human Genotyping Facility (HuGe-F) at Erasmus MC, The Netherlands. Per-individual and per-marker quality controls were performed according to the guidelines of Anderson et al. (17). All the studied patients and SNPs passed the quality control unless otherwise indicated. The studied genes were CDH1, TWIST1, SNAIL2, ZEB1, and ZEB2. The analyzed SNPs were selected from the manifest file of the GSA array according to the following criteria: (1) located between 5 kb upstream of the transcription start site and 5 kb downstream of the stop codon according to the GRCh37 assembly of the human genome; (2) minor allele frequency (MAF) higher than 0.16; (3) not in strong linkage disequilibrium (LD) with other SNPs contained in the array (SNPs with r2>0.8 were considered to be in LD); and (4) no departure from Hardy-Weinberg equilibrium (HWE) in the studied population (departure defined as p<0.05). The final list of the analyzed SNPs was: CDH1: rs16260, rs2098728, rs7186053; TWIST1: rs2526614, rs10240058, rs10228406, rs6953766, rs17140672, rs73325513; SNAIL2: rs2582778; ZEB1: rs431073; ZEB2: rs35339313, rs16823675, rs13013418, rs7597006, rs7599224, rs12327962, rs13382811, rs12691693, rs6740731.
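Selection criteria (2) and (4) can be illustrated with a short sketch. The code below is not the authors' pipeline. It assumes that genotype counts per SNP are available and that HWE is checked with a 1-degree-of-freedom chi-square goodness-of-fit test, which is one common choice but is not stated in the text. The function names are hypothetical; the thresholds (MAF higher than 0.16, HWE departure at p<0.05) are those given above.

```python
import math

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Return the minor allele frequency from the three genotype counts."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)

def hwe_chi_square_p(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square goodness-of-fit test for HWE (assumed test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_filters(n_aa, n_ab, n_bb, maf_min=0.16, hwe_alpha=0.05):
    """Keep a SNP if MAF > maf_min and there is no HWE departure (p >= hwe_alpha)."""
    maf = minor_allele_frequency(n_aa, n_ab, n_bb)
    return maf > maf_min and hwe_chi_square_p(n_aa, n_ab, n_bb) >= hwe_alpha

# Illustrative counts only; these are not data from the study.
print(passes_filters(49, 42, 9))   # genotype counts in exact HWE proportions
print(passes_filters(50, 0, 50))   # no heterozygotes: strong HWE departure
```

With small expected counts an exact HWE test would usually be preferred over the chi-square approximation used in this sketch.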
Statistical analyses. A forward stepwise Cox regression model was used as the variable selection procedure to obtain a candidate model to predict GC survival, including SNPs (additive model) and clinicopathological variables, using p<0.05 for addition to the model and p>0.051 for removal from the model. The assumption of proportional hazards was tested according to Grambsch and Therneau (18) for all the selected variables. Hazard ratios (HR) were estimated from Cox regression models using the selected variables as predictors. The association of polymorphisms with survival was also assessed with the Kaplan–Meier method and the log-rank test. Median survival time (MST) was estimated at the 50th percentile unless otherwise indicated. The Hausman specification test (19) was used to assess whether the beta coefficients from two Cox regression models differ. p-Values were 2-sided and p<0.05 was considered statistically significant. All statistical analyses were performed using Stata 12 (StataCorp LLC, TX, USA). For SNPs, allele (additive), dominant and recessive models were considered.
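The three genetic models mentioned above differ only in how a genotype is converted into a numeric covariate for the Cox model. The following minimal sketch shows one conventional coding. The function name and the two-letter genotype strings are assumptions made for illustration; the analyses in this study were run in Stata, not with this code.

```python
def encode_genotype(genotype, counted_allele, model):
    """Encode a two-letter genotype (e.g. "AC") as a numeric covariate.

    additive  : number of copies of the counted allele (0, 1 or 2)
    dominant  : 1 if at least one copy is present, else 0
    recessive : 1 only if two copies are present, else 0
    """
    if len(genotype) != 2:
        raise ValueError("genotype must contain exactly two alleles")
    copies = sum(1 for allele in genotype if allele == counted_allele)
    if model == "additive":
        return copies
    if model == "dominant":
        return 1 if copies >= 1 else 0
    if model == "recessive":
        return 1 if copies == 2 else 0
    raise ValueError(f"unknown model: {model}")

# rs431073 (ZEB1), dominant model for the C allele: AC and CC are grouped.
print([encode_genotype(g, "C", "dominant") for g in ("AA", "AC", "CC")])
# rs6953766 (TWIST1), recessive model for the G allele: only GG is coded 1.
print([encode_genotype(g, "G", "recessive") for g in ("TT", "TG", "GG")])
```

Under this coding, the dominant-model comparison AC + CC versus AA and the recessive-model comparison GG versus TT + TG each reduce to a single binary predictor.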
Results
Patient characteristics. The demographic characteristics and clinicopathological features of the 153 patients included in the study are shown in Table I. Age at diagnosis ranged from 29 to 88 years (mean 65.9 years, median 68 years, standard deviation 10.9 years). Median follow-up time was 56.4 months (95%CI=54.2-59.7) according to the reverse Kaplan–Meier method (20). The MST of the 153 included patients was 42 months (95%CI=27.2-67.1). Of the 153 patients, 85 (55.6%) died during the follow-up period. Advanced disease was more prevalent at diagnosis than early-stage disease: 49.7% of patients had tumors with T4 invasion, and 72.5% had lymph node metastases (N1, N2 or N3). In the univariate analysis (Table I), tumor size >5 cm, Lauren's diffuse (or mixed) type, depth of invasion T3 and T4, and lymph node metastases N1, N2 or N3 were significantly associated with poor OS.
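The median survival time and the reverse Kaplan–Meier median follow-up reported above rest on the same product-limit estimator. The sketch below is a plain-Python illustration under simple assumptions (right-censored data, median taken as the first time at which estimated survival is at or below 0.5). The function names and the toy data are hypothetical and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

def median_time(curve):
    """First time at which estimated survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

def reverse_km_median_follow_up(times, events):
    """Median follow-up: Kaplan-Meier with the event indicator flipped."""
    flipped = [1 - e for e in events]
    return median_time(kaplan_meier(times, flipped))

# Toy data in months (illustrative only).
times = [2, 4, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0, 1]
print(median_time(kaplan_meier(times, events)))
print(reverse_km_median_follow_up(times, events))
```

In the reverse method, deaths are treated as censored observations and censoring as the event, so the resulting median describes the potential follow-up time rather than survival.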
SNPs proposed as independent predictors of overall survival. A forward stepwise Cox regression analysis (Table II) was performed with all 20 SNPs under the additive model, together with age, gender, tumor size, histological type, depth of invasion and lymph node metastases as potential prognostic factors. rs2526614 (TWIST1), rs6953766 (TWIST1) and rs431073 (ZEB1), as well as lymph node metastasis stage N1, N2 or N3, were identified as independent predictors of OS in GC patients.
Table I. Association of demographic characteristics and clinicopathological features with overall survival.
Table II. Variables included in the model after a stepwise Cox regression analysis.
Association of rs2526614 (TWIST1), rs6953766 (TWIST1) and rs431073 (ZEB1) with gastric cancer overall survival. The HR for rs2526614 (TWIST1) was significant only under the dominant model (CA+AA) adjusted by clinicopathological variables (Table III). Nevertheless, there was no significant difference between the survival curves of CA+AA and CC subjects (log-rank p-value=0.24, Figure 1). Genotype GG of rs6953766 (TWIST1) was significantly associated with poor OS, both crude (HR=2.02, 95%CI=1.06-3.82) and adjusted by clinicopathological variables (HR=2.14, 95%CI=1.07-4.25) (Table III). In the Kaplan–Meier curves, patients with the GG genotype had a shorter survival time (MST=9.8 months) than those with the TT or TG genotypes (MST=46 months) (Figure 1); nevertheless, the difference did not reach statistical significance (log-rank p-value=0.18). Carriers of the C allele of rs431073 (ZEB1) had a worse prognosis than those with the AA genotype. Cox regression analyses showed that this polymorphism is associated with OS under the dominant model (crude HR=1.62, 95%CI=1.01-2.59, adjusted HR=1.96, 95%CI=1.18-3.25) (Table III). According to the Kaplan–Meier curves, carriers of the rs431073 C allele (CC or CA genotypes) had a lower OS (MST=17.9 months) than patients with the AA genotype (MST=38.5 months) (log-rank p-value=0.04, Figure 1).
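The log-rank p-values quoted above compare the survival experience of two genotype groups. As a rough illustration of how such a comparison is computed, the sketch below implements the standard two-group log-rank statistic with a 1-degree-of-freedom chi-square reference distribution. The function name and the toy data are hypothetical, and this is not the Stata routine used in the study.

```python
import math

def log_rank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p-value) with 1 df."""
    records = [(t, e, 0) for t, e in zip(times_a, events_a)]
    records += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in records if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t in each group.
        n_a = sum(1 for u, _, g in records if u >= t and g == 0)
        n_b = sum(1 for u, _, g in records if u >= t and g == 1)
        # Events occurring exactly at time t in each group.
        d_a = sum(1 for u, e, g in records if u == t and e == 1 and g == 0)
        d_b = sum(1 for u, e, g in records if u == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the number of events in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank variance is zero; test undefined")
    chi2 = (observed_a - expected_a) ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))

# Toy example (illustrative only): group A fails earlier than group B.
chi2, p = log_rank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
print(round(chi2, 3), round(p, 4))
```

Because the log-rank test is unadjusted, it can disagree with an adjusted Cox model, as seen for rs2526614 and rs6953766 above.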
Figure 1. Kaplan–Meier curves of overall survival for (A) rs2526614, (B) rs6953766, (C) rs431073, in gastric cancer patients.
Table III. Cox regression analyses for association of rs2526614 (TWIST1), rs6953766 (TWIST1) and rs431073 (ZEB1) with gastric cancer overall survival.
Hazard Ratio estimates according to clinicopathological features. Finally, the association of the selected polymorphisms with OS was assessed separately depending on the clinicopathological features of patients. For the three SNPs, HR was higher among patients with lymph node metastases (N1, N2 or N3, n=111) compared to HR for all GC patients (rs2526614 Hausman test p=0.0026, rs6953766 Hausman test p<0.0001, rs431073 Hausman test p=0.0391) (Table IV). There were also differences in HR when comparing all GC patients versus patients with diffuse-type (rs6953766) or with tumor size >5 cm (rs431073).
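For a single coefficient, the Hausman comparison reduces to a simple ratio. The sketch below shows that scalar form under the usual assumption that one estimate is efficient under the null hypothesis and the other is merely consistent, so the variance of their difference is the difference of their variances. The function name and the input values are hypothetical and are not results from this study; the tests reported above were computed in Stata and may have been run on full coefficient vectors.

```python
import math

def hausman_scalar(beta_consistent, var_consistent, beta_efficient, var_efficient):
    """Scalar Hausman statistic and its 1-df chi-square p-value.

    H = (b_c - b_e)^2 / (Var(b_c) - Var(b_e))
    """
    diff_var = var_consistent - var_efficient
    if diff_var <= 0:
        # The asymptotic assumptions of the test are not met in this sample.
        raise ValueError("variance difference must be positive")
    h = (beta_consistent - beta_efficient) ** 2 / diff_var
    return h, math.erfc(math.sqrt(h / 2.0))

# Hypothetical log-hazard ratios and variances (illustrative values only).
h, p = hausman_scalar(1.0, 0.25, 0.5, 0.09)
print(round(h, 4), round(p, 4))
```

The inputs are Cox beta coefficients, that is, natural logarithms of the HRs, together with their squared standard errors.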
Discussion
The EMT program has been linked to metastatic dissemination. Canonical effectors of this program include E-cadherin, Snail, Zeb, and Twist. Levels of these EMT markers have been associated with cancer survival. Nevertheless, little is known about the role of SNPs in genes encoding EMT markers in cancer prognosis, particularly in GC survival. In the present investigation, two SNPs in TWIST1 (rs2526614 and rs6953766), and rs431073 in ZEB1 were associated with OS in GC patients.
TWIST1 is located on chromosome 7p21.1 and encodes Twist Basic Helix-Loop-Helix Transcription Factor 1. This protein has been reported to be associated with migration and invasion of GC cells (21). It is frequently expressed in GC tumor cells and also in stromal cancer-associated fibroblasts, but its expression was absent in gastric dysplasia, metaplasia, gastritis and normal gastric mucosa (22, 23). In addition, tumors expressing TWIST1 are more frequent among patients with lymph node metastases than among N0 patients (22, 23). This is in agreement with our observation that the HRs of both rs2526614 and rs6953766 were higher among lymph node-positive patients (Table IV). TWIST1 expression in tumor cells has been associated with poor OS (22) and progression-free survival according to data from The Cancer Genome Atlas (24). Interestingly, the expression is more frequent in diffuse-type tumors than in intestinal-type tumors (23), and TWIST1 levels in stromal fibroblasts were associated with poor prognosis in the diffuse type but not in the intestinal type (22). This could explain why the HR for rs6953766 was higher among patients with diffuse GC (HR=2.82, 95%CI=1.13-7.03, p=0.026) than among those with intestinal GC (HR=1.17, 95%CI=0.40-3.40, p=0.772) (Table IV).
Table IV. Hazard Ratios of rs2526614 (TWIST1), rs6953766 (TWIST1) and rs431073 (ZEB1) according to patient's characteristics.
The associated polymorphisms in TWIST1 were rs2526614 and rs6953766. rs2526614 is located in the last intron and rs6953766 in the second intron. A search of PubMed and the NHGRI-EBI GWAS Catalog showed that neither of them, nor their proxy SNPs (r2>0.8 in the 1000 Genomes AMR population: rs2717327, rs55958613, rs2390045 and rs2390046), has been associated with any phenotype. TWIST1 rs2526614 is 300 bp from its proxy variant rs2717327 (r2=1 in the AMR population) and is not an expression quantitative trait locus (eQTL) according to the Genotype-Tissue Expression (GTEx) project (www.gtexportal.org). Both variants are located in an enhancer (state 7 of the Core 15 ChromHMM states) and in DNAse hypersensitive sites, as reported by HaploReg v4.1 (25) using data from the Roadmap Epigenomics Consortium. Non-coding associated variants from GWAS studies are enriched in enhancers and DNAse hypersensitive sites (26, 27); therefore, a SNP located in these sites is more likely to be functional. In addition, rs2526614 is located in a binding site of the P300 protein identified by ChIP-seq (ENCyclopedia of DNA Elements, ENCODE, data retrieved from RegulomeDB (28)). Taken together, the evidence above supports, to some extent, a functional effect of rs2526614 on TWIST1 expression.
Concerning TWIST1 rs6953766, this variant is in LD with rs55958613, rs2390045 and rs2390046 in a region that spans 2.5 kb in intron 2 of TWIST1. The analysis with HaploReg v4.1 indicates that this region contains enhancer histone marks and DNAse hypersensitive sites. SNP2TFBS (29) was used to assess whether the associated variants and their proxy SNPs affect transcription factor (TF) binding sites in the human genome. This tool revealed that rs6953766 creates a binding site for the Spi-B transcription factor, an Ets family TF expressed exclusively in mature B cells, T-cell progenitors and plasmacytoid dendritic cells (30). Nevertheless, Jian et al. (31) reported that Spi-B is one of the twenty TFs overexpressed in GC tumors compared to normal tissues. Recently, Du et al. (30) published the results of a study of this TF in NSCLC tumors. They found that high Spi-B expression is associated with a worse prognosis. Interestingly, down-regulation of Spi-B in an NSCLC cell line that expresses endogenous Spi-B and has a mesenchymal phenotype resulted in up-regulation of E-cadherin and down-regulation of EMT TFs, including Twist1. The authors concluded that Spi-B might be essential in maintaining the mesenchymal phenotype of lung cancer cells. Therefore, a possible explanation for the association of the rs6953766 G allele with poor OS in GC patients is that this allele could create a binding site for Spi-B, a TF overexpressed in GC cells, and thereby contribute to the expression of TWIST1 in GC cells. Further studies experimentally evaluating this mechanism are warranted to support its association with OS.
ZEB1 encodes zinc-finger E-box-binding homeobox factor 1 (ZEB1), a member of the zinc finger family of proteins. Down-regulation of this gene in MKN1 gastric cancer cells increased E-cadherin expression and reduced their capacity to proliferate, migrate, and invade (15). Jia et al. (32) found that this marker was more frequently expressed in GC tumors than in the adjacent normal gastric mucosa, in particular in poorly differentiated tumors and in patients with lymph node metastases. However, no differences were found according to tumor size (15, 32). These results are in agreement with our findings, which show an association of rs431073 with poor prognosis, markedly among lymph node-positive patients. On the other hand, they do not explain our observation that the HR of this polymorphism is higher in patients with tumors >5 cm (HR=2.53, 95%CI=1.46-4.39) than in patients with tumors <5 cm (HR=0.84, 95%CI=0.32-2.19). rs431073 is located in intron 1 of ZEB1 and has no proxy SNPs. This polymorphism has not been associated with any phenotype. HaploReg v4.1 revealed that rs431073 is located in a region that contains neither enhancer histone marks nor DNAse hypersensitive sites. This polymorphism lies in a 115 bp site identified by ChIP-seq to bind BRG1 in CD36 cells. BRG1 (encoded by SMARCA4) is a member of the SWI/SNF family of proteins involved in transcriptional activation or repression of selected genes by chromatin remodeling, and there is growing evidence about its role in cancer (33). To date, no studies have been published evaluating a possible role of BRG1 in the expression of ZEB1. Further studies are needed to assess whether rs431073 is a functional variant.
In conclusion, our study found three polymorphisms associated with OS in GC patients: TWIST1 rs2526614, TWIST1 rs6953766, and ZEB1 rs431073. The association was more noticeable when patients were stratified by certain clinicopathological features. Information from published studies and from databases allowed us to propose that these SNPs may be functional variants. To the best of our knowledge, this is the first study proposing a role for these SNPs in GC prognosis. The use of TWIST1 rs2526614, TWIST1 rs6953766, and ZEB1 rs431073 as prognostic markers of GC survival warrants further investigation.
Acknowledgements
The Authors would like to acknowledge Benjamin García-Bloj, MD, PhD for his help in proofreading the manuscript. This work was supported by Fondo Nacional de Desarrollo Científico y Tecnológico -Chile- (FONDECYT) #1151015.
Footnotes
This article is freely accessible online.
- Received April 24, 2018.
- Revision received May 22, 2018.
- Accepted May 23, 2018.
- Copyright© 2018, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved