Abstract
Aim: To test the association between common TP53 haplotypes and colorectal cancer (CRC) development. Patients and Methods: A total of 277 CRC patients and 167 healthy volunteers were included in the study. Common TP53 haplotypes were estimated from eight single-nucleotide polymorphisms (SNPs) (rs1614984, rs77697176, rs12947788, rs1800372, rs2909430, rs1042522, rs17878362 and rs11652704). Stepwise haplotype trend regression identified the haplotype-regressor cccgaRDa as a possible predictive marker. Results: The rare haplotype cccgaRDa was identified in 10 CRC cases and 3 controls. Although it was approximately twice as common in CRC cases (odds ratio (OR)=2.068; 95% confidence interval (CI)=0.471-9.069), the difference was not statistically significant and the cccgaRDa haplotype frequency was low in the studied groups. The results of our study suggest that common TP53 variability is relatively low (only 3 haplotypes occurred at a frequency above 10%). Conclusion: The haplotype background of the TP53 gene is relatively stable and, despite its low frequency, the haplotype-regressor cccgaRDa appears to be a possible predictive parameter for CRC development.
Cancer is a leading cause of death worldwide. Oncology research has focused on the role of single-nucleotide polymorphisms (SNPs; germline variants) in various genes involved in carcinogenesis, such as DNA repair genes, genes of bio-transforming enzymes and tumour suppressors (1-7). The transcription factor p53 is recognized as one of the most prominent tumour suppressors; it stimulates downstream pathways leading to protective cellular processes, including cell cycle arrest and apoptosis (8-11). TP53 is frequently mutated in sporadic cancers. Over 200 SNPs (germline variants) in TP53 have been identified; in contrast to tumour-associated mutations, most of these TP53 SNPs are unlikely to have biological effects (12). The majority of polymorphisms are located in introns, outside splice sites, or in non-coding exons (13). Because the effects of polymorphisms can be subtle and vary according to genetic background, assessing their effect on cancer risk poses considerable methodological challenges (12). The analysis of haplotypes represents a much more powerful approach than standard genotyping methods. Assembling individual alleles into haplotypes can provide additional information about sequence evolution. The TP53 gene is highly polymorphic, but its haplotype structure has not yet been described in detail (14).
In the Slovak population, several association studies of individual variants have been conducted to elucidate the etiology of colorectal cancer (CRC) (15, 16). Here, we focus on haplotype variability in the TP53 gene region and on the potential of haplotypes as CRC risk predictors.
Patients and Methods
Study subjects. The Ethical Committee of the Jessenius Faculty of Medicine, Comenius University, Martin, Slovak Republic (Number EK950/2011), approved the study and informed consent was signed by all participants.
The samples were obtained from unselected consecutive patients at the Surgery Clinic in Martin University Hospital, Slovak Republic. All patients had undergone curative surgical resection for primary CRC. Tumours originated in all parts of the colon. A total of 277 patients were included in the study. The control group comprised 167 healthy unrelated volunteers. All subjects were Caucasians of European origin. Healthy donors had no personal or family history of CRC.
DNA analysis. Genomic DNA was isolated from peripheral blood leukocytes using the SiMax™ Genomic DNA Extraction kit (SBS Genetech Co., Ltd., Beijing, China), according to the manufacturer's instructions. DNA concentration was adjusted to 30 μg/ml and samples were stored at −20°C for future analysis. All genotyping analyses, except one, were performed using high-resolution melting analysis (HRMA) on a LightCycler® 480 II (F. Hoffmann-La Roche Ltd, Basel, Switzerland), according to the manufacturer's instructions. One insertion-deletion (indel) polymorphism was detected by electrophoresis on 1.5% agarose gel. All required primer sequences (Table I) were designed using the open-source software Primer3Plus (http://primer3plus.com/cgi-bin/dev/primer3plus.cgi) (17). Positive and negative controls were included in all reactions. As a quality control, about 10% of all samples were analysed as blinded duplicates. Fourteen ambiguous genotypes of rs2909430, first identified by HRMA, were sequenced at an external laboratory using the Sanger sequencing method (Macrogen Europe, Amsterdam, The Netherlands), ordered through a local provider of DNA sequencing services.
TP53 gene analysis. Six tagSNPs capturing the major variability of the TP53 gene and the four most frequently studied polymorphisms of the TP53 gene were selected for this association study.
An introductory analysis was performed to select tag SNPs in the TP53 gene region, using data from the 1000 Genomes Project (18) and the tagging algorithm implemented in Haploview 4.2 (19). The parameters for the 1000 Genomes Browser were pilot_1_CEU_low_coverage_panel for the population and 17:7565097-7590856 for the targeted genomic region. The resulting VCF file with 62 SNPs was filtered using the criterion of minor allele frequency (MAF) >0.1, leaving 24 SNPs for tagging analysis. In Haploview 4.2, pairwise tagging with an r² threshold of 0.8 was chosen for extracting tag SNPs in the TP53 gene region. The analysis yielded six tagSNPs (rs11652704, rs2909430, rs1042522, rs12947788, rs1614984, rs77697176) that captured the 24 SNPs with MAF >0.1. Another four genetic variants were selected from the literature, rs1800371, rs1800372, rs17878362 and rs35163653, which were previously reported to be associated with cancer (20-23). The basic characteristics of the analyzed SNPs, including nucleotide sequences, are given in Table I, with further details described elsewhere (24, 25).
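The selection logic described above (MAF filter followed by pairwise tagging at r² ≥ 0.8) can be illustrated with a minimal Python sketch. It is not Haploview's implementation: it assumes a simple greedy strategy, and the SNP names, MAF values and r² values are hypothetical toy data rather than values from the 1000 Genomes panel.

```python
def filter_by_maf(maf, min_maf=0.1):
    """Keep SNPs whose minor allele frequency exceeds min_maf (MAF > 0.1)."""
    return [snp for snp, value in maf.items() if value > min_maf]


def greedy_pairwise_tags(snps, r2, threshold=0.8):
    """Greedy pairwise tagging (assumed strategy): pick tags until every SNP
    is either a tag itself or has r2 >= threshold with at least one tag."""
    def captured_by(tag):
        return {s for s in snps
                if s == tag or r2.get(frozenset((s, tag)), 0.0) >= threshold}

    uncaptured = set(snps)
    tags = []
    while uncaptured:
        # Choose the SNP that captures the most still-uncaptured SNPs.
        best = max(snps, key=lambda s: len(captured_by(s) & uncaptured))
        tags.append(best)
        uncaptured -= captured_by(best)
    return tags


# Hypothetical toy data; pairs missing from r2 are treated as r2 = 0.
maf = {"snpA": 0.32, "snpB": 0.30, "snpC": 0.28, "snpD": 0.15, "snpE": 0.04}
r2 = {
    frozenset(("snpA", "snpB")): 0.95,
    frozenset(("snpA", "snpC")): 0.85,
    frozenset(("snpB", "snpC")): 0.70,
    frozenset(("snpA", "snpD")): 0.10,
}

common = filter_by_maf(maf)               # snpE is dropped (MAF <= 0.1)
tags = greedy_pairwise_tags(common, r2)   # snpA tags snpB and snpC; snpD tags itself
print(common)
print(tags)
```

In this toy example two tags capture four common SNPs, in the same way that six tagSNPs captured the 24 common SNPs in the TP53 region.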
Statistical analysis. SPSS 16 was used for descriptive statistics and hypothesis testing (Released 2007; SPSS for Windows, Version 16.0; SPSS Inc., Chicago, IL, USA). The non-parametric Mann-Whitney test was used to test differences in age between the control and patient groups. Pearson's χ² test for contingency tables was used to test the distribution of sex between the control and patient groups.
Single-marker and haplotype analyses were performed using SNP & Variation Suite v8.3 (Golden Helix, Inc., Bozeman, MT, USA; www.goldenhelix.com). Fisher's exact test was used to assess the significance of deviation from Hardy-Weinberg equilibrium and to perform the basic allelic association tests. Pearson's χ² test for contingency tables was used to examine haplotype associations. Haplotype frequencies were estimated using the expectation-maximization (EM) algorithm. Adjusted association tests were performed by logistic regression with case/control status as the dependent variable and age and sex as covariates in all three genetic models. For predictive modelling, stepwise logistic regression was used. p-Values less than 0.05 were considered statistically significant. Odds ratios (ORs) with 95% confidence intervals (CIs) were used to assess genetic effect. The Clustal Omega service on the EMBL-EBI website (26) was used for multiple sequence alignment of the sequencing data and for alignment of particular sequences to the reference genome.
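To illustrate how an EM algorithm estimates haplotype frequencies from unphased genotypes, the following Python sketch may be helpful. It is a textbook-style simplification, not the SVS implementation: genotypes are assumed to be coded as minor-allele counts (0, 1 or 2) per biallelic site, all phase resolutions are enumerated explicitly, a fixed number of iterations is used, and the toy genotypes are hypothetical.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """All unordered haplotype pairs consistent with an unphased genotype."""
    het_sites = [i for i, g in enumerate(genotype) if g == 1]
    base = [1 if g == 2 else 0 for g in genotype]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        h1, h2 = base[:], base[:]
        for site, bit in zip(het_sites, bits):
            h1[site], h2[site] = bit, 1 - bit
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=100):
    """Estimate haplotype frequencies by expectation-maximization."""
    expansions = [compatible_pairs(g) for g in genotypes]
    haplotypes = sorted({h for pairs in expansions for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in expansions:
            # E-step: posterior weight of each phase resolution.
            weights = [freq[h1] * freq[h2] for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by number of chromosomes.
        n_chromosomes = 2 * len(genotypes)
        freq = {h: counts[h] / n_chromosomes for h in haplotypes}
    return freq


# Hypothetical toy sample with two SNPs: only the double heterozygotes
# (1, 1) have ambiguous phase.
toy_genotypes = [(0, 0)] * 3 + [(2, 2)] + [(1, 1)] * 2
toy_freq = em_haplotype_frequencies(toy_genotypes)
for hap, f in toy_freq.items():
    print(hap, round(f, 4))
```

In the toy sample the two double heterozygotes are resolved towards the haplotypes already seen in unambiguous individuals, which is the behaviour that makes rare haplotype estimates sensitive to sample size.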
Results
In this study, we genotyped a total of 277 cases with an average age at onset of CRC of 65.3±11.3 years; 164 were males (59.2%) and 113 females (40.8%). The control group consisted of 167 healthy individuals with an average age of 50.1±14.5 years; 76 were males (45.5%) and 91 females (54.5%). There were statistically significant differences in the distribution of gender and age between the study groups. Therefore, all association results were adjusted for gender and age.
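The difference in sex distribution can be checked from the counts given above with Pearson's χ² test for a 2×2 table. The following Python sketch assumes the uncorrected statistic (no continuity correction); the text does not state which variant of the SPSS output was reported.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]]; returns the statistic and its p-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: cases, controls; columns: males, females (counts from the text).
chi2, p_value = pearson_chi2_2x2(164, 113, 76, 91)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

For these counts the statistic is roughly 7.87, above the 3.84 critical value for one degree of freedom, which is consistent with the reported significant difference and with the decision to adjust for sex.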
Allele analysis. Allelic distributions were in accordance with Hardy-Weinberg equilibrium for all tested SNPs. Results of the allele frequency analysis for 8 of the 10 targeted genetic variants are reported in Table II. For two SNPs selected from the literature, no variability was observed in our samples. No new mutations were detected and all samples shared the previously described alleles: allele G (Val) for rs35163653 and allele C (Pro) for rs1800371. For two tagSNPs, rs77697176 and rs12947788, the minor allele frequency was below 10% (3.6% and 8.7%, respectively), which was unexpected given the introductory analysis. The result of HRM analysis of rs2909430 was ambiguous in 14 samples. In these cases, rs2909430 genotypes were determined by analysis of sequencing data, which also revealed the presence of another variant, rs113530090. The mutation c.376-86 T>C (rs113530090), located 5 nucleotides from rs2909430 in the 5’ direction, occurred in the heterozygous state in 7 CRC cases and 5 controls.
The allele frequencies of the 8 SNPs did not differ between CRC cases and controls, and no significant allelic associations were observed (Table II). The higher, but non-significant, odds ratio for rs1800372 (OR=2.028; 95% CI=0.554-7.423) could be influenced by the increased frequency of heterozygotes in CRC cases; no minor-allele homozygotes were found in either CRC cases or controls.
Genotype analysis. The genotype distributions in CRC cases and controls, with results of association tests for the standard three genotype groups and for three genetic models, are displayed in Table III. The occurrence of the minor genotype G/G for rs11652704, located in the first intron in the 5’ regulatory region of the TP53 gene, was increased in the control group. Association tests for this polymorphism were significant for 3×2 contingency tables (p=0.037) and for the recessive genetic model (p=0.013). Because of the unequal representation of gender and age in the two tested groups, multivariate logistic regressions adjusted for sex and age were performed for all three genetic models, followed by adjusted forward stepwise logistic regressions to find the best predictive association model. All results are presented in Table IV; with one exception, no significant genotype associations with CRC were found.
For the recessive genetic model, adjusted association analysis confirmed the result of Fisher's exact test: a significant association of the minor genotype G/G with the disease. The G/G genotype of rs11652704 might have a protective function against the development of CRC. Stepwise regression for the recessive genetic model identified only rs11652704 as a predictive marker for CRC.
Haplotype analysis. To determine the overall major variability of the TP53 gene, common haplotypes were estimated from all 8 SNPs. The frequencies of individual haplotypes did not differ between CRC cases and controls, and no significant haplotype associations were observed (Table V). Stepwise haplotype trend regression assigned only one haplotype-regressor to the best predictive model (cccgaRDa), with no significant result. This haplotype was identified in 10 CRC cases and 3 controls. Its roughly two-fold, statistically non-significant, higher occurrence in CRC cases (OR=2.068; 95% CI=0.471-9.069) is called into question by the low frequency of the cccgaRDa haplotype.
Discussion
Colorectal cancer is one of the leading cancers in Europe, with a much-discussed etiology and molecular pathogenesis. Inactivation of the TP53 tumour suppressor gene is, in general, a frequent event in carcinogenesis. Several polymorphisms have been identified in both non-coding and coding regions of TP53 (4). Most studies have focused on individual TP53 gene variations, with associations of selected SNPs analyzed independently. There are several studies on association analysis of differently defined haplotypes of the TP53 gene in various diseases and populations (14, 27-30), as well as in CRC in populations of similar origin (31, 32).
The aim of the present study was to determine the common haplotype variability of the TP53 gene in the healthy Slovak population and to compare the haplotype distribution with that in CRC patients.
Of the 10 genotyped SNPs selected for this study, only 8 had variant alleles. Two previously described mutations in the coding region of TP53 (13, 33, 34), rs1800371 (p.Pro47Ser) and rs35163653 (p.Val217Met), were not detected in the studied cohorts. Among the 8 remaining SNPs, there were 5 tagSNPs and one short tandem repeat (STR) variant in the non-coding region of the TP53 gene, and one synonymous mutation (rs1800372) and one missense polymorphism (rs1042522, tagSNP) in the coding region. Linkage disequilibrium (LD) analysis (Figure 1) showed overall independence of the 8 genotyped SNPs, except for the correlation between rs2909430 and rs17878362 (r²=0.811). All SNPs were included in the construction of common haplotypes.
Our study showed no significant association between the variant alleles of any of the SNPs and overall CRC risk. In the case of rs1800372, we observed a higher, but non-significant, risk for the minor allele G (OR=2.028; 95% CI=0.554-7.423). This could be influenced by the increased frequency of heterozygotes in CRC cases (10 patients, 3.6%, vs. 3 healthy individuals, 1.8%), but no minor-allele homozygotes were found in either CRC cases or controls.
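The reported allelic odds ratio for rs1800372 can be reproduced from the carrier counts above. The following Python sketch assumes an unadjusted allelic 2×2 table with a Wald (log-scale) 95% confidence interval; with 10 and 3 heterozygotes and no minor homozygotes, the minor allele counts are 10 of 554 chromosomes in cases and 3 of 334 in controls.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio with Wald 95% CI for the table [[a, b], [c, d]], where
    a, b = minor/major allele counts in cases and c, d = in controls."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


n_cases, n_controls = 277, 167
het_cases, het_controls = 10, 3          # carriers of allele G, all heterozygous

minor_cases = het_cases                   # one G allele per heterozygote
major_cases = 2 * n_cases - minor_cases
minor_controls = het_controls
major_controls = 2 * n_controls - minor_controls

odds_ratio, ci_low, ci_high = odds_ratio_wald_ci(
    minor_cases, major_cases, minor_controls, major_controls
)
print(f"OR = {odds_ratio:.3f}; 95% CI = {ci_low:.3f}-{ci_high:.3f}")
```

Under these assumptions the sketch gives OR=2.028 with 95% CI=0.554-7.423, matching the values in the text; the wide interval spanning 1 reflects the small number of carriers.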
In the overall genotype association tests, we observed a significant association only for rs11652704, located in the first intron in the 5’ regulatory region of TP53. This is a potential location for the binding of regulatory elements, and rs11652704 could thus affect regulatory functions. For the recessive genetic model (p=0.013), the minor genotype G/G was nine times more frequent in controls than in CRC cases (3.6% vs. 0.4%). Genotype frequencies of rs11652704 in healthy Slovak individuals are comparable with HapMap frequencies (35) in the Caucasian European (CEU) reference population: 3.6%, 16.8% and 79.6% vs. 3.5%, 18.6% and 77.9%, respectively. The low incidence of the rare genotype in CRC cases is apparent; however, the absolute count of carriers has to be taken into account as well. A protective function can be considered only for this rare genotype. It would thus be useful to undertake functional analysis of the allelic variants of rs11652704. This significant result was confirmed by logistic regression adjusted for sex and age. Stepwise regression for the recessive model identified only rs11652704 as a significant predictive marker for CRC protection.
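The recessive-model comparison can be sketched with a two-sided Fisher's exact test in Python. The genotype counts used here are an assumption: they are inferred from the rounded percentages in the text (3.6% of 167 controls is about 6 G/G carriers; 0.4% of 277 cases is about 1), and the exact counts are given only in Table III.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]]:
    sums the probabilities of all tables with the same margins that are
    no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(x_min, x_max + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Assumed counts (inferred from percentages): G/G vs. other genotypes.
gg_cases, other_cases = 1, 276
gg_controls, other_controls = 6, 161

p_recessive = fisher_exact_two_sided(gg_cases, other_cases,
                                     gg_controls, other_controls)
or_recessive = (gg_cases * other_controls) / (other_cases * gg_controls)
print(f"p = {p_recessive:.3f}, OR = {or_recessive:.3f}")
```

With these assumed counts the p-value is close to the reported p=0.013, and the odds ratio below 1 points in the protective direction. The result rests on only seven G/G carriers in total, which is why the absolute count of carriers matters for interpretation.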
Eleven common haplotypes were estimated from the 8 SNPs in the studied cohorts using the expectation-maximization (EM) algorithm implemented in SVS 8. Just over 70% of all haplotypes were represented by the three most common ones: cccaaRDa at about 40%, tccaaRDa at about 20% and cccagPIa at about 12%. One further haplotype occurred at about 5% and all remaining haplotypes at less than 5%. None of the haplotypes was significantly associated with overall risk of CRC. It was not possible to identify a haplotype with significant predictive value for CRC, even by using stepwise haplotype trend regression adjusted for gender and age. Notably, the least frequent haplotype estimated in our study, cccgaRDa, was fully correlated with the rare exonic allele G of rs1800372; that is, in our sample, all carriers of allele G (10 CRC cases and 3 controls) had the same combination of alleles at the remaining 7 SNPs of the haplotype. In addition, this combination is the same as that of the most common haplotype, cccaaRDa. This suggests that the mutation event occurred on the most common haplotype background of TP53. The cccgaRDa haplotype occurred about twice as frequently in CRC cases as in controls. To clarify the role of rs1800372 and of cccgaRDa in the development of CRC in the Slovak population, it will be necessary to increase the sample size in order to capture the subtle effects of these markers, and also to use genetic linkage studies to obtain more information about the inheritance of the suspected alleles and their connection to family history of CRC.
The results of our study suggest that the common variability of TP53 is relatively low; in the Slovak population, only 3 haplotypes occurred at a frequency above 10%. This may be explained by a need for sequence stability, given the multiple cell-cycle roles of p53 and its crucial part in stress responses that preserve genomic stability. This germline sequence stability is also maintained in patients with CRC in our population. The relatively stable haplotype background of the TP53 gene is disrupted by low-frequency or rare variants, whose role in the development of CRC can be addressed in future epidemiological studies with larger sample sizes.
Acknowledgements
This publication is the result of the project implementation: “Competence Center for Research and Development in the Field of Diagnostics and Therapy of Oncological Diseases” (ITMS: 26220220153) and “Center of Translational Medicine” (ITMS 26220220021) supported by the Operational Programme Research and Innovation funded by the ERDF; and APVV-15-0217.
Footnotes
Conflicts of Interest
The Authors confirm that there are no conflicts of interest in regard to this study.
- Received February 1, 2017.
- Revision received March 3, 2017.
- Accepted March 6, 2017.
- Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved