Abstract
Background/Aim: Kallikrein-related peptidases (KLKs) comprise a serine protease family with prominent roles in tissue physiology and disease pathogenesis, including cancer. Previously, we have characterized canine Klk4-10 and -14. Herein, we continue our efforts by characterizing three novel members of the canine family, i.e. Klk11-13, and investigating their expression in mammary cancer. Materials and Methods: Reverse transcription-polymerase chain reaction (RT-PCR) and DNA sequencing were used to investigate the expression and to determine the nucleotide sequence of all transcripts identified, respectively. Results: It was demonstrated that (i) unlike other Klks, (CANFA)Klk12 probably possesses a non-AUG translation initiation codon, (ii) all three Klks undergo alternative splicing, with concurrent elimination of exons 2 and 3 being the most prominent event, and (iii) all transcripts identified were detected in both tumor and normal tissues, yet with different frequencies. Conclusion: With this work completed, Klk15 is the only gene that remains to be experimentally characterized in order to resolve the entire canine Klk family. Our data lay sufficient groundwork for validation studies and await further incorporation into genetic/evolutionary studies with translational impact.
The human kallikrein-related peptidases (KLKs) comprise a family of 15 serine proteases, encoded by a contiguous KLK gene cluster (1-3). The genes encode single-chain pre-proenzymes, which carry a signal peptide, a short propeptide and the catalytic domain. Proteolytic cleavage of these peptides eventually allows for secretion and generation of the active enzymes (4).
Earlier experimental and in silico investigations have successfully described Klk gene families in the mouse, rat, pig, chimpanzee, dog and opossum (5-7). However, the most detailed insights into the structural characterization of this family in various animal species have arisen from studies addressing their evolutionary perspective (8-10). It is now presumed that the kallikrein locus is unique to all mammals and that the majority of tissue KLKs are highly conserved among species. Quite expectedly, certain structural properties within the KLK gene sequences are particularly conserved. For instance, all KLKs have 5 coding exons and 4 intervening introns with identical patterns of intron phases. In addition, their catalytic domain comprises three residues, namely His, Asp and Ser, whose codons are strictly positioned within the second, third and fifth exon, respectively (5).
The expression pattern of most human KLKs extends to the majority of cell types and tissues, where they cooperate in proteolytic cascade pathways to regulate physiological processes (11-13), of which skin desquamation and homeostasis, dental enamel formation and regulation, as well as seminal plasma liquefaction, constitute a few notable examples. The complex enzymatic circuitries elicited by KLKs in various tissues are also seen as mediators of cellular responses during disease onset and progression, including inflammation and cancer (4, 11, 14). Furthermore, the levels of multiple KLKs are altered at both the gene and protein expression levels in various malignancies, an observation that also provides a rationale for assessing these molecules as putative biomarkers (15).
Following earlier studies on the characterization of canine Klk1 (16) and Klk2 (17, 18) mRNA sequences, our group experimentally characterized many of the remaining members of the family (i.e. Klk4-10 and -14) for the first time using normal and neoplastic canine mammary tissues (19-21). Our published data have collectively pointed out certain discrepancies between in silico predicted and experimentally verified mRNA sequences, as in the case of Klk4, -9, -10 and -14, while also revealing alternative splice variants for the Klk8, -9 and -14 genes (19-21). Additionally, certain Klks and/or their variants demonstrated differential expression between normal and neoplastic tissues (19-21), which, on occasion, proved to be consistent with relevant human breast cancer studies (22). Our findings, therefore, support the notion that the dog could be a useful animal model for in vivo studies of Klk expression and function in breast cancer. To complement and finalize our efforts on the structural delineation of the entire canine Klk transcriptome, here we sought to investigate the expression of three additional canine Klks, namely Klk11, -12 and -13, in canine mammary cancer.
Materials and Methods
Tissue samples. Tissue specimens were obtained from surgically removed mammary gland tumors from 30 pet dog cancer patients that were admitted to the Companion Animal Clinic, Department of Veterinary Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki. For 20 of them, tissue samples were taken during surgery from the center of the tumor, whereas for the other 10, tissue samples were taken from both the tumor and a site 2-5 cm from the visible tumor margin (normal tissue adjacent to the tumor). All specimens were immediately immersed in RNAlater solution (TAKARA, Shiga, Japan) and stored at -80°C until further processing. Histological analysis was performed to verify that the tissues were either malignant or normal.
RNA extraction and cDNA synthesis. Total RNA was extracted from tissue samples using the NucleoSpin Total RNA Isolation kit (Macherey-Nagel, Duren, Germany) and reverse transcription was carried out using the PrimeScript 1st strand cDNA synthesis kit (TAKARA), according to the manufacturers' instructions. One μg of total RNA was used as starting material for cDNA synthesis.
Polymerase chain reaction (PCR) amplification. The primers used for PCR amplification, along with the sizes of the amplicons produced for each primer pair, are presented in Table I. Primers were designed based on: (i) in silico predicted sequences available in GenBank (accession numbers: Klk11, XM_005616254; Klk12, XM_849479; Klk13, XM_003638808) and (ii) alignments of the corresponding human KLK mRNAs (GenBank accession numbers: KLK11, NM_006853; KLK12, NM_145894; KLK13, NM_015596) with Canis familiaris “whole genome shotgun sequences” (GenBank accession numbers: AAEX03000771.1 for KLK11 and KLK13; AOCS01189798.1 for KLK12).
Touchdown PCR protocols were adopted to achieve high specificity, as previously described (19-21). RNA integrity was verified through PCR amplification of canine β-actin (Actb) housekeeping gene, as previously described (19-21).
DNA sequencing, GenBank accession numbers and in-silico analysis. PCR products were purified using the NucleoSpin Extract II kit (Macherey-Nagel) and sent to a commercial sequencing facility (VBC-Biotech Service GmbH, Vienna, Austria) for DNA sequencing. The nucleotide sequences obtained, either corresponding to classical forms or splicing variants of Klk11, Klk12 and Klk13, were then submitted to GenBank and assigned accession numbers KJ831037 through KJ831044. Sequence homology searching/analyses and signal peptide predictions were performed as previously described (19-21).
Results
Structural characterization of Klk12. Since the Klk12 nucleotide sequence had never been experimentally determined, the primers required for investigating its expression by RT-PCR were designed based on an in silico predicted sequence available in GenBank (accession No. XM_849479). Primers KLK12-Fa/KLK12-Ra (Table I) align at the very ends of the coding region (795 bp), as allocated within the predicted sequence, and yielded a PCR product which, upon DNA sequencing, revealed 99.75% homology with the predicted sequence (2/795 nucleotides difference). In order to determine the structural characteristics of this gene (i.e. exon/intron boundaries, exon/intron sizes, catalytic triad codons etc.), the obtained sequence was aligned with both human KLK12 (GenBank accession No. NM_145897) and the canine genome. The alignments, however, revealed the following discrepancies from the well-known KLK features: (i) Klk12 has 6 instead of 5 coding exons (Figure 1, Klk12-model 1); (ii) the first 48 nucleotides of the coding region have no homology to the human KLK12 coding region (Figure 2A); and (iii) only the last four exons exhibit similarity in terms of size and sequence with human KLK12 (Figure 1, Klk12-model 1 and Figure 2A). For that reason, we kept the numbering of these exons the same as in human KLK12, i.e. 2, 3, 4 and 5, and designated the first two exons as 1’ and 1.
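The homology percentages quoted for the amplified products follow directly from the number of mismatched nucleotides over the aligned length. The following is a minimal Python sketch of that arithmetic, not the software actually used in this study; the function names are illustrative assumptions, and the sequences are assumed to be pre-aligned and of equal length.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Assumption: both sequences have the same length (already aligned).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def identity_from_mismatches(mismatches: int, length: int) -> float:
    """Percent identity given a mismatch count and the aligned length."""
    if length <= 0 or not 0 <= mismatches <= length:
        raise ValueError("invalid mismatch count or length")
    return 100.0 * (length - mismatches) / length


# Figures reported in the text: 2 mismatches over the 795 bp coding region.
print(round(identity_from_mismatches(2, 795), 2))
```

With 2 differences over 795 nucleotides, the rounded value matches the 99.75% reported above.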
We hypothesized that the additional coding exon identified (i.e. exon 1 or 1’) might have been the consequence of an incorrectly predicted translation initiation codon within the in silico identified canine Klk12 (XM_849479). In this case, if the real start codon resides downstream of the one predicted, the additional coding exon might simply constitute an untranslated exon. On the other hand, if the start codon is located upstream of the predicted one, then the additional coding exon might be the result of an insertion, possibly due to partial intron retention, as previously found in other canine Klks (19-21). Since no start codon was identified further downstream of the predicted one and around the area of high homology to the human KLK12 translation initiation site (Figure 2A), we searched for alternative start codons upstream of the one proposed in GenBank. An ATG codon obeying the Kozak rules (23) was identified 65 nucleotides upstream and a new primer was designed to incorporate it (Table I, primer Fb). PCR amplification using primers KLK12-Fb/KLK12-Ra resulted in the expected 860 bp product, which, upon sequencing, showed 99.77% homology to the predicted sequence (2/860 nucleotides difference). Importantly, no product of smaller size, which would indicate absence of the extra coding exon described above, was detected (data not shown).
While searching for alternative start codons, we reasoned that if nucleotide C at position +49 of the predicted Klk12 coding region were substituted by an A, an ATG start codon would be created at the exact same site as the translation initiation codon of human KLK12 (Figure 2A). This would give rise to a coding region with 5 exons and high homology to human KLK12. However, all our sequencing data revealed a C at that position. To exclude the possibility that there might be some transcripts with an A at that position, we performed an allele-specific PCR using two new primers (Table I, KLK12-Fc and KLK12-Fd) differing only by the last nucleotide (A or C, respectively) at their 3’-end. PCR amplification with the KLK12-Fc/KLK12-Ra and KLK12-Fd/KLK12-Ra pairs revealed products only for the latter combination, thus verifying the absence of A at that position.
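The reasoning behind the allele-specific PCR can be illustrated with a short Python sketch. It is only a toy model under stated assumptions: the sequence below is invented (it is not the Klk12 sequence), the primer length of 20 nt is arbitrary, and the actual primers are those listed in Table I. The sketch shows how a C-to-A substitution at position +49 would create an ATG codon, and how two forward primers can be built to differ only in their 3’-terminal nucleotide.

```python
def codon_at(seq: str, pos: int) -> str:
    """Codon starting at 1-based position `pos`."""
    return seq[pos - 1:pos + 2].upper()


def substitute(seq: str, pos: int, base: str) -> str:
    """Copy of `seq` with the nucleotide at 1-based position `pos` replaced."""
    return seq[:pos - 1] + base + seq[pos:]


def allele_specific_primers(seq: str, pos: int, length: int,
                            alleles=("A", "C")) -> dict:
    """Forward primers whose 3'-end sits at 1-based position `pos`.

    The primers share the same upstream stem and differ only in their
    last (3'-terminal) nucleotide, one primer per allele.
    """
    start = pos - length  # 0-based index of the first primer nucleotide
    if start < 0:
        raise ValueError("primer would extend past the 5'-end of the sequence")
    stem = seq[start:pos - 1]
    return {allele: stem + allele for allele in alleles}


# Hypothetical toy sequence: 48 arbitrary nucleotides, then CTG at +49.
toy = "ACGT" * 12 + "CTGGCCAAG"

print(codon_at(toy, 49))                       # codon as sequenced
print(codon_at(substitute(toy, 49, "A"), 49))  # codon after C-to-A change
print(allele_specific_primers(toy, 49, 20))
```

Because extension by the polymerase depends on a matched 3’-terminal base, only the primer ending in the nucleotide actually present in the template is expected to yield a product, which is the logic applied with KLK12-Fc and KLK12-Fd.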
All the above led us to propose that, for Klk12, translation may be initiated at a CTG rather than the classical ATG codon (Figures 1 and 2, model 2). Although relatively rare, initiation of translation can occur at non-ATG codons that differ from ATG by a single nucleotide, with CTG being the most efficient one in mammals (24). For initiation at non-ATG codons, the presence of a good Kozak context is still crucial. Indeed, the CTG codon proposed here has a purine (A) at position -3 and a G at position +4, as dictated by the Kozak motif (23). The following observations further support model 2 as the prevailing one: (i) the start codon resides on nucleotides 20-22 of exon 1, consistent with the observation that all KLKs have a 5’ non-translated sequence within their first exon (4); (ii) there is a purine (A) at position -3 (23), whereas in model 1 there is a pyrimidine (C) at that position; (iii) the coding region includes 5 exons of exactly the same sizes as those in KLK12; (iv) the coding region exhibits high homology to the corresponding human KLK12 sequence (81.6% vs. 76.7% for model 1; Table II); (v) when translated, it produces a protein with high similarity to its human counterpart (77.0% vs. 71.9% for model 1; Table II) and a signal peptide that, as in human KLK12, spans amino acids 1-17 (Figure 2B).
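The two Kozak criteria used above (a purine at position -3 and a G at position +4, counting the first nucleotide of the start codon as +1) are simple to check programmatically. The following Python sketch is an illustration only; the function name and the two short sequences are assumptions made for the example and are not the actual Klk12 flanking sequences.

```python
def kozak_context(seq: str, start: int) -> dict:
    """Inspect positions -3 and +4 around a candidate start codon.

    `start` is the 0-based index of the first nucleotide of the candidate
    codon (position +1). There is no position 0, so -3 is three
    nucleotides upstream and +4 is the nucleotide right after the codon.
    """
    if start < 3 or start + 3 >= len(seq):
        raise ValueError("not enough flanking sequence around the codon")
    seq = seq.upper()
    minus3 = seq[start - 3]
    plus4 = seq[start + 3]
    return {
        "codon": seq[start:start + 3],
        "minus3": minus3,
        "plus4": plus4,
        "purine_at_minus3": minus3 in ("A", "G"),
        "g_at_plus4": plus4 == "G",
    }


# Hypothetical contexts, chosen only to mirror the two situations described:
favourable = "GCCACCCTGGCG"    # CTG with A at -3 and G at +4
unfavourable = "GCCCCCATGTCG"  # ATG with C at -3 and T at +4

print(kozak_context(favourable, 6))
print(kozak_context(unfavourable, 6))
```

A check of this kind only screens sequence context; as stated in the text, whether a given CTG is actually used for initiation remains an experimental question.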
Structural characterization of Klk11 and Klk13. Like Klk12, Klk11 and Klk13 of the canine family have never been experimentally characterized. Therefore, in order to study their expression in canine mammary tissues, we again designed primers (Table I) based on computationally identified sequences available in GenBank (accession numbers XM_005616254 and XM_003638808, respectively). Primer pairs were selected to amplify the entire coding region of the genes. DNA sequencing of the amplified products revealed nucleotide sequences exhibiting 100% homology to the corresponding predicted Klks (data not shown), as well as high similarity rates (85.7% and 86.1%, respectively) to the corresponding human KLKs (Table II). Alignments of the obtained Klk sequences with both canine “whole-genome shotgun sequences” (GenBank accession No. AAEX03000771.1) and the respective human KLKs (GenBank accession Nos.: NM_006853 and NM_015596, respectively) revealed several structural characteristics of these genes, such as exon/intron sizes, exon-intron boundaries and the positions of the codons of the catalytic triad (H, D, S) (Figure 3, classical forms). When translated, these nucleotide sequences gave protein products with high homologies (82.0% and 83.8%, respectively) to their human counterparts (Table II), bearing an 18-amino-acid signal peptide at their N-terminus. The above observed characteristics are in complete agreement with those described for all human and other canine Klks.
Detection and characterization of alternatively spliced variants. Besides the product expected for each Klk (Table I), PCR amplifications revealed amplicons of lower molecular weight (Figure 4). Gel extraction and DNA sequencing of these products verified that they comprise transcripts of the corresponding canine Klks. Transcript variants that lack the entire exon 2-3 sequence were detected in all three Klks and were designated as variant 1 in all cases (Figure 4). For Klk11, one more transcript (variant 2), bearing an exon 3 deletion and a 4 bp insertion between exons 2 and 4, was identified. Searching within the Klk11 genomic DNA, we found that the insertion encompasses the first 4 nucleotides of the 5’-end of intron II. A second transcript, missing exons 2, 3 and 4, was also identified for Klk13 (variant 2).
The generation of these transcript variants could be explained either by complete omission of the regular splice sites or by employment of alternative splice sites (Figure 5). The alternative splicing events that lead to exon 2-3 deletion do not disturb the open reading frame in any of the three Klks (Figure 3). The predicted protein products, however, are not active serine proteases, as they lack both His and Asp of the catalytic triad. On the other hand, both exon 2 3’-extension/exon 3 deletion and exon 2-3-4 skipping cause a frame shift and lead to premature termination of translation in Klk11 and Klk13, respectively. The truncated polypeptides possess either only one or none of the amino acids of the catalytic triad and, as a result, they cannot act as serine proteases. All isoforms encode a signal peptide and can, thus, be secreted (Figure 3).
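Whether a splicing event preserves the open reading frame depends only on whether the net change in coding length is a multiple of three. The Python sketch below illustrates this rule; it is not the analysis pipeline of this study, and the exon lengths are hypothetical placeholders chosen solely to reproduce the three outcomes described above (the actual exon sizes are those shown in Figure 3).

```python
from typing import Dict, Iterable


def frame_preserved(exon_lengths: Dict[int, int],
                    skipped: Iterable[int],
                    inserted_nt: int = 0) -> bool:
    """True if removing the skipped exons (and adding `inserted_nt`
    retained intronic nucleotides) keeps the downstream reading frame."""
    removed = sum(exon_lengths[e] for e in skipped)
    return (removed - inserted_nt) % 3 == 0


# Hypothetical coding-exon lengths in nucleotides (NOT the real Klk sizes).
toy_exons = {1: 46, 2: 160, 3: 263, 4: 137, 5: 153}

# Exon 2-3 skipping (variant 1): 160 + 263 = 423 nt removed, in frame.
print(frame_preserved(toy_exons, [2, 3]))

# Exon 3 deletion plus a 4 nt 3'-extension of exon 2: net 259 nt, frame shift.
print(frame_preserved(toy_exons, [3], inserted_nt=4))

# Exon 2-3-4 skipping: 560 nt removed, frame shift.
print(frame_preserved(toy_exons, [2, 3, 4]))
```

An in-frame deletion still removes the codons carried by the skipped exons, which is why variant 1 retains the reading frame yet lacks the His and Asp residues of the catalytic triad.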
Expression in tumor and normal mammary tissues. The expression of all three Klks was examined in 20 mammary tumors and in 10 pairs of tumor and adjacent normal mammary tissues. The results are presented in Table III. Among the 30 tumor samples analyzed in total, positivity rates ranged from 93.3% (28/30) for Klk11 variants 1 and 2, the classical form of Klk12 and Klk13 variant 1, to 50% for Klk13 variant 2. All Klk transcripts were detected in normal tissues as well, yet with lower frequency, except for the classical form of Klk11, which was found in 9/10 tissues. In the 10 matched samples, the positivity of Klk11 variant 2, the classical form of Klk12 and Klk13 variant 1 in normal tissues dropped to approximately 50% of that in the corresponding tumors (Figure 4).
Discussion
Despite the fact that the increasing availability of sequencing data in genomic databases has enabled in silico identification of the Klk family within the genome of several organisms (5, 8-10), studies on their experimental characterization and expression in non-human tissues are still limited. Such information, however, would be of great importance to assist in the development of appropriate animal models for investigating KLK functions in both physiology and disease. Lately, we have proposed the dog as a promising animal model for in vivo KLK studies and we sought to examine the Klk expression profile in both neoplastic and non-neoplastic tissues (19-21). In the present report we investigated the expression of canine Klk11, Klk12 and Klk13 in mammary tissues.
First, the experimentally identified nucleotide sequences were in agreement with the ones predicted in silico to represent canine orthologs of human KLK11, KLK12 and KLK13. Klk11 and Klk13 were found to share high homology with their human counterparts and to abide by all KLK-defining characteristics. For Klk12, however, it was noted that the start codon annotated in GenBank generates a coding sequence and a protein product that lack fundamental KLK features and show no homology to the 5’- and NH2-termini, respectively, of the KLK12 nucleotide and amino acid sequences. Such discrepancies have never been observed in any other Klk, either of the same or of a different species.
Having excluded the possibility that this Klk12 transcript might constitute an alternatively spliced variant, we noticed a CTG codon further downstream of the in silico predicted start codon which, if it acts to initiate translation, could give rise to a coding sequence and a protein product with all KLK features and high homology to human KLK12. It is possible that an A to C substitution during evolution might have abolished the ATG initiation site at this position. Alternatively, the coding region may begin with a non-ATG codon. Tikole and Sankararamakrishnan (25) have demonstrated that the coding regions of about 0.1% of all mRNA sequences begin with a non-AUG codon. In mammals, CUG (CTG in the DNA sequence) appears to be the most efficient non-AUG start codon, whereas AAG and AGG are the least efficient ones (24). Searching within the “non-AUG” bioinformatic database (http://bioinfo.iitk.ac.in), we found that there are 5 such sequences within the canine genome, all of which initiate translation at a CTG codon. Out of these, 3 have a purine at position -3 and a G at +4, as identified for the CTG codon suggested here to constitute the Klk12 start codon. All the above led us to propose two potential models for the structure of the canine Klk12 coding region: one beginning with the ATG codon predicted in the computationally determined sequence available in GenBank (model 1) and one beginning with the CTG codon presented in this study (model 2). For the reasons explained above, the most probable one, in our opinion, is model 2. This, however, remains to be confirmed with in vitro translation studies.
As previously demonstrated for other canine Klks (19-21), Klk11-13 were also shown to undergo alternative splicing. A splice variant lacking both exons 2 and 3 was detected for all three of them (variant 1). This was the sole variant identified for Klk12. Klk11 had one more transcript, which combined 3’-extension of exon 2 and deletion of exon 3 (variant 2), and Klk13 had an additional variant bearing only exons 1 and 5 (variant 2). In Klk11 variant 2 and Klk13 variant 2, alternative splicing led to a frame shift and the subsequent generation of a premature termination codon. As a result, both variants, if translated, would produce truncated peptides. On the other hand, exon 2-3 elimination in variant 1 of all three Klks does not disrupt the reading frame but, upon translation, would still produce smaller polypeptides. None of these peptide isoforms is expected to be a functional serine protease, as they lack critical residues of the catalytic triad. In Klk11 variant 2, the sequence encoding the signal peptide is retained, whereas in variant 1 of all three Klks it is disrupted by the deletion. The new sequence generated at the 5’-end of the transcripts, however, also encodes a signal peptide in all three variants. These observations imply that the products of all the alternatively spliced transcripts identified are likely secreted upon translation.
The investigation of the expression of all mRNA transcripts of the three Klks in mammary tumor tissues revealed almost ubiquitous expression of the classical form of Klk12, both variants 1 and 2 of Klk11 and variant 1 of Klk13. With the exception of Klk11 variant 1, the other three transcripts were detected in almost all tumors and in only half of the corresponding normal tissues. It is also worth mentioning that the predominant transcript, in terms of expression levels, for both Klk11 and Klk13 was the one missing exons 2 and 3 (variant 1) and not the classical form of the genes. Moreover, Klk13 variant 1 was detected in almost all tumor tissues, whereas the corresponding classical form was detected in only half of them. It is, thus, possible that the elimination of exons 2-3 via alternative splicing is a mechanism utilized to abolish Klk11 and Klk13 function.
This study has experimentally characterized canine Klk11, Klk12 and Klk13, along with their alternatively spliced variants, and demonstrated their expression in neoplastic and non-neoplastic mammary tissues. With this work completed, Klk15 is the only gene that remains to be experimentally characterized in order to resolve the entire canine Klk family. Overall, the preliminary evidence from our current and past reports lays sufficient groundwork for validation studies with larger cohort sizes. These data await deeper mining, interpretation and even incorporation into both genetic/evolutionary and clinical studies with translational impact.
Acknowledgments
This work was funded by the “Support of Research Activity in A.U.TH-2012” institutional program (A.U.TH Research Committee no. 89290).
- Received January 26, 2015.
- Revision received February 3, 2015.
- Accepted February 6, 2015.
- Copyright© 2015 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved