Abstract
Background/Aim: Aldehyde dehydrogenase 1 (ALDH1) is known as a breast cancer stem cell (CSC) marker. This study aimed to identify genes associated with ALDH1. Materials and Methods: ALDH1-positive and -negative breast cancer cells were isolated using laser capture microdissection from five tissue samples of ALDH1-positive breast cancer patients. Messenger RNA expression levels were compared between ALDH1-positive and -negative cells. Results: We found 104 differentially expressed genes between ALDH1-positive and -negative cells. Gene ontology and pathway analysis revealed that these genes were correlated with CSC functions and pathways. Network analyses identified 10 genes that were closely associated with ALDH1. We validated these 10 genes utilizing The Cancer Genome Atlas and the Molecular Taxonomy of Breast Cancer International Consortium cohort, and found that they were associated with ALDH1 expression and correlated with Wnt pathway signaling. Conclusion: The 10 genes we identified could be potential targets for CSC therapy of breast cancer.
Aldehyde dehydrogenase 1 (ALDH1) has been identified as a marker of breast cancer stem cells (CSCs) (1). Two meta-analyses on ALDH1 function in breast cancer have been reported (2, 3). One of these analyzed 15 publications on ALDH1A1 and revealed that ALDH1A1 expression was significantly associated with tumor size, nodal status, histological grade, estrogen receptor (ER)- and progesterone receptor (PR)-negativity, and human epidermal growth factor receptor 2 (HER2)-positivity. The prognosis of patients with ALDH1A1-positive tumors was worse than that of patients with ALDH1A1-negative tumors (2). The other meta-analysis, which covered 12 eligible studies, reported similar results except for tumor size and nodal status (3).
We also previously examined ALDH1A1 expression in 653 invasive breast cancer cases using core needle biopsy specimens at diagnosis (4). ALDH1 expression was examined in tumor cells and detected in 139 of the 653 cases (21.3%). The association of ALDH1 expression with clinicopathological features was consistent with that shown in previous meta-analyses. According to intrinsic subtypes, ALDH1-positive cases were found in the luminal type (12.2%), luminal-HER2 type (36.5%), HER2-enriched type (37.9%), and triple-negative type (30.0%).
Based on these results, it is clear that ALDH1 is associated with poor clinical outcomes in breast cancer patients, probably through regulating CSC features. ALDH1 is known as an enzyme that catalyzes biosynthesis of retinoic acid (RA) by oxidizing retinal and aliphatic aldehydes and plays a role in detoxification (5). However, questions remain as to how ALDH1 affects biological features of breast cancer cells and why this gene acts as a marker of CSCs.
In this study, we focused on triple-negative breast cancer (TNBC) because some cellular populations of TNBC were shown to possess stem cell features in comprehensive molecular analysis (6, 7). We aimed to identify genes associated with ALDH1 function as potential target genes in CSC that could be used to develop treatment for TNBC.
Materials and Methods
Patients and samples. Tissue samples were obtained from patients who underwent surgery at the Yokohama City University Medical Center. Five patients with triple-negative breast cancer (TNBC) and ALDH1A1 expression were enrolled in this study. The patients did not receive any preoperative treatments to avoid potential gene modification. This study was approved by the Institutional Review Board of Yokohama City University (D1207027). All procedures performed on human participants were in accordance with the ethical standards of the institutional research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. The patients provided informed consent prior to inclusion in the study.
Histopathological and immunohistological staining. Hematoxylin and eosin (H&E)-stained sections from each block were prepared for histological examination and diagnosis. To determine the breast cancer subtype, immunohistochemistry (IHC) of paraffin-embedded breast cancer tissues was performed to detect ER, PgR, and HER2. ER-negative, PgR-negative, and HER2-negative tumors were considered TNBC. ALDH1A1 was detected by IHC with an anti-ALDH1A1 antibody (EP1933Y, ab52492, Abcam, Cambridge, UK), following the previously described protocol (4). Representative images of the H&E and ALDH1A1 staining are shown in Figure 1.
Laser microdissection of ALDH1-positive and ALDH1-negative tumor cells for RNA extraction. ALDH1-positive and -negative cells were dissected separately from the five TNBC tissue samples using laser capture microdissection (LCM; PALM MicroBeam, Zeiss, Germany). Representative images pre- and post-LCM are shown in Figure 2. RNA was then isolated from the tumor tissue specimens after LCM according to a proprietary procedure from Response Genetics (Los Angeles, CA, USA) (8). Total RNA was analyzed using Affymetrix GeneChip microarrays (Affymetrix Human Genome U133 Plus 2.0 Array; Thermo Fisher Scientific, Waltham, MA, USA). We performed a microarray analysis of five ALDH1-positive TNBC samples.
Microarray analysis to identify differentially expressed mRNAs between ALDH1-positive and ALDH1-negative tumor cells. The data were calibrated and standardized using Microarray Suite version 5.0 (MAS 5.0) (9, 10), a widely used method for microarray normalization. Following standardization, we excluded genes with unreliable values or values <300 to ensure the quality of the microarray data. We calculated the fold change (FC) of gene expression (ALDH1-positive area vs. ALDH1-negative area) and identified 104 genes with FC values >2.0 or <0.5.
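The filtering and fold-change steps described above can be illustrated with a minimal sketch. It assumes that "values <300" means a gene is dropped when its normalized signal falls below 300 in either area; the gene names and signal values are hypothetical placeholders, not data from this study, and the function names are ours.

```python
def fold_changes(positive, negative, min_signal=300.0):
    """Return {gene: FC} (positive area / negative area) for reliable genes."""
    result = {}
    for gene, pos_value in positive.items():
        neg_value = negative.get(gene)
        if neg_value is None:
            continue  # gene not measured in both areas
        # Assumed interpretation: drop genes with low signal in either area.
        if pos_value < min_signal or neg_value < min_signal:
            continue
        result[gene] = pos_value / neg_value
    return result


def differentially_expressed(fc_by_gene, upper=2.0, lower=0.5):
    """Split genes into up-regulated (FC > upper) and down-regulated (FC < lower)."""
    up = sorted(g for g, fc in fc_by_gene.items() if fc > upper)
    down = sorted(g for g, fc in fc_by_gene.items() if fc < lower)
    return up, down


# Hypothetical normalized signals for one case.
positive = {"GENE_A": 1200.0, "GENE_B": 400.0, "GENE_C": 150.0, "GENE_D": 900.0}
negative = {"GENE_A": 500.0, "GENE_B": 1000.0, "GENE_C": 600.0, "GENE_D": 800.0}

fc = fold_changes(positive, negative)
up, down = differentially_expressed(fc)
print(up, down)
```

In this toy case GENE_C is removed by the signal filter before any fold change is computed, so a low-intensity gene cannot produce a large ratio from noise alone.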
Molecular network and statistical analyses. The 104 identified genes were analyzed using the KeyMolnet knowledge database (viewer program version 6.2, contents version 9.7.20180921161102) (KM Data Inc.; www.km-data.jp) (11). KeyMolnet has manually curated content on numerous associations among genes, proteins, metabolites, microRNAs, and molecular annotations such as diseases, pathological events, drug targets, and biomarker information. The list of differentially expressed genes was imported into KeyMolnet. The “start points and end points” network search algorithm was performed using differentially expressed genes as the start points and ALDH1 as the end-point to generate the network and identify candidate regulatory molecules causing ALDH1 induction. The statistical significance in concordance between the canonical pathways and the extracted network was evaluated using an algorithm that counts the number of overlapping molecular relations shared by both. This made it possible to identify the canonical pathway exhibiting the most significant contribution to the extracted network.
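The idea behind a "start points and end points" search can be sketched as a path search on a directed graph. This is not KeyMolnet's algorithm, which is proprietary; it is a simple breadth-first search under the assumption that molecular relations are directed edges, and every node name and edge below is a hypothetical placeholder rather than a curated relation.

```python
from collections import deque


def shortest_path(graph, start, end):
    """Breadth-first search; returns one shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == end:
            return path
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def intermediates(graph, starts, end):
    """Collect molecules lying between any start point and the end point."""
    found = set()
    for start in starts:
        path = shortest_path(graph, start, end)
        if path:
            found.update(path[1:-1])  # exclude the start and end nodes
    return found


# Hypothetical relations: DEG_* are differentially expressed genes (start points).
graph = {
    "DEG_1": ["TF_X"],
    "DEG_2": ["TF_Y", "M_1"],
    "DEG_3": ["M_2"],
    "TF_X": ["ALDH1"],
    "TF_Y": ["ALDH1"],
    "M_1": ["M_2"],
}
starts = ["DEG_1", "DEG_2", "DEG_3"]
candidates = intermediates(graph, starts, "ALDH1")
print(sorted(candidates))
```

Start points with no route to the end point (DEG_3 here) contribute nothing, so only molecules that connect a differentially expressed gene to ALDH1 are reported as candidates.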
Gene expression analyses of the TCGA-BRCA and METABRIC cohorts. We used two large publicly available cohorts, The Cancer Genome Atlas (TCGA) (12) and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) (13), to confirm the clinical relevance of the identified genes. Normalized gene expression data (log2 transcripts per million values) of primary breast cancer tumors from female patients in the two cohorts were obtained from the cBio Cancer Genomics data portal. Gene set variation analysis (GSVA) was used to transform the gene expression values into pathway enrichment scores (14). The GSVA score for the HALLMARK_WNT_BETA_CATENIN_SIGNALING gene set from the MSigDB Hallmark collection (15) was calculated for each tumor from its gene expression. For each of the ALDH1-associated genes of interest, patients from each cohort were grouped into high- and low-expression groups based on the within-cohort 10th percentile gene expression value. Boxplots depict the median, inter-quartile range, and outliers using the Tukey method. The Hallmark gene set scores and the ALDH1 gene expression values of the two groups were compared using one-way ANOVA.
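The grouping and comparison steps can be sketched as follows. The sketch assumes that tumors at or below the within-cohort percentile cutoff form the low-expression group and the rest form the high-expression group, uses a nearest-rank percentile, and computes only the one-way ANOVA F statistic in pure Python (a p-value would additionally require the F distribution, for example via `scipy.stats.f_oneway`). GSVA scores are taken as precomputed inputs, and all numbers are hypothetical.

```python
import math


def percentile_cutoff(values, pct):
    """Nearest-rank percentile of values, with pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def split_by_expression(gene_expr, score, pct=10.0):
    """Split per-tumor scores into (low, high) groups by gene expression cutoff."""
    cutoff = percentile_cutoff(gene_expr, pct)
    low = [s for g, s in zip(gene_expr, score) if g <= cutoff]
    high = [s for g, s in zip(gene_expr, score) if g > cutoff]
    return low, high


def anova_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical cohort of 10 tumors: expression of one gene and a pathway score.
gene_expr = [5.0, 1.0, 9.0, 3.0, 7.0, 2.0, 8.0, 4.0, 10.0, 6.0]
wnt_score = [0.1, -0.4, 0.3, 0.0, 0.2, -0.1, 0.25, 0.05, 0.35, 0.15]

low, high = split_by_expression(gene_expr, wnt_score, pct=10.0)
f_stat = anova_f(low, high)
print(len(low), len(high), f_stat)
```

With only two groups, one-way ANOVA is equivalent to a two-sample t-test with pooled variance (F equals t squared), so the choice of ANOVA here mainly keeps the same test applicable if more groups are added.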
Results
Identification of genes associated with ALDH1A1. The total RNA isolated from ALDH1A1-positive and -negative cells dissected using LCM was subjected to gene expression analysis using Affymetrix GeneChip microarrays (Figure 2). The data on up-regulation and down-regulation of genes were recorded. Initially, 54,682 genes were extracted, and 32,264 genes were selected after background noise elimination. The expression values of GAPDH, used as a housekeeping gene, and of ALDH1A1 in our microarray datasets are shown in Tables I and II. High expression of GAPDH was detected in all samples (Table I). In contrast, the expression of ALDH1A1 varied among samples, and not all ALDH1A1-positive cell samples showed higher ALDH1A1 mRNA expression than the corresponding ALDH1A1-negative cells (Table II).
The fold change (FC) in gene expression (ALDH1A1-positive area vs. ALDH1A1-negative area) was calculated, and genes with FC values >2 or <0.5 in each of the five cases were identified (Table III). Among them, genes that were commonly different between the ALDH1A1-positive and ALDH1A1-negative cells in the five cases were extracted. With regard to the FC in gene expression, 63 genes showed at least two-fold higher and 41 genes showed at least two-fold lower expression in ALDH1A1-positive cells compared to ALDH1A1-negative cells.
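One way to extract genes that differ consistently across cases is a set intersection of the per-case calls. The sketch below assumes that "commonly different" means a gene passes the same FC threshold in every case; that reading, the function name, and the placeholder gene names are assumptions for illustration only.

```python
def common_calls(per_case):
    """per_case: list of (up_genes, down_genes) pairs, one pair per case."""
    if not per_case:
        return set(), set()
    # A gene is kept only if it is called in the same direction in every case.
    common_up = set.intersection(*(set(up) for up, _ in per_case))
    common_down = set.intersection(*(set(down) for _, down in per_case))
    return common_up, common_down


# Hypothetical per-case calls (three cases shown for brevity).
per_case = [
    ({"GENE_A", "GENE_B"}, {"GENE_X"}),
    ({"GENE_A"}, {"GENE_X", "GENE_Y"}),
    ({"GENE_A", "GENE_C"}, {"GENE_X"}),
]
common_up, common_down = common_calls(per_case)
print(sorted(common_up), sorted(common_down))
```

Requiring agreement in every case is a strict criterion: it reduces false positives from any single dissection at the cost of missing genes that change in only a subset of tumors.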
Gene ontology and pathway analysis. Gene Ontology (GO) analysis revealed that the identified genes were associated with stem cell functions such as organ morphogenesis, cell differentiation, metabolic homeostasis, and regulation of TOR signaling pathways (Table IV). The results of pathway analysis are shown in Table V. Pathway analysis also revealed genes associated with altered metabolism, including the cyanoamino acid, steroid, and fatty acid biosynthesis pathways. The ABC transporter and nucleotide excision repair pathways, which are associated with stemness, also differed between ALDH1-positive and -negative cells.
Network analysis of genes related to ALDH1A1. The list of the 104 differentially expressed genes was imported into KeyMolnet. Then, the “start points and end points” network search algorithm was performed using the differentially expressed genes as the start points and ALDH1A1 as the end point to generate the network and identify candidate regulatory molecules causing ALDH1A1 induction (Figure 3). Network analysis extracted 10 transcription factors: SMAD4, RARα, MUC1, HASH1, C/EBPβ, PITX3, BRD4, LXR, PCAF, and SIRT2. These factors were directly or indirectly associated with ALDH1A1 expression.
Gene expression analyses of TCGA-BRCA and METABRIC cohorts. We validated our data using two large publicly available cohorts, The Cancer Genome Atlas (TCGA) and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC), to examine the association between the 10 genes, ALDH1A1, and the Wnt signaling pathway, which is related to cancer stem cell function (16). The results are shown in Figure 4. Indeed, several genes including C/EBPβ, NR1H3 (LXR), MUC1, and SIRT2 were associated with ALDH1A1 expression in both datasets. The expression levels of BRD4, C/EBPβ, ASCL1, NR1H2 (LXR), MUC1, PITX3, RARα, SIRT2, and SMAD4 were correlated with Wnt pathway signaling.
Discussion
In this study, we identified differentially expressed genes between ALDH1-positive and -negative breast cancer cells. The difference in gene expression between ALDH1A1-positive and -negative cells in the same tumor may help explain the mechanism behind ALDH1A1 function in cancer stemness. Notably, 63 genes were up-regulated whereas 41 genes were down-regulated in ALDH1A1-positive cells compared to ALDH1A1-negative cells. CSCs exhibit self-renewal, tumor-initiating properties, and treatment resistance (17). Furthermore, CSCs show metabolic alterations in glycolysis (18), lipid metabolism (19), and steroid biosynthesis (20). Indeed, GO analysis revealed stemness-related categories such as organ morphogenesis, cell differentiation, metabolic alterations, and regulation of TOR signaling pathways. Likewise, the pathway analysis revealed altered gene expression in stemness-related pathways in ALDH1A1-positive cells compared to ALDH1A1-negative cells, including several metabolic pathways and treatment resistance mechanisms such as ABC transporters and nucleotide excision repair.
Network analysis identified 10 transcription factors (SMAD4, RARα, MUC1, HASH1, C/EBPβ, PITX3, BRD4, LXR, PCAF, and SIRT2) that were associated with ALDH1A1. For example, SMAD4 is the main mediator of the TGF-β signaling pathway, which is involved in many biological activities including fibrosis, embryonic development, wound healing, tumor development, cell differentiation, apoptosis, homeostasis, and immune response regulation. In complex with other transcription factors, SMAD4 regulates the expression of target genes such as Twist1, Snail, and Slug that are associated with stemness (21). We then validated the association between these 10 factors and ALDH1A1 expression or the CSC-related signaling pathway by utilizing two large publicly available cohorts, The Cancer Genome Atlas (TCGA) (12) and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) (13). These two cohorts include all subtypes of breast cancer. We have used these cohorts to demonstrate clinical relevance in several previous studies (22-30). Indeed, several genes, including C/EBPβ (31), NR1H3 (LXR) (32), MUC1 (33), and SIRT2 (34), were associated with ALDH1A1 expression in both datasets. The expression levels of BRD4 (35), C/EBPβ, ASCL1 (hASH1) (36), NR1H2 (LXR), MUC1, PITX3 (37), RARα (5), SIRT2, and SMAD4 (38) were correlated with the Wnt signaling pathway, which plays an important role in self-renewal and differentiation of stem cells (16). Interestingly, most of these 10 factors were associated with poor survival outcome in the TCGA cohort (data not shown).
Among the ALDH1A1-positive samples, some showed low expression levels of ALDH1A1 in our microarray data. This discordance between protein and mRNA expression levels presumably derives from differences in the transcriptional activity of the cells or from changes in translational efficiency due to post-transcriptional regulation (39). For example, microRNAs are recognized as a key component of the post-transcriptional regulatory network (40). As we have previously demonstrated the importance of ALDH1A1 protein expression in breast cancer patients (4), we conducted the microarray and network analyses based on the expression of the ALDH1A1 protein.
Although we validated our data by utilizing two large publicly available cohorts, subsequent studies involving newer techniques such as single-cell sequencing are warranted to provide more specific information regarding the mechanisms regulating breast CSCs (41). The specific mechanisms of regulation of ALDH1 in CSCs remain unclear. However, regulation of RA, reactive oxygen species (ROS), and detoxification by reactive aldehyde metabolism are considered to be closely related to the functional roles of CSCs. The human ALDH superfamily has 19 isozymes subdivided among 11 families and 4 subfamilies. Among them, the ALDH1A1 and ALDH1A3 isoforms are particularly associated with CSCs because the roles mentioned above confer resistance to radiotherapy and chemotherapy (5, 42). We only examined the ALDH1A1 isoform in this study. Thus, it would be of interest to perform the same analysis with ALDH1A3.
In conclusion, we found altered expression of 104 genes between ALDH1-positive and -negative cells, and these genes were associated with CSC functions. Network analysis showed that 10 genes were associated with ALDH1 expression. Most of these 10 genes have already been shown to play critical roles in maintaining stem cell features, providing a rationale for ALDH1A1 being a stem cell marker of breast cancer. These genes could be potential targets for cancer stem cell therapy, particularly for treating incurable breast cancer.
Acknowledgements
The Authors thank Dr. Edward Barroga (https://orcid.org/0000-0002-8920-2607), Medical Editor and Professor of Academic Writing at St. Luke’s International University, and Editage for editing the manuscript.
Footnotes
Authors’ Contributions
Conception and design: AY and TI. Acquisition of data: AY, CS, SA, HS, SY, MT, DS, MO, and KK. Drafting the manuscript: AY. Analysis and interpretation of data: KN, RT, KT, and YM. Supervision of the project: YI and EI. All Authors read and approved the final article.
Funding
This work was supported by National Institutes of Health (NIH) grant R01CA160688 to KT.
Conflicts of Interest
The Authors declare that they have no conflicts of interest in regard to this study.
- Received October 15, 2020.
- Revision received November 3, 2020.
- Accepted November 13, 2020.
- Copyright © 2020 International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.