Introduction

Breast cancer is one of the most common cancers in women. Many studies have found that the disease is associated with enzymes involved in metabolism, but its etiology is still not clear. Cytochrome P450 1A1 (CYP1A1) is one of the most important Phase I enzymes expressed in breast tissue. This enzyme is involved in the metabolism of estrogens and of mammary carcinogens such as polycyclic aromatic hydrocarbons (PAH) and heterocyclic amines (HCA). CYP1 enzymes are responsible for the metabolic activation of PAH and HCA. In this way, many PAHs, e.g., benzo[a]pyrene (BaP), can be activated to form harmful products that bind covalently to nucleic acids and proteins, forming adducts that facilitate mutagenesis (McManus et al. 1990). CYP1A1 also plays an important role in the metabolism of estrogen in vivo by catalyzing 2-hydroxylation. The 2-hydroxylation products (2-OH estradiol catechol and 2-OH estrone catechol) lack estrogenic activity (Kawajiri et al. 1990). Furthermore, the 2-hydroxy catechol metabolites can be converted by O-methylation into 2-methoxy derivatives, which have been shown to possess anti-proliferative and anti-angiogenic properties (Cascorbi et al. 1996; Crofts et al. 1993). Another, mutually exclusive, pathway of estrogen metabolism is 16α-hydroxylation, which produces strongly estrogenic metabolites and has been linked to estrogen-induced carcinogenesis in both laboratory animals and humans (Cushman et al. 1995; Fotsis et al. 1994). These contrasting functions make CYP1A1 an interesting candidate gene that might influence susceptibility to breast cancer. Four polymorphisms have been found in this gene: T3801C (a substitution in the 3′ non-coding region; Spurr et al. 1987), A2455G (isoleucine to valine change at codon 462; Hayashi et al. 1991), T3205C (a transition mutation in the 3′ non-coding region; Crofts et al. 1993), and C2453A (threonine to asparagine change at codon 461; Cascorbi et al. 1996). Among them, C2453A is very rare and T3205C exists only in Africans or African-Americans, so most studies have focused on T3801C and A2455G.

The functional significance of the polymorphisms found in CYP1A1 is still not clear. Cosma et al. (1993) found significantly elevated levels of inducible lymphocyte CYP1A1 enzyme activity in individuals carrying A2455G compared with individuals with the wild-type genotype. Crofts et al. (1993) reported a threefold elevation in CYP1A1 enzymatic activity associated with the A2455G G/G genotype. The T3801C allele was also reported to encode an inducible form of CYP1A1 (Kiyohara et al. 1996).

Case-control studies are the most common method of investigating the association between a disease and a specific gene. However, the studies published so far on CYP1A1 polymorphisms in breast cancer have provided conflicting results. Shen et al. (2006) found that, compared with the CYP1A1 T3801C wild-type (T/T), the risk of breast cancer approximately doubled for the T3801C heterozygote (T/C) [odds ratio (OR) 1.83; 95% confidence interval (CI) (1.24–2.69)] and the T3801C homozygote (C/C) [OR 2.22; 95% CI (1.26–3.85)] genotypes, but Miyoshi et al. (2002) reported the opposite result, i.e., that both T3801C and A2455G showed a significantly reduced breast cancer risk compared with non-carriers [T3801C: P < 0.01, OR 0.60, 95% CI (0.41–0.88); A2455G: P < 0.05, OR 0.66, 95% CI (0.45–0.96)]. Furthermore, most studies concluded that there is no significant association between the disease and these polymorphisms. However, some of these studies were based on small numbers of cases and controls, so a meta-analysis of all available studies should help to establish a more convincing overall result. Considering the dual role of CYP1A1 in carcinogen activation and estrogen 2-hydroxylation, subgroup analysis based on ethnic and environmental factors may also yield more meaningful results.

Here, we performed a meta-analysis including a subgroup analysis based on ethnicity and menopausal status, and found that genotype A2455G G/G might reduce the risk of breast cancer for pre-menopausal women regardless of ethnicity. This homozygote genotype might also specifically decrease the risk of breast cancer in Asian women.

Materials and methods

Selection of studies

All of the case-control studies were identified by a computerized literature search of the PubMed database (prior to September 2006) using the following words and terms: “CYP1A1”, “polymorphism”, and “breast cancer”. References of the retrieved publications were also screened. Only research articles were included and the language of publication was restricted to English. Studies had to be based on an unrelated case-control design, so pedigree data were excluded. Allele frequencies also had to be clearly reported. The genotype distribution of the control population of each study had to be in Hardy–Weinberg equilibrium (HWE) (P > 0.05).

Data extraction

The following basic data were collected from the studies: first author, journal, year of publication, ethnicity, genotypes, ages, menopausal status and racial descent (categorized as east-Asians, Caucasians, Africans and others).

Statistical analysis

For each study, the OR (odds ratio) was first calculated to assess the association between the polymorphisms (alleles) and the disease in a 2 × 2 table. In the meta-analysis, we examined the association between allele C of T3801C and the risk of breast cancer compared with that of allele T, and also used additive (CC vs TT), recessive (CC vs TC + TT), and dominant (CC + TC vs TT) genetic models. The same method was applied to the A2455G polymorphism. There are three widely used methods of meta-analysis for dichotomous outcomes: two fixed-effects methods (Mantel–Haenszel’s method and Peto’s method), which assume that studies are sampled from populations with the same effect size and adjust the study weights according to the within-study variance; and one random-effects method (DerSimonian and Laird’s method), which assumes that studies are taken from populations with varying effect sizes and calculates the study weights from both the within-study and the between-study variance, thereby taking the extent of variation, or heterogeneity, into account (Petitti 1994). In our study, both Mantel–Haenszel’s fixed-effects method and DerSimonian and Laird’s random-effects method were used in ReviewManager 4.2 software. All results are given as ORs, calculated according to Woolf (see Rosner 2000). A Q test was performed to evaluate the between-study heterogeneity of the studies (Lau et al. 1997). If P < 0.10, i.e., the between-study heterogeneity was considered to be statistically significant, we chose the random-effects model to calculate the OR. Otherwise, when P ≥ 0.10, i.e., the between-study heterogeneity was not significant, the fixed-effects model was suitable. In the absence of between-study heterogeneity, the two methods yield similar results. In order to make a clear comparison, we present the OR of both the random-effects model and the fixed-effects model for every meta-analysis. A pooled OR obtained by meta-analysis was used to give a more reasonable evaluation of the association. A Z test was performed to determine the significance of the pooled OR (P ≤ 0.05 suggests a significant OR). For each genetic comparison, subgroup analysis was performed according to racial descent and menopausal status. Funnel plots were used to assess publication bias, together with Egger’s regression test (Egger et al. 1997). A t test was performed to determine the significance of the asymmetry. An asymmetric plot suggested possible publication bias (P ≥ 0.1 suggests no bias).
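As an illustration of the per-study step described above, the following minimal Python sketch builds the 2 × 2 tables for the allele, additive, recessive and dominant comparisons from genotype counts and computes the OR with a Woolf (logit) 95% CI. The function names and the genotype counts are hypothetical and are not taken from any included study; the sketch assumes that no cell of a table is zero.

```python
import math

def odds_ratio_woolf(a, b, c, d, z=1.96):
    """OR and Woolf (logit) confidence interval for a 2 x 2 table.

    a, b: cases with / without the factor
    c, d: controls with / without the factor
    Assumes all four cells are non-zero.
    """
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

def contrast_tables(cases, controls):
    """Turn (n_TT, n_TC, n_CC) genotype counts into 2 x 2 tables (a, b, c, d)."""
    tt1, tc1, cc1 = cases
    tt0, tc0, cc0 = controls
    return {
        "allele C vs T": (2 * cc1 + tc1, 2 * tt1 + tc1,
                          2 * cc0 + tc0, 2 * tt0 + tc0),
        "additive CC vs TT": (cc1, tt1, cc0, tt0),
        "recessive CC vs TC+TT": (cc1, tc1 + tt1, cc0, tc0 + tt0),
        "dominant CC+TC vs TT": (cc1 + tc1, tt1, cc0 + tc0, tt0),
    }

# Hypothetical genotype counts (TT, TC, CC), for illustration only
for label, table in contrast_tables((50, 40, 10), (60, 35, 5)).items():
    or_, lo, hi = odds_ratio_woolf(*table)
    print(f"{label}: OR {or_:.2f}; 95% CI ({lo:.2f}-{hi:.2f})")
```

The same tables can be built for A2455G by reading the genotype counts as (A/A, A/G, G/G).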

Hardy–Weinberg equilibrium was tested by the Chi-square test using a web program (http://www.ihg.gsf.de/cgi-bin/hw/hwa1.pl). Analyses were performed using ReviewManager 4.2 (Oxford, England) and Stata 7.0 software.
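For readers who wish to reproduce the HWE check without the web program, a minimal Python sketch of a Chi-square goodness-of-fit test with one degree of freedom is given below. It is not the code of the web program cited above, whose exact procedure may differ; the function name and the example counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium for one biallelic locus.

    n_aa, n_ab, n_bb: observed genotype counts in controls.
    Returns (chi2, p_value). Assumes both alleles are observed.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical control genotype counts, for illustration only
chi2, p_value = hwe_chi_square(420, 160, 20)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

Under the inclusion criterion stated in the selection of studies, a control group with P > 0.05 would be retained.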

Results

Eligible studies

Based on the search criteria, 17 articles (Ambrosone et al. 1995; Bailey et al. 1998; Basham et al. 2001; Boyapati et al. 2005; Chacko et al. 2005; Hefler et al. 2004; Huang et al. 1999a, 1999b; Krajinovic et al. 2001; Le Marchand et al. 2005; Li et al. 2004; Miyoshi et al. 2002; Okobia et al. 2005; Shen et al. 2006; Singh et al. 2006; Taioli et al. 1995, 1999) were identified by reviewing 66 papers. Two of these articles were finally excluded, as the studies of Taioli et al. (1995) and Huang et al. (1999a) were superseded by their later reports (Taioli et al. 1999; Huang et al. 1999b, respectively). Three articles (Li et al. 2004; Taioli et al. 1999; Bailey et al. 1998) provided data on two ethnicities: African-Americans and Caucasians. Each subpopulation in these articles was treated as a separate study in our meta-analysis. One article (Le Marchand et al. 2005) provided data on mixed ethnicities (25% Japanese-American, 22% White, 21% Latino, 19% African-American, 7% Hawaiian, and 6% other ethnic/racial origin). Among all the eligible articles, four described the T3801C polymorphism, two focused on A2455G, and nine investigated both T3801C and A2455G. Among these latter nine articles, the genotyping data for T3801C deviated from HWE in Chacko et al. (2005), Taioli et al. (1999) and Bailey et al. (1998), while the genotyping data for A2455G were in HWE in these three studies; the genotyping data for A2455G in Hefler et al. (2004) deviated from HWE, while the genotyping data for T3801C were in HWE in this paper. We extracted the available data and rejected datasets in which HWE was doubtful. Thus, in total, there were 13 studies based on the T3801C polymorphism and 13 studies focusing on A2455G. All the studies were published from January 1995 to April 2006. Notably, the patients and controls in one study were predominantly postmenopausal women (Le Marchand et al. 2005) and in another the populations consisted entirely of postmenopausal women (Ambrosone et al. 1995). Among the 15 eligible articles, 67% (10/15) stated that controls were age-matched. We compared the distribution of the T3801C polymorphism genotypes in controls in Indians, Chinese and Japanese using Chi-square tests. Populations were divided into four ethnic categories: Caucasians, east-Asians, Africans and others. “Others” denotes Indians and/or mixed populations.

Meta-analysis database

Table 1 shows the details of the cases and controls in the included studies, together with the ORs we calculated to make a primary evaluation. In all, the studies provided 9,316 cases and 12,714 controls for T3801C and 9,552 cases and 9,320 controls for A2455G. Eight studies were from North America, four from East Asia, and six from other areas. All genotypes and allele frequencies of cases and controls for both T3801C and A2455G are shown in Table 2. The T3801C allele T was less frequent in cases and controls of east-Asians (61.1 and 61.5%, respectively) than in those of Caucasians (90.0 and 89.9%). The wild-type (T/T) genotype was the most common in cases and controls in Caucasians (83.0 and 80.0%, respectively), while in east-Asians (Chinese and Japanese), the heterozygous genotype (T/C) was more prevalent (46 and 47%, respectively). In one study (Shen et al. 2006), the wild-type genotype (T/T, 47.8%) was more frequent than the heterozygous genotype (T/C, 40.7%) in the control group of Chinese women. A different distribution was reported in another study on east-Asians (Boyapati et al. 2005). For A2455G, the A allele was less frequent in cases and controls of east-Asians (76.9 and 75.3%, respectively) than in those of Caucasians (95.8 and 95.2%, respectively) and Africans (97.9 and 98.3%, respectively). The genotype 2455A/A was more common than other genotypes in all populations. The proportion with the 2455A/A genotype in cases and controls of east-Asians (58.6 and 57.6%, respectively) was lower than that of other ethnicities (Caucasians 92.0 and 90.7%; Africans 95.8 and 96.5%, respectively).

Table 1 Characteristics of studies included in the meta-analysis. OR Odds ratio, CI confidence interval, SD standard deviation
Table 2 Distribution of T3801C and A2455G genotype and allele among breast cancer cases and controls included in the meta-analysis

Effect for allele and subgroup analysis

A2455G

There was significant between-study heterogeneity (P = 0.001) among the 13 studies when we investigated the association between allele G of A2455G and breast cancer risk, compared with allele A. Thus, a random-effects model was used to calculate the pooled OR. There was no significant association between allele G and the disease in the worldwide population [P = 0.90, random-effects OR 1.01; 95% CI (0.82–1.25), Fig. 1]. In subgroup analysis based on ethnicity, no between-study heterogeneity was found in east-Asians (P = 0.20), Caucasians (P = 0.29) or Africans (P = 0.78). In this situation, a fixed-effects model was appropriate to assess the pooled OR. However, no significant association was found between the allele and the risk of breast cancer, with three comparisons in east-Asians [P = 0.13, fixed-effects OR 0.91; 95% CI (0.81–1.03)], five comparisons in Caucasians [P = 0.86, fixed-effects OR 0.98; 95% CI (0.81–1.19)], and three comparisons in Africans [P = 0.46, fixed-effects OR 1.31; 95% CI (0.64–2.70)]. Meta-analysis in a recessive genetic model suggested that the G allele may be associated with a trend toward reduced risk of breast cancer only in east-Asians [P = 0.04, fixed-effects OR 0.73; 95% CI (0.53–0.99), P = 0.97 for heterogeneity, Fig. 2a], but not in other ethnic groups. The additive model revealed a similar effect, i.e., compared with A2455G A/A, an association between A2455G G/G and the risk of the disease existed only in east-Asians [P = 0.04, fixed-effects OR 0.72; 95% CI (0.53–0.99), P = 0.95 for heterogeneity, Fig. 2b]. Although in subgroup analysis by menopausal status we found no association between allele G and breast cancer in pre-menopausal women in the worldwide population, the recessive genetic model suggested that the G/G genotype was more likely to decrease susceptibility [P = 0.02, fixed-effects OR 0.51; 95% CI (0.29–0.90), P = 0.38 for heterogeneity, Fig. 2c]. Furthermore, the additive model also supported this conclusion in pre-menopausal women [P = 0.02, fixed-effects OR 0.52; 95% CI (0.29–0.92), P = 0.39 for heterogeneity, Fig. 2d]. However, we found no significant association for post-menopausal women in any of the genetic comparisons.

Fig. 1 Overall meta-analysis for cytochrome P450 1A1 gene (CYP1A1) A2455G polymorphisms and breast cancer: G versus A allele. Each study is represented by a point estimate of the OR and the accompanying 95% CI using a random-effects model. n Total number of G alleles, N total number of A alleles plus G alleles. The weight is displayed as a percentage, not as the actual weight used in the calculation of the pooled OR. The weight for each study was calculated differently in the fixed-effects model and the random-effects model: $\mathrm{Weight}_{MH} = W_i = b_i c_i / N_i$ and $\mathrm{Weight}_{DL} = W_i^{*} = (a_i^{-1} + b_i^{-1} + c_i^{-1} + d_i^{-1} + \Delta^2)^{-1}$, where MH = Mantel–Haenszel method, DL = DerSimonian and Laird method, $i$ is the index of the study, $a_i$ is the number of cases with intervention, $b_i$ is the number of cases without intervention, $c_i$ is the number of controls with intervention, $d_i$ is the number of controls without intervention, $N_i$ is the total number of cases and controls, and $\Delta^2$ is the between-study variance of all the eligible studies. The pooled OR was influenced by the weights
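The two weighting schemes named in the caption of Fig. 1 can be sketched in Python as follows. This is a minimal illustration assuming the standard Mantel–Haenszel and DerSimonian–Laird formulas with a Q statistic based on inverse-variance weights; it is not the ReviewManager 4.2 implementation, which may differ in detail. The function names and the 2 × 2 tables are hypothetical. Each table is (a, b, c, d) in the notation of the caption.

```python
import math

def pool_mantel_haenszel(tables):
    """Fixed-effects Mantel-Haenszel pooled OR; study weight is b*c/N."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den

def pool_dersimonian_laird(tables, z=1.96):
    """Random-effects DerSimonian-Laird pooled OR with Q and tau^2."""
    y = [math.log(a * d / (b * c)) for a, b, c, d in tables]  # ln(OR)
    v = [1 / a + 1 / b + 1 / c + 1 / d for a, b, c, d in tables]  # within-study variance
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q; compare with a chi-square distribution on k - 1 df
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    k = len(tables)
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / scale) if scale > 0 else 0.0
    # Random-effects weights add the between-study variance tau^2
    w_star = [1 / (vi + tau2) for vi in v]
    y_random = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return {
        "or": math.exp(y_random),
        "ci": (math.exp(y_random - z * se), math.exp(y_random + z * se)),
        "q": q,
        "df": k - 1,
        "tau2": tau2,
    }

# Hypothetical 2 x 2 tables (a, b, c, d), for illustration only
tables = [(30, 70, 20, 80), (45, 55, 40, 60), (12, 88, 25, 75)]
print("MH fixed-effects OR:", round(pool_mantel_haenszel(tables), 3))
print("DL random-effects:", pool_dersimonian_laird(tables))
```

When the estimated between-study variance is zero, the random-effects weights reduce to the inverse-variance fixed-effects weights, which is why the two models give similar results in the absence of heterogeneity.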

Fig. 2a–d Meta-analysis for A2455G polymorphism in breast cancer. Each study is shown by a point estimate of the OR and the accompanying 95% CI under a fixed-effects model. n Total number of G alleles, N total number of A alleles plus G alleles. a Recessive model (GG vs GA + AA) in east-Asians. b Additive model (GG vs AA) in east-Asians. c Recessive model (GG vs GA + AA) in pre-menopausal women. d Additive model (GG vs AA) in pre-menopausal women

T3801C

For T3801C we also found significant heterogeneity (P = 0.001) between the 13 studies. The lack of a significant association suggested that the C allele might not influence disease susceptibility in the worldwide population [P = 0.88, random-effects OR 0.99; 95% CI (0.87–1.12), Fig. 3]. In subgroup analysis based on racial descent, heterogeneity was still high in east-Asians (P = 0.0005) and in Africans (P = 0.03), but not in Caucasians (P = 0.32). We did not find any significant association between the polymorphism and breast cancer in east-Asians [P = 0.67, random-effects OR 1.06; 95% CI (0.80–1.41)], Caucasians [P = 0.13, fixed-effects OR 0.86; 95% CI (0.71–1.05)], or Africans [P = 0.78, random-effects OR 1.02; 95% CI (0.85–1.24)]. No evidence could be found to support the hypothesis that allele C is related to the risk of breast cancer under additive, recessive, or dominant genetic comparisons. Subgroup analysis by menopausal status for alleles and the various genetic comparisons also suggested that allele C is not a risk factor. All relevant OR and P values are shown in Table 3.

Fig. 3 Overall meta-analysis for CYP1A1 T3801C polymorphisms and breast cancer: C versus T allele. Each study is shown by a point estimate of the OR and the accompanying 95% CI using a random-effects model. n Total number of C alleles, N total number of T alleles plus C alleles

Table 3 Summary of ORs for various comparisons

Sensitivity analysis was performed by sequential omission of individual studies under various comparisons in east-Asians, Caucasians, Africans, pre-menopausal and post-menopausal subgroups, respectively. When we investigated T3801C, the pooled results consistently encompassed 1.0 in the five subgroups under fixed- or random-effects models, indicating that the significance of pooled ORs was not excessively influenced by any single study in these subgroups (data not shown). However, in the east-Asians subgroup analysis of A2455G, if any single study was excluded from the subgroup, the results were not significant, either in recessive (G/G vs G/A + A/A) or in additive (G/G vs A/A) models. Furthermore, in the pre-menopausal subgroup analysis of A2455G, when the study of Huang et al. (1999b) was excluded, the pooled result was not significant, while exclusion of other studies did not influence the result. Thus, the association between A2455G and breast cancer risk in the pre-menopausal and east-Asians results must be treated with caution.
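The sequential-omission procedure can be sketched in Python as below. For brevity the sketch pools with an inverse-variance fixed-effects estimate as a stand-in for the Mantel–Haenszel and DerSimonian–Laird methods used in the paper; the function names, study labels and counts are hypothetical.

```python
import math

def pooled_or_fixed(tables, z=1.96):
    """Inverse-variance fixed-effects pooled OR with a 95% CI.

    tables: list of 2 x 2 tables (a, b, c, d), all cells non-zero.
    """
    y = [math.log(a * d / (b * c)) for a, b, c, d in tables]
    w = [1 / (1 / a + 1 / b + 1 / c + 1 / d) for a, b, c, d in tables]
    est = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = math.sqrt(1 / sum(w))
    return math.exp(est), math.exp(est - z * se), math.exp(est + z * se)

def leave_one_out(studies):
    """Re-pool after omitting each study in turn.

    studies: dict mapping a study label to its 2 x 2 table.
    Returns label -> (OR, lower, upper, significant), where 'significant'
    means the 95% CI excludes 1.0.
    """
    results = {}
    for omitted in studies:
        rest = [t for name, t in studies.items() if name != omitted]
        or_, lo, hi = pooled_or_fixed(rest)
        results[omitted] = (or_, lo, hi, not (lo <= 1.0 <= hi))
    return results

# Hypothetical subgroup of three studies, for illustration only
subgroup = {
    "Study A": (20, 180, 35, 165),
    "Study B": (15, 285, 22, 278),
    "Study C": (8, 92, 14, 86),
}
for name, (or_, lo, hi, sig) in leave_one_out(subgroup).items():
    print(f"omit {name}: OR {or_:.2f} ({lo:.2f}-{hi:.2f}), significant={sig}")
```

A pooled result whose significance changes when a single study is dropped is the pattern that calls for the caution expressed above.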

Publication bias

Funnel plots and Egger’s test were performed to assess publication bias. The data suggested that there was no evidence of publication bias for the comparison of 3801C versus 3801T (t = 0.62, P = 0.548) or for the comparison of 2455G versus 2455A (t = 0.84, P = 0.421).
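A minimal Python sketch of Egger’s regression test is given below, assuming its usual form: the standard normal deviate of each study (ln OR divided by its standard error) is regressed on precision (the reciprocal of the standard error), and the intercept is tested against zero with a t statistic on k − 2 degrees of freedom. It is not the Stata routine used for the analysis; the function name and the input values are hypothetical, and the P value is left to a t table or statistical software.

```python
import math

def egger_test(log_ors, ses):
    """Egger's regression asymmetry test.

    log_ors: ln(OR) of each study; ses: their standard errors.
    Returns (intercept, t_statistic, degrees_of_freedom).
    Requires at least three studies that do not fit the line exactly.
    """
    x = [1 / s for s in ses]                      # precision
    y = [lo / s for lo, s in zip(log_ors, ses)]   # standard normal deviate
    k = len(x)
    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = resid_ss / (k - 2)                       # residual variance
    se_intercept = math.sqrt(s2 * (1 / k + mean_x ** 2 / sxx))
    return intercept, intercept / se_intercept, k - 2

# Hypothetical ln(OR) values and standard errors, for illustration only
intercept, t_stat, df = egger_test(
    [0.20, -0.10, 0.40, 0.00, 0.30],
    [0.10, 0.20, 0.30, 0.15, 0.25],
)
print(f"intercept = {intercept:.3f}, t = {t_stat:.2f}, df = {df}")
```

An intercept close to zero, with a non-significant t statistic, corresponds to a symmetric funnel plot.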

Discussion

This meta-analysis investigated the association between the CYP1A1 T3801C and A2455G polymorphisms and the risk of breast cancer by examining allele frequency and comparing genotypes in groups and subgroups (menopausal status and ethnicity). The between-study heterogeneity was high when investigating the worldwide population, but was eliminated by examining subgroups based on ethnicity in the analysis of A2455G. In the analysis of T3801C, between-study heterogeneity still existed in the east-Asian and African (or African-American) subgroups.

Analyzing the high heterogeneity for T3801C, we found that when Shen et al. (2006) was excluded, the between-study heterogeneity in east-Asians decreased markedly (from P = 0.0005 to P = 0.04); when Taioli et al. (1999) was removed, the between-study heterogeneity in Africans no longer existed (from P = 0.03 to P = 0.26). If both these studies were eliminated, no between-study heterogeneity was found in the worldwide population (from P = 0.001 to P = 0.28). Shen et al. (2006) used as controls randomly selected age-matched females with no prior history of any cancer who were permanent residents of an urban Shanghai population. The genotype distribution in that study was different from that in the other east-Asian studies included here, especially compared with another study based on a population of urban women from Shanghai (Boyapati et al. 2005). It is possible that the selected populations in these two articles were from separate areas of Shanghai, but further studies are needed to clarify the reason for the difference. Taioli et al. (1999) contained two studies based on two United States populations (African-American and White). However, only the data on African-Americans contributed to between-study heterogeneity. The control population in the study based on African-Americans was from a different region, which might explain the genotype difference.

No significant association between the A2455G polymorphism and breast cancer in the worldwide population was found under allele and genotype comparisons. However, in subgroup analysis, we found that the G/G genotype tended to reduce the risk of breast cancer compared with the wild-type (G/G vs G/A + A/A or G/G vs A/A) for A2455G in east-Asians or in pre-menopausal women, although the number of studies was small. In fact, all the studies showed such a trend, but no single result was significant. Furthermore, most studies in the pre-menopausal subgroup were from east-Asians. Thus the G/G genotype might influence east-Asians more than other races. Our meta-analysis on T3801C suggested that allele C alone is not a risk factor, regardless of racial descent or menopausal status, although both polymorphisms together increase the activity of the enzyme.

The T3801C and A2455G alleles were reported to encode an inducible form of CYP1A1 (Crofts et al. 1993; Kiyohara et al. 1996). Although CYP1 enzymes are responsible for metabolically activating PAHs and aromatic amines, Nebert et al. (2004) proposed that whether CYP1A1 detoxifies or causes toxicity depends on many factors, such as sub-cellular content and location, amount of Phase II metabolism, degree of coupling to Phase II enzymes, and cell type- and tissue-specific context, as well as pharmacokinetics. In the pathway of PAH metabolism, CYP1A1 tends to increase the risk of breast cancer by activating carcinogens, while in the pathway of estrogen metabolism, CYP1A1 tends to reduce the risk. The G/G genotype of A2455G might function in estrogen metabolism, decreasing the risk of breast cancer by influencing the level of estrogen or by competing with the 16α-hydroxylation pathway. This effect was significant in pre-menopausal women but not in post-menopausal women, possibly because the latter have much lower levels of estrogen. However, conflicting results have been reported (Taioli et al. 1999). It is possible that in post-menopausal women, the pathway of carcinogen metabolism plays a more important role than that of estrogen metabolism. In our subgroup analysis of post-menopausal women, we found no significant association between the G/G genotype for A2455G and the disease. However, in subgroup analysis of east-Asians, the G/G genotype reduced the risk of breast cancer compared with the wild-type. This might result from a gene–environment interaction, such as diet. Furthermore, the analysis of pre-menopausal women and A2455G showed a significant result only in a fixed-effects model, not in a random-effects model, in spite of low between-study heterogeneity (P = 0.38 in the recessive model). This might result from the small number of studies included in the comparison.

In conclusion, our meta-analysis suggests that the 2455 G/G genotype tends to reduce the risk of breast cancer in east-Asian women, and in pre-menopausal women in the worldwide population, under both recessive and additive models, while T3801C shows no association with breast cancer. The number of studies included in the subgroup analyses was limited and the results were sensitive to study selection. More comparative studies are needed to evaluate the relationship between CYP1A1 polymorphisms and breast cancer risk in specific populations.