Abstract
Background/Aim: Fibroepithelial lesions (FEL) of the breast include fibroadenomas and phyllodes tumors (PT). Their histologic characteristics on core needle biopsy (CNB) can overlap, whereas their clinical management differs. The aim of this study was to develop and validate a pre-operative score for the diagnosis of PT with surgical decision rules. Patients and Methods: We developed a pre-operative score for the diagnosis of PT by performing logistic regression on 217 FEL from the Rene Huguenin Hospital. This score and the surgical decision rules were validated on 87 FEL from the Lariboisiere Hospital. Results: Three variables were independently and significantly associated with PT: age ≥40 years, tumor size on mammography ≥3 cm and PT diagnosed by CNB. The pre-operative score was based on these three criteria, with values ranging from 0 to 10. Surgical decision rules were created: the low-risk group of PT (score ≤2) had a sensitivity of 92.6% and a LR− of 0.2; the high-risk group (score >7) had a specificity of 93.5% and a LR+ of 4.4. In the validation sample, the low-risk rule had a sensitivity of 84.2% and the high-risk rule a specificity of 98.5%. Conclusion: These surgical decision rules may prove useful in deciding which FEL need surgical resection.
Fibroepithelial lesions (FEL) of the breast are heterogeneous biphasic neoplasms involving proliferation of both epithelial and stromal components. They include the common benign fibroadenomas (FA) and the rarer phyllodes tumors (PT) (1-3). In 2012, the World Health Organization (WHO) sub-classified PT histologically into benign, borderline and malignant grades based on the degree of stromal cellularity and atypia, mitotic count, stromal overgrowth and tumor borders (1).
PT and FA have different clinical implications and surgical management. Whereas PT can be aggressive with possibility of recurrence, potential for malignant transformation and distant metastasis, FA can regress with age (4). A complete surgical excision with an attempt to obtain negative margins of 1 cm or greater is the current National Comprehensive Cancer Network (NCCN) guideline recommendation for PT. The management of FA can be follow-up evaluation and reassurance (5).
Differences in the recommended management of these lesions, ranging from observation to wide surgical resection, make the distinction between PT and FA crucial. However, FEL can be difficult to diagnose on an initial core needle biopsy (CNB). A CNB demonstrating FEL with any histologic feature suggesting a more advanced lesion than FA may indicate a PT (1-3, 6, 7). Consequently, surgical excision is recommended for many patients to rule out PT, which may result in overtreatment (1-3, 8, 9).
The aim of this study was to develop and to validate a pre-operative score for the diagnosis of PT with surgical decision rules.
Patients and Methods
Study design and database. We conducted a bicenter retrospective study of FEL of the breast diagnosed by CNB at two hospitals. FEL from the Department of Surgical Oncology of the Rene Huguenin Hospital (Saint-Cloud, France) were included from April 1997 to November 2015 and those from the Department of Gynecology and Obstetrics of the Lariboisiere Hospital were included from October 2006 to November 2016. Our score was developed using data from the population of the Rene Huguenin Hospital and validated on data from patients of the Lariboisiere Hospital.
We included all operated lesions. Exclusion criteria were recurrence of PT, loss to follow-up after CNB and a history of breast cancer.
All CNBs were performed by breast-specialized radiologists. All CNBs and surgical specimens were reviewed by a breast-specialized pathologist. All surgical excisions were performed by a breast-specialized surgeon.
Our database consisted of clinical, radiological, surgical and histological information.
Diagnosis of FEL. The diagnostic reference (gold standard) used for the diagnosis of FEL, FA or PT, was the final pathologic diagnosis on the surgical specimen. PT were described with the 2012 WHO classification as benign, borderline or malignant (1).
Statistical analysis. We compared FEL defined as FA to those defined as PT.
We carried out univariate analysis using Student's t-test for quantitative variables and the chi-square test for qualitative variables. We converted variables associated with the presence of PT at a threshold of p<0.10 into dichotomous variables using receiver operating characteristic (ROC) curves.
We used multiple logistic regression analysis to select the best combination of variables that was independently associated with the diagnosis of PT (p<0.05). Variables were selected by a backward stepwise procedure from those associated with PT in the univariate analysis at a threshold of p<0.10.
Bootstrap resampling with 1,000 replications was performed to assess the robustness of the multivariate model. The distribution of each logistic regression coefficient was estimated so that variables potentially responsible for instability in the model could be removed (10).
The performance of the model in the diagnosis of PT was specified by calculating its area under the ROC curve (ROC-AUC).
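As an illustration of the ROC-AUC used here, the following minimal sketch computes it as the probability that a randomly chosen PT receives a higher predicted value than a randomly chosen FA, with ties counted as one half. The function name `roc_auc` and the toy data are hypothetical and are not taken from the study database.

```python
def roc_auc(values, labels):
    """Return the ROC-AUC of `values` for binary `labels` (1 = PT, 0 = FA).

    Uses the pairwise (Mann-Whitney) definition: the fraction of
    positive/negative pairs in which the positive case has the higher
    value, counting ties as 0.5.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): predicted probabilities and final diagnoses.
toy_values = [0.9, 0.8, 0.4, 0.35, 0.1]
toy_labels = [1, 1, 0, 1, 0]
toy_auc = roc_auc(toy_values, toy_labels)  # 5 of 6 pairs are correctly ordered
```

The pairwise form is quadratic in the number of cases, which is acceptable for samples of a few hundred lesions.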
The logistic regression model was used to build a score by rounding up the β coefficients from the multivariate analysis to generate a simple scale (11). Missing data were treated as absent. The ROC-AUCs of the logistic regression model and of the score were compared to check that the two values were not significantly different. Sensitivity, specificity, and positive and negative likelihood ratios were calculated for different threshold values of the score.
We built surgical decision rules by classifying patients as at low or high risk of PT. Two threshold values of the score were chosen to define two classifications: one with sensitivity >95% and negative likelihood ratio (LR-) <0.25 (rule out), and the other with specificity >90% and positive likelihood ratio (LR+) >4.0 (rule in) (12).
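The quantities involved in these rules follow from a 2×2 table of score result against final diagnosis: LR+ = sensitivity / (1 − specificity) and LR- = (1 − sensitivity) / specificity. The sketch below is a minimal illustration under that standard definition; the function names and the counts are hypothetical, and only the rule-out and rule-in criteria are taken from the text.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, LR+ and LR- from 2x2 counts.

    tp/fn: PT classified positive/negative by the threshold.
    fp/tn: FA classified positive/negative by the threshold.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1 - spec) if spec < 1 else float("inf")
    lr_neg = (1 - sens) / spec if spec > 0 else float("inf")
    return {"sens": sens, "spec": spec, "lr_pos": lr_pos, "lr_neg": lr_neg}


def meets_rule_out(m):
    """Rule-out criteria stated in the text: sensitivity >95% and LR- <0.25."""
    return m["sens"] > 0.95 and m["lr_neg"] < 0.25


def meets_rule_in(m):
    """Rule-in criteria stated in the text: specificity >90% and LR+ >4.0."""
    return m["spec"] > 0.90 and m["lr_pos"] > 4.0


# Hypothetical counts, not study data: 50 PT and 50 FA at one threshold.
example = diagnostic_metrics(tp=48, fn=2, fp=30, tn=20)
```

With these hypothetical counts the threshold is sensitive but not specific, so it would qualify as a rule-out threshold only.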
Finally, the surgical decision rules were applied to the validation sample for external validation, by calculating sensitivity, specificity, LR+ and LR- with their 95%CI in the low-risk and the high-risk groups of PT.
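For the 95%CI of a sensitivity or specificity, one common choice is the exact (Clopper-Pearson) binomial interval. The text does not state which interval method was used, so the sketch below is an assumption rather than the authors' procedure; `exact_ci` and `binom_cdf` are hypothetical helper names.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for a proportion x/n, found by bisection."""

    def solve(f, target):
        # f is increasing in p on [0, 1]; find p with f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: P(X <= x | p) = alpha / 2, i.e. P(X > x | p) = 1 - alpha / 2.
    upper = 1.0 if x == n else solve(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


# 16 of 19 is the count implied by a sensitivity of 84.2% among 19 PT.
low, high = exact_ci(16, 19)
```

Applied to 16 of 19, this interval should come out close to 60% to 97%.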
Statistical analyses were performed using Stata 13.0 (StataCorp; College Station, TX, USA).
Results
FEL. A total of 1,326 FEL were diagnosed by CNB in the derivation sample. Among them, 312 were excluded (4 recurrences of PT, 291 lesions belonging to patients lost to follow-up after CNB and 17 lesions belonging to patients with a history of breast cancer). Of the remaining 1,014 FEL, 217 underwent surgical excision. There were 123 FA and 94 PT (78 benign, 14 borderline and 2 malignant) (Figure 1).
The validation sample consisted of 87 FEL diagnosed by CNB and operated on. There were 68 FA and 19 PT (16 benign, 2 borderline and 1 malignant).
Univariate and multivariate analyses. The findings from the univariate analysis are reported in Table I.
The multiple logistic regression analysis identified three variables independently and significantly (p<0.05) associated with the diagnosis of PT, namely, age ≥40 years (p=0.026), tumor size on mammography ≥30 mm (i.e., ≥3 cm) (p<0.001) and PT diagnosed by CNB (p<0.001) (Table II). All of these variables were stable after 1,000 bootstrap replications (Table II).
The ROC-AUC of this model was 0.76 (95%CI=0.74-0.86) (Figure 2).
Score and surgical decision rules. The score was given by the following equation: score = 2 × (age ≥40 years) + 3 × (tumor size on mammography ≥3 cm) + 5 × (PT diagnosed by CNB), where each criterion equals 1 if present and 0 otherwise (Table III).
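The score equation can be written as a short function. This is a minimal sketch: the function and parameter names are hypothetical, and a missing value (`None`) is counted as criterion absent, following the handling of missing data described in the statistical analysis.

```python
def pt_score(age_years, mammo_size_cm, pt_on_cnb):
    """Pre-operative PT score (0 to 10) from the three criteria.

    age_years:      patient age in years, or None if missing
    mammo_size_cm:  tumor size on mammography in cm, or None if missing
    pt_on_cnb:      True if the CNB diagnosis was PT
    """
    score = 0
    if age_years is not None and age_years >= 40:
        score += 2  # age >= 40 years
    if mammo_size_cm is not None and mammo_size_cm >= 3:
        score += 3  # tumor size on mammography >= 3 cm
    if pt_on_cnb:
        score += 5  # PT diagnosed by CNB
    return score
```

For example, `pt_score(45, 3.5, True)` gives the maximum of 10, and `pt_score(30, 2.0, False)` gives 0.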
The ROC-AUC of the score was 0.75 (95%CI=0.69-0.82) (Figure 3). There was no significant difference between the ROC-AUC of the score and the ROC-AUC of the model (p=0.78).
We then defined surgical decision rules with a low-risk and a high-risk group of PT (Table IV). FEL with a score ≤2 were defined as the low-risk group of PT. A threshold score value of 2 produced a sensitivity of 92.6% (95%CI=85.3-97.0) and a LR− of 0.2 (95%CI=0.1-0.4). Under this rule, surgical excision is not needed for these FEL. FEL with a score >7 were defined as the high-risk group of PT. A threshold value of 7 produced a specificity of 93.5% (95%CI=87.6-97.2) and a LR+ of 4.4 (95%CI=2.1-9.3). Under this rule, these FEL require surgical excision with 1 cm margins.
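The two thresholds translate into a simple classification of the score. In the sketch below, the function name and the label "intermediate" for scores between the two thresholds are assumptions; the text defines only the low-risk and high-risk groups and gives no rule for the scores in between.

```python
def risk_group(score):
    """Classify a 0-10 PT score using the two thresholds from the text.

    score <= 2 -> "low"  (rule out: excision not needed under the rule)
    score > 7  -> "high" (rule in: excision with 1 cm margins)
    otherwise  -> "intermediate" (label assumed; no rule given in the text)
    """
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 2:
        return "low"
    if score > 7:
        return "high"
    return "intermediate"
```

Because the items are worth 2, 3 and 5 points, the attainable scores are 0, 2, 3, 5, 7, 8 and 10. The high-risk group therefore requires a PT diagnosis on CNB together with a tumor size ≥3 cm on mammography.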
The surgical decision rules were applied in the validation sample. The low-risk group of PT had a sensitivity of 84.2% (95%CI=60.4-96.6) and a LR- of 0.24 (95%CI=0.09-0.7). The high-risk group of PT had a specificity of 98.5% (95%CI=92.1-100.0) and a LR+ of 17.9 (95%CI=2.2-144.0). The 95%CI of the sensitivity, the specificity, the LR+ and the LR- in the validation sample were in the expected range.
Discussion
We have developed the first pre-operative score for the diagnosis of PT with surgical decision rules for FEL diagnosed by CNB of the breast. The 10-point score is based on three simple items: age ≥40 years (2 points), tumor size on mammography ≥3 cm (3 points) and PT diagnosed by CNB (5 points). The surgical decision rules were obtained by classifying FEL as at low or high risk of PT. A score ≤2 puts FEL in a low-risk group of PT with a sensitivity of 92.6% and a LR− of 0.2, and these FEL can be observed; a score >7 puts FEL in a high-risk group of PT with a specificity of 93.5% and a LR+ of 4.4, and these FEL need wide resection with 1 cm margins. These surgical decision rules were validated in an external sample.
The strengths of our study are the following. The first is the face validity of our score. Indeed, the score is made of one clinical, one imaging and one histological item. Furthermore, these three items are supported by the literature as helpful in distinguishing FA from PT. Some studies have demonstrated that the median age at presentation of PT is 42-45 years (13) and that FA have a peak incidence at 20-40 years (14). Also, a diagnosis of PT may be favored if the tumor is larger than 4 cm (15, 16). The goals of the triple approach (clinical, imaging and histological evaluation) are to find all PT among FEL and thereby avoid unnecessary diagnostic surgical procedures where possible. However, it has been suggested that even this approach may lead to disappointing results (16) because the clinical, imaging and histological characteristics of these lesions can overlap (1-3, 6, 17). Our model is the first to combine these pre-operative characteristics to obtain better performance in the pre-operative diagnosis of PT than each item considered separately.
Second, we took into account overfitting, the most important bias in predictive studies (18). We used two statistical methods to minimize this risk: the bootstrap resampling procedure, which allowed us to detect and remove variables potentially responsible for instability in the model (10), and external validation in a separate population (19).
Third, although the study was retrospective, as the data were not initially gathered for this study, the database of the derivation sample was prospectively constituted from all FEL of the breast diagnosed by CNB since 1997 in the Department of Surgical Oncology of the Rene Huguenin Hospital. This ensured that all operated FEL were included in our study to build the score.
The principal limitation of our study is the referral bias that could have occurred. Indeed, we only included FEL which underwent surgical resection, meaning that lesions managed by observation were not included. Those FEL may differ from those which underwent surgical resection: it can be assumed that operated FEL have a higher pre-operative suspicion of PT than observed lesions, as attested by the high prevalence of PT in our population. This could have strengthened the association between the variables studied and PT and may have affected the diagnostic accuracy of our model (20).
We have developed a 10-point score which is reproducible, original, and would be easy to use in routine practice. In daily clinical practice, our pre-operative score could help a multidisciplinary team plan the surgical management of patients with FEL diagnosed on breast CNB. Patients belonging to the low-risk PT group would not be eligible for surgical resection, whereas those belonging to the high-risk group should undergo wide surgical resection with 1 cm margins.
Footnotes
Authors' Contributions
Camille Mimoun and Roman Rouzier contributed to the design and implementation of the research, to the analysis of the results and to the writing of the manuscript.
Amelie Zeller, Julien Seror, Laurene Majoulet, Eva Marchand, Matthieu Mezzadri, Pascal Cherel, Vinciane Place, Françoise Cornelis and Jean-Louis Benifla contributed to the design and implementation of the research.
Conflicts of Interest
The Authors have no conflicts of interest to declare regarding this study.
- Received December 14, 2019.
- Revision received December 19, 2019.
- Accepted December 30, 2019.
- Copyright© 2020, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved