Abstract
Background/Aim: Breast cancer is associated with appearance concerns and issues of appearance identity, which require appropriate and independent assessment. The Derriford Appearance Scale (DAS) has been widely used to this end, as has been shown in UK and other international samples. The aim of this study was to determine the extent to which an Italian translation of DAS24 is valid, reliable, and culturally appropriate, while remaining user friendly. The extent to which the statistical robustness of the scale is maintained was also assessed. Patients and Methods: Ninety-three female participants were recruited at a Breast Cancer Department in Southern Italy. According to the protocol designed by the original authors of the questionnaire, a booklet containing the DAS24ita and other scales was completed anonymously by participants. Conclusion: The results of the statistical analysis confirmed the validity and reliability of the DAS24ita. The DAS24ita demonstrated significant correlations with the other measures of appearance sensitivity and quality of life. The translated scale was able to differentiate among patients with differential diagnoses, and was more sensitive to these differences than generic quality-of-life scales.
Breast cancer diagnosis is a significantly stressful experience for women (1, 2). There is a plethora of patient-reported outcomes that assess health-related quality of life and patient satisfaction with care in breast cancer and other cancer types [see, for example, Kanatas et al. (3) for a review]. However, these outcomes typically address very broad concerns around cancer and breast cancer, including satisfaction with treatment, pain, and other physical symptomatology. Breast cancer is associated with appearance concerns and issues of appearance identity, which are somewhat independent of these factors, and require appropriate separate assessment. There is an increasingly apparent need to measure distress and dysfunction associated with appearance concerns. Maas et al. describe the need for a ‘gold standard’ professional aesthetic assessment scale, recognising that aesthetic outcomes are subject to poor inter-rater reliability (4). Furthermore, the lack of consistency between objectively assessed and subjectively assessed severity of appearance (5) emphasises the importance of valid and reliable patient-based, subjective reporting of the impact of appearance differences. Surgeons in clinical practice, as well as psychologists, have recognised this. Valid and reliable measurement tools enable surgical and non-surgical interventions to contribute to a meaningful evidence basis on which to address the clinical need and the effectiveness of therapies post-intervention within the framework of evidence-based medicine (6-8). The requirements for such a measure include demonstrable validity and reliability, evident content validity to ensure that the content of the measure addresses the condition(s) subject to investigation, sufficient sensitivity to discriminate amongst individuals with varying levels of distress, and, crucially, user friendliness for both patients and clinicians.
Most notably, the Derriford Appearance Scale (9) has been widely used to this end as it has been shown in UK and other international samples to meet these criteria.
The longer version of the DAS was translated and culturally validated in an Italian sample (10). It was demonstrated that subscales of the DAS59 produced internally coherent subtotals based on analysis of internal scale and subscale validity. The sample size in the dataset did not, however, permit a confirmatory factor analysis to determine the extent to which the Italian sample replicated the UK samples. Overall, the Italian sample demonstrated significantly more distress and dysfunction as assessed by the DAS59 total scores than those for the UK sample. The Italian sample also demonstrated a difference between the sexes with respect to total scores, with women more distressed (scoring higher) than men on the subscales of sexual/bodily self-consciousness, and social self-consciousness.
The DAS24 is a self-reported questionnaire of 24 questions and statements, each with five response options. An introductory section collects appropriate demographic information about the respondent and identifies which, if any, aspect of appearance is of greatest concern, which is referred to as the ‘feature’ in some items. The introductory section and inclusion of a ‘not applicable’ response category for most items render the DAS24 user-friendly for adult respondents (16 years and older) whether or not they have an aspect of appearance about which they are concerned. The scale can be completed in the consulting room, in the waiting room, or at the patient's home. The scale has been successfully translated into Chinese (11) and Taiwanese (12), and translations into Spanish, Portuguese and Swedish are ongoing; the previous version (DAS59) has been translated into languages including Nepalese (13), Italian (10), Korean (14) and Japanese (15).
The shorter version of the scale (DAS24) has not yet been subject to the same linguistic translation and validation procedures in Italian. The aim of this study was to determine the extent to which an Italian translation is valid, reliable, culturally appropriate, and user friendly, and also the extent to which the statistical robustness of the scale is maintained.
In order to validate the measure for use in a clinical setting, it is important to use a clinical sample with appearance-altering conditions. To this end, we hypothesised that adjustment would differ between patients undergoing the appearance-altering surgeries of mastectomy and quadrantectomy. De Feudis et al. investigated breast cancer outcomes in a Southern Italian sample, comparing patients who underwent quadrantectomy and mastectomy (16). They found that mastectomy was associated with comparatively higher levels of both anxiety and depression. However, this study was not able to investigate the impact of appearance on patient outcomes in relation to surgery type. This is consistent with a large-scale longitudinal study by Engel et al. on an English-speaking sample (17).
It was hypothesised that the translated version of DAS24 would be comprehensible and meaningful in the Italian language, and when used clinically would demonstrate internal consistency, convergent criterion validity through correlation with known correlates of the DAS24, and demonstrate concurrent criterion validity by differentiating preoperative, quadrantectomy, and mastectomy patients. We also hypothesised that these patient groups would differ according to other outcomes relevant to appearance (appearance valence and appearance salience), depression, anxiety and body shame scores, as well as quality of life.
Patients and Methods
Translation and adaptation method. The Italian translation of the DAS24 was validated according to the protocol designed by the original authors of the questionnaire. For the translation, we adopted an iterative, multi-step, committee-based translation approach. Our procedure was initially inspired by the TRAPD framework (18–20). TRAPD is the acronym for five successive (but interrelated) phases: Translation, Review, Adjudication, Pretesting and Documentation. This framework is widely used in cross-cultural studies to overcome the cultural differences that are often an issue. Figure 1 shows a detailed flow chart of the translation procedure. The translation process started with two separate translations, one considered the main translation and one the secondary. The two translators worked separately and independently. The first translator was a professional interpreter whose mother tongue was English, who was fluent in Italian and had been residing in Italy for several years. His work was intended to provide the best possible rendering of the original source into Italian, especially from the point of view of conceptual and semantic equivalence [we classify equivalence according to Herdman et al. (21, 22)]. This was considered the main translation. The second translator was a professional psychologist, with a background in questionnaire design and analysis. His mother tongue was Italian but he was fluent in English. His task was more focused on disclosing issues regarding operational and measurement equivalence. This was considered a secondary translation, subordinate to the first. A series of meetings (“team review & reconciliation” in Figure 1) was held to review the two translations and the English source, and to reconcile them into a suitable Italian version. A plastic surgeon, a nurse, the first translator, and the project coordinator (who was also the secondary translator) attended these meetings.
The plastic surgeon and the nurse worked at the same hospital where the study took place. Each member of the team was provided with the translations and also with the most recent Italian version of the scale (10). The first draft was further considered by the team (“team TCM screening”), in order to screen adherence to the body perception multidimensional schema (23) and to examine issues of comprehensibility on behalf of patients (“Team BPMS screening”). After the team reached an agreement, a first reconciled Italian version was produced and discussed with the original authors of the questionnaire. The draft copy was tested on 10 volunteers (patients with breast cancer) who received concise information about the DAS and body-image perception, and then completed the questionnaire without supervision. The two translators reviewed the completed questionnaire together with each respondent, investigating missing data, problems of comprehension, and possibly problematic wording. In this session, comments were solicited from the respondents, and a further meeting discussed this first phase. The retrospective debriefing round was followed by cognitive debriefing interviews with another 15 volunteers. It was also specifically verified that the polarity of response scales had been correctly recognized. Based on these retrospective and cognitive interviews, some variations concerning response scales and their verbal descriptors were proposed. The variations were reviewed by the medical doctor and the nurse (“variations & clinicians' review”) after a new discussion concerning the framework, underlining the body perception multidimensional schema (23). After the team review approved all the previous steps, the draft copy was then employed, without changes, for testing in pilot studies of the face validity and the internal consistency of the scale's items (clinical testing for psychometric properties).
The factorial structure of the translated scale, hereafter termed DAS24ita, and its relationships with other theoretically related measures were then assessed. The other measures included specific appearance-related scales and a quality-of-life scale used to evaluate different topics related to oncology patients, and breast cancer in particular. Participants in the pilot studies and in the main study were recruited from the same hospital.
Pilot Studies
Face validity was preliminarily evaluated through a think-aloud procedure in a sample of 25 volunteer patients with breast cancer aged between 20 and 60 years (mean±SD=44.04±9.44 years). In ‘think-aloud’ interviews, the participants were asked to think aloud as they answered questions, thus verbalizing the thoughts that would normally remain silent during the process. Participants were not asked to explain or justify what they were doing and were not asked to report their strategies. Thought processes were then examined for comprehension, recall and judgment difficulties. This methodology can be useful in identifying the face validity of the measure and any problematic questions, and can be repeated after a revision of the instrument (24).
Internal consistency was evaluated through Cronbach's alpha on a sample of 60 newly diagnosed patients with breast cancer aged 26-74 (mean±SD=45.22±9.27 years). Internal consistency of the scale was demonstrated with Cronbach's alpha of 0.93.
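As an illustration of the internal-consistency statistic reported above, the following is a minimal sketch of the textbook Cronbach's alpha formula. It is not the SPSS procedure used in the study; the function name `cronbach_alpha` and the toy responses are hypothetical, and the data are not study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_variances = [variance(col) for col in items]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 4 items (illustration only).
toy_responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(toy_responses)
```

When every item orders respondents identically, alpha equals 1; values fall as items become less consistent with one another.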
Main Study
Procedure. Data were collected at the Breast Cancer Department of the G. Pascale National Cancer Institute in Naples, Italy. A booklet containing the questionnaires was completed anonymously by participants or, at a participant's request, with the help of a trained psychologist, who was also a member of the research team. The cover instruction page reiterated the aim of the project, the ethical considerations, including the right of the participants to withdraw, an assurance of patient anonymity, and all information about the study needed to ensure informed consent.
Participants. Ninety-three female participants were recruited, using a computer-based randomised selection from the electronic clinical database available at the Breast Cancer Department.
For confirmatory factor analysis, it is feasible either to follow general rules for sample size, or to use more specific guidance around participant-to-variable ratios. Given the lower number of variables in the confirmatory factor analysis reported here, it was feasible to adopt a participant-to-variable ratio. Bryant and Yarnold (25) and Gorsuch (26) both accept a ratio of 1:5 variables to participants. A 24-item scale would thus require 120 participants. The sample here (n=93) was smaller than this, although approaching it. Confirmatory factor analysis results should thus be interpreted with caution.
The database includes patients at different stages of their cancer treatment course (i.e. patients waiting for surgery and patients in follow-up). For this study, three groups of patients were recruited: i) patients waiting for breast cancer surgery (N=45, with no previous treatments); ii) patients in follow-up of at least 12 months (range=12-16 months, mean=13.8 months) after conservative surgery (N=30); and iii) patients in follow-up of at least 12 months (range=12-16 months, mean=14.2 months) after mastectomy (N=18).
Inclusion criteria were: participants older than 18 years of age, with a breast cancer diagnosis, of either sex, of Italian nationality. Exclusion criteria were a psychiatric diagnosis, an objective disfigurement identifiable by a trained psychologist, and known alcoholism or drug addiction. Patient data are shown in Table I.
Measures of the DAS24ita. The DAS24ita has the features of the original scale (response options, introductory section). Some of the items request a response about the intensity of emotional response, using response categories of ‘extremely’ to ‘not at all’ (e.g. “How distressed do you get when you see yourself in the mirror/window?”). Other items ask about the frequency of particular behaviours indicative of a self-conscious response (e.g. “I avoid going out of the house”), using an ‘almost always’ to ‘never/almost never’ set of response categories. The final two items, concerning pain and functional limitation, were included at the request of medical clinicians and assess any physical impact of the appearance problem. These are for information only, and do not contribute to the scoring of the scale.
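The sample-size arithmetic in this paragraph can be checked with a short sketch. The 5:1 participant-to-item rule, the 24 items and the 93 participants come from the text; the function and variable names are illustrative only.

```python
def required_participants(n_items, participants_per_item=5):
    """Minimum sample size under a fixed participant-to-item ratio."""
    return n_items * participants_per_item


n_items = 24        # items in the DAS24
n_sample = 93       # participants in the main study

required = required_participants(n_items)   # 24 x 5 = 120
achieved_ratio = n_sample / n_items         # participants per item in this sample
shortfall = required - n_sample             # participants missing for a 5:1 ratio
```

The achieved ratio of roughly 3.9 participants per item falls between the 3:1 and 5:1 rules of thumb.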
In order to determine criterion validity, the patients were also evaluated on different measures of appearance sensitivity, emotional distress, and quality of life.
Appearance sensitivity measures. Body shame: The body shame subscale of the Experience of Shame Scale (27) was used. It has four items, rated on a 4-point scale ranging from 1 (nothing) to 4 (very much), and requires the respondent to select the option that best expresses the intensity to which they experienced each item in the previous 3 months (e.g. “Have you avoided looking at yourself in the mirror?”). The total score ranges between 1 and 4, with higher scores indicating higher levels of shame. Cronbach's alpha for the present sample was 0.88.
Centre for Appearance Research Salience scale (CARSAL): The core construct of salience was operationally defined as “the extent to which appearance and physical self is brought into conscious awareness.” The CARSAL item pool consisted of 10 items with Likert scale response categories ranging from 1 (strongly disagree) to 6 (strongly agree). Three items were reverse scored. A higher score on each item indicated greater salience of appearance within the self-concept, that is, appearance being part of the working self-concept. Items for this scale were specifically cognitive rather than affective or behavioural in content. Cronbach's alpha for the present sample was 0.74.
Centre for Appearance Research Valence scale (CARVAL): The core construct of valence was operationally defined as “The extent to which the respondent evaluates her/his appearance in a positive/negative way”. The item pool consisted of 12 items with the same response options as the CARSAL. Seven of the candidate items were reverse scored. Higher item scores indicated a more negative evaluation of appearance. Items for this scale were specifically affective and cognitive rather than behavioural. For both scales, items were required to be applicable to objectively visibly different and also other general population respondents. Cronbach's alpha for the present sample was 0.89.
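Reverse scoring on the 1 to 6 response format described for the CARSAL and CARVAL can be sketched as follows. Which items are reversed, and whether item scores are summed or averaged, is not stated here, so the item positions and the summed total below are assumptions made purely for illustration.

```python
def reverse_score(score, low=1, high=6):
    """Mirror a Likert response around the scale midpoint (1<->6, 2<->5, 3<->4)."""
    if not low <= score <= high:
        raise ValueError(f"score {score} outside {low}-{high}")
    return low + high - score


def scale_total(responses, reversed_positions):
    """Sum item scores after reversing the items at the given 0-based positions.

    `reversed_positions` is hypothetical; the real scoring key is not given
    in the text.
    """
    return sum(
        reverse_score(score) if position in reversed_positions else score
        for position, score in enumerate(responses)
    )


# Hypothetical three-item example with the second item reverse scored.
example_total = scale_total([1, 6, 3], reversed_positions={1})
```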
There are no published studies validating these scales in the Italian language, hence for each of these scales, we used the same translation procedure as used for the DAS scale. Other psychometric properties are available upon request.
Emotional distress. The Hospital Anxiety and Depression Scale (28) is a 14-item scale measuring current levels of depression and anxiety. The Italian version has good psychometric qualities and comprises two subscales: depression and anxiety, both with seven items (29). It uses a 4-point scale (0-3) and the total score for each subscale ranges from 0 to 21, with higher scores indicating more symptomatology. Cronbach's alpha for the present sample was 0.90.
Quality of life. Quality-of-life outcomes are extremely wide-ranging, and have been described as encompassing almost all aspects of a cancer patient's well-being (30). In order to measure the individual's subjective perception of their quality of life, the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 scale was used. This instrument was validated for the Italian population according to the guidelines of the EORTC QLQ group, and presented good reliability and validity (31). The EORTC QLQ-C30 is a 30-item questionnaire composed of multi-item scales and single items that reflect the multidimensionality of the quality-of-life construct. It incorporates five functional scales (physical, role, cognitive, emotional, and social), three symptom scales (fatigue, pain, and nausea and vomiting), and a global health and quality-of-life scale. The remaining single items assess additional symptoms commonly reported by patients with cancer (dyspnoea, appetite loss, sleep disturbance, constipation, and diarrhoea), as well as the perceived financial impact of the disease and treatment.
A tumour-specific questionnaire module (BR23 EORTC) was used to supplement the QLQ-C30. This 23-item questionnaire incorporates symptoms specifically relevant to breast cancer that are not (adequately) addressed by the QLQ-C30 (33).
Data analysis. To examine the underlying constructs that the DAS24 is supposed to measure, a confirmatory factor analysis was performed on our validation sample. Because other published research on the DAS24 showed various factor solutions, we tested two different confirmatory factor analysis models, one as proposed by Carr et al. in their first validation study of the scale (one factor) (9), and the second as recently proposed by Moss et al. (32) (two-factor). The two different analyses were carried out in order to better understand the structure being assessed by the DAS24 translation. To establish criterion validity, the correlations between DAS24ita and the other measures of appearance sensitivity were evaluated. Because different studies dealing with body perception have focused on the way in which alternative surgical treatments, especially in breast cancer (33), can have different perceived outcomes, each of the correlations was also calculated separately for each of the three groups of patients based on operation type.
All analyses were carried out using SPSS Version 21.0 (SPSS, Inc., Chicago, IL, USA), and Mplus-7 (Muthén & Muthén, Los Angeles, CA, USA) (34).
Results
Think-aloud procedure. One full iteration of revisions and retesting was conducted before the survey instrument received its final think-aloud test: one think-aloud session, followed by refinements to the instrument, and one final think-aloud session to confirm that all the issues had been remedied and no new issues were being detected. Following identification of the issues that emerged from the first think-aloud procedure, a revision of the instrument underwent the last think-aloud screening. At this point, we were satisfied that the meaning of each of the items in the Italian translation was consistent with the meaning, as originally presented in the English language version. The face validity of the DAS24ita was also confirmed at this stage.
Main study. The scoring of the scale was computed according to standard scoring instructions (35). The distribution of DAS24ita scores was examined for skewness and for floor or ceiling effects. Skewness was found to be non-significant (skewness/standard error of skewness=0.782/0.250).
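The standard error of skewness quoted above can be reproduced from the sample size alone. The sketch below assumes the usual exact-formula standard error for sample skewness (the one commonly reported by statistical packages); the decision rule the authors applied to the skewness-to-standard-error ratio is not stated in the text, so none is encoded here.

```python
import math


def skewness_standard_error(n):
    """Standard error of sample skewness for a sample of size n (n > 2)."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


n_participants = 93          # main-study sample size, from the text
reported_skewness = 0.782    # from the text

se_skewness = skewness_standard_error(n_participants)
ratio = reported_skewness / se_skewness   # skewness divided by its standard error
```

For n=93 the formula gives a standard error of approximately 0.250, matching the reported value.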
The internal consistency of the single-factor scale demonstrated a Cronbach's alpha of 0.93. The mean of the scale was 40.1, with standard deviation of 14.32.
Ninety-three participants provided data, with two missing data entries for overall quality of life and 25 for breast-related body image, as shown in Table II.
Factor structure: Confirmatory factor analyses. The adequacy of the single-factor structure of the DAS24 proposed by Carr et al. (9), and confirmed in other languages (12), was examined with the DAS24ita. Confirmatory factor analysis was conducted using Mplus-7 software (34). As an alternative model, we also tested a recently proposed bi-factorial model (32). To evaluate the adequacy of each model, we considered a variety of goodness-of-fit indices: the Tucker-Lewis Index (TLI) and Comparative Fit Index (CFI), for which values close to 0.95 indicate good fit; the root mean-square error of approximation (RMSEA), for which a value of less than 0.06 indicates good fit (36); and the χ2/df ratio, for which a value of less than 2 indicates good fit (37).
As shown in Table III, the one-factor model provided the better fit to the data. Cautious interpretation of these findings is required with a sample of this size.
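For readers unfamiliar with the fit indices named above, the sketch below shows how they are conventionally derived from the chi-square statistics of the fitted model and of the baseline (independence) model. The formulas are the common textbook definitions; Mplus may differ in detail (for example, in using N rather than N-1 in the RMSEA denominator). All numerical inputs are hypothetical and are not the values in Table III.

```python
import math


def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """Return chi2/df, RMSEA, CFI and TLI from model and baseline chi-squares."""
    excess_model = max(chi2_model - df_model, 0.0)
    excess_base = max(chi2_base - df_base, 0.0)

    chi2_df = chi2_model / df_model
    rmsea = math.sqrt(excess_model / (df_model * (n - 1)))
    # CFI compares model misfit with baseline misfit (1.0 = no excess misfit).
    denominator = max(excess_model, excess_base)
    cfi = 1.0 if denominator == 0 else 1.0 - excess_model / denominator
    # TLI penalises model complexity through the chi2/df ratios.
    base_ratio = chi2_base / df_base
    tli = (base_ratio - chi2_df) / (base_ratio - 1.0)
    return {"chi2_df": chi2_df, "rmsea": rmsea, "cfi": cfi, "tli": tli}


# Hypothetical one-factor model for 24 items: 300 unique variances/covariances
# minus 48 free parameters gives df = 252; the baseline model has df = 276.
example = fit_indices(chi2_model=400.0, df_model=252,
                      chi2_base=1500.0, df_base=276, n=93)

# Cut-offs quoted in the text (CFI/TLI close to 0.95, RMSEA < 0.06, chi2/df < 2).
meets_chi2_df = example["chi2_df"] < 2
meets_rmsea = example["rmsea"] < 0.06
```

With small samples such as this one, RMSEA in particular tends to be unstable, which is one reason the results are interpreted cautiously.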
Convergent criterion validity. The DAS24ita demonstrated significant correlations with the other measures of appearance sensitivity, including the CARSAL (r=0.37, p<0.05), as well as the CARVAL (r=0.63, p<0.001) and body shame (r=0.82, p<0.001) scales.
The DAS24ita demonstrated strong positive associations with measures of anxiety (r=0.55) and depression (r=0.73) (both p<0.001), an inverse relationship (r=−0.30, p<0.01) with the Global Health Status subscale of the EORTC QLQ-C30, and a strong positive association with the Body Image subscale of the BR23 EORTC (r=0.74, p<0.001).
Concurrent criterion validity. In order to examine the concurrent criterion validity, the ability of the DAS24ita to discriminate amongst preoperative, quadrantectomy, and mastectomy patients was examined (see Table IV for summary descriptive statistics). Differences between these three groups for other relevant variables (body shame, appearance salience and valence, anxiety, and depression) were also examined to establish that the groups were meaningfully different.
A Kruskal–Wallis H-test was run to determine whether there were differences among the three groups, namely no surgery (n=45), quadrantectomy (n=30, except for quality of life, where n=28) and mastectomy (n=18), across all the measured variables. Distributions of scores for each of the variables were not similar for all groups, as assessed by visual inspection of a boxplot. Subsequently, pairwise comparisons were performed using Dunn's (38) procedure with a Bonferroni correction for multiple comparisons.
DAS24ita, body shame, and depression variables had a similar pattern in the three treatment groups. The distributions of DAS24ita scores were statistically significantly different between groups (χ2(2)=39.0, p<0.001). All groups were statistically different from each other at p<0.05. The no-surgery group scored lower (better adjustment) than the quadrantectomy group, which in turn scored significantly lower than the mastectomy group (poorest adjustment).
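The analysis pipeline described above (an omnibus Kruskal–Wallis test followed by Bonferroni-corrected pairwise comparisons) can be sketched in plain Python. This follows the textbook formulas with tie correction rather than the SPSS implementation used in the study; the function names and toy scores are hypothetical. The omnibus p-value uses the closed form of the chi-square distribution with 2 degrees of freedom, so the sketch is restricted to exactly three groups.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..N, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks, tie_sizes, i = [0.0] * len(values), [], 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_dunn(groups):
    """groups: dict of name -> scores (three groups). Returns (H, p, pairwise)."""
    names = list(groups)
    if len(names) != 3:
        raise ValueError("closed-form p-value below assumes three groups (df=2)")
    pooled = [x for name in names for x in groups[name]]
    n_total = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)

    sizes, rank_sums, start = {}, {}, 0
    for name in names:
        sizes[name] = len(groups[name])
        rank_sums[name] = sum(ranks[start:start + sizes[name]])
        start += sizes[name]

    tie_sum = sum(t ** 3 - t for t in tie_sizes)
    h = 12 / (n_total * (n_total + 1)) * sum(
        rank_sums[name] ** 2 / sizes[name] for name in names
    ) - 3 * (n_total + 1)
    h /= 1 - tie_sum / (n_total ** 3 - n_total)      # tie correction
    p_omnibus = math.exp(-h / 2)                     # chi-square survival, df=2

    # Dunn's z statistics on mean ranks, with Bonferroni-adjusted p-values.
    pairs = list(combinations(names, 2))
    base_var = n_total * (n_total + 1) / 12 - tie_sum / (12 * (n_total - 1))
    pairwise = {}
    for a, b in pairs:
        diff = rank_sums[a] / sizes[a] - rank_sums[b] / sizes[b]
        z = diff / math.sqrt(base_var * (1 / sizes[a] + 1 / sizes[b]))
        p_raw = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal p-value
        pairwise[(a, b)] = min(1.0, p_raw * len(pairs))
    return h, p_omnibus, pairwise


# Hypothetical scores for three groups (not study data).
toy = {"no_surgery": [1, 2, 3], "quadrantectomy": [4, 5, 6], "mastectomy": [7, 8, 9]}
h_stat, p_value, adjusted = kruskal_wallis_dunn(toy)
```

The Bonferroni step multiplies each raw pairwise p-value by the number of comparisons (three here) and caps the result at 1.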
The distributions of body shame scores were statistically significantly different between groups (χ2(2)=37.4, p<0.001). Again, each group was significantly different from the others in body shame (p<0.02), with a similar pattern to DAS24ita scores.
The distributions of depression scores were statistically significantly different between groups (χ2(2)=36.54, p<0.001). All groups were significantly different to each other (p<0.05), following the same pattern as DAS24ita scores and body shame scores.
Quality of life scores followed the same broad pattern as DAS, body shame and depression, but the size of differences between the quadrantectomy and the other two groups did not reach significance. The distributions of quality of life scores were statistically significantly different between groups (χ2(2)=6.46, p=0.039). Only the mastectomy and no surgery groups were significantly different from each other (no surgery significantly lower); the quadrantectomy group had an intermediate score and was thus not significantly different from either.
CARVAL, anxiety, and breast-related body image followed a similar pattern to each other. The distributions of CARVAL scores were statistically significantly different between groups (χ2(2)=16.7, p<0.001). The scores for the no-surgery group were significantly lower than those of both of the other groups, which did not differ significantly from each other.
The distributions of anxiety scores were statistically significantly different between groups (χ2(2)=27.61, p<0.001). The no surgery group had significantly lower anxiety than both the surgical groups, which did not differ from each other.
The distributions of breast-related body image scores were significantly different between groups (χ2(2)=30.80, p<0.001). The no surgery group had significantly better breast-related body image than both the surgical groups, which did not differ from each other.
The distributions of CARSAL scores were not statistically significantly different between groups (χ2(2)=1.43, p=0.489).
Discussion
The translation of the DAS24 from English to Italian was successful. The Italian scale was internally valid, and had appropriate criterion validity, correlating with measures of appearance sensitivity, body shame, depression, anxiety, and quality of life. The concurrent criterion validity was demonstrated by an investigation of the impact of surgery upon psychological outcomes. It is worth elaborating upon these results.
DAS24ita, body shame and depression scores had similar patterns – mastectomy was associated with poorer outcomes than quadrantectomy, which itself is associated with poorer adjustment than pre-treatment. Appearance valence (CARVAL) and anxiety scores were not different across the two surgery groups, but both differed from the pre-treatment group. Global quality of life was only different between pre-treatment and mastectomy groups.
This pattern of results is a vindication of the translation process for DAS24ita. It further demonstrates the predicted pattern of outcomes from pre-operative, quadrantectomy, and mastectomy patients, which is supported by the impact on body shame and depression. The difference between pre-treatment patients and surgical patients in anxiety and appearance valence is broadly consistent with this, although it invites the further investigation of why quadrantectomy and mastectomy are not differentiated by these measures. We can speculate that the component of DAS24ita, body shame and depression which differentiates these measures from anxiety and appearance valence is the personal, non-social aspect of bodily difference (what one is) as opposed to the social aspect of bodily difference (how one deals with others in social and interpersonal settings); we can at this stage only offer this as a hypothesis for future investigation. The fact that the quadrantectomy patients scored at a mid, and non-significantly different, level compared to pre-treatment and mastectomy suggests that any effect on global quality of life was small, and may become statistically significant with a larger sample. Whether clinical significance was observed remains a moot point. The breast-related body image scores were broadly consistent with expectations, in that the pre-surgery group had better scores than post-surgery groups.
The implications of this work are important for surgeons and their patients. Firstly, we are confident that the Italian language version of the DAS24 scale is valid and internally consistent. It offers a short, sensitive and specific outcome measure in plastic and reconstructive surgery for Italian-speaking patients. For patients, it provides an opportunity to record and raise issues with their surgeon which may not otherwise be covered in a clinical setting, as has been reported in the initial use and development of the scale by Carr et al. (9). In this way, the patient is empowered to raise sensitive issues, and the surgeon has a means of ensuring that psychological needs are not missed. It also serves as an effective outcome measure for the psychological success of surgical interventions, and allows calculation of clinical cost and cost effectiveness of interventions.
There are a number of limitations to this work which are worth considering. The sample size was sufficient for the analysis of internal validity and for the tests of criterion validity. It was, however, at the lower bound of the permissible size for confirmatory factor analysis according to Arrindell and van der Ende (39); we therefore counsel caution in interpreting the confirmatory factor analysis results, and recommend replication on a larger sample. The past and recent literature on this topic offers differing recommendations for the number of subjects, or the ratio of subjects to variables/items. In our study the ratio was 3.9:1, which could be considered the lower bound of an acceptable ratio according to Arrindell and van der Ende (39); such ratios are discussed in depth by Khalid, in a review of other studies conducting factor analysis with a ratio below 3:1 (40).
Furthermore, cross-sectional designs such as this one need to be complemented by longitudinal studies to develop a clearer understanding of change over time. Finally, there is a hypothetical chance that the design is confounded by differential diagnosis for the quadrantectomy and mastectomy patients, which may in part explain the difference between the outcomes for these groups. However, from a psychosocial perspective, there is no a priori reason for expecting this to be a factor.
This study has demonstrated an effective translation and validation of the Derriford Appearance Scale 24 from English into Italian. The translated scale was able to differentiate amongst patients with differential diagnoses, and was more sensitive to these differences than generic quality-of-life scales.
Acknowledgements
The Authors thank David Harris FRCS.
Andrea Chirico and Antonio Giordano were funded by Sbarro Health Research Organization (www.shro.org) and the Commonwealth of Pennsylvania, Department of Health, Biotechnology Research Program.
Footnotes
This article is freely accessible online.
- Received January 14, 2016.
- Revision received February 23, 2016.
- Accepted February 24, 2016.
- Copyright© 2016 International Institute of Anticancer Research (Dr. John G. Delinassios), All rights reserved