Development and validation of a brief screening version of the Childhood Trauma Questionnaire


Child Abuse & Neglect 27 (2003) 169–190


David P. Bernstein a,∗, Judith A. Stein b, Michael D. Newcomb c, Edward Walker d, David Pogge e,f, Taruna Ahluvalia e,f, John Stokes e, Leonard Handelsman g, Martha Medrano h, David Desmond h, William Zule h

a Department of Psychology, Fordham University, Dealy Hall, 3rd Floor, Bronx, NY 10458, USA
b Department of Psychology, University of California, Los Angeles, CA, USA
c Department of Psychology, University of Southern California, Los Angeles, CA, USA
d Department of Psychiatry, University of Washington School of Medicine, Seattle, WA, USA
e Department of Psychology, Four Winds Hospital, Katonah, NY, USA
f Fairleigh Dickinson University, Teaneck, NJ, USA
g Department of Psychiatry, Duke University School of Medicine, Durham, NC, USA
h Department of Psychiatry, San Antonio Health Sciences Center, University of Texas, San Antonio, TX, USA

Received 15 June 2001; received in revised form 14 August 2002; accepted 14 August 2002

Abstract

Objective: The goal of this study was to develop and validate a short form of the Childhood Trauma Questionnaire (the CTQ-SF) as a screening measure for maltreatment histories in both clinical and nonreferred groups.

Method: Exploratory and confirmatory factor analyses of the 70 original CTQ items were used to create a 28-item version of the scale (25 clinical items and three validity items) and test the measurement invariance of the 25 clinical items across four samples: 378 adult substance abusing patients from New York City, 396 adolescent psychiatric inpatients, 625 substance abusing individuals from southwest Texas, and 579 individuals from a normative community sample (combined N = 1978).

Results: Results showed that the CTQ-SF’s items held essentially the same meaning across all four samples (i.e., measurement invariance). Moreover, the scale demonstrated good criterion-related validity in a subsample of adolescents on whom corroborative data were available.

Conclusions: These findings support the viability of the CTQ-SF across diverse clinical and nonreferred populations.

© 2002 Elsevier Science Ltd. All rights reserved.

Keywords: Child abuse; Neglect; Measures; Validity

Dr. Stein and Dr. Newcomb are supported by a grant from the National Institute on Drug Abuse, DA-01070-28. Dr. Walker is supported by a grant from the National Institute of Mental Health, K20MH01106.

∗ Corresponding author.

0145-2134/02/$ – see front matter © 2002 Elsevier Science Ltd. All rights reserved. doi:10.1016/S0145-2134(02)00541-0

Introduction

Over the past two decades, research on the prevalence, causes, and consequences of child abuse and neglect has increased exponentially (Crouch & Milner, 1993; Finkelhor, 1994; Kendall-Tackett, Meyer Williams, & Finkelhor, 1993; Knutson, 1995; Malinosky-Rummell & Hansen, 1993). However, many of the empirical studies in this area are limited by serious methodological shortcomings, including a lack of standardized, adequately validated instruments for retrospectively assessing abuse and neglect (Briere, 1992). Many previous studies have used methods such as chart review or single questions or items to assess maltreatment, although such approaches may be unreliable and lack sensitivity (Briere & Zaidi, 1989). Moreover, studies have often focused on a single form of childhood trauma, typically sexual or physical abuse, despite evidence that multiple types of maltreatment often co-occur (Briere & Runtz, 1988; Rosenberg, 1987). As a result, it has been difficult to disentangle the effects of particular types of trauma from those of other coexisting forms or from the impact of maltreatment in general. Little systematic attention has been paid to issues concerning instrument format, for example, whether maltreatment phenomena are more adequately ascertained using self-report questionnaire or interview methods (Dill, Chu, Grob, & Eisen, 1991; Walker, Bernstein, & Keegan, 1997). A related issue is whether childhood traumas are better conceptualized as dichotomous events (i.e., events that either did or did not occur) or as experiences that vary along continuous dimensions such as frequency, severity, and duration (Lipschitz, Bernstein, Winegar, & Southwick, 1999; Walker et al., 1997). Finally, although several instruments have been developed that incorporate a more methodologically sophisticated approach to the assessment of childhood trauma (Bernstein & Fink, 1998; Bernstein et al., 1994; Bifulco, Brown, & Harris, 1994; Ditomasso, 1995; Fink, Bernstein, Handelsman, Foote, & Lovejoy, 1995; Gallagher, Flye, Hurt, Stone, & Hull, 1992; Herman, Perry, & van der Kolk, 1989; Meyer, Muenzenmaier, Cancienne, & Struening, 1996; Sanders & Becker-Lausen, 1995; Straus & Hamby, 1997; Straus, Hamby, Finkelhor, Moore, & Runyan, 1998; Zanarini, Gunderson, Marino, Schwarz, & Frankenburg, 1989), comparatively little attention has been paid to their validity. While published reports on many of these instruments contain information about reliability, most contain little information about criterion-related validity or construct validity. With the exception of the Childhood Trauma Questionnaire (Bernstein & Fink, 1998; Bernstein et al., 1994), none of these instruments has been validated with respect to the critical question of whether it correctly detects abuse and neglect histories (i.e., criterion-related validity).

This lack of attention to instrument validity is of particular concern, given the controversy over the accuracy of retrospective reports of childhood trauma. Many authors have noted that a variety of factors can affect the accuracy of recollections for childhood events, including normative ones, such as the degradation of memories over time, and pathological ones, such as dissociation and repression (Allen, 1995; Bernstein et al., 1995; Rogers, 1995). The “false memory syndrome” is another example of inaccurate recall, in this case, one that is purportedly iatrogenic in nature (Loftus, 1993). On the other hand, some authors have noted that memories for childhood experiences may actually be enhanced in cases where events are unusual, unexpected, or consequential, such as childhood trauma (Brewin, Andrews, & Gotlib, 1993). One experimental study found that recall was improved for emotionally arousing events and that this enhancement was related to greater beta-adrenergic activation (Cahill, Prins, Weber, & McGaugh, 1994). In light of these controversies, it is essential that trauma researchers demonstrate the validity of their retrospective assessments.

To address the need for reliable and valid assessment of a broad range of maltreatment experiences, Bernstein and colleagues developed a 70-item self-administered inventory, the Childhood Trauma Questionnaire (CTQ; Bernstein & Fink, 1998; Bernstein et al., 1994). The CTQ uses multiple Likert-type items to create dimensional scales, thereby enhancing reliability and maximizing statistical power. Cut scores can be applied to identify individuals with histories of abuse and neglect. In initial studies of adult substance abusers, the CTQ showed excellent test-retest reliability over a 2- to 6-month interval as well as convergent and discriminant validity with a structured trauma interview (Bernstein et al., 1994; Fink et al., 1995). Principal components analysis of the CTQ items yielded four rotated factors, which were labeled physical and emotional abuse, emotional neglect, sexual abuse, and physical neglect (Bernstein et al., 1994). Similar factor analytic results were obtained in a study of adolescent psychiatric patients, with the exception that physical and emotional abuse items loaded on separate factors, rather than a single factor, and that the numbers of items loading highly on each respective factor were somewhat different from those in the original study (Bernstein, Ahluvalia, Pogge, & Handelsman, 1997). In the adolescent study, it was possible to corroborate histories obtained with the CTQ through the use of independent evidence, such as information from referring clinicians and agencies, and the reports of other informants. When compared to therapists’ trauma ratings based on all available data about the patient, the CTQ showed good sensitivity and satisfactory or better specificity, supporting its criterion-related validity (Bernstein et al., 1997).

The goals of the present study were twofold. First, we wished to develop a short form of the CTQ that would take no more than 5 minutes to self-administer, to provide more rapid screening for maltreatment histories in both clinical and nonreferred populations. The original 70-item version of the CTQ, which requires 10–15 minutes to give, may be too lengthy for settings in which time constraints are present (e.g., primary care medical settings) or may unduly increase respondent burden when the CTQ is included in a battery of other tests. A short form of the scale, on the other hand, would overcome some of these limitations. Second, we were interested in examining two important aspects of the construct validity of the CTQ short form: (1) the “measurement invariance” of its factor structure across clinical and nonreferred groups (Hoyle & Smith, 1994), and (2) its criterion-related validity, that is, its relationship to independent validating criteria. Measurement invariance refers to the question of whether a measure holds the same meaning across groups and encompasses several related issues: whether the number and nature of the latent dimensions (i.e., factors) represented by a measure are equivalent across the groups; whether the pattern of factor loadings is the same across groups; and whether the covariances among the latent dimensions are equivalent across groups (Hoyle & Smith, 1994). All of these issues can be addressed through the use of confirmatory factor analysis, a special case of structural equation modeling. In practical terms, measurement invariance means that the CTQ short form would be equally useful in both normal and clinical populations, an essential property in a screening instrument. Moreover, measurement invariance is a precondition for a comparison of means between groups, for example, using the CTQ short form to compare levels of child abuse and neglect across different populations.
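The invariance questions listed above can be written compactly in conventional confirmatory factor analysis notation. The symbols below are standard textbook notation introduced here as an assumption; they do not appear in the original text, and the sketch is not a statement of the exact parameterization used in the analyses.

```latex
% Multigroup CFA sketch (notation assumed, g indexes the four samples)
\begin{align}
\mathbf{x}^{(g)} &= \Lambda^{(g)} \boldsymbol{\xi}^{(g)} + \boldsymbol{\delta}^{(g)}, \qquad g = 1, \dots, 4, \\
\Sigma^{(g)} &= \Lambda^{(g)} \Phi^{(g)} \Lambda^{(g)\top} + \Theta^{(g)}, \\
\text{equal loadings:} \quad & \Lambda^{(1)} = \Lambda^{(2)} = \Lambda^{(3)} = \Lambda^{(4)}, \\
\text{equal factor covariances:} \quad & \Phi^{(1)} = \Phi^{(2)} = \Phi^{(3)} = \Phi^{(4)}.
\end{align}
```

Here x is the vector of observed item responses, ξ the latent maltreatment factors, Λ the factor loadings, Φ the covariances among the factors, and Θ the covariances of the residuals δ. Equivalence of the number and nature of the factors corresponds to each Λ having the same pattern of zero and nonzero entries in every group.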

Although factor analytic studies of the 70-item CTQ have produced similar results across different populations, they have not demonstrated measurement invariance in the strict sense, in that somewhat different factor structures were obtained (i.e., four vs. five factors, different numbers of items per factor) (Bernstein et al., 1994, 1997). In the present study, our aim was to reduce the number of items on each factor to produce a scale with a relatively simple factor structure that would be invariant across diverse clinical and nonreferred groups. In particular, we dropped items from the original CTQ that loaded highly on more than one factor, so that the resulting factors would be as discriminable as possible across multiple populations. We tested the measurement invariance of the CTQ short form in 1978 individuals consisting of four separate samples: a primarily male sample of adult substance abusers enrolled in inpatient and outpatient treatment programs in New York City, male and female adolescent psychiatric inpatients, male and female substance abusers in a community sample from the Southwest, and a normative sample of male and female participants in a longitudinal study selected from greater Los Angeles County. Two of the four data sets (the adult substance abusers from New York City and the sample of adolescent psychiatric patients) had been used previously to examine the validity of the 70-item version of the CTQ (Bernstein et al., 1994, 1997). However, we felt justified in using them again in conjunction with the two new samples, because the four samples together provided a diverse set of participants on which to validate the short form of the scale, and because the short form is a substantially different version of the CTQ requiring separate validation. We also performed latent means analyses to test the hypothesis that the adult substance abusers from New York City and the Southwest and the adolescent psychiatric patients would report higher levels of child maltreatment than the normative community sample. The failure to find such differences would be a serious blow to our claims for the scale’s validity, in light of extensive research documenting the high prevalence of maltreatment in clinical populations (Crouch & Milner, 1993; Finkelhor, 1994; Kendall-Tackett et al., 1993; Knutson, 1995; Malinosky-Rummell & Hansen, 1993). Finally, we examined the criterion-related validity of the CTQ short form in a subgroup of the adolescent psychiatric patients on whom corroborative data were available in the form of therapists’ trauma ratings.

Methods

Participants

Four diverse sets of participants were used in this study: adult substance abusing patients from New York City, adolescent psychiatric inpatients, substance abusing individuals from a community sample in southwest Texas, and individuals from a normative community sample in Los Angeles County.


Adult substance abusing patients. The first sample consisted of 378 adult substance-dependent patients seeking treatment at two facilities: inpatient drug and alcohol detoxification and rehabilitation units located at the VA Medical Center in the Bronx, NY (N = 252) and an outpatient methadone maintenance program affiliated with the Mount Sinai Medical Center in New York City (N = 126). VA patients were consecutive admissions who were given the CTQ during their first week in the hospital as part of a battery of self-report measures and structured interviews. Mount Sinai patients were enrolled in a NIDA-funded treatment demonstration project that examined the efficacy of cognitive behavioral therapy in methadone-maintained heroin addicts with comorbid cocaine addiction. Patients were randomly assigned to a high intensity cognitive behavioral treatment group or a low intensity “treatment as usual” control group. All of the Mount Sinai patients were given the CTQ during the intake phase of the study prior to assignment to one of the treatment groups. The VA and Mount Sinai patients were quite similar with respect to their demographic and clinical characteristics and were therefore combined into a single sample for the analyses reported here. The patients in the combined sample were mostly minority (African-American = 50.3%, Hispanic = 33.7%, White = 13.4%), predominantly male (85.6%) inner-city addicts and alcoholics who ranged in age from 24 to 68 years (M = 40.2 years, SD = 8.8 years). Most had extensive lifetime histories of polysubstance abuse and dependence, with alcohol (90.1%), cocaine (68.3%), cannabis (60%), and heroin (39.2%) being the most frequently used substances. The sample of VA and Mount Sinai patients used in the present study partially overlapped with one described in an earlier report on the validity of the original version of the CTQ (Bernstein et al., 1994), and was obtained by supplementing the original sample with an additional 92 participants drawn by the same method from the same population.

Adolescent psychiatric inpatients. The second sample consisted of 398 psychologically disturbed adolescents admitted to the inpatient unit of a private psychiatric hospital in Katonah, NY. The adolescents were given the CTQ approximately 1 week after admission as part of a clinical battery of psychological tests, including the Wechsler Intelligence Scale for Children-Third Edition (WISC-III), the Wide Range Achievement Test-Version III (WRAT-III), and a variety of self-report measures. Approximately 25% of the adolescents were unable to complete the CTQ and the other self-report measures due to low intelligence (WISC Full Scale IQ < 80) or poor reading skills (WRAT-III Reading Level below sixth grade) and were excluded from the study. The adolescents were diverse with respect to age (M = 14.9 years, SD = 1.4 years, range = 12–17 years), gender (male = 43%, female = 57%), and ethnicity (White = 67.9%, Hispanic = 13.3%, African-American = 11.2%), and spanned a range of family income from upper- and middle-income families with private health insurance to families in poverty (patients with Medicaid coverage = 51%). Although the adolescents were admitted for a variety of psychiatric conditions, the most frequent presenting problems were suicide risk (48.9%), substance abuse (37.8%), and mood disorders (35.2%). The adolescent sample used in this study is identical to one described in an earlier report on the validity of the original version of the CTQ (Bernstein et al., 1997).

In both the clinical sample of adult substance abusers and the sample of adolescent psychiatric patients, data on participants’ CTQ responses were extracted from their testing files. Specific informed consent for the CTQ was not solicited because it was subsumed under the general consent for clinical evaluation and treatment services obtained from patients and/or their parents or legal guardians.

Adult substance abusers in the community. The third sample consisted of 625 male and female participants in the Community Outreach for the Prevention of AIDS (COPA) project, an ongoing National Institute on Drug Abuse (NIDA) Cooperative Agreement research project. Injection drug or crack cocaine using adults residing in South Texas were recruited for the study through community outreach. To be eligible for the study, participants had to screen positive for cocaine, opiates, or methamphetamine based on urine toxicology, and to have not received drug treatment in the prior 30 days. All participants were given an HIV risk behavior interview developed by NIDA, an HIV antibody test, and a NIDA-developed educational intervention that included HIV pretest counseling. At the time of initial evaluation, participants were given the CTQ and a variety of other self-report measures as part of a substudy funded by the Hogg Foundation to examine the relationship between HIV risk behavior and history of childhood victimization. Participants who were unable to complete the CTQ and the other self-report measures on their own were administered the scales verbally. Participants were 64% male; 60% were Hispanic, 28% African-American, and 11% non-Hispanic White; and they ranged in age from 18 to 54 years (M = 34 years). Most of the sample (57%) had not graduated from high school. Seventy-seven percent of participants were injection drug users, with heroin (44%), crack cocaine (38%), and intravenous cocaine (14%) being the most frequently used primary substances.

Normative community sample of adults. The fourth sample was obtained from all 579 current participants in a 20-year longitudinal study of community adolescents that began in 1976 (Newcomb, 1997). When the study began, participants were 7th, 8th, and 9th grade students in 11 Los Angeles County schools. Assessments have occurred every 4 years. At present their average age is 34.9 years (range = 33–37 years); they are 67% Caucasian, 14% African-American, 10% Hispanic, and 8% Asian-Pacific Islander. Their average income is US $45,000; their average education is some college; 37% have only a high school diploma; and 28% have a BA/BS or higher degree. The sample is 72% women (N = 417), and most participants are married and have full-time jobs. The greater preponderance of women has been a feature of this sample since its inception. Numerous studies have been published based on this longitudinal sample (e.g., Newcomb, 1994, 1997; Newcomb & Bentler, 1988; Scheier & Newcomb, 1993; Stein, Newcomb, & Bentler, 1987, 1993). The CTQ was included in the most recent wave of the survey, which was sent to the participants by mail. They were given US $30 to complete the questionnaire.

Measures

Childhood Trauma Questionnaire (CTQ). The original CTQ is a 70-item self-administered inventory that was developed to provide reliable and valid retrospective assessment of child abuse and neglect (Bernstein et al., 1994). Items on the CTQ ask about experiences in childhood and adolescence and are rated on a 5-point, Likert-type scale with response options ranging from Never True to Very Often True (sample CTQ items are given in an earlier report, Bernstein et al., 1994). The CTQ has five clinical scales (physical, sexual, and emotional abuse, and physical and emotional neglect), which have been empirically derived (Bernstein et al., 1994, 1997). The CTQ scales were based on the following definitions of abuse and neglect. Sexual abuse was defined as “sexual contact or conduct between a child younger than 18 years of age and an adult or older person.” Physical abuse was defined as “bodily assaults on a child by an adult or older person that posed a risk of or resulted in injury.” Emotional abuse was defined as “verbal assaults on a child’s sense of worth or well-being or any humiliating or demeaning behavior directed toward a child by an adult or older person.” Physical neglect was defined as “the failure of caretakers to provide for a child’s basic physical needs, including food, shelter, clothing, safety, and health care” (poor parental supervision was also included in this definition if it placed children’s safety in jeopardy). Emotional neglect was defined as “the failure of caretakers to meet children’s basic emotional and psychological needs, including love, belonging, nurturance, and support.” In the short version of the CTQ, each type of maltreatment is represented by five items to provide adequate reliability and content coverage while substantially reducing the overall number of items in the scale. The CTQ also has a three-item Minimization/Denial validity scale that was developed to detect the underreporting of maltreatment (Bernstein & Fink, 1998). In the present study, the two treatment samples (adult substance abusers and adolescent psychiatric patients) received the original 70-item version of the CTQ. The two community samples (adult substance abusers in the Southwest and the normative sample) were given the short form of the CTQ, from which the three-item validity scale and many of the other CTQ items had been excluded to reduce respondent burden.
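As a rough illustration of how five 5-item scales on a 5-point response format could be scored, the following sketch sums item responses per scale. Several things here are assumptions rather than details from the text: responses are coded 1 (Never True) to 5 (Very Often True), scale scores are simple sums, and the item positions in `SCALE_ITEMS` and `REVERSED` are hypothetical placeholders, not the published scoring key.

```python
from typing import Dict, List, Sequence

# HYPOTHETICAL item-to-scale key: positions 0-24 are placeholders only.
SCALE_ITEMS: Dict[str, List[int]] = {
    "emotional_abuse": [0, 1, 2, 3, 4],
    "physical_abuse": [5, 6, 7, 8, 9],
    "sexual_abuse": [10, 11, 12, 13, 14],
    "physical_neglect": [15, 16, 17, 18, 19],
    "emotional_neglect": [20, 21, 22, 23, 24],
}

# HYPOTHETICAL set of positively worded items that are reverse-scored.
REVERSED = {20, 21, 22, 23, 24}


def score_short_form(responses: Sequence[int]) -> Dict[str, int]:
    """Sum the five items of each scale, assuming 1-5 response coding."""
    if len(responses) != 25:
        raise ValueError("expected 25 clinical item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    scores: Dict[str, int] = {}
    for scale, items in SCALE_ITEMS.items():
        total = 0
        for i in items:
            # Reverse-scored items map 1->5, 2->4, ..., 5->1.
            total += (6 - responses[i]) if i in REVERSED else responses[i]
        scores[scale] = total
    return scores
```

Under these assumptions each scale score ranges from 5 to 25, with higher scores indicating more reported maltreatment.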

Therapists’ maltreatment ratings. Therapists’ ratings of abuse and neglect were obtained on a subsample of the adolescent psychiatric patients (N = 179) who had also received the CTQ. In an earlier study (Bernstein et al., 1997), these ratings were used to validate the full 70-item CTQ. In the present study, these data were reanalyzed to provide external validation for the short form of the questionnaire. After the adolescent patients were discharged from the hospital, their primary therapists were given the Child Maltreatment Ascertainment Interview, a structured interview eliciting detailed information about their patients’ histories of childhood trauma (Bernstein et al., 1997). The therapists were given a synopsis of each case based on information that was extracted from the clinical record, but were kept blind to the CTQ responses of the adolescents. The therapists were then presented with standardized definitions of four kinds of maltreatment (physical, sexual, and emotional abuse, and physical neglect) and asked to determine their patients’ maltreatment status (definitely or definitely not maltreated, or uncertain), based on all available information about the case.

The therapists’ maltreatment ratings showed excellent interrater reliability (kappas = .9 to 1.0) when two therapists were asked to rate 10 case vignettes that were abstracted from patients’ clinical charts (Bernstein et al., 1997). Thus, the therapists were able to apply the maltreatment definitions in a uniform manner.
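For readers unfamiliar with kappa, the sketch below computes Cohen's kappa for two raters, which corrects observed agreement for agreement expected by chance. The text reports only "kappas", so treating them as Cohen's kappa is an assumption, and the ratings in the usage note are invented for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater1: Sequence[Hashable], rater2: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    if len(rater1) != len(rater2) or len(rater1) == 0:
        raise ValueError("raters must rate the same non-empty set of cases")
    n = len(rater1)
    # Proportion of cases on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    expected = sum(counts1[k] * counts2[k] for k in counts1) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)
```

With ten hypothetical vignettes on which two raters disagree once, for example `["yes"] * 5 + ["no"] * 5` against `["yes"] * 4 + ["no"] * 6`, observed agreement is .90, chance agreement is .50, and kappa is .80.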

These therapists’ ratings were used as the validity criterion in this study. The therapists had extensive contact with the patients and their families during the typically lengthy hospitalizations (length of stay: M = 6.9 weeks, SD = 5.6 weeks). Moreover, the therapists were privy to information from a variety of other sources, such as reports of child welfare investigations, referring clinicians and agencies, and other members of the multidisciplinary treatment team. In a majority of cases of sexual abuse (62.8%), physical abuse (67.7%), and physical neglect (75.6%), the therapists were able to support their judgments with independent evidence, such as knowledge of Child Protective Services investigations, criminal or family court charges/appearances, or removal of the child from the parental home (Bernstein et al., 1997). Thus, although the therapists’ ratings were based in part on information provided by the patient, these were substantiated with independent data in most instances and were not influenced by responses to the CTQ.

Analyses

Although each of the items on the original CTQ was intended to represent only one factor, many loaded highly on more than one factor. The initial goal of the data analysis was therefore to identify five items from each of the five hypothesized factors of the CTQ that would load highly together and overlap only moderately with the other factors, leaving a briefer (25 items plus the three-item validity scale) and more easily interpretable form of the questionnaire.

We wanted to reduce the full form by about two-thirds, from 70 to 25 items plus the 3 validity items, producing a 28-item short form. We also wanted to establish reliable subscales that were equally balanced among the five types of maltreatment and had sufficient items to provide a breadth of content. Five items seemed a reasonable compromise on each of these issues. We did not want to give more weight to one type of trauma than to another, but rather to give equal credence to all types of trauma, some of which have received little attention (i.e., emotional abuse and neglect, and physical neglect).

First, after excluding the three validity items, we conducted exploratory factor analyses of the remaining 67 CTQ items that were given to the adult substance abusers and adolescent psychiatric patients using the BMDP 4M factor analysis program with maximum likelihood estimation and direct quartimin rotation. Where appropriate, items were reverse-scored in the analyses to keep the items positively correlated among themselves. We did not expect the factors to be orthogonal since previous research (Bernstein et al., 1994, 1997) indicated that the factors are highly related to each other. Twenty-five items were retained that had factor loadings greater than .50 on their intended factors and low loadings (<.30) on the other factors. We thus developed reasonably distinct factors corresponding to the a priori constructs developed for the original CTQ: physical, sexual, and emotional abuse, and physical and emotional neglect.
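The retention rule described above (a loading greater than .50 on the intended factor and loadings below .30 on every other factor) can be sketched as a simple filter over a loading matrix. The function name, the use of absolute values, and the example loadings are all hypothetical; the actual selection also had to arrive at five items per factor, which this sketch does not enforce.

```python
from typing import Dict, List

# HYPOTHETICAL rotated loadings: item -> {factor: loading}. Not real CTQ data.
EXAMPLE_LOADINGS: Dict[str, Dict[str, float]] = {
    "item_a": {"PA": 0.72, "SA": 0.10, "EA": 0.21},
    "item_b": {"PA": 0.55, "SA": 0.05, "EA": 0.41},  # cross-loading too high
    "item_c": {"PA": 0.12, "SA": 0.48, "EA": 0.08},  # primary loading too low
    "item_d": {"PA": 0.02, "SA": 0.88, "EA": 0.15},
}
EXAMPLE_INTENDED: Dict[str, str] = {
    "item_a": "PA", "item_b": "PA", "item_c": "SA", "item_d": "SA",
}


def retained_items(
    loadings: Dict[str, Dict[str, float]],
    intended: Dict[str, str],
    primary_min: float = 0.50,
    cross_max: float = 0.30,
) -> List[str]:
    """Keep items loading > primary_min on their intended factor
    and < cross_max (in absolute value) on all other factors."""
    kept = []
    for item, row in loadings.items():
        target = intended[item]
        primary = abs(row[target])
        cross = max((abs(v) for f, v in row.items() if f != target), default=0.0)
        if primary > primary_min and cross < cross_max:
            kept.append(item)
    return kept
```

Applied to the example data, the filter keeps `item_a` and `item_d` and drops the item with a high cross-loading and the item with a weak primary loading.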

Next, confirmatory and multisample latent variable analyses were performed using the EQS structural equations modeling program (Bentler, 1995). Latent variables are error-free constructs that are composed of the shared variance or relations among a number of manifest or indicator variables (Bentler & Stein, 1992). These analyses compare a proposed hypothetical model with a set of actual data. The closeness of the hypothetical model to the empirical data is evaluated statistically through goodness-of-fit indexes, which include the χ²/degrees of freedom ratio and various fit indexes. A χ² value no more than twice the degrees of freedom in the model generally indicates a plausible, well-fitting model since with large sample sizes it is difficult to obtain a nonsignificant χ² value (Newcomb, 1994).
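The rule of thumb above, that χ² should be no more than twice the degrees of freedom, amounts to a one-line check. The sketch below is illustrative only; the function names are invented and the numbers in the usage note are made up rather than taken from the study.

```python
def chi2_df_ratio(chi2: float, df: int) -> float:
    """Return the chi-square to degrees-of-freedom ratio."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    if chi2 < 0:
        raise ValueError("chi-square cannot be negative")
    return chi2 / df


def plausible_fit(chi2: float, df: int, max_ratio: float = 2.0) -> bool:
    """Apply the 'chi-square no more than twice df' heuristic."""
    return chi2_df_ratio(chi2, df) <= max_ratio
```

For instance, a hypothetical model with χ² = 500 on 265 degrees of freedom has a ratio of about 1.89 and passes the heuristic, whereas χ² = 600 on the same degrees of freedom (ratio about 2.26) does not.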

Goodness-of-fit of the models was principally evaluated with Satorra-Bentler robust fit statistics [the Satorra-Bentler chi-square (S-B χ²) and the Robust Comparative Fit Index (RCFI)] since the data were multivariately kurtose (Bentler, 1990; Bentler & Dudgeon, 1996). The RCFI ranges between 0 and 1 and compares the improvement of fit of a hypothesized model to a model of complete independence among the measured variables while adjusting for sample size. Values over .90 are desirable since that indicates that 90% or more of the covariation in the data is reproduced by the hypothesized model (Bentler, 1990, 1995). We also report another indicator of model fit, the root mean square error of approximation (RMSEA; Steiger, 1990). A value of about .07 or less is considered reasonable (Browne & Cudeck, 1993).
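To make the two indexes concrete, the sketch below implements the commonly used textbook formulas for the Comparative Fit Index and the single-group RMSEA. This is an assumption-laden illustration: the robust (Satorra-Bentler) variants reported in the text would substitute scaled χ² values, multigroup RMSEA is sometimes computed with an additional group-count adjustment, and the function names and example numbers are hypothetical.

```python
import math


def cfi(chi2_model: float, df_model: int, chi2_null: float, df_null: int) -> float:
    """Comparative Fit Index from model and independence-model chi-squares."""
    # Noncentrality estimates, floored at zero.
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    if d_null == 0.0:
        return 1.0  # neither model shows any misfit beyond its df
    return 1.0 - d_model / d_null


def rmsea(chi2_model: float, df_model: int, n: int) -> float:
    """Single-group RMSEA: sqrt(max(chi2 - df, 0) / (df * (N - 1)))."""
    if df_model <= 0 or n <= 1:
        raise ValueError("need df > 0 and N > 1")
    return math.sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))
```

With made-up values of χ² = 500 on 265 degrees of freedom for the hypothesized model, χ² = 5000 on 300 degrees of freedom for the independence model, and N = 378, the CFI is .95 and the RMSEA is roughly .049, both within the ranges described above as acceptable.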

Models

Preliminary confirmatory factor analyses. Initial confirmatory factor analyses (CFA) were performed for each group separately with each hypothesized latent construct predicting its proposed five manifest indicators selected from results of the exploratory factor analysis described above. All latent constructs intercorrelated freely since we expected them to be significantly correlated with each other. This analysis assessed the adequacy of the proposed factor structure and the relationships among the latent variables. To improve the fit of the models, a few correlated error residuals suggested by the Lagrange Multiplier Test (LM test; Chou & Bentler, 1990) were allowed between the measured variables if they made sense theoretically. We did not allow any complex factor loadings in which an indicator would load on more than one factor, as we planned to contrast the factor structures of the four groups and wanted the models to be as similar as possible.

Multisample analyses. After the separate confirmatory factor analyses, we tested multiple group hypotheses about invariance across the four groups in their factor structures (Hoyle & Smith, 1994). We contrasted the adult substance abusers, the adolescents, the southwest Texas sample, and the community sample members to see whether the revised instrument held essentially the same meaning for them. We were testing to see whether the instrument would be equally useful in clinical and community samples and also wanted to contrast the means of the latent constructs.

Constraints on the equality of the factor loadings in the CFA models were imposed (Bentler, 1995; Byrne, 1994; Byrne, Shavelson, & Muthén, 1989). After testing a baseline unconstrained model, the factor loading of each measured variable on its latent factor was constrained to equality across the groups. The tenability of this constrained model was determined with the same goodness-of-fit indexes described above, χ2-difference tests, and results of the LM test, which in this context provides information concerning which equality constraints are not plausible and should be released to improve the overall fit of the model.

We also assessed the differences between the samples in latent means (Hoyle & Smith, 1994; Stein & Gelberg, 1995). First we contrasted the latent means of the two clinical populations. Then, in a four-group analysis, we contrasted the latent means of the clinical groups against the community sample using the community sample as the reference group. This technique provides a statistic analogous to a z-score and was meant to determine if the clinical samples reported more maltreatment than the community sample.

Criterion-related validation

As noted above, therapists’ independent ratings of four types of childhood trauma were obtained from the Child Maltreatment Ascertainment Interview (CMAI) for 179 of the adolescent psychiatric patients. Ratings of “present” on the CMAI were coded as “2,” ratings of “absent” were coded as “1,” and ratings of “uncertain” were coded as “1.5.” To evaluate whether scores from the CTQ short form corresponded well with those independently obtained ratings, we first performed a CFA in which the five latent variables from the CTQ short form were intercorrelated with the four scores from the CMAI (physical, sexual, and emotional abuse, and physical neglect). We then tested a predictive model to observe whether constructs from the CTQ short form could predict analogous measured constructs from the CMAI. Initially, all possible predictive paths were included simultaneously and nonsignificant paths were dropped gradually. This procedure was a test of both the convergent and discriminative validity of the CTQ short form (i.e., childhood trauma variables on the CTQ should be related to corresponding variables on the CMAI, and not to noncorresponding variables) and the criterion-related validity of the CTQ short form (i.e., the ability of the CTQ to predict an independent criterion variable).
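The numeric coding of the CMAI ratings described above can be written as a small lookup. This is an illustrative sketch only: the helper name `code_cmai_rating` and the example ratings are hypothetical, and only the three numeric codes come from the text.

```python
# Numeric codes for CMAI therapist ratings, as stated in the text.
CMAI_CODES = {"absent": 1.0, "uncertain": 1.5, "present": 2.0}


def code_cmai_rating(rating: str) -> float:
    """Map a CMAI rating label to its numeric code (hypothetical helper)."""
    key = rating.strip().lower()
    if key not in CMAI_CODES:
        raise ValueError(f"unknown CMAI rating: {rating!r}")
    return CMAI_CODES[key]


# Hypothetical ratings for one patient on the four CMAI maltreatment types.
example = {
    "physical abuse": "present",
    "sexual abuse": "absent",
    "emotional abuse": "uncertain",
    "physical neglect": "absent",
}
coded = {kind: code_cmai_rating(label) for kind, label in example.items()}
print(coded)
```

Coding “uncertain” midway between “absent” and “present” lets the ratings enter the model as ordered numeric scores rather than as three unordered categories.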

Results

Confirmatory factor analysis

Table 1 reports the means, standard deviations, and factor loadings for the individual items that were selected to form the five latent constructs. Alpha coefficients for each group are also reported.

All manifest variables loaded significantly (p ≤ .001) on their hypothesized latent factors in all four groups. Model modification was minimal and is described below. The fit indexes were quite good, which indicated that the hypothesized factor structures were plausible for all four groups: (1) adult substance abusers from New York City, S-Bχ2 (262, N = 378) = 484.98; p < .001; χ2/df = 1.85; RCFI = .92; RMSEA = .05; (2) adolescents, S-Bχ2 (263, N = 396) = 527.77; p < .001; χ2/df = 2.01; RCFI = .94; RMSEA = .05; (3) substance abusers from the Southwest, S-Bχ2 (262, N = 625) = 654.47; p < .001; χ2/df = 2.49; RCFI = .93; RMSEA = .05; and (4) normative community sample, S-Bχ2 (263, N = 579) = 491.12; p < .001; χ2/df = 1.87; RCFI = .93; RMSEA = .06. All fit indexes were greater than .90, all but one of the χ2/degrees of freedom ratios were near 2:1 or less, and RMSEAs were acceptable in all four groups.
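The χ2/degrees of freedom ratios quoted above follow directly from the reported S-Bχ2 values. The sketch below recomputes them; the dictionary layout and the function name are illustrative choices, and the numbers are those given in the text.

```python
# Reported S-B chi-square, degrees of freedom, and chi2/df ratio for each group.
reported = {
    "adult substance abusers (New York City)": (484.98, 262, 1.85),
    "adolescents": (527.77, 263, 2.01),
    "substance abusers (Southwest)": (654.47, 262, 2.49),
    "normative community sample": (491.12, 263, 1.87),
}


def chi2_df_ratio(chi2: float, df: int) -> float:
    """Return the chi-square to degrees-of-freedom ratio."""
    return chi2 / df


ratios = {group: chi2_df_ratio(chi2, df) for group, (chi2, df, _) in reported.items()}
for group, ratio in ratios.items():
    print(f"{group}: chi2/df = {ratio:.3f}")
```

Only the Southwest sample clearly exceeds the 2:1 guideline, which is the single exception the text refers to.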

A few nonhypothesized covariances among the error residuals were added to each model based on suggestions from the LM test. These correlations reflect unique associations between variables that are not accounted for by the latent factor. They may capture either method or content similarity. It is not surprising that a few were needed for each group, and these in no way altered the fundamental factor structure.

For the adult substance abusers, three additional covariances were added: one was between the residuals of two physical abuse items (“People in my family hit me so hard it left me with bruises or marks,” and “I was punished with a belt, a board, a cord, or some other hard object”), one was between two sexual abuse items (“Someone tried to touch me in a sexual way, or tried to make me touch them,” and “Someone tried to make me do sexual things or watch sexual things”), and one was between two emotional abuse items (“People in my family called me things like ‘stupid,’ ‘lazy,’ or ‘ugly,’ ” and “People in my family said hurtful or insulting things to me”). For the adolescents, two additional correlated error residuals were added. One was between two sexual abuse items (“Someone tried to make me do sexual things or watch sexual things,” and “Someone threatened to hurt me or tell lies about me unless I did something sexual with them”). The other was between two emotional neglect items (“I felt loved,” and “People in my family felt close to each other”). Three correlated errors were added for the Southwest sample. The first was between two emotional abuse items (“People in my family said hurtful or insulting things to me,” and “People in my family called me things like ‘stupid,’ ‘lazy,’ or ‘ugly’ ”); the second was between two physical abuse items (“I got hit or beaten so badly that it was noticed by someone like a neighbor, teacher, or doctor,” and “People in my family hit me so hard that it left me with bruises or marks”); and the third was between two physical neglect items (“I didn’t have enough to eat,” and “I had to wear dirty clothes”). Three correlated errors were also added for the community sample. The first was between two emotional neglect items (“I knew there was someone to take care of me and protect me,” and “There was someone in my family who helped me feel that I was important or special”). The second was between two sexual abuse items (“Someone tried to make me do sexual things or watch sexual things,” and “Someone tried to touch me in a sexual way or tried to make me touch them”). The third was between “I believe that I was physically abused,” and “I believe that I was emotionally abused.”

Table 1
Means, standard deviations, and factor loadings of measured variables (CTQ short form items) in the confirmatory factor analysis

                                    Adolescents          Substance abusers    Community members    Texas sample
                                    (N = 396)            (N = 378)            (N = 579)            (N = 625)
Item                                Mean (SD)  Loading   Mean (SD)  Loading   Mean (SD)  Loading   Mean (SD)  Loading

I. Emotional abuse (adolescents’ coefficient α = .89, drug abusers’ α = .84, community sample members’ α = .87, Texas α = .88)
Called names by family              2.7 (1.5)  .82       2.2 (1.2)  .60       1.9 (1.2)  .73       2.2 (1.3)  .69
Parents wished was never born       2.1 (1.3)  .72       1.6 (1.0)  .67       1.4 (.9)   .69       1.8 (1.3)  .70
Felt hated by family                2.5 (1.5)  .81       1.7 (1.2)  .78       1.7 (1.1)  .73       2.0 (1.3)  .83
Family said hurtful things          2.7 (1.4)  .84       2.2 (1.2)  .79       2.1 (1.1)  .83       2.2 (1.3)  .82
Was emotionally abused              2.5 (1.5)  .77       1.9 (1.3)  .80       1.8 (1.3)  .85       2.1 (1.5)  .83

II. Physical abuse (α = .86, .81, .83, .85)
Hit hard enough to see doctor       1.3 (.8)   .60       1.4 (.9)   .74       1.1 (.5)   .67       1.3 (.9)   .63
Hit hard enough to leave bruises    2.0 (1.3)  .91       1.8 (1.2)  .74       1.3 (.8)   .84       1.9 (1.3)  .78
Punished with hard objects          2.1 (1.4)  .75       3.0 (1.4)  .49       2.2 (1.2)  .56       2.5 (1.4)  .66
Was physically abused               2.0 (1.5)  .82       1.6 (1.2)  .75       1.4 (1.0)  .82       1.8 (1.3)  .87
Hit badly enough to be noticed      1.4 (1.0)  .69       1.3 (.9)   .73       1.1 (.5)   .63       1.4 (1.0)  .72

III. Sexual abuse (α = .95, .93, .92, .94)
Was touched sexually                1.9 (1.5)  .91       1.7 (1.2)  .75       1.6 (1.0)  .80       1.8 (1.4)  .90
Hurt if didn’t do something sexual  1.4 (1.0)  .71       1.3 (.8)   .75       1.1 (.6)   .68       1.4 (1.0)  .73
Made to do sexual things            1.6 (1.2)  .87       1.5 (1.0)  .82       1.4 (.9)   .85       1.6 (1.2)  .90
Was molested                        1.7 (1.4)  .95       1.4 (1.0)  .91       1.4 (1.0)  .91       1.7 (1.4)  .93
Was sexually abused                 1.7 (1.4)  .93       1.4 (1.0)  .94       1.4 (1.0)  .89       1.7 (1.4)  .92

IV. Emotional neglect (α = .89, .88, .91, .85)
Felt loved (R)                      2.3 (1.3)  .86       1.9 (1.2)  .78       1.8 (.9)   .80       2.1 (1.3)  .79
Made to feel important (R)          2.5 (1.3)  .72       2.3 (1.2)  .67       2.0 (1.1)  .72       2.7 (1.5)  .47
Was looked out for (R)              2.6 (1.3)  .77       2.0 (1.1)  .80       1.9 (1.0)  .84       2.3 (1.3)  .83
Family felt close (R)               3.0 (1.3)  .76       2.1 (1.2)  .78       2.2 (1.1)  .84       2.4 (1.3)  .81
Family was source of strength (R)   2.9 (1.4)  .85       2.1 (1.2)  .83       2.1 (1.1)  .90       2.4 (1.4)  .81

V. Physical neglect (α = .78, .68, .61, .68)
Not enough to eat                   1.4 (.9)   .57       1.5 (.9)   .41       1.2 (.6)   .28       1.7 (1.1)  .40
Got taken care of (R)               2.1 (1.2)  .79       1.8 (1.2)  .65       1.7 (1.0)  .82       2.0 (1.3)  .60
Parents were drunk or high          1.6 (1.2)  .53       1.4 (.8)   .49       1.3 (.7)   .41       1.6 (1.1)  .51
Wore dirty clothes                  1.4 (.8)   .58       1.4 (.8)   .61       1.2 (.5)   .30       1.5 (.9)   .42
Got taken to doctor (R)             1.7 (1.0)  .69       1.5 (.9)   .56       1.3 (.8)   .44       1.8 (1.2)  .66

Note: Items are presented in abbreviated form; (R) = reverse-scored item. Range of all variables = 1–5 (1 = never true; 2 = rarely true; 3 = sometimes true; 4 = often true; 5 = very often true). All factor loadings significant, p ≤ .001.
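Items marked (R) in Table 1 are reverse-scored. The text does not spell out the arithmetic, so the sketch below assumes the conventional reflection for a 1–5 response scale (1 becomes 5, 2 becomes 4, and so on); the function name `reverse_score` is hypothetical.

```python
def reverse_score(response: int, low: int = 1, high: int = 5) -> int:
    """Reflect a Likert response about the scale midpoint (assumed convention)."""
    if not low <= response <= high:
        raise ValueError(f"response {response} is outside {low}-{high}")
    return low + high - response


# A positively worded item such as "I felt loved" answered 5 (very often true)
# becomes 1 after reversal, so higher scores consistently indicate more neglect.
print([reverse_score(r) for r in range(1, 6)])  # [5, 4, 3, 2, 1]
```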

Relationships among the latent variables are reported in Table 2. All relationships among the latent variables were significant (p ≤ .001). The relationships between emotional abuse and physical abuse were particularly high in all groups (.75 among the adolescents, .80 among the adult substance abusers, .77 among the Texas sample, .87 among the community sample) as were the relationships between emotional neglect and physical neglect for the four groups (.88, adolescents; .79, adult substance abusers; .84, Texas sample; .90, community sample). Relatively smaller although still significant relationships were observed between sexual abuse and the other latent variables.

Table 2
Correlations among the latent factors

                            I      II     III    IV

Adolescents
  I. Emotional abuse        –
  II. Physical abuse        .75    –
  III. Sexual abuse         .44    .41    –
  IV. Emotional neglect     .77    .55    .30    –
  V. Physical neglect       .72    .59    .45    .84

Substance abusers
  I. Emotional abuse        –
  II. Physical abuse        .81    –
  III. Sexual abuse         .50    .41    –
  IV. Emotional neglect     .78    .52    .27    –
  V. Physical neglect       .63    .54    .29    .78

Community sample members
  I. Emotional abuse        –
  II. Physical abuse        .73    –
  III. Sexual abuse         .47    .59    –
  IV. Emotional neglect     .83    .50    .30    –
  V. Physical neglect       .80    .62    .35    .90

Texas sample
  I. Emotional abuse        –
  II. Physical abuse        .87    –
  III. Sexual abuse         .59    .60    –
  IV. Emotional neglect     .61    .53    .40    –
  V. Physical neglect       .65    .65    .43    .84

Note: All correlations significant, p ≤ .001.

Multiple group comparisons

Factor structure. The factor structure is the relationship between the latent and measured variables. The baseline multiple group model for all four sets with no equality constraints imposed on it served as a comparison for further models (Model 1 of Table 3). Values for an absolute null model are also reported for comparison purposes (Model 4). The multiple group comparison in which the factor structures (measurement models) were constrained to equality across the groups (Model 2) suggested that the factor structures for the adolescents, adult substance abusers in New York City, Southwest substance abusers, and normative community sample were reasonably similar (RCFI = .92), although there was a significant decrement in fit from the unrestricted model (the two constraints released in the two group analyses were not constrained in the four group analysis). The χ2-difference between Model 2 with equality constraints on the factor structure and an unrestricted model (Model 1) was 212.86 with df = 58, which was significant (p < .001). This lack of equivalence reflects significant group differences in relationships among some of the measured variables, especially between the community and clinical samples, and how they relate to their associated latent variables. After releasing five of the 58 constraints, the fit improved considerably (Model 3, χ2-difference = 55.00/53 df, nonsignificant, p > .10). The five constraints that were dropped centered mostly around the equivalence of the loadings of the normative sample as compared to the clinical groups on three items: “I believe that I was emotionally abused,” “There was someone in my family who helped me feel that I was important or special,” and “I knew that there was someone to take care of me and protect me.”
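The χ2-difference tests reported for the nested models can be checked with a few lines of code. The sketch below uses only the Python standard library; `chi2_sf` and `chi2_difference` are hypothetical helper names, the χ2 and df values are those of Models 1 to 3 in Table 3, and treating the simple difference of the scaled statistics as an ordinary χ2 variate is an assumption that mirrors the arithmetic reported in the text.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Start from Q(1/2, h) = erfc(sqrt(h)) for odd df or Q(1, h) = exp(-h) for
    # even df, then apply Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(h))
    else:
        a, q = 1.0, math.exp(-h)
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


# (S-B chi-square, df) for Models 1-3 as listed in Table 3.
models = {1: (2152.47, 1049), 2: (2365.33, 1107), 3: (2207.47, 1102)}


def chi2_difference(restricted, baseline):
    """Return (delta chi2, delta df, p) for a restricted model against a baseline."""
    d_chi2 = restricted[0] - baseline[0]
    d_df = restricted[1] - baseline[1]
    return d_chi2, d_df, chi2_sf(d_chi2, d_df)


for k in (2, 3):
    d_chi2, d_df, p = chi2_difference(models[k], models[1])
    print(f"Model {k} vs Model 1: {d_chi2:.2f}/{d_df} df, p = {p:.3g}")
```

Under these assumptions the fully constrained model differs significantly from the baseline, whereas the model with five constraints released does not, in line with the p < .001 and p > .10 results stated above.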

Table 3
Results of multiple group analyses between adolescents, substance abusers, Texas sample, and community sample

Model                                                S-Bχ2      df     RCFI (RMSEA)   χ2-difference from Model 1

1. Baseline four-group model, no constraints         2152.47    1049   .93 (.023)     NA
2. Four-group model, constrained measurement model   2365.33    1107   .92 (.024)     212.86/58 df
3. Model 2 without 5 equality constraints            2207.47    1102   .93 (.023)     55.00/53 df
4. Absolute null model                               17044.74   1200   NA             –

Note: S-Bχ2: Satorra-Bentler scaled chi-square; RCFI: robust comparative fit index; RMSEA: root mean square error of approximation; NA: not applicable.
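The fit indexes in Table 3 can be approximated from the tabled χ2 values. The sketch below assumes the commonly used definitions of the comparative fit index and of the RMSEA point estimate, applied to the S-Bχ2 values with the combined sample size of 1978 (the sum of the four group sizes). The function names are hypothetical, and the robust computations in EQS may differ in detail, so agreement with the table should be read as a plausibility check only.

```python
import math


def cfi(chi2_m: float, df_m: int, chi2_null: float, df_null: int) -> float:
    """Comparative fit index from model and null-model chi-square values."""
    d_model = max(chi2_m - df_m, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    return 1.0 - d_model / d_null if d_null > 0 else 1.0


def rmsea(chi2_m: float, df_m: int, n: int) -> float:
    """RMSEA point estimate, assuming N - 1 in the denominator."""
    return math.sqrt(max(chi2_m - df_m, 0.0) / (df_m * (n - 1)))


N_TOTAL = 378 + 396 + 625 + 579           # combined sample size, 1978
NULL_MODEL = (17044.74, 1200)             # Model 4 in Table 3
MODELS = {1: (2152.47, 1049), 2: (2365.33, 1107), 3: (2207.47, 1102)}

for label, (chi2_m, df_m) in MODELS.items():
    print(f"Model {label}: CFI = {cfi(chi2_m, df_m, *NULL_MODEL):.2f}, "
          f"RMSEA = {rmsea(chi2_m, df_m, N_TOTAL):.3f}")
```

With the Table 3 entries, these definitions give values that round to the tabled RCFI and RMSEA figures for all three models.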


Latent means analysis

We tested for differences between the latent means of the factors in the two clinical samples and then among the four groups. In these types of models, the factor structure and the observed intercepts (means) are initially constrained to equality between the groups (see Byrne, 1994). In the initial latent means analysis contrasting only the two clinical samples, all latent means were significantly higher for the adolescent group than for the adult substance abusers group except for the physical abuse latent mean (emotional abuse z = 7.25, p ≤ .001; physical abuse z = .62, ns; sexual abuse z = 2.79, p ≤ .01; emotional neglect z = 7.42, p ≤ .001; physical neglect z = 2.27, p ≤ .05). Although this model fit well (RCFI = .98, RMSEA = .054), one equivalence constraint between the observed intercepts of the two groups was released since it was reported to be extremely untenable in the LM test (χ2 = 149.06, 1 df, p < .001). This indicator was the physical abuse item, “I was punished with a belt, a board, a cord, or some other hard object.” As reported in Table 1, this particular item was endorsed more highly by the substance abusers in New York City, although other items on this factor tended to be endorsed more highly by the adolescents. Once that constraint was dropped, the physical abuse factor latent mean was significantly higher for the adolescent group (z = 2.33, p ≤ .05).

In the four group latent means model (RCFI = .96), using the community sample as the reference group, we found that the adult substance abusers reported more emotional abuse (z = 2.61, p ≤ .01), more physical abuse (z = 6.78, p ≤ .001), and more physical neglect (z = 4.68, p ≤ .001) than the community sample. There was no significant difference on the means for the sexual abuse factor. The adolescents reported more emotional abuse (z = 10.84, p ≤ .001), more physical abuse (z = 7.85, p ≤ .001), more sexual abuse (z = 4.19, p ≤ .001), more emotional neglect (z = 10.14, p ≤ .001), and more physical neglect (z = 7.01, p ≤ .001) than the community sample. The Texas sample reported more emotional abuse (z = 4.47, p ≤ .001), more physical abuse (z = 8.91, p ≤ .001), more sexual abuse (z = 5.01, p ≤ .001), more emotional neglect (z = 6.17, p ≤ .001), and more physical neglect (z = 10.23, p ≤ .001) than the community sample.
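The significance levels attached to the z statistics above can be recovered from the standard normal distribution. The sketch below assumes two-tailed tests, which the text does not state explicitly; the function names are hypothetical and the z values are taken from the latent means results.

```python
import math


def two_tailed_p(z: float) -> float:
    """Two-tailed p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def significance_label(z: float) -> str:
    """Return the strictest conventional threshold that the z statistic meets."""
    p = two_tailed_p(z)
    for cutoff in (0.001, 0.01, 0.05):
        if p <= cutoff:
            return f"p <= {cutoff}"
    return "ns"


# z statistics from the two-group and four-group latent means analyses.
for name, z in [("emotional abuse, adolescents vs. adults", 7.25),
                ("physical abuse, adolescents vs. adults", 0.62),
                ("sexual abuse, adolescents vs. adults", 2.79),
                ("physical neglect, adolescents vs. adults", 2.27),
                ("emotional abuse, adults vs. community", 2.61)]:
    print(f"{name}: z = {z:.2f}, {significance_label(z)}")
```

Under the two-tailed assumption, each z value falls in the same significance band that the text reports for it.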

Criterion-related validation

First, a CFA validation model was run using the subsample of 179 adolescents available for this analysis. We added the same correlated error residuals as in the original CFA model with the complete set of adolescents. This model fit the data very well: S-Bχ2 (344, N = 179) = 534.55; p < .001; χ2/df = 1.55; RCFI = .93. Correlations between the therapist ratings and the CTQ latent factors are reported in Table 4. Therapist ratings are arranged in columns. The highest correlation in each column coincides with the analogous CTQ latent construct.

We then used the CTQ latent factors as predictors of the therapist ratings. All factors were used as predictors of all constructs simultaneously. We allowed covariances (correlations) among the predictor variables and significant covariances among the error residuals of the outcome variables. We gradually dropped paths if they were nonsignificant until only significant paths remained. The fit indices for this final path model reflected an excellent fit: χ2 (361, N = 179) = 550.08; p < .001; χ2/df = 1.52; RCFI = .93; RMSEA = .05. Results are reported in Figure 1.

Figure 1. Significant regression paths among latent variables in the structural equation model predicting observer ratings (N = 179). Regression coefficients are standardized (a: p ≤ .05; b: p ≤ .01; c: p ≤ .001). Correlations among predictors and correlations among residuals of outcomes are not depicted for readability.

Table 4
Correlations between CTQ latent factors and therapist observation scores for 179 adolescents from the validation CFA

                          Therapist ratings
CTQ factors               Physical abuse   Sexual abuse   Emotional abuse   Neglect

I. Emotional abuse        .51a             .38a           .48a              .27a
II. Physical abuse        .59a             .27a           .45a              .28a
III. Sexual abuse         .18b             .75a           .20b              .23b
IV. Emotional neglect     .42a             .22b           .38a              .36a
V. Physical neglect       .43a             .27a           .32a              .50a

Note: a: p ≤ .001; b: p ≤ .01.
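The convergent pattern in Table 4, where each therapist rating correlates most strongly with its analogous CTQ factor, can be checked mechanically. The sketch below copies the correlations from Table 4; the dictionary layout and the name `strongest_factor` are illustrative choices.

```python
# Therapist rating columns, in the order used in Table 4.
RATINGS = ["physical abuse", "sexual abuse", "emotional abuse", "neglect"]

# Correlations from Table 4: one row per CTQ latent factor.
TABLE4 = {
    "emotional abuse":   [0.51, 0.38, 0.48, 0.27],
    "physical abuse":    [0.59, 0.27, 0.45, 0.28],
    "sexual abuse":      [0.18, 0.75, 0.20, 0.23],
    "emotional neglect": [0.42, 0.22, 0.38, 0.36],
    "physical neglect":  [0.43, 0.27, 0.32, 0.50],
}


def strongest_factor(column: int) -> str:
    """Return the CTQ factor with the highest correlation in a rating column."""
    return max(TABLE4, key=lambda factor: TABLE4[factor][column])


best = {rating: strongest_factor(i) for i, rating in enumerate(RATINGS)}
for rating, factor in best.items():
    print(f"therapist-rated {rating}: highest with CTQ {factor}")
```

For the therapists' neglect rating, the matching CTQ factor is physical neglect, consistent with the CMAI covering physical neglect only.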

We found that the CTQ constructs significantly predicted analogous observational scores by the therapists. In most cases, there was considerable discriminative validity between similar observed and reported variables, except that CTQ physical abuse also predicted observed emotional abuse. Since these constructs are highly related to each other, these results are not unexpected.

To refine these results, we needed to determine empirically whether the path from the CTQ physical abuse factor to the physical abuse rating was significantly larger than the path from CTQ physical abuse to the emotional abuse rating variable. Therefore, we ran a model that constrained these paths to equivalence and then examined the χ2-difference test between these nested models. The difference test revealed that the paths were significantly different in magnitude (p < .01), thereby providing additional evidence of the discriminant validity of the CTQ.

Discussion

The results of the confirmatory factor analyses indicate that with few exceptions the items on the CTQ short form performed equivalently across four diverse populations with differing maltreatment histories, supporting the measurement invariance of the scale. In the initial analyses where each sample was examined separately, the proposed five-factor structure of the CTQ short form (i.e., physical, sexual, and emotional abuse, and physical and emotional neglect) provided a good fit for the data in all four groups: adult substance abusing patients in New York City, adolescent psychiatric inpatients, adult substance abusers in the Southwest, and normative community sample members. To provide a more stringent test of measurement invariance, we then compared the four groups directly, first using an unconstrained baseline model and then introducing equality constraints on the model. When we constrained the factor structure (i.e., the relationships between items and their latent variables) to equality, the model provided a good fit for the data, once a few constraints were released. Thus, individuals in the four groups, which differed widely in terms of age, sex, ethnicity, SES, psychopathology, and life experiences, responded to the scale’s items in a reasonably equivalent manner, indicating that the items held essentially the same meaning across diverse populations.

Importantly, the main precondition for the utility of a scale across different groups is the invariance of its factor structure (Byrne, 1994). For example, a scale with an invariant factor structure can be used to perform latent means analyses, even when its covariance structure shows some nonequivalence between groups (Byrne et al., 1989). Thus, despite small differences in the covariance structure of the scale, particularly in the community sample, our results support the use of the CTQ short form as a screening instrument for maltreatment in both clinical and nonreferred groups.

The CTQ short form also showed good evidence of criterion-related validity in a subgroup of psychiatrically referred adolescents on whom corroborative data were available. When the CTQ short form’s latent maltreatment variables were compared to analogous therapists’ ratings of abuse and neglect based on all available information about the patients, the correspondence between the two sets of measures was quite precise, supporting the convergent and discriminant validity of the CTQ short form. Although the CTQ short form’s physical abuse factor was related to both physical and emotional abuse ratings made by the therapists, this was not unexpected. Indeed, the high intercorrelation between the physical and emotional abuse factors across the four samples supports the clinical observation that physical abuse almost always occurs in the context of emotional abuse (Claussen & Crittenden, 1991), although the converse, emotional abuse in the absence of physical abuse, is more common. Moreover, the physical abuse factor was significantly more highly associated with therapists’ physical abuse ratings than with their emotional abuse ratings, supporting the discriminant validity of the physical abuse factor.

The latent means analyses showed that, as expected, the two substance abusing samples and the sample of adolescent psychiatric patients reported higher levels of maltreatment in nearly all areas than the normative community sample. One exception was the nonsignificant difference in levels of sexual abuse reported by the community sample and the adult substance abusers. This lack of a difference is probably attributable to the predominance of males in the substance abusing group, whereas the community sample has a higher proportion of females. To test this possibility, we examined gender differences in the community sample and found that men reported significantly less sexual abuse than the women (p < .001). This result corroborates our conclusion that the lack of a mean difference on the sexual abuse factor between the adult drug abusers and community samples was due to the disproportionate number of men and women in these groups.

In general, the adolescent psychiatric inpatients reported the highest levels of maltreatment, the normative community sample members the lowest levels of maltreatment, and the adult substance abusers were for the most part intermediate between the other two groups, although still showing quite substantial maltreatment. Several studies have reported the prevalence of abuse and neglect in clinically referred samples of adolescents (Cavaiola & Schiff, 1988; Sanders & Giolas, 1991; Sansonnet-Hayden, Haley, Marriage, & Fine, 1987) and adult substance abusers (Kroll, Stock, & James, 1985; Schaefer, Sobieraj, & Hollyfield, 1988) that are far in excess of those found in the general population. For example, in a recent study of the same adolescent psychiatric patients reported on here (Bernstein et al., 1997), over 50% of patients were rated as abused or neglected by their therapists, and over 70% reported maltreatment on the CTQ, when cut scores were used to determine caseness. However, few previous studies have directly compared the prevalence or severity of maltreatment across different groups, in part because measurement invariance is a precondition for the validity of such comparisons. By demonstrating the measurement invariance of the CTQ short form, the present study helps lay the groundwork for more accurate comparisons of the extent of child abuse and neglect across both clinical and nonreferred populations in future studies.

The findings of this study must be considered in light of certain methodological limitations. First, two of the four data sets used in this study (i.e., the adult substance abusers in New York City and the adolescent psychiatric patients) were used to derive the CTQ short form by exploratory factor analysis and also to test the measurement invariance of the CTQ short form by confirmatory factor analysis. Although it would have been preferable to obtain completely new clinical samples to cross validate the exploratory factor analysis results, this was not feasible, due to the time and expense that would have been required to gather additional clinical samples of adequate size. On the other hand, our finding that the CTQ short form showed measurement invariance in two entirely new samples, including a normative community sample, suggests that our results are not merely circular. Second, in the adolescent sample, the therapists’ maltreatment ratings based on the CMAI were clustered within therapists (i.e., each therapist made ratings on more than one patient) and were therefore nonindependent; however, it is unlikely that this lack of independence affected the external validity results, because the therapists’ ratings had excellent interrater reliability. Nevertheless, this possibility cannot be entirely ruled out. More importantly, however, although we found strong evidence for criterion-related validity in the adolescent sample, no such analyses were performed for the other three groups due to the absence of direct corroborative data. The verification of self-reported childhood trauma poses inherent difficulties, including the passage of time and the secrecy that often surrounds these experiences. For this reason, corroborative data are often difficult or impossible to obtain, particularly in samples of adults. In the adolescent sample, we capitalized on the fact that the events in question were relatively recent ones and that the adolescents’ therapists were privy to many sources of corroborative information, such as child welfare records and interviews with family members and other informants. No comparable data were available in the other three samples. Thus, the criterion-related validity of the CTQ short form in other populations remains to be established.

In summary, our findings provide strong support for the coherence and viability of the constructs measured by the CTQ short form, including the invariance of its factor structure across diverse populations and its criterion-related validity in an adolescent psychiatric population in which independent corroborative evidence was obtained. The CTQ short form’s brevity of administration and assessment of multiple types of maltreatment should give it broad utility in both clinical and nonreferred groups. As a clinical screening instrument, the CTQ short form, which takes about 5 minutes to give, can quickly identify individuals with histories of maltreatment so that appropriate treatments can be provided. As a research tool, its ease and quickness of administration make it well suited for treatment studies and for large scale epidemiological and multivariate correlational studies. In future studies, we will continue to examine the validity of trauma histories obtained with the CTQ, using a variety of research strategies including corroboration by additional sources of independent evidence.


Acknowledgments

The secretarial and production assistance of Wendy Sallin and Gisele Pham is gratefully acknowledged.

References

Allen, J. (1995). The spectrum of accuracy in memories of childhood trauma.Harvard Review of Psychiatry, 3,84–95.

Bentler, P. M. (1990). Comparative fit indexes in structural models.Psychological Bulletin, 107,238–246.Bentler, P. M. (1995).EQS structural equations program manual. Encino, CA: Multivariate Software Inc.Bentler, P. M., & Dudgeon, P. (1996). Covariance structure analysis: Statistical practice, theory, and directions.

Annual Review of Psychology, 47,563–592.Bentler, P. M., & Stein, J. A. (1992). Structural equation modeling in medical research.Statistical Methods in

Medical Research, 1, 159–181.Bernstein, D., & Fink, L. (1998).Childhood Trauma Questionnaire: A retrospective self-report. San Antonio, TX:

The Psychological Corporation.Bernstein, D. P., Fink, L., Handelsman, L., Foote, J., Lovejoy, M., Wenzel, K., Sapareto, E., & Ruggiero, J. (1994).

Initial reliability and validity of a new retrospective measure of child abuse and neglect.American Journal ofPsychiatry, 151,1132–1136.

Bernstein, D. P., Fink, L., Handelsman, L., Foote, J., Lovejoy, M., Wenzel, K., Sapareto, E., & Ruggiero, J. (1995).Validity of child abuse measurements: Dr. Bernstein and colleagues reply.American Journal of Psychiatry, 152,1535–1537.

Bernstein, D. P., Ahluvalia, T., Pogge, D., & Handelsman, L. (1997). Validity of the Childhood Trauma Ques-tionnaire in an adolescent psychiatric population.Journal of the American Academy of Child and AdolescentPsychiatry, 36,340–348.

Bifulco, A., Brown, G., & Harris, T. (1994). Child Experience of Care and Abuse (CECA): A retrospective interviewmeasure.Journal of Child Psychology & Psychiatry & Allied Disciplines, 35,1419–1435.

Brewin, C. R., Andrews, B., & Gotlib, I. H. (1993). Psychopathology and early experience: A reappraisal ofretrospective reports.Psychological Bulletin, 113,82–98.

Briere, J. (1992). Methodological issues in the study of sexual abuse effects.Journal of Consulting and ClinicalPsychology, 60,196–203.

Briere, J., & Runtz, M. (1988). Multivariate correlates of childhood psychological and physical maltreatment amonguniversity women.Child Abuse & Neglect, 12,331–341.

Briere, J., & Zaidi, L. Y. (1989). Sexual abuse histories and sequelae in female psychiatric emergency room patients.American Journal of Psychiatry, 146,1602–1606.

Browne, W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.),Testing structural equation models(pp. 136–162). Newbury Park, CA: Sage.

Byrne, B. M. (1994).Structural equation modeling with EQS and EQS/Windows. Thousand Oaks, CA: Sage.Byrne, B. M., Shavelson, R. J., & Muthén, B. (1989). Testing for the equivalence of factor covariance and mean

structures: The issue of partial measurement invariance.Psychological Bulletin, 105,456–466.Cahill, L., Prins, B., Weber, M., & McGaugh, J. L. (1994). Beta-adrenergic activation and memory for emotional

events.Nature, 371,702–704.Cavaiola, A., & Schiff, M. (1988). Behavioral sequelae of physical and/or sexual abuse in adolescents.Child Abuse

& Neglect, 12,181–188.Chou, C. P., & Bentler, P. M. (1990). Model modification in covariance structure modeling: A comparison among

likelihood ratio, Lagrange Multiplier, and Wald tests.Multivariate Behavioral Research, 25,115–136.Claussen, A., & Crittenden, P. (1991). Physical and psychological maltreatment: Relations among types of mal-

treatment.Child Abuse & Neglect, 15,5–18.Crouch, J., & Milner, J. (1993). Effects of child neglect on children.Criminal Justice and Behavior, 20,49–65.

Dill, D. L., Chu, J. A., Grob, M. C., & Eisen, S. V. (1991). The reliability of abuse history reports: A comparison of two inquiry formats. Comprehensive Psychiatry, 32, 166–169.

Ditomasso, M. (1995). Remembering development and validation of an instrument to measure adults’ recall of maltreatment in childhood. Unpublished doctoral dissertation.

Fink, L., Bernstein, D. P., Handelsman, L., Foote, J., & Lovejoy, M. (1995). Initial reliability and validity of the Childhood Trauma Interview: A new multidimensional measure of childhood interpersonal trauma. American Journal of Psychiatry, 152, 1329–1335.

Finkelhor, D. (1994). Current information on the scope and nature of child sexual abuse. Sexual Abuse of Children, 4, 31–53.

Gallagher, R. E., Flye, B. L., Hurt, S. W., Stone, M. H., & Hull, J. W. (1992). Retrospective assessment of traumatic experiences (RATE). Journal of Personality Disorders, 36, 99–108.

Herman, J. L., Perry, J. C., & van der Kolk, B. A. (1989). Childhood trauma in borderline personality disorder. American Journal of Psychiatry, 146, 490–495.

Hoyle, R. H., & Smith, G. T. (1994). Formulating clinical research hypotheses as structural equation models: A conceptual overview. Journal of Consulting and Clinical Psychology, 62, 429–440.

Kendall-Tackett, K. A., Meyer Williams, L., & Finkelhor, D. (1993). Impact of sexual abuse on children: A review and synthesis of recent empirical studies. Psychological Bulletin, 113, 164–180.

Knutson, J. F. (1995). Psychological characteristics of maltreated children: Putative risk factors and consequences. Annual Review of Psychology, 46, 401–431.

Kroll, P., Stock, D., & James, M. (1985). The behavior of adult alcoholic men abused as children. Journal of Nervous and Mental Disease, 173, 689–693.

Lipschitz, D. S., Bernstein, D. P., Winegar, R. K., & Southwick, S. M. (1999). Hospitalized adolescents’ reports of sexual and physical abuse: A comparison of two self-report measures. Journal of Traumatic Stress, 12, 641–654.

Loftus, E. F. (1993). The reality of repressed memories. American Psychologist, 48, 518–537.

Malinosky-Rummell, R., & Hansen, D. J. (1993). Long-term consequences of childhood physical abuse. Psychological Bulletin, 114, 68–79.

Meyer, I., Muenzenmaier, K., Cancienne, J., & Struening, E. (1996). Reliability and validity of a measure of sexual and physical abuse histories among women with serious mental illness. Child Abuse & Neglect, 29, 213–219.

Newcomb, M. D. (1994). Drug use and intimate relationships among women and men: Separating specific from general effects in prospective data using structural equations models. Journal of Consulting and Clinical Psychology, 62, 463–476.

Newcomb, M. D. (1997). General deviance and psychological distress: Impact of family support/bonding over 12 years from adolescence to adulthood. Criminal Behaviour and Mental Health, 7, 369–400.

Newcomb, M. D., & Bentler, P. M. (1988). Consequences of adolescent drug use: Impact on the lives of young adults. Beverly Hills, CA: Sage.

Rogers, M. (1995). Factors influencing recall of childhood sexual abuse. Journal of Traumatic Stress, 8, 691–716.

Rosenberg, M. S. (1987). New directions for research on the psychological maltreatment of children. American Psychologist, 42, 166–171.

Sanders, B., & Becker-Lausen, E. (1995). The measurement of psychological maltreatment: Early data on the Child Abuse and Trauma Scale. Child Abuse & Neglect, 19, 315–323.

Sanders, B., & Giolas, M. (1991). Dissociation and childhood trauma in psychologically disturbed adolescents. American Journal of Psychiatry, 148, 50–53.

Sansonnet-Hayden, H., Haley, G., Marriage, K., & Fine, S. (1987). Sexual abuse and psychopathology in hospitalized adolescents. Journal of the American Academy of Child and Adolescent Psychiatry, 26, 753–757.

Schaefer, M., Sobieraj, K., & Hollyfield, R. (1988). Prevalence of childhood physical abuse in adult male veteran alcoholics. Child Abuse & Neglect, 12, 141–149.

Scheier, L. M., & Newcomb, M. D. (1993). Multiple dimensions of affective and cognitive disturbance: Latent variable models in a community sample. Psychological Assessment: A Journal of Consulting and Clinical Psychology, 5, 230–234.

Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25, 173–180.

Stein, J. A., & Gelberg, L. (1995). Homeless men and women: Differential associations among substance abuse, psychosocial factors, and severity of homelessness. Experimental and Clinical Psychopharmacology, 3, 75–86.

Stein, J. A., Newcomb, M. D., & Bentler, P. M. (1987). An eight-year study of multiple influences on drug use and drug use consequences. Journal of Personality and Social Psychology, 53, 1094–1105.

Stein, J. A., Newcomb, M. D., & Bentler, P. M. (1993). Differential effects of parent and grandparent drug use on behavior problems of male and female children. Developmental Psychology, 29, 31–43.

Straus, M., & Hamby, S. (1997). Measuring physical and psychological maltreatment of children with the Conflict Tactics Scales. In G. Kantor & J. Jasinski (Eds.), Out of darkness: Contemporary perspectives on family violence (pp. 119–135). Thousand Oaks, CA: Sage.

Straus, M., Hamby, S., Finkelhor, D., Moore, D., & Runyan, D. (1998). Identification of child maltreatment with the Parent-Child Conflict Tactics Scales: Development and psychometric data for a national sample of American parents. Child Abuse & Neglect, 22, 249–270.

Walker, E. A., Bernstein, D. P., & Keegan, D. (1997). A comparison of interview and questionnaire methods of assessing childhood interpersonal trauma. Unpublished manuscript.

Zanarini, M. C., Gunderson, J. G., Marino, M. F., Schwarz, E. O., & Frankenburg, F. R. (1989). Childhood experiences of borderline patients. Comprehensive Psychiatry, 30, 18–25.