A Cluster Analysis of Not-in-Treatment Drug Users at Risk for HIV Infection

AM. J. DRUG ALCOHOL ABUSE, 24(2), pp. 199-223 (1998)

Mark L. Williams, Ph.D.*
Zixian Zhao, M.D., Ph.D.
Robert C. Freeman, Ph.D.
William N. Elwood, Ph.D.
Ryan Rusek, B.A.
Robert E. Booth, Ph.D.
Michael L. Dennis, Ph.D.
Dennis G. Fisher, Ph.D.
Fen Rhodes, Ph.D.
Norman L. Weatherby, Ph.D.

NOVA Research Company, Bethesda, Maryland
Comprehensive Drug Research Center, University of Miami, Miami, Florida
University of Colorado Health Sciences Center, Denver, Colorado
Chestnut Health Systems, Bloomington, Illinois
University of Alaska, Anchorage, Anchorage, Alaska
CSULB Center for Behavioral Research and Services, University of California at Long Beach, Long Beach, California

*To whom correspondence and reprint requests should be sent at NOVA Research Company, 4600 East-West Highway, Bethesda, MD 20814.

Copyright © 1998 by Marcel Dekker, Inc.



ABSTRACT

The purpose of the analysis described here was to classify not-in-treatment drug users participating in the National Institute on Drug Abuse (NIDA)-sponsored Cooperative Agreement study into several "homogeneous" HIV risk groups using cluster analysis. Data for this analysis (N = 17,778) were collected at 19 study sites in the United States and Puerto Rico. Measures selected for the cluster analysis were limited to (a) current drug use and HIV risk behaviors, (b) mutually exclusive behaviors, (c) behaviors directly related to HIV risk, and (d) behaviors that were not statistically rare.

Eight homogeneous HIV risk clusters were produced. Crack cocaine use was the most distinguishing feature of three clusters. Another three clusters were distinguishable by drug injection and needle use practices. Two additional clusters could not be grouped with either the crack- or the injection-dominant clusters. Prostitution was the most distinguishing risk behavior of one of these clusters, and extremely high drug injection frequencies and relative rates of risky needle use characterized the other. Composition of the clusters varied significantly by gender, race/ethnicity, educational attainment, and drug use characteristics. In addition, perceptions and behaviors initiated to reduce the chances of becoming infected with HIV varied by cluster. Subjects in the crack-predominant clusters reported low perceptions of the chances of getting AIDS. Perceptions of the chances of becoming infected with HIV among subjects in the injection-predominant clusters were strongly related to injection frequency. Seroprevalence was also related to cluster. Higher rates of HIV infection were evident among the injection-predominant clusters, and higher rates were related to frequency of injection and the rate of risky needle use. Among the crack-predominant clusters, the relationship between drug use and sexual behaviors and HIV infection was less clear.

INTRODUCTION

The AIDS epidemic has undergone a major demographic shift in the last few years. The Centers for Disease Control (1-3) has estimated that nearly three-quarters of Americans who become infected with HIV will be infected either directly or indirectly because of illicit drug use. The increasing relevance of drug use as a cofactor of HIV infection has resulted in a number of research programs sponsored by the National Institute on Drug Abuse (NIDA) that have sought to reduce the impact of the epidemic and to deal with its consequences. One avenue of research supported by NIDA has been the development and testing of community-based HIV risk reduction interventions for drug users who are not in drug treatment (4, 5). The Cooperative Agreement for AIDS Community-Based Outreach/Intervention Research program (the Cooperative Agreement) was one of the largest research projects involving the health concerns of drug users ever undertaken by the U.S. government. The goal of the Cooperative Agreement was to implement a number of research projects to evaluate the efficacy of community-based interventions aimed at preventing the further spread of HIV infection among not-in-treatment drug users, especially injection drug users (IDUs) and crack cocaine smokers. To accomplish this rather broad goal, the Cooperative Agreement established three interrelated program objectives: (a) to create a system for monitoring the nature and extent of drug use and HIV-related risk-taking behaviors among not-in-treatment drug users, (b) to assess the efficacy of HIV counseling and testing in reducing drug and sexual risk-taking behaviors, and (c) to develop and test more elaborate HIV-risk reduction educational interventions.

Investigators at the selected research sites constructed a sample of not-in-treatment drug users employing variants of targeted sampling that have been developed to capture "hidden and elusive" populations (6, 7). Realizing that snowball samples may undersample some types of drug users at risk for HIV infection, investigators also implemented several sampling quotas to ensure adequate numbers of racial/ethnic minorities and women in the sample. However, the sampling procedures adopted by the study and the differences that exist among drug users in different cities resulted in a sample that was heterogeneous with respect to some measures important to the analysis of study outcomes. Sample heterogeneity is common to many community-based samples and presents particular challenges with regard to the evaluation of project objectives (8, 9).

A number of data reduction or classification strategies have been used to reduce the heterogeneity found in samples of not-in-treatment drug users. Among the classification criteria that have been used are subjects' demographic and drug use characteristics (10, 11), geographical location in which subjects reside (12, 13), HIV seroprevalence (2, 3), stages of change with regard to specific prevention behaviors (14), and other combinations of personal characteristics (15). Although these classification schemes provide the basis of instructive and insightful analyses, the subsamples that result from these methods are still quite heterogeneous in terms of HIV risk. Most not-in-treatment drug users are likely to exhibit multiple risk behaviors that cannot be summarized by a single or limited number of risk descriptors. For instance, any particular subject may be an IDU, have a history of trading sex for drugs, and have unprotected sexual encounters with multiple partners. What is needed is a multivariate classification scheme that provides a way to reduce the heterogeneity of a large community-based sample by grouping subjects into relatively "natural" clusters based on HIV risk behaviors.

The purpose of the analysis presented here was to classify not-in-treatment drug users participating in the Cooperative Agreement study into several homogeneous groups, based on drug use and sexual risk behaviors for HIV, using multivariate cluster analysis. Multivariate cluster analysis has been employed in other community-based research projects to classify adolescent alcohol and drug users (16), opiate addicts (17), and the mentally ill homeless (16). Cluster analysis also has been used to distinguish high-risk youths (18) and adults (19) from the general population. When used with the Cooperative Agreement sample, the result of the cluster analysis is subgroups of not-in-treatment drug users that are homogeneous with regard to level of drug use and sexual HIV risk. Because multivariate cluster analysis can deal with multidimensional drug use and HIV-related risky behaviors, the resultant classification is especially useful to AIDS researchers and prevention practitioners.

METHODS

Sampling procedures (6, 20), study procedures (21, 22), and data (22, 23) used by the Cooperative Agreement study have been explained in greater detail elsewhere and in this volume.

Subjects

A data set containing 17,778 cases was available for the cluster analysis. Approximately 70% of the participants were male, and the average age of subjects was 33.3 years. Over half (56%) were African-American, slightly more than one-fifth (22%) were Hispanic, and slightly less than one-fifth (18%) were white. The overwhelming majority (94%) of study participants self-identified as heterosexual. About two-thirds (65%) of subjects reported being unemployed at the time of the interview, about 20% reported being employed either full- or part-time, and more than one-tenth (13%) reported being disabled. The majority (54%) of participants reported less than $500 in monthly income.

The average number of times participants reported smoking crack in the 30 days before being interviewed was approximately 53. The average number of times participants reported injecting drugs in the 30 days prior to the interview was approximately 51. The most commonly injected drug used in the previous 30 days was heroin (an average of 23 times), followed by cocaine (17 times) and speedball (14 times). Participants reported using needles that had been used by someone else an average of 7 times in the previous 30 days. Respondents reported an average of three sexual partners in the 30 days prior to the interview.

Risk Measures Used for the Cluster Analysis

In conducting the cluster analysis, it was impossible and unnecessary to include all potentially relevant measures. Measures were selected for inclusion in the analysis based on four criteria. First, measures had to be of current behaviors; lifetime measures were not included. Second, redundant measures were excluded. Since variables related to the number of times a particular behavior was engaged in provided more information than variables related to the number of days a behavior was engaged in, the former measures were used. Also, for some risk measures the relative value or proportional rate was calculated. Proportional rates represent a different dimension of risk without repeating information represented by numeric measures. Third, all measures used in the cluster analysis were related to HIV risk. Thus, for example, frequency of alcohol use was not included in the analysis because it has not been directly related to risk of HIV infection. Finally, only those HIV risk behaviors that were commonly distributed among study participants were included in the analysis to determine clusters. For example, very few subjects reported using heroin during sex, and so this variable was not included as a measure for clustering participants. Cases containing missing data on measures chosen for the analysis (less than 1% of the total number of cases) were deleted from the analysis.
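The two data-preparation steps described above, computing a proportional rate and deleting cases with missing values, can be illustrated with a minimal Python sketch. The field names, the sample values, and the choice to score a rate of 0 when a subject reports no events are all assumptions made for illustration; the source does not specify them.

```python
from typing import Optional


def proportional_rate(risky_events: Optional[float],
                      total_events: Optional[float]) -> Optional[float]:
    """Share of events that were risky, e.g. injections made with a used needle."""
    if risky_events is None or total_events is None:
        return None  # a missing input yields a missing rate
    if total_events == 0:
        return 0.0   # assumption: no events means no risky events
    return risky_events / total_events


def drop_incomplete(cases: list, measures: list) -> list:
    """Listwise deletion: keep only cases with a value for every chosen measure."""
    return [c for c in cases if all(c.get(m) is not None for m in measures)]


# Hypothetical 30-day counts for three subjects (not study data).
cases = [
    {"injections": 40, "used_needle_injections": 10},
    {"injections": 0, "used_needle_injections": 0},
    {"injections": 25, "used_needle_injections": None},  # missing value
]

complete = drop_incomplete(cases, ["injections", "used_needle_injections"])
rates = [proportional_rate(c["used_needle_injections"], c["injections"])
         for c in complete]
```

The rate carries information the raw count does not: a subject with 10 risky injections out of 40 and one with 10 out of 400 have the same count but very different relative risk.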

Measures selected for the cluster analysis were not standardized. The literature is unclear as to whether measures should be standardized. Several scholars have compared the differences between the use of standardized versus nonstandardized variables in generating clusters (24-26). Differences in cluster solutions obtained using standardized and nonstandardized measures usually are small. However, standardization to unit variance and a mean of zero can reduce the differences between cases for those variables that are the best discriminators of between-group differences (27). Similarities between cluster solutions using standardized and nonstandardized measures were compared for this analysis, and while both solutions were comparable, the most satisfying cluster solution was obtained using nonstandardized values.
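The effect of standardization on between-case distances can be seen in a small sketch. The two variables and their values below are hypothetical, and squared Euclidean distance is assumed as the dissimilarity measure; the point is only to show how rescaling to mean zero and unit variance shrinks the contribution of a high-variance measure.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to mean 0 and unit (population) variance."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0] * len(values)
    return [(v - m) / s for v in values]


def sq_distance(a, b):
    """Squared Euclidean distance between two cases."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical 30-day counts for five subjects (not study data).
crack_uses = [0, 10, 20, 30, 600]
sex_partners = [1, 2, 3, 4, 5]

raw = list(zip(crack_uses, sex_partners))
scaled = list(zip(standardize(crack_uses), standardize(sex_partners)))

# Fraction of the squared distance between the first and last subject
# that is contributed by crack use, before and after standardization.
raw_share = (raw[0][0] - raw[4][0]) ** 2 / sq_distance(raw[0], raw[4])
scaled_share = (scaled[0][0] - scaled[4][0]) ** 2 / sq_distance(scaled[0], scaled[4])
```

In this toy example crack use accounts for nearly all of the raw distance but less than half of the standardized distance, which is one way a strong discriminator can be muted by standardization.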

In addition to conducting the cluster analyses using both standardized and nonstandardized variables, analyses were performed using log transformations of the variables. Risk data are highly skewed toward lower risk across all drug use and sexual risk measures. Consequently, none of the measures approximated a normal distribution. However, log transformations were not used for the analysis presented in this study for two reasons. First, the large size of the sample provides a great deal of latitude in using skewed variables for the cluster analysis. Second, in comparing the clusters obtained using measures derived both from raw data and from transformed data, differences between cluster solutions were minimal. Given that results obtained using raw data were more easily interpreted, the decision was made to compare clusters obtained using measures that had not been log-transformed.
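As an illustration of what a log transformation does to skewed count data, the following sketch compares skewness before and after transforming. The form log(1 + x) is an assumption chosen here because many subjects would report a count of zero; the source does not say which transformation was applied, and the counts below are invented.

```python
import math


def log_transform(values):
    """log(1 + x), so that a count of zero maps to zero."""
    return [math.log1p(v) for v in values]


def skewness(values):
    """Population skewness: the third standardized moment."""
    n = len(values)
    m = sum(values) / n
    variance = sum((v - m) ** 2 for v in values) / n
    third_moment = sum((v - m) ** 3 for v in values) / n
    return third_moment / variance ** 1.5


# Hypothetical 30-day injection counts: most subjects low, a few very high.
injections = [0, 1, 2, 3, 5, 8, 20, 60, 300]

raw_skew = skewness(injections)
log_skew = skewness(log_transform(injections))
```

The transformed values are less skewed, but they are also harder to read as "times in the past 30 days", which matches the interpretability concern raised above.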

Analysis

Because of the large size of the sample, a two-step cluster procedure was used to generate a solution. First, a SAS procedure, FASTCLUS (28), was used to generate a manageable number of clusters. This procedure is designed for disjoint clustering of very large data sets and can create clusters with a minimum of iterations. With the use of this procedure, a preliminary solution of 40 clusters was generated. The second step in the two-step procedure was comparable to cluster analysis with small data sets. Different clustering methods often generate different solutions using the same data (25, 29). Although several different methods of generating a cluster solution were tried, Ward's (30) minimum variance method provided the most satisfying cluster solution. This method, which produces clusters by minimizing the distance between cases within clusters, has been widely used in sociology (31) and the behavioral sciences (32). With this procedure, all cases used in the analysis could be grouped into eight stable clusters. The sample sizes of the clusters were reasonable, and differences in risk behaviors among the clusters could be clearly distinguished. The eight-cluster solution accounted for over 93% of the total variance in the sample.
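The general shape of a two-step procedure like the one described above can be sketched in plain Python. This is not the SAS implementation: a simple k-means loop stands in for FASTCLUS, the seeds are taken at even intervals from the data (an assumption), and the second step merges the preliminary clusters with Ward's criterion. The data are synthetic, and the numbers of preliminary and final clusters (9 and 3) are scaled down from the 40 and 8 reported in the text.

```python
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, iterations=10):
    """Step 1: fast disjoint partition into up to k preliminary clusters."""
    step = max(1, len(points) // k)
    centers = [points[i * step] for i in range(k)]  # evenly spaced seeds
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        centers = [centroid(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return [g for g in groups if g]  # drop any empty cluster


def ward_merge(groups, target):
    """Step 2: merge the pair whose union adds the least within-cluster
    sum of squares (Ward's criterion) until `target` clusters remain."""
    groups = [list(g) for g in groups]
    while len(groups) > target:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                ni, nj = len(groups[i]), len(groups[j])
                cost = (ni * nj / (ni + nj)
                        * sq_dist(centroid(groups[i]), centroid(groups[j])))
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        groups[i].extend(groups.pop(j))
    return groups


def variance_explained(groups):
    """R^2 = 1 - (within-cluster sum of squares / total sum of squares)."""
    all_points = [p for g in groups for p in g]
    grand = centroid(all_points)
    total = sum(sq_dist(p, grand) for p in all_points)
    within = sum(sq_dist(p, centroid(g)) for g in groups for p in g)
    return 1 - within / total


# Synthetic data: three well-separated groups of 40 cases on two measures.
rng = random.Random(1)
points = [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
          for cx, cy in [(0, 0), (50, 0), (0, 50)] for _ in range(40)]

preliminary = kmeans(points, k=9)
final = ward_merge(preliminary, target=3)
r2 = variance_explained(final)
```

The first step keeps the expensive pairwise search in the second step small: Ward's method only has to compare a handful of preliminary clusters rather than every pair of cases. The `variance_explained` helper shows one common way a figure such as "over 93% of the total variance" could be computed, though the source does not state the exact formula used.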

The validity of the cluster solution was confirmed using three methods. The first involved examining distributions of subjects on measures that were not used for generating the cluster solution (8, 33, 34), and the distribution of measures appeared to be consistent with expectations. Second, clusters were replicated using randomly derived data sets (34). The large data set of over 17,000 cases was divided into three mutually exclusive, randomly generated subsamples, two subsamples each containing one-quarter of the subjects and one containing one-half of the subjects. Cluster solutions were generated using these subsamples, and the solutions were compared by examining distributions of subjects among clusters and comparing subset centroids. The eight-cluster solution was found to be reliably replicated using all subsets. Results are shown in Appendix A. Finally, a canonical discriminant analysis was used to generate a three-dimensional plot of the eight-cluster solution. The plot of the derived clusters showed that the clusters were clearly distinguishable in three-dimensional space. Results are shown in Appendix B. Differences in characteristics among the derived clusters were tested using analysis of variance and chi-square tests.
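The split-sample replication described above starts from a random partition of the cases into one-quarter, one-quarter, and one-half subsamples. A minimal sketch of such a partition follows; the function name, the seed, and the use of 1,000 placeholder case IDs are assumptions for illustration only.

```python
import random


def split_quarter_quarter_half(case_ids, seed=0):
    """Randomly partition cases into three mutually exclusive subsamples
    holding roughly 1/4, 1/4, and 1/2 of the cases."""
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    quarter = len(shuffled) // 4
    return (shuffled[:quarter],
            shuffled[quarter:2 * quarter],
            shuffled[2 * quarter:])


# Placeholder case identifiers, not the study's data.
first, second, half = split_quarter_quarter_half(range(1000))
```

Each subsample would then be clustered separately, and the resulting cluster sizes and centroids compared across subsamples to judge whether the same solution reappears.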

RESULTS

Drug Use, Needle Use, and Sexual HIV Risk Behaviors

As shown in Table 1, eight HIV risk clusters were generated using the full data set. Six of these clusters can be loosely grouped into two broad risk categories based on the HIV risk behaviors that most distinguish members of a cluster from those of other clusters: crack-predominant clusters and drug injection-predominant clusters. Within both the crack smoking- and drug injection-predominant clusters, the clusters reflect levels of usage. It is interesting to note that differences between the crack- and injection-predominant clusters are due almost totally to drug and needle use behaviors. With few notable exceptions, there are only minor differences in sexual behaviors among the crack smoking- and injection-predominant clusters. Subjects in the remaining two clusters exhibited risk behavior patterns sufficiently different from those of the other clusters that they could not easily be classified into either of the two broad categories.

Crack use was the predominant identifier in three clusters and accounted for 3,170 subjects (approximately 18% of the sample). Crack smokers categorized in cluster 1 reported smoking crack an average of 581 times in the 30 days prior to being interviewed. Clusters 2 and 3 reported less frequent, but still substantial, levels of crack use. Subjects in clusters 2 and 3 reported smoking crack an average of 287 and 134 times, respectively. Although individuals in clusters 1, 2, and 3 reported some use of injection drugs, the reported frequency of injection drug use is not high relative to that for the other clusters. Furthermore, only a small percentage of these individuals' injections involve risk of HIV transmission. The current HIV risks of clusters 1, 2, and 3 are related almost solely to sexual behavior and to the trading of sex for money or drugs. The average numbers of sexual partners reported by persons in clusters 1 and 3 were significantly higher than those reported by all other clusters except cluster 7. Individuals in the crack clusters reported significantly more sex trading for money or drugs than those in the drug injection clusters.

Injection drug use was the predominant risk feature of clusters 4, 5, and 6. These clusters were composed of 14,542 subjects, or 82% of study participants.

[Table 1 not reproduced: its contents are illegible in the source scan.]

The number of times subjects reported injecting and the predominant drug injected differ among clusters. Subjects in cluster 4 were predominantly cocaine and speedball injectors. In addition, these subjects reported very high injection frequencies (an average of 111 cocaine injections, 50 heroin injections, and 105 speedball injections in the month prior to the interview). Both clusters 5 and 6 were predominantly heroin injectors. Subjects in these clusters reported injecting heroin an average of 29 and 17.6 times, respectively, during the 30 days before the interview. There were significant differences among the injection-predominant clusters with regard to injection risk behaviors. Cluster 5 subjects were at the highest risk based on injection with used needles and injection in HIV-risky settings, but they also had significantly lower rates of injecting in a shooting gallery. Among the injection-predominant clusters, subjects grouped in cluster 6 had the lowest levels of risk related to injection practices. There were few differences among the injection-predominant clusters in terms of HIV sexual risk behaviors, although subjects in cluster 5 reported a proportionally higher rate of sexual partners who injected drugs (72.0%) than subjects in the other clusters.

Cluster 7 is best characterized as a prostitution-predominant cluster, for its distinguishing feature is the high rate of trading sex for money or drugs. Subjects reported an average of 252 sexual partners in the previous 30 days, including an average of 179 instances of vaginal sex and 11.5 instances of anal sex. Subjects reported using crack during more than one-third (36.7%) of their sexual encounters. The extremely high number of reported sexual acts is largely related to the frequency of sex trading for money (an average of 287.5 times in the past 30 days). The reported frequency of trading sex for drugs also was significantly higher than that reported by individuals in any other cluster. Despite their high number of sexual acts, individuals in cluster 7 also reported a fairly high level of condom use during vaginal sex. In only 24% of the total times that cluster 7 individuals engaged in vaginal sex did they fail to use condoms.

Although prostitution is the distinguishing feature of cluster 7, subjects also reported smoking crack an average of 294.8 times, injecting cocaine 61.1 times, injecting heroin 84.8 times, and injecting speedball 57.9 times in the 30-day period prior to the baseline interview. These subjects also reported proportionally high rates of injecting with used needles (28.3% of injections). Proportional rates of injecting in a shooting gallery were low (7.2% of injections).

The HIV risk of cluster 8 individuals is almost wholly due to drug injection. Because these subjects reported injecting cocaine almost 400 times, heroin 270 times, and speedball 230 times in the past 30 days, cluster 8 can be characterized as consisting of predominantly high-frequency injectors. The frequency of injecting with a used needle also was very high (an average of 177.6 times); 27% of injections took place in a shooting gallery, and more than 10% of injections were with a needle that was shared with a stranger. While individuals in cluster 8 exhibited extremely high drug injection frequencies and needle use risk, their sexual risk profile was not significantly different from that of individuals in the crack- and injection-predominant clusters.

Demographic Differences Among Clusters

Sociodemographic characteristics differed significantly among the eight clusters, as shown in Table 2. Distribution of subjects by gender tended to be grouped around the HIV risk behavior distinctions discussed above. The crack-predominant clusters are composed of comparatively high proportions of women, and the prostitution-predominant cluster is almost entirely women. The drug injection clusters and the high-frequency injection cluster 8, on the other hand, tend to be composed of relatively higher numbers of men. Subjects are fairly evenly distributed by age among the clusters, except for clusters 7 and 8, which have significantly more subjects between the ages of 18 and 29 than the other clusters. The crack-predominant clusters tend to have higher proportions of African-Americans, while the injection-predominant clusters (except for cluster 6) tend to have more Hispanics and whites. Clusters 7 and 8 are composed of substantially more Hispanics and fewer African-Americans than are the crack-predominant clusters. A similar pattern was apparent with regard to self-identified sexual orientation. Almost one-third of subjects in the prostitution-predominant cluster self-identified as bisexual. Higher relative numbers of subjects in the crack-predominant clusters and in cluster 8 self-identified as bisexual or homosexual than in the injection-predominant clusters. Differences in educational level appeared to be distributed according to frequency of drug use, with the clusters reporting the highest levels of crack smoking or drug injection (except for cluster 7) composed of larger proportions of subjects who had less than a high school education.

Except for clusters 7 and 8, where over 90% of subjects reported being unemployed at the time of the interview, employment showed no remarkable differences among clusters. Only cluster 6 subjects reported a significant level of employment. Slightly less than half (45.4%) of subjects in cluster 4, the cluster reporting the highest frequency of drug injection, reported income from an illegal source in the 30 days before being interviewed. All subjects in cluster 7 reported income from prostitution, but almost one-fifth of subjects in the crack-predominant clusters also reported income from prostitution.

[Table 2 not reproduced: its contents are illegible in the source scan.]

Perception of Risk, Activities to Reduce Risk, and HIV Serostatus

As shown in Table 3, subjects in the crack-predominant clusters had relatively similar perceptions of their chances of acquiring AIDS. Among subjects in the injection-predominant clusters, however, perceptions appear to be strongly related to frequency of injection. Slightly less than half of subjects in cluster 5 perceived their chances of getting AIDS to be greater than 50%, while only about one-quarter of cluster 6 subjects perceived that their chances of acquiring AIDS were more than 50%. The largest proportions of subjects who perceived that their chances of getting AIDS were greater than 50% were in clusters 7 and 8, where more than half perceived that their chances of acquiring AIDS were 50% or greater.

The distribution of reported recent activities to reduce the chances of acquiring AIDS differed greatly among clusters. Appropriately, reported behavior changes among subjects in the crack-predominant clusters were related to sexual activities, as only about 10% of subjects in clusters 1 through 3 reported changing their drug injection practices. Changes in risk behaviors among subjects in the injection-predominant clusters involved both injection and sexual behaviors. Cluster 6 subjects were less likely than cluster 4 or 5 subjects to report making changes in needle use behaviors, but almost half of cluster 6 subjects reported making some type of change in sexual behaviors. Cluster 4 subjects were the least likely to report a reduction in number of sexual partners and initiation of condom use. About one-fifth of subjects in the prostitution-predominant cluster 7 reported cutting back on needle use or initiating needle cleaning with bleach. Although these subjects reported little reduction in number of sexual partners, almost half reported initiating condom use.

The results of the HIV antibody tests tended to follow drug use frequencies. Among the injection-predominant clusters, cluster 4 had the highest seroprevalence rate (28.5%), while the cluster with the highest crack smoking frequencies (cluster 1) also reported the highest seroprevalence rate (21.8%) among the crack-predominant clusters. It is interesting to note that the crack-predominant clusters 2 and 3 had higher seroprevalence rates (17.3% and 13.1%, respectively) than the injection-predominant clusters 5 and 6 (7.3% and 8.5%, respectively). Overall, the highest rate of HIV infection (29.1%) was found in cluster 8, the high-frequency drug injection cluster. Subjects in the prostitution-predominant cluster had a comparatively low rate of infection (S.5%).

[Table 3 not reproduced: its contents are illegible in the source scan.]

DISCUSSION

This article presented the results of a cluster analysis of a large community-based sample of not-in-treatment drug users at risk for HIV infection. The analysis showed that drug users can be clustered into eight relatively homogeneous HIV risk groups. Crack cocaine use was the most distinguishing feature of three of these clusters. Distinctions among the crack-predominant clusters were limited primarily to differences in frequency of use. Three injection drug use-predominant clusters were also produced. Differences among the injection-dominant clusters were related to frequency of injection, drug of choice, and needle use practices. Two additional clusters were identified but could not be grouped with either the crack- or injection-dominant clusters. A high frequency of sex trading for money or for drugs was the most distinguishing risk behavior of one of these clusters, and extremely high drug injection frequency was the HIV risk behavior that characterized the other.

Risk of HIV infection, as represented by the clusters, varied significantly by gender, race/ethnicity, and drug use characteristics, and these differences were related to the cluster distributions. The crack-predominant clusters were composed of comparatively high proportions of women, while the drug injection clusters and the high-frequency injection cluster tended to be composed of proportionally higher numbers of men. The crack-predominant clusters had relatively higher numbers of African-Americans, while the injection-dominant clusters generally were proportionally more Hispanic and white. Educational differences tended to reflect drug use frequency, with those clusters reporting the highest levels of crack smoking or drug injection generally having less than a high school education.

Subjects in the crack-predominant clusters reported similar perceptions of their chances of getting AIDS, while among subjects in the injection-predominant clusters such perceptions appeared to be strongly related to injection frequency (i.e., those who injected more frequently perceived that they had a greater chance of acquiring AIDS). Subjects in the crack-dominant clusters were more likely to report recent changes in their sexual activities as compared to their injection practices, while behavior changes among subjects in the injection-dominant clusters tended to involve both injection practices and sexual habits, with the relative degree and type of behavior change related to risk practice frequency. Hence, compared to subjects in the relatively high-frequency injection clusters 4 and 5, subjects in the relatively low-frequency injection cluster 6 were less likely to report reductions in needle risk. But almost half of cluster 6 subjects reported making some type of change in sexual behavior. Conversely, cluster 4 subjects reported the least reduction in number of sexual partners and were the least likely to have initiated condom use. Although there was little reported reduction in number of sexual partners among subjects in the prostitution-dominant cluster, almost half reported initiating condom use.

HIV seroprevalence was clearly associated with risk cluster. Subjects in cluster 8, having the highest frequency of injection and rates of injecting with used needles in a shooting gallery or with strangers, had the highest rates of HIV infection. The highest level of HIV infection among the injection-dominant clusters was found in cluster 4, which had the highest injection frequencies. Among the injection-dominant clusters, injection in a shooting gallery was (in addition to high frequency of injection) the most striking characteristic of the clusters with highest HIV seroprevalence. As for the crack-predominant clusters, the relationship among drug use, sexual behaviors, and HIV infection was less clear. Nevertheless, cluster 1, with the highest crack use frequencies, had the highest HIV seroprevalence rate among the three crack-predominant clusters. Crack-predominant clusters 2 and 3 had higher seroprevalence rates than the injection-predominant clusters 5 and 6. This result is not totally unexpected. Since the clusters were constructed using only current indicators of drug use and HIV risk behaviors, it is possible that the higher rate of HIV infection among the crack-predominant clusters is related to past drug injection history. An equally plausible explanation is that the drug use and social networks of those grouped in injection-dominant clusters 5 and 6 tend to prohibit the easy transmission of HIV. The strong relationship between race/ethnicity and risk clusters supports a network explanation for the high rates of HIV infection among the crack-predominant clusters (35).

The small number of cases in the prostitution-predominant and very high-frequency injection clusters renders problematic any conclusions regarding their risk behaviors. Because the HIV risk behaviors of subjects categorized in clusters 7 and 8 were so different compared to those in the other clusters, they were treated as distinct groups. Despite the limited number of cases, these small clusters may represent two of the more interesting at-risk groups among not-in-treatment drug users. Subjects grouped in cluster 8, very high-frequency drug injectors, may represent injectors whose drug use is so extreme that they have little or no control over their drug and needle use behaviors. In addition to the extremely high rate of injections reported by these subjects, injectors in cluster 8 reported high relative rates of the most risky needle use behaviors. Although individuals in cluster 8 could accurately rate their chances of HIV infection, few reported taking any steps to reduce their needle use or sexual risks. Consequently, given that those assigned to cluster 8 may have very little control over their drug use and the risks they incur while looking for or using drugs, individual level interventions may not provide an appropriate means for intervening with this at-risk group. Rather, systemic or macro-level interventions aimed at reducing the risks associated with injecting in shooting galleries or with others, such as those suggested by McCoy et al. (36), may be the most appropriate risk reduction interventions for this group.

Cluster 7, the prostitution-predominant cluster, represents a group of women who, because they are engaged in prostitution, are clearly different from subjects in other at-risk clusters. Furthermore, the data show that these women are at high risk because of both their needle use and their sexual practices. The subjects assigned to cluster 7 report relative rates of sharing used needles in a shooting gallery, in a risky setting, and with strangers that are not significantly different from those assigned to the injection-predominant clusters. There is also evidence of HIV risk from behaviors associated with smoking crack cocaine. The data support the conclusion that a high percentage of sexual behaviors and a fairly substantial number of sexual encounters are engaged in while subjects are either seeking crack or under the influence of crack. The women in cluster 7 were very aware of their risk of infection. They also reported the highest relative rates of initiating condom use to reduce risk and actual condom use of any of the clusters.

The findings presented in this report are limited in several ways. The behaviors measured here are based on self-report. Although self-report measures of HIV risk behaviors have been shown to be both reliable and valid, there can be no expectation that reports of behaviors, especially at the extremes, represent more than estimates on the part of some study participants (22). Moreover, the sample used for the study is not a statistically representative sample. There is no list of not-in-treatment IDUs from which a random sample can be generated. The sample used for this study is a function of the research designs, recruitment strategies, and sampling quotas employed by the individual project sites to recruit study subjects.

Taken together, these design elements tend to limit the participation of some types of drug users, especially the homeless, those heavily engaged in prostitution, and those without a stable residence. Nevertheless, the massive size of the sample used for this investigation suggests that the findings can be considered as broadly representative of general patterns and trends, at least as far as clusters 1 through 6 are concerned. The sample sizes of clusters 7 and 8 are so small that the results presented for these clusters are at best only suggestive.

Even given these limitations, the data resulting from this study are informative and highly suggestive of future research and intervention directions. Clearly, the heterogeneity in a large community-based sample of not-in-treatment drug users can be reduced by employing multivariate cluster analysis. The fact that drug users at risk for HIV infection tend to cluster around predominant risk behaviors can provide a mechanism for developing specific risk reduction interventions. For example, risk reduction interventions would differ significantly between drug users in cluster 1, whose primary risk is sexual, and users grouped in cluster 4, whose primary risk is associated with the injection of cocaine and heroin. The eventual formulation of a simple mechanism that would enable intervention specialists to assign newly contacted at-risk drug users to a valid behaviorally oriented at-risk group and implement an intervention specifically designed to match the predominant risk behaviors of that group seems to be an advancement over many current methods for intervening with this population.
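One form such an assignment mechanism might take is a nearest-centroid rule: a newly contacted drug user is placed in the group whose typical risk profile is closest to his or her own. The sketch below is illustrative only. The group labels, the two standardized risk scores, and the centroid values are hypothetical and are not estimates from this study.

```python
import math

# Hypothetical group centroids on two standardized risk scores:
# (injection frequency, crack-related sexual risk).
# These values are illustrative assumptions, NOT centroids from the study.
CENTROIDS = {
    "sexual-risk dominant": (-0.8, 1.2),
    "injection dominant": (1.1, -0.6),
    "low frequency": (-0.5, -0.7),
}


def assign_group(profile, centroids=CENTROIDS):
    """Return the label of the centroid nearest (Euclidean) to the profile."""
    return min(centroids, key=lambda label: math.dist(profile, centroids[label]))


# A hypothetical newly contacted participant with high injection frequency.
new_contact = (0.9, -0.2)
print(assign_group(new_contact))
```

A rule of this kind would only be as valid as the centroids behind it, so in practice the centroids would have to come from a verified cluster solution.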

The use of empirically derived classifications, such as those produced using cluster analysis, will enhance the evaluation of behavior change. Most evaluations of interventions targeting not-in-treatment drug users treat those who participate in evaluation studies as simple groups based on a nonempirical classification scheme or as individuals. Evaluations that treat all drug users at risk for HIV/AIDS in these ways fail for two basic reasons (37). First, the use of such groupings is neither based on nor derived from empirical study. Any HIV risk reduction intervention should be predicated on some knowledge that the interventions planned have a predictable consequence for the group engaged. When not-in-treatment drug users are treated as a single homogeneous group or a small number of nonempirically derived homogeneous groups, there can be no predictable consequences of a planned intervention. Not-in-treatment drug users, as this study amply demonstrates, exhibit extremely heterogeneous HIV risk behaviors. Even the use of multiple groupings, such as drug injectors and crack cocaine smokers, fails to adequately account for this heterogeneity. Therefore, risk reduction interventions cannot adequately predict which behaviors should be targeted for intervention or what behavior change can be expected, if any. Second, if drug users are treated as a single group or a small number of nonempirically derived groups, there is no guarantee that those who are recruited to participate in an intervention need it. The evaluation consequence of including individuals in an intervention who have no need for intervention is to reduce the validity of the evaluation. The intervention may work for those who have need, but because the number of those having need within a study sample may be small, the effect of the intervention is decreased.
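The dilution argument can be made concrete with a small calculation. The sketch below assumes, for illustration, that the observed effect in a mixed sample is simply the average of the effect among those with need and the effect among those without; the function name and the numbers are hypothetical and are not results from this study.

```python
def diluted_effect(effect_in_need, share_in_need, effect_no_need=0.0):
    """Average effect in a sample that mixes participants with and without need.

    effect_in_need: change in the outcome among participants who need the
        intervention (for example, a reduction in a risk behavior).
    share_in_need: proportion of the sample with need, between 0 and 1.
    effect_no_need: change among participants without need (0 by default,
        since they have no risk behavior to reduce).
    """
    if not 0.0 <= share_in_need <= 1.0:
        raise ValueError("share_in_need must be between 0 and 1")
    return share_in_need * effect_in_need + (1.0 - share_in_need) * effect_no_need


# Hypothetical: a 30-point reduction among those with need, who make up
# only 20% of the evaluated sample.
print(round(diluted_effect(0.30, 0.20), 3))
```

Under these assumed values, the sample-wide effect is one-fifth of the effect among those with need, which is the sense in which an effective intervention can appear weak when evaluated on an undifferentiated group.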

The value of grouping participants using cluster analysis in intervention studies extends beyond the obvious distinctions that can be made between, for example, IDUs and crack cocaine smokers. Within these broad groupings, the cluster analysis presented above produced distinct groups that have very different characteristics. For example, injectors classified in cluster 5 reported clearly different needle use behaviors when compared to injectors grouped in cluster 4. While both exhibited high frequencies of drug injection, other drug injection behaviors, such as the drug most often injected, where drugs were used, and with whom drugs were used, were quite different. The sexual risks of each group were also quite distinct. HIV risk reduction interventions developed for members of one cluster of injectors would likely address some of the risks of other clusters of injectors, but not the risks specific to them. Furthermore, assessment of the efficacy of a single intervention targeting IDUs as a single group would likely produce meager results. While reduction of injecting in a shooting gallery is an appropriate intervention and outcome measure for injectors assigned to cluster 4 or 5, it would not likely produce promising results for injectors assigned to cluster 6.

One way of addressing risk differences among drug users in HIV/AIDS interventions has been to tailor an intervention to individual risks mentioned by a participant. While individually tailored interventions have the advantage of individualizing the intervention, analytically they have extreme disadvantages. First, the intervention, because it is individualized, is an unknown entity. Variance in the intervention is dependent on the ability of the interventionist to assess risk and to apply the appropriate risk reduction strategies. Since both assessment and risk reduction strategies are implemented on a case-by-case basis, individuals with exactly the same risk profiles may be assigned very different interventions for any number of reasons. Second, criteria that are appropriate as outcome measures are also largely unknowable. Like the intervention itself, outcome criteria vary on a case-by-case basis and are beyond sophisticated analysis. Therefore, assessment of the efficacy of individualized interventions cannot extend beyond a succeed/fail analysis. Third, from a public health perspective, replication of individually tailored interventions is impossible. Their utility in settings other than the original, with intervention personnel other than those involved in the original study, and with other drug users is limited.

Clustering study participants based on empirically derived distinctions has a number of intervention and evaluation advantages over other methods of creating targeted interventions. The method of clustering used in this study is only one means that can be employed to create clusters of empirically similar participants. There are other methods of clustering drug users participating in HIV risk reduction interventions that may hold promise. For example, stage of change has gained some popularity as a means of empirically grouping participants with regard to a specific risk behavior (38, 39). Clustering intervention participants by stage of change has obvious implications. Risk reduction interventions based on stage of change assume homogeneous clusters of intervention participants. Expected outcomes after participation in an intervention are based on homogeneous stage-of-change clusters. Furthermore, outcome measures used to evaluate the efficacy of a planned stage-of-change intervention assume homogeneous clusters. Whatever method of clustering is chosen, creating homogeneous groups based on some method of clustering study participants should be standard practice in HIV/AIDS intervention research with not-in-treatment drug users.

ACKNOWLEDGMENTS

Support for this study was provided by grants from the Community Research Branch, National Institute on Drug Abuse. Opinions expressed herein are solely those of the authors and do not reflect the opinion of the Community Research Branch or of the National Institute on Drug Abuse.

ENDNOTE

1. The cities selected as Cooperative Agreement sites were Houston, TX; Miami, FL; New York, NY; Philadelphia, PA; Richmond, CA; Anchorage, AK; Flagstaff, AZ; Denver, CO; Detroit, MI; Tucson, AZ; San Juan, PR; New Orleans, LA; Long Beach, CA; Dayton/Columbus, OH; Hartford, CT; Lexington, KY; Washington, DC; and St. Louis, MO.

APPENDIX A

The data presented in Table A1 are the results of split sample verification of the cluster assignments. The original sample was randomly split without duplication of cases into two subsamples of one-quarter of the data and one subsample of approximately 50% of the data. Once these subsamples were generated, verification of cluster assignment was limited to clusters 1 through 6 because of the small number of cases assigned to clusters 7 and 8 (29 and 37 cases, respectively). Participants assigned to clusters 7 and 8 in the data represented in the text were treated as unassigned cases for the split sample analysis. Despite the consistent assignment of participants to clusters, exact replication was not expected. The first comparison of determined clusters between the subsamples was an examination of cluster sizes. As can be seen in the first three columns of Table A1, the distributions of subsample cases among the clusters are comparable. Although the percentages of subjects assigned to a cluster are not exact, they are similar in rank order and in proportion. For example, cluster 6, the largest cluster among the data sets, is consistently the largest rank-ordered cluster with 40.2% of participants in subsample Q, 56.5% of participants in subsample V, and 53.7% of participants in subsample H assigned to this cluster. The second comparison of cluster assignment was based on a correlation of the risk variables used to generate the clusters. The lowest correlation was between samples Q and V regarding assignment of participants to cluster 6. The correlation between assignments to this cluster was r(Q,V) = .78. All other correlations were above .90 for all clusters. Finally, to be of value, the cluster assignments must have been able to consistently predict an outcome variable. For the outcome variable, we chose HIV seroprevalence. Again, seroprevalence rates among study participants are not exact but are generally proportional and rank-comparable among the clusters. For example, individuals assigned to cluster 4 have the highest seroprevalence rate among the clusters. The only clusters showing seroprevalence proportional and rank disagreement are clusters 5 and 6. In subsamples V and H, seroprevalence rates, with cluster 6 showing a higher seroprevalence rate than cluster 5, are consistent in rank with those presented in the main text of this article. The cluster assignment for subsample Q, however, produces a result that is reversed. This result is understandable given the relative closeness of the two seroprevalence rates.

Table A1. Consistency of Cluster Solutions from Split Sample Testing (a)

| Cluster | Participants, Q (%) | Participants, V (%) | Participants, H (%) | r(Q,V) | r(Q,H) | r(V,H) | HIV seroprevalence (b), Q | HIV seroprevalence (b), V | HIV seroprevalence (b), H |
|---|---|---|---|---|---|---|---|---|---|
| Cluster 1 | 1.5 | 1.8 | 1.5 | .93 | .97 | .97 | 15.6 | 20.0 | 24.0 |
| Cluster 2 | 4.9 | 5.8 | 5.3 | .94 | .94 | .90 | 18.9 | 18.8 | 17.8 |
| Cluster 3 | 15.1 | 10.8 | 11.0 | .99 | .93 | .95 | 11.2 | 12.7 | 12.4 |
| Cluster 4 | 4.2 | 4.0 | 5.6 | .98 | .99 | .98 | 21.9 | 27.9 | 23.8 |
| Cluster 5 | 33.1 | 20.5 | 22.3 | .88 | .94 | .95 | 11.6 | 9.3 | 7.8 |
| Cluster 6 | 40.2 | 56.6 | 53.7 | .78 | .92 | .92 | 6.1 | 10.8 | 9.8 |

(a) r = correlation coefficient on canonical risk variable; Q, V, and H represent split samples of 4445, 4445, and 8892, respectively.

(b) HIV seroprevalence rates for those within the sample who were tested.
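The split and the rank comparison described in this appendix can be sketched in a few lines. The code below is a minimal illustration, not the procedure actually run for this study: the function names, the random seed, and the use of a Spearman rank correlation as a summary of "rank order" agreement are assumptions made for the example. The percentages are those reported in Table A1 for clusters 1 through 6, and the correlations reported in Table A1 itself are of a different kind (correlations on the canonical risk variable).

```python
import random


def split_sample(case_ids, seed=0):
    """Randomly partition cases, without duplication, into two quarter
    subsamples (Q, V) and one remaining half subsample (H)."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    quarter = len(ids) // 4
    return ids[:quarter], ids[quarter:2 * quarter], ids[2 * quarter:]


def ranks(values):
    """Rank values from 1 (largest) to n (smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for tie-free data of equal length."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Percentage of participants assigned to clusters 1-6 (Table A1).
DISTRIBUTION = {
    "Q": [1.5, 4.9, 15.1, 4.2, 33.1, 40.2],
    "V": [1.8, 5.8, 10.8, 4.0, 20.5, 56.6],
    "H": [1.5, 5.3, 11.0, 5.6, 22.3, 53.7],
}

if __name__ == "__main__":
    q, v, h = split_sample(range(4445 + 4445 + 8892))
    print(len(q), len(v), len(h))
    for a, b in (("Q", "V"), ("Q", "H"), ("V", "H")):
        print(a, b, round(spearman(DISTRIBUTION[a], DISTRIBUTION[b]), 3))
```

A rank correlation near 1 for a pair of subsamples indicates that the clusters are ordered by size in nearly the same way in both, which is the sense of "rank order" similarity used in the text.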

APPENDIX B

Cluster assignment can be confirmed in three-dimensional space. Assignment of participants to the clusters was plotted using grid numbers of participants and two canonical risk variables to determine assignment. The result was a three-dimensional plot. The size of cluster 6 was so large compared to that of the other clusters that we decided to use only a 50% random sample of participants assigned to cluster 6 to determine the plot. As before, because of the small number of cases assigned to clusters 7 and 8, these clusters were not differentiated in the plot. As can be seen in Figure B1, each of clusters 1 through 6 is represented in a distinct three-dimensional space. Furthermore, the distance, which is a visual representation of the similarity between cluster characteristics, is consistent with findings presented in the main text of this article.

Fig. B1. Plot of cluster assignment by frequency and two canonical variables. (Axes: Frequency, Can V. One, Can V. Two.)

REFERENCES

1. Centers for Disease Control, Update: Trends in AIDS diagnosis and reporting under the expanded surveillance definition of adolescents and adults, MMWR Morbid. Mortal. Wkly Rep. 43:826-831 (1994).

2. Centers for Disease Control, AIDS among racial/ethnic minorities—United States, 1993, MMWR Morbid. Mortal. Wkly Rep. 43:644-647, 653-655 (1994).

3. Centers for Disease Control, Update: AIDS among women—United States, 1994, MMWR Morbid. Mortal. Wkly Rep. 44:81-83 (1995).

4. Schuster, C., and Pickens, R., AIDS and intravenous drug use, NIDA Res. Monogr. 90:1-13.

5. Brown, B., and Beschner, G., Introduction: At risk for AIDS—Injection drug users and their sexual partners, in Handbook on Risk of AIDS: Injection Drug Users and Sexual Partners (B. S. Brown and G. Beschner, eds.), Greenwood Press, Westport, Connecticut, 1993, pp. xi-xxi.

6. Watters, J., and Biernacki, P., Targeted sampling: Options for the study of hidden populations, Soc. Probl. 36:416-430 (1989).

7. Birkel, R., Golaszewski, T., Roman, J., et al., Findings from the Horizontes Acquired Immune Deficiency Syndrome Education Project: The impact of indigenous outreach workers as change agents for injection drug users, Health Educ. Q. 20:523-538 (1993).

8. Finney, J., and Moos, H., Treatment and outcome for empirical subtypes of alcoholic patients, J. Consult. Clin. Psychol. 47:25-38 (1979).

9. Rapkin, B., and Luke, D., Cluster analysis in community research: Epistemology and practice, Am. J. Community Psychol. 21:247-277 (1993).

10. Joe, G., and Simpson, D., Needle use risks: Composite measures and comparisons, in Handbook on Risk of AIDS: Injection Drug Users and Sexual Partners (B. S. Brown and G. Beschner, eds.), Greenwood Press, Westport, Connecticut, 1993, pp. 137-154.

11. Easterbrook, P., Chmiel, J., Hoover, D., et al., Racial and ethnic differences in human immunodeficiency virus type 1 (HIV-1) seroprevalence among homosexual and bisexual men, Am. J. Epidemiol. 138:415-429 (1993).

12. LaBrie, R., McAuliffe, W., Nemeth-Coslett, R., et al., The prevalence of HIV infection in a national sample of injection drug users, in Handbook on Risk of AIDS: Injection Drug Users and Sexual Partners (B. S. Brown and G. Beschner, eds.), Greenwood Press, Westport, Connecticut, 1993, pp. 16-37.

13. Holmberg, S., The estimated prevalence and incidence of HIV in 96 large U.S. metropolitan areas, Am. J. Public Health 86:642-654 (1996).

14. Bowen, A., and Trotter, R., II, HIV risk in IV drug users and crack smokers: Predicting stage of change for condom use, J. Consult. Clin. Psychol. 63:238-248 (1995).

15. Deren, S., Estrada, T., Stark, M., et al., A multi-site study of sexual orientation and injection drug use as predictors of serostatus in male drug users, AIDS 15:289-295.

16. Mowbray, C., Deborah, B., and Cohen, E., Describing the homeless mentally ill: Cluster analysis results, Am. J. Community Psychol. 21:67-93 (1993).


17. Craig, R., Olson, R., and Shalton, G., Differences in organization of psychological needs between inpatient and outpatient opiate addicts, J. Clin. Psychol. 45:462-466 (1989).

18. Dembo, R., Williams, L., Fagan, J., et al., Development and assessment of a classification of high risk youths, J. Drug Issues 24:25-53 (1994).

19. Mayer, J. P., Taylor, J. R., and Thrush, J. C., Exploratory cluster analysis of behavioral risks for chronic disease and injury: Implications for tailoring health promotion services, J. Community Health 15:377-389 (1990).

20. Carlson, R., Wang, J., Siegal, H., et al., An ethnographic approach to targeted sampling: Problems and solutions in AIDS prevention research among injection drug and crack cocaine users, Hum. Organ. 53:279-286 (1994).

21. Simpson, D., Camacho, L., Vogtsberger, K., et al., Reducing AIDS risks through community outreach for drug injectors, Psychol. Addict. Behav. 8:86-101 (1994).

22. Weatherby, N. L., Needle, R., Cesari, H., et al., Validity of self-reported drug use among injection drug users and crack cocaine users recruited through street outreach, Eval. Program Plan. 17:347-355 (1994).

23. Dowling-Guyer, S., Johnson, M., Fisher, D., et al., Reliability of drug users' self-reported HIV risk behaviors and validity of self-reported recent drug use, Assessment 1:383-392 (1994).

24. Edelbrock, C., Comparing the accuracy of hierarchical clustering algorithms: The problem of classifying everyone, Multivariate Behav. Res. 14:367-384 (1979).

25. Milligan, G., An examination of the effect of six types of error perturbation of fifteen clustering algorithms, Psychometrika 45:325-342 (1980).

26. Matthews, A., Standardization of measures prior to clustering, Biometrics 35:892 (1979).

27. Everitt, B., Cluster Analysis, Halstead, New York, 1980.

28. SAS Institute, SAS/STAT User's Guide, Version 6, vol. 2, 4th ed., SAS Institute, Cary, North Carolina, 1989.

29. Aldenderfer, M., and Blashfield, R., Cluster Analysis, Sage Publications, Beverly Hills, California, 1989.

30. Ward, J., Hierarchical grouping to optimize an objective function, J. Am. Stat. Assoc. 58:236-244 (1963).

31. Cureton, E., and D'Agostino, R., Factor Analysis: An Applied Approach, Erlbaum Associates, Hillsdale, New Jersey, 1983, pp. 348-371.

32. Ward, J., and Hook, M., Application of an hierarchical grouping procedure to a problem of grouping profiles, Educ. Psychol. Meas. 23:69-81 (1963).

33. Filsinger, J., Faulkner, J., and Warland, R., Empirical taxonomy of religious individuals: An investigation among college students, Sociol. Anal. 40:136-146 (1979).

34. Goldstein, S., and Linden, J., Multivariate classification of alcoholics by means of the MMPI, J. Abnorm. Psychol. 74:661-669 (1969).

35. Neaigus, A., Friedman, S., Curtis, R., et al., The relevance of drug injectors' social and risk networks for understanding and preventing HIV infection, Soc. Sci. Med. 38:67-78 (1994).

36. McCoy, C., Rivers, J., and Khoury, E., An emerging public health model for reducing AIDS-related risk behavior among injecting drug users and their sexual partners, Drugs Soc. 7:143-159 (1993).

37. Norbeck, J., Dejoseph, J., and Smith, R., A randomized trial of an empirically-derived social support intervention to prevent low birth weight among African American women, Soc. Sci. Med. 43:947-954 (1996).

38. Prochaska, J., Velicer, W., DiClemente, C., et al., Measuring processes of change: Applications to the cessation of smoking, J. Consult. Clin. Psychol. 56:520-528 (1988).

39. Grimley, D., Prochaska, G., and Prochaska, J., Condom use assertiveness and the stages of change with main and other partners, J. Appl. Biobehav. Res. 1:152-173 (1993).