Lymphoid gene expression as a predictor of risk of secondary brain tumors


Mathew J. Edick,1,5 Cheng Cheng,3 Wenjian Yang,1 Meyling Cheok,1 Mark R. Wilkinson,1 Deqing Pei,3 William E. Evans,1,5,6 Larry E. Kun,4,6 Ching-Hon Pui,2,6 and Mary V. Relling1,5,6*

1 Department of Pharmaceutical Sciences, St. Jude Children’s Research Hospital, University of Tennessee, Memphis, Tennessee
2 Department of Hematology/Oncology, St. Jude Children’s Research Hospital, University of Tennessee, Memphis, Tennessee
3 Department of Biostatistics, St. Jude Children’s Research Hospital, University of Tennessee, Memphis, Tennessee
4 Department of Radiation Oncology, St. Jude Children’s Research Hospital, University of Tennessee, Memphis, Tennessee
5 St. Jude Children’s Research Hospital, College of Pharmacy, University of Tennessee, Memphis, Tennessee
6 St. Jude Children’s Research Hospital, College of Medicine, University of Tennessee, Memphis, Tennessee

Gene expression profiles are tissue-specific but may also reflect germ-line-driven expression patterns across tissue types. Previously, using a targeted pharmacologic approach, we identified germ-line polymorphisms in a single gene (thiopurine methyltransferase) associated with the risk of irradiation- and chemotherapy-induced secondary brain tumors in children with acute lymphoblastic leukemia (ALL). To identify additional candidate genetic risk factors, in identically treated patients, we compared the gene expression profiles of diagnostic ALL blasts of those who did develop irradiation-associated brain tumors (n = 9) with the profiles from those who did not (n = 33). Weighted rank regression was used to identify 33 probe sets associated with the time-dependent development of brain tumors; k-means clustering (k = 2) identified 2 groups that differed significantly in cumulative incidence of brain tumors (P = 0.012). Permutation analysis was used to estimate the probability (P = 0.18) of obtaining 2 such clusters by chance. Linear discriminant analysis (time-independent categorization of outcome) was used to identify 70 probe sets whose expression differentiated between the 2 groups of patients. Permutation analysis (n = 1,000) was used to estimate the probability of selecting these probe sets by chance (P = 0.055). Five probe sets were in common between the time-independent and time-dependent methods. The distinguishing genes are involved in neural growth (FGFR1) and in nuclear trafficking (HNRPL, KPNB1). These data suggest that gene expression profiling from accessible tissues may identify targets involved in therapy-related malignancies in unrelated tissues. (Supplementary material for this article can be found on the Genes, Chromosomes and Cancer website at http://www.interscience.wiley.com/jpages/1045-2257/suppmat/). © 2004 Wiley-Liss, Inc.

INTRODUCTION

Secondary brain tumors are one of the most common malignancies to occur after treatment for childhood acute lymphoblastic leukemia (ALL; Meadows et al., 1985; Neglia et al., 1991; Russo et al., 2000). The use of cranial irradiation as part of ALL therapy is a necessary, but not sufficient, risk factor for development of this complication (Nygaard et al., 1991; Neglia et al., 1991; Brenner et al., 2003). The development of secondary tumors is likely to depend on treatment-related exposures (Bhatia et al., 2001; Hahn, 2001; Bratt, 2002; Greaves, 2002) and predisposing host factors, although only limited data exist on the latter.

Using a target-gene approach, we previously showed that after ALL treatment with intensive antimetabolite therapy and cranial irradiation, patients with germ-line polymorphisms leading to low or absent thiopurine methyltransferase activity were at significantly greater risk of developing secondary brain tumors than were those with normal thiopurine methyltransferase (Relling et al., 1999b). Thiopurine methyltransferase methylates and thereby inactivates the antimetabolite mercaptopurine. The locus encoding thiopurine methyltransferase is polymorphic, with 10% of most populations heterozygous and 1 in 300 homozygous for inactivating point mutations (Krynetski et al., 2000). This translates into increased levels of active thioguanine nucleotides, which are incorporated into DNA and RNA and interfere with normal nucleic acid processing (Krynetskaia et al., 2000; Somerville et al., 2003). Thiopurine methyltransferase deficiency has been associated with the risk of other secondary malignancies, including ultraviolet light–associated skin cancer (Lennard et al., 1985) and therapy-related acute myeloid leukemia (Relling et al., 1998; Thompsen et al., 1998). Nonetheless, it is likely that additional genetic factors predispose to secondary brain tumors.

Supported by: NCI; Grant numbers: CA 51001, CA 78224, CA 36401, and CA 21765; NIH/NIGMS Pharmacogenetics Research Network and Database; Grant numbers: U01 GM61393 and U01 GM61374; National Institutes of Health; State of Tennessee Center of Excellence grant; American Lebanese Syrian Associated Charities (ALSAC).

C.-H. Pui is the American Cancer Society F. M. Kirby Clinical Research Professor.

*Correspondence to: Department of Pharmaceutical Sciences, St. Jude Children’s Research Hospital, 332 North Lauderdale, Memphis TN 38105-2794. E-mail: [email protected]

Received 14 May 2004; Accepted 3 September 2004

DOI 10.1002/gcc.20121

Published online 12 November 2004 in Wiley InterScience (www.interscience.wiley.com).

GENES, CHROMOSOMES & CANCER 42:107–116 (2005)

RESEARCH ARTICLE

Techniques for identifying genes that predispose to therapy-induced cancers are limited. Linkage studies are of little value because affected family members are essentially nonexistent. Genomewide scans have not yet been applied for this purpose, and most do not focus on functional polymorphisms. Gene expression profiles have the advantage of being technically feasible, but by nature, they are limited to specific accessible tissue types (Scherf et al., 2000). It has been suggested that gene expression profiles in one tissue type may be predictive of an induced phenotype in an independent tissue type within the same host (Baria et al., 2000; Berwick et al., 2000; Shadan et al., 2000; Buchholz et al., 2001). The response of peripheral lymphocytes to genotoxic stress has been linked to the risk of cancer in unrelated tissues (Baria et al., 2000; Buchholz et al., 2001). We recently identified gene expression profiles from ALL blasts that could distinguish patients who eventually developed secondary acute myeloid leukemia (Yeoh et al., 2002), further supporting the idea that gene expression in one tissue may predict the response of another tissue to genotoxic stress. Gene expression in blasts may partly reflect germ-line genetic polymorphisms that could affect gene expression and function in many tissue types (Cheung et al., 2003).

In the present study, we tested the hypothesis that gene expression in pretreatment ALL blasts could be used to identify those patients destined to develop treatment-related brain tumors. In an ALL treatment protocol in which all patients at high risk of relapse received identical multiagent chemotherapy and preventative cranial irradiation, an unusually high frequency of malignant secondary brain tumors was reported (Relling et al., 1999b). The only tissue available from all patients was diagnostic ALL bone marrow cryopreserved prior to therapy. We found that the gene expression profiles of these pretreatment ALL blasts distinguished patients who did develop therapy-related brain tumors from those who did not, thereby identifying multiple genetic targets to be investigated to determine their contribution to the risk of irradiation-induced brain tumors.

MATERIALS AND METHODS

Of the 188 patients enrolled in the St. Jude Children’s Research Hospital Total XII protocol, 52 received preventive cranial irradiation and intensive antimetabolite treatment as part of the therapy for ALL (Relling et al., 1999b). All patients included in the current study had serial measurement of erythrocyte thioguanine nucleotide levels as described previously (Relling et al., 1999a). After a minimum of 7.4 years of follow-up, 10 of these patients had developed secondary brain tumors, for a cumulative incidence of 21.4% ± 6.3%. Five additional patients had a relapse of their ALL (and thus received alternative therapy), and 2 additional high-risk infants died in complete remission. The 10 patients developed a spectrum of brain tumor types: 5 cases of glioblastoma multiforme, 2 cases of anaplastic astrocytoma, 2 cases of primitive neuroectodermal tumor, and one case of embryoplastic neuroepithelial tumor. Six of these cases were described previously (Relling et al., 1999b).

Cryopreserved diagnostic bone marrow samples from the 52 patients were obtained after approval of the St. Jude Tissue Resource Committee; all patients provided informed consent for the materials used in the research. Total RNA was extracted by use of Tri-Reagent (MRC, Cincinnati, OH). Affymetrix HG-U95Av2 GeneChips (Affymetrix, Santa Clara, CA) comprising 12,625 probe sets (representing approximately 9,600 unique genes) were used to interrogate labeled cRNA from each sample. Affymetrix data were confirmed by real-time RT-PCR for 3 probe sets in a subset of patients representing a range of gene expression levels (see Supplementary material, which can be found on the Genes, Chromosomes and Cancer website at http://www.mrw.interscience.wiley.com/suppmat/1045-2257/suppmat/). Two of the 52 samples contained <65% bone marrow blasts (7% and 15% blasts) and were excluded from the analysis; the median percentage of blasts in the remaining 50 samples was 96%. Of the remaining 50 samples, 42 had adequate RNA integrity according to electrophoresis (Agilent Bioanalyzer, Palo Alto, California) and glyceraldehyde 3-phosphate dehydrogenase (GAPD) and actin 3′:5′ ratios. Of these 42 patients with evaluable samples, 27 remained in complete remission, 9 developed a secondary brain tumor, 5 had relapses of their ALL, and 1 infant died in remission (for data analysis purposes, the last was grouped with the ALL relapse group because of the high relapse rate in infants).

Statistical analysis was performed in R 1.6 (http://www.r-project.org; Ihaka et al., 2003) and S-plus 2000 (Mathsoft, Inc., Cambridge, MA) software unless otherwise specified. Gene expression values were scaled to a target overall intensity by use of Affymetrix MAS 5.0 (Affymetrix, Santa Clara, CA) and were log-transformed. Probe sets were excluded from statistical consideration if expression was considered absent or marginal by MAS 5.0 in 95% of patients, leaving 6,293 probe sets that were analyzed. Principal-components analysis and hierarchical clustering were performed using Spotfire DecisionSite 7.0 (Spotfire, Somerville, MA).
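The filtering and transformation step described above can be sketched as follows. This is an illustrative Python sketch, not the authors' R/S-plus code: the function name `filter_and_log`, the input layout (dictionaries keyed by probe set ID, with MAS 5.0 detection calls coded as "P", "M", or "A"), the use of base-2 logarithms, the flooring of signals at 1.0, and the reading of "in 95% of patients" as "in at least 95% of patients" are all assumptions made for the example.

```python
import math

def filter_and_log(expression, calls, max_not_present=0.95):
    """Drop probe sets that are absent/marginal in most patients, then log-transform.

    expression: dict mapping probe set ID -> list of scaled signals (one per patient)
    calls:      dict mapping probe set ID -> list of detection calls ("P", "M", "A")
    Both layouts are hypothetical; base-2 logs and the 1.0 floor are assumptions.
    """
    kept = {}
    for probe, signals in expression.items():
        flags = calls[probe]
        # Count patients in whom the probe set was called absent or marginal.
        not_present = sum(1 for flag in flags if flag in ("A", "M"))
        if not_present / len(flags) >= max_not_present:
            continue  # excluded from statistical consideration
        kept[probe] = [math.log2(max(signal, 1.0)) for signal in signals]
    return kept
```

With 42 patients, a probe set would be dropped under these assumptions when it is called absent or marginal in 40 or more samples.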

Demographics were compared between the patients who did and those who did not develop brain tumors by use of 2-sided Fisher’s exact test and t test. Gene selection and gene set validation to discriminate which patients were likely to develop a brain tumor were performed according to two definitions of the phenotype of interest: (1) by characterizing the time at risk for development of brain tumors and allowing relapse of ALL as a competing risk (time-dependent) and (2) by categorically defining 2 independent groups of patients, those who did versus those who did not develop brain tumors, ignoring the time at risk (time-independent).
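To make the demographic comparison concrete, the following is a minimal Python sketch of a 2-sided Fisher's exact test for a 2 × 2 table, applied to the race row of Table 1 (0 of 7 Black patients and 9 of 35 White patients developed a brain tumor). The function name is hypothetical, and the 2-sided rule used here (summing the probabilities of all tables no more likely than the observed one) is a common convention that is assumed, not confirmed, to match the software the authors used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """2-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables that are no more probable than the observed table
    # (small relative tolerance guards against floating-point ties).
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Race row of Table 1: Black (0 brain tumor, 7 none), White (9 brain tumor, 26 none)
p_race = fisher_exact_two_sided(0, 7, 9, 26)
```

Under this convention `p_race` is approximately 0.314, which rounds to the 0.31 listed for race in Table 1.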

The time-dependent approach considered 3 variables for each of the 42 patients: (1) time to event or to last follow-up, (2) the outcome phenotype (relapse, brain tumor, or censored in complete remission), and (3) the gene expression profile. Probe set selection was performed by the ranking of genes using the P value of the weighted rank regression association statistic (WRRAS) for each probe set (see Algorithm 1 in the Supplement for details). Then, permutation analysis was performed to determine the optimal significance (i.e., alpha) level corresponding to the lowest probability of making a type I error in selecting distinguishing genes (top alpha permutation, see Algorithm 2 in the Supplement). Probe sets with WRRAS P values less than or equal to the optimal alpha were considered statistically significant and were selected from the observed data set. Next, the samples were clustered using unsupervised 2- or 3-means algorithms, with the selected probe sets as features, and the cumulative incidence of brain tumor in the clusters was compared. The statistical significance of the selected probe sets and of the clusters was further assessed by permutation analysis using the comparison statistic of the cumulative incidence curves of brain tumors in the clusters as an assessment measure. Cumulative incidence curves of brain tumors in the clusters were estimated and compared according to Gray’s (1988) method. A permutation P value for the brain tumor cumulative incidence differences between the 2 (or 3) clusters in the observed data set was calculated by counting the number of times that the 2 (or 3) clusters generated from the permuted data had more statistically significant cumulative incidence differences than did those between the clusters generated from the observed data set (see Algorithm 3 in the supplemental material).
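For readers unfamiliar with competing-risks analysis, the sketch below shows one standard nonparametric estimator of the cumulative incidence of a single cause (here, brain tumor) when another event (relapse) competes with it. It illustrates the kind of curve that is compared between clusters; it does not implement Gray's test itself, and the function name and event coding (0 = censored in remission, 1 = brain tumor, 2 = relapse) are assumptions made for the example.

```python
def cumulative_incidence(times, events, cause):
    """Cumulative incidence of `cause` in the presence of competing events.

    times:  follow-up time for each patient
    events: 0 = censored, any other integer = event type (hypothetical coding)
    Returns a list of (time, cumulative incidence) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    event_free = 1.0   # probability of no event of any type before the current time
    incidence = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for (u, e) in data if u == t]
        d_cause = sum(1 for e in tied if e == cause)
        d_any = sum(1 for e in tied if e != 0)
        if d_any:
            # Increment = P(event-free just before t) * cause-specific hazard at t.
            incidence += event_free * d_cause / n_at_risk
            event_free *= 1 - d_any / n_at_risk
            curve.append((t, incidence))
        n_at_risk -= len(tied)
        i += len(tied)
    return curve
```

Treating relapse as a competing event rather than as censoring matters here because patients who relapsed received alternative therapy and were no longer at risk in the same sense.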

For the time-independent method, each patient had 2 variables: the outcome phenotype (brain tumor or complete remission; relapse patients were excluded, as detailed in the Supplement) and the gene expression profile. Probe set selection was performed by linear discriminant analysis of the 6,293 probe sets (Golub et al., 1999). We estimated the probability that we could have discriminated between the two groups of patients by chance by selecting the same number of top discriminating probe sets based on 1,000 random permutations of the data. For k, a given number of top probe sets, we computed the average misclassification rate by unsupervised 2-means clustering using the top 1, 2, …, k probe sets. The permutation P value was calculated as the frequency of observing a permutation with an average misclassification rate less than or equal to that estimated by the observed data. The gene sets chosen were those that provided the lowest permutation P values for each method.
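The permutation logic in this paragraph can be illustrated with a deliberately simplified Python sketch. It keeps the core idea (shuffle the outcome labels, recompute a statistic, and report the frequency with which the permuted data do at least as well as the observed data) but swaps in a much simpler statistic: the absolute difference in group means of a single variable, where larger is more extreme. The authors' statistic was an average misclassification rate from 2-means clustering, where smaller is better. All names below are hypothetical.

```python
import random
from statistics import mean

def abs_mean_difference(values, labels):
    """Stand-in statistic (an assumption): |mean of group 1 - mean of group 0|."""
    group1 = [v for v, lab in zip(values, labels) if lab == 1]
    group0 = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(mean(group1) - mean(group0))

def permutation_p_value(values, labels, statistic, n_perm=1000, seed=0):
    """Frequency of label permutations whose statistic is >= the observed one."""
    rng = random.Random(seed)
    observed = statistic(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between values and outcome
        if statistic(values, shuffled) >= observed:
            hits += 1
    return hits / n_perm
```

One design point worth noting: in the full procedure the probe set selection itself is repeated inside every permutation, so that the P value accounts for the selection step. This sketch omits that step for brevity.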

RESULTS

There were no differences in demographics or ALL characteristics between the patients who developed brain tumors and those who did not (Table 1). Consistent with our initial analysis, when only 6 cases of secondary brain tumors had developed (Relling et al., 1999b), patients who developed brain tumors had higher thioguanine nucleotide concentrations during their initial ALL therapy than those who did not (Table 1). Unsupervised hierarchical clustering of the 6,293 probe sets grouped patients together by their leukemic cytogenetic and immunophenotypic characteristics (Fig. 1), as we observed in another St. Jude ALL trial (Yeoh et al., 2002).

We tested whether gene expression patterns distinguished patients who developed brain tumors from those who did not. In the time-dependent analysis, the top alpha permutation algorithm suggested that selecting probe sets with a WRRAS P value ≤ 0.005 (n = 33; Supplemental Table 1) had a minimal likelihood of considering probe sets selected by chance (P = 0.021; Supplemental Fig. 1). Among the genes represented by these 33 probe sets were those involved in glial cell growth (i.e., FGFR1), apoptosis or cell survival (i.e., TP53, CRADD), DNA replication and repair (i.e., POLB, DDB1), and chromatin remodeling (SMARCA4, SMARCB1) pathways.

By using these 33 probe sets, patients were grouped by k-means clustering. With k = 2, we obtained clusters of 26 and 16. The cluster with 26 patients included 8 of the 9 patients who developed brain tumors and all 6 of the patients who relapsed. Thus, the cumulative incidence of brain tumors in patients in this cluster was significantly greater (P = 0.012) than the cumulative incidence in the other cluster (Fig. 2). With k = 3, the difference in the cumulative incidence of brain tumors in the 2 larger clusters (P = 0.011) was comparable to that observed when k = 2 (see Supplement for details).
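The clustering step can be sketched with a plain implementation of Lloyd's k-means algorithm. This is an illustration only: the authors' software and initialization are not described in the text, so the deterministic initialization used here (the first k samples serve as starting centers) and the function names are assumptions. Each sample would be a patient's vector of expression values for the selected probe sets.

```python
def squared_distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))

def k_means(points, k, max_iter=100):
    """Lloyd's algorithm; returns (cluster label per point, cluster centers)."""
    centers = [list(p) for p in points[:k]]  # assumed initialization: first k points
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Because k-means can converge to different solutions from different starting centers, analyses of this kind are usually repeated from several initializations; the single deterministic start here is chosen only to keep the sketch reproducible.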

Permutation analysis with either 2- or 3-means clustering indicated that the patient clusters identified from the observed data differed in brain tumor incidence more significantly than did clusters generated from the permuted data in 819 and 912 of 1,000 permutations, respectively, indicating that the probability of selecting these 33 probe sets by chance was approximately 18% or 9% for 2 or 3 clusters, respectively.
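As a check on the arithmetic, and assuming (as in the Methods) that the permutation P value is the fraction of the 1,000 permutations in which the permuted clusters were more significant than the observed ones, the counts reported above give:

```latex
P_{2\text{-means}} = \frac{1000 - 819}{1000} = 0.181 \approx 0.18,
\qquad
P_{3\text{-means}} = \frac{1000 - 912}{1000} = 0.088 \approx 0.09
```

The 2-means value agrees with the P = 0.18 quoted in the abstract for 2 clusters.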

Treating phenotype as a time-independent variable, we identified probe sets that discriminated patients who did develop brain tumors (n = 9) from patients who did not (n = 27). After 1,000 permutations of the data, 70 probe sets identified by linear discriminant analysis distinguished patients who did develop secondary brain tumors from those who did not (P = 0.055; Supplemental Table 2, Figs. 3 and 4). Unlike the unsupervised analysis, these probe sets did not separate patients into cytogenetic or immunophenotype groups on the basis of their ALL subtype. The probe sets identified in the time-independent analysis represent genes similar in function to the genes top-ranked by WRRAS: genes involved in apoptosis and cell survival (CASP1, CASP6, E2F1, E2F5, VEGF), DNA repair (FRAP1), and chromatin remodeling (H2AFA) all were identified. Five probe sets were identified by both WRRAS and linear discriminant analysis (Table 2). These probe sets represented a diverse set of genes not previously associated with secondary malignancies and included genes involved in tumor growth or oncogenesis (e.g., STAT4, NFIC) or in nuclear trafficking (e.g., HNRPL, KPNB1).

TABLE 1. Patient Demographics

| Characteristic | Total (n = 42) | Brain tumor (n = 9) | No brain tumor (n = 33) | P value (t test a or FET b) |
|---|---|---|---|---|
| Age at diagnosis (years) | | | | |
| Median | 5.5 | 3.7 | 5.7 | 0.94 a |
| Range | 0.4–15.9 | 2.3–15.9 | 0.4–15.3 | |
| Gender | | | | |
| Female | 14 | 4 | 10 | 0.45 b |
| Male | 28 | 5 | 23 | |
| Race | | | | |
| Black | 7 | 0 | 7 | 0.31 b |
| White | 35 | 9 | 26 | |
| Immunophenotype | | | | |
| B | 30 | 7 | 23 | 1 b |
| T | 11 | 2 | 9 | |
| Unknown | 1 | 0 | 1 | |
| Ploidy | | | | |
| Hyperdiploid | 16 | 3 | 13 | 1 b |
| Nonhyperdiploid | 26 | 6 | 20 | |
| Cytogenetics | | | | |
| Tel-AML | 9 | 3 | 6 | 0.37 b |
| Bcr-Abl | 1 | 0 | 1 | 1 b |
| MLL rearranged | 3 | 0 | 3 | 1 b |
| Radiation dose | | | | |
| 1,800 cGy | 37 | 9 | 28 | 0.57 b |
| 2,400 cGy | 5 | 0 | 5 | |
| Maximum RBC thioguanine nucleotide concentration (pmol/8 × 10^8 cells) | | | | |
| Median | 609 | 998 | 585 | 0.002 a |
| Range | 113–4,472 | 472–4,472 | 113–1,434 | |
| % blasts in diagnostic bone marrow | | | | |
| Median | 96 | 96 | 96 | 0.69 a |
| Range | 65–99 | 83–99 | 65–98 | |
| % of probe sets expressed | | | | |
| Median | 41.6 | 41.3 | 41.8 | 0.93 a |
| Range | 34.9–47.9 | 37.5–45 | 34.9–47.9 | |

FET: Fisher’s exact test.

DISCUSSION

Using pretreatment bone marrow ALL blasts from identically treated patients, we identified gene expression patterns that distinguished patients who did develop brain tumors following intensive antimetabolite treatment and cranial irradiation from those who did not. Because there were no differences in the demographics or treatment that could distinguish these 2 groups of patients, this represents a strategy by which host factors for tumorigenesis may be identified. Although we recognize that the large number of genes compared to the relatively small number of patients increases the possibility that distinguishing gene expression profiles may have been identified by chance, permutation analyses indicated a relatively small probability that distinguishing genes had been observed randomly. Thus, we hypothesize that the gene expression patterns identified may be related to each patient’s susceptibility to thiopurine- and irradiation-induced brain tumors.

Figure 1. Unsupervised clustering of the 42 evaluable patients on the basis of the 6,293 probe sets that were expressed in at least 5% of patients [remission status: CCR (continuous complete remission status), in remission with no events; R, relapse of the original leukemia (R* is an infant who died in complete remission); and BT, developed a malignant brain tumor]. Lineage, DNA index, and status of TEL and MLL genes are characteristics of the ALL blasts that tended to cluster together (with the distinguishing characteristic boxed) in an unsupervised approach.

The mechanism by which a cell of one type can provide information about the response of a cell of another type to genotoxic stress is controversial (Baria et al., 2000; Berwick et al., 2000; Hemminki et al., 2000; Dornfeld et al., 2001; Nersesyan, 2002). Peripheral lymphocytes from patients who developed a variety of solid tumors displayed a higher frequency of DNA damage than did lymphocytes from normal healthy volunteers when exposed to genotoxic stress ex vivo (Berwick et al., 2000; Shadan et al., 2000; Baria et al., 2001; Buchholz et al., 2001). This may be a reflection of inherent genomic instability in cancer patients (Knudson, 2001); however, it is unclear whether this instability is a consequence of malignancy or a reflection of germ-line susceptibility of the patient to cancer (Berwick et al., 2000). Notably, healthy first-degree relatives of cancer patients had increased levels of genomic instability compared to normal individuals without a family history of cancer (Patel et al., 1997). Because susceptibility to DNA damage is more strongly associated with cancers with strong hereditary risks than with cancers with strong environmental risks (Baria et al., 2000), it is plausible that a genetically controlled germ-line susceptibility to cancer plays a role in response to genotoxic stress.

The concept that germ-line polymorphisms (reflected at the genomic level) might affect an individual’s risk of cancer is not new (Friedberg, 2001; Morton et al., 2001). Thus far, most investigators have employed a targeted gene approach for genetic association studies. As secondary cancer risk cannot be studied by a genetic linkage approach, the only currently viable genomewide technique is to use functional-genomic expression arrays. That this approach may be plausible was demonstrated by the recent finding that gene expression profiles from ALL blasts were predictive of patients who developed secondary acute myeloid leukemia after treatment with topoisomerase II inhibitors (Yeoh et al., 2002).

Figure 2. Cumulative incidence of secondary brain tumors or relapse in the two clusters that were defined on the basis of 33 probe sets selected by the WRRAS method. Incidence was significantly higher (P = 0.012) in cluster 1 (45%; n = 26) than in cluster 2 (6%; n = 16).

TABLE 2. Probe Sets Identified by Both Time-Dependent and Time-Independent Methods

| Probe set ID | Gene symbol | Gene description | Relative expression a |
|---|---|---|---|
| 35201_at | HNRPL | Heterogeneous nuclear ribonucleoprotein L | ↓ |
| 41196_at | KPNB1 | Karyopherin (importin) beta 1 | ↑ |
| 440_at | NFIC | Nuclear factor I/C (CCAAT-binding transcription factor) | ↓ |
| 906_at | STAT4 | Signal transducer and activator of transcription 4 | ↓ |
| 424_s_at | FGFR1 | Fibroblast growth factor receptor 1 (fms-related tyrosine kinase 2, Pfeiffer syndrome) | ↓ |

a Mean expression in patients who developed brain tumor relative to those who did not.

Our study is unique in that the very high frequency of brain tumors strongly suggests that therapy components contributed to the risk of a brain tumor. The host factors identified may have been elicited because of the potent “environmental” stress of irradiation plus intensive antimetabolite therapy. With longer follow-up, the cumulative incidence in this cohort (21.4%) was even higher than that which we reported previously (12.8%; Relling et al., 1999b), as 4 additional patients subsequently developed a malignant brain tumor. The only variable that distinguished patients at risk of developing brain tumors was low thiopurine methyltransferase activity/high thioguanine nucleotides (Relling et al., 1999b), but there are likely to be other host factors involved. An insurmountable hurdle in identifying these factors is that there cannot be a comparison of brain tissue from patients who do develop tumors with that of those who do not. Our approach was to use expression array analysis of a uniform tissue (ALL blasts) that was accessible from all patients before treatment as a more functional method of genomewide scanning than would be possible with a DNA-based approach. Whether similar results would be obtained by use of an independent tissue type from these patients is not known, as ALL blasts were the only tissue samples available prospectively from all children.

Figure 3. Principal-component analysis of 70 probe sets selected by linear discriminant analysis discriminated between the patients who did (n = 9) and did not (n = 27) develop brain tumors (P = 0.055); none of these patients experienced a leukemic relapse. Each sphere represents a gene expression profile of a patient separated in 3-dimensional space on the basis of the 3 principal components of the 70 commonly selected probe sets.

Figure 4. Hierarchical clustering of 70 probe sets selected by linear discriminant analysis. Each column represents a patient; orange and purple column headings indicate patients who did and did not develop brain tumors, respectively; each row represents a probe set. Red and green represent over- and underexpression, respectively, of the probe set. Patients are grouped into 2 main clusters, which separate those who did and did not develop brain tumors, with one exception (one patient who did not develop a brain tumor was misclustered with patients who developed brain tumors).

As anticipated, unsupervised hierarchical clustering based on expression of all 6,293 expressed probe sets yielded groups of patients clustered by leukemic characteristics in a manner similar to that reported previously (Armstrong et al., 2002; Yeoh et al., 2002). It was not surprising that unsupervised methods did not identify the group of patients who developed secondary brain tumors, as the gene expression patterns underlying the characteristics of leukemic blasts are likely to be much more pronounced in blast cells than the gene expression patterns that define host susceptibility to irradiation, which may be common to any cell type. To reveal more subtle expression patterns that may have defined the susceptibility to malignant transformation, supervised approaches are more appropriate.

Malignant transformation is thought to occur by an accumulation of precancerous lesions over time (Hanahan et al., 2000). Accordingly, it may be appropriate to factor in time when selecting genes that distinguish patients who developed a brain tumor from those who did not. We applied time-independent as well as time-dependent approaches to identifying distinguishing genes. In the latter approach, the genes selected on the basis of the significance of association with time to event (brain tumor) achieved greater significance (P = 0.021) from permutation analysis than did the genes selected without considering time-to-event (P = 0.055). This suggests that it may be prudent to consider time as an important factor in outcome analysis.

Although many genes selected in our analyses are plausible candidates for susceptibility to transformation, many have not been previously associated with cancer. Of the 5 genes identified by both time-dependent and time-independent methods, several were functionally related to each other and to brain tumors. For example, heterogeneous nuclear ribonucleoprotein L (HNRPL) codes for a protein that is involved in the formation of ribonucleoprotein complexes with nascent RNA transcripts in the nucleus (Pinol-Roma et al., 1989), binds to genetic hormone response elements, and is involved in differentiation; its nuclear import is mediated by karyopherins such as karyopherin (importin) beta 1 (KPNB1; Fridell et al., 1997). KPNB1 plays a role in nuclear import of proteins (Gorlich et al., 1995; Moroianu et al., 1995) and in mitotic spindle formation (Wiese et al., 2001). KPNB1 is involved in nuclear export of GAPD (Shamsher et al., 2002), and GAPD has recently been shown to be part of a complex that interacts with thiopurine metabolites in DNA (Krynetski et al., 2003). Thiopurines in DNA constitute the putative pharmacologic basis for the high frequency of brain tumors in these patients. Dysfunction of STAT4 has been associated with the development of nitrosourea-induced lymphomas (Zhang et al., 2001), and nitrosourea and thiopurine have been postulated to have similar effects on mismatch repair and tumorigenesis (Swann et al., 1996). STAT4 has not been directly implicated in brain tumors; however, 2 downstream effectors of STAT4, CCL2 (MCP1) and SOCS3 (Torpey et al., 2004), have been shown to play a role in macrophage recruitment to glial tumors (Kielian et al., 2002) and in response to inflammatory cytokines upon brain damage (Yang et al., 2004). We found that the expression of NFIC was lower in patients who subsequently developed brain tumors than in those who did not; its gene product is a CCAAT-binding transcription factor (Qian et al., 1995). The CCAAT box has been linked to regulation of the RAS oncogenes by genotoxic stress, such as UV radiation (Fritz and Kaina, 2001), and to the transcription of DEK, a gene that is the target of some malignant translocations and the expression of which is associated with glioblastoma (Sitwala et al., 2002). A gene identified by use of the time-dependent approach, FGFR1, is interesting in that it encodes a tyrosine kinase that plays a role in brain development (Trokovic et al., 2003) and in the growth of glial tumors (Jin et al., 2000), is itself the target of multiple tumorigenic chromosomal translocations in myeloproliferative disorders (Fioretos et al., 2001), and is involved in activating various STAT genes (Hart et al., 2000). Although the mechanisms by which under- or overexpression of these genes (in lymphoblasts) might be related to irradiation-induced brain tumors are not clear, the relatively high degree of interaction among these 5 genes (all of which were identified by multiple computational methods), along with their involvement in cancer and in brain development, makes them intriguing candidates worthy of further study.

We have demonstrated that gene expression profiles in ALL blasts could distinguish patients who developed secondary brain tumors after cranial irradiation from those who did not. Our results suggest that microarray analysis is a potentially powerful tool for identifying genes involved in cancer susceptibility. Future studies may apply this same methodology to other classes of second tumors and possibly to primary tumors to offer insights into the etiology and risk of cancer development.

ACKNOWLEDGMENTS

We thank our protocol coinvestigators, clinical staff, and the patients and their parents for their participation. We also thank Michael Shipman and Allison Gratzer for laboratory assistance.

REFERENCES

Armstrong SA, Staunton JE, Silverman LB, Pieters R, den Boer ML, Minden MD, Sallan SE, Lander ES, Golub TR, Korsmeyer SJ. 2002. MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia. Nat Genet 30:41–47.

Baria K, Warren C, Roberts SA, West CM, Evans DG, Varley JM, Scott D. 2000. Correspondence re: A. Rothfuss et al., Induced micronucleus frequencies in peripheral blood lymphocytes as a screening test for carriers of a BRCA1 mutation in breast cancer families. Cancer Res 60:390–394.

Baria K, Warren C, Roberts SA, West CM, Scott D. 2001. Chromosomal radiosensitivity as a marker of predisposition to common cancers? Br J Cancer 84:892–896.

Berwick M, Vineis P. 2000. Markers of DNA repair and susceptibility to cancer in humans: an epidemiologic review. J Natl Cancer Inst 92:874–897.

Bhatia S, Sklar C. 2001. Second cancers in survivors of childhood cancer. Nat Rev Cancer 2:124–132.

Bratt O. 2002. Hereditary prostate cancer: clinical aspects. J Urol 168:906–913.

Brenner DJ, Doll R, Goodhead DT, Hall EJ, Land CE, Little JB, Lubin JH, Preston DL, Preston RJ, Puskin JS, Ron E, Sachs RK, Samet JM, Setlow RB, Zaider M. 2003. Cancer risks attributable to low doses of ionizing radiation: assessing what we really know. Proc Natl Acad Sci USA 100:13761–13766.

Buchholz TA, Wu X. 2001. Radiation-induced chromatid breaks as a predictor of breast cancer risk. Int J Radiat Oncol Biol Phys 49:533–537.

Cheung VG, Conlin LK, Weber TM, Arcaro M, Jen KY, Morley M, Spielman RS. 2003. Natural variation in human gene expression assessed in lymphoblastoid cells. Nat Genet 33:422–425.

Dornfeld KJ, Lawrence TS. 2001. A predisposition supposition for glioma. J Natl Cancer Inst 93:1512–1513.

Fioretos T, Panagopoulos I, Lassen C, Swedin A, Billstrom R, Isaksson M, Strombeck B, Olofsson T, Mitelman F, Johansson B. 2001. Fusion of the BCR and the fibroblast growth factor receptor-1 (FGFR1) genes as a result of t(8;22)(p11;q11) in a myeloproliferative disorder: the first fusion gene involving BCR but not ABL. Genes Chromosomes Cancer 32:302–310.

Fridell RA, Truant R, Thorne L, Benson RE, Cullen BR. 1997. Nuclear import of hnRNP A1 is mediated by a novel cellular cofactor related to karyopherin-beta. J Cell Sci 110:1325–1331.

Friedberg T. 2001. Cytochrome P450 polymorphisms as risk factors for steroid hormone-related cancers. Am J Pharmacogenomics 1:83–91.

Fritz G, Kaina B. 2001. Transcriptional activation of the small GTPase gene rhoB by genotoxic stress is regulated via a CCAAT element. Nucleic Acids Res 29:792–798.

Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES. 1999. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286:531–537.

Gorlich D, Vogel F, Mills AD, Hartmann E, Laskey RA. 1995. Distinct functions for the two importin subunits in nuclear protein import. Nature 377:246–248.

Gray RJ. 1988. A class of K-sample tests for comparing the cumulative incidence of a competing risk. Ann Statist 16:1141–1154.

Greaves M. 2002. Cancer causation: the Darwinian downside of past success? Lancet Oncol 3:244–251.

Hahn H. 2001. Genetically determined susceptibility markers in skin cancer and their application to chemoprevention. IARC Sci Publ 154:93–100.

Hanahan D, Weinberg RA. 2000. The hallmarks of cancer. Cell 100:57–70.

Hart KC, Robertson SC, Kanemitsu MY, Meyer AN, Tynan JA, Donoghue DJ. 2000. Transformation and Stat activation by derivatives of FGFR1, FGFR3, and FGFR4. Oncogene 19:3309–3320.

Hemminki K, Xu G, Le Curieux F. 2000. Re: markers of DNA repair and susceptibility to cancer in humans: an epidemiologic review. J Natl Cancer Inst 92:1536–1537.

Ihaka R, Gentleman R. 2003. R: A Language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics 5:299–314.

Jin W, McCutcheon IE, Fuller GN, Huang ES, Cote GJ. 2000. Fibroblast growth factor receptor-1 alpha-exon exclusion and polypyrimidine tract-binding protein in glioblastoma multiforme tumors. Cancer Res 60:1221–1224.

Kielian T, van RN, Hickey WF. 2002. MCP-1 expression in CNS-1 astrocytoma cells: implications for macrophage infiltration into tumors in vivo. J Neurooncol 56:1–12.

Knudson AG. 2001. Two genetic hits (more or less) to cancer. Nat Rev Cancer 1:157–162.

Krynetskaia NF, Cai X, Nitiss JL, Krynetski EY, Relling MV. 2000. Thioguanine substitution alters DNA cleavage mediated by topoisomerase II. FASEB J 14:2339–2344.

Krynetski EY, Evans WE. 2000. Genetic polymorphism of thiopurine S-methyltransferase: molecular mechanisms and clinical importance. Pharmacology 61:136–146.

Krynetski EY, Krynetskaia NF, Bianchi ME, Evans WE. 2003. A nuclear protein complex containing high mobility group proteins B1 and B2, heat shock cognate protein 70, ERp60, and glyceraldehyde-3-phosphate dehydrogenase is involved in the cytotoxic response to DNA modified by incorporation of anticancer nucleoside analogues. Cancer Res 63:100–106.

Lennard L, Thomas S, Harrington CI, Maddocks JL. 1985. Skin cancer in renal transplant recipients is associated with increased concentrations of 6-thioguanine nucleotide in red blood cells. Br J Dermatol 113:723–729.

Meadows AT, Baum E, Fossati-Bellani F, Green D, Jenkin RDT, Marsden B, Nesbit M, Newton W, Oberlin O, Sallan SG, Siegel S, Strong LC, Voute PA. 1985. Second malignant neoplasms in children: an update from the late effects study group. J Clin Oncol 3:532–538.

Moroianu J, Hijikata M, Blobel G, Radu A. 1995. Mammalian karyopherin alpha 1 beta and alpha 2 beta heterodimers: alpha 1 or alpha 2 subunit binds nuclear localization signal and beta subunit interacts with peptide repeat-containing nucleoporins. Proc Natl Acad Sci USA 92:6532–6536.

Morton NE, Zhang W, Taillon-Miller P, Ennis S, Kwok PY, Collins A. 2001. The optimal measure of allelic association. Proc Natl Acad Sci USA 98:5217–5221.

Neglia JP, Meadows AT, Robison LL, Kim TH, Newton WA, Ruymann FB, Sather HN, Hammond GD. 1991. Second neoplasms after acute lymphoblastic leukemia in childhood. N Engl J Med 325:1330–1336.

Nersesyan AK. 2002. Re: Gamma-radiation sensitivity and risk of glioma. J Natl Cancer Inst 94:949.

Nygaard R, Garwicz S, Haldorsen T, Hertz H, Jonmundsson GK, Lanning M, Moe PJ. 1991. Second malignant neoplasms in patients treated for childhood leukemia. Acta Paediatr Scand 80:1220–1228.

Patel RK, Trivedi AH, Arora DC, Bhatavdekar JM, Patel DD. 1997. DNA repair proficiency in breast cancer patients and their first-degree relatives. Int J Cancer 73:20–24.

Pinol-Roma S, Swanson MS, Gall JG, Dreyfuss G. 1989. A novel heterogeneous nuclear RNP protein with a unique distribution on nascent transcripts. J Cell Biol 109:2575–2587.

Qian F, Kruse U, Lichter P, Sippel AE. 1995. Chromosomal localization of the four genes (NFIA, B, C, and X) for the human transcription factor nuclear factor I by FISH. Genomics 28:66–73.

Relling MV, Yanishevski Y, Nemec J, Evans WE, Boyett JM, Behm FG, Pui C-H. 1998. Etoposide and antimetabolite pharmacology in patients who develop secondary acute myeloid leukemia. Leukemia 12:346–352.

Relling MV, Hancock ML, Boyett JM, Pui C-H, Evans WE. 1999a. Prognostic importance of 6-mercaptopurine dose intensity in acute lymphoblastic leukemia. Blood 93:2817–2823.

Relling MV, Rubnitz JE, Rivera GK, Boyett JM, Hancock ML, Felix CA, Kun LE, Walter AW, Evans WE, Pui CH. 1999b. High incidence of secondary brain tumours after radiotherapy and antimetabolites. Lancet 354:34–39.

Russo A, Zanna I, Tubiolo C, Migliavacca M, Bazan V, Latteri MA, Tomasino RM, Gebbia N. 2000. Hereditary common cancers: molecular and clinical genetics. Anticancer Res 20:4841–4851.

Scherf U, Ross DT, Waltham M, Smith LH, Lee JK, Tanabe L, Kohn KW, Reinhold WC, Myers TG, Andrews DT, Scudiero DA, Eisen MB, Sausville EA, Pommier Y, Botstein D, Brown PO, Weinstein JN. 2000. A gene expression database for the molecular pharmacology of cancer. Nat Genet 24:236–244.

Shadan FF, Koziol J. 2000. Induced genome instability as a potential screening test for cancer susceptibility? Med Hypotheses 55:69–72.

Shamsher MK, Ploski J, Radu A. 2002. Karyopherin beta 2B participates in mRNA export from the nucleus. Proc Natl Acad Sci USA 99:14195–14199.

Sitwala KV, Adams K, Markovitz DM. 2002. YY1 and NF-Y binding sites regulate the transcriptional activity of the dek and dek-can promoter. Oncogene 21:8862–8870.

Somerville L, Krynetski EY, Krynetskaia NF, Beger RD, Zhang W, Marhefka CA, Evans WE, Kriwacki RW. 2003. Structure and dynamics of thioguanine-modified duplex DNA. J Biol Chem 278:1005–1011.

Swann PF, Waters TR, Moulton DC, Xu Y-Z, Zheng Q, Edwards M, Mace R. 1996. Role of postreplicative DNA mismatch repair in the cytotoxic action of thioguanine. Science 273:1109–1111.

Thompsen J, Schroder H, Kristinsson J, Madsen B, Szumlanski C, Weinshilboum R, Andersen JB, Schmiegelow K. 1999. Possible carcinogenic effect of 6-mercaptopurine on bone marrow stem cells: relation to thiopurine metabolism. Cancer 86:1080–1086.

Torpey N, Maher SE, Bothwell AL, Pober JS. 2004. Interferon alpha but not interleukin 12 activates STAT4 signaling in human vascular endothelial cells. J Biol Chem 279:26789–26796.

Trokovic R, Trokovic N, Hernesniemi S, Pirvola U, Vogt Weisenhorn DM, Rossant J, McMahon AP, Wurst W, Partanen J. 2003. FGFR1 is independently required in both developing mid- and hindbrain for sustained response to isthmic signals. EMBO J 22:1811–1823.

Wiese C, Wilde A, Moore MS, Adam SA, Merdes A, Zheng Y. 2001. Role of importin-beta in coupling Ran to downstream targets in microtubule assembly. Science 291:653–656.

Yang MS, Lee J, Ji KA, Min KJ, Lee MA, Jou I, Joe E. 2004. Thrombin induces suppressor of cytokine signaling 3 expression in brain microglia via protein kinase Cdelta activation. Biochem Biophys Res Commun 317:811–816.

Yeoh EJ, Ross ME, Shurtleff SA, Williams WK, Patel D, Mahfouz R, Behm FG, Raimondi SC, Relling MV, Patel A, Cheng C, Campana D, Wilkins D, Zhou X, Li J, Liu H, Pui CH, Evans WE, Naeve C, Wong L, Downing JR. 2002. Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell 1:133–143.

Zhang SS, Welte T, Fu XY. 2001. Dysfunction of Stat4 leads to accelerated incidence of chemical-induced thymic lymphomas in mice. Exp Mol Pathol 70:231–238.
