
164 IEEE TRANSACTIONS ON NANOBIOSCIENCE, VOL. 5, NO. 3, SEPTEMBER 2006

Application of the Taguchi Method to the Analysis of the Deposition Step in Microarray Production

Marco Severgnini*, Linda Pattini, Clarissa Consolandi, Ermanno Rizzi, Cristina Battaglia, Gianluca De Bellis, and Sergio Cerutti, Fellow, IEEE

Abstract—Every microarray experiment is affected by many possible sources of variability that may even corrupt the biological evidence on the analyzed sequences. We applied a "Taguchi method" strategy, based on the use of orthogonal arrays, to optimize the deposition step of oligonucleotide sequences on glass slides. We chose three critical deposition parameters (humidity, surface, and buffer) at two levels each, in order to establish optimum settings. An L8 orthogonal array was used in order to monitor both the main effects and the interactions on the deposition of a 25-mer oligonucleotide hybridized to its fluorescent-labeled complementary. Signal-background ratio and deposition homogeneity in terms of mean intensity and spot diameter were considered as significant outputs. An analysis of variance (ANOVA) was applied to raw data and to mean results for each slide and experimental run. Finally, we calculated an overall evaluation coefficient to group the important outputs together into one number. Environmental humidity and the surface-buffer interaction were recognized as the most critical factors, for which a 50% humidity, associated with a chitosan-covered slide and a sodium phosphate + 25% dimethyl sulfoxide (DMSO) buffer, gave the best performance. Our results also suggested that Taguchi methods can be efficiently applied in the optimization of microarray procedures.

Index Terms—Design of experiments, microarray, optimization, Taguchi methods.

I. INTRODUCTION

MICROARRAY technology has become one of the most popular and well-known strategies for analyzing thousands of genes at the same time, covering only a few square centimeters of space on a microscopy slide. Microarray technology has developed in many different directions, giving birth to oligonucleotide ([1]–[3]) or complementary DNA (cDNA) arrays ([4], [5]), in situ ([1], [3], [6]) or offline synthesized probes ([2], [4]), and many other types to investigate gene expression levels or DNA point mutations. Given this wide variety of possible configurations, optimization has historically been one of the major

Manuscript received April 29, 2005; revised May 15, 2006. This work was supported in part by the Center for Biomolecular Studies and Industrial Applications (CISI), Milan, under Grant RBNE01TZZ8 and in part by MIPAF Project funds (DM 524/7303/01, in cooperation with IRCCS Spallanzani, Milan). Asterisk indicates corresponding author.

*M. Severgnini is with the Institute of Biomedical Technologies of the National Research Council (ITB-CNR), 20090 Milan, Italy (e-mail: [email protected]).

C. Consolandi, E. Rizzi, and G. De Bellis are with the Institute of Biomedical Technologies of the National Research Council (ITB-CNR), 20090 Milan, Italy (e-mail: [email protected]; [email protected]; [email protected]).

L. Pattini and S. Cerutti are with the Department of Biomedical Engineering, Polytechnic University of Milan, 20100 Milan, Italy (e-mail: [email protected]; [email protected]).

C. Battaglia is with the Department of Sciences and Biomedical Technology, University of Milan, 20090 Milan, Italy (e-mail: [email protected]).

Digital Object Identifier 10.1109/TNB.2006.880851

challenge for research groups and companies focusing on microarray technology. A high signal-to-noise ratio (SNR) is certainly the most desirable goal to pursue, but features such as uniformity and sensitivity are also important.

Although commercial companies have reached a high quality standard for their microarrays, homemade printing and usage are far from achieving this standard.

Some groups have attempted to establish a set of reasonable requirements to control and trace back all sources of variation and nonuniformity in microarray experiments [7] or to advise a series of standardized protocols to manage data diversity and comparison [8].

A lot of work on optimization can be found in the scientific literature. Scientists have mainly concentrated on spot quality and DNA retention, trying to optimize a great variety of parameters [9]–[12] or to improve platform sensitivity through studies focused on probe length [13], chemical modifications [14], or molecular targets [15]. Others simply tried to improve amplification yield to ensure a good output signal [14], [16].

Since the first published article on microarray technology [4], a lot of improvements have been achieved, and nowadays experimenters can be more confident in data reliability and robustness. However, there are still many sources of variability that could corrupt target preparation, experimental phases, data extraction, and thus the biological evidence of an experiment.

We have considered one of the most critical processes in the whole procedure: we concentrated on the deposition phase, because of the many parameters (environmental and technical) that may affect quality in this particular step. However, optimization of such a huge number of parameters usually translates into time-consuming and expensive trials. In classical molecular biology, many attempts to optimize aspects such as reagent quantities or reaction kinetics have been made. Most of them have used optimization strategies such as one-factor-at-a-time [17], fractional factorial [18], [19], or full-factorial [20] approaches. The first involves changing factor levels one at a time, leaving the others unaltered; the second implies choosing a significant subset of the experiments; the latter means testing all possible combinations (which may be very long and expensive).

From a mathematical point of view, of course, full-factorial designs enable a correct prediction of the system behavior in all testing conditions, while the one-factor-at-a-time approach only analyzes some of those aspects, leaving a larger amount of uncertainty. The same concepts were well expressed by authors of papers in the mathematical field [21]. The fractional factorial approach, instead, provides a robust estimate of how some chosen factors can significantly influence the process yield.

Recently, Wrobel et al. described an application of the design of experiment (DOE) strategy and, in particular, surface response methodology, to establish optimum hybridization conditions for DNA microarrays [22].

1536-1241/$20.00 © 2006 IEEE

SEVERGNINI et al.: APPLICATION OF THE TAGUCHI METHOD TO THE ANALYSIS OF THE DEPOSITION STEP IN MICROARRAY PRODUCTION 165

In our work, we focused on the application of Taguchi methods in the microarray field: in particular, we used this well-known industrial optimization strategy to increase the overall quality of microarray deposition, in order to have more robust and reliable results. Classical applications of Taguchi's optimization strategy cover a wide range of industrial productions, such as manufacturing, electronic circuit settings, and many others. In 1994, Cobb and Clarkson, for the first time, introduced Taguchi methods for optimization in a biological context, leading to optimum settings for a PCR reaction [23]. Since that first application, Taguchi methods have been successfully applied to several molecular biology problems [24], [25].

The traditional Taguchi approach is based on the mathematical construction of orthogonal arrays (OAs), whose columns are mutually orthogonal to each other [26], [27].

A peculiarity of OAs is to define a precise set of experimental points, associating each test run with a specific row in the orthogonal array. Array columns are mutually orthogonal; that is, if we assign to each value in the array an appropriate and balanced weight, the scalar product of two randomly chosen columns is null. Using OAs allows experimenters to monitor a vast portion of the variability space with a minimum number of tests. In this case, a graphical representation is a set of equally dispersed points in the space of variables.
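As an illustration (a Python sketch, not part of the original study), an L8 array can be built from a 2^3 full factorial by coding the two levels as ±1 and taking column products for the interaction columns; orthogonality is then just a zero dot product between any two distinct columns:

```python
from itertools import product, combinations

# Build the L8 orthogonal array: the 8 rows are the runs of a 2^3 full
# factorial in three basis factors (a, b, c); the remaining columns are
# their element-wise products, which carry the interactions.
rows = []
for a, b, c in product((-1, 1), repeat=3):
    rows.append((a, b, a * b, c, a * c, b * c, a * b * c))

# Orthogonality: the dot product of any two distinct columns is zero.
for i, j in combinations(range(7), 2):
    assert sum(row[i] * row[j] for row in rows) == 0

print(len(rows), "runs x", len(rows[0]), "columns")  # → 8 runs x 7 columns
```

With this construction, column 3 is the element-wise product of columns 1 and 2, which is exactly why a linear graph reserves it for the 1 × 2 interaction.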

Taguchi methods are based on this construction, although they are not limited to the simple design of the experimental phase. The Taguchi protocol, in fact, goes through a detailed sequence of steps, associated with different phases, aimed at reducing the variability in the results, which is the main goal of the whole procedure, and at bringing system performance near a given target.

After a preliminary choice of how many variables, levels, and outputs were to be tested, experiments were conducted, trying both to optimize the error distribution (partial randomization) and to reduce deposition time. A data analysis step was performed on the experimental results to establish the significant parameters and calculate optimum settings for future depositions and further improvements.

Through the use of triangular tables or linear graphs, we could know which columns in the OA are to be reserved for interaction studies and which ones naturally interact.

The main peculiarity of OAs is their ability to predict system behavior even for untested combinations of levels and variables and to determine if and how two variables interact with each other. The main purpose of this work was not simply to use Taguchi methods on microarrays, but to verify that this optimization strategy could really be applied without altering experimental results or evidence. We thus meant to investigate if and how data analysis on full-factorial designs and on Taguchi-planned microarray spotting experiments would lead to identical conclusions.

II. METHODS

A. Experimental Planning

Taguchi methods were adopted to determine optimum levels for some critical variables linked to the deposition process: spot quality on the slide surface can be influenced by a series of parameters that users can set during the programming step (before effective deposition) of the contact-pin spotter. We chose to monitor three variables, at two levels each. Such a small number of parameters allowed us to verify interactions between two variables. Moreover, this design proves to be identical to a full-factorial one, allowing us to verify that both would lead to the same results. Parameters were selected among the most critical factors that can affect microarray quality, according to our experience in custom oligonucleotide slide preparation.

Fig. 1. Linear graph of interactions for L8 orthogonal arrays. Numbers represent column indexes in the OAs. Numbers on vertexes are single-factor columns, while numbers on the sides indicate which column has to be reserved for analyzing the interaction between them (i.e., the 1 × 2 interaction can be studied through column #3). Note that column 7 cannot be used for studying any of the interactions.

Our ultimate goal was to establish optimum conditions for microarray printing and reduce variation in the registered outputs.

The chosen OA was an L8, where the subscript 8 indicates the number of experimental runs. In this array, experimenters can control a maximum of seven variables at two levels each, or limit the analysis to the study of four factors and three interactions, greatly reducing the number of experiments. We opted for this second possibility, thus removing one column and appropriately assigning the remaining columns to each factor and interaction, according to the scheme provided by the linear graph of Fig. 1.

According to both the literature and our experience, we chose environmental humidity percentage, surface coating, and printing buffer as the parameters for optimization. For each variable, we considered two very different levels, in order to have less uncertainty in a simple "screening" experiment.

The influence of the humidity value (in percent of the saturation level) and of the buffer on microarray quality has already been studied in a recent publication [28], while surface chemistry was one of the first possibilities to be investigated in quality-improvement experiments [29]–[31].

Humidity has been recognized as an important factor influencing both spot size and homogeneity, the latter being related to the evaporation rate during print runs. This phenomenon should be avoided to prevent donut-shaped spot deposition, with irregular probe concentration and nonhomogeneous signal, increasing variability and uncertainty in collecting biological evidence from experiments.

Surface chemistry and its effects on binding capacity, hybridization specificity, and many other quality controls in microarrays was one of the first variables to be studied, for its complexity and for the huge amount of immobilization techniques and strategies known.

TABLE I
EXPERIMENTAL PARAMETERS AND LEVELS

Dimethyl sulfoxide (DMSO) addition to the spotting solution has been investigated and described previously, as it consistently increases spot diameter [22], [28]. This makes the buffer unsuitable for high-density arrays, while, on the other hand, it can improve spot homogeneity. As our goal was to improve deposition for low-density arrays (useful for genotyping analysis and single-nucleotide polymorphism detection), we chose to test this particular spotting buffer.

As the most common image analysis software packages are based on a simple fixed-circle algorithm, having a nonhomogeneous spot may result in incorrect pixel segmentation and could lead to mistakes in the quantitation step. Although some packages are able to extract information even from low-quality depositions via more complex algorithms, in our opinion it is better to try to improve overall quality in production rather than to correct errors during quantitation.

Humidity was tested at 50% and 75%. The surfaces selected for this optimization were a commercial one (CodeLink slides by Amersham, Piscataway, NJ) and a homemade one (chitosan-covered slides) [32], [33]. The deposition buffer was composed of either sodium phosphate (P) or sodium phosphate + DMSO at 25% (v/v percentage in wells) (P + DMSO).

Table I shows the chosen levels for each parameter.

B. Experimental Step

Using a MicroGrid II Compact spotter (Biorobotics, c/o Genomic Solutions Europe, Cambridgeshire, U.K.) equipped with Apogent Discoveries 2500 split pins (Hudson, NH), a 19 × 18 array was spotted on each slide, with 30 repetitions of each sample (Fig. 2). Each experimental run was performed in triplicate, for a total of 24 slides. To assess the robustness of our results on the deposition step, we chose to test a simple and short nucleotide sequence (Zip66, a 25-mer 5′-amino-modified oligonucleotide with a ten-base polyA tail, sequence 5′-AAAAAAAAAATACCGGCGGCAGCACCAGCGGTAAC-3′). All oligonucleotides and complementaries were purchased from Interactiva (Ulm, Germany). To reduce deposition duration, thanks to the capability of the spotter holder to host many slides on a single plate, both "levels" (commercial and homemade chemistries) for surface coating were run together. So three commercial slides and three homemade ones were spotted at the same time, halving the total number of experimental runs needed. After deposition, slides were processed at 50 °C in 50 mM ethanolamine, 100 mM Tris HCl (pH 9) for 15 min, washed with water, rinsed for 15 min in 4X SSC and 0.1% SDS, washed in water again, centrifuged, and stored in the dark. The hybridization step was performed in a GeneTac hybridization station (Genomic Solutions, MI) for 2 h at 65 °C in the presence of SSC 5X, salmon sperm DNA 1 pmol/ml, and cZIP66-Cy5 (complementary to the spotted

Fig. 2. Black-and-white capture of one of the slides produced for the optimization experiment. The color map was inverted to facilitate discrimination. Eighteen rows and 19 columns compose the array. Empty spaces in the matrix correspond to blanks (buffer or noncomplementary sequences).

oligo) 1 pmol/ml. Slides were then washed for 15 min with SSC 1X and SDS 0.1% at 65 °C, centrifuged at 800 r/min for 3 min, and stored in the dark to avoid photobleaching.

Laser scanning was performed at 630-nm wavelength, corresponding to the Cy5 excitation peak, using a ScanArray 5000 confocal laser scanner (PerkinElmer, Boston, MA). Laser power and photomultiplier tube (PMT) gains were set properly to avoid saturation and photobleaching. Of the total of 24 arrays, 7 (all characterized by the CodeLink surface) were acquired at 64% PMT gain, instead of 69% (the remaining arrays). This difference originated from the need to have the entire set of spots below the saturation level of fluorescence. Thus, slides were acquired at two different settings for the photomultiplier (PMT) gain, keeping the laser power fixed, so a proper scaling of quantitation outputs before data analysis was necessary. To properly set the scaling factor, we scanned two slides again, setting the photomultiplier gain first to 64% and then to 69%, and created a scatter plot of the corresponding intensities for the same spots in the two acquisitions.

The quantitation step was performed with the ScanArray Express software (PerkinElmer), with fixed circle as the segmentation algorithm and considering almost the whole signal range (5%–95%) as part of spots or background, without excluding a significant number of pixels. This choice is directly related to the need for maximum objectivity in judging slide quality, because our ultimate goal was to analyze the deposition and its peculiar characteristics and not to extract biological or functional information from each spot. The following statistical analysis on the whole data set was performed using specifically implemented Matlab (MathWorks Inc., Natick, MA) procedures.

III. RESULTS AND DISCUSSION

A preliminary analysis of all slides revealed high hybridization specificity, easily assessed through the analysis of blank spots, as well as of unspecific oligonucleotides (to which no complementary hybridized), which were all flagged as "not found" by the quantification software.


Fig. 3. Plot of the spot distribution for percentage of pixels above background intensity plus two standard deviations. Each bin represents an interval in the percentage of pixels above background + 2 standard deviations. Most spots are concentrated above 80%, and particularly at 100%. The three lowermost bins are the ones excluded by our filtering.

First of all, we had to calculate a proper scaling factor to manage the different PMT gains used in the two batches of scanning. Plotting the intensities for the same slides obtained with each of the PMT settings, we calculated a slope of 1.7 for the linear regression line describing the data.

So all intensities and standard deviations (for both spots and background) of the seven arrays mentioned above were multiplied by 1.7 to make the intensity distributions uniform.
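The rescaling step can be sketched as follows (a Python sketch with made-up intensities; the real slope came from the scanned slide pairs):

```python
def slope_through_origin(x, y):
    """Least-squares slope of the regression line y = m * x forced
    through the origin, used here as the PMT scaling factor."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# Hypothetical paired intensities of the same spots scanned at
# 64% and 69% PMT gain (values are illustrative only).
gain_64 = [1000.0, 2000.0, 5000.0, 12000.0, 30000.0]
gain_69 = [v * 1.7 for v in gain_64]  # assume a ~1.7x response difference

m = slope_through_origin(gain_64, gain_69)
rescaled = [v * m for v in gain_64]  # bring the 64%-gain batch onto the 69% scale
```

A fit with an intercept (ordinary least squares) would work equally well; forcing the line through the origin simply reflects the expectation that zero intensity maps to zero at any gain.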

Before proceeding to the analysis of variance (ANOVA), we decided to apply a prefiltering to all samples, to reject "outlier" spots not belonging to the significant spot population. First of all, we excluded those spots flagged as "not found" by the software, for their incorrect spot segmentation. Then we filtered the spot population according to the percentage of pixels whose intensity exceeded background intensity + 2 standard deviations, excluding from the analysis any spot whose percentage was less than 40% (Fig. 3). Finally, a filter on the SNR of each spot was applied, discarding spots belonging to the lower 10% of the SNR distribution (Fig. 4). The number of spots not respecting these parameters was limited to 13 (of a total of 720 spots), 6 of which were flagged as "not found." Filter thresholds for pixel percentage and SNR were established by testing different values and looking at the number of excluded spots. Excluding a relatively high number of spots from the analysis could have introduced a bias, removing possible sources of information on deposition quality, especially if many of those spots were typical of a single array or experimental condition.
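The three filters can be summarized in a short Python sketch (the field names and example spots are hypothetical; the thresholds are the ones quoted above):

```python
def prefilter(spots, pct_min=40.0, snr_quantile=0.10):
    """Reject outlier spots before ANOVA: drop 'not found' flags and
    spots with fewer than pct_min percent of pixels above
    background + 2 SD, then drop the lower tail of the SNR distribution."""
    kept = [s for s in spots
            if s["flag"] != "not found" and s["pct_above_bg2sd"] >= pct_min]
    kept.sort(key=lambda s: s["snr"])
    cut = int(len(kept) * snr_quantile)  # lower 10% of the SNR distribution
    return kept[cut:]

# Hypothetical spot records: one flagged, one with too few bright pixels,
# ten valid spots with SNR 10, 20, ..., 100.
spots = [
    {"flag": "not found", "pct_above_bg2sd": 95.0, "snr": 50.0},
    {"flag": "found", "pct_above_bg2sd": 30.0, "snr": 80.0},
] + [{"flag": "found", "pct_above_bg2sd": 99.0, "snr": 10.0 * k}
     for k in range(1, 11)]

print(len(prefilter(spots)))  # → 9
```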

Output parameters (on which the ANOVA was performed) were chosen according to literature references. We decided to keep track of the SNR (ratio of mean signal intensity to background standard deviation), spot homogeneity (in terms of pixel coefficient of variation, CV%), fluorescence intensity (IF) variation within and between slides (again in terms of CV%), and diameter variability within and between slides. After that, we calculated a first ANOVA, using only the SNR and spot homogeneity, for which we could associate a number to each spot, considering all replicates together. This procedure was repeated after each filtering step described above. This possibility was allowed by the coincidence of our experimental design with a full-factorial experiment. For each group of the three-way ANOVA, there are 90 repetitions (30 replicates × 3 slides); the data abundance and its great variability (as we expected) did not let us determine univocally the optimum experimental conditions: nearly all factors would have resulted as significant. The next step was to compute the two quantities on mean results for each experimental set (90 samples, 1 value per group of the ANOVA table) and on mean data for each slide (30 samples, 3 values per group).

Fig. 4. Plot of the spot distribution for SNR. The number of spots belonging to different SNR bins is represented. Even though the majority of spots are concentrated in the lower half of the plot, only spots with SNR below 12 have been filtered out (this value is 10% of the mean SNR of the whole population).

TABLE II
SIGNIFICANT PARAMETERS AND CONFIDENCE LEVELS FOR MEAN OUTPUTS ON 30 OR 90 REPLICATES

As shown in Table II, some variables were significant in both tests, in particular humidity and the surface × buffer interaction, despite changes in the significance levels.
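For a balanced two-level OA column, the ANOVA sum of squares has a simple closed form; a minimal Python sketch (not the paper's Matlab code) of that computation:

```python
def ss_factor(column, responses):
    """Sum of squares of a two-level factor in a balanced design:
    SS = (sum of responses at level 1 - sum at level 2)^2 / N."""
    s1 = sum(r for lvl, r in zip(column, responses) if lvl == 1)
    s2 = sum(r for lvl, r in zip(column, responses) if lvl == 2)
    return (s1 - s2) ** 2 / len(responses)

# Toy example: four runs, with the factor at level 1 in the first two.
print(ss_factor([1, 1, 2, 2], [3.0, 5.0, 1.0, 1.0]))  # → 9.0
```

The F test then compares this SS (one degree of freedom per two-level factor) with the error mean square, which is how significance levels such as those in Table II are obtained.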

Then we introduced the most important parameter of the analysis, which Taguchi called the "signal-to-noise ratio" [27]; to avoid confusion with the SNR term calculated above, we decided to address it as S/N. It performs a quadratic transformation that allows the extraction of a single number from an entire series of data, keeping track of both the mean and the variance of the results around the desired goal.

TABLE III
SIGNIFICANT FACTORS FOR TAGUCHI-TRANSFORMED OUTPUTS AND PERCENT CONTRIBUTION OF EACH

S/N = -10 \log_{10}\left(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{y_i^2}\right) (1)

S/N = -10 \log_{10}\left(\frac{1}{n}\sum_{i=1}^{n} y_i^2\right) (2)

Equation (1) refers to terms for which higher values are desirable (also called "larger is better"), while (2) refers to "smaller is better" terms, for which the lowest values are desirable. SNR is a "larger is better" output, while all remaining parameters are "smaller is better."
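Both transformations are easy to compute; the sketch below (Python) assumes the standard Taguchi forms of (1) and (2), with the mean taken over the n replicates of a run:

```python
import math

def sn_larger_is_better(y):
    """Taguchi S/N for 'larger is better': -10*log10(mean(1/y_i^2))."""
    return -10.0 * math.log10(sum(1.0 / yi ** 2 for yi in y) / len(y))

def sn_smaller_is_better(y):
    """Taguchi S/N for 'smaller is better': -10*log10(mean(y_i^2))."""
    return -10.0 * math.log10(sum(yi ** 2 for yi in y) / len(y))

# In both cases a higher S/N is the desirable direction, so the same
# "maximize S/N" rule applies to SNR replicates and CV% replicates alike.
print(round(sn_larger_is_better([10.0, 10.0]), 6))   # → 20.0
print(round(sn_smaller_is_better([10.0, 10.0]), 6))  # → -20.0
```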

The transformation was applied separately to the six parameters mentioned above. The ANOVA table (Table III) shows the significant parameters at three different confidence levels, together with the percent contribution of each factor to the total variance. The percent contribution was calculated as the ratio of the sum of squares for each variable to the total sum of squares. The error contribution to variability (an estimate of the internal variance of each parameter) is subtracted from each term (3). Ross [26] suggests looking at both the percent contribution and the confidence level in order to select the optimum settings for each parameter found to be significant

P_i = \frac{SS_i - \nu_i\,MS_e}{SS_T} \times 100 (3)

In the above (3), SS_i is the sum of squares of the i-th factor, SS_T is the total sum of squares used in the ANOVA, whereas \nu_i\,MS_e is the number of degrees of freedom of the i-th factor multiplied by the mean square of the error.

Fig. 5. S/N plots for factor A (humidity) on the six different outputs on each of the two tested levels.

This procedure was repeated for every variable or interaction of interest, while the error contribution was calculated as the difference

P_e = 100 - \sum_i P_i (4)
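Equations (3) and (4) translate directly into code; a sketch (Python, with toy numbers) of the percent-contribution bookkeeping:

```python
def percent_contributions(ss, dof, ms_error, ss_total):
    """Percent contribution per (3): the error's share (dof_i * MS_error)
    is subtracted from each factor's sum of squares before dividing by
    the total sum of squares; the leftover is the error term of (4)."""
    p = {name: (ss[name] - dof[name] * ms_error) * 100.0 / ss_total
         for name in ss}
    p["error"] = 100.0 - sum(p.values())  # (4)
    return p

# Toy ANOVA: two factors with one degree of freedom each.
p = percent_contributions({"A": 50.0, "BxC": 30.0}, {"A": 1, "BxC": 1},
                          ms_error=2.0, ss_total=100.0)
print(p)  # → {'A': 48.0, 'BxC': 28.0, 'error': 24.0}
```

With these numbers the controlled factors account for 76% of the total variability, below the 85% rule of thumb discussed next, so this toy design would be judged incomplete.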

Comparing both the percent contributions and the significant variables for each level in Table III, we concluded that humidity (variable A) and the surface × buffer interaction (B × C) were the most critical parameters. In particular, if we consider the least stringent confidence level (12%), A and B × C were significant for nearly all outputs (inter-slide IF CV% excluded A, while diameter-related parameters excluded B × C), while always accounting for more than 10% of the total variability (excluding inter-slide diameter CV%).

The percent contribution of the error was taken into account to assess the suitability of our design. An OA-based optimization is considered successful if the sum of the percent contributions of all controlled factors represents more than 85% of the total uncertainty. When this happens, experimenters can be confident that no important factor has been excluded from the analysis [26].

We thus calculated the optimum variable-level combination for maximizing both humidity (Fig. 5) and the surface × buffer interaction (Fig. 6). In case of a significant interaction between two or more variables, Taguchi suggests the construction of a two-way table and a complete analysis of every possible combination of factors and levels, leading to the choice that least penalizes the overall performance of the system.

From the graphical representation (Figs. 5 and 6) it seems clear that for all the chosen output parameters (except for inter-slide IF CV%) humidity at level 1 (50%) was the best choice, achieving a mean gain of 12% on all derived S/N (with a maximum of 54% on inter-slide diameter CV%).

The surface × buffer interaction study by the two-way table highlighted that level 2 for both variables gave the best performance on mean CV% and intra-slide CV%, whereas a different level combination was the optimum choice for SNR and inter-slide CV%.

We preferred level 2 for both variables, given the markedly better performance in the first two outputs. The homemade chitosan chemistry for the surface was thus preferable, as was an addition of


Fig. 6. Graphical representation of the two-way tables for the analysis of the surface × buffer interaction for all the levels tested. The upper panel [Fig. 6(a)] shows S/N values for mean CV% spots, intra-slide IF CV%, and inter-slide IF CV%, as a result of the two-way table between surface (B) and buffer (C). The lower panel [Fig. 6(b)] shows S/N for SNR as resulting from the two-way table. The two histograms were plotted separately to avoid mixing parameters whose S/N was "smaller is better" (upper) with "larger is better" (lower).

DMSO to the deposition buffer composition (25% final concentration in the wells).

These evaluations, however, considered only one parameter at a time, while we would have liked to unify a list of requirements in a unique number. This goal was reached through the introduction of an overall evaluation coefficient (OEC) [34]. The expression of this parameter is shown in (5):

OEC = w_SNR · (SNR − SNR_worst)/(SNR_best − SNR_worst)
    + Σ_k w_k · (CV_k,worst − CV_k)/(CV_k,worst − CV_k,best)    (5)

where the sum runs over the five CV% outputs (mean CV% spots, intra-slide IF CV%, inter-slide IF CV%, intra-slide CV% diameter, and inter-slide CV% diameter), and the w terms are the weighting factors.

Each factor in the OEC (5) is weighted according to the experimenters' choice and the importance it has on the final quality, and is considered in relation to the "best" or "worst" values for the parameter itself. The choice depends on the factor type in the OEC: a "larger is better" quantity is related to worst values, a "smaller is better" one to best values. All values are turned into nondimensional numbers by normalizing over the whole range of variation, making them homogeneous quantities, and are summed together.

All variation ranges were thus directly derived from the maximum and minimum values in the quantitation step, while the weighting factor was arbitrarily set to 1/6, to balance all contributions to the OEC. In the literature, we found no evidence on which parameters should be considered the most important: each researcher or laboratory tends, in fact, to consider as most important those factors that fit its own requirements.
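A minimal sketch of this OEC construction, assuming hypothetical output values and illustrative weights (the paper uses six outputs weighted 1/6 each):

```python
# Sketch of the overall evaluation coefficient (OEC) from (5).
# The output values below are hypothetical. Each output is normalized over
# its observed range; "smaller is better" terms are inverted so that every
# normalized contribution grows as quality improves.

def oec(outputs, kinds, weights):
    """outputs: {name: list of values, one per array}
    kinds: {name: 'larger' or 'smaller'}
    weights: {name: weight}. Returns one OEC per array."""
    n = len(next(iter(outputs.values())))
    scores = [0.0] * n
    for name, vals in outputs.items():
        lo, hi = min(vals), max(vals)
        for i, v in enumerate(vals):
            frac = (v - lo) / (hi - lo)      # nondimensional, 0..1 over range
            if kinds[name] == "smaller":
                frac = 1.0 - frac            # invert so higher means better
            scores[i] += weights[name] * frac
    return scores

# Two arrays, two outputs (the paper uses 24 arrays and six outputs,
# each weighted 1/6).
outputs = {"SNR": [10.0, 30.0], "CV_mean": [5.0, 15.0]}
kinds = {"SNR": "larger", "CV_mean": "smaller"}
weights = {"SNR": 0.5, "CV_mean": 0.5}
print(oec(outputs, kinds, weights))  # [0.5, 0.5]
```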

The 24 OECs resulting from (5) (one per array) were then transformed into equivalent S/N values according to the Taguchi formula shown before (Table IV). In this case, the OECs are all "larger is better" quantities.
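The standard Taguchi "larger is better" transformation (see, e.g., [26]) can be sketched as:

```python
# Taguchi "larger is better" S/N transformation over n replicates y_1..y_n:
#   S/N = -10 * log10( (1/n) * sum(1 / y_i^2) )
import math

def sn_larger_is_better(values):
    """Return the Taguchi 'larger is better' S/N for a list of replicates."""
    n = len(values)
    return -10.0 * math.log10(sum(1.0 / (y * y) for y in values) / n)

# With a single OEC per array (n = 1), this reduces to 20 * log10(OEC).
print(round(sn_larger_is_better([10.0]), 6))  # 20.0
```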

The OEC construction and the analysis of the results shown in Fig. 7 and Table V slightly changed our point of view, leading us to consider every factor (even at 5% confidence) as significant for the analysis.

A comparison with the percent contribution to overall variability for each controlled factor, however, points to humidity and the surface × buffer interaction as the most important factors, as stated before. Note that the variable named as error in the ANOVA table also goes to zero, suggesting that all sources of variation were properly analyzed through our experimental plan.

The analysis performed with a unique score highlights the same evidence already assessed before, thus confirming our previous findings.

Again, 50% environmental humidity, chitosan coverage of the slide surface, and sodium phosphate buffer with 25% DMSO gave the best performance.

IV. CONCLUSION

From the experimental data collected, it is clear that the application of Taguchi methods may lead to useful considerations about microarray deposition optimization, a field apparently different from the classical application domains of these optimization strategies.

As Wrobel suggested [22], design of experiments (DOE) such as the Taguchi method can significantly improve microarray performance in terms of quality, reducing experimental duration and the number of runs. The coincidence of experimental evidence for both the mean parameters (on 30 or 90 replicates) and the outputs calculated via the Taguchi quadratic transformation (also after the introduction of an all-comprehensive quantity, the OEC) supports our statement. Many other parameters related to the printing tip or to environmental conditions can be controlled during deposition and affect the quality of microarray experiments. More useful work can be done especially on spot homogeneity, for which our results were not completely successful. The spot pixel population is currently spread across too large a range of values, thus increasing uncertainty in the outputs when considering mean or median


TABLE IV
TAGUCHI-TRANSFORMED OEC FOR EACH EXPERIMENTAL RUN

TABLE V
ANOVA TABLE FOR TAGUCHI-TRANSFORMED OEC AND PERCENT CONTRIBUTION OF PARAMETERS

Fig. 7. Plot of S/N on OEC for each experimental run.

pixel intensities (which should ideally be equal to each other for perfectly homogeneous spots).

However, the humidity level at 50% produced a mean gain of 12% in performance, while the interaction between the chitosan surface and the phosphate buffer with DMSO accounted for a 6% improvement.

Our experiments should therefore be considered only a starting point for further refinement of the analysis, determining values with higher precision and improving, wherever possible, the overall quality of microarray depositions.

REFERENCES

[1] D. J. Lockhart, H. Dong, M. C. Byrne, M. T. Follettie, M. V. Gallo, and M. S. Chee et al., "Expression monitoring by hybridization to high-density oligonucleotide arrays," Nature Biotechnol., vol. 13, pp. 1675–1680, 1996.

[2] R. Ramakrishnan, D. Dorris, A. Lublinsky, A. Nguyen, M. Domanus, and A. Prokhorova et al., "An assessment of Motorola CodeLink microarray performance for gene expression profiling applications," Nucleic Acids Res., vol. 7, p. e30, 2002.

[3] T. R. Hughes, M. Mao, A. R. Jones, J. Burchard, M. J. Marton, and K. W. Shannon et al., "Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer," Nature Biotechnol., vol. 4, pp. 342–347, 2001.

[4] M. Schena, D. Shalon, R. W. Davis, and P. O. Brown, "Quantitative monitoring of gene expression patterns with a complementary DNA microarray," Science, vol. 270, pp. 467–470, 1995.

[5] J. DeRisi, L. Penland, P. O. Brown, M. L. Bittner, P. S. Meltzer, and M. Ray et al., "Use of a cDNA microarray to analyse gene expression patterns in human cancer," Nature Genetics, vol. 4, pp. 457–460, 1996.

[6] S. Singh-Gasson, R. D. Green, Y. Yue, C. Nelson, F. Blattner, and M. R. Sussman et al., "Maskless fabrication of light-directed oligonucleotide microarrays using a digital micromirror array," Nature Biotechnol., vol. 10, pp. 974–978, 1999.

[7] A. Brazma, P. Hingamp, J. Quackenbush, G. Sherlock, P. Spellman, and C. Stoeckert et al., "Minimum information about a microarray experiment (MIAME)-toward standards for microarray data," Nature Genetics, vol. 29, pp. 365–371, 2001.

[8] V. Benes and M. Muckenthaler, "Standardization of protocols in cDNA microarray analysis," Trends Biochem. Sci., vol. 28, pp. 244–249, 2003.

[9] M. J. Hessner, X. Wang, S. Khan, L. Meyer, M. Schlicht, and J. Tackes et al., "Use of a three-color cDNA microarray platform to measure and control support-bound probe for improved data quality and reproducibility," Nucleic Acids Res., vol. 31, p. e60, 2003.

[10] J. E. Korkola, A. L. Estep, S. Pejavar, S. DeVries, R. Jensen, and F. M. Waldman, "Optimizing stringency for expression microarrays," Biotechniques, vol. 35, pp. 828–835, 2003.

[11] A. Relogio, C. Schwager, A. Richter, W. Ansorge, and J. Valcarcel, "Optimization of oligonucleotide-based DNA microarrays," Nucleic Acids Res., vol. 30, p. e51, 2002.

[12] M. Sartor, J. Schwanekamp, D. Halbleib, I. Mohamed, S. Karyala, and M. Medvedovic et al., "Microarray results improve significantly as hybridization approaches equilibrium," Biotechniques, vol. 36, pp. 790–796, 2004.

[13] C. C. Chou, C. H. Chen, T. T. Lee, and K. Peck, "Optimization of probe length and the number of probes per gene for optimal microarray analysis of gene expression," Nucleic Acids Res., vol. 32, p. e99, 2004.

[14] S. D. Li, L. Tong, S. H. Cheng, Y. Ding, S. B. Li, and S. Q. Wang, "Fabrication and optimization of HLA-DRB1-12 oligonucleotide microarray," Zhongguo Shi Yan Xue Ye Xue Za Zhi, vol. 11, pp. 393–397, 2003.

[15] J. Peplies, F. O. Glockner, and R. Amann, "Optimization strategies for DNA microarray-based detection of bacteria with 16S rRNA-targeting oligonucleotide probes," Appl. Environ. Microbiol., vol. 69, pp. 1397–1407, 2003.


[16] H. Zhao, T. Hastie, M. L. Whitfield, A. L. Borresen-Dale, and S. S. Jeffrey, "Optimization and evaluation of T7 based RNA linear amplification protocols for cDNA microarray analysis," BMC Genomics, vol. 3, p. 31, 2002.

[17] J. H. Xiao, D. X. Chen, J. W. Liu, Z. L. Liu, W. H. Wan, and N. Fang et al., "Optimization of submerged culture requirements for the production of mycelial growth and exopolysaccharide by Cordyceps jiangxiensis JXPJ 0109," J. Appl. Microbiol., vol. 96, pp. 1105–1116, 2004.

[18] Y. L. Loukas, "A computer-based expert system designs and analyzes a 2(k-p) fractional factorial design for the formulation optimization of novel multicomponent liposomes," J. Pharm. Biomed. Anal., vol. 1, pp. 133–140, 1998.

[19] S. E. Wildsmith, G. E. Archer, A. J. Winkley, P. W. Lane, and P. J. Bugelski, "Maximization of signal derived from cDNA microarrays," Biotechniques, vol. 30, pp. 202–206, 208, 2001.

[20] M. D. Gupte and P. R. Kulkarni, "A study of antifungal antibiotic production by Streptomyces chattanoogensis MTCC 3423 using full factorial design," Lett. Appl. Microbiol., vol. 35, pp. 22–26, 2002.

[21] V. Czitrom, "One-factor-at-a-time versus designed experiment," Amer. Statist., vol. 53, pp. 126–131, 1999.

[22] G. Wrobel, J. Schlingemann, L. Hummerich, H. Kramer, P. Lichter, and M. Hahn, "Optimization of high-density cDNA-microarray protocols by 'design of experiments'," Nucleic Acids Res., vol. 31, p. e67, 2003.

[23] B. D. Cobb and J. M. Clarkson, "A simple procedure for optimising the polymerase chain reaction (PCR) using modified Taguchi methods," Nucleic Acids Res., vol. 22, pp. 3801–3805, 1994.

[24] G. Caetano-Anolles, "DAF optimization using Taguchi methods and the effect of thermal cycling parameters on DNA amplification," Biotechniques, vol. 25, pp. 472–476, 478–480, 1998.

[25] C. Jeney, O. Dobay, A. Lengyel, E. Adam, and I. Nasz, "Taguchi optimisation of ELISA procedures," J. Immunol. Methods, vol. 223, pp. 137–146, 1999.

[26] P. J. Ross, Taguchi Techniques for Quality Engineering. New York: McGraw-Hill, 1988.

[27] G. Taguchi, Introduction to Quality Engineering: Designing Quality Into Products and Processes. Tokyo, Japan: Asian Productivity Organization, 1986.

[28] M. K. McQuain, K. Seale, J. Peek, S. Levy, and F. R. Haselton, "Effects of relative humidity and buffer additives on the contact printing of microarrays by quill pins," Anal. Biochem., vol. 320, pp. 281–291, 2003.

[29] F. Diehl, S. Grahlmann, M. Beier, and J. D. Hoheisel, "Manufacturing DNA microarrays of high spot homogeneity and reduced background signal," Nucleic Acids Res., vol. 29, p. E38, 2001.

[30] K. Lindroos, U. Liljedahl, M. Raitio, and A. C. Syvanen, "Minisequencing on oligonucleotide microarrays: comparison of immobilisation chemistries," Nucleic Acids Res., vol. 29, p. E69, 2001.

[31] S. Taylor, S. Smith, B. Windle, and A. Guiseppi-Elie, "Impact of surface chemistry and blocking strategies on DNA microarrays," Nucleic Acids Res., vol. 31, p. e87, 2003.

[32] C. Consolandi, B. Castiglioni, R. Bordoni, E. Busti, C. Battaglia, and L. Rossi Bernardi et al., "Two efficient polymeric chemical platforms for oligonucleotide microarray preparation," Nucleosides Nucleotides Nucleic Acids, vol. 21, pp. 561–580, 2002.

[33] C. Consolandi, M. Severgnini, B. Castiglioni, R. Bordoni, A. Frosini, and C. Battaglia et al., "A structured chitosan-based platform for biomolecule attachment to solid surfaces: application to DNA microarray preparation," Bioconjug. Chem., vol. 2, pp. 371–377, 2006.

[34] R. K. Roy, A Primer on the Taguchi Method. Dearborn, MI: SME, 1990.

Marco Severgnini was born in San Donato Milanese, Milan, Italy, on March 8, 1978. He received the Laurea degree in biomedical engineering from the Politecnico of Milan, Milan, Italy, in 2003, with a final thesis on the application of the Taguchi method to DNA microarrays.

He held an honorary fellowship at the University of Wisconsin, Madison, in 2005 and currently has a research grant at the Institute for Biomedical Technologies at the National Research Council in Segrate, Milan, Italy. His main research interests involve the optimization of procedures in microarray technology, the utilization and development of new nanotechnological devices in the molecular biology field, and statistical data analysis.

Linda Pattini received the Laurea degree in electronic engineering from the Polytechnic University, Milan, Italy, in 1999. In 2003, she received the Ph.D. degree in biomedical engineering with a thesis developed in collaboration with a research group of human genetics, concerning information treatment in the framework of a project about proteins involved in the etiogenesis of neuromuscular and neurodegenerative diseases.

She is currently a Research Fellow in the Department of Biomedical Engineering of the Polytechnic University of Milan, where she holds a course in computational genomics and practical lessons on the fundamentals of electronic bioengineering. Her research activity concerns various aspects of data processing in both genomics and proteomics.

Dr. Pattini is a member of the International Society for Computational Biology (ISCB).

Clarissa Consolandi was born in Treviglio (Bergamo), Italy, on January 29, 1974. She received the Laurea degree in biological science in 2000 and the Ph.D. degree in molecular medicine (curriculum genomics and proteomics) from the University of Milan, Milan, Italy, in 2004.

She is currently a Researcher at the Institute of Biomedical Technology of the National Council of Research in Milan, Italy. Her main research interests include molecular biology and nanotechnology. Her research activity is focused, in particular, on microarray technology. She has authored or coauthored about 20 scientific papers.

Ermanno Rizzi was born in Monza, Italy, on October 16, 1976. He received the Laurea degree in industrial biotechnologies from Università di Milano Bicocca, Milan, Italy, in 2001. He is currently working toward the Ph.D. degree in molecular medicine at the University of Milan.

He received training in parallel pyrosequencing at 454 Life Sciences, Branford, CT, in 2005. He currently has a fellowship with the Institute of Biomedical Technologies at the National Research Council in Milan. His main research interests are the development of technologies for nucleic acid analysis and sequencing.

Cristina Battaglia received the degree in biological sciences from the University of Milan, Milan, Italy.

She held a postdoctoral fellowship at the Department of Connective Tissue Research of the Max Planck Institute for Biochemistry, Martinsried, Munich, Germany (1989–1991), and at the bioengineering unit of the Kidney Diseases Labs, Pharmaceutical Research Institute Mario Negri (IRFMN), Bergamo (1992–1994). She is Head of the microarray facility at the Centre for Biomolecular Interdisciplinary Studies and Industrial Applications (CISI), Milan, Italy. Since 1998, she has been a Researcher at the University (BIO/10) in the Department of Science and Biomedical Technology (DiSTeB), Faculty of Medicine and Surgery, University of Milan. She is currently an Assistant Professor holding courses in biochemistry at the Faculty of Medicine, University of Milan. Since 2001, she has tutored many students in the Ph.D. course on molecular medicine at the University of Milan and has been a member of the board of the Ph.D. program, before becoming a member of the Centre of Excellence CISI. She is coauthor of more than 30 scientific papers, some of them as corresponding author. Her main interests include microarray technology applications, the development of biochemical strategies of interest for human molecular diagnostics, and bioinformatics.

Gianluca De Bellis received the degree in chemistry (cum laude) from the University of Milan, Milan, Italy, in 1987.

Since 1989, he has been a Researcher with the National Research Council, Institute for Biomedical Technologies, Milan. He has published over 50 papers


in international journals and holds four patents in microarray technology. His research activity has been directed to nucleic acids analysis from the beginning. Since 1996, he has focused on microarray technology, exploring with several partners the methodological and technological hurdles in this area.

Sergio Cerutti (M'81–S'97–F'02) received the Laurea degree in electronic engineering from the Polytechnic University, Milan, Italy, in 1971.

From 1982 to 1990, he was Associate Professor of Biomedical Engineering at the Department of Biomedical Engineering of the same Polytechnic, where since 1994 he has been a Professor of Biomedical Signal and Data Processing. From 1990 to 1994, he was also Professor of Biomedical Engineering at the Department of Computer and System Sciences of the University of Rome "La Sapienza," Rome, Italy. He spent over a year as a Visiting Professor at MIT and the Harvard School of Public Health, Boston, MA. He is the author of some hundreds of papers and books on these topics published in the international scientific literature. His research activity is mainly dedicated to various aspects of biomedical signal processing and modeling related to the cardiovascular system and to the field of neurosciences.

Prof. Cerutti is Chairman of the Biomedical Engineering Group of the Italian AEI (Association of Electrical Engineering) and a member of IEEE-AEI, IFMBE-AIIMB, ESEM, IEC-CEI, ISO-UNI, and other international and national scientific associations. He was a member of the AdCom of the IEEE-EMBS (1992–1996) as Representative of Region 8. He is currently on the Editorial Board of various scientific journals and is an Associate Editor of the IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING.