Validation Is Like Motor Oil: Synthetic Is Better


Industrial and Organizational Psychology, 3 (2010), 305–328. Copyright © 2010 Society for Industrial and Organizational Psychology. 1754-9426/10

FOCAL ARTICLE

Validation Is Like Motor Oil: Synthetic Is Better

JEFF W. JOHNSON
Personnel Decisions Research Institutes

PIERS STEEL
University of Calgary

CHARLES A. SCHERBAUM
Baruch College

CALVIN C. HOFFMAN
Los Angeles County Sheriff's Department and Alliant University

P. RICHARD JEANNERET
Valtera Corporation

JEFF FOSTER
Hogan Assessment Systems

Abstract

Although synthetic validation has long been suggested as a practical and defensible approach to establishing validity evidence, synthetic validation techniques are infrequently used and not well understood by the practitioners and researchers they could most benefit. Therefore, we describe the assumptions, origins, and methods for establishing validity evidence of the two primary types of synthetic validation techniques: (a) job component validity and (b) job requirements matrix. We then present the case for synthetic validation as the best approach for many situations and address the potential limitations of synthetic validation. We conclude by proposing the development of a comprehensive database to build prediction equations for use in synthetic validation of jobs across the U.S. economy and reviewing potential obstacles to the creation of such a database. We maintain that synthetic validation is a practically useful methodology that has great potential to advance the science and practice of industrial and organizational psychology.

Correspondence concerning this article should be addressed to Jeff W. Johnson. E-mail: [email protected]

Jeff W. Johnson, Personnel Decisions Research Institutes, 650 3rd Avenue S., Suite 1350, Minneapolis, MN 55402; Piers Steel, Haskayne School of Business, University of Calgary; Charles A. Scherbaum, Department of Psychology, Baruch College; Calvin C. Hoffman, Los Angeles County Sheriff's Department and Alliant University; P. Richard Jeanneret, Valtera Corporation; Jeff Foster, Hogan Assessment Systems.

When faced with the need to estimate the validity of a personnel selection procedure, organizations both large and small are often confronted with situations in which the use of traditional validation strategies is not feasible (Hoffman & McPhail, 1998; McPhail, 2007). These situations can arise from a small number of incumbents in a particular job, new jobs being added, rapidly changing jobs, too many jobs, or insufficient resources. When faced with these constraints, organizations can use validity generalization techniques to estimate the validity of the selection procedure.



A set of validity generalization techniques falling under the rubric of synthetic validation can be particularly useful. Synthetic validation is a logical process of inferring validity on the basis of the relationships between components of a job (i.e., clusters of similar tasks or work behaviors) and tests of the attributes that are needed to perform those components (Mossholder & Arvey, 1984). It can be applied in situations in which multiple jobs share a number of the same job components, such that relationships can be identified between predictors and job components across jobs with larger sample sizes than can be obtained within jobs.

Although synthetic validation has long been suggested as a practical and defensible approach to establishing validity evidence (Lawshe, 1952), there has been little coverage in the literature regarding how to use these techniques in applied settings. As a result, synthetic validation is infrequently used and not well understood by the practitioners and researchers it could most benefit. Guion (2006) recently concluded that synthetic validation "seems unwilling to die despite the lack of much evidence to support it after more than 40 years" (pp. 84–85).

Despite this lack of support, there has been a recent resurgence of interest in synthetic validation techniques (Hoffman, Rashkovsky, & D'Egidio, 2007; Jeanneret & Strong, 2003; Johnson, 2007; Scherbaum, 2005; Steel, Huffcutt, & Kammeyer-Mueller, 2006). Based on our collective experience in researching and applying synthetic validation techniques, we maintain that synthetic validation is a practically useful methodology that has great potential to advance the science and practice of industrial and organizational (I-O) psychology. Nevertheless, there is still a need for practical information about (a) the nature of synthetic validation and (b) the impact that widespread application of synthetic validation can have on the science and practice of I-O psychology. The objectives of this article are therefore to (a) provide descriptions of synthetic validation techniques and how they work, (b) present the case for synthetic validation as the best approach for many situations, (c) address the limitations of synthetic validation, and (d) suggest the creation of a centralized database to be used with synthetic validation models to build prediction equations for specific situations.

The remainder of this article is organized into four primary sections. First, we describe different types of synthetic validation techniques, including their assumptions, origins, and application. Second, we present arguments for making synthetic validation a more standard approach to conducting validation studies. Third, we discuss potential limitations of synthetic validation and why these limitations should not diminish our enthusiasm for more widespread application of these techniques. Finally, we propose the development of a comprehensive database to build prediction equations for use in synthetic validation of jobs across the U.S. economy. In this section, we also review potential obstacles to the creation of such a database.

Description of Synthetic Validation

All synthetic validation techniques are based on two general assumptions. First, when a job component is common across multiple jobs, the human attributes that facilitate performance on that component are similar across jobs. This means that a test measuring a particular attribute should be a valid predictor for any job that contains a component for which the attribute is required. For example, a measure of customer service orientation that has been found to be valid for predicting customer service performance should be a valid predictor for any job that includes customer service, as long as customer service is defined in the same way in each job.

The second assumption is that the inferences that can be made from a test that predicts performance on a job component are similar across jobs and situations.


Differences in the relationship between a test and performance on a job component across jobs are expected to be a result of statistical artifacts such as sampling error and unreliability in measurement. A corollary of this assumption is that statistical artifacts are assumed to be independent of situational moderators (e.g., criterion reliability is not related to the job). This corollary is necessary because this independence assumption has been challenged on solid conceptual grounds (James, Demaree, Mulaik, & Ladd, 1992; Russell & Gilliland, 1995), although the independence assumption has been found to be reasonable in two large-scale studies (Burke, Rupinski, Dunlap, & Davison, 1996). These general assumptions are similar to those made for validity generalization (Jeanneret, 1992), which is a concept that has received considerable research support (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999; Schmidt, Hunter, & Pearlman, 1982; Society for Industrial and Organizational Psychology, 2003).

Lawshe (1952) first introduced the term "synthetic validity" at a symposium on industrial psychology for small businesses. He argued that we need not empirically rediscover the validity coefficients for every job; rather, we can logically infer them by breaking jobs down into component parts, each with known test validities, and then combining these elemental validities back into a whole. This evidence could then be generalized to other jobs that contain those requirements. The process is analogous to how time study engineers determine how long a work task will take by breaking the task down into component parts, each with a known duration, and then adding these parts together to determine how long the entire task will take.

At the 1957 meeting of the Midwestern Psychological Association, Balma (1959) refined the concept of synthetic validity into its current conceptualization: inferring validity in a specific situation from a logical analysis of job components, determining the validity of predictors for those components, and combining the predictor-component validity estimates.

The concept of synthetic validity was further developed from contributions to the symposium by Ghiselli (1959), Primoff (1959), and McCormick (1959), who elaborated how breaking jobs into their components would allow for the development of validity evidence that is generalizable from job to job.

Two distinct streams of synthetic validation research flowed from this symposium. The first was termed job component validity by McCormick (1959), who used the term to describe a synthetic validation technique that indirectly links tests and job components by demonstrating that, across jobs, job incumbents' test scores or test validity coefficients are related to the importance of the attribute measured by the test, as determined by a standardized job analysis survey. The second type of synthetic validation has its roots in the J-coefficient approach introduced by Primoff (1957, 1959). In this approach, job analysis is used to identify the job components that are common across multiple jobs, and predictor measures are chosen to predict performance on those job components. The J-coefficient is a mathematical index describing the relationship between the predictor battery and job performance. Steel et al. (2006) termed this type of synthetic validation the job requirements matrix approach. These are two fundamentally different approaches to synthetic validation. They are similar in that both require a database linking predictors to job components and both allow a validity estimate to be calculated for a single job based solely on a job analysis of that job. They differ primarily in the nature of the required database and how synthetic validity is estimated. In the following sections, we describe each of these approaches and how they are applied.

Job Component Validity

Besides the general synthetic validation assumptions mentioned previously, job component validity is based on two additional premises:


1. Jobs can be reliably measured in terms of constructs (e.g., behaviors, skills, work context), whereby work across a range of occupational domains can be compared on equivalent dimensions (e.g., information processing and communications; McCormick, 1979).

2. Workers "gravitate" to and remain in jobs that are compatible with their attributes, that is, their aptitudes (i.e., skills and abilities) and personal characteristics (i.e., interests and personality; McCormick, DeNisi, & Shaw, 1979; Wilk, Desmarais, & Sackett, 1995). Note that this assumption is only required for mean-based job component validity and is not relevant for validity-based job component validity.

The first premise was confirmed by McCormick and his students (McCormick, Jeanneret, & Mecham, 1972) with the development of the Position Analysis Questionnaire (PAQ) and the identification of a consistent pattern of behavioral job dimensions (components) based on factor analyses of PAQ data for a wide range of jobs varying in both content and complexity (McCormick & Jeanneret, 1988). Subsequently, the PAQ job components became the foundation for the Generalized Work Activities of the Occupational Information Network (O*NET; Jeanneret, Borman, Kubisiak, & Hansen, 1999; Jeanneret & Strong, 2003).

The second premise has been confirmed by the research of McCormick et al. (1979) and McCormick, Mecham, and Jeanneret (1989), reported in Jeanneret (1992), Viswesvaran and Ones (2003), and Wilk and Sackett (1996). These studies showed that there is less variability in the means and standard deviations of incumbent test scores within job categories than across job categories. In addition, at least for cognitive measures, test score means increase as job complexity increases. With respect to occupational interests, Holland's model of interests has been linked to the job components of the PAQ (Hyland & Muchinsky, 1991; Rounds, Shubsachs, Dawis, & Lofquist, 1978).

The first step in the job component validity approach is establishing a database including information on hundreds of jobs. This information includes job characteristics obtained from a standardized job analysis instrument (e.g., PAQ and O*NET) and either test score means or validity coefficients for a standardized set of tests. The criterion for the validity coefficients is typically a measure of overall performance. The job analysis data are used to identify job components and the importance of those job components within each job. With jobs as the unit of analysis, multiple regression is used to establish the relationship between mean job component importance ratings as the predictors and either validity coefficients or mean test scores as criteria. (For selection purposes, validity coefficients are the more appropriate criterion.) A regression equation is derived for each test.

To estimate a validity coefficient within a single job, a job analysis is conducted for that job using the same instrument that was used to create the job component validity database. The regression equation that was established for predicting test validity coefficients is then applied to the mean job component importance ratings for the job of interest. This is repeated for each available test using the regression equation for each test. This produces predicted validity coefficients for each test for that job. A test battery can be constructed by choosing the tests with the highest predicted validity coefficients, taking into account test intercorrelations.
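To make these two steps concrete, the following sketch (in Python, with simulated data) fits one regression equation per test across jobs and then applies those equations to the component importance profile of a single job. The data, the number of components and tests, and the use of ordinary least squares are assumptions made purely for illustration; the sketch is not a reproduction of the PAQ or O*NET job component validity systems.

```python
import numpy as np

# Hypothetical job component validity database: one row per job.
# Columns of X are mean importance ratings for three illustrative job
# components; each column of Y is the observed validity coefficient of
# one test in that job (from earlier criterion-related studies).
rng = np.random.default_rng(0)
n_jobs, n_components, n_tests = 60, 3, 4
X = rng.uniform(1, 5, size=(n_jobs, n_components))         # importance ratings
B_true = rng.uniform(-0.05, 0.10, size=(n_components, n_tests))
Y = 0.15 + X @ B_true + rng.normal(0, 0.03, size=(n_jobs, n_tests))

# Step 1: with jobs as the unit of analysis, derive one regression
# equation per test (validity coefficient regressed on component importance).
X_design = np.column_stack([np.ones(n_jobs), X])
coefs, *_ = np.linalg.lstsq(X_design, Y, rcond=None)        # (1 + k) x n_tests

# Step 2: apply the equations to the component importance profile of a
# single job of interest, obtained from a job analysis alone.
new_job = np.array([4.5, 2.0, 3.5])                         # importance ratings
predicted_validity = np.concatenate(([1.0], new_job)) @ coefs

# Choose the tests with the highest predicted validities for the battery
# (in practice, test intercorrelations would also be considered).
battery = np.argsort(predicted_validity)[::-1][:2]
print("Predicted validities:", np.round(predicted_validity, 3))
print("Tests selected for battery:", battery)
```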

Historically, a problem with validity-based job component validity is that, compared to mean scores, only a small proportion of the variance in validity coefficients could be accounted for by job analysis ratings, likely because of the greater stability and lower degree of error associated with mean scores (Sackett, 1991). This has led to increased use of mean-based job component validity, which is a less direct way of supporting test use in a selection context.


In this approach, regression equations are used to predict mean test scores within a job, and tests with higher predicted means are considered more job relevant. This approach does not provide an estimated validity coefficient as the validity-based approach does, but it can at least provide some guidance for selecting predictors and setting cut scores.

Mean-based job component validity has a large record of successful use, with job analysis ratings predicting a high percentage of the variance in mean predictor scores across jobs (Jeanneret, 1992; Sparrow, 1989). Prediction tends to be strongest for cognitive measures, followed by perceptual and psychomotor tests (Jeanneret & Strong, 2003). More recently, personality test scores have been shown to be predictable from job analysis data (D'Egidio, 2001; Rashkovsky & Hoffman, 2005).

McCormick's original job component validity model has demonstrated consistency in developing estimates of validity coefficients and mean test scores for five decades. Job component validity studies encompass many different researchers, databases, predictors, criteria, and job components derived from the PAQ and more recently from the Generalized Work Activities of the O*NET. The two job component sources (the PAQ components and the Generalized Work Activities) have also been consistent in developing estimates of occupational literacy requirements for a wide spectrum of jobs, and these literacy indices have value for both selection and employment counseling (LaPolice, Carter, & Johnson, 2008).

Although the emphasis has turned away from validity-based job component validity, a revival of this method is in the works. Using the U.S. Department of Labor's General Aptitude Test Battery (GATB) database, which provides job ratings and incumbent predictor scores across hundreds of jobs, Steel et al. (2006) and Steel and Kammeyer-Mueller (2009) demonstrated how this approach could be made to work, first with O*NET job analyses and later with PAQ data. Recognizing that McCormick's early efforts were essentially a precursor to meta-analysis, they applied a recent advance in meta-analytic moderator research to the analysis: weighted least-squares multiple regression (Steel & Kammeyer-Mueller, 2002). Steel and Kammeyer-Mueller found that weighted least-squares regression explained 21% of the variance in the validity coefficients, whereas traditional ordinary least-squares analysis explained only 9%. After adjusting for sampling error, they found that validity-based job component validity accounted for almost all the explainable variance and was very precise compared to most local validation studies. This advance should lead to increased application of validity-based job component validity.
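A minimal sketch of the weighted least-squares idea follows, assuming, as is common in meta-analytic moderator regression, that each job's observed validity coefficient is weighted by its sample size. The data and variable names are invented for illustration and are not drawn from the Steel and Kammeyer-Mueller (2009) analyses.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: one row per job, with an observed validity coefficient
# (r), job component importance ratings, and the per-job sample size.
rng = np.random.default_rng(1)
n_jobs = 80
importance = rng.uniform(1, 5, size=(n_jobs, 2))          # two job components
n_per_job = rng.integers(30, 300, size=n_jobs)             # study sample sizes
r = 0.10 + importance @ np.array([0.04, 0.02]) + rng.normal(0, 0.05, n_jobs)

X = sm.add_constant(importance)

# Ordinary least squares treats every job's coefficient as equally precise.
ols_fit = sm.OLS(r, X).fit()

# Weighted least squares gives jobs with larger samples (less sampling
# error in r) more influence, as in meta-analytic moderator regression.
wls_fit = sm.WLS(r, X, weights=n_per_job).fit()

print("OLS R-squared:", round(ols_fit.rsquared, 3))
print("WLS R-squared:", round(wls_fit.rsquared, 3))
```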

Job Requirements Matrix

The job requirements matrix approach differs considerably from the job component validity approach. Rather than building a database linking job component importance ratings to overall test validity coefficients, the job requirements matrix approach involves building a database linking test scores to measures of job component performance. This means the job components that are identified in the job analysis must be translated into measures of performance (usually supervisor ratings). Test and performance data are collected for individuals (usually incumbents, although a predictive validity study could be conducted) to create a matrix of the relationships between the tests and the job components (Steel et al., 2006). Because of the assumption that the validity of inferences is not situationally specific, these relationships can be assessed by computing correlations between tests and job components across jobs (Johnson, 2007).
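The core data structure in this approach is therefore a test-by-component correlation matrix estimated from incumbents pooled across jobs. The sketch below illustrates that pooling on a hypothetical incumbent-level data set; the job names, measures, and use of simple Pearson correlations are assumptions for the example, and an operational study would also attend to issues such as range restriction and criterion unreliability.

```python
import numpy as np
import pandas as pd

# Hypothetical incumbent-level records pooled across several jobs: each row
# has scores on two tests and supervisor ratings on two job components.
rng = np.random.default_rng(2)
n = 500
df = pd.DataFrame({
    "job": rng.choice(["clerk", "dispatcher", "technician"], size=n),
    "test_numerical": rng.normal(size=n),
    "test_service": rng.normal(size=n),
})
df["comp_records"] = 0.4 * df["test_numerical"] + rng.normal(0, 0.9, n)
df["comp_customer"] = 0.4 * df["test_service"] + rng.normal(0, 0.9, n)

# Job requirements matrix: correlations between tests and job components,
# computed across incumbents from all jobs at once.
tests = ["test_numerical", "test_service"]
components = ["comp_records", "comp_customer"]
matrix = df[tests + components].corr().loc[tests, components]
print(matrix.round(2))
```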

Although criterion-related validity coefficients are typically thought of as the index describing the relationship between predictors and job components, a content validation strategy can also be applied.


Empirical correlations between predictors and job components require large sample sizes and performance ratings, so subject matter expert judgments may be collected instead (Scherbaum, 2005). For example, test experts or I-O psychologists could estimate the validity coefficients between predictors and job components (Schmidt, Hunter, Croll, & McKenzie, 1983). Research suggests that expert judgments provide an accurate estimate of these relationships, at least for cognitive ability tests and with experienced judges (Hirsh, Schmidt, & Hunter, 1986; Schmidt et al., 1983).

The other element that is necessary in the job requirements matrix approach is a vector defining the relationships between the job components and overall job performance (Steel et al., 2006; Trattner, 1982). This vector defines how the job component performance scores should be weighted and combined to arrive at a definition of overall performance. A typical source of job component weights is mean importance ratings from the job analysis, although alternative weighting strategies such as unit weights may perform just as well (Johnson & Carter, in press).

When a database linking predictors to job components and job components to overall performance is available, a test battery can be constructed and an overall validity coefficient computed within a job based on a job analysis of that job. Tests that predict performance on the important job components identified by the job analysis can be selected to create a test battery, and standard statistical formulas can be applied to the correlation matrix to compute an overall validity coefficient, weighting predictors and job components appropriately. Researchers have proposed many different procedures for combining the relationships between predictors and job components into a single validity index (Hamilton & Dickinson, 1987; Hollenbeck & Whitener, 1988; Johnson, 2007; Johnson & Carter, in press; Primoff, 1957, 1959; Steel et al., 2006; Trattner, 1982).
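One common form such a formula can take is the correlation between two weighted composites (the kind of expression attributed below to Nunnally and Bernstein, 1994); the notation here is ours. With predictor weights w_x, job component weights w_y, predictor intercorrelation matrix R_xx, component intercorrelation matrix R_yy, and the matrix of predictor-component validities R_xy, the estimated validity of the weighted battery against the weighted performance composite is

$$
r_{\text{battery}} \;=\;
\frac{\mathbf{w}_x^{\top}\,\mathbf{R}_{xy}\,\mathbf{w}_y}
{\sqrt{\mathbf{w}_x^{\top}\,\mathbf{R}_{xx}\,\mathbf{w}_x}\;
 \sqrt{\mathbf{w}_y^{\top}\,\mathbf{R}_{yy}\,\mathbf{w}_y}}.
$$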

Guion (1965) made an early attempt to demonstrate the application of the job requirements matrix approach to synthetic validation. In conducting a job analysis of the various jobs in a small organization, he found that the jobs in this organization contained differing combinations of seven job components. Supervisors provided performance ratings on each job component and on overall performance. He then administered a battery of predictors to the incumbents and identified the most important predictors for each job component using the differences in scores between successful and unsuccessful incumbents. He then computed the multiple correlations between the two most important predictors and the job component performance ratings. Using the empirical relationships, he developed expectancy tables indicating the likelihood of an employee receiving a superior rating on a job component for different score ranges on the predictors.

One of the more substantial synthetic validation efforts using the job requirements matrix approach is the U.S. Army's Synthetic Validity project (Peterson, Wise, Arabian, & Hoffman, 2001). In many respects, the process used in this project resembles Guion's (1965) approach. However, judgments of I-O psychologists served as the basis for estimating the relationships between the predictors and job components. These estimates were used as the basis for developing multiple methods of weighting synthetic validity equations predicting Soldier core technical proficiency and overall performance.

Johnson and Carter (in press) also used the job requirements matrix approach to synthetic validation, but unlike Peterson et al. (2001), they empirically derived job component validity coefficients instead of using expert judgments. They first conducted a job analysis to identify the job components, job families, and predictor constructs for a large number of jobs. They used this information to develop predictor measures and performance rating scales for the job components. After collecting predictor and criterion data, they identified the most relevant predictors for the job components based on a combination of empirical correlations and expert judgments.


Based on the job components that were important for a job family, they formed a predictor battery and computed the relationships between the predictor battery and a composite of the job component performance ratings using Nunnally and Bernstein's (1994) formula for the correlation between composites.

In addition to estimating synthetic validity coefficients, it is possible to use a similar formula presented by Sackett and Ellingson (1997) to compute standardized mean subgroup differences for the test battery. Further, Johnson, Carter, Davison, and Oliver (2001) demonstrated how the job requirements matrix approach to synthetic validation can be used to increase the power of differential prediction analyses. Thus, the job requirements matrix approach allows the estimation of validity, potential adverse impact, and differential prediction for a selection procedure.
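For the special case of an equally weighted composite of k standardized predictors with subgroup differences d_1, ..., d_k and average predictor intercorrelation r-bar, the Sackett and Ellingson (1997) result takes the form below (our notation; unequal weights require the more general version):

$$
d_{\text{composite}} \;=\; \frac{\sum_{i=1}^{k} d_i}{\sqrt{\,k + k(k-1)\,\bar{r}\,}}.
$$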

Arguments for Greater Application of Synthetic Validation

Back in 1965, Guion concluded that synthetic validity would allow employee testing to "move farther along the continuum from a technology to a genuine science" (p. 63). More recently, Shultz, Riggs, and Kottke (1999) echoed the same sentiment, that without synthetic validity, "we [selection experts] will continue to be viewed as mere technicians" (p. 282). Based on our experience in studying and applying synthetic validation, we believe that widespread application of synthetic validation is the next step for our profession. In this section, we present arguments supporting this notion. In the next section, we consider the limitations of synthetic validation and address difficulties associated with its application.

Mass Production With Customization

A primary source of synthetic validation's advantages is that it provides mass production while maintaining customization. Mass production means that a single system can be rolled out to market at large, like automobiles off an assembly line. Once a synthetic validation database is constructed, it can be applied to a wide variety of jobs. Customization means that quality is not sacrificed, as each selection system is still adjusted for any job of interest. A unique test battery can be constructed for any job for which all important job components are included in the database. In combination, this means low costs, easy implementation, and growing legal resilience, all while providing a high-quality system that constantly improves itself as more data are added.

Cost Savings

A synthetic validation strategy can result in significant cost savings, both during data collection and after a database has been established. Rather than attempting to obtain an adequate sample size within each of a number of jobs, the job requirements matrix approach allows the sampling plan to include smaller numbers within each job because the overall study sample size is what matters. Once a job component validity or job requirements matrix synthetic validity database has been established, future validation studies will have very low costs because a selection system can be created from a job analysis alone. Having a job analyst or subject matter expert complete a job analysis questionnaire is just one of several steps required to create a traditional selection system. Because synthetic validation reduces the process to this single step, it can be done much more cheaply than traditional local validation studies.

Large Samples Are Not Necessary to Find Evidence of Validity

Synthetic validation allows the calculation of stable validity estimates, even for jobs with small sample sizes. As long as a job component validity or job requirements matrix database is available, no incumbents are needed for a validation study. In many organizations, there are multiple jobs with a relatively small number of incumbents.


Even if jobs can be grouped into job families, sample sizes for a criterion-related validation study will often be too small for computing stable validity coefficients. This situation could be a result of small populations, the nature of the job families, limited financial resources, difficulty in obtaining complete predictor and criterion data, or a combination of those factors. For example, Johnson and Carter (in press) conducted a job requirements matrix study involving 1,946 incumbents in 11 job families, but sample sizes were still less than 60 in three of those job families. Because these job families shared numerous job components with other job families, however, they were able to compute stable validity coefficients.

Indeed, synthetic validity can be applied to jobs that do not even exist yet. As long as the job analysis questionnaire captures a large majority of the important tasks for a new job, a job analysis is all that is necessary to identify a test battery and document the validity of inferences that can be made from that test battery for that job. Because the relationships between tests and job components have been established in the original validation study, one simply needs to identify the important job components for any new job. Developing test batteries and computing validity coefficients based solely on a job analysis are substantial advantages for organizations in which jobs change quickly. This opens up the possibility of high-quality selection to the entire world of work, including small businesses that employ just one or two people for any job.

More Stable Validity Estimates

Synthetic validation should not be limited only to small sample situations. Even with a relatively large sample size of 200, the 95% confidence interval around a validity coefficient is approximately ±.14. If there are multiple jobs with sample sizes of 200, the job requirements matrix strategy could be applied to increase sample sizes by combining data across jobs, resulting in much more stable validity coefficients.

Johnson and Carter (in press) used this approach and found standard errors of validity coefficients that were an average of 2.2 times smaller than the original standard errors. Steel and Kammeyer-Mueller (2009) demonstrated that validity-based job component validity provides the same standard error as if the selection system were validated with approximately 500 employees, about an order of magnitude more employees than is typical in traditional criterion-related validation studies (Schmidt & Hunter, 1998). Achieving the same error of estimation that was found in their analyses for a verbal aptitude test and a clerical perception test would require local criterion validation samples of approximately 450 or 650 employees, respectively (Steel & Kammeyer-Mueller, 2009).
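The ±.14 figure cited above can be reproduced from the usual large-sample standard error, assuming a validity coefficient near zero and the Fisher z approximation:

$$
1.96 \times SE \;\approx\; \frac{1.96}{\sqrt{N-3}} \;=\; \frac{1.96}{\sqrt{197}} \;\approx\; .14 \qquad (N = 200).
$$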

A related advantage is that much larger sample sizes can be obtained for differential prediction analyses using synthetic validation (Johnson et al., 2001). Power to detect slope or intercept differences for different subgroups is typically very low, requiring sample sizes of 400 or greater (Aguinis & Stone-Romero, 1997). Johnson et al. (2001) demonstrated that the job requirements matrix approach can be applied to compute correlations between job component scores and the variables necessary for differential prediction analyses (i.e., predictors, criterion, a dummy-coded subgroup variable, and the cross products of the subgroup variable and the predictor scores). Combining data across jobs using synthetic differential prediction analysis makes it easier to have adequate power for these analyses and to conduct analyses on all subgroups of interest.
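For readers less familiar with the underlying analysis, differential prediction is conventionally tested with moderated multiple regression on exactly the variables listed above: the criterion is regressed on the predictor, the dummy-coded subgroup, and their cross product, with the subgroup term testing intercept differences and the cross product testing slope differences. The sketch below shows that standard test on invented data; it is not the synthetic differential prediction procedure of Johnson et al. (2001) itself, which assembles the needed correlations from job component data pooled across jobs.

```python
import numpy as np
import statsmodels.api as sm

# Invented data: predictor score, dummy-coded subgroup, and criterion.
rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=n)                       # predictor (e.g., test score)
g = rng.integers(0, 2, size=n)               # dummy-coded subgroup
y = 0.4 * x + 0.1 * g + rng.normal(0, 1, n)  # criterion (e.g., performance)

# Moderated multiple regression: intercept differences are tested by the
# subgroup term, slope differences by the predictor x subgroup cross product.
X = sm.add_constant(np.column_stack([x, g, x * g]))
fit = sm.OLS(y, X).fit()
print(fit.summary(xname=["const", "x", "group", "x_by_group"]))
```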

Synthetic Validity Estimates Are as Accurate as Traditional Validity Estimates

Research shows that synthetic validity estimates closely correspond to validity estimates from large-scale local validation studies and meta-analytic validity generalization estimates.


Hoffman and colleagues (Hoffman, Holden, & Gale, 2000; Hoffman & McPhail, 1998; Morris, Hoffman, & Schultz, 2003) have conducted several studies that compare job component validity estimates to more traditional validity estimates. For example, Hoffman and McPhail (1998) compared the validity generalization estimates for clerical jobs reported by Pearlman, Schmidt, and Hunter (1980) to job component validity estimates computed from a PAQ database of a large utility company. They found substantial correspondence between the validity estimates from these two approaches. The job component validity coefficients tended to be the same as or more conservative than the uncorrected validity generalization estimates. The correlation between the uncorrected validity generalization and job component validity estimates was .97. Morris et al. (2003) extended this research by comparing job component validity and validity generalization estimates for non-clerical jobs using a variety of commercial tests. They also found that the job component validity estimates were more conservative, but similar to the uncorrected validity generalization estimates. The majority of job component validity estimates fell within the 95% confidence interval around the validity generalization estimates.

Hoffman et al. (2000) combined job component validity, cluster analysis, and previous validity generalization results to establish a test battery and cut scores for jobs in a large utilities company for which a selection battery did not exist and too many job titles needed validity evidence. Using a database of PAQ ratings, they formed job families from a cluster analysis and computed job component validity estimates for the job families to compare to validity estimates from previous local validation studies. They found that the predicted validities from job component validity were conservative, but similar to the uncorrected local validity estimates.

Synthetic validity coefficients obtained using the job requirements matrix approach have also been found to be similar to more traditional validity coefficients. Peterson et al. (2001) examined multiple methods of weighting synthetic validity equations and compared the similarity between synthetic validity estimates and empirical validity estimates from Project A. They found that all the different weighting strategies worked well and there was little difference between the validity coefficients estimated from the synthetic validation study and empirical validity coefficients for core technical proficiency and overall performance.

Because of some relatively large sample sizes within job families, Johnson and Carter (in press) were able to compare the synthetic validity coefficients they computed for each job family to traditional validity coefficients computed within job families. They found that the synthetic validity coefficients were often very similar to the traditional validity coefficients and were always within the 90% confidence interval around the traditional estimates.

Synthetic Validation Forces a More Rigorous Approach to Personnel Selection

Synthetic validation should help to maximize validity and enhance legal defensibility because it forces the researcher to take a rigorous approach to designing and implementing a validation study. When using the job requirements matrix approach, it is necessary to use a multivariate conceptualization of performance and a construct-oriented approach to personnel selection, which is the best way to maximize validity (Schneider, Hough, & Dunnette, 1996). Because the approach uses job analysis information to link predictors to specific job components, the end result is necessarily related to important work outcomes, thereby ensuring that a selection procedure is job related.

In addition, developing a synthetic validity database that can be marketed to the public provides a powerful incentive for a more rigorous approach than might typically be taken. Because a public synthetic validity database can be broadly marketed, it justifies an attention to detail during its construction that most traditional selection systems do not receive (Terpstra & Rozell, 1997).


Small improvements in testing methodology (e.g., slight differences in item selection or administration), known as technical economies, can be profitably pursued with synthetic validity because the benefits accumulate across a large number of jobs while the costs are shared.

Potential Arguments Against Synthetic Validation

High Cost of Developing a Database

Despite growing acknowledgment of its feasibility, the biggest challenge in making synthetic validity widely available has always been and continues to be its cost (Murphy, 2009). To build a full-scale job component validity database that allows the estimation of validity for any job solely on the basis of a job analysis requires hundreds of criterion-related validation studies, each with its associated costs. Although some job component validity databases do exist (e.g., the GATB and the PAQ; the Hogan Personality Inventory), there is a need to establish job component validity for a wider variety of predictors.

At this time, no large publicly available database exists for applying the job requirements matrix approach. Absent such a database, this approach requires a local validation study. As noted previously, many cost savings can be realized by using the job requirements matrix approach rather than a traditional approach in conducting a local validation study, but synthetic validation requires a more rigorous study than what might otherwise be considered adequate. For example, synthetic validation can involve a great deal of development work to ensure that the job analysis instrument and the criterion measures are comprehensive and relevant for all jobs. First, the job analysis questionnaire must include all tasks that are potentially important for each job, probably resulting in a large number of items. If the questionnaire is administered online, however, it is relatively easy to use a branching format so that respondents can skip blocks of items that are obviously irrelevant to their job. Development time can be saved by using an existing worker-oriented job analysis instrument such as the PAQ or O*NET. Second, it is important to develop job performance rating forms that are applicable to many jobs. To compute correlations between predictors and criteria across jobs, it is necessary to use the same criterion measure for each job. Great care must be taken to write job component definitions and performance level descriptions that are general enough to apply to all jobs to which the job component is relevant. This usually involves focus groups with supervisors representing each job or job family, several rounds of revisions, and a pilot study with a subset of supervisors to ensure that the rating scales are relevant and appropriate for the jobs they represent.

Legal Risk

There is some legal risk associated with synthetic validation. To date, there are only two court cases associated with synthetic validation in a selection context, both involving the use of job component validity, and these cases were not definitive endorsements of the technique. Taylor v. James River (1989) was a U.S. district court case dealing with the validity of a test battery and structured interview implemented on the basis of a job analysis using the PAQ. The defendant was a paper company that created a 4-year apprenticeship program for training millwrights and pipefitters. As part of the research conducted to support the apprenticeship program, a consultant used a systematic process to collect job analysis data and used the results of a PAQ analysis to identify five cognitive ability tests. Based on testimony provided, the judge ruled that James River's selection process measured job-related abilities, complied with the standards of the profession of industrial psychology, and served legitimate business goals. The court ruled for the defendant in a summary judgment.¹

1. It is important to note that the judge's ruling relied on Wards Cove v. Atonio (1989), which was subsequently overturned by the Civil Rights Act (CRA) of 1991. According to the CRA, ". . . the decision of the Supreme Court in Wards Cove Packing Co. v. Atonio, 490 U.S. 642 (1989) has weakened the scope and effectiveness of Federal civil rights protections . . ." As a result of the Wards Cove decision, Congress amended the CRA of 1964. The CRA sought to ". . . codify the concepts of 'business necessity' and 'job related' enunciated by the Supreme Court in Griggs v. Duke Power Co., 401 U.S. 424 (1971), and in the other Supreme Court decisions prior to Wards Cove Packing Co. v. Atonio, 490 U.S. 642 (1989)." Given the reliance on the Wards Cove ruling in James River, it is not clear how this case would be decided if it were revisited.


McCoy v. Willamette Industries (2001) was another U.S. district court case that dealt with disparate treatment and disparate impact. In McCoy, a consultant conducted a job analysis including the PAQ and used this information to create job families. The consultant also used PAQ data to show that jobs at Willamette were substantially similar to other jobs in the paper products industry, and used the PAQ to apply the results of validation studies conducted elsewhere to Willamette Industries. The court ruled for the defendant in a summary judgment.

Careful reading of the rulings in Taylor v. James River and McCoy v. Willamette Industries leads to several conclusions. First, both courts found the PAQ to be a professionally accepted job analysis method, and neither court raised concerns regarding the adequacy of the job analysis efforts. Second, both courts discussed the topic of "validity generalization," but did so in the context of a transportability study (Gibson & Caplinger, 2007) rather than in the meta-analytic sense (Schmidt & Hunter, 1977). Both courts accepted the use of PAQ results in supporting inferences of test transportability. The Uniform Guidelines on Employee Selection Procedures section 16Y (Equal Employment Opportunity Commission, Civil Service Commission, Department of Labor, & Department of Justice, 1978) on transportability states that "a work behavior consists of one or more tasks."


Although some might read this statement as requiring work-based job analysis prior to conducting a transportability study, the PAQ is clearly a worker-oriented job analysis method (Gatewood, Feild, & Barrick, 2007). The rulings in these two cases demonstrate that worker-oriented job analysis can support inferences of test transportability. Finally, neither court differentiated between validity generalization (as in transportability), validity generalization (as in meta-analysis), and the generalization of validity using job component validity-based evidence. The legal status of job component validity will remain unclear until cases dealing with job component validity are heard at the appellate or Supreme Court level.

The legal defensibility of the job requirements matrix approach to synthetic validation has not yet been challenged in court, but there is good reason to believe that this approach would be received favorably. The Guidelines do not address synthetic validity directly, but Trattner (1982) argued that the operational definition of construct validity provided by the Guidelines can be interpreted as a description of a synthetic validity model. The Guidelines state:

. . . if a study pertains to a number of jobs having common critical or important work behaviors at a comparable level of complexity, and the evidence satisfies . . . criterion-related validity evidence for those jobs, the selection procedure may be used for all the jobs to which the study pertains (p. 38303).

According to Trattner (1982), this definition of construct validity means that a selection procedure can be used when work behaviors (i.e., job components) are important in any occupation within a class of occupations, as long as there is criterion-related validity evidence linking the procedure to the work behaviors for incumbents in that class. Trattner concluded that the synthetic validation approaches of Primoff (1959) and Guion (1965) were consistent with this interpretation of the Guidelines. The Johnson and Carter (in press) approach also appears to meet these requirements.

In the definition of construct validity, however, the Guidelines state that criterion-related validity evidence is required.


Therefore, we cannot use this definition to infer support for a synthetic validation study using expert judgments to link tests to job components. Of course, the Guidelines define an acceptable content validation strategy as one in which the content of a selection procedure is demonstrated to be representative of the important aspects of performance on the job. This is exactly what is done in a synthetic validation study when experts link the content of the selection procedure to job components, and the content included for a particular job is based on the important job components for that job. It is therefore likely that a synthetic validation study using expert judgments would meet the requirements of the Guidelines based on their definition of an acceptable content validity strategy.

Ultimately, the quality of the job analysis, the appropriateness of the procedures used, and the nature of the inferences made by the users will determine the defensibility of any synthetic validation procedure (Scherbaum, 2005). A well-designed synthetic validation study is likely to have a good chance of withstanding legal scrutiny, given its emphasis on job analysis and on linking predictors directly to criterion constructs. Nevertheless, there is always some legal risk associated with using an approach that does not yet have case law supporting its use, despite its clear scientific and professional support. In fact, the more prominent and applicable synthetic validation becomes, the more likely it is to attract legal attention. If synthetic validation withstands early legal challenges, the resulting precedents will help it withstand later ones; an established synthetic validity system thus provides a stable platform that grows stronger with each successful defense.

Failure to Find Discriminant Validity

Peterson et al. (2001) found that their synthetic validity equations lacked discriminant validity when an equation was applied to a different military occupational specialty (MOS) than the one for which it was developed. In other words, there were small mean differences between validity estimates obtained from MOS-specific equations and validity estimates obtained from other MOS equations. Johnson (2007) also noted a lack of discriminant validity in some synthetic validation research. The fact that job-specific equations do not seem to have a great advantage over any other equation probably reflects a combination of factors on the criterion and the predictor sides.

Criteria. Job performance criterion scores are usually obtained from ratings made by supervisors. These ratings are influenced to some extent by halo, or the tendency to give similar ratings to the same person on different performance dimensions (Cooper, 1981). In a meta-analysis, Viswesvaran, Schmidt, and Ones (2005) found that halo has a large effect on performance ratings. Even after removing the effects of halo, however, they still found a large general factor in job performance (i.e., employees really do tend to perform similarly on different performance dimensions). These findings indicate that (a) raters can have difficulty differentiating between different performance dimensions and (b) the determinants of performance on different dimensions tend to be similar. In light of this positive manifold in performance measures, can discriminant validity be achieved with synthetic validity equations?

We do not believe that halo error makes it impossible to find a benefit in job-specific synthetic validity equations. Although Viswesvaran et al.'s (2005) meta-analysis is a well-done study, it is still subject to the limitations of meta-analysis and makes a number of assumptions that are not universally accepted. For example, the multitude of performance dimensions included in all these studies were collapsed into nine categories.


This is often a necessary step in meta-analysis, but in the process of fitting these different performance dimensions into these categories, it is likely that highly correlated performance dimensions (within studies) would sometimes be placed into different categories, whereas performance dimensions with lower correlations (within studies) would sometimes be placed into the same category. The net result is mushy constructs that are difficult to interpret (Oswald & McCloy, 2003) and quite possibly have inflated intercorrelations. Viswesvaran et al. themselves concluded that "There is also a need to study narrower, more specific performance dimensions. The construct of job performance can be sliced in different ways, depending on the level of specificity desired in its measurement. Future research should explore whether the size of the general factor in ratings and the amount of halo differ depending on the specific set of performance dimensions studied" (p. 123). Thus, the generalizability of their findings to the narrower and more specific performance data that are collected in a synthetic validation study is unclear. In addition, the studies that serve as input into the meta-analysis include both high-quality studies (reducing halo error) and lower-quality studies (exacerbating halo error). Rater training and rating format can reduce halo error (Borman, 1979), and the studies in this meta-analysis surely differed in the amount and quality of rater training and the rating format used. Moreover, the studies surely varied in the rating context and the purposes for which the performance data were collected. Contexts with strong demand characteristics and administrative purposes are ones that serve to increase rater errors (DeCotiis & Petit, 1978; Ilgen & Feldman, 1983; Murphy, 2008). Given the sole use of published field studies in this meta-analysis, the majority of the studies likely represent these contexts and purposes.

Viswesvaran et al.'s (2005) results should be interpreted as an assessment of what has typically existed, not what can ideally be achieved. During the creation of a synthetic validity database, every effort should be made to ensure that performance appraisals are as accurate as possible because the results will be extended to many other jobs.

Those using synthetic validity approaches need to give considerable thought to the many factors that can impact performance ratings. Motivated raters who are appropriately trained and who are using a well-designed performance appraisal measure will demonstrate less of a halo effect. Conversely, broad, abstract, and vague performance dimensions will almost certainly draw upon a general performance factor during assessment (Steel et al., 2006).

Furthermore, Dudley, Orvis, Lebiecki, and Cortina (2006) have already demonstrated empirically that validity differs for a general performance dimension versus specific performance dimensions. Still, Viswesvaran et al. (2005) do indicate that raters have difficulty making fine differentiations among performance dimensions, with rapidly diminishing returns as measures get more specific. Fortunately, the fewer the performance dimensions needed to model the work domain, the easier it will be to complete the job requirements matrix via synthetic validation. As long as a few performance dimensions can be reliably separated, we could see discriminant validity in synthetic validity equations.

Predictors. On the predictor side, a large literature has developed on the incremental validity of measures of specific abilities over general mental ability in predicting job performance, with the conclusion that specific abilities contribute only trivially (Brown, Le, & Schmidt, 2006; Ree & Earles, 1991; Ree, Earles, & Teachout, 1994). This research also shows that the validity of specific ability measures depends entirely on the extent to which they contribute to general mental ability. If all specific ability measures are simply indicators of the same underlying construct, can we find discriminant validity with synthetic validity equations?

There are three issues here. First, synthetic validity is not dependent on the package of predictors we use, and we are free to explore tests of many different constructs as well as decompositions of them.


Previous synthetic validity studies have focused primarily on cognitive ability predictors, but the underlying techniques do not assume any particular type of predictor. Second, the bandwidth–fidelity debate indicates that such an exploration is theoretically supported, as a broad criterion should be matched with a broad predictor and a narrow criterion should be matched with a narrow predictor (Schneider et al., 1996; Smith, 1976). In particular, decompositions of conscientiousness have been shown to predict specific performance dimensions better than overall performance (Connelly & Ones, 2007; Dudley et al., 2006). That general mental ability does not follow this trend is an interesting exception. Third, even if the gain from decomposition is modest, these are exactly the types of advantages that can be accrued through technical economies by using synthetic validation. That is, although gains from more precise or better methodology might not make financial sense for any individual selection site, a small gain that is consistently applied to thousands of sites is far more attractive.

Another potential cause of the lack of discriminant validity in synthetic validity equations is the idea that there is very little variance in validity for cognitive measures across jobs that are very different, as long as the general level of complexity is similar (cf. Schmidt, Hunter, & Pearlman, 1981). This seems to argue against the notion that validity can be built up through the specific job components that are relevant to a given job. This issue was closely re-examined by Steel and Kammeyer-Mueller (2009). The belief that the general level of job complexity sufficiently moderates general mental ability validity coefficients is a common one, adopted by 81% of researchers and repeated in a variety of prominent personnel selection reviews. Most of this sentiment cites a technical paper by Hunter (1983), in which he was exploring aspects of synthetic validity. As he stated within that report regarding job complexity, "It is clear in this table that validity is not consistently ordered for any of the three [general mental ability] composites. On the other hand, it would appear that validity follows the same pattern across categories for each of the aptitude composites. Therefore, categories were reordered to reflect this fact" (p. 15). Though this post hoc reordering is rarely mentioned in subsequent references to his work, it helps to explain why several more recent replications have failed to find that general mental ability and complexity are consistently related (Hulsheger, Maier, & Stumpp, 2007; Verive & McDaniel, 1996). Re-examining an extended version of Hunter's original database, Steel and Kammeyer-Mueller found that general complexity typically does not predict validity coefficients and that Hunter's post hoc reordered version predicts modestly at best. As Steel and Kammeyer-Mueller concluded, "The results do clearly suggest that complexity can moderate validity coefficients, but complexity requires the specificity of the PAQ or O*NET measures of job complexity to achieve statistically significant levels of prediction" (p. 548).

Conclusion. Current research has not demonstrated that job-specific synthetic validity equations provide significantly better prediction than do equations derived for other jobs, but very few studies have examined this issue to date. There are arguments on the criterion side and the predictor side for why we should not expect to find discriminant validity, but we remain optimistic that discriminant validity will be found in many cases. Expanding the predictor space to include a wider variety of predictors than has typically been studied in synthetic validation research (e.g., personality, biodata, situational judgment) and expanding the criterion space to include performance constructs other than proficiency (e.g., citizenship performance, adaptive performance) should improve our chances. In addition, paying careful attention to the quality of measurement and rater training is a worthwhile investment in building synthetic validity databases.

It is worth noting that these potential limitations to synthetic validation on the predictor and criterion side also apply to traditional validation studies that use a construct-oriented approach. One could argue that you could just use a measure of general mental ability as your predictor and use a validity generalization argument to support it, because you would probably get pretty good prediction for most jobs (Schmidt & Hunter, 1998). Then all this synthetic validation work would be unnecessary. Unfortunately, selection these days is a bit more complicated than that. We need more or different predictors than just general mental ability to deal with an expanded criterion space, adverse impact, and applicant reactions. Thus, a construct-oriented approach to building a selection system using synthetic validation should generate considerable added value beyond a simple measure of general mental ability.

Need to Identify and Validate Predictors With Each New Study

Finally, a limitation of synthetic validation at this time is the need to identify and validate predictors of job components with each new synthetic validation study. There is some transportability of the findings if a similar test is shown to tap the same construct. Still, the number of predictors presently available cannot all be realistically incorporated in a focused attempt to create a widespread synthetic validation system. Consequently, there is very little accumulated research on specific predictor measures because most published synthetic validation studies have included predictor measures that are not commercially available (with the notable exception of the GATB; Scherbaum, 2005). In the next section, we propose a solution to this problem.

Conclusion

Considering the advantages and potential disadvantages, we see little reason why a job requirements matrix synthetic validation strategy would not be used in any situation in which it is feasible. As long as there are multiple jobs that share some job components, we recommend computing correlations between predictors and job components across jobs and using synthetic validation techniques to estimate validity coefficients. This approach results in larger sample sizes with smaller standard errors, provides more power for differential prediction analyses, and makes it easier to transport the validity of the selection procedure to new jobs not included in the validation study. In addition, the steps involved in conducting a well-designed synthetic validation study ensure that the selection procedure is job related.
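
To make the job requirements matrix logic concrete, the following is a minimal sketch (in Python) of how pooled predictor–component correlations might be combined into a validity estimate for a particular job. The component labels, correlation values, and intercorrelation matrix are illustrative assumptions rather than results from any study; the formula is simply the correlation of a predictor with a weighted composite of standardized job-component criteria.

```python
import numpy as np

# Illustrative pooled correlations between one predictor and three job
# components, cumulated across all jobs that share each component.
r_xc = np.array([0.38, 0.22, 0.15])

# Illustrative intercorrelations among the job-component criteria.
R_cc = np.array([[1.00, 0.45, 0.30],
                 [0.45, 1.00, 0.40],
                 [0.30, 0.40, 1.00]])

def synthetic_r(weights):
    """Correlation of the predictor with a weighted composite of job
    components; a weight of zero means the component is irrelevant
    to the target job (weights would come from the job analysis)."""
    w = np.asarray(weights, dtype=float)
    return float(w @ r_xc / np.sqrt(w @ R_cc @ w))

# Job A involves only component 1; Job B involves components 1 and 2.
print(synthetic_r([1, 0, 0]))   # ~0.38
print(synthetic_r([1, 1, 0]))   # ~0.35, estimate for the composite criterion
```

In practice, the weights would be derived from the job analysis and the pooled correlations from the cumulated database described in the next section.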

Building a Synthetic Validation Database

The ultimate goal of synthetic validation research should be the development of a database to be used with synthetic validation models to build prediction equations for specific situations (Hough, 2001; Hough & Ones, 2001; Johnson, 2007). Hough concluded:

The new world of work, with its changing prediction needs—from prediction of global performance for hiring and promotion decisions to more precise placement decisions for project staffing—requires that I-O psychologists change their research approach. What is needed is a database that can be used with synthetic validation models to build prediction equations for specific situations. (p. 37)

We agree. For the selection field to advance significantly, it needs synthetic validity. Still, nearly a decade later, we are without such a database. This kind of database would allow us to use synthetic validation techniques to estimate the validity of a battery of predictors for any job that includes job components on which research is available. When this database has been developed to a significant extent, practitioners will be able to buy or develop measures of predictor constructs that have been shown to predict performance on job components relevant to any job of interest and to calculate a validity coefficient for that job. The database will also advance science by greatly increasing our knowledge base with respect to relationships between different predictor and criterion variables. Database development will lead to much quicker accumulation of this information than would otherwise occur. This will not only enhance the science of personnel selection but will also be a rich source of data for other types of research (e.g., to create a meta-analytic correlation matrix as input to a structural equation model; Viswesvaran & Ones, 1995). In this section, we describe how we envision this database could be developed and the potential obstacles to its development.

To maximize our chance of building a comprehensive synthetic validation database, we advocate pursuing a hybrid approach, simultaneously using both job component validity and job requirements matrix methodologies. Although either approach should work alone, each can improve the other's operation. Although job component validity is traditionally used with overall performance validity coefficients, it can also be used to predict validity coefficients for individual performance elements.

There are two approaches that could be taken to develop a large synthetic validity database that can be applied to hundreds of jobs. The first approach is to identify or develop a set of tests and a set of job components and design a large-scale study in which tests are administered to, and performance ratings gathered on, incumbents in a large number of jobs in a variety of organizations. This would be similar to the approach taken to populate the O*NET database, where job analysis ratings were collected from incumbents all over the country in a large number of occupations (Peterson, Mumford, Levin, Green, & Waksberg, 1999). It would be a logical next step for the Department of Labor to sponsor this type of project, linking the O*NET skill, ability, knowledge, and work style taxonomies to the generalized work activity taxonomy. Although similar government projects have been undertaken, especially to combat high unemployment (Primoff & Fine, 1988), we see this approach as ideal but impractical at this time. It would require considerable political will to approve the costs associated with a project of this magnitude, and there are a limited number of alternative commercial partners capable of making the investment. The U.S. military is a potential sponsor of this type of project, but there are enough differences between the military and civilian sectors that generalizability would be a legitimate concern.

The second approach is to conduct primary studies that report relationships between predictor constructs and job components and then use meta-analysis to cumulate the results of those studies. This is a more practical strategy in the near term because thousands of criterion-related validation studies have been conducted that could potentially serve as input to the database, and future validation studies can be designed with contributing to the database in mind. To underscore its feasibility, Meyer, Dalal, and Bonaccio (2009) effectively did exactly this to determine how situational strength moderates the conscientiousness–performance relationship. In this section, we assume the database will be populated with the results of past and future criterion-related validation studies.
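
To illustrate the cumulation step, the sketch below applies a bare-bones, sample-size-weighted meta-analysis to a handful of hypothetical predictor–job-component correlations. The study values are invented for illustration, and no corrections for unreliability or range restriction are applied.

```python
import numpy as np

# Hypothetical primary-study results for one predictor/job-component pair:
# observed correlations and their sample sizes.
rs = np.array([0.28, 0.35, 0.19, 0.41, 0.30])
ns = np.array([120,   85,  200,   60,  150])

# Sample-size-weighted mean correlation.
r_bar = np.sum(ns * rs) / np.sum(ns)

# Observed variance of correlations and expected sampling-error variance.
var_obs = np.sum(ns * (rs - r_bar) ** 2) / np.sum(ns)
var_err = (1 - r_bar ** 2) ** 2 / (ns.mean() - 1)

# Residual (between-study) variance suggests whether moderators remain.
var_res = max(var_obs - var_err, 0.0)
print(round(r_bar, 3), round(var_res, 4))
```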

To develop a database that supports both job component validity and job requirements matrix approaches, three types of data are necessary: (a) performance ratings for each participating incumbent, (b) scores from a predictor battery for each participating incumbent, and (c) job analysis information for each incumbent's job. The performance ratings and predictor battery scores are used to compute validity coefficients estimating the relationship between predictors and job components. If these validity coefficients differ across jobs after accounting for artifactual sources of variation, the variance can be explained with a moderator search using job analysis information. To maximize the efficacy of this database, there are challenges to be addressed in gathering each type of data. Steel et al. (2006) and Johnson (2007) outline these steps, which we review and update here.
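
As a rough illustration of what a single contributed study might look like when organized around these three data types, consider the following sketch; the field names and level of detail are our assumptions rather than a proposed standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class IncumbentRecord:
    predictor_scores: Dict[str, float]    # (b) predictor battery scores
    component_ratings: Dict[str, float]   # (a) performance ratings per job component

@dataclass
class StudyContribution:
    job_title: str
    job_analysis: Dict[str, float]        # (c) component importance and moderator ratings
    incumbents: List[IncumbentRecord] = field(default_factory=list)

# From such records, predictor-by-component validity coefficients can be
# computed within each job and later cumulated meta-analytically.
```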

Performance Ratings

The job requirements matrix approach requires that the performance constructs measured adequately cover the world of work. Many performance taxonomies have been proposed, generally represented by three categories: (a) task, (b) citizenship, and (c) counterproductive performance (Rotundo & Sackett, 2002). Adaptive performance is sometimes considered a fourth unique performance category (Hesketh & Neal, 1999; Johnson, 2003). Task performance is the hardest category to specify, as it can vary the most across different jobs. To measure task performance at a general level, O*NET represents task behaviors with 42 Generalized Work Activities, each of which can potentially represent a job component to be evaluated. It is possible, however, that the Generalized Work Activities are at too specific a level for the initial stages of this project. The appropriate level of specificity can be theorized about, but ultimately "this is fundamentally an issue that can be resolved only through empirical research" (Burke & Pearlman, 1988, p. 105). Too many job components lead to redundancy, where many of them prove to be functionally equivalent during performance appraisal and could be collapsed. Supervisors have some capacity to make practical distinctions among performance dimensions (Dudley et al., 2006), but under typical conditions this capacity may be quite limited (Viswesvaran et al., 2005). Fortunately, the smaller the number of these job components, the easier it will be to build the database. More general job components, however, mean wider credibility intervals for the resulting validity coefficients. We envision starting out with broad performance categories, moving to more specific categories as more data are accumulated, and letting the data determine when sufficient specificity has been reached. Consequently, an initial step is to identify a taxonomy of job components (i.e., performance dimensions) to use as an organizing structure for the database.

Analysis of the O*NET database indicates that almost 80% of the variance among Generalized Work Activities is accounted for by three factors (People, Data, and Things; Gibson, Harvey, & Quintela, 2004). O*NET categorizes the 42 Generalized Work Activities into nine broader categories, which may be easier to work with. Other estimates of the number of broad performance categories are similarly low (e.g., eight identifiable categories; Bartram, 2005; Campbell, 1990). Johnson (2003) proposed a multilevel taxonomy of performance dimensions to be used for building a synthetic validity database. The taxonomy had three components at the highest level: (a) task performance, (b) citizenship performance, and (c) adaptive performance. Level 2 defined these components with 10 dimensions from Campbell (1990), Borman et al. (2001), and Pulakos, Arad, Donovan, and Plamondon (2000). Johnson recommended that Level 2 dimensions be used for cumulating results across studies for meta-analyses, at least until sufficient data have been gathered to conduct meta-analyses at Level 3. This taxonomy may be a useful starting point for identifying the job components to be included in the synthetic validity database.

We expect that most performance ratings would be collected at a specific level and then classified into broader categories when building the database. For example, "analyzing data or information" may be expressed as "conducting field research" for sociologists, as "analyze geological research data" for geologists, or as "evaluate costs of engineering projects" for nuclear engineers. There could be scores of specific performance scales representing a particular job component, but only a few would be used in any one job, and in the end they all represent their superordinate job component. It is important, however, to avoid collapsing scales into constructs that clearly do not fit. This is common in many meta-analyses that sacrifice construct interpretability for the sake of greater statistical power (Oswald & McCloy, 2003). We advocate being "splitters" rather than "lumpers" when cumulating results across studies to ensure the interpretability of job components.
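
The kind of crosswalk implied here might be represented as follows; the scale names are taken from the examples above, and the structure is an illustrative assumption rather than a proposed format.

```python
# Illustrative crosswalk from job-specific performance scales to the
# superordinate job component used when cumulating results.
scale_to_component = {
    "conducting field research (sociologist)": "analyzing_data_or_information",
    "analyze geological research data (geologist)": "analyzing_data_or_information",
    "evaluate costs of engineering projects (nuclear engineer)": "analyzing_data_or_information",
}

def component_for(scale_name: str) -> str:
    # Raise rather than guess when a scale has no agreed-upon home,
    # consistent with being a "splitter" rather than a "lumper."
    return scale_to_component[scale_name]
```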

Predictors

In the local validation studies that will populate the database, each incumbent for whom performance rating data have been collected will need to complete at least one predictor measure. These predictors could assess (a) abilities and aptitudes; (b) physical, psychomotor, and perceptual skills; (c) personality and interests; or (d) demographic or biographical data. We are free to be inclusive with regard to what constructs to assess, but a significant issue is how these potential predictors should be operationalized. If we lock on to any single test to measure a particular construct, we would need to justify why we did not consider many equally meritorious alternatives. In many cases, it could be considered an arbitrary and divisive choice. Consequently, we make no judgments in this regard and suggest that all tests be welcomed. This will delay the development of the database somewhat, as our efforts will not be concentrated on specific tests. We suggest beginning the development of the database by sorting tests into constructs on the basis of expert opinion. This will inject more variance into the results and widen credibility intervals, but at least we will be able to draw conclusions about the level of validity different constructs have for predicting certain job components. Eventually, enough data will be accumulated to conduct moderator analyses and draw conclusions about specific tests. Ultimately, the process will reflect market forces, with synthetic validity evidence accumulating more quickly for more popular tests than for those less popular.

We recommend that demographic information and biodata measures be reported as often as possible. The strength of validity coefficients can be moderated by individual variables, especially experience. For example, the correlation between general mental ability and performance tends to increase with experience (Hulin, Henry, & Noon, 1990; Hunter & Schmidt, 1996). Using experience as a moderator will help the job component validity approach by improving the prediction of validity coefficients.

Job Analysis

The final component is the job analysis information, which should come from an established, widely available job analysis instrument. To enable the job requirements matrix approach, the job analysis should provide a way of creating job-specific performance appraisal dimensions connected to job components. Combined with the results from the test battery, this would give us all that is necessary to create the requisite validity coefficients for the synthetic validity database. Therefore, the job analysis instrument must establish which job components are important for the target job and translate them into usable performance appraisal dimensions. This is best provided by an O*NET-style job analysis, one that explicitly includes job components (i.e., Generalized Work Activities). The O*NET would have to be supplemented with items measuring other job components from our performance taxonomy, however, because O*NET Generalized Work Activities are not currently adequate measures of citizenship and adaptive performance dimensions.

To enable job component validity, job analysis data must also provide moderator information for these validity coefficients, such as whether the job involves routine or repetitive tasks or whether it operates in a demanding, hazardous environment. Research indicates that the PAQ, which was specifically designed to be used in this way (McCormick et al., 1972), provides better moderating information than O*NET (i.e., the PAQ accounts for more variance in validity coefficients; Steel & Kammeyer-Mueller, 2009). Consequently, a combination of elements from O*NET and the PAQ is needed, as the former is the best approach for the job requirements matrix, whereas the latter is the best approach for job component validity. Steel and Kammeyer-Mueller (2009) identified the PAQ dimensions that are most predictive of validity coefficients. If the O*NET job analysis survey were to be used for building the database, we recommend modifying it to ensure that PAQ dimensions or their equivalent are adequately assessed.
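
The job component validity side of this plan can be sketched as a simple prediction equation: observed validity coefficients from jobs with local studies are regressed on job analysis dimension scores, and the resulting equation estimates the validity coefficient for a new job from its job analysis profile. The dimensions, values, and ordinary least squares estimator below are illustrative assumptions; operational models such as the PAQ-based procedures are considerably more elaborate.

```python
import numpy as np

# Rows = jobs with local criterion-related studies.
# Columns = job analysis dimension scores (e.g., information-processing
# demands, decision making, physical demands); values are illustrative.
X = np.array([[0.9, 0.7, 0.1],
              [0.4, 0.5, 0.8],
              [0.7, 0.6, 0.3],
              [0.2, 0.3, 0.9],
              [0.8, 0.8, 0.2]])
r = np.array([0.42, 0.21, 0.35, 0.15, 0.40])   # observed validity coefficients

# Fit the prediction equation (with intercept) by ordinary least squares.
X1 = np.column_stack([np.ones(len(r)), X])
beta, *_ = np.linalg.lstsq(X1, r, rcond=None)

# Predict the validity coefficient for a new job from its job analysis profile.
new_job = np.array([1.0, 0.6, 0.6, 0.4])       # leading 1.0 is the intercept term
print(float(new_job @ beta))
```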

Building the Database

These three components—the performance ratings, the predictors, and the job analysis—must be gathered across a wide variety of jobs. The exact number of jobs depends on the number of job components used and the degree of redundancy during data gathering, but approximately 300 jobs is not unrealistic based on 8–10 job components assessed at three levels of complexity (i.e., high, medium, and low). A portion of this number is already available, merely needing to be meta-analytically coded and incorporated (Meyer et al., 2009). As for the remaining studies, they can be obtained while gathering data from traditional criterion-related validation studies. Because the requisite validity coefficients are themselves valuable, the creation of synthetic validity can "piggyback" on what we already profitably do: create selection systems. The only added cost is minimal (i.e., submitting the obtained results to a central database).

As this data gathering progresses and the job database grows, an additional advantage is that these criterion-validated jobs can benefit from Bayesian estimation (Brannick, 2001). Steel and Kammeyer-Mueller (2008) note that selection systems can be validated with only a fraction of the employees currently required by using previous validity generalization studies and Bayesian estimation. By having a database of validity coefficients, we can form them into a distribution of population scores calculated without sampling error. New criterion-related validation studies can use this distribution to substantially improve their accuracy, which is a significant enhancement. For example, with Bayesian estimation, a previously low or nonsignificant validity coefficient estimate may become both significant and large. Alternatively, Bayesian estimation can potentially provide the same level of accuracy with 30 participants that previously required 300. This benefit also encourages some criterion-related validation to occur perpetually, providing data that would enable ongoing improvement of the synthetic validity system even after it is built. It would allow experimentation with new test items (to increase the size of the validity coefficients) and job analysis questions (to better predict these validity coefficients). This would also allow the system to remain current if and when new types of predictors are developed.
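
As a minimal sketch of the kind of calculation involved, the example below combines a small local study with a prior distribution of population validities taken from the database, using simple precision weighting in the spirit of the empirical Bayes procedures cited above. All numbers, and the normal approximation itself, are illustrative assumptions.

```python
import numpy as np

# Prior distribution of population validities from the database
# (illustrative): mean and between-job standard deviation.
prior_mean, prior_sd = 0.30, 0.08

# Small local study: observed validity and its approximate sampling error.
n_local, r_local = 30, 0.18
se_local = (1 - r_local ** 2) / np.sqrt(n_local - 1)

# Precision-weighted (empirical Bayes) posterior estimate.
w_prior, w_local = 1 / prior_sd ** 2, 1 / se_local ** 2
posterior_mean = (w_prior * prior_mean + w_local * r_local) / (w_prior + w_local)
posterior_sd = np.sqrt(1 / (w_prior + w_local))
print(round(posterior_mean, 3), round(posterior_sd, 3))   # ~0.28, ~0.07
```

The local estimate is pulled toward the database distribution and its uncertainty shrinks well below what 30 participants alone could provide, which is the practical point of the argument above.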

Once the synthetic validity system is fully operational, new selection systems will be significantly easier to create than with a traditional validation approach. It would take approximately 1–2 hours in total; employers or trained job analysts would simply describe the target job using the job analysis questionnaire. After this point, the synthetic validity algorithms take over and automatically generate a ready-made full selection system, more accurately than can be achieved with most traditional criterion-related validation studies. For example, Steel and Kammeyer-Mueller (2009) found that a local validation study would require approximately 500 employees to match the standard error of their job component validity and synthetic validity system; the synthetic validity system proposed here, being a combination of both job requirements matrix and job component validity, should work even better (i.e., have smaller standard errors).
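
To give a sense of the benchmark implied by that comparison, the short calculation below shows the approximate sampling standard error of a locally estimated validity coefficient with 500 incumbents; the assumed validity of .30 is illustrative, not a value taken from Steel and Kammeyer-Mueller's data.

```python
import numpy as np

r, n = 0.30, 500                      # assumed validity and local sample size
se = (1 - r ** 2) / np.sqrt(n - 1)    # approximate sampling SE of r
print(round(se, 3))                   # ~0.041, the precision a synthetic
                                      # system would need to match or beat
```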

Potential Obstacles

Although such a database is technically and financially practical, there are a number of challenges to developing a large, public synthetic validity database. The first is that it is a disruptive technology, an instance of what economists describe as creative destruction. A fully functional database would eventually either eliminate one of the major functions of other test publishers and consultants (i.e., test validation) or force them to follow suit to remain competitive (e.g., by developing their own proprietary synthetic validity databases or by developing tests that measure constructs included in the public database). Such a major upheaval may not be welcomed by established players in the industry.

A second major challenge is the limitations of meta-analysis, which have been well chronicled (Bobko & Roth, 2003; Burke & Landis, 2003; Oswald & McCloy, 2003; Sackett, 2003; Sackett, Schmitt, Tenopyr, Kehoe, & Zedeck, 1985; Schmidt, Hunter, Pearlman, & Rothstein, 1985). For example, construct confusion resulting from combining different measures of purportedly the same construct (i.e., the commensurability problem), low power to detect moderators, dependent samples, the need to make assumptions about artifact distributions, missing data, and sampling bias are all issues that can influence the accuracy of meta-analytic estimates. These limitations are serious, but they will have less influence on results if decisions are made conservatively with these limitations in mind. Furthermore, many of these issues will resolve themselves as the synthetic validity database grows. In particular, commensurability problems will not occur for any test that generates enough instances to justify its own meta-analysis (Steel, Schmidt, & Schultz, 2008).

This brings up another challenge, which is the need for a sophisticated and credible partner to maintain this initiative. There are many administrative details in the process of building a large public synthetic validity database, such as organizing the sponsors, publicizing the effort, and obtaining funding. SIOP seems to us to be an ideal partner. It is a warehouse of the skills needed to champion this effort and create the database: There is no shortage of SIOP members with expertise in job analysis, performance appraisal, test validation, and meta-analysis. SIOP can be the neutral party needed to securely collect validity coefficients from hundreds of jobs and scores of tests. Finally, SIOP could provide the legitimacy needed to attract sufficient numbers of independent and reputable participants to fill the database. For decades, it has been understood that realizing synthetic validity for a large number of jobs would require a consortium of talents and resources. SIOP can provide the roof under which to assemble and publicly create this system. Notably, SIOP would not be the first institution to engage in such an effort. Consider the Cochrane Collaboration, which provides the same sort of public service, albeit in a far more ambitious manner. Comprising over 10,000 volunteers from nearly 100 countries, the Cochrane Collaboration creates systematic meta-analytic reviews of healthcare interventions. These reviews and the underlying databases are available through the Cochrane Library and represent a major step toward the healthcare field's own advancement: evidence-based medicine.

A final challenge is motivating those with validation study data to share those data for the purpose of building the database. Test publishers or organizations that have paid for their own test development may be unwilling to share proprietary information or to negotiate with their clients for permission to release this information. However, without these data the database would grow at a very slow pace and would likely consist of an unrepresentative sample of validation studies (e.g., those that were published). Similarly, test publishers might be motivated to share results only when validity coefficients are relatively large. One possible way to encourage test publishers to share complete data is to include them as partners in the project. The ability to compute synthetic validity coefficients for their tests for hundreds of jobs should be a powerful incentive to participate, especially if partners in the project are able to access the database at no charge. One way of funding the project is to charge a fee for accessing the database, with larger fees when the use is for profit and smaller fees when the use is for research or for nonprofit organizations. The fee could be lowered or waived depending on the contributions organizations make to the database.

Conclusion

Embracing standardization has been a logical step in the maturation of every major field, from cars to clothing to screwdrivers (Clarke, 2005; Green, 1997; Rybczynski, 2000). Steam boilers, for example, provided the horsepower needed to run the world's industry for well over a century. Immensely useful, they were equally dangerous, with the potential to level entire buildings if they exploded. After decades of disasters caused by handcrafting unique boilers, the industry started to standardize their construction. They found designs and materials that reliably worked—boilerplate—and started to use them consistently. Selection is no different. There are still several details to hammer out, especially the number of job components to be used and possibly modifying O*NET to include aspects gleaned from PAQ-based synthetic validity successes. We touched on them here, but it is beyond the scope of this article to explicate every issue (see Steel et al. [2006] for more details). However, these issues are simply technical; they are laborious to address but ultimately manageable. There are no theoretical obstacles left. As McCormick (1959) concluded over 50 years ago, "I feel very strongly that the game is worth the candle, and that by the application of systematic research, probably including some false starts, we as a profession can ultimately meet this challenge" (p. 412). Indeed, he was right: As a profession, we can substantially advance the science and practice of I-O psychology through synthetic validity.

References

Aguinis, H., & Stone-Romero, E. F. (1997). Methodological artifacts in moderated multiple regression and their effects on statistical power. Journal of Applied Psychology, 82, 192–206.

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Balma, M. J. (1959). The development of processes for indirect or synthetic validity: 1. The concept of synthetic validity. A symposium. Personnel Psychology, 12, 395–396.

Bartram, D. (2005). The great eight competencies: A criterion-centric approach to validation. Journal of Applied Psychology, 90, 1185–1203.

Bobko, P., & Roth, P. L. (2003). Meta-analysis and validity generalization as research tools: Issues of sample bias and degrees of mis-specification. In K. R. Murphy (Ed.), Validity generalization: A critical review. Mahwah, NJ: Erlbaum.

Borman, W. C. (1979). Format and training effects on rating accuracy and rater errors. Journal of Applied Psychology, 64, 410–421.

Borman, W. C., Buck, D. E., Hanson, M. A., Motowidlo, S. J., Stark, S., & Drasgow, F. (2001). An examination of the comparative reliability, validity, and accuracy of performance ratings made using computerized adaptive rating scales. Journal of Applied Psychology, 86, 965–973.

Brannick, M. T. (2001). Implications of empirical Bayes meta-analysis for test validation. Journal of Applied Psychology, 86, 468–480.

Brown, K. G., Le, H., & Schmidt, F. L. (2006). Specific aptitude theory revisited: Is there incremental validity for training performance? International Journal of Selection and Assessment, 14, 87–100.

Burke, M. J., & Landis, R. S. (2003). Methodological and conceptual challenges in conducting and interpreting meta-analyses. In K. R. Murphy (Ed.), Validity generalization: A critical review. Mahwah, NJ: Erlbaum.

Burke, M. J., & Pearlman, K. (1988). Recruiting, selecting, and matching people with jobs. In J. P. Campbell & R. J. Campbell (Eds.), Productivity in organizations (pp. 97–142). San Francisco: Jossey-Bass.

Burke, M. J., Rupinski, M. T., Dunlap, W. P., & Davison, H. K. (1996). Do situational variables act as substantive causes of relationships between individual difference variables? Two large-scale tests of "common cause" models. Personnel Psychology, 49, 573–598.

Campbell, J. P. (1990). Modeling the performance prediction problem in industrial and organizational psychology. In M. D. Dunnette & L. M. Hough (Eds.), Handbook of industrial and organizational psychology (2nd ed., Vol. 1, pp. 39–74). Palo Alto, CA: Consulting Psychologists Press.

Civil Rights Act of 1991, 105 Stat. 1071 (1991).

Clarke, C. (2005). Automotive production systems and standardization: From Ford to the case of Mercedes-Benz. New York: Springer.

Connelly, B., & Ones, D. (2007, April). Combining conscientiousness scales: Can't get enough of the trait, baby. Paper presented at the 22nd Annual Conference of the Society for Industrial and Organizational Psychology, New York.


Cooper, W. H. (1981). Ubiquitous halo. Psychological Bulletin, 90, 218–244.

Decotiis, T., & Petit, A. (1978). The performance appraisal process: A model and some testable propositions. Academy of Management Review, 3, 635–646.

D'Egidio, E. L. (2001). Building a job component validity model using job analysis data from the Occupational Information Network. Unpublished doctoral dissertation, University of Houston, Houston, TX.

Dudley, N., Orvis, K., Lebiecki, J., & Cortina, J. (2006). A meta-analytic investigation of conscientiousness in the prediction of job performance: Examining the intercorrelations and the incremental validity of narrow traits. Journal of Applied Psychology, 91, 40–57.

Equal Employment Opportunity Commission, Civil Service Commission, Department of Labor, & Department of Justice. (1978). Uniform guidelines on employee selection procedures. Federal Register, 43, 38294–38309.

Gatewood, R. D., Feild, H. S., & Barrick, M. (2007). Human resource selection (6th ed.). Mason, OH: Thomson Learning.

Ghiselli, E. E. (1959). The development of processes for indirect or synthetic validity: II. The generalization of validity. A symposium. Personnel Psychology, 12, 397–402.

Gibson, S., Harvey, R. J., & Quintela, Y. (2004, April). Holistic versus decomposed ratings of general dimensions of work activity. Poster presented at the 19th Annual Conference of the Society for Industrial and Organizational Psychology, Chicago.

Gibson, W. M., & Caplinger, J. A. (2007). Transportation of validation results. In S. M. McPhail (Ed.), Alternative validation strategies: Developing new and leveraging existing validity evidence (pp. 29–81). San Francisco: Jossey-Bass.

Green, N. (1997). Ready-to-wear and ready-to-work: A century of industry and immigrants in Paris and New York. Durham, NC: Duke University Press.

Guion, R. M. (1965). Synthetic validity in a small company: A demonstration. Personnel Psychology, 18, 49–63.

Guion, R. M. (2006). Still learning. The Industrial-Organizational Psychologist, 44, 83–86.

Hamilton, J. W., & Dickinson, T. L. (1987). Comparison of several procedures for generating J-coefficients. Journal of Applied Psychology, 72, 49–54.

Hesketh, B., & Neal, A. (1999). Technology and performance. In D. R. Ilgen & E. D. Pulakos (Eds.), The changing nature of performance: Implications for staffing, motivation, and development (pp. 21–55). San Francisco: Jossey-Bass.

Hirsh, H. R., Schmidt, F. L., & Hunter, J. E. (1986). Estimation of employment validities by less experienced judges. Personnel Psychology, 39, 337–344.

Hoffman, C. C., Holden, L. M., & Gale, K. (2000). So many jobs, so little "N": Applying expanded validation models to support generalization of cognitive test validity. Personnel Psychology, 53, 955–991.

Hoffman, C. C., & McPhail, S. M. (1998). Exploring options for supporting test use in situations precluding local validation. Personnel Psychology, 51, 987–1003.

Hoffman, C. C., Rashkovsky, B., & D'Egidio, E. L. (2007). Job component validity: Background, current research, and applications. In S. M. McPhail (Ed.), Alternative validation strategies: Developing new and leveraging existing validity evidence (pp. 82–121). San Francisco: Jossey-Bass.

Hollenbeck, J. R., & Whitener, E. M. (1988). Criterion-related validation for small sample contexts: An integrated approach to synthetic validity. Journal of Applied Psychology, 73, 536–544.

Hough, L. M. (2001). I/Owes its advances to personality. In B. W. Roberts & R. T. Hogan (Eds.), The intersection of personality and industrial/organizational psychology (pp. 19–44). Washington, DC: American Psychological Association.

Hough, L. M., & Ones, D. S. (2001). The structure, measurement, validity, and use of personality variables in industrial, work, and organizational psychology. In N. R. Anderson, D. S. Ones, H. K. Sinangil, & C. Viswesvaran (Eds.), Handbook of work psychology (pp. 233–277). London and New York: Sage.

Hulin, C. L., Henry, R. A., & Noon, S. L. (1990). Adding a dimension: Time as a factor in the generalizability of predictive relationships. Psychological Bulletin, 107, 328–340.

Hulsheger, U. R., Maier, G. W., & Stumpp, T. (2007). Validity of general mental ability for the prediction of job performance and training success in Germany: A meta-analysis. International Journal of Selection and Assessment, 15, 3–18.

Hunter, J. E. (1983). Validity generalization for 12,000 jobs: An application of synthetic validity and validity generalization to the General Aptitude Test Battery (GATB). Washington, DC: U.S. Department of Labor, Employment Service.

Hunter, J. E., & Schmidt, F. L. (1996). Intelligence and job performance: Economic and social implications. Psychology, Public Policy, and Law, 2, 447–472.

Hyland, A. M., & Muchinsky, P. M. (1991). Assessment of the structural validity of Holland's model with job analysis (PAQ) information. Journal of Applied Psychology, 76, 75–80.

Ilgen, D. R., & Feldman, J. M. (1983). Performance appraisal: A process focus. In B. M. Staw & L. L. Cummings (Eds.), Research in organizational behavior (Vol. 5, pp. 141–197). Greenwich, CT: JAI Press.

James, L. R., Demaree, R. G., Mulaik, S. A., & Ladd, R. T. (1992). Validity generalization in the context of situational models. Journal of Applied Psychology, 77, 3–14.

Jeanneret, P. R. (1992). Applications of job component/synthetic validity to construct validity. Human Performance, 5, 81–96.

Jeanneret, P. R., Borman, W. C., Kubisiak, U. C., & Hanson, M. (1999). Generalized work activities. In N. G. Peterson, M. D. Mumford, W. C. Borman, P. R. Jeanneret, & E. A. Fleishman (Eds.), An occupational information system for the 21st century: The development of O*NET. Washington, DC: American Psychological Association.

Jeanneret, P. R., & Strong, M. H. (2003). Linking O*NET job analysis information to job requirement predictors: An O*NET application. Personnel Psychology, 56, 465–492.


Johnson, J. W. (2003). Toward a better understanding of the relationship between personality and individual job performance. In M. R. Barrick & A. M. Ryan (Eds.), Personality and work: Reconsidering the role of personality in organizations (pp. 83–120). San Francisco: Jossey-Bass.

Johnson, J. W. (2007). Synthetic validity: A technique of use (finally). In S. M. McPhail (Ed.), Alternative validation strategies: Developing new and leveraging existing validity evidence (pp. 122–158). San Francisco: Jossey-Bass.

Johnson, J. W., & Carter, G. W. (in press). Validating synthetic validation: Comparing traditional and synthetic validity coefficients. Personnel Psychology.

Johnson, J. W., Carter, G. W., Davison, H. K., & Oliver, D. (2001). A synthetic validity approach to testing differential prediction hypotheses. Journal of Applied Psychology, 86, 774–780.

LaPolice, C. C., Carter, G. W., & Johnson, J. W. (2008). Linking O*NET descriptors to occupational literacy requirements using job component validation. Personnel Psychology, 61, 405–441.

Lawshe, C. H. (1952). Employee selection. Personnel Psychology, 6, 31–34.

McCormick, E. J. (1959). The development of processes for indirect or synthetic validity: III. Application of job analysis to indirect validity. A symposium. Personnel Psychology, 12, 402–413.

McCormick, E. J. (1979). Job analysis: Methods and applications. New York: American Management Association.

McCormick, E. J., DeNisi, A. S., & Shaw, J. B. (1979). Use of the Position Analysis Questionnaire for establishing the job component validity of tests. Journal of Applied Psychology, 64, 51–56.

McCormick, E. J., & Jeanneret, P. R. (1988). Position Analysis Questionnaire (PAQ). In S. Gael (Ed.), The job analysis handbook for business, industry, and government (Vol. II, pp. 825–842). New York: Wiley.

McCormick, E. J., Jeanneret, P. R., & Mecham, R. C. (1972). A study of job characteristics and job dimensions as based on the Position Analysis Questionnaire (PAQ). Journal of Applied Psychology, 56, 347–368 [Monograph].

McCormick, E. J., Mecham, R. C., & Jeanneret, P. R. (1989). Technical manual for the Position Analysis Questionnaire (PAQ) (2nd ed.). Logan, UT: PAQ Services.

McCoy et al. v. Willamette Industries, Inc. U.S. District Court for the Southern District of Georgia, Savannah Division, Civil Action No. CV401-075 (2001).

McPhail, S. M. (2007). Alternative validation strategies: Developing new and leveraging existing validity evidence. San Francisco: Jossey-Bass.

Meyer, R. D., Dalal, R. S., & Bonaccio, S. (2009). A meta-analytic investigation into the moderating effects of situational strength on the conscientiousness–performance relationship. Journal of Organizational Behavior, 30, 1077–1102.

Morris, D. C., Hoffman, C. C., & Schultz, K. S. (2003). A comparison of job components validity estimates to meta-analytic validity estimates. Poster presented at the 18th Annual Conference of the Society for Industrial and Organizational Psychology, Orlando, FL.

Mossholder, K. W., & Arvey, R. D. (1984). Synthetic validity: A conceptual and comparative review. Journal of Applied Psychology, 69, 322–333.

Murphy, K. (2008). Explaining the weak relationship between job performance and ratings of job performance. Industrial and Organizational Psychology: Perspectives on Science and Practice, 1, 148–160.

Murphy, K. (2009). Validity, validation and values. The Academy of Management Annals, 3, 421–461.

Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.

Ones, D. S., & Viswesvaran, C. (2003). Job-specific applicant pools and national norms for personality scales: Implications for range restriction corrections in validation research. Journal of Applied Psychology, 88, 570–577.

Oswald, F. L., & McCloy, R. A. (2003). Meta-analysis and the art of the average. In K. R. Murphy (Ed.), Validity generalization: A critical review. Mahwah, NJ: Erlbaum.

Pearlman, K., Schmidt, F. L., & Hunter, J. E. (1980). Validity generalization results for tests used to predict job proficiency and training success in clerical occupations. Journal of Applied Psychology, 65, 373–406.

Peterson, N. G., Mumford, M. D., Levin, K. Y., Green, J., & Waksberg, J. (1999). Research method: Development and field testing of the content model. In N. G. Peterson, M. D. Mumford, W. C. Borman, P. R. Jeanneret, & E. A. Fleishman (Eds.), An occupational information system for the 21st century: The development of O*NET. Washington, DC: American Psychological Association.

Peterson, N. G., Wise, L. L., Arabian, J., & Hoffman, R. G. (2001). Synthetic validation and validity generalization: When empirical validation is not possible. In J. P. Campbell & D. J. Knapp (Eds.), Exploring the limits of personnel selection and classification (pp. 411–451). Mahwah, NJ: Erlbaum.

Primoff, E. S. (1957). The J-coefficient approach to jobs and tests. Personnel Administrator, 20, 31–40.

Primoff, E. S. (1959). Empirical validation of the J-coefficient. Personnel Psychology, 12, 413–418.

Primoff, E., & Fine, S. (1988). A history of job analysis. In S. Gael (Ed.), The job analysis handbook for business, industry and government (pp. 14–29). Toronto: John Wiley & Sons.

Pulakos, E. D., Arad, S., Donovan, M. A., & Plamondon, K. E. (2000). Adaptability in the workplace: Development of a taxonomy of adaptive performance. Journal of Applied Psychology, 85, 612–624.

Rashkovsky, B., & Hoffman, C. C. (2005, April). Examining a potential extension of the job component validity model to include personality predictors. In D. A. Newman & C. C. Hoffman (Chairs), Personnel selection with multiple predictors: Issues and frontiers. Symposium conducted at the 20th Annual Conference of the Society for Industrial and Organizational Psychology, Los Angeles.

Ree, M. J., & Earles, J. A. (1991). Predicting training success: Not much more than g. Personnel Psychology, 44, 321–332.

Ree, M. J., Earles, J. A., & Teachout, M. S. (1994). Predicting job performance: Not much more than g. Journal of Applied Psychology, 79, 518–524.


Rotundo, M., & Sackett, P. R. (2002). The relative importance of task, citizenship, and counterproductive performance to global ratings of job performance: A policy-capturing approach. Journal of Applied Psychology, 87, 66–80.

Rounds, J. B., Shubsachs, A. P. W., Dawis, R. V., & Lofquist, L. H. (1978). A test of Holland's environmental formulations. Journal of Applied Psychology, 63, 609–616.

Russell, C. J., & Gilliland, S. W. (1995). Why meta-analysis doesn't tell us what the data really mean: Distinguishing between moderator effects and moderator processes. Journal of Management, 21, 813–831.

Rybczynski, W. (2000). One good turn: A natural history of the screwdriver and the screw. Toronto: HarperFlamingo.

Sackett, P. R. (1991). Exploring strategies for clustering military occupations. In A. Wigdor & B. Green (Eds.), Performance assessment for the workplace (Vol. 2, pp. 305–332). Washington, DC: National Academy Press.

Sackett, P. R. (2003). The status of validity generalization research: Key issues in drawing inferences from cumulative research findings. In K. R. Murphy (Ed.), Validity generalization: A critical review. Mahwah, NJ: Erlbaum.

Sackett, P. R., & Ellingson, J. E. (1997). The effects of forming multi-predictor composites on group differences and adverse impact. Personnel Psychology, 50, 707–721.

Sackett, P. R., Schmitt, N., Tenopyr, M., Kehoe, J., & Zedeck, S. (1985). Commentary on "Forty questions about validity generalization and meta-analysis." Personnel Psychology, 38, 697–798.

Scherbaum, C. A. (2005). Synthetic validity: Past, present, and future. Personnel Psychology, 58, 481–515.

Schmidt, F. L., & Hunter, J. E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62, 529–540.

Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274.

Schmidt, F. L., Hunter, J. E., Croll, P. R., & McKenzie, R. C. (1983). Estimation of employment test validities by expert judgment. Journal of Applied Psychology, 68, 590–601.

Schmidt, F. L., Hunter, J. E., & Pearlman, K. (1981). Task differences and the validity of aptitude tests in selection: A red herring. Journal of Applied Psychology, 66, 166–185.

Schmidt, F. L., Hunter, J. E., & Pearlman, K. (1982). Progress in validity generalization: Comments on Callender and Osburn and further developments. Journal of Applied Psychology, 67, 835–845.

Schmidt, F. L., Hunter, J. E., Pearlman, K., & Rothstein, H. R. (1985). Forty questions about validity generalization and meta-analysis. Personnel Psychology, 38, 697–798.

Schneider, R. J., Hough, L. M., & Dunnette, M. D. (1996). Broadsided by broad traits: How to sink science in five dimensions or less. Journal of Organizational Behavior, 17, 639–655.

Shultz, K. S., Riggs, M. L., & Kottke, J. L. (1999). The need for an evolving concept of validity in industrial and personnel psychology: Psychometric, legal, and emerging issues. Current Psychology, 17, 265–286.

Smith, P. C. (1976). Behavior, results, and organizational effectiveness: The problem of criteria. In M. D. Dunnette (Ed.), Handbook of industrial and organizational psychology (pp. 745–775). Chicago: Rand-McNally.

Society for Industrial and Organizational Psychology. (2003). Principles for the validation and use of personnel selection procedures. Bowling Green, OH: SIOP.

Sparrow, J. (1989). The utility of PAQ in relating job behaviors to traits. Journal of Occupational Psychology, 62, 151–162.

Steel, P., Huffcutt, A., & Kammeyer-Mueller, J. (2006). From the work one knows the worker: A systematic review of the challenges, solutions, and steps to creating synthetic validity. International Journal of Selection and Assessment, 14, 16–36.

Steel, P., & Kammeyer-Mueller, J. (2002). Comparing meta-analytic moderator search techniques under realistic conditions. Journal of Applied Psychology, 87, 96–111.

Steel, P., & Kammeyer-Mueller, J. (2008). Bayesian variance estimation for meta-analysis: Quantifying our uncertainty. Organizational Research Methods, 11, 54–78.

Steel, P., & Kammeyer-Mueller, J. (2009). Using a meta-analytic perspective to enhance job component validation. Personnel Psychology, 62, 533–552.

Steel, P., Schmidt, J., & Schultz, J. (2008). Refining the relationship between personality and subjective well-being. Psychological Bulletin, 134, 138–161.

Taylor v. James River Corporation, CA 88-0818-T-C (TC) (S.D. AL, 1989).

Terpstra, D., & Rozell, E. (1997). Why some potentially effective staffing practices are seldom used. Public Personnel Management, 26, 483–495.

Trattner, M. H. (1982). Synthetic validation and its application to the Uniform Guidelines validation requirements. Personnel Psychology, 35, 383–397.

Verive, J. M., & McDaniel, M. A. (1996). Short-term memory tests in personnel selection: Low adverse impact and high validity. Intelligence, 23, 15–32.

Viswesvaran, C., & Ones, D. S. (1995). Theory testing: Combining psychometric meta-analysis and structural equations modeling. Personnel Psychology, 48, 865–885.

Viswesvaran, C., Schmidt, F. L., & Ones, D. S. (2005). Is there a general factor in ratings of job performance? A meta-analytic framework for disentangling substantive and error influences. Journal of Applied Psychology, 90, 108–131.

Wards Cove v. Atonio, 490 U.S., 104 L.Ed. 2d 733, 109 S.Ct. 2115 (1989).

Wilk, S. L., Desmarais, L. B., & Sackett, P. R. (1995). Gravitation to jobs commensurate with ability: Longitudinal and cross-sectional tests. Journal of Applied Psychology, 80, 79–85.

Wilk, S. L., & Sackett, P. R. (1996). Longitudinal analysis of ability-job complexity fit and job change. Personnel Psychology, 49, 937–967.