Journal of Experimental Psychology: Human Perception and Performance, 1976, Vol. 2, No. 2, 257-266

Processing Two Dimensions of Nonspeech Stimuli: The Auditory-Phonetic Distinction Reconsidered

Mark J. Blechner and Ruth S. Day
Yale University and Haskins Laboratories

James E. Cutting
Wesleyan University and Haskins Laboratories

Nonspeech stimuli were varied along two dimensions—intensity and rise time. In a series of speeded classification tasks, subjects were asked to identify the stimuli in terms of one of these dimensions. Identification time for the dimension of rise time increased when there was irrelevant variation in intensity; however, identification of intensity was unaffected by irrelevant variation in rise time. When the two dimensions varied redundantly, identification time decreased. This pattern of results is virtually identical to that obtained previously for stimuli that vary along a linguistic and a nonlinguistic dimension. The present data, taken together with those from other studies using the same stimuli, suggest that the mechanisms underlying the auditory-phonetic distinction should be reconsidered. The results are also discussed in terms of general models of multidimensional information processing.

Several contemporary accounts of speech perception have emphasized the organization of processing into a hierarchy of levels, including auditory, phonetic, phonological, lexical, syntactic, and semantic (Fry, 1956; Stevens & House, 1972; Studdert-Kennedy, in press). The distinction between phonetic and higher levels has been commonly accepted by linguists and psychologists for some time. Recently, however, much attention has been directed toward the auditory-phonetic distinction (e.g., Fant, 1967; Stevens & Halle, 1967; Studdert-Kennedy, Shankweiler, & Pisoni, 1972). Fry (1956), in an early discussion of the levels-of-processing view of speech perception, emphasized the role of the "physical-psychological transformation" that occurs in the recognition of phonemes from the acoustical signal. The important characteristic of this transformation is that there is no one-to-one relationship between "the number and arrangement of physical clues and the sound which is recognized" (Fry, 1956, p. 170). Fry did not state that the physical-psychological transformations characteristic of speech are exclusive to speech. However, this possibility was emphasized by later work which viewed speech perception as mediated by articulatory mechanisms (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967; Stevens & House, 1972). Such a view made it desirable and perhaps necessary to partition all sounds into two general classes: those which are speech and those which are not.1

This research was supported by National Institute of Mental Health Training Grant PHS5T01MH05276-27 to Yale University, and by National Institute of Child Health and Human Development Grant HD-01994 to the Haskins Laboratories. The authors thank Michael Studdert-Kennedy for a discussion of the issues raised in this paper, Robert L. Plotz for assistance in running the experiment, and Richard M. Levich for statistical advice.

Requests for reprints should be sent to Mark J. Blechner at Haskins Laboratories, 270 Crown Street, New Haven, Connecticut 06510.

Definitions concerning which criteria must be met in order for sounds to be classified as speech have varied in the literature. The present paper assumes a two-part definition of speech: Sounds that (a) can be articulated by the human vocal apparatus; and (b) can be recoded into higher order linguistic units. According to this definition, phonetic processing may refer to articulatory processes either directly (Liberman et al., 1967) or implicitly (Stevens & House, 1972), and follows some system of linguistic organization such as a distinctive feature system (Chomsky & Halle, 1968; Jakobson, Fant, & Halle, 1963). Furthermore, while all sounds undergo auditory processing, only speech sounds undergo phonetic processing.

1 Some authors use the terms linguistic and nonlinguistic or verbal and nonverbal to indicate the same distinction.

To assess the psychological reality of the auditory-phonetic distinction, many experiments have been conducted. Results from several paradigms suggest that even though speech sounds differ along a wide variety of acoustic dimensions, they are perceived in ways that are qualitatively distinct from the way nonspeech sounds are perceived. For example, the main difference between the phonemes /ba/ and /da/ is the direction and extent of the second-formant transition (Liberman, Delattre, Cooper, & Gerstman, 1954), while the distinction between /ba/ and /pa/ lies in the latency between the initial plosive burst and the onset of voicing (Lisker & Abramson, 1964). Yet, for both distinctions, there is no one-to-one relationship between changes in the acoustic patterns and probabilities of identification. Instead, item identifications remain at or near 100% as one phoneme or the other, with an abrupt crossover at the "phoneme boundary." More importantly, whereas most stimulus dimensions in the environment, such as frequency, intensity, and brightness, can be discriminated from one another much more accurately than they can be identified (Miller, 1956), this is not the case for several linguistic dimensions. Instead, two acoustically different stimuli that lie within the same phoneme category are discriminated at near-chance level; two stimuli that lie in separate categories but differ by the same acoustic increment are discriminated with very few errors. This nonlinear mode of perception is called categorical perception (Liberman, Harris, Hoffman, & Griffith, 1957; for a review, see Studdert-Kennedy, in press).

Other experimental operations have shown processing differences for speech and nonspeech stimuli. Recent work using a selective adaptation paradigm has shown that repeated presentation of a consonant-vowel (CV) syllable produces systematic shifts in the phoneme boundary (Eimas, Cooper, & Corbit, 1973; Eimas & Corbit, 1973). Some experiments suggest that the basis of this adaptation is phonetic rather than auditory (Cooper, 1975), although this evidence is not conclusive (Studdert-Kennedy, in press). The auditory-phonetic distinction seems to be further supported by dichotic identification tasks that often reveal right-ear advantages for speech stimuli (e.g., Kimura, 1961; Shankweiler & Studdert-Kennedy, 1967) and left-ear advantages for nonspeech stimuli (e.g., Chaney & Webster, 1966; Curry, 1967; Kimura, 1964).

In addition, several experiments have investigated the relationship of auditory and phonetic processes in selective attention tasks using stimuli that vary along two dimensions. When both dimensions are linguistic (e.g., initial stop consonant and vowel in CV syllables), selective attention for either dimension is impaired by irrelevant variation in the other dimension (Wood & Day, 1975). This pattern of symmetric interference also occurs when both dimensions are nonlinguistic, such as pitch and intensity (Wood, 1975a). However, when one dimension is linguistic and the other is not, a pattern of asymmetric interference appears; reaction time (RT) for identification of stop consonants is impaired by irrelevant variation in pitch, but RT for pitch identification is not increased significantly by irrelevant variation in stop consonant (Day & Wood, 1972; Wood, 1974, 1975a). These results have been interpreted to support the auditory-phonetic distinction.

The dichotic listening, categorical perception, selective adaptation, and speeded classification experiments appear to comprise a set of converging operations (Garner, Hake, & Eriksen, 1956) on the psychological reality of the auditory-phonetic distinction. Recently, however, this view has been brought into question by experiments in which the stimuli are sawtooth-wave tones differing in rise time (see Cutting, in press). While rise time can cue a speech distinction, such as the difference between the syllables /ʃa/ and /tʃa/, sawtooth waves are not perceived as speech. Instead, they sound comparable to a plucked or bowed violin string. Although these "plucks" and "bows" are not processed phonetically, they are processed similarly to speech in several ways. They are perceived categorically (Cutting & Rosner, 1974), and their identification functions shift in the same manner as for speech following selective adaptation (Cutting, Rosner, & Foard, in press). Ear-advantage data for plucks and bows are not decisive.

The present experiment seeks to investigate further the processes by which the plucks and bows are perceived and their relationship to the auditory-phonetic distinction. The two-choice speeded classification procedure developed by Garner and Felfoldy (1970) and modified for use in auditory experiments by Day and Wood (1972; Wood, 1974, 1975a; Wood & Day, 1975) was used to determine how the dimensions of rise time and intensity interact. If symmetric interference necessarily results when stimulus dimensions are of the same general class, that is, both linguistic or nonlinguistic, then selective attention to either rise time or intensity should suffer from irrelevant variation in the other dimension. However, if a pattern of asymmetric interference occurs, it would be clear that such a pattern need not be based on separate auditory and phonetic levels of processing. This pattern of results, along with those of other experiments using pluck and bow stimuli, would lead one to question the mechanisms underlying the auditory-phonetic distinction as currently conceived.

The present experiment is also concerned with the processing of multidimensional stimuli in general. The pattern of asymmetric interference suggests a serial model; only the dimension processed first interferes with the other. However, this view has been challenged by the finding that when the two dimensions of the same stimulus vary redundantly, subjects can identify them more quickly than when only one dimension varies (Wood, 1974). This redundancy gain (Garner & Felfoldy, 1970) seems to argue against a serial model. If pitch processing were completed before processing of the stop consonant began, a subject should not be able to use redundant information about the stop consonant to speed up identification of pitch. Instead, a parallel model, which posits that the processing of the two dimensions overlaps or is simultaneous, would better account for the redundancy gain. Information processing models that propose strict serial or parallel processing do not seem to offer an adequate explanation of Wood's (1974) finding of asymmetric interference with redundancy gain. It seems likely, instead, that subjects have some degree of freedom about the kinds of processing that they use in different task conditions. The present experiment, by also including a task that varies the two dimensions redundantly, seeks to distinguish the conditions in which processing strategies are optional from those in which they are mandatory.

METHOD

Stimuli

Stimuli varied along two dimensions—intensity and rise time. They were derived from the sawtooth waves used by Cutting and Rosner (1974), generated on the Moog synthesizer at the Presser Electronic Studio at the University of Pennsylvania. The original stimuli differed in rise time and resembled the sound of a plucked or bowed violin string. The pluck and bow stimuli reached maximum intensity in 10 and 80 msec, respectively. Two more stimuli of lower amplitude were created by attenuating the original stimuli 7 dB, using the Pulse Code Modulation (PCM) system at Haskins Laboratories (Cooper & Mattingly, 1969). Thus the final four stimuli were loud pluck, soft pluck, loud bow, and soft bow. The absolute levels of the loud and soft stimuli were 75 and 68 dB re 20 μN/m², respectively. All stimuli were truncated to 800 msec in duration (the original stimuli were approximately 1,050 msec), and then digitized and stored on disc file using the PCM system. Items were reconverted to analog form at the time of tape recording.
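For readers who wish to approximate such stimuli digitally, the following sketch (not the original Moog and PCM procedure) shows the arithmetic involved: a sawtooth carrier, a linear rise of 10 or 80 msec, and a 7-dB attenuation, which corresponds to multiplying the waveform by 10^(-7/20), or about 0.45. The sample rate and fundamental frequency below are assumptions for illustration only; they are not specified in the text.

import numpy as np

SR = 10000                                    # sample rate in Hz (assumed)

def stimulus(rise_ms, attenuate_db=0.0, dur_ms=800, f0=440.0):
    """Sawtooth burst with a linear rise time and optional attenuation."""
    t = np.arange(int(SR * dur_ms / 1000)) / SR
    carrier = 2.0 * (t * f0 - np.floor(t * f0 + 0.5))      # sawtooth wave in [-1, 1]
    envelope = np.minimum(t / (rise_ms / 1000.0), 1.0)     # reach full amplitude at rise_ms
    gain = 10.0 ** (-attenuate_db / 20.0)                  # 7 dB down = about 0.45 of full amplitude
    return gain * envelope * carrier

loud_pluck = stimulus(rise_ms=10)
soft_pluck = stimulus(rise_ms=10, attenuate_db=7.0)
loud_bow   = stimulus(rise_ms=80)
soft_bow   = stimulus(rise_ms=80, attenuate_db=7.0)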

Tapes

All tapes were prepared on the PCM system. Test stimuli were recorded on one channel of the audio tape. On the other channel, brief pulses were synchronized with the onset of each test stimulus; these pulses triggered the reaction time counter during the experimental session.

TABLE 1
STIMULUS SETS FOR EACH DIMENSION AND CONDITION

Dimension    Control                    Correlated              Orthogonal
Rise time    Loud pluck, Loud bow       Loud pluck, Soft bow    Loud pluck, Soft pluck,
             or Soft pluck, Soft bow                            Loud bow, Soft bow
Intensity    Loud pluck, Soft pluck     Loud pluck, Soft bow    Loud pluck, Soft pluck,
             or Loud bow, Soft bow                              Loud bow, Soft bow

A display tape was prepared to introduce the listeners to the stimuli. The four stimuli were played in the same order several times, beginning with three tokens of each item, then two of each, and finally one of each. Practice tapes were also prepared, consisting of a randomized order of 20 items, five of each stimulus. There were two practice tapes, each with a different random order.

The eight test tapes each contained 64 stimuli with a 2-sec interstimulus interval. Each tape was composed of different subsets of the four stimuli, depending on the condition of the experiment. In the control condition, the stimuli varied along only one dimension, while the other dimension was held constant. Thus, for example, one intensity control tape consisted of loud and soft bows only, while the other consisted of loud and soft plucks. For half the subjects, the nontarget dimension (in this case, rise time) was held constant at one value (pluck), whereas for the other half, it was held constant at the other (bow). Rise-time control tapes were constructed in an analogous fashion. In the orthogonal condition, both dimensions varied independently. Hence, the two tapes for this condition contained all four kinds of stimuli, in different random orders. In the correlated condition, the dimensions varied in a completely redundant manner: All of the pluck stimuli were loud and all of the bow stimuli were soft. See Table 1 for a complete outline of the stimuli in each condition.
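The mapping from conditions to stimulus subsets, and the makeup of a single test tape, can be summarized in a short sketch. The function names, and the assumption of equal item frequencies within a tape, are ours for illustration; the text specifies only that each tape held 64 items in a random order.

import random

STIMULI = ["loud pluck", "soft pluck", "loud bow", "soft bow"]

def stimulus_set(dimension, condition, held_at="loud"):
    """Stimulus subset for one test tape (cf. Table 1).

    For a control tape, held_at names the fixed value of the nontarget
    dimension ("loud"/"soft" for rise-time control, "pluck"/"bow" for
    intensity control)."""
    if condition == "orthogonal":
        return list(STIMULI)                       # both dimensions vary independently
    if condition == "correlated":
        return ["loud pluck", "soft bow"]          # dimensions vary redundantly
    if dimension == "rise time":                   # only rise time varies
        return [held_at + " pluck", held_at + " bow"]
    return ["loud " + held_at, "soft " + held_at]  # only intensity varies

def test_tape(dimension, condition, held_at="loud", n_items=64, seed=0):
    items = stimulus_set(dimension, condition, held_at)
    sequence = items * (n_items // len(items))     # equal frequencies assumed
    random.Random(seed).shuffle(sequence)
    return sequence

print(test_tape("rise time", "control")[:8])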

Subjects and Apparatus

The six subjects (five males and one female, from 19 to 27 years of age) participated in all six tasks. All reported no history of hearing trouble.

The tapes were played on an Ampex AG-500 tape recorder and the stimuli were presented through calibrated Telephonics headphones (Model TDH39-300Z). Subjects sat in a sound-insulated room and responded with their dominant hand on the two telegraph keys mounted on a wooden board. Throughout the experiment, the left key was used for pluck and loud responses, while the right key was used for bow and soft responses. The pulse on one channel of the tape triggered a Hewlett-Packard 522B Electronic Counter. When a response on either telegraph key stopped the counter, the reaction time was registered onto paper tape by a Hewlett-Packard 560A digital recorder for subsequent analysis. The listener's response choice was recorded manually by the experimenter.

Procedure

At the start of the session, subjects were informed of the general nature of the experiment and of the dimensions which they would be asked to identify. They were told that the difference in rise time could be compared to the difference in sound between a plucked and a bowed violin string.

For preliminary training, subjects heard the display sequence twice. Next, they listened to the two sets of 20 practice items and responded verbally, first attending to rise time and then to intensity. This insured that the subjects could perceive the differences along both dimensions. They then repeated the same practice trials, responding with a key press rather than verbally, in order to become familiar with the mode of response. Subjects were instructed to respond as quickly as possible without making errors.

The order of presentation for the six test tapes was determined by a balanced Latin square design. Before the test trials of each condition, appropriate instructions were read. Subjects were then given eight practice trials to help stabilize reaction time performance and to familiarize them with the identification task and stimulus set for that particular condition. In the control conditions subjects were told which dimension to attend to and the value at which the other dimension would be held. In the orthogonal conditions, they were instructed to attend to one dimension and to ignore variation in the other dimension. In the correlated conditions, they were instructed to attend to one dimension, but were encouraged to use the additional information from the other dimension.
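As an illustration of how a balanced Latin square for six tapes can be generated, here is a minimal sketch using a standard "zigzag" construction for an even number of conditions. The particular orders used in the experiment are not reported, so this is an assumption about the design family, not a reconstruction of the actual orders.

def balanced_latin_square(n):
    """Orders for n conditions; with n even, every condition immediately
    precedes every other condition exactly once across the n rows."""
    rows = []
    for r in range(n):
        row, low, high = [], 0, n - 1
        for i in range(n):
            row.append((r + (high if i % 2 else low)) % n)   # zigzag: 0, n-1, 1, n-2, ...
            if i % 2:
                high -= 1
            else:
                low += 1
        rows.append(row)
    return rows

for order in balanced_latin_square(6):
    print(order)      # one presentation order of the six test tapes per subject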

RESULTS AND DISCUSSION

Both dimensions were easy to identify. In the practice trials, no subject made more than 2% errors. During test trials, the highest mean error rate for any condition was 1.8%. A three-way analysis of variance of the error data (Subjects × Conditions × Dimensions) revealed no significant main effects or interactions. Therefore, the error data are not considered in detail in this discussion.

Asymmetric Interference with Redundancy Gain

For the reaction time data, median RT was calculated for each individual block of trials for each subject, and means of medians for each condition across subjects were computed.2 In addition, the untransformed RT data were subjected to a complete four-way factorial analysis of variance (Subjects × Conditions × Dimensions × Within Cell).
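A minimal sketch of this summarization is given below; it simply takes a median per block and then averages those medians within a condition. The data layout and the numbers are invented for illustration, not taken from the experiment.

import numpy as np

# rts[condition] -> list of per-block RT arrays (msec), pooled over subjects
rts = {
    "rise time, control":    [np.array([430., 415., 440.]), np.array([420., 425., 410.])],
    "rise time, orthogonal": [np.array([490., 470., 485.]), np.array([475., 480., 495.])],
}

for condition, blocks in rts.items():
    block_medians = [np.median(block) for block in blocks]          # one median per block
    print(condition, "mean of block medians:", round(float(np.mean(block_medians)), 1))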

Median RT data are presented in Table 2. For the dimension of rise time, there was an increase of 53.5 msec from the control to the orthogonal condition, while for the dimension of intensity, there was an increase of only .2 msec. In the analysis of variance, the effect of dimensions (intensity versus rise time) was not significant, while the effect of conditions (control, orthogonal, and correlated) was significant, F(2, 126) = 33.23, p < .001. In addition, the Dimensions × Conditions interaction was significant, F(2, 126) = 6.43, p < .01. In order to differentiate interference effects from redundancy gains, a contrast of the interactions between the two dimensions and only the control and orthogonal conditions was performed, omitting correlated conditions. This contrast was significant, F(1, 63) = 16.58, p < .001. Thus there was an asymmetric pattern of interference; intensity variation interfered with the processing of rise time while rise time had virtually no effect on the processing of intensity. This finding is especially interesting given that intensity was somewhat more discriminable than rise time, although the difference of 14.2 msec between the control conditions was not statistically reliable.

The effect of redundancy gains was assessed by two methods. A contrast of the conditions effect showed the correlated conditions to be significantly different from the control conditions, F(1, 63) = 35.1, p < .001. A subsequent comparison of the individual means using the Newman-Keuls procedure showed the correlated conditions in both dimensions to differ significantly from the respective control condition. The different correlated conditions, like the control conditions, did not significantly differ from each other.

TABLE 2
MEDIAN REACTION TIME IN MILLISECONDS AND PERCENT ERRORS FOR EACH DIMENSION AND CONDITION

Dimension    Control         Correlated     Orthogonal
Rise time    426.5 (1.32)    393.7 (.06)    480.0 (1.66)
Intensity    442.3 (1.34)    406.4 (.06)    442.5 (1.83)

Note. Percent errors are shown in parentheses.

In order to determine whether the redundancy gain could rightfully be considered as evidence of parallel processing of the two dimensions in this experiment, three alternative explanations of the redundancy gain were ruled out, as in Wood (1974). First, the possibility of a different speed-accuracy trade-off in the two correlated conditions could be eliminated by the lack of significant differences in the error data, as noted above.

Second, the possibility that the redundancy gain could be due to selective serial processing (see Felfoldy & Garner, 1971) was considered. If subjects use this strategy in the correlated condition, they merely attend to the more discriminable of the two dimensions, regardless of the instructions. Thus, their RT data would show that neither correlated condition is faster than the faster control condition. To test for the occurrence of the selective serial processing strategy, the RT data for each correlated condition were tested against each subject's faster control condition with an analysis of variance and a subsequent comparison of means using the Newman-Keuls method. The correlated conditions were still found to be faster than the faster control condition, F(2, 63) = 13.5, p < .001. Therefore, the redundancy gain cannot be attributed to selective serial processing.
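The logic of this check can be expressed in a few lines: a genuine redundancy gain requires the correlated conditions to beat each subject's faster control condition, not merely the average of the two controls. The data layout and values below are hypothetical.

# mean RT (msec) per subject and condition; values are hypothetical
rt = {
    "S1": {"control rise": 430., "control intensity": 445.,
           "correlated rise": 395., "correlated intensity": 405.},
    "S2": {"control rise": 420., "control intensity": 450.,
           "correlated rise": 400., "correlated intensity": 410.},
}

for subject, conds in rt.items():
    faster_control = min(conds["control rise"], conds["control intensity"])
    for corr in ("correlated rise", "correlated intensity"):
        gain = faster_control - conds[corr]      # positive -> faster than the faster control
        print(subject, corr, "gain over faster control:", gain)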

2 Wood and Day, in all of their reaction time experiments, transformed their data, so that any RT longer than 1 sec was set equal to 1 sec. This was done to correct for possible malfunctioning of the equipment, such as failure of the response key to make electrical contact, or temporary inattention of the subject. While unusually long reaction times due to equipment trouble or lapsing of the subject's attention should be transformed, the continual resetting of long RT values to 1 sec is arbitrary and could distort the data. If arithmetic means are to be used, very long data points can be more equitably adjusted by using reciprocal RT values. Alternatively, medians or trimeans can be used, with similar effect.


Finally, because the stimulus sets in the two correlated conditions were the same, whereas in the control conditions they were different, it is possible that the redundancy gain could be based on differential transfer between control and correlated conditions (Biederman & Checkosky, 1970). To test this explanation, an analysis of variance of the control and correlated conditions (Subjects × Conditions × Order of Presentation × Within Cell) was performed. The control condition presented first was 16 msec slower than the second, suggesting a possible practice effect, although this difference was not reliable. The correlated conditions presented first and second differed by only 2 msec—not significantly. Thus the transfer between the correlated conditions was less than or equal to the transfer between the control conditions, so that differential transfer does not account for the redundancy gain.

The pattern of results in this experiment is remarkably similar to those of Wood (1974). The relationship of intensity to rise time matches that of pitch to initial stop consonant both in the asymmetric pattern of interference and in the significant redundancy gain. In Garner's (1974) terminology, the dimensions of rise time and intensity are therefore asymmetrically integral. In the following discussion we will first consider the implications of our results in terms of general information-processing models and then reconsider the auditory-phonetic distinction.

Relevance to Theories of Multidimensional Processing

The present results pose problems for information-processing models that try to account for perception only in terms of the serial or parallel handling of information. The data suggest that both stimulus and task characteristics may affect the mode of processing, so that neither a strict serial nor a strict parallel model can account for the whole picture. This view agrees with the recent suggestions of several authors (Garner, 1974; Nickerson, 1971; Townsend, 1971). One alternative to the strictly serial and parallel models, for example, is to view the present results as one point on a performance-resource function (Norman & Bobrow, 1975). Perhaps the processing of rise time is resource limited (relatively low in priority and competing with other processes for limited psychological resources) but the processing of intensity is data limited (relatively high in such priority and limited only by the clarity of the signal). An alternative view is that the two processes overlap temporally and that one is contingent on the other, as suggested by Turvey (1973). Perhaps the processing of rise time and intensity (as well as that of place of articulation and pitch) begins simultaneously. Both kinds of information can be combined to produce a redundancy gain in the correlated condition. However, it may be that only the orthogonal condition, which requires information gating (Posner, 1964), can reveal the contingency of one kind of information on the other. Current theories, however, do not account for why this contingency relationship might affect one task and not the other. Further research is needed on this point.

One useful approach might be to vary the discriminability of the two dimensions and to note the changes that occur in both the orthogonal and correlated conditions. In a pilot study of speeded classification, we used rise time and pitch where the latter dimension was considerably more discriminable than the former. The result was that the subjects always processed the more discriminable dimension first. Thus, in the orthogonal condition, there was asymmetric interference, and in the correlated condition, RT performance was equal to the faster control condition (the Selective Serial Processing pattern). However, it would be more interesting to determine whether, by manipulating discriminability in the reverse direction, rise time can be made to interfere with intensity. How much more discriminable than intensity would rise time have to be for a pattern of mutual interference to appear? If RT performance in the correlated condition were only as fast as the faster control condition, one might conclude that redundancy gains are impervious to the effects of contingency processing relationships between dimensions. Such a finding would be congruent with the results of the experiment reported here.

Decision-Combination Model of Redundancy Gain

Many different processing models have been proposed to account for redundancy gains (Morton, 1969), but there is no explicit way to distinguish among them for reaction time data. However, Wood (1975b) has provided a normative model, based on the theory of decision combination, that predicts both the size and the shape of the reaction time distribution for redundant dimensions. This model assumes that completion times for processing each stimulus dimension are independent random variables, and that the response choice on any given trial is initiated as soon as a decision is reached about either stimulus dimension. If the processing-time distributions for the stimulus dimensions overlap, a performance gain for redundant information is predicted.

Wood applied the decision-combination model to his own data (Wood, 1974) and to simulated distributions based on the data of Biederman and Checkosky (1970). In all cases, he found that the model predicted redundancy gains that were slightly but consistently larger than those obtained.

The data of the present experiment were also analyzed in terms of the decision-combination model.3 Following Wood (1975b), reaction time distributions for each subject in each control condition were calculated (20-msec cell widths) and then were transformed into z units.4 The single-dimension distributions were then combined across subjects and entered into Equation 1:

h(t) = f(t)[1 - G(t)] + g(t)[1 - F(t)],          (1)

where f(t) and g(t) represent the distributions for the individual dimensions, F(t) and G(t) the corresponding cumulative distributions, and h(t) represents the predicted distribution when dimensions vary redundantly.
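A minimal numerical sketch of this combination is given below, assuming Equation 1 is the density of the faster of two independent decisions, which is what the model's stated assumptions imply. The histogram bin centers and the roughly normal control-condition densities are invented for illustration; they are not the experiment's distributions.

import numpy as np

def predicted_redundant_density(f, g, bin_width):
    """h(t) = f(t)[1 - G(t)] + g(t)[1 - F(t)] on a common discrete grid."""
    F = np.cumsum(f) * bin_width                 # cumulative distribution for dimension 1
    G = np.cumsum(g) * bin_width                 # cumulative distribution for dimension 2
    return f * (1.0 - G) + g * (1.0 - F)

width = 20.0                                     # 20-msec cells, as in the text
t = np.arange(200.0, 800.0, width)               # bin centers (msec), illustrative
f = np.exp(-0.5 * ((t - 420.0) / 87.0) ** 2)     # one control-condition density
g = np.exp(-0.5 * ((t - 436.0) / 95.0) ** 2)     # the other control-condition density
f /= f.sum() * width                             # normalize each to unit area
g /= g.sum() * width
h = predicted_redundant_density(f, g, width)
print("predicted mean RT:", round(float((t * h).sum() / h.sum()), 1))   # falls below both control means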

TABLE 3
MEANS AND STANDARD DEVIATIONS OF OBTAINED AND PREDICTED REACTION TIME DISTRIBUTIONS

Condition                            M        SD
Rise-time control condition          419.9    87.0
Intensity control condition          435.6    95.5
Correlated condition: Obtained*      400.7    84.2
Correlated condition: Predicted      395.2    72.3

* Pooled data from the two correlated conditions, which, as noted in the text, did not differ significantly from one another.

The results of this analysis are displayed in Table 3. The model predicts a redundancy gain 5.5 msec larger than that actually obtained. This result parallels Wood's finding that the model predicts redundancy gains that are slightly but consistently larger than those obtained. It also demonstrates another way in which there is similar processing of place of articulation and of rise time.

3 Wood (1974) set all reaction times greater than 1 sec equal to 1 sec. In the subsequent analysis of the same data, Wood (1975b) does not mention this transformation, although its use is inappropriate when applying the decision-combination model, since a "lump" in the reaction time distribution at 1,000 msec would result. Some form of preliminary transformation is necessary, however, because extreme outlying scores would result in a large number of empty cells when calculating reaction time distributions. In the present data, such outliers were confined to very high values; therefore, for the purpose of applying the decision-combination model, it was decided not to consider the 5% highest reaction times for each subject in each condition. This method is justified when, as in the present experiment, error rates are very low, so that there are outliers at high values but not at low values. The method would not be justified if speed were emphasized much more, such that high error rates and extremely low reaction times occurred.

4 The normalization of individual subjects' reaction time distributions is quite important, especially when there are large differences in the overall reaction times for each subject, as in the present data. Wood (1975b) found that his model predicted redundancy gains adequately when he simulated Biederman and Checkosky's (1970) data with normal distributions based on their reported means and standard deviations. In the present data, however, the simulation of a normal distribution from the mean and standard deviation results in a predicted redundancy gain considerably larger than that actually obtained. This is mainly because the large across-subject variance in the data produces greater overlap between the two control-condition distributions than that which actually occurred for any individual subject.


It is important to note, however, that the decision-combination model does not account for asymmetric effects. The model is sensitive to differences in discriminability between stimulus dimensions, but it is indifferent to the nature of the individual dimensions. For example, it would predict the same redundancy gain whether intensity were perceived n msec faster than rise time, or vice versa. Thus the model provides a potent norm against which future research may test the hypothesis that certain pairs of stimulus dimensions produce asymmetric redundancy gains.

Relevance to the Auditory-Phonetic Distinction

The present results bear not only on the way that different levels of processing interact, but also on the broader question of which levels of processing are important in human auditory perception. In light of the present data, which show asymmetric interference between two "auditory" dimensions, it seems unwise to lump together all acoustic properties that do not provide linguistic cues. At the very least, a heterogeneous conception of auditory analyzing systems, with processing levels of increasing complexity, seems preferable.

Assuming that there are several levels of auditory processing, what are the specific parameters that best describe these levels? The research on auditory and phonetic levels of processing suggested that acoustic factors could not adequately account for the data from experiments using speech stimuli, and took recourse instead to models based on the linguistic-nonlinguistic distinction, invoking reference to articulatory processes.

The present data, along with those of other pluck and bow studies, suggest that the importance of the linguistic-nonlinguistic dimension may have been overrated and that the role of acoustic factors may have been underrated. Thus, rise time seems to be perceived in many of the same ways, regardless of whether it characterizes a speech sound, as in /ʃa/ and /tʃa/, or a nonspeech sound, as in plucks and bows (see Cutting & Rosner, 1974, for experiments that make this comparison directly). Similarly, Miller, Pastore, Wier, Kelly, and Dooling (in press) found that noise-buzz sequences with varying relative onset times were also perceived categorically. The stimuli varied in a manner analogous to the voice-onset time continuum; thus there appears to be a comparable mode of perception for such stimuli regardless of whether they are speech or nonspeech.

The results of research with phonemes and with plucks and bows are analogous, but they are not identical. The lack of significant laterality data is one notable exception. Therefore, the linguistic-nonlinguistic distinction, which has considerable heuristic value in much research, cannot be blankly discarded. However, in the future, as acoustic factors that underlie speech sounds are understood in greater detail, less weight may be given to "special" modes of processing accorded to speech (see, e.g., Kuhn, 1975).

It may well be that nonacoustic factors do determine certain perceptual processes, but that the linguistic-nonlinguistic dimension is not the most accurate way of describing these factors. Speech, after all, is the most important hierarchically coded system of sound, but it is not the only one. Consider rise time: By the definition of the term speech established above, it is clear that plucks and bows are not phonetic, since they cannot be articulated by the human vocal tract. However, their status with respect to the second part of the definition, that the sounds can be recoded into higher order linguistic units, is less clear. Certainly, plucks and bows are not part of spoken language; but they do comprise lower level components in the "language" of music, which, like human spoken language, can be divided into hierarchical levels of organization (e.g., pitch, timbre, and harmony; see Nattiez, 1975, for a more complete discussion of this problem). Perhaps the type of processing results from the interaction of the acoustic nature of a sound and its codability within a hierarchically organized system of sound, whether linguistic or not.

In conclusion, we do not question that levels of processing separate certain linguistic and nonlinguistic dimensions of the same stimuli. We suggest, rather, that the crux of the auditory-phonetic distinction is, as Fry suggested, a "physical-psychological transformation." The nature of this transformation probably cannot be accounted for solely in terms of the linguistic-nonlinguistic distinction. Instead it may be based on acoustic properties alone, on the coding of sounds within a hierarchically organized system, or on the interaction of acoustic properties with such a system.

REFERENCES

Biederman, I., & Checkosky, S. F. Processing redundant information. Journal of Experimental Psychology, 1970, 83, 486-490.

Chaney, R. B., & Webster, J. C. Information in certain multidimensional sounds. Journal of the Acoustical Society of America, 1966, 40, 447-455.

Chomsky, N., & Halle, M. The sound pattern of English. New York: Harper & Row, 1968.

Cooper, F. S., & Mattingly, I. G. Computer-controlled PCM system for investigation of dichotic speech perception. Journal of the Acoustical Society of America, 1969, 46, 115. (Abstract)

Cooper, W. E. Selective adaptation to speech. In F. Restle, R. M. Shiffrin, N. J. Castellan, H. Lindman, & D. B. Pisoni (Eds.), Cognitive theory. Hillsdale, N. J.: Erlbaum, 1975.

Curry, F. K. W. A comparison of left-handed and right-handed subjects on verbal and non-verbal dichotic listening tasks. Cortex, 1967, 3, 343-352.

Cutting, J. E. The magical number two and the natural categories of speech and music. In N. S. Sutherland (Ed.), Tutorial essays in psychology. Hillsdale, N. J.: Erlbaum, in press.

Cutting, J. E., & Rosner, B. S. Categories and boundaries in speech and music. Perception & Psychophysics, 1974, 16, 564-570.

Cutting, J. E., Rosner, B. S., & Foard, C. F. Perceptual categories for musiclike sounds: Implications for theories of speech perception. Quarterly Journal of Experimental Psychology, in press.

Day, R. S., & Wood, C. C. Interactions between linguistic and nonlinguistic processing. Journal of the Acoustical Society of America, 1972, 51, 79. (Abstract)

Eimas, P. D., Cooper, W. E., & Corbit, J. D. Some properties of linguistic feature detectors. Perception & Psychophysics, 1973, 13, 247-252.

Eimas, P. D., & Corbit, J. D. Selective adaptation of linguistic feature detectors. Cognitive Psychology, 1973, 4, 99-109.

Fant, G. Auditory patterns of speech. In W. Wathen-Dunn (Ed.), Models for the perception of speech and visual form. Cambridge, Mass.: M.I.T. Press, 1967.

Felfoldy, G. L., & Garner, W. R. The effects on speeded classification of implicit and explicit instructions regarding redundant dimensions. Perception & Psychophysics, 1971, 9, 289-292.

Fry, D. B. Perception and recognition in speech. In M. Halle, H. G. Lunt, & C. H. van Schoonveld (Eds.), For Roman Jakobson. The Hague: Mouton, 1956.

Garner, W. R. The processing of information and structure. Potomac, Md.: Erlbaum, 1974.

Garner, W. R., & Felfoldy, G. L. Integrality of stimulus dimensions in various types of information processing. Cognitive Psychology, 1970, 1, 225-241.

Garner, W. R., Hake, H. W., & Eriksen, C. W. Operationism and the concept of perception. Psychological Review, 1956, 63, 149-159.

Jakobson, R., Fant, C. G. M., & Halle, M. Preliminaries to speech analysis. Cambridge, Mass.: M.I.T. Press, 1963.

Kimura, D. Cerebral dominance and the perception of verbal stimuli. Canadian Journal of Psychology, 1961, 15, 166-171.

Kimura, D. Left-right differences in the perception of melodies. Quarterly Journal of Experimental Psychology, 1964, 16, 355-358.

Kuhn, G. M. On the front cavity resonance and its possible role in speech perception. Journal of the Acoustical Society of America, 1975, 58, 428-433.

Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. Perception of the speech code. Psychological Review, 1967, 74, 431-461.

Liberman, A. M., Delattre, P. C., Cooper, F. S., & Gerstman, L. J. The role of consonant-vowel transitions in the perception of the stop and nasal consonants. Psychological Monographs, 1954, 68, 1-13.

Liberman, A. M., Harris, K. S., Hoffman, H. S., & Griffith, B. C. The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology, 1957, 54, 358-368.

Lisker, L., & Abramson, A. S. A cross-language study of voicing in initial stops: Acoustical measurements. Word, 1964, 20, 384-422.

Miller, G. A. The magical number seven plus or minus two, or, some limits on our capacity for processing information. Psychological Review, 1956, 63, 81-97.

Miller, J. D., Pastore, R. E., Wier, C. C., Kelly, W. J., & Dooling, R. J. Discrimination and labeling of noise-buzz sequences with varying noise-lead times: An example of categorical perception. Journal of the Acoustical Society of America, in press.

Morton, J. The use of correlated stimulus information in card sorting. Perception & Psychophysics, 1969, 5, 374-376.

Nattiez, J. J. Sémiologie musicale: L'état de la question. Acta Musicologica, 1975, 42, 153-171.

Nickerson, R. S. Binary-classification reaction time: A review of some studies of human information-processing capabilities. Psychonomic Monograph Supplements, 1971, 4 (17, Whole No. 65).

Norman, D. A., & Bobrow, D. G. On data-limited and resource-limited processes. Cognitive Psychology, 1975, 7, 44-64.

Posner, M. I. Information reduction in the analysis of sequential tasks. Psychological Review, 1964, 71, 491-504.

Shankweiler, D. P., & Studdert-Kennedy, M. Identification of consonants and vowels presented to the left and right ears. Quarterly Journal of Experimental Psychology, 1967, 19, 59-63.

Stevens, K. N., & Halle, M. Remarks on analysis by synthesis and distinctive features. In W. Wathen-Dunn (Ed.), Models for the perception of speech and visual form. Cambridge, Mass.: M.I.T. Press, 1967.

Stevens, K. N., & House, A. S. Speech perception. In J. V. Tobias (Ed.), Foundations of modern auditory theory (Vol. 2). New York: Academic Press, 1972.

Studdert-Kennedy, M. Speech perception. In N. J. Lass (Ed.), Contemporary issues in experimental phonetics. New York: Academic Press, in press.

Studdert-Kennedy, M., Shankweiler, D. P., & Pisoni, D. B. Auditory and phonetic processes in speech perception: Evidence from a dichotic study. Cognitive Psychology, 1972, 3, 455-466.

Townsend, J. T. A note on the identifiability of parallel and serial processes. Perception & Psychophysics, 1971, 10, 161-163.

Turvey, M. On peripheral and central processes in vision: Inferences from an information-processing analysis of masking with patterned stimuli. Psychological Review, 1973, 80, 1-52.

Wood, C. C. Parallel processing of auditory and phonetic information in speech perception. Perception & Psychophysics, 1974, 15, 501-508.

Wood, C. C. Auditory and phonetic levels of processing in speech perception: Neurophysiological and information-processing analyses. Journal of Experimental Psychology: Human Perception and Performance, 1975, 1, 3-20. (a)

Wood, C. C. A normative model for redundancy gain in speech perception. In F. Restle, R. M. Shiffrin, N. J. Castellan, H. Lindman, & D. B. Pisoni (Eds.), Cognitive theory. Hillsdale, N. J.: Erlbaum, 1975. (b)

Wood, C. C., & Day, R. S. Failure of selective attention to phonetic segments in consonant-vowel syllables. Perception & Psychophysics, 1975, 17, 346-350.

(Received June 2, 1975)