Reproduced sound in rooms: scaling the experience between large and small rooms

Theo Stojanov Dec. 8th, 2006

The focus of this presentation is sound reproduction in small rooms, and more specifically the

reproduction of large room acoustics in a small space. The goal will be to identify a number of issues

that relate to our understanding of large room acoustics, and how these translate to the playback

environment.

Over a century after the invention of the microphone1, sound recording has become the principal means

of enjoying music. The listening experience has moved away from live performances with real

instruments to listening rooms with loudspeakers, often intended to play back not only music, but also

film sound, and potentially be a part of immersive communication installations. Is the experience so

close to the original as to cause a general shift of interest from live to recorded music? While this is a

topic better addressed by sociologists, great progress continues to be made by audio researchers aiming

to diminish the difference between live and recorded sound, to immerse, to blur the line between reality

and illusion.

In a typical listening environment there will be significant interaction between the monitoring system

and the room. Subjective evaluations improve the understanding of that interaction, which may prove

necessary in achieving a better blend between recorded and playback acoustics; to reconcile the illusion

of a great symphonic performance in a concert hall to the reality that effectively the performance now

takes place in a typical listening room. How will listeners qualify the contribution of the monitoring

environment on the original program material? Where do recorded acoustics begin to break down under

1. The invention of the microphone in the late 1800s allowed acoustic sound power to be transduced into electrical voltage and then converted back into sound waves via a loudspeaker; it first served a purpose in early telecommunications.

the influence of the room and where are they helped by it?

There is little argument that even the finest present-day recordings auditioned in state-of-the-art

listening rooms are only a reminder of live performances in real halls. Sound captured in a multitude

of venues through a variety of techniques, equipment, and post-production treatment, reappears at the

loudspeaker as an interpretation of its source by the collective effort of everyone involved in the

recording: it is therefore merely a subjective impression. Yet there are a number of outstanding

recordings, indicating that music and our perception of it endures through the aleatory conditions of

recording productions and communicates successfully with the listener.

There is no recipe for success, but part of what makes such recordings special is an Ideal towards

which recording engineers should strive; the closer it is approached, the better the chance that the

recording will be well received. In a paper outlining the current state of affairs in research on listening

environments Floyd Toole lists a number of important factors: timbral accuracy, impression of

direction, distance, and spatial information. These should be delivered with minimal contribution

caused by the listening room; loudspeakers and room should be neutral conveyors to the artistic

experience [1].

Among the many types of recorded material, Classical Music requires the greatest amount of

transparency upon playback. Any recording will ultimately be heard in a much smaller room than the

one where the original performance took place; therefore the recording engineer is required to

successfully capture an impression of the space for stereo or multichannel reproduction. To this end,

some familiarity with concert hall acoustics is essential.

Reflections and diffuse field

Acoustical aesthetics and listening conditions for classical music were established long before the birth

of sound recording. Concert halls should ensure a pleasant musical experience to all members of a large

audience. In such spaces listeners beyond the front couple of rows are in a predominantly reverberant

sound field and experience mostly reflections.

The acoustic architect strives to minimize absorption in order to strengthen the energy from

unamplified instruments and voices via reflections, because a reflective sound field ensures the

distribution of that energy to all seats in the house. Hence the fundamental theory of concert hall

acoustics is that the sound field throughout a reverberant space should be homogeneous (the same

everywhere in space), and isotropic (with sound energy arriving at every point equally from all

directions) [1]. Such a field is referred to as diffuse.

A perfectly diffuse field is a theoretical ideal, and cannot be achieved due to sound absorption at the

boundaries, by the air, by the audience, etc. But since it has a positive effect on music, the architect's

objective is to generate as much of it as is tastefully acceptable. The measure of the length of time

during which a diffuse field contributes to the sound is Reverberation Time2 (RT), and is perhaps the

most important measure of performance spaces.

Fig. 1: Direct sound, critical distance, and absorption. (After [1]).

This is a look at what happens in reverberant vs. free field environments. Firstly, there is sound in free field (solid line descending diagonally), whose intensity falls to one quarter (a 6 dB drop) each time the distance from the source doubles (inverse square law), because the energy spreads outward and none of it is returned by reflections. In large environments such as auditoriums or concert halls, that same loss has a much reduced effect due to reverberation, which conserves and prolongs sound energy (dashed curves).
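The behaviour of the solid and dashed curves in Fig. 1 can be sketched numerically. The following is a minimal illustration, not taken from [1]: it assumes the standard statistical-acoustics expression for the level in a room (direct term plus a reverberant term governed by a room constant R), and the source power level and room constant used here are hypothetical values chosen only to make the trend visible.

```python
import math


def free_field_level(lw_db, r, q=1.0):
    """Direct-sound level in dB at distance r (metres) from a source.

    lw_db is the source sound power level; q is the directivity factor
    (1.0 for an omnidirectional source).
    """
    return lw_db + 10 * math.log10(q / (4 * math.pi * r ** 2))


def room_level(lw_db, r, room_constant, q=1.0):
    """Direct plus reverberant level in dB; room_constant R is in square metres."""
    return lw_db + 10 * math.log10(q / (4 * math.pi * r ** 2) + 4 / room_constant)


def critical_distance(room_constant, q=1.0):
    """Distance (metres) at which direct and reverberant energy are equal."""
    return math.sqrt(q * room_constant / (16 * math.pi))


LW = 100.0   # hypothetical source power level, dB
R = 500.0    # hypothetical room constant, m^2 (more absorption gives a larger R)

for r in (1, 2, 4, 8, 16, 32):
    print(f"{r:>3} m   free field {free_field_level(LW, r):6.1f} dB"
          f"   room {room_level(LW, r, R):6.1f} dB")
print(f"critical distance: {critical_distance(R):.2f} m")
```

In the free-field column the level keeps dropping by about 6 dB per doubling of distance, while the room column flattens out beyond the critical distance, which is the behaviour the dashed curves describe.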

The ability to recreate the impression of being in a real acoustical space is the most significant argument

for multichannel3 audio over stereo. The engineer now has greater control over the sound canvas,

listener envelopment (LEV), and apparent source width (ASW)4 [1]. Listeners seem to appreciate a

2. Acousticians refer to RT60, the time it takes for the sound level to decay by 60 dB after the source stops, by which point the diffuse field has typically fallen below the listener's perception.

3. Throughout the paper the term multichannel will refer to 5 playback channels or more, although in the strictest sense “multi” implies any count greater than one.

4. Apparent source width (ASW) is a psychoacoustic term relating the visual and the audible impression of a sound source. For example, in a live hall an orchestra will sound larger than the visual spread of the performers. Lateral reflections in the reproduction space will have an effect on ASW.

soundstage that extends beyond the physical arrangement of the loudspeakers. In classical recording

practice this immersive impression is created by capturing important binaural cues, or the effect can be

simulated by artificial means.

A sound in free field forces the observer to listen along a single axis: that of the direct sound. A sound

in a room on the other hand puts the HRTF to work, helping the observer not only to localize the sound

source, but to better perceive the subtle resonances of that source that give it its distinct timbre, which

results from spatial averaging of many reflections arriving at the ears from many angles [1].5

To summarize, the presence of reflections is a desirable quality. Humans enjoy music in enclosed spaces

rather than outdoors, and prefer live acoustics to the lack of sound identity in anechoic spaces6. Hence, the

binaural information imparted by reflections is of utmost importance. Just what reflections are preferred (at

what levels, and from which direction) will be seen shortly.

From Large towards Small

From the discussion so far, it can be seen that the one predominant acoustic phenomenon that dictates a

sense of space is the reflection. In large spaces we speak of a collection of reflections and their

duration; what happens as spaces get smaller?

A space is geometrically characterized by its volume and its surface area. As spaces get smaller their

volume shrinks more rapidly than their surface area. For a room with dimensions w = 5.3 m, l = 6.3 m, h = 3.7 m vs. a hall ten times as large in each dimension we get:

Area = 2ab + 2bc + 2ac, where a, b, and c are the lengths of the three sides
Volume = w * l * h

room: A = 152.62 m², V = 123.543 m³
hall: A = 15 262 m², V = 123 543 m³

5. Outdoors, where there are few binaural cues to aid localization, the visual sense takes over. An example often given in film sound is the tiger scene from Apocalypse Now, where Academy Award-winning mixer Walter Murch effectively uses the lack of ambient sound from the front speakers to focus the viewer's visual attention on the screen.

6. A word often used to qualify absorptive or anechoic environments is “dead,” accurately reflecting in language the emotional association invoked.

The volume of the hall has increased 1000 times while the surface area has increased only 100 times. This means that

a large space will have a relatively small surface area compared to a small one. Since virtually all

sound absorption occurs at the boundaries, sound absorbing material placed on boundary surfaces is

much more effective in a small room than a large one. The mean free path7 of a sound is then ten to a

hundred times greater for large rooms. Or to put it another way, a panel of sound absorbing material

will be encountered by a sound wave ten to a hundred times as often [2] in a small space. The smaller

the room, the less reverberant it becomes (more absorption going on). In small spaces such as control

or listening rooms the absorption is already such that an observer will detect no reverberant sound field

at all (although there will still be early reflections).8
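The scaling argument above can be checked with a few lines of arithmetic. This is a minimal sketch using the area and volume formulas given in the text and the room dimensions from the example; the mean free path expression 4V/S is the usual statistical-acoustics estimate and is an assumption here, since the text defines the term but does not state a formula.

```python
def box_geometry(w, l, h):
    """Return (surface area, volume, mean free path) of a rectangular room.

    Dimensions are in metres. The mean free path uses the statistical
    estimate 4V/S, which is an assumption not stated in the text.
    """
    area = 2 * (w * l + l * h + w * h)
    volume = w * l * h
    mean_free_path = 4 * volume / area
    return area, volume, mean_free_path


room = box_geometry(5.3, 6.3, 3.7)      # listening room from the example
hall = box_geometry(53.0, 63.0, 37.0)   # every dimension ten times larger

for name, (area, volume, mfp) in (("room", room), ("hall", hall)):
    print(f"{name}: A = {area:.2f} m^2, V = {volume:.3f} m^3, "
          f"mean free path = {mfp:.2f} m")

print(f"area ratio   = {hall[0] / room[0]:.0f}")
print(f"volume ratio = {hall[1] / room[1]:.0f}")
print(f"mean free path ratio = {hall[2] / room[2]:.0f}")
```

Because area grows with the square of the linear dimensions and volume with the cube, the ratio of volume to surface area (and with it the mean free path) grows in proportion to the linear scale.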

Reverberation Time (RT)

In Sabine's RT formula for large spaces, the audience is always treated as a layer. This is because the

ceiling height is great compared to width and length, and it would be statistically justifiable to assign an

absorption coefficient to the audience plus floor. Sabine's mathematics however are based on statistical

analysis9, and small spaces can not generate enough sound energy to justify a statistical approach: a

group of listeners in a control room for example can not be treated as a layer because of the sheer

volume of space they occupy.
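For reference, Sabine's formula in its common metric form is RT60 = 0.161 V / A, where V is the room volume and A is the total absorption (each surface area multiplied by its absorption coefficient). The sketch below applies it to the example room from the previous section. The absorption coefficients are hypothetical placeholders, not measured values, and the result should be read with the caveat made in the text: the statistical assumptions behind the formula are weak in a room this small.

```python
def sabine_rt60(volume_m3, surfaces):
    """Sabine reverberation time in seconds (metric form).

    surfaces is an iterable of (area_m2, absorption_coefficient) pairs.
    """
    total_absorption = sum(area * alpha for area, alpha in surfaces)  # metric sabins
    return 0.161 * volume_m3 / total_absorption


# Example room: w = 5.3 m, l = 6.3 m, h = 3.7 m.
w, l, h = 5.3, 6.3, 3.7
surfaces = [
    (w * l, 0.30),                 # carpeted floor (hypothetical coefficient)
    (w * l, 0.10),                 # ceiling (hypothetical coefficient)
    (2 * (l * h + w * h), 0.05),   # four walls (hypothetical coefficient)
]

print(f"Sabine RT60 estimate: {sabine_rt60(w * l * h, surfaces):.2f} s")
```

Doubling the absorption on every surface halves the predicted RT, which is why boundary treatment has such a strong effect when the surface area is large relative to the volume.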

A small room on the other hand is characterized by a lower ceiling height relative to length and width,

significant areas of absorption on one or more of the boundary surfaces (carpets, drapes, other acoustic

treatment), large absorbing and/or scattering objects (mixing desk, furniture). Though it is possible to

calculate RT for small rooms, it will not offer very useful information. Averaged measurements across

a number of listening rooms can give better information, as shown by Devantier in [3].

7. Mean free path: the average distance that a sound wave travels before it strikes a surface.

8. It can be proven mathematically that an enclosure of any size will generate a sound field akin to what is known as diffuse or reverberant in large rooms (provided wavelengths are smaller than the fundamental resonances of the space), but for this discussion of acoustics we will consider only variables that are perceptually significant.

9. Benade's document From Instrument to Ear in a Room: Direct or via Recording sheds important light on the topic discussed in this paper, including a detailed mathematical analysis of Sabine's formula. Discussion of it is however not included in the present version of this paper. Another important publication to be considered is Small Room Acoustics by Doug Jones, found in Ch. 5 of Handbook for Sound Engineers, Glen M. Ballou, ed.

Reverberation time is a property of the room alone, and is measured using an omni-directional sound

source in order to reach all of the boundaries equally and thus set the stage for statistical analysis.

Furthermore, the mathematical formulas for RT assume that the boundaries consist of reflection and

absorption, and the central volume of the space is empty (to allow for the build-up of the diffuse field).

In reality however, the directivity of loudspeakers in control and listening rooms produces reflection

patterns and decays that traditional RT math does not account for.

RT and frequency dependence

In large spaces RT can be used as a measure of the suitability of the venue for music, commonly

focusing on mid-frequency reverberation time and variations of RT with frequency, because the

architect strives to avoid acoustical treatment that would alter the spectral balance of the diffuse

reverberant field.

But then, what is heard in small rooms is dictated primarily by the loudspeakers, and then by some

early reflections. Anything else is far below detection, and traditional RT calculations will not reveal

any applicable information. Any suitability of the space for music is dependent only partially on the

room itself, and largely on the playback system.

So what are small rooms?

Acoustically “small” means:

• significant absorption at room boundaries

• sound absorbing and scattering objects

• likely to have low ceilings (compared to auditoriums)

• concepts developed for large spaces apply only partially, or not at all (RT, critical distance)

Schroeder frequency and room modes

The Schroeder frequency is the frequency above which the spacing between room modes becomes so

small (that is, modal density becomes so high) that they are no longer seen as individual resonant peaks. Modal density increases

with frequency. Although they play an important part in room acoustics, room modes are not directly

responsible for the recreation of space, and will be kept outside the scope of this paper.
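Although modes are outside the scope of the paper, the two statements above (a transition frequency, and modal density rising with frequency) are easy to illustrate. The sketch below assumes the commonly quoted approximation f = 2000 * sqrt(RT60 / V) for the Schroeder frequency and the standard rigid-wall formula for the modes of a rectangular room; neither formula appears in the original text. The room dimensions are those of the earlier example, while the RT60 value is a hypothetical figure.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly 20 degrees C)


def schroeder_frequency(rt60_s, volume_m3):
    """Approximate Schroeder (transition) frequency in Hz."""
    return 2000.0 * math.sqrt(rt60_s / volume_m3)


def mode_frequency(nx, ny, nz, lx, ly, lz, c=SPEED_OF_SOUND):
    """Frequency in Hz of mode (nx, ny, nz) in a rigid-walled rectangular room."""
    return (c / 2.0) * math.sqrt((nx / lx) ** 2 + (ny / ly) ** 2 + (nz / lz) ** 2)


def modes_below(f_max, lx, ly, lz, c=SPEED_OF_SOUND):
    """Sorted frequencies of all modes below f_max (Hz)."""
    def limit(length):
        # Safe upper bound on the mode index along one axis.
        return int(2 * length * f_max / c) + 1

    found = []
    for nx in range(limit(lx) + 1):
        for ny in range(limit(ly) + 1):
            for nz in range(limit(lz) + 1):
                if nx == ny == nz == 0:
                    continue
                f = mode_frequency(nx, ny, nz, lx, ly, lz, c)
                if f < f_max:
                    found.append(f)
    return sorted(found)


LX, LY, LZ = 6.3, 5.3, 3.7   # example room, metres
RT60 = 0.4                   # hypothetical reverberation time, seconds

print(f"Schroeder frequency: {schroeder_frequency(RT60, LX * LY * LZ):.0f} Hz")
modes = modes_below(200.0, LX, LY, LZ)
low = sum(1 for f in modes if f < 100.0)
print(f"modes below 100 Hz: {low}, modes from 100 to 200 Hz: {len(modes) - low}")
```

The second band contains several times as many modes as the first, which is the rise in modal density that eventually makes individual peaks indistinguishable.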

Sound fields 10

Fig. 2: The frequency response of a loudspeaker in an anechoic room (solid line) vs. that of the same loudspeaker in a listening room (dotted line). Source: [3].

This figure shows the frequency spectrum of an enclosed space divided into two sound fields. The

point of separation (indicated by the arrow) is known as the Schroeder frequency for large spaces. It is

less clearly determined for small rooms [1] but exists nonetheless. In this particular graph we see the

anechoic response of a loudspeaker (solid line) superimposed over the response of that same speaker in

a real room.

The upper portion of the curve is of concern to spatial imagery reproduction. In small rooms, above the

transition frequency the sound field is dominated by strong individual early reflections (much stronger

than in large spaces where reflections are a random and diffuse whole).11

10. Not to be confused with Ambisonics and the SoundField microphone, the term sound field is used in this paper to indicate an area of the frequency response of a room.

11. First reflections will be stronger in small rooms because the listener is much closer to a reflecting surface, and will thus experience a greater intensity of the reflection.

Evaluating reflections

Arguments have been made by control-room designers that early reflections caused by the monitoring

environment must be eliminated to allow the purity of the recording to come out. All the same,

experiments done by Olive and Toole (among others) show that lateral reflections generally have

neutral to beneficial effects on program material, and where the response of a room is concerned other

issues need to be addressed more urgently than reflections (such as room modes, for example).

Reflection               Average delay   Delta     Attenuation
Direct sound             9.6 ms
Vertical reflections
  Floor bounce           11.3 ms         1.8 ms    -1.5 dB
  Ceiling bounce         14.5 ms         4.9 ms    -3.6 dB
Horizontal reflections
  Smallest angle         18.9 ms         9.3 ms    -5.7 dB
  Second angle           22.2 ms         12.6 ms   -6.6 dB
  Third angle            18.7 ms         9.1 ms    -5.5 dB
  Largest angle          14 ms           4.4 ms    -3.3 dB

In a small room the listener is much closer to reflective surfaces than in a concert hall, and will always be within reach of several strong early reflections. In [3] Devantier looks at the relative levels and delays of the most important reflections in typical domestic listening rooms. Table 1 above shows the resulting data, and fig. 3 puts it in context with other studies done on lateral reflections.

Table 1: Average time delay and attenuation of direct sound and early reflections [3]
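The relationship between the delays and attenuations in Table 1 can be approximated with a simple model. The sketch below assumes that a reflection is weaker than the direct sound only because it has travelled further (spherical spreading, with no absorption at the reflecting surface); this is an illustrative assumption, not the method used in [3]. Under it, the attenuation is 20 log10 of the ratio of direct to reflected travel time. The speed of sound is also an assumed value.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed
DIRECT_DELAY_MS = 9.6    # average direct-sound delay from Table 1

# (name, average delay in ms, tabulated delta in ms, tabulated attenuation in dB)
TABLE_1 = [
    ("floor bounce",   11.3,  1.8, -1.5),
    ("ceiling bounce", 14.5,  4.9, -3.6),
    ("smallest angle", 18.9,  9.3, -5.7),
    ("second angle",   22.2, 12.6, -6.6),
    ("third angle",    18.7,  9.1, -5.5),
    ("largest angle",  14.0,  4.4, -3.3),
]


def path_length_m(delay_ms, c=SPEED_OF_SOUND):
    """Distance travelled by sound in delay_ms milliseconds."""
    return c * delay_ms / 1000.0


def spreading_loss_db(direct_ms, reflected_ms):
    """Level of a reflection relative to the direct sound, spreading loss only."""
    return 20.0 * math.log10(direct_ms / reflected_ms)


print(f"direct path: {path_length_m(DIRECT_DELAY_MS):.2f} m")
for name, delay, delta, attenuation in TABLE_1:
    estimate = spreading_loss_db(DIRECT_DELAY_MS, delay)
    print(f"{name:<15} path {path_length_m(delay):5.2f} m   "
          f"delta {delay - DIRECT_DELAY_MS:4.1f} ms   "
          f"estimated {estimate:5.1f} dB   tabulated {attenuation:5.1f} dB")
```

The spreading-only estimate stays within about 1 dB of every tabulated value, which suggests that in these rooms path length accounts for most of the level difference between the direct sound and the first reflections.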

Figure 3 shows detection thresholds for single lateral reflections. The point of this diagram is to show how single reflections affect the original sound. The lowest line (connected solid points) is where reflections are barely audible. As they increase in level, they begin to affect the ASW of the image, until at the very top (solid curve) we perceive not a reflection but a distinct second image (echo) of the original. Devantier's data in that context clearly indicates that the six primary reflections in small rooms have a very perceptible effect.

Fig. 3: The first six reflections in a typical room superimposed against other related studies on the perceptual effects of reflections. Source: [1]

There is a point up to which the acoustics of small rooms are benign to the signal, but determining the

right balance is still a topic of investigation. Understandably, a tiled shower room is not the ideal

listening environment; but then in the range between that and an anechoic chamber, what qualifies

desirable listening room acoustics? Is it a general question of absorption coefficients, or the

presence/absence/strength/direction of arrival of specific reflections and if so, which ones?

In reference [3] it has been observed that reflections originating directly from the front or rear of the

listener (off of a front / back wall, floor, or ceiling) are least flattering to a musical program or speech.

By contrast, the ear-brain mechanism appreciates lateral reflections from the sides which introduce

what has been qualified as a more pleasant spreading of the musical image, or improved intelligibility

as related to speech.

Fig. 4: Inter-Aural Cross Correlation and listener preference. Source: [1]

This is better illustrated by figure 4 above: it shows an experiment in which a listener's preference

increases as Inter-Aural Cross Correlation decreases: the more differences there are between the ears,

the more pleasant the listening experience.
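To make the quantity on the horizontal axis concrete, the sketch below computes an inter-aural cross-correlation coefficient for a pair of ear signals. It assumes the commonly used definition (the maximum of the normalized cross-correlation between left and right signals over lags of up to 1 ms); the source does not say how the values in Fig. 4 were measured, so this is an illustration of the concept rather than a reproduction of that experiment. The test signals are synthetic noise, not recordings.

```python
import math
import random


def iacc(left, right, sample_rate, max_lag_ms=1.0):
    """Maximum normalized cross-correlation of two equal-length signals.

    The search covers lags from -max_lag_ms to +max_lag_ms. A value near 1
    means the two ears receive nearly the same signal; a value near 0 means
    they receive very different signals.
    """
    if len(left) != len(right):
        raise ValueError("signals must have the same length")
    n = len(left)
    max_lag = int(round(sample_rate * max_lag_ms / 1000.0))
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / norm)
    return best


rng = random.Random(0)
RATE = 8000   # samples per second, hypothetical
N = 2000      # number of samples in each synthetic signal

source = [rng.gauss(0.0, 1.0) for _ in range(N)]
lateral = [rng.gauss(0.0, 1.0) for _ in range(N)]   # stands in for uncorrelated lateral energy

same_at_both_ears = iacc(source, source, RATE)
mixed = iacc(source, [0.5 * s + x for s, x in zip(source, lateral)], RATE)
unrelated = iacc(source, lateral, RATE)
print(f"identical: {same_at_both_ears:.2f}  partly shared: {mixed:.2f}  "
      f"unrelated: {unrelated:.2f}")
```

Adding energy that differs between the two ears lowers the coefficient, which is the direction in which listener preference increases in Fig. 4.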

In Conclusion

When discussing what spatial aspects of recorded sound combined with those of the playback

environment are desirable, it is important to consider the simple process of listening. In film, we speak

of suspension of disbelief: the willingness of viewers to overlook the limitations of a medium, so that

these do not interfere with the illusion; to provisionally suspend their judgment in exchange for the

promise of entertainment. Referring back to the idea mentioned at the beginning of the presentation,

that in spite of the greatest care taken to preserve the integrity of the original signal we are still

experiencing an impression, we see that the listener voluntarily agrees to accept the fact that what is

heard is a recording, and makes an effort to see past the reproduction system to the time and place

where the living performance took place.

By merely walking into a listening environment a visual inspection of the surroundings tells us

immediately that we in fact are not in a concert hall: an initial bias that what we are about to hear is an

illusion. Another indication is the acoustical property of the space: it will not sound like a concert hall.

Still, when we listen to a symphonic recording we experience the acoustics of the actual hall, and so

well is space reproduced that we can in fact recognize certain qualities of a particular venue.12

As engineers strive to improve sonic immersion, it is important to address the dichotomy between the

acoustic worlds of the performance and playback spaces, and one idea to be considered is that the

listening environment needs to have some influence over the playback material in order to improve the

fusion between the two; thus, it may be easier to accept the illusion.13 Balancing that blend of acoustical

identities is a topic of current research, for the preferences of listeners need to be taken into account not

only where treatment of the monitoring environments is concerned but also to evaluate the successful

imparting of spatial information from various microphone techniques. The roles of different channels

will thus also become more clear, contributing to the discussion on how much independence is required

for each. While it is only an account of some aspects of reproduced sound in rooms, this presentation

aimed to identify some directions of research based on what has been studied so far, and other related

threads of information have been mentioned throughout the text. In the words of BBC engineer James

12. Olive and Toole's JAES paper The Detection of Reflections in Typical Rooms contains important information about listener discrimination between spectra that are changing (music material) and spectra that are flat (recording space, loudspeaker, listening environment). This is directly relevant to the present discussion, but will not be examined in the present version of this paper.

13. George Massenburg's room at Blackbird Studios in Nashville is intended to be “a room without walls” but with an acoustic identity – a perfectly diffuse small room without any early reflections. This is one way of integrating recorded sound with playback acoustics, but is mentioned only as an experimental example because the spaces discussed in this presentation are of a more traditional design.

Moir [1], “if a room requires extensive treatment for stereophonic listening there is something wrong

with the stereophonic equipment or the recording. The better the stereophonic reproduction system, the

less trouble we have with room acoustics.” It remains to be seen if this belief carries over to

multichannel sound.

REFERENCES

[1] F. E. Toole, Loudspeakers and Rooms for Sound Reproduction – A Scientific Review, Journal of the Audio Engineering Society, Vol. 54, No. 6 (2006 June)

[2] E. R. Geddes, Premium Home Theater: Design and Construction (GedLee LLC, Novi, MI, 2002), gedlee.com

[3] A. Devantier, Characterizing the Amplitude Response of Loudspeaker Systems, presented at the 113th Convention of the Audio Engineering Society, paper 5638 (2002)
