5.1 Surround Terakoya Lab / サラウンド寺子屋塾: December 2004

2004 ABU BEST ARTICLE AWARD PAPER in ABU Technical Review

Mick M. Sawaguchi
Director of Production Operations NHK(2004)
Fellow M-AES/IBS M-C.A.S
Mick-sawa@u01.gate01.com

Abstract
Surround sound mixing techniques are being developed by audio engineers in the broadcast world in various countries. Surround sound is a particular advantage for digital TV and radio broadcasting, which can carry a multichannel audio stream such as Dolby Digital, DTS or AAC. This paper describes the fundamentals and the latest program production techniques.

Part-1 discusses the fundamentals of surround set-up.
Part-2 describes the latest program production techniques.

1. Design of control room acoustics and of the monitoring environment

Why start with the concept of acoustic design for surround sound mixing?
Because the surround room is still quite new compared with the 2-ch stereo room, and at the moment only a limited number of acoustic designers have enough knowledge of surround room treatment. For the optimal mixing of surround sound and its precise reproduction in the home, more appropriate guidelines for the design of the control room environment, and its acoustics in particular, need to be drawn up. The HDTV Multichannel Sound Study Group (HDTV-MSSG), which operated from 1992 to 1995, proposed guideline specifications. These were intended for sound mixing work in small-capacity studios, an environment quite different from that for motion picture sound production.
We started with rather vague ideas and have tried various design approaches, leading to today's primary design methodology, as follows:

(1) Each channel in a multichannel system must carry an equal share of the work. This means that the control room needs a more homogeneous sound field.
To achieve such homogeneity, NHK has designed the control room to absorb sound in its front section and to diffuse it evenly at the rear ceiling and sidewalls. We adopted this idea to secure a broad and homogeneous sound field over the entire band in the optimal listening area. It differs from the conventional method of combining different acoustic conditions, such as live end and dead end, which has been adopted in the design of conventional stereo mixing rooms. The control room's interior is an irregular heptagon that is left-right symmetrical.

(2) The front section of the floor must be made of sound absorbing materials to prevent the primary acoustic reflections of L-C-R speakers from the floor. In the conventional construction, monitor speakers are built into high-rigidity walls, but this construction tends to generate rear reflections in multichannel surround sound production and disturb the frequency characteristics near the mixing area. Therefore, the present front surface has a structure that transmits sound.

(3) Some kind of diffusers must be used for homogeneous diffusion of medium and high frequencies. The installation location will depend on the studio's capacity and space arrangements.
In studios built so far, the diffusers are installed on the ceiling, sidewalls and back wall.

(4) The control room needs its own low-frequency treatment, and unnecessary resonance must be suppressed, since the control room has a higher low-frequency reproduction ability than conventional 2-ch stereo mixing rooms. Sound traps are installed in the air space behind the speakers for low-frequency treatment near the front side. The walls behind the monitor speakers are made of burnt bricks to increase the articulation of low-frequency sound.

(5) Fire-resistant stone blocks made of porous heavy bricks are used to facilitate low- and medium-frequency treatment in the studio's production area.
This improves low-frequency absorption ability and eliminates unnecessary resonance with the studio's base structure.

(6) The studios have common parameters: RT = 0.2 to 0.3 sec. with a 'dead' tendency; NC = 15 or less. The front L-R speakers are installed 3.2 to 3.6 m apart in consideration of the alignment of L-C-R speakers compatible with stereophonic production. The distance to the effective acoustic center, which is at the apex of an equilateral triangle formed with the line connecting the L and R speakers as its base, is 3.4 to 3.5 m, slightly behind the mixer seat. This arrangement is intended to enlarge the listening area. When more than one pair of speakers is installed in the rear section, one pair is at 60 degrees and the other at 120 degrees. When a single pair is installed, it is set at 110 +/- 10 degrees, somewhat laterally.

(7) In practice, it is difficult to install all speakers at the same height in an actual production studio, as specified in ITU-R BS.775.

For this reason, some allowance must be made, particularly where the rear section cannot provide the same listening area as the front section. In such cases the rear speakers are installed some 30% higher than the front speakers. This configuration is a practical way to ensure a broader listening area, because there are also production staff on duty in the rear. Since the monitor balance of each channel may drift over time, each channel should have a trim function to enable fine adjustments to be made.

(8) The LFE channel delivers a great deal of low-frequency power, but the LFE speaker needs careful placement: the best position in the control room must be found in order to avoid low-frequency standing waves with their peaks and dips.

(9) All the front speakers must be of the same model, while smaller types of the same model must be used for the surround channels because of physical restrictions.
This unifies the quality of sound in all the channels.
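
The angles in item (6) can be turned into floor-plan coordinates with a little trigonometry. The sketch below is illustrative only: the function name speaker_position, the 3.4 m radius and the +/-30 degree front angles (which follow from the equilateral triangle described in item (6)) are assumptions chosen for the example, not measured studio values.

```python
import math

def speaker_position(angle_deg, radius_m):
    """Return (x, y) in metres for a speaker on a circle around the listener.

    0 degrees is straight ahead (+y); positive angles are to the right (+x).
    """
    a = math.radians(angle_deg)
    return (radius_m * math.sin(a), radius_m * math.cos(a))

# Hypothetical layout: equal radius, single surround pair at 110 degrees.
RADIUS_M = 3.4
layout = {"L": -30, "C": 0, "R": 30, "SL": -110, "SR": 110}
positions = {name: speaker_position(deg, RADIUS_M) for name, deg in layout.items()}

for name, (x, y) in positions.items():
    print(f"{name:>2}: x = {x:+.2f} m, y = {y:+.2f} m")

# With L and R at +/-30 degrees the L-R spacing equals the radius,
# which is the equilateral-triangle condition.
lr_spacing = positions["R"][0] - positions["L"][0]
```

Changing the SL/SR entries within the 100 to 120 degree range shows how far the surround pair moves for a given room radius.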

Fig-1 shows some examples of surround mixing rooms.
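
Item (8) warns about low-frequency standing waves. As a rough first check before placing an LFE speaker, the axial mode frequencies of a rectangular room can be estimated from f = n * c / (2 * L). The sketch below assumes an idealized rectangular room with rigid walls, a speed of sound of 343 m/s and hypothetical dimensions; none of these values come from the paper, and the heptagonal rooms described above would need measurement or more detailed modelling.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed, roughly 20 degrees C

def axial_modes(length_m, max_hz=120.0):
    """Axial standing-wave frequencies f = n * c / (2 * L) up to max_hz."""
    modes = []
    n = 1
    while True:
        f = n * SPEED_OF_SOUND_M_S / (2.0 * length_m)
        if f > max_hz:
            break
        modes.append(round(f, 1))
        n += 1
    return modes

# Hypothetical room dimensions in metres (not taken from the paper).
room = {"length": 7.0, "width": 5.0, "height": 3.0}
for axis, dim in room.items():
    print(axis, axial_modes(dim))
```

Frequencies that appear close together across the three axes are the ones most likely to produce strong peaks and dips, so they are a reasonable place to start listening and measuring.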

2. Tips for surround sound mixing

Here are some tips for achieving a proper surround sound mix. We discuss the following subjects and review rule-of-thumb usage examples:

1) optimal monitor level settings and practicable monitor speaker setting
2) operations of the center channel
3) application of heavy bass components (LFE).

2.1 Monitor level alignment and speaker setting

The production of broadcasting programs covers a variety of areas, such as drama, documentary, sports, music and live events, under different mixing environments and speaker sizes. From our long practice:
1 Large monitors are calibrated at 85 dB per channel with pink noise in order to make them compatible with movie sound.

2 Medium and small speakers are calibrated at 82 dB, 80 dB, 78 to 76 dB or even 70 dB per channel. The value depends on the speaker size; you should find the monitor level at which your mix reproduces with the best sound balance at home.

3 The LFE speaker level is set up as follows. Supply pink noise at the reference level, such as -20 or -18 dBFS, from your mixing console, and adjust the LFE speaker so that the measured SPL is 4 dB higher than that of a main monitor speaker.

4 Every speaker should be at an equal distance from the mixing center point. In practice, however, it is often difficult to keep the L-C-R front speakers and the SL/SR speakers at an equal distance. In this case, the priority is to keep L and R at an equal distance; delay compensation can then be applied to the rest of the speakers.
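
The delay compensation in tip 4 comes down to simple arithmetic: each nearer speaker is delayed by the time sound takes to travel the extra distance to the farthest one. The following is a minimal sketch, assuming a speed of sound of 343 m/s; the function name and the distances are hypothetical and only illustrate the calculation.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed

def alignment_delays_ms(distances_m):
    """Delay each nearer speaker so all arrivals match the farthest one."""
    farthest = max(distances_m.values())
    return {
        name: (farthest - d) / SPEED_OF_SOUND_M_S * 1000.0
        for name, d in distances_m.items()
    }

# Hypothetical distances from the mixing position, in metres.
distances = {"L": 3.4, "R": 3.4, "C": 3.1, "SL": 2.6, "SR": 2.6}
delays = alignment_delays_ms(distances)
for name, ms in delays.items():
    print(f"{name:>2}: {ms:.2f} ms")
```

In this example L and R stay at zero delay, in line with the priority given to keeping the L-R pair equidistant, while the closer C and surround speakers receive the compensating delay.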

2.2 Down mix and dynamic range control

A monitoring function compatible with both surround mixing and 2-channel stereo mixing is important for maintaining compatibility in mix-down. This function, the so-called down-mix function, must allow the 2-channel stereo balance to be checked during surround mixing. It also saves working time and studio resources.
(See Fig-2)

There are large differences in the audiovisual environment and reproduction levels between 3-2 surround mixing and 2-0 stereo mixing. To resolve these problems, a processor capable of audibility compensation and appropriate dynamic range compression will be necessary. Since a number of viewers still watch on conventional 2-ch stereo TV sets, down-mix and dynamic range control processors of this kind will be increasingly required.
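
The down-mix function can be pictured as a weighted sum of the surround channels. The sketch below uses -3 dB (about 0.7071) coefficients for the center and surround channels, as commonly quoted from ITU-R BS.775; this is an assumption for illustration, since this paper does not state coefficient values and a broadcaster's standard values may differ. Leaving the LFE channel out of the down-mix is likewise an assumption here.

```python
import math

MINUS_3_DB = 1.0 / math.sqrt(2.0)  # about 0.7071; assumed coefficient

def downmix_to_stereo(l, r, c, sl, sr, c_gain=MINUS_3_DB, s_gain=MINUS_3_DB):
    """Fold one 3-2 surround sample frame down to 2-channel stereo."""
    lo = l + c_gain * c + s_gain * sl
    ro = r + c_gain * c + s_gain * sr
    return lo, ro

# A centre-only signal lands equally in both outputs, 3 dB down.
lo, ro = downmix_to_stereo(l=0.0, r=0.0, c=1.0, sl=0.0, sr=0.0)
print(round(lo, 4), round(ro, 4))
```

Because several channels are summed into each output, the down-mixed level can exceed that of any single input, which is one reason the dynamic range control described above goes hand in hand with down-mixing.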

2.3 Representation of the center channel (See Fig-3 )

a) Positioning for the hard center
This is used to clearly distinguish a real sound image from the images of other channels or when an articulate sound image at the center position, instead of a phantom center image, is needed. It has the advantage of allowing mix-down using coefficients that approximately match theoretical values. This method helps stabilize overall positioning when used for the main vocals or a specific solo instrument as well as for narrations and monologues. Even outside the sweet spot, the balance of the sounds is largely maintained.

b) Positioning for the phantom center
As with the conventional 2-channel stereo method, this position is used to emphasize high-quality sound blending between the front L-C-R speakers or when no articulate sound image is necessary.
This is sufficient provided there is no pressing need to use the center channel or the speakers in the reproduction area are closely spaced. It should be noted, however, that a phantom center forms a diffused sound image and sounds out of balance if the screen size is 50 inches or larger or the L and R speakers are placed 2.5 m or more apart.

c) Mixture
This approach combines the foregoing two methods: the specific hard center in the center channel and a supplementary phantom center between the L and R speakers. It is useful for smoothly blending sound images across the whole front while articulately placing center components in position. For this purpose, cross talk between the L/C/R channels must be controlled using what is called the divergence function. Typical examples are as follows: for monologues, position the main sound in the hard center channel and also in the L-R channels but with the level reduced by 3 to 4 dB, or avoid the risk of excessive level by concentrating the bass and kick drum parts only in the hard center channel. The latter example requires special care in mix-down to prevent any difference in the balance between surround sounds and 2-channel stereo sounds.
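
The monologue example above (hard center plus L-R at 3 to 4 dB lower) can be expressed as a pair of gains. This is only a sketch of the idea: the function names are hypothetical, and the divergence control on a real console may distribute the signal according to a different law.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude gain."""
    return 10.0 ** (db / 20.0)

def spread_center(sample, lr_offset_db=-3.0):
    """Place a source in the hard centre and, at reduced level, in L and R."""
    side = sample * db_to_gain(lr_offset_db)
    return {"L": side, "C": sample, "R": side}

frame = spread_center(1.0, lr_offset_db=-3.0)
print({ch: round(v, 3) for ch, v in frame.items()})
```

Passing lr_offset_db=-4.0 gives the other end of the 3 to 4 dB range mentioned in the text.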

2.4 LFE control

Heavy bass components at 120 Hz or below offer a useful means of enhancing motion picture and drama sounds. In general, music performed with acoustic instruments contains few heavy bass components unless bass drums, 'cannons' and contrabasses are played in concert. Adding a slight flavor with a limited amount of heavy bass may be enough, except when it is intentionally added in bulk for some purpose. The fundamental rule is that you must first get the best balance in the main L-C-R-SL-SR speakers. Think of LFE as a flavoring of low-frequency effects.

3. Surround sound design fundamentals

3.1 Six basic patterns of surround sound design for dramas
(1) Surround ambience
This is the most fundamental surround sound design for either music or dramas. For music, it produces an environment space behind the audience so that they perceive a stronger sense of reality or atmosphere. In dramas, the environment sounds enable the audience to better perceive how the story is proceeding. The difference between drama environment sounds and environment music is that surround components used for the former are not necessarily those recorded simultaneously.
(2) Fly-over
As suggested by the name, a specific sound flows longitudinally between the front and rear sections of the studio. A sharp, snappy sound effect adds a strong impact to the scene.
(3) Whirlpool
The audience is thrown into a spiral whirlpool of sound and so feels as if the place is swinging in every direction.
(4) Proceeding sound field
Sounds that hint at what is about to happen in subsequent scenes are reproduced, rather than merely generating a sensation of reality or feeling of unity as with the method in Item (1) above. The sounds produced here must be short, have a punch and allow the audience to guess what's coming next.
(5) Sound shower from above
A shower of sound comes from above the audience. It is theoretically impossible to reproduce vertical relations using any configuration of the current horizontal 6-channel speakers. The method, however, makes use of the physical advantage of surround speakers, which are typically installed higher than the audience seats.
(6) Big, closer sound feeling
Most of the sounds come horizontally instead of from above; the main components are reproduced with the front C-channel, and supplementary components with the L-R/SR-SL channels. This method is useful for emphasizing a specific human voice in dialogue or monologues or for big sound effects representing gunshots and explosions. With this method, a sound can be made bigger than when it is reproduced from a single channel: using more than one channel delivers a higher overall level than a single channel can, while preserving the peak margin.

3.2 Three surround sound designs for music (See Fig-4)

(1) Stage layout
The main music components are positioned in the front section, while spatial information, such as reverberation of the hall and indirect acoustic components, is reproduced in the rear.
(2) Discrete layout
This layout is not intended to reproduce theatrical performances, but is suitable for the musical representation of something unrealistic by actively using more of the assigned channels. It is centered on the front section, but sounds can be laid out freely all around the audience.
(3) Omni-directional layout
The audience's front axis is not fixed, so that they can receive sound from all directions. Music artists such as Japan's Isao Tomita and Britain's Alan Parsons have created 'sound walls' by making good use of such omni-directional acoustic space.

Pending Issues
We have discussed the application of a multichannel surround sound recording method for drama and music production. Development of such recording and production methods has just started in Japan and overseas. However, the following issues remain:

1) The methodology has evolved from 2-channel stereo to 3-channel stereo and then to 3-2 surround. How many channels will eventually have to be installed for ideal spatial reproduction?
2) What effects do physical rear reverberation and indirect sound components have on psychological sound?
3) What are the rear equal loudness characteristics?
4) Does the sound field of multichannel music reproduction have a broad sweet spot?
5) What level of deviation from the appropriate arrangement should be allowed in the home music reproduction environment?
6) Is there any adequate down-mix method to secure compatibility with stereo?
7) How to ensure an appropriate dynamic range for various audibility levels
8) Development of a practical method of using the center channel for music
9) Is there any way of effectively using the sub-woofer band, called the LFE channel, for music?

Multichannel Surround Production for Broadcasting Part-2
New Trends in Multi-channel Surround Sound Production

In Part 1 of this paper, we discussed the fundamentals of surround sound production.
In this part, we describe NHK's recent approach to practical production.

1. Examples of surround sound production for documentaries

Key points in surround sound production for documentaries
Since 2000, we have worked on the production of documentaries delivered in surround sound to explore the effectiveness of surround sound in such programs. Documentaries are usually produced on location, often with a small crew. It is important, therefore, to choose sound recording gear and other equipment that is lightweight, compact and reliable.
A battery-operated multi-channel recorder is one of the key pieces of equipment for surround sound recording. Four- to 10-track multi-channel recorders that record audio onto a hard disk or an MO disk have recently emerged on the market, facilitating such location sound recording.
Each track is monitored in 2-channel stereo through headphones. In the future, pseudo-surround monitoring will be possible by encoding surround sound.

<4 channels or 5 channels?>
While 4 channels are sufficient to record surround sound ambience for documentaries, 5-channel recording is recommended when a stable sound field is needed or when recording sound for which hard center positioning should be used, such as interviews, together with surround sound ambience. This conclusion is supported by the results of the subjective listening tests we performed to compare the effectiveness of the two methods.

Examples of surround sound production for documentaries
In this section, we introduce some actual examples of our production.
Figure-5 shows our friend Mr. Florian Camerer of the Austrian Broadcasting Corporation (ORF). On his shoulder, he carries three 2-channel portable DAT recorders, with timecode synchronized between them.

Figure-6 shows an example of recording using a 6-channel PD-6 MO recorder. You can see from the picture that location sound mixers of documentary programs also need great physical strength to carry heavy equipment.

Figure-7 shows the production of surround sound recordings for HD-SPECIAL Documentary: surround journey to SHIRAGAMI, a documentary broadcast in Hi-Vision (NHK HDTV technology). Since sound is the key element of the program, the crew picked up sounds from a variety of sources: the voice of the main character, the sound of walking on the sandy beach, and the atmosphere of the shore.
Figure-8 is a snapshot taken while recording the sound of a running river, used as environment sound in the scenes of the Shiragami Mountains in the above documentary. The most important point when recording surround sound for documentaries is how to capture the atmosphere of the location.

Figure-9 is another shot taken during the location sound recordings of the program. The arrangement of 4 microphones crossing each other to make an X shape is called an IRT-Cross, which is today considered an effective system for recording ambient surround sound on location. The advantages of the IRT-Cross are that it allows smooth sound transitions and creates a very soft and natural sound, which makes it well suited to recording discreet ambience that does not distract from the main sounds.
Figure-10 shows the recording of surround sound for Hi-Vision Special: World Natural Heritage. In this picture, the crew is recording the sound of the waves in the Sian Ka'an Biosphere Reserve in Mexico, using an IRT microphone cross and a PD-6 recorder.

Multi-channel recorders
Before portable multi-channel recorders were commercially available, recording was done using three 2-channel recorders with timecode synchronized between them. The following are some of the battery-operated multi-channel recorders recently released on the market.
Figure-11 shows the recorder made by DEVA, the US manufacturer who developed a 4-track hard-disk recorder designed primarily for motion picture sound recording on location. This new multi-track recorder was launched to meet the needs of users who require more than 4 channels, up to a maximum of 10 tracks.

Figure-12 shows the Fostex PD-6 Portable Location Recorder, developed as a successor to portable DAT machines. This battery-operated recorder allows 6-track recordings to be made on a DVD-RAM and the data to be transferred to DAW (digital audio workstation) systems as BWF or WAV files.
Figure-13 shows the HHB (UK) PORTADRIVE, an 8-channel hard-disk recorder with a built-in digital mixer.
Figure-14 shows the AATON CANTAR-X hard-disk audio recorder.

As discussed above, the environment for field recording of surround sound has improved dramatically with the emergence of portable recorders with 4 or more tracks.

2. Surround sound music production examples

In this section, we discuss examples of surround music programs recorded in the field, not in the concert hall or in the studio.

HD Surround Hour: Music In Nature
Music, just like drama, has been regarded since the beginning of the 1990s as an area where the use of surround can be highly effective, and a variety of surround sound music programs have been produced so far in the hall and in the studio. In the classical music category, in particular, new miking techniques and other technical developments have been introduced into actual production. The advantage of surround sound in music program production has been established in the area of broadcasting.
HD Surround Hour: Music In Nature is a program intended to integrate sound in nature with music. It features surround recording of music played by musicians at sites with natural acoustics, for example in a cave or in the valley between tall buildings. The approach is similar to that of documentary production, but for music program production we can use a small production truck if necessary.
Figure-15 shows the recordings at the Gunma Observatory. This is a successful example of capturing the unique acoustics of the gigantic object made of concrete, together with beautiful reverberation.

Figure-16 shows an example of production in the garden of a Zen temple in Kyoto that captures the sound of a traditional Japanese musical instrument in the acoustics of the temple.
Figure-17 shows a surround microphone system called INA-5 that features 3 front microphones and 2 rear microphones.
A common mistake among music sound mixers is to regard sounds other than music as unwanted noise to be eliminated. While this may apply to studio recording, mixers need to use the sounds of nature actively to produce new surround sound for music, based on the concept of integrating music with the sound in nature.

3. Surround sound sports production

Sports, whether played indoors or outdoors, are an area where surround audio is regarded as highly effective. The sense of reality that 5.1 surround offers can be appealing to viewers, especially when watching live sports broadcasts.
What is most important in using surround sound effectively in sports broadcasts is to convey the acoustic image of the stadium to viewers. To that end, miking is the key to capturing the unique acoustics of the venue.
The following should be considered when producing surround audio for outside broadcasts of sports:
- Since most live sports programs run for hours, the surround sound field should be created so as to keep viewers from getting bored. For this purpose, the occasional use of contrasting sounds, such as direct versus distant or monaural versus stereo, helps to keep the sound from becoming monotonous.
- Surround sound acoustics are naturally created by the noise of the spectators in a sports stadium. In a live broadcast of a game, therefore, it is important to make the impression at the very start that the program is delivered in surround sound, not in conventional stereo. An effective way to achieve this is to exaggerate the surround sound at the beginning of the broadcast and then gradually return the balance to a natural level.
Examples of surround sound production for sports

2002 FIFA World Cup JAPAN YOKOHAMA

Figure-18 shows the microphone layout used at the International Stadium Yokohama for recording the final match of the 2002 World Cup, which sent the people of Japan into a frenzy. LFE-dedicated microphones are featured in this system. Although it depends on the sound field, which varies from stadium to stadium, this type of microphone works effectively when installed at points where the LFE components gather.
Figure-19 shows the transmission system. One of the outputs mixed in 5.1 channels at the stadium is encoded to Dolby E, while the other is mixed down to 2 channels and transmitted to the Broadcasting Center. The latter is used not only for surround sound broadcasts but also as the material for various purposes, such as news programs. While it is desirable to transmit uncompressed, discrete 5.1 surround, transmission of encoded signals is more cost-efficient considering the existing transmission capacity and line costs. The disadvantage is the inevitable deterioration in sound quality caused by tandem encoding/decoding: audio encoded in Dolby E is decoded after reception and then re-encoded to AAC (MPEG-2 Advanced Audio Coding), the standard in Japan. Surround broadcasting of sports programs in Japan, which gained popularity in the 1980s but declined due to the restrictions of analog broadcasting, is now drawing attention again with the launch of terrestrial digital broadcasting in December 2003. Sports are expected to be a key genre in broadcasting media in the future, and an extensive range of programs on baseball, soccer, horse racing, rugby and grand sumo may be produced.

4. Live concerts
Live concerts are the type of music program best suited to surround sound transmission. Since most of these programs are broadcast live, the development of a stable transmission system is vital, as in the case of sports programs.
A point to note in surround sound mixing is that a sound mixer is apt to focus on capturing the enthusiasm of a stadium filled with the cheers and applause of the audience rather than the music per se, neglecting the balance of the music that should be the major focus of the broadcast. Sound mixers should always use their own cool judgment to create a good balance between the music, the response of the audience and the sound ambience, while giving first priority to the music. Here are some examples of the production:
NHK Tuesday Song Concert

NHK Tuesday Song Concert is a long-running popular song program broadcast live from NHK Hall every Tuesday from 20:00 to 20:43. It has been delivered in surround sound since the 1990s, and the audio equipment was updated to 5.1 surround in August 2003. Figure-20 shows the current basic surround sound design. Figure-21 shows control rooms A and B: Room A mixes the total input, and Room B mixes the music parts.

Current issues

Multi-channel surround sound is used in various broadcasting genres today. However, the following issues still need to be studied.

1 Lip sync problem on large-screen displays for home theaters
Today, LCD and plasma flat-panel displays of around 40 inches are becoming the TV sets of choice in a growing number of households. As a result, viewers are beginning to notice the video processing delay generated inside the receiver, which puts the audio slightly out of sync. Since the problem will be even more apparent in the coming age of larger displays, our industry should draw up guidelines for these products.

2 Down-mix coefficient to 2-channel stereo
In 2004, the Association of Radio Industries and Businesses (ARIB) established standard values for the down-mix coefficients used to down-mix 5.1 surround sound to standard 2-channel stereo. With these values established, viewers will be able to enjoy natural sound on any receiving set. This should be welcome news for sound mixers.

3 Should LFE channel always be used?
An issue that has troubled sound mixers is whether or not the LFE channel should always be used. In our opinion, it is not necessary in some program categories. However, as more households purchase 5.1 surround home theater systems, we receive an increasing number of complaints from viewers that nothing is coming from the subwoofer. We can tell them that LFE is not required for the program in question, but it is still not a good idea to provoke such complaints.
Sound mixers are currently torn between audio expression and market principles over this issue.

4 Surround miking for field production
Several surround miking systems have been proposed for field production. Just as in the case of main miking system for music production, development of surround miking and surround monitoring methods is highly anticipated.

REFERENCES
1 S. Yoshikawa [HDTV Multichannel Sound Study Group], 1996 AES Copenhagen
[Proposal for the Specification of Control Rooms for HDTV Multichannel Sound Program Production]
2 Mick Sawaguchi, 1996 AES Copenhagen
[HDTV Drama Multichannel Sound Production with 3-2/3-1/2 CH]
3 Mick Sawaguchi, Akira Fukada, 1999 IBC Amsterdam
[Multichannel Sound Production Practice for Broadcasting]
4 Mick Sawaguchi, 2000 19th AES INT Conference
[Surround sound mixing seminar]
