June 9, 2013

Latest Trends of Digital Cinema Sound - What I See in the Steps of Digital Cinema Sound [ Special Report ]



By Masaaki Fushiki




I. Méliès
A few Georges Méliès films were shown last December at a small event facility, Le Studio Hermès in Ginza, which has a proper projection room equipped with a pair of 35mm film projectors. The pictures were actually shown from Betacam on the day, but the quality was quite agreeable and I enjoyed the abundant appeal of the pioneer’s 100-year-old special effects. Méliès himself was one of the key characters in the recent, well-crafted 3D movie “Hugo”, as many of you already know. For me, having been associated with filmmaking if only slightly, this movie was a truly admirable masterpiece. These two Méliès-related experiences illustrate how motion pictures have evolved incessantly by absorbing the latest technologies of their day.
II. Digital Cinema
Thanks to the VPF (Virtual Print Fee) arrangement, 85% of the total 42,000 cinema screens in the USA are now digital. Major studios used to prepare 6,000 film prints of a premium summer hit for distribution, but only 500 suffice nowadays. Worldwide, the digital cinema ratio was as high as 69% at the end of 2012. “Digital” is therefore already a firm foundation, along with 3D, and the industry seems set to continue introducing further technological advancements such as 4k pictures (with laser projectors to follow) and HFR (high frame rate). This is inevitable because the industry is always concerned about differentiating itself from home entertainment. Departing from the 24fps tradition may not be a tough challenge in the digital domain, considering that film is rarely used in production today and that accommodating other frame rates is relatively easy in video projectors. The forthcoming “Avatar 2” promises a 60fps/4k release to generate some excitement.
Those advancements, on the other hand, are a kind of specification upgrade rather than a radical paradigm shift beyond digital cinema. Is this the direction in which the industry is steering its focus? Well, it becomes clear that the digital cinema movement was in reality driven by the economy and efficiency inherent in digital media, more so than by picture quality. The DCP (Digital Cinema Package) allowed the film industry to save a great deal of time, film stock cost and transportation in producing distribution prints. It also eliminated the telecine process that had been necessary to transfer content to video packages and broadcasts. This was a substantial improvement in production efficiency.
The trend in cinemas has shifted from large auditoria to multi-screen complexes in recent years, which lets them program all scheduled shows simply by presetting them on a control panel instead of laboriously loading and unloading film by hand each time. Each screen tends to be small enough for a 2k projector to deliver perceptibly sufficient picture quality, even though its specifications are not much different from those of digital home video, and this also helped the transition.
The digital format by its nature has also removed the high hurdle for alternative content. About 4 years ago, l’Opéra Comique staged Bizet’s “Carmen” with the original libretto, conducted by John Eliot Gardiner, and its last performance was shown in 45 digital cinemas throughout France; I thought it an excellent example of alternative content. The same kind of arrangement is now seen in Japan, where theater owners call it “live viewing” and can charge higher admission than for regular movies. We can understand that the industry has just arrived at a reasonably mature digital format, just as consumers once settled the choice between film cameras and digital cameras, and that it is now at the stage of enhancing the format with step-up technologies.
III. Sound
Cinema sound has kept evolving roughly every 10 years, starting with Dolby Stereo in the mid-70s, followed by Dolby SR in the 80s. Digital arrived late in the movie industry, not until the 90s, and at that time some form of compression coding was employed in each of Dolby Digital, DTS and SDDS. DCI then set forth the standard of discrete PCM sound in 5.1/7.1 formats. The intermediate step of Dolby SR before Digital Cinema, which lasted more than a decade, was valuable time for turning over the soil of the entire movie industry: productions gradually invested in the digital environment while cinemas made only partial changes to their systems.
When I posed a philosophical question like “what is the movie in essence”, the engineers at Dolby answered almost immediately, “it is storytelling”, as their common understanding. It starts from a script in the director’s hand, and every staff member and every technology is expected to support the story in the script. The storytelling side of sound mixing had traditionally been rather rigid, with dialogs fixed at the center and sound effects and music balanced modestly against them. Surround effects were required to be strictly ambient so as not to draw the audience’s attention off-screen, which was a reliable practice for generating a sense of being there, within the limitations of the matrix 4-channel surround format, without damaging the quality of the story.
This model of self-restraint has loosened somewhat in the digital cinema paradigm, though. The trend has been toward more freedom in the stereo localization of dialogs, as typically found in Pixar animation movies, and toward more dynamic and precise surround sound as discrete 5.1 channels became widespread. Particularly in Dolby Surround EX and later formats, the intent was to allow off-screen localization as distinct from ambience, and this direction is pushed to its furthest yet by the recent introduction of Dolby Atmos.

IV. Dolby Atmos
Introduced at CinemaCon 2012 in Las Vegas last April, Dolby Atmos is probably the largest revolution in the history of movie sound. In the Dolby Atmos framework, dialogs, sound effects and music are saved in the DCP as 128 audio objects, which are rendered to as many as 64 output channels to suit the theater’s playback system. This vastly larger number of channels compared with 5.1 or 7.1 aims at a realism of sound sensation unattainable before, in order to deepen immersion in the story. Thanks to the concept of audio objects, the workflow for this complex structure can be kept efficient and simple in production.



Dolby Atmos System Flow

Let’s look at the playback section to begin with. Dolby Atmos provides two new features to improve the sensation of sound. One is the addition of height through a group of ceiling speakers; the other is a change in surround imaging, from wall surfaces formed by speaker arrays to the point source of each individual speaker.
Natural sounds around us, such as an airplane fly-over, birds chirping, or thunder, have a certain directionality, and there have been limitations in reproducing them in film sound. In the days of the Lt/Rt matrix, we basically had on-screen sound or unlocalized off-screen sound (we used to call the latter “interior”), and panning between the two was only a crude expression of an object’s movement. Even later, in the time of discrete 5.1, the difference in timbre and localization between the highly linear screen speakers and the surround arrays was so profound that solid sound imaging, or panning that exposed this tonal difference, was not practically desirable except for something rather instantaneous. Traditionally, the theatrical system defined the surround space as three elements: the left side, right side and back walls. Dolby Atmos instead regards the listening space as a cube or a hemisphere and places as many speakers as possible around it, each functioning as an individual audio channel, whereby directionality and uniformity of timbre can be dramatically improved. The recommended ceiling speakers do not form a single common channel as an array either. They consist of two rows running front to back, one on the left and one on the right, again with each speaker functioning individually and driven by its own separate amplifier.
With the above change, much more colorful effects become attainable in movie sound mixing. In a jungle, for example, a variety of animals and insects may be heard from all sorts of directions in clear, crisp tones, giving full freedom to the engineers’ creativity. The quality of panning will also improve. It should be noted that the traditional ambient effects are still available in the new format by rendering them to multiple speaker outputs, just as before.
Dolby Atmos Speaker Layout


One more related change in the theater system is the addition of surround speakers on the sidewalls closer to the screen. They help localize sound at the far edges just off the screen and also smooth the panning of objects around the room.
Each surround speaker is required to stand on its own in terms of technical specifications. Frequency response and power handling are particularly important when cinemas consider renovating their systems to introduce Dolby Atmos. Still, the ceiling and surround speakers remain physically smaller than the front speakers, so Dolby Atmos wisely redistributes low-frequency elements to the rear sub-woofer(s) and tries to manage the power balance between the front and surround speakers by rendering to multiple surround speakers at once.

Next is the impact on the post-production environment. The most significant change Dolby Atmos brings to studio engineers is the handling of sound elements as objects. Dolby Atmos generates, as metadata, the information on where and when each object is placed within the system’s hemisphere, and saves it in a package together with the object. This workflow is almost the same as that of ProTools on workstations, so mixing with Dolby Atmos is like saving the production procedures themselves into the final mix.
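To make the idea of an object carrying its own “where and when” more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions and not the actual Dolby Atmos metadata format: the class name AudioObject, the keyframe list and the coordinate convention (x left to right, y back to front, z floor to ceiling) are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class AudioObject:
    """A sound element plus hypothetical positional metadata.

    keyframes holds (time_in_seconds, (x, y, z)) pairs sorted by time.
    This layout is an assumption for illustration, not a real file format.
    """
    name: str
    keyframes: list = field(default_factory=list)

    def position_at(self, t):
        """Linearly interpolate the object's position at time t."""
        times = [time for time, _ in self.keyframes]
        if t <= times[0]:
            return self.keyframes[0][1]
        if t >= times[-1]:
            return self.keyframes[-1][1]
        i = bisect_right(times, t)          # first keyframe strictly after t
        (t0, p0), (t1, p1) = self.keyframes[i - 1], self.keyframes[i]
        a = (t - t0) / (t1 - t0)            # 0.0 at t0, approaching 1.0 at t1
        return tuple(u + a * (v - u) for u, v in zip(p0, p1))


# An airplane fly-over moving left to right along the ceiling in 4 seconds.
flyover = AudioObject("flyover", [(0.0, (-1.0, 0.0, 1.0)),
                                  (4.0, (1.0, 0.0, 1.0))])
midpoint = flyover.position_at(2.0)
```

The point of the sketch is that the mix stores the movement itself, not a set of speaker feeds; which speakers actually play the sound is decided later, at playback.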
Movie production is no exception to ProTools having become the mainstream tool, and the environment offered by Dolby Atmos, free from any specific number of output channels, functionally unifies the entire post-production process. It therefore brings no added complication, nor does it demand that cinema systems clear the same high bar as dubbing theaters. When the sound engineer mixes a movie with Dolby Atmos in a dubbing theater equipped to the maximum playback specifications, a precise 5.1 (or 7.1) mix is automatically generated and saved into the DCP in parallel with the Atmos mix, eliminating the need for repeated, separate work. This single common DCP can be distributed to a variety of digital cinemas with different playback specifications. The scalability of Dolby Atmos plays an important role here: even if a cinema cannot afford a few tens of speaker channels, Dolby Atmos will intelligently take the particular playback environment into account and render its output so as to fit the room and make the most of the sound mix. This kind of care for backward compatibility has been a Dolby tradition.
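The scalability described above can be sketched as follows. This is emphatically not Dolby’s rendering algorithm, which I do not know in detail; it swaps in a simple distance-weighted, power-normalized panner purely to show how one object position can be rendered to rooms with different numbers of speakers. The function name render_gains, the rolloff parameter and both speaker layouts are made up for illustration.

```python
import math


def render_gains(obj_pos, speakers, rolloff=2.0):
    """Return one gain per speaker for an object at obj_pos.

    speakers maps a (hypothetical) speaker name to its (x, y, z) position.
    Nearer speakers receive more level, and the gains are normalized so
    that the summed power (sum of squared gains) is 1 for any layout size.
    """
    weights = {
        name: 1.0 / (math.dist(obj_pos, pos) ** rolloff + 1e-6)
        for name, pos in speakers.items()
    }
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {name: w / norm for name, w in weights.items()}


# Two made-up rooms: a small one with four speakers at ear level, and a
# larger one that adds two ceiling speakers.
SMALL_ROOM = {
    "L": (-1.0, 1.0, 0.0), "R": (1.0, 1.0, 0.0),
    "Ls": (-1.0, -1.0, 0.0), "Rs": (1.0, -1.0, 0.0),
}
LARGE_ROOM = {**SMALL_ROOM,
              "Ltop": (-0.5, 0.0, 1.0), "Rtop": (0.5, 0.0, 1.0)}

# The same object position is rendered separately for each room.
overhead_right = (0.5, 0.0, 1.0)
small_mix = render_gains(overhead_right, SMALL_ROOM)
large_mix = render_gains(overhead_right, LARGE_ROOM)
```

In the larger room almost all the level goes to the ceiling speaker nearest the object, while the smaller room, having no ceiling speakers, approximates the same position by favoring its two right-hand speakers.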
“It was like having a new instrument,” said one sound engineer of Dolby Atmos. One could also say it is like a palette for new paintings. There is some useful information on the net about how studios approached this new system and brought it into their production procedures, for example:
http://designingsound.org/2012/11/ambiences-with-dolby-atmos/
http://vimeo.com/58805489

V. IOSONO 3D and AURO-3D
A format competing with Dolby Atmos is IOSONO 3D, developed by Fraunhofer, a giant organization known as a codec specialist, and this system may be more futuristic in concept. Audio objects and rendering through multiple speakers covering the spherical space are common to Dolby Atmos, but its approach to sound localization is based on a totally new theory, Wave Field Synthesis, which captures the sound source through an array of multiple microphones and reproduces it with the same kind of speaker array around the space. IOSONO’s unique features are a wide sweet spot and the ability to localize sound at any point within the sphere, even right next to you. With such significant advantages, they claim it to be the true 3D soundscape format. The drawing shown below illustrates an example of a cinema installation with the IOSONO system. According to IOSONO, a typical 1:2 box auditorium would need 9 speakers in front behind the screen and 9 in the back, for a total of 54 speakers forming a single-layer horizontal ring. In addition, recent installations have 15 to 18 ceiling speaker channels, using a triangulated array for the ceiling zone. For timbre matching and sufficient power handling in these designs, they cooperate with speaker manufacturers such as D&B, Fohhn, JBL, Meyer and QSC. In the Chinese Theatre in Hollywood, for example, QSC KW-151 for the screen and K-12 for surround are installed.
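The total of 54 speakers quoted above for the horizontal ring can be checked with simple arithmetic. The sketch below assumes, on my part and not as a statement from IOSONO, that speakers are spaced evenly along every wall, so that each side wall of a 1:2 room carries twice as many speakers as the front wall. The function name is hypothetical.

```python
def ring_speaker_count(front_speakers, depth_to_width_ratio):
    """Count speakers in a single-layer ring around a box auditorium.

    Assumes even spacing on all four walls, so a side wall that is
    depth_to_width_ratio times as long as the front wall carries that
    many times the speakers. This spacing rule is an assumption.
    """
    back_speakers = front_speakers
    side_speakers = front_speakers * depth_to_width_ratio
    return front_speakers + back_speakers + 2 * side_speakers


# 9 in front, 9 in back, 18 on each side wall of a 1:2 room.
ring_total = ring_speaker_count(9, 2)
```

Under this assumption the four walls carry 9 + 9 + 18 + 18 speakers, which agrees with the figure of 54.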

IOSONO Speaker Layout
 
Auro-3D is another newcomer, promoted by Barco, that extends the traditional 5.1 into a two-layer format: 4 channels, excluding Center and LFE, are stacked on the 5.1 foundation to make a total of 9.1 channels, or 10.1 with the addition of a ceiling channel. 6.1/7.1 can be extended to 12.1/13.1 in the same manner. While traditional surround formats contain only one horizontal line of speaker arrays, Auro-3D adds the vertical dimension to realize a 3D soundfield. Installations and productions have been under way since last year, and its backward compatibility with the traditional layout is quite straightforward.



VI. Gimmick and Essence
Alongside the recognition that the movie is essentially storytelling, my own contemplation is that it is an effort to construct a space in which to experience the story. This may be only a matter of expressing the same thing differently, but in my opinion the former is a structural framework working on logic and sentiment, while the latter is a physiological communication with the audience. It may be something like a phase difference between intellect and instinct. It is literally the sense of existing inside the screen rather than being seated in a cinema or, from the moviemaker’s viewpoint, the effort to lead the audience into such an illusion. When movies were first shown, viewers were terrified by a steam locomotive rushing toward them and tried to get out, and this is exactly the evidence of a physiological space. Technologies like large screens, surround sound and 3D imaging have all aimed at this. The latest motion simulators fall into the same genre.
What decides whether these technologies end their lives as gimmicks or remain as an essential part of moviemaking? Just as intellect and instinct both exist, there are two different approaches to realism in the imaging arts. One is to capture the scene with a fixed camera continuously for a long period of time, just like a fixed-point observation. The other is the very cinematic “montage” technique, which edits different cuts together to structure a story and drive it forward. As for sound, there is no formula for how it should follow these approaches. Sound effects and music often continue independently of scene transitions. Surround sound is not a technology for recreating the acoustic dimensions of each scene precisely from the opening to the finale. It goes along with the scenes to a certain extent, but carries its own montage of ups and downs to enhance the stage, and this looks exactly like an established style of “sound design”.
3D has a much longer history than surround, but has gone through its gimmicky cycles several times in the past. Personally, I think 3D is to imaging exactly what surround is to sound, and thus can perform at least as well as 5.1, especially in digital. On the other hand, I feel the immersive sensation is often greater with a huge 2D screen that fills the entire field of view. I also feel 3D lacks a style beyond simply being dimensional from beginning to end. These two factors may be the challenges it has to tackle. In that sense, it will be interesting to see how 3D digests the live viewing that I noted earlier.
One answer I arrived at in this process of thinking is that excellent content nurtures technologies. My sense of reality is that even a gimmick technology can potentially follow the road to becoming an essence if it continues to enjoy the benefit of being used in excellent content.

The above includes purely personal observations about movies and technologies in the digital era, and I hope it helps readers, even slightly, to better understand the subject. I would like to acknowledge and thank Mr. Doug Greenfield of Dolby Burbank, a longtime colleague of mine, and also Mr. Jeff Levison of IOSONO. Both of them helped me clarify a few factual aspects of the latest information.


About the Author:
Masaaki Fushiki was born in 1948 and graduated from the University of Tokyo, majoring in French language and literature. After 5 years of experience in sales promotion and product planning in the Overseas Department of TEAC Corporation, he joined Dolby in 1979 as a licensing liaison and began promoting the concept of surround sound for home entertainment in the 80s. He also devoted himself to standardizing the digital audio format of DVD-Video in the 90s. In 1997, he established the branch office of Dolby’s international service company in Japan, and turned it into Dolby Japan K.K., a local legal entity, in 2007. He was the first representative director of the company until he resigned in 2009.