DEMOCRATIZING SPATIAL AUDIO

A conversation with MNTN Co-founder and Sound Scenographer Johannes Scherzer

 

During last year’s epic Ableton LOOP summit, HEDD had the pleasure of supporting the 3D audio project MNTN – The Sound of the Mountain. In the aftermath of this wonderful event, HEDD Co-Founder Frederik Knop met up with MNTN mastermind Johannes Scherzer to talk about his career as a scenographer, the ever-growing 3D audio world, and what he has in mind with his new MNTN software.

 

FK
Johannes, you are one of the co-founders of the Berlin-based sound scenography studio TAUCHER. What does the term sound scenography mean and why is it significant for your work?

JS
The term sound scenography is inspired by what scenographers do on a theater stage or on a film set: they design the scene for the actors, which is the actual space where the story happens. They are creating space, and at TAUCHER, this is what we do with sound. I would consider it a sub-discipline of scenography. But as I see it, sound designers have already practiced sound scenography for a long time, for example in film, theater, or radio dramas. They just didn’t call it that.

FK
Sounds like a dream job. You also mentioned to me that you did film sound for some time. Which of these movies has most influenced your work on sound scenography? What is the connection?

JS
Creating sound and music for spatial media is a fascinating practice because, as in film sound design, you can shape the audience’s perception of a story or an exhibition. You can create an entirely new world, and even better than in movies, people can wander through the story. When I studied sound at the film university in Babelsberg, I learned a lot in almost all film sound-related disciplines, from location sound to the cinema mixing stage. At some point, I realized that there are more possible applications for the kind of sophisticated sound work that is common in film. Why not design real spaces the way you would design a movie soundtrack, and let people experience the story by physically exploring the space? I mean everything: atmosphere, music, special effects, voices, and so on. All this happened when I was working on NOISE (E. Löwe), a short feature film created with a soundtrack in the wave field synthesis (WFS) format. As far as I know, it was the first movie where even the script had been written with the extended sound design possibilities in mind.

Another project was an interdisciplinary seminar about audio-only storytelling for high-resolution spatial audio systems, in collaboration with the writing/dramaturgy students. We called it artistic research and wanted to figure out how storytelling and sound design work when there is no frontal system, such as a screen or a center loudspeaker in front of the audience. We found out that storytelling becomes entirely different.

FK
What do you mean exactly? Can you give an example?

JS
When you position three actors’ voices in a triangle, and the listener is in between, you need to find an answer for why you’re doing it. Because with 3D sound, the listener somehow becomes spatially part of the story. For the audience, it feels different to listen to a discussion from the outside than from the inside. It may have a meaning, and you need to find out how that changes the story. Sometimes the effect is more subtle, and sometimes it makes a significant difference.
Whatever you do, you had better consider where to put the listener in the scenery, because he or she is always part of it anyway. In stereo formats, that decision usually does not have such a big impact on how it feels for the audience.

FK
Have you had a particular moment when you decided to focus on spatial sound design?

JS
I remember when we met at the cinema mixing stage at the film university to discuss different microphone techniques we used for recordings with the Babelsberg Filmorchestra (during that period, they were recording at the Funkhaus Berlin). At the time I had already taken my first steps in mixing for wave field synthesis, where you have every one of the 360 degrees available to arrange sounds in space, and here we were discussing homeopathic nuances of space in the stereo picture. I intuitively felt that if I had to choose between the whole space and just a small fragment of it, the first would open up a whole new world to explore different ways of storytelling, composition, and sound design.

FK
When I heard a wave field synthesis mix for the first time at the Electronic Music Studio of Technical University Berlin, I was blown away as well. Yet, I felt and still feel that there is much room for improvement …

JS
Same for me, that’s why I’m still so fascinated by designing spatial sound!

FK
… but anyway, I have to come back to 3D audio with regard to real-world consumers, not the multi-million-dollar concert halls and academic institutions. Everyone wants a piece of the pie, it seems: I am talking about virtual reality, games, and ultimately, headphones. What is the difference between creating sound for headphones and loudspeakers?

JS
For me, the essential difference is the way people experience the story (or whatever the content might be). If you work with loudspeakers, you create a shared experience, where people come together in one place and enjoy the show together. It is a very social event, and as a sound designer, you have to take that into account. And there is also the real room, the technical system, noise from air conditioners or noise from the outside, which all add to people’s experience. I see this as a sort of augmented reality, and when it is done well, it is not possible to draw a line between the real world and the created sound, which matches perfectly with what’s already there.
In contrast, VR can keep the so-called “real world” completely out of the story, which means you have the best control over the user’s experience. That is a distinct advantage, and it’s almost independent of the location where people experience VR. But the user is separated from other people possibly around her; just imagine a group of people sitting together, with all this gear mounted on their heads. That’s a different experience, and I think we should take this into account when creating content. And that’s one of the interesting questions that I’m very curious to figure out. Augmented reality with headphones alone has also sparked my inspiration, and some fascinating things are happening out there.

For example, the JUNGLE-IZED app (http://www.jungle-ized.com). I was in New York just last weekend and meandered through the Amazon rain forest. The idea is mind-blowing, but in the execution, with all the details that contribute to the overall experience, such as interfering traffic noise, leveling, dramaturgy, or sound design, there is still a lot of room for improvement.

FK
Let’s talk about Ableton Loop. It was so crazy this year. You presented your very own 3D audio engine MNTN and we were happy to provide you with 20 Type 05 studio monitors. What was this all about?

JS
Loop was the first event we joined with MNTN, and it turned out that we were not the only ones who wanted to work with spatial sound. We created a space where people could come and bring their stems or multitrack sessions to create their own mixes on a 3D multichannel setup. The setup had speakers on the floor, at ear level, and mounted on the ceiling. The aspect that people most appreciated was the simple user interface, which still gives control over the whole space. And some of the visitors went completely crazy over the ease of use. We had some super busy days and an enormous amount of positive feedback, which is a big motivation for us.

FK
What was your initial motivation to build MNTN?

JS
The back story is: For a couple of years, we kept struggling with clients’ budgets, which were rarely large enough to afford the commercial 3D audio technology available at that time. Moreover, we were missing the spontaneous and intuitive part of creating spatial sound mixes. Most technology that’s available is overloaded with features, at least for real-world commercial projects, not to mention its price. At some point, we decided to build a system from scratch and make it available to everybody who wants to create immersive sound experiences. A couple of months ago, we finally launched what we call The Sound of the Mountain, in short: MNTN. It comes as a stand-alone app that users can connect to any audio production software. Compared to the devices we were working with before, the app has almost no buttons, switches, or faders. But it has everything you’ll need for 99% of typical projects.

FK
Yeah, it looks impressive and indeed very intuitive. Can you explain to me how it works technically and who you are aiming at with MNTN?

JS
It’s as simple as connecting a bunch of loudspeakers to a multichannel audio interface, typing in the spatial position of each loudspeaker, and then starting to mix. You can also use subwoofers for bass management, which extends the frequency range on the low end and increases the overall system’s power, because small speakers don’t need to struggle with low frequencies. The speaker layout can be 2D or 3D, and you are almost entirely free in where you place the loudspeakers. A big plus is the binaural 3D HRTF processing, which means that you can work on a spatial mix using your headphones alone. You can then switch to a loudspeaker setup and back to headphones at any time. Finally, there are a number of export formats, including Ambisonics B-format as specified by YouTube for 360° and VR videos.
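
Editor’s note: to make the export step mentioned above more concrete, here is a minimal sketch of how a single mono sample arriving from a given direction could be encoded into first-order Ambisonics. It is not MNTN’s implementation, which is not described in this conversation. The function name and arguments are hypothetical, and the sketch assumes the AmbiX convention (ACN channel order W, Y, Z, X with SN3D normalization), with azimuth measured counterclockwise from straight ahead and elevation measured upward from the horizontal plane.

```python
import math


def encode_first_order_ambix(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order Ambisonics (hypothetical helper).

    Assumptions: AmbiX convention, i.e. ACN channel order (W, Y, Z, X) and
    SN3D normalization; azimuth is counterclockwise from the front,
    elevation is upward from the horizontal plane, both in degrees.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)  # left-right component
    z = sample * math.sin(el)                 # up-down component
    x = sample * math.cos(az) * math.cos(el)  # front-back component
    return (w, y, z, x)


# A source straight ahead contributes only to W and X;
# a source directly to the left contributes only to W and Y.
front = encode_first_order_ambix(1.0, 0.0, 0.0)
left = encode_first_order_ambix(1.0, 90.0, 0.0)
```

In a real mix, each source would be encoded this way sample by sample and the four channels summed across all sources. Decoding to loudspeakers or to binaural headphones is a separate step that this sketch does not cover.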

FK
Regarding LOOP: Why did you prefer studio monitors and not PA systems?

JS
Well, thanks to you guys, we could opt for the best sound we could imagine (grins). We indeed turned away from typical installation or PA loudspeakers because of the “value for channel” you get. With studio monitors like yours, you get ten times better sound for half the price, or even less. Studio monitors might not be as powerful as PA systems, but very often you just don’t need that much power. A small disadvantage is usually the mounting, but all you need is to get a bit creative in finding proper solutions; it’s not a big deal. However, the difference can amount to tens of thousands of euros, which is a deciding factor as to whether or not a project can be realized.

FK
Until now, complex and expensive 3D audio technologies such as wave field synthesis or Ambisonics have not found a real place in music production. To me, it seems that 3D audio has yet to show how it can be relevant for the process of music composition and production. What is your opinion on this?

JS
The way I see it, spatial audio formats, even surround sound, have had a hard time because they were complicated to use for most people. In comparison, stereo is easy to handle (and sometimes complicated enough). In artistic live performances and sound installations, spatial sound has been around for a long time already, but it has always been limited to the event itself, and the experience could hardly be transferred into standardized media. Luckily, we are about to overcome that. Since the advent of VR technologies, interactive spatial sound as we know it from computer games has become much more accessible. Of course, it still is a niche thing and the field is far from standardization. I believe this is the critical problem that needs to be solved: artists want to create a piece only once and then publish it in various formats, and listeners want to have a smooth listening experience. We took that into account for the development of MNTN, where you create only one mix which you can then output in many different formats, including 3D sound for headphones and VR videos.

Another problem, which I also experienced myself, is that the tools for creating spatial sound are rarely accessible to artists and sound creators; often they’re too expensive or too complicated to use. But who on earth should come up with fresh ideas about music and storytelling if not the creatives? Because of that, they need access to the tools first. If there is no content which people can listen to, all the spatial sound technologies are useless. And that’s what we aim for with MNTN.

FK
Do you see 3D audio as the ultimate goal for simulating reality?

JS
As I like to say: Don’t try to simulate reality, create reality! I mean, what is reality? And what is its virtual representation? Experiencing the so-called virtual reality is still real for our senses. If you don’t try to simulate something, the same thing becomes the original. At the very beginning, I was fascinated by the simulation idea, but I quickly learned to see 3D audio as an instrument to play the space. Usually, if you try to simulate something, it is still not as fascinating as reality is. Or it takes an enormous technical effort. But using 3D audio as an instrument, sometimes even to create a sonic experience that wouldn’t be possible in reality, makes a big difference for both artists and the audience. Simulation is good for training purposes, rescue operations for example. But in the arts, who would want to simulate an artistic work, rather than creating an original piece?

FK
Well put, and probably a good conclusion for our little conversation. But I have one more question, as I hear that you will be living in Canada for the next few months. What is the best part of the next thing you’re doing?

JS
I just started diving into research in the field of information studies. That’s why I came to McGill in Montréal. The connection to sound, or more specifically to sound scenography, might not be that obvious, but I found that it’s fascinating to look at spatial sound from the perspective of how we communicate, organize, and comprehend information. But I’m not yet exactly sure where this adventure will take me.

 

Read more about MNTN: https://mntn.rocks

From left to right: HEDD CEO Klaus Heinz, Johannes Scherzer (MNTN Co-founder & Sound Scenographer), Benjamin Schulz (MNTN Co-founder & Software Architect), Frederik Knop (HEDD Co-founder).