Audio in Virtual Reality

As I’m sure you know, Virtual Reality, or VR, is becoming more and more relevant in today’s media. As the technology improves and the price tag shrinks, we should expect the idea of putting on a headset and immersing ourselves in a virtual world to become less ‘science fiction’, and more ‘science fact’.

At Bayan, we’re always keeping a keen eye on emerging technologies and VR is no exception. As usual, we’re going to take a closer look at the audio side of things.

How it works in the real world

Your spatial awareness largely depends on your sense of hearing. Your ears and brain work together to detect and dissect an unreasonable amount of information describing your auditory environment. Our system for mapping out our surroundings with sound is highly tuned and can single out tiny changes in timbre, volume, and time. Here’s how we do it:

Interaural Time Differences:

One of the many benefits of having a pair of ears, rather than just one, is the ability to compare what we hear in each. Sound, like everything else, takes time to travel through space. When a sound wave reaches your head, it arrives at each of your ears at a slightly different time. This time difference varies depending on where the sound source is in relation to your head. The farther to the right or left, the larger the time difference is.
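To put a rough number on that time difference, here is a minimal sketch using Woodworth’s spherical-head approximation, a common textbook formula. The head radius and speed of sound below are assumed typical values, not measurements, and the function name is ours.

```python
import math

HEAD_RADIUS_M = 0.0875       # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0   # assumed speed of sound in air at about 20 degrees C


def interaural_time_difference(azimuth_deg: float) -> float:
    """Approximate interaural time difference in seconds.

    0 degrees is straight ahead, 90 degrees is directly to one side.
    Uses Woodworth's spherical-head approximation, which is intended
    for azimuths between 0 and 90 degrees.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


for angle in (0, 30, 60, 90):
    itd_us = interaural_time_difference(angle) * 1e6
    print(f"{angle:>2} degrees -> {itd_us:.0f} microseconds")
```

With these assumed values, a sound directly to one side arrives about two thirds of a millisecond earlier at the near ear, which gives a sense of how fine the timing is that your brain works with.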

Interaural Level Differences:

This first method isn’t so reliable with higher-pitched sounds, like the sound of a mosquito buzzing around your head. At high frequencies the wavelength is shorter than the width of your head, so the timing difference between your ears becomes ambiguous. Instead, when a sound source lies to one side of your head, the ear on the opposite side lies within your head’s ‘acoustic shadow’. Your head blocks much of the sound, greatly decreasing the volume before it reaches your other ear. Above about 1.5 kHz, we mainly use volume (level) differences between our ears to tell which direction sounds are coming from.
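A quick bit of arithmetic shows why the switch happens around that frequency. The sketch below compares the wavelength of a tone with the width of a head; the head diameter and speed of sound are assumed typical values, and the function names are ours.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed speed of sound in air
HEAD_DIAMETER_M = 0.175      # assumed average head width


def wavelength_m(frequency_hz: float) -> float:
    """Wavelength in metres of a tone at the given frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    return SPEED_OF_SOUND_M_S / frequency_hz


def wavelength_in_head_widths(frequency_hz: float) -> float:
    """How many head widths fit into one wavelength."""
    return wavelength_m(frequency_hz) / HEAD_DIAMETER_M


for freq in (200, 1500, 8000):
    print(f"{freq} Hz -> {wavelength_in_head_widths(freq):.2f} head widths")
```

Low tones have wavelengths several times wider than the head, so they bend around it easily and timing is the more useful cue. By the time the wavelength is close to, or smaller than, the head itself, the head casts a noticeable shadow and level differences take over.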

Spectral Filtering:

Sounds coming from different directions bounce, or reflect, off the inside of your outer ears in different ways. Your outer ears modify the frequency content of the sound in unique ways depending on the direction of the sound. This is how you can determine the elevation of a sound source.

How it works in Virtual Reality

In the digital realm, almost anything is possible, including the simulation of a pair of human ears. In a VR environment, the developer can build ‘virtual loudspeakers’ into the world. These can be static, such as one placed inside a waterfall, or dynamically animated, such as one attached to a fluttering moth.

The sounds produced by these virtual loudspeakers disperse about the environment, reflecting naturally off materials as they would in reality. Obstacles with hard and soft surfaces would reflect the sound differently. In the real world, these interactions are very, very complex, so replicating them in full detail and in real time would be practically impossible – especially when running on a smaller mobile phone processor.

To simplify the reflections, Google’s Spatial Audio system breaks them down into three components:

  • Direct Sound
  • Early Reflection
  • Late Reverb

The first sound to hit your ears is always the Direct Sound. This is the sound that has travelled straight from the source to your ears, with no reflections or bounces in between. The closer the sound source is to you, the more Direct Sound you hear. As the source moves further away, the Direct Sound becomes weaker and reflected sound makes up more of what you hear.
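One common way to model how Direct Sound fades with distance is the inverse distance law, where the level drops by about 6 dB each time the distance doubles. The sketch below is only an illustration of that law, not a description of how any particular VR audio engine does it; the reference distance and function names are our assumptions.

```python
import math


def direct_sound_gain(distance_m: float, reference_distance_m: float = 1.0) -> float:
    """Linear gain of the direct sound under the inverse distance law.

    The gain is clamped to 1.0 inside the (assumed) reference distance
    so that very close sources do not become infinitely loud.
    """
    if distance_m < 0 or reference_distance_m <= 0:
        raise ValueError("distances must be non-negative, reference must be positive")
    return reference_distance_m / max(distance_m, reference_distance_m)


def gain_to_db(gain: float) -> float:
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)


for distance in (1, 2, 4, 8):
    gain = direct_sound_gain(distance)
    print(f"{distance} m -> gain {gain:.3f} ({gain_to_db(gain):.1f} dB)")
```

Because the reflected sound in a room fades far less with distance than this, the balance tips towards reflections as the source moves away.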

This is the Early Reflection stage. These reflected sounds are altered in real time to mimic the real life impact that different surfaces have on a sound. This helps you perceive the shape and size of the environment you are in. For example, in real life, you can be standing in a pitch-black room and be able to tell if you are next to a wall by the way sound reflects off it.
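The pitch-black room example can be sketched with a little geometry. A reflection off a flat wall behaves as if it came from a mirror image of the source on the other side of the wall, so its extra delay is just the extra path length divided by the speed of sound. This is a simplified two-dimensional sketch with a single wall; the coordinate layout and function name are our assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air


def first_reflection_delay_s(source, listener, wall_x=0.0):
    """Delay of a single wall reflection relative to the direct sound.

    source and listener are (x, y) positions in metres on the same side
    of a flat wall that lies along the line x = wall_x.
    """
    # Mirror the source across the wall to get the 'image source'.
    mirrored_source = (2.0 * wall_x - source[0], source[1])
    direct_path = math.dist(source, listener)
    reflected_path = math.dist(mirrored_source, listener)
    return (reflected_path - direct_path) / SPEED_OF_SOUND_M_S


# Clicking your fingers beside your head, 0.5 m and then 3 m from a wall.
for distance in (0.5, 3.0):
    position = (distance, 0.0)
    delay_ms = first_reflection_delay_s(position, position) * 1000
    print(f"{distance} m from wall -> echo after {delay_ms:.1f} ms")
```

The closer the wall, the sooner the reflection arrives after the direct sound, and that gap is one of the clues that tells you a surface is nearby.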

If a sound source is in a large space, the sound will propagate over a longer period. This causes the number of reflections to increase, simultaneously reducing the individuality of each reflection’s sound. This is what is referred to as Late Reverb. Reverberation has been digitally modelled for many years and is now very advanced and efficient. It is possible to alter the shape and size of an environment and have a reverb engine respond dynamically. For example, you could be walking through a cave with a crackling torch in hand. The sound of the torch would reverberate off the walls, changing slightly as you rounded corners and walked into different shaped passages. Then, after stepping into a huge cavern, the reverb would alter itself to simulate the distant walls and wide open space – all in real time.
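For a feel of how much the size of a space changes its reverb, Sabine’s classic formula estimates how long a sound takes to decay by 60 dB from the room’s volume and how absorbent its surfaces are. The sketch below treats the cave passage and the cavern as simple boxes with made-up dimensions and an assumed absorption coefficient for rock, so the numbers are illustrative only. A real-time reverb engine will typically use more sophisticated methods than this.

```python
SABINE_CONSTANT_S_PER_M = 0.161  # standard constant for metric units


def box_room(width_m, depth_m, height_m):
    """Return (volume, surface area) of a box-shaped room."""
    volume = width_m * depth_m * height_m
    surface = 2.0 * (width_m * depth_m + width_m * height_m + depth_m * height_m)
    return volume, surface


def sabine_rt60_s(volume_m3, surface_area_m2, absorption_coefficient):
    """Estimated reverberation time in seconds using Sabine's formula."""
    if volume_m3 <= 0 or surface_area_m2 <= 0:
        raise ValueError("volume and surface area must be positive")
    if not 0.0 < absorption_coefficient <= 1.0:
        raise ValueError("absorption_coefficient must be in (0, 1]")
    total_absorption = surface_area_m2 * absorption_coefficient
    return SABINE_CONSTANT_S_PER_M * volume_m3 / total_absorption


ROCK_ABSORPTION = 0.1  # assumed value for illustration

passage = box_room(2.0, 10.0, 2.0)    # hypothetical narrow passage
cavern = box_room(30.0, 40.0, 15.0)   # hypothetical huge cavern

print(f"passage: {sabine_rt60_s(*passage, ROCK_ABSORPTION):.2f} s")
print(f"cavern:  {sabine_rt60_s(*cavern, ROCK_ABSORPTION):.2f} s")
```

Even with the same wall material, the cavern’s reverb lasts several times longer than the passage’s, which is the change you would hear as you stepped from one into the other.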

These techniques are combined with a realistic model of the human ear, which works almost exactly as your real ears do. The combination of clever algorithms, world design, and audio processing systems results in a truly convincing experience. Before long, we’ll be taking trips to alien landscapes, all from the comfort of our favourite armchairs.
