by Stephan Schutze
When things go wrong
Most of us have a strong preference for talking about our successes and accomplishments.
We hope that everything we work on will be awesome and trouble free, but the reality is that most of us learn infinitely more from our failures than from our successes. There is a reason why people speak of the benefits of working outside of your comfort zone. I also believe we need to be comfortable enough to admit our limitations and admit we do not always know the solution to every problem.
Defect, like many projects, has had its own share of challenges. Some stemmed from lack of knowledge or experience, as I had never attempted a project design like this, while others were hard limitations of the hardware and software involved in production. I even caused some of the issues myself.
In sharing with you the experience of designing the sound and music for Defect, I feel it is important to share these challenges and how they were overcome, or not, as a critical aspect of the development journey. It would be misrepresenting the process to pretend there were no problems to overcome, particularly with such a unique approach to the design.
Too big for my boots
The first real issue should not have come as a surprise for me, but the final solution was not something I expected.
The music system for Defect has a lot going on. The number of Reference Events has grown significantly and the "depth" of nesting I am using is also more than I had originally planned. My system was designed to utilise individual sound files essentially as samples and build up the musical score using FMOD Studio as a real-time sampler of sorts. In general this works, and the system did indeed assemble musical themes and layer them to create the overall score during gameplay. The hurdle we needed to deal with was the cost.
For most of my career as a game audio professional, one major technical requirement has been common to every platform I have worked on: "Use the least amount of memory possible." Efficient design, streaming and audio compression have always been my go-to tools for meeting this requirement. Defect is the first project I have worked on where I have broken that rule.
When we were first able to test-run the system within the game, it did indeed work and the memory usage was actually quite good for what was going on, but the CPU usage was off the scale for what was reasonable for audio. When fully operational, the system was consuming 20-30% of total CPU time. This was affecting the overall performance of the game and was never going to be an acceptable situation.
Our tools allowed us to monitor these levels even at early stages of production, which was extremely helpful as a problem diagnosed early has a better chance of being resolved. I spoke with the team at FMOD to get a better idea of which processes would consume significant CPU cycles. I needed to understand what it was I was doing that was causing the problems and hope that it would not be a core aspect of my design. Through all the stages leading up to this issue I knew optimisation would be required, but it had never occurred to me that the fundamental design might not be possible due to software and processing limitations.
After running some tests on our project, the FMOD team reported back a variety of things that were contributing to the high CPU drain. The fact that there were multiple causes allowed more flexibility in finding a solution, but the overall trend pointed to one thing: my workflow methods, designed to minimise memory usage, were a central part of the problem.
Usually when I add a sound file to a project, I do so with the knowledge that the file becomes more efficient the more sound events it is used in. A sound file that uses memory just by being loaded into the game becomes far more valuable if it is used as part of twenty sound events rather than only one. This is not to say that you can never have unique sound files (often this is the only way to get the best results), but getting multiple uses out of each sound file improves the overall memory efficiency of a project.
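The economics of that reuse are simple to sketch. A minimal illustration in Python, with the file size purely hypothetical:

```python
def memory_per_event(file_size_kb: float, events_using_file: int) -> float:
    """Memory cost attributable to each event sharing one loaded sound file."""
    return file_size_kb / events_using_file

# A hypothetical 500 KB file used by a single event costs 500 KB per event;
# shared across twenty events, the same memory buys twenty times the content.
solo = memory_per_event(500, 1)     # 500.0 KB per event
shared = memory_per_event(500, 20)  # 25.0 KB per event
```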
To this end I tend to experiment with extremes of the processes I apply to my sound files. Pitch alteration is one of the most effective ways of extending the usefulness of a sound file. A sound used as a high, sustained looping note can work extremely well pitched down 2 or 3 octaves and used as a low bass drone. What I discovered working on Defect was that the more extreme the pitch shifting in real time, the more expensive the CPU cost becomes. My usual habit of using sound files in multiple locations pitched up or down several octaves was quickly becoming an expensive choice for the CPU.
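The relationship between pitch shift and playback rate shows where the cost comes from. A rough sketch (this mirrors the octave figures above; it is not FMOD code):

```python
def pitch_ratio(semitones: float) -> float:
    """Playback-rate ratio for a given pitch shift: 12 semitones = 1 octave."""
    return 2.0 ** (semitones / 12.0)

# Pitching down 2 octaves (-24 semitones) plays the file at a quarter of
# its original rate, so a real-time resampler must interpolate roughly four
# output samples for every source sample. The further the shift from unity,
# the more interpolation work the CPU does on every frame.
down_two_octaves = pitch_ratio(-24)  # 0.25
up_one_octave = pitch_ratio(12)      # 2.0
```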
Another drain on the CPU was compression. I had used Ogg Vorbis compression on the sound files as it provided the most extreme level of file compression while retaining audio quality. This was ideal for a game that was planned for deployment on tablets and mobile devices. But again, decompressing sound files requires CPU time, and the complexity of my project meant that a very large number of sound files needed to be decompressed regularly for the system to work. The result was a significant spike in CPU use.
I was very surprised to hear words from our lead programmer that I never thought I would hear as a game audio developer, and especially not for a mobile platform title: "Don't use audio compression for the music. Leave the audio files as standard PCM sound files." At some stage (and I missed exactly when this happened), network speeds and the storage capacity of our devices reached a level where it became more desirable, at least for this project, to use more memory as a trade-off to reduce CPU usage. This was a somewhat disconcerting position to find myself in, but it was very good to have such a simple solution to part of the issue. It would not solve the entire problem, but it went a long way towards helping.
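The memory side of that trade-off is easy to quantify. A rough sketch, with figures assumed for illustration rather than taken from Defect's actual assets:

```python
def pcm_bytes_per_second(sample_rate_hz: int, bit_depth_bits: int,
                         channels: int) -> int:
    """Raw PCM storage cost: no per-frame decode work needed at playback."""
    return sample_rate_hz * (bit_depth_bits // 8) * channels

# 24 kHz, 16-bit mono PCM costs 48 000 bytes per second of audio, several
# times the size of a comparable Vorbis file, but it can be played straight
# from memory with no decompression on the CPU.
one_second_mono = pcm_bytes_per_second(24_000, 16, 1)    # 48 000 bytes
one_second_stereo = pcm_bytes_per_second(24_000, 16, 2)  # 96 000 bytes
```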
What you don't know
Other causes of the high CPU usage were also related to my ignorance of the tool I was working with. For all the time I have spent working with FMOD Studio, developing the training for it and even writing and updating the user manual, there is no way I could know every aspect of its functionality. I doubt any one person does. This highlights one of the benefits of a project this ambitious. It pushes me well outside the comfort zone of working within my existing knowledge and capabilities and forces me to develop and learn.
I learnt from the team at FMOD that for mobile devices such as iOS and Android the default sample rate for sound files was 24 kHz. While this might seem trivial, it was actually very significant for Defect. Regardless of what sample rate I set for the sound files I was using, FMOD would streamline the process at build time by resampling everything to the required rate of 24 kHz, so I had to ensure my banks were set to build 24 kHz sound. The simple act of going through all my sound file assets, converting them to 24 kHz and replacing them in the project meant that I could set the sample rate and audition the sounds in my workflow prior to adding them. This allowed me to maintain quality while preserving efficiency in the project.
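Resampling to a lower rate halves the data for the same duration of audio. A deliberately naive sketch of the 48 kHz to 24 kHz case; a real resampler low-pass filters before decimating to avoid aliasing, which this does not do:

```python
def naive_downsample(samples, factor=2):
    """Keep every `factor`-th sample. Illustrates only the 2:1 size
    reduction when going from 48 kHz to 24 kHz; real resamplers filter
    first to prevent aliasing."""
    return samples[::factor]

source_48k = list(range(8))                 # stand-in for samples at 48 kHz
target_24k = naive_downsample(source_48k)   # half as many samples at 24 kHz
```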
I had approached the entire project with the idea of efficient CPU usage in mind. My overall plan was to limit CPU costs by mapping out efficient routing in the Mixer. This would avoid using too many individual effects thus reducing the overall cost of DSP objects within the game. This remains a valid approach, but in the case of Defect, I found that there were many other factors affecting CPU before I even got close to worrying about the DSP cost.
Being a bit too clever for my own good
One of the musical layers in Defect is the sound each of the ship's energy core components produces. This was originally a low-frequency drone. From a musical point of view, I found I had too much low-frequency content when this combined with percussion instruments and moving bass lines, resulting in the music becoming muddy. So I decided to use a high-frequency drone instead, kind of like a sustained string note hovering above all the other music. It was amazing how much this lightened and cleaned up the musical content.
As I described earlier, I had been using pitch shifting of some middle range notes to create the low drones to improve memory usage. When I switched them to high frequency, I just reversed the pitch shifting. In both cases I was shifting to extremes, which was not great for CPU, but more than that, I was being a bit of a smart-arse in how I created these drones.
I discovered a method many years back of creating looping effects that are not plain looping files. I have used this technique for environmental loops such as the sound of wind, a river, or other variable loops. It uses the Scatter Sound module functions within FMOD Studio to "dovetail" sounds together. Essentially it plays a sustained sound file, such as a river flowing, but before that file finishes it fades in another sound file while the original fades out, so the two crossfade in real time. Applying some other processing to the workflow allows for a more organic, ongoing audio event instead of a simple, noticeable loop. It has been a very effective and quite efficient method of creating environmental sounds that I have used on numerous projects.
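The dovetail idea itself is just an overlapping crossfade. A small sketch of an equal-power crossfade across the overlap region, purely to illustrate the principle (this is not how FMOD implements the module internally):

```python
import math

def equal_power_crossfade(tail, head):
    """Blend the end of one sound (`tail`) into the start of the next
    (`head`) so the combined level stays roughly constant throughout."""
    n = len(tail)
    out = []
    for i in range(n):
        t = i / max(n - 1, 1)  # 0.0 .. 1.0 across the overlap
        out.append(tail[i] * math.cos(t * math.pi / 2)
                   + head[i] * math.sin(t * math.pi / 2))
    return out

# At the start the output is entirely `tail`; at the end, entirely `head`.
mixed = equal_power_crossfade([1.0, 1.0, 1.0], [2.0, 2.0, 2.0])
```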
When it came to the core drone layers, I decided I could use this method for the musical layers. Utilising quite short sustained notes and dovetailing them together produced a good effect. I could even create two of the same module on different tracks and then pan them left and right to create a more open stereo result. The thing about all of this was that it was pointless and self-indulgent. I was being overly clever and applying a complex solution to a simple problem.
I wanted the core component layers to be a simple, high-pitched sustained note: a subtle layer mixed in with the other musical layers. I didn't need a fancy solution here; I needed a short, clean loop of a sound with the right musical qualities. Replacing all of the core sounds, turning them from complex, multi-layered real-time sound objects into simple, short mono loops, was yet another step that reduced the CPU cost of the overall project.
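A clean short loop only needs its loop point to be seamless. One way to guarantee that for a synthetic tone is to make the buffer exactly a whole number of cycles long; a sketch, with the frequency, rate and cycle count chosen arbitrarily for illustration:

```python
import math

def seamless_sine_loop(freq_hz, sample_rate, cycles):
    """Generate a sine buffer holding an exact whole number of cycles,
    so the sample after the last one equals the first: no click at the
    loop point."""
    length = round(sample_rate * cycles / freq_hz)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(length)]

# 33 cycles of 880 Hz at 24 kHz is exactly 900 samples.
loop = seamless_sine_loop(880.0, 24_000, 33)
```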
This last point was a critical part of how I was thinking about this project. Certainly there are many aspects of this project that are complex, bleeding edge and even pretty clever. But those descriptions should not have to apply to every single aspect of the design and in fact should probably NEVER apply to every aspect of the design. It’s a little like creating some new complex and expensive technology to allow you to switch on a light. Most of us are perfectly capable of standing up and hitting a switch. Often the simplest solutions are the best ones.
I am glad we had the issues that we did early on as it helped demonstrate to me that I was being a bit of a smart-arse by trying to over-engineer every aspect of the game just to show that I could. Realising this while I still had time to strip back a few things was good for both this and any future projects I work on.