Last term we said Colin McRae DiRT 2 “is a must own for any racing fan.” Now we’ve been given the chance to talk to the man behind its sound about sound technology, getting a job playing video games, and the differences between the PlayStation 3 and Xbox 360.
Simon N Goodwin is a principal programmer in Central Technology at Codemasters. Programmers in Central Technology write code that all games may need to use, so he has worked on the Colin McRae rally games, Race Driver Grid, Brian Lara Cricket and even the rhythm action game Dance Factory amongst others.
In essence, there are people who say where the sounds should be played, and people who make the sounds, and you tie those together into something that can be played as sound on the game.
That’s right — game audio programmers, sound designers, and technology programmers. Central Technology stuff is not only reused across games, it’s also intended to allow the game programmers to write one game regardless of whether it runs on a PlayStation 3, Xbox 360 or a PC.
My speciality is audio, loading and streaming: getting data off the disc, into memory, onto the screen, and into people’s ears as soon as possible. Audio’s a particularly interesting problem, because in many respects it’s the most real-time aspect of games. This is partly because you’ve got persistence of vision, meaning for humans you’ve only got to put an image on the screen a few tens of times a second for it to become a smooth continuous image. However, there is no persistence of hearing, so if the audio stops for a fraction of a hundredth of a second, then everybody notices. Typically the whole audio system has to work out what it needs to do 192 times a second, each time generating 256 samples. If we miss one of those 192 updates, then that’s a class one bug.
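As a rough illustration of the timing figures quoted above, the following sketch works out how many samples per second they imply and how long the audio system has for each update. Only the two constants come from the interview. The function names are made up for this example, and the observation that the total is close to the common 48 kHz output rate is an inference, not something stated in the interview.

```python
# Figures quoted in the interview.
UPDATES_PER_SECOND = 192
SAMPLES_PER_UPDATE = 256


def samples_per_second(updates=UPDATES_PER_SECOND, samples=SAMPLES_PER_UPDATE):
    """Total samples the audio system must generate each second."""
    return updates * samples


def budget_ms(updates=UPDATES_PER_SECOND):
    """Time available for a single update, in milliseconds."""
    return 1000.0 / updates


print(samples_per_second())   # 49152
print(round(budget_ms(), 2))  # 5.21
```

In other words, every update has to finish in a little over five milliseconds, every time, or the listener hears a gap.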
Recording a JapSpeed engine for Race Driver Grid
What attracted you in particular to game programming?
If you’re writing a game, then you have an opportunity to express your personality. If you’re programming a cash machine and you’re trying to express your personality, either you or the customer are going to get very frustrated!
The fact is that programming computers is as varied a job as writing. You could argue that saying a game programmer is the same thing as an application programmer is about as meaningful as saying they’re both typists.
Considering you spend so long developing them, are many games a gamble?
There are a number of things that go wrong. You might start developing a game, and at the same time presumably other people are doing something similar. Being the first out might mean you’ve bodged it and your game discredits the whole genre, or your marketing campaign means that the second people make loads of money on the back of your marketing because you’ve created the need. Or it might mean that the second people who come along sink without trace because the market has moved on at that stage. That’s just part of the problem.
Do you develop games knowing only some will be a success?
You’ve got to. It’s just the same as the music industry, like any hit driven business. The number one game is probably outselling the rest of the top ten. The top ten is outselling the rest of the top hundred.
Have games ever been developed, but never released?
I’ve known people who have worked in the games industry for five years and never had a game come out, and they worked on several games in that time.
Do bad reviews ruin a game?
Sometimes somebody with a calculator and no imagination adds together all of the review scores, excludes a few for political reasons, and then comes up with an average review rating. What that – like any one-dimensional metric – fails to pick up is that some games polarise reviewers. Some games will get a really high score off some reviewers and a really low score off others, because some people just don’t get it. That tends to be a big risk when it comes to an innovative thing.
Does that squash innovation?
Yes, I think it does. But there are a load of things squashing innovation in the whole business model. You’re better off with a sequel, because you can predict that it’s going to sell a certain proportion of what the previous game did. But there have to be some games which aren’t sequels.
It’s common for gamers to have nice screens, but often no thought is put into listening to sound. Do you think sound technology is overshadowed by graphics?
There’s some great research done in the States, partly at MIT, about how people perceive sound and vision in games. It’s a bit disappointing in some respects for audio people. One of the things which you’d perhaps predict is that good sound makes good graphics look better. Disappointingly, the majority say how much better the graphics look, rather than commenting on the sound!
A more surprising piece of research is that good sound makes bad graphics look worse. I’ve resisted the temptation to say that we should degrade the sound in games in order to make up for the deficiencies of the graphics!
Do you think the quality of audio in games has reached a plateau, perhaps because it’s detailed enough for what humans can perceive as realistic?
It’s not just about realism; a lot of it is about variety, and sustaining the experience after the initial impressiveness. Although I’m a technology programmer, which is only an enabling part, I think there are still substantial technological challenges, and they have to do with things like making sure what you do is always realistic and you don’t end up adding something that actually detracts from the experience.
For instance, you don’t want a pit radio that congratulates you on that amazing manoeuvre when you actually completely fumbled going round a corner and you’re about to smash into a wall. It’s quite difficult to write something that’s capable of telling the difference between a piece of brilliance and a mistake. They’re classic artificial intelligence problems really, and very far from being solved. There’s a lot of scope for art and creativity there.
How do you get the sounds for the game?
For the 2010 Formula 1 game we’re working on at the moment for example, we’ll go off and find somebody who has got a Formula 1 car, and we’ll record it. We typically record eight or ten aspects at a time, mixing the engine or exhaust depending on whether you’re chasing a car or using a bonnet camera in the game.
It’s not as simple as that, because damage is a crucial part of game audio as well. As you smash the car up – as you lose bonnets, windows, doors, exhausts and all sorts of things – the whole sound has got to realistically morph itself from the in-car view into the bonnet camera or something equivalent to that kind of sound.
So you can’t dismantle the real cars to record them?
Well, we have a policy. We don’t dismantle real cars, but we will pop down to a scrap yard with a bag of cash every so often. We’ll take a sledgehammer and a few other bits and pieces, and we’ll come back with an hour or so of recordings of panels and other bits of cars breaking in all possible and imaginable ways.
Recording crash sounds for Race Driver Grid
How do you write music for a game?
We don’t do it in house. Virtually all of the music for Codemasters games is bought in from other people, because you get more variety and you don’t burn out the staff.
We don’t just take the prerecorded stereo music track. If you play a game like Race Driver Grid, we get it in the form of what’s called “stems”, which is to say individual components. So, there will be a basic rhythm track, then you’ll have various levels that can be mixed in on top of that, other instruments and so forth. In Grid all the music was broken down into eight separate layers, which we could play separately in surround as the game went on.
For instance, when you overtook another car an extra level would come in regardless of where you were in the tune at that point, so as to celebrate that. If you crash into something you will get a load of collision sounds and the whole music mix will cut down to a minimum.
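The layered approach described above can be sketched as a set of per-stem gains that change with the game state. This is only a minimal sketch under stated assumptions: the function names are hypothetical, gains are simple on/off values, and "cut down to a minimum" is taken to mean the basic rhythm track alone. A real mixer would presumably fade layers in and out and place them in surround.

```python
def layer_gains(num_layers, active_layers, crashed):
    """Return one gain per stem (1.0 = playing, 0.0 = muted)."""
    if crashed:
        # Assumption: a crash drops the mix to the rhythm track only.
        active_layers = 1
    active_layers = max(1, min(active_layers, num_layers))
    return [1.0 if i < active_layers else 0.0 for i in range(num_layers)]


def mix(stems, gains):
    """Sum one block of samples from every stem, weighted by its gain."""
    length = len(stems[0])
    return [sum(g * s[n] for g, s in zip(gains, stems)) for n in range(length)]


# Eight stems, as in Grid; an overtake brings in one more layer.
before = layer_gains(8, 3, crashed=False)
after_overtake = layer_gains(8, 4, crashed=False)
after_crash = layer_gains(8, 4, crashed=True)
print(before)          # [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(after_overtake)  # [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(after_crash)     # [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

Because every stem keeps playing in time whether or not it is audible, a layer can come in at any point in the tune, which matches the behaviour described in the interview.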
Although ideally everything would be louder and clearer than everything else, in practice making a collision sound really effective involves working out what you can drop out in order to make a hole. Sometimes the loudest thing you can do in a game is silence. One of our sound designers, James Slavin, is very keen on what he calls “percussive silence”, which is deliberate use of silence as counterpoint in a game in order to make the louder points louder.
Key ideas are “do not annoy” — the mantra of Halo sound designer Marty O’Donnell — and don’t wear people out. You’ve got to make sure the game sounds as good after ten minutes as it did in the first few seconds.
It sounds like it has to be quite intelligent from a creative point of view, as if you were creating a movie scene.
Absolutely. One of the misperceptions is that people writing games would rather be doing movies. A movie is a pre-rendered experience. You have total control: you decide what is and isn’t going to happen. A game is an interactive experience, and therefore the player conducts what is going to happen. Our job in audio is to make sure whatever the player does sounds cool!
Are there any interesting developments in games audio technology that people might expect?
One problem is to do with things like occlusion and reverberant spaces. That’s to say, how things sound depends not just on which sound source you’re nearest to, but also on how the direct sound reflects off everything else in the world around, and how the reflected sound reflects off everything else, and the reflections and the reflections and the reflections all layer on. You’ve got to come up with a solution so the game sounds realistic as windows open, doors open and things like that.
Aren’t there hardware processors for that?
There are. The processor we use is one of the co-processors on the PlayStation 3 for doing the audio. It’s one of six co-processors available to the game; the others are used for things like AI, physics and graphics-related functions. The one we get for audio is 32 times more powerful in terms of its mathematical throughput than the supercomputer that did the European medium range weather forecast at the beginning of the 1990s. So, we’ve got a heck of a lot of processor power, tens of billions of floating point operations a second, all used just for the audio. Though that’s still not enough to model the sound of everything bouncing off everything else, bouncing off everything else and so forth.
Are some platforms better at doing some things than others or easier to develop for?
I don’t want to get into a fanboy discussion between Xbox 360 and PlayStation 3. PlayStation 3 was obviously designed for a longer product life, with things like Blu-ray, and it took longer to come out. It’s taken longer to get the maximum performance out of the PlayStation 3 than is possible with the Xbox 360.
Why’s that?
It’s different. The Xbox 360 has three processors, all similar to one of the processors in the PlayStation 3. All of the memory is graphics memory, and four things can access it. Whereas inside a PlayStation 3, you’ve got the same amount of memory but it’s split into two lumps. One of them is controlled by the graphics system, and the other processors can write to that very quickly, but they can’t read from it very fast. The other area of memory is more difficult for the graphics system to look at; the graphics system can read it but it can’t write to it, but the other processors can more readily access it.
You’ve got a situation where the memory dedicated to a processor or system is going to be faster, but you have to put more thought into where you are going to place things in order to get that performance out of it.
If you were just to take a game from a PC and compile it without any changes on an Xbox 360, then you would probably get maybe a fifth or a sixth of the performance that the hardware could give you. If you did the same thing on the PlayStation 3, you would probably get about a fifteenth. It would probably go faster on the Xbox 360, but you could make it go faster on the PS3 if you made an effort to take advantage of the architecture.
So, it’s not really meaningful to say A is better than B any more than it’s meaningful to say that swimming is better than cycling.
The PlayStation is a completely different mindset for a programmer isn’t it?
There’s a good example I can give you. The first game I wrote that was a significant hit was a game called Gold Mine, which ran on a 16K Sinclair ZX Spectrum. The most interesting game I’ve ever worked on is Dance Factory, which was a PlayStation 2 only game. You might say, well, what sort of skills from writing programs that work in 16K on an 8 bit processor are transferable to somebody writing a game on a PlayStation 2? Surprisingly, rather a lot. On PlayStation 2, although the machine has got 40 MB of memory in various places, nine tenths of the processor power is in two co-processors which have respectively 4K and 16K of memory. So if you are somebody who has learnt to make programs work in the 4K or 16K of memory, then you are going to run rings around the people who don’t know whether or not their program is taking 4K or 5K.
Although it might seem that now we have half a gigabyte of memory on a console the skills needed in order to make something fit are no longer applicable, the fact is that if your program on a PlayStation 3 or Xbox 360 fits in the 32K of code and data which is the level 1 cache on the processor, it’s going to be 60 times faster than something that spills out of that space into the 1MB of level 2. If you manage to write a program (or a combination of six programs running on the various processors) that actually exhausts the capacity of the 1MB cache, something that – shock, surprise – uses as much as 0.2 percent of the total memory and regularly goes outside that window, that program is going to be five hundred times slower than if you can fit the key part into 32K. So, there are still jobs – certainly in the games industry, perhaps not in applications – where it’s still a marketable skill to be able to go through a few hundred lines of code and remove a line that you can safely remove.
I’m not really interested in anything that runs less than a million times a second. If it runs less often than that, somebody else can do it.
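The cache figures quoted in this answer can be checked with a few lines of arithmetic. The sketch below uses only numbers from the interview: 32K of level 1 cache, 1MB of level 2, half a gigabyte of total memory, and the quoted slowdowns of 60 and 500 times. Those slowdowns are the interviewee's figures, not measurements, and the function and constant names are invented for this example.

```python
KIB = 1024
L1_BYTES = 32 * KIB             # level 1 cache, as quoted
L2_BYTES = 1024 * KIB           # level 2 cache, as quoted
TOTAL_BYTES = 512 * 1024 * KIB  # "half a gigabyte" of console memory

# Relative cost quoted in the interview, with level 1 as the baseline.
QUOTED_SLOWDOWN = {"L1": 1, "L2": 60, "main memory": 500}


def percent_of_total(nbytes, total=TOTAL_BYTES):
    """Size of a memory region as a percentage of total memory."""
    return 100.0 * nbytes / total


def smallest_level_that_fits(working_set_bytes):
    """Name the fastest level that can hold the whole working set."""
    if working_set_bytes <= L1_BYTES:
        return "L1"
    if working_set_bytes <= L2_BYTES:
        return "L2"
    return "main memory"


print(round(percent_of_total(L2_BYTES), 2))                 # 0.2
print(smallest_level_that_fits(5 * KIB))                    # L1
print(QUOTED_SLOWDOWN[smallest_level_that_fits(40 * KIB)])  # 60
```

The first result confirms the "0.2 percent" remark: the whole 1MB cache is about a five-hundredth of the machine's memory.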
Is having surround sound important?
When you’re watching a film, surround sound is used in order to give you a sense of envelopment. You’ve got to give the impression that the sound is coming from the screen even when people are sitting closer to loudspeakers which are not by the screen.
In games, it’s different though. You can optimise the sound for a single listener playing in a fairly predictable position, meaning surround is not just an enhancement. The tiny letterbox that’s on the screen is a tiny proportion of the world around you. Where do the threats come from? A few of them will come from the screen, but the real threats will be from above you, and below you, and behind you, and either side, and the screen can’t help you with those at all. So, surround sound is for accurate positioning of sounds everywhere other than on the screen.
I suppose most people don’t have surround do they?
We asked around 700 of our customers how they were listening, and we found there was quite a difference between console and PC players. On the consoles, about half the players were listening on stereo loudspeakers, and about a third were listening on surround. The remaining sixth were split half and half between people listening on headphones and people listening in mono.
PC game players are different: about a third listen in surround, and about a third listen in stereo. I think 2 percent said they were listening in mono, but around a third mainly listen on headphones. That’s quite a difference to the console demographic. Five or six times more people are playing with headphones, which means that we take special effort on the PC to make sure the headphone mix is not the same as the stereo speaker mix.
Ideally though, we’d like them to have 7.1, which works a lot better than 5.1 for games. It doesn’t matter so much for cinema, but 7.1 can actually fill in the holes to the sides and to the rear so you’ve got the same angle between the speakers to either side and to the rear as you do for Alan Blumlein’s stereo invented in the 1930s. At that point, all of the directions suddenly become orthogonal. That is, you don’t have a bias in favour of the front.
Strangely the place where it is least important to accurately place sounds is right in front of the player, because they can see them on the screen. If there is a clash between what their ears are telling them, and what their eyes are telling them, their eyes are going to win.
There’s another thing you can do with 7.1 that’s really cool. In the latest Colin McRae game Dirt 2 we’ve got support for what we call “3D 7.1”. In 3D 7.1, you start off with a layout which is compatible with conventional cinema 5.1, but rather than using the extra two speakers to just fill things out to the sides and the rear, you use them to get a symmetric sense of sound up and down. The research into this goes back a few thousand years, back to the ancient Greeks, and the regular polyhedra. It turns out that if you arrange just six things around you in an octahedron, you end up with 15 different pairs which you can pan things between. The technique we use in our games, called “Ambisonics”, means that we can use all the speakers in order to guide sounds. You can use speakers on one side of the listener to guide sounds into intermediate positions between the speakers on the other side. So all of the speakers are working together in order to give you what’s called a “sound field”. Now we can place a sound convincingly at any point around the listener.
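The count of 15 pairs is easy to verify, and the idea of a sound field can be illustrated with the traditional first-order Ambisonic (B-format) encoding equations. The sketch below is illustrative only. The speaker positions are an idealised, axis-aligned octahedron rather than the actual 3D 7.1 layout, the names are hypothetical, and the interview does not say which Ambisonic order or convention the games use.

```python
from itertools import combinations
from math import cos, sin, sqrt, radians

# Idealised octahedron: one speaker on each end of the three axes.
SPEAKERS = {
    "front": (1, 0, 0), "back": (-1, 0, 0),
    "left": (0, 1, 0), "right": (0, -1, 0),
    "up": (0, 0, 1), "down": (0, 0, -1),
}

pairs = list(combinations(SPEAKERS, 2))
# Opposite pairs face each other across the listener; the rest share an edge.
opposite = [(a, b) for a, b in pairs
            if all(x == -y for x, y in zip(SPEAKERS[a], SPEAKERS[b]))]
adjacent = [p for p in pairs if p not in opposite]
print(len(pairs), len(adjacent), len(opposite))  # 15 12 3


def encode_b_format(sample, azimuth_deg, elevation_deg):
    """Encode a mono sample into first-order B-format (W, X, Y, Z).

    Traditional convention: W carries the sound scaled by 1/sqrt(2);
    X, Y and Z carry it weighted by the source direction.
    """
    az, el = radians(azimuth_deg), radians(elevation_deg)
    w = sample / sqrt(2)
    x = sample * cos(az) * cos(el)
    y = sample * sin(az) * cos(el)
    z = sample * sin(el)
    return w, x, y, z
```

The encoded signal describes the direction of the sound independently of any speaker layout. A decoder, not shown here, then derives a feed for every speaker from those four channels, which is how all the speakers end up contributing to each sound.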
In a stereo speaker mix, the furthest you can pan things from one side to another is the 60 degrees or so between your speakers, whereas in headphones if you were to pan things all the way to left and all the way to the right, it would sound like they’re not in the world at all. Unless you have a mohican that stretches into the stratosphere and a very big nose, any sound that plays in one ear and is completely inaudible in the other ear doesn’t sound realistic at all. It’s crucial in our games that headphones and stereo speakers don’t get the same mix. Partly what we’re doing is making sure that interaural crosstalk, where you hear some of the sound in the other ear, is modelled, but we’re also providing cues that give people a realistic idea of whether sounds are coming from above or below. After all, you might say what’s the point of having surround sound, you’ve only got two ears!
The answer lies with your two ears and the pinnae, the flappy bits on the side of your head. They’re not just crinkly in order to use up spare material; they are comb filters, a kind of acoustic filter, designed to derive information about the direction from which sounds come. That has been in our feral past a matter of life or death. By modelling that, it’s possible for us to provide in some ways a better 3D perception on a pair of headphones than we can do with a large array of loudspeakers around the listener.
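For readers unfamiliar with the term, a comb filter mixes a sound with a slightly delayed copy of itself, which cancels some frequencies and reinforces others. The sketch below is the textbook feedforward comb filter, shown only to illustrate the term. It is not a model of a real ear or of the HRTF processing used in the games, and the function names are invented for this example.

```python
def comb_filter(samples, delay, gain):
    """Feedforward comb filter: y[n] = x[n] + gain * x[n - delay]."""
    out = []
    for n, x in enumerate(samples):
        delayed = samples[n - delay] if n >= delay else 0.0
        out.append(x + gain * delayed)
    return out


def first_notch_hz(sample_rate, delay):
    """Lowest frequency cancelled when gain is positive.

    The delayed copy is exactly out of phase when the delay equals
    half a period, so the first notch sits at sample_rate / (2 * delay).
    """
    return sample_rate / (2 * delay)


# A single click comes out as the click plus a quieter, delayed echo.
print(comb_filter([1.0, 0.0, 0.0, 0.0], delay=2, gain=0.5))  # [1.0, 0.0, 0.5, 0.0]
print(first_notch_hz(48000, delay=2))                        # 12000.0
```

Changing the delay moves the notches up and down the spectrum. Reflections off the folds of the outer ear arrive with delays that depend on the direction of the sound, which is why the pattern of notches can act as a direction cue.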
The difficulty with that is it’s listener specific. Your perception of 3D will change just depending on whether you’ve had a hair cut, or whether you’re wearing a hat, though not as much as if you were to chop one of your ears off or find yourself listening through somebody else’s ears. We have a set of 110 different recordings of surround sound played to ears surrounded by different real or dummy pinnae, so that we were able to analyse the most different perceptions of surround sound.
On DiRT 2 on PC, you have a choice of the five or six most different and distinctive types of HRTF (head related transfer function), so that you can cycle through them and pick the one that works best for you. Now when you’re playing in headphones, you get a 3D sense that is pretty convincing.
What advice would you give to students trying to get into games?
In my experience of going through CVs and so forth, it’s much much easier to say no than it is to say yes. I would encourage persistence, but I would also encourage that if you’re applying for jobs, you know a bit about who you’re applying to and what you’re applying for. When you go along for the interview, you interview the people interviewing you as well as letting them interview you, because if you’re going for the right job, they are going to want you and you’ve got to make sure that it’s going to work for both of you.
Certainly my approach in hiring people is not so much to try and find a pluggable component who can do something we haven’t decided yet, but to try and find somebody who will come along, fit in with the people we’ve already got and contribute stuff that only they can do. That way we all have a lot more fun.
For a given level of skill and attention to detail, it’s true you can make more money writing web applications or banking software than you can writing games. But, I’m not sure you can have as much fun.
If you want to find what it’s like to be in a games studio, and if you want to hang out with games designers, developers and programmers, the quick route is Quality Assurance (QA). The job of somebody in QA is to test the game, and when they break it, work out what they did and write it down in such a way that somebody else can do the same thing to reproduce the problem and fix it. That needs very good memory, quite a methodical approach and a great deal of stubbornness.
On the weekend before we submit a game we will probably have a hundred people playing it. We don’t want to manufacture a million coasters.
Has that happened?
Not to us, but it certainly has happened. It’s very very expensive to have to do a product recall. Not everybody can have an online patch.
For programming, do you just need a Computer Science degree?
From an audio point of view, we probably hire more people with a background in electronics engineering than in computer science. Computer scientists tend to be taught theoretical ways of doing things, and not necessarily ways of doing things in real time. Electronics people have a better idea of that.
But you’ve got to be able to program in C++; Java is not C++. You’ve got to be able to work in a large team. There’s no point in being a brilliant programmer if the other people around you have to stop what they’re doing and change it in order to work with what you’ve done. You have to be independent without undermining the independence of other people. It’s probably a good idea to get involved in some large software projects, which is much easier than it was when I started, because you’ve got the Internet and large open source projects.
Crucially though, if you want to write games, you ought to write a game. It’s not difficult to write a game on your own. It doesn’t have to have the same production values, but it does have to be playable, fast, and it has to capture the attention of the player in the first few seconds preferably, certainly in the first few minutes, just as would be the case if you recorded a song and presented that to a record company.
There’s no substitute for actually doing. Like most jobs in Britain, you don’t get hired in order to get trained to do it, you get hired because the person hiring you is confident that you will be able to do it. The easiest way to prove that is to have done it.
It’s definitely possible to get into the games industry. There is a massive demand, particularly for audio programmers, but many people that we interview can’t program well enough.
Can you tell us what you’re working on at the moment?
I’m working on the Formula One game which comes out this year, and I’m working on the technology that’s going to be involved in a number of games that are in a relatively early stage, so my preoccupation at the moment is stuff that’s going to be used internally in earnest in 2012. The fruits of it are likely to be out in 2013 or 2014.
Ben Firshman
This work is licensed under an Attribution-Non-Commercial-Share Alike Creative Commons Licence.