Optimisation, in a game context, is reducing the game's use of resources while preserving its essence. Outside of games, optimisation is a religious issue. The subject attracts a lot of dogma and old rules with forgotten reasons.
Dogmatic assertions about programming usually stem from lessons learned through experience. This original wisdom is often forgotten over time and so it can be as dangerous to follow such a doctrine as it is to challenge it. Here are the most common dogmatic assertions surrounding the subject of optimisation.
The problems of up front optimisations
Premature optimisation is the root of all evil -- Hoare
Here premature means making optimisations before having proof that the optimisation will be needed (such as a runtime profile showing the code as a bottleneck or resource hog). The evil is the loss of simplicity during code development: complicated, hack-ridden code. Unwanted complexity early in development increases the likelihood of errors later on, because the optimised version of the code is easier to misread.
The saying is an example of hyperbole: an exaggeration for rhetorical effect. The idea that all evil in programming stems from optimisation ignores the more fundamental causes of exactly the same problems: laziness, inexperience, pretension, and so on. The real root of all evil in programming is the same as the root of everything in programming: the programmer. Quite often a programmer will label their programming sins as optimisation, but we should not confuse cause with effect.
Many programmers take this assertion as a programming commandment. Certainly, treating it as such prevents programmers from using optimisation as an excuse for poor programming during development. However, it will be of no help once the code is complete and the serious tweaking begins, and it means we miss out on the accumulated gain of numerous small optimisations along the way.
Not all optimisations complicate the code. Some simply trade one style for another with no overall change to complexity. And some optimisations are simpler than their unoptimised counterparts.
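To make this concrete, here is a hypothetical C++ sketch (the function names and the summing task are illustrative, not from the original text). The "optimised" version is no harder to read than the naive one; it simply avoids copying each element:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Naive version: the loop variable copies every string in the container.
std::size_t totalLengthCopy(const std::vector<std::string>& names) {
    std::size_t total = 0;
    for (std::string name : names)  // copies each element
        total += name.size();
    return total;
}

// Optimised version: identical structure, but iterates by const reference.
// No loss of clarity -- arguably a gain -- and no per-element copies.
std::size_t totalLengthRef(const std::vector<std::string>& names) {
    std::size_t total = 0;
    for (const std::string& name : names)  // no copies
        total += name.size();
    return total;
}
```

Both functions return the same result; only the cost differs. This is the kind of optimisation that can safely be adopted up front and recorded in a coding standard.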
To estimate the potential gains of an up front optimisation, we can use information from previous games. If the game is cross platform, the gains are likely to differ for each. Also, changes to the compiler or system libraries may alter the estimates, and new platforms are unlikely to match old ones. But with this information we can choose the optimisations we wish to make up front. And we can document our choices in our game's coding standard to reduce the risk that misreading an optimisation will result in errors.
When to optimise
1. Don't do it. 2. (For experts only) Don't do it yet. -- Jackson
This is a rhetorical device covering two points of optimisation dogma. The first is that optimisation should only be performed by experts; here, 'experts' means those with experience as optimisers. The second is that optimisation should be delayed for as long as possible.
Optimisation is often a counter-intuitive field: problem areas occur where we least expect them, and even profiling tools can mislead programmers inexperienced in optimisation. So long as we measure our results carefully, we can quickly gain the experience required for simple bottleneck removal. This type of optimisation is best carried out at the end of the project, when most of the code is complete and the bottlenecks are obvious and stationary. Although we can make big gains optimising this way, there comes a point where the optimisations are no longer worth the time they take to make. And without a lot more experience, attempting to make optimisations earlier is just as likely to increase resource usage as to reduce it.
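The "measure carefully" habit can be as simple as timing a candidate before and after a change, rather than trusting intuition. A minimal C++ sketch (the helper name is illustrative, not a real profiler API):

```cpp
#include <chrono>

// Time a callable and return the elapsed wall-clock microseconds.
// steady_clock is used because it is monotonic and immune to clock
// adjustments, unlike system_clock.
template <typename F>
long long microsToRun(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
        .count();
}
```

In practice a single run is noisy; repeating the measurement and comparing medians of the old and new versions gives a far more trustworthy answer than any rule of thumb.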
Advanced optimisation techniques take much longer to learn and are gained over multiple games and platforms. Contractors provide a useful way for start-up companies, or those with a high turnover, to buy in this experience on demand. However, advanced optimisation techniques need to be planned into the process before development even begins. By the end of the project, most of the damage will have already been done and it will be too late to change.
More computing sins are committed in the name of efficiency (without necessarily achieving it) than for any other single reason - including blind stupidity. -- Wulf
Many optimisation techniques are very sensitive to context. General rules of thumb rarely help us to find the right optimisations for a given game. Inexperienced programmers often create more resource problems than they solve by applying common optimisation patterns without considering the effect on the rest of the game.
A balanced approach
Our hardware is powerful, but not powerful enough to simply chew through our bottlenecks, nor big enough to let us waste resources. The technology is bleeding edge, and the tools are immature ports of existing compilers, often unsuitable for the platform. Also, as game code has evolved over the years, more and more legacy code is turning up in new games. Such legacy code is often optimised with a different architecture in mind and can perform worse than unoptimised code.
Unlike the fundamentalists, we cannot take a single view as to when to optimise. Optimising everything up front causes problems because our OS libraries, compiler, middleware (and possibly even platform!) may change before our game is released. This may simply negate the effect of up front optimisation, but in the worst case we may have introduced a performance problem that is then too late to fix.
Similarly, we cannot leave all optimisation to the end. Most games are developed iteratively; it would be difficult to develop a game on an unoptimised framework, and doing so would push all the important load balancing of art resources to after the optimisation phase, adding chaos to what is already the most stressful part of development. Not to mention that leaving all optimisation to the end would (once the major bottlenecks had been removed) create a huge backlog of tiny optimisations, each required to make the smallest impact on performance.
Figure 1 shows a graph of a profile of 100,000,000 cycles. The 1,700 functions are sorted from left to right by the number of cycles each one took. The time axis shows the scale of the problem. This image is not broken: the data really does account for 100% of the time sampled. The scale puts some perspective on the problem, but makes the data rather unusable for profiling.
Figure 2 zooms in on the extreme bottom right by looking only at the worst 100 functions from the original profile. The scale is the same though. The worst function accounts for only 4.5% of the total time. This is more like the graph we expect to find useful when optimising. Unfortunately there are other hidden problems with this data.
Firstly, this sample was taken from an area of the game known to have a problem. The sample represents about 10 vsyncs in real time. Since the target for the game is 30 fps, we want about 5 frames rendered in that time. In fact this sample represents just over 1 rendered frame. It is roughly 4 times too slow!
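The arithmetic behind those figures can be reconstructed as follows. The display refresh rate is not stated in the text; 60 Hz is assumed here because it is the common rate that yields "about 5 frames" from 10 vsyncs at a 30 fps target:

```cpp
// Assumption: a 60 Hz display refresh (not stated in the original text).
constexpr double refreshHz    = 60.0;
constexpr double sampleVsyncs = 10.0;
constexpr double targetFps    = 30.0;

// 10 vsyncs at 60 Hz is one sixth of a second of real time.
constexpr double sampleSeconds  = sampleVsyncs / refreshHz;   // ~0.167 s
// At 30 fps we should render 5 frames in that window.
constexpr double expectedFrames = targetFps * sampleSeconds;  // ~5 frames
// The profile covered "just over 1" frame -- approximated here.
constexpr double renderedFrames = 1.2;
// Expected over actual: roughly a 4x shortfall.
constexpr double slowdown = expectedFrames / renderedFrames;
```

The exact slowdown depends on the approximation of "just over 1 frame", but any reasonable reading puts it around a factor of 4.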
Secondly, the code contains areas which compensate, rationing processor time depending on the time available. Making a small improvement in one area of the code could well lead to performance dropping: although we made the code quicker, it is now trying to do more.
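This self-compensating behaviour can be sketched with a toy time-rationed task (hypothetical numbers and function name, not from the original text). The task consumes items until its time slice is spent, so halving the cost per item doubles the work done while the time consumed stays exactly the same:

```cpp
// A task that processes as many items as fit within its time budget.
// Both arguments are in microseconds; the values used are illustrative.
int itemsProcessed(int budgetMicros, int costPerItemMicros) {
    int spent = 0;
    int items = 0;
    while (spent + costPerItemMicros <= budgetMicros) {
        spent += costPerItemMicros;  // the slice fills up regardless of speed
        ++items;
    }
    return items;  // only the amount of work changes, not the time used
}
```

With a 1000 us budget, items costing 10 us each yield 100 items; "optimising" the cost to 5 us yields 200 items, yet the profiler still reports 1000 us spent in this task. This is why a small local improvement can show no change (or even a drop) in measured performance.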
Targeting an unknown platform
More so in PC development than on consoles, the target machine is somewhat unknown ahead of time. This makes all optimisation (up front or delayed) more difficult. However, there are some rules that can be applied to improve the potential of optimisation on the largest number of target machines. This kind of optimisation strategy can only really be worked through using theory, since it would be well beyond the resources of even the largest developer to performance-test each optimisation on every possible combination of compatible hardware.
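One example of such a theory-driven rule (a hypothetical sketch, not from the original text) is to walk memory in the order it is laid out. Sequential access is cache-friendly on virtually every machine, so it needs no per-target testing. Both functions below compute the same sum over a flattened 2D grid, but the first touches consecutive addresses while the second strides through memory:

```cpp
#include <cstddef>
#include <vector>

// Row-major walk: visits grid[0], grid[1], grid[2], ... in order.
// Sequential access keeps the cache and prefetcher happy on almost
// any hardware.
double sumRowMajor(const std::vector<double>& grid,
                   std::size_t rows, std::size_t cols) {
    double total = 0.0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            total += grid[r * cols + c];  // consecutive addresses
    return total;
}

// Column-major walk over the same row-major data: each step jumps
// `cols` elements, defeating the cache on large grids.
double sumColMajor(const std::vector<double>& grid,
                   std::size_t rows, std::size_t cols) {
    double total = 0.0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            total += grid[r * cols + c];  // stride of `cols` elements
    return total;
}
```

The results are identical; only the access pattern differs. Rules like this are defensible from theory alone, which is exactly what an unknown target demands.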