Tuesday 6th October 2009 at 17:17 = Supercomputer Graphics!
Having recently stood reading Micro Mart's in-depth article on ATI's Radeon HD5870 and Total Film's detailed review of James Cameron's Avatar, I see some time- and effort-saving possibilities for really cool computer graphics. The 3D game version of Avatar ought to be outstanding, but imagine the hardware load. That load is partly due to legacy programming systems, specifically within graphics engines. This type of programming could be massively upgraded.

While better games get developed, Mr. Cameron has created his ground-breaking film by making it in a new way. He wrote an imaginative script, borrowed tons of dollars for it, and put together enough mind-melting computer gear to render some outstanding graphics. The other new bit, apart from what will probably be the best 3D system ever, is the virtual camera technology. While he shot footage, he put omnidirectional markers on the physical cameras, continually recording their positions. This let him render views in post-production that differed from what had actually been shot, plus some extra twiddly bits. This type of on-the-fly rendering is something the games industry could develop. He replaced a film-making convention that would have hindered his desired result. I'll return to how this filming technique relates to programming after I try to explain graphics programming.

Code Monkeys Trying To Make Pretty Stuff
Since about 1990, the games industry has put an enormous amount of work into programming optimized for on-the-fly rendering, using pre-render techniques wherever it can. Even quad-core processing with GBs of memory can barely handle what's really needed. Uncompressed, high-definition graphics for three monitors, each set at 1920x1080 at 32-bit colour depth and 30 frames per second, comes to about 712MB per second. For some reason, games companies and gamers seem to like 60-plus FPS, so 1.4GB+ per second is needed. This would be OK without the memory overhead of up to 8 times anti-aliasing and various shading techniques. Soon 4GB of memory on these cards would be used up. This is why games rarely try things this way. Even for one monitor, the raw throughput is about 237MB per second at 30 FPS and about 475MB per second at 60 FPS. Per-pixel rendering is rare; polygon approximation will probably continue for a while. This is because, although pipelining at such rates is well within graphics cards' capabilities, the original data just cannot be generated fast enough. Given that modeling using polygons rather than particle systems is necessary, because not many companies can scan imaginary objects with an electron microscope, there is a clear need for as many clever workarounds as possible.
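The throughput figures above are easy to check. Here is a minimal sketch of the arithmetic, assuming that "MB" means 1024 x 1024 bytes and that every pixel of every frame is sent uncompressed. The function names are my own, made up for illustration.

```python
def framebuffer_bytes_per_second(width, height, bits_per_pixel, fps, monitors=1):
    """Raw, uncompressed bytes per second needed to fill every pixel of every frame."""
    bytes_per_pixel = bits_per_pixel // 8
    return width * height * bytes_per_pixel * fps * monitors


def to_mib(byte_count):
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return byte_count / (1024 * 1024)


# Three 1920x1080 monitors at 32-bit colour, at 30 and then 60 frames per second.
triple_30 = framebuffer_bytes_per_second(1920, 1080, 32, 30, monitors=3)
triple_60 = framebuffer_bytes_per_second(1920, 1080, 32, 60, monitors=3)

# One monitor on its own at 30 frames per second.
single_30 = framebuffer_bytes_per_second(1920, 1080, 32, 30)

for label, value in [("3 monitors, 30 FPS", triple_30),
                     ("3 monitors, 60 FPS", triple_60),
                     ("1 monitor, 30 FPS", single_30)]:
    print(f"{label}: {to_mib(value):.0f} MiB per second")
```

None of this includes anti-aliasing or shading overheads, which only make the totals bigger.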
When anyone writes the graphics parts of a game, they start in a comprehensible language, usually C++. They have or build algorithms that do neat stuff like texture-map a surface quickly.
The old way of texture-mapping a rectangle used two nested loops, one for each horizontal pixel of the surface inside one for each vertical pixel, so that the whole area was covered. In the middle of the inner loop would be an instruction to render the resulting pixel colour, dependent on texture, lighting, and fogging.
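As a rough sketch of that old nested-loop approach, here it is in Python rather than C++ to keep it short. The nearest-texel sampling, the single brightness value for lighting and the simple blend towards a fog colour are my own simplifying assumptions, not how any particular game did it.

```python
def texture_map_rect(texture, width, height, light=1.0,
                     fog_colour=(128, 128, 128), fog=0.0):
    """Fill a width x height rectangle from a texture, one pixel at a time.

    texture is a list of rows, each row a list of (r, g, b) tuples.
    light scales brightness (0.0 to 1.0); fog blends towards fog_colour (0.0 to 1.0).
    """
    tex_h = len(texture)
    tex_w = len(texture[0])
    surface = []
    for y in range(height):                  # outer loop: each vertical pixel
        ty = y * tex_h // height             # nearest texel row
        row = []
        for x in range(width):               # inner loop: each horizontal pixel
            tx = x * tex_w // width          # nearest texel column
            r, g, b = texture[ty][tx]
            lit = (r * light, g * light, b * light)
            # Blend the lit colour towards the fog colour.
            pixel = tuple(int(c * (1.0 - fog) + f * fog)
                          for c, f in zip(lit, fog_colour))
            row.append(pixel)
        surface.append(row)
    return surface


# A tiny 2x2 texture stretched over a 4x4 surface.
checker = [[(255, 0, 0), (0, 255, 0)],
           [(0, 0, 255), (255, 255, 255)]]
stretched = texture_map_rect(checker, 4, 4)
```

Every pixel costs a handful of multiplications, which is exactly the sort of work that later moved into hardware.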
Since graphics cards gained these sorts of loops in hardware form, a modern games programmer just needs to send the card the texture and bump maps, positions of light points and fogging details. This is wasteful if the surface is obscured, because the graphics card would not know which parts were obscured, so hopefully the programmers prevent unnecessary rendering.
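One very simple way a programmer might avoid sending obscured surfaces is sketched below. It assumes every surface is a flat, screen-aligned rectangle with a single depth, and it only drops a surface that is completely hidden behind one nearer opaque surface. The Surface class and function names are hypothetical, and real engines use far more elaborate schemes.

```python
from dataclasses import dataclass


@dataclass
class Surface:
    name: str
    left: int
    top: int
    right: int
    bottom: int
    depth: float          # smaller means nearer the viewer
    opaque: bool = True


def covers(front, back):
    """True if front is opaque, nearer, and its rectangle fully contains back's."""
    return (front.opaque
            and front.depth < back.depth
            and front.left <= back.left
            and front.top <= back.top
            and front.right >= back.right
            and front.bottom >= back.bottom)


def visible_surfaces(surfaces):
    """Drop any surface that is completely hidden behind a single nearer surface."""
    return [s for s in surfaces
            if not any(covers(other, s) for other in surfaces if other is not s)]


scene = [
    Surface("wall", 0, 0, 100, 100, depth=1.0),
    Surface("poster", 10, 10, 50, 50, depth=5.0),     # entirely behind the wall
    Surface("tree", 90, 90, 150, 150, depth=5.0),     # only partly behind the wall
]
to_send = [s.name for s in visible_surfaces(scene)]
```

Only the wall and the tree would be sent to the card; the poster never leaves the main processor.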
One problem is in the way these commands are sent. Writing in low-level assembly would take ten times as long and probably would be no quicker at rendering, because a C++ compiler would make a good job of the conversion. The commands are therefore specific to the Application Programming Interface that is being used. These APIs are usually developed by the graphics cards' manufacturers to make it easier to access the chips' optimal instructions. Programmers rarely get to know how the chips work, so they do not know whether one super-command is actually better than ten less-super commands.
The other problem is that operating system companies have their own competing APIs, so that there is more of a chance that games companies will develop for their systems. I am not just talking about Microsoft and the various Linux companies; I include Sony, Nintendo, Nokia and anyone else who has a unique platform. APIs are largely necessary to prevent the operating system being compromised by other programmers' code. The new top-spec Radeon is the first to use Microsoft's new DirectX 11. The DX11 API has come about because Microsoft did not think everyone would ignore Vista. Rather than re-write DX10 for XP, they have tweaked it and released it as DX11 for Vista and Windows 7. Both the games and graphics cards industries have to compete to fill the rest of the market, just as they did when Nvidia built the first DX10 graphics card.
The API approach to programming has been around for a while, but I don't think it has worked for several applications. Pre-1990, most programming was done without APIs. If programmers wanted twiddly bits, they would use or buy someone else's code for those twiddly bits in the form of libraries. Libraries have an advantage over APIs in that they do not require a specific Integrated Development Environment; instead, it is left up to the programmer to include them within his/her code. An IDE is still basically a version of Notepad with bells and whistles added, often including structured design organisation charts. Using libraries means having to learn how the library programmer organised his/her code, but I do not see this as a bad thing. A programmer can learn some programming techniques just by reading a code library. With open source code this is done all of the time, but even with commercial libraries there is enough scope to make good use of them.

Code Monkeys' Libraries
Two of the best developments in the form of code libraries were UniVESA and WAD. Both were mini-engines of code: UniVESA for Super VGA control and WAD for large file handling. The VESA standard was developed by graphics card companies, and WAD came from the developers of the game Doom. Neither an API nor an IDE was needed to make coding high-res graphics or large file handling easy, just a few lines in programmers' code to refer to these ready-made sections of code. There is no equivalent these days, because both were written for use only under DOS, not Windows. This is why there are some really amazing graphics demonstrations written for DOS by anyone, and the only Windows ones are written by the graphics cards' manufacturers.
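To show how little code the WAD idea needs, here is a minimal sketch of reading one, assuming the file layout Doom used: a 12-byte header (identifier, lump count, directory offset) and 16-byte directory entries (offset, size, eight-character name). The function names are my own, and this is a reader for illustration only, not the original library.

```python
import struct


def read_wad_directory(data):
    """Return (kind, [(name, offset, size), ...]) from the bytes of a WAD file."""
    # Header: 4-byte identifier, then lump count and directory offset
    # as little-endian 32-bit integers.
    kind, lump_count, dir_offset = struct.unpack_from("<4sii", data, 0)
    if kind not in (b"IWAD", b"PWAD"):
        raise ValueError("not a WAD file")
    lumps = []
    for i in range(lump_count):
        # Each directory entry: lump offset, lump size, 8-byte zero-padded name.
        offset, size, raw_name = struct.unpack_from("<ii8s", data, dir_offset + i * 16)
        name = raw_name.split(b"\0", 1)[0].decode("ascii")
        lumps.append((name, offset, size))
    return kind.decode("ascii"), lumps


def read_lump(data, lumps, wanted):
    """Return the raw bytes of the first lump called wanted."""
    for name, offset, size in lumps:
        if name == wanted:
            return data[offset:offset + size]
    raise KeyError(wanted)


# Build a tiny one-lump WAD in memory so the sketch runs without any game files.
payload = b"hello"
header = struct.pack("<4sii", b"PWAD", 1, 12 + len(payload))
directory = struct.pack("<ii8s", 12, len(payload), b"GREETING")
tiny_wad = header + payload + directory

kind, lumps = read_wad_directory(tiny_wad)
greeting = read_lump(tiny_wad, lumps, "GREETING")
```

One big file plus a directory at the end means a game can seek straight to any piece of data without opening thousands of small files.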
The only reason code libraries have not resurfaced is that there is no obvious commercial reason for them to do so. Back in 1992, I bought two commercial libraries for myself, in order to learn more programming. They cost a few hundred pounds each but were well worth it; I still refer to the manuals for the techniques. Doing this these days would still be cheaper than buying an SDK for each intended development platform. Visual Studio alone is $500 per user. There is now a preference for selling one big SDK, then selling bolt-on parts. The problem is that programmers only want the bolt-on parts, along with a quick reference guide. Software houses would buy more bolt-ons if the network-wide SDK had not already cost them so much. Software houses have a lot of money, but it is cheaper for them to spend six months writing an engine than it is to buy something outright that is not really what they wanted. The graphics parts of the engines they write include all the same code as has been written before, because each graphics card only has one efficient way of being used. In some cases, the software houses would have the advantage of owning that code, but often they outsource the code writing anyway, so they do not always even own it. For example, there are many games using most of the C4 engine.

Engines Versus APIs
An engine includes bespoke libraries of code written to specifically handle a genre of software. It is usually the "brain" of a game or art package. APIs are not written for one specific genre and have more general libraries of code, which is good for starting off a piece of software, but not finishing one. An engine is far more difficult to use to start coding new software. It is harder with a new API to see the specific design or coding benefits, whereas with an engine everyone can tell what it can do, and probably even see the limitations early on.

Filmgame Industry?
Here is how I reckon the games industry can do the same. Mr. Cameron now has a software tool to create virtual views from virtual cameras. If a software house/developer in the games industry were to make a tool for viewing virtual scenes for BloodBather 9, they would have the disadvantage of only being able to set up one scene at a time, whereas Mr. Cameron can use the same tool for any film. This is because all of the API, IDE and graphics card programming is based on the “scenegraph” idea. This is the premise that because a modern game has an impressive amount of detail in most views, these views need splitting up into scenes in order for the processors to cope. Mr. Cameron has the advantage of huge processing power on tap, more than even games companies can afford. There would be no need to split a view up if it were processed in a different way. Any given view has a lot of detail to render but is well within the capabilities of the user's processors. What is needed is faster setting up and pre-processing of views, such as getting data from storage or calculating vertices. This is now ludicrously fast using DOS, but the operating system's graphics rendering API gets right in the way. The fastest method for rendering in the old days was called direct video writing. Operating systems now do not permit it, but fortunately graphics cards are so fast that it does not matter much. Although it was called direct video writing, back then video meant anything to do with graphics. Imagine how fast it would be if they did still allow it, with machine code sent directly to the graphics card. The problem that still remains is that the other significant technique, pre-caching, is not allowed either, and it is the API that controls this. A modern game has to give all of the graphics setup instructions over to the graphics chips. Sending so many instructions, possibly four per pixel, is very slow.
It is all very well having them processed quickly by the card, but traffic jams occur in getting them there, and the processor has to wait to send them via the PCI Express bus. It's like posting something by sending the packaging separately, or not sharing a car for the same journey.
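The traffic jam can be put into numbers with a toy model, in which every transfer across the bus pays a fixed overhead before its payload moves at the bus bandwidth. All of the figures below are hypothetical values chosen for illustration, not measurements of PCI Express or of any real card.

```python
def transfer_time(total_bytes, transfers, overhead_per_transfer, bandwidth):
    """Seconds to move total_bytes across a bus in a given number of transfers.

    Every transfer pays a fixed overhead (the "packaging") before its payload
    moves at the bus bandwidth, given in bytes per second.
    """
    return transfers * overhead_per_transfer + total_bytes / bandwidth


# Hypothetical figures for illustration only.
PIXELS = 1920 * 1080
COMMANDS_PER_PIXEL = 4
COMMAND_BYTES = 16
OVERHEAD = 1e-6          # one microsecond per transfer
BANDWIDTH = 4e9          # four gigabytes per second

commands = PIXELS * COMMANDS_PER_PIXEL
payload = commands * COMMAND_BYTES

one_by_one = transfer_time(payload, commands, OVERHEAD, BANDWIDTH)
batched = transfer_time(payload, 1, OVERHEAD, BANDWIDTH)

print(f"sent one command at a time: {one_by_one:.2f} seconds per frame")
print(f"sent as a single batch:     {batched:.4f} seconds per frame")
```

The payload is identical in both cases; only the number of trips changes, which is the car-sharing point in a nutshell.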
Mr. Cameron's pre-processing is done by main processors, not at the last millisecond by rendering processors. It is still quicker to write and/or render something for display in DOS than it is in any graphical operating system, which seems really silly. Surely a graphical operating system should allow not just for its own graphics, but also for the graphics written for programs within that system.
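For anyone who never saw direct video writing, the whole trick was that the screen was just a block of memory. Here is a minimal sketch, assuming the 320x200, 256-colour VGA mode (mode 13h) with one byte per pixel. I use a Python bytearray as a stand-in for video memory, since a modern operating system will not hand over the real thing.

```python
WIDTH, HEIGHT = 320, 200      # assumed VGA mode 13h: one byte per pixel


def make_framebuffer():
    """Stand-in for video memory: a flat block of WIDTH * HEIGHT bytes."""
    return bytearray(WIDTH * HEIGHT)


def put_pixel(framebuffer, x, y, colour):
    """Write one palette index straight into the buffer at column x, row y."""
    framebuffer[y * WIDTH + x] = colour


def fill_row(framebuffer, y, colour):
    """Fill a whole scanline with one block write instead of 320 separate ones."""
    start = y * WIDTH
    framebuffer[start:start + WIDTH] = bytes([colour]) * WIDTH


screen = make_framebuffer()
put_pixel(screen, 10, 2, 15)      # a single pixel at column 10, row 2
fill_row(screen, 1, 7)            # an entire scanline in one go
```

There is no API call anywhere: plotting a pixel is one multiplication, one addition and one memory write.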

Acknowledgements
Thanks to Duncan Peters for sending this link about high-end graphics processing. It inspired this blog when I realised the narrowing gap between supercomputing and games development. With this new tech, Mr. Cameron should be able to make a decent Avatar sequel really fast.