Reposted from Shamus’ Good Robot Devblog
My to-do list grows and shrinks as the project rolls on. I’ll have 20 items on my to-do list one week. I’ll get 13 of them done. Then at our next weekly meeting, those 13 are reviewed. Some are marked as done. Some end up back on the list because my solution was too narrow, or didn’t work in all cases, or I misunderstood the problem. Then a few new issues will get piled onto the list.
So after the meeting my to-do list will be back up to 25 or so items and we’ll begin again. So it goes.
But some items have never been touched. They’ve haunted the bottom of the list, never getting done, never getting looked at. The oldest item on my list now is actually a collection of bullet-points that can all roughly be summed up as “performance problems”. To wit: The game runs too slow.
Not on my machine, mind you. It’s fine on my machine. But on craptops (i.e. really old/slow computers) it runs at about half the framerate it should. So what’s going on?
Finding a Problem You Don’t Have
This is a 2D game. We’re never rendering more than a few thousand polygons per frame, and even old laptops should be able to handle polygon loads in the millions. The game is not performing as well as it should, given what we’re asking the hardware to do.
Unfortunately, I can’t test the problem on my end. I have a decent machine with a mid-range graphics card, and the game hits a stable 60fps for me at all times. I’m trying to solve a problem I can’t detect or measure.
I complained about this on Twitter, and several people suggested using a virtual machine. For the record: You can set up a virtual machine to emulate a computer with low memory or processing power, but last I checked there was no way to usefully do this for graphics hardware. Unless there’s been some amazing leap forward in the graphics capabilities of your average VM over the last couple of years, then this isn’t possible. The emulator would need to take draw calls targeted at the graphics hardware it’s pretending to be, constrain them in various ways, and then hand them off to the actual graphics hardware in my system. It would mean emulating all those horrendous and proprietary driver layers. Even if a VM could do that reliably, I imagine the overhead in such a conversion would be at least as large as the effect I’m trying to measure. To wit: I could spend hours isolating and solving some graphics bottleneck, only to find out I was working around some slow-down inherent in the emulation, and not the problem that real craptops are exhibiting.
I could probably send an email out to our testers and find someone with a suitable machine. But sending builds to people and having them report back their non-technical observations in imprecise terms is a slow and frustrating way to work. I’ll go that way if I have to, but let’s see if I can nail this down on my own.
Fill Rate Problem
We have one important clue to go on:
The problem gets better when the user turns down the resolution.
This actually indicates that the problem probably doesn’t have anything to do with how many polygons we’re drawing.
Let’s imagine our graphics card is good ol’ Bob Ross. There are three main choke points you have to worry about while rendering:
Throughput: How fast can I tell Bob what to paint? Like, “Make a red stroke at the top” or, “Dab some green around this spot here.” Throughput used to be a big problem back in the day, but I haven’t had to worry about throughput problems in about a decade. I’m not saying they never happen, but it’s just not a common case. It’s certainly not a big deal for those of us operating in retro-styled 2D.
Memory: Think of this as “how many different colors of paint can we keep on the palette at once.” (It’s also tied to how big the canvas can be, and a bunch of other details, but it’s not worth getting into right now.) So if we’re trying to paint with seven different colors (texture maps) but we can only hold onto five of them at a time, then it’s going to be a pain in the ass for our poor painter. I tell him, “Dab some green around this spot.” But he looks down and sees he doesn’t have any green on his palette. So he scrapes off the blue to make room, loads up on green, and paints the requested strokes. And then I tell him, “Paint some blue lines right here.” So he has to dump another color to get blue back, and so on.
Memory problems tend to be really modal. Everything runs normally until the instant you run out, at which point the framerate drops from “normal” to “OMG, has the game crashed?”
Fill rate: How much paint is Bob putting down? It doesn’t take much throughput for me to say, “Fill the canvas with blue paint.” And it only uses one color, so the memory cost is minimal. But obviously covering the entire canvas is time consuming when compared to (say) drawing a single stroke. You can imagine a situation where I tell Bob to cover the canvas with dark blue, and then cover it again with white, and then yet another layer of white, over and over, until it’s sky-colored. That’s going to take bloody ages compared to just starting with a lighter shade of blue in the first place.
So based on the described symptoms – that turning down the resolution will greatly speed up the game – I’m thinking we’re dealing with a fill rate problem. I don’t know how. We’re not drawing a lot of stuff or covering the canvas over and over. I’ve had projects that did more and ran faster on worse hardware. But this is the assumption I’m starting with, based on what I’ve been told.
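To put rough numbers on why resolution is the tell-tale sign, here is a minimal sketch. The resolutions, the overdraw figure, and the function name are all made up for illustration and are not measurements from the game. The point is only that fill cost scales with pixels covered, not with polygon count.

```python
def pixel_writes_per_second(width, height, overdraw, fps):
    """Pixel writes per second if every screen pixel is covered `overdraw` times per frame."""
    return width * height * overdraw * fps

# Hypothetical settings: same scene, same polygons, two resolutions.
full = pixel_writes_per_second(1920, 1080, overdraw=10, fps=60)
half = pixel_writes_per_second(960, 540, overdraw=10, fps=60)

print(full)          # 1244160000
print(full // half)  # 4
```

Halving each dimension cuts the paint Bob has to put down to a quarter, even though the instructions he receives are exactly the same. That is why a resolution-sensitive slowdown points at fill rate rather than throughput.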
The problem doesn’t seem to change with the number of robots and particles on-screen. The framerate is low, but stable. So I’ll begin with the idea that the slowdown is related to level geometry.
As a brute-force method of testing, I add a loop to the rendering code. Every time it needs to draw part of the level, it will actually draw it ten times.
So the rendering loop looks like this:
1. Draw the distant background image. (Ten times)
2. Draw the most distant layer of walls. (Ten times)
3. Draw the player’s “flashlight” beam.
4. Draw the medium-distance layer of walls. (Ten times)
5. Draw the glowing “aura” around the walls. (Ten times)
6. Draw the floating dust particles.
7. Draw the enemy robots.
8. Draw the player.
9. Draw another version of the flashlight beam.
10. Draw the non-robot stuff like doors, projectiles, machines, and so on.
11. Draw the particle effects.
12. Draw the big black foreground walls. (Ten times)
At the risk of making this too complicated to follow: Items 4-7 are actually drawn twice. It draws a “dark” version of the whole thing, except masked out so that it only draws inside of shadows. Then it draws a bright version of it, but masked out so that it only draws the parts that are NOT shadowed. Yes, it probably seems like there are faster ways of achieving this effect (like just drawing a black shadow over some parts to darken them) but I don’t want to get sidetracked for thousands of words explaining why we have to do things this way. Shadowing is more than just a brute-force “darken”. Trust me.
The point is that items 4-7 are kind of multiplied by two. So they’re actually being drawn twenty times now.
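The bookkeeping for this test can be sketched as a small table. This is not the game's actual rendering code: the layer names are shorthand for the list above, the tuple layout and the function name are hypothetical, and "items 4-7" is read as the fourth through seventh entries in that list.

```python
# Each entry: (layer name, repeated by the overdraw test?, drawn in both shadow passes?)
LAYERS = [
    ("distant background", True, False),
    ("far walls", True, False),
    ("flashlight beam", False, False),
    ("medium walls", True, True),
    ("wall glow", True, True),
    ("dust particles", False, True),
    ("enemy robots", False, True),
    ("player", False, False),
    ("second flashlight beam", False, False),
    ("doors, projectiles, machines", False, False),
    ("particle effects", False, False),
    ("foreground walls", True, False),
]

def draws_per_frame(overdraw):
    """How many times each layer gets drawn in one frame."""
    counts = {}
    for name, is_level, both_passes in LAYERS:
        repeats = overdraw if is_level else 1  # only level geometry is repeated
        passes = 2 if both_passes else 1       # shadowed pass plus lit pass
        counts[name] = repeats * passes
    return counts

counts = draws_per_frame(10)
print(counts["wall glow"])           # 20
print(counts["distant background"])  # 10
print(counts["dust particles"])      # 2
```

So the glow layer, which is both level geometry and inside the shadowed range, is one of the layers that ends up drawn twenty times.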
So now the level drawing is ten times slower. What’s the effect?
No effect. The game is still running at 60fps.
You can now see the glow effects around the walls. Those are there to add some brightness and color variation. They make the level “pop” a little more. Looking back, I suppose this particular area of the game was a bad place to take these screenshots. The glow layer is usually set up to use a different / contrasting color in relation to the background. In this particular level it’s just a brighter version of the background.
Obviously drawing a rock wall ten times doesn’t make the wall look any different. A wall drawn once looks identical to a wall drawn ten times. But the glow layer is blended with what’s already on-screen. Like the example above where we kept adding layers of white paint to a dark blue canvas, the image gets a little brighter each time.
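The difference between the two kinds of layer can be shown on a single color channel. This is a sketch that assumes the glow uses simple additive blending, which is a guess at the blend mode and not something confirmed here. The function names and brightness values are invented for the example.

```python
def draw_opaque(dest, src):
    """An opaque draw replaces whatever was on the canvas."""
    return src

def draw_additive(dest, src):
    """An additive draw brightens the canvas, clamped to the 0-255 range."""
    return min(255, dest + src)

def repeat(draw, dest, src, times):
    """Apply the same draw to the same pixel `times` times in a row."""
    for _ in range(times):
        dest = draw(dest, src)
    return dest

print(repeat(draw_opaque, 50, 128, 10))   # 128: a wall drawn ten times looks like a wall drawn once
print(repeat(draw_additive, 50, 12, 10))  # 170: the glow gets brighter with every pass
print(repeat(draw_additive, 50, 12, 80))  # 255: enough passes saturate the pixel
```

Either way the graphics card pays for every pass. The opaque layers just hide the evidence.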
Okay, let’s try 20x overdraw.
Nope, still 60fps.
O-kay. Let’s jump to fifty(!) times overdraw.
Finally the framerate dips down a tiny bit.
Okay. Let’s go to 80x overdraw.
Finally the framerate drops into the 30fps range that craptop users are reporting.
I have to say: The power delta between shitty integrated graphics and a proper graphics card is truly staggering. Just imagine how much larger this difference would be if I had a high-end card!
I fly around the game a little. After a while I notice that the game runs more slowly when there’s a lot of rock on the screen. If I’m flying around an empty chamber, the frame rate goes up a little. If I pass through a narrow tunnel, the frame rate dips.
That’s a bit counter-intuitive.
Maybe it’s the shadow system? If there are walls nearby, then it needs to muck around projecting shadows from them. I’ve spent a lot of time tuning this stuff and the system is supposedly pretty lightweight by now, but maybe I bungled something? So I turn off shadows to see if that helps.
The fps drops into the single digits.
That’s… really interesting. I would expect no change. Barring that, maybe a tiny increase. But instead the game slows down to unplayable levels?
So the walls are hiding something that’s slowing down the game? Imagine Bob Ross paints an intricate little homestead with people, vehicles, livestock, and a windmill, but then he paints a hill in the foreground that completely obscures it all. It would take him ages to paint, even though the final product looks like a simple picture. I can’t imagine what could be back there, but I turn off the wall rendering layer so I can get a look.
Hmm. That looks… normal? I don’t see what’s causing…
Hang on. Let me turn off shadows…
Well, shit. I guess that explains it.
That region of solid cyan color shouldn’t be there. Instead we should be seeing through to the background layers. But here we have hundreds and hundreds of glowy spots all stacked on top of each other. The glow layer is supposed to be this sort of “aura” around the walls. There’s no reason to have those glowy circles drawn BEHIND the walls where they can never be seen. When I turned off shadows it had to draw all of that crap twice as much, since it was drawing it in both the shadow and non-shadow passes.
Looking through the code, it looks like I inverted a bit of logic. Instead of “never ever put a glow spot where you can’t see them” we wound up with “always do exactly that”. I made a bug that would slow down the game, and also conceal itself visually. That’s diabolical. And stupid.
It’s a simple fix:
This screenshot sort of undersells just how bad the problem was. That area on the right was stacked thick with glowy spots. Perhaps a third of the screen area looks different in the screenshot, but in practical terms we’re drawing a fraction of the previous load. (Working it out exactly would require doing some really annoying surface-area calculations that I’m too lazy to work out.)
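For a feel of how one flipped condition produces this, here is a toy version. It is not the game's code: the tile grid, the "touches open space" rule, and every name in it are assumptions made up for the example. A glow spot is only worth drawing on a wall tile that borders open space, and the `inverted` flag reproduces the backwards logic.

```python
LEVEL = [
    "#######",
    "#######",
    "##...##",
    "##...##",
    "#######",
]

def is_wall(level, x, y):
    """Anything off the edge of the map counts as solid rock."""
    in_bounds = 0 <= y < len(level) and 0 <= x < len(level[0])
    return (not in_bounds) or level[y][x] == "#"

def touches_open_space(level, x, y):
    """True if any of the four neighbouring tiles is open."""
    neighbours = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return any(not is_wall(level, nx, ny) for nx, ny in neighbours)

def glow_spots(level, inverted=False):
    """Wall tiles that get a glow spot. `inverted=True` mimics the flipped test."""
    spots = []
    for y, row in enumerate(level):
        for x, cell in enumerate(row):
            if cell != "#":
                continue
            visible = touches_open_space(level, x, y)
            if visible != inverted:
                spots.append((x, y))
    return spots

print(len(glow_spots(LEVEL)))                 # 10 spots, all along the edge of the chamber
print(len(glow_spots(LEVEL, inverted=True)))  # 19 spots, all buried inside solid rock
```

In this toy map the inverted test places nearly twice as many spots, and every one of them is hidden. The more rock there is, the worse it gets, which matches the narrow tunnels running slower than the open chambers.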
With this fix in place, the game can keep up a stable 60fps even with the GPU-killing 50x overdraw in place. That is, the game can hit the target framerate even if it has to draw the level fifty times every frame.
I don’t know if this will fix THE problem, but it was certainly A problem.
The final product looks exactly like what we started with, except it’s now achieving that image with a fraction of the work.
I anticipate one objection from graphics programmers:
Hey Shamus, why don’t you draw the level front-to-back, instead of back-to-front? If you did that, then this bug would have been harmless. None of the extra glow crap would ever have been drawn, since it would have been skipped on account of being behind the walls.
That’s how the project was originally when I was working solo. The art style was all hard-edged pixels. But when I teamed up with Pyro, the artists wanted to smooth all the coarse edges off the pixels. Giving pixels soft edges means that you have to draw stuff back-to-front, or there would be ugly transparency problems where the walls met the background.
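The transparency problem can be seen on a single pixel. This is a simplified model and not how any particular graphics API works in detail. It assumes a depth test that rejects anything behind what has already been drawn, and that blended pixels still write depth. The colors, depths, and the `render` function are invented for the example.

```python
def render(draws, clear_color=0):
    """Composite one pixel. `draws` is a list of (depth, color, alpha_percent)
    in submission order; a smaller depth means closer to the camera."""
    color = clear_color
    nearest = float("inf")
    for depth, src, alpha in draws:
        if depth >= nearest:
            continue  # depth test: this fragment is behind something already drawn
        color = (src * alpha + color * (100 - alpha)) // 100
        nearest = depth
    return color

background = (10, 200, 100)  # far away, bright, fully opaque
wall_edge = (1, 40, 50)      # close, dark, 50% transparent soft edge

print(render([background, wall_edge]))  # 120: back-to-front, edge blends with the background
print(render([wall_edge, background]))  # 20: front-to-back, edge blends with the empty canvas
```

Drawn front-to-back, the soft edge gets mixed with whatever the canvas was cleared to, and the background that should show through is skipped. With hard-edged pixels every draw is fully opaque, so the order only affects speed and not the final image.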
My to-do list is getting short these days. The game will be coming out in a few months. The artists will probably wrap up their work around the end of the year, leaving Arvind and me to worry about bugs and marketing.
So, uh… buy my game?