Reposted from Shamus’ Good Robot Devblog.
The more I work with a team of people, the more I’m convinced that having open, accessible game data is the path of least resistance. Why make buggy, lame, proprietary in-house tools when you can just stick all the data into text files and let people use their text editor of choice? Why spend time and effort packing simple data into binary files when you can leave it in plain text? As long as the data isn’t binary in nature (text-based 3D models and sound would probably not be a good idea) then open files are a win for everyone: Easier for coders, more comfortable for the artists, and more mod-friendly for enterprising players.
Of course, I’ve always thought this way, but I assumed it was bias from all the years I worked at Activeworlds, which focused on user-generated content, similar to Second Life or Roblox. I often wondered if I’d gravitate towards obscured data if I ever found myself working on a “proper” game.
But no. But if anything, I’m more pro-“open data” than ever.
But what if the users edit their data files to cheat and give themselves a billion hitpoints?
Yeah. Not a concern. Stopping single-player cheating is a lot like stopping pirates: It can’t be done, but if you’re really creative and determined you can waste a lot of time and money trying.
Early in the project, I had a lot of stuff hard-coded. Certain gameplay parameters were set in stone, and you couldn’t change them without changing the source code and patching the game. That’s basically fine for a one-person team. When I’m working alone, it’s just as easy to change a bit of source code as it is to change some text file of game data. But once Pyrodactyl joined, more and more of the game migrated out of code and into text files the artists controlled.
The only downside is…
It takes HOW LONG to launch the game?
The game used to launch in well under two seconds. Over the past few months, that number has been creeping upwards. This week I noticed it topped ten seconds. That seems unreasonable. We’re certainly not processing five times as much data at startup.
If anything, the game should be loading a little faster than before. Originally all of the level geometry drew from a single gargantuan atlas texture, which contained the tilesets for every level in the game. This texture was always read at startup, which probably had some modest time cost. But one of the problems Arvind ran into with Unrest is that some craptops can’t handle textures over 2048×2048. So, we chopped up the scenery textures so it only loads one at a time.
So what’s causing this slowdown?
That’s a Really Good Question.
I’ve always said that Microsoft makes horrible software, except for their software to make software, which is absolute excellence. When I left Activeworlds, I lost access to the corporate license of Professional Edition of Visual Studio and had to switch to the Hippie Freeloader Edition. (Called “Express”. Because nothing speeds your progress like missing features.)
It’s still great, but it was missing a good profiler. Profiling tools analyze the program as it runs, and show you where all the CPU cycles are going. It’s not something you need every day, but when you need it you really need it.
This year – in a mad fit of behavior so un-Microsoft it’s borderline suspicious – they began offering the super-premium version to the public, for free. So I have access to profiling tools again. Let’s see what it says about Good Robot:
I guess I sort of gave it away in the intro, but the result was actually a surprise to me. I would have expected the time to go into loading bulky texture data. (Nope. Nearly instant.) Maybe loading the 70 different audio files that comprise the sound effects for the game? (Nope. Trivial.) Or maybe the bit of code that examines the individual sprites a pixel at a time so that it can do per-pixel hit detection? (Eh. That’s a little heavy, taking around 11% of the startup time. But it’s not the problem.) Launching the engine? (Trivial.) Loading the shaders? (So fast you basically can’t measure it.) Loading the interface? (Not really.)
But no. The long load times were the result of reading text files.
Almost 40% of the startup cost is spent in EnvInit (), which is nothing but text parsing. And it’s not even all the text parsing. There are other text parsers at work elsewhere. They’re doing more in less time, because those other parsers are working on XML!
For those that don’t know: XML files are like text files, except they’re huge and cumbersome and barely readable. My two favorite XML quotes are:
“XML is a classic political compromise: it balances the needs of man and machine by being equally unreadable to both.” – Matthew Might
“XML combines the efficiency of text files with the readability of binary files” – unknown
But I probably shouldn’t be chucking stones at the XML camp while I’m living in the glass house that is my own text parser. XML might be huge and ungainly, but the game is loading it far, far faster than the rest of our game data, which is in a far simpler format. A great deal of time is spent reading the weapons in do_projectile_inventory (). The weapons file is a minuscule 25k, and holds just 48 weapons. It’s ludicrous that it should take 17% of the startup time reading that file. In the time the game spends laboriously reading that file, you could probably read it in from disk and write it back out again a hundred times over.
In an ideal world, there would be a single clog somewhere, one bit of inefficient or malfunctioning code that’s being called hundreds or thousands of times. But this is not an ideal world, and it seems like the problem is smeared all over the place. The problem is that I’m using strings poorly.
Way back in 2010, I had a post about what a pain in the ass it is to have to manually manage the memory of strings of text. You have to do lots of fussy work to allocate memory and make sure you handle it just right. Something that could be done in 2 lines of code in another language might take six or seven to do properly in C. And if you messed up you’d waste memory, slow down your program, or crash. So it was a lot of work for a lot of danger.
But that’s not how today’s hip youngsters do things. No! They have C++, which comes with sexy new ways to handle text.
I’ve been gradually transitioning to the “new” way of doing things over the last few years. (“New” is a really strange term in programming. I might mean “since last week” or “since the mid-90’s”, depending on context.) But there’s a difference between learning to do something and learning to do it well. I haven’t learned – or even bothered to investigate – the common pitfalls of the new system.
As a result, my text parser was doing a ton of needless work. Something like this:
It’s time to compare two strings!
But we don’t care about upper / lowercase. So take the first string and MAKE A COPY OF THE WHOLE THING, then convert the individual characters to lowercase, then take the contents of the copy and overwrite the original.
We also don’t care about whitespace. If we’re comparing “Apple” and “Apple “, they should match, even though the second one has a bunch of spaces after it. So MAKE A COPY OF THE WHOLE STRING.
Now take that copy AND MAKE ANOTHER COPY, and pull the spaces off the end. Now MAKE ANOTHER COPY and pull the spaces off the beginning. Now give the new, cleaned-up version of the string back to the original.
Now do ALL of that to the second string.
THEN compare them.
That doesn’t sound that bad, right? Sloppy, yes. But, copying a few extra bytes a few extra times should be harmless on a modern computer.
Except, when you need to look for thing A in group B. If thing A is a zucchini and I need to look for it in the list of 48 different vegetables, then I need to do that whole block of actions for every veggie until I find a match. As the list grows, the amount of extra copying can quickly get out of hand. The time cost can ramp up quickly, particularly in my implementation where I was frequently searching through lists within lists.
Does This Matter?
Well, not really. Not in the long run. This was something wrong, and it was wrong in an interesting and unexpected way, which is why I wrote about it. A ten-second load screen is not a horrible sin, and there are many games that take much longer to do much less.
But I’m also kind of picky when it comes to loading times, because loading slows down iteration and iteration is important. People love to say that “premature optimization is the root of all evil.”, but I think there’s a really good case for keeping the startup times snappy, particularly as team size grows.
Most of the people working on your game are probably not programmers. They’re artists. They’re making constant changes. Change the color. Add a few hitpoints. Swap out a new wall texture. Drop in a new sound effect. Tweak a particle effect. Replace a 3D model. You can’t make every part of the game editable while the game is running, which means your artists are going to be restarting the game often. In some cases, they might spend a majority of their time sitting and waiting for the game to start.
If you’ve ever tried to photoshop a large image on a crappy computer then you probably know what I’m talking about. It’s frustrating to have to wait a few seconds to see the results of every action. Just a two-second delay can be maddening. A ten second delay is even worse. It’s long enough to break your flow, but not long enough that you can (say) check Twitter or look for cat pictures. It’s a gap of dead time and it can kill creativity. I don’t know how other people respond, but when I have a lot of creative latency like this it really increases the odds that I’ll stop working as soon as it’s “good enough” instead of polishing things until it’s “great”.