Hacking Roller Coaster Tycoon with Genetic Algorithms

I used to play a ton of Roller Coaster Tycoon when I was a kid. I loved the game but I was never very good at making the roller coasters. They always felt too spread out, or too unnatural looking. As a ten year old I idly wondered about writing a computer program that was really good at playing the game. What sort of parks would it make? How would a computer approach the freedoms inherent in an empty park? What would we learn about the game engine from doing so?

A cool coaster

Sadly, this one wasn't generated by a computer

In case you're not familiar, Roller Coaster Tycoon is a amusement park simulation game most notable because the entire game was written in x86 assembler by Chris Sawyer.

Finally a few months ago, I had the tools and the free time available to work on this, and I made some progress toward writing a program that would generate cool looking roller coasters. Let's examine the parts of this program in turn.

Interacting with the Game

So let's say you have a program that can generate roller coasters. How do you actually put them in the game, or integrate them into your parks?

Fortunately, Roller Coaster Tycoon has a format for saving track layouts to disk. Even more amazingly, this format has been documented.

4B: departure control flags
4C number of trains
4D number of cars per train
4E: minimum wait time in seconds
4F: maximum wait time in seconds

Once you decode the ride data, it follows a format. Byte 0 stores the ride type - 00000010 is a suspended steel roller coaster, for example. Some bytes indicate the presence of flags - the 17th bit tells you whether the coaster can have a vertical loop. And so on, and so on.

To compress space, RCT used a cheap run-length encoding algorithm that would compress duplicates of the same byte. But once you encoded/decoded the file, it was super easy to get it running in the game.

So, great! I could write my coaster generator in any language I wanted, write out the file to disk, then load it from any of the parks.

Getting Track Data

There are a lot of track pieces in the game, and I needed to get a lot of data about each of them to be able to make assertions about generated roller coasters. As an example, if the track is currently banked left, which pieces are even possible to construct next?

A steep upward slope track piece increases the car height 4 units. A sharp left turn would advance the car 3 squares forward and 3 squares left, and also rotate the car's direction by 90 degrees. I had less than zero interest in doing this all by hand, and would probably make a mistake doing it. So I went looking for the source of truth in the game..

OpenRCT2

Literally the same week that I started looking at this, Ted John started decompiling the Roller Coaster Tycoon 2 source from x86 into C, and posting the results on Github. More importantly (for me), Ted and the source code actually showed how to read and decompile the source of the game.

The repository shipped with an EXE that would load the C sources before the x86 game code. From there, the C code could (and did, often) use assembler calls to jump back into the game source, for parts that hadn't been decompiled yet.

This also introduced me to the tools you use to decompile x86 into C. We used the reverse engineering tool IDA Pro to read the raw assembly, with a shared database that had information on subroutines that had been decompiled. Using IDA is probably as close as I will come to a profession in code-breaking and/or reverse engineering.

Most of the time with IDA involved reading, annotating the code, and then double checking your results against other parts of the code, the same way you might annotate a crossword puzzle. Other times I used guess and check - change a value in the code, then re-run the game and see what specifically had changed, or use debugging statements to see what went on.

So I started looking for the track data in the game. This turned out to be really, really difficult. You would have a hunch, or use the limited search capability in the game to search for something you thought should be there. Ultimately, I figured out where the strings "Too high!" and "Too low!" were being called in the engine, figuring that track height data would have been computed or used near those points.

This was only part of the solution - it turns out that track data is not stored in one big map but in several maps all around the code base. Some places store information about track bank, some store information about heights and it's tricky to compile it all together. Ultimately, I was able to figure it out by spending enough time with the code and testing different addresses to see if the values there lined up with the pre-determined track order.

Visualizing rides

With a genetic algorithm you are going to be generating a lot of roller coasters. I wanted a quick way to see whether those roller coasters were getting better or not by plotting them. So I used Go's image package to draw roller coasters. To start I didn't try for an isometric view, although that would be fun to draw. Instead I just plotted height change in one image and x/y changes in another image. Running this against existing roller coasters also revealed some flaws in my track data.

A fitness function

A good fitness function will have penalties/rewards for various pieces of behavior.

  • Is the ride complete?

  • Does the ride intersect itself at any points?

  • Does the ride respect gravity, e.g. will a car make it all the way around the track?

  • How exciting is the ride, per the in-game excitement meter?

  • How nauseating is the ride, per the in-game excitement meter?

The first two points on that list are easy; the last three are much more difficult. Finding the excitement data was very tricky. I eventually found it by getting the excitement for a "static" ride with no moving parts (the Crooked House) and searching for the actual numbers used in the game. Here's the function that computes excitement, nausea and intensity for a Crooked House ride.

sub_65C4D4 proc near
or      dword ptr [edi+1D0h], 2
or      dword ptr [edi+1D0h], 8
mov     byte ptr [edi+198h], 5
call    sub_655FD6
mov     ebx, 0D7h ; ''
mov     ecx, 3Eh ; '>'
mov     ebp, 22h ; '"'
call    sub_65E7A3
call    sub_65E7FB
mov     [edi+140h], bx
mov     [edi+142h], cx
mov     [edi+144h], bp
xor     ecx, ecx
call    sub_65E621
mov     dl, 7
shl     dl, 5
and     byte ptr [edi+114h], 1Fh
or      [edi+114h], dl
retn
sub_65C4D4 endp

Got that? In this case 0xD7 in hex is 215 in decimal, which is the ride's excitement rating. I got lucky that this value is static and not changed by anything, which meant I could search for it from outside the binary. This is then stored in the ride's location in memory (register edi), at the offset 0x140. In between there are a few subroutine calls, which shows that nothing is ever really easy when you are reading x86, as well as calls to functions that I have nothing besides hunches about.

Anyway, when you turn this into C, you get something like this:

void crooked_house_excitement(rct_ride *ride)
{
    // Set lifecycle bits
    ride->lifecycle_flags |= RIDE_LIFECYCLE_TESTED;
    ride->lifecycle_flags |= RIDE_LIFECYCLE_NO_RAW_STATS;
    ride->var_198 = 5;
    sub_655FD6(ride);

    ride_rating excitement  = RIDE_RATING(2,15);
    ride_rating intensity   = RIDE_RATING(0,62);
    ride_rating nausea      = RIDE_RATING(0,34);

    excitement = apply_intensity_penalty(excitement, intensity);
    rating_tuple tup = per_ride_rating_adjustments(ride, excitement, intensity, nausea);

    ride->excitement = tup.excitement;
    ride->intensity = tup.intensity;
    ride->nausea = tup.nausea;

    ride->upkeep_cost = compute_upkeep(ride);
    // Upkeep flag? or a dirtiness flag
    ride->var_14D |= 2;

    // clear all bits except lowest 5
    ride->var_114 &= 0x1f;
    // set 6th,7th,8th bits
    ride->var_114 |= 0xE0;
}

And we're lucky in this case that the function is relatively contained; many places in the code feature jumps and constructs that make following the code pretty tricky.

So this one wasn't too bad, but I got bogged down trying to compute excitement for a ride that had a track. The function gets orders of magnitude more complex than this. One positive is, as far as I can tell, excitement and nausea ratings are wholly functions of overall ride statistics like the vertical and lateral G-forces, and there's no accumulator per track segment.

Most of the computation involves multiplying a ride statistic by a constant, then bit shifting the value so it can't be too high/influence the final number by too much.

And sadly, this is where the project stalled. It was impossible to test the C code, because the track computation functions were buried four subroutines deep, and each of those subroutines had at least 500 lines of code. Decompiling each of these correctly, just to get to the code I wanted, was going to be a massive pain. There are ways around this, but ultimately I got back from vacation and had to focus on more pressing issues.

Conclusion

You can hack Roller Coaster Tycoon! There are a bunch of people doing interesting stuff with the game, including improving the peep UI, working on cross compilation (you can play it on Macs!), adding intrigues like the possibility of a worker strike, removing limitations based on the number of bytes (you can only have 255 rides, for example), and more.

It's been really fun having an utterly useless side project. I learned a lot about registers, calling conventions, bit shifting tricks, and other things that probably won't be useful at all, for anything.

I will definitely revisit this project at some point, hopefully when more of the game has been decompiled, or I might try to dig back into the x86/C more on my own.

Liked what you read? I am available for hire.

2 thoughts on “Hacking Roller Coaster Tycoon with Genetic Algorithms

  1. rcthelp

    Interesting idea to generate RCT coasters programmatically. It would be good to know if you restart this project.

    Incidentally, the original RCT TD6 TID is at http://tid.rctspace.com/TD6.html . You link to an archive produced by those who thought it was gone for ever. We’ve brought it back, and the original author Doctor J(ames Hughes) has been able to update it.

    Reply
  2. Seppe

    Awesome read,
    too bad that the project has stalled. Would’ve been really nice and interesting to see what kind of rollercoasters an genetic AI would’ve come up with.

    Hopefully you’ll pick it up in the future!

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Comments are heavily moderated.