Hello everyone.
I hope you’re doing well.
Today, I want to introduce you to a very special person, Jeremy.
Jeremy is the first enemy in the game. Well, first ‘true’ enemy in the game. He’s a basic follower who attempts to kill you with contact damage (40 right now, but will probably be reduced in balancing).
Yeah it’s literally a square of doom.
The Entity Registry
As for the Entity Registry, it has a few new features! The global Entity Registry, which will now be called the GER, automatically resizes itself to be as small as it can be while still fitting every entity.
The GER now starts by being 8,192 slots large (to allow for initialization to use many slots). It will resize itself down if there aren’t enough entities, and up if there are too many entities.
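To make the resize rule concrete, here is a minimal sketch of one way to pick the next size. The thresholds are my assumptions for illustration, not the GER's real numbers: it doubles when every slot is in use, halves when under a quarter of the slots are in use, and never goes below a hypothetical floor of 32 slots. The name `nextCapacity` is also made up.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical floor; the post does not state the real minimum size.
constexpr std::size_t kMinSlots = 32;

// Returns the capacity the registry should move to, given its current
// capacity and the number of live entities. Assumed policy: double when
// full, halve when less than a quarter full, otherwise stay put.
std::size_t nextCapacity(std::size_t capacity, std::size_t live) {
    assert(live <= capacity);
    if (live == capacity) {
        return capacity * 2;   // too many entities: grow
    }
    if (live < capacity / 4 && capacity > kMinSlots) {
        return capacity / 2;   // not enough entities: shrink
    }
    return capacity;           // comfortable: leave it alone
}
```

Growing at "full" but shrinking only at "under a quarter" leaves a gap between the two thresholds, so a registry hovering around one size does not resize on every update.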
Notably, with the Fountain Stress Test, we see the game’s update rate go from 22 to 62 updates per second when the Entity Registry shrinks from 8,192 slots to 32 slots. Clearly, the Entity Registry being full of empty slots causes slowdowns.
These slowdowns occur because the entity collision function must read the entire GER and test each entity to see if it can collide with the checking entity. If there is less data in the GER, there’s less to look through.
It would be great if we could just check the portion of the slots that is actually full, but that doesn’t work because deleting entities leaves holes between the full slots.
Or can we?
Maybe we could optimize the Entity Registry (to make everything contiguous) and then save the number of contiguous entities, then only check until that?
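Roughly, the idea looks like the sketch below. This is not the actual EntityRegistry code: `Entity`, `compact`, and `countCandidates` are hypothetical stand-ins, and the registry is assumed to be a flat array of pointers where an empty slot is null.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Entity { int id; };  // hypothetical stand-in for the real entity type

// Moves every non-null slot to the front, preserving order, and returns
// the number of live entities (the length of the contiguous prefix).
std::size_t compact(std::vector<Entity*>& slots) {
    std::size_t write = 0;
    for (std::size_t read = 0; read < slots.size(); ++read) {
        if (slots[read] != nullptr) {
            if (read != write) {
                slots[write] = slots[read];
                slots[read] = nullptr;
            }
            ++write;
        }
    }
    return write;
}

// Scans only the contiguous prefix instead of the whole registry and
// counts every entity other than `self` (a placeholder for the real
// collision test).
std::size_t countCandidates(const std::vector<Entity*>& slots,
                            std::size_t liveCount, const Entity* self) {
    assert(liveCount <= slots.size());
    std::size_t n = 0;
    for (std::size_t i = 0; i < liveCount; ++i) {
        if (slots[i] != self) ++n;
    }
    return n;
}
```

The saved count is only good until the next insertion or deletion, so anything that touches the registry has to either update it or compact again.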
After implementing, even with an EntityRegistry size of 8,192 slots, we still get 58 updates per second. The only issue? It makes the game way less stable. I’ve undone it for now, but we’ll come back to it when EntityRegistry is a bit more stable.
There it is, the cardinal sin of trying to speed up your app: premature optimization. We’ll stick to resizing for now.
Problems with Stability
So for our Entity Registry, we have an issue. An entity can be deleting itself and being drawn at the same time. The issue is best described as follows:
Essentially, Graphics sets RenderLock to true immediately after Game passes its guard that makes sure renderLock is false. Moving the guards doesn’t solve the issue – the object will never be deleted safely.
My solution? Mark objects for deletion instead of deleting them outright. Objects marked for deletion are immediately marked invisible and inactive. They are assigned a number of updates until they will be deleted, currently 8. The Graphics thread will not draw an entity that is being deleted. This also has a slight benefit due to truthiness of booleans.
You see, when an object is invalid, it usually ends up being either 0xFD, or 0xDD, or 0xCC, or random data. This means that deleting, our flag that an entity is being deleted, is extremely likely not to be 0. If it is any value except 0, the Graphics thread knows not to draw it (which cuts down on invalid draw calls).
The only real issue now is that the Graphics thread can be reading the properties of ent while it is being deleted by the Game thread. Oh well. If it gets deleted before the checks happen and the memory is zeroed, the first two conditions will both report true, causing it to exit early.
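For reference, mark-for-deletion can be sketched like this. The field and function names (`visible`, `active`, `deleting`, `updatesLeft`, `markForDeletion`, `tickDeletion`, `shouldDraw`) are assumptions made for illustration, not the game's real layout; only the delay of 8 updates comes from above. The sketch is single-threaded: the fields are plain values, so on its own it does not make cross-thread reads safe.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint8_t kDeleteDelay = 8;  // updates to wait before freeing

struct Entity {                    // hypothetical layout
    bool visible = true;
    bool active = true;
    std::uint8_t deleting = 0;     // 0 = alive, anything else = being deleted
    std::uint8_t updatesLeft = 0;  // countdown until the entity is freed
};

// Hides the entity right away and starts the countdown.
void markForDeletion(Entity& e) {
    e.visible = false;
    e.active = false;
    e.deleting = 1;
    e.updatesLeft = kDeleteDelay;
}

// Called once per game update. Returns true when the countdown has run
// out and the caller can actually free the entity.
bool tickDeletion(Entity& e) {
    if (e.deleting == 0) return false;
    assert(!e.visible && !e.active);  // a dying entity is never shown
    if (e.updatesLeft > 0) --e.updatesLeft;
    return e.updatesLeft == 0;
}

// Draw-side guard: any non-zero `deleting` value means "skip it".
bool shouldDraw(const Entity& e) {
    return e.visible && e.active && e.deleting == 0;
}
```

Because the guard treats every non-zero value of `deleting` as "skip it", garbage bytes in that field fail the check just like a properly marked entity does.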
Now, let’s get into the meat and potatoes of our issue.
The Lock Problem
Entity Registry entries are much like Entities. They cannot be modified while the Graphics thread is drawing them. For that purpose, we have a dual lock system on the Entity Registry. The Graphics thread can lock the registry using its Graphics Lock, and the Game thread can lock it using the Master Lock. (Originally there were Graphics, Game, and Master locks, but Entity Registry modifications happen on the Game thread.)
This is generally the update flow for the game and graphics threads fighting over control of the Entity Registry. Most of the time, the registry is Graphics Locked. Graphics operations just take way longer than game operations (which usually just make the registry contiguous).
In fact, we see that the game is spending over ten percent of its time waiting for locks to clear! For Graphics, it waits 0.0037% of the time.
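If you want to collect a number like that yourself, one way is to time every lock acquisition and divide by the thread's total run time. This is only a sketch of that approach, not the game's profiling code: `WaitStats` and `timedLock` are hypothetical names, and `lock` can be anything with a `lock()` method.

```cpp
#include <cassert>
#include <chrono>

// Accumulates how long a thread spent blocked versus how long it ran.
struct WaitStats {
    std::chrono::nanoseconds waited{0};
    std::chrono::nanoseconds total{0};

    // Percentage of total time spent waiting on locks.
    double waitPercent() const {
        if (total.count() == 0) return 0.0;
        return 100.0 * static_cast<double>(waited.count()) /
               static_cast<double>(total.count());
    }
};

// Acquires `lock` and adds the time spent blocked to `stats.waited`.
template <class Lockable>
void timedLock(Lockable& lock, WaitStats& stats) {
    const auto start = std::chrono::steady_clock::now();
    lock.lock();
    const auto elapsed = std::chrono::steady_clock::now() - start;
    stats.waited +=
        std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
}
```

The thread adds each frame's duration to `total` itself, then reads `waitPercent()` whenever it wants to report.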
This is an issue, especially with the game thread having tight deadlines as to its maximum allowed time. I’m working on the lock issue and will get back to you soon. We tried multithreaded rendering, but…
Yeah, the Direct2D renderer reuses things quite a bit to cut down on constructing objects, so when you render multiple entities at the same time, things get conflated quite a bit.
Luckily though, the renderer interface is generic, and we can simply implement a multi-threaded Direct2D renderer. (Literally, it’d be as easy as getting rid of ReuseBrush.)
So I did.
Turns out, multithreaded rendering is a lot worse than single-threaded rendering.
We’re seeing about half the FPS and an increase in lock contention of about 6%. Additionally, the image flip transform is sometimes erroneously applied to the entire renderer.
Oh well, we’ll come back to it later, I suppose.
Here’s how it runs on the MasterPad, by the way. Pretty good performance for a device that takes 11 seconds to take a screenshot. Really.
Well, this has been kind of stream-of-consciousness, but I think it’s time to stop here. Next time, I’ll talk about… something. I don’t know.