Data-Oriented Design and Avoiding the C++ Object-Oriented Programming Zimmer Frame

Object-oriented programming, OOP, or more importantly object-oriented design (OOD) has been the workhorse of software engineering for the past three decades but times are changing: data-oriented design is the new old kid on the block.

Before data-oriented design was ever formalised by the hoi polloi writing about it on Wikipedia it existed during 1980s in such places as video game code for home computers orders of magnitude less powerful than the machines of today.  Trying to use nascent object-oriented techniques to represent video game abstractions such as sprites on a ZX Spectrum with only 48K of RAM would have resulted in a game with not a lot of sprites and a slow FPS before FPS was even part of the gaming culture vernacular.  If you wanted a lot of sprites in your 48K video game the sprites would simply be represented by values in arrays rather than the traditional object-oriented method of abstracting individual game objects using classes, class inheritance and polymorphism.  This approach may have been necessitated by the fact that the programming language used to code most video games of that era was assembly (or if you were lucky C) which lacked the object-oriented features we are accustomed to today.

But what about today? Today’s machines are orders of magnitude more powerful than the ZX Spectrum of the 1980s and object-orientation appears to solve today’s problems. But does it? Moore’s law and CPU clock frequencies are hitting a brick wall and the trend now is to solve the problem of increasingly demanding or complex software by a move toward parallelism achieved by leveraging more CPU cores and more importantly the most modern of computer hardware: the GPU.

Parallelism is all well and good however some tasks can only be solved by a single CPU core (thread) and often there is a need for these tasks to execute as quickly as possible as they will be on the critical code path. If we return to the video game analogy and want to animate sprites in C++ the temptation will be to represent the sprites as objects (in the object-oriented sense of the word) and these objects would be instances of polymorphic classes that allow the sprites to have different behaviours and attributes.  The sprite base-class would be something along the lines of:

class Sprite
{
public:
	Sprite();
	const vec2& Position() const { return m_Position; }
	void SetPosition(const vec2; NewPosition) { m_Position = NewPosition; }
	virtual UpdatePosition() = 0;
	virtual HaveCollidedWith(const Sprite& OtherSprite) const = 0;
private:
	vec2 m_Position;
};

Now as an (arguably extreme) example lets say we have 1 million sprites and we wish to update the positions of all of them:

for (auto& sprite : Sprites())
{
	sprite.UpdatePosition();
}

This involves making 1 million polymorphic (virtual) function calls to  UpdatePosition() which in turn will call SetPosition() 1 million times which will in turn update the m_Position member variable 1 million times.  Why is this a problem?  It will be likely that each sprite object will be separately allocated due to the use of polymorphism so the m_Position variables will not be contiguous in memory which will likely mean 1 million CPU cache misses.  Even if the sprite objects are not separately allocated (by using, for example, std::vector<std::variant<…>>) the m_Position variables will still not be contiguous (or at a fixed stride) in memory due to the presence of the vtable pointer and any derived class member variables and there will still likely be CPU cache misses.  If we use the data-oriented design approach of not using objects for the sprites at all but instead use data:

struct rigid_body
{
	vec2 position;
	vec2 size;
	vec2 velocity;
	vec2 acceleration;
	scalar mass;
};

class Sprites
{
public:
	Sprites();
	UpdateSpritePositions();
private:
	std::vector<rigid_body> m_SpriteBodies;
};

then UpdateSpritePositions() can efficiently iterate through m_SpriteBodies array updating the position of each sprite based on information contained in the rigid_body data.  The data is contiguous and the code to use it can be local to a single loop that doesn’t jump about making polymorphic function calls.  It is also possible to have four separate arrays for each rigid_body component instead of using a struct but your mileage may vary if this has any performance benefit (CPUs and their cache sizes can vary).  The use of data-oriented design to solve this particular problem is actually a design pattern known as Entity-component-system (ECS).  In ECS parlance the sprite would be known as an entity and would be identified not by an object but by a simple unique integer entity ID.  The rigid_body array would be referred to as component data.  A sprite entity would be “destroyed” by simply freeing the entity ID and removing the ID from any associated component data index arrays.

A nice side effect of this data-oriented approach is that the contiguous data can in some cases be passed directly to a GPU as-is (using OpenGL or other similar APIs) which can offer a great graphical or compute performance benefit where the GPU is involved.

Free Goody Bag

Last night I implemented a C++ sorting algorithm called “intrusive_sort” that works like std::sort except that you pass it a custom “swapper” that is used to exchange elements as required by the sort algorithm. The swapper is passed the iterators of the two elements that require exchanging and can be used when implementing ECS type systems as shown in the following example which sorts all rigid bodies by “mass” (heaviest first):

auto& componentData = ecs().component_data<rigid_body>();

intrusive_sort(componentData.begin(), componentData.end(),
	[&componentData](auto lhs, auto rhs)
	{
		auto lhsIndex = std::distance(componentData.begin(), lhs);
		auto rhsIndex = std::distance(componentData.begin(), rhs);

		auto& lhsEntityId = componentData.index()[lhsIndex];
		auto& rhsEntityId = componentData.index()[rhsIndex];

		std::swap(*lhs, *rhs);
		std::swap(lhsEntityId, rhsEntityId);
		std::swap(componentData.reverse_index()[lhsEntityId],
			componentData.reverse_index()[rhsEntityId]);
	},
	[](const rigid_body& lhs, const rigid_body& rhs)
	{
		return lhs.mass > rhs.mass;
	});

As can be seen in the example intrusive_sort allows us to sort multiple containers in parallel using a single operation. index() is an array of entity IDs of the entities associated with the component data and reverse_index() is a reverse mapping of entity ID to component data array index. An alternative approach using a zip iterator and std::sort wouldn’t work in this case as reverse_index() is a sparse array.

intrusive_sort has the same complexity guarantees as std::sort and can be downloaded from here (free and open source).

Advertisements

2 thoughts on “Data-Oriented Design and Avoiding the C++ Object-Oriented Programming Zimmer Frame

Add yours

  1. Nice article, that spells out what I was also thinking after hitting lots of performance problems while working on a huge project (using Qt) with millions of data objects, all implemented as QObjects (my mistake, I didn’t know better when the project was started 6 years ago).

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Create a free website or blog at WordPress.com.

Up ↑

%d bloggers like this: