A. Introduction
It turns out that iterating Entities in increasing order of arrival time, which Input Systems require to function correctly, is not as easy to implement as it seems.
For a set of Systems, we want to make sure that given a specific Entity Database (possibly along with other data such as frame time) as the input, we always get exactly the same result at the end of execution, especially in a concurrent environment. This property is critical to our replay and game save mechanisms.
To make it hold, several more properties must be maintained given a specific Entity Database and a Task Graph (the dependency graph of Systems):
- A System always makes the same modifications to the Entity Database regardless of how its task is partitioned.
- Two Systems without any execution dependency always make the same modifications to the Entity Database, regardless of which of them finishes first.
- To maintain the previous property, for the set of Systems accessing the same type of Component, if at least one of them has a Write Access to the Component, there must be explicit execution dependencies among all the Systems in this set.
- A System can see the modifications made to the Entity Database by its predecessors.
- The modifications made by a System are not visible to Systems that have no dependency on it, including the System itself.
These properties guarantee that a System only sees reproducible results from the Systems it depends on. In particular, each System always inserts Entities into the Entity Database in the same order, and all the Systems insert Entities into the Database in a determined order, so in every execution of the Systems with the same input, the Entities are processed in the same order.
B. Entity Iteration Order
The iteration order of Entities has a significant impact on the execution output:
- Input Systems rely on the Event Entities being iterated in the order of their occurrence, which is marked by a timestamp component.
- This not only requires the visit order to be consistent but also specifies what that order must be.
- Systems using pseudo-random number generators to create Entity data can only produce reproducible results when the initial seed and the visit order of Entities remain the same across multiple runs.
- The iteration order of Entities could cause Rendering Systems to produce different rendering command lists, which results in different rendering performance or, worse, different visual outputs. This makes certain renderer bugs nearly impossible to debug, because they may vanish during the next execution of the System.
It’s easy to see that if the output (the modifications made to the Entity Database) of any System relies on factors other than the data in the Entity Database, the iteration order, and other reproducible inputs such as frame time, the output won’t be reproducible, and the effect may propagate across multiple frames with a large impact. We would then become victims of the Butterfly Effect.
Since Entity Pages form a singly linked list, the order of visiting Entities reflects the order in which Entity Pages were inserted into the list. A new Entity Page can only be created by creating Entities through an Archetype. If the new page is inserted at the head of the list, the current System will never see it, since its begin iterator points somewhere after the new page. As long as the original begin iterator is saved and used for the other Systems, they will not notice the new page either.
However, this approach, though easy to maintain, creates an iteration pattern like this (recall that whenever a new Entity Page is allocated, the ID counter is incremented by the page size):
[Page 1 ] [Page 0 ]
[32, 33, ..., 63], [0, 1, 2, ..., 31]
Obviously, this does not satisfy the requirement of Input Systems, which expect the Entities to be visited in increasing order of time.
We could instead maintain another index that points to the last Entity Page in the list, so that a new page gets linked to the back of the list. Unfortunately, doing so creates a race condition between the inserting thread and other visiting threads, which is discussed in another post.
In fact, regardless of whether the new page is linked at the head or the back of the list, multiple Systems simultaneously inserting new pages can produce an unpredictable order of the inserted pages. This uncertainty propagates to subsequent Systems relying on this output by randomizing the iteration order of the newly inserted Entities. To avoid this inconsistent behavior, we have to impose some rules on the Systems inserting new pages.
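The head-insertion behavior described above can be sketched as follows. This is a minimal illustration only; EntityPage, PageList, and a page size of 32 are assumptions of this sketch, not the engine’s actual types:

```python
class EntityPage:
    def __init__(self, first_id, size):
        self.ids = list(range(first_id, first_id + size))
        self.next = None

class PageList:
    PAGE_SIZE = 32

    def __init__(self):
        self.head = None
        self.next_id = 0  # incremented by the page size on each allocation

    def push_front(self):
        # New pages are linked at the head, so a previously saved
        # begin iterator never reaches them.
        page = EntityPage(self.next_id, self.PAGE_SIZE)
        self.next_id += self.PAGE_SIZE
        page.next, self.head = self.head, page
        return page

    def iter_from(self, page):
        while page is not None:
            yield from page.ids
            page = page.next

pages = PageList()
pages.push_front()             # Page 0 holds IDs 0..31
saved_begin = pages.head       # begin iterator saved by an earlier System
pages.push_front()             # Page 1 holds IDs 32..63
full_order = list(pages.iter_from(pages.head))   # visits 32..63, then 0..31
old_view = list(pages.iter_from(saved_begin))    # still sees only 0..31
```

Note that `full_order` is exactly the [Page 1] [Page 0] pattern shown earlier: newer Entities are visited first, violating the time-order expectation.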
C. System Execution
Let’s pause for a while and briefly review the execution model of Systems. In our model, Systems are tagged with two traits: ReadAccess and WriteAccess. Basically, two Systems with a Read Access to the same Component can execute simultaneously, while if a System holds a Write Access to a Component, it is guaranteed that only this System can access that type of Component before it finishes, including adding that kind of Component to or removing it from Entities.
Based on this rule, we impose several more requirements on the dependencies among the Systems accessing a certain type of Component.
In the following content, a Task Graph refers to the dependency graph of Systems, specified by the user.
- A Component Dependency Graph (CDG) for Component C is the transitive reduction of the subgraph of the Task Graph in which the edges connecting Systems accessing C are preserved.
- A Transformed Component Dependency Graph (TCDG) for Component C is a dependency graph obtained from the CDG according to the rules defined in Section D. This transformation does not alter the observable execution result of the Systems involved.
- The Reduced Task Graph (RTG) is the transitive reduction of the union of the TCDGs of all Component types that appear in the original Task Graph.
In every dependency graph discussed here, the Systems with an outdegree of 0 (depending on no other System) depend on a dummy Begin System, and those with an in-degree of 0 (no System depending on them) are depended on by a dummy End System. These dummy dependencies are not affected by the transformation described later. All the dependency graphs mentioned above must be acyclic.
Consider the CDG of a Component C. We call a System with a Read/Write Access a Read/Write System.
- If all Systems are Read Systems, the dependencies among them are ignored.
- If there exists at least one Write System, the weight of the shortest path from the Begin System to the End System equals the number of Write Systems, assuming all Write Systems have a weight of 1 and all Read Systems have a weight of 0. In simple words, in all topological orders of the CDG, the Write Systems are ordered in the same way, and a Read System always sits between the same pair of Write Systems (treating the dummy Begin and End Systems as Write Systems, too), so it does not incur a race condition.
The CDG of C is only valid if it satisfies the above requirements.
The following example paths in a CDG show possible violations of the above rules, where R denotes a Read System and W denotes a Write System.
- Path W1-R3-E shortcuts W3 and W4, making it unclear whether R3 sees the data state produced by W1, W3, or W4.
- Path W1-R4-R5-W3 shortcuts W2. However, this edge would have been removed during transitive reduction.
- Paths W3-R6-W4 and W3-R7-W5 represent two write paths causing a race condition.
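These validity rules can be checked mechanically via reachability: the write order is total iff every pair of Write Systems is connected by a path, and a Read System sits between one fixed pair of Write Systems iff it is ordered against every Write System. A minimal sketch under those assumptions (the edge representation and function name are hypothetical, not the engine’s API):

```python
def validate_cdg(edges, writes):
    """Check the CDG rules. `edges` point from a System to Systems executed
    after it; `writes` must include the dummy Begin and End Systems."""
    # Compute reachability with an iterative DFS from every node.
    nodes = {u for e in edges for u in e}
    adj = {u: [] for u in nodes}
    for u, v in edges:
        adj[u].append(v)
    reach = {}
    for s in nodes:
        seen, stack = set(), [s]
        while stack:
            for v in adj[stack.pop()]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        reach[s] = seen
    ordered = lambda a, b: b in reach[a] or a in reach[b]
    writes = list(writes)
    # Every pair of Write Systems must be ordered (no parallel write paths).
    for i, a in enumerate(writes):
        if any(not ordered(a, b) for b in writes[i + 1:]):
            return False
    # Every Read System must be ordered against every Write System,
    # so it sits between one fixed pair of Write Systems.
    for r in nodes - set(writes):
        if any(not ordered(r, w) for w in writes):
            return False
    return True

# The W1-R3-E shortcut from the first violation above: R3 is
# incomparable with W3 and W4, so the CDG is rejected.
bad = [("B", "W1"), ("W1", "W3"), ("W3", "W4"), ("W4", "E"),
       ("W1", "R3"), ("R3", "E")]
ok = [("B", "W1"), ("W1", "R1"), ("R1", "W2"), ("W2", "E")]
```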
D. Task Graph Transformation
A CDG of Component C validated according to the previous section can be transformed by the following rules to remove unnecessary dependencies and obtain the TCDG of C:
- Use an arbitrary topological order of the CDG; for each pair of consecutive Write Systems in that order:
- If there are only Read Systems in between, let those Read Systems depend on the former Write System in the TCDG and be depended on by the latter Write System.
- If the two Write Systems are adjacent, create a dependency between them in the TCDG.
In the presence of multiple Components, calculate the TCDG of all of them. The transitive reduction of the union of all TCDGs is the Reduced Task Graph used for scheduling.
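Since a valid CDG orders its Write Systems identically in every topological order, the transformation can be sketched as a single pass over one such order. The function name and edge representation below are assumptions of this sketch:

```python
def transform_cdg(topo_order, writes):
    """Build the TCDG edges from a topological order of a validated CDG.
    `writes` must include the dummy Begin and End Systems."""
    edges = set()
    prev_write, pending_reads = None, []
    for s in topo_order:
        if s in writes:
            if prev_write is not None:
                if pending_reads:
                    for r in pending_reads:
                        edges.add((prev_write, r))  # Read depends on the former Write
                        edges.add((r, s))           # latter Write depends on the Read
                else:
                    edges.add((prev_write, s))      # adjacent Write Systems
            prev_write, pending_reads = s, []
        else:
            pending_reads.append(s)
    return edges

# B and E are the dummy Begin/End Systems; R1 and R2 end up between W1 and W2.
tcdg = transform_cdg(["B", "W1", "R1", "R2", "W2", "E"], {"B", "W1", "W2", "E"})
```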
E. Entity Database Versioning
With the rules defined in Section C, we can infer that if there isn’t a path between two Systems in the RTG, neither of them can see the other’s side effects on the Entity Database. Therefore, Property A.2 holds as long as the Systems are scheduled according to the RTG. To enforce Properties A.4 and A.5, we need assistance from data structures that allow us to keep track of modifications made to the Entity Database.
Denote the original version of the Entity Database, the input to the Begin System, as \(E\). Denote the original version augmented by the changes made by a System S as \(E+\{S\}\). Consider the outdegree O of a System S in the RTG (excluding the dummy Begin and End nodes):
- If O=0, S depends on no System other than the dummy Begin System. Therefore, S sees the unmodified version of the Entity Database, which is E.
- If O=1, S depends only on its parent node P. If P sees version \(E+D\), S sees \(E+(D \cup \{P\})\) (by Property A.5, P’s own changes are visible to S but not to P itself).
- If O>1, S depends on a set of Systems \(P^+\) that may see different versions of the Entity Database. If each System \(P \in P^+\) sees \(E+D_P\), S sees \(E+\bigcup\limits_{P \in P^+}(D_P \cup \{P\})\)
By the rules above, it’s guaranteed that the End System sees the changes made to the original Entity Database by all Systems. Therefore, this System can merge all modifications into the database and this update frame finishes.
For the implementation, a table can be used to keep track of the pages inserted by each System. A System dependent on a set of Systems D sees E+D through an iterator that concatenates E and the pages inserted by the Systems in D in some deterministic order.
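A minimal sketch of such a table follows; the class and method names are assumptions, and pages are stand-in strings here:

```python
from itertools import chain

class EntityDatabase:
    def __init__(self, base_pages):
        self.base_pages = base_pages   # E: the pages every System starts from
        self.inserted = {}             # table: System name -> pages it inserted

    def record_insertion(self, system, page):
        self.inserted.setdefault(system, []).append(page)

    def view_for(self, deps):
        # E + D: concatenate the base pages with the pages inserted by the
        # dependency set, in a deterministic (here: sorted-by-name) order.
        extra = chain.from_iterable(self.inserted.get(s, []) for s in sorted(deps))
        return list(chain(self.base_pages, extra))

db = EntityDatabase(["P0"])
db.record_insertion("SysB", "PB")
db.record_insertion("SysA", "PA")
```

Sorting the dependency set by name is one arbitrary but reproducible choice; any order works as long as it is the same in every execution.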
F. Task Partitioning
In the current design, a System inserts new Entities into the Entity Database via an Archetype, which internally keeps a reference to the Entity Page that last stored a newly created Entity, in order to promote good cache behavior. Let’s call this feature Page Reusing. When the room on a page is used up, a new empty page is allocated in place of the old one. Unfortunately, this scheme doesn’t scale well to multithreaded environments.
Consider a typical Entity creation loop in a System: the System iterates a filtered range of Entities and creates new Entities when certain conditions are met. If that filtered range is partitioned and executed on different threads, each thread must have its own Archetype to avoid a data race. Suppose we partition the range into \(n\) chunks, each executing on an independent thread, so each thread owns an Archetype \(A_1, A_2, …, A_n\). The new pages created from an Archetype \(A_i\) are \(P_{i,1}, P_{i,2}, …, P_{i,p(i)}\), where \(p(i)\) denotes the number of pages created by \(A_i\). At the end of a frame, we get a new page chain concatenating the new pages created from all the Archetypes:
\(P_{1,1}, P_{1,2}, …, P_{1,p(1)},\)
\(P_{2,1}, P_{2,2}, …, P_{2,p(2)},\)
\(…,\)
\(P_{n,1}, P_{n,2}, …, P_{n,p(n)}\)
Because the number of Entities created using an Archetype is very unlikely to be a multiple of the page size, the last page created by an Archetype is almost never full. If page reusing is not allowed, this is merely a potential performance hit and a waste of memory. If it is allowed, some Entities will be created in the page chain built during previous frames. Where exactly do those Entities go? It depends on how the ranges were partitioned in previous frames. Therefore, even the observable modifications made to the Entity Database are tied to the execution policy of the Systems, which indicates a leak of implementation details into the interface.
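A toy illustration of how partitioning leaks into the page layout: with a hypothetical page size of 4 (the function and page size are assumptions of this sketch), the same 10 Entities land on differently filled pages depending on the thread count.

```python
def page_chain_layout(num_entities, num_threads, page_size=4):
    # Each thread owns its own Archetype, so each produces its own page chain;
    # the final chain concatenates them. Returns the fill count of each page.
    base, rem = divmod(num_entities, num_threads)
    chunks = [base + (1 if i < rem else 0) for i in range(num_threads)]
    layout = []
    for count in chunks:
        while count > 0:
            fill = min(page_size, count)
            layout.append(fill)
            count -= fill
    return layout
```

One thread yields pages filled [4, 4, 2], two threads yield [4, 1, 4, 1], and three threads yield [4, 3, 3]: three different observable layouts for identical input.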
Of course, if we carefully recorded how the ranges were partitioned and how many threads were used to execute the Systems, we could reproduce the result. Even so, we would lose the flexibility to dynamically partition and schedule the Systems with a varying number of threads at runtime, because the Archetypes are tied to threads. Doing so results in managing a pool of Archetypes, which is not worth the trouble.
So does page reusing still have any use case? Technically speaking, as long as more than one Entity is created using an Archetype, we are reusing a page; otherwise, a new Entity Page would be created every time an Entity is created. Beyond that, even in a single-threaded scenario such as with Input Systems, there is no obvious benefit to reusing pages allocated during previous frames, because input events tend to be plentiful and events from previous frames are very likely to have been removed already, so the pages they settle on are also deleted.
Therefore, the task is to find a good way to adjust the partitioning of ranges so that the page fill rate is maximized while the makespan of all Systems is minimized.
Unfortunately, even if the usage of Archetypes with multiple threads is addressed, Systems with shared state such as random number generators still cannot benefit from parallel execution. In such situations, it’s recommended to disable task partitioning (or job splitting, in scheduling vocabulary) to maintain a consistent execution result.