When it comes to storing data. Languages usually offer vectors and maps. These two collections are standard, though are they really solution you should use? Let’s give it some thoughts.

When you are collecting data just to process it later, use vector. You need to access them by names, use map. However, it’s not always so simple, and you can be dealing with multiple requirements. What if you need data to be ordered but at the same time, access it also by ID. What if you want to remove some data but do not change IDs of other data? This could lead you to a conclusion that you have to use combination of map and vector and reference same data twice, but this introduces more problems than it solves. Removing something from map is fast, mostly, but vector is problematic. You have to firs find where it is and then do all the jazz to remove it, it can take some time indeed. The there is also a problem with maps being slow as well, if you are dealing with huge amount of data like in Entity component system. So, what’s the right storage?

Warehouse

It’s sometimes good to think about memory as Warehouse. That’s a correct approach because that’s basically how it works. You give a data you want to be stored and they give you an address. You cannot say, which address you want, as it can be already taken. If you take data back or remove it, warehouse will not move everything it has after your data by one place, as you would in case of removing element from vector. Removing data just makes it so it’s considered free and ca be overwritten at any point of time. Let’s apply these rules and creating simple abstraction around a vector. We would define our vector like this (pseudo code):

struct Wearhouse<T> {
inner Vec<T>
free Vec<int>
public taken Vec<int>
}

Notice the free. This is most important field. It’s used like a stack and stores information about free space. If you remove item, it’s index gets inserted to free. When item is added, index is popped from free, data is stored at that index and id is returned to user like this:

func Warehouse::insert(value T) int {
if free.isEmpty() {
inner.append(value)
return free.len() - 1
}
id = free.pop()
inner[id] = value
return id
}
func Warehouse::get(id int) T {
return inner[id]
}
func Warehouse::remove(id int){
if inner[id] == nil {
return
}
inner[id] = nil
free.append(id)
}

This is like made for ESC, entities can be stored in warehouse and removing and inserting is quick as it reuses resources. Only real disadvantage presents itself while iterating through the collection. We still have to look up empty spaces and skip them. This can be optimized, if you are iterating often, by pre-mapping used ids in the used. Then defer your remove calls to do them all at once and then remap ids.

func Warehouse::map() {
taken.clear()
for i = 0; i < inner.len(); i++ {
if inner[i] != nil {
taken.append(i)
}
}
}

Ant there you go; your data can now be accessed by unique “ID”. Order of elements is at least more stable than in map. Removing and inserting is as fast as ewer. Accessing is speed itself as you do not have to rely on hashing algorithm.