Object persistence refers to the mechanism that allows a programmer to export and later restore the content of class instances. The technique is often used to implement save features but may also be used to duplicate objects across a network.
In this post I will detail how object persistence can be implemented in C++. I designed the following approach to ease content saving and loading, because writing specialized import and export features for the different assets you meet in game development takes quite a lot of time. A generic approach saves much of that time.
The global mechanism is represented in Figure 1.

Class instances (objects) have dependencies in both the vertical (is-a) and horizontal (has-a) domains. The former is inheritance, or parent-child relations, and the latter are member variables. Members can be elementary types (such as integers, floating point values, characters…) but they can also be other class instances which have their own dependencies. Here, I will call dependencies of elementary type variables and dependencies that are class instances objects. The difference is more than semantic because variables and objects are not stored by the same mechanism.
Still in Figure 1, exporting and importing class instances means transforming the object data to and from a generic structure that I call a store object. A store object is a copy of the internal content of a class instance, but it does not care about what the class is trying to achieve. You can see store objects as abstract data.
A store object contains two lists: one for all its variables and one for all its objects. Because object dependencies are ultimately store objects themselves, we end up with a tree structure.
The advantage of copying each object’s data into a generic structure is that the data becomes easier to prepare for packing and unpacking. Packing converts store objects to a stream of bytes and unpacking converts a stream of bytes back to store objects. Once you have a stream of bytes you can save it to disk or send it over the network. Here, I have focused on storing the data on disk.
We will now focus on each of these steps individually and then discuss how the various processes can be automated before moving to some examples.
In this first section I will explain how objects and variables can be stored manually. In the next section, things will get a bit more complicated because I will use a lot of C++ tricks to achieve automated storing. But for now, we will keep things simple!
Because we cannot make the storage behaviour come out of thin air, all our storable objects will have to derive from an interface called IStorableObject:
class IStorableObject
{
public:
	virtual void exportObject(StoreObject *pObject) = 0;
	virtual void importObject(StoreObject *pObject) = 0;
};
The interface only declares two functions so far, which relate to the export and import mechanism. They are defined as pure virtuals (virtual … = 0) because their content must be defined for any object that should have a storage mechanism. If you do not implement both of these functions in your storable object class, the class remains abstract and your compiler will issue an error as soon as you try to instantiate it.
In our example from Figure 1, we have a class Model3D that inherits from a class called GameAsset, which represents a general game resource. To implement a storage mechanism, we have to make either Model3D or GameAsset inherit from IStorableObject. The only difference between the two solutions is that the latter is a more rigid framework where all game assets must implement storage features. This sounds like a reasonable development hypothesis, so we will adopt it. GameAsset therefore inherits from IStorableObject, which opens the import/export interface to Model3D. Note that as long as you do not define the import and export functions (either in GameAsset or Model3D), they remain pure virtual. So if GameAsset does not implement them (which it shouldn’t in a good design), we have to define them in Model3D to make the compiler happy.
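The relation between the three classes can be sketched as follows. This is only an illustration: StoreObject is an empty placeholder here (the real class comes next), the virtual destructor is my addition, and the static_assert checks use the C++11 header type_traits, which the original code does not rely on.

```cpp
#include <cassert>
#include <type_traits>

// Placeholder only: the real StoreObject is introduced later in the post.
class StoreObject {};

class IStorableObject
{
public:
	virtual ~IStorableObject() {}	// added for safe deletion through the interface
	virtual void exportObject(StoreObject *pObject) = 0;
	virtual void importObject(StoreObject *pObject) = 0;
};

// GameAsset inherits the interface but does not implement it,
// so it stays abstract and cannot be instantiated.
class GameAsset : public IStorableObject
{
};

// Model3D defines both functions and is therefore a concrete class.
class Model3D : public GameAsset
{
public:
	virtual void exportObject(StoreObject *) {}
	virtual void importObject(StoreObject *) {}
};

static_assert(std::is_abstract<GameAsset>::value, "GameAsset is still abstract");
static_assert(!std::is_abstract<Model3D>::value, "Model3D can be instantiated");
```

Writing "GameAsset asset;" with these definitions is rejected by the compiler, while "Model3D model;" is accepted.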
The interface relates to the StoreObject class which is a nested tree with the two lists we discussed in the introduction:
class StoreObject
{
public:
	StoreObject(void) { }
	~StoreObject(void)
	{
		/* front() must also unlink the element from the list, otherwise this loop never ends */
		while(this->m_pChildren.hasElement())
			delete this->m_pChildren.front();
	}
	Iterator<struct store_object_data_s> *getVariables(void)
	{
		return this->m_pVariables.createForwardIterator();
	}
	Iterator<StoreObject*> *getChildren(void)
	{
		return this->m_pChildren.createForwardIterator();
	}
private:
	LinkedList<StoreObject*> m_pChildren;
	LinkedList<struct store_object_data_s> m_pVariables;
};
So far the class is relatively empty and only implements the code required to free the memory on deletion. We will now look at how to populate the object and variable lists.
Exporting variables means adding data to a store object’s variable list. Each entry should contain at least the name of the variable, the data itself and the type of the content. It is also a good idea to add the size (in bytes) of the data, because it makes packing easier and lets the importer double-check that it is talking about the same data as the exporter (on some platforms integers are 2 bytes wide while on most systems they are 4 bytes wide; by checking both the typename “int” and the size, we can catch such mismatches and raise an exception).
Importing variables is the opposite mechanism: you browse the variables list of a StoreObject to find a particular entry and restore its data. Later on, we will add more information but we will keep it simple for now.
struct store_object_data_s
{
	String sTypeName;
	String sVarName;
	size_t nSize;
	void *pData;
};
void StoreObject :: addVariable(void *pVariableData, size_t nByteSize, const char *szTypeName, const char *szVarName)
{
	struct store_object_data_s var;
	var.sTypeName = String(szTypeName);
	var.sVarName = String(szVarName);
	var.nSize = nByteSize;
	var.pData = new byte[var.nSize];
	bcopy(var.pData, (byte*)pVariableData, var.nSize);
	this->m_pVariables.add(var);
}
Note that bcopy is called here with the destination first, as with memcpy; the POSIX bcopy takes the source first, so adapt the calls if you rely on it. You should also not forget to free each pData buffer in the class destructor to prevent memory leaks.
In the following example we create a class Test which holds an integer value. The example code creates an instance with the value “10” and stores its content into a store object. A second instance is then created with the number “3”, and importing the store object overrides its content so that it displays the value “10”. (Note: In a more serious program, don’t forget to check both the typename and the size before accepting a variable from its name.)
class Test : public IStorableObject
{
public:
	Test(int i=0)
	{
		this->m_iTest = i;
	}
	
	int getValue(void) const
	{
		return this->m_iTest;
	}
	virtual void exportObject(StoreObject *pObject)
	{
		if(pObject != null)
			pObject->addVariable(&this->m_iTest, sizeof(this->m_iTest), "int", "m_iTest");
	}
	virtual void importObject(StoreObject *pObject)
	{
		if(pObject == null)
			return;
		AutoPtr< Iterator<struct store_object_data_s> > it = pObject->getVariables();
		while(!it->end())
		{
			struct store_object_data_s& curr = it->current();
			it->next();
			if(curr.sVarName.equals("m_iTest"))
				this->m_iTest = *(int*)curr.pData;
		}
	}
private:
	int m_iTest;
};
int main(void)
{
	StoreObject obj;
	/* export */
	{
		Test tmp(10);
		tmp.exportObject(&obj);
	}
	/* import */
	{
		Test tmp(3);
		/* will print "3" */
		printf("%d\r\n", tmp.getValue());
		tmp.importObject(&obj);
		/* will print "10" */
		printf("%d\r\n", tmp.getValue());
	}
}
Although we did not save the data to a file yet, we are already able to store the variables from a class and to restore them. It does not sound like a lot but it is already something!
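The check recommended in the note above (typename and size, not just the name) could look like the sketch below. It does not use the StoreObject class: StoredVariable, exportVariable and importChecked are hypothetical names built on the standard library, chosen only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified stand-in for store_object_data_s, using standard containers.
struct StoredVariable
{
	std::string typeName;
	std::string varName;
	std::vector<unsigned char> data;
};

// Hypothetical helper mirroring addVariable: copies the bytes of a value.
template<typename Type>
void exportVariable(std::vector<StoredVariable>& vars, const char *typeName, const char *varName, const Type& value)
{
	StoredVariable var;
	var.typeName = typeName;
	var.varName = varName;
	const unsigned char *pBytes = reinterpret_cast<const unsigned char*>(&value);
	var.data.assign(pBytes, pBytes + sizeof(Type));
	vars.push_back(var);
}

// Hypothetical helper: restores a variable only when the name, the typename
// and the byte size all match. Returns false when the name is not found and
// throws when the name is found but the type or size disagree.
template<typename Type>
bool importChecked(const std::vector<StoredVariable>& vars, const char *typeName, const char *varName, Type& out)
{
	for(std::size_t i = 0; i < vars.size(); i++)
	{
		const StoredVariable& curr = vars[i];
		if(curr.varName != varName)
			continue;
		if(curr.typeName != typeName || curr.data.size() != sizeof(Type))
			throw std::runtime_error("stored variable does not match the expected type or size");
		std::memcpy(&out, curr.data.data(), sizeof(Type));
		return true;
	}
	return false;
}
```

Importing “m_iTest” as an “int” succeeds, whereas asking for the same name as a “short” raises the exception instead of silently reading the wrong number of bytes.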
But what about objects?
Remember that any storable object can have a store object representation as well, so when we want to export another object deriving from the IStorableObject interface, all we have to do is create a new StoreObject, call the export function with that new object and add it to the parent’s list of children:
void StoreObject :: addObject(IStorableObject *pObject, const char *szTypeName, const char *szVarName)
{
	if(pObject == null)
		return;
	StoreObject *pStoreObject = new StoreObject(szTypeName, szVarName);
	if(pStoreObject == null)
		return;
	pObject->exportObject(pStoreObject);
	this->m_pChildren.add(pStoreObject);
}
We have to add a new constructor to the StoreObject class to set the typename and dependency name:
class StoreObject
{
public:
	/* ... */
	StoreObject(const char *szTypeName, const char *szVarName)
	{
		this->m_sVarName = String(szVarName);
		this->m_sTypeName = String(szTypeName);
	}
	String getTypeName(void) const
	{
		return this->m_sTypeName;
	}
	String getVarName(void) const
	{
		return this->m_sVarName;
	}
private:
	/* ... */
	String m_sVarName, m_sTypeName;
};
To import objects, we have to browse the children list for a match and then call the importObject function on that particular child.
For example, taking the Model3D class of Figure 1, the import and export functions would look like:
void Model3D :: exportObject(StoreObject *pObject)
{
	if(pObject == null)
		return;
	pObject->addObject(&this->m_pTrianglesList, "Array<triangle_s>", "m_pTrianglesList");
}
void Model3D :: importObject(StoreObject *pObject)
{
	if(pObject == null)
		return;
	AutoPtr< Iterator<StoreObject*> > it = pObject->getChildren();
	while(!it->end())
	{
		StoreObject *pCurrent = it->current();
		it->next();
		if(pCurrent != null && pCurrent->getVarName().equals("m_pTrianglesList"))
			this->m_pTrianglesList.importObject(pCurrent);
	}
}
Obviously this requires the Array class to be a storable object itself. We will come back to this later.
Objects and variables are horizontal dependencies and we will now focus on the vertical ones, that is: inheritance.
When storing an object’s data we have to take care not only of the content of the specialized class but also of the content of its parent classes, if any. Consider the following code:
class A : public IStorableObject
{
public:
	A(void)
	{
		this->m_iTest = 0;
	}
	A(int i)
	{
		this->m_iTest = i;
	}
	virtual void test(void) const
	{
		printf("(A) Value: %d\r\n", this->m_iTest);
	}
private:
	int m_iTest;
};
class B : public A
{
public:
	B(void)
	{
		this->m_iTest = 0;
	}
	B(int i, int j) : A(j)
	{
		this->m_iTest = i;
	}
	virtual void test(void) const
	{
		A :: test();
		printf("(B) Value: %d\r\n", this->m_iTest);
	}
private:
	int m_iTest;
};
class C : public B
{
public:
	C(void)
	{
		this->m_iTest = 0;
	}
	C(int i, int j, int k) : B(j, k)
	{
		this->m_iTest = i;
	}
	virtual void test(void) const
	{
		B :: test();
		printf("(C) Value: %d\r\n", this->m_iTest);
	}
private:
	int m_iTest;
};
There are three classes A, B and C with C deriving from B and B deriving from A. All the classes have a variable “m_iTest” to store. We could have added more classes and more variables but this is enough to raise two important points:
1/ Not only the import and export functions from C should be called but also the ones from B and A. This is pretty straightforward to implement because all we have to do is add ParentName::importObject(pObject) and ParentName::exportObject(pObject) in the import and export functions respectively. This also solves the problem of having private members in the parent class.
2/ Any instance of class C also carries the data of class B and class A, so an instance of C should be represented by only one store object. The major issue with the code given above is that all variables have the same name.
The consequence of the last point is that we lack the information needed to distinguish the variables from A, B and C within a single instance of C. The solution is to add to our variable data and store object the name of the class that the dependency belongs to:
struct store_object_data_s
{
	String sTypeName;
	String sClassName;
	String sVarName;
	size_t nSize;
	void *pData;
};
class StoreObject
{
public:
	/* ... */
	StoreObject(const char *szTypeName, const char *szClassName, const char *szVarName)
	{
		this->m_sClassName = String(szClassName);
		this->m_sVarName = String(szVarName);
		this->m_sTypeName = String(szTypeName);
	}
	String getClassName(void) const
	{
		return this->m_sClassName;
	}
private:
	/* ... */
	String m_sClassName;
};
That way, we can store A::m_iTest, B::m_iTest and C::m_iTest within the same store object.
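Both points can be put together in a small sketch. SimpleStore is a hypothetical stand-in for StoreObject that only handles integers and keys them by class name and variable name; the accessor names valueA and valueB are also mine.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Simplified stand-in for a store object: int values keyed by (class name, variable name).
class SimpleStore
{
public:
	void set(const std::string& className, const std::string& varName, int value)
	{
		m_values[std::make_pair(className, varName)] = value;
	}
	bool get(const std::string& className, const std::string& varName, int& out) const
	{
		std::map<std::pair<std::string, std::string>, int>::const_iterator it = m_values.find(std::make_pair(className, varName));
		if(it == m_values.end())
			return false;
		out = it->second;
		return true;
	}
private:
	std::map<std::pair<std::string, std::string>, int> m_values;
};

class A
{
public:
	A(int i = 0) : m_iTest(i) { }
	virtual ~A() { }
	virtual void exportObject(SimpleStore& store) const { store.set("A", "m_iTest", m_iTest); }
	virtual void importObject(const SimpleStore& store) { store.get("A", "m_iTest", m_iTest); }
	int valueA(void) const { return m_iTest; }
private:
	int m_iTest;
};

class B : public A
{
public:
	B(int i = 0, int j = 0) : A(j), m_iTest(i) { }
	virtual void exportObject(SimpleStore& store) const
	{
		A::exportObject(store);	// the parent stores its own private members
		store.set("B", "m_iTest", m_iTest);
	}
	virtual void importObject(const SimpleStore& store)
	{
		A::importObject(store);	// the parent restores its own private members
		store.get("B", "m_iTest", m_iTest);
	}
	int valueB(void) const { return m_iTest; }
private:
	int m_iTest;
};
```

Exporting B(1, 2) and importing the result into a default-constructed B restores 1 in B::m_iTest and 2 in A::m_iTest, even though both members share the same name and live in the same store.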
And this is pretty much all there is to inheritance! Easy, isn’t it?
Packing and unpacking mechanisms are not that complicated to design but they can be really tedious because of their length. This is actually one of the motivations for a generic storage mechanism, because I have spent many hours debugging my custom packing and unpacking functions.
The idea behind data packing/unpacking is that you collect all the memory and push it to a byte stream. I like to organize my packed data as distinct bins which can be accessed by known offsets, as represented in Figure 2.

The various structures hold the information as indices and byte offsets. For example, to write a string, we first insert the string data into the “strings” bin and then refer to it by its offset from the start of that bin. The major advantage of organizing the byte stream that way is that it makes unpacking very straightforward:
struct header_s
{
	dword dwIdent;
	word wChecksum;
	dword dwSize;
	word wNumVars;
	word wNumChildren;
	dword dwTypeNameIndex;
	dword dwClassNameIndex;
	dword dwVarNameIndex;
	dword dwVarsOffset;
	dword dwChildrenOffset;
	dword dwStringOffset;
	dword dwDataOffset;
	static const dword IDENT_VERSION = 'ST00';
};
struct var_s
{
	dword dwTypeNameIndex;
	dword dwClassNameIndex;
	dword dwVarNameIndex;
	dword dwDataIndex;
	dword dwDataSize;
};
struct child_s
{
	dword dwDataIndex;
	dword dwDataSize;
};
size_t StoreObject :: unpack(byte *pStream, size_t nSize)
{
	if(nSize < sizeof(struct header_s) || pStream == null)
		return 0;
	struct header_s *pHeader = (struct header_s*)pStream;
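	if(pHeader->dwIdent != header_s :: IDENT_VERSION)
		return 0;	/* not a store object stream */
	if(pHeader->dwSize > nSize || pHeader->dwSize < sizeof(struct header_s))
		return 0;
	word wOldChecksum = pHeader->wChecksum;
	pHeader->wChecksum = 0;
	word wNewChecksum = checksum(pStream, pHeader->dwSize);
	pHeader->wChecksum = wOldChecksum;	/* leave the input stream unchanged */
	if(wNewChecksum != wOldChecksum)
		return 0;	/* corrupted data */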
	struct var_s *pVars = (struct var_s*)(pStream + pHeader->dwVarsOffset);
	struct child_s *pChildren = (struct child_s*)(pStream + pHeader->dwChildrenOffset);
	char *pStrings = (char*)(pStream + pHeader->dwStringOffset);
	byte *pData = (byte*)(pStream + pHeader->dwDataOffset);
	this->m_sVarName = String(pStrings + pHeader->dwVarNameIndex);
	this->m_sTypeName = String(pStrings + pHeader->dwTypeNameIndex);
	this->m_sClassName = String(pStrings + pHeader->dwClassNameIndex);
	for(word wI=0;wI<pHeader->wNumVars;wI++)
	{
		struct var_s *pVariable = pVars + wI;
		addVariable(pData + pVariable->dwDataIndex, pVariable->dwDataSize, pStrings + pVariable->dwTypeNameIndex, pStrings + pVariable->dwClassNameIndex, pStrings + pVariable->dwVarNameIndex);
	}
	for(word wI=0;wI<pHeader->wNumChildren;wI++)
	{
		struct child_s *pChild = pChildren + wI;
		StoreObject *pObject = new StoreObject();
		if(pObject == null)
			continue;
		if(pObject->unpack(pData + pChild->dwDataIndex, pChild->dwDataSize) != 0)
			this->m_pChildren.add(pObject);
		else
			delete pObject;
	}
	return pHeader->dwSize;
}
The header keeps important information such as where the bins begin in the byte stream, but it also contains an identification field and a checksum field. Their role is to detect streams that are not store objects or that have been corrupted. You can use the following checksum function, which is adapted from the one used by the TCP/IP protocols: it is fast and catches most accidental corruption, but it offers no protection against deliberate tampering:
word checksum(byte *pStream, size_t nSize)
{
	word wRet = 0;
	word *pWords = (word*)(pStream);
	while(nSize > 1)
	{
		wRet += ~*(pWords++);
		pStream += 2;
		nSize -= 2;
	}
	if(nSize == 1)
		wRet += ~(((word)*pStream) << 8);
	return ~wRet;
}
Once we have validated the data, we locate the bins and start reading all the variables and children objects to restore them as StoreObject objects.
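For comparison, here is a sketch of the carry-folding form of the Internet checksum as described in RFC 1071. It is not the function used above (which sums complemented words in a 16-bit accumulator without folding the carries back in); the name internetChecksum is mine, and the bytes are read as big-endian pairs so that the result does not depend on the host byte order or on alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Internet checksum in the style of RFC 1071.
// A 64-bit accumulator is used so that long buffers cannot overflow it.
uint16_t internetChecksum(const unsigned char *pData, std::size_t nSize)
{
	uint64_t sum = 0;
	std::size_t i = 0;
	// sum the buffer as 16-bit big-endian words
	for(; i + 1 < nSize; i += 2)
		sum += (uint64_t(pData[i]) << 8) | pData[i + 1];
	// an odd trailing byte is padded with a zero byte
	if(i < nSize)
		sum += uint64_t(pData[i]) << 8;
	// fold the carries back into the low 16 bits
	while(sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return uint16_t(~sum & 0xFFFF);
}
```

With this form, a buffer that already contains its own checksum (stored high byte first, at an even offset) sums to zero, which gives a simple validity test.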
While unpacking data is easy thanks to the bin structure of the information, packing can be a real pain because every time we add data to a bin, we must resize the memory allocated for the bin, copy the data and index everything properly. A single tiny mistake when writing the code will cost you hours of debugging… so be extremely careful when implementing the packing mechanism.
void StoreObject :: pack(byte*& rpStream, size_t& rSize)
{
	struct header_s header;
	char *pStrings = null;
	size_t nStringLength = 0;
	byte *pData = null;
	size_t nDataSize = 0;
	struct child_s *pChildren = null;
	size_t nNumChildren = 0;
	struct var_s *pVars = null;
	size_t nNumVars = 0;
	/* header */
	header.dwIdent = header_s::IDENT_VERSION;
	header.wChecksum = 0;
	/* class name */
	{
		size_t nLength = this->m_sClassName.length() + 1;
		pStrings = (char*)realloc(pStrings, nStringLength + nLength);
		
		if(this->m_sClassName.getPointer() != null)
			strcpy(pStrings + nStringLength, this->m_sClassName.getPointer());
		pStrings[nStringLength + nLength - 1] = '\0';
		header.dwClassNameIndex = (dword)nStringLength;
		nStringLength += nLength;
	}
	/* var name */
	{
		size_t nLength = this->m_sVarName.length() + 1;
		pStrings = (char*)realloc(pStrings, nStringLength + nLength);
		
		if(this->m_sVarName.getPointer() != null)
			strcpy(pStrings + nStringLength, this->m_sVarName.getPointer());
		pStrings[nStringLength + nLength - 1] = '\0';
		header.dwVarNameIndex = (dword)nStringLength;
		nStringLength += nLength;
	}
	/* type name */
	{
		size_t nLength = this->m_sTypeName.length() + 1;
		pStrings = (char*)realloc(pStrings, nStringLength + nLength);
		
		if(this->m_sTypeName.getPointer() != null)
			strcpy(pStrings + nStringLength, this->m_sTypeName.getPointer());
		pStrings[nStringLength + nLength - 1] = '\0';
		header.dwTypeNameIndex = (dword)nStringLength;
		nStringLength += nLength;
	}
	/* scan vars */
	{
		AutoPtr< Iterator<struct store_object_data_s> > it = this->m_pVariables.createForwardIterator();
		while(!it->end())
		{
			struct store_object_data_s& curr = it->current();
			it->next();
			pVars = (struct var_s*)realloc(pVars, sizeof(struct var_s) * (nNumVars + 1));
			/* class name */
			{
				size_t nLength = curr.sClassName.length() + 1;
				pStrings = (char*)realloc(pStrings, nStringLength + nLength);
				
				if(curr.sClassName.getPointer() != null)
					strcpy(pStrings + nStringLength, curr.sClassName.getPointer());
				pStrings[nStringLength + nLength - 1] = '\0';
				pVars[nNumVars].dwClassNameIndex = (dword)nStringLength;
				nStringLength += nLength;
			}
			/* var name */
			{
				size_t nLength = curr.sVarName.length() + 1;
				pStrings = (char*)realloc(pStrings, nStringLength + nLength);
				
				if(curr.sVarName.getPointer() != null)
					strcpy(pStrings + nStringLength, curr.sVarName.getPointer());
				pStrings[nStringLength + nLength - 1] = '\0';
				pVars[nNumVars].dwVarNameIndex = (dword)nStringLength;
				nStringLength += nLength;
			}
			/* type name */
			{
				size_t nLength = curr.sTypeName.length() + 1;
				pStrings = (char*)realloc(pStrings, nStringLength + nLength);
				
				if(curr.sTypeName.getPointer() != null)
					strcpy(pStrings + nStringLength, curr.sTypeName.getPointer());
				pStrings[nStringLength + nLength - 1] = '\0';
				pVars[nNumVars].dwTypeNameIndex = (dword)nStringLength;
				nStringLength += nLength;
			}
			/* data */
			{
				pData = (byte*)realloc(pData, nDataSize + curr.nSize);
				pVars[nNumVars].dwDataSize = curr.nSize;
				bcopy(pData + nDataSize, curr.pData, curr.nSize);
				pVars[nNumVars].dwDataIndex = (dword)nDataSize;
				nDataSize += curr.nSize;
			}
			nNumVars ++;
		}
		header.wNumVars = (word)nNumVars;
	}
	/* scan children */
	{
		AutoPtr< Iterator<StoreObject*> > it = this->m_pChildren.createForwardIterator();
		while(!it->end())
		{
			StoreObject *pCurrent = it->current();
			it->next();
			if(pCurrent == null)
				continue;
			byte *pNewStream = null;
			size_t nNewSize = 0;
			pCurrent->pack(pNewStream, nNewSize);
			pChildren = (struct child_s*)realloc(pChildren, (nNumChildren + 1) * sizeof(struct child_s));
			pData = (byte*)realloc(pData, nDataSize + nNewSize);
			bcopy(pData + nDataSize, pNewStream, nNewSize);
			pChildren[nNumChildren].dwDataIndex = nDataSize;
			pChildren[nNumChildren].dwDataSize = nNewSize;
			nDataSize += nNewSize;
			
			nNumChildren ++;
			delete[] pNewStream;
		}
		header.wNumChildren = (word)nNumChildren;
	}
	rSize = sizeof(struct header_s) + sizeof(struct var_s) * nNumVars + sizeof(struct child_s) * nNumChildren + nStringLength + nDataSize;
	rpStream = new byte[rSize];
	header.dwSize = rSize;
	header.dwVarsOffset = sizeof(struct header_s);
	header.dwChildrenOffset = header.dwVarsOffset + sizeof(struct var_s) * nNumVars;
	header.dwStringOffset = header.dwChildrenOffset + sizeof(struct child_s) * nNumChildren;
	header.dwDataOffset = header.dwStringOffset + nStringLength;
	bcopy(rpStream, &header, sizeof(struct header_s));
	bcopy(rpStream + header.dwVarsOffset, pVars, sizeof(struct var_s) * nNumVars);
	bcopy(rpStream + header.dwChildrenOffset, pChildren, sizeof(struct child_s) * nNumChildren);
	bcopy(rpStream + header.dwStringOffset, pStrings, nStringLength);
	bcopy(rpStream + header.dwDataOffset, pData, nDataSize);
	((struct header_s*)rpStream)->wChecksum = checksum(rpStream, rSize);
	free(pVars);
	free(pChildren);
	free(pStrings);
	free(pData);
}
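The six nearly identical string blocks in the function above are where copy-and-paste mistakes tend to creep in. One possible way to reduce that risk is to wrap the “strings” bin in a small helper. The sketch below uses std::vector instead of realloc, and StringBin is a hypothetical name that is not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: a growable "strings" bin.
class StringBin
{
public:
	// Appends the string plus its terminating '\0' and returns the byte
	// offset at which the string starts inside the bin.
	std::size_t add(const std::string& str)
	{
		const std::size_t nOffset = m_bytes.size();
		m_bytes.insert(m_bytes.end(), str.begin(), str.end());
		m_bytes.push_back('\0');
		return nOffset;
	}
	// Reads back the null-terminated string stored at the given offset.
	const char *at(std::size_t nOffset) const
	{
		return &m_bytes[nOffset];
	}
	// Total size of the bin in bytes, terminators included.
	std::size_t size(void) const
	{
		return m_bytes.size();
	}
private:
	std::vector<char> m_bytes;
};
```

With such a helper, each block reduces to a single line of the form header.dwClassNameIndex = (dword)bin.add(...), and the bin can be copied into the final stream in one operation once everything has been collected.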
As we keep walking through this article, more features will be added to the StoreObject class and you will have to include these in the packing and unpacking functions.
Finally, you can use the following functions to import and export storable object collections from files:
void exportObjectCollectionToFile(const LinkedList<StoreObject*>& rObjectCollection, const char *pszFileName)
{
	if(pszFileName == null)
		return;
	File fout(pszFileName, "wb+");
	AutoPtr< Iterator<StoreObject*> > it = rObjectCollection.createForwardIterator();
	while(!it->end())
	{
		StoreObject *pCurrent = it->current();
		it->next();
		if(pCurrent == null)
			continue;
		try
		{
			byte *pbStream = null;
			size_t nSize = 0;
			pCurrent->pack(pbStream, nSize);
			fout.putBytes(pbStream, nSize);
			delete[] pbStream;
		} catch(IException &rException)
		{
			printf("Exception: %s\r\n", rException.toString().getPointer());
		}
	}
}
LinkedList<StoreObject*> importObjectCollectionFromFile(const char *pszFileName)
{
	LinkedList<StoreObject*> ret;
	File fin(pszFileName, "rb");
	byte *pbStream = null;
	size_t nSize = 0;
	fin.download(pbStream, nSize);
	if(nSize == 0 || pbStream == null)
		return ret;
	size_t nBytesRead = 0, nOffset = 0;
	do
	{
		try
		{
			StoreObject *pObject = new StoreObject();
			nBytesRead = pObject->unpack(pbStream + nOffset, nSize - nOffset);
			nOffset += nBytesRead;
			if(nBytesRead != 0)
				ret.add(pObject);
			else
				delete pObject;	/* nothing valid was unpacked */
		} catch(IException &rException)
		{
			nBytesRead = 0;
			printf("Exception: %s\r\n", rException.toString().getPointer());
		}
	} while(nBytesRead != 0 && nOffset < nSize);
	delete[] pbStream;
	return ret;
}
Now that the foundations have been laid, we will focus on how to make the storage procedure as automatic as possible.
I will put only one restriction on automated storage: it should reject dependencies that are pointers. I designed previous implementations which had mechanisms to detect pointers and restore the links between objects, but they all suffered from the same problem: who is responsible for constructing and destroying what the pointer refers to? For example, how can you tell the difference between an object having a pointer to an array of data, such as a texture, and an object having a pointer to another object?
In all my implementations the storage engine was responsible for constructing and destroying every pointer, but that makes very little sense, especially in a game where entities are managed by a game manager, components are managed by entities, and pointers to other entities are passed to both entities and components, as well as pointers to more specific data such as AI structures and so on. Sooner or later I had to rethink some mechanisms just to make them work with the storage engine, and the result was always a very bad overall design.
I have now realized that an automated storage engine should reject pointers by definition, because the whole point of store objects is to abstract each specialized class into generic components that can easily be processed and packed. The ownership of a given pointer, on the other hand, is specific to the internal behaviour of the class, which goes directly against the idea of abstraction. So any attempt to include pointers in the storage mechanism will force the programmer to disclose more of the internal workings of classes and will hamper the automated features.
Please note that I am not saying that pointers should be banished from all storage mechanisms, only from the automated ones. So if your application requires pointers in some specific cases, you will have to write custom import and export functions and disable the automatic storage for that particular class.
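As an illustration of such custom functions, here is a sketch of a class that owns a heap buffer through a raw pointer and stores the pointed-to bytes rather than the pointer itself. Everything in it is hypothetical: Texture is an invented class, and ByteStore is a simplified stand-in for a store object built on the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Simplified stand-in for a store object: raw bytes keyed by variable name.
typedef std::map<std::string, std::vector<unsigned char> > ByteStore;

// Hypothetical class that owns a heap buffer through a raw pointer.
class Texture
{
public:
	Texture(void) : m_pPixels(0), m_nSize(0) { }
	explicit Texture(std::size_t nSize) : m_pPixels(new unsigned char[nSize]()), m_nSize(nSize) { }
	~Texture(void) { delete[] m_pPixels; }

	unsigned char *pixels(void) { return m_pPixels; }
	std::size_t size(void) const { return m_nSize; }

	// Custom export: store the pointed-to bytes, never the pointer value.
	void exportObject(ByteStore& store) const
	{
		store["m_pPixels"] = std::vector<unsigned char>(m_pPixels, m_pPixels + m_nSize);
	}
	// Custom import: the class owns the buffer, so it reallocates it itself.
	void importObject(const ByteStore& store)
	{
		ByteStore::const_iterator it = store.find("m_pPixels");
		if(it == store.end())
			return;
		unsigned char *pNew = new unsigned char[it->second.size()];
		if(!it->second.empty())
			std::memcpy(pNew, &it->second[0], it->second.size());
		delete[] m_pPixels;
		m_pPixels = pNew;
		m_nSize = it->second.size();
	}
private:
	Texture(const Texture&);	// copying disabled to keep ownership unique
	Texture& operator=(const Texture&);
	unsigned char *m_pPixels;
	std::size_t m_nSize;
};
```

The decision about who allocates and who frees stays inside the class, which is exactly the knowledge a generic storage engine cannot have.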
Later I will show how to reject pointers in automated functions, but this requires the application to identify them first. One way to do so is to use templates:
template<typename Type> struct is_pointer
{
	static const bool value = false;
};
template<typename Type> struct is_pointer<Type*>
{
	static const bool value = true;
};
template<typename Type> bool isPointer(const Type&)
{
	return is_pointer<Type> :: value;
}
A function isPointer is instantiated for every type it is called with, and each instantiation consults a template structure for a boolean value. Two versions of the structure exist: a generic one for non-pointers and a partial specialization for pointers. This is a relatively common trick with templates, but don’t worry if you do not understand this one fully, I have plenty of template tricks left for you.
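A short usage sketch may help. The trait is reproduced inside a namespace called sketch (my choice, to avoid any confusion with the standard library), and the last check uses std::is_pointer, the equivalent trait that ships with C++11 in the header type_traits.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch
{
	// generic version: anything that is not a pointer
	template<typename Type> struct is_pointer
	{
		static const bool value = false;
	};
	// partial specialization: selected for every Type*
	template<typename Type> struct is_pointer<Type*>
	{
		static const bool value = true;
	};
	template<typename Type> bool isPointer(const Type&)
	{
		return is_pointer<Type> :: value;
	}
}

// The trait is a compile-time constant, so it can be checked with static_assert.
static_assert(!sketch::is_pointer<int>::value, "int is not a pointer");
static_assert(sketch::is_pointer<int*>::value, "int* is a pointer");
// Since C++11 the standard library offers an equivalent trait.
static_assert(std::is_pointer<const char*>::value, "const char* is a pointer");
```

Called on values, sketch::isPointer(i) is false for an int i and true for an int* p. An array passed by reference is not reported as a pointer, because the template argument is deduced as the array type.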
So far we have written custom import and export functions that were tailored for particular class implementations. Not only does this take time but it makes change more difficult to manage.
The big leap lies in defining a simplified approach to tell the storage engine which variables and objects should be looked after. The technique I am using was inspired by the Half-Life SDK, where a static table is defined for each storable object. The table contains various information such as the names of the dependencies, their types, the location of the data and so on:
struct storage_data_s
{
	const char *szTypeName;
	const char *szClassName;
	const char *szVarName;
	size_t nByteSize;
	size_t nByteOffset;
	bool bPointer;
	bool bIsObject;
	size_t nClassByteOffset;
};
You will see that the structure has a lot in common with the internal structure that store objects use to track variables. This is not surprising, as it represents all the meaningful information about the data. One of the major differences is that, this time, dependencies can be variables as well as objects.
szTypeName, szClassName and szVarName contain the type, class, and name of the dependency. nByteSize is the size of the dependency in bytes and nByteOffset is the address offset from the base of the object, so that the data pointer can be retrieved by adding the address of the base of the object (the “this” pointer) to nByteOffset. bIsObject tells the difference between a variable and an object and bPointer tells if the dependency is a pointer or not. Finally, nClassByteOffset is the offset of the IStorableObject interface from the base of the object. We will come back to this shortly because it is quite subtle. The other functions used by the macro below are detailed later in the post, so don’t worry.
Still following the Half-Life SDK, the static table is filled using macros (the version here has been heavily modified to add new functionalities):
#define DECLARE_STORAGE(Class)					struct storage_data_s Class :: sStorageData[] = 
#define DECLARE_STORAGE_VAR(Class, Var)			{getTypeName(((Class*)0)->Var), #Class, #Var, sizeof(((Class*)0)->Var), (size_t)(&((Class*)0)->Var), isPointer(((Class*)0)->Var), isStorableObjectFunc(((Class*)0)->Var), getStorableObjectOfs(((Class*)0)->Var)}
The ‘#’ symbol before a macro parameter tells the preprocessor to turn the argument into a string literal, so #Test becomes the string “Test”. This is pretty neat to fill in the dependency and class names. ((Class*)0)->Var takes a null pointer of type “Class*” and designates its member “Var”. Because we cannot access any real data through a null pointer (that would make the app crash!), the approach may seem pointless, but it is actually useful in many programming idioms. For example, sizeof(((Class*)0)->Var) computes the byte size of the dependency Class::Var without evaluating the expression. (size_t)(&((Class*)0)->Var) is the address of Class::Var within the null object and therefore yields the offset of the dependency from the base of the object (the standard offsetof macro gives the same result for simple classes without relying on a null pointer).
Finally, we can also write a macro to expand the import and export functions. Don’t panic about read and write, we will cover them in the next section.
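The stringizing and offset tricks can be tried on a reduced table. In the sketch below, field_info_s, FIELD_INFO, Sample and readField are hypothetical names, and I use the standard offsetof macro (from cstddef) in place of the null-pointer address computation, which limits the example to simple structures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Reduced version of the table row: name, size and offset only.
struct field_info_s
{
	const char *szVarName;
	std::size_t nByteSize;
	std::size_t nByteOffset;
};

// #Var stringizes the member name; sizeof is not evaluated at run time,
// so the null pointer is never dereferenced; offsetof gives the byte offset.
#define FIELD_INFO(Class, Var) { #Var, sizeof(((Class*)0)->Var), offsetof(Class, Var) }

struct Sample
{
	int iFirst;
	double dSecond;
};

static const field_info_s sSampleFields[] =
{
	FIELD_INFO(Sample, iFirst),
	FIELD_INFO(Sample, dSecond),
};

// Copies one field out of an object using only the information of a table row:
// the data pointer is the base address of the object plus the byte offset.
void readField(const void *pObject, const field_info_s& rInfo, void *pOut)
{
	std::memcpy(pOut, static_cast<const unsigned char*>(pObject) + rInfo.nByteOffset, rInfo.nByteSize);
}
```

Given a Sample whose iFirst is 42, readField(&sample, sSampleFields[0], &i) copies 42 into an int i without the function knowing anything about the Sample type.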
/* strcmp requires <string.h>; comparing the two literals with != would only compare their addresses */
#define IMPLEMENT_STORAGE(Class, Parent)		void Class :: exportObject(StoreObject *pObject)	\
												{	\
													if(strcmp(#Class, #Parent) != 0)	\
														Parent :: exportObject(pObject);	\
														\
													if(pObject == null)	\
														return;	\
														\
													pObject->write(this, sStorageData, sizeof(sStorageData)/sizeof(struct storage_data_s));	\
												}	\
													\
												void Class :: importObject(StoreObject *pObject)	\
												{	\
													if(strcmp(#Class, #Parent) != 0)	\
														Parent :: importObject(pObject);	\
														\
													if(pObject == null)	\
														return;	\
														\
													pObject->read(this, sStorageData, sizeof(sStorageData)/sizeof(struct storage_data_s));	\
												}
We have thus created a table (and a way to populate that table) with all the information about the class dependencies, as well as export and import functions which send that table to the store objects for further processing.
We may then rewrite the very first example as:
class Test : public IStorableObject
{
public:
	Test(int i=0)
	{
		this->m_iTest = i;
	}
	
	int getValue(void) const
	{
		return this->m_iTest;
	}
	virtual void exportObject(StoreObject *pObject);
	virtual void importObject(StoreObject *pObject);
	static struct storage_data_s sStorageData[];
private:
	int m_iTest;
};
DECLARE_STORAGE(Test)
{
	DECLARE_STORAGE_VAR(Test, m_iTest),
};
IMPLEMENT_STORAGE(Test, Test);
To distinguish between variables and objects we have to write a function that tells if a particular dependency inherits from IStorableObject or not. Objects such as classes or structures which do not inherit from the interface are simply copied as variables. Be careful if you use this on classes that have dynamic contents such as pointers.
The function isStorableObjectFunc is built from templates and an overload-resolution trick usually associated with the programming idiom SFINAE (“Substitution Failure Is Not An Error”):
template<typename Type> struct isStorableObject
{
	static bool check(IStorableObject*)
	{
		return true;
	}
	static bool check(...)
	{
		return false;
	}
	static bool test(void)
	{
		return check(static_cast<Type*>(null));
	}
};
template<typename Type> bool isStorableObjectFunc(const Type&)
{
	return isStorableObject<Type> :: test();
}
An isStorableObjectFunc function is instantiated for every type at compile time. So there is one isStorableObjectFunc for integers, one for strings, one for the Array class… and they are all different. The const reference argument is handled like a pointer under the hood, so we can pass an expression built from a null pointer as long as nothing tries to access its data (the C++ standard does not formally guarantee this, so treat it as compiler-dependent). Removing the reference mark from the argument would make the compiler copy the object when the function is called, and the application would crash because of the invalid pointer.
So the compiler identifies the type from the ((Class*)0)->Var argument and instantiates a structure template whose test function returns either true or false. That value depends on which check function is selected. There are two check functions: a specific one taking an IStorableObject pointer, and a generic one taking anything (the ellipsis), which acts as the fallback. When calling check, we pass a pointer of the tested type and let the compiler decide whether it can be converted to an IStorableObject pointer. If such a conversion exists, the specific overload is chosen and returns true. If not, the fallback is chosen and returns false.
The function getTypeName is also built from templates but is actually much simpler. Once the template is instantiated, the __FUNCSIG__ macro expands to the function signature, a string that contains the template argument. All we have to do is extract the typename from that string. For example getTypeName<int>() yields the signature “class String __cdecl getTypeName<int>(void)”, and getTypeName< Array<int> >() yields “class String __cdecl getTypeName<class Array<int>>(void)”. Selecting everything between the first occurrence of ‘<’ and the last occurrence of ‘>’ gives the wanted typename.
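For readers with a C++11 compiler, the same question can be answered by the standard library. The sketch below is an alternative, not the implementation used in this post: the IStorableObject class shown here is a hypothetical stand-in, and isStorable, Plain and Stored are names made up for the illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the interface used in this post.
class IStorableObject
{
public:
	virtual ~IStorableObject() {}
};

class Plain {};
class Stored : public IStorableObject {};

// Asks the same question as the two check overloads:
// can a Type* be converted to an IStorableObject*?
template<typename Type> bool isStorable(const Type&)
{
	return std::is_convertible<Type*, IStorableObject*>::value;
}
```

With these definitions, isStorable(Stored()) is true while isStorable(Plain()) and isStorable(3) are false. The result is a compile-time constant, so it could also be used in a static_assert.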
template<typename Type> String getTypeName(void)
{
	String str(__FUNCSIG__);
	return str.subString(str.indexOf('<') + 1, str.lastIndexOf('>'));
}
template<typename Type> char *getTypeName(const Type&)
{
	static String sTypeName = getTypeName<Type>();
	return sTypeName.getPointer();
}
If you experiment with these features, you will see that the result is different when the typename is a class, a struct, a union or an enumeration, because the signature includes the corresponding keyword. I do not recommend stripping these keywords because they help to discriminate between all of them, which might be required in some applications. The signature also identifies types declared inside functions as well as nested structures or classes. For example, a structure “s” created inside the function “foo” will have the signature “struct foo::s”.
Note that this function only works with the Microsoft Visual Studio compiler. Other compilers offer different macros that do about the same thing (GCC and Clang provide __PRETTY_FUNCTION__, for instance), although the format of the string differs. Please look at your compiler documentation if you are concerned.
Exporting the data from storable objects is not difficult since all the information is already in the table. All we have to do is walk the list, check whether each dependency is an object or a variable, and proceed accordingly:
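As an illustration of what a more portable variant could look like, here is a sketch that switches between __FUNCSIG__ and __PRETTY_FUNCTION__. The name typeNameOf is hypothetical, std::string replaces the String class of this post, and the parsing relies on the signature formats currently produced by these compilers, which are not standardized and may change.

```cpp
#include <cassert>
#include <string>

// Hypothetical portable variant of getTypeName<Type>(); a sketch only.
template<typename Type> std::string typeNameOf()
{
#if defined(_MSC_VER)
	// Expected shape: "... __cdecl typeNameOf<int>(void)"
	const std::string sig(__FUNCSIG__);
	const std::string marker("typeNameOf<");
	const std::string::size_type start = sig.find(marker) + marker.size();
	const std::string::size_type stop = sig.rfind('>');
#else
	// Expected shape with GCC:   "... typeNameOf() [with Type = int; ...]"
	// Expected shape with Clang: "... typeNameOf() [Type = int]"
	const std::string sig(__PRETTY_FUNCTION__);
	const std::string marker("Type = ");
	const std::string::size_type start = sig.find(marker) + marker.size();
	const std::string::size_type stop = sig.find_first_of(";]", start);
#endif
	return sig.substr(start, stop - start);
}
```

Keep in mind that the returned names differ between compilers (the “class” and “struct” keywords are specific to the Microsoft signature), so files written with one compiler may not match type names produced by another.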
void StoreObject :: write(void *pBase, struct storage_data_s *pArray, size_t nSize)
{
	if(pArray == null || nSize == 0)
		return;
	for(size_t n=0;n<nSize;n++)
		_write(pBase, pArray[n]);
}
void StoreObject :: _write(void *pBase, struct storage_data_s& rStorageData)
{
	if(rStorageData.bPointer)
		throwException(NoHandlingPointerException(String(rStorageData.szClassName) + String("::") + String(rStorageData.szVarName)));
	if(rStorageData.bIsObject)
	{
		IStorableObject *pCast = (IStorableObject*)((byte*)pBase + rStorageData.nByteOffset + rStorageData.nClassByteOffset);
		if(pCast != null)
			addObject(pCast, rStorageData.szTypeName, rStorageData.szClassName, rStorageData.szVarName);
		return;
	}
	addVariable((byte*)pBase + rStorageData.nByteOffset, rStorageData.nByteSize, rStorageData.szTypeName, rStorageData.szClassName, rStorageData.szVarName);
}
Adding a variable holds no surprise, but adding an object is not that easy. Previously, we could rely on the compiler's pointer conversion: when we wrote &this->m_pTrianglesList in our previous example, the compiler knew that it actually had to return a pointer to the IStorableObject interface of the object m_pTrianglesList. The conversion is easy because the compiler knows the type of m_pTrianglesList and can perform the cast correctly. But here all we have is a void pointer, which can turn the conversion into your worst nightmare.
To understand why, you have to know how C++ compilers typically make virtual functions work. When we call a virtual function of a class instance, the application follows a pointer stored at the base of the object to a table of virtual function addresses. This table contains one entry for each virtual function, regardless of its signature or number of parameters. The entry gives the address of the actual function, and the application then calls this address.
So far there is no problem: the virtual table pointer lies at the base of the object, so converting the base address to an IStorableObject interface will do the trick. Yes but…
C++ also allows multiple inheritance, which means an object can inherit from more than one parent. In that case, each parent part comes with its own virtual table pointer, the parts are laid out one after the other inside the object, and the base of the object does not always correspond to the interface you would like to acquire.
As an illustration, consider the following example:
class ParentA
{
public:
	virtual void dummyA(void)
	{
		printf("I am A\r\n");
	}
};
class ParentB
{
public:
	virtual void dummyB(void)
	{
		printf("I am B\r\n");
	}
};
class Sample : public ParentA, public ParentB
{
public:
};
int main(void)
{
	Sample test;
	byte *pRaw = (byte*)&test;
	ParentA *pParentA = (ParentA*)pRaw;
	ParentB *pParentB = (ParentB*)pRaw;
	pParentA->dummyA();
	pParentB->dummyB();
}
You may be surprised that, with a typical compiler, both calls print “I am A” instead of “I am A” followed by “I am B”. This is because when we cast a raw pointer to the ParentB interface, neither the compiler nor the application has any way to know where the ParentB virtual table pointer is, so the default offset (zero) is used, which ends up right in a function of ParentA! This is problematic because the two functions might be very different, or the virtual function index might not even exist in the other interface (in which case we jump to a random memory address!).
So we may want to forbid the storage engine from handling objects with multiple inheritance... Okay, but how do you do that? How can you guarantee that everyone in your programming team will remember that rule if you cannot force them to give up multiple inheritance by issuing compiler errors or exceptions? And even if you could, would that be such a good idea? Multiple inheritance is a great thing and many of your game classes may require it. For example, how would you implement a storable array without wrapping all the original Array class functions, or without making every Array of your game a storable one?
There is however a workaround: find the offset of each interface inside the object, that is, the position of its virtual table pointer. Once you have the offset, simply add it to the base of the object to get a pointer to the desired interface.
The previous example can then be rewritten as:
Sample test;
size_t ofsA = getBaseOfs<Sample, ParentA>();
size_t ofsB = getBaseOfs<Sample, ParentB>();
printf("base A : %u\r\n", (unsigned int)ofsA);
printf("base B : %u\r\n", (unsigned int)ofsB);
byte *pRaw = (byte*)&test;
ParentA *pParentA = (ParentA*)(pRaw + ofsA);
ParentB *pParentB = (ParentB*)(pRaw + ofsB);
pParentA->dummyA();
pParentB->dummyB();
which now works like a charm.
To get the offset we use the compiler abilities on a dummy object:
template<typename Class, class Parent> size_t getBaseOfs(void)
{
	Class tmp;
	Parent *pParent = static_cast<Parent*>(&tmp);
	return (size_t)((byte*)pParent - (byte*)&tmp);
}
The function getStorableObjectOfs relies on the same pair of overloads as isStorableObject, in the spirit of the SFINAE idiom:
template<typename Type> struct IStorableObjectOfs
{
	static size_t getOfs(IStorableObject*)
	{
		return getBaseOfs<Type, IStorableObject>();
	}
	static size_t getOfs(...)
	{
		return 0;
	}
	static size_t get(void)
	{
		Type tmp;
		return getOfs(&tmp);
	}
};
template<typename Type> size_t getStorableObjectOfs(const Type&)
{
	return IStorableObjectOfs<Type> :: get();
}
This works pretty well but unfortunately requires every class that inherits from IStorableObject to have a default constructor, since a temporary instance is created to measure the offset. This is however still acceptable considering the benefits of handling multiple inheritance.
Finally, importing the data simply mirrors the export function:
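If the default constructor requirement is a problem, one possible alternative is to measure the offset on an object that already exists instead of a temporary. The sketch below only illustrates that idea and is not part of the storage engine described here: baseOfsOf, callBThroughRawPointer and callBOnFreshSample are hypothetical names, and the Sample class deliberately has no default constructor.

```cpp
#include <cassert>
#include <cstddef>

class ParentA
{
public:
	virtual ~ParentA() {}
	virtual int idA() const { return 1; }
};

class ParentB
{
public:
	virtual ~ParentB() {}
	virtual int idB() const { return 2; }
};

// No default constructor on purpose.
class Sample : public ParentA, public ParentB
{
public:
	explicit Sample(int v) : m_iValue(v) {}
	int m_iValue;
};

// Hypothetical helper: byte offset of the Parent part inside an existing object.
template<typename Parent, typename Class> std::ptrdiff_t baseOfsOf(const Class& rObject)
{
	const Parent *pParent = static_cast<const Parent*>(&rObject);
	return reinterpret_cast<const char*>(pParent) - reinterpret_cast<const char*>(&rObject);
}

// Rebuilds a ParentB pointer from the raw address plus the measured offset.
int callBThroughRawPointer(Sample& rSample)
{
	char *pRaw = reinterpret_cast<char*>(&rSample);
	ParentB *pB = reinterpret_cast<ParentB*>(pRaw + baseOfsOf<ParentB>(rSample));
	return pB->idB();
}

// Usage helper: builds a Sample and calls idB() through the raw pointer.
int callBOnFreshSample()
{
	Sample s(5);
	return callBThroughRawPointer(s);
}
```

The trade-off is that the offset can no longer be computed while the static table is being filled, because no instance exists at that point. It would have to be resolved later, for example the first time an object of that class is exported.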
bool StoreObject :: _read(void *pBase, struct storage_data_s& rStorageData)
{
	if(rStorageData.bPointer)
		throwException(NoHandlingPointerException(String(rStorageData.szClassName) + String("::") + String(rStorageData.szVarName)));
	if(rStorageData.bIsObject)
	{
		IStorableObject *pCast = (IStorableObject*)((byte*)pBase + rStorageData.nByteOffset + rStorageData.nClassByteOffset);
		AutoPtr< Iterator<StoreObject*> > it = this->m_pChildren.createForwardIterator();
		while(!it->end())
		{
			StoreObject *pObject = it->current();
			it->next();
			if(pObject == null)
				continue;
			if(!pObject->m_sClassName.equals(rStorageData.szClassName))
				continue;
			if(!pObject->m_sVarName.equals(rStorageData.szVarName))
				continue;
			if(!pObject->m_sTypeName.equals(rStorageData.szTypeName))
				continue;
			pCast->importObject(pObject);
			return true;
		}
		return false;
	}
	AutoPtr< Iterator<struct store_object_data_s> > it = this->m_pVariables.createForwardIterator();
	while(!it->end())
	{
		struct store_object_data_s &curr = it->current();
		it->next();
		if(curr.nSize != rStorageData.nByteSize)
			continue;
		if(!curr.sClassName.equals(rStorageData.szClassName))
			continue;
		if(!curr.sVarName.equals(rStorageData.szVarName))
			continue;
		if(!curr.sTypeName.equals(rStorageData.szTypeName))
			continue;
		bcopy((byte*)pBase + rStorageData.nByteOffset, curr.pData, curr.nSize);
		return true;
	}
	return false;
}
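The matching rule used by the import function (same name, same type, same size, otherwise skip) can be isolated in a few lines. The sketch below is a simplified illustration with hypothetical names (StoredVar, readVariable, makeStoreWithInt, importIntOr) and standard containers instead of the classes of this post. It leaves the class name out of the comparison to stay short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical, simplified record of one stored variable.
struct StoredVar
{
	std::string sVarName;
	std::string sTypeName;
	std::vector<unsigned char> data;
};

// Copies the stored bytes into pDest only when name, type and size all match.
bool readVariable(const std::vector<StoredVar>& rStored, const char *pszVarName, const char *pszTypeName, void *pDest, std::size_t nByteSize)
{
	for(std::size_t n = 0; n < rStored.size(); n++)
	{
		const StoredVar& curr = rStored[n];
		if(curr.data.size() != nByteSize)
			continue;
		if(curr.sVarName != pszVarName || curr.sTypeName != pszTypeName)
			continue;
		std::memcpy(pDest, curr.data.data(), nByteSize);
		return true;
	}
	return false;
}

// Builds a one-entry store holding an int, for the usage example.
std::vector<StoredVar> makeStoreWithInt(const char *pszVarName, int value)
{
	StoredVar var;
	var.sVarName = pszVarName;
	var.sTypeName = "int";
	var.data.resize(sizeof(int));
	std::memcpy(var.data.data(), &value, sizeof(int));
	return std::vector<StoredVar>(1, var);
}

// Returns the imported value, or fallback when the variable is not in the store.
int importIntOr(const std::vector<StoredVar>& rStored, const char *pszVarName, int fallback)
{
	int value = fallback;
	readVariable(rStored, pszVarName, "int", &value, sizeof(int));
	return value;
}
```

A variable that is present in the store is restored, while a variable the store does not know about keeps whatever value it had before the import. Checking the boolean result of the read is therefore the only way to notice a missing entry.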
One of the major issues with automated procedures is how to detect and handle change. Removing a dependency from a class and then loading an old file is not really a problem, but adding a new dependency and loading an old file results in indeterminate behaviour because the new variable is absent from the file and is therefore never assigned.
Sometimes handling the problem will not be difficult but the key point is to detect that the object in the file differs from the current object definition. Once you have detected the problem, you can issue an exception or a warning and let the developer handle the case.
A solution to the detection problem is to add a signature to each storable object and, during import, to compare it with the one found in the store object. The signature should depend not only on the dependencies of the class itself but also on those of any parent class, and any change to a variable name, size or type should result in a different signature. However, changes in offsets or in the position within the storage list should have no effect on the signature.
I propose the following procedure:
static word getChecksum(struct storage_data_s *pArray, size_t nSize)
{
	if(pArray == null)
		return 0;
	word wSum = 0;
	for(size_t n=0;n<nSize;n++)
	{
		if(pArray[n].szTypeName != null)
			wSum += ~checksum((byte*)pArray[n].szTypeName, strlen(pArray[n].szTypeName));
		if(pArray[n].szClassName != null)
			wSum += ~checksum((byte*)pArray[n].szClassName, strlen(pArray[n].szClassName));
		if(pArray[n].szVarName != null)
			wSum += ~checksum((byte*)pArray[n].szVarName, strlen(pArray[n].szVarName));
		wSum += ~checksum((byte*)&pArray[n].nByteSize, sizeof(pArray[n].nByteSize));
		wSum += ~checksum((byte*)&pArray[n].bIsObject, sizeof(pArray[n].bIsObject));
		wSum += ~checksum((byte*)&pArray[n].bPointer, sizeof(pArray[n].bPointer));
	}
	return ~wSum;
}
static const word wSignature;
#define IMPLEMENT_STORAGE(Class, Parent)	...	\
							\
						word Class :: getObjectSignature(void) const	\
						{																												\
							word wSum = ~wSignature;	\
								\
							if(strcmp(#Class, #Parent) != 0)	\
								wSum += ~Parent :: getObjectSignature();	\
								\
							return ~wSum;																								\
						}	\
							\
						const word Class :: wSignature = getChecksum(sStorageData, sizeof(sStorageData) / sizeof(storage_data_s));
It has the following advantages:
- It is built from additions, so the order of the storage data in the list does not matter.
- It includes all critical fields and is insensitive to any offset changes.
- Most of the signature is computed once at launch time, so it costs very little computation time afterwards.
We have covered a lot and I will conclude with two examples that illustrate the ease of use and the power of the method.
Earlier in this post I talked about storable arrays. They are a good example of several key features, such as the problem of multiple inheritance and custom import/export functions. As a reference I will use a template Array<Type> class that accepts any type of data, and I will build a StorageArray<Type> class on top of it.
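The order-independence property is easy to check on a reduced model. The sketch below is not the checksum function used above: it assumes FNV-1a as a stand-in hash, and FieldDesc, hashString, tableSignature and makeTable are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, reduced table row: only the fields that feed the signature.
struct FieldDesc
{
	std::string sTypeName;
	std::string sVarName;
	std::uint32_t nByteSize;
};

// FNV-1a, used here only as a stand-in for the checksum() function of the post.
std::uint32_t hashString(const std::string& s)
{
	std::uint32_t h = 2166136261u;
	for(std::size_t n = 0; n < s.size(); n++)
	{
		h ^= static_cast<unsigned char>(s[n]);
		h *= 16777619u;
	}
	return h;
}

// Sums one value per field; addition commutes, so row order has no effect.
std::uint32_t tableSignature(const std::vector<FieldDesc>& rFields)
{
	std::uint32_t sum = 0;
	for(std::size_t n = 0; n < rFields.size(); n++)
		sum += hashString(rFields[n].sTypeName) + hashString(rFields[n].sVarName) + rFields[n].nByteSize;
	return sum;
}

// Convenience builder for a two-field table, used by the usage example.
std::vector<FieldDesc> makeTable(const FieldDesc& a, const FieldDesc& b)
{
	std::vector<FieldDesc> v;
	v.push_back(a);
	v.push_back(b);
	return v;
}
```

Swapping the two rows of a table leaves the signature unchanged, while changing the size of one field changes it. One limitation of this sketch is that exchanging the types of two fields goes unnoticed, because every term is summed independently. Hashing each row as a whole before summing would be one way to catch that case.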
There are two ways to add functionality to a class: either you create a derived class that inherits from the Array<Type> class, or you create a StorageArray<Type> class that holds an Array<Type> instance and forwards all the functions to that object. Both solutions are equally valid in terms of design patterns but correspond to two very different things. The first is an “is-a” relationship while the second is a “has-a” relationship. The second also makes you write a forwarding version of every function. Nonetheless, it has the advantage that you can modify the data before it reaches the Array<Type> class.
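The difference between the two designs can be seen on a toy container. Everything in the sketch below is hypothetical (SimpleArray, IStorable, DerivedStorableArray, WrappedStorableArray, fillAndCount) and far smaller than the real Array<Type> class. It only shows where the forwarding code ends up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal stand-ins for Array<Type> and IStorableObject.
template<typename Type> class SimpleArray
{
public:
	void add(const Type& rValue) { m_data.push_back(rValue); }
	std::size_t getSize() const { return m_data.size(); }
private:
	std::vector<Type> m_data;
};

class IStorable
{
public:
	virtual ~IStorable() {}
	virtual std::size_t countStoredItems() const = 0;
};

// "is-a": every SimpleArray function is inherited for free.
template<typename Type> class DerivedStorableArray : public SimpleArray<Type>, public IStorable
{
public:
	virtual std::size_t countStoredItems() const { return this->getSize(); }
};

// "has-a": each function must be forwarded by hand, but values can be altered on the way in.
template<typename Type> class WrappedStorableArray : public IStorable
{
public:
	void add(const Type& rValue) { m_array.add(rValue); }
	std::size_t getSize() const { return m_array.getSize(); }
	virtual std::size_t countStoredItems() const { return m_array.getSize(); }
private:
	SimpleArray<Type> m_array;
};

// Fills either container with n integers and reports what would be stored.
template<typename Container> std::size_t fillAndCount(std::size_t n)
{
	Container c;
	for(std::size_t i = 0; i < n; i++)
		c.add(static_cast<int>(i));
	return c.countStoredItems();
}
```

Both containers behave the same from the outside. Note the this-> in front of getSize() in the derived version: compilers that apply two-phase name lookup strictly need it to find members of a template base class.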
Here, I have chosen to use the first design because it corresponds better to what we are trying to achieve: create a specialized type of Array<Type> that can be stored without having to modify the data before it is inserted in the array.
The implementation is given below and uses templates to support generic types:
template<typename Type> class StorageArray : public Array<Type>, public IStorableObject
{
public:
	StorageArray(void) {}
	virtual void exportObject(StoreObject *pObject)
	{
		if(is_pointer<Type> :: value)
			throwException(NoHandlingPointerException(getTypeName<Type>()));
		if(isStorableObject<Type> :: test())
		{
			for(size_t n=0;n<getSize();n++)
			{
				IStorableObject *pCast = (IStorableObject*)&(this->operator[](n));
				if(pCast != null)
					pObject->addObject(pCast, getTypeName<Type>().getPointer(), "", "", STORE_OBJECT_MANUAL);
			}
		}
		else
		{
			for(size_t n=0;n<getSize();n++)
				pObject->addVariable(&(this->operator[](n)), sizeof(Type), getTypeName<Type>().getPointer(), "", "");
		}
	}
	virtual void importObject(StoreObject *pObject)
	{
		clear();
		if(pObject == null)
			return;
		if(is_pointer<Type> :: value)
			throwException(NoHandlingPointerException(getTypeName<Type>()));
		if(isStorableObject<Type> :: test())
		{
			AutoPtr< Iterator<StoreObject*> > it = pObject->getChildren();
			while(!it->end())
			{
				StoreObject *pCurrent = it->current();
				it->next();
				if(pCurrent == null)
					continue;
				if(!pCurrent->getTypeName().equals(getTypeName<Type>()))
					continue;
				size_t n = getSize();
				inc(1);
				IStorableObject *pCast = (IStorableObject*)&(this->operator[](n));
				if(pCast != null)
					pCast->importObject(pCurrent);
			}
		}
		else
		{
			AutoPtr< Iterator<struct store_object_data_s> > it = pObject->getVariables();
			while(!it->end())
			{
				struct store_object_data_s& curr = it->current();
				it->next();
				if(curr.nSize != sizeof(Type))
					continue;
				if(!curr.sTypeName.equals(getTypeName<Type>()))
					continue;
				size_t n = getSize();
				inc(1);
				this->operator[](n) = *static_cast<Type*>(curr.pData);
			}
		}
	}
};
void test_array_write(const char *pszFileName)
{
	StorageArray<int> ar;
	ar.inc(10);
	for(int i=0;i<ar.getSize();i++)
		ar[i] = i+1;
	AutoPtr<StoreObject> pObject = createStoreObject(&ar, getTypeName(ar), "", "");
	exportObjectToFile(pObject, pszFileName);
}
void test_array_read(const char *pszFileName)
{
	LinkedList<StoreObject*> collection = importObjectCollectionFromFile(pszFileName);
	AutoPtr< Iterator<StoreObject*> > it = collection.createForwardIterator();
	while(!it->end())
	{
		StoreObject *pObject = it->current();
		it->next();
		if(pObject == null)
			continue;
		
		if(!pObject->getTypeName().equals("class StorageArray<int>"))
		{
			delete pObject;
			continue;
		}
		StorageArray<int> ar;
		try
		{
			ar.importObject(pObject);
		} catch(IException &rException)
		{
			printf("Exception: %s\r\n", rException.toString().getPointer());
			delete pObject;
			continue;
		}
		delete pObject;
		for(int i=0;i<ar.getSize();i++)
			printf("%d, ", ar[i]);
		printf("\r\n");
	}
}
The import and export functions are explicitly implemented as custom functions and handle two different scenarios: either the element type is an object or it is a variable. The helper functions written previously (is_pointer, isStorableObject, getTypeName…) are used heavily in the code.
The code also provides two test functions, one to write to a file and the other to read from it. Writing creates an array of ten integers and converts the array to a store object that is written to a file. Reading first builds a list of store objects from the unpacked data through the function importObjectCollectionFromFile. The list is then browsed and valid entries are imported and printed to the screen.
The output test file takes 378 bytes to encode 10 integers (40 bytes), so the storage efficiency is not that good: about 10%. But this is the price to pay for the ease of use and the various safety checks. The output test file is however mainly composed of zeroes and should therefore be a good target for data compression if you care a lot about file size.
To show that automated storage can be quite user-friendly, I will give an example of loading batches of triangles forming a 3D model. 3D models can have many more features than just batches of triangles, but this is only for illustration.
The class defines an array of triangles as well as a few functions to add and list the triangles. And that’s about all, because if you leave out the triangle management code, the storage part takes only 3 lines!
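To give an idea of how little is needed to exploit those zeroes, here is a sketch of a very simple run-length scheme that only compresses runs of zero bytes. It is an illustration under my own assumptions, not part of the file format described in this post, and Bytes, packZeroRuns, unpackZeroRuns and sampleBuffer are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Replaces each run of zero bytes by the pair (0, run length); other bytes are copied as-is.
Bytes packZeroRuns(const Bytes& rInput)
{
	Bytes out;
	std::size_t n = 0;
	while(n < rInput.size())
	{
		if(rInput[n] != 0)
		{
			out.push_back(rInput[n++]);
			continue;
		}
		std::size_t run = 0;
		while(n < rInput.size() && rInput[n] == 0 && run < 255)
		{
			run++;
			n++;
		}
		out.push_back(0);
		out.push_back(static_cast<unsigned char>(run));
	}
	return out;
}

// Inverse of packZeroRuns.
Bytes unpackZeroRuns(const Bytes& rPacked)
{
	Bytes out;
	std::size_t n = 0;
	while(n < rPacked.size())
	{
		if(rPacked[n] != 0)
		{
			out.push_back(rPacked[n++]);
			continue;
		}
		if(n + 1 >= rPacked.size())
			break; // truncated input: stop rather than read past the end
		out.insert(out.end(), static_cast<std::size_t>(rPacked[n + 1]), static_cast<unsigned char>(0));
		n += 2;
	}
	return out;
}

// Usage helper: a 64-byte buffer that is mostly zeroes.
Bytes sampleBuffer()
{
	Bytes b(64, 0);
	b[0] = 1;
	b[32] = 7;
	return b;
}
```

On the 64-byte sample buffer the packed form is only a handful of bytes, and unpacking restores the original content exactly. A general-purpose compression library would do better, but even this naive scheme benefits from files dominated by zeroes.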
struct vector3f
{
	float x, y, z;
};
struct triangle_s
{
	struct vector3f pos[3];
	struct vector3f normal;
};
class Mesh : public IStorableObject
{
public:
	IMPLEMENT_STORAGE_DEFS
	Mesh(void)
	{
	}
	size_t getNumTriangles(void) const
	{
		return this->m_pGeometry.getSize();
	}
	struct triangle_s& getTriangle(size_t nIndex)
	{
		return this->m_pGeometry[nIndex];
	}
	void addTriangle(const struct triangle_s& rTriangle)
	{
		size_t nSize = this->m_pGeometry.getSize();
		this->m_pGeometry.inc(1);
		this->m_pGeometry[nSize] = rTriangle;
	}
	void clearAllTriangles(void)
	{
		this->m_pGeometry.clear();
	}
private:
	StorageArray<struct triangle_s> m_pGeometry;
};
DECLARE_STORAGE(Mesh)
{
	DECLARE_STORAGE_VAR(Mesh, m_pGeometry),
};
IMPLEMENT_STORAGE(Mesh, Mesh);
I hope these last examples have convinced you that automated storage can really change your life ;-)