Serialisation of C++ classes with very little code
Serialisation into human-readable formats tends to be a tedious task in programming languages without much reflection. In C++, one often ends up declaring variables in one place, giving them initial values in another, serialising them in a third and deserialising them in a fourth. C++11 added the possibility to initialise member variables where they are defined, but even fairly convenient libraries for parsing and manipulating the saved data cannot eliminate the need to write one blob of code for serialisation and another for deserialisation.
    // For each member variable
    int timeout = 1000; // C++11 and later
    //...
    // deserialisation
    if (source.count("timeout"))
        timeout = source["timeout"];
    //...
    // serialisation
    source["timeout"] = timeout;
The principle
The serialisation part and the deserialisation part use identical information: the type of the member, a reference to the member and the key that identifies the variable. They can therefore be merged into one function.
    template <typename T>
    void serialiseOrDeserialise(bool serialising, json& config, const std::string& name, T& value) {
        if (serialising)
            config[name] = value;
        else if (config.count(name)) // keep the current value if the key is missing
            value = config[name];
    }
Now, both serialisation and deserialisation require only one function that goes through all the serialised members. This doesn’t only shorten the code. It also prevents crashes caused by checking the wrong key’s existence or forgetting to check it, and it prevents saving issues caused by saving a variable under one key and trying to load it under a different one.
    void serialisationDeserialisation(bool serialising, json& config) {
        serialiseOrDeserialise(serialising, config, "timeout", timeout);
        serialiseOrDeserialise(serialising, config, "address", address);
        serialiseOrDeserialise(serialising, config, "enabled", enabled);
    }
This is quite convenient, but there’s still quite a lot of copy-paste code.
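The merged function can be tried out without any JSON library. The following is only a sketch under stated assumptions: Config and Value are hypothetical stand-ins for the json class (a map from keys to one of three value types), and Connection and roundTrip are names made up for the illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>

// Hypothetical stand-in for a JSON object: keys mapped to one of three value types.
using Value = std::variant<int, bool, std::string>;
using Config = std::map<std::string, Value>;

// One function covers both directions, so the key and the member are named once.
template <typename T>
void serialiseOrDeserialise(bool serialising, Config& config, const std::string& name, T& value) {
    if (serialising) {
        config[name] = value;
    } else if (auto found = config.find(name); found != config.end()) {
        value = std::get<T>(found->second); // throws std::bad_variant_access on a type mismatch
    }
}

struct Connection {
    int timeout = 1000;
    std::string address = "localhost";
    bool enabled = true;

    void serialisationDeserialisation(bool serialising, Config& config) {
        serialiseOrDeserialise(serialising, config, "timeout", timeout);
        serialiseOrDeserialise(serialising, config, "address", address);
        serialiseOrDeserialise(serialising, config, "enabled", enabled);
    }
};

// Usage: saves one object and loads the result into a fresh one.
inline Connection roundTrip(Connection original) {
    Config config;
    original.serialisationDeserialisation(true, config);
    assert(config.size() == 3); // every member was written under its own key
    Connection restored;
    restored.serialisationDeserialisation(false, config);
    return restored;
}
```

A key that is missing from the loaded data leaves the member at its default value, which is the behaviour the existence check is there to provide.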
Making it cleaner
The serialising and config arguments are repeated all over the code, so they can be moved into a parent class that provides the functionality. This is why I think it’s super useful to be allowed to inherit from multiple parent classes.
    bool serialising;
    json config;

    virtual void serialisation() = 0;

    json serialise() {
        serialising = true;
        config = json();
        serialisation();
        return config;
    }

    void deserialise(const json& source) {
        serialising = false;
        config = source;
        serialisation();
    }
Together with a rename of the function for serialising members, this can shorten the class-specific code to:
    void serialisation() override {
        synch("timeout", timeout);
        synch("address", address);
        synch("enabled", enabled);
    }
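Put together, the parent class and a class using it might look like the sketch below. It is not the published implementation: Config and Value are hypothetical stand-ins for the json class, the body of synch is an assumption based on the merged function shown earlier, and Connection is a made-up example class.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>

// Hypothetical stand-in for the json class used in the article.
using Value = std::variant<int, bool, std::string>;
using Config = std::map<std::string, Value>;

class Serialisable {
    bool serialising = false;
    Config config;

protected:
    // Each subclass lists its members here, once.
    virtual void serialisation() = 0;

    // Writes the member when saving, reads it (if the key is present) when loading.
    template <typename T>
    void synch(const std::string& name, T& value) {
        if (serialising) {
            config[name] = value;
        } else if (auto found = config.find(name); found != config.end()) {
            value = std::get<T>(found->second);
        }
    }

public:
    virtual ~Serialisable() = default;

    Config serialise() {
        serialising = true;
        config = Config();
        serialisation();
        return config;
    }

    void deserialise(const Config& source) {
        serialising = false;
        config = source;
        serialisation();
    }
};

struct Connection : Serialisable {
    int timeout = 1000;
    std::string address = "localhost";
    bool enabled = true;

protected:
    void serialisation() override {
        synch("timeout", timeout);
        synch("address", address);
        synch("enabled", enabled);
    }
};

// Usage: a saved object can be loaded into a fresh instance.
inline void example() {
    Connection saved;
    saved.timeout = 250;
    Connection loaded;
    loaded.deserialise(saved.serialise());
    assert(loaded.timeout == 250);
}
```

The direction flag and the data now live in the parent, so the subclass only names each key and member once.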
The two arguments could be merged with a macro (as macros can turn a piece of code into a string literal), but that would prevent using a key in the file that differs from the variable’s name in code, and macros tend to be ugly.
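For illustration, such a macro could rely on the preprocessor’s # operator. SYNCH and KeyRecorder are hypothetical names, and the synch here is a stand-in that only records the keys it receives.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical macro: the # operator turns the member's name into a string literal,
// so the key in the file is always spelled exactly like the variable.
#define SYNCH(member) synch(#member, member)

struct KeyRecorder {
    std::vector<std::string> keys;
    int timeout = 1000;
    bool enabled = true;

    // Stand-in for the real synch(): it only records which key was used.
    template <typename T>
    void synch(const std::string& name, T&) {
        keys.push_back(name);
    }

    void serialisation() {
        SYNCH(timeout); // expands to synch("timeout", timeout)
        SYNCH(enabled); // expands to synch("enabled", enabled)
    }
};

// Usage: the recorded keys match the member names.
inline void example() {
    KeyRecorder recorder;
    recorder.serialisation();
    assert(recorder.keys.size() == 2 && recorder.keys[0] == "timeout");
}
```

The drawback is visible in the expansion: renaming a member silently renames its key, which breaks previously saved files.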
Now, there is only one loose end: it’s assumed that the magical json class can serialise everything.
Implementation
The implementation can be found on GitHub. It uses a custom library for accessing JSON to avoid dependencies (it was later replaced by a better one).
I could have used nlohmann::json, but it is quite a large file and would add an additional dependency. So I used a custom one.
Matching types to exact JSON objects is done through partial template
specialisation, which allows adding custom types outside of the class’
implementation. It also allows serialising containers encapsulating
already serialisable objects, containers of containers of serialisable
objects etc.
    template <typename Serialised, typename SFINAE>
    struct Serialiser {
        constexpr static bool valid = false; // For static_assert
    };

    template <>
    struct Serialiser<std::string, void> {
        constexpr static bool valid = true;
        static json serialise(const std::string& value) {
            return json(value);
        }
        static void deserialise(std::string& result, const json& value) {
            result = value.string();
        }
    };

    template <typename T>
    struct Serialiser<std::vector<T>, std::enable_if_t<Serialiser<T, void>::valid>> {
        constexpr static bool valid = true;
        static json serialise(const std::vector<T>& value) {
            auto made = json(json::ArrayType(value.size()));
            for (unsigned int i = 0; i < value.size(); i++)
                made[i] = Serialiser<T, void>::serialise(value[i]);
            return made;
        }
        static void deserialise(std::vector<T>& result, const json& value) {
            const std::vector<json>& got = value.array();
            result.resize(got.size());
            for (unsigned int i = 0; i < got.size(); i++)
                Serialiser<T, void>::deserialise(result[i], got[i]);
        }
    };
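To see how the valid flag propagates through nested containers, here is a self-contained sketch. The Json struct is a hypothetical stand-in for the json class, the int specialisation and the Unsupported type are made up for the example, and the second template parameter is given a default of void for brevity, which differs from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the json class: a node holds text or an array of nodes.
struct Json {
    std::string text;
    std::vector<Json> items;
};

// Primary template: anything without a specialisation is not serialisable.
template <typename Serialised, typename SFINAE = void>
struct Serialiser {
    constexpr static bool valid = false;
};

// A type added from outside: int is stored as decimal text.
template <>
struct Serialiser<int, void> {
    constexpr static bool valid = true;
    static Json serialise(const int& value) { return Json{std::to_string(value), {}}; }
    static void deserialise(int& result, const Json& value) { result = std::stoi(value.text); }
};

// A vector is serialisable whenever its element type is.
template <typename T>
struct Serialiser<std::vector<T>, std::enable_if_t<Serialiser<T>::valid>> {
    constexpr static bool valid = true;
    static Json serialise(const std::vector<T>& value) {
        Json made;
        for (const T& element : value)
            made.items.push_back(Serialiser<T>::serialise(element));
        return made;
    }
    static void deserialise(std::vector<T>& result, const Json& value) {
        result.resize(value.items.size());
        for (std::size_t i = 0; i < result.size(); i++)
            Serialiser<T>::deserialise(result[i], value.items[i]);
    }
};

struct Unsupported {};

// The vector specialisation is selected recursively, or rejected through SFINAE.
static_assert(Serialiser<std::vector<std::vector<int>>>::valid, "nested containers work");
static_assert(!Serialiser<std::vector<Unsupported>>::valid, "unknown element type is rejected");

// Usage: a vector of vectors survives a round trip.
inline void example() {
    std::vector<std::vector<int>> grid = {{1, 2}, {3}};
    Json saved = Serialiser<std::vector<std::vector<int>>>::serialise(grid);
    std::vector<std::vector<int>> loaded;
    Serialiser<std::vector<std::vector<int>>>::deserialise(loaded, saved);
    assert(loaded == grid);
}
```

Because an unsupported element type falls back to the primary template, a static_assert on valid can report the problem at the call site instead of deep inside the container code.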
Handling various possible types
Because C++ is statically typed, the best way to store multiple possible types in one variable is polymorphism. To avoid having to break the syntax by writing a factory somewhere, I have used a generic self-registering factory.
This adds a dependency, so it’s in a separate header. In C++17, making a class available for selection during deserialisation only requires adding the following line to the class (assuming the common parent is named ContentType and the class being added is Content1, whose identifier is c1):
    inline const static polymorphism<ContentType, Content1> registration = "c1"; // the variable name is a placeholder
This can be wrapped in a macro for better readability:
SERIALISABLE_REGISTER_POLYMORPHIC(ContentType, Content1, "c1");
For the class to be serialisable, its identifier has to be written in its serialisation() method:
subclass("c1");
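The general idea of a self-registering factory can be sketched as follows. This is not the code of the library mentioned above: registry, create and the member name registration are hypothetical, and the polymorphism template here is a simplified imitation that only stores a maker function. It relies on static members being initialised before main runs, which mainstream compilers do.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical per-parent registry; a function-local static avoids
// initialisation-order problems between static objects.
template <typename Parent>
std::map<std::string, std::function<std::unique_ptr<Parent>()>>& registry() {
    static std::map<std::string, std::function<std::unique_ptr<Parent>()>> makers;
    return makers;
}

// Constructing one of these registers Child under the given identifier.
template <typename Parent, typename Child>
struct polymorphism {
    polymorphism(const char* identifier) {
        registry<Parent>()[identifier] = [] {
            return std::unique_ptr<Parent>(std::make_unique<Child>());
        };
    }
};

// Looks the identifier up and builds the matching subclass, or returns nullptr.
template <typename Parent>
std::unique_ptr<Parent> create(const std::string& identifier) {
    auto found = registry<Parent>().find(identifier);
    if (found == registry<Parent>().end())
        return nullptr;
    return found->second();
}

struct ContentType {
    virtual ~ContentType() = default;
    virtual std::string name() const = 0;
};

struct Content1 : ContentType {
    std::string name() const override { return "first"; }
    // The only line needed to make Content1 selectable by its identifier.
    inline const static polymorphism<ContentType, Content1> registration = "c1";
};

// Usage: the subclass is created from the identifier alone.
inline void example() {
    std::unique_ptr<ContentType> made = create<ContentType>("c1");
    assert(made && made->name() == "first");
}
```

During deserialisation, the identifier stored in the file would be passed to something like create, so no central list of subclasses has to be maintained.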
Conclusion
Until static reflection (such as the proposed reflexpr operator, which did not make it into C++23) is standardised and widely supported, serialisation into human-readable files remains an easy but tedious task in C++. To deal with this, I have developed a C++ class that does all the work and requires the user only to implement a virtual function that matches variables to the keys in the file.
I am not the first person to have done this, but I have seen too much suffering caused by not knowing this trick.
Later, I discovered a way to make classes serialisable with even less code.