Generic self-registering factory – no more need to write factories

Polymorphism is a very good tool for helping obey the Single Responsibility Principle. It allows a class to implement all its functionality while it’s accessed only using an interface shared with many other classes. Its very existence requires only the presence of its source code… and something that calls its constructor. It’s somewhat annoying that a factory needs to depend on all classes that inherit from the common interface. This issue can be avoided using a self-registering factory.

A self-registering factory is a pattern in which the factory holds only a data structure of functors, each of which constructs an instance of one specific subclass and returns it through the shared interface. Each subclass registers its constructor into the factory’s data structure during program startup.

As a result, adding a new component to the program only requires adding its code to the project; no other files need to be edited.

How it can work in C++

Executing code that’s not called from anywhere is usually not a good idea, and the syntax of many languages doesn’t allow it. C++ is one of those languages, but there is a way around this restriction: the initialisation of global variables.

bool component35Initialised = initialiseComponent35();

A global variable like this one can be defined at the bottom of a .cpp file, where it doesn’t cause too much mess.
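A minimal sketch of the trick, assuming a hypothetical startupLog() helper that exists only to make the side effect observable (it is not part of the original code):

```cpp
#include <string>
#include <vector>

// Hypothetical log of start-up events, used only to make the side effect visible.
std::vector<std::string>& startupLog() {
  static std::vector<std::string> log;
  return log;
}

// Nothing calls this function explicitly; it runs because a global needs a value.
bool initialiseComponent35() {
  startupLog().push_back("component35");
  return true;
}

// Typically placed at the bottom of the component's .cpp file.
bool component35Initialised = initialiseComponent35();
```

By the time code in main() inspects startupLog(), the entry is already there, although no code ever called initialiseComponent35() directly.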

However, this is still a recipe for disaster. The order in which global variables in different translation units are initialised is unspecified, so it’s easy for functions like initialiseComponent35() (the name is intentionally bad) to unintentionally access something that has not been initialised yet.

This problem can be avoided using a static local variable in a function. Such a variable is initialised the first time the function runs, so a function that returns a reference to it returns a reliably constructed object no matter when it is called. Its destruction, however, still happens during program exit, at a point that is hard to predict.
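A small sketch of such an accessor, using a hypothetical NameRegistry class invented for this illustration:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical registry; the class and its members are assumptions for illustration.
class NameRegistry {
public:
  // The static local is constructed on the first call, so callers never see
  // an unconstructed object, even while other globals are being initialised.
  static NameRegistry& getInstance() {
    static NameRegistry instance;
    return instance;
  }
  void add(const std::string& name) { names_.push_back(name); }
  std::size_t size() const { return names_.size(); }

private:
  NameRegistry() = default;
  std::vector<std::string> names_;
};

// A global initialiser can use the registry safely, whatever the initialisation order.
bool textRegistered = (NameRegistry::getInstance().add("Text"), true);
```

Every call to getInstance() returns the same object, so registrations made from different .cpp files end up in one place.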

To prevent this from being abused, it can be accessible only to a friend function that registers the class. The function can do something like this:

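template <typename Child>
bool registerClassIntoFactory(const std::string& name) {
  auto& instance = ChildRegistry::getInstance();
  // Assumes addChild() stores the constructor under the name used for lookup later
  instance.addChild(name, [](Config& config) -> std::shared_ptr<Component> {
    return std::make_shared<Child>(config);
  });
  return true; // the value exists only so that a global can be initialised with it
}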

Now, the factory’s method that creates instances doesn’t need to include a load of various subclasses:

std::shared_ptr<Component> makeComponent(const std::string& name, Config& config) {
  auto& instance = ChildRegistry::getInstance();
  return instance.getConstructor(name)(config);
}
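Put together, the non-generic version could look roughly like the sketch below. The internals of ChildRegistry, the Config field and the Button class are assumptions made up for this illustration, and getInstance() is left public for brevity instead of being restricted to a friend function.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

struct Config { int size = 0; };  // hypothetical configuration type

struct Component {                // the shared interface
  virtual ~Component() = default;
  virtual std::string describe() const = 0;
};

class ChildRegistry {
public:
  using Constructor = std::function<std::shared_ptr<Component>(Config&)>;

  static ChildRegistry& getInstance() {
    static ChildRegistry instance;  // constructed on first use
    return instance;
  }
  void addChild(const std::string& name, Constructor constructor) {
    constructors_[name] = std::move(constructor);
  }
  const Constructor& getConstructor(const std::string& name) const {
    auto found = constructors_.find(name);
    if (found == constructors_.end())
      throw std::runtime_error("Unknown component: " + name);
    return found->second;
  }

private:
  ChildRegistry() = default;
  std::map<std::string, Constructor> constructors_;
};

template <typename Child>
bool registerClassIntoFactory(const std::string& name) {
  ChildRegistry::getInstance().addChild(
      name, [](Config& config) -> std::shared_ptr<Component> {
        return std::make_shared<Child>(config);
      });
  return true;
}

std::shared_ptr<Component> makeComponent(const std::string& name, Config& config) {
  return ChildRegistry::getInstance().getConstructor(name)(config);
}

// Everything below would live in its own .cpp file; makeComponent() never names Button.
class Button : public Component {
public:
  explicit Button(Config& config) : size_(config.size) {}
  std::string describe() const override {
    return "Button of size " + std::to_string(size_);
  }

private:
  int size_;
};

bool buttonRegistered = registerClassIntoFactory<Button>("Button");
```

Calling makeComponent("Button", config) then returns a Button behind the Component interface, while an unregistered name throws std::runtime_error.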

This is useful, but the factory needs to be written for each parent class. Repetitive code is fertile ground for bugs, so it needs to be generic. A proper implementation would look roughly like this:

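template <typename Parent, typename Child, typename... Args>
bool registerClassIntoFactory(const std::string& name) {
  auto& instance = ChildRegistry<Parent, Args...>::getInstance();
  // Assumes addChild() stores the constructor under the name used for lookup later
  instance.addChild(name, [](Args&... args) -> std::shared_ptr<Parent> {
    return std::make_shared<Child>(args...);
  });
  return true; // the value exists only so that a global can be initialised with it
}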
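For illustration, a generic registry that this function could talk to might be sketched as follows. This is not the code from the repository: the members of ChildRegistry, the create() method and the Shape/Square hierarchy are assumptions invented for the example.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

template <typename Parent, typename... Args>
class ChildRegistry {
public:
  using Constructor = std::function<std::shared_ptr<Parent>(Args&...)>;

  static ChildRegistry& getInstance() {
    static ChildRegistry instance;  // one registry per <Parent, Args...> combination
    return instance;
  }
  void addChild(const std::string& name, Constructor constructor) {
    constructors_[name] = std::move(constructor);
  }
  std::shared_ptr<Parent> create(const std::string& name, Args&... args) const {
    auto found = constructors_.find(name);
    if (found == constructors_.end())
      throw std::runtime_error("Unknown child: " + name);
    return found->second(args...);
  }

private:
  ChildRegistry() = default;
  std::map<std::string, Constructor> constructors_;
};

template <typename Parent, typename Child, typename... Args>
bool registerClassIntoFactory(const std::string& name) {
  ChildRegistry<Parent, Args...>::getInstance().addChild(
      name, [](Args&... args) -> std::shared_ptr<Parent> {
        return std::make_shared<Child>(args...);
      });
  return true;
}

// Hypothetical hierarchy used only to exercise the sketch.
struct Shape {
  virtual ~Shape() = default;
  virtual double area() const = 0;
};

struct Square : Shape {
  explicit Square(double side) : side_(side) {}
  double area() const override { return side_ * side_; }
  double side_;
};

bool squareRegistered = registerClassIntoFactory<Shape, Square, double>("Square");
```

With this in place, ChildRegistry<Shape, double>::getInstance().create("Square", side) builds a Square without the calling code ever including its definition. Note that the argument types form part of the registry's identity, so the registration and the lookup must name the same Args.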

The full code can be seen on GitHub.

A side effect of this approach is that the specific subclasses don’t even need headers, saving a lot of repetitive work.

Using one class’ instance to select another’s instance

What if the generic factory is used to select the type of a configuration object (for example made from a JSON object, using one of its fields as identifier of the actual class) and the actual class needs to be constructed according to the configuration object type rather than the value of a variable?

Well, C++ offers a trick for that. The typeid operator allows detecting the actual derived class of a polymorphic object at runtime. Used this way, it violates the Liskov Substitution Principle. It’s the goto of Object Oriented Programming and should be used with great care.

The function’s external interface itself doesn’t have to violate the Liskov Substitution Principle, because it can take one parent class and return another parent class, leaking no information about the child class.

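static std::unique_ptr<ConstructedParent> createChild(const PrimaryParent& primary, Args... args)
{ // some static_assert and mutex skipped!
	auto& factory = GenericSecondaryFactory<PrimaryParent, ConstructedParent>::getInstance();
	// typeid on a reference to a polymorphic class yields the dynamic (derived) type;
	// passing the parent by value would slice the object and lose that information.
	// Note that hash_code() values are not guaranteed to be unique for every type.
	auto found = factory._children.find(typeid(primary).hash_code());
	if(found == factory._children.end())
		throw std::runtime_error("Unknown child");
	return found->second(primary, args...);
}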
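A self-contained sketch of the same idea is shown below. It swaps hash_code() for std::type_index, which is designed to be used as a map key, and all class and function names in it (WidgetConfig, TextConfig, registerWidgetFor, createWidget and so on) are hypothetical, not taken from the original code.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical "primary" hierarchy: configuration objects.
struct WidgetConfig { virtual ~WidgetConfig() = default; };
struct TextConfig : WidgetConfig { std::string text = "hello"; };

// Hypothetical "constructed" hierarchy: the widgets built from them.
struct Widget {
  virtual ~Widget() = default;
  virtual std::string render() const = 0;
};
struct TextWidget : Widget {
  explicit TextWidget(const TextConfig& config) : text_(config.text) {}
  std::string render() const override { return "[" + text_ + "]"; }
  std::string text_;
};

using WidgetMaker = std::function<std::unique_ptr<Widget>(const WidgetConfig&)>;

std::map<std::type_index, WidgetMaker>& widgetMakers() {
  static std::map<std::type_index, WidgetMaker> makers;
  return makers;
}

template <typename ConfigChild, typename WidgetChild>
bool registerWidgetFor() {
  widgetMakers()[std::type_index(typeid(ConfigChild))] =
      [](const WidgetConfig& config) -> std::unique_ptr<Widget> {
        // The map key guarantees that the dynamic type is ConfigChild here.
        return std::make_unique<WidgetChild>(static_cast<const ConfigChild&>(config));
      };
  return true;
}

// The interface mentions only the two parent classes.
std::unique_ptr<Widget> createWidget(const WidgetConfig& config) {
  auto found = widgetMakers().find(std::type_index(typeid(config)));
  if (found == widgetMakers().end())
    throw std::runtime_error("Unknown child");
  return found->second(config);
}

bool textWidgetRegistered = registerWidgetFor<TextConfig, TextWidget>();
```

The only place that knows both TextConfig and TextWidget is the registration line; createWidget() itself sees nothing but the parent classes.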

Macros can actually make this more readable

Macros are typically an efficient tool of obfuscation, but initialising globals for no reason other than a side effect is probably more confusing than a rather obvious macro.

REGISTER_CHILD_INTO_FACTORY(Widget, TextWidget, "Text", const nlohmann::json&);
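One way such a macro could be defined is sketched below. This is a guess at its shape, not the definition from the repository: the constructorsOf() helper is a minimal stand-in for the generic factory, and std::string replaces the JSON type to keep the example free of dependencies.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Minimal stand-in for the generic factory, so that the macro has something to call.
template <typename Parent, typename... Args>
using ConstructorMap =
    std::map<std::string, std::function<std::unique_ptr<Parent>(Args...)>>;

template <typename Parent, typename... Args>
ConstructorMap<Parent, Args...>& constructorsOf() {
  static ConstructorMap<Parent, Args...> constructors;
  return constructors;
}

template <typename Parent, typename Child, typename... Args>
bool registerClassIntoFactory(const std::string& name) {
  constructorsOf<Parent, Args...>()[name] = [](Args... args) -> std::unique_ptr<Parent> {
    return std::make_unique<Child>(args...);
  };
  return true;
}

// Hypothetical definition: hides the global whose only purpose is its side effect.
// It needs at least one constructor argument type and unqualified class names,
// because the names are pasted together to form the variable's identifier.
#define REGISTER_CHILD_INTO_FACTORY(PARENT, CHILD, NAME, ...) \
  bool CHILD##RegisteredInto##PARENT =                        \
      registerClassIntoFactory<PARENT, CHILD, __VA_ARGS__>(NAME)

struct Widget {
  virtual ~Widget() = default;
  virtual std::string text() const = 0;
};

struct TextWidget : Widget {
  explicit TextWidget(const std::string& text) : text_(text) {}
  std::string text() const override { return text_; }
  std::string text_;
};

REGISTER_CHILD_INTO_FACTORY(Widget, TextWidget, "Text", const std::string&);
```

The macro line expands to a global named TextWidgetRegisteredIntoWidget, so the reader sees a declaration of intent instead of a variable that nothing ever reads.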

The construction of objects doesn’t need anything wacky:

std::unique_ptr<Widget> widget = GenericFactory<Widget>::createChild(name);

C++17

Static member variables are cleaner than global variables, but their initialisation is fairly annoying unless the class doesn’t have a header (which it needs if some class is to be constructed according to its instance). Since C++17, static member variables can be defined in headers using the inline keyword, so the registration can be done directly in the class declaration.
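A rough sketch of registration through an inline static member follows. The registry helpers and the ImageWidget class are assumptions for illustration, and the sketch relies on the toolchain initialising the member at startup even though nothing reads it, which common compilers do when the class is part of the program being linked.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

struct Widget {
  virtual ~Widget() = default;
  virtual std::string kind() const = 0;
};

using WidgetConstructor = std::function<std::unique_ptr<Widget>()>;

// Marked inline so that the function could live in a header.
inline std::map<std::string, WidgetConstructor>& widgetConstructors() {
  static std::map<std::string, WidgetConstructor> constructors;
  return constructors;
}

template <typename Child>
bool registerWidget(const std::string& name) {
  widgetConstructors()[name] = []() -> std::unique_ptr<Widget> {
    return std::make_unique<Child>();
  };
  return true;
}

class ImageWidget : public Widget {
public:
  std::string kind() const override { return "image"; }

private:
  // C++17 inline static member: defined in the class body, no .cpp definition needed.
  inline static const bool registered_ = registerWidget<ImageWidget>("Image");
};
```

The registration now sits inside the class it belongs to, and the header can be included from several .cpp files without causing duplicate definitions.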

Problem with .a and .lib files

Because the global variables aren’t referenced from anywhere, the linker isn’t guaranteed to pull them in from statically linked libraries (extension .a on Linux and .lib on Windows), in which case the registration silently never runs. I don’t know whether this also applies to inline static members from C++17.

However, dynamically linked libraries do not suffer from this problem and can be used to add possible child classes merely by being loaded at runtime.

Conclusion

Implementing factory classes can be annoying because it’s repetitive and they need to be edited every time a new class is added. A generic factory can help avoid having to write the factory code repeatedly or to remember to add every class somewhere.
