Multi-Tenancy with separate database schemas in Activiti
One feature request we’ve heard in the past is that of running the Activiti engine in a multi-tenant way where the data of a tenant is isolated from the others. Certainly in certain cloud/SaaS environments this is a must.
A couple of months ago I was approached by Raphael Gielen, who is a student at the university of Bonn, working on a master thesis about multi-tenancy in Activiti. We got together in a co-working coffee bar a couple of weeks ago, bounced ideas and hacked together a first prototype with database schema isolation for tenants. Very fun :-).
Anyway, we’ve been refining and polishing that code and committed it to the Activiti codebase. Let’s have a look at the existing ways of doing multi-tenancy with Activiti in the first two sections below. In the third section, we’ll dive into the new multi-tenant multi-schema feature sprinkled with some real-working code examples!
Shared Database Multi-tenancy
Activiti has been multi-tenant capable for a while now (since version 5.15). The approach taken was that of a shared database: there is one (or more) Activiti engines and they all go to the same database. Each entry in the database table has a tenant identifier, which is best to be understood as a sort of tag for that data. The Activiti engine and API’s then read and use that tenant identifier to perform its various operations in the context of a tenant.
For example, as shown in the picture below, two different tenants can have a process definition with the same key. The engine and API’s make sure there is no mixup of data.
The benefit of this approach is the simplicity of deployment, as there is no difference from setting up a ‘regular’ Activiti engine. The downside is that you have to remember to use the right API calls (i.e. those that take in account the tenant identifier). Also, it has the same problem as any system with shared resources: there will always be competition for the resources between tenants. In most use cases this is fine, but there are use cases that can’t be done in this way, like giving certain tenants more or less system resources.
Multi-Engine Multi-Tenancy
Another approach, which has been possible since the very first version of Activiti is simply having one engine instance for each tenant:
In this setup, each tenant can have different resource configurations or even run on different physical servers. Each engine in this picture here can of course be multiple engines for more performance/failover/etc. The benefit now is that the resources are tailored for the tenant. The downside is the more complex setup (multiple database schemas, having a different configuration file for each tenant, etc.). Each engine instance will take up memory (but that’s very low with Activiti). Also, you’d need t write some routing component that knows somehow the current tenant context and routes to the correct engine.
Multi-Schema Multi-Tenancy
The latest addition to the Activiti multi-tenancy story was added two weeks ago (here’s the commit), simultaneously on version 5 and 6. Here, there is a database (schema) for each tenant, but only one engine instance. Again, in practice there might be multiple instances for performance/failover/etc., but the concept is the same:
The benefit is obvious: there is but one engine instance to manage and configure and the API’s are exactly the same as with a non-multi-tenant engine. But foremost, the data of a tenant is completely separated from the data of other tenants. The downside (similar to the multi-engine multi-tenant approach) is that someone needs to manage and configure different databases. But the complex engine management is gone.
The commit I linked to above also contains a unit test showing how the Multi-Schema Multi-Tenant engine works.
Building the process engine is easy, as there is a MultiSchemaMultiTenantProcessEngineConfiguration that abstracts away most of the details:
config = new MultiSchemaMultiTenantProcessEngineConfiguration(tenantInfoHolder); config.setDatabaseType(MultiSchemaMultiTenantProcessEngineConfiguration.DATABASE_TYPE_H2); config.setDatabaseSchemaUpdate(MultiSchemaMultiTenantProcessEngineConfiguration.DB_SCHEMA_UPDATE_DROP_CREATE); config.registerTenant("alfresco", createDataSource("jdbc:h2:mem:activiti-mt-alfresco;DB_CLOSE_DELAY=1000", "sa", "")); config.registerTenant("acme", createDataSource("jdbc:h2:mem:activiti-mt-acme;DB_CLOSE_DELAY=1000", "sa", "")); config.registerTenant("starkindustries", createDataSource("jdbc:h2:mem:activiti-mt-stark;DB_CLOSE_DELAY=1000", "sa", "")); processEngine = config.buildProcessEngine();
This looks quite similar to booting up a regular Activiti process engine instance. The main difference is that we’re registring tenants with the engine. Each tenant needs to be added with its unique tenant identifier and Datasource implementation. The datasource implementation of course needs to have its own connection pooling. This means you can effectively give certain tenants different connection pool configuration depending on their use case. The Activiti engine will make sure each database schema has been either created or validated to be correct.
The magic to make this all work is the TenantAwareDataSource. This is a javax.sql.DataSource implementation that delegates to the correct datasource depending on the current tenant identifier. The idea of this class was heavily influenced by Spring’s AbstractRoutingDataSource (standing on the shoulders of other open-source projects!).
The routing to the correct datasource is being done by getting the current tenant identifier from the TenantInfoHolder instance. As you can see in the code snippet above, this is also a mandatory argument when constructing a MultiSchemaMultiTenantProcessEngineConfiguration. The TenantInfoHolder is an interface you need to implement, depending on how users and tenants are managed in your environment. Typically you’d use a ThreadLocal to store the current user/tenant information (much like Spring Security does) that gets filled by some security filter. This class effectively acts as the ‘routing component’ in the picture below:
In the unit test example, we use indeed a ThreadLocal to store the current tenant identifier, and fill it up with some demo data:
private void setupTenantInfoHolder() { DummyTenantInfoHolder tenantInfoHolder = new DummyTenantInfoHolder(); tenantInfoHolder.addTenant("alfresco"); tenantInfoHolder.addUser("alfresco", "joram"); tenantInfoHolder.addUser("alfresco", "tijs"); tenantInfoHolder.addUser("alfresco", "paul"); tenantInfoHolder.addUser("alfresco", "yvo"); tenantInfoHolder.addTenant("acme"); tenantInfoHolder.addUser("acme", "raphael"); tenantInfoHolder.addUser("acme", "john"); tenantInfoHolder.addTenant("starkindustries"); tenantInfoHolder.addUser("starkindustries", "tony"); this.tenantInfoHolder = tenantInfoHolder; }
We now start some process instance, while also switching the current tenant identifier. In practice, you have to imagine that multiple threads come in with requests, and they’ll set the current tenant identifier based on the logged in user:
startProcessInstances("joram"); startProcessInstances("joram"); startProcessInstances("raphael"); completeTasks("raphael");
The startProcessInstances method above will set the current user and tenant identifier and start a few process instances, using the standard Activiti API as if there was no multi tenancy at all (the completeTasks method similarly completes a few tasks).
Also pretty cool is that you can dynamically register (and delete) new tenants, by using the same method that was used when building the process engine. The Activiti engine will make sure the database schema is either created or validated.
config.registerTenant("dailyplanet", createDataSource("jdbc:h2:mem:activiti-mt-daily;DB_CLOSE_DELAY=1000", "sa", ""));
Here’s a movie showing the unit test being run and the data effectively being isolated:
Multi-Tenant Job Executor
The last piece to the puzzle is the job executor. Regular Activiti API calls ‘borrow’ the current thread to execute its operations and thus can use any user/tenant context that has been set before on the thread.
The job executor however, runs using a background threadpool and has no such context. Since the AsyncExecutor in Activiti is an interface, it isn’t hard to implement a multi-schema multi-tenant job executor. Currently, we’ve added two implementations. The first implementation is called the SharedExecutorServiceAsyncExecutor:
config.setAsyncExecutorEnabled(true); config.setAsyncExecutorActivate(true); config.setAsyncExecutor(new SharedExecutorServiceAsyncExecutor(tenantInfoHolder));
This implementations (as the name implies) uses one threadpool for all tenants. Each tenant does have its own job acquisition threads, but once the job is acquired, it is put on the shared threadpool. The benefit of this system is that the number of threads being used by Activiti is constrained.
The second implementation is called the ExecutorPerTenantAsyncExecutor:
config.setAsyncExecutorEnabled(true); config.setAsyncExecutorActivate(true); config.setAsyncExecutor(new ExecutorPerTenantAsyncExecutor(tenantInfoHolder));
As the name implies, this class acts as a ‘proxy’ AsyncExecutor. For each tenant registered, a complete default AsyncExecutor is booted. Each with its own acquisition threads and execution threadpool. The ‘proxy’ simply delegates to the right AsyncExecutor instance. The benefit of this approach is that each tenant can have a fine-grained job executor configuration, tailored towards the needs of the tenant.
Conclusion
As always, all feedback is more than welcome. Do give the multi-schema multi-tenancy a go and let us know what you think and what could be improved for the future!
Reference: | Multi-Tenancy with separate database schemas in Activiti from our JCG partner Joram Barrez at the Small steps with big feet blog. |