Evolving django-multitenant to build scalable SaaS apps on Postgres & Citus

Written by Gürkan İndibay
May 9, 2023

If you're building a software application that serves multiple tenants, you may have already encountered the challenges of managing and isolating tenant-specific data. That's where the django-multitenant library comes in. This library, actively used since 2017 and now downloaded more than 10K times per month, offers a simple and flexible solution for building multi-tenant Django applications.

In this blog post, we'll dive deeper into the concept of multi-tenancy and explore how django-multitenant can help you build scalable, secure, and maintainable multi-tenant applications on top of PostgreSQL and the Citus database extension. We'll also provide a practical example of how to use django-multitenant in a real-world scenario. So, if you're looking to simplify your multi-tenant development process, keep reading.

If you already have a good understanding of multi-tenancy concepts, you can skip the multi-tenancy background and jump ahead to the django-multitenant sections further down.

What is multi-tenancy?

Multitenancy is an architecture where a single instance of a software application serves multiple customers (tenants). Each tenant has its own data, configuration, and user interface, but shares the same codebase and infrastructure. It's beneficial for SaaS applications as it allows for efficient resource usage, cost savings, and easier maintenance through simultaneous updates for all tenants.

Multitenancy challenges and approaches

When dealing with multitenancy in your application, you must design around a few database issues:

  1. Tenant Isolation: Ensure data and resource separation, route requests correctly, and manage tenant-specific configuration.
  2. Scalability: Design a scalable database architecture, partition data, and manage concurrent access to handle increasing tenant load.
  3. Security: Implement additional security measures to protect tenant data, secure data in transit and at rest, and manage access control.
  4. Customization: Allow easy customization of the application for each tenant, including configuration, branding, and workflows.
  5. Performance: Optimize application performance through efficient database schema design, caching, and data access mechanisms.

There are three approaches to managing multi-tenancy in applications.

  1. Separate Databases - Each tenant has its own database.
  2. Shared Database, Separate Schemas - Tenants have their own schemas but can reside in the same database.
  3. Shared Database, Shared Schema - All tenants share the same schema, and data objects should include a tenant_id to denote which tenant owns them.

The first two approaches above (separate databases and separate schemas) are out of scope for this blog post.

We will focus on the third approach, "Shared Database, Shared Schema".

Multitenancy with a shared database, shared schema

Django-multitenant supports sharing a schema between tenants, which offers advantages such as easier maintenance, reduced complexity, better resource utilization, improved scalability, and enhanced data security. However, this approach has two challenges: scaling Postgres databases for multiple tenants and dealing with complexity in application development.

If you are developing a multi-tenant application using the Django ORM on PostgreSQL with the shared schema approach, there are two problems that need to be addressed:

  1. Application complexity caused by adding the tenant ID throughout the application
  2. Scaling the application as you add more tenants

The django-multitenant library helps with the first problem above (application complexity), while the Citus extension to Postgres can be used to enable distributed scale.

Citus is available as open source and as a managed service as part of Azure Cosmos DB for PostgreSQL.

You can use the django-multitenant library with Citus seamlessly, whether you use Citus open source or the Azure Cosmos DB for PostgreSQL managed service. django-multitenant helps with data isolation in application development. Data isolation requires a careful development process: filtering queries by tenant_id, including the tenant column in join clauses, and recording the correct tenant when saving user input. django-multitenant makes all of this easy for you.
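
To make those three chores concrete, here is a minimal sketch of what they look like when done by hand, without any library. It uses Python's built-in sqlite3 module instead of Postgres so that it runs anywhere. The project and task tables and the helper functions are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE project (
        account_id INTEGER NOT NULL,  -- tenant column
        id         INTEGER NOT NULL,
        name       TEXT    NOT NULL,
        PRIMARY KEY (account_id, id)
    );
    CREATE TABLE task (
        account_id INTEGER NOT NULL,  -- tenant column
        id         INTEGER NOT NULL,
        project_id INTEGER NOT NULL,
        title      TEXT    NOT NULL,
        PRIMARY KEY (account_id, id)
    );
    """
)


def add_project(account_id, project_id, name):
    # Every write has to record the owning tenant explicitly.
    conn.execute(
        "INSERT INTO project (account_id, id, name) VALUES (?, ?, ?)",
        (account_id, project_id, name),
    )


def add_task(account_id, task_id, project_id, title):
    conn.execute(
        "INSERT INTO task (account_id, id, project_id, title) VALUES (?, ?, ?, ?)",
        (account_id, task_id, project_id, title),
    )


def task_titles(account_id, project_name):
    rows = conn.execute(
        """
        SELECT task.title
        FROM task
        JOIN project
          ON project.id = task.project_id
         AND project.account_id = task.account_id  -- tenant column in the join
        WHERE task.account_id = ?                   -- filter by tenant
          AND project.name = ?
        ORDER BY task.id
        """,
        (account_id, project_name),
    ).fetchall()
    return [title for (title,) in rows]


# Two tenants reuse the same local ids, which is why the join needs account_id.
add_project(1, 1, "Website")
add_project(2, 1, "Website")
add_task(1, 1, 1, "Design homepage")
add_task(2, 1, 1, "Migrate DNS")

print(task_titles(1, "Website"))  # ['Design homepage']
print(task_titles(2, "Website"))  # ['Migrate DNS']
```

Forgetting any one of the commented lines in a real code base is easy to do and hard to spot in review, which is the repetitive work the library is meant to take over.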

Automating Django queries with django-multitenant

The django-multitenant library extends the Django ORM to automatically and uniformly scope queries by tenant. It adds the tenant_id both when fetching and when manipulating data. It's a mature library that is downloaded over 10,000 times each month.

Features of django-multitenant:

  1. You can set the tenant with one method call, and existing application code will access the proper tenant without extensive changes.
  2. With a tenant set, the library will seamlessly include the tenant ID in join operations, eliminating the need to manually add tenant IDs.
  3. Supports standard Django and the Django Rest Framework to streamline integration.
  4. Implements helper classes for Citus, a distributed Postgres extension, and facilitates table distribution during the database migration process.
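
The idea behind the first feature, "set the tenant once and let every query pick it up", can be sketched in plain Python. This is only a toy model of the concept and not how django-multitenant is implemented. The set_current_tenant and get_current_tenant functions below are local stand-ins that borrow the library's naming, and ProjectStore is a hypothetical in-memory replacement for a model manager.

```python
from contextvars import ContextVar
from dataclasses import dataclass

# Toy stand-ins defined here; they are not imported from django-multitenant.
_current_tenant = ContextVar("current_tenant", default=None)


def set_current_tenant(account_id):
    _current_tenant.set(account_id)


def get_current_tenant():
    return _current_tenant.get()


@dataclass
class Project:
    account_id: int
    name: str


class ProjectStore:
    """In-memory stand-in for a tenant-aware model manager."""

    def __init__(self):
        self._rows = []

    def create(self, name):
        account_id = get_current_tenant()
        if account_id is None:
            # Design choice of this sketch: refuse to write without a tenant.
            raise RuntimeError("set_current_tenant() must be called before writing")
        project = Project(account_id=account_id, name=name)
        self._rows.append(project)
        return project

    def all(self):
        account_id = get_current_tenant()
        if account_id is None:
            # No tenant set: the query is unscoped and returns every row.
            return list(self._rows)
        return [row for row in self._rows if row.account_id == account_id]


store = ProjectStore()

set_current_tenant(1)
store.create("Website")

set_current_tenant(2)
store.create("Mobile app")

print([p.name for p in store.all()])  # ['Mobile app']

set_current_tenant(1)
print([p.name for p in store.all()])  # ['Website']
```

The calling code never mentions account_id. It is read from the ambient "current tenant" on every read and write, which is why existing application code needs few changes once the tenant is set in one place.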

What's new in django-multitenant version 3.2

With the start of 2023, the django-multitenant project has been reinvigorated with active development and new releases. The latest release, version 3.2, includes the following critical changes and new features:

  • Added Django Rest Framework support, enabling you to easily create RESTful APIs that can handle multi-tenant data.
  • Improved model migration guidelines, which can help you migrate tenant-specific data seamlessly.
  • Support for the latest version of Django (version 4.1), allowing you to take advantage of its latest features and improvements.
  • Ability to automatically set the tenant for ManyToMany related models.
  • Fixed invalid error messages in case of invalid field names.
  • Support for getting models using apps.get_model.
  • Removed reserved tenant_id limitation by introducing TenantMeta usage.
  • Introduced ReadTheDocs documentation.
  • Fixed an issue with saving ManyToMany non-tenant models.

Quickstart on how to get started with django-multitenant

Let's learn how to use django-multitenant in a sample application that includes the features introduced in version 3.2. This example provides a web interface for a fictitious project management application. (The example uses the Django Rest Framework for developer efficiency, but the underlying django-multitenant library works with vanilla Django too.)

  1. To get started, first install the Python packages:

    pip install django-multitenant
    pip install Django
    pip install djangorestframework
    
  2. Add a TenantMeta class in your models and specify the field which identifies tenants.

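    from django.contrib.auth.models import User  # assumed: Django's built-in User model
    from django.db import models
    from django_multitenant.fields import TenantForeignKey
    from django_multitenant.models import TenantModel

    class Country(models.Model):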
        name = models.CharField(max_length=255)
    
    class Account(TenantModel):
        user = models.ForeignKey(User, on_delete=models.CASCADE)
        name = models.CharField(max_length=255)
        domain = models.CharField(max_length=255)
        subdomain = models.CharField(max_length=255)
        country = models.ForeignKey(Country, on_delete=models.CASCADE)
    
        class TenantMeta:
            tenant_field_name = 'id'
    
    class Manager(TenantModel):
        name = models.CharField(max_length=255)
        account = models.ForeignKey(
            Account, on_delete=models.CASCADE, related_name="managers"
        )
        class TenantMeta:
            tenant_field_name = 'account_id'
        class Meta:
            constraints = [
                models.UniqueConstraint(fields=['id', 'account_id'], name='unique_manager_account')
            ]
    
    class Project(TenantModel):
        name = models.CharField(max_length=255)
        account = models.ForeignKey(
            Account, related_name="projects", on_delete=models.CASCADE
        )
        managers = models.ManyToManyField(Manager, through="ProjectManager")
        class TenantMeta:
            tenant_field_name = 'account_id'
    
        class Meta:
            constraints = [
                models.UniqueConstraint(fields=['id', 'account_id'], name='unique_project_account')
            ]
    
    class ProjectManager(TenantModel):
        project = TenantForeignKey(
            Project, on_delete=models.CASCADE, related_name="projectmanagers"
        )
        manager = TenantForeignKey(Manager, on_delete=models.CASCADE)
        account = models.ForeignKey(Account, on_delete=models.CASCADE)
    
        class TenantMeta:
            tenant_field_name = 'account_id'
    
  3. Distribute tables by their tenant columns. You need to remove constraints before distribution, distribute tables via tenant_migrations.Distribute(), and reinstate the constraints.

    from django.db import migrations
    from django_multitenant.db import migrations as tenant_migrations
    
    def get_operations():
        operations = []
        operations += [
            migrations.RunSQL(
                "ALTER TABLE tests_country DROP CONSTRAINT tests_country_pkey CASCADE;"
            ),
            migrations.RunSQL(
                "ALTER TABLE tests_manager DROP CONSTRAINT tests_manager_pkey CASCADE;"
            ),
            migrations.RunSQL(
                "ALTER TABLE tests_project DROP CONSTRAINT tests_project_pkey CASCADE;"
            ),
            migrations.RunSQL(
                "ALTER TABLE tests_projectmanager DROP CONSTRAINT tests_projectmanager_pkey CASCADE;"
            )
        ]
    
        operations += [
            tenant_migrations.Distribute("Country", reference=True),
            tenant_migrations.Distribute("Account"),
            tenant_migrations.Distribute("Manager"),
            tenant_migrations.Distribute("Project"),
            tenant_migrations.Distribute("ProjectManager"),
        ]
    
        operations += [
            migrations.RunSQL(
                "ALTER TABLE tests_country ADD CONSTRAINT tests_country_pkey PRIMARY KEY (id);"
            ),
            migrations.RunSQL(
                "ALTER TABLE tests_project ADD CONSTRAINT tests_project_pkey PRIMARY KEY (account_id, id);"
            ),
            migrations.RunSQL(
                "ALTER TABLE tests_manager ADD CONSTRAINT tests_manager_pkey PRIMARY KEY (account_id, id);"
            ),
            migrations.RunSQL(
                "ALTER TABLE tests_projectmanager ADD CONSTRAINT tests_projectmanager_pkey PRIMARY KEY (account_id, id);"
            ),
        ]

        return operations
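
The re-added primary keys on the distributed tables start with account_id because Citus requires the primary key of a distributed table to include its distribution column. Since the drop and re-add statements follow one pattern per table, they can also be generated rather than typed out. The sketch below assumes the table names and key columns from the migration above. The helper functions and the PRIMARY_KEYS mapping are hypothetical, and each generated string would still need to be wrapped in migrations.RunSQL(...).

```python
def drop_pkey_sql(table):
    # Table names are hard-coded constants here, never user input.
    return f"ALTER TABLE {table} DROP CONSTRAINT {table}_pkey CASCADE;"


def add_pkey_sql(table, columns):
    column_list = ", ".join(columns)
    return f"ALTER TABLE {table} ADD CONSTRAINT {table}_pkey PRIMARY KEY ({column_list});"


# Primary key columns to reinstate after distribution. The reference table
# keeps its plain id key; distributed tables put the tenant column first.
PRIMARY_KEYS = {
    "tests_country": ["id"],
    "tests_manager": ["account_id", "id"],
    "tests_project": ["account_id", "id"],
    "tests_projectmanager": ["account_id", "id"],
}

drop_statements = [drop_pkey_sql(table) for table in PRIMARY_KEYS]
add_statements = [add_pkey_sql(table, cols) for table, cols in PRIMARY_KEYS.items()]

for statement in drop_statements + add_statements:
    print(statement)
```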
    

That's all the setup needed for models and migrations. Once the initial setup is done, you must call "set_current_tenant" before your code queries or saves tenant data.

In the case of Django Rest Framework, our library interoperates to ensure effective calling of "set_current_tenant" and prevent any possibility of inter-tenant data leakage.

To complete our REST application, let's complete the integration steps:

  1. Define a function that returns the tenant for a request and assign it to "views.get_tenant" in a suitable location in the code. In this example, we use the views file.

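    from django_multitenant import views
    from .models import Account  # assumes the models above live in this app's models.py

    def tenant_func(request):
        # The tenant is the first account owned by the logged-in user.
        return Account.objects.filter(user=request.user).first()

    # Register the function so the library can resolve the tenant per request.
    views.get_tenant = tenant_func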
    
  2. Construct the viewsets for the multi-tenant models using "TenantModelViewSet".

    # views.py
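    from django_multitenant.views import TenantModelViewSet
    from rest_framework import permissions

    from .models import Account
    from .serializers import AccountSerializer  # assumed location of the serializer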
    
    class AccountViewSet(TenantModelViewSet):
        """
        API endpoint that allows accounts to be viewed or edited.
        """
    
        model_class = Account
        serializer_class = AccountSerializer
        permission_classes = [permissions.IsAuthenticated]
    
  3. Add MultitenantMiddleware to the MIDDLEWARE list in your settings file.

    # settings.py
    
    MIDDLEWARE = [
        # ... your existing middleware ...
        "django_multitenant.middlewares.MultitenantMiddleware",
    ]
    

That's it. Now you have a running multitenant application. Here's the link to the fully functional sample application using django-multitenant. You can download the application and test all its features:

https://github.com/gurkanindibay/django-mt-examples

For comprehensive guidance on how to use django-multitenant, you can refer to its documentation available at the following link: https://django-multitenant.readthedocs.io/en/stable/

It's important to note that this library is responsible for handling tenancy functions in the ORM. However, you will need to set the tenant by calling "set_current_tenant" throughout your application controllers. If you fail to call it correctly, you may end up getting all tenant results in queries, and tenants won't be properly set in your insert/update operations.

For more information on multi-tenancy and django-multitenant, this PyCon Canada presentation about "Scaling multi-tenant apps using the Django ORM and Postgres" gives a good overview.

django-multitenant can simplify your multi-tenant SaaS app

In conclusion, building multi-tenant applications is a challenging task that requires careful planning and implementation. The django-multitenant library simplifies your multi-tenant development process by providing a flexible and scalable way to manage and isolate tenant-specific data.

Finally, we provided a practical example of how to use django-multitenant in a real-world scenario, demonstrating its ease of use and flexibility.

Overall, with its active development and support, django-multitenant is a valuable tool if you are looking to build scalable, secure, and maintainable multi-tenant applications on Postgres (and with Citus).

Further reading

Multi-tenancy is a very broad concept. If you want to learn more about it or about django-multitenant, the documentation, sample application, and PyCon Canada presentation mentioned above are good places to start.

Written by Gürkan İndibay

Software engineer at Microsoft. BSc in Computer Science. Life-long learner. Full stack developer. Interested in devops systems, machine learning, software architecture, and cloud systems. Dad.