← All posts

database design

What is a polymorphic relation?

by Loïc Knuchel on

Database design is always a tradeoff game. Knowing common database design patterns, their pros and cons will help you make better choices.

Polymorphic relations banner

In this post you will learn how to introduce polymorphism in your database using dynamic relations and when to use it or not to design your database in the best way:

Polymorphic relations, sometimes called polymorphic associations, are a way to create a relation with a dynamically referenced table. Wait, what? Let’s dive in.

Usual relations are defined by a column in a table referencing another table. On GitHub, as an example, users can create repositories. We can then create two tables:

users
  id uuid pk
  name varchar

repositories
  id uuid pk
  name varchar
  owner uuid -> users(id)

ERD diagram of a simple database relation between two entities

Easy. Now, let’s introduce organizations. They can also own repositories. But how to design that?

Dynamic relations design alternatives

Solution 1: Duplicate repositories

A first solution could be to duplicate the repositories table for each owning entity:

users
  id uuid pk
  name varchar

user_repositories
  id uuid pk
  name varchar
  owner uuid -> users(id)

organizations
  id uuid pk
  name varchar

organization_repositories
  id uuid pk
  name varchar
  owner uuid -> organizations(id)

ERD diagram of a duplicated entities

It works. Now, imagine you have more and more kind of repository owners. It will be painful to create new repositories tables for each one.
Worse, if you want to add, change or update a column on repositories, you will have to do it for each one. Keeping them all consistent as they represent the same entity will be a challenge.
Even worse, you have two tables and thus two primary keys instead of one, allowing the same id, breaking the domain intention to one repository entity linked to either a user or an organization.
Worst of all, if you have entities related to repositories, for example, projects or issues, you will have to also duplicate them in order to have the relation to the correct table 😱

ERD diagram of a duplicated entities with their hierarchy

Duplicating the whole hierarchy is not manageable.

Solution 2: Dedicated relation tables

A better way could be to extract the relation from the repository and only duplicate the relation, not the whole repository:

users
  id uuid pk
  name varchar

organizations
  id uuid pk
  name varchar

repositories
  id uuid pk
  name varchar

user_repositories
  user_id uuid -> users(id)
  repository_id uuid unique -> repositories(id)

organization_repositories
  organization_id uuid -> organizations(id)
  repository_id uuid unique -> repositories(id)

ERD diagram of duplicated relations with dedicated tables

This one is way better: no duplication for repository attributes, and other tables can easily reference a repository, whichever the owner kind 🎉

Still we have to create one relation table for each kind of owner, allowing a repository to be owned by all kind of owner, a user and an organization in this case 🙃
Finding a repository owner is complex as it requires to query all the relation tables:

SELECT * FROM (
  (SELECT 'User' AS kind, u.id, u.name
   FROM user_repositories r
     JOIN users u ON r.user_id = u.id
   WHERE r.repository_id = '0192cd29-e5b8-79f9-b36b-0d1a83ce454c')
  UNION ALL
  (SELECT 'Organization' AS kind, o.id, o.name
   FROM organization_repositories r
     JOIN organizations o ON r.organization_id = o.id
   WHERE r.repository_id = '0192cd29-e5b8-79f9-b36b-0d1a83ce454c')
) owners;

Solution 3: Multiple relations

What if we keep the relations inside the repositories table, having one per owner kind ? In this case, one and only one should be filled for each repository:

users
  id uuid pk
  name varchar

organizations
  id uuid pk
  name varchar

repositories
  id uuid pk
  name varchar
  owner_user_id uuid nullable -> users(id)
  owner_organization_id uuid nullable -> organizations(id)

ERD diagram of multiple relations in the same entity

This is probably the best solution so far, it keeps referential integrity and avoid duplication.

The main issues being:

  • we should be careful to have one and only one relation filled (ex: CHECK ((CASE WHEN owner_user_id IS NULL THEN 0 ELSE 1 END + CASE WHEN owner_organization_id IS NULL THEN 0 ELSE 1 END) = 1))
  • having many kind of owners would be impractical with a lot of mostly empty columns
  • adding a new owner kind require a schema update which could be painful depending on your situation

Solution 4: Polymorphic relation

Now it’s time to introduce polymorphic relations: instead of having a column referencing another table and a foreign key enforcing that, a polymorphic relation uses two column:

  • one to define the referenced table
  • one to store the referencing value

Here is our new database schema:

users
  id uuid pk
  name varchar

organizations
  id uuid pk
  name varchar

repositories
  id uuid pk
  name varchar
  owner_kind varchar index check(`owner_kind IN ('users', 'organizations')`)
  owner_id uuid index

ERD diagram of polymorphic relations

As you can see, it’s very similar to our initial situation with just the users and repositories, with just one additional column telling if the relation is toward the users table or the organizations one.

In AML, you can define polymorphic relations with the kind column inside the arrow:

rel repositories(owner_id) -owner_kind=users> users(id)
rel repositories(owner_id) -owner_kind=organizations> organizations(id)

When to use a polymorphic relation?

As we have seen, duplicating entities and dedicated relation tables are not very good and are not recommended. Depending on your situation, multiple relations or a polymorphic relation can be fine.

Polymorphic relation strengths are they allow a large choice number without increasing the schema or querying complexity. They are also able to allow new relation kind without a schema change. On the bad side they require the same column type and can’t rely on foreign keys, leaving the referential integrity checks to the application. Also, many tools won’t see them making your daily life a bit harder (but not Azimutt 😅).

Multiple relations strengths are they keep the referential integrity checks with foreign keys, and they allow target entities with different types (int, bigint, uuid…). On the other side, they require a schema change to add a new relation and can be painful with a large number of choices. Also, you will have to make sure to have one and only one filled value.

If you have a large number of relations, or you need them dynamically, use Polymorphic relations. Otherwise, Multiple relations will keep your schema clearer and safer.

Polymorphic relation use cases

Polymorphic relations can be used in many situations, they are mostly split in two main buckets:

  • Generic entities that can be attached to many/any other entities
    • Comments (or Reviews, Ratings, Likes, Notifications…) attached to entities like Posts, Talks, Videos, Products, Courses…
    • Assets (or Files, Pictures, Documents…) attached to entities like Users, Companies, Articles…
  • Relations with several business alternatives
    • ownership or rights, for several kind of owners and even owned items
    • user events attached to any system entity

For example, I designed a flexible authorization system allowing access to a resource based on the presence of a polymorphic record:

accesses
  id uuid pk
  owner_kind varchar index check(`owner_kind IN ('users', 'teams', 'organizations')`)
  owner_id uuid index
  resource_kind varchar index check(`resource_kind IN ('posts', 'pages', 'datasource')`)
  resource_id uuid index
  level varchar check(`level IN ('read', 'write')`)
  expire_at timestamp nullable

Complex ERD diagram of polymorphic relations for access control in database

Checking valid access is just a simple query:

SELECT level
FROM accesses
WHERE (owner_kind = 'users' AND owner_id = '0192c543-1dce-76b2-a165-d722218c96f2')
  AND (resource_kind = 'posts' AND resource_id = '0192c543-aabb-7748-bd76-3942b197537d')
  AND (expire_at IS NULL OR expire_at > now());

Also, getting all the allowed posts for a user is quite easy:

SELECT p.id, p.title, a.level
FROM accesses a
  JOIN posts p ON a.resource_kind = 'posts' AND a.resource_id = p.id
WHERE (a.owner_kind = 'users' AND a.owner_id = '0192c543-1dce-76b2-a165-d722218c96f2')
  AND (a.expire_at IS NULL OR a.expire_at > now());

And those queries are quite performant with indexes on (owner_kind, owner_id) and (resource_kind, resource_id).

Polymorphic relation support

As this is just a way to organize and query your data, there is no specific requirements from any language or database to use them.

Yet, they may be more or less easy to use. It’s relatively common in certain communities and almost unknown in others 😅

Some libraries supporting it natively:

  1. Active Record (Ruby on Rails):
class Picture < ApplicationRecord
  belongs_to :imageable, polymorphic: true
end

class Employee < ApplicationRecord
  has_many :pictures, as: :imageable
end

class Product < ApplicationRecord
  has_many :pictures, as: :imageable
end
  1. Eloquent (Laravel, PHP):
namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Illuminate\Database\Eloquent\Relations\MorphTo;
use Illuminate\Database\Eloquent\Relations\MorphMany;

class Comment extends Model {
    public function commentable(): MorphTo {
        return $this->morphTo();
    }
}

class Post extends Model {
    public function comments(): MorphMany {
        return $this->morphMany(Comment::class, 'commentable');
    }
}

class Video extends Model {
    public function comments(): MorphMany {
        return $this->morphMany(Comment::class, 'commentable');
    }
}
  1. Sequelize (Node.js):
class Comment extends Model<InferAttributes<Comment>, InferCreationAttributes<Comment>> {
  declare id: number;

  @Attributes(DataTypes.STRING)
  @NotNull
  declare content: string;

  @Attributes(DataTypes.STRING)
  @NotNull
  declare targetModel: 'article' | 'video';

  @Attributes(DataTypes.INTEGER)
  @NotNull
  declare targetId: number;

  /** Defined by {@link Article#comments} */
  declare article?: NonAttribute<Article>;

  /** Defined by {@link Video#comments} */
  declare video?: NonAttribute<Video>;

  get target(): NonAttribute<Article | Video | undefined> {
    if (this.targetModel === 'article') {
      return this.article;
    } else {
      return this.video;
    }
  }
}

Conclusion

Polymorphic relations are really powerful to introduce some relation flexibility in relational database design, especially for generic entities. Yet, they should be used carefully as you loose integrity constraints, and it can backfire hard down the road. Use them wisely 😉

About Azimutt

Azimutt is a SaaS making working with databases much easier. It’s best in class database design and exploration tool, with backed documentation and collaboration.
If you have issues or friction with your databases, give it a try or contact us, we are always happy to help.

Cheers !

Azimutt Icon

Azimutt · Database explorer and analyzer

Hi there! We're building a Next-Gen ERD to help understand real world databases, with cool UI and privacy focus. You can read about our journey and what we've learnt along the way on this blog.