Inferring relations

Databases relations are not always defined in the database schema, sometimes because the database engine doesn't support them (MongoDB, Couchbase...), sometimes because the developers decided or forgot to create them (performance?).

Still they are very important to understand the data model and help a lot for the database exploration, so you can add them on Azimutt (using AML), even if they are not enforced by your database.

To help with that, Azimutt will do its best to identify them and suggest them. But it will never add they on its own, as false-positive may be very misleading. Here is how it does:

Naming conventions #

It's quite common that attributes storing a reference to another entity reference the entity primary key (often id) and are named with the entity name and attribute name.
For example, a project_id attribute will likely reference the id attribute of the projects entity.

Azimutt is smart enough to handle different naming styles (camel case, snake case...), pluralization and even added prefixes/suffixes.
So attributes like previous_project_id and ProjectIds will match entities named projects, Project or even legacy_projects having attributes like id or project_id.
Of course, attributes like author, owner or admin will hardly be identified. Azimutt has a special case for attributes ending by by, like created_by, it will look for tables like users or accounts.
Here is the code and tests for the Explorer (Elm), and the code and test for the Analyzer (TypeScript).

Let us know if you have a naming convention not well-supported, we will be happy to add it. You can even submit a PR 😉 (tests are the most important part).

Join clauses #

Not implemented yet!

As naming conventions are not always followed, another way to find relations is by inspecting database queries. In the case of relational databases, a JOIN clause may hint for a relation. That's why accessing historical queries is important but Azimutt could also offer you to import the queries you could have gathered by any means.

Let us know if this is crucial for you, so we can prioritize it over other features.

Unique ids #

Not implemented yet!

For databases without joins and which don't follow clear naming conventions, like MongoDB or Couchbase, documenting relations is still a big issue. After several discussions, it should be possible to infer relations using unique ids (like UUIDs or MongoDB ObjectId).
First, we have to identify these columns (database type or pattern in most common values), then we can look for them in all tables with a single attribute primary key having the same type.
Then, if there is a match, thanks to unique ids, it's very likely a relation.

For example, if the author column contains UUIDs, we can look in all other tables if a specific id is found in their primary key. When it's found, for example in the users entity, this relation will be suggestion.
As this can be heavy for large databases, this process will be launched on demand and for specific fields.

What do you think about it? Will it help you? Let us know, so we can prioritize it over other features.

If you have any other idea to automatically identify relations, especially on schemaless environments, let us know, and we will happily evaluate and implement them on usage.