Life Imitates Code. Essays by Ka Wai CheungEssays by Ka Wai Cheung from around the web.

Coding ● May 2015

Building the optimal users database model

In the past, I’ve often wrestled with designing an optimal relational database model to represent users in a multi-tenant system. I’ve wanted to get the model just right—enough structure to accomodate future enhancements without bloating the schema with excessive JOINs, overly-wide tables, or any other number of typical relational-database “sins”.

Over four years into our current iteration of DoneDone, I can happily say that the database model we’ve built has been resilient to all sorts of feature updates we’ve made over the years.* I wanted to share our thought process behind how it’s designed, starting with a simple database model and evolving it into the model we currently use.

A simple model

Let’s start with the most bare-bones relational model to represent users in a system like ours. Naturally, we’ll start with a table called Users. The attributes within Users fall within one of three general groups of data.

So, our starting model looks something like this, with unique constraints on items like Username and EmailAddress:

A simple starting point for database modeling users in a system

I’ve used many systems whose model follows closely to this two-table architecture. But, there’s one very large drawback to this system: The foreign key from Users to Accounts restricts a user’s access to only one account in the system. If someone wants to have access to multiple accounts, they are forced to create a new user record with another email address. That quickly becomes a nuisance to manage for the user.

Associating users to multiple accounts

So, unless I am 100% sure that the system will never allow a user access to multiple accounts (and, sometimes, even if I am 100% sure), I prefer breaking this relationship out as a many-to-many by introducing an association table in between. In DoneDone, we call this a membership.

Adding a Memberships table to allow users access to multiple accounts

The Memberships table holds foreign keys to both Users and Accounts, so that we can both associate any number of users to an account (e.g. employees in a company) as well as any number of accounts to a user (e.g. a user’s freelance account along with their company account).

We can also add membership-specific columns to Memberships. In DoneDone, for instance, the Memberships table can hold columns like an office phone number or fax, as well as an account-specific email address. This way, we can send profile information and password reset links to the user’s global email address, stored in Users, while sending email updates on specific DoneDone issues to the account-level email address, stored in Memberships.

Additionally, the Memberships table can hold information like roles—a user might be an administrator in one account, a regular user in another account, and an account owner in a third account.

Finally, the Memberships table helps clarify the intent of certain database relationships involving a user. For instance, in DoneDone, we tie system alerts to a membership record. This lets us create a specific alert to a person’s specific account (say, a billing notice to an account owner). However, we tie issue relationships to a user record. If a person creates an issue, and then leaves a company, we can delete their Memberships record while still preserving their relationship to that issue.

As I mentioned earlier, even if I’m fairly certain that users will not belong to multiple accounts, I still prefer the additional table in the design. If you avoid it now, should the business one day decide to support multiple accounts for a user, you not only need to massage your data into a new data model, you’ll need to provide some mechanism for a user to link once-disparate accounts together.

Improving performance and database integrity

This three-table architecture provides a lot of flexibility, but there are still other ways we can improve. In DoneDone, we segment off the authentication information away from the Users table into a separate Logins table. We then add a required foreign key from Logins to Users (a login must be associated to an existing user). Since a user can only have one set of credentials, we could further tighten the data integrity by placing a unique constraint against the foreign key.

Extracting credentials into its own table has several relational modeling benefits

There are a few different reasons why I prefer this separation.

First, it provides a cleaner way of representing users in a system who do not have credentials to login. If we don’t extract credentials to a separate table, we’re left with only the unsavory option of allowing NULL values for usernames and password salts/hashes. Even further, many database systems (Sql Server 2008 and prior) do not support an out-of-the-box “unique if not null” constraint. This means that you can’t place a unique constraint on, say, username, if you had more than one unregistered user (i.e. two or more records with username equal to NULL). As a workaround, you might be left with managing uniqueness on usernames solely outside of the database—a game I’d rather not play.

You may be asking why you’d ever want users in your system who can’t login. Where would this come into play? Here are two examples:

Secondly, creating a separate Logins table lets you keep credential information in as thin a table as possible. This will only help with indexing and any lazy selects (select * from Logins). Since the Logins table will usually have significantly more reads than writes, the more efficient those reads are, the better.

Finally, while you could argue placing the foreign key on Users (since this is a one-to-one relationship), I find more benefits placing it on Logins. This way, we can make the relationship required, which again avoids adding any nullable columns or “unique if not null” issues in your data model.

Adding access to groups of users

The last step we’ve made in DoneDone is to add a Companies table between Memberships and Accounts. This lets us further differentiate access levels between groups of users in an account. For example, in DoneDone, we can mark a company as an admin company, and only allow users associated to the admin company to be administrators or account owners. A Companies table also would allow you to easily make certain messages private amongst a user’s respective company.

With this in place, here is the current user data model we’ve used in DoneDone.

Our full-blown database architecture for modeling users

We’ve been really happy with this level of architecture in DoneDone thus far, and I hope this helps your product as well.

*–Much credit goes to former DoneDone colleague Mustafa Shabib for his influence on the data model design that’s largely withstood significant changes since it was released.

Originally published May 27, 2015 at DoneDone. Go to the next essay in Coding, “Programming's three life lessons”.