The Case for Multiple DBs in Multi Tenancy situations

Multitenancy isn't easy, and one of the particularly challenging aspects is what to do with the database. In Ayende's excellent posts on multi tenancy, he's pretty clear that he feels a database per tenant is the way to go . There are two major, reasonable approaches that I know of:  First, to have a database per tenant. (This is the approach discussed by Ayende in his multi tenancy series) A second common alternative is to have a tenant id column in all tables that require differentiation by client.  My team was in a situation requiring multitenancy, and needed to select an approach.   Wary of the idea of multiple databases, the team immediately decided to go with the tenant column approach.

Looking back, I think Ayende was right.  The one advantage of the tenantid column system is not having to deal with multiple databases.  There are many disadvantages.  I'll discuss the advantages, and then give some thoughts on how to deal wwith the complexity of working with multiple databases.

Advantages of multiple databases:


In my mind the most important developer-centric advantage of having multiple databases is the ability to refactor the database.  Many times, developers will design a system a system with certain assumptions, then realize those assumptions are wrong.  Other times, a major new feature would be much easier and cleaner with certain system changes.  If the database is only talking to one system, then when the developer comes upon these situations, s/he can also refactor the database to reflect that understanding, and keep the mapping between database simple, clean and easy to understand.  When the code is pushed live, db update scripts also get pushed live.  However, if multiple systems (in this case tenants) talk to the same database, this kind of db structure change just can't happen-pushing a db change would alter everyone's db, meaning everyone's code would also have to be updated at once.  In many situations, that kind of simultaneous mass upgrade would be totally unacceptable, meaning that once this kind of multitenancy system is put in place, the single db can never again be altered.

Another advantage of multiple databases is the ability to easily load-balance by moving individual tenants to different db servers as necessary.  If the databases are architected to be isolated, this isn't a problem-change a few connection strings, and you're done.  In a shared system, all tenants suffer if one tenant is overloading the server. 

On a related note, though nobody likes to think about it, clients sometimes leave, and when they do, they will want to be able to take their own data with them.  If they have their own database, it's as easy as zipping it up and getting it to them.  If all tenants are on the same db, it could be a much more difficult exercise involving picking through every table and selecting out the tenant-specific data.  Hopefully you dont give them anything that isn't theirs.  Which cleads to the next point.

From a business perspective, data separation is probably the most important reason to go with separate dbs.  With a multitenant system, it's impossible for one tenant to end up with anyone else's data.  Other tenants' data is in a completely different system, so there's no worry about messing up the tenantid and passing a client someone else's data.  Depending on the situation, this could just be embarrassing; in other situations, you might be looking at a lawsuit.

Lastly, having separated databases makes many things in the application more complicated, for the life of the application.  Every join, every query, every new table, every mapping has to deal with the tenant id.  If the strategy is to use the tenant id to make a composite key on the tables, then you've got that much additional complexity-and if you're using a tool like NHibernate, you'll soon find out that working with composite keys is a LOT less seamless than a single autogenerated key.  Eliminate the tenantid, and a whole lot of things become forever simpler, clearer and have a lot less potential for error.

How to deal with multiple databases?


The trouble with multiple databases is that now you have 20 or even 100 databases to manage instead of one.  Managing this manually would be a difficult, error-prone task that could become overwhelming.  In addition, any shared data can become out of sync.  The solution?  Don't do *anything* manually with these databases. 

For versioning: each time you create a new version of your application, create a db script to update the db from the prior version from the last version to the newest one, and create a script to roll the database back from the newest version to the previous version.  There are tools to help produce this kind of script... If you're programming in .Net, Visual Studio Database edition has a tool to help do this for SQL Server, and Red Gate has one as well called SQL Compare.  Bundle this script in with the deploy of the code, and push it all out together, to each tenant as they get upgraded. 

For shared data: 99% of the time, shared data is of the sort where it needs to be updated soon, but not necessarily real-time.  An example of this is cross-client data for statistical computation.  To allow this kind of computation in a multiclient situation, create a process that goes to each system and collects the appropriate data, possibly in varying ways based on the tenant's installed version, then pulls that data into a single system.  The combined reporting system can be optimized and tweaked for reporting needs without affecting the designs of the individual tenants.  Conversely, if there's any reference data they all need to share (dropdown data, etc) this information can be coordinated with a similar process that does a data push to the different systems, based on application version.

This kind of automation is some up-front work, but once it's done, it's done, and it allows tremendous application flexibility... and once it's set up, it's easy-there's no reason it couldn't become fairly invisible from a management perspective..  In addition, it's really a very separate task from the business application itself-it's actually so separate that the syncing application can be written by a completely different team that knows little to nothing about the main application.  For that matter, I'm not sure if there's a tool out there that does all of these things already(I bet there is, come to think of it), but there's no reason such a tool couldn't exist.


Posted 04-24-2009 12:49 AM by Anne Epstein
Filed under:

[Advertisement]

Comments

Mike Hadlow wrote re: The Case for Multiple DBs in Multi Tenancy situations
on 04-24-2009 4:54 AM

Hi Anne,

Great article. I too was very influenced by Ayende's articles on multi-tenancy. I'm one of the authors of Suteki Shop, an open source eCommerce application. Our multi-tenancy strategy is one application, multiple databases.

I've tried to stick to the rule that a multi-tenanted application should not look like a multi-tenanted application.

We use an IoC container to reconfigure the application for each tenant. I've blogged about our approach here:

mikehadlow.blogspot.com/.../multi-tenancy-part-1-strategy.html

and here:

mikehadlow.blogspot.com/.../multi-tenancy-part-2-components-and.html

DotNetShoutout wrote The Case for Multiple DBs in Multi Tenancy situations - Anne Epstein - Devlicio.us
on 04-24-2009 10:47 PM

Thank you for submitting this cool story - Trackback from DotNetShoutout

Brownsblogging wrote Daily Duty
on 04-27-2009 8:30 AM

Daily Duty

Glenn Block wrote re: The Case for Multiple DBs in Multi Tenancy situations
on 05-01-2009 3:09 PM

Nice post Anne.

Aside from reducing complexity by an order of magnitude, multiple databases have signficant performance improvements. Each DB can be optimized based on the specific data access needs of particular customers. In a shared multi-tenant DB a change to help one often has negative impacts on the other.

high quality backlinks wrote re: The Case for Multiple DBs in Multi Tenancy situations
on 07-18-2014 10:37 PM

mC2uSE I really like and appreciate your post.Much thanks again. Much obliged.

matzcrorkz wrote re: The Case for Multiple DBs in Multi Tenancy situations
on 08-06-2014 6:03 AM

totWUA Im grateful for the article.Really thank you! Much obliged.

Add a Comment

(required)  
(optional)
(required)  
Remember Me?

About The CodeBetter.Com Blog Network
CodeBetter.Com FAQ

Our Mission

Advertisers should contact Brendan

Subscribe
Google Reader or Homepage

del.icio.us CodeBetter.com Latest Items
Add to My Yahoo!
Subscribe with Bloglines
Subscribe in NewsGator Online
Subscribe with myFeedster
Add to My AOL
Furl CodeBetter.com Latest Items
Subscribe in Rojo

Member Projects
DimeCasts.Net - Derik Whittaker

Friends of Devlicio.us
Red-Gate Tools For SQL and .NET

NDepend

SlickEdit
 
SmartInspect .NET Logging
NGEDIT: ViEmu and Codekana
LiteAccounting.Com
DevExpress
Fixx
NHibernate Profiler
Unfuddle
Balsamiq Mockups
Scrumy
JetBrains - ReSharper
Umbraco
NServiceBus
RavenDb
Web Sequence Diagrams
Ducksboard<-- NEW Friend!

 



Site Copyright © 2007 CodeBetter.Com
Content Copyright Individual Bloggers

 

Community Server (Commercial Edition)