Date: 2024.09.18 | create | featured | mongo | oracledb | postgres | redis | sqlite | sqlserver | tech | webapps |
DISCLOSURE: If you buy through affiliate links, I may earn a small commission. (disclosures)
I recently read Ethan McCue's Just Use Postgres and a lot of it resonated with me from my own experiences building apps and managing data at work and in side projects.
In this post I'm going to focus on a few reasons why Postgres should be your default DB for webapps and some strategies for how I like to use it that balances simplicity and scale.
Overall my recommendation is to just use Postgres for most web applications. It's free, it's reliable, and it scales - more so than basically any other competitor.
The main alternative to relational databases is NoSQL - things like document databases (Mongo, DynamoDB) or k/v stores (Redis, Valkey).
A lot of startups and companies created in the past decade have built on DocDBs (like Mongo) because they promised a more straightforward data experience - just dump your data as JSON and operate on it there. This works - until it doesn't.
Simplicity
Scalability
Systemization
I'm not saying that Mongo / DocDBs can't work but I am saying that they often bring with them edge case issues that there's not always a great way to resolve. They work quite well when you know what you're doing upfront but the hard part of software is that the requirements keep changing - which means often we don't know exactly what we want at the outset.
Relational databases give us tools to enforce the most important things about our core data: its types, its presence (NOT NULL), its uniqueness, and its relationships (foreign keys).
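As a sketch of what that enforcement looks like in Postgres - table and column names here are hypothetical, just for illustration:

```sql
-- Constraints encode the rules the data must always satisfy.
CREATE TABLE users (
    id         BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email      TEXT NOT NULL UNIQUE,                 -- presence + uniqueness
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()    -- typed, never missing
);

CREATE TABLE orders (
    id      BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id BIGINT NOT NULL REFERENCES users (id),   -- referential integrity
    total   NUMERIC(12, 2) NOT NULL CHECK (total >= 0)
);
```

The DB rejects bad writes at the door, so every reader can trust the shape of what's already stored.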
I've seen these DocDB scaling issues first-hand at my current company Rippling. We use MongoDB but we don't really use it as a DocDB. There is no one document per customer - there are simply too many models with too many teams involved to make that scalable. Instead we have a new doc per entity we're tracking - which looks a lot like a normalized relational table structure. What this means is that we get basically none of the benefits of a DocDB (fast lookups on id) while getting all of the downsides (scans / joins are slow, weak consistency constraints, roll-your-own migrations). There are certainly ways around this and my anecdata is a small N, but my guess is that many use cases follow this pattern.
The downside with these relational approaches is that this extra consistency may require a bit more footwork to set up - you'll often want to update the schema directly yourself, which is an extra step compared to just changing your model in code and saving it.
But really this step is required either way. If you're going DocDB, you can't "just change it in code" because you still have existing data in the DB written against the old model. So you either have to migrate it all now to make the data consistent or build infra to migrate it as you go - an extra step either way, just explicit in the relational model.
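For a sense of scale, that explicit step is usually just a short migration - a hypothetical sketch:

```sql
-- Add the new field, backfill old rows, then tighten the constraint.
ALTER TABLE users ADD COLUMN display_name TEXT;

-- Backfill: here we hypothetically default to the email for existing rows.
UPDATE users SET display_name = email WHERE display_name IS NULL;

-- Once the data is consistent, enforce it going forward.
ALTER TABLE users ALTER COLUMN display_name SET NOT NULL;
```

In a DocDB you still do this work - it's just spread across application code that has to tolerate both old and new document shapes.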
Personally, I find the gains in data integrity, data flexibility, and long-term performance are almost always worth it for the small extra cost in data schema maintenance. You only write sometimes but you read quite a lot. So take the extra time to do it right now so you don't have to redo it later.
Plus there are approaches to minimize the extra work required with relational schemas while still getting most of the benefits - we'll touch on these later.
The way I like to build my data models is:
The reason to use Postgres is that it is reliable, performant, and free. It's the most loved DB, has all the features you might ask for via extensions, regularly ranks in the top 5 DBs for any given task performance-wise, and you don't have to worry about a license that may come back to bite you later on.
Comparing this to other options:
I'm a big fan of Postgres - it's reliable, performant, and free. No other DB comes close when you compare the whole package. Other DBs will often be good competitors in a given use case but Postgres wins out overall (and often beats others at their own game).
That said, I think the general sentiment against relational DBs is warranted - they can be a pain to work with if you don't SQL good, and there is a bit of boilerplate involved in translating your code model into SQL, which can slow down dev time.
So here I want to share an approach that I use with Postgres that tries to leverage it for the good parts while avoiding the known downsides of working with SQL.
The system is: use Postgres with a normal relational table per entity, pull the properties that matter most into typed columns with constraints, and put everything else into a single JSONB column.
This basically gets the data flexibility / integrity / consistency benefits of Relational while still getting the dev benefits of DocDBs (not having to change the schema every time I add a field).
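A minimal sketch of this hybrid layout - table and field names are hypothetical:

```sql
-- Typed columns for the properties that matter, JSONB for everything else.
CREATE TABLE events (
    id         BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    user_id    BIGINT NOT NULL,                     -- enforced, queryable
    kind       TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    data       JSONB NOT NULL DEFAULT '{}'::jsonb   -- flexible, schemaless
);

-- Adding a new field needs no schema change:
INSERT INTO events (user_id, kind, data)
VALUES (1, 'signup', '{"utm_source": "newsletter", "plan": "pro"}');

-- And the JSONB is still queryable (and indexable with a GIN index):
SELECT * FROM events WHERE data->>'plan' = 'pro';
```

The constraints guard the fields you know matter; the JSONB column absorbs everything you haven't committed to yet.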
As I build my systems, if a certain property turns out to be more important than I originally thought, I will migrate it out from the JSONB into its own column so we can enforce its integrity even better.
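Promoting a field like this can be a small migration - a hypothetical sketch, assuming an events table with a JSONB data column holding a "plan" field:

```sql
-- 1. Give the field its own typed column.
ALTER TABLE events ADD COLUMN plan TEXT;

-- 2. Backfill from the JSONB.
UPDATE events SET plan = data->>'plan';

-- 3. Optionally remove it from the JSONB and enforce integrity going forward.
UPDATE events SET data = data - 'plan';
ALTER TABLE events ADD CONSTRAINT plan_allowed
    CHECK (plan IN ('free', 'pro'));   -- hypothetical allowed values
```

Each promotion is incremental - the rest of the JSONB stays flexible while the graduated field gains real constraints.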
This is the core system; then at hyper scale we can use auxiliary DBs like DocDBs or k/v stores as caching / read layers to take load off our core DB. These are rarely necessary until high scale and can often be pretty small, servicing just a few specific read paths.
This gives us most of the benefits of relational while limiting its downsides - typically giving us a better overall experience than if we reached for just relational or non-relational.
This is how I currently build my apps. JSON integrity is still a bit of a pain requiring versioning and hand-rolled migrations but at least I have a bit more flexibility / integrity on the properties that matter most.
If you liked this post you might also like:
The best way to support my work is to like / comment / share for the algorithm and subscribe for future updates.