Instagram's tech stack will surprise you

Date: 2023-03-22 | featured | tech-stack | technology |

I love building software and consume a lot of content talking about the best tech stacks, programming languages, and strategies to scale systems and businesses. I also spent 3 years working as a software engineer at Meta - building and scaling backends for Facebook and Instagram, some of the largest services in the world serving billions of users a day.

The problem is that Meta's tech stack breaks many of the "best practices" espoused in the technosphere. Yet it's a system that's scaled far beyond almost every other system on the planet. This discrepancy calls into question the veracity of these "best practice" strategies.

In this post I'm going to detail 3 surprising things about Meta's tech stack to serve as a counterexample to balance these "best practices".

Architecture

Claim: Monoliths don't scale

A lot of people think monoliths don't scale. For years microservices have been pushed as the solution to this problem and many orgs have undergone the often multi-year endeavor of migrating their architecture.

Data: Monoliths do scale

Yet Meta, one of the largest software systems on the planet, runs on monoliths.

Meta's backbone is estimated at hundreds of thousands to millions of servers, most of them running monoliths. These monoliths handle the main app logic for each of Meta's platforms, including Facebook and Instagram.

Meta / Facebook / Instagram run on Monoliths

Balance: Monoliths can scale

On balance, Meta and its services don't run only on monoliths. Instead, monoliths handle most of the core app / business logic, and heavy workloads get isolated into their own services.

I call this general architecture approach "service-based monoliths".
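To make the shape concrete, here is a minimal sketch of a service-based monolith. It is not Meta's code: `Monolith`, `MediaEncoder`, and `FakeRemoteEncoder` are hypothetical names invented for illustration, and the "remote" encoder is a stand-in that never touches the network. The point is only that core logic stays as plain in-process calls while the one heavy step sits behind an interface.

```python
from dataclasses import dataclass, field
from typing import Protocol


class MediaEncoder(Protocol):
    """Boundary for a heavy workload that lives outside the monolith."""

    def encode(self, raw: bytes) -> bytes: ...


class FakeRemoteEncoder:
    """Stand-in for a client that would call a separate encoding service."""

    def encode(self, raw: bytes) -> bytes:
        # A real client would make a network call here; this just tags the bytes.
        return b"encoded:" + raw


@dataclass
class Monolith:
    """Core app logic in one deployable; modules call each other in-process."""

    encoder: MediaEncoder
    users: dict[str, str] = field(default_factory=dict)
    posts: list[dict] = field(default_factory=list)

    def sign_up(self, user_id: str, name: str) -> None:
        self.users[user_id] = name

    def create_post(self, user_id: str, media: bytes) -> dict:
        if user_id not in self.users:
            raise KeyError(f"unknown user: {user_id}")
        # Only the heavy step crosses the service boundary.
        post = {"author": self.users[user_id], "media": self.encoder.encode(media)}
        self.posts.append(post)
        return post


app = Monolith(encoder=FakeRemoteEncoder())
app.sign_up("u1", "Ada")
post = app.create_post("u1", b"cat.mp4")
```

Because the monolith depends only on the `MediaEncoder` interface, the heavy workload can be moved, scaled, or rewritten without touching the sign-up and posting logic.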

For more, read: Software Monoliths for Scale

Programming Languages

Claim: "slow" languages don't scale

Software engineers love to optimize - it's a common path to waste via the build trap and premature optimization. One area of optimization that leads to constant bikeshedding is choosing the best, fastest, newest language. We literally have multiple surveys / polls every year on this very subject (see the Stack Overflow Developer Survey 2022).

A common argument is that slower languages don't scale - a charge commonly leveled against languages like JavaScript, Python, PHP, and Ruby.

Data: "slow" languages do scale

Well, let's look at which languages run Meta / Facebook / Instagram - one of the largest software systems in the world.

  • Meta / Facebook - PHP
  • Instagram - Python / Django

Facebook and Instagram run on PHP and Python

Just like that we have a great counterexample showing these "slow" languages handling some of the heaviest workloads on the planet.

Balance: "slow" languages can scale

Now to be fair, Meta isn't running only PHP and Python, and the PHP / Python that Meta runs isn't necessarily something you can get off the shelf.

Facebook and Instagram run on custom PHP and Python with faster languages mixed in

Heavy workloads that are specialized / intensive (think video encoding, ML workloads, big data) also don't make too much sense to run on these tech stacks. These workloads are often split off into specialized services built with "faster" languages like C++ and Rust.
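One way to picture that split is a hand-off over a language-neutral message format. The sketch below is an assumption for illustration, not how Meta does it: the job kinds, `handle`, and `drain` are hypothetical, and an in-memory `queue.Queue` stands in for whatever real queue or RPC layer would sit between the "slow" language app and a worker written in C++ or Rust.

```python
import json
import queue

HEAVY_KINDS = {"video_encode", "ml_inference"}  # hypothetical job kinds

job_queue: "queue.Queue[str]" = queue.Queue()


def handle(kind: str, payload: dict) -> str:
    """Run light work inline; hand heavy work off as a JSON message."""
    if kind in HEAVY_KINDS:
        # JSON keeps the boundary language-neutral for the consumer.
        job_queue.put(json.dumps({"kind": kind, "payload": payload}))
        return "queued"
    return "handled inline"


def drain() -> list[dict]:
    """Stand-in for the specialized worker, which could be written in any language."""
    jobs = []
    while not job_queue.empty():
        jobs.append(json.loads(job_queue.get()))
    return jobs


results = [
    handle("render_feed", {"user": "u1"}),
    handle("video_encode", {"file": "cat.mp4"}),
]
jobs = drain()
```

The app code stays in the productive language, and only the messages that reach the queue need a faster runtime on the other side.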

Version Control

Claim: git is the tool for version control

Everyone uses git. It interoperates everywhere - GitHub, GitLab, Bitbucket, VS Code, etc.

Data: Meta doesn't use git for version control

Instead Meta uses Mercurial - a tool they moved to after facing scaling challenges with git.

Personally, after using Meta's Mercurial, I can say git sucks. It's painful to use: basically everyone uses 5% of the thing, and the other 95% just gets in the way.

Balance: git is the most popular version control tool

But just because I dislike git and think Meta's Mercurial is better doesn't mean I'd recommend it right now.

Popularity comes with a lot of benefits like interoperability, tooling, and support. Mercurial doesn't have that, and it has a long road ahead before its developer experience reaches that of git.

Also - it's unlikely your codebase will run into the kinds of scaling challenges Meta did, so git will probably be fine for the entirety of your system's life.

I am keeping an eye on Sapling, Meta's open-source source control client that grew out of Mercurial, to see if the Mercurial-style workflow can gain some traction.

Conclusion

That's it - hopefully some ammunition to contradict some "best practices" when they simply don't make sense in the given situation.

If you're interested in the tech stack I use to launch Simple Scalable Systems, read: Up and Running with CloudSeed (F# / SvelteKit boilerplate).
