Essay - Published: 2023.10.18 | benchmarks | create | software |
DISCLOSURE: If you buy through affiliate links, I may earn a small commission. (disclosures)
Programming language benchmarks like TechEmpower, Web Benchmarks, and the Benchmarks Game use standardized test scenarios to try to determine which programming language is fastest at each one. Many software engineers (myself included) reference these benchmarks when comparing and choosing languages for projects.
The problem with these benchmarks is that they're all missing a key ingredient necessary to understand the true speed of a programming language. To find out what that is, we need to answer a different, broader question:
Q: What is "speed" with respect to a programming language?
The obvious answer is performance: how fast the language can do operations (like math) or respond to web requests (which power your favorite websites). This is the approach most benchmarks take.
This is a valid answer and makes a lot of sense: raw performance is certainly a large factor you want to be aware of when choosing a language (at least to avoid the deal-breakingly slow ones). Plus it's relatively easy (you'd think - it's actually quite hard) to implement "fair" tests and compare the results.
So it's unfair to say that these benchmarks are wrong or doing the wrong thing (though all benchmarks come with asterisks) - they're doing a reasonable thing reasonably well. But it doesn't mean that much in reality.
The reason these benchmarks don't hold up when theory meets reality is that the things they're measuring are rarely the bottleneck in today's software.
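To make this concrete, here's a back-of-envelope sketch (all latency numbers are assumptions for illustration, not measurements): when a typical web request spends most of its time waiting on the database and the network, even a 10x faster language barely moves the total.

```python
# Back-of-envelope model: total request latency when I/O dominates.
# All numbers below are assumed for illustration.
db_query_ms = 20.0     # assumed database round trip
network_ms = 5.0       # assumed network overhead
compute_fast_ms = 0.5  # request handling in a "fast" language
compute_slow_ms = 5.0  # the same handling in a language 10x slower

total_fast = db_query_ms + network_ms + compute_fast_ms  # 25.5 ms
total_slow = db_query_ms + network_ms + compute_slow_ms  # 30.0 ms

# A 10x difference in compute becomes a much smaller end-to-end difference.
print(round(total_slow / total_fast, 2))
```

The 10x language-level gap collapses to roughly a 1.2x end-to-end gap, which is why operation-speed benchmarks rarely predict real-world latency.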
This means that these benchmarks are largely not that important outside of a few areas:
Okay, so I've already laid out why I think your benchmark isn't that useful in reality. But I haven't yet explained why it's wrong.
The main reason is that it's missing the largest factor in software speed. Just as a Big O bound is dominated by its largest term, a benchmark that omits the largest factor in speed will itself be wrong.
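The Big O analogy can be sketched numerically: in a sum like n² + n, the largest term accounts for nearly all of the total as n grows, which is why O(n² + n) reduces to O(n²). A benchmark that drops the largest term of "speed" makes the same mistake in reverse.

```python
# Sketch: the largest term of a sum dominates it as n grows,
# which is why O(n**2 + n) is just O(n**2).
def total_cost(n):
    return n**2 + n

for n in (10, 1_000, 100_000):
    share = n**2 / total_cost(n)  # fraction of the total from the n**2 term
    print(n, round(share, 5))     # the share approaches 1 as n grows
```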
When I think of the speed of a given programming language, I typically think of it in two buckets:
- The User perspective: how fast the software runs for the people using it
- The Business / Builder perspective: how fast the software can be built and changed
Measuring the speed of operations roughly covers the User perspective. Even this is debatable, since your bound is largely determined by how you architect the pieces together (hello, Big O), and most users think in terms of thresholds: software is either slow enough to notice or fast enough not to.
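The "architecture dominates" point can be sketched with a toy cost model (the per-operation constants are assumptions for illustration): an O(n log n) design written in a language 50x slower per operation still crushes an O(n²) design in a "fast" language at scale.

```python
import math

# Toy cost model (constants assumed for illustration): architecture beats
# raw per-operation speed once n is large.
n = 1_000_000
fast_lang_ns_per_op = 1.0   # assumed: the "fast" language
slow_lang_ns_per_op = 50.0  # assumed: a language 50x slower per op

bad_algo_fast_lang_s = fast_lang_ns_per_op * n**2 / 1e9              # ~1000 s
good_algo_slow_lang_s = slow_lang_ns_per_op * n * math.log2(n) / 1e9  # ~1 s

print(round(bad_algo_fast_lang_s), round(good_algo_slow_lang_s, 2))
```

The choice of algorithm and architecture moves the result by orders of magnitude; the choice of language moves it by a constant factor.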
But we're missing the Business / Builder perspective, which I'd posit is the larger factor and thus the real bottleneck in programming language speed. The User perspective is certainly important to the usability of your software, but the Business perspective is arguably the larger factor:
So what would the ideal benchmark take into account?
Ideally it would take into account each of these parts:
This would give us a much better perspective of how programming languages stack up in reality, not just theory.
But this is quite hard to do. Each business / tool has its own workflows it cares about (a combinatorial explosion of User scenarios to cover), and no two engineering teams are the same, so Build Time would likely vary a lot even if all other factors were held constant.
The best recommendation I've got for the Build perspective is to survey companies (doing things similar to what you want to do) that have used different technologies, and to look for patterns in what they used, liked / disliked, and ultimately learned. The clearest pattern I've pulled out of this is: static > dynamic languages long-term.
If you're interested in traditional User Perspective benchmarks, you might be interested in:
The best way to support my work is to like / comment / share for the algorithm and subscribe for future updates.