How fast can we sort 1 Million elements in F#?
Date: 2023-07-12 | create | tech | fsharp |
Last week I investigated how fast we could sort 40k elements in TypeScript. That investigation was fun and people seemed to enjoy it so naturally I was curious to perform similar experiments in my favorite programming language: F#.
In this post we'll be exploring the question:
Q: How fast can we sort 1 million elements in F#?
Answer
The answer is: pretty fast!
I performed 200 benchmark iterations for several test cases with various collection sizes and data types (more details below) and took their p90 (90th percentile) readings to try and find a ~reasonable upper bound. Numbers sorted faster than strings and primitives sorted faster than object properties (as expected).
Overall, p90 results for sorting 1M elements were:
- Primitive ints: 359ms
- Object int property: 1151ms
- Primitive strings: 3101ms
- Object string property: 4026ms
In the rest of this post we'll go into details about the benchmark setup, the code used to run it (and how to run it yourself), and more results including sorting collections of various sizes from 10 - 1M.
Benchmark Setup
The setup for this benchmark is very similar to the one used in my TypeScript benchmark.
Here I'm optimizing for two things:
- Simplicity
- Reproducibility
Simplicity
This is a best-effort benchmark I'm running as a curiosity experiment so wanted to make it as simple as possible to create, understand, and re-run.
The code requires no external dependencies (aside from F# / .NET runtime) so should be copy-pasteable anywhere.
Reproducibility
A good experiment is a reproducible experiment so these benchmarks are built to be as easily reproducible as possible.
- Code - Self-contained, can copy paste as is and run anywhere you can run F#
- Machines - Ran on a free Replit instance so others can test as well
- Template: Official Replit F# template (not sure which .NET / F# version it's running)
- CPU: 0.5 vCPU
- RAM: 512 MB
- Storage: 1GB (Unused)
Benchmark Results
I ran a series of benchmarks timing different scenarios. Each scenario (SIZE, TYPE) was run 200 times to produce summary results.
In general order of sort speed:
- Primitive ints
- Object int property
- Primitive strings
- Object string property
Results will be listed below for each data type:
- SIZE: P90_MILLISECONDS
Sort a list of integers (p90)
- 10: 0
- 100: 0
- 1000: 1
- 10000: 1
- 40000: 13
- 100000: 55
- 1000000: 359
Sort a list of strings (p90)
- 10: 0
- 100: 0
- 1000: 1
- 10000: 54
- 40000: 91
- 100000: 269
- 1000000: 3101
Sort a list of objects by a number property (p90)
- 10: 0
- 100: 1
- 1000: 2
- 10000: 14
- 40000: 72
- 100000: 158
- 1000000: 1151
Sort a list of objects by a string property (p90)
- 10: 0
- 100: 0
- 1000: 1
- 10000: 19
- 40000: 89
- 100000: 320
- 1000000: 4026
In my TS investigation, I also performed a final benchmark sorting ints to check if there was a difference between a "cold" runtime and "hot" runtime. In both this case and TS case I observed no obvious differences but noting these results for posterity.
Sort a list of ints - warmed (p90)
- 10: 0
- 100: 0
- 1000: 0
- 10000: 3
- 40000: 9
- 100000: 59
- 1000000: 375
Benchmark Code
The benchmark code is basically a direct port of my TypeScript benchmark code with a few minor refactors for improved readability / usability.
You can find the full code below but its basically split into 3 primary components:
- BenchmarkTimer - Does the actual timing. Passed into a BenchmarkScenario to use to time what it wants to benchmark.
- BenchmarkScenario - The scenario to benchmark. Can do some setup etc then use the passed-in BenchmarkTimer to time the thing it wants to benchmark (here: sorting)
- BenchmarkRunner - Nice API for running n iterations of a BenchmarkScenario and outputting summary results
Code:
open System
type BenchmarkCode = unit -> unit
type BenchmarkResult = {
TimeElapsedMs : int64
}
type BenchmarkTimer = BenchmarkCode -> BenchmarkResult
type BenchmarkScenario = BenchmarkTimer -> BenchmarkResult
type BenchmarkResultsSummary = {
ScenarioIterations : int
MedianTimeElapsedMs : int64
AverageTimeElapsedMs : int64
P90ElapsedTimeMs : int64
MaxTimeElapsedMs : int64
MinTimeElapsedMs : int64
}
let getCurrentTimeMs =
fun() ->
DateTimeOffset.UtcNow.ToUnixTimeMilliseconds()
let benchmarkTimer (codeToBenchmark : BenchmarkCode) : BenchmarkResult =
let startTimeMs = getCurrentTimeMs()
codeToBenchmark()
let endTimeMs = getCurrentTimeMs()
{
TimeElapsedMs = (endTimeMs - startTimeMs)
}
let getPercentileXElapsedTimeOfResultsArray (results : BenchmarkResult list) (targetPercentile : float) : int64 =
let sortedResults =
results
|> List.sortBy (
fun (x) -> x.TimeElapsedMs
)
let percentileIndex = int (Math.Floor(float(results.Length) * targetPercentile))
sortedResults[percentileIndex].TimeElapsedMs
let getMedianElapsedTimeOfResultsArray (results : BenchmarkResult list) : int64 =
getPercentileXElapsedTimeOfResultsArray results 0.5
let getAverageElapsedTimeOfResultsArray (results : BenchmarkResult list) : int64 =
int (
results
// Cast to float because : https://stackoverflow.com/questions/9527855/int64-doesnt-support-languageprimitives-dividebyint
|> List.map (fun (r) -> float r.TimeElapsedMs)
|> List.average)
let runBenchmarks (scenario : BenchmarkScenario) (times : int) : BenchmarkResultsSummary =
let resultsList =
seq { 1 .. times }
|> Seq.map (
fun _ ->
scenario benchmarkTimer
)
|> Seq.toList
{
ScenarioIterations = resultsList.Length
MedianTimeElapsedMs = getMedianElapsedTimeOfResultsArray resultsList
AverageTimeElapsedMs = getAverageElapsedTimeOfResultsArray resultsList
P90ElapsedTimeMs = getPercentileXElapsedTimeOfResultsArray resultsList 0.9
MaxTimeElapsedMs = (
resultsList
|> List.map (
fun (r) -> r.TimeElapsedMs
)
|> List.max
)
MinTimeElapsedMs = (
resultsList
|> List.map (
fun (r) -> r.TimeElapsedMs
)
|> List.min
)
}
let runAndLogBenchmarks (scenarioName : string) (times : int) (scenario : BenchmarkScenario) : unit =
printfn "Running Scenario: %A - %A times" scenarioName times
let resultSummary = runBenchmarks scenario times
printfn "%A" resultSummary
let getRandomPositiveInteger (random : Random) : int =
int (Math.Floor(float (random.Next() * Int32.MaxValue)))
let getRandomString (random : Random) =
$"{getRandomPositiveInteger random}"
type SimpleObject = {
aString : string
aNumber : int
}
let getRandomSimpleObject (random : Random) =
{
aString = getRandomString random
aNumber = getRandomPositiveInteger random
}
[<EntryPoint>]
let main argv =
printfn "Hello World from F#!"
let random = new Random()
runAndLogBenchmarks "SanityCheck" 5 (
fun (benchmarkTimer : BenchmarkTimer) ->
printfn "In setup"
benchmarkTimer (
fun() ->
printfn "In benchmark!"
)
)
let scenarioIterations = 200
let sizeRuns = [
10;
100;
1000;
10000;
40000; // 40k
100000;
1000000;
]
sizeRuns
|> List.iter (
fun (size) ->
runAndLogBenchmarks (
$"sortNumbersList-{size}") (
scenarioIterations) (
fun (benchmarkTimer : BenchmarkTimer) ->
let allNumbers =
seq {1 .. size}
|> Seq.map (
fun _ -> getRandomPositiveInteger random
)
|> Seq.toList
benchmarkTimer (
fun _ ->
allNumbers
|> List.sort
|> ignore
)
)
)
sizeRuns
|> List.iter (
fun (size) ->
runAndLogBenchmarks (
$"sortStringsList-{size}") (
scenarioIterations) (
fun (benchmarkTimer : BenchmarkTimer) ->
let allStrings =
seq {1 .. size}
|> Seq.map (
fun _ -> getRandomString random
)
|> Seq.toList
benchmarkTimer (
fun _ ->
allStrings
|> List.sort
|> ignore
)
)
)
sizeRuns
|> List.iter (
fun (size) ->
runAndLogBenchmarks (
$"sortObjectNumberList-{size}") (
scenarioIterations) (
fun (benchmarkTimer : BenchmarkTimer) ->
let allObjs =
seq {1 .. size}
|> Seq.map (
fun _ -> getRandomSimpleObject random
)
|> Seq.toList
benchmarkTimer (
fun _ ->
allObjs
|> List.sortBy (fun x -> x.aNumber)
|> ignore
)
)
)
sizeRuns
|> List.iter (
fun (size) ->
runAndLogBenchmarks (
$"sortObjectStringList-{size}") (
scenarioIterations) (
fun (benchmarkTimer : BenchmarkTimer) ->
let allObjs =
seq {1 .. size}
|> Seq.map (
fun _ -> getRandomSimpleObject random
)
|> Seq.toList
benchmarkTimer (
fun _ ->
allObjs
|> List.sortBy (fun x -> x.aString)
|> ignore
)
)
)
sizeRuns
|> List.iter (
fun (size) ->
runAndLogBenchmarks (
$"sortNumbersListWarm-{size}") (
scenarioIterations) (
fun (benchmarkTimer : BenchmarkTimer) ->
let allNumbers =
seq {1 .. size}
|> Seq.map (
fun _ -> getRandomPositiveInteger random
)
|> Seq.toList
benchmarkTimer (
fun _ ->
allNumbers
|> List.sort
|> ignore
)
)
)
0 // return an integer exit code
Next Steps
If you've read this far you must like benchmarks / F#!
In that case, you may also like:
Want more like this?
The best / easiest way to support my work is by subscribing for future updates and sharing with your network.