How fast can we sort 1 Million elements in F#?

Date: 2023-07-12 | create | tech | fsharp |

Last week I investigated how fast we could sort 40k elements in TypeScript. That investigation was fun and people seemed to enjoy it so naturally I was curious to perform similar experiments in my favorite programming language: F#.

In this post we'll be exploring the question:

Q: How fast can we sort 1 million elements in F#?

Answer

The answer is: pretty fast!

I performed 200 benchmark iterations for several test cases with various collection sizes and data types (more details below) and took their p90 (90th percentile) readings to try and find a ~reasonable upper bound. Numbers sorted faster than strings and primitives sorted faster than object properties (as expected).

Overall, p90 results for sorting 1M elements were:

Primitive ints: 359ms
Object int property: 1151ms
Primitive strings: 3101ms
Object string property: 4026ms

In the rest of this post we'll go into details about the benchmark setup, the code used to run it (and how to run it yourself), and more results including sorting collections of various sizes from 10 - 1M.

Benchmark Setup

The setup for this benchmark is very similar to the one used in my TypeScript benchmark.

Here I'm optimizing for two things:

Simplicity
Reproducibility

Simplicity

This is a best-effort benchmark I'm running as a curiosity experiment so wanted to make it as simple as possible to create, understand, and re-run.

The code requires no external dependencies (aside from F# / .NET runtime) so should be copy-pasteable anywhere.

Reproducibility

A good experiment is a reproducible experiment so these benchmarks are built to be as easily reproducible as possible.

Code - Self-contained, can copy paste as is and run anywhere you can run F#
Machines - Ran on a free Replit instance so others can test as well
- Template: Official Replit F# template (not sure which .NET / F# version it's running)
- CPU: 0.5 vCPU
- RAM: 512 MB
- Storage: 1GB (Unused)

Benchmark Results

I ran a series of benchmarks timing different scenarios. Each scenario (SIZE, TYPE) was run 200 times to produce summary results.

In general order of sort speed:

Primitive ints
Object int property
Primitive strings
Object string property

P90 results sorting elements in F#

Results will be listed below for each data type:

SIZE: P90_MILLISECONDS

Sort a list of integers (p90)

10: 0
100: 0
1000: 1
10000: 1
40000: 13
100000: 55
1000000: 359

Sort a list of strings (p90)

10: 0
100: 0
1000: 1
10000: 54
40000: 91
100000: 269
1000000: 3101

Sort a list of objects by a number property (p90)

10: 0
100: 1
1000: 2
10000: 14
40000: 72
100000: 158
1000000: 1151

Sort a list of objects by a string property (p90)

10: 0
100: 0
1000: 1
10000: 19
40000: 89
100000: 320
1000000: 4026

In my TS investigation, I also performed a final benchmark sorting ints to check if there was a difference between a "cold" runtime and "hot" runtime. In both this case and TS case I observed no obvious differences but noting these results for posterity.

Sort a list of ints - warmed (p90)

10: 0
100: 0
1000: 0
10000: 3
40000: 9
100000: 59
1000000: 375

Benchmark Code

The benchmark code is basically a direct port of my TypeScript benchmark code with a few minor refactors for improved readability / usability.

You can find the full code below but its basically split into 3 primary components:

BenchmarkTimer - Does the actual timing. Passed into a BenchmarkScenario to use to time what it wants to benchmark.
BenchmarkScenario - The scenario to benchmark. Can do some setup etc then use the passed-in BenchmarkTimer to time the thing it wants to benchmark (here: sorting)
BenchmarkRunner - Nice API for running n iterations of a BenchmarkScenario and outputting summary results

Run this code on Replit

Code:

open System

type BenchmarkCode = unit -> unit 

type BenchmarkResult = {
  TimeElapsedMs : int64
}

type BenchmarkTimer = BenchmarkCode -> BenchmarkResult

type BenchmarkScenario = BenchmarkTimer -> BenchmarkResult 

type BenchmarkResultsSummary = {
  ScenarioIterations : int
  MedianTimeElapsedMs : int64
  AverageTimeElapsedMs : int64
  P90ElapsedTimeMs : int64
  MaxTimeElapsedMs : int64
  MinTimeElapsedMs : int64
}

let getCurrentTimeMs = 
  fun() ->
    DateTimeOffset.UtcNow.ToUnixTimeMilliseconds()
  

let benchmarkTimer (codeToBenchmark : BenchmarkCode)  : BenchmarkResult =
  let startTimeMs = getCurrentTimeMs()
  codeToBenchmark()
  let endTimeMs = getCurrentTimeMs()

  {
    TimeElapsedMs = (endTimeMs - startTimeMs)
  }

let getPercentileXElapsedTimeOfResultsArray (results : BenchmarkResult list) (targetPercentile : float) : int64 =
  let sortedResults = 
    results
    |> List.sortBy (
      fun (x) -> x.TimeElapsedMs
    )

  let percentileIndex = int (Math.Floor(float(results.Length) * targetPercentile))
  
  sortedResults[percentileIndex].TimeElapsedMs

let getMedianElapsedTimeOfResultsArray (results : BenchmarkResult list) : int64 =
  getPercentileXElapsedTimeOfResultsArray results 0.5

let getAverageElapsedTimeOfResultsArray (results : BenchmarkResult list) : int64 =
  int (
    results
    // Cast to float because : https://stackoverflow.com/questions/9527855/int64-doesnt-support-languageprimitives-dividebyint
    |> List.map (fun (r) -> float r.TimeElapsedMs)
    |> List.average)

let runBenchmarks (scenario : BenchmarkScenario) (times : int) : BenchmarkResultsSummary =

  let resultsList = 
    seq { 1 .. times }
    |> Seq.map (
      fun _ ->
        scenario benchmarkTimer
    )
    |> Seq.toList

  {
    ScenarioIterations = resultsList.Length
    MedianTimeElapsedMs = getMedianElapsedTimeOfResultsArray resultsList
    AverageTimeElapsedMs = getAverageElapsedTimeOfResultsArray resultsList
    P90ElapsedTimeMs = getPercentileXElapsedTimeOfResultsArray resultsList 0.9
    MaxTimeElapsedMs = (
      resultsList
      |> List.map (
        fun (r) -> r.TimeElapsedMs
      )
      |> List.max
    )
    MinTimeElapsedMs = (
      resultsList
        |> List.map (
          fun (r) -> r.TimeElapsedMs
        )
        |> List.min
    )
  }

let runAndLogBenchmarks (scenarioName : string) (times : int) (scenario : BenchmarkScenario) : unit =
  printfn "Running Scenario: %A - %A times" scenarioName times
  let resultSummary = runBenchmarks scenario times
  printfn "%A" resultSummary

let getRandomPositiveInteger (random : Random) : int =
    int (Math.Floor(float (random.Next() * Int32.MaxValue)))

let getRandomString (random : Random) =
  $"{getRandomPositiveInteger random}"

type SimpleObject = {
  aString : string 
  aNumber : int
}

let getRandomSimpleObject (random : Random) =
  {
    aString = getRandomString random
    aNumber = getRandomPositiveInteger random
  }

[<EntryPoint>]
let main argv =
    printfn "Hello World from F#!"

    let random = new Random()

    runAndLogBenchmarks "SanityCheck" 5 (
      fun (benchmarkTimer : BenchmarkTimer) ->
        printfn "In setup"

        benchmarkTimer (
          fun() ->
            printfn "In benchmark!"
        )
    )

    let scenarioIterations = 200

    let sizeRuns = [
        10;
        100;
        1000;
        10000;
        40000; // 40k
        100000;
        1000000;
    ]

    sizeRuns
    |> List.iter (
      fun (size) ->
        runAndLogBenchmarks (
          $"sortNumbersList-{size}") (
          scenarioIterations) (
          fun (benchmarkTimer : BenchmarkTimer) ->
            let allNumbers = 
              seq {1 .. size}
              |> Seq.map (
                fun _ -> getRandomPositiveInteger random
              )
              |> Seq.toList
          
            benchmarkTimer (
              fun _ -> 
                allNumbers
                |> List.sort
                |> ignore
            )
        )
    )

    sizeRuns
    |> List.iter (
      fun (size) ->
        runAndLogBenchmarks (
          $"sortStringsList-{size}") (
          scenarioIterations) (
          fun (benchmarkTimer : BenchmarkTimer) ->
            let allStrings = 
              seq {1 .. size}
              |> Seq.map (
                fun _ -> getRandomString random
              )
              |> Seq.toList
          
            benchmarkTimer (
              fun _ -> 
                allStrings
                |> List.sort
                |> ignore
            )
        )
    )

    sizeRuns
    |> List.iter (
      fun (size) ->
        runAndLogBenchmarks (
          $"sortObjectNumberList-{size}") (
          scenarioIterations) (
          fun (benchmarkTimer : BenchmarkTimer) ->
            let allObjs = 
              seq {1 .. size}
              |> Seq.map (
                fun _ -> getRandomSimpleObject random
              )
              |> Seq.toList
          
            benchmarkTimer (
              fun _ -> 
                allObjs
                |> List.sortBy (fun x -> x.aNumber)
                |> ignore
            )
        )
    )

    sizeRuns
    |> List.iter (
      fun (size) ->
        runAndLogBenchmarks (
          $"sortObjectStringList-{size}") (
          scenarioIterations) (
          fun (benchmarkTimer : BenchmarkTimer) ->
            let allObjs = 
              seq {1 .. size}
              |> Seq.map (
                fun _ -> getRandomSimpleObject random
              )
              |> Seq.toList
          
            benchmarkTimer (
              fun _ -> 
                allObjs
                |> List.sortBy (fun x -> x.aString)
                |> ignore
            )
        )
    )

    sizeRuns
    |> List.iter (
      fun (size) ->
        runAndLogBenchmarks (
          $"sortNumbersListWarm-{size}") (
          scenarioIterations) (
          fun (benchmarkTimer : BenchmarkTimer) ->
            let allNumbers = 
              seq {1 .. size}
              |> Seq.map (
                fun _ -> getRandomPositiveInteger random
              )
              |> Seq.toList
          
            benchmarkTimer (
              fun _ -> 
                allNumbers
                |> List.sort
                |> ignore
            )
        )
    )
    
    0 // return an integer exit code

Next Steps

If you've read this far you must like benchmarks / F#!

In that case, you may also like:

Want more like this?

The best / easiest way to support my work is by subscribing for future updates and sharing with your network.