F# + Entity Framework - Group By Aggregation Could Not Be Translated

Date: 2024-06-10 | create | tech | fsharp | entity-framework | fullstack-projects |

I'm building a web app that creates and ranks fullstack project ideas. For ranking I'm using a naive voting system which means I need to aggregate across individual votes to figure out what order to display them in.

I'm using F# and Entity Framework with a Postgres DB so this aggregation maps nicely to a group by, sum aggregation, then order by total descending. However I ran into a lot of problems writing this query with F# / Entity Framework - writing code that compiled but when run threw Error: The LINQ expression could not be translated.

In this post we'll explore the issues I was facing and how I eventually worked around them.

Want to get started building with F# + Entity Framework? CloudSeed is a fullstack project boilerplate to help get you started.

Voting System Overview

Let's start with a general overview of the system to provide context before diving into code.

The data model of this project is very simple:

  • ProjectIdeas - The ideas to display
  • ProjectIdeaVotes - A vote for a (User, ProjectIdea)

Thus finding the top project ideas is pretty easy:

  • Group votes by ProjectIdea
  • Sum the votes
  • Order the votes by votes descending
  • Take the top n of those you want to display

There are of course many other ways to build and model this but this is the one I chose for this iteration so this is what we'll be diving into.

EF Group By Aggregation Could Not Be Translated

My first iteration tried to use EF idiomatically to solve this via Linq:

  • Group By ProjectIdea
  • Sum the votes
  • Take the ones with votes > 0
  • Order by votes descending
  • Take the top n projects I want
[<CLIMutable>]
type ProjectVoteOrderRecord = 
    {
        ProjectIdeaId: Guid 
        TotalVotes: int
    }

let! orderedProjectIds = 
    dbContext.ProjectIdeaVotesIp
        .GroupBy(fun p -> p.ProjectIdeaId)
        .Select( fun g -> 
            {
                ProjectIdeaId = g.Key 
                TotalVotes = g.Sum(fun p -> p.VoteValue)
            }
        )
        .Where(fun g -> g.TotalVotes > 0) 
        .OrderByDescending(fun g -> g.TotalVotes)
        .Select(fun g -> g.ProjectIdeaId)
        .Skip(page * count)
        .Take(count)
        .ToListAsync()
    |> Async.AwaitTask

This compiled fine but the problem happened when I tried to run it: The LINQ expression could not be translated:

System.InvalidOperationException : The LINQ expression 'DbSet<ProjectIdeaVotesIp>()
.GroupBy(p => p.ProjectIdeaId)
.Select(g => new ProjectVoteOrderRecord(
	g.Key, 
	g
	.AsQueryable()
	.Sum(e => e.VoteValue)
))
.Where(e0 => e0.TotalVotes > 0)' 
could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.

This error basically means that the query I wrote could not be translated to the DB layer. The suggestion therefore is to materialize the query earlier (pull data from DB) then do the non-translatable operations in your app logic.

This works (and is what I did in the interim while I figured out what was wrong) but is not really optimal - the whole idea of a DB is to efficiently do data operations in one place. If we're constantly materializing raw data into our app logic to process it we're going to be hogging our app processing, memory, and paying very large network roundtrip costs. All things we want to avoid.

I experimented with different LINQ queries to see if I could pinpoint the issue and it turned out basically any operation after the GroupBy that accessed values of the intermediate result object led to this error. So something was weird with the GroupBy object even though there seem to be many EF examples that use this.

Fixing the EF GroupBy Aggregation

I tried a lot of things to get this working but ultimately the only thing I found that allowed this query to run in the DB was to lose the intermediate result object. This makes the code less readable IMO (and may make the SQL more convoluted) but seems to compile and execute as expected in my DB.

let! orderedProjectIds = 
    dbContext.ProjectIdeaVotesIp
        .GroupBy(fun p -> p.ProjectIdeaId)
        .Where(fun g -> g.Sum(fun p -> p.VoteValue) > 0)
        .OrderByDescending(fun g -> g.Sum(fun p -> p.VoteValue))
        .Select(fun g -> g.Key)
        .Skip(page * count)
        .Take(count)
        .ToListAsync()
    |> Async.AwaitTask

My research into why this happened in EF didn't turn up much but there seems to be some edge cases around trying group by aggregations with non-scalar (primitive) values. My hypothesis is that the F# object was confusing EF (even though this seems possible in C#?) so simply removing it allows it to translate better.

Not optimal maybe but at least we aren't doing the computations in memory now.

Next

Despite all the possible edge cases in Entity Framework, I still find it to be pretty ergonomic for the common 80% cases and still choose to use it in most of my projects.

Q: What are you using for data access in F#? What edge cases / issues are you running into with it?

If you liked this post you might also like:

Want more like this?

The best / easiest way to support my work is by subscribing for future updates and sharing with your network.