Essay - Published: 2024.06.10 | create | entity-framework | fsharp | fullstack-projects | tech |
DISCLOSURE: If you buy through affiliate links, I may earn a small commission. (disclosures)
I'm building a web app that creates and ranks fullstack project ideas. For ranking I'm using a naive voting system which means I need to aggregate across individual votes to figure out what order to display them in.
I'm using F# and Entity Framework with a Postgres DB so this aggregation maps nicely to a group by, sum aggregation, then order by total descending. However I ran into a lot of problems writing this query with F# / Entity Framework - writing code that compiled but when run threw Error: The LINQ expression could not be translated.
In this post we'll explore the issues I was facing and how I eventually worked around them.
Want to get started building with F# + Entity Framework? CloudSeed is a fullstack project boilerplate to help get you started.
Let's start with a general overview of the system to provide context before diving into code.
The data model of this project is very simple:
Thus finding the top project ideas is pretty easy:
There are of course many other ways to build and model this but this is the one I chose for this iteration so this is what we'll be diving into.
My first iteration tried to use EF idiomatically to solve this via Linq:
[<CLIMutable>]
type ProjectVoteOrderRecord =
{
ProjectIdeaId: Guid
TotalVotes: int
}
let! orderedProjectIds =
dbContext.ProjectIdeaVotesIp
.GroupBy(fun p -> p.ProjectIdeaId)
.Select( fun g ->
{
ProjectIdeaId = g.Key
TotalVotes = g.Sum(fun p -> p.VoteValue)
}
)
.Where(fun g -> g.TotalVotes > 0)
.OrderByDescending(fun g -> g.TotalVotes)
.Select(fun g -> g.ProjectIdeaId)
.Skip(page * count)
.Take(count)
.ToListAsync()
|> Async.AwaitTask
This compiled fine but the problem happened when I tried to run it: The LINQ expression could not be translated:
System.InvalidOperationException : The LINQ expression 'DbSet<ProjectIdeaVotesIp>()
.GroupBy(p => p.ProjectIdeaId)
.Select(g => new ProjectVoteOrderRecord(
g.Key,
g
.AsQueryable()
.Sum(e => e.VoteValue)
))
.Where(e0 => e0.TotalVotes > 0)'
could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'. See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.
This error basically means that the query I wrote could not be translated to the DB layer. The suggestion therefore is to materialize the query earlier (pull data from DB) then do the non-translatable operations in your app logic.
This works (and is what I did in the interim while I figured out what was wrong) but is not really optimal - the whole idea of a DB is to efficiently do data operations in one place. If we're constantly materializing raw data into our app logic to process it we're going to be hogging our app processing, memory, and paying very large network roundtrip costs. All things we want to avoid.
I experimented with different LINQ queries to see if I could pinpoint the issue and it turned out basically any operation after the GroupBy that accessed values of the intermediate result object led to this error. So something was weird with the GroupBy object even though there seem to be many EF examples that use this.
I tried a lot of things to get this working but ultimately the only thing I found that allowed this query to run in the DB was to lose the intermediate result object. This makes the code less readable IMO (and may make the SQL more convoluted) but seems to compile and execute as expected in my DB.
let! orderedProjectIds =
dbContext.ProjectIdeaVotesIp
.GroupBy(fun p -> p.ProjectIdeaId)
.Where(fun g -> g.Sum(fun p -> p.VoteValue) > 0)
.OrderByDescending(fun g -> g.Sum(fun p -> p.VoteValue))
.Select(fun g -> g.Key)
.Skip(page * count)
.Take(count)
.ToListAsync()
|> Async.AwaitTask
My research into why this happened in EF didn't turn up much but there seems to be some edge cases around trying group by aggregations with non-scalar (primitive) values. My hypothesis is that the F# object was confusing EF (even though this seems possible in C#?) so simply removing it allows it to translate better.
Not optimal maybe but at least we aren't doing the computations in memory now.
Despite all the possible edge cases in Entity Framework, I still find it to be pretty ergonomic for the common 80% cases and still choose to use it in most of my projects.
Q: What are you using for data access in F#? What edge cases / issues are you running into with it?
If you liked this post you might also like:
The best way to support my work is to like / comment / share for the algorithm and subscribe for future updates.