F#-like records in Python

Date: 2023-11-01 | create | fsharp | python |

F# has some of the best programming ergonomics of any language I've used in my Software Engineering career. A large part of this is due to its excellent type system and core datastructures - allowing for expressive, precise logic declarations with minimal boilerplate. I liken writing F# to writing in bullet points whereas other languages are more like full-on prose.

The problem is that you don't always get to choose what language you use at work (where a lot of your coding is likely to happen). Getting the same amount of expression and precision in other languages can be quite hard.

Python has been a primary language at companies I've worked for over the last 5 years of my career. In this post we'll explore:

Q: How can we get F#-like records in Python?

Answer

Python dataclasses utilizing the frozen and kw_only flags allow us get similar expression and precision as F# records.

  • dataclass - Provides a lean construct for holding data-mostly groupings
  • frozen - Makes the dataclass immutable
  • kw_only - Enforces creation by field name
@dataclass(frozen=True, kw_only=True)
class SimpleDC:
  a_int: int 
  a_string: str

For a deeper dive into what these do and why this is is so important from a Pythonic / dynamic langage point of view, read: Python Dataclass best practices (and why you should use them)

In the rest of this post we'll explore the basics of how F# records work and how Python dataclasses compare.

You can run all included examples via these Replits:

Fsharp Records vs Python Dataclasses

F# records and Python dataclasses both attempt to solve similar problems - making it easy to group data together and operate on them. This is a very common practice in any software system as at some point you'll want to group related data together (often sub calculations or several inputs to sub calculations).

F# records have several defaults that make them very nice to work with:

  • Value-based equality (vs object-id-based commonly used in OOP)
  • Immutable by default
  • Explicit field-based construction

Example 1: Simple Record

type SimpleRecord = 
  {
    AString : string 
    BInt : int
  }

// .. (excluding main func for brevity)

// Example 1: A simple record (kw_only)
printfn "Example 1"

let mySimpleRecord = 
    {
        AString = "iamastring"
        BInt = 1
    }

printfn "SimpleRecord: %A" mySimpleRecord

// Example 2: default value-based equality
printfn "Example 2"

// Note: Same as mySimpleRecord
let simpleRecord2 = 
    {
        AString = "iamastring"
        BInt = 1
    }

printfn "mySimpleRecord = simpleRecord2 ? %A" (mySimpleRecord = simpleRecord2) // true 

let simpleRecord3 = 
    {
        AString = "DifferentString!" // Note: Different
        BInt = 1
    }

printfn "mySimpleRecord = simpleRecord3 ? %A" (mySimpleRecord = simpleRecord3) // false 

Here we define a simple record, create new records (value-equivalent and value-non-equivalent), then prove that equality is based on value (not instance / object id).

Python's dataclasses don't give us quite so many things out of the box:

  • Value-based equality (vs object-id-based commonly used in OOP)
  • [MISSING] Immutable by default
  • [MISSING] Explicit field-based construction
"""
Example 4: Dataclass equality
"""
print("Example 4: Dataclass equality")

@dataclass 
class Example4DC:
  AString: str 
  AInt: int

e4_a = Example4DC("a", 1)
e4_b = Example4DC("a", 1) # Note: Same as A
e4_c = Example4DC("c", 1) # Note: Different than A

print(f"a == b ? {e4_a == e4_b}") # True
print(f"a == c ? {e4_a == e4_c}") # False

Here we prove that equality is based on the values in the dataclass, not the object ids.

But fear not, Python's dataclass can be extended to fill in the missing attributes we want.

Explicit Field-based Construction

While Python's dataclasses may seem a lot simpler and less verbose than F#'s records based on the examples above, I'm here to tell you that's often not what you want in large projects. Here we can harken back to the Zen of Python's "Explicit is better than implicit" and realize that the simple syntax actually leads to complex code with large amounts of parameters (usually at 3 or more).

Example:

my_data = MyLargeDC("a", 1, 2, 3, "b")

For a more in-depth exploration of implicit dataclass pitfalls, see Python Dataclass Best Practices.

We can enforce explicit field-based dataclass creation by utilizing the kw_only parameter.

"""
Example 5: Explicit dataclasses w kw_only
"""

@dataclass(kw_only=True)
class Example5DC:
  AString: str 
  AInt: int

# TypeError: Expected 0 positional arguments
# ex5_bad = Example5DC("a", 1)

ex5_good = Example5DC(
  AString="a",
  AInt=1
)

Trying to construct a dataclass with kw_only=True without explicit field names will throw a runtime error and most Python typecheckers will surface an error allowing you to catch at buildtime as well.

Immutability

F# is a functional-first language which means it leans heavily on functional programming principles (while allowing you to code more imperative should you wish). Immutability is a principle that largely leads to more robust, easier to understand code.

Python does not do this. It's dynamic and mutable by default. But we can force some level of immutability into these dataclasses using the frozen=True parameter.

This will cause a runtime exception when you try to mutate the dataclass and most typecheckers will surface an error allowing you to catch at build time as well.

"""
Example 2: Immutable dataclasses
* Show a mutable thing you shouldn't do probably
"""
print("Example 2: Immutable dataclasses")

print("Example: MutableDC")


@dataclass
class MutableDC:
  AnInt: int

def bad_mutation(dc: MutableDC) -> None:
  dc.AnInt = 0

mutable_dc = MutableDC(1)
print(f"Before mutation: {mutable_dc}")

bad_mutation(mutable_dc)
print(f"After mutation: {mutable_dc}")

print("Example: ImmutableDC")

@dataclass(frozen=True)
class ImmutableDC:
  AnInt: int

def bad_mutation_blocked(dc: ImmutableDC) -> None:
  dc.AnInt = 0 # squiggly error in editor!

immutable_dc = ImmutableDC(1)
print(f"Before mutation: {immutable_dc}")

# Fails! dataclasses.FrozenInstanceError: cannot assign to field 'AnInt'
# HAMY: Uncomment this section to see failure
# bad_mutation_blocked(immutable_dc)
# print(f"After mutation: {immutable_dc}")

Next

F# is my favorite language but I've yet to be able to use it as part of my job. Luckily it's taught me a lot about what's possible and improved my standards for what I accept as sensible in other languages.

On further investigation of other languages, it seems we can often approach some of these common sense principles - we just might have to dig around in their available libraries and buck some common trends.

Hopefully these tips will help you write more Simple Scalable Systems (3S) in Python.

If you liked this post, you might also like:

Want more like this?

The best / easiest way to support my work is by subscribing for future updates and sharing with your network.