DjangoCon Europe 2019: Pushing the ORM to its limits

Writeup of the DjangoCon Europe 2019 talk »Pushing the ORM to its limits« by Sigurd Ljødal

Sigurd Ljødal: A software developer at Kolonial.no, Norway's largest online grocery store. I have worked with a wide range of technology from frontend to backend, in Python and other languages. For the past year I have been working with warehouse systems, including controlling conveyor systems using asyncio.

Sources – check them, I couldn't write up all the code examples.

Some tips and tricks

Use custom querysets and custom managers: Subclass QuerySet and models.Manager. Then set objects = CustomManager.from_queryset(CustomQuerySet) – and note that the from_queryset method can be chained!

Inspect yo queries: str(queryset.query) and queryset.explain(verbose=True) are your best friends. But be aware that this output will be heavily database dependent.

Avoid extra queries: Use select_related and prefetch_related for greater good! But don't optimize prematurely and make sure that you don't actually make your problems better.

Avoid race conditions: By using select_for_update in a transaction, you're making sure that the row(s) will not be modified by a different query in the meantime. This is a heavy and dangerous tool, so be careful.

Subqueries

Customer.objects.annotate(
    latest_order_time=Subquery(
        Order.objects.filter(
            customer=QuterRef('pk')
        ).order_by('-created_at')\
        .values('created_at')[:1]
    )
)

Use Subquery and Outerref for clever limiting and filtering.

SaltesTarget.objects.annotate(
    gross_total_sales=Subquery(
        Order.objects.filter(
            created_at__year=OuterRef('year'),
            created_at__month=OuterRef('month')
        ).annotate(
            gross_total=Sum('lines__gross_amount')
        ).values_lists('gross_total')
    )
)

Run aggregation without grouping (which happens occasionally unintendedly):

SalesTagets.objects.annotate(
    weekend_revenue=Subquery(
        Order.objects.filter(
            created_at__week_day__in=[7, 1],
        ).values_list(
            Func(
                'lines__gross_amount',
                function='SUM'  # MAGIC
            )
        )
    )
)

Constraints and indexes

Custom constraints in addition to vanilla unique constraints are new in 2.2, and indexes were new in 2.0, including conditional constraints and conditional indexes.

Partial unique constraints

Unique Constraints are the old assignment to unique_together in a model's Meta class. But partial unique constraint is new (and also the only way to limit the amount of null values in a column to one):

class Order(Model):
    ...
    class Meta:
        constraints = [
            UniqueConstraint(
                name='limit_pending_orders',
                fields=['customer', 'is_shipped'],
                condition=Q(is_shipped=False)
            )
        ]

Check constraints

This kind of constraint applies to bulk queries and updates from outside Django:

class MontlyBudget(Model):
    class Meta:
        constraints = [
            CheckConstraint(
                check=Q(month__in=range(1, 13)),
                name='check_valid_month',
            )
        ]
]

Partial index

… sorry, I didn't manage to copy that one.

class Order(Model):
    class Meta:
        indexes = [
            Index(name=)
        ]

Window functions

Get the previous order from the same customer, huge possibilities there:

orders = Order.objects.annotate(
    prev_order_id=Window(
        expression=Lag('order_id', 1),
        partition_by=[F('customer_id')],
        order_by=F('created_at').asc()
    )
)


orders.first().prev_order_id

Customizing the ORM

Create your own database function by subclassing Func:

class Round(Func):
    function = 'ROUND'

… aaand you're done. Use it like all annotation functions. But of course there are more complex possibilities – again, didn't manage to transcribe all of it.

class AsDateTime(Func):

    arity = 2  # takes two fields
    output_field = DateTimeField()

    def as_postgresql(self, compiler, connection, **extra_contextA):
        extra_context['tz'] = settings.TIMEZONE
        template = (
            "(%(expression)s || '%(tz)s')"  # TODO
        )

Writing custom SQL

Let's use the age function in Postgres:

Order.objects.annotate(age=RawSQL('age(created_at)'))

or

Order.objects.extra(select={'age': 'age(created_at)'})

or

Order.objects.raw('''SELECT …''')