EuroPython 2018: Let’s Build a Python Profiler in 25 LOC

Noam Elfanbaum is the author of pycubator.com and a data engineering lead at Bluevine.

Builtins

Python has two builtin profilers in stdlib: profile and cProfile (for speed).

python -m cProfile -s tottime file.py

Deterministic profiling

We can access the process call stack ourself on a trigger, and then print it out. We'll use ``sys.setprofile```, and register a profiler which receives a call_stack, an event, and further args. and then we run traceback.extact_stack(call_stack) to get to the data.

def profiler(call_stack, event, arg):
    line = traceback.extract_stac(call_stack)[0]
    print(line)

sys.setprofiler(profiler)

Statistical profiling

Statistical prifliers sample the program on a given interval. One way to do so is using OS interrupts.

def print_handler(handler, call_stac):
    line = traceback.extract_stac(call_stack)[0]
    print(line)

def start(handler, interval=1):
    signal.signal(signal.SIGPROF, handler)
    signal.setitemer(sgnal.ITIMER_PROF, interval, interval)
    atexit.register(malbda: signal.setitimer(0))

start(print_handler)

Selection

The statistical profilers run with low, controllable and predictable overhead. (Use in prod) The deterministic profiler introduces latency and stress, but is more complete. (Don't use in prod)

Building your own statistical flame graph profiler

Set an interval as before. After extracting the tack, join the stack lines, and count how often we arrived at this location.

import atexit
import collections
import signal
import traceback

stats = collections.defaultdict(int)


def _sample(_, call_stack):
    stack = traceback.extract_stack(call_stack)
    formatted_stack = ';'.join(line.line for line in stack)
    stats[formatted_stack] += 1


def start(interval=0.005):
    signal.signal(signal.SIGPROF, _sample)
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    atexit.register(lambda: signal.setitimer(
        signal.ITIMER_PROF, 0))


def format_stats():
    stats_out = [f'{trace} {count}' for trace, count
                 in stats.items()]
return '\n'.join(reversed(stats_out))

Run this and pipe the result into flamegraph | browser. You'll now get an svg plotting where most time was spent.

References:

  • profiler
  • flame graph
  • Juila Evans post about Ruby and Python profilers
  • Nylas performance post where they explain how they built a homemade performance monitoring service (recommended!).
  • Python's profilers docs