EuroPython 2018: Cython to speed up your Python code

Stefan Behnel is a core developer of Cython and lxml.

The Python data ecosystem consists of NumPy to integrate data, and Cython to integrate code.

What is Cython good for?

  1. Integrate native code into Python
  2. Speed up Python code in CPython
  3. Write C without having to write C

Cython is a pragmatic programming language that extends Python, but also an optimising compiler: heavily production-proven, widely used, and all about getting things done. Cython lets you keep your focus on functionality by allowing you to move freely between Python and C/C++. Your code can be as Pythonic as you want, but as low-level as you need.

Calling Python functions from compiled code is not really different from calling them from regular code. But now we can also call C functions. Note that a C function is not directly visible to Python code, so to use one from Python you have to re-expose it in either of these ways:

cimport libc.math

sin_func = libc.math.sin

… or with type checking:

cimport libc.math

def csin(double x):
    return libc.math.sin(x)

Integrations

You can follow along on this Jupyter notebook. Yes, Cython works with Jupyter through the %%cython cell magic; naturally, this adds a build step. Compiled cells are cached, so building isn't terribly slow.

You can also use Jupyter to check what your Cython functions look like as C code. Don't be surprised at the length: there is a lot of error checking and integration going on.

Use cdef double fancy_x = x * x to make Cython aware of variable types. Use cdef extern from "lua.h": to declare Lua's C API, for instance. You can then use Cython to run Lua code by instantiating the Lua runtime, compiling the Lua code with luaL_loadbuffer(state, code, len(code), '<python>'), and executing it with lua_pcall(state, 0, LUA_MULTRET, 0).

Optimisations

Have a look at the stdlib module difflib. We'll optimise it, using cProfile for profiling and fuzzywuzzy for benchmarking. Profiling tells us that difflib.SequenceMatcher.find_longest_match is our most expensive function.
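A profiling run of this kind can be sketched with the standard library alone. This is a minimal sketch, not the setup from the talk: the helper name profile_difflib, the two sample strings and the repeat count are assumptions made for illustration, and plain strings stand in for the fuzzywuzzy benchmark.

```python
import cProfile
import difflib
import io
import pstats


def profile_difflib(a: str, b: str, repeat: int = 200) -> str:
    """Profile repeated SequenceMatcher.ratio() calls and return the report text.

    Hypothetical helper for illustration; not part of difflib or the talk.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(repeat):
        # A fresh matcher per iteration, so no cached matching blocks are reused.
        difflib.SequenceMatcher(None, a, b).ratio()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by time spent inside each function itself and
    # restrict the report to functions defined in difflib.
    stats.sort_stats("tottime").print_stats("difflib")
    return buffer.getvalue()


report = profile_difflib(
    "the quick brown fox jumps over the lazy dog",
    "the quick brown dog jumps over the lazy fox",
)
print(report)
```

Sorting by "tottime" ranks functions by the time spent in their own body, which is usually the more useful view when looking for a single hot loop to compile.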

In that method, we see an integer. Python integers are unbounded, while C integers are limited (yay wrapping), but there is the Py_ssize_t type, which we'll use instead by annotating i: cython.Py_ssize_t. This takes us down from 46 seconds to 35 seconds, as Cython can replace a whole Python for-loop with a C loop. We can also remove Python-specific optimisations, which aren't necessary with Cython. Additionally, we'll type dicts explicitly as d: dict.
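To make the typing step concrete, here is a minimal sketch of a longest-common-block loop written with pure Python annotations. It is a simplified stand-in, not the actual difflib source or the code from the talk: the function name longest_match is hypothetical, junk handling is left out, and the guarded import is an assumption added so the sketch also runs where Cython is not installed (CPython does not evaluate annotations on local variables).

```python
try:
    import cython  # supplies Py_ssize_t and friends when Cython is installed
except ImportError:
    cython = None  # fine at runtime: local annotations are never evaluated


def longest_match(a: str, b: str):
    """Return (start_in_a, start_in_b, size) of the longest common block.

    Simplified illustration of the loop structure; no junk heuristics.
    """
    # Typed index variables: when compiled, the loops can become C loops.
    i: cython.Py_ssize_t
    j: cython.Py_ssize_t
    k: cython.Py_ssize_t
    besti: cython.Py_ssize_t = 0
    bestj: cython.Py_ssize_t = 0
    bestsize: cython.Py_ssize_t = 0

    # Dicts typed explicitly, as in "d: dict".
    b2j: dict = {}
    for j in range(len(b)):
        b2j.setdefault(b[j], []).append(j)

    j2len: dict = {}
    for i in range(len(a)):
        newj2len: dict = {}
        for j in b2j.get(a[i], ()):
            # Extend the match that ended at (i - 1, j - 1), if there was one.
            k = j2len.get(j - 1, 0) + 1
            newj2len[j] = k
            if k > bestsize:
                besti, bestj, bestsize = i - k + 1, j - k + 1, k
        j2len = newj2len
    return besti, bestj, bestsize


print(longest_match("abxcd", "abcd"))
```

Run uncompiled, this behaves like ordinary Python; the annotations only start to matter once the module is compiled with Cython.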