First Sunday of Advent: Built-in Functions
This post is part of my series on Traversing the Python Standard Library.
Built-in functions are included in the standard library and are always available with no need for imports. There are 69 (nice) of them. I've rearranged them from the Python documentation order (alphabetical) into groups. This list was short enough to do a detailed breakdown; the following days will be more of a summary.
These are the things I did not know or had forgotten:
- open() has an errors argument that you can set to surrogateescape, which allows you to process and even round-trip encoding errors in files.
- iter() can take two arguments and then calls the first one repeatedly until its result equals the second argument, good for chunked reading.
- round(), when it could round either way, chooses the even number, so both round(0.5) and round(-0.5) are 0.
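A quick sketch of the round() behaviour described above. The sample values are my own and are chosen because they are exactly representable as floats, so each one really is a tie.

```python
# Ties are rounded to the nearest even integer ("banker's rounding").
halves = [0.5, 1.5, 2.5, 3.5, -0.5]
rounded = [round(x) for x in halves]

print(rounded)  # [0, 2, 2, 4, 0]
```

Note that 2.5 goes down to 2 while 3.5 goes up to 4, which is the part that tends to surprise people.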
My favourites among the built-ins:
- I love all() and any() and throw them at iterables all the time, particularly at generators.
- The introduction of breakpoint() was theoretically a godsend, but my fingers are so used to typing import pdb; pdb.set_trace() that I'm still breaking (ha.) them of the habit.
- dir() returns a list of attributes of its argument (or all names in your scope if called without argument). Both of these modes are extremely useful when debugging. Uses __dir__ where possible, and otherwise __dict__. Absolutely no guarantees or even attempts at completeness, e.g. it does not include metaclass attributes when called on a class.
Types and casting
Types and casting are pretty intuitive in Python, but it's easy to forget some of the built-in types (for instance, when you don't typically work on bytes).
- bool() tests the truth value of its argument. bool is a subclass of int and cannot be subclassed further.
- callable() tells us if the argument has a __call__ method, i.e. is a class or a function or a lambda function (or probably something I'm forgetting). It does not single out async functions: they are callable too, and calling one returns a coroutine.
- bytearray() objects can only be created with this constructor, which returns a mutable array of bytes. It can be used with a string + encoding, or a buffer, or an iterable of numbers (all of these initialise the bytearray to the values given). Unintuitive: it can also be called with a number and will return a nulled bytearray of that length.
- bytes is the same thing, just immutable.
- tuple: cast input, typically iterables
- slice([start, ]stop) are powerful and slice in particular is underused.
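To make the slice and bytearray notes above a bit more concrete, here is a minimal sketch. The variable names and sample data are made up for illustration.

```python
# A slice object is what the [start:stop:step] syntax builds under the hood,
# so it can be named, stored and reused.
every_other = slice(0, None, 2)
letters = list("abcdefg")
picked = letters[every_other]  # same as letters[0::2]
print(picked)  # ['a', 'c', 'e', 'g']

# bytearray(n) gives n zero bytes; unlike bytes, it can be modified in place.
buf = bytearray(4)
buf[0] = 0xFF
frozen = bytes(buf)  # immutable copy of the current contents
print(frozen)  # b'\xff\x00\x00\x00'
```

Naming a slice is mostly useful when the same window is applied to several sequences.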
Math and numbers
Doing math with Python is reasonably pleasant even without waiting for NumPy or SciPy to install. We'll come back to this in a couple of days when I stare at the statistics standard library module.
- I always forget that abs() exists (and uses __abs__()) and doesn't need to be imported.
- divmod() is exactly that: a // b, a % b
- Why use pow() when you can use **? For integer operands, you can also pass a third mod argument, which is more efficient than doing the technically equivalent a ** b % c.
- bin() converts numbers to binary number strings, hex() does the same for hexadecimals.
- int() and float() create numbers (for objects via __int__() and __float__()). int can take a base argument.
- complex() creates a complex number, either from a string or from two numbers (on objects, it uses __complex__(), falling back on __float__(), which in turn falls back on __index__()).
- hash() returns a number, optionally using __hash__().
- max() and min() are pretty straightforward. You can supply a key= named argument just like for list sorting, and you can provide a default= named argument that will be returned on empty iterables. Both of these take either an iterable or just a lot of arguments to be compared.
- sum() has an optional start named argument that is easy to forget.
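The numeric helpers above, in one small sketch. All numbers and words are arbitrary examples of my own.

```python
# divmod(a, b) returns the pair (a // b, a % b).
quotient, remainder = divmod(17, 5)
print(quotient, remainder)  # 3 2

# Three-argument pow does modular exponentiation without building the huge
# intermediate number that a ** b would produce.
fast = pow(7, 128, 13)
slow = 7 ** 128 % 13
print(fast == slow)  # True

# key= picks what to compare by, default= avoids a ValueError on empty input.
longest = max(["pear", "fig", "banana"], key=len)
nothing = min([], default=None)
print(longest, nothing)  # banana None

# The second argument of sum() is the start value.
total = sum([1, 2, 3], 10)
print(total)  # 16
```

For small numbers the pow() difference is invisible; it starts to matter with the exponent sizes used in cryptography.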
Everything is a string if you squint hard enough.
- ascii() does the same as repr(), but escapes all non-ASCII characters.
- chr() and ord() to convert between numbers and characters
- format(value, format_spec) is roughly the same as "{:format_spec}".format(value), but of course now you can format all sorts of things instead.
- input() prompts for STDIN. I mostly use the inquirer library when looking at input(), because parsing user input 🙄.
- open() turns files into file objects. Use as context manager. The mode can be any of rwax (read, write, append, exclusive create) combined with any of bt (binary or text mode), and + (open for reading and writing); the default is r, i.e. reading in text mode. The buffering argument can change how files are buffered: by default, interactive text files use line buffering, and binary files try to use the underlying device block size or fall back on io.DEFAULT_BUFFER_SIZE. The encoding argument is a life saver and you'll know it when you need it. When it stops saving your life, use the errors argument and set it to surrogateescape.
- print() prints all its non-named arguments to its file= named argument, defaulting to sys.stdout.
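A minimal sketch of the surrogateescape round trip mentioned for open(). The file names and the sample bytes are invented for the example, and a temporary directory stands in for real data. I pass newline="" on both sides so that no line-ending translation gets in the way of the byte-for-byte comparison.

```python
import os
import tempfile

raw = b"caf\xe9 ok\n"  # Latin-1 encoded, so 0xE9 is not valid UTF-8 here

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")    # hypothetical file names
    dst = os.path.join(tmp, "out.txt")
    with open(src, "wb") as f:
        f.write(raw)

    # Undecodable bytes become lone surrogates instead of raising an error.
    with open(src, encoding="utf-8", errors="surrogateescape", newline="") as f:
        text = f.read()

    # Writing with the same handler turns the surrogates back into the bytes.
    with open(dst, "w", encoding="utf-8", errors="surrogateescape", newline="") as f:
        f.write(text)

    with open(dst, "rb") as f:
        roundtripped = f.read()

print(repr(text))           # 'caf\udce9 ok\n'
print(roundtripped == raw)  # True
```

The decoded string is fine to process as long as the surrogate is left alone; printing it to a strict UTF-8 terminal would fail, which is why the example prints its repr().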
- len() the most basic list thing. Will eat your generator alive.
- enumerate() goes through an iterable and yields both the next element and its index (optionally starting the index count at the second argument). The index comes FIRST which I will never remember. (I've added it to my Anki deck, though).
- filter(function, iterable) is the same as (a for a in iterable if function(a)). I will never remember that the filter function comes first.
- map(function, iterable) gives you an iterator that applies the first argument to each of the second, and yields the result. Guess what I will never remember. (I never use map, because list comprehension is always there for me.)
- zip(*iterables) steps through all iterables given to it at the same time. If they have different lengths, everything past the length of the shortest iterable is discarded.
- reversed() is what you use when you don't want to be eDgY and cool and use
__reversed__() if present. Both
sorted can take a
- sorted() returns a sorted version of the provided iterable, optionally using a key= function. Use list.sort() for in-place sorting and sorted() on non-lists and to create new objects.
- next() retrieves the next object from an iterator. I think all the places where I've needed next() were hacky or just plain bad life choices.
iter(), when given one argument, creates an iterator object from it. Boring, and not often useful. Much more interesting is the version with two arguments: The first one is a callable, and the second one is the stopping value ("sentinel"). Iterating through the result will call the first argument until it returns the second one, making for an easy way to build block-based iteration:
    from functools import partial

    with open('mydata.db', 'rb') as f:
        # read 64 bytes at a time until f.read() returns b'' at end of file
        for block in iter(partial(f.read, 64), b''):
            process_block(block)
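Since the argument orders of the iteration helpers above are easy to mix up, here is a small sketch with all of them side by side. The names and numbers are placeholder data.

```python
names = ["ada", "grace", "linus"]
ages = [36, 85, 54, 99]  # one element too many on purpose

# enumerate: index first, element second; the second argument is the start index.
indexed = list(enumerate(names, 1))
print(indexed)  # [(1, 'ada'), (2, 'grace'), (3, 'linus')]

# filter and map: function first, iterable second.
long_names = list(filter(lambda n: len(n) > 3, names))
shouted = list(map(str.upper, names))
print(long_names)  # ['grace', 'linus']
print(shouted)     # ['ADA', 'GRACE', 'LINUS']

# zip stops at the shortest input, so the trailing 99 is silently dropped.
pairs = list(zip(names, ages))
print(pairs)  # [('ada', 36), ('grace', 85), ('linus', 54)]
```

Everything except the input lists is wrapped in list() only for printing; the built-ins themselves return lazy iterators.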
To take the optimistic view: Dynamic programming makes for great debugging skills.
- vars(obj) are nice for debugging, please please please do not use them beyond that.
- help() is something I should use more, but since it's hard to predict which library provides good help strings, I usually don't bother – when I hit this level of confusion, I go read the source.
- id() is great to see if you've got a shallow copy problem.
- isinstance() and issubclass() do exactly what's on the tin. Remember that the class to check against can be a tuple of classes for both of these.
- type(argument) is good for debugging, but try to use isinstance() for actual type checks.
- type(name, bases, dict) is completely different from type(obj). It allows dynamic class creation, which you will know when you need it.
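A minimal sketch of the two faces of type(), plus the tuple form of isinstance(). The Bike class and its attributes are invented for the example.

```python
def describe(self):
    # A plain function that becomes a method once it is put in the class dict.
    return f"{type(self).__name__} with {self.wheels} wheels"

# type(name, bases, dict) builds a class at runtime, roughly what a
# class statement with the same body would do.
Bike = type("Bike", (object,), {"wheels": 2, "describe": describe})

bike = Bike()
print(type(bike) is Bike)              # True
print(bike.describe())                 # Bike with 2 wheels
print(isinstance(bike, (int, Bike)))   # True, matches any class in the tuple
print(issubclass(Bike, (str, object))) # True
```

Passing a list instead of a tuple to isinstance() raises a TypeError, so the tuple really is required.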
And all the other built-ins that didn't fit in the categories above, and that provide dark magic not to be provoked lightly.
- super() is the height of magic, but points towards the extremely useful __mro__ attribute of classes or types. For more information and recipes, read Hettinger's evergreen "super() considered super".
- @classmethod turns the decorated function into a class function. You typically use cls instead of self for the first argument.
- @staticmethod transforms a method into one that does not receive the implicit first argument. Like classmethods, these can be called on the class and the object both.
- property(getter, setter, deleter) creates a property. Use as decorator on def x(), then use @x.setter and @x.deleter if you need special access handling.
- setattr make dynamic programming way too tempting and fun
- compile() turns a string (or bytes) into a code or AST object, which you can then exec() or eval() (returns the result). It can change the __future__ elements included, whether the code may contain top-level async code, and optimization levels (none / remove asserts / remove asserts and docstrings). You never want this, and if you do, you probably want ast.parse(). All of these raise auditing events.
- __import__ is invoked by import. You nearly always want to use importlib.import_module() instead.
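To round off the decorator entries in this list, here is a minimal sketch of property, @classmethod and @staticmethod working together. The Temperature class and its method names are hypothetical and only serve as an illustration.

```python
class Temperature:
    def __init__(self, celsius):
        self._celsius = celsius

    @property
    def celsius(self):
        # Reading t.celsius calls this getter.
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # Assigning t.celsius = ... goes through this check.
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # Receives the class as cls, so subclasses get their own type back.
        return cls((fahrenheit - 32) * 5 / 9)

    @staticmethod
    def is_freezing(celsius):
        # No implicit first argument at all.
        return celsius <= 0


t = Temperature.from_fahrenheit(212)
print(t.celsius)                    # 100.0
print(Temperature.is_freezing(-1))  # True, called on the class
print(t.is_freezing(5))             # False, called on the instance
```

An alternative constructor like from_fahrenheit is probably the most common reason to reach for @classmethod.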