November 30: Built-in Constants, Types, Exceptions

rixx

2020-11-30

After yesterday's built-in functions, today's entry in the traversal of the Python Standard Library docs takes us to the built-in constants, types and exceptions.

Highlights

The things I learned or found most interesting in this post:

I keep forgetting that divmod exists, oops.
Use str.casefold() for case-insensitive comparison, and expandtabs() to replace tabs with spaces, with no reverse operation, mwahaha.
You can assign to list slices (duh), and that includes stepped slices ([2:10:3]) (waat)
I played a bit with memoryview, which lets you access bytes and bytearrays (and modify the latter) directly by exposing the raw integers.
I didn't know that if you use my_set.union() you can actually use any iterable as argument (as opposed to my_set | other_set where other_set really needs to be a set)

Constants

Most constants live in modules, but some are in the built-in global namespace: True, False and None are the important ones. Ellipsis (or ...) is mostly important to show up people who don't know about it. __debug__ tells you if assert statements are active (it is False when Python runs with -O). NotImplemented is a constant that arithmetic and comparison special methods like __add__() return to say they can't handle the other operand; Python then tries the reflected method on the other operand and raises a TypeError if that doesn't work either. Don't confuse it with NotImplementedError, which is an exception you raise yourself. (Also, quit, exit, copyright, credits and license, added by the site module, if you're into printing things.)
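To keep NotImplemented and NotImplementedError apart, here is a minimal sketch. The Metres class is a made-up toy, not anything from the docs; the point is only what happens when __add__() returns NotImplemented.

```python
class Metres:
    """Hypothetical toy class, only here to show NotImplemented."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        # Signal "I can't add this", so Python can try the other
        # operand's __radd__() before giving up.
        return NotImplemented


total = Metres(2) + Metres(3)

try:
    Metres(2) + "three"
except TypeError as exc:
    # Nobody could handle the addition: TypeError, not NotImplementedError.
    error = exc
```

Returning the constant (rather than raising) is what gives the other operand a chance to handle the operation.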

Exceptions

All exceptions derive from BaseException. Built-in exceptions usually have a value (string or tuple) that explains what exactly is going on, passed to the constructor. You can use either the built-in classes or subclass them. When raising new exceptions in an exception context, remember to use raise from to provide the previous context.
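A small sketch of raise ... from, assuming a made-up ConfigError class and read_port() helper (both hypothetical, just for illustration):

```python
class ConfigError(Exception):
    """Hypothetical application-level exception."""


def read_port(settings):
    try:
        return int(settings["port"])
    except KeyError as exc:
        # "from exc" stores the original error on __cause__,
        # so the traceback shows both exceptions.
        raise ConfigError("no port configured") from exc


try:
    read_port({})
except ConfigError as exc:
    caught = exc
```

Here caught.__cause__ is the original KeyError, and caught.args holds the value passed to the constructor.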

Then you get all sorts of exception classes: Exception, of course, ArithmeticError subclasses (Overflow, ZeroDivision), LookupError subclasses (Index and Key), and various useful stuff (AttributeError and TypeError, the dreaded ImportError, NameError, NotImplementedError, if you're happy and you know it SyntaxError). OSErrors are a bit particular about their parameters and there are tons of subclasses for all sorts of OS oops needs.

Warnings

Warnings (messages typically written to stderr) use exception subclasses as categories.
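Roughly how that looks in practice, as a sketch: old_api() is a hypothetical function, and catch_warnings(record=True) is only used here to collect the warning in a list instead of printing it to stderr.

```python
import warnings


def old_api():
    # DeprecationWarning is an exception subclass used as the category.
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure nothing is filtered out
    result = old_api()
```

The function still returns normally; the warning is a side channel, not a raised exception.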

Types

Booleans, truth, comparison

Every object can be tested for its truth value. Objects are assumed to be true unless their __bool__() returns False or their __len__() returns zero. All operations with boolean results return 0/1/True/False, except for and/or which return their last decisive operand.
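A quick sketch of both points. Basket is a hypothetical class that defines no __bool__(), so its __len__() decides its truth value.

```python
class Basket:
    """Hypothetical container: no __bool__(), so __len__() decides."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


empty_is_true = bool(Basket([]))         # False
full_is_true = bool(Basket(["apple"]))   # True

# and/or hand back one of their operands, not necessarily a bool.
name = "" or "anonymous"                 # 'anonymous'
nothing = [] and "never reached"         # []
```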

Comparisons like x < y < z are evaluated like x < y and y < z, except that y is evaluated only once. You can implement comparison sufficiently by implementing __lt__() and __eq__(), though you are of course free to implement more of __le__(), __gt__(), __ge__(). You cannot override is and is not behaviour. For in and not in, implement __contains__().
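As a sketch of the above: middle() records its calls to show the single evaluation, and Version is a hypothetical class with just __eq__(), __lt__() and __contains__().

```python
calls = []


def middle():
    calls.append("called")
    return 5


in_range = 1 < middle() < 10  # middle() runs only once


class Version:
    """Hypothetical class implementing the bare minimum."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        return self.key == other.key

    def __lt__(self, other):
        return self.key < other.key

    def __contains__(self, number):
        return number in self.key


# sorted() only needs __lt__() to do its job.
ordered = sorted([Version(3, 9), Version(2, 7), Version(3, 8)])
has_nine = 9 in Version(3, 9)
```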

Numeric types

We get floats, integers and complex numbers by default, and then other numeric types defined in modules, like decimal.Decimal and fractions.Fraction. We get tons of operations: don't forget that next to the standard arithmetic, there are things like divmod. On integers, you get bitwise operations (|, ^, &, <<, >>, ~), bit_length() and byte conversion (to_bytes() and from_bytes()). On floats, you get things like as_integer_ratio() (for hilarious results), is_integer(), and hex conversion.
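A few of these in action, as a quick sketch so I stop forgetting them (the variable names are just mine):

```python
quotient, remainder = divmod(17, 5)    # (3, 2)
bits = (255).bit_length()              # 8
raw = (1024).to_bytes(2, "big")        # b'\x04\x00'
back = int.from_bytes(raw, "big")      # 1024

half = (0.5).as_integer_ratio()        # (1, 2)
# The hilarious part: 0.1 is not stored as 1/10, so you get a huge
# numerator over a power of two instead of (1, 10).
tenth = (0.1).as_integer_ratio()

whole = (3.0).is_integer()             # True
as_hex = (1.0).hex()
round_trip = float.fromhex(as_hex)     # 1.0
```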

Iterators and generators

Iterator types are the best magic. The __iter__() method on the iterable returns an iterator (iterators have an __iter__() too, which returns the iterator itself, for ease of use). The iterator has a __next__() method that returns the next element or raises StopIteration when there is nothing left. Generators do this for you when you use the yield keyword in their function body.
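To make the protocol concrete, here is the same countdown twice: once as a hand-written iterator class and once as a generator. Both names are hypothetical examples.

```python
class Countdown:
    """Hypothetical iterator written by hand."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself, so it also works in for loops.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


def countdown(start):
    # The generator version: yield does the bookkeeping for us.
    while start > 0:
        yield start
        start -= 1


by_hand = list(Countdown(3))      # [3, 2, 1]
generated = list(countdown(3))    # [3, 2, 1]
```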

Sequences

Sequence types are lists, tuples, and range objects. Shared operations are containment operators (in, not in), addition (concatenation) and multiplication (repetition) (neither of which ranges support), indexed access, slicing, length checks, min and max operations, equality and particularly s.index(element) and s.count(element).

Immutable sequence types are mostly different by actually implementing a hash function, so you can use them as dictionary keys, for example. Mutable sequence types support assignment to items, or to slices, even stepped slices. They also support pop(), clear(), copy(), insert(), remove() and reverse(). Lists have a sort() method (but you can also use the sorted built-in). Ranges are immutable sequences mostly used for looping. Just like slices, they take a start, an end and a step. Ranges are good because they are memory efficient: they only store start, end and step, no matter how many values they cover.
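A short sketch of slice assignment, tuples as dictionary keys, and ranges behaving like sequences (all values here are made up for illustration):

```python
letters = list("abcdefghij")
# Stepped slice: replaces indices 2, 5 and 8. The right-hand side
# must have exactly as many items as the slice selects.
letters[2:10:3] = ["X", "Y", "Z"]

numbers = [0, 1, 2, 3, 4]
# A plain slice may be replaced by a different number of items.
numbers[1:3] = [10, 20, 30]

# Tuples are hashable (if their items are), so they work as dict keys.
grid = {(0, 0): "origin"}

steps = range(0, 20, 5)          # 0, 5, 10, 15
```

After this, letters is a, b, X, d, e, Y, g, h, Z, j and numbers is [0, 10, 20, 30, 3, 4].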

Text

I'm offended by the fact that the three main ways to write strings are single quotes, double quotes and triple quotes, but the names don't correspond to the amount of characters used aaaah.

String methods are manifold, helpful, and easy to forget. I'm listing them here in an effort to forget less of them:

count(substring) counts occurrences
find(substring) and index(substring) find occurrences (plus rfind() and rindex())
startswith() and endswith()
isalnum(), isalpha(), isascii(), isdecimal(), isidentifier(), islower(), isnumeric(), isprintable(), isspace(), istitle(), isupper()
format and format_map
capitalize(), lower(), upper(), title(), swapcase()
strip(), lstrip(), rstrip()
join(), split() (remember to use maxsplit!), rsplit(), partition(), rpartition(), splitlines()
maketrans(), translate() and replace()
casefold() for case insensitive matching
center(width), ljust(width), rjust(width), zfill()
expandtabs(tabsize=8)
New in 3.9: removeprefix and removesuffix
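A handful of the easier-to-forget ones from the list above, as a quick sketch (the example strings are mine):

```python
# casefold() is more aggressive than lower(): it also maps "ß" to "ss".
folded_equal = "Straße".casefold() == "STRASSE".casefold()   # True
lowered_equal = "Straße".lower() == "STRASSE".lower()        # False

expanded = "a\tb".expandtabs(4)                      # 'a   b'

limited = "key=value=more".split("=", maxsplit=1)    # ['key', 'value=more']
head = "key=value=more".partition("=")               # ('key', '=', 'value=more')
tail = "key=value=more".rpartition("=")              # ('key=value', '=', 'more')

padded = "42".zfill(5)                               # '00042'
centered = "hi".center(6, "*")                       # '**hi**'
```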

Python has at least three kinds of string formatting, oh joy. I'm not going into the details here, because that deserves its own post.

Binary

Use bytes (immutable) and bytearray (mutable) to manipulate binary data, for instance by using memoryview to interact with memory representations of objects directly. bytes and bytearray support most of the same methods that strings do, only always limited to ASCII characters. Shove bytes into a memoryview to be able to access them individually, and inspect the underlying memory layout.
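Roughly what my memoryview playing looked like, as a minimal sketch: a view over a bytearray is writable, a view over bytes is read-only.

```python
data = bytearray(b"hello")
view = memoryview(data)

first = view[0]            # 104, the integer behind b"h"
view[0] = ord("j")         # writes through to the bytearray
view[1:3] = b"EL"          # slice assignment must keep the same length

frozen = memoryview(b"abc")
is_readonly = frozen.readonly   # True, because bytes are immutable

shouted = b"Hello".upper()      # string-like methods, ASCII only
```

Since the view shares memory with the bytearray, data now reads jELlo without any copying.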

Sets

set and frozenset are great! Methods include add(), remove(), discard(), pop(), isdisjoint(), <= and <, | and &, - and ^, plus the augmented assignment versions of the last four (|=, &=, -=, ^=). The mutating ones only exist on set, since frozenset is immutable. Note that things like union() can take any iterable, while | is restricted to actual sets.
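The union() versus | difference, as a small sketch:

```python
vowels = {"a", "e"}

# union() accepts any iterables, here a string and a list.
merged = vowels.union("io", ["u"])

try:
    vowels | ["u"]            # the operator insists on a real set
except TypeError as exc:
    operator_error = exc

# discard() stays quiet if the element is missing, remove() raises KeyError.
vowels.discard("z")
```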

Dicts

keys(), values() and items() produce dictionary view objects, which are connected to the dictionary and update when it changes. If you want more like this, wait until we talk about the collections module.

New in 3.9: You can merge dictionaries with |.
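A sketch of both the live views and the merge operator. This assumes Python 3.9 or newer because of |; the dictionary contents are made up.

```python
stock = {"apples": 3}
keys = stock.keys()          # a view, not a copy

stock["pears"] = 5
live = list(keys)            # ['apples', 'pears'], the view followed along

# Python 3.9+: | builds a new dict, the right-hand side wins on conflicts.
merged = stock | {"apples": 10}
```

The original stock dictionary is untouched by the merge; use |= if you want to update in place.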

Context managers

Context managers are amazing and cool! Use them with the with statement. To write one by hand, implement the __enter__() and __exit__() methods; for a shortcut, use the contextlib.contextmanager decorator on a generator function. Look at contextlib for standard library context managers.
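Both ways side by side, as a minimal sketch. Section and section() are hypothetical names, and the events list only exists to show the order in which things happen.

```python
from contextlib import contextmanager

events = []


class Section:
    """Hypothetical class-based context manager."""

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        events.append(f"enter {self.name}")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        events.append(f"exit {self.name}")
        return False  # do not swallow exceptions


@contextmanager
def section(name):
    # Everything before yield is __enter__, everything after is __exit__.
    events.append(f"enter {name}")
    try:
        yield name
    finally:
        events.append(f"exit {name}")


with Section("class"):
    events.append("body")

with section("generator"):
    events.append("body")
```

The try/finally in the generator version makes sure the exit step also runs when the body raises.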

Other types

Other built-in types include modules, classes, functions, methods (bound methods in particular), code objects, type objects, ...