December 2: Python Binary Data Services
As part of the Python Standard Library traversal,
today we're following up yesterday's text processing services with
the binary data services:
- Now I know about yet another formatting language in Python. Thanks, struct.
The struct module converts between Python values and C structs (which get presented as bytes). You can pack and unpack values according to format strings that you pass in. You can also use unpack_from for efficient handling of larger buffers.
By default, you get native byte order, size and alignment, but you can change those with the first character of the format string. Other than that, you get a typical type-based placeholder language, aka yet another formatting language, yaay?
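To make the format strings a bit more concrete, here is a minimal sketch. The record layout (an id, a size and a four-byte tag) is made up for illustration and is not from any real file format.

```python
import struct

# Hypothetical record layout: unsigned short, unsigned int, 4 raw bytes.
# The leading "<" selects little-endian byte order with standard sizes
# and no padding. Without a prefix you get native order, size and
# alignment, so the result can differ between platforms.
RECORD = "<HI4s"

packed = struct.pack(RECORD, 7, 1024, b"DATA")
print(packed)                    # b'\x07\x00\x00\x04\x00\x00DATA'
print(struct.calcsize(RECORD))   # 10

record_id, size, tag = struct.unpack(RECORD, packed)

# The same values in big-endian order: only the prefix changes.
big_endian = struct.pack(">HI4s", 7, 1024, b"DATA")

# unpack_from reads at an offset inside a larger buffer,
# so there is no need to slice (and copy) the buffer first.
buffer = b"\x00\x00\x00" + packed
values = struct.unpack_from(RECORD, buffer, offset=3)
```

The offset argument is what makes unpack_from handy for walking through a big blob record by record.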
The codecs module defines base classes for codecs: encoders and decoders. Most of the ones included with Python are for encoding text to bytes.
codecs.encode() and codecs.decode() do what str.encode() and bytes.decode() do. With codecs.lookup(name) you can get a CodecInfo object, which gives you direct access to a stream writer, a stream reader, stateless encode and decode functions, and incremental encoders and decoders. If you have codecs of your own, you can register them by passing a search function to codecs.register().
codecs.open() behaves like the built-in open(), but always opens the underlying file in binary mode. If you need to do weird transcoding magic, use codecs.EncodedFile().
codecs also defines BOM constants for when you have to meddle with platform-dependent data.
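A small sketch of the pieces mentioned above, using UTF-8 as the example codec. The sample text is arbitrary.

```python
import codecs

text = "naïve café"

# The module-level helpers mirror the str/bytes methods.
encoded = codecs.encode(text, "utf-8")
decoded = codecs.decode(encoded, "utf-8")

# lookup() resolves aliases and returns a CodecInfo object.
info = codecs.lookup("utf8")
print(info.name)                      # utf-8

# The stateless functions return (result, length consumed).
data, consumed = info.encode(text)

# Incremental decoding copes with chunks that split a multi-byte
# character: the first chunk ends in the middle of "ï".
decoder = info.incrementaldecoder()
first = decoder.decode(encoded[:3])
rest = decoder.decode(encoded[3:], final=True)

# BOM constants are plain bytes objects.
with_bom = codecs.BOM_UTF8 + b"hi"
without_bom = with_bom.decode("utf-8-sig")
```

The incremental decoder keeps the dangling byte around until the next chunk arrives, which is what you want when reading from a socket or a pipe.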
Codec Base Classes
Executive summary: You can implement your own Codec subclasses, and it's neither impossible nor particularly painful.
You have to implement stateless encoding and decoding as well as stream reading and writing. Additionally, you are encouraged to support at least the two main kinds of error handling, strict and ignore, and optionally further error modes. The module provides base classes for incremental encoding and decoding, and stream encoding and decoding.
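To get a feeling for how little is needed, here is a toy sketch. The "swapcase" codec, its class names and its search function are all invented for this example. It swaps the case of ASCII letters and leaves the error handling to the built-in ascii codec.

```python
import codecs
import io
import string

# Translation table that swaps the case of ASCII letters only.
_SWAP = str.maketrans(
    string.ascii_lowercase + string.ascii_uppercase,
    string.ascii_uppercase + string.ascii_lowercase,
)


class SwapCaseCodec(codecs.Codec):
    """Hypothetical toy codec: ASCII with the letter case swapped."""

    def encode(self, input, errors="strict"):
        # Delegating to "ascii" gives us strict, ignore, replace, ... for free.
        return input.translate(_SWAP).encode("ascii", errors), len(input)

    def decode(self, input, errors="strict"):
        data = bytes(input)
        return data.decode("ascii", errors).translate(_SWAP), len(data)


class SwapCaseIncrementalEncoder(codecs.IncrementalEncoder):
    def encode(self, input, final=False):
        return SwapCaseCodec().encode(input, self.errors)[0]


class SwapCaseIncrementalDecoder(codecs.IncrementalDecoder):
    def decode(self, input, final=False):
        return SwapCaseCodec().decode(input, self.errors)[0]


class SwapCaseStreamWriter(SwapCaseCodec, codecs.StreamWriter):
    pass


class SwapCaseStreamReader(SwapCaseCodec, codecs.StreamReader):
    pass


def search_swapcase(name):
    """Search function: return a CodecInfo for our name, None otherwise."""
    if name != "swapcase":
        return None
    codec = SwapCaseCodec()
    return codecs.CodecInfo(
        name="swapcase",
        encode=codec.encode,
        decode=codec.decode,
        incrementalencoder=SwapCaseIncrementalEncoder,
        incrementaldecoder=SwapCaseIncrementalDecoder,
        streamwriter=SwapCaseStreamWriter,
        streamreader=SwapCaseStreamReader,
    )


codecs.register(search_swapcase)

print("Hello".encode("swapcase"))             # b'hELLO'
print(b"hELLO".decode("swapcase"))            # Hello
print("héllo".encode("swapcase", "ignore"))   # b'HLLO'

# The stream classes work with any file-like object.
stream = io.BytesIO()
codecs.getwriter("swapcase")(stream).write("Hi")
```

The codec is stateless and works character by character, so the incremental classes can simply call the stateless functions. A codec with multi-byte sequences would have to buffer incomplete input there.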
Python comes with a bunch of standard encodings, not just the usual utf-8, latin-1 etc. If you ever need weird encodings, you'll be thankful for it. Some of these are specific to Python itself and have no application outside the language domain. Naq gurer vf nyfb EBG13 fhccbeg.
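A quick, non-exhaustive tour of what that looks like in practice. The sample strings are arbitrary.

```python
import codecs

word = "Grüße"

# Ordinary text encodings turn str into bytes.
as_utf8 = word.encode("utf-8")
as_latin1 = word.encode("latin-1")
print(as_latin1)     # b'Gr\xfc\xdfe'

# unicode_escape is one of the Python-specific ones: it produces
# the escape sequences you would write in a Python string literal.
escaped = word.encode("unicode_escape")
print(escaped)       # b'Gr\\xfc\\xdfe'

# rot13 maps str to str, so it is only reachable through
# codecs.encode() and codecs.decode(), not through str.encode().
secret = codecs.encode("Hello, World!", "rot13")
print(secret)        # Uryyb, Jbeyq!
plain = codecs.decode(secret, "rot13")

# hex is a bytes-to-bytes transform, also only available via codecs.
hexed = codecs.encode(b"\x00\xff", "hex")
print(hexed)         # b'00ff'
```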
encodings.idna is there to transform non-ASCII characters in domain names into those xn-- strings and back, with ToASCII() and ToUnicode(). It also provides the nameprep() function, which normalizes domains, mostly by lowercasing them.
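For illustration, a short sketch of both the codec and the per-label functions. The domain bücher.example is just a made-up example name.

```python
from encodings import idna

domain = "bücher.example"

# The "idna" codec converts a whole domain name, label by label.
ascii_form = domain.encode("idna")
print(ascii_form)    # b'xn--bcher-kva.example'
round_trip = ascii_form.decode("idna")

# The module-level functions work on a single label.
label_ascii = idna.ToASCII("bücher")
label_text = idna.ToUnicode(b"xn--bcher-kva")

# nameprep() normalizes a label, here by lowercasing it.
normalized = idna.nameprep("BÜCHER")
```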
If you're looking at this, please use the idna package from PyPI instead, as encodings.idna only supports an older version of the IDNA standard (IDNA 2003, RFC 3490).