Speaker: Bernat Gabor likes to focus on data ingestion pipelines (transformation and quality control).
Let's talk about tox, a testing tool used even by pytest itself.
A regular Python app or library usually has
- business logic
- tests (unit and integration: pytest, nosetests, nose2, …)
- packaging parts (setuptools, flit, poetry, PEX, XAR)
- documentation (sphinx, mkdocs)
- type checks if you're fancy
- static code analysis and linters (pylint, flake8)
- support for multiple Python versions
- support for multiple major dependency versions like Django if applicable
- a quick setup
And you want all of this working after every commit – but every tool has its own interface, so integrating them is hard, and using them is hard for newcomers.
A good start is having documentation, like a CONTRIBUTING file. Usually things devolve into shell scripts and Makefiles, too, to ensure everything is run and tested. Then you add multiple environments, reporting (human and machine readable, please), parallelisation, and suddenly you find yourself far down the rabbit hole. You spend a lot of time, you need to maintain the tooling, it's probably never quite right, etc.
tox was started by pytest developers, who envisioned pytest as the testing framework and tox as the test runner. tox works on the principle that you should be able to merge CI and local testing, replicating every CI issue locally.
tox is configured in tox.ini, where you can define tools. It gives you a central way to invoke them, and isolates them from each other and from your local environment. You can define steps as mandatory or optional, and have different phases of testing.
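A minimal tox.ini along these lines might look like the following sketch (the environment names and source paths are illustrative, not from the talk):

```ini
[tox]
# environments run by a bare "tox" invocation
envlist = py27, py37, flake8

[testenv]
# each environment gets its own isolated virtualenv with these deps
deps = pytest
commands = pytest

[testenv:flake8]
# linting lives in its own environment, isolated from the test deps
deps = flake8
commands = flake8 src tests
```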
You can also use tox to run your actual publishing workflow. You can define separate environments for your testing parts (e.g. documentation needs a different Python version than tests).
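For example, documentation can get its own environment with its own interpreter and dependencies; a sketch (the docs paths and Python version are assumptions):

```ini
[testenv:docs]
# docs may need a different Python than the test environments
basepython = python3.7
deps = sphinx
commands = sphinx-build -b html docs docs/_build/html
```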
Packaging tests are currently a bit iffy, because you still need to put build requirements into your tox.ini, which will be improved once PEP 517 and PEP 518 are implemented.
If you run tox, the environment is created on the first run, and then re-used. Python versions will be discovered in PATH. You can also define separate dependencies per target. After that, you can basically invoke a list of commands. The command runner will strip away most of your environment variables, and will stop on the first failure, failing the whole target; failing output is exposed to the user.
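Because the runner strips the environment, variables you rely on have to be whitelisted explicitly; a sketch using tox's passenv/setenv options (the variable names are just examples):

```ini
[testenv]
deps = pytest
# tox clears most environment variables for reproducibility;
# pass through the ones a test genuinely needs
passenv = SSH_AUTH_SOCK
# or pin values explicitly for the isolated environment
setenv = PYTHONHASHSEED = 0
commands = pytest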
You can of course run only a single target, to shorten the runtime. Use posargs to pass arguments to tools, and use detox to run tests in parallel (downside: no streamed output).
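The posargs substitution forwards anything after `--` on the command line into the environment's commands; a sketch (the `tests` default and the pytest filter are illustrative):

```ini
[testenv]
deps = pytest
# everything after "--" replaces {posargs}, e.g.
#   tox -e py37 -- -k test_login
# falls back to "tests" when nothing is passed
commands = pytest {posargs:tests}
```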
You can use factor expressions to list all combinations that should be tested. Of course, there is also a plugin system, if this doesn't quite meet your needs.
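As a sketch of factor expressions for the Django case mentioned above (version pins are illustrative), envlist braces expand into every combination, and factor-conditional deps pick the matching dependency set:

```ini
[tox]
# expands to py36-django111, py36-django20, py37-django111, py37-django20
envlist = py{36,37}-django{111,20}

[testenv]
deps =
    pytest
    # lines prefixed with a factor only apply to matching environments
    django111: Django>=1.11,<2.0
    django20: Django>=2.0,<2.1
commands = pytest
```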
Caveat: tox will not notice dependency changes on its own – you still need to force-recreate the environments (tox --recreate) after dependencies change.