MODERN PYTHON PACKAGING
AARNI KOSKELA (@AKX)
ARCHIPYLAGO #1 – JAN 11, 2024

Content Warning
This presentation contains AI generated images of snakes.

WHO
Aarni Koskela (@akx on the internet)

WHAT
• Archipylago specs: Turku born, Parainen grown
• CTO / co-founder / lead dev / sysadmin / printer fixer-upper / all around tech hand at Valohai
• Python since version 2.4 probably?
• Open source maintainer (you've probably at least interacted with code I maintain)

HOW THINGS USED TO BE FOR INSTALLING PACKAGES
• Once upon a time there was no standard way to install Python packages :(
• easy_install was released in 2004; it only installed from source or egg files, and couldn't uninstall packages
• pip ("pip installs packages", duh) was released in 2008; it had learned a bunch of new tricks (wheels, uninstall, listing packages, etc., etc.)

HOW THINGS USED TO BE FOR BUILDING PACKAGES
• Powered by setuptools/distutils
• Basic steps:
  • Write a setup.py + setup.cfg + other files (e.g. MANIFEST.in)
    • This may get very non-basic if your package uses non-Python files
  • Run python setup.py sdist (and/or bdist...)
  • Upload (e.g. using python setup.py upload or twine (2013))
    (... and surely you'll upload to TestPyPI before littering the main index, right..?)
[Image: the old "Distributing Python Modules" guide's TOC]

THE TROUBLE WITH TRIBBLES SETUP.PY
• The big elephant trunk snake in the room about setup.py: It can contain any arbitrary code, even malicious code (why would anyone..?!)
• Source distribution (sdist) packages will necessarily run that arbitrary setup.py code to install the package
  • → If you pip install somepackage and there is no wheel distribution of it, you run the author's code before even importing the package
• Also, as seen before, they can be complex to write and maintain

WHAT ARE WHEELS THEN?
• Wheels (.whl) are ZIP files with a precisely structured name and contents (specified in PEP 427 and PEP 425)
• Wheels can simply* be unpacked without running any code
• Wheels can be pure-Python, or platform-specific (where platform can mean OS, CPU arch, Python impl. etc.)
• Most** packages these days are distributed as wheels
• You can force your pip to always use wheels with --only-binary=:all:

*: for a given value of "simply"
**: I didn't actually check because the PyPI open dataset on Google BigQuery hasn't been updated in a while

INTERLUDE: NEAT TRICKS TO DO WITH PIP AND WHEELS
• You can use pip wheel to download and/or wheel (to verb a noun) all of the dependencies for a project:
  • pip wheel --wheel-dir wheels/ ...
  • (where ... can be e.g. -r requirements.txt)
• This is useful in e.g. Docker multi-stage builds, or when you need to install packages on an airplane or a train in Japan
  • On that note, also see https://github.com/akx/pip-local-cache-index
• You can also use pex (on GitHub) and zipapp (in the standard library) to bundle wheels + your code

BACK TO MODERN TIMES
• PEP 517 (introduced 2015, finalized 2017) moves building responsibility from setup.py to build backends
• PEP 518 (2016) specifies how to specify the build backend (in pyproject.toml)
• This doesn't really alleviate the arbitrary code issue, since the build backends are code too
  • ... but it encourages authors (or CI) to build wheels instead of sdists
• As an aside, Python 3.11 introduced tomllib in the standard library (Finnish code, by the way, torille!!!) for reading TOML
[Image: very futuristic optimistic hacker snake with awesome sunglasses]

WHY SHOULD I CARE (AS A LIBRARY MAINTAINER)?
• setup.py install has been deprecated since 2021 (and setuptools 58+ will have nagged you about it already)
• Switching to PEP 517 tooling will (very likely) make your life easier
  • Other people (like me) will maintain build backends for you
  • People don't need to dig through your weird bespoke setup.py
  • Your CI workflows become easier and likely faster
[Image: tired anthropomorphic hacker snake girl, best quality, official art]

WHY SHOULD I CARE (AS AN APPLICATION/PROJECT DEVELOPER)?
• You don't necessarily need to, to be honest! However....
• Using PEP 517 will make your package more easily installable by end-users (and fellow developers):
  • pip install -e . will install the package in the current directory in editable mode; you're done!
  • Aside: This also works for libraries you're working on, of course, and for working on libraries while working on a project: pip install -e ../some-library
• It will make it possible to run your package via pipx too (the analogue of JavaScript's npx)
[Image: tired anthropomorphic snake girl python software developer]

QUICK ASIDE: PIPX IS USEFUL AND FUN
• Pipx lets you run arbitrary code from the internet without looking at it first! Fun!
• It also lets you run a tool without having to set up a virtualenv for it first (and surely you wouldn't install packages system-wide without a virtualenv, right?)
• Ever needed to rip audio from a YouTube video, or download a video from Twitter?
    pipx run yt-dlp -x --audio-format=wav https://youtu.be/lpiB2wMc49g
  • The equivalent tool for YLE Areena is yle-dl
• You can also use pipx to run ad-hoc from a GitHub repository:
    pipx run --spec git+https://github.com/akx/nurin nurin

OKAY, SO HOW DO WE DO THIS?
• You'll need to choose a build backend first!
• I recommend (and will focus on) https://github.com/pypa/hatch for user-friendliness, extensibility and easy migration from setuptools: hatch new --init
  • Hatch also has a bunch of poetry-like features such as test environment management, etc.
• Other alternatives:
  • setuptools is PEP 517 compliant (and can be a decent option, though it's less user-friendly and extensible than Hatch)
  • Poetry is (to a degree) too
  • there's PDM, Flit, and a host of others...
  • or if you have special needs (e.g. bespoke extension modules or you're a NumPy maintainer), you could roll your own!
• The choice of build backend does not (should not, anyway) affect your end users' experience
[Image: setuptools or a tricycle? poetry or a tricycle? hatch or a tricycle?]

OKAY, I CHOSE HATCH BECAUSE YOU TOLD ME TO, WHAT NOW?
1. Convert your project
  • For a new project, hatch new myproject will bootstrap a directory structure for you (opinionatedly)
  • For a current project with or without setup.py, hatch new --init will populate pyproject.toml (opinionatedly)
  • If you have requirements.txt, copy those over to the project.dependencies list in pyproject.toml
2. Delete the legacy files
3. If all goes well, you're done, and you can:
  • build your package with hatch build or python -m build .
  • or install the package with pip install -e .
[Image: i really don't know why it looks like the snake is drooling, i prompted reworks]

INTERLUDE: EXAMPLE PYPROJECT.TOML

    [build-system]
    requires = ["hatchling"]
    build-backend = "hatchling.build"

    [project]
    name = "nurin"
    description = "Taasko se netti on nurin"
    readme = "README.md"
    requires-python = ">=3.7"
    license = "MIT"
    keywords = []
    authors = [
      { name = "Aarni Koskela", email = "akx@iki.fi" },
    ]
    dependencies = [
      "click",
    ]
    dynamic = ["version"]

    [project.scripts]
    nurin = "nurin.__main__:main"

    [project.urls]
    Documentation = "https://github.com/akx/nurin#readme"
    Issues = "https://github.com/akx/nurin/issues"
    Source = "https://github.com/akx/nurin"

    [tool.hatch.version]
    path = "nurin/__about__.py"

    [tool.coverage.run]
    branch = true
    parallel = true
    omit = [
      "nurin/__about__.py",
    ]

    [tool.ruff]
    select = ["ANN", "B", "C", "E", "F", "W"]
    ignore = ["E501", "ANN101"]

    [tool.ruff.per-file-ignores]
    "tests/*" = ["ANN"]

pipx run pygments -O full=true,style=gruvbox-light -o temp.html pyproject.toml

• [build-system]: PEP 518 build system spec; the packages required to build this one. (Hatchling is the packaging core for Hatch; Hatch contains the rest of the owl, e.g. the CLI, etc.)
• [project]: Package metadata (PEP 621). This is for my nurin project which keeps track of when our home Internet connection goes down, and tries to resuscitate it.
• dynamic: Tells the infrastructure which fields aren't statically specified in this file and will be determined by the backend
• [project.scripts]: Which executable wrappers to generate from which entry points (yes, it's this easy!)
• [project.urls]: URLs to various project pages; shown on PyPI
• [tool.hatch.version]: Tells Hatch where to read the version number from (and where to update it: hatch version patch)
• [tool.coverage.run], [tool.ruff]: Configuration for other tools (coverage-py and the Ruff linter/formatter, in this case). A Hatch-initialized pyproject.toml will also have more Hatch-specific fields here, but they're optional.

UM, WHAT DO I PUT IN DEPENDENCIES?!
• Don't use entirely unversioned dependency specifiers!
• I would suggest fairly loose versions to avoid compatibility issues; e.g. click~=7.5 is shorthand for click>=7.5,<8 (this is slightly different from npm!)
• For an application, you could try https://github.com/juftin/hatch-pip-compile to lock dependency versions, or simply pip freeze -l > requirements.known-good.txt
• Use extras (optional) dependencies to avoid installing the universe!
• You can also use constraints files (supported in pip since 2015) to limit package versions without pinning

QUICK DETOUR: OPTIONAL DEPENDENCIES
• You can use optional dependency groups to declare dependencies that aren't required by your project at all times
  • For example, type-checking and testing dependencies (though those can also be supported by Hatch's environment capabilities)
  • For a deep learning library, dependencies only required when training a model could be in a train extra (now if only those libraries did this...)
• When installing a package, name the extras in brackets: pip install mypackage[mypy,test]

HOW DO I DISTRIBUTE MY AWESOME PACKAGE?
• If your package is ready to be published to the whole wide world (i.e. to PyPI):
  • the easiest is hatch publish (which uses twine under the hood)
  • generally, for a dist/ directory containing sdists or wheels, twine upload
  • PyPI recently gained support for GitHub Actions trusted publishing, so you don't need to mess around with access tokens if you want automated publishing from CI ✨
• And of course you can just install packages from http or local URLs (and use --find-links to tell pip where to find packages)
• For private packages, you can set up an alternate PyPI index (it's approximately just an HTML page!) with e.g. https://github.com/chriskuehl/dumb-pypi and use it with --extra-index-url

Questions?
“Thanks!” – Me
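
Footnote to the "approximately just an HTML page" remark on the distribution slide: a minimal sketch of what such a page could look like, assuming a flat page of links of the kind pip's --find-links option can parse. The function name render_links_page and the wheels/ directory are hypothetical, made up for illustration; this is not how dumb-pypi or PyPI itself generates its pages.

```python
import html
from pathlib import Path


def render_links_page(filenames: list[str]) -> str:
    """Render a flat HTML page with one anchor per distribution file.

    Hypothetical helper for illustration only; real index generators
    add per-project pages, hashes and other metadata.
    """
    links = "\n".join(
        f'    <a href="{html.escape(name, quote=True)}">{html.escape(name)}</a><br>'
        for name in sorted(filenames)
    )
    return f"<!DOCTYPE html>\n<html>\n  <body>\n{links}\n  </body>\n</html>\n"


if __name__ == "__main__":
    # Assumed layout: wheels were collected earlier into ./wheels/,
    # e.g. with pip wheel --wheel-dir wheels/ ...
    wheel_dir = Path("wheels")
    names = [p.name for p in wheel_dir.glob("*.whl")] if wheel_dir.is_dir() else []
    print(render_links_page(names))
```

If the output is saved next to the wheel files, pointing pip at the page with --find-links should let it discover them, though in practice pointing --find-links straight at the directory is simpler.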