Playing with Python Packaging
I am currently examining whether I should start to use Python seriously. I have
used it for simple scripts forever and we have some Python 2 at work, but I had
never published a package - until today.
As you may know from my other posts, I love learning programming languages. I
hold Haskell dear to my heart, I enjoy whipping up stuff with Perl, I switched
from a Java job to a C++ job for the language ... but lately, I find it hard to
feel that spark. As I get older, I seem to become more pragmatic than before.
And this makes me gravitate towards concentrating my limited energy on Python,
which is a kind of local maximum of a language: it is expressive, it is fast
enough for many tasks (it gets faster with new releases and allows calling into
native code for whatever is still too slow), and it has a broad community, which
means many supported packages for most tasks I could want to do.
So, I am learning how to package Python stuff and I know that Python packaging
supposedly has its thorns. Let's see!
The "Project": astharoshe-hello
My package is yet another "hello world" script. You can install it via
pip install -i https://test.pypi.org/simple/ astharoshe-hello
or check out the sources via
git clone https://git.astharoshe.net/python/astharoshe-hello.git/
(this article describes tag 1.0.2).
Here is an overview of the project structure, created with tree(1):
|-- .gitignore
|-- LICENSE
|-- Makefile
|-- README.md
|-- astharoshe_hello
|   |-- __init__.py
|   `-- main.py
|-- pyproject.toml
|-- setup.cfg
`-- tests
    `-- tests_main.py
To make it less trivial, I put the code into a package instead of a single
file. I also wrote some unit tests that contain some simple mocking and
demonstrate how writing to stdout can be tested - all things that I might want
to look up quickly in the future.
Python has many ways to package code, which I find very amusing from a language
that proudly exclaims "There should be one-- and preferably only one --obvious
way to do it." Recently, a new file, pyproject.toml, was adopted to contain the
package metadata. If you read the guide on it, you soon find that you should
select a "build backend".
In the glossary, it gives no fewer than six examples of build backends, but no
guidance on selecting one. The guide uses "hatchling" for the complete
example. I decided to use "setuptools" instead, since this is a name I recognize
and I know that it will be available.
Python code
As you can see in the tree(1) output, I have no src directory but put my Python
code directly in the repository root. I wrote a single package, which I named
astharoshe_hello. It contains an empty __init__.py and a main.py with the
following contents:
import os

DOMAIN = "astharoshe.net"

def getOS():
    return os.uname().sysname

def getGreeting():
    return f"Hello world from {DOMAIN}! Running on: {getOS()}"

def main():
    print(getGreeting())

if __name__ == "__main__":
    main()
The name "main.py" has no magic meaning; I could have called it "kitten.py"
instead to the same effect (except that my module would then be named
"astharoshe_hello.kitten"). Do not confuse it with "__main__.py", which is the
file Python executes when a package is run via "python3 -m <package>"!
Python tests
Tests go into a separate directory. Their file names start with "tests_", which
matches the default pattern "test*.py" of unittest's test discovery. I have a
single "tests_main.py" in my project:
import astharoshe_hello.main as hello

import unittest
from unittest.mock import patch
import io
import contextlib

class TestMain(unittest.TestCase):
    @patch('astharoshe_hello.main.getOS')
    def test_greeting(self, getOSMock):
        getOSMock.return_value = "Some Unix"
        self.assertEqual(hello.getGreeting(), "Hello world from astharoshe.net! Running on: Some Unix")

    @patch('astharoshe_hello.main.getOS')
    def test_main(self, getOSMock):
        getOSMock.return_value = "Some Unix"
        output = io.StringIO()
        with contextlib.redirect_stdout(output):
            hello.main()
        self.assertEqual(output.getvalue(), "Hello world from astharoshe.net! Running on: Some Unix\n")
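As a variation on the tests above, the underlying os.uname call can be patched instead of the wrapper function, with patch used as a context manager rather than a decorator. This is only a sketch: get_os and greet are hypothetical stand-ins, not the package's code, and create=True is there so the patch also works on platforms where os.uname does not exist.

```python
import contextlib
import io
import os
import unittest
from unittest.mock import patch


def get_os() -> str:
    # Stand-in for the package's getOS(); os.uname() is Unix-only.
    return os.uname().sysname


def greet() -> None:
    print(f"Running on: {get_os()}")


class TestGreet(unittest.TestCase):
    def test_greet(self):
        output = io.StringIO()
        # create=True lets the patch succeed even where os.uname is missing.
        with patch("os.uname", create=True) as fake_uname:
            fake_uname.return_value.sysname = "Some Unix"
            with contextlib.redirect_stdout(output):
                greet()
        fake_uname.assert_called_once_with()
        self.assertEqual(output.getvalue(), "Running on: Some Unix\n")


def run_tests() -> unittest.TestResult:
    """Run the test case quietly and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestGreet)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


if __name__ == "__main__":
    print("ok" if run_tests().wasSuccessful() else "failed")
```

Patching the wrapper, as the real tests do, keeps the test independent of how the OS name is obtained. Patching os.uname also exercises the wrapper itself.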
Project configuration
My pyproject.toml looks like this:
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
And my setup.cfg contains these lines:
[metadata]
name = astharoshe_hello
version = 1.0.2
description = test/example/template project
author = astharoshe.net
author_email = bw@astharoshe.net
license = ISC
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: ISC License (ISCL)
    Operating System :: OS Independent

[options]
packages = find:
python_requires = >=3.10
install_requires =

[options.entry_points]
console_scripts =
    astharoshe_hello = astharoshe_hello.main:main
I am still lacking some fields here, as indicated by "The author of this package
has not provided a project description" on the package page: the long
description (long_description in setup.cfg) is not set. I will fix this in
a future revision. I might also try to replace my setup.cfg with just the
pyproject.toml, since having two declarative configuration files seems
redundant, but for the time being, it can stay like this.
Helper Makefile (optional)
While developing, I found that a simple Makefile would be helpful to avoid
having to remember multiple long command lines. Especially "make test" was
something that I executed quite often.
run-main:
	PYTHONPATH=`pwd` python3 -m astharoshe_hello.main

test:
	PYTHONPATH=`pwd` python3 -m unittest discover tests/

build:
	python3 -m build

clean:
	git clean -fxd

upload-test:
	python3 -m twine upload --repository testpypi dist/*

upload-real:
	python3 -m twine upload dist/*
It also shows how I built and uploaded my package. The upload-real target to
upload to the actual PyPI instead of TestPyPI was never used, of course.
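The test target relies on unittest's discovery, which by default only picks up files matching "test*.py". The following sketch shows that behaviour in isolation. The sample test and the helper count_discovered are made up for illustration and write into a temporary directory, not into the project.

```python
import sys
import tempfile
import unittest
from pathlib import Path

# A trivial test module, written to disk under different file names below.
SAMPLE_TEST = '''\
import unittest


class TestSample(unittest.TestCase):
    def test_truth(self):
        self.assertTrue(True)
'''


def count_discovered(filename: str) -> int:
    """Write one test file into a temp dir and count the tests discovery finds."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / filename).write_text(SAMPLE_TEST)
        try:
            # "test*.py" is the default pattern of "python3 -m unittest discover".
            suite = unittest.TestLoader().discover(tmp, pattern="test*.py")
            return suite.countTestCases()
        finally:
            # Discovery imports the file and extends sys.path; undo both.
            sys.modules.pop(Path(filename).stem, None)
            if tmp in sys.path:
                sys.path.remove(tmp)


if __name__ == "__main__":
    for name in ("tests_main.py", "main_tests.py"):
        print(name, count_discovered(name))
```

A file named "tests_main.py" is found because it starts with "test", while "main_tests.py" would be skipped unless a different pattern is passed with "-p".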
Conclusion
Creating and publishing a Python package that can be installed via pip on any
other system was not a hard task and I am very satisfied with the result,
notwithstanding the missing description and the setup.cfg/pyproject.toml situation.
I am very interested in seeing how the packaging interacts with actual
dependencies - my tutorial package here does not depend on anything and
therefore does not demonstrate how these work (but I added an empty
"install_requires" property to the setup.cfg as a placeholder/reminder).
I found that the "batteries included" aspect of Python fell short during this
adventure. You are left on your own when it comes to creating the directory
structure. I miss something like "cargo new" here. This is even more true of
the configuration files, which I could not write again without looking at the
documentation or my own notes here. Other programming languages let me quickly
fill my data into files created from templates, which is much nicer.
My Makefile is another aspect of this: Where is my "cargo run" or "cargo test"
or "cargo publish"? Maybe using uv would fill these holes, but it does not come
with my Python installation (and is even written in a different programming
language, Rust - which is both nice and sad).
Then again, I had to install py3-build on OpenBSD to build my package, and I had
to install twine from pip to upload it. Both are missing from the default
Python installation, so maybe I should have just installed uv instead? I might
try it in the future. But that, again, just demonstrates that Python makes me
decide too many things myself instead of offering me a sane default and leaving
any deviations to the experts!
Anyway, here is my project page:
Bonus: .gitignore
This, by the way, is the .gitignore I used here:
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# UV
# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
#uv.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# PyPI configuration file
.pypirc
It is nothing special, I copied it from here: