Playing with Python Packaging

I am currently examining whether I should start to use Python seriously. I have used it for simple scripts for ages, and we have some Python 2 at work, but I had never published a package - until today.

As you may know from my other posts, I love learning programming languages. I hold Haskell dear to my heart, I enjoy whipping up stuff with Perl, I switched from a Java job to a C++ job for the language ... but lately, I find it hard to feel that spark. As I get older, I seem to be becoming more pragmatic than before. And this makes me gravitate towards concentrating what little energy I have on Python, which is a kind of local maximum of a language: it is expressive, it is fast enough for many tasks (it gets faster with new releases and allows calling into native code for anything that is still too slow), and it has a broad community, which means many supported packages for most tasks I could want to do.

So, I am learning how to package Python stuff and I know that Python packaging supposedly has its thorns. Let's see!

The "Project": astharoshe-hello

My package is yet another "hello world" script. You can install it via

pip install -i https://test.pypi.org/simple/ astharoshe-hello

or check out the sources via

git clone https://git.astharoshe.net/python/astharoshe-hello.git/

(this article describes tag 1.0.2).

Here is an overview of the project structure, created with tree(1):

|-- .gitignore
|-- LICENSE
|-- Makefile
|-- README.md
|-- astharoshe_hello
|   |-- __init__.py
|   `-- main.py
|-- pyproject.toml
|-- setup.cfg
`-- tests
    `-- tests_main.py

To make it less trivial, I put the code into a package instead of a single file. I also wrote some unit tests that include some simple mocking and demonstrate how writing to stdout can be tested - all things that I might want to look up quickly in the future.

Python has many ways to package code, which I find very amusing from a language that proudly exclaims "There should be one-- and preferably only one --obvious way to do it." More recently, pyproject.toml was adopted as the file that contains the package metadata. If you read the guide on it, you soon find that you should select a "build backend":

https://packaging.python.org/en/latest/guides/writing-pyproject-toml/

In the glossary, it gives no fewer than six examples of build backends, but no guidance on selecting one. The guide uses "hatchling" for the complete example. I decided to use "setuptools" instead, since this is a name I recognize and I know that it will be available.

Python code

As you can see in the tree(1) output, I have no src directory but put my Python code directly in the repository root. I wrote a single package, which I named astharoshe_hello. It contains an empty __init__.py and a main.py with the following contents:

import os

DOMAIN = "astharoshe.net"

def getOS():
    return os.uname().sysname

def getGreeting():
    return f"Hello world from {DOMAIN}! Running on: {getOS()}"

def main():
    print(getGreeting())

if __name__ == "__main__":
    main()
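
A side note on getOS(): os.uname() only exists on Unix-like systems, so this function would raise an AttributeError on Windows, even though the classifiers in my setup.cfg claim "OS Independent". Here is a minimal sketch of a portable variant, assuming that platform.system() from the standard library is an acceptable replacement. This is not what tag 1.0.2 does, and the names get_os_portable and get_greeting are made up for this example:

```python
import platform

DOMAIN = "astharoshe.net"


def get_os_portable():
    # platform.system() returns names such as "Linux", "OpenBSD" or
    # "Windows" and is available everywhere, unlike os.uname().
    return platform.system()


def get_greeting():
    return f"Hello world from {DOMAIN}! Running on: {get_os_portable()}"


if __name__ == "__main__":
    print(get_greeting())
```

The tests would then have to patch the renamed function instead of getOS.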

The name "main.py" has no magic meaning. I could have called it "kitten.py" instead to the same effect (except that my module would then be named "astharoshe_hello.kitten"). Do not confuse it with "__main__.py", which is the file Python executes when a package itself is run via "python3 -m"!
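
To make the difference to "__main__.py" concrete, here is a small self-contained sketch. It builds a throwaway package in a temporary directory (demo_pkg is a made-up name, not part of my project) that has both a main.py and a __main__.py, and then runs the package itself with "python -m demo_pkg". Setting PYTHONPATH mirrors what my Makefile does further down:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_demo():
    """Create a temporary package and run it via `python -m demo_pkg`."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        # An ordinary module; its name carries no special meaning.
        (pkg / "main.py").write_text(
            "def main():\n    print('hello from main.py')\n"
        )
        # The special file: executed when the *package* is run with -m.
        (pkg / "__main__.py").write_text(
            "from demo_pkg.main import main\n\nmain()\n"
        )
        result = subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            cwd=tmp,
            env={**os.environ, "PYTHONPATH": tmp},
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


if __name__ == "__main__":
    print(run_demo(), end="")
```

My package has no __main__.py, which is why I always run the module explicitly as "python3 -m astharoshe_hello.main".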

Python tests

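Tests go into a separate directory, and their file names start with "tests_". This works because unittest discovery looks for files matching "test*.py" by default. I have a single "tests_main.py" in my project: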

import astharoshe_hello.main as hello
import unittest
from unittest.mock import patch
import io
import contextlib

class TestMain(unittest.TestCase):
    @patch('astharoshe_hello.main.getOS')
    def test_greeting(self, getOSMock):
        getOSMock.return_value = "Some Unix"
        self.assertEqual(hello.getGreeting(),
                      "Hello world from astharoshe.net! Running on: Some Unix")

    @patch('astharoshe_hello.main.getOS')
    def test_main(self, getOSMock):
        getOSMock.return_value = "Some Unix"
        output = io.StringIO()
        with contextlib.redirect_stdout(output):
            hello.main()
        self.assertEqual(output.getvalue(),
                      "Hello world from astharoshe.net! Running on: Some Unix\n")
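
Whether a test file is found by "python3 -m unittest discover" depends on its name: the default pattern is "test*.py", matched in shell-glob style. The following sketch imitates that check with fnmatch from the standard library. It is an illustration of the matching rule, not the unittest loader itself, and is_discovered is a made-up helper name:

```python
from fnmatch import fnmatch

# Default value of the --pattern option of `unittest discover`.
DEFAULT_PATTERN = "test*.py"


def is_discovered(filename, pattern=DEFAULT_PATTERN):
    """Return True if a file name matches the discovery pattern."""
    return fnmatch(filename, pattern)


if __name__ == "__main__":
    for name in ("tests_main.py", "test_main.py", "main_tests.py"):
        print(f"{name}: {is_discovered(name)}")
```

A file called "main_tests.py" would be silently skipped, so the prefix matters.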

Project configuration

My pyproject.toml looks like this:

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

And my setup.cfg contains these lines:

[metadata]
name = astharoshe_hello
version = 1.0.2
description = test/example/template project
author = astharoshe.net
author_email = bw@astharoshe.net
license = ISC
classifiers =
    Programming Language :: Python :: 3
    License :: OSI Approved :: ISC License (ISCL)
    Operating System :: OS Independent

[options]
packages = find:
python_requires = >=3.10
install_requires =

[options.entry_points]
console_scripts =
    astharoshe_hello = astharoshe_hello.main:main
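
The console_scripts line maps a command name to a "module:function" reference. Roughly speaking, the launcher script created during installation imports that module and calls that function. Here is a simplified sketch of the lookup using only importlib. It is an illustration, not the actual code of setuptools or pip; resolve_entry_point is a made-up helper, and "json:dumps" merely stands in for an importable target:

```python
import importlib


def resolve_entry_point(reference):
    """Turn a 'package.module:attribute' string into the referenced object."""
    module_name, _, attribute_path = reference.partition(":")
    obj = importlib.import_module(module_name)
    if attribute_path:
        # The part after the colon may be dotted, e.g. "Class.method".
        for attribute in attribute_path.split("."):
            obj = getattr(obj, attribute)
    return obj


if __name__ == "__main__":
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"greeting": "hello"}))
```

With my package installed, resolve_entry_point("astharoshe_hello.main:main") would return the main function from above.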

I am still lacking some fields here, as indicated by "The author of this package has not provided a project description" on the package page; this is the long description, which is usually taken from the README. I will fix this in a future revision. I might also try to replace my setup.cfg with just the pyproject.toml, since having two declarative configuration files seems redundant, but for the time being, it can stay like this.

Helper Makefile (optional)

While developing, I found that a simple Makefile would be helpful to avoid having to remember multiple long command lines. Especially "make test" was something that I executed quite often.

run-main:
	PYTHONPATH=`pwd` python3 -m astharoshe_hello.main

test:
	PYTHONPATH=`pwd` python3 -m unittest discover tests/

build:
	python3 -m build

clean:
	git clean -fxd

upload-test:
	python3 -m twine upload --repository testpypi dist/*

upload-real:
	python3 -m twine upload dist/*

It also shows how I built and uploaded my package. The upload-real target to upload to the actual PyPI instead of TestPyPI was never used, of course.

Conclusion

Creating and publishing a Python package that can be installed via pip on any other system was not a hard task and I am very satisfied with the result, notwithstanding the missing description and the setup.cfg/pyproject.toml situation.

I am very interested in seeing how the packaging interacts with actual dependencies - my tutorial package here does not depend on anything and therefore does not demonstrate how these work (but I added an empty "install_requires" property to the setup.cfg as a placeholder/reminder).
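
As a reminder for my future self, here is a sketch of what a filled-in "install_requires" could look like: one requirement per indented line. The two dependencies are purely hypothetical; my package needs neither. The Python around it reads the list back with configparser from the standard library, just to show how the multi-line value is structured (read_requirements is a made-up helper):

```python
import configparser

# Hypothetical excerpt of a setup.cfg with two made-up dependencies.
SETUP_CFG = """\
[options]
packages = find:
python_requires = >=3.10
install_requires =
    requests>=2.31
    rich
"""


def read_requirements(text):
    """Return the install_requires entries of a setup.cfg as a list."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser.get("options", "install_requires")
    # The value spans several indented lines; drop the empty first one.
    return [line.strip() for line in raw.splitlines() if line.strip()]


if __name__ == "__main__":
    print(read_requirements(SETUP_CFG))
```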

I found that the "batteries included" aspect of Python fell short during this adventure. You are left alone with how you create your directory structure. I miss something like "cargo new" here. This is even more true when you look at the configuration files that I could not write again without looking at the documentation or my own notes here. Other programming languages let me quickly fill in my data into files created from templates, which is much nicer.

My Makefile is another aspect of this: where is my "cargo run" or "cargo test" or "cargo publish"? Maybe using uv would fill these holes, but it does not come with my Python installation (and is even written in a different programming language, Rust - which is both nice and sad).

uv

Then again, I had to install py3-build on OpenBSD to build my package, and I had to install twine from pip to upload it. Both are missing from the default Python installation, so maybe I should have just installed uv instead? I might try it in the future. But that, again, just demonstrates that Python makes me decide too many things myself instead of offering me a sane default and leaving any deviations to the experts!

Anyway, here is my project page:

astharoshe-hello on TestPyPI

Bonus: .gitignore

This, by the way, is the .gitignore I used here:

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
#   For a library or package, you might want to ignore these files since the code is
#   intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# UV
#   Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
#   This is especially recommended for binary packages to ensure reproducibility, and is more
#   commonly ignored for libraries.
#uv.lock

# poetry
#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
#   This is especially recommended for binary packages to ensure reproducibility, and is more
#   commonly ignored for libraries.
#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
#   in version control.
#   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
#  and can be added to the global gitignore or merged into this file.  For a more nuclear
#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# PyPI configuration file
.pypirc

It is nothing special; I copied it from here:

https://github.com/github/gitignore/blob/main/Python.gitignore