
Project structure

Using a well-structured and commonly used project organization will help make your projects accessible to collaborators, who will understand and appreciate the familiar setup. Even if the scope of the project is small, it is highly recommended to follow a standard structure to align with common practice and, who knows, be ready to increase the scope.

Looking at our pkoffee project, everything is stored flat in a single directory, without any organization:

ls -l .
total 4700
-rw-r--r-- 1 pollet cta 4044243 janv. 12 16:42 coffee_productivity.csv
-rw-r--r-- 1 pollet cta  149989 janv. 13 02:40 fit_plot.png
-rw-r--r-- 1 pollet cta    2293 janv. 13 06:55 main.py
-rw-r--r-- 1 pollet cta  601073 janv. 13 04:44 pixi.lock
-rw-r--r-- 1 pollet cta     868 janv. 13 04:44 pixi.toml
-rw-r--r-- 1 pollet cta     382 janv. 12 16:42 README.md

A recommended structured organization would look more like this:

pkoffee/
├── data/                  # data files used in the project
│   ├── README.md          # describe the origin of your data
│   ├── raw/               # store your raw data and do not modify it
│   └── processed/         # store cleaned/processed/modified data separately
├── doc/                   # documentation for your software
│   ├── main_page.md       # entry point into the documentation website
│   └── ...
├── figures/               # results of the analysis (figures)
│   ├── comparison_plot.png
│   └── regression_chart.pdf
├── manuscript/            # manuscript describing the results
├── results/               # results of the analysis (data, tables)
│   ├── preliminary/
│   └── final/
├── src/pkoffee/           # contains source code for the project
│   ├── LICENSE            # license that just applies to the code
│   ├── main_script.py     # main script/code entry point
│   └── ...
├── CITATION.cff           # citation information for the project
├── LICENSE                # license (reuse terms) for the project as a whole
├── pixi.lock              # pixi lock file enabling reproducibility
├── pixi.toml              # pixi manifest describing the project's environments
├── pyproject.toml         # python package configuration file
├── README.md              # special documentation file kept in the root directory as entry point
└── ...

Note

A license file is not "mandatory". However, without a license, default copyright law applies: the author retains all rights to the content and no one may reproduce, distribute, or create derivative works from it.

The organization of some subdirectories is not entirely up to us: for instance, the content of the doc directory may be dictated by our documentation generation tool. Similarly, the source code structure shown above complies with the src layout, a recommended layout for Python package sources.
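To make the recommended layout concrete, here is a minimal Python sketch that creates an empty skeleton of it. The function name `create_skeleton` and the choice of which placeholder files to create are assumptions made for illustration, not part of the pkoffee project.

```python
from pathlib import Path
import tempfile

# Directories taken from the recommended layout (a subset, for illustration).
DIRECTORIES = [
    "data/raw",
    "data/processed",
    "doc",
    "figures",
    "manuscript",
    "results/preliminary",
    "results/final",
    "src/pkoffee",
]

# Empty placeholder files, to be filled in later (assumed selection).
FILES = [
    "data/README.md",
    "src/pkoffee/__init__.py",
    "CITATION.cff",
    "LICENSE",
    "README.md",
]


def create_skeleton(root: Path) -> None:
    """Create the directory tree and empty placeholder files under root."""
    for directory in DIRECTORIES:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for file in FILES:
        # touch() keeps the content of files that already exist
        (root / file).touch(exist_ok=True)


if __name__ == "__main__":
    # Demonstrate in a temporary directory so that nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "pkoffee"
        create_skeleton(project)
        for path in sorted(project.rglob("*")):
            print(path.relative_to(project))
```

Calling the function twice on the same directory is harmless, so it can also be used to complete a partially organized project.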

How are conda packages created

Creating a conda package follows a seemingly simple idea: a virtual environment is created, and the code to package is installed in that environment from its sources. Once the installation is complete, every file it added to the environment is put into an archive (conda packages used to have the .tar.bz2 extension!) together with some metadata. Installing the package essentially amounts to extracting the archive into a new virtual environment!

In truth, the process is more complex than that: the installed files must be patched to make them relocatable (i.e. installable in different locations), several environments can be used to cross-compile for a different platform, etc. The important point is that, in order to make a conda package from any software, we simply have to make it installable in a virtual environment. The easiest way to make our project installable is to use Python packaging tools to first package our project, then install it!
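The "install, then archive what was added" idea can be sketched in a few lines of Python. This is a toy illustration only: real conda build tools also handle relocation, metadata and cross-compilation, and the helper names (`snapshot`, `fake_install`, `package`) and file names below are hypothetical.

```python
import tarfile
import tempfile
from pathlib import Path


def snapshot(env: Path) -> set[Path]:
    """Return the files currently present in the environment, as relative paths."""
    return {p.relative_to(env) for p in env.rglob("*") if p.is_file()}


def fake_install(env: Path) -> None:
    """Stand-in for a real installation step (hypothetical file names)."""
    target = env / "lib" / "pkoffee"
    target.mkdir(parents=True, exist_ok=True)
    (target / "__init__.py").write_text("")
    (target / "main.py").write_text("print('coffee')\n")


def package(env: Path, archive: Path) -> list[str]:
    """Install into env, then archive only the files the installation added."""
    before = snapshot(env)
    fake_install(env)
    added = sorted(snapshot(env) - before)
    with tarfile.open(archive, "w:bz2") as tar:
        for relative in added:
            tar.add(env / relative, arcname=relative.as_posix())
    return [relative.as_posix() for relative in added]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env = Path(tmp) / "env"
        (env / "bin").mkdir(parents=True)
        # A file that exists before the install: it must not end up in the package.
        (env / "bin" / "python").write_text("")
        print(package(env, Path(tmp) / "pkoffee.tar.bz2"))
```

Comparing the environment before and after the install is what keeps pre-existing files, such as the interpreter itself, out of the archive.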

Making a python package

Python packaging is a pretty complicated topic, which we don't have the time to cover in detail in this lecture. Long story short, Python packages come in two flavors: the source distribution, or sdist (an archive containing the source files and the recipe to install them, after building them if the package is not pure Python), and the wheel, or binary distribution. Wheels can ship Python libraries together with compiled artifacts, allowing a Python package to depend on non-Python code, but such wheels are specific to a platform. Wheels are faster to install and are therefore preferred by package managers over sdists; however, an sdist offers the possibility of building a wheel on any architecture, so it is best to provide both.
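Whether a wheel is tied to a platform is visible in its file name, which ends with a Python tag, an ABI tag and a platform tag. The sketch below splits a wheel file name into these parts; it is a simplified parser written for illustration (the `WheelTags` class and `parse_wheel_filename` function are hypothetical names), and the second file name in the example is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WheelTags:
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str

    @property
    def is_platform_specific(self) -> bool:
        # Pure-python wheels use the platform tag "any".
        return self.platform_tag != "any"


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split 'name-version[-build]-python-abi-platform.whl' into its parts."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 6:
        del parts[2]  # drop the optional build tag
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelTags(*parts)


if __name__ == "__main__":
    pure = parse_wheel_filename("pkoffee-0.1.0-py3-none-any.whl")
    print(pure, pure.is_platform_specific)
    # Hypothetical compiled wheel, for comparison.
    compiled = parse_wheel_filename("somepkg-1.0-cp312-cp312-manylinux_2_17_x86_64.whl")
    print(compiled, compiled.is_platform_specific)
```

A pure-Python wheel carries the tags `py3-none-any` and installs anywhere, whereas a compiled wheel names the interpreter, ABI and platform it was built for.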

Python packages are configured by a pyproject.toml file. At the bare minimum, pyproject.toml needs a [build-system] table specifying the Python packaging backend that will be called to create the package. There are many Python build backends available; we will use the uv_build backend. We only need to add some metadata describing our package to complete the pyproject.toml:

[build-system]
build-backend = "uv_build"
requires = ["uv_build>=0.9,<0.10.0"]

[project]
name = "pkoffee"
version = "0.1.0"
description = "Coffee productivity analysis with statistical modeling"
readme = "README.md"
requires-python = ">=3.12"
authors = [
    { name = "Thomas Vuillaume", email = "thomas.vuillaume@lapp.in2p3.fr" },
]
classifiers = [
    "Development Status :: 4 - Beta",
    "Operating System :: OS Independent",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.12",
    "Programming Language :: Python :: 3.13",
    "Intended Audience :: Science/Research",
    "License :: OSI Approved :: MIT License",
]

[project.urls]
"Homepage" = "https://github.com/s3-school/pkoffee-solution"
"Bug Tracker" = "https://github.com/s3-school/pkoffee-solution/issues"
"Documentation" = "https://github.com/s3-school/pkoffee-solution/blob/main/README.md"

Let's organize our package following the source layout:

mkdir -p src/pkoffee
touch src/pkoffee/__init__.py
mv main.py src/pkoffee/
We also install uv temporarily, so we can demonstrate the creation of the Python package:
pixi global install uv
We can now build our python package:
uv build
Building source distribution (uv build backend)...
Building wheel from source distribution (uv build backend)...
Successfully built dist/pkoffee-0.1.0.tar.gz
Successfully built dist/pkoffee-0.1.0-py3-none-any.whl

Warning

The built distribution is not a "valid" PyPI package! Indeed, we didn't add any dependencies to the package specification in the pyproject.toml, because we want to use pixi to manage the dependencies. The Python package is only used as a convenient way to install pkoffee; the package we care about is the conda package.

We can remove uv, as we won't need it explicitly while working with pixi:

pixi global uninstall uv

Making a conda package

Now that we can conveniently install pkoffee using our Python package, we can leverage pixi build back-ends to build a conda package for pkoffee. We need to make a few changes to the pixi manifest:

1. Add the "pixi-build" preview feature to the workspace, to allow the usage of build back-ends:

preview = ["pixi-build"]

2. Optionally, add a build-variants table to the workspace, to inform pixi that we want to build our package against several versions of Python. This is not useful for a pure Python package like pkoffee, but it would be required for a package whose build depends on the Python version:

[workspace.build-variants]
python = ["3.12.*", "3.13.*", "3.14.*"]

3. Add a package section to the manifest, describing the pkoffee conda package. Move the dependencies from the default feature into the run-dependencies of the package, and specify the pixi-build-python back-end:

[package]
name = "pkoffee"
version = "0.1.0"
authors = [
    "Thomas Vuillaume <thomas.vuillaume@lapp.in2p3.fr>",
    "Vincent Pollet <vincent.pollet@lapp.in2p3.fr>",
]
description = "S3School pkoffee example package"
license = "MIT"
homepage = "https://github.com/s3-school/pkoffee"
repository = "https://github.com/s3-school/pkoffee"
documentation = "https://github.com/s3-school/pkoffee/blob/main/README.md"

[package.build]
backend = { name = "pixi-build-python", version = ">=0.4.1,<5.0.0" }

[package.build.config]

[package.build-dependencies]

[package.host-dependencies]
# pixi build back-end calls uv or pip with "no-build-isolation" so build tools need to be available
# setuptools and other build tools should be host dependencies for now, see https://pixi.sh/latest/build/dependency_types/#python-code
uv = ">=0.9.9,<0.10.0"
uv-build = ">=0.9.9,<0.10.0"

[package.run-dependencies]
numpy = ">=2.4.1,<3"
matplotlib = ">=3.10.8,<4"
pandas = ">=2.3.3,<3"
scipy = ">=1.17.0,<2"
seaborn = ">=0.13.2,<0.14"
4. Add pkoffee as a local conda dependency in the default feature:
[dependencies]
pkoffee = { path = "." }

We can now build the pkoffee conda package:

pixi build
 Successfully built 'pkoffee-0.1.0-pyh4616a5c_0.conda'

Congratulations! We have successfully packaged pkoffee for the conda ecosystem! pkoffee-0.1.0-pyh4616a5c_0.conda is the pkoffee package, which we can now distribute to our users.
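The name of the file produced by the build encodes the package name, its version and a build string. As a small illustration, the sketch below splits such a file name into these three parts. It assumes the usual conda naming convention, in which the version and the build string never contain a hyphen; `CondaFilename` and `parse_conda_filename` are hypothetical names, and the second file name in the example is made up.

```python
from typing import NamedTuple


class CondaFilename(NamedTuple):
    name: str
    version: str
    build: str


def parse_conda_filename(filename: str) -> CondaFilename:
    """Split 'name-version-build.conda' (or '.tar.bz2') into its three parts."""
    for extension in (".conda", ".tar.bz2"):
        if filename.endswith(extension):
            stem = filename[: -len(extension)]
            break
    else:
        raise ValueError(f"not a conda package file name: {filename!r}")
    # Split from the right: the package name itself may contain hyphens.
    parts = stem.rsplit("-", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected conda package file name: {filename!r}")
    return CondaFilename(*parts)


if __name__ == "__main__":
    print(parse_conda_filename("pkoffee-0.1.0-pyh4616a5c_0.conda"))
    # Hypothetical package whose name contains a hyphen.
    print(parse_conda_filename("some-package-1.2.3-py312_0.tar.bz2"))
```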

Packaging distribution

There are a few options to distribute a conda package. Which one to choose depends on the scope and visibility of our project:

  1. Do not distribute. For simple analysis code, distributing the package may not be necessary. Will anyone want to use it as a dependency for their own analysis? If the answer is no, there is no need to do more work.
  2. Distribute on conda-forge. If our project is "prototype tier" and people are interested in using it, directly or as a dependency, the best place to make it available is conda-forge. In fact, conda-forge also provides the infrastructure to build the packages, so we don't even have to build it locally: we just provide the recipe, and the conda-forge CI builds the package and makes it available for us.
  3. Distribute on a prefix.dev channel. If you want more control over the conda channel you are using to distribute your package, you can use prefix.dev channels to host your packages. prefix is the company developing pixi and many other tools for the conda ecosystem. They offer to host private or public conda channels on their infrastructure for free.
  4. Host your own channel. If you want to have full control over your distribution, you can use quetz to host a conda channel on your own servers.

Summary

  • Structure all your projects following a standard, consistent organization
  • Even for small analysis code (analysis tier), organize your code base as a package
    • We showed an example for a python project following the source layout
  • pixi build back-ends make it easy to build a conda package from a project written in most languages.
    • the rattler-build back-end works for any language, but you have to write the script that installs your project.
  • Think about whether you should distribute your package, and choose an adequate solution for your needs.