How to improve Python packaging, or why fourteen tools are at least twelve too many
There is an area of Python that many developers have problems with. This is an area that has seen many different solutions pop up over the years, with many different opinions, wars, and attempts to solve it. Many have complained about the packaging ecosystem and tools making their lives harder. Many newcomers are confused about virtual environments. But does it have to be this way? Are the current solutions to packaging problems any good? And is the organization behind most of the packaging tools and standards part of the problem itself?
Join me on a journey through packaging in Python and elsewhere. We’ll start by describing the classic packaging stack (involving setuptools and friends), the scientific stack (with conda), and some of the modern/alternate tools, such as Pipenv, Poetry, Hatch, or PDM. We’ll also look at some examples of packaging and dependency-related workflows seen elsewhere (Node.js and .NET). We’ll also take a glimpse at a possible future (with a venv-less workflow with PDM), and see if the PyPA agrees with the vision and insights of eight thousand users.
There are many packaging-related tools in Python. All of them with different authors, lineages, and often different opinions, although most of them are now unified under the Python Packaging Authority (PyPA) umbrella. Let’s take a look at them.
The classic stack
The classic Python packaging stack consists of many semi-related tools. Setuptools, probably the oldest tool of the group, and itself based on `distutils`, which is part of the standard library (although it will be removed in Python 3.12), is responsible for installing a single package. It previously used `setup.py` files to do its job, which required arbitrary code execution. It then added support for non-executable metadata specification formats: `setup.cfg`, and also `pyproject.toml` (support for the latter is partially still in beta). However, you aren’t supposed to use `setup.py` files directly these days; you’re supposed to be using pip. Pip installs packages, usually from PyPI, but it can also support other sources (such as git repositories or the local filesystem). But where does pip install things? The default used to be a global, system-wide installation, which meant you could introduce conflicts between packages installed by pip and by apt (or whatever the system package manager is). Even with a user-wide installation (which pip is likely to attempt these days), you can still end up with conflicts, and you can also have conflicts in which package A requires X version 1.0.0, but package B expects X version 2.0.0, even though A and B are not related at all and could live separately, each with its preferred version of X. Enter `venv`, a standard-library descendant of `virtualenv`, which can create a lightweight virtual environment for packages to live in. This virtual environment gives you separation from system packages and from other environments, but it is still tied to the system Python in some ways (and if that Python disappears, the virtual environment stops working).
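To ground the discussion, here’s what the classic workflow looks like in practice (a minimal sketch; `requests` is just an example dependency):

```
$ python3 -m venv .venv           # create an environment next to the project
$ . .venv/bin/activate            # activate it in the current shell
(.venv) $ pip install requests    # installs into .venv, not system-wide
(.venv) $ deactivate              # back to the system environment
```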
A few extra tools may appear in a typical packaging workflow. The `wheel` package extends Setuptools with the ability to generate wheels, which are ready to install (without running `setup.py`). Wheels can either be pure-Python and installable anywhere, or they can contain pre-compiled extension modules (things written in C) for a given OS and Python (and there’s even a standard that allows building and distributing one wheel for all typical Linux distros). The `wheel` package should be an implementation detail, something existing inside Setuptools and/or pip, but users need to be aware of it if they want to build wheels on their system, because virtual environments produced by `venv` don’t have `wheel` installed. Regular users who don’t maintain their own packages may sometimes be told that pip is falling back to something legacy because `wheel` is not installed, which is not a good user experience. Package authors also need `twine`, whose sole task is uploading source distributions or wheels, created with other tools, to PyPI (and there’s not much more to say about that tool).
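Putting those pieces together, a typical release of a classic-stack package looks something like this (a sketch using the standard `build` frontend):

```
$ pip install build twine
$ python -m build         # produces dist/*.tar.gz (sdist) and dist/*.whl (wheel)
$ twine upload dist/*     # uploads both to PyPI
```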
…and some extensions
Over the years, several tools have been built on top of the classic stack. For example, `pip-tools` can simplify dependency management. While `pip freeze` lets you produce a file with everything installed in your environment, there is no way to specify only the dependencies you need and get a lock file with specific versions and transitive dependencies (without installing and freezing everything), there is no easy way to skip development dependencies (e.g. IPython) when you `pip freeze`, and there is no workflow to update all your dependencies with just pip. `pip-tools` adds two tools: `pip-compile`, which takes `requirements.in` files with the packages you care about and produces a `requirements.txt` with pinned versions of them and all transitive dependencies; and `pip-sync`, which installs `requirements.txt` and removes things not listed in it.
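A minimal pip-tools round trip (the `django` requirement is just an example):

```
$ echo "django" > requirements.in
$ pip-compile requirements.in    # writes requirements.txt with pinned versions
$ pip-sync requirements.txt      # installs exactly what's listed, removes everything else
```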
Another tool that can come in handy is `virtualenvwrapper`, which helps you manage (create and activate) virtual environments in a central location. It has a few bells and whistles (such as custom hooks to perform actions on every virtualenv creation), although for basic usage, you could replace it with a single-line shell function, such as the sketch below.
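Something along these lines, assuming environments live in `~/virtualenvs` (the location is my choice for the example):

```
# activate a venv by name, e.g. `workon myproject`
workon() { . ~/virtualenvs/"$1"/bin/activate; }
```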
Yet another tool that works alongside the classic toolset is `pipx`, which creates and manages virtual environments for apps written in Python. You tell it to `pipx install Nikola`, and it will create a virtual environment somewhere, install Nikola into it, and put a script for launching it in `~/.local/bin`. While you could do all of this yourself with venv and some symlinks, pipx can take care of it, and you don’t need to remember where the virtual environment is.
The scientific stack and conda
The scientific Python community has had its own tools for many years. The conda tool can manage environments and packages. It doesn’t use PyPI and wheels, but rather packages from conda channels (which are prebuilt, and expect an Anaconda-distributed Python). Back in the day, when there were no wheels, this was the easiest way to get things installed on Windows; this is not as much of a problem now with binary wheels on PyPI, but the Anaconda stack is still popular in the scientific world. Conda packages can be built with `conda-build`, which is separate from, but closely related to, `conda` itself. Conda packages aren’t compatible with `pip` in any way; they don’t follow the packaging standards used by other tools. Is this good? No, because it makes integrating the two worlds harder, but also yes, because many concerns that apply to scientific packages (and their C/C++ extension modules, and their high-performance numeric libraries, and other things) don’t apply to other uses of Python, so having a separate tool lets people focused on the other uses simplify their workflows.
The new tools
A few years ago, new packaging tools started to appear. Now, there have been various “new fancy tools” introduced in the past, with setuptools extending distutils, then distribute forking setuptools, then distribute being merged back…
The earliest “new tool” was Pipenv. Pipenv had really terrible and misleading marketing, and it merged pip and venv, in that Pipenv would create a venv and install packages into it (from `Pipfile` or `Pipfile.lock`). Pipenv can place the venv in the project folder, or hide it away in a central location (the latter is the default). However, Pipenv doesn’t handle any of the tasks related to packaging your code, so it’s useful only for developing non-installable applications (Django sites, for example). If you’re a library developer, you need setuptools anyway.
The second new tool was Poetry. It manages environments and dependencies in a similar way to Pipenv, but it can also build `.whl` files with your code, and it can upload wheels and source distributions to PyPI. This means it has pretty much all the features the other tools have, except you need just one tool. However, Poetry is opinionated, and its opinions are sometimes incompatible with the rest of the packaging scene. Poetry uses the `pyproject.toml` standard, but it doesn’t follow the standard specifying how metadata should be represented in a `pyproject.toml` file (PEP 621), instead using a custom `[tool.poetry]` table. This is partly because Poetry came out before PEP 621, but the PEP was accepted over two years ago. The biggest compatibility problem is Poetry’s Node-inspired `~` and `^` dependency version markers, which aren’t compatible with PEP 508 (the dependency specification standard). Poetry can package C extension modules, although it uses setuptools’ infrastructure for this (and requires a custom `build.py` script).
Another similar tool is Hatch. This tool can also manage environments (it allows multiple environments per project, but it doesn’t allow putting them in the project directory), and it can manage packages (but without lockfile support). Hatch can also be used to package a project (with PEP 621-compliant `pyproject.toml` files) and upload it to PyPI. It doesn’t support C extension modules.
A tool that tries to be a simpler re-imagining of Setuptools is Flit. It can build and install a package using a `pyproject.toml` file. It also supports uploads to PyPI. It lacks support for C extension modules, and it expects you to manage environments on your own.
There’s one more interesting (albeit not popular or well-known) tool. This tool is PDM. It can manage venvs (but it defaults to the saner `.venv` location), manage dependencies, and it uses a standards-compliant `pyproject.toml`. There’s also a curious little feature called PEP 582 support, which we’ll talk about later.
The previous sections mentioned 14 (fourteen!) distinct tools. As we’ll discover soon, that’s at least 12 too many. Let’s try to compare them.
First, let’s define 9 things that we could expect packaging tools to do:
- Manage environments
- Install packages
- Package/develop apps
- Package libraries
- Package C extension modules
- Install in editable mode
- Lock dependencies
- Support `pyproject.toml` files
- Upload to PyPI
| Tool | Maintainer | Use case | # of supported features | # of partially supported features | # of unsupported features |
|---|---|---|---|---|---|
| setuptools | PyPA | Making things installable | 4 | 2 (pyproject.toml support partially in beta; installing only setuptools-based sdists) | 3 |
| pip | PyPA | Installing packages | 2 | 1 (locking dependencies only manually) | 6 |
| venv | PyPA | Creating virtual environments | 1 (creating environments) | 0 | 8 |
| wheel | PyPA | Building wheels in setuptools | 0 | 1 (building wheels in setuptools) | 8 |
| Twine | PyPA | Uploading to PyPI | 1 (uploading to PyPI) | 0 | 8 |
| pip-tools | Jazzband | Managing requirements files | 2 (locking dependencies, installing packages) | 0 | 7 |
| virtualenvwrapper | Doug Hellmann | Managing virtual environments | 1 (managing environments) | 0 | 8 |
| pipx | PyPA | Installing Python command-line tools | 2 (installing packages, editable installs) | 1 (managing environments) | 6 |
| conda | Anaconda, Inc. | Managing environments and dependencies | 3 (managing environments, installing things) | 4 (manual locking, packaging requires conda-build) | 2 (pyproject.toml and PyPI) |
| Pipenv | PyPA | Managing dependencies for apps | 3 (managing environments, installing and locking) | 1 (developing apps) | 5 |
| Poetry | Sébastien Eustace et al. | Packaging and managing dependencies | 7 | 2 (pyproject.toml, C extensions) | 0 |
| Flit | PyPA | Packaging pure-Python projects | 5 | 1 (installing only flit packages) | 3 |
| Hatch | PyPA | Packaging and managing dependencies | 7 | 0 | 2 (C extensions, locking dependencies) |
| PDM | Frost Ming | Packaging and managing dependencies | 8 | 0 | 1 (C extensions) |
The table below expands on support for each feature:
| Tool | F1 (Envs) | F2 (Install) | F3 (Apps) | F4 (Libraries) | F5 (Extensions) | F6 (Editable) | F7 (Lock) | F8 (pyproject.toml) | F9 (Upload) |
|---|---|---|---|---|---|---|---|---|---|
| setuptools | No | Only if authoring the package; direct use not recommended | Yes | Yes | Yes | Yes | No | Beta | No (can build sdists) |
| pip | No | Yes | No | No | No | Yes | Manually | N/A | No |
| venv | Only creating environments | No | No | No | No | No | No | No | No |
| wheel | No | No | No | No | No | No | No | No | No (can build wheels) |
| Twine | No | No | No | No | No | No | No | No | Yes |
| pip-tools | No | Yes | No | No | No | No | Yes | No | No |
| virtualenvwrapper | Yes | No | No | No | No | No | No | No | No |
| pipx | Kind of | Yes | No | No | No | Yes | No | No | No |
| conda | Yes | Yes (from conda channels) | Develop (conda-build is a separate tool) | With conda-build | With conda-build | Yes | Manually | No | No |
| Pipenv | Yes | Yes | Only develop | No | No | No | Yes | No | No |
| Poetry | Yes | Yes | Yes | Yes | Kind of (custom build.py script) | Yes | Yes | Yes, but using custom fields | Yes |
| Flit | No | Only if authoring the package | Yes | Yes | No | Yes | No | Yes | Yes |
| Hatch | Yes | Yes | Yes | Yes | No | Yes | No | Yes | Yes |
| PDM | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
You should pay close attention to the Maintainer column in the table. The vast majority of these tools are maintained by PyPA, the Python Packaging Authority. Even more interestingly, the two tools that have the most “Yes” values (Poetry and PDM) aren’t maintained by the PyPA, but instead by other people completely independent of them and not participating in the working group. So, is the working group successful, if it cannot produce one fully-featured tool? Is the group successful if it has multiple projects with overlapping goals? Should the group focus their efforts on standards like PEP 517, which is a common API for packaging tools, and which also encourages the creation of even more incompatible and competing tools?
Most importantly: which tool should a beginner use? The PyPA has multiple guides and tutorials: one uses pip + venv, another uses pipenv (why would you still do that?), and yet another tutorial lets you pick between Hatchling (Hatch’s build backend), setuptools, Flit, and PDM, without explaining the differences between them, without using any environment tools, and without using Hatch’s/PDM’s build and PyPI upload features (instead opting for `python -m build` and `twine`). The concept of virtual environments can be very confusing for newcomers, and managing virtual environments is difficult if everyone has incompatible opinions about it.
It is also notable that PEP 20, the Zen of Python, states this:
There should be one-- and preferably only one --obvious way to do it.
Python packaging definitely doesn’t follow it. There are 14 ways, and none of them is obvious or the only good one. All in all, this is an unsalvageable mess. Why can’t Python pick one tool? What does the competition do? We’ll look into this in a minute. But first, let’s talk about the elephant in the room: Python virtual environments.
Python relies on virtual environments for separation between projects. Virtual environments (aka virtualenvs or venvs) are folders with symlinks to a system-installed Python, and their own set of site-packages. There are a few problems with them:
How to use Python from a virtual environment?
There are two ways to do this. The first one is to activate it, by running the activate shell script installed in the environment’s bin directory. Another is to run the python executable (or any other script in the bin directory) directly from the venv.
Activating venvs is more convenient for developers, but it also has some problems. Sometimes, activation fails to work, due to the shell caching the location of things in `$PATH`. Also, newcomers are taught to activate and run `python`, which means they might get confused and try to use activate in scripts or cronjobs (but in those environments, you shouldn’t activate venvs, and should instead use the Python executable directly). Virtual environment activation is more state you need to be aware of, and if you forget about it, or if it breaks, you might end up messing up your user-wide (or worse, system-wide) Python packages.
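To make both ways concrete (the paths are examples):

```
$ . /home/user/venv/bin/activate        # way 1: activate, then plain `python` works
(venv) $ python script.py
(venv) $ deactivate
$ /home/user/venv/bin/python script.py  # way 2: call the venv's interpreter directly
```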
How are (system) Pythons and virtual environments related?
The virtual environment depends very tightly on the (system/global/pyenv-installed) Python used to create it. This is good for disk-space reasons (clean virtual environments don’t take up much space), but it also makes the environment more fragile. If the Python used to create the environment is removed, the virtual environment stops working. If you fully manage your own Python, that’s probably not going to happen, but if you depend on a system Python, upgrading packages in your OS might end up replacing Python 3.10 with Python 3.11. Some distributions (e.g. Ubuntu) would only make a jump like this on a new distribution release (so you can plan ahead), some of them (e.g. Arch) are rolling-release and a regular system upgrade may include a new Python, while some (e.g. Homebrew) make it even worse by using paths that include the patch Python version (3.x.y), which causes virtual environments to break much more often.
How to manage virtual environments?
The original virtualenv tool, and its simplified standard-library rewrite venv, allow you to put a virtual environment anywhere in the file system, as long as you have write privileges there. This has led to people and tools inventing their own standards. Virtualenvwrapper stores environments in a central location, and doesn’t care about their contents. Pipenv and Poetry allow you to choose (either a central location or the `.venv` directory in the project), and environments are tied to a project (they’ll use the project-specific environment if you’re in the project directory). Hatch stores environments in a central location, and it allows you to have multiple environments per project (but there is no option to share environments between projects).
Brett Cannon has recently done a survey, and it has shown the community is split on their workflows: some people use a central location, some put them in the project directory, some people have multiple environments with different Python versions, some people reuse virtualenvs between projects… Everyone has different needs, and different opinions. For example, I use a central directory (~/virtualenvs) and reuse environments when working on Nikola (sharing the same environment between development and four Nikola sites). But on the other hand, when deploying web apps, the venv lives in the project folder, because this venv needs to be used by processes running as different users (me, root, or the service account for the web server, which might have interactive login disabled, or whose home directory may be set to something ephemeral).
So: does Python need virtual environments? Perhaps looking at how other languages handle this problem can help us decide this for Python?
We’ll look at two ecosystems. We’ll start with JavaScript/Node.js (with npm), and then we’ll look at the C#/.NET (with dotnet CLI/MSBuild) ecosystem for comparison. We’ll demonstrate a sample flow of creating a project, installing dependencies into it, and running things. If you’re familiar with these ecosystems and want to skip the examples, continue with How is Node better than Python? and Are those ecosystems’ tools perfect?. Otherwise, read on.
JavaScript/Node.js (with npm)
There are two tools for dealing with packages in the Node world, namely npm and Yarn. The npm CLI tool is shipped with Node, so we’ll focus on it.
Let’s create a project:
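Something like this (`npm init -y` accepts all the defaults; exact output varies by npm version):

```
$ mkdir mynpmproject
$ cd mynpmproject
$ npm init -y
Wrote to /tmp/mynpmproject/package.json
```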
We’ve got a package.json file, which has some metadata about our project (name, version, description, license). Let’s install a dependency:
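(The package count and timing below are illustrative.)

```
$ npm install is-even

added 5 packages, and audited 6 packages in 2s

found 0 vulnerabilities
```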
The mere existence of an `is-even` package is questionable; the fact that it includes four dependencies is yet another problem, and the fact that it depends on `is-odd` is even worse. But this post isn’t about `is-even` or the Node ecosystem’s tendency to use tiny packages for everything (but I wrote one about this topic before). Let’s have a look at what we’ve got in the filesystem:
```
$ ls
node_modules/  package.json  package-lock.json
$ ls node_modules
is-buffer/  is-even/  is-number/  is-odd/  kind-of/
```
Let’s also take a peek at the `package.json` file:
{ "name": "mynpmproject", "version": "1.0.0", "description": "", "main": "index.js", "scripts": { "test": "echo "Error: no test specified" && exit 1" }, "author": "", "license": "ISC", "dependencies": { "is-even": "^1.0.0" } }
Our `package.json` file now lists the dependency, and we’ve also got a lock file (`package-lock.json`), which records all the dependency versions used for this install. If this file is kept in the repository, any future attempts to `npm install` will use the dependency versions listed in this file, ensuring everything will work the same as it did originally (unless one of those packages were to get removed from the registry).
Let’s try writing a trivial program using the module and try running it:
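(A reconstruction of the trivial program; any even number makes the point.)

```
$ cat index.js
const isEven = require('is-even');
console.log(isEven(42));
$ node index.js
true
```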
Let’s try removing `is-odd` to demonstrate how badly designed this package is:
```
$ rm -rf node_modules/is-odd
$ node index.js
node:internal/modules/cjs/loader:998
  throw err;
  ^

Error: Cannot find module 'is-odd'
Require stack:
- /tmp/mynpmproject/node_modules/is-even/index.js
- /tmp/mynpmproject/index.js
    at Module._resolveFilename (node:internal/modules/cjs/loader:995:15)
    at Module._load (node:internal/modules/cjs/loader:841:27)
    at Module.require (node:internal/modules/cjs/loader:1061:19)
    at require (node:internal/modules/cjs/helpers:103:18)
    at Object.<anonymous> (/tmp/mynpmproject/node_modules/is-even/index.js:10:13)
    at Module._compile (node:internal/modules/cjs/loader:1159:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1213:10)
    at Module.load (node:internal/modules/cjs/loader:1037:32)
    at Module._load (node:internal/modules/cjs/loader:878:12)
    at Module.require (node:internal/modules/cjs/loader:1061:19) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [
    '/tmp/mynpmproject/node_modules/is-even/index.js',
    '/tmp/mynpmproject/index.js'
  ]
}

Node.js v18.12.1
```
How is Node better than Python?
Badly designed packages aside, we can see an important difference from Python: there is no virtual environment, and all the packages live in the project directory. If we fix the `node_modules` directory by running `npm install`, we can see that I can run the script from anywhere else on the file system:
```
$ pwd
/tmp/mynpmproject
$ npm install

added 1 package, and audited 6 packages in 436ms

found 0 vulnerabilities
$ node /tmp/mynpmproject/index.js
true
$ cd ~
$ node /tmp/mynpmproject/index.js
true
```
If you try to do that with a Python tool…
- If you’re using a manually managed venv, you need to remember to activate it, or to use the appropriate Python.
- If you’re using something fancier, it might be tied to the current working directory, and it may expect you to change into that directory, or to pass an argument pointing at that directory.
I can also run my code as `root`, and as an unprivileged `nginx` user, without any special preparation (like telling pipenv/poetry to put their venv in the project directory, or running them as the other users):
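(A sketch of the demonstration; the exact `su` incantation may differ on your system.)

```
# node /tmp/mynpmproject/index.js
true
# su -s /bin/bash -c 'node /tmp/mynpmproject/index.js' - nginx
true
```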
If you try to do that with a Python tool…
- If you’re using a manually managed venv, you can use its Python as another user (assuming it has the right permissions).
- If your tool puts the venv in the project directory, this will work too.
- If your tool puts the venv in some weird place in your home folder, the other users will get their own venvs. The `uwsgi` user on Fedora uses `/run/uwsgi` as its home directory, and `/run` is ephemeral (tmpfs), so a reboot forces you to reinstall things.
We can even try to change the name of our project:
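(Illustrative; npm doesn’t care what the directory is called.)

```
$ cd /tmp
$ mv mynpmproject mynpmproject2
$ node mynpmproject2/index.js
true
```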
If you try to do that with a Python tool…
- If you’re using a manually managed venv, and it lives in a central directory, all is well.
- If you or your tool places the venv in the project directory, the venv is now broken, and you need to recreate it (hope you have a recent `requirements.txt`!)
- If your tool puts the venv in some weird place in your home folder, it may decide that this is a different project, which means it will recreate it, and you’ll have an unused virtual environment somewhere on your filesystem.
Other packaging topics
Some packages may expose executable scripts (via the `bin` property). These can be run in three ways (a short illustration follows the list):
- Installed globally using `npm install -g`, which will put the script in a global location that is likely in `$PATH` (e.g. `/usr/local/bin`).
- Installed locally using `npm install`, and executed with the `npx` tool or manually by running the script from `node_modules/.bin`.
- Not installed at all, but executed using the `npx` tool, which will install it into a cache and run it.
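For example (using `nodemon` purely as an illustration):

```
$ npm install -g nodemon    # global install; nodemon ends up on $PATH
$ npx nodemon --version     # local or cached; npx resolves it for you
```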
Also, if we wanted to publish our thing, we could just run `npm publish` (after logging in with `npm login`).
C#/.NET (with dotnet CLI/MSBuild)
In modern .NET, the One True Tool is the dotnet CLI, which uses MSBuild for most of the heavy lifting. (In the classic .NET Framework, the tasks were split between MSBuild and NuGet.exe, but let’s focus on the modern workflow.)
Let’s create a project:
```
$ mkdir mydotnetproject
$ cd mydotnetproject
$ dotnet new console
The template "Console App" was created successfully.

Processing post-creation actions...
Running 'dotnet restore' on /tmp/mydotnetproject/mydotnetproject.csproj...
  Determining projects to restore...
  Restored /tmp/mydotnetproject/mydotnetproject.csproj (in 92 ms).
Restore succeeded.

$ ls
mydotnetproject.csproj  obj/  Program.cs
```
We get three things: a `mydotnetproject.csproj` file, which defines a few properties of our project; `Program.cs`, which is a hello world program; and `obj/`, which contains a few files you don’t need to care about.
Let’s try adding a dependency. For a pointless example, but slightly more reasonable than the JS one, we’ll use `AutoFixture`, which brings in a dependency on `Fare`. If we run `dotnet add package AutoFixture`, we get some console output, and our `mydotnetproject.csproj` now looks like this:
<Project Sdk="Microsoft.NET.Sdk"> <PropertyGroup> <OutputType>Exe</OutputType> <TargetFramework>net6.0</TargetFramework> <ImplicitUsings>enable</ImplicitUsings> <Nullable>enable</Nullable> </PropertyGroup> <ItemGroup> <PackageReference Include="AutoFixture" Version="4.17.0" /> </ItemGroup> </Project>
The first `<PropertyGroup>` specifies what our project is (Exe = something you can run), specifies the target framework (.NET 6.0), and enables a few opt-in features of C#. The `<ItemGroup>` was inserted when we installed AutoFixture.
We can now write a pointless program in C#. Here’s our new `Program.cs`:
```csharp
using AutoFixture;

var fixture = new Fixture();
var a = fixture.Create<int>();
var b = fixture.Create<int>();
var result = a + b == b + a;
Console.WriteLine(result ? "Math is working" : "Math is broken");
```
(We could just use C#’s/.NET’s built-in random number generator, AutoFixture is complete overkill here—it’s meant for auto-generating test data, with support for arbitrary classes and other data structures, and we’re just getting two random ints here. I’m using AutoFixture for this example, because it’s simple to use and demonstrate, and because it gets us a transitive dependency.)
And now, we can run it:
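(Illustrative output.)

```
$ dotnet run
Math is working
```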
If we want something that can be run outside of the project, and possibly without .NET installed on the system, we can use dotnet publish. The most basic scenario:
```
$ dotnet publish
$ ls bin/Debug/net6.0/publish
AutoFixture.dll*  Fare.dll*  mydotnetproject*  mydotnetproject.deps.json
mydotnetproject.dll  mydotnetproject.pdb  mydotnetproject.runtimeconfig.json
$ du -h bin/Debug/net6.0/publish
424K    bin/Debug/net6.0/publish
$ bin/Debug/net6.0/publish/mydotnetproject
Math is working
```
You can see that we’ve got a few files related to our project, as well as `AutoFixture.dll` and `Fare.dll`, which are our dependencies (`Fare.dll` is a dependency of `AutoFixture.dll`). Now, let’s try to remove `AutoFixture.dll` from the published distribution:
```
$ rm bin/Debug/net6.0/publish/AutoFixture.dll
$ bin/Debug/net6.0/publish/mydotnetproject
Unhandled exception. System.IO.FileNotFoundException: Could not load file or assembly 'AutoFixture, Version=4.17.0.0, Culture=neutral, PublicKeyToken=b24654c590009d4f'. The system cannot find the file specified.

File name: 'AutoFixture, Version=4.17.0.0, Culture=neutral, PublicKeyToken=b24654c590009d4f'
[1]    45060 IOT instruction (core dumped)  bin/Debug/net6.0/publish/mydotnetproject
```
We can also try a more advanced scenario:
```
$ rm -rf bin obj  # clean up, just in case
$ dotnet publish --sc -r linux-x64 -p:PublishSingleFile=true -o myoutput
Microsoft (R) Build Engine version 17.0.1+b177f8fa7 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

  Determining projects to restore...
  Restored /tmp/mydotnetproject/mydotnetproject.csproj (in 4.09 sec).
  mydotnetproject -> /tmp/mydotnetproject/bin/Debug/net6.0/linux-x64/mydotnetproject.dll
  mydotnetproject -> /tmp/mydotnetproject/myoutput/
$ ls myoutput
mydotnetproject*  mydotnetproject.pdb
$ myoutput/mydotnetproject
Math is working
$ du -h myoutput/*
62M     myoutput/mydotnetproject
12K     myoutput/mydotnetproject.pdb
$ file -k myoutput/mydotnetproject
myoutput/mydotnetproject: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=47637c667797007d777f4322729d89e7fa53a870, for GNU/Linux 2.6.32, stripped, too many notes (256)\012- data
$ file -k myoutput/mydotnetproject.pdb
myoutput/mydotnetproject.pdb: Microsoft Roslyn C# debugging symbols version 1.0\012- data
```
We have a single output file that contains our program, its dependencies, and parts of the .NET runtime. We also get debugging symbols if we want to run our binary with a .NET debugger and see the associated source code. (There are ways to make the binary file smaller, and we can move most arguments of `dotnet publish` to the .csproj file, but this post is about Python, not .NET, so I’m not going to focus on them too much.)
How is .NET better than Python?
I’m not going to bore you with the same demonstrations I’ve already shown when discussing How is Node better than Python?, but:
- You can run built .NET projects as any user, from anywhere in the filesystem.
- All you need to run your code is the output directory (publishing is optional, but useful to get a cleaner output, to simplify deployment, and to possibly enable compilation to native code).
- If you publish in single-executable mode, you can just distribute the single executable, and your users don’t even need to have .NET installed.
- You do not need to manage environments, you do not need special tools to run your code, and you do not need to think about the current working directory when running code.
Other packaging topics
Locking dependencies is disabled by default, but if you add `<RestorePackagesWithLockFile>true</RestorePackagesWithLockFile>` to the `<PropertyGroup>` in your `.csproj` file, you can enable it (and get a `packages.lock.json` file in output).
When it comes to command line tools, .NET supports those as well. They can be installed globally or locally, and can be accessed via `$PATH` or via the `dotnet` command.
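For example (the `dotnet-ef` tool is just an illustration):

```
$ dotnet tool install --global dotnet-ef   # lands in ~/.dotnet/tools, which is on $PATH
$ dotnet ef --version                      # invoked through the dotnet driver
```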
As for publishing your package to NuGet.org or to another repository, you might want to look at the full docs for more details, but the short version is:
- Add some metadata to the `.csproj` file (e.g. `PackageId` and `Version`)
- Run `dotnet pack` to get a `.nupkg` file
- Run `dotnet nuget push` to upload the `.nupkg` file (passing the file name and an API key)
Once again, everything is done with a single `dotnet` tool. The .NET IDEs (namely, Visual Studio and Rider) do offer friendly GUI versions of many features. Some of those GUIs might be doing things slightly differently behind the scenes, but this is transparent to the user (and the backend is still MSBuild or a close derivative of it). I can take a CLI-created project, add a dependency from Rider, and publish an executable from VS, and everything will work the same. And perhaps XML files aren’t as cool as TOML, but they’re still easy to work with in this case.
Other languages and ecosystems
While we have explored two tools for two languages in depth, there are also other languages that deserve at least a mention. In the Java world, the two most commonly used tools are Maven and Gradle. Both tools can be used to manage dependencies and build artifacts that can be executed or distributed further (things like JAR files). Other tools capable of building Java projects exist, but most people just pick one of the two. The community of Scala, which is another JVM-based language, prefers sbt (which can be used for plain Java as well), but there are also Maven or Gradle users in that community. Finally, two new-ish languages that have been quite popular in recent times, Go and Rust, have first-party tooling integrated with the rest of the toolchain. The `go` command-line tool can accomplish many build/dependency/packaging tasks. Rust’s `cargo`, which ships with the standard distribution of Rust, handles dependencies, builds, running code and tests, as well as publishing your stuff to a registry.
Are those ecosystems’ tools perfect?
Not always; they have their deficiencies as well. In the Node ecosystem, packages can execute arbitrary code on installation, which can be a security risk (there are some known examples, like an npm package wiping hard drives in Russia and Belarus, or another one stealing imaginary Internet money Bitcoin). Binary packages aren’t distributed on the npm registry directly; they’re either built with `node-gyp`, or have prebuilt packages downloaded via `node-pre-gyp` (which is a third-party tool).
In the .NET ecosystem, the tools also create an `obj` directory with temporary files. Those temporary files are tied to the environment they’re running in, and while the tooling will usually re-create them if something changes, it can sometimes fail and leave you with confusing errors (which can usually be solved by removing the `bin` and `obj` directories). If a package depends on native code (which isn’t already available on the target OS as part of a shared library), it must include binary builds in the NuGet package for all the platforms it supports, as there is no standard way to allow building something from source.
You can also find deficiencies in the tools for the other languages mentioned. Some people think Maven is terrible because it uses XML and Gradle is the way to go, and others think Gradle’s use of a Groovy-based DSL makes things much harder than they need to be and prefer Maven instead.
Recall that when introducing PDM, I mentioned PEP 582. This PEP defines a `__pypackages__` directory. This directory would be considered by Python when looking for imports. It would behave similarly to `node_modules`. Since there would be no symlinks to the system Python, it would solve the problems with moving the virtual environment. Because the packages live in the project, there is no problem with sharing a project directory between multiple system users. It might even be possible for different computers (but with the same Python version and OS) to share the `__pypackages__` directory (in some special circumstances). The proposed `__pypackages__` directory structure has `lib/python3.10/site-packages/` subfolders, which still makes the “reinstall on Python upgrade” step necessary, but it doesn’t apply to patch-level upgrades, and if you’re dealing with a pure-Python dependency tree, `mv __pypackages__/lib/python3.10 __pypackages__/lib/python3.11` might just work. This structure does make sense for binary dependencies, or for dependencies needed only on older Python versions, since it allows you to use multiple Python versions with the same project directory. The PEP doesn’t say anything about sharing `__pypackages__` between projects, but you could probably solve that problem with symlinks (assuming the tooling doesn’t care if the directory is a symlink, and it shouldn’t care IMO).
While PEP 582 is a great vision, and it would simplify many package-related workflows, it hasn’t seen much care from the powers-that-be. The PEP was proposed in May 2018, and while there’s even a usable implementation that’s less than 50 lines of code, there hasn’t been much progress on having it accepted and implemented in Python proper. However, PDM doesn’t care, and it allows you to enable the future on your own machine.
Enabling the future on your own machine
Let’s enable the future on our own machine. That will require one simple command:
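(Assuming PDM itself is already installed, e.g. with pipx; `pdm --pep582` prints the shell setup needed for PEP 582 mode.)

```
$ eval "$(pdm --pep582)"
```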
After that, we can initialize our project and install requests into it. Let’s try:
```
$ mkdir mypdmproject
$ cd mypdmproject
$ pdm init
Creating a pyproject.toml for PDM...
Please enter the Python interpreter to use
0. /usr/bin/python (3.11)
1. /usr/bin/python3.11 (3.11)
2. /usr/bin/python2.7 (2.7)
Please select (0): 1
Using Python interpreter: /usr/bin/python3.11 (3.11)
Would you like to create a virtualenv with /usr/bin/python3.11? [y/n] (y): n
You are using the PEP 582 mode, no virtualenv is created.
For more info, please visit https://peps.python.org/pep-0582/
Is the project a library that will be uploaded to PyPI [y/n] (n): n
License(SPDX name) (MIT):
Author name (Chris Warrick):
Author email (…):
Python requires('*' to allow any) (>=3.11):
Changes are written to pyproject.toml.
$ ls
pyproject.toml
$ pdm add requests
Adding packages to default dependencies: requests
🔒 Lock successful
Changes are written to pdm.lock.
Changes are written to pyproject.toml.
Synchronizing working set with lock file: 5 to add, 0 to update, 0 to remove

  ✔ Install charset-normalizer 2.1.1 successful
  ✔ Install certifi 2022.12.7 successful
  ✔ Install idna 3.4 successful
  ✔ Install requests 2.28.1 successful
  ✔ Install urllib3 1.26.13 successful

🎉 All complete!
```
So far, so good (I’m not a fan of emoji in terminals, but that’s my only real complaint here.) Our pyproject.toml
looks like this:
```toml
[tool.pdm]

[project]
name = ""
version = ""
description = ""
authors = [
    {name = "Chris Warrick", email = "…"},
]
dependencies = [
    "requests>=2.28.1",
]
requires-python = ">=3.11"
license = {text = "MIT"}
```
If we try to look into our file structure, we have this:
```
$ ls
pdm.lock  __pypackages__/  pyproject.toml
$ ls __pypackages__
3.11/
$ ls __pypackages__/3.11
bin/  include/  lib/
$ ls __pypackages__/3.11/lib
certifi/             certifi-2022.12.7.dist-info/
charset_normalizer/  charset_normalizer-2.1.1.dist-info/
idna/                idna-3.4.dist-info/
requests/            requests-2.28.1.dist-info/
urllib3/             urllib3-1.26.13.dist-info/
```
We’ll write a simple Python program (let’s call it `mypdmproject.py`) that makes an HTTP request using `requests`. It will also print `requests.__file__` so we’re sure it isn’t using some random system copy:
```python
import requests

print(requests.__file__)
r = requests.get("https://chriswarrick.com/")
print(r.text[:15])
```
```
$ python mypdmproject.py
/tmp/mypdmproject/__pypackages__/3.11/lib/requests/__init__.py
<!DOCTYPE html>
```
Let’s finally try the tests we’ve done in the other languages. Requests is useless without urllib3, so let’s remove it and see how well it works.
```
$ rm -rf __pypackages__/3.11/lib/urllib3*
$ python mypdmproject.py
Traceback (most recent call last):
  File "/tmp/mypdmproject/mypdmproject.py", line 1, in <module>
    import requests
  File "/tmp/mypdmproject/__pypackages__/3.11/lib/requests/__init__.py", line 43, in <module>
    import urllib3
ModuleNotFoundError: No module named 'urllib3'
```
Finally, can we try with a different directory? How about a different user?
```
$ pdm install
Synchronizing working set with lock file: 1 to add, 0 to update, 0 to remove

  ✔ Install urllib3 1.26.13 successful

🎉 All complete!
$ pwd
/tmp/mypdmproject
$ cd ~
$ python /tmp/mypdmproject/mypdmproject.py
/tmp/mypdmproject/__pypackages__/3.11/lib/requests/__init__.py
<!DOCTYPE html>
# su -s /bin/bash -c 'eval "$(/tmp/pdmvenv/bin/pdm --pep582 bash)"; python /tmp/mypdmproject/mypdmproject.py' - nobody
su: warning: cannot change directory to /nonexistent: No such file or directory
/tmp/mypdmproject/__pypackages__/3.11/lib/requests/__init__.py
<!DOCTYPE html>
```
This is looking pretty good. An independent project manages to do what the big Authority failed to do over so many years.
Is this the perfect thing?
Well, almost. There are two things that I have complaints about. The first one is the `pdm --pep582` hack, but hopefully, the PyPA will get its act together and get it into Python core soon. However, another important problem is the lack of separation from system site-packages. Avid readers of footnotes might have noticed I had to use a Docker container in my PDM experiments, because requests is very commonly found in system site-packages (especially when using system Pythons, which have requests because of some random package, or because it was unbundled from pip). This can break things in ways you don’t expect, because you might end up importing and depending on system-wide things, or mixing system-wide and local packages (if you don’t install an extra requirement, but those packages are present system-wide, then you might end up using an extra you haven’t asked for). This is an important problem; a good solution would be to disable system site-packages if a `__pypackages__` directory is in use.
Some time ago, the PSF ran a survey on packaging. Over 8,000 people responded. The users have spoken:
- Most people think packaging is too complex.
- An overwhelming majority prefers using just a single tool.
- Most people also think the existence of multiple tools is not helpful for the Python packaging ecosystem.
- Pretty much everyone would prefer a clearly defined official workflow.
- Over 50% of responses think tools for other ecosystems are better at managing dependencies and installing packages.
The next step after this survey was for the packaging community to discuss its results and try to come up with a new packaging strategy. The first post from Shamika Mohanan (the Packaging Project Manager at the PSF) that sparked the discussion also focused heavily on the users’ desire to unify packaging tools and to have One True Tool. This discussion was open to people involved with the packaging world; many participants of the discussion are involved with the PyPA, and I don’t think I’ve seen a single comment from the people behind Poetry or PDM.
Most of the thread ended up being a discussion of binary extensions, including discussions of how to support tool proliferation by making it possible for tools that aren’t setuptools to build binary extensions. There was also a lot of focus on the scientific community’s issues with libraries with native code, heavily rooted in C/C++, and on attempts to replace Conda with new PyPA-approved tools. The “unified tool” for everyone else was mentioned in some posts, but those were really the minority.
Some PyPA members talked about a UX analysis, and said they expect the unified tool to re-export functionality from existing tools, which immediately raises the question: which tools should it re-export functionality from, and why? Is `pip install unified-packaging-tool` going to bring in all fourteen? Is the fact that users are unhappy with what they have, and that many of them would be happy with something like npm/dotnet/cargo, not enough to determine the UX direction of the unified tool?
Some of them are also against breaking existing workflows. Is a unified packaging tool going to work for every single user? Definitely not. But are there that many distinct basic workflows? If we ignore things that border on bikeshedding, such as src vs no-src layouts, or venv locations, are there that many workflows to consider? Someone making a library and someone making an application do have different needs (e.g. with regard to publishing the package or acceptable dependency versions). Someone working with C extensions (or extensions using something like Cython) may have different needs, but their needs would usually be a superset of the needs of someone working on a pure-Python project. The scientific community might have more specialised needs, related to complex non-Python parts, but I’m optimistic that many of their points could be solved by the unified tool as well, even if it’s not by the time this tool reaches v1.0. It is also possible that the scientific community might prefer to stick with Conda, or with some evolution of it that brings it closer in line with the Unified Packaging Tool but also solves the scientists’ needs better than a tool that also solves the non-scientists’ needs can.
Then there’s a discussion about the existing tools and which one is the tool of the future. The maintainer of Hatch (Ofek Lev) says that Hatch can provide the “unified UX”. But do the maintainers of Poetry or PDM agree? Poetry seems to be much more active than Hatch, going by GitHub issues, and it’s also worth noting that Hatch’s bus factor is 1 (with Ofek Lev responsible for 542 out of 576 commits to the master branch). Russell Keith-Magee from BeeWare has highlighted the fact that, tooling aside, the PyPA does a bad job at communicating things. Russell mentioned that one of the PyPA tutorials now uses Hatch, but there is no way to know if the PyPA considers Hatch to be the future, whether people are supposed to migrate onto Hatch, and whether Flit, another recent PyPA tool, is now useless. Russell also makes good points about focusing efforts: should people focus on helping Hatch support extension modules (which, according to the Hatch maintainer, is the last scenario requiring setuptools; other participants note that you can already build native code without setuptools), or should people focus on improving setuptools compatibility with PEP 517?
There were also some people stating their opinions on unifying things in various ways, and many of them are against unifying things. There were some voices of reason, like that of Russell Keith-Magee, or of Simon Notley, who correctly noticed that the thread fails to solve the problems of developers, who are confused about packaging, and don’t understand the different choices available and how they interoperate. Simon does agree that native dependencies are important and appear often in Python projects (and so do I), but the users who responded to the survey had something else in mind, as exemplified by the discussion’s opening post, mentioning a user expecting the simplicity of Rust’s cargo, and by the survey results. 70% of the survey respondents also use `npm`, so many Python users have already seen the simpler workflows. The survey respondents were also asked to rank a few focus areas by importance. “Making Python packaging better serve common use cases and workflows” was ranked first out of the provided options by 3,248 participants. “Supporting a wider range of use cases (e.g. edge cases, etc.)” was ranked first by 379 people, and it was the least important in the minds of 2,989 people.
One more point that highlights the detachment of packaging folks from reality was raised by Anderson Bravalheri. To Anderson, a new unified tool would be disrespectful of the work the maintainers of the existing tools put into maintaining them, and disrespectful of users who had to adapt to the packaging mess. This point is completely absurd. Was the replacement of MS-DOS/Windows 9x and Classic Mac OS with Windows NT and Mac OS X (later OS X, now macOS) disrespectful to their respective designers, and to the users who had to adapt after manually configuring minutiae, figuring out how to get all their software and hardware to run with weird limitations that were necessary in the 1980s, and watching the system crash every now and then? Was the replacement of horses with cars disrespectful to horses, and to the people who were removing horse manure from the streets? Was the replacement of the Ford Model T with faster, safer, more efficient, and easier-to-use cars disrespectful to Henry Ford? Technology comes and goes, and sometimes, getting an improvement means we need to get rid of the old stuff. This applies outside of technology, too: you could come up with many examples of change in the world which might have put some people out of power, but which vastly improved the lives of millions of people (the fall of communism in Europe, for example). Also, going back to today’s technology world, this sentiment suggests Anderson is far too attached to the software they write. Is that a healthy approach?
Nobody raised PEP 582 or the complexity of virtual environments. It might not be visible from the ivory towers of packaging tool maintainers, who have years of experience dealing with them, but it certainly does exist for regular folks, for people who think the Python provided by their Linux distro is good enough, and especially for people for whom Python is their introduction to programming.
I want to highlight once again: this is not just the opinion of one random rambling Chris. The opinion that Python packaging needs to be simplified and unified is held by about half of the 8,774 people who took the survey.
But here’s one more interesting thing: Discourse, the platform the discussion was held on, shows the number of times a link was clicked. Granted, this count might not always be accurate, but if we assume it is, the link to the results summary was clicked only 14 times (as of 2023-01-14 21:20 UTC). The discussion has 28 participants and 2.2k views. If we believe the link click counter, half of the discussion participants didn’t even bother reading what the people think.
Python packaging is a mess, and it always has been. There are tons of tools, mostly incompatible with each other, and no tool solves all problems (especially no tool from the PyPA). PDM is really close to the ideal, since it can do away with the overhead of managing virtual environments, which is hopefully the future of Python packaging, or the 2010s of Node.js packaging. Perhaps in a few years, Python developers (and more importantly, Python beginners!) will be able to just `pip install` (or `pdm install`?) what they need, without worrying about some “virtual environment” thing, which is separate but not quite from a system Python, and which is not a virtual machine. Python needs fewer tools, not more.
Furthermore, I consider that the PyPA must be destroyed. The strategy discussion highlights the fact that they’re unable to make Python packaging work the way the users expect. The PyPA should focus on producing one good tool, and on getting PEP 582 into Python. A good way to achieve this would be to put its resources behind PDM. The issues with native code and binary wheels are important, but plain-Python workflows, or workflows with simple binary dependencies, are much more common, and need to be improved. This improvement needs to happen now.
Discuss in the comments below, on Hacker News, or on Reddit.