diff --git a/debian/README.venv b/debian/README.venv
new file mode 100644
index 0000000..9711ee1
--- /dev/null
+++ b/debian/README.venv
@@ -0,0 +1,230 @@
+=========================================
+ pyvenv support in Python 3.4 and beyond
+=========================================
+
+In Python 3.3, built-in support for virtual environments (venvs) was added via
+the `pyvenv`_ command. For building venvs using Python 3, this is
+functionally equivalent to the standalone `virtualenv`_ tool, except that
+before Python 3.4, a venv created by pyvenv didn't include pip and setuptools.
+
+In Python 3.4, this was made even more convenient by the `automatic
+inclusion`_ of the `pip`_ command into the venv so that third party libraries
+can be easily installed from the Python Package Index (PyPI_). The stdlib
+module `ensurepip`_ is run when the ``pyvenv-3.4`` command is run to create the
+venv.
+
+This poses a problem for Debian. ensurepip comes bundled with two third party
+libraries, setuptools and pip itself, as these are requirements for pip to
+function properly in the venv. These are bundled in the ensurepip module of
+the upstream Python 3.4 tarball as `universal wheels`_, essentially a zip of
+the source code and a new ``dist-info`` metadata directory. Upstream pip
+itself comes bundled with a half dozen or so of *its* dependencies, except
+that these are "vendorized", meaning their unpacked source code lives within
+the pip module, under a submodule from which pip imports them rather than the
+top-level package namespace.
+
+To make matters worse, one of pip's vendorized dependencies, the `requests`_
+module, *also* vendorizes a bunch of its own dependencies. This stack of
+vendorized and bundled third party libraries fundamentally violates the DFSG
+and Debian policy against including code not built from source available
+within Debian, and against including embedded "convenience" copies of code in
+other packages.
+
+It's worth noting that the virtualenv package actually suffers from the same
+conflict, but its current solution in Debian is `incomplete`_.
+
+
+Solving the conflict
+====================
+
+This conflict between Debian policy and upstream Python convenience must be
+resolved, because pyvenv is the recommended way of creating venvs in Python 3,
+and because at some point, the standalone virtualenv tool will be rewritten as
+a thin layer above pyvenv. Obviously, we want to provide the best Python
+virtual environment experience to our developers while adhering to Debian policy.
+
+The approach we've taken is layered and nuanced, so I'll provide a significant
+amount of detail to explain both what we do and why.
+
+The first thing to notice is how upstream ensurepip works to have its pip and
+setuptools dependencies available, both at venv creation time and when
+``<venv>/bin/pip`` is run. When pyvenv-3.4 runs, it ends up calling the
+following Python command::
+
+ <venv>/bin/python -Im ensurepip --upgrade
+
+This runs ensurepip's ``__main__.py`` module using the venv's Python in
+isolation mode, with a switch to upgrade the setuptools and pip dependencies
+(if, for example, they've been updated in a new micro version of Python).
+
+Internally, ensurepip bootstraps itself by byte-copying its embedded wheels
+into a temporary directory, putting those copied wheels on ``sys.path``, and
+then calling into pip as a library. Because wheels are just elaborate zips,
+Python can execute (pure-Python) code directly from them, if they are on
+``sys.path`` of course. Once ensurepip has set up its execution environment,
+it calls into pip to install both pip and setuptools into the newly created
+venv. If you poke inside the venv after successful creation, you'll see
+unpacked pip and setuptools directories in the venv's ``site-packages``
+directory.
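The bootstrap trick described above can be sketched with the standard library alone. This is a minimal illustration, not ensurepip's actual code: it builds a tiny zip archive standing in for a wheel, copies it into a temporary directory, puts the copy on ``sys.path``, and imports from it. The module name ``demo_wheel_pkg`` and the helper names are made up for this example.

```python
import importlib
import os
import shutil
import sys
import tempfile
import zipfile


def make_fake_wheel(directory):
    """Create a zip with one pure-Python module, standing in for a wheel.

    The module and file names are hypothetical; a real wheel also
    carries a ``dist-info`` metadata directory.
    """
    path = os.path.join(directory, "demo_wheel_pkg-1.0-py2.py3-none-any.whl")
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("demo_wheel_pkg.py", "VERSION = '1.0'\n")
    return path


def import_from_copied_wheel(wheel_path, module_name):
    """Copy the wheel to a temp dir, put the copy on sys.path, import."""
    with tempfile.TemporaryDirectory() as tmpdir:
        copied = shutil.copy(wheel_path, tmpdir)
        sys.path.insert(0, copied)
        try:
            # Make sure the import system notices the new path entry.
            importlib.invalidate_caches()
            return importlib.import_module(module_name)
        finally:
            sys.path.remove(copied)
```

Python's zip import support does not care about the ``.whl`` extension, which is why the archive can sit directly on ``sys.path``. This only works for pure-Python code, as the text notes.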
+
+The important thing to note here is that ensurepip is *already* able to import
+from and install wheels, and because wheels are self-contained single files
+(of zipped content), it makes manipulating them quite easy. In order to
+minimize the delta from upstream (and to eventually work with upstream to
+eliminate this delta), it seems optimal that Debian's solution should also be
+based on wheels, and re-use as much of the existing machinery as possible.
+
+The difference for Debian though is that we don't want to use the embedded pip
+and setuptools wheels from upstream Python's ensurepip; we want to use wheels
+created from the pip and setuptools *Debian* packages. This would solve the
+problem of distributing binary packages not built from source in Debian.
+
+Thus, we modify the python-pip and python-setuptools packages to include new
+binary packages ``python-pip-whl`` and ``python-setuptools-whl``, which contain
+*only* the relevant universal wheels. Those packages' ``debian/rules`` files
+gain an extra command::
+
+ python3 setup.py bdist_wheel --universal -d <path>
+
+The ``bdist_wheel`` command is provided by the `wheel`_ package, which as of
+this writing is newly available in Jessie.
+
+Note that the names of the binary packages, and other details of when and how
+wheels may be used in Debian, are described in `Debian Python Policy`_ 0.9.6 or
+newer.
+
+The universal wheels (i.e. pure-Python code compatible with both Python 2 and
+Python 3) are built for pip and setuptools and installed into
+``/usr/share/python-wheels`` when the python-{pip,setuptools}-whl packages are
+installed. These are not needed for normal, usual, and typical operation of
+Python, so none of these are installed by default.
+
+However, this isn't enough: because the pip and setuptools wheels are
+built from the *patched* and de-vendorized versions of the code in Debian, the
+wheels will not contain their own recursive dependencies. That's a good thing
+for Debian policy compliance, but does add complications to the stack of hack.
+
+Using the same approach as for pip and setuptools, we *also* wheelify their
+dependencies, recursively. As of this writing, the list of packages needing
+to be wheelified is (by Debian source package name):
+
+ * chardet
+ * distlib
+ * html5lib
+ * python-colorama
+ * python-pip
+ * python-setuptools
+ * python-urllib3
+ * requests
+ * six
+
+Most of these are DPMT maintained. six, distlib, and colorama are not team
+maintained, so coordination with those maintainers is required. Also note
+that the ``bdist_wheel`` command is a setuptools extension, so those projects
+that use ``distutils.core.setup()`` by default must be patched
+to use ``setuptools.setup()`` instead. This isn't a problem because there's
+no functional difference relevant to those packages; they likely use
+distutils.core to avoid a third party dependency on setuptools.
+
+Each of these Debian source packages grows an additional binary package, just
+like pip and setuptools, e.g. python-chardet-whl, which contains the universal
+wheel for that package built from patched Debian source. As above, when
+installed, these binary packages drop their .whl files into the
+``/usr/share/python-wheels`` directory.
+
+Now comes the fun part.
+
+In the python3.4 source package, we add a new binary package called
+python3.4-venv. This will only contain the ``/usr/bin/pyvenv-3.4``
+executable, and its associated manpage. It also includes all the run-time
+dependencies to make pyvenv work *including the wheel packages described
+above*.
+
+(Technically speaking, you should substitute "Python 3.4 or later" for all
+these discussions, and e.g. pyvenv-3.x for all versions subsequent to 3.4.)
+
+Python's ensurepip module has been modified in the following ways (see
+``debian/patches/ensurepip.diff``):
+
+ * When ensurepip is run outside of a venv as root, it raises an exception.
+ This use case is only to be supported by the separate python{,3}-pip
+ packages.
+
+ * When ensurepip is run inside of a venv, it copies all dependent wheels from
+ ``/usr/share/python-wheels``. This includes the direct dependencies pip
+ and setuptools, as well as the recursive dependencies listed above. The
+ rest of the ensurepip machinery is unchanged: the wheels are still copied
+ into a temporary directory and placed on ``sys.path``, however only the
+ direct dependencies (i.e. pip and setuptools) are *installed* into the
+ venv's ``site-packages`` directory. The indirect dependencies are copied
+ to ``<venv>/lib/python-wheels`` since they'll be needed by the venv's pip
+ executable.
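The split between direct and indirect dependencies can be sketched as follows. This is an illustration under stated assumptions, not the contents of ``debian/patches/ensurepip.diff``: the helper names are hypothetical, and the project name is taken from the wheel file name following the PEP 427 naming convention (``name-version-pytag-abitag-platform.whl``).

```python
import os
import sys

# Projects that get installed into the venv's site-packages.
DIRECT = {"pip", "setuptools"}


def in_venv():
    """Return True when running inside a venv (PEP 405)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def project_name(wheel_filename):
    """Extract the project name from a PEP 427 wheel file name.

    PEP 427 replaces any '-' in the project name with '_', so the
    text before the first '-' is the name.
    """
    return wheel_filename.split("-", 1)[0]


def partition_wheels(wheel_dir):
    """Split the .whl files in wheel_dir into (direct, indirect) lists."""
    direct, indirect = [], []
    for filename in sorted(os.listdir(wheel_dir)):
        if not filename.endswith(".whl"):
            continue
        target = direct if project_name(filename) in DIRECT else indirect
        target.append(os.path.join(wheel_dir, filename))
    return direct, indirect
```

Under this sketch, the ``direct`` list is what would be handed to pip for installation, while the ``indirect`` list is what would be copied to ``<venv>/lib/python-wheels``.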
+
+Why do we do the latter rather than also installing the recursive
+dependencies into the venv's ``site-packages``? It's because pip requires a
+very specific set of dependencies and we don't want pip to break when the user
+upgrades or downgrades one of those packages, which is perfectly valid in a
+venv. It's exactly the same reason why pip vendorizes those libraries in the
+first place; it's just that we're doing it in a more principled way (from the
+point of view of the Debian distribution).
+
+The final piece of the puzzle is that Debian's pip will, when run inside of a
+venv, introspect ``<venv>/lib/python-wheels`` and put every .whl file it sees
+there *at the front of its sys.path*. Again, this is so that when pip runs,
+it will find the versions of packages known to be good first, rather than any
+other versions in the venv's ``site-packages``.
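A minimal sketch of that ``sys.path`` manipulation, assuming only the directory layout named above. The function name and its ``path`` parameter are hypothetical and exist to make the idea testable; this is not the code in Debian's pip.

```python
import glob
import os
import sys


def prepend_private_wheels(venv_dir, path=None):
    """Put every .whl under <venv>/lib/python-wheels at the front of path.

    ``path`` defaults to ``sys.path``; it is modified in place and returned.
    """
    if path is None:
        path = sys.path
    wheel_dir = os.path.join(venv_dir, "lib", "python-wheels")
    wheels = sorted(glob.glob(os.path.join(wheel_dir, "*.whl")))
    # Slice assignment keeps the same list object, which matters for sys.path.
    path[:0] = [w for w in wheels if w not in path]
    return path
```

Because imports are resolved by scanning ``sys.path`` in order, anything in these wheels shadows a same-named package in ``site-packages`` for the lifetime of the pip process.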
+
+As an example of the bad things that can happen if you don't do this, try
+installing nose2_ into the venv, followed by genshi_. nose2 has a hard
+requirement on a version of six that is older than the one used by pip
+(indirectly). This older version of six is compatible with genshi, but *not*
+with pip, so once nose2 is installed, if pip didn't load its version of six
+from the private wheel, the attempt to install genshi would fail with a traceback.
+As it is, with the wheels early enough on ``sys.path``, pip itself works just
+fine so that both nose2 and genshi can live together in the venv.
+
+
+Updating packages
+=================
+
+Inevitably, new versions of Python or the pyvenv dependent packages will
+appear. Unfortunately, as currently implemented (by both upstream ensurepip
+and in our ensurepip patch), the versions of both the direct and indirect
+dependencies are hardcoded in ``Lib/ensurepip/__init__.py``. When you, as a Debian
+developer, update any of the dependent packages, you will need to:
+
+ * *Test that the new version is compatible with ensurepip*.
+
+ * Update the version numbers in the ``debian/control`` file, for the
+ python3.x-venv binary package.
+
+ * ``quilt push`` to the ensurepip patch, and update the version number in
+ ``Lib/ensurepip/__init__.py``
+
+Then rebuild and upload python3.4.
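Since the versions are hardcoded, a small consistency check can catch a forgotten update before upload. The sketch below is hypothetical and not part of the packaging: ``pinned`` stands in for whatever name-to-version table the patched ensurepip carries, and the versions are read back from the wheel file names in a directory such as ``/usr/share/python-wheels``.

```python
import os


def wheel_versions(wheel_dir):
    """Map project name -> version for each .whl file in wheel_dir."""
    versions = {}
    for filename in os.listdir(wheel_dir):
        parts = filename.split("-")
        # PEP 427 names have at least five dash-separated fields.
        if filename.endswith(".whl") and len(parts) >= 5:
            versions[parts[0]] = parts[1]
    return versions


def find_mismatches(pinned, wheel_dir):
    """Return {project: (pinned, found)} where the two disagree.

    ``found`` is None when no wheel for the project is present.
    """
    available = wheel_versions(wheel_dir)
    return {
        name: (version, available.get(name))
        for name, version in pinned.items()
        if available.get(name) != version
    }
```

An empty result means every pinned version has a matching wheel; anything else names the project to fix in ``debian/control`` or in the ensurepip patch.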
+
+Yes, this isn't ideal, and I am working with upstream to find a good solution
+that we can share.
+
+
+Author
+======
+
+Barry A. Warsaw <barry@debian.org>
+2014-05-15
+
+
+
+.. _pyvenv: http://legacy.python.org/dev/peps/pep-0405/
+.. _virtualenv: https://pypi.python.org/pypi/virtualenv
+.. _`automatic inclusion`: http://legacy.python.org/dev/peps/pep-0453/
+.. _pip: https://pypi.python.org/pypi/pip
+.. _PyPI: https://pypi.python.org/pypi
+.. _ensurepip: https://docs.python.org/3/library/ensurepip.html
+.. _`universal wheels`: http://legacy.python.org/dev/peps/pep-0427/
+.. _requests: https://pypi.python.org/pypi/requests
+.. _incomplete: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=719767
+.. _wheel: https://pypi.python.org/pypi/wheel
+.. _nose2: https://pypi.python.org/pypi/nose2
+.. _genshi: https://pypi.python.org/pypi/Genshi
+.. _`Debian Python Policy`: https://www.debian.org/doc/packaging-manuals/python-policy/