TIL - 🎁 Extracting modules provided by an installed package
This is not something you do every day, but sometimes you need to match a package name with the modules it installs, as in answering this question: What modules are provided by ‘Django~=3.1.0’?
It took me a while to dig this up (but see the PS, which accounts for a lot of the time spent on it), but I eventually found a way to achieve this.
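Here is a minimal sketch of the whole approach, assuming a setuptools version that still ships pkg_resources. The function name `provided_modules` is made up for illustration, and returning an empty list when top_level.txt is missing is my own assumption, not something the metadata format mandates.

```python
def provided_modules(requirement):
    """Return the top-level modules installed by a pip-style requirement.

    `requirement` is a PEP 508 string such as "Django~=3.1.0".
    `provided_modules` is a hypothetical helper name, not a setuptools API.
    """
    # Imported lazily: pkg_resources scans the installed distributions at
    # import time, which is noticeably slow.
    import pkg_resources

    req = pkg_resources.Requirement.parse(requirement)
    # Raises DistributionNotFound or VersionConflict when nothing installed
    # satisfies the requirement.
    dist = pkg_resources.get_distribution(req)
    if not dist.has_metadata("top_level.txt"):
        return []  # assumption: no top_level.txt means "no known modules"
    # One module name per line, so splitting on whitespace is enough.
    return dist.get_metadata("top_level.txt").split()
```

Calling `provided_modules("Django~=3.1.0")` in a virtualenv where a matching Django is installed should give back the list of modules that distribution ships.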
How does this work?
The easy answer is: setuptools magic
The more in-depth answer is that each package ships this information in its top_level.txt metadata file (which apparently is specified only for the egg format, but is available in sdists, eggs and wheels) as a list of the top-level modules shipped in the package, one per line.
The pkg_resources module, part of setuptools, provides an API to read package metadata, and we can use it to access this file.
First we need to parse our package into a Requirement.
The nice thing is that we can provide it any “valid” (as per PEP 508) package requirement, so we can pass it the same string we would use with pip (which is handy, given the reason I needed this feature in the first place).
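As a sketch of this parsing step: `Requirement.parse` comes from pkg_resources, while the wrapper name `describe_requirement` is hypothetical and only there to show which fields the parsed object exposes.

```python
def describe_requirement(requirement):
    """Parse a PEP 508 string and return (project name, key, version specifier).

    `describe_requirement` is an illustrative helper, not a setuptools API.
    """
    # Imported lazily because importing pkg_resources is slow.
    import pkg_resources

    req = pkg_resources.Requirement.parse(requirement)
    # project_name keeps the spelling we passed in, key is its lowercase
    # form, and specifier holds the version constraint.
    return req.project_name, req.key, str(req.specifier)
```

For instance, `describe_requirement("Django~=3.1.0")` returns `("Django", "django", "~=3.1.0")`. Parsing does not require the package to be installed, since at this point we are only describing what we are looking for.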
The requirement (which is just a data structure representing the package name and the required version) must then be matched against the installed package by getting the Distribution, which instead represents the exact package currently installed in the current virtualenv (well, technically in the current WorkingSet, but we won’t go into details here).
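The matching step could look roughly like the sketch below. `get_distribution`, `DistributionNotFound` and `VersionConflict` are pkg_resources names. The helper name `installed_version` and the choice to return `None` when there is no match are my own assumptions.

```python
def installed_version(requirement):
    """Return the installed version satisfying `requirement`, or None.

    `installed_version` is a hypothetical helper name used for illustration.
    """
    # Imported lazily because importing pkg_resources is slow.
    import pkg_resources

    req = pkg_resources.Requirement.parse(requirement)
    try:
        dist = pkg_resources.get_distribution(req)
    except pkg_resources.DistributionNotFound:
        return None  # the package is not installed at all
    except pkg_resources.VersionConflict:
        return None  # installed, but in a version that does not match
    return dist.version
```

Besides `version`, the returned Distribution also carries attributes such as `project_name` and `location`, which are handy when checking which copy of a package was actually picked up.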
The Distribution object is our gateway to the package information:
One of its methods is Distribution.get_metadata, which can read the package metadata from its egg-info or dist-info directory (depending on the package format).
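A small sketch of this last step, taking an already resolved Distribution as input. `has_metadata` and `get_metadata` are the Distribution methods. `parse_top_level` and `read_top_level` are hypothetical helper names, and the empty-list fallback is an assumption on my side.

```python
def parse_top_level(text):
    """Split the content of top_level.txt into a list of module names."""
    # One module per line; skip blank lines and surrounding whitespace.
    return [line.strip() for line in text.splitlines() if line.strip()]


def read_top_level(dist):
    """Read top_level.txt from a pkg_resources Distribution object."""
    if not dist.has_metadata("top_level.txt"):
        return []  # assumption: treat a missing file as "no known modules"
    return parse_top_level(dist.get_metadata("top_level.txt"))
```

Keeping the parsing separate from the reading makes it easy to try the former on a plain string: `parse_top_level("django\n")` gives `["django"]`.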
Et voilà, our quest is complete!
PS: If you are working in an IPython session and you uninstall and install packages in the meantime, don’t forget to exit and re-enter (or use the IPython magic to reload the available packages): it’s one of those lessons that I keep forgetting 🤦♂️