January 30, 2021

Packaging a Python Project with PyPI

Ever wondered how to publish a Python project so anyone can install it with pip? This post dives into the world of Python packaging.

Have you ever wondered what goes on behind the scenes to allow you to run pip install netmiko or pip install ansible? I've been writing code in Python for a while (I said code, not good code!), but one thing that I never quite grokked was how to take a simple script and turn it into something that can be installed and used by others. My recent discovery of the Python smtpd module inspired me to jump down this rabbit hole: how do you create a CLI program that anyone can install on their computer? In this post, I will discuss how I made and packaged a simple little Python module/CLI program that can be installed with pip.

Creating your Project

The obvious first step to publishing a Python project is actually writing some code. I am not going to go into all of the details of how to write a Python project, as tomes can be (and have been) written on the subject. Instead, I will walk you through a simple project I wrote for this post, called redditmon. It uses Reddit's API to fetch the current top post of a subreddit, display it to the user, and refresh it at a set interval. For example, to view the top post of the /r/networking subreddit and have it refreshed every 30 seconds, redditmon can be launched with redditmon networking -r 30, which (at the time of writing) results in this output to the terminal:

The Current Top Post on /r/networking is:
"Blogpost Friday!" submitted by user "AutoModerator" with 13 Upvotes

It's Read-only Friday! It is time to put your feet up, pour a nice dram and look through some of our member's new and shiny blog posts.

Feel free to submit your blog post and as well a nice description to this thread.

*Note: This post is created at 00:00 UTC. It may not be Friday where you are in the world, no need to comment on it.*

View This Post on Reddit at this URL: https://old.reddit.com/r/networking/comments/l7d6ek/blogpost_friday/
NOTE: To use this script yourself, you will need to create a Reddit API key and put it into a config file structured like this.

You can view this project on GitHub, but here is a quick overview of the files involved to make this happen:

redditmon
├── example_redditmon.config.ini
├── LICENSE
├── README.md
└── redditmon
    └── redditmon.py

All of the actual code lives in a single file, redditmon/redditmon.py, and the rest is either used for config or packaging, so we will just ignore those for the time being. If you want to really dig into what's going on, please check out the source code. Here is the general flow of the program:

"""Module for monitoring a subreddit for new posts
"""

import praw, time, os, argparse, configparser
from os.path import expanduser

# Bodies are elided with "..." here; see the source for the full code.
class RedditDisplay:
    def __init__(self, subreddit_name, refresh_interval, config_path): ...
    def load_config(self, config_path): ...
    def get_top_post(self): ...
    def print_post_title(self): ...
    def print_post_body(self): ...
    def print_post_url(self): ...
    def display_post(self):
        self.print_post_title()
        self.print_post_body()
        self.print_post_url()


def get_cli_args(): ...

def redditmon_cli(): ...

if __name__ == "__main__":
    redditmon_cli()

We start off by importing some libraries, all of which are included with Python except for praw, which is the "Python Reddit API Wrapper". Keep in mind that this will be an important part of our packaging process later. After our imports, we have a class called RedditDisplay that handles all of the work of grabbing the Reddit post and provides the tools to display it in the user's terminal. Near the bottom of the script, we have a few module-level functions. The first, get_cli_args(), defines the arguments available for running this module from the CLI, and the second, redditmon_cli(), ties all of this together to allow you to use redditmon as a CLI tool. The final code is a simple if __name__ == "__main__": statement that launches the redditmon_cli() function when you call the file directly.

In its current state, we can launch the script by its name with python3 redditmon/redditmon.py networking and it works! If we are in the project directory, we can even import it to a separate script or the Python REPL like so:

┬─[chris@chris-lt01:~/c/redditmon]─[01:21:15 PM]─[V:redditmon]─[G:main=]
╰─>$ ipython
Python 3.8.5 (default, Jul 28 2020, 12:59:40) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import redditmon.redditmon

In [2]: display = redditmon.redditmon.RedditDisplay("networking", 10, "/home/chris/redditmon.config.ini")

In [3]: display
Out[3]: <redditmon.redditmon.RedditDisplay at 0x7fc43d6b61f0>

This should be enough background on redditmon.py to understand the actual point of this post, which is how to package the project for installation with pip, but make sure to look at the source if you want to judge my terrible code (er, learn more about how it works).

Packaging your Project

Okay, PHEW. Now that we got that out of the way, I've added a few files that let us package the project:

redditmon
├── example_redditmon.config.ini
├── LICENSE
├── README.md
├── redditmon
│   ├── __init__.py
│   └── redditmon.py
└── setup.py

__init__.py

The first file is nice and simple: it's an empty file named __init__.py in the redditmon folder. This tells Python that the directory is a "regular package", and the contents of the file are executed as normal Python code when you import the package. Ours is blank because we don't have anything we want to run on import; we just want Python to know this is a regular package.
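
To see that behavior in isolation, here is a minimal, self-contained sketch. It builds a throwaway package in a temporary directory (the names demo_pkg, hello, and LOADED are made up for illustration and have nothing to do with redditmon), imports a submodule, and confirms that the code in __init__.py ran as part of the import:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_init_execution():
    """Build a throwaway package and show that __init__.py runs on import."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_pkg"  # hypothetical package name
        pkg_dir.mkdir()
        # Unlike redditmon's empty __init__.py, this one sets a marker.
        (pkg_dir / "__init__.py").write_text("LOADED = True\n")
        (pkg_dir / "hello.py").write_text("def greet():\n    return 'hi'\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            # Importing the submodule imports the parent package first,
            # which executes demo_pkg/__init__.py.
            hello = importlib.import_module("demo_pkg.hello")
            package = sys.modules["demo_pkg"]
            return package.LOADED, hello.greet()
        finally:
            # Clean up so the demo leaves no trace in this interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg.hello", None)
            sys.modules.pop("demo_pkg", None)


print(demo_init_execution())
```

An empty __init__.py, like the one in redditmon, behaves the same way; there is simply nothing to execute.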

setup.py

This file is where all the magic happens: it's the script that tells the setuptools package all of our package details. (setuptools is not part of the Python standard library, but it is commonly installed alongside pip.) Here is what setup.py looks like for redditmon at the time of writing:

import pathlib
import setuptools

here = pathlib.Path(__file__).parent

with open(f"{here}/README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setuptools.setup(
    name="redditmon",
    version="1.0.1",
    author="Chris Cummings",
    author_email="nouser@slash64.tech",
    description="A simple package for viewing reddit posts",
    long_description=long_description,
    long_description_content_type="text/markdown",
    url="https://github.com/cummings-chris/redditmon",
    packages=setuptools.find_packages(),
    license="MIT",
    classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License",
        "Operating System :: OS Independent",
    ],
    install_requires=['praw'],
    entry_points={
        'console_scripts': [
            'redditmon = redditmon.redditmon:redditmon_cli',
        ],
    },
    python_requires='>=3.6',
)

These are pretty straightforward and mostly self-explanatory, but I would like to focus on install_requires and entry_points.

install_requires

Remember how I said that praw isn't included with the Python standard library? Since praw isn't guaranteed to be present on every user's system, I need to make sure that when redditmon is installed, pip installs praw as well. To ensure this, we simply set the install_requires argument to a list of all the packages that we depend on:

install_requires=['praw'],

With this value set, pip will now make sure that praw is present when a user installs redditmon.
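
If you want to confirm from Python that a dependency actually landed in the current environment, one option is the standard library's importlib.metadata module. This is only a sketch: it needs Python 3.8 or newer (newer than the python_requires value above), and the helper name installed_version is my own invention, not something pip or setuptools provides.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# Report on the two distributions discussed in this post.
for name in ("praw", "redditmon"):
    print(name, "->", installed_version(name) or "not installed")
```

What this prints depends entirely on the environment you run it in.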

entry_points

This one is a little more complex than install_requires, but it provides the core feature of our program: running it directly from the CLI. Without this option, a user could still run the module with python3 -m redditmon.redditmon networking, invoking the if __name__ == "__main__": block of our program, but that is not as clean as using the console_scripts entry point. (Note that python3 -m redditmon on its own would not work here, since the package has no __main__.py.) With this set, the appropriate scripts are created at installation time so that we can simply launch our program using redditmon networking right from our shell! To make this all work, entry_points is populated like so:

entry_points={
    'console_scripts': [
        'redditmon = redditmon.redditmon:redditmon_cli',
    ],
},

Let's break this apart. entry_points is populated with a single dictionary. We are only populating one key, console_scripts (read about what else is available here). The value for console_scripts is a list of the CLI programs that we want generated. Since we only have one, our list contains just one entry, but it could contain as many as we want! Our script is defined by the string 'redditmon = redditmon.redditmon:redditmon_cli', which tells setuptools to create a script called redditmon that imports the redditmon_cli() function from redditmon.redditmon (our library) and then runs that function. Here is the exact script that setuptools generated on my machine and placed in ~/.local/bin/redditmon, which is part of my shell's $PATH variable:

┬─[chris@chris-lt01:~/c/redditmon]─[02:38:48 PM]─[G:main=]
╰─>$ cat /home/chris/.local/bin/redditmon 
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from redditmon.redditmon import redditmon_cli
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(redditmon_cli())

The beauty of using entry_points is that the redditmon script is generated and placed dynamically by the installer toolchain, so I never have to fiddle with placing this file in the right directory for the user, which varies greatly by installation, OS type, etc.
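
To make the 'name = module:function' syntax a bit more concrete, here is a rough sketch of how such a string can be resolved into a callable. This is not how setuptools or pip implement it internally; resolve_entry_point is a hypothetical helper, and the example points at json.dumps from the standard library so that it runs without redditmon installed.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name = package.module:function' and import the function."""
    name, separator, target = (part.strip() for part in spec.partition("="))
    if not separator or not name:
        raise ValueError(f"expected 'name = module:function', got {spec!r}")
    module_path, colon, attribute_path = target.partition(":")
    if not colon or not module_path or not attribute_path:
        raise ValueError(f"expected 'module:function' after '=', got {target!r}")

    # Import the module, then walk any dotted attribute path to the callable.
    obj = importlib.import_module(module_path)
    for attribute in attribute_path.split("."):
        obj = getattr(obj, attribute)
    return name, obj


# Stand-in for 'redditmon = redditmon.redditmon:redditmon_cli'.
script_name, func = resolve_entry_point("to-json = json:dumps")
print(script_name, func({"ok": True}))
```

The generated script shown earlier does essentially the last step by hand: it imports the function from the module and calls it.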

Publishing your Project

Now we are looking good! We have written a project, configured our setup scripts, and now we are ready to generate the project files and upload them to the Python Package Index (PyPI)!

Generating Packages

To generate the package archives, issue the command python3 setup.py sdist bdist_wheel from the directory of your project that contains setup.py. This spits out a ton of information as well as a bunch of files, but the ones we really care about are the files in the newly created dist/ directory:

dist/
├── redditmon-1.0.1-py3-none-any.whl
└── redditmon-1.0.1.tar.gz

The file redditmon-1.0.1-py3-none-any.whl is a "Built Distribution" that contains everything needed for installation, and the file redditmon-1.0.1.tar.gz contains all of our code and is called a "Source Archive". Soon we will upload these packages to PyPI, but first, go ahead and make sure that you have an account set up on both the Test PyPI instance and the Production PyPI instance.
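
The wheel's file name is not arbitrary: it follows the pattern name-version-pythontag-abitag-platformtag.whl, with an optional build tag after the version. As a small sketch (parse_wheel_filename is a hypothetical helper of my own, not part of any packaging tool), the name can be picked apart like this:

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag.
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


print(parse_wheel_filename("redditmon-1.0.1-py3-none-any.whl"))
```

For redditmon, py3-none-any means the wheel works on any Python 3, has no ABI requirement, and is not tied to a platform, which is what you would expect from a pure-Python project.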

Uploading to PyPI Test

PyPI is kind enough to provide a testing instance that you can upload your packages to! This allows you to get comfortable with the process without messing up your project on the production index. The tool that lets us upload our package is called twine, so make sure you have that installed via pip install twine. Now simply upload everything in dist/ to PyPI Test like so:

┬─[chris@chris-lt01:~/c/redditmon]─[03:05:19 PM]─[V:redditmon-venv]─[G:main=]
╰─>$ twine upload --repository testpypi dist/*
Uploading distributions to https://test.pypi.org/legacy/
Enter your username: <YOUR-USERNAME-HERE>
Enter your password: <YOUR-PASSWORD-HERE>
Uploading redditmon-1.0.1-py3-none-any.whl
100%|███████████████| 8.19k/8.19k [00:01<00:00, 8.19kB/s]
Uploading redditmon-1.0.1.tar.gz
100%|███████████████| 6.94k/6.94k [00:00<00:00, 7.85kB/s]

View at:
https://test.pypi.org/project/redditmon/1.0.1/

Head over to PyPI Test, search for your package name, and you can see your shiny new package!

Now you are all set! You can install your package from PyPI test like so:

┬─[chris@chris-lt01:~/c/redditmon]─[03:18:55 PM]─[V:redditmon-venv]─[G:main=]
╰─>$ pip install --index-url https://test.pypi.org/simple/ redditmon
Looking in indexes: https://test.pypi.org/simple/
Collecting redditmon
  Downloading https://test-files.pythonhosted.org/packages/0b/d2/43c4f86ae4ac9e867412d1694b210f36b1debcaa297bb4a35ae4669cf0aa/redditmon-1.0.1-py3-none-any.whl (4.3 kB)
Requirement already satisfied: praw in /home/chris/code/redditmon-venv/lib/python3.8/site-packages (from redditmon) (7.1.0)
<OUTPUT OMITTED>
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/chris/code/redditmon-venv/lib/python3.8/site-packages (from requests>=2.3.0->update-checker>=0.17->praw->redditmon) (1.26.3)
Installing collected packages: redditmon
Successfully installed redditmon-1.0.1

Uploading to PyPI

Now that we are comfortable with the tools available to us, let's publish our package to the production PyPI instance! We've already generated our Source Archive and Built Distribution, so we simply need to do the upload. This time we don't specify a repository, as twine defaults to the production PyPI instance:

┬─[chris@chris-lt01:~/c/redditmon]─[03:18:59 PM]─[V:redditmon-venv]─[G:main=]
╰─>$ twine upload dist/*
Uploading distributions to https://upload.pypi.org/legacy/
Enter your username: crankynetman
Enter your password: 
Uploading redditmon-1.0.1-py3-none-any.whl
100%|███████████████| 8.19k/8.19k [00:01<00:00, 6.09kB/s]
Uploading redditmon-1.0.1.tar.gz
100%|███████████████| 6.94k/6.94k [00:00<00:00, 8.78kB/s]

View at:
https://pypi.org/project/redditmon/1.0.1/

Let's make sure it shows up on PyPI by visiting the project URL that twine printed above.
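
Besides looking at the project page in a browser, you can also ask PyPI's JSON API which version it currently lists for a project. The following is a hedged sketch using only the standard library; the function names are my own, and fetch_latest_version needs network access, so it is defined here but not called.

```python
import json
import urllib.request


def pypi_json_url(project, index="https://pypi.org"):
    """Build the JSON API URL for a project on the given index."""
    return f"{index}/pypi/{project}/json"


def fetch_latest_version(project, index="https://pypi.org", timeout=10):
    """Return the latest version the index reports (requires network access)."""
    url = pypi_json_url(project, index)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = json.load(response)
    return data["info"]["version"]


# Only the URL is built here; call fetch_latest_version("redditmon") yourself
# when you are online.
print(pypi_json_url("redditmon"))
```

Passing index="https://test.pypi.org" should point the same check at the Test PyPI instance.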

Using Your Project

Woohoo, finally! You are all set to install your package using pip with no special flags or requirements.

┬─[chris@chris-lt01:~]─[03:39:34 PM]
╰─>$ pip install redditmon
Collecting redditmon
  Downloading redditmon-1.0.1-py3-none-any.whl (4.3 kB)
Collecting praw
  Using cached praw-7.1.0-py3-none-any.whl (152 kB)
Collecting update-checker>=0.17
  Using cached update_checker-0.18.0-py3-none-any.whl (7.0 kB)
Collecting prawcore<2.0,>=1.3.0
  Using cached prawcore-1.5.0-py3-none-any.whl (15 kB)
Collecting websocket-client>=0.54.0
  Using cached websocket_client-0.57.0-py2.py3-none-any.whl (200 kB)
Collecting requests>=2.3.0
  Using cached requests-2.25.1-py2.py3-none-any.whl (61 kB)
Collecting six
  Using cached six-1.15.0-py2.py3-none-any.whl (10 kB)
Collecting idna<3,>=2.5
  Using cached idna-2.10-py2.py3-none-any.whl (58 kB)
Collecting urllib3<1.27,>=1.21.1
  Using cached urllib3-1.26.3-py2.py3-none-any.whl (137 kB)
Collecting chardet<5,>=3.0.2
  Using cached chardet-4.0.0-py2.py3-none-any.whl (178 kB)
Collecting certifi>=2017.4.17
  Using cached certifi-2020.12.5-py2.py3-none-any.whl (147 kB)
Installing collected packages: idna, urllib3, chardet, certifi, requests, update-checker, prawcore, six, websocket-client, praw, redditmon
Successfully installed certifi-2020.12.5 chardet-4.0.0 idna-2.10 praw-7.1.0 prawcore-1.5.0 redditmon-1.0.1 requests-2.25.1 six-1.15.0 update-checker-0.18.0 urllib3-1.26.3 websocket-client-0.57.0

Wow, that's it? Yep! Let's view the redditmon help by launching it with the --help flag (defined with argparse):

┬─[chris@chris-lt01:~]─[03:39:45 PM]
╰─>$ redditmon --help
usage: redditmon [-h] [-r refresh] [-c config] subreddit

A Janky CLI Tool to Monitor a Subreddit

positional arguments:
  subreddit   The name of the subreddit you want to view

optional arguments:
  -h, --help  show this help message and exit
  -r refresh  The amount of time in seconds between refreshes (default 10)
  -c config   The file path to the config file (defaults to ~/redditmon.config.ini)
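
For the curious, a parser along these lines would produce help text similar to the above. To be clear, this is a reconstruction from the help output, not a copy of the real get_cli_args() in redditmon: the argv parameter and the dest names are my assumptions, and newer Python versions label the section "options:" instead of "optional arguments:".

```python
import argparse
from os.path import expanduser


def get_cli_args(argv=None):
    """Parse redditmon-style arguments (argv=None means use sys.argv)."""
    parser = argparse.ArgumentParser(
        prog="redditmon",
        description="A Janky CLI Tool to Monitor a Subreddit",
    )
    parser.add_argument(
        "subreddit",
        help="The name of the subreddit you want to view",
    )
    parser.add_argument(
        "-r",
        dest="refresh",
        metavar="refresh",
        type=int,
        default=10,
        help="The amount of time in seconds between refreshes (default 10)",
    )
    parser.add_argument(
        "-c",
        dest="config",
        metavar="config",
        default=expanduser("~/redditmon.config.ini"),
        help="The file path to the config file (defaults to ~/redditmon.config.ini)",
    )
    return parser.parse_args(argv)


# Parse an explicit argument list instead of the real command line.
args = get_cli_args(["networking", "-r", "30"])
print(args.subreddit, args.refresh, args.config)
```

Accepting an explicit argv list makes the function easy to try out from a REPL or a test without touching sys.argv.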

Let's make sure we have a config file (remember from above that you need to manually populate this with your own API credentials using this example):

┬─[chris@chris-lt01:~]─[03:42:01 PM]─[V:redditmon-venv]
╰─>$ ls ~/redditmon.config.ini 
/home/chris/redditmon.config.ini
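
If you are wondering how a file like this gets consumed, here is a minimal sketch using configparser from the standard library. The section name (reddit) and the key names are assumptions for illustration only; check the example config in the repository for the layout redditmon really expects. The keys mirror the client_id, client_secret, and user_agent arguments that praw.Reddit accepts.

```python
import configparser
from pathlib import Path


def load_reddit_credentials(config_path):
    """Read API credentials from an INI file (hypothetical layout)."""
    path = Path(config_path).expanduser()
    parser = configparser.ConfigParser()
    # read() silently skips missing files, so check what was actually loaded.
    if not parser.read(path, encoding="utf-8"):
        raise FileNotFoundError(f"config file not found: {path}")

    section = parser["reddit"]  # hypothetical section name
    return {
        "client_id": section["client_id"],
        "client_secret": section["client_secret"],
        # Fall back to a generic user agent if none is configured.
        "user_agent": section.get("user_agent", "redditmon"),
    }
```

With praw installed, the resulting dictionary could then be passed along as praw.Reddit(**credentials).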

And now we can launch our program with redditmon <subreddit_name>!

The Current Top Post on /r/sysadmin is:
"Thickheaded Thursday - January 28, 2021" submitted by user "AutoModerator" with 6 Upvotes

Howdy, /r/sysadmin!

It's that time of the week, Thickheaded Thursday!  This is a safe (mostly) judgement-free environment for all of your questions and stories, no matter how silly you think they are.  Anybody can answer questions!  My name is AutoModerator and I've taken over responsibility for posting these weekly threads so you don't have to worry about anything except your comments!

View This Post on Reddit at this URL: https://old.reddit.com/r/sysadmin/comments/l6t4ff/thickheaded_thursday_january_28_2021/

Final Notes

Thanks for sticking with this whole post (or just skipping to the bottom); it's quite a long one. This isn't an exhaustive guide on Python packaging by any means, but it should help you get started. For more information, check out the official Python packaging tutorial, which helped me out a lot when learning how to do this. Let me know if you found this helpful, despised it, or have any questions!