During your work, you might have encountered some custom homemade command-line script that was poorly documented, poorly tested (if at all), and hard to use for the end-user, no matter how important the task it performed was. Often, people either don’t use any command-line argument parsing, or the one they use is outdated. Don't get me wrong, I have nothing against getopt, for example, or the built-in argparse library in Python, but there are better alternatives.
Enter click (also known as the “Command Line Interface Creation Kit”): a comprehensive library, dare I say framework, for creating nice-looking CLI applications that practically document themselves. Click mostly uses decorators and functions to define commands and their arguments; the decorated functions are called automatically when a command is processed.
The nice part is that a click command is still usable as a regular function. This is an important feature: it makes testing CLI arguments not only possible but quite easy. On top of that, click provides a plethora of tools for making testing click applications far less painful than it would be otherwise.
Hopefully, after reading through this blog entry, you will be able to make awesome command-line applications that are well tested, well documented, and easy to use for the end-user. And even if only you will use the resulting script, you will thank your past self for leaving behind a better application.
If you are the kind of person who most often writes throw-away scripts that will be run once or twice, click might seem like overkill; however, the necessary code is still shorter than what manual argument parsing would require. And hey, maybe the script is not so throw-away after all, but by then months or years have passed, and you don’t even remember how your own script works.
The simplest use case for click is to write an application that takes a number of options and/or arguments, does something, and returns with a code between 0 and 255, where, by convention, zero means that the execution succeeded and a non-zero code means there was an error.
So for the example below, let’s make a simple script that takes a JSON filename as an input, reads it, sorts the output by a given key, and outputs the resulting JSON. Let’s call this CLI script json_sorter.py.
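Since the original listing is not shown here, the following is a sketch of what such a script might look like; the exact option names, defaults, and help texts are my assumptions, not canonical:

```python
# json_sorter.py -- a sketch of the script described above.
import json

import click


@click.command(help="Sort a JSON list of objects by a given key.")
@click.option("-i", "--input", "infile", type=click.File("r"), default="-",
              help="Input JSON file (defaults to the standard input).")
@click.option("-o", "--output", "outfile", type=click.File("w"), default="-",
              help="Output JSON file (defaults to the standard output).")
@click.option("-k", "--key", default="name", help="Key to sort the objects by.")
def main(infile, outfile, key):
    # Load the input, sort the list of objects by the given key and
    # write the result out as indented JSON.
    data = json.load(infile)
    json.dump(sorted(data, key=lambda obj: obj[key]), outfile, indent=4)
```

The `"-"` default makes `click.File` fall back to the standard input or output, which is what gives us the piping behavior shown below.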
We can then execute this script by giving it an input file and letting the script output the result to the console:
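For example (assuming the script ends with the usual `if __name__ == "__main__": main()` guard; the file name is made up):

```shell
$ python json_sorter.py --input unsorted.json --key name
```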
Or simply taking the input data from the standard input and displaying it on the standard output:
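Something along these lines:

```shell
$ cat unsorted.json | python json_sorter.py --key name
```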
Let’s study the above example and see what is happening. The first thing one might notice is that click uses decorators heavily. That is because when we define a command (represented by the function main), the click subsystem will call this function with the options and arguments passed to it.
In this case, we get an input and an output file that click opens for us automatically after checking whether they exist, are accessible, etc. This is quite convenient, since we don’t have to do these checks manually. But I digress.
The first decorator (click.command) declares that the main function will indeed be a click command. This decorator creates a new command object (see the class click.Command for further detail). The help argument is important because click will, unless otherwise specified, automatically add a help option (--help) to the script.
Next, we define three options: the input file, the output file, and the key to sort the JSON objects based on.
Let's look at the input option in a little more detail. We define a short option, a long option (--input), and infile as the name of the function argument this option is bound to. None of these is mandatory; as we can see, the key option does not spell out the parameter name explicitly. In such cases, click derives the name from the option names.
The default parameter specifies the default value. If it is not given, the default value will always be None. If we instead want click to produce an error for a missing option, we can use the required=True argument.
The type parameter tells the option how the raw value provided by the command line will be processed. If not given, simply the string value will be passed to the function. The type may be a callable, like a function, a class constructor, or a custom click type, like click.File, which will open the file after checking whether it exists, can be opened, etc.
The last common parameter is help which, like in the case of the command, will be the help string for the given command-line option. So far it’s rather easy, but there are a plethora of other options that allow you to finetune how options are processed. These are described in detail in the click documentation, which is very well-written and helpful.
Sometimes a script needs to have multiple commands that can be invoked separately. Maybe you have a script that can perform multiple operations on the same data, or maybe you want your entire code to have a single entry point.
To do this, we use the click.Group object. The group object is just a collection of commands that can be invoked from the command line simply by passing the command name. Let's consider the example json_ops.py:
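Here is a sketch of what json_ops.py might look like; the command and option names are assumptions based on the description that follows:

```python
# json_ops.py -- a sketch of the two-command script.
import json

import click


@click.group()
def main():
    pass


@main.command("key", help="Extract a key from a JSON object.")
@click.option("-i", "--input", "infile", type=click.File("r"), default="-")
@click.option("-o", "--output", "outfile", type=click.File("w"), default="-")
@click.option("-k", "--key", required=True, help="Key to extract.")
def key_cmd(infile, outfile, key):
    # Extract a single key from a JSON object.
    data = json.load(infile)
    json.dump(data[key], outfile, indent=4)


@main.command("list-keys", help="Extract a key from each object in a JSON list.")
@click.option("-i", "--input", "infile", type=click.File("r"), default="-")
@click.option("-o", "--output", "outfile", type=click.File("w"), default="-")
@click.option("-k", "--key", required=True, help="Key to extract.")
def list_keys_cmd(infile, outfile, key):
    # Extract the key from every object in a JSON list.
    data = json.load(infile)
    json.dump([obj[key] for obj in data], outfile, indent=4)
```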
This script has two commands that are very similar: the first one takes a JSON object and extracts the given key from it, while the second works on a list, extracts the given key from each object within it, and creates a list of those values.
The only two differences are the click.Group object that’s declared (its function will be called when we invoke the command), and that we use @main.command instead of @click.command to define the sub-commands. The first parameter (e.g., key) becomes the name of the sub-command.
Now, the above code has a lot of repetition; namely, the input, output, and key options are repeated across both sub-commands. With the key this is understandable: despite the similar name, it is not exactly the same semantically; in the first case, it is the key of the JSON object, while in the second, it is the key of the JSON objects inside the list. But how do we 'mark' the other options as common to both sub-commands? Using the click.group decorator, we can refactor the code:
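A sketch of the refactored script; note that result_callback() assumes click 8 or newer (older versions spelled it resultcallback()), and the option names remain my assumptions:

```python
import json

import click


@click.group()
@click.option("-i", "--input", "infile", type=click.File("r"), default="-")
@click.option("-o", "--output", "outfile", type=click.File("w"), default="-")
def main(infile, outfile):
    # Deliberately lazy: no work happens here.
    pass


@main.command("key")
@click.option("-k", "--key", required=True)
def key_cmd(key):
    # Return an operation instead of performing it.
    return lambda data: data[key]


@main.command("list-keys")
@click.option("-k", "--key", required=True)
def list_keys_cmd(key):
    return lambda data: [obj[key] for obj in data]


@main.result_callback()
def main_cb(operation, infile, outfile):
    # Called with the sub-command's return value plus the group's options.
    json.dump(operation(json.load(infile)), outfile, indent=4)
```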
And to invoke it:
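Note how the common options now come before the sub-command (file names made up):

```shell
$ python json_ops.py --input data.json key --key name
```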
Again, let's look at what's happening. First, we used click.group the same way we used click.command. In this case, the --input and --output options must be invoked before the sub-commands. The values of these options will be passed to the main function, which will be executed before the sub-commands themselves.
This is incidentally also the reason we want to delay doing any work as long as possible: the group function is called early on, even before the help option is evaluated. So unless we want to run something even before the help runs, we need to be as lazy as possible and only do the processing at the end; a philosophy click often follows itself.
For example, the outfile will not be opened - and thus created - until the first write operation is called on the file object. The behavior can be controlled by the lazy option: click.File(lazy=True) will delay the opening of the file as much as possible while click.File(lazy=False) will force the file to be opened immediately. If not specified, like in our example, click decides for itself when lazy makes or doesn’t make sense.
Now let's look at how the commands are defined in this refactored version. We don't perform the actual operation in the command; rather, we return a function that can process the data and return a value. In this case, we simply return a function that returns the given key of the passed dictionary. Later on, we will be able to use this function to execute whatever operation we want on the data.
To do this, we need to define how the results of the sub-command functions should be processed, which is what the group object's result_callback() decorator is for.
Here, we tell the main group that we want any results returned by the sub-commands after they were executed to be passed to the decorated main_cb function. This function will then be called with the result that the sub-command's function returned plus the arguments corresponding to the various options (infile and outfile). In our case, that means dumping the result object into a file (or the standard output, if the option wasn't given).
Here is where the lion’s share of the work happens: we load the infile content, execute the operation on the loaded data, and dump the result into the output file.
And naturally, the --help option works here also, both for the script and the individual commands:
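For instance (hypothetical invocations):

```shell
$ python json_ops.py --help
$ python json_ops.py key --help
```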
Contexts and user objects
Sometimes it's not enough to simply return something from the sub-commands; the work must be dealt with in the sub-command function itself. In such cases, we often need to pass some object we defined to the sub-commands. Enter click.Context. This context object holds the contextual information of the command execution: our user-defined objects, the configured options, the currently executed command, etc. For now, we are only interested in the user-defined object.
Let's rewrite the above example to use such an object. We want something that contains the input file, the output file, and a method that can transform that data and then immediately write it to the output file. So let this class be called UserContext:
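A sketch of such a class (the method name process is taken from the description below; the rest is my reconstruction):

```python
import json


class UserContext:
    """Holds the input/output files and performs the final processing."""

    def __init__(self, infile, outfile):
        self.infile = infile
        self.outfile = outfile

    def process(self, operation):
        # Load the input, run the given operation on the data and
        # immediately write the result to the output file.
        data = json.load(self.infile)
        json.dump(operation(data), self.outfile, indent=4)
```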
This is a really simple object: it has the two files as members and a process method that does the same as our post-processor in the previous example: it executes an operation on the loaded data and writes the result to the output file.
We need to initialize it in the group beforehand. Note that we are still lazy: we simply store the options we got but don't do anything with them yet. This object is then used from the commands:
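A sketch of both the initialization and the usage; the @click.pass_context and @click.pass_obj decorators are click's standard way of moving the context (and the object stored on it) around, and the UserContext class is repeated in compact form so the snippet stands alone:

```python
import json

import click


class UserContext:
    # Compact version of the class shown earlier.
    def __init__(self, infile, outfile):
        self.infile, self.outfile = infile, outfile

    def process(self, operation):
        json.dump(operation(json.load(self.infile)), self.outfile, indent=4)


@click.group()
@click.option("-i", "--input", "infile", type=click.File("r"), default="-")
@click.option("-o", "--output", "outfile", type=click.File("w"), default="-")
@click.pass_context
def main(ctx, infile, outfile):
    # Store our user object on the context; stay lazy, do no work yet.
    ctx.obj = UserContext(infile, outfile)


@main.command("key")
@click.option("-k", "--key", required=True)
@click.pass_obj
def key_cmd(user_ctx, key):
    # Instead of returning the operation, hand it to the user object.
    user_ctx.process(lambda data: data[key])
```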
The only thing different is that instead of returning the lambda, we pass it to the UserContext.process method, which will do everything else.
Now let's say we introduce a new sub-command called sum to the first example:
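A sketch of the new sub-command in the result-callback style, with the group scaffolding repeated so the snippet stands alone (names remain assumptions):

```python
import json

import click


@click.group()
@click.option("-i", "--input", "infile", type=click.File("r"), default="-")
@click.option("-o", "--output", "outfile", type=click.File("w"), default="-")
def main(infile, outfile):
    pass


@main.command("sum")
def sum_cmd():
    # The returned operation assumes the data is a list of numbers.
    return lambda data: sum(data)


@main.result_callback()
def main_cb(operation, infile, outfile):
    json.dump(operation(json.load(infile)), outfile, indent=4)
```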
This will simply add together any data within the JSON list. The function assumes that the data is a list containing numbers.
But what if we have a more complex JSON, one that is a list of objects, for example? We want to be able to get the key from a JSON object and calculate the sum of that, or maybe sum a given field across the objects. To do this, we'd need to either create a file for the intermediate steps or pipe the results together:
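Something like this (file and key names made up):

```shell
$ python json_ops.py --input data.json --output values.json list-keys --key price
$ python json_ops.py --input values.json sum

$ python json_ops.py --input data.json list-keys --key price | python json_ops.py sum
```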
Both of them are rather unwieldy and not easy to use. It would be nice to be able to invoke the commands together, like this:
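That is, a single invocation with the sub-commands chained one after another:

```shell
$ python json_ops.py --input data.json list-keys --key price sum
```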
The good news is that we don't have to do anything fancy, just modify a few functions. First, let's see what we need to do with the main function:
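A sketch of the modified group declaration (option names as before are assumptions):

```python
import click


@click.group(chain=True, invoke_without_command=True)
@click.option("-i", "--input", "infile", type=click.File("r"), default="-")
@click.option("-o", "--output", "outfile", type=click.File("w"), default="-")
def main(infile, outfile):
    # Still lazy: the real work happens in the result callback.
    pass
```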
Two things happen here: chain=True tells the group that multiple commands can be executed one after another, while invoke_without_command=True tells click that the group can be invoked without any sub-command at all. In our case, invoking the script without a sub-command will simply do nothing except copy the input to the output.
Now since we execute multiple commands instead of one, we don't just get the operation for one invoked sub-command, but all of them, in a list:
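Putting it all together, a sketch of the chained version; with chain=True the result callback receives a list of results, one per invoked sub-command (assumes click 8 or newer):

```python
import json

import click


@click.group(chain=True, invoke_without_command=True)
@click.option("-i", "--input", "infile", type=click.File("r"), default="-")
@click.option("-o", "--output", "outfile", type=click.File("w"), default="-")
def main(infile, outfile):
    pass


@main.command("key")
@click.option("-k", "--key", required=True)
def key_cmd(key):
    return lambda data: data[key]


@main.command("list-keys")
@click.option("-k", "--key", required=True)
def list_keys_cmd(key):
    return lambda data: [obj[key] for obj in data]


@main.command("sum")
def sum_cmd():
    return lambda data: sum(data)


@main.result_callback()
def main_cb(operations, infile, outfile):
    # `operations` is a list of callables, in invocation order; an
    # empty list simply copies the input to the output.
    data = json.load(infile)
    for operation in operations:
        data = operation(data)
    json.dump(data, outfile, indent=4)
```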
We simply execute the operations in the order we got them, and presto! We have exactly what we wanted at the beginning. Personally, I think it's a rather simple and elegant way of achieving such a versatile and useful result.
Options and arguments
Now let's talk about options and arguments. We have already seen some very basic use cases where we wanted two files to be opened and some string for the key names to be passed. However, options are way more powerful than this and can do many things.
First, let's differentiate options and arguments. Options typically look like --option value or for short options -o value. The order in which they are given does not matter, while arguments are given in a sequence after options have been processed. Take the following example:
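For example (a hypothetical invocation, with made-up names):

```shell
$ some-script --option opt1 -s opt2 arg1 arg2 arg3
```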
In this case --option opt1 and -s opt2 are the options and are defined by the previously demonstrated @click.option decorators, while arg1, arg2, and arg3 are arguments that are defined with @click.argument.
Let's take the simplest example (I'll be omitting the boilerplate from here on, so you'll have to do the imports and main invocations yourself):
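A sketch (despite the promise above, the import is included here so the snippet stands alone; let's call the script sum_numbers.py):

```python
import click


@click.command()
@click.argument("number", type=float)
def main(number):
    # With a single number, the "sum" is just the number itself.
    click.echo(number)
```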
If we ask for the help on this script, we'll see the following:
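Roughly this (the exact wording depends on the click version and the script name):

```shell
$ python sum_numbers.py --help
Usage: sum_numbers.py [OPTIONS] NUMBER

Options:
  --help  Show this message and exit.
```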
Here we only take a single, mandatory argument with the type float and simply print it. We can now calculate the sum of a single number, which is the number itself. So the script is already technically correct for the case where we have a single number, and we all know that technically correct is the best kind of correct!
But it is not very useful. Let's extend the example a little and allow zero numbers, too, in which case the sum is zero:
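A sketch of the extended version:

```python
import click


@click.command()
@click.argument("number", type=float, required=False)
def main(number):
    # A missing argument arrives as None; treat that as a sum of zero.
    click.echo(number if number is not None else 0.0)
```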
Ah, our old friend, required. By default, an argument is required, so we have to specify required=False if we want an optional argument. If the argument is not given, we get None again, so we have to make sure we print a float in both cases.
Now let's say we actually want this to be useful and accept an arbitrary number of arguments (zero included):
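A sketch using nargs:

```python
import click


@click.command()
@click.argument("numbers", type=float, nargs=-1)
def main(numbers):
    # `numbers` is a (possibly empty) tuple of floats.
    click.echo(sum(numbers))
```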
In this case, by default, no arguments are required. nargs tells the argument how many values are expected; -1 is a special case meaning any number of arguments. Incidentally, we could also add the required=True argument, so at least one argument must be passed.
There can be special kinds of options that come in handy in different situations.
Let's say we want to implement the above example, but instead of using arguments, we want to use options. Let's say that the option is -n / --number. In this case, we need a multiple option by using the multiple=True parameter:
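A sketch:

```python
import click


@click.command()
@click.option("-n", "--number", "numbers", type=float, multiple=True)
def main(numbers):
    # With multiple=True the values arrive as a tuple, empty by default.
    click.echo(sum(numbers))
```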
Note that unlike before, when no numbers were passed, we get an empty tuple instead of a None value, so we can use sum right away.
Another thing we might want is an option that needs no value, where we're only interested in whether it's present or not. This is called a flag, and it can be defined by two different methods. The first is the is_flag=True parameter:
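For example, a flag that switches the calculation from a sum to a product (a sketch; the flag name anticipates the sum-versus-product example below):

```python
import math

import click


@click.command()
@click.argument("numbers", type=float, nargs=-1)
@click.option("-p", "--product", is_flag=True,
              help="Calculate the product instead of the sum.")
def main(numbers, product):
    # `product` is simply True when the flag was given, False otherwise.
    click.echo(math.prod(numbers) if product else sum(numbers))
```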
Sometimes we want two options for the same value, one representing the true value, the other the false one. In the above case, we'd want a --sum option for producing sums and a --product option for producing products. This is the second method of defining a flag:
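A sketch using click's paired-flag syntax, where the two names are separated by a slash:

```python
import math

import click


@click.command()
@click.argument("numbers", type=float, nargs=-1)
@click.option("--product/--sum", "-p/-s", "product", default=False,
              help="Calculate the product (default: the sum).")
def main(numbers, product):
    # `product` is True for -p/--product and False for -s/--sum.
    click.echo(math.prod(numbers) if product else sum(numbers))
```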
If we specify -p or --product, the argument will be True and we get a product; if we specify nothing, or -s or --sum, the argument will be False and we get a sum.
Option and argument types
Now that we know a bit more about commands, groups, options, and arguments, let's talk about types. By default, when click parses an option or argument, the value is passed along as a string, without further processing. This is not always what we want, as we've seen previously.
Types come in two different "flavors": callables, which process the value and return the result, and dedicated type classes, which can do a bit more. Callables are simple: pass a function or type that takes a single value as a parameter and returns the processed result. Say we want to silently convert the numbers in the above example to None when they cannot be converted to a float (say, when four is passed instead of 4 to the option):
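A sketch (the helper name safe_float is mine):

```python
import click


def safe_float(value):
    # Silently turn anything that is not a valid float into None.
    try:
        return float(value)
    except ValueError:
        return None


@click.command()
@click.option("-n", "--number", "numbers", type=safe_float, multiple=True)
def main(numbers):
    # Skip the values that failed to convert.
    click.echo(sum(n for n in numbers if n is not None))
```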
The type classes, on the other hand, are far more useful and click comes with a lot of pre-defined types. Let's take a quick tour of these types.
The type Choice is for when there are one or more potential values that can be chosen. Anything that is not in the list of acceptable choices will produce an error. The sum or product example can be rewritten to use a choice instead of a flag option:
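A sketch of that rewrite (the option name -t/--type is an assumption):

```python
import math

import click


@click.command()
@click.argument("numbers", type=float, nargs=-1)
@click.option("-t", "--type", "operation",
              type=click.Choice(["sum", "product"], case_sensitive=False),
              default="sum")
def main(numbers, operation):
    # Anything other than "sum" or "product" is rejected by click.
    if operation.lower() == "product":
        click.echo(math.prod(numbers))
    else:
        click.echo(sum(numbers))
```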
In the above example, if we specify -t sum or -t product, we get the sum or the product of the numbers, respectively. The case_sensitive=False parameter tells Choice, as the name implies, to accept the values case-insensitively.
Other built-in types
Other built-in parameter types include
- click.DateTime parses dates or datetimes that, by default, follow the YYYY-MM-DD HH:MM:SS and similar formats; the accepted formats can be customized.
- click.UUID parses any valid UUID and converts it to a uuid.UUID object instance.
- click.File was shown in previous examples. It will try to lazily open a file for reading, writing, or both.
- click.Path will not open a file but performs various checks on the given parameter (e.g., whether the file exists, whether it is a file and not a directory, etc.).
In many cases, one can get by with these types, but sometimes we need to process the data in complex ways. For the simpler cases, as we previously saw, it is possible to use a function with a single parameter. However, implementing a new click parameter type in the "proper" way gives us far more control over how these options are processed.
Let's say that we want to extend our initial code to automatically read the JSON file from the given file option and pass the loaded data directly. On any errors we get, we want click to tell the user that the file was an invalid JSON file.
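A sketch of such a type (the class name JsonFile and the demo command around it are mine):

```python
import json

import click


class JsonFile(click.File):
    """A click.File that parses the opened file as JSON."""

    name = "json_file"

    def convert(self, value, param, ctx):
        # Let click.File do the opening and its usual checks first.
        fp = super().convert(value, param, ctx)
        try:
            return json.load(fp)
        except json.JSONDecodeError as exc:
            # self.fail raises a click-specific exception that is shown
            # to the user as a nicely formatted error message.
            self.fail(f"{value!r} is not a valid JSON file: {exc}", param, ctx)


@click.command()
@click.option("-i", "--input", "data", type=JsonFile("r"), default="-")
def main(data):
    # By this point `data` is already the parsed JSON value.
    click.echo(len(data))
```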
First, we inherit from the click.File type, since we want to work on readable JSON files. The JsonFile.convert method is called when the option is being converted. The value parameter is the unprocessed option value, and the param parameter is the option or argument instance (either a click.Option or a click.Argument). Finally, the ctx parameter is the currently active click execution context (a click.Context object).
Another interesting thing is the call to self.fail, which will raise a click-specific exception. This exception will then be used to display an error if the input is not a valid JSON file, e.g.:
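Roughly like this (the exact wording depends on the click version):

```shell
$ python json_ops.py --input broken.json key --key name
Usage: json_ops.py [OPTIONS] COMMAND [ARGS]...
Try 'json_ops.py --help' for help.

Error: Invalid value for '-i' / '--input': 'broken.json' is not a valid JSON file: ...
```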
Arguments and environment variables
Occasionally we run across a situation where we might want to allow the user to give certain (or all) arguments as environment variables. Or maybe we have some option that doesn't change much, and we want to set an environment variable once and then use the script without specifying the same parameter over and over again.
To show this, let's write a small script that takes a username, processes the /etc/passwd file, and prints out the shell used by the user. As a sidenote: if you are not familiar with UNIX-based systems, the /etc/passwd file is a colon-separated file that contains, among other things, the user names, their unique numeric identifier, and the shell that starts when the user logs in and starts a session. It is very similar to a bog-standard CSV file.
In Python, a handy built-in module can look up users in this text file: the pwd module. The function we want is pwd.getpwnam, which looks up an entry based on the name and returns an object with the required data. For further information, read the official Python documentation.
At this point, we should be able to create the required program with what we have learned so far:
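A sketch (pwd is POSIX-only, so this script won't run on Windows; the script name getuser.py is an assumption):

```python
# getuser.py -- print the login shell of the given user.
import pwd

import click


@click.command()
@click.argument("username")
def main(username):
    try:
        entry = pwd.getpwnam(username)
    except KeyError:
        raise click.UsageError(f"no such user: {username}")
    click.echo(entry.pw_shell)
```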
This will show the shells for whatever users we want. When I run this app, I get output along these lines:
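The user names and shells here are made up for illustration:

```shell
$ python getuser.py alice
/bin/zsh
$ python getuser.py root
/bin/bash
```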
So far, so boring, but what if we usually only want to check our own shell, for whatever reason? Well, on POSIX systems there is a USER environment variable holding the current user’s name. It would be just peachy if we could use this environment variable as the default. We could, of course, use the default=os.environ["USER"] parameter; however, that resolves the environment variable too early, when the module is loaded. So let's tell click to use the environment variable when possible:
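A sketch using the envvar parameter, which click resolves at invocation time:

```python
import pwd

import click


@click.command()
@click.argument("username", envvar="USER")
def main(username):
    # The argument falls back to the USER environment variable when
    # it is not given on the command line.
    click.echo(pwd.getpwnam(username).pw_shell)
```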
This way, if I don't give a user name, I will get my own shell:
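The shell shown is made up for illustration:

```shell
$ python getuser.py
/bin/zsh
```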
My usual go-to use case for this is when I have some parameter that is usually left to use the default value, but occasionally I want to change it. Maybe I'm passing a configuration to my script. This script is always the same, but when I'm testing, I want to use an alternate configuration. Adding a parameter every time I invoke the script would be cumbersome, so I can simply just set the environment variable.
In this case, by default, we use the ~/.script-config configuration file, but we can override it with an option or an environment variable:
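A sketch (the SCRIPT_CONFIG variable name is an assumption):

```python
import click


@click.command()
@click.option("--config", envvar="SCRIPT_CONFIG",
              type=click.Path(), default="~/.script-config",
              help="Configuration file to use.")
def main(config):
    # Resolution order: --config, then $SCRIPT_CONFIG, then the default.
    click.echo(f"using configuration: {config}")
```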
Prompting by default
As if what options can do for us automatically wasn't enough, we can also tell click to prompt the user for a value. Say we wanted to ask the user for the username if one is not specified.
This is also easily done by click, using the prompt parameter for click.option:
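A sketch; with prompt=True, click derives the prompt text from the option name:

```python
import click


@click.command()
@click.option("--username", prompt=True)
def main(username):
    # If --username is omitted, click asks "Username: " interactively.
    click.echo(f"hello, {username}")
```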
Password prompting is also possible without much ado: simply use the hide_input=True parameter:
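A sketch; the typed value is not echoed back to the terminal:

```python
import click


@click.command()
@click.option("--password", prompt=True, hide_input=True)
def main(password):
    # Don't echo the password itself, of course.
    click.echo(f"got a password of length {len(password)}")
```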
Automatic shell completion
This feature is my favorite, even though I don't use it that much. But when you have a big script with lots of options that you don't use frequently enough to memorize and don't want to reach for the documentation every time you need to know something, it comes in handy to have some sort of autocomplete functionality for your script.
Writing such functionality manually is not a straightforward task, but luckily, click provides a way to do it. Currently, three shells are supported: bash, fish, and zsh. Since bash is the most common shell, we'll use that as an example.
Before we go on, I have to mention a limitation of the autocomplete feature: it only seems to work when the script is installed; for example, when we have a script named getuser installed to a path listed in the PATH environment variable.
Calling the script with a special environment variable set will generate an autocompletion script:
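With click 8, the variable is the script name, upper-cased, with a _COMPLETE suffix, and its value names the shell (older click versions used a plain `source` value instead):

```shell
$ _GETUSER_COMPLETE=bash_source getuser > ~/.getuser-complete.bash
```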
The generated script then can either be stored in a file or generated each time it's needed in the .bashrc file:
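For example:

```shell
# In ~/.bashrc: either source the stored completion script...
. ~/.getuser-complete.bash
# ...or regenerate it every time a new shell starts:
eval "$(_GETUSER_COMPLETE=bash_source getuser)"
```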
Even better, we can write our own completions rather easily. Let's say we only want to autocomplete files with a certain extension. So in the json_ops we'd want to autocomplete directory names and file names ending with .json:
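A sketch of such a type, assuming click 8's shell_complete API (the class name JsonPath is mine, and this simplified version skips the single-directory special case discussed below):

```python
import os

import click
from click.shell_completion import CompletionItem


class JsonPath(click.Path):
    """A Path type that completes directories and *.json files."""

    def shell_complete(self, ctx, param, incomplete):
        # Split what the user typed so far into a directory part and
        # a file-name prefix.
        base, prefix = os.path.split(incomplete)
        directory = base or "."
        try:
            names = os.listdir(directory)
        except OSError:
            return []
        completions = []
        for name in sorted(names):
            if not name.startswith(prefix):
                continue
            path = os.path.join(base, name)
            if os.path.isdir(os.path.join(directory, name)):
                # Directories are not final answers, but we offer them
                # so the user can descend into them.
                completions.append(CompletionItem(path + os.sep))
            elif name.endswith(".json"):
                completions.append(CompletionItem(path))
        return completions
```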
Option autocompletions are handled in the click type classes.
The shell_complete() method will return all potential completed values based on the incomplete parameter. This will be the part of the argument that has been so far typed, a prefix for all potential values.
So what does the above do? Firstly, we separate the path into a directory name (base) and a filename (prefix), then we collect all files in the base directory that start with the prefix and are either a directory or a file. If the resulting list of files only contains a single directory, we need to treat that as a prefix since a directory is not a valid "solution" and needs to be further processed.
While this method may seem a bit convoluted, it's still a lot simpler than doing it manually. And in most cases, we don't need to use it much, but it's always a good thing to keep in mind.
Another thing I often recommend is creating a setup.py for the script. Setup files can seem a bit overwhelming at times, but only a few things are needed: a name for the project, a version, a Python package or module to install, and potentially the dependencies.
Integrating a script using click into this is really easy: same as it would be otherwise. Let's consider this setup.py file for our json_ops application:
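A minimal sketch (the project name and version are assumptions):

```python
# setup.py
from setuptools import setup

setup(
    name="json-ops",
    version="1.0.0",
    py_modules=["json_ops"],
    install_requires=["click"],
    entry_points={
        "console_scripts": [
            "json-ops = json_ops:main",
        ],
    },
)
```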
The interesting part is entry_points. What entry points are is beyond the scope of this blog post, but this blog post explains them in detail. For our purposes, what matters is that anything in the console_scripts section is converted on installation into a script: the "json-ops = json_ops:main" entry point means we get an executable named json-ops on the executables path (e.g., /usr/bin if we installed it as root, or ~/.local/bin if we installed it as a user).
Like all things, click has some limitations. One of the most apparent I have encountered so far is the lack of mutually exclusive option groups.
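For illustration, here is the kind of workaround one typically ends up with: two ordinary options plus a hand-rolled exclusivity check inside the command (a sketch; only the option names come from the scenario):

```python
import click


@click.command()
@click.option("--path", type=click.Path())
@click.option("--file", "file_", type=click.Path())
def main(path, file_):
    # click has no declarative mutual-exclusion support, so the check
    # has to be done by hand inside the command itself.
    if path and file_:
        raise click.UsageError("--path and --file are mutually exclusive")
    click.echo(path or file_ or "nothing given")
```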
Sadly, allowing either --path or --file to be specified, but not both at the same time, is currently not possible with the built-in click tools; however, there are third-party libraries built on click that provide similar functionality or ways around this problem.
And that's it, folks
While there is a lot more to click that we haven't covered, like user input handling, exception handling, or the various utility functions, the scope of a single blog post is not enough for this very versatile library. One can implement almost any script, small or large, using click. And with the provided tools, making professional-looking scripts, even ones that are only ever used once or twice, is easy.
Personally, nowadays, I don't use anything else. Working with argparse or getopt is not any simpler, and they provide no extra functionality. And working directly with sys.argv is not recommended, since you will always end up reinventing the wheel. Or rather, click.