What is Type Hinting

Python is a dynamically typed language. This means that a variable can hold a value of any type (a float, str, dict, etc.), and that type can change at any time.

var = 123
var = "spam"

Generally, in statically typed languages like C, a variable can only ever be one type, and your compiler will refuse to compile your code if this isn’t followed.

int var;
var = 123;
var = "spam";
// this will cause a compilation error

While dynamic typing provides a ton of flexibility and makes Python easy to pick up and use, it can often hide issues in your code that will only appear at runtime.

def add_two(val):
    return val + 2

add_two("eggs")
# perfectly valid Python code

> python temp.py
Traceback (most recent call last):
  File "temp.py", line 4, in <module>
    add_two("eggs")
  File "temp.py", line 2, in add_two
    return val + 2
TypeError: can only concatenate str (not "int") to str

Static analysis tools can’t really help either, as they would effectively have to execute your code in order to check for any issues.

To help alleviate this pain, Python 3.5 introduced the concept of type hinting. Type hints are annotations in your code that help static analysis tools check for errors before they occur, by indicating what type a variable is expected to be.

Basic Usage

Taking our example from before, the function expects a variable that is a number, and returns a new number. With type hints, this looks like:

def add_two(val: float) -> float:
    return val + 2

add_two("eggs")

Now if we run a static analysis tool such as pyright (the engine behind Pylance), we can see our potential type issue (of adding a number to a string) while never having to actually execute our code.

# output shortened for clarity
> pyright temp.py
...
temp.py
  temp.py:4:9 - error: Argument of type "Literal['eggs']" cannot be assigned to parameter "val" of type "float" in function "add_two"
    "Literal['eggs']" is incompatible with "float" (reportGeneralTypeIssues)
1 error, 0 warnings, 0 infos

Great! Now if we change the function call to use a number, we get no errors.

def add_two(val: float) -> float:
    return val + 2

add_two(123)

> pyright temp.py
...
0 errors, 0 warnings, 0 infos

Even though the type hint is a float and 123 is an int, pyright knows that this is fine. PEP 484 specifies that an int is acceptable wherever a float is expected, as an int can always be turned into a float.
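As a quick sketch of that rule (the function names half and repeat are made up for illustration): an int satisfies a float hint, but the reverse does not hold, so a checker such as pyright should flag a float passed where an int is expected.

```python
def half(val: float) -> float:
    # An int argument is accepted here by type checkers.
    return val / 2


def repeat(text: str, times: int) -> str:
    # A float for `times` would be flagged by a type checker,
    # and would also fail at runtime.
    return text * times


print(half(3))            # 1.5
print(repeat("spam", 2))  # spamspam
```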

Multiple Types

Now, what if we have a function that can accept multiple types? Take a look at this more complicated example:

from typing import Union

def print_info(data: Union[str, dict]) -> None:
    if isinstance(data, str):
        print(f"Given data is {data}")
    elif isinstance(data, dict):
        print("Given data is:")
        for key, value in data.items():
            print(f"{key}: {value}")

print_info("spam") # Given data is spam
print_info({"foo": "bar"}) # Given data is:
                           # foo: bar

In this example, there are a lot of things going on. First, typing.Union with square brackets is how we specify that an argument may be any of the given types. Additionally, now the return type hint is None as the function doesn’t return any values. So what happens if we run pyright?

> pyright temp.py
...
0 errors, 0 warnings, 0 infos

Once again, no errors. This also shows another interesting thing: pyright is smart enough to realize that code nested under an isinstance() check restricts the variable to that type (this is known as type narrowing). Without this intelligence, it would complain about the line for key, value in data.items():, because data could be a string, which does not have an .items() method.
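The same narrowing applies to other checks, such as comparing against None or returning early. Here is a minimal sketch, assuming a made-up describe function; the comments describe what a checker like pyright is expected to infer in each branch.

```python
from typing import Union


def describe(data: Union[str, dict, None]) -> str:
    if data is None:
        # Here `data` is narrowed to None.
        return "nothing"
    if isinstance(data, str):
        # Here `data` is narrowed to str, so str methods are safe.
        return data.upper()
    # Only dict remains, so .items() is safe.
    return ", ".join(f"{key}={value}" for key, value in data.items())


print(describe(None))            # nothing
print(describe("spam"))          # SPAM
print(describe({"foo": "bar"}))  # foo=bar
```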

Any

Now let’s say your function doesn’t have different print statements based on the type of the variable, because it can handle anything. This can conveniently be typed with typing.Any.

from typing import Any

def print_info(data: Any) -> None:
    print(f"Given data is {data}")

print_info("spam") # Given data is spam
print_info({"foo": "bar"}) # Given data is {'foo': 'bar'}

This tells your type checker that literally any type is a valid input. Use with caution, but this is safe to use for functions that just print something, or convert it to a string, since any Python variable should be able to do this [1].
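One hedged aside: Any also switches off checking for everything you do with the value inside the function. If the function only needs operations that every object supports (such as formatting), annotating with object is a stricter alternative. This is a suggestion on top of the example above, and the function names are made up for illustration.

```python
from typing import Any


def describe_any(data: Any) -> str:
    # With Any, a checker would also accept data.no_such_method() here.
    return f"Given data is {data}"


def describe_object(data: object) -> str:
    # With object, only operations valid for every object are allowed,
    # so a checker would reject data.no_such_method().
    return f"Given data is {data}"


print(describe_any("spam"))     # Given data is spam
print(describe_object("spam"))  # Given data is spam
```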

Overloads

Let’s say your function doesn’t return None, but rather returns the type it was given. You would think that you would put a Union on the argument and another Union on the return value, like so.

from typing import Union

def return_data(data: Union[str, dict]) -> Union[str, dict]:
    return data

return_data("spam")
return_data({"foo": "bar"})

While on a surface level this looks okay, and pyright doesn’t raise any errors, you’ll quickly get type errors if you try to do something with the return data.

from typing import Union

def return_data(data: Union[str, dict]) -> Union[str, dict]:
    return data

a = return_data("spam")
print(a[1:4]) # pam
b = return_data({"foo": "bar"})
print(b["foo"]) # bar

> pyright temp.py
...
temp.py
  temp.py:9:7 - error: Argument of type "Literal['foo']" cannot be assigned to parameter "i" of type "int | slice" in function "__getitem__"
    Type "Literal['foo']" cannot be assigned to type "int | slice"
      "Literal['foo']" is incompatible with "int"
      "Literal['foo']" is incompatible with "slice" (reportGeneralTypeIssues)
1 error, 0 warnings, 0 infos

While the error is pretty confusing, what’s really happening is that pyright knows that the output of return_data can be either a str OR a dict. So on line 9, where we get the key "foo" from a dict, pyright is also considering the possibility that you’re trying to slice a string (like line 7) with another string, which is not allowed.

To fix this, we use typing.overload and a bit of syntactic sugar to tie the input type to the output type.

from typing import overload, Union

@overload
def return_data(data: str) -> str: ...

@overload
def return_data(data: dict) -> dict: ...

def return_data(data: Union[str, dict]) -> Union[str, dict]:
    return data

a = return_data("spam")
print(a[1:4]) # pam
b = return_data({"foo": "bar"})
print(b["foo"]) # bar

> pyright temp.py
...
0 errors, 0 warnings, 0 infos

Note here that you still need to create a Union in the actual function declaration with all the possible input types.
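For a function like this one, where the return type is always exactly the input type, a constrained typing.TypeVar is another way to tie the two together. This is an alternative technique rather than a replacement for overloads (overloads are still needed when the output type differs from the input type); the sketch below is a minimal version of the same return_data function.

```python
from typing import TypeVar

# T may be exactly str or exactly dict; whichever goes in comes out.
T = TypeVar("T", str, dict)


def return_data(data: T) -> T:
    return data


a = return_data("spam")
print(a[1:4])  # pam
b = return_data({"foo": "bar"})
print(b["foo"])  # bar
```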

Literals

Next, how about a function that only accepts a specific set of values? You don’t want to put a blanket float or str type, so you can be more specific with typing.Literal [2].

from typing import Literal

def process(mode: Literal["choice1", "choice2"]) -> None:
    if mode == "choice1":
        print("Green eggs and SPAM")
    elif mode == "choice2":
        print("Green eggs and ham")

process("choice1")
process("choice3")

> pyright temp.py
...
temp.py
  temp.py:10:9 - error: Argument of type "Literal['choice3']" cannot be assigned to parameter "mode" of type "Literal['choice1', 'choice2']" in function "process"
    Type "Literal['choice3']" cannot be assigned to type "Literal['choice1', 'choice2']"
      "Literal['choice3']" cannot be assigned to type "Literal['choice1']"
      "Literal['choice3']" cannot be assigned to type "Literal['choice2']" (reportGeneralTypeIssues)
1 error, 0 warnings, 0 infos

You can see that Literal acts as a built-in Union. You don’t need to do Union[Literal["choice1"], Literal["choice2"]].
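Since a Literal hint only helps the type checker, values that arrive at runtime (user input, config files) still need a real check. A minimal sketch, assuming a made-up Mode alias and parse_mode helper: typing.get_args (Python 3.8+) returns the allowed values, so the list only has to be written once.

```python
from typing import Literal, get_args

Mode = Literal["choice1", "choice2"]


def parse_mode(raw: str) -> Mode:
    # get_args(Mode) returns the tuple of allowed values.
    for option in get_args(Mode):
        if raw == option:
            return option
    raise ValueError(f"unknown mode: {raw!r}")


print(get_args(Mode))         # ('choice1', 'choice2')
print(parse_mode("choice2"))  # choice2
```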

Classes

You’re also completely free to use a class as a type hint:

class Car:
    def __init__(self) -> None:
        self.tank = 0

def add_gas(car: Car) -> None:
    car.tank += 20

car = Car()
add_gas(car)
print(car.tank) # 20

However, in some cases (mainly in return types), the name used in a type hint may actually be defined after the type hint itself, which causes an issue. In the Python versions used here, a function’s type hints are evaluated when its def statement runs, so a hint that refers to a name that does not exist yet raises a NameError. A simple demonstration of this is to flip the order of the function and class:

def add_gas(car: Car) -> None:
    car.tank += 20

class Car:
    def __init__(self) -> None:
        self.tank = 0

car = Car()
add_gas(car)
print(car.tank) # 20

> python temp.py
Traceback (most recent call last):
  File "temp.py", line 1, in <module>
    def add_gas(car: Car) -> None:
NameError: name 'Car' is not defined

pyright reports the same error as well:

> pyright temp.py
...
temp.py
  temp.py:1:18 - error: "Car" is not defined (reportUndefinedVariable)
1 error, 0 warnings, 0 infos

Thankfully, there’s an easy fix without needing to reorganize your code. The first option is to wrap the type hint in quotes to make it a string. This way, Python has nothing to evaluate, while a type checker still knows to look for a class matching the string (this is why you must use typing.Literal for actual strings).

def add_gas(car: "Car") -> None:

The second and preferred option is to add

from __future__ import annotations

to the top of your file(s). This tells Python to store type hints as strings instead of evaluating them when the function is defined, so the class name only has to be resolvable later, after the whole file has been parsed.
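Either way, the hint stays unevaluated until something asks for it. The sketch below uses the quoted form and typing.get_type_hints to show the difference between the stored hint and the resolved class; it reuses the Car and add_gas names from above.

```python
from typing import get_type_hints


def add_gas(car: "Car") -> None:
    car.tank += 20


class Car:
    def __init__(self) -> None:
        self.tank = 0


# The hint is stored as plain text until something asks to resolve it.
print(add_gas.__annotations__["car"])  # Car

# get_type_hints() resolves the string now that Car exists.
print(get_type_hints(add_gas)["car"] is Car)  # True
```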

One last thing about classes. If your class is in a different file, and you’re only importing it for the sake of type hinting, you can place the import inside a check for typing.TYPE_CHECKING:

# car.py
class Car:
    def __init__(self) -> None:
        self.tank = 0

# gas_station.py
# without this, the Car hint below is evaluated at runtime and raises NameError
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from car import Car

def add_gas(car: Car) -> None:
    car.tank += 20

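TYPE_CHECKING is a special constant which is always False when code is run by the Python interpreter, but treated as True by type checkers. Because the import never happens at runtime, the hint itself must not be evaluated either, so pair this with one of the two options above (quotes or from __future__ import annotations). This is a great way to type hint functions without actually needing to import other files.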
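Here is a single-file sketch of the same idea, using the standard library’s fractions module as a stand-in for the other file and a made-up halve function. The quoted hints mean nothing is looked up at runtime.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only type checkers follow this import; the interpreter skips it.
    from fractions import Fraction


def halve(value: "Fraction") -> "Fraction":
    # Quoted hints are never evaluated at runtime, so the missing
    # runtime import of Fraction does not matter here.
    return value / 2


print(TYPE_CHECKING)  # False
```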

Variables

Thus far, we’ve been talking about how to type hint function arguments and return values. What about type hinting variables or class attributes? Well, you can do that with the same : syntax before the assignment of the variable or attribute. This is great to help prevent accidentally changing the type of a variable to something unexpected.

from typing import Union

class Car:
    def __init__(self) -> None:
        self.model: str = "5000"

    def set_model(self, model: Union[str, int]) -> None:
        self.model = model

> pyright temp.py
...
temp.py
  temp.py:8:14 - error: Cannot assign member "model" for type "Car"
    Expression of type "str | int" cannot be assigned to member "model" of class "Car"
      Type "str | int" cannot be assigned to type "str"
        "int" is incompatible with "str" (reportGeneralTypeIssues)
1 error, 0 warnings, 0 infos

If you don’t like that syntax, you can do the same thing with a # type: <hint> comment at the end of the line.

# these are functionally the same
self.model: str = "5000"
self.model = "5000" # type: str

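Variable hints are especially handy when the initial value says nothing about what will go in later, such as an empty list or None. A small sketch with made-up variable names; the commented-out lines are the ones a type checker would be expected to flag.

```python
from typing import Dict, List, Optional

# Without the hints, a checker cannot know what these will hold.
queue: List[str] = []
prices: Dict[str, float] = {}
winner: Optional[str] = None

queue.append("spam")
prices["eggs"] = 2.5
winner = "ham"

# A checker would flag each of these, so they are left commented out:
# queue.append(42)
# prices["eggs"] = "cheap"
# winner = 7
```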
Overrides

Sometimes pyright is simply wrong about something, or a 3rd party library isn’t typed correctly, and there is no way around it. This is a bit of a contrived example, but here’s such an instance:

# based on this example:
# http://docs.peewee-orm.com/en/latest/peewee/quickstart.html#model-definition
import peewee as pw

db = pw.SqliteDatabase("people.db")

class Person(pw.Model):
    name = pw.CharField()
    age = pw.FloatField()

    class Meta:
        database = db

person = Person(name="Nathan", age=99)

temp_age: float
temp_age = float(person.age)

> pyright temp.py
...
0 errors, 0 warnings, 0 infos

Pylance in VS Code, however, reports:

Type 'FloatField' cannot be assigned to type 'SupportsFloat | SupportsIndex | str | bytes | bytearray'

Strangely, this only occurs in VS Code for me, and not the command-line pyright tool.

In reality, this works fine, but pyright isn’t having it. Often, putting something like

assert isinstance(var, float)
# or
assert var is not None

in the preceding lines works great, but in this case, person.age is not a float, but a database FloatField which pretends to be a float. The only solution I’ve found to get the warning to go away is to add the comment # type: ignore to the end of the line.

temp_age = float(person.age) # type: ignore

Use with great caution, as this effectively hides all warnings of any kind from Pylance or pyright for that line. I generally consider this a last resort because, nearly always, I’ve typed something poorly or there is a legitimate possible bug.
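A narrower tool that sometimes helps is typing.cast, which overrides the inferred type of a single expression rather than silencing the whole line. The sketch below uses a made-up load_setting function as a stand-in for a poorly typed third-party call. It is not the peewee case above, and I have not checked whether cast satisfies Pylance there.

```python
from typing import Any, cast


def load_setting(name: str) -> Any:
    # Hypothetical stand-in for a poorly typed third-party call.
    return {"retries": 3}[name]


# cast() does nothing at runtime; it only tells the checker
# to treat the value as an int from here on.
retries = cast(int, load_setting("retries"))
print(retries + 1)  # 4
```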

Red Squiggly Driven Development

Hopefully by now, you can see the value of type hinting your Python code. Now, trying to make sure your code doesn’t have any possible type issues in a large codebase can be a bit difficult. You could click through every single file in VS Code with Pylance, or you could set up an automated job to check every pull request or commit as part of testing. pyright already returns an exit code of 0 for no issues, and other values for problems. This makes it work great for CI (continuous integration), where a non-zero exit code is almost always considered a failure.

You can pretty easily install the pyright tool with npm. You will also need to install all of your Python requirements, so that pyright can resolve your imports.
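If you want the same pass/fail gate locally (say, in a pre-push script), the exit code is all you need. A minimal sketch, assuming a made-up passes helper; the demo lines use the Python interpreter itself as a stand-in command so the snippet runs without pyright installed.

```python
import subprocess
import sys
from typing import List


def passes(command: List[str]) -> bool:
    # Any non-zero exit code counts as a failure, as in CI.
    return subprocess.run(command).returncode == 0


# Stand-in commands that exit with a known code.
print(passes([sys.executable, "-c", "raise SystemExit(0)"]))  # True
print(passes([sys.executable, "-c", "raise SystemExit(1)"]))  # False

# Real use would look like this, assuming pyright is on your PATH:
# ok = passes(["pyright"])
```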

GitHub Actions example:

name: Type Checking

on:
  workflow_dispatch:
  pull_request:
    branches:
      - main

jobs:
  type-checking:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v2

      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          # whatever Python version you want to use
          python-version: "3.9"

      - name: Install requirements
        run: python -m pip install -r requirements.txt

      - name: Install pyright
        run: sudo npm install -g pyright
        # specific node version doesn't matter, even the oldest node installed
        # on the latest Ubuntu agents is new enough for pyright

      - name: Run pyright
        run: pyright

Azure Pipelines example:

trigger: none
pr:
  - main

pool:
  vmImage: ubuntu-latest

steps:
  - task: UsePythonVersion@0
    inputs:
      versionSpec: "3.9"
    displayName: Setup Python

  - script: python -m pip install -r requirements.txt
    displayName: Install requirements

  - script: sudo npm install -g pyright
    displayName: Install pyright

  - script: pyright
    displayName: Run pyright

These CI workflows achieve what I like to call “Red Squiggly Driven Development”. Instead of, say, “Test Driven Development” or “Hype Driven Development”, pull requests cannot be merged until all red squiggles have been removed (see my previous post for how to turn on the red squiggles).

Caveats

To begin with, type hints are nothing but mere suggestions. The Python interpreter does nothing to actually enforce them; they are solely for the sake of the programmer. If you are interested in strict typing in Python, the Pydantic package is quite interesting. You can create classes with strictly typed attributes, or add a decorator to your existing functions to strictly type them as well.
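To make the "mere suggestions" point concrete, here is a small sketch with made-up function names. The hinted function happily accepts a string at runtime (and does not even crash), while the second version shows the kind of check you would have to write yourself, or get from a library, for real enforcement.

```python
def double(val: float) -> float:
    return val * 2


# The hints are just metadata stored on the function.
print(double.__annotations__["val"])  # <class 'float'>

# Nothing stops a str at runtime, and here it does not even crash.
print(double("ab"))  # abab


def checked_double(val: float) -> float:
    # Runtime enforcement has to be written (or generated) explicitly.
    if isinstance(val, bool) or not isinstance(val, (int, float)):
        raise TypeError(f"expected a number, got {type(val).__name__}")
    return val * 2
```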

Additionally, type checking is only as good as the type hints that you, the programmer, write. If you’re lazy and don’t write type hints for your functions, there’s (currently) no way for a type checker to be able to validate that there won’t be any type issues.

def add_two(val):
    return val + 2

add_two("eggs")

> pyright temp.py
...
0 errors, 0 warnings, 0 infos

Lastly, but most annoyingly, you may have to interact with certain libraries (particularly ones based on auto-generated code, cough, protobuf, cough) that don’t support type hints, which can make working with them a hell of # type: ignore statements. If you’re determined, you can create stub files [3] that define the type hints, or find a library that does it for you (for example, mypy-protobuf).

Conclusion

This is really just scratching the surface of type hinting. There are a ton of tricks, and lots of different ways you can type hint stuff for more complex functions and data structures. I highly recommend looking through the typing library documentation to learn more. For example, you can use typing.NewType to make “pseudo” types which can be helpful for things like units. Or typing.TypedDict to type very specific dictionary formats.
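As a taste of those two, here is a minimal sketch (Python 3.8+). The Liters, Gallons, and CarInfo names are made up for illustration, loosely following the Car example from earlier.

```python
from typing import NewType, TypedDict

# Distinct names for the checker, plain floats at runtime.
Liters = NewType("Liters", float)
Gallons = NewType("Gallons", float)


class CarInfo(TypedDict):
    model: str
    tank: Liters


def add_gas(car: CarInfo, amount: Liters) -> None:
    car["tank"] = Liters(car["tank"] + amount)


car: CarInfo = {"model": "5000", "tank": Liters(0.0)}
add_gas(car, Liters(20.0))
print(car["tank"])  # 20.0
# add_gas(car, Gallons(5.0)) would be flagged by a type checker.
```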

I truly hope this helps improve your Python code and make you a better programmer. It certainly has helped me reduce the errors in my code without needing to actually run it.

Footnotes


  1. Yes, in some extremely rare cases, this is not the case. One would have to override the __str__ or __repr__ functions of the type’s class to raise an exception.

  2. Only available in Python 3.8+, though typing-extensions helps backport this functionality to older versions.

  3. Ironic that Google has an article explaining the benefits of static type analysis for Python, but their own protobuf library doesn’t support it.