Introduction
Structural Pattern Matching is a relatively new feature, introduced with Python 3.10. For those of you who programmed in languages like C or Java, the new feature resembles the switch statement, yet it is more powerful and flexible.
The feature incorporates itself in the code by the following syntax:
match expression:
case condition_1:
# do something
case condition_2:
# do something else
...
case condition_N:
# do something yet else
Let's take a look at a simple example.
Commandline tool
Let's create a simple program which calculates the frequency of used characters in a piece of text. For example, for the text
one, two, three
The output should be
e: 0.2317
o: 0.1548
,: 0.1548
t: 0.1548
n: 0.0779
w: 0.0779
h: 0.0779
r: 0.0779
The program can be started in two ways:
-
With free text
freqcount [the text to analyze goes here] -
With a file flag
freqcount -f [path/to/file]
Here's the code:
#!/usr/bin/env python3
from collections import Counter
import sys
FrequencyByChar = dict[str, float]
def freqcount(text: str) -> FrequencyByChar:
# Here, a 'word' denotes a contiguous piece of text. For example: 'one,' (with the comma)
# is treated as one word.
text_without_whitespace = ''.join(text.lower().split())
num_chars = len(text_without_whitespace)
return {
char: num_occurences / num_chars
for char, num_occurences
in Counter(text_without_whitespace).items()
}
def output_result(freqs: FrequencyByChar) -> None:
"""
We need to sort the output by frequency before printing.
A dictionary is an unorded data structure.
"""
sorted_items = sorted(
freqs.items(),
# "Items" are tuples: (character, frequency)
key=lambda item: item[1],
reverse=True
)
for char, freq in sorted_items:
print(f'{char}: {freq:.4}')
def report_frequency(text: str) -> None:
freqs = freqcount(text)
output_result(freqs)
def main() -> None:
args = sys.argv[1:] # The first entry is always the name of the program
match args:
case ['-f', filename]:
with open(filename) as f:
text = f.read()
case [*words]:
text = ''.join(words)
report_frequency(text)
if __name__ == '__main__':
main()
The interesting part happens in the main function. Instead of running a series of "if" statements to determine the length of the input, whether it starts with '-f' or to handle edge cases like
freqcount -f words to count here
In the example, the entire process of args-parsing is fully declarative and has no nested logic. Also, notice that the input array got 'decomposed' - in the first case, the filename given by user got assigned to the 'filename' variable. In the second case - all entered words got stored inside the 'words' array. Doing that with if-statements clearly requires more code.
What can be matched?
As shown in the previous example, match is a powerful feature capabla to handle complex data types. You can use structural pattern matching on many different data types. Let me enumerate a few.
OR Patterns
You can use the pipe (|) operator to allow any of these two patterns. For example:
match status_code:
case 400 | 404:
print('bad')
case 200 | 201 | 202:
print('ok')
case 500:
print('error')
You can also assign the result of the conditional matching to a variable. This will be presented in the next example.
Wildcard Pattern
You can use a pattern that will match anything that hasn't been matched before (by any previous pattern)
Example:
cmd = input()
match cmd.split():
case ['quit']:
print('bye')
# 'as' keyword causes the matched value to be stored in the 'direction' variable.
case ['go', ('up' | 'down' | 'left' | 'right') as direction]:
print(f'going {direction}')
# Here we added a 'guard' (if expression), will only match strings from the list
case [str(direction)] if direction in ['up', 'down', 'left', 'right']:
print(f'going {direction}')
# _ is the wildcard pattern - it will match anything that hasn't been matched yet.
case _:
print("I don't understand.")
Classes
It is possible to match against classes. You can even match against certain attributes.
Let's define a bunch of classes (dataclasses would work as well):
class Point:
x: int
y: int
def __init__(self, x: int, y: int) -> None:
self.x = x
self.y = y
class Circle:
r: int
def __init__(self, r: int) -> None:
self.r = r
We can use those classes in case clauses.
Notice that the matching expression doesn't have to be a proper instance of the class.
Cases like Point(x=0) denote any Point instance having x=0
match shape:
case Point(x=0, y=0):
print('The origin')
case Point(x=0):
print('A point on the X axis')
case Point(y=0):
print('A point on the Y axis')
case Circle(r=r):
print(f'A circle with radius {r}')
case _:
print('Invalid data type.')
Dictionaries
You can also use dictionaries in 'case' clauses. It will match ALL dictionaries that meet all requirements
For example:
case {'firstname': firstname}
will match all dictionaries having the key 'firstname'.
case {'firstname': 'Chris', 'age': age}
will match all dictionaries having the key 'firstname' assigned with the value of Chris AND having the key age.
In below example, 'person' can either be a list of two elements or a dictionary with keys: firstname and lastname.
match person:
case [firstname, lastname]:
...
case {'firstname': firstname, 'lastname': lastname}:
...
Built-in types
Finally, it is possible to add additional constraints on the primitive type of the matched data. For example, let's assume that our program parses a list of arguments to form a command. One possible command is:
sleep [number of seconds]
To automatically handle cases where number of seconds is a string instead of an integer, we can handle it as follows:
match command:
case ['sleep', int(t)]:
time.sleep(t)
case ['sleep', str(t)]:
time.sleep(int(t))
This syntax might be difficult to understand at first. But let's break it down. The expression
case ['sleep', int(t)]
simply matches two-element lists, whose first element it the string literal 'sleep' and the second element
is an integer. In addition, the integer gets assigned to the variable t.
Practical application - Event Loop
Imagine that you created a video game. Let's focus on part of the code responsible for handling user events. Events that need to be handled include:
- Mouse clicks (Left / Right buttons are allowed)
- Character movements (Four basic directions)
- The Pause event to stop the game temporarily
- The Quit event
Using Structural Pattern Matching, the code for the loop can be tremendously simplified.
# custom module with types mirroring in-game events
import events
def event_loop(game):
while 1:
event = game.get_event()
match event:
case events.Click(button='L'):
quick_attack()
case events.Click(button='R'):
strong_attack()
case events.Move(direction=dir):
translate_character(direction)
case events.Pause:
pause()
case events.Quit:
quit_game()
Summary
Structural Pattern Matching is still a new feature, and so is not common among different codebases.
It does, nonetheless, yield tremendous benefits when it comes to code readability and compactness.
Can you think of any other scenarios where SPM comes in handy and unwinds a complex chain of if statements?