12  Writing Real Programs

Note: Core idea

Real programs have a clear structure: separate the I/O from the logic, handle errors specifically, log instead of print, parse arguments with argparse, write tests with pytest, and follow a small handful of Pythonic idioms. None of this is optional past about 200 lines of code.

In this chapter you will learn to:

  1. Lay out a working script around main() and if __name__ == "__main__":.
  2. Add type hints that help readers and tools without slowing the program.
  3. Write tests with pytest — including parametrized tests and fixtures.
  4. Recognize and apply the 15 idioms that make Python code feel native.

12.1 The script template

The shape every robust Python script settles into:

#!/usr/bin/env python3
"""
What this script does, what inputs it expects, what it outputs.
"""
import argparse
import json
import logging
import sys
from pathlib import Path

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
log = logging.getLogger(__name__)


def load_data(path: Path) -> list[dict]:
    log.info(f"loading {path}")
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def process(data: list[dict]) -> list[dict]:
    return [
        {**record, "score": record["score"] * 1.1}
        for record in data
        if record.get("active", False)
    ]


def save_results(results: list[dict], path: Path) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2)
    log.info(f"saved {len(results)} records to {path}")


def parse_args() -> argparse.Namespace:
    p = argparse.ArgumentParser(description=__doc__)
    p.add_argument("input", type=Path, help="input JSON file")
    p.add_argument("-o", "--output", type=Path,
                   default=Path("output/results.json"))
    p.add_argument("-v", "--verbose", action="store_true")
    return p.parse_args()


def main() -> int:
    args = parse_args()
    if args.verbose:
        logging.getLogger().setLevel(logging.DEBUG)
    try:
        data = load_data(args.input)
        results = process(data)
        save_results(results, args.output)
        return 0
    except FileNotFoundError as e:
        log.error(f"input file not found: {e}")
        return 1
    except Exception:
        log.exception("unexpected error")
        return 2


if __name__ == "__main__":
    sys.exit(main())

Three things this template gets right:

  1. I/O at the edges. load_data and save_results do file operations. process is pure — given the same input it gives the same output, with no file-system calls. That makes it trivially testable.
  2. Errors are specific. FileNotFoundError is handled by name; the catch-all is logged with a traceback (log.exception) and returns a distinct exit code.
  3. main() returns an exit code. sys.exit(main()) makes the script play well with shell pipelines and CI.
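Point 1 is worth making concrete: because process touches no files, a test can feed it literal data and check the result directly. A minimal sketch using the same function body as the template, with math.isclose hedging the float arithmetic:

```python
import math

def process(data: list[dict]) -> list[dict]:
    # Same pure middle as the template: bump scores of active records.
    return [
        {**record, "score": record["score"] * 1.1}
        for record in data
        if record.get("active", False)
    ]

result = process([
    {"name": "a", "score": 100, "active": True},
    {"name": "b", "score": 50},                  # no "active" key: filtered out
    {"name": "c", "score": 20, "active": False},  # explicitly inactive: filtered out
])
assert [r["name"] for r in result] == ["a"]
assert math.isclose(result[0]["score"], 110.0)
```

No temp files, no mocking, no setup: pure functions make tests this short.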

12.2 Type hints

Without hints, every function is a black box: you read the body to learn what it expects. Type hints are non-binding annotations that document the expected types right at the signature — Python ignores them at runtime, but readers and tools (mypy, pyright, IDE auto-complete) use them to catch mistakes before the program runs.

Type hints don’t change runtime behavior. They:

  • Document what a function expects and returns.
  • Let tools (mypy, pyright, IDEs) catch errors before runtime.
  • Make code easier to read.

def greet(name: str) -> str:
    return f"Hello, {name}!"

def get_items(count: int, price: float) -> list[dict]:
    return [{"id": i, "price": price} for i in range(count)]

[greet("Alice"), get_items(2, 9.99)]
['Hello, Alice!', [{'id': 0, 'price': 9.99}, {'id': 1, 'price': 9.99}]]
  • name: str annotates the parameter — name is expected to be a str.
  • -> str annotates the return type after the parameter list.
  • list[dict] is the parameterised form (Python 3.9+) — a list of dicts. Older code wrote List[Dict] from the typing module; the lowercase forms are the modern syntax.
  • At runtime these annotations are stored on the function (in __annotations__) but never enforced — greet(123) still runs, the type checker just flags it.
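The last point is easy to verify interactively: the annotations are inspectable data, not checks.

```python
def greet(name: str) -> str:
    return f"Hello, {name}!"

# Annotations are stored as plain data on the function object.
print(greet.__annotations__)  # {'name': <class 'str'>, 'return': <class 'str'>}

# Nothing is enforced at runtime; only a type checker would flag this call.
print(greet(123))  # Hello, 123!
```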

Annotating a parameter as dict is often too narrow. A function that only reads a few values from a mapping doesn’t actually need a dict — it needs anything that behaves like one. The standard library already has names for these “behaves like” categories, in the collections.abc module:

Abstract type         What it means                                Things that satisfy it
Iterable[T]           “you can for over it”                        list, tuple, set, dict, generators, files
Sequence[T]           “indexable and has a length”                 list, tuple, str, bytes
Mapping[K, V]         “supports m[k] lookup, .keys(), .values()”   dict, defaultdict, ChainMap
Callable[[A, B], R]   “you can call it”                            functions, lambdas, classes, anything with __call__

Each one is a protocol: “I promise to use only the operations this type guarantees.” Using them as parameter hints lets the caller pass any conforming type without locking them into one concrete container:

from collections.abc import Mapping

def total_price(prices: Mapping[str, float]) -> float:
    return sum(prices.values())

total_price({"apple": 1.0, "pear": 0.5})
1.5
  • Mapping[str, float] says: “give me anything supporting dict-style lookup with string keys and float values.” A plain dict qualifies, but so does a defaultdict, a ChainMap, or any custom class implementing the mapping protocol.
  • The annotation is a contract in the other direction too: “I’ll only call .values() and other read-only mapping operations” — narrower than dict (which would also allow __setitem__, .pop(), etc.), more flexible for callers.
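To see the flexibility, pass total_price some non-dict mappings. Both satisfy the mapping protocol, so both just work:

```python
from collections import ChainMap, defaultdict
from collections.abc import Mapping

def total_price(prices: Mapping[str, float]) -> float:
    return sum(prices.values())

base = {"apple": 1.0}
extras = ChainMap({"pear": 0.5}, base)            # layered mapping, not a dict
dd = defaultdict(float, apple=1.0, pear=0.5)      # dict subclass with defaults

print(total_price(extras))  # 1.5
print(total_price(dd))      # 1.5
```

Had the parameter been hinted dict, a type checker would reject the ChainMap even though the call works fine.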

The general rule: hint parameters with abstract types (Iterable, Mapping, Sequence) and returns with concrete ones (list[int]) — accept broadly, commit narrowly.

Common forms:

list[int]                    # list of ints
dict[str, int]               # dict mapping str to int
tuple[int, str, float]       # specific tuple types
tuple[int, ...]              # variable-length homogeneous tuple
set[str]                     # set of strings

int | None                   # Python 3.10+
Optional[int]                # equivalent (typing module)
int | str                    # union (Python 3.10+)
Union[int, str]              # equivalent

from collections.abc import Iterable, Sequence, Callable, Mapping

def upper_all(items: Iterable[str]) -> list[str]:
    return [s.upper() for s in items]

def apply(func: Callable[[int], str], xs: list[int]) -> list[str]:
    return [func(x) for x in xs]
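A quick sanity check on those two signatures. Note that str itself is a valid Callable[[int], str], so it can be passed directly:

```python
from collections.abc import Callable, Iterable

def upper_all(items: Iterable[str]) -> list[str]:
    return [s.upper() for s in items]

def apply(func: Callable[[int], str], xs: list[int]) -> list[str]:
    return [func(x) for x in xs]

print(upper_all(("a", "b")))  # ['A', 'B'] -- a tuple satisfies Iterable[str]
print(apply(str, [1, 2, 3]))  # ['1', '2', '3'] -- str is Callable[[int], str]
```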

Two recent typing features pay off immediately. Python 3.12 (PEP 695) adds inline syntax for generic functions — no TypeVar import, no module-level boilerplate:

def first[T](xs: list[T]) -> T:
    return xs[0]

[first([1, 2, 3]), first(["a", "b"])]
[1, 'a']
  • [T] after the function name declares T as a fresh type parameter scoped to this function only.
  • The signature now reads “give me a list whose elements are of some type T, and I’ll return one element of the same T.”
  • A type checker treats first([1, 2, 3]) as returning int, and first(["a", "b"]) as returning str — the connection is preserved instead of collapsing to Any.

The general rule: when a function’s input and output types are linked, introduce a type parameter; the [T] syntax is the modern, boilerplate-free way. Same shape for classes: class Box[T]: ....
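For contrast, here is the pre-3.12 spelling of the same function, the boilerplate PEP 695 removes. This form also runs on older interpreters:

```python
from typing import TypeVar

T = TypeVar("T")  # module-level declaration that the [T] syntax makes unnecessary

def first(xs: list[T]) -> T:
    return xs[0]

print([first([1, 2, 3]), first(["a", "b"])])  # [1, 'a']
```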

Python 3.12 also adds @typing.override — an explicit marker that a method is overriding a parent’s method. Type-checkers will flag a typo or a renamed parent method that the marker no longer matches:

from typing import override

class Animal:
    def speak(self) -> str:
        return "generic"

class Dog(Animal):
    @override
    def speak(self) -> str:
        return "woof"

Dog().speak()
'woof'
  • Animal.speak is the parent method.
  • @override above Dog.speak declares intent: “this is overriding a method from a parent.”
  • If Animal.speak is later renamed to talk, a type checker flags Dog.speak’s @override because it no longer overrides anything — catching what would otherwise be a silent bug.
  • At runtime @override is a no-op; its value is purely the static check.

The general rule: apply @override to every overriding method in a non-trivial class hierarchy — cheap to add, catches a real class of bug.

The deep treatment of type hints — Protocol, generics, variance, TypeVar — is in Chapter 20 and Chapter 27.

12.3 Testing with pytest

A test is a tiny program that calls your code with known inputs and checks the result is what you expect. pytest is the test runner: you put your tests in a file named test_*.py, run pytest from the terminal, and it reports which ones passed and which ones failed.

The minimum you need to know:

  • A test is a function whose name starts with test_. pytest discovers them automatically.
  • The test fails if any assert inside it is false, or if it raises an unexpected exception.
  • Useful flags: -v (verbose), -x (stop at first failure), -k expr (only run tests matching expr), --lf (rerun last failures).

Step 1: write a test as a plain function with assert.

# test_calculator.py
from calculator import add, divide

def test_add_positive():
    assert add(2, 3) == 5

def test_add_floats():
    import math
    assert math.isclose(add(0.1, 0.2), 0.3)

Two functions, two asserts. Run pytest and you’ll see 2 passed. If add(2, 3) returned 6, you’d see a failure with the line and the actual value.
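The math.isclose in the second test is not paranoia; exact equality genuinely fails here, because 0.1 and 0.2 have no exact binary representation:

```python
import math

print(0.1 + 0.2)                     # 0.30000000000000004
print(0.1 + 0.2 == 0.3)              # False -- binary floating point
print(math.isclose(0.1 + 0.2, 0.3))  # True -- the right comparison for floats
```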

When you expect a call to fail, wrap it in pytest.raises:

import pytest

def test_divide_by_zero():
    with pytest.raises(ValueError, match="cannot divide"):
        divide(10, 0)

This passes if divide(10, 0) raises ValueError with a message matching "cannot divide". It fails if no exception is raised, or a different one is.
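For reference, a divide that passes that test might look like this. This is a hypothetical implementation (the chapter never shows calculator.py), consistent with the asserted exception type and message:

```python
def divide(a: float, b: float) -> float:
    if b == 0:
        # Message contains "cannot divide", so pytest.raises(..., match="cannot divide") passes.
        raise ValueError("cannot divide by zero")
    return a / b

try:
    divide(10, 0)
except ValueError as e:
    print(e)  # cannot divide by zero
```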

Step 2: @pytest.mark.parametrize — the same test for many inputs.

You’ll often want to run the same logic over a table of inputs. Without parametrize, you’d write three near-identical functions:

@pytest.mark.parametrize("a, b, expected", [
    (2, 3, 5),
    (-1, 1, 0),
    (0, 0, 0),
])
def test_add_cases(a, b, expected):
    assert add(a, b) == expected

@pytest.mark.parametrize is a decorator that takes two arguments: a string naming the parameters ("a, b, expected"), and a list of tuples — one per case. pytest generates one test per tuple, so the report shows three separate tests with three separate pass/fail lines. Add a row, get a new test for free.

Step 3: @pytest.fixture — reusable setup.

When several tests need the same starting state — a configured object, a temp directory, a database connection — you don’t want to repeat the setup in every test body. A fixture is a function that produces that value:

from calculator import Calculator

@pytest.fixture
def calc():
    return Calculator(initial=100)

def test_calc_add(calc):
    calc.add(50)
    assert calc.total == 150

def test_calc_reset(calc):
    calc.reset()
    assert calc.total == 0

Two pieces of magic happen here:

  • @pytest.fixture registers calc with pytest as a fixture.
  • When a test function declares a parameter named calc, pytest calls the calc() fixture, takes its return value, and passes it in. Each test gets a fresh Calculator(initial=100) — no shared state between tests.

This is dependency injection by parameter name. Tests that don’t need the fixture simply don’t list the parameter.

The pattern: one assertion concept per test. The test name is the documentation.

12.4 Fifteen idioms that separate beginner Python from expert

Internalize these. They turn 100 lines of busy code into 50 lines that read like prose.

  1. Iterate items, not indices.

    for item in items: ...           # not: for i in range(len(items))
  2. Unpack tuples.

    name, value = get_pair()         # not: r = get_pair(); name, value = r[0], r[1]
  3. Use enumerate when you need the index.

    for i, item in enumerate(items): ...
  4. Comprehensions, not append-loops.

    result = [x * 2 for x in data]   # not: result = []; for x in data: result.append(x*2)
  5. Generator expressions for large data.

    total = sum(x ** 2 for x in big_iter)
  6. with for resources.

    with open("data.txt") as f: data = f.read()
  7. dict.get for safe access.

    value = d.get(key, default)
  8. Truthiness, not == True/False/None.

    if items: ...                    # not: if len(items) > 0
    if result is None: ...           # not: if result == None
  9. f-strings, always.

    msg = f"Hello, {name}!"          # not: "Hello, " + name + "!"
  10. Exceptions, not return codes.

    if b == 0:
        raise ValueError("cannot divide by zero")
  11. Early return, not deep nesting.

    def process(data):
        if not data:
            return None
        if not data["valid"]:
            return None
        return transform(data)
  12. .join for string building.

    " ".join(words)                  # not: result = ""; for w in words: result += w + " "
  13. Named functions over complex lambdas. If the lambda needs a comment to explain it, use def.

  14. One responsibility per function. If you can’t describe what a function does in one sentence without “and”, split it.

  15. Name things clearly. def calculate_monthly_payment(principal, rate, months): reads better than def f(a, b, c):. Code is read more often than it’s written.

Tip: Why this matters

The script template, type hints, tests, and the fifteen idioms together form the dividing line between Python that works and Python that’s pleasant to read, change, and extend. None of them are difficult — but applying them consistently is what makes a codebase scale beyond one author.

12.5 Going deeper

Type hints in depth — Protocol, generics, TypeVar, runtime checking — are in Chapter 20 and Chapter 27. The Pythonic-objects philosophy and the design choices behind these idioms are spread across Chapter 23, Chapter 25, and the data-model treatment that opens Part I.

12.6 Build: a small grading module with inline tests

We’ve talked about the template, the type hints, and the test idioms — let’s see all three in one tight example. A grading module: pure logic in the middle, type hints on every signature, inline tests at the bottom mirroring pytest’s positive/parametrize/raises pattern. Everything runs in this notebook.

Step 1: pure logic, well-typed. A function that converts a numeric score to a letter grade, plus a parser that validates the input range:

def grade(score: float) -> str:
    """Return the letter grade for a score in [0, 100]."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= 90: return "A"
    if score >= 80: return "B"
    if score >= 70: return "C"
    if score >= 60: return "D"
    return "F"

def parse_score(text: str) -> float:
    """Parse text as a score; raise ValueError on garbage."""
    try:
        value = float(text)
    except ValueError as e:
        raise ValueError(f"could not parse score: {text!r}") from e
    return value

[grade(95), parse_score("87.5")]
['A', 87.5]

grade(score: float) -> str documents both directions of the contract — readers and tools see the input/output types without reading the body. The validation if not 0 <= score <= 100: raises ValueError (idiom #10: exceptions, not return codes), and parse_score chains the underlying float-parsing failure with from e (chapter 7) so the original cause is preserved in the traceback.
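The from e chaining is observable: the raised exception carries the original float-parsing failure as __cause__, and a full traceback would print both:

```python
def parse_score(text: str) -> float:
    """Parse text as a score; raise ValueError on garbage."""
    try:
        value = float(text)
    except ValueError as e:
        raise ValueError(f"could not parse score: {text!r}") from e
    return value

try:
    parse_score("ninety")
except ValueError as e:
    print(e)            # could not parse score: 'ninety'
    print(e.__cause__)  # could not convert string to float: 'ninety'
```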

Step 2: glue them together. A small process that takes a list of (name, raw_score) records and returns enriched records. I/O at the edges (load/save) would surround it; the middle is pure:

from collections.abc import Iterable

def process(records: Iterable[tuple[str, str]]) -> list[dict]:
    """Convert (name, raw) pairs into [{name, score, grade}, ...]."""
    out = []
    for name, raw in records:
        score = parse_score(raw)
        out.append({"name": name, "score": score, "grade": grade(score)})
    return out

process([("Alice", "95"), ("Bob", "73.5"), ("Carol", "60")])
[{'name': 'Alice', 'score': 95.0, 'grade': 'A'},
 {'name': 'Bob', 'score': 73.5, 'grade': 'C'},
 {'name': 'Carol', 'score': 60.0, 'grade': 'D'}]

Iterable[tuple[str, str]] accepts any iterable of pairs — a list, a generator, even a CSV reader. list[dict] commits to a concrete return type. That’s the “accept broadly, commit narrowly” rule from earlier in the chapter.

Step 3: inline tests in the three pytest shapes. Without pytest installed in this notebook, plain asserts — bunched into a tiny test runner — produce a pass/fail report in the same style:

def run(*tests):
    failed = []
    for fn in tests:
        try:
            fn()
        except AssertionError as e:
            failed.append((fn.__name__, str(e)))
    return f"{len(tests) - len(failed)}/{len(tests)} passed", failed

# Positive: simple direct check.
def test_grade_positive():
    assert grade(95) == "A"

# Parametrize-style: one logic over a table of cases.
def test_grade_cases():
    cases = [(95, "A"), (85, "B"), (75, "C"), (65, "D"), (50, "F"),
             (90, "A"), (80, "B"), (70, "C"), (60, "D")]
    for score, expected in cases:
        assert grade(score) == expected, f"grade({score}) → {grade(score)}, expected {expected}"

# Raises-style: assert the boundary exception fires.
def test_grade_out_of_range():
    for bad in (-1, 101, 200):
        try:
            grade(bad)
        except ValueError:
            continue
        raise AssertionError(f"grade({bad}) should have raised ValueError")

run(test_grade_positive, test_grade_cases, test_grade_out_of_range)
('3/3 passed', [])

Each function follows pytest’s discovery convention (test_ prefix). The parametrize-style test does what @pytest.mark.parametrize would do for us: one function, many inputs, an assertion message that names the failing input. The raises-style test mirrors with pytest.raises(ValueError):. Run with real pytest and these same functions would be picked up automatically.

The build is the chapter in miniature: I/O at the edges (omitted here — process is the pure middle), type hints documenting every signature, exceptions over return codes, and tests that exercise the positive path, the cases table, and the boundary failure. About forty lines total, including the test runner.

12.7 Exercises

  1. Refactor a script. Take a script you have lying around. Apply the template: pull I/O to the edges, replace print debugging with logging, replace sys.argv with argparse, add if __name__ == "__main__":. Compare line count.

  2. Add type hints. Add hints to every function in that script. Run mypy script.py. Fix what it finds.

  3. Write three tests. For one function in that script, write a positive test, a boundary test, and an error test using pytest.raises.

  4. Find idiom violations. Scan the same script for violations of the 15 idioms. Pick three; fix them.

  5. logger.exception. Replace one print(f"error: {e}") inside an except block with logger.exception("error context"). Compare the output — what extra information do you get?

12.8 Summary

A real program separates I/O from logic, handles errors with intent, logs rather than prints, takes its inputs from argparse, and is checked by tests. With these foundations in place, you’re ready for Part I — where we deconstruct what makes Python feel like Python: the data model and the special methods that built-in syntax delegates to. We start in Chapter 13.