Build a proper multi-dimensional Vector class by incrementally adding special methods. This chapter demonstrates how protocols (informal interfaces) work in practice.
In this chapter you will learn to:
Implement the sequence protocol — __len__ and __getitem__ — and get iteration, slicing, and in for free.
Make slicing return the same class instead of a raw list.
Use __getattr__ and __setattr__ to expose dynamic attributes (v.x, v.y, v.z).
Hash a sequence by reducing its components with XOR.
Use zip (and zip_longest) for parallel iteration and matrix transposition.
24.1 Protocols and duck typing
Python’s sequence protocol requires only two methods: __len__ and __getitem__. Anything that has them behaves as a sequence — slicing, iteration, in — without inheriting from anything. To see the protocol in action, we’ll generalize Vector2d to N dimensions and earn slicing, iteration, and len() from just those two dunders.
array(self.typecode, components) stores the components in a typed array — much more memory-efficient than a list. The typecode = "d" class attribute means every component is a 64-bit float; Python rejects anything that doesn’t fit at construction time.
__iter__ returns iter(self._components) — a cheap, lazy iterator over the array. Once __iter__ exists, for x in v, tuple(v), and unpacking all work.
__repr__ slices off the bracketed part of the array’s repr (array('d', [0.0, 1.0, ...]) becomes Vector([0.0, 1.0, ...])). It’s a small string surgery — there’s no friendlier API for “show me an array’s contents.”
__len__ makes len(v) work and lets the sequence protocol report a length.
__getitem__ is the interesting one. It receives either an integer (v[1]) or a slice object (v[1:4]); we have to handle both.
For a slice, we delegate to the underlying array’s slicing — which returns another array — and rewrap it in cls(...). Using cls = type(self) instead of hard-coding Vector means subclasses get their own type back.
For an integer, operator.index(key) extracts an integer index, raising TypeError for floats and other non-integer numeric types — matching what built-in lists do.
The general pattern: __len__ + __getitem__ is the minimum sequence interface. The clever part is the slice handling. Without it, v[1:4] would return an array.array — losing the Vector type. With it, slicing returns another Vector:
v[1:4]
Vector([1.0, 2.0, 3.0])
operator.index(key) raises TypeError for non-integers (like floats), matching how built-in sequences behave.
24.2 Dynamic attribute access
The first few components of a vector deserve familiar names: x, y, z, t. Writing four properties (one per name) is repetitive and doesn’t scale to higher dimensions. __getattr__ is the single hook: Python calls it whenever a normal attribute lookup fails, and it’s free to return whatever it wants.
class VectorXYZT(Vector): shortcut_names ="xyzt"def__getattr__(self, name): cls =type(self)iflen(name) ==1: pos = cls.shortcut_names.find(name)if0<= pos <len(self._components):returnself._components[pos]raiseAttributeError(f"{cls.__name__!r} object has no attribute {name!r}")v = VectorXYZT(range(5))v.x, v.y, v.z, v.t
(0.0, 1.0, 2.0, 3.0)
Walking through __getattr__:
The method only runs when the normal lookup fails. v.x first checks the instance dict, then the class chain — neither has x — so Python falls back to __getattr__("x").
cls.shortcut_names.find(name) returns the position of the name in "xyzt" — 0 for x, 1 for y, etc., and -1 for anything else.
The bounds check 0 <= pos < len(self._components) guards against two failures: a name not in shortcut_names (pos == -1), and a name past the end of the vector (asking for .t on a 2-element vector).
If we don’t have a match, we must raise AttributeError — anything else (returning None, raising a different exception) breaks the protocol that callers like hasattr rely on.
The general pattern: __getattr__ is the catch-all for “compute an attribute on demand.” Use it when the attribute set is rule-based (like positional shortcuts) rather than fixed.
There’s a trap. If we don’t override __setattr__, the user can do this:
v.x =99v.x, v[0]
(99, 0.0)
v.x = 99created an instance attribute that shadows our __getattr__. Now v.x and v[0] disagree. The fix is to override __setattr__ and forbid those names:
class VectorXYZT2(VectorXYZT):def__setattr__(self, name, value): cls =type(self)iflen(name) ==1:if name in cls.shortcut_names:raiseAttributeError(f"readonly attribute {name!r}")elif name.islower():raiseAttributeError(f"can't set attributes 'a' to 'z' in {cls.__name__!r}")super().__setattr__(name, value)w = VectorXYZT2(range(5))try: w.x =99exceptAttributeErroras e:print("blocked:", e)
blocked: readonly attribute 'x'
Walking through the fix:
__setattr__ runs on every assignment to an attribute — not just the named ones. So we have to be careful to forward the rest with super().__setattr__(name, value), otherwise the class becomes unusable.
The first check rejects the four shortcut names explicitly. The second blocks every other one-letter lowercase name as a precaution — it’s better to refuse v.q = 1 than to let users invent attributes that look component-like.
We use super().__setattr__ (not self.__dict__[name] = value) so that further subclasses can hook into assignment too. Direct dict writes bypass the chain.
The general pattern: when __getattr__ synthesizes an attribute, pair it with __setattr__ that refuses the same name. The two halves keep reads and writes consistent.
24.3 Hashing a sequence
Hashing two components used hash((self.x, self.y)) — building a tuple. For an N-component vector, the same trick works, but it’s instructive to do it manually with a fold — the pattern generalizes to any “combine the hashes of the parts” computation.
import functoolsclass HashableVector(Vector):def__eq__(self, other):returnlen(self) ==len(other) andall(a == b for a, b inzip(self, other))def__hash__(self): hashes = (hash(x) for x inself)return functools.reduce(operator.xor, hashes, 0)v1 = HashableVector([1, 2, 3])v2 = HashableVector([1.0, 2.0, 3.0])hash(v1) ==hash(v2), v1 == v2, len({v1, v2})
(True, True, 1)
Walking through the implementation:
__eq__ first checks that the lengths match — without this, Vector([1, 2]) and Vector([1, 2, 3]) would compare equal up to the second component. Then zip(self, other) pairs components; all(a == b for ...) short-circuits on the first mismatch.
__hash__ builds a generator expression of per-component hashes — lazy, so we don’t materialize an intermediate list.
functools.reduce(operator.xor, hashes, 0) folds the generator with XOR, starting from the seed 0. Each step is acc = acc ^ hash(component).
hash(v1) == hash(v2) because 1 == 1.0 and hash(1) == hash(1.0) — Python’s numeric tower (the family of related types bool ⊂ int ⊂ float ⊂ complex, defined in the numbers module) keeps these aligned. The interpreter explicitly arranges that any two numerically-equal values across the tower hash the same. That’s the rule “equal objects must have equal hashes” in action.
The general pattern: reduce(op, iterable, seed) is the canonical fold. XOR is the right op for hash-combining because it’s commutative and associative — order doesn’t change the result — and bit-mixing keeps the output well-distributed. The trade-off is that XOR-folding produces the same hash for [1, 2, 3] and [3, 2, 1]; that’s a feature when permutations should compare equal, a bug otherwise.
Warning__eq__ is slow on long sequences
The all(a == b for ...) form short-circuits on the first mismatch — good. But for very long sequences, hashing both and comparing the hashes first is faster. Profile before optimizing; correctness first.
24.4 The awesome zip
zip is the standard parallel-iteration tool — it pairs elements from multiple iterables so you can walk them in lockstep. It has three variants worth knowing because each handles uneven lengths differently. The default stops at the shortest:
*matrix unpacks [(1, 2), (3, 4), (5, 6)] into three positional arguments — equivalent to writing zip((1, 2), (3, 4), (5, 6)).
zip then takes the first element of each — (1, 3, 5) — and the second — (2, 4, 6). Those tuples are the columns of the original matrix.
The general rule: pick zip for “pair these and stop at the shortest”, zip(strict=True) when same-length is a precondition, and zip_longest when you want to pad. The zip(*matrix) pattern works for any operation that needs to flip a “list of rows” into a “list of columns.”
TipWhy this matters
Protocols are informal interfaces defined by convention: the sequence protocol is __len__ + __getitem__. If your class has those, Python treats it as a sequence — no inheritance required. This is duck typing made formal and powerful.
24.5 Build: a sequence-like Matrix class
Vectors live in one dimension; matrices in two. We’ll build a Matrix that’s a sequence of rows — len(m) is the row count, m[0] is a row, m[1:3] is a sub-matrix, m == n compares element-by-element, and m can be used as a dict key. Every dunder from the chapter earns its place.
Step 1: __init__, __len__, __getitem__ with slice handling. Rows are stored as immutable tuples so the matrix can be hashable later. __getitem__ dispatches on int vs slice, and the slice case returns another Matrix:
class Matrix:def__init__(self, rows):self._rows =tuple(tuple(row) for row in rows)def__len__(self):returnlen(self._rows)def__getitem__(self, key):ifisinstance(key, slice):returntype(self)(self._rows[key]) # slice -> Matrixreturnself._rows[key] # int -> tuple (a row)m = Matrix([(1, 2, 3), (4, 5, 6), (7, 8, 9)])[len(m), m[0], type(m[0:2]).__name__]
[3, (1, 2, 3), 'Matrix']
tuple(tuple(row) for row in rows) does two things at once: copies the input so the matrix isn’t aliased to the caller’s data (the chapter-18 lesson), and freezes each row into a tuple so the whole structure is immutable and hashable. The slice branch returns a newMatrix (using type(self) so subclasses get their own type back), exactly the pattern from the Vector example earlier in the chapter.
Step 2: __iter__, __eq__, __repr__. Add iteration (so for row in m: works), structural equality, and a clean repr:
class Matrix:def__init__(self, rows):self._rows =tuple(tuple(row) for row in rows)def__len__(self):returnlen(self._rows)def__getitem__(self, key):ifisinstance(key, slice):returntype(self)(self._rows[key])returnself._rows[key]def__iter__(self):returniter(self._rows)def__eq__(self, other):returnisinstance(other, Matrix) andself._rows == other._rowsdef__repr__(self):returnf"{type(self).__name__}({list(self._rows)!r})"m = Matrix([(1, 2), (3, 4)])n = Matrix([(1, 2), (3, 4)])[m == n, list(m), repr(m)]
__iter__ returning iter(self._rows) is the lazy-friendly form — no copy, just a fresh iterator over the row tuples. __eq__ compares the tuple-of-tuples directly, which delegates equality to the components recursively. __repr__ uses type(self).__name__ so subclasses report their own class.
Step 3: __hash__ via XOR-fold, plus a transpose using zip(*self). The chapter’s hash recipe (reduce with XOR over the components’ hashes) generalises one level higher — we hash each row tuple, then XOR-fold those:
functools.reduce(operator.xor, (hash(row) for row in self), 0) folds the row hashes with XOR — the chapter’s recipe applied at the row level rather than the scalar level. The matrix can now sit in a set or be a dict key. transpose is the zip(*self) trick from the chapter: *self unpacks the rows into positional args; zip then walks them column-by-column, yielding the transposed rows. Wrapping in type(self)(...) keeps the result a Matrix.
The build is the chapter applied at one level of recursion: the matrix is a sequence of rows just as the vector was a sequence of components. __len__, __getitem__ (with slice handling), __iter__, __eq__, __hash__, __repr__, and zip(*self) for transpose — every tool from the chapter, in one fifty-line class.
24.6 Exercises
Slice returns Vector. Verify type(v[1:4]) is Vector, not array.array. Why does that matter for chained operations like v[1:4][0:2]?
__getattr__ for indexed names. Extend shortcut_names so that v.r returns the magnitude (abs(v)) and v.theta returns the angle (for 2D vectors). Where should the special-case live?
zip(strict=True). Replace the __eq__ body with len == len and all(zip(self, other, strict=True)). What happens if you compare two vectors of different length?
Build a transposer. Write transpose(matrix) using only zip and *. Verify on a non-square matrix.
Hash that ignores order. Why does XOR-folding produce the same hash for [1, 2, 3] and [3, 2, 1]? When is that a feature; when is it a bug?
24.7 Summary
The sequence protocol — __len__ and __getitem__ — is the most rewarding contract in Python. Implement it carefully (return the right type for slices, raise the right error for non-integers), pair it with __getattr__/__setattr__ for named components, and your class behaves like the built-ins.
Next, Chapter 25 steps back and surveys Python’s four approaches to typing: duck, goose, static, runtime — and shows when to reach for each.