35  Attribute Descriptors

Note: Core idea

Descriptors let you reuse property-style logic across multiple attributes and classes. They power Python’s fundamental machinery: methods, @property, @classmethod, and @staticmethod are all implemented as descriptors.

In this chapter you will learn to:

  1. State the descriptor protocol: __get__, __set__, __delete__.
  2. Implement a validating descriptor and use it on multiple attributes via __set_name__.
  3. Distinguish overriding (data) descriptors from non-overriding (non-data) descriptors.
  4. See methods themselves as descriptors — and understand why obj.method is a bound method.
  5. Choose between @property, descriptors, and __init_subclass__ (the next chapter’s tool).

35.1 The descriptor protocol

A descriptor is an object that lives on a class and intercepts attribute access on instances. The protocol has three methods, all optional:

  • __get__(self, instance, owner=None) — called when the attribute is read.
  • __set__(self, instance, value) — called when it’s written.
  • __delete__(self, instance) — called when del is used on it.

A descriptor with __set__ (or __delete__) is a data (or overriding) descriptor — it shadows the instance dict. Without __set__, it’s a non-data descriptor — the instance dict can shadow it.
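
A minimal sketch makes the protocol concrete: one data descriptor that traces every read and write. The class and attribute names here are illustrative, not from any library:

```python
class Traced:
    """A minimal data descriptor: it defines both __get__ and __set__."""
    def __set_name__(self, owner, name):    # Python calls this when the class body completes
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:                # class-level access returns the descriptor itself
            return self
        print(f"read  {self.name}")
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        print(f"write {self.name} = {value!r}")
        instance.__dict__[self.name] = value

class Point:
    x = Traced()        # the descriptor lives on the class, not the instance

p = Point()
p.x = 3                 # prints: write x = 3
p.x                     # prints: read  x
```

Every read and write of p.x is routed through the descriptor; the value itself still lives in p.__dict__.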

The lookup priority Python applies for obj.attr:

flowchart TB
    A["obj.attr"] --> B{"data descriptor<br/>on the class?"}
    B -- yes --> G["call __get__"]
    B -- no --> C{"attr in instance __dict__?"}
    C -- yes --> R["return that"]
    C -- no --> D{"non-data descriptor<br/>on the class?"}
    D -- yes --> G
    D -- no --> E{"plain class attribute?"}
    E -- yes --> R
    E -- no --> X["AttributeError"]

Data descriptors win over the instance dict; non-data descriptors lose to it. That single rule explains why @property (a data descriptor) cannot be shadowed by an instance attribute, but a plain method (non-data descriptor) can.
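
You can watch the rule directly by planting same-named entries straight into an instance dict, bypassing any __set__ interception. A small illustrative sketch:

```python
class C:
    @property
    def p(self):                 # data descriptor: has __get__ and __set__
        return "property value"

    def m(self):                 # plain function: non-data descriptor
        return "method value"

c = C()
# Write into the instance dict directly, so no __set__ runs:
c.__dict__["p"] = "shadow?"
c.__dict__["m"] = "shadow!"

print(c.p)    # property value  -- data descriptor wins over the instance dict
print(c.m)    # shadow!         -- non-data descriptor loses to it
```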

35.2 A validating descriptor

Reusable per-attribute validation, the descriptor way:

class Quantity:
    def __set_name__(self, owner, name):
        self.storage_name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.storage_name]

    def __set__(self, instance, value):
        if value > 0:
            instance.__dict__[self.storage_name] = value
        else:
            raise ValueError(f"{self.storage_name} must be > 0")


class LineItem:
    weight = Quantity()
    price = Quantity()

    def __init__(self, description, weight, price):
        self.description = description
        self.weight = weight
        self.price = price
    def subtotal(self):
        return self.weight * self.price

raisins = LineItem("Golden raisins", 10, 6.95)
raisins.subtotal()
69.5
raisins.weight = 0
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 1
----> 1 raisins.weight = 0

Cell In[1], line 14, in Quantity.__set__(self, instance, value)
     10     def __set__(self, instance, value):
     11         if value > 0:
     12             instance.__dict__[self.storage_name] = value
     13         else:
---> 14             raise ValueError(f"{self.storage_name} must be > 0")

ValueError: weight must be > 0

Three pieces fit together:

  • weight = Quantity() creates a descriptor instance and stores it on the class.
  • Python calls Quantity.__set_name__(LineItem, "weight") automatically — that’s how the descriptor learns the attribute’s name.
  • When you do raisins.weight = 10, Python’s attribute machinery sees the descriptor on the class, finds __set__, and calls it with (raisins, 10).

The descriptor stores values in the instance’s __dict__ — keyed by storage_name. That keeps each instance’s data separate, which is what you want.
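
One subtlety worth seeing: the stored value sits in the instance dict under the same key as the descriptor, yet assignments still go through __set__, because data descriptors outrank the instance dict. The classes are repeated here so the snippet stands alone:

```python
class Quantity:
    """The descriptor from above, repeated so this snippet is self-contained."""
    def __set_name__(self, owner, name):
        self.storage_name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.storage_name]

    def __set__(self, instance, value):
        if value > 0:
            instance.__dict__[self.storage_name] = value
        else:
            raise ValueError(f"{self.storage_name} must be > 0")

class LineItem:
    weight = Quantity()

    def __init__(self, weight):
        self.weight = weight

item = LineItem(10)
print(item.__dict__)    # {'weight': 10} -- same key as the descriptor
try:
    item.weight = -5    # still intercepted: the data descriptor wins
except ValueError as e:
    print(e)            # weight must be > 0
```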

35.3 Overriding versus non-overriding

Here’s the distinction in action. An overriding descriptor (with __set__) wins over the instance dict; a non-overriding one (without __set__) loses to it.

class Overriding:
    def __get__(self, instance, owner=None):
        return "overriding.__get__"
    def __set__(self, instance, value):
        print("overriding.__set__")

class NonOverriding:
    def __get__(self, instance, owner=None):
        return "nonoverriding.__get__"

class Managed:
    over = Overriding()
    non = NonOverriding()

m = Managed()
m.over, m.non
('overriding.__get__', 'nonoverriding.__get__')
m.over = 7
m.over, m.__dict__
overriding.__set__
('overriding.__get__', {})

m.over = 7 ran __set__. The instance dict stayed empty.

m.non = 7
m.non, m.__dict__
(7, {'non': 7})

m.non = 7 updated the instance dict. The descriptor still exists on the class, but now m.non finds 7 in the instance dict first and never consults the descriptor.

@property is always an overriding descriptor — it has both __get__ and __set__. Functions are non-overriding — they have __get__ but not __set__, which is why an instance can shadow a method by setting an attribute with the same name.
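
Shadowing in action, with a plain method (the names here are illustrative):

```python
class Greeter:
    def greet(self):
        return "hello from the method"

g = Greeter()
print(g.greet())                    # hello from the method

# Functions lack __set__, so this lands in g.__dict__ and wins future lookups:
g.greet = lambda: "shadowed!"
print(g.greet())                    # shadowed!

del g.greet                         # remove the shadow from the instance dict...
print(g.greet())                    # ...and the method is visible again
```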

35.4 Methods are descriptors

You’ve been writing obj.method(arg) since Chapter 1 and it just works: self ends up bound to obj automatically. Where does that binding come from? There’s no special compiler magic; it’s the descriptor protocol again. A function defined in a class body is a non-overriding descriptor, and accessing it through an instance triggers __get__ to produce a bound method.

class C:
    def method(self):
        return "hi"

C.method, C().method
(<function __main__.C.method(self)>,
 <bound method C.method of <__main__.C object at 0x7f971c94c440>>)

Walking through what just happened:

  • C.method — accessed through the class, no instance — is just the plain function. No binding.
  • C().method — accessed through an instance — is a bound method, a small wrapper that pre-binds the instance as the first argument. Calling bound() is the same as calling method(instance).
  • The two access paths produce different objects because Python’s attribute lookup notices method has a __get__ and calls it on the instance access. On the class access, __get__ is called too — but with instance=None — and the function just returns itself.

You can call __get__ directly to see the binding happen:

c = C()
bound = C.method.__get__(c, C)
bound, bound() == c.method()
(<bound method C.method of <__main__.C object at 0x7f971cb3f110>>, True)

Walking through this:

  • C.method.__get__(c, C) is exactly what Python does internally for c.method. The first argument is the instance to bind; the second is the class (used by super() and a few introspection paths).
  • The returned bound is c.method for all practical purposes — calling it is identical to calling c.method(). The == confirms equality of the resulting strings.

The general insight: there’s no “implicit self” in Python. There’s just the descriptor protocol — a function’s __get__ returns a callable with the instance pre-bound. The same mechanism explains @classmethod and @staticmethod:

  • @classmethod is a descriptor whose __get__ binds the class (not the instance).
  • @staticmethod is a descriptor whose __get__ returns the function unchanged.

There’s no “magic.” The same pattern — descriptor + __get__ — implements all three.
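
To make that concrete, here are toy stand-ins for both decorators. The real built-ins are implemented in C, but the descriptor shape is the same; the class names here are made up:

```python
import functools

class MyClassMethod:
    """Toy @classmethod: __get__ binds the class, not the instance."""
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if owner is None:
            owner = type(instance)
        return functools.partial(self.func, owner)

class MyStaticMethod:
    """Toy @staticmethod: __get__ hands the function back unchanged."""
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        return self.func

class Demo:
    @MyClassMethod
    def which(cls):
        return cls.__name__

    @MyStaticMethod
    def plain():
        return "no binding"

print(Demo.which(), Demo().which(), Demo.plain())
```

Both class and instance access of Demo.which receive the class as the first argument; Demo.plain is callable either way with no binding at all.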

35.5 A complete validating descriptor

The earlier Quantity baked one validation rule (> 0) into the descriptor. Real codebases need a family of rules — positive numbers, non-empty strings, valid emails — and you don’t want to rewrite the storage-and-retrieval boilerplate each time. The clean answer is an abstract base descriptor that handles the protocol, with concrete subclasses that implement only the rule. Two steps build it.

Step 1: lift the protocol into a base class. Validated does the __set_name__ / __get__ / __set__ plumbing once and leaves validate abstract. A subclass adds only the rule:

import abc

class Validated(abc.ABC):
    def __set_name__(self, owner, name):
        self.public_name = name
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        value = self.validate(value)
        setattr(instance, self.private_name, value)

    @abc.abstractmethod
    def validate(self, value): ...

class Positive(Validated):
    def validate(self, value):
        if value <= 0:
            raise ValueError(f"value must be positive, got {value!r}")
        return value

class Order:
    qty = Positive()
    def __init__(self, qty):
        self.qty = qty

Order(3).__dict__
{'_qty': 3}

  • Validated(abc.ABC) implements __set_name__, __get__, and __set__ once — the protocol plumbing — and declares validate abstract so subclasses are forced to provide it.
  • __set_name__ records both public_name (for error messages) and private_name (_qty here — the underscore-prefixed key used for storage). Python calls this automatically when the class body completes.
  • __get__ returns the descriptor itself on class access (if instance is None) — that’s how Order.qty shows the descriptor for introspection. On instance access it fetches _qty via getattr.
  • __set__ runs validate first, then stores via setattr(instance, self.private_name, value). Storing under a different name than the descriptor’s is the trick — otherwise the descriptor would intercept its own write and recurse.
  • Positive adds nothing but a validate method. The subclass is the rule.

Step 2: a family of validators on the same class. Add a second subclass for non-empty strings, then attach both descriptors to one class:

class NonEmpty(Validated):
    def validate(self, value):
        if not value:
            raise ValueError("value cannot be empty")
        return value

class Order:
    qty = Positive()
    name = NonEmpty()
    def __init__(self, name, qty):
        self.name = name
        self.qty = qty

Order("widget", 3).__dict__
{'_name': 'widget', '_qty': 3}
Order("", 3)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[10], line 1
----> 1 Order("", 3)

Cell In[9], line 11, in Order.__init__(self, name, qty)
     10     def __init__(self, name, qty):
---> 11         self.name = name
     12         self.qty = qty

Cell In[8], line 14, in Validated.__set__(self, instance, value)
     13     def __set__(self, instance, value):
---> 14         value = self.validate(value)
     15         setattr(instance, self.private_name, value)

Cell In[9], line 4, in NonEmpty.validate(self, value)
      2     def validate(self, value):
      3         if not value:
----> 4             raise ValueError("value cannot be empty")
      5         return value

ValueError: value cannot be empty

  • Each descriptor manages its own attribute. __set_name__ records different private_name values (_qty, _name) so the instance’s __dict__ keeps them separate.
  • Order("", 3) triggers NonEmpty.validate("") during __init__, which raises before qty is even assigned. The descriptor protocol intercepts the assignment as it happens.

The general pattern: lift the descriptor protocol into a base class, leave only the validation rule for subclasses. This is the right structure when the same validation pattern repeats across attributes or classes.
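
The “across classes” half of that claim deserves a demonstration: the same subclass serves unrelated models with no extra code. Validated and Positive are repeated here so the snippet stands alone; Invoice and Shipment are illustrative names:

```python
import abc

class Validated(abc.ABC):
    """The base from above, repeated so this snippet is self-contained."""
    def __set_name__(self, owner, name):
        self.public_name = name
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        value = self.validate(value)
        setattr(instance, self.private_name, value)

    @abc.abstractmethod
    def validate(self, value): ...

class Positive(Validated):
    def validate(self, value):
        if value <= 0:
            raise ValueError(f"value must be positive, got {value!r}")
        return value

# The same rule attaches to unrelated models:
class Invoice:
    total = Positive()
    def __init__(self, total):
        self.total = total

class Shipment:
    weight_kg = Positive()
    def __init__(self, weight_kg):
        self.weight_kg = weight_kg

print(Invoice(99).__dict__)       # {'_total': 99}
print(Shipment(2.5).__dict__)     # {'_weight_kg': 2.5}
```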

For library authors: typing.dataclass_transform (PEP 681, Python 3.11) tells type checkers that a decorator, base class, or metaclass synthesizes __init__ the way @dataclass does, so mypy and pyright recognize descriptor-backed field declarations like these. Reach for it only when shipping a public API that wants ORM-style ergonomics.

Tip: Why this matters

Descriptors are Python’s mechanism for customizing attribute access at the class level. @property, @classmethod, @staticmethod, and function objects themselves are all implemented as descriptors. When you understand descriptors, you understand how Python’s object system works at the deepest level.

35.6 Build: a typed-and-bounded Field descriptor family

Real ORM-style descriptors usually layer two checks: the value must be the right type, and it must satisfy a constraint (range, length, pattern). We’ll build that as a small descriptor family — a TypedField base for the type check, and IntField/StrField subclasses for the constraints.

Step 1: a TypedField base with __set_name__ and a type guard. All the protocol plumbing lives here:

class TypedField:
    def __init__(self, type_):
        self.type_ = type_

    def __set_name__(self, owner, name):
        self.public_name = name
        self.private_name = f"_{name}"

    def __get__(self, instance, owner=None):
        if instance is None:
            return self                             # access via the class — return self
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if not isinstance(value, self.type_):
            raise TypeError(
                f"{self.public_name} must be {self.type_.__name__}, "
                f"got {type(value).__name__}"
            )
        setattr(instance, self.private_name, value)

class Box:
    label = TypedField(str)
    weight = TypedField(int)

    def __init__(self, label, weight):
        self.label = label
        self.weight = weight

Box("crate", 5).__dict__
{'_label': 'crate', '_weight': 5}
Box("crate", "five")                                # weight must be int
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[12], line 1
----> 1 Box("crate", "five")                                # weight must be int

Cell In[11], line 28, in Box.__init__(self, label, weight)
     26     def __init__(self, label, weight):
     27         self.label = label
---> 28         self.weight = weight

Cell In[11], line 16, in TypedField.__set__(self, instance, value)
     14     def __set__(self, instance, value):
     15         if not isinstance(value, self.type_):
---> 16             raise TypeError(
     17                 f"{self.public_name} must be {self.type_.__name__}, "
     18                 f"got {type(value).__name__}"
     19             )

TypeError: weight must be int, got str

__set_name__(self, owner, name) is called automatically when the class body completes; it records public_name (weight) and private_name (_weight) — separate names for the descriptor identity and the storage slot. The split is what avoids infinite recursion: storing under a different key means __set__ doesn’t intercept its own write.
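
To see why the split matters, here is a deliberately broken variant (a hypothetical BadField, not part of the chapter’s code) that stores under the public name:

```python
class BadField:
    """Deliberately broken: stores under the descriptor's own (public) name."""
    def __set_name__(self, owner, name):
        self.public_name = name             # no separate private_name

    def __set__(self, instance, value):
        # setattr goes through normal attribute machinery, which finds this
        # very descriptor on the class and calls __set__ again, forever:
        setattr(instance, self.public_name, value)

class Box:
    weight = BadField()

try:
    Box().weight = 5
except RecursionError:
    print("RecursionError: __set__ intercepted its own write")
```

Storing under _weight (or writing to instance.__dict__ directly, as Quantity did) breaks the cycle.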

Step 2: subclasses that augment the parent’s check. IntField accepts optional min and max; StrField accepts optional length bounds. Each calls super().__set__ first (the type check) and then adds its own constraint:

class IntField(TypedField):
    def __init__(self, *, min=None, max=None):
        super().__init__(int)
        self.min = min
        self.max = max

    def __set__(self, instance, value):
        super().__set__(instance, value)            # type check first
        if self.min is not None and value < self.min:
            raise ValueError(f"{self.public_name} must be >= {self.min}")
        if self.max is not None and value > self.max:
            raise ValueError(f"{self.public_name} must be <= {self.max}")

class StrField(TypedField):
    def __init__(self, *, min_length=0, max_length=None):
        super().__init__(str)
        self.min_length = min_length
        self.max_length = max_length

    def __set__(self, instance, value):
        super().__set__(instance, value)
        if len(value) < self.min_length:
            raise ValueError(
                f"{self.public_name} must be at least {self.min_length} chars"
            )
        if self.max_length is not None and len(value) > self.max_length:
            raise ValueError(
                f"{self.public_name} must be at most {self.max_length} chars"
            )

The two-stage pattern is the payoff of the base class: type checks live in TypedField.__set__; range and length checks live in the subclass __set__ and call super() to run the type check first. A bad type (age = "old") fails before the range check is reached; a bad value (age = -3) passes the type check, then fails the range check with a focused message.

Step 3: combine on one class. A User with a name and an age, each with its own constraints — the validation lives in the descriptors, not in __init__:

class User:
    name = StrField(min_length=1, max_length=50)
    age = IntField(min=0, max=150)

    def __init__(self, name, age):
        self.name = name                            # goes through StrField.__set__
        self.age = age                              # goes through IntField.__set__

    def __repr__(self):
        return f"User(name={self.name!r}, age={self.age})"

User("Alice", 30)
User(name='Alice', age=30)
User("", 30)                                        # too short
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[15], line 1
----> 1 User("", 30)                                        # too short

Cell In[14], line 6, in User.__init__(self, name, age)
      5     def __init__(self, name, age):
----> 6         self.name = name                            # goes through StrField.__set__
      7         self.age = age                              # goes through IntField.__set__

Cell In[13], line 23, in StrField.__set__(self, instance, value)
     20     def __set__(self, instance, value):
     21         super().__set__(instance, value)
     22         if len(value) < self.min_length:
---> 23             raise ValueError(
     24                 f"{self.public_name} must be at least {self.min_length} chars"
     25             )
     26         if self.max_length is not None and len(value) > self.max_length:

ValueError: name must be at least 1 chars
User("Alice", 200)                                  # too high
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[16], line 1
----> 1 User("Alice", 200)                                  # too high

Cell In[14], line 7, in User.__init__(self, name, age)
      5     def __init__(self, name, age):
      6         self.name = name                            # goes through StrField.__set__
----> 7         self.age = age                              # goes through IntField.__set__

Cell In[13], line 12, in IntField.__set__(self, instance, value)
      8         super().__set__(instance, value)            # type check first
      9         if self.min is not None and value < self.min:
     10             raise ValueError(f"{self.public_name} must be >= {self.min}")
     11         if self.max is not None and value > self.max:
---> 12             raise ValueError(f"{self.public_name} must be <= {self.max}")

ValueError: age must be <= 150

Each assignment in __init__ runs the descriptor’s __set__. The error messages name the field (name must be at least 1 chars, age must be <= 150) — that’s why __set_name__ recorded public_name. With ten fields, the class body would be ten declarative lines, each of which is enforced both at construction and on every later assignment.

The build is the chapter applied at full strength: a base descriptor with __set_name__ and the storage protocol (Step 1), subclasses that compose constraints via super().__set__ (Step 2), and a model class whose validation is entirely declarative (Step 3). This is the same shape the standard ORM and validation libraries (SQLAlchemy, Pydantic, Django models) use under the hood — minus the metaclass tricks that let them generate the field declarations from type hints.

35.7 Exercises

  1. __delete__. Extend Quantity with a __delete__ method that removes the stored value, raising AttributeError if it was never set. Test with del raisins.weight.

  2. Class-level access. Notice the if instance is None: return self line in __get__. Remove it and access LineItem.weight. What goes wrong?

  3. Function as descriptor. Run print(C.method) and print(C().method). Now do C.method.__get__(C(), C). What’s the type? What did __get__ do?

  4. An auto-typed descriptor. Write Typed(int) that rejects assignments not matching the given type. Use it to create a class with Typed(int) and Typed(str) attributes.

  5. Compare with @property. Re-implement the LineItem class using @property for both weight and price. Count the lines. Now imagine the class has ten such attributes.

35.8 Summary

Descriptors are the engine behind @property, methods, classmethods, and static methods. Use them when the same logic applies across attributes or classes — that’s when the protocol pays for its complexity. For one-off validation, @property is plenty.

Next, Chapter 36 closes the book with the deepest layer: classes are objects, classes have classes (called metaclasses), and __init_subclass__ lets you customize subclass creation without writing a metaclass.