Python variables are not boxes that hold values — they are labels attached to objects. This single mental model explains assignment, aliasing, copying, and garbage collection.
In this chapter you will learn to:
Explain why b = a does not copy the object, and predict the consequences.
Use == and is correctly: value equality versus object identity.
Distinguish shallow copy (list(x), x[:]) from deep copy (copy.deepcopy).
Use weakref for caches and observers.

The single most useful mental model: a Python variable is a sticky note you attach to an object. Assignment moves the sticky note; it does not copy the object underneath.
a = [1, 2, 3]
b = a
b.append(4)
a
[1, 2, 3, 4]
a and b are two labels on the same list. The append mutated the list — both labels see the change.
flowchart LR
A(("a")) --> L["[1, 2, 3, 4]"]
B(("b")) --> L
Two labels, one list — mutation through either label is visible through both.
a is b
True
is compares identity: same object in memory? Yes. Reassigning b does not affect a:
b = [10, 20]
a, b
([1, 2, 3, 4], [10, 20])
Now b points to a new list. The original list (with a still attached) is untouched.
== and is ask different questions. == calls __eq__ — same value? is checks identity — same object?
charles = {"name": "Charles L. Dodgson", "born": 1832}
lewis = charles
lewis is charles, id(lewis) == id(charles)
(True, True)
lewis = charles makes lewis a second name for the dict, not a copy. So both questions return True: lewis is charles because they point at the same object, and id(lewis) == id(charles) because identical objects have the same identity number. The two checks are redundant on purpose — is is just a built-in shortcut for “do these have equal id?”
lewis["balance"] = 950
charles
{'name': 'Charles L. Dodgson', 'born': 1832, 'balance': 950}
We mutated one dict through the lewis name; reading charles (the other name on that one dict) shows the new key. This is aliasing made visible: same object, two labels, and a mutation through either is visible through both.
alex = {"name": "Charles L. Dodgson", "born": 1832, "balance": 950}
alex == charles, alex is charles
(True, False)
alex is built from a fresh dict literal — different object, but the same contents as charles after the mutation above. So alex == charles is True (value equality holds — both dicts have the same keys with the same values), while alex is charles is False (distinct objects in memory). That’s the equality-vs-identity split made concrete.
alex == charles (same value) but alex is not charles (different objects in memory).
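The same split shows up with user-defined classes. The sketch below is illustrative only: Account and Plain are hypothetical classes, not part of the chapter's examples. It assumes the standard behaviour that a class without its own __eq__ compares by identity.

```python
class Account:
    """Hypothetical class used only for this illustration."""

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def __eq__(self, other):
        # Value equality: two accounts are equal when their fields match.
        if not isinstance(other, Account):
            return NotImplemented
        return (self.owner, self.balance) == (other.owner, other.balance)


class Plain:
    """Hypothetical class with no __eq__, so == falls back to identity."""


acct1 = Account("Charles", 950)
acct2 = Account("Charles", 950)
same_value = acct1 == acct2    # True: __eq__ compares the fields
same_object = acct1 is acct2   # False: two separate instances

p1, p2 = Plain(), Plain()
plain_equal = p1 == p2         # False: default equality is identity
```

Defining __eq__ is what turns == into a value comparison; is never consults it.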
is only for None
x is None is the idiomatic spelling. x == None works but goes through __eq__, which a malicious or buggy class can override. is is faster, simpler, and impossible to spoof. Reserve is for None, sentinel objects, and class identity (type(x) is C).
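To see why == None can be fooled, here is a small sketch with a hypothetical class, AlwaysEqual, whose __eq__ claims equality with everything. It is a deliberately silly stand-in for a buggy or hostile class.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


x = AlwaysEqual()
spoofed = (x == None)  # True, because __eq__ says so
honest = (x is None)   # False: identity cannot be overridden
```

The is check gives the right answer regardless of what the class defines.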
A tuple is immutable in the sense that the references it holds cannot be replaced. The objects those references point to can still mutate:
t1 = (1, 2, [30, 40])
t2 = (1, 2, [30, 40])
t1 == t2, t1 is t2
(True, False)
Both tuples have the same shape — 1, 2, list-with-30-40 — so t1 == t2 is True (recursive equality across each slot). They were built from separate literals, so they’re distinct objects: t1 is t2 is False. So far so normal.
t1[-1].append(99)
t1, t1 == t2
((1, 2, [30, 40, 99]), False)
t1[-1] is the list inside t1 — [30, 40]. We mutate that list via .append(99). The tuple t1 itself was not modified (its three references are unchanged — slot 0 is still 1, slot 1 still 2, slot 2 still points at the same list); but the list it holds is now [30, 40, 99]. So t1 shows (1, 2, [30, 40, 99]), and t1 == t2 flips to False because slot 2 now differs. The tuple is “immutable” in the sense that its references can’t be reassigned — but it can’t stop the referenced objects from mutating.
The tuple t1 did not change in any sense Python checks at the tuple level, but the list inside it grew, so t1 == t2 flipped to False. This is why (1, 2, [30, 40]) is not hashable: a tuple’s hash is computed from the hashes of its items, and the list it holds is unhashable because its value can change.
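The hashability claim can be checked directly. The helper is_hashable below is a hypothetical convenience written for this sketch; it only wraps the built-in hash() in a try/except.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


all_immutable = (1, 2, (30, 40))   # tuples all the way down
holds_a_list = (1, 2, [30, 40])    # same shape, but slot 2 is a list

results = (is_hashable(all_immutable), is_hashable(holds_a_list))
# results == (True, False)
```

A tuple is usable as a dict key or set member only if everything it contains is hashable too.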
list(x), x[:], dict(x), and set(x) all make shallow copies. The outer container is new; the contained references are shared.
import copy
l1 = [3, [55, 44], (7, 8, 9)]
l2 = list(l1)
l1 == l2, l1 is l2
(True, False)
l2[1].append(33)
l1
[3, [55, 44, 33], (7, 8, 9)]
The outer lists are different (l1 is not l2), but the inner list is the same object. Mutating it through l2 is visible through l1.
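The same behaviour applies to dicts. As a minimal sketch (the scores data is invented for illustration), dict(x) and copy.copy(x) both produce a new outer dict whose values are the original objects:

```python
import copy

scores = {"alice": [90, 85], "bob": [70]}
by_constructor = dict(scores)
by_copy_module = copy.copy(scores)

# The outer dicts are distinct objects...
outer_distinct = by_constructor is not scores and by_copy_module is not scores
# ...but each value is the very same list in all three dicts.
inner_shared = by_constructor["alice"] is scores["alice"] is by_copy_module["alice"]

by_constructor["alice"].append(100)  # mutating a shared value leaks everywhere
by_constructor["carol"] = [60]       # adding a key only touches the copy
```

After this runs, scores["alice"] contains 100 as well, while "carol" exists only in by_constructor.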
For independence at every level, use copy.deepcopy:
l3 = copy.deepcopy(l1)
l3[1].append(99)
l1[1], l3[1]
([55, 44, 33], [55, 44, 33, 99])
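Identity checks make the difference between the two kinds of copy explicit. The following sketch uses fresh names (original, shallow, deep, loop) chosen for illustration, and also shows that copy.deepcopy copes with a container that refers to itself:

```python
import copy

original = [3, [55, 44], (7, 8, 9)]
shallow = copy.copy(original)
deep = copy.deepcopy(original)

shares_inner_list = shallow[1] is original[1]    # True: same inner list
deep_inner_is_new = deep[1] is not original[1]   # True: inner list was copied

# A list that contains itself: deepcopy tracks what it has already copied.
loop = [1, 2]
loop.append(loop)
loop_copy = copy.deepcopy(loop)
cycle_preserved = loop_copy[2] is loop_copy      # True: the copy points at itself
```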
Python passes object references: each parameter becomes another label on the caller’s object, not a copy of it. Mutating the object through a parameter mutates the caller’s object. Rebinding the parameter does not.
def f(x):
    x *= 2

a = [1, 2, 3]
f(a)
a
[1, 2, 3, 1, 2, 3]
*= on a list is in-place repetition (it calls __imul__), so a was mutated.
b = 5
f(b)
b
5
*= on an int is not in-place — int is immutable, so it rebinds the local name to a new object. The caller’s b is unaffected.
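The two outcomes can be separated cleanly with a pair of functions. This is a minimal sketch; mutate and rebind are hypothetical names introduced here, not functions from the chapter.

```python
def mutate(items):
    items.append("added")      # changes the object the caller also sees


def rebind(items):
    items = items + ["added"]  # builds a new list; only the local name moves
    return items


caller_list = ["start"]

mutate(caller_list)
after_mutate = list(caller_list)   # ['start', 'added']

returned = rebind(caller_list)
after_rebind = list(caller_list)   # still ['start', 'added']
```

The caller’s list changes only when the function mutates the shared object; rebinding the parameter affects nothing outside the function. Any object that stays alive across calls and gets mutated behaves the same way.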
This is why mutable default arguments are a trap. The default is evaluated once, at function definition time, and shared across every call:
class HauntedBus:
    def __init__(self, passengers=[]):
        self.passengers = passengers

bus1 = HauntedBus(["Alice", "Bill"])
bus2 = HauntedBus()
bus3 = HauntedBus()
bus2.passengers.append("Carrie")
bus3.passengers
['Carrie']
Walking through what happened:
passengers=[] is evaluated once, when Python executes the def statement. That single empty list is stored as the default and reused on every call that doesn’t supply one.
HauntedBus(["Alice", "Bill"]) is fine: bus1.passengers is the caller’s list.
HauntedBus() (no argument) makes bus2.passengers point to the shared default list. HauntedBus() again makes bus3.passengers point to the same list.
bus2.passengers.append("Carrie") mutates that shared list. bus3.passengers sees the mutation, because there’s only one list.
The fix is the standard idiom, a None sentinel plus a copy:
class Bus:
    def __init__(self, passengers=None):
        if passengers is None:
            self.passengers = []
        else:
            self.passengers = list(passengers)

bus_a = Bus()
bus_b = Bus()
bus_a.passengers.append("Carrie")
bus_a.passengers, bus_b.passengers
(['Carrie'], [])
Walking through the fix:
passengers=None is a sentinel: None is immutable, so there’s no shared-state risk.
if passengers is None: self.passengers = [] builds a fresh list per call. Each Bus() instance gets its own.
else: self.passengers = list(passengers) makes a shallow copy of whatever the caller passed. Without this copy, mutating bus.passengers would also mutate the caller’s list, which is the same aliasing problem in a different shape.
The general rule: never use a mutable object as a parameter default in a def. Use None plus a fresh allocation inside the body.
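The stored default is not hidden: a function object exposes it through its __defaults__ attribute. The sketch below uses two hypothetical functions, buggy and fixed, to show the shared list growing in one case and staying None in the other.

```python
def buggy(item, bucket=[]):      # hypothetical function showing the trap
    bucket.append(item)
    return bucket


def fixed(item, bucket=None):    # hypothetical function showing the fix
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


buggy("a")
buggy("b")
shared_default = buggy.__defaults__   # (['a', 'b'],): the one list keeps growing

first = fixed("a")
second = fixed("b")
safe_default = fixed.__defaults__     # (None,): nothing mutable is stored
```

Inspecting __defaults__ is a quick way to confirm whether a function is carrying state between calls.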
del and garbage collection
del x removes the name x. The object only goes away when its last reference does. CPython uses reference counting, plus a cycle detector for unreachable cycles.
a = [1, 2, 3]
b = a
del a
b
[1, 2, 3]
a is gone but the list lives on through b.
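On CPython the reference count can be observed with sys.getrefcount. The sketch below assumes a standard CPython build; the absolute numbers include temporary references and vary between versions, so only the differences are meaningful.

```python
import sys

data = [1, 2, 3]
before = sys.getrefcount(data)      # absolute value is an implementation detail

alias = data                        # a second label on the same list
with_alias = sys.getrefcount(data)  # one higher than before

del alias                           # removes the label, not the list
after_del = sys.getrefcount(data)   # back to the starting count
```

The del statement lowers the count by one; the list is freed only when the count reaches zero.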
weakref is a reference that doesn’t keep the object alive:
import weakref, gc
class Holder: pass
s1 = Holder()
wref = weakref.ref(s1)
print(wref() is s1)
del s1
gc.collect()
print(wref())
True
None
Walking through the moving parts:
weakref.ref(s1) creates a weak reference object. Contrast with an ordinary assignment, which does keep the object alive: if we’d written another = s1 instead, that second binding would bump the reference count and the del s1 further down would no longer release the Holder (because another would still hold a strong reference). The weak ref deliberately doesn’t count toward keeping the object alive.
wref() (calling the weakref like a function) returns the object if it’s still alive, else None.
del s1 removes the only strong reference. The reference count drops to zero and the Holder is collected.
wref() now returns None because the underlying object is gone.
The general rule: a weak reference lets you observe an object without prolonging its life. It is the building block for caches, observers, and any structure that should not extend an object’s lifetime.
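weakref.ref also accepts an optional callback that runs once the object has been collected, which is one way observers can react to an object’s death. This is a minimal sketch: on_collect and events are hypothetical names, and the immediate collection after del relies on CPython’s reference counting.

```python
import gc
import weakref


class Holder:
    pass


events = []


def on_collect(ref):
    # Called with the (now dead) weak reference once the object is gone.
    events.append("collected")


obj = Holder()
wref = weakref.ref(obj, on_collect)
alive_before = wref() is obj   # True: the object is still alive

del obj                        # last strong reference removed
gc.collect()                   # not needed on CPython here, but harmless
dead_after = wref() is None    # True: the weak reference is now dead
```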
weakref.WeakValueDictionary is the everyday tool — a dict whose values are weak. Entries vanish as soon as their last strong reference dies, so the dict can never extend an object’s lifetime:
class Cheese:
    def __init__(self, kind):
        self.kind = kind

    def __repr__(self):
        return f"Cheese({self.kind!r})"

stock = weakref.WeakValueDictionary()
catalog = [Cheese("Red Leicester"), Cheese("Tilsit"), Cheese("Brie")]
for cheese in catalog:
    stock[cheese.kind] = cheese

sorted(stock.keys())
['Brie', 'Red Leicester', 'Tilsit']
del catalog
del cheese # the for-loop's leftover binding
gc.collect()
sorted(stock.keys())
[]
Walking through what each piece does:
stock = weakref.WeakValueDictionary() is a dict whose values are held weakly. Keys are normal; only values are special.
The for cheese in catalog loop registers each cheese in stock keyed by its kind. While catalog is alive, the cheeses are alive, so all three keys appear.
del catalog removes the strong references in the list. But the loop variable cheese is still bound to the last cheese ("Brie"), because Python doesn’t clear it when the loop ends.
del cheese removes that last strong reference. Now nothing keeps any of the cheeses alive except the weak entries in stock, which don’t count.
gc.collect() forces Python’s cyclic garbage collector to run a pass immediately. CPython’s primary cleanup mechanism is reference counting (an object is freed the moment its count hits zero), but reference counting alone can’t free cycles (a.b = b; b.a = a); the cyclic GC catches those, normally on its own schedule. We call gc.collect() here to make the cleanup happen now so the next cell sees the result, not at some unpredictable later moment.
The general rule: caches and registries that must not extend object lifetime use WeakValueDictionary, because entries clean themselves up as soon as the underlying objects are no longer needed elsewhere.
CPython sometimes interns small ints and short strings — the same value yields the same object — but only as an implementation detail.
a = 1000
b = 1000
a is b, a == b
(False, True)
The result of a is b for 1000 is implementation-dependent and can change between Python versions. Never use is for value comparison. Use ==.
t1 = (1, 2, 3)
t2 = (1, 2, 3)
t1 is t2, t1 == t2
(False, True)
The first answer can be True or False. The second is always True. That’s the one to trust.
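The same caution applies to values built at run time. In the sketch below the identity results are deliberately left uncommented, because they depend on the interpreter; only the equality results are guaranteed.

```python
big_a = int("1000")
big_b = int("1000")
values_equal = big_a == big_b    # always True
same_object = big_a is big_b     # implementation detail: do not rely on it

# Strings assembled at run time show the same split.
s1 = "".join(["py", "thon"])
s2 = "".join(["py", "thon"])
strings_equal = s1 == s2         # always True
strings_same = s1 is s2          # implementation detail: do not rely on it
```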
Variables are labels, not boxes. Assignment creates a new label for an existing object — it does not copy the object. This is why you must explicitly copy() or deepcopy() when you need independence. And it’s why mutable default arguments are a trap that never fails to catch beginners.
Cart class
A shopping cart looks trivial, with items, quantities, and a total, until aliasing turns one customer’s cart into a phantom that mutates whenever any other customer adds something. We’ll build it three times: hit the trap, fix it with the None sentinel, then handle nested mutables with deepcopy.
Step 1: the trap — mutable default argument. The same shape as HauntedBus from earlier in the chapter, made concrete:
class CartBuggy:
    def __init__(self, items=[]):  # the trap
        self.items = items

    def add(self, item):
        self.items.append(item)

a = CartBuggy()
a.add("apple")
b = CartBuggy()  # fresh customer, supposedly
a.items, b.items
(['apple'], ['apple'])
b was meant to be empty — but the items=[] default is evaluated once at def time, and every CartBuggy() call without an explicit items shares it. When a.add("apple") mutates that shared list, b.items sees the same "apple" it never asked for. This is the bug the chapter has been warning about, applied to a real-feeling class.
Step 2: fix with None sentinel and defensive copy. Two changes: default to None, and copy whatever the caller passes so callers can’t mutate self.items from the outside:
class Cart:
    def __init__(self, items=None):
        self.items = [] if items is None else list(items)

    def add(self, item):
        self.items.append(item)

a = Cart()
a.add("apple")
b = Cart()  # really fresh now
[a.items, b.items]
[['apple'], []]
items=None is safe to share because None is immutable. The body builds a fresh [] per instance when the caller didn’t supply one, and list(items) makes a shallow copy when they did. After this, a and b cannot share state.
caller_list = ["banana"]
c = Cart(caller_list)
caller_list.append("contraband") # caller mutates their list
c.items  # cart is unaffected
['banana']
The list(items) copy in __init__ is what makes the second step robust. Without it, the caller’s list and c.items would be aliases.
Step 3: handle nested mutables with deepcopy. Now suppose each item is a dict like {"name": "apple", "qty": 1}. A shallow copy of the items list copies the outer list — but the dicts inside are still shared:
import copy
class CartV2:
    def __init__(self, items=None):
        self.items = [] if items is None else list(items)  # shallow

    def snapshot(self):
        """Return an independent copy, safe even if items are mutable."""
        new = CartV2()
        new.items = copy.deepcopy(self.items)
        return new

c1 = CartV2([{"name": "apple", "qty": 1}, {"name": "pear", "qty": 2}])
c2 = c1.snapshot()
c2.items[0]["qty"] = 99  # mutate the snapshot
c1.items[0]["qty"], c2.items[0]["qty"]
(1, 99)
copy.deepcopy recurses into the items, copying every nested dict — so c2’s qty=99 mutation doesn’t reach c1. The shallow copy in __init__ is the right default (cheap, and most callers don’t mutate inner dicts); snapshot() opts into the deep version when independence at every level is needed.
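For contrast, here is what the shallow copy does not protect against. The sketch works on a bare list of dicts (order is a made-up name) rather than the cart class, so the two kinds of copy can be compared side by side:

```python
import copy

order = [{"name": "apple", "qty": 1}]

shallow_items = list(order)            # new outer list, same dicts inside
deep_items = copy.deepcopy(order)      # new outer list, new dicts inside

order[0]["qty"] = 50                   # the caller mutates an inner dict

shallow_qty = shallow_items[0]["qty"]  # 50: the change leaked through
deep_qty = deep_items[0]["qty"]        # 1: fully independent
```

Which copy is appropriate depends on whether the inner objects are ever mutated after they are handed over.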
The build threads the chapter’s three lessons through one running class: the mutable-default trap (Step 1), the sentinel-plus-defensive-copy fix (Step 2), and the deepcopy escape hatch when references nest (Step 3). Once you’ve internalised which level of independence each step provides, “why did my data get corrupted?” stops being a recurring question.
Predict the output. Given a = [1, 2, 3], b = a, b = b + [4], what does a end with? Now repeat with b += [4]. Why are the answers different?
Spot the alias. Write a function clone(d: dict) -> dict that takes a dict of lists and returns a copy where mutations to the new dict’s lists don’t leak into the original. Test it.
Default-arg trap. Write a function append_then(x, lst=[]) that appends x to lst and returns it. Call it three times with no lst argument. Explain the result. Fix it.
Identity in containers. Why is [1, 2, [3]] not allowed as a dict key? What error do you get? What about (1, 2, (3, 4))?
weakref lifetime. Replace Holder with a class whose __del__ method prints a message when it runs. Use weakref.ref to observe collection. Now create a cycle (a.b = b; b.a = a) and watch what happens with and without gc.collect().
Variables are labels. Assignment moves labels, not objects. Copies default to shallow; you opt into deepcopy. Mutable defaults are evaluated once and shared forever — use None and copy. Reference counting plus a cycle detector handle most cleanup automatically; weakref is the escape hatch for caches and observers.
That closes Part I. Part II turns to functions: how Python’s first-class function model enables decorators, closures, type hints, and a pattern catalog rewritten without classes. We start in Chapter 19.