Python Memory Leak: How to Find, Diagnose, and Fix It

A Python memory leak does not crash your process immediately; it kills it slowly. RSS climbs 10MB per hour, your Django worker restarts at 2am, your Celery task queue grinds to a halt after 6 hours of uptime. CPU looks fine. The code looks fine. But something is holding references, so neither reference counting nor the garbage collector can release the objects. This page covers the exact tools and patterns to find what is leaking, why the GC missed it, and how to fix it without rewriting your application.

Covers CPython 3.10+, Django, Celery, and long-running scripts. Every section includes runnable diagnostic code you can drop into a production process right now.


TL;DR

  • Python uses reference counting as the primary memory management mechanism — an object is freed the moment its refcount hits zero, not when GC runs
  • The cyclic GC only handles circular references: if two objects reference each other with no external references, only the cyclic collector can free them
  • tracemalloc is the fastest way to find a leak — two snapshots, one diff, you see exactly which line allocated the most memory
  • The most common leak pattern in production: a global list, dict, or cache that grows unbounded because nothing ever removes entries
  • Circular references with __del__ methods were uncollectable before Python 3.4 — in 3.4+ the GC handles them, but they still add GC overhead
  • weakref.ref() breaks reference cycles without preventing collection — the correct fix for caches and observer patterns

How Python Memory Management Works: Reference Counting and GC

Python memory management runs on two layers. The first layer is reference counting — every object carries a counter of how many references point to it. When that counter hits zero, CPython frees the object immediately, without waiting for the garbage collector. This is fast and deterministic. The second layer is the cyclic garbage collector — it runs periodically to catch objects that reference each other in a cycle, keeping each other’s refcount above zero even when nothing outside the cycle holds a reference.

Understanding this two-layer model is the prerequisite for diagnosing any Python memory leak. If an object is not being freed, one of two things is true: something still holds a reference to it (refcount above zero), or it is part of a reference cycle that the GC has not collected yet. These are different problems with different fixes.

# Python — check reference count of an object at runtime
import sys
import gc

data = {"key": "value"} # create an object
print(sys.getrefcount(data)) # typically 2 on CPython (one for 'data', one for the call's temporary argument); the exact value can vary by version

# force a GC collection cycle — does NOT free objects with refcount > 0
collected = gc.collect()
print(f"GC collected {collected} objects") # only collects unreachable cycles

Without understanding that gc.collect() only handles cycles — not all unreferenced objects — developers waste hours calling it manually and wondering why memory does not drop. If your leak is a growing dict with live references, no amount of gc.collect() calls will free it.

Does Python Have Memory Leaks?

Yes — Python has memory leaks, and they are common in production. Reference counting prevents most leaks automatically, but it cannot handle circular references, unbounded caches, closures that capture large objects, or C extension modules that mismanage memory. Long-running processes — Django workers, Celery tasks, WebSocket servers, data pipelines — are the most common victims because they accumulate leaked references over hours of uptime instead of seconds.

Why Python gc.collect() Does Not Free Memory

gc.collect() only frees objects that are part of reference cycles and have no external references. It does nothing for objects that are still referenced — even if those references are unintentional. If your process memory keeps growing after calling gc.collect(), the leak is not a reference cycle — it is a live reference you have not found yet. Common culprits: a module-level list that accumulates items, a class-level cache that never expires entries, or a closure that holds a reference to a large dataset.

Python Memory Leak Detection: tracemalloc and objgraph in Practice

Python memory leak detection starts with tracemalloc — it is built into the standard library since Python 3.4 and requires zero dependencies. The workflow is simple: take a snapshot before the suspected leaking operation, run the operation, take a second snapshot, diff the two. The diff shows you which lines allocated the most memory between snapshots, sorted by size. This is the fastest path from “memory is growing” to “here is the line causing it”.

The snippet below shows a complete tracemalloc diagnostic session. The key is the key_type='lineno' argument: it groups allocations by source line rather than by traceback, which gives you a flat ranked list instead of a nested tree. Focus on the top 3–5 entries in the diff output.

# Python — tracemalloc memory leak detection (drop into any module)
import tracemalloc

tracemalloc.start()
snapshot1 = tracemalloc.take_snapshot() # baseline before suspected leak

run_suspected_leaking_code() # replace with your actual function

snapshot2 = tracemalloc.take_snapshot() # snapshot after
top_stats = snapshot2.compare_to(snapshot1, key_type='lineno')

for stat in top_stats[:5]: # top 5 lines by memory growth
    print(stat)

Without the baseline snapshot, you see total allocations — not the delta. Total allocations include everything Python loaded at startup, which drowns out the actual leak signal. The diff between two snapshots isolates exactly what grew during your operation.
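
When the flat per-line view is not enough, the same diff can be grouped by full traceback and filtered to hide allocator noise. The following is a minimal sketch, assuming a hypothetical leaky_step() workload; the frame depth of 10 is an arbitrary choice, not a recommendation.

```python
import tracemalloc

tracemalloc.start(10)  # store up to 10 frames per allocation


def leaky_step(store):
    """Hypothetical workload: appends 1,000 small strings to a long-lived list."""
    store.extend(str(i) * 10 for i in range(1000))


store = []
before = tracemalloc.take_snapshot()
leaky_step(store)
after = tracemalloc.take_snapshot()

# Exclude allocations made by the import system and by tracemalloc itself.
noise = (
    tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
    tracemalloc.Filter(False, tracemalloc.__file__),
)
before = before.filter_traces(noise)
after = after.filter_traces(noise)

# Group by traceback so each entry carries the call chain, not just one line.
top = after.compare_to(before, "traceback")
worst = top[0]
print(f"{worst.size_diff / 1024:.1f} KiB in {worst.count_diff} new blocks")
for line in worst.traceback.format():
    print(line)

tracemalloc.stop()
```

Storing more frames per allocation makes the tracebacks more useful but increases tracemalloc's own memory and CPU cost, so a deeper setting is best reserved for targeted investigations.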

Python Memory Leak Detection with objgraph

objgraph complements tracemalloc by showing you object counts and reference graphs instead of raw allocation sizes. Install with pip install objgraph. The most useful call is objgraph.show_growth(), which prints the object types whose count grew since the last call. Run it inside a loop to watch which types accumulate. When you find the leaking type, objgraph.show_backrefs(objgraph.by_type('YourClass')[0], filename='backrefs.png') renders a reference graph showing exactly what is keeping the object alive (rendering requires Graphviz to be installed). This is the fastest way to find unexpected references in complex object graphs.
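
If installing objgraph on the affected host is not an option, the growth report can be approximated with the standard library alone. This is a rough stand-in, not objgraph's implementation: type_counts(), growth(), Session, and _sessions are hypothetical names, and gc.get_objects() only sees objects tracked by the garbage collector.

```python
import gc
from collections import Counter


def type_counts():
    """Count live GC-tracked objects by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after, limit=5):
    """Return the (type name, increase) pairs that grew the most."""
    delta = {name: after[name] - before.get(name, 0) for name in after}
    grown = [(name, n) for name, n in delta.items() if n > 0]
    return sorted(grown, key=lambda item: item[1], reverse=True)[:limit]


class Session:
    """Hypothetical application object that is leaking."""


_sessions = []  # hypothetical module-level collection that is never trimmed

baseline = type_counts()
_sessions.extend(Session() for _ in range(250))
report = growth(baseline, type_counts())

for name, increase in report:
    print(f"{name:<20} +{increase}")
```

Calling growth() at regular intervals and watching for a type whose count only ever increases gives the same signal as show_growth(), without the reference-graph rendering.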

Python Memory Keeps Growing: Reading RSS vs Heap

RSS (Resident Set Size) reported by your OS or process monitor is not the same as Python heap size. Python requests memory from the OS in large blocks via its internal allocator and rarely returns it — even after objects are freed, the RSS stays high because CPython holds onto freed pages for reuse. A growing RSS over hours means objects are accumulating faster than they are freed. A stable RSS that jumped once and stayed means Python allocated a large block and is reusing it. Use tracemalloc to measure Python-level allocations — not RSS — when diagnosing leaks.
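
Python-level usage can be read with tracemalloc.get_traced_memory(), which returns the current and peak traced size in bytes. The sketch below allocates and frees roughly 10 MB to show the traced figure dropping after the objects are released; it does not measure RSS, which may well stay high after the same operation.

```python
import tracemalloc

tracemalloc.start()
tracemalloc.reset_peak()
baseline, _unused_peak = tracemalloc.get_traced_memory()

block = [bytes(1024) for _ in range(10_000)]  # roughly 10 MB of live objects
held, _unused_peak = tracemalloc.get_traced_memory()

del block  # last reference gone, so the list and its items are freed
released, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"baseline={baseline} held={held} released={released} peak={peak}")
```

If the traced figure returns to its baseline between operations while RSS stays flat at a higher level, the process is reusing memory rather than leaking it.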

Python Circular Reference Memory Leak: How the GC Misses It

A Python circular reference memory leak happens when two or more objects reference each other, keeping each other’s refcount above zero even when nothing outside the cycle holds a reference. CPython’s reference counting cannot detect this: the refcount never reaches zero for any object in the cycle. The cyclic collector handles it, but only when its allocation thresholds trigger a run, so the garbage can sit in memory for a while, and each collection has a meaningful CPU cost on large heaps.

The most common production pattern is a parent-child relationship where the child holds a reference back to the parent. This is natural in tree structures, observer patterns, and callback registrations. The snippet below shows the cycle and the correct fix using weakref.

# Python — circular reference and weakref fix
import weakref

# WRONG: strong reference cycle, both objects keep each other alive
class Node:
    def __init__(self, parent=None):
        self.parent = parent # strong reference back to parent
        self.children = []

root = Node()
child = Node(parent=root) # child holds strong ref to root
root.children.append(child) # root holds strong ref to child
# 'del root, child' does not free them: each keeps the other's refcount above zero
# until the cyclic GC runs

# RIGHT: weakref breaks the cycle, child does not prevent parent collection
class NodeSafe:
    def __init__(self, parent=None):
        # weak reference; call self.parent() to get the parent node, or None if it is gone
        self.parent = weakref.ref(parent) if parent else None
        self.children = []

Without the weakref on the parent reference, every Node in a tree keeps its parent alive until the cyclic GC gets around to collecting the whole tree. In a long-running process that builds and discards thousands of trees (a parser, a request handler, a task scheduler) this compounds into hundreds of megabytes of unreachable objects waiting for the next GC cycle.
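
A weak back-reference has to be called to obtain the parent, and the result can be None once the parent is gone. The sketch below wraps that in a property; TreeNode is a hypothetical variant of the NodeSafe idea, and the immediate release after del relies on CPython's reference counting.

```python
import weakref


class TreeNode:
    """Strong references down the tree, weak references back up."""

    def __init__(self, parent=None):
        self._parent = weakref.ref(parent) if parent is not None else None
        self.children = []
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        """Return the parent node, or None if it was collected or never existed."""
        return self._parent() if self._parent is not None else None


root = TreeNode()
child = TreeNode(parent=root)
parent_alive_before = child.parent is root

del root  # no strong references remain, so CPython frees the parent right away
parent_alive_after = child.parent is not None

print(parent_alive_before, parent_alive_after)
```

Callers must be prepared for parent to return None, which is the price of not keeping the parent alive from below.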

Python Circular Reference Garbage Collector: GC Thresholds

The cyclic GC runs based on allocation thresholds, not time. With the classic CPython defaults of (700, 10, 10), generation 0 is collected once allocations of tracked objects exceed deallocations by 700, generation 1 after 10 generation-0 collections, and generation 2 after 10 generation-1 collections. Newer CPython releases may use different values, so read the actual ones with gc.get_threshold() rather than assuming them. In a high-throughput service allocating thousands of objects per second, the GC runs frequently, and full collections on large heaps can add pauses in the tens of milliseconds. You can tune thresholds with gc.set_threshold(), but the better fix is eliminating the cycles so the GC has less work. Check current counts with gc.get_count().
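
The inspection and tuning calls mentioned above look like this in practice. The value 5000 is an arbitrary illustration, not a recommended setting; any change should be validated against measurements from the real workload.

```python
import gc

original = gc.get_threshold()  # tuple of three integers
counts = gc.get_count()        # current collection counters
print(f"thresholds={original} counts={counts}")

# Hypothetical tuning: make the youngest generation collect less often.
gc.set_threshold(5000, original[1], original[2])
tuned = gc.get_threshold()

gc.set_threshold(*original)    # restore the previous values
restored = gc.get_threshold()
print(f"tuned={tuned} restored={restored}")
```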

Python GC Not Collecting: When __del__ Blocks Collection

Before Python 3.4, objects with a __del__ finalizer that were part of a reference cycle were uncollectable — the GC put them in gc.garbage and left them there forever. This was a well-known leak source in Python 2 and early Python 3 code. In Python 3.4+, PEP 442 fixed this — the GC can now collect cycles containing finalizers in a safe order. If you are running 3.4+ and still see objects in gc.garbage, check with print(gc.garbage) — a non-empty list after gc.collect() means you have uncollectable objects that need manual inspection.
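
The PEP 442 behavior is easy to confirm on a modern interpreter. In this sketch, Resource and build_cycle() are hypothetical; the point is that both finalizers run and nothing from the cycle ends up in gc.garbage.

```python
import gc

finalized = []


class Resource:
    """Hypothetical object with a finalizer that takes part in a cycle."""

    def __init__(self, name):
        self.name = name
        self.peer = None

    def __del__(self):
        finalized.append(self.name)


def build_cycle():
    """Create a two-object cycle that is unreachable once the function returns."""
    a, b = Resource("a"), Resource("b")
    a.peer, b.peer = b, a


gc.collect()  # clear unrelated garbage first
build_cycle()
unreachable = gc.collect()

leftover = [obj for obj in gc.garbage if isinstance(obj, Resource)]
print(f"collected={unreachable} finalized={sorted(finalized)} leftover={leftover}")
```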

Python Memory Leak in Loop, Django, and Celery: Real Scenarios

A Python memory leak in a loop is the simplest and most common form, and the easiest to miss because the loop itself looks innocent. The pattern: an unbounded list or dict inside or outside a loop accumulates references with every iteration and nothing ever removes them. In a script that runs for seconds this is invisible. In a Celery worker that processes tasks for 8 hours, it eventually ends in an OOM crash.

The snippet below shows the two most common loop leak patterns. Both involve accumulation without cleanup — the fix in both cases is explicit deletion or using a bounded data structure.

# Python: WRONG vs RIGHT, accumulation in a processing loop
# large_dataset, transform and flush_to_database are placeholders for your own code
results = [] # WRONG: grows without bound

for record in large_dataset:
    processed = transform(record)
    results.append(processed) # every result held in memory for the whole run
    # missing: flush results to DB or file and clear the list

# RIGHT: process in chunks, release references explicitly
CHUNK_SIZE = 500
chunk = []

for record in large_dataset:
    chunk.append(transform(record))
    if len(chunk) >= CHUNK_SIZE:
        flush_to_database(chunk) # write the full chunk
        chunk.clear() # the list drops its references, so the items can be freed

if chunk:
    flush_to_database(chunk) # write the final partial chunk
    chunk.clear()

Without chunk.clear(), the chunk list keeps growing and the chunked version leaks exactly like the first one. Rebinding with chunk = [] also works in the common case, but it leaves the old list alive if anything else still references it, such as a closure or a stored traceback. .clear() empties the existing list object in place, so every holder of that list sees it emptied. The flush after the loop writes whatever is left when the dataset size is not a multiple of CHUNK_SIZE.

Python Memory Leak Django: Request-Scoped vs Process-Scoped Objects

Django memory leaks almost always involve process-scoped state (module-level variables, class-level caches, or Django’s own internal structures) accumulating data that should be request-scoped. The most common pattern: a custom middleware or signal handler that appends to a module-level list on every request. After 10,000 requests the list holds 10,000 items with no cleanup. Check your middleware, signal handlers, and any module-level collections. tracemalloc snapshots taken at request boundaries are the fastest diagnostic tool; in development, a third-party memory panel for Django Debug Toolbar (for example the one shipped with Pympler) can also help. Set CONN_MAX_AGE carefully, since persistent database connections can hold resources in memory longer than expected.
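
Snapshots at request boundaries can be taken from a function-style middleware. The following is a sketch under stated assumptions: the logger name, GROWTH_THRESHOLD, and memory_diff_middleware are hypothetical, and the factory would still need to be listed in MIDDLEWARE to take effect. It uses only the standard library, so it has no Django import of its own.

```python
import logging
import tracemalloc

logger = logging.getLogger("memleak")

GROWTH_THRESHOLD = 1024 * 1024  # hypothetical cut-off: report growth of 1 MiB or more


def memory_diff_middleware(get_response):
    """Django-style middleware factory: takes get_response, returns a request handler."""
    if not tracemalloc.is_tracing():
        tracemalloc.start()

    def middleware(request):
        before = tracemalloc.take_snapshot()
        response = get_response(request)
        after = tracemalloc.take_snapshot()

        stats = after.compare_to(before, "lineno")
        grown = sum(stat.size_diff for stat in stats if stat.size_diff > 0)
        if grown >= GROWTH_THRESHOLD:
            path = getattr(request, "path", "<unknown>")
            logger.warning("request %s grew traced memory by %d bytes", path, grown)
            for stat in stats[:5]:
                logger.warning("  %s", stat)
        return response

    return middleware
```

Taking two snapshots per request is expensive, so a middleware like this is better enabled temporarily, or only for a sample of requests, than left on for all traffic.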

Python Memory Leak Celery: Task State Accumulation

Celery workers leak memory through three common patterns: result backend accumulation when result_expires is disabled or set very high (task results pile up in Redis or the database), large task arguments that are deserialized and held in worker memory during execution, and custom task classes with instance variables that persist between task executions on the same worker process. Set worker_max_tasks_per_child to recycle workers after N tasks. This is not a fix but a mitigation that caps maximum leak size. The real fix is finding the accumulating reference with tracemalloc snapshots taken inside the task itself.
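
The mitigation settings can live in a plain configuration module. The values below are placeholders chosen for illustration, and celeryconfig is a hypothetical module name; the setting names themselves are Celery's lowercase configuration keys.

```python
# celeryconfig.py (hypothetical module name), loaded via app.config_from_object("celeryconfig")

# Recycle each worker child process after it has executed this many tasks.
worker_max_tasks_per_child = 200

# Recycle a child once its resident memory exceeds this limit, given in kilobytes.
worker_max_memory_per_child = 300_000

# Expire stored task results after one hour (in seconds) so the backend does not fill up.
result_expires = 3600
```

Both recycling limits only cap how large a leak can get before the process is replaced; they leave the underlying reference in place.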

How to Fix Python Memory Leak: gc module, weakref, and Profiling

Fixing a Python memory leak follows a strict sequence: measure first, identify the object type, trace the reference chain, apply the minimal fix. Skipping to the fix without measurement is the most common mistake: developers add gc.collect() calls everywhere or switch to weakrefs blindly without confirming those are the actual leak sources. Both waste time and may introduce new bugs.

The four-step fix workflow that covers 95% of production Python memory leaks:

Step 1 — Measure with tracemalloc. Take two snapshots around the suspected leaking code path. Diff them. Identify the top 3 growing allocations by line number. This takes 10 minutes and gives you the exact file and line.

Step 2 — Count object types with objgraph. Run objgraph.show_growth() inside the leaking loop or request handler. Find which object type is accumulating. Then run objgraph.show_backrefs() on an instance to see the full reference chain keeping it alive.

Step 3: Break the reference. Four options depending on the cause: use weakref.ref() for back-references in parent-child relationships; call .clear() on accumulating collections explicitly; bound the size of caches by using functools.lru_cache(maxsize=N) instead of a plain dict; delete large temporary objects explicitly with del inside long loops.

Step 4 — Verify with a second measurement. Run the same tracemalloc diff after the fix. Confirm the previously growing allocation no longer appears in the top 5. Do not trust subjective RSS readings — measure Python-level allocations directly.

# Python — bounded cache with lru_cache vs unbounded dict (common leak fix)
from functools import lru_cache

# WRONG: unbounded dict cache, grows forever and never evicts
_cache = {}
def get_user_data(user_id):
    if user_id not in _cache:
        _cache[user_id] = fetch_from_db(user_id) # cached forever
    return _cache[user_id]

# RIGHT: lru_cache with explicit max size, evicts least recently used entries automatically
@lru_cache(maxsize=512) # holds max 512 entries, evicts LRU
def get_user_data_safe(user_id):
    return fetch_from_db(user_id) # automatic eviction prevents unbounded growth

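An unbounded lru_cache leaks just like the dict: @lru_cache(maxsize=None), and functools.cache in Python 3.9+, never evict anything. The default is maxsize=128; Python 3.8 only added the ability to write the bare @lru_cache decorator without parentheses. Set maxsize explicitly so the bound is visible where the cache is defined. Also note that lru_cache keeps strong references to its arguments, so decorating an instance method keeps every self alive until its entry is evicted.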
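
Whether a cache is actually bounded can be checked at runtime with cache_info(). In this sketch, fetch_from_source() is a hypothetical stand-in for the database call, and maxsize=3 is deliberately tiny so that eviction is visible.

```python
from functools import lru_cache

calls = []


def fetch_from_source(user_id):
    """Hypothetical stand-in for a database lookup."""
    calls.append(user_id)
    return {"id": user_id}


@lru_cache(maxsize=3)
def get_user(user_id):
    return fetch_from_source(user_id)


for uid in range(10):
    get_user(uid)

info = get_user.cache_info()  # named tuple: hits, misses, maxsize, currsize
print(info)

get_user.cache_clear()  # drop every cached entry, for example after a bulk update
size_after_clear = get_user.cache_info().currsize
```

Logging cache_info() periodically in a long-running worker shows whether currsize has plateaued at maxsize or the hit rate is too low to justify the cache at all.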

How to Find Memory Leak in Python Production

In production you usually cannot attach a debugger, so use tracemalloc with a background thread that dumps snapshots to a log file on a schedule. Take a snapshot every 60 seconds, compare to the previous one, log the top 5 growing allocations. When memory starts climbing, the log shows you exactly when the growth started and which line is responsible. Add a SIGUSR1 signal handler that triggers an immediate snapshot on demand, which lets you capture the state during a live incident without restarting the process.
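
One way to wire this together is sketched below. The names log_growth(), start_monitor(), and the "memleak" logger are hypothetical, the 60-second interval comes from the text above, and the signal handler is only installed where SIGUSR1 exists and the caller is the main thread.

```python
import logging
import signal
import threading
import tracemalloc

logger = logging.getLogger("memleak")
_lock = threading.RLock()  # re-entrant, so a signal arriving mid-call cannot deadlock
_previous = None           # last snapshot, shared by the thread and the signal handler


def log_growth(limit=5):
    """Take a snapshot, diff it against the previous one, and log the top growers."""
    global _previous
    with _lock:
        current = tracemalloc.take_snapshot()
        previous, _previous = _previous, current
    if previous is None:
        return []  # the first call only records the baseline
    stats = current.compare_to(previous, "lineno")[:limit]
    for stat in stats:
        logger.info("%s", stat)
    return stats


def start_monitor(interval=60.0):
    """Start tracing plus a daemon thread that logs growth every `interval` seconds."""
    if not tracemalloc.is_tracing():
        tracemalloc.start()
    stop = threading.Event()

    def loop():
        while not stop.wait(interval):
            log_growth()

    threading.Thread(target=loop, name="tracemalloc-monitor", daemon=True).start()

    # SIGUSR1 is Unix-only, and handlers can only be installed from the main thread.
    if hasattr(signal, "SIGUSR1") and threading.current_thread() is threading.main_thread():
        signal.signal(signal.SIGUSR1, lambda signum, frame: log_growth())
    return stop  # call stop.set() to end the background thread
```

With this in place, sending the signal from a shell (kill -USR1 followed by the process ID) writes an extra diff to the log at the moment of the incident.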

Python Memory Profiler: memory-profiler vs tracemalloc

tracemalloc is built-in and suitable for production — low overhead, snapshots on demand, line-level granularity. memory-profiler (pip install memory-profiler) decorates functions with @profile and reports memory usage line by line — it is better for development profiling of specific functions but adds too much overhead for production use. For production diagnosis: tracemalloc snapshots in a background thread. For development: memory-profiler on the specific function you suspect. objgraph sits between both — useful in either context for visualizing reference chains.

FAQ: Python Memory Leak

How to find a memory leak in Python?

Use tracemalloc: call tracemalloc.start(), take a baseline snapshot before the suspected leaking code, run the code, take a second snapshot, diff with snapshot2.compare_to(snapshot1, key_type='lineno'). The output shows which lines allocated the most memory between snapshots. For object-level analysis, use objgraph: objgraph.show_growth() prints which object types grew in count. Combine both tools — tracemalloc finds the line, objgraph shows the reference chain keeping objects alive.

Does Python automatically free memory?

Python frees most objects automatically via reference counting — the moment an object’s reference count hits zero, CPython releases it immediately. However, reference counting cannot handle circular references. For those, the cyclic garbage collector runs periodically. Python also rarely returns freed memory to the OS — it keeps freed pages in its internal allocator for reuse, so RSS reported by the OS may stay high even after objects are collected. This is normal behavior, not a leak.

What causes Python memory leaks in production?

Four causes cover 90% of production cases: unbounded module-level or class-level collections that accumulate entries without expiry; circular references in object graphs where objects hold strong back-references to parents; closures that capture large datasets or objects longer than needed; and C extension modules that mismanage memory outside of Python’s reference counting. Long-running processes — Django workers, Celery tasks, WebSocket servers — expose all four patterns because they run long enough for small leaks to compound into OOM crashes.

How to detect memory leak in Python Django?

Take tracemalloc snapshots at Django request boundaries using custom middleware: snapshot before the view runs, snapshot after, log the diff for requests where memory grew above a threshold (typically 1MB). Check module-level collections in your middleware and signal handlers, since these are the most common Django leak sources. Also check query logging: with DEBUG = True, Django records executed SQL queries in connection.queries. That log is reset at the start of each request, but in a long-running management command or script it keeps growing up to Django's internal cap. Ensure DEBUG = False in production, or call django.db.reset_queries() periodically in long-running management commands.

Why is Python memory not being released after del?

del removes a name binding: it decrements the object’s reference count by one but does not guarantee the object is freed. If other references to the same object exist (another variable, a list entry, a closure, a cache), the refcount stays above zero and the object stays alive. Before the del, sys.getrefcount(obj) - 1 tells you how many references exist (the subtraction discounts the temporary reference created by the call itself). After the del you can no longer reach the object through that name, so track it with a weakref.ref() created beforehand, or list the remaining holders with gc.get_referrers(). If the object is still alive, find and remove those references.
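
A short sketch of that check, assuming a hypothetical Report class and an archive list that plays the role of the forgotten second reference:

```python
import gc
import weakref


class Report:
    """Hypothetical large object."""


report = Report()
archive = [report]           # a second, easily forgotten reference
probe = weakref.ref(report)  # observes the object without keeping it alive

del report
alive_after_del = probe() is not None    # still alive: 'archive' references it

holders = gc.get_referrers(probe())      # containers that still reference the object
held_by_archive = any(holder is archive for holder in holders)

archive.clear()
alive_after_clear = probe() is not None  # last strong reference gone, object freed

print(alive_after_del, held_by_archive, alive_after_clear)
```

gc.get_referrers() can return frames and other internal containers as well, so its output is a starting point for investigation rather than a definitive list.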

How to fix Python memory leak in a loop?

Three fixes depending on the pattern. If you are accumulating results in a list: flush to disk or database in chunks and call list.clear() after each flush. If you are caching computed values: replace the plain dict cache with @lru_cache(maxsize=N) to cap size. If large objects are created inside the loop: add explicit del large_obj at the end of each iteration and call gc.collect() periodically if the objects contain reference cycles. Measure with tracemalloc before and after to confirm the fix worked.

What is a circular reference in Python and how does it cause a memory leak?

A circular reference is when object A holds a reference to object B, and object B holds a reference back to object A. Because each object’s reference count stays above zero (the other object is holding it), CPython’s reference counting never frees either object automatically. The cyclic GC collector handles this eventually, but with a delay and CPU cost. The permanent fix is using weakref.ref() for back-references — a weak reference does not increment the refcount, so the object can be collected normally when no strong references remain.

How to use tracemalloc in production to find a Python memory leak?

Run tracemalloc in a background thread that takes snapshots every 60 seconds and logs the top 10 growing allocations when the diff exceeds a threshold. Store snapshots in a rolling buffer of the last 5, which lets you compare current state to 5 minutes ago during a live incident. Add a signal handler (signal.signal(signal.SIGUSR1, dump_snapshot)) so you can trigger an immediate snapshot from the command line without restarting the process. Keep tracemalloc running permanently only after measuring its cost on your workload: overhead grows with the allocation rate and with the number of frames stored per trace, and tracemalloc.get_tracemalloc_memory() reports how much memory the tracer itself is using.
