How Does Python Yield Work?
Problem
- You want to know how the `yield` keyword in Python works, and when you should use it.
- You need to produce values lazily without building full lists in memory.
- You want a function to pause and resume between values (streaming, pipelines, large files).
- You are unsure how `yield` differs from `return` and how to consume generators.
Solutions
- Use `yield` in a function to create a generator that produces a sequence lazily.
- Iterate the generator with `for`, `next()`, or by converting to a collection.
- Prefer `yield` for large/unknown-size data, streaming I/O, or pipelines.
```python
def countdown(n: int):
    while n > 0:
        yield n  # pause here, resume on next() / next loop
        n -= 1

# consume
for x in countdown(3):
    print(x)  # 3, 2, 1
```
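The same generator can also be consumed manually with `next()`; a quick check reusing `countdown` from above:

```python
g = countdown(2)
print(next(g))  # 2
print(next(g))  # 1
print(list(g))  # [] -- exhausted; a further next(g) would raise StopIteration
```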
- `yield` vs `return`: `return` ends the function once; `yield` can produce many values over time.
```python
def with_return() -> int:
    return 1  # function ends here

def with_yield():
    yield 1
    yield 2
    yield 3

print(list(with_yield()))  # [1, 2, 3]
```
- Compose generators with `yield from` to delegate to sub-iterables.
```python
def chain(*iterables):
    for it in iterables:
        yield from it

print(list(chain([1, 2], (3, 4), range(5, 7))))  # [1, 2, 3, 4, 5, 6]
```
- Get a generator’s final `return` value (advanced): catch `StopIteration` and read its `.value` attribute.
```python
def gen():
    yield 1
    return 99

g = gen()
next(g)  # 1
try:
    next(g)
except StopIteration as e:
    print(e.value)  # 99
```
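Relatedly, `yield from` captures that return value for you, with no manual `StopIteration` handling; a small sketch reusing `gen()` from above:

```python
def delegator():
    result = yield from gen()  # yields 1, then binds gen()'s return value (99)
    yield result

print(list(delegator()))  # [1, 99]
```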
- Two-way communication (advanced): send values back in.
```python
def accumulator():
    total = 0
    while True:
        x = yield total
        if x is not None:
            total += x

g = accumulator()
next(g)           # prime the generator -> yields 0
print(g.send(5))  # 5
print(g.send(7))  # 12
```
When to Use `yield` in Python
Use `yield` when you want lazy evaluation: producing values one at a time instead of building them all at once. This is ideal for:
- Large or infinite data: processing a huge log file line by line, or streaming from a socket.
- Pipelines / streaming: chaining generators so each step consumes and produces values as needed (see the pipeline sketch after the example below).
- Saving memory: avoiding materializing entire lists/dicts in memory.
- Pause & resume logic: coroutines, two-way communication (`send()`/`yield`), stateful computations.
Example:
```python
def read_large_file(path: str):
    with open(path, "rt") as f:
        for line in f:
            yield line.rstrip()

for line in read_large_file("big.log"):
    if "ERROR" in line:
        print(line)
```
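For the pipeline case, generator stages can be chained so each value flows through one at a time; a minimal sketch with made-up stage names:

```python
def numbers():
    yield from range(10)                  # source stage

def evens(it):
    return (x for x in it if x % 2 == 0)  # filter stage (generator expression)

def doubled(it):
    for x in it:                          # transform stage
        yield x * 2

print(list(doubled(evens(numbers()))))    # [0, 4, 8, 12, 16]
```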
Things to Consider
- Generators are single-iteration objects; once exhausted, recreate them.
- Laziness saves memory but defers work; debugging may be trickier.
- `yield from` (PEP 380) simplifies nesting and propagates `return` values.
- Async generators use `async def` + `yield` and are consumed with `async for` (no `yield from` in async generators).
- Type hints: `Iterator[T]` or `Generator[Y, S, R]` from `typing` (Y = yielded, S = sent, R = returned; sketched below).
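As an illustration of those hints (a minimal sketch; the function names are invented for the example):

```python
from typing import Generator, Iterator

def squares(n: int) -> Iterator[int]:
    # Only yields values, so Iterator[int] is enough.
    for i in range(n):
        yield i * i

def running_total() -> Generator[int, int, str]:
    # Yields int, accepts int via send(), returns str when done.
    total = 0
    for _ in range(2):
        received = yield total
        total += received
    return f"final total: {total}"

g = running_total()
print(next(g))    # 0 (prime)
print(g.send(3))  # 3
# g.send(4) -> StopIteration("final total: 7")
```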
Gotchas
- Forgetting to iterate: calling a generator function returns a generator object; it does not run until iterated.
- Mixing `yield` with a valueful `return`: `return X` ends the generator and raises `StopIteration(X)`; loops ignore the value.
- Converting to `list()` defeats laziness and can blow memory on huge streams.
- Generators are not thread-safe by default; avoid concurrent `next()` calls.
- `finally` blocks run only when the generator is closed or exhausted; ensure you fully consume it or call `close()` (see the sketch after this list).
- `yield` cannot appear in a lambda; use a generator function or a generator expression.
- Don’t mutate external state in subtle ways inside generators without clear contracts.
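A minimal sketch of the `finally`/`close()` behavior:

```python
def guarded():
    try:
        yield 1
        yield 2
    finally:
        print("cleanup")  # runs on exhaustion, close(), or garbage collection

g = guarded()
next(g)    # 1
g.close()  # raises GeneratorExit inside the generator; prints "cleanup"
```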
When to Use `yield` Alternatives
If `yield` doesn’t fit, use one of these alternatives, depending on your needs:
Use `return` + normal lists/tuples
When the dataset is small and you want random access or reuse:
```python
def get_small_data() -> list[int]:
    return [1, 2, 3, 4]
```
List comprehensions / generator expressions
For simple one-liners (a generator expression behaves like an anonymous generator function):
```python
squares = (x*x for x in range(10))      # generator
all_squares = [x*x for x in range(10)]  # list
```
`itertools` library
Provides building blocks (`chain`, `islice`, `cycle`, `tee`) that often eliminate the need for custom `yield` functions:
```python
from itertools import islice

for x in islice(range(1000000), 5):
    print(x)  # 0..4
```
Async generators (`async def` + `yield`)
When values come from async sources (e.g., sockets, APIs):
```python
async def fetch_data():
    for i in range(3):
        yield i
```
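Async generators are consumed with `async for`; a minimal sketch using `asyncio`:

```python
import asyncio

async def main():
    async for item in fetch_data():
        print(item)  # 0, 1, 2

asyncio.run(main())
```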
Return objects / iterators
If you need methods beyond iteration, build a custom iterator class instead of `yield`:
```python
class Counter:
    def __init__(self, n: int):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        value = self.n   # capture before decrementing so counting starts at n
        self.n -= 1
        return value
```
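Usage mirrors the `countdown` generator from earlier:

```python
for i in Counter(3):
    print(i)  # 3, 2, 1
```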
Rule of Thumb
- Use `yield`: when you want a stream of values, memory efficiency, or lazy evaluation.
- Use `return` + lists/dicts: when the collection is small or needs to be reused/indexed.
- Use `itertools`/expressions: for one-liners and common iteration patterns.
- Use custom classes: if you need iteration plus extra behavior/state management.
- Use async generators: when working with async I/O.
Sources
- Official Documentation: Yield expressions
- Official Documentation: Generator types
- PEP 255 — Simple Generators
- PEP 342 — Coroutines via Enhanced Generators
- PEP 380 — Syntax for Delegating to a Subgenerator
- Relevant StackOverflow answer
Further Investigation
- Search terms: “python generator patterns”, “generator pipelines”, “async generators”, “contextlib.contextmanager”.
- Explore `itertools` for common lazy building blocks.
- Read about backpressure and streaming design (e.g., reading big files line by line).
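One of those search terms, `contextlib.contextmanager`, is itself built on `yield`: a single-`yield` generator becomes a context manager. A minimal sketch (the file name reuses the earlier example):

```python
from contextlib import contextmanager

@contextmanager
def opened(path: str):
    f = open(path, "rt")
    try:
        yield f       # the value bound by `with ... as`
    finally:
        f.close()     # runs even if the with-body raises

with opened("big.log") as f:
    print(next(f, "<empty>"))  # first line, or a default if the file is empty
```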
TL;DR
`yield` turns a function into a lazy iterator that pauses between values; use it for streaming and memory efficiency.
```python
def read_lines(path: str):
    with open(path, 'rt', encoding='utf-8') as f:
        for line in f:
            yield line.rstrip()

for line in read_lines('huge.log'):
    if 'ERROR' in line:
        print(line)
```