Last Updated: March 13, 2026
Modern Python code relies heavily on powerful language features that make programs more expressive, modular, and efficient. Three of the most important of these features are functions, decorators, and generators.
Functions allow you to organize logic into reusable building blocks. Decorators provide a clean way to extend or modify the behavior of functions without changing their code. Generators enable you to process large amounts of data efficiently by producing values lazily, one at a time.
These concepts appear frequently in AI and data applications. They are used in data pipelines, model training loops, API wrappers, logging utilities, and many Python libraries that power the AI ecosystem.
In this chapter, we will explore how functions, decorators, and generators work, and more importantly, how they are used in real-world Python code.
In Python, functions are values. You can assign a function to a variable, store it in a list, pass it as an argument to another function, and return it from a function. This is what "first-class" means: functions get the same treatment as integers, strings, and any other object.
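A minimal sketch of functions as values (the function names here are illustrative):

```python
def greet(name):
    return f"Hello, {name}!"

def shout(name):
    return f"HELLO, {name}!!"

# Assign a function to a variable and call it through the new name
say = greet
print(say("Ada"))  # Hello, Ada!

# Store functions in a list and iterate over them
for fn in [greet, shout]:
    print(fn("Ada"))

def apply_twice(fn, value):
    # Receive a function as an argument and apply it twice
    return fn(fn(value))

print(apply_twice(lambda s: s + "!", "hi"))  # hi!!
```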
Here is a more realistic example. Suppose you have a list of model evaluation results and you want to sort or filter them using different criteria.
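A sketch of what that might look like (the model names and fields are made up for illustration):

```python
results = [
    {"model": "model-a", "accuracy": 0.91, "latency_ms": 320},
    {"model": "model-b", "accuracy": 0.84, "latency_ms": 110},
    {"model": "model-c", "accuracy": 0.87, "latency_ms": 150},
]

def by_accuracy(result):
    return result["accuracy"]

# Sort by different criteria just by handing sorted() a different key function
best_first = sorted(results, key=by_accuracy, reverse=True)
fastest_first = sorted(results, key=lambda r: r["latency_ms"])

# Filter with a simple predicate
accurate = [r for r in results if r["accuracy"] >= 0.85]

print(best_first[0]["model"])     # model-a
print(fastest_first[0]["model"])  # model-b
```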
The sorted function does not know anything about your data. It just calls whatever function you hand it via key. This is the power of first-class functions: you supply the criterion (what to sort by) while sorted supplies the mechanism (how the sorting is done).
You will encounter *args and **kwargs everywhere in AI codebases, especially in wrapper functions that forward arguments to underlying APIs.
*args collects extra positional arguments into a tuple. **kwargs collects extra keyword arguments into a dictionary. Together, they let you write functions that accept any combination of arguments without knowing them in advance.
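Here is a sketch of the forwarding pattern; embed_text is a hypothetical stand-in for a real embedding call:

```python
def embed_text(text, model="text-embedding-small", dimensions=None):
    # Stand-in for a real embedding API call
    return [0.0] * (dimensions or 8)

def log_and_call(func, *args, **kwargs):
    # *args arrives as a tuple, **kwargs as a dict; both are forwarded as-is
    print(f"calling {func.__name__} with args={args} kwargs={kwargs}")
    return func(*args, **kwargs)

vector = log_and_call(embed_text, "hello world", dimensions=4)
print(len(vector))  # 4
```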
This pattern is the foundation of decorator functions, API wrappers, and middleware. The log_and_call function does not need to know what arguments embed_text takes. It just passes everything through. When the embedding API adds new parameters next month, your wrapper still works without changes.
A practical example you will see in AI projects is building flexible API call helpers:
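A sketch of such a helper; the parameter names are illustrative, not tied to a specific SDK:

```python
def call_llm(prompt, **kwargs):
    """Build request parameters with sensible defaults the caller can override."""
    defaults = {"model": "small-model", "temperature": 0.0, "max_tokens": 256}
    params = {**defaults, **kwargs}  # the caller's kwargs win over the defaults
    # A real helper would send the request here; we return the params to show the merge
    return params

call_llm("Summarize this document")                     # all defaults
call_llm("Write a poem", temperature=0.9, model="big")  # overrides two defaults
```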
The {**defaults, **kwargs} pattern merges two dictionaries, with later values overriding earlier ones. This is how many AI libraries let you set sensible defaults while keeping everything configurable.
A closure is a function that remembers variables from its enclosing scope, even after that scope has finished executing. This sounds abstract, but it is one of the most practical patterns in Python.
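Here is the classic minimal example:

```python
def make_multiplier(factor):
    def multiply(x):
        # `factor` is captured from the enclosing scope
        return x * factor
    return multiply

double = make_multiplier(2)
triple = make_multiplier(3)
print(double(10))  # 20
print(triple(10))  # 30
```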
When make_multiplier(2) runs, it creates the inner function multiply and returns it. The inner function holds a reference to factor=2 even though make_multiplier has already returned. The returned function, together with the variables it captures, is the closure.
Here is a more useful AI example, a function that creates a threshold-based classifier:
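A sketch (the function and label names are illustrative):

```python
def make_classifier(threshold):
    def classify(score):
        # `threshold` is remembered from the call that created this function
        return "positive" if score >= threshold else "negative"
    return classify

strict = make_classifier(0.9)
lenient = make_classifier(0.5)

print(strict(0.75))   # negative
print(lenient(0.75))  # positive
```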
Each classifier remembers its own threshold. You have created specialized functions from a general template without using classes. Closures are the stepping stone to understanding decorators, which are essentially closures that wrap other functions.
Decorators are where closures become genuinely powerful. A decorator is a function that takes a function, wraps it with extra behavior, and returns the wrapped version. The @decorator syntax is just shorthand for passing a function through another function.
Let's build up to decorators step by step.
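A sketch of a timing decorator; slow_embedding simulates a slow API call with a sleep:

```python
import functools
import time

def timing(func):
    @functools.wraps(func)  # preserve the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.3f}s")
        return result
    return wrapper

@timing
def slow_embedding(text):
    time.sleep(0.1)  # simulate a slow API call
    return [0.0] * 8

slow_embedding("hello")
```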
The @timing line is identical to writing slow_embedding = timing(slow_embedding). It is just syntactic sugar. But it makes the intent much clearer: "this function is timed."
When you call slow_embedding("hello"), you are actually calling wrapper("hello"). The wrapper records the start time, calls the original function, records the end time, logs the difference, and returns the original result. The caller has no idea any of this happened.
This is probably the single most important decorator you will use in AI engineering. LLM API calls fail: rate limits, timeouts, and server errors happen constantly. A retry decorator handles all of this transparently:
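A sketch of a configurable retry decorator with exponential backoff (the delay values are illustrative):

```python
import functools
import time

def retry(max_retries=3, base_delay=1.0):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    if attempt == max_retries - 1:
                        raise  # out of retries: re-raise the last error
                    delay = base_delay * (2 ** attempt)  # exponential backoff
                    print(f"{func.__name__} failed ({exc}); retrying in {delay}s")
                    time.sleep(delay)
        return wrapper
    return decorator

@retry(max_retries=3, base_delay=0.1)
def flaky_api_call():
    # Stand-in for an LLM call that sometimes raises ConnectionError
    return "ok"
```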
Notice this is a three-layer nesting: retry returns decorator, which returns wrapper. That is because retry(max_retries=3) needs to be called with configuration before it can be used as a decorator. The outer function captures the config, the middle function captures the original function, and the inner function does the actual work.
Caching is essential when working with embeddings and LLM responses. If you embed the same text twice, you are wasting money and time. Python's functools module has a built-in solution:
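A sketch using functools.lru_cache; the embedding function is a stand-in that returns a fake vector:

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def embed(text):
    print(f"computing embedding for {text!r}")   # only prints on a cache miss
    return tuple(float(ord(c)) for c in text)    # stand-in for a real API call

embed("hello")  # computed
embed("hello")  # served from the cache, no "API call"
print(embed.cache_info())  # hits, misses, and current cache size
```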
One caveat: lru_cache only works with hashable arguments. Lists and dictionaries are not hashable, so you cannot cache a function that takes a list. For more complex caching needs, you can build your own:
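One simple approach is to serialize the arguments into a hashable key; this sketch uses JSON, which works for lists and dicts but assumes the arguments are JSON-serializable:

```python
import functools
import json

def cached(func):
    cache = {}
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Serialize arguments into a hashable string key
        key = json.dumps([args, kwargs], sort_keys=True, default=str)
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]
    return wrapper

@cached
def embed_batch(texts):
    print(f"embedding {len(texts)} texts")  # only prints on a cache miss
    return [[0.0] * 4 for _ in texts]

embed_batch(["a", "b"])  # computed
embed_batch(["a", "b"])  # cache hit, even though a list is not hashable
```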
You can apply multiple decorators to the same function. They are applied from bottom to top (the decorator closest to the function wraps it first):
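A sketch stacking a retry decorator on top of a timing decorator (simplified versions of both are defined inline so the example is self-contained):

```python
import functools
import time

def timing(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            print(f"{func.__name__}: {time.perf_counter() - start:.3f}s")
    return wrapper

def retry(max_retries=3):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_retries - 1:
                        raise
        return wrapper
    return decorator

@retry(max_retries=3)   # applied second: wraps the already-timed function
@timing                 # applied first: closest to the function
def fetch_embedding(text):
    return [0.0] * 8

fetch_embedding("hello")
```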
This means the timing decorator measures each individual attempt, and the retry decorator handles failures of the timed function. If you swapped the order, the timing would measure the total time including all retries. The order matters, so think about what behavior you want.
Imagine you need to process 10 million text documents to generate embeddings. If you load them all into a list, you need all 10 million in memory at once. With a generator, you process one at a time, keeping memory usage constant regardless of dataset size.
A generator is a function that uses yield instead of return. When you call a generator function, it does not execute the body immediately. Instead, it returns a generator object. Each time you ask for the next value (via next() or a for loop), the function runs until it hits yield, hands you the value, and pauses. On the next call, it resumes right where it left off.
The key difference from return: a return statement terminates the function and discards its state. A yield statement pauses the function and preserves its entire state (local variables, instruction pointer, everything). The function can be resumed later.
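A minimal example showing the pause-and-resume behavior:

```python
def count_up(n):
    print("starting")
    for i in range(1, n + 1):
        yield i
    print("done")

gen = count_up(3)   # nothing runs yet; we just get a generator object
print(next(gen))    # runs until the first yield: prints "starting", then 1
print(next(gen))    # resumes where it paused and yields 2
print(list(gen))    # drains the remaining values: [3] (and prints "done")
```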
Just as list comprehensions create lists, generator expressions create generators. The syntax is identical except you use parentheses instead of brackets:
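A sketch comparing the two forms and their memory footprints:

```python
import sys

numbers_list = [x * x for x in range(1_000_000)]  # all values held in memory
numbers_gen = (x * x for x in range(1_000_000))   # values computed on demand

print(sys.getsizeof(numbers_list))  # several megabytes
print(sys.getsizeof(numbers_gen))   # a couple hundred bytes

# Generators work anywhere an iterable is expected
print(sum(numbers_gen))
```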
The generator version barely uses any memory because it only computes one value at a time. For large datasets, this is the difference between your script running to completion and crashing with a MemoryError.
When you stream responses from an LLM, you get tokens one at a time. A generator is the perfect abstraction for this:
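A sketch that simulates a token stream; splitting on whitespace stands in for real tokenization:

```python
import time

def stream_tokens(text, delay=0.02):
    # Stand-in for a real streaming API: yield one "token" at a time
    for token in text.split():
        time.sleep(delay)  # simulate network latency between tokens
        yield token + " "

for token in stream_tokens("The quick brown fox jumps"):
    print(token, end="", flush=True)  # tokens appear as they arrive
print()
```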
In a real application with the OpenAI SDK, streaming looks very similar:
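A sketch assuming the openai package (v1 or later) is installed and an OPENAI_API_KEY environment variable is set; the model name is an assumption:

```python
def stream_chat(prompt, model="gpt-4o-mini"):
    # Sketch of OpenAI SDK streaming; the import is inside the function so the
    # definition itself does not require the package to be installed.
    from openai import OpenAI
    client = OpenAI()
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta is not None:  # some chunks carry no text
            yield delta

# Usage (requires network access and an API key):
# for token in stream_chat("Explain generators in one sentence"):
#     print(token, end="", flush=True)
```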
The generator pattern means the calling code does not need to know anything about the streaming protocol. It just iterates.
Processing items in batches is one of the most common patterns in AI. You rarely send one embedding request at a time. You batch them to reduce API overhead and improve throughput:
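A sketch of a batching generator; note the final partial batch is not dropped:

```python
def batch_items(items, batch_size):
    # Works with any iterable, including other generators
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # don't lose a final partial batch
        yield batch

docs = (f"doc-{i}" for i in range(7))  # a generator, not a list
for batch in batch_items(docs, 3):
    print(batch)  # ['doc-0', 'doc-1', 'doc-2'], ..., ['doc-6']
```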
Because batch_items is a generator, it works with any iterable, including other generators. You can chain generators together to build processing pipelines that handle arbitrarily large datasets without loading everything into memory.
Python's itertools module provides a set of fast, memory-efficient tools for working with iterators. You do not need to know all of them, but three show up constantly in AI data processing.
chain concatenates multiple iterables into one continuous stream without copying anything:
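A sketch using hypothetical train/validation/test splits:

```python
from itertools import chain

train = ["t1", "t2"]
val = ["v1"]
test = ["x1", "x2"]

# One continuous stream over all three splits; no combined list is built
for doc in chain(train, val, test):
    print(doc)
```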
This is cleaner than concatenating lists (train + val + test) and more memory-efficient because it does not create a new combined list.
islice lets you take the first N items from any iterator, which is useful for testing and debugging:
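A sketch with an endless document stream, where list slicing would be impossible:

```python
from itertools import islice

def all_documents():
    # An endless stream; converting it to a list would never finish
    i = 0
    while True:
        yield f"doc-{i}"
        i += 1

# Take just the first 5 items for a quick smoke test
sample = list(islice(all_documents(), 5))
print(sample)  # ['doc-0', 'doc-1', 'doc-2', 'doc-3', 'doc-4']
```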
You cannot use normal list slicing on a generator (generators are not subscriptable). islice solves this.
Starting in Python 3.12, itertools includes a batched function that does exactly what our earlier batch_items generator does:
If you are on an older Python version, use the manual batch_items generator from the previous section, or install the more-itertools package.
A lambda is an anonymous, single-expression function. You will see them used most often as quick throwaway functions for sorted, filter, map, and similar higher-order functions:
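A sketch using made-up model scores:

```python
scores = [("model-a", 0.91), ("model-b", 0.84), ("model-c", 0.87)]

# Sort by the numeric score, descending
ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)

# Keep only models above a cutoff
good = list(filter(lambda pair: pair[1] >= 0.85, scores))

# Extract just the names
names = list(map(lambda pair: pair[0], scores))

print(ranked[0])  # ('model-a', 0.91)
print(names)      # ['model-a', 'model-b', 'model-c']
```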
Lambdas should be short. If your lambda is getting complex, define a named function instead. The goal is readability, not cleverness.
functools.partial creates a new function with some arguments pre-filled. This is useful when you have a general function but want to create specialized versions of it:
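A minimal sketch:

```python
from functools import partial

def power(base, exponent):
    return base ** exponent

# Pre-fill the exponent to create specialized versions
square = partial(power, exponent=2)
cube = partial(power, exponent=3)

print(square(5))  # 25
print(cube(2))    # 8
```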
This pattern is common when setting up processing pipelines. Instead of passing configuration through every function call, you create pre-configured versions:
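A sketch with a hypothetical text chunker; the chunk_size and overlap parameters are illustrative:

```python
from functools import partial

def chunk_text(text, chunk_size=500, overlap=0):
    # Illustrative chunker: fixed-size windows with optional overlap
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Configure once, then pass the specialized function through the pipeline
chunk_for_embedding = partial(chunk_text, chunk_size=8, overlap=2)

for chunk in chunk_for_embedding("abcdefghijklmnop"):
    print(chunk)
```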
partial is often cleaner than lambdas for this kind of argument pre-filling because the resulting object keeps an explicit reference to the original function and its pre-filled arguments (via its func, args, and keywords attributes), which makes it easier to introspect and debug. It also works well with functions that have many keyword arguments, which is typical for AI library functions.