Writing prompts is an iterative engineering process. You start with a rough prompt, test it on representative inputs, inspect failures, refine the instructions or context, and repeat. That manual loop works for prototypes. It becomes slow and hard to reproduce when a production system depends on many prompts across many input types.
This is where prompt optimization comes in.
Prompt optimization treats prompts as tunable components of a system rather than fixed strings. Instead of relying on intuition alone, you evaluate prompts against datasets, measure performance, and improve them systematically. The goal is to improve behavior you can measure, not to produce a prompt that merely reads well.
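To make the idea concrete, here is a minimal sketch of an evaluation loop in plain Python. It is illustrative only: the dataset, the candidate templates, and the names `stub_model`, `exact_match`, `evaluate`, and `best_prompt` are all hypothetical, and `stub_model` is a deterministic stand-in for a real LLM call so the example runs offline.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical labeled dataset: (input text, expected label) pairs.
DATASET: List[Tuple[str, str]] = [
    ("I love this product", "positive"),
    ("Terrible experience, would not recommend", "negative"),
    ("Absolutely fantastic service", "positive"),
    ("The worst purchase I have made", "negative"),
]

# Hypothetical candidate prompts; each template has a {text} placeholder.
CANDIDATES: List[str] = [
    "Classify the sentiment: {text}",
    "Classify the sentiment as positive or negative. "
    "Answer with one word.\nText: {text}",
]


def stub_model(prompt: str) -> str:
    """Stand-in for an LLM call (an assumption, not a real API).

    It returns a bare label only when the prompt asks for a one-word
    answer; otherwise it wraps the label in a sentence.
    """
    lowered = prompt.lower()
    is_negative = "terrible" in lowered or "worst" in lowered
    label = "negative" if is_negative else "positive"
    if "one word" in lowered:
        return label
    return f"The sentiment is {label}."


def exact_match(prediction: str, expected: str) -> float:
    """Metric: 1.0 if the normalized strings are equal, else 0.0."""
    return 1.0 if prediction.strip().lower() == expected.strip().lower() else 0.0


def evaluate(
    template: str,
    model: Callable[[str], str],
    dataset: List[Tuple[str, str]],
    metric: Callable[[str, str], float] = exact_match,
) -> float:
    """Return the mean metric score of one prompt template over the dataset."""
    scores = [
        metric(model(template.format(text=text)), expected)
        for text, expected in dataset
    ]
    return sum(scores) / len(scores)


def best_prompt(
    candidates: List[str],
    model: Callable[[str], str],
    dataset: List[Tuple[str, str]],
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate and return the top one plus all scores."""
    scored = {template: evaluate(template, model, dataset) for template in candidates}
    return max(scored, key=scored.get), scored


best, scores = best_prompt(CANDIDATES, stub_model, DATASET)
print("best template:", repr(best))
```

The second template wins here only because the stub was written to reward a one-word instruction. With a real model, the same loop would surface whichever wording actually scores higher on your data, which is the point of measuring instead of guessing.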
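One prominent tool in this space is DSPy, an open-source Python framework for building LLM pipelines whose prompts can be optimized against a metric.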
This chapter explains evaluation-driven prompt optimization and shows how DSPy turns LLM workflows into optimizable programs.