AWS Lambda is managed, event-driven compute. It is useful when a system needs to run short pieces of code in response to API calls, queue messages, file uploads, schedules, or stream records without operating a server fleet.
A strong design answer does not stop at "use serverless." It explains when Lambda fits, how it scales, what happens during cold starts, and where containers, VMs, Step Functions, or batch systems are a better choice.
This chapter focuses on the parts that matter in system design interviews: execution model, concurrency, event sources, retries, latency, pricing trade-offs, and comparisons with ECS, Fargate, EC2, and Kubernetes.
The diagram traces how an event travels from a trigger through the Lambda service to an execution environment that runs your code, and how the service adds more environments to handle concurrent requests.
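The sketch below illustrates that flow from the code's point of view. It assumes the Python runtime. The names `handler`, `_invocation_count`, and `_ENVIRONMENT_STARTED_AT` are illustrative and do not come from this chapter. Module-level code runs once when the service creates an execution environment (the cold start). The handler then runs once per event delivered to that environment.

```python
import time

# Module-level code runs once per execution environment, during the cold start.
# Expensive setup (SDK clients, config loading) is usually placed here so that
# later invocations in the same environment can reuse it.
_ENVIRONMENT_STARTED_AT = time.time()
_invocation_count = 0  # illustrative counter, local to this environment


def handler(event, context):
    """Entry point called once per event routed to this environment."""
    global _invocation_count
    _invocation_count += 1
    return {
        # True only for the first event this environment handles.
        "cold_start": _invocation_count == 1,
        "invocations_in_this_environment": _invocation_count,
        "environment_age_seconds": round(time.time() - _ENVIRONMENT_STARTED_AT, 3),
        "event_keys": sorted(event.keys()),
    }
```

Calling `handler({"path": "/orders"}, None)` twice in one local process mimics two events reaching the same warm environment: only the first call reports a cold start. Each environment keeps its own copy of the module-level state, so when the service adds environments for concurrent requests, this counter is not shared between them.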