Consider an image processing system. Users upload photos, and the service needs to generate thumbnails, extract metadata, run content moderation, and update search indexes. Doing all of that inside the upload request makes the user wait for every step and ties the API's latency and availability to four different downstream systems.
The common fix is to move the slow work behind a queue, so the upload request returns quickly and the processing happens in the background. The design question is which kind of queue to use: one that supports event replay, like Kafka; one with broker-side routing, like RabbitMQ; or a managed buffer between producers and workers.
For the third case, Amazon SQS is often the right answer. It is not an event streaming platform or a routing broker. It is a managed queue with high availability, simple APIs, dead-letter queues (DLQs), delay support, and strong AWS integration. The trade-off is less broker flexibility in exchange for much less operational work.
There are no brokers to run and no replicas to place. You create a queue, send messages, run consumers, and let AWS handle availability and capacity for the queue itself.
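As a minimal sketch of that workflow in Python, the functions below take a boto3-style SQS client as a parameter (for example, one created with `boto3.client("sqs")`). The function names `enqueue_image_job` and `drain_once`, the message fields, and the `handler` callback are illustrative assumptions, not part of SQS itself.

```python
import json


def enqueue_image_job(sqs, queue_url, image_id, s3_key):
    """Send one processing job; the upload request can return right after this."""
    body = json.dumps({"image_id": image_id, "s3_key": s3_key})
    resp = sqs.send_message(QueueUrl=queue_url, MessageBody=body)
    return resp["MessageId"]


def drain_once(sqs, queue_url, handler):
    """Receive one batch, process each message, and delete only on success."""
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
        WaitTimeSeconds=20,      # long polling; 20 seconds is the maximum
    )
    processed = 0
    for msg in resp.get("Messages", []):
        try:
            handler(json.loads(msg["Body"]))
        except Exception:
            # Do not delete: the message becomes visible again after the
            # visibility timeout and will be redelivered.
            continue
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
        processed += 1
    return processed
```

A worker would typically call `drain_once` in a loop. Deleting only after the handler succeeds is what turns a failure into a retry rather than a lost message.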
In system design interviews, SQS is strongest when you can say: "We need an async work queue, not event replay or complex routing. Consumers are idempotent, failed messages go to a DLQ, and visibility timeout controls retry behavior."
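The three pieces of that answer can be sketched in a few lines of Python. `VisibilityTimeout` and `RedrivePolicy` (with `deadLetterTargetArn` and `maxReceiveCount`) are real SQS queue attributes; the helper names, the default values, and the in-memory `seen_ids` set are assumptions for illustration. A real consumer would keep processed IDs in a durable store rather than in process memory.

```python
import json


def work_queue_attributes(dlq_arn, max_receive_count=5, visibility_timeout_s=120):
    """Build an Attributes dict suitable for create_queue or set_queue_attributes."""
    return {
        # Seconds a received message stays hidden before it can be redelivered.
        "VisibilityTimeout": str(visibility_timeout_s),
        # After max_receive_count receives without a delete, SQS moves the
        # message to the dead-letter queue identified by dlq_arn.
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
        ),
    }


def make_idempotent(handler, seen_ids):
    """Wrap a handler so a redelivered job with an already-seen ID is skipped."""
    def wrapped(job):
        if job["image_id"] in seen_ids:
            return False  # duplicate delivery, nothing to do
        handler(job)
        seen_ids.add(job["image_id"])  # record only after the work succeeds
        return True
    return wrapped
```

The visibility timeout should comfortably exceed the time a worker needs for one job. If it is too short, a message can be redelivered while the first attempt is still running, which is one reason the consumer has to be idempotent.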
This article covers the topics that come up most in interviews: Standard vs FIFO, visibility timeout, idempotency, DLQs, scaling consumers, AWS integrations, and when to choose a different system.