recover

Last Updated: May 22, 2026

recover is the only way to stop a panic from unwinding the goroutine all the way to a crash. It works by catching the panic value inside a deferred function, so the goroutine can keep running or return an error to its caller. This lesson covers what recover does, the strict rules around where it must be called, the common patterns (library boundary, HTTP middleware, re-panicking), and the anti-patterns that make it a trap.

What recover Does

recover is a built-in function with the signature func recover() any. When called inside a deferred function during a panic, it stops the unwind and returns the value that was passed to panic. Once that deferred function finishes, the function that deferred it returns to its caller as if it had returned normally, and the goroutine continues from there. When called outside of a panic, or outside of a deferred function, recover returns nil and does nothing.

The simplest demonstration uses an order processor that panics on bad input and a deferred recovery in the same function:
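A minimal sketch of that demonstration. The order IDs and the exact log wording are illustrative assumptions; the shape (panic on empty input, deferred recover in the same function) follows the description.

```go
package main

import "fmt"

// processOrder panics on an empty ID and recovers in the same function.
func processOrder(id string) {
	defer func() {
		// recover returns nil unless this goroutine is currently panicking.
		if r := recover(); r != nil {
			fmt.Println("recovered:", r)
		}
	}()
	if id == "" {
		panic("order ID is empty")
	}
	fmt.Println("processed order", id)
}

func main() {
	processOrder("ORD-1001")
	processOrder("") // panics, but the deferred function contains it
	fmt.Println("main finished")
}
```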

Three things happen on the second call. processOrder("") panics on the empty string. The deferred function fires as part of the unwind and calls recover, which returns "order ID is empty" and stops the unwind. Control then returns to main, which prints its final line as if nothing went wrong. The panic was contained inside processOrder.

The return type of recover is any, because panic accepts any value. If the panic was triggered with panic("bad input"), recover returns a string. If it was panic(errors.New("...")), recover returns an error. The runtime itself panics with values of type runtime.Error (a nil dereference, an out-of-bounds index, a division by zero). You almost always treat the recovered value as opaque and convert it to a string or to an error before doing anything with it.
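To make the different dynamic types concrete, here is a small sketch. The describe helper is hypothetical; it reports what kind of value came back from recover. The runtime.Error case is listed before error because every runtime.Error also satisfies error.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// describe runs fn and reports the dynamic type of whatever it panicked with.
func describe(fn func()) (kind string) {
	defer func() {
		switch v := recover().(type) {
		case nil:
			kind = "no panic"
		case runtime.Error: // must precede error: runtime.Error embeds error
			kind = "runtime.Error: " + v.Error()
		case error:
			kind = "error: " + v.Error()
		case string:
			kind = "string: " + v
		default:
			kind = fmt.Sprintf("%T", v)
		}
	}()
	fn()
	return
}

func main() {
	var items []int
	fmt.Println(describe(func() { panic("bad input") }))
	fmt.Println(describe(func() { panic(errors.New("bad cart")) }))
	fmt.Println(describe(func() { _ = items[3] })) // out-of-bounds index
	fmt.Println(describe(func() {}))
}
```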

Every panic value comes back through the same any-typed return. The deferred function decides what to do with it: log it, wrap it in an error, or re-panic with more context. The next sections cover each of those choices.

The Strict Rule: Must Be Called Inside a Deferred Function

recover only works when it's called directly from a function that was scheduled with defer. Called anywhere else, it returns nil and the panic keeps unwinding.
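A hedged sketch of the failing case. wontWork here is a plain function that calls recover as an ordinary statement; risky and its outer deferred recover are illustrative additions that exist only so the demo can report the escaped panic instead of crashing.

```go
package main

import "fmt"

// wontWork calls recover as an ordinary statement. The call is not made
// from a deferred function, so recover returns nil and changes nothing.
func wontWork() any {
	return recover()
}

// risky calls wontWork and then panics. Nothing useful was installed,
// so the panic leaves risky's body. The deferred function below is only
// here to let the demo observe the escaped value.
func risky() (escaped any) {
	defer func() { escaped = recover() }()
	wontWork()
	panic("order ID is empty")
}

func main() {
	fmt.Println("wontWork returned:", wontWork())
	fmt.Println("panic that wontWork did not stop:", risky())
}
```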

wontWork calls recover as an ordinary statement in its own body, not from inside a deferred function, so the call returns nil and the panic keeps unwinding. The Go spec says recover returns nil unless it is called directly by a deferred function while the goroutine is panicking. Calling recover from a named function that is itself deferred works, but calling it from a function that the deferred function calls does not.

That last sentence is subtle. Here's the same idea spelled out:
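A sketch of the helper case, under the assumption that tryRecover is a small function one call level below the deferred function. The outcome helper is hypothetical and only lets the demo report the panic that got through.

```go
package main

import "fmt"

// tryRecover sits one call level below the deferred function.
func tryRecover() any {
	return recover() // not called directly by a deferred function: nil
}

// viaHelper defers a function that delegates to tryRecover.
func viaHelper() {
	defer func() {
		fmt.Println("helper saw:", tryRecover()) // prints <nil>
	}()
	panic("order ID is empty")
}

// outcome runs fn and returns the panic value that escaped it, if any.
func outcome(fn func()) (escaped any) {
	defer func() { escaped = recover() }()
	fn()
	return nil
}

func main() {
	fmt.Println("escaped:", outcome(viaHelper))
}
```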

The deferred function calls tryRecover, which calls recover. Even though tryRecover is reached during the unwind, the recover inside it is not "directly inside a deferred function", so it returns nil and the panic continues. The fix is to call recover from the deferred function itself, not from any helper.

The canonical pattern that does work looks like this:
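A minimal sketch of the working shape, with a second variant added for comparison: a named function that is itself deferred also counts as a direct caller. The function names are illustrative.

```go
package main

import "fmt"

// withLiteral uses the common form: recover inside a deferred func literal.
func withLiteral() {
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered (literal):", r)
		}
	}()
	panic("order ID is empty")
}

// logPanic is deferred directly, so its recover call is a direct call.
func logPanic() {
	if r := recover(); r != nil {
		fmt.Println("recovered (named):", r)
	}
}

// withNamed defers the named function itself, not a wrapper around it.
func withNamed() {
	defer logPanic()
	panic("order ID is empty")
}

func main() {
	withLiteral()
	withNamed()
	fmt.Println("main finished")
}
```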

The recover() call lives inside the anonymous function literal that was passed to defer. That function is the direct deferred caller of recover, so the catch works. This is the shape you'll see almost every time recover shows up in Go code:

Every other shape is either wrong or a variation that adds something to this base pattern.

The diagram shows what the runtime does during a panic. It walks the deferred-function stack from the most recent registration down to the oldest. Each deferred function runs, and the runtime checks whether that function called recover directly. The first one that did stops the unwind and the function it was deferred from returns normally. If no deferred function calls recover, the unwind reaches the top of the goroutine and the program dies.

recover Only Sees Panics in the Same Goroutine

A panic in goroutine A cannot be recovered by a deferred function in goroutine B. Each goroutine has its own stack and its own deferred-call list. When a goroutine panics, the runtime walks only its own deferred functions. If none of them recovers, the goroutine's panic crashes the entire program, regardless of what other goroutines have set up.

This is one of the most surprising rules for newcomers, because it breaks the intuition that a top-level recover in main will catch "anything bad that happens". It only catches things on the goroutine that's running main.

The deferred recover in main never runs. The panicking goroutine has no deferred functions of its own, so the runtime walks its empty deferred list, finds nothing, and crashes the program. The main goroutine doesn't even get a chance to print "main finished" because the entire process is gone.

The fix is to put a recover inside the goroutine itself, right at the top:
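A sketch of the fix, assuming processOrderAsync is the goroutine body. The done channel is an illustrative way for main to wait; any synchronization would do.

```go
package main

import "fmt"

// processOrderAsync is the goroutine body. Its own deferred recover is what
// protects the process; a recover in main would never see this panic.
func processOrderAsync(id string, done chan<- string) {
	defer func() {
		if r := recover(); r != nil {
			done <- fmt.Sprintf("recovered: %v", r)
		}
	}()
	if id == "" {
		panic("order ID is empty")
	}
	done <- "processed " + id
}

func main() {
	done := make(chan string)
	go processOrderAsync("", done)
	fmt.Println(<-done)
	fmt.Println("main finished")
}
```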

Now the deferred function inside processOrderAsync is on the worker goroutine's own stack. When the panic fires, the runtime walks that stack, finds the recover, and stops the unwind. The program keeps running and main prints its final line.

The diagram shows what's wrong with the first version. Each goroutine has its own deferred list. main has a recovery installed, but the worker doesn't. When the worker panics, the runtime walks the worker's empty deferred list and crashes the process. main's recovery is on a different list entirely and never gets a chance to run.

The practical takeaway is simple: any time you write go f(...), the function f (or something it defers) is responsible for recovering its own panics. Otherwise a single bad input to a single goroutine can take down a server that has been up for weeks. In real Go code, the spawning function usually wraps the goroutine body in a recovery helper:
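One possible shape for such a helper. safeGo's signature, the WaitGroup parameter, and processAll are assumptions made for this sketch, not a standard library API.

```go
package main

import (
	"fmt"
	"sync"
)

// safeGo runs fn on a new goroutine with a deferred recover installed.
func safeGo(wg *sync.WaitGroup, fn func()) {
	wg.Add(1)
	go func() {
		defer wg.Done() // registered first, so it runs after the recover
		defer func() {
			if r := recover(); r != nil {
				fmt.Println("worker recovered:", r)
			}
		}()
		fn()
	}()
}

func processOrder(id string) {
	if id == "" {
		panic("order ID is empty")
	}
	fmt.Println("processed order", id)
}

// processAll starts one protected goroutine per order and waits for all.
func processAll(ids []string) {
	var wg sync.WaitGroup
	for _, id := range ids {
		id := id // per-iteration copy, needed before Go 1.22
		safeGo(&wg, func() { processOrder(id) })
	}
	wg.Wait()
}

func main() {
	processAll([]string{"ORD-1", "", "ORD-3"})
	fmt.Println("main finished")
}
```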

safeGo wraps every goroutine body with a deferred recover. The bad order panics inside its worker, the worker recovers, and the other two orders process normally. Without that wrapper, one bad order would crash the whole program and the other two would never finish.

Converting Panics to Errors at a Library Boundary

A common use of recover is at the public boundary of a package. Internal code may use panic for control flow (especially in deeply recursive parsers, validators, or interpreters where checking every return value would clutter the logic). The public function recovers any panic and returns it as a regular error. Callers of the package never see a panic; they get an error like they would from any other function.

The shape uses a named return value so the deferred function can write to err:
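A sketch of the boundary pattern. The Item type and the mustValidItem helper are hypothetical stand-ins for a package's internals; ValidateCart and the error format come from the discussion.

```go
package main

import "fmt"

// Item is an illustrative cart entry.
type Item struct {
	SKU string
	Qty int
}

// mustValidItem is an internal helper that panics instead of returning errors.
func mustValidItem(it Item) {
	if it.SKU == "" {
		panic("item has no SKU")
	}
	if it.Qty <= 0 {
		panic(fmt.Sprintf("item %s has quantity %d", it.SKU, it.Qty))
	}
}

// ValidateCart is the public boundary. The named result err lets the
// deferred function change what the caller receives.
func ValidateCart(items []Item) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("invalid cart: %v", r)
		}
	}()
	for _, it := range items {
		mustValidItem(it)
	}
	return nil
}

func main() {
	fmt.Println(ValidateCart([]Item{{"BOOK-01", 2}}))
	fmt.Println(ValidateCart([]Item{{"BOOK-01", 2}, {"PEN-07", 0}}))
}
```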

The named return value err is the key piece. Without naming the return, the deferred function would have no way to influence what ValidateCart returns. With the name, the deferred function can assign to err, and that assignment is the value the function returns once the deferred functions finish.

fmt.Errorf("invalid cart: %v", r) formats the panic value into an error message. If you want callers to be able to use errors.Is and errors.As against a specific error type, wrap the recovered value with %w against a sentinel:
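A sketch of the sentinel variant. ErrInvalidCart is wrapped with %w and the recovered value is appended with %v; the quantity-based validation and the isInvalidCart helper are simplifications assumed for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrInvalidCart is the sentinel callers match with errors.Is.
var ErrInvalidCart = errors.New("invalid cart")

// ValidateCart converts any internal panic into an error wrapping the sentinel.
func ValidateCart(quantities []int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%w: %v", ErrInvalidCart, r)
		}
	}()
	for i, q := range quantities {
		if q <= 0 {
			panic(fmt.Sprintf("item %d has quantity %d", i, q))
		}
	}
	return nil
}

// isInvalidCart shows the caller-side check.
func isInvalidCart(err error) bool {
	return errors.Is(err, ErrInvalidCart)
}

func main() {
	err := ValidateCart([]int{2, 0})
	fmt.Println(err)
	fmt.Println("matches sentinel:", isInvalidCart(err))
}
```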

Now the caller can write errors.Is(err, ErrInvalidCart) to recognize the failure category, just as they would for any error produced by the package. The internal use of panic is a private implementation detail; the public surface is a regular error return.

The diagram traces the flow. The caller invokes the public function, which loops over items and calls an internal helper. The helper panics. The unwind reaches the deferred function in the public boundary function. The deferred function recovers, assigns to the named return value, and lets the function return like any other. The caller never sees the panic at all.

Two details matter here. First, only the public boundary functions of your package should use this kind of recover. Internal helpers can panic freely, knowing the boundary will catch them. Don't sprinkle recovery into every internal function; the whole point is to keep the cleanup in one place. Second, recover here is reserved for panics that the package considers its own. If a panic comes from somewhere unexpected (a real bug in your code, a nil deref, an out-of-bounds index), recovering it without surfacing it is wrong. We'll come back to that distinction in the anti-patterns section.

HTTP Middleware: Recovering at the Request Boundary

A web server is the canonical place for recover. A single bad request handler with a nil pointer dereference would otherwise crash the entire server, dropping every other in-flight request. The standard library's net/http server installs a recover around every request automatically, but real applications usually add their own so they can log the panic with request context, return a clean 500 response, and (if needed) page on-call.

The middleware pattern wraps any handler with a deferred recover:
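A minimal sketch of such a middleware. recoverMiddleware, productHandler, and statusFor are illustrative names; instead of listening on a port, the example drives the wrapped handler in-process with net/http/httptest so it can run anywhere.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// recoverMiddleware wraps next so a panic becomes a logged 500 response.
func recoverMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		defer func() {
			if rec := recover(); rec != nil {
				log.Printf("panic serving %s %s: %v", r.Method, r.URL.Path, rec)
				// Generic body only: the panic value is never sent to the client.
				http.Error(w, "internal server error", http.StatusInternalServerError)
			}
		}()
		next.ServeHTTP(w, r)
	})
}

// productHandler panics when the code query parameter is missing.
func productHandler(w http.ResponseWriter, r *http.Request) {
	code := r.URL.Query().Get("code")
	if code == "" {
		panic("product code is required")
	}
	fmt.Fprintf(w, "product %s\n", code)
}

// statusFor sends one request through the wrapped handler and returns the status.
func statusFor(target string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, target, nil)
	recoverMiddleware(http.HandlerFunc(productHandler)).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("/product?code=BOOK-01")) // 200
	fmt.Println(statusFor("/product"))              // 500
}
```

In a real server the same wrapper would be passed to http.ListenAndServe or registered on a mux.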

This server has two requests to think about. A request to /product?code=BOOK-01 returns the product. A request to /product (with no code) panics inside the handler. Without the middleware, the panicking goroutine would still be recovered by net/http's own default recover, which is built into the server. With the middleware, the panic is caught one frame earlier, in code you wrote, so you get to control the log message and the response body. Other in-flight requests are unaffected because they are served on other goroutines, each with its own stack of deferred calls.

The reason net/http installs a default recover is exactly the rule from the previous section: a panic in a goroutine that has no recover crashes the program. The standard library can't afford for one bad handler to take down the whole server, so it sets up the safety net. Your middleware adds your own behavior on top: structured logging, metrics, a custom error response shape, a hook to your error reporting service.

The deferred recover is scoped to one request. It doesn't outlive the response: once the wrapped handler call returns, the deferred function has run and is gone, and the next request passes through the middleware again and gets a fresh deferred recover. The middleware doesn't need to know what the handler does. The recovered value is logged but not exposed to the client. Returning the panic message in the response body is a security mistake; it can leak stack traces, internal types, and details about the runtime environment.

For production code, the deferred function usually also captures the stack trace and forwards it to whatever observability system you use:
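A framework-agnostic sketch of the stack capture. guard and its report callback are hypothetical stand-ins for a middleware and an observability client; only recover and debug.Stack are real APIs here.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/debug"
)

// guard runs fn and, on panic, hands the value and the stack to report.
// report stands in for a logger or error-reporting client.
func guard(fn func(), report func(value any, stack []byte)) {
	defer func() {
		if r := recover(); r != nil {
			// Called while the panicking frames are still on the stack.
			report(r, debug.Stack())
		}
	}()
	fn()
}

// lookupProduct simulates a handler bug.
func lookupProduct() {
	panic("product code is required")
}

// capturedStack returns what guard reported for a panic in lookupProduct.
func capturedStack() (value any, stack []byte) {
	guard(lookupProduct, func(v any, s []byte) { value, stack = v, s })
	return
}

// mentions reports whether the trace names the given function.
func mentions(stack []byte, name string) bool {
	return bytes.Contains(stack, []byte(name))
}

func main() {
	v, s := capturedStack()
	fmt.Println("panic value:", v)
	fmt.Println("stack names lookupProduct:", mentions(s, "lookupProduct"))
}
```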

debug.Stack returns a byte slice with the formatted stack trace of the current goroutine. Capturing it inside the deferred function preserves the stack at the moment of the panic, which is what you want for debugging. Capturing it later (after the unwind finishes) would show the wrong stack, because by then the panicking frames have already been popped.

The diagram shows the request lifecycle through the middleware. The request enters the middleware, which wraps the handler call in a deferred recover. The handler panics. The unwind hits the middleware's deferred function, which recovers, logs the panic with a full stack trace, writes a 500 response, and lets the goroutine end normally. The server's other goroutines, handling other requests on other connections, never even notice. This is the kind of isolation that makes a Go web server survive operationally even when individual handlers have bugs.

Re-Panicking When You Can't Really Handle It

Sometimes you recover a panic, look at it, decide it's not yours to handle, and want to let it continue unwinding. The way to do that is to call panic again with the recovered value. The runtime treats the second panic as a brand-new panic, but since you pass the same value, the effect is "I caught this, looked at it, and decided to keep panicking".
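A sketch of recover, inspect, re-panic. The order type and the returned error text are assumptions; the two inputs ("" and "CRASH") and the panic message follow the walkthrough.

```go
package main

import (
	"errors"
	"fmt"
)

type order struct{ total int }

// processOrder converts its own known panic into an error and re-panics
// anything it does not recognize.
func processOrder(id string) (err error) {
	defer func() {
		r := recover()
		if r == nil {
			return
		}
		if msg, ok := r.(string); ok && msg == "expected business error" {
			err = errors.New("order rejected: empty ID")
			return
		}
		panic(r) // not ours: keep unwinding with the same value
	}()
	if id == "" {
		panic("expected business error")
	}
	if id == "CRASH" {
		var o *order
		fmt.Println(o.total) // nil pointer dereference: a runtime panic
	}
	return nil
}

func main() {
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("main recovered:", r)
		}
	}()
	fmt.Println("first call:", processOrder(""))
	fmt.Println("second call:", processOrder("CRASH"))
}
```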

processOrder handles two cases. The first call passes an empty string, which triggers a deliberate panic("expected business error"). The deferred recover sees that exact string, decides it's a known business error, and converts it to a regular returned error. The second call passes "CRASH", which dereferences a nil pointer. The deferred recover sees the runtime panic value, recognizes that this isn't a known business panic, and re-panics. The unwind continues up to main's own recover, which catches it and prints the message.

The pattern of "recover, inspect, re-panic if you don't recognize it" is a guard against over-broad recovery. A common mistake is to recover everything that comes through, which buries real bugs (nil derefs, out-of-bounds indices) under the same handling as intentional control-flow panics. By inspecting the recovered value and re-panicking unrecognized ones, you keep the safety net for known patterns without hiding actual crashes.

Re-panicking changes the stack trace. The runtime records the location of the new panic call as the panic site. If you want the original stack preserved for logging, capture it before re-panicking:
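A short sketch of capturing first. lastStack is a hypothetical sink used so the example can check the trace; real code would send the bytes to a logger.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/debug"
)

// lastStack holds the most recent captured trace (illustrative only).
var lastStack []byte

// loadOrder contains the original panic site.
func loadOrder() {
	var prices map[string]int
	prices["BOOK-01"] = 10 // assignment to a nil map panics at runtime
}

func processOrder() {
	defer func() {
		if r := recover(); r != nil {
			lastStack = debug.Stack() // taken before the re-panic
			panic(r)                  // continue unwinding with the same value
		}
	}()
	loadOrder()
}

// capturedOriginalSite reports whether the saved trace names loadOrder.
func capturedOriginalSite() bool {
	return bytes.Contains(lastStack, []byte("loadOrder"))
}

func main() {
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("main recovered:", r)
			fmt.Println("original site captured:", capturedOriginalSite())
		}
	}()
	processOrder()
}
```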

debug.Stack is captured inside the deferred function, before the re-panic, so it has the original frames. After the re-panic, main's recover sees the same panic value but a fresh stack trace from the re-panic point. Capturing first means you keep the diagnostic value of the original site while still letting the panic propagate.

Anti-Patterns: When Not to Use recover

recover is easy to misuse. The most common abuse is treating it as a general-purpose try/catch, the way some other languages handle exceptions. Go's design treats panics as exceptional and errors as ordinary, and recover is meant to wrap exceptional control flow, not to replace if err != nil. Here are the patterns to avoid.

Anti-pattern 1: catch-all recover that swallows everything. A function that recovers any panic and discards it hides real bugs:
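A sketch of the anti-pattern, shown only as what to avoid. The order type and function name are illustrative.

```go
package main

import "fmt"

type order struct{ items []string }

// processOrderBad swallows every panic. The caller cannot tell a crash
// from a success. Do not copy this.
func processOrderBad(id string, o *order) {
	defer func() {
		recover() // value discarded: no error, no log
	}()
	if id == "" {
		panic("empty order ID")
	}
	// A nil o is a real bug, and it is swallowed just the same.
	fmt.Println("items:", len(o.items))
}

func main() {
	processOrderBad("", &order{})
	processOrderBad("ORD-7", nil)
	fmt.Println("both calls looked successful")
}
```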

The function recovers everything: the intentional panic("empty order ID"), but also the nil pointer dereference that's a real bug in the code. The caller has no way to tell that the function crashed in the middle. The order didn't process. No error came back. No log entry. From the outside, the call looks successful. This pattern (sometimes called "panic eating") is one of the most reliable ways to produce silent data corruption in a production system.

The fix is to either re-panic unknown values (as in the previous section) or to convert recovery to an error so callers know something went wrong. Never just discard.

Anti-pattern 2: using recover for normal control flow. Some Go developers see recover and think "Go does have exceptions after all, I can throw and catch across function boundaries". Don't. panic and recover are slower than returning errors, they bypass the type system, and they make code harder to read because the control flow jumps don't show up in function signatures.
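A sketch contrasting the two styles. The found type and the function names are assumptions; findItemBad shows the misuse and findItem the plain equivalent.

```go
package main

import "fmt"

// found is a private panic value used to smuggle the index out of the loop.
type found struct{ index int }

// findItemBad abuses panic/recover as a "break with value". Avoid this.
func findItemBad(items []string, target string) (index int) {
	defer func() {
		if f, ok := recover().(found); ok {
			index = f.index
		}
	}()
	for i, it := range items {
		if it == target {
			panic(found{i})
		}
	}
	return -1
}

// findItem does the same job with an ordinary return.
func findItem(items []string, target string) int {
	for i, it := range items {
		if it == target {
			return i
		}
	}
	return -1
}

func main() {
	items := []string{"BOOK-01", "PEN-07", "MUG-03"}
	fmt.Println(findItemBad(items, "PEN-07"), findItem(items, "PEN-07"))
	fmt.Println(findItemBad(items, "HAT-99"), findItem(items, "HAT-99"))
}
```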

This works, but it's a terrible way to write findItem. The function uses panic to break out of the loop and the recovery to capture the index. A plain return i would do the same thing in two lines instead of the elaborate panic/recover dance. The only time this kind of thing is justified is when you're escaping out of deeply nested recursive code where every level would otherwise have to check and propagate an error, and even then you do it inside a single function or single package, not across an API boundary.

Anti-pattern 3: recovering in main when you don't have a good response to the panic. Sometimes a panic in main is the correct outcome: the program has hit a state where continuing would be worse than crashing. Resist the urge to wrap main in a blanket recover just because crashes look bad. If the program can't proceed safely, let it die so the orchestrator (systemd, Kubernetes, Docker) sees a non-zero exit code and restarts it cleanly.
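A sketch of the problem, assuming a hypothetical mustLoadConfig that panics when the file is missing. The config type, the app.yaml path, and the returned value are illustrative.

```go
package main

import "fmt"

type config struct{ dbURL string }

// mustLoadConfig stands in for a loader that panics on a missing file.
func mustLoadConfig(path string) *config {
	panic("config file not found: " + path)
}

// badMain recovers the startup panic and carries on with no config.
func badMain() (dbURL string) {
	var cfg *config
	func() {
		defer func() { recover() }() // hides the startup failure
		cfg = mustLoadConfig("app.yaml")
	}()
	if cfg == nil {
		return "" // still running, but with nothing loaded
	}
	return cfg.dbURL
}

func main() {
	fmt.Printf("database URL: %q\n", badMain())
}
```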

badMain recovers a missing-config panic and proceeds as if the config had loaded. The program will keep running, but it will keep running broken, probably failing in mysterious ways downstream because no config was actually loaded. A crash on missing config is honest signal that something is wrong with the deployment; eating that signal turns a clear failure into a confusing one.

Anti-pattern 4: recovering and then logging at the wrong level. When a panic is genuinely unexpected (a real bug, not a control-flow signal), the right log level is whatever your loudest "page someone" level is. Logging it at info or debug because "we already handled it" obscures the fact that something went very wrong. Recovery in middleware is for keeping the server alive, not for downgrading bugs to information.

| Scenario | Should you use recover? |
| --- | --- |
| Public API of a package whose internals use panic for parser/validator unwinding | Yes, convert panic to error at the boundary |
| HTTP request handler in a long-lived server | Yes, keep one bad request from killing the process |
| Worker goroutine processing items from a queue | Yes, log and continue with the next item |
| A function that wants "exception-like" control flow across many functions | No, use returned errors |
| A function whose only goal is to make tests look like they pass | No, fix the bug |
| Catching every possible runtime error so the program "never crashes" | No, that's hiding bugs |
| main in a CLI tool that hits an unrecoverable state | No, let the process exit |

Use recover where a panic represents a known boundary you understand (request, message, parser run), and let panics propagate everywhere else.

Recovering in main vs in a Server Framework

The "should I recover at the top of main" question comes up often enough to deserve its own section. The short answer is "almost never". The long answer depends on what kind of program you're writing.

For a short-lived command-line tool, a panic at the top is the right outcome. The program will exit with a non-zero status, the runtime will print a stack trace, and whoever ran it can see what happened. Wrapping main in a recover hides the stack trace and replaces it with whatever message you logged, which is almost always less useful.

The CLI dies with a clear panic message and a stack trace pointing to the exact line. A user looking at this can immediately see the problem. Adding a defer func() { if r := recover(); r != nil { fmt.Println("oops:", r); os.Exit(1) }}() to main would replace this with oops: order ID required and lose the stack. Worse, if you forget the os.Exit(1), the program would also lose its non-zero exit code, which is what shell scripts and orchestrators rely on to detect failure.

For a long-running service, the calculus is different. The server is supposed to keep handling work even when individual requests or messages fail. Recovery happens at the per-request or per-message level (HTTP middleware, message consumer wrapper), not at the level of main. main should still be allowed to crash on initialization errors, missing config, or other unrecoverable startup conditions, because at startup there's nothing useful to recover to.

The diagram contrasts the two scopes. A panic in a request handler is per-request: the middleware recovers, logs the failure, returns a 500, and the server keeps serving. A panic during startup (missing config, can't bind to port) is process-wide: there's nowhere useful to recover to, so the program exits and the orchestrator restarts it. Conflating the two scopes is what produces "we recover everywhere just to be safe" code, which sounds prudent but actually makes incidents harder to diagnose.

A worker that pulls jobs off a queue follows the same pattern as an HTTP server. Each job runs in its own goroutine (or its own iteration of a loop). Each one is wrapped with a recover that logs the panic and moves on to the next job. The main loop and the process itself are not recovered. If the worker can't connect to the queue at startup, it dies and the orchestrator restarts it. If it can connect but one job has a bug, recover keeps the worker alive for the next job.
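A sketch of a per-job wrapper, using a slice in place of a real queue. The job type, handle, and runJob are assumptions made for the example.

```go
package main

import (
	"fmt"
	"log"
)

type job struct {
	id    string
	items int
}

// handle does the real work and panics on input it cannot process.
func handle(j job) {
	if j.items < 0 {
		panic(fmt.Sprintf("job %s has negative item count %d", j.id, j.items))
	}
	fmt.Printf("job %s: shipped %d items\n", j.id, j.items)
}

// runJob isolates one job: a panic is logged and reported as a failure,
// and the caller's loop continues.
func runJob(j job) (ok bool) {
	defer func() {
		if r := recover(); r != nil {
			log.Printf("job %s panicked: %v", j.id, r)
			ok = false
		}
	}()
	handle(j)
	return true
}

func main() {
	jobs := []job{{"J-1", 3}, {"J-2", -1}, {"J-3", 5}}
	done := 0
	for _, j := range jobs {
		if runJob(j) {
			done++
		}
	}
	fmt.Println("completed", done, "of", len(jobs))
}
```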

The bad job (negative items) panics, the wrapper recovers and logs, and the loop moves on to the next job. The third job runs fine. The worker process never dies. This is the right shape for queue consumers, batch processors, and any long-running loop over external input.

Recovering in main for a worker like this would be redundant. The work has already been isolated with the per-job wrapper. A panic that escapes the wrapper (say, a second panic raised while logging the recovery) almost certainly means the process is in a bad state anyway, and crashing is the safest outcome.