Error Handling Best Practices

Last Updated: May 22, 2026

The mechanics of Go's error handling fit into a handful of chapters, but choosing the right mechanism for a given situation is where the real engineering work happens. A codebase that uses fmt.Errorf with %w everywhere reads like a stack trace pretending to be a program. A codebase that catches panics in every function reads like Java in disguise. A codebase that returns sentinel errors for every failure makes every package depend on every other one. This chapter pulls together every decision the section has covered into one place, with rules of thumb, anti-patterns, and a decision flow you can follow before writing any error-returning function.

The Four Tools and When Each One Fits

Go provides four ways to produce an error. errors.New and fmt.Errorf make ad-hoc errors. Sentinel errors are package-level error variables that callers can compare against. Custom error types carry structured data alongside the message. Each one has a clear best fit.

A short rule, then the details:

| Mechanism | Use when |
| --- | --- |
| errors.New("message") | Fixed message, no dynamic data, callers don't distinguish |
| fmt.Errorf("...: %v", val) | Message includes dynamic data, callers don't unwrap |
| fmt.Errorf("...: %w", err) | Wrapping to add context, callers may want the cause |
| Sentinel (var ErrNotFound = errors.New(...)) | Callers in other packages detect this specific error |
| Custom error type | Errors carry structured fields callers inspect |

The decision is about what callers will do with the error. If callers will only log and return, a plain errors.New or fmt.Errorf is enough. If callers will branch on the kind of error (retry on one, give up on another, return a 404 instead of a 500), they need something they can compare or type-assert. Sentinel errors are the cheapest way to express "this kind." Custom types fit when "this kind" needs to carry data, like the offending field name in a validation error or the HTTP status to return.

The flowchart turns the rule into a four-question walk. Run any "what error do I return here?" question through it, and the answer drops out. The rest of the chapter unpacks the bends and edge cases.

A common e-commerce scenario:
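A minimal sketch of such a scenario follows. The in-memory catalog map, the findProduct helper, and the status labels printed by main are assumptions made for illustration; only the names ErrProductNotFound, ValidationError, and displayProduct come from the discussion.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrProductNotFound is a sentinel: callers branch on it with errors.Is.
var ErrProductNotFound = errors.New("product not found")

// ValidationError is a custom type: callers may want the offending field.
type ValidationError struct {
	Field  string
	Reason string
}

func (e *ValidationError) Error() string {
	return fmt.Sprintf("invalid %s: %s", e.Field, e.Reason)
}

// catalog is a hypothetical in-memory store standing in for a real lookup.
var catalog = map[string]float64{"BOOK-01": 12.50}

// findProduct returns the price or one of the two detectable errors.
func findProduct(code string) (float64, error) {
	if code == "" {
		return 0, &ValidationError{Field: "code", Reason: "must not be empty"}
	}
	price, ok := catalog[code]
	if !ok {
		return 0, ErrProductNotFound
	}
	return price, nil
}

// displayProduct adds the operation and product code as context and keeps
// the cause reachable through %w.
func displayProduct(code string) (string, error) {
	price, err := findProduct(code)
	if err != nil {
		return "", fmt.Errorf("display product %q: %w", code, err)
	}
	return fmt.Sprintf("%s costs %.2f", code, price), nil
}

func main() {
	for _, code := range []string{"BOOK-01", "PEN-99", ""} {
		line, err := displayProduct(code)
		var verr *ValidationError
		switch {
		case errors.Is(err, ErrProductNotFound):
			fmt.Println("404:", err)
		case errors.As(err, &verr):
			fmt.Println("400: bad field", verr.Field)
		default:
			fmt.Println(line)
		}
	}
}
```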

Three error types, each used for the reason it fits. ErrProductNotFound is a sentinel because callers branch on it. ValidationError is a custom type because callers may want the field name. displayProduct wraps with %w because it adds context (the product code, the operation name) on top of a lower error that callers may still need to inspect.

When to Wrap and When Not To

The mechanic of %w is simple: fmt.Errorf("context: %w", err) returns a new error that wraps err, and errors.Is/errors.As can walk the chain to find specific kinds. The harder question is when wrapping is appropriate and when it leaks information that callers shouldn't depend on.

The rule of thumb is one sentence: wrap when you're adding context to an error from a lower layer that callers may still want to inspect. Don't wrap when the lower-layer error is an implementation detail callers shouldn't see, or when there's no context to add.

A wrapping example that gets it right:
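One way this could look, as a sketch: loadCatalog, catalogMissing, and the file path are hypothetical names chosen for the example, and os.ReadFile stands in for whatever lower-level call actually fails.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// loadCatalog reads a catalog file and wraps any failure with the operation
// name and the path, keeping the original error in the chain.
func loadCatalog(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("load catalog %s: %w", path, err)
	}
	return data, nil
}

// catalogMissing reports whether err means the file was absent, without
// looking at the message text.
func catalogMissing(err error) bool {
	return errors.Is(err, os.ErrNotExist)
}

func main() {
	_, err := loadCatalog("no-such-dir/catalog.json")
	if catalogMissing(err) {
		fmt.Println("falling back to the default catalog:", err)
	}
}
```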

The wrap adds the path and the operation name. The underlying os.ErrNotExist is still detectable through errors.Is. A caller deciding whether to fall back to a default catalog or retry can do so without parsing the message.

A wrapping example that gets it wrong:
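The sketch below contrasts the leaky version with a translated one. It assumes a SQL-backed store; queryPrice is a hypothetical stand-in that always reports sql.ErrNoRows so the example runs without a driver, and loadProductBad and leaksStorage are names invented for the comparison.

```go
package main

import (
	"database/sql"
	"errors"
	"fmt"
)

var ErrProductNotFound = errors.New("product not found")

// queryPrice is a hypothetical stand-in for a real database call.
func queryPrice(code string) (float64, error) {
	return 0, sql.ErrNoRows
}

// loadProductBad wraps the storage error as-is, so callers can start
// matching on sql.ErrNoRows and become coupled to the database.
func loadProductBad(code string) (float64, error) {
	price, err := queryPrice(code)
	if err != nil {
		return 0, fmt.Errorf("load product %q: %w", code, err)
	}
	return price, nil
}

// loadProductGood translates the storage error into a domain error first,
// then wraps that with context.
func loadProductGood(code string) (float64, error) {
	price, err := queryPrice(code)
	if errors.Is(err, sql.ErrNoRows) {
		return 0, fmt.Errorf("load product %q: %w", code, ErrProductNotFound)
	}
	if err != nil {
		// Unknown storage failure: keep the text with %v, hide the identity.
		return 0, fmt.Errorf("load product %q: %v", code, err)
	}
	return price, nil
}

// leaksStorage reports whether the SQL sentinel is visible in err's chain.
func leaksStorage(err error) bool {
	return errors.Is(err, sql.ErrNoRows)
}

func main() {
	_, bad := loadProductBad("BOOK-01")
	_, good := loadProductGood("BOOK-01")
	fmt.Println(leaksStorage(bad), leaksStorage(good)) // true false
	fmt.Println(errors.Is(good, ErrProductNotFound))   // true
}
```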

loadProductGood translates the storage-layer error into a domain error before wrapping. Callers learn about "product not found" without learning that there's a SQL database underneath. The implementation can change without breaking every caller. This is the rule: wrap only when the underlying error is part of the contract you want to expose.

The other case where wrapping is wrong is when you have no context to add. Wrapping just to wrap (fmt.Errorf("%w", err)) accomplishes nothing the original error didn't already say. Return the error directly:
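A small sketch of the difference, using hypothetical reserve, reserveNoisy, and reserveDirect functions and an ErrOutOfStock sentinel invented for the example:

```go
package main

import (
	"errors"
	"fmt"
)

var ErrOutOfStock = errors.New("out of stock")

// reserve is a hypothetical lower-level call whose error already says enough.
func reserve(code string) error {
	return ErrOutOfStock
}

// reserveNoisy wraps without adding any context.
func reserveNoisy(code string) error {
	if err := reserve(code); err != nil {
		return fmt.Errorf("%w", err) // same text, one extra layer
	}
	return nil
}

// reserveDirect has nothing to add, so it passes the error through.
func reserveDirect(code string) error {
	return reserve(code)
}

func main() {
	noisy, direct := reserveNoisy("BOOK-01"), reserveDirect("BOOK-01")
	fmt.Println(noisy.Error() == direct.Error())                 // true
	fmt.Println(noisy == ErrOutOfStock, direct == ErrOutOfStock) // false true
}
```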

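The fmt.Errorf("%w", err) version produces an error with exactly the same string as err and an unwrap chain that leads straight back to it, at the cost of an extra allocation and a value that no longer compares equal to err with ==. It's noise.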

| Situation | Wrap? | Verb |
| --- | --- | --- |
| Adding context to a lower error callers may need to detect | Yes | %w |
| Adding context to an error whose cause is an internal detail | Translate first, then wrap | %w after translation |
| Reporting an error you constructed yourself in this function | No wrap needed | Return the error directly |
| Logging an error without changing it | No wrap | Pass the original |
| Including a value purely for the message, not for inspection | Format only | %v |

The diagram is the same rule split into three questions. Most error sites land on "return directly" or "wrap with %w." The "translate" case is the one most worth practicing, because it's the case where mechanical wrapping causes long-term pain.

Error Message Style Rules

The Go standard library and most idiomatic code follow a consistent style for error messages. Following the same style makes your errors stack cleanly when wrapped and reads naturally when printed. The rules:

Start lowercase. Error messages chain together with colons or other punctuation, and a capital letter mid-sentence reads wrong. errors.New("product not found"), not errors.New("Product not found"). The standard library does this almost everywhere; the usual exceptions are messages that begin with a proper noun, an acronym, or an exported identifier.

No trailing punctuation. Error messages don't end in a period, exclamation mark, or any other punctuation. A caller may wrap the message, log it, or embed it in a longer string, and a sentence-ending period in the middle of that string looks broken. errors.New("invalid order id"), not errors.New("invalid order id.").

Lead with context. When wrapping, put the operation or context first and the underlying cause last, separated by a colon. The pattern is "operation: cause", not "cause: operation". Reading the result top-down tells the reader exactly what failed, then why.
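The two orderings can be sketched side by side. The function names contextFirst and causeFirst are hypothetical, as is the ErrProductNotFound sentinel declared here:

```go
package main

import (
	"errors"
	"fmt"
)

var ErrProductNotFound = errors.New("product not found")

// contextFirst follows the "operation: cause" convention.
func contextFirst(code string) error {
	return fmt.Errorf("load product '%s': %w", code, ErrProductNotFound)
}

// causeFirst reverses the order and reads awkwardly once printed.
func causeFirst(code string) error {
	return fmt.Errorf("%w: while loading product '%s'", ErrProductNotFound, code)
}

func main() {
	fmt.Println(contextFirst("BOOK-01")) // load product 'BOOK-01': product not found
	fmt.Println(causeFirst("BOOK-01"))   // product not found: while loading product 'BOOK-01'
}
```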

The first version reads "load product 'BOOK-01': product not found." The second reads "product not found: while loading product 'BOOK-01'", which is awkward and breaks the convention every Go tool and reader expects.

No redundant `error:` prefix. Don't prefix messages with "error:". The fact that the value is an error is already obvious from the type signature, and the prefix would repeat at every layer after wrapping. errors.New("invalid order id"), not errors.New("error: invalid order id"). The standard library avoids it.

Be specific. A message like "failed" or "something went wrong" tells the caller nothing. Include what was being attempted and any value that helps identify the failure. "open config file /etc/app.conf: permission denied" is useful. "failed to open file" is barely better than no message.

Here's a table of common mistakes and their fixes:

| Wrong | Right | Reason |
| --- | --- | --- |
| errors.New("Failed to load product.") | errors.New("load product failed") | Capital + period |
| errors.New("Error: not found") | errors.New("not found") | Redundant prefix |
| fmt.Errorf("%w: while reading", err) | fmt.Errorf("read: %w", err) | Context-first ordering |
| errors.New("oops") | errors.New("invalid coupon code") | Specific over vague |
| errors.New("ERROR: BAD INPUT") | errors.New("invalid input") | All caps + redundant prefix |

A complete example showing the wrapping chain:
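A sketch of a three-layer chain, where every function name, identifier value, and the ErrInsufficientStock sentinel are hypothetical:

```go
package main

import (
	"errors"
	"fmt"
)

var ErrInsufficientStock = errors.New("insufficient stock")

// reserveStock is the innermost layer; here it always fails.
func reserveStock(code string, qty int) error {
	return fmt.Errorf("reserve %d of %s: %w", qty, code, ErrInsufficientStock)
}

// placeOrder adds the order ID as context.
func placeOrder(orderID, code string, qty int) error {
	if err := reserveStock(code, qty); err != nil {
		return fmt.Errorf("place order %s: %w", orderID, err)
	}
	return nil
}

// checkout adds the user as the outermost context.
func checkout(userID, orderID string) error {
	if err := placeOrder(orderID, "BOOK-01", 3); err != nil {
		return fmt.Errorf("checkout for user %s: %w", userID, err)
	}
	return nil
}

func main() {
	err := checkout("u42", "ORD-7")
	fmt.Println(err)
	// checkout for user u42: place order ORD-7: reserve 3 of BOOK-01: insufficient stock
	fmt.Println(errors.Is(err, ErrInsufficientStock)) // true
}
```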

The whole chain reads as one sentence, lowercase from start to finish, no trailing punctuation, with the context layered from outermost to innermost. The standard library's errors compose exactly this way (open /etc/app.conf: no such file or directory), and your errors should too.

Handle Errors Once

A common mistake in error-heavy code is to log an error and also return it. Every layer above does the same thing, and the log ends up with three copies of the same message, each with a slightly different prefix. The rule that prevents this is short: handle an error once. Either log it (and stop the chain by returning a sentinel or nil) or return it (and let the caller decide).

The wrong shape logs at every layer:
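A sketch of that shape, with loadCatalog, startup, and main as the three layers. The file path is hypothetical, and a real program would exit non-zero in main rather than just logging.

```go
package main

import (
	"log"
	"os"
)

// loadCatalog logs the failure and also returns it.
func loadCatalog(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		log.Printf("loadCatalog: %v", err) // first copy
		return nil, err
	}
	return data, nil
}

// startup repeats the mistake one layer up.
func startup(path string) error {
	if _, err := loadCatalog(path); err != nil {
		log.Printf("startup: %v", err) // second copy
		return err
	}
	return nil
}

func main() {
	if err := startup("no-such-dir/catalog.json"); err != nil {
		log.Printf("main: %v", err) // third copy
	}
}
```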

If the file doesn't exist, this prints three log lines for one failure: one from loadCatalog, one from startup, one from main. Three copies of the same information, and if you're filtering logs in production, you'll waste time chasing what looks like three separate failures.

The right shape:

Each layer adds its context and returns. Only main logs, and the single log line carries the whole chain:
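A minimal sketch of that shape, reusing the loadCatalog and startup names with a hypothetical file path. The log line in the comment is what a Linux system would typically print; the tail of the message comes from the operating system and may differ elsewhere.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// loadCatalog adds its context and returns; it does not log.
func loadCatalog(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("load catalog: %w", err)
	}
	return data, nil
}

// startup does the same one layer up.
func startup(path string) error {
	if _, err := loadCatalog(path); err != nil {
		return fmt.Errorf("startup: %w", err)
	}
	return nil
}

func main() {
	if err := startup("no-such-dir/catalog.json"); err != nil {
		// Typical output on Linux:
		// startup: load catalog: open no-such-dir/catalog.json: no such file or directory
		log.Printf("%v", err) // a real program would also exit non-zero here
	}
}
```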

One failure, one log entry, the entire history of the call.

There's one common exception worth naming. A goroutine spawned by your code has no caller to return to. If it can't surface an error through a channel or callback, logging is the only thing it can do. Library code generally shouldn't log at all; logging is the application's job, and the library returns errors so the application can decide whether to log, swallow, or surface to the user.

| Layer | Action | Why |
| --- | --- | --- |
| Library function deep in the stack | Return with wrapping | Caller may want context or to detect the kind |
| Service layer | Return with wrapping | Same as above |
| HTTP handler / CLI top level | Log once and return a response | Last layer that has a caller; user-facing |
| Goroutine top level | Log if no error channel exists | No caller to return to |
| main function | Log and exit | The buck stops here |

Don't Use panic for Ordinary Failures

The short rule: panic is for unrecoverable, programmer-error situations. Ordinary failures, the kind any well-behaved program will encounter, return an error.

A failed database query, a malformed request, a missing file, a network timeout, a value outside an expected range: every one of these is an ordinary failure. The function that detects it returns an error. Callers decide what to do.

A nil dereference where there should be a valid pointer, an array index past the end of the slice, an assertion that two invariants both hold at the same time but don't: these are programmer errors. The program is in an inconsistent state and there's no sensible way to continue. A panic is appropriate, often automatically (Go panics on its own for nil dereferences and out-of-bounds access).
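To make the contrast concrete, here is a sketch of a wrong and a right version of the same parser. The names parseQuantityBad and parseQuantity are hypothetical, as is the rule that a quantity must be positive.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseQuantityBad panics on something the user typed, which is an
// ordinary failure rather than a bug.
func parseQuantityBad(s string) int {
	n, err := strconv.Atoi(s)
	if err != nil {
		panic(fmt.Sprintf("bad quantity %q", s)) // takes the whole program down
	}
	return n
}

// parseQuantity reports the same failure as an error the caller can handle.
func parseQuantity(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, fmt.Errorf("parse quantity %q: %w", s, err)
	}
	if n <= 0 {
		return 0, fmt.Errorf("parse quantity %q: must be positive", s)
	}
	return n, nil
}

func main() {
	if _, err := parseQuantity("three"); err != nil {
		fmt.Println("please enter a number:", err)
	}
}
```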

The wrong version crashes the whole program because a user typed something the parser didn't recognize. That's a runtime failure of caller input, not a bug in the program. The right version surfaces the failure where the caller can show a useful message and move on.

The places where panic is the right choice are narrow. The standard library uses it for regexp.MustCompile (where a malformed pattern is a programmer mistake compiled into the binary), for template.Must (same reason), and inside the runtime for genuine impossibilities like a corrupted internal map. Application code rarely has the same situation.

A useful test: would a correct program, fed real-world input, ever reach this line? If yes, return an error. If no, the line represents an invariant violation, and a panic is honest about that.

The flowchart restates the rule. If the situation is something a well-behaved program could encounter (bad input, network failure, missing data), return an error. If it represents a violated assumption your code rests on (an unreachable branch, a nil pointer that was supposed to be set, a default case in a switch over a closed enum), panic is appropriate.

panic and recover Only at Boundaries

The companion to the rule about when to panic is the rule about where to recover: only at boundaries where letting the panic kill the process would be worse than turning it into an error.

Three boundaries apply:

HTTP handlers and similar request boundaries. A panic in one handler shouldn't crash the whole server. The standard library's net/http already recovers panics from each handler: it logs a stack trace and closes the connection (or resets the HTTP/2 stream), but it does not write a 500 response for you. If you're writing a server framework or middleware, recovering there and returning a 500 is correct. Inside your own handler code, don't recover; let the outer recover catch it.

Goroutine top-level functions. A panic in a goroutine kills the program, not just the goroutine. If you spawn a goroutine that does work the program can survive without (a background metric flush, a periodic cache refresh), wrap its body in a recover so an unexpected panic doesn't take everything down.
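A sketch of a recovering goroutine entry point. The names flushMetrics, worker, and runInBackground are hypothetical, and the done channel exists only so the example can report what happened.

```go
package main

import (
	"fmt"
	"log"
)

// flushMetrics is a hypothetical background task containing a bug.
func flushMetrics() {
	var counts map[string]int
	counts["flushes"]++ // panics: assignment to entry in nil map
}

// worker is the goroutine entry point; the only recover lives here.
func worker(task func(), done chan<- bool) {
	defer func() {
		r := recover()
		if r != nil {
			log.Printf("worker panicked: %v", r)
		}
		done <- r != nil
	}()
	task()
}

// runInBackground starts the worker and reports whether it had to recover.
func runInBackground(task func()) bool {
	done := make(chan bool, 1)
	go worker(task, done)
	return <-done
}

func main() {
	recovered := runInBackground(flushMetrics)
	fmt.Println("program still running; recovered =", recovered)
}
```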

The recover is in the goroutine entry point, not scattered through every function the goroutine calls. If a panic happens anywhere in the call stack, the recover at the top catches it, logs, and lets the worker exit cleanly without killing the program.

Library APIs that convert internal panics to errors. If your library internally panics on an impossible state but you don't want callers to see panics, recover at the public entry point and convert to an error. The encoding/json package does something similar in its encoder: deep in the recursion it panics with an internal error value, and the top-level marshal call recovers that value and returns it as an ordinary error, re-panicking on anything it doesn't recognize.
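A sketch of the pattern with a hypothetical tree-summing library: the node type, the unexported sum, and the exported Sum are all invented for illustration.

```go
package main

import "fmt"

type node struct {
	value    int
	children []*node
}

// sum walks the tree and panics on an impossible state (a nil child).
func sum(n *node) int {
	total := n.value // nil pointer dereference if n is nil
	for _, c := range n.children {
		total += sum(c)
	}
	return total
}

// Sum is the public entry point; it converts internal panics to errors
// by assigning to the named return value from a deferred function.
func Sum(root *node) (total int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("sum tree: internal error: %v", r)
		}
	}()
	return sum(root), nil
}

func main() {
	broken := &node{value: 1, children: []*node{nil}}
	_, err := Sum(broken)
	fmt.Println(err)
}
```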

The named return value err is set by the deferred function if a panic happens. The caller sees a normal error return, not a crash. This pattern is only worth using when there's a meaningful chance of an internal panic and the caller genuinely can't be expected to handle it.

What recover should never do:

  • Catch panics inside a function and resume normal execution as if nothing happened. A panic means an invariant was violated. Pretending it didn't happen is how broken programs continue to run, corrupting more data.
  • Replace ordinary error handling. Every function that returns (T, error) should still return errors normally. recover is for the rare panic-via-runtime case, not the common error case.
  • Run inside utility functions throughout the codebase. Recovery belongs at the boundary, and nowhere else.

| Boundary | Why a recover belongs there |
| --- | --- |
| HTTP handler / RPC entry | One bad request shouldn't kill the whole server |
| Goroutine top function | A panic in a goroutine kills the program; this is the only place to catch it |
| Library public API | Convert internal invariant violations into errors so callers don't see panics |
| Anywhere else | Don't recover; let the panic propagate |

Don't Swallow Errors

The most insidious form of broken error handling is the silent swallow: a function returns an error and the caller ignores it with _. The program runs, the test passes, and the bug shows up in production three weeks later when a database write didn't actually happen.
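A sketch of both forms of the swallow. The Order type and function names are hypothetical; the channel field is there only because encoding/json cannot marshal channels, which forces the marshal to fail. In this sketch both versions end up writing an empty file while reporting success.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// Order has a field encoding/json cannot marshal, so Marshal always fails.
type Order struct {
	ID     string
	Notify chan string
}

// saveOrderDiscard throws the marshal error away with _.
func saveOrderDiscard(path string, o Order) error {
	data, _ := json.Marshal(o)
	return os.WriteFile(path, data, 0o644)
}

// saveOrderOverwrite captures the marshal error but overwrites it unread.
func saveOrderOverwrite(path string, o Order) error {
	data, err := json.Marshal(o)
	err = os.WriteFile(path, data, 0o644)
	return err
}

// runBoth calls both versions against a temporary file.
func runBoth() (discardErr, overwriteErr error) {
	dir, err := os.MkdirTemp("", "orders")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "order.json")
	o := Order{ID: "ORD-7"}
	return saveOrderDiscard(path, o), saveOrderOverwrite(path, o)
}

func main() {
	fmt.Println(runBoth()) // <nil> <nil>, although nothing useful was saved
}
```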

The first version uses _ to explicitly discard the error. The second one assigns to err but then never reads it before the variable is overwritten by the next line. Both crash later, in confusing ways, if the marshal fails (one tries to write garbage, the other tries to write nothing).

The fix is always the same: check the error and decide what to do.
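A sketch of the checked version, using a hypothetical saveOrder function and an Order type whose channel field makes encoding/json fail:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Order has a field encoding/json cannot marshal, so Marshal always fails.
type Order struct {
	ID     string
	Notify chan string
}

// saveOrder checks every error and adds context before returning it.
func saveOrder(path string, o Order) error {
	data, err := json.Marshal(o)
	if err != nil {
		return fmt.Errorf("save order %s: encode: %w", o.ID, err)
	}
	if err := os.WriteFile(path, data, 0o644); err != nil {
		return fmt.Errorf("save order %s: write: %w", o.ID, err)
	}
	return nil
}

func main() {
	// The marshal fails, so the file is never touched and the caller is told.
	fmt.Println(saveOrder("order.json", Order{ID: "ORD-7"}))
}
```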

Every error gets checked, and every check decides whether to propagate, recover, or take a different path. The Go community's errcheck linter exists specifically to catch unchecked errors through static analysis, before the code ever runs, and many production codebases run it as part of CI.

There are two narrow cases where _ for an error is acceptable. The first is when the operation genuinely cannot fail in the way you're using it. The standard library's bytes.Buffer.Write returns an error in its signature only because it satisfies io.Writer; in practice it never returns one. The second is when the error has already been handled (logged or surfaced by a previous call) and you're in cleanup code where you can't do anything else. A deferred file.Close() after a successful write often falls into this category, though the safer pattern is to assign to a named return and let defer write back the close error if one happens.
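A sketch of the named-return approach. The writeReport and writeToTemp functions are hypothetical names for this example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeReport writes data to path. A Close failure is reported only when
// nothing earlier failed, so the first error always wins.
func writeReport(path string, data []byte) (err error) {
	f, err := os.Create(path)
	if err != nil {
		return fmt.Errorf("write report: %w", err)
	}
	defer func() {
		if cerr := f.Close(); cerr != nil && err == nil {
			err = fmt.Errorf("write report: close: %w", cerr)
		}
	}()
	if _, err := f.Write(data); err != nil {
		return fmt.Errorf("write report: %w", err)
	}
	return nil
}

// writeToTemp exercises writeReport against a throwaway directory.
func writeToTemp(data []byte) error {
	dir, err := os.MkdirTemp("", "report")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	return writeReport(filepath.Join(dir, "report.txt"), data)
}

func main() {
	fmt.Println(writeToTemp([]byte("ok"))) // <nil>
}
```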

The deferred close pattern preserves any earlier error while still surfacing a close error when nothing else failed. It's a bit of ceremony, but it's the right way to handle cleanup that can itself fail.

What's wrong with this code?
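A sketch of the kind of function in question. To keep it runnable without a database driver, DB and Tx are hypothetical, trimmed-down interfaces standing in for database/sql's *sql.DB and *sql.Tx, and downDB and tryUpdate exist only to demonstrate the failure.

```go
package main

import (
	"errors"
	"fmt"
)

// Tx and DB are simplified stand-ins, not the real database/sql API.
type Tx interface {
	Exec(query string, args ...any) error
	Commit() error
}

type DB interface {
	Begin() (Tx, error)
}

// updateStock ignores every error it receives.
func updateStock(db DB, code string, qty int) error {
	tx, _ := db.Begin()
	tx.Exec("UPDATE products SET stock = ? WHERE code = ?", qty, code)
	tx.Commit()
	return nil
}

// downDB simulates a database that cannot start a transaction.
type downDB struct{}

func (downDB) Begin() (Tx, error) {
	return nil, errors.New("connection refused")
}

// tryUpdate runs updateStock and reports whether it panicked.
func tryUpdate(db DB) (panicked bool) {
	defer func() { panicked = recover() != nil }()
	_ = updateStock(db, "BOOK-01", 5)
	return false
}

func main() {
	fmt.Println("panicked:", tryUpdate(downDB{})) // panicked: true
}
```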

The function silently ignores three errors: db.Begin may fail to start a transaction, tx.Exec may fail to run the update, and tx.Commit may fail to commit the transaction. The function returns nil as if everything worked even when the stock didn't change. Worse, if Begin failed and tx is nil, the next line panics. The fix is straightforward:

Fix:
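A sketch of the corrected function. As before, DB and Tx are hypothetical simplified interfaces rather than the real database/sql types, and fakeDB and fakeTx are test doubles invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Tx and DB are simplified stand-ins, not the real database/sql API.
type Tx interface {
	Exec(query string, args ...any) error
	Commit() error
	Rollback() error
}

type DB interface {
	Begin() (Tx, error)
}

// updateStock checks every error and rolls back when the update fails.
func updateStock(db DB, code string, qty int) error {
	tx, err := db.Begin()
	if err != nil {
		return fmt.Errorf("update stock %s: begin: %w", code, err)
	}
	if err := tx.Exec("UPDATE products SET stock = ? WHERE code = ?", qty, code); err != nil {
		_ = tx.Rollback() // best effort; the Exec error is the one to report
		return fmt.Errorf("update stock %s: exec: %w", code, err)
	}
	if err := tx.Commit(); err != nil {
		return fmt.Errorf("update stock %s: commit: %w", code, err)
	}
	return nil
}

var errDiskFull = errors.New("disk full")

type fakeTx struct {
	execErr    error
	rolledBack bool
}

func (t *fakeTx) Exec(string, ...any) error { return t.execErr }
func (t *fakeTx) Commit() error             { return nil }
func (t *fakeTx) Rollback() error           { t.rolledBack = true; return nil }

type fakeDB struct{ tx *fakeTx }

func (d fakeDB) Begin() (Tx, error) { return d.tx, nil }

func main() {
	tx := &fakeTx{execErr: errDiskFull}
	fmt.Println(updateStock(fakeDB{tx: tx}, "BOOK-01", 5), tx.rolledBack)
	// update stock BOOK-01: exec: disk full true
}
```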

Every error is checked and either propagated or used to drive a rollback. The function now correctly reports failures and leaves the database in a consistent state.

Designing Errors Callers Can Act On

The whole purpose of returning an error is to let the caller do something different than they would have on success. If the caller can't do anything different, the error provides no value beyond logging. Designing error APIs starts with the question: what will callers do with this error?

If the answer is "they'll all return it up to the next layer," a single error type with a good message is enough. There's no reason to design multiple error kinds when nobody distinguishes between them.

If the answer is "some callers will retry, others will surface to the user, others will return a 404," the error needs to carry enough structure for callers to branch. Sentinel errors are the cheapest form. Custom types with kind fields are next.
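A sketch of a custom type with a kind field. The Msg field, the checkout function, and the statusFor mapping are assumptions for illustration; in particular, mapping the transient kind to 503 is just one reasonable choice for signalling "retry later".

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// CheckoutError covers several failure kinds with one type.
type CheckoutError struct {
	Kind  string // "validation", "fraud", or "transient"
	Field string // set only for validation errors
	Msg   string
}

func (e *CheckoutError) Error() string {
	if e.Field != "" {
		return fmt.Sprintf("checkout %s error on %s: %s", e.Kind, e.Field, e.Msg)
	}
	return fmt.Sprintf("checkout %s error: %s", e.Kind, e.Msg)
}

// checkout is a hypothetical function that fails validation on empty input.
func checkout(email string) error {
	if email == "" {
		return fmt.Errorf("checkout: %w", &CheckoutError{
			Kind: "validation", Field: "email", Msg: "must not be empty",
		})
	}
	return nil
}

// statusFor lets a caller branch on Kind through any amount of wrapping.
func statusFor(err error) int {
	var ce *CheckoutError
	if !errors.As(err, &ce) {
		return http.StatusInternalServerError
	}
	switch ce.Kind {
	case "validation":
		return http.StatusBadRequest
	case "fraud":
		return http.StatusUnprocessableEntity
	case "transient":
		return http.StatusServiceUnavailable // assumption: caller retries later
	default:
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(statusFor(checkout(""))) // 400
}
```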

A single *CheckoutError type with a Kind string field handles all three failure cases without three separate types. Callers can switch on Kind to decide what to do: return 400 for validation, return 422 for fraud, retry on transient errors. The Field field gives validation errors enough structure to point at the offending input.

When that design is overkill, sentinels are the lighter alternative: a few var Err... = errors.New(...) declarations and an errors.Is switch at the call site. Pick sentinels until you actually need to carry data, then upgrade to a type. The trade-off is direct: sentinels are cheap and composable but carry no data; custom types carry data but require more declaration.

A bad design pattern to avoid: returning errors that callers can't act on. If a function returns error but the caller can only log and continue, the error is providing no decision-making value. Sometimes this is fine (logging is what you want), but sometimes it's a sign that the function should either succeed unambiguously or expose a richer return type. A function attachReceipt that "always succeeds even if the receipt couldn't be attached" should either really always succeed (logging internally) or admit it can fail and return a specific error the caller can use.

API Design: Sentinel or Custom Type?

Once you've decided that an error needs to be detectable by callers, the next question is which mechanism to use. Sentinel errors and custom error types both work, but they fit different shapes of failure.

A sentinel is a single value. errors.Is(err, ErrProductNotFound) answers a yes/no question: is this the kind of error we mean? Sentinels work well for failures that are fully described by their existence. "Not found" is the same kind of "not found" regardless of which product was missing.

A custom type carries fields. errors.As(err, &valErr) extracts a value the caller can read. Custom types work well for failures that need to carry data: the offending field name, the failing record ID, the HTTP status code, a retry-after duration, a list of validation messages.

| Use a sentinel when | Use a custom type when |
| --- | --- |
| The error is fully described by "this happened" | The error has fields callers need to read |
| Callers branch on identity, not data | Callers need data (status, field, retry-after) |
| You want minimal declaration overhead | You're modeling a domain concept (validation, fraud) |
| The error fits across many call sites unchanged | Each instance differs in its details |

A common middle ground is a custom type with a sentinel errors.Is method. The type carries fields, but it also matches a sentinel when callers only want the kind:
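A sketch of that middle ground. NotFoundError, findOrder, and classify are hypothetical names; the Is method is the hook that errors.Is looks for on each error in the chain.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrNotFound = errors.New("not found")

// NotFoundError carries data and also matches the ErrNotFound sentinel.
type NotFoundError struct {
	Resource string
	ID       string
}

func (e *NotFoundError) Error() string {
	return fmt.Sprintf("%s %q not found", e.Resource, e.ID)
}

// Is makes errors.Is(err, ErrNotFound) true for every *NotFoundError.
func (e *NotFoundError) Is(target error) bool {
	return target == ErrNotFound
}

// findOrder is a hypothetical lookup that always misses.
func findOrder(id string) error {
	return fmt.Errorf("find order: %w", &NotFoundError{Resource: "order", ID: id})
}

// classify asks both questions: the general kind and the specific data.
func classify(err error) (notFound bool, id string) {
	var nf *NotFoundError
	if errors.As(err, &nf) {
		id = nf.ID
	}
	return errors.Is(err, ErrNotFound), id
}

func main() {
	fmt.Println(classify(findOrder("ORD-7"))) // true ORD-7
}
```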

The type carries Resource and ID fields callers can read, and the Is method makes every instance match the ErrNotFound sentinel. Callers can ask either question (specific data or general kind) and get the right answer. This pattern shows up in the standard library: *os.PathError carries the path and operation, and matches os.ErrNotExist via its underlying error.

The trade-off is more declaration, but the API stays clean for both styles of caller. Use it when you have a domain concept with structure but you also want the cheap sentinel comparison.

Testing Error Paths

A function with error returns is two functions: the happy path and the error path. Tests for both are necessary, but the error path is the one developers skip most often. Go's errors.Is and errors.As make assertions on specific error kinds straightforward, and they work no matter how many layers of wrapping sit between the function under test and the original error.

In a *_test.go file, a sentinel-based assertion looks like this:
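A sketch of such a test, with a hypothetical checkout function and ErrEmptyCart sentinel. The main function exists only so the sketch compiles as a single file; in a real project the Test function would live in its own *_test.go file and run under go test.

```go
package main

import (
	"errors"
	"fmt"
	"testing"
)

var ErrEmptyCart = errors.New("cart is empty")

// checkout is the hypothetical function under test; it wraps the sentinel.
func checkout(items []string) error {
	if len(items) == 0 {
		return fmt.Errorf("checkout: %w", ErrEmptyCart)
	}
	return nil
}

// TestCheckoutEmptyCart asserts on identity, not on the message text.
func TestCheckoutEmptyCart(t *testing.T) {
	err := checkout(nil)
	if !errors.Is(err, ErrEmptyCart) {
		t.Fatalf("checkout(nil) = %v, want error matching ErrEmptyCart", err)
	}
}

func main() {
	fmt.Println(checkout(nil)) // checkout: cart is empty
}
```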

Asserting on the error string (err.Error() == "cart is empty") is brittle: any layer that adds context breaks the assertion. Asserting on identity via errors.Is is stable.

For custom types, use errors.As:
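A sketch with a hypothetical register function and ValidationError type. As with any single-file example, main is present only to make it compile on its own; the Test function is what go test would run.

```go
package main

import (
	"errors"
	"fmt"
	"testing"
)

type ValidationError struct {
	Field  string
	Reason string
}

func (e *ValidationError) Error() string {
	return fmt.Sprintf("invalid %s: %s", e.Field, e.Reason)
}

// register is the hypothetical function under test.
func register(email string) error {
	if email == "" {
		return fmt.Errorf("register: %w", &ValidationError{Field: "email", Reason: "must not be empty"})
	}
	return nil
}

// TestRegisterMissingEmail checks both the type and its fields.
func TestRegisterMissingEmail(t *testing.T) {
	err := register("")
	var verr *ValidationError
	if !errors.As(err, &verr) {
		t.Fatalf("register(\"\") = %v, want *ValidationError", err)
	}
	if verr.Field != "email" {
		t.Errorf("Field = %q, want %q", verr.Field, "email")
	}
}

func main() {
	fmt.Println(register("")) // register: invalid email: must not be empty
}
```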

errors.As walks the wrap chain and assigns the first matching type to the target. The test asserts both that the error is a *ValidationError and that its fields have the right values.

A few patterns for writing thorough error tests:

  • Table-driven tests for input validation. List inputs and the expected error sentinel or type, then loop with t.Run for each row.
  • Inject failures by passing in test doubles (a fake database, a fake HTTP client) that return specific errors. Verify the function under test wraps them correctly.
  • Check the wrap chain, not the error message. Use errors.Is and errors.As, not string comparisons.
  • Test the no-error case too. A function that returns nil on success should be tested for that as much as for the failure cases.
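The first and last of these patterns can be combined in one table-driven test. The sketch below assumes a hypothetical validateCart function and two sentinels; a nil wantErr row covers the success case, which works because errors.Is(nil, nil) reports true. The main function is only there so the file compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
	"testing"
)

var (
	ErrEmptyCart   = errors.New("cart is empty")
	ErrBadQuantity = errors.New("quantity must be positive")
)

// validateCart is the hypothetical function under test.
func validateCart(quantities []int) error {
	if len(quantities) == 0 {
		return fmt.Errorf("validate cart: %w", ErrEmptyCart)
	}
	for i, q := range quantities {
		if q <= 0 {
			return fmt.Errorf("validate cart: item %d: %w", i, ErrBadQuantity)
		}
	}
	return nil
}

func TestValidateCart(t *testing.T) {
	tests := []struct {
		name    string
		input   []int
		wantErr error // nil means success is expected
	}{
		{"empty cart", nil, ErrEmptyCart},
		{"zero quantity", []int{2, 0}, ErrBadQuantity},
		{"valid cart", []int{1, 3}, nil},
	}
	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			err := validateCart(tc.input)
			if !errors.Is(err, tc.wantErr) {
				t.Errorf("validateCart(%v) = %v, want %v", tc.input, err, tc.wantErr)
			}
		})
	}
}

func main() {
	fmt.Println(validateCart([]int{2, 0})) // validate cart: item 1: quantity must be positive
}
```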

Common Anti-Patterns

A short tour of patterns that look reasonable until you read them carefully. Each one shows up regularly in code reviews.

What's wrong with this code?
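A sketch of the pattern, where findUser stands in for a hypothetical repo package lookup and displayName is the caller that wants a guest fallback:

```go
package main

import (
	"errors"
	"fmt"
)

// findUser stands in for a repository lookup that wraps its cause.
func findUser(id string) (string, error) {
	return "", fmt.Errorf("find user %s: %w", id, errors.New("not found"))
}

// displayName tries to fall back to "guest" by comparing message text.
func displayName(id string) (string, error) {
	name, err := findUser(id)
	if err != nil {
		if err.Error() == "not found" { // stops matching once the error is wrapped
			return "guest", nil
		}
		return "", err
	}
	return name, nil
}

func main() {
	name, err := displayName("u42")
	fmt.Printf("%q %v\n", name, err) // "" find user u42: not found
}
```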

The comparison err.Error() == "not found" is string-based, which means any change to the error message (a wrap, a typo fix, a translation) breaks the fallback silently. Compare against a sentinel instead:

Fix:
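A sketch of the sentinel-based version. ErrNotFound here stands in for a sentinel the hypothetical repo package would export, and findUser again simulates the lookup.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound stands in for an exported sentinel such as repo.ErrNotFound.
var ErrNotFound = errors.New("not found")

// findUser stands in for a repository lookup that wraps the sentinel.
func findUser(id string) (string, error) {
	return "", fmt.Errorf("find user %s: %w", id, ErrNotFound)
}

// displayName falls back to "guest" by comparing identity, not text.
func displayName(id string) (string, error) {
	name, err := findUser(id)
	if errors.Is(err, ErrNotFound) {
		return "guest", nil
	}
	if err != nil {
		return "", fmt.Errorf("display name: %w", err)
	}
	return name, nil
}

func main() {
	name, err := displayName("u42")
	fmt.Printf("%q %v\n", name, err) // "guest" <nil>
}
```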

errors.Is compares against an exported sentinel that the repo package documents as part of its API. The comparison survives wrapping and message edits.

What's wrong with this code?
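A sketch of code with these problems. The Item type and validateCart function are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

type Item struct {
	Code string
	Qty  int
}

// validateCart breaks the message style rules and hides which item failed.
func validateCart(items []Item) error {
	if len(items) == 0 {
		return errors.New("error: Cart is empty.")
	}
	for _, it := range items {
		if it.Qty <= 0 {
			return errors.New("error: Invalid quantity.")
		}
	}
	return nil
}

func main() {
	err := validateCart(nil)
	fmt.Println(fmt.Errorf("checkout: %w", err)) // checkout: error: Cart is empty.
}
```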

Three style violations in five lines: "error:" prefix is redundant, capital C mid-message will read wrong after wrapping, trailing period interferes with "caller: error" chaining. Plus, the validation case uses a generic errors.New instead of carrying any information about which item or which field. Cleaner:

Fix:
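A sketch of a cleaner version, with a hypothetical ErrEmptyCart sentinel and InvalidQuantityError type:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrEmptyCart is a sentinel callers can match with errors.Is.
var ErrEmptyCart = errors.New("cart is empty")

// InvalidQuantityError carries the item code and the rejected quantity.
type InvalidQuantityError struct {
	Code string
	Qty  int
}

func (e *InvalidQuantityError) Error() string {
	return fmt.Sprintf("invalid quantity %d for item %s", e.Qty, e.Code)
}

type Item struct {
	Code string
	Qty  int
}

// validateCart returns errors that are lowercase, unpunctuated, and detectable.
func validateCart(items []Item) error {
	if len(items) == 0 {
		return ErrEmptyCart
	}
	for _, it := range items {
		if it.Qty <= 0 {
			return &InvalidQuantityError{Code: it.Code, Qty: it.Qty}
		}
	}
	return nil
}

func main() {
	err := validateCart([]Item{{Code: "PEN-01", Qty: 0}})
	fmt.Println(fmt.Errorf("checkout: %w", err)) // checkout: invalid quantity 0 for item PEN-01
}
```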

The empty-cart case is a sentinel callers can match. The invalid-quantity case is a custom type that carries the code and quantity so callers can build a useful response. Both messages start lowercase and end without punctuation.

What's wrong with this code?
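A sketch of the pattern. The safeSaveOrder name, the Order type, and the in-memory saved slice are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

type Order struct {
	ID    string
	Total float64
}

// saved is a hypothetical in-memory store.
var saved []Order

// safeSaveOrder panics on bad input, then recovers and reports success.
func safeSaveOrder(o Order) error {
	defer func() {
		if r := recover(); r != nil {
			log.Printf("save order: recovered: %v", r)
		}
	}()
	if o.Total <= 0 {
		panic(errors.New("order total must be positive"))
	}
	saved = append(saved, o)
	return nil
}

func main() {
	err := safeSaveOrder(Order{ID: "ORD-7", Total: 0})
	fmt.Println(err, len(saved)) // <nil> 0
}
```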

Two problems. First, panic(err) is used for an ordinary validation failure, which is exactly the case error returns exist for. Second, recover is being used inside the function as a substitute for normal error handling, which silently turns the panic into a logged message and a nil return. Callers see a successful order that didn't actually save. The fix is to remove both the panic and the recover:

Fix:
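A sketch of the version with neither panic nor recover. The saveOrder name, the ErrInvalidTotal sentinel, and the duplicate-ID check standing in for a storage failure are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

type Order struct {
	ID    string
	Total float64
}

var ErrInvalidTotal = errors.New("order total must be positive")

// saved is a hypothetical in-memory store keyed by order ID.
var saved = map[string]Order{}

// saveOrder reports both validation and storage failures as errors.
func saveOrder(o Order) error {
	if o.Total <= 0 {
		return fmt.Errorf("save order %s: %w", o.ID, ErrInvalidTotal)
	}
	if _, dup := saved[o.ID]; dup {
		return fmt.Errorf("save order %s: already exists", o.ID)
	}
	saved[o.ID] = o
	return nil
}

func main() {
	fmt.Println(saveOrder(Order{ID: "ORD-7", Total: 0}))
	// save order ORD-7: order total must be positive
}
```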

Both failure modes return errors. The caller decides what to do, and the function's name no longer hides what's happening.

What's wrong with this code?
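A sketch of the pattern, with a hypothetical Config type and loadConfig function:

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

type Config struct {
	Port int `json:"port"`
}

// loadConfig hides both the read failure and the parse failure behind nil.
func loadConfig(path string) *Config {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil
	}
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil
	}
	return &cfg
}

func main() {
	cfg := loadConfig("no-such-dir/app.json")
	fmt.Println(cfg == nil) // true, with no hint as to why
}
```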

The function swallows both errors and returns a possibly-nil *Config, which the caller has no way to tell from a "loaded successfully but the file was empty" case. The signature should admit it can fail:

Fix:
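A sketch of the corrected signature. Config, loadConfig, and the isMissing helper are hypothetical names.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

type Config struct {
	Port int `json:"port"`
}

// loadConfig admits it can fail and wraps each failure with context.
func loadConfig(path string) (*Config, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("load config %s: %w", path, err)
	}
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("load config %s: parse: %w", path, err)
	}
	return &cfg, nil
}

// isMissing shows a caller detecting one specific failure mode.
func isMissing(err error) bool {
	return errors.Is(err, os.ErrNotExist)
}

func main() {
	_, err := loadConfig("no-such-dir/app.json")
	fmt.Println(isMissing(err)) // true
}
```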

The signature returns (*Config, error). Callers see the failure and decide how to handle it. The errors are wrapped, so callers can detect specific failure modes (os.ErrNotExist for a missing file, *json.SyntaxError for malformed JSON) if they want.