We've all been there. It's 2 AM, production is down, and you're staring at a log that says:
Error: Something went wrong.
Thanks. Very helpful.
Bad error messages are one of the most insidious forms of technical debt. They don't break your tests, they don't show up in code reviews as obviously wrong, and they seem fine—until they're the only thing standing between you and understanding why your system is on fire.
The Anatomy of a Useless Error
Most bad error messages fall into a few categories:
The Optimist: Error: Failed to process request. Which request? What kind of failure? Was it a timeout, a validation issue, a null pointer? The optimist assumes you'll figure it out.
The Cryptic Oracle: Error: E_INVAL_0x3F. Somewhere, there's a document explaining what 0x3F means. You don't have it.
The Blame Shifter: Error: Unexpected error occurred. Of course it was unexpected. If you expected it, you would have handled it.
The Oversharer: A 47-line stack trace with no context about what the application was actually trying to do when things went sideways.
What Makes an Error Message Actually Useful?
A good error message answers three questions:
- What happened? Not "something went wrong," but specifically what operation failed.
- Why did it happen? What condition or input caused the failure?
- What can be done about it? Is this recoverable? What should the user or operator try next?
Here's the difference in practice:
python
# Bad
raise Exception("Database error")

# Better
raise DatabaseConnectionError(
    f"Failed to connect to PostgreSQL at {host}:{port}. "
    f"Connection timed out after {timeout_seconds}s. "
    "Check that the database server is running and accessible."
)
The second version tells you where it was trying to connect, how long it waited, and what to check first. That's the difference between a five-minute fix and an hour of digging through logs.
Errors Are Part of Your API
Here's a mental shift that helped me: treat error messages as part of your public interface. If you're writing a library, your errors are documentation. If you're writing a service, your errors are customer support.
This means:
Be consistent. If your validation errors include the field name in one place, they should include it everywhere. Establish patterns and stick to them.
Include identifiers. When an operation fails on a specific entity, include the ID. "Order not found" is frustrating. "Order abc-123 not found" is actionable—you can search logs, check databases, and verify the ID was correct.
Distinguish error types. There's a big difference between "this failed because of a bug" and "this failed because of invalid input." Your error handling should make this clear, both for users and for monitoring systems.
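Here's a minimal sketch of how those three habits might look in Python. The class names (AppError, ValidationError, NotFoundError, InternalError) and the severity helper are hypothetical, invented for illustration; the point is only that each error carries its identifier and that a single attribute separates "the caller got it wrong" from "we got it wrong."

```python
class AppError(Exception):
    """Base class for every error this (hypothetical) application raises."""

    # True when the caller caused the failure (bad input, missing entity).
    # False when the failure points at a bug or an infrastructure problem.
    caller_fault = False


class ValidationError(AppError):
    caller_fault = True

    def __init__(self, field, reason):
        self.field = field
        self.reason = reason
        # Every validation error names the field, in the same format.
        super().__init__(f"Invalid value for field '{field}': {reason}")


class NotFoundError(AppError):
    caller_fault = True

    def __init__(self, entity, entity_id):
        self.entity = entity
        self.entity_id = entity_id
        # Always include the identifier so it can be searched for in logs.
        super().__init__(f"{entity} {entity_id} not found")


class InternalError(AppError):
    """A bug or infrastructure failure; someone should be alerted."""


def severity(error):
    """Map an error to a log level name that monitoring can alert on."""
    if isinstance(error, AppError) and error.caller_fault:
        return "warning"
    return "error"


print(NotFoundError("Order", "abc-123"))
print(severity(ValidationError("email", "missing '@'")))
print(severity(InternalError("payment worker crashed")))
```

With something like this in place, a monitoring rule can page on "error" and merely count "warning", without parsing message strings.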
The Context Problem
One of the hardest things about error messages is that the code throwing the error often doesn't have the context to make a good message. A low-level database function knows it got a connection refused, but it doesn't know that you were trying to save a user's payment information.
The solution is to catch and wrap errors as they propagate up the stack:
go
user, err := db.GetUser(userID)
if err != nil {
    return fmt.Errorf("loading user %s for permission check: %w", userID, err)
}
Each layer adds its context. By the time the error reaches your logging or response layer, you have a chain: "loading user abc-123 for permission check: connecting to database: dial tcp 10.0.0.5:5432: i/o timeout."
Now you know what you were doing (permission check), what entity was involved (user abc-123), and what went wrong at the infrastructure level (TCP timeout to your database).
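The same idea carries over to Python, where `raise ... from` records the underlying exception in `__cause__`. The sketch below is illustrative only: get_user, check_permission, UserLoadError, and error_chain are hypothetical names, and get_user is a stand-in that always fails so the wrapping is visible.

```python
class UserLoadError(Exception):
    """Hypothetical error raised when a user record cannot be loaded."""


def get_user(user_id):
    # Stand-in for a real database call; it always times out here.
    raise TimeoutError("connection to 10.0.0.5:5432 timed out")


def check_permission(user_id):
    try:
        return get_user(user_id)
    except TimeoutError as exc:
        # Add this layer's context and keep the original error as the cause.
        raise UserLoadError(
            f"loading user {user_id} for permission check"
        ) from exc


def error_chain(exc):
    """Join an exception and its explicit causes into one message."""
    parts = []
    while exc is not None:
        parts.append(str(exc))
        exc = exc.__cause__
    return ": ".join(parts)


try:
    check_permission("abc-123")
except UserLoadError as exc:
    message = error_chain(exc)
    print(message)
```

Python's default traceback already prints the cause under "The above exception was the direct cause of the following exception", so a helper like error_chain is mainly useful when you want the whole chain on a single log line.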
Actionable Suggestions
Where possible, tell people what to do next. This is especially important for user-facing errors:
Unable to upload file: The file exceeds the 10 MB size limit.
Try compressing the image or uploading a smaller version.
Even for internal/operational errors, suggestions help:
Failed to acquire lock on resource X after 30s.
This may indicate a deadlock or a long-running transaction.
Check active transactions with: SELECT * FROM pg_stat_activity;
You're encoding your debugging knowledge into the system itself.
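One way to make that habit stick is to give your errors a dedicated slot for the suggestion, so it's a deliberate field rather than something squeezed into the message. This is a rough sketch under my own assumptions: OperationalError, its hint argument, and acquire_lock are hypothetical names, and acquire_lock always fails so the output is visible.

```python
class OperationalError(Exception):
    """Hypothetical error that carries an optional next-step hint."""

    def __init__(self, message, hint=None):
        super().__init__(message)
        self.hint = hint

    def __str__(self):
        base = super().__str__()
        # Append the hint only when one was provided.
        return f"{base}\nHint: {self.hint}" if self.hint else base


def acquire_lock(resource, timeout_seconds):
    # Stand-in for a real lock attempt; it always gives up here.
    raise OperationalError(
        f"Failed to acquire lock on resource {resource} after {timeout_seconds}s.",
        hint=(
            "This may indicate a deadlock or a long-running transaction. "
            "Check active transactions with: SELECT * FROM pg_stat_activity;"
        ),
    )


try:
    acquire_lock("X", 30)
except OperationalError as exc:
    lock_error = exc
    print(exc)
```

Keeping the hint as a separate attribute also means a user-facing layer can choose to drop it, while the logs keep it.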
Logging vs. User-Facing Messages
Not every detail belongs in front of users. Internal errors should be logged with full technical detail, but the user-facing message should be helpful without exposing internals:
python
try:
    result = process_payment(order)
except PaymentGatewayError as e:
    # Full technical detail, including the traceback, goes to the logs.
    logger.error(
        "Payment failed for order %s: %s",
        order.id,
        e,
        exc_info=True,
    )
    # The user sees only a safe message plus a reference for support.
    raise UserFacingError(
        "We couldn't process your payment. Please check your card details "
        "and try again, or contact support if the problem persists. "
        f"Reference: {order.id}"
    ) from e
The logs get everything. The user gets something helpful and a reference ID they can provide to support.
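The split works best when one place at the edge of the application decides what the user sees. Below is a small sketch of such a boundary, assuming a UserFacingError class whose message is always safe to display; the class definition, to_user_message, and the fallback wording are my own illustrative choices, not part of any framework.

```python
import logging

logger = logging.getLogger("app")


class UserFacingError(Exception):
    """Hypothetical error whose message is safe to show to end users."""


FALLBACK_MESSAGE = "We hit an unexpected problem and have been notified."


def to_user_message(exc, reference):
    """Return the text to display; log anything that is not user-safe."""
    if isinstance(exc, UserFacingError):
        return f"{exc} Reference: {reference}"
    # Unknown errors may contain hostnames, SQL, or paths: log, don't show.
    logger.error("Unhandled error (reference %s)", reference, exc_info=exc)
    return f"{FALLBACK_MESSAGE} Reference: {reference}"


safe = to_user_message(
    UserFacingError("We couldn't process your payment."), "abc-123"
)
leaky = to_user_message(
    RuntimeError("connect to 10.0.0.5:5432 failed"), "abc-124"
)
print(safe)
print(leaky)
```

The fallback is deliberately vague for the user, which is fine here because the reference ties it to a log entry that has the full detail.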
The Lazy Developer's Argument
"This takes more time," you might say. It does, in the moment. But consider:
Every vague error message is a future interruption. It's a Slack message asking "what does this mean?" It's thirty minutes of log diving that could have been thirty seconds. It's a customer support ticket that didn't need to exist.
Good error messages are a gift to your future self and everyone who maintains the code after you. They're documentation that shows up exactly when it's needed.
Start Small
You don't have to refactor every error in your codebase tomorrow. But you can:
- Next time you write a new error, make it a good one.
- Next time you're debugging a bad error, fix it while you're there.
- When you review code, push back gently on vague errors.
Over time, the quality compounds. Your logs become more useful. Your on-call rotations become less painful. Your users become less frustrated.
And maybe, just maybe, the next 2 AM incident will be a five-minute fix instead of an all-nighter.
The best error message is one that helps someone solve their problem without having to ask for help. Write errors like you're leaving notes for a colleague who'll be debugging this at 2 AM. Because someday, that colleague might be you.




