Our transactional email said it sent for weeks — and delivered nothing
John C. Thomas
Founder, BlueWave Projects
Here is a failure mode that is almost worse than an outage: software that reports success while doing nothing.
For a few weeks, one of our products sent transactional email that never arrived. Welcome emails, billing receipts, form notifications — the API returned success every single time, the logs were clean, and not one message reached an inbox. We only caught it because someone asked, "hey, did that test email come through?" It had not. None of them had.
This is the post-mortem, because the root cause is a trap any team on a modern email API can fall into.
The symptom
The endpoint that sends a notification returned HTTP 200. The send helper wrapped the provider call in a try/catch, did not throw, and returned an ok response. Everything downstream saw success — the dashboard logged the event, the user got a friendly "message received" screen, and the email went nowhere.
A silent failure is worse than a loud one. A 500 gets investigated within the hour. A cheerful 200 that does nothing can run for weeks.
The root cause: an unverified sender domain
We send through Resend, but the lesson generalizes to SES, Postmark, SendGrid, or any provider that verifies sending domains. To send from an address, the provider has to have verified that you own its domain. We had verified a subdomain dedicated to sending, mail.ourdomain.com. But an environment variable pointed the from address at an address on the bare apex, ourdomain.com.
The provider had no verification record for the apex, so it rejected every send with a 403: the ourdomain.com domain is not verified.
Here is the part that turned a one-line config typo into a weeks-long silent outage: the provider returned the rejection as a value, not an exception. Modern SDKs increasingly hand back an object shaped like data-or-error instead of throwing. Our code awaited the send and returned success without ever inspecting the error field. A thrown error would have hit our catch and shown up in the logs. A returned error sailed straight through.
So: unverified domain, provider returns an error object, code ignores the error object, success response, silent failure across every email path in the product.
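A minimal sketch of that failure shape, for illustration only. The types, the fakeProvider stand-in, the sendNaively helper, and the hello@ local part are all hypothetical; they mimic a data-or-error style SDK and are not the real Resend types or our production code.

```typescript
// Hypothetical shapes modeled on a data-or-error SDK; not a real SDK's types.
type SendError = { statusCode: number; message: string };
type SendResult = { data: { id: string } | null; error: SendError | null };
type Message = { from: string; to: string; subject: string };
type Provider = { send(msg: Message): Promise<SendResult> };

// Stand-in provider: rejects any sender outside the verified domain, and
// reports the rejection as a returned value instead of throwing.
function fakeProvider(verifiedDomain: string): Provider {
  return {
    async send(msg: Message): Promise<SendResult> {
      const domain = msg.from.split("@")[1];
      if (domain !== verifiedDomain) {
        return {
          data: null,
          error: { statusCode: 403, message: `the ${domain} domain is not verified` },
        };
      }
      return { data: { id: "msg_123" }, error: null };
    },
  };
}

// The buggy pattern: the catch only sees thrown errors, so a returned
// error value passes straight through and the caller is told "ok".
async function sendNaively(provider: Provider, from: string): Promise<{ ok: boolean }> {
  try {
    await provider.send({ from, to: "user@example.com", subject: "Welcome" });
    return { ok: true }; // never inspects the returned error field
  } catch (err) {
    console.error("send failed", err);
    return { ok: false };
  }
}
```

With a provider verified only for mail.ourdomain.com, sendNaively called with hello@ourdomain.com resolves to ok: true even though the provider answered with a 403.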
Why "it worked in dev" did not save us
In development, sends are a no-op — no API key, or a dry-run flag. Nobody ever saw a real rejection locally. The first real send happened in production, where the failure was invisible by construction. The gap between "the code ran" and "the mail arrived" was exactly the gap we had no test for.
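One way to close that gap without sending real mail is a test that feeds the send helper a provider stub which rejects by returning an error value. The sketch below is an assumption about how such a check could look; rejectingProvider, SendHelper, and surfacesProviderRejection are hypothetical names, and the types only imitate a data-or-error SDK.

```typescript
// Hypothetical shapes modeled on a data-or-error SDK; not a real SDK's types.
type SendResult = {
  data: { id: string } | null;
  error: { statusCode: number; message: string } | null;
};
type Message = { from: string; to: string; subject: string };
type Provider = { send(msg: Message): Promise<SendResult> };

// Any send helper under test: takes a provider and a recipient, reports an outcome.
type SendHelper = (provider: Provider, to: string) => Promise<{ ok: boolean }>;

// Stub that always rejects the way the real provider did: with a returned
// error value, not a thrown exception.
const rejectingProvider: Provider = {
  async send(): Promise<SendResult> {
    return { data: null, error: { statusCode: 403, message: "domain is not verified" } };
  },
};

// Resolves to true only if the helper refuses to report success on a
// rejected send. Returning ok: false and throwing both count as surfacing it.
async function surfacesProviderRejection(helper: SendHelper): Promise<boolean> {
  try {
    const outcome = await helper(rejectingProvider, "user@example.com");
    return outcome.ok === false;
  } catch {
    return true;
  }
}
```

A helper that ignores the error field makes this resolve to false, which is the signal a test suite would assert against.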
The fix, in two parts
First, point the sender at the verified domain. One environment change moved the from address to the verified subdomain, mail.ourdomain.com. We proved it with a direct API call before trusting anything: sending from the apex returned a 403, sending from the verified subdomain returned a real message id. That five-minute check is the entire difference between "I think it works" and "it works."
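That direct check can be scripted. The sketch below is provider-agnostic and rests on assumptions: probeSender is a hypothetical name, the endpoint and API key are passed in by the caller, and the bearer-token header and JSON body fields are modeled on a typical JSON email API, so confirm them against your provider's reference before relying on it.

```typescript
// Minimal HTTP surface so the sketch does not depend on any runtime's fetch typings.
type HttpResponse = { status: number; json(): Promise<unknown> };
type HttpPost = (
  url: string,
  init: { method: "POST"; headers: Record<string, string>; body: string },
) => Promise<HttpResponse>;

type ProbeResult = { from: string; status: number; accepted: boolean; body: unknown };

// Sends one message from `from` and reports exactly what the provider said.
// Header and body field names are assumptions; check your provider's docs.
async function probeSender(
  post: HttpPost,
  endpoint: string,
  apiKey: string,
  from: string,
  to: string,
): Promise<ProbeResult> {
  const res = await post(endpoint, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ from, to, subject: "Sender verification probe", text: "probe" }),
  });
  const body = await res.json();
  // Judge by the HTTP status of the provider's answer, not by "the call returned".
  const accepted = res.status >= 200 && res.status < 300;
  return { from, status: res.status, accepted, body };
}
```

In a runtime with a global fetch, passing fetch as the post argument should work. Running the probe once with the apex address and once with the subdomain address gives the before-and-after comparison described above.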
Second, stop ignoring the provider's error value. The send helper now inspects the returned error and treats it the same as a thrown one — it logs, surfaces, and on the paths that matter, fails loudly instead of returning a cheerful 200. We also hardened the code's default from address, which used to point at a domain we did not even control; now the fallback is the verified sending domain, so a missing variable degrades to "still works" instead of "silently 403s."
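A sketch of what both halves of that change might look like, under stated assumptions. The names resolveFrom, sendEmail, EmailSendError, and the notifications@ local part are hypothetical, and the types imitate a data-or-error SDK; this is not our production helper.

```typescript
// Hypothetical shapes modeled on a data-or-error SDK; not a real SDK's types.
type SendError = { statusCode: number; message: string };
type SendResult = { data: { id: string } | null; error: SendError | null };
type Message = { from: string; to: string; subject: string };
type Provider = { send(msg: Message): Promise<SendResult> };

// Assumed verified sending domain; the local part is made up for illustration.
const VERIFIED_DOMAIN = "mail.ourdomain.com";
const FALLBACK_FROM = `notifications@${VERIFIED_DOMAIN}`;

// A missing or blank variable falls back to an address on the verified domain.
function resolveFrom(configured: string | undefined): string {
  const value = configured?.trim();
  return value ? value : FALLBACK_FROM;
}

class EmailSendError extends Error {
  readonly statusCode: number;
  constructor(statusCode: number, message: string) {
    super(message);
    this.name = "EmailSendError";
    this.statusCode = statusCode;
  }
}

// Treats a returned error exactly like a thrown one: log it, then throw.
// Resolves to the provider's message id only when the send was accepted.
async function sendEmail(provider: Provider, msg: Message): Promise<string> {
  const { data, error } = await provider.send(msg);
  if (error) {
    console.error("email send rejected", {
      statusCode: error.statusCode,
      message: error.message,
      from: msg.from,
    });
    throw new EmailSendError(error.statusCode, error.message);
  }
  if (!data) {
    // Neither data nor error: refuse to call that a success.
    throw new EmailSendError(0, "provider returned neither data nor error");
  }
  return data.id;
}
```

Throwing is one design choice among several. A path where email is best-effort could catch EmailSendError and continue, but that becomes an explicit decision at the call site rather than an accident of the SDK's return shape.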
Send from a subdomain on purpose
There is a second lesson hiding in the first. People reach for the clean apex address because it looks nicer in a from-line. For sending, a dedicated subdomain like mail.brand.com is the better practice, not a downgrade.
The apex is not a spam-filter win. Alignment, sender reputation, and engagement are what land you in the inbox. Send from the subdomain and keep the apex clean.
What I would tell another team
The bug was a one-line config mistake. The damage was multiplied by an API shape that fails by returning instead of throwing, and a code path that trusted the call instead of the outcome. We run a lot of small products on a small team, and the discipline that came out of this — verify the outcome, never the 200 — is now wired into every send path we ship.
If you want the kind of team that finds the silent failures before your customers do, [come say hello](https://bluewaveprojects.com/booking).