The badge generator that found a 4-week-old bug in production

I shipped a status badge generator on Thursday. The first thing it did was reveal that every status badge StatusPageBuddy had ever served had been broken for visitors since launch, almost four weeks.

Nobody had complained. Nobody had noticed. The bug was invisible by design.

This week's newsletter is about the bug, but more about the meta-lesson: the surface of your product that you yourself never load as a logged-out visitor is probably broken for logged-out visitors. Indie founders building auth-gated apps fall into this trap constantly. I just did.

What I actually shipped

A small public tool at /tools/badge-generator. You paste your status page slug into a box, the page generates a shields.io-style SVG badge that reflects your current status, and gives you Markdown, HTML, and bare-URL snippets to copy.

The point of the tool is distribution. Every embedded badge, whether on a GitHub README, a landing page footer, or a docs site, is a free brand impression and a backlink to your status page. Upptime grew to 17K stars on this loop, and the same loop is available to any indie founder who ships a tool with a public status page.

So I built the public-facing landing page. Wrote the copy, wired up the live preview, added the FAQ JSON-LD schema. Pushed. Page live. Tool works.

Then I clicked the preview badge URL in a private browser tab to see how it would look to a stranger.

The bug

The URL https://www.statuspagebuddy.com/api/badge/statuspagebuddy did not render a badge. It served an HTTP 307 redirect to /auth/login.

For a logged-in user, the redirect never fires, because their session cookie is present. The dashboard, where I had previously embedded the badge preview component, worked perfectly. I had been looking at green badges every day for a month.

For a logged-out user, which is every single visitor who would ever load an embedded badge on someone else's README, the request bounced to the login page. The browser, expecting an SVG, got back HTML. It rendered the broken-image icon.

Every badge StatusPageBuddy had ever served, in every embed, on every README and landing page, had rendered as a broken-image icon for every visitor. For nearly four weeks.

Where the bug lived

In one line of an allowlist. The middleware that wraps the app and enforces "logged-out users get bounced to login, except on these public routes" had this list:

/, /blog, /auth, /s/, /privacy, /cookies, /terms, /pricing,
/subscribe, /alternatives, /api/public

/api/badge was not in the list. So every request hit the auth check, found no session, and got the redirect.

The fix was two new entries in that array: /api/badge, plus /tools so my new landing page would load for logged-out visitors. A two-line diff that I shipped in the same commit as the badge generator itself.
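The failure mode can be reproduced in a few lines. The sketch below is a hypothetical reconstruction in Python, not the real middleware: the post only shows the list itself, so the function names and the prefix-matching rules (exact match for "/", path-segment match for everything else) are assumptions made for illustration.

```python
# Hypothetical reconstruction of the allowlist check. The post shows only the
# list, so the function names and matching rules below are assumptions.
PUBLIC_PREFIXES_BEFORE = [
    "/", "/blog", "/auth", "/s/", "/privacy", "/cookies", "/terms",
    "/pricing", "/subscribe", "/alternatives", "/api/public",
]

# The fix described in the post: two more entries.
PUBLIC_PREFIXES_AFTER = PUBLIC_PREFIXES_BEFORE + ["/api/badge", "/tools"]


def is_public(path, prefixes):
    """Return True if `path` falls under any allowlisted prefix."""
    for prefix in prefixes:
        if prefix == "/":
            # Root must match exactly, or every path would be public.
            if path == "/":
                return True
        elif prefix.endswith("/"):
            if path.startswith(prefix):
                return True
        elif path == prefix or path.startswith(prefix + "/"):
            return True
    return False


def respond(path, has_session, prefixes):
    """Mimic the middleware decision: pass through, or 307 to the login page."""
    if has_session or is_public(path, prefixes):
        return (200, path)
    return (307, "/auth/login")
```

With the original list, `respond("/api/badge/statuspagebuddy", False, PUBLIC_PREFIXES_BEFORE)` yields the 307 to /auth/login, while the same call with `has_session=True` passes through. That difference is exactly why the dashboard preview looked fine while every embed was broken.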

Why it stayed hidden

This is the part I want to think out loud about.

The bug had no error log. The middleware was doing exactly what it was written to do, which was to return a redirect. The redirect had a valid HTTP status code. From the server's perspective, it was healthy.

The bug had no user report. The user who would notice was a stranger visiting somebody else's GitHub README, where the broken-image icon would render. That stranger had no relationship with StatusPageBuddy, no email to write to, no incentive to investigate. They might assume "this person's status page is just down today" and move on. Most likely they didn't even notice; broken images on the web are background noise.

The bug had no instrumentation. The badge endpoint did not log how many requests it served or what their cache headers looked like. From the dashboard, I could see that I had registered users with status pages. I had no view of whether their embedded badges, on whatever sites they had embedded them on, were rendering.

The bug had no visibility in my own use of the product. I always loaded my dashboard as a logged-in user, where the badge preview worked. The "embedded in the wild" surface of my product was a surface I, the builder, never touched.

That is the meta-lesson. Whatever surface of your product you never load as the kind of user it was built for is probably broken in a way no monitoring will catch. For SaaS with auth, that surface is usually the unauthenticated one. For B2B with admin tools, it's the read-only-customer surface. For multi-tenant, it's the empty-state-new-org surface. The list goes on.

You will not find these bugs by looking at logs. You will find them by loading your own product the way the people you do not see load it.

The discipline that should follow

I am adding two habits to my weekly cycle.

The first is a private-window walkthrough of any new public surface, every time I ship one. Not just the page I built, but the URLs it references. The badge generator references the badge endpoint. The blog references the RSS feed. The alt pages reference the canonical URL. All of those should be loaded from a fresh browser session that has no cookies, no cache, no history of my product.

The second is to add at least one synthetic health check per public endpoint. I'm going to wire up a tiny cron that hits /api/badge/statuspagebuddy once an hour from a server that has never been logged in, and checks that the response is image/svg+xml with a 200 status. If it ever stops being that, I want a notification.
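A minimal sketch of such a check, using only the Python standard library. The function names and the print-based alert are placeholders, and the scheduling is left to whatever cron you already run. One detail matters: the check must not follow redirects, because a followed 307 ends at the login page, which answers 200 with HTML and would make a status-only check pass.

```python
import urllib.error
import urllib.request

BADGE_URL = "https://www.statuspagebuddy.com/api/badge/statuspagebuddy"


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so a 307 to the login page stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def is_healthy_badge(status, content_type):
    """A badge response is healthy only if it is a 200 carrying SVG."""
    media_type = content_type.split(";")[0].strip().lower()
    return status == 200 and media_type == "image/svg+xml"


def fetch_status_and_type(url, timeout=10.0):
    """Fetch `url` with no cookies and without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Content-Type", "")
    except urllib.error.HTTPError as err:
        # Unfollowed redirects and 4xx/5xx responses surface here.
        return err.code, err.headers.get("Content-Type", "")
    except OSError:
        # DNS failures, refused connections, timeouts: report as status 0.
        return 0, ""


def check_badge(url=BADGE_URL):
    """Run one check and emit a placeholder alert on failure."""
    status, content_type = fetch_status_and_type(url)
    ok = is_healthy_badge(status, content_type)
    if not ok:
        # Placeholder alert: swap in email, Slack, or whatever you use.
        print(f"ALERT: {url} returned {status} {content_type!r}")
    return ok
```

Calling `check_badge()` once an hour from a machine that has never held a session would have flagged the original bug on its first run, since the response was a 307 rather than a 200 with image/svg+xml.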

Neither of these is novel. Both are obvious once you have shipped a bug that lived in production for four weeks because you never loaded your own product the way a stranger does.

This week's numbers, briefly

  • GSC impressions (28-day): 95, up from a baseline of around 12/week before the four alternative pages shipped on 5/11. About 8× growth, concentrated on the alt pages.
  • GSC clicks (28-day): 4. First clicks the project has registered. CTR 4.2%, which is healthy for an average position of 35.
  • WAU: 0. Will reassess Sunday at the W5 retro. Status pages are a set-and-forget category, so I am not yet treating the zero as a crisis, but I am watching it.
  • Two new alt pages shipped 5/19: /alternatives/upptime (Hosted Upptime, zero GitHub Actions) and /alternatives/cachet (Hosted Cachet for the rest of us). Both with the signature-framework-name positioning that I am tracking as an SEO hypothesis.
  • One badge generator shipped 5/21: /tools/badge-generator. Free, no signup to preview.

What I want you to take from this

If you are an indie founder shipping a SaaS with auth, take twenty minutes this weekend and open every public URL of your product in a private browser window. Not just the landing page. The OG image endpoint, the sitemap, the RSS feed, the API endpoints that serve public data, the embeddable widgets you offer your customers.

You will find at least one thing that does not work the way you expect.

It will be a two-line fix.

It will have been broken for longer than you think.

See you Sunday.

— Hao