Support During an Outage: The 60-Minute Playbook
Your product is down. Customers are flooding every channel. Your support team is panicking. Here's exactly what to do in the first hour.
Your status page just turned red. The product is down. Within 5 minutes, your support inbox has 50 new messages. By 15 minutes, it's 200. Your team of 4 agents is looking at each other, wondering what to say.
This moment determines whether your customers remember the outage or remember how you handled it. Companies that communicate well during outages build trust. Companies that go silent destroy it.
Here's the 60-minute playbook.
Minute 0-5: Acknowledge Publicly
Before you do anything else, update your status page. "We're aware of an issue affecting [service]. We're investigating. Next update in 15 minutes."
That status page update can prevent 30 to 50% of incoming tickets. Customers who check your status page (and more do than you think) see that you know about the problem, so they don't submit a ticket.
If you don't have a status page, make one. Atlassian Statuspage, Instatus, or even a simple static page that you update manually will do. Having one and keeping it current is one of the highest-ROI investments you can make for outage communication.
Post on Twitter/X too. "We're experiencing an issue with [service]. We're investigating and will update here. Details: [status page link]." Social media reaches customers who wouldn't check a status page.
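One way to keep the status page and the social post consistent is to generate both from the same inputs. The sketch below is a minimal illustration, not tied to any status page tool: the function names, the `service` and `status_url` parameters, the example URL, and the 15-minute default are all assumptions made for the example. It also states the next update as a clock time rather than "in 15 minutes", on the assumption that an absolute time stays accurate for customers who read the notice late.

```python
from datetime import datetime, timedelta, timezone


def initial_status_notice(service: str, now: datetime, next_update_minutes: int = 15) -> str:
    """Build the first status page message with a concrete time for the next update."""
    next_update = now + timedelta(minutes=next_update_minutes)
    return (
        f"We're aware of an issue affecting {service}. We're investigating. "
        f"Next update by {next_update:%H:%M} UTC."
    )


def initial_social_post(service: str, status_url: str) -> str:
    """Build the matching social post that points back to the status page."""
    return (
        f"We're experiencing an issue with {service}. "
        f"We're investigating and will update here. Details: {status_url}"
    )


if __name__ == "__main__":
    # Hypothetical detection time and status page URL, for illustration only.
    detected_at = datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc)
    print(initial_status_notice("checkout", detected_at))
    print(initial_social_post("checkout", "https://status.example.com"))
```

Keeping both messages in one place means that whoever posts the first update only has to fill in the service name once.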
Minute 5-15: Set Up the Auto-Response
Every incoming support message during the outage should get an immediate auto-response with the current situation.
"We're currently experiencing an outage affecting [specific service]. Our engineering team is investigating. We'll update you as soon as we have more information. You don't need to submit a ticket. We'll reach out to everyone affected once the issue is resolved."
If you're using Supp, configure an outage-specific response for all incoming messages. The AI classifies each message, detects that it's about the current outage (intent: "service not working" or similar), and responds with the outage notice. This happens in under 5 seconds per message.
That auto-response serves two purposes: it acknowledges the customer's problem instantly, and it tells them they don't need to follow up. Without it, many customers will submit multiple tickets ("my first message didn't get a response, is anyone there?").
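If you don't have an AI classifier in place, the same idea can be roughed out with a keyword match. The sketch below is a stand-in under stated assumptions: the keyword list, `looks_outage_related`, and `auto_reply` are hypothetical names invented for this example, and they do not represent how Supp or any other tool classifies messages.

```python
import re

OUTAGE_NOTICE = (
    "We're currently experiencing an outage affecting {service}. "
    "Our engineering team is investigating. We'll update you as soon as we have "
    "more information. You don't need to submit a ticket. We'll reach out to "
    "everyone affected once the issue is resolved."
)

# Hypothetical keyword list standing in for a real intent classifier.
OUTAGE_PATTERN = re.compile(
    r"\b(down|outage|not working|not loading|unavailable)\b",
    re.IGNORECASE,
)


def looks_outage_related(message: str) -> bool:
    """Crude check: does the message mention a typical outage symptom?"""
    return OUTAGE_PATTERN.search(message) is not None


def auto_reply(message: str, service: str):
    """Return the outage notice for outage-related messages, otherwise None.

    None means the message should go to the normal queue instead.
    """
    if looks_outage_related(message):
        return OUTAGE_NOTICE.format(service=service)
    return None


if __name__ == "__main__":
    print(auto_reply("Is your site down? Nothing is loading.", "checkout"))
    print(auto_reply("Can I get a copy of last month's invoice?", "checkout"))
```

A keyword match will misfire on some messages, so treat it as a temporary measure and review anything it routes to the normal queue.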
Minute 15-30: Update With Specifics
By now, your engineering team should have preliminary information. Update the status page with whatever you know:
"We've identified the issue: [brief technical description in plain language]. Our team is implementing a fix. Estimated time to resolution: [estimate if you have one, or 'we'll update in 30 minutes']."
Be honest about what you don't know. "We haven't determined the root cause yet" is fine. Making up a timeline you can't meet is not.
The update doesn't have to be long. Two sentences. But it has to exist. Customers can tolerate outages. They can't tolerate silence.
Minute 30-45: Prioritize the Queue
Your inbox is full of outage-related messages. Don't try to respond to all of them individually. The auto-response handled acknowledgment. Now focus on the high-priority ones:
Customers reporting a different issue (not the outage). These exist. While everyone is focused on the outage, someone has a billing question. Don't ignore them. Filter for non-outage intents and handle them normally.
VIP or enterprise customers. If you have customers with SLA commitments, contact them proactively: "We're aware of the outage and working on a fix. I'll send you updates directly."
Customers who report that the outage is costing them money. "I'm losing sales because your checkout is down." These get a priority response and, if appropriate, a service credit commitment.
AI classification helps here. Supp differentiates between "service outage" intents and other intents. Your team can filter the queue to see only non-outage tickets, so nothing important gets buried under the outage flood.
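The triage order described above can be expressed as a simple sort. This is a minimal sketch: the `Ticket` fields and the `priority` function are hypothetical, the flags are assumed to come from whatever classifier and customer data you already have, and the relative order of the three priority groups is an assumption, since the right ranking depends on your contracts.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    customer: str
    text: str
    is_outage: bool               # set by your classifier (assumed to exist)
    has_sla: bool = False         # VIP or enterprise customer
    revenue_impact: bool = False  # customer says the outage is costing them money


def priority(ticket: Ticket) -> int:
    """Lower number means handle sooner."""
    if not ticket.is_outage:
        return 0  # unrelated issue that would otherwise get buried
    if ticket.has_sla:
        return 1  # contact proactively
    if ticket.revenue_impact:
        return 2  # priority response, possible service credit
    return 3      # already acknowledged by the auto-response


def triage(tickets):
    """Return tickets in handling order; ties keep their arrival order."""
    return sorted(tickets, key=priority)


if __name__ == "__main__":
    queue = [
        Ticket("a", "Your site is down", is_outage=True),
        Ticket("b", "Question about my invoice", is_outage=False),
        Ticket("c", "I'm losing sales", is_outage=True, revenue_impact=True),
        Ticket("d", "Dashboard unavailable", is_outage=True, has_sla=True),
    ]
    for t in triage(queue):
        print(priority(t), t.customer, t.text)
```

Because Python's sort is stable, tickets within the same group stay in the order they arrived.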
Minute 45-60: Resolve and Communicate
When the fix deploys, update the status page immediately: "The issue has been resolved. [Brief description of what happened and what we did]. We're monitoring to ensure stability."
Send a proactive email to all affected users (or all users, if you can't segment). "Earlier today, [product] experienced [X minutes] of downtime due to [cause]. The issue is now resolved. [What we're doing to prevent it in the future]."
That last part matters. "What we're doing to prevent it" shows accountability. "Adding monitoring for the affected service" or "increasing database capacity" tells customers you're not just patching but improving.
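The resolution email is easier to get right if the downtime figure is computed from the incident timestamps rather than estimated from memory. The sketch below assumes you have recorded a start and a resolution time; the function names, the example cause, and the prevention text are illustrative only.

```python
import math
from datetime import datetime, timezone


def downtime_minutes(started: datetime, resolved: datetime) -> int:
    """Whole minutes of downtime, rounded up so the email never understates it."""
    return math.ceil((resolved - started).total_seconds() / 60)


def resolution_email(product: str, started: datetime, resolved: datetime,
                     cause: str, prevention: str) -> str:
    """Fill in the resolution email template with the measured downtime."""
    minutes = downtime_minutes(started, resolved)
    return (
        f"Earlier today, {product} experienced {minutes} minutes of downtime "
        f"due to {cause}. The issue is now resolved. {prevention}"
    )


if __name__ == "__main__":
    # Hypothetical incident times and details.
    started = datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc)
    resolved = datetime(2024, 1, 1, 14, 47, tzinfo=timezone.utc)
    print(resolution_email(
        "Checkout", started, resolved,
        "a database capacity limit",
        "We're increasing database capacity and adding monitoring for the affected service.",
    ))
```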
After the Outage: The Post-Mortem
Within 48 hours, publish a brief public post-mortem. Not a PR statement. A genuine explanation:
Cover five things: what happened (in plain language), when it happened (a timeline with timestamps), how many customers were affected, what the root cause was, and what you're doing to prevent it from happening again.
This transparency builds trust. Customers who see a thoughtful post-mortem think "this team takes reliability seriously." Customers who get silence think "they're hiding something."
Statuspage has a built-in post-mortem feature. You can also just publish a blog post. The format doesn't matter. The honesty does.
The One Thing Most Companies Skip
After the outage is resolved, go back and respond to every support ticket that came in during the outage. Not a mass auto-close. A personal response:
"The issue you reported has been resolved. [Brief explanation]. I'm sorry for the disruption. If you're still experiencing any problems, reply and I'll look into it."
This takes 30 seconds per ticket with a template. For 200 tickets, that's about 100 minutes of work, or roughly 25 minutes each for a team of 4. The cost is small. The impact is large. Every customer who submitted a ticket gets personal confirmation that it's fixed. They feel seen.
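The effort estimate is straightforward arithmetic, and it is worth rerunning with your own numbers. The helper below is a small sketch with a hypothetical name; the inputs are the figures used in this article (200 tickets, 30 seconds each, a team of 4).

```python
def follow_up_effort(tickets: int, seconds_per_ticket: int, agents: int):
    """Return (total minutes of work, minutes per agent if split evenly)."""
    total_minutes = tickets * seconds_per_ticket / 60
    return total_minutes, total_minutes / agents


if __name__ == "__main__":
    total, per_agent = follow_up_effort(tickets=200, seconds_per_ticket=30, agents=4)
    print(f"Total: {total:.0f} minutes, per agent: {per_agent:.0f} minutes")
    # Total: 100 minutes, per agent: 25 minutes
```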
The alternative (mass-closing tickets without a response) tells customers their message went into a void. Some will contact support again to confirm the fix, generating unnecessary follow-up volume. Some will just assume you don't care.
The follow-up response is the difference between "they had an outage but handled it well" and "they had an outage and I never heard back."