Incident Response: The Difference Between a Managed Event and an Organisational Crisis
Every organisation will experience a significant security incident. The only variable is whether the response is rehearsed or improvised. This hub gives you the playbook — built from 25+ years of post-breach reviews across 50 countries.
I’ve been called into post-breach situations on four continents. The one thing they all share — regardless of industry, company size, or geography — is that the response was slower, more chaotic, and more expensive than it needed to be because the organisation had never actually practised it.
An IR plan that lives in a SharePoint folder and gets reviewed annually is not an IR capability. An IR capability is built through tabletop exercises, purple team operations, real runbooks that real people have read, communication plans that have been stress-tested, and retainer relationships with IR firms and legal counsel that are activated before you need them.
PwC’s incident research shows 54% of IR failures are process failures, not technical failures. The breach was contained. The recovery failed because of unclear roles, poor communication, and inadequate business continuity planning. Technology alone cannot fix this.
IBM Cost of a Data Breach Report 2025
Cisco Cybersecurity Readiness Index 2025
EU NIS2 Directive, GDPR Article 33
PwC Global Digital Trust Insights 2026
Preparation
Build your CIRT, write your playbooks, establish retainers, test your backups, run tabletops. Everything that happens before the incident determines how the incident goes.
Detection & Analysis
Identify, scope, and classify the incident. Preserve evidence. Activate the CIRT. Determine whether regulatory notification clocks have started running.
Contain, Eradicate & Recover
Isolate affected systems. Remove the threat. Restore from clean backups. Verify clean state before returning to production. Maintain business continuity throughout.
Post-Incident Activity
Root cause analysis. Lessons learned. Control improvements. Board reporting. Regulatory submissions. Update playbooks based on what you learned.
Incident Commander
Maintains overall situational awareness and makes key decisions under pressure. This is typically the CISO or a designated deputy. They do not do technical work during the incident — they command. Most IR failures happen when the CISO is heads-down on forensics instead of managing the response.
Technical Lead
Directs the forensic investigation, containment actions, and eradication. Needs deep technical skills and the ability to brief non-technical stakeholders clearly. Consider pre-engaging an external IR firm on retainer for complex incidents.
Communications Lead
Manages internal communications, media enquiries, customer notifications, and regulatory submissions. This role is activated immediately in any significant incident. Having pre-approved communication templates reduces response time dramatically.
Legal Counsel
Advises on notification obligations, evidence preservation, ransomware payment legality, and regulatory engagement. Legal must be in the CIRT, not called in after decisions are made. Get your legal team into a tabletop exercise before you need them in a real incident.
Business Continuity Lead
The most underrepresented role in most CIRTs. While the technical team contains the threat, this person is activating manual fallback procedures, coordinating with operations, and ensuring critical business functions continue. IR and BCP must be co-designed.
Executive Liaison
Manages the board and C-suite communication cadence. The board should never learn about a significant incident from a news article. This role maintains a regular briefing schedule and ensures leadership has the information they need to make business decisions without interfering with the technical response.
— Dr. Erdal Ozkaya, CISO & Author of 26 Cybersecurity Books
The most consistent failure point I see in post-breach reviews is the handoff from technical IR to business continuity. Security contains the ransomware — but nobody arranged alternative workflows for the factory floor that can’t access its production systems. The DR site is activated — but the manufacturing processes that depend on the compromised OT network have no manual fallback.
IR and BCP must be co-designed. Every IR playbook needs a BCP counterpart that answers: if this system is offline for 72 hours, how does the business keep operating? Who authorises manual processes? What’s the communication to customers and suppliers? What’s the maximum tolerable downtime? Get these answers before the incident, not during it.
Digital forensics isn’t just about understanding what happened — it’s about preserving evidence in a way that supports potential legal action, insurance claims, and regulatory investigations. Evidence collected without maintaining chain of custody may be inadmissible. Systems powered down without capturing volatile memory lose critical forensic artefacts forever.
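One common way to support the chain of custody described above is to hash each artefact at acquisition and log who handled it and when. The following is a minimal sketch under stated assumptions, not a forensic tool: the record fields (`handler`, `action`) and the function names are hypothetical, and a real process would also cover write-blocking, sealed storage, and signed logs.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large disk or memory images need not fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(path: Path, handler: str, action: str) -> dict:
    """Build one custody record: who did what to which artefact, and when.

    The field names are illustrative assumptions, not a legal standard.
    """
    return {
        "artefact": path.name,
        "sha256": sha256_of_file(path),
        "handler": handler,
        "action": action,
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
    }


def is_unmodified(path: Path, entry: dict) -> bool:
    """Re-hash the artefact and compare it with the digest recorded earlier."""
    return sha256_of_file(path) == entry["sha256"]
```

Calling `custody_entry` at every handoff and `is_unmodified` before analysis gives a simple, checkable trail showing the artefact has not changed since it was collected.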
Cloud forensics adds complexity: evidence may be held by a third-party provider with limited retention windows, platform-specific collection methods, and jurisdictional considerations. Enable cloud audit logging now — you cannot go back and collect logs from before an incident if logging wasn’t active. This is a preparation task, not a response task.
Digital Forensics & IR (DFIR): A CISO’s Guide
How to investigate breaches properly, preserve evidence for legal proceedings, and build a DFIR capability that supports both recovery and accountability.
Building a Cyber Incident Response Team
Roles, responsibilities, retainers, and the organisational politics of building a CIRT that actually functions under pressure — not just on the org chart.
IR Planning for Business Continuity
The integration most organisations miss — connecting your IR plan to BCP so the business keeps running while security contains the threat.
Free IR Book
Incident Response for Business Continuity — co-authored with Binalyze. Practical IR planning that connects to operational survival.
CISO Toolkit
IR plan templates, CIRT role definitions, communication templates, and post-incident report frameworks.
ISO 27001 Toolkit
Incident management controls aligned to ISO 27001:2022 — the framework foundation for your IR programme.
Sentinels Talk Show
IR war stories, lessons learned, and expert conversations with practitioners who’ve managed major incidents.
Book Dr. Ozkaya
IR tabletop exercises, CIRT building workshops, and board-level crisis simulation for executive teams.
What your board needs to hear about incident response
Three talking points, one metric, one question. Screenshot this for your next board prep.
The board’s job during an incident is not to manage it. It is to back the people who are. Pre-agree decision authority before you need it — at 3 a.m. is not the time to discover who can authorise a public statement.
Tabletop exercises that everyone passes are theatre. A real exercise should produce uncomfortable findings — usually about communication, legal escalation, and who actually has authority to take systems offline.
Most material damage in a serious incident comes from the 72 hours after detection, not the breach itself. The wrong public statement, wrong regulator filing, or wrong customer notification creates the lasting harm.
Your IR Plan Has Gaps. Find Them Before the Attacker Does.
Most organisations discover their IR gaps during an actual incident — when the cost of finding them is at its highest. I run IR tabletop exercises and capability assessments that expose the gaps in roles, process, and communication before they matter under pressure.
Book a Tabletop Exercise →
Incident Response FAQ — Honest Answers to the Questions CISOs Actually Ask
What are the key components of an incident response plan, and what does a real one look like in practice?
A real IR plan has six things — and most “plans” I review are missing at least three of them. (1) A named incident commander with explicit authority to spend money and shut systems down without asking permission. (2) A pre-agreed severity matrix so the team isn’t debating “is this a P1?” at 3 a.m. (3) Communications templates — internal, customer, regulator, board — drafted in calm conditions, not under fire. (4) Vendor and legal contact lists with after-hours numbers, retested every quarter. (5) Containment playbooks for your top five most likely scenarios (ransomware, BEC, insider, cloud account takeover, supply chain). (6) A post-incident review process that produces actual changes, not theatre. NIST 800-61 covers the structural lifecycle — preparation, detection, containment, eradication, recovery, lessons learned — and that framework is sound. The failure mode isn’t framework choice; it’s that the plan exists as a PDF nobody has tested.
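The pre-agreed severity matrix in point (2) can be as simple as a lookup table that the on-call responder reads rather than debates. This is an illustrative sketch only: the two axes (impact and scope), the level names, and the P1 to P4 assignments are assumptions to be replaced with whatever your organisation agrees in calm conditions.

```python
from enum import IntEnum


class Impact(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class Scope(IntEnum):
    SINGLE_HOST = 1
    BUSINESS_UNIT = 2
    ENTERPRISE = 3


# Hypothetical matrix: every (impact, scope) pair maps to one agreed priority.
SEVERITY_MATRIX = {
    (Impact.LOW, Scope.SINGLE_HOST): "P4",
    (Impact.LOW, Scope.BUSINESS_UNIT): "P3",
    (Impact.LOW, Scope.ENTERPRISE): "P3",
    (Impact.MEDIUM, Scope.SINGLE_HOST): "P3",
    (Impact.MEDIUM, Scope.BUSINESS_UNIT): "P2",
    (Impact.MEDIUM, Scope.ENTERPRISE): "P2",
    (Impact.HIGH, Scope.SINGLE_HOST): "P2",
    (Impact.HIGH, Scope.BUSINESS_UNIT): "P1",
    (Impact.HIGH, Scope.ENTERPRISE): "P1",
}


def classify(impact: Impact, scope: Scope) -> str:
    """Return the pre-agreed priority; no judgement call is needed at 3 a.m."""
    return SEVERITY_MATRIX[(impact, scope)]
```

Because every combination is enumerated up front, a missing entry fails loudly with a `KeyError` during testing instead of becoming an argument during an incident.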
How does incident response in Azure or Microsoft 365 differ from traditional on-premises IR?
The biggest shift is that your forensic surface is partly someone else’s machine. In on-prem IR, you can image the box and walk away with everything. In Azure/M365, your evidence lives in Microsoft Defender XDR, Sentinel, Entra ID sign-in logs, and the Purview unified audit log, and retention windows vary by licence tier. Microsoft has raised the default retention for Audit (Standard), the tier included with E3, from 90 to 180 days; E5 includes Audit (Premium), which keeps records for key workloads for one year. Check your own tenant’s retention policy rather than assuming either figure. That delta matters when an attacker has been resident for eight months. The second shift is that identity is the new perimeter: most cloud incidents I’ve worked began with token theft or a consent phishing attack, not a traditional malware payload. Your IR plan needs Entra ID conditional access reset procedures, OAuth app review, and the muscle memory to revoke session tokens fast. Defender XDR’s auto-investigation is genuinely good, but only if you’ve tuned it before the incident, not during.
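The retention point above is simple date arithmetic: any attacker activity older than the retention window is gone by the time you detect it. The sketch below makes that gap explicit. The function names and the example dates are hypothetical, and the retention value is a parameter you should set from your own tenant’s configuration.

```python
from datetime import date, timedelta


def earliest_recoverable(detected_on: date, retention_days: int) -> date:
    """Oldest date for which audit records still exist at detection time."""
    return detected_on - timedelta(days=retention_days)


def visibility_gap_days(
    intrusion_start: date, detected_on: date, retention_days: int
) -> int:
    """Days of attacker activity that fall outside the retention window."""
    earliest = earliest_recoverable(detected_on, retention_days)
    return max(0, (earliest - intrusion_start).days)


# Illustrative scenario: an intrusion that ran for roughly eight months.
intrusion_start = date(2025, 1, 1)
detected_on = date(2025, 9, 1)
for window in (90, 180, 365):
    gap = visibility_gap_days(intrusion_start, detected_on, window)
    print(f"{window}-day retention: {gap} days of activity unrecoverable")
```

Running the same calculation against your actual retention settings during preparation shows how far back an investigation could reach before you ever need it to.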
What incident response leadership model actually works during a real breach?
Separate the roles. Incident Commander runs the response. CISO owns the strategic narrative and the board. Legal owns regulatory disclosure decisions. Communications owns the message to customers and press. Putting all four hats on one person — usually the CISO — is the most common organisational failure I see, and it’s why decisions get delayed at the worst moments. The Incident Commander shouldn’t be the most senior person; they should be the most operationally calm one. They need authority delegated in writing, not implied. During a breach, you do not have time to escalate every containment decision. Pre-agree the dollar threshold and risk threshold at which the IC can act unilaterally versus escalate. If you don’t, your $50K decision becomes a 4-hour debate while the attacker exfils data.
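The pre-agreed dollar and risk thresholds described above can be written down as an explicit rule, so the Incident Commander knows immediately whether a decision is theirs to make. This is a sketch under assumptions: the 1 to 5 risk scale, the class names, and the example figures are hypothetical and stand in for whatever your written delegation says.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegatedAuthority:
    """Limits delegated to the Incident Commander in writing."""
    max_spend_usd: int
    max_risk_level: int  # hypothetical scale: 1 (low) to 5 (severe)


@dataclass(frozen=True)
class ContainmentDecision:
    description: str
    estimated_cost_usd: int
    risk_level: int


def can_act_unilaterally(
    authority: DelegatedAuthority, decision: ContainmentDecision
) -> bool:
    """True only when the decision sits within both delegated limits."""
    return (
        decision.estimated_cost_usd <= authority.max_spend_usd
        and decision.risk_level <= authority.max_risk_level
    )


# Example values are illustrative, not recommendations.
ic_authority = DelegatedAuthority(max_spend_usd=100_000, max_risk_level=3)
isolate_segment = ContainmentDecision("Isolate finance VLAN", 50_000, 2)
print(can_act_unilaterally(ic_authority, isolate_segment))
```

Anything that returns `False` escalates; everything else proceeds without a meeting, which is the point of delegating the authority in advance.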
How do I build an incident response playbook for a specific scenario like ransomware?
A scenario playbook answers five questions before the incident, not during it. (1) Detection trigger: what specific signal kicks this playbook off — and is that signal monitored 24/7? (2) Containment authority: who is allowed to disconnect production systems, and at what severity threshold? (3) Forensic preservation: what evidence must be captured before recovery begins, and who captures it? (4) Decision tree on payment: if it’s ransomware, has Legal pre-cleared OFAC sanctions screening on likely threat actors, and is your insurance coverage clear on whether they’ll pay? (5) Recovery sequencing: which systems come back first, and what’s the validation gate before reconnecting? Most ransomware playbooks I review are detailed on detection and weak on (4) and (5). The 80% Cisco stat — organisations that pay get re-attacked — exists because the eradication step was skipped.
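The five questions above translate naturally into a structured record that can be checked for gaps before an incident. The sketch below is one possible shape, not a standard schema: the class and field names are hypothetical, chosen to mirror the five questions.

```python
from dataclasses import dataclass, fields


@dataclass
class ScenarioPlaybook:
    """One scenario playbook; each field answers one of the five questions."""
    scenario: str
    detection_trigger: str = ""
    containment_authority: str = ""
    forensic_preservation: str = ""
    payment_decision_tree: str = ""
    recovery_sequencing: str = ""

    def unanswered(self) -> list[str]:
        """Names of the questions that are still blank, in playbook order."""
        return [
            f.name
            for f in fields(self)
            if f.name != "scenario" and not getattr(self, f.name).strip()
        ]


# Illustrative draft that is detailed on detection but weak on (4) and (5).
draft = ScenarioPlaybook(
    scenario="ransomware",
    detection_trigger="EDR mass-encryption alert, monitored 24/7",
    containment_authority="Incident Commander at P1",
    forensic_preservation="Memory and disk images before any rebuild",
)
print(draft.unanswered())
```

A review that calls `unanswered()` on every playbook gives a quick list of the questions that still need an owner.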
What’s the NIST 800-61 incident response framework, and do I need to follow it strictly?
NIST SP 800-61 Revision 2 defined the four-phase IR lifecycle that most plans still follow: Preparation → Detection and Analysis → Containment, Eradication, and Recovery → Post-Incident Activity. Revision 3, finalised in 2025, keeps those activities but organises them around the NIST Cybersecurity Framework 2.0 functions (Govern, Identify, Protect, Detect, Respond, Recover). It’s not a regulation; it’s a structural framework. You don’t need to follow it strictly, but you need to be able to map your plan to it, because that’s how auditors, insurers, and incident-response retainers will assess you. CMMC, NYDFS Part 500, NIS2, and most cyber insurance underwriting questionnaires assume NIST 800-61 alignment. If your plan doesn’t have clear phases, named owners per phase, and evidence of testing, you’ll fail those assessments regardless of how good your tooling is. Structure it. Test it. Document the test.
How fast does incident response need to be, and what response time benchmarks should I aim for?
Three numbers matter, and they don’t all carry the same urgency. Detection (dwell time): Mandiant M-Trends 2024 puts global median dwell at 10 days; that’s down from 16 in 2022 but still too long for ransomware, which often acts in under 24 hours. Aim for a mean time to detect of under 24 hours for high-severity events. Containment: for ransomware and active data exfil, you need containment in single-digit hours; the longer you take, the larger your blast radius. Disclosure: GDPR requires notifying the regulator within 72 hours of becoming aware of a personal data breach; in the U.S., the SEC requires disclosure within four business days of determining that an incident is material; in the EU, NIS2 requires a 24-hour early warning followed by a 72-hour notification. These are non-negotiable clocks. The teams that hit these benchmarks aren’t the ones with the best tools; they’re the ones who tabletopped the scenario in the last six months. Untested plans add hours. Sometimes days.
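The disclosure clocks above are easy to miscount under pressure, so it can help to compute them the moment the incident is declared. The sketch below is an assumption-laden illustration, not legal advice: it treats the GDPR and NIS2 clocks as starting when the organisation becomes aware, treats the SEC clock as starting at the materiality determination, and counts business days by skipping weekends only (public holidays are ignored). Function and key names are hypothetical, and counsel decides when each clock actually starts.

```python
from datetime import datetime, timedelta, timezone


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by whole business days, skipping Saturdays and Sundays only."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


def notification_deadlines(aware_at: datetime, material_at: datetime) -> dict:
    """Latest notification times under the simplified assumptions above."""
    return {
        "nis2_early_warning": aware_at + timedelta(hours=24),
        "nis2_notification": aware_at + timedelta(hours=72),
        "gdpr_regulator": aware_at + timedelta(hours=72),
        "sec_disclosure": add_business_days(material_at, 4),
    }


# Illustrative timestamps: aware on a Thursday, materiality decided on Friday.
aware = datetime(2025, 3, 6, 3, 0, tzinfo=timezone.utc)
material = datetime(2025, 3, 7, 9, 0, tzinfo=timezone.utc)
for name, deadline in notification_deadlines(aware, material).items():
    print(f"{name}: {deadline.isoformat()}")
```

Printing these into the incident channel at declaration time gives the Communications Lead and Legal Counsel a shared view of which clock expires first.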
