Incident Response Guide for CISOs: Plans, Runbooks & DFIR Resources

Incident Response: The Difference Between a Managed Event and an Organisational Crisis

Every organisation will experience a significant security incident. The only variable is whether the response is rehearsed or improvised. This hub gives you the playbook — built from 25+ years of post-breach reviews across 50 countries.

Bottom line up front: The most expensive IR mistakes I’ve seen aren’t technical. They’re organisational. Nobody knew who the incident commander was. Legal wasn’t in the room when the decision to notify regulators was made. The backup team found out their backups were encrypted too. And the board heard about it from a journalist before they heard from the CISO. A tested IR plan eliminates all of these. IBM data shows organisations with tested IR plans save an average $1.49M per incident. That’s not a soft benefit — that’s your business case.

$1.49M: Saved with a tested IR plan (IBM)

241 days: Average breach lifecycle (IBM)

80%: Re-attacked after paying ransom (Cisco)

72 hrs: NIS2 breach notification window

54%: Of IR failures are process failures (PwC)

What You’ll Find in This Hub

The Incident Response Reality Check
What the Data Says
The IR Lifecycle: 4 Phases
Building Your CIRT
Playbooks That Work Under Pressure
IR and Business Continuity Integration
Digital Forensics & Evidence
Deep-Dive Articles
Resources & Free Downloads

The Incident Response Reality Check

I’ve been called into post-breach situations on four continents. The one thing they all share — regardless of industry, company size, or geography — is that the response was slower, more chaotic, and more expensive than it needed to be because the organisation had never actually practised it.

An IR plan that lives in a SharePoint folder and gets reviewed annually is not an IR capability. An IR capability is built through tabletop exercises, purple team operations, real runbooks that real people have read, communication plans that have been stress-tested, and retainer relationships with IR firms and legal counsel that are activated before you need them.

PwC’s incident research shows 54% of IR failures are process failures, not technical failures. The breach was contained. The recovery failed because of unclear roles, poor communication, and inadequate business continuity planning. Technology alone cannot fix this.

What the Data Says

$1.49M

Average cost reduction per breach for organisations with an established IR team and a regularly tested IR plan. This is the single strongest financial argument for IR investment: the saving is measured in breach costs you never incur, which makes preparation one of the few security line items with a published return.
IBM Cost of a Data Breach Report 2025

241

Average days to identify and contain a breach globally. During those 241 days, an attacker is inside your environment mapping it, exfiltrating data, and establishing persistence. Reducing this number is the core goal of threat detection and IR investment.
IBM Cost of a Data Breach Report 2025

80%

Of ransomware victims who pay are targeted again within 12 months. Paying does not end the relationship with the attacker — it often signals you’ll pay again. The decision to pay must be made with legal counsel, law enforcement consideration, and a clear understanding that it doesn’t guarantee decryption.
Cisco Cybersecurity Readiness Index 2025

72hrs

Maximum incident notification window under NIS2. It is preceded by an early warning to authorities within 24 hours of becoming aware of a significant incident and followed by a final report within one month. Under GDPR it’s 72 hours to the supervisory authority for personal data breaches. These clocks start when you become aware of the incident, not when the investigation is complete. Your legal team needs to be in your IR plan from day one.
EU NIS2 Directive, GDPR Article 33

54%

Of incident response failures are process and communication failures — not technical failures. The containment worked. The recovery failed because nobody owned the communication plan, the backup process was untested, or the BCP handoff was never defined.
PwC Global Digital Trust Insights 2026

The IR Lifecycle: 4 Phases

Phase 01

Preparation

Build your CIRT, write your playbooks, establish retainers, test your backups, run tabletops. Everything that happens before the incident determines how the incident goes.

Phase 02

Detection & Analysis

Identify, scope, and classify the incident. Preserve evidence. Activate the CIRT. Determine whether regulatory notification clocks have started running.

Phase 03

Contain, Eradicate & Recover

Isolate affected systems. Remove the threat. Restore from clean backups. Verify clean state before returning to production. Maintain business continuity throughout.

Phase 04

Post-Incident Activity

Root cause analysis. Lessons learned. Control improvements. Board reporting. Regulatory submissions. Update playbooks based on what you learned.

Building Your Cyber Incident Response Team

Role 01

Incident Commander

Maintains overall situational awareness and makes key decisions under pressure. This is typically the CISO or a designated deputy. They do not do technical work during the incident — they command. Most IR failures happen when the CISO is heads-down on forensics instead of managing the response.

Role 02

Technical Lead

Directs the forensic investigation, containment actions, and eradication. Needs deep technical skills and the ability to brief non-technical stakeholders clearly. Consider pre-engaging an external IR firm on retainer for complex incidents.

Role 03

Communications Lead

Manages internal communications, media enquiries, customer notifications, and regulatory submissions. This role is activated immediately in any significant incident. Having pre-approved communication templates reduces response time dramatically.

Role 04

Legal Counsel

Advises on notification obligations, evidence preservation, ransomware payment legality, and regulatory engagement. Legal must be in the CIRT, not called in after decisions are made. Get your legal team into a tabletop exercise before you need them in a real incident.

Role 05

Business Continuity Lead

The most underrepresented role in most CIRTs. While the technical team contains the threat, this person is activating manual fallback procedures, coordinating with operations, and ensuring critical business functions continue. IR and BCP must be co-designed.

Role 06

Executive Liaison

Manages the board and C-suite communication cadence. The board should never learn about a significant incident from a news article. This role maintains a regular briefing schedule and ensures leadership has the information they need to make business decisions without interfering with the technical response.

The most dangerous IR plan is the one that’s never been tested. I’ve reviewed IR plans that were 200 pages long and completely useless under pressure — because the people who wrote them weren’t the people who would execute them, and the people who would execute them had never read them. Test your plan. Tabletop it. Run red team simulations. Find the gaps before the attacker does.
— Dr. Erdal Ozkaya, CISO & Author of 26 Cybersecurity Books

The Integration Gap: IR and Business Continuity

The most consistent failure point I see in post-breach reviews is the handoff from technical IR to business continuity. Security contains the ransomware — but nobody arranged alternative workflows for the factory floor that can’t access its production systems. The DR site is activated — but the manufacturing processes that depend on the compromised OT network have no manual fallback.

IR and BCP must be co-designed. Every IR playbook needs a BCP counterpart that answers: if this system is offline for 72 hours, how does the business keep operating? Who authorises manual processes? What’s the communication to customers and suppliers? What’s the maximum tolerable downtime? Get these answers before the incident, not during it.
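The BCP counterpart questions above can be checked mechanically. The following is a minimal sketch, assuming a made-up record format: the `BcpEntry` fields, the system names, and the downtime figures are all hypothetical placeholders, not taken from any standard or from a real plan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BcpEntry:
    """BCP counterpart for one system named in an IR playbook (hypothetical schema)."""
    system: str
    max_tolerable_downtime_hours: float
    manual_fallback: Optional[str] = None      # documented manual process, if any
    fallback_authoriser: Optional[str] = None  # who may authorise manual working


def bcp_gaps(entries, outage_hours=72.0):
    """Return (system, problem) pairs for an assumed outage length."""
    gaps = []
    for e in entries:
        # The outage outlasts what the business can tolerate and nothing replaces the system.
        if outage_hours > e.max_tolerable_downtime_hours and not e.manual_fallback:
            gaps.append((e.system, "outage exceeds tolerable downtime, no manual fallback"))
        # A fallback exists on paper but nobody is named to switch it on.
        if e.manual_fallback and not e.fallback_authoriser:
            gaps.append((e.system, "manual fallback has no named authoriser"))
    return gaps


# Illustrative entries only.
entries = [
    BcpEntry("factory-floor MES", 8),
    BcpEntry("order intake", 24, manual_fallback="phone and paper order forms"),
    BcpEntry("payroll", 120),
]

for system, problem in bcp_gaps(entries):
    print(f"{system}: {problem}")
```

Running a check like this during a tabletop is one way to surface the handoff gaps before an incident does.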

Digital Forensics: Evidence That Holds Up

Digital forensics isn’t just about understanding what happened — it’s about preserving evidence in a way that supports potential legal action, insurance claims, and regulatory investigations. Evidence collected without maintaining chain of custody may be inadmissible. Systems powered down without capturing volatile memory lose critical forensic artefacts forever.
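One common building block of chain of custody is hashing each evidence file at acquisition and re-checking the hash at every handover. The sketch below uses only the Python standard library; the record fields (`handler`, `action`, and so on) are illustrative assumptions, not a legal or forensic standard, and a real process would also need signed, tamper-evident storage for the log itself.

```python
import hashlib
import json
import os
import tempfile
from datetime import datetime, timezone


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large disk or memory images fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(path, handler, action):
    """Build one custody log record (hypothetical field names)."""
    return {
        "evidence_path": path,
        "sha256": sha256_of_file(path),
        "handler": handler,
        "action": action,
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(),
    }


def is_unaltered(path, entry):
    """True if the file still matches the hash recorded in the custody entry."""
    return sha256_of_file(path) == entry["sha256"]


# Demonstration with a throwaway file standing in for a memory image.
with tempfile.NamedTemporaryFile(delete=False, suffix=".img") as tmp:
    tmp.write(b"stand-in for a memory image")
    sample_path = tmp.name

entry = custody_entry(sample_path, handler="analyst-01", action="acquired")
print(json.dumps(entry, indent=2))
print("unaltered:", is_unaltered(sample_path, entry))
os.remove(sample_path)
```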

Cloud forensics adds complexity: evidence may be held by a third-party provider with limited retention windows, platform-specific collection methods, and jurisdictional considerations. Enable cloud audit logging now — you cannot go back and collect logs from before an incident if logging wasn’t active. This is a preparation task, not a response task.

Deep-Dive Articles

DFIR

Digital Forensics & IR (DFIR): A CISO’s Guide

How to investigate breaches properly, preserve evidence for legal proceedings, and build a DFIR capability that supports both recovery and accountability.

Team Building

Building a Cyber Incident Response Team

Roles, responsibilities, retainers, and the organisational politics of building a CIRT that actually functions under pressure — not just on the org chart.

BCP Integration

IR Planning for Business Continuity

The integration most organisations miss — connecting your IR plan to BCP so the business keeps running while security contains the threat.

IR Resources & Free Downloads

Free IR Book

Incident Response for Business Continuity — co-authored with Binalyze. Practical IR planning that connects to operational survival.

CISO Toolkit

IR plan templates, CIRT role definitions, communication templates, and post-incident report frameworks.

ISO 27001 Toolkit

Incident management controls aligned to ISO 27001:2022 — the framework foundation for your IR programme.

Sentinels Talk Show

IR war stories, lessons learned, and expert conversations with practitioners who’ve managed major incidents.

Book Dr. Ozkaya

IR tabletop exercises, CIRT building workshops, and board-level crisis simulation for executive teams.

Take to the Boardroom

What your board needs to hear about incident response

Three talking points, one metric, one question. Screenshot this for your next board prep.

The board’s job during an incident is not to manage it. It is to back the people who are. Pre-agree decision authority before you need it — at 3 a.m. is not the time to discover who can authorise a public statement.

Tabletop exercises that everyone passes are theatre. A real exercise should produce uncomfortable findings — usually about communication, legal escalation, and who actually has authority to take systems offline.

Most material damage in a serious incident comes from the 72 hours after detection, not the breach itself. The wrong public statement, wrong regulator filing, or wrong customer notification creates the lasting harm.

The Metric That Matters: Time from detection to first executive decision on containment, including taking critical systems offline.

Ask Your Team: When we last ran a tabletop, what specifically went wrong, and have we fixed it, or are we still scheduling the meeting to fix it?

Your IR Plan Has Gaps. Find Them Before the Attacker Does.

Most organisations discover their IR gaps during an actual incident — when the cost of finding them is at its highest. I run IR tabletop exercises and capability assessments that expose the gaps in roles, process, and communication before they matter under pressure.

Incident Response FAQ — Honest Answers to the Questions CISOs Actually Ask

What are the key components of an incident response plan, and what does a real one look like in practice?

A real IR plan has six things, and most “plans” I review are missing at least three of them.

(1) A named incident commander with explicit authority to spend money and shut systems down without asking permission.
(2) A pre-agreed severity matrix so the team isn’t debating “is this a P1?” at 3 a.m.
(3) Communications templates (internal, customer, regulator, board) drafted in calm conditions, not under fire.
(4) Vendor and legal contact lists with after-hours numbers, retested every quarter.
(5) Containment playbooks for your top five most likely scenarios (ransomware, BEC, insider, cloud account takeover, supply chain).
(6) A post-incident review process that produces actual changes, not theatre.

NIST 800-61 covers the structural lifecycle (preparation, detection, containment, eradication, recovery, lessons learned), and that framework is sound. The failure mode isn’t framework choice; it’s that the plan exists as a PDF nobody has tested.

How does incident response in Azure or Microsoft 365 differ from traditional on-premises IR?

The biggest shift is that your forensic surface is partly someone else’s machine. In on-prem IR, you can image the box and walk away with everything. In Azure/M365, your evidence lives in Microsoft Defender XDR, Sentinel, Entra ID sign-in logs, and the Purview unified audit log, and retention windows vary by licence tier. On E3 (Audit Standard), unified audit log records are kept for 180 days by default (90 days for records generated before October 2023); on E5 (Audit Premium), records from Entra ID, Exchange, SharePoint, and OneDrive are kept for one year by default. That delta matters when an attacker has been resident for eight months. The second shift is that identity is the new perimeter: most cloud incidents I’ve worked began with token theft or a consent phishing attack, not a traditional malware payload. Your IR plan needs procedures for reviewing and resetting Entra ID conditional access policies, reviewing OAuth app consents, and the muscle memory to revoke session tokens fast. Defender XDR’s automated investigation is genuinely good, but only if you’ve tuned it before the incident, not during.
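The retention point reduces to simple date arithmetic. Here is a minimal sketch that estimates how many days of attacker activity fall outside a log retention window; the dates and the retention values in the loop are illustrative inputs only, so substitute the figures that apply to your own tenant and licence.

```python
from datetime import date, timedelta


def retention_gap_days(first_access: date, detected_on: date, retention_days: int) -> int:
    """Days of attacker activity older than the oldest log still retained."""
    oldest_log_available = detected_on - timedelta(days=retention_days)
    gap = (oldest_log_available - first_access).days
    return max(gap, 0)  # zero means the whole intrusion is still covered by logs


# Hypothetical intrusion: first access roughly eight months before detection.
first_access = date(2025, 1, 1)
detected = date(2025, 9, 1)

for retention in (90, 180, 365):  # example retention windows, in days
    missing = retention_gap_days(first_access, detected, retention)
    print(f"{retention}-day retention: {missing} days of activity with no audit log")
```

A non-zero result means the earliest part of the intrusion, often the initial access itself, cannot be reconstructed from that log source.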

What incident response leadership model actually works during a real breach?

Separate the roles. Incident Commander runs the response. CISO owns the strategic narrative and the board. Legal owns regulatory disclosure decisions. Communications owns the message to customers and press. Putting all four hats on one person — usually the CISO — is the most common organisational failure I see, and it’s why decisions get delayed at the worst moments. The Incident Commander shouldn’t be the most senior person; they should be the most operationally calm one. They need authority delegated in writing, not implied. During a breach, you do not have time to escalate every containment decision. Pre-agree the dollar threshold and risk threshold at which the IC can act unilaterally versus escalate. If you don’t, your $50K decision becomes a 4-hour debate while the attacker exfils data.
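Writing the delegated authority down can be as simple as two thresholds and a rule. The sketch below is one hypothetical way to encode it; the `DelegatedAuthority` type, the three-level risk scale, and the dollar figures are assumptions for illustration, not a recommended policy.

```python
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class DelegatedAuthority:
    """Limits agreed in writing before any incident (hypothetical values)."""
    max_cost_usd: float
    max_risk: Risk


def ic_decision(cost_usd: float, risk: Risk, authority: DelegatedAuthority) -> str:
    """Return 'act' if the Incident Commander may proceed alone, else 'escalate'."""
    if cost_usd <= authority.max_cost_usd and risk <= authority.max_risk:
        return "act"
    return "escalate"


authority = DelegatedAuthority(max_cost_usd=100_000, max_risk=Risk.MEDIUM)

print(ic_decision(50_000, Risk.MEDIUM, authority))  # inside both limits
print(ic_decision(50_000, Risk.HIGH, authority))    # risk above the delegation
print(ic_decision(250_000, Risk.LOW, authority))    # cost above the delegation
```

The value is less in the code than in forcing the two numbers to be agreed and signed off in advance.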

How do I build an incident response playbook for a specific scenario like ransomware?

A scenario playbook answers five questions before the incident, not during it.

(1) Detection trigger: what specific signal kicks this playbook off, and is that signal monitored 24/7?
(2) Containment authority: who is allowed to disconnect production systems, and at what severity threshold?
(3) Forensic preservation: what evidence must be captured before recovery begins, and who captures it?
(4) Decision tree on payment: if it’s ransomware, has Legal pre-cleared OFAC sanctions screening on likely threat actors, and is your insurance coverage clear on whether they’ll pay?
(5) Recovery sequencing: which systems come back first, and what’s the validation gate before reconnecting?

Most ransomware playbooks I review are detailed on detection and weak on (4) and (5). The 80% Cisco stat (organisations that pay get re-attacked) exists because the eradication step was skipped.
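The five questions above map naturally onto a small record with a completeness check. This is a sketch under assumptions: the `ScenarioPlaybook` class and its field names are invented here to mirror the five questions, and the sample answers are placeholders.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ScenarioPlaybook:
    """One field per question a scenario playbook should answer (hypothetical schema)."""
    scenario: str
    detection_trigger: Optional[str] = None
    containment_authority: Optional[str] = None
    forensic_preservation: Optional[str] = None
    payment_decision_tree: Optional[str] = None
    recovery_sequencing: Optional[str] = None


def unanswered(playbook: ScenarioPlaybook) -> List[str]:
    """List the questions that still have no answer, in declaration order."""
    return [
        f.name
        for f in fields(playbook)
        if f.name != "scenario" and not getattr(playbook, f.name)
    ]


# Placeholder answers for illustration only.
ransomware = ScenarioPlaybook(
    scenario="ransomware",
    detection_trigger="EDR mass-encryption alert, monitored 24/7",
    containment_authority="Incident Commander at severity 1",
    forensic_preservation="memory image and disk snapshot before rebuild",
)

print(unanswered(ransomware))
```

In this example the two missing answers are the payment decision tree and recovery sequencing, the same two areas that tend to be thin in real ransomware playbooks.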

What’s the NIST 800-61 incident response framework, and do I need to follow it strictly?

NIST SP 800-61 is NIST’s incident response guidance. Revision 2 defined the four-phase IR lifecycle: Preparation → Detection and Analysis → Containment, Eradication, and Recovery → Post-Incident Activity. Revision 3, finalised in April 2025, reorganises the guidance around the six NIST Cybersecurity Framework 2.0 functions (Govern, Identify, Protect, Detect, Respond, Recover), although the four phases remain the vocabulary most plans and assessors still use. It’s not a regulation; it’s a structural framework. You don’t need to follow it strictly, but you need to be able to map your plan to it, because that’s how auditors, insurers, and incident-response retainers will assess you. Regimes such as CMMC, NYDFS Part 500, and NIS2, along with most cyber insurance underwriting questionnaires, expect a documented and tested IR plan that maps cleanly onto this kind of structure. If your plan doesn’t have clear phases, named owners per phase, and evidence of testing, you’ll fail those assessments regardless of how good your tooling is. Structure it. Test it. Document the test.

How fast does incident response need to be, and what response time benchmarks should I aim for?

Three numbers matter, and they’re not all the same urgency. Detection (dwell time): Mandiant M-Trends 2024 puts global median dwell at 10 days; that’s down from 16 in 2022 but still too long for ransomware, which often acts in under 24 hours. Aim for under 24 hours mean-time-to-detect for high-severity events. Containment: for ransomware and active data exfil, you need containment in single-digit hours; the longer you take, the larger your blast radius. Disclosure: GDPR mandates notification to the regulator within 72 hours of becoming aware of a personal data breach; the SEC requires U.S. public companies to disclose a material incident within 4 business days of determining that it is material; NIS2 has a 24-hour early warning followed by a 72-hour notification in the EU. These are non-negotiable clocks. The teams that hit these benchmarks aren’t the ones with the best tools. They’re the ones who tabletop’d the scenario in the last six months. Untested plans add hours. Sometimes days.
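The disclosure clocks are easy to get wrong at 3 a.m., so it can help to compute them the moment awareness is logged. The sketch below is a simplified illustration, not legal advice: it assumes the GDPR and NIS2 clocks run from the awareness timestamp, that the SEC clock runs from the materiality determination, and it counts business days as Monday to Friday with no public holidays. The function and label names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by whole business days, skipping Saturdays and Sundays only."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def notification_deadlines(aware_at: datetime, material_at: datetime = None) -> dict:
    """Map each disclosure obligation to its latest permitted timestamp."""
    deadlines = {
        "NIS2 early warning (24h)": aware_at + timedelta(hours=24),
        "NIS2 incident notification (72h)": aware_at + timedelta(hours=72),
        "GDPR supervisory authority (72h)": aware_at + timedelta(hours=72),
    }
    if material_at is not None:
        # Assumption: the SEC clock starts when materiality is determined.
        deadlines["SEC disclosure (4 business days)"] = add_business_days(material_at, 4)
    return deadlines


# Hypothetical timeline: awareness on a Thursday, materiality decided on the Friday.
aware = datetime(2025, 3, 6, 3, 0, tzinfo=timezone.utc)
material = datetime(2025, 3, 7, 15, 0, tzinfo=timezone.utc)

for label, due in notification_deadlines(aware, material).items():
    print(f"{label}: {due.isoformat()}")
```

Keeping every timestamp in UTC avoids arguments about which office's local time the clock started in.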