How to Run a Custody Key-Recovery Drill Without Exposing a Single Private Key
Fire drills don’t burn the building down. A custody key-recovery drill, done right, follows the same principle: you rehearse every step of recovering from your key backup except the one that creates risk, which is materializing a private key.
This matters because the most common objection to rehearsing recovery is security: “we don’t want to handle live key material more often than we must.” Correct instinct, wrong conclusion. MPC recovery tooling typically separates verification (reconstruct internally, confirm against registered public keys, expose nothing) from revelation (export the private key). A drill built on verify-only execution gets you nearly all of the assurance with none of the exposure.
Here is a structure that works. It assumes an MPC custody setup with a customer-held backup and offline recovery tooling, but the shape generalizes to multisig and HSM-based setups.
Before the drill
Define what “success” means in advance. A drill without pre-agreed pass criteria becomes a demo. Example criteria: backup decrypts; all key sets verify against registered public keys; coverage gap (key material newer than the backup) enumerated and below the agreed threshold; total elapsed time under target; executed by someone other than the person who designed the procedure.
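One way to keep the criteria honest is to write them down as data before the drill and evaluate the result mechanically afterwards. The following is a minimal sketch, not a prescribed format: the class names, field names, and threshold values are all hypothetical and should be replaced with whatever your team agreed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrillCriteria:
    """Thresholds agreed before the drill (the values used below are illustrative)."""
    coverage_gap_threshold: int   # gap must be strictly below this number
    elapsed_minutes_target: int   # elapsed time must be strictly under this


@dataclass(frozen=True)
class DrillResult:
    backup_decrypted: bool
    key_sets_total: int
    key_sets_verified: int
    coverage_gap: int             # key sets newer than the backup
    elapsed_minutes: int
    operator: str
    procedure_author: str


def evaluate(result: DrillResult, criteria: DrillCriteria) -> dict[str, bool]:
    """Return one pass/fail flag per pre-agreed criterion."""
    return {
        "backup_decrypts": result.backup_decrypted,
        "all_key_sets_verify": result.key_sets_total > 0
        and result.key_sets_verified == result.key_sets_total,
        "coverage_gap_below_threshold": result.coverage_gap < criteria.coverage_gap_threshold,
        "elapsed_time_under_target": result.elapsed_minutes < criteria.elapsed_minutes_target,
        "independent_operator": result.operator != result.procedure_author,
    }


# Illustrative values only.
criteria = DrillCriteria(coverage_gap_threshold=5, elapsed_minutes_target=240)
result = DrillResult(
    backup_decrypted=True, key_sets_total=12, key_sets_verified=12,
    coverage_gap=3, elapsed_minutes=195,
    operator="operator-b", procedure_author="operator-a",
)
checks = evaluate(result, criteria)
drill_passed = all(checks.values())
```

Keeping one flag per criterion, rather than a single boolean, means a failed drill still says exactly which criterion it failed.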
Pick the operator deliberately. The drill’s value is highest when run by the second-most qualified person. If only one human can recover your assets, the drill’s first finding has already been made.
Prepare the air-gapped environment from scratch. A clean machine, OS installed for the occasion, tooling transferred by verified offline media, no network. Part of what you’re testing is whether your procedure for building the recovery environment works. In the real scenario you may not have a lovingly maintained recovery laptop.
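“Verified offline media” can be as simple as checking a SHA-256 digest of the tooling on the air-gapped machine against a value obtained through a separate trusted channel. The sketch below uses only the Python standard library. The file name and its contents are stand-ins, and it assumes your vendor or build process publishes a SHA-256 digest at all; if it publishes signatures instead, verify those.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tooling(path: Path, expected_sha256: str) -> bool:
    """Compare the file's digest with the expected hex digest."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.strip().lower())


# Demonstration with a stand-in file; in a drill the path points at the
# transferred tooling and the expected digest comes from a trusted source.
with tempfile.TemporaryDirectory() as workdir:
    tool = Path(workdir) / "recovery-tool.bin"  # hypothetical file name
    payload = b"stand-in for the recovery tooling archive"
    tool.write_bytes(payload)
    expected = hashlib.sha256(payload).hexdigest()
    tooling_ok = verify_tooling(tool, expected)
    wrong_digest_ok = verify_tooling(tool, "0" * 64)
```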
Stage the materials as they would really be retrieved. Backup archive from its actual storage location, recovery key from its actual escrow or HSM, passphrases via their actual access procedure. Most drill findings come from this step, not from the cryptography: the safe-deposit process that needs a person who left, the passphrase that was “updated” without re-wrapping the backup, the second key share nobody can locate.
During the drill
Run verify-only. Reconstruct internally, confirm the derived public keys match the registered ones, never export a private key. If your tooling supports it, also derive a sample of wallet addresses and confirm them against known addresses; that extends the proof from “master key correct” to “our actual wallets are reachable.”
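The comparison itself involves public material only, so it can be scripted and the report kept as evidence. This sketch assumes the recovery tool can report the public keys it derived as hex strings keyed by a key-set identifier. That output format, the identifiers, and the key values below are hypothetical placeholders, not any vendor’s actual interface.

```python
def normalize(pubkey_hex: str) -> str:
    """Make hex encodings comparable: trim, lowercase, drop an optional 0x prefix."""
    return pubkey_hex.strip().lower().removeprefix("0x")


def compare_public_keys(
    derived: dict[str, str], registered: dict[str, str]
) -> dict[str, list[str]]:
    """Classify each key set by comparing derived and registered public keys."""
    report: dict[str, list[str]] = {
        "matched": [], "mismatched": [], "missing": [], "unexpected": [],
    }
    for key_id, registered_hex in sorted(registered.items()):
        derived_hex = derived.get(key_id)
        if derived_hex is None:
            report["missing"].append(key_id)        # registered, not derived
        elif normalize(derived_hex) == normalize(registered_hex):
            report["matched"].append(key_id)
        else:
            report["mismatched"].append(key_id)
    # Keys the tool derived that nobody registered also deserve a look.
    report["unexpected"] = sorted(set(derived) - set(registered))
    return report


# Placeholder values, not real keys.
registered = {"ecdsa-main": "02" + "ab" * 32, "eddsa-main": "cd" * 32}
derived = {"ecdsa-main": "0x02" + "AB" * 32, "eddsa-main": "ce" * 32}
report = compare_public_keys(derived, registered)
verification_passed = not (
    report["mismatched"] or report["missing"] or report["unexpected"]
)
```

The same comparison works for the wallet-address sample: swap the public keys for derived and known addresses.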
Reconcile coverage. Compare what the backup contains against current key material in the live system. The delta (anything created since the backup) is your silent gap. Record the number; it’s the single most decision-relevant output of the whole exercise.
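Reconciliation is a set difference over key identifiers. The sketch below assumes you can export a list of key or vault identifiers from both the backup and the live system; the identifiers shown are made up for illustration.

```python
def reconcile_coverage(
    backup_key_ids: list[str], live_key_ids: list[str]
) -> dict[str, list[str]]:
    """Compare key identifiers found in the backup with those in the live system."""
    backup, live = set(backup_key_ids), set(live_key_ids)
    return {
        # Created since the backup was taken: the silent gap.
        "not_in_backup": sorted(live - backup),
        # In the backup but no longer live: usually harmless, worth noting.
        "only_in_backup": sorted(backup - live),
    }


# Hypothetical identifiers.
backup_ids = ["vault-001", "vault-002", "vault-003"]
live_ids = ["vault-001", "vault-002", "vault-003", "vault-004", "vault-005"]
coverage = reconcile_coverage(backup_ids, live_ids)
coverage_gap = len(coverage["not_in_backup"])
```

Here `coverage_gap` is the number to record in the drill artifact.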
Walk the long tail on paper. For staked, locked, and contract-held positions, a sweep is impossible by construction; the drill should table-top the runbook instead: which protocol exit steps, what unbonding timeline, which pre-approved destination addresses. If the runbook doesn’t exist, that’s a finding, not a footnote.
Record deviations as they happen. Every place reality diverged from the written procedure gets logged at the moment it occurs: a missing tool version, an unexpected prompt, a step done from memory because the document was wrong. Deviations are the product of the drill; a deviation-free record usually means nobody was writing things down.
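Logging at the moment of occurrence is easier when each entry is one function call that stamps the time itself. A minimal sketch, with hypothetical field names and example entries:

```python
import json
from datetime import datetime, timezone
from typing import Optional


def log_deviation(
    log: list[dict], step: str, expected: str, actual: str,
    now: Optional[datetime] = None,
) -> dict:
    """Append one deviation, stamped in UTC at the moment it is recorded."""
    entry = {
        "recorded_at": (now or datetime.now(timezone.utc)).isoformat(),
        "step": step,
        "expected": expected,
        "actual": actual,
    }
    log.append(entry)
    return entry


def to_json_lines(log: list[dict]) -> str:
    """Serialize the log as one JSON object per line for the drill artifact."""
    return "\n".join(json.dumps(entry, sort_keys=True) for entry in log)


deviations: list[dict] = []
log_deviation(deviations, "install tooling",
              "version named in the procedure", "a newer version was on the media")
log_deviation(deviations, "decrypt backup",
              "no prompt", "tool asked for an extra confirmation")
deviation_log_text = to_json_lines(deviations)
```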
After the drill
Produce evidence, not a vibe. The output should be an artifact a third party can assess later: who executed, when, on what environment, what was verified, the coverage number, every deviation, pass/fail against the pre-agreed criteria, and sign-off by the participants. Timestamped and tamper-evident if you can; a signed PDF in the audit folder at minimum. Eighteen months from now, “we did a drill” carries no weight; the artifact does.
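One lightweight route to tamper evidence is to serialize the drill record deterministically and keep its SHA-256 digest somewhere the record’s editors cannot change it, for example inside the signed sign-off. The sketch below illustrates the idea under that assumption. The field names and values are hypothetical, and a digest stored alongside the record proves nothing on its own.

```python
import hashlib
import hmac
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize deterministically so the same record always hashes the same."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal(record: dict) -> str:
    """Return the SHA-256 hex digest of the canonical record."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


def is_unchanged(record: dict, digest: str) -> bool:
    """Check a record against a digest that was stored or signed elsewhere."""
    return hmac.compare_digest(seal(record), digest)


# Hypothetical record; the fields mirror the list in the paragraph above.
evidence = {
    "executed_by": "operator-b",
    "executed_at": "2025-01-15T10:00:00+00:00",
    "environment": "clean air-gapped laptop, OS installed for the drill",
    "verified": ["backup decrypts", "public keys match registered keys"],
    "coverage_gap": 3,
    "deviations": ["tool asked for an extra confirmation"],
    "criteria_passed": True,
    "signed_off_by": ["operator-b", "security-lead"],
}
digest = seal(evidence)
```

Anyone holding the digest can later call `is_unchanged` on the archived record to confirm it is the one that was signed off.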
Fix the findings, then schedule the next one. A drill is a sample of your readiness at one point in time. Readiness decays: workspaces grow past their backups, people leave, vendors change backup formats, tooling versions drift. An annual cadence is the floor; any material change to key material, vendor tooling, or personnel should trigger an off-cycle run.
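The cadence rule is small enough to encode, which makes it easy to wire into whatever reminder or monitoring system you already run. A minimal sketch, treating “annual” as 365 days and assuming material changes are tracked as a list of dates; both choices and all names are illustrative.

```python
from datetime import date, timedelta

ANNUAL_FLOOR = timedelta(days=365)  # assumption: "annual" means 365 days


def drill_due(
    last_drill: date, today: date, material_changes: list[date]
) -> tuple[bool, str]:
    """Decide whether a drill is due, and why."""
    if any(change > last_drill for change in material_changes):
        return True, "off-cycle: material change since the last drill"
    if today - last_drill >= ANNUAL_FLOOR:
        return True, "annual cadence reached"
    return False, "not due"


# Illustrative dates: a vendor tooling change landed after the last drill.
due, reason = drill_due(
    last_drill=date(2025, 1, 15),
    today=date(2025, 6, 1),
    material_changes=[date(2025, 4, 2)],
)
```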
The maturity ladder
In practice, organizations sit on a four-rung ladder:
- Rung 1: A backup exists (someone downloaded it, once).
- Rung 2: The backup has been verified (point-in-time proof it decrypts and matches).
- Rung 3: Verification is scheduled and evidenced (recoverability is a monitored property, with artifacts).
- Rung 4: Recovery is rehearsed (people, environment, runbooks, and the long tail, proven under drill conditions).
Most institutions holding digital assets are on rung 1 and believe they are on rung 4. The distance between those rungs is invisible right up until the day it’s the only thing that matters.