MICRO case_file
Scripted Cassandra Node Decommission
Database Cloud Production
Context
After a workload consolidation, a set of Cassandra nodes were no longer needed. The cluster needed a clean shrink, not a hard removal.
Problem
Removing nodes from a live ring is reversible only up to a point. The shape of the change had to respect Cassandra's data ownership semantics, and the operation had to be repeatable across multiple nodes.
My role
Operator: scripted the decommission steps and validated cluster state between them.
Technical actions
- [01] Drained and decommissioned nodes following Cassandra's ownership-aware procedure.
- [02] Validated cluster topology and token ranges between steps.
- [03] Captured the steps as a repeatable script so the next decommission would not start from zero.
Operational impact
Cluster shrunk safely after the consolidation. The procedure is now a reusable artifact instead of tribal knowledge.
What this demonstrates
- Treating cleanup as a first-class operation, not an afterthought.
- Scripting recurring infrastructure work for repeatability.