It Took a Pandemic to Learn Why Standards Failed

In 2015, I did what seemed like the mature thing to do. I created a Production Engineering department. My college foundation was production engineering. I was a true believer: if we formalized standards and assigned a dedicated group to own operational rigor, the organization would naturally converge toward consistency. The mandate: Create SOPs. Define standards. Reduce variance. Improve reliability. On paper, it was textbook. In practice, it was a slow-motion collision with reality. ...

When the Constraint Isn’t Capacity

A few years ago, as Field CTO for an enterprise customer, I was pulled into a rescue effort that started the way these stories usually start: pain, urgency, and a narrative that felt convenient. The application hit a bootstorm—150,000+ users slamming it in a short window—and then the predictable second-order effect: every day after that, more tickets piled up. Instability. Session timeouts. Intermittent failures. The kind of symptoms that turn a service into a rumor. ...