The Telemetry Trap - When Everything Measured Makes Nothing Matter
Companies burn millions on Snowflake/Databricks while Parquet+S3 delivers 95% of the value at a fraction of the cost. Netflix mastered telemetry by focusing on what matters. Your data goldmine isn't in petabytes; it's in the kilobytes driving your core business. Measure what matters, not everything.
A few decades ago, an IBM director shared what became Silicon Valley gospel: "If you can't measure it, it doesn't exist." Made perfect sense when we were talking SLAs and business efficiency. Twenty years later, I'm watching companies drown in their own measurements.
Let me be crystal clear upfront: I'm a massive observability advocate. Netflix's infrastructure telemetry work is genius-level stuff. The contributions from their distribution engineering team—folks like Kevin McEntee, Maulik Pandey, and Sergey Fedorov—represent the absolute state of the art in content delivery. Their approach was brilliantly simple: obsess over what users actually experience. Revolutionary at the time. Changed everything. And here's the kicker—Netflix probably spends more on telemetry and observability than on actual content delivery infrastructure.
But here's where it gets interesting.
More than a decade ago, a company president called me with an urgent request: "We need a big data project." When I asked what insights he needed, his response was telling: "I don't know yet, but we need to figure it out." That was my first real glimpse into the big-data gold rush. Everyone wanted a data lake. Nobody knew what fish they were trying to catch.
Don't misunderstand: as a researcher and performance obsessive, I know the difference between microseconds and nanoseconds can mean the difference between rockets landing and rockets exploding. Precision matters. But there's a statistical concept called heteroskedasticity that most data initiatives ignore: when the variance in your data isn't constant across measurements, every extra column you collect brings its own noise profile, and the cost of separating signal from noise climbs steeply as the dataset gets wider. Translation: the more random stuff you measure, the harder and more expensive it becomes to extract any real signal. This kills most data analysis initiatives before they even start.
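Here's a quick back-of-the-envelope simulation of that effect (plain NumPy, synthetic data, every number invented purely for illustration): one column genuinely drives the outcome, the noise is deliberately heteroskedastic, and we watch what happens to the estimate of that one real effect as uninformative columns pile up.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500  # observations

# One real driver: the outcome depends only on this column.
signal = rng.normal(size=n)
# Heteroskedastic noise: the spread grows with the signal's magnitude.
y = 2.0 * signal + rng.normal(scale=0.5 + np.abs(signal), size=n)

for extra_cols in (0, 10, 100, 400):
    # Bolt on columns that carry no information about y at all.
    X = np.column_stack([signal.reshape(-1, 1), rng.normal(size=(n, extra_cols))])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Score on fresh data drawn the same way (the junk stays junk).
    s_test = rng.normal(size=n)
    y_test = 2.0 * s_test + rng.normal(scale=0.5 + np.abs(s_test), size=n)
    X_test = np.column_stack([s_test.reshape(-1, 1), rng.normal(size=(n, extra_cols))])
    resid = y_test - X_test @ coef
    r2 = 1.0 - np.sum(resid**2) / np.sum((y_test - y_test.mean()) ** 2)
    print(f"{extra_cols:4d} junk columns -> effect estimate {coef[0]:+.2f}, holdout R^2 {r2:+.2f}")
```

Nothing about the underlying signal changes between runs; only the width of the table does. Yet the out-of-sample fit deteriorates and the one effect that actually matters gets harder to pin down, and that's before anyone pays the storage and pipeline bill for the junk columns.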
Consider what actually built today's tech giants. It wasn't massive data collection. Google dominated search by delivering compelling results first—Chrome and Android telemetry came later, refining an already-winning formula. Facebook cracked digital social interaction before they became behavioral analysis masters. These companies are telemetry pinnacles precisely because they don't advertise it. They solved real problems first, then used data to polish the solution.
What I'm witnessing now is companies pursuing "observability excellence" while ignoring the gold already in their hands. Instead of mastering their core business data—the stuff their entire existence is built on—they're burning millions chasing trends. They're so busy looking for needles in ever-larger haystacks that they miss the treasure chest sitting in plain sight: their existing customer interactions, their actual user experience, the fundamental events that drive their business.
Here's the part that should keep CFOs up at night: the modern data stack doesn't require millions anymore. Parquet files on S3, processed with DuckDB or Polars, queried at scale through Presto or Athena. That combination delivers 95% of what Snowflake or Databricks offers for a fraction of the cost. Add Apache Iceberg or Delta Lake for ACID transactions and time travel. Deploy Superset or Metabase for visualization. Total cost? Maybe tens of thousands instead of millions. The tools are commoditized, battle-tested, and sitting right there on GitHub.
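To show how little ceremony that stack involves, here's a minimal sketch using DuckDB against a hypothetical bucket of Parquet event logs. The bucket name, columns, and region are placeholders, and AWS credentials are assumed to be configured in the environment.

```python
import duckdb

con = duckdb.connect()  # in-process: no cluster, no warehouse to size

# httpfs lets DuckDB read Parquet straight out of S3.
con.execute("INSTALL httpfs;")
con.execute("LOAD httpfs;")
con.execute("SET s3_region = 'us-east-1';")  # placeholder region

# Hypothetical layout: purchase events landed as partitioned Parquet files.
daily_revenue = con.execute("""
    SELECT date_trunc('day', event_time) AS day,
           count(*)                      AS orders,
           sum(amount)                   AS revenue
    FROM read_parquet('s3://your-bucket/events/*/*.parquet')
    WHERE event_type = 'purchase'
    GROUP BY 1
    ORDER BY 1
""").df()  # .df() returns a pandas DataFrame (pandas must be installed)

print(daily_revenue.tail())
```

The same script runs unchanged on a laptop, in a cron job, or on a small VM; the "platform" is an import statement, and the data never leaves the object store you're already paying for.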
The remaining 5%? That's where things get philosophical. Yes, that 0.1% difference can revolutionize industries or determine Olympic gold. But here's the rational calculation most companies refuse to make: is burning millions in pursuit of that marginal gain actually moving your business forward? Or are you sacrificing business velocity—actual customer value, product improvements, market expansion—on the altar of theoretical optimization?
I've watched startups with 50 customers deploy the same observability stack as Netflix. I've seen mid-market companies implement data platforms that could handle Twitter's firehose when their actual volume would fit in a PostgreSQL instance. It's like buying a Formula 1 pit crew for your daily commute.
The hard truth? Most companies already have the data they need to transform their business. It's sitting in their transaction logs, their customer support tickets, their basic user analytics. It's not sexy. It's not "AI-ready." It won't win any engineering blog posts. But it's real, it's relevant, and it's directly tied to why their business exists in the first place.
Stick to your own data business. Master the fundamentals with simple, affordable tools. Let your competitors chase infinitesimal optimizations while you ship features customers actually want. The revolution isn't in measuring everything—it's in measuring what matters and acting on it faster than everyone else.
The best telemetry is the kind that pays for itself by next quarter, not next decade.