
Even Netflix struggles to identify and understand the cost of its AWS estate

If you have trouble keeping track of your various streaming subscriptions, you're gonna love the irony


Keeping track of the amount of cloudy resources an org uses, and the cost of doing so, is notoriously tricky – so tricky, indeed, that even Netflix isn't on top of it.

We know this because on Wednesday US time the vid-streamer blogged about its cloud efficiency measures.

The post – penned by senior analytics engineer "Jennifer H" and Pallavi Phadnis, who describes her role as "Data" – opens by noting Netflix's well-known use of Amazon Web Services (AWS) for its cloud infrastructure needs, and that its engineering teams have self-service tools they can use to provision apps in the cloud.

The pair also reveal that Netflix operates a Platform DSE (Data Science Engineering) team, which helps engineering teams "to understand what resources they're using, how effectively and efficiently they use those resources, and the cost associated with their resource usage."

The Platform DSE team's goal is helping "downstream consumers to make cost conscious decisions using our datasets."

To assist in that goal, it's created two tools:

  1. A Foundational Platform Data (FPD) component that "provides a centralized data layer for all platform data, featuring a consistent data model and standardized data processing methodology."
  2. A Cloud Efficiency Analytics (CEA) tool that is built on top of FPD and "offers an analytics data layer that provides time series efficiency metrics across various business use cases."

FPD consumes data fed from applications like Apache Spark, which records how long cores are allocated to jobs and the amount of data read. CEA is then sent "inventory, ownership, and usage data and applies the appropriate business logic to produce cost and ownership attribution at various granularities," the post explains.
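For readers who want a feel for what that attribution step might involve, here is a minimal sketch in Python. It is not Netflix's code: the job names, team names, dollar figure, and the proportional-split rule are all assumptions made for illustration, and the post itself notes that real cost heuristics differ on every platform.

```python
from collections import defaultdict

# Hypothetical usage records: (job name, owning team, core-seconds allocated).
# None of these names or numbers come from Netflix's post.
USAGE = [
    ("etl-nightly", "team-a", 7200),
    ("ad-hoc-query", "team-b", 1800),
    ("model-train", "team-a", 3000),
]


def attribute_cost(usage, total_cost):
    """Split total_cost across owners in proportion to core-seconds used."""
    total_core_seconds = sum(core_seconds for _, _, core_seconds in usage)
    by_owner = defaultdict(float)
    for _job, owner, core_seconds in usage:
        by_owner[owner] += total_cost * core_seconds / total_core_seconds
    return dict(by_owner)


# Assume the platform's bill for the period was $120.
costs = attribute_cost(USAGE, total_cost=120.0)
print(costs)
```

A proportional split is the simplest possible rule. Shared services with several owners, which the post describes, would need something more elaborate.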

The datasets Netflix generates are highly complex "due to the breadth and scope of the business infrastructure and platform specific features."

"Services can have multiple owners, cost heuristics are unique to each platform, and the scale of infra data is large," Jennifer H and Pallavi Phadnis wrote, before explaining Netflix's platforms often have customizations that mean the Platform DSE team always has plenty to do – including regular audits.

"Maintaining data completeness while ensuring correctness becomes challenging due to upstream latency and required transformations to have the data ready for consumption," they explained.

Their work therefore continues, with both FPD and CEA under development and Netflix "striving for nearly complete cost insight coverage in the upcoming year."

It gets better. The post concludes by revealing Netflix's intention to "move towards proactive approaches via predictive analytics and ML for optimizing usage and detecting anomalies in cost."
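The post does not say which techniques Netflix plans to use. As a much simpler statistical stand-in for "detecting anomalies in cost", the sketch below flags any day whose spend sits far outside the trailing week's range. The window size, threshold, and sample figures are assumptions, not anything Netflix has published.

```python
from statistics import mean, stdev


def flag_anomalies(daily_costs, window=7, threshold=3.0):
    """Return indices of days whose cost is more than `threshold` standard
    deviations away from the mean of the preceding `window` days."""
    flagged = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip perfectly flat histories to avoid dividing attention by zero spread.
        if sigma > 0 and abs(daily_costs[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Made-up daily spend in dollars, with one obvious spike on day index 7.
spend = [100, 102, 98, 101, 99, 103, 100, 180, 101]
anomalies = flag_anomalies(spend)
print(anomalies)
```

Note that the spike then inflates the trailing window for the following days, which is one reason production systems tend to reach for more robust methods.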

You read that right: Netflix, one of the most famous users of public cloud, isn't in total control of its cloud spend and needs to get better at detecting anomalies.

So you're not alone if you struggle to do so, too. ®
