This week, I ran a simple audit on my standard Kubernetes lab cluster.

Two commands. 30 seconds. One shocking discovery.

Command 1: See what pods REQUEST

kubectl describe nodes | grep -A 5 "Allocated resources"

Command 2: See what pods ACTUALLY USE

kubectl top nodes

Here’s what I found:

Requested: 18% of cluster capacity
Actually using: 1% of cluster capacity

Gap: 18x over-provisioned

Translation: I could fit 18x more workloads on the same hardware.

Or said differently: I’m wasting 93% of what I requested.

This is NORMAL

Before you think “that’s just a standard K8s lab cluster,” let me be clear:

This is the industry standard.

I’ve audited 30+ clusters now. Startups, mid-size companies, enterprises.

The pattern is consistent:

  • Smallest waste: 5x over-provisioned

  • Average waste: 15-30x over-provisioned

  • Largest waste: 47x over-provisioned

My 18x? Right in the middle.

Most teams have no idea this is happening.

Why This Happens

It’s not your fault.

Default Manifests Are Ridiculously High

Real examples from my cluster:

CoreDNS:

  • Requests: 100m CPU

  • Actually uses: 2m CPU

  • Waste: 98% (50x over-provisioned)

kube-proxy:

  • Requests: 100m CPU

  • Actually uses: 1m CPU

  • Waste: 99% (100x over-provisioned)

metrics-server:

  • Requests: 100m CPU

  • Actually uses: 3m CPU

  • Waste: 97% (33x over-provisioned)

These are SYSTEM PODS from Kubernetes itself.

If the defaults are this wasteful, what chance do you have?
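Want to check your own system pods? Put their requests and live usage side by side with something like this (a sketch; kubectl top needs metrics-server installed):

# What kube-system pods request
kubectl -n kube-system get pods -o custom-columns='NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu'

# What they actually use
kubectl -n kube-system top pods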

Documentation Doesn’t Teach Sizing

The Kubernetes docs tell you WHAT requests and limits are.

They don’t tell you HOW to size them.

You’ll read things like:

  • “Specify how much of each resource a container needs”

  • “Define resource requests based on your application”

Okay. But HOW?

How do I know what my application needs?

Do I guess? Copy someone else’s manifest? Use the defaults?

The docs don’t say.

So everyone does what’s easiest: copy-paste and move on.

Nobody Measures Actual Usage

Here’s what actually happens:

  1. You find a manifest on GitHub

  2. You copy the resource requests

  3. You deploy it

  4. It works

  5. You never touch it again

The step everyone skips: measuring actual resource usage and adjusting.

Why?

Because nobody told you to do it.

Because you don’t know how.

Because you’re busy with “more important” things.

Because the default worked, so why mess with it?

And that’s how you end up with 18x over-provisioning.

What This Actually Costs

“My cluster is small. 18x waste doesn’t matter, right?”

Let me show you.

My Standard K8s Lab Cluster (2 nodes, 8 pods):

Current state:

  • Requesting: 700m CPU

  • Using: 50m CPU

  • Waste: 650m CPU

At this small scale, I’m already at minimum HA config (2 nodes). Can’t reduce nodes further. Waste is “locked in.”

Cost impact now: $0

But watch what happens when this scales...

At 50 Pods (Same Waste Pattern):

Over-provisioned scenario:

  • 50 pods × 87.5m average request = 4,375m CPU needed

  • Nodes needed: 3

  • Cost: $90/month

Right-sized scenario:

  • 50 pods × 5m actual × 1.5 buffer = 375m CPU needed

  • Nodes needed: 2

  • Cost: $60/month

Difference: $30/month = $360/year

At 100 Pods:

Over-provisioned: 6 nodes = $180/month
Right-sized: 2 nodes = $60/month

Difference: $120/month = $1,440/year

At 200 Pods:

Over-provisioned: 11 nodes = $330/month
Right-sized: 3 nodes = $90/month

Difference: $240/month = $2,880/year

The pattern scales linearly.

Fix it at 10 pods, save $3,000/year at 200 pods.

The Hidden Danger: Memory Under-Provisioning

While looking at CPU, I noticed something else.

Memory situation (opposite of CPU):

Requested: 540Mi (8% of capacity)
Actually using: 998Mi (15% of capacity)

Memory is UNDER-provisioned: actual usage runs about 85% above what’s requested.

This is dangerous.

Why This Happens:

Most developers:

  1. Copy-paste CPU requests from docs (way too high)

  2. Set memory to 0 or leave it blank (risky)

Result:

  • CPU: Massively over-provisioned (wasted money)

  • Memory: Under-provisioned (risk of OOM kills)

The fix: Measure BOTH resources.
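A quick way to look at both at once (a sketch; kubectl top needs metrics-server):

# Actual CPU and memory per pod
kubectl top pods --all-namespaces

# Memory requests, to compare against the usage above
kubectl get pods --all-namespaces -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,MEM_REQ:.spec.containers[*].resources.requests.memory'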

How to Check Yours (Right Now)

Stop reading. Run these two commands.

Step 1: See what’s requested

kubectl describe nodes | grep -A 5 "Allocated resources"

Look for the percentage:

cpu   700m (18%)   0 (0%)

That 18% is what the scheduler sees.

Step 2: See what’s actually used

kubectl top nodes

Look at CPU%:

NAME     CPU(cores)   CPU%
node-1   20m          1%

That 1% is reality.

Step 3: Calculate the gap

Scheduler view: 18%
Reality: 1%

Gap: 18 ÷ 1 = 18x over-provisioned
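If you want the division done for you, plug your two percentages into something like this (a sketch; the numbers below are from my cluster):

requested=18   # from "Allocated resources"
used=1         # from kubectl top nodes
awk -v r="$requested" -v u="$used" 'BEGIN { printf "Gap: %.1fx over-provisioned\n", r/u }'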

What Your Numbers Mean

If your gap is:

1-3x: You’re doing better than 95% of clusters.

5-10x: Normal. Room for optimization, not urgent.

10-30x: Industry average. Significant savings available.

30-50x: Extreme waste. Fix this immediately.

50x+: Something is very wrong.

Reply with your numbers - I’ll tell you what they mean.

How to Fix It This Week

Here’s what you’re going to do.

I’ve broken it down by day. Time estimates included.

Follow the order. Don’t jump around.

Step 1: Identify Your Top 5 Wasteful Pods (10 minutes - Today)

Run this right now:

kubectl top pods --all-namespaces --sort-by=cpu

Look at the output. Pick the 5 pods using the most CPU.

Grab a notepad. Write down:

  • Pod name

  • What it’s actually using

  • What it’s requesting (check the manifest, or pull it with the command below)

Do this before you close this email.
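If you don’t feel like opening manifests to find the requests, you can pull them straight from the cluster (a sketch; the pod name and namespace are placeholders):

kubectl get pod <pod-name> -n <namespace> -o jsonpath='{range .spec.containers[*]}{.name}{": cpu="}{.resources.requests.cpu}{" memory="}{.resources.requests.memory}{"\n"}{end}'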

Step 2: Measure Peak Usage (Next 7 Days)

One measurement tells you nothing. You need the pattern.

For each of your 5 pods, run this throughout the week:

kubectl top pod <pod-name> -n <namespace>

Check it at different times. Traffic patterns matter.

I usually check:

  • Monday morning (things are slow)

  • Wednesday lunch (everything’s on fire)

  • Thursday evening (wrapping up the week)

  • Saturday afternoon (see if weekend is different)

Write down the peak. That’s what matters.
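If you’d rather not rely on remembering to check, a small loop can log usage for you to review at the end of the week (a sketch; adjust the interval, swap in your pod and namespace, and run it somewhere that stays up, like a tmux session):

# Log CPU/memory every 5 minutes
while true; do
  echo "$(date -u +%FT%TZ) $(kubectl top pod <pod-name> -n <namespace> --no-headers)" >> usage-log.txt
  sleep 300
done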

Step 3: Calculate New Requests (5 minutes - Next Friday)

Now do the math.

Take your peak usage. Multiply by 1.5. That’s your new request.

Example:

  • Peak usage: 40m CPU

  • Calculation: 40m × 1.5

  • New request: 60m

Simple.

Why 1.5x?

You need headroom for:

  • Traffic spikes you didn’t catch

  • That random Tuesday when everything breaks

  • Growth over the next few months

But you don’t need 10x. You’re not running a nuclear reactor.

1.5x keeps you safe without the insane waste.

Open a spreadsheet. Track all 5 pods. You’ll need this.
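If you’d rather let the shell do the arithmetic, something like this works (a sketch; the peak values are placeholders in millicores):

# New request = peak × 1.5, rounded up
for peak in 40 120 15 300 8; do
  echo "peak=${peak}m -> request=$(( (peak * 3 + 1) / 2 ))m"
done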

Step 4: Update Pod Manifests (1 hour - Next Weekend)

Let’s make some changes.

Update the requests for each pod as shown below.

Current:

resources:
  requests:
    cpu: 500m      # Copied from some blog post
    memory: 256Mi  # Someone's guess

Change to:

resources:
  requests:
    cpu: 60m       # Based on actual × 1.5
    memory: 120Mi  # Based on actual × 1.5
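If the pod is managed by a Deployment, you can make the same change from the command line instead of editing YAML (a sketch; the names are placeholders, and the change triggers a rolling restart):

kubectl -n <namespace> set resources deployment/<deployment-name> --requests=cpu=60m,memory=120Mi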

Important: Start with ONE pod. Not all 5.

Test one. Make sure it works. Then move to the next.

Step 5: Deploy and Monitor (2-3 hours - Over Next 2 Weeks)

Deploy your first optimized pod to staging.

Wait 48 hours. Watch for problems:

CPU throttling? Requests are too low. Bump them up by 20%. Try again.

OOM kills? Memory is too low. Bump it up by 20%. Try again.

Everything running normally? You nailed it. Move to production.
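Two quick checks that help during that 48-hour window (a sketch; the namespace is a placeholder):

# Any containers OOMKilled since the change?
kubectl get pods -n <namespace> -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}'

# A climbing RESTARTS count is another sign of trouble
kubectl get pods -n <namespace>

For CPU throttling, if you already run Prometheus, the cAdvisor metric container_cpu_cfs_throttled_periods_total trending up for the pod’s containers is a useful signal.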

Once it’s stable in production for 3 days, start on pod #2.

Repeat until all 5 are done.

Total timeline: 2-3 weeks for all 5 pods
Total time investment: 4-6 hours
Potential savings: $360-$2,880/year

Start today. Run that first command.

The Real-World Impact

Let me show you what this looks like.

Scenario: 50-Pod Cluster (Typical Mid-Size Startup)

Before optimization:

  • Average request per pod: 87.5m CPU

  • Total requested: 4,375m CPU

  • Nodes needed: 3

  • Monthly cost: $90

After optimization:

  • Average request per pod: 7.5m CPU (actual × 1.5)

  • Total requested: 375m CPU

  • Nodes needed: 2

  • Monthly cost: $60

Savings: $30/month = $360/year

Time to implement: 4-6 hours over 2 weeks

ROI: Pays for itself in 2 months

When That Startup Scales to 200 Pods:

Without early optimization:

  • Still using 87.5m average requests

  • Needs: 11 nodes

  • Cost: $330/month

With early optimization:

  • Using 7.5m average requests

  • Needs: 3 nodes

  • Cost: $90/month

Difference: $240/month = $2,880/year

The work you do at 50 pods pays back 8x by the time you hit 200 pods.

Why Companies Don’t Fix This

I hear the same objections every time.

“We’ll optimize when we scale”

By the time you “need” to optimize, you have:

  • 200+ pod manifests to update

  • Complex dependencies

  • Production workloads you’re afraid to touch

  • 6-month backlog of “more important” work

Result: It never happens.

Better approach: Fix it now when you have 10-50 pods. Takes 4 hours, not 4 weeks.

“It’s only $30/month”

Today: $30/month (annoying, not urgent)

In 6 months at scale: $240/month (urgent, but now 10x harder to fix)

The optimization is the same effort.

Do it when it’s easy, not when it’s desperate.

“We have bigger problems”

Do you?

4 hours of work = $360-$2,880/year saved.

That’s $90-720/hour ROI.

Show me the “bigger problem” with better ROI.

Your Turn

Run the audit. Reply with your numbers.

Format:

Requested: X%
Actual: Y%
Gap: Zx

I’ll tell you:

  • Whether this is normal

  • What it’s costing you

  • Whether you should fix it

And if your gap is over 30x, I’ll show you exactly where to start.

Want to stop wasting 93% of your Kubernetes capacity?

Start with the 30-second audit.

Then fix your top 5 pods.

4 hours of work. $2,880/year saved.

Or keep doing what everyone else does:

Copy-paste defaults. Deploy. Never look back. Wonder why AWS bills keep growing.

Your choice.

- Naveen
