Cloud · 7 min read

How I cut a startup's AWS Lambda bill by 67%

A practical walkthrough of the four biggest cost wins on serverless: right-sizing memory, ARM architecture, provisioned concurrency, and the invocation patterns nobody profiles.

Ali Razzaq
Full-Stack Developer · Lahore, Pakistan

A client came to me with a $3,400/month Lambda bill on a product doing roughly 4 million invocations a month. After two weeks of work, the bill dropped to $1,120. Same product. Same traffic. Same SLOs.

Here's exactly what changed and how to find these wins in your own infrastructure.

1. Memory right-sizing (saved $940/mo)

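Lambda bills compute in GB-seconds (configured memory × duration), plus a small per-request fee. Most teams set memory based on a guess and never revisit it. The trick: more memory can mean lower cost, because CPU scales with memory and the function may finish fast enough to offset the higher per-millisecond rate.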

I used AWS Lambda Power Tuning to find the sweet spot for each function. Results varied wildly: one function got cheaper at 1024 MB despite using only 280 MB, because the CPU boost cut runtime by 60%.

bash
# Start a Lambda Power Tuning run; POWER_TUNING_ARN is the ARN of the deployed state machine
aws stepfunctions start-execution \
  --state-machine-arn "$POWER_TUNING_ARN" \
  --input '{
    "lambdaARN": "arn:aws:lambda:us-east-1:123:function:api-handler",
    "powerValues": [128, 256, 512, 1024, 2048],
    "num": 50,
    "strategy": "balanced"
  }'
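To see why a higher memory setting can lower the bill, here is a rough sketch of the duration-cost arithmetic. The price constant is the x86 on-demand rate for us-east-1 as I understand it, and the 512 MB baseline and 1,000 ms runtime are hypothetical numbers chosen for illustration, not the client's figures.

```python
# Assumed price: x86 on-demand duration rate in us-east-1 (USD per GB-second).
# Rates vary by region and change over time; check the current pricing page.
PRICE_PER_GB_SECOND = 0.0000166667


def duration_cost(memory_mb: int, duration_ms: float, invocations: int) -> float:
    """Return the duration charge in USD (excludes the per-request fee)."""
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000) * invocations
    return gb_seconds * PRICE_PER_GB_SECOND


# Hypothetical function: 1,000 ms at 512 MB, 400 ms (60% faster) at 1024 MB.
before = duration_cost(memory_mb=512, duration_ms=1000, invocations=1_000_000)
after = duration_cost(memory_mb=1024, duration_ms=400, invocations=1_000_000)

print(f"512 MB:  ${before:.2f} per million invocations")
print(f"1024 MB: ${after:.2f} per million invocations")
print(f"Change:  {(after - before) / before:+.0%}")
```

Doubling memory doubles the per-millisecond rate, so it only pays off when the runtime drops by more than half. In this sketch the cost ratio is 2 × 0.4 = 0.8, a 20% reduction in duration charges.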

2. ARM (Graviton) architecture (saved $480/mo)

AWS charges about 20% less per GB-second for arm64 Lambda functions than for x86; the per-request fee is the same. For Node.js, Python, and Java workloads, this is usually a free win: no code changes needed. Just flip the architecture flag.

yaml
# AWS SAM template
Resources:
  ApiHandler:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64  # Was: x86_64
      Runtime: nodejs20.x

Caveat: any native dependencies need ARM-compatible builds. Most popular npm/PyPI packages have these by 2026, but watch for any that don't.
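One way to act on the caveat above is to scan the deployment package for compiled binaries before switching architectures. This is a heuristic sketch, not a complete check: the function name is hypothetical, and it only looks at file extensions (`.node` for Node.js native addons, `.so` for shared objects such as Python C extensions), so versioned names like `libfoo.so.1` would be missed.

```python
import os
from pathlib import Path

# Extensions that usually indicate compiled, architecture-specific code:
# .node = Node.js native addons, .so = shared objects (e.g. Python C extensions).
NATIVE_SUFFIXES = {".node", ".so"}


def find_native_binaries(root: str) -> list[str]:
    """Return sorted relative paths of files under root that look like native binaries."""
    root_path = Path(root)
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root_path):
        for name in filenames:
            if Path(name).suffix in NATIVE_SUFFIXES:
                hits.append(str((Path(dirpath) / name).relative_to(root_path)))
    return sorted(hits)
```

Calling `find_native_binaries("node_modules")` or pointing it at a Python `site-packages` directory gives a short list of packages to verify on arm64. An empty list suggests the switch is likely safe, though it does not prove it.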

3. Cold start mitigation (saved $620/mo)

We were running 12 functions, each with provisioned concurrency of 2 — a static $1,200/mo bill. After profiling, only 3 functions had cold-start-sensitive traffic. We moved the others to on-demand and used SnapStart for the Java function.

  • Audit which functions actually need warm starts (look at p99 latency)
  • Use SnapStart for Java: restores in <500ms, with no additional charge on Java runtimes
  • Replace provisioned concurrency with scheduled warming for predictable peaks
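The audit in the first bullet can be sketched as a simple check: does a function miss its latency target only because of cold starts? The record shape, the function names, and the 300 ms target below are assumptions for illustration. In practice the durations could be pulled from the Lambda `REPORT` log lines, which include an `Init Duration` field on cold starts.

```python
import math
from dataclasses import dataclass


@dataclass
class Invocation:
    duration_ms: float      # handler duration
    init_ms: float = 0.0    # init (cold start) time; 0 for warm invocations


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def cold_starts_break_slo(invocations: list[Invocation], slo_ms: float) -> bool:
    """True if p99 latency misses the target only because of cold starts."""
    with_cold = percentile([i.duration_ms + i.init_ms for i in invocations], 99)
    warm_only = percentile([i.duration_ms for i in invocations], 99)
    return warm_only <= slo_ms < with_cold


# Hypothetical sample: 97 warm invocations at 80 ms, 3 cold ones with 900 ms init.
sample = [Invocation(80.0)] * 97 + [Invocation(80.0, init_ms=900.0)] * 3
print(cold_starts_break_slo(sample, slo_ms=300))
```

A function that returns `False` here either meets its target even with cold starts or is slow when warm, and in both cases provisioned concurrency is unlikely to be what it needs.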

4. Invocation pattern audit (saved $240/mo)

The sneakiest win: we found a function being called 800k times per day to check a status that changed maybe twice a day. Caching the result in DynamoDB with a 5-minute TTL eliminated 99% of those calls.
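The caching pattern can be sketched as a read-through cache with an expiry timestamp. This is not the client's code: the class name is hypothetical and a plain in-memory dict stands in for the DynamoDB item so the example runs on its own. The expiry check happens on read because, as far as I know, DynamoDB's TTL feature deletes expired items asynchronously rather than at the exact expiry time.

```python
import time
from typing import Callable, Optional


class TtlCache:
    """Read-through cache for one value; a dict stands in for the DynamoDB item."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float = 300,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch        # the expensive call (e.g. invoking the Lambda)
        self._ttl = ttl_seconds    # 300 s = the 5-minute TTL from the article
        self._clock = clock        # injectable so the example can be tested
        self._item: Optional[dict] = None

    def get(self) -> str:
        now = self._clock()
        # Check expiry on read rather than relying on the store to evict.
        if self._item is None or now >= self._item["expires_at"]:
            self._item = {"value": self._fetch(), "expires_at": now + self._ttl}
        return self._item["value"]
```

With a 5-minute TTL, a single shared cache entry refreshes at most 288 times a day (86,400 s / 300 s), however many callers read it. The trade-off is that a status change can take up to five minutes to become visible.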

What I'd recommend for any team

Start with Power Tuning: it's the highest-ROI change you can make in an afternoon. Then look at architecture (one line of YAML for roughly 20% off duration charges). Provisioned concurrency takes the most digging; audit which functions actually need it before paying for it.

Tags: AWS · Lambda · Serverless · Cost optimization