How I cut a startup's AWS Lambda bill by 67%
A practical walkthrough of the four biggest cost wins on serverless: right-sizing memory, ARM architecture, provisioned concurrency, and the invocation patterns nobody profiles.
A client came to me with a $3,400/month Lambda bill on a product doing roughly 4 million invocations a month. After two weeks of work, the bill dropped to $1,120. Same product. Same traffic. Same SLOs.
Here's exactly what changed and how to find these wins in your own infrastructure.
1. Memory right-sizing (saved $940/mo)
Lambda bills duration in GB-seconds (configured memory × execution time), plus a flat per-request fee. Most teams set memory based on a guess and never revisit it. The trick: more memory can mean lower cost, because CPU scales with memory and a CPU-bound function finishes faster.
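To make the memory × duration trade-off concrete, here is a minimal sketch of the billing arithmetic. The rates are assumptions based on the published us-east-1 x86 on-demand prices at the time of writing, so check the current pricing page before relying on them. The two configurations and their durations are hypothetical, chosen to mirror a 60% runtime cut when memory doubles. The free tier and 1 ms billing rounding are ignored.

```python
# Assumed us-east-1 x86 on-demand rates; verify against current AWS pricing.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_REQUEST = 0.20 / 1_000_000


def monthly_cost(memory_mb: int, avg_duration_ms: float, invocations: int) -> float:
    """Estimate monthly Lambda cost: GB-seconds of duration plus request fees."""
    gb_seconds = (memory_mb / 1024) * (avg_duration_ms / 1000) * invocations
    return gb_seconds * PRICE_PER_GB_SECOND + invocations * PRICE_PER_REQUEST


# Hypothetical function: 1,000 ms at 512 MB versus 400 ms at 1024 MB (a 60% cut).
small = monthly_cost(memory_mb=512, avg_duration_ms=1000, invocations=4_000_000)
large = monthly_cost(memory_mb=1024, avg_duration_ms=400, invocations=4_000_000)

print(f"512 MB:  ${small:.2f}/month")
print(f"1024 MB: ${large:.2f}/month")
```

In this made-up case the larger setting bills 0.4 GB-seconds per invocation instead of 0.5, so doubling memory lowers the duration charge by a fifth. The break-even rule is simple: doubling memory pays off whenever it cuts runtime by more than half.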
I used AWS Lambda Power Tuning to find the sweet spot for each function. Results varied wildly — one function got cheaper at 1024MB despite using only 280MB, because the CPU boost cut runtime by 60%.
# Lambda Power Tuning state machine
aws stepfunctions start-execution \
  --state-machine-arn "$POWER_TUNING_ARN" \
  --input '{
    "lambdaARN": "arn:aws:lambda:us-east-1:123:function:api-handler",
    "powerValues": [128, 256, 512, 1024, 2048],
    "num": 50,
    "strategy": "balanced"
  }'

2. ARM (Graviton) architecture (saved $480/mo)
AWS charges 20% less per GB-second of duration for arm64 Lambda functions than for x86; the per-request fee is the same. For Node.js, Python, and Java workloads, this is usually a free win with no code changes needed. Just flip the architecture flag.
# AWS SAM template
Resources:
  ApiHandler:
    Type: AWS::Serverless::Function
    Properties:
      Architectures:
        - arm64  # Was: x86_64
      Runtime: nodejs20.x

Caveat: any native dependencies need ARM-compatible builds. Most popular npm/PyPI packages have these by 2026, but watch for any that don't.
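One way to catch the native-dependency caveat before deploying is to scan the unpacked deployment package for compiled binaries built for the wrong architecture. The sketch below is an assumption about how such a check could look, not part of the original audit: it reads the `e_machine` field of each ELF header (62 for x86-64, 183 for AArch64, per the ELF specification). The function names are hypothetical.

```python
import os
import struct
from typing import List, Optional, Tuple

# ELF e_machine values: EM_X86_64 = 62 (0x3E), EM_AARCH64 = 183 (0xB7).
ELF_MACHINES = {0x3E: "x86_64", 0xB7: "arm64"}


def elf_architecture(path: str) -> Optional[str]:
    """Return the architecture of an ELF binary, or None if the file is not ELF."""
    try:
        with open(path, "rb") as f:
            header = f.read(20)
    except OSError:
        return None  # unreadable file or broken symlink
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # Byte 5 (EI_DATA) encodes endianness: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 2-byte field at offset 18 of the ELF header.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"other(0x{machine:x})")


def find_non_arm_binaries(package_dir: str) -> List[Tuple[str, str]]:
    """List (path, architecture) for every ELF file under package_dir that is not arm64."""
    offenders = []
    for root, _dirs, files in os.walk(package_dir):
        for name in files:
            path = os.path.join(root, name)
            arch = elf_architecture(path)
            if arch is not None and arch != "arm64":
                offenders.append((path, arch))
    return sorted(offenders)
```

Running `find_non_arm_binaries` against the build directory (for example, the folder that gets zipped for upload) should return an empty list before the architecture flag is flipped. Any `.so` or `.node` file it reports needs an arm64 build.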
3. Cold start mitigation (saved $620/mo)
We were running 12 functions, each with provisioned concurrency of 2 — a static $1,200/mo bill. After profiling, only 3 functions had cold-start-sensitive traffic. We moved the others to on-demand and used SnapStart for the Java function.
- Audit which functions actually need warm starts (look at p99 latency)
- Use SnapStart for Java — restores in <500ms with no per-instance cost
- Replace provisioned concurrency with scheduled warming for predictable peaks
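The audit step in the list above can be sketched from logs alone. Lambda writes a `REPORT` line per invocation to CloudWatch Logs and includes an `Init Duration` field only when that invocation was a cold start. The helper below is a minimal, hypothetical example that assumes the `REPORT` lines have already been exported as plain strings. It reports how often a function cold-starts and how slow the worst one was.

```python
import re
from typing import Dict, Iterable, Optional, Union

# Matches the start of a Lambda REPORT line and captures the handler duration.
REPORT_RE = re.compile(r"REPORT RequestId: \S+\s+Duration: ([\d.]+) ms")
# "Init Duration" is present only on invocations that were cold starts.
INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")

Stats = Dict[str, Union[int, float, None]]


def cold_start_stats(log_lines: Iterable[str]) -> Stats:
    """Summarize cold starts across a set of Lambda REPORT log lines."""
    invocations = 0
    init_durations = []
    for line in log_lines:
        if not REPORT_RE.search(line):
            continue  # skip START/END lines and application logs
        invocations += 1
        init = INIT_RE.search(line)
        if init:
            init_durations.append(float(init.group(1)))
    max_init: Optional[float] = max(init_durations) if init_durations else None
    rate = len(init_durations) / invocations if invocations else 0.0
    return {
        "invocations": invocations,
        "cold_starts": len(init_durations),
        "cold_start_rate": rate,
        "max_init_ms": max_init,
    }
```

A function with a low cold-start rate, or a worst-case init time that still fits within its latency target, is a candidate for moving back to on-demand. What counts as low is a judgment call for each SLO.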
4. Invocation pattern audit (saved $240/mo)
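The sneakiest win: we found a function being called 800k times per day to check a status that changed maybe twice a day. Caching the result in DynamoDB with a 5-minute expiry eliminated 99% of those calls. The expiry is checked on read, because DynamoDB's native TTL deletes expired items lazily rather than at the exact timestamp.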
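The caching pattern is a plain cache-aside read with an explicit expiry timestamp. The sketch below uses an in-memory dict as a stand-in for the DynamoDB table so that it runs anywhere. In a real setup the `cache.get` and the assignment would become `GetItem` and `PutItem` calls. The `get_status` and `fetch` names are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple

CACHE_TTL_SECONDS = 300  # the 5-minute expiry described above


def get_status(
    cache: Dict[str, Tuple[str, float]],
    key: str,
    fetch: Callable[[], str],
    now: Callable[[], float] = time.time,
) -> str:
    """Return a cached status, calling fetch() only when the entry is missing or expired."""
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        # Compare against the stored expiry on every read instead of
        # relying on the store to delete stale entries in time.
        if now() < expires_at:
            return value
    value = fetch()  # the expensive call, e.g. invoking the status Lambda
    cache[key] = (value, now() + CACHE_TTL_SECONDS)
    return value
```

With a 5-minute expiry, a single key is refreshed at most about 288 times a day (24 × 60 / 5), no matter how many readers ask. That is why a function polled hundreds of thousands of times a day nearly disappears from the bill.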
What I'd recommend for any team
Start with Power Tuning: it's the highest-ROI change you can make in an afternoon. Then look at architecture (one line of YAML for 20% off duration charges). Provisioned concurrency takes the longest to untangle; audit it carefully before paying for it.