FFmpeg on Serverless & Lambda: Challenges and Solutions
Why running FFmpeg on AWS Lambda, Vercel, and Cloudflare Workers is painful — binary size, timeouts, memory, cold starts — and practical alternatives.

Serverless platforms promise infinite scale and zero ops. So when you need to process video — trim a clip, generate a thumbnail, transcode an upload — the obvious thought is: just run FFmpeg on Lambda. How hard can it be?
Very hard, as it turns out. This guide walks through the real challenges of running FFmpeg on serverless platforms, the common workarounds, where they fall short, and when it makes sense to use a dedicated API instead.
Why FFmpeg on Serverless Is Hard
1. Binary Size — 70 MB+ Before You Start
A statically compiled FFmpeg binary with common codecs (H.264, H.265, VP9, AAC, Opus) is 70-100 MB. Add libraries like libass (subtitles) or libfdk-aac (high-quality AAC) and you're looking at 120 MB+.
Platform limits make this painful:
| Platform | Deployment Size Limit | Includes |
|---|---|---|
| AWS Lambda (zip) | 50 MB (250 MB unzipped) | Code + dependencies + binary |
| AWS Lambda (container) | 10 GB | More room, but slower cold start |
| Vercel Serverless Functions | 50 MB | Compressed |
| Cloudflare Workers | 10 MB (with paid plan) | No native binary support |
| Google Cloud Functions | 500 MB (source) | More lenient |
On AWS Lambda with zip deployment, the FFmpeg binary alone eats most of your 50 MB budget. Your actual application code competes for the remaining space.
2. Execution Timeout — 15 Minutes Max
AWS Lambda's maximum execution time is 15 minutes. Other platforms are even stricter:
| Platform | Max Timeout |
|---|---|
| AWS Lambda | 15 min |
| Google Cloud Functions | 9 min (60 min 2nd gen) |
| Vercel Serverless | 5 min (Pro), 60s (Hobby) |
| Cloudflare Workers | 30s (standard), 15 min (Workflows) |
A 15-minute timeout sounds generous — until you realize what FFmpeg processing actually takes:
| Task | 1080p, 10 min video | Time on 2 vCPU |
|---|---|---|
| Thumbnail extraction | 1 frame | < 1 sec |
| Audio extraction | Stream copy | < 5 sec |
| Transcode to H.264 | CRF 23, medium | 8-12 min |
| Transcode to H.265 | CRF 28, medium | 15-25 min |
| Add hardcoded subtitles | Re-encode | 10-15 min |
| Transcode to VP9 | CRF 30 | 20-40 min |
Simple tasks like thumbnails and audio extraction fit easily. But any real transcoding of videos longer than a few minutes will hit Lambda's timeout — or come dangerously close. For a deeper look at encoding settings that affect processing time, see our video compression best practices.
3. Memory Limits — No Room for Big Files
Video processing is memory-hungry. FFmpeg buffers input and output data, and complex filters hold multiple frames in memory simultaneously.
| Platform | Max Memory |
|---|---|
| AWS Lambda | 10 GB |
| Google Cloud Functions | 32 GB (2nd gen) |
| Vercel Serverless | 3 GB |
| Cloudflare Workers | 128 MB |
AWS Lambda's 10 GB ceiling sounds fine, but memory directly affects cost: Lambda pricing is per GB-second. A function using 4 GB for 10 minutes (2,400 GB-seconds) costs roughly 640x more than one using 256 MB for 15 seconds (3.75 GB-seconds).
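To see how steep that curve is, here's a quick sketch of the GB-second math. The per-GB-second price is an assumption based on published x86 us-east-1 pricing at the time of writing; check current rates before relying on it:

```javascript
// Sketch: compare Lambda compute cost for two configurations.
// PRICE_PER_GB_SECOND is an assumption (x86, us-east-1); verify current pricing.
const PRICE_PER_GB_SECOND = 0.0000166667;

function lambdaComputeCost(memoryGb, durationSeconds) {
  return memoryGb * durationSeconds * PRICE_PER_GB_SECOND;
}

const transcode = lambdaComputeCost(4, 10 * 60); // 2,400 GB-s ≈ $0.04
const thumbnail = lambdaComputeCost(0.25, 15);   // 3.75 GB-s ≈ $0.00006

console.log(Math.round(transcode / thumbnail)); // ~640x cost difference
```

The ratio, not the absolute dollar amount, is the point: a heavy transcode at high memory is hundreds of times the cost of a lightweight task, before you even count the request charges.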
Real-world memory usage for FFmpeg tasks:
| Task | Typical Memory |
|---|---|
| Thumbnail extraction | 100-200 MB |
| Audio extraction | 50-150 MB |
| 1080p H.264 transcode | 500 MB - 1.5 GB |
| 4K H.264 transcode | 2-4 GB |
| Complex filter chains | 1-4 GB |
4. No GPU Acceleration
Hardware-accelerated encoding (NVENC, QSV, VAAPI) dramatically speeds up video processing: 5-10x faster for H.264/H.265. But serverless platforms don't offer GPU access.
This means you're stuck with CPU encoding, which is:
- 5-10x slower than GPU encoding
- More expensive per-video at scale
- More likely to hit timeout limits
5. Cold Start Penalty
Every cold start means loading the FFmpeg binary into the execution environment. A 70 MB+ binary adds 1-3 seconds of cold start latency on top of the platform's own initialization.
For container-based Lambda (needed for larger binaries), cold starts can reach 5-15 seconds.
6. Ephemeral Storage Constraints
Lambda provides /tmp with 512 MB by default (configurable to 10 GB at extra cost). Video files are large — a 10-minute 1080p video is 500 MB-1 GB. You need space for both input and output:
Input file: 500 MB
Output file: 300 MB
FFmpeg temp: 200 MB
─────────────────
Total: 1 GB ← exceeds default /tmp
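One way to avoid a mid-job failure is to check the disk budget up front, before invoking FFmpeg. A minimal sketch, assuming you can estimate input and output sizes from the upload's metadata (the function name and 200 MB scratch figure are illustrative, and the 512 MB default matches Lambda's base ephemeral storage):

```javascript
// Sketch: fail fast when a job won't fit in /tmp, instead of letting
// FFmpeg die mid-encode with a disk-full error.
const DEFAULT_TMP_MB = 512; // Lambda's default ephemeral storage

function fitsInTmp({ inputMb, outputMb, scratchMb = 200 }, tmpMb = DEFAULT_TMP_MB) {
  return inputMb + outputMb + scratchMb <= tmpMb;
}

// The budget from the example above: 500 + 300 + 200 = 1,000 MB
console.log(fitsInTmp({ inputMb: 500, outputMb: 300 }));        // false at 512 MB
console.log(fitsInTmp({ inputMb: 500, outputMb: 300 }, 2048));  // true with 2 GB configured
```

If the check fails you can either bump the function's ephemeral storage configuration or route the job to a different processing path.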
7. No Persistent Processes
FFmpeg often benefits from persistent processes — keeping the binary warm, maintaining decode caches, reusing connections. Serverless functions are stateless and ephemeral. Every invocation starts fresh.
Common Workarounds
Despite these challenges, teams do run FFmpeg on serverless. Here are the most common approaches.
Lambda Layers
AWS Lambda Layers let you package FFmpeg separately from your application code:
# Download a static FFmpeg build
wget https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz
tar xf ffmpeg-release-amd64-static.tar.xz
# Create Lambda layer structure
mkdir -p ffmpeg-layer/bin
cp ffmpeg-*-amd64-static/ffmpeg ffmpeg-layer/bin/
cd ffmpeg-layer && zip -r ../ffmpeg-layer.zip .
Then in your Lambda function:
const { execSync } = require('child_process');

exports.handler = async (event) => {
  // FFmpeg binary from the layer is at /opt/bin/ffmpeg.
  // The input must already exist in /tmp (e.g. downloaded from S3 first).
  execSync(
    '/opt/bin/ffmpeg -i /tmp/input.mp4 -vf "scale=-2:720" -crf 23 /tmp/output.mp4',
    { timeout: 300000 }
  );
  return { statusCode: 200 };
};
Limitations:
- Layer size still counts toward the 250 MB unzipped limit
- You need to strip FFmpeg to only the codecs you need
- Maintaining your own FFmpeg build is an ongoing burden
Docker Container Images
Lambda supports container images up to 10 GB, giving much more room:
FROM public.ecr.aws/lambda/nodejs:20

# Install a static FFmpeg build (ffmpeg + ffprobe only)
RUN yum install -y tar xz && \
    curl -L https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz | \
    tar xJ --strip-components=1 -C /usr/local/bin/ --wildcards '*/ffmpeg' '*/ffprobe'

COPY index.mjs ${LAMBDA_TASK_ROOT}/
CMD ["index.handler"]
Limitations:
- Cold starts are significantly longer (5-15 seconds)
- Larger images mean higher ECR storage costs
- You still face the 15-minute timeout and no GPU
ffmpeg.wasm — Browser/Edge FFmpeg
ffmpeg.wasm compiles FFmpeg to WebAssembly, making it run in browsers and edge runtimes:
import { FFmpeg } from '@ffmpeg/ffmpeg';
import { fetchFile } from '@ffmpeg/util';
const ffmpeg = new FFmpeg();
await ffmpeg.load();
await ffmpeg.writeFile('input.mp4', await fetchFile(videoUrl));
await ffmpeg.exec(['-i', 'input.mp4', '-vf', 'scale=-2:720', 'output.mp4']);
const data = await ffmpeg.readFile('output.mp4');
Limitations:
- 3-10x slower than native FFmpeg
- Limited codec support (no H.265, limited filter support)
- Memory-constrained in browser environments
- Can't handle files larger than a few hundred MB
- Single-threaded in most environments
Chunked Processing
Split large videos into chunks, process in parallel, then merge:
# Split into 2-minute segments
ffmpeg -i input.mp4 -c copy -segment_time 120 -f segment -reset_timestamps 1 chunk_%03d.mp4
// Process chunks in parallel Lambda invocations
const chunks = ['chunk_000.mp4', 'chunk_001.mp4', 'chunk_002.mp4'];
const results = await Promise.all(
  chunks.map(chunk => lambda.invoke({
    FunctionName: 'process-chunk',
    Payload: JSON.stringify({ chunk })
  }).promise())
);
// Merge results (requires another Lambda invocation)
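The merge step itself typically uses FFmpeg's concat demuxer, which reads a text file listing the chunks in order. A sketch of generating that list file and the merge command, using the chunk names from the example above (helper names are illustrative):

```javascript
// Sketch of the merge step: the concat demuxer reads a text file with
// one "file '<path>'" line per chunk, in playback order.
function concatListFile(chunks) {
  return chunks.map(c => `file '${c}'`).join('\n') + '\n';
}

function mergeArgs(listPath, output) {
  // -c copy works because the chunks share codecs; -safe 0 allows absolute paths
  return ['-f', 'concat', '-safe', '0', '-i', listPath, '-c', 'copy', output];
}

const chunks = ['chunk_000.mp4', 'chunk_001.mp4', 'chunk_002.mp4'];
console.log(concatListFile(chunks));
console.log(mergeArgs('/tmp/list.txt', '/tmp/merged.mp4').join(' '));
```

Because the merge is a stream copy it's fast, but it's also where boundary artifacts surface if the chunks were re-encoded with mismatched parameters.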
Limitations:
- Complex orchestration (Step Functions, SQS, etc.)
- Not all operations support chunking (e.g., subtitle timing spans chunks)
- Merge step adds latency and can introduce artifacts at chunk boundaries
- Total cost is often higher than a single long-running process
Stripped FFmpeg Builds
Compile FFmpeg with only the codecs you need to reduce binary size:
./configure \
  --disable-everything \
  --enable-decoder=h264,aac,mp3 \
  --enable-encoder=libx264,aac \
  --enable-muxer=mp4,mp3 \
  --enable-demuxer=mov,mp4,mp3 \
  --enable-protocol=file,pipe \
  --enable-filter=scale,overlay \
  --enable-gpl --enable-libx264
This can bring the binary down to 15-25 MB but means you lose functionality. Need VP9 support next month? Rebuild.
Benchmark: Lambda vs Dedicated Server
To illustrate the gap, here are benchmarks for a 5-minute 1080p video:
| Task | Lambda (3 GB, arm64) | EC2 c6g.large (2 vCPU) | EC2 g5.xlarge (GPU) |
|---|---|---|---|
| Thumbnail | 0.8s | 0.5s | 0.3s |
| Audio extract | 2s | 1.5s | 1.5s |
| H.264 CRF 23 | 4.5 min | 3.2 min | 25s |
| H.265 CRF 28 | 9 min | 7 min | 35s |
| VP9 CRF 30 | 12 min | 9 min | N/A |
Cost comparison for 1000 H.264 transcodes (5-min 1080p video each):
| Approach | Time per Video | Cost per Video | Total Cost |
|---|---|---|---|
| Lambda (3 GB) | 4.5 min | ~$0.022 | ~$22 |
| EC2 c6g.large (reserved) | 3.2 min | ~$0.004 | ~$4 |
| EC2 g5.xlarge (GPU) | 25s | ~$0.008 | ~$8 |
Lambda is roughly 5x more expensive than a reserved EC2 instance for sustained workloads, and loses the cost advantage once you process more than a handful of videos per hour.
When Serverless FFmpeg Makes Sense
Despite all the challenges, there are legitimate use cases:
- Thumbnail generation — Fast, low memory, well within limits
- Audio extraction with stream copy — Near-instant, minimal resources
- Short video clips (< 30 seconds) — Quick transcoding that won't timeout
- Metadata extraction — ffprobe runs in milliseconds
- Sporadic, unpredictable traffic — When you might go hours without a single request
# These tasks work well on Lambda
ffmpeg -ss 00:00:05 -i input.mp4 -frames:v 1 thumbnail.jpg   # < 1s (-ss before -i uses fast keyframe seek)
ffmpeg -i input.mp4 -vn -c:a copy audio.aac # < 2s
ffprobe -v quiet -print_format json -show_format -show_streams input.mp4 # < 1s
When to Use a Dedicated API Instead
Serverless falls short when you need:
- Transcoding videos longer than 2-3 minutes — Timeout risk
- H.265 or VP9 encoding — Too slow without GPU
- Complex filter chains — Subtitles, overlays, multi-input compositions
- Predictable processing at scale — Cost becomes prohibitive
- Full codec support — Stripped binaries limit your options
- GPU acceleration — Not available on serverless
For these cases, a dedicated video processing API eliminates the infrastructure headaches entirely. If you're evaluating managed solutions, see how FFHub compares to AWS MediaConvert.
FFHub: Cloud FFmpeg Without the Infrastructure
FFHub provides a cloud FFmpeg API designed specifically for the problems serverless can't solve. You send the same FFmpeg commands you'd run locally, and FFHub handles execution on optimized infrastructure.
# Instead of wrestling with Lambda layers and timeouts:
curl -X POST https://api.ffhub.io/v1/command \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "command": "-i input.mp4 -c:v libx264 -crf 23 -preset medium -c:a aac output.mp4",
    "inputs": ["https://storage.example.com/input.mp4"],
    "webhook": "https://your-app.com/callback"
  }'
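The same request from Node might look like the sketch below. The endpoint and field names are taken from the curl example above and should be treated as illustrative rather than official SDK usage:

```javascript
// Sketch: submit a job to the endpoint shown in the curl example.
// Field names mirror that example; treat them as illustrative.
function buildJobPayload({ command, inputs, webhook }) {
  return JSON.stringify({ command, inputs, webhook });
}

async function submitJob(apiKey, payload) {
  const res = await fetch('https://api.ffhub.io/v1/command', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: payload,
  });
  return res.json();
}

const payload = buildJobPayload({
  command: '-i input.mp4 -c:v libx264 -crf 23 -preset medium -c:a aac output.mp4',
  inputs: ['https://storage.example.com/input.mp4'],
  webhook: 'https://your-app.com/callback',
});
console.log(payload);
```

The webhook field means your application never blocks on the transcode; the callback fires when the output is ready.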
Key advantages over serverless FFmpeg:
- No binary management — Full FFmpeg with all codecs, always up to date
- No timeout limits — Process videos of any length
- Optimized infrastructure — Dedicated hardware tuned for video processing
- Same FFmpeg syntax — No new API to learn
- Pay per use — No idle infrastructure costs
This is particularly valuable when your core product isn't video processing — you shouldn't be maintaining FFmpeg builds and Lambda layers when you could be building features.
Decision Framework
| Factor | Use Serverless | Use Dedicated API |
|---|---|---|
| Video duration | < 30 seconds | Any length |
| Task type | Thumbnails, metadata, audio copy | Transcoding, subtitles, filters |
| Volume | < 50/day | Any volume |
| Codec needs | H.264 only | Any codec |
| GPU needed | No | Yes (for speed) |
| Team expertise | DevOps team available | Prefer managed solution |
Summary
Running FFmpeg on serverless platforms is possible but painful. The binary size, timeout limits, memory costs, and lack of GPU make it a poor fit for anything beyond simple tasks like thumbnail generation and audio extraction.
If your workload involves real transcoding, consider the total cost — not just the Lambda bill, but the engineering time spent building and maintaining custom FFmpeg deployments. For most teams, a dedicated video processing API is both cheaper and more reliable than trying to make serverless work for a use case it wasn't designed for. To understand what FFHub offers as a cloud FFmpeg solution, or to explore batch transcoding via API, check out those guides.
Related Articles
- What Is FFHub? - Learn how FFHub provides cloud FFmpeg without the infrastructure overhead
- Batch Video Transcoding API - Process thousands of videos programmatically with a simple REST API
- FFHub vs AWS MediaConvert - An honest comparison of FFHub and AWS's managed video transcoding service