Batch Video Transcoding via API: Architecture, Implementation, and Scaling Guide
Learn how to build a reliable batch video transcoding pipeline using APIs. Covers queue architecture, error handling, retry strategies, and a complete Node.js implementation.

Processing a single video is straightforward. Processing ten thousand videos without losing sleep is an entirely different problem. Whether you're migrating a media library, normalizing user uploads, or converting an archive to a new codec, batch video transcoding introduces challenges that don't exist at small scale: queue management, concurrency limits, error recovery, and cost control.
This guide walks through the architecture patterns, implementation details, and operational strategies you need to build a production-grade batch transcoding pipeline.
When You Need Batch Transcoding
Batch transcoding isn't just "run FFmpeg in a loop." It's a distinct workload with its own requirements. Here are the most common scenarios:
UGC Platform Backfill
You launched with whatever format users uploaded. Now you need every video in H.264 MP4 at three resolutions for adaptive streaming. That's 50,000 source files becoming 150,000 outputs. Our UGC platform video processing guide covers this pipeline in detail.
Media Library Migration
Moving from on-premise storage to cloud, or switching CDN providers. Every asset needs re-encoding to match the new delivery spec.
Format Standardization
Your platform accumulated videos in dozens of formats — MKV, AVI, MOV, WMV, FLV. Support tickets pile up because some don't play on mobile. You need a one-time (or recurring) job to normalize everything.
Archival and Compliance
Regulatory or internal policy requires specific codecs, resolutions, or watermarks applied to all historical content.
Why Batch Transcoding Is Hard
Running ffmpeg in a for-loop on a single server will work for 20 files. It will not work for 20,000. Here's why:
Resource Contention
Video transcoding is CPU-intensive. A single 1080p encode can pin 4-8 cores. Running 10 concurrently on an 8-core machine means thrashing, OOM kills, and encodes that slow to a crawl until they hit timeouts.
Queue Management
You need to track which files are pending, in-progress, completed, or failed. Without a proper queue, you'll end up with duplicate processing, lost jobs, or an unrecoverable state after a restart.
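The job states above form a small state machine, and enforcing its transitions is what prevents duplicate processing and unrecoverable states. A minimal sketch (the state names and transition table are illustrative assumptions, not a specific queue library's API):

```javascript
// Illustrative job state machine for a batch queue. Jobs only move forward;
// any other transition signals a bug or two workers grabbing the same job.
const TRANSITIONS = {
  pending: ["in_progress"],
  in_progress: ["completed", "failed"],
  failed: ["pending"], // a failed job may be re-queued for retry
  completed: [],       // terminal state
};

function transition(job, nextStatus) {
  const allowed = TRANSITIONS[job.status] ?? [];
  if (!allowed.includes(nextStatus)) {
    throw new Error(`Illegal transition: ${job.status} -> ${nextStatus}`);
  }
  return { ...job, status: nextStatus };
}
```

In a real system this check runs inside a database transaction (for example, `UPDATE ... WHERE status = 'pending'` and inspect the affected-row count), so two workers cannot both claim the same job.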
Error Handling at Scale
At 10,000 files, a 1% error rate means 100 failures. Corrupt source files, codec incompatibilities, network timeouts, disk full — each failure mode needs different handling.
Scaling Ceiling
A single machine has a fixed number of cores. When your backlog grows, you need horizontal scaling — multiple workers, job distribution, and result aggregation.
Architecture Patterns
There are three common approaches to batch transcoding. Each makes different trade-offs.
Pattern 1: Queue + Worker Pool (Self-Hosted)
┌──────────┐ ┌───────────┐ ┌──────────────┐
│ Job DB │───>│ Queue │───>│ Worker 1 │
│ │ │ (Redis / │ │ (FFmpeg) │
│ │ │ RabbitMQ)│ ├──────────────┤
│ │ │ │───>│ Worker 2 │
│ │ │ │ │ (FFmpeg) │
│ │ │ │ ├──────────────┤
│ │ │ │───>│ Worker N │
└──────────┘ └───────────┘ └──────────────┘
Pros: Full control, no per-job costs, works offline.
Cons: You manage servers, FFmpeg versions, scaling, monitoring, and failure recovery. Worker provisioning alone takes real engineering time; plan for a dedicated DevOps effort.
Pattern 2: Event-Driven (Cloud Functions)
┌──────────┐ ┌───────────┐ ┌──────────────┐
│ S3 │───>│ Lambda / │───>│ S3 Output │
│ Upload │ │ Cloud Fn │ │ Bucket │
│ Event │ │ (FFmpeg) │ │ │
└──────────┘ └───────────┘ └──────────────┘
Pros: Auto-scales with demand, no idle servers, pay-per-invocation.
Cons: Function timeout limits (15 min on AWS Lambda), cold starts, limited FFmpeg binary support, memory constraints. Fine for thumbnails, problematic for long videos. For a deeper look at these constraints, see FFmpeg on Serverless: Challenges and Solutions.
Pattern 3: Cloud Transcoding API
┌──────────┐ ┌───────────┐ ┌──────────────┐
│ Your │───>│ API │───>│ Webhook │
│ App │ │ Service │ │ Callback │
│ │ │ │ │ │
└──────────┘ └───────────┘ └──────────────┘
Pros: No infrastructure to manage, scales to any volume, built-in error handling and retries, consistent FFmpeg version.
Cons: Per-job cost, requires network connectivity, data leaves your infrastructure.
Comparison
| Factor | Self-Hosted Queue | Cloud Functions | Cloud API |
|---|---|---|---|
| Setup time | Days-weeks | Hours | Minutes |
| Scaling | Manual | Auto (with limits) | Auto |
| Max video length | Unlimited | ~15 min | Unlimited |
| Maintenance | High | Medium | None |
| Cost at 100 videos/day | $$ (server) | $ | $ |
| Cost at 10,000 videos/day | $$$ (cluster) | $$$ (invocations) | $$ |
| Error handling | Build it | Build it | Built-in |
Implementation: Batch Transcoding with a Cloud API
Let's build a complete batch transcoding system using Node.js. The architecture is simple: read a list of source videos, submit them as API tasks with controlled concurrency, track progress, and handle failures.
Project Setup
mkdir batch-transcode && cd batch-transcode
npm init -y
npm install p-limit
The Batch Processor
// batch-transcode.js
import pLimit from "p-limit";
const API_BASE = "https://api.ffhub.io/v1";
const API_KEY = process.env.FFHUB_API_KEY;
const CONCURRENCY = 20; // max parallel jobs
const POLL_INTERVAL = 5000; // 5 seconds
const MAX_RETRIES = 3;
// Simulated source list; in practice, read this from a database or an S3 listing
function getVideoList() {
return Array.from({ length: 100 }, (_, i) => ({
id: `video-${String(i + 1).padStart(4, "0")}`,
url: `https://your-bucket.s3.amazonaws.com/raw/video-${i + 1}.mp4`,
}));
}
// Submit a single transcode task
async function submitTask(video) {
const response = await fetch(`${API_BASE}/tasks`, {
method: "POST",
headers: {
Authorization: `Bearer ${API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
command: `ffmpeg -i "${video.url}" -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 128k -movflags +faststart output.mp4`,
}),
});
if (!response.ok) {
throw new Error(`Submit failed: ${response.status} ${response.statusText}`);
}
const data = await response.json();
return data.task_id;
}
// Poll the task's status until it completes or fails
async function waitForCompletion(taskId) {
while (true) {
const response = await fetch(`${API_BASE}/tasks/${taskId}`, {
headers: { Authorization: `Bearer ${API_KEY}` },
});
if (!response.ok) {
throw new Error(`Poll failed: ${response.status} ${response.statusText}`);
}
const task = await response.json();
if (task.status === "completed") {
return { success: true, output_url: task.output_url, duration: task.duration };
}
if (task.status === "failed") {
throw new Error(`Task failed: ${task.error}`);
}
// Still processing; wait, then poll again
await new Promise((resolve) => setTimeout(resolve, POLL_INTERVAL));
}
}
// Process one video, with retries
async function processVideo(video, attempt = 1) {
try {
console.log(`[${video.id}] Submitting task (attempt ${attempt}/${MAX_RETRIES})`);
const taskId = await submitTask(video);
console.log(`[${video.id}] Task submitted: ${taskId}`);
const result = await waitForCompletion(taskId);
console.log(`[${video.id}] Completed in ${result.duration}s`);
return { video, status: "completed", ...result };
} catch (error) {
if (attempt < MAX_RETRIES) {
const delay = Math.pow(2, attempt) * 1000; // exponential backoff
console.warn(`[${video.id}] Failed, retrying in ${delay / 1000}s: ${error.message}`);
await new Promise((resolve) => setTimeout(resolve, delay));
return processVideo(video, attempt + 1);
}
console.error(`[${video.id}] Failed permanently: ${error.message}`);
return { video, status: "failed", error: error.message };
}
}
// Main entry point: batch processing with a concurrency limit
async function main() {
const videos = getVideoList();
const limit = pLimit(CONCURRENCY);
console.log(`Starting batch transcode: ${videos.length} videos, concurrency: ${CONCURRENCY}`);
const startTime = Date.now();
const results = await Promise.all(
videos.map((video) => limit(() => processVideo(video)))
);
const elapsed = ((Date.now() - startTime) / 1000).toFixed(1);
const completed = results.filter((r) => r.status === "completed").length;
const failed = results.filter((r) => r.status === "failed").length;
console.log(`\n===== Batch complete =====`);
console.log(`Total: ${videos.length} | Succeeded: ${completed} | Failed: ${failed}`);
console.log(`Total time: ${elapsed}s`);
// Print the failure list for follow-up investigation
if (failed > 0) {
console.log(`\nFailures:`);
results
.filter((r) => r.status === "failed")
.forEach((r) => console.log(` ${r.video.id}: ${r.error}`));
}
}
main();
Run it:
FFHUB_API_KEY=your_key node batch-transcode.js
Key Design Decisions
Why p-limit instead of Promise.all on the full array? Submitting 10,000 API requests simultaneously will overwhelm any endpoint. p-limit ensures at most CONCURRENCY jobs run in parallel, acting as a client-side rate limiter.
Why exponential backoff? Transient failures (network blips, rate limits) often resolve themselves. Retrying immediately just adds load. Backing off exponentially — 2s, 4s, 8s — gives the system time to recover.
Why poll instead of webhooks here? Polling is simpler for batch scripts. For production systems with a web server, webhooks are more efficient — see the next section.
Webhook-Based Architecture (Production)
For production systems, polling thousands of tasks wastes bandwidth. Use webhooks instead:
// Express webhook endpoint
app.post("/webhooks/transcode-complete", async (req, res) => {
const { task_id, status, output_url, error } = req.body;
// Update the job's status in the database
await db.query(
`UPDATE transcode_jobs
SET status = $1, output_url = $2, error = $3, completed_at = NOW()
WHERE task_id = $4`,
[status, output_url, error, task_id]
);
// If the job failed and hasn't exhausted its retries, re-queue it
if (status === "failed") {
const { rows } = await db.query(
"SELECT * FROM transcode_jobs WHERE task_id = $1",
[task_id]
);
const job = rows[0];
if (job && job.retry_count < MAX_RETRIES) {
await enqueueRetry(job);
}
}
res.sendStatus(200);
});
When submitting tasks, include the webhook URL:
const response = await fetch(`${API_BASE}/tasks`, {
method: "POST",
headers: {
Authorization: `Bearer ${API_KEY}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
command: `ffmpeg -i "${video.url}" -c:v libx264 -crf 23 output.mp4`,
webhook_url: "https://your-app.com/webhooks/transcode-complete",
}),
});
Error Handling Strategies
At batch scale, errors are inevitable. Here's how to categorize and handle them:
| Error Type | Example | Strategy |
|---|---|---|
| Transient | Network timeout, 503 | Retry with backoff |
| Source file | Corrupt file, codec unsupported | Log and skip, flag for review |
| Configuration | Invalid FFmpeg params | Fix command, reprocess batch |
| Rate limit | 429 Too Many Requests | Reduce concurrency, add delay |
| Permanent | File not found (404) | Log, do not retry |
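The table above can be encoded as a small classifier that each failure passes through before the pipeline decides whether to retry. A minimal sketch; the status codes, the `code`/`message` error shapes, and the strategy names are assumptions for illustration, not any specific API's contract:

```javascript
// Map a failure to one of the handling strategies from the table above.
// The error shape (status, code, message) is an assumed convention.
function classifyError(err) {
  if (err.status === 429) return "reduce-concurrency"; // rate limited
  if (err.status === 404) return "permanent";          // missing source: never retry
  if (err.status >= 500 || err.code === "ETIMEDOUT") return "retry"; // transient
  if (/corrupt|unsupported codec/i.test(err.message ?? "")) return "skip-and-flag";
  if (/invalid (argument|option)/i.test(err.message ?? "")) return "fix-command";
  return "retry"; // default: treat unknown errors as transient, bounded by MAX_RETRIES
}
```

The `processVideo` retry loop from earlier would consult this before recursing: only a `"retry"` classification re-enters the backoff path, while `"permanent"` and `"skip-and-flag"` go straight to the dead letter queue.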
Dead Letter Queue
After MAX_RETRIES failures, don't just log and forget. Push failed jobs to a dead letter queue for manual review:
async function handlePermanentFailure(video, error) {
await db.query(
`INSERT INTO dead_letter_queue (video_id, source_url, error, failed_at)
VALUES ($1, $2, $3, NOW())`,
[video.id, video.url, error]
);
}
Progress Tracking
For large batches, you need visibility. Here's a minimal progress tracker:
class BatchProgress {
constructor(total) {
this.total = total;
this.completed = 0;
this.failed = 0;
this.startTime = Date.now();
}
update(status) {
if (status === "completed") this.completed++;
if (status === "failed") this.failed++;
const done = this.completed + this.failed;
const percent = ((done / this.total) * 100).toFixed(1);
const elapsed = (Date.now() - this.startTime) / 1000;
const rate = done / Math.max(elapsed, 1); // jobs per second
const eta = rate > 0 ? ((this.total - done) / rate).toFixed(0) : "?";
process.stdout.write(
`\r[${percent}%] ${done}/${this.total} | ` +
`OK: ${this.completed} FAIL: ${this.failed} | ` +
`${rate.toFixed(1)}/s | ETA: ${eta}s `
);
}
}
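Wiring the tracker into the batch loop is a one-line change: call `update()` as each job settles. A self-contained sketch, where `BatchProgress` is trimmed to its counters (display logic omitted) and `processFn` stands in for the `processVideo` function from the main script:

```javascript
// Trimmed BatchProgress: counters only, the terminal display is omitted here
class BatchProgress {
  constructor(total) {
    this.total = total;
    this.completed = 0;
    this.failed = 0;
  }
  update(status) {
    if (status === "completed") this.completed++;
    if (status === "failed") this.failed++;
  }
}

// Run every video through processFn, updating the tracker as each finishes.
// processFn is a stand-in for processVideo() from the main script.
async function runBatch(videos, processFn) {
  const progress = new BatchProgress(videos.length);
  const results = await Promise.all(
    videos.map(async (video) => {
      const result = await processFn(video);
      progress.update(result.status); // call this wherever a job settles
      return result;
    })
  );
  return { results, progress };
}
```

Because `processVideo` already catches its own errors and returns a `status` field, every job counts toward either `completed` or `failed`, and the percentage always reaches 100.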
Cost Analysis at Scale
Here's a realistic cost comparison for a one-time batch job transcoding 10,000 videos (average 5 minutes each, 1080p H.264 output):
| Approach | Compute Cost | Engineering Time | Total Time | Total Cost |
|---|---|---|---|---|
| Single server (16-core) | ~$200/mo | 2-3 days setup | ~7 days | $400+ |
| Kubernetes cluster (40 cores) | ~$500/mo | 1-2 weeks setup | ~2 days | $1,500+ |
| AWS Elastic Transcoder | ~$150 | 1-2 days | ~4 hours | $300+ |
| Cloud FFmpeg API | ~$100 | 2-3 hours | ~2 hours | $150+ |
The self-hosted options look cheaper in raw compute, but engineering time dominates. A backend engineer's time costs $80-150/hour. Two days of setup and debugging easily exceeds the API cost for a 10,000-video batch.
For recurring workloads, the math shifts. If you process 100,000 videos per month, a self-hosted cluster amortizes setup costs. Below that threshold, an API-based approach almost always wins on total cost. For a detailed build-vs-buy analysis tailored to SaaS teams, see our video processing for SaaS guide.
Optimization Tips
1. Sort by File Size
Process small files first. This gives you quick wins and early feedback on whether your FFmpeg command is correct before committing to large encodes.
videos.sort((a, b) => a.fileSize - b.fileSize);
2. Tune Concurrency
Start with 10-20 concurrent jobs and monitor API response times. If you see increased latency or rate limiting, reduce concurrency. If response times are stable, increase gradually.
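One way to react to rate limiting without a redeploy is a submit wrapper that pauses when it sees a 429, honoring the standard `Retry-After` header if present. A hedged sketch; the header handling assumes a fetch-style `Response` with a seconds-valued `Retry-After`, and the single deferred retry is a simplification:

```javascript
// Wrap a submit function so a 429 response waits before retrying once,
// instead of hammering the API. Assumes a fetch-style Response object and
// a Retry-After header expressed in seconds (the header may also be a date,
// which this sketch does not handle).
async function submitWithRateLimitGuard(submit, video, defaultDelayMs = 10_000) {
  const response = await submit(video);
  if (response.status === 429) {
    const header = response.headers?.get?.("retry-after");
    const delayMs = header ? Number(header) * 1000 : defaultDelayMs;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    return submit(video); // one deferred retry; loop for repeated 429s
  }
  return response;
}
```

Pair this with a lower `CONCURRENCY` value if 429s persist: backoff treats the symptom, concurrency is the cause.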
3. Use Appropriate Presets
For batch jobs where speed matters more than file size, use -preset fast or -preset veryfast. The file size difference between medium and fast is typically 5-10%, but encoding speed doubles.
4. Skip Already-Processed Files
Maintain a processed-files log or database table. Check it before submitting each job to avoid paying for duplicate work:
async function shouldProcess(video) {
const existing = await db.query(
"SELECT 1 FROM completed_jobs WHERE video_id = $1",
[video.id]
);
return existing.rows.length === 0;
}
Conclusion
Batch video transcoding is a solved problem if you pick the right architecture. For most teams, the decision comes down to volume and frequency:
- One-time batch under 10,000 videos: Use a cloud transcoding API. The code in this article will work out of the box.
- Recurring batch under 100,000/month: API-based with webhook integration and a database for tracking.
- Recurring batch over 100,000/month: Consider a hybrid — self-hosted workers for predictable base load, API overflow for peaks.
The Node.js implementation shown here handles the common case well. Swap in your video source, adjust the FFmpeg command, set your concurrency, and let it run.
If you want to skip the infrastructure entirely, FFHub.io provides a cloud FFmpeg API that handles scaling, retries, and FFmpeg version management — you just send commands and get results.
Related Articles
- Video Processing for UGC Platforms - Complete pipeline guide for handling user-uploaded video at scale
- Video Processing for SaaS - Build vs buy decision framework with cost analysis at every growth stage
- FFmpeg Video Compression Best Practices - Optimize your FFmpeg commands for the best quality-to-size ratio in batch jobs