Video Processing for UGC Platforms: A Complete Engineering Guide
Build a robust video processing pipeline for user-generated content. Covers format normalization, thumbnail generation, multi-resolution output, and scalable architecture.

Users will upload anything. A 4K ProRes file from a cinema camera. A vertical screen recording in WebM. A 10-year-old AVI from a flip phone. A MOV with HEVC that plays on nothing except Safari. Your platform needs to accept all of it and deliver a consistent playback experience across every device and network condition.
This guide covers the complete video processing pipeline for UGC platforms — from upload to delivery — with practical FFmpeg commands and architecture patterns that scale.
The UGC Video Processing Pipeline
Every platform that accepts user-uploaded video needs some version of this pipeline:
┌─────────┐   ┌──────────┐   ┌────────────┐   ┌────────────┐   ┌──────────┐
│ Upload  │──>│ Validate │──>│ Normalize  │──>│ Generate   │──>│ Deliver  │
│         │   │ & Store  │   │ Format     │   │ Variants   │   │ via CDN  │
└─────────┘   └──────────┘   └─────┬──────┘   └─────┬──────┘   └──────────┘
                                   │                │
                             ┌─────┴─────┐    ┌─────┴──────┐
                             │ Transcode │    │ Thumbnails │
                             │ to H.264  │    │ Previews   │
                             │ MP4       │    │ Multi-res  │
                             └───────────┘    └────────────┘
Each stage has specific technical requirements. Let's walk through them.
Stage 1: Upload and Validation
Before you spend compute on transcoding, validate the upload:
File-Level Checks
// Upload validation middleware
function validateUpload(req, res, next) {
  const file = req.file;

  // File size limit
  const MAX_SIZE = 2 * 1024 * 1024 * 1024; // 2GB
  if (file.size > MAX_SIZE) {
    return res.status(413).json({ error: "File exceeds 2GB limit" });
  }

  // MIME type allowlist
  const ALLOWED_TYPES = [
    "video/mp4", "video/quicktime", "video/x-msvideo",
    "video/webm", "video/x-matroska", "video/x-flv",
  ];
  if (!ALLOWED_TYPES.includes(file.mimetype)) {
    return res.status(415).json({ error: "Unsupported video format" });
  }

  next();
}
Probe the File
After upload, use FFprobe to extract metadata before processing:
ffprobe -v quiet -print_format json -show_format -show_streams input.mp4
This gives you duration, resolution, codec, bitrate, and frame rate — all of which inform your transcoding decisions.
// Fetch video metadata with FFprobe
async function probeVideo(fileUrl) {
  const response = await fetch("https://api.ffhub.io/v1/tasks", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      command: `ffprobe -v quiet -print_format json -show_format -show_streams ${fileUrl}`,
    }),
  });
  const { task_id } = await response.json();
  // Poll until the task completes, then parse the JSON output
  return await waitAndGetResult(task_id);
}
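Once the probe JSON comes back, the handful of fields that actually drive transcoding decisions can be pulled out into a flat summary. A small sketch (the helper name is illustrative; the input shape follows the ffprobe JSON output shown above):

```javascript
// Extract the transcoding-relevant fields from ffprobe JSON output.
// Streams are searched by codec_type because the video stream is not
// always index 0 (some containers put audio first).
function summarizeProbe(probeJson) {
  const video = probeJson.streams.find((s) => s.codec_type === "video");
  const audio = probeJson.streams.find((s) => s.codec_type === "audio");
  return {
    duration: parseFloat(probeJson.format.duration),
    width: video ? video.width : null,
    height: video ? video.height : null,
    videoCodec: video ? video.codec_name : null,
    hasAudio: Boolean(audio),
  };
}
```

Searching by codec_type rather than assuming streams[0] is the video stream avoids a subtle bug with containers that order streams differently.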
Stage 2: Format Normalization
The goal: take any input format and produce a consistent output. For the vast majority of UGC platforms, that means H.264 MP4 with AAC audio.
Why H.264 MP4?
| Factor | H.264 MP4 | H.265 | VP9/WebM | AV1 |
|---|---|---|---|---|
| Browser support | 99%+ | ~80% | ~90% | ~70% |
| Mobile support | Universal | iOS + newer Android | Android mostly | Limited |
| Encoding speed | Fast | 2-3x slower | 3-5x slower | 10x slower |
| Hardware decode | Universal | Newer devices | Chrome/Android | Newest only |
| File size (baseline) | 1x | 0.5x | 0.6x | 0.4x |
H.264 wins on compatibility. If your audience is global and includes older devices, it's the only safe default. You can offer H.265 or AV1 as progressive enhancement for capable clients. For a deep dive into compression settings, see our FFmpeg video compression best practices.
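Progressive enhancement can be as simple as ordering candidate variants by preference and serving the first codec the client reports it can decode. A minimal sketch (the variant shape and codec labels are illustrative; in a browser, the supported list would come from checks like MediaSource.isTypeSupported):

```javascript
// Preference order: best compression first, H.264 as the universal fallback.
const CODEC_PREFERENCE = ["av1", "h265", "vp9", "h264"];

// Pick the best variant the client can play; fall back to H.264,
// which the pipeline always produces.
function pickVariant(variants, supportedCodecs) {
  for (const codec of CODEC_PREFERENCE) {
    const match = variants.find(
      (v) => v.codec === codec && supportedCodecs.includes(codec)
    );
    if (match) return match;
  }
  return variants.find((v) => v.codec === "h264");
}
```

Because H.264 is the guaranteed baseline output, the fallback branch never returns undefined as long as the pipeline ran.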
The Base Transcode Command
ffmpeg -i input.mov \
-c:v libx264 \
-crf 23 \
-preset medium \
-profile:v high \
-level 4.1 \
-pix_fmt yuv420p \
-c:a aac \
-b:a 128k \
-ar 44100 \
-movflags +faststart \
output.mp4
Why each flag matters:
- -crf 23: Good balance of quality and file size for UGC content
- -profile:v high -level 4.1: Maximum compatibility with modern devices
- -pix_fmt yuv420p: Required for playback on many devices (some cameras produce yuv422p or yuv444p)
- -movflags +faststart: Moves the metadata to the beginning of the file so playback can start before the full download completes — critical for web delivery
Handling Rotation
Phone videos often carry rotation metadata rather than physically rotated pixels. Modern FFmpeg applies that metadata automatically whenever it re-encodes (autorotation is on by default), so a normal transcode produces correctly oriented output. If you need to force a rotation manually:
ffmpeg -i input.mp4 -c:v libx264 -crf 23 -vf "transpose=1" output.mp4
Capping Resolution
Users upload 4K, but your player maxes out at 1080p. Cap the resolution without upscaling smaller videos:
ffmpeg -i input.mp4 \
-vf "scale='min(1920,iw)':'min(1080,ih)':force_original_aspect_ratio=decrease,pad=ceil(iw/2)*2:ceil(ih/2)*2" \
-c:v libx264 -crf 23 -preset medium \
-c:a aac -b:a 128k \
output.mp4
This scales down anything larger than 1920x1080 while leaving smaller videos untouched, and ensures dimensions are even numbers (required by H.264).
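If you cap at several different resolutions, the filter string is worth generating rather than copy-pasting. A small helper (the name is illustrative) that reproduces the cap-without-upscaling filter above for arbitrary limits:

```javascript
// Build the scale+pad filter used above for any maximum resolution.
// Keeps aspect ratio, never upscales, and pads to even dimensions
// (required by H.264's yuv420p chroma subsampling).
function buildCapFilter(maxWidth, maxHeight) {
  return (
    `scale='min(${maxWidth},iw)':'min(${maxHeight},ih)':force_original_aspect_ratio=decrease,` +
    `pad=ceil(iw/2)*2:ceil(ih/2)*2`
  );
}
```

Calling buildCapFilter(1920, 1080) yields exactly the -vf argument from the command above; buildCapFilter(1280, 720) gives the 720p equivalent.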
Stage 3: Thumbnail and Preview Generation
Every video needs at least a thumbnail. Most platforms also generate animated previews for hover states.
Static Thumbnail
Extract a frame at the 2-second mark (avoids black frames from intros):
ffmpeg -i input.mp4 -ss 2 -frames:v 1 -q:v 2 thumbnail.jpg
Multiple Thumbnail Candidates
Extract 5 evenly-spaced frames and let the user choose (or use an image quality scorer to auto-select):
ffmpeg -i input.mp4 \
-vf "select='not(mod(n\,floor($(ffprobe -v error -count_frames -select_streams v:0 -show_entries stream=nb_read_frames -of csv=p=0 input.mp4)/5)))',scale=640:-2" \
-frames:v 5 -vsync vfr \
thumb_%02d.jpg
A simpler approach — extract at specific timestamps:
# Grab thumbnails at 1s and at 25%, 50%, and 75% of the duration.
# -ss takes timestamps, not percentages, so compute them from the probed duration:
DURATION=$(ffprobe -v error -show_entries format=duration -of csv=p=0 input.mp4)
ffmpeg -i input.mp4 -ss 1 -frames:v 1 thumb_01.jpg
ffmpeg -i input.mp4 -ss $(echo "$DURATION * 0.25" | bc) -frames:v 1 thumb_02.jpg
ffmpeg -i input.mp4 -ss $(echo "$DURATION * 0.50" | bc) -frames:v 1 thumb_03.jpg
ffmpeg -i input.mp4 -ss $(echo "$DURATION * 0.75" | bc) -frames:v 1 thumb_04.jpg
Animated Preview (GIF or WebP)
A 3-5 second looping preview for hover states:
# Take 4 seconds starting at the 5-second mark; output a 10fps, 320px-wide GIF
ffmpeg -i input.mp4 -ss 5 -t 4 \
-vf "fps=10,scale=320:-2:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" \
preview.gif
For better quality and smaller size, use WebP:
ffmpeg -i input.mp4 -ss 5 -t 4 \
-vf "fps=10,scale=320:-2" \
-c:v libwebp -lossless 0 -quality 50 -loop 0 \
preview.webp
Stage 4: Multi-Resolution Outputs
For adaptive streaming, generate multiple resolutions from the source:
# 1080p
ffmpeg -i input.mp4 -vf "scale=-2:1080" -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 128k output_1080p.mp4
# 720p
ffmpeg -i input.mp4 -vf "scale=-2:720" -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 96k output_720p.mp4
# 480p
ffmpeg -i input.mp4 -vf "scale=-2:480" -c:v libx264 -crf 26 -preset medium -c:a aac -b:a 64k output_480p.mp4
# 360p
ffmpeg -i input.mp4 -vf "scale=-2:360" -c:v libx264 -crf 28 -preset medium -c:a aac -b:a 48k output_360p.mp4
Note how CRF increases and audio bitrate decreases for lower resolutions — lower resolution means less visual information, so you can compress more aggressively.
Resolution Ladder
| Resolution | CRF | Video Bitrate (typical) | Audio Bitrate | Target Use |
|---|---|---|---|---|
| 1080p | 23 | 3-5 Mbps | 128k | WiFi / broadband |
| 720p | 23 | 1.5-3 Mbps | 96k | Good mobile |
| 480p | 26 | 0.5-1.5 Mbps | 64k | Slow mobile |
| 360p | 28 | 0.3-0.7 Mbps | 48k | Very slow connection |
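The ladder above can be driven from data rather than four hard-coded commands. A sketch that generates one command per rung and skips rungs that would upscale the source (values mirror the table):

```javascript
// Resolution ladder, mirroring the table above.
const LADDER = [
  { height: 1080, crf: 23, audio: "128k" },
  { height: 720,  crf: 23, audio: "96k" },
  { height: 480,  crf: 26, audio: "64k" },
  { height: 360,  crf: 28, audio: "48k" },
];

// Generate one FFmpeg command per rung, skipping rungs taller than the source.
function ladderCommands(input, sourceHeight) {
  return LADDER.filter((rung) => rung.height <= sourceHeight).map(
    (rung) =>
      `ffmpeg -i ${input} -vf "scale=-2:${rung.height}" -c:v libx264 -crf ${rung.crf} ` +
      `-preset medium -c:a aac -b:a ${rung.audio} -movflags +faststart output_${rung.height}p.mp4`
  );
}
```

A 720p source then produces only the 720p, 480p, and 360p rungs — upscaling would add file size without adding detail.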
Complete Processing Pipeline
Here's a Node.js implementation that orchestrates the entire pipeline — from upload to all outputs:
// ugc-pipeline.js
const API_BASE = "https://api.ffhub.io/v1";
const API_KEY = process.env.FFHUB_API_KEY;

// Submit an FFmpeg task and wait for it to finish
async function runFFmpeg(command) {
  const res = await fetch(`${API_BASE}/tasks`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ command }),
  });
  const { task_id } = await res.json();
  return waitForTask(task_id);
}

async function waitForTask(taskId) {
  while (true) {
    const res = await fetch(`${API_BASE}/tasks/${taskId}`, {
      headers: { Authorization: `Bearer ${API_KEY}` },
    });
    const task = await res.json();
    if (task.status === "completed") return task;
    if (task.status === "failed") throw new Error(task.error);
    await new Promise((r) => setTimeout(r, 3000));
  }
}

// Main UGC video processing flow
async function processUGCVideo(sourceUrl, videoId) {
  console.log(`[${videoId}] Processing started: ${sourceUrl}`);

  // Step 1: probe the source file
  const probe = await runFFmpeg(
    `ffprobe -v quiet -print_format json -show_format -show_streams ${sourceUrl}`
  );
  const metadata = JSON.parse(probe.output);
  console.log(`[${videoId}] Source: ${metadata.format.duration}s, ${metadata.streams[0].width}x${metadata.streams[0].height}`);

  // Step 2: normalize + thumbnails (run in parallel)
  const baseCommand = `-c:v libx264 -crf 23 -preset medium -profile:v high -pix_fmt yuv420p -c:a aac -b:a 128k -movflags +faststart`;
  const [normalized, thumbnail, preview] = await Promise.all([
    // Normalize to 1080p H.264 MP4
    runFFmpeg(
      `ffmpeg -i ${sourceUrl} -vf "scale='min(1920,iw)':'min(1080,ih)':force_original_aspect_ratio=decrease,pad=ceil(iw/2)*2:ceil(ih/2)*2" ${baseCommand} output.mp4`
    ),
    // Thumbnail
    runFFmpeg(
      `ffmpeg -i ${sourceUrl} -ss 2 -frames:v 1 -q:v 2 thumbnail.jpg`
    ),
    // Animated preview
    runFFmpeg(
      `ffmpeg -i ${sourceUrl} -ss 5 -t 4 -vf "fps=10,scale=320:-2" -c:v libwebp -lossless 0 -quality 50 -loop 0 preview.webp`
    ),
  ]);

  // Step 3: generate lower-resolution variants (in parallel)
  const [res720, res480] = await Promise.all([
    runFFmpeg(
      `ffmpeg -i ${sourceUrl} -vf "scale=-2:720" -c:v libx264 -crf 23 -preset medium -c:a aac -b:a 96k -movflags +faststart output_720p.mp4`
    ),
    runFFmpeg(
      `ffmpeg -i ${sourceUrl} -vf "scale=-2:480" -c:v libx264 -crf 26 -preset medium -c:a aac -b:a 64k -movflags +faststart output_480p.mp4`
    ),
  ]);

  return {
    videoId,
    outputs: {
      "1080p": normalized.output_url,
      "720p": res720.output_url,
      "480p": res480.output_url,
      thumbnail: thumbnail.output_url,
      preview: preview.output_url,
    },
  };
}

// Usage example
processUGCVideo("https://your-bucket.s3.amazonaws.com/uploads/raw-video.mov", "vid_abc123")
  .then((result) => console.log("Processing complete:", result))
  .catch((err) => console.error("Processing failed:", err));
Architecture: Upload to Delivery
Here's the full production architecture for a UGC platform:
┌──────────┐     ┌──────────────┐     ┌──────────────────┐
│  Client  │────>│   Your API   │────>│  Object Storage  │
│  Upload  │     │    Server    │     │    (S3 / R2)     │
└──────────┘     └──────┬───────┘     └────────┬─────────┘
                        │                      │
                        │ POST /tasks          │ source URL
                        v                      │
                 ┌──────────────┐              │
                 │  Transcoding │<─────────────┘
                 │     API      │
                 └──────┬───────┘
                        │
                        │ webhook callback
                        v
                 ┌──────────────┐     ┌──────────────────┐
                 │   Your API   │────>│       CDN        │
                 │  (webhook)   │     │ (CloudFront/CF)  │
                 └──────────────┘     └──────────────────┘
Flow
- Client uploads raw video to your API server (or directly to object storage via presigned URL)
- Your API stores the raw file and creates a processing job
- Transcoding API receives the FFmpeg command with the source URL, processes the video
- Webhook notifies your API when processing is complete, with output URLs
- Your API updates the database and makes the video available via CDN
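The webhook step is where state transitions live, so it helps to keep that logic pure and testable. A sketch of the handler core — the payload shape and the db interface are assumptions for illustration; check your transcoding provider's actual webhook format:

```javascript
// Translate a completed-task webhook payload into database writes.
// `payload` shape and the `db` interface are illustrative assumptions.
function applyWebhook(payload, db) {
  if (payload.status === "failed") {
    db.updateVideo(payload.video_id, { status: "failed" });
    return { ok: false };
  }
  // Record each output (renditions, thumbnail, preview) as a variant row
  for (const output of payload.outputs) {
    db.insertVariant({
      video_id: payload.video_id,
      resolution: output.label, // e.g. "1080p", "thumbnail"
      url: output.url,
    });
  }
  db.updateVideo(payload.video_id, { status: "ready" });
  return { ok: true };
}
```

Keeping this a pure function of (payload, db) makes the handler trivial to unit-test with a fake db, and keeps HTTP framework code out of the state machine.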
Database Schema
CREATE TABLE videos (
id TEXT PRIMARY KEY,
user_id TEXT NOT NULL,
title TEXT,
status TEXT DEFAULT 'uploading', -- uploading, processing, ready, failed
source_url TEXT,
duration REAL,
width INTEGER,
height INTEGER,
created_at TIMESTAMPTZ DEFAULT NOW(),
processed_at TIMESTAMPTZ
);
CREATE TABLE video_variants (
id TEXT PRIMARY KEY,
video_id TEXT NOT NULL,
resolution TEXT NOT NULL, -- '1080p', '720p', '480p', 'thumbnail', 'preview'
url TEXT NOT NULL,
file_size BIGINT,
codec TEXT,
bitrate INTEGER,
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX idx_videos_user ON videos(user_id);
CREATE INDEX idx_variants_video ON video_variants(video_id);
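With that schema, the player payload is just an ordered projection of video_variants rows. A sketch (row shape matches the table above; the function name is illustrative):

```javascript
// Playable renditions, highest resolution first.
const RESOLUTION_ORDER = ["1080p", "720p", "480p", "360p"];

// Turn video_variants rows into a playback payload: ordered sources
// plus the thumbnail, with preview/thumbnail rows kept out of the
// source list.
function buildPlayback(rows) {
  const sources = rows
    .filter((r) => RESOLUTION_ORDER.includes(r.resolution))
    .sort(
      (a, b) =>
        RESOLUTION_ORDER.indexOf(a.resolution) -
        RESOLUTION_ORDER.indexOf(b.resolution)
    )
    .map((r) => ({ label: r.resolution, src: r.url }));
  const thumb = rows.find((r) => r.resolution === "thumbnail");
  return { sources, thumbnail: thumb ? thumb.url : null };
}
```

Storing thumbnails and previews in the same variants table (as the schema comment suggests) keeps the query to a single indexed lookup on video_id.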
Performance at Scale
Processing Time Benchmarks
| Source | Duration | 1080p Output | Thumbnail | Total Pipeline |
|---|---|---|---|---|
| 1080p 30s clip | 30s | ~15s | <1s | ~25s |
| 1080p 5min video | 5min | ~90s | <1s | ~120s |
| 4K 10min video | 10min | ~300s | <1s | ~360s |
| 720p 1min clip | 1min | ~20s | <1s | ~30s |
These are approximate — actual times depend on source codec, bitrate, and complexity.
Scaling Strategies
For 100 uploads/day: Single-threaded processing with a simple job queue. Even a basic queue (database-backed) is sufficient.
For 1,000 uploads/day: Concurrent processing with 5-10 parallel jobs. Add monitoring and alerting for stuck jobs.
For 10,000+ uploads/day: Full async architecture with webhooks, dedicated job tracking database, dead letter queue for failures, and auto-scaling workers or a cloud API that handles scaling for you. Our batch video transcoding guide covers this architecture in detail.
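The jump from 100 to 1,000 uploads/day is mostly a matter of bounding concurrency. A minimal concurrency limiter, as a sketch — a production queue would also need the retries, stuck-job timeouts, and dead-letter handling noted above:

```javascript
// Run async jobs with at most `limit` in flight at a time.
// Each element of `jobs` is a zero-argument async function.
async function runWithLimit(jobs, limit) {
  const results = new Array(jobs.length);
  let next = 0;
  async function worker() {
    // `next++` is safe here: JS is single-threaded, and the read-and-
    // increment happens synchronously between awaits.
    while (next < jobs.length) {
      const i = next++;
      results[i] = await jobs[i]();
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, jobs.length) }, worker)
  );
  return results;
}
```

Used with the pipeline above, each job would be a closure like () => processUGCVideo(url, id), with the limit set to however many concurrent transcodes your workers or API plan can absorb.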
Cost Considerations
For a social platform processing 5,000 video uploads per day (average 2 minutes each):
- Self-hosted (dedicated GPU servers): ~$2,000-5,000/month for servers + engineering time
- Cloud transcoding API: ~$500-1,500/month depending on output variants
- Hybrid: Base load on dedicated hardware, overflow to cloud API
Common Pitfalls
1. Not Handling Audio-Only or Silent Videos
Some uploads have no audio stream. With default stream selection FFmpeg simply skips the missing audio, but an explicit -map 0:a will fail the whole job. Mark the audio map as optional so it is included only when the stream exists:
# Safe handling: the trailing "?" makes the audio map optional
ffmpeg -i input.mp4 -map 0:v:0 -map 0:a:0? -c:v libx264 -crf 23 -c:a aac -b:a 128k output.mp4
2. Ignoring Aspect Ratio
Scaling to a fixed resolution without preserving aspect ratio produces stretched video. Always use -2 for the free dimension or force_original_aspect_ratio=decrease.
3. No Processing Timeout
A corrupt file can cause FFmpeg to hang indefinitely. Always set a timeout:
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 600000); // 10-minute timeout
try {
  await fetch(url, { signal: controller.signal });
} finally {
  clearTimeout(timeout);
}
4. Storing Only One Resolution
If you only store the original, every playback on a slow connection buffers. If you only store one transcoded version, you can't adapt to network conditions. Store at least 2-3 resolutions.
Conclusion
A UGC video processing pipeline doesn't have to be complicated, but it does have to be reliable. The core recipe is:
- Validate uploads early (format, size, duration)
- Normalize to H.264 MP4 with consistent settings
- Generate thumbnails and previews in parallel with transcoding
- Output multiple resolutions for adaptive delivery
- Deliver via CDN with proper cache headers
The pipeline code shown in this article handles all of these steps. For the transcoding compute, you can run FFmpeg on your own servers or use a cloud API like FFHub.io to avoid managing FFmpeg infrastructure entirely.
Start simple — a single resolution with a thumbnail — and add complexity (multi-resolution, animated previews, HLS packaging) as your platform grows. If your platform is a SaaS product, our video processing for SaaS guide covers the build-vs-buy decision and integration patterns.
Related Articles
- Batch Video Transcoding via API - Architecture and implementation guide for processing thousands of videos reliably
- Video Processing for SaaS - Build vs buy decision framework when video is a supporting feature in your product
- How to Convert Video Format with FFmpeg - Beginner-friendly guide to format conversion commands used in UGC pipelines