System Design • 2024-01-20

Design Netflix - Video Streaming Platform

Learn how to design a scalable video streaming platform like Netflix, handling millions of concurrent users, content delivery, and personalized recommendations.

Problem Statement

Design a video streaming platform that can:

  • Stream videos to millions of concurrent users
  • Provide personalized recommendations
  • Support multiple devices and resolutions
  • Handle content upload and encoding
  • Deliver content with low latency globally

Requirements

Functional Requirements

  1. User authentication and profiles
  2. Video upload and processing
  3. Video streaming with adaptive bitrate
  4. Search and browse content
  5. Personalized recommendations
  6. Watch history and resume playback
  7. Subtitle support
  8. Download for offline viewing

Non-Functional Requirements

  1. Scalability: Support 200M+ users, 100M+ concurrent streams
  2. Availability: 99.99% uptime
  3. Low Latency: Start streaming within 2 seconds
  4. Global: Serve users worldwide
  5. Cost-Effective: Optimize bandwidth and storage costs
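A quick back-of-envelope calculation makes these targets concrete. The sketch below assumes an average stream bitrate of 5 Mbps (the 1080p figure from the bitrate ladder in this design); the function names are illustrative, not part of any real API.

```python
def peak_egress_tbps(concurrent_streams: int, avg_bitrate_mbps: float) -> float:
    """Aggregate egress bandwidth in terabits per second."""
    return concurrent_streams * avg_bitrate_mbps / 1_000_000  # Mbps -> Tbps


def downtime_minutes_per_year(availability: float) -> float:
    """Allowed downtime per 365-day year for an availability fraction."""
    return (1 - availability) * 365 * 24 * 60


# Assumption: every stream averages 5 Mbps.
print(peak_egress_tbps(100_000_000, 5.0))            # 500.0
print(round(downtime_minutes_per_year(0.9999), 1))   # 52.6
```

Under these assumptions the platform needs about 500 Tbps of peak egress and may be down for under an hour per year, which is why delivery is pushed to a CDN rather than served from origin.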

High-Level Architecture

┌─────────────┐
│   Client    │ (Web, Mobile, TV, Gaming Console)
└──────┬──────┘
       │
       ├─────────────────────────────────────────┐
       │                                         │
┌──────▼──────┐                         ┌───────▼────────┐
│   CDN       │                         │  API Gateway   │
│ (CloudFront)│                         │   (Load Bal)   │
└──────┬──────┘                         └───────┬────────┘
       │                                         │
       │                                ┌────────▼────────┐
       │                                │  Microservices  │
       │                                ├─────────────────┤
       │                                │ • Auth Service  │
       │                                │ • User Service  │
       │                                │ • Video Service │
       │                                │ • Search Service│
       │                                │ • Recommend Svc │
       │                                └────────┬────────┘
       │                                         │
┌──────▼──────┐                         ┌───────▼────────┐
│   Storage   │                         │   Databases    │
│   (S3)      │                         ├────────────────┤
├─────────────┤                         │ • PostgreSQL   │
│ • Videos    │                         │ • Cassandra    │
│ • Thumbnails│                         │ • Redis Cache  │
│ • Subtitles │                         │ • Elasticsearch│
└─────────────┘                         └────────────────┘

Core Components

1. Content Delivery Network (CDN)

Purpose: Deliver video content with low latency globally

Implementation:

CDN Strategy:
├── Edge Locations (200+ worldwide)
├── Origin Servers (S3 buckets)
├── Cache Strategy
│   ├── Popular content: Cache at all edges
│   ├── Regional content: Cache in specific regions
│   └── Long-tail content: On-demand caching
└── Adaptive Bitrate Streaming (ABR)
    ├── 4K: 25 Mbps
    ├── 1080p: 5 Mbps
    ├── 720p: 3 Mbps
    ├── 480p: 1.5 Mbps
    └── 360p: 0.7 Mbps
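The cache strategy above can be sketched as a small placement function. The view-count thresholds and the `choose_cache_tier` name are assumptions chosen for illustration; a real system would tune them from traffic data.

```python
from enum import Enum


class CacheTier(Enum):
    ALL_EDGES = "all_edges"    # popular: push to every edge location
    REGIONAL = "regional"      # regional: push only where it is hot
    ON_DEMAND = "on_demand"    # long tail: cache on first request


def choose_cache_tier(views_by_region, popular_views=1_000_000, regional_views=50_000):
    """Return (tier, regions to pre-warm) from recent view counts per region.

    Both thresholds are hypothetical defaults.
    """
    total = sum(views_by_region.values())
    if total >= popular_views:
        return CacheTier.ALL_EDGES, sorted(views_by_region)
    hot_regions = sorted(r for r, v in views_by_region.items() if v >= regional_views)
    if hot_regions:
        return CacheTier.REGIONAL, hot_regions
    return CacheTier.ON_DEMAND, []


print(choose_cache_tier({"in": 60_000, "us": 100}))
```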

2. Video Processing Pipeline

Workflow:

Upload → Transcoding → Quality Check → Storage → CDN Distribution

1. Upload Service
   - Chunked upload for large files
   - Resume capability
   - Validation (format, size, duration)

2. Transcoding Service
   - Multiple resolutions (360p to 4K)
   - Multiple formats (MP4 and WebM containers, HLS packaging)
   - Audio tracks (multiple languages)
   - Subtitle generation
   - Thumbnail extraction

3. Storage
   - Original: S3 Glacier (cold storage)
   - Transcoded: S3 Standard
   - Metadata: Database
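As a rough illustration of the chunked upload with resume capability, the sketch below splits a file into byte ranges and works out which chunks still need sending after an interruption. The helper names and the idea of tracking uploaded chunk indexes in a set are assumptions, not a specific upload API.

```python
def plan_chunks(file_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return (start, end_exclusive) byte ranges that cover the whole file."""
    total = (file_size + chunk_size - 1) // chunk_size  # integer ceiling
    return [(i * chunk_size, min((i + 1) * chunk_size, file_size)) for i in range(total)]


def chunks_to_resume(file_size: int, chunk_size: int, uploaded: set[int]) -> list[int]:
    """Indexes of chunks not yet acknowledged by the server."""
    total = (file_size + chunk_size - 1) // chunk_size
    return [i for i in range(total) if i not in uploaded]


# A 10-byte file in 4-byte chunks; chunks 0 and 2 were already uploaded.
print(plan_chunks(10, 4))               # [(0, 4), (4, 8), (8, 10)]
print(chunks_to_resume(10, 4, {0, 2}))  # [1]
```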

3. Streaming Protocol

HLS (HTTP Live Streaming):

master.m3u8 (Master Playlist)
├── 4k.m3u8
│   ├── segment_001.ts
│   ├── segment_002.ts
│   └── segment_003.ts
├── 1080p.m3u8
├── 720p.m3u8
├── 480p.m3u8
└── 360p.m3u8
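For reference, a master playlist is a short text file that lists each variant stream with its bandwidth and resolution. The sketch below generates one from the ABR ladder in the CDN section; the pixel dimensions are assumed 16:9 values, and the file names simply follow the tree above.

```python
# (playlist name, width, height, bandwidth in bits per second)
# Pixel dimensions are assumptions; bitrates come from the ABR ladder.
LADDER = [
    ("4k", 3840, 2160, 25_000_000),
    ("1080p", 1920, 1080, 5_000_000),
    ("720p", 1280, 720, 3_000_000),
    ("480p", 854, 480, 1_500_000),
    ("360p", 640, 360, 700_000),
]


def build_master_playlist(ladder) -> str:
    """Render an HLS master playlist with one variant entry per rendition."""
    lines = ["#EXTM3U"]
    for name, width, height, bps in ladder:
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bps},RESOLUTION={width}x{height}")
        lines.append(f"{name}.m3u8")
    return "\n".join(lines) + "\n"


print(build_master_playlist(LADDER))
```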

Client automatically switches quality based on:
- Network bandwidth
- Device capability
- Buffer health
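A minimal sketch of how a client might combine these three signals, assuming a simple throughput-based rule. The safety factor, the low-buffer threshold, and the halving of the budget are illustrative choices; production players use more elaborate heuristics.

```python
# (name, height in pixels, bitrate in Mbps), highest quality first
RENDITIONS = [
    ("4k", 2160, 25.0),
    ("1080p", 1080, 5.0),
    ("720p", 720, 3.0),
    ("480p", 480, 1.5),
    ("360p", 360, 0.7),
]


def pick_rendition(throughput_mbps, buffer_seconds, max_height,
                   safety_factor=0.8, low_buffer_seconds=10.0):
    """Pick the best rendition the network, device, and buffer allow."""
    budget = throughput_mbps * safety_factor  # leave headroom for variance
    if buffer_seconds < low_buffer_seconds:
        budget *= 0.5  # close to stalling: be more conservative
    for name, height, mbps in RENDITIONS:
        if height <= max_height and mbps <= budget:
            return name
    return RENDITIONS[-1][0]  # nothing fits: fall back to the lowest rung


print(pick_rendition(10.0, 30.0, 2160))  # 1080p
print(pick_rendition(10.0, 5.0, 2160))   # 720p
```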

4. Recommendation System

Architecture:

Data Collection → Feature Engineering → Model Training → Serving

Data Sources:
├── Watch history
├── Search queries
├── Ratings
├── Time of day
├── Device type
└── Geographic location

Algorithms:
├── Collaborative Filtering
├── Content-Based Filtering
├── Deep Learning (Neural Networks)
└── A/B Testing for optimization

Real-time Serving:
├── Pre-computed recommendations (batch)
├── Real-time personalization
└── Redis cache for fast access
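To make the collaborative filtering entry concrete, here is a toy item-based variant using cosine similarity over explicit ratings. It is only a sketch under simplifying assumptions (tiny in-memory data, no normalization or implicit feedback) and is not a description of Netflix's actual models.

```python
import math
from collections import defaultdict


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse vectors keyed by user."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(ratings: dict, user: str, k: int = 3) -> list:
    """ratings[user][video] = rating. Returns up to k unseen videos."""
    items = defaultdict(dict)  # video -> {user: rating}
    for u, videos in ratings.items():
        for video, rating in videos.items():
            items[video][u] = rating
    seen = ratings.get(user, {})
    scores = {}
    for candidate, vector in items.items():
        if candidate in seen:
            continue
        # Weight similarity to each watched video by the user's rating.
        score = sum(cosine(vector, items[s]) * r for s, r in seen.items())
        if score > 0:
            scores[candidate] = score
    return sorted(scores, key=lambda v: (-scores[v], v))[:k]


ratings = {
    "alice": {"A": 5, "B": 4},
    "bob": {"A": 5, "B": 5, "C": 4},
    "carol": {"D": 5},
    "dave": {"A": 4, "C": 5},
}
print(recommend(ratings, "alice"))  # ['C']
```

Video D never co-occurs with anything alice watched, so its similarity score is zero and it is not recommended.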

Database Design

User Service (PostgreSQL)

users
├── user_id (PK)
├── email
├── password_hash
├── subscription_tier
├── created_at
└── last_login

profiles
├── profile_id (PK)
├── user_id (FK)
├── name
├── avatar
└── preferences

Video Service (Cassandra)

videos (Partition Key: video_id)
├── video_id
├── title
├── description
├── duration
├── release_date
├── genres
├── cast
└── thumbnail_url

video_metadata
├── video_id
├── resolution
├── bitrate
├── codec
├── file_size
└── cdn_url

Watch History (Cassandra)

watch_history (Partition Key: user_id, Clustering Key: timestamp)
├── user_id
├── video_id
├── timestamp
├── duration_watched
├── total_duration
└── device_type
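The watch_history layout supports resume playback by reading the newest row for a user and video. The sketch below models that lookup in memory; it assumes `duration_watched` holds the playback position in seconds and that a video counts as finished at 95% watched, both of which are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WatchEvent:
    user_id: str
    video_id: str
    timestamp: int         # epoch seconds
    duration_watched: int  # assumed: playback position in seconds
    total_duration: int    # video length in seconds
    device_type: str


def resume_position(events, user_id, video_id, finished_ratio=0.95) -> int:
    """Return the position (seconds) at which playback should resume."""
    matching = [e for e in events if e.user_id == user_id and e.video_id == video_id]
    if not matching:
        return 0
    latest = max(matching, key=lambda e: e.timestamp)
    if latest.duration_watched >= finished_ratio * latest.total_duration:
        return 0  # effectively finished: start over
    return latest.duration_watched


events = [
    WatchEvent("u1", "v1", 100, 300, 3600, "tv"),
    WatchEvent("u1", "v1", 200, 1200, 3600, "mobile"),
]
print(resume_position(events, "u1", "v1"))  # 1200
```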

API Design

Streaming API

GET /api/v1/stream/{video_id}
Headers:
  Authorization: Bearer {token}
(No Range header here: byte ranges apply to segment requests sent to the CDN, not to this metadata call.)

Response:
{
  "manifest_url": "https://cdn.netflix.com/video123/master.m3u8",
  "drm_license": "...",
  "subtitles": [
    {"language": "en", "url": "..."},
    {"language": "es", "url": "..."}
  ]
}

Recommendation API

GET /api/v1/recommendations
Headers:
  Authorization: Bearer {token}

Response:
{
  "personalized": [...],
  "trending": [...],
  "continue_watching": [...],
  "new_releases": [...]
}

Scalability Strategies

1. Horizontal Scaling

  • Microservices architecture
  • Stateless services
  • Load balancing across multiple instances

2. Caching Strategy

Multi-Level Caching:
├── Browser Cache (videos, thumbnails)
├── CDN Cache (edge locations)
├── Application Cache (Redis)
│   ├── User sessions
│   ├── Recommendations
│   └── Popular content metadata
└── Database Cache (query results)
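The application cache layer is typically used in a cache-aside pattern: read from the cache, fall back to the database on a miss, then populate the cache. The sketch below uses a small in-memory TTL cache as a stand-in for Redis; `TTLCache`, `get_recommendations`, and the `recs:` key prefix are hypothetical names.

```python
import time


class TTLCache:
    """In-memory stand-in for Redis with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


def get_recommendations(user_id, cache, load_from_db):
    """Cache-aside read: try the cache, then the database, then fill the cache."""
    key = f"recs:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    fresh = load_from_db(user_id)
    cache.set(key, fresh)
    return fresh


cache = TTLCache(ttl_seconds=300)
print(get_recommendations("u1", cache, lambda uid: ["v1", "v2"]))
```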

3. Database Sharding

User Data Sharding:
- Shard by hash(user_id) placed on a consistent-hash ring
- Avoid plain user_id % num_shards: changing num_shards remaps most keys

Video Data Sharding:
- Shard by video_id
- Replicate popular content across shards
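A minimal consistent-hashing sketch shows why it suits resharding: when a shard is added, only the keys that land on the new shard move. With plain modulo sharding, going from 3 to 4 shards would move about three quarters of uniformly distributed keys. The `HashRing` class, the virtual-node count, and the shard names are assumptions for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes (assumes at least one shard)."""

    def __init__(self, shards, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (hash, shard)
        for shard in shards:
            self.add_shard(shard)

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add_shard(self, shard: str) -> None:
        # Each shard owns many points on the ring to even out the load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{shard}#{i}"), shard))

    def shard_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["shard-0", "shard-1", "shard-2"])
print(ring.shard_for("user:42"))
```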

Cost Optimization

1. Storage Tiering

Content Lifecycle:
├── New Release (0-30 days): S3 Standard + All CDN edges
├── Popular (30-180 days): S3 Standard + Regional CDN
├── Catalog (180-365 days): S3 IA + On-demand CDN
└── Archive (365+ days): S3 Glacier + Rare access
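The lifecycle above maps directly to a lookup by content age. The sketch assumes each range's upper bound is exclusive (day 30 counts as "Popular"), since the source ranges overlap at their boundaries; `storage_tier` is a hypothetical helper.

```python
def storage_tier(age_days: int) -> tuple[str, str]:
    """Return (storage class, CDN placement) for content of a given age."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days < 30:
        return ("S3 Standard", "all CDN edges")
    if age_days < 180:
        return ("S3 Standard", "regional CDN")
    if age_days < 365:
        return ("S3 IA", "on-demand CDN")
    return ("S3 Glacier", "rare access")


print(storage_tier(10))   # ('S3 Standard', 'all CDN edges')
print(storage_tier(400))  # ('S3 Glacier', 'rare access')
```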

2. Bandwidth Optimization

  • Adaptive bitrate streaming
  • Compression (H.265/HEVC)
  • P2P delivery for popular content
  • Off-peak encoding

Security

1. DRM (Digital Rights Management)

  • Widevine (Android, Chrome)
  • FairPlay (iOS, Safari)
  • PlayReady (Windows, Xbox)

2. Content Protection

  • Encrypted streaming (HTTPS)
  • Token-based authentication
  • Geo-blocking
  • Watermarking
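Token-based protection of stream URLs is often done with an expiring HMAC signature. The sketch below is one simple scheme under stated assumptions: the `expires` and `sig` query parameter names, the message format, and the secret are all illustrative, not a specific CDN's signed-URL format.

```python
import hashlib
import hmac


def _signature(path: str, expires_at: int, secret: bytes) -> str:
    message = f"{path}:{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path: str, expires_at: int, secret: bytes) -> str:
    """Append an expiry time and HMAC signature to a path."""
    return f"{path}?expires={expires_at}&sig={_signature(path, expires_at, secret)}"


def verify(path: str, expires_at: int, sig: str, secret: bytes, now: int) -> bool:
    """Accept only unexpired requests whose signature matches."""
    if now > expires_at:
        return False
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(_signature(path, expires_at, secret), sig)


secret = b"demo-secret"  # placeholder value for the example
url = sign_url("/video123/master.m3u8", 1_700_000_600, secret)
print(url)
```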

Monitoring & Analytics

Key Metrics

Performance:
├── Video start time
├── Buffering ratio
├── Bitrate distribution
└── CDN hit ratio

Business:
├── Concurrent streams
├── Watch time per user
├── Completion rate
└── Churn rate

Infrastructure:
├── Server CPU/Memory
├── Database query time
├── Cache hit rate
└── Error rates
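A few of these metrics reduce to simple ratios and percentiles. The definitions below are common conventions stated as assumptions (buffering ratio as stall time over total session time, nearest-rank percentiles); the sample values are made up.

```python
import math


def buffering_ratio(stall_seconds: float, play_seconds: float) -> float:
    """Share of session time spent stalled."""
    total = stall_seconds + play_seconds
    return stall_seconds / total if total else 0.0


def hit_ratio(hits: int, misses: int) -> float:
    """Share of requests served from cache."""
    total = hits + misses
    return hits / total if total else 0.0


def percentile(values, p: float):
    """Nearest-rank percentile, e.g. p=95 for p95 video start time."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


start_times = [0.8, 1.1, 0.9, 2.4, 1.0, 1.3, 0.7, 1.9, 1.2, 1.0]
print(percentile(start_times, 95))  # 2.4
print(hit_ratio(950, 50))           # 0.95
```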

Interview Questions

Q1: How do you handle 100M concurrent streams?

Answer:

  • CDN for content delivery (target: offload ~95% of traffic from origin)
  • Microservices for horizontal scaling
  • Database sharding for user data
  • Redis for session management
  • Load balancers with auto-scaling

Q2: How do you reduce video start time?

Answer:

  • Preload first segment
  • Optimize CDN cache hit ratio
  • Use HTTP/2 for multiplexing
  • Reduce manifest file size
  • Predictive prefetching

Q3: How do you handle video encoding at scale?

Answer:

  • Distributed encoding cluster
  • Queue-based processing (SQS)
  • Priority queue (new releases first)
  • Parallel encoding (multiple resolutions)
  • Spot instances for cost savings
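The priority and parallelism points can be sketched with an in-process heap standing in for the queueing service. Standard SQS queues have no built-in message priority, so a real deployment would likely approximate this with separate queues per priority; the `EncodingQueue` class and its methods are hypothetical.

```python
import heapq
import itertools

RESOLUTIONS = ["360p", "480p", "720p", "1080p", "4k"]


class EncodingQueue:
    """Priority queue of (video, resolution) jobs; new releases go first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps FIFO order within a priority

    def submit(self, video_id: str, is_new_release: bool) -> None:
        priority = 0 if is_new_release else 1
        # Fan out one job per resolution so workers can encode in parallel.
        for resolution in RESOLUTIONS:
            heapq.heappush(self._heap, (priority, next(self._counter), video_id, resolution))

    def next_job(self):
        if not self._heap:
            return None
        _, _, video_id, resolution = heapq.heappop(self._heap)
        return video_id, resolution


queue = EncodingQueue()
queue.submit("catalog-title", is_new_release=False)
queue.submit("new-title", is_new_release=True)
print(queue.next_job())  # ('new-title', '360p')
```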

Conclusion

Netflix-scale system design requires:

  • Global CDN for low-latency delivery
  • Microservices for scalability
  • Adaptive streaming for quality
  • ML-based recommendations for engagement
  • Cost optimization for profitability

Key architectural decisions:

  1. CDN-first approach
  2. Cassandra for time-series data
  3. Redis for real-time caching
  4. Kafka for event streaming (e.g., playback events feeding recommendations and analytics)
  5. Kubernetes for orchestrating the stateless microservices