System Design Case Study: Design a URL Shortener Like Bit.ly
By PhantomCode Team·Published April 30, 2026·5 min read
TL;DR

Designing a Bit.ly-style URL shortener is the canonical system design interview because it touches every important concept: back-of-envelope sizing (~115 RPS, 1 TB/year), base62 encoding of auto-incrementing IDs for compact unique slugs, MySQL with indexed short_url plus Redis for an 80%+ cache hit rate, sharding by user_id or hash for write scale, asynchronous analytics via Kafka, and CAP-aware HA with replicas and circuit breakers. Master this design and you have a template for most senior-level system design rounds.

System design questions appear in every senior-level FAANG interview. They test whether you can think beyond coding—about scalability, reliability, and trade-offs in distributed systems. URL shortener design is a classic system design question that appears at Google, Amazon, Meta, Microsoft, and Apple. It's popular because it's straightforward to understand but complex to design well. This guide walks you through designing a URL shortener from first principles, with all the trade-offs and decisions you'll need to make.

Requirements Gathering

Before jumping into design, always clarify requirements. Never assume.

Functional Requirements:

  • Shorten a long URL to a short URL
  • Redirect from short URL to original long URL
  • Users can customize short URLs (optional)
  • Analytics: track clicks, referrers, devices

Non-Functional Requirements:

  • High availability (99.99% uptime)
  • Low latency (<100ms redirect)
  • High throughput (millions of requests/day)
  • No data loss
  • Scalability

Back-of-envelope estimation: Let's assume 10 million redirects per day. That's ~115 redirects per second (RPS) average, with peaks of 500+ RPS.

Write-to-read ratio: roughly 1:100 (on average, each shortened URL is clicked 100 times).

Data growth: 1 million new URLs per day = ~365 million URLs per year. Each URL record: ~1 KB (original URL + metadata) = ~365 GB of raw data per year; with indexes, replicas, and analytics events, budget roughly 1 TB.
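
These estimates reduce to a few lines of arithmetic; a quick sanity check (the constants mirror the assumptions above):

```python
SECONDS_PER_DAY = 24 * 60 * 60                    # 86,400

redirects_per_day = 10_000_000
avg_rps = redirects_per_day / SECONDS_PER_DAY     # ~115.7 redirects/sec average

new_urls_per_day = 1_000_000
bytes_per_record = 1024                           # ~1 KB of URL + metadata
raw_gb_per_year = new_urls_per_day * 365 * bytes_per_record / 1024**3  # ~348 GiB raw

slots = 62 ** 6                                   # 6-char base62 keyspace, ~56.8 billion
years_of_ids = slots / (new_urls_per_day * 365)   # ~155 years before exhaustion
```

Note this is raw data only; the ~1 TB/year budget leaves headroom for indexes, replicas, and analytics rows.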

High-Level Architecture

At a high level, you need:

  1. Load Balancer: Distribute incoming requests
  2. API Servers: Handle create/redirect logic
  3. Database: Store URL mappings
  4. Cache: Speed up redirects (most reads are repeated)
  5. Monitoring/Logging: Track health and issues
User -> Load Balancer -> API Servers -> Cache -> Database
                                |
                         CDN (optional)

URL Generation Strategy

You need to generate short, unique identifiers. Two main approaches:

Approach 1: Base62 Encoding of Auto-Incrementing ID

  • Generate sequential IDs: 1, 2, 3, ...
  • Encode to base62: 0-9, a-z, A-Z (62 characters)
  • 6 characters = 62^6 ≈ 57 billion combinations
  • Pros: Simple, no collisions, compact
  • Cons: Sequential IDs leak information (users can guess IDs), exhaustion calculation needed

Approach 2: Random String Generation

  • Generate random 6-character strings
  • Check for collisions, retry if needed
  • Pros: Non-sequential, harder to guess
  • Cons: Risk of collisions, slower generation
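
Approach 2 can be sketched with the standard library; `generate_unique_code` and its `exists` callable are hypothetical glue around whatever uniqueness check the database provides:

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase  # base62

def random_code(length=6):
    # Cryptographically random, so codes can't be guessed from creation order.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def generate_unique_code(exists, max_attempts=5):
    # `exists` stands in for a DB uniqueness check; retry on collision.
    for _ in range(max_attempts):
        code = random_code()
        if not exists(code):
            return code
    raise RuntimeError("too many collisions; consider a longer code")
```

With ~57 billion slots and only millions of codes issued, the retry loop almost never fires in practice.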

Recommendation: Use auto-incrementing ID + base62 encoding. It's simpler and proven (used by bit.ly).

Sample code:

def encode_id(id):
    # Map a numeric database ID to a base62 short code.
    base62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    if id == 0:
        return base62[0]
    short_url = ""
    while id > 0:
        short_url = base62[id % 62] + short_url  # prepend least-significant digit
        id //= 62
    return short_url
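
Decoding is the inverse walk, handy if you want to look up the database row by numeric ID rather than by string (a minimal sketch):

```python
def decode_code(short_url):
    # Invert encode_id: fold base62 digits back into the numeric row ID.
    base62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    num = 0
    for ch in short_url:
        num = num * 62 + base62.index(ch)
    return num
```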

Database Schema

URLs table:

CREATE TABLE urls (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    original_url VARCHAR(2048) NOT NULL,
    short_url VARCHAR(10) UNIQUE NOT NULL,
    user_id INT,
    created_at TIMESTAMP,
    expires_at TIMESTAMP,
    is_active BOOLEAN
);
 
-- The UNIQUE constraint already indexes short_url in MySQL,
-- so only user_id needs an explicit index:
CREATE INDEX idx_user_id ON urls(user_id);

Redirects/Analytics table (optional):

CREATE TABLE analytics (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    short_url_id BIGINT,
    user_agent VARCHAR(255),
    referrer VARCHAR(2048),
    ip_address VARCHAR(45),
    country VARCHAR(2),
    timestamp TIMESTAMP
);
 
CREATE INDEX idx_short_url_id ON analytics(short_url_id);

Key decisions:

  • Use BIGINT for ID to handle years of data
  • Original URL is VARCHAR(2048) to handle long URLs
  • Short URL is VARCHAR(10) because 6+ characters in base62 gives us enough space
  • Denormalize created_at and is_active for quick lookups

Caching Layer

Redirects are read-heavy. Cache is essential.

Cache design:

  • Store: short_url -> original_url mappings
  • TTL: None for permanent URLs; evict via LRU or memory limits. Set a TTL only for links that carry an expires_at.
  • Hit rate: Expect 80%+ with temporal locality

Cache update strategy:

  • On create: Add to cache
  • On redirect: Check cache first, then DB
  • On expiry: Invalidate from cache

Sample code:

class URLShortener:
    def __init__(self):
        self.cache = {}  # Redis in production
        self.db = Database()
 
    def redirect(self, short_url):
        # Check cache
        if short_url in self.cache:
            return self.cache[short_url]
 
        # Check database
        original_url = self.db.get(short_url)
 
        if original_url:
            self.cache[short_url] = original_url
            return original_url
 
        return None

Handling Collisions and Conflicts

What if multiple users want the same short code?

Option 1: Retry on collision. When generating a code, check whether it already exists. If it does, generate again.

Option 2: Custom short URLs. Allow users to request custom codes. Handle conflicts by rejecting or appending a suffix.

Option 3: Hierarchical namespaces. Give each user their own namespace: user123/mylink vs user456/mylink.

For a simple design, assume collisions on auto-generated codes are rare and implement collision detection plus retry.

Scaling the Write Path

Creating URLs is relatively simple, but at scale, you need:

Database bottleneck: Writing millions of short URLs daily.

Solution 1: Database sharding. Shard by user_id or by URL hash to split load across multiple DB instances.

user_id % 4 = 0 -> Shard 0
user_id % 4 = 1 -> Shard 1
user_id % 4 = 2 -> Shard 2
user_id % 4 = 3 -> Shard 3
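
The modulo routing above is a one-liner; a hash-based variant for routing by short code (the MD5 choice here is illustrative, not a prescription) looks like this:

```python
import hashlib

def shard_for_user(user_id, num_shards=4):
    # All of a user's URLs land on one shard.
    return user_id % num_shards

def shard_for_short_url(short_url, num_shards=4):
    # A stable hash of the code lets the read path find a row without the user_id.
    digest = hashlib.md5(short_url.encode()).hexdigest()
    return int(digest, 16) % num_shards
```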

Solution 2: ID generation service. Create a separate microservice that generates IDs using Zookeeper or similar. This prevents collisions across shards.

API Servers -> ID Service (returns a globally unique ID)
API Servers -> DB Shards (store the record under that ID)

Scaling the Read Path

Redirects are the hot path. You need:

1. Caching: As discussed, 80%+ hit rates with Redis.

2. CDN: For geographic distribution, edge locations can serve redirects faster. CDN caches redirects and serves from nearest location.

User in Japan -> CDN edge in Tokyo -> Original long URL (fetched from origin if not in cache)

3. Database replication: Replicate read-heavy databases to multiple regions.

Master DB (Writes) -> Replica DB1 (Reads)
                   -> Replica DB2 (Reads)
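
A toy router makes the read/write split concrete. Real replication is asynchronous; the eager in-memory copy here is only a sketch of the routing logic:

```python
import random

class ReplicatedDB:
    # Hypothetical router: writes go to the primary, reads fan out across replicas.
    def __init__(self, num_replicas=2):
        self.primary = {}
        self.replicas = [dict() for _ in range(num_replicas)]

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:   # async in production; eager here
            replica[key] = value

    def read(self, key):
        # Any replica can serve a redirect lookup.
        return random.choice(self.replicas).get(key)
```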

Analytics and Logging

Tracking clicks requires logging millions of events per day.

Option 1: Synchronous logging

def redirect(short_url):
    original_url = get_url(short_url)
    log_click(short_url, request.ip, request.user_agent)  # Blocks redirect
    return original_url

Problem: Logging blocks the redirect response, increasing latency.

Option 2: Asynchronous logging (Recommended)

def redirect(short_url):
    original_url = get_url(short_url)
    queue.enqueue(log_click, short_url, request.ip, request.user_agent)  # Non-blocking
    return original_url

Use a message queue (Kafka, RabbitMQ) to decouple logging from redirects.

Analytics pipeline:

API Servers -> Message Queue -> Analytics Workers -> Analytics Database
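
In-process, the same decoupling can be sketched with a queue and a worker thread; Kafka or RabbitMQ replaces the queue in production, and the event shape here is made up:

```python
import queue
import threading

click_queue = queue.Queue()

def analytics_worker(store):
    # Drains click events off the hot path; `store` stands in for the analytics DB.
    while True:
        event = click_queue.get()
        if event is None:          # sentinel to shut down
            break
        store.append(event)
        click_queue.task_done()

def redirect(short_url, ip, lookup):
    # Resolve and return immediately; logging is enqueued, never awaited.
    original_url = lookup[short_url]
    click_queue.put({"short_url": short_url, "ip": ip})
    return original_url
```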

Handling Expiration and Cleanup

URLs can expire. You need cleanup.

Option 1: Lazy deletion. Check the is_active flag on reads; periodically scan and mark expired rows inactive.

Option 2: TTL in cache. Set a TTL in Redis so expired entries are removed automatically.

Option 3: Background jobs. Scheduled jobs scan the database and soft-delete expired URLs.
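
A tiny in-process stand-in for Redis's EXPIRE behavior shows the lazy-eviction idea (pure-Python sketch, not production code):

```python
import time

class TTLCache:
    # Entries vanish once their TTL elapses; eviction happens lazily on read.
    def __init__(self):
        self._store = {}   # key -> (value, expires_at or None)

    def set(self, key, value, ttl_seconds=None):
        expires_at = time.time() + ttl_seconds if ttl_seconds else None
        self._store[key] = (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.time() >= expires_at:
            del self._store[key]   # lazy eviction
            return None
        return value
```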

Handling Edge Cases

Long URLs: Original URLs can run to 2 KB. Validate length against the VARCHAR(2048) column limit, reject anything longer, and store them properly encoded so special characters survive round-trips.

Duplicate submissions: If same user shortens the same URL twice, return the existing short code (or create new one per user preference).

Custom short codes: Validate user-provided codes. Check for profanity or offensive words.

HTTPS redirects: Always redirect to HTTPS if original URL supports it.

Circular redirects: Validate that short URL doesn't redirect to itself.
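
The scheme and circular-redirect checks boil down to URL validation; a sketch (the short domain is a made-up placeholder):

```python
from urllib.parse import urlparse

SHORTENER_HOSTS = {"phanto.co"}  # hypothetical short domain

def is_safe_target(original_url):
    # Reject non-http(s) schemes and URLs pointing back at the shortener (loop risk).
    parsed = urlparse(original_url)
    if parsed.scheme not in ("http", "https"):
        return False
    return parsed.hostname not in SHORTENER_HOSTS
```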

Availability and Reliability

High Availability Design:

  1. Multi-region deployment: Deploy API servers and databases in multiple regions. If one region fails, others serve traffic.

  2. Database failover: Use master-slave replication with automatic failover. If master fails, promote a slave.

  3. Load balancing: Use consistent hashing for load balancing. Add/remove servers without rerouting all requests.

  4. Health checks: Monitor API servers, databases, caches. Remove unhealthy instances from load balancer.

  5. Circuit breakers: If database is slow, circuit breaker throttles requests to prevent cascading failures.
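
A minimal circuit breaker captures the fail-fast idea; the threshold and cooldown values are illustrative, and production systems typically reach for a library such as pybreaker:

```python
import time

class CircuitBreaker:
    # Opens after N consecutive failures, rejects calls while open,
    # and allows a trial call again after the cooldown.
    def __init__(self, failure_threshold=3, cooldown_seconds=30):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.cooldown_seconds:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None   # half-open: allow one trial call
            self.failures = 0
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()
            raise
        self.failures = 0
        return result
```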

Final Architecture Diagram

                         CDN
                         |
User Request -----> Load Balancer -----> API Servers -----> Cache (Redis)
                                              |                  |
                                          Message Queue       Database (MySQL)
                                              |                  |
                                        Analytics Workers   Database Replicas

Complete Code Sketch

class URLShortener:
    def __init__(self):
        self.db = Database()
        self.cache = RedisCache()
        self.id_generator = IDGenerator()
        self.message_queue = MessageQueue()

    def create_short_url(self, original_url, user_id, custom_code=None):
        # Use the custom code if provided, otherwise encode a fresh ID
        if custom_code:
            if self.db.exists(custom_code):
                raise DuplicateError()
            short_url = custom_code
        else:
            id = self.id_generator.next_id()
            short_url = self.encode(id)

        # Store in database
        self.db.insert(original_url, short_url, user_id)

        # Update cache
        self.cache.set(short_url, original_url)

        return short_url

    def redirect(self, short_url):
        # Check cache first
        original_url = self.cache.get(short_url)

        if not original_url:
            # Fall back to the database, then warm the cache
            original_url = self.db.get(short_url)
            if original_url:
                self.cache.set(short_url, original_url)

        if original_url:
            # Async logging; `request` comes from the web framework
            self.message_queue.enqueue(log_click, short_url, request.ip)
            return http_redirect(original_url)  # framework's redirect helper
        else:
            return 404

    def encode(self, id):
        base62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
        if id == 0:
            return base62[0]
        result = ""
        while id > 0:
            result = base62[id % 62] + result
            id //= 62
        return result

Trade-Offs and Decisions

  • Read vs. Write: This system is read-heavy. Optimize for fast redirects.
  • Consistency vs. Availability: CAP theorem: prioritize availability (always serve redirects) over consistency.
  • Complexity vs. Simplicity: Start simple, add sharding and replication only if needed.
  • Analytics accuracy vs. speed: Async logging loses some analytics but keeps redirects fast.

Preparing for System Design Interviews

System design interviews require thinking holistically about scalability, not just coding. You need to communicate your thinking, make trade-off decisions, and handle follow-up questions. When you're preparing for system design rounds and want to practice articulating your approach clearly and handling questions, tools can help you practice at scale. Phantom Code supports system design interviews and helps you practice explaining your architecture while getting feedback on your trade-off decisions. While it's primarily known for coding interviews, it can be valuable for practicing the communication and structured thinking that system design requires. Plans start at ₹499/month at phantomcode.co.

Final Thoughts

Designing a URL shortener touches every important system design concept: databases, caching, scalability, replication, sharding, and monitoring. Master this design and you're well-prepared for senior engineering interviews at top tech companies.

Frequently Asked Questions

Should you use auto-incrementing IDs or random strings for short URLs?
Auto-incrementing IDs encoded in base62 are simpler, collision-free, and compact (6 chars ≈ 57 billion combinations). The downside is that sequential IDs leak creation order; if that's a concern, hash the ID or use a random component.
What database should a URL shortener use?
Start with MySQL or PostgreSQL with an indexed short_url column. The workload is 100:1 read-to-write and most reads hit cache, so a relational store with replicas is plenty. Consider DynamoDB or Cassandra only at extreme write volumes.
Why is async analytics important?
Synchronous analytics logging blocks the redirect HTTP response, increasing user-facing latency. Push the click event to Kafka or RabbitMQ and let analytics workers process asynchronously — your redirect stays under 100ms.
How do you handle URL expiration at scale?
Combine three strategies: TTL in Redis cache for automatic eviction, an is_active flag for lazy deletion on read, and a nightly background job that soft-deletes expired rows from the database.
What's the CAP tradeoff for a URL shortener?
Prioritize availability over strict consistency. Users would rather hit an occasionally stale redirect than see a 503. Use eventual consistency between regions, and tolerate brief inconsistency for newly created URLs.
