A beginner-friendly systems design walkthrough for building and scaling a URL shortener.
How much engineering could possibly hide behind turning this:
https://example.com/articles/system-design-for-beginners
into this?
https://sho.rt/aZ91x
At first glance, a URL shortener feels like a small dictionary: store a long URL under a short code, then look it up later. That is the right starting point. The interesting design decisions appear when millions of people start clicking links at the same time.
A URL shortener has two main jobs:
A few terms will help:
aZ91x.Before drawing boxes, clarify the product.
/my-resume.That last point shapes the architecture. This is a read-heavy system.
Note: Start with the smallest useful feature set. Link previews, advanced analytics, abuse detection, and user accounts can be added later.
The service only needs two essential endpoints.
POST /api/v1/urls
Content-Type: application/json
{
"url": "https://example.com/articles/system-design-for-beginners",
"expiresAt": "2027-01-01T00:00:00Z"
}
Response:
{
"shortUrl": "https://sho.rt/aZ91x",
"code": "aZ91x"
}
GET /aZ91x
Response:
HTTP/1.1 302 Found
Location: https://example.com/articles/system-design-for-beginners
A 302 redirect is a sensible default because it allows the service to continue observing clicks. A 301 permanent redirect may encourage browsers and other systems to cache the destination more aggressively.
A short code must be compact, URL-friendly, and unique.
One approachable strategy is:
Base62 uses:
0-9 a-z A-Z
With six characters, Base62 provides:
62^6 = 56,800,235,584 combinations
That is more than 56 billion possible codes.
Here is a simplified implementation:
const alphabet =
"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
function encodeBase62(id: number): string {
if (id === 0) return alphabet[0];
let result = "";
while (id > 0) {
result = alphabet[id % 62] + result;
id = Math.floor(id / 62);
}
return result;
}
console.log(encodeBase62(125)); // "21"
// Risky: truncation can create collisions.
const code = hash(url).slice(0, 6);
A hash is useful, but shortening it increases the chance of collisions. You would still need to check the database and retry when a duplicate appears.
A database-generated ID encoded as Base62 is easier to explain and operate in an entry-level design.
Tip: Predictable IDs are acceptable for a learning exercise. If people must not guess nearby links, use randomized codes or add a reversible scrambling step.
A relational database is a practical starting point.
CREATE TABLE short_urls (
id BIGINT PRIMARY KEY,
code VARCHAR(12) NOT NULL UNIQUE,
original_url TEXT NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
expires_at TIMESTAMP NULL
);
CREATE INDEX idx_short_urls_code ON short_urls(code);
The code index matters because redirects happen frequently:
SELECT original_url, expires_at
FROM short_urls
WHERE code = 'aZ91x';
Without an index, the database may scan many rows to find one link. That is like searching every apartment in a city instead of checking the address directory.
Browser or API Client
|
v
Load Balancer
|
v
URL Shortener API ------> Cache
|
+---------------> Primary Database
|
+---------------> Event Queue ------> Analytics Worker
|
v
Analytics Store
/aZ91x.Popular links may be opened thousands of times per second. Reading the same row repeatedly from the database wastes work.
A cache keeps frequently requested mappings close to the application:
aZ91x -> https://example.com/articles/system-design-for-beginners
Pseudocode:
async function redirect(code: string) {
const cachedUrl = await cache.get(code);
if (cachedUrl) {
return redirectTo(cachedUrl); // Fast path
}
const record = await database.findByCode(code);
if (!record || isExpired(record)) {
return notFound();
}
await cache.set(code, record.originalUrl, { ttl: 3600 });
return redirectTo(record.originalUrl);
}
The cache is helpful, but it is not the source of truth. If it disappears, the database still contains every link.
Click counts are useful, but a redirect should not wait for analytics processing.
// Slower redirect: the visitor waits for the counter update.
await database.incrementClickCount(code);
return redirectTo(url);
Instead, publish an event and process it asynchronously:
// The redirect remains fast.
await eventQueue.publish({
type: "url.clicked",
code,
clickedAt: new Date().toISOString(),
});
return redirectTo(url);
A background worker can aggregate counts later. If analytics briefly falls behind, links still work.
A few design mistakes show up quickly as traffic grows.
Popular links create unnecessary database load.
Fix: Cache frequently accessed mappings.
Analytics writes slow down the user-facing path.
Fix: Send click events to a queue and process them asynchronously.
A URL shortener can hide misleading links.
Fix: Validate URL schemes, rate-limit link creation, and add abuse reporting.
function isAllowedUrl(value: string): boolean {
const url = new URL(value);
return url.protocol === "http:" || url.protocol === "https:";
}
Expired links should not redirect forever.
Fix: Check expires_at after cache misses and use a cache time-to-live (TTL).
Sharding, globally distributed databases, and elaborate ID services add operational cost.
Fix: Begin with one database, stateless API servers, a cache, and a queue. Add complexity when measured traffic requires it.
For an entry-level implementation, build this:
code.When traffic grows, consider:
A URL shortener starts as a dictionary:
short code -> original URL
The architecture becomes more interesting when we protect the database from repeated reads and keep optional work away from the redirect path.
The core lessons are broadly useful:
That is a solid systems design answer: begin with a small reliable service, explain the bottlenecks clearly, and scale each part only when the traffic justifies it.
-Caleb