Caching: Why Fast Systems Are the Ones Too Lazy to Recompute

Why Do We Need a Cache?

Imagine you have an API that returns a list of popular products. Each request:

Query the database (50 ms)
Join with the ratings table (30 ms)
Sort and filter (20 ms)
Serialize to JSON (10 ms)

Total: 110 ms per request.

With 1000 requests/second hitting the same endpoint, your database takes 1000 queries/second, even though every result is identical. Popular-product data doesn't change every second.

That's waste.

A cache is a simple idea: store the result you already computed so you don't have to compute it again.

graph LR
    R1["Request 1"] --> App
    App -->|"cache miss"| DB[(Database)]
    DB -->|"110ms"| App
    App -->|"store in cache"| Cache[(Cache)]
    App --> Response1["Response (110ms)"]

    R2["Request 2-1000"] --> App2[App]
    App2 -->|"cache hit"| Cache2[(Cache)]
    Cache2 -->|"1ms"| App2
    App2 --> Response2["Response (1ms) ⚡"]

The first request still takes 110 ms (cache miss). Requests 2 through 1000 take about 1 ms each (cache hit), and the database serves 1 query instead of 1000.
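To make the arithmetic concrete, here is a small sketch that computes the average latency for the scenario above. The function name `averageLatencyMs` and its parameters are made up for illustration; the 110 ms and 1 ms figures are the illustrative numbers from the example, not measurements.

```javascript
// Average latency for a mix of cache hits and misses.
// hitMs and missMs default to the illustrative figures from the example.
function averageLatencyMs(totalRequests, misses, hitMs = 1, missMs = 110) {
  const hits = totalRequests - misses;
  return (misses * missMs + hits * hitMs) / totalRequests;
}

// 1000 requests, only the first one misses.
console.log(averageLatencyMs(1000, 1)); // 1.109
```

With a single miss out of 1000 requests, the average drops from 110 ms to roughly 1.1 ms.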

Where Can a Cache Live?

A cache can sit at many layers. In general, the closer it is to the user, the faster the response.

graph TB
    User["👤 User"]
    Browser["Browser Cache<br/>(closest to the user)"]
    CDN["CDN Cache<br/>(edge server)"]
    LB["Load Balancer"]
    AppCache["Application Cache<br/>(Redis/Memcached)"]
    App["Application Server"]
    DBCache["Database Cache<br/>(query cache)"]
    DB["Database"]

    User --> Browser --> CDN --> LB --> App
    App --> AppCache
    App --> DB
    DB --> DBCache

Layer	Example	Typical latency	Good for
Browser	HTTP cache headers	Near-instant	Static assets (CSS, JS, images)
CDN	Cloudflare, CloudFront	~10-50 ms	Static content, API responses that rarely change
Application	Redis, Memcached	~1-5 ms	Session, query results, computed data
Database	MySQL query cache (removed in MySQL 8.0), PostgreSQL shared buffers	~5-20 ms	Frequently accessed rows

Caching Strategies

1. Cache-Aside (Lazy Loading)

The most common strategy. The application manages the cache itself.

sequenceDiagram
    participant App
    participant Cache
    participant DB

    App->>Cache: Get "product:123"
    Cache-->>App: null (miss)
    App->>DB: SELECT * FROM products WHERE id=123
    DB-->>App: {product data}
    App->>Cache: Set "product:123" = {data}, TTL 5min
    App-->>App: Return data

    Note over App,DB: Next request...

    App->>Cache: Get "product:123"
    Cache-->>App: {product data} (hit!)
    App-->>App: Return data (skip DB)

How it works:

Check the cache first
If the data is there (hit) → return it from the cache
If it isn't (miss) → query the DB → store the result in the cache → return it

Pros:

Simple and easy to implement
The cache only holds data that is actually requested (no waste)
If the cache goes down, the app keeps working (it falls back to the DB)

Cons:

The first request for each key (a cache miss) is always slow
Data can be stale (outdated) until the TTL expires
Thundering herd problem (see Common Pitfalls below)

2. Write-Through

Every write to the DB is also written to the cache as part of the same operation.

sequenceDiagram
    participant App
    participant Cache
    participant DB

    App->>Cache: Set "product:123" = {updated data}
    App->>DB: UPDATE products SET ... WHERE id=123
    Note over Cache,DB: Both stay in sync

Pros:

The cache is always up to date
No stale reads, as long as every write goes through this path and both writes succeed

Cons:

Writes are slower (two writes: cache + DB)
May cache data that is never read (wasted memory)
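Here's a minimal sketch of the write-through idea. It is not tied to any particular library: `db` and `cache` are hypothetical in-memory stand-ins (plain `Map`s), and `writeThrough` and `read` are names made up for illustration.

```javascript
// Hypothetical in-memory stand-ins for a real database and cache.
const db = new Map();
const cache = new Map();

// Write-through: a write is only acknowledged after both the
// database and the cache have been updated.
async function writeThrough(key, value) {
  db.set(key, value);    // in a real system: an UPDATE against the database
  cache.set(key, value); // then refresh the cache with the same value
}

// Reads can be served from the cache, falling back to the database.
async function read(key) {
  return cache.has(key) ? cache.get(key) : db.get(key);
}
```

This sketch writes to the DB first and the cache second, so a failed DB write never leaves a value in the cache that the DB doesn't have. That ordering is a design choice of the sketch; the diagram above shows the cache write first.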

3. Write-Behind (Write-Back)

Write to the cache first; the DB is updated later, asynchronously (usually in batches).

sequenceDiagram
    participant App
    participant Cache
    participant Queue
    participant DB

    App->>Cache: Set "product:123" = {data}
    App-->>App: Return success (fast!)
    Cache->>Queue: Queue write
    Queue->>DB: Batch update (async)

Pros:

Writes are very fast (they only touch the cache)
DB load goes down (batched writes)

Cons:

Risk of data loss: if the cache crashes before flushing to the DB, those writes are gone
Eventual consistency: the DB can lag behind the cache
More complex to implement
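A rough sketch of write-behind, again with hypothetical in-memory stand-ins (`cache`, `db`, and `pendingWrites` are plain `Map`s; `writeBehind` and `flush` are illustrative names). A real setup would run `flush` on a timer or in a background worker and would need to handle failures.

```javascript
// Hypothetical in-memory stand-ins for a real cache and database.
const cache = new Map();
const db = new Map();
const pendingWrites = new Map(); // key -> latest value not yet in the DB

// Write-behind: update the cache, remember the write, return immediately.
function writeBehind(key, value) {
  cache.set(key, value);
  pendingWrites.set(key, value); // a later write to the same key replaces the earlier one
}

// Flush all pending writes to the database in one batch.
// Returns how many keys were written.
function flush() {
  const batch = [...pendingWrites];
  pendingWrites.clear();
  for (const [key, value] of batch) {
    db.set(key, value);
  }
  return batch.length;
}
```

Notice that two writes to the same key before a flush collapse into one DB write, which is where the reduced DB load comes from. It's also where the data-loss risk lives: anything still in `pendingWrites` when the process dies never reaches the DB.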

4. Read-Through

Similar to cache-aside, but the cache itself is responsible for fetching data from the DB on a miss. The application only talks to the cache.

Pro: application code is cleaner
Con: the cache has to know how to query the DB (coupling)
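One way to picture read-through is a cache object that owns a loader function. This is only a sketch: the class name `ReadThroughCache` and the `loader` parameter are assumptions for illustration, and there is no TTL or eviction here.

```javascript
// A minimal read-through cache: callers only talk to the cache,
// and the cache calls `loader` on a miss.
class ReadThroughCache {
  constructor(loader) {
    this.loader = loader; // async (key) => value, e.g. a DB query
    this.store = new Map();
  }

  async get(key) {
    if (this.store.has(key)) {
      return this.store.get(key); // hit
    }
    const value = await this.loader(key); // miss: the cache loads the data itself
    this.store.set(key, value);
    return value;
  }
}
```

The application would call something like `await products.get("product:123")` and never touch the DB directly; the coupling mentioned above is the `loader` living inside the cache layer.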

Which One When?

Strategy	Write Speed	Read Speed	Consistency	Data Loss Risk
Cache-Aside	Normal	Fast (hit)	Eventual	Low
Write-Through	Slow (2x write)	Fast	Strong	Low
Write-Behind	Very fast	Fast	Eventual	High
Read-Through	Normal	Fast	Eventual	Low

Default recommendation: start with Cache-Aside unless you have a strong reason to use something else.

Redis vs Memcached

The two most popular in-memory data stores for caching.

Redis

Redis is more than a cache. It is an in-memory data structure server.

Supported data structures:

String (SET key value)
Hash (HSET user:1 name "Budi" age 25)
List (LPUSH queue task1 task2)
Set (SADD tags "docker" "devops")
Sorted Set (ZADD leaderboard 100 "player1")
Stream, Bitmap, HyperLogLog, and more

Additional features:

Persistence: can save to disk (RDB snapshots / AOF log)
Replication: master-replica for HA
Pub/Sub: lightweight messaging
Lua scripting: atomic operations
TTL per key: auto-expire
Cluster mode: horizontal scaling

Memcached

Memcached is a pure cache. Simple, fast, focused.

Features:

Key-value store (string values only)
Multi-threaded (can take advantage of many CPU cores)
Consistent hashing for distribution (done on the client side)

Comparison

Aspect	Redis	Memcached
Data types	String, Hash, List, Set, etc.	String only
Persistence	✅ (RDB/AOF)	❌ (in-memory only)
Replication	✅ Master-Replica	❌
Pub/Sub	✅	❌
Threading	Single-threaded command execution (multi-threaded I/O since 6.0)	Multi-threaded
Max value size	512 MB	1 MB (default)
Use case	Cache + session + queue + leaderboard	Pure caching
Complexity	More complex	Simpler

When to use Redis? Almost always. Redis covers nearly everything Memcached does, plus a lot more.

When to use Memcached? When all you need is a simple key-value cache and you want to leverage multi-threading on a large multi-core machine.

Cache Invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things."
— Phil Karlton

Cache invalidation is the hardest problem in caching: how do you make sure the data in the cache is still valid?

1. TTL (Time-To-Live)

The simplest approach. Set an expiry time; once the TTL runs out, the entry is deleted automatically.

SET product:123 "{data}" EX 300

Here EX 300 expires the key after 300 seconds (5 minutes).

Trade-off:

TTL too short → frequent cache misses, the DB stays busy
TTL too long → data stays stale for a long time

Rule of thumb: start at 5 minutes, then adjust based on how often the data changes.
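To see what a TTL does mechanically, here's a minimal in-memory sketch. It is not how Redis implements expiry; the class name `TtlCache` and the injectable `now` clock are made up for illustration, so expiry can be shown without actually waiting.

```javascript
// Minimal in-memory cache with a per-key TTL.
// `now` is injectable so expiry can be demonstrated without waiting.
class TtlCache {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  set(key, value, ttlSeconds) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined; // never cached
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazily drop the expired entry on read
      return undefined;
    }
    return entry.value;
  }
}
```

A `get` after the TTL has elapsed behaves exactly like a miss, which is why a short TTL sends more traffic to the DB and a long one keeps stale values around.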

2. Event-Based Invalidation

When the data changes, delete the cache entry explicitly.

sequenceDiagram
    participant Admin
    participant App
    participant Cache
    participant DB

    Admin->>App: Update product price
    App->>DB: UPDATE products SET price=99000
    App->>Cache: DELETE "product:123"
    Note over Cache: Cache cleared!
    
    Note over App: Next request...
    App->>Cache: GET "product:123" → miss
    App->>DB: SELECT → fresh data
    App->>Cache: SET "product:123" = {fresh data}

Pro: data is fresh right after an update
Con: every write path has to invalidate the cache. Miss one and the data goes stale.

3. Version-Based

Add a version number to the cache key.

product:123:v5 = {data}

When the data changes, increment the version. The old key is simply no longer read (and eventually expires via its TTL).
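A small sketch of versioned keys. Everything here is hypothetical: `versions` and `cache` are in-memory `Map`s, and `versionedKey` and `bumpVersion` are illustrative names. In practice the current version has to live somewhere cheap to read, for example in the cache itself or alongside the record.

```javascript
// Hypothetical in-memory stand-ins: `versions` tracks the current version
// per product, `cache` holds the versioned entries.
const versions = new Map();
const cache = new Map();

// Build the cache key for the product's current version (starting at v1).
function versionedKey(productId) {
  const version = versions.get(productId) ?? 1;
  return `product:${productId}:v${version}`;
}

// Called whenever the product changes: old keys are simply never read again.
function bumpVersion(productId) {
  versions.set(productId, (versions.get(productId) ?? 1) + 1);
}
```

After `bumpVersion(123)`, `versionedKey(123)` moves from `product:123:v1` to `product:123:v2`, so the next read misses and repopulates under the new key.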

CDN (Content Delivery Network)

A CDN is a cache distributed around the world.

graph TB
    Origin["Origin Server<br/>(Singapore)"]
    CDN1["CDN Edge<br/>Tokyo 🇯🇵"]
    CDN2["CDN Edge<br/>Sydney 🇦🇺"]
    CDN3["CDN Edge<br/>Mumbai 🇮🇳"]
    CDN4["CDN Edge<br/>Frankfurt 🇩🇪"]

    U1["User Tokyo"] --> CDN1
    U2["User Sydney"] --> CDN2
    U3["User India"] --> CDN3
    U4["User Germany"] --> CDN4
    CDN1 --> Origin
    CDN2 --> Origin
    CDN3 --> Origin
    CDN4 --> Origin

Say a user in Tokyo opens your website. Without a CDN → the request travels all the way to the server in Singapore (high latency). With a CDN → the response is served from the edge server in Tokyo (low latency).

What Does a CDN Cache?

Static assets: CSS, JS, images, fonts, video
API responses: when the response rarely changes
HTML pages: for static sites

HTTP Cache Headers

CDNs (and browsers) use HTTP headers to decide what may be cached and for how long.

Cache-Control: public, max-age=86400

Header	Purpose
`Cache-Control: public`	May be cached by shared caches (CDN) and by the browser
`Cache-Control: private`	May be cached only by the browser (not by a CDN)
`Cache-Control: no-cache`	May be stored, but must be revalidated with the origin before each reuse
`Cache-Control: no-store`	Must not be cached at all
`max-age=3600`	The cached copy is fresh for 1 hour (3600 seconds)
`ETag`	Opaque version identifier for the content (often a hash), used to check whether it changed
`Last-Modified`	Timestamp of the last modification
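Here's a rough sketch of how a server might combine `Cache-Control` and `ETag`. The helpers `makeEtag` and `cachedResponse` are invented for this example, and the `If-None-Match` handling is simplified: it compares a single exact value, while a real header can carry several tags and weak validators.

```javascript
const crypto = require('node:crypto');

// Derive an ETag from the response body. Hashing the body is one common
// choice; HTTP itself treats the ETag as an opaque string.
function makeEtag(body) {
  const hash = crypto.createHash('sha256').update(body).digest('hex');
  return `"${hash.slice(0, 16)}"`;
}

// Decide what to send back, given the request's If-None-Match header.
function cachedResponse(body, ifNoneMatch, maxAgeSeconds = 3600) {
  const etag = makeEtag(body);
  const headers = {
    'Cache-Control': `public, max-age=${maxAgeSeconds}`,
    ETag: etag,
  };
  if (ifNoneMatch === etag) {
    return { status: 304, headers, body: '' }; // the client's copy is still valid
  }
  return { status: 200, headers, body };
}
```

The first request gets a 200 with the body and an `ETag`. When the client later revalidates with that tag and the content hasn't changed, the server can answer 304 with an empty body.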

Popular CDNs

CDN	Strengths
Cloudflare	Generous free tier, DDoS protection, edge computing
AWS CloudFront	AWS integration, Lambda@Edge
Fastly	Real-time purging, VCL config
Akamai	Enterprise focus, one of the largest networks
Vercel/Netlify	Built-in for JAMstack apps

Common Pitfalls

1. Thundering Herd

A popular cache key expires → thousands of requests query the DB at once, because they all get a cache miss at the same moment.

graph LR
    subgraph TH["Thundering Herd"]
        R1["Req 1"] --> DB[(DB)]
        R2["Req 2"] --> DB
        R3["Req 3"] --> DB
        R4["Req N..."] --> DB
    end
    Note1["Cache expired!\nAll requests hit the DB at once 💥"]

Solutions:

Lock/Mutex: the first request that misses takes a lock, queries the DB, and sets the cache. Other requests wait
Stale-while-revalidate: return the stale data while refreshing it in the background
Jitter on the TTL: don't give every key exactly the same TTL. Add a random offset so they don't all expire together
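The lock idea can be sketched inside a single Node.js process by sharing one in-flight promise per key (often called "single flight"). `inFlight` and `singleFlight` are names made up for this sketch. It only coordinates requests within one process; across several app servers you would need a shared lock instead, for example one built on Redis `SET` with the `NX` and `EX` options.

```javascript
// In-flight loads, keyed by cache key. Concurrent callers for the same key
// share one promise instead of each hitting the database.
const inFlight = new Map();

async function singleFlight(key, loader) {
  if (inFlight.has(key)) {
    return inFlight.get(key); // someone is already loading this key: wait for them
  }
  const promise = Promise.resolve()
    .then(() => loader(key))
    .finally(() => inFlight.delete(key)); // allow a fresh load next time
  inFlight.set(key, promise);
  return promise;
}
```

If three requests call `singleFlight("product:123", loadFromDb)` at the same time, `loadFromDb` (a hypothetical loader) runs once and all three get the same result.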

2. Cache Stampede

Similar to thundering herd, but it happens on a cold start (the cache was just brought up and is empty), so every request misses. In practice the two terms are often used interchangeably.

Solution: cache warming, i.e. pre-populate the cache before it starts taking traffic.

3. Hot Key

One key is accessed far more than the others, e.g. home page data or a trending item.

Solutions:

Replicate the hot key across multiple Redis instances
Use a local (in-memory) cache on the application server as an L1 cache

4. Cache Penetration

Requests for data that exists in neither the DB nor the cache. Every request misses → every request queries the DB → the DB does work for nothing.

Example: an attacker queries product:999999999, which doesn't exist. The cache doesn't have it, the DB doesn't have it, but the query still runs.

Solutions:

Cache null results: store product:999999999 = null with a short TTL
Bloom filter: a data structure that can tell you "this key definitely doesn't exist" without querying the DB
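For intuition, here is a tiny Bloom filter sketch. The class, its default sizes, and the FNV-1a-style hash are all choices made for brevity in this example; a production system would more likely use an existing implementation with properly sized parameters.

```javascript
// A tiny Bloom filter. It can answer "definitely not present" or
// "possibly present"; it never gives a false negative.
class BloomFilter {
  constructor(sizeBits = 1024, hashCount = 3) {
    this.sizeBits = sizeBits;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(sizeBits); // one byte per bit, for clarity
  }

  // FNV-1a style hash, varied per seed. Chosen for brevity, not quality.
  hash(key, seed) {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.sizeBits;
  }

  add(key) {
    for (let s = 0; s < this.hashCount; s++) {
      this.bits[this.hash(key, s)] = 1;
    }
  }

  mightContain(key) {
    for (let s = 0; s < this.hashCount; s++) {
      if (this.bits[this.hash(key, s)] === 0) return false; // definitely absent
    }
    return true; // possibly present (false positives can happen)
  }
}
```

You would add every existing product key to the filter up front, then check `mightContain` before touching the cache or DB. A `false` means the key cannot exist, so the request can be rejected immediately; a `true` still needs the normal lookup.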

5. Cache Avalanche

Many keys expire at the same time → massive cache misses → DB overload.

Unlike thundering herd (one popular key), an avalanche involves many keys at once.

Solutions:

Jitter on the TTL (TTL + random seconds)
Multi-layer cache (L1 in-memory + L2 Redis)
Circuit breaker for when the DB starts to struggle
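TTL jitter is small enough to show in a few lines. The helper name `ttlWithJitter` and its injectable `random` parameter are assumptions made for this sketch.

```javascript
// Return a TTL in the range [baseSeconds, baseSeconds + maxJitterSeconds).
// `random` is injectable so the behavior is easy to check.
function ttlWithJitter(baseSeconds, maxJitterSeconds, random = Math.random) {
  return baseSeconds + Math.floor(random() * maxJitterSeconds);
}
```

Calling `ttlWithJitter(300, 60)` for each key spreads expiries over a one-minute window instead of having them all land on the same second.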

6. Stale Data

Data in the cache is outdated but still being served to users.

Example: a product's price has changed in the DB, but the cache still returns the old price.

Solutions already covered: TTL, event-based invalidation, version-based keys.

Important: decide up front how stale is acceptable. Product prices? They need to be real-time. Follower counts? Being 5 minutes stale is fine.

Caching in Code (Node.js + Redis Example)

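const redis = require('redis');  // node-redis v4+
const { Pool } = require('pg');  // assumption: the $1 placeholders suggest PostgreSQL via `pg`

const client = redis.createClient();
const db = new Pool(); // connection settings come from the PG* environment variables

// node-redis v4+ needs an explicit connect; call this once at startup
async function initCache() {
  await client.connect();
}

async function getProduct(productId) {
  const cacheKey = `product:${productId}`;

  // 1. Check the cache
  const cached = await client.get(cacheKey);
  if (cached !== null) {
    return JSON.parse(cached); // cache hit (a cached 'null' parses back to null)
  }

  // 2. Cache miss → query the DB
  const { rows } = await db.query(
    'SELECT * FROM products WHERE id = $1',
    [productId]
  );
  const product = rows[0];

  if (!product) {
    // Cache the null result to prevent cache penetration
    await client.set(cacheKey, 'null', { EX: 60 });
    return null;
  }

  // 3. Store in the cache (5-minute TTL + jitter)
  const ttl = 300 + Math.floor(Math.random() * 60); // 300-359 seconds
  await client.set(cacheKey, JSON.stringify(product), { EX: ttl });

  return product;
}

// Invalidate on update
async function updateProduct(productId, data) {
  // The column assignments are elided here; build them from `data`
  await db.query('UPDATE products SET ... WHERE id = $1', [productId]);
  await client.del(`product:${productId}`); // invalidate the cache
}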

Summary

Concept	Explanation
Cache	Store an already-computed result so it doesn't have to be computed again
Cache-Aside	The app manages the cache itself: check → miss → query → set
Write-Through	Write to the cache and the DB together
Write-Behind	Write to the cache first; the DB is updated asynchronously
Redis	In-memory data structure server (a cache and more)
Memcached	Pure cache, simple, multi-threaded
TTL	Auto-expire a cache entry after a set time
CDN	A cache distributed to edge servers around the world
Thundering Herd	Many requests miss at once because a popular key expired
Cache Penetration	Queries for data that doesn't exist anywhere

Conclusion

Caching is one of the biggest performance multipliers you have. A single redis.get() can take a response from 500 ms down to around 1 ms.

Things to remember:

Start with Cache-Aside: simple, reliable, good coverage
TTL is mandatory: don't cache without an expiry, because the data will go stale eventually
Add jitter to the TTL: it helps prevent thundering herd and avalanche
Cache null results: it prevents cache penetration from queries for things that don't exist
Redis almost always wins: unless you have a specific reason to use Memcached
Use a CDN for static assets: there is rarely a reason not to
Cache invalidation is hard: but TTL + event-based invalidation covers most use cases
Decide your staleness tolerance: not all data has to be real-time

Remember: a cache is a trade-off between speed and freshness. Decide which matters more for each piece of data, and design your cache accordingly.
