Unit II: Key-Value Databases — Redis
DBS302 · NoSQL Database Systems · BE Software Engineering
How to use these notes: Each section builds on the previous one. If you’re new to Redis, read from 2.1 to 2.8 in order. If you’re reviewing, jump to any section using the table of contents.
Table of Contents
2.1 Introduction to Key-Value Databases
2.1.1 Concept and Architecture
A key-value database stores data as pairs of a unique key and its associated value — exactly like a dict in Python or a HashMap in Java.
🔑 Analogy — Gym Locker Room: Each locker has a unique number (the KEY) and whatever you put inside is the VALUE. If you know the locker number, you get your stuff instantly — no searching through every locker. That’s O(1) lookup.
KEY VALUE
────────────────────────── ─────────────────────────────────
user:1001:name → "Alice"
session:abc123 → "{userId: 1001, role: admin}"
product:iphone15:price → "1299"
leaderboard → [sorted list of players]
Redis Architecture: Redis follows a Client-Server model where clients connect via TCP on port 6379. Internally, Redis is built on three architectural pillars:
| Pillar | What it means |
|---|---|
| In-Memory Storage | All data lives in RAM — no disk reads, sub-millisecond access |
| Non-Blocking I/O | Uses the Reactor Pattern with epoll/kqueue to handle thousands of connections on a single thread |
| Single-Threaded Execution | No context switching, no mutex locks, no race conditions |
The Reactor Pattern is the heart of Redis — it’s an event-driven design where an I/O multiplexer monitors all client connections and dispatches events one by one to a single worker thread. The CPU never waits for slow I/O; it only works when data is actually ready.
Why is single-threaded fast? Because there are no context switches, no lock contention, and the CPU cache stays “warm.” Redis processes 100,000+ requests/second on a single core.
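The event loop described above can be sketched with Python's standard selectors module. This is only a minimal illustration of the Reactor Pattern, not Redis's actual C implementation: the handler function, the socket pair standing in for a client connection, and the +PONG reply are all assumptions made for the example.

```python
import selectors
import socket


def make_handler(log):
    """Build a read handler that records the request and sends a fixed reply."""
    def handle(conn):
        data = conn.recv(1024)       # the socket is ready, so this does not block
        log.append(data)
        conn.sendall(b"+PONG\r\n")   # hypothetical reply for the demo
    return handle


def run_once(selector, timeout=1.0):
    """One turn of the loop: wait for ready sockets, then dispatch them one by one."""
    ready = selector.select(timeout)
    for key, _events in ready:
        key.data(key.fileobj)        # key.data holds the handler registered below
    return len(ready)


selector = selectors.DefaultSelector()   # picks epoll/kqueue/select per platform
client_side, server_side = socket.socketpair()
received = []
selector.register(server_side, selectors.EVENT_READ, make_handler(received))

client_side.sendall(b"PING\r\n")     # the "client" sends a request
handled = run_once(selector)         # the single thread handles whatever is ready
reply = client_side.recv(1024)

selector.unregister(server_side)
selector.close()
client_side.close()
server_side.close()
```

The single thread only touches a connection after the multiplexer reports it readable, which is why it never sits waiting on a slow client.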
How RESP fits in: Clients communicate with Redis using RESP (REdis Serialization Protocol) — a binary-safe, text-based protocol. While the Reactor Pattern is the engine, RESP is the language spoken over connections.
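To make the wire format concrete, here is a small sketch of how a client could serialize a command as a RESP array of bulk strings. The function name encode_command is hypothetical; real client libraries such as redis-py do this internally.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]              # *<number of arguments>
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode("utf-8")
        parts.append(b"$%d\r\n" % len(data))      # $<length in bytes>
        parts.append(data + b"\r\n")              # payload, binary-safe
    return b"".join(parts)


wire = encode_command("SET", "username", "Alice")
```

Because every argument is prefixed with its byte length, the payload can contain any bytes (including newlines), which is what "binary-safe" means here.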
2.1.2 Advantages and Limitations
Advantages:
| Advantage | Why It Matters |
|---|---|
| Blazing Fast | In-memory = no disk I/O. O(1) for most operations. 100,000+ ops/sec per instance, and 1M+ ops/sec is reachable with pipelining |
| Simple Model | No schemas, no JOINs — store and retrieve with minimal complexity |
| Horizontally Scalable | Redis Cluster distributes data across nodes |
| Flexible Data Types | 10+ types: strings, lists, sets, hashes, sorted sets, geo, HLL, and more |
| Atomic Operations | INCR is atomic — safe for concurrent counters without locks |
| TTL Support | Keys auto-expire — perfect for sessions, caches, rate limits |
Limitations:
| Limitation | Explanation |
|---|---|
| Memory-Bound | All data lives in RAM, and RAM is expensive. A 500 GB dataset will not fit on a typical single node and is costly even when sharded |
| No Complex Queries | No SQL JOINs, GROUP BY, or nested WHERE clauses |
| Limited ACID | MULTI/EXEC ≠ full SQL transactions. No rollback on runtime errors |
| Key-Based Lookup Only | Scanning all keys is O(N) and dangerous in production |
| Persistence Complexity | In-memory data risks loss without careful configuration |
⚠️ Key Insight: Redis is not a replacement for PostgreSQL or MySQL. It’s a complement — use it for caching, sessions, queues, and leaderboards alongside a persistent relational database.
2.1.3 Common Use Cases
| Use Case | How Redis Helps | Real Company |
|---|---|---|
| Session Management | Store login sessions with TTL auto-expiry | GitHub, GitLab |
| Caching | Cache DB results and API responses | Twitter, Instagram |
| Rate Limiting | Count API calls per user per time window | Stripe, GitHub API |
| Leaderboards | Sorted Sets rank users by score in real-time | Stack Overflow, Games |
| Real-time Analytics | Count events and unique visitors | YouTube view counts |
| Message Queues | Lists as FIFO queues for background tasks | Celery (Python) |
| Pub/Sub | Event broadcasting between microservices | Slack notifications |
| Geolocation | Find nearby drivers, restaurants, stores | Uber, Swiggy, Zomato |
| Distributed Locks | Prevent race conditions across servers | Payment processing |
🛒 Real-World Example (Shopping Cart): When you add an item to an online shopping cart, the site can store it in Redis under a key such as cart:user:9876. That is faster than querying a database on every page load, and a TTL can expire the cart automatically after, for example, 30 days of inactivity.
2.2 Redis Fundamentals
2.2.1 Redis Data Model
Redis stores everything in a flat key-value namespace — no tables, no rows, no foreign keys. Since all keys share the same space, structured naming is critical.
Key Naming Convention: object-type:id:attribute
user:1001:profile # User profile data
user:1001:sessions # User's active sessions
order:ORD2024001:status # Order status
cache:homepage:trending # Cached content
ratelimit:user:1001:api # Rate limiting counter
💡 Tip: Use colons (:) as separators consistently. This lets you safely iterate with SCAN 0 MATCH user:* in production without blocking.
Key Rules:
| Rule | Detail |
|---|---|
| Format | Binary-safe strings (use readable UTF-8 names) |
| Max Size | 512 MB (keep keys short — they add to memory) |
| Case Sensitive | User:1001 ≠ user:1001 |
| TTL | Keys can auto-expire in seconds or milliseconds |
2.2.2 Redis Data Types
1. Strings
The most fundamental type. Binary-safe. Can hold text, numbers, serialized JSON, even images. Max: 512 MB.
SET username "Alice"
SET counter 0
SET user:1001:profile '{"name":"Alice","age":25}' # Store JSON as string
GET username # → "Alice"
INCR counter # → 1 (ATOMIC increment)
INCRBY counter 5 # → 6
INCRBYFLOAT price 1.5 # Float increment
# Key with expiry:
SETEX session:tok123 3600 "user:1001" # Expires in 1 hour
TTL session:tok123 # Returns remaining seconds
PERSIST session:tok123 # Remove expiry
🐦 Real-World (Twitter Rate Limiting): SET ratelimit:user:1001:tweets 0 → INCR on each tweet → EXPIRE for 3 hours → reject when value hits 300.
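The INCR plus EXPIRE pattern above can be sketched as a fixed-window rate limiter. The function below only calls incr() and expire(), which exist on a redis-py client; FakeRedis is a hypothetical in-memory stand-in so the sketch runs without a server, and the limit and window values are taken from the example rather than from any real service.

```python
class FakeRedis:
    """Hypothetical stand-in exposing only the two calls the limiter needs."""

    def __init__(self):
        self.values = {}
        self.ttls = {}

    def incr(self, key):
        self.values[key] = self.values.get(key, 0) + 1
        return self.values[key]

    def expire(self, key, seconds):
        self.ttls[key] = seconds
        return True


def allow_request(client, user_id, limit=300, window_seconds=3 * 3600):
    """Return True while the user is within `limit` actions for the current window."""
    key = f"ratelimit:user:{user_id}:tweets"
    count = client.incr(key)                  # atomic increment on the server
    if count == 1:
        client.expire(key, window_seconds)    # first hit starts the window
    return count <= limit


client = FakeRedis()
decisions = [allow_request(client, 1001) for _ in range(301)]
```

With a real server, INCR and EXPIRE issued as two separate calls are not atomic together; wrapping them in MULTI/EXEC or a Lua script closes that gap.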
2. Lists
An ordered list of strings — a doubly linked list under the hood. Perfect for queues (LPUSH + RPOP) and stacks (LPUSH + LPOP).
LPUSH tasks "send_email" # ["send_email"]
RPUSH tasks "generate_pdf" # ["send_email", "generate_pdf"]
LLEN tasks # 2
LRANGE tasks 0 -1 # Get ALL elements
LPOP tasks # Remove from left
RPOP tasks # Remove from right
BLPOP tasks 30 # ⭐ BLOCKING pop: waits up to 30s for a new task
📸 Real-World (Instagram Task Queue): When a user uploads a photo, a job is pushed to a Redis list. Worker processes use BLPOP to pick up and process jobs (resize, notify) without polling.
3. Sets
An unordered collection of unique strings. Duplicates are automatically ignored. Supports powerful set math.
SADD followers:user:1001 "user:2001" "user:3001"
SMEMBERS followers:user:1001 # Get all members
SISMEMBER followers:user:1001 "user:2001" # → 1 (exists)
SCARD followers:user:1001 # Count = 2
# Set Math:
SADD tags:article:101 "redis" "nosql" "database"
SADD tags:article:102 "redis" "caching"
SINTER tags:article:101 tags:article:102 # → {"redis"}
SUNION tags:article:101 tags:article:102 # → all unique tags
SDIFF tags:article:101 tags:article:102 # → {"nosql","database"}
🤝 Real-World (LinkedIn “People You May Know”): SINTER connections:alice connections:bob returns users known by both, ideal for mutual connection suggestions.
4. Hashes
A field-value map stored under a single key — like a mini dictionary, or a single database row. Best for storing objects.
HSET user:1001 name "Alice" email "alice@example.com" age 28 city "Thimphu"
HGET user:1001 name # → "Alice"
HGETALL user:1001 # → all fields and values
HMGET user:1001 name email # → multiple specific fields
HINCRBY user:1001 age 1 # Atomically increment age
HDEL user:1001 city # Delete one field
Hash vs String for objects:
| Approach | Cost of updating one field |
|---|---|
| SET user:1001 '{"name":"Alice","age":28}' | To update age: read → deserialize → update → serialize → write the entire object |
| HSET user:1001 name "Alice" age 28 | Update just one field: HSET user:1001 age 29, without touching other fields |
🐙 Real-World (GitHub Repos): HSET repo:torvalds/linux stars 170000 forks 49000 language "C", fetched with HGETALL in microseconds on page load.
5. Sorted Sets (ZSets)
The most powerful Redis type. Like a Set, but every member has a floating-point score. Members are always kept sorted by score. O(log N) for most operations.
ZADD leaderboard 9500 "player:alice"
ZADD leaderboard 9800 "player:charlie"
ZADD leaderboard 8200 "player:bob"
# Top 3 (highest scores first):
ZREVRANGE leaderboard 0 2 WITHSCORES
# → charlie 9800 | alice 9500 | bob 8200
ZREVRANK leaderboard "player:alice" # → 1 (2nd place, 0-indexed)
ZSCORE leaderboard "player:alice" # → 9500.0
ZINCRBY leaderboard 500 "player:alice" # alice → 10000
# Range by score:
ZRANGEBYSCORE leaderboard 8000 +inf WITHSCORES
ZCOUNT leaderboard 4500 +inf # Count with score >= 4500
ZPOPMIN leaderboard # Remove and return lowest scorer
🏆 Real-World (Stack Overflow): ZREVRANGE reputations 0 99 WITHSCORES fetches the top 100 users for the “Top Users” page in real-time. No SQL sorting needed.
2.2.3 Basic Redis Commands and Operations
Global Key Commands (work on ALL types):
EXISTS user:1001 # Does key exist? → 1 or 0
DEL user:1001 # Delete key
TYPE user:1001 # Returns: string | list | set | hash | zset
EXPIRE user:1001 3600 # Set TTL in seconds
TTL user:1001 # Remaining TTL (-1 = no TTL, -2 = not found)
PERSIST user:1001 # Remove TTL (make permanent)
# Safe key iteration:
SCAN 0 MATCH user:* COUNT 100 # ✅ Use this
KEYS user:* # ❌ NEVER in production: O(N) and blocking
⚠️ NEVER use KEYS * in production! It’s O(N) and blocks ALL other Redis operations while scanning. Use SCAN instead, which iterates in small batches without blocking.
Server Commands:
PING # → PONG (test connection)
INFO memory # Memory-specific stats
DBSIZE # Total key count
SLOWLOG GET 10 # Last 10 slow commands
CONFIG SET maxmemory 2gb # Live config update (no restart!)
FLUSHDB # ⚠️ Delete ALL keys in current DB
2.3 Redis Data Structures and Algorithms
These are Redis’s advanced, specialized structures for specific problem classes. Each trades something (exactness, simplicity) for enormous gains in memory efficiency or speed.
2.3.1 Bitmaps and Bitfields
Bitmaps
A Redis Bitmap is not a separate type; it is a set of bit-manipulation operations on regular Redis Strings. Each byte of the string holds 8 bits, i.e. 8 boolean flags.
🎬 Analogy (Cinema Seating): Imagine each seat = one user ID. A 1 bit means “occupied/active”, 0 means “empty/inactive”. You can represent the status of 8 users in a single byte.
# Track which users logged in on a specific date:
SETBIT login:2024-01-15 1001 1 # User 1001 logged in
SETBIT login:2024-01-15 2005 1 # User 2005 logged in
GETBIT login:2024-01-15 1001 # → 1 (logged in)
GETBIT login:2024-01-15 9999 # → 0 (did NOT log in)
BITCOUNT login:2024-01-15 # Total logins today
# Users active on BOTH Monday AND Tuesday:
BITOP AND active_both login:mon login:tue
BITCOUNT active_both # Intersection count
Memory comparison:
| Method | Memory for 10M users/day | Approach |
|---|---|---|
| Redis Set | ~100 MB | Store each user ID as string |
| Bitmap | ~1.25 MB | 1 bit per user |
| Savings | 80x smaller | Bitmap vs Set |
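The bitmap figure in the table is simple arithmetic: one bit per user ID, eight IDs per byte. The sketch below reproduces that calculation and mimics SETBIT/GETBIT/BITCOUNT on a Python bytearray. ToyBitmap and bitmap_bytes are hypothetical names, and the estimate assumes user IDs are dense integers starting near 0.

```python
def bitmap_bytes(max_user_id):
    """Bytes needed so that bit offset `max_user_id - 1` is addressable."""
    return (max_user_id + 7) // 8


class ToyBitmap:
    """Mimics Redis bit addressing: offset 0 is the most significant bit of byte 0."""

    def __init__(self):
        self.buf = bytearray()

    def setbit(self, offset, value):
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.buf):
            # Grow with zero bytes, as Redis does when writing past the end
            self.buf.extend(bytes(byte_index + 1 - len(self.buf)))
        mask = 0x80 >> bit_index
        old = 1 if self.buf[byte_index] & mask else 0
        if value:
            self.buf[byte_index] |= mask
        else:
            self.buf[byte_index] &= ~mask & 0xFF
        return old                      # like SETBIT, return the previous bit

    def getbit(self, offset):
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.buf):
            return 0                    # bits past the end read as 0
        return 1 if self.buf[byte_index] & (0x80 >> bit_index) else 0

    def bitcount(self):
        return sum(bin(b).count("1") for b in self.buf)


logins = ToyBitmap()
logins.setbit(1001, 1)
logins.setbit(2005, 1)
```

Note that the string grows to cover the highest offset written, so a single very large user ID makes the bitmap large even if few bits are set.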
🐙 Real-World (GitHub Contribution Graph): Those green squares on GitHub profiles? Each bit = one day. BITCOUNT contributions:user:torvalds = total active days.
Bitfields
Extend bitmaps by storing multi-bit integers at arbitrary bit offsets. Pack multiple integer values compactly into a single key.
# Store player stats: level (u8), health (u8), gold (u16)
BITFIELD player:1001 SET u8 0 15 # Level = 15
BITFIELD player:1001 SET u8 8 87 # Health = 87
BITFIELD player:1001 SET u16 16 5000 # Gold = 5000
BITFIELD player:1001 GET u8 0 # → 15
BITFIELD player:1001 INCRBY u8 0 1 # Level up → 16
# Overflow protection (cap at max, no wrap-around):
BITFIELD player:1001 OVERFLOW SAT INCRBY u8 8 200 # Health stays at 255
2.3.2 HyperLogLog for Cardinality Estimation
The Problem: You need to count unique visitors. Storing every visitor ID in a Set costs ~48 GB for 1 billion users.
HyperLogLog (HLL) is a probabilistic algorithm that estimates the count of distinct elements using constant, tiny memory — always ~12 KB, regardless of input size. The trade-off: ~0.81% error rate (acceptable for analytics).
How it works (intuition): HLL hashes each input into a binary string and tracks the maximum number of leading zeros seen. Statistically, if the longest run of leading zeros is k, then approximately 2^k distinct items have been seen. With 16,384 sub-registers averaging these observations, the estimate becomes remarkably accurate.
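The intuition above can be tried out with a toy HyperLogLog in pure Python. This is a simplified sketch, not Redis's implementation: the class name, the use of SHA-1 as the hash, and the choice of 2^14 = 16,384 registers are assumptions made for illustration, and only the basic small-range correction is included.

```python
import hashlib
import math


class ToyHyperLogLog:
    def __init__(self, p=14):
        self.p = p                      # first p hash bits select a register
        self.m = 1 << p                 # 2**14 = 16,384 registers
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")          # 64-bit hash value
        index = h >> (64 - self.p)
        rest = h & ((1 << (64 - self.p)) - 1)
        # Position of the first 1-bit in the remaining bits (leading zeros + 1)
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank               # keep the maximum seen

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


hll = ToyHyperLogLog()
for i in range(100_000):
    hll.add(f"user{i}")
for i in range(50_000):
    hll.add(f"user{i}")                 # duplicates never raise a register
estimate = hll.count()
```

Memory stays fixed at one small counter per register no matter how many items are added, which is the whole point of the structure.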
# Count unique website visitors:
PFADD visitors:2024-01-15 user1 user2 user3 user4 user5
PFADD visitors:2024-01-15 user2 user3 user6 user7 # Duplicates ignored
PFCOUNT visitors:2024-01-15 # → ~7 (not 9, duplicates excluded)
# Merge multiple HLLs (weekly report):
PFADD visitors:mon user1 user2 user3
PFADD visitors:tue user2 user4 user5
PFADD visitors:wed user1 user6 user7
PFMERGE visitors:week visitors:mon visitors:tue visitors:wed
PFCOUNT visitors:week # → ~7 unique users across all 3 days
PF in commands stands for Philippe Flajolet, the mathematician who invented the HyperLogLog algorithm.
Comparison:
| Method | Memory (1B unique items) | Accuracy |
|---|---|---|
| Redis Set | ~48 GB | 100% exact |
| HyperLogLog | ~12 KB | ~99.19% |
📺 Real-World — YouTube: “500 million unique views” is an approximation using HLL-like structures. At that scale, ±0.81% error is unnoticeable and saves enormous memory.
2.3.3 Bloom Filters for Membership Testing
The Problem: You have 5 billion URLs and need to check “Has this URL been seen before?” Exact lookup in a database is too slow; storing everything in memory costs hundreds of GB.
A Bloom Filter answers: “Is this element possibly in the set?”
- “NO” → definitely not in the set (100% accurate — zero false negatives)
- “YES” → probably in the set (small chance of false positive)
How it works: A bit array + multiple hash functions. Adding an item sets several bits to 1. Checking an item verifies all those bits — if any is 0, the item is definitely absent; if all are 1, it’s probably present.
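A toy version of this bit-array-plus-hashes idea fits in a few lines of Python. It is a sketch for intuition only and not how RedisBloom is implemented: ToyBloomFilter is a hypothetical class, the sizing uses the standard textbook formulas, and the k bit positions are derived from one SHA-256 digest by double hashing.

```python
import hashlib
import math


class ToyBloomFilter:
    def __init__(self, capacity, error_rate):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes
        self.num_bits = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1   # odd step for double hashing
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, item):
        # Any 0 bit means "definitely absent"; all 1s means "probably present"
        return all(self.bits[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(item))


bloom = ToyBloomFilter(capacity=1000, error_rate=0.01)
seen = [f"https://example.com/page/{i}" for i in range(1000)]
for url in seen:
    bloom.add(url)

unseen = [f"https://example.org/other/{i}" for i in range(2000)]
false_positives = sum(bloom.might_contain(url) for url in unseen)
```

Every added URL is always reported as present, while a small fraction of never-added URLs may also be reported as present; that asymmetry is the false-positive trade-off described above.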
# Requires RedisBloom module (included in Redis Stack)
# Create a filter for 1M items with 0.1% false positive rate:
BF.RESERVE urls:shortened 0.001 1000000
# Add items:
BF.ADD urls:shortened "https://google.com"
BF.ADD urls:shortened "https://github.com"
# Check membership:
BF.EXISTS urls:shortened "https://google.com" # → 1 (probably in set)
BF.EXISTS urls:shortened "https://bing.com" # → 0 (DEFINITELY not)
# Bulk operations:
BF.MADD blacklist:emails "spam@evil.com" "bot@scam.net"
BF.MEXISTS blacklist:emails "spam@evil.com" "real@gmail.com"
# → [1, 0]
Two-tier lookup pattern:
        Request
           │
           ▼
  [Bloom Filter Check]
     │            │
     │ NO         │ YES (possibly)
     ▼            ▼
 Definitely    [Exact DB Lookup]
 Not Present      │          │
 (skip DB!)     Found     Not Found
                (TP)      (FP → discard)
🌐 Real-World (Google Chrome Safe Browsing): Chrome stores a local Bloom Filter of known malicious URLs. 0 = definitely safe (no network call needed). 1 = send to Google for exact verification. This reduces network calls by ~99%.
Important Limitations:
- Cannot delete items (only add). Use a Cuckoo Filter if deletion is needed.
- False positive rate increases as the filter fills beyond capacity.
- Not suitable when exact membership is required (billing, auth).
2.3.4 Geospatial Indexes
Redis Geospatial commands are built on top of Sorted Sets — coordinates are encoded as GeoHash integers (scores), enabling efficient proximity queries.
📍 GeoHash: An algorithm that encodes (latitude, longitude) into a single string or integer. Nearby coordinates share similar GeoHash prefixes, and this “spatial locality” enables fast range queries.
# Add driver locations (longitude first, then latitude):
GEOADD drivers:online 77.2090 28.6139 "driver_A"
GEOADD drivers:online 77.2210 28.6250 "driver_B"
GEOADD drivers:online 77.3000 28.7000 "driver_D"
# Get stored coordinates:
GEOPOS drivers:online driver_A
# Distance between two points:
GEODIST drivers:online driver_A driver_B km # → about 1.70 km
# ⭐ Find all drivers within 3 km of a passenger:
GEOSEARCH drivers:online
FROMLONLAT 77.2090 28.6139
BYRADIUS 3 km
ASC
COUNT 5
WITHCOORD
WITHDIST
# GeoHash encoding (nearby places share prefix):
GEOHASH drivers:online driver_A driver_B
# → "ttnfv2ub4k0" "ttnfvdc6k50" (share "ttnfv" prefix)
Key Commands:
| Command | Description |
|---|---|
| GEOADD key lon lat member | Add a location |
| GEOPOS key member | Get coordinates |
| GEODIST key m1 m2 unit | Distance (m, km, mi, ft) |
| GEOSEARCH key FROMLONLAT ... | Radius/bounding box search |
| GEOHASH key member | Get GeoHash string |
🛵 Real-World (Swiggy/Zomato): On app open: GEOSEARCH restaurants FROMLONLAT <your_lon> <your_lat> BYRADIUS 5 km ASC returns all restaurants within 5 km, sorted by proximity, in sub-millisecond time. Note the argument order: longitude first, then latitude.
Performance: ~16 bytes per member, ~0.6 m precision (52-bit GeoHash), O(log N) for GEOADD, O(N + log M) for radius search.
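A distance between two stored members can be sanity-checked by hand with the haversine formula, which treats the Earth as a sphere. The sketch below applies it to the driver_A and driver_B coordinates from the GEOADD example. The function name is hypothetical, and the Earth radius constant is assumed to match the one Redis uses internally; a slightly different radius changes the result only in the third decimal place.

```python
import math

EARTH_RADIUS_M = 6372797.560856  # assumption: spherical Earth radius in metres


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km; arguments are in degrees, longitude first."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a)) / 1000.0


# driver_A and driver_B from the GEOADD example (longitude, latitude)
distance_km = haversine_km(77.2090, 28.6139, 77.2210, 28.6250)
```

For these two points the formula gives roughly 1.70 km, so both drivers fall inside a 3 km search radius centred on driver_A's position.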
Comparative Summary
| Structure | Problem Solved | Memory | Accuracy | Primary Commands |
|---|---|---|---|---|
| Bitmaps | Binary flags per user/day | 1 bit/user/day | Exact | SETBIT, GETBIT, BITCOUNT |
| Bitfields | Packed integer arrays | Dense bit-packing | Exact | BITFIELD GET/SET/INCRBY |
| HyperLogLog | Count distinct items | Fixed 12 KB | ~0.81% error | PFADD, PFCOUNT, PFMERGE |
| Bloom Filter | Membership testing | ~10 bits/item | No false negatives | BF.ADD, BF.EXISTS |
| Geospatial | Location-based queries | ~16 bytes/point | ~0.6 m precision | GEOADD, GEOSEARCH |
2.4 Redis Persistence and Durability
Redis is in-memory. A crash or restart clears all data — like RAM being wiped on shutdown. Persistence mechanisms save data to disk so it can be recovered.
2.4.1 RDB Snapshots (Redis Database Backup)
RDB creates point-in-time binary snapshots of the entire dataset at configured intervals. Uses OS fork() + Copy-On-Write — so the main process keeps serving clients while a child process writes the snapshot.
# redis.conf:
save 900 1 # Snapshot if ≥1 key changed in 15 min
save 300 10 # Snapshot if ≥10 keys changed in 5 min
save 60 10000 # Snapshot if ≥10000 keys changed in 1 min
dbfilename dump.rdb
dir /var/lib/redis/
# Manual commands:
BGSAVE # ✅ Background save (non-blocking)
SAVE # ⚠️ Synchronous save (BLOCKS Redis)
LASTSAVE # Unix timestamp of last successful save
| Feature | Detail |
|---|---|
| File format | Compact binary .rdb (highly compressible) |
| Performance impact | Usually low; fork() is fast, but its cost grows with dataset size |
| Recovery time | Fast: loads a single binary file |
| Data loss risk | Everything written since the last snapshot (typically minutes, depending on the save rules) |
| Best for | Backups, disaster recovery, warm cache restarts |
2.4.2 AOF (Append-Only File) Logs
AOF logs every write operation to a file in RESP format. On restart, Redis replays all commands to rebuild the dataset — like a transaction log in RDBMS.
# redis.conf:
appendonly yes
appendfilename "appendonly.aof"
# Fsync policy (when to flush buffer to disk):
appendfsync always # Every write → MOST DURABLE, slowest
appendfsync everysec # Every second → BALANCED ⭐ recommended
appendfsync no # Let OS decide → FASTEST, least durable
# Auto-rewrite (compress AOF when it grows too large):
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# Manual rewrite:
BGREWRITEAOF
What the AOF file looks like (human-readable RESP format):
*3 ← Command has 3 arguments
$3 ← Next arg: 3 bytes
SET
$8 ← Key: 8 bytes
username
$5 ← Value: 5 bytes
Alice
2.4.3 Hybrid Persistence Strategies
| Strategy | Data Loss Risk | Restart Speed | Use Case |
|---|---|---|---|
| No persistence | Total loss | Instant | Pure cache, data rebuildable |
| RDB only | Up to minutes | Fast | Tolerable loss, backups |
| AOF only | ≤1 second | Slower | High-durability requirement |
| RDB + AOF | ≤1 second | Fast (uses RDB) | ⭐ Production recommended |
# redis.conf — Production Hybrid Setup:
appendonly yes
appendfsync everysec
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
save 900 1
save 300 10
aof-use-rdb-preamble yes # Hybrid mode: RDB preamble inside the AOF (available since Redis 4.0)
🛍️ Real-World (Shopify on Black Friday): AOF everysec ensures cart data loses at most 1 second during outages. Daily RDB snapshots go to AWS S3 for disaster recovery. During peak traffic (millions of transactions/minute), this balance is critical.
2.5 Redis Clustering and High Availability
Key Terms:
- SPOF (Single Point of Failure): If one server fails, the whole service goes down.
- High Availability (HA): The system keeps working even if some nodes fail.
- Horizontal Scaling: Adding more machines, not bigger machines.
2.5.1 Redis Sentinel for Automatic Failover
Sentinel is the health check and automatic failover layer. It monitors Redis instances and, if the primary (master) goes down, automatically promotes a replica to become the new primary.
# sentinel.conf:
sentinel monitor mymaster 192.168.1.100 6379 2
# Name: mymaster | Address | Quorum: 2 sentinels must agree
sentinel down-after-milliseconds mymaster 5000 # Down after 5s
sentinel failover-timeout mymaster 60000 # Failover must finish in 60s
# Start:
redis-sentinel /etc/redis/sentinel.conf
Sentinel Failover Flow:
- Primary stops responding → Sentinels detect failure
- Quorum reached (e.g., 2 of 3 sentinels agree)
- Sentinel promotes a replica: SLAVEOF NO ONE
- Sentinels notify clients about the new primary
- Other replicas reconfigure to follow the new primary
⚠️ Sentinel vs Cluster: Sentinel solves High Availability (auto-failover) but NOT horizontal scaling. To scale writes beyond one machine, use Redis Cluster.
2.5.2 Redis Cluster for Horizontal Scaling
Redis Cluster automatically shards data across multiple nodes using hash slots.
Hash Slot Sharding:
Total: 16,384 hash slots
slot = CRC16(key) % 16384
Node A → Slots 0–5460
Node B → Slots 5461–10922
Node C → Slots 10923–16383
When a client connects to the wrong node, Redis returns a MOVED redirect to the correct node.
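The slot formula can be reproduced in a few lines. The sketch below implements the CRC16 variant used by Redis Cluster (XMODEM, polynomial 0x1021) bit by bit, plus the hash-tag rule that only the text between the first { and the next } is hashed when that text is non-empty. The function names are hypothetical; real clients use a lookup table for speed.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_hash_slot(key: str) -> int:
    """Map a key to one of the 16,384 cluster slots, honouring {hash tags}."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:   # ignore an empty tag "{}"
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


slot_name = key_hash_slot("{user:1001}.name")
slot_email = key_hash_slot("{user:1001}.email")
```

Both tagged keys hash only the substring user:1001, so they always land in the same slot and can be used together in multi-key commands.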
# redis.conf for each cluster node:
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
# Create cluster (3 masters + 3 replicas):
redis-cli --cluster create \
192.168.1.101:6379 192.168.1.102:6379 192.168.1.103:6379 \
192.168.1.104:6379 192.168.1.105:6379 192.168.1.106:6379 \
--cluster-replicas 1
# Hash Tags — force keys to same slot:
{user:1001}.name # Both keys hashed on "user:1001" → same slot, same node
{user:1001}.email # → multi-key operations work!
Sentinel vs Cluster:
| Feature | Redis Sentinel | Redis Cluster |
|---|---|---|
| Primary purpose | HA (auto-failover) | Scaling + HA |
| Max data | One machine’s RAM | Sum of all nodes’ RAM |
| Write throughput | Single machine | Scales with nodes |
| Multi-key ops | Full support | Keys must share hash slot |
| Minimum nodes | 3 Sentinels recommended (plus 1 primary and at least 1 replica) | 6 (3M + 3R) |
📸 Real-World — Instagram: Uses Redis Cluster across thousands of nodes to store follower/following relationships and feed data for 1+ billion users. No single machine could hold all this in RAM.
2.5.3 Replication and Data Synchronization
# On replica node's redis.conf:
replicaof 192.168.1.100 6379 # Point to primary
# Or dynamically:
REPLICAOF 192.168.1.100 6379
REPLICAOF NO ONE # Detach (promote to standalone)
# Check replication status on primary:
INFO replication
# → role:master
# → connected_slaves:2
# → slave0:ip=192.168.1.101,...,state=online,lag=0
Full Sync Process:
- Replica sends PSYNC ? -1 (full sync request)
- Primary forks and creates an RDB snapshot (BGSAVE)
- Primary sends RDB file to replica
- Replica loads RDB (clears memory first)
- Primary sends buffered commands from during transfer
- Ongoing: Primary streams every write command to replica
2.6 Redis Modules and Extensions
Redis Modules (since Redis 4.0) extend Redis with new data types and commands. Redis Stack bundles the most popular ones.
| Module | Purpose |
|---|---|
| RediSearch | Full-text search + aggregations |
| RedisJSON | Native JSON storage and querying |
| RedisTimeSeries | Time-series data and analytics |
| RedisAI | ML model serving and inference |
| RedisBloom | Bloom/Cuckoo Filters, Count-Min Sketch, Top-K (HyperLogLog itself is built into core Redis) |
2.6.1 RediSearch — Full-Text Search
RediSearch transforms Redis into a search engine with inverted indexes for O(1)/O(log N) queries instead of O(N) key scans.
# 1. Create a search index:
FT.CREATE idx:products
ON HASH
PREFIX 1 product:
SCHEMA
name TEXT WEIGHT 5.0 # Higher weight = more relevant
description TEXT
price NUMERIC SORTABLE
category TAG
brand TAG SORTABLE
# 2. Add products (normal HSET — index updates automatically!):
HSET product:1001 name "iPhone 15 Pro" price 1299 category "smartphone" brand "Apple"
HSET product:1002 name "Samsung Galaxy S24" price 1199 category "smartphone" brand "Samsung"
# 3. Search:
FT.SEARCH idx:products "iPhone" # Full-text
FT.SEARCH idx:products "@category:{smartphone}" # Tag filter
FT.SEARCH idx:products "@price:[200 500]" # Numeric range
FT.SEARCH idx:products "flagship @price:[1000 +inf]" # Combined
FT.SEARCH idx:products "*" SORTBY price ASC LIMIT 0 10
Field Types:
| Field Type | Description | Use Case |
|---|---|---|
| TEXT | Full-text search with tokenization and stemming | Names, descriptions |
| TAG | Atomic literal, no tokenization | Categories, brands, IDs |
| NUMERIC | Range-based math queries | Prices, timestamps |
Key Design Insights:
- The index is decoupled from the data: deleting the index doesn’t delete the Hash data.
- Data is added via standard HSET. RediSearch monitors the prefix and auto-indexes.
- Instead of O(N) key scans, RediSearch uses an Inverted Index for O(1)/O(log N) lookups.
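To see why an inverted index avoids scanning every key, here is a toy version in Python. It is a sketch of the general idea only: the function names are hypothetical, tokenization is a plain lowercase split, and RediSearch's real index also handles stemming, scoring, and numeric ranges.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return IDs of documents containing every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())   # intersect posting sets
    return result


products = {
    "product:1001": "iPhone 15 Pro",
    "product:1002": "Samsung Galaxy S24",
}
index = build_inverted_index(products)
hits = search(index, "iphone")
```

A lookup touches only the posting sets for the query tokens, so its cost depends on the number of matches rather than on the total number of products.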
2.6.2 RedisJSON — Native JSON Storage
Store, retrieve, and partially update JSON documents natively — no full serialization/deserialization required.
# Store JSON:
JSON.SET user:1001 $ '{"name":"Alice","age":28,"skills":["Python","Redis"]}'
# Read (JSONPath syntax):
JSON.GET user:1001 $ # Entire document
JSON.GET user:1001 $.name # → ["Alice"] (a $ path returns a JSON array of matches)
JSON.GET user:1001 $.skills # → [["Python","Redis"]]
# Partial update (no need to read whole document!):
JSON.SET user:1001 $.age 29
JSON.SET user:1001 $.address '{"city":"Paro"}' # Value must be valid JSON; a new key can only be added under an existing parent
# Array operations:
JSON.ARRAPPEND user:1001 $.skills '"Docker"'
JSON.ARRLEN user:1001 $.skills # → [3]
RedisJSON vs String JSON:
| Operation | String JSON | RedisJSON |
|---|---|---|
| Partial update | Read → parse → update → write full blob | JSON.SET $.field value |
| Partial read | Get entire blob | JSON.GET $.field |
| Array push | Full rewrite | JSON.ARRAPPEND |
| Search | Not indexable | Index with RediSearch |
2.6.3 RedisTimeSeries — Time-Series Data
Built for storing and querying sequential timestamped data with automatic aggregation and downsampling.
# Create a time series:
TS.CREATE temperature:sensor:001
RETENTION 86400000 # Keep 24 hours (ms)
LABELS location Thimphu unit celsius
# Add data points (* = auto current timestamp):
TS.ADD temperature:sensor:001 * 22.5
TS.ADD temperature:sensor:001 * 23.1
# Query:
TS.RANGE temperature:sensor:001 - + # All data
TS.RANGE temperature:sensor:001 - + AGGREGATION avg 3600000 # Hourly avg
# Automatic downsampling rule (the destination series must exist first):
TS.CREATE temp:hourly:001
TS.CREATERULE temperature:sensor:001 temp:hourly:001
AGGREGATION avg 3600000 # Auto-compute hourly averages
🚗 Real-World (Tesla Telemetry): Every Tesla sends battery level, temperature, speed, and GPS every second. RedisTimeSeries stores this with automatic aggregation (per-minute averages), enabling real-time dashboards and anomaly detection.
2.6.4 RedisAI — ML Model Serving
Serve machine learning models directly inside Redis — no data transfer to a separate inference server.
# Load a trained TensorFlow model:
AI.MODELSTORE sentiment:model TF CPU
INPUTS 1 input_text
OUTPUTS 1 prediction
BLOB <model_binary_data>
# Set input tensor:
AI.TENSORSET input:review FLOAT 1 128 VALUES 0.1 0.5 0.3 ...
# Run inference:
AI.MODELEXECUTE sentiment:model
INPUTS 1 input:review
OUTPUTS 1 output:sentiment
# Get result:
AI.TENSORGET output:sentiment VALUES
# → [0.92] (92% positive sentiment)
The key advantage: User data is already in Redis. The model is also in Redis. Inference happens in one hop, with no network transfer to a separate ML server.
2.7 Redis Performance Optimization
2.7.1 Pipelining and Transactions
Pipelining
Every Redis command involves a round-trip: Client → Network → Redis → Network → Client. For 1000 commands, that’s 1000 round-trips. Pipelining batches all commands into a single network trip.
# Python example (redis-py):
import redis
r = redis.Redis(host="localhost", port=6379)  # assumes a local Redis server
# Without pipelining: 10,000 round-trips (slow!)
for i in range(10000):
    r.set(f"key:{i}", f"value:{i}")
# With pipelining: ~1 round-trip (often an order of magnitude or more faster)
pipe = r.pipeline(transaction=False)
for i in range(10000):
    pipe.set(f"key:{i}", f"value:{i}")  # Just queues locally
results = pipe.execute()  # Send ALL at once
transaction=False disables MULTI/EXEC wrapping: commands are buffered and sent together but are not atomic. This is appropriate for bulk inserts where atomicity isn’t needed.
Transactions (MULTI/EXEC)
MULTI # Start transaction (queue mode)
DECRBY user:1001:credits 100 # Queued
INCRBY user:2001:credits 100 # Queued
EXEC # Execute ALL atomically
DISCARD # Send instead of EXEC to cancel the queued commands
# Optimistic Locking with WATCH:
WATCH user:1001:credits # Monitor this key for changes
# ... read current value ...
MULTI
DECRBY user:1001:credits 100
EXEC # Returns nil if key changed since WATCH → retry!
⚠️ Redis ≠ SQL Transactions. Redis does NOT roll back on runtime errors. If command 3 of 5 fails, commands 1, 2, 4, 5 still execute. MULTI/EXEC only guarantees that commands from other clients are not interleaved with the transaction.
Lua Scripts (Atomic Complex Operations)
# Atomic check-then-deduct:
EVAL "
  local balance = tonumber(redis.call('GET', KEYS[1]) or '0') -- a missing key counts as 0
  if balance >= tonumber(ARGV[1]) then
    redis.call('DECRBY', KEYS[1], ARGV[1])
    return 1 -- success
  else
    return 0 -- insufficient funds
  end
" 1 user:1001:credits 100
Lua scripts run atomically: no other commands can interleave during execution.
2.7.2 Memory Management and Eviction Policies
# redis.conf:
maxmemory 4gb
maxmemory-policy allkeys-lru
Eviction Policies:
| Policy | Behavior | Best For |
|---|---|---|
| noeviction | Writes return an error when memory is full (default) | Data that must never be lost |
| allkeys-lru | Evict Least Recently Used from all keys | ⭐ General cache |
| volatile-lru | LRU from keys with TTL only | Mixed permanent + cached |
| allkeys-lfu | Evict Least Frequently Used from all keys | Frequency matters more than recency |
| volatile-lfu | LFU from TTL keys only | Frequently accessed cache |
| volatile-ttl | Evict soonest-expiring keys | Data with varying importance |
| allkeys-random | Random eviction from all keys | Rarely used |
| volatile-random | Random eviction from keys with TTL only | Rarely used |
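The behaviour of allkeys-lru can be illustrated with a small exact LRU cache. This is a conceptual sketch only: ToyLRUCache is a hypothetical class that limits the number of keys rather than bytes, and Redis itself approximates LRU by sampling a few keys (see maxmemory-samples) instead of tracking a full ordering.

```python
from collections import OrderedDict


class ToyLRUCache:
    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.items = OrderedDict()      # oldest access first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)     # reading a key makes it "recently used"
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        evicted = []
        while len(self.items) > self.max_keys:
            old_key, _ = self.items.popitem(last=False)   # drop least recently used
            evicted.append(old_key)
        return evicted


cache = ToyLRUCache(max_keys=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")                  # "a" is now more recent than "b"
evicted = cache.set("c", 3)     # over the limit, so the LRU key is evicted
```

Because "a" was read after "b" was written, "b" is the key that gets evicted when "c" arrives.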
# Check memory usage:
MEMORY USAGE user:1001 # Returns bytes
INFO memory # Full memory stats
MEMORY DOCTOR # Auto-diagnosis and suggestions
# Key fields to monitor:
# used_memory_human → current allocated
# mem_fragmentation_ratio → >1.5 = high fragmentation
# maxmemory_human → configured limit
💡 Pro Tip (Hash for Small Objects): Redis uses compact listpack encoding for hashes with fewer than 128 fields and values under 64 bytes. Storing 1M user objects as hashes can be 10x more memory-efficient than using separate string keys.
2.7.3 Benchmarking and Monitoring Tools
# Built-in benchmark:
redis-benchmark -h localhost -p 6379 -n 100000 -c 50
# -n: total requests -c: concurrent clients
redis-benchmark -t get,set -n 1000000 -q # Quiet mode
# Typical output on modern hardware:
# SET: 185,185 requests/sec
# GET: 192,307 requests/sec
# Monitoring:
INFO stats # hits, misses, ops/sec
INFO clients # connected_clients, blocked_clients
SLOWLOG GET 25 # Last 25 slow commands
LATENCY LATEST # Most recent latency spikes (needs latency-monitor-threshold > 0)
MONITOR # Real-time command stream (dev/debug only!)
Key Metrics to Monitor:
| Metric | Healthy Value | Alert If |
|---|---|---|
| Hit rate (keyspace_hits / (keyspace_hits + keyspace_misses)) | > 90% | < 80% |
| used_memory | < 80% of maxmemory | > 90% |
| connected_clients | Stable | Sudden spike |
| mem_fragmentation_ratio | 1.0–1.5 | > 1.5 |
| rdb_last_bgsave_status | ok | Not “ok” |
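The hit-rate metric in the table is derived from two counters in the output of INFO stats, keyspace_hits and keyspace_misses. A minimal sketch of the calculation follows; the function name and the sample numbers are made up for illustration, and the dictionary stands in for the parsed INFO output a client library would return.

```python
def cache_hit_rate(stats):
    """Fraction of key lookups that found a key, or 0.0 if there were no lookups."""
    hits = stats["keyspace_hits"]
    misses = stats["keyspace_misses"]
    total = hits + misses
    return hits / total if total else 0.0


sample_stats = {"keyspace_hits": 9200, "keyspace_misses": 800}  # hypothetical values
rate = cache_hit_rate(sample_stats)
needs_attention = rate < 0.80    # alert threshold from the table above
```

Both counters are cumulative since server start (or the last CONFIG RESETSTAT), so comparing deltas between two samples gives a more current picture than the lifetime ratio.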
2.8 Redis Security Considerations
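⚠️ Critical Context: Redis was originally designed for trusted internal networks. Older default configurations had NO authentication and listened on all interfaces. Since Redis 3.2, protected mode restricts an unconfigured instance to loopback connections, but binding to a public interface without a password exposes it again. In 2016, tens of thousands of unprotected Redis instances were compromised. Security is not optional.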
2.8.1 Authentication and Access Control
Legacy Password (Redis < 6.0)
# redis.conf:
requirepass YourStr0ngP@ssw0rd!
# Connect:
redis-cli -a YourStr0ngP@ssw0rd!
# Or after connecting:
AUTH YourStr0ngP@ssw0rd!
ACL — Access Control Lists (Redis 6.0+)
Modern, fine-grained access control — define what each user can do and on which keys.
# Create users with specific permissions:
ACL SETUSER alice on >alice_pass ~user:* +GET +HGET +HGETALL
# alice: enabled | password | key pattern | allowed commands
ACL SETUSER bob on >bob_pass ~order:* +@read
# bob: read-only on order:* keys
ACL SETUSER api_service on >svc_pass ~cache:* +GET +MGET +SETEX
ACL SETUSER admin on >admin_pass ~* +@all # Full access
# Disable unauthenticated access:
ACL SETUSER default off
# Inspect:
ACL LIST # Show all rules
ACL WHOAMI # Current user
ACL LOG # Security events (failed auths, denied commands)
ACL Command Categories:
| Rule | What it grants |
|---|---|
| +@read | Read commands: GET, LRANGE, HGETALL, SMEMBERS, ZRANGE… |
| +@write | Write commands: SET, LPUSH, HSET, SADD, ZADD… |
| +@admin | Administrative commands: CONFIG, MONITOR, DEBUG, SHUTDOWN… |
| ~user:* | Key pattern: only keys starting with user: |
| ~* | Key pattern: all keys |
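An ACL rule combines two independent checks: is the command allowed, and does the key match one of the user's patterns? The toy model below mirrors that logic for users like alice and bob from the listing above. It is not Redis's implementation: `ToyUser` is a made-up class, `READ_COMMANDS` is only a small subset of the real @read category, and Python's `fnmatchcase` stands in for Redis's glob matching.

```python
from fnmatch import fnmatchcase

# Small, hand-picked subset of the @read category, for illustration only.
READ_COMMANDS = {"GET", "MGET", "HGET", "HGETALL", "LRANGE", "SMEMBERS", "ZRANGE"}


class ToyUser:
    """Simplified ACL user: explicit commands, optional 'read' category, key patterns."""

    def __init__(self, key_patterns, commands=(), categories=()):
        self.key_patterns = list(key_patterns)
        self.commands = {c.upper() for c in commands}
        self.categories = set(categories)

    def can_run(self, command, key):
        command = command.upper()
        command_ok = command in self.commands or (
            "read" in self.categories and command in READ_COMMANDS
        )
        key_ok = any(fnmatchcase(key, pattern) for pattern in self.key_patterns)
        return command_ok and key_ok  # both checks must pass


alice = ToyUser(["user:*"], commands=["GET", "HGET", "HGETALL"])
bob = ToyUser(["order:*"], categories=["read"])

print(alice.can_run("GET", "user:1001"))   # allowed command, matching key
print(alice.can_run("GET", "order:7"))     # allowed command, wrong key pattern
print(bob.can_run("SET", "order:7"))       # matching key, but SET is not a read command
```

Because both checks must pass, a leaked credential for bob exposes reads on order:* keys only, which is the point of the key-pattern plus command split.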
2.8.2 SSL/TLS Encryption
# redis.conf (Redis 6.0+):
port 0 # Disable plain TCP
tls-port 6380 # Enable TLS
tls-cert-file /etc/redis/redis.crt
tls-key-file /etc/redis/redis.key
tls-ca-cert-file /etc/redis/ca.crt
tls-auth-clients yes # Require client cert
tls-protocols "TLSv1.2 TLSv1.3"
# Connect with TLS:
redis-cli -h localhost -p 6380 \
--tls \
--cert /path/to/client.crt \
--key /path/to/client.key \
--cacert /path/to/ca.crt
2.8.3 Redis Security Best Practices
# 1. Bind to specific interfaces only:
bind 127.0.0.1 10.0.0.50 # NEVER: bind 0.0.0.0
# 2. Enable protected mode:
protected-mode yes
# 3. Disable or rename dangerous commands (on Redis 6+, prefer ACL rules; rename-command is deprecated):
rename-command FLUSHDB "" # Disable completely
rename-command FLUSHALL ""
rename-command DEBUG ""
rename-command CONFIG "ADMIN_CFG_9k2m" # Rename to secret
rename-command KEYS "" # Force use of SCAN
# 4. Set maxmemory to prevent OOM:
maxmemory 4gb
maxmemory-policy allkeys-lru
Security Layers Summary:
| Layer | Tool | Protects Against |
|---|---|---|
| Network | Firewall, bind to internal IP | Unauthorized external access |
| Authentication | ACL + strong passwords | Unauthorized users |
| Authorization | ACL key patterns + commands | Lateral movement |
| Encryption (transit) | TLS/SSL | Eavesdropping, MITM |
| Audit | ACL LOG, slow log | Detecting attacks, compliance |
| Command hardening | Rename/disable dangerous commands | Accidental/malicious data wipes |
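Several of these layers can be checked straight from the config file. The sketch below is a toy linter for redis.conf-style text covering the bind, protected-mode, authentication, and maxmemory points from this section. `audit_redis_conf` and the two sample configs are invented for illustration, and the parser is deliberately naive (it treats everything after `#` as a comment, as the listings in these notes do).

```python
def audit_redis_conf(text):
    """Return a list of warnings for redis.conf-style text (toy checker, not exhaustive)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # naive: drops anything after '#'
        if not line:
            continue
        name, *args = line.split()
        settings.setdefault(name.lower(), []).append(args)

    warnings = []
    binds = [addr for args in settings.get("bind", []) for addr in args]
    # With no bind directive, Redis listens on all interfaces.
    if not binds or "0.0.0.0" in binds or "*" in binds:
        warnings.append("bind: listening on all interfaces")
    if settings.get("protected-mode", [["yes"]])[-1] == ["no"]:
        warnings.append("protected-mode is off")
    if not any(name in settings for name in ("requirepass", "aclfile", "user")):
        warnings.append("no authentication configured")
    if "maxmemory" not in settings:
        warnings.append("no maxmemory limit")
    return warnings


GOOD_CONF = """
bind 127.0.0.1 10.0.0.50
protected-mode yes
requirepass YourStr0ngP@ssw0rd!
maxmemory 4gb
"""

BAD_CONF = """
bind 0.0.0.0
protected-mode no
"""

print(audit_redis_conf(GOOD_CONF))
print(audit_redis_conf(BAD_CONF))
```

A check like this fits naturally into a deployment pipeline, though it only covers the config file. Firewall rules and TLS still need their own verification.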
🔴 Real Security Incident — The Crackit Attack (2016): Attackers found Redis instances exposed to the internet with no auth. They ran CONFIG SET dir /root/.ssh and CONFIG SET dbfilename authorized_keys, used SET to store their SSH public key as a value, and then ran SAVE so Redis wrote its dataset, key included, into that file. On servers where Redis ran as root, this gave them full root SSH access. Always firewall your Redis port. Always use ACL.
Quick Reference Cheat Sheet
Commands by Data Type
# ── STRINGS ──────────────────────────────────────────
SET key value GET key MSET k1 v1 k2 v2
INCR key INCRBY key n DECRBY key n
SETEX key ttl value SETNX key value STRLEN key
# ── LISTS ────────────────────────────────────────────
LPUSH key v RPUSH key v LPOP key
RPOP key LLEN key LRANGE key 0 -1
BLPOP key timeout
# ── SETS ─────────────────────────────────────────────
SADD key m SREM key m SMEMBERS key
SISMEMBER key m SCARD key SPOP key
SUNION k1 k2 SINTER k1 k2 SDIFF k1 k2
# ── HASHES ───────────────────────────────────────────
HSET key f v HGET key f HGETALL key
HDEL key f HEXISTS key f HINCRBY key f n
# ── SORTED SETS ──────────────────────────────────────
ZADD key score member ZREVRANGE key 0 -1 WITHSCORES
ZRANK key m ZREVRANK key m ZSCORE key m
ZINCRBY key n m ZCOUNT key min max ZPOPMIN key
# ── ADVANCED ─────────────────────────────────────────
SETBIT key offset 1 GETBIT key offset BITCOUNT key
PFADD key v PFCOUNT key PFMERGE dest k1 k2
GEOADD key lon lat m GEOSEARCH key ... GEODIST key m1 m2 km
BF.ADD key item BF.EXISTS key item
Big O Complexity Reference
| Operation | Complexity | Note |
|---|---|---|
| GET / SET (String) | O(1) | Hash table lookup |
| LPUSH / RPUSH / LPOP / RPOP | O(1) | Doubly linked list ends |
| LRANGE | O(S+N) | S=start offset, N=returned |
| SADD / SISMEMBER | O(1) | Hash set |
| SUNION / SDIFF | O(N) | N=total elements in all sets |
| SINTER | O(N·M) worst case | N=size of the smallest set, M=number of sets |
| HSET / HGET | O(1) | Hash table |
| ZADD / ZRANK | O(log N) | Skip list |
| ZSCORE | O(1) | Hash table lookup |
| ZRANGE | O(log N + M) | M=elements returned |
| KEYS * | O(N) | ❌ Blocks Redis |
| SCAN | O(1) per call | ✅ Non-blocking; a full iteration is still O(N) in total |
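The KEYS versus SCAN rows come down to one big pass versus many small ones. The toy below imitates the SCAN calling convention (start at cursor 0, keep calling until the cursor comes back as 0) over a plain Python list. It is an assumption-laden sketch: real SCAN cursors are opaque values, results can contain duplicates, and `toy_scan` and `scan_all` are names made up here.

```python
from fnmatch import fnmatchcase


def toy_scan(keys, cursor=0, match="*", count=10):
    """Examine up to `count` keys from `cursor`; return (next_cursor, matching_keys).

    A next_cursor of 0 means the iteration is complete, mirroring SCAN's convention.
    """
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0
    # As with SCAN's MATCH option, filtering happens after the batch is fetched,
    # so a call can legitimately return no keys while the scan is still running.
    return next_cursor, [k for k in batch if fnmatchcase(k, match)]


def scan_all(keys, match="*", count=10):
    """Drive toy_scan to completion; return (all_matches, number_of_calls)."""
    cursor, found, calls = 0, [], 0
    while True:
        cursor, batch = toy_scan(keys, cursor, match, count)
        found.extend(batch)
        calls += 1
        if cursor == 0:
            break
    return found, calls


keys = [f"user:{i}" for i in range(25)] + [f"session:{i}" for i in range(5)]
found, calls = scan_all(keys, match="session:*", count=10)
print(found, calls)
```

Each call touches a bounded number of keys, so other clients get served between calls. The total work is the same as a full pass, just spread out.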
Data Type Decision Guide
| If you need to store… | Use |
|---|---|
| A single value, counter, or JSON blob | String |
| An object with multiple fields | Hash |
| An ordered sequence or queue | List |
| A unique collection | Set |
| Ranked members | Sorted Set |
| Boolean flags for millions of IDs | Bitmap |
| Approximate unique item count | HyperLogLog |
| Fast membership testing | Bloom Filter |
| Geographic coordinates | Geospatial |
Key Principles to Remember
| Principle | Rule |
|---|---|
| Key naming | Use type:id:field pattern with colons |
| Searching keys | Use SCAN, never KEYS * in production |
| TTL | Always set TTL on cache keys |
| Persistence | Use RDB + AOF hybrid in production |
| HA | Minimum 3 Sentinels for failover |
| Scaling | Redis Cluster for data > single machine RAM |
| Security | Bind to localhost + firewall + ACL = minimum |
| Performance | Pipeline batch commands to reduce round-trips |
| Memory | Use Hashes for small objects (compact encoding) |
| Anti-pattern | Never KEYS *, never store huge values in Redis |
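The key-naming rule is easy to enforce with a small helper. This is one possible sketch: `make_key` is a hypothetical function, and rejecting colons and whitespace inside a part is a design choice made for this example, not a Redis requirement.

```python
def make_key(*parts):
    """Join parts with colons, following the type:id:field convention."""
    cleaned = [str(part) for part in parts]
    for part in cleaned:
        # Reject empty parts, embedded colons, and whitespace so keys stay predictable.
        if not part or ":" in part or any(ch.isspace() for ch in part):
            raise ValueError(f"invalid key part: {part!r}")
    return ":".join(cleaned)


print(make_key("user", 1001, "name"))
print(make_key("product", "iphone15", "price"))
```

Building every key through one function keeps naming consistent across a codebase and makes patterns such as user:* reliable for SCAN and ACL rules.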
