Redis Distributed Caching

📢 This article was translated by gemini-3-flash-preview

Redis Basics: https://blog.yexca.net/en/archives/157/
Redis Distributed Caching: This article

Introduction

So I wrote both articles at the same time but waited a year to post them, huh?

Actually, I had three articles planned back then, but every time I sat down to finish them, I forgot what I wanted to say. A year just flew by…

Problems

A standalone Redis instance has several issues:

  • Data loss. Solution: enable Redis persistence.
  • Limited concurrency. Solution: set up a master-slave cluster with read/write splitting.
  • Limited storage capacity. Solution: set up a sharded cluster, which uses hash slots to scale dynamically.
  • No automatic fault recovery. Solution: use Redis Sentinel for health monitoring and automatic recovery.

Redis Persistence

Redis offers two persistence options: RDB and AOF.

RDB Persistence

RDB stands for Redis Database Backup file, also known as a data snapshot. Simply put, it saves all memory data to disk. After a crash, Redis restores data by reading the snapshot file. These snapshots (RDB files) are saved in the working directory by default.

RDB executes in these four scenarios:

  • save command: Immediate execution. Blocks the main process and all other commands. Use only during data migration.
  • bgsave command: Asynchronous execution. Forks a child process to handle the RDB, allowing the main process to keep processing requests.
  • Shutdown: Redis automatically runs a save command when stopping.
  • Trigger conditions: Configured in the settings as follows:
# If at least 1 key changes within 900s, run bgsave. save "" disables RDB.
save 900 1  
save 300 10  
save 60 10000 
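
To make the trigger rule concrete, here is a minimal Python sketch of how save points like the ones above could be evaluated. The function name `should_bgsave` and its parameters are hypothetical, chosen only for illustration. It models the rule and is not Redis's actual implementation.

```python
# Illustrative model of the "save <seconds> <changes>" rule; not Redis source code.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]  # (seconds, min_changes) from the config above


def should_bgsave(seconds_since_last_save, changes_since_last_save, save_points=SAVE_POINTS):
    """Return True if any save point is satisfied."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= min_changes
        for seconds, min_changes in save_points
    )


print(should_bgsave(120, 5))     # False: no save point is satisfied yet
print(should_bgsave(320, 10))    # True: 10 changes and more than 300s elapsed
print(should_bgsave(61, 20000))  # True: 10000+ changes and more than 60s elapsed
```

The rules are OR-ed together: a busy instance hits the `60 10000` rule quickly, while a quiet one still gets a snapshot through `900 1`.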

Other configurations:

# Compress the RDB file? Default: yes. Recommended: no, since compression costs CPU and disk space is cheap.
rdbcompression yes

# RDB filename
dbfilename dump.rdb  

# Directory for saving the file
dir ./ 

RDB Principles

When bgsave starts, it forks the main process to create a child process. Both share the same memory data. Once the fork is complete, the child process reads the memory and writes it to the RDB file.

Forking uses Copy-On-Write (COW) technology:

  • When the main process performs a read, it accesses the shared memory.
  • When the main process performs a write, the affected memory page is copied first and the write is applied to the copy, so the child process still sees the original snapshot data.
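
The copy-on-write idea can be sketched with a toy model in Python. Everything here (`Process`, `fork`, `write`) is a hypothetical illustration of the concept. The real mechanism lives in the operating system kernel and works on memory pages, not Python objects.

```python
# Toy model of copy-on-write: pages are shared until the parent writes to one.
class Process:
    def __init__(self, pages):
        self.pages = pages  # list of references to page objects (bytearrays)


def fork(parent):
    # The child gets its own page table, but it points at the SAME page objects.
    return Process(list(parent.pages))


def write(proc, page_no, offset, value):
    # Simplification: always copy the page before writing. A real kernel
    # copies only when the page is still shared with another process.
    proc.pages[page_no] = bytearray(proc.pages[page_no])
    proc.pages[page_no][offset] = value


parent = Process([bytearray(b"aaaa"), bytearray(b"bbbb")])
child = fork(parent)            # snapshot point (where bgsave forks)
write(parent, 0, 0, ord("X"))   # the parent keeps serving writes

print(parent.pages[0])                    # bytearray(b'Xaaa')
print(child.pages[0])                     # bytearray(b'aaaa')
print(parent.pages[1] is child.pages[1])  # True: the untouched page is still shared
```

The child keeps a consistent view of the data as it was at fork time, and only the pages that the parent modifies cost extra memory.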


RDB Disadvantages:

  • Long intervals between saves; data written between two RDB runs risks being lost.
  • Forking, compression, and writing RDB files are resource-heavy.

AOF Persistence

AOF stands for Append Only File. Every write command processed by Redis is logged in the AOF file. Think of it as a command history log.

AOF is disabled by default. Enable it in the config:

# Enable AOF? Default: no
appendonly yes
# AOF filename
appendfilename "appendonly.aof"

The frequency at which logged commands are flushed to disk can also be configured in redis.conf:

# Choose one of the following three policies.
# Sync every write command to the AOF file on disk immediately
appendfsync always
# Buffer writes and flush to disk once per second (default; balanced option)
appendfsync everysec
# Buffer writes and let the OS decide when to flush to disk
appendfsync no

Comparison of sync policies:

| Option | Flush Timing | Pros | Cons |
|---|---|---|---|
| always | Sync flush | High reliability, minimal loss | High performance impact |
| everysec | Per-second flush | Balanced performance | Max 1s data loss |
| no | OS controlled | Best performance | Low reliability, potential high loss |
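
The difference between the three policies comes down to when `fsync` is called. The following Python sketch is a toy model under my own assumptions: the class `AofWriter` and the helper `count_fsyncs` are hypothetical names, and the `everysec` branch checks the clock on each append, whereas Redis performs that flush in a background thread.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy append-only log with three fsync policies (illustrative only)."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        self.file = open(path, "ab")
        self.policy = policy
        self.clock = clock
        self.last_fsync = clock()
        self.fsync_calls = 0

    def append(self, command):
        self.file.write(command.encode() + b"\n")
        self.file.flush()  # hand the bytes to the OS page cache
        if self.policy == "always":
            self._fsync()
        elif self.policy == "everysec" and self.clock() - self.last_fsync >= 1.0:
            self._fsync()
        # policy "no": never fsync here; the OS flushes whenever it chooses

    def _fsync(self):
        os.fsync(self.file.fileno())  # force the data onto disk
        self.fsync_calls += 1
        self.last_fsync = self.clock()

    def close(self):
        self.file.close()


def count_fsyncs(policy, n_commands=100):
    """Write n_commands to a temporary log and report how many fsyncs happened."""
    with tempfile.TemporaryDirectory() as tmp:
        writer = AofWriter(os.path.join(tmp, "toy.aof"), policy)
        for i in range(n_commands):
            writer.append(f"set key{i} {i}")
        writer.close()
        return writer.fsync_calls


print(count_fsyncs("always"))  # 100: one fsync per command
print(count_fsyncs("no"))      # 0: left entirely to the OS
```

Each `fsync` is a slow disk operation, which is why `always` is the safest but slowest option and `no` is the fastest but riskiest.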

File Rewriting

Since AOF logs every command, files grow much larger than RDB files. Also, AOF might log multiple writes to the same key, though only the last state matters. Running the bgrewriteaof command rewrites the AOF file using the minimum number of commands required to reach the current state.

For example, take these original commands:

set num 123
set name jack
set num 666

After rewrite:

mset name jack num 666
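
The effect of the rewrite on this example can be reproduced with a short Python sketch. The function `rewrite_aof` is hypothetical and only handles `set` commands. It compacts the old log for illustration, whereas, as far as I know, Redis builds the new file from the data currently in memory rather than by replaying the old log.

```python
def rewrite_aof(commands):
    """Collapse a log of 'set <key> <value>' commands into one mset (toy model)."""
    latest = {}
    for line in commands:
        op, key, value = line.split()
        if op.lower() != "set":
            raise ValueError(f"this sketch only handles set, got: {op}")
        latest[key] = value  # later writes overwrite earlier ones
    parts = []
    for key in sorted(latest):  # sorted only to make the output deterministic
        parts += [key, latest[key]]
    return "mset " + " ".join(parts)


log = ["set num 123", "set name jack", "set num 666"]
print(rewrite_aof(log))  # mset name jack num 666
```

Three logged commands shrink to one because only the final value of each key matters.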

Redis also triggers auto-rewrites based on thresholds in the config:

# Trigger rewrite if AOF grows by this percentage since last rewrite
auto-aof-rewrite-percentage 100
# Min AOF size to trigger a rewrite 
auto-aof-rewrite-min-size 64mb 
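
Both thresholds must be met before an automatic rewrite starts. Here is a minimal sketch of that check as I understand it. The function name `should_auto_rewrite` and the parameter `base_size` (the AOF size right after the previous rewrite) are hypothetical names for illustration.

```python
MB = 1024 * 1024


def should_auto_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Model of the auto-rewrite check. All sizes are in bytes."""
    if percentage == 0 or current_size <= min_size:
        return False  # feature disabled, or the file is still too small to bother
    base = base_size or 1  # avoid division by zero if no rewrite has happened yet
    growth = current_size * 100 / base - 100  # growth since the last rewrite, in percent
    return growth >= percentage


print(should_auto_rewrite(50 * MB, 10 * MB))   # False: below the 64mb minimum
print(should_auto_rewrite(100 * MB, 80 * MB))  # False: grew only 25%
print(should_auto_rewrite(160 * MB, 80 * MB))  # True: grew 100%
```

The minimum size prevents constant rewrites of a tiny file, where a 100% growth can happen after just a few commands.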

RDB vs AOF

RDB and AOF both have pros and cons. For high data security, developers often use both together.

| | RDB | AOF |
|---|---|---|
| Persistence Method | Periodic full memory snapshots | Logs every write command |
| Data Integrity | Incomplete; data lost between backups | Mostly complete, depends on flush policy |
| File Size | Compressed, small | Command logs, very large |
| Recovery Speed | Very fast | Slow |
| Recovery Priority | Low (due to integrity) | High (due to better integrity) |
| Resource Usage | High CPU/Memory during fork | Low (mostly Disk I/O). Note: Rewrite uses significant CPU/RAM. |
| Use Case | Tolerable data loss (minutes), fast startup | High data security requirements |

Redis Master-Slave Architecture

A single Redis node has a concurrency ceiling. To scale, you need a Master-Slave cluster to implement read/write splitting.

image

Setting up the Cluster

Environment: CentOS 7

Based on the diagram, we’ll deploy three nodes on one machine using ports 7001 (master), 7002, and 7003.

First, create the directories:

cd /tmp
mkdir 7001 7002 7003

If you previously modified the config, revert to default RDB mode:

# Enable RDB
# save ""
save 3600 1
save 300 100
save 60 10000

# Disable AOF
appendonly no

Copy the config file to each instance directory:

# Option 1
cp redis-6.2.4/redis.conf 7001
cp redis-6.2.4/redis.conf 7002
cp redis-6.2.4/redis.conf 7003

# Option 2
echo 7001 7002 7003 | xargs -t -n 1 cp redis-6.2.4/redis.conf

Modify the port and working directory for each instance (this changes the listening port and the RDB save path):

sed -i -e 's/6379/7001/g' -e 's/dir .\//dir \/tmp\/7001\//g' 7001/redis.conf
sed -i -e 's/6379/7002/g' -e 's/dir .\//dir \/tmp\/7002\//g' 7002/redis.conf
sed -i -e 's/6379/7003/g' -e 's/dir .\//dir \/tmp\/7003\//g' 7003/redis.conf
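
Note that the `sed` commands above replace every occurrence of `6379` in the file, including those in comments and in other settings. If you prefer to touch only the two directives, a more targeted variant could look like the Python sketch below. The function `customize_conf` and the sample text are hypothetical, and this is an alternative to the commands above, not what they do.

```python
import re


def customize_conf(conf_text, port, base_dir="/tmp"):
    """Return conf_text with only the port and dir directives rewritten."""
    conf_text = re.sub(r"^port[ \t]+\d+[ \t]*$", f"port {port}",
                       conf_text, flags=re.MULTILINE)
    conf_text = re.sub(r"^dir[ \t]+\S+[ \t]*$", f"dir {base_dir}/{port}/",
                       conf_text, flags=re.MULTILINE)
    return conf_text


# A tiny stand-in for redis.conf, just to show the effect.
sample = (
    "# Accept connections on the specified port, default is 6379\n"
    "port 6379\n"
    "\n"
    "dir ./\n"
)
print(customize_conf(sample, 7001))
```

In this variant the comment line keeps its `6379`, while the `port` and `dir` lines become `port 7001` and `dir /tmp/7001/`.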

Set the announced IP for each instance (replace ip_address with the machine's actual IP):

# Execute one by one
sed -i '1a replica-announce-ip ip_address' 7001/redis.conf
sed -i '1a replica-announce-ip ip_address' 7002/redis.conf
sed -i '1a replica-announce-ip ip_address' 7003/redis.conf

# Or use one-liner
printf '%s\n' 7001 7002 7003 | xargs -I{} -t sed -i '1a replica-announce-ip ip_address' {}/redis.conf

Start the instances:

# Instance 1
redis-server 7001/redis.conf
# Instance 2
redis-server 7002/redis.conf
# Instance 3
redis-server 7003/redis.conf

Stop the instances:

printf '%s\n' 7001 7002 7003 | xargs -I{} -t redis-cli -p {} shutdown