How Append Only Logs achieves durability in Databases?
Decisions for doing efficient Append Only Logs
I’ve been working on my key-value database—somewhat like Redis—since last year. In the beginning, I built everything without much thought about scalability or extensibility. But as I iterated, I ended up rewriting the entire architecture and, in the process, gained a deeper understanding of how databases handle persistence.
One key concept I discovered was Append-Only Logs (AOF). Databases use this technique to record every operation sequentially. If the system crashes, it can simply replay these logs to restore the last known state. This approach ensures durability while keeping the system efficient.
Here are some ways where we can make our AOF efficient -
1. Batch Writes (Reducing Disk I/O Overhead)
Problem:
Whenever a new SET/DEL operation is done, we write it on the file.
Calling
fsync()after every operation, it ensures durability but is expensive because disk writes are slow.Frequent syncs increase I/O latency and reduce throughput.
Solution:
Instead of writing every operation immediately, buffer writes in memory and flushes them periodically (e.g., every second).
This is similar to how databases like PostgreSQL use WAL (Write-Ahead Logging).
2. Compaction (AOF Rewrite to Reduce File Size)
Problem:
AOF grows quickly because it logs every change, even if a key is updated multiple times.
SET KEY VAL
SET KEY VAL1
DEL KEY
SET KEY VAL3Solution:
Along with the operation, we will also store the time of that operation, something like this
SET KEY VAL 2025-04-28:12:00:01Periodically, we can rewrite the log to keep only the latest state of each key.
We call it AOF compaction or log rewriting.
How It Works:
Create a temporary new log file.
Dump the current state of the in-memory key-value store.
Replace the old AOF file with the new compacted one.
But there’s a problem.
Since we create a new AOF file and rewrite the entire state, writes to the DB can't be persisted to the log until the process completes. This means:
New writes during compaction could be lost if not handled properly.
If the compaction takes time, it blocks the main process of inserting the logs, which could cost us our data if our DB crashes.
So, instead of blocking writes, we can:
Continue accepting writes in memory.
Buffer new writes that happen during compaction.
Apply the buffered writes once compaction is complete.
3. Binary Format (Faster & Smaller Logs)
Problem:
Using plain text for AOF (
SET key value\n) is human-readable but inefficient.Text-based logs waste storage and are slower to parse when restoring state.
Solution:
Use a binary format instead of text-based logs.
This makes logs smaller, faster to write, and quicker to read.
Binary Format Breakdown
Each log entry consists of fixed-size headers followed by the actual key-value data.
Operation - 1 byte (1 for SET, 2 for DEL)
Key Length - 2 bytes (Length of the key (N bytes))
Key - N bytes (The key itself)
Value Length - 2 bytes (Length of the value (M bytes), only for SET)
Value - M bytes (The actual value, only for SET)
Total entry size:
SET= 1 + 2 + N + 2 + M bytesDEL= 1 + 2 + N bytes (no value forDEL)
Example: Convert to Binary Format
SET key val
DEL keyThe actual binary data stored (hex representation):
01 00 03 66 6F 6F 00 03 62 61 72
02 00 03 6B 65 7901- SET00 03- Key length = 366 6F 6F-"key"00 03- Value length = 362 61 72-"val"02- DEL00 03- Key length = 36B 65 79-"key"
Now, instead of text logs (SET key val\n), we store them in a compact binary format, reducing disk usage and improving read/write speed.
After testing the performance of storing logs in string format vs. binary format, I observed a significant improvement in restoration speed. With around 2000 keys, the results were:
String storage - 14ms
Binary storage - 4ms
This optimization drastically reduces recovery time, making the system more efficient.
That’s all for this one—I’ll be back next week with something new!



is AOL and WAL different concepts ? What i can see is that WAL are the same thing as AOL but WAL need to ensure maximum durability