The Log
An append-only file containing log records
Note: multiple transactions run concurrently, log records are interleaved
After a system crash, use log to:
- Redo some transaction that did commit.
- Undo other transactions that did not commit.
Book was a bit relaxed in how it showed reads and writes: there are also input and output
- Read and write take data from memory into transaction memory;
- Input and output take data from memory to disk
Log records
- <START T>
- transaction T has begun
- <COMMIT T>
- T has committed
- <ABORT T>
- T has aborted
- <T,X,v>
- T has updated element X, and its old value was v
Undo Logging
- U1: If T modifies X, then <T,X,v> entry must be written to log before X is written to disk
- U2: If T commits, then <COMMIT T> must be put to log after all changes by T are written to disk
Hence: OUTPUTs are done early (before commit)
Action | t | Mem A | Mem B | Disk A | Disk B | LOG |
---|---|---|---|---|---|---|
8 | 8 | <START T> | ||||
INPUT(A) | 8 | 8 | 8 | |||
r1(A,t) | 8 | 8 | 8 | 8 | ||
t=t*2 | 16 | 8 | 8 | 8 | ||
w1(A,t) | 16 | 16 | 8 | 8 | <T,A,8> | |
INPUT(B) | 16 | 16 | 8 | 8 | 8 | |
r1(B,t) | 8 | 16 | 8 | 8 | 8 | |
t=t*2 | 16 | 16 | 8 | 8 | 8 | |
w1(B,t) | 16 | 16 | 16 | 8 | 8 | <T,B,8> |
OUTPUT(A) | 16 | 16 | 16 | 16 | 8 | |
OUTPUT(B) | 16 | 16 | 16 | 16 | 16 | |
<COMMIT T> |
UNDO Recovery
After system’s crash, run recovery manager
1. Decide for each transaction T whether it is completed or not
<START T>….<COMMIT T>…. = yes
<START T>….<ABORT T>……. = yes
<START T>……………………… = no
2. Undo all modifications by incomplete transactions
To do this:
Read log from the end; cases:
- <COMMIT T>: mark T as completed
- <ABORT T>: mark T as completed
- <T,X,v>: if T is not completed
then write X=v to disk
else ignore /* committed or aborted xact. */
- <START T>: ignore
Note: all undo commands are idempotent
- If we perform them a second time, no harm is done
- E.g. if there is a system crash during recovery, simply restart recovery from scratch
Checkpointing
When do we stop reading the log ?
- We cannot stop until we reach the beginning of the log file
- This is impractical
- Better idea: use checkpointing
Checkpoint the database periodically
- Stop accepting new transactions
- Wait until all current transactions complete
- Flush log to disk
- Write a <CKPT> log record, flush
- Resume transactions
This is called Quiescent Checkpointing (quies – latin for “at rest”)
- Problem with checkpointing: database freezes during checkpoint
- Would like to checkpoint while database is operational called:
Nonquiescent checkpointing
Write a <START CKPT(T1,…,Tk)> where T1,…,Tk are all active transactions
Continue normal operation
When all of T1,…,Tk have completed, write <END CKPT>
Q – why do we need the <END CKPT>?
A – to know when the T4,T5,T6 have committed
so we don’t have to undo T4, T5, T6 Txn
Redo Logging
- <START T> = transaction T has begun
- <COMMIT T> = T has committed
- <ABORT T>= T has aborted
- <T,X,v>= T has updated element X, and its new value is v
1: If T modifies X, then both <T,X,v> and <COMMIT T> must be written to disk before X is written to disk Hence: OUTPUTs are done late (after commit)
Action | T | Mem A | Mem B | Disk A | Disk B | LOG |
---|---|---|---|---|---|---|
8 | 8 | <START T> | ||||
INPUT(A) | 8 | 8 | 8 | |||
r1(A,t) | 8 | 8 | 8 | 8 | ||
t=t*2 | 16 | 8 | 8 | 8 | ||
w1(A,t) | 16 | 16 | 8 | 8 | <T,A,16> | |
INPUT(B) | 16 | 16 | 8 | 8 | 8 | |
r1(B,t) | 8 | 16 | 8 | 8 | 8 | |
t=t*2 | 16 | 16 | 8 | 8 | 8 | |
w1(B,t) | 16 | 16 | 16 | 8 | 8 | <T,B,16> |
<COMMIT T> | ||||||
OUTPUT(A) | 16 | 16 | 16 | 16 | 8 | |
OUTPUT(B) | 16 | 16 | 16 | 16 | 16 |
Recovery with Redo Log
After system’s crash, run recovery manager
Step 1. Decide for each transaction T whether it is completed or not
- <START T>….<COMMIT T>…. = yes
- <START T>….<ABORT T>……. = yes
- <START T>……………………… = no
Step 2. Read log from the beginning, redo all updates of committed transactions
Nonquiescent Checkpointing
Same as before
Redo Recovery with nonquiescent checkpointing
UNDO vs REDO Logging
Undo logging:
- OUTPUT must be done early
- If <COMMIT T> is seen, T definitely has written all its data to disk (hence, don’t need to undo)
Redo logging
- OUTPUT must be done late
- If <COMMIT T> is not seen, T definitely has not written any of its data to disk (hence there is not dirty data on disk)
Would like more flexibility on when to OUTPUT (OS calls are expensive): undo/redo logging (next)
UNDO/REDO Logging
<T,X,u,v>= T has updated element X, its old value was u, and its new value is v
UR1: If T modifies X, then <T,X,u,v> must be written to disk before X is written to disk
- Note: we are free to OUTPUT early or late (I.e. before or after <COMMIT T>)
Action | t | Mem A | Mem B | Disk A | Disk B | LOG |
---|---|---|---|---|---|---|
8 | 8 | <START T> | ||||
INPUT(A) | 8 | 8 | 8 | |||
r1(A,t) | 8 | 8 | 8 | 8 | ||
t=t*2 | 16 | 8 | 8 | 8 | ||
w1(A,t) | 16 | 16 | 8 | 8 | <T,A,8,16> | |
INPUT(B) | 16 | 16 | 8 | 8 | 8 | |
r1(B,t) | 8 | 16 | 8 | 8 | 8 | |
t=t*2 | 16 | 16 | 8 | 8 | 8 | |
w1(B,t) | 16 | 16 | 16 | 8 | 8 | <T,B,8,16> |
OUTPUT(A) | 16 | 16 | 16 | 16 | 8 | |
<COMMIT T> | ||||||
OUTPUT(B) | 16 | 16 | 16 | 16 | 16 |
Undo/Redo Recovery
After system’s crash, run recovery manager
- Redo all committed transaction, top-down
- Undo all uncommitted transactions, bottom-up