DBS101 Unit7
Unit VII: Transactions, Concurrency & Recovery β Databases That Heal Themselves
π Introduction
This unit uncovers the superpowers that keep our data consistent, available, and durable β even when the system crashes, power goes out, or users mess up.
Welcome to the world of:
- β Transactions (Lesson 18)
- π Concurrency Control (Lesson 19)
- π Database Recovery (Lesson 20)
π§Ύ Lesson 18: Transactions & ACID β Making Promises You Can Keep
π§© What is a Transaction?
A transaction = a unit of work (like a bank transfer or an inventory update) that must either complete fully or not at all.
ACID Properties
| Property | Purpose |
|---|---|
| Atomicity | All or nothing |
| Consistency | No data corruption |
| Isolation | No interference from other transactions |
| Durability | Changes survive crashes |
π‘ SQL Example:
1 2 3 4 BEGIN; UPDATE accounts SET balance = balance - 50 WHERE account_name = 'A'; UPDATE accounts SET balance = balance + 50 WHERE account_name = 'B'; COMMIT;
Storage Types That Support ACID
| Type | Description |
|---|---|
| Volatile | RAM β fast but not crash-proof |
| Non-Volatile | Disks β durable but slower |
| Stable | Redundant, reliable storage (RAID) |
π‘ Durability relies on writing to stable storage before committing.
Transaction Lifecycle
| State | What Happens |
|---|---|
| Active | Execution in progress |
| Partially Committed | Final operations done |
| Failed | Error occurred, canβt proceed |
| Aborted | Rolled back to safe state |
| Committed | Successfully completed, changes saved |
Serializability β Order in the Chaos
To preserve consistency with multiple transactions, databases ensure serializability β the outcome is the same as if transactions were run one-by-one.
- β Serial Schedule: T1 β T2
- β Serializable: Looks like a serial execution
- β Non-serial but inconsistent: Database disaster!
Conflict Serializability and Precedence Graphs
Use Precedence Graphs:
- Nodes = transactions
- Edge T1 β T2 = T1 must come before T2 due to a conflict
- π No cycles = β Serializable
- π Cycles = β Not serializable
Recoverable vs. Cascadeless Schedules
| Schedule Type | Rule |
|---|---|
| Recoverable | Donβt commit if you read uncommitted data |
| Cascadeless | Donβt even read uncommitted data! |
Isolation Levels
| Level | Isolation Strength |
|---|---|
| Serializable | Strongest β like serial |
| Repeatable Read | Stable reads, no phantom rows |
| Read Committed | Canβt read dirty data |
| Read Uncommitted | π± May read dirty data |
π Lesson 19: Concurrency Control β Keeping Transactions from Colliding
π Why Locks?
Locks help transactions access data safely and prevent conflicts:
- Shared (S): Read-only access
- Exclusive (X): Read + Write access
β οΈ Many S-locks allowed, only one X-lock!
Two-Phase Locking (2PL)
| Phase | Rule |
|---|---|
| Growing | Acquire locks only |
| Shrinking | Release locks only |
β Guarantees serializability β May cause deadlocks
Deadlocks & Solutions
| Strategy | Action |
|---|---|
| Detection | Use waits-for graph & pick a victim |
| Prevention | Use timestamp rules (Wait-Die, Wound-Wait) |
Lock Granularity & Intention Locks
| Level | Example |
|---|---|
| Tuple | Specific row |
| Page | Block of rows |
| Table | Entire table |
π§ Use IS/IX/SIX intention locks for hierarchy control.
Timestamp Ordering (TO)
- Assign timestamps to transactions
- Maintain order to ensure serializability
- Use Thomas Write Rule to reduce aborts
Multi-Version Concurrency Control (MVCC)
- DBMS keeps multiple versions of data items
- Readers see consistent snapshot
- Writers donβt block readers (and vice versa!)
π° Great for time-travel queries and snapshots
Snapshot Isolation (SI)
Every transaction sees a frozen version of the database from when it began.
β οΈ Beware of Write Skew Anomalies β two transactions update different values based on outdated shared state.
β»οΈ Lesson 20: Recovery β When Things Go Wrong, Databases Fight Back πͺ
π± Why Recovery Matters
- Power failures
- Crashes
- Transaction errors
- Natural disasters
β Transactions must be atomic and durable even after disaster!
Log-Based Recovery
Every change is recorded in a log BEFORE modifying the actual database.
| Log Format Example | Meaning |
|---|---|
<Ti, Xj, V1, V2> | Ti changed Xj from V1 to V2 |
<Ti start> | Ti started |
<Ti commit> | Ti committed |
<Ti abort> | Ti aborted |
Undo & Redo
| Operation | Action |
|---|---|
| Undo(Ti) | Rollback all changes using old values |
| Redo(Ti) | Reapply changes using new values |
π After a crash:
- Undo incomplete transactions
- Redo committed transactions
Checkpoints β Recovery Time Boosters
| Type | Benefit |
|---|---|
| Standard | Snapshot of active transactions |
| Fuzzy | Doesnβt block transactions |
β Reduces the log scanning needed during recovery
Log Record Buffering & WAL
- Logs are kept in memory temporarily
- Write-Ahead Logging (WAL) ensures log is written before data is updated
Steal & No-Force Policies
| Policy | Description |
|---|---|
| Steal | Dirty data may be written before commit |
| No-Force | Committed data may not be written immediately |
Recovery from Disk Failures
- Periodic full dumps to backup media
- Use dump + redo log to recover from complete disk loss
Remote Backup Systems
| Technique | Use Case |
|---|---|
| Log Shipping | Send logs to remote replica |
| Hot-Spare Mode | Backup is nearly always ready |
| Distributed Systems | Multi-site redundancy |
ARIES Algorithm (IBM)
A 3-phase recovery strategy:
- π Analysis β What happened?
- πRedo β Replay history
- βͺ Undo β Rollback incomplete work
Uses STEAL + NO-FORCE policies and logs every step
Recovery in Main-Memory DBs
- Main memory = volatile β risk of loss
- Recovery must reload DB into memory and replay logs
β Optimizations:
- Skip redo for indexes
- Do recovery in parallel across cores
π‘ What Iβve Learned
- π Transactions must be safe and consistent
- π Concurrency control is a balancing act
- πΎ Recovery = logs, checkpoints, ARIES
- π High availability = backup systems and replication
π± Reflection
From safe updates to post-crash recovery, this unit taught me that databases arenβt just about data β theyβre about trust.
Transactions, locks, logs, and backups: The unsung heroes keeping our apps alive π
π Conclusion
ACID, 2PL, ARIES, MVCC β they sound complex, but they all serve a simple purpose: Protect the data. Always. π‘οΈ
π βA reliable system isnβt the one that never fails. Itβs the one that knows how to recover.β