Transaction Management

Users and DB Programs

End users don’t see the DB directly

  • are only vaguely aware of its design
  • may be acutely aware of part of its contents
  • SQL is not a suitable end-user interface

A single SQL query is not a sufficient unit of DB work

  • May need more than one query
  • May need to check constraints not enforced by the DBMS
  • May need to do calculations, realize “business rules”, etc.

Ergo, a program is needed to carry out each unit of DB work

End users interact with DB programs rather than SQL

  • May be many users simultaneously
    • Thus many simultaneous executions of these programs
  • Each user expects service and correct operation
    • A user should not have to wait forever
    • A user should not be affected by errors of others

Transaction

A transaction is the execution of a DB program

DB applications are designed as a set of transactions

Typical transaction

  • starts with data from user or from another transaction
  • includes DB reads/writes
  • ends with display of data or form, or with request to start another transaction

ACID Properties

  • Atomicity
  • Consistency Preservation
  • Isolation
  • Durability

Atomicity

Transactions must be “atomic” i.e., Their effect is all or none

DB must be consistent before and after the transaction executes (not necessarily during!)

EITHER
a transaction executes fully and “commits” to all the changes it makes to the DB
OR
it must be as thought that transaction never executed at all

User view: “Transfer money from savings to checking”

Program:

  • read savings;
  • verify balance is adequate;
  • update savings balance;
  • read checking;
  • update checking balance;

Consistency

A transactions which only READs expects DB to be consistent, and cannot cause it to become otherwise.

When a transaction which does any WRITE finishes, it must either

COMMIT: “I’m done and the DB is consistent again”
OR
ABORT (ROLLBACK): “I’m done but I goofed: my changes must be undone.”

There are complications:

  • A DB may have many simultaneous users
    • simultaneous users implies simultaneous transactions implies simultaneous DB access
    • transactions can conflict with one another
  • programs may crash, OS may crash, disk may crash
  • company loses customer, gets sued, goes bankrupt, etc.

A DB that’s 1% inaccurate is 100% unusable.

What problems can happen if a database is not consistent or atomic? Be creative.

Isolation

Your transaction cannot affect another users transaction

Durability

Can’t lose data – ever

Can’t become unavailable

Transactions

Transaction starts/stops

DB writes: “before” and/or “after” images of DB records

  • befores can be used to rollback an aborted transaction
  • afters can be used to redo a transaction
    • (recovery from catastrophe)

COMMITs and ABORTs

Each transaction has internal state

When system crashes, internal state is lost

  • Don’t know which parts executed and which didn’t

So we use a log file. The log itself is as critical as the DB! We’ll talk more about this later.

Start Transaction

  • Oracle – autocommit is off by default, so a new transaction is started after each COMMIT or ROLLBACK
  • MySQL – START TRANSACTION

End Transaction

  • COMMIT or ROLLBACK

Schedules

T1T2
r1(A,t)r2(A,s)
t = t+100s = s*2
w1(A,t)w2(A,s)
r1(B,t)r2(B,s)
t = t+100s = s*2
w1(B,t)w2(B,s)

Serial Schedule

Serial means first do all of 1 txn and then all of the other

Serializable means the final effect is the same as serial.

T1T2Disk ADisk B
  2525
r1(A,t)   
t = t+100   
w1(A,t) 125 
r1(B,t)   
t = t+100   
w1(B,t)  125
 r2(A,s)  
 s = s*2  
 w2(A,s)250 
 r2(B,s)  
 s = s*2  
 w2(B,s) 250

Can we make this schedule serializable?

T1T2
r1(A,t)r2(A,s)
t = t+100s = s*2
w1(A,t)w2(A,s)
r1(B,t)r2(B,s)
t = t+100s = s*2
w1(B,t)w2(B,s)
T1T2Disk ADisk B
  2525
r1(A,t)   
t = t+100   
w1(A,t) 125 
 r2(A,s)  
 s = s*2  
 w2(A,s)250 
r1(B,t)   
t = t+100   
w1(B,t)  125
 r2(B,s)  
 s = s*2  
 w2(B,s) 250

Conflict Serializable

This is a stronger notion of serializability – if the order can be changed to a serial schedule without creating conflict.

Most pairs of actions are not in conflict.

Typically two pairs are in conflict if:

  • If they involve the same database element
  • AND there is at least one write
    • If they are in conflict then order cannot be swapped

Example: Same as result of in class assignment

r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B);

We want to take this schedule and reorder it to make it into a serial schedule.

r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B);
r1(A); w1(A); r2(A); r1(B); w2(A); w1(B); r2(B); w2(B);
r1(A); w1(A); r1(B); r2(A); w2(A); w1(B); r2(B); w2(B);
r1(A); w1(A); r1(B); r2(A); w1(B); w2(A); r2(B); w2(B);
r1(A); w1(A); r1(B); w1(B); r2(A); w2(B); r2(B); w2(B); <- Serial Schedule so the initial schedule is fine!

Enforcing Serializability

We need to schedule locks. 2PL is popular (obsolete) way to do it.

  • First all lock action
  • Then all unlock actions

Example: “All locks precede unlocks”

T1T2Disk ADisk B
  2525
l1(A);r1(A,t);   
t = t+100   
w1(A,t) l1(B); u1(A); 125 
 l2(A);r2(A,s);  
 s = s*2  
 w2(A,s) l2(B); Denied250 
r1(B,t);   
t = t+100   
w1(B,t); u1(B);  125
 l2(B); u2(A); r2(B,s);  
 s = s*2  
 w2(B,s) u2(B); 250

Problems with 2PL? Deadlock

In practice

Lots of different lock modes

  • Update locks
  • Only read not write
  • Upgrade locks

Notion of “friendliness” – Allow others to read or write or neither (exclusive lock)

Increment locks – Two increment operations are commutable (doesn’t matter the order)

DBMS keeps a lock table to make sure everyone places nicely.