Transaction Management – Tim Weninger, PhD

Users and DB Programs

End users don’t see the DB directly

are only vaguely aware of its design
may be acutely aware of part of its contents
SQL is not a suitable end-user interface

A single SQL query is not a sufficient unit of DB work

May need more than one query
May need to check constraints not enforced by the DBMS
May need to do calculations, realize “business rules”, etc.

Ergo, a program is needed to carry out each unit of DB work

End users interact with DB programs rather than SQL

May be many users simultaneously
- Thus many simultaneous executions of these programs
Each user expects service and correct operation
- A user should not have to wait forever
- A user should not be affected by errors of others

Transaction

A transaction is the execution of a DB program

DB applications are designed as a set of transactions

Typical transaction

starts with data from user or from another transaction
includes DB reads/writes
ends with display of data or form, or with request to start another transaction

ACID Properties

Atomicity
Consistency Preservation
Isolation
Durability

Atomicity

Transactions must be “atomic” i.e., Their effect is all or none

DB must be consistent before and after the transaction executes (not necessarily during!)

EITHER
a transaction executes fully and “commits” to all the changes it makes to the DB
OR
it must be as thought that transaction never executed at all

User view: “Transfer money from savings to checking”

Program:

read savings;
verify balance is adequate;
update savings balance;
read checking;
update checking balance;

Consistency

A transactions which only READs expects DB to be consistent, and cannot cause it to become otherwise.

When a transaction which does any WRITE finishes, it must either

COMMIT: “I’m done and the DB is consistent again”
OR
ABORT (ROLLBACK): “I’m done but I goofed: my changes must be undone.”

There are complications:

A DB may have many simultaneous users
- simultaneous users implies simultaneous transactions implies simultaneous DB access
- transactions can conflict with one another
programs may crash, OS may crash, disk may crash
company loses customer, gets sued, goes bankrupt, etc.

A DB that’s 1% inaccurate is 100% unusable.

What problems can happen if a database is not consistent or atomic? Be creative.

Isolation

Your transaction cannot affect another users transaction

Durability

Can’t lose data – ever

Can’t become unavailable

Transactions

Transaction starts/stops

DB writes: “before” and/or “after” images of DB records

befores can be used to rollback an aborted transaction
afters can be used to redo a transaction
- (recovery from catastrophe)

COMMITs and ABORTs

Each transaction has internal state

When system crashes, internal state is lost

Don’t know which parts executed and which didn’t

So we use a log file. The log itself is as critical as the DB! We’ll talk more about this later.

Start Transaction

Oracle – autocommit is off by default, so a new transaction is started after each COMMIT or ROLLBACK
MySQL – START TRANSACTION

End Transaction

COMMIT or ROLLBACK

Schedules

T1	T2
r₁(A,t)	r₂(A,s)
t = t+100	s = s*2
w₁(A,t)	w₂(A,s)
r₁(B,t)	r₂(B,s)
t = t+100	s = s*2
w₁(B,t)	w₂(B,s)

Serial Schedule

Serial means first do all of 1 txn and then all of the other

Serializable means the final effect is the same as serial.

T1	T2	Disk A	Disk B
		25	25
r₁(A,t)
t = t+100
w₁(A,t)		125
r₁(B,t)
t = t+100
w₁(B,t)			125
	r₂(A,s)
	s = s*2
	w₂(A,s)	250
	r₂(B,s)
	s = s*2
	w₂(B,s)		250

Can we make this schedule serializable?

T1	T2
r₁(A,t)	r₂(A,s)
t = t+100	s = s*2
w₁(A,t)	w₂(A,s)
r₁(B,t)	r₂(B,s)
t = t+100	s = s*2
w₁(B,t)	w₂(B,s)

T1	T2	Disk A	Disk B
		25	25
r₁(A,t)
t = t+100
w₁(A,t)		125
	r₂(A,s)
	s = s*2
	w₂(A,s)	250
r₁(B,t)
t = t+100
w₁(B,t)			125
	r₂(B,s)
	s = s*2
	w₂(B,s)		250

Conflict Serializable

This is a stronger notion of serializability – if the order can be changed to a serial schedule without creating conflict.

Most pairs of actions are not in conflict.

Typically two pairs are in conflict if:

If they involve the same database element
AND there is at least one write
- If they are in conflict then order cannot be swapped

Example: Same as result of in class assignment

r₁(A); w₁(A); r₂(A); w₂(A); r₁(B); w₁(B); r₂(B); w₂(B);

We want to take this schedule and reorder it to make it into a serial schedule.

r₁(A); w₁(A); r₂(A); w₂(A); r₁(B); w₁(B); r₂(B); w₂(B);
r₁(A); w₁(A); r₂(A); r₁(B); w₂(A); w₁(B); r₂(B); w₂(B);
r₁(A); w₁(A); r₁(B); r₂(A); w₂(A); w₁(B); r₂(B); w₂(B);
r₁(A); w₁(A); r₁(B); r₂(A); w₁(B); w₂(A); r₂(B); w₂(B);
r₁(A); w₁(A); r₁(B); w₁(B); r₂(A); w₂(B); r₂(B); w₂(B); <- Serial Schedule so the initial schedule is fine!

Enforcing Serializability

We need to schedule locks. 2PL is popular (obsolete) way to do it.

First all lock action
Then all unlock actions

Example: “All locks precede unlocks”

T1	T2	Disk A	Disk B
		25	25
l₁(A);r₁(A,t);
t = t+100
w₁(A,t) l₁(B); u₁(A);		125
	l₂(A);r₂(A,s);
	s = s*2
	w₂(A,s) l₂(B); Denied	250
r₁(B,t);
t = t+100
w₁(B,t); u₁(B);			125
	l₂(B); u₂(A); r₂(B,s);
	s = s*2
	w₂(B,s) u₂(B);		250

Problems with 2PL? Deadlock

In practice

Lots of different lock modes

Update locks
Only read not write
Upgrade locks

Notion of “friendliness” – Allow others to read or write or neither (exclusive lock)

Increment locks – Two increment operations are commutable (doesn’t matter the order)

DBMS keeps a lock table to make sure everyone places nicely.