Users and DB Programs
End users don’t see the DB directly
- are only vaguely aware of its design
- may be acutely aware of part of its contents
- SQL is not a suitable end-user interface
A single SQL query is not a sufficient unit of DB work
- May need more than one query
- May need to check constraints not enforced by the DBMS
- May need to do calculations, realize “business rules”, etc.
Ergo, a program is needed to carry out each unit of DB work
End users interact with DB programs rather than SQL
- May be many users simultaneously
- Thus many simultaneous executions of these programs
- Each user expects service and correct operation
- A user should not have to wait forever
- A user should not be affected by errors of others
Transaction
A transaction is the execution of a DB program
DB applications are designed as a set of transactions
Typical transaction
- starts with data from user or from another transaction
- includes DB reads/writes
- ends with display of data or form, or with request to start another transaction
ACID Properties
- Atomicity
- Consistency Preservation
- Isolation
- Durability
Atomicity
Transactions must be “atomic” i.e., Their effect is all or none
DB must be consistent before and after the transaction executes (not necessarily during!)
EITHER
a transaction executes fully and “commits” to all the changes it makes to the DB
OR
it must be as thought that transaction never executed at all
User view: “Transfer money from savings to checking”
Program:
- read savings;
- verify balance is adequate;
- update savings balance;
- read checking;
- update checking balance;
Consistency
A transactions which only READs expects DB to be consistent, and cannot cause it to become otherwise.
When a transaction which does any WRITE finishes, it must either
COMMIT: “I’m done and the DB is consistent again”
OR
ABORT (ROLLBACK): “I’m done but I goofed: my changes must be undone.”
There are complications:
- A DB may have many simultaneous users
- simultaneous users implies simultaneous transactions implies simultaneous DB access
- transactions can conflict with one another
- programs may crash, OS may crash, disk may crash
- company loses customer, gets sued, goes bankrupt, etc.
A DB that’s 1% inaccurate is 100% unusable.
What problems can happen if a database is not consistent or atomic? Be creative.
Isolation
Your transaction cannot affect another users transaction
Durability
Can’t lose data – ever
Can’t become unavailable
Transactions
Transaction starts/stops
DB writes: “before” and/or “after” images of DB records
- befores can be used to rollback an aborted transaction
- afters can be used to redo a transaction
- (recovery from catastrophe)
COMMITs and ABORTs
Each transaction has internal state
When system crashes, internal state is lost
- Don’t know which parts executed and which didn’t
So we use a log file. The log itself is as critical as the DB! We’ll talk more about this later.
Start Transaction
- Oracle – autocommit is off by default, so a new transaction is started after each COMMIT or ROLLBACK
- MySQL – START TRANSACTION
End Transaction
- COMMIT or ROLLBACK
Schedules
T1 | T2 |
---|---|
r1(A,t) | r2(A,s) |
t = t+100 | s = s*2 |
w1(A,t) | w2(A,s) |
r1(B,t) | r2(B,s) |
t = t+100 | s = s*2 |
w1(B,t) | w2(B,s) |
Serial Schedule
Serial means first do all of 1 txn and then all of the other
Serializable means the final effect is the same as serial.
T1 | T2 | Disk A | Disk B |
---|---|---|---|
25 | 25 | ||
r1(A,t) | |||
t = t+100 | |||
w1(A,t) | 125 | ||
r1(B,t) | |||
t = t+100 | |||
w1(B,t) | 125 | ||
r2(A,s) | |||
s = s*2 | |||
w2(A,s) | 250 | ||
r2(B,s) | |||
s = s*2 | |||
w2(B,s) | 250 |
Can we make this schedule serializable?
T1 | T2 |
---|---|
r1(A,t) | r2(A,s) |
t = t+100 | s = s*2 |
w1(A,t) | w2(A,s) |
r1(B,t) | r2(B,s) |
t = t+100 | s = s*2 |
w1(B,t) | w2(B,s) |
T1 | T2 | Disk A | Disk B |
---|---|---|---|
25 | 25 | ||
r1(A,t) | |||
t = t+100 | |||
w1(A,t) | 125 | ||
r2(A,s) | |||
s = s*2 | |||
w2(A,s) | 250 | ||
r1(B,t) | |||
t = t+100 | |||
w1(B,t) | 125 | ||
r2(B,s) | |||
s = s*2 | |||
w2(B,s) | 250 |
Conflict Serializable
This is a stronger notion of serializability – if the order can be changed to a serial schedule without creating conflict.
Most pairs of actions are not in conflict.
Typically two pairs are in conflict if:
- If they involve the same database element
- AND there is at least one write
- If they are in conflict then order cannot be swapped
Example: Same as result of in class assignment
r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B);
We want to take this schedule and reorder it to make it into a serial schedule.
r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B);
r1(A); w1(A); r2(A); r1(B); w2(A); w1(B); r2(B); w2(B);
r1(A); w1(A); r1(B); r2(A); w2(A); w1(B); r2(B); w2(B);
r1(A); w1(A); r1(B); r2(A); w1(B); w2(A); r2(B); w2(B);
r1(A); w1(A); r1(B); w1(B); r2(A); w2(B); r2(B); w2(B); <- Serial Schedule so the initial schedule is fine!
Enforcing Serializability
We need to schedule locks. 2PL is popular (obsolete) way to do it.
- First all lock action
- Then all unlock actions
Example: “All locks precede unlocks”
T1 | T2 | Disk A | Disk B |
---|---|---|---|
25 | 25 | ||
l1(A);r1(A,t); | |||
t = t+100 | |||
w1(A,t) l1(B); u1(A); | 125 | ||
l2(A);r2(A,s); | |||
s = s*2 | |||
w2(A,s) l2(B); Denied | 250 | ||
r1(B,t); | |||
t = t+100 | |||
w1(B,t); u1(B); | 125 | ||
l2(B); u2(A); r2(B,s); | |||
s = s*2 | |||
w2(B,s) u2(B); | 250 |
Problems with 2PL? Deadlock
In practice
Lots of different lock modes
- Update locks
- Only read not write
- Upgrade locks
Notion of “friendliness” – Allow others to read or write or neither (exclusive lock)
Increment locks – Two increment operations are commutable (doesn’t matter the order)
DBMS keeps a lock table to make sure everyone places nicely.