Transaction Systems

44
1 / 24 Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery Gerhard Weikum and Gottfried Vossen Teamwork is essential. It allows you to blame someone else. (Anonymous) © 2002 Morgan Kaufmann ISBN 1 -55860-508-8

Transcript of Transaction Systems

1 / 24

Transactional Information Systems:

Theory, Algorithms, and

the

Practice of

Concurrency Control and Recovery

Gerhard Weikum and Gottfried Vossen

Teamwork is essential. It allows you to blame someone else.

(Anonymous)

© 2002 Morgan Kaufmann

ISBN 1

-

55860

-

508

-

8

2 / 24

Part II: Concurrency Control

3 Concurrency Control: Notions of Correctness for the Page Model

4 Concurrency Control Algorithms

5 Multiversion Concurrency Control

6 Concurrency Control on Objects: Notions of Correctness

7 Concurrency Control Algorithms on Objects

8 Concurrency Control on Relational Databases

9 Concurrency Control on Search Structures

10 Implementation and Pragmatic Issues

3 / 24

Chapter 5: Multiversion

Concurrency Control

5.2 Multiversion Schedules

5.3 Multiversion Serializability

5.4 Limiting the Number of Versions

5.5 Multiversion Concurrency Control Protocols

5.6 Lessons Learned

A book is a version of the world. If you do not like it, ignore it;

or offer your own version in return.

(Salmon Rushdie)

4 / 24

Motivation

Example 5.1:

s =

r

1

(x) w

1

(x)

r

2

(x) w

2

(y)

r

1

(y) w

1

(z) c

1

c

2

CSR

but: schedule would be tolerable

if r

1

(y) could read the

old version

y

0

of y

to be consistent with r

1

(x)

s would then be equivalent to serial s' = t

1

t 2

4 / 24

Motivation

Example 5.1:

s =

r

1

(x) w

1

(x)

r

2

(x) w

2

(y)

r

1

(y) w

1

(z) c

1

c

2

CSR

but: schedule would be tolerable

if r

1

(y) could read the

old version

y

0

of y

to be consistent with r

1

(x)

s would then be equivalent to serial s' = t

1

t 2

Approach:

each w step creates a new version

each r step can choose which version it wants/needs to read

versions are transparent to application and

transient (i.e., subject to garbage collection)

5 / 24

Definition 5.1 (Version Function):

Let s be a history with initial transaction t

0

and final transaction t

.

A

version function

for s is a function h which associates with each read step of s

a previous write step on the same data item, and the identity for writes.

Multiversion Schedules

5 / 24

Definition 5.1 (Version Function):

Let s be a history with initial transaction t

0

and final transaction t

.

A

version function

for s is a function h which associates with each read step of s

a previous write step on the same data item, and the identity for writes.

Multiversion Schedules

Definition 5.2 (Multiversion Schedule):

A

multiversion (mv) history

for transactions T ={t

1

, ..., t

n

} is a pair

m=(op(m), <

m

) where <

m

is an order on op(m) and

(1)

op(m) =

i=1..n

h(op(t

i )) for some version function h,

(2)

for all t

T and all p, q

op(t

i ): p <

t

q

h(p) <

m

h(q),

(3)

if h(r

j (x)) = w

j (x

i ), i

j, then c

i is in m and c

i

<

m

c

j .

A

multiversion (mv) schedule

is a prefix of a multiversion history.

Example 5.2:

r

1

(x

0

)

w

1

(x

1

)

r

2

(x

1

) w

2

(y

2

)

r

1

(y

0

) w

1

(z

1

) c

1

c

2

with

h(r

1

(y)) = w

0

(y

0

)

5 / 24

Definition 5.1 (Version Function):

Let s be a history with initial transaction t

0

and final transaction t

.

A

version function

for s is a function h which associates with each read step of s

a previous write step on the same data item, and the identity for writes.

Multiversion Schedules

Definition 5.2 (Multiversion Schedule):

A

multiversion (mv) history

for transactions T ={t

1

, ..., t

n

} is a pair

m=(op(m), <

m

) where <

m

is an order on op(m) and

(1)

op(m) =

i=1..n

h(op(t

i )) for some version function h,

(2)

for all t

T and all p, q

op(t

i ): p <

t

q

h(p) <

m

h(q),

(3)

if h(r

j (x)) = w

j (x

i ), i

j, then c

i is in m and c

i

<

m

c

j .

A

multiversion (mv) schedule

is a prefix of a multiversion history.

Example 5.2:

r

1

(x

0

)

w

1

(x

1

)

r

2

(x

1

) w

2

(y

2

)

r

1

(y

0

) w

1

(z

1

) c

1

c

2

with

h(r

1

(y)) = w

0

(y

0

)

Definition 5.3 (Monoversion Schedule):

A multiversion schedule is a

monoversion schedule

if its version

function maps each read to the last preceding write on the same data item.

Example:

r

1

(x

0

) w

1

(x

1

)

r

2

(x

1

) w

2

(y

2

)

r

1

(y

2

) w

1

(z

1

) c

1

c

2

6 / 24

Chapter 5: Multiversion

Concurrency Control

5.2 Multiversion Schedules

5.3 Multiversion Serializability

5.4 Limiting the Number of Versions

5.5 Multiversion Concurrency Control Protocols

5.6 Lessons Learned

7 / 24

Definition 5.4 (Reads

-

from Relation):

For mv schedule m the reads

-

from relation of m is

RF(m)

= {(t

i , x, t

j ) | r

j (x

i )

op(m)}.

Multiversion View Serializability

7 / 24

Definition 5.4 (Reads

-

from Relation):

For mv schedule m the reads

-

from relation of m is

RF(m)

= {(t

i , x, t

j ) | r

j (x

i )

op(m)}.

Multiversion View Serializability

Definition 5.5 (View Equivalence):

mv histories m and m' with trans(m)=trans(m') are

view equivalent

,

m

v

m‘

, if RF(m) = RF(m').

7 / 24

Definition 5.4 (Reads

-

from Relation):

For mv schedule m the reads

-

from relation of m is

RF(m)

= {(t

i , x, t

j ) | r

j (x

i )

op(m)}.

Multiversion View Serializability

Definition 5.5 (View Equivalence):

mv histories m and m' with trans(m)=trans(m') are

view equivalent

,

m

v

m‘

, if RF(m) = RF(m').

Definition 5.6 (Multiversion View Serializability):

m is multiversion view serializable if there is a serial monoversion history m'

s.t. m

v

m'.

MVSR

is the class of multiversion view serializable histories.

7 / 24

Definition 5.4 (Reads

-

from Relation):

For mv schedule m the reads

-

from relation of m is

RF(m)

= {(t

i , x, t

j ) | r

j (x

i )

op(m)}.

Multiversion View Serializability

Definition 5.5 (View Equivalence):

mv histories m and m' with trans(m)=trans(m') are

view equivalent

,

m

v

m‘

, if RF(m) = RF(m').

Definition 5.6 (Multiversion View Serializability):

m is multiversion view serializable if there is a serial monoversion history m'

s.t. m

v

m'.

MVSR

is the class of multiversion view serializable histories.

Example 5.5:

m =

w

0

(x

0

) w

0

(y

0

) c

0

r

1

(x

0

)

r

1

(y

0

) w

1

(x

1

) w

1

(y

1

) c

1

r

2

(x

0

) r

2

(y

1

) c

2

Example 5.6:

m =

w

0

(x

0

) w

0

(y

0

) c

0

w

1

(x

1

) c

1

r

2

(x

1

)

r

3

(x

0

) w

3

(x

3

) c

3

w

2

(y

2

) c

2

MVSR

v

t 0

t 3

t 1

t 2

8 / 24

Properties of MVSR

Theorem 5.1:

VSR

MVSR

Example:

s =

r

1

(x) w

1

(x)

r

2

(x) w

2

(y)

r

1

(y) w

1

(z) c

1

c

2

8 / 24

Properties of MVSR

Theorem 5.1:

VSR

MVSR

Example:

s =

r

1

(x) w

1

(x)

r

2

(x) w

2

(y)

r

1

(y) w

1

(z) c

1

c

2

Theorem 5.2:

Deciding if a mv history is in MVSR is NP

-

complete.

8 / 24

Properties of MVSR

Theorem 5.1:

VSR

MVSR

Example:

s =

r

1

(x) w

1

(x)

r

2

(x) w

2

(y)

r

1

(y) w

1

(z) c

1

c

2

Theorem 5.2:

Deciding if a mv history is in MVSR is NP

-

complete.

Theorem 5.3:

The conflict graph of an mv schedule m is a directed graph G(m) with

transactions as nodes and an edge from t

i

to t

j if r

j (x

i )

op(m)

.

For all mv schedules m, m': m

v

m'

G(m) = G(m').

Example:

m =

w

1

(x

1

)

r

2

(x

0

)

w

1

(y

1

)

r

2

(y

1

)

c

1

c

2

m' =

w

1

(x

1

) w

1

(y

1

) c

1

r

2

(x

1

) r

2

(y

0

) c

2

G(m) = G(m'),

but not

m

v

m'

9 / 24

Testing MVSR

Definition 5.8 (Multiversion Serialization Graph (MVSG)):

A version order for data item x, denoted <<

x

, is a total order among all versions of x.

A

version order

for mv schedule m is the

union of version orders for items written in m.

The

mv serialization graph

for m and a given version order <<,

MVSG (m, <<),

is a graph with transactions as nodes and the following edges:

(i)

all edges of G(m) are in MVSG(m, <<)

(i.e., for r

k

(x

j ) in op(m) there is an edge from t

j to t

k

)

(ii)

for r

k

(x

j ), w

i (x

i ) in op(m): if x

i << x

j then there is an edge from t

i to t

j

(iii)

for r

k

(x

j ), w

i (x

i ) in op(m): if x

j << x

i

then there is an edge from t

k

to t

i

9 / 24

Testing MVSR

Definition 5.8 (Multiversion Serialization Graph (MVSG)):

A version order for data item x, denoted <<

x

, is a total order among all versions of x.

A

version order

for mv schedule m is the

union of version orders for items written in m.

The

mv serialization graph

for m and a given version order <<,

MVSG (m, <<),

is a graph with transactions as nodes and the following edges:

(i)

all edges of G(m) are in MVSG(m, <<)

(i.e., for r

k

(x

j ) in op(m) there is an edge from t

j to t

k

)

(ii)

for r

k

(x

j ), w

i (x

i ) in op(m): if x

i << x

j then there is an edge from t

i to t

j

(iii)

for r

k

(x

j ), w

i (x

i ) in op(m): if x

j << x

i

then there is an edge from t

k

to t

i

Theorem 5.4:

m is in MVSR iff there exists a version order << s.t. MVSG(m, <<) is acyclic.

10 / 24

MVSG Example

Examples 5.7 and 5.8:

m =

w

0

(x

0

) w

0

(y

0

) w

0

(z

0

) c

0

r

1

(x

0

)

r

2

(x

0

) r

2

(z

0

)

r

3

(z

0

)

w

1

(y

1

)

w

2

(x

2

)

w

3

(y

3

) w

3

(z

3

)

c

1

c

2

c

3

r

4

(x

2

) r

4

(y

3

) r

4

(z

3

) c

4

with version order <<:

x

0

<< x

2

y

0

<< y

1

<< y

3

z

0

<< z

3

MVSG(m, <<):

t

0

t

1

t

2

t

3

t

4

10 / 24

MVSG Example

Examples 5.7 and 5.8:

m =

w

0

(x

0

) w

0

(y

0

) w

0

(z

0

) c

0

r

1

(x

0

)

r

2

(x

0

) r

2

(z

0

)

r

3

(z

0

)

w

1

(y

1

)

w

2

(x

2

)

w

3

(y

3

) w

3

(z

3

)

c

1

c

2

c

3

r

4

(x

2

) r

4

(y

3

) r

4

(z

3

) c

4

with version order <<:

x

0

<< x

2

y

0

<< y

1

<< y

3

z

0

<< z

3

MVSG(m, <<):

t

0

t

1

t

2

t

3

t

4

Notice: Testing whether appropriate << exists for given m is

not necessarily polynomial

NP

-

completeness result remains

11 / 24

Multiversion Conflict Serializability

Definition 5.9 (Multiversion Conflict):

A

multiversion conflict

in m is a pair r

i (x

j ) and w

k

(x

k

) such that r

i (x

j ) <

m

w

k

(x

k

).

11 / 24

Multiversion Conflict Serializability

Definition 5.9 (Multiversion Conflict):

A

multiversion conflict

in m is a pair r

i (x

j ) and w

k

(x

k

) such that r

i (x

j ) <

m

w

k

(x

k

).

Definition 5.10 (Multiversion Reducibility):

An mv history is

multiversion reducible

if it can be transformed into a

serial monoversion history by exchanging the order of adjacent steps

other than multiversion conflict pairs.

11 / 24

Multiversion Conflict Serializability

Definition 5.9 (Multiversion Conflict):

A

multiversion conflict

in m is a pair r

i (x

j ) and w

k

(x

k

) such that r

i (x

j ) <

m

w

k

(x

k

).

Definition 5.10 (Multiversion Reducibility):

An mv history is

multiversion reducible

if it can be transformed into a

serial monoversion history by exchanging the order of adjacent steps

other than multiversion conflict pairs.

Definition 5.11 (Multiversion Conflict Serializability):

An mv history is

multiversion conflict serializable

if there is a

serial monoversion history with the same transactions and

the same (ordering of) multiversion conflict pairs.

MCSR

denotes the class of all multiversion conflict serializable histories.

11 / 24

Multiversion Conflict Serializability

Definition 5.9 (Multiversion Conflict):

A

multiversion conflict

in m is a pair r

i (x

j ) and w

k

(x

k

) such that r

i (x

j ) <

m

w

k

(x

k

).

Definition 5.10 (Multiversion Reducibility):

An mv history is

multiversion reducible

if it can be transformed into a

serial monoversion history by exchanging the order of adjacent steps

other than multiversion conflict pairs.

Definition 5.11 (Multiversion Conflict Serializability):

An mv history is

multiversion conflict serializable

if there is a

serial monoversion history with the same transactions and

the same (ordering of) multiversion conflict pairs.

MCSR

denotes the class of all multiversion conflict serializable histories.

Definition 5.12 (Multiversion Conflict Graph):

For an mv schedule m the

multiversion conflict graph

is a graph with

transactions as nodes and an edge from t

i to t

k

if there are steps

r

i (x

j ) and w

k

(x

k

) such that r

i (x

j ) <

m

w

k

(x

k

).

12 / 24

Properties of MCSR

Theorem:

m is in MCSR

m is multiversion reducible

m's mv conflict graph is acyclic

12 / 24

Properties of MCSR

Theorem:

m is in MCSR

m is multiversion reducible

m's mv conflict graph is acyclic

Theorem 5.6:

MCSR

MVSR

Example:

m =

w

0

(x

0

) w

0

(y

0

) w

0

(z

0

) c

0

r

2

(y

0

)

r

3

(z

0

) w

3

(x

3

) c

3

r

1

(x

3

) w

1

(y

1

)

c

1

w

2

(x

2

) c

2

r

(x

2

)

r

(y

1

)

r

(z

0

)

c

MCSR

MVSR

m

v

t 0

t 3

t 2

t 1

t ∞

13 / 24

Chapter 5: Multiversion

Concurrency Control

5.2 Multiversion Schedules

5.3 Multiversion Serializability

5.4 Limiting the Number of Versions

5.5

Multiversion Concurrency Control Protocols

5.6 Lessons Learned

14 / 24

MVTO Protocol

each transaction t

i

is assigned a unique timestamp ts(t

i

)

r

i

(x) is mapped to r

i

(x

k

) where x

k

is the version

that carries the largest timestamp

ts(t

i

)

w

i

(x) is

rejected if there is r

j

(x

k

) with ts(t

k

) < ts(t

i

) < ts(t

j

)

mapped into w

i

(x

i

) otherwise

c

i

is delayed until c

j

of all transactions t

j

that have written versions read by t

i

Multiversion timestamp ordering (MVTO):

Correctness of MVTO

(i.e., Gen(MVTO)

MVSR):

x

i

<< x

j

ts(t

i

) < ts(t

j

)

15 / 24

MVTO Example

t 1

r 1 (x

0 )

r 1 (y

0 )

t 2

r 2 (y

0 ) w

2 (y

2 )

r 2 (x

0 ) w

2 (x

2 )

t 4

r 4 (y

2 ) w

4 (y

4 )

r 4 (x

2 ) w

4 (x

4 )

t 3

r 3 (x

2 ) r 3 (z

0 )

t 5

r 5 (y

2 ) r 5 (z

0 )

abort

15 / 24

MVTO Example

t 1

r 1 (x

0 )

r 1 (y

0 )

t 2

r 2 (y

0 ) w

2 (y

2 )

r 2 (x

0 ) w

2 (x

2 )

t 4

r 4 (y

2 ) w

4 (y

4 )

r 4 (x

2 ) w

4 (x

4 )

t 3

r 3 (x

2 ) r 3 (z

0 )

t 5

r 5 (y

2 ) r 5 (z

0 )

abort

interleaving

impossible w/o

multiple versions

15 / 24

MVTO Example

t 1

r 1 (x

0 )

r 1 (y

0 )

t 2

r 2 (y

0 ) w

2 (y

2 )

r 2 (x

0 ) w

2 (x

2 )

t 4

r 4 (y

2 ) w

4 (y

4 )

r 4 (x

2 ) w

4 (x

4 )

t 3

r 3 (x

2 ) r 3 (z

0 )

t 5

r 5 (y

2 ) r 5 (z

0 )

abort

interleaving

impossible w/o

multiple versions

needs to wait for t

2

to commit

15 / 24

MVTO Example

t 1

r 1 (x

0 )

r 1 (y

0 )

t 2

r 2 (y

0 ) w

2 (y

2 )

r 2 (x

0 ) w

2 (x

2 )

t 4

r 4 (y

2 ) w

4 (y

4 )

r 4 (x

2 ) w

4 (x

4 )

t 3

r 3 (x

2 ) r 3 (z

0 )

t 5

r 5 (y

2 ) r 5 (z

0 )

abort

interleaving

impossible w/o

multiple versions

needs to wait for t

2

to commit

since last write

is too late (in the

presence of t

5 )

16 / 24

Multiversion 2PL (MV2PL) Protocol

use write locking to ensure that

at each time there is at most one uncommitted version

for t

i that is not yet issuing its final step:

r

i (x) is mapped to

current version

(i.e., the most recent committed

version)

or an uncommitted version

w

i (x) is executed only if x is not write

-

locked, otherwise it is blocked

t i 's final step is delayed until after the commit of:

all t

j that have read from a current version of a data item that t

i has written

all t

j

from which t

i has read

General approach:

16 / 24

Multiversion 2PL (MV2PL) Protocol

use write locking to ensure that

at each time there is at most one uncommitted version

for t

i that is not yet issuing its final step:

r

i (x) is mapped to

current version

(i.e., the most recent committed

version)

or an uncommitted version

w

i (x) is executed only if x is not write

-

locked, otherwise it is blocked

t i 's final step is delayed until after the commit of:

all t

j that have read from a current version of a data item that t

i has written

all t

j

from which t

i has read

General approach:

Example 5.9:

for input schedule

s =

r

1

(x) w

1

(x)

r

2

(x) w

2

(y)

r

1

(y)

w

2

(x) c

2

w

1

(y) c

1

MV2PL produces the output schedule

r

1

(x

0

) w

1

(x

1

)

r

2

(x

1

) w

2

(y

2

)

r

1

(y

0

) w

1

(y

1

) c

1

w

2

(x

2

) c

2

17 / 24

Specialization: 2V2PL Protocol

request

write lock wl

i (x)

for writing a new uncommitted version

and ensuring that at most one such version exists at any time

request

read lock rl

i (x)

for reading the current version

(i.e., most recent committed version)

request

certify lock cl

i (x)

for final step of t

i

on all data items in t

i 's write set

2

-

Version (before/after image) 2PL:

+

_

_

_

rl

i (x)

wl

i (x)

rl

j (x)

wl

j (x)

lock

holder

lock requestor

cl

i (x)

cl

j (x)

_

_

_

+

+

Correctness of 2V2PL

(i.e., Gen(2V2PL)

MVSR):

x

i << x

j ⇔

f

i < f

j (for final

certify

steps of t

i , t

j )

18 / 24

2V2PL Example

Example 5.10:

s =

r

1

(x)

w

2

(y)

r

1

(y) w

1

(x) c

1

r

3

(y) r

3

(z) w

3

(z)

w

2

(x) c

2

w

4

(z) c

4

c

3

18 / 24

2V2PL Example

Example 5.10:

s =

r

1

(x)

w

2

(y)

r

1

(y) w

1

(x) c

1

r

3

(y) r

3

(z) w

3

(z)

w

2

(x) c

2

w

4

(z) c

4

c

3

t 1

r 1

(x

0

)

r 1

(y

0

)

t 2

w

2

(y

2

)

w

1

(x

1

)

t 3

r 3

(y

0

)

w

2

(x

2

)

r 3

(z

0

) w

3

(z

3

)

t 4

w

4 (z

4 )

c

2

18 / 24

2V2PL Example

Example 5.10:

s =

r

1

(x)

w

2

(y)

r

1

(y) w

1

(x) c

1

r

3

(y) r

3

(z) w

3

(z)

w

2

(x) c

2

w

4

(z) c

4

c

3

t 1

r 1

(x

0

)

r 1

(y

0

)

t 2

w

2

(y

2

)

w

1

(x

1

)

t 3

r 3

(y

0

)

w

2

(x

2

)

r 3

(z

0

) w

3

(z

3

)

t 4

w

4 (z

4 )

c

2

rl

1

(x) r

1

(x

0

)

wl

2

(y) w

2

(y

2

)

rl

1

(y) r

1

(y

0

) wl

1

(x) w

1

(x

1

) cl

1

(x) u

1

c

1

rl

3

(y) r

3

(y

0

) rl

3

(z) r

3

(z

0

)

wl

2

(x) cl

2

(x)

wl

3

(y) w

3

(z

3

) cl

3

(y) u

3

c

3

cl

2

(y) u

2

c

2

wl

4

(z) w

4

(z

4

) cl

4

(z) u

4

c

4

19 / 24

Multiversion Serialization Graph Testing

(MVSGT)

r

i (x) is mapped to r

i (x

j ) such that

there is no path t

j

...

t

k

...

t

i

with previous w

k

(x

k

)

(eliminate

too old

transactions)

there is no path t

i

...

t

j

(eliminate

too young

transactions)

abort t

i

if no such t

j

exists

Idea:

build version order and MVSG simultaneously (and incrementally)

Protocol rules:

upon w

i

(x

i

)

add edges t

j

t

i

for all t

j

with previous r

j

(x

k

)

abort t

i

when detecting cycle

upon r

i

(x

j

)

add edge t

j

t

i

and

edges t

k

t

j

or t

i

t

k

for all t

k

with previous w

k

(x

k

)

20 / 24

ROMV Protocol

each update transactions uses 2PL on both its read and write set

but each write creates a new version and

timestamps it with the transaction's commit time

each read

-

only transaction t

i is timestamped with its begin time

r

i (x) is mapped to r

i (x

k

) where x

k

is the version

that carries the largest timestamp

ts(t

i )

(i.e., the most recent committed version as of the begin of t

i )

Read

-

only Multiversion Protocol (ROMV):

Correctness of ROMV

(i.e., Gen(ROMV)

MVSR):

x

i << x

j ⇔

c

i < c

j

21 / 24

ROMV Example

t

1

r

1

(x

0

)

r

1

(y

0

)

t

2

r

2

(y

0

)

w

2

(y

2

)

r

2

(x

0

)

w

2

(x

2

)

t

3

r

3

(x

2

)

w

3

(x

3

)

t

4

r

4

(x

0

)

r

4

(z

0

)

t

5

r

5

(x

2

)

r

5

(z

0

)

22 / 24

Chapter 5: Multiversion

Concurrency Control

5.2 Multiversion Schedules

5.3 Multiversion Serializability

5.4 Limiting the Number of Versions

5.5 Multiversion Concurrency Control Protocols

5.6 Lessons Learned

23 / 24

Lessons Learned

Transient and transparent versioning adds a degree of

freedom to concurrency control protocols, making

MVSR considerably more powerful than VSR

The most striking benefit is for long read transactions

that execute concurrently with writers.

This specific benefit is achieved with relatively simple

protocols like ROMV.

24 / 24

Summary

Concurrency control in the page model

allows for many approaches, yet locking

dominates

Non

-

locking algorithms may be used in

special situations

Multiple versions can help making

concurrency control more flexible