Database transaction schedule: Difference between revisions

Content deleted Content added
m Cascadeless: Typo fixing, replaced: a.k.a, → a.k.a.
 
(12 intermediate revisions by 5 users not shown)
Line 3:
{{more citations needed|date=November 2012}}
 
In the fields of [[database]]s and [[transaction processing]] (transaction management), a '''schedule''' (or '''history''') of a system is an abstract model to describe the order of [[Execution (computing)|executionexecutions]] in a set of transactions running in the system. Often it is a ''list'' of operations (actions) ordered by time, performed by a set of [[Database transaction|transactions]] that are executed together in the system. If the order in time between certain operations is not determined by the system, then a ''[[partial order]]'' is used. Examples of such operations are requesting a read operation, reading, writing, aborting, [[Commit (data management)|committing]], requesting a [[Lock (computer science)|lock]], locking, etc. NotOften, all transaction operation types should be included inonly a schedule,subset andof typicallythe only selectedtransaction operation types (e.g., data access operations) are included, asin neededa to reason about and describe certain phenomenaschedule.

Schedules and schedule properties are fundamental concepts in database [[concurrency control]] theory. A schedule describes the order of actions of the transactions as seen by the [[DBMS]]. In practice, most general purpose database systems employ conflict-serializable and strict recoverable (primarily strict) schedules.
 
==Notation==
Line 10 ⟶ 12:
* '''Columns:''' The different transactions in the schedule.
* '''Rows:''' The time order of operations (a.k.a., actions).
Operations/ (a.k.a., actions):
 
* '''R(X):''' The corresponding transaction "reads" object X (i.e., it retrieves the data stored at X). This is done so that it can modify the data (e.g., X=X+4) during a "write" operation rather than merely overwrite it. When the schedule is represented as a list rather than a grid, the action is represented as <math>Ri(X)</math> where <math>i</math> is a number corresponding to a specific transaction.
Line 163 ⟶ 165:
|Com.
|}
Serializability is used to keep the data in the data item in a consistent state. It is the major criterion for the correctness of concurrent transactions' schedule, and thus supported in all general purpose database systems. Schedules that are not serializable are likely to generate erroneous outcomes; which can be extremely harmful (e.g., when dealing with money within banks).<ref name="Bernstein872">[[Phil Bernstein|Philip A. Bernstein]], Vassos Hadzilacos, Nathan Goodman (1987): [http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx ''Concurrency Control and Recovery in Database Systems''] (free PDF download), Addison Wesley Publishing Company, {{ISBN|0-201-10715-5}}</ref><ref name="Weikum012">[[Gerhard Weikum]], Gottfried Vossen (2001): [http://www.elsevier.com/wps/find/bookdescription.cws_home/677937/description#description ''Transactional Information Systems''], Elsevier, {{ISBN|1-55860-508-8}}</ref><ref name="Herlihy1993">[[Maurice Herlihy]] and J. Eliot B. Moss. ''Transactional memory: architectural support for lock-free data structures.'' Proceedings of the 20th annual international symposium on Computer architecture (ISCA '93). Volume 21, Issue 2, May 1993.</ref>
 
If any specific order between some transactions is requested by an application, then it is enforced independently of the underlying serializability mechanisms. These mechanisms are typically indifferent to any specific order, and generate some unpredictable [[partial order]] that is typically compatible with multiple serial orders of these transactions. This partial order results from the scheduling orders of concurrent transactions' data access operations, which depend on many factors.
 
Serializability is used in concurrency control of databases,<ref name="Bernstein872">[[Phil Bernstein|Philip A. Bernstein]], Vassos Hadzilacos, Nathan Goodman (1987): [http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx ''Concurrency Control and Recovery in Database Systems''] (free PDF download), Addison Wesley Publishing Company, {{ISBN|0-201-10715-5}}</ref><ref name="Weikum012">[[Gerhard Weikum]], Gottfried Vossen (2001): [http://www.elsevier.com/wps/find/bookdescription.cws_home/677937/description#description ''Transactional Information Systems''], Elsevier, {{ISBN|1-55860-508-8}}</ref> transaction processing (transaction management), and various [[Database transaction|transactional]] applications (e.g., [[transactional memory]]<ref name="Herlihy1993">[[Maurice Herlihy]] and J. Eliot B. Moss. ''Transactional memory: architectural support for lock-free data structures.'' Proceedings of the 20th annual international symposium on Computer architecture (ISCA '93). Volume 21, Issue 2, May 1993.</ref> and [[software transactional memory]]). Serializability is considered the highest level of [[Isolation (database systems)|isolation]] between [[Database transaction|transactions]], and plays an essential role in [[concurrency control]].
 
'''Serializability theory''' provides the formal framework to reason about and analyze serializability and its techniques.
 
====Conflicting actions====
Line 325 ⟶ 323:
|Com.
|}
The conditions for S3 to be view-equivalent to S1 and equivalenceS2 were not satisfied at the corresponding superscripts for the following reasons:
 
# Failed the first condition of view equivalence because T1 read the initial value for B in S1 and S2, but T2 read the initial value for B in S3.
# Failed the second condition of view equivalence because T2 read the value written by T1 for B in S1 and S2, but T1 read the value written by T2 for B in S3.
# Failed the third condition of view equivalence because T2 did the final write for B in S1 and S2, but T1 did the final write for B in S3.
To quickly analyze whether two schedules are view-equivalent, write both schedules as a list with each action's subscript representing which view-equivalence condition they match. The schedules are view equivalent if and only if all the actions have the same subscript (or lack thereof) in both schedules:
 
Line 454 ⟶ 452:
These schedules are recoverable. The schedule F is recoverable because T1 commits before T2, that makes the value read by T2 correct. Then T2 can commit itself. In the F2 schedule, if T1 aborted, T2 has to abort because the value of A it read is incorrect. In both cases, the database is left in a consistent state.
 
TransactionSchedule J is unrecoverable because T2 committed before T1 despite previously reading the value written by T1. Because T1 aborted after T2 committed, the value read by T2 is wrong. Because a transaction cannot be rolled-back after it commits, the schedule is unrecoverable.
 
====Cascadeless====
 
'''Cascadeless schedules''' (a.k.a,. "Avoiding Cascading Aborts (ACA) schedules") are schedules which avoid cascading aborts by disallowing [[Write–read conflict|dirty reads]]. '''Cascading aborts''' occur when one transaction's abort causes another transaction to abort because it read and relied on the first transaction's changes to an object. A '''dirty read''' occurs when a transaction reads data from uncommitted write in another transaction.<ref>{{Cite web |date=2019-08-06 |title=Cascadeless in DBMS |url=https://www.geeksforgeeks.org/cascadeless-in-dbms/ |access-date=2023-11-29 |website=GeeksforGeeks |language=en-US}}</ref>
 
The following examples are the same as the ones in the discussion on recoverable:
Line 537 ⟶ 535:
====Strict====
 
A schedule is '''strict - has the strictness property -''' if for any two transactions T1, T2, if a write operation of T1 precedes a ''conflicting'' operation of T2 (either read or write), then the commit or abort event of T1 also precedes that conflicting operation of T2. For example, the schedule F3 above is strict.
 
Any strict schedule is cascade-less, but not the converse. Strictness allows efficient recovery of databases from failure.
 
== Serializability Classclass Relationshipsrelationships ==
 
The following expressions illustrate the hierarchical (containment) relationships between [[serializability]] and [[Serializability#Correctness - recoverability|recoverability]] classes:
Line 553 ⟶ 551:
 
==See also==
* [[scheduleSchedule (project management)]]
* [[Two-phase locking|Strong strict two-phase locking]] (SS2PL or Rigorousness).
* [[Snapshot isolation#Making Snapshot Isolation Serializable|Making snapshot isolation serializable]]<ref name="Cahill082"/> in [[Snapshot isolation]].
Line 561 ⟶ 559:
==References==
{{Reflist}}
 
*<cite id=Bern1987>[[Phil Bernstein|Philip A. Bernstein]], Vassos Hadzilacos, Nathan Goodman: [http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx ''Concurrency Control and Recovery in Database Systems''], Addison Wesley Publishing Company, 1987, {{ISBN|0-201-10715-5}}</cite>
*<cite id=Weikum2001>[[Gerhard Weikum]], Gottfried Vossen: [http://www.elsevier.com/wps/find/bookdescription.cws_home/677937/description#description ''Transactional Information Systems''], Elsevier, 2001, {{ISBN|1-55860-508-8}}</cite>
 
[[Category:Data management]]
Line 570 ⟶ 565:
[[Category:Transaction processing]]
[[Category:Distributed computing problems]]
[[Category:NP-complete problems]]