Concurrency Control: Deadlock Handling
Deadlock Handling
A system is in a deadlock state if there exists a set of transactions such that every transaction in the set is waiting for another transaction in the set. More precisely, there exists a set of waiting transactions {T0, T1, ..., Tn} such that T0 is waiting for a data item that T1 holds, T1 is waiting for a data item that T2 holds, ..., Tn−1 is waiting for a data item that Tn holds, and Tn is waiting for a data item that T0 holds. None of the transactions can make progress in such a situation.
The only remedy to this undesirable situation is for the system to invoke some drastic action, such as rolling back some of the transactions involved in the deadlock.
Rollback of a transaction may be partial: That is, a transaction may be rolled back to the point where it obtained a lock whose release resolves the deadlock.
There are two principal methods for dealing with the deadlock problem. We can use a deadlock prevention protocol to ensure that the system will never enter a deadlock state. Alternatively, we can allow the system to enter a deadlock state, and then try to recover by using a deadlock detection and deadlock recovery scheme. As we shall see, both methods may result in transaction rollback. Prevention is commonly used if the probability that the system would enter a deadlock state is relatively high; otherwise, detection and recovery are more efficient.
Note that a detection and recovery scheme requires overhead that includes not only the run-time cost of maintaining the necessary information and of executing the detection algorithm, but also the potential losses inherent in recovery from a deadlock.
Deadlock Prevention
There are two approaches to deadlock prevention. One approach ensures that no cyclic waits can occur by ordering the requests for locks, or requiring all locks to be acquired together. The other approach is closer to deadlock recovery, and performs transaction rollback instead of waiting for a lock, whenever the wait could potentially result in a deadlock.
The simplest scheme under the first approach requires that each transaction locks all its data items before it begins execution. Moreover, either all are locked in one step or none are locked. There are two main disadvantages to this protocol: (1) it is often hard to predict, before the transaction begins, what data items need to be locked; (2) data-item utilization may be very low, since many of the data items may be locked but unused for a long time.
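The all-or-nothing rule can be illustrated with a minimal sketch. The LockTable class, its method names, and the transaction and item names below are hypothetical, and the table models exclusive locks only; it is a toy illustration rather than a real lock manager.

```python
class LockTable:
    """Toy exclusive-lock table mapping each data item to its owning transaction."""

    def __init__(self):
        self.owner = {}

    def lock_all(self, txn, items):
        """Grant every requested lock in one step, or grant none of them."""
        items = set(items)
        if any(self.owner.get(i, txn) != txn for i in items):
            return False          # some item is held by another transaction
        for i in items:
            self.owner[i] = txn
        return True

    def release_all(self, txn):
        """Drop every lock held by txn."""
        self.owner = {i: t for i, t in self.owner.items() if t != txn}


table = LockTable()
first = table.lock_all("T1", ["A", "B"])
second = table.lock_all("T2", ["B", "C"])   # B is held by T1, so C is not locked either
c_locked = "C" in table.owner
table.release_all("T1")
third = table.lock_all("T2", ["B", "C"])    # succeeds once T1 has released its locks
```

Because T2 never holds C while waiting for B, no transaction holds one item while waiting for another, which is what rules out cyclic waits. The cost is visible too: T1 holds both A and B for its whole lifetime, whether or not it is using them.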
Another approach for preventing deadlocks is to impose an ordering of all data items, and to require that a transaction lock data items only in a sequence consistent with the ordering. We have seen one such scheme in the tree protocol, which uses a partial ordering of data items.
A variation of this approach is to use a total order of data items, in conjunction with two-phase locking. Once a transaction has locked a particular item, it cannot request locks on items that precede that item in the ordering. This scheme is easy to implement, as long as the set of data items accessed by a transaction is known when the transaction starts execution. There is no need to change the underlying concurrency-control system if two-phase locking is used: All that is needed is to ensure that locks are requested in the right order.
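As a rough sketch of ordered locking, the following uses Python threads to stand in for transactions. The item names A, B, and C, the helper functions, and the choice of alphabetical order as the total order are all assumptions made for illustration.

```python
import threading

# Hypothetical data items; sorting by name supplies the total order.
locks = {name: threading.Lock() for name in ("A", "B", "C")}

def lock_in_order(needed):
    """Growing phase: acquire the needed locks in ascending name order."""
    ordered = sorted(set(needed))
    for name in ordered:
        locks[name].acquire()
    return ordered

def unlock_all(held):
    """Shrinking phase: release every lock that was acquired."""
    for name in reversed(held):
        locks[name].release()

def run_transaction(needed, log):
    held = lock_in_order(needed)
    try:
        log.append(tuple(held))   # stand-in for the transaction's real work
    finally:
        unlock_all(held)

log = []
t1 = threading.Thread(target=run_transaction, args=(["A", "B"], log))
t2 = threading.Thread(target=run_transaction, args=(["B", "A"], log))
t1.start(); t2.start()
t1.join(); t2.join()
```

Although the second transaction names its items as B then A, both transactions request A first, so neither can hold B while waiting for A. Note that the sketch needs the full set of items up front, which is the condition stated above.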
The second approach for preventing deadlocks is to use preemption and transaction rollbacks. In preemption, when a transaction T2 requests a lock that transaction T1 holds, the lock granted to T1 may be preempted by rolling back T1 and granting the lock to T2. To control the preemption, we assign a unique timestamp to each transaction. The system uses these timestamps only to decide whether a transaction should wait or roll back. Locking is still used for concurrency control. If a transaction is rolled back, it retains its old timestamp when restarted. Two different deadlock-prevention schemes using timestamps have been proposed:
1. The wait–die scheme is a nonpreemptive technique. When transaction Ti requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp smaller than that of Tj (that is, Ti is older than Tj). Otherwise, Ti is rolled back (dies).
For example, suppose that transactions T22, T23, and T24 have timestamps 5, 10, and 15, respectively. If T22 requests a data item held by T23, then T22 will wait. If T24 requests a data item held by T23, then T24 will be rolled back.
2. The wound–wait scheme is a preemptive technique. It is a counterpart to the wait–die scheme. When transaction Ti requests a data item currently held by Tj, Ti is allowed to wait only if it has a timestamp larger than that of Tj (that is, Ti is younger than Tj). Otherwise, Tj is rolled back (Tj is wounded by Ti).
Returning to our example, with transactions T22, T23, and T24, if T22 requests a data item held by T23, then the data item will be preempted from T23, and T23 will be rolled back. If T24 requests a data item held by T23, then T24 will wait.
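The two decision rules can be written as a pair of small functions. This is only a sketch of the comparisons described above; the function names and the returned strings are illustrative choices, and the timestamps are those of the T22, T23, T24 example.

```python
def wait_die(requester_ts, holder_ts):
    """Nonpreemptive rule: an older requester waits, a younger requester dies."""
    return "wait" if requester_ts < holder_ts else "rollback requester"

def wound_wait(requester_ts, holder_ts):
    """Preemptive rule: an older requester wounds the holder, a younger one waits."""
    return "rollback holder" if requester_ts < holder_ts else "wait"

# Timestamps from the example: a smaller timestamp means an older transaction.
ts = {"T22": 5, "T23": 10, "T24": 15}

wd_t22 = wait_die(ts["T22"], ts["T23"])    # T22 requests an item held by T23
wd_t24 = wait_die(ts["T24"], ts["T23"])    # T24 requests an item held by T23
ww_t22 = wound_wait(ts["T22"], ts["T23"])
ww_t24 = wound_wait(ts["T24"], ts["T23"])
```

In both schemes a transaction only ever waits in one direction of the timestamp order (older waits for younger in wait–die, younger waits for older in wound–wait), so a cycle of waiting transactions cannot form.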
Whenever the system rolls back transactions, it is important to ensure that there is no starvation; that is, no transaction gets rolled back repeatedly and is never allowed to make progress.
Both the wound–wait and the wait–die schemes avoid starvation: At any time, there is a transaction with the smallest timestamp. This transaction cannot be required to roll back in either scheme. Since timestamps always increase, and since transactions are not assigned new timestamps when they are rolled back, a transaction that is rolled back repeatedly will eventually have the smallest timestamp, at which point it will not be rolled back again.
There are, however, significant differences in the way that the two schemes operate.
• In the wait–die scheme, an older transaction must wait for a younger one to release its data item. Thus, the older the transaction gets, the more it tends to wait. By contrast, in the wound–wait scheme, an older transaction never waits for a younger transaction.
• In the wait–die scheme, if a transaction Ti dies and is rolled back because it requested a data item held by transaction Tj, then Ti may reissue the same sequence of requests when it is restarted. If the data item is still held by Tj, then Ti will die again. Thus, Ti may die several times before acquiring the needed data item. Contrast this series of events with what happens in the wound–wait scheme. Transaction Ti is wounded and rolled back because Tj requested a data item that it holds. When Ti is restarted and requests the data item now being held by Tj, Ti waits. Thus, there may be fewer rollbacks in the wound–wait scheme.
The major problem with both of these schemes is that unnecessary rollbacks may occur.
Timeout-Based Schemes
Another simple approach to deadlock handling is based on lock timeouts. In this approach, a transaction that has requested a lock waits for at most a specified amount of time. If the lock has not been granted within that time, the transaction is said to time out, and it rolls itself back and restarts. If there was in fact a deadlock, one or more transactions involved in the deadlock will time out and roll back, allowing the others to proceed. This scheme falls somewhere between deadlock prevention, where a deadlock will never occur, and deadlock detection and recovery, which Section 16.6.3 discusses.
The timeout scheme is particularly easy to implement, and works well if transactions are short and if long waits are likely to be due to deadlocks. However, in general it is hard to decide how long a transaction must wait before timing out. Too long a wait results in unnecessary delays once a deadlock has occurred. Too short a wait results in transaction rollback even when there is no deadlock, leading to wasted resources. Starvation is also a possibility with this scheme. Hence, the timeout-based scheme has limited applicability.
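A lock timeout can be sketched with the timeout parameter of Python's threading.Lock.acquire. The helper name lock_with_timeout and the 0.05-second limit are arbitrary choices for illustration; a real system would also roll the transaction back and restart it when the wait fails.

```python
import threading

def lock_with_timeout(lock, timeout_s):
    """Return True if the lock was granted within timeout_s seconds, else False."""
    return lock.acquire(timeout=timeout_s)

item = threading.Lock()
item.acquire()                                   # another transaction holds the item

# The requester waits at most 0.05 s, then gives up (it would roll back here).
timed_out = not lock_with_timeout(item, 0.05)

item.release()                                   # the holder finishes
granted = lock_with_timeout(item, 0.05)          # the retry now succeeds
item.release()
```

The sketch also shows the weakness noted above: the requester cannot tell whether the wait failed because of a deadlock or because the holder was merely slow.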
Deadlock Detection and Recovery
If a system does not employ some protocol that ensures deadlock freedom, then a detection and recovery scheme must be used. An algorithm that examines the state of the system is invoked periodically to determine whether a deadlock has occurred. If one has, then the system must attempt to recover from the deadlock. To do so, the system must:
• Maintain information about the current allocation of data items to transactions, as well as any outstanding data item requests.
• Provide an algorithm that uses this information to determine whether the system has entered a deadlock state.
• Recover from the deadlock when the detection algorithm determines that a deadlock exists.
In this section, we elaborate on these issues.
Deadlock Detection
Deadlocks can be described precisely in terms of a directed graph called a wait-for graph. This graph consists of a pair G = (V, E), where V is a set of vertices and E is a set of edges. The set of vertices consists of all the transactions in the system. Each element in the set E of edges is an ordered pair Ti → Tj. If Ti → Tj is in E, then there is a directed edge from transaction Ti to Tj, implying that transaction Ti is waiting for transaction Tj to release a data item that it needs.
When transaction Ti requests a data item currently being held by transaction Tj, then the edge Ti → Tj is inserted in the wait-for graph. This edge is removed only when transaction Tj is no longer holding a data item needed by transaction Ti.
A deadlock exists in the system if and only if the wait-for graph contains a cycle.
Each transaction involved in the cycle is said to be deadlocked. To detect deadlocks, the system needs to maintain the wait-for graph, and periodically to invoke an algorithm that searches for a cycle in the graph.
To illustrate these concepts, consider the wait-for graph in Figure 16.18, which depicts the following situation:
• Transaction T25 is waiting for transactions T26 and T27.
• Transaction T27 is waiting for transaction T26.
• Transaction T26 is waiting for transaction T28.
Since the graph has no cycle, the system is not in a deadlock state.
Suppose now that transaction T28 is requesting an item held by T27. The edge T28 → T27 is added to the wait-for graph, resulting in the new system state in Figure 16.19. This time, the graph contains the cycle
T26 → T28 → T27 → T26
implying that transactions T26, T27, and T28 are all deadlocked.
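The cycle search can be sketched as a depth-first traversal of the wait-for graph. The adjacency-list representation and the function name find_cycle are assumptions; the text does not prescribe a particular algorithm. The graph below encodes the example above, before and after the edge T28 → T27 is added.

```python
def find_cycle(wait_for):
    """Return one cycle as a list of transactions, or None if the graph is acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2      # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(t):
        color[t] = GRAY
        path.append(t)
        for u in wait_for.get(t, ()):
            state = color.get(u, WHITE)
            if state == GRAY:                     # back edge: u is on the current path
                return path[path.index(u):] + [u]
            if state == WHITE:
                found = visit(u)
                if found:
                    return found
        path.pop()
        color[t] = BLACK
        return None

    for t in list(wait_for):
        if color.get(t, WHITE) == WHITE:
            cycle = visit(t)
            if cycle:
                return cycle
    return None

# Each key waits for every transaction in its list.
graph = {"T25": ["T26", "T27"], "T27": ["T26"], "T26": ["T28"], "T28": []}
before = find_cycle(graph)            # no cycle yet
graph["T28"].append("T27")            # T28 requests an item held by T27
after = find_cycle(graph)             # ["T26", "T28", "T27", "T26"]
```

T25 does not appear in the reported cycle: it is waiting on deadlocked transactions, but no transaction in the cycle is waiting for it.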
Consequently, the question arises: When should we invoke the detection algorithm? The answer depends on two factors:
1. How often does a deadlock occur?
2. How many transactions will be affected by the deadlock?
If deadlocks occur frequently, then the detection algorithm should be invoked more frequently than usual. Data items allocated to deadlocked transactions will be unavailable to other transactions until the deadlock can be broken. In addition, the number of cycles in the graph may also grow. In the worst case, we would invoke the detection algorithm every time a request for allocation could not be granted immediately.
Recovery from Deadlock
When a detection algorithm determines that a deadlock exists, the system must recover from the deadlock. The most common solution is to roll back one or more transactions to break the deadlock. Three actions need to be taken:
1. Selection of a victim. Given a set of deadlocked transactions, we must determine which transaction (or transactions) to roll back to break the deadlock. We should roll back those transactions that will incur the minimum cost. Unfortunately, the term minimum cost is not a precise one. Many factors may determine the cost of a rollback, including
a. How long the transaction has computed, and how much longer the transaction will compute before it completes its designated task.
b. How many data items the transaction has used.
c. How many more data items the transaction needs for it to complete.
d. How many transactions will be involved in the rollback.
2. Rollback. Once we have decided that a particular transaction must be rolled back, we must determine how far this transaction should be rolled back.
The simplest solution is a total rollback: Abort the transaction and then restart it. However, it is more effective to roll back the transaction only as far as necessary to break the deadlock. Such partial rollback requires the system to maintain additional information about the state of all the running transactions. Specifically, the sequence of lock requests/grants and updates performed by the transaction needs to be recorded. The deadlock detection mechanism should decide which locks the selected transaction needs to release in order to break the deadlock. The selected transaction must be rolled back to the point where it obtained the first of these locks, undoing all actions it took after that point. The recovery mechanism must be capable of performing such partial rollbacks. Furthermore, the transactions must be capable of resuming execution after a partial rollback. See the bibliographical notes for relevant references.
3. Starvation. In a system where the selection of victims is based primarily on cost factors, it may happen that the same transaction is always picked as a victim. As a result, this transaction never completes its designated task, and so it starves. We must ensure that a transaction can be picked as a victim only a (small) finite number of times. The most common solution is to include the number of rollbacks in the cost factor.
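Victim selection with a rollback count in the cost factor might look like the following sketch. The cost formula, the penalty of 10 per earlier rollback, and the field names are invented for illustration, since the text notes that minimum cost has no precise definition. For simplicity the sketch keeps each transaction's other cost figures unchanged after a rollback.

```python
from dataclasses import dataclass

@dataclass
class TxnInfo:
    name: str
    work_done: float      # hypothetical measure, e.g. seconds already computed
    items_used: int
    rollbacks: int = 0    # how many times this transaction has been a victim

def rollback_cost(t, rollback_penalty=10.0):
    """Hypothetical cost: work that would be lost plus a penalty per earlier rollback."""
    return t.work_done + t.items_used + rollback_penalty * t.rollbacks

def pick_victim(deadlocked):
    """Choose the cheapest transaction to roll back and record the rollback."""
    victim = min(deadlocked, key=rollback_cost)
    victim.rollbacks += 1
    return victim

cycle = [TxnInfo("T26", 4.0, 3), TxnInfo("T27", 1.0, 1), TxnInfo("T28", 6.0, 2)]
picks = [pick_victim(cycle).name for _ in range(3)]   # three successive deadlocks
```

T27 is the cheapest victim the first time, but its recorded rollback raises its cost from 2 to 12, so the next two selections fall on T26 and then T28 instead of on T27 again.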