Conference Paper

A Distributed Algorithm for Deadlock Detection and Resolution.

... Deadlock detection/resolution is an important problem in a distributed system and much attention has been devoted to it in the past few years. Many distributed deadlock detection/resolution algorithms have been proposed; however, most of them either have not given a correctness proof [1, 2, 3] or have given an informal proof by using intuitive operational arguments [4, 5, 6]. Intuitive operational arguments are prone to errors, and many of the published algorithms have been found to be incorrect [4, 5, 1, 3]. ...
Article
A large number of published distributed deadlock detection/resolution algorithms are found to be incorrect because they have used informal approaches to prove the correctness of their algorithms. In this paper, we present a formal approach for the correctness proof and give an example of the proof. In this proposed approach, a formal model of distributed deadlock is presented with a local-time deadlock specification for correctness verification. With the formal model, we have an insight into the definition of deadlock in local views which is used to show the existence of a real deadlock. A rigorous proof to show the equivalence of local-time and global-time deadlock specifications is presented.
Department of Computer Science, University of Missouri-Rolla, Rolla, Missouri
1 Introduction
A distributed system consists of a set of processors connected by bidirectional communication links with processes and resources resident on each processor. Processes and resources communicate with each...
Article
To detect deadlock in distributed systems, the initiator should construct an efficient explicit or implicit global wait-for graph. In this paper, we present an unstructured deadlock detection algorithm using a gossip protocol in cloud computing environments, where constituting nodes may join and leave at any time. Because of the inherent properties of a gossip protocol, we argue that our proposed deadlock detection algorithm is scalable, fault-tolerant, and efficient, retaining safety and liveness properties. The correctness proof of the algorithm is also provided. The message complexity of our proposed algorithm is O(n), where n is the number of nodes. Our performance evaluation with scalable settings shows that our approach has a significant advantage over previous deadlock detection algorithms in terms of solving scalability, fault-tolerance, and complexity–efficiency issues. Copyright © 2013 John Wiley & Sons, Ltd.
Article
Deadlock detection and resolution is one of the major components of a successful distributed database management system. In this article, we discuss deadlock detection and resolution strategies and present two approaches for detecting and resolving deadlocks in both general distributed database systems and distributed real-time database systems. Our first approach is to collect information on the connectivity of nodes of the overall Transaction Wait-For Graph (TWFG) of the distributed database system and then use this connectivity information to build a local TWFG at each node of the overall TWFG. We then detect the deadlocks by locating the cycles in each local TWFG. To resolve the deadlocks, the nodes involved in those cycles in each local TWFG are removed until there is no cycle in the local TWFGs. Our second approach continuously checks for the occurrence of a deadlock between different transaction trees. As soon as it detects a deadlock, it resolves it by aborting whichever of the transaction trees was initiated more recently. Some of the advantages of our approaches over the approaches which use probe messages are: (1) no extra storage is required to store different probe messages, (2) no false (phantom) deadlocks are reported, (3) all deadlocks are detected and resolved. In addition, our approaches use fewer messages and less time to detect and resolve all deadlocks in the existing TWFG of the distributed database system.
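The cycle-locating and node-removal steps described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it works on a single local TWFG held in memory, and the names `find_cycle`, `resolve`, and `start_time`, as well as the victim policy (abort the most recently started transaction in each cycle, borrowed from the second approach), are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished


def find_cycle(graph: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one cycle of the wait-for graph as a list of nodes, or None."""
    color = {node: WHITE for node in graph}
    stack: List[str] = []  # nodes on the current depth-first path

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        stack.append(node)
        for nxt in sorted(graph.get(node, ())):
            if color.get(nxt, WHITE) == GREY:
                # Back edge: the path from nxt to node closes a cycle.
                return stack[stack.index(nxt):]
            if nxt in graph and color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


def resolve(graph: Dict[str, Set[str]], start_time: Dict[str, int]) -> List[str]:
    """Remove victims until the graph is acyclic; return the victims in order.

    Assumed policy: in each cycle, abort the transaction that started last.
    """
    work = {node: set(edges) for node, edges in graph.items()}  # private copy
    victims: List[str] = []
    while True:
        cycle = find_cycle(work)
        if cycle is None:
            return victims
        victim = max(cycle, key=lambda t: start_time[t])
        victims.append(victim)
        work.pop(victim, None)
        for edges in work.values():
            edges.discard(victim)


# Hypothetical local TWFG: T1 -> T2 -> T3 -> T1, and T4 waits for T1.
twfg = {"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}, "T4": {"T1"}}
started = {"T1": 1, "T2": 2, "T3": 3, "T4": 4}
print(find_cycle(twfg))
print(resolve(twfg, started))
```

In this example T4 is blocked behind the cycle but is not part of it, so it is never chosen as a victim; removing one member of the cycle is enough to let the remaining transactions proceed.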
Conference Paper
Deadlock detection and resolution is one of the major components of a successful distributed database management system. In this paper, we discuss deadlock detection and resolution strategies and present two approaches for detecting and resolving deadlocks in both general distributed database systems and in distributed real-time database systems. Our first approach is to collect information on the connectivity of nodes of the overall Transaction Wait-For Graph (TWFG) of the distributed database system and then use this connectivity information to build a local TWFG at each node of the overall TWFG. We then detect the deadlocks by locating the cycles in each local TWFG. To resolve the deadlocks, the nodes involved in those cycles in each local TWFG are removed until there is no cycle in the local TWFGs. Our second approach continuously checks for the occurrence of a deadlock between different transaction trees. As soon as it detects a deadlock, it resolves it by aborting whichever of the transaction trees was initiated more recently. Some of the advantages of our approaches over the approaches which use probe messages are: (1) no extra storage is required to store different probe messages, (2) no false (phantom) deadlocks are reported, (3) all deadlocks are detected and resolved. In addition, our approaches use fewer messages and less time to detect and resolve all deadlocks in the existing TWFG of the distributed database system.
Conference Paper
We present a continuous deadlock detection and resolution algorithm in distributed database systems. Our algorithm maintains an augmented transaction wait-for graph at each site and uses a modified priority-based probe generation scheme in order to detect local deadlocks without transmitting any intra-site deadlock detection messages, to minimize the number of inter-site messages sent for detection of global deadlocks and also for the early detection of global deadlocks that might occur in the future without transmitting detection messages repeatedly. The augmented transaction wait-for graph contains, in addition to lock-wait information, information about message-wait relationships among agents of a transaction, probes received from other sites and transitive wait-for relationships among transactions. Global deadlocks are declared whenever a transitive wait-for relationship from an agent of a global transaction is constructed for some agent of the transaction.
Conference Paper
The occurrence of deadlocks should be controlled effectively by their detection and resolution, but may sometimes lead to a serious system failure. This fact implies that deadlock detection scheduling should be designed from the viewpoints of not only the performance trade-off between overall message usage and deadlock persistence time but also the prevention of system failure. In this paper, we reformulate Ling et al.'s deadlock detection scheduling problem (2006) in the presence of system failures, and derive the optimal deadlock detection time minimizing the long-run average cost per unit time. By introducing the message complexities of the deadlock detection and resolution algorithms being used, we investigate the asymptotically optimal frequency of deadlock detection scheduling in terms of the number of distributed processes through the well-known Landau notation.
Article
This paper deals with the problem of deadlock detection in asynchronous message passing systems in a system model that covers unspecified receptions and non-FIFO channels. It presents a hierarchy of deadlock models and deadlock detection problems. It abstracts deadlocks by a general deadlock model that has the same modeling power as the OR-AND model; however, it has much more concise expressive power. An abstract general definition of deadlocks in distributed systems is presented that defines deadlocks independently of the underlying deadlock model. This formulation can be used to design a single distributed deadlock detection algorithm which uniformly addresses all deadlocks in the context of various request models such as AND, OR, AND-OR, and k-out-of-n requests. A simple generalized deadlock detection algorithm that uses a circulating token is presented to illustrate the concept. The algorithm is formally described and proven correct. Moreover, possible refinements of the basic solution concerning improvements of token routing and parallel implementation are outlined and evaluated. Extensions to individual and global termination issues are also addressed. Since the proposed deadlock detection algorithm is designed around the abstract definition of deadlocks, it has some very favorable features.
Article
In the design of highly complex, heterogeneous and concurrent systems, deadlock detection remains an important issue. In this paper, we systematically analyze the synchronization dependencies in system-level designs. We propose a data structure called the dynamic synchronization dependency graph, which captures the runtime blocking dependencies among concurrent processes. A loop-detection algorithm is then used to detect deadlocks and help designers quickly isolate and identify modeling errors that cause the deadlock problems. We demonstrate our approach through two publicly available system-level modeling languages, SystemC and Metropolis, and two real world design examples, which are complex system-level functional models for video processing.
Conference Paper
Cellular robotic systems (CRS) employ a large number of robots operating in cellular spaces under distributed control. In this paper, the relationship between CRS and distributed computing is discussed. Two problems encountered in designing pattern generation protocols for CRS, the n-way intersection problem and the knot detection problem, are related to the distributed mutual exclusion problem and the distributed deadlock detection problem, respectively. Solutions to these two problems, derived from their counterparts in distributed computing, are presented in the CRS context.
Article
The author describes a series of deadlock detection techniques based on centralized, hierarchical, and distributed control organizations. The point of view is that of practical implications. An up-to-date and comprehensive survey of deadlock detection algorithms is presented, their merits and drawbacks are discussed, and their performances (delays as well as message complexity) are compared. Related issues such as correctness of the algorithms, performance of the algorithms, and deadlock resolution, which require further research, are examined.
Article
Deadlock is one of the most serious problems in multitasking concurrent programming systems. The deadlock problem becomes further complicated when the underlying system is distributed and when tasks have timing constraints. Distributed deadlock detection has been studied to some extent in distributed database systems and distributed timesharing operating systems, but has not been widely used in real-time systems. In this paper, we investigate deadlock detection algorithms in distributed environments and extend the results to real-time systems by considering timing constraints in the algorithms. In particular, we direct our attention to the Ada environment and try to apply our solutions to it. We analyze and categorize the deadlock problem in Ada environments into four levels of complexity by using Knapp's hierarchy of deadlock models. To fully support Ada semantics it is necessary to develop solutions for the most complex level. Many Ada applications, however, do not utilize all ...
Article
Edge-chasing is the basis of many deadlock detection algorithms. This method detects a deadlock by propagating special messages called probes along dependency edges. When the initiator of a probe receives the probe back, it knows that a deadlock exists. Once a deadlock is detected, a special message called a token is sent to clean up those probes in the deadlock cycle which, if not removed, may later lead to phantom deadlock detections. Only after the token has traversed the entire deadlock cycle and returned to its initiator is the deadlock resolved by aborting a so-called victim in the deadlock cycle. In a deadlock, all involved transactions are held waiting and all involved resources are locked up. It is thus desirable to resolve a deadlock as soon as it is detected, without waiting for the token message to go around the deadlock cycle. This paper proposes an algorithm that achieves this and thereby reduces the average deadlock persistence time by as much as two thirds...
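The basic edge-chasing idea described above (a probe travels along dependency edges, and the initiator declares a deadlock when its own probe comes back) can be sketched in Python as a single-process simulation. This is only an illustration under stated assumptions: message passing is replaced by an in-memory queue, each process forwards a given probe at most once, and neither the token clean-up phase nor the paper's proposed early-resolution algorithm is modelled. The names `probe_detects_deadlock` and `waits_for` are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Set, Tuple


def probe_detects_deadlock(waits_for: Dict[str, Set[str]], initiator: str) -> bool:
    """Simulate one probe started by `initiator` over the wait-for edges.

    Returns True if the probe comes back to the initiator, which means the
    initiator lies on a cycle of wait-for edges.
    """
    # Each queue entry stands for a probe message in flight: (sender, receiver).
    in_flight: Deque[Tuple[str, str]] = deque(
        (initiator, target) for target in sorted(waits_for.get(initiator, ()))
    )
    forwarded: Set[str] = set()  # processes that already forwarded this probe

    while in_flight:
        _sender, receiver = in_flight.popleft()
        if receiver == initiator:
            return True  # the probe returned: deadlock detected
        if receiver in forwarded:
            continue  # assumed rule: never forward the same probe twice
        forwarded.add(receiver)
        # A blocked receiver passes the probe on to everything it waits for.
        for target in sorted(waits_for.get(receiver, ())):
            in_flight.append((receiver, target))
    return False


# Hypothetical dependencies: P1 -> P2 -> P3 -> P1, and P4 waits for P1.
deps = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}, "P4": {"P1"}}
print(probe_detects_deadlock(deps, "P1"))
print(probe_detects_deadlock(deps, "P4"))
```

A probe started by P4 never returns to P4, because P4 is only waiting behind the cycle and is not on it; in this simplified scheme the deadlock is reported by a process that is actually part of the cycle.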
Article
Deadlock is one of the most serious problems in multitasking concurrent programming systems. The deadlock problem becomes further complicated when the underlying system is distributed and when tasks have timing constraints. Distributed deadlock detection has been studied to some extent in distributed database systems and distributed timesharing operating systems but has not been widely used in real-time systems. In this paper, we investigate deadlock detection algorithms in distributed environments and extend the results to real-time systems by considering timing constraints in the algorithms. In particular, we direct our attention to the Ada environment and try to apply our solutions to it. Related problems, such as livelocks, orphan tasks, task termination problems, and global state detection, are considered when it is appropriate. This paper has two main parts. First, we complete a state-of-the-art survey of the distributed deadlock detection algorithms proposed in the litera...
Conference Paper
This paper presents a distributed algorithm to detect deadlocks in distributed data bases. Features of this paper are (1) a formal model of the problem is presented, (2) the correctness of the algorithm is proved, i.e. we show that all true deadlocks will be detected and deadlocks will not be reported falsely, (3) no assumptions are made other than that messages are received correctly and in order and (4) the algorithm is simple.
Conference Paper
An efficient distributed algorithm to detect deadlocks in distributed and dynamically changing systems is presented. In our model, processes can request any N available resources from a pool of size M. This is a generalization of the well-known AND-OR request model. The algorithm is incrementally derived and proven correct. Its communication, computational, and space complexity compares favorably with those of previously known distributed AND-OR deadlock detection algorithms.
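To make the N-out-of-M request model above concrete, the following Python sketch decides which processes are deadlocked by a centralized fixed-point reduction over a global snapshot. It is not the distributed algorithm of the paper; it only illustrates the request model. The function name `deadlocked`, the `requests` mapping, and the assumption that a complete, consistent snapshot is available are all introduced here for illustration.

```python
from typing import Dict, Set, Tuple

# A blocked process is described by (need, pool): it can continue once any
# `need` members of `pool` are able to grant its request.
Request = Tuple[int, Set[str]]


def deadlocked(processes: Set[str], requests: Dict[str, Request]) -> Set[str]:
    """Return the processes that can never be unblocked in this snapshot."""
    # Processes with no outstanding request are active and can grant.
    free = {p for p in processes if p not in requests}
    changed = True
    while changed:  # repeat until no further process can be unblocked
        changed = False
        for proc, (need, pool) in requests.items():
            if proc not in free and len(pool & free) >= need:
                free.add(proc)
                changed = True
    return processes - free


# Hypothetical snapshot: D is active; A needs 1 of {B, D};
# B needs 2 of {A, C}; C needs 1 of {B}.
snapshot = {"A": (1, {"B", "D"}), "B": (2, {"A", "C"}), "C": (1, {"B"})}
print(sorted(deadlocked({"A", "B", "C", "D"}, snapshot)))
```

Setting `need` equal to the size of the pool gives an AND request and setting it to 1 gives an OR request, which is why the model is described as a generalization of the AND-OR model. In the example, A can proceed through D, but B needs both A and C while C needs B, so B and C remain deadlocked.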
Article
We propose an algorithm for detecting deadlocks among transactions running concurrently in a distributed processing network (i.e., a distributed database system). The proposed algorithm is a distributed deadlock detection algorithm. A proof of the correctness of the distributed portion of the algorithm is given, followed by an example of the algorithm in operation. The performance characteristics of the algorithm are also presented.
Article
This paper describes a method for the detection of properties of general graphs in an environment in which each node can be considered an autonomous processor, interacting with its neighbors by passing messages. These algorithms are decentralized in that they depend on no central controlling process nor on global storage. No node is required to know the configuration or extent of the graph, and no global clock is required. The algorithms are inherently asynchronous, and in general require execution time proportional to the diameter of the graph. Copyright © 1982 by The Institute of Electrical and Electronics Engineers, Inc.
Article
A hierarchically organized and a distributed protocol for deadlock detection in distributed databases are presented in [1]. In this paper we show that the distributed protocol is incorrect, and present possible remedies. However, the distributed protocol remains impractical because "condensations" of "transaction-wait-for" graphs make graph updates difficult to perform. Delayed graph updates cause the occurrence of false deadlocks in this as well as in some other deadlock detection protocols for distributed systems. The performance degradation that results from false deadlocks depends on the characteristics of each protocol.
Article
This paper describes two protocols for the detection of deadlocks in distributed data bases: a hierarchically organized one and a distributed one. A graph model which depicts the state of execution of all transactions in the system is used by both protocols. A cycle in this graph is a necessary and sufficient condition for a deadlock to exist. Nevertheless, neither protocol requires that the global graph be built and maintained in order for deadlocks to be detected. In the case of the hierarchical protocol, the communications cost can be optimized if the topology of the hierarchy is appropriately chosen.