-
ABSTRACT: Consider a system composed of mobile robots (mobile sensors) that move on the plane, each independently executing its own instance of an algorithm. Given a desired geometric pattern, the flocking problem consists in ensuring that the robots form this pattern and maintain it while moving together on the plane. In this paper, we look at the flocking problem in the presence of faulty robots, where the desired pattern is a regular polygon. We propose a distributed algorithm assuming a semi-synchronous model with a k-bounded scheduler, in the sense that no robot is activated more than k times between any two consecutive activations of any other robot. The algorithm is composed of three parts: a failure detector, rank assignment, and a flocking algorithm. The rank assignment part provides a persistent ranking for the robots in the system. The failure detector can then select the set of correct robots from among all the robots. Finally, the flocking algorithm handles the movement and reconfiguration of the flock while maintaining the desired shape. The difficulty of the problem comes from combining the three parts while also preventing collisions and allowing the flock to rotate. Unlike existing work, our algorithm lets the formation rotate freely and has good maneuverability.
Advanced Information Networking and Applications, 2009. AINA '09. International Conference on; 06/2009
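The k-bounded scheduler condition stated in the abstract above can be checked mechanically on a finite activation trace. The following is a minimal sketch, not code from the paper: the function name `is_k_bounded` and the representation of a schedule as a list of sets (one set of activated robot ids per round) are assumptions made for illustration.

```python
from itertools import permutations


def is_k_bounded(schedule, k):
    """Return True if no robot is activated more than k times between two
    consecutive activations of any other robot.

    schedule: list of sets; schedule[i] holds the ids of the robots
              activated in round i (hypothetical representation).
    """
    robots = set().union(*schedule)
    for q, r in permutations(robots, 2):
        # Activations of r since q was last activated; None until q is first seen.
        count = None
        for active in schedule:
            if q in active:
                # q is activated again: the interval since its previous
                # activation is closed, so check how often r ran inside it.
                if count is not None and count > k:
                    return False
                count = 0
            elif r in active and count is not None:
                count += 1
    return True
```

Only intervals closed by a second activation of the same robot are checked, so a trailing run at the end of a finite trace is not reported as a violation. Under this sketch, `is_k_bounded([{1}, {2}, {2}, {1}], 2)` holds while the same trace fails for `k = 1`.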
-
ABSTRACT: Recently, flocking of a group of mobile robots has gained a lot of attention due to its wide range of applications, such as manufacturing, surveillance, and space exploration. All robots must adapt to a complex environment during flocking. In this paper, we propose a decentralized flocking algorithm that avoids collisions between a robot and its neighbors, as well as collisions between robots and obstacles when obstacles are present in the environment. Simulation results show that the algorithm effectively achieves collision avoidance.
Grid and Pervasive Computing Workshops, 2008. GPC Workshops '08. The 3rd International Conference on; 06/2008
-
ABSTRACT: This paper compares several parametric and adaptive failure detection schemes in terms of their respective QoS. We introduce improvements over existing methods and evaluate their benefits. First, we propose an optimization to enhance the adaptation of Chen's FD, which significantly improves QoS, especially in the aggressive range and when the network is unstable. Second, we address the problem of most adaptive schemes, namely their need for a large window of samples. We study a scheme that is designed to use a fixed and very limited amount of memory for each monitored-monitoring link. Our experimental results over several kinds of networks (Cluster, WiFi, wired LAN, WAN) show the properties of the existing adaptive FDs, and that the optimization is reasonable and acceptable. Furthermore, the extensive experimental results show the effect of memory size on the overall QoS of each adaptive FD.
Dependable Computing, 2007. PRDC 2007. 13th Pacific Rim International Symposium on; 01/2008
-
ABSTRACT: The scheduling of real-time tasks with fault-tolerance requirements has been an important problem in multiprocessor systems. The primary-backup (PB) approach is often used as a fault-tolerant technique to guarantee the deadlines of tasks despite the presence of faults. In this paper we propose a PB-based task scheduling approach, wherein an allocation parameter is used to search the available time slots for a newly arriving task, and the previously scheduled tasks can be rescheduled when there is no available time slot for the newly arriving task. In order to improve the schedulability, we extend the existing PB-overloading and backup-backup (BB) overloading techniques. Our proposed task scheduling algorithm is compared with some existing scheduling algorithms in the literature through simulation studies. The results show that the task rejection ratio of our real-time task scheduling algorithm is lower than that of the compared algorithms.
Distributed Simulation and Real-Time Applications, 2007. DS-RT 2007. 11th IEEE International Symposium; 11/2007
-
ABSTRACT: In this study, we focus on a self-deployment problem for a swarm of autonomous mobile robots that can be used to build a sensor networking infrastructure with equilateral triangle lattice configurations. In order to deploy the swarm, this paper proposes a self-stabilizing distributed self-deployment algorithm under a robot model with the following features: no identification numbers, no common coordinates, no predetermined leader, no memory for past actions, and implicit communication. Despite the restricted model, our proposed algorithm, based on local interactions, provides a solution for the self-deployment problem. Moreover, the algorithm keeps the swarm connected in spite of the loss of several robots. We discuss in detail the features of the algorithm, including self-organization, self-stabilization, and robustness. A simulation study demonstrates the validity of the algorithm.
Robot and Human interactive Communication, 2007. RO-MAN 2007. The 16th IEEE International Symposium on; 09/2007
-
ABSTRACT: This paper surveys the state of the art of agent-based fault tolerance techniques. Existing mobile agent-based fault-tolerant techniques are identified with respect to preventing mobile agents from being blocked by a failure.
Parallel and Distributed Computing, Applications and Technologies, 2005. PDCAT 2005. Sixth International Conference on; 01/2006
-
ABSTRACT: In this paper, we propose a novel active queue management (AQM) scheme based on the Random Early Detection (RED) of the loss ratio and the total sending rate control, called LRC-RED, to regulate the queue length with small variation and to achieve high utilization with small packet loss. This scheme measures the latest packet loss ratio, and uses it and the total sending rate as complements to queue length in order to dynamically adjust packet drop probability. Further, we also provide the design rules for this scheme based on the well-known TCP control model. On the basis of the design rules, we develop a simple, scalable and systematic rule for tuning the control parameters that adapts to dynamic network conditions. Through ns-2 simulations, we show the faster response time and better robustness of the proposed LRC-RED as compared with the Loss Ratio based RED (LRED) [5] algorithm.
Parallel and Distributed Computing, Applications and Technologies, 2005. PDCAT 2005. Sixth International Conference on; 01/2006
-
ABSTRACT: As two different research topics with much overlap, dependability and security of computer/communication systems each have a long and rich history. The development of the techniques for their modeling and analysis has thus followed distinct but convergent paths. In essence, diverse attributes and the fundamental difference between the nature of the failures bring in different concerns for dependability and security analysis during their modeling process. Taking the understanding of the basic concepts/attributes as a point of departure, this paper intends to carry out a comparative study on the analytical models of computer system dependability and security. Also, by examining the state-of-the-art quantitative techniques and sound modeling methodologies for dependability evaluation, e.g., combinatorial and stochastic methods, we attempt to explore why and how those methods can be extended to evaluate computer system security. Furthermore, we take our developed autonomic detection coordinator (for intrusion detection) as a case study to conduct the comparative analysis.
Parallel and Distributed Computing, Applications and Technologies, 2005. PDCAT 2005. Sixth International Conference on; 01/2006
-
ABSTRACT: It is widely recognized that distributed systems would greatly benefit from the availability of a generic failure detection service. There are however several issues that must be addressed before such a service can actually be implemented. In this paper, we highlight the issue of propagating information on failures in the phi failure detector for large-scale systems. Traditionally, failure detection systems provide information on suspects to every process. However, this is not efficient in a large-scale system. We consider a notification system that propagates information on suspicions with content-based filtering.
Database and Expert Systems Applications, 2005. Proceedings. Sixteenth International Workshop on; 09/2005
-
ABSTRACT: In this paper, we consider a distributed system that consists of a group of teams of worker robots that rely on physical robot messengers for the communication between the teams. Unlike traditional distributed systems, there is a finite number of messengers in the system, and thus a team can send messages to other teams only when some messenger robot is available locally. It follows that careful management of the messengers is necessary to avoid the starvation of some teams. Concretely, the paper proposes algorithms to provide group membership and view synchrony among robot teams. We look at the problem in the face of failures, in particular when a certain number of messenger robots can possibly crash.
Advanced Information Networking and Applications, 2005. AINA 2005. 19th International Conference on; 04/2005
-
ABSTRACT: The detection of failures is a fundamental issue for fault-tolerance in distributed systems. Recently, many people have come to realize that failure detection ought to be provided as some form of generic service, similar to IP address lookup or time synchronization. However, this has not been successful so far, one of the reasons being that classical failure detectors were not designed to satisfy several application requirements simultaneously. We present a novel abstraction, called accrual failure detectors, that emphasizes flexibility and expressiveness and can serve as a basic building block for implementing failure detectors in distributed systems. Instead of providing information of a binary nature (trust vs. suspect), accrual failure detectors output a suspicion level on a continuous scale. The principal merit of this approach is that it favors a nearly complete decoupling between application requirements and the monitoring of the environment. In this paper, we describe an implementation of such an accrual failure detector, which we call the φ failure detector. The particularity of the φ failure detector is that it dynamically adjusts the scale on which the suspicion level is expressed to current network conditions. We analyzed the behavior of our φ failure detector over an intercontinental communication link over a week. Our experimental results show that it performs as well as other known adaptive failure detection mechanisms, with improved flexibility.
Reliable Distributed Systems, 2004. Proceedings of the 23rd IEEE International Symposium on; 11/2004
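The abstract above describes a suspicion level on a continuous scale but does not give a formula. As a minimal sketch of one common way to realize such a detector, the code below assumes heartbeat inter-arrival times are roughly normally distributed over a sliding window and reports φ as the negative base-10 logarithm of the probability that the next heartbeat is still to come. The class name `PhiAccrualDetector`, the parameters `window_size` and `min_std`, and the normal-distribution model are illustrative assumptions, not details taken from the abstract.

```python
import math
from collections import deque
from statistics import mean, pstdev


class PhiAccrualDetector:
    """Illustrative accrual failure detector (assumed normal inter-arrival model)."""

    def __init__(self, window_size=100, min_std=1e-3):
        # Sliding window of recent heartbeat inter-arrival times.
        self.intervals = deque(maxlen=window_size)
        self.last_arrival = None
        # Lower bound on the standard deviation, to avoid division by zero.
        self.min_std = min_std

    def heartbeat(self, now):
        """Record a heartbeat received at time `now` (seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def phi(self, now):
        """Suspicion level at time `now`; larger means more suspicious."""
        if not self.intervals:
            return 0.0  # not enough samples to estimate anything yet
        mu = mean(self.intervals)
        sigma = max(pstdev(self.intervals), self.min_std)
        elapsed = now - self.last_arrival
        # Probability that a heartbeat arrives later than `elapsed`
        # (upper tail of the normal distribution).
        p_later = 0.5 * math.erfc((elapsed - mu) / (sigma * math.sqrt(2)))
        p_later = max(p_later, 1e-300)  # clamp so the logarithm stays finite
        return -math.log10(p_later)
```

An application would compare `phi(now)` against its own threshold, which is how the decoupling between monitoring and application requirements shows up in this sketch: the detector only reports a level, and each application decides when that level means "suspect".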
-
ABSTRACT: Mobile computing can be seen as a natural extension of distributed computing, with the difference that hosts can be physically mobile. This results in many interesting new challenges. The most original aspect of mobile computing with respect to traditional distributed computing is when one considers problems whereby the movements of the host must be controlled. In particular, this is a central issue for cooperating autonomous mobile systems. We outline a specification framework to define recurrent problems for cooperative autonomous mobile systems. The framework consists of four generic properties (two liveness and two safety properties) that can be combined to define many different problems, including those surveyed in the literature. We regard this as a necessary step toward a better understanding of the relationships between problems.
Distributed Computing Systems Workshops, 2004. Proceedings. 24th International Conference on; 04/2004
-
ABSTRACT: While group communication systems have been proposed for some time, they are still not used much in actual systems. We believe that one reason for this is the lack of standardisation of group communication system interfaces. The paper proposes an architecture, using the standard decomposition into services, where services are based on standard interfaces: both interactions between services and interactions with the application use existing, open standards. A decomposition of the group communication into services is presented, along with a description of applicable standards. As an example, a group membership service based on the LDAP standard is discussed.
Network Computing and Applications, 2003. NCA 2003. Second IEEE International Symposium on; 05/2003
-
ABSTRACT: This paper investigates the two main and seemingly antagonistic approaches to broadcasting messages reliably in fault-tolerant distributed systems: the approach based on reliable broadcast, and that based on view synchronous communication (or VSC for short). While VSC does more than reliable broadcast, this has a cost. We show that this cost can be reduced by exploiting the difference between input-triggered and output-triggered suspicions, and by replacing the standard VSC broadcast primitive by two broadcast primitives, one sensitive to input-triggered suspicions, and the other sensitive to output-triggered suspicions.
Reliable Distributed Systems, 2002. Proceedings. 21st IEEE Symposium on; 02/2002
-
ABSTRACT: Designing, tuning, and analyzing the performance of distributed
algorithms and protocols are complex tasks. A major factor that
contributes to this complexity is the fact that there is no single
environment to support all phases of the development of a distributed
algorithm. This paper presents Neko, an easy to use Java platform that
provides a uniform and extensible environment for the various phases of
algorithm design and performance evaluation: prototyping, tuning,
simulation, deployment, etc
Information Networking, 2001. Proceedings. 15th International Conference on; 02/2001
-
ABSTRACT: Algorithms for solving agreement problems can be classified in two
categories: (1) those relying on failure detectors (FDs), which we call
FD-based, and (2) those that rely on a group membership service (GMS),
which we call GMS-based. This paper discusses the advantages and
limitations of these two approaches and proposes an extension to the GMS
approach that combines the advantages of both approaches, without their
drawbacks. This extension leads us to distinguish between time-triggered
suspicions of processes and space-triggered exclusions
Object-Oriented Real-Time Dependable Systems, 2001. Proceedings. Sixth International Workshop on; 02/2001
-
ABSTRACT: Fault tolerance can be achieved in distributed systems by
replication. However Fischer, Lynch and Paterson (1985) have proven an
impossibility result about consensus in the asynchronous system model,
and similar impossibility results exist for atomic broadcast and group
membership. We investigate, with the aid of an experiment conducted in a
LAN, whether these impossibility results set limits to the robustness of
a replicated server exposed to extremely high loads. The experiment
consists of client processes that send requests to a replicated server
(three replicas) using an atomic broadcast primitive. It has parameters
that allow us to control the load on the hosts and the network, as well
as the timeout value used by our heartbeat failure detection mechanism.
Our main observation is that the atomic broadcast algorithm never stops
delivering messages, not even under arbitrarily high load and very small
timeout values (1 ms). So, by trying to illustrate the practical impact
of impossibility results, we discovered that we had implemented a very
robust replicated service
Reliable Distributed Systems, 2001. Proceedings. 20th IEEE Symposium on; 02/2001
-
ABSTRACT: The paper considers a consensus algorithm for an asynchronous
system augmented with failure detectors, and analyzes the impact on its
termination time of various implementations of failure detectors. The
study shows that the design of fault-tolerant distributed algorithms in
the asynchronous system model augmented with failure detectors is
orthogonal to implementing the actual failure detectors. This nicely
decouples logical issues (proof of correctness) from engineering issues
(e.g., performance and timing constraints)
Dependable Computing, 2001. Proceedings. 2001 Pacific Rim International Symposium on; 02/2001
-
ABSTRACT: Resource contention is widely recognized as having a major impact
on the performance of distributed algorithms. Nevertheless, the metrics
that are commonly used to predict their performance take little or no
account of contention. We define two performance metrics for distributed
algorithms that account for network contention as well as CPU
contention. We then illustrate the use of these metrics by comparing
four atomic broadcast algorithms, and show that our metrics allow for a
deeper understanding of performance issues than conventional metrics
Computer Communications and Networks, 2000. Proceedings. Ninth International Conference on; 02/2000
-
ABSTRACT: One of the fundamental differences between a centralized system and a distributed one is the notion of partial failures. The ability to efficiently and accurately detect failures is a key element underlying reliable distributed computing. In current distributed systems, however, failure detection is either left to the application developer or hidden from the programmer and provided in an ad-hoc manner behind the scenes. We plead for an intermediate approach where failure detectors are first-class objects. We view failure detection as an abstraction, the complexity of which is encapsulated behind well-defined interfaces. The various roles of a failure detection service are all represented as first-class objects. Following our approach, one can reuse existing failure detection protocols as they are, or, through composition or refinement, one can define new protocols that match the application requirements. We describe an interesting result of a composition that mixes push and pull failure monitoring, and we show how scalability issues may be addressed by using a hierarchical failure detection configuration. We also discuss the implementation of our failure service both in CORBA and in Java
Distributed Objects and Applications, 1999. Proceedings of the International Symposium on; 02/1999