In recent years, there has been a dramatic increase in the amount of available computing and storage resources, yet few users have been able to exploit these resources in an aggregated form. We present the Condor-G system, which leverages software from Globus and Condor to allow users to harness multi-domain resources as if they all belong to one personal domain. We describe the structure of Condor-G and how it handles job management, resource selection, security, and fault tolerance.
Utility or on-demand computing, a provisioning model where a service provider makes computing infrastructure available to customers as needed, is becoming increasingly common in enterprise computing systems. Realizing this model requires making dynamic, and sometimes risky, resource provisioning and allocation decisions in an uncertain operating environment to maximize revenue while reducing operating cost. This paper develops an optimization framework wherein the resource provisioning problem is posed as one of sequential decision making under uncertainty and solved using a limited lookahead control scheme. The proposed approach accounts for the switching costs incurred during resource provisioning and explicitly encodes risk in the optimization problem. Simulations using workload traces from the Soccer World Cup 1998 web site show that a computing system managed by our controller generates up to 20% more revenue than a system without dynamic control while incurring low control overhead.
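For illustration, the following toy snippet sketches a generic limited lookahead (receding-horizon) controller with switching-cost and risk terms; all names, cost models, and parameters are illustrative assumptions, not the paper's formulation.

```python
# Minimal sketch of a limited lookahead controller for resource provisioning.
# forecast_demand, step_utility, HORIZON, and CANDIDATE_SERVERS are illustrative.
HORIZON = 3                      # lookahead depth (kept small to bound control cost)
CANDIDATE_SERVERS = [4, 8, 16]   # allowable provisioning levels

def forecast_demand(t):
    """Toy demand forecast (requests/sec) for time step t."""
    return 100 + 40 * (t % 4)

def step_utility(servers, prev_servers, demand, risk_weight=0.5):
    served = min(demand, servers * 20)                   # each server handles ~20 req/s
    revenue = 1.0 * served                               # revenue per served request
    op_cost = 2.0 * servers                              # operating cost per server
    switch = 5.0 * abs(servers - prev_servers)           # switching (reconfiguration) cost
    risk = risk_weight * max(0, demand - servers * 20)   # penalty for unserved demand
    return revenue - op_cost - switch - risk

def lookahead(prev_servers, t, depth=HORIZON):
    """Return (best_utility, best_first_action) over the lookahead horizon."""
    if depth == 0:
        return 0.0, None
    best = (float("-inf"), None)
    for a in CANDIDATE_SERVERS:
        u = step_utility(a, prev_servers, forecast_demand(t))
        future, _ = lookahead(a, t + 1, depth - 1)
        if u + future > best[0]:
            best = (u + future, a)
    return best

# Apply only the first action of the best sequence, then re-plan next interval.
_, action = lookahead(prev_servers=8, t=0)
print("provision", action, "servers for the next control interval")
```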
In this paper, we consider the problem of scheduling and mapping
precedence-constrained tasks to a network of heterogeneous processors.
In such systems, processors are usually physically distributed, implying
that the communication cost is considerably higher than in tightly
coupled multiprocessors. Therefore, scheduling and mapping algorithms
for such systems must schedule the tasks as well as the communication
traffic by treating both the processors and communication links as
important resources. We propose an algorithm that achieves these
objectives and adapts its task scheduling and mapping decisions
according to the given network topology. Just like tasks, messages are
also scheduled and mapped to suitable links during the minimization of
the finish times of tasks. Heterogeneity of processors is exploited by
scheduling critical tasks to the fastest processors. Our extensive
experimental study has demonstrated that the proposed algorithm is
efficient, robust, and yields consistent performance over a wide range
of scheduling parameters.
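For illustration, a minimal sketch of communication-aware list scheduling on heterogeneous processors in the spirit described above; the example DAG, cost tables, and priority rule are illustrative assumptions rather than the proposed algorithm's exact rules.

```python
# Communication-aware list scheduling sketch: tasks are placed on the processor
# that minimizes their finish time, counting link transfer delays from predecessors.
tasks = ["A", "B", "C", "D"]
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
pred = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
comp = {  # computation cost of each task on each (heterogeneous) processor
    "A": {0: 4, 1: 6}, "B": {0: 3, 1: 2}, "C": {0: 5, 1: 4}, "D": {0: 2, 1: 3},
}
comm = {("A", "B"): 2, ("A", "C"): 3, ("B", "D"): 1, ("C", "D"): 2}  # link costs

proc_free = {0: 0.0, 1: 0.0}   # time at which each processor becomes free
placed = {}                     # task -> (processor, finish_time)

def ready(t):
    return all(p in placed for p in pred[t])

order, remaining = [], set(tasks)
while remaining:
    # crude "critical task first" priority: prefer tasks with more successors
    t = max((x for x in remaining if ready(x)), key=lambda x: len(succ[x]))
    best = None
    for p in proc_free:
        # data from a predecessor arrives after a link transfer unless co-located
        data_ready = max(
            (placed[q][1] + (0 if placed[q][0] == p else comm[(q, t)])
             for q in pred[t]), default=0.0)
        finish = max(proc_free[p], data_ready) + comp[t][p]
        if best is None or finish < best[1]:
            best = (p, finish)
    placed[t] = best
    proc_free[best[0]] = best[1]
    remaining.remove(t)
    order.append(t)

print(order, placed)   # scheduling order, plus (processor, finish time) per task
```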
In this paper we present the design and implementation of a Pluggable Fault Tolerant CORBA Infrastructure that provides fault tolerance for CORBA applications by utilizing the pluggable protocols framework that is available for most CORBA ORBs. Our approach does not require modification to the CORBA ORB, and requires only minimal modifications to the application. Moreover, it avoids the difficulty of retrieving and assigning the ORB state that would arise if the fault tolerance mechanisms were incorporated into the ORB. The Pluggable Fault Tolerant CORBA Infrastructure achieves performance that is similar to, or better than, that of other Fault Tolerant CORBA systems, while providing strong replica consistency.
Several distributed routing algorithms for wireless networks were described recently, based on location information of nodes available via the Global Positioning System (GPS). In a greedy routing algorithm, the sender or node S currently holding the message m forwards m to one of its neighbors that is the closest to the destination. The algorithm fails if S does not have any neighbor that is closer to the destination than S. The FACE algorithm guarantees the delivery of m if the network, modeled by a unit graph, is connected. The GFG algorithm combines the greedy and FACE algorithms. We further improve the performance of the GFG algorithm by reducing its average hop count. First, we improve the FACE algorithm by adding a sooner-back procedure for earlier escape from FACE mode. Then we perform a shortcut procedure at each forwarding node S. Node S uses the local information available to calculate as many hops as possible and forwards the packet to the last known hop directly instead of forwarding it to the next hop. The second improvement is based on the concept of dominating sets. The network of internal nodes defines a connected dominating set, and each node must be either internal or directly connected to an internal node. We apply several existing definitions of internal nodes, namely the concepts of intermediate, inter-gateway and gateway nodes. We propose to run GFG routing, enhanced by the shortcut procedure, on the dominating set, except possibly for the first and last hops. We obtain a localized routing algorithm that guarantees delivery and has very low excess in terms of hop count compared to the shortest path algorithm. Experimental data show that the length of the additional path can be reduced to about half of that of the existing GFG algorithm.
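A minimal sketch of the greedy forwarding step that GFG combines with FACE recovery; the node and neighbor representation (coordinate tuples) is an illustrative assumption.

```python
# Greedy forwarding step of GFG-style position-based routing: forward to the
# neighbor closest to the destination, and signal a switch to FACE mode when
# no neighbor makes progress toward the destination.
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def greedy_next_hop(current, neighbors, destination):
    """Return the neighbor closest to destination, or None if greedy fails
    (no neighbor is closer than the current node -> fall back to FACE mode)."""
    best = min(neighbors, key=lambda n: dist(n, destination), default=None)
    if best is None or dist(best, destination) >= dist(current, destination):
        return None          # local minimum: caller should enter FACE routing
    return best

# Example: S at (0,0), destination at (10,0)
print(greedy_next_hop((0, 0), [(2, 1), (1, -3)], (10, 0)))   # -> (2, 1)
print(greedy_next_hop((0, 0), [(-2, 1)], (10, 0)))           # -> None (FACE mode)
```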
The development of CPUs has entered the multi-core era. Due to the lack of thread-level support, most simulation platforms cannot take full advantage of multi-core processors. To fill this gap, we propose a hierarchical parallel simulation kernel (HSK) model. The model has two layers. The first layer, named the process kernel, is responsible for managing all thread kernels on the second layer. The second layer is a group of thread kernels, which are responsible for scheduling and advancing logical processes. Each thread kernel is mapped onto an executing thread to advance the simulation in parallel. In addition, two algorithms are proposed to support high performance: (1) to improve the communication efficiency between threads, we propose a pointer-based communication mechanism; by using buffers, synchronization between threads can be eliminated. (2) To eliminate redundant Lower Bound on Time Stamp (LBTS) computation and to avoid interrupting thread execution, we employ an approximate method to compute LBTS asynchronously. A proof of validity is presented. The execution performance of HSK is demonstrated by a series of simulation experiments with a modified PHOLD model. HSK achieves good speedup for applications, especially those with coarse-grained events.
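For illustration, a minimal sketch of the basic LBTS quantity that the kernel described above approximates asynchronously; the flat data layout is an assumed simplification of a real kernel's per-LP state.

```python
# Lower Bound on Time Stamp (LBTS): no logical process may receive an event
# earlier than the minimum over all LPs of (next local event time + lookahead),
# also accounting for messages still in transit. Events with timestamp <= LBTS
# are safe to process in parallel under conservative synchronization.
def compute_lbts(lp_clocks, lookaheads, in_transit_timestamps):
    """lp_clocks[i]: next unprocessed event time of LP i;
    lookaheads[i]: minimum delay LP i adds to any message it sends."""
    bound = min(c + la for c, la in zip(lp_clocks, lookaheads))
    if in_transit_timestamps:
        bound = min(bound, min(in_transit_timestamps))
    return bound

print(compute_lbts([10.0, 12.5, 9.0], [1.0, 0.5, 2.0], [11.2]))   # -> 11.0
```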
Multihop wireless networks are treated as random symmetric planar point graphs, where all the nodes have the same transmission power and radius, and the vertices of the graph are drawn randomly over a certain geographical region. Several basic and important topological properties of random multihop wireless networks are studied, including node degree, connectivity, diameter, bisection width, and biconnectivity. It is believed that such a study has useful implications for real applications.
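A minimal sketch of the random unit-disk model implied above, computing the average node degree and checking connectivity; the parameters are illustrative.

```python
# Random multihop wireless network model: nodes dropped uniformly in a square,
# with an edge whenever two nodes are within the common transmission radius r.
import random, math
from collections import deque

def random_multihop_graph(n=100, side=1.0, r=0.2, seed=1):
    random.seed(seed)
    pts = [(random.uniform(0, side), random.uniform(0, side)) for _ in range(n)]
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pts[i], pts[j]) <= r:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def is_connected(adj):
    seen, queue = {0}, deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)

adj = random_multihop_graph()
avg_degree = sum(len(v) for v in adj.values()) / len(adj)
print(f"average degree {avg_degree:.2f}, connected: {is_connected(adj)}")
```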
Recent advances in hardware and software virtualization offer unprecedented management capabilities for the mapping of virtual
resources to physical resources. It is highly desirable to further create a “service hosting abstraction” that allows application
owners to focus on service level objectives (SLOs) for their applications. This calls for a resource management solution that
achieves the SLOs for many applications in response to changing data center conditions and hides the complexity from both
application owners and data center operators. In this paper, we describe an automated capacity and workload management system
that integrates multiple resource controllers at three different scopes and time scales. Simulation and experimental results
confirm that such an integrated solution ensures efficient and effective use of data center resources while reducing service
level violations for high-priority applications.
The IEEE 802.11 network technology is the emerging standard for wireless LANs and mobile networking. The fundamental access mechanism in the IEEE 802.11 MAC protocol is the Distributed Coordination Function. In this paper, we present an analytical method for estimating the saturation throughput of an 802.11 wireless LAN under the assumption of ideal channel conditions. The proposed method generalizes existing 802.11 LAN models and advances them to take the Seizing Effect into consideration. This real-life effect consists in the following: the station that has just successfully completed its transmission has a better chance than other LAN stations of winning the competition and therefore of seizing the channel. The saturation throughput of 802.11 wireless LANs is investigated with the developed method. The numerical results are validated by simulation and lead to a revision of the existing view of the optimal access strategy under saturation conditions.
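For context, a sketch of the classical Bianchi-style saturation-throughput expression that analyses of this kind build on; the seizing-effect model described above changes how each station's transmission probability is derived, which is not reproduced here.

```latex
% tau:   per-slot transmission probability of a station
% n:     number of contending stations
% sigma: empty slot duration; T_s, T_c: durations of a successful / collided slot
% E[P]:  average payload transmission time
P_{tr} = 1 - (1-\tau)^{n}, \qquad
P_{s} = \frac{n\tau(1-\tau)^{n-1}}{1-(1-\tau)^{n}}, \qquad
S = \frac{P_{s}\,P_{tr}\,E[P]}{(1-P_{tr})\,\sigma + P_{tr}P_{s}T_{s} + P_{tr}(1-P_{s})T_{c}}
```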
Both distributed systems and multicore systems are difficult programming environments. Although the expert programmer may
be able to carefully tune these systems to achieve high performance, the non-expert may struggle. We argue that high level
abstractions are an effective way of making parallel computing accessible to the non-expert. An abstraction is a regularly
structured framework into which a user may plug in simple sequential programs to create very large parallel programs. By virtue
of a regular structure and declarative specification, abstractions may be materialized on distributed, multicore, and distributed
multicore systems with robust performance across a wide range of problem sizes. In previous work, we presented the All-Pairs
abstraction for computing on distributed systems of single CPUs. In this paper, we extend All-Pairs to multicore systems,
and introduce the Wavefront and Makeflow abstractions, which represent a number of problems in economics and bioinformatics.
We demonstrate good scaling of both abstractions up to 32 cores on one machine and hundreds of cores in a distributed system.
Keywords: Abstractions, Multicore, Distributed systems, Bioinformatics, Economics
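For illustration, a minimal sketch of the All-Pairs pattern on a multicore machine; this is an illustrative rendering of the abstraction, not the authors' implementation (which also handles distributed dispatch, data movement, and fault tolerance).

```python
# All-Pairs abstraction sketch: given sets A and B and a user-supplied
# sequential function F, compute M[i][j] = F(A[i], B[j]) in parallel.
from multiprocessing import Pool

def F(a, b):
    """User plug-in comparison function (toy example)."""
    return abs(a - b)

def all_pairs(A, B, func, workers=4):
    with Pool(workers) as pool:
        flat = pool.starmap(func, [(a, b) for a in A for b in B])
    n_b = len(B)
    return [flat[i * n_b:(i + 1) * n_b] for i in range(len(A))]

if __name__ == "__main__":
    print(all_pairs([1, 2, 3], [10, 20], F))
    # [[9, 19], [8, 18], [7, 17]]
```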
Numerous studies show that miss ratios at forward proxies are typically at least 40–50%. This paper proposes and evaluates a new approach for improving the throughput of Web proxy systems by reducing the overhead of handling cache misses. Namely, we propose to front-end a Web proxy with a high performance node that filters the requests, processing the misses and forwarding the hits and the new cacheable content to the proxy. Requests are filtered based on hints of the proxy cache content. This system, called Proxy Accelerator, achieves significantly better communications performance than a traditional proxy system. For instance, an accelerator can be built as an embedded system optimized for communication and HTTP processing, or as a kernel-mode HTTP server. Scalability with the Web proxy cluster size is achieved by using several accelerators. We use analytical models, trace-based simulations, and a real implementation to study the benefits and the implementation tradeoffs of this new approach. Our results show that a single proxy accelerator node in front of a 4-node Web proxy can improve the cost-performance ratio by about 40%. Hint-based request filter implementation choices that do not affect the overall hit ratio are available. An implementation of the hint management module integrated in Web proxy software is presented. Experimental evaluation of the implementation demonstrates that the associated overheads are very small.
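A minimal sketch of hint-based request filtering as described above; the class and method names are illustrative assumptions, not the paper's interfaces.

```python
# Proxy accelerator sketch: keep a compact hint of the back-end proxy's cache
# content, forward likely hits to the proxy, and handle misses directly.
class ProxyAccelerator:
    def __init__(self):
        self.hints = set()          # URLs believed to be cached by the back-end proxy

    def update_hint(self, url, cached):
        """Called when the proxy reports that it cached or evicted a URL."""
        (self.hints.add if cached else self.hints.discard)(url)

    def handle(self, url):
        if url in self.hints:
            return f"forward {url} to back-end proxy (likely hit)"
        # Miss path: fetch from the origin directly, hand cacheable content to the proxy.
        return f"fetch {url} from origin server; pass cacheable copy to proxy"

acc = ProxyAccelerator()
acc.update_hint("http://example.com/index.html", cached=True)
print(acc.handle("http://example.com/index.html"))
print(acc.handle("http://example.com/miss.html"))
```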
The collision avoidance and resolution multiple access (CARMA) protocol is presented and analyzed. CARMA uses a collision
avoidance handshake in which the sender and receiver exchange a request to send (RTS) and a clear to send (CTS) before the
sender transmits any data. CARMA is based on carrier sensing, together with collision resolution based on a deterministic
tree-splitting algorithm. For analytical purposes, an upper bound is derived for the average number of steps required to resolve
collisions of RTSs using the tree-splitting algorithm. This bound is then applied to the computation of the average channel
utilization in a fully connected network with a large number of stations. Under light-load conditions, CARMA achieves the
same average throughput as multiple access protocols based on RTS/CTS exchange and carrier sensing. It is also shown that,
as the arrival rate of RTSs increases, the throughput achieved by CARMA is close to the maximum throughput that any protocol
based on collision avoidance (i.e., RTS/CTS exchange) can achieve if the control packets used to acquire the floor are much
smaller than the data packet trains sent by the stations. Simulation results validate the simplifying approximations made
in the analytical model. Our analysis results indicate that collision resolution makes floor acquisition multiple access much
more effective.
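For illustration, a minimal sketch of deterministic tree-splitting collision resolution of the kind CARMA applies to colliding RTSs; the RTS/CTS handshake and carrier sensing are omitted, and station identifiers are assumed to be distinct fixed-width integers.

```python
# Tree-splitting collision resolution sketch: colliding stations are split by
# their identifier bits into two groups, and each group retries in turn until
# every contender has acquired the floor exactly once.
def resolve(contenders, bit=0, id_bits=8):
    """Return the order in which contenders (distinct id_bits-wide station IDs)
    win the channel."""
    if len(contenders) <= 1:
        return list(contenders)            # zero or one station: no collision
    left = [s for s in contenders if not (s >> (id_bits - 1 - bit)) & 1]
    right = [s for s in contenders if (s >> (id_bits - 1 - bit)) & 1]
    # A collision occurred; each subset resolves recursively on the next bit.
    return resolve(left, bit + 1, id_bits) + resolve(right, bit + 1, id_bits)

print(resolve({0b00010110, 0b10010110, 0b00110001}))
# -> [22, 49, 150]: stations gain the floor one at a time, deterministically.
```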
This paper presents a recovery protocol for block I/O operations in Slice, a storage system architecture for high-speed LANs incorporating network-attached block storage. The goal of the Slice architecture is to provide a network file service with scalable bandwidth and capacity while preserving compatibility with off-the-shelf clients and file server appliances. The Slice prototype virtualizes the Network File System (NFS) protocol by interposing a request switching filter at the client's interface to the network storage system. The distributed Slice architecture separates functions typically combined in central file servers, introducing new challenges for failure atomicity. This paper presents a protocol for atomic file operations and recovery in the Slice architecture, and related support for reliable file storage using mirrored striping. Experimental results from the Slice prototype show that the protocol has low cost in the common case, allowing the system to deliver client file access bandwidths approaching gigabit-per-second network speeds.
This paper describes a novel technique for establishing a virtual file system that allows data to be transferred user-transparently and on-demand across computing and storage servers of a computational grid. Its implementation is based on extensions to the Network File System (NFS) that are encapsulated in software proxies. A key differentiator between this approach and previous work is the way in which file servers are partitioned: while conventional file systems share a single (logical) server across multiple users, the virtual file system employs multiple proxy servers that are created, customized and terminated dynamically, for the duration of a computing session, on a per-user basis. Furthermore, the solution does not require modifications to standard NFS clients and servers. The described approach has been deployed in the context of the PUNCH network-computing infrastructure, and is unique in its ability to integrate unmodified, interactive applications (even commercial ones) and existing computing infrastructure into a network computing environment. Experimental results show that: (1) the virtual file system performs well in comparison to native NFS in a local-area setup, with mean overheads of 1 and 18%, for the single-client execution of the Andrew benchmark in two representative computing environments, (2) the average overhead for eight clients can be reduced to within 1% of native NFS with the use of concurrent proxies, (3) the wide-area performance is within 1% of the local-area performance for a typical compute-intensive PUNCH application (SimpleScalar), while for the I/O-intensive application Andrew the wide-area performance is 5.5 times worse than the local-area performance.
The extreme conditions under which multi-hop underwater acoustic sensor networks (UASNs) operate constrain the performance
of medium access control (MAC) protocols. The MAC protocol employed significantly impacts the operation of the network supported,
and such impacts must be carefully considered when developing protocols for networks constrained by both bandwidth and propagation
delay.
Time-based coordination schemes, such as TDMA, have limited applicability due to the dynamic nature of the water channel used to propagate
the sound signals, as well as the significant effect of relatively small changes in propagation distance on the propagation
time. These effects cause inaccurate time synchronization and therefore make time-based access protocols less viable. The
large propagation delays also diminish the effectiveness of carrier sense protocols as they do not predict with any certainty
the status of the intended recipients at the point when the traffic would arrive. Thus, CSMA protocols do not perform well
in UASNs, either.
Reservation-based protocols have seldom been successful in commercial products over the past 50 years due to many drawbacks,
such as limited scalability, relatively low robustness, etc. In particular, the impact of propagation delays in UASNs and
other such constrained networks obfuscate the operation of the reservation protocols and diminish, if not completely negate,
the benefit of reservations. The efficacy of the well-known RTS-CTS scheme, as a reservation-based enhancement to the CSMA
protocol, is also adversely impacted by long propagation delays.
An alternative to these MAC protocols is the much less complex ALOHA protocol, or one of its variants. However, the performance
of such protocols within the context of multi-hop networks is not well studied. In this paper we identify the challenges of
modeling contention-based MAC protocols and present models for analyzing ALOHA and p-persistent ALOHA variants for a simple string topology. As expected, an application of the model suggests that ALOHA variants
are very sensitive to traffic loads. Indeed, when the traffic load is small, utilization becomes insensitive to the value of p. A key finding, though, is the significance of the network size on the protocols’ performance, in terms of successful
delivery of traffic from outlying nodes, indicating that such protocols are only appropriate for very small networks, as measured
by hop count.
Keywords: Underwater acoustic sensor networks, MAC, ALOHA, p-persistent ALOHA, Multi-hop
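A minimal sketch of the p-persistent ALOHA rule analyzed above, in a toy single-channel slotted simulation; the multi-hop string topology and propagation delays of the paper's model are not represented.

```python
# p-persistent ALOHA sketch: in each slot, a node holding a packet transmits
# with probability p and otherwise defers; a slot succeeds only if exactly one
# node transmits.
import random

def p_persistent_aloha(num_nodes=5, p=0.2, slots=10_000, seed=7):
    random.seed(seed)
    successes = 0
    for _ in range(slots):
        transmitters = [n for n in range(num_nodes) if random.random() < p]
        if len(transmitters) == 1:      # exactly one sender -> success
            successes += 1              # two or more -> collision, zero -> idle
    return successes / slots            # per-slot utilization

for p in (0.05, 0.2, 0.5):
    print(f"p={p}: utilization ~ {p_persistent_aloha(p=p):.3f}")
```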
Due to the increasing diversity of parallel architectures and the increasing development time for parallel applications,
performance portability has become one of the major considerations when designing the next generation of parallel program
execution models, APIs, and runtime system software. This paper analyzes both code portability and performance portability
of parallel programs for fine-grained multi-threaded execution and architecture models. We concentrate on one particular event-driven
fine-grained multi-threaded execution model—EARTH, and discuss several design considerations of the EARTH model and runtime
system that contribute to the performance portability of parallel applications. We believe that these are important issues
for the design of future high-end computing system software. Experiments with four representative benchmarks were conducted on several different parallel architectures, including two clusters listed in the 23rd TOP500 supercomputer list. The results demonstrate that EARTH-based programs can achieve robust performance portability across the selected hardware platforms without any code modification or
tuning.
This paper discusses how ad-hoc collaboration boosts the operation of a set of messengers. This discussion continues the research
we earlier initiated in the MESSENGER project, which develops data management mechanisms for UDDI registries of Web services using mobile users and software agents.
In the current operation mode of messengers, descriptions of Web services are first collected from UDDI registries and later submitted to other UDDI registries. This submission mode of Web service descriptions does not take advantage of the tremendous opportunities
that both wireless technologies and mobile devices offer. When mobile devices are “close” to each other, they can form a mobile
ad-hoc network that permits the exchange of data between these devices without any pre-existing communication infrastructure.
By authorizing messengers to engage in ad-hoc collaboration, additional descriptions of Web services can also be collected from other messengers. This has several advantages, but at the same time poses several challenges, which in fact highlight
the complexity of ad-hoc networks.
An ad hoc network is a multihop wireless network in which mobile hosts communicate without the support of a wired backbone
for routing messages. We introduce a self-organizing network structure called a spine and propose a spine-based routing infrastructure
for routing in ad hoc networks. We propose two spine routing algorithms: (a) Optimal Spine Routing (OSR), which uses full
and up-to-date knowledge of the network topology, and (b) Partial-knowledge Spine Routing (PSR), which uses partial knowledge
of the network topology. We analyze the two algorithms and identify the optimality-overhead trade-offs involved in these algorithms.
Nodes in ad hoc networks generally transmit data at regular intervals over long periods of time. Recently, ad hoc network nodes have been built that run on little power and have very limited memory. Authentication is a significant challenge in ad hoc networks, even without considering size and power constraints. Expounding on idealized hashing, this paper examines lower bounds for ad hoc broadcast authentication for μTESLA-like protocols. In particular, this paper explores idealized hashing for generating preimages of hash chains. Building on Bellare and Rogaway’s classical definition, a similar definition for families of hash chains is given. Using these idealized families of hash chain functions, this paper gives a time-space product Ω(k² log⁴ n) bit-operation lower bound for optimal preimage hash chain generation for constant k. This bound holds where n is the total length of the hash chain and the hash function family is k-wise independent. These last results follow as corollaries to a lower bound of Coppersmith and Jakobsson.
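For illustration, a minimal sketch of the hash-chain construction that underlies μTESLA-style broadcast authentication; the naive full-storage strategy shown here is the extreme that the time-space lower bounds above constrain.

```python
# Hash-chain sketch: the chain is built by repeated hashing of a secret seed,
# keys are disclosed in reverse order, and a receiver verifies a disclosed key
# by hashing it forward to a known commitment.
import hashlib, os

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def build_chain(seed: bytes, n: int):
    """K_n = seed, K_i = h(K_{i+1}); K_0 is the public commitment."""
    chain = [seed]
    for _ in range(n):
        chain.append(h(chain[-1]))
    chain.reverse()                 # chain[0] = K_0 (commitment), chain[n] = seed
    return chain                    # naive: stores every preimage

def verify(disclosed_key: bytes, commitment: bytes, max_steps: int) -> bool:
    k = disclosed_key
    for _ in range(max_steps):
        if k == commitment:
            return True
        k = h(k)
    return k == commitment

chain = build_chain(os.urandom(32), n=16)
print(verify(chain[5], chain[0], max_steps=16))   # True: K_5 hashes forward to K_0
```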
Distributed systems based on clusters of workstations are becoming more and more difficult to manage due to the increasing number of processors involved and the complexity of the associated applications. Such systems need efficient and flexible monitoring mechanisms to fulfill administration service requirements. In this paper, we present PHOENIX, a distributed platform supporting both application and operating system monitoring with a variable granularity. The granularity is defined using logical expressions that specify complex monitoring conditions. These conditions can be dynamically modified during application execution. The observation techniques are based on automatic probe insertion combined with a system agent to minimize the PHOENIX execution-time overhead. The platform's extensibility offers a suitable environment for designing distributed value-added services (performance monitoring, load balancing, accounting, cluster management, etc.).
The growing demand for Web and multimedia content accessed through heterogeneous devices requires the providers to tailor
resources to the device capabilities on the fly. Providing services for content adaptation and delivery poses two novel challenges for present and future content provider architectures: content adaptation services are computationally expensive; the global
storage requirements increase because multiple versions of the same resource may be generated for different client devices.
We propose a novel two-level distributed architecture for the support of efficient content adaptation and delivery services.
The nodes of the architecture are organized in two levels: thin edge nodes on the first level act as simple request gateways
towards the nodes of the second level; fat interior clusters perform all the other tasks, such as content adaptation, caching
and fetching. Several experimental results show that the two-level architecture achieves better performance and scalability than existing flat or non-cooperative architectures.
Keywords: Content adaptation, Multimedia resources, Distributed architectures, Performance evaluation
In heterogeneous environments, dynamic scheduling algorithms are a powerful tool for improving the performance of scientific applications via load balancing. However, these scheduling techniques employ heuristics that require prior knowledge about the workload obtained via profiling, resulting in higher overhead as problem sizes and the number of processors increase. In addition, load imbalance may appear only at run-time, making profiling work tedious and sometimes even obsolete. Recently, the integration of dynamic loop scheduling algorithms into a number of scientific applications has proven effective. This paper reports on performance improvements obtained by integrating the Adaptive Weighted Factoring, a recently proposed dynamic loop scheduling technique that addresses these concerns, into two scientific applications: computational field simulation on unstructured grids, and N-Body simulations. The reported experimental results confirm the benefits of using this methodology and emphasize its high potential for future integration into other scientific applications that exhibit substantial performance degradation due to load imbalance.
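For illustration, a minimal sketch of factoring-style dynamic loop scheduling, the family to which Adaptive Weighted Factoring belongs; the chunk-size rule and weights below are illustrative simplifications, not the AWF formula.

```python
# Factoring-style scheduling sketch: remaining iterations are handed out in
# batches of shrinking chunks, with chunk sizes scaled by per-processor weights
# (here: half the remaining work per batch, split in proportion to weights).
def factoring_chunks(total_iters, weights, min_chunk=1):
    """Yield (processor_id, chunk_size) assignments until the loop is exhausted."""
    remaining = total_iters
    wsum = sum(weights)
    while remaining > 0:
        batch = min(remaining, max(remaining // 2, min_chunk * len(weights)))
        for p, w in enumerate(weights):
            chunk = min(remaining, max(min_chunk, int(batch * w / wsum)))
            if chunk == 0 or remaining == 0:
                continue
            yield p, chunk
            remaining -= chunk

# Example: 1000 iterations over 3 processors with relative speeds 1 : 2 : 4.
for proc, size in factoring_chunks(1000, weights=[1, 2, 4]):
    pass  # dispatch `size` iterations to processor `proc`
print("all iterations dispatched")
```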
In this paper, we discuss how to realize fault-tolerant applications on distributed objects. Servers supporting objects can be made fault-tolerant by taking advantage of replication and checkpointing technologies. However, there has been little discussion of how application programs running on clients can tolerate client faults. For example, servers might block in the two-phase commitment protocol due to a client fault. We discuss how to make application programs fault-tolerant by taking advantage of mobile agent technologies, where a program can move from one computer to another in the network. An application program to be performed on a faulty computer can instead be performed on another operational computer by moving the program in the mobile agent model. In this paper, we discuss a transactional agent model in which a reliable and efficient application for manipulating objects on multiple computers is realized in the mobile agent model. In the transactional agent model, only a small part of the application program, named the routing subagent, moves around the computers. A routing subagent autonomously decides which computer to visit next. We discuss a hierarchical navigation map that specifies which computer should be visited prior to which other computer in a transactional agent. A routing subagent decides which computer to visit based on the hierarchical navigation map. Programs that manipulate objects on a computer are loaded onto that computer on arrival of the routing subagent in order to reduce the communication overhead. This part of the transactional agent is a manipulation subagent. The manipulation subagent remains on the computer even after the routing subagent leaves, in order to hold objects until commitment. We assume that any computer may stop due to a fault, while networks are reliable. There are three kinds of faulty computers for a transactional agent: current, destination, and sibling computers, on which a transactional agent currently exists, to which it will move, and which it has visited, respectively. These types of faults are detected by neighbouring manipulation subagents communicating with each other. If some of the manipulation subagents are faulty, the routing subagent has to be aborted; however, the routing subagent may still be moving. We discuss how to efficiently deliver the abort message to the moving routing subagent. We evaluate the transactional agent model in terms of how long it takes to abort the routing subagent if some computer is faulty.
In recent years, there has been a dramatic increase in the number of available computing and storage resources. Yet few tools exist that allow these resources to be exploited effectively in an aggregated form. We present the Condor-G system, which leverages software from Globus and Condor to enable users to harness multi-domain resources as if they all belong to one personal domain. We describe the structure of Condor-G and how it handles job management, resource selection, security, and fault tolerance. We also present results from application experiments with the Condor-G system. We assert that Condor-G can serve as a general-purpose interface to Grid resources, for use by both end users and higher-level program development tools.