ABSTRACT: High-performance analytical data processing systems often run on servers with large amounts of memory. A common data structure used in such environments is the hash table. This paper focuses on investigating efficient parallel hash algorithms for processing large-scale data. Currently, hash tables on distributed architectures are accessed one key at a time by local or remote threads, while shared-memory approaches focus on accessing a single table with multiple threads. A relatively straightforward "bulk-operation" approach seems to have been neglected by researchers. In this work, using such a method, we propose a high-level parallel hashing framework, Structured Parallel Hashing, targeting efficient processing of massive data on distributed memory. We present a theoretical analysis of the proposed method and describe the design of our hashing implementations. The evaluation reveals a very interesting result: the proposed straightforward method can vastly outperform distributed hashing methods and can even offer performance comparable with approaches based on shared-memory supercomputers which use specialized hardware predicates. Moreover, we characterize the performance of our hash implementations through extensive experiments, thereby allowing system developers to make a more informed choice for their high-performance applications.
21st IEEE International Conference on High Performance Computing; 12/2014
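The "bulk-operation" idea described above can be illustrated with a minimal single-process sketch (names and structure are ours, not the paper's): keys are first grouped by destination partition, and each partition then applies its whole batch at once, instead of being accessed one key at a time.

```python
from collections import defaultdict

def partition_keys(pairs, n_parts):
    """Group (key, value) pairs by hash partition so each partition
    receives one batch instead of a message per key."""
    batches = defaultdict(list)
    for key, value in pairs:
        batches[hash(key) % n_parts].append((key, value))
    return batches

def bulk_insert(tables, batches):
    """Apply each partition's batch to its local hash table in bulk."""
    for part, batch in batches.items():
        tables[part].update(batch)

# Usage: 4 partitions, one local dict standing in for each node's table
tables = [dict() for _ in range(4)]
bulk_insert(tables, partition_keys([("a", 1), ("b", 2), ("c", 3)], 4))
```

On a cluster, each batch would be a single network transfer to the owning node, which is where the approach saves over per-key remote accesses.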
ABSTRACT: We introduce the design of a fully parallel framework for quickly analyzing large-scale RDF data over distributed architectures. We present three core operations of this framework: dictionary encoding, parallel joins and index processing. Preliminary experimental results on a commodity cluster show that we can load large RDF data very fast while remaining within an interactive range for query processing.
13th International Semantic Web Conference; 10/2014
ABSTRACT: We propose an efficient method for fast processing of large RDF data over distributed memory. Our approach adopts a two-tier index architecture on each computation node: (1) a lightweight primary index, to keep loading times low, and (2) a dynamic, multi-level secondary index, calculated as a by-product of query execution, to decrease or remove inter-machine data movement for subsequent queries that contain the same graph patterns. Experimental results on a commodity cluster show that we can load large RDF data very quickly in memory while remaining within an interactive range for query processing with the secondary index.
25th ACM conference on Hypertext and Social Media; 09/2014
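A minimal sketch of the two-tier idea, under our own simplifying assumptions (a subject-keyed primary index and a secondary cache keyed by (subject, predicate) patterns; the paper's actual index structures are more elaborate and distributed):

```python
class TwoTierIndex:
    """Illustrative sketch: a cheap primary index built at load time,
    plus a secondary index filled lazily as a by-product of queries."""

    def __init__(self, triples):
        # Tier 1: index triples by subject only, keeping load times low.
        self.by_subject = {}
        for s, p, o in triples:
            self.by_subject.setdefault(s, []).append((s, p, o))
        # Tier 2: materialized results keyed by an (s, p) pattern.
        self.pattern_cache = {}

    def match(self, s, p):
        key = (s, p)
        if key not in self.pattern_cache:  # first query pays the scan
            self.pattern_cache[key] = [
                t for t in self.by_subject.get(s, []) if t[1] == p
            ]
        return self.pattern_cache[key]     # repeated patterns hit the cache

idx = TwoTierIndex([("x", "knows", "y"), ("x", "likes", "z")])
```

In the distributed setting the cached results are what avoid repeated inter-machine data movement for queries that reuse the same graph patterns.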
ABSTRACT: The past two decades have witnessed an explosion in the deployment of large-scale distributed simulations and distributed virtual environments in different domains, including military and academic simulation systems, social media, and commercial applications such as massively multiplayer online games. As these systems become larger, more data intensive, and more latency sensitive, the optimisation of the flow of data, a paradigm referred to as interest management, has become increasingly critical to address the scalability requirements and enable their successful deployment. Numerous interest management schemes have been proposed for different application scenarios. This article provides a comprehensive survey of the state of the art in the design of interest management algorithms and systems. The scope of the survey includes current and historical projects providing a taxonomy of the existing schemes and summarising their key features. Identifying the primary requirements of interest management, the article discusses the trade-offs involved in the design of existing approaches.
ABSTRACT: The Semantic Web comprises enormous volumes of semi-structured data elements. For interoperability, these elements are represented by long strings. Such representations are not efficient for the purposes of Semantic Web applications that perform computations over large volumes of information. A typical method for alleviating the impact of this problem is the use of compression methods that produce more compact representations of the data. The use of dictionary encoding for this purpose is particularly prevalent in Semantic Web database systems. However, centralized implementations present performance bottlenecks, giving rise to the need for scalable, efficient distributed encoding schemes. In this paper, we describe an encoding implementation based on the asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate performance on a cluster of up to 384 cores and datasets of up to 11 billion triples (1.9 TB). Compared to the state-of-the-art MapReduce algorithm, we demonstrate a speedup of 2.6-7.4x and excellent scalability. These results illustrate the strong potential of the APGAS model for efficient implementation of dictionary encoding and contribute to the engineering of larger-scale Semantic Web applications.
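The core of dictionary encoding is replacing long terms with compact integer ids. Below is a hedged single-process sketch of one key property a distributed scheme needs (names are illustrative, not the paper's): each term has a single owning worker, and ids are interleaved by worker, so they are globally unique without any cross-worker coordination.

```python
def encode_partitioned(terms, n_workers):
    """Sketch of partitioned dictionary encoding: each term is routed
    to one worker by hash, and that worker assigns it an id. Ids are
    interleaved by worker id, so they never collide across workers.
    (Python's hash() is salted per process, so routing is stable only
    within one run; a real system would use a fixed hash.)"""
    dictionaries = [dict() for _ in range(n_workers)]
    ids = []
    for term in terms:
        w = hash(term) % n_workers            # owning worker
        local = dictionaries[w]
        if term not in local:
            # local counter * n_workers + w -> globally unique id
            local[term] = len(local) * n_workers + w
        ids.append(local[term])
    return ids, dictionaries

# Usage: repeated terms map to the same compact id
ids, dicts = encode_partitioned(
    ["http://ex.org/s", "http://ex.org/p", "http://ex.org/s"], n_workers=4)
```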
ABSTRACT: Interest management is a filtering technique designed to reduce bandwidth consumption in Distributed Virtual Environments. This technique usually involves a process called “interest matching”, which determines what data should be filtered. Existing interest matching algorithms, however, are mainly designed for serial processing on a single processor. As the problem size grows, these algorithms may not scale, since the single processor may eventually become a bottleneck. In this paper, a parallel approach for interest matching is presented which is suitable for deployment on both shared-memory and distributed-memory multiprocessors. We also provide an analysis of speed-up and efficiency for the simulation results of the parallel algorithms.
ABSTRACT: The performance of parallel distributed data management systems becomes increasingly important with the rise of Big Data. Parallel joins have been widely studied in both the parallel processing and the database communities. Nevertheless, most of the algorithms developed so far do not consider data skew, which naturally exists in various applications. State-of-the-art methods designed to handle this problem are based on extensions to either of the two prevalent conventional approaches to parallel joins - the hash-based and duplication-based frameworks. In this paper, we introduce a novel parallel join framework, query-based distributed join (QbDJ), for handling data skew on distributed architectures. Further, we present an efficient implementation of the method based on the asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate the performance of our approach on a cluster of 192 cores (16 nodes) and datasets of 1 billion tuples with different skews. The results show that the method is scalable, and also runs faster with less network communication compared to the state-of-the-art PRPD approach under high data skew.
15th IEEE International Conference on High Performance Computing and Communications; 11/2013
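A common building block in skew-aware parallel joins (a generic illustration of the problem PRPD-style methods address, not the paper's QbDJ algorithm) is separating heavy-hitter keys, which would overload a single hash partition, from the uniformly distributable remainder:

```python
from collections import Counter

def split_by_skew(tuples, threshold):
    """Split a relation into tuples with skewed (heavy-hitter) keys,
    which need special treatment, and the rest, which plain hash
    partitioning balances well. `tuples` are (key, payload) pairs."""
    counts = Counter(k for k, _ in tuples)
    hot = {k for k, c in counts.items() if c >= threshold}
    skewed = [t for t in tuples if t[0] in hot]
    regular = [t for t in tuples if t[0] not in hot]
    return hot, skewed, regular
```

The `regular` portion can then be hash-partitioned as usual, while the `skewed` portion is handled by whatever strategy the framework chooses (e.g. keeping it local and replicating the matching small-side tuples).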
ABSTRACT: A large-scale High Level Architecture (HLA)-based simulation can be constructed using a network of simulation federations to form a “federation community”. This is often done to enhance scalability, interoperability and composability, and to enable information security. Synchronization mechanisms are essential to coordinate the execution of federates and event transmissions across the boundaries of interlinked federations. We have developed a generic synchronization mechanism for federation community networks whose correctness has been mathematically proved. The synchronization mechanism suits various types of federation community networks and supports the reusability of legacy federates. It is platform-neutral and independent of federate modeling approaches. The synchronization mechanism has been evaluated in the context of the Grid-enabled federation community approach, which allows simulation users to benefit from both Grid computing technologies and the federation community approach. A series of experiments has been carried out to validate and benchmark the synchronization mechanism. The experimental results indicate that the proposed mechanism provides correct time management services to federation communities. The results also show that the mechanism exhibits encouraging performance in terms of synchronization efficiency and scalability.
Journal of Parallel and Distributed Computing 04/2013; 70(2):144-159.
ABSTRACT: Vince Gaffney, Phil Murgatroyd, Bart Craenen, and Georgios Theodoropoulos, ‘Only individuals’: moving the Byzantine army to Manzikert, pp. 25-43.
Traditionally, history has frequently emphasized the role of the ‘Great Man or Woman’, who may achieve greatness, or notoriety, through the consequences of their decisions. More problematic is the historical treatment of the mass of the population. Agent-based modelling is a computer simulation technique that can not only help identify key interactions that contribute to large-scale patterns but also add detail to our understanding of the effects of all contributors to a system, not just those at the top. The Medieval Warfare on the Grid project has been using agent-based models to examine the march of the Byzantine army across Anatolia to Manzikert in AD 1071. This article describes the movement model used to simulate the army and the historical sources on which it was based. It also explains why novel route pla
THE DIGITAL CLASSICIST 2013, Edited by STUART DUNN, SIMON MAHONY, 01/2013: chapter ‘ONLY INDIVIDUALS’: MOVING THE BYZANTINE ARMY TO MANZIKERT: pages 25-44; INSTITUTE OF CLASSICAL STUDIES, SCHOOL OF ADVANCED STUDY, UNIVERSITY OF LONDON; ISBN 978‐1‐905670‐49‐9
ABSTRACT: The increasing scale and complexity of virtualized data centers pose significant challenges to system management software stacks, which still rely on special-purpose controllers to optimize the operation of cloud infrastructures. Autonomic computing allows complex systems to assume much of their own management, achieving self-configuration, self-optimization, self-healing, and self-protection without external intervention. This paper proposes an agent-based architecture for autonomic cloud management, where resources and virtual machines are associated with worker agents that monitor changes in their local environments, interact with each other, make their own decisions, and take adaptive actions supervised by a network of management processes. To fulfill global objectives, the management processes conduct what-if simulations and update the worker agents' local rules when necessary. Such a guided decentralized decision making method can mitigate the pressure on the system management stack, improve the effectiveness of resource management, and accelerate the response to failures and attacks.
ABSTRACT: As the Semantic Web becomes mainstream, the performance of triple stores becomes increasingly important. Up until now, there have been various benchmarks and experiments that have attempted to evaluate the response time and query throughput of individual stores to show the weaknesses and strengths of triple store implementations. However, these evaluations have primarily focused on the application level and have not sufficiently investigated system-level aspects to discover performance inhibitors and bottlenecks. In this paper, we propose metrics based on a systematic study of the impact of triple store implementation on the underlying platform. We choose several popular triple stores as use cases, and perform our experiments on a standard (128GB RAM, 12 cores) and an enterprise platform (768GB RAM, 40 cores). Through detailed time cost and system consumption measures of queries derived from the Berlin SPARQL Benchmark (BSBM), we describe the dynamics and behaviors of query execution across these systems. The collected data provides insight into different triple store implementations as well as an understanding of performance differences between the two platforms. The results obtained help in the identification of performance bottlenecks in existing triple store implementations, which may be useful in future design efforts for Linked Data processing.
15th IEEE International Conference on Computational Science and Engineering; 11/2012
ABSTRACT: Digital Humanities offer a new exciting domain for agent-based distributed simulation. In historical studies interpretation rarely rises above the level of unproven assertion and is rarely tested against a range of evidence. Agent-based simulation can provide an opportunity to break these cycles of academic claim and counter-claim. The MWGrid framework utilises distributed agent-based simulation to study medieval military logistics. As a use-case, it has focused on the logistical analysis of the Byzantine army's march to the battle of Manzikert (AD 1071), a key event in medieval history. It integrates an agent design template, a transparent, layered mechanism to translate model-level agents' actions to timestamped events and the PDES-MAS distributed simulation kernel. The paper presents an overview of the MWGrid system and a quantitative evaluation of its performance.
16th IEEE International Symposium on Distributed Simulation and Real Time Applications (DSRT 2012); 10/2012
ABSTRACT: Traffic simulation can be very computationally intensive, especially for microscopic simulations of large urban areas (tens of thousands of road segments, hundreds of thousands of agents) and when real-time or better than real-time simulation is required. Consider, for instance, running a couple of what-if scenarios for road management authorities/police during a road incident: time is a hard constraint and the size of the simulation is relatively high. Hence the need for distributed simulations and for optimal space partitioning algorithms, ensuring an even distribution of the load and minimal communication between computing nodes. In this paper we describe a distributed version of SUMO, a simulator of urban mobility, and SParTSim, a space partitioning algorithm guided by the road network for distributed simulations. It outperforms classical uniform space partitioning in terms of road segment cuts and load-balancing.
The 16th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications; 01/2012
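The contrast with uniform partitioning can be sketched as follows (an illustrative toy, not SParTSim itself): partitions grow outward along road connectivity from seed segments, so their boundaries follow the network rather than a uniform grid, which tends to cut fewer road segments.

```python
from collections import deque

def grow_partitions(road_graph, n_parts):
    """Sketch of network-guided space partitioning: each partition
    grows outward (BFS-style) from a seed segment, claiming one
    unowned neighbour per round-robin turn to keep sizes balanced.
    road_graph maps each road segment to its adjacent segments."""
    nodes = list(road_graph)
    owner = {}
    frontiers = []
    for i, seed in enumerate(nodes[:n_parts]):
        owner[seed] = i
        frontiers.append(deque([seed]))
    grew = True
    while grew:
        grew = False
        for i, frontier in enumerate(frontiers):
            while frontier:
                seg = frontier.popleft()
                unclaimed = [n for n in road_graph[seg] if n not in owner]
                if unclaimed:
                    owner[unclaimed[0]] = i
                    # revisit seg later: it may have more unowned neighbours
                    frontier.extend([seg, unclaimed[0]])
                    grew = True
                    break
    # disconnected leftovers fall back to round-robin assignment
    for j, seg in enumerate(n for n in nodes if n not in owner):
        owner[seg] = j % n_parts
    return owner
```

A real partitioner would also weight segments by expected vehicle load; this sketch only shows the connectivity-guided growth.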
ABSTRACT: Historical studies are frequently perceived as clear narratives defined by a series of fixed events or actions. In reality, even where critical historic events may be identified, historic documentation frequently lacks corroborative detail that supports verifiable interpretation. Consequently, for many periods and areas of research, interpretation may rarely rise above the level of unproven assertion and is rarely tested against a range of evidence. Simulation provides an opportunity to break cycles of academic claim and counter-claim. This paper discusses the development and utilisation of large-scale distributed agent-based simulations designed to investigate medieval military logistics in order to generate new evidence to supplement existing historical analysis. The work aims at modelling logistical arrangements relating to the battle of Manzikert (AD 1071), a key event in Byzantine history. The paper discusses the distributed simulation infrastructure and provides an overview of the agent models developed for this exercise.
International Journal of Humanities and Arts Computing 01/2012;
ABSTRACT: As the scale of Distributed Virtual Environments (DVEs) grows in terms of participants and virtual entities, using interest management schemes to reduce bandwidth consumption becomes increasingly common in DVE development. The interest matching process, essential to most interest management schemes, determines what data should be sent to the participants as well as what data should be filtered. However, if the computational overhead of interest matching is too high, it is unsuitable for real-time DVEs, for which runtime performance is important. This paper presents a new approach to interest matching which divides the workload of the matching process among a cluster of computers. Experimental evidence shows that our approach is an effective solution for real-time applications.
Distributed Simulation and Real Time Applications (DS-RT), 2011 IEEE/ACM 15th International Symposium on; 10/2011
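The workload division described above can be sketched in a few lines (illustrative only; threads stand in for cluster nodes, so this shows the decomposition rather than real distributed speedup). Each worker matches one slice of the subscription regions against all update regions, using an axis-aligned overlap test:

```python
from concurrent.futures import ThreadPoolExecutor

def overlaps(a, b):
    """Axis-aligned rectangle overlap: regions are (xmin, ymin, xmax, ymax)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def parallel_interest_match(subscriptions, updates, n_workers=4):
    """Divide the matching workload: each worker tests one slice of
    subscription regions against all update regions."""
    def match_chunk(chunk):
        return [(s, u) for s in chunk for u in updates if overlaps(s, u)]
    chunks = [subscriptions[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(n_workers) as pool:
        results = pool.map(match_chunk, chunks)
    return [pair for part in results for pair in part]
```

On a cluster, each chunk would live on a different machine; the slicing of subscriptions is what removes the single-processor bottleneck the abstract refers to.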