ABSTRACT: The Semantic Web comprises enormous volumes of semi-structured data elements. For interoperability, these elements are represented by long strings. Such representations are not efficient for the purposes of applications that perform computations over large volumes of such information. A common approach to alleviating this problem is the use of compression methods that produce more compact representations of the data. The use of dictionary encoding is particularly prevalent in Semantic Web database systems for this purpose. However, centralized implementations present performance bottlenecks, giving rise to the need for scalable, efficient distributed encoding schemes. In this paper, we propose an efficient algorithm for fast encoding of large Semantic Web data. Specifically, we present the detailed implementation of our approach based on the state-of-the-art asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate performance on a cluster of up to 384 cores and datasets of up to 11 billion triples (1.9 TB). Compared to the state-of-the-art approach, we demonstrate a speed-up of 2.6-7.4x and excellent scalability. These results also illustrate the significant potential of the APGAS model for efficient implementation of dictionary encoding and contribute to the engineering of more efficient, larger scale Semantic Web applications.
IEEE Transactions on Parallel and Distributed Systems 10/2015; DOI:10.1109/TPDS.2015.2496579 · 2.17 Impact Factor
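As a rough illustration of the dictionary encoding idea described in the abstract above, the following minimal sketch maps each long term string to a compact integer ID. It is a centralized, single-process toy, not the paper's distributed APGAS algorithm; the function names `dictionary_encode` and `decode` and the sample triples are hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]


def dictionary_encode(triples: List[Triple]) -> Tuple[List[EncodedTriple], Dict[str, int]]:
    """Replace every term with an integer ID, assigned in order of first appearance."""
    term_to_id: Dict[str, int] = {}
    encoded: List[EncodedTriple] = []
    for s, p, o in triples:
        ids = []
        for term in (s, p, o):
            if term not in term_to_id:
                # Next unused ID is simply the current dictionary size.
                term_to_id[term] = len(term_to_id)
            ids.append(term_to_id[term])
        encoded.append((ids[0], ids[1], ids[2]))
    return encoded, term_to_id


def decode(encoded: List[EncodedTriple], term_to_id: Dict[str, int]) -> List[Triple]:
    """Invert the dictionary to recover the original strings."""
    id_to_term = {i: t for t, i in term_to_id.items()}
    return [(id_to_term[s], id_to_term[p], id_to_term[o]) for s, p, o in encoded]


sample = [
    ("http://example.org/alice", "http://example.org/knows", "http://example.org/bob"),
    ("http://example.org/alice", "http://example.org/knows", "http://example.org/carol"),
]
encoded_sample, dictionary = dictionary_encode(sample)
```

A single shared dictionary like this is the kind of centralized structure the abstract identifies as a bottleneck, which is the motivation for distributing the encoding.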
ABSTRACT: High-performance analytical data processing systems often run on servers with large amounts of memory. A common data structure used in such environments is the hash table. This paper focuses on investigating efficient parallel hash algorithms for processing large-scale data. Currently, hash tables on distributed architectures are accessed one key at a time by local or remote threads, while shared-memory approaches focus on accessing a single table with multiple threads. A relatively straightforward "bulk-operation" approach seems to have been neglected by researchers. In this work, using such a method, we propose a high-level parallel hashing framework, Structured Parallel Hashing, targeting efficient processing of massive data on distributed memory. We present a theoretical analysis of the proposed method and describe the design of our hashing implementations. The evaluation reveals a very interesting result: the proposed straightforward method can vastly outperform distributed hashing methods and can even offer performance comparable with approaches based on shared memory supercomputers which use specialized hardware predicates. Moreover, we characterize the performance of our hash implementations through extensive experiments, thereby allowing system developers to make a more informed choice for their high-performance applications.
21st IEEE International Conference on High Performance Computing; 12/2014
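The "bulk-operation" idea mentioned in the abstract above can be sketched as follows: instead of sending keys to their owning partition one at a time, keys are first grouped by destination and each partition then processes its whole batch. This is only a single-process illustration under assumed conditions (integer keys, modulo placement, plain Python dicts standing in for per-node tables); the names `bulk_insert` and `bulk_lookup` are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Optional, Tuple


def bulk_insert(tables: List[Dict[int, str]], items: List[Tuple[int, str]]) -> List[int]:
    """Group items by owning partition, then insert each batch in one step."""
    n = len(tables)
    batches: List[List[Tuple[int, str]]] = [[] for _ in range(n)]
    for key, value in items:
        # Assumption: integer keys are placed by key modulo partition count.
        batches[key % n].append((key, value))
    for table, batch in zip(tables, batches):
        # One bulk update per partition instead of one access per key.
        table.update(batch)
    return [len(batch) for batch in batches]


def bulk_lookup(tables: List[Dict[int, str]], keys: List[int]) -> List[Optional[str]]:
    """Look up keys partition by partition, returning results in input order."""
    n = len(tables)
    positions: List[List[int]] = [[] for _ in range(n)]
    for index, key in enumerate(keys):
        positions[key % n].append(index)
    results: List[Optional[str]] = [None] * len(keys)
    for table, indexes in zip(tables, positions):
        for index in indexes:
            results[index] = table.get(keys[index])
    return results


partitions: List[Dict[int, str]] = [{}, {}, {}]
batch_sizes = bulk_insert(partitions, [(0, "a"), (1, "b"), (3, "c"), (5, "d")])
found = bulk_lookup(partitions, [5, 0, 7])
```

In a distributed setting, each batch would correspond to one message per destination node rather than one message per key.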
ABSTRACT: The performance of joins in parallel database management systems is critical for data intensive operations such as querying. Since data skew is common in many applications, poorly engineered join operations result in load imbalance and performance bottlenecks. State-of-the-art methods designed to handle this problem offer significant improvements over naive implementations. However, performance could be further improved by removing the dependency on global skew knowledge and broadcasting. In this paper, we propose PRPQ (partial redistribution & partial query), an efficient and robust join algorithm for processing large-scale joins over distributed systems. We present the detailed implementation and a quantitative evaluation of our method. The experimental results demonstrate that the proposed PRPQ algorithm is indeed robust and scalable under a wide range of skew conditions. Specifically, compared to the state-of-the-art PRPD method, we achieve a 16%-167% performance improvement and 24%-54% less network communication under different join workloads.
23rd ACM International Conference on Information and Knowledge Management; 11/2014
ABSTRACT: We introduce the design of a fully parallel framework for quickly analyzing large-scale RDF data over distributed architectures. We present three core operations of this framework: dictionary encoding, parallel joins and indexing processing. Preliminary experimental results on a commodity cluster show that we can load large RDF data very fast while remaining within an interactive range for query processing.
13th International Semantic Web Conference; 10/2014
ABSTRACT: We propose an efficient method for fast processing of large RDF data over distributed memory. Our approach adopts a two-tier index architecture on each computation node: (1) a light-weight primary index, to keep loading times low, and (2) a dynamic, multi-level secondary index, calculated as a by-product of query execution, to decrease or remove inter-machine data movement for subsequent queries that contain the same graph patterns. Experimental results on a commodity cluster show that we can load large RDF data very quickly in memory while remaining within an interactive range for query processing with the secondary index.
25th ACM conference on Hypertext and Social Media; 09/2014
ABSTRACT: Infrastructure as a Service (IaaS) is a pay-as-you-go cloud provision model that outsources physical servers, guest virtual machine (VM) instances, storage resources, and networking connections on demand. This article reports the design and development of our proposed innovative symbiotic simulation based system to support the automated management of an IaaS-based distributed virtualized data center. To make the ideas work in practice, we have implemented an OpenStack-based open source cloud computing platform. A smart benchmarking application, "Cloud Rapid Experimentation and Analysis Tool (aka CBTool)", is utilized to mark the resource allocation potential of our test cloud system. The real-time benchmarking metrics of the cloud are fed to a distributed multi-agent based intelligence middleware layer. To optimally control the dynamic operation of the prototype data center, we predefine some custom policies for VM provisioning and application performance profiling within a versatile cloud modeling and simulation toolkit, "CloudSim". Both tools used in our prototype implementation can scale up to thousands of VMs; therefore, our devised mechanism is highly scalable and can flexibly be applied at large scale. The autonomic characteristics of agents aid in streamlining symbiosis between the simulation system and the IaaS cloud in a closed feedback control loop. The practical worth and applicability of the multi-agent based technology lies in the fact that this technique is inherently scalable and hence can be implemented efficiently within a complex cloud computing environment. To demonstrate the efficacy of our approach, we have deployed an intelligible lightweight representative scenario in the context of monitoring and provisioning virtual machines within the test-bed. Experimental results indicate notable improvement in the resource provision profile of the virtualized data center when our proposed strategy is incorporated.
Proceedings of the 28th IEEE International Conference on Advanced Information Networking and Applications (AINA-2014), Victoria, Canada; 05/2014
ABSTRACT: Interest management in Distributed Virtual Environments (DVEs) is a data-filtering technique designed to reduce bandwidth consumption and thereby enhance the scalability of the system. This technique usually involves a process called interest matching, which determines what data should be sent to the participants as well as what data should be filtered. Although most of the existing interest matching approaches have been shown to meet their runtime performance requirements, they have a fundamental disadvantage: they perform interest matching at discrete time intervals. As a result, they would fail to report events between discrete timesteps. If participants of the DVE ignore these missing events, they would most likely perform incorrect simulations. This article presents a new approach called space-time interest matching, which aims to capture the missing events between discrete timesteps. Although this approach requires additional matching effort, a number of novel algorithms are developed to significantly improve its runtime efficiency.
ACM Transactions on Modeling and Computer Simulation 05/2014; 24(3):1-23. DOI:10.1145/2567922 · 0.78 Impact Factor
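To make the missed-event problem in the abstract above concrete, the following sketch contrasts a check performed only at timestep boundaries with a check over the whole interval swept during the timestep. It is a deliberately simplified 1D illustration with a static interest region and linear motion, not one of the article's algorithms; all function names and values are hypothetical.

```python
from typing import Tuple

Interval = Tuple[float, float]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two closed 1D intervals intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def discrete_match(start: float, end: float, half_width: float, region: Interval) -> bool:
    """Test the entity's extent only at the two timestep boundaries."""
    return any(
        overlaps((pos - half_width, pos + half_width), region)
        for pos in (start, end)
    )


def swept_match(start: float, end: float, half_width: float, region: Interval) -> bool:
    """Test the interval covered by the entity while moving from start to end."""
    swept = (min(start, end) - half_width, max(start, end) + half_width)
    return overlaps(swept, region)


# A fast entity jumps from x=0 to x=10 in one timestep, passing a region at [4, 5].
region_of_interest = (4.0, 5.0)
missed = discrete_match(0.0, 10.0, 0.5, region_of_interest)
caught = swept_match(0.0, 10.0, 0.5, region_of_interest)
```

Here the boundary-only check reports no match even though the entity passes through the region during the timestep, while the swept check reports it.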
ABSTRACT: Outer joins are ubiquitous in databases and big data systems. The question of how best to execute outer joins in large parallel systems is particularly challenging, as real world datasets are characterized by data skew leading to performance issues. Although skew handling techniques have been extensively studied for inner joins, there is little published work solving the corresponding problem for parallel outer joins. Conventional approaches to this problem, such as ones based on hash redistribution, often lead to load balancing problems, while duplication-based approaches incur significant overhead in terms of network communication. In this paper, we propose a new algorithm, query with counters (QC), for directly handling skew in outer joins on distributed architectures. We present an efficient implementation of our approach based on the asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate the performance of our approach on a cluster of 192 cores (16 nodes) and datasets of 1 billion tuples with different skew. Experimental results show that our method is scalable and, in cases of high skew, faster than the state-of-the-art.
14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing; 05/2014
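The load balancing problem that the abstract above attributes to hash redistribution can be shown with a small calculation: when one join key dominates, every tuple carrying it lands on the same node. The sketch below assumes integer join keys and modulo placement, and the names `partition_loads` and `imbalance` are hypothetical; it illustrates the problem only, not the QC algorithm.

```python
from collections import Counter
from typing import List


def partition_loads(keys: List[int], num_nodes: int) -> List[int]:
    """Count how many tuples each node receives under modulo hash redistribution."""
    loads = Counter(key % num_nodes for key in keys)
    return [loads.get(node, 0) for node in range(num_nodes)]


def imbalance(loads: List[int]) -> float:
    """Ratio of the most loaded node to the average load (1.0 means perfectly even)."""
    return max(loads) / (sum(loads) / len(loads))


uniform_keys = list(range(8))        # every key appears once
skewed_keys = [0] * 5 + [1, 2, 3]    # key 0 dominates the input

uniform_loads = partition_loads(uniform_keys, 4)
skewed_loads = partition_loads(skewed_keys, 4)
```

With the skewed input, the node that owns key 0 receives five of the eight tuples, so it becomes the straggler that determines overall join time.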
ABSTRACT: The past two decades have witnessed an explosion in the deployment of large-scale distributed simulations and distributed virtual environments in different domains, including military and academic simulation systems, social media, and commercial applications such as massively multiplayer online games. As these systems become larger, more data intensive, and more latency sensitive, the optimisation of the flow of data, a paradigm referred to as interest management, has become increasingly critical to address the scalability requirements and enable their successful deployment. Numerous interest management schemes have been proposed for different application scenarios. This article provides a comprehensive survey of the state of the art in the design of interest management algorithms and systems. The scope of the survey includes current and historical projects providing a taxonomy of the existing schemes and summarising their key features. Identifying the primary requirements of interest management, the article discusses the trade-offs involved in the design of existing approaches.
ABSTRACT: The Semantic Web comprises enormous volumes of semi-structured data elements.
For interoperability, these elements are represented by long strings. Such
representations are not efficient for the purposes of Semantic Web applications
that perform computations over large volumes of information. A typical method
for alleviating the impact of this problem is through the use of compression
methods that produce more compact representations of the data. The use of
dictionary encoding for this purpose is particularly prevalent in Semantic Web
database systems. However, centralized implementations present performance
bottlenecks, giving rise to the need for scalable, efficient distributed
encoding schemes. In this paper, we describe an encoding implementation based
on the asynchronous partitioned global address space (APGAS) parallel
programming model. We evaluate performance on a cluster of up to 384 cores and
datasets of up to 11 billion triples (1.9 TB). Compared to the state-of-the-art
MapReduce algorithm, we demonstrate a speedup of 2.6-7.4x and excellent
scalability. These results illustrate the strong potential of the APGAS model
for efficient implementation of dictionary encoding and contribute to the
engineering of larger scale Semantic Web applications.
ABSTRACT: Hybrid optical/electrical interconnects using commercial optical circuit switches have been previously proposed as an attractive alternative to fully-connected electronically-switched networks. Among other advantages, such a design offers increased port density, bandwidth/port, cabling and energy efficiency, compared to conventional packet-switched counterparts. Recent proposals for such system designs have looked at small and/or medium scale networks employing hybrid interconnects. In our previous work, we presented a hybrid optical/electrical interconnect architecture targeting large-scale deployments in high-performance computing and datacenter environments. To reduce complexity, our architecture employs a regular shuffle network topology that allows for simple management and cabling. Thanks to using a single-stage core interconnect and multiple optical planes, our design can be both incrementally scaled up (in capacity) and scaled out (in the number of racks) without requiring major re-cabling and network re-configuration. In this paper, we extend the fundamentals of our existing work towards quantifying and understanding the performance of these types of systems against more diverse workload communication patterns and system design parameters. In this context, we evaluate, among other characteristics, the overhead of the reconfiguration (decomposition and routing) scheme proposed and extend our simulations to highly adversarial flow generation rate/duration values that challenge the reconfiguration latency of the system.
ABSTRACT: MapReduce is a programming model that is capable of processing large data sets in distributed computing environments. The original MapReduce model was designed to be fault-tolerant in case of various network abnormalities. However, fault-tolerance does ...
ABSTRACT: An emerging class of Dynamic Data Driven application systems heavily depends on cloud and Big Data. We refer to this class of DDDAS as cloud-based DDDAS. Despite the growing interest in marrying DDDAS with the cloud, there is a general lack of architectural frameworks explicating the cloud requirements, which can support cloud-based DDDAS. Given the unpredictable, dynamic and on-demand nature of the cloud, cloud-based DDDAS requires novel approaches for dynamic Quality of Service (QoS) optimization. This is important for providing timely and reliable predictions and for ensuring higher dependability in the solution, as it would be unrealistic to assume that optimal QoS can be achieved at design time. We propose a decentralized architectural style for cloud-based DDDAS, where dynamic QoS optimization is at the heart of the symbiotic adaptation. The architecture leverages the classical DDDAS primitives to reach a refined decentralized style suited for the dynamic requirements of the cloud. We formulate the QoS optimization problem as a dynamic multi-objective optimization problem. We use a scenario to exemplify and evaluate the effectiveness of the style.
ABSTRACT: Multi-agent systems (MAS) are increasingly being acknowledged as a modelling paradigm for capturing the dynamics of complex systems in a wide range of domains, from systems biology to adaptive socio-technical systems of systems. The execution of such MAS simulations on parallel machines is a challenging problem due to their dynamic, non-deterministic, data-centric behaviour and nature. These problems are exacerbated as the scale of such MAS models increases. PDES-MAS is a distributed simulation kernel developed specifically to support MAS models, addressing the problems of partitioning, load balancing and interest management in an integrated, transparent and adaptive manner. This paper presents an overview of PDES-MAS and, for the first time, provides a quantitative evaluation of the system.
ABSTRACT: Interest management is a filtering technique which is designed to reduce bandwidth consumption in Distributed Virtual Environments. This technique usually involves a process called “interest matching”, which determines what data should be filtered. Existing interest matching algorithms, however, are mainly designed for serial processing and are intended to run on a single processor. As the problem size grows, these algorithms may not be scalable since the single processor may eventually become a bottleneck. In this paper, a parallel approach for interest matching is presented which is suitable for deployment on both shared-memory and distributed-memory multiprocessors. We also provide an analysis of speed-up and efficiency for the simulation results of the parallel algorithms.
ABSTRACT: The performance of parallel distributed data management systems becomes increasingly important with the rise of Big Data. Parallel joins have been widely studied both in the parallel processing and the database communities. Nevertheless, most of the algorithms developed so far do not consider data skew, which naturally exists in various applications. State-of-the-art methods designed to handle this problem are based on extensions to either of the two prevalent conventional approaches to parallel joins: the hash-based and duplication-based frameworks. In this paper, we introduce a novel parallel join framework, query-based distributed join (QbDJ), for handling data skew on distributed architectures. Further, we present an efficient implementation of the method based on the asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate the performance of our approach on a cluster of 192 cores (16 nodes) and datasets of 1 billion tuples with different skews. The results show that the method is scalable, and also runs faster with less network communication compared to the state-of-the-art PRPD approach under high data skew.
15th IEEE International Conference on High Performance Computing and Communications; 11/2013
ABSTRACT: A large scale High Level Architecture (HLA)-based simulation can be constructed using a network of simulation federations to form a “federation community”. This effort is often for the sake of enhancing scalability, interoperability, composability and enabling information security. Synchronization mechanisms are essential to coordinate the execution of federates and event transmissions across the boundaries of interlinked federations. We have developed a generic synchronization mechanism for federation community networks with its correctness mathematically proved. The synchronization mechanism suits various types of federation community network and supports the reusability of legacy federates. It is platform-neutral and independent of federate modeling approaches. The synchronization mechanism has been evaluated in the context of the Grid-enabled federation community approach, which allows simulation users to benefit from both Grid computing technologies and the federation community approach. A series of experiments has been carried out to validate and benchmark the synchronization mechanism. The experimental results indicate that the proposed mechanism provides correct time management services to federation communities. The results also show that the mechanism exhibits encouraging performance in terms of synchronization efficiency and scalability.
Journal of Parallel and Distributed Computing 04/2013; 70(2):144-159. DOI:10.1016/j.jpdc.2009.10.006 · 1.18 Impact Factor
ABSTRACT: The level of research that e-voting has attracted is testimony to its importance as a key element in the implementation of e-government. It is argued that the ease with which voting can be performed will increase participation and enhance accountability. This convenience, however, generates a set of specific requirements, not least the ability of the underlying distributed system to model the behaviour of manual systems. More specifically, the elimination of direct physical intervention entails careful management of the implications of virtual participation. The scope of this work concerns the identification and integration of specific mechanisms for addressing issues of security, privacy and accountability. The aim of this paper is to present a case study on the design and implementation of an e-voting prototype system, and to provide a context for the selection and deployment of relevant mechanisms.