Article

OpenGeoBase: Information Centric Networking meets Spatial Database applications

Abstract

This paper explores methodologies, advantages and challenges related to the use of Information Centric Networking (ICN) for realizing distributed spatial databases. Our findings show that ICN functionality fits database requirements well: routing-by-name can be used to dispatch queries and insertions, in-network caching to accelerate queries, and data-centric security to implement secure multi-tenancy. We present an ICN-based distributed spatial database, named OpenGeoBase, and describe design choices that lead to performance suitable for real-world applications. Thanks to ICN, OpenGeoBase can quickly and efficiently provide large volumes of information to database users; easily operate in a distributed way, deploying and using many database engines in parallel; secure every piece of content in a customizable way; and naturally slice resources, so that several tenants and users can use the database in parallel and independently. We also show how OpenGeoBase can support an Intelligent Transport System application, by enabling fast responses to continent-wide queries about stops, routes, trips, schedules, real-time updates, fares, etc.
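As a rough illustration of how routing-by-name could dispatch spatial queries, the sketch below maps coordinates to grid tiles and builds one hierarchical name per tile. The `/geo` prefix, the one-degree tile size, and the name layout are assumptions made for this example only; the abstract does not describe the naming scheme OpenGeoBase actually uses.

```python
import math


def tile_name(lat, lon, tile_deg=1.0, prefix="/geo"):
    """Return a hypothetical ICN name for the grid tile containing (lat, lon)."""
    row = math.floor((lat + 90.0) / tile_deg)
    col = math.floor((lon + 180.0) / tile_deg)
    return f"{prefix}/{tile_deg:g}/{row}/{col}"


def names_for_range(south, west, north, east, tile_deg=1.0, prefix="/geo"):
    """Return the names of all tiles intersecting a bounding box.

    Assumption: the box does not cross the antimeridian.
    """
    r0 = math.floor((south + 90.0) / tile_deg)
    r1 = math.floor((north + 90.0) / tile_deg)
    c0 = math.floor((west + 180.0) / tile_deg)
    c1 = math.floor((east + 180.0) / tile_deg)
    return [
        f"{prefix}/{tile_deg:g}/{r}/{c}"
        for r in range(r0, r1 + 1)
        for c in range(c0, c1 + 1)
    ]


# An insertion at one point would be published under a single tile name;
# a range query would be split into one request per covered tile.
point_name = tile_name(41.9, 12.5)
query_names = names_for_range(41.2, 12.1, 42.3, 13.4)
```

Under these assumptions, each name can be routed independently, so requests for different tiles may reach different database engines, and repeated requests for the same tile name are natural candidates for in-network caching.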

... Finally, we point out that in [12] we dealt with ICN and NoSQL databases; however, in that paper we explored how ICN can be used to implement distributed databases, rather than federated ones. The main difference between the two is that in a distributed deployment the data is spread over the available databases of the cluster according to a sharding logic, thus the query routing process does not need any indexing. ...
... Moreover, the security requirements are also different. Besides, we note that the ICN distributed database that we proposed in [12] can be considered one of the possible NoSQL DBMSs that can join our proposed federated architecture, like MongoDB, CouchBase, etc. ...
Preprint
This paper explores methodologies, challenges and expected advantages related to the use of Information Centric Networking (ICN) technology for federating NoSQL databases. ICN services simplify the design of federation procedures, improve their performance, and provide so-called data-centric security. In this work we present an architecture able to federate NoSQL spatial databases and evaluate its performance by using a real data set within a heterogeneous federation formed by MongoDB and CouchBase database systems.
... And it can process data through .NET objects, without needing to pay attention to the underlying database that stores the data. By separating the database design from the domain object design, it makes programs more scalable and maintainable [9]. ...
Chapter
In order to reduce the large network overhead and the heavy cost of cross-matching astronomical catalogs in a database cluster, we propose a novel cross-match method based on Roaring Bitmap. First, we store the astronomical catalog data in compressed column-oriented storage to reduce the I/O overhead of accessing fields in the parallel database system. Second, we create a spatial index that maps 2D coordinates to integers. Then, we use Roaring Bitmap to convert the spatial index into a bitmap index. Finally, each spatial range search issued by the cross-match is translated into bitmap operations to achieve batch processing. Experiments over real large-scale astronomical data show that the proposed method is 4 to 10 times faster than the traditional method, while consuming less than 10% of the memory.
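The pipeline described above (2D coordinates to integers, integers to a bitmap, range matching as bitmap operations) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' method: a Z-order (Morton) code stands in for the unspecified spatial index, and a plain Python integer stands in for a Roaring Bitmap. All function names are hypothetical.

```python
def interleave_bits(x, y, bits=16):
    """Morton (Z-order) code: interleave the low `bits` bits of x and y."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return code


def to_bitmap(cells):
    """Build an uncompressed bitmap (a Python int) with one bit per cell id."""
    bitmap = 0
    for cell in cells:
        bitmap |= 1 << cell
    return bitmap


def cells_in(bitmap):
    """List the cell ids whose bits are set, in ascending order."""
    found, position = [], 0
    while bitmap:
        if bitmap & 1:
            found.append(position)
        bitmap >>= 1
        position += 1
    return found


# Two toy catalogs given as integer grid coordinates (assumed pre-quantized).
catalog_a = [(1, 1), (2, 3), (7, 7)]
catalog_b = [(2, 3), (7, 7), (4, 4)]

bitmap_a = to_bitmap(interleave_bits(x, y) for x, y in catalog_a)
bitmap_b = to_bitmap(interleave_bits(x, y) for x, y in catalog_b)

# A single AND finds every cell occupied in both catalogs at once.
shared_cells = cells_in(bitmap_a & bitmap_b)
```

A compressed bitmap library would replace the integer bitmap in practice; the point here is only that a batch of candidate matches reduces to one bitwise operation.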
Article
This paper proposes Navigo, a location based packet forwarding mechanism for vehicular Named Data Networking (NDN). Navigo takes a radically new approach to address the challenges of frequent connectivity disruptions and sudden network changes in a vehicle network. Instead of forwarding packets to a specific moving car, Navigo aims to fetch specific pieces of data from multiple potential carriers of the data. The design provides (1) a mechanism to bind NDN data names to the producers' geographic area(s); (2) an algorithm to guide Interests towards data producers using a specialized shortest path over the road topology; and (3) an adaptive discovery and selection mechanism that can identify the best data source across multiple geographic areas, as well as quickly react to changes in the V2X network.
Conference Paper
Content-Centric Networking (CCN) recently received a lot of attention thanks to its elegant way to optimize content diffusion at the scale of Internet. However, communications occurring at the edge of Internet, in particular the Internet of Things (IoT), are also a vivid research topic. Even if CCN was not initially designed to optimize the specific traffic pattern of the IoT, it can be improved to better support these new applications. In this paper, we propose to optimize the traffic within a CCN for IoT network where information is created and consumed at different frequencies. The simulations show that our solution outperforms the vanilla CCN architecture for this generic scenario.
Conference Paper
Content-Centric Networks (CCN) provide substantial flexibility for users to obtain information without regard to the source of the information or its current location. Publish/subscribe (pub/sub) systems have gained popularity in society to provide the convenience of removing the temporal dependency of the user having to indicate an interest each time he or she wants to receive a particular piece of related information. Currently, on the Internet, such pub/sub systems have been built on top of an IP-based network with the additional responsibility placed on the end-systems and servers to do the work of getting a piece of information to interested recipients. We propose Content-Oriented Pub/Sub System (COPSS) to achieve an efficient pub/sub capability for CCN. COPSS enhances the heretofore inherently pull-based CCN architectures proposed by integrating a push based multicast capability at the content-centric layer. We emulate an application that is particularly emblematic of a pub/sub environment - Twitter - but one where subscribers are interested in content (e.g., identified by keywords), rather than tweets from a particular individual. Using trace-driven simulation, we demonstrate that our architecture can achieve a scalable and efficient content centric pub/sub network. The simulator is parameterized using the results of careful micro benchmarking of the open source CCN implementation and of standard IP based forwarding. Our evaluations show that COPSS provides considerable performance improvements in terms of aggregate network load, publisher load and subscriber experience compared to that of a traditional IP infrastructure.
Conference Paper
In order to handle spatial data efficiently, as required in computer aided design and geo-data applications, a database system needs an index mechanism that will help it retrieve data items quickly according to their spatial locations. However, traditional indexing methods are not well suited to data objects of non-zero size located in multi-dimensional spaces. In this paper we describe a dynamic index structure called an R-tree which meets this need, and give algorithms for searching and updating it. We present the results of a series of tests which indicate that the structure performs well, and conclude that it is useful for current database systems in spatial applications.
Conference Paper
Microsoft SQL Server 2008 adds built-in support for 2-dimensional spatial data types for both planar and geodetic geometries to address the increasing demands for managing location-aware data. SQL Server 2008 also adds indexing capabilities that, together with the necessary plan selections done by the query optimizer, provide efficient processing of spatial queries. This paper will present an overview of the spatial indexing implementation in SQL Server 2008 and outline how the indexing is implemented and how the cost-based query optimizer chooses among the different plans.
Article
A Bloom filter is a simple space-efficient randomized data structure for representing a set in order to support membership queries. Bloom filters allow false positives but the space savings often outweigh this drawback when the probability of an error is controlled. Bloom filters have been used in database applications since the 1970s, but only in recent years have they become popular in the networking literature. The aim of this paper is to survey the ways in which Bloom filters have been used and modified in a variety of network problems, with the aim of providing a unified mathematical and practical framework for understanding them and stimulating their use in future applications.
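To make the membership-query behaviour concrete, here is a minimal Bloom filter sketch. The parameter defaults and the double-hashing scheme built on SHA-256 are choices made for this example, not details taken from the survey.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from two 64-bit halves of one digest
        # (double hashing); forcing h2 odd keeps it non-zero.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bloom = BloomFilter()
bloom.add("/geo/1/131/192")
present = "/geo/1/131/192" in bloom   # always True for an added item
```

A lookup for an item that was never added usually returns False, but it can return True when all of its bit positions happen to be set by other items; that is the false positive the abstract refers to.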
Conference Paper
We propose a scalable distributed data structure (SDDS) called SD-Rtree. We intend our structure for point and window queries over possibly large spatial datasets distributed on clusters of interconnected servers. SD-Rtree generalizes the well-known Rtree structure. It uses a distributed balanced binary spatial tree that scales with insertions to potentially any number of storage servers through splits of the overloaded ones. A user/application manipulates the structure from a client node. The client addresses the tree through its image that the splits can make outdated. This may generate addressing errors, solved by the forwarding among the servers. Specific messages towards the clients incrementally correct the outdated images.
Conference Paper
Securing communication in network applications involves many complex tasks that can be daunting even for security experts. The Named Data Networking (NDN) architecture builds data authentication into the network layer by requiring all applications to sign and authenticate every data packet. To make this authentication usable, the decision about which keys can sign which data and the procedure of signature verification need to be automated. This paper explores the ability of NDN to enable such automation through the use of trust schemas. Trust schemas can provide data consumers an automatic way to discover which keys to use to authenticate individual data packets, and provide data producers an automatic decision process about which keys to use to sign data packets and, if keys are missing, how to create keys while ensuring that they are used only within a narrowly defined scope ("the least privilege principle"). We have developed a set of trust schemas for several prototype NDN applications with different trust models of varying complexity. Our experience suggests that this approach has the potential of being generally applicable to a wide range of NDN applications.
Article
In this paper, we introduce a natural generalization of Weighted Set Cover and Maximum Coverage, called Size-Constrained Weighted Set Cover. The input is a collection of n elements, a collection of weighted sets over the elements, a size constraint k, and a minimum coverage fraction ŝ; the output is a sub-collection of up to k sets whose union contains at least ŝn elements and whose sum of weights is minimal. We prove the hardness of approximation of this problem, and we present efficient approximation algorithms with provable quality guarantees that are the best possible. In many applications, the elements are data records, and the set collection to choose from is derived from combinations (patterns) of attribute values. We provide optimization techniques for this special case. Finally, we experimentally demonstrate the effectiveness and efficiency of our solutions.
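The problem statement can be made concrete with an exhaustive search over all sub-collections of at most k sets. This sketch only illustrates the definition; it takes exponential time and is not one of the approximation algorithms the paper presents. The function name and the input layout are assumptions for this example.

```python
from itertools import combinations


def scwsc_brute_force(n, weighted_sets, k, fraction):
    """Exhaustively solve a small Size-Constrained Weighted Set Cover instance.

    weighted_sets: list of (weight, set of element ids drawn from range(n)).
    Returns (total_weight, indices of chosen sets), or None if infeasible.
    """
    best = None
    for size in range(1, k + 1):
        for combo in combinations(range(len(weighted_sets)), size):
            covered = set().union(*(weighted_sets[i][1] for i in combo))
            if len(covered) >= fraction * n:          # coverage constraint
                weight = sum(weighted_sets[i][0] for i in combo)
                if best is None or weight < best[0]:  # keep the cheapest
                    best = (weight, combo)
    return best


# Toy instance: 6 elements, 4 candidate sets with integer weights.
candidates = [
    (5, {0, 1, 2, 3, 4, 5}),
    (1, {0, 1}),
    (1, {2}),
    (2, {3, 4, 5}),
]
half_cover = scwsc_brute_force(6, candidates, k=2, fraction=0.5)
full_cover = scwsc_brute_force(6, candidates, k=2, fraction=1.0)
```

In the toy instance, covering half of the elements is cheapest with the single set of weight 2, while covering everything with at most two sets forces the expensive set of weight 5.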
Article
Mobile Ad-hoc NETworks (MANETs) connect mobile wireless devices without an underlying communication infrastructure. Communications occur in a multi-hop fashion, using mobile devices as routers. Several MANET distributed applications need to exchange data (GPS position, messages, pictures, etc.) by using a topic-based publish–subscribe interaction. Participants of these applications can publish information items on a given topic (identified by a name) and can subscribe to a topic to receive the related published information. Efficient dissemination of publish–subscribe data in MANET environments demands robust systems, able to cope with radio resource scarcity, network partitioning, and frequent topology changes. Many MANET publish–subscribe systems proposed so far in the literature assume an underlying TCP/IP network. In this paper, we discuss the benefits of building a MANET publish–subscribe system exploiting Content Centric Networking (CCN) technology, rather than TCP/IP. We show how CCN functionality, such as in-network caching and multicasting, can be used to achieve efficient and reliable data dissemination in MANET environments, including the support of delay-tolerant delivery. We present different design approaches, describe our topic-based publish–subscribe CCN system, and report the results of a performance evaluation study carried out with real software in an emulated environment. The emulation environment is based on Linux virtual machines. The performance evaluation also required a CCN MANET routing engine, which we developed as a plug-in of the OLSR Linux daemon.
Article
We propose a definition of a spatial database system as a database system that offers spatial data types in its data model and query language, and supports spatial data types in its implementation, providing at least spatial indexing and spatial join methods. Spatial database systems offer the underlying database technology for geographic information systems and other applications. We survey data modeling, querying, data structures and algorithms, and system architecture for such systems. The emphasis is on describing known technology in a coherent manner, rather than listing open problems.
Article
An adaptation of the quadtree data structure that represents polygonal maps (i.e., collections of polygons, possibly containing holes) is described in a manner that is also useful for the manipulation of arbitrary collections of straight line segments. The goal is to store these maps without the loss of information that results from digitization, and to obtain a worst-case execution time that is not overly sensitive to the positioning of the map. A regular decomposition variant of the region quadtree is used to organize the vertices and edges of the maps. A number of related data organizations are proposed in an iterative manner until a method is obtained that meets the stated goals. The result is termed a PM (Polygonal Map) quadtree and is based on a regular decomposition Point Space quadtree (PS quadtree) that stores additional information about the edges at its terminal nodes. Algorithms are given for inserting and deleting line segments from a PM quadtree. Use of the PM quadtree to perform point location, dynamic line insertion, and map overlay is discussed. An empirical comparison of the PM quadtree with other quadtree-based representations for polygonal maps is also provided.
Network applications of Bloom filters: A survey
  • A Broder
  • M Mitzenmacher
A. Broder and M. Mitzenmacher. Network applications of Bloom filters: A survey. Internet Mathematics, 1(4):485-509, 2004.
Content-based publish/subscribe networking and information-centric networking
  • A Carzaniga
  • M Papalini
  • A L Wolf
A. Carzaniga, M. Papalini, and A. L. Wolf. Content-based publish/subscribe networking and information-centric networking. In Proceedings of the ACM SIGCOMM workshop on Information-centric networking, pages 56-61. ACM, 2011.
Size-constrained weighted set cover
  • L Golab
  • F Korn
  • F Li
  • B Saha
  • D Srivastava
L. Golab, F. Korn, F. Li, B. Saha, and D. Srivastava. Size-constrained weighted set cover. In Data Engineering (ICDE), 2015 IEEE 31st International Conference on, pages 879-890. IEEE, 2015.