Article

Java support for data-intensive systems

Abstract

Database system designers have traditionally had trouble with the default services and interfaces provided by operating systems. In recent years, developers and enthusiasts have increasingly promoted Java as a serious platform for building data-intensive servers. Java provides a number of very helpful language features, as well as a full run-time environment reminiscent of a traditional operating system. This combination of features and community support raises the question of whether Java is better or worse at supporting data-intensive server software than a traditional operating system coupled with a weakly-typed language such as C or C++. In this paper, we summarize and discuss our experience building the Telegraph dataflow system in Java. We highlight some of the pleasures of coding with Java, and some of the pains of coding around Java in order to obtain good performance in a data-intensive server. For those issues that were painful, we present concrete suggestions for evolving Java's interfaces to better suit serious software systems development. We believe these experiences can provide insight for other designers to avoid pitfalls we encountered and to decide if Java is a suitable platform for their system.


... Examples of these HPC platforms are the IBM SP2, the SGI/Cray T3E, and the SGI Origin 2000. Several application implementation results on these HPC platforms have been reported [4,8,46,19,43,51,67,69]. ...
Article
Acknowledgments: I would like to thank Dr. Viktor K. Prasanna, my academic adviser, who has contributed to all my academic endeavors. He directed me to the way of sharp thinking, motivated me to conduct original research, and supervised my entire dissertation research. I am sincerely grateful for his effort in training me to become a successful researcher. I am also very grateful to the rest of my committee members, Professor Sandeep Gupta, Professor Ming-Deh Huang, Professor Douglas Ierardi, Professor Christos Kyriakakis, and Professor Monte Ung, for their guidance and instructions in preparing my thesis proposal and the final version of the dissertation. I would like to thank my fellow graduate students, Yonghwa Chung, Wenheng Liu, Henry Park, Prashanth Bhat, and Myungho Lee, for their invaluable inspiration and collaboration in writing the project proposals, reports, and papers, and in developing the parallel algorithms ...
... single-threaded implementation [24]. We now describe the module functionality in detail. ...
Conference Paper
Full-text available
We present a query architecture in which join operators are decomposed into their constituent data structures (State Modules, or SteMs), and dataflow among these SteMs is managed adaptively by an eddy routing operator [R. Avnur et al., (2000)]. Breaking the encapsulation of joins serves two purposes. First, it allows the eddy to observe multiple physical operations embedded in a join algorithm, allowing for better calibration and control of these operations. Second, the SteM on a relation serves as a shared materialization point, enabling multiple competing access methods to share results, which can be leveraged by multiple competing join algorithms. Our architecture extends prior work significantly, allowing continuously adaptive decisions for most major aspects of traditional query optimization: choice of access methods and join algorithms, ordering of operators, and choice of a query spanning tree. SteMs introduce significant routing flexibility to the eddy, enabling more opportunities for adaptation, but also introducing the possibility of incorrect query results. We present constraints on eddy routing through SteMs that ensure correctness while preserving a great deal of flexibility. We also demonstrate the benefits of our architecture via experiments in the Telegraph dataflow system. We show that even a simple routing policy allows significant flexibility in adaptation, including novel effects like automatic "hybridization" of multiple algorithms for a single join.
... Also note that tuples are never copied: once allocated, they are passed by reference between operators. For a detailed discussion of Telegraph, see [23]. ...
Article
Full-text available
We present a continuously adaptive, continuous query (CACQ) implementation based on the eddy query processing framework. We show that our design provides significant performance benefits over existing approaches to evaluating continuous queries, not only because of its adaptivity, but also because of the aggressive crossquery sharing of work and space that it enables. By breaking the abstraction of shared relational algebra expressions, our Telegraph CACQ implementation is able to share physical operators -- both selections and join state -- at a very fine grain. We augment these features with a grouped-filter index to simultaneously evaluate multiple selection predicates. We include measurements of the performance of our core system, along with a comparison to existing continuous query approaches.
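The grouped-filter idea can be sketched for the simplest case: many predicates of the form "attr > c" over a single attribute. This is a simplification of ours (class and method names are illustrative, not the CACQ implementation): storing the predicate constants sorted lets one binary search per tuple determine how many predicates are satisfied, instead of evaluating each predicate individually.

```java
import java.util.Arrays;

// Hedged sketch of a grouped filter for predicates "attr > c" over one
// attribute (our simplification of CACQ's grouped-filter index): the
// constants are kept sorted, so a single binary search per tuple counts
// every satisfied predicate at once.
public class GroupedFilter {
    private final double[] thresholds;  // sorted predicate constants

    public GroupedFilter(double[] constants) {
        thresholds = constants.clone();
        Arrays.sort(thresholds);
    }

    // Number of predicates "attr > c" satisfied by this value:
    // exactly the thresholds strictly below it.
    public int matchCount(double value) {
        int lo = 0, hi = thresholds.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (thresholds[mid] < value) lo = mid + 1; else hi = mid;
        }
        return lo;
    }
}
```

With n predicates this turns n comparisons per tuple into O(log n), which is the whole point of grouping the filters.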
Article
In-memory caching of intermediate data and active combining of data in shuffle buffers have been shown to be very effective in minimizing the recomputation and I/O cost in big data processing systems such as Spark and Flink. However, it has also been widely reported that these techniques would create a large amount of long-living data objects in the heap. These generated objects may quickly saturate the garbage collector, especially when handling a large dataset, and hence, limit the scalability of the system. To eliminate this problem, we propose a lifetime-based memory management framework, which, by automatically analyzing the user-defined functions and data types, obtains the expected lifetime of the data objects and then allocates and releases memory space accordingly to minimize the garbage collection overhead. In particular, we present Deca, a concrete implementation of our proposal on top of Spark, which transparently decomposes and groups objects with similar lifetimes into byte arrays and releases their space altogether when their lifetimes come to an end. When systems are processing very large data, Deca also provides field-oriented memory pages to ensure high compression efficiency. Extensive experimental studies using both synthetic and real datasets show that, compared to Spark, Deca is able to (1) reduce the garbage collection time by up to 99.9%, (2) reduce the memory consumption by up to 46.6% and the storage space by 23.4%, (3) achieve 1.2× to 22.7× speedup in terms of execution time in cases without data spilling and 16× to 41.6× speedup in cases with data spilling, and (4) provide similar performance compared to domain-specific systems.
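A minimal sketch of the lifetime-based idea, assuming records with an int key and a double value (the API and names here are ours, not Deca's): records that share an expected lifetime are serialized into one byte region, so the garbage collector sees a single long-lived object rather than millions of small ones, and the region is released in one step.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch (not Deca's actual API): records with the same expected
// lifetime are packed into one shared buffer instead of individual objects.
public class LifetimeRegion {
    private ByteBuffer buf;

    public LifetimeRegion(int capacityBytes) {
        buf = ByteBuffer.allocate(capacityBytes);
    }

    // Append an (int key, double value) record; returns its byte offset.
    public int put(int key, double value) {
        int offset = buf.position();
        buf.putInt(key).putDouble(value);
        return offset;
    }

    public int keyAt(int offset)      { return buf.getInt(offset); }
    public double valueAt(int offset) { return buf.getDouble(offset + 4); }

    // When every record in the region reaches the end of its lifetime,
    // the whole region is dropped at once -- no per-object collection.
    public void release() { buf = null; }
}
```

The real system derives the region boundaries automatically from user-defined functions and data types; here they would have to be chosen by hand.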
Article
Over the past decade, the increasing demands on data-driven business intelligence have led to the proliferation of large-scale, data-intensive applications that often have huge amounts of data (often at terabyte or petabyte scale) to process. An object-oriented programming language such as Java is often the developer's choice for implementing such applications, primarily due to its quick development cycle and rich community resource. While the use of such languages makes programming easier, significant performance problems can often be seen --- the combination of the inefficiencies inherent in a managed run-time system and the impact of the huge amount of data to be processed in the limited memory space often leads to memory bloat and performance degradation at a surprisingly early stage. This paper proposes a bloat-aware design paradigm towards the development of efficient and scalable Big Data applications in object-oriented GC enabled languages. To motivate this work, we first perform a study on the impact of several typical memory bloat patterns. These patterns are summarized from the user complaints on the mailing lists of two widely-used open-source Big Data applications. Next, we discuss our design paradigm to eliminate bloat. Using examples and real-world experience, we demonstrate that programming under this paradigm does not incur significant programming burden. We have implemented a few common data processing tasks both using this design and using the conventional object-oriented design. Our experimental results show that this new design paradigm is extremely effective in improving performance --- even for the moderate-size data sets processed, we have observed 2.5x+ performance gains, and the improvement grows substantially with the size of the data set.
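The paradigm's core move, replacing one heap object per record with a handful of primitive arrays, can be sketched as follows (a toy illustration of ours, not the paper's code): object headers and pointer indirection drop from O(N) to O(1) for N records.

```java
// Bloat-aware layout sketch: N records stored as parallel primitive arrays
// rather than N small objects, so the heap holds two large arrays instead of
// N headers plus N references.
public class ColumnarRecords {
    private final int[] ids;
    private final long[] timestamps;
    private int size;

    public ColumnarRecords(int capacity) {
        ids = new int[capacity];
        timestamps = new long[capacity];
    }

    public void add(int id, long ts) {
        ids[size] = id;
        timestamps[size] = ts;
        size++;
    }

    public int id(int i)         { return ids[i]; }
    public long timestamp(int i) { return timestamps[i]; }
    public int size()            { return size; }
}
```

The trade-off is an API of index-based accessors instead of record objects, which the paper argues is an acceptable programming burden for the hot data paths of Big Data applications.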
Article
The system StreamAPAS and its declarative query language allows users to define temporal data analysis. This chapter addresses the problem of lack of the continuous language standard. The proposed language syntax indicates how hierarchical data structures simplify working with spatial data and groups of tuple attributes. The query language is also based on object-oriented programming concepts as a result of which continuous processing applications are easier to develop and maintain. In addition, we discuss the problem of a query logic representation. In contrast to relations stored in DBMS, data streams are temporal so that DSMS should be aware of their dynamic characteristics. Streams characteristics can be described using variables such as tuple rates and invariables like monotonicity. In StreamAPAS, a query is represented as a directed acyclic graph (DAG) whose operators define tuple data transmission model and have information of result stream monotonicity associated with them. Even though this representation is still static, this approach enables us to detect optimization points which are crucial from a stream processing viewpoint.
Conference Paper
Full-text available
This paper presents Capriccio, a scalable thread package for use with high-concurrency servers. While recent work has advocated event-based systems, we believe that thread-based systems can provide a simpler programming model that achieves equivalent or superior performance. By implementing Capriccio as a user-level thread package, we have decoupled the thread package implementation from the underlying operating system. As a result, we can take advantage of cooperative threading, new asynchronous I/O mechanisms, and compiler support. Using this approach, we are able to provide three key features: (1) scalability to 100,000 threads, (2) efficient stack management, and (3) resource-aware scheduling. We introduce linked stack management, which minimizes the amount of wasted stack space by providing safe, small, and non-contiguous stacks that can grow or shrink at run time. A compiler analysis makes our stack implementation efficient and sound. We also present resource-aware scheduling, which allows thread scheduling and admission control to adapt to the system's current resource usage. This technique uses a blocking graph that is automatically derived from the application to describe the flow of control between blocking points in a cooperative thread package. We have applied our techniques to the Apache 2.0.44 web server, demonstrating that we can achieve high performance and scalability despite using a simple threaded programming model.
Article
Full-text available
In-memory caching of intermediate data and eager combining of data in shuffle buffers have been shown to be very effective in minimizing the re-computation and I/O cost in distributed data processing systems like Spark and Flink. However, it has also been widely reported that these techniques would create a large amount of long-living data objects in the heap, which may quickly saturate the garbage collector, especially when handling a large dataset, and hence would limit the scalability of the system. To eliminate this problem, we propose a lifetime-based memory management framework, which, by automatically analyzing the user-defined functions and data types, obtains the expected lifetime of the data objects, and then allocates and releases memory space accordingly to minimize the garbage collection overhead. In particular, we present Deca, a concrete implementation of our proposal on top of Spark, which transparently decomposes and groups objects with similar lifetimes into byte arrays and releases their space altogether when their lifetimes come to an end. An extensive experimental study using both synthetic and real datasets shows that, compared with Spark, Deca is able to 1) reduce the garbage collection time by up to 99.9%, 2) achieve up to 22.7x speedup in terms of execution time in cases without data spilling and 41.6x speedup in cases with data spilling, and 3) consume up to 46.6% less memory.
Conference Paper
Full-text available
Although most business application data is stored in relational databases, the programming languages and wire formats in integration middleware systems are not table-centric. To avoid costly format conversions and data shipments, and to exploit faster computation, the trend is to "push down" integration operations closer to the storage representation. We address the alternative case of defining declarative, table-centric integration semantics within standard integration systems. For that, we replace the current operator implementations of the well-known Enterprise Integration Patterns with equivalent "in-memory" table processing, and show a practical realization in a conventional integration system for a non-reliable, "data-intensive" messaging example. The results of the runtime analysis show that table-centric processing is promising already for standard, "single-record" message routing and transformations, and can potentially improve the message throughput for "multi-record" table messages.
Article
Increasingly, organizations capture, transform and analyze enormous data sets. Prominent examples include internet companies and e-science. The Map-Reduce scalable dataflow paradigm has become popular for these applications. Its simple, explicit dataflow programming model is favored by some over the traditional high-level declarative approach: SQL. On the other hand, the extreme simplicity of Map-Reduce leads to much low-level hacking to deal with the many-step, branching dataflows that arise in practice. Moreover, users must repeatedly code standard operations such as join by hand. These practices waste time, introduce bugs, harm readability, and impede optimizations. Pig is a high-level dataflow system that aims at a sweet spot between SQL and Map-Reduce. Pig offers SQL-style high-level data manipulation constructs, which can be assembled in an explicit dataflow and interleaved with custom Map- and Reduce-style functions or executables. Pig programs are compiled into sequences of Map-Reduce jobs, and executed in the Hadoop Map-Reduce environment. Both Pig and Hadoop are open-source projects administered by the Apache Software Foundation. This paper describes the challenges we faced in developing Pig, and reports performance comparisons between Pig execution and raw Map-Reduce execution.
Conference Paper
The system StreamAPAS and its declarative query language allows users to define temporal data analysis. This paper addresses the problem of lack of the continuous language standard. The proposed language syntax indicates how hierarchical data structures simplify working with spatial data and groups of tuple attributes. The query language is also based on object-oriented programming concepts as a result of which continuous processing applications are easier to develop and maintain. In addition, we discuss the problem of a query logic representation. In contrast to relations stored in DBMS, data streams are temporal so that DSMS should be aware of their dynamic characteristics. Streams characteristics can be described using variables such as tuple rates and invariables like monotonicity. In StreamAPAS, a query is represented as a directed acyclic graph (DAG), whose operators define tuple data transmission model and have information of result stream monotonicity associated with them. Even though this representation is still static, this approach enables us to detect optimization points which are crucial from a stream processing viewpoint.
Conference Paper
The integration of the .NET Common Language Runtime (CLR) inside the SQL Server DBMS enables database programmers to write business logic in the form of functions, stored procedures, triggers, data types, and aggregates using modern programming languages such as C#, Visual Basic, C++, COBOL, and J++. This paper presents three main aspects of this work. First, it describes the architecture of the integration of the CLR inside the SQL Server database process to provide a safe, scalable, secure, and efficient environment to run user code. Second, it describes our approach to defining and enforcing extensibility contracts to allow a tight integration of types, aggregates, functions, triggers, and procedures written in modern languages with the DBMS. Finally, it presents initial performance results showing the efficiency of user-defined types and functions relative to equivalent native DBMS features.
Conference Paper
We present an archiving technique for hierarchical data with key structure. Our approach is based on the notion of timestamps whereby an element appearing in multiple versions of the database is stored only once along with a compact description of versions ...
Conference Paper
Full-text available
In many applications involving continuous data streams, data arrival is bursty and data rate fluctuates over time. Systems that seek to give rapid or real-time query responses in such an environment must be prepared to deal gracefully with bursts in data arrival without compromising system performance. We discuss one strategy for processing bursty streams --- adaptive, load-aware scheduling of query operators to minimize resource consumption during times of peak load. We show that the choice of an operator scheduling strategy can have significant impact on the run-time system memory usage. We then present Chain scheduling, an operator scheduling strategy for data stream systems that is near-optimal in minimizing run-time memory usage for any collection of single-stream queries involving selections, projections, and foreign-key joins with stored relations. Chain scheduling also performs well for queries with sliding-window joins over multiple streams, and multiple queries of the above types. A thorough experimental evaluation is provided where we demonstrate the potential benefits of Chain scheduling, compare it with competing scheduling strategies, and validate our analytical conclusions.
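The greedy intuition behind Chain, run the eligible operator whose progress chart drops fastest, can be sketched as follows. This is a simplification of ours (real Chain schedules along the lower envelope of the progress chart; names and fields here are illustrative): each operator exposes a per-tuple cost and a selectivity, and the scheduler always picks the non-empty operator with the steepest memory reduction per unit of work.

```java
import java.util.List;

// Simplified Chain-style scheduler (our formulation): at every step, run the
// operator that sheds the most buffered tuples per unit of processing time.
public class ChainScheduler {
    public static class Op {
        final String name;
        final double costPerTuple;   // time to process one tuple
        final double selectivity;    // fraction of tuples surviving
        int queued;                  // tuples waiting at this operator

        public Op(String name, double cost, double sel, int queued) {
            this.name = name; this.costPerTuple = cost;
            this.selectivity = sel; this.queued = queued;
        }

        // "Slope": memory reduction per unit of processing time.
        double slope() { return (1.0 - selectivity) / costPerTuple; }
    }

    // Pick the operator to run next: steepest slope among non-empty queues.
    public static Op next(List<Op> ops) {
        Op best = null;
        for (Op op : ops) {
            if (op.queued > 0 && (best == null || op.slope() > best.slope())) {
                best = op;
            }
        }
        return best;
    }
}
```

Under a burst, this policy drains tuples through the cheapest, most selective operators first, which is what keeps run-time memory near the minimum the paper proves.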
Conference Paper
This paper describes WebLogic Event Server (WL EvS), an application server designed for hosting event-driven applications that require low latency and deterministic behavior. WL EvS is based on a modular architecture in which both server components and applications are represented as modules. The application programming model supports applications that are a mixture of reusable Java components and EPL (Event Processing Language), a query language that extends SQL with stream processing capabilities. WL EvS applications are meta-data driven, in that application behavior can be changed without recompiling or redeploying the application. The paper also presents the results of a benchmark performance study. The results show that the approach used by WL EvS can handle extremely high volumes of events while providing deterministic latency.
Article
Full-text available
We have developed and implemented the Relational Grid Monitoring Architecture (R-GMA) as part of the DataGrid project, to provide a flexible information and monitoring service for use by other middleware components and applications. R-GMA presents users with a virtual database and mediates queries posed at this database: users pose queries against a global schema and R-GMA takes responsibility for locating relevant sources and returning an answer. R-GMA’s architecture and mechanisms are general and can be used wherever there is a need for publishing and querying information in a distributed environment. We discuss the requirements, design and implementation of R-GMA as deployed on the DataGrid testbed. We also describe some of the ways in which R-GMA is being used.
Article
Full-text available
Grids are distributed systems that provide access to computational resources in a transparent fashion. Providing information about the status of the Grid itself is called Grid monitoring. As an approach to this problem, we present the Relational Grid Monitoring Architecture (R-GMA), which tackles Grid monitoring as an information integration problem. A novel feature of R-GMA is its support for integrating stream data via a simple “local as view” approach. We describe the infrastructure that R-GMA provides for publishing and querying monitoring data. In this context, we discuss the semantics of continuous queries, provide characterisations of query plans, and present an algorithm for computing such plans. The concepts and mechanisms offered by R-GMA are general and can be applied in other areas where there is a need for publishing and querying information in a distributed fashion.
Conference Paper
A conventional query execution engine in a database system essentially uses a SQL virtual machine (SVM) to interpret a dataflow tree in which each node is associated with a relational operator. During query evaluation, a single tuple at a time is processed and passed among the operators. Such a model is popular because of its efficiency for pipelined processing. However, since each operator is implemented statically, it has to be very generic in order to deal with all possible queries. Such generality tends to introduce significant runtime inefficiency, especially in the context of memory-resident systems, because the granularity of data processing (a tuple) is too small compared with the associated overhead. Another disadvantage in such an engine is that each operator code is compiled statically, so query-specific optimization cannot be applied. To improve runtime efficiency, we propose a compiled execution engine, which, for a given query, generates new query-specific code on the fly, and then dynamically compiles and executes the code. The Java platform makes our approach particularly interesting for several reasons: (1) modern Java Virtual Machines (JVM) have Just-In-Time (JIT) compilers that optimize code at runtime based on the execution pattern, a key feature that SVMs lack; (2) because of Java's continued popularity, JVMs keep improving at a faster pace than SVMs, allowing us to exploit new advances in the Java runtime in the future; (3) Java is a dynamic language, which makes it convenient to load a piece of new code on the fly. In this paper, we develop both an interpreted and a compiled query execution engine in a relational, Java-based, in-memory database prototype, and perform an experimental study. Our experimental results on the TPC-H data set show that, despite both engines benefiting from JIT, the compiled engine runs on average about twice as fast as the interpreted one, and significantly faster than an in-memory commercial system using SVM.
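The contrast between the two engines can be sketched in miniature. This is a simplification of ours: the "compiled" path below is hand-fused straight-line code standing in for runtime code generation, while the interpreted path pushes each tuple through a chain of generic operator objects, paying a virtual call per operator per tuple.

```java
// Interpreted vs. "compiled" evaluation of the same filter plan (our sketch).
public class QueryEngines {
    // Interpreted engine: a generic operator interface, one virtual call
    // per operator per tuple -- flexible, but per-tuple overhead adds up.
    interface Operator { boolean accept(int tuple); }

    static int runInterpreted(int[] tuples, Operator[] plan) {
        int matched = 0;
        outer:
        for (int t : tuples) {
            for (Operator op : plan) {
                if (!op.accept(t)) continue outer;  // tuple filtered out
            }
            matched++;
        }
        return matched;
    }

    // "Compiled" engine: the whole plan fused into query-specific
    // straight-line code that the JIT can optimize as one unit.
    static int runCompiled(int[] tuples, int lo, int hi) {
        int matched = 0;
        for (int t : tuples) {
            if (t > lo && t < hi) matched++;
        }
        return matched;
    }
}
```

The real system emits and loads such query-specific code dynamically, which is exactly where Java's dynamic class loading and JIT make the approach attractive.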
Conference Paper
If industry visionaries are correct, our lives will soon be full of sensors, connected together in loose conglomerations via wireless networks, each monitoring and collecting data about the environment at large. These sensors behave very differently from traditional database sources: they have intermittent connectivity, are limited by severe power constraints, and typically sample periodically and push immediately, keeping no record of historical information. These limitations make traditional database systems inappropriate for queries over sensors. We present the Fjords architecture for managing multiple queries over many sensors, and show how it can be used to limit sensor resource demands while maintaining high query throughput. We evaluate our architecture using traces from a network of traffic sensors deployed on Interstate 80 near Berkeley and present performance results that show how query throughput, communication costs and power consumption are necessarily coupled in sensor environments.
Article
Full-text available
We present a query architecture in which join operators are decomposed into their constituent data structures (State Modules, or SteMs), and dataflow among these SteMs is managed adaptively by an Eddy routing operator. Breaking the encapsulation of joins serves two purposes. First, it allows the Eddy to observe multiple physical operations embedded in a join algorithm, allowing for better calibration and control of these operations. Second, the SteM on a relation serves as a shared materialization point, enabling multiple competing access methods to share results, which can be leveraged by multiple competing join algorithms. Our architecture extends prior work significantly, allowing continuously adaptive decisions for most major aspects of traditional query optimization: choice of access methods and join algorithms, ordering of operators, and choice of a query spanning tree.
Article
Full-text available
Searching on the Internet today can be compared to dragging a net across the surface of the ocean. While a great deal may be caught in the net, there is still a wealth of information that is deep, and therefore, missed. The reason is simple. Most of the Web's information is buried far down on dynamically generated sites, and standard search engines never find it.
Conference Paper
Full-text available
We propose a new design for highly concurrent Internet services, which we call the staged event-driven architecture (SEDA). SEDA is intended to support massive concurrency demands and simplify the construction of well-conditioned services. In SEDA, applications consist of a network of event-driven stages connected by explicit queues. This architecture allows services to be well-conditioned to load, preventing resources from being overcommitted when demand exceeds service capacity. SEDA makes use of a set of dynamic resource controllers to keep stages within their operating regime despite large fluctuations in load. We describe several control mechanisms for automatic tuning and load conditioning, including thread pool sizing, event batching, and adaptive load shedding. We present the SEDA design and an implementation of an Internet services platform based on this architecture. We evaluate the use of SEDA through two applications: a high-performance HTTP server and a packet router for the Gnutella peer-to-peer file sharing network. These results show that SEDA applications exhibit higher performance than traditional service designs, and are robust to huge variations in load. 1
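A SEDA stage reduced to its essentials, an explicit event queue plus batch processing and a load-shedding threshold, might look like the sketch below (names and API are ours, not the SEDA codebase): the queue length is a visible load signal that a resource controller could use for admission control or pool sizing.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Minimal SEDA-style stage (our sketch): events enter an explicit queue and
// are handled in batches; overload shows up as queue growth, which triggers
// load shedding instead of overcommitting resources.
public class Stage<E> {
    private final Queue<E> queue = new ArrayDeque<>();
    private final Consumer<E> handler;
    private final int maxQueue;        // admission-control threshold

    public Stage(Consumer<E> handler, int maxQueue) {
        this.handler = handler;
        this.maxQueue = maxQueue;
    }

    // Returns false when the event is shed because the stage is overloaded.
    public boolean enqueue(E event) {
        if (queue.size() >= maxQueue) return false;   // load shedding
        queue.add(event);
        return true;
    }

    // Process up to batchSize queued events; returns how many were handled.
    public int processBatch(int batchSize) {
        int n = 0;
        while (n < batchSize && !queue.isEmpty()) {
            handler.accept(queue.poll());
            n++;
        }
        return n;
    }
}
```

In the full architecture each stage has its own thread pool and the controllers adjust pool size and batch size dynamically; this sketch keeps only the queue-centric structure.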
Article
Full-text available
The use of monitors for describing concurrency has been much discussed in the literature. When monitors are used in real systems of any size, however, a number of problems arise which have not been adequately dealt with: the semantics of nested monitor calls; the various ways of defining the meaning of WAIT; priority scheduling; handling of timeouts, aborts and other exceptional conditions; interactions with process creation and destruction; monitoring large numbers of small objects. These problems are addressed by the facilities described here for concurrent programming in Mesa. Experience with several substantial applications gives us some confidence in the validity of our solutions. Key Words and Phrases: concurrency, condition variable, deadlock, module, monitor, operating system, process, synchronization, task CR Categories: 4.32, 4.35, 5.24 1. Introduction In early 1977 we began to design the concurrent programming facilities of Pilot, a new operating system for a personal co...
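Java adopted Mesa-style monitor semantics, so the paper's questions about the meaning of WAIT survive directly in Java code: a signalled thread merely re-acquires the lock and must re-check its condition, which is why waiting always sits in a loop. A minimal bounded buffer illustrating the pattern (our example):

```java
// Mesa-style monitor in Java: wait() in a while loop, because notification
// is a hint that the condition may hold, not a transfer of control.
public class BoundedBuffer {
    private final int[] items;
    private int count, head, tail;

    public BoundedBuffer(int capacity) { items = new int[capacity]; }

    public synchronized void put(int x) throws InterruptedException {
        while (count == items.length) wait();   // re-check after every wakeup
        items[tail] = x;
        tail = (tail + 1) % items.length;
        count++;
        notifyAll();                            // signal as hint, Mesa-style
    }

    public synchronized int take() throws InterruptedException {
        while (count == 0) wait();
        int x = items[head];
        head = (head + 1) % items.length;
        count--;
        notifyAll();
        return x;
    }
}
```

An `if` in place of the `while` would be correct under Hoare-style monitors but is a bug under Mesa/Java semantics, since another thread can consume the condition between the notify and the wakeup.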
Conference Paper
This paper provides a comprehensive treatment of index management in transaction systems. We present a method, called ARIES/IM (Algorithm for Recovery and Isolation Exploiting Semantics for Index Management), for concurrency control and recovery of B+-trees. ARIES/IM guarantees serializability and uses write-ahead logging for recovery. It supports very high concurrency and good performance by (1) treating as the lock of a key the same lock as the one on the corresponding record data in a data page (e.g., at the record level), (2) not acquiring, in the interest of permitting very high concurrency, commit duration locks on index pages even during index structure modification operations (SMOs) like page splits and page deletions, and (3) allowing retrievals, inserts, and deletes to go on concurrently with SMOs. During restart recovery, any necessary redos of index changes are always performed in a page-oriented fashion (i.e., without traversing the index tree) and, during normal processing and restart recovery, whenever possible undos are performed in a page-oriented fashion. ARIES/IM permits different granularities of locking to be supported in a flexible manner. A subset of ARIES/IM has been implemented in the OS/2 Extended Edition Database Manager. Since the locking ideas of ARIES/IM have general applicability, some of them have also been implemented in SQL/DS and the VM Shared File System, even though those systems use the shadow-page technique for recovery.
Conference Paper
The exokernel operating system architecture safely gives untrusted software efficient control over hardware and software resources by separating management from protection. This paper describes an exokernel system that allows specialized applications to achieve high performance without sacrificing the performance of unmodified UNIX programs. It evaluates the exokernel architecture by measuring end-to-end application performance on Xok, an exokernel for Intel x86-based computers, and by comparing Xok's performance to the performance of two widely-used 4.4BSD UNIX systems (FreeBSD and OpenBSD). The results show that common unmodified UNIX applications can enjoy the benefits of exokernels: applications either perform comparably on Xok/ExOS and the BSD UNIXes, or perform significantly better. In addition, the results show that customized applications can benefit substantially from control over their resources (e.g., a factor of eight for a Web server). This paper also describes insights about the exokernel approach gained through building three different exokernel systems, and presents novel approaches to resource multiplexing.
Article
Several operating system services are examined with a view toward their applicability to support of database management functions. These services include buffer pool management; the file system; scheduling, process management, and interprocess communication; and consistency control.
Article
... DB2, IMS, and Tandem systems. ARIES is applicable not only to database management systems but also to persistent object-oriented languages, recoverable file systems and transaction-based operating systems. ARIES has been implemented, to varying degrees, in IBM's OS/2 Extended Edition Database Manager, DB2, Workstation Data Save Facility/VM, Starburst and QuickSilver, and in the University of Wisconsin's EXODUS and Gamma database machine.
Conference Paper
In large federated and shared-nothing databases, resources can exhibit widely fluctuating characteristics. Assumptions made at the time a query is submitted will rarely hold throughout the duration of query processing. As a result, traditional static query optimization and execution techniques are ineffective in these environments. In this paper we introduce a query processing mechanism called an eddy, which continuously reorders operators in a query plan as it runs. We characterize the moments of symmetry during which pipelined joins can be easily reordered, and the synchronization barriers that require inputs from different sources to be coordinated. By combining eddies with appropriate join algorithms, we merge the optimization and execution phases of query processing, allowing each tuple to have a flexible ordering of the query operators. This flexibility is controlled by a combination of fluid dynamics and a simple learning algorithm. Our initial implementation demonstrates promising results, with eddies performing nearly as well as a static optimizer/executor in static scenarios, and providing dramatic improvements in dynamic execution environments.
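A toy version of eddy routing is sketched below; a per-operator ticket count stands in for the paper's combination of fluid dynamics and learning, and everything here, names included, is our simplification. Operators that drop tuples earn tickets and migrate toward the front of the route, so the operator ordering adapts per tuple instead of being fixed by a static plan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Toy eddy (our simplification): route each tuple through operators in
// descending ticket order; an operator that filters a tuple earns a ticket,
// pulling selective operators toward the front of future routes.
public class Eddy {
    public static class Op {
        final String name;
        final IntPredicate pred;
        int tickets = 1;                // higher = tried earlier next time
        public Op(String name, IntPredicate pred) {
            this.name = name; this.pred = pred;
        }
    }

    private final List<Op> ops;
    public Eddy(List<Op> ops) { this.ops = new ArrayList<>(ops); }

    // Route one tuple; returns whether it survives every operator.
    public boolean route(int tuple) {
        ops.sort((a, b) -> b.tickets - a.tickets);
        for (Op op : ops) {
            if (!op.pred.test(tuple)) {
                op.tickets++;           // reward: it filtered work away early
                return false;
            }
        }
        return true;
    }
}
```

The real eddy also handles joins, moments of symmetry, and synchronization barriers; this sketch only shows the adaptive reordering of selections.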
Article
This paper describes the motivation, architecture and performance of SPIN, an extensible operating system. SPIN provides an extension infrastructure together with a core set of extensible services that allow applications to safely change the operating system's interface and implementation. These changes can be specified with finegranularity, allowing applications to achieve a desired level of performance and functionality from the system. Extensions are dynamically linked into the operating system kernel at application runtime, enabling them to access system services with low overhead. A capabilitybased protection model that relies on language and linktime mechanisms enables the system to inexpensively export fine-grained interfaces to system services. SPIN and its extensions are written in Modula-3 and run on DEC Alpha workstations. 1 Introduction SPIN is an operating system that can be dynamically specialized to safely meet the performance and functionality requirements of applic...
Article
Unlike applets, traditional systems programs written in Java place significant demands on the Java runtime and core libraries, and their performance is often critically important. This paper describes our experiences using Java to build such a systems program, namely, a high-performance web crawler. We found that our runtime, which includes a just-in-time compiler that compiles Java bytecodes to native machine code, performed well. However, we encountered several performance problems with the Java core libraries, including excessive synchronization, excessive allocation, and other inefficiencies. The paper describes the most serious pitfalls and how we programmed around them. In total, these workarounds more than doubled the speed of our crawler.
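The allocation workarounds such reports describe follow a common pattern: avoid per-item allocation on hot paths. A small, generic illustration of ours (not the crawler's code): building a string by repeated concatenation allocates a fresh String per step, while a reused StringBuilder allocates almost nothing after warm-up.

```java
// Allocation on a hot path, the slow way and the reusable way (our sketch).
public class HotPath {
    // Allocation-heavy: each += creates a new String and copies the old one,
    // so joining n parts allocates O(n) temporary objects.
    static String joinNaive(String[] parts) {
        String s = "";
        for (String p : parts) s += p + ",";
        return s;
    }

    // Allocation-light: one growable buffer, reusable across calls.
    static String joinReused(String[] parts, StringBuilder scratch) {
        scratch.setLength(0);             // reuse instead of reallocating
        for (String p : parts) scratch.append(p).append(',');
        return scratch.toString();
    }
}
```

In a server that performs such work millions of times, the difference shows up directly as garbage-collection pressure, which is the class of problem the crawler authors had to program around.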