Figure 5: Online bookstore CPU utilization (%) as a function of the number of clients for the ordering mix.
Source publication
The absence of benchmarks for Web sites with dynamic content has been a major impediment to research in this area. We describe three benchmarks for evaluating the performance of Web sites with dynamic content. The benchmarks model three common types of dynamic-content Web sites with widely varying application characteristics: an online bookstore, a...
Context in source publication
Context 1
... peak throughputs are 356, 515, and 654 interactions per minute for the browsing, shopping, and ordering mixes, respectively. Figures 3 through 5 show, for the different mixes, the average CPU utilization on the Web server and the database server as the number of clients increases. From these figures we conclude that, for all workload mixes, the CPU on the database machine is the bottleneck resource at peak throughput. ...
Similar publications
The Agrarian Service Centre plays an important role in the sustainability of agriculture by ensuring cultivation of cultivatable land in accordance with the acts of the agricultural ministry, helping farmers with irrigation water management, guiding the cultivation program according to the annual cultivation schedule, and helping farmers obtain loans from...
This paper presents an Online Interactive Competition Model for an E-learning System. The system allows a student to connect and interact with other students on the courses they offer in a semester using both synchronous and asynchronous computer-mediated communication mechanisms. Each course lecturer e-supervises and e-moderates the students' perform...
The International Telecommunication Union (ITU) published well-tested models and data set for the prediction of fading (or attenuation) due to multipath and rain based on measurements on radio links across the globe. In respect of rain attenuation, ITU released ITU-R PN.837-1 recommendation in which ITU split the globe into 15 regions according to...
Citations
... Subsequently, the power consumption and application throughput are obtained by a linear regression based on server utilization, in accordance with the model proposed by Wang et al. [52]. This simulator was validated by implementing a three-tier DC subjected to a RUBiS [53] workload. Simulation results showed that the estimated application throughput, request latency, and power consumption closely followed the empirical values. ...
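As a rough illustration of the regression approach mentioned in the excerpt above (not the actual model of Wang et al. [52]), the following sketch fits power draw as a linear function of CPU utilization; the sample data and names are hypothetical.

```python
# Minimal sketch: linear regression of server power draw on CPU
# utilization, in the spirit of the utilization-based model cited
# above. The sample data below is made up for illustration.
import numpy as np

utilization = np.array([0.10, 0.25, 0.40, 0.60, 0.80, 0.95])  # fraction busy
power_watts = np.array([112, 128, 143, 165, 186, 201])        # measured draw

# Fit power = idle_power + slope * utilization by least squares.
slope, idle_power = np.polyfit(utilization, power_watts, 1)

def predict_power(u):
    """Estimate power draw (W) at CPU utilization u in [0, 1]."""
    return idle_power + slope * u

print(f"idle ~ {idle_power:.1f} W, dynamic range ~ {slope:.1f} W")
print(f"predicted at 70% load: {predict_power(0.70):.1f} W")
```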
Cloud Computing (CC) has attracted a massive amount of research and investment over the previous decade. The economic model proposed by this technology is a viable solution for consumers as well as a profitable one for the provider. However, deploying real-world cloud experiments to test new policies/algorithms is time consuming and very expensive, especially for large scenarios. As a result, the research community has opted to test its contributions in CC simulators. Although the models proposed by these simulators are not exhaustive, each one is made to address a specific process. Alternatively, other tools provide a platform and the necessary building blocks to model any desired sub-component (application/network model, energy consumption, scheduling, and Virtual Machine provisioning). In this paper, a detailed survey of the existing CC simulators is made, discussing features, software architecture, as well as the ingenuity behind these frameworks.
... network-level attacks (Peng et al. 2007;Zargar et al. 2013). By comparison, server resources such as CPU, I/O bandwidth, database and disk throughput are becoming easier targets (Amza et al. 2002;Ranjan et al. 2006). Attackers turn towards HTTP-layer flooding attacks in which they flood services with protocol-compliant requests that require the execution of scripts, expensive database operations, or the transmission of large files. ...
We focus on the problem of detecting clients that attempt to exhaust server resources by flooding a service with protocol-compliant HTTP requests. Attacks are usually coordinated by an entity that controls many clients. Modeling the application as a structured-prediction problem allows the prediction model to jointly classify a multitude of clients based on their cohesion of otherwise inconspicuous features. Since the resulting output space is too vast to search exhaustively, we employ greedy search and techniques in which a parametric controller guides the search. We apply a known method that sequentially learns the controller and the structured-prediction model. We then derive an online policy-gradient method that finds the parameters of the controller and of the structured-prediction model in a joint optimization problem; we obtain a convergence guarantee for the latter method. We evaluate and compare the various methods based on a large collection of traffic data of a web-hosting service.
... PostgreSQL supports two-phase commit: the PREPARE TRANSACTION command ensures a transaction is stored on disk, but does not make its effects visible. A subsequent COMMIT PREPARED is guaranteed to succeed, even if the database server crashes and recovers in the meantime. This requires writing the list of locks held by the transaction to disk, so that they will persist after recovery. ...
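The two commands described in this excerpt can be driven from any client session. Below is a minimal sketch using psycopg2 with raw SQL; the connection string, table, and transaction identifier are hypothetical.

```python
# Minimal sketch of PostgreSQL two-phase commit via psycopg2.
# The DSN, table, and transaction id are hypothetical.
import psycopg2

conn = psycopg2.connect("dbname=shop")
conn.autocommit = True  # manage transaction boundaries by hand

cur = conn.cursor()
cur.execute("BEGIN")
cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = %s", (1,))
# Force the transaction (including its lock list) to disk without
# making its effects visible yet.
cur.execute("PREPARE TRANSACTION 'txn_42'")

# Later -- possibly from another session, even after a server crash
# and recovery -- committing the prepared transaction must succeed.
cur.execute("COMMIT PREPARED 'txn_42'")
```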
... We also measured the impact of SSI on application-level performance using the RUBiS web application benchmark [3]. RUBiS simulates an auction site modeled on eBay. ...
... In these benchmarks, the database server ran on a 2.83 GHz Core 2 Quad Q9550 system with 8 GB RAM and a Seagate ST3500418AS 500 GB 7200 RPM hard drive running Ubuntu 11.10. Application server load can be a bottleneck on this workload [3], so we used multiple application servers (running Apache 2.2.17 and PHP 5.3.5) so that database performance was always the limiting factor. ...
This paper describes our experience implementing PostgreSQL's new serializable isolation level. It is based on the recently-developed Serializable Snapshot Isolation (SSI) technique. This is the first implementation of SSI in a production database release, as well as the first in a database that did not previously have a lock-based serializable isolation level. We reflect on our experience and describe how we overcame some of the resulting challenges, including the implementation of a new lock manager, a technique for ensuring memory usage is bounded, and integration with other PostgreSQL features. We also introduce an extension to SSI that improves performance for read-only transactions. We evaluate PostgreSQL's serializable isolation level using several benchmarks and show that it achieves performance only slightly below that of snapshot isolation, and significantly outperforms the traditional two-phase locking approach on read-intensive workloads.
... From the estimation results in Section IV.B, we also identified that the Admin-related transactions and the Best Seller transaction are the largest CPU consumers per transaction for the Tomcat server and the MySQL server, respectively. Previous work in [18] also found the same phenomenon. As shown in Figure 9, we simulate an aggressive Tenant "T1" by increasing the number of Admin-related transactions for intervals 11-30, and then simulate an aggressive Tenant "T2" by increasing the number of Best Seller transactions for intervals 41-60. ...
... This implies that the coarse-grained strategy reacts slowly to the negative impacts of T1, and may even lead the system into danger of crashing. Note that, in Figure 11(b), the overall CPU utilization is reduced to 85% within 5 intervals (intervals 15-20). This leads to a small negative impact on the CPU utilization of the other tenants. ...
Performance isolation is a key requirement for application-level multi-tenant shared hosting environments. It requires knowledge of the resource consumption of the various tenants. It is of great importance to be aware not only of the resource consumption of a tenant's given transaction mix, but also of the resource consumption of a given transaction type. However, direct measurement of CPU resource consumption requires instrumentation and incurs overhead. Recently, regression analysis has been applied to indirectly approximate resource consumption, but challenges remain for cases with non-determinism and multicollinearity. In this work, we adapt a Kalman filter to estimate CPU consumption from easily observed data. We also propose techniques to deal with the non-determinism and multicollinearity issues. Experimental results show that the estimates agree with the corresponding measurements within acceptable error, especially with appropriately tuned filter settings. Experiments also demonstrate the utility of the approach in avoiding performance interference and CPU overloading.
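As a rough sketch of the filtering idea (not necessarily the authors' exact formulation), the per-transaction-type CPU costs can be treated as a hidden state that is observed only through aggregate CPU utilization:

```python
# Sketch: estimate per-transaction-type CPU cost from aggregate CPU
# utilization with a scalar-observation Kalman filter. The state x
# holds one cost per transaction type and is modeled as a random walk;
# each interval we observe total CPU = counts . x + noise.
# All numbers below are made up for illustration.
import numpy as np

n_types = 3
x = np.zeros(n_types)       # estimated CPU cost per transaction type
P = np.eye(n_types) * 1.0   # state covariance
Q = np.eye(n_types) * 1e-4  # process noise (costs drift slowly)
R = 0.5                     # observation noise variance

def kalman_update(counts, total_cpu):
    """One filter step: counts = transactions per type this interval,
    total_cpu = measured CPU time this interval."""
    global x, P
    H = counts.reshape(1, -1)       # observation row vector
    P_pred = P + Q                  # random-walk prediction
    S = H @ P_pred @ H.T + R        # innovation variance (1x1)
    K = (P_pred @ H.T) / S          # Kalman gain (n_types x 1)
    residual = total_cpu - H @ x
    x = x + (K * residual).ravel()
    P = (np.eye(n_types) - K @ H) @ P_pred

rng = np.random.default_rng(1)
true_cost = np.array([0.02, 0.10, 0.05])  # hidden per-type CPU seconds
for _ in range(500):
    counts = rng.integers(0, 50, n_types).astype(float)
    total = counts @ true_cost + rng.normal(0, np.sqrt(R))
    kalman_update(counts, total)
print(x)  # converges toward true_cost
```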
... LAMP is a popular open source solution used to run servers, in which PHP is configured to run on the Apache web server, using the MySQL database on the Linux operating system. It is popular because of its open source nature and low cost, and its packages are easy to install and convenient to use [15] [16] [17] [18]. Availability [19], [20], [21] is a recurring and growing concern in software-intensive systems. ...
This paper presents a methodology for attaining high availability for the demands of web clients. To improve the response time of web services during peak hours, dynamic allocation of host nodes is used in this research work. Web users are very demanding: they expect web services to be quickly accessible from anywhere in the world, 24/7. Fast response time leads to high availability of web services, while slow response time degrades their performance. With the increasing reach of the internet, it has become a part of life. People use the internet to help in their studies, business, shopping, and many other things. To achieve this objective, the LAMP platform is used, comprising Linux, Apache, MySQL, and PHP. LAMP is used to increase the quality of the product by using open source software. The proposed strategy works as a middle layer and provides high availability to web clients.
... Today's web applications are used by millions of users and demand implementations that scale accordingly. A typical system includes application logic (often implemented in web servers) and an underlying database that stores persistent state, either of which can become a bottleneck [1]. Increasing database capacity is typically a difficult and costly proposition, requiring careful partitioning or the use of distributed databases. ...
Distributed in-memory application data caches like memcached are a popular solution for scaling database-driven web sites. These systems are easy to add to existing deployments, and increase performance significantly by reducing load on both the database and application servers. Unfortunately, such caches do not integrate well with the database or the application. They cannot maintain transactional consistency across the entire system, violating the isolation properties of the underlying database. They leave the application responsible for locating data in the cache and keeping it up to date, a frequent source of application complexity and programming errors. Addressing both of these problems, we introduce a transactional cache, TxCache, with a simple programming model. TxCache ensures that any data seen within a transaction, whether it comes from the cache or the database, reflects a slightly stale but consistent snapshot of the database. TxCache makes it easy to add caching to an application by simply designating functions as cacheable; it automatically caches their results, and invalidates the cached data as the underlying database changes. Our experiments found that adding TxCache increased the throughput of a web application by up to 5.2×, only slightly less than a non-transactional cache, showing that consistency does not have to come at the price of performance.
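The programming model sketched in this abstract (designate functions as cacheable, invalidate as data changes) can be loosely illustrated with a toy decorator. This sketch captures only the interface, not TxCache's transactional-consistency machinery; all names and the versioning scheme are hypothetical.

```python
# Toy sketch of a "cacheable function" interface in the spirit of the
# model described above. Real TxCache additionally guarantees that all
# cached values seen in a transaction reflect one consistent snapshot;
# this sketch only memoizes and invalidates on a global version bump.
import functools

_cache = {}
_db_version = 0  # bumped whenever the underlying database changes

def cacheable(fn):
    """Mark a function as cacheable: results are memoized per argument
    tuple and implicitly invalidated when the database version changes."""
    @functools.wraps(fn)
    def wrapper(*args):
        key = (fn.__qualname__, args, _db_version)
        if key not in _cache:
            _cache[key] = fn(*args)
        return _cache[key]
    return wrapper

def note_database_write():
    """Call on any database write; stale entries stop being reachable."""
    global _db_version
    _db_version += 1

@cacheable
def top_sellers(category):
    # Stand-in for an expensive database query (hypothetical function).
    return [f"book-{category}-{i}" for i in range(3)]
```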
... TPC-W emulates the behaviour of database-driven websites by recreating a website for an online bookstore. We started with the Rice implementation of TPC-W [2], which uses JDBC to access its database. For each query in the TPC-W benchmark, we wrote an equivalent query using JQS and manually verified that the resulting queries were semantically equivalent to the originals. ...
Instead of writing SQL queries directly, programmers often prefer writing all their code in a general purpose programming language like Java and having their programs be automatically rewritten to use database queries. Traditional tools such as object-relational mapping tools are able to automatically translate simple navigational queries written in object-oriented code to SQL. More recently, techniques for translating object-oriented code written in declarative or functional styles into SQL have been developed. For code written in an imperative style though, current techniques are still limited to basic queries. JReq is a system that is able to identify complex query operations like aggregation and nesting in imperative code and translate them into efficient SQL queries. The SQL code generated by JReq exhibits performance comparable with hand-written SQL code.
... In order to make IBA our connection backbone instead of Ethernet, we ran all of the server applications with the SDP library (provided by Topspin) in our experiments. We used the RUBiS [18] benchmark workload to collect actual measurement data for our experiments, since it was difficult to obtain real workloads because of the proprietary nature of Internet applications. ...
Performance and power issues are becoming increasingly important in the design of large, cluster-based multi-tier data centers for supporting a multitude of services. The design and analysis of such large/complex distributed systems often suffer from the lack of availability of an adequate physical infrastructure. This paper presents a comprehensive, flexible, and scalable simulation platform for in-depth analysis of multi-tier data centers. Designed as a pluggable three-level architecture, our simulator captures all the important design specifics of the underlying communication paradigm, kernel level scheduling artifacts, and the application level interactions among the tiers of a three-tier data center. The flexibility of the simulator is attributed to its ability in experimenting with different design alternatives in the three layers, and in analyzing both the performance and power consumption with realistic workloads. The scalability of the simulator is demonstrated with analyses of different data center configurations. In addition, we have designed a prototype three-tier data center on an Infiniband Architecture (IBA) connected Linux cluster to validate the simulator. Using the RUBiS benchmark workload, it is shown that the simulator is quite accurate in estimating the throughput, response time, and power consumption. We then demonstrate the applicability of the simulator in conducting three different types of studies. First, we conduct a comparative analysis of the IBA and 10 Gigabit Ethernet (10 GigE) under different traffic conditions and with varying size clusters for understanding their relative merits in designing cluster-based servers. Second, measurement and characterization of power consumption across the servers of a three-tier data center are performed. Third, we perform a configuration analysis of the Web server (WS), Application Server (AS), and Database Server (DB) for performance optimization. We believe that such a comprehensive simulation infrastructure is critical for providing guidelines in designing efficient and cost-effective multi-tier data centers.
... Additionally, commercial tools (e.g., HP OpenView or IBM Tivoli) make it possible to inspect an abundant variety of metric data graphically, but without clear aggregation and analysis frameworks. Some discussion of a two-dimensional bottleneck characterization in RUBiS and RUBBoS was previously provided by Amza et al. [13]. However, that approach relies solely on manual human diagnosis and only targets stable bottleneck characterization with small-scale experimentation. ...
In many areas such as e-commerce, mission-critical N-tier applications have grown increasingly complex. They are characterized by non-stationary workloads (e.g., peak load several times the sustained load) and complex dependencies among the component servers. We have studied N-tier applications through a large number of experiments using the RUBiS and RUBBoS benchmarks. We apply statistical methods such as kernel density estimation, adaptive filtering, and change detection through multiple-model hypothesis tests to analyze more than 200 GB of recorded data. Beyond the usual single-bottlenecks, we have observed more intricate bottleneck phenomena. For instance, in several configurations all system components show average resource utilization significantly below saturation, but overall throughput is limited despite addition of more resources. More concretely, our analysis shows experimental evidence of multi-bottleneck cases with low average resource utilization where several resources saturate alternatively, indicating a clear lack of independence in their utilization. Our data corroborates the increasing awareness of the need for more sophisticated analytical performance models to describe N-tier applications that do not rely on independent resource utilization assumptions. We also present a preliminary taxonomy of multi-bottlenecks found in our experimentally observed data.
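The multi-bottleneck signature described here can be illustrated with synthetic data (fabricated for illustration, not the paper's measurements): two servers that saturate alternately each show a moderate average utilization, while kernel density estimation, as mentioned in the abstract, reveals bimodal utilization, and the two series are strongly anti-correlated.

```python
# Synthetic illustration of an alternating (multi-)bottleneck: each
# server's average utilization is well below saturation, yet the KDE
# of each is bimodal and the two are strongly anti-correlated.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Phase 1: server A saturates while B idles; phase 2: roles swap.
a = np.concatenate([rng.normal(95, 2, 200), rng.normal(30, 5, 200)])
b = np.concatenate([rng.normal(30, 5, 200), rng.normal(95, 2, 200)])

print(f"mean utilization: A={a.mean():.0f}%, B={b.mean():.0f}%")  # ~62% each
print(f"correlation: {np.corrcoef(a, b)[0, 1]:.2f}")              # strongly negative

# The density of A's utilization shows two modes (~30% and ~95%),
# revealing the intermittent saturation that the average hides.
xs = np.linspace(0, 100, 101)
density = gaussian_kde(a)(xs)
peaks = xs[(density > np.roll(density, 1)) & (density > np.roll(density, -1))]
print("density peaks near:", np.round(peaks))
```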
... Their conclusions concern limitations related to data caches and main memory accesses. Amza et al. [1] describe three benchmarks for evaluating the performance of Web sites with dynamic content. Furthermore, they present a performance evaluation of their implementations on contemporary commodity hardware. ...
The presence of application servers is more common than ever due to the rapid evolution of e-Business and e-Commerce services. These software engines are able to host multiple web applications, which may receive dissimilar workloads, on a single server machine. This means that diverse web applications have to share the server's limited resources. Furthermore, those applications differ in both resource requirements and performance goals. In order to properly exploit their capabilities, a thorough and accurate study must be performed to better understand the benefits and limitations of hosting various types of web applications on a single server. In this paper, we present an in-depth study of the performance of an application server in a multiprocessor environment processing representative workloads of today. We evaluate its performance using the SPECweb2005 Web server benchmark, which characterizes three real web usage patterns. For each case, we present the results obtained from the experiments performed on a multithreaded Java application server. Specifically, we have studied the server's scalability in this multiprocessor environment when running with different numbers of processors. Afterwards, we examine which are the underlying bottleneck resources for the server's performance for each of the aforementioned workloads.