Indonesian Journal of Electrical Engineering and Computer Science
Vol. 2, No. 3, June 2016, pp. 720 ~ 728
DOI: 10.11591/ijeecs.v2.i3.pp720-728
Received March 4, 2016; Revised May 13, 2016; Accepted May 28, 2016
Hybrid Disk Drive Configuration on Database Server
Virtualization
Ferdy Nirwansyah*1, Suharjito2
Magister in Information Technology, Binus Graduate Program, Bina Nusantara University, Jakarta,
Indonesia
*Corresponding author, e-mail: ferdy.nirwansyah@binus.ac.id; suharjito@binus.edu
Abstract
Solid State Drive (SSD) is a revolutionary new storage technology. Enterprise storage systems built entirely on SSDs are still very expensive, while hard disk drives (HDDs) remain widely used. This study examines hybrid storage configurations for a virtualized database server by benchmarking four hybrid storage configurations for four databases (ORACLE, SQL Server, MySQL, and PostgreSQL) on Windows Server virtualization. The TPC-C and TPC-H benchmarks are used to determine the best-performing of the four configurations tested. The results indicate that configurations with HDD as the virtual disk drive for the OS and SSD as the virtual disk drive for the database perform better as online transaction processing (OLTP) and online analytical processing (OLAP) database servers than configurations with SSD as the virtual disk drive for the OS and HDD as the virtual disk drive for the database. Based on the TPC-C research data, OLTP achieves the best performance with the HDD as the virtual disk drive for the OS and the SSD as the virtual disk drives for the database and temporary files.
Keywords: Database, High availability, Server virtualization, Hybrid storage
Copyright © 2016 Institute of Advanced Engineering and Science. All rights reserved.
1. Introduction
Databases have become an integral part of our daily life [1]. In the modern business world, databases support company operations through online transaction processing (OLTP). Databases are also used to help companies analyze data and make decisions, known as online analytical processing (OLAP). A database holds transaction data and the supporting data it needs. The greater the number of transactions in an application and its database, the more complex the application and database infrastructure the company must own [2]. Application and database infrastructure is essential for improving high availability. Performance can be improved through performance tuning in five areas: first, the server environment, such as the mainboard, processor, RAM, LAN card, and others; second, the storage environment; third, the database environment; fourth, the network environment; and fifth, the desktop computer environment [3].
Along with the development of technology, high availability can be improved with virtualization, and management of server hardware and software becomes more practical. Available features include allocating hardware resources (for example, processor, RAM, and storage) to each OS in a virtual machine (VM) and monitoring the resource allocation of each OS.
At the storage level, several types of disk are recognized: the solid state drive (SSD) and the magnetic drive (SAS and SATA), commonly called the hard disk drive (HDD). SSD is a revolutionary new storage technology with a positive impact on system and database performance. Investment in an enterprise storage system that uses full SSD storage is still very expensive; on the other hand, many enterprise storage systems still use HDDs as legacy systems.
These storage conditions influence the design of IT infrastructure in enterprise storage systems. To utilize all storage resources and obtain optimal performance, SSDs and HDDs should be combined in one configuration. This also applies to the storage configuration of a database server. This study applies a hybrid database storage configuration on a virtualization server.
To find the best configuration, benchmarks must be performed. The criteria for a good performance benchmark are as follows: first, representative; second, relevant; third, portable; fourth, scalable; fifth, verifiable; and sixth, simple. Several benchmarks have become active industry standards, the most commonly used being TPC and SPEC. This study uses TPC benchmarks for performance measurement. Because the research focuses on OLTP and OLAP databases, it uses TPC-C and TPC-H. To complete the research, performance is measured for four databases: ORACLE, SQL Server, MySQL, and PostgreSQL.
2. Related Works
Mao et al. propose HPDA, a parity-based hybrid disk array architecture that combines a group of SSDs and two HDDs to improve the performance and reliability of SSD-based storage systems. In HPDA, the SSDs (data disks) and part of an HDD (parity disk) compose a RAID4 array. Reliability analysis shows that the reliability of HPDA, in terms of Mean Time To Data Loss (MTTDL), is higher than that of pure HDD or SSD arrays. A prototype implementation and performance evaluation show that HPDA significantly outperforms both SSD-only and HDD-only storage [4].
Bassil presents a comparative study of DBMS performance. Testing covers MS SQL Server 2008, Oracle 11g, IBM DB2, MySQL 5.5, and MS Access 2010. The test executes different SQL queries, with different levels of complexity, on the five DBMSs under test. This paves the way for a head-to-head comparative evaluation showing the average execution time, memory usage, and CPU utilization of each DBMS after completion of the test. The results show that no single DBMS has the best performance overall: IBM DB2 is the fastest DBMS, MS Access has lower CPU utilization than any other DBMS, and IBM DB2 consumes the most main memory [5].
Kim et al. build a system called HybridStore that provides two components: HybridPlan and HybridDyn. HybridPlan improves capacity planning for administrators with the overall goal of operating within cost budgets. HybridDyn improves performance and lifetime guarantees during episodes of workload deviation. Testing and implementation are needed to establish the advantages and disadvantages of this framework, evaluated in terms of performance and total cost. Speed analysis shows that the hybrid store approaches the performance of full-SSD storage while staying above full-HDD storage [6].
Bausch, Petrov, and Buchmann observe the differing performance of the join algorithms available in PostgreSQL on SSD and HDD. First, point queries show a performance improvement of up to fifty times. Second, range queries perform well on the SSD. Join algorithms behave differently depending on how well they conform to the characteristics of the SSD or HDD [7].
Do et al. propose and design a systematic exploration of using SSDs to improve the performance of the DBMS buffer manager. They propose three alternatives that differ primarily in how the buffer manager handles dirty pages evicted from the buffer pool. They implement these alternatives, as well as another recently proposed algorithm (TAC), in SQL Server and perform tests using several benchmarks (TPC-C and TPC-H) at multiple scale factors. Empirical evaluation shows significant performance improvements of their methods over the HDD-only configuration (up to 9.4X), and up to a 6.8X speedup over TAC [8].
In another study, an analytical tool is presented for assessing configurations formed from a mixture of all device types. The tool is used to analyze logical volume statistics collected from 120 large production systems. The study shows that combinations of SSD, SCSI, and SATA configurations are in many cases better than using only SCSI devices in all key aspects: price, performance, and power consumption. This contrasts with other recent studies on smaller enterprise systems that are pessimistic about the advantages of SSDs in enterprise settings [9].
Jo et al. create a hybrid copy-on-write (CoW) disk storage system that combines SSD and HDD for consolidated environments. The proposed scheme places the read-only template disk image on the SSD, while write operations are isolated to the HDD, creating an efficient combination of SSD and HDD in a consolidated environment. Hybrid CoW storage is clearly beneficial in performance and cost effectiveness. The drawback is that it works at the VMware level, making it hard to measure the performance of each VM. The test results show that hybrid CoW storage performance exceeds full-HDD storage but remains below full-SSD storage [10].
Wu and Reddy build a framework by implementing a driver on Linux that manages storage capacity in a hybrid SSD and HDD storage system. The framework combines SSD and HDD in one configuration. Testing and implementation are needed to establish the advantages and disadvantages of this framework. Benchmarks of the hybrid configuration against HDD striping show performance gains that they claim can reach 50% in some cases [11].
Lee et al. examine three different SSD models from Samsung, showing how SSD technology has advanced to reverse the trend of a widening performance gap between processor and storage. The study also shows that even a single SSD can outperform a RAID 0 array of eight 15K-RPM enterprise-class disk drives in transaction throughput, cost effectiveness, and power consumption [12].
Park et al. present techniques for increasing the reliability and performance of a new SSD RAID system. First, they analyze the existing RAID mechanism for SSD arrays and then develop a RAID methodology adapted to SSD array storage. Via trace-driven simulation, they evaluate the performance of the optimized SSD array using the RAID mechanism. The proposed method improves SSD reliability by 2% over existing RAID systems and improves SSD I/O performance by 28% over existing RAID systems [13].
3. Research Method
SSD technology is revolutionizing storage and has the potential to change DBMS architectural principles [6]. However, the SSD itself is still quite expensive, while enterprise storage systems still use HDDs as legacy systems. In this context, questions arise such as:
1. How can the performance of database servers be improved with an optimal hybrid storage configuration under server virtualization?
2. How can all existing storage resources be utilized without compromising the performance of the database server?
To take advantage of all the existing hard drive resources, we use a hybrid technique. The hybrid technique configures the virtual disk drives of a Windows Server database server running under VMware server virtualization. This research applies the hybrid technique by configuring the virtual disk drives so that the OS and the database use different physical storage. The advantage of applying the hybrid technique at the virtualization level is that it is more practical and convenient than applying it at the storage level.
The steps of this study are: literature study; installation of the instruments; creation of the databases and loading of the data into the four databases for TPC-C and TPC-H testing, together with changing the virtual drive configuration scheme; collection of the test data; performance evaluation of the system configurations; and finally, conclusions and suggestions.
In the first step, the research begins by determining the background and purpose of the study and defining its scope. The literature study deepens understanding of the hybrid technique of mapping virtual disks to virtual drives on a Windows Server virtualization server. In addition, the literature study surveys the results of hybrid storage techniques applied in earlier work.
In the second step, the research instruments are installed: VMware, the storage, Windows Server, and the database software. The hardware for the study was (the usable RAID 10 capacities these drives yield are sketched after this list):
1. Intel Modular Server Chassis MFSYS25V2: 14 drive carriers, 1 GbE switch, two power supplies, two power supply fans. Node I: Intel MFS2600KI Compute Module, 2 x Intel(R) Xeon(R) E5-2660 CPU @ 2.20 GHz, 8 cores, 24 GB DDR3 RAM.
2. 2 Seagate Savvio 300 GB 10K RPM 2.5" hard drives.
3. 4 CORSAIR Force GS 240 GB 2.5" SSDs.
4. 8 Seagate Momentus 500 GB 7.2K RPM 2.5" SATA hard drives.
5. 1 Cisco SG500-28 28-port switch.
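The next step arranges these drives in RAID 10, which mirrors drive pairs and stripes across the mirrors, so usable capacity is half the raw capacity. A quick sketch of the implied usable sizes (a derived calculation; the paper does not state these figures):

```python
def raid10_usable_gb(drive_count: int, drive_size_gb: float) -> float:
    """RAID 10 usable capacity: half the raw capacity of the array."""
    assert drive_count % 2 == 0, "RAID 10 needs an even number of drives"
    return drive_count * drive_size_gb / 2

print(raid10_usable_gb(4, 240))  # SSD array: 480.0 GB usable
print(raid10_usable_gb(8, 500))  # HDD array: 2000.0 GB usable
```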
First, VMware is installed on the server. Then the 4 SSDs are set up in RAID 10 as the virtual disk drive for the Windows Server OS in a VM, and the 8 HDDs are set up in RAID 10 as the virtual disk drives for the ORACLE, SQL Server, MySQL, and PostgreSQL databases. Windows Server is then installed as a Virtual Machine (VM) on VMware, with the virtual disk drive settings using the configuration prepared above. After that, HammerDB is installed on Windows Server, followed by the four databases: ORACLE, SQL Server, MySQL, and PostgreSQL. Each database is configured so that its data is directed to the SSD as drive E, and the OS and database temporary files are directed to the SSD as drive F (a sanity check for this drive mapping is sketched after Figure 1). This preparation is done for the testing in the first research. The first research infrastructure scheme can be seen in Figure 1.
Figure 1. First research infrastructure schemes
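Before each run it is worth confirming that the database data directories and the temporary directory actually resolve to the intended drives. A minimal sanity-check sketch, assuming hypothetical per-DBMS paths (the actual paths used in the study are not given):

```python
from pathlib import Path

# Hypothetical locations; adjust to the real data directories.
EXPECTED = {
    "oracle_data":   ("E:/oradata",    "E:"),
    "mssql_data":    ("E:/MSSQL/DATA", "E:"),
    "mysql_data":    ("E:/mysql/data", "E:"),
    "postgres_data": ("E:/pgsql/data", "E:"),
    "temp_files":    ("F:/temp",       "F:"),
}

def on_expected_drive(path: str, drive: str) -> bool:
    """True if `path` resides on the given Windows drive letter."""
    return Path(path).drive.upper() == drive.upper()

for name, (path, drive) in EXPECTED.items():
    status = "ok" if on_expected_drive(path, drive) else "WRONG DRIVE"
    print(f"{name:14s} {path:15s} -> {status}")
```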
In the third step, the databases are created and the data is loaded for testing; a backup image of the Windows Server VM is also made. The fourth step is running the TPC-C and TPC-H tests. After they complete, the virtual disk drive configuration is changed by restoring the Windows Server image: the hard disks are configured with SSD for the OS and HDD for the database, and the OS and database temporary files are directed to the HDD as drive F. The VM is installed from the earlier backup image. After that, the TPC-C and TPC-H tests are run as in the previous research. The second research infrastructure scheme can be seen in Figure 2.
Figure 2. Second research infrastructure schemes
In the third research, the configuration uses HDD for the OS and SSD for the database, with the OS and database temporary files directed to the SSD as drive F. After that, the TPC-C and TPC-H tests are run as in the previous research. The third research infrastructure scheme can be seen in Figure 3.
Figure 3. Third research infrastructure schemes
In the fourth research, the configuration uses SSD for the OS and HDD for the database, with the OS and database temporary files directed to the HDD as drive F. The TPC-C and TPC-H tests are run as in the previous research. The fourth research infrastructure scheme can be seen in Figure 4.
Figure 4. Fourth research infrastructure schemes
The method used in this study, from creating the databases to generating the output, is HammerDB. Four databases are examined, i.e., ORACLE XE, SQL Server, MySQL, and PostgreSQL. Four configurations of virtual disk drives are tested (also encoded in the sketch after this list):
1. Configuration I: virtual disk drive for the OS on SSD and virtual disk drive for the database on HDD, with the Windows and database temporary files on HDD.
2. Configuration II: OS on HDD and database on SSD, with the temporary files on HDD.
3. Configuration III: OS on HDD and database on SSD, with the temporary files on SSD.
4. Configuration IV: OS on SSD and database on HDD, with the temporary files on SSD.
There are two schemes used by HammerDB: TPC-C for OLTP and TPC-H for OLAP.
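For compactness, the following sketch encodes this test matrix as Python data; the structure and names are illustrative only and are not taken from the paper's tooling.

```python
# Illustrative encoding of the test matrix; names are hypothetical.
CONFIGURATIONS = {
    "I":   {"os": "SSD", "database": "HDD", "temp": "HDD"},
    "II":  {"os": "HDD", "database": "SSD", "temp": "HDD"},
    "III": {"os": "HDD", "database": "SSD", "temp": "SSD"},
    "IV":  {"os": "SSD", "database": "HDD", "temp": "SSD"},
}
DATABASES = ["ORACLE XE", "SQL Server", "MySQL", "PostgreSQL"]
SCHEMES = {"TPC-C": "OLTP, measured in TPM", "TPC-H": "OLAP, measured in QphH"}

for name, cfg in CONFIGURATIONS.items():
    print(f"Configuration {name}: OS={cfg['os']}, "
          f"DB={cfg['database']}, temp={cfg['temp']}")
```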
Data collection records the results of the tests on the four databases with the four virtual drive configurations under the two schemes. TPC-C results are measured in transactions per minute (TPM). TPC-H results are calculated from the QphH of the 22 queries executed. For each TPC scheme, two experiments were conducted on each disk configuration and database, so this scenario comprises 4 databases × 4 configurations × 2 schemes × 2 runs = 64 experiments. TPC-C uses 5 warehouses with 10 and 50 virtual users, a ramp-up time of 30 minutes, and a test duration of 10 minutes. TPC-H uses SF 1 with 10 and 100 virtual users and 1 query set. Each test produces its results in log files. On PostgreSQL, TPC-H queries 17, 20, and 21 were excluded from the test because they take a very long time to execute.
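For reference, the QphH figure referred to here follows the composite metric of the TPC-H specification (this definition is stated from the specification, not from the paper; HammerDB's TPC-H-derived workload reports a comparable value):

```latex
% TPC-H composite query-per-hour metric (per the TPC-H specification)
\mathrm{QphH@Size} \;=\; \sqrt{\mathrm{Power@Size} \times \mathrm{Throughput@Size}},
\qquad
\mathrm{Throughput@Size} \;=\; \frac{S \times 22 \times 3600}{T_s} \times \mathrm{SF}
```

where Power@Size is 3600·SF divided by the geometric mean of the 22 query times (and 2 refresh times) in a single stream, S is the number of query streams, T_s is the measurement interval in seconds, and SF is the scale factor.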
4. Results and Analysis
From the four hard drive configurations for the OLTP database under the TPC-C scheme, a performance comparison of each configuration was obtained. On ORACLE with 10 virtual users, configuration III ranked first, followed by configuration I in second place and configurations II and IV in third and fourth. Configurations I and III both use HDD for the OS and SSD as storage for the database; the difference is that the OS and database temporary files are directed to the HDD in configuration I but to the SSD in configuration III. Configuration III raised performance by 18.87% compared with configuration I. For 50 virtual users, the order of performance is the same as for 10 virtual users, and configuration III raised performance by 119.04% over configuration I.
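The percentage gains quoted throughout this section are relative improvements between configurations. A minimal sketch of that calculation in Python, using hypothetical TPM values (the paper reports only the resulting percentages):

```python
def relative_improvement(baseline: float, candidate: float) -> float:
    """Percentage improvement of candidate over baseline."""
    return (candidate - baseline) / baseline * 100.0

# Hypothetical TPM values chosen only to reproduce the reported 18.87%.
tpm_config_i = 10_000.0
tpm_config_iii = 11_887.0
print(f"{relative_improvement(tpm_config_i, tpm_config_iii):.2f}%")  # 18.87%
```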
On SQL Server with 10 virtual users, configuration III slightly outperformed configuration I, with configurations IV and II in third and fourth. For 50 virtual users the situation is the same as for 10: configuration III slightly outperformed configuration I, followed by configurations IV and II in third and fourth. No significant performance gain of configuration III over configuration I is seen; for 50 virtual users, performance actually decreased by 12.14%.
Different conditions occur on MySQL: for 10 virtual users, configuration I outperformed configuration III with a performance difference of 16.44%, followed by configurations II and IV in third and fourth. For 50 virtual users, configuration III is superior by 5.55% compared with configuration I, followed by configurations IV and II.
On PostgreSQL with 10 virtual users, configuration III ranked first with an 18.43% performance increase over configuration I, and configurations IV and II were third and fourth. For 50 virtual users, configuration III is superior to configuration I by 21.36%, followed by configurations II and IV in third and fourth. Figure 5 shows the performance of the OLTP database under each disk configuration.
Figure 5. Graph of TPC-C per Database
From the four hard drive configurations for the OLAP database under the TPC-H scheme, a performance comparison of each configuration was obtained. On ORACLE with 10 virtual users, configuration III ranked first with a 5.64% performance increase over configuration I, and configurations IV and II were third and fourth. For 100 virtual users the order is the same: configuration III ranked first with a 14.02% performance improvement over configuration I, followed by configurations IV and II in third and fourth.
On SQL Server, configuration I outperformed configuration III for 10 virtual users with a performance difference of 2.59%, while configurations II and IV were third and fourth. For 100 virtual users, configuration I outperformed configuration III with a 4.87% difference in performance, with configurations II and IV in third and fourth.
On MySQL with 10 virtual users, configuration III outperformed configuration I by a very slight 1.73%, followed by configurations IV and II in third and fourth. For 100 virtual users, configuration I is superior to configuration III, followed by configurations IV and II.
On PostgreSQL with 10 virtual users, configuration III ranked first, followed by configuration I in second place and configurations IV and II in third and fourth. For 100 virtual users, configuration I is superior to configuration III by 3.18%; next come configurations IV and II in third and fourth. Figure 6 shows the performance of the OLAP database under each disk configuration.
Figure 6. Graph of TPC-H per Database
5. Conclusion and Future Works
Based on the TPC-C data from this research, OLTP gets the best performance with the third configuration, in which the OS uses the HDD, the database uses the SSD, and the OS and database temporary files use the SSD. On ORACLE, performance even increased by 119.04% at 50 virtual users compared with configuration I. Configurations II and IV are not recommended for use as OLTP database servers. For OLAP, the best hard drive configuration is fairly balanced between configurations I and III; further testing is required to determine which of the two performs better. Configurations II and IV are not recommended for use as OLAP database servers. SQL Server achieves the highest OLAP performance because its standard configuration parameters utilize storage, memory, and processor optimally. PostgreSQL is not recommended for use as an OLAP database because of limitations in its database engine.
To get the best performance from an OLTP database server, a hybrid configuration following configuration III can be used. For OLAP, the benchmark data for configurations I and III is inconclusive, with no significant difference observed. Further testing is needed at Scale Factor (SF) 10 or higher and with more virtual users, but such tests will require more powerful hardware than was available for this research.
The SSD-to-HDD drive ratio used in this study is 1:2. With a 1:1 ratio, configuration III would be expected to achieve an even better performance ratio over configuration I than in this research. However, these results may no longer hold if the ratio is raised to 1:3, 1:4, or more, because as the number of HDDs in the array increases, the speed of the HDD storage will increasingly offset the performance advantage of the SSD storage.
References
[1] T Connolly and C Begg, “Database Systems”. Sixth Edition, Pearson. 2015.
[2] RK Laday, H Sukoco and Y Nurhadryani. “Distributed System and Multimaster Replication Model on
Reliability Optimation Database”. TELKOMNIKA Indonesian Journal of Electrical Engineering. 2015;
13(3): 529–536.
[3] R Schiesser. “IT systems management”. (2nd ed.). Prentice Hall. 2010: 110–126.
[4] B Mao, H Jiang, S Wu, L Tian, D Feng, J Chen and L Zeng. “HPDA: A Hybrid Parity-based Disk Array for Enhanced Performance and Reliability”. ACM Trans. Storage. 2012; 8(1): 1–20.
[5] Y Bassil. “A Comparative Study on the Performance of the Top DBMS Systems”. arXiv preprint arXiv:1205.2889. 2012: 20–31.
[6] Y Kim, A Gupta, B Urgaonkar, P Berman and A Sivasubramaniam. “HybridStore: A cost-efficient,
high-performance storage system combining SSDs and HDDs”. IEEE Int. Work. Model. Anal. Simul.
Comput. Telecommun. Syst. - Proc. 2011: 227–236.
[7] D Bausch, I Petrov and A Buchmann. “On the performance of database query processing algorithms
on flash solid state disks”. Proc. - Int. Work. Database Expert Syst. Appl. DEXA. 2011: 139–144.
[8] J Do, D Zhang, JM Patel, DJ DeWitt, JF Naughton and A Halverson. “Turbocharging DBMS buffer
pool using SSDs”. Proc. 2011 Int. Conf. Manag. data - SIGMOD ’11. 2011: 1113.
[9] R Shaull, T Ron and A Littman. “Enterprise Storage Provisioning with Flash Drive”. 2010.
[10] H Jo, Y Kwon, H Kim, E Seo, J Lee and S Maeng. “SSD-HDD-hybrid virtual disk in consolidated environments”. Lecture Notes in Computer Science. 2010; 6043: 375–384.
[11] X Wu and ALN Reddy. “Managing storage space in a flash and disk hybrid storage system”. Proc. -
IEEE Comput. Soc. Annu. Int. Symp. Model. Anal. Simul. Comput. Telecommun. Syst. MASCOTS.
2009: 610–613.
[12] SW Lee, B Moon and C Park. “Advances in flash memory SSD technology for enterprise database
applications”. Proc. 35th SIGMOD Int. Conf. Manag. data SIGMOD 09. 2009; 14(3): 863–870.
[13] K Park, DH Lee, Y Woo, G Lee, JH Lee and DH Kim. “Reliability and performance enhancement
technique for SSD array storage system using RAID mechanism”. 2009 9th Int. Symp. Commun. Inf.
Technol. Isc. 2009. 2009: 140–145.