Performance Comparison: Virtual Machines and
Containers Running Artificial Intelligence
Applications
Jack D. Marquez1[0000-0002-2673-3507]* and Mario Castillo2
1Universidad Autonoma de Occidente, Cali, Valle del Cauca 760030, CO,
jdmarquez@uao.edu.co
2Servicio Nacional de Aprendizaje SENA, Cali, Valle del Cauca 76001000, CO,
mgcastillor@sena.edu.co
Abstract. With the continuous growth of data that can be valuable for companies and scientific research, cloud computing has emerged as one of the technologies that can provide the right level of computing power, and ubiquitous access to it, for many of these applications. The base technology of cloud computing is virtualization, which has evolved to provide users with features from which they can benefit. There are different types of virtualization, and each has its own way of carrying out certain processes and managing computational resources. In this paper, we present a performance comparison between virtual machines and containers, specifically between an OpenStack instance and Docker and Singularity containers, using a real artificial intelligence application as the workload. We present the results obtained and discuss them.
Keywords: Cloud Computing, Virtualization, Virtual Machines, Containers, Machine Learning
1 Introduction
With the continuous development of society and technology and the recent advances in different scientific disciplines, enormous volumes of data, so-called Big Data, are being generated. According to [1], the total global data accumulated by the year 2020 will be 44 times the value accumulated in 2009, that is, about 40 ZB, a truly large number. The authors also describe this era as the Data Technology era. As a potential business, big data is forcing companies and IT leaders to obtain value from all kinds of data [2].
The use of Cloud Computing (CC) systems is a good choice for dealing with massive data. Analytic capabilities and efficient processing power are necessary to extract value from these data, which must be stored, processed, and analyzed efficiently, posing a processing problem [3].
* Corresponding Author
The use of CC keeps growing because it is becoming a new paradigm that offers flexibility and scalability for every resource requested by the user of the service. Users can request many IT resources through CC, such as storage, servers, networks, and software, among others. They can connect to those resources from anywhere over the Internet and pay only for what they use, thanks to CC's load-balancing service and payment model.
Containers are becoming increasingly more popular than VMs, even though the latter have long been predominant for managing many applications. Containers such as Docker and Singularity are emerging as a good option because they theoretically offer lower overhead and better performance than VMs. In this work we present a performance comparison between virtual machines using OpenStack, Docker containers, and Singularity containers, all of them running an artificial intelligence application that uses a Convolutional Neural Network to identify different parts of the body, such as hands, the face, and the body itself.
The rest of the paper is organized as follows. In Section 2, we present related work, including previous works that have compared these technologies, and show our main difference from them. In Section 3, we describe the application we use to test the technologies and give some background on Convolutional Neural Networks (CNNs). In Section 4, we describe in detail the infrastructure used in this study and the experiments we performed, with their specifications and other details that guarantee reproducibility. In Section 5, we discuss the experiments performed and show the results obtained from them. Finally, we conclude in Section 6.
2 Related Work
Due to the great boom in Cloud Computing and all its components, several works have been dedicated to evaluating them. Since Cloud Computing can be applied in many areas, its components have been evaluated for different operations, as in the works presented in [4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]. We are interested in previous comparisons between virtual machines and containers, so we present the most relevant ones below.
In order to provide efficient resource sharing and concurrent execution on virtualized infrastructure, Big Data platforms offer NoSQL distributed databases [13]. This is what Shirinbab et al. evaluate on cloud computing systems: in [13], the authors compare the performance of Apache Cassandra in both virtualization environments, virtual machines and containers. They also measured the performance of Cassandra in a non-virtualized environment. Using the Cassandra-stress tool as a benchmark, they found that Docker containers meet the virtualization challenges because they package their dependencies. Docker consumed fewer resources and had less overhead than the virtual machines, but containers are less isolated than virtual machines; therefore, a bug in the kernel could affect the entire system.
In [14], the authors recognize containers as an essential technology whose use in cloud systems is increasing, due to their lightweight operation, efficiency, dependency encapsulation, and resource-sharing flexibility. Noting some gaps in the literature, they test several Spark applications on virtual machines and containers. They found that Docker containers are more efficient in the deployment and boot stages, scale better, and achieve higher CPU and memory utilization.
One of the most used frameworks to deal with Big Data in Cloud Computing is Hadoop, due to its scalability and processing power. In [15], the authors test this framework running over virtual machines and Docker containers. They validate both environments and run tests with the Teragen benchmark, which generates workloads for the Hadoop Distributed File System (HDFS). After those tests, they find that Hadoop on containers outperforms Hadoop on virtual machines, because the cluster gains speed in containers by sharing the host kernel.
Containers have proved an excellent solution in most of the previous works. The significant difference between all the mentioned works and our study is that we test a real-world machine learning application on virtual machines, Docker containers, and Singularity containers. The application is explained in the next section.
3 Machine Learning Application
Artificial Intelligence (AI) and its subfield Machine Learning are being widely used, and the technologies that support these applications keep improving as well. Generally, this kind of application is data-, memory-, network-, GPU-, or CPU-intensive, due to the number of operations it must perform in a short period. Hardware and programming-language models are evolving to reduce the execution time of these applications.
To test the performance of these platforms (Docker, Singularity, and an OpenStack virtual machine instance), we use a machine learning application that is disk- and CPU-intensive. We decided to use this application as a benchmark for the platforms because the research literature shows it has not been done before. This test, with this application, helps us identify one of the best platforms for running further machine learning applications.
The application source code can be found at https://github.com/mariocr73/ArtificialIntelligence_python.git. The application is written in Python, and the source file can be run in a Jupyter Notebook.
After training the application with more than 1300 images of the body parts to be recognized, we provide another image, and the application output looks like this:
face: 0.002709% - Body: 0.0014576% - Hand: 99.995768% - Nothing: 6.204666%
In this case, the application successfully identified a hand.
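As a rough illustration of how such a report line can be produced, the sketch below is our own simplification, not the published code: the function name and logit values are hypothetical, and we assume one independent sigmoid confidence per class, which is consistent with the percentages above not summing to 100%.

```python
import math

CLASSES = ["face", "Body", "Hand", "Nothing"]

def sigmoid(x):
    # Map a raw network output to a confidence in (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def format_prediction(logits):
    """Format one independent per-class confidence per label,
    in the style of the report line shown above."""
    parts = [f"{name}: {sigmoid(z) * 100.0:.6f}%"
             for name, z in zip(CLASSES, logits)]
    return " - ".join(parts)

# Hypothetical raw outputs for an image of a hand:
print(format_prediction([-10.5, -11.1, 10.0, -2.7]))
```

Because each class gets its own sigmoid, a confident "Hand" score does not force the other classes to zero, which matches the output format above.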
4 Experiments
Two different architectures are tested in this paper: virtual machines and containers. Each of them has its advantages and disadvantages because of its configuration and the way it executes each task.
In order to run the application described in the previous section, we carried out some configuration and used physical resources to test the three environments. Each environment has the same resources, to guarantee homogeneity and not give any advantage or disadvantage to any of them. For the container tests, we used a Dell workstation with 16 GB of RAM, four Intel Xeon @ 3.4 GHz processors, one 240 GB Solid State Drive (SSD), and Red Hat Linux 4.8.5-39 with kernel 3.10.0-1062. For the virtual machine, we created an OpenStack instance with the same resources as the machine on which we tested the containers.
For each of the platforms, we ran ten tests, each with 100 iterations.
4.1 Virtual Machine - OpenStack Instance
As mentioned by Pepple in [19], OpenStack "was created with the audacious goal of being the ubiquitous software choice for building public and private cloud infrastructures". This platform is one of the most used open-source cloud computing platforms due to its architecture and characteristics, and it supports most of the virtualization technologies on the market: Xen, KVM, LXC, and QEMU [20]. For that reason, we selected it to create a virtual machine instance and test the performance of the machine learning application on it. Since OpenStack is one of the most used technologies, readers can see how its performance compares to that of containers.
4.2 Containers
Containers are continuously gaining popularity in HPC and Cloud Computing platforms due to their countless benefits, so we decided to test the performance of two of the most used Linux containers currently: Docker [21] and Singularity [22]. Each of them has its own main characteristics, which is why we decided to test the performance of both.
The Docker containers used were version 1.13.1, with Python 3.6.8, jupyter-core 4.5.0, and TensorFlow 1.14.0. The Singularity containers used were version 3.4.1-1, with Python 3.7.4, jupyter-core 4.5.0, and TensorFlow 1.14.0.
The Dockerfile and Singularity recipe for reproducing this experiment can be found at https://github.com/mariocr73/ArtificialIntelligence_python.git.
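Before benchmarking, it is worth verifying that each environment actually provides the software versions listed above. The helper below is our own sketch, not part of the published code; it reports the interpreter version and the installed version of each named package, falling back gracefully when a package is absent.

```python
import sys
from importlib import metadata

def report_versions(packages):
    """Return a report of the Python version and the installed version
    of each package, to check that environments are comparable."""
    lines = [f"python {sys.version.split()[0]}"]
    for name in packages:
        try:
            lines.append(f"{name} {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"{name} not installed")
    return "\n".join(lines)

# For this study one would check, e.g.:
print(report_versions(["tensorflow", "jupyter-core"]))
```

Running this once inside each container and inside the OpenStack instance makes version mismatches (like the TensorFlow 2.0 issue described below) visible before any measurements are taken.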
4.3 Extra configurations
To carry out the experiments in this study, we had to take into account the infrastructure mentioned before; therefore, we configured an entire environment for containers and virtual machines.
During the configuration of the experiments, an important point was to ensure that the machine running each experiment was not executing any other task that could affect, or introduce noise into, the performance of the machine learning application running on it. Also at this stage, we ran preliminary tests to guarantee the compatibility of the platforms with the software supporting the application. Through these preliminary tests, we had to downgrade the TensorFlow version used in Singularity: it shipped with TensorFlow 2.0, in which some of the functions we use are deprecated, so we fixed this by installing version 1.14.
5 Results
We tested the machine learning application on each platform (Docker, Singularity, and the OpenStack instance) and then compared their performance. To that end, we performed ten tests, as mentioned before, to detect variations and possible factors that could affect this performance. We ran each of those tests strictly following the previously mentioned constraints, specifications, and points.
Figure 1 shows the values obtained by the machine learning application on each platform for all the tests performed. In general, we observe very similar behavior for each of the measured variables across the technologies tested. In each of the ten tests, all the platforms (containers and virtual machines) maintain equal application values without affecting the outputs of their calculations. However, experiment number 8 of the Docker containers presents behavior atypical of the others: its error rate shows the highest peak of the 30 experiments reported, and, consequently, all the accuracies of that same experiment are affected, presenting in turn the lowest values found in all the tests performed. We do not have a record of what caused this, but it may have been due to the type of image provided to the application: perhaps it was not very clear, or the object to be identified was not well located.
For the error rate (see Figure 1a), had experiment number eight of Docker not been an atypical value, Docker would have been the most constant platform, since it shows almost no variation over the ten tests. Singularity had the lowest error rate, but also the second- and third-highest values, while OpenStack varied widely, obtaining the lowest rate in the first, third, and ninth experiments.
One of the first steps when using a CNN is training the network, allowing it to learn the patterns that will later identify the images we provide to it. In Figure 1b, we can observe that, again, the only value that differs from the others is experiment number eight of the Docker containers.
Fig. 1: Obtained results for each technology over the ten tests: (a) error rate, (b) training accuracy, (c) validation accuracy, (d) test accuracy
After training the network, the next step is to validate it. The validation accuracy values (see Figure 1c) for Docker, Singularity, and OpenStack were quite close to each other. Still, at some moments one technology was better than the other two, and at others it was the worst. The Docker container almost always had one of the best values; if it was not the best, it was the second-best, except for experiment number eight, which, as we already know, is atypical. OpenStack, in most cases, had the worst validation accuracy value.
Once the neural network has been trained and validated, it is time to test it with different images and obtain the final results (see Figure 1d). For this variable, unlike the previous ones, there is variation in the values obtained during the ten experiments for each of the technologies tested. Docker's experiment number eight still has the worst value; however, Docker and Singularity had the same accuracy in the first five experiments. OpenStack was not constant in its test accuracy values; it even had the second- and third-worst accuracy values, in experiments six and four, respectively.
We observed that, in most cases, the values obtained by the application were not influenced by the technology supporting the process. Docker, Singularity, and OpenStack did not affect the outputs of the application, because these are generated by the models the application itself creates from the training inputs and the learning process it performs with them.
Containers and virtual machines do, however, affect the performance of the application, that is, its execution time (see Figure 2), due to the processes each of them must carry out to guarantee execution and the way they provide computational resources to the applications.
Figure 2 shows that the worst performance was OpenStack's, due to the overhead created by the hypervisor of the virtual machines. The hypervisor is in charge of taking the physical resources and offering them as virtual resources to every hosted virtual machine. This process takes long and is not very efficient compared to bare-metal performance, or even to the containers' performance [23].
Fig. 2: Training Time
Tables 1, 2, and 3 show the best results for Singularity, Docker, and the OpenStack instance, respectively. As mentioned before, and as can be seen in Tables 3 and 2, OpenStack had the worst performance, with an execution time of 32 minutes and 12 seconds. Comparing the containers, we observe that Docker was better than Singularity, finishing in 16 minutes and 36 seconds versus Singularity's 16 minutes and 40 seconds. This may seem a small difference, but for an application that has to work in real time, those four seconds become a significant amount. Nevertheless, Singularity was better than Docker in six of the ten experiments. This is possible because Singularity was designed to improve some processes, such as resource management and the way resources are used, giving it a slight advantage over Docker containers. The main difference between these two container technologies is that Singularity runs in the user's namespace, while Docker controls its resources using cgroup namespaces, which introduces overhead for this kind of application.
Even though Docker had the best performance in one of its experiments, as mentioned before, Singularity was better in a larger number of tests. In Table 1, we see that the accuracy values for Singularity outperform those obtained from the OpenStack instance and the Docker containers, despite Singularity's higher final error.
In Figure 3, we show, for the best Docker result, how the CNN improved in each iteration, starting from a high Mean Squared Error (MSE) and a low accuracy and ending with a low MSE and a high accuracy. The figure shows this behavior for the training and validation processes. Across the iterations, some values are high and then some are low; that is because the CNN is constantly learning and trying to become more precise.
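The qualitative behavior in Figure 3, error falling as the fit improves across iterations, can be reproduced on a toy problem. The sketch below is not the paper's CNN; it is only a one-parameter gradient-descent fit whose recorded MSE history shows the same overall improvement curve.

```python
def train_toy(iterations=100, lr=0.1):
    """Fit y = 2x with a single weight by gradient descent on MSE,
    recording the error at every iteration (a stand-in for the CNN's
    training curve in Figure 3)."""
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [2.0 * x for x in xs]
    w = 0.0
    mse_history = []
    for _ in range(iterations):
        errors = [w * x - y for x, y in zip(xs, ys)]
        mse = sum(e * e for e in errors) / len(xs)
        mse_history.append(mse)
        # Gradient of MSE with respect to w.
        grad = 2.0 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
        w -= lr * grad
    return w, mse_history

w, history = train_toy()
```

As in the figure, the error does not need to fall strictly at every step of a real training run; what matters is the overall downward trend as the model converges.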
Table 1: Results for Singularity Containers
Results
# Iterations 100
Final Rate 0.00066
Final Error 0.19381
Train Accuracy 92.7%
Validation Accuracy 100%
Test Accuracy 97.7%
Time 16’40”
Table 2: Results for Docker Containers
Results
# Iterations 100
Final Rate 0.00066
Final Error 0.01744
Train Accuracy 99.6%
Validation Accuracy 95.5%
Test Accuracy 93.3%
Time 16’36”
Table 3: Results for OpenStack Instance
Results
# Iterations 100
Final Rate 0.00066
Final Error 0.04975
Train Accuracy 97.9%
Validation Accuracy 93.3%
Test Accuracy 93.3%
Time 32’12”
Although the OpenStack instance had the worst performance, we would like to run this kind of application on other virtualization types, such as Xen or XenServer, to determine whether any of them, or KVM, has more overhead for machine learning applications.
6 Conclusions
In this paper, we have presented a performance comparison of two virtualization types, virtual machines and containers: specifically, an OpenStack instance for the virtual machine, and Docker and Singularity for the containers. According to our results, Docker achieved the best single performance, with the lowest execution time among all the experiments run on all the technologies. Nevertheless, Singularity was better than Docker and OpenStack in six of ten runs, because Docker carries overhead compared to Singularity, owing to its use of cgroup namespaces, and because Singularity makes more efficient use of the libraries the application needs.
On the other hand, the OpenStack instance had the worst performance of the three technologies in every test. This is because of the overhead caused by the hypervisor, which provides the virtual resources for each virtual machine. The time the hypervisor spends turning the physical resources of our machines into virtual ones makes this process less efficient than virtualization types that do not need this layer in their architecture.
Machine learning applications thus find in these platforms another technology on which they can be executed with good results: every platform produced good outputs, execution time aside. As future work, we want to try these machine learning applications using GPUs and the access that each virtualization type can provide to this resource.
Fig. 3: MSE and accuracy for the best Docker result: (a) MSE, (b) accuracy
Acknowledgment
We want to express our gratitude to Oscar Eduardo Castillo for his help during some of the experiments, and we are also immensely grateful to Armando Uribe Churta, instructor at SENA, who provided us with the application used to compare these three platforms.
References
1. Zhou, K., Fu, C., Yang, S.: Big data driven smart energy management: From
big data to big insights. Renewable and Sustainable Energy Reviews 56 (2016)
215–225
2. Li, H., Li, H., Wen, Z., Mo, J., Wu, J.: Distributed heterogeneous storage based
on data value. In: 2017 IEEE 2nd Information Technology, Networking, Electronic
and Automation Control Conference (ITNEC). 264–271
3. Bezerra, A., Hernandez, P., Espinosa, A., Moure, J.C.: Job scheduling in hadoop
with shared input policy and RAMDISK. 355–363
4. Bokhari, M.U., Makki, Q., Tamandani, Y.K.: A survey on cloud computing. In:
Big Data Analytics. Springer (2018) 149–164
5. Abdelfattah, A.S., Abdelkader, T., EI-Horbaty, E.S.M.: Rsam: An enhanced archi-
tecture for achieving web services reliability in mobile cloud computing. Journal of
King Saud University-Computer and Information Sciences 30(2) (2018) 164–174
6. Balmakhtar, M., Persson, C.J., Rajagopal, A.: Secure cloud computing framework
(March 26 2019) US Patent App. 10/243,959.
7. Akherfi, K., Gerndt, M., Harroud, H.: Mobile cloud computing for computation
offloading: Issues and challenges. Applied computing and informatics 14(1) (2018)
1–16
8. Lehrig, S., Sanders, R., Brataas, G., Cecowski, M., Ivanšek, S., Polutnik, J.: CloudStore — towards scalability, elasticity, and efficiency benchmarking and analysis in cloud computing. Future Generation Computer Systems 78 (2018) 115–126
9. Marvasti, M.A., Harutyunyan, A.N., Grigoryan, N.M., Poghosyan, A.: Methods
and systems to manage big data in cloud-computing infrastructures (April 17 2018)
US Patent 9,948,528.
10. Arango, C., Dernat, R., Sanabria, J.: Performance evaluation of container-based
virtualization for high performance computing environments. arXiv preprint
arXiv:1709.10140 (2017)
11. Seo, K.T., Hwang, H.S., Moon, I.Y., Kwon, O.Y., Kim, B.J.: Performance compar-
ison analysis of linux container and virtual machine for building cloud. Advanced
Science and Technology Letters 66(105-111) (2014) 2
12. Sharma, P., Chaufournier, L., Shenoy, P., Tay, Y.: Containers and virtual machines
at scale: A comparative study. In: Proceedings of the 17th International Middleware
Conference, ACM (2016) 1
13. Shirinbab, S., Lundberg, L., Casalicchio, E.: Performance evaluation of container
and virtual machine running cassandra workload. In: 2017 3rd International Con-
ference of Cloud Computing Technologies and Applications (CloudTech), IEEE
(2017) 1–8
14. Zhang, Q., Liu, L., Pu, C., Dou, Q., Wu, L., Zhou, W.: A comparative study
of containers and virtual machines in big data environment. In: 2018 IEEE 11th
International Conference on Cloud Computing (CLOUD), IEEE (2018) 178–185
15. Singh, A., Gouthaman, P., Bagla, S., Dey, A.: Comparative study of Hadoop over containers and Hadoop over virtual machine. International Journal of Applied Engineering Research 13 (2018) 4373–4378
16. Auliya, Y., Nurdinsyah, Y., Wulandari, D.: Performance comparison of docker and
lxd with apachebench. In: Journal of Physics: Conference Series. Volume 1211.,
IOP Publishing (2019) 012042
17. Gillani, K., Lee, J.H.: Comparison of linux virtual machines and containers for a
service migration in 5g multi-access edge computing. ICT Express (2019)
18. Poojara, S.R., Ghule, V.B., Birje, M.N., Dharwadkar, N.V.: Performance Anal-
ysis of Linux Container and Hypervisor for Application Deployment on Clouds.
In: 2018 International Conference on Computational Techniques, Electronics and
Mechanical Systems (CTEMS). (December 2018) 24–29
19. Pepple, K.: Deploying OpenStack. O'Reilly Media, Inc. (2011)
20. Endo, P.T., Gonçalves, G.E., Kelner, J., Sadok, D.: A survey on open-source cloud computing solutions. In: Brazilian Symposium on Computer Networks and Distributed Systems. Volume 71. (2010)
21. Merkel, D.: Docker: lightweight linux containers for consistent development and
deployment. Linux Journal 2014(239) (2014) 2
22. Kurtzer, G.M., Sochat, V., Bauer, M.W.: Singularity: Scientific containers for
mobility of compute. PloS one 12(5) (2017) e0177459
23. Li, Z., Kihl, M., Lu, Q., Andersson, J.A.: Performance overhead comparison be-
tween hypervisor and container based virtualization. In: 2017 IEEE 31st Interna-
tional Conference on Advanced Information Networking and Applications (AINA),
IEEE (2017) 955–962