e-ISSN: 2582-5208
International Research Journal of Modernization in Engineering Technology and Science
( Peer-Reviewed, Open Access, Fully Refereed International Journal )
Volume:06/Issue:10/October-2024 Impact Factor- 8.187 www.irjmets.com
LEVERAGING UNSUPERVISED LEARNING FOR WORKLOAD BALANCING
AND RESOURCE UTILIZATION IN CLOUD ARCHITECTURES
Rajesh Daruvuri*1, Kiran Kumar Patibandla*2, Pravallika Mannem*3
*1University of the Cumberlands, USA
*2Visvesvaraya Technological University (VTU), India
*3ProBPM, Inc, USA
Corresponding Author Emails: venkatrajesh.d@gmail.com (R.D.), Kirru.patibandla@gmail.com (K.P.), Pravi.sit05@gmail.com (P.M.)
DOI: https://www.doi.org/10.56726/IRJMETS62304
ABSTRACT
Cloud computing architectures owe much of their popularity to their scalability and cost efficiency. However, they bring their own challenges in workload scheduling and resource utilization, because virtual machines (VMs) and applications must share resources such as servers and storage. Traditional strategies for workload balancing and resource management rely on manual configuration or simplistic heuristics that do not optimize resource usage and performance effectively. In this technical brief, we propose an approach built on unsupervised learning techniques that detects usage patterns and improves resource utilization, yielding both optimal performance and an automatically balanced workload among VMs. We use clustering algorithms to group similar workloads and then allocate resources to each group based on its demand, so that resources are used more effectively and resource exhaustion is avoided. We also integrate anomaly detection methods into the system to identify and handle abnormal behavior in both resource monitoring and resource placement. Experiments on traces from production workloads demonstrate the benefits of our approach, showing marked improvements in workload balancing and resource utilization over current practice.
Keywords: Virtual Machines, Resource Management, Scalability, Simple Heuristics, Workload Balancing.
I. INTRODUCTION
Workload balancing assigns each task, the unit of work, to the resources best able to serve it, and optimizing resource utilization is an integral part of cloud computing: it allows resources to be used efficiently, which in turn improves system performance [1]. In cloud architectures, servers and resources are shared among users and applications. Workload balancing ensures that load is spread evenly across the infrastructure and that workloads make use of all available resources, dynamically adjusting the resources of servers as the load on them changes [2]. Workload balancing can be performed in several ways, including load balancing, auto-scaling, and resource pooling. Load balancing distributes requests among servers based on their current load, so that no server is overwhelmed and all servers are utilized. Auto-scaling enables the system to dynamically add or remove resources according to the current workload, so that it can cope with varying demand [4]. In resource pooling, resources are grouped into pools and then allocated to applications on demand. Balancing the workload in this way makes cloud architectures capable of achieving high resource utilization [5]. It leads to optimal use of the available resources, so that waste is avoided and costs can be controlled [6]. It also prevents the system from being overloaded and failing, allowing all applications to operate without issues. In addition, workload balancing increases system performance [7]: spreading the workload across as many available resources as possible decreases the likelihood that any one server becomes overwhelmed, so every application gets access to resources and average processing and response times are faster [8]. Thus, workload balancing plays a major role in enabling cloud architectures to utilize resources efficiently, maintain high performance, and prevent system downtime [9]. Given the growing popularity of cloud computing, choosing the right workload balancing techniques plays a crucial role in delivering efficiency and performance benefits for businesses operating in the cloud [10]. The main contributions of this research are as follows:
· Optimized resource allocation: A significant advantage of workload balancing and resource utilization in cloud architectures is the ability to optimize resource allocation. Workloads and resources are distributed more effectively across the cloud, so that each server is used efficiently through techniques such as load balancing and resource consolidation. Because fewer resources sit idle, utilization improves and costs for cloud providers are lowered.
· Performance enhancement: Balancing the workload between servers and managing resource utilization enhances the performance of cloud systems, allowing them to handle higher traffic volumes and serve end users better. This in turn improves user experience and satisfaction with the cloud services, making them more trustworthy and competitive in the market.
· Reduced downtime and overload: Another key contribution of this mechanism within cloud architectures is its capacity to minimize downtime and overload. By effectively balancing workloads and resources, it can avoid server failure and overloading, which in turn prevents downtime and service interruptions. It also ensures that no single server is burdened with too much work, enhancing the performance and stability of the cloud. This is especially important for mission-critical systems and services that must remain online and operational 24×7.
The remainder of the paper is organized as follows. Section II reviews recent work related to the research, Section III describes the proposed model, Section IV presents the results and comparative analysis, and Section V concludes the paper and outlines the future scope of the research.
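As a rough illustration of the load-balancing idea described above, the following sketch assigns each incoming task to the currently least-loaded server; the server names and load values are hypothetical and only serve to make the mechanism concrete.

```python
# Minimal sketch of least-loaded dispatch; server names and load values are
# illustrative placeholders, not measurements from this paper.
servers = {"vm-1": 0.20, "vm-2": 0.55, "vm-3": 0.35}  # current CPU load (0..1)

def assign_task(servers, task_load):
    # Pick the server with the lowest current load and account for the new task.
    target = min(servers, key=servers.get)
    servers[target] += task_load
    return target

for load in [0.10, 0.25, 0.05]:
    print(assign_task(servers, load), servers)
```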
II. RELATED WORKS
Nawrocki, P., et al. [11] have discussed optimizing the use of cloud computing resources. This involves
analyzing large sets of data through exploratory data analysis and utilizing machine learning techniques to
identify patterns and trends. This helps to efficiently allocate resources and improve cloud computing's overall
performance and cost-effectiveness. Kumar, J., et al. [12] proposed a self-directed learning-based workload forecasting model that utilizes machine learning algorithms to predict resource demand in a cloud environment.
This enables efficient resource management by allocating resources based on expected future workloads. It
considers various factors, such as historical data, workload patterns, and user preferences, to make accurate
predictions. Dhakal, A., et al.[13] have discussed Edge clouds, geographically distributed computing systems
that bring processing and storage capabilities closer to the network's edge. Using edge computing and machine
learning algorithms reduces the latency and network congestion, improving the efficiency of machine learning
tasks. This enables faster decision-making and real-time analysis, making edge clouds ideal for high-
performance and low-latency machine learning applications. Patel, Y. S., et al. [14] have discussed an approach
which utilizes deep learning techniques to predict resource utilization patterns in green cloud data centers. By
identifying hotspots and coldspots where resources are either over-utilized or under-utilized, solutions can be
implemented to optimize resource usage, improve efficiency, and reduce energy consumption for a more
sustainable cloud computing environment. Preuveneers, D., et al. [15] have discussed how machine learning
models in smart environments require significant computational power and data storage to train and execute.
This can lead to performance trade-offs such as longer execution times and increased energy consumption.
Balancing these trade-offs is important for optimizing the effectiveness and efficiency of these models in smart
environments. Priyadarshini, S., et al.[16] have discussed how the utilization of Artificial Intelligence (AI) and
Machine Learning (ML) in cloud computing can improve the security and scalability of workloads. By analyzing
data patterns, AI/ML can identify potential security risks and optimize workload distribution, improving the
cloud environment's security and scalability. Meyer, V., et al. [17] have discussed a classification scheme that uses machine learning techniques to classify dynamic interference patterns and allocate resources accordingly in cloud infrastructures. The scheme analyzes the interference environment continuously and adjusts resource allocations in
real time to optimize network performance and reduce interference. This approach improves resource usage
efficiency and enhances QoS for network users. Soni, D., et al. [18] have discussed how machine learning
techniques, such as supervised and unsupervised learning, are being increasingly integrated into emerging
cloud computing paradigms to leverage the benefits of big data analytics, real-time processing, and predictive
models. This integration allows for better resource allocation, improved decision-making, and enhanced
efficiency in the management of cloud services. Ilager, S. [19] has discussed machine learning-based
energy and thermal efficient resource management algorithms, which use artificial intelligence techniques to
optimize energy consumption and thermal efficiency in various systems and devices. These algorithms learn
from data and make decisions to minimize energy usage while ensuring optimal performance and thermal
comfort. They are used in smart homes, buildings, and industrial processes to reduce energy costs and improve
sustainability. Noel, R. R., et al. [20] have discussed self-managing cloud storage with reinforcement learning, a recent approach that uses machine learning techniques to automate the process of maintaining and optimizing cloud storage resources. By continuously learning from user behavior and system performance, such a system can make proactive decisions to improve storage efficiency and reduce costs.
III. PROPOSED MODEL
In this paper, we introduce a model that uses unsupervised learning to optimize resource utilization and to balance the workload generated by data processing and user requests, aiming at effective and efficient operation of cloud architectures. The model is divided into three components: data collection, unsupervised learning, and workload balancing. The data collection component gathers data from sources such as cloud service providers, server logs, and user behavior.
Employing machine learning methods to improve the efficiency and effectiveness of resource allocation, the optimization is formalized as

L(v) = p(Z(v), S(v), N(v))                                    (1)

where Z(v), S(v), and N(v) capture the workload, service-quality, and resource characteristics associated with v.
Machine learning models can learn the complex relationship between workload, service quality requirements,
and resources by analyzing historical data.
Data on resource utilization, workload patterns, and user behavior are monitored. The unsupervised learning component of the model then examines these recorded data and, using clustering algorithms, infers patterns and relationships between the variables.
Response time is the amount of time it takes for a service to respond to a specific type of request.
AvgRT = TRT / N                                    (2)

where AvgRT is the average response time, TRT denotes the total response time, and N is the total number of requests.
This allows for grouping resources, workloads, and users based on their similarities, which can then be used for
workload balancing. Finally, the workload balancing component uses the insights from unsupervised learning
to dynamically allocate and adjust resources to achieve a balanced and optimized utilization of resources.
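A minimal sketch of this clustering step is shown below, assuming workload observations are available as per-VM CPU and memory utilization vectors; the feature choice, cluster count, and values are assumptions for illustration rather than the paper's exact configuration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Each row is one VM: [mean CPU utilization, mean memory utilization].
# The values are illustrative placeholders for monitored data.
X = np.array([
    [0.82, 0.75], [0.78, 0.70],   # heavy workloads
    [0.35, 0.40], [0.30, 0.45],   # moderate workloads
    [0.05, 0.10], [0.08, 0.12],   # light / mostly idle workloads
])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Treat each cluster's mean demand as the basis for group-level allocation,
# a simple stand-in for the demand-based allocation described above.
for k, center in enumerate(kmeans.cluster_centers_):
    members = np.where(labels == k)[0].tolist()
    print(f"cluster {k}: VMs {members}, mean demand {center.round(2)}")
```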
The throughput of a service is the maximum speed at which requests may be processed.
Throughput = N / T                                    (3)

where N is the total number of requests served and T is the length of the observation interval.
QoS metrics include functions that describe how throughput varies with load intensity and the maximum
throughput.
A system’s or service’s availability is the percentage of time it is operational and accessible to users.
Availability = Uptime / (Uptime + Downtime) × 100                                    (4)
Availability is usually expressed as a percentage of uptime. For services targeted at mission-critical applications, ensuring availability with no loss of service is the highest priority; deploying them across a region or worldwide minimizes downtime and service interruptions, which makes the user experience seamless.
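The QoS metrics in Equations (2)-(4) can be computed directly from monitored request logs; the following sketch uses placeholder numbers rather than measured values.

```python
# Hedged sketch of the QoS metrics in Eqs. (2)-(4); all numbers are placeholders.
total_response_time = 12.5     # TRT: summed response time over all requests (s)
num_requests = 50              # N: total number of requests
window = 10.0                  # observation interval length (s)
uptime, downtime = 719.0, 1.0  # hours of operation vs. outage

avg_rt = total_response_time / num_requests            # Eq. (2)
throughput = num_requests / window                      # Eq. (3), requests/s
availability = uptime / (uptime + downtime) * 100.0     # Eq. (4), percent

print(f"AvgRT={avg_rt:.3f}s  Throughput={throughput:.1f}/s  "
      f"Availability={availability:.2f}%")
```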
Examples include moving resources away from idle servers to those facing greater demand, or using serverless computing for some workloads. The model thus provides automated load balancing and resource utilization for efficient and cost-effective cloud resource management, and it can improve the overall performance of cloud architectures over time as it continuously adapts to changing workload patterns and user behavior.
3.1. Construction
The evolution of cloud computing has had a major impact on how large-scale applications are designed, developed, and deployed. For cloud architects, best practice is to use cost-efficient resources and to balance the workload across different nodes.
The system should be designed to satisfy a service level objective representing the worst acceptable value for its average response time, throughput, and availability.

ObjectiveFunction = max(AvgRT, Throughput, Availability)                                    (5)

This provides a way to ensure that the system is properly balanced and that no single performance metric falls below a given threshold.
This is important for maintaining high performance while keeping costs down for clients. Unsupervised learning techniques have emerged as an effective tool for addressing these challenges.
Each neuron acts as a convolutional kernel; when the kernel is symmetric, the convolution turns into a correlation operation:

f_k^l(p, q) = Σ_c Σ_(x, y) i_c(x, y) · e_k^l(u, v)                                    (6)

Dividing the input into small patches in this way is one means of extracting feature motifs through convolution.
Cloud architecture encompasses the various design decisions, including the specific technologies, software components, and system configuration, that make a running deployment of an application on the cloud possible. The proposed model looks at historical data and, using ML algorithms such as unsupervised learning, identifies important patterns in resource utilization and workload flow. Fig. 1 shows the construction of the proposed model.
Fig 1: Construction of the proposed model
This information can then be used to optimize resource allocation and workload balancing, leading to improved
performance and cost savings. One example of unsupervised learning in cloud architectures is clustering
algorithms.
In Equation (6), i_c(x, y) is an element of the input data tensor I_c, which is multiplied element-wise by the entry e_k^l(u, v) of the k-th convolutional kernel of the l-th layer. The resulting responses are collected into

F_k^l = [f_k^l(1, 1), ..., f_k^l(p, q), ..., f_k^l(P, Q)]                                    (7)
Equation (7) denotes the feature map yielded by the k-th convolutional operation.
These clustering algorithms can aggregate nodes that have similar resource consumption and workload behavior, yielding proper load balancing and efficient use of resources. Unsupervised learning is also used in auto-scaling, which increases or decreases the amount of resources according to demand.
Convolution operations generate feature motifs, which can appear anywhere in the image.

(8)

Once features are extracted, this step ensures that a) no information of interest is lost and b) the estimated position of each feature relative to the other features is preserved after extraction.
Using workload patterns and resource utilization data, the system can continuously adjust resource allocation to meet demand without over- or under-provisioning in response to a rapid increase or decrease in workload. In summary, much of the resource utilization and workload balancing in cloud architectures can be improved by using unsupervised learning techniques. In addition, cloud architects can collect large amounts of data and patterns that help them make more informed decisions, so their systems perform better at lower cost.
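One way to act on such insights, sketched here under the assumption that per-VM load estimates are already available from the clustering step, is to shift load from the most utilized node to the least utilized one until both sit near a target threshold; the threshold, step size, and loads are illustrative.

```python
# Illustrative rebalancing step; threshold, step size, and loads are assumptions.
def rebalance(loads, target=0.70, step=0.05):
    loads = dict(loads)
    # Move small amounts of load from the hottest VM to the coldest VM
    # until no VM exceeds the target or no VM has spare capacity left.
    while max(loads.values()) > target and min(loads.values()) < target:
        hot = max(loads, key=loads.get)
        cold = min(loads, key=loads.get)
        moved = min(step, loads[hot] - target)
        loads[hot] -= moved
        loads[cold] += moved
    return loads

print(rebalance({"vm-a": 0.92, "vm-b": 0.40, "vm-c": 0.55}))
```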
3.2. Operating principle
Cloud infrastructures are intricate and constantly changing, as they allow numerous diverse applications to run in isolated virtual machines. It is therefore important that the workload is handled properly so that a good balance of performance and cost is maintained.
The activation function helps identify complex patterns and serves as a decision-making function:

T_k^l = g_a(F_k^l)                                    (9)

where g_a is the activation function, which applies a nonlinearity to the convolution output F_k^l to produce the transformed result T_k^l.
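For completeness, a tiny NumPy sketch of the feature-map and activation computations referenced in Equations (6)-(9) is given below; the input patch, kernel, and choice of ReLU are toy assumptions, not values from the paper.

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Valid cross-correlation, which Eq. (6) reduces to for a symmetric kernel.
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for p in range(out.shape[0]):
        for q in range(out.shape[1]):
            out[p, q] = np.sum(image[p:p + kh, q:q + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)   # toy input patch
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # toy 2x2 kernel
feature_map = conv2d_valid(image, kernel)          # Eqs. (6)-(7): one feature map
activated = np.maximum(feature_map, 0.0)           # Eq. (9): ReLU nonlinearity
print(activated)
```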
This is where leveraging unsupervised learning comes into the picture. Unsupervised learning works by finding structure and relationships in data without an explicit externally given task. When working with cloud architectures, this means collecting metrics on CPU and memory use, network traffic, and application performance. Fig. 2 shows the operating principle of the proposed model.
Fig 2: Operating principle of the proposed model
This data is then fed to an unsupervised learning algorithm, where clustering and anomaly detection techniques are used to learn patterns and trends. This enables similar resources and workloads to be grouped, which in turn identifies idle or underutilized resources and load imbalance across different virtual machines. Management and the SRE team can then use these insights to streamline resources and split overloaded workloads in order to improve performance.
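A hedged sketch of this anomaly-detection step is shown below, using an Isolation Forest as one possible unsupervised detector; the metric values and contamination rate are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Rows: [CPU %, memory %, network MB/s] per VM per interval; placeholder values.
metrics = np.array([
    [35, 40, 5], [38, 42, 6], [33, 39, 5],
    [36, 41, 5], [37, 40, 6], [95, 97, 48],   # last row: abnormal spike
], dtype=float)

detector = IsolationForest(contamination=0.2, random_state=0)
flags = detector.fit_predict(metrics)   # -1 marks suspected anomalies

anomalous_intervals = np.where(flags == -1)[0].tolist()
print("anomalous intervals:", anomalous_intervals)
```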
The result of the convolution in the equations above is passed, together with a bias, to an activation function, which introduces nonlinearity and generates the output changes. In the optimization step, y is the falcon location with respect to the total population A for every dimension,

Of = 0.1 · Max_rb                                    (10)

and the speed is generated randomly within the limits t_Max and t_Min.
Anomaly detection algorithms, meanwhile, can identify unusual resource utilization or traffic patterns that suggest problems or inefficiencies. Unsupervised learning also plays a role in predictive resource allocation: with the help of historical data, the algorithm can learn how workloads behave and how resources are utilized, and predict future needs so that resources can be provisioned in advance. Combining sound architecture principles with novel approaches to unsupervised learning on streaming data opens up opportunities for intelligent resource utilization and improved workload balancing. This not only provides better performance but also saves money by avoiding the provisioning of unnecessary resources.
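The predictive-provisioning idea can be sketched very simply, for instance with a moving-average forecast plus a safety margin; the window size, headroom factor, and utilization history below are assumptions, not values from the paper.

```python
# Naive predictive provisioning sketch: forecast next-interval demand as a
# moving average of recent utilization and provision with some headroom.
history = [0.42, 0.45, 0.50, 0.58, 0.63, 0.66]   # recent CPU utilization (0..1)
window, headroom = 3, 1.2                         # assumed parameters

forecast = sum(history[-window:]) / window
provision = min(1.0, forecast * headroom)         # capacity to reserve in advance
print(f"forecast={forecast:.2f}, provision={provision:.2f}")
```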
IV. RESULTS AND DISCUSSION
The proposed model, RLWBRA (Reinforcement Learning for Workload Balancing and Resource Utilization in Cloud Architectures), has been compared with the existing TUBRA (Task-based Unsupervised Balancing and Resource Allocation in Cloud Architectures), ULCARA (Unsupervised Learning for Cloud Resource Allocation and Workload Balancing), and WUCLBA (Workflow-based Unsupervised Cloud Load Balancing and Allocation) models.
4.1. Clustering accuracy: This refers to how accurately the unsupervised learning algorithm groups the workloads and resources it balances. Higher clustering accuracy indicates better performance in identifying and grouping similar workloads and resources. Fig. 3 shows the comparison of clustering accuracy.
Fig 3: Comparison of Clustering accuracy
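When ground-truth groupings are not available, clustering quality can be approximated with an internal measure such as the silhouette score; the brief sketch below uses synthetic data rather than the traces evaluated in Fig. 3.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Two synthetic workload groups centered at different utilization levels.
X = np.vstack([rng.normal(0.2, 0.05, (20, 2)),
               rng.normal(0.8, 0.05, (20, 2))])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("silhouette score:", round(silhouette_score(X, labels), 3))
```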
4.2. Resource utilization improvement: This parameter measures the improvement achieved using the
unsupervised learning algorithm for workload balancing. It can be calculated by comparing the resource
utilization before and after implementing the algorithm. Fig.4 shows the Comparison of Resource utilization
improvement
Fig 4: Comparison of Resource utilization improvement
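The paper does not state the exact formula, but one natural way to quantify this improvement, given here as an assumption, is the relative change in mean utilization:

Improvement (%) = (U_after − U_before) / U_before × 100

where U_before and U_after denote the mean resource utilization before and after applying the algorithm.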
4.3. Scalability: This refers to the algorithm's ability to handle a large number of workloads and resources
efficiently. The algorithm's performance should not deteriorate significantly as the number of workloads and
resources increases. Fig.5 shows the Comparison of Scalability
Fig 5: Comparison of Scalability
4.4. Convergence time: This parameter measures the algorithm's time to reach a stable solution. A shorter
convergence time indicates faster and more efficient performance balancing workloads and resources. Fig.6
shows the Comparison of Convergence time
Fig 6: Comparison of Convergence time
V. CONCLUSION
To sum up, unsupervised learning techniques are an important tool in many aspects of optimizing workload balancing and resource utilization within cloud architectures. Using algorithms such as k-means clustering and anomaly detection, the system can analyze data, decide where resources should go, and balance workloads. This results in better performance and efficiency, saving resources and providing a smoother overall user experience. Additionally, unsupervised learning helps the system scale with workload spikes, adapt to them, and catch outliers before they become a real problem for resource management. In general, the introduction of unsupervised learning in cloud architectures can bring large improvements in scalability, reliability, and cost efficiency across a wide range of systems. Further research and innovation in this space can build on these results, delivering even more value to cloud service providers and their customers.
VI. REFERENCES
[1] Desai, B., & Patil, K. (2023). Reinforcement Learning-Based Load Balancing with Large Language
Models and Edge Intelligence for Dynamic Cloud Environments. Journal of Innovative Technologies,
6(1), 1-13.
[2] Duc, T. L., Leiva, R. G., Casari, P., & Östberg, P. O. (2019). Machine learning methods for reliable resource
provisioning in edge-cloud computing: A survey. ACM Computing Surveys (CSUR), 52(5), 1-39.
[3] Suleiman, B., Fulwala, M. M., & Zomaya, A. (2023, July). A framework for characterizing very large cloud
workload traces with unsupervised learning. In 2023 IEEE 16th International Conference on Cloud
Computing (CLOUD) (pp. 129-140). IEEE.
[4] Kaur, A., Kaur, B., Singh, P., Devgan, M. S., & Toor, H. K. (2020). Load balancing optimization based on
deep learning approach in cloud environment. International Journal of Information Technology and
Computer Science, 12(3), 8-18.
[5] Goodarzy, S., Nazari, M., Han, R., Keller, E., & Rozner, E. (2020, December). Resource management in
cloud computing using machine learning: A survey. In 2020 19th IEEE International Conference on
Machine Learning and Applications (ICMLA) (pp. 811-816). IEEE.
[6] Rajawat, A. S., Goyal, S. B., Kumar, M., & Malik, V. (2024). Adaptive resource allocation and optimization
in cloud environments: Leveraging machine learning for efficient computing. In Applied Data Science
and Smart Systems (pp. 499-508). CRC Press.
[7] Khan, A. R. (2024). Dynamic Load Balancing in Cloud Computing: Optimized RL-Based Clustering with
Multi-Objective Optimized Task Scheduling. Processes, 12(3), 519.
[8] Saxena, D., Kumar, J., Singh, A. K., & Schmid, S. (2023). Performance analysis of machine learning
centered workload prediction models for cloud. IEEE Transactions on Parallel and Distributed Systems,
34(4), 1313-1330.
[9] Alqahtani, D. (2023). Leveraging sparse auto-encoding and dynamic learning rate for efficient cloud
workloads prediction. IEEE Access, 11, 64586-64599.
[10] Bao, Y., Peng, Y., & Wu, C. (2022). Deep learning-based job placement in distributed machine learning
clusters with heterogeneous workloads. IEEE/ACM Transactions on Networking, 31(2), 634-647.
[11] Nawrocki, P., & Smendowski, M. (2024). Optimization of the Use of Cloud Computing Resources Using
Exploratory Data Analysis and Machine Learning. Journal of Artificial Intelligence and Soft Computing
Research, 14(4), 287-308.
[12] Kumar, J., Singh, A. K., & Buyya, R. (2021). Self directed learning based workload forecasting model for
cloud resource management. Information Sciences, 543, 345-366.
[13] Dhakal, A., Kulkarni, S. G., & Ramakrishnan, K. K. (2020, November). ECML: Improving efficiency of
machine learning in edge clouds. In 2020 IEEE 9th International Conference on Cloud Networking
(CloudNet) (pp. 1-6). IEEE.
[14] Patel, Y. S., Jaiswal, R., & Misra, R. (2022). Deep learning-based multivariate resource utilization
prediction for hotspots and coldspots mitigation in green cloud data centers. The Journal of
Supercomputing, 78(4), 5806-5855.
[15] Preuveneers, D., Tsingenopoulos, I., & Joosen, W. (2020). Resource usage and performance trade-offs
for machine learning models in smart environments. Sensors, 20(4), 1176.
[16] Priyadarshini, S., Sawant, T. N., Bhimrao Yadav, G., Premalatha, J., & Pawar, S. R. (2024). Enhancing
security and scalability by AI/ML workload optimization in the cloud. Cluster Computing, 1-15.
[17] Meyer, V., Kirchoff, D. F., Da Silva, M. L., & De Rose, C. A. (2021). ML-driven classification scheme for
dynamic interference-aware resource scheduling in cloud infrastructures. Journal of Systems
Architecture, 116, 102064.
[18] Soni, D., & Kumar, N. (2022). Machine learning techniques in emerging cloud computing integrated
paradigms: A survey and taxonomy. Journal of Network and Computer Applications, 205, 103419.
[19] Ilager, S. (2021). Machine Learning-based Energy and Thermal Efficient Resource Management
Algorithms for Cloud Data Centres (Doctoral dissertation, University of Melbourne).
[20] Noel, R. R., Mehra, R., & Lama, P. (2019, June). Towards self-managing cloud storage with
reinforcement learning. In 2019 IEEE International Conference on Cloud Engineering (IC2E) (pp. 34-
44). IEEE.
Cloud computing offers Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) to provide compute, network, and storage capabilities to the clients utilizing the pay-per-use model. On the other hand, Machine Learning (ML) based techniques are playing a major role in effective utilization of the computing resources and offering Quality of Service (QoS). Based on the customer’s application requirements, several cloud computing-based paradigms i.e., edge computing, fog computing, mist computing, Internet of Things (IoT), Software-Defined Networking (SDN), cybertwin, and industry 4.0 have been evolved. These paradigms collaborate to offer customer-centric services with the backend of cloud server/data center. In brief, cloud computing has been emerged with respect to the above-mentioned paradigms to enhance the Quality of Experience (QoE) for the users. In particular, ML techniques are the motivating factor to backend the cloud for emerging paradigms, and ML techniques are essentially enhancing the usages of these paradigms by solving several problems of scheduling, resource provisioning, resource allocation, load balancing, Virtual Machine (VM) migration, offloading, VM mapping, energy optimization, workload prediction, device monitoring, etc. However, a comprehensive survey focusing on multi-paradigm integrated architectures, technical and analytical aspects of these paradigms, and the role of ML techniques in emerging cloud computing paradigms are still missing, and this domain needs to be explored. To the best of the authors’ knowledge, this is the first survey that investigates the emerging cloud computing paradigms integration considering the most dominating problem-solving technology i.e., ML. This survey article provides a comprehensive summary and structured layout for the vast research on ML techniques in the emerging cloud computing paradigm. This research presents a detailed literature review of emerging cloud computing paradigms: cloud, edge, fog, mist, IoT, SDN, cybertwin, and industry 4.0 (IIoT) along with their integration using ML. To carry out this study, majorly, the last five years (2017-21) articles are explored and analyzed thoroughly to understand the emerging integrated architectures, the comparative study on several attributes, and recent trends. Based on this, research gaps, challenges, and future trends are revealed.