International Research Journal of Modernization in Engineering Technology and Science
(Peer-Reviewed, Open Access, Fully Refereed International Journal)
e-ISSN: 2582-5208 | Volume 06 | Issue 10 | October 2024 | Impact Factor: 8.187 | www.irjmets.com
LEVERAGING UNSUPERVISED LEARNING FOR WORKLOAD BALANCING
AND RESOURCE UTILIZATION IN CLOUD ARCHITECTURES
Rajesh Daruvuri*1, Kiran Kumar Patibandla*2, Pravallika Mannem*3
*1University of the Cumberlands, USA
*2Visvesvaraya Technological University (VTU), India
*3ProBPM, Inc, USA
Corresponding author emails: venkatrajesh.d@gmail.com (R.D.), Kirru.patibandla@gmail.com (K.P.), Pravi.sit05@gmail.com (P.M.)
DOI: https://www.doi.org/10.56726/IRJMETS62304
ABSTRACT
Cloud computing architectures owe much of their popularity to their scalability and cost efficiency. However, they bring their own set of challenges in workload scheduling and resource utilization, because virtual machines (VMs) and applications must share resources such as servers and storage. Historically, workload balancing and resource management have relied on manual configuration or simplistic heuristics that do not effectively optimize resource usage and performance. In this paper, we propose an approach built on unsupervised learning techniques to detect usage patterns and improve resource utilization, yielding both better performance and automatically balanced workloads across VMs. We use clustering algorithms to group similar workloads and then allocate resources to each group according to its demand, so that resources are used more effectively and resource exhaustion is avoided. We also integrate anomaly detection methods into the system to identify and handle abnormal behavior in both resource monitoring and resource placement. Experiments with region traces from production workloads demonstrate the benefits of the approach, showing marked improvements in workload balancing and resource utilization over current practice.
Keywords: Virtual Machines, Resource Management, Scalability, Heuristics, Workload Balancing.
I. INTRODUCTION
Workload balancing assigns each task, or unit of work, to the resources best able to serve it. Optimizing resource utilization is an integral part of cloud computing: it allows resources to be used efficiently, which in turn improves system performance [1]. In a cloud architecture, servers and resources are shared among users and applications. Workload balancing ensures that load is spread evenly across the infrastructure and that workloads make use of all available resources, dynamically adjusting server resources as the load on them changes [2]. Workload balancing can be performed in several ways, the main methods being load balancing, auto-scaling, and resource pooling. Load balancing distributes incoming work among servers according to their current load, so that no server is overwhelmed and all servers are utilized. Auto-scaling enables the system to add or remove resources dynamically according to the current workload, so that it can cope with varying demand [4]. With resource pooling, resources are grouped into pools and then allocated to applications on demand. Together these mechanisms balance the workload effectively and enable cloud architectures to achieve high resource utilization [5]. This leads to optimal use of the available resources, so waste is avoided and costs can be controlled [6]. It also prevents the system from becoming overloaded and breaking down, allowing all applications to operate without issues. On top of that, workload balancing increases system performance [7]: spreading the workload across as many available resources as possible decreases the likelihood that any single server becomes overwhelmed, so every application gets access to resources and average processing and response times are shorter [8]. Thus, workload balancing contributes a major share to cloud architectures utilizing resources with high performance, preventing system downtime [9], and improving average service quality [8]. Because of the growing popularity of cloud computing, enabling the right kind of workload balancing techniques plays a
crucial role in bringing high efficiency and performance benefits to businesses in the world of cloud computing [10]. The main contributions of this research are as follows:
• Optimized resource allocation: One of the significant advantages of workload balancing and resource utilization in cloud architectures is the ability to optimize resource allocation. The workload and resources are distributed more effectively across the cloud, so that each server can be used efficiently through techniques such as load balancing and resource consolidation. With fewer idle or over-provisioned resources, utilization improves and costs fall for cloud providers;
• Performance enhancement: Balancing the workload between servers and managing resource utilization enhances the performance of cloud systems, allowing them to handle higher traffic volumes and operate better for end users. This in turn improves user experience and satisfaction with the cloud services, making them more trustworthy and competitive in the market.
• Reduced downtime and overload: The capacity to minimize downtime and overload is another key contribution of this mechanism within cloud architectures. By balancing workload and resources effectively, it can avoid server failure and overloading, which in turn prevents downtime and service interruptions. It also ensures that no single server is burdened with too much work, enhancing the performance and stability of the cloud. This is especially important for mission-critical systems and services that need to be online and operational 24×7.
The remainder of this paper is organized as follows. Section II reviews work related to the research, Section III describes the proposed model, Section IV presents the results and comparative analysis, and Section V concludes the paper and outlines the scope for future work.
II. RELATED WORKS
Nawrocki, P., et al. [11] have discussed optimizing the use of cloud computing resources. This involves
analyzing large sets of data through exploratory data analysis and utilizing machine learning techniques to
identify patterns and trends. This helps to efficiently allocate resources and improve cloud computing's overall
performance and cost-effectiveness. Kumar, J., et al. [12] have proposed a self-directed learning-based workload forecasting model that utilizes machine learning algorithms to predict resource demand in a cloud environment. This enables efficient resource management by allocating resources based on expected future workloads. It
considers various factors, such as historical data, workload patterns, and user preferences, to make accurate
predictions. Dhakal, A., et al.[13] have discussed Edge clouds, geographically distributed computing systems
that bring processing and storage capabilities closer to the network's edge. Using edge computing and machine
learning algorithms reduces the latency and network congestion, improving the efficiency of machine learning
tasks. This enables faster decision-making and real-time analysis, making edge clouds ideal for high-
performance and low-latency machine learning applications. Patel, Y. S., et al.[14] have discussed this approach,
which utilizes deep learning techniques to predict resource utilization patterns in green cloud data centers. By
identifying hotspots and coldspots where resources are either over-utilized or under-utilized, solutions can be
implemented to optimize resource usage, improve efficiency, and reduce energy consumption for a more
sustainable cloud computing environment. Preuveneers, D., et al. [15] have discussed how machine learning
models in smart environments require significant computational power and data storage to train and execute.
This can lead to performance trade-offs such as longer execution times and increased energy consumption.
Balancing these trade-offs is important for optimizing the effectiveness and efficiency of these models in smart
environments. Priyadarshini, S., et al.[16] have discussed how the utilization of Artificial Intelligence (AI) and
Machine Learning (ML) in cloud computing can improve the security and scalability of workloads. By analyzing
data patterns, AI/ML can identify potential security risks and optimize workload distribution, improving the
cloud environment's security and scalability. Meyer, V., et al. [17] have discussed a scheme that uses machine learning techniques to classify dynamic interference patterns and schedule resources accordingly in cloud infrastructures. It analyzes the interference environment continuously and adjusts resource allocations in
real time to optimize network performance and reduce interference. This approach improves resource usage
efficiency and enhances QoS for users. Soni, D., et al. [18] have discussed how machine learning
techniques, such as supervised and unsupervised learning, are being increasingly integrated into emerging
cloud computing paradigms to leverage the benefits of big data analytics, real-time processing, and predictive
models. This integration allows for better resource allocation, improved decision-making, and enhanced
efficiency in the management of cloud services. Ilager, S. [19] has discussed machine learning-based
energy and thermal efficient resource management algorithms, which use artificial intelligence techniques to
optimize energy consumption and thermal efficiency in various systems and devices. These algorithms learn
from data and make decisions to minimize energy usage while ensuring optimal performance and thermal
comfort. They are used in smart homes, buildings, and industrial processes to reduce energy costs and improve
sustainability. Noel, R. R., et al.[20] have discussed Self-managing cloud storage with reinforcement learning, a
recent approach that uses machine learning techniques to automate the process of maintaining and optimizing
cloud storage resources. Continuously learning from user behavior and system performance can help make
proactive decisions to improve storage efficiency and reduce costs.
III. PROPOSED MODEL
In this paper we introduce a model that uses unsupervised learning to optimize resource utilization and balance the workload arising from data processing and user requests, aiming at effective and efficient operation of cloud architectures. The model is divided into three main parts: data collection, unsupervised learning, and workload balancing. The data collection component gathers data from sources such as cloud service providers, server logs, and user behavior.
Employing machine learning methods to improve the efficiency and effectiveness of resource allocation, the optimization is formalized as

L(v) = p(Z(v), S(v), N(v))    (1)
Machine learning models can learn the complex relationship between workload, service quality requirements,
and resources by analyzing historical data.
Data on resource utilization, workload patterns, and user behavior is monitored. The unsupervised learning part of the model then examines these recorded data and, using clustering algorithms, infers patterns and relationships between variables.
Response time is the amount of time it takes for a service to respond to a specific type of request.
AvgRT = TRT / N    (2)

where AvgRT is the average response time, TRT denotes the total response time, and N is the total number of requests.
This allows for grouping resources, workloads, and users based on their similarities, which can then be used for
workload balancing. Finally, the workload balancing component uses the insights from unsupervised learning
to dynamically allocate and adjust resources to achieve a balanced and optimized utilization of resources.
The throughput of a service is the maximum speed at which requests may be processed.
Throughput = M / V    (3)

where M is the number of requests completed within the observation time V.
QoS metrics include functions that describe how throughput varies with load intensity and the maximum
throughput.
A system’s or service’s availability is the percentage of time it is operational and accessible to users.
Availability = Vov / (Vov + Cv) × 100    (4)

where Vov is the time the service is operational (uptime) and Cv is the time it is unavailable (downtime).
This is usually expressed as a percentage of uptime. For services that host mission-critical applications, ensuring availability with no loss of service is the highest priority; making a service available across a region or worldwide minimizes downtime and service interruptions, which keeps the user experience seamless.
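As a small illustration of how the three QoS metrics in equations (2)-(4) could be computed from monitoring data (all numbers and variable names below are illustrative assumptions):

    # Average response time (eq. 2): total response time divided by the number of requests
    response_times_ms = [120.0, 95.0, 240.0, 110.0]        # per-request latencies
    avg_rt = sum(response_times_ms) / len(response_times_ms)

    # Throughput (eq. 3): completed requests per unit of observation time
    completed_requests = 1800
    window_seconds = 60
    throughput = completed_requests / window_seconds        # requests per second

    # Availability (eq. 4): uptime as a percentage of total observed time
    uptime_hours = 719.0
    downtime_hours = 1.0
    availability = uptime_hours / (uptime_hours + downtime_hours) * 100

    print(f"AvgRT={avg_rt:.1f} ms, Throughput={throughput:.1f} req/s, Availability={availability:.2f}%")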
Examples include moving resources away from idle servers to those under greater demand, or using serverless computing for some workloads. The proposed model thus provides automated load balancing and resource utilization for efficient and cost-effective cloud resource management. Because it continuously adapts to changing workload patterns and user behavior, it can improve the overall performance of cloud architectures over time.
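For instance, a simple rebalancing pass over per-server load might look like the sketch below (the thresholds, server names, and the halving rule are hypothetical choices for illustration, not the paper's algorithm):

    # Hypothetical CPU utilization per server, as a fraction of capacity
    load = {"s1": 0.92, "s2": 0.15, "s3": 0.55, "s4": 0.88}
    HIGH, LOW = 0.80, 0.30   # illustrative thresholds

    overloaded = [s for s, u in load.items() if u > HIGH]
    underused = [s for s, u in load.items() if u < LOW]

    # Pair each overloaded server with an underused one and shift load toward the midpoint
    for hot, cold in zip(overloaded, underused):
        shift = (load[hot] - load[cold]) / 2
        load[hot] -= shift
        load[cold] += shift
        print(f"moved {shift:.2f} of capacity from {hot} to {cold}")

    print(load)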
3.1. Construction
The evolution of cloud computing has had a major impact on how large-scale applications are designed, developed, and deployed. For cloud architects, the best practice is to use cost-efficient resources and to balance the workload across the different nodes.
The system should be designed to satisfy a service level objective representing the worst possible value for its
average response time, throughput, and availability.
ObjectiveFunction = max(AvgRT, Throughput, Availability)    (5)
This provides a way to make sure that the system is properly balanced and that no single performance metric falls below a certain threshold value.
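One way to read equation (5) in practice is as a guard that checks each monitored metric against its service-level threshold; the sketch below assumes illustrative SLO values rather than figures from the paper.

    # Illustrative service-level objectives and the currently observed values
    slo = {"avg_rt_ms": 200.0, "throughput_rps": 25.0, "availability_pct": 99.9}
    observed = {"avg_rt_ms": 140.0, "throughput_rps": 30.0, "availability_pct": 99.95}

    violations = []
    if observed["avg_rt_ms"] > slo["avg_rt_ms"]:                 # response time must stay below its limit
        violations.append("response time")
    if observed["throughput_rps"] < slo["throughput_rps"]:       # throughput must stay above its floor
        violations.append("throughput")
    if observed["availability_pct"] < slo["availability_pct"]:   # availability must stay above its floor
        violations.append("availability")

    print("SLO violations:", violations or "none")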
This is important for maintaining high performance while keeping costs down for clients. Unsupervised learning techniques have emerged as an effective tool for overcoming these challenges.
Each neuron acts as a convolutional kernel, and the layer is composed of these kernels; when a kernel is symmetric, the convolution reduces to a correlation operation.

p_r^y(f, s) = Σ_d Σ_(h,k) g_d(h, k) · b_r^y(h, k)    (6)
Dividing the input image into tiny parts, which helps extract feature motifs, is one of the ways convolution is applied.
Cloud architecture comprises the various design decisions, including the specific technologies, software components, and system configuration, that make a running deployment of an application on the cloud possible. Using ML algorithms such as unsupervised learning, the model examines historical data and draws more specific conclusions about the important patterns in resource utilization and the flow of workloads. Fig 1 shows the construction of the proposed model.
Fig 1: Construction of the proposed model
This information can then be used to optimize resource allocation and workload balancing, leading to improved
performance and cost savings. One example of unsupervised learning in cloud architectures is clustering
algorithms.
where g_d(h, k) is an element of the input data component G_d, which is multiplied element-wise by b_r^y(h, k), the corresponding element of the r-th convolutional kernel of the y-th layer.

P_r^y = [p_r^y(1, 1), ..., p_r^y(f, s), ..., p_r^y(F, S)]    (7)
Equation (7) represents the feature map yielded by the r-th convolutional operation.
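To make the convolution and feature-map notation in equations (6) and (7) concrete, here is a minimal NumPy sketch; the kernel, input, and sizes are toy values chosen for illustration, not the authors' implementation.

    import numpy as np

    def feature_map(image, kernel):
        """Valid-mode sliding-window correlation (convolution with a flipped kernel)."""
        kh, kw = kernel.shape
        out_h = image.shape[0] - kh + 1
        out_w = image.shape[1] - kw + 1
        fmap = np.zeros((out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                # Element-wise product of the input patch and the kernel, then summed
                fmap[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return fmap

    image = np.arange(25, dtype=float).reshape(5, 5)   # toy input patch
    kernel = np.array([[1.0, 0.0], [0.0, -1.0]])       # toy 2x2 kernel
    print(feature_map(image, kernel))                  # the resulting feature map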
Clustering algorithms of this kind can aggregate nodes that have similar resource consumption and workload behavior, so that proper load balancing and efficient use of resources are achieved. Unsupervised learning is also used in auto-scaling, which increases or decreases the amount of resources according to the demand, as sketched below.
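A minimal auto-scaling rule along these lines is given here; the target utilization, replica bounds, and the proportional rule are assumptions for illustration.

    def desired_replicas(current, avg_cpu, target=0.60, min_r=1, max_r=20):
        # Proportional rule: keep average utilization near the target by resizing the replica count
        wanted = round(current * avg_cpu / target)
        return max(min_r, min(max_r, wanted))

    print(desired_replicas(current=4, avg_cpu=0.90))   # heavy load -> scale out to 6 replicas
    print(desired_replicas(current=4, avg_cpu=0.20))   # idle -> scale in to 1 replica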
The convolution operations generate feature motifs, which can appear anywhere in the image.

W_r^y = e_f(P_r^y)    (8)
This serves two purposes: a) nothing is discarded, and b) the estimated position of each feature with respect to the other features is retained after extraction.
Using workload patterns and resource utilization, the system can continuously adjust resource allocation to meet demand, without over- or under-provisioning when the workload rises or falls rapidly. In this way, much of the resource utilization and workload balancing in cloud architectures can be improved by unsupervised learning techniques. Beyond that, cloud architects can collect large volumes of data and patterns that help them make more informed decisions, so their systems perform better at lower cost.
3.2. Operating principle
Cloud architectures are intricate and continuously changing, since they allow numerous diverse applications to run on isolated virtual machines. It is therefore important that the workload is handled properly, so that a good balance of performance and cost is maintained.
The activation function thereby helps identify complex patterns and serves as a decision-making function.

V_r^y = e_a(P_r^y)    (9)

The activation function applies a nonlinearity to the result of the convolution, producing the modified output.
This is where leveraging unsupervised learning comes into the picture. Unsupervised learning works by attempting to find structure and relationships in data without an explicit task being given from outside. For cloud architectures, this means collecting metrics on CPU and memory use, network traffic, and application performance. Fig 2 shows the operating principle of the proposed model.
Fig 2: Operating principle of the proposed model
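Concretely, the collected signals can be assembled into a feature matrix before being handed to the learning algorithm; the field names and values in this sketch are hypothetical monitoring samples.

    import numpy as np

    # Hypothetical monitoring samples, one record per VM and measurement interval
    samples = [
        {"cpu": 0.72, "mem": 0.65, "net_mbps": 84.0, "p95_latency_ms": 180.0},
        {"cpu": 0.18, "mem": 0.22, "net_mbps": 9.0, "p95_latency_ms": 40.0},
        {"cpu": 0.91, "mem": 0.80, "net_mbps": 120.0, "p95_latency_ms": 310.0},
    ]
    fields = ["cpu", "mem", "net_mbps", "p95_latency_ms"]

    # Rows are observations, columns are metrics; this matrix is the input to clustering
    X = np.array([[s[f] for f in fields] for s in samples])
    print(X.shape)   # (3, 4)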
This data is then given to an unsupervised learning algorithm, where clustering and anomaly detection techniques are used to learn patterns and trends. This enables similar resources and workloads to be grouped, which in turn identifies idle or underutilized resources and load imbalances across different virtual machines. Management and the SRE team can use these insights to streamline resources and split overloaded ones in an attempt to improve performance.
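For the anomaly-detection step, one possible realization is sketched below; Isolation Forest is our illustrative choice of detector and the utilization history is made up, since the paper does not mandate a specific method.

    import numpy as np
    from sklearn.ensemble import IsolationForest

    # Hypothetical utilization history: [cpu fraction, memory fraction] per interval
    history = np.array([[0.40, 0.50], [0.42, 0.48], [0.38, 0.52], [0.41, 0.50],
                        [0.95, 0.97]])   # the last sample is an unusual spike

    detector = IsolationForest(contamination=0.2, random_state=0).fit(history)
    flags = detector.predict(history)    # -1 marks anomalous intervals, 1 marks normal ones
    print(flags)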
The result of the convolution in the equations above is passed, together with the bias, to an activation function, which introduces nonlinearity and generates the output. Here y denotes the falcon's location with respect to the total number of candidates A in every dimension.

O_f = 0.1 × Max_rb    (10)

The speed is generated randomly within the limits t_Max and t_Min.
Anomaly detection algorithms, meanwhile, can identify unusual resource utilization or traffic patterns that point to problems or inefficiencies. Unsupervised learning also comes into play in predictive resource allocation: with the help of historical data, the algorithm can learn how workloads behave and how resources are utilized, and predict future needs so that resources can be provisioned in advance. Combining architectural principles with novel approaches to unsupervised learning on streaming data opens up opportunities for intelligent resource utilization and improved workload balancing. This does not just provide better performance; more importantly, it saves money and avoids provisioning unnecessary resources.
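A hedged sketch of this predictive step: forecast near-term demand from the recent utilization history (a simple moving average stands in here for whatever forecaster is actually used) and provision capacity with a safety margin; all numbers are illustrative.

    def forecast_next(util_history, window=3):
        # Simple moving average over the most recent observations
        recent = util_history[-window:]
        return sum(recent) / len(recent)

    history = [0.45, 0.50, 0.58, 0.64, 0.71]        # fraction of capacity used per interval
    predicted = forecast_next(history)
    headroom = 0.2                                  # illustrative safety margin
    capacity_to_provision = predicted * (1 + headroom)
    print(f"predicted={predicted:.2f}, provision for {capacity_to_provision:.2f} of capacity")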
IV. RESULTS AND DISCUSSION
The proposed model, RLWBRA (Reinforcement Learning for Workload Balancing and Resource Utilization in Cloud Architectures), has been compared with the existing TUBRA (Task-based Unsupervised Balancing and Resource Allocation in Cloud Architectures), ULCARA (Unsupervised Learning for Cloud Resource Allocation and Workload Balancing), and WUCLBA (Workflow-based Unsupervised Cloud Load Balancing and Allocation) models.
4.1. Clustering accuracy: This refers to how accurately the unsupervised learning algorithm groups the workloads and resources being balanced. Higher clustering accuracy indicates better performance in identifying and grouping similar workloads and resources. Fig. 3 shows the comparison of clustering accuracy.
Fig 3: Comparison of Clustering accuracy
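When labelled traces are available, clustering accuracy can be estimated by mapping each predicted cluster to its best-matching reference group, as in this sketch (the labels are illustrative, not the evaluation data behind the figures):

    from collections import Counter

    true_groups = ["web", "web", "batch", "batch", "db", "db"]
    predicted = [0, 0, 1, 1, 1, 2]   # cluster ids assigned by the algorithm

    # Credit each cluster with the size of the majority reference group it covers
    correct = 0
    for cluster in set(predicted):
        members = [t for t, p in zip(true_groups, predicted) if p == cluster]
        correct += Counter(members).most_common(1)[0][1]

    accuracy = correct / len(true_groups)
    print(f"clustering accuracy = {accuracy:.2f}")   # 5 of 6 workloads grouped correctly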
4.2. Resource utilization improvement: This parameter measures the improvement achieved by using the unsupervised learning algorithm for workload balancing. It can be calculated by comparing resource utilization before and after implementing the algorithm, as in the worked example below. Fig. 4 shows the comparison of resource utilization improvement.
Fig 4: Comparison of Resource utilization improvement
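As a worked example of this before/after comparison (the utilization figures are illustrative, not measured results):

    # Mean utilization of the same server pool before and after rebalancing
    util_before = 0.52
    util_after = 0.74

    improvement_pct = (util_after - util_before) / util_before * 100
    print(f"resource utilization improvement = {improvement_pct:.1f}%")   # about 42.3%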
4.3. Scalability: This refers to the algorithm's ability to handle a large number of workloads and resources efficiently. The algorithm's performance should not deteriorate significantly as the number of workloads and resources increases. Fig. 5 shows the comparison of scalability.
Fig 5: Comparison of Scalability
4.4. Convergence time: This parameter measures the time the algorithm takes to reach a stable solution. A shorter convergence time indicates faster and more efficient balancing of workloads and resources. Fig. 6 shows the comparison of convergence time.
Fig 6: Comparison of Convergence time
V. CONCLUSION
To sum up, unsupervised learning techniques are an important tool for optimizing workload balancing and resource utilization in cloud architectures. Using algorithms such as k-means clustering and anomaly detection, the system can analyze data, decide where resources should go, and balance workloads. This results in better performance and efficiency, saving resources and providing a smoother overall user experience. In addition, unsupervised learning helps the system scale with workload spikes while adapting to them, and catch outliers before they become a real problem for resource management. Overall, introducing unsupervised learning into cloud architectures can substantially improve the scalability, reliability, and cost efficiency of a wide range of systems. Further research and innovation in this space can build on these results, delivering even more value to cloud service providers and their customers.
VI. REFERENCES
[1] Desai, B., & Patil, K. (2023). Reinforcement Learning-Based Load Balancing with Large Language
Models and Edge Intelligence for Dynamic Cloud Environments. Journal of Innovative Technologies,
6(1), 1-13.
[2] Duc, T. L., Leiva, R. G., Casari, P., & Östberg, P. O. (2019). Machine learning methods for reliable resource
provisioning in edge-cloud computing: A survey. ACM Computing Surveys (CSUR), 52(5), 1-39.
[3] Suleiman, B., Fulwala, M. M., & Zomaya, A. (2023, July). A framework for characterizing very large cloud
workload traces with unsupervised learning. In 2023 IEEE 16th International Conference on Cloud
Computing (CLOUD) (pp. 129-140). IEEE.
[4] Kaur, A., Kaur, B., Singh, P., Devgan, M. S., & Toor, H. K. (2020). Load balancing optimization based on
deep learning approach in cloud environment. International Journal of Information Technology and
Computer Science, 12(3), 8-18.
[5] Goodarzy, S., Nazari, M., Han, R., Keller, E., & Rozner, E. (2020, December). Resource management in
cloud computing using machine learning: A survey. In 2020 19th IEEE International Conference on
Machine Learning and Applications (ICMLA) (pp. 811-816). IEEE.
[6] Rajawat, A. S., Goyal, S. B., Kumar, M., & Malik, V. (2024). Adaptive resource allocation and optimization
in cloud environments: Leveraging machine learning for efficient computing. In Applied Data Science
and Smart Systems (pp. 499-508). CRC Press.
[7] Khan, A. R. (2024). Dynamic Load Balancing in Cloud Computing: Optimized RL-Based Clustering with
Multi-Objective Optimized Task Scheduling. Processes, 12(3), 519.
[8] Saxena, D., Kumar, J., Singh, A. K., & Schmid, S. (2023). Performance analysis of machine learning
centered workload prediction models for cloud. IEEE Transactions on Parallel and Distributed Systems,
34(4), 1313-1330.
[9] Alqahtani, D. (2023). Leveraging sparse auto-encoding and dynamic learning rate for efficient cloud
workloads prediction. IEEE Access, 11, 64586-64599.
[10] Bao, Y., Peng, Y., & Wu, C. (2022). Deep learning-based job placement in distributed machine learning
clusters with heterogeneous workloads. IEEE/ACM Transactions on Networking, 31(2), 634-647.
[11] Nawrocki, P., & Smendowski, M. (2024). Optimization of the Use of Cloud Computing Resources Using
Exploratory Data Analysis and Machine Learning. Journal of Artificial Intelligence and Soft Computing
Research, 14(4), 287-308.
[12] Kumar, J., Singh, A. K., & Buyya, R. (2021). Self directed learning based workload forecasting model for
cloud resource management. Information Sciences, 543, 345-366.
[13] Dhakal, A., Kulkarni, S. G., & Ramakrishnan, K. K. (2020, November). ECML: Improving efficiency of
machine learning in edge clouds. In 2020 IEEE 9th International Conference on Cloud Networking
(CloudNet) (pp. 1-6). IEEE.
[14] Patel, Y. S., Jaiswal, R., & Misra, R. (2022). Deep learning-based multivariate resource utilization
prediction for hotspots and coldspots mitigation in green cloud data centers. The Journal of
Supercomputing, 78(4), 5806-5855.
[15] Preuveneers, D., Tsingenopoulos, I., & Joosen, W. (2020). Resource usage and performance trade-offs
for machine learning models in smart environments. Sensors, 20(4), 1176.
[16] Priyadarshini, S., Sawant, T. N., Bhimrao Yadav, G., Premalatha, J., & Pawar, S. R. (2024). Enhancing
security and scalability by AI/ML workload optimization in the cloud. Cluster Computing, 1-15.
[17] Meyer, V., Kirchoff, D. F., Da Silva, M. L., & De Rose, C. A. (2021). ML-driven classification scheme for
dynamic interference-aware resource scheduling in cloud infrastructures. Journal of Systems
Architecture, 116, 102064.
[18] Soni, D., & Kumar, N. (2022). Machine learning techniques in emerging cloud computing integrated
paradigms: A survey and taxonomy. Journal of Network and Computer Applications, 205, 103419.
[19] Ilager, S. (2021). Machine Learning-based Energy and Thermal Efficient Resource Management
Algorithms for Cloud Data Centres (Doctoral dissertation, University of Melbourne).
[20] Noel, R. R., Mehra, R., & Lama, P. (2019, June). Towards self-managing cloud storage with
reinforcement learning. In 2019 IEEE International Conference on Cloud Engineering (IC2E) (pp. 34-
44). IEEE.