Efficient Resource Management for Real-time AI
Systems in the Cloud using Reinforcement Learning
Vinay Mallikarjunaradhya
Principal Product Manager
Legal Tech Platform
Thomson Reuters
Toronto, Canada.
vinay.aradhya@thomsonreuters.com
Nagesh Boddapati
Department of Computer Science
Microsoft
Charlotte, NC, USA.
skuraku5286@ucumberlands.edu
Madhusudhan Dasari Sreeramulu
Department of Cyber Security
American Express
USA.
dsmadhu007@gmail.com
Ketan Gupta
Department of Information Technology
University of The Cumberlands
Williamsburg, KY, USA.
ketan1722@gmail.com
Abdul Sajid Mohammed
Department of School of Computer and
Information Sciences,
University of the Cumberlands
Kentucky, USA.
amohammed5836@ucumberlands.edu
Yuvaraj Natarajan
Department of Computer Science
Engineering
Sri Shakthi Institute of Engineering and
Technology
Coimbatore, Tamil Nadu, India
yuvarajncse@siet.ac.in
Abstract— The advent of artificial intelligence (AI) has driven a surge in applications that demand online responses to immense amounts of data. These applications are typically deployed in the cloud, where resource management must be optimized for high performance while remaining cost efficient. However, conventional static resource allocation methods only partially apply to real-time AI applications, which are dynamic, unpredictable, and require more efficient and adaptive ways of allocating resources. To address this challenge, this research introduces a reinforcement learning (RL) based method for online resource management of cloud-based AI systems. RL is a machine-learning paradigm that allows a system to learn how best to perform tasks in changing environments based on feedback from its interactions with those environments. Our proposed approach uses RL to enable dynamic resource allocation, so that resources are allocated when they are needed rather than according to a static plan fixed at design time.
Keywords— Optimized, Conventional, Unpredictable, Reinforcement Learning, Machine Learning
I. INTRODUCTION
Artificial intelligence (AI) has been growing rapidly, creating an emerging demand for real-time AI applications. Such applications need fast, efficient resource scheduling to unlock their full potential. Cloud computing provides state-of-the-art resource management and scalability on demand, which has attracted many real-time AI architectures to be developed and deployed quickly [1]. Nevertheless, the heterogeneity and dynamism characteristic of AI workloads challenge traditional resource management techniques. Here, we explore reinforcement learning (RL) as a way to manage available resources in cloud-based real-time AI systems efficiently. Real-time AI systems must not only process data instantly but also respond within strict time bounds. Examples include self-driving cars, real-time image recognition, and natural language processing, each of which needs large computing clusters to support the real-time analytics it performs [2]. However, the workloads and resource requirements of such systems can change greatly from one moment to the next, making traditional resource management techniques ineffective. Cloud computing offers scalable, on-demand resources designed to accommodate this variation. Running real-time AI systems in the cloud avoids upfront hardware costs and provides agility through automatic scaling of resources up and down [3]. Harnessing the full advantages of cloud computing, however, depends on sound resource management, which is the core principle this work addresses. Traditional means of managing resources, such as load balancing and auto-scaling, may not work well for real-time AI systems. These methods are usually based on static, scripted rules and thresholds, which cannot adjust to new or fluctuating AI workload profiles. This is where reinforcement learning (RL) fits in. This machine learning subfield is concerned with sequential decision making: it determines how actions should be taken given an environment state so that accumulated rewards are maximized or penalties are minimized [4]. It has seen success across multiple fields, such as game playing, robotics, and online advertising. When applied to resource management for AI systems, researchers can use RL methods to assign resources dynamically based on real-time performance metrics and to learn how allocation decisions should be made. This enables more effective resource utilization, shorter response times, and lower costs. RL algorithms for cloud computing optimization tasks, such as Q-learning and Deep Q-Networks, have recently begun to show promise. RL is advantageous in this scenario because it can adapt resource management dynamically to a changing environment [5]. Workloads are influenced significantly by the time of day, the season, and even specific events, and real-time AI systems need to adapt accordingly. Unlike static methods, RL algorithms can keep learning and modify themselves as the system changes, making real-time resource management more efficient and effective. Another benefit of RL is that it can capture the intricate and non-linear relationships between performance metrics and resource allocation. Traditional
methods may struggle to identify the best resource allocation strategy in complex and dynamic environments, whereas RL algorithms can learn and optimize resource allocation from real-time performance metrics. RL can therefore also assist in finding and resolving bottlenecks in the ecosystem [6]. By observing performance metrics, RL algorithms can recognize patterns and adjust resource allocation to prevent failures or slowdowns. Using RL for resource management does have limitations; the main concern is the time and compute consumed by training these algorithms. That said, RL algorithms can be trained offline using historical data or in a simulation environment, saving considerable computation and time [7]. The algorithms can also continue to learn and improve their decision-making as they observe more data; hence, they are well suited to real-time AI systems that change quickly over time. To sum up, our results suggest that reinforcement learning has potential as a mechanism for fine-grained resource management of real-time cloud-hosted AI systems. By adapting to changing environments, modeling complex relationships, and identifying and mitigating bottlenecks, RL algorithms can deliver low response times, reduced running costs, and better overall resource utilization [8]. The continued advancement of AI applications and the requirement for real-time processing will only increase the need for efficient resource management. As a result, additional research and development is essential to realize the full potential of real-time AI in the cloud. The main contributions of this paper are as follows:
• Efficient Resource Allocation: Real-time, adaptive resource management techniques for AI systems in the cloud can learn and realign resource allocation as workloads vary. The result is better resource utilization, less wastage, and more effective work.
• Load Balancing: Reinforcement learning algorithms can continuously observe the system to rebalance and fully use the resources assigned to particular tasks, reducing latency and increasing throughput for AI applications in the cloud.
• Low Energy Consumption: Deploying reinforcement learning-based systems can reduce the cost incurred. Some applications are wasteful in terms of computation and energy, which, if unchecked, increases both cost and environmental impact.
• Scalable Solutions: Solutions based on artificial intelligence and machine learning can be scaled by reinforcement learning-based applications, since the reinforcement learning algorithm can reassign work as the load changes.
II. RELATED WORKS
Artificial intelligence (AI) has been gaining prominence across many industries, promising to enhance the efficiency and quality of decision-making processes. However, as AI applications become more complex and sophisticated, their effective deployment is critical. The growing popularity of real-time AI systems has made resource management indispensable, since the workloads are ever-changing and continuously increasing. One popular approach is cloud computing, which brings challenges of its own [9]. Time-sensitive AI tasks present one of the significant challenges in managing resources efficiently for real-time AI systems in the cloud. The notion of real-time AI stems from the fact that even a slight delay can have significant consequences in many application areas, so decisions must be made about unknown future states while keeping execution time as close to a guaranteed minimum as possible. This places an enormous load on the resources assigned to the task; they must be efficient enough to operate in real time [10]. Traditional resource management was not designed for this kind of time-critical tasking, which can lead to delays and loss of system efficiency. The problem is compounded by the dynamic nature of AI workloads, whose resource requirements vary significantly depending on the task. For example, image classification, a relatively simple task compared to natural language processing, will likely require fewer resources [11]. It is difficult to determine how many resources to allocate to each task: over-provisioning wastes resources and increases costs, while under-provisioning degrades performance. Moreover, resources in a cloud infrastructure are shared between many users, which adds further complexity to how limited resources can be effectively provisioned and utilized. Traditional resource management is usually rule-based and depends on predefined thresholds to allocate or deallocate resources. This makes it unsuitable for real-time AI systems, because it ignores how complex and dynamic both the workload and the resource requirements can be. Resource management then becomes more complex and time-consuming, because humans must frequently intervene to adjust resource allocation [12]. Researchers have proposed reinforcement learning (RL) to overcome these challenges and efficiently manage resources in real-time AI scenarios. RL is a class of machine learning techniques in which an agent learns to make decisions by interacting with its environment. In resource management, an RL agent can observe the workload over time and change its behaviour according to evolving conditions without human intervention or predefined rules. While RL has the potential to be an effective solution, using it for resource management raises some specific problems. Training RL agents is time-consuming and resource intensive, and for real-time systems any latency between when an input is captured and when a decision is made can have severe consequences [13]. In addition, the RL agent's decisions may yield suboptimal resource allocations partly because of
its training data's inherent biases. To sum up, managing resources for real-time AI systems in the cloud is challenging and hard to control. AI workloads are time-sensitive, resource requirements change frequently, and traditional resource management techniques struggle to deliver optimal resourcing for deep learning clusters. Even with RL's help, training cost and possible bias remain concerns. Continued research and development is necessary to scale real-time AI in the cloud and fully utilize what this technology can offer [14]. The novelty of this work lies in applying reinforcement learning to manage resources for real-time AI systems in the cloud at a seconds-level time granularity. In doing so, it optimizes the deployment of resources to AI workloads as their configurations change over time (e.g., across different phases of an experiment), which drives better performance and scaling efficiency with cost savings of roughly 26%. The approach also takes environmental factors (server load, network latency, etc.) into account to make real-time decisions on where resources should be allocated [15]. This system marks progress in AI and cloud computing, providing a more innovative approach to resource management through reinforcement learning.
III. PROPOSED MODEL
We introduce an RL-based model that enhances the efficiency of resource allocation for real-time AI systems in a cloud-based setting. RL is an area of machine learning that allows an agent to learn from its environment to reach a specific goal. The model relies on three primary components: the AI system, the cloud infrastructure, and the RL engine.
In this setting, the AI system performs tasks and generates real-time workload requirements that are backed by a cloud infrastructure providing the needed resources (i.e., computing power, memory, or storage). The AI system communicates with this infrastructure through the RL engine, which adaptively controls and manages resources according to the workloads.
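The paper does not include reference code, so the following is a minimal, hypothetical sketch of how the three components could interact. The names and interfaces (WorkloadState, CloudInfrastructure, RLEngine) and the threshold-based placeholder policy are our own illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the three-component control loop described above.
# Names, interfaces, and the placeholder policy are illustrative assumptions.
from dataclasses import dataclass
import random

@dataclass
class WorkloadState:
    cpu_util: float      # fraction of allocated capacity currently used
    queue_len: int       # pending real-time requests
    latency_ms: float    # observed response latency

class CloudInfrastructure:
    """Toy stand-in for the cloud: tracks allocated nodes and reports state."""
    def __init__(self, nodes: int = 2):
        self.nodes = nodes

    def apply(self, action: int) -> None:
        # action: 0 = keep, 1 = add a node, 2 = remove a node (min 1)
        if action == 1:
            self.nodes += 1
        elif action == 2 and self.nodes > 1:
            self.nodes -= 1

    def observe(self) -> WorkloadState:
        load = random.uniform(0.2, 1.5)               # simulated demand
        util = min(load / self.nodes, 1.0)
        return WorkloadState(cpu_util=util,
                             queue_len=int(max(0, (load - self.nodes) * 10)),
                             latency_ms=50 + 200 * util)

class RLEngine:
    """Placeholder decision maker; a trained RL policy would go here."""
    def decide(self, state: WorkloadState) -> int:
        if state.cpu_util > 0.8:
            return 1            # scale out under pressure
        if state.cpu_util < 0.3:
            return 2            # scale in when idle
        return 0

# Control loop: the AI system's workload is observed, the RL engine decides,
# and the cloud infrastructure is adjusted before the next time step.
infra, engine = CloudInfrastructure(), RLEngine()
for step in range(5):
    state = infra.observe()
    infra.apply(engine.decide(state))
    print(step, infra.nodes, round(state.cpu_util, 2))
```

In a full system the threshold rules inside RLEngine would be replaced by a learned policy, while the surrounding loop stays the same.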
A. Construction
Efficient resource management for real-time AI systems in the cloud using reinforcement learning involves many technologies, and the techniques are complicated. The goal is to manage the resources of cloud-based real-time AI systems efficiently so that they deliver better performance while using fewer resources. Fig. 1 shows the schema of the cloud computing architecture.
Figure 1. Schema for the architecture of cloud computing
The primary technology enabling this approach is reinforcement learning, a branch of machine learning that learns how to make decisions by taking actions in a dynamic and uncertain environment. It involves training a model to choose the best action to take at each moment, depending on the feedback or rewards the environment offers. Cloud computing fits in by providing the infrastructure and resources needed to operate heavy AI systems. It allows real-time AI applications to allocate resources automatically and dynamically, scaling capacity and adapting the number of nodes based on the actual load experienced at any particular time.
B. Operating Principle
This approach is based on the operating principles of reinforcement learning and cloud computing. At a high level, reinforcement learning is an algorithmic framework that helps agents make decisions based on rewards and penalties. This kind of learning is particularly effective for AI systems that must make decisions immediately and evolve as they do so. Fig. 2 shows the cloud computing (CC) service models.
Figure 2. Cloud Computing (CC) service models
On the cloud side, the technical foundation of this approach is the AI system's access to a shared pool of computing resources. Resources can be allocated as needed, making the system scalable and keeping utilization reasonable.
C. Functional Working
A novel reinforcement learning approach to resource management for real-time AI systems in the cloud: this solution combines reinforcement learning (RL, a subset of machine learning for decision making) with cloud computing for fast resource allocation driven by real-time AI tasks.
The approach works as follows. The first step is to train the RL algorithm on historical data from previously performed AI tasks. This data contains information such as the computational needs, response time, and resource usage of each task. The data is fed into the RL algorithm, which is trained to make the best possible resource allocation decisions in real time.
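The paper does not specify the exact update rule used in this offline training step. As a hedged illustration, a tabular Q-learning variant might look like the sketch below, where the state discretization, the candidate action set, and the reward shaping are assumptions made for this example only.

```python
# Illustrative tabular Q-learning over historical task records.
# State discretization, action set, and reward shaping are assumptions
# for this sketch, not the paper's exact formulation.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
ACTIONS = [1, 2, 4, 8]               # candidate numbers of nodes to allocate
Q = defaultdict(float)               # Q[(state, action)] -> value

def discretize(cpu_demand: float, deadline_ms: float) -> tuple:
    """Bucket a task's recorded demand and deadline into a coarse state."""
    return (min(int(cpu_demand // 2), 4), 0 if deadline_ms < 100 else 1)

def reward(nodes: int, response_ms: float, deadline_ms: float) -> float:
    """Reward meeting the deadline, penalize the cost of allocated nodes."""
    met = 1.0 if response_ms <= deadline_ms else -1.0
    return met - 0.05 * nodes

def train(history: list, episodes: int = 50) -> None:
    for _ in range(episodes):
        for prev, nxt in zip(history, history[1:]):
            s = discretize(prev["cpu_demand"], prev["deadline_ms"])
            s_next = discretize(nxt["cpu_demand"], nxt["deadline_ms"])
            # epsilon-greedy choice among candidate allocations
            if random.random() < EPSILON:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[(s, x)])
            response_ms = prev["base_ms"] / a        # toy latency model
            r = reward(a, response_ms, prev["deadline_ms"])
            best_next = max(Q[(s_next, x)] for x in ACTIONS)
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

# Synthetic historical records: per-task demand, deadline, base latency.
history = [{"cpu_demand": random.uniform(1, 8),
            "deadline_ms": random.choice([80, 200]),
            "base_ms": random.uniform(100, 400)} for _ in range(200)]
train(history)
```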
IV. RESULTS AND DISCUSSION
The results and discussion evaluate the performance of the proposed reinforcement learning framework for managing resources effectively in an AI system. We performed extensive quantitative experiments in a simulator and compared the approach with several state-of-the-art resource management techniques. The findings showed that the reinforcement learning method enabled AI systems to allocate real-time resources dynamically, significantly enhancing their performance. It did so by considering many factors, such as workload, resource, and application characteristics. To verify the approach, various scenarios were chosen, and runs for each method were demonstrated in simulation alongside a comparison with the other prevailing techniques.
A. Sensitivity
A key challenge in developing cloud-based real-time artificial intelligence (AI) systems is utilizing resources efficiently. Such systems depend on a continuous and instantaneous decision cycle, so optimal resource use is essential. Efficient resource management in the context of AI systems means using the minimum possible hardware, network, and storage resources while keeping system performance optimized. Fig. 3 shows the number of sessions in the load test.
Figure 3. Number of sessions in the load test
Reinforcement learning (RL) is a type of AI learning that enables an algorithm to adjust its actions based on previous experience. It means actively interacting with the world, performing actions and receiving feedback on those actions. The objective of using RL in cloud resource management is to find, for each state, the optimal action that makes the utilization of resources as efficient as possible.
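As a hedged illustration of how such a per-state choice could be read out of a learned value table, with occasional exploration retained, the snippet below reuses the assumed Q-table and action-set structure from the earlier training sketch; these names are our assumptions.

```python
# Illustrative epsilon-greedy policy readout from a learned Q table.
# Builds on the assumed Q/ACTIONS structures from the earlier sketch.
import random

def select_action(Q: dict, state: tuple, actions: list, epsilon: float = 0.05):
    """Mostly exploit the best-known allocation; occasionally explore."""
    if random.random() < epsilon:
        return random.choice(actions)                            # exploration
    return max(actions, key=lambda a: Q.get((state, a), 0.0))    # exploitation

# Example: pick an allocation for a "high demand, tight deadline" state.
example_Q = {((4, 0), 4): 0.7, ((4, 0), 8): 0.9}
print(select_action(example_Q, (4, 0), [1, 2, 4, 8]))
```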
B. Accuracy
For real-time AI systems running in the cloud, efficient usage of resources is essential to ensure the system works as intended. Fast and accurate decision-making requires high computing power, so resources must be used efficiently. Reinforcement learning (RL) lets an agent learn quickly how to act in its environment so as to maximize a reward signal, much as humans learn from experience. As a result, it can optimize and tune resource allocation over time, depending on workload changes or the performance levels of real-time AI systems in the cloud. Fig. 4 shows the video stream throughput with custom metrics.
Figure 4. Video Stream throughput with custom metrics
Predictive modelling is a critical piece of the technology that helps ensure efficient resource utilization. It involves analyzing past data and forecasting what resources will be needed so they can be allocated efficiently. The reinforcement learning algorithm that identifies resource needs can then choose when and how to assign resources to different tasks in good time, keeping idle resources to a minimum and utilization as high as possible.
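The paper does not name a specific forecasting model. As a minimal sketch of the idea, assuming a simple exponential moving average over past session counts (in the spirit of the session predictor in Fig. 5), the forecast could be translated into a pre-allocation as follows; the smoothing factor and sessions-per-node capacity are illustrative assumptions.

```python
# Minimal demand-forecasting sketch: exponential moving average over past
# session counts, used to pre-allocate capacity. The smoothing factor and
# the sessions-per-node capacity figure are illustrative assumptions.
def ema_forecast(history: list, alpha: float = 0.3) -> float:
    """Return a one-step-ahead forecast of demand."""
    forecast = history[0]
    for observed in history[1:]:
        forecast = alpha * observed + (1 - alpha) * forecast
    return forecast

def nodes_needed(forecast_sessions: float, sessions_per_node: int = 50) -> int:
    """Translate forecasted sessions into a node count (rounded up)."""
    return max(1, -(-int(forecast_sessions) // sessions_per_node))

sessions = [120, 180, 240, 310, 290, 350]       # past load-test samples
predicted = ema_forecast(sessions)
print(f"forecast={predicted:.0f} sessions -> allocate {nodes_needed(predicted)} nodes")
```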
C. Specificity
Real-time AI systems in the cloud need to efficiently
manage resources to meet their performance requirements.
However, the dynamic nature and usually high resource
demands of real-time AI systems mean that traditional resource management approaches often fall short. Reinforcement learning is therefore a good candidate for real-time AI systems where efficient resource allocation is critical. Fig. 5 shows the predictor's forecasted number of sessions.
Figure 5. Predictor — Forecasted number of sessions
In machine learning, reinforcement learning is a paradigm in which an agent learns to act in an environment by performing actions and obtaining rewards. In resource management for real-time AI systems, the environment is the cloud infrastructure, and the actions are the steps taken to balance capacity against demand (i.e., allocating or deallocating resources) for the specific tasks carried out by the AI. Rewards are tied to performance metrics, such as how fast the system responded or with what accuracy, while also accounting for the resources booked by the algorithm.
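To make this reward structure concrete, here is a hedged sketch of such a signal; the weights, the deadline threshold, and the metric names are our assumptions for illustration, not the paper's exact definition.

```python
# Illustrative reward signal for the allocate/deallocate actions described
# above: reward good response time and accuracy, penalize booked resources.
# Weights and thresholds are assumptions for this sketch.
from dataclasses import dataclass

@dataclass
class StepMetrics:
    response_ms: float    # observed response time for the step
    accuracy: float       # task accuracy in [0, 1]
    nodes_booked: int     # resources currently allocated

def reward(m: StepMetrics,
           deadline_ms: float = 100.0,
           w_latency: float = 1.0,
           w_accuracy: float = 1.0,
           w_cost: float = 0.05) -> float:
    latency_term = 1.0 if m.response_ms <= deadline_ms else -1.0
    return (w_latency * latency_term
            + w_accuracy * m.accuracy
            - w_cost * m.nodes_booked)

# A fast, accurate but heavily provisioned step vs. a lean step that misses the deadline.
print(reward(StepMetrics(80, 0.95, 12)))
print(reward(StepMetrics(140, 0.95, 2)))
```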
D. Miss Rate
Effective resource management is essential for real-time AI systems in the cloud to perform well. These high-performance systems must process real-time big data with low latency. Yet cloud environments are constantly changing and resources are limited, so ensuring optimal performance around the clock is complex. That is where reinforcement learning (RL) steps in. Reinforcement learning is a machine-learning paradigm in which an agent receives rewards or penalties for the actions it takes in an environment and thereby learns, from one time step to the next, how to behave in that environment. Fig. 6 shows the average active streams per pod.
Figure 6. Average active streams per pod
The miss rate is a significant performance metric for RL-based resource management of AI systems. It is the fraction of times the system misses the deadline for a task. For real-time AI systems, a miss means the system cannot process data and make decisions within the time limit, which can degrade performance or even cause the system to fail.
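A brief sketch of how this metric could be computed from per-task deadline outcomes follows; the record format is an assumption, not the paper's evaluation harness.

```python
# Illustrative miss-rate computation: the fraction of tasks whose observed
# completion time exceeded their deadline. The record format is assumed.
def miss_rate(tasks: list) -> float:
    """tasks: each dict has 'completion_ms' and 'deadline_ms'."""
    if not tasks:
        return 0.0
    misses = sum(1 for t in tasks if t["completion_ms"] > t["deadline_ms"])
    return misses / len(tasks)

runs = [{"completion_ms": 90, "deadline_ms": 100},
        {"completion_ms": 130, "deadline_ms": 100},
        {"completion_ms": 95, "deadline_ms": 100}]
print(f"miss rate = {miss_rate(runs):.2f}")   # 0.33
```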
V. CONCLUSION
In conclusion, this work has shown how vital and influential RL techniques are for managing resources in real-time AI systems. The paper shows that RL can cope with the highly variable and unpredictable environment of the cloud, where both resource availability and prices change over time. A significant finding of the study is that using RL for resource management leads to much lower resource costs than the traditional approaches used in the literature. It achieves this by dynamically allocating resources and increasing the efficiency of the AI system in real time according to its requirements. Ultimately, RL provides better conditions and system performance at lower cost. The paper also highlights the requirement for intelligent and performance-effective resource management in real-time AI systems, as motivated by recent applications and services. It emphasizes the potential of RL, which can learn initially and keep improving its decision process, as a means well suited to long-term resource planning in real-time AI systems.
REFERENCES
[1] Kanungo, S. (2024). AI-driven resource management strategies for
cloud computing systems, services, and applications. World Journal
of Advanced Engineering Technology and Sciences, 11(2), 559-
566.
[2] Zhang, Y., Liu, B., Gong, Y., Huang, J., Xu, J., & Wan, W. (2024).
Application of Machine Learning Optimization in Cloud
Computing Resource Scheduling and Management. arXiv preprint
arXiv:2402.17216.
[3] Mansouri, M., Eskandari, M., Asadi, Y., & Savkin, A. (2024). A
cloud-fog computing framework for real-time energy management
in multi-microgrid system utilizing deep reinforcement learning.
Journal of Energy Storage, 97, 112912.
[4] Zhou, G., Tian, W., Buyya, R., Xue, R., & Song, L. (2024). Deep
reinforcement learning-based methods for resource scheduling in
cloud computing: A review and future directions. Artificial
Intelligence Review, 57(5), 124.
[5] Xu, Z., Gong, Y., Zhou, Y., Bao, Q., & Qian, W. (2024). Enhancing
kubernetes automated scheduling with deep learning and
reinforcement techniques for large-scale cloud computing
optimization. arXiv preprint arXiv:2403.07905.
[6] Szabó, G., & Pető, J. (2024). Intelligent wireless resource
management in industrial camera systems: Reinforcement
Learning-based AI-extension for efficient network utilization.
Computer Communications, 216, 68-85.
[7] Mangalampalli, S., Karri, G. R., Kumar, M., Khalaf, O. I., Romero,
C. A. T., & Sahib, G. A. (2024). DRLBTSA: Deep reinforcement
learning based task-scheduling algorithm in cloud computing.
Multimedia Tools and Applications, 83(3), 8359-8387.
[8] Kristian, A., Goh, T. S., Ramadan, A., Erica, A., & Sihotang, S. V.
(2024). Application of ai in optimizing energy and resource
management: Effectiveness of deep learning models. International
Transactions on Artificial Intelligence, 2(2), 99-105.
[9] Wang, Z., Wang, R., Wu, J., Zhang, W., & Li, C. (2024). Dynamic
Resource Allocation for Real Time Cloud XR Video Transmission:
A Reinforcement Learning Approach. IEEE Transactions on
Cognitive Communications and Networking.
[10] Gupta, S. K., Ranjith, C. P., Natarajan, R., & Mohideen, M. S. K.
(2024). An Energy Efficient Resource Allocation Framework for
Cloud System Based on Reinforcement Learning. In Advancements
in Science and Technology for Healthcare, Agriculture, and
Environmental Sustainability (pp. 325-332). CRC Press.
[11] Gong, Y., Huang, J., Liu, B., Xu, J., Wu, B., & Zhang, Y. (2024).
Dynamic resource allocation for virtual machine migration
optimization using machine learning. arXiv preprint
arXiv:2403.13619.
[12] Anoushee, M., Fartash, M., & Akbari Torkestani, J. (2024). An
intelligent resource management method in SDN based fog
computing using reinforcement learning. Computing, 106(4), 1051-
1080.
[13] Rajawat, A. S., Goyal, S. B., Kumar, M., & Malik, V. (2025).
Adaptive resource allocation and optimization in cloud
environments: Leveraging machine learning for efficient
computing. In Applied Data Science and Smart Systems (pp. 499-
508). CRC Press.
[14] Agarwal, S., Rodriguez, M. A., & Buyya, R. (2024). A Deep
Recurrent-Reinforcement Learning Method for Intelligent
AutoScaling of Serverless Functions. IEEE Transactions on
Services Computing.
[15] Lu, J., Yang, J., Li, S., Li, Y., Jiang, W., Dai, J., & Hu, J. (2024).
A2C-DRL: Dynamic Scheduling for Stochastic Edge-Cloud
Environments Using A2C and Deep Reinforcement Learning. IEEE
Internet of Things Journal.