Conference Paper

Neural Task Scheduling with Reinforcement Learning for Fog Computing Systems

Authors:
  • Shenzhen Institute of Artificial Intelligence and Robotics for Society

... The main focus is to minimize the task slowdown. Deep reinforcement learning and a pointer network architecture are combined to propose neural task scheduling [51]. ...
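The slowdown objective mentioned in this excerpt is commonly defined as a task's total time in the system divided by its ideal service time. A minimal Python sketch of that metric (field names are illustrative, not taken from the cited paper):

```python
def slowdown(arrival_time: float, completion_time: float, service_time: float) -> float:
    """Slowdown = (waiting + execution time) / ideal service time; 1.0 means no delay."""
    if service_time <= 0:
        raise ValueError("service_time must be positive")
    return (completion_time - arrival_time) / service_time

# Example: a task needing 2s of work that spent 5s in the system has slowdown 2.5.
print(slowdown(arrival_time=0.0, completion_time=5.0, service_time=2.0))
```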
Chapter
Full-text available
Internet applications generate massive amounts of data. For processing, the data is transmitted to the cloud. Time-sensitive applications require faster access; however, the limitation of the cloud is its connectivity with the end devices. Fog was developed by Cisco to overcome this limitation. Fog has better connectivity with the end devices, though with some limitations of its own. Fog works as an intermediate layer between the end devices and the cloud. When providing quality of service to end users, scheduling plays an important role, and scheduling a task based on the end users' requirements is a tedious problem. In this paper, we propose a cloud-fog task scheduling model which provides quality of service to end devices with proper security.
... Some existing works consider B_k only as a number of central processing unit (CPU) cycles [25]. In other scenarios, GPU and memory requirements are also considered during resource allocation for executing heavy and complex tasks such as AI and ML workloads [49]. ...
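To make the multi-resource requirement concrete, here is an illustrative sketch (not taken from the cited works) of a task demand covering CPU cycles, GPU, and memory, together with a simple feasibility check against a node's free capacity:

```python
from dataclasses import dataclass

@dataclass
class Demand:
    cpu_cycles: float   # required CPU cycles (the B_k of the excerpt, as one component)
    gpu: float          # fraction of a GPU
    memory_mb: float    # RAM in megabytes

def fits(task: Demand, free: Demand) -> bool:
    """A node can host the task only if every resource dimension fits."""
    return (task.cpu_cycles <= free.cpu_cycles
            and task.gpu <= free.gpu
            and task.memory_mb <= free.memory_mb)

print(fits(Demand(2e9, 0.5, 2048), Demand(8e9, 1.0, 4096)))  # True
```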
Article
Full-text available
Fog computing has been widely integrated into IoT-based systems, creating IoT-Fog-Cloud (IFC) systems to improve system performance and satisfy the quality of service (QoS) and quality of experience (QoE) requirements of end users (EUs). This improvement is enabled by computational offloading schemes, which perform task computation near the task generation sources (i.e., IoT devices, EUs) on behalf of remote cloud servers. To realize the benefits of offloading techniques, however, there is a need to incorporate efficient resource allocation frameworks which can deal effectively with intrinsic properties of the computing environment in IFC systems, such as resource heterogeneity of computing devices, varying requirements of computation tasks, high task request rates, and so on. While centralized optimization and non-cooperative game theory based solutions are applicable in a number of application scenarios, they fail to be efficient in many cases where global information and control are unavailable or cost-intensive to obtain in large-scale systems. The need for distributed computational offloading algorithms with low computational complexity has motivated a surge of solutions using matching theory. In the present review, we first describe the fundamental concepts of this emerging tool enabling distributed implementation in the computing environment. Then the key solution concepts and algorithmic implementations proposed in the literature are highlighted and discussed. Despite the power of matching theory, its full capability remains unexplored and unexploited in the literature. We thereby identify and discuss existing challenges and how matching theory can be applied to resolve them. Furthermore, new problems and open issues for application scenarios of modern IFC systems are also investigated thoroughly.
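As a concrete illustration of the matching-theory machinery this review surveys, the sketch below runs a many-to-one deferred-acceptance (Gale-Shapley style) matching between tasks and fog nodes with limited quotas. The preference lists and quotas are invented for illustration and are not taken from the article:

```python
def deferred_acceptance(task_prefs, node_prefs, quotas):
    """Tasks propose to nodes in preference order; nodes keep their best proposals up to quota."""
    rank = {n: {t: i for i, t in enumerate(p)} for n, p in node_prefs.items()}
    next_choice = {t: 0 for t in task_prefs}      # index of the next node each task will try
    matched = {n: [] for n in node_prefs}         # node -> currently accepted tasks
    free = list(task_prefs)
    while free:
        task = free.pop()
        if next_choice[task] >= len(task_prefs[task]):
            continue                              # task exhausted its list, stays unmatched
        node = task_prefs[task][next_choice[task]]
        next_choice[task] += 1
        matched[node].append(task)
        matched[node].sort(key=lambda t: rank[node][t])
        if len(matched[node]) > quotas[node]:
            free.append(matched[node].pop())      # evict the least preferred proposal
    return matched

# Illustrative instance: three tasks, two fog nodes with capacities 1 and 2.
task_prefs = {"t1": ["n1", "n2"], "t2": ["n1", "n2"], "t3": ["n2", "n1"]}
node_prefs = {"n1": ["t2", "t1", "t3"], "n2": ["t1", "t3", "t2"]}
print(deferred_acceptance(task_prefs, node_prefs, {"n1": 1, "n2": 2}))
```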
... Instead, [11] focuses on real-time task assignment but considers an evolution strategies approach instead of backpropagation for updating the weights of the DNN. In [12], an approach based on a recurrent neural network (RNN) is proposed, [13] illustrates a solution that explicitly targets vehicular networks, and [14] one aimed at crowdsensing. In a broader sense of scheduling, other works focus instead on resource allocation [15], but their task model does not fit the one studied in this paper. ...
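For readers unfamiliar with the evolution-strategies alternative to backpropagation mentioned in this excerpt, the following numpy sketch shows the basic idea in a simplified, single-worker form with an invented toy fitness function; it is not the scheme of [11]:

```python
import numpy as np

def es_step(weights, fitness, sigma=0.1, lr=0.02, population=50, rng=np.random.default_rng(0)):
    """One evolution-strategies update: perturb the weights, score each perturbation,
    and move in the direction of fitness-weighted noise (no gradients needed)."""
    noise = rng.standard_normal((population, weights.size))
    scores = np.array([fitness(weights + sigma * n) for n in noise])
    scores = (scores - scores.mean()) / (scores.std() + 1e-8)   # normalise for stability
    return weights + lr / (population * sigma) * noise.T @ scores

# Toy fitness: maximise the negative squared distance to a target weight vector.
target = np.ones(4)
w = np.zeros(4)
for _ in range(200):
    w = es_step(w, lambda x: -np.sum((x - target) ** 2))
print(np.round(w, 2))   # should move close to [1, 1, 1, 1]
```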
Conference Paper
Full-text available
Fog Computing is today a widely used paradigm that allows computation to be distributed across a geographic area. This not only makes it possible to implement time-critical applications but also opens the study of solutions that smartly organise the traffic among a set of Fog nodes, which constitute the core of the Fog Computing paradigm. A typical smart city setting is subject to continuously changing traffic conditions: a node that was saturated can become almost completely unloaded, which creates the need for an algorithm that meets the strict deadlines of the tasks while choosing the best scheduling policy according to a load situation that can vary at any time. In this paper, we use a Reinforcement Learning approach to design such an algorithm, starting from the power-of-random-choice paradigm used as a baseline. Through results from our delay-based simulator, we demonstrate how such a distributed reinforcement learning approach is able to maximise the rate of tasks executed within the deadline in a way that is fair to every node, both under a fixed load condition and in a real geographic scenario.
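The power-of-random-choice baseline that the paper starts from can be sketched in a few lines: sample d nodes uniformly at random and dispatch the task to the least loaded of them. Node names and loads below are illustrative:

```python
import random

def power_of_d_choices(loads: dict, d: int = 2, rng=random.Random(42)) -> str:
    """Pick d random candidate nodes and return the least loaded one."""
    candidates = rng.sample(list(loads), d)
    return min(candidates, key=loads.get)

loads = {"fog-a": 7, "fog-b": 2, "fog-c": 5, "fog-d": 9}
chosen = power_of_d_choices(loads, d=2)
loads[chosen] += 1   # the dispatched task adds one unit of load
print(chosen, loads)
```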
Article
In fog-assisted Internet-of-Things systems, it is common practice to cache popular content at the network edge to achieve high quality of service. Due to uncertainties in practice, such as unknown file popularities, the cache placement scheme design is still an open problem with unresolved challenges: 1) how to maintain time-averaged storage costs under budgets; 2) how to incorporate online learning to aid cache placement to minimize performance loss (a.k.a. regret); and 3) how to exploit offline historical information to further reduce regret. In this article, we formulate the cache placement problem with unknown file popularities as a constrained combinatorial multiarmed bandit problem. To solve the problem, we employ virtual queue techniques to manage time-averaged storage cost constraints, and adopt history-aware bandit learning methods to integrate offline historical information into the online learning procedure and handle the exploration-exploitation tradeoff. With an effective combination of online control and history-aware online learning, we devise a cache placement scheme with history-aware bandit learning called CPHBL. Our theoretical analysis and simulations show that CPHBL achieves a sublinear time-averaged regret bound. Moreover, the simulation results verify CPHBL's advantage over the deep reinforcement learning-based approach.
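Two of the ingredients this article combines can be sketched independently: a virtual queue that tracks violation of a time-averaged storage budget, and a UCB-style index that trades off estimated popularity against uncertainty. The snippet below is a generic illustration of those two building blocks, not the CPHBL algorithm itself:

```python
import math

def virtual_queue_update(q: float, cost: float, budget: float) -> float:
    """Q(t+1) = max(Q(t) + cost(t) - budget, 0): grows whenever the per-slot budget is exceeded."""
    return max(q + cost - budget, 0.0)

def ucb_index(mean_reward: float, pulls: int, t: int, c: float = 2.0) -> float:
    """Optimistic popularity estimate for a file: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return float("inf")            # always try an unseen file once
    return mean_reward + math.sqrt(c * math.log(t) / pulls)

q = 0.0
for t, cost in enumerate([3.0, 1.0, 4.0, 2.0], start=1):
    q = virtual_queue_update(q, cost, budget=2.5)
print(q)                               # residual backlog after four slots
print(ucb_index(mean_reward=0.6, pulls=10, t=100))
```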
Article
Full-text available
As an increasing number of traditional applications migrate to the cloud, achieving resource management and performance optimization in such a dynamic and uncertain environment becomes a big challenge for cloud-based application providers. In particular, job scheduling is a non-trivial task which is responsible for allocating massive numbers of job requests submitted by users to the most suitable resources while satisfying user QoS requirements as much as possible. Inspired by the recent success of using deep reinforcement learning techniques to solve AI control problems, in this paper we propose an intelligent QoS-aware job scheduling framework for application providers. A deep reinforcement learning-based job scheduler is the key component of the framework. It is able to learn to make appropriate online job-to-VM decisions for continuous job requests directly from its experience, without any prior knowledge. Experimental results using synthetic workloads and real-world NASA workload traces show that, compared with other baseline solutions, our proposed job scheduling approach can efficiently reduce average job response time (e.g., reduced by 40.4% compared with the best baseline for NASA traces), guarantee QoS at a high level (e.g., the job success rate is higher than 93% for all simulated changing-workload scenarios), and adapt to different workload conditions.
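A heavily simplified sketch of the kind of online job-to-VM decision loop described above: an epsilon-greedy agent keeps a running estimate of the (negative) response time of each VM and picks accordingly. This is a bandit-style toy, not the paper's deep RL scheduler, and the simulator function is invented:

```python
import random

rng = random.Random(0)
vms = ["vm-small", "vm-medium", "vm-large"]
value = {v: 0.0 for v in vms}      # running estimate of reward (negative response time)
count = {v: 0 for v in vms}

def simulate_response_time(vm: str) -> float:
    """Stand-in for the environment: smaller VMs respond more slowly (illustrative numbers)."""
    base = {"vm-small": 3.0, "vm-medium": 2.0, "vm-large": 1.2}[vm]
    return base + rng.uniform(0.0, 0.5)

def choose_vm(epsilon: float = 0.1) -> str:
    if rng.random() < epsilon:
        return rng.choice(vms)                     # explore
    return max(vms, key=lambda v: value[v])        # exploit current estimates

for _ in range(500):                               # one decision per incoming job
    vm = choose_vm()
    reward = -simulate_response_time(vm)
    count[vm] += 1
    value[vm] += (reward - value[vm]) / count[vm]  # incremental mean update

print(max(vms, key=lambda v: value[v]))            # should settle on the fastest VM
```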
Article
Full-text available
Fog Computing extends the Cloud Computing paradigm to the edge of the network, thus enabling a new breed of applications and services. Defining characteristics of the Fog are: a) low latency and location awareness; b) wide-spread geographical distribution; c) mobility; d) very large number of nodes; e) predominant role of wireless access; f) strong presence of streaming and real-time applications; g) heterogeneity. In this paper we argue that the above characteristics make the Fog the appropriate platform for a number of critical Internet of Things (IoT) services and applications, namely Connected Vehicle, Smart Grid, Smart Cities, and, in general, Wireless Sensors and Actuators Networks (WSANs).
Article
Full-text available
Project scheduling is concerned with single-item or small-batch production where scarce resources have to be allocated to dependent activities over time. Applications can be found in diverse industries such as construction engineering, software development, etc. Project scheduling is also increasingly important for make-to-order companies whose capacities have been cut down in order to meet lean management concepts. Likewise, project scheduling is very attractive for researchers, because the models in this area are rich and, hence, difficult to solve. For instance, the resource-constrained project scheduling problem contains the job shop scheduling problem as a special case. So far, no classification scheme exists which is compatible with what is commonly accepted in machine scheduling. Also, a variety of symbols are used by project scheduling researchers to denote one and the same subject. Hence, there is a gap between machine scheduling on the one hand and project scheduling on the other with respect to both a common notation and a classification scheme. As a matter of fact, an ever-growing number of papers on project scheduling is being published, and it becomes more and more difficult for the scientific community to keep track of what is really new and relevant. One purpose of our paper is to close this gap: we provide a classification scheme, i.e. a description of the resource environment, the activity characteristics, and the objective function, respectively, which is compatible with machine scheduling and which allows one to classify the most important models dealt with so far. We also propose a unifying notation. The second purpose of this paper is to review some of the recent developments. More specifically, we review exact and heuristic algorithms for the single-mode and the multi-mode case, for the time-cost tradeoff problem, for problems with minimum and maximum time lags, for problems with objectives other than makespan minimization and, last but not least, for problems with stochastic activity durations.
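To ground the resource-constrained project scheduling problem mentioned above, here is a small serial schedule-generation sketch with a single renewable resource and an earliest-feasible-start rule; the four-activity instance is invented for illustration and is not from the survey:

```python
# Activities: duration, demand on one renewable resource, and precedence constraints.
activities = {
    "A": {"dur": 3, "demand": 2, "preds": []},
    "B": {"dur": 2, "demand": 3, "preds": ["A"]},
    "C": {"dur": 4, "demand": 2, "preds": ["A"]},
    "D": {"dur": 2, "demand": 2, "preds": ["B", "C"]},
}
CAPACITY = 4

def serial_sgs(order):
    """Schedule activities in the given order at the earliest precedence- and resource-feasible time."""
    usage = {}                          # time slot -> resource units in use
    finish = {}
    for a in order:
        act = activities[a]
        t = max((finish[p] for p in act["preds"]), default=0)
        while any(usage.get(t + k, 0) + act["demand"] > CAPACITY for k in range(act["dur"])):
            t += 1                      # shift right until the whole duration fits
        for k in range(act["dur"]):
            usage[t + k] = usage.get(t + k, 0) + act["demand"]
        finish[a] = t + act["dur"]
    return finish, max(finish.values())

print(serial_sgs(["A", "B", "C", "D"]))   # finish times and makespan
```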
Conference Paper
Put together, the edge and fog form a large, diverse pool of computing and networking resources from different owners that can be leveraged for low-latency applications as well as for alleviating high traffic volumes in future networks, including 5G and beyond. This paper sets out a framework for the integration of edge and fog computing and networking, leveraging ongoing specifications by the ETSI MEC ISG and the OpenFog Consortium. It also presents the technological gaps that need to be addressed before such an integrated solution can be developed. These notably include challenges relating to the volatility of resources, heterogeneity of underlying technologies, virtualization of devices, and security issues. The framework presented is a launchpad for a complete solution under development by the 5G-CORAL consortium.
Conference Paper
Deep Neural Networks (DNNs) are powerful models that have achieved excellent performance on difficult learning tasks. Although DNNs work well whenever large labeled training sets are available, they cannot be used to map sequences to sequences. In this paper, we present a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure. Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector. Our main result is that on an English to French translation task from the WMT-14 dataset, the translations produced by the LSTM achieve a BLEU score of 34.7 on the entire test set, where the LSTM's BLEU score was penalized on out-of-vocabulary words. Additionally, the LSTM did not have difficulty on long sentences. For comparison, a strong phrase-based SMT system achieves a BLEU score of 33.3 on the same dataset. When we used the LSTM to rerank the 1000 hypotheses produced by the aforementioned SMT system, its BLEU score increases to 36.5, which beats the previous state of the art. The LSTM also learned sensible phrase and sentence representations that are sensitive to word order and are relatively invariant to the active and the passive voice. Finally, we found that reversing the order of the words in all source sentences (but not target sentences) improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
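A compact PyTorch sketch of the encoder-decoder structure this abstract describes: one multilayer LSTM encodes the (reversed) source into a fixed-size state, and a second LSTM decodes the target from that state. Hyperparameters are placeholders, and training details such as beam search and the WMT vocabulary handling are omitted:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hidden=512, layers=2):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, layers, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden, layers, batch_first=True)
        self.proj = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # src_ids is assumed to be already reversed, following the trick reported in the paper.
        _, state = self.encoder(self.src_emb(src_ids))        # fixed-size summary (h, c)
        out, _ = self.decoder(self.tgt_emb(tgt_ids), state)   # teacher forcing during training
        return self.proj(out)                                 # logits over the target vocabulary

model = Seq2Seq(src_vocab=1000, tgt_vocab=1000)
logits = model(torch.randint(0, 1000, (4, 12)), torch.randint(0, 1000, (4, 10)))
print(logits.shape)   # torch.Size([4, 10, 1000])
```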
Conference Paper
Resource management problems in systems and networking often manifest as difficult online decision making tasks where appropriate solutions depend on understanding the workload and environment. Inspired by recent advances in deep reinforcement learning for AI problems, we consider building systems that learn to manage resources directly from experience. We present DeepRM, an example solution that translates the problem of packing tasks with multiple resource demands into a learning problem. Our initial results show that DeepRM performs comparably to state-of-the-art heuristics, adapts to different conditions, converges quickly, and learns strategies that are sensible in hindsight.
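The "learn directly from experience" part typically means a policy-gradient update. The numpy sketch below shows a REINFORCE step for a softmax policy over scheduling actions, as a generic illustration of the technique rather than DeepRM's exact network or state encoding:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_step(theta, state, action, advantage, lr=0.01):
    """theta: (n_actions, n_features) linear policy; apply the log-likelihood gradient of a softmax policy."""
    probs = softmax(theta @ state)
    grad_log = -np.outer(probs, state)
    grad_log[action] += state                       # d log pi(a|s) / d theta
    return theta + lr * advantage * grad_log

theta = np.zeros((3, 4))                            # 3 scheduling actions, 4 state features
state = rng.standard_normal(4)
action = rng.choice(3, p=softmax(theta @ state))
theta = reinforce_step(theta, state, action, advantage=1.5)   # advantage = return - baseline
print(softmax(theta @ state))                       # the sampled action's probability increased
```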
We introduce a new neural architecture to learn the conditional probability of an output sequence with elements that are discrete tokens corresponding to positions in an input sequence. Such problems cannot be trivially addressed by existent approaches such as sequence-to-sequence and Neural Turing Machines, because the number of target classes in each step of the output depends on the length of the input, which is variable. Problems such as sorting variable sized sequences, and various combinatorial optimization problems belong to this class. Our model solves the problem of variable size output dictionaries using a recently proposed mechanism of neural attention. It differs from the previous attention attempts in that, instead of using attention to blend hidden units of an encoder to a context vector at each decoder step, it uses attention as a pointer to select a member of the input sequence as the output. We call this architecture a Pointer Net (Ptr-Net). We show Ptr-Nets can be used to learn approximate solutions to three challenging geometric problems -- finding planar convex hulls, computing Delaunay triangulations, and the planar Travelling Salesman Problem -- using training examples alone. Ptr-Nets not only improve over sequence-to-sequence with input attention, but also allow us to generalize to variable size output dictionaries. We show that the learnt models generalize beyond the maximum lengths they were trained on. We hope our results on these tasks will encourage a broader exploration of neural learning for discrete problems.
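The core of the pointer mechanism is a small attention computation whose softmax is taken over input positions rather than over a fixed vocabulary. A numpy sketch of that scoring step, with dimensions and random inputs as placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                      # hidden size
n_inputs = 5                               # length of the input sequence

W1 = rng.standard_normal((d, d))           # projects encoder states
W2 = rng.standard_normal((d, d))           # projects the current decoder state
v = rng.standard_normal(d)

enc = rng.standard_normal((n_inputs, d))   # encoder hidden state for each input element
dec = rng.standard_normal(d)               # decoder state at the current output step

# u_j = v^T tanh(W1 e_j + W2 d_i): one score per input position.
scores = np.tanh(enc @ W1.T + dec @ W2.T) @ v
pointer = np.exp(scores - scores.max())
pointer /= pointer.sum()                   # distribution over the 5 input positions

print(pointer, pointer.argmax())           # the "pointed-to" input element
```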
Article
Tasks in modern data parallel clusters have highly diverse resource requirements, along CPU, memory, disk and network. Any of these resources may become bottlenecks and hence, the likelihood of wasting resources due to fragmentation is now larger. Today's schedulers do not explicitly reduce fragmentation. Worse, since they only allocate cores and memory, the resources that they ignore (disk and network) can be over-allocated leading to interference, failures and hogging of cores or memory that could have been used by other tasks. We present Tetris, a cluster scheduler that packs, i.e., matches multi-resource task requirements with resource availabilities of machines so as to increase cluster efficiency (makespan). Further, Tetris uses an analog of shortest-running-time-first to trade-off cluster efficiency for speeding up individual jobs. Tetris' packing heuristics seamlessly work alongside a large class of fairness policies. Trace-driven simulations and deployment of our prototype on a 250 node cluster shows median gains of 30% in job completion time while achieving nearly perfect fairness.
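The packing step described above can be sketched as an alignment score: the dot product of a task's peak resource demands with a machine's available resources, optionally blended with a shortest-remaining-time preference. The weights and numbers below are illustrative, not from the paper:

```python
def packing_score(task_demand, machine_free):
    """Alignment score: dot product of the task's demand vector and the machine's free resources."""
    return sum(d * f for d, f in zip(task_demand, machine_free))

def combined_score(task_demand, machine_free, remaining_work, weight=0.5):
    """Blend packing alignment with a shortest-remaining-time-first preference."""
    return packing_score(task_demand, machine_free) - weight * remaining_work

# Resource vectors: (cpu, memory, disk, network), illustrative units.
task = (2, 4, 1, 1)
machines = {"m1": (4, 8, 2, 2), "m2": (8, 2, 4, 4)}
best = max(machines, key=lambda m: packing_score(task, machines[m]))
print(best)                                                    # m1 aligns better with this memory-heavy task
print({m: combined_score(task, machines[m], remaining_work=3) for m in machines})
```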
Article
Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with that situation weakened, so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond. (Thorndike, 1911) The idea of learning to make appropriate responses based on reinforcing events has its roots in early psychological theories such as Thorndike's "law of effect" (quoted above). Although several important contributions were made in the 1950s, 1960s and 1970s by illustrious luminaries such as Bellman, Minsky, Klopf and others (Farley and Clark, 1954; Bellman, 1957; Minsky, 1961; Samuel, 1963; Michie and Chambers, 1968; Grossberg, 1975; Klopf, 1982), the last two decades have witnessed perhaps the strongest advances in the mathematical foundations of reinforcement learning, in addition to several impressive demonstrations of the performance of reinforcement learning algorithms in real world tasks. The introductory book by Sutton and Barto, two of the most influential and recognized leaders in the field, is therefore both timely and welcome. The book is divided into three parts. In the first part, the authors introduce and elaborate on the essential characteristics of the reinforcement learning problem, namely, the problem of learning "policies" or mappings from environmental states to actions so as to maximize the amount of "reward"