About
Publications: 42
Reads: 1,762
Citations: 350
Additional affiliations
October 2011 - present
Publications (42)
We study scheduling problems on parallel dedicated machines and assume that a specific job can only be processed on one specific machine. We concentrate on solving scheduling problems involving convex resource allocation and address three of the most fundamental measures in scheduling theory, i.e., makespan, total load, and total weighted completio...
The paper presents complexity results and performance guarantees for a family of approximation algorithms for an optimisation problem arising in software testing and manufacturing. The problem is formulated as a partitioning of a set where each element has an associated subset in another set, but can also be viewed as a scheduling problem with infi...
A data gathering network is a distributed computer system where the workers have to transfer data to a node called the base station, which is responsible for their processing. Wireless sensor networks monitoring the environment or wired systems executing computations are examples of such networks. We analyze scheduling in data gathering networks wi...
This article considers scheduling in tree data gathering networks. The worker nodes of a network acquire datasets of known sizes at possibly different times. The datasets are sent to intermediate nodes which process them and then pass the results to the base station. Each dataset is assigned a due date by which it should arrive at the base station....
The publications on two-machine flow shop scheduling problems with job dependent storage requirements, where a job seizes a portion of the storage space for the entire duration of its processing, were motivated by various applications ranging from supply chains of mineral resources to multimedia systems. In contrast to the previous publications tha...
The paper considers scheduling on parallel machines under the constraint that some pairs of jobs cannot be processed concurrently. Each job has an associated weight, and all jobs have the same deadline. The objective is to maximise the total weight of on-time jobs. The problem is known to be strongly NP-hard in general. A polynomial-time algorithm...
In this paper, we analyze scheduling the gathering of multitype data in a star network. Certain numbers of datasets of different types have to be collected on a single computer, either by downloading them from remote nodes, or by producing them locally. The time required to download a dataset depends on its type and initial location, and the time needed t...
In this work, we study scheduling in star data gathering networks with background communications. The worker nodes of the network hold datasets that have to be transferred directly to the base station. The communication speed of each link changes with time, due to other applications using the network, independently of the speeds of other links. Our...
In this work, we analyze scheduling in a star data gathering network. Each worker node produces a dataset of known size at a possibly different time. The datasets have to be passed to the base station for further processing. A dataset can be transferred in many separate pieces, but each sent message incurs additional time overhead. The scheduling p...
In this paper, we analyze scheduling in data gathering networks with limited base station memory. The network nodes hold datasets that have to be gathered and processed by a single base station. A dataset transfer can only start if a sufficient amount of memory is available at the base station. As soon as a node starts sending a dataset, the base sta...
The paper establishes the NP-hardness in the strong sense of a two-machine flow shop scheduling problem with unit execution time (UET) operations, dynamic storage availability, job dependent storage requirements, and the objective to minimise the time required for the completion of all jobs, i.e. to minimise the makespan. Each job seizes the requir...
In this paper, scheduling in a star data gathering network is studied. The worker nodes of the network produce datasets that have to be gathered by a single base station. The datasets may be released at different moments. Each dataset is assigned a due date by which it should arrive at the base station. The scheduling problem is to organize the com...
In this paper, we analyze the applicability of various load-balancing methods in countering data skew in MapReduce computations. A MapReduce job consists of several phases: mapping, shuffling data, sorting and reducing. The distribution of the work in the last three phases is data-driven, and unequal distribution of the data keys may cause imbalance in...
This paper analyzes scheduling in a data gathering network with data compression. The nodes of the network collect some data and pass them to a single base station. Each node can, at some cost, preprocess the data before sending it, in order to decrease its size. Our goal is to transfer all data to the base station in given time, at the minimum pos...
We analyze scheduling multilayer divisible computations. Multilayer computations consist of a chain of parallel applications, such that one application produces input for the next one. A simple form of multilayer computations are MapReduce parallel applications. The operations of mapping and reducing are two divisible applications with precedence c...
Algorithms for mitigating imbalance of the MapReduce computations are considered in this paper. MapReduce is a new paradigm of processing big datasets in parallel. A MapReduce job consists of two phases of mapping and reducing. In the latter phase computation completion times may become imbalanced due to unequal distribution of the data. We propose...
In this paper scheduling communications in data gathering networks is analyzed. We study collecting information by a set of sensors, each of which stores the data in its memory buffer and then passes them to a base station. The network lifetime ends as soon as the first node is out of memory. We use a divisible load model to propose a communication...
In this paper we analyze handling partitioning skew in MapReduce computations. The basic MapReduce implementations strongly depend on the assumption that the data is partitioned evenly for reducing. However, in practical applications the data distribution is often skewed, which leads to decreased MapReduce system performance. Using divisible load t...
The main goal of this work is the analysis of several divisible load scheduling problems in heterogeneous distributed systems and the construction of algorithms solving these problems. First, single-round divisible load scheduling in star networks is analyzed. Fully polynomial time approximation schemes and approximation algorithms are proposed for...
In this paper we analyze MapReduce distributed computations as a divisible load scheduling problem. The two operations of mapping and reducing can be understood as two divisible applications with precedence constraints. A divisible load model of the computation, and two load partitioning algorithms are proposed. Performance limits of MapReduce comp...
In this paper we study divisible load scheduling in heterogeneous systems with high bandwidth. Divisible loads represent computations which can be arbitrarily divided into parts and performed independently in parallel. We propose fully polynomial time approximation schemes for two optimization problems. The first problem consists in finding the ma...
In this paper scheduling divisible loads in systems with limited memory is examined. Divisible loads are parallel computations which can be arbitrarily divided into parts independently processed on remote processors. The scheduling problem consists in distributing the load, taking into account communication and computation time, and limited memory...
In this paper we study divisible load scheduling in systems with limited memory. Divisible loads are parallel computations which can be divided into independent parts processed in parallel on remote computers, and the part sizes may be arbitrary. The distributed system is a heterogeneous single level tree. The total size of processor memories is to...
In this paper we study divisible load scheduling in systems with limited memory. Divisible loads represent computations which can be arbitrarily divided into parts and performed independently in parallel. The scheduling problem consists in distributing the load in a heterogeneous system taking into account communication and computation times, and l...
In this paper we analyze MapReduce distributed computations as a divisible load scheduling problem. The two operations of mapping and reducing can be understood as two divisible applications with precedence constraints. A divisible load model is proposed, and schedule dominance properties are analyzed. We investigate dominant schedule structures for...
In this paper we study divisible load scheduling in systems with limited memory. Divisible loads are parallel computations which can be divided into independent parts of arbitrary sizes and processed in parallel on remote computers. The problem consists in distributing the load taking into account communication time, computation time, and limited m...