Publications (69) · 48.49 Total Impact
ABSTRACT: A multiclass M/M/1 system, with service rate $\mu_i n$ for class-$i$ customers, is considered with the risk-sensitive cost criterion $n^{-1}\log E\exp\sum_i c_i X^n_i(T)$, where $c_i>0$, $T>0$ are constants and $X^n_i(t)$ denotes the class-$i$ queue length at time $t$, assuming the system starts empty. An asymptotic upper bound (as $n\to\infty$) on the performance under a fixed priority policy is attained, implying that the policy is asymptotically optimal when the $c_i$ are sufficiently large. The analysis is based on the study of an underlying differential game.
Electronic Communications in Probability 02/2014; 19(11):1–13. · 0.63 Impact Factor
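The criterion above can be estimated numerically. The sketch below simulates a two-class M/M/1 chain with preemptive priority to class 1 and Monte Carlo-averages $\exp\sum_i c_i X^n_i(T)$; the arrival rates `lam` and all numerical parameters are illustrative assumptions (the abstract specifies only the service scaling $\mu_i n$), not the paper's setup.

```python
import math
import random


def simulate(n, T, lam=(0.3, 0.3), mu=(1.0, 0.8), seed=0):
    # One CTMC path of a two-class M/M/1 queue with preemptive priority to
    # class 1, started empty. Arrival rates lam[i]*n are an assumption for
    # illustration; service rates mu[i]*n follow the abstract's scaling.
    rng = random.Random(seed)
    x = [0, 0]  # queue lengths X^n_1, X^n_2
    t = 0.0
    while True:
        arr = [lam[0] * n, lam[1] * n]
        served = 0 if x[0] > 0 else (1 if x[1] > 0 else None)
        srate = mu[served] * n if served is not None else 0.0
        total = arr[0] + arr[1] + srate
        t += rng.expovariate(total)  # time to next transition
        if t >= T:
            return x
        u = rng.uniform(0.0, total)
        if u < arr[0]:
            x[0] += 1
        elif u < arr[0] + arr[1]:
            x[1] += 1
        else:
            x[served] -= 1


def risk_sensitive_cost(n, T=1.0, c=(1.0, 1.0), reps=2000):
    # Monte Carlo estimate of n^{-1} log E exp(sum_i c_i X^n_i(T)).
    acc = 0.0
    for r in range(reps):
        x = simulate(n, T, seed=r)
        acc += math.exp(c[0] * x[0] + c[1] * x[1])
    return math.log(acc / reps) / n
```

Since the system starts empty and $c_i>0$, the estimate is always nonnegative; as $n$ grows it gives a rough picture of the limit the paper bounds from above.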
ABSTRACT: A Markovian queueing model is considered in which servers of various types work in parallel to process jobs from a number of classes at rates $\mu_{ij}$ that depend on the class, $i$, and the type, $j$. The problem of dynamic resource allocation so as to minimize a risk-sensitive criterion is studied in a law-of-large-numbers scaling. Letting $X_i(t)$ denote the number of class-$i$ jobs in the system at time $t$, the cost is given by $E\exp\{n[\int_0^T h(\bar X(t))\,dt + g(\bar X(T))]\}$, where $T>0$, $h$ and $g$ are given functions satisfying regularity and growth conditions, and $\bar X = \bar X^n = n^{-1}X(n\cdot)$. It is well known in an analogous context of controlled diffusion, and has been shown for some classes of stochastic networks, that the limit behavior, as $n\to\infty$, is governed by a differential game (DG) in which the state dynamics is given by a fluid equation for the formal limit $\varphi$ of $\bar X$, while the cost consists of $\int_0^T h(\varphi(t))\,dt + g(\varphi(T))$ and an additional term that originates from the underlying large-deviation rate function. We prove that a DG of this type indeed governs the asymptotic behavior, that the game has value, and that the value can be characterized by the corresponding Hamilton-Jacobi-Isaacs equation. The framework allows for both a fixed and a growing number of servers $N\to\infty$, provided $N = o(n)$.
SIAM Journal on Control and Optimization 12/2013; · 1.39 Impact Factor
Article: Predicting the Impact of Measures Against P2P Networks: Transient Behavior and Phase Transition
ABSTRACT: The paper has two objectives. The first is to study rigorously the transient behavior of some peer-to-peer (P2P) networks whenever information is replicated and disseminated according to epidemic-like dynamics. The second is to use the insight gained from this analysis in order to predict how efficient measures taken against P2P networks are. We first introduce a stochastic model that extends a classical epidemic model and characterize the P2P swarm behavior in the presence of free-riding peers. We then study a second model in which a peer initiates a contact with another peer chosen randomly. In both cases, the network is shown to exhibit phase transitions: a small change in the parameters causes a large change in the behavior of the network. We show, in particular, how phase transitions affect measures of content providers against P2P networks that distribute non-authorized music, books, or articles, and what the efficiency of counter-measures is. In addition, our analytical framework can be generalized to characterize the heterogeneity of cooperative peers.
IEEE/ACM Transactions on Networking 06/2013; 21(3):935–949. · 1.99 Impact Factor
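The phase-transition effect can be illustrated with a toy fluid model (this is a deliberate simplification, not the paper's stochastic model): if each seeder recruits new seeders at rate `lam` and departs at rate `mu`, the swarm size follows $\dot x = (\lambda-\mu)x$, and the boundary $\lambda=\mu$ separates exponential growth from extinction.

```python
def seeder_population(lam, mu, x0=1.0, T=10.0, dt=0.001):
    # Forward-Euler integration of dx/dt = (lam - mu) * x, a toy linear
    # birth-death fluid model: lam is the per-seeder replication rate,
    # mu the departure rate. Crossing lam = mu flips the swarm from
    # exponential growth to extinction -- a minimal phase transition.
    x = x0
    for _ in range(int(T / dt)):
        x += (lam - mu) * x * dt
    return x
```

A counter-measure that pushes the effective replication rate just below the departure rate thus changes the outcome qualitatively, not merely quantitatively, which is the kind of sensitivity the abstract refers to.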
ABSTRACT: Ergodic control for discrete time controlled Markov chains with a locally compact state space and a compact action space is considered under suitable stability, irreducibility, and Feller continuity conditions. A flexible family of controls, called action time sharing (ATS) policies, associated with a given continuous stationary Markov control, is introduced. It is shown that the long-term average cost for such a control policy, for a broad range of one-stage cost functions, is the same as that for the associated stationary Markov policy. In addition, ATS policies are well suited for a range of estimation, information collection, and adaptive control goals. To illustrate the possibilities we present two examples. The first demonstrates a construction of an ATS policy that leads to consistent estimators for unknown model parameters while producing the desired long-term average cost value. The second example considers a setting where the target stationary Markov control $q$ is not known but there are sampling schemes available that allow for consistent estimation of $q$. We construct an ATS policy which uses dynamic estimators for $q$ for control decisions, and show that the associated cost coincides with that for the unknown Markov control $q$.
SIAM Journal on Control and Optimization 01/2012; 50(1):171–195. · 1.38 Impact Factor
ABSTRACT: The paper has two objectives. The first is to study rigorously the transient behavior of some peer-to-peer (P2P) networks whenever information is replicated and disseminated according to epidemic-like dynamics. The second is to use the insight gained from this analysis in order to predict how efficient measures taken against P2P networks are. We first introduce a stochastic model which extends a classical epidemic model, and characterize the P2P swarm behavior in the presence of free-riding peers. We then study a second model in which a peer initiates a contact with another peer chosen randomly. In both cases the network is shown to exhibit phase transitions: a small change in the parameters causes a large change in the behavior of the network. We show, in particular, how phase transitions affect measures of content providers against P2P networks that distribute non-authorized music or books, and what the efficiency of counter-measures is.
INFOCOM, 2011 Proceedings IEEE; 05/2011
ABSTRACT: We consider a system with a single queue and multiple server pools of heterogeneous exponential servers. The system operates under a policy that always routes a job to the …
Queueing Systems 04/2011; 67:275–293. · 0.60 Impact Factor
ABSTRACT: … In the general case, this path might be exponentially long in the number of states and actions. We prove that the length of this path is polynomial if the MDP satisfies a coupling property. Thus we obtain a strongly polynomial algorithm for MDPs that satisfy the coupling property. We prove that discrete-time versions of controlled M/M/1 queues induce MDPs that satisfy the coupling property. The only previously known polynomial algorithm for controlled M/M/1 queues in the expected average cost model is based on linear programming (and is not known to be strongly polynomial). Our algorithm works both for the discounted and expected average cost models, and the running time does not depend on the discount factor.
1.1. Contribution. We introduce a new approach for solving MDPs in the discounted cost model and the expected average cost model. The approach is based on adding an artificial constraint with a parameter to obtain a continuum of constrained MDPs (CMDPs). We consider the whole range of values of the parameter, so that it also includes the value that an optimal policy of the MDP attains. Our approach is based on a new structural lemma which proves that the set of optimal policies of the CMDPs (over all parameter values) constitutes a path in a graph over the deterministic policies. We present an algorithm that finds all the deterministic policies along the path. The optimal policy of the MDP is simply the min-cost policy along this path. We cannot rule out the possibility that this path may be exponentially long, and hence the running time of this algorithm might be exponential. We overcome the problem of a long path by introducing a coupling property. We prove that, if the coupling property holds and if a specific artificial constraint is chosen, then the length of the path is polynomial (i.e., n · k). Hence the algorithm becomes strongly polynomial.
We prove that the coupling property is satisfied in discrete versions of controlled birth-death processes such as single-server controlled M/M/1 queues. Such controlled …
Mathematics of Operations Research 11/2009; 34:992–1007. · 0.92 Impact Factor
ABSTRACT: We consider an uplink power control problem where each mobile wishes to maximize its throughput (which depends on the transmission powers of all mobiles) but has a constraint on the average power consumption. A finite number of power levels are available to each mobile. The decision of a mobile to select a particular power level may depend on its channel state. We consider two frameworks concerning the state information of the channels of other mobiles: i) the case of full state information and ii) the case of local state information. In each of the two frameworks, we consider both cooperative as well as noncooperative power control. We manage to characterize the structure of equilibrium policies and, more generally, of best-response policies in the noncooperative case. We present an algorithm to compute equilibrium policies in the case of two noncooperative players. Finally, we study the case where a malicious mobile, which also has average power constraints, tries to jam the communication of another mobile. Our results are illustrated and validated through various numerical examples.
IEEE Transactions on Automatic Control 11/2009; · 3.17 Impact Factor
ABSTRACT: We generalize the geometric discount of finite discounted cost Markov Decision Processes to "exponentially representable" discount functions, prove existence of optimal policies which are stationary from some time N onward, and provide an algorithm for their computation. Outside this class, optimal "N-stationary" policies in general do not exist.
Operations Research Letters 01/2009; 37:51–55. · 0.62 Impact Factor
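One plausible instance of such a discount function is a finite mixture of geometric terms, $d(t)=\sum_i a_i b_i^t$; the sketch below builds one. The exact definition of "exponentially representable" in the paper may differ, so treat this form as an illustrative assumption.

```python
def mixed_geometric_discount(coeffs, bases):
    # Build d(t) = sum_i a_i * b_i^t, a finite mixture of geometric
    # discounts (0 < b_i < 1). A single pair (a, b) = (1, beta) recovers
    # the classical geometric discount beta^t; this form is only an
    # illustrative guess at the "exponentially representable" class.
    def d(t):
        return sum(a * (b ** t) for a, b in zip(coeffs, bases))
    return d
```

With coefficients summing to one, `d(0) == 1`, matching the usual normalization of a discount function at time zero.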
ABSTRACT: We investigate the existence of simple policies in finite discounted cost Markov Decision Processes when the discount factor is not constant. We introduce a class called "exponentially representable" discount functions. Within this class we prove existence of optimal policies which are eventually stationary from some time N onward, and provide an algorithm for their computation. Outside this class, optimal policies with this structure in general do not exist.
01/2008;
ABSTRACT: For a countable-state Markov decision process we introduce an embedding which produces a finite-state Markov decision process. The finite-state embedded process has the same optimal cost, and moreover, it has the same dynamics as the original process when restricted to the approximating set. The embedded process can be used as an approximation which, being finite, is more convenient for computation and implementation.
Automatica 12/2007; · 3.13 Impact Factor
ABSTRACT: We consider a queue with renewal arrivals and n exponential servers in the Halfin-Whitt heavy-traffic regime, where n and the arrival rate increase without bound so that a critical loading condition holds. Server k serves at rate $\mu_k$, and the empirical distribution of the $\mu_k$ is assumed to converge weakly. We show that very little information on the service rates is required for a routing mechanism to perform well. More precisely, we construct a routing mechanism that has access to a single sample from the service time distribution of each of $n^{1/2+\epsilon}$ randomly selected servers, but not to the actual values of the service rates, and whose performance is asymptotically as good as the best among mechanisms that have complete information on the $\mu_k$.
Mathematics of Operations Research 12/2007; · 0.92 Impact Factor
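The sampling step described above can be sketched as follows. This is only the information-collection part, under stated assumptions (exponential service times, so one sample per server is one `expovariate` draw); the function name and parameters are hypothetical, and the paper's actual routing rule built on top of this ranking is not reproduced here.

```python
import random


def choose_pool(n, rates, eps=0.1, rng=None):
    # Draw one service-time sample from each of n^(1/2 + eps) randomly
    # selected servers and return the selected indices ranked by that
    # single observation (shortest sample first). The router sees only
    # these samples, never the true rates in `rates`.
    rng = rng or random.Random(0)
    k = max(1, int(n ** (0.5 + eps)))
    sampled = rng.sample(range(n), k)
    obs = {i: rng.expovariate(rates[i]) for i in sampled}
    return sorted(sampled, key=obs.get)
```

The point of the result is that a preference order built from these $n^{1/2+\epsilon}$ noisy one-shot observations already suffices for asymptotically optimal routing.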
ABSTRACT: A user facing a multi-user resource-sharing system considers a vector of performance measures (e.g., response times to various tasks). Acceptable performance is defined through a set in the space of performance vectors. Can the user obtain a (time-average) performance vector which approaches this desired set? We consider the worst-case scenario, where other users may, for selfish reasons, try to exclude his vector from the desired set. For a controlled Markov model of the system, we give a sufficient condition for approachability, and construct appropriate policies. Under certain recurrence conditions, a complete characterization of approachability is then provided for convex sets. The mathematical formulation leads to a theory of approachability for stochastic games. A simple queueing example is analyzed to illustrate the applicability of this approach.
12/2007: pages 436–450;
ABSTRACT: Two types of traffic, e.g., voice and data, share a single synchronous and noisy communication channel. This situation is modelled as a system of two discrete-time queues with geometric service requirements which compete for the attention of a single server. The problem is cast as one in Markov decision theory with long-run average cost and constraint. An optimal strategy is identified that possesses a simple structure, and its implementation is discussed in terms of an adaptive algorithm of the stochastic-approximations type. The proposed algorithm is extremely simple, recursive and easily implementable, with no a priori knowledge of the actual values of the statistical parameters. The derivation of the results combines martingale arguments, results on Markov chains, O.D.E. characterization of the limit of stochastic approximations, and methods from weak convergence. The ideas developed here are of independent interest and should prove useful in studying broad classes of constrained Markov decision problems.
01/2006: pages 515–532;
ABSTRACT: We consider the problem of risk-sensitive control of a stochastic network. In controlling such a network, an escape time criterion can be useful if one wishes to regulate the occurrence of large buffers and buffer overflow. In this paper a risk-sensitive escape time criterion is formulated, which in comparison to the ordinary escape time criteria penalizes exits which occur on short time intervals more heavily. The properties of the risk-sensitive problem are studied in the large buffer limit, and related to the value of a deterministic differential game with constrained dynamics. We prove that the game has value, and that the value is the (viscosity) solution of a PDE. For a simple network, the value is computed, demonstrating the applicability of the approach.
Mathematics of Operations Research 02/2005; · 0.92 Impact Factor
ABSTRACT: We consider optimal control of a stochastic network, where service is controlled to prevent buffer overflow. We use a risk-sensitive escape time criterion, which in comparison to the ordinary escape time criteria heavily penalizes exits which occur on short time intervals. A limit as the buffer sizes tend to infinity is considered. In [2] we showed that, for a large class of networks, the limit of the normalized cost agrees with the value function of a differential game. The game's value is characterized in [2] as the unique solution to a Hamilton-Jacobi-Bellman partial differential equation (PDE). In the current paper we apply this general theory to the important case of a network of queues in tandem. Our main results are: (i) the construction of an explicit solution to the corresponding PDE, and (ii) drawing out the implications for optimal risk-sensitive and robust regulation of the network. In particular, the following general principle can be extracted. To avoid buffer overflow there is a natural competition between two tendencies. One may choose to serve a particular queue, since that will help prevent its own buffer from overflowing, or one may prefer to stop service, with the goal of preventing overflow of buffers further down the line. The solution to the PDE indicates the optimal choice between these two, specifying the parts of the state space where each queue must be served (so as not to lose optimality), and where it can idle.
Queueing Systems 02/2005; · 0.60 Impact Factor
ABSTRACT: We develop a methodology for studying "large deviations type" questions. Our approach does not require that the large deviations principle holds, and is thus applicable to a large class of systems. We study a system of queues with exponential servers, which share an arrival stream. Arrivals are routed to the (weighted) shortest queue. It is not known whether the large deviations principle holds for this system. Using the tools developed here we derive large deviations type estimates for the most likely behavior, the most likely path to overflow, and the probability of overflow. The analysis applies to any finite number of queues. We show via a counterexample that this system may exhibit unexpected behavior.
Tinbergen Institute, Tinbergen Institute Discussion Papers. 01/2005;
ABSTRACT: We develop a methodology for studying "large deviations type" questions. Our approach does not require that the large deviations principle holds, and is thus applicable to a large class of systems. We study a system of queues with exponential servers, which share an arrival stream. Arrivals are routed to the (weighted) shortest queue. It is not known whether the large deviations principle holds for this system. Using the tools developed here we derive large deviations type estimates for the most likely behavior, the most likely path to overflow, and the probability of overflow. The analysis applies to any finite number of queues. We show via a counterexample that this system may exhibit unexpected behavior.
Mathematical Methods of Operations Research 01/2004; · 0.54 Impact Factor
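The routing rule studied in the two entries above is simple to state in code. The sketch below implements only the (weighted) shortest-queue decision for a single arrival; the function name and the tie-breaking by lowest index are my assumptions, not details from the papers.

```python
def route_wsq(queues, weights):
    # Route one arrival to the (weighted) shortest queue: pick the index i
    # minimizing weights[i] * queues[i] (ties broken by lowest index),
    # increment that queue in place, and return the chosen index.
    j = min(range(len(queues)), key=lambda i: weights[i] * queues[i])
    queues[j] += 1
    return j
```

With all weights equal this reduces to classical join-the-shortest-queue; unequal weights bias arrivals toward the faster servers, which is the regime where the papers' overflow estimates apply.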
ABSTRACT: For the PTMDP model, we consider the problem of optimizing the expected discounted reward when rewards devalue by a discount factor at the beginning of each slow scale cycle. When N is large, initially stationary policies (i.s.p.'s) are natural candidates for optimal policies. Similar to turnpike policies, an initially stationary policy uses the same decision rule for some large number of epochs in each slow scale cycle, followed by a relatively short planning horizon of time-varying decision rules. In this paper, we characterize the form of the optimal value as a function of N, establish conditions ensuring the existence of near-optimal i.s.p.'s, and characterize their structure. Our analysis deals separately with the cases where the time-homogeneous part of the system has state-dependent and state-independent optimal average reward. As we illustrate, the results in these two distinct cases are qualitatively different.
1. Introduction. Certain stochastic control problems involve the control of fast processes that are influenced by slower ones. The slow processes cause relatively infrequent perturbations in the usual operation of the faster processes. Such multiple time scale phenomena arise in a variety of ways in practice. The variety of models for such phenomena that have received attention in the literature is correspondingly rich. A number of authors (e.g., Davis 1993, Bäuerle 2001) have considered so-called Piecewise Deterministic Processes (PDPs), in which a deterministically controlled system is periodically perturbed by uncontrolled random events. Such models arise, for example, in manufacturing systems where the usual flow of production, a deterministic process, is occasionally perturbed due to random machine failure. Other authors have considered so-called hierarchical decision-making models (e.g., Sethi and Zhang 1994), in which the control space consists of several different types of decision variables used to control related processes.
Typically, the different processes are controlled by decisions made with different frequency, leading to multiple time scales. These models suit, for example, manufacturing systems where the actual production operations constitute a process run by frequent, short-run decisions, while the resources available for production (e.g., number of machines, workers, etc.) are in the control of less frequent, long-run decisions. In some literature, multiple time scale phenomena have been treated in a Markov Decision Process (MDP) framework. Singularly perturbed MDPs (e.g., Delebecque and Quadrat 1981, Phillips and Kokotovic 1981) are perhaps the most prominent example, and have been used to model applications such as the control of queuing systems and hydroelectric power …
Mathematics of Operations Research 11/2003; 28:777–800. · 0.92 Impact Factor
Publication Stats
913 Citations
48.49 Total Impact Points
Institutions

1990–2013
Technion - Israel Institute of Technology, Electrical Engineering Group
Haifa, Haifa District, Israel

2009–2011
University of Nice Sophia Antipolis
Nice, Provence-Alpes-Côte d'Azur, France

2004
VU University Amsterdam, Faculty of Economics and Business Administration
Amsterdam, North Holland, Netherlands

1999
State University of New York
New York City, New York, United States

1987–1993
University of Maryland, College Park, Department of Electrical & Computer Engineering
College Park, MD, United States

1989
Ben-Gurion University of the Negev, Department of Mechanical Engineering
Be'er Sheva, Southern District, Israel