
New Approximation Algorithms for Some Dynamic Storage Allocation Problems


Shuai Cheng Li¹, Hon Wai Leong¹, and Steven K. Quek²
¹ School of Computing, National University of Singapore, 3 Science Drive 2, Singapore 117543
² School of Information Technology, Nanyang Polytechnic, Singapore
Abstract. The offline dynamic storage allocation (DSA) problem has recently received some renewed attention and several new results have been reported. The problem is NP-complete and the best known result for the offline DSA is a polynomial time 3-approximation algorithm [Gerg99]. Better ratios have been reported for special cases if restrictions are placed on the allowable sizes of the blocks [Gerg96,MuBh99]. In this paper, we present new techniques for solving special cases with blocks of restricted sizes and we obtain better approximation ratios for them. We first obtain results for small instances which are then used to solve the more general cases. Our main results are (i) a 4/3-approximation algorithm when the maximum block size $h = 2$ (previous best was 3/2); and (ii) a 1.7-approximation algorithm for the case $h = 3$ (previous best was $1\frac{11}{12}$).
1 Introduction
The dynamic storage allocation (DSA) problem is a classic combinatorial optimization problem that has a wide variety of applications, including dynamic memory allocation in operating systems [Robs74,Knut73,WJNB95].
Our study of the offline DSA is motivated by the berth allocation problem (BAP) in busy container transshipment ports (such as Singapore). Ships arriving at the port have fixed arrival and departure times and have to be berthed within a (straight-line) section of the port so that they can unload and load containers. Each vessel must be berthed entirely within the section. Effective allocation of berthing space to the ships is an important planning task in many ports in order to increase throughput, reduce delay, and optimize the use of scarce resources (wharf areas, cranes).
We show that the BAP is precisely the offline DSA problem. This DSA problem was first studied by Knuth, Robson and others [Robs74,Knut73] and has recently received some renewed attention. The problem is NP-complete [GJ79] and a series of approximation algorithms has been proposed in recent years [Kier88,Kier91,Gerg96,Gerg99]. The best known result is a 3-approximation algorithm by Gergov [Gerg99]. Better ratios have been reported for special cases if restrictions are placed on the allowable sizes of the blocks [Gerg96,MuBh99].
In this paper, we present new techniques for solving more special cases (of blocks with restricted sizes) that produce better approximation ratios. Our motivation for studying special cases is twofold: Firstly, we hope that results for small special cases will lead to better approximation algorithms for these special cases. Secondly, our research [Li02] has shown that these approximation algorithms for special cases can be adapted to solve the general case and produce good results in practice for the berth allocation problem.
This work was supported in part by the National University of Singapore under Grant R252-000-128-112.
K.-Y. Chwa and J.I. Munro (Eds.): COCOON 2004, LNCS 3106, pp. 339–348, 2004. © Springer-Verlag Berlin Heidelberg 2004
2 Problem Formulation and Previous Results
The offline DSA problem is formulated as follows¹: An instance $\mathcal{B}$ of the offline DSA is given by a set of blocks $\mathcal{B} = \{B_1, B_2, \ldots, B_n\}$. Each block $B_j$ ($1 \le j \le n$) is a triple of nonnegative integers $(s_j, a_j, d_j)$ where $s_j$ is the size of the requested block while $a_j$ and $d_j$ are the arrival time and departure time of $B_j$. A (memory) allocation $b(\mathcal{B}) = (b_1, b_2, \ldots, b_n)$ is a mapping that assigns to each block $B_j$ a nonnegative integer location $b_j$ such that for all $i, j$ ($1 \le i < j \le n$), either $[a_i, d_i) \cap [a_j, d_j) = \emptyset$ or $(b_i, b_i + s_i) \cap (b_j, b_j + s_j) = \emptyset$. Note that in the online version of the DSA, we do not know $\mathcal{B}$ in advance.
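The allocation condition can be checked mechanically. The following is a minimal sketch, not part of the original paper: the function name `is_valid_allocation` and the tuple layout `(s_j, a_j, d_j)` are illustrative assumptions. It treats time intervals as half-open and space intervals as open, as in the definition.

```python
from itertools import combinations

def is_valid_allocation(blocks, b):
    """Check the DSA allocation condition for every pair of blocks.

    blocks[j] = (s_j, a_j, d_j) and b[j] = b_j (hypothetical layout).
    """
    for (i, (si, ai, di)), (j, (sj, aj, dj)) in combinations(enumerate(blocks), 2):
        # [a_i, d_i) and [a_j, d_j) intersect iff max(a) < min(d).
        time_overlap = max(ai, aj) < min(di, dj)
        # (b_i, b_i + s_i) and (b_j, b_j + s_j) intersect iff max(b) < min(b + s).
        space_overlap = max(b[i], b[j]) < min(b[i] + si, b[j] + sj)
        if time_overlap and space_overlap:
            return False
    return True

blocks = [(2, 0, 4), (1, 2, 6)]               # two blocks that overlap in time
print(is_valid_allocation(blocks, [0, 2]))    # True: stacked without overlap
print(is_valid_allocation(blocks, [0, 1]))    # False: both would use (1, 2)
```

The pairwise check is quadratic in the number of blocks, which is fine for illustrating the definition but not meant as an efficient verifier.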
The DSA can be geometrically interpreted in 2-D (as illustrated in Figure 1) where the x-axis represents the time axis and the y-axis represents the linear storage. The block $B_j$ is represented by a rectangle whose x-extent is the time interval $[a_j, d_j)$ and whose y-extent is the (contiguous) space allocation denoted by $(b_j, b_j + s_j)$. The DSA problem is then to pack the collection of rectangles given by $\mathcal{B} = \{B_1, B_2, \ldots, B_n\}$ into a larger rectangle. Since the x-extent of each block is fixed, the rectangles are only free to move vertically and the allocation function $b$ merely determines their y positions.
Fig. 1. A DSA instance with two blocks.
For any allocation $b(\mathcal{B})$, we define the cost $L(b) = \max_{1 \le j \le n} \{b_j + s_j\}$. Geometrically, we can interpret $L(b)$ to be the minimum height of a rectangle that contains all the allocated rectangles. We distinguish two versions of the DSA. In the optimization version of the DSA, we are given an instance $\mathcal{B}$ of the DSA and we want to find a minimum cost allocation, namely, an allocation $b$ that minimizes $L(b)$. In the decision version of the DSA, we are given an instance $\mathcal{B}$ and a cost threshold $L_0$ and we want to find an allocation $b(\mathcal{B})$ with $L(b) \le L_0$.
¹ Our notation is similar to that of [Gerg96].
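As a small illustration of the two versions, the sketch below computes the cost of a given allocation and tests it against a threshold. It assumes the cost is the top edge of the highest allocated block, which matches the geometric interpretation above; the names `cost` and `meets_threshold` are hypothetical.

```python
def cost(blocks, b):
    """L(b): height of the lowest rectangle containing all allocated blocks."""
    return max((bj + s for (s, _, _), bj in zip(blocks, b)), default=0)

def meets_threshold(blocks, b, L0):
    """Decision-version check: does allocation b fit within height L0?"""
    return cost(blocks, b) <= L0

blocks = [(2, 0, 4), (1, 2, 6)]   # (s_j, a_j, d_j)
b = [0, 2]
print(cost(blocks, b))                 # 3
print(meets_threshold(blocks, b, 3))   # True
print(meets_threshold(blocks, b, 2))   # False
```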
The berth allocation problem (BAP) is precisely the decision version of the offline DSA problem if we model each ship by a block $B_j = (s_j, a_j, d_j)$, where the size $s_j$ corresponds to the length of the ship, while $a_j$ and $d_j$ correspond to the arrival time and departure time of the ship. The cost threshold $L_0$ corresponds to the length of the section. Then the allocation $b_j$ corresponds to the allocation of the ship to the location $(b_j, b_j + s_j)$ along the section. An allocation $b(\mathcal{B})$ with $L(b) \le L_0$ corresponds to a BAP solution where all the ships are berthed without overlap within a section of length $L_0$.
We note that the DSA is analogous to a 2-D rectangle packing problem –
that of packing the smaller rectangles into the bigger rectangle. In Figure 2, we
illustrate this viewpoint. We refer to the larger rectangle as the packing area.
Allocating a rectangle can also be viewed as “packing” the rectangle into the
packing area. For the rest of the paper, we shall use the terms “allocate” and
“pack” interchangeably.
Fig. 2. The DSA as a 2D packing problem. (a) A solution with width $W = 5$, $m_{t_1}(\mathcal{B}) = 2$, $m_{t_2}(\mathcal{B}) = 3$, and $m(\mathcal{B}) = 4$. (b) Partitioning the packing area into 5 tracks.
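A block $B_j$ is said to be active at time $t$ if $t \in [a_j, d_j)$, or equivalently, if $a_j \le t < d_j$. Let $\mathcal{B}(t)$ denote the set of blocks that are active at time $t$, namely, $\mathcal{B}(t) = \{B_j \mid t \in [a_j, d_j)\}$. Then, the local density at time $t$ (denoted by $m_t(\mathcal{B})$) is defined to be the total size of the active blocks at time $t$, namely, $m_t(\mathcal{B}) = \sum_{B_j \in \mathcal{B}(t)} s_j$. The density of the instance $\mathcal{B}$ (denoted by $M(\mathcal{B})$) is the maximum local density, namely, $M(\mathcal{B}) = \max_t \{m_t(\mathcal{B})\}$. Given an instance $\mathcal{B}$ of the DSA, it is easy to see that $L(b) \ge M(\mathcal{B})$ for any allocation $b(\mathcal{B})$.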
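The two density notions translate directly into code. This is a minimal sketch with hypothetical names (`local_density`, `density`); it relies on the observation that the local density only increases at arrival times, so the maximum can be found by checking those times alone.

```python
def local_density(blocks, t):
    """m_t(B): total size of the blocks (s, a, d) active at time t."""
    return sum(s for s, a, d in blocks if a <= t < d)

def density(blocks):
    """M(B): maximum local density, evaluated at arrival times only."""
    return max((local_density(blocks, a) for _, a, _ in blocks), default=0)

blocks = [(2, 0, 4), (1, 2, 6), (1, 3, 5)]
print(local_density(blocks, 2))   # 3
print(local_density(blocks, 4))   # 2 (the first block has departed)
print(density(blocks))            # 4, attained at t = 3
```

Since $L(b) \ge M(\mathcal{B})$, the value returned by `density` is a lower bound on the cost of any allocation for the instance.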
The DSA problem is NP-complete and a number of approximation algorithms have been proposed. The approximation ratio has been steadily reduced from $\Theta(\ln h)$ [Robs74] to 80 by Kierstead [Kier88]. Slusarek [Slus89] and, independently, Kierstead [Kier91] reduced the ratio to 6. Subsequently, this ratio was reduced to 5 by Gergov [Gerg96] and later to 3 [Gerg99]. The latter result is the best-known general result to date.
Definition 1. Let DSA($W$, $M$, $h$) denote the class of DSA instances $\mathcal{B}$ where
(1) the maximum block size is $h$ (namely, for all $j$, $1 \le s_j \le h$),
(2) the density of the instance is $M$ ($M(\mathcal{B}) = M$), and
(3) the cost threshold $L_0 = W$.
Definition 2. An instance $\mathcal{B} \in$ DSA($W$, $M$, $h$) is said to be feasible if there exists an allocation $b(\mathcal{B})$ with $L(b) \le W$. The class DSA($W$, $M$, $h$) is said to be feasible if all instances in the class are feasible.
Then, the optimization version of the DSA problem can be formulated as follows: Let $L(M, h)$ denote the smallest $W$ such that DSA($W$, $M$, $h$) is feasible. For a fixed $h$, let $L(h) = \lim_{M \to \infty} \frac{L(M,h)}{M}$. We have the following results:
Theorem 1. [GJ79] $L(M, 1) = M$ for all $M \ge 1$. Namely, $L(1) = 1$.
Theorem 2. [Gerg99] $L(M, h) \le 3M$ for all $h \ge 1$. Namely, $L(h) \le 3$.
Results for several special cases have also been reported. For the DSA variant with two distinct block sizes (say $s_1$ and $s_2$), Gergov [Gerg96] showed that $L(M, \{s_1, s_2\}) \le 2M$ or $L(\{s_1, s_2\}) \le 2$². For the case of blocks with sizes 1 and $h$, it was shown [MuBh99] that the First Fit (FF) algorithm has approximation ratio $2 - 1/h$, namely, $L(\{1, h\}) \le 2 - 1/h$. Note that this implies $L(2) \le 1\frac{1}{2}$.
For the online DSA problem, we let ODSA($W$, $M$, $h$) denote the class of instances in which all the blocks have sizes $\le h$, the local density at any time $t$ is bounded by $M$, and $L_0 = W$. Let $N(M, h)$ denote the smallest $W$ such that ODSA($W$, $M$, $h$) is feasible. For a fixed $h$, let $N(h) = \lim_{M \to \infty} \frac{N(M,h)}{M}$. For the general case, the best result, due to Robson [Robs74], is that $0.5\,M \log_2 h \le N(M, h) \le 0.84\,M \log_2 h$. For small values of $h$, it has been shown that the following holds [Knut73,Robs74,LuNO74]:
Theorem 3. For the online DSA, we have
(a) $N(2) = 1.5$, and more precisely, $N(M, 2) = (3M - 1)/2$,
(b) $1\frac{2}{3} \le N(3) \le 1\frac{11}{12}$,
(c) $N(\{1, h\}) = 2 - 1/h$.
In this paper, we study DSA($W$, $M$, $h$) and obtain values of $L(M, h)$ for small $M$ and $h = 2$ and 3 using new algorithms for solving these basic cases. We then present new approximation algorithms based on channel partitioning that achieve improved bounds for $L(h)$, $h = 2, 3$. Both of these bounds are superior to the corresponding best known value for $N(h)$ (the online case). For $h = 2$, it is known that $L(2) \ge \frac{5}{4}$ while the best known upper bound is $\frac{3}{2}$, the same as $N(2)$. We improve this to $L(2) \le \frac{4}{3}$. For $h = 3$, we show that $L(3) \le 1.7$ while the best known bound for $N(3)$ is $1\frac{11}{12}$.
² Here, we extend the notation DSA($W$, $M$, $h$) to DSA($W$, $M$, $S$), where $S$ denotes the set of allowable block sizes. We also extend the notation of $L$ accordingly.
3 Solving Some Small Basic Cases
We first solve DSA($W$, $M$, $h$) classes with some small $M$ and $h = 2, 3$. In solving these cases, we develop algorithms that partition the packing area into two adjacent channels and use a channel flip procedure for effective allocation. These ideas are central to the approximation algorithms developed in the next section for $L(2)$ and $L(3)$.
In our packing algorithm for small DSA instances, it is convenient to divide the packing area into horizontal tracks of unit length as illustrated in Figure 2(b). For DSA($W$, $M$, $h$), we have tracks $1, 2, \ldots, W$, where track $k$ refers to the horizontal strip whose y-extent is $(k-1, k)$. For a block $B_j$ (with size $s_j$), setting $b_j = k$ is equivalent to allocating the (contiguous) tracks $k+1, k+2, \ldots, k+s_j$ to $B_j$, or equivalently, the y-extent of $B_j$ is $(k, k + s_j)$. For example, allocating a (unit size) block $B_j$ to track 1 means $b_j = 0$.
3.1 Basic Cases for h=2
Theorem 4. For the offline DSA problem, we have
(a) $L(2,2) = 2$, (b) $L(3,2) = 3$, (c) $L(4,2) = 5$, (d) $L(5,2) = 6$.
It is easily shown that DSA(2,2,2) is feasible using a First Fit algorithm. The algorithm for solving DSA(3,3,2) is a modified version of online first fit which we call First-Fit with Channel Flip (FF-CF). We first divide the 3 tracks into two channels: $C_1$ with 1 track and $C_2$ with two contiguous tracks. There are two possible configurations: (a) $C_1 = \{1\}$, $C_2 = \{2,3\}$ or (b) $C_1 = \{3\}$, $C_2 = \{1,2\}$. The actual configuration in use will vary during the allocation process as described below.
The algorithm FF-CF processes the blocks in $\mathcal{B}$ in increasing order of their arrival times. The blocks are packed (allocated) using a modified first fit algorithm trying channel $C_1$ first, and then $C_2$. Without loss of generality, assume that the configuration used is $C_1 = \{1\}$, $C_2 = \{2,3\}$. In a general step, we want to pack a block $B$ that arrives at time $t$. We distinguish the following cases:
Case 1: If $B$ has size 1, $B$ is allocated to the first free track using first fit.
Case 2(a): If $B$ has size 2 and channel $C_2$ is free, we allocate $B$ to $C_2$ (tracks 2, 3).
If $B$ has size 2 and channel $C_2$ is not free, then $C_1$ must be free and there is exactly one occupied track in $C_2$ (since the local density at time $t$ is at most 3). There are two sub-cases:
Case 2(b): If the middle track (track 2) of $C_2$ is free, then both tracks 1 and 2 are free. We perform a channel-flip operation at time $t$ in which we change the channel configuration to $C_2 = \{1,2\}$ and $C_1 = \{3\}$. We then allocate block $B$ to the new $C_2$.
Case 2(c): If the middle track (track 2) of $C_2$ is occupied, then the two free tracks (tracks 1 and 3) are “split” by the middle track. This is illustrated in Figure 3. In this case, we “exchange” the two tracks in channel $C_2$, namely, we exchange the allocation of all the size 1 blocks in $C_2$ that arrived after the most recent size 2 block. After the exchange operation, tracks 1 and 2 are free at time $t$ and this case reduces to Case 2(b) above. We perform a channel-flip operation at time $t$ and allocate $B$ to the new $C_2$.
Fig. 3. Example of track exchange and Channel Flip operations for DSA(3,3,2). (a) Channels $C_1$ and $C_2$ before block $B_8$ is assigned; tracks 1 and 3 are free. (b) The configuration after exchanging the “active size 1 blocks” in $C_2$ (blocks $B_6$ and $B_7$), and performing a channel flip. The block $B_8$ is assigned to the new $C_2$.
Algorithm 1: DSA-Alg(3,3,2,$\mathcal{B}$): [FF-CF Algorithm]
begin
  Sort the blocks in $\mathcal{B}$ by arrival time;
  Define channels $C_1 = \{1\}$ and $C_2 = \{2,3\}$;
  FS ← ∅; C2Start ← 0;
  for each block $B_j \in \mathcal{B}$ do
    if ($s_j = 1$) then
      Pack $B_j$ at the first free track using First Fit;
      if ($B_j$ is packed into $C_2$) then FS ← FS ∪ {$B_j$};
    else // ($s_j = 2$)
      if ($C_2$ is free)
        then Pack $B_j$ into $C_2$;
        else Channel-Flip(FS, $B_j$);
      C2Start ← $d_j$; FS ← ∅;
end
Procedure Channel-Flip(FS, $B_j$);
begin
  if (middle track of $C_2$ is occupied)
    then Exchange the track allocation of the blocks in FS on $C_2$;
  Flip the channel configuration of $C_1$ and $C_2$;
  Pack $B_j$ in the new $C_2$;
end {Flip}
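The pseudocode above can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: the function name `ff_cf`, the data layout, the order in which the two tracks of $C_2$ are tried by first fit, and the handling of equal arrival times are all assumptions. The sketch also assumes the input has block sizes in {1, 2} and density at most 3, and does not validate this.

```python
def ff_cf(blocks):
    """First-Fit with Channel Flip on tracks 1..3.

    blocks[j] = (s_j, a_j, d_j) with s_j in {1, 2}; density assumed <= 3.
    Returns the locations b_j in input order (track k means b_j = k - 1).
    """
    order = sorted(range(len(blocks)), key=lambda j: blocks[j][1])
    free_at = {1: 0, 2: 0, 3: 0}  # time from which each track is free
    c1, c2 = 1, (2, 3)            # configuration (a); track 2 is always in C2
    track = {}                    # block index -> lowest track it occupies
    fs = []                       # size-1 blocks put in C2 since the last size-2 block

    for j in order:
        s, a, d = blocks[j]
        if s == 1:
            # First fit: try C1 first, then the tracks of C2.
            k = next(k for k in (c1, *c2) if free_at[k] <= a)
            track[j], free_at[k] = k, d
            if k in c2:
                fs.append(j)
            continue
        if any(free_at[k] > a for k in c2):        # C2 is not free
            outer = next(k for k in c2 if k != 2)  # non-middle track of C2
            if free_at[2] > a:                     # middle occupied: exchange
                for i in fs:
                    track[i] = outer if track[i] == 2 else 2
                free_at[2], free_at[outer] = free_at[outer], free_at[2]
            # Channel flip: the old outer track becomes C1.
            c1, c2 = outer, tuple(sorted((c1, 2)))
        track[j] = c2[0]
        free_at[c2[0]] = free_at[c2[1]] = d
        fs = []
    return [track[j] - 1 for j in range(len(blocks))]

# The size-2 block arrives while track 2 is occupied, forcing an exchange
# of the earlier size-1 block from track 2 to track 3 before the flip.
print(ff_cf([(1, 0, 2), (1, 1, 5), (2, 3, 6)]))   # [0, 2, 0]
```

Because the problem is offline, the exchange step may rewrite the tracks of blocks that were placed earlier; the sketch does this by updating the `track` map for every block recorded in `fs`.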
The details of the algorithm are shown in Algorithm 1. Correctness of the algorithm is easily shown by proving the invariant that every channel-flip operation generates a new empty $C_2$ channel that is used to pack a size 2 block. The running time of this algorithm is $O(n \log n)$, since we need to sort all the blocks, and each block is allocated once and flipped at most once.
Proof of Theorem 4(c): To prove Theorem 4(c), namely that $L(4,2) = 5$, we first prove that DSA(4,4,2) is infeasible. Figure 4 shows an instance of DSA(4,4,2) that can be shown to be not feasible (using a case analysis).
Lemma 1. DSA(4,4,2) is not feasible.
Corollary 1. DSA($M$, $M$, $h$) is infeasible for all $h \ge 2$ and $M \ge 4$.
Fig. 4. An infeasible DSA(4,4,2) instance. A careful case analysis shows that this instance requires 5 tracks, namely $W = 5$. It also gives a lower bound of $\frac{5}{4}$ for $L(2)$.
Next, we show that DSA(5,4,2) is feasible by giving a constructive algorithm called First-Fit with Block Partitioning (FF-BP). Instead of partitioning the tracks, the algorithm partitions the blocks in $\mathcal{B}$ into disjoint subsets $\mathcal{B}_1$ and $\mathcal{B}_2$ which are then separately packed. For DSA(5,4,2), the algorithm FF-BP partitions $\mathcal{B}$ so that the subset $\mathcal{B}_1$ has density at most 2 ($m(\mathcal{B}_1) \le 2$), and the subset $\mathcal{B}_2$ has density at most 3 ($m(\mathcal{B}_2) \le 3$). Then the blocks in $\mathcal{B}_1$ are packed in 2 tracks using algorithm DSA(2,2,2) and the blocks in $\mathcal{B}_2$ are packed in 3 tracks using algorithm DSA(3,3,2).
Algorithm FF-BP goes as follows: blocks in $\mathcal{B}$ are processed in order of increasing arrival time. Whenever a block $B_j$ arrives at time $t$, it is appended to $\mathcal{B}_1$ if $m_t(\mathcal{B}_1) + s_j \le 2$. Otherwise, it is appended to $\mathcal{B}_2$. (Details are given in Algorithm 2.)
The correctness of this procedure is shown as follows: First, $m(\mathcal{B}_1) \le 2$ by construction, and so we only need to show that $m(\mathcal{B}_2) \le 3$. Blocks of size 1 are trivially placed. Consider a block $B_j$ of size 2 arriving at time $t$. Then, at most two tracks are occupied at time $t$ (since the total density is 4), namely, $m_t(\mathcal{B}_1) + m_t(\mathcal{B}_2) \le 2$. If $B_j$ cannot be appended to $\mathcal{B}_1$, then $m_t(\mathcal{B}_1) \ge 1$ and so $m_t(\mathcal{B}_2) \le 1$. Thus, $m_t(\mathcal{B}_2) \le 3$ after the block $B_j$ is appended to $\mathcal{B}_2$.
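Algorithm 2: DSA-Alg(5,4,2,$\mathcal{B}$): [FF-BP Algorithm]
begin
  Sort the blocks in $\mathcal{B}$ by arrival time;
  $\mathcal{B}_1$ ← ∅; $m_1 = 2$;
  $\mathcal{B}_2$ ← ∅; $m_2 = 3$;
  for each block $B_j$ do
    $t$ ← $a_j$;
    if ($m_t(\mathcal{B}_1) + s_j \le m_1$)
      then $\mathcal{B}_1$ ← $\mathcal{B}_1$ ∪ {$B_j$}
      else $\mathcal{B}_2$ ← $\mathcal{B}_2$ ∪ {$B_j$}
  Apply DSA-Alg(2, 2, 2, $\mathcal{B}_1$);
  Apply DSA-Alg(3, 3, 2, $\mathcal{B}_2$);
end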
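The partitioning loop of FF-BP can be sketched as below. Only the split into the two subsets is shown; packing each subset with the DSA(2,2,2) and DSA(3,3,2) procedures is left out. The names `ff_bp_partition` and `density` are hypothetical, and blocks are assumed to be `(s, a, d)` tuples.

```python
def density(blocks):
    """Maximum local density; it is attained at some arrival time."""
    return max((sum(s for s, a, d in blocks if a <= t < d) for _, t, _ in blocks),
               default=0)

def ff_bp_partition(blocks, m1=2):
    """Split blocks into B1 (local density kept <= m1) and B2 (the rest)."""
    b1, b2 = [], []
    for blk in sorted(blocks, key=lambda blk: blk[1]):   # increasing arrival time
        s, t, _ = blk
        load = sum(s1 for s1, a1, d1 in b1 if a1 <= t < d1)   # m_t(B1)
        (b1 if load + s <= m1 else b2).append(blk)
    return b1, b2

blocks = [(2, 0, 4), (1, 1, 3), (1, 2, 5), (2, 4, 6)]   # density 4, sizes <= 2
b1, b2 = ff_bp_partition(blocks)
print(b1, density(b1))   # [(2, 0, 4), (2, 4, 6)] 2
print(b2, density(b2))   # [(1, 1, 3), (1, 2, 5)] 2
```

For an instance of density 4 with sizes at most 2, the argument in the text guarantees that the second subset never exceeds density 3, so the two subsets fit in 2 + 3 = 5 tracks.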
The algorithm FF-BP also solves DSA(6, 5, 2) by partitioning the blocks into two subsets $\mathcal{B}_1$ and $\mathcal{B}_2$ having density 3 each. The proof is omitted here.
3.2 Basic Cases for h=3
Theorem 5. For the offline DSA problem, we have
(a) L(3,3) = 3 and (b) L(4,3) = 5.
The proof of Theorem 5(a), namely, that $L(3,3) = 3$, is simple. We first assign all the blocks of size 3. Then between any two consecutive blocks of size 3, we have an instance of DSA(3,3,2) that is feasible.
To prove Theorem 5(b), we first note that DSA(4,4,3) is infeasible by Corollary 1. Next, we prove that DSA(5,4,3) is feasible. The algorithm used is a generalization of the First Fit with Channel Flip procedure used in DSA-Alg(3,3,2,$\mathcal{B}$). We have two channels: $C_2$ with 2 tracks and $C_3$ with 3 tracks. Without loss of generality, we assume that $C_2 = \{1,2\}$ and $C_3 = \{3,4,5\}$.
Consider a block $B_j$ arriving at time $t$. If $s_j = 1$, it is trivially assigned using first fit. If $s_j = 2$ and $C_2$ is free, $B_j$ is assigned to $C_2$; otherwise, $B_j$ is assigned to $C_3$ using DSA-Alg(3,3,2)³. If $s_j = 3$ and $C_3$ is free, $B_j$ is assigned to $C_3$. If $C_3$ is not free, then $C_2$ must be free and there is exactly one occupied track. If track 3 is free, we perform a channel flip at $t$ so that the new $C_2 = \{4,5\}$ and the new $C_3 = \{1,2,3\}$ is free, and assign $B_j$ to the new $C_3$. (If track 3 is not free, we first invert the tracks in channel $C_3$ so that track 3 is now free.)
4 Approximation Algorithms for Special Cases
In this section, we show how the results for the small instances can be used to
give improved approximation ratios, namely, L(h), for small h.
4.1 Algorithms Based on Multi-level Block Partitioning
We extend the block partitioning ideas used in DSA-Alg(5,4,2) to a multi-level block partitioning algorithm called FF-MLBP. We illustrate the algorithm with the case $h = 2$. The blocks $B_j$ in $\mathcal{B}$ are considered in order of increasing arrival time. The algorithm proceeds in 2 phases as follows:
Phase 1: Let $n_1 = \lceil M/3 \rceil$. We define $n_1$ level-1 subsets denoted by $\mathcal{B}^1_1, \mathcal{B}^1_2, \ldots, \mathcal{B}^1_{n_1}$, each with maximum density $m_1 = 3$. Each block $B_j$ is assigned using first fit to the first subset that can accommodate it. If $B_j$ does not fit into any of these subsets, it is left unassigned (for Phase 2). (Note: We show later that all blocks of size 1 are assigned in Phase 1 and there are now only blocks of size 2 left.)
Phase 2: Now, define $n_2 = \lceil M/6 \rceil$ level-2 subsets denoted by $\mathcal{B}^2_1, \mathcal{B}^2_2, \ldots, \mathcal{B}^2_{n_2}$, each with maximum density $m_2 = 2$. It is trivial to assign the remaining blocks using first fit to these subsets.
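The two phases can be sketched in a single pass, since Phase 1 decisions never depend on Phase 2: each block tries the level-1 subsets first and falls through to the level-2 subsets only if none of them can take it. The sketch below covers only the partitioning for $h = 2$. The function name `ff_mlbp_h2` is hypothetical, and rounding the subset counts up to $\lceil M/3 \rceil$ and $\lceil M/6 \rceil$ is an assumption.

```python
def ff_mlbp_h2(blocks, M):
    """Partition blocks (s, a, d), s in {1, 2}, of density <= M into subsets.

    Returns (level1, level2): lists of subsets holding block indices, with
    local density capped at 3 and 2 respectively.
    """
    n1, n2 = -(-M // 3), -(-M // 6)           # ceil(M/3), ceil(M/6): assumed
    level1 = [[] for _ in range(n1)]
    level2 = [[] for _ in range(n2)]

    def load(subset, t):
        # Local density of one subset at time t.
        return sum(blocks[i][0] for i in subset
                   if blocks[i][1] <= t < blocks[i][2])

    candidates = [(sub, 3) for sub in level1] + [(sub, 2) for sub in level2]
    for j in sorted(range(len(blocks)), key=lambda j: blocks[j][1]):
        s, a, _ = blocks[j]
        for sub, cap in candidates:           # first fit over both levels
            if load(sub, a) + s <= cap:
                sub.append(j)
                break
        else:
            raise ValueError("block does not fit; is the density above M?")
    return level1, level2

level1, level2 = ff_mlbp_h2([(2, 0, 9), (2, 0, 9), (2, 0, 9)], M=6)
print(level1, level2)   # [[0], [1]] [[2]]
```

Each level-1 subset would then be packed on 3 tracks and each level-2 subset on 2 tracks, giving $3n_1 + 2n_2$ tracks in total.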
³ Note that this assignment to channel $C_3$ may require an “internal” channel flip within the 3 tracks of $C_3$.
Theorem 6. DSA($\frac{4}{3}M + 4$, $M$, 2) is feasible, namely, $L(2) \le \frac{4}{3}$.
Proof. We first prove the correctness of the algorithm. It is easy to see that all blocks of size 1 are assigned in Phase 1 since there are $n_1 = \lceil M/3 \rceil$ subsets each with maximum density of 3. Hence, there are only blocks of size 2 left in Phase 2. Suppose that a block $B$ of size 2 arriving at time $t$ cannot be assigned to any one of the level-2 subsets (in Phase 2). This implies that at time $t$, each of these $n_2$ subsets must have density 2 (since there are no blocks of height 1). Furthermore, each of the level-1 subsets must have density at least 2 at time $t$ (since block $B$ was not assigned in Phase 1). Thus, the total density at time $t$ (excluding block $B$) is at least $2n_1 + 2n_2 = 2\lceil M/3 \rceil + 2\lceil M/6 \rceil \ge M$. This contradicts the density bound $M$ for all blocks. Thus, all the blocks are successfully assigned in Phase 2.
We now consider the number of tracks needed. Each of the $n_1$ level-1 subsets can be solved using DSA(3,3,2) and each of the $n_2$ level-2 subsets with DSA(2,2,2). Thus, the total number of tracks needed is $3n_1 + 2n_2 \le \frac{4}{3}M + 4$.
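The track count can be worked out explicitly. The derivation below is a sketch that assumes the subset counts are rounded up, $n_1 = \lceil M/3 \rceil$ and $n_2 = \lceil M/6 \rceil$, and uses $\lceil M/k \rceil \le (M + k - 1)/k$ for integer $M$.

```latex
\begin{align*}
3n_1 + 2n_2
  &= 3\left\lceil \frac{M}{3} \right\rceil + 2\left\lceil \frac{M}{6} \right\rceil \\
  &\le 3\cdot\frac{M+2}{3} + 2\cdot\frac{M+5}{6}
   = \frac{4}{3}M + \frac{11}{3}
   < \frac{4}{3}M + 4 .
\end{align*}
```

Dividing by $M$ and letting $M \to \infty$ removes the additive constant, which is how the bound on the track count yields the ratio $4/3$.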
This result improves the previous bound for $L(2)$ from 3/2 (by [MuBh99]) to 4/3. This multi-level block partitioning approach can be generalized for $h = 3$
to give a bound of 11/6 ($\approx 1.833$) for $L(3)$ using a 3-phase block partitioning
approach with n1=M
However, a superior result is described in the next section.
4.2 A 1.7-Approximation Algorithm for h=3
In this section, we present a special multi-level block partitioning algorithm for $h = 3$ that gives a bound of 1.7 for $L(3)$. We first describe a customized block partitioning algorithm (called FF-BP4) that assigns blocks to a packing area with 4 tracks. We divide the packing area into two channels $C_1$ and $C_2$, each with 2 contiguous tracks, say $C_1 = \{1,2\}$ and $C_2 = \{3,4\}$. A block of size 1 can be assigned to any free track. A block of size 2 can only be assigned to channel $C_1$ or $C_2$ (but not to the tracks 2 and 3 even if they are free). A block of size 3 can be assigned to tracks $\{1,2,3\}$ or $\{2,3,4\}$ if there is room (with one of the channels flipped if necessary). Any block that cannot be assigned in this way is left unassigned.
Let $\mathcal{B}_1$ be the set of assigned blocks after running algorithm FF-BP4. We prove that $m_t(\mathcal{B}_1) \ge 2$ whenever a block $B$ cannot be assigned at time $t$. If $B$ is of size 1, then $m_t(\mathcal{B}_1) = 4$. If $B$ is of size 2, then $m_t(\mathcal{B}_1) \ge 3$ except for the case when tracks $\{2,3\}$ are free, in which case $m_t(\mathcal{B}_1) = 2$. If $B$ is of size 3, then $m_t(\mathcal{B}_1) \ge 2$, otherwise we always pack $B$ successfully.
We are now ready to present the multi-level block partitioning algorithm for $h = 3$. The algorithm is the same as the 3-phase FF-MLBP except that we use different algorithms to assign blocks to the subsets.
Phase 1: Define $n_1 = \lceil M/4 \rceil$ level-1 subsets, with $m_1 = 4$, and assign blocks using algorithm FF-BP4. After this phase, only blocks of sizes 2 and 3 remain.
Phase 2: Define $n_2 = \lceil M/10 \rceil$ level-2 subsets, with $m_2 = 6$, and assign blocks using the algorithm for DSA(6,6,{2,3}) from Theorem 6(d). All blocks of size 2 are assigned in this phase.
Phase 3: Define $n_3 = \lceil M/30 \rceil$ level-3 subsets, with $m_3 = 3$, and assign blocks using first fit.
Theorem 7. DSA($1.7M + 13$, $M$, 3) is feasible, namely, $L(3) \le 1.7$.
Proof. We first prove the correctness of the algorithm. Suppose that a block $B$ of size 2, arriving at time $t$, cannot be assigned in Phase 2. This implies that at time $t$, each of the $n_1$ level-1 subsets must have density at least 2, and each of the $n_2$ level-2 subsets must have density at least 5. Thus, the total density at time $t$ (excluding block $B$) is at least $2n_1 + 5n_2 = 2\lceil M/4 \rceil + 5\lceil M/10 \rceil \ge M$, which gives a contradiction. Thus, all blocks of size 2 are assigned in Phase 2.
Now, suppose that a block $B$ of size 3, arriving at time $t$, cannot be assigned in Phase 3. This implies that at time $t$, each of the $n_1$ level-1 subsets must have density at least 2, each of the $n_2$ level-2 subsets must have density at least 4, and each of the $n_3$ level-3 subsets must have density 3. Thus, the total density at time $t$ (excluding block $B$) is at least $2n_1 + 4n_2 + 3n_3 = 2\lceil M/4 \rceil + 4\lceil M/10 \rceil + 3\lceil M/30 \rceil \ge M$, which gives a contradiction. Thus, all blocks of size 3 are also assigned and the algorithm is correct.
The total number of tracks needed is $4n_1 + 6n_2 + 3n_3 \le 1.7M + 13$.
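The final bound can be checked term by term. The sketch below assumes the subset counts are rounded up, $n_1 = \lceil M/4 \rceil$, $n_2 = \lceil M/10 \rceil$ and $n_3 = \lceil M/30 \rceil$, and uses the simple estimate $\lceil x \rceil \le x + 1$.

```latex
\begin{align*}
4n_1 + 6n_2 + 3n_3
  &\le 4\left(\frac{M}{4} + 1\right) + 6\left(\frac{M}{10} + 1\right) + 3\left(\frac{M}{30} + 1\right) \\
  &= M + \frac{3}{5}M + \frac{1}{10}M + 13
   = 1.7M + 13 .
\end{align*}
```

The additive constant 13 is the sum of the three maximum densities, $4 + 6 + 3$, one extra subset per level in the worst case.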
References
[GJ79] M.R. Garey and D.S. Johnson: Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, (1979).
[Gerg96] J. Gergov: Approximation algorithms for dynamic storage allocation. European Symp. on Algorithms (ESA ’96), Springer, LNCS 1136, (1996), 52–61.
[Gerg99] J. Gergov: Algorithms for Compile-Time Memory Optimization. ACM-SIAM Symposium on Discrete Algorithms, (1999), 907–908.
[Kier88] H. A. Kierstead: The linearity of first-fit coloring of interval graphs. SIAM J. Disc. Math., 1, (1988), 526–530.
[Kier91] H. A. Kierstead: A polynomial time approximation algorithm for Dynamic Storage Allocation. Discrete Mathematics, 88, (1991), 231–237.
[Knut73] D.E. Knuth: Fundamental Algorithms: The Art of Computer Programming. Volume 1. Addison-Wesley Pub. (1973).
[Li02] S.C. Li: Algorithms for Berth Allocation Problem, M.Sc Thesis, Department of Computer Science, National University of Singapore (2002).
[LuNO74] M.G. Ludy, J. Naor, and A. Orda: Tight Bounds for Dynamic Storage Allocation. J. ACM, Vol 12, (1974), 491–499.
[MuBh99] P. K. Murthy, S. S. Bhattacharyya: Approximation algorithms and heuristics for dynamic storage allocation problem. UMIACS TR-99-31, University of Maryland, College Park, (1999).
[Robs74] J. M. Robson: Bounds for some functions concerning dynamic storage allocation. Journal of the ACM, Vol 21(3), (1974), 491–499.
[Robs77] J. M. Robson: Worst case fragmentation of first fit and best fit storage allocation strategies. Computer Journal, Vol 20, (1977), 242–244.
[Slus89] M. Slusarek: A Coloring Algorithm for Interval Graphs. Proc. 14th Math. Foundations of Computer Science, LNCS 379, Springer, (1989), 471–480.
[WJNB95] P. R. Wilson, M. S. Johnstone, M. Neely, and D. Boles: Dynamic storage allocation: A survey and critical review. Proceedings of International Workshop on Memory Management, LNCS 986, (1995).
... that determine ordering (in address space) of allocations that overlap in lifetime. While the offsets that comprise the solution to the MIP formulation are provably correct and optimal, the MIP is, in general, computationally intractable [37]. The best-known polynomial-time approximation is 2 + ε by Buchsbaum [11], over the previously 3 + ε best by Gergov [23]. ...
Full-text available
We study memory allocation patterns in DNNs during inference, in the context of large-scale systems. We observe that such memory allocation patterns, in the context of multi-threading, are subject to high latencies, due to \texttt{mutex} contention in the system memory allocator. Latencies incurred due to such \texttt{mutex} contention produce undesirable bottlenecks in user-facing services. Thus, we propose a "memorization" based technique, \texttt{MemoMalloc}, for optimizing overall latency, with only moderate increases in peak memory usage. Specifically, our technique consists of a runtime component, which captures all allocations and uniquely associates them with their high-level source operation, and a static analysis component, which constructs an efficient allocation "plan". We present an implementation of \texttt{MemoMalloc} in the PyTorch deep learning framework and evaluate memory consumption and execution performance on a wide range of DNN architectures. We find that \texttt{MemoMalloc} outperforms state-of-the-art general purpose memory allocators, with respect to DNN inference latency, by as much as 40\%.
... The model assumes that there exist a total ordering of processors (resources), i.e., they can be arranged in a line, and contiguous allocations maximize proximity relation between processors. Possible applications range from assigning memory to processes (Huang and Korf 2013) by the operating system, scheduling of ships on a long wharf (Li et al. 2004), usage of radio frequency spectra (Mitola and Maguire 1999) or scheduling check-in counters at airports (Duin and Sluis 2006). ...
Full-text available
In this paper, we show that sequence pair (SP) representation, primarily applied to the rectangle packing problems appearing in the VLSI industry, can be a solution representation of precedence constrained scheduling. We present three interpretations of sequence pair, which differ in complexity of schedule evaluation and size of a corresponding solution space. For each interpretation we construct an incremental precedence constrained SP neighborhood evaluation algorithm, computing feasibility of each solution in the insert neighborhood in an amortized constant time per examined solution, and prove the connectivity property of the considered neighborhoods. To compare proposed interpretations of SP, we construct heuristic and metaheuristic algorithms for the multiprocessor job scheduling problem, and verify their efficiency in the numerical experiment.
... For this purpose, we use techniques introduced in [LLQ04] to approximate DSA. Results in [LLQ04] can extend directly to SA in path networks giving approximation algorithms with factors 4 3 and 1.7 when the spectrum demands are bounded with 2 and 3, respectively. In what follows we use the same techniques to design constant-factor approximations for SA in binary trees when the spectrum demand is bounded by 6. ...
Full-text available
To face the explosion of the Internet traffic, a new generation of optical networks is being developed; the Elastic optical Networks (EONs). The aim with EONs is to use the optical spectrum efficiently and flexibly. The benefit of the flexibility is accompanied by more difficulty in the resource allocation problems. In this report, we study the problem of Spectrum Allocation in Elastic Optical Tree-Networks. In trees, even though the routing is fixed, the spectrum allocation is NP-hard. We survey the complexity and approximability results that have been established for the SA in trees and prove new results for stars and binary trees.
... A rectangle packing solution tells us both when programs should be run, as well as which memory addresses they should be assigned. Similar problems include scheduling when and where ships of different length can be berthed along a single, long wharf (Li, Leong, & Quek, 2004), as well as the allocation and scheduling of radio frequency spectra usage (Mitola & Maguire, 1999). Rectangle packing also appears when loading a set of rectangular objects on a pallet without stacking them. ...
Full-text available
We consider the problem of finding all enclosing rectangles of minimum area that can contain a given set of rectangles without overlap. Our rectangle packer chooses the x-coordinates of all the rectangles before any of the y-coordinates. We then transform the problem into a perfect-packing problem with no empty space by adding additional rectangles. To determine the y-coordinates, we branch on the different rectangles that can be placed in each empty position. Our packer allows us to extend the known solutions for a consecutive-square benchmark from 27 to 32 squares. We also introduce three new benchmarks, avoiding properties that make a benchmark easy, such as rectangles with shared dimensions. Our third benchmark consists of rectangles of increasingly high precision. To pack them efficiently, we limit the rectangles coordinates and the bounding box dimensions to the set of subset sums of the rectangles dimensions. Overall, our algorithms represent the current state-of-the-art for this problem, outperforming other algorithms by orders of magnitude, depending on the benchmark.
In this paper we consider the problem of arranging segments on parallel arcs drawn within a circular sector, to provide foundational work for the visualization of genomic regions in the study of pathogenic integration. The arcs as well as the start and end angles for each segment are pre-defined; our problem is to place each segment on an arc without having them overlap. There are no segments that span multiple arcs. For visualization purpose, the segments are to be easily distinguishable. To achieve that we consider various criteria that in a sense, place segments as far as possible from each other—for instance, maximizing the sum of inter-center distances between nearest segments. We show complexity results for some of the resultant problems, while providing approximation or heuristic solutions for others. Our algorithms have been implemented in JavaScript and made available at
In this paper, we discuss and introduce to the scheduling field a novel optimization objective: the half-perimeter proximity measure for scheduling under a network of temporo-spatial proximity relationships. The presented approach makes it possible to qualitatively express various reasons for scheduling certain jobs in close proximity, without resorting to quantitative, precisely defined consequences of such scheduling. Based on the correspondence between scheduling and rectangle packing problems in VLSI, we present an incremental Sequence Pair neighborhood evaluation algorithm as an essential tool for complex solution-search methods for both proximity scheduling and physical layout synthesis of integrated circuits. A numerical experiment showed that such an incremental approach is considerably faster than the naive approach, which evaluates a solution from scratch each time, at the cost of a small approximation error.
Conference Paper
The offline Dynamic Storage Allocation (DSA) problem is a well-known problem in combinatorial optimization. The problem is one of packing a set of blocks of arbitrary sizes on an area, with the objective of minimizing the area's usage along the y-axis, under the condition that the x-positions of the blocks are fixed (as given in the input). The problem has uses in memory management and berth allocation, and can potentially be used in the allocation of bandwidth resources in a network. Li et al. [4] considered the case of the problem where the width of the blocks (their dimension along the y-axis) is to be no larger than a given number. They obtained several approximation algorithms for special cases of the problem by first studying the feasibility of several subcases with further restrictions, such as limiting the y-dimension of the area in which the blocks are packed. We believe that such feasibility results on subcases of the problem can help in obtaining algorithms for other special cases of the problem. In this paper we propose a method to automatically derive such results. Our implementation of a simplified version of the proposed method in C++ correctly duplicated many of the earlier results obtained by Li et al. (4 pages)
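To make the problem statement above concrete, here is a minimal Python sketch of the offline DSA model. It is an illustration only, not the method of Li et al. or of the paper summarized above: the block representation `(start, end, size)`, the half-open time intervals, and the helper names `max_load`, `first_fit`, `is_valid`, and `height` are all assumptions introduced for this example. The placement rule is a plain first-fit heuristic, which carries no approximation guarantee here.

```python
from typing import List, Tuple

# Assumed representation: a block occupies the half-open x-interval
# [start, end) and needs `size` contiguous units along the y-axis.
Block = Tuple[int, int, int]

def max_load(blocks: List[Block]) -> int:
    """Largest total size of blocks alive at the same x-position.
    This is a lower bound on the y-extent of any valid allocation."""
    events = []
    for start, end, size in blocks:
        events.append((start, size))
        events.append((end, -size))
    events.sort()  # at equal x, releases (negative) sort before allocations
    load = best = 0
    for _, delta in events:
        load += delta
        best = max(best, load)
    return best

def first_fit(blocks: List[Block]) -> List[int]:
    """Process blocks by start position; give each the lowest y-offset
    that does not collide with an already placed, overlapping block."""
    placed = []  # (start, end, size, offset)
    offsets = [0] * len(blocks)
    for i in sorted(range(len(blocks)), key=lambda k: blocks[k][0]):
        s, e, h = blocks[i]
        busy = sorted((off, off + hh) for ss, ee, hh, off in placed
                      if ss < e and s < ee)
        y = 0
        for lo, hi in busy:
            if y + h <= lo:   # the gap below this busy range is big enough
                break
            y = max(y, hi)
        offsets[i] = y
        placed.append((s, e, h, y))
    return offsets

def is_valid(blocks: List[Block], offsets: List[int]) -> bool:
    """True if no two blocks overlap in both x and y."""
    for i in range(len(blocks)):
        si, ei, hi = blocks[i]
        for j in range(i + 1, len(blocks)):
            sj, ej, hj = blocks[j]
            x_overlap = si < ej and sj < ei
            y_overlap = (offsets[i] < offsets[j] + hj
                         and offsets[j] < offsets[i] + hi)
            if x_overlap and y_overlap:
                return False
    return True

def height(blocks: List[Block], offsets: List[int]) -> int:
    """Usage along the y-axis: the objective that DSA minimizes."""
    return max(off + h for (_, _, h), off in zip(blocks, offsets))

blocks = [(0, 4, 1), (1, 3, 2), (2, 6, 1), (3, 5, 2)]
offsets = first_fit(blocks)
print(offsets, height(blocks, offsets), max_load(blocks))
```

On this small instance the first-fit allocation happens to match the load lower bound, so it is optimal; in general the two values can differ.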
Conference Paper
We study the single-device Dynamic Storage Allocation (DSA) problem and the multi-device Balancing DSA problem in this paper. The goal is to dynamically allocate jobs to memory so as to minimize the usage of space without concurrency. The SRF problem is a variant of the DSA problem. Our results are as follows: (i) NP-completeness of the 2-SRF problem, the 3-DSA problem, and the DSA problem for jobs with agreeable deadlines; (ii) an improved 3-competitive algorithm for jobs with agreeable deadlines on the single-device DSA problem; (iii) a 4-competitive algorithm for jobs with agreeable deadlines on the multi-device Balancing DSA problem; (iv) lower bounds for jobs with agreeable deadlines: no non-clairvoyant algorithm can be (2 − ε)-competitive and no clairvoyant algorithm can be (1.54 − ε)-competitive; and (v) the first O(log L)-competitive algorithm for general jobs on the multi-device Balancing DSA problem without any assumption.
This paper considers algorithm selection for the berth allocation problem (BAP) under algorithm runtime limits. BAP consists in scheduling ships on berths subject to ship ready times and size constraints, for a certain objective function. For the purposes of strategic port capacity planning, BAP must be solved many times in extensive simulations, which are needed to account for uncertainties in ship traffic and handling times, and for alternative terminal designs. The algorithm selection problem (ASP) consists in selecting algorithms with the best performance for a considered application. We propose a new method of selecting a portfolio of algorithms that will solve the considered BAP instances and return good solutions. ....
We use an on-line algorithm for coloring interval graphs to construct a polynomial time approximation algorithm WIC for Dynamic Storage Allocation. The performance ratio for WIC is at most six; the best previous upper bound on the performance ratio for a polynomial time approximation algorithm for Dynamic Storage Allocation had been 80.
Conference Paper
We devise a new optimal on-line coloring algorithm for interval graphs. Then we show how it can be used to construct a polynomial time approximation strategy for the NP-hard dynamic storage allocation problem. Previously Chrobak, Slusarek, and Woodall applied the suboptimal greedy coloring (first fit) to the analysis of a certain allocation strategy. Due to that analysis (only partially carried out) the performance ratio of their allocation algorithm lies within [4.45, 40]. The new strategy described here achieves a performance ratio of 3.
Conference Paper
Given a program P in a structured programming language, we propose the first polynomial-time algorithm for compile-time memory allocation for the source objects of P (e.g. arrays, C structures) with a performance guarantee. Further, we present a new and simple O(n log n) time 3-approximation algorithm for (off-line) dynamic storage allocation and, thus, improve the best previous approximation ratio of 5.
Conference Paper
We present a new O(n log n)-time 5-approximation algorithm for the NP-hard dynamic storage allocation problem (DSA). The two previous approximation algorithms for DSA are based on on-line coloring of interval graphs and have approximation ratios of 6 and 80 [6, 7, 16]. Our result gives an affirmative answer to the important open question of whether the approximation ratio of DSA can be improved below the bound implied by on-line coloring of interval graphs [7, 16]. Our approach is based on the novel concept of a 2-allocation and on the design of an efficient transformation of a 2-allocation to an at most 5/2 times larger memory allocation. For the NP-hard variant of DSA with only two sizes of blocks allowed, we give a simpler 2-approximation algorithm. Further, by means of a tighter analysis of the widely used First Fit strategy, we show how the competitive ratio of on-line DSA can be improved to Θ(max{1, log(nk/M)}) where M, k, and n are upper bounds on the maximum number of simultaneously occupied cells, the maximum number of blocks simultaneously in the storage, and the maximum size of a block.
This paper is concerned with on-line storage allocation to processes in a dynamic environment. This problem has been extensively studied in the past. We provide a new, tighter bound for the competitive ratio of the well-known First Fit algorithm. This bound is obtained by considering a new parameter, namely the maximum number of concurrent active processes. We observe that this bound is also a lower bound on the competitive ratio of any deterministic on-line algorithm. Our second contribution is an on-line allocation algorithm which uses coloring techniques. We show that the competitive ratio of this algorithm is the same as that of First Fit. Furthermore, we indicate that this algorithm may be advantageous in certain applications. Our third contribution is to analyze the performance of randomized algorithms for this problem. We obtain lower bounds on the competitive ratio which are close to the best deterministic upper bounds. Key words. on-line algorithms, memory management, dyna...
It is shown that First-Fit coloring requires at most $40\omega$ colors to color an interval graph with clique size $\omega$. It follows that a polynomial time approximation algorithm for Dynamic Storage Allocation due to Chrobak and Slusarek has a constant performance ratio of 80.
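The First-Fit coloring rule referred to above can be sketched in a few lines of Python. This is a hedged illustration, not code from any of the cited papers: the half-open interval representation and the function names `first_fit_coloring` and `clique_size` are assumptions made for the example. Intervals are colored in the order given, each receiving the smallest color not used by an earlier overlapping interval.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # assumed half-open interval [start, end)

def first_fit_coloring(intervals: List[Interval]) -> List[int]:
    """Color intervals in presentation order with the smallest color
    not already used by an earlier, overlapping interval."""
    colors: List[int] = []
    for i, (s, e) in enumerate(intervals):
        used = {colors[j] for j in range(i)
                if intervals[j][0] < e and s < intervals[j][1]}
        c = 0
        while c in used:
            c += 1
        colors.append(c)
    return colors

def clique_size(intervals: List[Interval]) -> int:
    """Clique size of the interval graph: the maximum number of
    intervals that share a common point."""
    events = sorted([(s, 1) for s, _ in intervals] +
                    [(e, -1) for _, e in intervals])
    count = best = 0
    for _, delta in events:  # at equal points, ends sort before starts
        count += delta
        best = max(best, count)
    return best

# Four intervals forming a path, presented in an unfavorable order.
intervals = [(0, 2), (5, 8), (1, 4), (3, 6)]
colors = first_fit_coloring(intervals)
omega = clique_size(intervals)
print(colors, omega)
```

In this instance the clique size is 2 but First-Fit uses 3 colors, which shows that First-Fit can exceed $\omega$ while remaining well within the $40\omega$ bound stated above.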