All content in this area was uploaded by Michał Jarski on Jan 22, 2020

Project 1-1

Pentominoes

3D Knapsack Problem

René Steeman

Samuele Torregrossa

Ali Alsaeedi

Lindalee Conradie

Max Persoon

Michal Jarski

Drago Stoyanov

A report presented for Project 1-1

Department of

Data Science and Knowledge Engineering

Maastricht University

January 22, 2020

Abstract

This paper presents candidate solutions to the 3D knapsack problem, a well-known optimisation problem whose objective is to fit a given set of parcels, each with a given size and value, into a 3D container in a way that maximizes the resulting score (the sum of the values of all packed parcels).

The Introduction establishes the idea of the 3D knapsack problem, discusses its importance, and presents the possible constraints.

The Methods section presents the methods suggested for solving the problem. The text focuses on three different algorithms and their modifications: Algorithm X, a genetic algorithm, and a greedy algorithm. The limitations of those algorithms are also presented in that section, which argues why including a greedy algorithm is important even though it does not yield the highest possible result.

The paper investigates how a variety of optimization questions can be answered, all of them sharing a common goal: to find an arrangement of the given pentominoes or parcels that yields the highest score, and therefore the best solution to the problem (or the best approximation of it).

It is hoped that this paper will bring more attention to the knapsack problem and encourage the development of new tools that allow for even better results for this optimization problem.

Contents

1 Introduction
2 Methods
  2.1 Greedy algorithm
    2.1.1 Density-greedy algorithm
    2.1.2 Value-greedy algorithm
  2.2 Algorithm X
    2.2.1 Modified Algorithm X
  2.3 Genetic algorithm
    2.3.1 Implementation details
    2.3.2 Pros and cons
  2.4 UML diagram
3 Experiments
  3.1 Greedy algorithm
  3.2 Algorithm X
  3.3 General case
4 Results
  4.1 Greedy algorithm
  4.2 Algorithm X
  4.3 General case
5 Conclusions
6 References
List of Figures
List of Tables

1 Introduction

Imagine that you are a big logistics company, delivering items to customers around the world from your storage centers, to which the items are brought from local delivery points in many urban areas. Some of the questions you could be asking yourself in such a scenario are: How do I maximize the profit from shipments? How do I pack the largest possible number of parcels in each lorry my company owns? What would be the most efficient packing order? Such a problem is known scientifically as the 3D knapsack problem. In the knapsack problem, and more specifically in its 3D version, the objective is to get the highest score by fitting boxes of different shapes and sizes (referred to later in this paper as types), each box type having its own unit score, into a container of a given (predefined) size. In a real-life scenario, such a container could be a truck's cargo area, a shipping container, the cargo hold of an airplane, etc.

The knapsack problem is a well-known combinatorial optimization problem and as such has been approached by many scientists. After reviewing the literature ourselves, however, we found the various studies quite hard to compare. It is appropriate to acknowledge that the complexity of the input and of the desired result greatly affects the choice of approach. In addition, the 3D knapsack problem can be subject to many constraints. If there is more than one constraint (for example, both a volume limit and a weight limit, where the volume and weight of each item are not related), we get the multiply-constrained knapsack problem, also called the multidimensional or m-dimensional knapsack problem.

In the following sections, all aspects of our work are discussed: the methods we used, i.e. the different algorithms together with a UML diagram (Section 2); the experiments that were performed using those methods (Section 3); the results of those experiments, accompanied by their interpretation (Section 4); and finally the research conclusions (Section 5). All of the referenced works can be found in the References.

2 Methods

All of the algorithms implemented use the definitions of parcels below:

Parcel type    Dimensions
A              1.0 × 1.0 × 1.0
B              1.0 × 1.5 × 2.0
C              1.5 × 1.5 × 1.5

2.1 Greedy algorithm

2.1.1 Density-greedy algorithm

The density-greedy algorithm implemented by our group attempts to find the most value-dense solution to the knapsack problem. The algorithm adapts to the type of item it is dealing with (pentominoes or standard parcels) in order to maximise the volume covered. The score used for ordering is the value of each pentomino or parcel divided by the volume of its type. Once these scores are calculated, the algorithm fills the container with the parcels or pentominoes that have the highest score first; when it cannot fit any more of those, it tries to add items with the next lower score, and so on, until not a single further item fits. This algorithm is not optimal, but it gives an approximate lower bound for the other algorithms: the score they produce should be at least as high as the score yielded by the greedy algorithm for them to be considered meaningful.
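The ordering step described above can be sketched as follows. This is a minimal illustration rather than the group's implementation: the dimensions come from the parcel table in Section 2, the values 3, 4 and 5 are borrowed from Question B in Section 4, and all function and variable names are assumptions made for this example.

```python
# Hypothetical sketch of the density-greedy ordering (names are illustrative).
PARCELS = {
    "A": {"dims": (1.0, 1.0, 1.0), "value": 3},
    "B": {"dims": (1.0, 1.5, 2.0), "value": 4},
    "C": {"dims": (1.5, 1.5, 1.5), "value": 5},
}


def volume(dims):
    """Volume of a box given its three side lengths."""
    length, width, height = dims
    return length * width * height


def density(parcel):
    """Value per unit of volume, the score used for ordering."""
    return parcel["value"] / volume(parcel["dims"])


def density_order(parcels):
    """Parcel type names sorted from most to least value-dense."""
    return sorted(parcels, key=lambda name: density(parcels[name]), reverse=True)


order = density_order(PARCELS)
```

With these values, type A has the highest density (3 per unit of volume), so a density-greedy fill would try to place A parcels first and fall back to the other types only when no further A parcel fits.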

2.1.2 Value-greedy algorithm

The value-greedy algorithm is another variant of the greedy algorithm that was implemented by our group. It places the parcel type with the maximum score first, then the one with the next lower score, and so on, without taking into account the volume of the parcels or pentominoes that are to be fitted in the container. It may therefore perform worse in scenarios where the parcel with the highest score covers the whole container, while parcels with a lower score are much smaller and could together yield a much better end result (a situation that the density-greedy version of this algorithm accounts for).
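The scenario in which the value-greedy variant loses to the density-greedy one can be shown with a toy comparison. The sketch below deliberately ignores geometry and treats the container as a pure volume budget, which is a simplification and not how the group's implementation works; the capacity, the two parcel types and the function name are hypothetical.

```python
# Toy comparison that ignores geometry: an item "fits" while volume remains.
def greedy_fill(capacity, parcel_types, key):
    """Fill a volume budget, taking parcel types in descending order of key.

    parcel_types is a list of (volume, value) pairs.
    """
    total_value = 0
    remaining = capacity
    for vol, value in sorted(parcel_types, key=key, reverse=True):
        count = int(remaining // vol)  # how many of this type still fit
        total_value += count * value
        remaining -= count * vol
    return total_value


# One large parcel that fills the container, and a small but dense parcel.
types = [(8, 10), (1, 3)]  # (volume, value)

value_greedy = greedy_fill(8, types, key=lambda t: t[1])
density_greedy = greedy_fill(8, types, key=lambda t: t[1] / t[0])
```

Here the value-greedy order places the single large parcel and scores 10, while the density-greedy order places eight small parcels and scores 24.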

Figure 1: An overview of the dancing links structure as found on www.geeksforgeeks.org.

2.2 Algorithm X

The version of Algorithm X that was implemented builds a sparse matrix that represents all the possible ways in which pieces can be placed in the container. Every location that an option (a placement) would fill is represented by a node. These nodes in turn link to their neighbors and to (column) header nodes. Header nodes are special nodes that are referred to by every node in their column and that contain some extra information. This type of implementation is called dancing links [Knuth, 2000] (Figure 1).

The algorithm tries to fill the container as much as possible by picking items to fill the space. Each time it picks an item, it removes any overlapping items from the list of items still available. The process continues until no remaining item can be fitted in. Then, it checks whether the obtained result is the best so far. The order in which items are chosen to fill the container is also important in the case of exact covers: items that fill the container more quickly are more likely to reduce the number of steps needed to fill in the rest, and thus reduce the amount of backtracking needed.

Backtracking occurs whenever no more items can be used to fill the container. The advantage of using dancing links is that, instead of fully removing an option, the algorithm only breaks the references to that object and thus stops using it. When backtracking occurs, those references can be re-created and the object can be used again. This avoids many insertion operations and thus makes the algorithm more efficient.
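The reference-breaking idea can be illustrated on a single circular doubly linked list. The sketch below is a simplified, hypothetical illustration (one list instead of the full two-dimensional node matrix, and invented names), not the data structure used in the project.

```python
class Node:
    """A node in a circular doubly linked list."""

    def __init__(self, label):
        self.label = label
        self.left = self
        self.right = self


def insert_right(anchor, node):
    """Link node into the list directly to the right of anchor."""
    node.left = anchor
    node.right = anchor.right
    anchor.right.left = node
    anchor.right = node


def unlink(node):
    """Make the neighbours skip node; node keeps its own pointers."""
    node.left.right = node.right
    node.right.left = node.left


def relink(node):
    """Undo unlink using the pointers that node still holds."""
    node.left.right = node
    node.right.left = node


def labels(head):
    """Labels of all nodes reachable from head, excluding head itself."""
    out = []
    current = head.right
    while current is not head:
        out.append(current.label)
        current = current.right
    return out


head = Node("head")
a, b, c = Node("a"), Node("b"), Node("c")
for node in (c, b, a):
    insert_right(head, node)  # resulting order: head, a, b, c
```

Calling `unlink(b)` hides `b` from any traversal, and `relink(b)` restores it in constant time because `b` never forgot its neighbours. Restores must happen in the reverse order of the removals, which is exactly the order in which backtracking undoes choices.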

2.2.1 Modified Algorithm X

Dancing links is intended neither to solve partial-cover problems nor to optimize a score, so a few adjustments were needed in order to use it for the purpose of the project. This was accomplished by checking the state of the (partial) solution whenever no new items can be placed. Since this is always the case at an endpoint (leaf) of the search tree, all these endpoints can be compared by the score of their solutions. At every leaf node, the partial solution is compared with the best solution so far and, if it is better, the best solution and the best score are updated. In addition, the modified algorithm filters out of the input any locations that can never be filled, so that it never tries to fill such a position. It does this by removing empty columns from the input.
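The two adjustments (comparing scores at the leaves and dropping locations that nothing can fill) can be sketched with plain Python sets. This sketch swaps the dancing links structure for a simple recursive enumeration, so it shows the logic only; the option format, the example data and all names are assumptions.

```python
def drop_uncoverable_cells(cells, options):
    """Keep only the cells that at least one option can fill."""
    coverable = set()
    for _, covered in options:
        coverable |= covered
    return cells & coverable


def best_partial_cover(options):
    """Return (best score, chosen option indices) over non-overlapping picks.

    Each option is a (value, frozenset of cells) pair. Options are tried in
    index order so that each combination is visited exactly once.
    """
    best = {"score": 0, "chosen": []}

    def search(start, used, chosen, score):
        extended = False
        for i in range(start, len(options)):
            value, covered = options[i]
            if used.isdisjoint(covered):
                extended = True
                search(i + 1, used | covered, chosen + [i], score + value)
        # Leaf of this enumeration: no later option can still be placed.
        if not extended and score > best["score"]:
            best["score"] = score
            best["chosen"] = chosen

    search(0, frozenset(), [], 0)
    return best["score"], best["chosen"]


# A strip of five cells; cell 4 cannot be filled by any option.
cells = {0, 1, 2, 3, 4}
options = [
    (4, frozenset({0, 1})),
    (5, frozenset({1, 2, 3})),
    (3, frozenset({2, 3})),
    (3, frozenset({0})),
]
fillable = drop_uncoverable_cells(cells, options)
result = best_partial_cover(options)
```

In this example the best partial cover combines the options with values 5 and 3 for a score of 8, even though no exact cover of the strip exists.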

Figure 2: Structure of Modified Algorithm X

The order in which shapes are considered was also optimized by basing it on the greedy algorithms, so that the shape with the highest value-to-volume ratio is considered first. Because shapes with a higher value density are tried first, the algorithm is able to get better scores.

To further optimize the solving of partial-cover problems, pruning is used. Every given number of layers in the search tree, the score of the partial solution is compared to a threshold: the best score so far multiplied by a chosen value. If the score is above this threshold, the algorithm continues exploring that branch; if not, the branch is abandoned (pruned). This avoids maintaining branches that are much less likely to improve the best score. Pruning is disabled for the first n layers (where n is defined in the scope of this algorithm), which leaves room for options that have a bad initial score but still have a good chance of drastic improvement.
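The pruning test itself is small enough to write down. The sketch below is an assumption-laden illustration: the parameter names follow the variables pruneCutoff and layerCutoff from Section 3.2, the default values are the settings reported with Table 1, depth is assumed to be counted from zero, a score exactly equal to the threshold is assumed to survive, and the periodic spacing of the checks (pruneWait) is not modelled.

```python
def should_prune(depth, partial_score, best_score,
                 prune_cutoff=0.85, layer_cutoff=15):
    """Decide whether the current branch of the search should be abandoned.

    depth          number of layers already explored on this branch
    partial_score  score of the partial solution on this branch
    best_score     best score found so far by the whole search
    """
    if depth < layer_cutoff:
        # Pruning is disabled in the first layers of the search tree.
        return False
    return partial_score < prune_cutoff * best_score
```

For example, with a best score of 100 a branch at depth 20 is abandoned at a partial score of 80 but kept at 90, and no branch is abandoned before depth 15.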

In the case of exact cover, once the algorithm is done, it will either have found a way to fill the entire space or have established that there is no way to fill it completely. In the case of optimizing the score, the program keeps track of the best solution and the corresponding score. The score is computed from the number of items used and their values. The solution with the highest score is then picked.

2.3 Genetic algorithm

The underlying concept is to train an agent to fill a space (a three-dimensional container) by creating a population of individuals and having them perform the task. These individuals are evaluated on their performance, and the best of each generation have a higher chance of passing on part of their genome to the next generations. We have also implemented a mutation mechanism to produce emergent properties in the gene pool. After a certain number of generations, we can expect to see an improvement in the overall performance of the population.

Our choices for the algorithm:

• Selection method: tournament selection with a tournament size equal to 5% of the population
  We chose tournament selection because it allowed us to retain some properties of individuals that did not perform as well as the top individuals. This produced a more diverse population than a simple elitist approach and was more time-efficient than other selection methods such as roulette wheel selection.

• Crossover: a single random crossover point
  We chose to use a single random crossover point in the mating function, as we did not find any benefit from multiple crossover points in our testing.

• Mutation method
  We implemented a mutation method that consists of two separate mutation rates and magnitudes. Our group arrived at this design after a lot of testing aimed at finding good values. During long training sessions we observed stagnation in the improvement of the fitness while the chromosome values still seemed un-optimized, which led us to believe that some options were left unexplored.
  We therefore implemented a small mutation rate with a small magnitude that promotes diversity, as well as a very small mutation rate with a sizable magnitude that promotes the emergence of new properties in the population.
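The three choices listed above can be sketched as small, independent functions. Only the 5% tournament fraction and the general design come from the text; the mutation rates and magnitudes, the use of uniform noise, and all names are hypothetical placeholders, since the report does not state the tuned values.

```python
import random


def tournament_select(population, fitnesses, rng, fraction=0.05):
    """Pick the fittest individual among a random sample of the population."""
    size = max(1, int(len(population) * fraction))
    contenders = rng.sample(range(len(population)), size)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]


def single_point_crossover(parent_a, parent_b, rng):
    """Splice two genomes of equal length (at least 2) at one random point."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(genome, rng, small=(0.05, 0.1), large=(0.005, 1.0)):
    """Apply two independent mutations, each given as (rate, magnitude).

    The small mutation is frequent and gentle; the large one is rare and
    strong. The default numbers are illustrative assumptions only.
    """
    small_rate, small_magnitude = small
    large_rate, large_magnitude = large
    mutated = []
    for weight in genome:
        if rng.random() < small_rate:
            weight += rng.uniform(-small_magnitude, small_magnitude)
        if rng.random() < large_rate:
            weight += rng.uniform(-large_magnitude, large_magnitude)
        mutated.append(weight)
    return mutated
```

A child for the next generation would then be produced by selecting two parents with `tournament_select`, combining them with `single_point_crossover`, and passing the result through `mutate`. Passing an explicit `random.Random` instance keeps runs reproducible.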

2.3.1 Implementation details

An individual (or chromosome) is a collection of weights, each of which represents how much a container parameter affects the rating of a certain move.
For example, after training we might see that the weight associated with the number of holes in the container is a negative number. When a move is analyzed, the number of holes in the board is multiplied by this weight and added to the rating for that move. This means that if the move creates fewer holes than all others, its rating will be higher (other parameters being equal) and it will be more likely to be selected.
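The rating of a move as a weighted sum of container parameters can be sketched as follows. The "holes" parameter is taken from the example in the text; the second parameter, the weight values and the move names are hypothetical.

```python
def rate_move(weights, features):
    """Weighted sum of the container parameters observed after a move."""
    return sum(weights[name] * features[name] for name in weights)


def best_move(weights, candidate_moves):
    """Return the candidate move with the highest rating."""
    return max(candidate_moves,
               key=lambda move: rate_move(weights, move["features"]))


# A chromosome with a negative weight on holes, as in the example above.
weights = {"holes": -2.0, "value_added": 1.0}
moves = [
    {"name": "left", "features": {"holes": 3, "value_added": 5}},
    {"name": "right", "features": {"holes": 0, "value_added": 4}},
]
chosen = best_move(weights, moves)
```

The "left" move adds more value but creates three holes, so its rating is -1, while the "right" move is rated 4 and is selected.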

Initially, each chromosome in the population consists of random weights. In all following generations, the weights are inherited through the mating of two individuals, which are selected preferentially from the highest-performing chromosomes, spliced at the crossover point, and possibly mutated.

Figure 3: Flowchart representing the genetic algorithm

During the training phase, the algorithm works by examining the candidate moves and executing the best-rated one at each step, always rating moves according to the current genome.
This process is carried out until the container is full, at which point the cumulative score of the pieces inside the container is assigned as the fitness of the individual and a new individual is evaluated.
At the end of a generation, the individuals are sorted by fitness, a tournament selection determines the individuals that will create the next generation, and the process repeats until the fitness of the top individuals reaches an acceptable value.

2.3.2 Pros and cons

A genetic algorithm is not guaranteed to find a complete cover of the container or an optimal solution. However, this approach is a good compromise in many categories. It is quite versatile and, after training, it does not require a lot of computational time to find a solution compared to other methods. A well-trained genome should allow an agent to find an acceptable solution in a reasonable amount of time.

2.4 UML diagram

Figure 4: UML Diagram

3 Experiments

3.1 Greedy algorithm

Two versions of the greedy algorithm were implemented, i.e. a value-greedy algorithm and a density-greedy algorithm. The greedy algorithm was developed as a reference for measuring the performance of the other algorithms (as discussed in Section 2.1), since it should compute a lower bound, most likely not optimal, on the highest score we can get with the given parcels. Therefore, most experiments on the greedy algorithm concentrated on comparing the two approaches (value and density) and on showing on graphs the difference in performance between the greedy algorithm and the other algorithms.

3.2 Algorithm X

The experiments on Algorithm X were performed on a computer (HP Omen 15-ce032nd) with the following specifications:

• RAM: 16 GB, 2400 MHz
• CPU: i7-7700HQ, 4 cores with hyperthreading, at 2.8 GHz

The experiments involved manipulating three independent variables:

• pruneWait: the number of branches that are skipped after each time pruning occurs,
• pruneCutoff: the multiplier applied to the maximum score found so far. If the score of the current solution is below the resulting threshold, then the branch is discontinued,
• layerCutoff: the layer up to which pruning does not occur.

Additionally, a fourth variable, maxTime (see footnote 1), is used to set the maximum running time when the variables measured are not time-dependent.

3.3 General case

The General case allows the user to input the number of parcels A, B, C or pentominoes L, T, P, together with their values, in order to dynamically calculate the best composition of parcels (the number and type yielding the highest score).

Footnote 1: maxTime is the amount of time (in seconds) that the algorithm has to run for (excluding setting up the structure, getting the input and updating the UI).

4 Results

There were four main questions that we had to answer based on the algorithms that we decided to implement:

Question literal    Question
A                   Is it possible to fill the complete cargo space with A, B and/or C parcels, without having any gaps?
B                   If parcels of type A, B and C represent values of 3, 4 and 5 units respectively, then what is the maximum value that you can store in your cargo space?
C                   Is it possible to fill the complete cargo space with L, P and/or T parcels, without having any gaps?
D                   If parcels of type L, P and T represent values of 3, 4 and 5 units respectively, then what is the maximum value that you can store in your cargo space?

Those questions will be referred to further in this paper by their literals.

4.1 Greedy algorithm

As discussed previously in the paper, the greedy algorithm was used as a baseline and guideline for judging the other algorithms. Although it was expected to achieve the lowest results of all the algorithms, it is worth mentioning that the results obtained were not significantly lower than those of the other algorithms.
Please refer to Table 7 in Section 4.3 for a further comparison with the genetic algorithm.

4.2 Algorithm X

The results of the experiments on Algorithm X's performance are presented in the following tables. Since the algorithm works only for Questions B and D, the results for the other questions are not presented. As seen in Table 1, the results for Question B do not change with the time given, while the results for Question D improve slightly when more time is given to compute the best score: as the time limit grows from 1 s to 600 s, the score improves from 1201 points to 1208.

Settings: pruneWait = 3; pruneCutoff = 0.85; layerCutoff = 15

Time (s)    Question B    Question D
1           236           1201
5           236           1203
20          236           1203
60          236           1204
600         236           1208

Table 1: Algorithm X tests with pruning

The situation is completely different when the variable tested is not the maximum time but pruneWait (as seen in Table 2). The score for Question D then remains the same throughout the tests, while the score for Question B improves slightly between the first two values of pruneWait (from 233 points to 236). Increasing the variable's value further has no impact on the highest score achieved by the algorithm.

Settings: pruneCutoff = 0.85; layerCutoff = 15; maxTime = 5 s

pruneWait    Question B    Question D
1            233           1203
3            236           1203
10           236           1203
50           236           1203

Table 2: Algorithm X tests in terms of values of pruneWait

In Table 3, the variable tested is pruneCutoff. The results for Question D do not differ across the first three values tested, and the results for Question B differ only slightly between pruneCutoff = 0.7 and pruneCutoff = 0.9. At pruneCutoff = 1, however, a major decline in performance is observed: the value for Question B falls by 9 points (compared to the measurement at pruneCutoff = 0.9) and the value for Question D plummets from 1203 to 139.

Settings: pruneWait = 3; layerCutoff = 15; maxTime = 5 s

pruneCutoff    Question B    Question D
0.1            236           1203
0.7            236           1203
0.9            233           1203
1              224           139

Table 3: Algorithm X tests in terms of values of pruneCutoff

In the last test, presented in Table 4, the performance appears to be independent of the value of layerCutoff. All measurements for Questions B and D remain the same throughout the tests (on values from 0 to 20, with a step size of 10).

Settings: pruneWait = 3; pruneCutoff = 0.85; maxTime = 5 s

layerCutoff    Question B    Question D
0              236           1203
10             236           1203
20             236           1203

Table 4: Algorithm X tests in terms of values of layerCutoff

4.3 General case

Last but not least, the "General" case (i.e. when the number of parcels of each type and their scores come from user input) is presented in Table 7, based on the parcels and pentominoes from Tables 5 and 6.

Parcel type    Amount of parcels    Single parcel value
A              30                   2
B              20                   1
C              20                   3

Table 5: Parcels considered for the General case

Pentomino type    Amount of parcels    Single parcel value
L                 30                   2
P                 20                   1
T                 20                   3

Table 6: Pentominoes considered for the General case

Input table    Algorithm           Results
Table 5        Algorithm X         -
Table 5        Genetic             128
Table 5        Greedy (value)      126
Table 5        Greedy (density)    110
Table 6        Algorithm X         -
Table 6        Genetic             140
Table 6        Greedy (value)      -
Table 6        Greedy (density)    -

Table 7: Results of algorithms for both parcels and pentominoes

As seen in Table 7, the algorithm with the best performance in both cases is the genetic one; it is also the only one that can compute answers when pentominoes are considered instead of parcels. Algorithm X, in turn, is not able to compute results in either case.

5 Conclusions

Since each algorithm that we implemented performed better at some tasks and worse at others, we have decided to present the best score that we achieved for each question, alongside the algorithm that produced it. The final answers are presented in Table 8. Additionally, a table with all of the algorithms' results is presented in Appendix A.

Question literal      Best score    Algorithm
A (see footnote 2)    -             -
B                     232           Algorithm X
D                     1209          Algorithm X

Table 8: List of answers to problem questions

Since Question C is a Yes/No question, our short answer is "Yes". That answer is based on the implementation of Algorithm X, as Question C is a full-cover problem. The resulting arrangement can be found in Figure 5.

Figure 5: Visualization of the answer to Question C

Footnote 2: We were not able to compute a score for Question A with any of the algorithms that we used, hence no best score is presented in the table.

6 References

[Bribiesca, 2000] Bribiesca, E. (2000). A measure of compactness for 3D shapes. Computers & Mathematics with Applications, 40:1275–1284.

[Chu, 2006] Chu, J. (2006). A sudoku solver in Java implementing Knuth's dancing links algorithm. Harker Research Symposium submission.

[Knuth, 2000] Knuth, D. (2000). Dancing links. Millennial Perspectives in Computer Science, pages 187–214.

List of Figures

1 An overview of the dancing links structure
2 Structure of Modified Algorithm X
3 Flowchart representing the genetic algorithm
4 UML Diagram
5 Visualization of the answer to Question C

List of Tables

1 Algorithm X tests with pruning
2 Algorithm X tests in terms of values of pruneWait
3 Algorithm X tests in terms of values of pruneCutoff
4 Algorithm X tests in terms of values of layerCutoff
5 Parcels considered for the General case
6 Pentominoes considered for the General case
7 Results of algorithms for both parcels and pentominoes
8 List of answers to problem questions
9 Full list of answers to problem questions
10 Comparison between different aspects of algorithms that were implemented

Appendix A: all results for Questions A, B, D

Question literal    Best score    Algorithm
A                   -             -
B                   232           Algorithm X
B                   231           Genetic
B                   230           Greedy (value)
B                   192           Greedy (density)
D                   1209          Algorithm X
D                   1192          Genetic
D                   -             Greedy (see footnote 3)

Table 9: Full list of answers to problem questions

Footnote 3: Not possible for this question, since the greedy algorithm is not able to work with pentominoes.

Appendix B: Comparison table between algorithms

Algorithm               Precision    Speed        Versatility    Difficulty to implement
Greedy                  Low          Very fast    Medium         Easy
Genetic                 Medium       Fast         High           Medium
Algorithm X             Extreme      Very slow    Low            Medium
Modified Algorithm X    High         Decent       Medium         Hard

Table 10: Comparison between different aspects of algorithms that were implemented

Appendix C: Full UML diagram
