Article

Frequent Itemset-Driven Search for Finding Minimal Node Separators and Its Application to Air Transportation Network Analysis


Abstract

The α-separator problem (α-SP) consists of finding a minimum set of vertices whose removal separates the network into multiple connected components, each containing fewer than a given number of vertices; it belongs to the family of critical node detection problems. The α-SP is an important NP-hard problem with various real-world applications. In this paper, we propose a frequent itemset-driven search (FIS) algorithm to solve α-SP, which integrates the concept of frequent itemsets into the well-known memetic search framework. Starting from a high-quality population built by population construction and population repair, FIS iteratively employs a frequent itemset recombination operator (to generate promising offspring solutions), a tabu-based simulated annealing procedure (to find high-quality local optima), a population repair procedure, and a population management strategy (to maintain a healthy, diverse population). Extensive evaluations on 50 benchmark instances show that FIS significantly outperforms state-of-the-art algorithms. In particular, it discovers 29 new upper bounds and matches 18 previous best-known bounds. Finally, we experimentally analyze the importance of each key algorithmic component and perform a case study on an air transportation network to understand its structure and identify its influential airports.
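As a rough illustration of the search loop described in the abstract, the sketch below organizes the main components (construction, repair, frequent-pattern recombination, population management) into a minimal Python skeleton. The greedy repair, the majority-vote recombination, the feasibility threshold, and the toy graph are illustrative assumptions; they stand in for, and do not reproduce, the authors' operators.

```python
import random

def components(adj, removed):
    """Connected components of the graph after deleting the vertices in 'removed'."""
    seen, comps = set(removed), []
    for v in adj:
        if v in seen:
            continue
        stack, comp = [v], []
        seen.add(v)
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def feasible(adj, separator, alpha_n):
    """A separator is feasible if every residual component has fewer than alpha_n vertices."""
    return all(len(c) < alpha_n for c in components(adj, separator))

def repair(adj, separator, alpha_n):
    """Greedy repair (an illustrative stand-in for the paper's repair procedure):
    add highest-degree vertices from oversized components until feasible."""
    sep = set(separator)
    while not feasible(adj, sep, alpha_n):
        largest = max(components(adj, sep), key=len)
        sep.add(max(largest, key=lambda v: len(adj[v])))
    return sep

def recombine(parents):
    """Frequent-pattern recombination sketch: vertices shared by most parents
    (a 'frequent itemset' of solution elements) seed the offspring."""
    counts = {}
    for p in parents:
        for v in p:
            counts[v] = counts.get(v, 0) + 1
    threshold = max(2, len(parents) // 2)
    return {v for v, c in counts.items() if c >= threshold}

def fis_sketch(adj, alpha, pop_size=6, generations=50, seed=0):
    random.seed(seed)
    alpha_n = max(1, int(alpha * len(adj)))
    pop = [repair(adj, random.sample(list(adj), 2), alpha_n) for _ in range(pop_size)]
    best = min(pop, key=len)
    for _ in range(generations):
        child = repair(adj, recombine(random.sample(pop, 3)), alpha_n)
        # A tabu-based simulated annealing step would further refine 'child' here.
        pop.sort(key=len)
        if len(child) < len(pop[-1]):
            pop[-1] = child                 # replace the worst member (population management)
        best = min(best, child, key=len)
    return best

if __name__ == "__main__":
    # Tiny illustrative graph: two triangles sharing vertex 0.
    adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [0, 3]}
    print(sorted(fis_sketch(adj, alpha=0.5)))
```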


... The proposed TAMLS algorithm relies on a set of correlated parameters, which makes parameter tuning itself a very difficult problem. In the literature, it is common practice to employ an automatic parameter configuration method to address this problem [38], [39], [40]. In our case, we adopt the iterated F-race (IFR) method implemented in the irace package [41]. ...
Article
Full-text available
The capacitated electric vehicle routing problem (CEVRP) extends the traditional vehicle routing problem by simultaneously considering the service order of the customers and the recharging schedules of the vehicles. Due to its NP-hard nature, we decompose the original problem into two sub-problems: a capacitated vehicle routing problem (CVRP) and a fixed route vehicle charging problem (FRVCP). A highly effective threshold acceptance based multi-layer search (TAMLS) algorithm is proposed to quickly obtain high-quality solutions. TAMLS consists of three layers. An iterated thresholding search procedure and a thresholding selection procedure are employed to produce diversified CVRP solutions in the first layer and to screen out high quality ones in the second layer, respectively. In the third layer, a removal heuristic coupling with an enumeration method is adopted to solve FRVRP, which produces optimized charging schedules. Extensive computational results show that TAMLS outperforms the state-of-the-art algorithms in terms of both solution quality and computation time. In particular, it is able to obtain new best results for 11 out of 17 benchmark instances, and reach the best known results on the remaining 6 instances. Additional experimental analyses are performed to better understand the contributions of key algorithmic components.
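The acceptance rule at the heart of threshold-based search such as TAMLS can be illustrated in a few lines of Python. This is a generic threshold-accepting loop with assumed cost and neighborhood callables, not the authors' three-layer implementation.

```python
import random

def threshold_accepting(initial, cost, neighbor, thresholds, iters_per_level=200, seed=0):
    """Generic threshold-accepting loop: a worse neighbor is accepted whenever its
    cost increase stays below the current threshold, which is tightened toward zero."""
    rng = random.Random(seed)
    current, best = initial, initial
    for t in thresholds:                        # e.g. [9.0, 4.0, 1.0, 0.0]
        for _ in range(iters_per_level):
            candidate = neighbor(current, rng)
            if cost(candidate) - cost(current) < t:
                current = candidate
                if cost(current) < cost(best):
                    best = current
    return best

if __name__ == "__main__":
    # Toy demo: minimize a 1-D function over integers with +/-1 moves.
    f = lambda x: (x - 7) ** 2
    move = lambda x, rng: x + rng.choice([-1, 1])
    print(threshold_accepting(50, f, move, thresholds=[9.0, 4.0, 1.0, 0.0]))
```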
... According to different cases of |S| and σ, CNDPs can be divided into two categories: K-vertex-CNDP and β-connectivity-CNDP. The former is to optimize (minimize or maximize) the connectivity measure σ, such that no more than K nodes are deleted (i.e., |S| ≤ K), whereas the latter aims to minimize the set of deleted nodes, such that the connectivity measure σ is bounded by a given threshold β (Zhou et al. 2023d). A detailed taxonomy of CNDPs is provided in (Zhou et al. 2021b). ...
Article
This study considers a well-known critical node detection problem that aims to minimize a pairwise connectivity measure of an undirected graph via the removal of a subset of nodes (referred to as critical nodes) subject to a cardinality constraint. Potential applications include epidemic control, emergency response, vulnerability assessment, carbon emission monitoring, network security, and drug design. To solve the problem, we present a “reduce-solve-combine” memetic search approach that integrates a problem reduction mechanism into the popular population-based memetic algorithm framework. At each generation, a common pattern mined from two parent solutions is first used to reduce the given problem instance, then the reduced instance is solved by a component-based hybrid neighborhood search that effectively combines an articulation point impact strategy and a node weighting strategy, and finally an offspring solution is produced by combining the mined common pattern and the solution of the reduced instance. Extensive evaluations on 42 real-world and synthetic benchmark instances show the efficacy of the proposed method, which discovers nine new upper bounds and significantly outperforms the current state-of-the-art algorithms. Investigation of key algorithmic modules additionally discloses the importance of the proposed ideas and strategies. Finally, we demonstrate the generality of the proposed method via its adaptation to solve the node-weighted critical node problem.
... According to different cases of |S| and σ, CNDPs can be divided into two categories: K-vertex-CNDP and β-connectivity-CNDP. The former is to optimize (minimize or maximize) the connectivity measure σ, such that no more than K nodes are deleted (i.e., |S| ≤ K), while the latter aims to minimize the set of deleted nodes, such that the connectivity measure σ is bounded by a given threshold β (Zhou et al. 2023d). A detailed taxonomy of CNDPs is provided in (Zhou et al. 2021a). ...
Article
Full-text available
This study considers a well-known critical node detection problem that aims to minimize a pairwise connectivity measure of an undirected graph via the removal of a subset of nodes (referred to as critical nodes) subject to a cardinality constraint. Potential applications include epidemic control, emergency response, vulnerability assessment, carbon emission monitoring, network security, and drug design. To solve the problem, we present a “reduce-solve-combine” memetic search approach that integrates a problem reduction mechanism into the popular population-based memetic algorithm framework. At each generation, a common pattern mined from two parent solutions is first used to reduce the given problem instance, then the reduced instance is solved by a component-based hybrid neighborhood search that effectively combines an articulation point impact strategy and a node weighting strategy, and finally an offspring solution is produced by combining the mined common pattern and the solution of the reduced instance. Extensive evaluations on 42 real-world and synthetic benchmark instances show the efficacy of the proposed method, which discovers nine new upper bounds and significantly outperforms the current state-of-the-art algorithms. Investigation of key algorithmic modules additionally discloses the importance of the proposed ideas and strategies. Finally, we demonstrate the generality of the proposed method via its adaptation to solve the node-weighted critical node problem. History: Accepted by Erwin Pesch, Area Editor for Heuristic Search & Approximation Algorithms. Funding: This work was supported by the National Natural Science Foundation of China [Grants 61903144, 72031007]. Supplemental Material: The software that supports the findings of this study is available within the paper and its Supplemental Information ( https://pubsonline.informs.org/doi/suppl/10.1287/ijoc.2022.0130 ) as well as from the IJOC GitHub software repository ( https://github.com/INFORMSJoC/2022.0130 ). The complete IJOC Software and Data Repository is available at https://informsjoc.github.io/ .
... Our future studies intend to i) develop exact or approximate approaches [46], [47], [48] to obtain higher-quality solutions of large-size instances; ii) expand the model by taking into account other practical factors, for example, the changing of road travel flows, leading to time-dependent robust lane reservation problems; iii) consider other options such as value-at-risk measures [49] and service efficiency and equity [50], [51] as an evaluation index in the objective function; iv) extend the proposed model to investigate bus lane reservation and transit network design problems [52], [53]; and v) extend the problem by considering ordinary traffic flow redistribution due to travelers' route choice behaviors and develop highly effective algorithms to solve it. APPENDIX A THE DEFORMATION FOR EQUATION (7) The BPR function summarizes the relationship between traffic flow and travel time, which can be shown as follows: ...
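The excerpt above is truncated before the formula; for reference, the standard form of the Bureau of Public Roads (BPR) function is shown below, where t_0 is the free-flow travel time, v the traffic flow, c the road capacity, and α, β are calibration parameters (commonly α = 0.15, β = 4). The cited paper may use different notation.

```latex
t(v) = t_0 \left[\, 1 + \alpha \left( \frac{v}{c} \right)^{\beta} \right]
```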
Article
Lane reservation optimization is important in intelligent transportation systems. Most existing studies are carried out under deterministic road conditions by assuming constant road travel time. However, road conditions vary due to various factors, resulting in uncertain road travel time. This work addresses a new reliability-based lane reservation and route design problem by considering uncertain road travel time with its known mean and covariance matrix. It aims to decide which road segments in a network should implement reserved lanes and to design routes for special time-crucial transportation tasks. The objective is to maximize transportation service reliability (i.e., the probability of completing the special tasks on time). For this problem, a novel distributionally robust optimization model is first established. To solve it, this work proposes i) an adapted sample average approximation-based approach and ii) a two-stage hierarchical heuristic algorithm based on second-order cone programming. Experimental results on an illustrative example and a real-world case demonstrate that the latter is more effective and efficient than the former. In addition, we conduct a series of parameter sensitivity analysis experiments to reveal the factors affecting lane reservation and provide optimal solutions given different parameter settings.
Article
In leader-follower multiagent systems (MASs), seeking an efficient scheme to select a set of agents as leaders is important for realizing the expected cooperative performance. In this article, the problem of minimal leader selection is investigated for impulsive general linear MASs with switching topologies. This study focuses on selecting a set of agents as leaders that receive information from a reference signal directly, while minimizing the number of leaders, subject to consensus tracking performance. First, adopting the average dwell time technique and a time-ratio constraint, an explicit criterion for consensus tracking is derived as preparation for leader selection. Second, applying the submodular optimization framework, leader selection metrics are established based on the derived criterion. Third, employing the greedy rule, an efficient leader selection scheme is presented according to the established metrics. The scheme comprises two polynomial-time algorithms that return selected leader sets within a logarithmic bound of the optimum. Finally, the effectiveness of the developed leader selection scheme is verified using an illustrative example.
Article
In an intelligent transportation system, accurate traffic flow prediction can provide significant help for travel planning. Even though some methods are proposed to do so, they focus on either algorithm or data level studies. This work focuses on both by proposing a Community-based dandelion algorithm-enabled Feature selection and Broad learning system (CFB). Specifically, a feature selection method is adopted to choose suitable features aiming to avoid redundant ones affecting prediction accuracy, and a neural network-based learning algorithm, namely a Broad Learning System (BLS), is used to predict traffic flow. In order to further boost its prediction performance, a Community-based Dandelion Algorithm (CDA) is proposed by considering an individual and its multiple offspring as a community and adopting a learning strategy for different communities. The proposed CDA is used to a) choose the suitable features as a feature selection method; and b) optimize the parameters and network structure of BLS. CDA’s superiority over its competitive peers is first verified on CEC2013’s benchmark functions, and then the proposed CFB is applied to handle the traffic flow prediction problems. The results indicate that it can improve the prediction accuracy by 5%-16% compared to the updated traffic flow prediction methods.
Article
High-dimensional and incomplete (HDI) interactions among numerous nodes are commonly encountered in a Big Data-related application, like user-item interactions in a recommender system. Owing to its high efficiency and flexibility, a stochastic gradient descent (SGD) algorithm can enable efficient latent feature analysis (LFA) of HDI data for its precise representation, thereby enabling efficient solutions to knowledge acquisition issues like missing data estimation. However, LFA on HDI data involves a bilinear issue, making SGD-based LFA a sequential process, i.e., the update on a feature can impact the results on the others. Intervening the sequence of SGD-based LFA on HDI data can affect the training results. Therefore, a parallel SGD algorithm to LFA should be designed with care. Existing parallel SGD-based LFA models suffer from a) low parallelization degree, and b) slow convergence, which significantly restrict their scalability. Aiming at addressing these vital issues, this paper presents an Adaptively-accelerated Parallel Stochastic Gradient Descent (AP-SGD) algorithm to LFA by: a) establishing a novel local minimum-based data splitting and scheduling scheme to reduce the scheduling cost among threads, thereby achieving high parallelization degree; and b) incorporating the adaptive momentum method into the learning scheme, thereby accelerating the convergence rate by making the learning rate and acceleration coefficient self-adaptive. The convergence of the achieved AP-SGD-based LFA model is theoretically proved. Experimental results on three HDI matrices generated by real industrial applications demonstrate that the AP-SGD-based LFA model outperforms state-of-the-art parallel SGD-based LFA models in both estimation accuracy for missing data and parallelization degree. Hence, it has the potential for efficient representation of HDI data in industrial scenes.
Article
In today’s semiconductor manufacturing industry, wafer foundries often face the challenge of producing a variety of integrated circuit chip products using a single manufacturing line. To address this, multicluster tools have become a popular choice for processing multiple wafer types simultaneously. Operating such tools involves coordinating the robots in adjacent individual tools to transport multitype wafers through a shared buffer. This study aims to develop a scheduling method for the concurrent fabrication processes of two wafer types, performed by a multicluster tool with wafer residency time constraints. The proposed approach presents a two-backward sequence, based on a backward strategy of a single wafer type, to convert a one-wafer cyclic schedule into a one-wafer-per-type cyclic schedule while revealing its temporal properties. To ensure a smooth operation of a single-arm multicluster tool system and synchronize multiple robots, several necessary and sufficient conditions are derived for the first time. Two efficient algorithms are then proposed to determine the feasibility of a periodic schedule and obtain a schedule that achieves the lower-bound cycle time under a two-backward strategy, maximizing the productivity of such a multicluster tool. Finally, numerical simulations and two practical examples are presented to demonstrate the applications and performance of the proposed approach.
Article
Full-text available
The k-vertex cut (k-VC) problem belongs to the family of critical node detection problems, which aims to find a minimum subset of vertices whose removal decomposes a graph into at least k connected components. It is an important NP-hard problem with various real-world applications, e.g., vulnerability assessment, carbon emissions tracking, epidemic control, drug design, emergency response, network security, and social network analysis. In this article, we propose a fast local search (FLS) approach to solve it. It integrates a two-stage vertex exchange strategy based on neighborhood decomposition and cut vertex, and iteratively executes operations of addition and removal during the search. Extensive experiments on both intersection graphs of linear systems and coloring/DIMACS graphs are conducted to evaluate its performance. Empirical results show that it significantly outperforms the state-of-the-art (SOTA) algorithms in terms of both solution quality and computation time in most of the instances. To evaluate its generalization ability, we simply extend it to solve the weighted version of the k-VC problem. FLS again demonstrates excellent performance.
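To make the addition/removal moves concrete, the following Python sketch shows one simplified "removal" pass for the k-vertex cut problem: vertices are dropped from a candidate cut as long as the residual graph keeps at least k components. The move order, the toy path graph, and the omission of the paper's neighborhood decomposition are illustrative simplifications.

```python
from collections import deque

def num_components(adj, removed):
    """Count connected components after deleting the vertices in 'removed'."""
    seen, count = set(removed), 0
    for v in adj:
        if v in seen:
            continue
        count += 1
        queue = deque([v]); seen.add(v)
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w); queue.append(w)
    return count

def shrink_cut(adj, cut, k):
    """One illustrative 'removal' pass of a k-vertex-cut local search:
    try to drop vertices from the cut while keeping at least k components."""
    cut = set(cut)
    for v in sorted(cut):
        if num_components(adj, cut - {v}) >= k:
            cut.remove(v)
    return cut

if __name__ == "__main__":
    # Path graph 0-1-2-3-4-5-6; the initial cut {1, 2, 3, 5} is feasible for k = 3.
    adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
    print(sorted(shrink_cut(adj, {1, 2, 3, 5}, k=3)))
```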
Article
Full-text available
This work addresses a soft-clustered vehicle routing problem that extends the classical capacitated vehicle routing problem with one additional constraint, that is, customers are partitioned into clusters and all customers of the same cluster must be served by the same vehicle. Its potential applications include parcel delivery in courier companies and freight transportation. Due to its NP-hard nature, solving it is computationally challenging. This paper presents an efficient bilevel memetic search method to do so, which explores search space at both cluster and customer levels. It integrates three distinct modules: a group matching-based crossover (to generate promising offspring solutions), a bilevel hybrid neighborhood search (to perform local optimization), and a tabu-driven population reconstruction strategy (to help the search escape from local optima). Extensive experiments on three sets of 390 widely used public benchmark instances are conducted. The results convincingly demonstrate that the proposed method achieves much better overall performance than state-of-the-art algorithms in terms of both solution quality and computation time. In particular, it is able to find 20 new upper bounds for large-scale instances while matching the best-known upper bounds for all but four of the remaining instances. Ablation studies on three key algorithm modules are also performed to demonstrate the novelty and effectiveness of the proposed ideas and strategies. Funding: This work was supported by the Macau Young Scholars Program [Grant AM2020011], Fundo para o Desenvolvimento das Cienciase da Tecnologia (FDCT) [Grant 0047/2021/A1], the National Natural Science Foundation of China [Grants 61903144, 71871142, and 71931007], and the Open Project of the Shenzhen Institute of Artificial Intelligence and Robotics for Society [Grant AC01202005002]. Supplemental Material: The online appendix is available at https://doi.org/10.1287/trsc.2022.1186 .
Article
Full-text available
The distance-based critical node problem involves identifying a subset of nodes in a graph such that the removal of these nodes leads to a residual graph with the minimum distance-based connectivity. Due to its NP-hard nature, solving this problem using exact algorithms has proved to be highly challenging. Moreover, existing heuristic algorithms are typically time-consuming. In this work, we introduce a fast tri-individual memetic search approach to solve the problem. The proposed approach maintains a small population of only three individuals during the whole search. At each generation, it sequentially executes an inherit-repair recombination operator to generate a promising offspring solution, a fast betweenness centrality-based late-acceptance search to find high-quality local optima, and a simple population updating strategy to maintain a healthy population. Extensive experiments on both real-world and synthetic benchmarks show our method significantly outperforms state-of-the-art algorithms. In particular, it can steadily find the known optimal solutions for all 22 real-world instances with known optima in only one minute, and new upper bounds on the remaining 22 large real-world instances. For 54 synthetic instances, it finds new upper bounds on 36 instances, and matches the previous best-known upper bounds on 15 other instances in ten minutes. Finally, we investigate the usefulness of each key algorithmic ingredient.
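The "late-acceptance" rule used by the search above is a simple but distinctive acceptance criterion: a candidate is compared not only with the current solution but also with the one held a fixed number of iterations earlier. The sketch below shows the generic rule with assumed cost and neighbor callables; the paper's variant additionally exploits betweenness centrality, which is omitted here.

```python
import random

def late_acceptance_hill_climbing(initial, cost, neighbor, history_len=50,
                                  max_iters=5000, seed=0):
    """Generic late-acceptance hill climbing: accept a candidate if it beats the
    current solution OR the solution recorded 'history_len' iterations earlier."""
    rng = random.Random(seed)
    current, best = initial, initial
    history = [cost(initial)] * history_len
    for it in range(max_iters):
        candidate = neighbor(current, rng)
        c = cost(candidate)
        if c <= cost(current) or c <= history[it % history_len]:
            current = candidate
            if cost(current) < cost(best):
                best = current
        history[it % history_len] = cost(current)
    return best

if __name__ == "__main__":
    # Toy demo on a bumpy 1-D function with +/-1 integer moves.
    f = lambda x: abs(x - 20) + 3 * (x % 5 == 0)
    move = lambda x, rng: x + rng.choice([-1, 1])
    print(late_acceptance_hill_climbing(60, f, move))
```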
Article
Full-text available
In critical node problems, the task is to identify a small subset of so-called critical nodes whose deletion maximally degrades a network’s “connectivity” (however that is measured). Problems of this type have been widely studied, for example, for limiting the spread of infectious diseases. However, existing approaches for solving them have typically been limited to networks having fewer than 1,000 nodes. In this paper, we consider a variant of this problem in which the task is to delete b nodes so as to minimize the number of node pairs that remain connected by a path of length at most k. With the techniques developed in this paper, instances with up to 17,000 nodes can be solved exactly. We introduce two integer programming formulations for this problem (thin and path-like) and compare them with an existing recursive formulation. Although the thin formulation generally has an exponential number of constraints, it admits an efficient separation routine. Also helpful is a new, more general preprocessing procedure that, on average, fixes three times as many variables as before. Summary of Contribution: In this paper, we consider a distance-based variant of the critical node problem in which the task is to delete b nodes so as to minimize the number of node pairs that remain connected by a path of length at most k. This problem is motivated by applications in social networks, telecommunications, and transportation networks. In our paper, we aim to solve large-scale instances of this problem. Standard out-of-the-box approaches are unable to solve such instances, requiring new integer programming models, methodological contributions, and other computational insights. For example, we propose an algorithm for finding a maximum independent set of simplicial nodes that runs in time O(nm) that we use in a preprocessing procedure; we also prove that the separation problem associated with one of our integer programming models is NP-hard. We apply our branch-and-cut implementation to real-life networks from a variety of domains and observe speedups over previous approaches.
Article
Full-text available
We present frequent pattern-based search (FPBS) that combines data mining and optimization. FPBS is a general-purpose method that unifies data mining and optimization within the population-based search framework. The method emphasizes the relevance of a modular- and component-based approach, making it applicable to optimization problems by instantiating the underlying components. To illustrate its potential for solving difficult combinatorial optimization problems, we apply the method to the well-known and challenging quadratic assignment problem. We show the computational results and comparisons on the hardest QAPLIB benchmark instances. This work reinforces the recent trend toward closer cooperation between optimization methods and machine learning or data mining techniques.
Article
Full-text available
Population-based memetic algorithms have been successfully applied to solve many difficult combinatorial problems. Often, a population of fixed size is used in such algorithms to record some best solutions sampled during the search. However, given the particular features of the problem instance under consideration, a population of variable size would be more suitable to ensure the best search performance possible. In this work, we propose variable population memetic search (VPMS), where a strategic population sizing mechanism is used to dynamically adjust the population size during the search process. Our VPMS approach starts its search from a small population of only two solutions to focus on exploitation, and then adapts the population size according to the search status to continuously influence the balancing between exploitation and exploration. We illustrate an application of the VPMS approach to solve the challenging critical node problem (CNP). We show that the VPMS algorithm integrating a variable population, an effective local optimization procedure and a backbone-based crossover operator performs very well compared to state-of-the-art CNP algorithms. The algorithm is able to discover new upper bounds for 12 instances out of the 42 popular benchmark instances, while matching 23 previous best-known upper bounds.
Article
Full-text available
Finding an optimal set of nodes, called key players, whose activation (or removal) would maximally enhance (or degrade) a certain network functionality, is a fundamental class of problems in network science. Potential applications include network immunization, epidemic control, drug design and viral marketing. Due to their general NP-hard nature, these problems typically cannot be solved by exact algorithms with polynomial time complexity. Many approximate and heuristic strategies have been proposed to deal with specific application scenarios. Yet, we still lack a unified framework to efficiently solve this class of problems. Here, we introduce a deep reinforcement learning framework FINDER, which can be trained purely on small synthetic networks generated by toy models and then applied to a wide spectrum of application scenarios. Extensive experiments under various problem settings demonstrate that FINDER significantly outperforms existing methods in terms of solution quality. Moreover, it is several orders of magnitude faster than existing methods for large networks. The presented framework opens up a new direction of using deep learning techniques to understand the organizing principle of complex networks, which enables us to design more robust networks against both attacks and failures. A fundamental problem in network science is how to find an optimal set of key players whose activation or removal significantly impacts network functionality. The authors propose a deep reinforcement learning framework that can be trained on small networks to understand the organizing principles of complex networked systems.
Article
Full-text available
In recent years, the relevance of cybersecurity has been increasingly evident to companies and institutions, as well as to final users. Because of that, it is important to ensure the robustness of a network. With the aim of improving the security of the network, it is desirable to find out which are the most critical nodes in order to protect them from external attackers. This work tackles this problem, named the α‐separator problem, from a heuristic perspective, proposing an algorithm based on the Greedy Randomized Adaptive Search Procedure (GRASP). In particular, a novel approach for the constructive procedure is proposed, where centrality metrics derived from social network analysis are used as a greedy criterion. Furthermore, the quality of the provided solutions is improved by means of a combination method based on Path Relinking (PR). This work explores different variants of PR, also adapting the most recent one, Exterior PR, for the problem under consideration. The combination of GRASP + PR allows the algorithm to obtain high‐quality solutions within a reasonable computing time. The proposal is supported by a set of intensive computational experiments that show the quality of the approach, comparing it with the most competitive algorithm found in the state of the art.
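A GRASP construction of the kind described above can be summarized in a few lines: at each step the most central remaining vertices form a restricted candidate list, from which one vertex is drawn at random. The sketch below uses plain degree as the centrality criterion and an assumed list size of three; the local search and path relinking phases that would then improve the solution are omitted.

```python
import random

def _components(adj, removed):
    """Connected components of the residual graph after removing 'removed'."""
    seen, comps = set(removed), []
    for v in adj:
        if v in seen:
            continue
        stack, comp = [v], []
        seen.add(v)
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def grasp_construct(adj, alpha_n, rcl_size=3, seed=0):
    """One GRASP construction for the alpha-separator problem: remaining vertices
    are ranked by residual degree (an assumed centrality criterion), and the next
    removal is drawn at random from the top 'rcl_size' candidates."""
    rng = random.Random(seed)
    removed = set()
    while any(len(c) >= alpha_n for c in _components(adj, removed)):
        candidates = sorted((v for v in adj if v not in removed),
                            key=lambda v: -sum(w not in removed for w in adj[v]))
        removed.add(rng.choice(candidates[:rcl_size]))
    return removed

if __name__ == "__main__":
    # Star with six leaves; every residual component must end up smaller than alpha_n = 2.
    adj = {0: [1, 2, 3, 4, 5, 6], **{i: [0] for i in range(1, 7)}}
    print(sorted(grasp_construct(adj, alpha_n=2)))
```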
Article
Full-text available
The family of critical node detection problems asks for finding a subset of vertices, deletion of which minimizes or maximizes a predefined connectivity measure on the remaining network. We study a problem of this family called the k-vertex cut problem. The problem asks for determining the minimum weight subset of nodes whose removal disconnects a graph into at least k components. We provide two new integer linear programming formulations, along with families of strengthening valid inequalities. Both models involve an exponential number of constraints for which we provide poly-time separation procedures and design the respective branch-and-cut algorithms. In the first formulation one representative vertex is chosen for each of the k mutually disconnected vertex subsets of the remaining graph. In the second formulation, the model is derived from the perspective of a two-phase Stackelberg game in which a leader deletes the vertices in the first phase, and in the second phase a follower builds connected components in the remaining graph. Our computational study demonstrates that a hybrid model in which valid inequalities of both formulations are combined significantly outperforms the state-of-the-art exact methods from the literature.
Article
Full-text available
Complex networks have become an active interdisciplinary field of research inspired by the empirical study of various networks. A subway network is a real-world example of complex networks in the transportation domain, which has attracted growing attention in network analysis recently. Analyzing human mobility patterns, specifically in ranking subway stations closely bounded by urban subway planning and individuals' travel experience, is still an open issue. In this paper, we propose a novel ranking method of station importance (SIRank) by utilizing human mobility patterns and improved PageRank algorithm. Specifically, by analyzing human mobility patterns of the subway system in Shanghai, we demonstrate both static and dynamic characteristics using two network models (Shanghai subway static network and Shanghai subway passenger network). In particular, the SIRank focuses on bi-directional passenger flow analysis between origins and destinations to iteratively generate the importance value for each station. We implement a range of the experiments to illustrate the effectiveness of SIRank using the real-world subway transaction datasets. The results demonstrate that the hit ratio in SIRank reaches 60% in the top five stations, which is much higher than that of ranking by a weighted mixed index (WMIRank) and ranking by node degree (NDRank) approaches.
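SIRank builds on the PageRank iteration applied to passenger flows between stations. As background, the sketch below is the textbook power-iteration PageRank on a flow-weighted directed graph; the bi-directional origin-destination weighting that distinguishes SIRank is not reproduced.

```python
def pagerank(out_edges, damping=0.85, iters=100):
    """Plain power-iteration PageRank on a weighted directed graph; flow-weighted
    edges stand in for station-to-station passenger flows."""
    nodes = set(out_edges) | {v for nbrs in out_edges.values() for v in nbrs}
    rank = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / len(nodes) for v in nodes}
        for u, nbrs in out_edges.items():
            total = sum(nbrs.values())
            if total == 0:
                continue
            for v, w in nbrs.items():
                new[v] += damping * rank[u] * (w / total)
        # Dangling nodes (no outgoing flow) spread their rank uniformly.
        dangling = sum(rank[u] for u in nodes if not out_edges.get(u))
        for v in nodes:
            new[v] += damping * dangling / len(nodes)
        rank = new
    return rank

if __name__ == "__main__":
    # Toy station-to-station flow counts (origin -> {destination: passengers}).
    flows = {"A": {"B": 120, "C": 30}, "B": {"C": 90}, "C": {"A": 60}}
    for station, score in sorted(pagerank(flows).items(), key=lambda kv: -kv[1]):
        print(station, round(score, 3))
```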
Article
Full-text available
Frequent itemset mining (FIM) is an essential task within data analysis since it is responsible for extracting frequently occurring events, patterns, or items in data. Insights from such pattern analysis offer important benefits in decision‐making processes. However, algorithmic solutions for mining such kind of patterns are not straightforward since the computational complexity exponentially increases with the number of items in data. This issue, together with the significant memory consumption that is present in the mining process, makes it necessary to propose extremely efficient solutions. Since the FIM problem was first described in the early 1990s, multiple solutions have been proposed by considering centralized systems as well as parallel (shared or nonshared memory) architectures. Solutions can also be divided into exhaustive search and nonexhaustive search models. Many of such approaches are extensions of other solutions and it is therefore necessary to analyze how this task has been considered during the last decades. This article is categorized under: Algorithmic Development > Association Rules Technologies > Association Rules
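For readers unfamiliar with the task, a minimal Apriori-style miner illustrates what "frequent itemset mining" computes: all item sets whose support (number of containing transactions) meets a threshold. This is a textbook sketch, not one of the optimized algorithms surveyed in the article.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Minimal Apriori-style frequent itemset miner."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(itemset <= t for t in transactions)

    items = {i for t in transactions for i in t}
    frequent = {}
    level = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    k = 1
    while level:
        frequent.update({s: support(s) for s in level})
        k += 1
        # Candidate generation: join frequent (k-1)-itemsets, keep those whose
        # every (k-1)-subset is frequent (the Apriori pruning property).
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = [c for c in candidates
                 if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                 and support(c) >= min_support]
    return frequent

if __name__ == "__main__":
    baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, sup in sorted(apriori(baskets, min_support=3).items(), key=lambda kv: -kv[1]):
        print(set(itemset), sup)
```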
Article
Full-text available
Social networks are a useful and important place for connecting people across the world. A basic issue in a social network is to identify the key persons within it, which is why different centrality measures have been developed over the years. In this survey paper, we present past and present research on measures of centrality in social networks. To this end, we discuss mathematical definitions and the various centrality measures that have been developed. We also present applications of centrality measures in biology, research, security, traffic, transportation, drug design, and classrooms. Finally, we outline our future research directions on centrality measures.
Article
Full-text available
Critical node problems involve identifying a subset of critical nodes from an undirected graph whose removal results in optimizing a pre-defined measure over the residual graph. As useful models for a variety of practical applications, these problems are computationally challenging. In this paper, we study the classic critical node problem (CNP) and introduce an effective memetic algorithm for solving CNP. The proposed algorithm combines a double backbone-based crossover operator (to generate promising offspring solutions), a component-based neighborhood search procedure (to find high-quality local optima) and a rank-based pool updating strategy (to guarantee a healthy population). Specifically, the component-based neighborhood search integrates two key techniques, i.e., two-phase node exchange strategy and node weighting scheme. The double backbone-based crossover extends the idea of general backbone-based crossovers. Extensive evaluations on 42 synthetic and real-world benchmark instances show that the proposed algorithm discovers 21 new upper bounds and matches 18 previous best-known upper bounds. We also demonstrate the relevance of our algorithm for effectively solving a variant of the classic CNP, called the cardinality-constrained critical node problem. Finally, we investigate the usefulness of each key algorithmic component.
Article
Full-text available
Given a vertex-weighted undirected graph G = (V, E, w) and a positive integer k, we consider the k-separator problem: it consists in finding a minimum-weight subset of vertices whose removal leads to a graph where the size of each connected component is less than or equal to k. We show that this problem can be solved in polynomial time for some graph classes including bounded tree width, mK2-free, (G1, G2, G3, P6)-free, interval-filament, asteroidal triple-free, weakly chordal, interval and circular-arc graphs. Polyhedral results with respect to the convex hull of the incidence vectors of k-separators are reported. Approximation algorithms are also presented.
Article
Full-text available
Modern optimization algorithms typically require the setting of a large number of parameters to optimize their performance. The immediate goal of automatic algorithm configuration is to find, automatically, the best parameter settings of an optimizer. Ultimately, automatic algorithm configuration has the potential to lead to new design paradigms for optimization software. The irace package is a software package that implements a number of automatic configuration procedures. In particular, it offers iterated racing procedures, which have been used successfully to automatically configure various state-of-the-art algorithms. The iterated racing procedures implemented in irace include the iterated F-race algorithm and several extensions and improvements over it. In this paper, we describe the rationale underlying the iterated racing procedures and introduce a number of recent extensions. Among these, we introduce a restart mechanism to avoid premature convergence, the use of truncated sampling distributions to handle correctly parameter bounds, and an elitist racing procedure for ensuring that the best configurations returned are also those evaluated in the highest number of training instances. We experimentally evaluate the most recent version of irace and demonstrate with a number of example applications the use and potential of irace, in particular, and automatic algorithm configuration, in general.
Book
Full-text available
This paper presents the fundamental principles underlying tabu search as a strategy for combinatorial optimization problems. Tabu search has achieved impressive practical successes in applications ranging from scheduling and computer channel balancing to cluster analysis and space planning, and more recently has demonstrated its value in treating classical problems such as the traveling salesman and graph coloring problems. Nevertheless, the approach is still in its infancy, and a good deal remains to be discovered about its most effective forms of implementation and about the range of problems for which it is best suited. This paper undertakes to present the major ideas and findings to date, and to indicate challenges for future research. Part I of this study indicates the basic principles, ranging from the short-term memory process at the core of the search to the intermediate and long term memory processes for intensifying and diversifying the search. Included are illustrative data structures for implementing the tabu conditions (and associated aspiration criteria) that underlie these processes. Part I concludes with a discussion of probabilistic tabu search and a summary of computational experience for a variety of applications. Part II of this study (to appear in a subsequent issue) examines more advanced considerations, applying the basic ideas to special settings and outlining a dynamic move structure to insure finiteness. Part II also describes tabu search methods for solving mixed integer programming problems and gives a brief summary of additional practical experience, including the use of tabu search to guide other types of processes, such as those of neural networks. INFORMS Journal on Computing, ISSN 1091-9856, was published as ORSA Journal on Computing from 1989 to 1995 under ISSN 0899-1499.
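The core short-term memory and aspiration ideas summarized above fit in a small sketch: recently used move attributes are forbidden for a few iterations unless the move would yield a new best solution. The neighborhood and cost callables below are assumed placeholders, not any particular application from the book.

```python
import random
from collections import deque

def tabu_search(initial, cost, neighbors, tabu_tenure=7, max_iters=500, seed=0):
    """Bare-bones tabu search with a short-term memory of recent move attributes
    and a best-solution aspiration criterion."""
    rng = random.Random(seed)
    current, best = initial, initial
    tabu = deque(maxlen=tabu_tenure)        # recently used move attributes
    for _ in range(max_iters):
        moves = neighbors(current, rng)     # list of (move_attribute, solution)
        admissible = [(m, s) for m, s in moves
                      if m not in tabu or cost(s) < cost(best)]   # aspiration
        if not admissible:
            continue
        move, current = min(admissible, key=lambda ms: cost(ms[1]))
        tabu.append(move)
        if cost(current) < cost(best):
            best = current
    return best

if __name__ == "__main__":
    # Toy demo: minimize a 1-D function; a move is labelled by its direction.
    f = lambda x: (x - 13) ** 2
    nbrs = lambda x, rng: [("+1", x + 1), ("-1", x - 1)]
    print(tabu_search(40, f, nbrs))
```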
Article
Social networks are an essential component of the Internet of People (IoP) and play an important role in stimulating interactive communication among people. Graph convolutional networks provide methods for social network analysis with its impressive performance in semi-supervised node classification. However, the existing methods are based on the assumption of balanced data distribution and ignore the imbalanced problem of social networks. In order to extract valuable information from imbalanced data for decision making, a novel method named minority-weighted graph neural network (mGNN) is presented in this paper. It extends imbalanced classification ideas in the traditional machine learning field to graph-structured data to improve the classification performance of GNNs. In a node feature aggregation stage, the node membership values among nodes are calculated for minority nodes’ feature aggregation enhancement. In an over-sampling stage, cost-sensitive learning is used to improve edge prediction results of synthetic minority nodes, and further raise their importance. In addition, a Gumbel distribution is adopted as an activation function. The proposed mGNN is evaluated on six social network datasets. Experimental results show that it yields promising results for imbalanced node classification.
Article
Along with the development of information technologies such as mobile Internet, information acquisition technology, cloud computing and big data technology, the traditional knowledge engineering and knowledge-based software engineering have undergone fundamental changes where the network plays an increasingly important role. Within this context, it is required to develop new methodologies as well as technical tools for network-based knowledge representation, knowledge services and knowledge engineering. Obviously, the term 'network' has different meanings in different scenarios. Meanwhile, some breakthroughs in several bottleneck problems of complex networks promote the developments of the new methodologies and technical tools for network-based knowledge representation, knowledge services and knowledge engineering. This paper first reviews some recent advances on complex networks, and then, in conjunction with knowledge graph, proposes a framework of networked knowledge which models knowledge and its relationships with the perspective of complex networks. For the unique advantages of deep learning in acquiring and processing knowledge, this paper reviews its development and emphasizes the role that it played in the development of knowledge engineering. Finally, some challenges and further trends are discussed.
Article
The minimum weighted vertex cover (MWVC) problem is a well-known combinatorial optimization problem with important applications. This paper introduces a novel local search algorithm called NuMWVC for MWVC based on three ideas. First, four reduction rules are introduced during the initial construction phase. Second, configuration checking with aspiration is proposed to reduce the cycling problem. Moreover, a self-adaptive vertex removing strategy is proposed to save time.
Article
Quality issues in supply networks can adversely affect the performance of suppliers and their downstream customers. Since suppliers might fail to comply with quality guidelines, decentralized quality controls by each firm in a supply network may be insufficient; thus, a complete network perspective on risk management could help to minimize supply disruptions. Here, we develop a novel modeling framework drawing on epidemiology, to demonstrate how network structure impacts the propagation of quality issues—akin to the spread of an infectious disease. We formulate an SIS model in which nodes represent individual suppliers while directed edges represent the movement of goods between suppliers; these nodes can be either susceptible (S) to or infected (I) by a disruption. Applying the model to 21 real-world networks, we find that a quality issue’s magnitude depends strongly on its origin node and the network archetype. The network’s maximum Authority value—based on the relationship between relevant authoritative nodes and hub nodes— is highly correlated with the extent of a supply disruption in our simulation. We examine different network-level strategies for containing an outbreak and find that improving quality control at critical nodes—those characterized by a high Authority value or customer proximity—is an effective measure. Adjusting the network structure by focusing on an upstream-centric flow of goods, thereby reducing the maximum Authority value, decreases vulnerability to quality issues. Managers can reduce the impact of quality disruptions through a combination of conventional firm-level strategies and novel network risk management strategies.
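The epidemiological analogy above can be illustrated with a tiny discrete-time SIS simulation on a directed supplier graph: disruptions propagate along supply edges with probability beta and resolve with probability gamma. The parameters, the synchronous update rule, and the toy network are assumptions for illustration, not the paper's calibrated model.

```python
import random

def simulate_sis(edges, beta=0.3, gamma=0.2, seed_node=0, steps=50, seed=1):
    """Tiny discrete-time SIS simulation on a directed supplier network.
    beta  : per-step probability that an infected supplier disrupts a customer
    gamma : per-step probability that an infected supplier recovers"""
    rng = random.Random(seed)
    infected = {seed_node}
    history = []
    for _ in range(steps):
        newly_infected = {v for u, v in edges
                          if u in infected and v not in infected
                          and rng.random() < beta}
        recovered = {u for u in infected if rng.random() < gamma}
        infected = (infected | newly_infected) - recovered
        history.append(len(infected))
    return history

if __name__ == "__main__":
    # Goods flow from upstream suppliers (0, 1) toward downstream customers.
    supply_edges = [(0, 2), (1, 2), (2, 3), (2, 4), (3, 5), (4, 5)]
    print(simulate_sis(supply_edges, seed_node=0))
```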
Article
Dear editor, This letter models and analyzes the Matthew effect under switching social networks via distributed competition. A competitive strategy is leveraged to trace the evolution of individuals in social systems under the Matthew effect. In addition, a consensus filter is utilized to decentralize the model with undirected graphs used to describe the interactions among individuals in social networks. Further, a term is added to the dynamic consensus filter to compensate for the influence of the switching of communication networks on the model. In this letter, the convergence of the proposed Matthew effect model is theoretically proved, and simulative experiments are conducted to verify the availability of the model. In particular, this letter points out the state and evolution of each individual in social systems with distributed competition, and the influence of the social environment on the development of individuals is also intuitively displayed.
Article
Protecting the critical nodes of a Cyber-Physical Power System (CPPS) is an effective strategy for mitigating the risk of incurring large-scale blackouts. A Gene Importance based Evolutionary Algorithm (GIEA) is proposed to identify a set of critical k nodes by maximizing the total load loss received by end-users. GIEA adopts an importance-based evolutionary strategy to improve the algorithm’s convergence and accuracy, in which the initial node importance metrics are assessed based on dynamic power flows and topology information. Both performance contribution (PC) and coupling failure impact (CFI) are considered in our importance evaluation framework. The impacts of different types of communication nodes on power networks are integrated into the proposed cascading failure model and CFI assessment. Based on the coupling and interdependence information, the strong coupling node pairs are identified to reduce the dimension of the decision vector to improve the computational efficiency of GIEA. The effectiveness and superiority of the proposed methods are illustrated through an example of a coupling CPPS consisting of the IEEE 30-bus model and a communication network with the small-world structure.
Article
Planning driving routes is a common yet challenging task. Previous solutions that often suggest the shortest or time-dependent fastest paths for drivers are incompatible with the diversity of user requirements. Recent years are witnessing an increasing need for utility gained along the driving path, and more advanced path planners are developed correspondingly, such as the safest and the most beautiful paths. Similar to the travel time on the edge, the utility score actually changes with the time as well. In this paper, targeting a more realistic path planning system, we allow both time cost and utility on the edge to vary from time to time. Over such a two-fold time-dependent road network, we aim to discover the optimal driving path for the given user query. To address this NP-hard problem, we propose an approximate algorithm called 2TD Path-Planner, which mainly consists of three components, i.e., edge reachability computing, chromosome encoding, and chromosome decoding.
Article
Robustness is one of the most important performance criteria for any rail transit network (RTN), because it helps us enhance the efficiency of RTN. Several studies have addressed the issue of RTN robustness primarily from the perspectives of given rail network structures or static distributions of passenger flow. An open problem that remains in fully understanding RTN robustness is how to take the spatio-temporal characteristics of passenger travel into consideration, since the dynamic passenger flow in an RTN can readily trigger unexpected cascading failures. This paper addresses this problem as follows: (1) we propose a two-layer rail transit network (TL-RTN) model that captures the interactions between a rail network and its corresponding dynamic passenger flow network, and then (2) we conduct the cascading failure analysis of the TL-RTN model based on an extended coupled map lattice (CML). Specifically, our proposed model takes the strategy of passenger flow redistribution and the passenger flow capacity of each station into account to simulate the human mobility behaviors and to estimate the maximum passenger flow appeal in each station, respectively. Based on the smart card data of RTN passengers in Shanghai, our experiments show that the TL-RTN robustness is related to both external perturbations and failure modes. Moreover, during the peak hours on weekdays, due to the large passenger flow, a small perturbation will trigger a 20% cascading failure of a network. Having ranked the cascade size caused by the stations, we find that this phenomenon is determined by both the hub nodes and their neighbors.
Article
Multilevel programming can provide the right mathematical formulations for modeling sequential decision-making problems. In such cases, it is implicit that each level anticipates the optimal reaction of the subsequent ones. Defender–attacker–defender trilevel programs are a particular case of interest that encompasses a fortification strategy, followed by an attack, and a consequent recovery defensive strategy. In “Multilevel Approaches for the Critical Node Problem,” Baggio, Carvalho, Lodi, and Tramontani study a combinatorial sequential game between a defender and an attacker that takes place in a network. The authors propose an exact algorithmic framework. This work highlights the significant improvements that the defender can achieve by taking the three-stage game into account instead of considering fortification and recovery as isolated. Simultaneously, the paper contributes to advancing the methodologies for solving trilevel programs.
Article
Identifying the critical elements of metro networks attracts a growing attention due to the significant impact of accidents on metro systems. The existing measures can be divided into two types: the localized measures (e.g., flow, centrality, etc.) in a normal network state and the impact-based measures by assuming failure scenarios to conduct before-and-after analysis. In this paper, we develop a new method to identify the critical stations in metro systems based on the concept of route redundancy. Different from the localized measures, route redundancy describes the origin-destination effective connections under potential disruptions explicitly from travelers’ perspective. Compared with the impact-based measures, the proposed method does not need to enumerate disruption scenarios and to reevaluate the resultant network performances. Specifically, the mean-excess criticality probability is proposed as a risk measure to calculate the criticality of each station in a metro network. A realistic case study based on the Shanghai metro network is conducted to demonstrate the features of the proposed method. The results indicate that the critical stations are not necessarily transfer stations or those with a high degree, and the important stations based on betweenness, passenger flow and network efficiency are not necessarily critical for the network redundancy. The proposed method could assist in a cost-effective resource allocation and an informed decision-making for strategically enhancing the metro network resiliency.
Article
This paper surveys the recent attempts, both from the machine learning and operations research communities, at leveraging machine learning to solve combinatorial optimization problems. Given the hard nature of these problems, state-of-the-art algorithms rely on handcrafted heuristics for making decisions that are otherwise too expensive to compute or mathematically not well defined. Thus, machine learning looks like a natural candidate to make such decisions in a more principled and optimized way. We advocate for pushing further the integration of machine learning and combinatorial optimization and detail a methodology to do so. A main point of the paper is seeing generic optimization problems as data points and inquiring what is the relevant distribution of problems to use for learning on a given task.
Article
A multifactorial evolutionary algorithm (MFEA) is a recently proposed algorithm for evolutionary multitasking, which optimizes multiple optimization tasks simultaneously. With the design of knowledge transfer among different tasks, MFEA has demonstrated the capability to outperform its single-task counterpart in terms of both convergence speed and solution quality. In MFEA, the knowledge transfer across tasks is realized via the crossover between solutions that possess different skill factors . This crossover is thus essential to the performance of MFEA. However, we note that the present MFEA and most of its existing variants only employ a single crossover for knowledge transfer, and fix it throughout the evolutionary search process. As different crossover operators have a unique bias in generating offspring, the appropriate configuration of crossover for knowledge transfer in MFEA is necessary toward robust search performance, for solving different problems. Nevertheless, to the best of our knowledge, there is no effort being conducted on the adaptive configuration of crossovers in MFEA for knowledge transfer, and this article thus presents an attempt to fill this gap. In particular, here, we first investigate how different types of crossover affect the knowledge transfer in MFEA on both single-objective (SO) and multiobjective (MO) continuous optimization problems. Furthermore, toward robust and efficient multitask optimization performance, we propose a new MFEA with adaptive knowledge transfer (MFEA-AKT), in which the crossover operator employed for knowledge transfer is self-adapted based on the information collected along the evolutionary search process. To verify the effectiveness of the proposed method, comprehensive empirical studies on both SO and MO multitask benchmarks have been conducted. The experimental results show that the proposed MFEA-AKT is able to identify the appropriate knowledge transfer crossover for different optimization problems and even at different optimization stages along the search, which thus leads to superior or competitive performances when compared to the MFEAs with fixed knowledge transfer crossover operators.
Article
Attributed graphs are widely used to represent network data where the attribute information of nodes is available. To address the problem of identifying clusters in attributed graphs, most existing solutions are developed based on particular assumptions about the characteristics of the clusters of interest. However, it is yet unknown whether such assumed characteristics are consistent with attributed graphs. To overcome this issue, we introduce an inductive clustering algorithm that aims to address the clustering problem for attributed graphs without any assumption made on the clusters. To do so, we first process the attribute information to obtain pairwise attribute values that significantly frequently co-occur in adjacent nodes, as we believe that they have the potential to represent the characteristics of a given attributed graph. For two adjacent nodes, their likelihood of being grouped in the same cluster can be weighted by their ability to characterize the graph. Then, based on these verified characteristics instead of assumed ones, a depth-first search strategy is applied to perform the clustering task. Moreover, we are also able to classify clusters such that their significance can be indicated. The experimental results demonstrate the performance and usefulness of our algorithm.
Article
The problem of finding a minimum weighted vertex cover (MWVC) in a graph is a well-known combinatorial optimisation problem with important applications. This article introduces a novel local search algorithm called NuMWVC for MWVC based on three ideas. First, four reduction rules are introduced during the initial construction phase. Second, a strategy called configuration checking with aspiration, which aims for reducing cycling in local search, is proposed for MWVC for the first time. Moreover, a self-adaptive vertex removing strategy is proposed to save time spent on searching solutions for which the quality is likely far from optimality. Experimental results show that NuMWVC outperforms state-of-the-art local search algorithms for MWVC on the standard benchmarks, massive graphs and real-world problem (map labeling problem) instances.
Article
In networked systems such as communication networks or power grids, graph separation from node failures can damage the overall operation severely. One of the most important goals of network attackers is thus to separate nodes so that the sizes of connected components become small. In this work, we consider the problem of finding a minimum α-separator, that partitions the graph into connected components of sizes at most αn, where n is the number of nodes. To solve the α-separator problem, we develop a random walk algorithm based on Metropolis chain. We characterize the conditions for the first passage time (to find an optimal solution) of our algorithm. We also find an optimal cooling schedule, under which the random walk converges to an optimal solution almost surely. Furthermore, we generalize our algorithm to non-uniform node weights. We show through extensive simulations that the first passage time is less than O(n³), thereby validating our analysis. The solution found by our algorithm allows us to identify the weakest points in the network that need to be strengthened. Simulations in real topologies show that attacking a dense area is often not an efficient solution for partitioning a network into small components.
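A bare-bones version of the random-walk idea above is sketched below: a Metropolis chain over candidate separators, where the "energy" of a state is its size plus a penalty on components larger than αn. The penalty weight, the fixed temperature (the paper derives conditions and a schedule rather than using a constant), and the toy grid graph are illustrative assumptions.

```python
import math
import random

def metropolis_separator(adj, alpha=0.4, temperature=1.0, penalty=10.0,
                         steps=20000, seed=0):
    """Metropolis random walk over candidate separators: flip one vertex in or
    out of the separator per step and accept with the Metropolis rule."""
    rng = random.Random(seed)
    vertices = list(adj)
    alpha_n = max(1, int(alpha * len(adj)))

    def component_sizes(removed):
        seen, sizes = set(removed), []
        for v in adj:
            if v in seen:
                continue
            stack, size = [v], 0
            seen.add(v)
            while stack:
                u = stack.pop()
                size += 1
                for w in adj[u]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            sizes.append(size)
        return sizes

    def energy(sep):
        # Separator size plus a penalty on every component exceeding alpha * n.
        excess = sum(max(0, s - alpha_n) for s in component_sizes(sep))
        return len(sep) + penalty * excess

    state, e_state, best = set(), energy(set()), None
    for _ in range(steps):
        proposal = state ^ {rng.choice(vertices)}      # flip one vertex
        e_prop = energy(proposal)
        if e_prop <= e_state or rng.random() < math.exp((e_state - e_prop) / temperature):
            state, e_state = proposal, e_prop
        if e_state == len(state) and (best is None or len(state) < len(best)):
            best = set(state)                          # feasible and smallest so far
    return best

if __name__ == "__main__":
    # 3x3 grid graph; alpha = 0.4 caps every residual component at 3 vertices.
    adj = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5], 3: [0, 4, 6],
           4: [1, 3, 5, 7], 5: [2, 4, 8], 6: [3, 7], 7: [4, 6, 8], 8: [5, 7]}
    print(sorted(metropolis_separator(adj)))
```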
Article
Road networks are extremely vulnerable to cascading failures caused by traffic accidents or anomalous events. Accurate identification of critical nodes, whose failure may cause a dramatic reduction in the transmission efficiency of the road network, is therefore of great significance to traffic management and control schemes. However, none of the existing approaches can locate city-wide critical nodes in real road networks. In this paper, we propose a novel data-driven framework that ranks node importance by mining comprehensive vehicle trajectory data, rather than analyzing the topology of the road network alone. In this framework, we introduce a trip network modeled as a tripartite graph to characterize the dynamics of the road network. Furthermore, we present two algorithms, an origin-destination entropy with flow (ODEF) algorithm and a crossroad-rank (CRRank) algorithm, to better exploit the information contained in the tripartite graph and to effectively assess node importance. ODEF draws on the idea of information entropy to evaluate the centrality of a node and calculates its importance rating by combining this centrality with the traffic flow. CRRank is a ranking algorithm based on eigenvector centrality that captures the mutually reinforcing relationships among OD-pairs, paths, and intersections. In addition to the factors considered in ODEF, CRRank accounts for the irreplaceability of a node and the spatial relationships between neighboring nodes. We conduct a synthetic experiment and a real case study based on a real-world dataset of taxi trajectories; the experiments verify the utility of the proposed algorithms.
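CRRank builds on eigenvector centrality. As a minimal illustration of that building block only (not of the tripartite OD-pair/path/intersection model), a plain power iteration on an adjacency dictionary looks as follows.

```python
def eigenvector_centrality(adj, iters=100, tol=1e-8):
    """Plain power iteration on an undirected adjacency dict {node: [neighbours]}.
    Illustrates the eigenvector-centrality building block behind CRRank, not the
    full tripartite trip-network model."""
    nodes = list(adj)
    score = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iters):
        new = {v: sum(score[u] for u in adj[v]) for v in nodes}
        norm = sum(abs(x) for x in new.values()) or 1.0
        new = {v: x / norm for v, x in new.items()}
        converged = max(abs(new[v] - score[v]) for v in nodes) < tol
        score = new
        if converged:
            break
    return sorted(score, key=score.get, reverse=True)   # most central first
```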
Article
Detecting an overlapping community and its evolving tendency in a complex network is an important challenge. To the best of our knowledge, there is no overlapping community detection method that achieves both high normalized mutual information (NMI) and high F-score while also predicting an overlapping community's future by considering node evolution, activeness, and multiscaling. This paper presents a novel method based on node vitality, an extension of node fitness for modeling network evolution constrained by multiscaling and preferential attachment. First, based on a node's dynamics, such as link creation and destruction, we compute node vitality by comparing consecutive network snapshots. We then combine it with the fitness function to obtain a new objective function. Next, by optimizing this objective function, we expand maximal cliques, reassign overlapping nodes, and find the overlapping community that matches not only the current network but also its future version. Through experiments, we show that the method's NMI and F-score exceed those of state-of-the-art methods under diverse overlap and connection-density conditions. We also validate the effectiveness of node vitality for modeling a node's evolution. Finally, we show how to detect an overlapping community in a real-world evolving network.
Article
For a network formed by nodes and undirected links between pairs of nodes, the network optimal attack problem aims at deleting a minimum number of target nodes to break the network down into many small components. This problem is intrinsically related to the feedback vertex set problem, which was successfully tackled by spin glass theory and an associated belief propagation-guided decimation (BPD) algorithm [H.-J. Zhou, Eur. Phys. J. B 86 (2013) 455]. In the present work we apply a slightly adjusted version of BPD (with approximately linear time complexity) to the network optimal attack problem, and demonstrate that it performs much better than the recently proposed Collective Influence algorithm [F. Morone and H. A. Makse, Nature 524 (2015) 63–68] on different types of random networks and real-world network instances. The BPD-guided attack scheme often induces an abrupt collapse of the whole network, which may make it very difficult to defend.
Chapter
The critical node detection problem seeks a set of nodes, of at most a given cardinality, whose deletion results in maximum pairwise disconnectivity. The critical nodes are responsible for the overall connectivity of the graph. In prior work by the authors, a novel combinatorial algorithm was proposed to identify critical nodes in sparse graphs, and its robustness was demonstrated on several test instances. In this work, we apply the algorithm to the human protein–protein interaction (PPI) network, where the nodes correspond to proteins and the edges to interactions between proteins. The heuristic is applied to identify the critical nodes on a subgraph of the PPI network induced by the node set corresponding to the proteins present in the cancer pathway of the human PPI network. This set of proteins is obtained from the Human Cancer Protein Interaction Network (HCPIN) database, and the information about the interactions between these proteins is obtained from the Human Protein Reference Database (HPRD) in order to construct the graph. The critical nodes in the human cancer protein network correspond to hub proteins that are responsible for the overall connectivity of the graph and play a role in multiple biological processes. The dysfunction of interactions with some of these hub proteins, or mutations in these proteins, has been directly linked to cancer and other diseases. In this research, such hub proteins were identified from a purely graph-theoretic perspective, in terms of their role in determining the overall connectivity of the PPI network. This technique may shed light on hub proteins that are yet to be discovered and on proteins responsible for other genetic disorders.
Article
An effective way to analyze and understand the structural properties of networks is to find their most critical nodes; this makes the networks easier to control, whether the purpose is to keep those nodes or to delete them. Given a graph, the Critical Node Detection Problem (CNDP) consists in finding a set of nodes whose deletion makes the induced graph satisfy some given connectivity metric. In this paper, we propose and study a new variant of this problem, called the Component-Cardinality-Constrained Critical Node Problem (3C-CNP). In this variant, we seek a minimal set of nodes whose removal bounds the size of each connected component in the induced graph by a given value. We prove the NP-hardness of this problem on graphs of bounded maximum degree, from which we deduce the NP-hardness of CNP (Arulselvan et al., 2009) on the same class of graphs. We also study 3C-CNP on trees for different cases of node weights and connection costs. For the case where node weights and connection costs take non-negative values, we prove its NP-completeness, while for the case where node weights (or connection costs) have unit values, we present a polynomial algorithm. In addition, we study 3C-CNP on chordal graphs, showing that it is NP-complete on split graphs and polynomially solvable on proper interval graphs.
Conference Paper
Given a vertex-weighted undirected graph G = (V, E, w) and a positive integer k, we consider the k-separator problem: it consists in finding a minimum-weight subset of vertices whose removal leads to a graph where the size of each connected component is less than or equal to k. We show that this problem can be solved in polynomial time for some graph classes: for cycles and trees by a dynamic programming approach, and, by using a peculiar graph transformation coupled with recent results from the literature, for mK₂-free, (G₁, G₂, G₃, P₆)-free, interval-filament, asteroidal-triple-free, weakly chordal, interval and circular-arc graphs. Approximation algorithms are also presented.
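For trees, the dynamic programming approach mentioned above can be sketched for the unit-weight case as follows (the paper treats general vertex weights and several other graph classes); the states and recurrence below are our own reconstruction, not the authors' pseudocode.

```python
def min_k_separator_tree(adj, root, k):
    """Minimum number of vertices to delete from a tree so that every remaining
    connected component has at most k vertices. `adj` is {node: [neighbours]}.
    Sketch for the unit-weight case only."""
    import sys
    sys.setrecursionlimit(10000)
    INF = float("inf")

    def solve(v, parent):
        # keep[s]: min deletions in v's subtree if v is kept and the component
        #          containing v inside the subtree has exactly s vertices (s <= k)
        # delete : min deletions in v's subtree if v itself is deleted
        keep = {1: 0}
        delete = 1
        for c in adj[v]:
            if c == parent:
                continue
            c_keep, c_delete = solve(c, v)
            delete += min(c_delete, min(c_keep.values()))
            merged = {}
            for s, cost in keep.items():
                # option 1: delete the child, so the components stay separate
                merged[s] = min(merged.get(s, INF), cost + c_delete)
                # option 2: keep the child; its component merges with v's
                for t, ccost in c_keep.items():
                    if s + t <= k:
                        merged[s + t] = min(merged.get(s + t, INF), cost + ccost)
            keep = merged
        return keep, delete

    keep, delete = solve(root, None)
    return min(delete, min(keep.values()))
```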
Article
We examine variants of the critical node problem on specially structured graphs, which aim to identify a subset of nodes whose removal will maximally disconnect the graph. These problems lie at the intersection of network interdiction and graph theory research and are relevant to several practical optimization problems. The two connectivity metrics that we consider are the number of maximal connected components (which we attempt to maximize) and the largest component size (which we attempt to minimize). We develop optimal polynomial-time dynamic programming algorithms for solving these problems on tree structures and on series-parallel graphs, corresponding to each graph-connectivity metric. We also extend our discussion by considering node deletion costs, node weights, and solving the problems on generalizations of tree structures. Finally, we demonstrate the computational efficacy of our approach on randomly generated graph instances.
Article
In this paper we present a randomized rounding algorithm for approximating the cardinality-constrained critical node detection problem. This problem seeks to fragment a given network into subgraphs no larger than a prescribed cardinality by removing the smallest possible subset of vertices from the original graph. The motivating application is the containment of pandemic disease by prophylactic vaccination; however, the approach is general. We prove that a derandomized algorithm provides, in expectation, a 1/(1−θ)-approximation to the optimal objective value, where θ is a rounding threshold. To improve practical performance, a local search is subsequently performed. We verify the algorithm's performance using four common complex network models with different structural properties and over a variety of cardinality constraints.
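The rounding step itself is easy to state: given a fractional LP solution x (assumed to be computed elsewhere), delete every vertex with x_v at or above a threshold θ, then repair any component that still exceeds the cardinality bound. The repair rule in the sketch below (repeatedly deleting a highest-degree vertex of an oversized component) is our own placeholder for the paper's local search.

```python
from collections import deque

def round_and_repair(adj, frac_x, L, theta=0.5):
    """frac_x: {node: fractional LP value in [0, 1]}, assumed precomputed.
    Threshold rounding followed by a naive repair pass (our placeholder, not the
    paper's local search): keep deleting a highest-degree vertex from any
    component whose size still exceeds L."""
    deleted = {v for v, x in frac_x.items() if x >= theta}

    def components(removed):
        seen, comps = set(removed), []
        for s in adj:
            if s in seen:
                continue
            q, comp = deque([s]), []
            seen.add(s)
            while q:
                u = q.popleft()
                comp.append(u)
                for w in adj[u]:
                    if w not in seen:
                        seen.add(w)
                        q.append(w)
            comps.append(comp)
        return comps

    while True:
        oversized = [c for c in components(deleted) if len(c) > L]
        if not oversized:
            return deleted
        comp = oversized[0]
        # delete the vertex with the most neighbours inside the component
        deleted.add(max(comp, key=lambda v: sum(1 for w in adj[v] if w in comp)))
```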
Chapter
We consider methodologies for managing risk in a telecommunication network based on the identification of critical nodes. The objective is to minimize the number of vertices whose deletion results in disconnected components whose sizes are constrained by a given cardinality. This is referred to as the CARDINALITY CONSTRAINED CRITICAL NODE PROBLEM (CC-CNP), and it finds application in epidemic control, telecommunications, and military tactical planning, among other areas. From a telecommunication perspective, the set of critical nodes helps determine which players should be removed from the network in the event of a virus outbreak. Conversely, in order to maintain maximum global connectivity, it should be ensured that the critical nodes remain intact and as secure as possible; the presence of these nodes makes a network vulnerable to attacks, as they are crucial for the overall connectivity. CC-CNP is a variation of the CRITICAL NODE DETECTION PROBLEM, for which complexity results and heuristic procedures are known. In this chapter, we review recent work in this area, provide formulations based on integer linear programming, and develop heuristic procedures for CC-CNP. We also examine the relation of CC-CNP to the well-known NODE DELETION PROBLEM and discuss complexity results that follow from this relation.
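Integer linear programming formulations of this kind typically use binary variables v_i (vertex i is deleted) and u_ij (vertices i and j end up in the same surviving component). The following is a sketch of one standard pairwise-connectivity model for the cardinality-constrained variant with component-size bound L; it is drawn from the general CNDP literature and is not necessarily the exact formulation reviewed in this chapter.

minimize    Σ_{i∈V} v_i
subject to  u_ij + v_i + v_j ≥ 1                                  for every edge (i, j) ∈ E,
            u_ij + u_jk − u_ik ≤ 1   (all three orderings)         for all distinct i, j, k ∈ V,
            Σ_{j≠i} u_ij ≤ L − 1                                   for every i ∈ V,
            v_i, u_ij ∈ {0, 1}.

The first family of constraints forces u_ij = 1 whenever both endpoints of an edge survive; the triangle constraints propagate this along surviving paths, so u_ij = 1 for every surviving pair in the same component; the last family then bounds every surviving component to at most L vertices.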
Article
Tabucol is a tabu search algorithm that tries to determine whether the vertices of a given graph can be colored with a fixed number k of colors such that no edge has both endpoints of the same color. The algorithm was proposed in 1987, one year after Fred Glover's article that launched tabu search. While better-performing local search algorithms have since been proposed, Tabucol remains very popular and is often chosen as a subroutine in hybrid algorithms that combine a local search with a population-based method. In order to explain this unfailing success, we conduct a thorough survey of local search techniques for graph coloring problems and point out the main differences between these techniques.
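The core Tabucol move is easy to sketch: pick a conflicting vertex, move it to a different colour class, and forbid undoing that assignment for a short tenure unless the move would beat the best conflict count seen so far (the aspiration criterion). The Python sketch below is an illustration only; the tenure rule and parameter values are ours, not the original 1987 settings.

```python
import random

def tabucol(adj, k, max_iters=100000, tabu_tenure=7, seed=0):
    """Minimal Tabucol-style sketch: adj = {v: set(neighbours)}, k = number of
    colours. Returns a conflict-free colouring dict or None."""
    rng = random.Random(seed)
    nodes = list(adj)
    colour = {v: rng.randrange(k) for v in nodes}
    tabu = {}                                   # (vertex, colour) -> tabu-until iteration
    conflicts = sum(1 for v in nodes for u in adj[v] if colour[u] == colour[v]) // 2
    best = conflicts
    for it in range(max_iters):
        if conflicts == 0:
            return colour
        conflicting = [v for v in nodes if any(colour[u] == colour[v] for u in adj[v])]
        best_move, best_delta = None, None
        for v in conflicting:
            same_old = sum(1 for u in adj[v] if colour[u] == colour[v])
            for c in range(k):
                if c == colour[v]:
                    continue
                delta = sum(1 for u in adj[v] if colour[u] == c) - same_old
                tabu_active = tabu.get((v, c), -1) >= it
                # aspiration: a tabu move is allowed if it would improve on the
                # best conflict count found so far
                if tabu_active and conflicts + delta >= best:
                    continue
                if best_delta is None or delta < best_delta:
                    best_move, best_delta = (v, c), delta
        if best_move is None:
            continue                            # all moves tabu without aspiration
        v, c = best_move
        tabu[(v, colour[v])] = it + tabu_tenure + rng.randrange(len(conflicting) + 1)
        colour[v] = c
        conflicts += best_delta
        best = min(best, conflicts)
    return None
```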
Article
Identifying critical nodes in a graph is important for understanding the structural characteristics and connectivity properties of the network. In this paper, we focus on detecting critical nodes, i.e., nodes whose deletion results in the minimum pairwise connectivity among the remaining nodes. This problem, known as the critical node problem, has applications in several fields, including biomedicine, telecommunications, and military strategic planning. We show that the recognition version of the problem is NP-complete and derive a mathematical formulation based on integer linear programming. In addition, we propose a heuristic for the problem that exploits the combinatorial structure of the graph; the heuristic is then enhanced by the application of a local improvement method. A computational study is presented in which we apply the integer programming formulation and the heuristic to real and randomly generated data sets. For all instances tested, the heuristic efficiently provides optimal solutions in a fraction of the time required by a commercial software package.
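The pairwise-connectivity objective used here is simply the sum of size·(size−1)/2 over the connected components that remain. As a point of reference only, a naive greedy baseline (not the paper's structure-exploiting heuristic) removes at each step the vertex whose deletion yields the smallest residual objective:

```python
from collections import deque

def pairwise_connectivity(adj, removed):
    """Sum over connected components of size*(size-1)/2 after deleting `removed`."""
    seen, total = set(removed), 0
    for s in adj:
        if s in seen:
            continue
        q, size = deque([s]), 0
        seen.add(s)
        while q:
            u = q.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    q.append(w)
        total += size * (size - 1) // 2
    return total

def greedy_critical_nodes(adj, budget):
    """Naive greedy baseline for the critical node problem: repeatedly delete
    the vertex whose removal most reduces pairwise connectivity. Illustrative
    only; the paper's heuristic is more structure-aware and refined by a local
    improvement method."""
    removed = set()
    for _ in range(budget):
        candidates = [v for v in adj if v not in removed]
        best_v = min(candidates,
                     key=lambda v: pairwise_connectivity(adj, removed | {v}))
        removed.add(best_v)
    return removed
```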
Conference Paper
Let G be an n-vertex graph that has a vertex separator of size k that partitions the graph into connected components of size smaller than αn, for some fixed 2/3 ≤ α < 1. Such a separator is called an α-separator. Finding an α-separator of size at most k is NP-hard. Moreover, under reasonable complexity-theoretic assumptions, it is shown that this problem is not polynomially solvable even when k = O(log n). In this paper, we give a randomized algorithm that finds an α-separator of size k in the given graph, unless the graph contains an (α + ε)-separator of size strictly less than k, in which case our algorithm finds one such separator. For fixed ε, the running time of our algorithm is n^{O(1)} 2^{O(k)}, which is polynomial for k = O(log n). For bounded-degree graphs (as well as for the case of finding balanced edge separators), we present a deterministic algorithm with similar running time. Our algorithm involves (among other things) a new concept that we call (ε, k)-samples. This is related to the notion of detection sets for network failures, introduced by Kleinberg (FOCS 2000). Our proofs adapt and simplify techniques that were introduced by Kleinberg. As a by-product, our proof improves the known bounds on the size of detection sets. We also show applications of (ε, k)-samples to problems in approximation algorithms and rigorous analysis of heuristics.
Article
A set of problems which has attracted considerable interest recently is the set of node-deletion problems. The general node-deletion problem can be stated as follows: given a graph, find the minimum number of nodes whose deletion results in a subgraph satisfying property π. In [LY] this problem was shown to be NP-complete for a large class of properties (the class of properties that are hereditary on induced subgraphs) using a small number of reduction schemes from the node cover problem. Since the node cover problem becomes polynomial on bipartite graphs, it might be hoped that this is the case with other node-deletion problems too. In this paper we characterize those properties for which the bipartite restriction of the node-deletion problem is polynomial and those for which it remains NP-complete. Similar results follow for analogous problems on other structures such as families of sets, hypergraphs and 0,1 matrices. For example, in the case of matrices, our result states that if M is a class of 0,1 matrices which is closed under permutation and deletion of rows and columns, then finding the largest submatrix in M of a given matrix is polynomial if the matrices of M have bounded rank and NP-complete otherwise.
Article
While methods for comparing two learning algorithms on a single data set have been scrutinized for quite some time, the issue of statistical tests for comparing multiple algorithms over multiple data sets, which is even more essential to typical machine learning studies, has been all but ignored. This article reviews current practice and then theoretically and empirically examines several suitable tests. Based on this, we recommend a set of simple, yet safe and robust non-parametric tests for statistical comparisons of classifiers: the Wilcoxon signed-ranks test for comparing two classifiers, and the Friedman test with the corresponding post-hoc tests for comparing multiple classifiers over multiple data sets. Results of the latter can also be neatly presented with the newly introduced CD (critical difference) diagrams.
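In practice these recommendations map onto a few library calls. The sketch below uses SciPy; the score matrix is made up purely for illustration, and the post-hoc Nemenyi test and the CD diagram would require an additional package, which SciPy itself does not provide.

```python
from scipy import stats

# Rows: data sets (benchmark instances), columns: algorithms. The values are
# made-up error rates, purely for illustration.
scores = [
    [0.21, 0.24, 0.20],
    [0.15, 0.19, 0.16],
    [0.30, 0.33, 0.28],
    [0.12, 0.14, 0.11],
    [0.25, 0.27, 0.24],
    [0.18, 0.22, 0.17],
]

# Two classifiers over multiple data sets: Wilcoxon signed-ranks test on the
# paired per-data-set scores.
a = [row[0] for row in scores]
b = [row[1] for row in scores]
stat, p = stats.wilcoxon(a, b)
print(f"Wilcoxon: statistic={stat:.3f}, p={p:.4f}")

# More than two classifiers: Friedman test over the per-data-set results; if it
# rejects the null hypothesis, a post-hoc test (e.g. Nemenyi) and a
# critical-difference diagram would follow.
cols = list(zip(*scores))
stat, p = stats.friedmanchisquare(*cols)
print(f"Friedman: statistic={stat:.3f}, p={p:.4f}")
```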