Publications (300) · 101.8 Total impact

ABSTRACT: Feature selection plays a vital role in many areas of pattern recognition and data mining. The effective computation of feature selection is important for improving the classification performance. In rough set theory, many feature selection algorithms have been proposed to process static incomplete data. However, feature values in an incomplete data set may vary dynamically in real-world applications. For such dynamic incomplete data, a classic (non-incremental) approach of feature selection is usually computationally time-consuming. To overcome this disadvantage, we propose an incremental approach for feature selection, which can accelerate the feature selection process in dynamic incomplete data. We first compute the new positive region incrementally when feature values with respect to an object set vary dynamically. Based on the calculated positive region, two efficient incremental feature selection algorithms are developed, one for a single object and one for multiple objects with varying feature values. Then we conduct a series of experiments with 12 UCI real data sets to evaluate the efficiency and effectiveness of our proposed algorithms. The experimental results show that the proposed algorithms compare favorably with the existing non-incremental methods.
Pattern Recognition 12/2014; 47(12):3890–3906. DOI:10.1016/j.patcog.2014.06.002 · 2.58 Impact Factor
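To make the notion of a positive region in incomplete data concrete, the following is a minimal sketch of the standard non-incremental computation, not the incremental algorithms of the paper. It assumes the usual tolerance relation for incomplete data, in which a missing value matches anything; the marker `MISSING`, the function names and the toy data are all hypothetical.

```python
MISSING = "*"  # assumed marker for an unknown feature value


def tolerant(x, y, features):
    """True if x and y agree on every feature, treating MISSING as a wildcard."""
    return all(x[a] == y[a] or MISSING in (x[a], y[a]) for a in features)


def tolerance_class(i, data, features):
    """Indices of all objects that are tolerant with object i."""
    return {j for j, y in enumerate(data) if tolerant(data[i], y, features)}


def positive_region(data, decisions, features):
    """Objects whose whole tolerance class shares a single decision value."""
    pos = set()
    for i in range(len(data)):
        cls = tolerance_class(i, data, features)
        if all(decisions[j] == decisions[i] for j in cls):
            pos.add(i)
    return pos


data = [
    {"a": 1, "b": 0},
    {"a": 1, "b": MISSING},
    {"a": 0, "b": 1},
    {"a": 0, "b": 1},
]
decisions = ["yes", "no", "yes", "yes"]

before = positive_region(data, decisions, ["a", "b"])

# A feature value of object 1 changes, so the region is recomputed from scratch.
data[1]["b"] = 1
after = positive_region(data, decisions, ["a", "b"])
print(before, after)
```

Recomputing every tolerance class after each change is the cost that an incremental approach aims to avoid: in this toy case only the classes of objects 0 and 1 are affected by the update.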
ABSTRACT: Given a set of multiple channels, a set of multiple requests, where each request contains multiple requested data items, and a client equipped with multiple antennae, the multi-antenna-based multi-request data retrieval problem (DRMRMA) is to find a data retrieval sequence for downloading all data items of the requests allocated to each antenna, such that the maximum access latency of all antennae is minimized. Most existing approaches for the data retrieval problem focus on either single antenna or single request and are hence not directly applicable to DRMRMA for retrieving multiple requests. This paper proposes two data retrieval algorithms that adopt two different grouping schemes to solve DRMRMA so that the requests can be suitably allocated to each antenna. To find the data retrieval sequence of each request efficiently, we present a data retrieval scheme that converts a wireless data broadcast system to a special tree. Experimental results show that the proposed scheme is more efficient than other existing schemes. Copyright © 2014 John Wiley & Sons, Ltd.
International Journal of Communication Systems 12/2014; DOI:10.1002/dac.2917 · 1.11 Impact Factor
ABSTRACT: A traffic matrix (TM) describes the traffic volumes traversing a network from the input nodes to the output nodes over a measured period. Such a TM contains very useful information for network managers, traffic engineers and users. However, a TM is hard to obtain and analyze due to its large size, especially for large-scale networks. In this paper, we present a new method based on diffusion wavelets for analyzing the traffic matrix. It is shown that this method can conduct efficient multi-resolution analysis (MRA) on a TM. We compare the analysis results obtained with different diffusion operators. By reconstructing the original TM from the diffused traffic on a particular level, we show the high efficiency of this MRA tool based on these operators. We then develop an anomaly detection method based on the analysis results and explore the possibilities of other potential applications.
Computers & Electrical Engineering 08/2014; 40(6). DOI:10.1016/j.compeleceng.2014.04.021 · 0.99 Impact Factor
Article: Foreword to Special Issue
Journal of Interconnection Networks 05/2014; 14(03). DOI:10.1142/S0219265913020015 
Article: Updating attribute reduction in incomplete decision systems with the variation of attribute set
ABSTRACT: In rough set theory, attribute reduction is a challenging problem in applications where data with large numbers of attributes are available. Moreover, due to the dynamic characteristics of data collection in decision systems, the attribute reduction changes dynamically as the attribute set in a decision system varies over time. Updating the attribute reduction by utilizing previous information is an important task that can help to improve the efficiency of knowledge discovery. Since attribute reduction algorithms for incomplete decision systems with a varying attribute set have not yet been discussed, this paper focuses on a positive region-based attribute reduction algorithm to solve the attribute reduction problem efficiently in incomplete decision systems with a dynamically varying attribute set. We first introduce an incremental manner of calculating the new positive region and tolerance classes. Then, based on the calculated positive region and tolerance classes, the corresponding algorithms for computing a new attribute reduct are put forward for the cases in which an attribute set is added to or deleted from the incomplete decision system. Finally, numerical experiments conducted on different data sets from UCI validate the effectiveness and efficiency of the proposed algorithms in incomplete decision systems with a varying attribute set.
International Journal of Approximate Reasoning 03/2014; 55(3):867–884. DOI:10.1016/j.ijar.2013.09.015 · 1.98 Impact Factor
Computer Science and Information Systems 01/2014; 11(1):309–320. DOI:10.2298/CSIS130212010T · 0.58 Impact Factor

Conference Paper: Efficient Approximation Algorithm for Data Retrieval with Conflicts in Wireless Networks
ABSTRACT: Given a set of data items broadcast on multiple parallel channels, where each channel has the same broadcast pattern over a time period, and a set of data items requested by a client, the data retrieval problem is to find a sequence of channel accesses that retrieves the requested data items among the channels such that the total access latency is minimized, where both a channel access (to retrieve a data item) and a channel switch are assumed to take a single time slot. As an important problem of information retrieval in wireless networks, this problem arises in many applications such as e-commerce and ubiquitous data sharing, and is known to involve two conflicts: requested data items are broadcast at the same time slot or at adjacent time slots in different channels. Although existing studies focus on this problem with one conflict, there is little work on this problem with two conflicts. This paper therefore proposes efficient algorithms from two views: single antenna and multiple antennae. Our algorithm adopts a novel approach that converts the wireless data broadcast system to a DAG and applies set cover to solve the problem. Experiments indicate that this is currently the most efficient algorithm for this problem with two conflicts.
Proceedings of International Conference on Advances in Mobile Computing & Multimedia; 12/2013
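The two conflicts named in the abstract can be illustrated with a small check. This is only a sketch under stated assumptions: each requested item is described by a hypothetical `(channel, slot)` pair within one broadcast cycle, wrap-around between cycles is ignored, and the names `conflicts` and `conflicting_pairs` are not from the paper.

```python
from itertools import combinations


def conflicts(item_a, item_b):
    """True if a single antenna cannot download both items in this cycle.

    Items on different channels conflict when they share a time slot, or sit
    in adjacent slots, because a channel switch itself costs one slot.
    """
    (ch_a, slot_a), (ch_b, slot_b) = item_a, item_b
    return ch_a != ch_b and abs(slot_a - slot_b) <= 1


def conflicting_pairs(schedule):
    """All pairs of requested items that are in conflict, in name order."""
    names = sorted(schedule)
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if conflicts(schedule[a], schedule[b])
    ]


# Hypothetical request: item name -> (channel, time slot)
schedule = {"d1": (0, 1), "d2": (1, 1), "d3": (1, 2), "d4": (0, 5)}
pairs = conflicting_pairs(schedule)
print(pairs)  # [('d1', 'd2'), ('d1', 'd3')]
```

Here `d1` and `d2` collide on the same slot, while `d1` and `d3` are one slot apart on different channels, which leaves no slot for the switch.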
ABSTRACT: In the emerging environment of the Internet of things (IoT), through the connection of billions of radio frequency identification (RFID) tags and sensors to the Internet, applications will generate an unprecedented number of transactions and amount of data that require novel approaches in RFID data stream processing and management. Unfortunately, it is difficult to maintain a distributed model without a shared directory or structured index. In this paper, we propose a fully distributed model for federated RFID data streams. This model combines two techniques, namely, tilted time frame and histogram, to represent the patterns of object flows. Our model is efficient in space and can be stored in main memory. The model is built on top of an unstructured P2P overlay. To reduce the overhead of distributed data acquisition, we further propose several algorithms that use a statistically minimum number of network calls to maintain the model. The scalability and efficiency of the proposed model are demonstrated through an extensive set of experiments.
IEEE Transactions on Parallel and Distributed Systems 10/2013; 24(10):2036–2045. DOI:10.1109/TPDS.2013.99 · 2.17 Impact Factor
ABSTRACT: For a given undirected (edge) weighted graph G = (V, E), a terminal set S ⊆ V and a root r ∈ S, the rooted k-vertex connected minimum Steiner network (kVSMN_r) problem requires constructing a minimum-cost subgraph of G such that each terminal in S \ {r} is k-vertex connected to r. As an important problem in survivable network design, the kVSMN_r problem is known to be NP-hard even when k = 1 [14]. For k = 3 this paper presents a simple combinatorial eight-approximation algorithm, improving the known best ratio 14 of Nutov [20]. Our algorithm constructs an approximate 3VSMN_r by augmenting a two-vertex connected counterpart with additional edges of bounded cost to the optimal. We prove that the total cost of the added edges is at most six times the optimal by showing that the edges in a 3VSMN_r compose a subgraph containing our solution in such a way that each edge appears in the subgraph at most six times.
IEEE Transactions on Computers 09/2013; 62(9):1684–1693. DOI:10.1109/TC.2012.170 · 1.47 Impact Factor
Conference Paper: Improved approximation algorithms for constrained fault-tolerant resource allocation
ABSTRACT: In the Constrained Fault-Tolerant Resource Allocation (FTRA) problem, we are given a set of sites containing facilities as resources and a set of clients accessing these resources. Each site i can open at most R_i facilities with opening cost f_i. Each client j requires an allocation of r_j open facilities and connecting j to any facility at site i incurs a connection cost c_ij. The goal is to minimize the total cost of this resource allocation scenario. FTRA generalizes the Unconstrained Fault-Tolerant Resource Allocation (FTRA_∞) [10] and the classical Fault-Tolerant Facility Location (FTFL) [7] problems: for every site i, FTRA_∞ does not have the constraint R_i, whereas FTFL sets R_i = 1. These problems are said to be uniform if all r_j's are the same, and general otherwise. For the general metric FTRA, we first give an LP-rounding algorithm achieving an approximation ratio of 4. Then we show the problem reduces to FTFL, implying the ratio of 1.7245 from [2]. For the uniform FTRA, we provide a 1.52-approximation primal-dual algorithm in O(n^4) time, where n is the total number of sites and clients.
Proceedings of the 19th international conference on Fundamentals of Computation Theory; 08/2013
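The FTRA objective described above can be sketched as a small feasibility and cost check. This is one reading of the abstract rather than the paper's formulation: it assumes a client may use several facilities at the same site, and the function name `ftra_cost` and the toy instance are hypothetical. It evaluates a given solution and does not implement any of the approximation algorithms.

```python
def ftra_cost(open_count, assign, R, f, r, c):
    """Return the total cost of an FTRA solution, or raise ValueError.

    open_count[i]: number of facilities opened at site i (at most R[i])
    assign[j][i]:  number of facilities at site i serving client j
    f[i]:          cost of each opened facility at site i
    r[j]:          number of facilities client j must be connected to
    c[i][j]:       cost of connecting client j to one facility at site i
    """
    for i, opened in enumerate(open_count):
        if not 0 <= opened <= R[i]:
            raise ValueError(f"site {i} opens {opened} facilities, limit {R[i]}")
    for j, row in enumerate(assign):
        if sum(row) != r[j]:
            raise ValueError(f"client {j} needs exactly {r[j]} facilities")
        for i, used in enumerate(row):
            if not 0 <= used <= open_count[i]:
                raise ValueError(f"client {j} uses unopened facilities at site {i}")
    opening = sum(f[i] * open_count[i] for i in range(len(open_count)))
    connection = sum(
        c[i][j] * assign[j][i]
        for j in range(len(assign))
        for i in range(len(open_count))
    )
    return opening + connection


# Hypothetical instance: 2 sites, 2 clients
R = [2, 1]             # facility limits per site
f = [3, 5]             # opening cost per facility
r = [2, 1]             # requirements per client
c = [[1, 4], [2, 1]]   # c[i][j]

cost_both_sites = ftra_cost([2, 1], [[2, 0], [0, 1]], R, f, r, c)
cost_one_site = ftra_cost([2, 0], [[2, 0], [1, 0]], R, f, r, c)
print(cost_both_sites, cost_one_site)  # 14 12
```

Setting every `R[i]` to 1 gives the FTFL special case mentioned in the abstract, and dropping the `R[i]` check corresponds to the unconstrained variant.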
Article: An efficient compressive data gathering routing scheme for large-scale wireless sensor networks
ABSTRACT: Compressive sensing based in-network compression is an efficient technique to reduce communication cost and accurately recover sensory data at the sink. Existing compressive sensing based data gathering methods require a large number of sensors to participate in each measurement gathering, which wastes a lot of energy. In this paper, we present an energy-efficient clustering routing data gathering scheme for large-scale wireless sensor networks. The main challenges of our scheme are how to obtain the optimal number of clusters and how to keep all cluster heads uniformly distributed. To solve these problems, we first formulate an energy consumption model to obtain the optimal number of clusters. Second, we design an efficient deterministic dynamic clustering scheme to guarantee that all cluster heads are approximately uniformly distributed. With extensive simulation, we demonstrate that our scheme not only prolongs the network's lifetime by nearly 2x compared with the state-of-the-art compressive sensing based data gathering schemes, but also makes the network energy consumption very uniform.
Computers & Electrical Engineering 08/2013; 39(6):1935–1946. DOI:10.1016/j.compeleceng.2013.04.009 · 0.99 Impact Factor
ABSTRACT: Data publishing based on hypergraphs is becoming increasingly popular due to its power in representing multi-relations among objects. However, security issues have been little studied on this subject, while most recent work only focuses on the protection of relational data or graphs. As a major privacy breach, identity disclosure reveals the identification of entities with certain background knowledge known by an adversary. In this paper, we first introduce a novel background knowledge attack model based on the property of hyperedge ranks, and formalize the rank-based hypergraph anonymization problem. We then propose a complete solution in a two-step framework: rank anonymization and hypergraph reconstruction. We also take hypergraph clustering (known as community detection) as data utility into consideration, and discuss two metrics to quantify information loss incurred in the perturbation. Our approaches are effective in terms of efficacy, privacy, and utility. The algorithms run in near-quadratic time on hypergraph size, and protect data from rank attacks with almost the same utility preserved. The performance of the methods has also been validated by extensive experiments on real-world datasets. Our rank-based attack model and algorithms for rank anonymization and hypergraph reconstruction are, to the best of our knowledge, the first systematic study of privacy preservation for hypergraph-based data publishing.
IEEE Transactions on Information Forensics and Security 08/2013; 8(8):1384–1396. DOI:10.1109/TIFS.2013.2271425 · 2.07 Impact Factor
Article: Improved Approximation Algorithms for Computing k Disjoint Paths Subject to Two Constraints
ABSTRACT: For a given graph $G$ with positive integral cost and delay on edges, distinct vertices $s$ and $t$, cost bound $C\in Z^{+}$ and delay bound $D\in Z^{+}$, the $k$ bi-constraint path ($k$BCP) problem is to compute $k$ disjoint $st$-paths subject to $C$ and $D$. This problem is known to be NP-hard, even when $k=1$ [4]. This paper first gives a simple approximation algorithm with factor-$(2,2)$, i.e. the algorithm computes a solution with delay and cost bounded by $2D$ and $2C$ respectively. Later, a novel improved approximation algorithm with ratio $(1+\beta, \max\{2, 1+\ln\frac{1}{\beta}\})$ is developed by constructing interesting auxiliary graphs and employing the cycle cancellation method. As a consequence, we can obtain a factor-$(1.369, 2)$ approximation algorithm by setting $1+\ln\frac{1}{\beta}=2$ and a factor-$(1.567, 1.567)$ algorithm by setting $1+\beta=1+\ln\frac{1}{\beta}$. Besides, when $\beta=0$, by slightly modifying our algorithm, an approximation algorithm with ratio $(1, (1+\epsilon)(\ln n+\ln\frac{1}{\epsilon}))$, i.e. an algorithm with only a single factor ratio $O(\ln n)$ on cost, can be immediately obtained by setting the delay of each edge $e$ to $\lfloor \frac{d(e)}{\frac{\epsilon D}{n}}\rfloor$ for a given fixed $\epsilon>0$. To the best of our knowledge, this is the first non-trivial approximation algorithm for the $k$BCP problem which strictly obeys the delay constraint. Our developed algorithms can be directly used to solve some related problems, in particular, the $k$-disjoint restricted shortest path problem ($k$RSP) [10], resulting in the same ratio $(1+\beta, \max\{2, 1+\ln\frac{1}{\beta}\})$, which improves the currently best result of ratio $(2, 2)$ in [6].
Journal of Combinatorial Optimization 01/2013; 29(1). DOI:10.1007/s10878-013-9693-x · 1.04 Impact Factor
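The two settings of $\beta$ quoted in the abstract can be checked numerically. The sketch below only evaluates the stated ratio $(1+\beta, \max\{2, 1+\ln\frac{1}{\beta}\})$; the bisection solver and all names are illustrative assumptions, not part of the paper.

```python
import math


def ratio(beta):
    """The (delay, cost) ratio pair stated in the abstract for a given beta."""
    return 1 + beta, max(2.0, 1 + math.log(1 / beta))


def balanced_beta(lo=1e-9, hi=1.0, tol=1e-12):
    """Bisection for the root of beta - ln(1/beta), i.e. 1+beta = 1+ln(1/beta).

    beta + ln(beta) is increasing on (0, 1], negative near 0 and positive
    at 1, so the root is unique in this interval.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid - math.log(1 / mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Setting 1 + ln(1/beta) = 2 gives beta = 1/e.
beta_cost_two = 1 / math.e
# Setting 1 + beta = 1 + ln(1/beta) balances the delay and cost terms.
beta_balanced = balanced_beta()

print(ratio(beta_cost_two))
print(1 + beta_balanced)
```

With $\beta = 1/e$ the delay factor evaluates to about 1.368, close to the 1.369 quoted in the abstract, and the cost factor is exactly 2. At the balanced value $\beta \approx 0.567$ the first factor is 1.567.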
Conference Paper: A rough-set based incremental approach for updating attribute reduction under dynamic incomplete decision systems
ABSTRACT: Efficient attribute reduction in large-scale incomplete decision systems is a challenging problem. The computation of tolerance classes induced by the condition attributes in the incomplete decision system is a key part of all existing attribute reduction algorithms. Moreover, updating attribute reduction for dynamically-increasing decision systems has attracted much attention, yet incremental attribute reduction algorithms for a dynamic incomplete decision system have not been sufficiently discussed so far. In this paper, we first introduce a simpler way of computing tolerance classes than the classical method. Then we present an incremental attribute reduction algorithm to compute an attribute reduct for a dynamically-increasing incomplete decision system. Compared with the non-incremental algorithms, our incremental attribute reduction algorithm can compute a new attribute reduct in much shorter time. Experiments on four data sets downloaded from UCI show the feasibility and effectiveness of the proposed incremental algorithm.
Fuzzy Systems (FUZZ), 2013 IEEE International Conference on; 01/2013
Article: On Finding Min-Min Disjoint Paths
ABSTRACT: The Min-Min problem of finding a disjoint-path pair with the length of the shorter path minimized is known to be NP-hard and admits no K-approximation for any K>1 in the general case (Xu et al. in IEEE/ACM Trans. Netw. 14:147–158, 2006). In this paper, we first show that Bhatia et al.'s NP-hardness proof (Bhatia et al. in J. Comb. Optim. 12:83–96, 2006), a claim of correction to Xu et al.'s proof (Xu et al. in IEEE/ACM Trans. Netw. 14:147–158, 2006), for the edge-disjoint Min-Min problem in general undirected graphs is incorrect, by giving a counterexample that is an unsatisfiable 3SAT instance but is classified as a satisfiable 3SAT instance in the proof of Bhatia et al. (J. Comb. Optim. 12:83–96, 2006). We then give a correct proof of NP-hardness of this problem in undirected graphs. Finally we give a polynomial-time algorithm for the vertex-disjoint Min-Min problem in planar graphs by showing that the vertex-disjoint Min-Min problem is polynomially solvable in an st-planar graph G=(V,E) whose corresponding auxiliary graph G(V,E∪{e(st)}) can be embedded into a plane, and that a planar graph can be decomposed into several st-planar graphs whose Min-Min paths collectively contain a Min-Min disjoint-path pair between s and t in the original graph G. To the best of our knowledge, these are the first polynomial algorithms for the Min-Min problems in planar graphs.
Algorithmica 01/2013; 66(3). DOI:10.1007/s00453-012-9656-0 · 0.57 Impact Factor
ABSTRACT: Randomization methods widely applied for privacy-preserving data mining are generally subject to reconstruction attack, linkage attack, and semantic-related attacks. A probabilistic anonymity definition has been proposed in [1] to defend against the linkage attack in which the attacker links the same randomized record to all of the original records. In this paper we name this type of attack the Multiple (original records) to One (randomized record) attack, and focus on another attack that has not been researched before, i.e. the One (original record) to Multiple (randomized records) attack. The latter is different from the former in that it does not require the attacker to know the distribution and all values of quasi-identifiers in original records, and thus is easier for the attacker to launch. To defend against this attack we propose a novel probabilistic anonymity concept different from [1]. We achieve this anonymity goal on a hybrid model combining random projection and random noise addition. We also analyze the security properties of this model against the other common types of attacks. Compared with existing work in randomization, k-anonymity and differential privacy, our work achieves the holistic aim of higher security, higher efficiency and higher data utility, and demonstrates very promising applications in large-scale and high-dimensional data mining in clouds.
e-Business Engineering (ICEBE), 2013 IEEE 10th International Conference on; 01/2013
ABSTRACT: In the Constrained Fault-Tolerant Resource Allocation (FTRA) problem, we are given a set of sites containing facilities as resources, and a set of clients accessing these resources. Specifically, each site i is allowed to open at most R_i facilities with cost f_i for each opened facility. Each client j requires an allocation of r_j open facilities and connecting j to any facility at site i incurs a connection cost c_ij. The goal is to minimize the total cost of this resource allocation scenario. FTRA generalizes the Unconstrained Fault-Tolerant Resource Allocation (FTRA_{\infty}) [18] and the classical Fault-Tolerant Facility Location (FTFL) [13] problems: for every site i, FTRA_{\infty} does not have the constraint R_i, whereas FTFL sets R_i=1. These problems are said to be uniform if all r_j's are the same, and general otherwise. For the general metric FTRA, we first give an LP-rounding algorithm achieving the approximation ratio of 4. Then we show the problem reduces to FTFL, implying the ratio of 1.7245 from [3]. For the uniform FTRA, we provide a 1.52-approximation primal-dual algorithm in O(n^4) time, where n is the total number of sites and clients. We also consider the Constrained Fault-Tolerant k-Resource Allocation (k-FTRA) problem where additionally the total number of facilities that can be opened across all sites is bounded by k. For the uniform k-FTRA, we give the first constant-factor approximation algorithm with a factor of 4. Note that the above results carry over to FTRA_{\infty} and k-FTRA_{\infty}.
ABSTRACT: For an undirected and weighted graph G=(V,E) and a terminal set S of V, the 2-connected Steiner minimal network (SMN) problem requires computing a minimum-weight subgraph of G in which all terminals are 2-connected to each other. This problem has important applications in the design of survivable networks and fault-tolerant communication, and is known to be MAXSNP-hard, a harder subclass of NP-hard problems for which no polynomial-time approximation scheme (PTAS) is known. This paper presents an efficient algorithm of O(V^2S^3) time for computing a 2-vertex connected Steiner network (2VSN) whose weight is bounded by 2 times the optimal solution 2VSMN. It compares favorably with the currently known 2-approximation solution to the 2VSMN problem based on that to the survivable network design problem, with a time complexity reduction of O(V^5E7) for strongly polynomial time and O(V^5g) for weakly polynomial time, where g is determined by the sizes of the input. Our algorithm applies a novel greedy approach to generate a 2VSN through progressive improvement on a set of vertex-disjoint shortest path pairs incident with each terminal of S. The algorithm can be directly deployed to solve the 2-edge connected SMN problem at the same approximation ratio within time O(V^2S^2).
IEEE Transactions on Computers 07/2012; 61(99PP):1–1. DOI:10.1109/TC.2011.123 · 1.47 Impact Factor
ABSTRACT: The min–min problem of finding a disjoint path pair with the length of the shorter path minimized is known to be NP-complete (Xu et al., 2006) [1]. In this paper, we prove that in planar digraphs the edge-disjoint min–min problem remains NP-complete and admits no K-approximation for any K>1 unless P=NP. As a by-product, we show that this problem remains NP-complete even when all edge costs are equal (i.e., strongly NP-complete). To our knowledge, this is the first NP-completeness proof for the edge-disjoint min–min problem in planar digraphs.
Theoretical Computer Science 05/2012; 432:58–63. DOI:10.1016/j.tcs.2011.12.009 · 0.52 Impact Factor
ABSTRACT: We initiate the study of the Reliable Resource Allocation (RRA) problem. In this problem, we are given a set of sites equipped with an unbounded number of facilities as resources. Each facility has an opening cost and an estimated reliability. There is also a set of clients to be allocated to facilities with corresponding connection costs. Each client has a reliability requirement (RR) for accessing resources. The objective is to open a subset of facilities from sites to satisfy all clients' RRs at a minimum total cost. The Unconstrained Fault-Tolerant Resource Allocation (UFTRA) problem studied in (Liao & Shen 2011) is a special case of RRA. In this paper, we present two equivalent primal-dual algorithms for the RRA problem, where the second one is an acceleration of the first and runs in quasi-linear time. If all clients have the same RR above the threshold that a single facility can provide, our analysis of the algorithm yields an approximation factor of 2+2√2 and later a reduced ratio of 3.722 using a factor-revealing program. The analysis further elaborates and generalizes the generic inverse dual fitting technique introduced in (Xu & Shen 2009). As a by-product, we also formalize this technique for the classical minimum set cover problem.
Proceedings of the Eighteenth Computing: The Australasian Theory Symposium - Volume 128; 01/2012
Publication Stats
1k  Citations  
101.80  Total Impact Points  
Institutions

2013–2014

Sun Yat-Sen University
Guangzhou, Guangdong, China


2008–2014

Beijing Jiaotong University
 • School of Computer and Information Technology
 • Department of Computer Science
Beijing, China


2006–2014

University of Adelaide
 School of Computer Science
Adelaide, South Australia, Australia
Manchester Metropolitan University
Manchester, England, United Kingdom


2006–2008

University of Science and Technology of China
 School of Computer Science and Technology
Hefei, Anhui, China


2007

University of Texas at Dallas
 Department of Computer Science
Dallas, TX, United States


2001–2007

Japan Advanced Institute of Science and Technology
 School of Information Science
Ishikawa, Japan


2005–2006

Fudan University
 School of Computer Science
Shanghai, China


1994–2001

Griffith University
 School of Information and Communication Technology (ICT)
Southport, Queensland, Australia


1995

Australian National University
 Research School of Computer Science
Canberra, Australian Capital Territory, Australia
