Publications (294) · 61.2 Total impact
ABSTRACT: Feature selection plays a vital role in many areas of pattern recognition and data mining. The effective computation of feature selection is important for improving the classification performance. In rough set theory, many feature selection algorithms have been proposed to process static incomplete data. However, feature values in an incomplete data set may vary dynamically in real-world applications. For such dynamic incomplete data, a classic (non-incremental) approach to feature selection is usually computationally time-consuming. To overcome this disadvantage, we propose an incremental approach for feature selection, which can accelerate the feature selection process in dynamic incomplete data. We first compute the new positive region incrementally when feature values with respect to an object set vary dynamically. Based on the calculated positive region, two efficient incremental feature selection algorithms are developed, for a single object and for multiple objects with varying feature values, respectively. We then conduct a series of experiments with 12 UCI real data sets to evaluate the efficiency and effectiveness of the proposed algorithms. The experimental results show that the proposed algorithms compare favorably with the existing non-incremental methods.
Pattern Recognition 12/2014; 47(12):3890–3906.
Article: Foreword to Special Issue
Journal of Interconnection Networks 05/2014; 14(03).
ABSTRACT: A traffic matrix (TM) describes the traffic volumes traversing a network from the input nodes to the output nodes over a measured period. Such a TM contains very useful information for network managers, traffic engineers and users. However, a TM is hard to obtain and analyze due to its large size, especially for large-scale networks. In this paper, we present a new method based on diffusion wavelets for analyzing the traffic matrix. It is shown that this method can conduct efficient multiresolution analysis (MRA) on a TM. We compare the analysis results obtained with different diffusion operators. By reconstructing the original TM from the diffused traffic on a particular level, we show the high efficiency of this MRA tool based on these operators. We then develop an anomaly detection method based on the analysis results and explore the possibilities of other potential applications.
Computers & Electrical Engineering 01/2014; · 0.93 Impact Factor
Conference Paper: Efficient Approximation Algorithm for Data Retrieval with Conflicts in Wireless Networks
ABSTRACT: Given a set of data items broadcast on multiple parallel channels, where each channel has the same broadcast pattern over a time period, and a set of data items requested by a client, the data retrieval problem is to find a sequence of channel accesses that retrieves the requested data items from the channels such that the total access latency is minimized, where both a channel access (to retrieve a data item) and a channel switch are assumed to take a single time slot. As an important problem of information retrieval in wireless networks, this problem arises in many applications such as e-commerce and ubiquitous data sharing, and is known to involve two conflicts: requested data items are broadcast at the same time slots or at adjacent time slots in different channels. Although existing studies focus on this problem with one conflict, there is little work on this problem with two conflicts. This paper therefore proposes efficient algorithms from two views: single antenna and multiple antennae. Our algorithm adopts a novel approach that converts the wireless data broadcast system to a DAG and applies set cover to solve this problem. Experiments show that this is currently the most efficient algorithm for this problem with two conflicts.
Proceedings of International Conference on Advances in Mobile Computing & Multimedia; 12/2013
Conference Paper: Improved approximation algorithms for constrained fault-tolerant resource allocation
ABSTRACT: In the Constrained Fault-Tolerant Resource Allocation (FTRA) problem, we are given a set of sites containing facilities as resources and a set of clients accessing these resources. Each site i can open at most R_i facilities with opening cost f_i. Each client j requires an allocation of r_j open facilities and connecting j to any facility at site i incurs a connection cost c_ij. The goal is to minimize the total cost of this resource allocation scenario. FTRA generalizes the Unconstrained Fault-Tolerant Resource Allocation (FTRA∞) [10] and the classical Fault-Tolerant Facility Location (FTFL) [7] problems: for every site i, FTRA∞ does not have the constraint R_i, whereas FTFL sets R_i=1. These problems are said to be uniform if all r_j's are the same, and general otherwise. For the general metric FTRA, we first give an LP-rounding algorithm achieving an approximation ratio of 4. Then we show the problem reduces to FTFL, implying the ratio of 1.7245 from [2]. For the uniform FTRA, we provide a 1.52-approximation primal-dual algorithm in O(n^4) time, where n is the total number of sites and clients.
Proceedings of the 19th international conference on Fundamentals of Computation Theory; 08/2013
Article: Improved Approximation Algorithms for Computing k Disjoint Paths Subject to Two Constraints
ABSTRACT: For a given graph $G$ with positive integral cost and delay on edges, distinct vertices $s$ and $t$, cost bound $C\in Z^{+}$ and delay bound $D\in Z^{+}$, the $k$ bi-constraint path ($k$-BCP) problem is to compute $k$ disjoint $st$-paths subject to $C$ and $D$. This problem is known to be NP-hard, even when $k=1$ [4]. This paper first gives a simple approximation algorithm with factor-$(2,2)$, i.e. the algorithm computes a solution with delay and cost bounded by $2D$ and $2C$ respectively. Later, a novel improved approximation algorithm with ratio $(1+\beta, \max\{2, 1+\ln\frac{1}{\beta}\})$ is developed by constructing interesting auxiliary graphs and employing the cycle cancellation method. As a consequence, we can obtain a factor-$(1.369, 2)$ approximation algorithm by setting $1+\ln\frac{1}{\beta}=2$ and a factor-$(1.567, 1.567)$ algorithm by setting $1+\beta=1+\ln\frac{1}{\beta}$. Besides, when $\beta=0$, by slightly modifying our algorithm, an approximation algorithm with ratio $(1, (1+\epsilon)(\ln n+\ln\frac{1}{\epsilon}))$, i.e. an algorithm with only a single factor ratio $O(\ln n)$ on cost, can be immediately obtained by setting the delay of each edge $e$ to $\lfloor \frac{d(e)}{\frac{\epsilon D}{n}}\rfloor$ for a given fixed $\epsilon>0$. To the best of our knowledge, this is the first nontrivial approximation algorithm for the $k$-BCP problem which strictly obeys the delay constraint. Our developed algorithms can be directly used to solve some related problems, in particular, the $k$-disjoint restricted shortest path problem ($k$-RSP) [10], resulting in the same ratio $(1+\beta, \max\{2, 1+\ln\frac{1}{\beta}\})$, which improves the currently best result of ratio $(2, 2)$ in [6].
Journal of Combinatorial Optimization 01/2013; · 0.59 Impact Factor
ABSTRACT: For a given undirected (edge) weighted graph G = (V, E), a terminal set S ⊆ V and a root r ∈ S, the rooted k-vertex connected minimum Steiner network (kVSMN_r) problem is to construct a minimum-cost subgraph of G such that each terminal in S \ {r} is k-vertex connected to r. As an important problem in survivable network design, the kVSMN_r problem is known to be NP-hard even when k = 1 [14]. For k = 3 this paper presents a simple combinatorial eight-approximation algorithm, improving the best known ratio of 14 by Nutov [20]. Our algorithm constructs an approximate 3VSMN_r by augmenting a two-vertex connected counterpart with additional edges whose cost is bounded relative to the optimum. We prove that the total cost of the added edges is at most six times the optimum by showing that the edges in a 3VSMN_r compose a subgraph containing our solution in such a way that each edge appears in the subgraph at most six times.
IEEE Transactions on Computers 01/2013; 62(9):1684–1693. · 1.38 Impact Factor
ABSTRACT: In the emerging environment of the Internet of things (IoT), through the connection of billions of radio frequency identification (RFID) tags and sensors to the Internet, applications will generate an unprecedented number of transactions and amount of data that require novel approaches in RFID data stream processing and management. Unfortunately, it is difficult to maintain a distributed model without a shared directory or structured index. In this paper, we propose a fully distributed model for federated RFID data streams. This model combines two techniques, namely, tilted time frame and histogram, to represent the patterns of object flows. Our model is efficient in space and can be stored in main memory. The model is built on top of an unstructured P2P overlay. To reduce the overhead of distributed data acquisition, we further propose several algorithms that use a statistically minimum number of network calls to maintain the model. The scalability and efficiency of the proposed model are demonstrated through an extensive set of experiments.
IEEE Transactions on Parallel and Distributed Systems 01/2013; 24(10):2036–2045. · 1.80 Impact Factor
ABSTRACT: Data publishing based on hypergraphs is becoming increasingly popular due to its power in representing multi-relations among objects. However, security issues have been little studied on this subject, and most recent work focuses only on the protection of relational data or graphs. As a major privacy breach, identity disclosure reveals the identification of entities with certain background knowledge known by an adversary. In this paper, we first introduce a novel background knowledge attack model based on the property of hyperedge ranks, and formalize the rank-based hypergraph anonymization problem. We then propose a complete solution in a two-step framework: rank anonymization and hypergraph reconstruction. We also take hypergraph clustering (known as community detection) as data utility into consideration, and discuss two metrics to quantify information loss incurred in the perturbation. Our approaches are effective in terms of efficacy, privacy, and utility. The algorithms run in near-quadratic time on hypergraph size, and protect data from rank attacks with almost the same utility preserved. The performance of the methods has also been validated by extensive experiments on real-world datasets. Our rank-based attack model and algorithms for rank anonymization and hypergraph reconstruction are, to the best of our knowledge, the first systematic study of privacy preservation for hypergraph-based data publishing.
IEEE Transactions on Information Forensics and Security 01/2013; 8(8):1384–1396. · 1.90 Impact Factor
Article: On Finding Min-Min Disjoint Paths
ABSTRACT: The Min-Min problem of finding a disjoint-path pair with the length of the shorter path minimized is known to be NP-hard and admits no K-approximation for any K>1 in the general case (Xu et al. in IEEE/ACM Trans. Netw. 14:147–158, 2006). In this paper, we first show that Bhatia et al.’s NP-hardness proof (Bhatia et al. in J. Comb. Optim. 12:83–96, 2006), a claimed correction to Xu et al.’s proof (Xu et al. in IEEE/ACM Trans. Netw. 14:147–158, 2006), for the edge-disjoint Min-Min problem in general undirected graphs is incorrect, by giving a counterexample that is an unsatisfiable 3SAT instance but is classified as a satisfiable 3SAT instance in the proof of Bhatia et al. (J. Comb. Optim. 12:83–96, 2006). We then give a correct proof of NP-hardness of this problem in undirected graphs. Finally we give a polynomial-time algorithm for the vertex-disjoint Min-Min problem in planar graphs by showing that the vertex-disjoint Min-Min problem is polynomially solvable in an st-planar graph G=(V,E) whose corresponding auxiliary graph G(V,E∪{e(st)}) can be embedded into a plane, and that a planar graph can be decomposed into several st-planar graphs whose Min-Min paths collectively contain a Min-Min disjoint-path pair between s and t in the original graph G. To the best of our knowledge, these are the first polynomial algorithms for the Min-Min problems in planar graphs.
Algorithmica 01/2013; 66(3). · 0.49 Impact Factor
ABSTRACT: Randomization methods widely applied for privacy-preserving data mining are generally subject to reconstruction attack, linkage attack, and semantic-related attacks. A probabilistic anonymity definition has been proposed in [1] to defend against the linkage attack in which the attacker links the same randomized record to all of the original records. In this paper we name this type of attack the Multiple (original records) to One (randomized record) attack, and focus on another attack that has not been researched before, i.e. the One (original record) to Multiple (randomized records) attack. The latter is different from the former in that it does not require the attacker to know the distribution and all values of quasi-identifiers in original records, and thus is easier for the attacker to launch. To defend against this attack we propose a novel probabilistic anonymity concept different from [1]. We achieve this anonymity goal on a hybrid model combining random projection and random noise addition. We also analyze the security properties of this model against the other common types of attacks. Compared with existing work in randomization, k-anonymity and differential privacy, our work achieves the holistic aim of higher security, higher efficiency and higher data utility, and demonstrates very promising applications in large-scale and high-dimensional data mining in clouds.
e-Business Engineering (ICEBE), 2013 IEEE 10th International Conference on; 01/2013
ABSTRACT: Watermarking, as a powerful technique for copyright protection, content verification, covert communication and so on, has been studied for years, and is drawing more and more attention recently. There are many situations in which embedding multiple watermarks in an image is desired. This paper proposes an effective approach to embed dual watermarks by extending the single watermarking algorithms in Xie and Shen (2005) [1] and Xie and Shen (2006) [2] for numerical and logo watermarking, respectively. Experimental results show that the resulting dual watermarking algorithms have a significantly higher PSNR than existing dual watermarking algorithms and also retain the same robustness as and higher sensitivity than the original single watermarking algorithms on which they are based.
Computers & Electrical Engineering 09/2012; 38(5):1310–1324. · 0.93 Impact Factor
ABSTRACT: In the Constrained Fault-Tolerant Resource Allocation (FTRA) problem, we are given a set of sites containing facilities as resources, and a set of clients accessing these resources. Specifically, each site i is allowed to open at most R_i facilities with cost f_i for each opened facility. Each client j requires an allocation of r_j open facilities and connecting j to any facility at site i incurs a connection cost c_ij. The goal is to minimize the total cost of this resource allocation scenario. FTRA generalizes the Unconstrained Fault-Tolerant Resource Allocation (FTRA_{\infty}) [18] and the classical Fault-Tolerant Facility Location (FTFL) [13] problems: for every site i, FTRA_{\infty} does not have the constraint R_i, whereas FTFL sets R_i=1. These problems are said to be uniform if all r_j's are the same, and general otherwise. For the general metric FTRA, we first give an LP-rounding algorithm achieving the approximation ratio of 4. Then we show the problem reduces to FTFL, implying the ratio of 1.7245 from [3]. For the uniform FTRA, we provide a 1.52-approximation primal-dual algorithm in O(n^4) time, where n is the total number of sites and clients. We also consider the Constrained Fault-Tolerant k-Resource Allocation (k-FTRA) problem where additionally the total number of facilities that can be opened across all sites is bounded by k. For the uniform k-FTRA, we give the first constant-factor approximation algorithm with a factor of 4. Note that the above results carry over to FTRA_{\infty} and k-FTRA_{\infty}.
08/2012
ABSTRACT: We initiate the study of the Reliable Resource Allocation (RRA) problem. In this problem, we are given a set of sites equipped with an unbounded number of facilities as resources. Each facility has an opening cost and an estimated reliability. There is also a set of clients to be allocated to facilities with corresponding connection costs. Each client has a reliability requirement (RR) for accessing resources. The objective is to open a subset of facilities from sites to satisfy all clients' RRs at a minimum total cost. The Unconstrained Fault-Tolerant Resource Allocation (UFTRA) problem studied in (Liao & Shen 2011) is a special case of RRA. In this paper, we present two equivalent primal-dual algorithms for the RRA problem, where the second one is an acceleration of the first and runs in quasi-linear time. If all clients have the same RR above the threshold that a single facility can provide, our analysis of the algorithm yields an approximation factor of 2+2√2 and later a reduced ratio of 3.722 using a factor-revealing program. The analysis further elaborates and generalizes the generic inverse dual fitting technique introduced in (Xu & Shen 2009). As a byproduct, we also formalize this technique for the classical minimum set cover problem.
Proceedings of the Eighteenth Computing: The Australasian Theory Symposium - Volume 128; 01/2012
ABSTRACT: The min–min problem of finding a disjoint path pair with the length of the shorter path minimized is known to be NP-complete (Xu et al., 2006) [1]. In this paper, we prove that in planar digraphs the edge-disjoint min–min problem remains NP-complete and admits no K-approximation for any K>1 unless P=NP. As a byproduct, we show that this problem remains NP-complete even when all edge costs are equal (i.e., strongly NP-complete). To our knowledge, this is the first NP-completeness proof for the edge-disjoint min–min problem in planar digraphs.
Theoretical Computer Science 01/2012; 432:58–63. · 0.49 Impact Factor
Conference Paper: Incorporating Manifold Ranking with Active Learning in Relevance Feedback for Image Retrieval
ABSTRACT: Combining manifold ranking with active learning (MRAL for short) is one popular and successful technique for relevance feedback in content-based image retrieval (CBIR). Despite the success, conventional MRAL has two main drawbacks. First, the performance of manifold ranking is very sensitive to the scale parameter used for calculating the Laplacian matrix. Second, conventional MRAL does not take into account the redundancy among examples and thus could select multiple examples that are similar to each other. In this work, a novel MRAL framework is presented to address these drawbacks. Concretely, we first propose a self-tuning manifold ranking algorithm that can adaptively calculate the Laplacian matrix via a local scaling mechanism, and then develop a hybrid active learning algorithm by integrating three well-known selective sampling criteria, which is able to effectively and efficiently identify the most informative and diversified examples for the user to label. Experiments on 10,000 Corel images show that the proposed method is significantly more effective than some existing approaches.
Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2012 13th International Conference on; 01/2012
Conference Paper: A Clustering Algorithm Based on Density-Grid for Stream Data
ABSTRACT: Many real applications, such as network traffic monitoring, intrusion detection, satellite remote sensing, and electronic business, generate data in the form of a stream arriving continuously at high speed. Clustering is an important data analysis tool for knowledge discovery. Compared with traditional clustering, clustering stream data is an important and challenging problem which has attracted many researchers. Clustering stream data faces two main challenges. First, as the data arrives continuously at a high rate and the computer storage capacity is limited, raw data can only be scanned in one pass. Second, stream data is always changing with time, so viewing a data stream as a set of static data can deteriorate the clustering quality. In fact, users are more concerned with the evolving behaviors of clusters, which can help people make correct decisions. This paper proposes a density-grid-based clustering algorithm, PKS-Stream-I, for stream data. It is an optimization of PKS-Stream in density detection period selection and in sporadic grid detection and removal. Empirical results show that the proposed method yields better performance.
Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2012 13th International Conference on; 01/2012
Conference Paper: Efficient Approximation Algorithms for Computing k-Disjoint Minimum Cost Paths with Delay Constraint
ABSTRACT: For a given graph G with distinct vertices s, t and a given delay constraint D ∈ R+, the k-disjoint restricted shortest path (kRSP) problem of computing k disjoint minimum-cost st-paths with total delay restrained by D is known to be NP-hard. Bifactor approximation algorithms have been developed for its special case when k = 2, while no approximation algorithm with constant single factor or bifactor ratio has been developed for general k. This paper first presents a (k, (1 + ε)H(k))-approximation algorithm for the kRSP problem by extending Orda's factor-(1.5, 1.5) approximation algorithm [9]. Second, this paper gives a novel linear programming (LP) formulation for the kRSP problem. Based on the LP rounding technique, this paper rounds an optimal solution of this formulation and obtains an approximation algorithm within a bifactor ratio of (2, 2). To the best of our knowledge, it is the first approximation algorithm with constant bifactor ratio for the kRSP problem. Our results can serve applications in networks which require quality of service and robustness simultaneously, and also have broad applications in the construction of survivable networks and fault-tolerant systems.
Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2012 13th International Conference on; 01/2012
Conference Paper: On Robust Multicast in Multichannel Multiradio Wireless Mesh Networks
ABSTRACT: The multicast problem in multichannel multiradio wireless mesh networks has received much attention recently. Most recent studies on this problem focus on improving the network throughput. However, many real-world applications require routing algorithms to achieve low delay and low loss. In this paper, we tackle the problem of constructing a robust minimum-cost multicast tree that tolerates link interference. To save bandwidth resources and alleviate interference in the communication, we propose a robust multicast algorithm for multichannel multiradio wireless mesh networks. Our experimental results show that our algorithm achieves better network throughput and end-to-end delay than previous studies.
Parallel and Distributed Computing, Applications and Technologies (PDCAT), 2012 13th International Conference on; 01/2012
ABSTRACT: Random Projection (RP) has attracted great interest in the research community of privacy-preserving data mining, due to its high efficiency and utility, e.g., keeping the Euclidean distances among the data points. It was shown in [33] that, if the original data set composed of m attributes is multiplied by a mixing matrix of size k × m (m > k) which is random and orthogonal on expectation, then the k series of perturbed data can be released for mining purposes. To our knowledge, little work has been done on reconstructing the original data, given the data perturbed by RP and some necessary prior knowledge, to recover sensitive information. In this paper, we choose several typical scenarios in data mining with different assumptions on prior knowledge. For the cases that an attacker has full or zero knowledge of the mixing matrix R, respectively, we propose reconstruction methods based on Underdetermined Independent Component Analysis (UICA) if the attributes of the original data are mutually independent and sparse, and propose reconstruction methods based on Maximum A Posteriori (MAP) if the attributes of the original data are correlated and nonsparse. Simulation results show that our reconstructions achieve high recovery rates, and outperform the reconstructions based on Principal Component Analysis (PCA). Successful reconstructions essentially mean the leakage of privacy, so our work identifies the possible risks of RP when it is used for data perturbation.
IEEE Transactions on Computers 01/2012; 61:101–117. · 1.38 Impact Factor
Publication Stats
1k  Citations  
61.20  Total Impact Points  
Institutions

2006–2014

University of Adelaide
 School of Computer Science
Tarndarnya, South Australia, Australia 
Manchester Metropolitan University
Manchester, England, United Kingdom


2013

Sun YatSen University
Shengcheng, Guangdong, China


2008–2013

Beijing Jiaotong University
 • School of Computer and Information Technology
 • Department of Computer Science
Peping, Beijing, China


2006–2008

University of Science and Technology of China
 School of Computer Science and Technology
Luchow, Anhui Sheng, China


2007

University of Texas at Dallas
 Department of Computer Science
Dallas, TX, United States


2001–2007

Japan Advanced Institute of Science and Technology
 School of Information Science
KMQ, Ishikawa, Japan 
Georgia State University
 Department of Computer Science
Atlanta, Georgia, United States


2003–2006

Fudan University
 School of Computer Science
Shanghai, Shanghai Shi, China 
The Hong Kong Polytechnic University
 Department of Computing
Hong Kong, Hong Kong


2005

Texas A&M University
 Department of Computer Science and Engineering
College Station, TX, United States


1995–2001

Griffith University
 School of Information and Communication Technology (ICT)
Southport, Queensland, Australia 
Australian National University
 Research School of Computer Science
Canberra, Australian Capital Territory, Australia


2000

University of Dayton
 Department of Computer Science
Dayton, Ohio, United States
