Hong Shen

Sun Yat-Sen University, Guangzhou, Guangdong, China

Publications (300) · 100.11 Total impact

  • ABSTRACT: Given a set of multiple channels, a set of multiple requests, where each request contains multiple requested data items and a client equipped with multiple antennae, the multi-antenna-based multirequest data retrieval problem (DRMR-MA) is to find a data retrieval sequence for downloading all data items of the requests allocated to each antenna, such that the maximum access latency of all antennae is minimized. Most existing approaches for the data retrieval problem focus on either single antenna or single request and are hence not directly applicable to DRMR-MA for retrieving multiple requests. This paper proposes two data retrieval algorithms that adopt two different grouping schemes to solve DRMR-MA so that the requests can be suitably allocated to each antenna. To find the data retrieval sequence of each request efficiently, we present a data retrieval scheme that converts a wireless data broadcast system to a special tree. Experimental results show that the proposed scheme is more efficient than other existing schemes. Copyright © 2014 John Wiley & Sons, Ltd.
    International Journal of Communication Systems 12/2014; · 1.11 Impact Factor
  • Wenhao Shu, Hong Shen
    ABSTRACT: Feature selection plays a vital role in many areas of pattern recognition and data mining. The effective computation of feature selection is important for improving the classification performance. In rough set theory, many feature selection algorithms have been proposed to process static incomplete data. However, feature values in an incomplete data set may vary dynamically in real-world applications. For such dynamic incomplete data, a classic (non-incremental) approach of feature selection is usually computationally time-consuming. To overcome this disadvantage, we propose an incremental approach for feature selection, which can accelerate the feature selection process in dynamic incomplete data. We firstly employ an incremental manner to compute the new positive region when feature values with respect to an object set vary dynamically. Based on the calculated positive region, two efficient incremental feature selection algorithms are developed respectively for single object and multiple objects with varying feature values. Then we conduct a series of experiments with 12 UCI real data sets to evaluate the efficiency and effectiveness of our proposed algorithms. The experimental results show that the proposed algorithms compare favorably with that of applying the existing non-incremental methods.
    Pattern Recognition 12/2014; 47(12):3890–3906. · 2.58 Impact Factor
  • Hui Tian, Binze Zhong, Hong Shen
    ABSTRACT: Traffic matrix (TM) describes the traffic volumes traversing a network from the input nodes to the output nodes over a measured period. Such a TM contains very useful information for network managers, traffic engineers and users. However, TM is hard to be obtained and analyzed due to its large size, especially for large-scale networks. In this paper, we present a new method based on diffusion wavelets for analyzing the traffic matrix. It is shown that this method can conduct efficient multi-resolution analysis (MRA) on TM. We compare the analysis results by using different diffusion operators. Through reconstructing the original TM from the diffused traffic on a particular level, we show the high efficiency of this MRA tool based on these operators. We then develop an anomaly detection method based on the analysis results and explore the possibilities of other potential applications.
    Computers & Electrical Engineering 08/2014; · 0.99 Impact Factor
  • Hong Shen, Yingpeng Sang, Yidong Li
    Journal of Interconnection Networks 05/2014; 14(03).
  • Wenhao Shu, Hong Shen
    ABSTRACT: In rough set theory, attribute reduction is a challenging problem in the applications in which data with numbers of attributes available. Moreover, due to dynamic characteristics of data collection in decision systems, attribute reduction will change dynamically as attribute set in decision systems varies over time. How to carry out updating attribute reduction by utilizing previous information is an important task that can help to improve the efficiency of knowledge discovery. In view of that attribute reduction algorithms in incomplete decision systems with the variation of attribute set have not yet been discussed so far. This paper focuses on positive region-based attribute reduction algorithm to solve the attribute reduction problem efficiently in the incomplete decision systems with dynamically varying attribute set. We first introduce an incremental manner to calculate the new positive region and tolerance classes. Consequently, based on the calculated positive region and tolerance classes, the corresponding attribute reduction algorithms on how to compute new attribute reduct are put forward respectively when an attribute set is added into and deleted from the incomplete decision systems. Finally, numerical experiments conducted on different data sets from UCI validate the effectiveness and efficiency of the proposed algorithms in incomplete decision systems with the variation of attribute set.
    International Journal of Approximate Reasoning 03/2014; 55(3):867–884. · 1.98 Impact Factor
  • Computer Science and Information Systems 01/2014; 11(1):309-320. · 0.58 Impact Factor
  • Ping He, Hong Shen, Hui Tian
    ABSTRACT: Given a set of data items broadcasting at multiple parallel channels, where each channel has the same broadcast pattern over a time period, and a set of client's requested data items, the data retrieval problem requires to find a sequence of channel access to retrieve the requested data items among the channels such that the total access latency is minimized, where both channel access (to retrieve a data item) and channel switch are assumed to take a single time slot. As an important problem of information retrieval in wireless networks, this problem arises in many applications such as e-commerce and ubiquitous data sharing, and is known two conflicts: requested data items are broadcast at same time slots or adjacent time slots in different channels. Although existing studies focus on this problem with one conflict, there is little work on this problem with two conflicts. So this paper proposes efficient algorithms from two views: single antenna and multiple antennae. Our algorithm adopts a novel approach that wireless data broadcast system is converted to DAG, and applies set cover to solve this problem. Through Experiments, this result presents currently the most efficient algorithm for this problem with two conflicts.
    Proceedings of International Conference on Advances in Mobile Computing & Multimedia; 12/2013
  • ABSTRACT: In the emerging environment of the Internet of things (IoT), through the connection of billions of radio frequency identification (RFID) tags and sensors to the Internet, applications will generate an unprecedented number of transactions and amount of data that require novel approaches in RFID data stream processing and management. Unfortunately, it is difficult to maintain a distributed model without a shared directory or structured index. In this paper, we propose a fully distributed model for federated RFID data streams. This model combines two techniques, namely, tilted time frame and histogram to represent the patterns of object flows. Our model is efficient in space and can be stored in main memory. The model is built on top of an unstructured P2P overlay. To reduce the overhead of distributed data acquisition, we further propose several algorithms that use a statistically minimum number of network calls to maintain the model. The scalability and efficiency of the proposed model are demonstrated through an extensive set of experiments.
    IEEE Transactions on Parallel and Distributed Systems 10/2013; 24(10):2036-2045. · 2.17 Impact Factor
  • Hong Shen, Longkun Guo
    ABSTRACT: For a given undirected (edge) weighted graph G = (V, E), a terminal set S ⊆ V and a root r ∈ S, the rooted k-vertex connected minimum Steiner network (kVSMNr) problem requires to construct a minimum-cost subgraph of G such that each terminal in S \ {r} is k-vertex connected to r. As an important problem in survivable network design, the kVSMNr problem is known to be NP-hard even when k = 1 [14]. For k = 3 this paper presents a simple combinatorial eight-approximation algorithm, improving the known best ratio 14 of Nutov [20]. Our algorithm constructs an approximate 3VSMNr through augmenting a two-vertex connected counterpart with additional edges of bounded cost to the optimal. We prove that the total cost of the added edges is at most six times of the optimal by showing that the edges in a 3VSMNr compose a subgraph containing our solution in such a way that each edge appears in the subgraph at most six times.
    IEEE Transactions on Computers 09/2013; 62(9):1684-1693. · 1.47 Impact Factor
  • Kewen Liao, Hong Shen, Longkun Guo
    ABSTRACT: In Constrained Fault-Tolerant Resource Allocation (FTRA) problem, we are given a set of sites containing facilities as resources and a set of clients accessing these resources. Each site i can open at most R_i facilities with opening cost f_i. Each client j requires an allocation of r_j open facilities and connecting j to any facility at site i incurs a connection cost c_ij. The goal is to minimize the total cost of this resource allocation scenario. FTRA generalizes the Unconstrained Fault-Tolerant Resource Allocation (FTRA∞) [10] and the classical Fault-Tolerant Facility Location (FTFL) [7] problems: for every site i, FTRA∞ does not have the constraint R_i, whereas FTFL sets R_i=1. These problems are said to be uniform if all r_j's are the same, and general otherwise. For the general metric FTRA, we first give an LP-rounding algorithm achieving an approximation ratio of 4. Then we show the problem reduces to FTFL, implying the ratio of 1.7245 from [2]. For the uniform FTRA, we provide a 1.52-approximation primal-dual algorithm in O(n^4) time, where n is the total number of sites and clients.
    Proceedings of the 19th international conference on Fundamentals of Computation Theory; 08/2013
  • Yidong Li, Hong Shen
    ABSTRACT: Data publishing based on hypergraphs is becoming increasingly popular due to its power in representing multirelations among objects. However, security issues have been little studied on this subject, while most recent work only focuses on the protection of relational data or graphs. As a major privacy breach, identity disclosure reveals the identification of entities with certain background knowledge known by an adversary. In this paper, we first introduce a novel background knowledge attack model based on the property of hyperedge ranks, and formalize the rank-based hypergraph anonymization problem. We then propose a complete solution in a two-step framework: rank anonymization and hypergraph reconstruction. We also take hypergraph clustering (known as community detection) as data utility into consideration, and discuss two metrics to quantify information loss incurred in the perturbation. Our approaches are effective in terms of efficacy, privacy, and utility. The algorithms run in near-quadratic time on hypergraph size, and protect data from rank attacks with almost the same utility preserved. The performances of the methods have been validated by extensive experiments on real-world datasets as well. Our rank-based attack model and algorithms for rank anonymization and hypergraph reconstruction are, to our best knowledge, the first systematic study to privacy preserving for hypergraph-based data publishing.
    IEEE Transactions on Information Forensics and Security 08/2013; 8(8):1384-1396. · 2.07 Impact Factor
  • ABSTRACT: Compressive sensing based in-network compression is an efficient technique to reduce communication cost and accurately recover sensory data at the sink. Existing compressive sensing based data gathering methods require a large number of sensors to participate in each measurement gathering, and it leads to waste a lot of energy. In this paper, we present an energy efficient clustering routing data gathering scheme for large-scale wireless sensor networks. The main challenges of our scheme are how to obtain the optimal number of clusters and how to keep all cluster heads uniformly distributed. To solve the above problems, we first formulate an energy consumption model to obtain the optimal number of clusters. Second, we design an efficient deterministic dynamic clustering scheme to guarantee all cluster heads uniformly distributed approximately. With extensive simulation, we demonstrate that our scheme not only prolongs nearly 2x network's lifetime compared with the state of the art compressive sensing based data gathering schemes, but also makes the network energy consumption very uniformly.
    Computers & Electrical Engineering 08/2013; 39(6):1935-1946. · 0.99 Impact Factor
  • Longkun Guo, Hong Shen, Kewen Liao
    ABSTRACT: For a given graph $G$ with positive integral cost and delay on edges, distinct vertices $s$ and $t$, cost bound $C\in Z^{+}$ and delay bound $D\in Z^{+}$, the $k$ bi-constraint path ($k$BCP) problem is to compute $k$ disjoint $st$-paths subject to $C$ and $D$. This problem is known NP-hard, even when $k=1$ [4]. This paper first gives a simple approximation algorithm with factor-$(2,2)$, i.e. the algorithm computes a solution with delay and cost bounded by $2*D$ and $2*C$ respectively. Later, a novel improved approximation algorithm with ratio $(1+\beta, \max\{2, 1+\ln\frac{1}{\beta}\})$ is developed by constructing interesting auxiliary graphs and employing the cycle cancelation method. As a consequence, we can obtain a factor-$(1.369, 2)$ approximation algorithm by setting $1+\ln\frac{1}{\beta}=2$ and a factor-$(1.567, 1.567)$ algorithm by setting $1+\beta=1+\ln\frac{1}{\beta}$. Besides, when $\beta=0$, by slightly modifying our algorithm, an approximation algorithm with ratio $(1, (1+\epsilon)(\ln n+\ln\frac{1}{\epsilon}))$, i.e. an algorithm with only a single factor ratio $O(\ln n)$ on cost, can be immediately obtained by setting the delay of each edge $e$ to $\lfloor \frac{d(e)}{\frac{\epsilon D}{n}}\rfloor $ for a given fixed $\epsilon>0$. To the best of our knowledge, this is the first non-trivial approximation algorithm for the $k$BCP problem which strictly obeys the delay constraint. Our developed algorithms can be directly used to solve some related problems, in particular, the k-disjoint restricted shortest path problem ($k$RSP) [10], resulting in the same ratio $(1+\beta, \max\{2, 1+\ln\frac{1}{\beta}\})$, which improves currently the best result of ratio $(2, 2)$ in [6].
    Journal of Combinatorial Optimization 01/2013; · 1.04 Impact Factor
  • Longkun Guo, Hong Shen
    ABSTRACT: The Min-Min problem of finding a disjoint-path pair with the length of the shorter path minimized is known to be NP-hard and admits no K-approximation for any K>1 in the general case (Xu et al. in IEEE/ACM Trans. Netw. 14:147–158, 2006). In this paper, we first show that Bhatia et al.’s NP-hardness proof (Bhatia et al. in J. Comb. Optim. 12:83–96, 2006), a claim of correction to Xu et al.’s proof (Xu et al. in IEEE/ACM Trans. Netw. 14:147–158, 2006), for the edge-disjoint Min-Min problem in the general undirected graphs is incorrect by giving a counter example that is an unsatisfiable 3SAT instance but classified as a satisfiable 3SAT instance in the proof of Bhatia et al. (J. Comb. Optim. 12:83–96, 2006). We then gave a correct proof of NP-hardness of this problem in undirected graphs. Finally we give a polynomial-time algorithm for the vertex disjoint Min-Min problem in planar graphs by showing that the vertex disjoint Min-Min problem is polynomially solvable in st-planar graph G=(V,E) whose corresponding auxiliary graph G(V,E∪{e(st)}) can be embedded into a plane, and a planar graph can be decomposed into several st-planar graphs whose Min-Min paths collectively contain a Min-Min disjoint-path pair between s and t in the original graph G. To the best of our knowledge, these are the first polynomial algorithms for the Min-Min problems in planar graphs.
    Algorithmica 01/2013; 66(3). · 0.57 Impact Factor
  • ABSTRACT: Randomization methods widely applied for privacy-preserving data mining are generally subject to reconstruction attack, linkage attack, and semantic-related attacks. A probabilistic anonymity definition has been proposed in [1] to defend against the linkage attack in which the attacker links the same randomized record to all of the original records. In this paper we name this type of attack as Multiple (original records) to One (randomized record) attack, while focus on another attack that has not been researched before, i.e. One (original record) to Multiple (randomized records) attack. The latter is different from the former in that it does not require the attacker to know the distribution and all values of quasi-identifiers in original records, and thus is easier to be launched by the attacker. To defend against this attack we propose a novel probabilistic anonymity concept different from [1]. We achieve this anonymity goal on a hybrid model combining random projection and random noise addition. We also analyze the security properties of this model against the other common types of attacks. Compared with existing work in randomization, k-anonymity and differential privacy, our work achieves the holistic aim of higher security, higher efficiency and higher data utility, and demonstrates very promising applications in large-scale and high-dimensional data mining in clouds.
    e-Business Engineering (ICEBE), 2013 IEEE 10th International Conference on; 01/2013
  • Wenhao Shu, Hong Shen
    ABSTRACT: Efficient attribute reduction in large-scale incomplete decision systems is a challenging problem. The computation of tolerance classes induced by the condition attributes in the incomplete decision system is a key part among all existing attribute reduction algorithms. Moreover, updating attribute reduction for dynamically-increasing decision systems has attracted much attention, in view of that incremental attribute reduction algorithms in a dynamic incomplete decision system have not yet been sufficiently discussed so far. In this paper, we first introduce a simpler way of computing tolerance classes than the classical method. Then we present an incremental attribute reduction algorithm to compute an attribute reduct for a dynamically-increasing incomplete decision system. Compared with the non-incremental algorithms, our incremental attribute reduction algorithm can compute a new attribute reduct in much shorter time. Experiments on four data sets downloaded from UCI show that the feasibility and effectiveness of the proposed incremental algorithm.
    Fuzzy Systems (FUZZ), 2013 IEEE International Conference on; 01/2013
  • Kewen Liao, Hong Shen, Longkun Guo
    ABSTRACT: In the Constrained Fault-Tolerant Resource Allocation (FTRA) problem, we are given a set of sites containing facilities as resources, and a set of clients accessing these resources. Specifically, each site i is allowed to open at most R_i facilities with cost f_i for each opened facility. Each client j requires an allocation of r_j open facilities and connecting j to any facility at site i incurs a connection cost c_ij. The goal is to minimize the total cost of this resource allocation scenario. FTRA generalizes the Unconstrained Fault-Tolerant Resource Allocation (FTRA_{\infty}) [18] and the classical Fault-Tolerant Facility Location (FTFL) [13] problems: for every site i, FTRA_{\infty} does not have the constraint R_i, whereas FTFL sets R_i=1. These problems are said to be uniform if all r_j's are the same, and general otherwise. For the general metric FTRA, we first give an LP-rounding algorithm achieving the approximation ratio of 4. Then we show the problem reduces to FTFL, implying the ratio of 1.7245 from [3]. For the uniform FTRA, we provide a 1.52-approximation primal-dual algorithm in O(n^4) time, where n is the total number of sites and clients. We also consider the Constrained Fault-Tolerant k-Resource Allocation (k-FTRA) problem where additionally the total number of facilities can be opened across all sites is bounded by k. For the uniform k-FTRA, we give the first constant-factor approximation algorithm with a factor of 4. Note that the above results carry over to FTRA_{\infty} and k-FTRA_{\infty}.
    08/2012;
  • Hong Shen, Longkun Guo
    ABSTRACT: For an undirected and weighted graph G=(V,E) and a terminal set S of V, the 2-connected Steiner minimal network (SMN) problem requires to compute a minimum-weight subgraph of G in which all terminals are 2-connected to each other. This problem has important applications in design of survivable networks and fault-tolerant communication, and is known MAXSNP-hard, a harder subclass of NP-hard problems for which no polynomial-time approximation scheme (PTAS) is known. This paper presents an efficient algorithm of O(|V|^2|S|^3) time for computing a 2-vertex connected Steiner network (2VSN) whose weight is bounded by 2 times of the optimal solution 2VSMN. It compares favorably with the currently known 2-approximation solution to the 2VSMN problem based on that to the survivable network design problem, with a time complexity reduction of O(|V|^5|E|^7) for strongly polynomial time and O(|V|^5g) for weakly polynomial time where g is determined by the sizes of input. Our algorithm applies a novel greedy approach to generate a 2VSN through progressive improvement on a set of vertex-disjoint shortest path pairs incident with each terminal of S. The algorithm can be directly deployed to solve the 2-edge connected SMN problem at the same approximation ratio within time O(|V|^2|S|^2).
    IEEE Transactions on Computers 07/2012; · 1.47 Impact Factor
  • Longkun Guo, Hong Shen
    ABSTRACT: The min–min problem of finding a disjoint path pair with the length of the shorter path minimized is known to be NP-complete (Xu et al., 2006) [1]. In this paper, we prove that in planar digraphs the edge-disjoint min–min problem remains NP-complete and admits no K-approximation for any K>1 unless P=NP. As a by-product, we show that this problem remains NP-complete even when all edge costs are equal (i.e., strongly NP-complete). To our knowledge, this is the first NP-completeness proof for the edge-disjoint min–min problem in planar digraphs.
    Theoretical Computer Science 05/2012; 432:58–63. · 0.52 Impact Factor
  • Kewen Liao, Hong Shen
    ABSTRACT: We initiate the study of the Reliable Resource Allocation (RRA) problem. In this problem, we are given a set of sites equipped with an unbounded number of facilities as resources. Each facility has an opening cost and an estimated reliability. There is also a set of clients to be allocated to facilities with corresponding connection costs. Each client has a reliability requirement (RR) for accessing resources. The objective is to open a subset of facilities from sites to satisfy all clients' RRs at a minimum total cost. The Unconstrained Fault-Tolerant Resource Allocation (UFTRA) problem studied in (Liao & Shen 2011) is a special case of RRA. In this paper, we present two equivalent primal-dual algorithms for the RRA problem, where the second one is an acceleration of the first and runs in quasi-linear time. If all clients have the same RR above the threshold that a single facility can provide, our analysis of the algorithm yields an approximation factor of 2+2√2 and later a reduced ratio of 3.722 using a factor revealing program. The analysis further elaborates and generalizes the generic inverse dual fitting technique introduced in (Xu & Shen 2009). As a by-product, we also formalize this technique for the classical minimum set cover problem.
    Proceedings of the Eighteenth Computing: The Australasian Theory Symposium - Volume 128; 01/2012

Publication Stats

1k Citations
100.11 Total Impact Points

Institutions

  • 2013–2014
    • Sun Yat-Sen University
      Guangzhou, Guangdong, China
  • 2008–2014
    • Beijing Jiaotong University
      • School of Computer and Information Technology
      • Department of Computer Science
      Beijing, China
  • 2006–2014
    • University of Adelaide
      • School of Computer Science
      Adelaide, South Australia, Australia
    • Manchester Metropolitan University
      Manchester, England, United Kingdom
  • 2006–2008
    • University of Science and Technology of China
      • School of Computer Science and Technology
      Hefei, Anhui Sheng, China
  • 2007
    • University of Texas at Dallas
      • Department of Computer Science
      Dallas, TX, United States
  • 2001–2007
    • Japan Advanced Institute of Science and Technology
      • School of Information Science
      Nomi, Ishikawa, Japan
  • 2003–2006
    • Fudan University
      • School of Computer Science
      Shanghai, Shanghai Shi, China
    • The Hong Kong Polytechnic University
      • Department of Computing
      Hong Kong, Hong Kong
  • 2005
    • Texas A&M University
      • Department of Computer Science and Engineering
      College Station, TX, United States
  • 1994–2002
    • Griffith University
      • School of Information and Communication Technology (ICT)
      Southport, Queensland, Australia
  • 1995–2001
    • Australian National University
      • Research School of Computer Science
      Canberra, Australian Capital Territory, Australia
  • 2000
    • University of Dayton
      • Department of Computer Science
      Dayton, Ohio, United States