Chi-Hung Chi

Tsinghua University, Beijing, China

Publications (113) · 2.94 Total Impact

  • B.S. Vidyalakshmi, R.K. Wong, Chi-Hung Chi
    ABSTRACT: The variety and volume of content generated from mobile devices add to the complexity of research and analysis on big data. Growing content generation on mobile devices, increasing device penetration, and their decentralized nature are the leading reasons for Peer-to-Peer (P2P) content sharing among mobile devices. With technologies like Bluetooth and NFC paving the way for easier and less expensive ad hoc data transfer, content sharing among smartphones is realistic and achievable. Traditional access control among peers in a P2P network assumes that all data is shared among all peers. However, this may not always be the case, as data on smartphones can be personal or confidential. There is a need to share specific data with specific peers, based on each peer's trustworthiness as judged by the host peer. We propose a model which controls access to files at a category level rather than at the file or user level. We argue that the model preserves peers' autonomy while preserving the decentralized P2P structure.
    Big Data (BigData Congress), 2013 IEEE International Congress on; 01/2013
  • V.W. Chu, R.K. Wong, Chi-Hung Chi
    ABSTRACT: Due to the popularity of smartphones, finding and recommending suitable services on mobile devices is increasingly important. Recent research has attempted to use role-based approaches to recommend mobile services to other members of the same group in a context-dependent manner. However, traditional role mining approaches, which originated in the domain of security control, tend to be rigid and may not capture human behaviors adequately. In particular, during the role mining process these approaches easily result in over-fitting, i.e., too many roles with slightly different service consumption patterns are found. As a result, they fail to reveal the true common preferences within the user community. This paper proposes an online role mining algorithm with a residual term that automatically groups users according to their interests and habits without losing sight of their individual preferences. Moreover, to resolve the over-fitting problem, we relax the role mining mechanism by introducing quasi-roles based on the concept of quasi-bicliques. Most importantly, the new concept allows us to propose a monitoring framework to detect and correct over-fitting in online role mining, so that recommendations can be made based on the latest and genuine common preferences. To the best of our knowledge, this is a new area in service recommendation that is yet to be fully explored.
    Web Services (ICWS), 2013 IEEE 20th International Conference on; 01/2013
  • ABSTRACT: Web services have become a primary mechanism for consuming resources available on the Internet. As more and more services are published on the Web, automated service discovery is critical for consumers to identify relevant and reliable services efficiently. In this paper, we enhance the Web Service Crawler Engine (WSCE) framework by introducing comparison measures that allow more accurate identification, discovery, and ranking of relevant Web services. To discover services effectively, we need to be able to measure and compare the similarity among services. Most ontology-based and IR-based discovery techniques assume that service inputs and outputs are simple data types when calculating service similarity. However, real-world services published on the Web usually have input/output parameters with complex data types. Furthermore, a good match of parameters does not guarantee good usability and reliability. The relevant services must be further evaluated by users' past experiences, based on both objective and subjective measures, to make optimal selection possible. This paper proposes a service matchmaking algorithm that considers the complex data types of service input/output parameters, as well as experience-based objective and subjective measures for ranking. Experiments show that our approach performs better than previous works that only consider simple data types.
    Services Computing (SCC), 2013 IEEE International Conference on; 01/2013
  • Zhiwei Yu, R.K. Wong, Chi-Hung Chi
    ABSTRACT: Cloud computing platforms make it feasible to process complicated computing problems whose time cost used to be unacceptable. Recent research has attempted to use role-based approaches for context-aware service recommendation, yet the role mining problem has been proven difficult to compute. Currently proposed role-mining algorithms are inefficient and may not scale to cope with the huge amounts of data in the real world. This paper proposes a novel algorithm with much better runtime complexity, written in MapReduce style to take advantage of popular distributed computing platforms. Experiments on a medium-sized high-performance computing cluster demonstrate that our proposed algorithm performs well in both running time and scalability.
    Big Data, 2013 IEEE International Conference on; 01/2013
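The abstract does not give the algorithm itself; as a toy illustration of the MapReduce style it refers to, the sketch below (with hypothetical user and service names) groups users whose service-consumption patterns are identical into candidate roles:

```python
from collections import defaultdict

def map_phase(user_services):
    # Map: key each user by her exact service-consumption pattern so the
    # shuffle stage brings behaviorally identical users together.
    for user, services in user_services.items():
        yield frozenset(services), user

def reduce_phase(pairs):
    # Reduce: all users sharing one pattern key form a candidate role.
    roles = defaultdict(set)
    for pattern, user in pairs:
        roles[pattern].add(user)
    return dict(roles)

# Hypothetical usage log; real role mining works on far noisier data.
usage = {
    "alice": ["maps", "weather"],
    "bob":   ["weather", "maps"],
    "carol": ["news"],
}
roles = reduce_phase(map_phase(usage))
# Two candidate roles: {alice, bob} and {carol}
```

In a real MapReduce job the two phases would run on different machines with the framework handling the shuffle; here they are chained in-process only to show the data flow.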
  • Shuo Chen, Chi-Hung Chi, Chen Ding, R.K. Wong
    ABSTRACT: Nowadays many software services are hosted in the Cloud. When there are more requests to these services, more queries are sent to the underlying database. To keep up with the increasing workload, it is necessary to have multiple servers hosting the data. Some cloud providers offer a full data replication solution. However, this solution only works well when the load consists mainly of read requests; it does not scale when the number of write requests increases. Although data decomposition has been widely used in data-intensive web sites, little study has been done on how to decompose the underlying data of software services for the purpose of data replication. In this paper, we propose a data-decomposition-based partial replication model for software services. We devise an automatic algorithm for data decomposition under the constraint of the capacity limit of the host machines. We evaluate our approach on two aspects, scalability and performance, using two benchmarks, RUBiS and TPC-W. In the experiments, we test the algorithm using different workload inputs and also compare our approach with the full data replication approach.
    Services Computing (SCC), 2013 IEEE International Conference on; 01/2013
  • ABSTRACT: Non-Functional (NF) requirements are very important for the success of a software service. Considering that multiple services may implement the same function, it is crucial for software providers to understand the real NF demands of consumers so that they can meet these demands and attract users. It is also crucial for consumers to know what is being offered so that they can pose realistic NF requests. We address both issues by proposing an NF requirement analysis and recommendation system which works for both providers and consumers. NF requirements from various sources are first collected; we then apply the factor analysis technique to identify the independent latent factors which contribute to the observable NF values. Finally, we use cluster analysis to summarize the popular NF demands. Our experimental results show the effectiveness of this approach.
    Web Services (ICWS), 2013 IEEE 20th International Conference on; 01/2013
  • A. Mohebi, Chen Ding, Chi-Hung Chi
    ABSTRACT: Many QoS-based service selection algorithms are complex and time-consuming, and sometimes require considerable manual effort from users. As a result, users' real-time search experience may suffer when there are many candidate services, because calculating the ranking scores of services can be very slow. In this paper, we improve the efficiency of the selection process by using a simple vector space model. We also consider users' actual QoS requirements in the ranking process, which is missing in many current systems. Our experimental results show a significant improvement in system efficiency without losing much accuracy when compared with a well-known algorithm, skyline computation.
    Enterprise Distributed Object Computing Conference (EDOC), 2012 IEEE 16th International; 01/2012
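A minimal sketch of ranking with a vector space model, assuming services and the user's QoS requirement are encoded as normalized vectors and compared by cosine similarity (the vectors and service names below are made up, and the paper's actual encoding may differ):

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length QoS vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_services(query, services):
    # Rank candidate services by similarity to the user's QoS query.
    # `services` maps a service name to its normalized QoS vector,
    # e.g. [availability, reliability, 1 - normalized_response_time].
    scored = [(name, cosine(query, vec)) for name, vec in services.items()]
    return sorted(scored, key=lambda x: x[1], reverse=True)

query = [0.9, 0.8, 0.7]          # user's QoS requirement vector (assumed)
services = {
    "s1": [0.95, 0.75, 0.65],
    "s2": [0.40, 0.90, 0.20],
}
ranking = rank_services(query, services)   # "s1" ranks first
```

The appeal of this model is exactly what the abstract claims: scoring is a single pass of dot products, so it stays fast even with many candidates.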
  • Zhou Wei, G. Pierre, Chi-Hung Chi
    ABSTRACT: Cloud data stores provide scalability and high availability for Web applications, but do not support complex queries such as joins. Web application developers must therefore design their programs according to the peculiarities of NoSQL data stores rather than established software engineering practice. This results in complex and error-prone code, especially with respect to subtle issues such as data consistency under concurrent read/write queries. We present join query support in CloudTPS, a middleware layer which stands between a Web application and its data store. The system enforces strong data consistency and scales linearly under a demanding workload composed of join queries and read-write transactions. In large-scale deployments, CloudTPS outperforms replicated PostgreSQL by up to three times.
    Cluster, Cloud and Grid Computing (CCGrid), 2012 12th IEEE/ACM International Symposium on; 01/2012
  • W. Zhou, G. Pierre, C. Chi
    ABSTRACT: NoSQL Cloud data stores provide scalability and high availability properties for web applications, but at the same time they sacrifice data consistency. However, many applications cannot afford any data inconsistency. CloudTPS is a scalable transaction manager which guarantees full ACID properties for multi-item transactions issued by Web applications, even in the presence of server failures and network partitions. We implement this approach on top of the two main families of scalable data layers: Bigtable and SimpleDB. Performance evaluation on top of HBase (an open-source version of Bigtable) in our local cluster and Amazon SimpleDB in the Amazon cloud shows that our system scales linearly at least up to 40 nodes in our local cluster and 80 nodes in the Amazon cloud.
    IEEE Transactions on Services Computing, 01/2011 · 2.46 Impact Factor
  • Raed Karim, Chen Ding, Chi-Hung Chi
    ABSTRACT: In order to choose from a list of functionally similar services, users often need to make their decisions based on multiple non-functional criteria they require of the target service. It is a natural fit to apply Multi-Criteria Decision Making (MCDM) theory to this selection problem. However, the high demand MCDM approaches place on user expertise and involvement can become an obstacle to using them for service selection. In this paper, we address this issue by taking a user-centric standpoint in designing a service selection system based on non-functional criteria. On one hand, we try to reduce the workload and skill-level requirements on users. On the other hand, we still give them the flexibility to define the necessary information, which includes their preferences on multiple criteria as well as the decision strategies they would follow to select the desired services from a list of alternatives. The former is crucial for optimal decision making. The latter is often ignored by most service selection systems, whose common default strategy is to find the service with the best overall score calculated by a certain formula. In reality, users may not follow this strategy, and there are many other possible strategies; we take this into consideration when designing the selection system. We use a case study to show that our system can produce more accurate and customized results for individual users.
    01/2011;
  • Raed Karim, Chen Ding, Chi-Hung Chi
    ABSTRACT: Since selecting a web service based on Quality of Service (QoS) is essentially a Multi-Criteria Decision Making (MCDM) problem, various MCDM models are suitable for implementing selection systems, and a few have been explored in previous research. In this paper, we propose an enhanced PROMETHEE model for QoS-based web service selection. Many selection algorithms assume independence between QoS criteria, which is not very accurate. Thus, our first enhancement is to account for QoS interdependency by using the Analytic Network Process (ANP) to calculate the weight/priority associated with each criterion. The user's QoS requirements are not considered in the original PROMETHEE model; consequently, when tradeoff decisions are involved in finding an optimal service, we may end up with a service which optimizes the overall QoS criteria but does not satisfy the user request. To overcome this insufficiency, our second enhancement checks the outranking flows of each service with respect to the request in the ranking step, so that we know how well a service satisfies the user requirement. A case study is presented to explain the detailed selection process.
    IEEE International Conference on Services Computing, SCC 2011, Washington, DC, USA, 4-9 July, 2011; 01/2011
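For readers unfamiliar with PROMETHEE, the sketch below computes net outranking flows using the usual-criterion preference function and fixed weights; the paper's ANP-derived weights and its request-aware second enhancement are not reproduced here, and all scores are hypothetical:

```python
def preference(a, b):
    # Usual-criterion preference function: 1 if a strictly beats b.
    return 1.0 if a > b else 0.0

def net_flows(scores, weights):
    # PROMETHEE net flow phi(a) = phi+(a) - phi-(a), where phi+ averages
    # how strongly `a` outranks the others and phi- how strongly it is
    # outranked.  scores: {alternative: [criterion values, higher=better]}.
    alts = list(scores)
    n = len(alts)
    phi = {}
    for a in alts:
        pos = neg = 0.0
        for b in alts:
            if a == b:
                continue
            pi_ab = sum(w * preference(x, y)
                        for w, x, y in zip(weights, scores[a], scores[b]))
            pi_ba = sum(w * preference(y, x)
                        for w, x, y in zip(weights, scores[a], scores[b]))
            pos += pi_ab
            neg += pi_ba
        phi[a] = (pos - neg) / (n - 1)
    return phi

# Hypothetical services scored on two criteria, weighted 0.7 / 0.3.
phi = net_flows(
    {"s1": [0.9, 0.2], "s2": [0.5, 0.5], "s3": [0.1, 0.9]},
    weights=[0.7, 0.3],
)
best = max(phi, key=phi.get)   # "s1" wins on the heavier criterion
```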
  • Yun Wei Zhao, Chi-Hung Chi, Chen Ding
    ABSTRACT: Clustering is an important technique for intelligence computation such as trust, recommendation, reputation, and requirement elicitation. Given the user-centric nature of services and users' lack of prior knowledge of the distribution of the raw data, one challenge is how to associate user quality requirements on the clustering results with the algorithmic output properties (e.g., the number of clusters to be targeted). In this paper, we focus on the hierarchical clustering process and propose two quality-driven hierarchical clustering algorithms, HBH (homogeneity-based hierarchical) and HDH (homogeneity-driven hierarchical), which take the minimum acceptable homogeneity and relative population for each output cluster as their input criteria. Furthermore, we give an HDH-approximation algorithm to address the time performance issue. Experimental study on data sets with different density distributions and dispersion levels shows that HDH gives the best quality result and HDH-approximation can significantly improve the execution time.
    Seventh International Conference on Semantics Knowledge and Grid (SKG 2011), Beijing, China, October 24-26, 2011; 01/2011
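The HBH/HDH algorithms themselves are not given in the abstract; the sketch below is a generic homogeneity-constrained agglomerative pass on 1-D data, with cluster diameter standing in for the paper's homogeneity measure (an assumption made purely for illustration):

```python
def diameter(cluster, points):
    # Largest pairwise distance inside a cluster of 1-D points.
    vals = [points[i] for i in cluster]
    return max(vals) - min(vals)

def homogeneity_hierarchical(points, max_diameter):
    # Greedy agglomerative clustering: repeatedly merge the pair of
    # clusters whose merge has the smallest diameter, but only if the
    # merged cluster still satisfies the homogeneity constraint
    # (diameter <= max_diameter).  Stops when no admissible merge exists,
    # so the number of clusters falls out of the quality requirement
    # rather than being fixed up front.
    clusters = [[i] for i in range(len(points))]
    while True:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = diameter(clusters[a] + clusters[b], points)
                if d <= max_diameter and (best is None or d < best[0]):
                    best = (d, a, b)
        if best is None:
            return [sorted(c) for c in clusters]
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

data = [0.0, 0.1, 0.2, 5.0, 5.1]
clusters = homogeneity_hierarchical(data, max_diameter=0.5)
# Two clusters: indices {0, 1, 2} and {3, 4}
```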
  • Yun Wei Zhao, Chi-Hung Chi, Chen Ding
    ABSTRACT: Clustering is a classic technique widely used in computational intelligence to study similarity measures among entities of interest. The output measurement of clustering, however, is often computation-centric (e.g., the number of peaks, K) instead of user-centric (e.g., the quality of the clusters). This creates a big gap between the algorithms and the users, in particular when they are applied to areas such as software services. To address this issue, we propose to use the expected homogeneity degree among entities within a given cluster as the input quality requirement specified by users to drive the data clustering process. We evaluate the effectiveness of our proposal by modifying the two most widely used clustering methods, K-means and hierarchical clustering, according to the homogeneity degrees of the clustered output results.
    Networked Digital Technologies - Third International Conference, NDT 2011, Macau, China, July 11-13, 2011. Proceedings; 01/2011
  • YunWei Zhao, Chi-Hung Chi, Chen Ding
    ABSTRACT: With the current trend toward services and cloud computing, the unique characteristics of online software services impose new algorithmic requirements and cause different clustering approaches to differ in applicability and suitability for service analytics. In this paper, we investigate the efficiency and effectiveness of two important data clustering techniques, partitioning and hierarchical, for service analytics. It is our goal that the results from this paper will serve as requirement guidelines for developing and deploying future intelligence services.
    01/2011;
  • Qiong Zhang, Chen Ding, Chi-Hung Chi
    ABSTRACT: Collaborative filtering based recommender systems are very successful at dealing with the information overload problem and providing personalized recommendations to users. As more and more web services are published online, this technique can also help recommend and select services which satisfy users' particular Quality of Service (QoS) requirements and preferences. In this paper, we propose a novel collaborative filtering based service ranking mechanism, in which invocation and query histories are used to infer user behavior, and user similarity is calculated based on similar invocations and queries. To overcome some of the inherent problems of collaborative filtering systems, such as cold start and data sparsity, the final ranking score is a combination of the QoS-based matching score and the collaborative filtering based score. An experiment using a simulated dataset demonstrates the effectiveness of the algorithm.
    IEEE International Conference on Web Services, ICWS 2011, Washington, DC, USA, July 4-9, 2011; 01/2011
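A minimal sketch of the score combination described above, assuming a simple weighted sum with weight alpha and a cold-start fallback (both assumptions for illustration, not the paper's exact formula):

```python
def combined_score(qos_match, cf_score, alpha=0.5):
    # Weighted blend of the QoS-matching score and the collaborative
    # filtering score.  Falling back on the QoS score alone when no
    # collaborative data exists is one way to soften the cold-start
    # problem the abstract mentions.
    if cf_score is None:          # cold start: no similar-user history yet
        return qos_match
    return alpha * qos_match + (1 - alpha) * cf_score

def rank(candidates, alpha=0.5):
    # candidates: {service: (qos_match, cf_score_or_None)} -> sorted list.
    return sorted(
        ((s, combined_score(q, c, alpha)) for s, (q, c) in candidates.items()),
        key=lambda x: x[1],
        reverse=True,
    )

# Hypothetical scores: "a" is new (no CF data), "b" has both signals.
ranking = rank({"a": (0.9, None), "b": (0.6, 0.95)})
```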
  • ABSTRACT: Flash crowds are generally agreed to be one of the main threats to the availability of web services, and application replication is a promising approach to address this problem. In this paper, we first model the workload of flash crowds according to the characteristics of their request distribution. We then propose a new mechanism to trigger the replication service based on the gradient of the cumulative request pattern. This mechanism, called the hybrid adaptive selection algorithm (HASA), aims to minimize both the number of discarded requests (NDR) and the spare capacity (NSC). Detailed simulation of this mechanism with respect to interval size and the thresholds of NDR and NSC shows that significant performance improvement can be obtained over the previously proposed single-targeted adaptive selection algorithms.
    Semantics Knowledge and Grid (SKG), 2010 Sixth International Conference on; 12/2010
  • Shuo Chen, Guillaume Pierre, Chi-Hung Chi
    ABSTRACT: Gossip-based aggregation protocols are a promising approach to monitoring large-scale decentralized IT infrastructures. Compared to traditional approaches, they exhibit good scalability, tolerance of churn, and low communication overhead. Gossip-based protocols can compute statistical aggregates such as the average, sum, or statistical distribution of an attribute across a large system. However, such protocols are extremely vulnerable to malicious attacks, and even a small number of attackers in the system can largely undermine aggregation results. This paper presents a secure protocol for computing attribute averages. In this system, each node autonomously judges whether its neighbors are malicious, and may subsequently stop any interaction with them. A node appearing malicious to its neighbors quickly gets excluded from the system. Instead of defining malicious behavior (and excluding nodes that follow the definition of maliciousness), our system defines correct behavior (and excludes any node that behaves differently). This in principle allows our system to address arbitrary types of attacks. Simulations based on real-world attribute data demonstrate that our system offers good resistance against four different types of attacks.
    Proceedings of the 19th International Conference on Computer Communications and Networks, IEEE ICCCN 2010, Zürich, Switzerland, August 2-5, 2010; 01/2010
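As background on gossip averaging (the unsecured baseline; the paper's malicious-node detection is not sketched here), pairwise averaging preserves the global sum, so repeated random exchanges drive every node toward the system-wide mean:

```python
import random

def gossip_average(values, rounds=200, seed=42):
    # Pairwise averaging gossip: in each round a random pair of nodes
    # replaces both of their values with the pair's mean.  The global sum
    # is invariant under each exchange, so all nodes converge to the
    # system-wide average.  A simulated, synchronous stand-in for the
    # asynchronous message exchanges of a real deployment.
    rng = random.Random(seed)
    vals = list(values)
    n = len(vals)
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)
        mean = (vals[i] + vals[j]) / 2.0
        vals[i] = vals[j] = mean
    return vals

nodes = [10.0, 20.0, 30.0, 40.0]      # hypothetical attribute values
estimates = gossip_average(nodes)     # every entry approaches 25.0
```

The fragility the abstract describes is visible here: a single node that lies about its value during an exchange shifts the invariant sum, and with it every node's final estimate.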
  • Web Information Systems Engineering - WISE 2010 - 11th International Conference, Hong Kong, China, December 12-14, 2010. Proceedings; 01/2010
  • Dejun Jiang, Guillaume Pierre, Chi-Hung Chi
    ABSTRACT: Dynamic resource provisioning aims at maintaining the end-to-end response time of a web application within a predefined SLA. Although the topic has been well studied for monolithic applications, provisioning resources for applications composed of multiple services remains a challenge. When the SLA is violated, one must decide which service(s) should be reprovisioned for optimal effect. We propose to assign an SLA only to the front-end service. Other services are not given any particular response time objectives. Services are autonomously responsible for their own provisioning operations and collaboratively negotiate performance objectives with each other to decide the provisioning service(s). We demonstrate through extensive experiments that our system can add/remove/shift both servers and caches within an entire multi-service application under varying workloads to meet the SLA target and improve resource utilization.
    Proceedings of the 19th International Conference on World Wide Web, WWW 2010, Raleigh, North Carolina, USA, April 26-30, 2010; 01/2010
  • Delnavaz Mobedpour, Chen Ding, Chi-Hung Chi
    ABSTRACT: One of the prerequisites for the success of a QoS-based web service selection process is an accurately formulated QoS query. Formulating an accurate query is usually not easy for users, given the complexity of many current QoS languages and users' lack of knowledge of realistic QoS values. It would be very helpful if the system provided assistance to users throughout the process; nonetheless, few research works put user support at the center of their system design. In this paper, we tackle this issue by proposing a QoS query language which is expressive yet not overly complicated, together with a comprehensive user support mechanism to guide users through the query formulation process. A few unique features of the language include its time dimension, a user-defined relaxation order which may differ from the preference order, and support for mixed fuzzy and range requirements. How to handle these new features is also discussed through case studies in the paper.
    2010 IEEE International Conference on Services Computing, SCC 2010, Miami, Florida, USA, July 5-10, 2010; 01/2010

Publication Stats

184 Citations
650 Downloads
2.94 Total Impact Points

Institutions

  • 2006–2011
    • Tsinghua University
      • School of Software
      Beijing, China
  • 2010
    • VU University Amsterdam
      Amsterdam, North Holland, Netherlands
  • 2008
    • Hunan University
      Changsha, Hunan, China
  • 2001–2006
    • National University of Singapore
      • School of Computing
      Singapore