Kai Chen

East China JiaoTong University, Jiangxi Sheng, China

Publications (44) · 1.57 Total impact

  • ABSTRACT: In this paper, we study the popularity prediction of online user-generated content, where high-quality predictions give us much more flexibility and preparation time in deploying limited resources (such as advertising budget and monitoring capacity) to more popular content. However, the high retrieval cost of the data used in prediction is a big challenge due to the large number of users and content items involved. We propose the notion that highly popular user-generated content can be predicted by concentrating on fewer but more informative users, as we observe that content generated by those users tends to become popular while content generated by the remaining users does not. We develop a cost-effective popularity prediction framework to perform online prediction. It contains three modules: (a) online data retrieval, (b) informative user selection and (c) popularity prediction. A hybrid user selection algorithm and several popularity prediction algorithms and improvements are presented, and their performance is evaluated and compared using (a) the selected users' generated data and (b) all users' generated data, retrieved from the Sina Weibo microblogging platform. The best prediction algorithm reaches 78% accuracy 24 hours after publishing time when the level width Nl equals 500. The best combination of prediction and selection algorithms performs only about 7% worse on a dataset of 2000 users than on the dataset of all users (about 4.46 million).
    No preview · Article · Jan 2015
  • Pei Shen · Yi Zhou · Kai Chen
    ABSTRACT: Microblogging has become a popular means of communication and information diffusion. Due to the huge number of microblogs generated daily, the communication and computing costs required for detecting real hot events are a big challenge. Choosing a small subnet of nodes to detect events has received increasing research interest in recent years. However, previous methods try to select nodes that cover all events in the sample datasets, including less popular ones, under the limited subnet size, which causes a large gap in event detection ratio between sample events and real online events in microblogs. In this paper we propose a new subnet node selection scheme based on the event detection ratio and the nodes' event participation probabilities. Subject to a required average event detection ratio, we prefer nodes that are active in propagating hot events over nodes that participate in less popular events. We use dynamic programming to accelerate the computation. The experimental results show that our proposed method achieves better performance.
    No preview · Conference Paper · Aug 2013
  • ABSTRACT: Spamming has been a widespread problem for social networks. In recent years there has been increasing interest in the analysis of anti-spamming for microblogs such as Twitter. In this paper we present a systematic study of spamming on the Sina Weibo platform, which is currently a dominant microblogging service provider in China. Our research objectives are to understand the specific spamming behaviors in Sina Weibo and to find approaches to identify and block spammers based on spamming behavior classifiers. To start the analysis of spamming behaviors, we devise several effective methods to collect a large set of spammer samples, including the use of proactive honeypots and crawlers, keyword-based searching, and buying spammer samples directly from online merchants. We processed the database associated with these spammer samples and found three representative spamming behaviors: aggressive advertising, repeated duplicate reposting and aggressive following. We extract various features and compare the behaviors of spammers and legitimate users with regard to these features. We find that spamming behaviors and normal behaviors have distinct characteristics. Based on these findings we design an automatic online spammer identification system. Tests with real data demonstrate that the system can effectively detect spamming behaviors and identify spammers in Sina Weibo.
    No preview · Conference Paper · Aug 2013
  • ABSTRACT: We propose a cost-effective hot event detection system over the Sina Weibo platform, currently the dominant microblogging service provider in China. The problem of finding a proper subset of microbloggers under resource constraints is formulated as a mixed-integer problem, for which heuristic algorithms are developed to compute approximate solutions. Preliminary results show that by tracking about 500 out of 1.6 million candidate microbloggers and processing 15,000 microposts daily, 62% of the hot events can be detected, on average five hours earlier than they are published by Weibo.
    No preview · Conference Paper · May 2013
  • Yong Xu · Yi Zhou · Kai Chen
    ABSTRACT: Micro-blogging services have been developing and evolving rapidly in China, which has led to a significant rise in social spamming attacks. However, little is known about these spammers. In this paper, we present observations on spammers in Sina Weibo, the biggest micro-blogging community in China. Specifically, we use program-controlled profiles to monitor, track and record spamming behaviors. We give a detailed description of the experiment settings and then analyze the spamming data collected by these profiles. We find that the spammers on Sina Weibo can be classified into two categories that share some distinguishing characteristics. These results are promising for future studies on automatically detecting and identifying spammers.
    Preview · Article · Mar 2013
  • ABSTRACT: This paper studies the Matthew Effect in the Sina Weibo microblogging service. We choose the microblogs in the ranking list of the Hot Microblog app in Sina Weibo as the target of our study. We analyze the differences in the repost counts of microblogs before and after they enter the ranking list of the Hot Microblog app. We also compare the spreading features of microblogs in the ranking list with those of hot microblogs not in the list and those of ordinary microblogs by users who previously had a microblog in the ranking list. Our study demonstrates the existence of the Matthew Effect in social networks.
    No preview · Conference Paper · Jan 2013
  • Yin Zhang · Yi Zhou · Kai Chen
    ABSTRACT: Interest in traffic classification has grown dramatically in the past few years in both industry and academia. As more and more applications encrypt their payloads and avoid well-known ports, traditional traffic classification methods, such as those based on transport-layer protocol ports, cannot deal with these applications accurately and efficiently. In this paper we investigate the problem of classifying traffic flows into different application categories. We propose a new traffic classification method based on the bag-of-words (BoW) model, which has been widely used in document classification and computer vision. In the new method, the application categories of interest represent the bags and centroids represent the words of the BoW model. By constructing representation vectors for the application categories and calculating the cosine similarity between each category representation vector and a newly built vector converted from the flows to be tested, we can find the application category to which a tested flow belongs. Using real traffic traces we demonstrate that the proposed approach achieves 93% overall accuracy and that the classification is not affected by the packet arrival sequence (e.g. out-of-order arrivals). The overall accuracy of the proposed approach is observed to be 10% higher than that of the widely used C4.5 algorithm in our experiment when out-of-order arrivals happen.
    No preview · Conference Paper · Dec 2012
  • Yi Zhou · Kai Chen · Xiaokang Yang

    No preview · Conference Paper · May 2012
  • Qing Yan · Yi Xu · Xiaokang Yang · Kai Chen
    ABSTRACT: Edges are of significant importance in visual resolution perception. In this paper, we propose a novel image super-resolution method that enhances the edges in the low-resolution image. We first define a new edge sharpness feature, gradient profile sharpness (GPS), which considers both the absolute magnitude and the spatial scattering of the edge gradient profile. Then we learn the relationship between GPSs in high-resolution images and low-resolution images, and we formulate a linear GPS transform to provide a gradient prior for image reconstruction. Our GPS represents edge sharpness perceptually well, and our super-resolution method outputs harmonious and faithful images with better reconstruction quality.
    No preview · Conference Paper · Jan 2012
  • Yi Zhou · Kai Chen · Li Song · Xiaokang Yang · Jianhua He
    ABSTRACT: In this poster we report our study of microblog spammers, with samples attracted by 50 honeypots from two popular Chinese microblogging networks, Sina Weibo (weibo.com) and Tencent Weibo (t.QQ.com), over seven months. We studied their features, such as social information, activity, account age and spamming strategy. Several distinguishing characteristics of spammers on these two social network communities are observed, which can be helpful for further study of the automatic detection of microblog spammers. To the best of our knowledge, our work is the first of its kind on the analysis of features of Chinese microblog spammers.
    No preview · Conference Paper · Jan 2012
  • ABSTRACT: Dynamic binary translation (DBT) has attracted much attention as a powerful technique for the runtime adaptation of software among different ISAs. It offers unprecedented flexibility in the control and modification of a program at runtime. However, its inherently high overhead has troubled researchers for many years. To reduce the overhead of DBT, this paper presents a combined dynamic-static approach to reorganizing the layout of the software cache. Under this approach, we first employ an emulated execution to collect the profile information and the translated target code; in particular, the path of the execution flow is tracked. In the static phase, based on the profile information collected in the previous stage, we first use code replication to build the traces, and then reorganize the layout of the target code by putting the hottest traces at the top of the software cache. Because of exact prediction and improved locality, the execution stream concentrates on a small area with fewer control transfers. This approach can greatly reduce the overhead of DBT provided that the program runs repeatedly. Experimental results on the SPEC 2000 benchmarks show that our approach reduces run time by more than 30% on average.
    Preview · Article · Dec 2011
  • Yi Zhou · Kai Chen · Jianhua He · Haibing Guan
    ABSTRACT: Underwater sensor networks (UWSN) have recently attracted considerable research interest. Medium access control (MAC) is one of the major challenges faced by UWSNs due to the large propagation delay and narrow channel bandwidth of the acoustic communications they use. The widely used slotted Aloha (S-Aloha) protocol suffers a large performance loss in UWSNs, where it can only achieve performance close to that of pure Aloha (P-Aloha). In this paper we theoretically model the performance of the S-Aloha and P-Aloha protocols and analyze the adverse impact of propagation delay. Based on our observations of the performance of S-Aloha, we propose two enhanced S-Aloha protocols to minimize the adverse impact of propagation delay. The first enhancement is a synchronized arrival S-Aloha (SA-Aloha) protocol, in which frames are transmitted at carefully calculated times so that the frame arrival time aligns with the start of a time slot. Propagation delay is taken into consideration in the calculation of the transmit time. As estimation errors in the propagation delay may exist and can affect network performance, an improved SA-Aloha (denoted ISA-Aloha) is proposed, which adjusts the slot size according to the range of delay estimation errors. Simulation results show that both SA-Aloha and ISA-Aloha perform remarkably better than S-Aloha and P-Aloha for UWSNs, and that ISA-Aloha is more robust even when the propagation delay estimation error is large.
    No preview · Conference Paper · May 2011
  • Jianhua He · Wenyang Guan · Lin Bai · Kai Chen
    ABSTRACT: In this paper we investigate the rate adaptation algorithm SampleRate, which spends a fixed time on bit-rates other than the currently measured best bit-rate. A simple but effective analytic model is proposed to study the steady-state behavior of the algorithm. The impacts of link condition, channel congestion and multi-rate retry on the algorithm's performance are modeled. Simulations validate the model. It is also observed that there is still a large performance gap between SampleRate and the optimal scheme in the case of high frame collision probability.
    No preview · Article · May 2011 · IEEE Communications Letters
  • ABSTRACT: Dynamic Binary Translation (DBT) is widely used, but it suffers from substantial overhead. Several methods are used to improve its performance, such as linking/chaining and building super blocks according to profiling and/or tracing. Reorganizing the code layout of the software cache can also improve performance, on the grounds that the execution stream will more closely follow its control flow. Once the target code in the software cache is rearranged properly, hot code is gathered together and well organized. Because of exact prediction and improved locality, the execution stream concentrates on a small area with fewer control transfers. In this paper, we design a new approach that uses a combined dynamic-static framework to reorganize the code layout of the software cache. We then compare it in detail with two other conventional types of code layout. Experimental results show that our method can significantly cut down the overhead: the overall run time is reduced by about 30% on average. Finally, we analyze from several perspectives why reorganizing the code layout can improve the performance of a dynamic binary translator.
    No preview · Conference Paper · Jan 2011
  • ABSTRACT: System virtualization, which provides good isolation, is now widely used in server consolidation. Meanwhile, one of the hot topics in this field is extending virtualization to embedded systems. However, current popular virtualization platforms do not support real-time operating systems such as embedded Linux well, because the platforms are not real-time aware, which results in low-performance I/O and high scheduling latency. The goal of this paper is to optimize the Xen virtualization platform to be friendly to real-time operating systems. We improve two aspects of the Xen virtualization platform. First, we improve the Xen scheduler to manage the scheduling latency and response time of the real-time operating system. Second, we introduce a balancing method for multiple real-time operating systems. Our experiment demonstrates that our enhancements to the Xen virtualization platform support real-time operating systems well and that the improvement in real-time performance is about 20%.
    No preview · Conference Paper · Jan 2011
  • ABSTRACT: In recent years the embedded world has been undergoing a shift from traditional single-core processors to processors with multiple cores. However, this shift poses the challenge of adapting legacy uniprocessor-oriented real-time operating systems (RTOS) to exploit the capability of multi-core processors. In addition, some embedded systems are inevitably moving towards integrating real-time systems with off-the-shelf time-sharing systems, as the combination of the two has the potential to provide not only timely and deterministic response but also a large application base. Virtualization technology, which ensures strong isolation between virtual machines, is therefore a promising solution to the above-mentioned issues. However, there remains a concern regarding the responsiveness of an RTOS running on top of a virtual machine. In this paper we propose an embedded real-time virtualization architecture based on Kernel-Based Virtual Machine (KVM), in which VxWorks and Linux are combined. We then analyze and evaluate how KVM influences the interrupt response times of VxWorks as a guest operating system. By applying several real-time performance tuning methods to the host Linux, we show that sub-millisecond interrupt response latency can be achieved on the guest VxWorks.
    Preview · Conference Paper · Jan 2011
  • ABSTRACT: IEEE 802.15.4 networks (also known as ZigBee networks) have the features of low data rate and low power consumption. In this paper we propose an adaptive data transmission scheme, based on the CSMA/CA access control scheme, for applications that may have heavy traffic loads, such as smart grids. In the proposed scheme, the personal area network (PAN) coordinator adaptively broadcasts a frame length threshold, which the sensors use to decide whether a data frame should be transmitted directly to the target destination or should follow a short data request frame. If the data frame is long and prone to collision, using a short data request frame can efficiently reduce the cost of a potential collision in terms of energy and bandwidth. Simulation results demonstrate the effectiveness of the proposed scheme, with largely improved bandwidth and power efficiency.
    No preview · Conference Paper · Jan 2011
  • ABSTRACT: Debugging can be a threat to system security when adopted by malicious attackers. The major challenges of software-only anti-debugging are compromised strategy and lack of self-protection. Leveraging hardware virtualization, we propose a software protection strategy based on anti-debugging that imperceptibly monitors debug events at a higher privilege level than the conventional kernel space. Our prototype effectively prohibits debugging by selected popular debuggers in the replication experiment.
    No preview · Conference Paper · Jan 2011
  • Kai Chen · Yi Zhou · Li Song · Xiaokang Yang
    ABSTRACT: As the popularity of social networking sites increases, so does their attractiveness to criminals. In this work, we show how an adversary can build artificial identities using semantic information in a social network. Our method makes the identities look more like real people and can therefore be used to support many kinds of attacks, such as ASE (1) and profile cloning (2). A prototype of this method is implemented, comprising the following stages. First, categories of virtual identity are predefined, and each category has multiple properties, such as geographical region, hobby, education, age, topics/keywords of interest, etc. Second, based on the category information, each identity fosters its own "life" semantically, for example by editing its profile and updating its status, finding related hot news/topics from Google and posting them to its wall, finding related groups/networks and requesting to join them, and finding/liking/creating/commenting on pages/posts. Third, an artificial identity evolves through multiple stages according to its status (for example, the number of friends who are real people); individual identities at different evolutionary stages are linked together into a group, which helps to ensure the number of attack edges (3).
    Full-text · Conference Paper · Jan 2011
  • ABSTRACT: It is crucial to minimize virtualization overhead in virtual machine deployment. The conventional x86 CPU is incapable of classical trap-and-emulate virtualization, so paravirtualization was formerly the optimal virtualization strategy. Since architectural extensions were introduced to support classical virtualization, hardware-assisted virtualization has become a competitive alternative. Hardware-assisted virtualization is superior in CPU and memory virtualization, yet paravirtualization is still valuable in some respects, as it can shorten the processing path of I/O virtualization. Thus we propose hybrid virtualization, which runs the paravirtualized guest in the hardware-assisted virtual machine container to take advantage of both. Experimental results indicate that our hybrid solution outperforms the original paravirtualization by nearly 30% in memory-intensive tests and 50% in microbenchmarks. Meanwhile, compared with the original hardware-assisted virtual machine, the hybrid guest shows over 16% improvement in I/O-intensive workloads.
    No preview · Conference Paper · Jan 2011

Publication Stats

168 Citations
1.57 Total Impact Points

Institutions

  • 2009-2013
    • East China JiaoTong University
      Jiangxi Sheng, China
  • 2008-2013
    • Shanghai University
      Shanghai, Shanghai Shi, China
  • 2009-2012
    • Shanghai Jiao Tong University
      • Department of Electronic Engineering
      • School of Information Security Engineering
      Shanghai, Shanghai Shi, China
  • 2011
    • Renji Hospital
      Shanghai, Shanghai Shi, China