Yifan Yang’s research while affiliated with CHINA COMMUNICATIONS CONSTRUCTION and other places


Publications (3)


FIGURE 1: Framework of the MV-GCN model. The left side shows the graph interaction information and the right side shows the content information. The auto-encodings are integrated across the multiple views with constraints on their consistency. The predicted scores are calculated as the dot product of node embeddings.
FIGURE 2: MV-GCN model for heterogeneous networks. The left side shows the graph interaction information and the content information. The unobservable links are predicted based on the auto-encodings for users and items. The auto-encodings are integrated across the multiple views with constraints on their consistency.
FIGURE 3: RMSEs calculated over 25 random initializations. The x-axis shows four major benchmark datasets and the y-axis shows the RMSE values. For each dataset, we run our proposed method 25 times with different random initializations. The shape in each subgraph represents the sampling distribution of the experimental results. White circles are the means over the 25 runs. Grey points represent the reported RMSEs of the different algorithms. For the ML-100K and Jester datasets, our proposed model consistently outperforms all the benchmark methods over the 25 random initializations. For the Flixster dataset, our proposed method still outperforms the benchmark models on average. For the Douban dataset, our results are comparable to those of the GC-MCN models and show a clear advantage over the other models.
FIGURE 4: Regularity graph on Movielens-100K dataset.
FIGURE 5: Regularity graph on Flixster dataset. The left graph shows the relationship between the RMSE and λ1 with λ2 held constant. The right graph shows the relationship between the RMSE and λ2 with λ1 held constant. The red solid line represents the RMSE on the training set. The orange dashed line represents the RMSE on the validation set. As λ1 and λ2 increase, the training error increases, whereas the validation error decreases. The regularity parameters are chosen by this procedure to minimize the RMSE on the validation set.


MV-GCN: Multi-View Graph Convolutional Networks for Link Prediction
  • Article
  • Full-text available

December 2019 · 1,832 Reads · 39 Citations · IEEE Access · Jiaming Huang · [...] · Yifan Yang

Link prediction is a demanding task in real-world scenarios such as recommender systems; it aims to predict the unobservable links between different objects by learning from network-structured data. In this paper, we propose a novel multi-view graph convolutional neural network (MV-GCN) model that solves this problem based on the matrix completion method by simultaneously exploiting the interactive relationship and the content information of different objects. Unlike existing approaches that directly concatenate the interactive and content information into a single view, the proposed MV-GCN improves the accuracy of the predictions by constraining the consistency of the graph embeddings from multiple views. Experimental results on six primary benchmark datasets, including two homogeneous datasets and four heterogeneous datasets, show that MV-GCN outperforms the recent state-of-the-art methods.
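The two ideas named in the abstract and in the Figure 1 caption, scoring a link by the dot product of node embeddings and constraining embeddings from different views to agree, can be illustrated with a minimal sketch. This is not the paper's implementation: the embedding values, the function names, and the squared-distance form of the consistency term are all assumptions made for illustration.

```python
from math import sqrt


def dot(u, v):
    """Predicted link score: dot product of two node embeddings."""
    return sum(a * b for a, b in zip(u, v))


def consistency_penalty(view_a, view_b):
    """Mean squared distance between two views' embeddings of the same nodes.

    Assumed form of a consistency constraint; the paper may use a different one.
    """
    total = 0.0
    for za, zb in zip(view_a, view_b):
        total += sum((a - b) ** 2 for a, b in zip(za, zb))
    return total / len(view_a)


def rmse(predicted, observed):
    """Root-mean-square error, the metric reported in the figures."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return sqrt(squared / len(observed))


# Hypothetical 2-dimensional embeddings of two nodes, one list per view.
interaction_view = [[1.0, 0.0], [0.5, 0.5]]
content_view = [[1.0, 0.2], [0.5, 0.3]]

score = dot(interaction_view[0], interaction_view[1])
penalty = consistency_penalty(interaction_view, content_view)
```

In a training loop, a term like `penalty` would be weighted and added to the reconstruction loss, so that the views are pulled toward consistent embeddings rather than being concatenated.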


Detecting and Characterizing Web Bot Traffic in a Large E-commerce Marketplace: 23rd European Symposium on Research in Computer Security, ESORICS 2018, Barcelona, Spain, September 3-7, 2018, Proceedings, Part II

August 2018 · 120 Reads · 7 Citations · Lecture Notes in Computer Science

A certain amount of web traffic on the Internet is attributed to web bots. Web bot traffic has raised serious concerns among website operators because it usually consumes considerable resources at web servers, resulting in high workloads and longer response times, while not bringing in any profit. Even worse, the content of the crawled pages might later be used for other fraudulent activities. Thus, it is important to detect web bot traffic and characterize it. In this paper, we first propose an efficient approach to detect web bot traffic in a large e-commerce marketplace and then perform an in-depth analysis of the characteristics of web bot traffic. Specifically, our proposed bot detection approach consists of the following modules: (1) an Expectation Maximization (EM)-based feature selection method to extract the most distinguishable features, (2) a gradient-based decision tree to calculate the likelihood of being a bot IP, and (3) a threshold estimation mechanism aiming to recover a reasonable amount of non-bot traffic flow. The detection approach has been applied to the Taobao/Tmall platforms, and its detection capability has been demonstrated by identifying a considerable amount of web bot traffic. Based on data samples of traffic originating from web bots and normal users, we conduct a comparative analysis to uncover the behavioral patterns of web bots that differ from those of normal users. The analysis results reveal their differences in terms of active time, search queries, item and store preferences, and many other aspects. These findings provide new insights for public websites to further improve web bot traffic detection and protect valuable web content.


Impression Allocation for Combating Fraud in E-commerce Via Deep Reinforcement Learning with Action Norm Penalty

July 2018 · 179 Reads · 33 Citations

Conducting fraudulent transactions has become popular among e-commerce sellers as a way to make their products look favorable to the platform and buyers, which decreases the utilization efficiency of buyer impressions and jeopardizes the business environment. Fraud detection techniques are necessary but not sufficient for the platform, since it is impossible to recognize all fraudulent transactions. In this paper, we focus on improving the platform's impression allocation mechanism to maximize its profit and reduce the sellers' fraudulent behaviors simultaneously. First, we learn a seller behavior model to predict the sellers' fraudulent behaviors from real-world data provided by one of the largest e-commerce companies in the world. Then, we formulate the platform's impression allocation problem as a continuous Markov Decision Process (MDP) with an unbounded action space. In order to make the actions executable in practice and to facilitate learning, we propose a novel deep reinforcement learning algorithm, DDPG-ANP, that introduces an action norm penalty to the reward function. Experimental results show that our algorithm significantly outperforms existing baselines in terms of scalability and solution quality.
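The idea of adding an action norm penalty to the reward can be sketched as reward shaping. The abstract does not give the exact form used in DDPG-ANP, so the L2 norm, the linear penalty, and the `penalty_weight` value below are assumptions chosen only to show the mechanism: in an unbounded action space, larger actions earn a lower shaped reward.

```python
from math import sqrt


def l2_norm(action):
    """Euclidean norm of a continuous action vector."""
    return sqrt(sum(a * a for a in action))


def penalized_reward(reward, action, penalty_weight=0.1):
    """Environment reward minus a penalty proportional to the action norm.

    The L2 norm and the weight are illustrative assumptions, not the
    paper's stated formulation.
    """
    return reward - penalty_weight * l2_norm(action)


# Same raw reward, two hypothetical actions of different magnitude.
small = penalized_reward(1.0, [0.3, 0.4])
large = penalized_reward(1.0, [3.0, 4.0])
```

Under this shaping, a learner such as DDPG is discouraged from drifting toward arbitrarily large actions, which is one plausible reading of making the action "executable in practice".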

Citations (3)


... Where, D is the degree matrix of A, and W is the learnable weight matrix. GCN has been widely applied to tasks such as node classification [37,38], graph classification [38,39], and link prediction [40,41], achieving excellent results. ...

Reference:

A Hybrid Real-Time Framework for Efficient Fussell-Vesely Importance Evaluation Using Virtual Fault Trees and Graph Neural Networks
MV-GCN: Multi-View Graph Convolutional Networks for Link Prediction

IEEE Access

... E-commerce ranks fifth in the intensity of bad bot traffic and first in the volume of sophisticated bot traffic [20]. Bot traffic is a long-standing problem for companies such as Amazon [21], causing a huge economic impact. There are different types of bots which can cause problems for e-commerce such as manipulating product ranks [17] and increasing data access latency [21]. ...

Detecting and Characterizing Web Bot Traffic in a Large E-commerce Marketplace: 23rd European Symposium on Research in Computer Security, ESORICS 2018, Barcelona, Spain, September 3-7, 2018, Proceedings, Part II
  • Citing Chapter
  • August 2018

Lecture Notes in Computer Science

... The last decade has seen significant advances in Natural Language Processing (NLP) and computer vision techniques that can do well in creating multiple learning contexts to build detection systems that are robust to the high level of dynamism observed in various fraud domains. A few articles in our corpus incorporate text-based methods [47,95,104,106] in their detection models, but there are no articles looking at multi-modal approaches incorporating image data into training. This is a future research opportunity area for this domain. ...

Impression Allocation for Combating Fraud in E-commerce Via Deep Reinforcement Learning with Action Norm Penalty
  • Citing Conference Paper
  • July 2018