Conference Paper

Research on ShadowsocksR Traffic Identification Based on DART Algorithm


... However, it does not effectively utilize the intelligence of external network entities, resulting in a large false-positive rate for the model in the real environment. The authors of reference [16] proposed an SSR traffic detection method based on the DART algorithm the following year. Compared with the previous method, it can identify more types and ranges of SSR obfuscated traffic. ...
Article
ShadowsocksR (SSR), as a typical emerging anonymous communication tool, may record user information on the SSR client or server, leading to the theft of the user’s privacy, and may be used by attackers to anonymize their internal network environment and organization, which will cause serious damage to data security and bring severe challenges to security defense and threat assessment within organizations. To solve the problem of accurately and effectively discovering SSR users within an organization in a real traffic environment, in this paper, we propose an SSR user detection method based on network entity intelligence as follows: (1) According to the communication characteristics of SSR users, relevant network entity intelligence information from inside and outside the organization is obtained, such as the distribution of IP addresses within and outside the organization, and the differences between SSR and non-SSR users are analyzed to construct a feature space. (2) The communication behaviors of SSR and non-SSR users are further analyzed and features are extracted from the perspective of traffic behavior analysis, and the feature space of the SSR user detection model is expanded. (3) A data-driven machine-learning-based approach is designed and implemented to provide suggestions for the automatic identification of SSR users based on the extracted feature vectors. Results show that the detection method proposed in this paper has a detection accuracy of more than 95% for SSR users in the experimental environment, can accurately distinguish between SSR communication and normal communication, and can achieve accurate SSR user detection.
Article
Cloud Virtual Private Server (VPS) services enable rapid deployment of anonymous proxy services and have become an important part of many anonymous proxy solutions. Anonymous systems such as ShadowSocks (SS), through proxy services deployed on VPSs provided by different cloud service providers, have become an important means for illegal network activists to engage in illegal network activities such as cyber-attacks and darknet transactions. It is difficult for local network administrators to supervise SS traffic from the cloud, while from the local network the task faces the challenges of Invisible Negotiation Process and Data Transparent Transmission. In this paper, we present a novel SS detection method based on flow context and host behavior. The method can not only accurately identify SS flows but is also applicable to large-scale network environments. In this method, we extract 12-dimensional features from three aspects (the relationship between flows, hosts’ flow behavior, and hosts’ DNS behavior) to build the detection model. Among them, the four features describing flow bursts and the feature counting unassociated domain names are newly proposed in this paper. Moreover, big data statistical and association techniques are used in the method. To verify the effectiveness of the method, we first built a real SS running environment based on a campus network and two VPSs on two different public cloud platforms. We then conducted a series of experiments on the NTCI-BDP data platform, a big data platform built by our team. The experimental results show that our method achieves 93.43% accuracy on the experimental data sets and can effectively identify SS traffic.
Article
With the advancement of technology and communication systems, the internet plays an increasingly large role, causing exponential growth of data and traffic over the internet. Correctly classifying this traffic is therefore a hot research area. Internet traffic classification is a widely used tool in information detection systems. Although many methods have been developed to classify internet traffic efficiently, machine learning techniques are the most popular among them. This paper gives a brief survey of the supervised and unsupervised machine learning techniques that various researchers have applied to internet traffic classification. It also presents various issues related to machine learning techniques that may help interested researchers pursue future work in this direction.
Conference Paper
Accurate traffic classification is of fundamental importance to numerous other network activities, from security monitoring to accounting, and from Quality of Service to providing operators with useful forecasts for long-term provisioning. We apply a Naïve Bayes estimator to categorize traffic by application. Uniquely, our work capitalizes on hand-classified network data, using it as input to a supervised Naïve Bayes estimator. In this paper we illustrate the high level of accuracy achievable with the Naïve Bayes estimator. We further illustrate the improved accuracy of refined variants of this estimator. Our results indicate that with the simplest Naïve Bayes estimator we are able to achieve about 65% accuracy on per-flow classification, and with two powerful refinements we can improve this value to better than 95%; this is a vast improvement over traditional techniques that achieve 50-70%. While our technique uses training data, with categories derived from packet content, all of our training and testing was done using header-derived discriminators. We emphasize this as a powerful aspect of our approach: using samples of well-known traffic to allow the categorization of traffic using commonly available information alone.
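The supervised Naïve Bayes idea in this abstract can be illustrated with a minimal sketch. It assumes Gaussian per-feature likelihoods and two hypothetical header-derived discriminators (mean packet size and flow duration); the class names and toy training values are invented for illustration and are not taken from the paper.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate a log-prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    total = len(samples)
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small floor keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[y] = (math.log(n / total), means, variances)
    return model

def predict(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    best_label, best_score = None, -math.inf
    for y, (log_prior, means, variances) in model.items():
        score = log_prior
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = y, score
    return best_label

# Hypothetical features: (mean packet size in bytes, flow duration in seconds).
train_x = [(1200, 30.0), (1300, 45.0), (1250, 38.0),
           (90, 0.5), (110, 0.8), (100, 0.6)]
train_y = ["bulk", "bulk", "bulk",
           "interactive", "interactive", "interactive"]
model = fit_gaussian_nb(train_x, train_y)
```

Calling `predict(model, (1180, 33.0))` on this toy model selects the class whose Gaussian profile best matches the flow. The refinements mentioned in the abstract are not reproduced here.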
Article
The early detection of applications associated with TCP flows is an essential step for network security and traffic engineering. The classic way to identify flows, i.e. looking at port numbers, is not effective anymore. On the other hand, state-of-the-art techniques cannot determine the application before the end of the TCP flow. In this editorial, we propose a technique that relies on the observation of the first five packets of a TCP connection to identify the application. This result opens a range of new possibilities for online traffic classification.
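A rough sketch of early classification from the first five packets follows. The abstract does not say which packet properties are observed, so this example assumes signed payload sizes (positive for client to server, negative for server to client) and a nearest-profile match; the profile names and values are hypothetical.

```python
def first_packets_signature(packets, n=5):
    """Return the signed sizes of the first n packets, or None if the flow is shorter.

    packets: list of (direction, size) pairs, where direction is +1 for
    client-to-server and -1 for server-to-client (an assumed encoding).
    """
    if len(packets) < n:
        return None
    return tuple(direction * size for direction, size in packets[:n])

def nearest_profile(signature, profiles):
    """Pick the profile with the smallest squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(profiles, key=lambda name: squared_distance(signature, profiles[name]))

# Hypothetical per-application profiles of the first five signed packet sizes.
profiles = {
    "app_a": (120, -1400, 80, -1400, -1400),
    "app_b": (40, -60, 45, -55, 50),
}

# A flow with six packets; only the first five are used.
flow = [(1, 110), (-1, 1380), (1, 90), (-1, 1400), (-1, 1350), (1, 60)]
sig = first_packets_signature(flow)
label = nearest_profile(sig, profiles)
```

Because the signature is available after five packets, a decision can be made while the connection is still open, which is the property the abstract emphasizes.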
Conference Paper
Automated network traffic classification is a basic requirement for managing Quality of Service in communications networks. This research compares the performance of six widely-used supervised machine learning algorithms for classifying network traffic. The evaluations were conducted for classification of five distinct network traffic classes and two feature selection techniques. Our comparative results show that the Random Forest and Decision Tree algorithms are promising classifiers for network traffic in terms of both classification accuracy and computational efficiency.
Article
Considering the importance of encrypted traffic identification technology and existing research work, first, the types of encrypted traffic identification according to the demands of traffic analysis, such as protocols, applications and services, are introduced. Second, encrypted traffic identification technologies are summarized and compared from multiple views. Third, the deficiencies of existing encrypted traffic identification technologies and the factors affecting them, such as tunneling, traffic camouflage technology, and the new protocols HTTP/2.0 and QUIC, are summarized. Finally, trends and directions of future research on encrypted traffic identification are discussed. © 2016, Editorial Board of Journal on Communications. All rights reserved.
Conference Paper
Identifying applications and classifying network traffic flows according to their source applications are critical for a broad range of network activities. Such classifications can be based on information derived from packet header fields and payload content, or statistical characteristics of flows and communication patterns of hosts. However, most present methods rely on some form of a priori knowledge. In this paper, an application signature based traffic classification system with a novel approach to fully automate the process of deriving signatures from unknown traffic is proposed. The key idea is to combine traffic clustering based on statistical flow properties, in order to generate clusters dominated by a single application, with application signature construction based solely on payload content from each cluster. Evaluation using real-world traffic traces indicates that the proposed approach is highly effective.
Article
Identifying applications and classifying network traffic flows according to their source applications are critical for a broad range of network activities. Such a decision can be based on packet header fields, packet payload content, statistical characteristics of traffic and communication patterns of network hosts. However, most present techniques rely on some sort of a priori knowledge, which means they require labor-intensive preprocessing before running and cannot deal with previously unknown applications. In this paper, we propose a traffic classification system based on application signatures, with a novel approach to fully automate the process of deriving signatures from unidentified traffic. The key idea is to integrate statistics-based flow clustering with a payload-based signature matching method, so as to eliminate the requirement of pre-labeled training data sets. We evaluate the efficiency of our approach using a real-world traffic trace, and the results indicate that signature classifiers built from clustered data and pre-labeled data are able to achieve similarly high accuracy, better than 99%. Copyright © 2010 John Wiley & Sons, Ltd.
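The two-stage idea of clustering flows by statistics and then deriving a payload signature per cluster can be sketched as below. This is only an illustration under stated assumptions: the clustering is a simple leader algorithm with a fixed radius, the signature is the longest common payload prefix, and the flow statistics and payloads are invented. The paper's actual clustering and signature construction methods are not given in the abstract.

```python
import math

def leader_cluster(flows, radius):
    """Assign each flow to the first cluster whose center lies within radius."""
    clusters = []  # each cluster: {"center": stats tuple, "members": [flow, ...]}
    for flow in flows:
        for cluster in clusters:
            if math.dist(flow["stats"], cluster["center"]) <= radius:
                cluster["members"].append(flow)
                break
        else:
            # No existing cluster is close enough, so this flow starts a new one.
            clusters.append({"center": flow["stats"], "members": [flow]})
    return clusters

def common_prefix(payloads):
    """Return the longest byte prefix shared by all payloads."""
    prefix = payloads[0]
    for payload in payloads[1:]:
        i = 0
        while i < min(len(prefix), len(payload)) and prefix[i] == payload[i]:
            i += 1
        prefix = prefix[:i]
    return prefix

# Hypothetical unlabeled flows: stats = (mean packet size, packet count).
flows = [
    {"stats": (500, 12), "payload": b"GET /index.html HTTP/1.1"},
    {"stats": (100, 40), "payload": b"SSH-2.0-OpenSSH_8.9"},
    {"stats": (520, 10), "payload": b"GET /a.png HTTP/1.1"},
    {"stats": (110, 42), "payload": b"SSH-2.0-dropbear"},
]

clusters = leader_cluster(flows, radius=50)
signatures = [common_prefix([f["payload"] for f in c["members"]]) for c in clusters]
```

No labels are used at any point: the clusters come from flow statistics alone, and each signature comes from the payloads inside one cluster, which mirrors the abstract's goal of removing the need for pre-labeled training data.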