... Then [44] steps forward to reveal that RESIP proxies may be harvested through unauthorized NAT entry injection by exploiting a UPNP vulnerability. Besides, [26] moves the spotlight closer to mobile devices and explores how mobile devices have been recruited to serve as RESIPs along with a set of mobile proxy SDKs identified and profiled. ...
... In summary, little is known about this regional RESIP ecosystem as well as its security implications. Lastly, although [26] has revealed the recruitment of mobile devices by several RESIP services, the supply chain of RESIPs has yet to be further uncovered, which is crucial for understanding and addressing the security risks of RESIPs. ...
... As the first step of our pipeline, we queried major search engines with RPS-relevant keywords. Among these well-crafted keywords, some are adopted from previous works [25,26,44] while others are manually extracted through visiting RPS websites. Table 1 lists them along with their translations in Chinese. ...
We carry out the first in-depth characterization of residential proxies (RESIPs) in China, which have received little study in previous work. Our study is made possible through a semantic-based classifier to automatically capture RESIP services. In addition to the classifier, new techniques have also been identified to capture RESIPs without interacting with and relaying traffic through RESIP services, which can significantly lower the cost and thus allow continuous monitoring of RESIPs. Our RESIP service classifier has achieved a good performance with a recall of 99.7% and a precision of 97.6% in 10-fold cross validation. Applying the classifier has identified 399 RESIP services, a much larger set compared to the 38 RESIP services collected in all previous works. Our RESIP capturing effort led to a collection of 9,077,278 RESIP IPs (51.36% are located in China), 96.70% of which are not covered in publicly available RESIP datasets. An extensive measurement on RESIPs and their services has uncovered a set of interesting findings as well as several security implications. In particular, 80.05% of RESIP IPs located in China have sourced at least one malicious traffic flow during 2021, resulting in 52 million malicious traffic flows in total. RESIPs have also been observed in the corporate networks of 559 sensitive organizations including government agencies, education institutions and enterprises. Also, 3,232,698 China RESIP IPs have opened at least one TCP/UDP port for accepting relaying requests, which incurs non-negligible security risks to the local network of RESIPs. Besides, 91% of China RESIP IPs have a lifetime of less than 10 days, while most China RESIP services show a crest-trough pattern in terms of the daily active RESIPs across time.
... Some RESIP providers build their networks thanks to mobile SDKs included by developers in apps. Device owners voluntarily download these apps and give consent to be part of the network [21]. However, Frappier et al. [18] suggest that, in some cases, the provider lures these device owners. ...
... While the found pool sizes are much bigger than the ones seen in the previous modeling, in both cases the pool sizes claimed by RESIP providers do not correspond to our observations. Between 2021 and 2022, studies on mobile devices as RESIP GATEWAYs have been published [21], [26], [29], [31]. The focus of these works is to understand how a device becomes part of a RESIP. ...
... The more AV engines that flag a URL/FQDN, the more likely the URL/FQDN is malicious [21]. Here we denote the set of URLs/FQDNs detected by at least 1 AV engine as +) 1 and the set of those flagged by at least 5 engines as +) 5. We consider +) 5 since it is a common threshold for determining whether a spam URL/FQDN is malicious [40,41]. Each report also includes the coarse-grained categories that VT assigns to each flagged URL/FQDN, i.e., malicious, malware, or phishing. ...
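The two detection thresholds described in this snippet amount to a simple filter over per-URL engine counts. The following is a minimal sketch, assuming each URL/FQDN has already been mapped to the number of AV engines that flagged it. The function name `flagged_by_at_least` and the sample counts are hypothetical; they are not taken from the cited paper or from any VirusTotal API.

```python
from typing import Dict, Set


def flagged_by_at_least(detections: Dict[str, int], threshold: int) -> Set[str]:
    """Return the URLs/FQDNs flagged by at least `threshold` AV engines."""
    return {name for name, count in detections.items() if count >= threshold}


# Hypothetical per-FQDN engine counts, for illustration only.
sample_counts = {
    "a.example": 0,
    "b.example": 1,
    "c.example": 5,
    "d.example": 12,
}

at_least_1 = flagged_by_at_least(sample_counts, 1)
at_least_5 = flagged_by_at_least(sample_counts, 5)
```

By construction the stricter set is always a subset of the looser one, which is why the higher threshold trades coverage for fewer false positives.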
... We further looked into the WHOIS information of these spam IPs, aiming to uncover the underlying hosting providers potentially abused by these spammers. We consider spam IPs associated with spam FQDNs in the set of +) 5 as malicious-related IPs, which is a commonly used threshold [40,41]. The detailed information is listed in Appendix 16. ...
With its critical role in business and service delivery through mobile devices, SMS (Short Message Service) has long been abused for spamming, which is still on the rise today possibly due to the emergence of A2P bulk messaging. The effort to control SMS spam has been hampered by the lack of up-to-date information about illicit activities. In our research, we proposed a novel solution to collect recent SMS spam data, at a large scale, from Twitter, where users voluntarily report the spam messages they receive. For this purpose, we designed and implemented SpamHunter, an automated pipeline to discover SMS spam reporting tweets and extract message content from the attached screenshots. Leveraging SpamHunter, we collected from Twitter a dataset of 21,918 SMS spam messages in 75 languages, spanning over four years. To our best knowledge, this is the largest SMS spam dataset ever made public. More importantly, SpamHunter enables us to continuously monitor emerging SMS spam messages, which facilitates the ongoing effort to mitigate SMS spamming. We also performed an in-depth measurement study that sheds light on the new trends in the spammer's strategies, infrastructure and spam campaigns. We also utilized our spam SMS data to evaluate the robustness of the spam countermeasures put in place by the SMS ecosystem, including anti-spam services, bulk SMS services, and text messaging apps. Our evaluation shows that such protection cannot effectively handle those spam samples: either introducing significant false positives or missing a large number of newly reported spam messages.
... Whenever a platform enables users to share resources in exchange for benefits, such as monetary compensation, malicious actors can exploit third-party resources without authorization to make a personal profit. This behavior, known as sponge attack [4], has been observed across several domains, particularly in crypto mining and residential proxies [4,10]. In this type of attack, the victim remains unaware that a resource-sharing protocol operates on their infrastructure, resulting in the unauthorized use of resources for the attacker's gain. ...
... 27 Live sports were especially affected by streaming piracy during the COVID-19 pandemic: a 2021 estimate of sports streaming piracy alone put damages at an estimated $28.3 billion per year. 28 While these estimates are based on the US, estimates for the EU are likely even higher due to generally higher levels of piracy (45.72% in Europe compared to 13.48% in North America in 2020 29) and larger population numbers (almost 500 million people in the EU compared to 330 million in the US). ...
Digital piracy, i.e., large-scale commercial copyright infringement online, is a constantly evolving phenomenon. Following the enactment and expansion of online intermediary liability rules, professional pirates have shifted away from easily blockable websites and services. Streaming piracy on dedicated platforms, monetised through embedded services, has become the prevalent model in Europe, as is well illustrated by the Mobdro case study analysed in this article. These dedicated platforms are supported by embedded service providers who, as they are not online intermediaries, can avoid the online intermediary liability regime. One potential solution to this issue could be the application of contributory copyright infringement rules, which are well-established in US copyright law but absent in EU law, to all parties that contribute to digital piracy. The CJEU has expressly opened this path in the recent C-682/18 YouTube and C-683/18 Cyando cases. Based on the CJEU's initiative and the existing precedent of harmonising intellectual property tort rules within EU law, further contributory liability rules could be modelled after US rules by updating the Enforcement Directive 2004/48/EC. Addressing this gap in EU copyright law is crucial for enhancing the effectiveness of digital copyright enforcement against evolving digital piracy.
... A line of works [37], [41], [62] have revealed cryptojacking wherein device computing resources are abused by miscreants for cryptocurrency mining. In addition, another abuse scenario is the unauthorized monetization of residential and mobile devices into web proxies to relay third-party network traffic [59], [60]. Moving forward from these studies, we reveal for the first time how video viewers' devices and network resources can be consumed without user consent to serve the video streaming services and third-party PDN providers. ...
As an emerging service for in-browser content delivery, peer-assisted delivery network (PDN) is reported to offload up to 95% of bandwidth consumption for video streaming, significantly reducing the cost incurred by traditional CDN services. With such benefits, PDN services significantly impact today's video streaming and content delivery model. However, their security implications have never been investigated. In this paper, we report the first effort to address this issue, which is made possible by a suite of methodologies, e.g., an automatic pipeline to discover PDN services and their customers, and a PDN analysis framework to test the potential security and privacy risks of these services. Our study has led to the discovery of 3 representative PDN providers, along with 134 websites and 38 mobile apps as their customers. Most of these PDN customers are prominent video streaming services with millions of monthly visits or app downloads (from Google Play). Also found in our study are another 9 top video/live streaming websites with each equipped with a proprietary PDN solution. Most importantly, our analysis on these PDN services has brought to light a series of security risks, which have never been reported before, including free riding of the public PDN services, video segment pollution, exposure of video viewers' IPs to other peers, and resource squatting. All such risks have been studied through controlled experiments and measurements, under the guidance of our institution's IRB. We have responsibly disclosed these security risks to relevant PDN providers, who have acknowledged our findings, and also discussed the avenues to mitigate these risks.
... Chiapponi et al. [15] performed a mathematical analysis of IP addresses hypothesized to belong to RESIP providers, examining the repetitions of IP addresses. Study and detection of software used in devices to enroll them as RESIP gateways have been investigated in recent publications [29,34]. ...
Web scraping bots are now using so-called Residential IP Proxy (RESIP) services to defeat state-of-the-art commercial bot countermeasures. RESIP providers promise their customers access to tens of millions of residential IP addresses, which belong to legitimate users. They dramatically complicate the task of existing anti-bot solutions and give the upper hand to malicious actors. New, specific detection methods are needed to identify and stop scrapers from taking advantage of these parties. This work, thanks to a 4-month-long experiment, validates the feasibility, soundness, and practicality of a detection method based on network measurements. This technique enables contacted servers to identify whether an incoming request comes directly from a client device or has been proxied through another device.
Keywords: Web scraping, Residential IP Proxy, RESIP, Round trip time measurement, TLS, Security, Bots
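The abstract does not spell out the decision rule, but its keywords (round trip time measurement, TLS) suggest comparing timings observed at different protocol layers. The sketch below illustrates one such rule under a stated assumption: when a relay sits in the path, the TCP handshake terminates at the relay while TLS is negotiated end to end, so the TLS-level RTT includes an extra hop. The function name `looks_proxied`, the 50 ms gap, and the sample values are all hypothetical and are not the authors' method or parameters.

```python
from statistics import median
from typing import Sequence


def looks_proxied(tcp_rtts_ms: Sequence[float],
                  tls_rtts_ms: Sequence[float],
                  min_gap_ms: float = 50.0) -> bool:
    """Flag a connection when the TLS RTT clearly exceeds the TCP RTT.

    Assumption: with a relay in the path, TCP terminates at the relay while
    TLS is negotiated end to end, so the TLS RTT includes an extra hop.
    """
    if not tcp_rtts_ms or not tls_rtts_ms:
        raise ValueError("need at least one sample of each RTT")
    # Medians damp the effect of a single delayed packet.
    gap = median(tls_rtts_ms) - median(tcp_rtts_ms)
    return gap >= min_gap_ms


# Hypothetical measurements in milliseconds.
direct = looks_proxied([20.1, 19.8, 21.0], [22.0, 21.5, 23.1])
relayed = looks_proxied([20.1, 19.8, 21.0], [140.2, 151.7, 138.9])
```

A fixed gap is the simplest possible rule; a real deployment would likely need a threshold calibrated against measured traffic.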
... As explained in the works of Mi et al. [13], [14], RESIP infrastructures are built taking advantage of mobile SDKs included by developers in all kinds of applications in exchange for a fee per installed app. A percentage of these networks are composed of infected devices, such as IoT ones. ...
Network middleboxes are important components in modern networking systems, impacting approximately 40% of network paths according to recent studies [1]. This survey paper delves into their endemic presence, enriches the original 2002 RFC with over two decades of findings, and emphasizes the significance of their impact in terms of security and performance. Furthermore, it categorizes network middleboxes based on their functions, objectives, and alterations. In today’s world, network middleboxes emerge as a dual-edged sword. While important for network operations, they also pose security risks. We present the various challenges they introduce, including their contribution to Internet ossification, their potential for censorship, monitoring, and traffic differentiation. Substantial effort remains to make their presence more visible to end users. This paper explores potential solutions, ranging from prevention and detection to curative measures. Ultimately, we aim to establish this survey as a foundational resource for addressing challenges revolving around the notion of network middleboxes, thereby fostering further research and innovation in this area.
An emerging Internet business is residential proxy (RESIP) as a service, in which a provider utilizes the hosts within residential networks (in contrast to those running in a datacenter) to relay their customers' traffic, in an attempt to avoid server-side blocking and detection. Despite the prominent roles these services could play in the underground business world, little has been done to understand whether they are indeed involved in cybercrimes and how they operate, due to the challenges in identifying their RESIPs, not to mention any in-depth analysis on them. In this paper, we report the first study on RESIPs, which sheds light on the behaviors and the ecosystem of these elusive gray services. Our research employed an infiltration framework, including our clients for RESIP services and the servers they visited, to detect 6 million RESIP IPs across 230+ countries and 52K+ ISPs. The observed addresses were analyzed and the hosts behind them were further fingerprinted using a new profiling system. Our effort led to several surprising findings about the RESIP services unknown before. Despite the providers' claim that the proxy hosts joined willingly, many proxies run on likely compromised hosts including IoT devices. Through cross-matching the hosts we discovered and labeled PUP (potentially unwanted programs) logs provided by a leading IT company, we uncovered various illicit operations RESIP hosts performed, including illegal promotion, fast fluxing, phishing, malware hosting, and others. We also reverse engineered RESIP services' internal infrastructures and uncovered their potential rebranding and reselling behaviors. Our research takes the first step toward understanding this new Internet service, contributing to the effective control of their security risks.
Free web proxies promise anonymity and censorship circumvention at no cost. Several websites publish lists of free proxies organized by country, anonymity level, and performance. These lists index hundreds of thousands of hosts discovered via automated tools and crowd-sourcing. A complex free proxy ecosystem has been forming over the years, of which very little is known. In this paper we shed light on this ecosystem via ProxyTorrent, a distributed measurement platform that leverages both active and passive measurements. Active measurements discover free proxies, assess their performance, and detect potential malicious activities. Passive measurements relate to proxy performance and usage in the wild, and are collected by free proxy users via a Chrome plugin we developed. ProxyTorrent has been running since January 2017, monitoring up to 180,000 free proxies and totaling more than 1,500 users over a 10-month period. Our analysis shows that less than 2% of the proxies announced on the Web indeed proxy traffic on behalf of users; further, only half of these proxies have decent performance and can be used reliably. Around 10% of the working proxies exhibit malicious behaviors, e.g., ads injection and TLS interception, and these proxies are also the ones providing the best performance. Through the analysis of more than 2 Terabytes of proxied traffic, we show that web browsing is the primary user activity. Geo-blocking avoidance is not a prominent use-case, with the exception of proxies located in countries hosting popular geo-blocked content.
In this paper, we propose a novel Android malware detection system that uses a deep convolutional neural network (CNN). Malware classification is performed based on static analysis of the raw opcode sequence from a disassembled program. Features indicative of malware are automatically learned by the network from the raw opcode sequence, thus removing the need for hand-engineered malware features. The training pipeline of our proposed system is much simpler than existing n-gram based malware detection methods, as the network is trained end-to-end to jointly learn appropriate features and to perform classification, thus removing the need to explicitly enumerate millions of n-grams during training. The network design also allows the use of long n-gram like features, not computationally feasible with existing methods. Once trained, the network can be efficiently executed on a GPU, allowing a very large number of files to be scanned quickly.
Open proxies forward traffic on behalf of any Internet user. Listed on open proxy aggregator sites, they are often used to bypass geographic region restrictions or circumvent censorship. Open proxies sometimes also provide a weak form of anonymity by concealing the requestor's IP address.
To better understand their behavior and performance, we conducted a comprehensive study of open proxies, encompassing more than 107,000 listed open proxies and 13M proxy requests over a 50-day period. While previous studies have focused on malicious open proxies' manipulation of HTML content to insert/modify ads, we provide a broader study that examines the availability, success rates, diversity, and also (mis)behavior of proxies.
Our results show that listed open proxies suffer poor availability---more than 92% of open proxies that appear on aggregator sites are unresponsive to proxy requests. Much more troubling, we find numerous examples of malicious open proxies in which HTML content is manipulated to mine cryptocurrency (that is, cryptojacking). We additionally detect TLS man-in-the-middle (MitM) attacks, and discover numerous instances in which binaries fetched through proxies were modified to include remote access trojans and other forms of malware. As a point of comparison, we conduct and discuss a similar measurement study of the behavior of Tor exit relays. We find no instances in which Tor relays performed TLS MitM or manipulated content, suggesting that Tor offers a far more reliable and safe form of proxied communication.
Third-party libraries on Android have been shown to be security and privacy hazards by adding security vulnerabilities to their host apps or by misusing inherited access rights. Correctly attributing improper app behavior either to app or library developer code or isolating library code from their host apps would be highly desirable to mitigate these problems, but is impeded by the absence of a third-party library detection that is effective and reliable in spite of obfuscated code. This paper proposes a library detection technique that is resilient against common code obfuscations and that is capable of pinpointing the exact library version used in apps. Libraries are detected with profiles from a comprehensive library database that we generated from the original library SDKs. We apply our technique to the top apps on Google Play and their complete histories to conduct a longitudinal study of library usage and evolution in apps. Our results particularly show that app developers only slowly adopt new library versions, exposing their end-users to large windows of vulnerability. For instance, we discovered that two long-known security vulnerabilities in popular libs are still present in the current top apps. Moreover, we find that misuse of cryptographic APIs in advertising libs, which increases the host apps' attack surface, affects 296 top apps with a cumulative install base of 3.7bn devices according to Play. To the best of our knowledge, our work is the first to quantify the security impact of third-party libs on the Android ecosystem.
We measure the prevalence and uses of TLS proxies using a Flash tool deployed with a Google AdWords campaign. We generate 2.9 million certificate tests and find that 1 in 250 TLS connections are TLS-proxied. The majority of these proxies appear to be benevolent; however, we identify over 1,000 cases where three malware products are using this technology nefariously. We also find numerous instances of negligent, duplicitous, and suspicious behavior, some of which degrade security for users without their knowledge. Distinguishing these types of practices is challenging in practice, indicating a need for transparency and user awareness.
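A certificate test of the kind mentioned above can be reduced to one comparison: does the certificate the client received match the one the server actually serves? The sketch below assumes both certificates are available as DER-encoded bytes and compares their SHA-256 fingerprints. The function names and the placeholder byte strings are hypothetical, and the abstract does not describe the authors' tool at this level of detail.

```python
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(cert_der).hexdigest()


def is_tls_proxied(served_cert_der: bytes, observed_cert_der: bytes) -> bool:
    """True when the client saw a different certificate than the server sent."""
    return fingerprint(served_cert_der) != fingerprint(observed_cert_der)


# Placeholder byte strings stand in for real DER certificates.
served = b"genuine-certificate-bytes"
untouched = is_tls_proxied(served, b"genuine-certificate-bytes")
intercepted = is_tls_proxied(served, b"substitute-certificate-bytes")
```

A mismatch only shows that something in the path substituted the certificate; telling a benevolent proxy from a malicious one requires inspecting the substitute certificate itself, which is the harder problem the abstract points to.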
Detecting violations of application-level end-to-end connectivity on the Internet is of significant interest to researchers and end users; recent studies have revealed cases of HTTP ad injection and HTTPS man-in-the-middle attacks. Unfortunately, detecting such end-to-end violations at scale remains difficult, as it generally requires having the cooperation of many nodes spread across the globe. Most successful approaches have relied either on dedicated hardware, user-installed software, or privileged access to a popular web site. In this paper, we present an alternate approach for detecting end-to-end violations based on Luminati, an HTTP/S proxy service that routes traffic through millions of end hosts. We develop measurement techniques that allow Luminati to be used to detect end-to-end violations of DNS, HTTP, and HTTPS, and, in many cases, enable us to identify the culprit. We present results from over 1.2m nodes across 14k ASes in 172 countries, finding that up to 4.8% of nodes are subject to some type of end-to-end connectivity violation. Finally, we are able to use Luminati to identify and measure the incidence of content monitoring, where end-host software or ISP middleboxes record users' HTTP requests and later re-download the content to third-party servers.
We present a growing collection of Android Applications collected from several sources, including the official Google Play app market. Our dataset, AndroZoo, currently contains more than three million apps, each of which has been analysed by tens of different Antivirus products to know which applications are detected as Malware. We provide this dataset to contribute to ongoing research efforts, as well as to enable new potential research topics on Android Apps. By releasing our dataset to the research community, we also aim at encouraging our fellow researchers to engage in reproducible experiments.
We present LibRadar, a tool that is able to detect third-party libraries used in an Android app accurately and instantly. As third-party libraries are widely used in Android apps, program analysis on Android apps typically needs to detect or remove third-party libraries first in order to function correctly or provide accurate results. However, most previous studies employ a whitelist of package names of known libraries, which is incomplete and unable to deal with obfuscation. In contrast, LibRadar detects libraries based on stable API features that are obfuscation resilient in most cases. After analyzing one million free Android apps from Google Play, we have identified possible libraries and collected their unique features. Based on these features, LibRadar can detect third-party libraries in a given Android app within seconds, as it only requires simple static analysis and fast comparison. LibRadar is available for public use at http://radar.pkuos.org. The demo video is available at: https://youtu.be/GoMYjYxsZnI
As smartphones and mobile devices are rapidly becoming indispensable for many network users, mobile malware has become a serious threat to network security and privacy. Especially on the popular Android platform, many malicious apps are hiding among a large number of normal apps, which makes malware detection more challenging. In this paper, we propose an ML-based method that utilizes more than 200 features extracted from both static analysis and dynamic analysis of Android apps for malware detection. The comparison of modeling results demonstrates that the deep learning technique is especially suitable for Android malware detection and can achieve an accuracy of 96% with real-world Android application sets.
Recently, the threat of Android malware has been spreading rapidly, especially that of repackaged Android malware. Although understanding Android malware using dynamic analysis can provide a comprehensive view, it is still subject to high costs in environment deployment and manual effort in investigation. In this study, we propose a static feature-based mechanism to provide a static analysis paradigm for detecting Android malware. The mechanism considers static information including permissions, deployment of components, Intent message passing and API calls for characterizing the behavior of Android applications. In order to recognize different intentions of Android malware, different kinds of clustering algorithms can be applied to enhance the malware modeling capability. Besides, we leverage the proposed mechanism and develop a system called DroidMat. First, DroidMat extracts the information (e.g., requested permissions, Intent message passing, etc.) from each application's manifest file, and regards components (Activity, Service, Receiver) as entry points, drilling down to trace API calls related to permissions. Next, it applies the K-means algorithm, which enhances the malware modeling capability. The number of clusters is decided by the Singular Value Decomposition (SVD) method on the low rank approximation. Finally, it uses the kNN algorithm to classify the application as benign or malicious. The experimental results show that the recall rate of our approach is better than that of a well-known tool, Androguard, published at Black Hat 2011, which focuses on Android malware analysis. In addition, DroidMat is efficient, since it takes only half the time Androguard takes to classify 1738 apps as benign apps or Android malware.
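The final classification step named in the abstract, kNN, can be illustrated in a few lines. This is a minimal pure-Python sketch of majority-vote k-nearest-neighbour classification, not the system's actual implementation: the feature encoding (binary indicators for three Android permissions), the training samples, and the name `knn_classify` are hypothetical, and the K-means and SVD stages are omitted.

```python
from collections import Counter
from math import dist
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]


def knn_classify(train: List[Sample], query: Sequence[float], k: int = 3) -> str:
    """Label `query` by majority vote among its k nearest training samples."""
    if k < 1 or k > len(train):
        raise ValueError("k must be between 1 and the training set size")
    # Sort by Euclidean distance to the query and keep the k closest samples.
    nearest = sorted(train, key=lambda s: dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical binary features: (SEND_SMS, READ_CONTACTS, INTERNET).
training = [
    ((1, 1, 1), "malicious"),
    ((1, 0, 1), "malicious"),
    ((1, 1, 0), "malicious"),
    ((0, 0, 1), "benign"),
    ((0, 0, 0), "benign"),
    ((0, 1, 0), "benign"),
]

verdict_a = knn_classify(training, (1, 1, 1))
verdict_b = knn_classify(training, (0, 0, 0))
```

An odd `k` with two labels avoids tied votes, which is one reason small odd values are a common default.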
In this paper we present Netalyzr, a network measurement and debugging service that evaluates the functionality provided by people's Internet connectivity. The design aims to prove both comprehensive in terms of the properties we measure and easy to employ and understand for users with little technical background. We structure Netalyzr as a signed Java applet (which users access via their Web browser) that communicates with a suite of measurement-specific servers. Traffic between the two then probes for a diverse set of network properties, including outbound port filtering, hidden in-network HTTP caches, DNS manipulations, NAT behavior, path MTU issues, IPv6 support, and access-modem buffer capacity. In addition to reporting results to the user, Netalyzr also forms the foundation for an extensive measurement of edge-network properties. To this end, along with describing Netalyzr's architecture and system implementation, we present a detailed study of 130,000 measurement sessions that the service has recorded since we made it publicly available in June 2009.
Michael Grace, Yajin Zhou, Qiang Zhang, Shihong Zou, and Xuxian Jiang. RiskRanker: scalable and accurate zero-day Android malware detection. In Proceedings of the 10th International Conference on Mobile Systems, Applications, and Services, pages 281-294. ACM, 2012.
Bradley Reaves, Ethan Shernan, Adam Bates, Henry Carter, and Patrick Traynor. Boxed out: Blocking cellular interconnect bypass fraud at the network edge. In 24th USENIX Security Symposium (USENIX Security 15), pages 833-848, 2015.
Giorgos Tsirantonakis, Panagiotis Ilia, Sotiris Ioannidis, Elias Athanasopoulos, and Michalis Polychronakis. A large-scale analysis of content modification by open HTTP proxies. In NDSS, 2018.
Yin Zhang and Vern Paxson. Detecting stepping stones. In USENIX Security Symposium, pages 171-184, 2000.