Martin Arlitt’s research while affiliated with University of Calgary and other places


Publications (157)


Fig. 2: Isocost contours for $C_{score}$ for three different cost ratios, $r_c$. The $C_{score}$ corresponding to each contour is listed next to it. The black dotted line is the PR curve for a particular model.
Fig. 4: Variation of thresholds with different cost ratios for each dataset.
Fig. 5: Precision-Recall curve for different cost ratios.
Fig. 6: Contour plots with Precision and Recall.
Fig. 7: Percentage improvement in cost for the threshold obtained by minimizing the new cost score compared to the threshold obtained by maximizing $F_1$ score.
Is $F_1$ Score Suboptimal for Cybersecurity Models? Introducing $C_{score}$, a Cost-Aware Alternative for Model Assessment
  • Preprint
  • File available

July 2024 · 26 Reads · Asad Narayanan · Stephen Jou · [...] · Maria Pospelova

The costs of the two kinds of errors made by machine learning classifiers, namely false positives and false negatives, are not equal and are application dependent. For example, in cybersecurity applications, the cost of not detecting an attack is very different from the cost of marking a benign activity as an attack. Various design choices during machine learning model building, such as hyperparameter tuning and model selection, allow a data scientist to trade off between these two errors. However, most of the commonly used metrics for evaluating model quality, such as the $F_1$ score, which is defined in terms of model precision and recall, treat both errors equally, making it difficult for users to optimize for the actual cost of these errors. In this paper, we propose a new cost-aware metric, $C_{score}$, based on precision and recall, that can replace the $F_1$ score for model evaluation and selection. It includes a cost ratio that takes into account the differing costs of handling false positives and false negatives. We derive and characterize the new cost metric and compare it to the $F_1$ score. Further, we use this metric for model thresholding on five cybersecurity-related datasets for multiple cost ratios. The results show an average cost savings of 49%.
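The following is a minimal Python sketch of the thresholding idea the abstract describes, not the paper's method. The abstract does not give the $C_{score}$ formula, so the sketch substitutes a simple weighted error count, `fp + cost_ratio * fn`. That formula, the direction of `cost_ratio` (a false negative costs `cost_ratio` times a false positive), and all function names are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def confusion_counts(scores: Sequence[float], labels: Sequence[int],
                     threshold: float) -> Tuple[int, int, int]:
    """Return (tp, fp, fn) when predicting positive for score >= threshold."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1 = 2TP / (2TP + FP + FN); 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_cost(fp: int, fn: int, cost_ratio: float) -> float:
    """ASSUMPTION: stand-in cost, not the paper's C_score.
    A false negative is taken to cost `cost_ratio` false positives."""
    return fp + cost_ratio * fn


def best_threshold_f1(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Candidate thresholds are the observed scores; pick the F1 maximizer."""
    candidates = sorted(set(scores))
    return max(candidates,
               key=lambda t: f1_score(*confusion_counts(scores, labels, t)))


def best_threshold_cost(scores: Sequence[float], labels: Sequence[int],
                        cost_ratio: float) -> float:
    """Pick the candidate threshold that minimizes the stand-in cost."""
    candidates = sorted(set(scores))

    def cost_at(threshold: float) -> float:
        _, fp, fn = confusion_counts(scores, labels, threshold)
        return weighted_cost(fp, fn, cost_ratio)

    return min(candidates, key=cost_at)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    labels = [0, 0, 1, 0, 0, 1, 0, 1, 1]
    print("F1-optimal threshold:", best_threshold_f1(scores, labels))
    for ratio in (0.2, 5.0):
        print("cost ratio", ratio, "->",
              best_threshold_cost(scores, labels, ratio))
```

On this toy data the cost-minimizing threshold moves away from the $F_1$-optimal one as the assumed cost ratio changes. This is the kind of shift the abstract refers to when it applies the metric to model thresholding for multiple cost ratios.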



Trust Issue(r)s: Certificate Revocation and Replacement Practices in the Wild

March 2024 · 32 Reads · 2 Citations

Lecture Notes in Computer Science

Every time we use the web, we place our trust in X.509 certificates binding public keys to domain identities. However, for these certificates to be trustworthy, proper issuance, management, and timely revocations (in cases of compromise or misuse) are required. While great efforts have been placed on ensuring trustworthiness in the issuance of new certificates, there has been a scarcity of empirical studies on revocation management. This study offers the first comprehensive analysis of certificate replacements (CRs) of revoked certificates. It provides a head-to-head comparison of the CRs where the replaced certificate was revoked versus not revoked. Leveraging two existing datasets with overlapping timelines, we create a combined dataset containing 1.5 million CRs that we use to unveil valuable insights into the effect of revocations on certificate management. Two key questions guide our research: (1) the influence of revocations on certificate replacement behavior and (2) the effectiveness of revocations in fulfilling their intended purpose. Our statistical analysis reveals significant variations in revocation rates, retention rates, and post-revocation usage, shedding light on differences in Certificate Authorities’ (CAs) practices and subscribers’ decisions. Notably, a substantial percentage of revoked certificates were either observed or estimated to be used after revocation, raising concerns about key-compromise instances. Finally, our findings highlight shortcomings in existing revocation protocols and practices, emphasizing the need for improvements. We discuss ongoing efforts and potential solutions to address these issues, offering valuable guidance for enhancing the security and integrity of web communications.


On the Dark Side of the Coin: Characterizing Bitcoin Use for Illicit Activities

March 2024 · 17 Reads · 1 Citation

Lecture Notes in Computer Science

The decentralized nature of Bitcoin allows for pseudonymous money exchange beyond authorities’ control, contributing to its popularity for diverse illegal activities such as scams, ransomware attacks, money laundering, and black markets. In this paper, we characterize this landscape, providing insights into similarities and differences in the use of Bitcoin for such activities. Our analysis and the derived insights contribute to the understanding of Bitcoin transactions associated with illegal activities through three main aspects. First, our study offers a comprehensive characterization of money flows to and from Bitcoin addresses linked to different abuse categories, revealing variations in flow patterns and success rates. Second, our temporal analysis captures long-term trends and weekly patterns across categories. Finally, our analysis of outflow from reported addresses uncovers differences in graph properties and flow patterns among illicit addresses and between abuse categories. These findings provide valuable insights into the distribution, temporal dynamics, and interconnections within various categories of Bitcoin transactions related to illicit activities. The increased understanding of this landscape and the insights gained from this study offer important empirical guidance for informed decision-making and policy development in the ongoing effort to address the challenges presented by illicit activities within the cryptocurrency space.



A Retrospective on Campus Network Traffic Monitoring

July 2023 · 18 Reads

ACM SIGCOMM Computer Communication Review

On April 1, 2023 we stopped monitoring the traffic on our campus Internet link, nearly 20 years to the day since we first started doing so. During these two decades, we faced a vast array of issues that affected the collection, storage, analysis and backup of our monitoring data. In this paper we share some of our experiences, so that future networking researchers have an opportunity to learn from our successes as well as our many mistakes and misfortunes.



Trace-Driven Scaling of Microservice Applications

January 2023 · 117 Reads · 4 Citations

IEEE Access

The containerized microservices architecture is being increasingly used to build complex applications. To minimize operating costs, service providers typically rely on an auto-scaler to "right size" their infrastructure amid fluctuating workloads. The agile nature of microservice development and deployment requires an auto-scaler that does not require significant effort to derive resource allocation decisions. In this paper, we investigate reducing auto-scaler development effort along a number of dimensions. First, we focus on a technique that does not require an expert to develop a model, e.g., a queuing model or machine learning model, of the system and tweak the model as the underlying microservice application changes. Second, we explore ways to limit the number of workload patterns that need to be considered. Third, we study techniques to reduce the number of resource allocation scenarios that one has to explore before deploying the auto-scaler. To address these goals, we first analyze the workload of 24,000 real microservice applications and find that a small number of workload patterns dominate for any given application. These results suggest that auto-scaler design can be driven by this small subset of popular workload patterns thereby limiting effort. To limit the number of resource allocation scenarios explored, we develop a novel heuristic optimization technique called MOAT, which outperforms Bayesian Optimization often used for such exercises. We combine insights obtained from real microservice workloads and MOAT to realize an auto-scaler called TRIM that requires no system modeling. For each popular workload pattern identified for an application, TRIM uses MOAT to pre-compute a near minimal resource allocation that satisfies end user response time targets. These resource allocations are then used at runtime when appropriate. We validate our approach using a variety of analytical, on-premise, and public cloud systems. 
Our results show that TRIM, in concert with MOAT, significantly improves the performance of the industry-standard HPA auto-scaler, achieving up to 92% fewer response time violations and up to 34% lower costs than using HPA in isolation.




Citations (70)


... The Yahoo Finance platform was chosen because it offers free access to historical commodity price data such as crude oil with a broad period and reliable data quality. In addition, Yahoo Finance provides an easy-to-process data format for analysis and prediction purposes, making it an efficient and reliable source for academic and professional research (Rosenquist et al., 2024). Data is collected from 2000 to 2023 daily. ...

Reference:

Preparing Better Data for Oil Price Prediction Using Long Short-Term Memory
On the Dark Side of the Coin: Characterizing Bitcoin Use for Illicit Activities
  • Citing Chapter
  • March 2024

Lecture Notes in Computer Science

... Static allocation often leads to under-provisioning or over-provisioning, especially during sudden workload fluctuations. Rule-based scaling methods like HPA, though widely used, struggle to respond quickly to abrupt changes, resulting in performance bottlenecks and inefficient resource use [6], [7]. Moreover, HPA does not account for the interdependencies of microservices, which can lead to inefficient scaling decisions [8]. ...

Trace-Driven Scaling of Microservice Applications

IEEE Access

... In particular, they seem to have developed strong preferences for aggregation and customization. With the increase in the number of websites, users are increasingly looking for aggregation and easy dissemination of information from multiple sites [32], [33]. With respect to customization, some participants complained about the lack of flexibility in the order of the question clustering results. ...

Characterization of FriendFeed — A Web-based Social Aggregation Service
  • Citing Article
  • March 2009

Proceedings of the International AAAI Conference on Web and Social Media

... Big differences in CAs' revocation rates: While the overall revocation rates are similar to those observed in prior works (e.g., [19,23,29,48]), and revocations typically are relatively rare (all things considered), we observe big differences between the revocation rates of individual CAs, with revocation rates ranging from a fraction of a percent (Amazon and cPanel) to 17% (GoDaddy). ...

Temporal Analysis of X.509 Revocations and their Statuses
  • Citing Conference Paper
  • June 2022

... pandemic necessitated a significant increase in the delivery of online learning for healthcare workers and a shift towards the use of video-platforms such as Zoom and Teams, 35 ...

Zoomiversity: A Case Study of Pandemic Effects on Post-secondary Teaching and Learning
  • Citing Chapter
  • January 2022

Lecture Notes in Computer Science

... The second item was regarding the quality of the online sessions; 33.5% agreed that Internet quality was excellent and online sessions were uninterrupted. A prior study noted that online learning sessions and Zoom sessions had experienced disruptions due to high demand for a limited number of servers in different regions [30]. Also, 33.5% reported that e-learning platforms had been adequately explained and trained. ...

Zoom Session Quality: A Network-Level View
  • Citing Chapter
  • January 2022

Lecture Notes in Computer Science

... (CR) occurs. While replacements generally happen near the end of a certificate's validity period [5], some replacements are done with some margin ahead of the certificate expiry while others are made after expiry. We define a validity overlap as the intersection of the new certificate's validity period with that of the previous one. ...

Changing of the Guards: Certificate and Public Key Management on the Internet
  • Citing Chapter
  • January 2022

Lecture Notes in Computer Science

... Generative methods are also proposed in the literature to tackle the problem concerning the scarcity of abundant and publicly available datasets, and many works discuss the feasibility of deep architectures to generate new traffic data. Recent proposals (Dowoo et al., 2019; Cheng, 2019; Ring et al., 2019; Zingo and Novocin, 2020; Shahid et al., 2020; Madarasingha et al., 2022; Meslet-Millet et al., 2022; Nukavarapu et al., 2022; Sivaroopan et al., 2023; Nguyen et al., 2023) rely on GAN and DAE to generate entire traffic datasets at different granularity (long-time aggregate series, PCAP traces, flows, packets), while Xu et al. (2021) advance a method based on Convolutional Neural Networks to generate synthetic data at the flow level. Nonetheless, these proposals do not comprehensively address the problem of insufficient publicly available datasets, as they disregard privacy-oriented training methodologies. ...

STAN: Synthetic Network Traffic Generation with Generative Neural Models
  • Citing Chapter
  • September 2021

Communications in Computer and Information Science

... MockFog supports dynamic network changes and uses a dedicated cloud virtual machine for each compute node, which imposes fewer restrictions on the kinds of software it supports, yet such flexibility comes at a price: With a dedicated cloud virtual machine for each satellite server, we cannot achieve a cost-efficient emulation for large LEO constellations. Similar concerns can also be raised for further IoT and edge computing testbed tooling [4,5,20,34,35]. ...

Contention Aware Web of Things Emulation Testbed
  • Citing Conference Paper
  • April 2020