November 2024 · 5 Reads · 1 Citation
November 2024 · 5 Reads · 1 Citation
November 2024 · 12 Reads · 1 Citation
ACM Computing Surveys
Traditionally, deep learning practitioners would bring data into a central repository for model training and inference. Recent developments in distributed learning, such as federated learning and deep learning as a service (DLaaS), do not require centralized data and instead push computing to where the distributed datasets reside. These decentralized training schemes, however, introduce additional security and privacy challenges. This survey first structures the field of distributed learning into two main paradigms and then provides an overview of the recently published protective measures for each. This work highlights both secure training methods and private inference measures. Our analyses show that recent publications, while being highly dependent on the problem definition, report progress in terms of security, privacy, and efficiency. Nevertheless, we also identify several open issues within the private and secure distributed deep learning (PSDDL) field that require more research. We discuss these issues and provide a general overview of how they might be resolved.
August 2024 · 4 Reads · 1 Citation
July 2024 · 12 Reads
March 2024 · 7 Reads
February 2024 · 37 Reads · 2 Citations
Real-Time Systems
While high accuracy is of paramount importance for deep learning (DL) inference, serving inference requests on time is equally critical but has not been carefully studied, especially when the request has to be served over a dynamic wireless network at the edge. In this paper, we propose Jellyfish, a novel edge DL inference serving system that achieves soft guarantees for end-to-end inference latency service-level objectives (SLO). Jellyfish handles network variability by utilizing both data and deep neural network (DNN) adaptation to trade off accuracy against latency. Jellyfish features a new design that enables collective adaptation policies, where the decisions for data and DNN adaptations are aligned and coordinated among multiple users with varying network conditions. We propose efficient algorithms to continuously map users and adapt DNNs at runtime, so that we fulfill latency SLOs while maximizing the overall inference accuracy. We further investigate dynamic DNNs, i.e., DNNs that encompass multiple architecture variants, and demonstrate their potential benefit through preliminary experiments. Our experiments based on a prototype implementation and real-world WiFi and LTE network traces show that Jellyfish can meet latency SLOs at around the 99th percentile while maintaining high accuracy.
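The core tradeoff the abstract describes can be sketched as a simple feasibility search: among profiled DNN variants (each pairing an input resolution with a model size), pick the most accurate one whose upload time plus server compute still fits the latency SLO under the currently estimated bandwidth. This is an illustrative sketch, not Jellyfish's actual scheduling algorithm; all names, profiles, and numbers are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    name: str
    accuracy: float      # profiled offline, e.g. mAP
    compute_ms: float    # profiled server-side inference time
    input_kb: float      # size of the (possibly downscaled) input frame

def pick_variant(variants, bandwidth_kbps, slo_ms):
    """Choose the most accurate variant whose end-to-end latency
    (upload time + server compute) still fits the latency SLO."""
    feasible = []
    for v in variants:
        upload_ms = v.input_kb * 8.0 / bandwidth_kbps * 1000.0
        if upload_ms + v.compute_ms <= slo_ms:
            feasible.append(v)
    return max(feasible, key=lambda v: v.accuracy) if feasible else None

# Hypothetical profiles: larger inputs/models are more accurate but slower.
variants = [
    Variant("yolo-large-544px", 0.80, 60.0, 120.0),
    Variant("yolo-med-416px",   0.74, 35.0,  70.0),
    Variant("yolo-small-320px", 0.65, 18.0,  40.0),
]
best = pick_variant(variants, bandwidth_kbps=4000, slo_ms=100)
```

When bandwidth drops, the feasible set shrinks toward the small variants; the coordination problem Jellyfish solves is doing this jointly across many users sharing one serving pipeline.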
December 2023 · 5 Reads
December 2023 · 1 Read · 1 Citation
December 2023 · 5 Reads
June 2023 · 21 Reads
Nowadays, the deployment of deep learning models on edge devices for addressing real-world classification problems is becoming more prevalent. Moreover, early classification, a technique that classifies the input data after observing only an early portion of it, is growing in popularity because it reduces communication and computation requirements, which are crucial parameters in edge intelligence environments. While early classification in the field of time series analysis has been broadly researched, existing solutions for multivariate time series problems primarily focus on early classification along the temporal dimension, treating the multiple input channels in a collective manner. In this study, we propose a more flexible early classification pipeline that offers a more granular consideration of input channels and extends the early classification paradigm to the channel dimension. To implement this method, we utilize reinforcement learning techniques and introduce constraints to ensure the feasibility and practicality of our objective. To validate its effectiveness, we conduct experiments using synthetic data and we also evaluate its performance on real datasets. The comprehensive results from our experiments demonstrate that, for multiple datasets, our method can enhance the early classification paradigm by achieving improved accuracy for equal input utilization.
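The channel-dimension idea can be illustrated with a much simpler greedy baseline than the paper's reinforcement-learning policy: reveal input channels one at a time and halt as soon as a running classifier is confident enough, so easy inputs consume fewer channels. Everything here (the toy `classify` stand-in, the threshold, the data) is an illustrative assumption.

```python
def classify(observed):
    """Toy stand-in for a classifier over the channels seen so far:
    returns (label, confidence); confidence grows with evidence."""
    score = sum(observed.values())
    label = "positive" if score >= 0 else "negative"
    confidence = min(1.0, 0.5 + 0.1 * len(observed) + 0.05 * abs(score))
    return label, confidence

def early_classify(channels, threshold=0.9):
    """Reveal channels one at a time; halt early once confident enough.
    Returns (label, number_of_channels_used)."""
    observed = {}
    for name, value in channels.items():
        observed[name] = value
        label, conf = classify(observed)
        if conf >= threshold:
            return label, len(observed)
    return label, len(observed)  # exhausted all channels

label, used = early_classify({"ch1": 3.0, "ch2": 2.5, "ch3": -0.1, "ch4": 0.2})
```

A learned policy improves on this greedy rule by also choosing *which* channel to acquire next, which is where the reinforcement-learning formulation comes in.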
... Currently, programmable switches and smart NICs are designed primarily for high-speed packet forwarding, not for complex computations: they have limited memory, computational power, and processing stages. For example, a high-performance switch like Intel Tofino offers 12-20 stages in its pipeline [72], and these must be shared between essential network functions (e.g., routing, ACLs) and in-network computations. Also, depending on their position, network devices must handle from hundreds to millions of packets per second, making the parallel execution of different services problematic. ...
November 2024
... Almusawi et al. (2023) argue that although initially designed for two-party scenarios, Yao's garbled circuits have been extended to multiparty settings, facilitating collaborative computations without compromising individual data privacy. The Goldreich-Micali-Wigderson (GMW) protocol further enhances these capabilities by incorporating secret sharing and oblivious transfer for secure Boolean circuit computation (Allaart et al., 2024; Mayeke et al., 2024). Additionally, additive and multiplicative secret sharing schemes allow for secure arithmetic operations, a crucial feature in financial applications requiring aggregation and analysis of sensitive transactional data (Ali et al., 2024; Olutimehin, 2025a). ...
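The additive secret sharing mentioned in the snippet is easy to sketch: a secret is split into random shares that sum to it modulo a large prime, and parties can add their local shares so that only the aggregate is ever reconstructed. This is a minimal textbook sketch (the modulus choice and party count are illustrative), not any of the cited systems.

```python
import random

P = 2**61 - 1  # a large prime modulus (illustrative choice)

def share(secret, n):
    """Split `secret` into n additive shares that sum to it mod P."""
    shares = [random.randrange(P) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

# Secure aggregation of two parties' inputs: each party distributes shares,
# share-holders add the shares they hold locally, and only the sum is revealed.
a_shares = share(123, 3)
b_shares = share(877, 3)
sum_shares = [(a + b) % P for a, b in zip(a_shares, b_shares)]
total = reconstruct(sum_shares)
```

Any strict subset of shares is uniformly random, so no coalition smaller than all n holders learns anything about an individual input; multiplication is the harder case, which is where GMW-style protocols and oblivious transfer enter.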
November 2024
ACM Computing Surveys
... The beacon is powered on constantly, with fixed duty-cycling known to all participating devices. In the bootstrap phase, each battery-free device first discovers the beacon [24], [20], [25]. Upon discovery, the beacon allocates a unique slot on a pre-defined cycle with a fixed number of slots to the battery-free device. ...
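The bootstrap step described above amounts to a tiny registration protocol: on discovery, the beacon hands each battery-free device a unique slot on a fixed-length cycle. A minimal sketch, with all names and the cycle length being illustrative assumptions:

```python
class Beacon:
    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.allocations = {}  # device_id -> slot index on the cycle

    def register(self, device_id):
        """Allocate a unique slot to a newly discovered device;
        a re-discovered device keeps its existing slot."""
        if device_id in self.allocations:
            return self.allocations[device_id]
        if len(self.allocations) >= self.num_slots:
            raise RuntimeError("cycle full: no free slot")
        slot = len(self.allocations)  # next free slot on the cycle
        self.allocations[device_id] = slot
        return slot

beacon = Beacon(num_slots=8)
beacon.register("dev-A")  # slot 0
beacon.register("dev-B")  # slot 1
```

Because the duty cycle is fixed and known to all devices, each device only needs to wake in its own slot, which is what makes the scheme workable under intermittent, harvested power.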
August 2024
... With the rapid advance in Artificial Intelligence (AI) technology and the growing demand for real-time complex environment perception, recurrent neural network-based in-sensor inference has become increasingly popular for performing efficient perception from large amounts of data [1-5]. However, recurrent neural networks (RNNs) typically require an extensive amount of computational resources to impute time-series data, which may consume a significant amount of energy and shorten the lifespan of the sensor. Most sensors lack a continuous and stable charging source and have limited energy. ...
December 2023
... Inspired by foundational research on latency modeling and resource allocation in edge-cloud environments, particularly the surveys conducted by Mao et al. [34], [35], we develop a closed-form latency expression specifically adapted to our multi-replica hybrid infrastructure. This formulation incorporates conventional latency factors such as computation, communication, and queuing delays, while also integrating empirical latency patterns drawn from recent experimental analyses [36], [37]. ...
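The decomposition named in the snippet (computation + communication + queuing) can be written as a generic sum of three terms. This sketch is not the cited closed-form expression; the M/M/1 queuing term and every parameter value are illustrative assumptions.

```python
def latency_ms(flops, gflops_per_s, payload_kb, bandwidth_mbps,
               arrival_rate, service_rate):
    """End-to-end latency as compute + transfer + queuing delay (ms)."""
    compute = flops / (gflops_per_s * 1e9) * 1000.0
    communicate = payload_kb * 8.0 / (bandwidth_mbps * 1000.0) * 1000.0
    # Mean sojourn time of an M/M/1 queue: W = 1 / (mu - lambda), in seconds.
    assert service_rate > arrival_rate, "queue must be stable"
    queuing = 1.0 / (service_rate - arrival_rate) * 1000.0
    return compute + communicate + queuing

# Illustrative numbers: a 2 GFLOP model on a 100 GFLOP/s replica,
# a 200 KB payload over a 50 Mbps link, 40 req/s into a 60 req/s server.
total = latency_ms(flops=2e9, gflops_per_s=100.0, payload_kb=200.0,
                   bandwidth_mbps=50.0, arrival_rate=40.0, service_rate=60.0)
```

With these numbers the three terms are 20 ms, 32 ms, and 50 ms respectively, which shows why queuing often dominates as a replica approaches saturation.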
December 2022
... Overlapping memory access with computation is a common technique used to tolerate slow data movement [17]. For example, ALCOP [19] utilizes the CUDA-provided asynchronous data movement API to explore multi-stage pipelining. ...
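The overlap technique in the snippet is software pipelining: issue the next tile's load while computing on the current tile. The sketch below simulates the asynchronous load with a worker thread; on a GPU the same shape appears as asynchronous copies feeding a multi-stage pipeline (as in the ALCOP example). The tile functions are toy stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

def load_tile(i):
    return list(range(i * 4, i * 4 + 4))  # stand-in for a slow memory read

def compute(tile):
    return sum(tile)                      # stand-in for the math on a tile

def pipelined_sum(num_tiles):
    total = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        next_tile = pool.submit(load_tile, 0)              # prologue: prefetch
        for i in range(num_tiles):
            tile = next_tile.result()                      # wait for current tile
            if i + 1 < num_tiles:
                next_tile = pool.submit(load_tile, i + 1)  # overlap the next load
            total += compute(tile)                         # compute while it loads
    return total
```

The prologue/steady-state structure is the two-stage case; deeper pipelines simply keep more loads in flight, at the cost of more buffer memory, which is exactly the shared-memory pressure that limits stage counts on real hardware.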
November 2022
ACM Computing Surveys
... To our knowledge, there have been a few preliminary attempts to achieve effective downstream adaptation of pretrained models while safeguarding data and models. For example, split learning (SL) [1, 44] divides a machine learning model across participants. [Figure 1: Comparison between different downstream task adaptation approaches.] ...
July 2022
... MiniRocket distinguishes itself from Rocket by computing features using a fixed set of k convolutional kernels with a shorter kernel length, which yields greater computational efficiency, and by refining the convolutional process through alterations to the kernels [83]. The MiniRocket transform calculates the Max and PPV features for each of the k fixed convolutional kernels. ...
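The two pooled features named in the snippet are easy to show for a single kernel: PPV is the proportion of positive values in the (bias-shifted) convolution output, and Max is its maximum. The kernel and bias below are illustrative, not drawn from MiniRocket's fixed kernel set.

```python
def convolve(series, kernel):
    """Valid-mode 1-D convolution (no padding, no dilation)."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]

def ppv_and_max(series, kernel, bias=0.0):
    """PPV (proportion of positive values) and Max for one kernel."""
    out = [c - bias for c in convolve(series, kernel)]
    ppv = sum(1 for c in out if c > 0) / len(out)
    return ppv, max(out)

series = [0.0, 1.0, 2.0, 1.0, 0.0, -1.0, -2.0, -1.0]
ppv, mx = ppv_and_max(series, kernel=[-1.0, 2.0, -1.0])
```

Sweeping the bias over quantiles of the convolution output, as MiniRocket does, turns a single kernel into several PPV features at almost no extra cost, since the convolution is computed once.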
May 2022
... Therefore, to perform operations such as object classification and identification, most existing video analytics systems employ cloud infrastructure (Lin et al. 2020). However, the performance of cloud-based video analytics systems is constrained by factors such as high network latency and bandwidth shortage (Apostolo et al. 2022). ...
April 2022
... SPINN [43] further incorporates early exit into DNN partitioning to allow the inference to exit early based on the input complexity. Feedback-based methods [19, 28, 52] rely on the feedback or result from the server to assist the local processing. Glimpse [19] periodically sends trigger frames to the server to obtain object recognition hints that assist local tracking. ...
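The early-exit idea can be sketched as a confidence test at an intermediate classifier head: if the cheap head is already confident, return its answer locally; otherwise continue to (or offload) the rest of the network. The two-stage "network" below is a toy stand-in, not SPINN's architecture; the threshold is an assumption.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def early_exit_infer(x, head, tail, threshold=0.8):
    """Return (predicted_label, exited_early)."""
    probs = softmax(head(x))
    if max(probs) >= threshold:
        return probs.index(max(probs)), True   # confident: exit at the head
    probs = softmax(tail(x))                   # hard input: run the full network
    return probs.index(max(probs)), False

# Toy classifiers: the head is confident on "easy" positive inputs only.
head = lambda x: [3.0, 0.0] if x > 0 else [0.1, 0.2]
tail = lambda x: [0.0, 4.0]
```

In a partitioned deployment the `tail` call is where offloading happens, so the confidence threshold directly trades local latency savings against accuracy on hard inputs.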
November 2021