About
Publications: 17
Reads: 4,933
Citations: 1,109
Introduction
Current institution: Amazon
Current position: Researcher
Publications (17)
We address the problem of serving Deep Neural Networks (DNNs) efficiently from a cluster of GPUs. In order to realize the promise of very low-cost processing made by accelerators such as GPUs, it is essential to run them at sustained high utilization. Doing so requires cluster-scale resource management that performs detailed scheduling of GPUs, rea...
Scalable frameworks such as TensorFlow, MXNet, Caffe, and PyTorch drive the current popularity and utility of deep learning. However, these frameworks are optimized for a narrow range of server-class GPUs, and deploying workloads to other platforms such as mobile phones, embedded devices, and specialized accelerators (e.g., FPGAs, ASICs) requires l...
Recent advances have enabled "oracle" classifiers that can classify across many classes and input distributions with high accuracy without retraining. However, these classifiers are relatively heavyweight, so that applying them to classify video is costly. We show that day-to-day video exhibits highly skewed class distributions over the short term,...
Recent advances have enabled "oracle" classifiers that can classify across many classes and input distributions with high accuracy without retraining. However, these classifiers are relatively heavyweight, so that applying them to classify video is costly. We show that day-to-day video exhibits highly skewed class distributions over the short term,...
We consider applying computer vision to video on cloud-backed mobile devices using Deep Neural Networks (DNNs). The computational demands of DNNs are high enough that, without careful resource management, such applications strain device battery, wireless data, and cloud cost budgets. We pose the corresponding resource management problem, which we c...
Cloud-based file synchronization services such as Dropbox are a worldwide resource for many millions of users. However, individual services often have tight resource limits, suffer from temporary outages or shutdowns, and sometimes silently corrupt or leak user data. As a solution, the authors design, implement, and evaluate MetaSync, a secure and...
Always-on continuous sensing apps drain the battery quickly because they prevent the main processor from sleeping. Instead, sensor hub hardware, available in many smartphones today, can run continuous sensing at lower power while keeping the main processor idle. However, developers have to divide functionality between the main processor and the sen...
Visualizing NLP annotation is useful for collecting training data for statistical NLP approaches. Existing toolkits either provide limited visual aid or introduce comprehensive operators to realize sophisticated linguistic rules. Workers must be well trained to use them. Their audience thus can hardly be scaled to large amounts of non-e...
Mobile Web page loads are notoriously slow due to limited computing power and slow network access. Our preliminary experiments show that computation is a significant fraction of page load time on mobile devices. Also, energy arguments suggest that it will stay this way. To compensate for the limited computing power, our position is that offloading port...
Enabling flexible spectrum access (FSA) in existing wireless networks is challenging due to limited spectrum programmability – the ability to change the spectrum properties of a signal to match an arbitrary frequency allocation. This paper argues that spectrum programmability can be separated from general wireless physical layer (PHY) modulation. T...
Retransmissions reduce the efficiency of data communication in wireless networks because of: (i) per-retransmission packet headers, (ii) contention overhead on every retransmission, and (iii) redundant bits in every retransmission. In fact, every retransmission nearly doubles the time to successfully deliver the packet. To improve spectrum efficien...
This demonstration shows a novel virtualization architecture, called Multi-Purpose Access Point (MPAP), which can virtualize multiple heterogeneous wireless standards based on software radio. The basic idea is to deploy a wide-band radio front-end to receive wireless signals from all wireless standards sharing the same spectrum band, and use separat...
This demonstration shows a novel virtualization architecture, called Multi-Protocol Access Point (MPAP), which exploits software radio technology to virtualize multiple heterogeneous wireless standards on a single piece of radio hardware. The basic idea is to deploy a wide-band radio front-end to receive radio signals from all wireless standards sharing th...
This paper presents an architecture to support dynamic spectrum access (DSA) in general wireless networks. Our architecture advocates a new spectrum virtualization layer (SVL), directly below the wireless physical layer (PHY or baseband processing). We call it Layer 0.5. SVL presents a virtual baseband to traditional wireless PHY, which is designed...
This demonstration shows a novel virtualization architecture, called Multi-Purpose Access Point (MPAP), which can virtualize multiple heterogeneous wireless standards based on software radio. The basic idea is to deploy a wide-band radio front-end to receive wireless signals from all wireless standards sharing the same spectrum band, and use separat...
This demonstration shows a novel virtualization architecture, called Multi-Purpose Access Point (MPAP), which can virtualize multiple heterogeneous wireless standards based on software radio. The basic idea is to deploy a wide-band radio front-end to receive wireless signals from all wireless standards sharing the same spectrum band, and use separat...