ABSTRACT: The Google Glass is a mobile device designed to be worn as eyeglasses. This form factor enables new usage possibilities, such as hands-free video chats and instant web search. However, its shape also hampers its potential: (1) battery size, and therefore lifetime, is limited by a need for the device to be lightweight, and (2) high-power processing leads to significant heat, which should be limited, due to the Glass' compact form factor and close proximity to the user's skin. We use the Glass in a case study of the power and thermal characteristics of optical head-mounted display devices. We share insights and implications to limit power consumption to increase the safety and utility of head-mounted devices.
ABSTRACT: Mobile System-on-Chips (SoC) that incorporate heterogeneous coherence domains promise high energy efficiency to a wide range of mobile applications, yet are difficult to program. To exploit the architecture, a desirable, yet missing capability is to replicate operating system (OS) services over multiple coherence domains with minimum inter-domain communication. In designing such an OS, we set three goals: to ease application development, to simplify OS engineering, and to preserve the current OS performance. To this end, we identify a shared-most OS model for multiple coherence domains: creating per-domain instances of core OS services with no shared state, while enabling other extended OS services to share state across domains. To test the model, we build K2, a prototype OS on the TI OMAP4 SoC, by reusing most of the Linux 3.4 source. K2 presents a single system image to applications with its two kernels running on top of the two coherence domains of OMAP4. The two kernels have independent instances of core OS services, such as page allocator and interrupt management, as coordinated by K2; the two kernels share most extended OS services, such as device drivers, whose state is kept coherent transparently by K2. Despite platform constraints and unoptimized code, K2 improves energy efficiency for light OS workloads by 8x-10x, while incurring less than 6% performance overhead for a device driver shared between kernels. Our experiences with K2 show that the shared-most model is promising.
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems; 02/2014
ABSTRACT: A recent study showed that while US consumers spent 30% more time on mobile apps than on traditional web, advertisers spent 1600% less money on mobile ads. One key reason is that unlike most web ad providers, today's mobile ads are not contextual: they do not take into account the content of the page they are displayed on. Thus, most mobile ads are irrelevant to what the user is interested in. For example, it is not uncommon to see gambling ads being displayed in a Bible app. This irrelevance results in low clickthrough rates, and hence advertisers shy away from the mobile platform. Using data from the top 1200 apps in the Windows Phone marketplace, and a one-week trace of ad keywords from Microsoft's ad network, we show that content displayed by mobile apps is a potential goldmine of keywords that advertisers are interested in. However, unlike web pages, which can be crawled and indexed offline for contextual advertising, content shown on mobile apps is often either generated dynamically or embedded in the apps themselves, and hence cannot be crawled. The only solution is to scrape the content at runtime, extract keywords and fetch contextually relevant ads. The challenge is to do this without excessive overhead and without violating user privacy. In this paper, we describe a system called SmartAds to address this challenge. We have built a prototype of SmartAds for Windows Phone apps. In a large user study with over 5000 ad impressions, we found that SmartAds nearly doubles the relevance score, while consuming minimal additional resources and preserving user privacy.
Proceeding of the 11th annual international conference on Mobile systems, applications, and services; 06/2013
ABSTRACT: Mobile systems are embracing heterogeneous architectures by getting more
types of cores and more specialized cores, which allows applications to be
faster and more efficient. We aim at exploiting the hardware heterogeneity from
the browser without requiring any changes to either the OS or the web
applications. Our design, Guadalupe, can use hardware processing units with
different degrees of capability for matched browser services. It starts with a
weak hardware unit, determines if and when a strong unit is needed, and
seamlessly migrates to the strong one when necessary. Guadalupe not only makes
more computing resources available to mobile web browsing but also improves its
energy proportionality. Based on Chrome for Android and TI OMAP4, we provide a
prototype browser implementation for resource loading and rendering. Compared
to Chrome for Android, we show that using Guadalupe for rendering can
increase the frame rate of other 3D applications by up to 767% and save 4.7% of the
entire system's energy consumption. More importantly, by using the two cases,
we demonstrate that Guadalupe creates a great opportunity for many browser
services to get better resource utilization and energy proportionality by
exploiting hardware heterogeneity.
ABSTRACT: Modern smartphones are embracing asymmetric, loosely coupled processors that have drastically different performance-power tradeoffs. To exploit such architecture for energy proportionality, both application and OS workloads need to be distributed. We propose Kage, a combination of runtime and OS support, to replicate application execution and OS functions over asymmetric processors. Kage selectively creates replicas of application and OS services and maintains state consistency for them with low overhead. By doing so, it is able to reduce the processor energy consumption of lightly loaded smartphones manyfold. While enabling energy proportionality, Kage simplifies application programming by providing the illusion of a single system image and per-process address spaces.
Proceedings of the 2012 USENIX conference on Power-Aware Computing and Systems; 10/2012
ABSTRACT: To accomplish frequent, simple tasks with high efficiency, it is necessary to leverage low-power, microcontroller-like processors that are increasingly available on mobile systems. However, existing solutions require developers to directly program the low-power processors and carefully manage inter-processor communication. We present Reflex, a suite of compiler and runtime techniques that significantly lower the barrier for developers to leverage such low-power processors. The heart of Reflex is a software Distributed Shared Memory (DSM) that enables shared memory objects with release consistency among code running on loosely coupled processors. In order to achieve high energy efficiency without sacrificing much performance, the Reflex DSM leverages (i) extreme architectural asymmetry between low-power processors and powerful central processors, (ii) aggressive compile-time optimization, and (iii) a minimalist runtime that supports efficient message passing and event-driven execution. We report a complete realization of Reflex that runs on a TI OMAP4430-based development platform as well as on a custom tri-processor mobile platform. Using smartphone sensing applications reported in recent literature, we show that Reflex supports a programming style very close to contemporary smartphone programming. Compared to message passing, the Reflex DSM greatly reduces the effort of programming heterogeneous smartphones, eliminating up to 38% of the source lines of application code. Compared to running the same applications on existing smartphones, Reflex reduces the average system power consumption by up to 81%.
ABSTRACT: Mobile browsers are known to be slow because of the bottleneck in resource
loading. Client-only solutions to improve resource loading are attractive
because they are immediately deployable, scalable, and secure. We present the
first publicly known treatment of client-only solutions to understand how much
they can improve mobile browser speed without infrastructure support.
Leveraging an unprecedented set of web usage data collected from 24 iPhone
users continuously over one year, we examine the three fundamental, orthogonal
approaches a client-only solution can take: caching, prefetching, and
speculative loading, which is first proposed and studied in this work.
Speculative loading predicts and speculatively loads the subresources needed to
open a web page once its URL is given. We show that while caching and
prefetching are highly limited for mobile browsing, speculative loading can be
significantly more effective. Empirically, we show that client-only solutions
can improve the browser speed by about 1.4 seconds on average for web sites
visited by the 24 iPhone users. We also report the design, realization, and
evaluation of speculative loading in a WebKit-based browser called Tempo. On
average, Tempo can reduce browser delay by 1 second (~20%).
ABSTRACT: Sensing on smartphones is known to be power-hungry. It has been shown that
this problem can be solved by adding an ultra low-power processor to execute
simple, frequent sensor data processing. While very effective in saving energy,
this resulting heterogeneous, distributed architecture poses a significant
challenge to application development.
We present Reflex, a suite of runtime and compilation techniques to conceal
the heterogeneous, distributed nature from developers. Reflex automatically
transforms the developer's code for distributed execution with the help of the
Reflex runtime. To create a unified system illusion, Reflex features a novel
software distributed shared memory (DSM) design that leverages the extreme
architectural asymmetry between the low-power processor and the powerful
central processor to achieve both energy efficiency and performance.
We report a complete realization of Reflex for heterogeneous smartphones with
Maemo/Linux as the central kernel. Using a tri-processor hardware prototype and
sensing applications reported in recent literature, we evaluate the Reflex
realization for programming transparency, energy efficiency, and performance.
We show that Reflex supports a programming style that is very close to
contemporary smartphone programming. It allows existing sensing applications to
be ported with minor source code changes. Reflex reduces the system power in
sensing by up to 83%, and its runtime system consumes only 10% of the local memory on
a typical ultra-low power processor.
ABSTRACT: We present RhythmLink, a system that improves the wireless pairing user experience. Users can link devices such as phones and headsets together by tapping a known rhythm on each device. In contrast to current solutions, RhythmLink does not require user interaction with the host device during the pairing process; and it only requires binary input on the peripheral, making it appropriate for small devices with minimal physical affordances. We describe the challenges in enabling this user experience and our solution, an algorithm that allows two devices to compare imprecisely-entered tap sequences while maintaining the secrecy of those sequences. We also discuss our prototype implementation of RhythmLink and review the results of initial user tests.
Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, Santa Barbara, CA, USA, October 16-19, 2011; 01/2011
ABSTRACT: We report a study on the effectiveness of the mobile browser's cache. The study is based on the browsing history from 24 smartphone users over one year. We make two interesting findings. Firstly, increasing the cache size of the browser on smartphones will not improve the effectiveness of the browser cache very much. Secondly, revalidations greatly reduce the effectiveness of the browser cache. The findings reveal the limitations of the cache design for mobile browsers and motivate a new level of cache design and speculative revalidation and loading.
ABSTRACT: We report the first work that examines the internals of web browsers on smartphones, using the WebKit codebase, two generations of Android smartphones, and webpages visited by 25 smartphone users over three months. We make many surprising findings. First, over half of the webpages visited by smartphone users are not optimized for mobile devices. This highlights the importance of client-based optimization and the limitation of prior work that only studies mobile webpages. Second, while prior work suggests that several compute-intensive operations should be the focus of optimization, our measurement and analysis show that their improvement will only lead to marginal performance gain with existing webpages. Furthermore, we find that resource loading, ignored by all except one prior work, contributes most to the browser delay. While our results agree with a recent network study showing that network round-trip time is a major problem, we further demonstrate how the internals of the browser and operating system contribute to the browser delay and therefore reveal new opportunities for optimization.
ABSTRACT: Many innovative mobile health applications can be enabled by augmenting mobile phones with wireless body sensors, e.g. monitoring personal fitness with on-body accelerometer and EKG sensors. However, it is difficult for the majority of smartphone developers to program wireless body sensors directly; current sensor nodes require developers to master node-level programming, implement the communication between the smartphone and sensors, and even learn new languages. The large gap between existing programming styles for smartphones and sensors prevents body sensors from being widely adopted by smartphone applications, despite the burgeoning Apple App Store and Android Market. To bridge this programming gap, we present Dandelion, a novel framework for developing wireless body sensor applications on smartphones. Dandelion provides three major benefits: 1) platform-agnostic programming abstraction for in-sensor data processing, called senselet, 2) transparent integration of senselets and the smartphone code, and 3) platform-independent development and distribution of senselets. We provide an implementation of Dandelion on the Maemo Linux smartphone platform and the Rice Orbit body sensor platform. We evaluate Dandelion by implementing real-world applications, and show that Dandelion effectively eliminates the programming gap and significantly reduces the development effort. We further show that Dandelion incurs a very small overhead: in total less than 5% of the memory capacity and less than 3% of the processor time of a typical ultra low power sensor.
Proceedings of Wireless Health 2010, WH 2010, San Diego, CA, USA, October 5-7, 2010; 01/2010