Article
PDF Available

Abstract and Figures

Mobile web traffic has now surpassed desktop web traffic and has become the primary means for service providers to reach out to billions of end users. Due to this trend, optimization of mobile web browsing (MWB) has gained significant attention. In this paper, we present a survey of techniques, proposed in the last 6-7 years, for improving the efficiency of web browsing on mobile systems. We review techniques from both the networking domain (e.g., proxy and browser enhancements) and the processor-architecture domain (e.g., hardware customization, thread-to-core scheduling). We organize the research works based on key parameters to highlight their similarities and differences. Beyond summarizing recent works, this survey aims to emphasize the need to architect for MWB as a first principle, instead of retrofitting for it.
... They also compare their technique with offloading to the cloud and find that when network conditions are poor, their technique can provide similar or lower latency than cloud offloading. Also, their method consumes less energy even when network conditions are good, since accessing the handheld via local radio takes less energy than accessing the internet via WiFi [17]. ...
Preprint
Full-text available
This technical report is an addendum to our survey paper, "A Survey of Deep Learning on CPUs: Opportunities and Co-optimizations", published in IEEE TNNLS 2021. In this technical report, we discuss deep learning on CPUs in heterogeneous computing scenarios. Specifically, we discuss heterogeneous computing across the big and little cores of an asymmetric multicore processor (Section 1), across wearable and handheld, or handheld and cloud (Section 2), and across CPU and an accelerator such as a GPU/FPGA/Phi (Section 3). Table 1 provides an overview of these techniques.

TABLE 1: Heterogeneous computing techniques
Processors used in heterogeneous computing:
- Asymmetric multicore (big+LITTLE) [2, 3]
- CPU+accelerator: CPU+GPU [2, 4-8], CPU+GPU+DSP [9-11], CPU+FPGA [12], CPU+Phi [13]
- Wearable+handheld [14]
- Handheld+cloud [7, 15]
Quantum/basis of partitioning:
- CNN layer-wise [3, 8, 10, 12, 14, 15]
- Computation-wise: dot-product computations on CPU and autoencoder computations on GPU [5]; sparse matrices on CPU and dense matrices on GPU [4]; based on difficulty-in-upscaling and target accuracy [9]; for achieving load balancing [6, 11]
- Batch-wise [13]

1 BETWEEN BIG AND LITTLE CORES IN AN ASYMMETRIC MULTICORE
An asymmetric multicore processor [16], shown in Figure 1, uses cores of different microarchitectural configurations to optimize for different design objectives. These core types are generally called big and little (or small). Cores of a type are organized into a cluster. Let kB or ks denote the use of k cores of big (B) or small (s) type, respectively. Wang et al. [3] note that when asymmetric multicores are used for executing DNNs, the images are processed sequentially and the processing of a kernel is parallelized over multiple cores. This "kernel-level" scheme, shown in Figure 2(a), works well within a cluster but performs poorly for inter-cluster processing. Specifically, they measure the throughput of multiple DNNs when using 1B, 2B, 3B, 4B, 4B+1s, 4B+2s, 4B+3s and 4B+4s cores. Throughput rises from 1B to 4B but shows a steep fall at 4B+1s. In fact, the throughput with 4B+4s cores is lower than that with 4B cores. The reason is that the use of multiple cores within a single cluster increases the intensity of L2 cache accesses, which are handled efficiently by the cluster's snoop-control unit. However, when cores from different clusters are used, the working set is divided between their L2 caches, and the L2 cache misses of one cluster must be satisfied by the L2 cache of the other cluster.
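As a rough illustration of the layer-wise partitioning discussed above, the sketch below (in Python) pipelines images through a toy network whose early layers run on a thread pinned to the big cluster and whose later layers run on a thread pinned to the little cluster. The core IDs, the layer split point, and the toy "layers" are all hypothetical assumptions for illustration; a real scheduler such as the one by Wang et al. would also have to account for the inter-cluster L2 traffic described above.

```python
# Sketch: layer-wise partitioning of DNN inference across big and little
# core clusters (hypothetical layer split and core IDs; not Wang et al.'s scheme).
import os
import queue
import threading

BIG_CORES = {4, 5, 6, 7}      # assumption: big cluster occupies cores 4-7
LITTLE_CORES = {0, 1, 2, 3}   # assumption: little cluster occupies cores 0-3

def run_layers(cores, layers, inq, outq):
    """Worker that executes a contiguous slice of layers on one cluster."""
    try:
        os.sched_setaffinity(0, cores)  # pin this thread's process to the cluster (Linux only)
    except (AttributeError, OSError):
        pass                            # pinning unavailable; run unpinned
    while True:
        x = inq.get()
        if x is None:                   # sentinel: no more images
            outq.put(None)
            return
        for layer in layers:
            x = layer(x)                # placeholder for real kernel execution
        outq.put(x)

# Toy "layers": each is just a function on the activation.
early = [lambda x: x + 1, lambda x: x * 2]   # heavier early layers -> big cluster
late = [lambda x: x - 3]                     # lighter later layers -> little cluster

q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=run_layers, args=(BIG_CORES, early, q_in, q_mid)).start()
threading.Thread(target=run_layers, args=(LITTLE_CORES, late, q_mid, q_out)).start()

for image in range(8):                  # pipeline 8 dummy "images"
    q_in.put(image)
q_in.put(None)

results = []
while (r := q_out.get()) is not None:
    results.append(r)
print(results)
```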
... Therefore, it is important to investigate the benefits of using the various eHealth technologies currently available for addressing such disasters and diseases. Such evidence can also contribute to the adoption of these technologies as part of routine services and, consequently, increase the chances of a better response to unforeseen situations in the future (Srivastava et al., 2015; Mittal et al., 2016). Global smartphone sales surpassed the sales figures for mobile phones in early 2013 (Qian et al., 2014) ...
Thesis
Full-text available
Mobile health (mHealth) applications are a subfield of digital health (eHealth) that appears promising for health promotion both in healthy populations and in populations with chronic conditions. Despite their widespread use, little research has been done on their perception and use among people with chronic conditions. The main aim of this study was therefore to investigate behaviours, perceptions, and the level of use of mobile health (mHealth) applications both in the general population and in chronic patients. The cross-sectional study involved 1428 participants, of whom 1314 were smartphone users. The study examined various aspects of mHealth application use, such as frequency, reasons for use, and the effects on quality of life. About 7 in 10 respondents do not use mHealth applications more than once a week (N%=70.10%). Nevertheless, the average number of mHealth applications on their phones has increased from about 3 (M=2.64, SD=1.856) to 4 (M=3.79, SD=1.984), as has the average daily usage time, from 1.39 to 1.44 minutes. The physical or mental condition of the sample did not appear to influence application choice, as no statistically significant correlations were found; instead, patterns of application-type choice were observed based on educational level, employment status, and gender. The applications showing the greatest differentiation were health-status monitoring, transfer of data to and from health professionals, and reminders. The results of the study suggest that people with chronic conditions or poor health use mHealth applications less.
... They are no longer merely a source of communication; they are also used for entertainment, accessing information, government and commercial services, and much more. To quote some facts, more than 50% of video streaming requests and website traffic worldwide originate from mobile devices [5,6], and the time spent on mobile internet access is likely soon to surpass that spent on TV [7]. The total number of apps available in major app stores has exceeded 5.5 million, with many of them being gaming apps [8]. ...
Thesis
Full-text available
Convolutional neural networks are among the most active research topics today. We work on real-world problems where applying deep learning in novel applications or scenarios can contribute to society. As the capabilities of mobile phones have increased, the potential for their negative use has also increased tremendously. For example, the use of mobile phones while driving or in high-security zones can lead to accidents, information leaks, and security breaches. In this work, we use deep-learning algorithms, viz., the single-shot multibox detector (SSD) and the faster region-based convolutional neural network (Faster-RCNN), to detect mobile phone usage. We highlight the importance of mobile phone usage detection and the challenges involved in it. We have used a subset of the State Farm Distracted Driver Detection dataset from Kaggle, which we term the KaggleDriver dataset. In addition, we have created a dataset on mobile phone usage, which we term the Indian Institute of Technology Hyderabad Dataset on Mobile Phone Usage (IITH-DMU). Although small, IITH-DMU is more generic than the KaggleDriver dataset, since it has images with a higher amount of variation in foreground and background objects. Ours is possibly the first work to perform mobile-phone detection for a wide range of scenarios. On the KaggleDriver dataset, the AP at 0.5 IoU is 98.97% with SSD and 98.84% with Faster-RCNN. On the IITH-DMU dataset, these numbers are 92.6% for SSD and 95.92% for Faster-RCNN. We will release the annotated KaggleDriver and IITH-DMU datasets and the trained CNN models as open source.

In our multiscale feature enhancement work, we study the use of plugins that perform multiscale feature aggregation for improving the accuracy of object detection algorithms. These plugins improve the input feature representation and also remove the semantic ambiguity and background noise arising from the fusion of low- and high-layer feature representations. Further, these plugins improve focus on the contextual information that comes from the shallow layers. We carefully choose the plugins to strike a fine balance between accuracy and model size. The plugins are generic and can easily be merged with baseline models, which avoids the need to retrain the model. We perform experiments using the PASCAL-VOC2007 dataset. While the baseline SSD has 22M parameters and an mAP score of 77.20, the use of the SFCM plugin increases the mAP score to 78.82 and the number of parameters to 25M.
... They are no longer merely a source of communication; they are also used for entertainment, accessing information, government and commercial services, and much more. To quote some facts, more than 50% of video streaming requests and website traffic worldwide originate from mobile devices [1,2], and the time spent on mobile internet access is likely soon to surpass that spent on TV. The total number of apps available in major app stores has exceeded 5.5 million, with many of them being gaming apps [3]. ...
Conference Paper
Full-text available
As the capabilities of mobile phones have increased, the potential for their negative use has also increased tremendously. For example, the use of mobile phones while driving or in high-security zones can lead to accidents, information leaks, and security breaches. In this paper, we use deep-learning algorithms, viz., the single-shot multibox detector (SSD) and the faster region-based convolutional neural network (Faster-RCNN), to detect mobile phone usage. We highlight the importance of mobile phone usage detection and the challenges involved in it. We have used a subset of the State Farm Distracted Driver Detection dataset from Kaggle, which we term the KaggleDriver dataset. In addition, we have created a dataset on mobile phone usage, which we term the IITH Dataset on Mobile Phone Usage (IITH-DMU). Although small, IITH-DMU is more generic than the KaggleDriver dataset, since it has images with a higher amount of variation in foreground and background objects. Ours is possibly the first work to perform mobile-phone detection for a wide range of scenarios. On the KaggleDriver dataset, the AP at 0.5 IoU is 98.97% with SSD and 98.84% with Faster-RCNN. On the IITH-DMU dataset, these numbers are 92.6% for SSD and 95.92% for Faster-RCNN. These pretrained models and the datasets are available at sites.google.com/view/mobile-phone-usage-detection.
... Our goal in choosing the plugins is to improve the mAP score with a minimal increase in the model size and minimal impact on the frame-rate. Keeping the model size small is especially important because many deep-learning applications run on mobile devices [10,11]. ...
Conference Paper
Full-text available
In this paper, we study the use of plugins that perform multiscale feature aggregation for improving the accuracy of object detection algorithms. These plugins improve the input feature representation and also remove the semantic ambiguity and background noise arising from the fusion of low- and high-layer feature representations. Further, these plugins improve focus on the contextual information that comes from the shallow layers. We carefully choose the plugins to strike a delicate balance between accuracy and model size. These plugins are generic and can easily be merged with the baseline models, which avoids the need to retrain the model. We perform experiments using the PASCAL-VOC2007 dataset. While the baseline SSD has 22M parameters and an mAP score of 77.20, the use of SFCM (one of the plugins we used) increases the mAP score to 78.82 and the number of parameters to 25M.
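As a hedged illustration of what such a multiscale aggregation plugin might look like, the PyTorch sketch below upsamples a semantically rich, coarse feature map and fuses it with a spatially detailed, shallow one. The channel counts, the SSD-like map sizes, and the fusion recipe are assumptions for illustration; this is not the actual SFCM design.

```python
# Sketch: generic multiscale feature aggregation module (illustrative, not SFCM).
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAggregation(nn.Module):
    def __init__(self, low_ch, high_ch, out_ch):
        super().__init__()
        self.reduce_high = nn.Conv2d(high_ch, out_ch, kernel_size=1)  # compress deep features
        self.reduce_low = nn.Conv2d(low_ch, out_ch, kernel_size=1)    # compress shallow features
        self.fuse = nn.Sequential(                                    # fuse and suppress noise
            nn.Conv2d(2 * out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, low, high):
        # Upsample the coarse, high-level map to the low-level map's resolution.
        high = F.interpolate(self.reduce_high(high), size=low.shape[-2:],
                             mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([self.reduce_low(low), high], dim=1))

# Usage: fuse a 38x38 shallow map with a 19x19 deep map (SSD-like sizes).
agg = FeatureAggregation(low_ch=512, high_ch=1024, out_ch=256)
fused = agg(torch.randn(1, 512, 38, 38), torch.randn(1, 1024, 19, 19))
print(fused.shape)  # torch.Size([1, 256, 38, 38])
```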
... To achieve this, you should avoid lengthy and overused software and code and try to find the most efficient algorithm for the purpose. Lightweight applications are also called "Lite", short for lightweight [9][10][11][12]. ...
Thesis
Developers rely more and more on so-called end-to-end (E2E) tests to test the web applications they develop and to check that they have no bugs from an end-user point of view. An E2E test simulates the actions performed by the user with his/her browser and checks that the web application returns the expected outputs. It treats the web application as a black box and only knows the user actions and their expected outputs. However, once a web application evolves, the user actions may change (a button may move to another location, a new button may be added, or a button may be deleted). As a result, the E2E tests need to evolve with the web application: broken tests must be repaired, new tests added, and obsolete tests deleted. But evolving E2E tests takes a lot of time, especially for large web applications. We therefore conduct a systematic mapping study of the existing literature to find gaps in web test suite evolution. We then present an approach, named WebTestSuiteRepair (WTSR), to help developers who face broken test scripts. In this thesis, WTSR compares test suite graphs to repair broken actions, and thus helps to automatically and efficiently repair E2E tests for web applications. The approach has been validated through several case studies. We describe future work to improve our solution, as well as research problems that our approaches can target.
Article
With the recent advancement of smartphone technology in the past few years, smartphone usage has increased on a tremendous scale due to its portability and ability to perform many daily-life tasks. As a result, smartphones have become one of the most valuable targets for hackers to perform cyberattacks, since a smartphone can contain an individual's sensitive data. Smartphones are embedded with highly accurate sensors. This article proposes BetaLogger, an Android-based application that highlights the issue of leaking smartphone users' privacy using smartphone hardware sensors (accelerometer, magnetometer, and gyroscope). BetaLogger efficiently infers the text (long or short) typed on a smartphone keyboard using language modeling and a dense multi-layer neural network (DMNN). BetaLogger is composed of two major phases: in the first phase, a text inference vector is given as input to the DMNN model to predict the target labels comprising the alphabet, and in the second phase, a sequence generator module generates the output sequence in the form of a continuous sentence. The outcomes demonstrate that BetaLogger generates highly accurate short and long sentences and that it effectively improves the inference rate in comparison with conventional machine learning algorithms and state-of-the-art studies.
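A minimal sketch of the sensor-based inference stage, assuming scikit-learn and purely synthetic data: a dense multi-layer network maps fixed-length motion-sensor windows to character labels. The window length, feature layout, and labels are hypothetical, and BetaLogger's language-modeling and sequence-generation stages are not reproduced here.

```python
# Sketch: inferring typed keys from motion-sensor windows with a dense
# multi-layer network (illustrative only; synthetic data, hypothetical features).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_windows, window_len = 2000, 50
# Each window: accelerometer, gyroscope, magnetometer (3 axes each) -> 9 channels.
X = rng.normal(size=(n_windows, window_len * 9)).astype(np.float32)
y = rng.integers(0, 26, size=n_windows)          # one label per letter a-z

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Dense multi-layer network standing in for the DMNN described in the paper.
clf = MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=50)
clf.fit(X_train, y_train)
print("accuracy on synthetic data:", clf.score(X_test, y_test))
# A real system would follow these per-window predictions with a language model
# that assembles the predicted characters into a plausible sentence.
```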
Article
Full-text available
This research conducted a survey of modern information technology software and online learning in Iraq and of the variety of platforms used. The survey was carried out to obtain information about the models and platforms that were effectively used in online learning during the period of social distancing imposed to prevent the spread of COVID-19. In addition, the information obtained from the survey covers obstacles to implementing such learning, how the material is delivered, and the number of online meetings held each week. The study reached a set of conclusions, the most important of which is that the most used and influential educational platforms and software are Google Classroom, Zoom, Moodle, and Telegram.
Chapter
In this paper, we study the use of plugins that perform multiscale feature aggregation for improving the accuracy of object detection algorithms. These plugins improve the input feature representation and also remove the semantic ambiguity and background noise arising from the fusion of low- and high-layer feature representations. Further, these plugins improve focus on the contextual information that comes from the shallow layers. We carefully choose the plugins to strike a delicate balance between accuracy and model size. These plugins are generic and can easily be merged with the baseline models, which avoids the need to retrain the model. We perform experiments using the PASCAL-VOC2007 dataset. While the baseline SSD has 22M parameters and an mAP score of 77.20, the use of SFCM (one of the plugins we used) increases the mAP score to 78.82 and the number of parameters to 25M.
Article
Full-text available
The branch predictor (BP) is an essential component in modern processors, since high BP accuracy can improve performance and reduce energy by decreasing the number of instructions executed on the wrong path. However, reducing the latency and storage overhead of BPs while maintaining high accuracy presents significant challenges. In this paper, we present a survey of dynamic branch prediction techniques. We classify the works based on key features to underscore their differences and similarities. We believe this paper will spark further research in this area and will be useful for computer architects, processor designers, and researchers.
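For context, most of the surveyed techniques build on the textbook bimodal predictor with 2-bit saturating counters; a minimal sketch of that classic scheme (with an arbitrary table size, not any specific design from the survey) is shown below.

```python
# Sketch: textbook bimodal branch predictor with 2-bit saturating counters
# (illustrative baseline; table size is arbitrary).
class BimodalPredictor:
    def __init__(self, entries=1024):
        self.entries = entries
        self.counters = [1] * entries          # start weakly not-taken (0-3 scale)

    def _index(self, pc):
        return (pc >> 2) % self.entries        # drop byte offset, hash into table

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2   # >=2 means predict taken

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

# Usage: a loop branch at PC 0x400, taken 9 times and then not taken once.
bp = BimodalPredictor()
hits = 0
for taken in [True] * 9 + [False]:
    hits += bp.predict(0x400) == taken
    bp.update(0x400, taken)
print(f"correct predictions: {hits}/10")
```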
Article
Full-text available
Modeling the energy consumption of applications on mobile devices is an important topic that has received much attention in recent years. However, there has been very little research on modeling the energy consumption of the mobile Web. This is primarily due to the short-lived yet complex page load process that makes it infeasible to rely on coarse-grained resource monitoring for accurate power estimation. We present RECON, a modeling approach that accurately estimates the energy consumption of any Web page load and deconstructs it into the energy contributions of individual page load activities. Our key intuition is to leverage low-level application semantics in addition to coarse-grained resource utilizations for modeling the page load energy consumption. By exploiting fine-grained information about the individual activities that make up the page load, RECON enables fast and accurate energy estimations without requiring complex models. Experiments across 80 Web pages and under four different optimizations show that RECON can estimate the energy consumption for a Web page load with an average error of less than 7%. Importantly, RECON helps to analyze and explain the energy effects of an optimization on the individual components of Web page loads.
Article
Full-text available
Mobile Web page performance is critical to content providers, service providers, and users, as Web browsers are one of the most popular apps on phones. Slow Web pages are known to adversely affect profits and lead to user abandonment. While improving mobile web performance has drawn increasing attention, most optimizations tend to overlook an important factor, energy. Given the importance of battery life for mobile users, we argue that web page optimizations should be evaluated for their impact on energy consumption. However, examining the energy effects of a web optimization is challenging, even if one has access to power monitors, for several reasons. First, the page load process is relatively short-lived, ranging from several milliseconds to a few seconds. Fine-grained resource monitoring on such short timescales to model energy consumption is known to incur substantial overhead. Second, Web pages are complex. A Web enhancement can have widely varying effects on different page load activities. Thus, studying the energy impact of a Web enhancement on page loads requires understanding its effects on each page load activity. Existing approaches to analyzing mobile energy typically focus on profiling and modeling the resource consumption of the device during execution. Such approaches consider long-running services and apps such as games, audio, and video streaming, for which low-overhead, coarse-grained resource monitoring suffices. For page loads, however, coarse-grained resource monitoring is not sufficient to analyze the energy consumption of individual, short-lived, page load activities. We present RECON (REsource- and COmpoNent-based modeling), a modeling approach that addresses the above challenges to estimate the energy consumption of any Web page load. The key intuition behind RECON is to go beyond resource-level information and exploit application-level semantics to capture the individual Web page load activities. Instead of modeling the energy consumption at the full page load level, which is too coarse grained, RECON models at a much finer component level granularity. Components are individual page load activities such as loading objects, parsing the page, or evaluating JavaScript. To do this, RECON combines coarse-grained resource utilization and component-level Web page load information available from existing tools. During the initial training stage, RECON uses a power monitor to measure the energy consumption during a set of page load processes and juxtaposes this power consumption with coarse-grained resource and component information. RECON uses both simple linear regression and more complex neural networks to build a model of the power consumption as a function of the resources used and the individual page load components, thus providing benefits over individual models. Using the model, RECON can estimate the energy consumption of any Web page loaded as-is or upon applying any enhancement, without the monitor. We experimentally evaluate RECON on the Samsung Galaxy S4, S5, and Nexus devices using 80 Web pages. Comparisons with actual power measurements from a fine-grained power meter show that, using the linear regression model, RECON can estimate the energy consumption of the entire page load with a mean error of 6.3% and that of individual page load activity segments with a mean error of 16.4%. When trained as a neural network, RECON's mean error for page energy estimation reduces to 5.4% and the mean segment error is 16.5%. 
We show that RECON can accurately estimate the energy consumption of a Web page under different network conditions, such as lower bandwidth or higher RTT, even when the model is trained under a default network condition. RECON also accurately estimates the energy consumption of a Web page after applying popular Web enhancements including ad blocking, inlining, compression, and caching.
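A minimal sketch of the component-plus-resource regression idea, assuming scikit-learn and synthetic measurements: energy is modeled as a linear function of coarse resource utilization and per-activity durations, and the fitted coefficients give per-component energy contributions. The feature names and training data below are placeholders, not RECON's actual feature set or coefficients.

```python
# Sketch: RECON-style energy model as linear regression over coarse resource
# utilization plus per-component page-load activity durations (synthetic data).
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n_loads = 200
# Hypothetical features per page load:
# [cpu_util, net_mb, t_fetch_s, t_parse_s, t_javascript_s, t_paint_s]
X = np.column_stack([
    rng.uniform(0.2, 1.0, n_loads),       # CPU utilization
    rng.uniform(0.1, 5.0, n_loads),       # network traffic (MB)
    rng.uniform(0.0, 2.0, (n_loads, 4)),  # durations of four load activities (s)
])
# Synthetic "measured" energy (J): a weighted sum plus noise, standing in for
# power-monitor readings collected during the training stage.
true_w = np.array([2.0, 0.8, 1.5, 1.0, 2.5, 0.7])
y = X @ true_w + 3.0 + rng.normal(0, 0.2, n_loads)

model = LinearRegression().fit(X, y)
page = np.array([[0.6, 1.2, 0.4, 0.3, 0.9, 0.2]])
print("estimated page-load energy (J):", model.predict(page)[0])
# Per-activity contributions fall out of the fitted coefficients:
print("energy per second of JavaScript evaluation:", model.coef_[4])
```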
Article
To load a web page, a browser must fetch and evaluate objects like HTML files and JavaScript source code. Evaluating an object can result in additional objects being fetched and evaluated. Thus, loading a web page requires a browser to resolve a dependency graph; this partial ordering constrains the sequence in which a browser can process individual objects. Unfortunately, many edges in a page’s dependency graph are unobservable by today’s browsers. To avoid violating these hidden dependencies, browsers make conservative assumptions about which objects to process next, leaving the network and CPU underutilized. We provide two contributions. First, using a new measurement platform called Scout that tracks fine-grained data flows across the JavaScript heap and the DOM, we show that prior, coarse-grained dependency analyzers miss crucial edges: across a test corpus of 200 pages, prior approaches miss 30% of edges at the median, and 118% at the 95th percentile. Second, we quantify the benefits of exposing these new edges to web browsers. We introduce Polaris, a dynamic client-side scheduler that is written in JavaScript and runs on unmodified browsers; using a fully automatic compiler, servers can translate normal pages into ones that load themselves with Polaris. Polaris uses fine-grained dependency graphs to dynamically determine which objects to load, and when. Since Polaris’ graphs have no missing edges, Polaris can aggressively fetch objects in a way that minimizes network round trips. Experiments in a variety of network conditions show that Polaris decreases page load times by 34% at the median, and 59% at the 95th percentile.
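A small sketch of the scheduling idea, assuming the full fine-grained dependency graph is known up front: objects are fetched concurrently as soon as every edge into them is satisfied, which is what keeps the network busy. The example graph, fetch stub, and thread-pool scheduler are illustrative and are not Polaris itself.

```python
# Sketch: dependency-driven object scheduling over a known fine-grained graph
# (illustrative; not the Polaris scheduler or its graph format).
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
import time

# Hypothetical page: object -> set of objects it depends on.
deps = {
    "index.html": set(),
    "app.js": {"index.html"},
    "style.css": {"index.html"},
    "data.json": {"app.js"},          # a hidden edge a coarse analyzer might miss
    "hero.png": {"style.css"},
}

def fetch(obj):
    time.sleep(0.05)                  # stand-in for a network round trip
    return obj

done, futures = set(), {}
with ThreadPoolExecutor(max_workers=4) as pool:
    while len(done) < len(deps):
        # Schedule every object whose dependencies are already satisfied.
        for obj, d in deps.items():
            if obj not in done and obj not in futures and d <= done:
                futures[obj] = pool.submit(fetch, obj)
        finished, _ = wait(futures.values(), return_when=FIRST_COMPLETED)
        for f in finished:
            done.add(f.result())
            futures = {o: fu for o, fu in futures.items() if fu is not f}
print("load order respected all edges:", done == set(deps))
```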
Conference Paper
The existing slowness of the web on mobile devices frustrates users and hurts the revenue of website providers. Prior studies have attributed high page load times to dependencies within the page load process: network latency in fetching a resource delays its processing, which in turn delays when dependent resources can be discovered and fetched. To securely address the impact that these dependencies have on page load times, we present Vroom, a rethink of how clients and servers interact to facilitate web page loads. Unlike existing solutions, which require clients to either trust proxy servers or discover all the resources on any page themselves, Vroom's key characteristics are that clients fetch every resource directly from the domain that hosts it but web servers aid clients in discovering resources. Input from web servers decouples a client's processing of resources from its fetching of resources, thereby enabling independent use of both the CPU and the network. As a result, Vroom reduces the median page load time by more than 5 seconds across popular News and Sports sites. To enable these benefits, our contributions lie in making web servers capable of accurately aiding clients in resource discovery and judiciously scheduling a client's receipt of resources.
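A toy sketch of server-aided resource discovery, using Python's standard http.server: the server already knows which resources a page needs and advertises them via Link preload headers, so a client can begin fetching them before it has parsed the HTML. The manifest, file names, and hint format are assumptions for illustration, not Vroom's actual protocol.

```python
# Sketch: a server that advertises a page's resources up front so clients can
# fetch them without waiting on parsing (illustrative; not Vroom's mechanism).
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical manifest: resources each page is known to need.
MANIFEST = {"/index.html": ["/app.js", "/style.css", "/hero.png"]}

class HintingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"<html><head></head><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        # Advertise dependent resources so the client can fetch them in
        # parallel with parsing, instead of discovering them one by one.
        for res in MANIFEST.get(self.path, []):
            self.send_header("Link", f"<{res}>; rel=preload")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), HintingHandler).serve_forever()
```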