
John Francis Canny - University of California, Berkeley
291 Publications
246,018 Reads
50,183 Citations
Publications (291)
While there have been significant gains in the field of automated video description, the generalization performance of automated description models to novel domains remains a major barrier to using these systems in the real world. Most visual description methods are known to capture and exploit patterns in the training data leading to evaluation me...
The design process of user interfaces (UIs) often begins with articulating high-level design goals. Translating these high-level design goals into concrete design mock-ups, however, requires extensive effort and UI design expertise. To facilitate this process for app designers and developers, we introduce three deep-learning techniques to create lo...
Automatic video captioning aims to train models to generate text descriptions for all segments in a video; however, the most effective approaches require large amounts of manual annotation, which is slow and expensive. Active learning is a promising way to efficiently build a training set for video captioning tasks while reducing the need to manuall...
Sketching is an effective communication medium that augments and enhances what can be communicated in text. We introduce Sketchforme, the first neural-network-based system that can generate complex sketches based on text descriptions specified by users. Sketchforme's key contribution is to factor complex sketch rendering into layout and rendering s...
Sketches and real-world user interface examples are frequently used in multiple stages of the user interface design process. Unfortunately, finding relevant user interface examples, especially in large-scale datasets, is a highly challenging task because user interfaces have aesthetic and functional properties that are only indirectly reflected by...
Sketching and natural languages are effective communication media for interactive applications. We introduce Sketchforme, the first neural-network-based system that can generate sketches based on text descriptions specified by users. Sketchforme is capable of gaining high-level and low-level understanding of multi-object sketched scenes without bei...
Modern datasets and models are notoriously difficult to explore and analyze due to their inherent high dimensionality and massive numbers of samples. Existing visualization methods which employ dimensionality reduction to two or three dimensions are often inefficient and/or ineffective for these datasets. This paper introduces t-SNE-CUDA, a GPU-acc...
The internal states of most deep neural networks are difficult to interpret, which makes diagnosis and debugging during training challenging. Activation maximization methods are widely used, but lead to multiple optima and are hard to interpret (appear noise-like) for complex neurons. Image-based methods use maximally-activating image regions which...
We present a novel Metropolis-Hastings method for large datasets that uses small expected-size mini-batches of data. Previous work on reducing the cost of Metropolis-Hastings tests yields only constant factor reductions versus using the full dataset for each sample. Here we present a method that can be tuned to provide arbitrarily small batch sizes...
Surgical debridement is the process of removing dead or damaged tissue to allow the remaining parts to heal. Automating this procedure could reduce surgical fatigue and facilitate teleoperation, but doing so is challenging for Robotic Surgical Assistants (RSAs) such as the da Vinci Research Kit (dVRK) due to inherent non-linearities in cable-driven...
Machine learning is growing in importance in industry, sciences, and many other fields. In many and perhaps most of these applications, users need to trade off competing goals. Machine learning, however, has evolved around the optimization of a single, usually narrowly-defined criterion. In most cases, an expert makes (or should be making) trade-of...
We introduce Inquire, a tool designed to enable qualitative exploration of utterances in social media and large-scale texts. As opposed to keyword search, Inquire allows the effective use of sentences as queries to quickly explore millions of documents to retrieve semantically-similar sentences. We apply Inquire to the LiveJournal.com (LJ) database, wh...
This demo presents an instance of Inquire, a tool designed to support qualitative researchers in the early stages of research. The tool enables the search over millions of users' records to extract early insights to aid in the formulation of research strategies. The tool presents the work described in the Inquire paper by Paredes et al. [12] in t...
We present a novel Metropolis-Hastings method for large datasets that uses small expected-size minibatches of data. Previous work on reducing the cost of Metropolis-Hastings tests yields variable data consumed per sample, with only constant factor reductions versus using the full dataset for each sample. Here we present a method that can be tuned to...
We fuse science and design thinking to create a novel, IoT interactive urban lights system focused on increasing positive affect among pedestrians. Our contributions are three-fold. First, the design, construction, and evaluation of an efficient interactive lighting system focused on well-being, as opposed to systems focused on utility or landscapi...
A fundamental task in machine learning and related fields is to perform inference on Bayesian networks. Since exact inference takes exponential time in general, a variety of approximate methods are used. Gibbs sampling is one of the most accurate approaches and provides unbiased samples from the posterior but it has historically been too expensive...
Little is known about the affective expressivity of multisensory stimuli in wearable devices. While the theory of emotion has referenced single stimulus and multisensory experiments, it does not go further to explain the potential effects of sensorial stimuli when utilized in combination. In this paper, we present an analysis of the combinations of...
Gibbs sampling is a workhorse for Bayesian inference but has several limitations when used for parameter estimation, and is often much slower than non-sampling inference methods. SAME (State Augmentation for Marginal Estimation) [15, 8] is an approach to MAP parameter estimation which gives improved parameter estimates over direct Gibbs sampling. S...
Heart rate monitoring is widely used in clinical care, fitness training, and stress management. However, tracking individuals' heart rates faces two major challenges, namely equipment availability and user motivation. In this paper, we present a novel technique, LivePulse Games (LPG), to measure users' heart rates in real time by having them play g...
In this chapter we investigate practical technologies for security and privacy in data analysis at large scale. We motivate our approach by discussing the challenges and opportunities in light of current and emerging analysis paradigms on large data sets. In particular, we present a framework for privacy-preserving distributed data analysis that is...
Gibbs sampling is a workhorse for Bayesian inference but has several limitations when used for parameter estimation, and is often much slower than non-sampling inference methods. SAME (State Augmentation for Marginal Estimation) [Doucet99, Doucet02] is an approach to MAP parameter estimation which gives improved parameter estimates over direct...
Allreduce is a basic building block for parallel computing. Our target here is 'Big Data' processing on commodity clusters (mostly sparse power-law data). Allreduce can be used to synchronize models, to maintain distributed datasets, and to perform operations on distributed data such as sparse matrix multiply. We first review a key constraint on cl...
Ensemble methods using the same underlying algorithm trained on different subsets of observations have recently received increased attention as practical prediction tools for massive datasets. We propose Subsemble: a general subset ensemble prediction method, which can be used for small, moderate, or large datasets. Subsemble partitions the full da...
*Honorable Mention for Best Paper Award
Stress causes and exacerbates many physiological and mental health problems. Routine and unobtrusive monitoring of stress would enable a variety of treatments, from break-taking to calming exercises. It may also be a valuable tool for assessing effects (frustration, difficulty) of using interfaces or applica...
Early literacy is critical to child development, and determines a child's later educational and life opportunities. Moreover, preschool children are incessantly inquisitive, and will readily engage in question answering and asking activities if given the opportunity. We argue here that question asking/answering technologies can play a major role in...
Heart rate monitoring is widely used in clinical care, fitness training, and stress management. However, tracking individuals' heart rate faces two major challenges, namely equipment availability and user motivation. In this paper, we present a novel technique, LivePulse Games (LPG), to measure users' heart rate in real time by having them play cas...
Search advertising shows trends of vertical extension. Vertical ads typically offer better Return on Investment (ROI) to advertisers as a result of better user engagement. However, campaigns and bids in vertical ads are not set at the keyword level. As a result, the matching between user query and ads suffers from a low recall rate and the match quality is...
This special issue aims to explore critical elements of the overall design, user experience, and resulting solutions related to using pervasive computing technologies to inform our understanding of the dynamics of ourselves and our ecosystem, community, and urban landscapes. This issue not only explores pervasive technologies but also reviews pract...
This paper presents "Deus Ex", a system which uses Cognitive Behavioral Therapy (CBT). The expected gains are cinematographic fun and increased efficacy by providing a balance of interactivity and narrative. Through highly pervasive media such as mobile phones and digital video discs (DVD), machinima out to populations currently lacking access to this type of...
Coding style is important to teach to beginning programmers, so that bad habits don't become permanent. This is often done manually at the University level because automated Python static analyzers cannot accurately grade based on a given rubric. However, even manual analysis of coding style encounters problems, as we have seen quite a bit of incon...
Lack of proper English pronunciation is a major problem for immigrant populations in developed countries like the U.S. This poses various problems, including a barrier to entry into mainstream society. This paper presents a research study that explores the use of speech technologies merged with activity-based and arcade-based games to do pronunciation...
Many large datasets exhibit power-law statistics: the web graph, social networks, text data, click-through data, etc. Their adjacency graphs are termed natural graphs, and are known to be difficult to partition. As a consequence, most distributed algorithms on these graphs are communication intensive. Many algorithms on natural graphs involve an Allr...
This poster presents a theoretical framework for the use of interactive machinima (machine + cinema) as an adaptable means to deliver Cognitive Behavioral Therapy (CBT) to large audiences. The expected gains are to improve engagement and adherence through interactive cinematographic. Theoretical foundations are aggregated in a machinima agent that...
The preschool "literacy gap" is one of the most difficult challenges for education in the US. Children in the lowest SES (Socio-Economic Status) quartile have less than half the working vocabulary of those in the top quartile at age 3. On the other hand, preschool children are incessantly inquisitive, and will readily engage in question answering a...
This paper describes the BID Data Suite, a collection of hardware, software and design patterns that enable fast, large-scale data mining at very low cost. By co-designing all of these elements we achieve single-machine performance levels that equal or exceed reported cluster implementations for common benchmark problems. A key design criterion is...
The goal of this special issue is to contribute to the advancement of ubiquitous information societies, where computers and humans are part of the same ecosystem. One crucial property of entities living in the same ecosystem is that they mutually influence and affect each other's behavior in a variety of ways. This special issue, organized as a fol...
This paper presents a list of principles that can be used to conceptualize games for health behavior change. These principles are derived from lessons learned after teaching two design-centered courses on Gaming and Narrative Technologies for Health Behavior Change. Course sessions were designed to create many rapid prototypes on specific topics co...
This paper describes our vision on what should be the research around sensing and adaptive interventions to make affective computing and stress management technology pervasive and unobtrusive. With the use of common computer peripherals and mobile computing devices as affect sensors, personalized and adaptive intervention technologies can be develo...
Parents are well aware that pre-school children are incessantly inquisitive, and the high ratio of questions to statements suggests that questions are a primary method utilized by children for language acquisition, cognitive development, and formulating knowledge structures. Question-asking is furthermore a comfortable medium for a child to stay en...
Understanding and facilitating real-life social interaction is a high-impact goal for Ubicomp research. Microphone arrays offer the unique capability to provide continuous, calm capture of verbal interaction in large physical spaces, such as homes and (especially open-plan) offices. Most microphone array work has focused on arrays of custom sensors...
The human voice encodes a wealth of information about emotion, mood, stress, and mental state. With mobile phones (one of the most used modules in body area networks) this information is potentially available to a host of applications and can enable richer, more appropriate, and more satisfying human-computer interaction. In this paper we describ...
We describe an innovative and scalable recommendation system successfully deployed at eBay. To build recommenders for long-tail marketplaces requires projection of volatile items into a persistent space of latent products. We first present a generative clustering model for collections of unstructured, heterogeneous, and ephemeral item data, under t...
Mental illness is one of the most undertreated health problems worldwide. Previous work has shown that there are remarkably strong cues to mental illness in short samples of the voice. These cues are evident in severe forms of illness, but it would be most valuable to make earlier diagnoses from a richer feature set. Furthermore there is an abstrac...
This paper investigates the usefulness of segmental phoneme dynamics for classification of speaking styles. We modeled transition details based on the phoneme sequences emitted by a speech recognizer, using data obtained from a recording of 39 depressed patients with 7 different speaking styles: normal, pressured, slurred, stuttered, flat, slow and...
In this paper, we describe our experiences and thoughts on building speech applications on mobile devices for developing countries. We describe three models of use for automatic speech recognition (ASR) systems on mobile devices that are currently used – embedded speech recognition, speech recognition in the cloud, and distributed speech recognitio...
Rural health workers in India do not always have the training, credibility or motivation to effectively convince clients to adopt healthy practices. To help build their efficacy, we provided them with messages on mobile phones to present to clients. We present a study which compared three presentations of persuasive health messages by health worker...
Behavioral targeting (BT) leverages historical user behavior to select the ads most relevant to users to display. The state-of-the-art of BT derives a linear Poisson regression model from fine-grained user behavioral data and predicts click-through rate (CTR) from user history. We designed and implemented a highly scalable and efficient solution to...
In this paper we introduce a framework for privacy-preserving distributed computation that is practical for many real-world applications. The framework is called Peers for Privacy (P4P) and features a novel heterogeneous architecture and a number of efficient tools for performing private computation and ensuring security at large scale. It maintain...
Today's youth are shaping the frontier of digital media in general, and mobile technology in particular. This special issue features applications with a youth focus, studies of how youth are appropriating pervasive technology, and a glimpse of how our lives will change when "pervasive" technology finally lives up to its name.
Cellphones have the potential to improve education for the millions of underprivileged users in the developing world. However, mobile learning in developing countries remains under-studied. In this paper, we argue that cellphones are a perfect vehicle for making educational opportunities accessible to rural children in places and times that are mor...
In many developing countries such as India and China, low educational levels often hinder economic empowerment. In this paper, we argue that mobile learning games can play an important role in the Chinese literacy acquisition process. We report on the unique challenges in learning the Chinese language, especially its logographic writing system. Bas...
Researchers have long been interested in the potential of ICTs to enable positive change in developing regions communities. In these environments, ICT interventions often fail because political, social and cultural forces work against the changes ICTs entail. We argue that familiar uses of ICTs for information services in these contexts are less po...
Dictionary-based disambiguation (DBD) is a very popular solution for text entry on mobile phone keypads but suffers from two problems: 1. the resolution of encoding collision (two or more words sharing the same numeric key sequence) and 2. entering out-of-vocabulary (OOV) words. In this paper, we present SHRIMP, a system and method that addresses t...
The advancement of precision micropower amplifiers, microcontrollers, and MEMS devices has allowed for a paradigm shift from traditionally large and costly health monitoring equipment only found in hospitals or care centers to smaller, wireless, low-powered portable devices that can provide continuous monitoring for a number of applications. Along...
In this paper we present ethno-mining, a mixed methods approach drawing on techniques from ethnography and data mining. Ethno-mining is characterized by tight, iterative loops that integrate both the results and the processes of ethnographic and data mining techniques to interpret data. Ethno-mining provides two key benefits. First, it makes use of...
We developed and tested the Berkeley Tricorder, a health monitoring device capable of measuring a subject's ECG, EMG, Blood Oxygenation, Respiration (via Bioimpedance), and motion--almost equivalent to the feature set of a hospital bedside patient monitor. Our focus has been a highly integrated design incorporating the radio and all associated circ...
A major problem in current privacy-preserving data-mining research is the lack of practical mechanisms to deal with malicious users who may submit bogus data to bias the computation. In this paper we explore private computation built on vector addition and its applications in privacy-preserving data mining. We show that such a paradigm not only sup...
Behavioral targeting (BT) leverages historical user behavior to select the ads most relevant to users to display. The state-of-the-art of BT derives a linear Poisson regression model from fine-grained user behavioral data and predicts click-through rate (CTR) from user history. We designed and implemented a highly scalable and efficient solution to BT u...
Literacy is one of the great challenges in the developing world. But universal education is an unattainable dream for those children who lack access to quality educational resources such as well-prepared teachers and schools. Worse, many of them do not attend school regularly due to their need to work for the family in the agricultural fields or ho...
Video conferencing attempts to convey subtle cues of face-to-face interaction (F2F), but it is generally believed to be less effective than F2F. We argue that careful design based on an understanding of non-verbal communication can mitigate these differences. In this paper, we study the effects of video image framing in one-on-one meetings on empat...
Low educational levels hinder economic empowerment in developing countries. We make the case that educational games can impact children in the developing world. We report on exploratory studies with three communities in North and South India to show some problems with digital games that fail to match rural children's understanding of games, to high...
We explore natural and calm interfaces for configuring ubiquitous computing environments. A natural interface should enable the user to name a desired configuration and have the system enact that configuration. Users should be able to use familiar names for configurations without learning, which implies the mapping from names to configurations is...
We adapt a probabilistic latent variable model, namely GaP (Gamma-Poisson) (6), to ad targeting in the contexts of sponsored search (SS) and behaviorally targeted (BT) display advertising. We also approach the important problem of ad positional bias by formulating a one-latent-dimension GaP factorization. Learning from click-through data is intri...
In this paper, we argue that, in developing regions, there is a need to disseminate information about education and health whose value is not always immediately recognized by local communities. We believe that we can draw inspiration from the design space of persuasive technologies in order to create effective interactions between users and technol...
Sensors integrated into mobile phones have the advantage of mobility, co-location with people, pre-built network and power infrastructure, and potentially, ubiquity. These characteristics, however, also present significant challenges. Mobility means non-uniform sampling in space, and also constrains the size and weight of the sensors. In this p...
The persuasive power of live interaction is hard to match, yet technologies are increasingly taking on roles to promote behavioral change. We believe that speech-based interfaces offer a compelling mode of interaction for engaging users and are motivated to understand how to best present persuasive information using speech interaction. We present a...
In this paper we explore private computation built on vector addition and its applications in privacy-preserving data mining. Vector addition is a surprisingly general tool for implementing many algorithms prevalent in distributed data mining. Examples include linear algorithms like voting and summation, as well as non-linear algorithms such as...
Poor literacy remains a barrier to economic empowerment in the developing world. Of particular importance is fluency in a widely spoken "world language" such as English, which is typically a second language for these low-income learners. We make the case that mobile games on cellphones are an appropriate solution in the typical ecologies of developi...
Low levels of education remain a barrier to economic empowerment in the developing world. In our work on English language learning among underserved communities in India since 2004, we have observed differences between school communities in terms of their access to educational opportunities outside school, access to ICTs including cellphones and di...
Information Technology has had significant impact on the society and has touched all aspects of our lives. So far, computers and expensive devices have fueled this growth. The challenge now is to take this success of IT to its next level where IT services can be accessed by masses. "Masses" here mean the people who (a) are not yet IT literate and/o...
We argue that social marketing, a strategy that uses techniques from corporate marketing to influence the behavior of target audiences, is a useful framework for thinking about motivating people to enact environmentally sustainable behaviors. We critically examine some pervasive green applications through the lens of social marketing and discuss ho...
We present Data Souvenirs, book-like electronic objects that display various forms of information with the goal of supporting reflection and reminiscence. Data Souvenirs draw on the ability of electronic data streams to provide new perspectives on data in and around the home while taking on a less distracting, more reflective form than existin...
Weight training, in addition to aerobic exercises, is an important component of a balanced exercise program. However, mechanisms for tracking free weight exercises have not yet been explored. In this paper, we study methods that automatically recognize what type of exercise you are doing and how many repetitions you have done so far. We incorporate...
In this paper we explore private computation built on vector addition, which is a surprisingly general tool for implementing many useful analyses on user-provided data. Examples include both linear and non-linear algorithms such as singular value decomposition (SVD), regression, analysis of variance (ANOVA), and several machine learning algorithms b...
Many network applications are based on a group communications model where one party sends messages to a large number of authorized recipients and/or receives messages from multiple senders. In this paper we present a secure group communication scheme based on a new cryptosystem that admits a rigorous proof of security against adaptive chosen cipher...
Video conferencing is still considered a poor alternative to face-to-face meetings. In the business setting, where these systems are most prevalent, the misuse of video conferencing systems can have detrimental results, especially in high-stakes communications. Prior work suggests that spatial distortions of nonverbal cues, particularly gaze and de...
Poor literacy remains a decisive barrier to the economic empowerment of many people in the developing world. Of particular importance is literacy in a widely spoken "world language" such as English, which is typically a second language for these speakers. For complex reasons, schools are often not effective as vehicles for second language learning....
Technology arguably has the potential to play a key role in improving the lives of people in developing regions. However, these communities are not well understood and designers must thoroughly investigate possibilities for technological innovations in these contexts. We describe findings from two field studies in India and one in Uganda where we e...
Recent HCI research shows a strong interest in task management systems (e.g. (19, 27)) that support the multi-tasked nature of information work (13). These systems either require manual creation and maintenance of task representations, or they depend on explicit user cues to guide the creation and maintenance process. Furthermore, to access...
We present several interesting applications for the Pattern-Annotated Course Tool (PACT) and pedagogical design patterns in the process of curriculum design. PACT is a visual editor in which content designers can create visual representations of their courses and annotate them with references to educational theory in the form of pedagogical pattern...
We present several interesting applications for the Pattern-Annotated Course Tool (PACT) and pedagogical design patterns in the process of curriculum design. PACT is a visual editor in which content designers can create visual representations of their courses and annotate them with references to educational theory in the form of pedagogical patter...
Poor literacy remains a barrier to economic empowerment in the developing world. We make the case that "serious games" can make an impact for these learners and highlight that much remains to be learned about designing engaging gameplay experiences for children living in rural areas. Our approach revolves around game design patterns, which are buil...
In this paper we introduce a new practical framework, called P4P (peers for privacy), for privacy-preserving data mining. P4P features a hybrid architecture combining P2P and client-server paradigms and provides practical private protocols for user data validation and general computation. The architecture is guided by the natural incentives of the...
This paper presents TinyMotion, a pure software approach for detecting a mobile phone user's hand movement in real time by analyzing image sequences captured by the built-in camera. We present the design and implementation of TinyMotion and several interactive applications based on TinyMotion. Through both an informal evaluation and a formal 17-p...
Privacy has been recognized as a very important issue in electronic commerce. However, many privacy techniques were not adopted and many online anonymity services failed. In this paper we propose treating privacy as a "value" that is to be added to other services to avoid the adoption pitfall. We present an architecture that anonymizes online...
Increasing availability of sensor-based location traces for individuals, combined with the goal of better understanding user context, has resulted in a recent emphasis on algorithms for automatically extracting users' significant places from location data. Place-finding can be characterized by two sub-problems, (1) finding significant locations...
Personal computing launched with the IBM PC. But popular computing—computing for the masses—launched with the modern WIMP (windows, icons, mouse, pointer) interface, which made computers usable by ordinary people. As popular computing has grown, the role of HCI (human-computer interaction) has increased. Most software today is interactive, and code...
This paper draws on a 2-week design workshop conducted at a rural primary school in northern India to provide recommendations on carrying out participatory design with school children in rural, underdeveloped regions. From our experiences in prototyping low-tech and hi-tech English language learning games with rural student participants, we advocat...
Pervasive sensors in the home have a variety of applications including energy minimization, activity monitoring for elders, and tutors for household tasks such as cooking. Many of the common sensors today are binary, e.g. IR motion sensors, door close sensors, and floor pressure pads. Predicting user behavior is one of the key enablers for applicat...
Advances in Location-Based Services (LBS) are opening opportunities for using the location of people, places, and things to augment or streamline interaction. While computers work with physical locations like latitude and longitude directly, people usually think and speak in terms of places, which adds personal, environmental and social meaning to...