Andreas Paepcke

Stanford University | SU · Department of Computer Science

About

149
Publications
31,981
Reads
8,007
Citations


Publications (149)
Article
Full-text available
Committing to a major is a fateful step in an undergraduate education, yet the relationship between courses taken early in an academic career and ultimate major issuance remains little studied at scale. Using transcript data capturing the academic careers of 26,892 undergraduates enrolled at a private university between 2000 and 2020, we describe e...
Preprint
Committing to a major is a fateful step in an undergraduate education, yet the relationship between courses taken early in an academic career and ultimate major selection remains little studied at scale. Using transcript data capturing the academic careers of 26,892 undergraduates enrolled at a private university between 2000 and 2020, we describe...
Article
Full-text available
Researchers have investigated the demography and styles of engagement of those who enroll in MOOCs but have lent little attention to how learners navigate MOOCs’ ambiguity as academic certifications. Analyzing semi-structured interviews with 60 people who devoted substantial time to at least one MOOC between 2014–2017, we find that people use MOOCs...
Preprint
Student course reviews are rarely considered as research instruments, yet their ubiquity makes them potentially powerful tools for education data science. To illustrate this potential we utilize a corpus of 11,255 reviews of computer science courses submitted by students at a private research university to observe how students appraise their own le...
Conference Paper
The processes through which course selections accumulate into college pathways in US higher education are poorly instrumented for observation at scale. We offer an analytic toolkit, called Via, which transforms commonly available enrollment data into formal graphs that are amenable to interactive visualizations and computational exploration. We expl...
Preprint
Full-text available
Understanding large-scale patterns in student course enrollment is a problem of great interest to university administrators and educational researchers. Yet important decisions are often made without a good quantitative framework of the process underlying student choices. We propose a probabilistic approach to modelling course enrollment decisions,...
Conference Paper
A study deployed the mental health Relational Frame Theory as grounding for an analysis of sentiment dynamics in human-language dialogs. The work takes a step towards enabling use of conversational agents in mental health settings. Sentiment tendencies and mirroring behaviors in 11k human-human dialogs were compared with behaviors when humans inter...
Conference Paper
The software platforms that mediate online learning experiences are the common ground where learning science and computer science intersect. This panel will discuss the affordances of current online learning platforms and lessons learned in using them with students. The goal of the panel is to help learning scientists and computer scientists unders...
Conference Paper
We describe our experience exhibiting a human-size robot in a museum, encouraging visitors to interact with the robot and even program it to perform a sequence of timed poses. At the museum, users' programs were run on a real robot for all to see. The installation attracted and engaged visitors from age two to adult. The most intuitive of our inter...
Article
Full-text available
Assistive mobile manipulators (AMMs) have the potential to one day serve as surrogates and helpers for people with disabilities, giving them the freedom to perform tasks such as scratching an itch, picking up a cup, or socializing with their families.
Article
Organizations rely on data analysts to model customer engagement, streamline operations, improve production, inform business decisions, and combat fraud. Though numerous analysis and visualization tools have been built to improve the scale and efficiency at which analysts can work, there has been little research on how analysis takes place within t...
Conference Paper
The Robots for Humanity project aims to enable people with severe motor impairments to interact with their own bodies and their environment through the use of an assistive mobile manipulator, thereby improving their quality of life. Assistive mobile manipulators (AMMs) are mobile robots that physically manipulate the world in order to provide assis...
Conference Paper
Data quality issues such as missing, erroneous, extreme and duplicate values undermine analysis and are time-consuming to find and fix. Automated methods can help identify anomalies, but determining what constitutes an error is context-dependent and so requires human judgment. While visualization tools can facilitate this process, analysts must oft...
Conference Paper
Full-text available
In our field deployments of mobile remote presence (MRP) systems in offices, we observed that remote operators of MRPs often unintentionally spoke too loudly. This disrupted their local co-workers, who happened to be within earshot of the MRP system. To address this issue, we prototyped and empirically evaluated the effect of sidetone to help opera...
Conference Paper
Though data analysis tools continue to improve, analysts still expend an inordinate amount of time and effort manipulating data and assessing data quality issues. Such "data wrangling" regularly involves reformatting data values or layout, correcting erroneous or missing values, and integrating multiple data sources. These transforms are often diff...
Conference Paper
A fundamental premise of tagging systems is that regular users can organize large collections for browsing and other tasks using uncontrolled vocabularies. Until now, that premise has remained relatively unexamined. Using library data, we test the tagging approach to organizing a collection. We find that tagging systems have three major large scale...
Article
We describe an implementation that has users ‘flick’ notes, images, audio, and video files onto virtual, imaginary piles beyond the display of small-screen devices. Multiple sets of piles can be maintained in persistent workspaces. Two user studies yielded the following: Participants developed mental schemes to remember virtual pile locations, and...
Conference Paper
We developed and tested a heuristic technique for extracting the main article from news site Web pages. We construct the DOM tree of the page and score every node based on the amount of text and the number of links it contains. The method is site-independent and does not use any language-based features. We tested our algorithm on a set of 1120 new...
Conference Paper
Full-text available
This paper explores architectural support for interfaces combining pen, paper, and PC. We show how the event-based approach common to GUIs can apply to augmented paper, and describe additions to address paper's distinguishing characteristics. To understand the developer experience of this architecture, we deployed the toolkit to 17 student teams...
Article
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching signatures for near duplicate detection in large Web crawls. Our spot signatures are designed to favor natural-language portions of Web pages over advertisements and navig...
Article
We review selected technical challenges addressed in our digital library project. Our InfoBus, a CORBA-based distributed object infrastructure, unifies access to heterogeneous document collections and information processing services. We organize search access using a protocol (DLIOP) that is tailored for use with distributed objects. A metadata...
Article
PhotoSpread is a spreadsheet system for organizing and analyzing photo collections. It extends the current spreadsheet paradigm in two ways: (a) PhotoSpread accommodates sets of objects (e.g., photos) annotated with tags (attribute-value pairs). Formulas can manipulate object sets and refer to tags. (b) Photos can be reorganized (tags and location...
Data
Appendix 1: information gain. This appendix describes the Information Gain algorithm in detail with examples.
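As a rough illustration of the kind of computation the appendix covers (the appendix itself is not reproduced here, so this follows the standard textbook definition rather than the authors' exact formulation), information gain can be sketched in Python as the drop in label entropy after splitting records on one feature. The function names `entropy` and `information_gain`, and the dict-per-record layout, are hypothetical choices made only for this sketch.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting `labels` on rows[i][feature].

    `rows` is a list of dicts (one per record); `labels` holds the class of
    each record in the same order. Both layouts are assumptions of this sketch.
    """
    # Group the class labels by the value each record has for the feature.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the groups after the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

A feature that separates the classes perfectly yields a gain equal to the entropy of the labels, while a feature that is independent of the classes yields a gain of zero.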
Data
Appendix 2: Good-Turing smoothing. This appendix describes the Good-Turing smoothing algorithm in detail with examples.
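For orientation only, here is a minimal Python sketch of the basic Good-Turing estimate, assuming the textbook form in which an item seen r times receives the adjusted count (r + 1) * N_{r+1} / N_r, where N_r is the number of distinct items seen exactly r times, and the probability mass reserved for unseen items is N_1 / N. The appendix may use a refined variant; the function name `good_turing` and the fallback to the raw count when N_{r+1} is zero are simplifying assumptions of this sketch.

```python
from collections import Counter


def good_turing(counts):
    """Return (unseen_mass, adjusted_counts) for a dict of item -> observed count.

    Basic Good-Turing estimate; no smoothing of the frequency-of-frequency
    values is attempted (an assumption made to keep the sketch short).
    """
    total = sum(counts.values())
    freq_of_freq = Counter(counts.values())  # N_r: items seen exactly r times
    unseen_mass = freq_of_freq.get(1, 0) / total  # N_1 / N
    adjusted = {}
    for item, r in counts.items():
        n_r = freq_of_freq[r]
        n_next = freq_of_freq.get(r + 1, 0)
        # Fall back to the raw count when N_{r+1} is zero (simplification).
        adjusted[item] = (r + 1) * n_next / n_r if n_next else float(r)
    return unseen_mass, adjusted
```

With many singleton observations the unseen mass grows, which matches the intuition that a sample full of items seen only once probably missed many others.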
Article
Full-text available
We sketch our species identification tool for palm sized computers that helps knowledgeable observers with census activities. An algorithm turns an identification matrix into a minimal length series of questions that guide the operator towards identification. Historic observation data from the census geographic area helps minimize question volume....
Conference Paper
Full-text available
Using gaze information as a form of input poses challenges based on the nature of eye movements and how we humans use our eyes in conjunction with other motor actions. In this paper, we present three techniques for improving the use of gaze as a form of input. We first present a saccade detection and smoothing algorithm that works on real-time stre...
Article
This document defines the data model as well as the syntax and semantics of the formula language employed by PhotoSpread. It is inspired by Excel with specialized and enriched functionality for managing and tagging large photo collections in a spreadsheet. PhotoSpread allows for capturing, storing, arranging, manipulating, and querying arbitrary...
Conference Paper
We present a practical technique for pointing and selection using a combination of eye gaze and keyboard triggers. EyePoint uses a two-step progressive refinement process fluidly stitched together in a look-press-look-release action, which makes it possible to compensate for the accuracy limitations of the current state-of-the-art eye gaze trackers...
Conference Paper
We present several gaze-enhanced scrolling techniques developed as part of continuing work in the GUIDe (Gaze-enhanced User Interface Design) project. This effort explores how gaze information can be effectively used as input that augments keyboard and mouse. The techniques presented below use gaze both as a primary input and as an augmented input...
Article
With advances in digital pens, there has been recent interest in supporting augmented paper in both research and commercial applications. This paper introduces the iterative design of a toolkit for event-driven programming of augmented paper applications. We evaluated the toolkit with 69 students (17 teams) in an external university class, gath...
Conference Paper
This paper describes a communication-minded visualization called progressive multiples that supports both the forensic analysis and presentation of multidimensional event data. We combine ideas from progressive disclosure, which reveals data to the user on demand, and small multiples (21), which allows users to compare many images at once. Sets of...
Article
A Stanford University research group explores how design alternatives for tabletop interfaces can impact group dynamics to promote effective teamwork. They built and evaluated a series of novel prototypes that explore multi-user coordination policies and cooperative gesturing, encouraging equitable participation in educational tasks and supporting...
Article
NameSet is a system that translates a set of geographic coordinates into a textual name based on the geographic regions where the coordinates occur. One possible application of NameSet is to concisely present the geographical scope of a set of geo-referenced observations to a human user. Another application is to generate text to depict a set of co...
Conference Paper
Full-text available
Biological studies rely heavily on large collections of species observations. All of these collections cannot be compiled by biology professionals alone. Skilled amateurs can assist by contributing observations they make in the field. The challenge with such contributions is their potentially questionable quality. We present our PDA-based applicati...
Article
We describe the design and performance of WebBase, a tool for Web research. The system includes a highly customizable crawler, a repository for collected Web pages, an indexer for both text and link-related page features, and a high-speed content distribution facility. The distribution module enables researchers world-wide to retrieve pages from We...
Conference Paper
Full-text available
Through a study of field biology practices, we observed that biology fieldwork generates a wealth of heterogeneous information, requiring substantial labor to coordinate and distill. To manage this data, biologists leverage a diverse set of tools, organizing their effort in paper notebooks. These observations motivated ButterflyNet, a mobile captur...
Conference Paper
Multi-user, touch-sensing input devices create opportunities for the use of cooperative gestures - multi-user gestural interactions for single display groupware. Cooperative gestures are interactions where the system interprets the gestures of more than one user as contributing to a single, combined command. Cooperative gestures can be used to enha...
Conference Paper
Full-text available
We explore how the placement of control widgets (such as menus) affects collaboration and usability for co-located tabletop groupware applications. We evaluated two design alternatives: a centralized set of controls shared by all users, and separate per-user controls replicated around the borders of the shared tabletop. We conducted this evaluation...
Conference Paper
We describe an implementation that has users 'flick' notes, images, audio, and video files onto virtual piles beyond the display of small-screen devices. This scheme allows PDA users to keep information close at hand without sacrificing valuable screen real estate. Our approach takes advantage of human spatial memory capabilities. It also obviates...
Conference Paper
Interactive tables can enhance small group colocated collaborative work in many domains. One application enabled by this new technology is copresent, collaborative search for digital content. For example, a group of students could sit around an interactive table and search for digital images to use in a report. We have developed TeamSearch, an appl...
Conference Paper
We examined the strengths and weaknesses of three diverse scroll control modalities for photo browsing on personal digital assistants (PDAs). This exploration covered nine alternatives in a design space that consisted of three visual interfaces and three control modalities. The three interfaces were a traditional thumbnail layout, a layout that pl...
Article
In 1994 the National Science Foundation launched its Digital Libraries Initiative (DLI). The choice of combining the word digital with library immediately defined three interested parties: librarians, computer scientists, and publishers. The eventual impact of the Initiative reached far beyond these three groups. The Google search engine emerged fr...
Conference Paper
Full-text available
Our system suggests likely identity labels for photographs in a personal photo collection. Instead of using face recognition techniques, the system leverages automatically available context, like the time and location where the photos were taken. Based on time and location, the system automatically computes event and location groupings of photos. As...
Article
We introduce Multi-User Piles Across Space, a technique that allows co-located individuals with PDAs to share and organize information items (e.g., photos, text, sound clips, etc.) by placing these items in shared, imaginary off-screen piles. This technique relies on human capacities to remember spatial layouts, and allows small co-located groups w...
Conference Paper
Given time and location information about digital photographs we can automatically generate an abundance of related contextual metadata, using off-the-shelf and Web-based data sources. Among these are the local daylight status and weather conditions at the time and place a photo was taken. This metadata has the potential of serving as memory cues a...
Conference Paper
We describe PhotoCompas, a system that utilizes the time and location information embedded in digital photographs to automatically organize a personal photo collection. PhotoCompas produces browseable location and event hierarchies for the collection. These hierarchies are created using algorithms that interleave time and location to produce an org...
Conference Paper
We developed two browsers to support large personal photo collections on PDAs. Our first browser is based on a traditional, folder-based layout that utilizes either the user's manually created organization structure, or a system-generated structure. Our second browser uses a novel interface that is based on a vertical, zoomable timeline. This timel...
Conference Paper
We developed two browsers to support large personal photo collections on PDAs. Our first browser is based on a traditional, folder-based layout that utilizes either the user's manually created organization structure, or a system-generated structure. Our second browser uses a novel interface that is based on a vertical, zoomable timeline. This timel...
Article
Full-text available
Given location information on digital photographs, we can automatically generate an abundance of photo-related metadata using off-the-shelf and web-based data sources. These metadata can serve as additional memory cues and filters when browsing a personal or global collection of photos.
Article
We attach an inexpensive pressure sensor to the side of a personal digital assistant and use it as three input devices at once. Users can squeeze the device to provide near-continuous input to applications. At the same time the drivers interpret a sudden full squeeze as the push of a virtual button. A user’s sudden pressure release while sq...
Conference Paper
We describe LOCALE, a system that allows cooperating information systems to share labels for photographs. Participating photographs are enhanced with a geographic location stamp - the latitude and longitude where the photograph was taken. For a photograph with no label, LOCALE can use the shared information to assign a label based on other phot...
Article
The portion of web traffic attributed to dynamic web content is substantial and continues to grow as users expect more personalization and tailored information. Unfortunately, dynamic content is costly to generate. Moreover, traditional web caching schemes are not very effective for dynamically-created pages. In this paper we study two acceleration...
Article
The overall goal of the Stanford Digital Library project is to provide an infrastructure that affords interoperability among heterogeneous, autonomous digital library services. These services include both search services and remotely usable information processing facilities. In this paper, we survey and categorize the metadata required for a div...
Conference Paper
The portion of web traffic attributed to dynamic web content is substantial and continues to grow as users expect more personalization and tailored information. Unfortunately, dynamic content is costly to generate. Moreover, traditional web caching schemes are not very effective for dynamically-created pages. In this paper we study two new accelera...
Article
Full-text available
We developed two browsers to support large photo collections on PDAs. Our first browser uses a traditional, folder-based layout that utilizes either the user's manually created organization structure, or a system-generated structure. Our second browser uses a novel interface that is based on a vertical, zoomable timeline. This Timeline browser...
Article
As a foundation for designing computer-supported photograph management tools, we have been conducting focused experiments. Here, we describe our analysis of how people initially organize collections of familiar images. We asked 26 subjects in pairs to organize 50 images on a common horizontal table. Each pair then organized a different 50-image set...
Conference Paper
We developed two photo browsers for collections with thousands of time-stamped digital images. Modern digital cameras record photo shoot times, and semantically related photos tend to occur in bursts. Our browsers exploit the timing information to structure the collections and to automatically generate meaningful summaries. The browsers differ in h...
Article
Full-text available
We offer an overview of current Web search engine design. After introducing a generic search engine architecture, we examine each engine component in turn. We cover crawling, local Web page storage, indexing, and the use of link analysis for boosting search performance. The most common design and implementation techniques for each of these component...
Conference Paper
We developed two photo browsers for collections with thousands of time-stamped digital images. Modern digital cameras record photo shoot times, and semantically related photos tend to occur in bursts. Our browsers exploit the timing information to structure the collections and to automatically generate meaningful summaries. The browsers differ in h...
Article
In this paper, we introduce the novel concept of a secure interface definition compiler (a "security" compiler, for short). We show how interface designers can declare an application's security requirements as part of the interface definition process, and how a security compiler can automatically generate code that implements security requirements...
Article
We present a design and implementation for displaying and manipulating HTML pages on small handheld devices such as personal digital assistants (PDAs), or cellular phones. We introduce methods for summarizing parts of Web pages and HTML forms. Each Web page is broken into text units that can each be hidden, partially displayed, made fully visible,...
Article
We present a design for displaying and manipulating HTML pages on small handheld devices such as personal digital assistants (PDAs), or cellular phones. We introduce methods for summarizing parts of Web pages. Each page is broken into text units that can each be hidden, partially displayed, made fully visible, or summarized. A variety of methods ar...
Article
We introduce five methods for summarizing parts of Web pages on handheld devices, such as personal digital assistants (PDAs), or cellular phones. Each Web page is broken into text units that can each be hidden, partially displayed, made fully visible, or summarized. The methods accomplish summarization by different means. One method extracts signif...
Conference Paper
We propose a design for displaying and manipulating HTML forms on small PDA screens. The form input widgets are not shown until the user is ready to fill them in. At that point, only one widget is shown at a time. The form is summarized on the screen by displaying just the text labels that prompt the user for each widget's information. The challeng...
Conference Paper
We demonstrate a new browsing technique for devices with small displays such as PDAs or cellular phones. We concentrate on end-game browsing, where the user is close to or on the target page. We make browsing more efficient and easier by Accordion Summarization. In this technique the Web page is first represented as a short summary. The user can th...
Article
Full-text available
The development of novel information applications is reaching an impasse. HTML forms for searching the Web are fine for traditional, form-based interfaces to information. But what if we wish to develop more intuitive interfaces that reach across multiple information sources, or are more specialized for particular sources? For example,...
Conference Paper
PDA access to the World-Wide Web poses a variety of difficulties for users. The small screen quickly renders Web pages confusing and cumbersome to peruse. Inputting information by pen is time consuming and error-prone. The download time for Web material to radio linked devices is still much slower than landline connections. The standard browsing pro...
Article
PDA access to the World-Wide Web poses a variety of difficulties for users. The small screen quickly renders Web pages confusing and cumbersome to peruse. Inputting information by pen is time consuming and error-prone. The download time for Web material to radio linked devices is still much slower than landline connections. The standard browsing pro...
Article
Full-text available
Motivated by our design of an annotation system (Notable), we have developed a framework for annotation systems, based on dimensions of the platform used to create the annotations, the platform used to read them, the annotations themselves, the targets of the annotations, and the correspondence between annotation and target. We demonstrate the fram...
Article
The Stanford Power Browser project addresses the problems of interacting with the World Wide Web through wirelessly connected Personal Digital Assistants (PDAs). These problems include bandwidth limitations, screen real-estate shortage, battery capacity, and the time costs of pen-based search keyword input. As a way to address bandwidth and battery...
Article
In this paper, we study the problem of constructing and maintaining a large shared repository of Web pages. We discuss the unique characteristics of such a repository, propose an architecture, and identify its functional modules. We focus on the storage manager module, and illustrate how traditional techniques for storage and indexing can be tailor...
Article
In the face of small, one or two word queries, high volumes of diverse documents on the Web are overwhelming search and ranking technologies that are based on document similarity measures. The increase of multimedia data within documents sharply exacerbates the shortcomings of these approaches. Recently, research prototypes and commercial experimen...
Article
Digital library mediators allow interoperation between diverse information services. In this paper we describe a flexible and dynamic mediator infrastructure that allows mediators to be composed from a set of modules ("blades"). Each module implements a particular mediation function, such as protocol translation, query translation, or result mergin...
Article
Document sources are available everywhere, both within the internal networks of organizations and on the Internet. Even individual organizations use search engines from different vendors to index their internal document collections. These search engines are typically incompatible in that they support different query models and interfaces, they do...
Conference Paper
Full-text available
The Notable annotation system enables users to annotate paper documents using handheld devices in a mobile environment. This paper describes the design issues and solutions that arose in creating Notable, with a particular focus on design challenges at the intersection of annotations and handheld technology. Novel design strategies include separati...
Article
Includes four articles that discuss: (1) search middleware, or software packages that allow access to information sources for digital libraries; (2) film archives and building online collections of data for use in film research and teaching; (3) digital archives; and (4) a virtual union catalog for the University of California. (LRW)