IEEE Multimedia

Published by Institute of Electrical and Electronics Engineers
Online ISSN: 1070-986X
Publications
Article
In this issue, Art Beat looks at networked performance, featuring Turbulence, a net art portal. Turbulence's Helen Thorington and Jo-Ann Green discuss their blog, Networked_Performance, which provides an extensive archive of networked performances. We also profile performance artist Barbara Campbell's 1,001 Nights Cast, a durational networked performance project. Multimedia researchers interested in Web 2.0 might want to help develop a template or standard that lets performance artists share and search their work on the Web. Such work could also ensure that certain software, when it fades away, is translatable or included in new versions of software so that viewing art isn't contingent on outdated file formats.
 
Diver user interface. The overview window (bottom left) shows the full video source. The magnified viewing window (upper left) shows a selected image from the scene. The annotation window, or Dive worksheet (right), lets users comment on the frames or path movies they create.
Panoramic video as input for Diver. The user gets a 360-degree view of the scene around the camera's location.  
Graphical representation of the .dvc file format.  
A Dive from the WebDiver site, as it appears in a browser window.  
Article
The digital interactive video exploration and reflection (Diver) system lets users create virtual pathways through existing video content using a virtual camera and an annotation window for commentary. Users can post their Dives to the WebDiver server system to generate active collaboration, further repurposing, and discussion. Although our current work focuses on video records in learning research and educational practices, Diver can aid collaborative analysis of a broad array of visual data records, including simulations, 2D and 3D animations, and static works of art, photography, and text. In addition to the social and behavioral sciences, substantive application areas include medical visualization, astronomic data or cosmological models, military satellite intelligence, and ethnology and animal behavior. Diver-style user-centered video repurposing might also prove compelling for popular media with commercial application involving sports events, movies, television shows, and video gaming. Future technical development includes possible enhancements to the interface to support simultaneous display of multiple Dives on the same source content, a more fluid two-way relation between desktop Diver and WebDiver, and solutions to the current limitations on displaying and authoring time/space cropped videos in a browser context. These developments support the tool's fundamentally collaborative, communication-oriented nature.
 
Article
The mark of a good bus design is to be general-purpose enough so that its use expands to applications beyond those for which it was originally intended. Compare the narrowly defined EISA (Extended Industry Standard Architecture) specification and the now-defunct MicroChannel bus to the vibrant IEEE 1394 standard, which has numerous specifications built around it already, and many more under development. This article highlights some of the many applications made possible, or made better, by expanding on the 1394 standard. It also directs you to the key specifications that have developed for each of the application areas.
 
Article
People, not technology, have become the focus of current interface design. Multimedia interface designers try to take advantage of human senses to ease our communication with one another and with the computer. This survey of current work highlights the complexity facing them in their task.
 
Article
For part 1, see ibid., vol. 8, no. 4, pp. 82-88 (2001). This article is the second part of a two-part series on SMIL 2.0, the newest version of the World Wide Web Consortium's Synchronized Multimedia Integration Language. Part 1 looked in detail at various aspects of the SMIL specification and the underlying SMIL timing model. This part looks at simple and complex examples of SMIL 2.0's use and compares SMIL with other multimedia formats. We focus on SMIL's textual structure in its various implementation profiles.
 
Article
The World Wide Web Consortium's Synchronized Multimedia Integration Language format for encoding multimedia presentations for delivery over the Web is a little-known but widely used standard. First released in mid-1998, SMIL has been installed on approximately 200,000,000 desktops worldwide, primarily because of its adoption in RealPlayer G2, QuickTime 4.1, and Internet Explorer 5.5. In August 2001, the W3C released a significant update with SMIL 2.0. In a two-part report on SMIL 2.0, the author will discuss the basics of SMIL 2.0 and compare its features with other formats. This article will focus on SMIL's basic concepts and structure. Part two, in the January-March 2002 issue, will look at detailed examples of SMIL 2.0, covering both simple and complex examples. It'll also contrast the facilities in SMIL 2.0 and MPEG-4.
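Because a SMIL presentation is just an XML document, the flavor of its textual structure can be sketched with Python's standard library. The media file names and timing values below are invented placeholders, not drawn from the article; only the SMIL 2.0 Language-profile namespace comes from the specification.

```python
# A minimal SMIL 2.0 presentation assembled with the standard library.
# File names (intro.jpg, narration.mp3) and durations are illustrative only.
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/2001/SMIL20/Language"
ET.register_namespace("", SMIL_NS)

smil = ET.Element(f"{{{SMIL_NS}}}smil")
body = ET.SubElement(smil, f"{{{SMIL_NS}}}body")
# <par> plays its children in parallel; <seq> would play them in sequence.
par = ET.SubElement(body, f"{{{SMIL_NS}}}par")
ET.SubElement(par, f"{{{SMIL_NS}}}img", src="intro.jpg", dur="5s")
ET.SubElement(par, f"{{{SMIL_NS}}}audio", src="narration.mp3")

doc = ET.tostring(smil, encoding="unicode")
print(doc)
```

Timing containers like `<par>` and `<seq>` are the heart of the SMIL timing model the two-part series discusses; everything else is attributes on media references.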
 
Article
Web 2.0 is an area that's gained much attention recently, especially with Google's acquisition of YouTube. Given the strong focus on media in many Web 2.0 applications, from a multimedia perspective the question arises as to what multimedia (research) and Web 2.0 have in common, where the two fields meet, and how they can benefit each other.
 
Article
At no time in history has multimedia technology had the prospect of making a stronger impact on cultures. Multimedia is everywhere: regardless of where we are, we can access multimedia originating in many parts of the world as easily as we can create content to be shared with people from different cultures across the globe. The gap between the haves and the have-nots, however, is growing (the "we" above refers to a very few). Wealthier countries are using multimedia to reinforce physical boundaries (for example, requiring fingerprints and photographs at airports), and multimedia content from only a few cultures is proliferating.
 
Article
The authors review a book that offers a human-centered approach to next-generation multimedia database retrieval.
 
Article
It seems clear that many businesses believe there's a profit to be had in IPTV and mDTV innovations, given the number of companies that had booths and participated in the National Association of Broadcasters 2006 conference. And these businesses are investing on a variety of levels, with different approaches to how they think consumers will buy into the technology - whether streaming to a laptop or cell phone, or watching on a TV in their living room.
 
Article
In 2005, the publisher Hubert Burda and the investor Yossi Vardi initiated the Digital Lifestyle Day to connect key players of media-related enterprises with founders of new and hot companies, bringing together technology with arts and design. The first event was a success and Burda and Vardi have helped evolve the conference into one of the hottest and most prestigious new media events of the year in Europe, with a visibility that now reaches other continents. In a familiar atmosphere, business suits meet t-shirts and sneakers to discuss the challenges and opportunities of tomorrow's digital lifestyle. In 2007, DLD became Digital, Life, Design (see http://www.dld-conference.com/) but maintained its goal to provide a small but very nice event for connecting with others and having intense discussions, all in an inspiring atmosphere. This year, the experts met under the slogan "Uploading the 21st Century." The participants discussed the challenges, consequences, and opportunities of the increasing digitization in our daily life. This article presents insights formed from DLD 2008. A separate article—see this issue's Visions and Views, "Digital Lifestyle 2020"—provides a snapshot of views and opinions from other presenters and participants.
 
Article
Our future lives will be filled with digital experiences that complement our real, physical world. This article presents visions and views on what will make up our future digital life. What are the key drivers that will enable our future digital life, and what is their role? Where will our digital life take place - in our homes, on the move, in our cities? Many applications and systems will make possible a variety of digital services and content that reach the end consumer anywhere, at any time. We will continue to understand and develop the potential of communicating, meeting, joining, and acting in our future digital life.
 
Article
This paper describes the Media Value Chain Ontology we have specified, a semantic representation of intellectual property along the value chain that is on its way to being standardized as MPEG-21 Part 19. The model defines the minimal set of kinds of intellectual property, the roles of the users interacting with them, and the relevant actions with regard to intellectual property law. In addition, it lays out a basis for authorizations along the chain, and the model is ready to manage class instances representing real objects and users. The computer representation has been made publicly available so that applications can interoperate around this common shared basis.
 
Article
To foster multimedia e-commerce, MPEG is developing a new part of the MPEG-21 standard that specifies the creation and delivery of events related to peer usage of digital items. This overview of MPEG's work on event reporting describes the standard's new part and positions it in relation to other efforts.
 
Article
The access devices of today are becoming increasingly sophisticated. Thanks to multimedia, communication is much more widespread and therefore more powerful. However, we face a serious problem of heterogeneity in our terminals, in our networks, and in the people who ultimately consume and interact with the information presented to them. In this article, we focus on Part 7 of the MPEG-21 standard (ISO/IEC 21000-7), which we refer to as Digital Item Adaptation (DIA). At the time of this writing, the DIA specification is at the penultimate stage of Final Committee Draft (2003); final approval is scheduled for December 2003. The general DIA concept is that a Digital Item is subject to both a resource adaptation engine and a descriptor adaptation engine, which together produce the adapted Digital Item. Note that the standard specifies only the tools that assist with the adaptation process, not the adaptation engines themselves. DIA specifies the following natural environment description tools: location and time, and audiovisual environment. DIA offers a rich set of tools to assist with the adaptation of Digital Items: standardized tools for the description of usage environments, tools to create high-level descriptions of the bitstream syntax to achieve format-independent adaptation, tools that assist in making tradeoffs between feasible adaptation operations and constraints, tools that enable low-complexity adaptation of metadata, and tools for session mobility. Moving forward, the MPEG-21 committee is considering amendments to the specification - for instance, tools that provide further assistance with modality conversion and tools that relate more specifically to the adaptation of audio and graphics media. Also under active consideration is how to express the rights that a User has to perform adaptation and how this expression fits into a system that governs those rights.
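As the abstract notes, DIA standardizes the descriptions, not the engines. A hypothetical engine consuming such descriptions might look like this minimal sketch; the dictionary keys and sample values are illustrative stand-ins, not the actual DIA schema.

```python
# Hypothetical adaptation-decision sketch (field names are invented, not DIA's):
# given variants of a Digital Item's resource and a usage-environment
# description, pick the best variant the terminal can actually consume.
def adapt(variants, environment):
    """variants: list of dicts with width, height, format, bitrate (kbps)."""
    feasible = [v for v in variants
                if v["format"] in environment["formats"]
                and v["width"] <= environment["width"]
                and v["height"] <= environment["height"]]
    # Among feasible variants, prefer the highest bitrate (best quality).
    return max(feasible, key=lambda v: v["bitrate"]) if feasible else None

variants = [
    {"width": 1920, "height": 1080, "format": "mpeg4", "bitrate": 4000},
    {"width": 640,  "height": 480,  "format": "mpeg4", "bitrate": 800},
    {"width": 320,  "height": 240,  "format": "h263",  "bitrate": 200},
]
pda = {"width": 640, "height": 480, "formats": {"mpeg4", "h263"}}
print(adapt(variants, pda))  # picks the 640x480 MPEG-4 variant
```

The standard's contribution is the interoperable vocabulary for `environment`-style descriptions; the selection policy above is exactly the part DIA leaves to implementers.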
 
Article
The Internet has spawned a revolution in the way people distribute content and access services. At the same time, the availability of broadband and wireless networks has increased, as have the capability and portability of computing and consumer electronic devices. These factors have fueled the development of new technologies to automate, manage, and secure content flow and service access over the Internet. This paper deals with the recently approved ISO standard, MPEG-21 Rights Expression Language (REL). This language is precise, flexible, extensible, and rich in expressing rights. Thus, it can support reliable, flexible, and cost-effective interoperable digital rights management (DRM) systems and applications for electronic commerce and enterprise management of content and services. It is an international standard for expressing and interpreting rights for using and distributing content, resources, and services. As an enabling technology for interoperable DRM, its adoption by industry and incorporation into products certainly takes time. The challenge is to proliferate the REL's adoption across many different DRM systems as well as conditional access and authorization systems. Moreover, the REL must pervade not only entertainment but also many other applications, such as enterprise, medical information, and even privacy protection.
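The REL itself is an XML language; purely to illustrate the kind of grant structure it expresses (a principal holds a right over a resource under conditions), here is a hedged Python model with invented field names, not the REL schema.

```python
# Illustrative only: models the REL's core grant idea (principal, right,
# resource, condition) as plain dicts, with a validity window as the condition.
from datetime import date

def authorized(grants, principal, right, resource, today):
    """True if some grant licenses this principal/right/resource today."""
    return any(
        g["principal"] == principal
        and g["right"] == right
        and g["resource"] == resource
        and g["not_before"] <= today <= g["not_after"]
        for g in grants
    )

grants = [{
    "principal": "alice",
    "right": "play",
    "resource": "urn:song:42",
    "not_before": date(2004, 1, 1),
    "not_after": date(2004, 12, 31),
}]
print(authorized(grants, "alice", "play", "urn:song:42", date(2004, 6, 1)))
print(authorized(grants, "alice", "print", "urn:song:42", date(2004, 6, 1)))
```

Interoperability comes from every DRM system interpreting the same grant semantics the same way; the real REL adds delegation, issuers, and far richer condition types.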
 
The VideoAnnEx MPEG-7 annotation tool consists of four regions: (1) video playback, (2) shot annotation, (3) views panel, and (4) region annotation (not shown).  
User client interface on the Palm PDA for three personalized video scenarios: (a) video-on-demand with interactive links, (b) summarized video based on preference topics and time, and (c) summarized video based on query keywords and time.
Figure A. Example of MPEG-7 video segment description.
Article
As multimedia content has proliferated over the past several years, users have begun to expect that content be easily accessed according to their own preferences. One of the most effective ways to do this is through using the MPEG-7 and MPEG-21 standards, which can help address the issues associated with designing a video personalization and summarization system in heterogeneous usage environments. This three-tier architecture provides a standards-compliant infrastructure that, in conjunction with our tools, can help select, adapt, and deliver personalized video summaries to users. In extending our summarization research, we plan to explore semantic similarities across multiple simultaneous news media sources and to abstract summaries for different viewpoints. Doing so will allow us to track a semantic topic as it evolves into the future. As a result, we should be able to summarize news repositories into a smaller collection of topic threads.
 
Article
The H.264 video coding standard was created to support the next generation of multimedia applications. H.264 improves performance over previous video coding standards, such as MPEG-2, H.263, and MPEG-4 part 2, by applying more sophisticated techniques for intraframe and interframe prediction, transform coding, entropy coding, and so on. The H.264 standard is unique in its broad applicability across a range of bit rates and video resolutions and is gaining momentum in its adoption by industry. Hari Kalva's article reviews H.264, highlights its unique features, and describes its applicability to emerging applications such as IPTV.
 
The general architecture of the proposed system consists of two steps: (a) an offline step where the map graph descriptor is created and stored in the database, and (b) an online step where the user's query is processed to retrieve the relevant results.
Results for manual sketches.
The road network extraction procedure: (a) initial urban map obtained from http://maps.google.com/ (© 2009 Google), (b) semantic information, and (c) extracted road-network structure.
Results for CAD-based queries.
The application GUI using maps obtained from http://maps.google.com (© 2009 Google).
Article
This paper presents a novel framework for urban map search. The search capabilities of the existing GIS systems are restricted to text-based search, neglecting significant topological and semantic information. We propose a framework that aims to extend the search capabilities by offering sketch-based search. Initially, the urban maps are processed in an offline step in order to extract the topological information in the form of an attributed graph. In the online step, the user queries the system by sketching the desired network structure. The search algorithm is based on attributed graph matching of the query graph and the attributed graphs of the urban maps and allows both partial and global matching of the query. Experimental results illustrate the excellent performance of the system for intuitive search in maps.
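The attributed-graph matching at the core of the search can be illustrated with a deliberately tiny brute-force version. This is not the authors' algorithm, which must scale to real map graphs; it only shows what "embed the query graph into a map graph, respecting node attributes and edges" means.

```python
# Toy attributed-graph matcher: find every mapping of the query's nodes onto
# distinct map nodes such that attributes agree and each query edge exists in
# the map (partial matching: extra map nodes/edges are simply ignored).
from itertools import permutations

def match(query_nodes, query_edges, map_nodes, map_edges):
    """Nodes: {id: attribute}. Edges: set of undirected (id, id) pairs."""
    q_ids = list(query_nodes)
    hits = []
    for cand in permutations(map_nodes, len(q_ids)):
        mapping = dict(zip(q_ids, cand))
        attrs_ok = all(query_nodes[q] == map_nodes[mapping[q]] for q in q_ids)
        edges_ok = all((mapping[a], mapping[b]) in map_edges or
                       (mapping[b], mapping[a]) in map_edges
                       for a, b in query_edges)
        if attrs_ok and edges_ok:
            hits.append(mapping)
    return hits

# Query: two junctions joined by a road. Map: a triangle of junctions.
q_nodes, q_edges = {"x": "junction", "y": "junction"}, {("x", "y")}
m_nodes = {1: "junction", 2: "junction", 3: "junction"}
m_edges = {(1, 2), (2, 3), (1, 3)}
print(len(match(q_nodes, q_edges, m_nodes, m_edges)))  # 6 ordered embeddings
```

The factorial cost of enumerating permutations is why practical systems use heuristic or index-based attributed-graph matching instead.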
 
Article
This article describes a framework for high-performance Web 3.0 computer game experiences. The framework is designed to facilitate multimedia for Web browsers, heads-up displays, 3D authoring, and remote play.
 
Block diagram of the 3G-324M system. The 3GPP 3G-324M technical specification defines a video-telephony service based on H.324M as follows:
- Using the ITU-T H.324 umbrella recommendation and its annex C. This defines the overall video-telephony service, including H.223 and H.245.
- Using annexes A and B of ITU-T H.223 to enhance the framing facilities of the multiplexer in error-prone conditions.
- Using the mobile command and control facilities of H.245.
- Using specific audio and video codecs. For example, it mandates the Global System for Mobile Communication Adaptive Multirate (GSM-AMR) audio codec and the H.263 video codec. Other audio and video codecs are pro
Article
As mobile operators worldwide migrate to third-generation (3G) networks, conversational video-telephony services are becoming a key differentiator between new 3G offerings and existing 2G/2.5G services. Although it's possible to have limited video-based services - such as a multimedia messaging service - that deliver pictures and video clips over 2.5G services, these are delay-insensitive applications that could run over a packet-based wireless network like general packet radio service (GPRS) or code division multiple access (CDMA)'s 1XRTT. For delay-sensitive applications such as conversational video telephony, present 3G packet bearers are inadequate, and the Third Generation Partnership Project (3GPP; http://www.3gpp.org) mandates using the 3G bandwidth-guaranteed circuit-switched bearer and the 3G-324M system. The 3G-324M system is a derivative of the International Telecommunication Union (ITU) H.324 protocol standard for low-bitrate multimedia communication, which ITU-T developed for the public switched telephone network (PSTN). This article describes the 3G-324M system, which has been adopted by both 3GPP and 3GPP2 (http://www.3gpp2.org), as well as its H.324 roots.
 
Various meshes and their resulting parameterization using the technique described in this paper: Hat2 (top left), Ear (top right), Top-fan (bottom left), Mountain (bottom right).
Article
Digital geometry is a new data type for multimedia applications. To foster the use of 3D geometry, we introduce a piecewise linear parameterization of 3D surfaces that we can use for texture mapping, morphing, remeshing, and geometry imaging. Our method guarantees one-to-one mapping without foldovers in a geometrically intuitive way.
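The article's method is its own; a classic related construction that also guarantees a foldover-free piecewise-linear parameterization is Tutte's barycentric embedding, sketched here on a toy mesh. The assumptions are uniform (rather than shape-aware) weights and a boundary pinned to a convex polygon.

```python
# Tutte-style barycentric embedding sketch: boundary vertices are pinned to a
# convex polygon; each interior vertex is iterated to the average of its
# neighbors (Gauss-Seidel), which converges to a bijective flat layout.
def tutte(neighbors, boundary, iters=200):
    pos = dict(boundary)                       # pinned boundary positions
    interior = [v for v in neighbors if v not in pos]
    for v in interior:
        pos[v] = (0.0, 0.0)                    # arbitrary starting point
    for _ in range(iters):
        for v in interior:
            xs = [pos[n][0] for n in neighbors[v]]
            ys = [pos[n][1] for n in neighbors[v]]
            pos[v] = (sum(xs) / len(xs), sum(ys) / len(ys))
    return pos

# One interior vertex "c" linked to the three corners of a triangle boundary.
neighbors = {"c": ["a", "b", "d"], "a": ["b", "d", "c"],
             "b": ["a", "d", "c"], "d": ["a", "b", "c"]}
boundary = {"a": (0.0, 0.0), "b": (1.0, 0.0), "d": (0.0, 1.0)}
uv = tutte(neighbors, boundary)
print(uv["c"])  # settles at the centroid (1/3, 1/3)
```

The resulting (u, v) coordinates are exactly the kind of layout used for texture mapping and geometry imaging; shape-preserving methods replace the uniform average with carefully chosen convex weights.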
 
Article
Virtual reality systems use digital models to provide interactive viewing. We present a 3D digital video system that attempts to provide the same capabilities for actual performances such as dancing. Recreating the original dynamic scene in 3D, the system allows photorealistic interactive playback from arbitrary viewpoints using video streams of a given scene from multiple perspectives.
 
Virtual space on a large screen.
Three different environments for conversation: (a) FTF, (b) InPerson, and (c) FreeWalk.
Article
A meeting environment for casual communication in a networked community, FreeWalk provides a 3D common area where everyone can meet and talk freely. FreeWalk represents participants as 3D polygon pyramids, on which their live video is mapped. Voice volume decreases in proportion to the distance between sender and receiver. For evaluation, we compared communications in FreeWalk to a conventional desktop videoconferencing system and a face-to-face meeting.
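The abstract doesn't give FreeWalk's exact attenuation formula; one plausible distance-based rule of the kind it describes is a linear falloff between a full-volume radius and an audibility limit (both radii here are assumed values).

```python
# Illustrative distance-based voice attenuation, not FreeWalk's actual code:
# full volume inside an inner radius, silence beyond an outer radius, and a
# linear falloff in between.
def voice_gain(distance, full_volume_radius=1.0, audible_radius=10.0):
    if distance <= full_volume_radius:
        return 1.0
    if distance >= audible_radius:
        return 0.0
    return (audible_radius - distance) / (audible_radius - full_volume_radius)

print(voice_gain(0.5))   # 1.0  (right next to the speaker)
print(voice_gain(5.5))   # 0.5  (halfway out)
print(voice_gain(12.0))  # 0.0  (out of earshot)
```

A rule like this is what makes spatial audio support side conversations: groups standing apart simply stop hearing each other.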
 
Our multispectral system: (a) internal view and (b) external view.
The spectral model of the acquisition process in a multispectral system.
One output cell of a perceptron.
Article
A stereoscopic system based on a multispectral camera and an LCD projector uses multispectral information for 3D object reconstruction. By linking 3D points to a curve representing the spectral reflectance, the system gives a physical representation of the matter that's independent of illuminant, observer, and acquisition devices.
 
Article
Supervision and control of wide area transportation networks requires continuous monitoring of large data sets. Two factors complicate the process: data items are spread over a wide geographic area, but are reciprocally influenced through network links, and data types attached to network nodes belong to different categories. We describe a visualization environment that tests the joint use of multiple presentation modes, such as 3D graphics, color, and windowing, to address both factors.
 
Article
Advances in graphic engines and software tools have facilitated the development of visual interfaces based on 3D virtual environments (VEs). These interfaces use interactive 3D graphics to represent visual and spatial information and allow natural interaction with direct object manipulation. Particularly in the training field, interactive 3D graphics offers effective, near-real-world representations, supporting learning-by-doing and case-based reasoning approaches. Although researchers have proposed and practiced many development guidelines on 2D graphical user interfaces, few contributions have addressed the systematic development of user interfaces based on 3D graphics and their possible extension to other media. We've addressed this problem in the construction of several VEs and we've organized our experience in a set of guidelines. We demonstrate their use by describing a virtual training environment called VECWIT (Virtual Environments for Construction Workers' Instruction and Training) that we developed to test the suitability of a VE as a complementary tool supporting education and training for construction workers' safety.
 
Article
MPEG-4's complicated format makes developing scenes from scratch all but impossible for novice users. By converting MPEG-4's text-based description into graphical form, the authors' proposed tool exploits all of MPEG-4's 3D functionalities while easing the authoring burden.
 
Interface comparison.
(a) In the camera image, the blue rectangle indicates the head, the red circle indicates hand one, and the green cross indicates hand two. (b) The system assigns observations to one of the four models depending on their probabilities. In this image, the blue is the head, the red blob is hand one, the green blob is hand two, the gray area is discarded, and the white pixels are ignored in Expectation Maximization.
Users can design more complex sketches such as (a) a living room or (b) an office using Masterpiece.
User attempting to perform similar operations using (a) the Masterpiece interface, (b) a haptic glove interface, and (c) an air mouse interface.
Article
Virtual reality interfaces can immerse users into virtual environments from an impressive array of application fields, including entertainment, education, design, and navigation. However, history teaches us that no matter how rich the content is from these applications, it remains out of reach for users without a physical way to interact with it. Multimodal interfaces give users a way to interact with the virtual environment (VE) using more than one complementary modality. Masterpiece (which is short for multimodal authoring tool with similar technologies from European research utilizing a physical interface in an enhanced collaborative environment) is a platform for a multimodal natural interface. We integrated Masterpiece into a new authoring tool for designers and engineers that uses 3D search capabilities to access original database content, supporting natural human-computer interaction.
 
Thule whalebone house (QiLe-1) on Bathurst Island, Nunavut: (a) photograph of the archeological site and (b) computer-aided design (CAD) drawing.  
Computer reconstruction of the Thule whalebone house showing the ridgepole design.  
Computer reconstruction of the interior of a Thule whalebone house.  
Article
In this article, the authors share an interesting use of the latest laser-scanning-based 3D imaging technology for reconstructing a Thule whalebone house. All major components (such as data capture, data modeling, display, and interaction) are covered to demonstrate how archaeological research can benefit from this new technology for the purposes of testing and education. Although this article focuses on building the skeletal model of a whalebone house, it also provides a stepping-stone for researchers, engineers, architects, and archaeologists who are interested in virtual reality. Interactive digital media is becoming one of the hot areas in the multimedia community, aiming to provide users with an immersive experience while consuming media.
 
Article
Regular preemptive screenings are essential for catching lung disease early. This article describes a technique that creates a 3D reconstruction of the ribcage without expensive scanners. One of the major uses of this 3D reconstruction software would be as a medical visualization tool that shows the real shape of a ribcage. This 3D reconstruction method could be combined with diagnosis software to help detect lung cancer, to generate 3D size estimates, and to improve visualization of affected regions.
 
Article
To integrate both video and audio interfaces into a computer, multimedia design required multistandard compression chips. These chips comprise hardware engines for functions that need powerful computing capabilities, as well as processors programmed to regulate data flow inside the chip. The programmability of these chips is restricted, but they afford real-time efficiency. To manipulate, process, and incorporate images effectively, a multimedia system with interactive computing abilities and compression is required.
 
Article
The IEEE P802.11 committee developed the 802.11 Wireless LAN standard to cover wireless networks for fixed, portable, and moving stations within a local area. This standard addresses the need for wireless connectivity to stations, equipment, or automatic machinery that requires rapid deployment and may be portable, handheld, or mounted on moving vehicles. It can function totally wireless or connected to a wired network. Most people familiar with the standard expect to use it in providing wireless networks for personal computers or stations connected to the global wired infrastructure through access points. Now that the 802.11 standard is finally here, it will energize the wireless LAN market and result in the proliferation of low-cost wireless connectivity in the office and home. Study groups are working on higher rates at 2.4 GHz and at 5 GHz for future inclusion into the standard. These higher rates will make it even more practical to employ this standard for multimedia traffic.
 
Article
Editor's Note: This column is about last year's ACM Multimedia Grand Challenge in Florence, Italy, an event that endeavors to connect (academic) researchers more effectively with the realities of the business world. The authors describe the 10 challenges and present the three winning applications. —Frank Nack

For many years, people have consumed more multimedia content than written information. Yet the tools consumers use to find the content they want to see are often based on text. Hence, despite years of vibrant research, the multimedia field has been criticized for a lack of real-world applications [1]. A recent panel discussion on multimedia search had the revealing title: "Multimedia Information Retrieval: What Is It, and Why Isn't Anyone Using It?" [2] With the exception of face recognition and optical character recognition, little of the wonderful technology created by the multimedia community has met with commercial success. Search engines look for multimedia content on the basis of the text around the object. Recommendation engines do better by ignoring the content completely and correlating users' rating data instead. The ACM Multimedia Grand Challenge was designed to bring such commercial needs to the attention of researchers. Seven industrial partners led the Multimedia Grand Challenge by identifying issues they think are important to their business and worth further study.

Key challenges: The industrial partners pointed out that addressing these challenges would open up new business opportunities and create a richer experience for their users. In addition, they said they hoped that describing real-world challenges would allow researchers to focus on projects that have a better chance of success in the marketplace. Thus, the following 10 challenges were identified.
 
Article
The development of successful multimedia applications is becoming a challenging task due to short deployment cycles and the huge number of applications flooding the market. One major problem the multimedia industry faces in this area is the heterogeneity of the content-delivery chain. ISO/IEC MPEG has recognized this fact with the development of the MPEG Extensible Middleware (MXM), which specifically addresses this issue. A key issue when defining a platform is portability to other platforms, a concept that calls for middleware to be used in a platform-independent way. Finally, an infinite number of innovative business models based on media technologies, including those based on MPEG, could be developed at reduced cost by using a normative API. The Digital Item adaptation engine specifies the means to access and create information pertaining to the usage environment context, such as screen resolution and supported coding formats.
 
Article
The advantage of degrading the QoS of a multimedia session is that it reduces the network bandwidth required for distributed multimedia applications, increasing the number of users that a multimedia server can support concurrently. Experimentation shows that users perceive a reduced frame rate for a continuous-media stream differently, depending on the content.
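A back-of-envelope sketch with assumed numbers shows why frame-rate degradation raises server capacity: bandwidth per stream falls roughly in proportion to the served frame rate, so the same bandwidth budget admits more concurrent users.

```python
# Capacity sketch with illustrative numbers (100 Mbps server budget,
# 1.5 Mbps streams at 30 fps); assumes bandwidth scales linearly with
# frame rate, which is only an approximation for real codecs.
def max_users(server_mbps, stream_mbps, full_fps, served_fps):
    per_user = stream_mbps * served_fps / full_fps
    return int(server_mbps // per_user)

print(max_users(100, 1.5, 30, 30))  # 66 users at full quality
print(max_users(100, 1.5, 30, 15))  # 133 users at half the frame rate
```

The paper's point is that the perceptual cost of that halved frame rate is content-dependent, so a server can degrade selectively rather than uniformly.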
 
Article
The growth of multimedia is increasing the need for standards for accessing and searching distributed repositories. The Moving Picture Experts Group (MPEG) is developing the MPEG Query Format (MPQF) to standardize this interface as part of MPEG-7. The objective is to make multimedia access and search easier and interoperable across search engines and repositories. This article describes the MPQF and highlights some of the ways it goes beyond today's query languages by providing capabilities for multimedia query-by-example and spatiotemporal queries.
 
Article
Cultural setting is an intrinsic part of what we're trying to capture and use in multimedia systems. However, being in our own culture (both everyday culture and professional culture), we forget that multimedia interfaces and communication are culture-specific. This article gives some great insights that stem from diversity in countries (and cultures) as well as inside the interdisciplinary multimedia community.
 
Article
Integrating disabled individuals into society, with dignity, is an ancient social issue. And while each person who is born with or acquires a disability faces an immediate and urgent crisis, society is slow to fix chronic problems even after discovering solutions. This situation is especially tragic considering that some individuals might spend an entire life attempting to surmount problems that already have been solved simply because the institutions in charge of change don't find the problem urgent enough to implement the known solution. For example, solutions exist to solve the information-access problems faced by disabled individuals, but content producers see no urgency in addressing the fundamental issues. Media-production processes must change to incorporate the already-existing solutions, which, for the most part, have minimal costs.
 
Rating of Daisy features on a scale ranging from 0 (unimportant) to 4 (very important).
Article
The Daisy standard for multimedia representation of books and other material is designed to facilitate technologies that foster easy navigation and synchronized multimodal presentation for people with print-reading-related disabilities.
 
Article
We are using the results of the study to improve the design of both programs. We plan to repeat these evaluations several times as development of both programs progresses. Evaluation with kindergarten and elementary school deaf children and their teachers will be done in collaboration with the Indiana School for the Deaf and will start in the fall of 2009. We will report the results in a future article. A strong need exists for solutions that allow deaf users to communicate and interact in an environment free of prejudice, stigma, technological barriers, or other obstacles. The fact that all children were able to engage with and complete the tasks in both test systems is encouraging.
 
Article
We propose a method for constructing superstructure on the Web using XML and external annotations to Web documents. We have three approaches for annotating documents: linguistic, commentary, and multimedia. The result is annotated documents that computers can understand and process more easily, allowing content to reach a wider audience with minimal overhead.
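The element names below are invented for illustration (the authors' annotation schema may differ); the key idea is an annotation stored externally to the target document and tied to it by a URL plus a pointer into the document's structure.

```python
# Sketch of an external annotation document: it lives apart from the Web page
# it annotates and references a target element by URL + XPath-style pointer.
import xml.etree.ElementTree as ET

ann = ET.Element("annotation", type="commentary")
ET.SubElement(ann, "target",
              href="http://example.com/page.html",
              pointer="/html/body/p[3]")
ET.SubElement(ann, "comment").text = "This paragraph summarizes the results."

xml_out = ET.tostring(ann, encoding="unicode")
print(xml_out)
```

Keeping annotations external is what lets third parties layer linguistic, commentary, or multimedia metadata onto pages they don't control.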
 
Article
We've collected personal audio - essentially everything we hear - for two years and have experimented with methods to index and access the resulting data. Here, we describe our experiments in segmenting and labeling these recordings into episodes (relatively consistent acoustic situations lasting a few minutes or more) using the Bayesian information criterion (from speaker segmentation) and spectral clustering.
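The BIC segmentation idea can be shown on a toy 1-D feature stream; real systems use multivariate acoustic features, and the penalty weight here is an arbitrary assumption.

```python
# Toy Bayesian information criterion (BIC) change detection: compare modeling
# a window with one Gaussian versus two Gaussians split at candidate point t.
import math

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def delta_bic(xs, t, penalty=2.0):
    """Positive and large when a change point at index t is well supported."""
    n, a, b = len(xs), xs[:t], xs[t:]
    gain = (n * math.log(var(xs))
            - len(a) * math.log(var(a))
            - len(b) * math.log(var(b)))
    return gain - penalty * math.log(n)

# A quiet episode followed by a loud one; the boundary sits at index 40.
quiet = [0.0, 0.1, -0.1, 0.05, -0.05] * 8
loud = [5.0, 5.2, 4.8, 5.1, 4.9] * 8
stream = quiet + loud
best = max(range(2, len(stream) - 2), key=lambda t: delta_bic(stream, t))
print(best)  # 40: the quiet/loud transition
```

Scanning every candidate split like this is the core of BIC speaker segmentation; episode labeling then clusters the resulting segments (for example, with spectral clustering).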
 
Article
Asynchronous voice is the interactive communication process of people leaving voice messages for other people and the other people responding with their voice messages. A primitive form of asynchronous voice is a kind of telephone tag in which people use voice mail to have an interactive conversation. The author gives a personal account of his work with asynchronous voice and asynchronous learning.
 
Top-cited authors
Zhengyou Zhang
Iraj Sodagar
  • Independent Researcher
S. Chang
  • National Tsing Hua University
Ramesh Jain
  • University of California, Irvine
Miroslav Goljan
  • Binghamton University