Fig 2 - uploaded by Silvia Pfeiffer
Source publication
The World Wide Web, with its paradigms of surfing and searching for information, has become the predominant system for computer-based information retrieval. Media resources, however information-rich, only play a minor role in providing information to Web users. While bandwidth (or the lack thereof) may be an excuse for this situation, the lack of s...
Context in source publication
Context 1
... is the format in which media with interspersed CMML markup is exchanged. Analogous to a normal Web server offering a collection of HTML pages to clients, an Annodex server offers a collection of Annodex resources. After a Web client has issued a URI request for an Annodex resource, the Web server delivers the Annodex resource, or an appropriate subpart of it according to the URI query parameters.

Annodex resources conceptually consist of one or more media streams and one CMML annotation stream, interleaved in a temporally synchronized way. The annotation stream may contain several sets of clips that provide alternative markup tracks for the Annodex resource; this is implemented in CMML through a track attribute of the clip tag. The media streams may be complementary, such as an audio track with a video track, or alternative, such as two speech tracks in different languages. Figure 2 shows a conceptual representation of an example Annodex resource with three media tracks (light colored bars) and two annotation tracks (darker clips), with a header describing the complete resource (dark bar at the start).

The Annodex format enables encapsulation of any type of streamable time-continuous data and is thus independent of any media compression format. It is essentially a bitstream consisting of continuous media data interspersed with the structured XML markup of the CMML file. This interleaving is performed by merging the clip tags time-synchronously with the time-continuous bitstreams when an Annodex bitstream is authored. The clip tags are regarded as state changes in this respect: each is valid from the time it appears in the bitstream until another clip tag replaces it. If no clip directly replaces a previous one, an empty clip tag is inserted that simply marks the end of the previous clip. Thus, Annodex is designed to be used as both a persistent file format and a streaming format.
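The state-change semantics of clip tags can be sketched in a few lines (an illustrative model only, not part of the Annodex or CMML specification; the function name, data layout, and clip identifiers are made up):

```python
def active_clip(clips, t):
    """Return the clip in effect at time t on one annotation track, or None.

    `clips` is a list of (start_time, clip_id) pairs sorted by start time.
    A clip_id of None models an empty clip tag that only terminates its
    predecessor -- exactly the state-change behavior described above.
    """
    current = None
    for start, clip_id in clips:
        if start > t:
            break          # later clips have not taken effect yet
        current = clip_id  # each clip replaces the previous state
    return current

# One annotation track: "intro" runs until "interview" replaces it at 12.5 s;
# an empty clip at 47.0 s ends "interview" without starting a new clip.
track = [(0.0, "intro"), (12.5, "interview"), (47.0, None), (60.0, "summary")]
```

Because a clip's validity is defined purely by what precedes it in the bitstream, a receiver that joins a stream mid-way only needs the most recent clip tag to know the current annotation state.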
Figure 3 shows an example of the creation of a bitstream of an Annodexed media resource. Conceptually, the media bitstreams and the annotation bitstreams share a common timeline. When encapsulated into one binary bitstream, these data have to be flattened (serialized). CMML is designed for serialization through multiplexing; the figure shows roughly how this is performed.

There are several advantages to an integrated bitstream that carries the annotations time-synchronously with the media data. First, all the required information is contained within one resource, which can be distributed more easily. Second, many synchronization problems that occur with other media markup formats such as SMIL [31] are inherently solved. Third, when temporal intervals are extracted from the resource for reuse, the metainformation travels with the media data, which makes it possible, e.g., to retain the copyright information of a clip over the whole lifetime of a reused clip. Last but not least, a flat integrated format makes the Annodex resource streamable.

To perform the encapsulation, a specific bitstream format was required. As stated, an Annodex bitstream interleaves the XML markup of the annotation bitstream with the related media frames of the media bitstreams into a single bitstream. Straight XML cannot be used as the encapsulation, because XML cannot enclose binary data unless it is encoded as Unicode, which would introduce too much overhead. Therefore, an encapsulation format that could handle both binary bitstreams and textual frames was required. The requirements for the Annodex bitstream format can be summarized as follows:

• Framing for binary time-continuous data and XML.
• Temporal synchronization between XML and time-continuous media bitstreams.
• Temporal resynchronization after a parsing error.
• Detection of corruption.
• Seeking landmarks for direct random access.
• Streaming capability (i.e., the information required to parse and decode a bitstream part is available at the time at which that part is reached and does not arrive, e.g., at the end of the stream).
• Small overhead.
• A simple interleaving format with a track paradigm.

We selected Xiph.Org's [36] Ogg encapsulation format version 0 [20] as the encapsulation format for Annodex bitstreams because it meets all of these requirements, has proven reliable and stable, and is an open IETF (Internet Engineering Task Force) standard [20]. Hierarchical formats such as MPEG-4 or QuickTime were deemed less suitable: as hierarchical file formats, they could not easily provide streamable, time-accurate interleaving of multiple media and annotation tracks.

To author Annodexed media, we must distinguish between files and live streams. The advantage of the former is that a file can be uploaded from the computer's file system and annotated in a conventional authoring application. In contrast, the markup of a live Internet stream by its very nature has to be done on the fly. Annodex media files may be created in a traditional authoring application (e.g., iMovie or Adobe Premiere may easily support Annodex in the future) or through the use of CMML transcoded from metainformation collected in databases. The authoring application should support the creation of:

• structured and unstructured annotations,
• keyframe references,
• anchor points, and
• URI links for media clips.

Live Annodexed media streams must be created by merging clip tags with the live digital media stream. A merger application, similar to that described in Fig. 3, inserts clip tags into the live stream at any point in time under the control of a user, e.g., by selecting a previously prepared clip tag from a list. It is expected that extending existing graphical video editing applications such as Apple's iMovie or Adobe's Premiere to author Annodex will be a simple task.
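The time-synchronous multiplexing of media frames and clip tags into one flat bitstream can be sketched as a timestamp-ordered merge (a minimal illustration under assumed data, not the actual Annodex/Ogg page structure; the track names and payload strings are made up):

```python
import heapq

# Each element is (timestamp_seconds, track, payload). Each per-track list
# must already be sorted by timestamp, mirroring the shared timeline of
# media and annotation bitstreams before serialization.
video = [(0.00, "video", "frame0"), (0.04, "video", "frame1"),
         (0.08, "video", "frame2")]
audio = [(0.00, "audio", "pkt0"), (0.05, "audio", "pkt1")]
clips = [(0.00, "cmml", "<clip id='intro'>"), (0.05, "cmml", "<clip/>")]

# Flatten (serialize) the tracks into a single bitstream ordered by time,
# so each clip tag appears next to the media data it annotates.
muxed = list(heapq.merge(video, audio, clips, key=lambda item: item[0]))
```

A streaming consumer can then process `muxed` front to back: every clip tag is encountered exactly when it takes effect, which is what makes the interleaved format usable both as a file and as a live stream.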
Most already provide for specific markup to be ...
Similar publications
This paper presents a metadata model for the description of language resources proposed in the framework of the META-SHARE infrastructure, aiming to cover both datasets and tools/technologies used for their processing. It places the model in the overall framework of metadata models, describes the basic principles and features of the model, elaborat...
There are currently several metadata standards that describe information from different themes and areas. Many of these standards are used in web systems that allow more precise forms of retrieval than the popular search engines. The current work proposes a metadata model to describe news stories across several information vehicles. The resourc...
Citations
In recent years, blogging has become an exploding passion among Internet communities. By combining the grassroots blogging with the richness of expression available in video, videoblogs (vlogs for short) will be a powerful new media adjunct to our existing televised news sources. Vlogs have gained much attention worldwide, especially with Google's acquisition of YouTube. This article presents a comprehensive survey of videoblogging (vlogging for short) as a new technological trend. We first summarize the technological challenges for vlogging as four key issues that need to be answered. Along with their respective possibilities, we give a review of the currently available techniques and tools supporting vlogging, and envision emerging technological directions for future vlogging. Several multimedia technologies are introduced to empower vlogging technology with better scalability, interactivity, searchability, and accessibility, and to potentially reduce the legal, economic, and moral risks of vlogging applications. We also make an in-depth investigation of various vlog mining topics from a research perspective and present several incentive applications such as user-targeted video advertising and collective intelligence gaming. We believe that vlogging and its applications will bring new opportunities and drivers to the research in related fields.
Since the year 2000 a project under the name of "Continuous Media Web", CMWeb, has explored how to make video (and incidentally audio) a first class citizen on the Web. The project has led to a set of open specifications and open source implementations, which have been included into the Xiph set of open media technologies. In the spirit of the Web, specifications for a Video Web should be based on unencumbered formats, which is why Xiph was chosen.
In this paper we illustrate the model-driven development approach applied to the user interface of an audiovisual search application, within the European project PHAROS. We show how conceptual modelling can capture the most complex features of an audio-visual Web search portal, which allows users to pose advanced queries over multi-media materials, access results of queries using multi-modal and multi-channel interfaces, and customize the search experience by saving queries of interest in a personal profile, so that they can be exploited for asynchronous notification of new relevant audiovisual information. We show how model-driven development can help the generation of the code for sophisticated Rich Internet Application front-ends, typical of the multimedia portals of the future.
In this paper, we provide a brief survey of the multimedia information retrieval domain as well as introduce some ideas investigated in the special issue. We hope that the contributions of this issue provide motivation for readers to deal with the current challenges and problems. Such contributions are the basis of tomorrow's multimedia information systems. Our aims are to clarify some notions raised by this new technology by reviewing the current capabilities and the potential usefulness to users in various areas. The research and development issues cover a wide range of fields, many of which are shared with media processing, signal processing, database technologies and data mining.
Semantic interpretation of the data distributed over the Internet is subject to major current research activity. The Continuous Media Web (CMWeb) extends the World Wide Web to time-continuously sampled data such as audio and video with regard to searching, linking, and browsing functionality. The CMWeb technology is based on the Annodex file format, which streams the media content interspersed with markup in the Continuous Media Markup Language (CMML) format; this markup contains information relevant to the whole media file (e.g., title, author, language) as well as time-sensitive information (e.g., topics, speakers, time-sensitive hyperlinks). The CMML markup may be generated manually or automatically. This paper investigates the automatic extraction of metadata and markup information from complex linguistic annotations, i.e., annotated recordings collected for use in linguistic research. We are particularly interested in annotated recordings of meetings and teleconferences and see automatically generated CMML files and their corresponding Annodex streams as one way of viewing such recordings. The paper presents some experiments with generating Annodex files from hand-annotated meeting recordings.