Conference Paper

Using MPEG-7 for Automatic Annotation of Audiovisual Content in eLearning Digital Libraries.

Abstract and Figures

In this paper we present a prototype system to enrich audiovisual contents with annotations, which exploits existing technologies for automatic extraction of metadata (such as OCR, speech recognition, cut detection, visual descriptors, etc.). The prototype relies on a metadata model that unifies MPEG-7 and LOM descriptions to edit and enrich audiovisual contents, and it is based on MILOS, a general purpose Multimedia Content Management System created to support design and effective implementation of digital library applications. MILOS supports the storage and content based retrieval of any multimedia documents whose descriptions are provided by using arbitrary metadata models represented in XML. As a result, the indexed digital material can be retrieved by means of content based retrieval on the text extracted and on the MPEG-7 visual descriptors (via similarity search), assisting the user of the e-Learning Library (student or teacher) to retrieve the items not only on the basic bibliographic metadata (title, author, etc.).
Conference Paper
As many metadata are encoded in XML, and many digital libraries need to manage XML documents, efficient techniques for searching in such formatted data are required. In order to efficiently process path expressions with wildcards on XML data, a new path index is proposed. Extensive evaluation confirms better performance with respect to other techniques proposed in the literature. An extension of the proposed technique to deal with the content of XML documents in addition to their structure is also discussed.
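To make the idea of path expressions with wildcards concrete, the following is a minimal sketch in Python. It is not the path index proposed in the paper: it is a naive illustration that maps every root-to-element label path to its elements and matches wildcard patterns against those paths with a regular expression. The function names and the toy document are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(root):
    """Map each root-to-element label path (e.g. '/doc/sec/title') to its elements.

    Hypothetical helper for illustration; not the index structure from the paper.
    """
    index = defaultdict(list)

    def walk(elem, prefix):
        path = prefix + "/" + elem.tag
        index[path].append(elem)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return index


def pattern_to_regex(pattern):
    """Translate a simplified path pattern into a regular expression.

    '*' matches exactly one step; '//' matches any number of
    intermediate steps (including none).
    """
    pieces = []
    for part in re.split(r"(//|/)", pattern):
        if part == "":
            continue
        if part == "//":
            pieces.append(r"(?:/[^/]+)*/")
        elif part == "/":
            pieces.append("/")
        elif part == "*":
            pieces.append(r"[^/]+")
        else:
            pieces.append(re.escape(part))
    return re.compile("".join(pieces))


def query(index, pattern):
    """Return all elements whose label path matches the pattern."""
    regex = pattern_to_regex(pattern)
    hits = []
    for path, elems in index.items():
        if regex.fullmatch(path):
            hits.extend(elems)
    return hits


# Toy document, invented for this example.
root = ET.fromstring("<doc><title>A</title><sec><title>B</title><p>x</p></sec></doc>")
index = build_path_index(root)
one_step = [e.text for e in query(index, "/doc/*/title")]
any_depth = [e.text for e in query(index, "/doc//title")]
```

The point of a path index is that the query touches only the set of distinct label paths rather than every node of the document; a real index would also avoid scanning all paths for each query.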
Conference Paper
(Extended abstract) This paper describes the architecture of the MILOS Content Management System. MILOS supports the storage and content based retrieval of any XML document, as well as multimedia documents whose descriptions are provided by using heterogeneous metadata models represented in XML. MILOS is flexible in the management of documents containing different types of data and content descriptions; it is efficient and scalable in the storage and content based retrieval of these documents. The paper illustrates the solutions adopted to support the management of different metadata descriptions of multimedia documents in the same repository, and it illustrates the experiments performed by using the MILOS system to archive documents belonging to three different and heterogeneous collections which contain news agencies, scientific papers, and audio-video documentaries.
Article
During the last decade, multimedia databases have become increasingly important in many application areas such as medicine, CAD, geography, or molecular biology. An important research issue in the field of multimedia databases is the content based retrieval of similar multimedia objects such as images, text, and videos. However, in contrast to searching data in a relational database, a content based retrieval requires the search of similar objects as a basic functionality of the database system. Most of the approaches addressing similarity search use a so-called feature transformation which transforms important properties of the multimedia objects into high-dimensional points (feature vectors). Thus, the similarity search is transformed into a search of points in the feature space which are close to a given query point in the high-dimensional feature space. Query Processing in high-dimensional spaces has therefore been a very active research area over the last few years. A number of new index structures and algorithms have been proposed. It has been shown that the new index structures considerably improve the performance in querying large multimedia databases. Based on recent tutorials [BK 98, BK 00], in this survey we provide an overview of the current state-of-the-art in querying multimedia databases, describing the index structures and algorithms for an efficient query processing in high-dimensional spaces. We identify the problems of processing queries in high-dimensional space, and we provide an overview of the proposed approaches to overcome these problems.
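The feature-transformation approach described above can be sketched as follows. This is a minimal baseline, assuming each multimedia object has already been reduced to a fixed-length feature vector: it answers a k-nearest-neighbour query by a linear scan with Euclidean distance. The index structures surveyed in the article aim to avoid exactly this full scan; the function name and the toy vectors are hypothetical.

```python
import heapq
import math


def knn_linear_scan(query_vector, feature_vectors, k):
    """Return the k (object_id, distance) pairs closest to query_vector.

    feature_vectors maps an object id to its feature vector.
    Linear scan over all objects; an illustrative baseline only.
    """
    scored = (
        (object_id, math.dist(query_vector, vector))
        for object_id, vector in feature_vectors.items()
    )
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Toy 2-dimensional feature vectors, invented for this example.
features = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (5.0, 5.0)}
nearest = knn_linear_scan((0.9, 0.0), features, k=2)
```

Real visual descriptors typically have many more dimensions, which is where the problems of high-dimensional query processing discussed in the survey arise.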
Article
The past ten years saw the introduction of three major metadata specifications. These are the heavily text-biased Dublin Core Metadata Initiative (DCMI), MPEG-7, which expanded the scope of describing a single audiovisual object, and the Semantic Web, which characterizes all information, regardless of location or encoding. Now, a decade later, it is evident that not much new has been learned about using any of these specifications to locate generalized new media. Against this background, a five-point plan for a moratorium on metadata is introduced: proclaiming the three major metadata specifications Official Successes that are Ready for Business; issuing a second proclamation calling for a general moratorium on metadata; concentrating, during the moratorium period, on locating objects within a range of mixed-media assets based on context-sensitive queries; creating a culturally diverse corpus of 1 million nontext media assets; and considering a multimedia content differentiation.
Article
This book is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory. The book also introduces Bayesian analysis of learning and relates SVMs to Gaussian Processes and other kernel based learning methods. SVMs deliver state-of-the-art performance in real-world applications such as text categorisation, hand-written character recognition, image classification, biosequence analysis, etc. Their first introduction in the early 1990s led to a recent explosion of applications and deepening theoretical analysis, which has now established Support Vector Machines along with neural networks as one of the standard tools for machine learning and data mining. Students will find the book both stimulating and accessible, while practitioners will be guided smoothly through the material required for a good grasp of the theory and application of these techniques. The concepts are introduced gradually in accessible and self-contained stages, though in each stage the presentation is rigorous and thorough. Pointers to relevant literature and web sites containing software ensure that it forms an ideal starting point for further study. Equally the book will equip the practitioner to apply the techniques and an associated web site will provide pointers to updated literature, new applications, and on-line software.
Conference Paper
E-learning is an emerging education approach that augments learning experiences by integrating multimedia and network technologies. As an integrated part of e-learning, the lecture videos captured in classrooms contain most of the instructional content. Effective use of these videos, however, remains a challenging task. This paper reviews previous research work on capturing, analyzing, indexing, and retrieval of lecture (instructional) videos, and introduces on-going research efforts related to instructional videos. This paper compares instructional video to other video genres and addresses its special issues and difficulties. We present the current challenges in content-based indexing and retrieval of instructional videos. Improving these techniques for lecture videos has significant educational and social benefits.
Conference Paper
The Informedia Digital Library Project allows full content indexing and retrieval of text, audio and video material. The integration of speech recognition, image processing, natural language processing and information retrieval overcomes limits in each technology to create a useful system. To answer the question of how good speech recognition has to be in order to be useful and usable for indexing and retrieving speech-recognizer-generated transcripts, some empirical evidence is presented that illustrates the degradation of information retrieval at different levels of speech accuracy. In our experiments, word error rates up to 25% did not significantly impact information retrieval, and error rates of 50% still provided 85 to 95% of the recall and precision relative to fully accurate transcripts in the same retrieval system.
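For readers unfamiliar with the metric, word error rate is commonly defined as the word-level edit distance (substitutions, deletions and insertions) between a recognizer transcript and a reference transcript, divided by the number of reference words. The sketch below follows that common definition; it is an assumption that the abstract uses the same one, and the function name and sample sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    Both arguments are whitespace-separated transcripts.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion of a reference word
                    current[j - 1] + 1,  # insertion of a hypothesis word
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1] / len(ref_words)


# Invented sample transcripts: one substitution and one deletion in four words.
example_wer = word_error_rate("a b c d", "a x c")
```

Because insertions are counted, the rate can exceed 1.0 when the hypothesis is much longer than the reference.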
World Wide Web Consortium (W3C). XML Path Language (XPath), version 1.0. W3C Recommendation, November 1999.
ISO/IEC. Information technology - Multimedia content description interface - Part 3: Visual. ISO/IEC 15938-3:2002.