ZYX - A Multimedia Document Model for Reuse
and Adaptation of Multimedia Content
Susanne Boll and Wolfgang Klas
Abstract - Advanced multimedia applications require adequate support for the modeling of multimedia content by multimedia
document models. More and more this support calls for not only the adequate modeling of the temporal and spatial course of a
multimedia presentation and its interactions, but also for the partial reuse of multimedia documents and adaptation to a given user
context. However, our thorough investigation of existing standards for multimedia document models such as HTML, MHEG, SMIL, and
HyTime leads us to the conclusion that these standard models do not provide sufficient modeling support for reuse and adaptation.
Therefore, we propose a new approach for the modeling of adaptable and reusable multimedia content, the ZYX model. The model
offers primitives that provide, beyond the more or less common primitives for temporal, spatial, and interaction modeling, variform
support for reuse of structure and layout of document fragments and for the adaptation of the content and its presentation to the user
context. We present the model in detail and illustrate the application and effectiveness of these concepts by samples taken from our
Cardio-OP application in the domain of cardiac surgery. With the ZYX model, we developed a comprehensive means for advanced
multimedia content creation: support for template-driven authoring of multimedia content and support for flexible, dynamic composition
of multimedia documents customized to the user's local context and needs. The approach significantly impacts and supports the
authoring process in terms of methodology and economic aspects.
Index Terms - Multimedia document model, reuse, adaptation, multimedia database system.
1 INTRODUCTION
MULTIMEDIA applications need data models for the
representation of the composition of media elements: multimedia document models. They are employed
to model the semantic relationships between the media
elements participating in a multimedia presentation. The
initial requirements for multimedia documents are the modeling of the temporal and spatial course of a multimedia presentation and also the modeling of user interaction. However, the requirements of multimedia applications have evolved: As authoring of multimedia information is a very time-consuming and costly task, attention has been drawn to the reuse of multimedia documents for efficiency and economic reasons. Further-
more, the growing plenitude of multimedia information
calls for the personalization of the multimedia informa-
tion according to the user's individual context. Access
and distribution of multimedia documents via networks
like the Internet require adaptation of the documents to
heterogeneous network and system environments.
Our research project, "Gallery of Cardiac Surgery" (Cardio-OP) [1], is an example of an advanced multimedia
application that emphasizes this need for reuse and
adaptation and explicitly requires a model for multimedia
material that supports extensive reuse of the material in
different user contexts. The overall goal is to develop an
Internet-based and database-driven multimedia informa-
tion system for physicians, medical lecturers, students, and
patients in the domain of cardiac surgery. The system will
serve as a common information and education base for its
different types of users in which the users are provided
with multimedia information according to their specific
request, their different understanding of the selected
subject, and their geographic location and technical infra-
structure. Within this project context, our group is devel-
oping concepts and prototypical implementations of a
database-driven multimedia repository that integrates mod-
eling, management, and content-based retrieval of multimedia
content with flexible dynamic multimedia presentation services
that select, deliver, and present the multimedia content
according to the user context. Major project requirements
are the support for reuse, adaptation, and presentation-neutral
The authors are with the University of Vienna, Institute for Computer
Science and Business Informatics, Liebiggasse 4/3-4, A-1010 Wien,
Austria. E-mail: {susanne.boll, wolfgang.klas}@univie.ac.at.
Manuscript received Apr. 1999; revised Nov. 1999; accepted Dec. 1999.
1. This work was partially funded by the German Ministry of Research
and Education, grant number 08C58456. Our project partners are the
University Hospital of Ulm, Dept. of Cardiac Surgery and Dept. of
Cardiology, the University Hospital of Heidelberg, Dept. of Cardiac
Surgery, an associated Rehabilitation Hospital, the publishers Barth-Verlag
and dpunkt-Verlag, Heidelberg, FAW Ulm, and ENTEC GmbH, St.
Augustin. For details see also URL http://www.informatik.uni-ulm.de/
dbis/Cardio-OP/.
description of the structure and content of multimedia
documents.
Given the project's requirements, we were looking for a
suitable modeling support among existing multimedia
document standards. Therefore, we elaborated both the
traditional and advanced requirements for multimedia document models and, equipped with these metrics, analyzed the document models HTML [2], MHEG [3], [4], [5],
HyTime [6], [7], and SMIL [8]. The detailed analysis and
comparison of the models can be found in [9], [10], [11].
However, the analysis of the models' basic modeling
concepts as well as their support for reuse, adaptation,
and presentation-neutral description of multimedia content
showed that each of the models lacks some significant
concepts and does not meet all of the requirements.
Therefore, we designed and implemented the ZYX model
to overcome these limitations and to have a proper basis to
start out from to comprehensively provide for reusability
and adaptation by the multimedia repository.
In this paper, we present the ZYX model, which forms
the core for the modeling of the multimedia content in our
repository. In comparison to existing models, it provides
more adequate support for semantic modeling, reusability
and flexible composition, adaptation and individualization
for presentation, and presentation-neutral storage. We
illustrate the application of the model in the domain of
cardiac surgery and point out the implications of such a
model that supports reuse and adaptation to multimedia
authoring and multimedia presentation.
The paper is organized as follows: Section 2 provides the reader with a better understanding of the new requirements we see with next-generation multimedia applications. This leads to the metrics we used to analyze existing multi-
media document models. The summary of this analysis is
also presented in this section. It motivates the need for our
new document model ZYX that emphasizes the require-
ments for reuse and adaptation of multimedia documents.
Section 3 presents the basic ideas and design considerations
of the ZYX model, Section 4 gives the formal framework for
a detailed understanding of the model. Focusing on reuse
and adaptation, Section 5 presents and illustrates the
spectrum of application possibilities of ZYX for reuse and
adaptation and discusses the advantages this support
brings to the creation and delivery of multimedia content.
Section 6 summarizes our work and gives an outlook to
ongoing and future work.
2 REQUIREMENTS FOR MULTIMEDIA DOCUMENT MODELS AND AN ANALYSIS OF EXISTING MODELS
In this section, we present our requirements for multimedia document models. Here, we distinguish basic and advanced requirements. The basic requirements for multimedia document models are the modeling of the temporal and spatial course of a multimedia presentation and the modeling of interaction. The challenging, advanced requirements for multimedia document models are the reusability of the multimedia material, the adaptation to user-specific needs and context, and the presentation-neutral description of the content. As our focus lies on the advanced requirements, we start by presenting these in Section 2.1 and only briefly sketch the basic requirements afterwards in Section 2.2. Both the basic and the advanced requirements constitute the metrics along which we analyzed selected relevant multimedia document models for their suitability in the project context. This analysis is summarized in Section 2.3.
2.1 Advanced Requirements
In order to support a modular and context-dependent composition of multimedia documents from media objects and parts of multimedia documents, document models need to provide support for reuse, adaptation, and the presentation-neutral description of the structure and content of multimedia documents.
Reuse. As explained in Section 1, reuse of multimedia
material is an unavoidable requirement for multimedia
document models. We characterize reusability of multi-
media content along three dimensions: the granularity of
reuse, the kind of reuse, and the selection and identification
of reusable components.
.Granularity. The granularity of reuse determines what
can be reused. Regarding multimedia document
models, we can distinguish at least three levels of
granularity for reusable components: reuse of com-
plete multimedia documents, reuse of fragments of
multimedia documents like single scenes or teaching
units, and reuse of individual atomic media elements
such as a video or audio and parts of those media
elements such as a scene of a video.
.Kind of reuse. For all three levels of granularity, we distinguish two different ways of reusing material for the composition of new documents: identical reuse, i.e., the components are reused including all temporal, spatial, design, and interaction relationships and constraints as originally specified by the author(s), and structural reuse by means of separating layout and structure and reusing only structural parts.
.Selection and identification. Before we can reuse
multimedia components, we have to identify and
select them within the multimedia information
system. This calls for metadata and for mechanisms
for classifying, indexing, and querying components.
Hence, a document model should provide support
for the comprehensive and sophisticated annotation
of reusable components with metadata.
Adaptation. The presentation of multimedia documents should preferably adapt to the user context, like the user's interest, knowledge level, preferences, the targeted user system environment, and varying resources like available network bandwidth and CPU time. To introduce adaptivity into multimedia presentations, a multimedia document model must offer primitives to specify, generate, or derive in some way presentation alternatives that reflect and meet the different presentation contexts. For an actual presentation, the system can use these alternatives to adapt the delivery and the rendering of the presentation to the current user context.
For example, consider a professor on campus who is interested in seeing in-depth multimedia material on coronary artery bypass grafting and an undergraduate student at home who needs to get only an abstraction of the same material to pass the upcoming exam. In these two different presentations, the "story" behind each actual presentation,
however, might be the same; some components of the
professor's presentation might be (re)used in the student's
presentations while others might be substituted or adapted
by more abstract representations of the specific content.
For a better understanding, we distinguish adaptation by
the extent to which the adaptability is modeled and when the
adaptability is exploited:
.Extent of the adaptability: For the extent of the
adaptability, we distinguish between adaptation to
personal interest, which adapts the contents of a
document to the user's interests, knowledge, profes-
sional background, etc. and adaptation to technical
infrastructure, which adapts to the technical infra-
structure available to a user.
In the example above, adaptation to technical
infrastructure would be the capability to adapt the
document's presentation both to the high-end
environment of the professor on campus and the
low-end environment of the student at home.
Therefore, the presentation should be adaptable by
means of technical parameters like resolution of
images and frame rate of videos, but also by means
of media substitutability like substituting an audio
by text or a video by a sequence of pictures or a
small animation. Adaptation to personal interest
would be an adaptation of the content such that the
professor would see a more in-depth presentation of
the coronary artery bypass grafting, whereas the
student would rather get a simplified variant
presentation of the operation, thus reflecting the
expected background knowledge of the different
users.
.Static or dynamic adaptability. With regard to the
presentation alternatives, it is of interest whether all
possible alternatives for the adaptation are to be
known and modeled at the authoring time of a
multimedia document or whether they are left for
generation at the actual presentation time just when
the adaptation is needed.
Presentation-neutral representation. The multimedia ma-
terial available has to be presentable in a heterogeneous
software and hardware environment which can be found
on the Internet. As a consequence, the multimedia
material has to be stored presentation-neutral, i.e.,
independent of the actual realization of a presentation
at a client. This calls for a presentation-neutral represen-
tation of multimedia content that is convertible into the
respective presentation-specific format used for playout of
the multimedia material. It is desirable that this conver-
sion is lossless and that a conversion to different "output formats" is possible. The presentation-neutral representation of multimedia content should hence, besides the coverage of rich multimedia functionality, take place on a high level of semantics. The presentation-neutral model
should also be open in the sense that it allows for later
integration of multimedia functionality expected to be
developed in the future.
.Multimedia functionality. The multimedia functionality of a multimedia document model describes the expressiveness of its modeling primitives. A document model should offer high multimedia functionality to give sufficient support for modeling multimedia content. With regard to the conversion of a (presentation-neutral) document into another (output) format, this means that if the target document model does not offer multimedia functionality equivalent to that of the source model, the conversion will be lossy.
.Semantic level. A document model describes a
document on a high semantic level if the document's
structure is specified rather than its presentation.
This is helpful and necessary to allow for an
automatic conversion of a document into another
document format, as then the course of the presentation can be extracted and converted more easily. If the document has a low semantic level, a conversion may need knowledge about the multimedia content that often only the author will have.
Therefore, the presentation-neutral representation of multimedia content should have a high multimedia functionality and take place on a high level of semantics.
2.2 Basic Requirements
The traditional requirements for a temporal and spatial
model as well as interaction modeling are imperative for a
multimedia document model and, hence, are presented only briefly for the sake of completeness.
Temporal model. A temporal model (see also [12], [13], [14], [15]) describes temporal dependencies between the media elements of a multimedia document. One can find four types of temporal models: point-based temporal models, interval-based temporal models, event-based temporal models, and script-based approaches, in which temporal relations between media elements are specified by scripts, i.e., programs written in a scripting language which can comprise temporal synchronization operations.
Spatial Model. Three approaches to positioning the visual elements on the presentation medium can be distinguished: absolute positioning based on a coordinate system, directional relations [16], using relations like strong-north and weak-north (to specify overlapping), and topological relations [17], using relations like disjoint, meet, and overlap.
Interaction. Users should be able to interact with
presentations in terms of three types of interaction:
1) Navigational interactions determining the user-defined
flow of a multimedia presentation, 2) design interactions
influencing the visual and audible layout of a presentation,
and 3) movie interactions affecting the temporal course of the
entire presentation. Navigational and design interactions
should be specified within multimedia documents, whereas
movie interactions are expected to be offered by the
presentation engine.
2.3 Analysis of Existing Models
In this section, we very briefly summarize our analysis of
the most relevant existing standards and data models in
view of the requirements presented in the previous section.
Both the basic and the advanced requirements constitute the metrics along which we analyzed selected multimedia document models. Due to the limitation of space, we cannot present our comprehensive and detailed discussion about how the models meet the specific requirements in this paper but refer the reader to [9], [10], [11]. Fig. 1 illustrates
the results of our analysis of the most relevant existing
approaches and shows to which extent HTML/DHTML,
MHEG-5/6, HyTime, and SMIL fulfill the basic and
advanced requirements. For each of the requirements, the individual aspects elaborated in Section 2 are listed and, for each of the models, the figure shows to what extent the requirements are met by the model.
The analysis of existing standards, de facto standard formats, and models shows that, although individual formats and models are strong with respect to particular features, they are not capable of meeting all the requirements identified in the previous section, especially those we find with advanced multimedia applications, i.e., support for reuse, adaptation, and presentation-neutral description. This result led to the design and implementation of the ZYX model, which aims to combine the best features of existing formats and models, especially also recent developments in the area of Internet-applicable models driven by the development of XML and SMIL.
3 THE ZYX MODEL
When designing the ZYX model, we were, of course, taking
into account the lessons learned with the models we
analyzed. To give the reader an understanding of the
design of our model and also the points of contact of ZYX
with other approaches in the field, we sketch our design
considerations in Section 3.1. In Section 3.2, we then
introduce the reader to the basic concepts of our ZYX data
model before we present the detailed formal framework for
ZYX in Section 4.
3.1 Design Considerations
Aiming at the design of a model which fulfills the requirements of reuse, adaptation, and presentation-neutral representation as presented in the previous section, there are still open choices about how to achieve sufficient support for these requirements by a new data model. In the following, we take up the advanced requirements and discuss how we aim to support them in ZYX. With regard to the basic requirements, we present which underlying temporal and spatial model we selected and explain the interaction capabilities.
Presentation-neutral representation. For the supported
degree of presentation neutrality of the multimedia docu-
ment model, the semantic level of the model and the
model's abstraction from the actual presentation are crucial.
Therefore, we decided to develop a data model that
describes a multimedia document on a high semantic level.
This allows us a (lossy) export or conversion of our
multimedia document into data models like MHEG-5,
SMIL, and HTML. To keep the documents independent of
the final realization within a multimedia presentation, the
model strictly separates modeling of layout information
from document structure. To be able to support a rich
multimedia functionality, our model is designed to support
as much of the multimedia functionality of these models as
possible while still keeping a high semantic level.
Reuse. For the structure of the documents, we consider a
hierarchical organization of the document as it can be found
with XML-based document models. To achieve reuse on an
arbitrary level of granularity, the model supports different
granules of reusable components, i.e., media elements,
document fragments, and entire documents. The model
strictly separates modeling of layout information from
structure to keep the documents independent of the final
realization within a multimedia presentation. Due to this separation of layout information from structure, it is possible both to reuse just the structure and add new layout information to it and to reuse the different granules directly together with their layout information. Hence, the ZYX model supports structural and identical reuse of elements, frag-
ments, and documents. For the selection and identification
of the different granules, the model has the capability to
annotate/enhance the granules with content-descriptive
metadata.
Fig. 1. Summary of the support of the basic and advanced requirements by HTML, DHTML, MHEG-5/6, SMIL, and HyTime (+ support, o partial support, - no support).

Adaptation. With our document model, we want to support comprehensive adaptation mechanisms. Adaptability of ZYX is not limited to adaptation to a predefined set of discriminating technical attributes that are exploited for adaptation, as can be found with SMIL, but can be specified by an open set of attributes that reflect a complex user and system context. The model offers the static modeling of "presentation alternatives" that can be exploited for adaptation to the different presentation contexts. Additionally, the model offers primitives that determine the needed presentation alternative only at the point in time when the document is actually requested and presented.
Temporal model. We decided to use an interval-based temporal model. In order to fulfill the important requirement to describe the temporal dimension of interaction, we selected the Interval Expressions [14] to form the basis of the underlying temporal model of the ZYX data model. In comparison to other interval-based temporal models, it allows us to describe relations between time intervals of possibly unknown duration, a feature which is of importance for interaction modeling. The selection of an interval-based temporal model does not contradict the high semantic level of the document model, as would be the case with an event-based or script-based temporal model.
Spatial model. For the spatial layout, we decided to use a point-based description of each visual media entity in a multimedia document. Each visual media entity is assigned a two-dimensional extent plus a third dimension to specify the overlapping of visual media entities. So far, we do not consider the specification of spatial relationships between media entities like right-of or besides. However, as our model strictly separates structure and layout and defines clear interfaces to add layout to structure, it allows for the extension by a more sophisticated spatial model later.
Interaction. Our model supports the two interaction
types: navigational/decision interactions and design inter-
actions. This means that our model provides a comprehen-
sive support for these two interaction types comparable
with the interaction capabilities of MHEG-5, but more
sophisticated than those of SMIL.
3.2 Basic Concepts of the ZYX Model
In this section, we present the terminology and the basic
concepts of the ZYX model. The ZYX model describes a
multimedia document by means of a tree. The nodes of the
tree are the presentation elements and the edges of the tree
bind the presentation elements together in a hierarchical
fashion. Each presentation element has one binding point
with which it can be bound to another presentation element.
It also has one or more variables with which it can bind other
presentation elements. Additionally, each presentation
element can bind projector variables to specify the element's
layout. Fig. 2 introduces the graphical representation of
these basic elements of the model which we use in the
following to illustrate the model's features. The presenta-
tion elements are represented by rectangles, they form the
nodes of the document tree. On top of this rectangle, a
diamond represents the element's binding point. The
variables are represented by the filled circles below the
rectangle. The open circles on the right side of each
presentation element represent the element's projector
variables. The actual connections of variables and projector
variables to binding points of other presentation elements
are represented by edges in the graphical representation. A
variable that is connected to another presentation element is
called a bound variable, those variables that are not
connected are called free variables.
Presentation elements are the generic elements of the
model. They can be media elements that represent the
media data but also elements that represent the temporal,
spatial, layout, and interactive semantic relationships
between the elements of a multimedia document. Consider
the simple document tree, a so-called ZYX fragment, in Fig. 3. A temporal element, the sequential element seq, binds the media elements Image and Text to its variables v2 and v4, as well as a parallel element par to its variable v3. The par element again synchronizes a Video and an Audio which are bound to its variables v6 and v7. The presentation semantics of each fragment is that it starts with the presentation of the root element, here the sequential element. The specific presentation semantics of the seq element is that the elements bound to its variables v1, v2, v3, v4, and v5 are presented one after the other. That is, the element that will be bound to v1 is presented first, then the image bound to v2, then the par element, and so on. The presentation semantics of the par element is that the video and the audio element bound to its variables v6 and v7 are presented in parallel. The sample fragment represents the media elements and the semantic relationships between the four media elements. With the seq element's binding point, this fragment can be bound to another presentation element in a more complex multimedia document tree. The variables v1 and v5 of the fragment are still unbound. Here, an(other) author could later insert, e.g., a title at the beginning and a summary at the end of the sequence.
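As an illustration only (not part of the original model definition), the following Python sketch shows one way the fragment of Fig. 3 could be represented as a tree of presentation elements with binding points and variables. The class and method names (PresentationElement, bind, free_variables) are hypothetical.

```python
# Minimal sketch of a ZYX-like fragment tree; names are hypothetical.
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PresentationElement:
    element_type: str                       # e.g., "seq", "par", "Image", "Video"
    variables: Dict[str, Optional["PresentationElement"]] = field(default_factory=dict)

    def bind(self, variable: str, child: "PresentationElement") -> None:
        """Connect the child's binding point to one of this element's variables."""
        if variable not in self.variables:
            raise KeyError(f"unknown variable {variable}")
        self.variables[variable] = child

    def free_variables(self):
        return [v for v, child in self.variables.items() if child is None]


# The fragment of Fig. 3: seq(v1, Image, par(Video, Audio), Text, v5).
seq = PresentationElement("seq", {f"v{i}": None for i in range(1, 6)})
par = PresentationElement("par", {"v6": None, "v7": None})
seq.bind("v2", PresentationElement("Image"))
seq.bind("v3", par)
seq.bind("v4", PresentationElement("Text"))
par.bind("v6", PresentationElement("Video"))
par.bind("v7", PresentationElement("Audio"))

print(seq.free_variables())   # ['v1', 'v5'] -- still unbound, e.g., for a title and a summary
```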
We now explain the modeling capabilities of our model
with regard to our specific requirements of reusability,
adaptation and presentation-neutral representation, as well
as temporal and spatial modeling, and interaction.
Fig. 2. Graphical representation of the basic document elements.
Fig. 3. Simple document tree - a ZYX fragment.
Reusability. First, we describe the elements of ZYX that
support the different granularity of reusable components of
multimedia documents.
Reusability on the level of media elements is supported
by means of selector elements: These are presentation
elements that determine what, that is, which part of a media
element is presented. They can be used to select and thereby
(re)use a specific part of an audio or a specific area of an
image. To select a part of a continuous media element, the
temporal selector temporal-s specifies start and duration of
the selected sequence. Fig. 4 illustrates the usage and
semantics of a temporal selector element: The temporal
selector selects a scene of a video with a duration of 40 sec beginning at second 10 of the original video.
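For illustration, the following short sketch mimics the temporal selector semantics just described; the helper name and the tuple representation are hypothetical and only show how start and duration could be interpreted.

```python
# Sketch of temporal-s semantics: select a 40-second scene starting at second 10.
from dataclasses import dataclass


@dataclass
class TemporalSelector:
    start: float      # seconds from the beginning of the bound element
    duration: float   # length of the selected interval in seconds

    def selected_interval(self, media_duration: float) -> tuple:
        """Map the selector onto a concrete media element of known duration."""
        end = min(self.start + self.duration, media_duration)
        return (self.start, end)


scene = TemporalSelector(start=10, duration=40)
print(scene.selected_interval(media_duration=95))   # (10, 50)
```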
To select a spatial fraction of a visual media element, the
spatial selector specifies the selected area by a polygon. In
Fig. 5, a spatial selector spatial-s is applied to an image
media element to select a rectangular area from the image.
The selectors can also be applied to fragments, e.g., to select
two minutes of an existing slide presentation or a fraction of
a composite visual element. Reuse is also supported on the
level of fragments. Here templates, complex media elements, and external media elements provide for the reusability of fragments:
.Templates. In the ZYX model, not all of the variables of a presentation element must be bound at authoring time. In Fig. 3, the variables v1 and v5, e.g., the title and the summary of the presentation, are still unbound. This means that the sequence element seq can later be completed by binding presentation elements to the free variables v1 and v5. This makes the simple fragment in Fig. 3 a "template" for later (re)use. This is an important feature for building reusable fragments that can be applied in different multimedia documents by binding the free variables differently (a kind of late binding) corresponding to the current context.
.Complex and external media elements. It is, of course, possible to form more complex fragments like the one shown in Fig. 6. To make reuse easier and more manageable, large document fragments can be encapsulated by complex media elements. Then, an encapsulated fragment appears like a single presentation element in the specification tree, with one binding point and possibly a set of variables. In the example in Fig. 6, different presentation elements of the fragment leave variables unbound, which makes it a template as described above. Here, the encapsulation of fragments by complex media elements is also of help: To make later "filling" of such templates easier, a template can also be encapsulated. The free variables of the fragment are exported and form the variables of the complex media element. Fig. 6 illustrates how a complex media element encapsulates a complex fragment. A complex media element is, in a way, a black-box view of a possibly complex presentation fragment. The concept of free variables in combination with complex media elements guarantees comprehensive and workable reusability on the level of presentation fragments.
.Analogously, an external media element encapsulates a specification of a fragment that was composed in another, external document format. This allows for the inclusion of existing documents of another document format into our model. What is encapsulated by the external media element, however, depends on the external document format.
.Fragments and documents. And, of course, fragments and entire documents can be reused by binding the root element of the fragment or document to a free variable in another document.
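The following sketch illustrates the idea of template reuse by late binding described in the list above. It is a simplified, hypothetical representation (plain dictionaries instead of a full document tree); the function and variable names are assumptions, not part of the ZYX model itself.

```python
# Sketch of template reuse by late binding of free variables (hypothetical names).
def complete_template(template: dict, bindings: dict) -> dict:
    """Return a copy of the template with its free variables bound (late binding)."""
    completed = dict(template)
    for variable, element in bindings.items():
        if completed.get(variable) is not None:
            raise ValueError(f"{variable} is already bound")
        completed[variable] = element
    return completed


# Fragment of Fig. 3 viewed as a template: v1 (title) and v5 (summary) are free.
fig3_template = {"v1": None, "v2": "Image", "v3": "par(Video, Audio)",
                 "v4": "Text", "v5": None}

student_version = complete_template(fig3_template, {"v1": "Text: overview slide",
                                                    "v5": "Image: summary chart"})
professor_version = complete_template(fig3_template, {"v1": "Video: OR recording",
                                                      "v5": "Text: literature list"})
```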
Fig. 4. Temporal selector element temporal-s and its semantics.

Fig. 5. Spatial selector element spatial-s and its semantics.

Fig. 6. Complex fragments encapsulated in a complex media element.

With regard to the kind of reuse, the model supports both identical and structural reuse. Besides the selector elements, the ZYX data model offers projector elements that influence the visual and audible layout in a
presentation of a multimedia document. Projector ele-
ments determine how a media element or a fragment is
presented. They determine, for example, the presentation
speed of a video or the spatial position of an image on
the screen. Projectors are bound to the projector variables of
presentation elements. Each presentation element can
have one or more projector variables to which projectors
can be bound. A projector applies not only to the
presentation element it is bound to but also to its subtree.
For the arbitrary nesting of projectors, authoring tools
should provide support for consistency checking to avoid
contradicting layout specifications.
Fig. 7 illustrates the usage of projector elements and the separation of structure and layout. In this example, a fragment defines the parallel presentation of an audio and a video. Two projector elements are bound to the root element of the fragment, a spatial projector spatial-p and an acoustic projector acoustic-p. Each of the projectors applies only to those elements in the same tree that can be affected by it. Therefore, the spatial projector affects the spatial layout of the video. The acoustic projector applies to the audio element and determines the volume, bass, treble, and balance for presentation. By means of changing or adding projector elements, one can change the layout of the document. This allows for reusability of the same structure with different presentation layouts, i.e., it implements structural reuse. This follows the idea of separating structure from layout information as can be found with SGML and XML and complies also with our requirement for presentation-neutral representation of the documents.
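As a small, hedged illustration of this separation of structure and layout (the dictionaries and function below are hypothetical, not the ZYX implementation), the same presentation-neutral structure can be combined with different projector sets for different playback environments.

```python
# Sketch of structural reuse: one structure, two layout (projector) bindings.
spatial_projector_tv = {"x": 0, "y": 0, "width": 1024, "height": 576, "layer": 1}
spatial_projector_pda = {"x": 0, "y": 0, "width": 320, "height": 240, "layer": 1}
acoustic_projector = {"volume": 0.8, "bass": 0, "treble": 0, "balance": 0.0}


def present(structure: dict, projectors: dict) -> dict:
    """Combine a presentation-neutral structure with one concrete layout."""
    return {"structure": structure, "layout": projectors}


av_fragment = {"par": ["Video", "Audio"]}          # presentation-neutral structure
high_end = present(av_fragment, {"spatial": spatial_projector_tv,
                                 "acoustic": acoustic_projector})
low_end = present(av_fragment, {"spatial": spatial_projector_pda,
                                "acoustic": acoustic_projector})
```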
As we have outlined in the requirements, reuse needs support for the identification and selection of the multimedia content to be reused; hence, metadata is needed. Therefore, each ZYX fragment is assigned a set of metadata that describes its content by means of attribute-value pairs.
3.2.1 Adaptation
Adaptation means that the ZYX document that is delivered
for presentation should best match the context of the user
who requested the document. To support this kind of
adaptation, both a description of the user context and a
multimedia document that can be adapted to this context is
needed.
The context of a user is captured in a so-called user profile,
i.e., metadata that describes the user's topics of interest,
presentation system environment, network connection
characteristics, etc. This metadata is organized as key-value
pairs just as the metadata that is assigned to the multimedia
content.
The ZYX data model provides two presentation elements
for an adaptation of the document to a user profile: the
switch element and the query element. The switch element
allows us to specify different alternatives for a specific part
of the document. With each of the alternatives under a
switch element, there is associated metadata that describes
the context in which this specific alternative is the best
choice for presentation. This metadata is specified as a set of
discriminating attribute-value pairs for each alternative.
During presentation, the user profile is evaluated against
the metadata of the switch and that alternative is selected for
presentation of which the discriminating attributes best
match the current user profile. An illustration of the switch
element is given in Fig. 8. The switch element specifies two
presentation alternatives: the first alternative, bound to v1is
associated with a seminar-like teaching style (type, seminar)
and the second one with a lecture-like type of teaching (type,
lecture). When the document is presented, depending of the
preferred type of teaching which is reflected in the user's
current profile, either the left or the right subtree is
presented. As the switch element can specify an arbitrary
number of alternatives each of which is described by an
arbitrary number of attribute-value pairs, this provides for a
very comprehensive extent of adaptability as almost every
aspect of a user and the environment can be distinguished
and later be evaluated for adaptation during presentation.
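The following sketch shows one plausible way such a switch evaluation could work; it is an assumption for illustration, not the authors' algorithm, and the matching strategy (counting matching attribute-value pairs) is a simplification.

```python
# Sketch: pick the switch alternative whose metadata best matches the user profile.
def evaluate_switch(alternatives, user_profile):
    """Return the alternative with the highest number of matching attribute-value pairs."""
    def score(metadata):
        return sum(1 for key, value in metadata.items()
                   if user_profile.get(key) == value)
    return max(alternatives, key=lambda alt: score(alt["metadata"]))


alternatives = [
    {"fragment": "seminar-style subtree", "metadata": {"type": "seminar"}},
    {"fragment": "lecture-style subtree", "metadata": {"type": "lecture"}},
]
profile = {"type": "lecture", "bandwidth": "low"}
print(evaluate_switch(alternatives, profile)["fragment"])   # lecture-style subtree
```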
A switch element can be used only if all alternatives can be modeled at authoring time, in advance of the presentation. Hence, the switch element implements the requirement
for static adaptability of the model. However, there might be
the case that an author cannot or does not want to exactly
specify a part of the presentation but only describe the
desired fragments and defer the actual selection of suitable
fragments to the point in time when the document is
requested for presentation. For example, an author might
wish to specify that at a specific point in the presentation about "cardiac surgery," a digression into physiology is to be made; however, the author does not want to specify
which fragments are relevant to this but have the most
suitable one selected out of a pool of available fragments
just before presentation. This can be specified with a query element.

Fig. 7. A simple fragment with spatial and acoustic projector elements and their semantics.

Fig. 8. Specification of presentation alternatives with the switch element.

By means of metadata, the query represents the fragment that is expected at this point in the presentation.
When the document is selected for presentation, the query element is evaluated and the element is replaced by the fragment best matching the metadata given by the query element. An illustration of the query element is given in Fig. 9. The sample query element is the placeholder for the fragment best matching the query with topic "physiology in cardiac surgery," of type "lecture," and with a duration of five minutes. The more metadata tuples are used, the more specific the query is. The query element provides for the dynamic adaptability of the model, as the evaluation of the query and the selection of the fragment take place just before presentation.
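A corresponding sketch of query evaluation at presentation time is given below. The fragment pool, ranking scheme, and names are hypothetical; the point is only that the query element acts as a metadata placeholder that is resolved against annotated fragments just before presentation.

```python
# Sketch: resolve a query element against a pool of metadata-annotated fragments.
def resolve_query(query_metadata, fragment_pool):
    """Replace the query element by the pool fragment that matches most query tuples."""
    def matches(fragment):
        return sum(1 for key, value in query_metadata.items()
                   if fragment["metadata"].get(key) == value)
    return max(fragment_pool, key=matches)


pool = [
    {"id": "frag-17", "metadata": {"topic": "physiology in cardiac surgery",
                                   "type": "lecture", "duration_min": 5}},
    {"id": "frag-42", "metadata": {"topic": "physiology in cardiac surgery",
                                   "type": "seminar", "duration_min": 12}},
]
query = {"topic": "physiology in cardiac surgery", "type": "lecture", "duration_min": 5}
print(resolve_query(query, pool)["id"])   # frag-17
```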
Presentation-neutral representation. The requirement of presentation-neutral representation is strongly interrelated with structural reuse (see also Fig. 7). The explicit separation of structure and layout allows for presentation-neutral representation. As outlined before, the variables of a presentation element need not be bound in the first place; this also applies to the projector variables. It is possible to specify the presentation-neutral course of the presentation and, later, bind the presentation-dependent layout just when the document is selected for presentation. Then, the presentation-neutral structure of the document is bound via projector variables to the presentation-dependent layout defined by a set of projectors.
Temporal and spatial modeling. Based on the Interval
Expressions [14], the model offers the primitives seq, par, loop, and delay to specify temporal interval relationships.
These presentation elements can be nested to specify any
arbitrary temporal course of the multimedia presentation.
For the spatial model, we use the spatial projectors as
presented above. They realize the absolute positioning we
decided to use for the ZYX model. A spatial projector
determines the spatial layout of the presentation element it
is applied to and the layout applies to the entire subtree of
the presentation element.
Interaction. The requirement to support the modeling of
interactive multimedia presentations is met by the data
model's interaction elements. The model offers two types of
interaction elements, navigational interactive elements and
design interactive elements. The basic navigational element is
the genericLink element that allows us to specify the
transition from the document to an arbitrary link target.
Note that this element is not interactive. Based on the
genericLink, the menu element supports interactively selecting one out of a set of visual elements and following the presentation path that is associated with the selected element. The elements hotspot and hypertext define fine-
grained interactive visual areas in images and text. The
design interactive elements are the interactive version of the
projector elements. For example, for the typographic
projector that allows us to specify font, size, and style of a
text, the interactive typographic projector element specifies
that these settings can be altered interactively when the
document is presented.
4 FORMAL FRAMEWORK OF THE ZYX MODEL
In this section, we present the formal framework of the
ZYX model. Therefore, we introduce the reader to the basic
terminology and formalism of the basic elements of the
model and then present the elements for modeling the
temporal course, the layout, interaction, and the adaptation
of the presentation. Fig. 10 gives the reader an overview of
the definitions to follow. They are listed along the
requirements and design criteria presented in Section 2
which were used for the comparison of document models,
illustrated in Fig. 1.
4.1 Basic Terminology
The presentation elements are the generic elements of the ZYX model. Each presentation element $p$ is assigned exactly one binding point $b_p$. This is the connector with which a presentation element can be bound to another presentation element. A presentation element furthermore has $0$ to $n$ variables $v$ which are used to bind other presentation elements to it. To add layout information to a presentation element, it can optionally have $0$ to $n$ projector variables $pv$ that can be used to bind projector elements to the element. The projector variables are treated separately due to the separation of structure and layout.
The symbols introduced in Definition 1 are used in the
definitions to follow.
Definition 1 (Symbols). Let $B$ denote the set of all binding points, $VAR$ the set of all variables, $PVAR$ the set of all projector variables, $T$ the set of all element types, $MT$ the set of media types, $MED$ the set of all raw media data, $OT$ the set of ZYX operator element types, $ZYXDOC$ the set of all ZYX documents, $EXT$ the set of multimedia documents in an external document format, $PT \subseteq OT$ the set of all projector element types, $ATTRIBUTES$ the set of all possible attribute names, and $COLORS$ the set of all possible colors.
Fig. 9. Specification of presentation alternatives with the query element - evaluation of the query element and replacement by the selected ZYX fragment.
A presentation element $p$ is defined as follows:

Definition 2 (Presentation Element). A presentation element $p$ is a tuple $p := (t_p, b_p, V_p, PV_p)$ with $t_p \in T$ denoting the type of $p$, $b_p \in B$ denoting the binding point of $p$, $V_p \subseteq VAR$ denoting the set of variables of $p$, and $PV_p \subseteq PVAR$ denoting the set of projector variables of $p$. The tuple $p$ can be augmented with further tuple elements depending on the type $t_p$ of the presentation element.

A presentation element $p$ can be an atomic media element, a complex media element, an external media element, a specific operator element to build up the temporal, structural, and interactive relationships, or serve for the specification of adaptation. This is distinguished by the type $t_p$ in the definition of a presentation element $p$.
The basic units of a ZYX multimedia document are the
atomic media elements. An atomic media element is an
instantiation of a media type. An atomic media element in
our model abstracts from the raw media data and just
represents the media element and its media specific
characteristics. The formal definition of an atomic media
element is given in Definition 3.
Fig. 10. Summary of definitions of ZYX elements.
Definition 3 (Atomic Media Element). An atomic media element $am := (t_{am}, b_{am}, V_{am}, PV_{am}, m)$ is a presentation element with $t_{am} \in MT = \{Audio, Video, Image, Text, Animation\} \subseteq T$, $V_{am} = \emptyset$, and $m \in MED$ denoting the media data represented by $am$.
Presentation elements are interconnected using their
variables and binding points. Each variable and also each
projector variable of a presentation element can be bound to
exactly one binding point of another presentation element.
Each binding point of a presentation element can be bound
to exactly one variable or projector variable of another
presentation element. A connection binds one variable to a
binding point, and is formally defined in Definition 4.
Definition 4 (Connection). A connection $c = (v, b_{p'})$ connects the (projector) variable $v \in V_p \cup PV_p$ of a presentation element $p$ with the binding point $b_{p'}$ of a presentation element $p' \neq p$.
The result of interconnecting presentation elements is a specification tree that describes a reusable fragment of a multimedia document. A fragment can comprise a single media element, a part of a multimedia document, or an entire multimedia document. The formal description of a valid fragment is given in the following Definition 5.
Definition 5 (Fragment). A fragment $f = (P, C)$ is an acyclic, undirected graph that describes a part of or an entire multimedia document with:

.$P$ is the set of presentation elements that are part of the tree.
.$C \subseteq \{(v, b_{p'}) \mid p, p' \in P, p \neq p', v \in V_p \cup PV_p\}$ is the set of connections in the tree.

For a valid fragment $f = (P, C)$, the following conditions must hold:

1. If $c_1, c_2 \in C$, $c_1 = (v_1, b_p)$, $c_2 = (v_2, b_p)$, $p \in P$, then $v_1 = v_2$, i.e., each binding point can be bound to only one variable.
2. If $c_1, c_2 \in C$, $p, p' \in P$, and $c_1 = (v, b_p)$, $c_2 = (v, b_{p'})$, then $p = p'$, i.e., each variable can be bound to only one binding point.
3. $Unbound_f = \{p \in P \mid \neg\exists v \in \bigcup_{p' \in P} V_{p'} : (v, b_p) \in C\}$ and $|Unbound_f| = 1$, $root_f \in Unbound_f \wedge t_{root_f} \notin PT$: There is exactly one presentation element $p \in P$ of the fragment $f$ that is not bound to any other presentation element. This unbound presentation element is called the root element, denoted $root_f$, of the fragment and has the binding point $b_{root_f}$ that forms the "entry point" of the fragment; note that projector elements cannot be root elements.
4. There is no sequence of connections $c_1, \ldots, c_n$ with $c_i = (v_i, b_{p_i})$ such that $v_{i+1} \in V_{p_i}$ for $i = 1, \ldots, n-1$ and $v_1 \in V_{p_n}$. This means that $f$ is acyclic.
5. $\forall pv \in \bigcup_{p \in P} PV_p : (pv, b_{p'}) \in C \Rightarrow t_{p'} \in PT$. Projector variables of a presentation element can bind only projector elements.
6. $\forall v \in \bigcup_{p \in P} V_p : (v, b_{p'}) \in C \Rightarrow t_{p'} \notin PT$. Variables of a presentation element cannot bind projector elements.
7. $\forall p \in P : t_p \in PT \Rightarrow V_p = PV_p = \emptyset$. A projector element cannot bind any other presentation element.
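For illustration, the following sketch checks a subset of these validity conditions on a fragment given as a flat list of connections. The representation (element ids and (parent, variable, child) triples) and the function name are hypothetical simplifications, not the authors' implementation.

```python
# Sketch: check unique bindings, a single root, and acyclicity (conditions 1-4).
def check_fragment(elements, connections):
    """elements: set of element ids; connections: list of (parent, variable, child)."""
    # Conditions 1 and 2: each binding point (child) and each (parent, variable)
    # pair may occur at most once.
    children = [child for _, _, child in connections]
    variables = [(parent, var) for parent, var, _ in connections]
    assert len(children) == len(set(children)), "binding point bound twice"
    assert len(variables) == len(set(variables)), "variable bound twice"
    # Condition 3: exactly one element is not bound to any other element (the root).
    roots = elements - set(children)
    assert len(roots) == 1, "fragment must have exactly one root"
    # Condition 4: acyclicity -- follow parent links upwards from every element.
    parent_of = {child: parent for parent, _, child in connections}
    for node in elements:
        seen = set()
        while node in parent_of:
            assert node not in seen, "cycle detected"
            seen.add(node)
            node = parent_of[node]
    return roots.pop()


elements = {"seq", "par", "image", "text", "video", "audio"}
connections = [("seq", "v2", "image"), ("seq", "v3", "par"), ("seq", "v4", "text"),
               ("par", "v6", "video"), ("par", "v7", "audio")]
print(check_fragment(elements, connections))   # seq
```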
Fragments form the building blocks of a multimedia
document. They are the units that can be reused and
recomposed in different multimedia documents. To ease
this reuse of ZYX fragments, we introduce the definition of a
complex media element. A complex media object $cm$ encapsulates a fragment $f = (P, C)$ within the definition of a presentation element, much like a container. With this definition, an encapsulated fragment can simply be reused like a single presentation element in any other fragment. A complex media element $cm$ is defined as follows:
Definition 6 (Complex Media Element). A complex media element $cm := (t_{cm}, b_{cm}, V_{cm}, PV_{cm}, f)$ is a presentation element that encapsulates the fragment $f = (P, C)$ with $t_{cm} = Complex \in T$, $b_{cm} = b_{root_f}$, $V_{cm} = \{v \in \bigcup_{p \in P} V_p \mid \forall q \in P : (v, b_q) \notin C\}$, and $PV_{cm} = \{pv \in \bigcup_{p \in P} PV_p \mid \forall q \in P : (pv, b_q) \notin C\}$.

That is, the binding point of the root $root_f$ of the encapsulated fragment $f$ becomes the binding point $b_{cm}$ of the complex media object $cm$. All variables and all projector variables in the fragment $f$ that are not bound are exported and form the free variables $V_{cm}$ and projector variables $PV_{cm}$ of the complex media object. For an illustration, recall Fig. 6: The binding point of the seq element becomes the binding point of the complex media element, and the unbound variables $v_1$, $v_6$, $v_8$, $v_9$, and $v_5$ become the free variables of the complex media element.
As complex media objects encapsulate ZYX fragments, they offer a means of abstraction. The export of free variables allows for a later completion of the complex media element. Hence, complex media elements can form templates which can be "filled" later by binding media elements, other complex media elements, and fragments to the free variables. This "late binding" of presentation elements to the free variables finally instantiates the actual ZYX document.
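The export of free variables in Definition 6 can be sketched as follows; the list-based representation is a hypothetical simplification chosen only to make the set expressions above concrete.

```python
# Sketch of Definition 6: unbound variables of the fragment become the exported
# (free) variables of the complex media element.
def encapsulate(fragment_variables, connections):
    """fragment_variables: all (element, variable) pairs declared in the fragment;
    connections: the (element, variable) pairs that are bound inside the fragment."""
    free = [var for var in fragment_variables if var not in set(connections)]
    return {"type": "Complex", "exported_variables": free}


# Illustrative variables of a Fig. 6-like fragment (hypothetical assignment).
declared = [("seq", "v1"), ("seq", "v2"), ("seq", "v5"),
            ("par", "v6"), ("par", "v8"), ("par", "v9")]
bound_inside = [("seq", "v2")]
print(encapsulate(declared, bound_inside)["exported_variables"])
# [('seq', 'v1'), ('seq', 'v5'), ('par', 'v6'), ('par', 'v8'), ('par', 'v9')]
```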
To encapsulate fragments that are specified in an
external format, we define external media elements
(Definition 7). An external media element em is also a
complex media element. It encapsulates, however, not a
fragment specified in ZYX, but the specification of an
external fragment available in another data model. Like the
complex media element, the external media element is assigned a set of variables $V_{em}$, projector variables $PV_{em}$, and one binding point $b_{em}$. However, the meaning of the
variables and projector variables depends on the external
document format.
Definition 7 (External Media Element). An external media element $em := (t_{em}, b_{em}, V_{em}, PV_{em}, f)$ is a presentation element that encapsulates the fragment $f \in EXT$ with $t_{em} = External \in T$, $b_{em}$ the binding point of the external fragment, $V_{em}$ the variables of the external fragment, and $PV_{em}$ the projector variables of the external fragment.
With the definitions given so far, it is possible to
compose presentation elements by means of connections.
The interconnection of presentation elements via their variables and binding points puts these presentation ele-
ments in a relationship; the semantics of this relationship,
however, is not yet defined. Therefore, our data model
offers different types of presentation elements, operator
elements, with which presentation elements can be inter-
connected with a certain semantics.
In the following, we present the element definitions of temporal operators, projectors, selectors, interaction elements, and adaptation elements.
semantics that have to be interpreted by a presentation
environment and mapped into the spatial, temporal,
structural, interaction, and adaptive domain of a multi-
media presentation. The different operator elements are
defined in the tuple notation as already introduced for the
generic presentation element. Again, the type distinguishes
the different operator elements. For the different elements,
the tuple carries additional operator type-specific values
that characterize the element's specific semantics. To avoid repetition in the definitions to follow, only the domains of each of the newly introduced tuple elements are given.
4.2 Temporal Operator Elements
The temporal operator elements determine the temporal
relationships between the presentation elements. As out-
lined above, our temporal model is based on Interval
Expressions [14]. In the following, we present the definition
of the temporal operator elements par, seq, loop, and delay, their
specific parameters, and semantics. An illustration of these
temporal operator elements is shown in Fig. 11.
The presentation semantics of the par operator element
(Definition 8) is that the presentation elements bound to its
variables are to be presented in parallel.
Definition 8 (Temporal Operator Element - par). The temporal operator element $par := (t_{par}, b_{par}, V_{par}, PV_{par}, finish, lipsync)$ is a presentation element with $t_{par} = Par \in OT$, $V_{par} = \{v_1, \ldots, v_n\} \subseteq VAR$, $finish \in \{1, \ldots, n, min, max\}$, and $lipsync \in \mathbb{N}_0$.

The par operator element offers the two parameters $finish$ and $lipsync$ to control the synchronization of the parallel presentation: The parameter $finish$ determines which one of the $n$ presentation elements bound to $v_1, \ldots, v_n$ terminates the parallel presentation. If $finish$ is set to $min$ or $max$, the presentation stops when the presentation of the element with the minimal, respectively maximal, presentation time stops. By setting $finish = i$, $i \in \{1, \ldots, n\}$, the presentation stops when the presentation of the dedicated presentation element bound to $v_i$ stops. The second parameter $lipsync$ determines the element that forms the master of a continuous fine synchronization during playout of the par operator. If the second parameter $lipsync$ equals $0$, then no lip synchronization is specified. If the value of $lipsync$ is $i$, $i > 0$, the presentation of the presentation elements bound to $v_1, \ldots, v_n$ is carried out in lip synchronization and the presentation element bound to $v_i$ forms the master of this synchronization.
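The effect of the finish parameter can be sketched as follows, assuming the presentation durations of the bound elements are known; the function is hypothetical and only illustrates the termination rule just described.

```python
# Sketch of the par element's finish semantics: when does the parallel presentation stop?
def par_stop_time(durations, finish):
    """durations: presentation times of the elements bound to v1..vn;
    finish: 'min', 'max', or an index i in 1..n."""
    if finish == "min":
        return min(durations)
    if finish == "max":
        return max(durations)
    return durations[finish - 1]          # finish = i: the element bound to v_i terminates the par


durations = [30.0, 45.0, 60.0]            # e.g., video, audio, text ticker
print(par_stop_time(durations, "min"))    # 30.0
print(par_stop_time(durations, 2))        # 45.0 -- stops with the element bound to v2
```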
The presentation semantics of the seq operator element (Definition 9) is that the presentation elements that are bound to it are presented in sequence. The presentation of a seq operator element starts the sequential presentation of the presentation elements that are bound to the variables $v_i$, $i = 1 \ldots n$, in the order $v_1, v_2, \ldots, v_n$. The presentation of the seq operator element begins with the presentation of the presentation element bound to $v_1$ and ends with the end of the presentation of the element bound to $v_n$.

Definition 9 (Temporal Operator Element - seq). The temporal operator element $seq := (t_{seq}, b_{seq}, V_{seq}, PV_{seq})$ is a presentation element with $t_{seq} = Seq \in OT$ and $V_{seq} = \{v_1, \ldots, v_n\} \subseteq VAR$.
The presentation semantics of the loop operator element (Definition 10) is that its presentation starts the repeated presentation of the single presentation element bound to $v \in V_{loop}$. The presentation is repeated $r$ times and stops after the $r$th presentation of the presentation element. If $r$ is set to $\infty$, the presentation of the element loops forever.

Definition 10 (Temporal Operator Element - loop). The temporal operator element $loop := (t_{loop}, b_{loop}, V_{loop}, PV_{loop}, r)$ is a presentation element with $t_{loop} = Loop \in OT$, $|V_{loop}| = 1$, and $r \in \mathbb{N} \cup \{\infty\}$.
The delay operator element (Definition 11) models a temporal delay of $t$ milliseconds. It can be seen as an "empty" media element that is presented for a duration of $t$ milliseconds.

Definition 11 (Temporal Operator Element - delay). The temporal operator element $delay := (t_{delay}, b_{delay}, V_{delay}, PV_{delay}, t)$ is a presentation element with $t_{delay} = Delay \in OT$, $V_{delay} = PV_{delay} = \emptyset$, and $t \in \mathbb{N}$.

Fig. 11. Fragment illustrating the usage of the temporal operator elements.
Fig. 11 illustrates the different temporal operators
defined above. The loop element that forms the root of the
sample fragment specifies that the subtree is repeated 10 times. This subtree comprises a sequence of two videos with accompanying texts, each followed by a short temporal gap of 50 ms for the transition.
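A fragment of this shape can be written down compactly with nested constructors, as in the following sketch. The nested-tuple notation and helper names are hypothetical and serve only to show how the four temporal operator elements compose.

```python
# Sketch of the Fig. 11 fragment: loop(10) over seq(par(video, text), delay(50 ms), ...).
def par(*children, finish="max", lipsync=0):
    return ("par", {"finish": finish, "lipsync": lipsync}, list(children))

def seq(*children):
    return ("seq", {}, list(children))

def loop(child, r):
    return ("loop", {"r": r}, [child])

def delay(ms):
    return ("delay", {"t": ms}, [])


fig11_fragment = loop(
    seq(par("Video1", "Text1"), delay(50),
        par("Video2", "Text2"), delay(50)),
    r=10,
)
```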
4.3 Selectors
The model offers selector elements to reuse parts of media
elements and fragments, i.e., spatial regions and temporal
intervals.
First, Definition 12 introduces the notion of a successor in
a fragment needed for subsequent definitions.
Definition 12 (Successor). Let $F$ denote the set of all fragments. We then define a function $expand : F \to F$ that computes, for a fragment $f$, the fragment that is semantically equivalent to $f$ but does not contain any complex media element. The function $expand(f)$ recursively replaces each complex media element in $f$ by the fragment that the complex media element encapsulates.

Let $f \in F$ be a fragment, $expand(f) = (P, C)$ the expanded fragment, and $p, p' \in P$ presentation elements. Then, the following direct and indirect successor relationships hold:

1. $p'$ is a direct successor of $p \Longleftrightarrow \exists (v, b_{p'}) \in C : v \in V_p$.
2. $p'$ is an indirect successor of $p \Longleftrightarrow p'$ is not a direct successor of $p$ and there exists a sequence $succ_1, \ldots, succ_n$, $n \in \mathbb{N}$, with $succ_1$ a direct successor of $p$, $succ_i$ a direct successor of $succ_{i-1}$, $i = 2, \ldots, n$, and $p'$ a direct successor of $succ_n$.
3. $p'$ is a successor of $p \Longleftrightarrow p'$ is a direct or indirect successor of $p$.
For example, in Fig. 11, the seq element is a direct successor of the loop element. The video and text elements are indirect successors of the loop element and direct successors of the parallel element. There is no successor relationship between the image and the audio media element.
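The successor relation of Definition 12 amounts to reachability in the expanded fragment tree. A minimal sketch of this reading, reusing the illustrative classes from the sketch above and treating the elements bound to a node's variables simply as its children:

```python
def direct_successors(p):
    """Elements bound to the variables of p, i.e., its children in the
    expanded fragment tree (illustrative attribute names)."""
    if hasattr(p, "children"):
        return list(p.children)
    if hasattr(p, "child"):
        return [p.child]
    return []

def is_successor(p, q):
    """True if q is a direct or indirect successor of p (Definition 12)."""
    stack = direct_successors(p)
    while stack:
        node = stack.pop()
        if node is q:
            return True
        stack.extend(direct_successors(node))
    return False
```

For the fragment of Fig. 11, every par, delay, and media element is a successor of the loop element, while two sibling media elements are not successors of each other.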
Now, we can define the different selector elements: the temporal selector, spatial selector, textual selector, and the acoustic selector. A temporal selector temporal-s (Definition 13) is a presentation element that can bind exactly one other presentation element $p$. The presentation semantics of this element is that the presentation of the direct and indirect successors of $p$ is started $start$ milliseconds after the original starting point of the fragment and lasts for $duration$ milliseconds.
Definition 13 (Temporal Selector Element—temporal-s). The temporal selector element temporal-s = $(t_{temporal-s}, b_{temporal-s}, V_{temporal-s}, PV_{temporal-s}, start, duration)$ is a presentation element with $|V_{temporal-s}| = 1$, $t_{temporal-s} = Temporal\text{-}S \in OT$, and $start, duration \in \mathbb{N}_0$.
A spatial selector element spatial-s (Definition 14) can bind exactly one other presentation element $p$, which can be a visual media element like an image or a video but also a complex media element with visual appearance. The spatial selector selects a spatial area from $p$. The presentation semantics of the spatial selector is that only those visual parts of $p$ and its successors that are visible within the rectangular area specified by the element's parameters $x$, $y$, $width$, and $height$ are presented. For an illustration of the spatial selector, see Fig. 5.
Definition 14 (Spatial Selector Element—spatial-s). The spatial selector element spatial-s = $(t_{spatial-s}, b_{spatial-s}, V_{spatial-s}, PV_{spatial-s}, x, y, width, height)$ is a presentation element with $t_{spatial-s} = Spatial\text{-}S \in OT$, $|V_{spatial-s}| = 1$, $x, y \in \mathbb{N}_0$, and $width, height \in \mathbb{N}$.
The application of temporal and spatial selector elements is context sensitive. That is, they apply to the entire subtree of the presentation element bound to it. Selector elements can be organized in a hierarchy and each selector element is applied in the context of the subtree it is bound to. For an illustration, consider the example given in Fig. 12: Two temporal selector elements $s_1$ and $s_2$ with $s_1 = (Temporal\text{-}S, b_{s_1}, \{v_{s_1}\}, \emptyset, 10, 25)$ and $s_2 = (Temporal\text{-}S, b_{s_2}, \{v_{s_2}\}, \emptyset, 10, 40)$ (time in seconds) are nested, with $s_2$ being a direct or indirect successor of $s_1$. Then, the selected temporal interval defined by $s_1$ is defined relative to the temporal interval specified by $s_2$. That is, the start time of 10 sec of $s_1$ is relative to the beginning of the interval already selected by $s_2$.
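The nesting semantics can be read as interval composition: the interval of an enclosing temporal selector is interpreted relative to the interval already selected within its subtree. A minimal sketch under this reading; the clipping of the duration to what remains of the inner interval is an assumption about how an engine would resolve conflicting lengths, not something the model prescribes.

```python
def effective_interval(selectors):
    """Compose nested temporal selectors into one absolute interval.

    'selectors' lists (start, duration) pairs in the order in which they are
    applied, i.e., the selector deepest in the tree (closest to the media)
    first; each enclosing selector is relative to the interval selected so far.
    """
    abs_start, abs_dur = 0, None
    for start, duration in selectors:
        abs_start += start
        if abs_dur is None:
            abs_dur = duration
        else:
            abs_dur = min(duration, max(abs_dur - start, 0))
    return abs_start, abs_dur

# Fig. 12: s2 = (10, 40) is applied first, s1 = (10, 25) on top of it, so the
# start time 10 of s1 refers to the interval already selected by s2.
print(effective_interval([(10, 40), (10, 25)]))   # (20, 25)
```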
To also be able to reuse parts of text, a textual selector textual-s (Definition 15) selects a continuous fraction from the text media element $p$ bound to its variable. The presentation semantics is that only the selected part of the text is presented, i.e., the text fraction that begins at the text position $start$ and has the given $length$ in characters.
Definition 15 (Textual Selector Element—textual-s). The textual selector element textual-s = $(t_{textual-s}, b_{textual-s}, V_{textual-s}, PV_{textual-s}, start, length)$ is a presentation element with $t_{textual-s} = Textual\text{-}S \in OT$, $|V_{textual-s}| = 1$, $start \in \mathbb{N}_0$, and $length \in \mathbb{N}$.

Fig. 12. Sample fragment illustrating the usage and semantics of nesting temporal selector elements $s_1 = (\ldots, 10, 25)$ and $s_2 = (\ldots, 10, 40)$.
4.4 Projectors
To add layout information to a presentation element, its 0 to $n$ projector variables $pv$ can be used to bind projector elements to the presentation element. Projector elements are presentation elements that determine how presentation elements are presented. The model offers four different projector elements, spatial-p, temporal-p, acoustic-p, and typographic-p, to specify the spatial, temporal, acoustic, and typographic layout of a presentation, which we define in the following.
The presentation semantics of the spatial projector element spatial-p (Definition 16) is that the visual presentation of $p$, the presentation element it is bound to, is "projected" onto the rectangular presentation area defined by the projector element. The parameters $x$ and $y$ define the position of the upper left corner of a rectangle with the given $width$ and $height$. The parameter $priority$ defines the order of the overlapping of visual objects such that an object with a higher priority value covers objects with a lower priority value. The value of the parameter $unit$ determines whether the values of the parameters $x, y, width, height$ are given in pixels or in percent of a presentation window.
Definition 16 (Spatial Projector Element—spatial-p). The spatial projector element spatial-p = $(t_{spatial-p}, b_{spatial-p}, V_{spatial-p}, PV_{spatial-p}, x, y, width, height, priority, unit)$ is a presentation element with $t_{spatial-p} = Spatial\text{-}P \in PT$, $V_{spatial-p} = PV_{spatial-p} = \emptyset$, $x, y, priority \in \mathbb{N}_0$, $width, height \in \mathbb{N}$, and $unit \in \{pixel, percent\}$.
The spatial projector, like all projector elements, applies not only to the presentation element $p$ it is bound to but also to all successors of $p$. That is, it affects the entire subtree of which $p$ is the root element with regard to the spatial projection. The visual parts of $p$ and possibly its successors are scaled to the presentation area defined by the projector's parameters.
If spatial projectors are nested, then each spatial projector spatial-p is evaluated in its context. Fig. 13 illustrates the usage and the semantics of nesting spatial projector elements. In the example, the root par element has a spatial projector bound to it that specifies the rectangular presentation area for the subtree as $x = 10$, $y = 10$, $w = 100$, $h = 100$. This area is indicated in the right part of the figure with a dotted rectangle. The two images that are successors of the par element each have their own spatial projector. The spatial projector of $Image_1$ in the subtree defines a presentation area $x = 0$, $y = 0$, $w = 40$, $h = 40$ and the second image a presentation area with $x = 60$, $y = 60$, $w = 40$, $h = 40$. In consequence, both spatial projectors of the images are evaluated in the context of the spatial projector bound to the par element. Therefore, the areas of the two images are projected within the area defined by the spatial projector of the par element.
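A small sketch of this nesting rule: the rectangle of an inner spatial projector is resolved relative to the rectangle established by the enclosing one. Pixel units and a simple offset interpretation are assumed here; scaling for percent units and clipping at the outer border are left out.

```python
def compose_spatial(outer, inner):
    """Resolve a nested spatial projector rectangle (x, y, width, height):
    the inner rectangle is positioned relative to the outer one."""
    ox, oy, _, _ = outer
    ix, iy, iw, ih = inner
    return (ox + ix, oy + iy, iw, ih)

par_area    = (10, 10, 100, 100)                            # projector of the par element
image1_area = compose_spatial(par_area, (0, 0, 40, 40))     # -> (10, 10, 40, 40)
image2_area = compose_spatial(par_area, (60, 60, 40, 40))   # -> (70, 70, 40, 40)
```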
The presentation semantics of the temporal projector element temporal-p (Definition 17) bound to a presentation element $p$ is that the element $p$ is presented with the given playback direction and speed. The parameter $direction$ specifies whether the presentation element (and its subtree) is presented in a forward ($direction = 1$) or in a backward direction ($direction = -1$). The actual playback speed is computed by multiplying the original playback speed with the factor given by the $speed$ parameter.
Definition 17 (Temporal Projector Element—temporal-p). The temporal projector element temporal-p = $(t_{temporal-p}, b_{temporal-p}, V_{temporal-p}, PV_{temporal-p}, direction, speed)$ is a presentation element with $t_{temporal-p} = Temporal\text{-}P \in PT$, $V_{temporal-p} = PV_{temporal-p} = \emptyset$, $direction \in \{-1, 1\}$, and $speed \in \mathbb{R}$.
Like the spatial projector element, a temporal projector element applies not only to the presentation element $p$ it is bound to but to all successors of that presentation element. If, for example, the temporal-p projector of a presentation element $p$ defines $speed = 2$ and a successor $p'$ of $p$ has a temporal projector that also defines $speed = 2$, then, in fact, the successor $p'$ is presented at a speed factor of 4.
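The composition of nested temporal projectors is multiplicative, as in the example above; a one-line sketch:

```python
from functools import reduce

def effective_speed(factors):
    """Speed factors of nested temporal projectors multiply along a path
    from the root to a presentation element."""
    return reduce(lambda a, b: a * b, factors, 1.0)

print(effective_speed([2, 2]))   # 4.0, as in the speed = 2 example above
```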
The acoustic projector element and the typographic projector element are defined in the same way. The acoustic projector element acoustic-p (Definition 18) determines the volume, balance, base, and treble of the presentation of the presentation element $p$ and all successors of $p$. The typographic projector element typographic-p (Definition 19) affects the parameters font, size, style, background, and foreground color of the presentation of the presentation element $p$ it is bound to and all successors of $p$.
Definition 18 (Acoustic Projector Element—acoustic-p). The acoustic projector element acoustic-p = $(t_{acoustic-p}, b_{acoustic-p}, V_{acoustic-p}, PV_{acoustic-p}, volume, balance, base, treble)$ is a presentation element with $t_{acoustic-p} = Acoustic\text{-}P \in PT$, $V_{acoustic-p} = PV_{acoustic-p} = \emptyset$, $volume \in [0, \ldots, 100]$, and $balance, base, treble \in [-1, \ldots, 1]$.

Fig. 13. Sample fragment illustrating the usage and semantics of nesting spatial projector elements.
Definition 19 (Typographic Projector—typographic-p). The typographic projector element typographic-p = $(t_{typographic-p}, b_{typographic-p}, V_{typographic-p}, PV_{typographic-p}, font, size, style, bg, fg)$ is a presentation element with $t_{typographic-p} = Typographic\text{-}P \in PT$, $V_{typographic-p} = PV_{typographic-p} = \emptyset$, $font \in FontNames$, $style \in \{normal, italic, bold\}$, $size$ given in points, and $bg, fg \in COLORS$.
A projector element at first affects the presentation element $p$ it is bound to. If, however, $p$ has successors, then those can be affected, too. Each successor of $p$ is affected if the specific projector can actually have an effect on it. For example, a typographic projector affects only those elements in the subtree of $p$ that bear typographic aspects. In Fig. 7, a spatial and an acoustic projector element are bound to a par temporal operator. The spatial projector applies only to the video, whereas the acoustic projector applies only to the audio that is bound to the par element.
4.5 Interaction Elements
To support the requirement of interactive multimedia
presentations, the model offers different interaction elements
for navigational and design interactions.
The gen_link element (Definition 20) is the basic element for the modeling of navigation in ZYX documents. The generic link is the presentation element that specifies a noninteractive, direct transition to a target element. It serves as the basis for the actual "interactive" elements in the following. The gen_link has the two parameters $target$ and $mode$. The parameter $target$ specifies the target of the transition and the parameter $mode$ specifies how this transition is to be carried out.
Definition 20 (Generic Link Element—gen_link). The interaction element gen_link = $(t_{gen\_link}, b_{gen\_link}, V_{gen\_link}, PV_{gen\_link}, target, mode)$ is a presentation element with $t_{gen\_link} = GenericLink \in OT$, $V_{gen\_link} = PV_{gen\_link} = \emptyset$, $target \in dom(\text{Uniform Resource Identifier})$, and $mode \in \{stop, spawn\}$.
The presentation semantics of the gen_link is that on the presentation of the link element, the link target, which is specified by a Uniform Resource Identifier (URI), is presented. The target need not be a ZYX document but can be an HTML document or an arbitrary application and is presented by the browser/viewer that is associated with the target's URI. The mode of the generic link determines whether the current presentation stops and only the target is presented ($mode = stop$), or if the presentation of the target is presented in parallel with the current presentation ($mode = spawn$). The ZYX sample tree in Fig. 14 shows a video-audio presentation which is followed directly by the presentation of the link target, i.e., the presentation of the target specified with a URI.
As the generic link is intended to model transitions
to arbitrary link targets, we introduce the ZYX_link
(Definition 21) to specify the specific transition to a
ZYX document.
Definition 21 (ZYX Link Element—ZYX_link). The interaction element ZYX_link = $(t_{ZYX\_link}, b_{ZYX\_link}, V_{ZYX\_link}, PV_{ZYX\_link}, target, mode)$ is a presentation element with $t_{ZYX\_link} = ZYXLink \in OT$, $V_{ZYX\_link} = PV_{ZYX\_link} = \emptyset$, $target \in ZYXDOC$, and $mode \in \{stop, spawn\}$.
The semantics of the ZYX_link is that on its presentation, the ZYX document specified by $target$ is presented. The parameter $mode$ describes whether the presentation of the current document stops and the target ZYX document is presented ($mode = stop$), or if it is presented in parallel with the current presentation ($mode = spawn$).
So far, the elements gen_link and ZYX_link are used to model a direct, noninteractive transition to a link target. For a link transition initiated by a user interaction with a visual presentation element, we define the menu interaction element.
The menu interaction element (Definition 22) defines a
set of variables to which the presentation elements of the
visual link anchors are bound and the corresponding
presentation elements that are to be presented when the
respective link anchor is interactively selected.
Definition 22 (Interaction Element—menu). The interaction element menu = $(t_{menu}, b_{menu}, V_{menu}, PV_{menu}, mode)$ is a presentation element with $t_{menu} = Menu \in OT$, $mode \in \{vanish, prevail\}$, $V_{menu} = \{v_1, \ldots, v_n, t_1, \ldots, t_n\}$, and $n \in \mathbb{N}$.

Fig. 14. Sample fragment illustrating the usage and semantics of the gen_link interaction element.
The menu interaction element defines a set of selectable presentation elements (link anchors) bound to $v_i \in V_{menu}$, $i = 1 \ldots n$, representing the menu items. The presentation elements bound to $t_i \in V_{menu}$, $i = 1 \ldots n$, represent the target elements of the selection. Each selectable menu item bound to $v_k$ corresponds to the target $t_k$. The presentation semantics of the menu element is that on presentation of the menu element, all the elements bound to $v_i \in V_{menu}$, $i = 1 \ldots n$, are presented in parallel, i.e., the menu is presented. When a user selects one of the menu items bound to $v_j$, the target element of the selection bound to $t_j$ is presented. The parameter $mode$ determines what happens with the current presentation. If $mode = vanish$, the engine finishes the presentation of all presentation elements bound to $v_i$, $i = 1 \ldots n$, and starts the presentation of the presentation element bound to $t_j$. If $mode = prevail$, the engine "merges" the presentation of the presentation element bound to $t_j$ with the currently running presentation. If no element (menu item) is selected by a user, the presentation of the menu element stops as soon as the presentation of all presentation elements bound to $v_i$, $i = 1 \ldots n$, is finished.
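The following sketch indicates how a presentation engine might resolve a user selection on a menu element; the engine interface with stop() and start() operations is hypothetical and only serves to make the vanish/prevail distinction explicit.

```python
def on_menu_selection(menu_items, targets, j, mode, engine):
    """Handle the selection of menu item j (bound to v_j); its target is the
    element bound to t_j. 'engine' is a hypothetical playout interface."""
    target = targets[j]
    if mode == "vanish":
        for item in menu_items:      # finish all menu items ...
            engine.stop(item)
        engine.start(target)         # ... then present only the target
    elif mode == "prevail":
        engine.start(target)         # merge target with the running presentation
```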
Fig. 15 illustrates the usage of the menu element. $Image_1$ and $Image_2$ represent the two selectable menu items. On interaction with $Image_1$, the presentation of the video-audio presentation bound to $t_1$ starts. On interaction with the link anchor $Image_2$ bound to $v_2$, the presentation of the ZYX_link bound to $t_2$ starts, which results in the presentation of the target ZYX document.
The menu interaction element is provided to allow for
interaction with visual presentation elements and naviga-
tion within a document, i.e., the selection of one out of a set
of possible presentation paths. By using the gen_link and ZYX_link as target elements of the menu element, these paths can leave the document and lead to other documents.
So far, the appearance of a link is limited to the visual
appearance of the presentation element that forms the link
anchor. To offer a more fine-grained specification of link
anchors, e.g., a region in an image or a word within a text,
the ZYX model offers the primitives hotspot and hypertext.
The hotspot element (Definition 23) is a variant of the
menu element but refines the interaction sensitive area to an
arbitrary polygon of a visual element. In addition to the link
anchors in the menu element, it specifies a set of sensitive
areas by polygons. Instead of linking a set of link anchors
with a set of targets in the menu element, the hotspot
element interlinks areas of visual presentation elements with
link targets.
Definition 23 (Interaction Element—hotspot). The interaction element hotspot = $(t_{hotspot}, b_{hotspot}, V_{hotspot}, PV_{hotspot}, P_1, \ldots, P_n, mode)$ is a presentation element with $t_{hotspot} = HotSpot \in OT$, $V_{hotspot} = \{v, t_1, \ldots, t_n\}$, $PV_{hotspot} = \emptyset$, $P_i = (<x_1, y_1>, \ldots, <x_m, y_m>, start, dur)$, $mode \in \{vanish, prevail\}$, and $n \in \mathbb{N}$.
The presentation semantics of the hotspot is the presentation of the link anchor bound to $v$ and, not necessarily visible, the associated interaction-sensitive areas. These areas are each defined by a tuple $P_i$ that specifies the sensitive area by a polygon $<x_1, y_1>, \ldots, <x_m, y_m>$ and the interval $(start, dur)$ for which the sensitive area is active during the presentation. This interval is related to the beginning of the presentation of the hotspot. On user interaction with the sensitive area specified by $P_i$, the corresponding link target $t_i$ is presented under the given mode (vanish or prevail).
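Hit-testing a hotspot thus combines a spatial test against the polygons $P_i$ with a temporal test against their active intervals. The sketch below uses a standard ray-casting point-in-polygon routine; this routine is ordinary computational geometry and not prescribed by the model.

```python
def point_in_polygon(x, y, polygon):
    """Standard ray-casting test; 'polygon' is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def hit_hotspot(areas, click_xy, t):
    """Return the index i of the sensitive area P_i hit at time t (relative to
    the start of the hotspot's presentation), or None. Each area is given as
    (polygon, start, dur)."""
    x, y = click_xy
    for i, (polygon, start, dur) in enumerate(areas):
        if start <= t <= start + dur and point_in_polygon(x, y, polygon):
            return i
    return None
```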
A further variant of the menu element is the hypertext element (Definition 24). As a hotspot allows one to associate an interaction-sensitive region of an image or a video with a link, the hypertext element offers a means to model sensitive parts within text. Like the hotspot, a hypertext interaction element is sensitive for a specified temporal interval $(start, dur)$.
Definition 24 (Interaction Element—hypertext). The interaction element hypertext = $(t_{hypertext}, b_{hypertext}, V_{hypertext}, PV_{hypertext}, T_1, \ldots, T_n, mode)$ is a presentation element with $t_{hypertext} = HyperText \in OT$, $V_{hypertext} = \{v, t_1, \ldots, t_n\}$, $PV_{hypertext} = \emptyset$, $T_i = ((start, length), (start, dur))$, $mode \in \{vanish, prevail\}$, and $n \in \mathbb{N}$.
The presentation semantics of the hypertext is that on its presentation, the presentation of the text anchor bound to $v$ starts. The hypertext element specifies the sensitive regions of the text by means of tuples $T_i = ((start, length), (start, dur))$, each defining a sensitive text segment by its starting text position and its length and the temporal interval for which the sensitive text area is active during the presentation. On user interaction with the sensitive segment of the text defined by $T_i$, the corresponding link target $t_i$ is presented under the given mode (vanish or prevail).

Fig. 15. Sample fragment illustrating the usage and semantics of the menu and ZYX_link interaction elements.
The model provides two further types of interaction elements, interactive projector elements and interactive selector elements. These elements comply in general with the projector and selector elements presented in Definitions 16, 17, 18, and 19, but they have an additional "interactive" aspect, i.e., they can be interactively changed and adjusted by a user. For each of the projector and selector elements, a corresponding interactive element is provided by the model.
An example of an interactive projector element is the interactive temporal projector element temporal-pi (Definition 25), which is an interactive temporal-p projector element. Its presentation semantics is that, in addition to the specified temporal projection during presentation, a user can interactively adjust the element's specific parameters $direction$ and $speed$ within their domains. For each temporal projector, the model offers the corresponding interactive projector element.
Definition 25 (Interaction Element—temporal-pi). The temporal interactive projector element temporal-pi = $(b_{temporal-pi}, V_{temporal-pi}, PV_{temporal-pi}, direction, speed)$ is a presentation element with $V_{temporal-pi} = \emptyset$, $speed \in \mathbb{R}$, and $direction \in \{-1, 1\}$.
An example of an interactive selector is the interaction element spatial-si (Definition 26), which is a special spatial-s selector. Its presentation semantics is that, in addition to the spatial selection, the presentation engine offers the user the possibility to interactively adjust the selected spatial area and the overlapping by changing the parameters $x$, $y$, $width$, $height$, and $priority$ within their domains.
Definition 26 (Interaction Element—spatial-si). The spatial interactive selector element spatial-si = $(b_{spatial-si}, V_{spatial-si}, PV_{spatial-si}, x, y, width, height)$ is a presentation element with $V_{spatial-si} = \emptyset$, $x, y \in \mathbb{N}_0$, and $width, height \in \mathbb{N}$.
Analogously, the temporal-si element is defined. The
interactive selector elements allow us to model the inter-
active spatial and temporal scaling of media elements and
fragments during the presentation.
In addition to the support for navigational interaction by the elements gen_link, ZYX_link, menu, hotspot, and hypertext, the interactive projector and selector elements implement the design interactions of multimedia presentations.
4.6 Adaptation Elements
Our model offers the two elements switch and query which
allow for the adaptation of a multimedia presentation
according to the user's individual context. This user context,
expressing the user's topics of interest, presentation system
environment, network connection characteristics and the
like, is described in a global profile GP by means of attribute-value pairs (Definition 27).
Definition 27 (Global Profile—GP). The Global Profile GP = $\{m_1, \ldots, m_n\}$ is a set of metadata with $m_i = (attr_i, value_i)$ denoting attribute-value pairs that describe the current user context during a presentation, with $attr_i \in ATTRIBUTES$ and $value_i \in dom(attr_i)$, $i \in \mathbb{N}$.
The switch adaptation element (Definition 28) serves the
purpose of specifying different presentation alternatives for
different contexts. Under a switch element, an author can
ªcollectº different alternatives (media elements or frag-
ments) and add metadata to each alternative that specify
under which presentation conditions the alternative is to be
selected. Thereby, an author can define different fragments
for conveying the same content under different presentation
context like system environment, user language, the user's
understanding of the subject, network bandwidth, and the
like. The metadata associated with the switch element is
evaluated by the presentation environment against the
global profile to select the one best matching the current
context.
Definition 28 (Adaptation Element—switch). The adaptation element switch = $(t_{switch}, b_{switch}, V_{switch}, PV_{switch}, M_1, \ldots, M_n)$ is a presentation element with $t_{switch} = Switch \in OT$, $M_i$ denoting sets of attribute-value pairs, $V_{switch} = \{v_1, \ldots, v_n, v_{default}\}$, and $n \in \mathbb{N}$.
The presentation semantics of the switch element is that upon its presentation, the metadata available with the GP is evaluated against the sets of metadata $M_i$, $i = 1 \ldots n$, of the switch. Let $M_j$, $j \in \{1, \ldots, n\}$, be the set of metadata which best matches the GP. Then, the fragment bound to $v_j$, i.e., the presentation alternative best matching the current presentation context, is presented. If there is no suitable set of metadata among $M_1, \ldots, M_n$, the presentation element bound to $v_{default}$ is selected for presentation. The metadata of the switch element is continuously evaluated against the current, possibly changing global profile, i.e., a changing presentation context like varying bandwidth. In this case, during the presentation of the switch element, the presentation environment can select another, more suitable alternative due to a changed context, e.g., switching from a video to a slide show due to decreasing network bandwidth. The presentation of the switch element finally terminates when the presentation of the selected presentation element is finished.
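How "best matching" is determined is up to the presentation environment. The sketch below uses one plausible interpretation, selecting an alternative only if all of its attribute-value pairs are satisfied by the global profile and preferring the most specific match; both choices are assumptions, not part of the model.

```python
def best_alternative(alternatives, gp):
    """Select the switch alternative (Definition 28) that best matches the
    global profile GP; 'alternatives' maps variable names v_i to their
    metadata sets M_i, given as dicts of attribute-value pairs."""
    best_v, best_score = "v_default", 0
    for v, metadata in alternatives.items():
        if all(gp.get(attr) == value for attr, value in metadata.items()):
            score = len(metadata)            # prefer the most specific match
            if score > best_score:
                best_v, best_score = v, score
    return best_v

gp = {"user_group": "professor", "bandwidth": "low"}
alternatives = {
    "v1": {"user_group": "professor", "bandwidth": "high"},
    "v2": {"user_group": "professor", "bandwidth": "low"},
    "v3": {"user_group": "student",   "bandwidth": "low"},
}
print(best_alternative(alternatives, gp))    # v2
```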
For cases in which an author does not want to allow this
kind of continuous adaptation, the model provides the
decide element. The usage of a decide element instead of the
switch element would, e.g., make the presentation stay with
the video, once selected, instead of switching to an
alternative slide show. The definition of the decide element
is given in Definition 29:
Definition 29 (Adaptation Element—decide). The adaptation element decide = $(t_{decide}, b_{decide}, V_{decide}, PV_{decide}, M_1, \ldots, M_n)$ is a presentation element with $t_{decide} = Decide \in OT$, $M_i$ denoting sets of attribute-value pairs, $V_{decide} = \{v_1, \ldots, v_n, v_{default}\}$, and $n \in \mathbb{N}$.
The presentation semantics of the decide element is the
same as that of the switch element. However, the evaluation
of the sets of metadata against the current global profile GP
and the selection of the best match is made only once at the
beginning of the presentation of the decide element.
For cases in which the presentation alternatives of a document are not known at authoring time, the query element (Definition 30) is provided. The query element is just a "placeholder" for a fragment. It specifies a "query" which selects a fragment just before presentation time from all available fragments. The resulting fragment replaces the query element in the ZYX document.

For the definition of the query element, we enhance the definition of a fragment as given in Definition 5 such that a fragment specification also includes metadata, i.e., $f = (P, C, M)$ with $M$ being a set of attribute-value pairs. This metadata describes both the content of a fragment $f$, like the topics covered, and technical features of the fragment, like the network bandwidth needed for its presentation.
Definition 30 (Adaptation Element—query). The adaptation element query = $(t_{query}, b_{query}, V_{query}, PV_{query}, M)$ is a presentation element with $t_{query} = Query \in OT$, $M$ denoting a set of attribute-value pairs, and $V_{query} = \emptyset$.
The semantics of the query element is that before the actual presentation, the metadata of the query element and the global profile, specified by $M \cup GP$, is evaluated against the metadata given with all fragments known to the system. Then, the fragment with the best match with respect to $M$ and the profile GP is selected and the query element is replaced by the selected fragment. The query element allows us to dynamically select the most suitable fragment at presentation time, taking into account the actual user interest and system environment.
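A sketch of this resolution step: the union of the query's metadata $M$ and the global profile GP is matched against the metadata of every fragment in the pool, and the best-scoring fragment replaces the query node. The overlap count used as a score is illustrative, since the query semantics are left to the application.

```python
def resolve_query(query_metadata, gp, fragment_pool):
    """Replace a query element (Definition 30) by the best-matching fragment.
    'fragment_pool' maps fragment ids to their metadata sets M."""
    wanted = {**query_metadata, **gp}        # M united with the global profile
    def score(meta):
        return sum(1 for attr, value in wanted.items() if meta.get(attr) == value)
    return max(fragment_pool, key=lambda fid: score(fragment_pool[fid]))
```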
5 APPLICATION OF ZYX AND IMPLICATIONS TO AUTHORING AND PRESENTATION
We have made clear how important we consider the support for reuse and adaptation by a multimedia document model, requirements we were aiming to meet with the ZYX model. This section illustrates the application of these two specific features, which we elaborated on in detail in Section 2.1, to the authoring and presentation of ZYX multimedia documents. We start out by presenting the many different kinds of reuse of ZYX elements and fragments in Section 5.1 before we come to the various possibilities to employ ZYX for adaptation in Section 5.2. Looking at the impact of new document models like ZYX on multimedia content production, we then point out the implications and positive effects this has on multimedia authoring.
5.1 Reuse
Applying ZYX for reuse means that, first, we show how identification and selection are supported by ZYX, as this forms the basis for efficient reuse of media elements, fragments, and documents. Then, we show the reuse of ZYX elements at different granularities and present structural vs. identical reuse in ZYX.
5.1.1 Identification and Selection
Support for identification and selection is obligatory for
content to be efficiently reused. Only if the content can be
easily retrieved within the authoring process can reuse of
material be possible. Hence, sophisticated metadata must be
associated with media elements, fragments, and documents.
The metadata for the media elements comes with the
modeling of the different media types. At the level of
fragments, a set of metadata describes the content of the
composition. This metadata is anchored in the definition of a ZYX fragment $f = (P, C, M)$ and relates especially to the content and targeted user group.
The available metadata concerning both the content and
the structure of fragments can be employed for the
browsing of fragments in an authoring environment and
to identify and select fragments for composition of
ZYX documents.
5.1.2 Different Granularity of Reuse
Equipped with the modeling of metadata of the media
elements and ZYX fragments, we illustrate how reuse of
media elements, fragments, and documents can be exten-
sively applied with ZYX.
Reuse of media elements. Atomic media elements represent the raw media data within ZYX documents. These elements can be reused entirely or only in part. Atomic media elements form the leaves of the document structure. One media element can be used in different branches of the tree. As the atomic media elements only represent the actual media data, an atomic media element may be used several times in the document while the underlying data exists only once. To select only a part of a media element, the selector elements are used. They select the desired scene, visual area, or sound sequence of a medium. In Fig. 16, two different scenes of the same video, showing the opening of a patient's chest before the actual operation on the open heart, as well as two different parts of the same text explaining the operative steps are composed in a ZYX document. The reuse of media elements, especially partial reuse, can avoid redundant preparation of media data for just one single application.

Fig. 16. Reuse of media elements in ZYX.
Reuse of fragments and complex media elements. The
composition of presentation elements leads to fragments
of arbitrary size and complexity. Fragments can be reused
as fragments themselves but also encapsulated within
complex media elements. Both the fragments and the
complex media elements can be bound to any other variable
during the composition of a (new) document. Exploiting
identification and selection as discussed in Section 5.1.1, an
authoring environment for ZYX here can offer the author
fragments or complex media elements relevant in the
desired context to be part of the newly composed docu-
ment. The only difference between reusing complex media
elements and fragments is that with the complex media
elements, the structure and complexity of the selected
subpart of the document are intendedly hidden from the
author. Rather, the semantics of the complex media element
is important, e.g., it comprises a slide show, and of how to
fill the unbound variables with presentation elements and
fragments. As the structure of all ZYX documents is
accessible and explicitly visible, authoring support could
go so far that a sophisticated content-based search algo-
rithm identifies those nodes (presentation elements) in
other documents that could be of interest to an author and
extracts the respective subtree (=fragment) for reuse.
The reuse of fragments and complex media elements of
arbitrary size is a feature that relieves an author from
cutting & pasting formerly composed documents but opens
the way to composition of multimedia documents much
like using a Lego or K'NEX unit construction set.
Fig. 17 illustrates the reuse of fragments and complex
media elements. In the example, the fragment already
introduced in Fig. 16 is reused in a course about operative
surgery. Additionally, an already existing complex media
element about a bypass operation is inserted as a digression
of the course into the specific domain of open heart surgery.
The fragment and the complex media element are, e.g., arranged in a sequential order and this sequence is then, as indicated by the dashed line, part of the entire course.
Reusable templates. With ZYX, an author can define
templates that cover, e.g., a didactic unit like a multimedia
course, a lecture, a technical guide, a tour through a
museum, and the like. Such a template is a regular
ZYX fragment but with unbound variables, i.e., the author leaves some of the leaves of the tree unbound. These templates give other authors a basic structure to start with for the composition of a new ZYX document. Consider the sample fragment in Fig. 18: It forms a sequence of five presentation elements, two of which are bound to a parallel operator. This fragment is encapsulated into a complex media element denoted aTemplate, which another author then uses to "plug in" the missing presentation elements and thereby form a new document. In Fig. 18, two complex media elements, a title and a summary, and two videos with captions are bound to the template aTemplate, e.g., in a semiautomatic authoring process. For this, the author only needs information about the usage of the complex media element, not necessarily about the explicit structure of the template.
Reuse of documents. As entire documents in ZYX are
nothing else but a (logically complete) fragment, documents
can be reused in any other ZYX document. Or, reuse can just mean that an author arbitrarily alters and thereby adjusts an existing ZYX document to his/her specific needs.
5.1.3 Identical versus Structural Reuse
Following one of ZYX's design ideas, the separation of structure from layout, a multimedia document can be reused with different layouts, e.g., a different look and feel. For example, if the layout designer of our Cardio-OP project changes the concept for the overall presentation of medical content in the project, hopefully only the layout of the documents must be changed without touching the documents' structure at all.
Another application is the change of the technical presenta-
tion medium. Consider a presentation with a screen layout.
What happens if the same presentation is to be presented at
a point of information with a touch screen? By exchanging
the layout, the same fragments can be used in different
presentation contexts. As each presentation element distin-
guishes between its variables and projector variables, the structural part can easily be separated from the layout part. An author, hence, can select to use only the structure and assign a new layout to the document or fragment.

Fig. 17. Reuse of fragments and complex media elements in ZYX.

Fig. 18. Templates—structural reuse of ZYX fragments.
With structural reuse of ZYX documents and fragments,
the adaptation of a document's appearance to the presenta-
tion context is possible—here, the relationship between
reuse and adaptation becomes obvious. Fig. 19 gives a
simple example of reusing the same fragment with two
different layouts. The presentation of the same fragment
then changes depending on the layout bound to it.
Structural reuse is also an application of the adaptation of
the layout of ZYX documents to a specific user context.
5.2 Adaptation
In the following, we describe the different adaptation
possibilities we have when exploiting the modeling
primitives of the ZYX model. The adaptation elements switch and query as well as ZYX templates play the key role in supporting adaptation.
5.2.1 Explicit Modeling of Presentation Alternatives
With the modeling of presentation alternatives, the author
of a ZYX document can explicitly model adaptivity to the
user context. For example, in the Cardio-OP context, a
switch can distinguish the alternatives for undergraduate
students, graduate students, and researchers. The switch
element allows us to define arbitrary discriminating values.
An alternative can also be "labeled" by a combination of discriminating values. This means that adaptation has as many dimensions as the author desires.

However, this means that the document, to be adaptable to many different presentation contexts, needs to model all the different presentation alternatives for the respective contexts under the document's switch elements. To relieve an author from such a time-consuming and somehow never-ending story, we propose providing mechanisms to (semi)automatically augment the document with the necessary alternatives, possibly guided by a user. The idea is that the author concentrates on the initial goal, to compose a multimedia document with a certain content, and then enriches the document, exploiting the switch primitive, with additional fragments for conveying the same information but in different presentation contexts. In the following, we only illustrate how this can be achieved; for further details, we refer the reader to [18].
Automatic generation of presentation alternatives—Augmentation. For a fine-grained adaptation to many different user contexts, it is mandatory that a high number of alternatives be available. However, if an author had to specify all possible alternatives, this would result in a very time-consuming composition effort and distract the author from the initial goal, namely, the composition of a sound presentation. To relieve the authors from this additional burden, we propose supporting the automation of the specification of the alternatives. We call this step augmentation of the multimedia document; it takes place after the document has been composed by the author. The augmentation process queries the underlying pool of fragments, exploiting the inherent technical data and the metadata the media elements have been annotated with, to retrieve potential presentation alternatives. The alternatives are then inserted into the document, i.e., the document is augmented by the alternatives to provide for adaptivity in different presentation contexts. However, the suggested alternatives cannot simply be inserted into the document but, to preserve the semantics of the presentation intended by the author, have to undergo a verification to assure that the augmented document is still valid with regard to the representation semantics.
Fig. 20 shows a small document which has been augmented by additional fragments. First, before the augmentation, the document contained the video Video1, indicated in bold face. Then, targeting the document at both a medical professor and a medical student and, at the same time, taking into account three different levels of available bandwidth for the presentation, the augmentation results in a switch element offering such different alternatives. From the technical side, the augmentation has introduced atomic media elements and fragments for medium and low bandwidth. Please note that this does not necessarily mean that there are only different qualities of the same medium. For example, for the professor, the alternative for Video1 at low bandwidth is a complex media element, a slide show SlideShow. Additionally, the document can also be used in the context of a medical student. Therefore, for each available bandwidth, a media element has been added that is targeted at the knowledge and background of a medical student but covers the same topic. The parameters of the switch element in Fig. 20 only indicate the discriminating attributes, as the actual parameter list is too long for this illustrative example.

Fig. 19. Reuse of structure with different layouts in ZYX.

Fig. 20. Augmentation of a ZYX fragment.
In a first step, we elaborated an augmentation scheme to (semi)automatically augment documents with respect to different system contexts, which mainly differ in the targeted bandwidth and system power, by providing presentation alternatives on the level of atomic media elements. We have formalized the verification of this kind of automatic augmentation of ZYX documents with presentation alternatives in [18]. A much more complicated effort is to automatically augment ZYX documents with semantically equivalent fragments that cover larger parts of a presentation. For example, can a subsection of a multimedia presentation intended for a medical doctor be automatically augmented such that an equivalent content is conveyed to a student who presumably has much less background in the field? Here, the annotation of multimedia content possibly must be carried out very carefully by the experts in the field to give an automatic augmentation model sufficient input to select and insert semantically equivalent presentation alternatives. Additionally, the process of augmentation will rather be semiautomatic, possibly guided by an author who is an expert in the field.
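On the level of atomic media elements, the augmentation step can be pictured as follows: for an element of the composed document, alternatives for further user and system contexts are retrieved from the fragment pool and collected under a switch element, with the original element as the default. The verify callback merely stands in for the validity check formalized in [18]; the whole sketch is an illustration, not the augmentation scheme itself.

```python
def augment_with_switch(element, element_metadata, pool, contexts, verify):
    """Collect presentation alternatives for 'element' under a switch.

    'pool' maps fragment ids to metadata; 'contexts' lists the targeted
    presentation contexts, e.g., {"user_group": "student", "bandwidth": "low"}.
    Returns the alternatives (context, fragment id) plus the default element.
    """
    alternatives = []
    for context in contexts:
        wanted = {**element_metadata, **context}
        for fid, meta in pool.items():
            if all(meta.get(attr) == value for attr, value in wanted.items()) \
                    and verify(fid, context):
                alternatives.append((context, fid))
                break                         # one alternative per context
    return alternatives, element              # element serves as v_default
```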
5.2.2 Declarative Modeling of Presentation Alternatives
There are two kinds of applications of query elements for
adaptation: The query elements can be used for the
dynamic binding of fragments just before presentation
and can also be used to support the authoring process.
The query element bears the metadata that is to be
evaluated for the selection of the best matching fragment.
The formal definition of the query element specifies a set of
metadata to be met by the fragment to replace the query
node. The query semantics, however, are not specified by
the model but left to the application.
Query elements can be used to automatically adjust
documents to the current context, i.e., the query elements
are used to select the element that best matches the query at
the latest point in time just before presentation. One of the
advantages of leaving parts of the document somehow a "black box" until the actual request for presentation is that the most up-to-date pool of fragments is always considered in the query evaluation. The
evaluation of a query element specified in a document can
be executed at authoring time to test the later result of the
presentation.
In combination with templates, the query element can be
applied for authoring support. Instead of leaving the
variables of a template unbound, one could bind these to
suitable query elements. The evaluation of the query
element at authoring time can then propose fragments to
be placed at that respective node. By this, a kind of content-
oriented browsing can be inserted in the documents, allowing, e.g., novice users to have an easy start with the model.
5.3 Implications to Authoring and Presentation
The approach we have taken for the modeling of multi-
media content significantly impacts the authoring and
presentation of the multimedia material. Traditional author-
ing systems usually aim at the creation of a preorchestrated
presentation addressing a dedicated user group. These
presentations usually do not allow us to exploit the logical
structure or layout definitions for adaptation of the
presentation during playout. Given our approach, the
authoring process has to focus much more on the structural
composition of multimedia material, separating the logical
structure of a multimedia presentation from its layout
specifications. The resulting composition is no longer a
fixed preorchestrated presentation. It allows for explicit
exploitation of the structural composition in order to adapt
the presentation to individual user needs. In consequence,
the authoring system needs to have access to the individual
media elements, fragments, and documents that should be
considered for composition. Hence, the authoring tool has
to offer browsing, navigation, and selection mechanisms to
the authors in order to identify those media elements in the
multimedia repository that should become part of the
presentation. Obviously, the annotation of media elements,
parts of media elements, fragments, and documents gives
the necessary support for the content-oriented browsing
such that an author can easily identify and select the
relevant parts. The authoring tool can either provide for the
construction of a ZYX document tree from scratch, or allow
for the completion of predefined ZYX templates.
The playout of a ZYX document can be realized in
different ways. As a first alternative, the ZYX document can
be transformed into a presentation format that can be
directly interpreted by existing players. This alternative
seems to be very interesting for the SMIL format, as first
SMIL players are already available. Obviously, the trans-
formation into another document format may result in the
loss of specific features or presentation information if the
target model does not provide the same level of semantic
expressiveness as available by the ZYX model. As a second
alternative, ZYX documents could be played out by a
ZYX-specific presentation engine that is capable of fully
exploiting all the features of the ZYX model with respect to
adaptation of a presentation. This allows for the integration
of new business models into the presentation environment.
For example, the end user can be billed for the actual quality
of the multimedia material s/he received. In the Cardio-OP
project, we developed a specific ZYX presentation engine.
In summary, the kind of structured authoring that results in adaptive multimedia documents and the presentation features of a ZYX-based presentation tool, both aiming at reuse and adaptation of multimedia material, allow for cost-effective multimedia authoring and customized presentations.
6 CONCLUSION AND FUTURE WORK
Starting out with the requirements of the Cardio-OP project, which calls for the support of reusability, adaptation, and presentation-neutral description of the structure and content of
multimedia documents, we sketched our analysis of
existing relevant multimedia document models. As these
models do not meet the project's requirements, we
introduced our new ZYX model that gives the necessary
support. We outlined the design considerations of the
ZYX model and the basic concepts followed by a formal
framework of the ZYX primitives. Finally, we illustrated the
applicability of ZYX for reuse and adaptation and the
challenges and implications of these advanced concepts
when using them for authoring and presentation environ-
ments for multimedia documents.
The ZYX model has been implemented as a DataBlade
module for the object-relational database system Informix
Dynamic Server/Universal Data Option under SUN
Solaris [19], following the architectural framework initi-
ally presented in [20], [21]. The formal description served
as the basis for the definition of an XML DTD for the
ZYX model. This will enable access to content stored in
the Cardio-OP repository by future XML-capable brow-
sers and we can also think about storing ZYX documents
in an SGML/XML-capable database system in the future,
following the approach taken in [22]. Furthermore, we
have developed a generic presentation engine for ZYX
documents which includes support for continuous MPEG
video streams based on an MPEG-specific extension of
the L/MRP buffer management technique [23].
For content-based managing and querying the under-
lying media data, we have been developing a Media
Integration DataBlade module [24] for the IDS/UD which
forms an integration layer offering uniform, homogeneous
access to the different types of media data. Supporting
multimedia authoring, this DataBlade allows for inter-
active content-based browsing in the multimedia material.
With the MediaWorkBench, we have been developing a
tool in Java on top of the Media Integration DataBlade
module for GUI-supported annotating and browsing the
media data.
With regard to the global profile describing the user
context, we have been developing a mathematical model for
the combination of different profiles describing different
aspects of a user like user group, user system environment
and the like into one semantically correct, conflict-free
global profile that can be exploited for presentation of
adaptive ZYX documents.
For adaptation support, we have developed a cross-
media adaptation scheme [18] that can be integrated with
the ZYX model and provides for the automatic augmenta-
tion of ZYX documents by semantically correct presentation
alternatives—a process which relieves the authors from the time-consuming task of comprehensively composing documents for different user and system contexts.
Given this ongoing work, one further goal is to develop
generic composition schemes and, exploiting the metadata
provided with the fragments and the global profile
describing the user context, to support (semi)automatic
composition of documents that are adapted and persona-
lized to the specific user context.
ACKNOWLEDGMENTS
The authors would like to thank Utz Westermann for his
contributions to the design and implementation of the
ZYX model. The authors would like to thank Jochen Wandel
for his contributions to the formal framework to support
automatic augmentation of multimedia document models.
The authors would also like to thank Christian Heinlein for
his valuable comments on the paper.
REFERENCES
[1] W. Klas, C. Greiner, and R. Friedl, "Cardio-OP—Gallery of Cardiac Surgery," Proc. IEEE Int'l Conf. Multimedia Computing and Systems (ICMCS '99), 1999.
[2] D. Raggett, A. Le Hors, and I. Jacobs, HTML 4.0 Specification—W3C Recommendation, revised on 24-April-1998, W3C, URL: http://www.w3.org/TR/1998/REC-html40-19980424, Apr. 1998.
[3] ISO/IEC JTC1/SC29, Information Technology—Coding of Multimedia and Hypermedia Information—Part 1: MHEG Object Representation, ISO/IEC 13522-1, ISO/IEC IS, 1997.
[4] ISO/IEC JTC1/SC29/WG12, Information Technology—Coding of Multimedia and Hypermedia Information—Part 6: Support for Enhanced Interactive Applications, ISO/IEC IS 13522-6, ISO/IEC, 1996.
[5] ISO/IEC JTC1/SC29/WG12, Information Technology—Coding of Multimedia and Hypermedia Information—Part 5: Support for Base-Level Interactive Applications, ISO/IEC IS 13522-5, ISO/IEC, 1995.
[6] ISO/IEC, Information Technology—Hypermedia/Time-Based Structuring Language (HyTime), ISO/IEC IS 10744, 1992.
[7] S.R. Newcomb, N.A. Kipp, and V.T. Newcomb, "HyTime—The Hypermedia/Time-Based Document Structuring Language," Comm. ACM, vol. 34, no. 11, Nov. 1991.
[8] P. Hoschka, S. Bugaj, D. Bulterman et al., Synchronized Multimedia Integration Language—W3C Working Draft 2-February-98, W3C, URL: http://www.w3.org/TR/1998/WD-smil-0202, Feb. 1998.
[9] S. Boll, W. Klas, and U. Westermann, "Multimedia Document Formats—Sealed Fate or Setting Out for New Shores?" Proc. IEEE Int'l Conf. Multimedia Computing and Systems (ICMCS '99), 1999.
[10] S. Boll, W. Klas, and U. Westermann, "A Comparison of Multimedia Document Models Concerning Advanced Requirements," Technical Report—Ulmer Informatik-Berichte Nr. 99-01, Univ. Ulm, Germany, http://www.informatik.uni-ulm.de/dbis/Cardio-OP/publications/TR99-01.ps.gz, Feb. 1999.
[11] S. Boll, W. Klas, and U. Westermann, "Multimedia Document Formats—Sealed Fate or Setting Out for New Shores?" Multimedia—Tools and Applications, vol. 11, no. 2, pp. 267-279, Aug. 2000.
[12] T.D.C. Little and A. Ghafoor, "Interval-Based Conceptual Models for Time-Dependent Multimedia Data," IEEE Trans. Knowledge and Data Eng., vol. 5, no. 4, Aug. 1993.
[13] T. Wahl and K. Rothermel, "Representing Time in Multimedia Systems," Proc. IEEE Int'l Conf. Multimedia Computing and Systems, pp. 538-543, 1994.
[14] A. Duda and C. Keramane, "Structured Temporal Composition of Multimedia Data," Proc. IEEE Int'l Workshop Multimedia Database Management Systems, 1995.
[15] N. Hirzalla, B. Falchuk, and A. Karmouch, "A Temporal Model for Interactive Multimedia Scenarios," IEEE Multimedia, vol. 2, no. 3, pp. 24-31, Fall 1995.
[16] D. Papadias, Y. Theodoridis, T. Sellis, and M.J. Egenhofer, "Topological Relations in the World of Minimum Bounding Rectangles: A Study with R-Trees," Proc. ACM SIGMOD Conf. Management of Data, 1995.
[17] M.J. Egenhofer and R. Franzosa, "Point-Set Topological Spatial Relations," Int'l J. Geographic Information Systems, vol. 5, no. 2, Mar. 1991.
[18] S. Boll, W. Klas, and J. Wandel, "A Cross-Media Adaptation Strategy for Multimedia Presentations," Proc. ACM Multimedia '99, 1999.
[19] S. Boll, W. Klas, and U. Westermann, "Exploiting OR-DBMS Technology to Implement the ZYX Data Model for Multimedia Documents and Presentations," Proc. Datenbanksysteme in Büro, Technik und Wissenschaft (BTW '99), GI-Fachtagung, 1999.
[20] W. Klas and K. Aberer, "Multimedia and Its Impact on Database System Architectures," Multimedia Databases in Perspective, P.M.G. Apers, H.M. Blanken, and M.A.W. Houtsma, eds., 1997.
[21] S. Boll, W. Klas, and M. Löhr, "Integrated Database Services for Multimedia Presentations," Multimedia Information Storage and Management, S.M. Chung, ed., 1996.
[22] K. Böhm, K. Aberer, and W. Klas, "Building a Hybrid Database Application for Structured Documents," Multimedia—Tools and Applications, vol. 8, no. 1, 1999.
[23] F. Moser, A. Kraiß, and W. Klas, "L/MRP: A Buffer Management Strategy for Interactive Continuous Data Flows in a Multimedia DBMS," Proc. Very Large Data Bases, 1995.
[24] U. Westermann and W. Klas, "Architecture of a DataBlade Module for the Integrated Management of Multimedia Assets," Proc. First Int'l Workshop Multimedia Intelligent Storage and Retrieval Management (MISRM), 1999.
Susanne Boll received the diploma degree in
computer science at the Technical University of
Darmstadt, Germany, in 1995. She currently
pursues her PhD studies working as a research
assistant for Professor Klas. She is a member of
the Institute for Computer Science and Business
Informatics at the University of Vienna, Austria.
Until 2000, she was a member of the Database
and Systems (DBIS) group at the University of
Ulm, Germany. Her research interests lie in the
areas of database-driven Internet-based multimedia information sys-
tems and e-commerce systems. Currently, she works on flexible,
adaptive multimedia document models and support for context-specific
multimedia presentation generation.
Wolfgang Klas is a professor at the Institute for
Computer Science and Business Informatics at
the University of Vienna, Austria. Until 2000, he
was a professor in the Computer Science
Department at the University of Ulm, Germany.
Until 1996, he was head of the Distributed
Multimedia Systems Research Division
(DIMSYS) at GMD-IPSI, Darmstadt, Germany,
and directed many research projects and in-
dustrial collaborations in the fields of object-
oriented database technology, multimedia information systems, inter-
operable database systems, and cooperative systems. In 1991/1992,
Dr. Klas was a visiting fellow at the International Computer Science
Institute (ICSI) at the University of California at Berkeley. His research
interests are currently in multimedia information systems and Internet-
based applications. He currently serves on the editorial board of the
Very Large Data Bases Journal and has been a member and chair of
program committees of many conferences.
Article
The mulsemedia (Multiple Sensorial Media (MulSeMedia)) concept has been explored to provide users with new sensations using other senses beyond sight and hearing. The demand for producing such applications has motivated various studies in the mulsemedia authoring phase. To encourage researchers to explore new solutions for enhancing the mulsemedia authoring, this survey article reviews several mulsemedia authoring tools and proposals for representing sensory effects and their characteristics. The article also outlines a set of desirable features for mulsemedia authoring tools. Additionally, a multimedia background is discussed to support the proposed study in the mulsemedia field. Open challenges and future directions regarding the mulsemedia authoring phase are also discussed.
Conference Paper
We present an analysis of a large corpus of multimedia documents obtained from the web. From this corpus of documents, we have extracted the media assets and the relation information between the assets. In order to conduct our analysis, the assets and relations are represented using a formal ontology. The ontology not only allows for representing the structure of multimedia documents but also to connect with arbitrary background knowledge on the web. The ontology as well as the analysis serve as basis for implementing a novel search engine for multimedia documents on the web.
Article
Methods for authoring Web-based multimedia presentations have advanced considerably with the improvements provided by HTML5. However, authors of these multimedia presentations still lack expressive, declarative language constructs to encode synchronized multimedia scenarios. The SMIL Timesheets language is a serious contender to tackle this problem as it provides alternatives to associate a declarative timing specification to an HTML document. However, in its current form, the SMIL Timesheets language does not meet important requirements observed in Web-based multimedia applications. In order to tackle this problem, this paper presents the ActiveTimesheets engine, which extends the SMIL Timesheets language by providing dynamic clientside modifications, temporal linking and reuse of temporal constructs in fine granularity. All these contributions are demonstrated in the context of a Web-based annotation and extension tool for multimedia documents.
Article
Full-text available
There are several education and training cases where multi-camera view is a traditional way to work: performing arts and news, medical surgical actions, sport actions, instruments playing, speech training, etc. In most cases, users need to interact with multi camera and multi audiovisual to create among audiovisual segments their own relations and annotations with the purpose of: comparing actions, gesture and posture; explaining actions; providing alternatives, etc. Most of the present solutions are based on custom players and/or specific applications which force to create custom streams from server side, thus leading to restrictions on the user activity as to establishing dynamically additional relations. Web based solutions would be more appreciated and are complex to be realized for the problems related to the video desynchronization. In this paper, MyStoryPlayer/ECLAP solution is presented. The major contributions to the state of the art are related to: (i) the semantic model to formalize the relationships and play among audiovisual determining synchronizations, (ii) the model and modality to save and share user experiences in navigating among lessons including several related and connected audiovisual, (iii) the design and development of algorithm to shorten the production of relationships among media, (iv) the design and development of the whole system including its user interaction model, and (v) the solution and algorithm to keep the desynchronizations limited among media in the event of low network bandwidth. The proposed solution has been developed for and it is in use within ECLAP (European Collected Library of Performing Arts) for accessing and commenting performing arts training content. The paper also reports validation results about performance assessment and tuning, and about the usage of tools on ECLAP services. In ECLAP, users may navigate in the audiovisual relationships, thus creating and sharing experience paths. The resulting solution includes a uniform semantic model, a corresponding semantic database for the knowledge, a distribution server for semantic knowledge and media, and the MyStoryPlayer Client for web applications.
Article
Structured document plays a vital role in information carrier for realizing information exchange and dissemination in digital community. However, there is no prior work on discussing structured document model which appropriates for describing special characteristics of the structured document in digital community. In this paper, we present an appropriate structured document model which is elaborately described by formal method and based on the analysis of the special live characteristics of the structured document in digital community. And then, we design some suitable constraints in order to construct the well-formed structured document model.
Article
Full-text available
Practical needs in geographic information systems (GIS) have led to the investigation of formal and sound methods of describing spatial relations. After an introduction to the basic ideas and notions of topology, a novel theory of topological spatial relations between sets is developed in which the relations are defined in terms of the intersections of the boundaries and interiors of two sets. By considering empty and non-empty as the values of the intersections, a total of sixteen topological spatial relations is described, each of which can be realized in R 2. This set is reduced to nine relations if the sets are restricted to spatial regions, a fairly broad class of subsets of a connected topological space with an application to GIS. It is shown that these relations correspond to some of the standard set theoretical and topological spatial relations between sets such as equality, disjointness and containment in the interior.
Article
Full-text available
Existing multimedia document models like HTML, MHEG, SMIL, and HyTime lack appropriate modeling primitives to fit the needs of next generation multimedia applications which bring up requirements like reusability of multimedia content in different presentations and contexts, and adaptation to user preferences. In this paper, we motivate and present new requirements stemming from advanced multimedia applications and the resulting consequences for multimedia document models. Along these requirements, we discuss the document model standards HTML, HyTime, MHEG, SMIL, and ZYX, a new model that has been developed with special focus on reusability and adaptation. The analysis and comparison of the models show the limitations of existing models, point the way to the need for new flexible multimedia document models, and throw light on the many implications on authoring systems, multimedia content management, and presentation.
Conference Paper
Full-text available
Adaptation techniques for multimedia presentations are mainly concerned with switching between different qualities of single media elements to reduce the data volume and by this to adapt to limited presentation resources. This kind of adaptation, however, is limited to an inherent lower bound, i.e., the lowest acceptable technical quality of the respective media type. To overcome this limitation, we propose cross-media adaptation in which the presentation alternatives can be media elements of different media type, even different fragments. Thereby, the alternatives can extremely vary in media type and data volume and this enormously widens the possibilities to efficiently adapt to the current presentation resources. However, the adapted presentation must still convey the same content as the original one, hence, the substitution of media elements and fragments must preserve the presentation semantics. Therefore, our cross-media adaptation strategy provides models for the automatic augmentation of multimedia documents by semantically equivalent presentation alternatives. Additionally, during presentation, substitution models enforce a semantically correct information flow in case of dynamic adaptation to varying presentation resources. The cross-media adaptation strategy allows for flexible reuse of multimedia content in many different environments and, at the same time, maintains a semantically correct information flow of the presentation.
Working Paper
As multimedia systems deal with a variety of temporally interrelated media items, synchronization is an important issue in those systems. One part of synchronization is the representation of temporal information. In contrast to traditional computing tasks, multimedia imposes new requirements on the representation of time. Specifically, a fine-grained and a flexible temporal model is required. Therefore, a number of temporal models have been suggested by various authors. This paper evaluates and classifies a selection of the most common existing models applying fundamental statements of the time theory and temporal logic. Learning from the deficits of the existing models, a new temporal model based on interval operators is proposed for multimedia systems.
Article
Recent developments in spatial relations have led to their use in numerous applications involving spatial databases. This paper is concerned with the retrieval of topological relations in Minimum Bounding Rectangle-based data structures. We study the topological information that Minimum Bounding Rectangles convey about the actual objects they enclose, using the concept of projections. Then we apply the results to R-trees and their variations, R+-trees and R*-trees in order to minimise disk accesses for queries involving topological relations. We also investigate queries that involve complex spatial conditions in the form of disjunctions and conjunctions and we discuss possible extensions.
Article
This document specifies version 1 of the Synchronized Multimedia Integration Language (SMIL 1.0, pronounced "smile"). SMIL allows integrating a set of independent multimedia objects into a synchronized multimedia presentation. Using SMIL, an author can 1. describe the temporal behavior of the presentation 2. describe the layout of the presentation on a screen 3. associate hyperlinks with media objects This specification is structured as follows: Section 1 presents the specification approach. Section 2 defines the "smil" element. Section 3 defines the elements that can be contained in the head part of a SMIL document. Section 4 defines the elements that can be contained in the body part of a SMIL document. In particular, this Section defines the time model used in SMIL. Section 5 describes the SMIL DTD.
Conference Paper
Next generations of online multimedia training and education applications call for new approaches for the creation, storage, maintenance, commercial marketing, and publishing of multimedia content. The project “Gallery of Cardiac Surgery” (Cardio-OP) aims at the development of an Internet based database-driven multimedia information system for physicians, medical lecturers, students, and patients in the domain of cardiac surgery. The research project has a volume of about 3.3 Million Euro and constitutes a total effort of about 41 person years. Scientific contributions of Cardio-OP include a new approach towards the organization and online distribution of multimedia content, its creation and authoring, and its maintenance in a multimedia repository. The resulting information system is intended to be applicable to other application domains, such as continuous education and training programs for employees in production processes. The paper presents details on the project background and motivation, overall goals and objectives, and outlines some of the approaches taken and results achieved