ZYX - A Multimedia Document Model for Reuse
and Adaptation of Multimedia Content
Susanne Boll and Wolfgang Klas
Abstract - Advanced multimedia applications require adequate support for the modeling of multimedia content by multimedia
document models. More and more this support calls for not only the adequate modeling of the temporal and spatial course of a
multimedia presentation and its interactions, but also for the partial reuse of multimedia documents and adaptation to a given user
context. However, our thorough investigation of existing standards for multimedia document models such as HTML, MHEG, SMIL, and
HyTime leads us to the conclusion that these standard models do not provide sufficient modeling support for reuse and adaptation.
Therefore, we propose a new approach for the modeling of adaptable and reusable multimedia content, the ZYX model. The model
offers primitives that provide, beyond the more or less common primitives for temporal, spatial, and interaction modeling, variform
support for reuse of structure and layout of document fragments and for the adaptation of the content and its presentation to the user
context. We present the model in detail and illustrate the application and effectiveness of these concepts by samples taken from our
Cardio-OP application in the domain of cardiac surgery. With the ZYX model, we developed a comprehensive means for advanced
multimedia content creation: support for template-driven authoring of multimedia content and support for flexible, dynamic composition
of multimedia documents customized to the user's local context and needs. The approach significantly impacts and supports the
authoring process in terms of methodology and economic aspects.
Index Terms - Multimedia document model, reuse, adaptation, multimedia database system.
1 INTRODUCTION
MULTIMEDIA applications need data models for the
representation of the composition of media elements: multimedia document models. They are employed
to model the semantic relationships between the media
elements participating in a multimedia presentation. The
initial requirements for multimedia documents are the modeling of the temporal and spatial course of a multimedia presentation and also the modeling of user interaction. However, the requirements of multimedia applications have evolved: As authoring of multimedia information is a very time-consuming and costly task, attention has been drawn to the reuse of multimedia documents for efficiency and economic reasons. Further-
more, the growing plenitude of multimedia information
calls for the personalization of the multimedia informa-
tion according to the user's individual context. Access
and distribution of multimedia documents via networks
like the Internet require adaptation of the documents to
heterogeneous network and system environments.
Our research project, "Gallery of Cardiac Surgery" (Cardio-OP) [1], is an example of an advanced multimedia
application that emphasizes this need for reuse and
adaptation and explicitly requires a model for multimedia
material that supports extensive reuse of the material in
different user contexts. The overall goal is to develop an
Internet-based and database-driven multimedia informa-
tion system for physicians, medical lecturers, students, and
patients in the domain of cardiac surgery. The system will
serve as a common information and education base for its
different types of users in which the users are provided
with multimedia information according to their specific
request, their different understanding of the selected
subject, and their geographic location and technical infra-
structure. Within this project context, our group is devel-
oping concepts and prototypical implementations of a
database-driven multimedia repository that integrates mod-
eling, management, and content-based retrieval of multimedia
content with flexible dynamic multimedia presentation services
that select, deliver, and present the multimedia content
according to the user context. Major project requirements
are the support for reuse, adaptation, and presentation-neutral
The authors are with the University of Vienna, Institute for Computer
Science and Business Informatics, Liebiggasse 4/3-4, A-1010 Wien,
Austria. E-mail: {susanne.boll, wolfgang.klas}@univie.ac.at.
Manuscript received Apr. 1999; revised Nov. 1999; accepted Dec. 1999.
1. This work was partially funded by the German Ministry of Research
and Education, grant number 08C58456. Our project partners are the
University Hospital of Ulm, Dept. of Cardiac Surgery and Dept. of
Cardiology, the University Hospital of Heidelberg, Dept. of Cardiac
Surgery, an associated Rehabilitation Hospital, the publishers Barth-Verlag
and dpunkt-Verlag, Heidelberg, FAW Ulm, and ENTEC GmbH, St.
Augustin. For details see also URL http://www.informatik.uni-ulm.de/
dbis/Cardio-OP/.
description of the structure and content of multimedia
documents.
Given the project's requirements, we were looking for a
suitable modeling support among existing multimedia
document standards. Therefore, we elaborated both the
traditional and advanced requirements for multimedia document models and, equipped with these metrics, analyzed the document models HTML [2], MHEG [3], [4], [5],
HyTime [6], [7], and SMIL [8]. The detailed analysis and
comparison of the models can be found in [9], [10], [11].
However, the analysis of the models' basic modeling
concepts as well as their support for reuse, adaptation,
and presentation-neutral description of multimedia content
showed that each of the models lacks some significant
concepts and does not meet all of the requirements.
Therefore, we designed and implemented the ZYX model
to overcome these limitations and to have a proper basis to
start out from to comprehensively provide for reusability
and adaptation by the multimedia repository.
In this paper, we present the ZYX model, which forms
the core for the modeling of the multimedia content in our
repository. In comparison to existing models, it provides
more adequate support for semantic modeling, reusability
and flexible composition, adaptation and individualization
for presentation, and presentation-neutral storage. We
illustrate the application of the model in the domain of
cardiac surgery and point out the implications of such a
model that supports reuse and adaptation to multimedia
authoring and multimedia presentation.
The paper is organized as follows: Section 2 provides the reader with a better understanding of the new requirements we see with next-generation multimedia applications. This leads to the metrics we used to analyze existing multi-
media document models. The summary of this analysis is
also presented in this section. It motivates the need for our
new document model ZYX that emphasizes the require-
ments for reuse and adaptation of multimedia documents.
Section 3 presents the basic ideas and design considerations
of the ZYX model, Section 4 gives the formal framework for
a detailed understanding of the model. Focusing on reuse
and adaptation, Section 5 presents and illustrates the
spectrum of application possibilities of ZYX for reuse and
adaptation and discusses the advantages this support
brings to the creation and delivery of multimedia content.
Section 6 summarizes our work and gives an outlook to
ongoing and future work.
2 REQUIREMENTS FOR MULTIMEDIA DOCUMENT MODELS AND AN ANALYSIS OF EXISTING MODELS
In this section, we present our requirements for multimedia document models. Here, we distinguish basic and advanced requirements. The basic requirements for multimedia document models are the modeling of the temporal and spatial course of a multimedia presentation and the modeling of interaction. The challenging, advanced requirements for multimedia document models are the reusability of the multimedia material, the adaptation to user-specific needs and context, and the presentation-neutral description of the content. As our focus lies on the advanced requirements, we start by presenting these in Section 2.1 and only briefly sketch the basic requirements afterwards in Section 2.2. Both the basic and the advanced requirements constitute the metrics along which we analyzed selected relevant multimedia document models for their suitability in the project context. This analysis is summarized in Section 2.3.
2.1 Advanced Requirements
In order to support a modular and context-dependent composition of multimedia documents from media objects and parts of multimedia documents, document models need to provide support for reuse, adaptation, and the presentation-neutral description of the structure and content of multimedia documents.
Reuse. As explained in Section 1, reuse of multimedia
material is an unavoidable requirement for multimedia
document models. We characterize reusability of multi-
media content along three dimensions: the granularity of
reuse, the kind of reuse, and the selection and identification
of reusable components.
.Granularity. The granularity of reuse determines what
can be reused. Regarding multimedia document
models, we can distinguish at least three levels of
granularity for reusable components: reuse of com-
plete multimedia documents, reuse of fragments of
multimedia documents like single scenes or teaching
units, and reuse of individual atomic media elements
such as a video or audio and parts of those media
elements such as a scene of a video.
.Kind of reuse. For all three levels of granularity, we distinguish two different ways of reusing material for the composition of new documents: identical reuse, i.e., the components are reused including all temporal, spatial, design, and interaction relationships and constraints as originally specified by the author(s), and structural reuse by means of separating layout and structure and reusing only structural parts.
.Selection and identification. Before we can reuse
multimedia components, we have to identify and
select them within the multimedia information
system. This calls for metadata and for mechanisms
for classifying, indexing, and querying components.
Hence, a document model should provide support
for the comprehensive and sophisticated annotation
of reusable components with metadata.
Adaptation. The presentation of multimedia documents should preferably adapt to the user context, like the user's interest, knowledge level, preferences, the targeted user system environment, and varying resources like available network bandwidth and CPU time. To introduce adaptivity into multimedia presentations, a multimedia document model must offer primitives to specify, generate, or derive in some way presentation alternatives that reflect and meet the different presentation contexts. For an actual presentation, the system can use these alternatives to adapt the delivery and the rendering of the presentation to the current user context.
For example, consider a professor on campus who is interested in seeing in-depth multimedia material on coronary artery bypass grafting and an undergraduate student at home who needs to get only an abstraction of the same material to pass the upcoming exam. In these two different presentations, the "story" behind each actual presentation,
however, might be the same; some components of the
professor's presentation might be (re)used in the student's
presentations while others might be substituted or adapted
by more abstract representations of the specific content.
For a better understanding, we distinguish adaptation by
the extent to which the adaptability is modeled and when the
adaptability is exploited:
.Extent of the adaptability: For the extent of the
adaptability, we distinguish between adaptation to
personal interest, which adapts the contents of a
document to the user's interests, knowledge, profes-
sional background, etc. and adaptation to technical
infrastructure, which adapts to the technical infra-
structure available to a user.
In the example above, adaptation to technical
infrastructure would be the capability to adapt the
document's presentation both to the high-end
environment of the professor on campus and the
low-end environment of the student at home.
Therefore, the presentation should be adaptable by
means of technical parameters like resolution of
images and frame rate of videos, but also by means
of media substitutability like substituting an audio
by text or a video by a sequence of pictures or a
small animation. Adaptation to personal interest
would be an adaptation of the content such that the
professor would see a more in-depth presentation of
the coronary artery bypass grafting, whereas the
student would rather get a simplified variant
presentation of the operation, thus reflecting the
expected background knowledge of the different
users.
.Static or dynamic adaptability. With regard to the
presentation alternatives, it is of interest whether all
possible alternatives for the adaptation are to be
known and modeled at the authoring time of a
multimedia document or whether they are left for
generation at the actual presentation time just when
the adaptation is needed.
Presentation-neutral representation. The multimedia ma-
terial available has to be presentable in a heterogeneous
software and hardware environment which can be found
on the Internet. As a consequence, the multimedia
material has to be stored presentation-neutral, i.e.,
independent of the actual realization of a presentation
at a client. This calls for a presentation-neutral represen-
tation of multimedia content that is convertible into the
respective presentation-specific format used for playout of
the multimedia material. It is desirable that this conver-
sion is lossless and that a conversion to different "output formats" is possible. The presentation-neutral representation of multimedia content should hence, besides the coverage of rich multimedia functionality, take place on a high level of semantics. The presentation-neutral model
should also be open in the sense that it allows for later
integration of multimedia functionality expected to be
developed in the future.
.Multimedia functionality. The multimedia functionality of a multimedia document model describes the expressiveness of its modeling primitives. A document model should offer high multimedia functionality to give sufficient support for modeling multimedia content. With regard to the conversion of a (presentation-neutral) document into another (output) format, this means that if the target document model does not offer multimedia functionality equivalent to that of the source model, the conversion will be lossy.
.Semantic level. A document model describes a
document on a high semantic level if the document's
structure is specified rather than its presentation.
This is helpful and necessary to allow for an
automatic conversion of a document into another
document format, as then the course of the presentation can be extracted and converted more easily. If the document has a low semantic level, a conversion may need knowledge about the multimedia content that often only the author will have.
Therefore, the presentation-neutral representation of multimedia content should have a high multimedia functionality and take place on a high level of semantics.
2.2 Basic Requirements
The traditional requirements for a temporal and spatial
model as well as interaction modeling are imperative for a
multimedia document model and, hence, are presented only briefly for the sake of completeness.
Temporal model. A temporal model (see also [12], [13], [14], [15]) describes temporal dependencies between the media elements of a multimedia document. One can find four types of temporal models: point-based temporal models, interval-based temporal models, event-based temporal models, and script-based approaches, in which temporal relations between media elements are specified by scripts, i.e., programs written in a scripting language which can comprise temporal synchronization operations.
Spatial Model. Three approaches to positioning the visual elements on the presentation medium can be distinguished: absolute positioning based on a coordinate system, directional relations [16], using relations like strong-north and weak-north (to specify overlapping), and topological relations [17], using relations like disjoint, meet, and overlap.
Interaction. Users should be able to interact with
presentations in terms of three types of interaction:
1) Navigational interactions determining the user-defined
flow of a multimedia presentation, 2) design interactions
influencing the visual and audible layout of a presentation,
and 3) movie interactions affecting the temporal course of the
entire presentation. Navigational and design interactions
should be specified within multimedia documents, whereas
movie interactions are expected to be offered by the
presentation engine.
2.3 Analysis of Existing Models
In this section, we very briefly summarize our analysis of
the most relevant existing standards and data models in
view of the requirements presented in the previous section.
Both the basic and the advanced requirements constitute the metrics along which we analyzed selected multimedia document models. Due to the limitation of space, we cannot present our comprehensive and detailed discussion about how the models meet the specific requirements in this paper but refer the reader to [9], [10], [11]. Fig. 1 illustrates
the results of our analysis of the most relevant existing
approaches and shows to which extent HTML/DHTML,
MHEG-5/6, HyTime, and SMIL fulfill the basic and
advanced requirements. For each of the requirements, the individual aspects elaborated in Section 2 are listed and, for each of the models, the figure shows to what extent the requirements are met by the model.
The analysis of existing standards, de facto standard formats, and models shows that, although individual formats and models are strong with respect to particular features, they are not capable of meeting all the requirements identified in the previous section, especially those we find with advanced multimedia applications, i.e., support for reuse, adaptation, and presentation-neutral description. This result led to the design and implementation of the ZYX model, which aims to combine the best features of existing formats and models, especially also recent developments in the area of Internet-applicable models driven by the development of XML and SMIL.
3 THE ZYX MODEL
When designing the ZYX model, we were, of course, taking
into account the lessons learned with the models we
analyzed. To give the reader an understanding of the
design of our model and also the points of contact of ZYX
with other approaches in the field, we sketch our design
considerations in Section 3.1. In Section 3.2, we then
introduce the reader to the basic concepts of our ZYX data
model before we present the detailed formal framework for
ZYX in Section 4.
3.1 Design Considerations
Aiming at the design of a model which fulfills the requirements of reuse, adaptation, and presentation-neutral representation as presented in the previous section, there are still open choices about how to achieve sufficient support for these requirements by a new data model. In the following, we take up the advanced requirements and discuss how we aim to support them in ZYX. With regard to the basic requirements, we present which underlying temporal and spatial model we selected and explain the interaction capabilities.
Presentation-neutral representation. For the supported
degree of presentation neutrality of the multimedia docu-
ment model, the semantic level of the model and the
model's abstraction from the actual presentation are crucial.
Therefore, we decided to develop a data model that
describes a multimedia document on a high semantic level.
This allows us a (lossy) export or conversion of our
multimedia document into data models like MHEG-5,
SMIL, and HTML. To keep the documents independent of
the final realization within a multimedia presentation, the
model strictly separates modeling of layout information
from document structure. To be able to support a rich
multimedia functionality, our model is designed to support
as much of the multimedia functionality of these models as
possible while still keeping a high semantic level.
Reuse. For the structure of the documents, we consider a
hierarchical organization of the document as it can be found
with XML-based document models. To achieve reuse on an
arbitrary level of granularity, the model supports different
granules of reusable components, i.e., media elements,
document fragments, and entire documents. The model
strictly separates modeling of layout information from
structure to keep the documents independent of the final
realization within a multimedia presentation. Due to this separation of layout information from structure, it is possible both to reuse just the structure and add new layout information to it and to reuse the different granules directly together with their layout information. Hence, the ZYX model supports structural and identical reuse of elements, frag-
ments, and documents. For the selection and identification
of the different granules, the model has the capability to
annotate/enhance the granules with content-descriptive
metadata.
Fig. 1. Summary of the support of the basic and advanced requirements by HTML, DHTML, MHEG-5/6, SMIL, and HyTime (+ support, o partial support, - no support).

Adaptation. With our document model, we want to support comprehensive adaptation mechanisms. Adaptability of ZYX is not limited to adaptation to a predefined set of discriminating technical attributes that are exploited for adaptation, as can be found with SMIL, but can be specified by an open set of attributes that reflect a complex user and system context. The model offers the static modeling of "presentation alternatives" that can be exploited for adaptation to the different presentation contexts. Additionally, the model offers primitives that determine the needed presentation alternative only at the point in time when the document is actually requested and presented.
Temporal model. We decided to use an interval-based temporal model. In order to fulfill the important requirement to describe the temporal dimension of interaction, we selected the Interval Expressions [14] to form the basis of the underlying temporal model of the ZYX data model. In comparison to other interval-based temporal models, it allows us to describe relations between time intervals of possibly unknown duration, a feature which is of importance for interaction modeling. The selection of an interval-based temporal model does not contradict the high semantic level of the document model, as would be the case with an event-based or script-based temporal model.
Spatial model. For the spatial layout, we decided to use a point-based description of each visual media entity in a multimedia document. Each visual media entity is assigned a two-dimensional extent plus a third dimension to specify the overlapping of visual media entities. So far, we do not consider the specification of spatial relationships between media entities like right-of or besides. However, as our model strictly separates structure and layout and defines clear interfaces to add layout to structure, it allows for the extension by a more sophisticated spatial model later.
Interaction. Our model supports the two interaction
types: navigational/decision interactions and design inter-
actions. This means that our model provides a comprehen-
sive support for these two interaction types comparable
with the interaction capabilities of MHEG-5, but more
sophisticated than those of SMIL.
3.2 Basic Concepts of the ZYX Model
In this section, we present the terminology and the basic
concepts of the ZYX model. The ZYX model describes a
multimedia document by means of a tree. The nodes of the
tree are the presentation elements and the edges of the tree
bind the presentation elements together in a hierarchical
fashion. Each presentation element has one binding point
with which it can be bound to another presentation element.
It also has one or more variables with which it can bind other
presentation elements. Additionally, each presentation
element can bind projector variables to specify the element's
layout. Fig. 2 introduces the graphical representation of
these basic elements of the model which we use in the
following to illustrate the model's features. The presenta-
tion elements are represented by rectangles, they form the
nodes of the document tree. On top of this rectangle, a
diamond represents the element's binding point. The
variables are represented by the filled circles below the
rectangle. The open circles on the right side of each
presentation element represent the element's projector
variables. The actual connections of variables and projector
variables to binding points of other presentation elements
are represented by edges in the graphical representation. A
variable that is connected to another presentation element is
called a bound variable, those variables that are not
connected are called free variables.
Presentation elements are the generic elements of the
model. They can be media elements that represent the
media data but also elements that represent the temporal,
spatial, layout, and interactive semantic relationships
between the elements of a multimedia document. Consider
the simple document tree, a so-called ZYX fragment, in Fig. 3. A temporal element, the sequential element seq, binds the media elements Image and Text to its variables v2 and v4, as well as a parallel element par to its variable v3. The par element again synchronizes a Video and an Audio which are bound to its variables v6 and v7. The presentation semantics of each fragment is that it starts with the presentation of the root element, here the sequential element. The specific presentation semantics of the seq element is that the elements bound to its variables v1, v2, v3, v4, and v5 are presented one after the other. That is, the element that will be bound to v1 is presented first, then the image bound to v2, then the par element, and so on. The presentation semantics of the par element is that the video and the audio element bound to its variables v6 and v7 are presented in parallel. The sample fragment represents the media elements and the semantic relationships between the four media elements. With the seq element's binding point, this fragment can be bound to another presentation element in a more complex multimedia document tree. The variables v1 and v5 of the fragment are still unbound. Here, an(other) author could later insert, e.g., a title at the beginning and a summary at the end of the sequence.
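As an illustration only (not part of the original model definition), the following Python sketch shows one way the fragment of Fig. 3 could be represented as a tree of presentation elements with binding points and variables. The class and method names (PresentationElement, bind, free_variables) are hypothetical.

```python
# Minimal sketch of a ZYX-like fragment tree; names are hypothetical.
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PresentationElement:
    element_type: str                       # e.g., "seq", "par", "Image", "Video"
    variables: Dict[str, Optional["PresentationElement"]] = field(default_factory=dict)

    def bind(self, variable: str, child: "PresentationElement") -> None:
        """Connect the child's binding point to one of this element's variables."""
        if variable not in self.variables:
            raise KeyError(f"unknown variable {variable}")
        self.variables[variable] = child

    def free_variables(self):
        return [v for v, child in self.variables.items() if child is None]


# The fragment of Fig. 3: seq(v1, Image, par(Video, Audio), Text, v5).
seq = PresentationElement("seq", {f"v{i}": None for i in range(1, 6)})
par = PresentationElement("par", {"v6": None, "v7": None})
seq.bind("v2", PresentationElement("Image"))
seq.bind("v3", par)
seq.bind("v4", PresentationElement("Text"))
par.bind("v6", PresentationElement("Video"))
par.bind("v7", PresentationElement("Audio"))

print(seq.free_variables())   # ['v1', 'v5'] -- still unbound, e.g., for a title and a summary
```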
We now explain the modeling capabilities of our model
with regard to our specific requirements of reusability,
adaptation and presentation-neutral representation, as well
as temporal and spatial modeling, and interaction.
Fig. 2. Graphical representation of the basic document elements.
Fig. 3. Simple document tree - a ZYX fragment.
Reusability. First, we describe the elements of ZYX that
support the different granularity of reusable components of
multimedia documents.
Reusability on the level of media elements is supported
by means of selector elements: These are presentation
elements that determine what, that is, which part of a media
element is presented. They can be used to select and thereby
(re)use a specific part of an audio or a specific area of an
image. To select a part of a continuous media element, the
temporal selector temporal-s specifies start and duration of
the selected sequence. Fig. 4 illustrates the usage and
semantics of a temporal selector element: The temporal
selector selects a scene of a video with a duration of 40 sec beginning at second 10 of the original video.
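For illustration, the following short sketch mimics the temporal selector semantics just described; the helper name and the tuple representation are hypothetical and only show how start and duration could be interpreted.

```python
# Sketch of temporal-s semantics: select a 40-second scene starting at second 10.
from dataclasses import dataclass


@dataclass
class TemporalSelector:
    start: float      # seconds from the beginning of the bound element
    duration: float   # length of the selected interval in seconds

    def selected_interval(self, media_duration: float) -> tuple:
        """Map the selector onto a concrete media element of known duration."""
        end = min(self.start + self.duration, media_duration)
        return (self.start, end)


scene = TemporalSelector(start=10, duration=40)
print(scene.selected_interval(media_duration=95))   # (10, 50)
```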
To select a spatial fraction of a visual media element, the
spatial selector specifies the selected area by a polygon. In
Fig. 5, a spatial selector spatial-s is applied to an image
media element to select a rectangular area from the image.
The selectors can also be applied to fragments, e.g., to select
two minutes of an existing slide presentation or a fraction of
a composite visual element. Reuse is also supported on the
level of fragments. Here templates, complex media elements, and external media elements provide for the reusability of fragments:
.Templates. In the ZYX model, not all of the variables of a presentation element must be bound at authoring time. In Fig. 3, the variables v1 and v5, e.g., the title and the summary of the presentation, are still unbound. This means that the sequence element seq can later be completed by binding presentation elements to the free variables v1 and v5. This makes the simple fragment in Fig. 3 a "template" for later (re)use. This is an important feature for building reusable fragments that can be applied in different multimedia documents by binding the free variables differently (a kind of late binding) corresponding to the current context.
.Complex and external media elements. It is, of course, possible to form more complex fragments like the one shown in Fig. 6. To make reuse easier and more manageable, large document fragments can be encapsulated by complex media elements. Then, an encapsulated fragment appears like a single presentation element in the specification tree, with one binding point and possibly a set of variables. In the example in Fig. 6, different presentation elements of the fragment leave variables unbound, which makes it a template as described above. Here, the encapsulation of fragments by complex media elements is also of help: To make later "filling" of such templates easier, a template can also be encapsulated. The free variables of the fragment are exported and form the variables of the complex media element. Fig. 6 illustrates how a complex media element encapsulates a complex fragment. A complex media element is, in a way, a black-box view of a possibly complex presentation fragment. The concept of free variables in combination with complex media elements guarantees comprehensive and workable reusability on the level of presentation fragments.
.Analogously, an external media element encapsulates a specification of a fragment that was composed in another, external document format. This allows for the inclusion of existing documents of another document format into our model. What is encapsulated by the external media element, however, depends on the external document format.
.Fragments and documents. And, of course, fragments and entire documents can be reused by binding the root element of the fragment or document to a free variable in another document.
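The following sketch illustrates the idea of template reuse by late binding described in the list above. It is a simplified, hypothetical representation (plain dictionaries instead of a full document tree); the function and variable names are assumptions, not part of the ZYX model itself.

```python
# Sketch of template reuse by late binding of free variables (hypothetical names).
def complete_template(template: dict, bindings: dict) -> dict:
    """Return a copy of the template with its free variables bound (late binding)."""
    completed = dict(template)
    for variable, element in bindings.items():
        if completed.get(variable) is not None:
            raise ValueError(f"{variable} is already bound")
        completed[variable] = element
    return completed


# Fragment of Fig. 3 viewed as a template: v1 (title) and v5 (summary) are free.
fig3_template = {"v1": None, "v2": "Image", "v3": "par(Video, Audio)",
                 "v4": "Text", "v5": None}

student_version = complete_template(fig3_template, {"v1": "Text: overview slide",
                                                    "v5": "Image: summary chart"})
professor_version = complete_template(fig3_template, {"v1": "Video: OR recording",
                                                      "v5": "Text: literature list"})
```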
Fig. 4. Temporal selector element temporal-s and its semantics.

Fig. 5. Spatial selector element spatial-s and its semantics.

Fig. 6. Complex fragments encapsulated in a complex media element.

With regard to the kind of reuse, the model supports both identical and structural reuse. Besides the selector elements, the ZYX data model offers projector elements that influence the visual and audible layout in a
presentation of a multimedia document. Projector ele-
ments determine how a media element or a fragment is
presented. They determine, for example, the presentation
speed of a video or the spatial position of an image on
the screen. Projectors are bound to the projector variables of
presentation elements. Each presentation element can
have one or more projector variables to which projectors
can be bound. A projector applies not only to the
presentation element it is bound to but also to its subtree.
For the arbitrary nesting of projectors, authoring tools
should provide support for consistency checking to avoid
contradicting layout specifications.
Fig. 7 illustrates the usage of projector elements and the separation of structure and layout. In this example, a fragment defines the parallel presentation of an audio and a video. Two projector elements are bound to the root element of the fragment, a spatial projector spatial-p and an acoustic projector acoustic-p. Each of the projectors applies only to those elements in the same tree that can be affected by it. Therefore, the spatial projector affects the spatial layout of the video. The acoustic projector applies to the audio element and determines the volume, bass, treble, and balance for presentation. By means of changing or adding projector elements, one can change the layout of the document. This allows for reusability of the same structure with different presentation layouts, i.e., it implements structural reuse. This follows the idea of separating structure from layout information as can be found with SGML and XML and complies also with our requirement for presentation-neutral representation of the documents.
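As a small, hedged illustration of this separation of structure and layout (the dictionaries and function below are hypothetical, not the ZYX implementation), the same presentation-neutral structure can be combined with different projector sets for different playback environments.

```python
# Sketch of structural reuse: one structure, two layout (projector) bindings.
spatial_projector_tv = {"x": 0, "y": 0, "width": 1024, "height": 576, "layer": 1}
spatial_projector_pda = {"x": 0, "y": 0, "width": 320, "height": 240, "layer": 1}
acoustic_projector = {"volume": 0.8, "bass": 0, "treble": 0, "balance": 0.0}


def present(structure: dict, projectors: dict) -> dict:
    """Combine a presentation-neutral structure with one concrete layout."""
    return {"structure": structure, "layout": projectors}


av_fragment = {"par": ["Video", "Audio"]}          # presentation-neutral structure
high_end = present(av_fragment, {"spatial": spatial_projector_tv,
                                 "acoustic": acoustic_projector})
low_end = present(av_fragment, {"spatial": spatial_projector_pda,
                                "acoustic": acoustic_projector})
```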
As we have outlined in the requirements, reuse needs support for the identification and selection of the multimedia content to be reused; hence, metadata is needed. Therefore, each ZYX fragment is assigned a set of metadata that describes its content by means of attribute-value pairs.
3.2.1 Adaptation
Adaptation means that the ZYX document that is delivered
for presentation should best match the context of the user
who requested the document. To support this kind of
adaptation, both a description of the user context and a
multimedia document that can be adapted to this context is
needed.
The context of a user is captured in a so-called user profile,
i.e., metadata that describes the user's topics of interest,
presentation system environment, network connection
characteristics, etc. This metadata is organized as key-value
pairs just as the metadata that is assigned to the multimedia
content.
The ZYX data model provides two presentation elements
for an adaptation of the document to a user profile: the
switch element and the query element. The switch element
allows us to specify different alternatives for a specific part
of the document. With each of the alternatives under a
switch element, there is associated metadata that describes
the context in which this specific alternative is the best
choice for presentation. This metadata is specified as a set of
discriminating attribute-value pairs for each alternative.
During presentation, the user profile is evaluated against
the metadata of the switch and that alternative is selected for
presentation of which the discriminating attributes best
match the current user profile. An illustration of the switch
element is given in Fig. 8. The switch element specifies two
presentation alternatives: the first alternative, bound to v1is
associated with a seminar-like teaching style (type, seminar)
and the second one with a lecture-like type of teaching (type,
lecture). When the document is presented, depending of the
preferred type of teaching which is reflected in the user's
current profile, either the left or the right subtree is
presented. As the switch element can specify an arbitrary
number of alternatives each of which is described by an
arbitrary number of attribute-value pairs, this provides for a
very comprehensive extent of adaptability as almost every
aspect of a user and the environment can be distinguished
and later be evaluated for adaptation during presentation.
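The following sketch shows one plausible way such a switch evaluation could work; it is an assumption for illustration, not the authors' algorithm, and the matching strategy (counting matching attribute-value pairs) is a simplification.

```python
# Sketch: pick the switch alternative whose metadata best matches the user profile.
def evaluate_switch(alternatives, user_profile):
    """Return the alternative with the highest number of matching attribute-value pairs."""
    def score(metadata):
        return sum(1 for key, value in metadata.items()
                   if user_profile.get(key) == value)
    return max(alternatives, key=lambda alt: score(alt["metadata"]))


alternatives = [
    {"fragment": "seminar-style subtree", "metadata": {"type": "seminar"}},
    {"fragment": "lecture-style subtree", "metadata": {"type": "lecture"}},
]
profile = {"type": "lecture", "bandwidth": "low"}
print(evaluate_switch(alternatives, profile)["fragment"])   # lecture-style subtree
```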
A switch element can be used only if all alternatives can be modeled at authoring time, in advance of the presentation. Hence, the switch element implements the requirement
for static adaptability of the model. However, there might be
the case that an author cannot or does not want to exactly
specify a part of the presentation but only describe the
desired fragments and defer the actual selection of suitable
fragments to the point in time when the document is
requested for presentation. For example, an author might
wish to specify that at a specific point in the presentation about "cardiac surgery," a digression into physiology is to be made; however, the author does not want to specify
which fragments are relevant to this but have the most
suitable one selected out of a pool of available fragments
just before presentation. This can be specified with a query element.

Fig. 7. A simple fragment with spatial and acoustic projector elements and their semantics.

Fig. 8. Specification of presentation alternatives with the switch element.

By means of metadata, the query represents the fragment that is expected at this point in the presentation.
When the document is selected for presentation, the query element is evaluated and the element is replaced by the fragment best matching the metadata given by the query element. An illustration of the query element is given in Fig. 9. The sample query element is the placeholder for the fragment best matching the query with topic "physiology in cardiac surgery," of type "lecture," and with a duration of five minutes. The more metadata tuples are used, the more specific the query is. The query element provides for the dynamic adaptability of the model, as the evaluation of the query and the selection of the fragment take place just before presentation.
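A corresponding sketch of query evaluation at presentation time is given below. The fragment pool, ranking scheme, and names are hypothetical; the point is only that the query element acts as a metadata placeholder that is resolved against annotated fragments just before presentation.

```python
# Sketch: resolve a query element against a pool of metadata-annotated fragments.
def resolve_query(query_metadata, fragment_pool):
    """Replace the query element by the pool fragment that matches most query tuples."""
    def matches(fragment):
        return sum(1 for key, value in query_metadata.items()
                   if fragment["metadata"].get(key) == value)
    return max(fragment_pool, key=matches)


pool = [
    {"id": "frag-17", "metadata": {"topic": "physiology in cardiac surgery",
                                   "type": "lecture", "duration_min": 5}},
    {"id": "frag-42", "metadata": {"topic": "physiology in cardiac surgery",
                                   "type": "seminar", "duration_min": 12}},
]
query = {"topic": "physiology in cardiac surgery", "type": "lecture", "duration_min": 5}
print(resolve_query(query, pool)["id"])   # frag-17
```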
Presentation-neutral representation. The requirement of presentation-neutral representation is strongly interrelated with structural reuse (see also Fig. 7). The explicit separation of structure and layout allows for presentation-neutral representation. As outlined before, the variables of a presentation element need not be bound in the first place; this also applies to the projector variables. It is possible to specify the presentation-neutral course of the presentation and, later, bind the presentation-dependent layout just when the document is selected for presentation. Then, the presentation-neutral structure of the document is bound via projector variables to the presentation-dependent layout defined by a set of projectors.
Temporal and spatial modeling. Based on the Interval
Expressions [14], the model offers the primitives seq, par, loop, and delay to specify temporal interval relationships.
These presentation elements can be nested to specify any
arbitrary temporal course of the multimedia presentation.
For the spatial model, we use the spatial projectors as
presented above. They realize the absolute positioning we
decided to use for the ZYX model. A spatial projector
determines the spatial layout of the presentation element it
is applied to and the layout applies to the entire subtree of
the presentation element.
Interaction. The requirement to support the modeling of
interactive multimedia presentations is met by the data
model's interaction elements. The model offers two types of
interaction elements, navigational interactive elements and
design interactive elements. The basic navigational element is
the genericLink element that allows us to specify the
transition from the document to an arbitrary link target.
Note that this element is not interactive. Based on the
genericLink, the menu element supports interactively selecting one out of a set of visual elements and following the presentation path that is associated with the selected element. The elements hotspot and hypertext define fine-
grained interactive visual areas in images and text. The
design interactive elements are the interactive version of the
projector elements. For example, for the typographic
projector that allows us to specify font, size, and style of a
text, the interactive typographic projector element specifies
that these settings can be altered interactively when the
document is presented.
4 FORMAL FRAMEWORK OF THE ZYX MODEL
In this section, we present the formal framework of the
ZYX model. Therefore, we introduce the reader to the basic
terminology and formalism of the basic elements of the
model and then present the elements for modeling the
temporal course, the layout, interaction, and the adaptation
of the presentation. Fig. 10 gives the reader an overview of
the definitions to follow. They are listed along the
requirements and design criteria presented in Section 2
which were used for the comparison of document models,
illustrated in Fig. 1.
4.1 Basic Terminology
The presentation elements are the generic elements of the ZYX model. Each presentation element $p$ is assigned exactly one binding point $b_p$. This is the connector with which a presentation element can be bound to another presentation element. A presentation element furthermore has $0$ to $n$ variables $v$ which are used to bind other presentation elements to it. To add layout information to a presentation element, it can optionally have $0$ to $n$ projector variables $pv$ that can be used to bind projector elements to the element. The projector variables are treated separately due to the separation of structure and layout.
The symbols introduced in Definition 1 are used in the
definitions to follow.
Definition 1 (Symbols). Let $B$ denote the set of all binding points, $VAR$ the set of all variables, $PVAR$ the set of all projector variables, $T$ the set of all element types, $MT$ the set of media types, $MED$ the set of all raw media data, $OT$ the set of ZYX operator element types, $ZYXDOC$ the set of all ZYX documents, $EXT$ the set of multimedia documents in an external document format, $PT \subseteq OT$ the set of all projector element types, $ATTRIBUTES$ the set of all possible attribute names, and $COLORS$ the set of all possible colors.
Fig. 9. Specification of presentation alternatives with the query element - evaluation of the query element and replacement by the selected ZYX fragment.
A presentation element $p$ is defined as follows:

Definition 2 (Presentation Element). A presentation element $p$ is a tuple $p := (t_p, b_p, V_p, PV_p)$ with $t_p \in T$ denoting the type of $p$, $b_p \in B$ denoting the binding point of $p$, $V_p \subseteq VAR$ denoting the set of variables of $p$, and $PV_p \subseteq PVAR$ denoting the set of projector variables of $p$. The tuple $p$ can be augmented with further tuple elements depending on the type $t_p$ of the presentation element.

A presentation element $p$ can be an atomic media element, a complex media element, an external media element, a specific operator element to build up the temporal, structural, and interactive relationships, or serve for the specification of adaptation. This is distinguished by the type $t_p$ in the definition of a presentation element $p$.
The basic units of a ZYX multimedia document are the
atomic media elements. An atomic media element is an
instantiation of a media type. An atomic media element in
our model abstracts from the raw media data and just
represents the media element and its media specific
characteristics. The formal definition of an atomic media
element is given in Definition 3.
Fig. 10. Summary of definitions of ZYX elements.
Definition 3 (Atomic Media Element). An atomic media element $am := (t_{am}, b_{am}, V_{am}, PV_{am}, m)$ is a presentation element with $t_{am} \in MT = \{Audio, Video, Image, Text, Animation\} \subseteq T$, $V_{am} = \emptyset$, and $m \in MED$ denoting the media data represented by $am$.
Presentation elements are interconnected using their
variables and binding points. Each variable and also each
projector variable of a presentation element can be bound to
exactly one binding point of another presentation element.
Each binding point of a presentation element can be bound
to exactly one variable or projector variable of another
presentation element. A connection binds one variable to a
binding point, and is formally defined in Definition 4.
Definition 4 (Connection). A connection $c = (v, b_{p'})$ connects the (projector) variable $v \in V_p \cup PV_p$ of a presentation element $p$ with the binding point $b_{p'}$ of a presentation element $p' \neq p$.
The result of interconnecting presentation elements is a specification tree that describes a reusable fragment of a multimedia document. A fragment can comprise a single media element, a part of a multimedia document, or an entire multimedia document. The formal description of a valid fragment is given in the following Definition 5.
Definition 5 (Fragment). A fragment $f = (P, C)$ is an acyclic, undirected graph that describes a part of or an entire multimedia document with:

.$P$ is the set of presentation elements that are part of the tree.
.$C \subseteq \{(v, b_{p'}) \mid p, p' \in P, p \neq p', v \in V_p \cup PV_p\}$ is the set of connections in the tree.

For a valid fragment $f = (P, C)$, the following conditions must hold:

1. If $c_1, c_2 \in C$, $c_1 = (v_1, b_p)$, $c_2 = (v_2, b_p)$, $p \in P$, then $v_1 = v_2$, i.e., each binding point can be bound to only one variable.
2. If $c_1, c_2 \in C$, $p, p' \in P$, and $c_1 = (v, b_p)$, $c_2 = (v, b_{p'})$, then $p = p'$, i.e., each variable can be bound to only one binding point.
3. $Unbound_f = \{p \in P \mid \neg\exists v \in \bigcup_{p' \in P} V_{p'} : (v, b_p) \in C\}$ and $|Unbound_f| = 1$, $root_f \in Unbound_f \wedge t_{root_f} \notin PT$: There is exactly one presentation element $p \in P$ of the fragment $f$ that is not bound to any other presentation element. This unbound presentation element is called the root element, denoted $root_f$, of the fragment and has the binding point $b_{root_f}$ that forms the "entry point" of the fragment; note that projector elements cannot be root elements.
4. There is no sequence of connections $c_1, \ldots, c_n$ with $c_i = (v_i, b_{p_i})$ such that $v_{i+1} \in V_{p_i}$ for $i = 1, \ldots, n-1$ and $v_1 \in V_{p_n}$. This means that $f$ is acyclic.
5. $\forall pv \in \bigcup_{p \in P} PV_p : (pv, b_{p'}) \in C \Rightarrow t_{p'} \in PT$. Projector variables of a presentation element can bind only projector elements.
6. $\forall v \in \bigcup_{p \in P} V_p : (v, b_{p'}) \in C \Rightarrow t_{p'} \notin PT$. Variables of a presentation element cannot bind projector elements.
7. $\forall p \in P : t_p \in PT \Rightarrow V_p = PV_p = \emptyset$. A projector element cannot bind any other presentation element.
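For illustration, the following sketch checks a subset of these validity conditions on a fragment given as a flat list of connections. The representation (element ids and (parent, variable, child) triples) and the function name are hypothetical simplifications, not the authors' implementation.

```python
# Sketch: check unique bindings, a single root, and acyclicity (conditions 1-4).
def check_fragment(elements, connections):
    """elements: set of element ids; connections: list of (parent, variable, child)."""
    # Conditions 1 and 2: each binding point (child) and each (parent, variable)
    # pair may occur at most once.
    children = [child for _, _, child in connections]
    variables = [(parent, var) for parent, var, _ in connections]
    assert len(children) == len(set(children)), "binding point bound twice"
    assert len(variables) == len(set(variables)), "variable bound twice"
    # Condition 3: exactly one element is not bound to any other element (the root).
    roots = elements - set(children)
    assert len(roots) == 1, "fragment must have exactly one root"
    # Condition 4: acyclicity -- follow parent links upwards from every element.
    parent_of = {child: parent for parent, _, child in connections}
    for node in elements:
        seen = set()
        while node in parent_of:
            assert node not in seen, "cycle detected"
            seen.add(node)
            node = parent_of[node]
    return roots.pop()


elements = {"seq", "par", "image", "text", "video", "audio"}
connections = [("seq", "v2", "image"), ("seq", "v3", "par"), ("seq", "v4", "text"),
               ("par", "v6", "video"), ("par", "v7", "audio")]
print(check_fragment(elements, connections))   # seq
```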
Fragments form the building blocks of a multimedia
document. They are the units that can be reused and
recomposed in different multimedia documents. To ease
this reuse of ZYX fragments, we introduce the definition of a
complex media element. A complex media object $cm$ encapsulates a fragment $f = (P, C)$ within the definition of a presentation element, much like a container. With this definition, an encapsulated fragment can simply be reused like a single presentation element in any other fragment. A complex media element $cm$ is defined as follows:
Definition 6 (Complex Media Element). A complex media element $cm := (t_{cm}, b_{cm}, V_{cm}, PV_{cm}, f)$ is a presentation element that encapsulates the fragment $f = (P, C)$ with $t_{cm} = Complex \in T$, $b_{cm} = b_{root_f}$, $V_{cm} = \{v \in \bigcup_{p \in P} V_p \mid \forall q \in P : (v, b_q) \notin C\}$, and $PV_{cm} = \{pv \in \bigcup_{p \in P} PV_p \mid \forall q \in P : (pv, b_q) \notin C\}$.

That is, the binding point of the root $root_f$ of the encapsulated fragment $f$ becomes the binding point $b_{cm}$ of the complex media object $cm$. All variables and all projector variables in the fragment $f$ that are not bound are exported and form the free variables $V_{cm}$ and projector variables $PV_{cm}$ of the complex media object. For an illustration, recall Fig. 6: The binding point of the seq element becomes the binding point of the complex media element, and the unbound variables $v_1$, $v_6$, $v_8$, $v_9$, and $v_5$ become the free variables of the complex media element.
As complex media objects encapsulate ZYX fragments, they offer a means of abstraction. The export of free variables allows for a later completion of the complex media element. Hence, complex media elements can form templates which can be "filled" later by binding media elements, other complex media elements, and fragments to the free variables. This "late binding" of presentation elements to the free variables finally instantiates the actual ZYX document.
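The export of free variables in Definition 6 can be sketched as follows; the list-based representation is a hypothetical simplification chosen only to make the set expressions above concrete.

```python
# Sketch of Definition 6: unbound variables of the fragment become the exported
# (free) variables of the complex media element.
def encapsulate(fragment_variables, connections):
    """fragment_variables: all (element, variable) pairs declared in the fragment;
    connections: the (element, variable) pairs that are bound inside the fragment."""
    free = [var for var in fragment_variables if var not in set(connections)]
    return {"type": "Complex", "exported_variables": free}


# Illustrative variables of a Fig. 6-like fragment (hypothetical assignment).
declared = [("seq", "v1"), ("seq", "v2"), ("seq", "v5"),
            ("par", "v6"), ("par", "v8"), ("par", "v9")]
bound_inside = [("seq", "v2")]
print(encapsulate(declared, bound_inside)["exported_variables"])
# [('seq', 'v1'), ('seq', 'v5'), ('par', 'v6'), ('par', 'v8'), ('par', 'v9')]
```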
To encapsulate fragments that are specified in an
external format, we define external media elements
(Definition 7). An external media element em is also a
complex media element. It encapsulates, however, not a
fragment specified in ZYX, but the specification of an
external fragment available in another data model. Like the
complex media element, the external media element is assigned a set of variables $V_{em}$, projector variables $PV_{em}$, and one binding point $b_{em}$. However, the meaning of the
variables and projector variables depends on the external
document format.
Definition 7 (External Media Element). An external media element $em := (t_{em}, b_{em}, V_{em}, PV_{em}, f)$ is a presentation element that encapsulates the fragment $f \in EXT$ with $t_{em} = External \in T$, $b_{em}$ the binding point of the external fragment, $V_{em}$ the variables of the external fragment, and $PV_{em}$ the projector variables of the external fragment.
With the definitions given so far, it is possible to
compose presentation elements by means of connections.
The interconnection of presentation elements via their variables and binding points puts these presentation ele-
ments in a relationship; the semantics of this relationship,
however, is not yet defined. Therefore, our data model
offers different types of presentation elements, operator
elements, with which presentation elements can be inter-
connected with a certain semantics.
In the following, we present the element definitions of temporal operators, projectors, selectors, interaction elements, and adaptation elements.
semantics that have to be interpreted by a presentation
environment and mapped into the spatial, temporal,
structural, interaction, and adaptive domain of a multi-
media presentation. The different operator elements are
defined in the tuple notation as already introduced for the
generic presentation element. Again, the type distinguishes
the different operator elements. For the different elements,
the tuple carries additional operator type-specific values
that characterize the element's specific semantics. To avoid repetition in the definitions to follow, only the domains of each of the newly introduced tuple elements are given.
4.2 Temporal Operator Elements
The temporal operator elements determine the temporal
relationships between the presentation elements. As out-
lined above, our temporal model is based on Interval
Expressions [14]. In the following, we present the definition
of the temporal operator elements par, seq, loop, and delay, their
specific parameters, and semantics. An illustration of these
temporal operator elements is shown in Fig. 11.
The presentation semantics of the par operator element
(Definition 8) is that the presentation elements bound to its
variables are to be presented in parallel.
Definition 8 (Temporal Operator Element - par). The temporal operator element $par := (t_{par}, b_{par}, V_{par}, PV_{par}, finish, lipsync)$ is a presentation element with $t_{par} = Par \in OT$, $V_{par} = \{v_1, \ldots, v_n\} \subseteq VAR$, $finish \in \{1, \ldots, n, min, max\}$, and $lipsync \in \mathbb{N}_0$.

The par operator element offers the two parameters $finish$ and $lipsync$ to control the synchronization of the parallel presentation: The parameter $finish$ determines which one of the $n$ presentation elements bound to $v_1, \ldots, v_n$ terminates the parallel presentation. If $finish$ is set to $min$ or $max$, the presentation stops when the presentation of the element with the minimal, respectively maximal, presentation time stops. By setting $finish = i$, $i \in \{1, \ldots, n\}$, the presentation stops when the presentation of the dedicated presentation element bound to $v_i$ stops. The second parameter $lipsync$ determines the element that forms the master of a continuous fine synchronization during playout of the par operator. If the second parameter $lipsync$ equals $0$, then no lip synchronization is specified. If the value of $lipsync$ is $i$, $i > 0$, the presentation of the presentation elements bound to $v_1, \ldots, v_n$ is carried out in lip synchronization and the presentation element bound to $v_i$ forms the master of this synchronization.
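The effect of the finish parameter can be sketched as follows, assuming the presentation durations of the bound elements are known; the function is hypothetical and only illustrates the termination rule just described.

```python
# Sketch of the par element's finish semantics: when does the parallel presentation stop?
def par_stop_time(durations, finish):
    """durations: presentation times of the elements bound to v1..vn;
    finish: 'min', 'max', or an index i in 1..n."""
    if finish == "min":
        return min(durations)
    if finish == "max":
        return max(durations)
    return durations[finish - 1]          # finish = i: the element bound to v_i terminates the par


durations = [30.0, 45.0, 60.0]            # e.g., video, audio, text ticker
print(par_stop_time(durations, "min"))    # 30.0
print(par_stop_time(durations, 2))        # 45.0 -- stops with the element bound to v2
```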
The presentation semantics of the seq operator element (Definition 9) is that the presentation elements that are bound to it are presented in sequence. The presentation of a seq operator element starts the sequential presentation of the presentation elements that are bound to the variables $v_i$, $i = 1 \ldots n$, in the order $v_1, v_2, \ldots, v_n$. The presentation of the seq operator element begins with the presentation of the presentation element bound to $v_1$ and ends with the end of the presentation of the element bound to $v_n$.

Definition 9 (Temporal Operator Element - seq). The temporal operator element $seq := (t_{seq}, b_{seq}, V_{seq}, PV_{seq})$ is a presentation element with $t_{seq} = Seq \in OT$ and $V_{seq} = \{v_1, \ldots, v_n\} \subseteq VAR$.
The presentation semantics of the loop operator element (Definition 10) is that its presentation starts the repeated presentation of the single presentation element bound to $v \in V_{loop}$. The presentation is repeated $r$ times and stops after the $r$th presentation of the presentation element. If $r$ is set to $\infty$, the presentation of the element loops forever.

Definition 10 (Temporal Operator Element - loop). The temporal operator element $loop := (t_{loop}, b_{loop}, V_{loop}, PV_{loop}, r)$ is a presentation element with $t_{loop} = Loop \in OT$, $|V_{loop}| = 1$, and $r \in \mathbb{N} \cup \{\infty\}$.
The delay operator element (Definition 11) models a temporal delay of $t$ milliseconds. It can be seen as an "empty" media element that is presented for a duration of $t$ milliseconds.

Definition 11 (Temporal Operator Element - delay). The temporal operator element $delay := (t_{delay}, b_{delay}, V_{delay}, PV_{delay}, t)$ is a presentation element with $t_{delay} = Delay \in OT$, $V_{delay} = PV_{delay} = \emptyset$, and $t \in \mathbb{N}$.

Fig. 11. Fragment illustrating the usage of the temporal operator elements.
Fig. 11 illustrates the different temporal operators
defined above. The loop element that forms the root of the
sample fragment specifies that the subtree is repeated 10 times. This subtree comprises a sequence of two videos with accompanying texts, each followed by a short temporal gap of 50 ms for the transition.
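A fragment of this shape can be written down compactly with nested constructors, as in the following sketch. The nested-tuple notation and helper names are hypothetical and serve only to show how the four temporal operator elements compose.

```python
# Sketch of the Fig. 11 fragment: loop(10) over seq(par(video, text), delay(50 ms), ...).
def par(*children, finish="max", lipsync=0):
    return ("par", {"finish": finish, "lipsync": lipsync}, list(children))

def seq(*children):
    return ("seq", {}, list(children))

def loop(child, r):
    return ("loop", {"r": r}, [child])

def delay(ms):
    return ("delay", {"t": ms}, [])


fig11_fragment = loop(
    seq(par("Video1", "Text1"), delay(50),
        par("Video2", "Text2"), delay(50)),
    r=10,
)
```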
4.3 Selectors
The model offers selector elements to reuse parts of media
elements and fragments, i.e., spatial regions and temporal
intervals.
First, Definition 12 introduces the notion of a successor in
a fragment needed for subsequent definitions.
Definition 12 (Successor). Let $F$ denote the set of all fragments. We then define a function $expand : F \to F$ that computes, for a fragment $f$, the fragment that is semantically equivalent to $f$ but does not contain any complex media element. The function $expand(f)$ recursively replaces each complex media element in $f$ by the fragment that the complex media element encapsulates.

Let $f \in F$ be a fragment, $expand(f) = (P, C)$ the expanded fragment, and $p, p' \in P$ presentation elements. Then, the following direct and indirect successor relationships hold:

1. $p'$ is a direct successor of $p \Longleftrightarrow \exists (v, b_{p'}) \in C : v \in V_p$.
2. $p'$ is an indirect successor of $p \Longleftrightarrow p'$ is not a direct successor of $p$ and there exists a sequence $succ_1, \ldots, succ_n$, $n \in \mathbb{N}$, with $succ_1$ a direct successor of $p$, $succ_i$ a direct successor of $succ_{i-1}$, $i = 2, \ldots, n$, and $p'$ a direct successor of $succ_n$.
3. $p'$ is a successor of $p \Longleftrightarrow p'$ is a direct or indirect successor of $p$.
For example, in Fig. 11, the seq element is a direct successor of the loop element. The video and text elements are indirect successors of the loop element and direct successors of the parallel element. There is no successor relationship between the image and the audio media element.
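The successor relation of Definition 12 amounts to reachability in the expanded fragment tree. A minimal sketch of this reading, reusing the illustrative classes from the sketch above and treating the elements bound to a node's variables simply as its children:

```python
def direct_successors(p):
    """Elements bound to the variables of p, i.e., its children in the
    expanded fragment tree (illustrative attribute names)."""
    if hasattr(p, "children"):
        return list(p.children)
    if hasattr(p, "child"):
        return [p.child]
    return []

def is_successor(p, q):
    """True if q is a direct or indirect successor of p (Definition 12)."""
    stack = direct_successors(p)
    while stack:
        node = stack.pop()
        if node is q:
            return True
        stack.extend(direct_successors(node))
    return False
```

For the fragment of Fig. 11, every par, delay, and media element is a successor of the loop element, while two sibling media elements are not successors of each other.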
Now, we can define the different selector elements: the temporal selector, spatial selector, textual selector, and the acoustic selector. A temporal selector temporal-s (Definition 13) is a presentation element that can bind exactly one other presentation element $p$. The presentation semantics of this element is that the presentation of the direct and indirect successors of $p$ is started $start$ milliseconds after the original starting point of the fragment and lasts for $duration$ milliseconds.
Definition 13 (Temporal Selector Element—temporal-s). The temporal selector element temporal-s = $(t_{temporal-s}, b_{temporal-s}, V_{temporal-s}, PV_{temporal-s}, start, duration)$ is a presentation element with $|V_{temporal-s}| = 1$, $t_{temporal-s} = Temporal\text{-}S \in OT$, and $start, duration \in \mathbb{N}_0$.
A spatial selector element spatial-s (Definition 14) can bind exactly one other presentation element $p$, which can be a visual media element like an image or a video but also a complex media element with visual appearance. The spatial selector selects a spatial area from $p$. The presentation semantics of the spatial selector is that only those visual parts of $p$ and its successors that are visible within the rectangular area specified by the element's parameters $x$, $y$, $width$, and $height$ are presented. For an illustration of the spatial selector, see Fig. 5.
Definition 14 (Spatial Selector Element—spatial-s). The spatial selector element spatial-s = $(t_{spatial-s}, b_{spatial-s}, V_{spatial-s}, PV_{spatial-s}, x, y, width, height)$ is a presentation element with $t_{spatial-s} = Spatial\text{-}S \in OT$, $|V_{spatial-s}| = 1$, $x, y \in \mathbb{N}_0$, and $width, height \in \mathbb{N}$.
The application of temporal and spatial selector elements is context sensitive. That is, they apply to the entire subtree of the presentation element bound to it. Selector elements can be organized in a hierarchy and each selector element is applied in the context of the subtree it is bound to. For an illustration, consider the example given in Fig. 12: Two temporal selector elements $s_1$ and $s_2$ with $s_1 = (Temporal\text{-}S, b_{s_1}, \{v_{s_1}\}, \emptyset, 10, 25)$ and $s_2 = (Temporal\text{-}S, b_{s_2}, \{v_{s_2}\}, \emptyset, 10, 40)$ (time in seconds) are nested, with $s_2$ being a direct or indirect successor of $s_1$. Then, the selected temporal interval defined by $s_1$ is defined relative to the temporal interval specified by $s_2$. That is, the start time of 10 sec of $s_1$ is relative to the beginning of the interval already selected by $s_2$.
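The nesting semantics can be read as interval composition: the interval of an enclosing temporal selector is interpreted relative to the interval already selected within its subtree. A minimal sketch under this reading; the clipping of the duration to what remains of the inner interval is an assumption about how an engine would resolve conflicting lengths, not something the model prescribes.

```python
def effective_interval(selectors):
    """Compose nested temporal selectors into one absolute interval.

    'selectors' lists (start, duration) pairs in the order in which they are
    applied, i.e., the selector deepest in the tree (closest to the media)
    first; each enclosing selector is relative to the interval selected so far.
    """
    abs_start, abs_dur = 0, None
    for start, duration in selectors:
        abs_start += start
        if abs_dur is None:
            abs_dur = duration
        else:
            abs_dur = min(duration, max(abs_dur - start, 0))
    return abs_start, abs_dur

# Fig. 12: s2 = (10, 40) is applied first, s1 = (10, 25) on top of it, so the
# start time 10 of s1 refers to the interval already selected by s2.
print(effective_interval([(10, 40), (10, 25)]))   # (20, 25)
```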
To also be able to reuse parts of text, a textual selector textual-s (Definition 15) selects a continuous fraction from the text media element $p$ bound to its variable. The presentation semantics is that only the selected part of the text is presented, i.e., the text fraction that begins at the text position $start$ and has the given $length$ in characters.
Definition 15 (Textual Selector Element—textual-s). The textual selector element textual-s = $(t_{textual-s}, b_{textual-s}, V_{textual-s}, PV_{textual-s}, start, length)$ is a presentation element with $t_{textual-s} = Textual\text{-}S \in OT$, $|V_{textual-s}| = 1$, $start \in \mathbb{N}_0$, and $length \in \mathbb{N}$.

Fig. 12. Sample fragment illustrating the usage and semantics of nesting temporal selector elements $s_1 = (\ldots, 10, 25)$ and $s_2 = (\ldots, 10, 40)$.
4.4 Projectors
To add layout information to a presentation element, its 0 to $n$ projector variables $pv$ can be used to bind projector elements to the presentation element. Projector elements are presentation elements that determine how presentation elements are presented. The model offers four different projector elements, spatial-p, temporal-p, acoustic-p, and typographic-p, to specify the spatial, temporal, acoustic, and typographic layout of a presentation, which we define in the following.
The presentation semantics of the spatial projector element spatial-p (Definition 16) is that the visual presentation of $p$, the presentation element it is bound to, is "projected" onto the rectangular presentation area defined by the projector element. The parameters $x$ and $y$ define the position of the upper left corner of a rectangle with the given $width$ and $height$. The parameter $priority$ defines the order of the overlapping of visual objects such that an object with a higher priority value covers objects with a lower priority value. The value of the parameter $unit$ determines whether the values of the parameters $x, y, width, height$ are given in pixels or in percent of a presentation window.
Definition 16 (Spatial Projector Element—spatial-p). The spatial projector element spatial-p = $(t_{spatial-p}, b_{spatial-p}, V_{spatial-p}, PV_{spatial-p}, x, y, width, height, priority, unit)$ is a presentation element with $t_{spatial-p} = Spatial\text{-}P \in PT$, $V_{spatial-p} = PV_{spatial-p} = \emptyset$, $x, y, priority \in \mathbb{N}_0$, $width, height \in \mathbb{N}$, and $unit \in \{pixel, percent\}$.
The spatial projector, like all projector elements, applies not only to the presentation element $p$ it is bound to but also to all successors of $p$. That is, it affects the entire subtree of which $p$ is the root element with regard to the spatial projection. The visual parts of $p$ and possibly its successors are scaled to the presentation area defined by the projector's parameters.
If spatial projectors are nested, then each spatial projector spatial-p is evaluated in its context. Fig. 13 illustrates the usage and the semantics of nesting spatial projector elements. In the example, the root par element has a spatial projector bound to it that specifies the rectangular presentation area for the subtree as $x = 10$, $y = 10$, $w = 100$, $h = 100$. This area is indicated in the right part of the figure with a dotted rectangle. The two images that are successors of the par element each have their own spatial projector. The spatial projector of $Image_1$ in the subtree defines a presentation area $x = 0$, $y = 0$, $w = 40$, $h = 40$ and the second image a presentation area with $x = 60$, $y = 60$, $w = 40$, $h = 40$. In consequence, both spatial projectors of the images are evaluated in the context of the spatial projector bound to the par element. Therefore, the areas of the two images are projected within the area defined by the spatial projector of the par element.
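A small sketch of this nesting rule: the rectangle of an inner spatial projector is resolved relative to the rectangle established by the enclosing one. Pixel units and a simple offset interpretation are assumed here; scaling for percent units and clipping at the outer border are left out.

```python
def compose_spatial(outer, inner):
    """Resolve a nested spatial projector rectangle (x, y, width, height):
    the inner rectangle is positioned relative to the outer one."""
    ox, oy, _, _ = outer
    ix, iy, iw, ih = inner
    return (ox + ix, oy + iy, iw, ih)

par_area    = (10, 10, 100, 100)                            # projector of the par element
image1_area = compose_spatial(par_area, (0, 0, 40, 40))     # -> (10, 10, 40, 40)
image2_area = compose_spatial(par_area, (60, 60, 40, 40))   # -> (70, 70, 40, 40)
```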
The presentation semantics of the temporal projector element temporal-p (Definition 17) bound to a presentation element $p$ is that the element $p$ is presented with the given playback direction and speed. The parameter $direction$ specifies whether the presentation element (and its subtree) is presented in a forward ($direction = 1$) or in a backward direction ($direction = -1$). The actual playback speed is computed by multiplying the original playback speed with the factor given by the $speed$ parameter.
Definition 17 (Temporal Projector Element—temporal-p). The temporal projector element temporal-p = $(t_{temporal-p}, b_{temporal-p}, V_{temporal-p}, PV_{temporal-p}, direction, speed)$ is a presentation element with $t_{temporal-p} = Temporal\text{-}P \in PT$, $V_{temporal-p} = PV_{temporal-p} = \emptyset$, $direction \in \{-1, 1\}$, and $speed \in \mathbb{R}$.
Like the spatial projector element, a temporal projector element applies not only to the presentation element $p$ it is bound to but to all successors of that presentation element. If, for example, the temporal-p projector of a presentation element $p$ defines $speed = 2$ and a successor $p'$ of $p$ has a temporal projector that also defines $speed = 2$, then, in fact, the successor $p'$ is presented at a speed factor of 4.
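The composition of nested temporal projectors is multiplicative, as in the example above; a one-line sketch:

```python
from functools import reduce

def effective_speed(factors):
    """Speed factors of nested temporal projectors multiply along a path
    from the root to a presentation element."""
    return reduce(lambda a, b: a * b, factors, 1.0)

print(effective_speed([2, 2]))   # 4.0, as in the speed = 2 example above
```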
The acoustic projector element and the typographic projector element are defined in the same way. The acoustic projector element acoustic-p (Definition 18) determines the volume, balance, base, and treble of the presentation of the presentation element $p$ and all successors of $p$. The typographic projector element typographic-p (Definition 19) affects the parameters font, size, style, background, and foreground color of the presentation of the presentation element $p$ it is bound to and all successors of $p$.
Definition 18 (Acoustic Projector Element—acoustic-p). The acoustic projector element acoustic-p = $(t_{acoustic-p}, b_{acoustic-p}, V_{acoustic-p}, PV_{acoustic-p}, volume, balance, base, treble)$ is a presentation element with $t_{acoustic-p} = Acoustic\text{-}P \in PT$, $V_{acoustic-p} = PV_{acoustic-p} = \emptyset$, $volume \in [0, \ldots, 100]$, and $balance, base, treble \in [-1, \ldots, 1]$.

Fig. 13. Sample fragment illustrating the usage and semantics of nesting spatial projector elements.
Definition 19 (Typographic Projector—typographic-p). The typographic projector element typographic-p = $(t_{typographic-p}, b_{typographic-p}, V_{typographic-p}, PV_{typographic-p}, font, size, style, bg, fg)$ is a presentation element with $t_{typographic-p} = Typographic\text{-}P \in PT$, $V_{typographic-p} = PV_{typographic-p} = \emptyset$, $font \in FontNames$, $style \in \{normal, italic, bold\}$, $size$ given in points, and $bg, fg \in COLORS$.
A projector element at first affects the presentation element $p$ it is bound to. If, however, $p$ has successors, then those can be affected, too. Each successor of $p$ is affected if the specific projector can actually have an effect on it. For example, a typographic projector affects only those elements in the subtree of $p$ that bear typographic aspects. In Fig. 7, a spatial and an acoustic projector element are bound to a par temporal operator. The spatial projector applies only to the video, whereas the acoustic projector applies only to the audio that is bound to the par element.
4.5 Interaction Elements
To support the requirement of interactive multimedia
presentations, the model offers different interaction elements
for navigational and design interactions.
The gen_link element (Definition 20) is the basic element for the modeling of navigation in ZYX documents. The generic link is the presentation element that specifies a noninteractive, direct transition to a target element. It serves as the basis for the actual "interactive" elements in the following. The gen_link has the two parameters $target$ and $mode$. The parameter $target$ specifies the target of the transition and the parameter $mode$ specifies how this transition is to be carried out.
Definition 20 (Generic Link Element—gen_link). The interaction element gen_link = $(t_{gen\_link}, b_{gen\_link}, V_{gen\_link}, PV_{gen\_link}, target, mode)$ is a presentation element with $t_{gen\_link} = GenericLink \in OT$, $V_{gen\_link} = PV_{gen\_link} = \emptyset$, $target \in dom(\text{Uniform Resource Identifier})$, and $mode \in \{stop, spawn\}$.
The presentation semantics of the gen_link is that on the presentation of the link element, the link target, which is specified by a Uniform Resource Identifier (URI), is presented. The target need not be a ZYX document but can be an HTML document or an arbitrary application and is presented by the browser/viewer that is associated with the target's URI. The mode of the generic link determines whether the current presentation stops and only the target is presented ($mode = stop$), or if the presentation of the target is presented in parallel with the current presentation ($mode = spawn$). The ZYX sample tree in Fig. 14 shows a video-audio presentation which is followed directly by the presentation of the link target, i.e., the presentation of the target specified with a URI.
As the generic link is intended to model transitions
to arbitrary link targets, we introduce the ZYX_link
(Definition 21) to specify the specific transition to a
ZYX document.
Definition 21 (ZYX Link Element—ZYX_link). The interaction element ZYX_link = $(t_{ZYX\_link}, b_{ZYX\_link}, V_{ZYX\_link}, PV_{ZYX\_link}, target, mode)$ is a presentation element with $t_{ZYX\_link} = ZYXLink \in OT$, $V_{ZYX\_link} = PV_{ZYX\_link} = \emptyset$, $target \in ZYXDOC$, and $mode \in \{stop, spawn\}$.
The semantics of the ZYX_link is that on its presentation, the ZYX document specified by $target$ is presented. The parameter $mode$ describes whether the presentation of the current document stops and the target ZYX document is presented ($mode = stop$), or if it is presented in parallel with the current presentation ($mode = spawn$).
So far, the elements gen_link and ZYX_link are used to model a direct, noninteractive transition to a link target. For a link transition initiated by a user interaction with a visual presentation element, we define the menu interaction element.
The menu interaction element (Definition 22) defines a
set of variables to which the presentation elements of the
visual link anchors are bound and the corresponding
presentation elements that are to be presented when the
respective link anchor is interactively selected.
Definition 22 (Interaction Element—menu). The interaction element menu = $(t_{menu}, b_{menu}, V_{menu}, PV_{menu}, mode)$ is a presentation element with $t_{menu} = Menu \in OT$, $mode \in \{vanish, prevail\}$, $V_{menu} = \{v_1, \ldots, v_n, t_1, \ldots, t_n\}$, and $n \in \mathbb{N}$.

Fig. 14. Sample fragment illustrating the usage and semantics of the gen_link interaction element.
The menu interaction element defines a set of selectable presentation elements (link anchors) bound to $v_i \in V_{menu}$, $i = 1 \ldots n$, representing the menu items. The presentation elements bound to $t_i \in V_{menu}$, $i = 1 \ldots n$, represent the target elements of the selection. Each selectable menu item bound to $v_k$ corresponds to the target $t_k$. The presentation semantics of the menu element is that on presentation of the menu element, all the elements bound to $v_i \in V_{menu}$, $i = 1 \ldots n$, are presented in parallel, i.e., the menu is presented. When a user selects one of the menu items bound to $v_j$, the target element of the selection bound to $t_j$ is presented. The parameter $mode$ determines what happens with the current presentation. If $mode = vanish$, the engine finishes the presentation of all presentation elements bound to $v_i$, $i = 1 \ldots n$, and starts the presentation of the presentation element bound to $t_j$. If $mode = prevail$, the engine "merges" the presentation of the presentation element bound to $t_j$ with the currently running presentation. If no element (menu item) is selected by a user, the presentation of the menu element stops as soon as the presentation of all presentation elements bound to $v_i$, $i = 1 \ldots n$, is finished.
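The following sketch indicates how a presentation engine might resolve a user selection on a menu element; the engine interface with stop() and start() operations is hypothetical and only serves to make the vanish/prevail distinction explicit.

```python
def on_menu_selection(menu_items, targets, j, mode, engine):
    """Handle the selection of menu item j (bound to v_j); its target is the
    element bound to t_j. 'engine' is a hypothetical playout interface."""
    target = targets[j]
    if mode == "vanish":
        for item in menu_items:      # finish all menu items ...
            engine.stop(item)
        engine.start(target)         # ... then present only the target
    elif mode == "prevail":
        engine.start(target)         # merge target with the running presentation
```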
Fig. 15 illustrates the usage of the menu element. $Image_1$ and $Image_2$ represent the two selectable menu items. On interaction with $Image_1$, the presentation of the video-audio presentation bound to $t_1$ starts. On interaction with the link anchor $Image_2$ bound to $v_2$, the presentation of the ZYX_link bound to $t_2$ starts, which results in the presentation of the target ZYX document.
The menu interaction element is provided to allow for
interaction with visual presentation elements and naviga-
tion within a document, i.e., the selection of one out of a set
of possible presentation paths. By using the gen_link and ZYX_link as target elements of the menu element, these paths can leave the document and lead to other documents.
So far, the appearance of a link is limited to the visual
appearance of the presentation element that forms the link
anchor. To offer a more fine-grained specification of link
anchors, e.g., a region in an image or a word within a text,
the ZYX model offers the primitives hotspot and hypertext.
The hotspot element (Definition 23) is a variant of the
menu element but refines the interaction sensitive area to an
arbitrary polygon of a visual element. In addition to the link
anchors in the menu element, it specifies a set of sensitive
areas by polygons. Instead of linking a set of link anchors
with a set of targets in the menu element, the hotspot
element interlinks areas of visual presentation elements with
link targets.
Definition 23 (Interaction Element—hotspot). The interaction element hotspot = $(t_{hotspot}, b_{hotspot}, V_{hotspot}, PV_{hotspot}, P_1, \ldots, P_n, mode)$ is a presentation element with $t_{hotspot} = HotSpot \in OT$, $V_{hotspot} = \{v, t_1, \ldots, t_n\}$, $PV_{hotspot} = \emptyset$, $P_i = (<x_1, y_1>, \ldots, <x_m, y_m>, start, dur)$, $mode \in \{vanish, prevail\}$, and $n \in \mathbb{N}$.
The presentation semantics of the hotspot is the presentation of the link anchor bound to $v$ and, not necessarily visible, the associated interaction-sensitive areas. These areas are each defined by a tuple $P_i$ that specifies the sensitive area by a polygon $<x_1, y_1>, \ldots, <x_m, y_m>$ and the interval $(start, dur)$ for which the sensitive area is active during the presentation. This interval is related to the beginning of the presentation of the hotspot. On user interaction with the sensitive area specified by $P_i$, the corresponding link target $t_i$ is presented under the given mode (vanish or prevail).
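Hit-testing a hotspot thus combines a spatial test against the polygons $P_i$ with a temporal test against their active intervals. The sketch below uses a standard ray-casting point-in-polygon routine; this routine is ordinary computational geometry and not prescribed by the model.

```python
def point_in_polygon(x, y, polygon):
    """Standard ray-casting test; 'polygon' is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def hit_hotspot(areas, click_xy, t):
    """Return the index i of the sensitive area P_i hit at time t (relative to
    the start of the hotspot's presentation), or None. Each area is given as
    (polygon, start, dur)."""
    x, y = click_xy
    for i, (polygon, start, dur) in enumerate(areas):
        if start <= t <= start + dur and point_in_polygon(x, y, polygon):
            return i
    return None
```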
A further variant of the menu element is the hypertext element (Definition 24). As a hotspot allows one to associate an interaction-sensitive region of an image or a video with a link, the hypertext element offers a means to model sensitive parts within text. Like the hotspot, a hypertext interaction element is sensitive for a specified temporal interval $(start, dur)$.
Definition 24 (Interaction Element—hypertext). The interaction element hypertext = $(t_{hypertext}, b_{hypertext}, V_{hypertext}, PV_{hypertext}, T_1, \ldots, T_n, mode)$ is a presentation element with $t_{hypertext} = HyperText \in OT$, $V_{hypertext} = \{v, t_1, \ldots, t_n\}$, $PV_{hypertext} = \emptyset$, $T_i = ((start, length), (start, dur))$, $mode \in \{vanish, prevail\}$, and $n \in \mathbb{N}$.
The presentation semantics of the hypertext is that on its presentation, the presentation of the text anchor bound to $v$ starts. The hypertext element specifies the sensitive regions of the text by means of tuples $T_i = ((start, length), (start, dur))$, each defining a sensitive text segment by its starting text position and its length and the temporal interval for which the sensitive text area is active during the presentation. On user interaction with the sensitive segment of the text defined by $T_i$, the corresponding link target $t_i$ is presented under the given mode (vanish or prevail).

Fig. 15. Sample fragment illustrating the usage and semantics of the menu and ZYX_link interaction elements.
The model provides two further types of interaction elements, interactive projector elements and interactive selector elements. These elements comply in general with the projector and selector elements presented in Definitions 16, 17, 18, and 19, but they have an additional "interactive" aspect, i.e., they can be interactively changed and adjusted by a user. For each of the projector and selector elements, a corresponding interactive element is provided by the model.
An example of an interactive projector element is the interactive temporal projector element temporal-pi (Definition 25), which is an interactive temporal-p projector element. Its presentation semantics is that, in addition to the specified temporal projection during presentation, a user can interactively adjust the element's specific parameters $direction$ and $speed$ within their domains. For each temporal projector, the model offers the corresponding interactive projector element.
Definition 25 (Interaction Element—temporal-pi). The temporal interactive projector element temporal-pi = $(b_{temporal-pi}, V_{temporal-pi}, PV_{temporal-pi}, direction, speed)$ is a presentation element with $V_{temporal-pi} = \emptyset$, $speed \in \mathbb{R}$, and $direction \in \{-1, 1\}$.
An example of an interactive selector is the interaction element spatial-si (Definition 26), which is a special spatial-s selector. Its presentation semantics is that, in addition to the spatial selection, the presentation engine offers the user the possibility to interactively adjust the selected spatial area and the overlapping by changing the parameters $x$, $y$, $width$, $height$, and $priority$ within their domains.
Definition 26 (Interaction Element—spatial-si). The spatial interactive selector element spatial-si = $(b_{spatial-si}, V_{spatial-si}, PV_{spatial-si}, x, y, width, height)$ is a presentation element with $V_{spatial-si} = \emptyset$, $x, y \in \mathbb{N}_0$, and $width, height \in \mathbb{N}$.
Analogously, the temporal-si element is defined. The
interactive selector elements allow us to model the inter-
active spatial and temporal scaling of media elements and
fragments during the presentation.
In addition to the support for navigational interaction by the elements gen_link, ZYX_link, menu, hotspot, and hypertext, the interactive projector and selector elements implement the design interactions of multimedia presentations.
4.6 Adaptation Elements
Our model offers the two elements switch and query which
allow for the adaptation of a multimedia presentation
according to the user's individual context. This user context,
expressing the user's topics of interest, presentation system
environment, network connection characteristics and the
like, is described in a global profile GP by means of attribute-value pairs (Definition 27).
Definition 27 (Global Profile—GP). The Global Profile GP = $\{m_1, \ldots, m_n\}$ is a set of metadata with $m_i = (attr_i, value_i)$ denoting attribute-value pairs that describe the current user context during a presentation, with $attr_i \in ATTRIBUTES$ and $value_i \in dom(attr_i)$, $i \in \mathbb{N}$.
The switch adaptation element (Definition 28) serves the
purpose of specifying different presentation alternatives for
different contexts. Under a switch element, an author can
ªcollectº different alternatives (media elements or frag-
ments) and add metadata to each alternative that specify
under which presentation conditions the alternative is to be
selected. Thereby, an author can define different fragments
for conveying the same content under different presentation
context like system environment, user language, the user's
understanding of the subject, network bandwidth, and the
like. The metadata associated with the switch element is
evaluated by the presentation environment against the
global profile to select the one best matching the current
context.
Definition 28 (Adaptation Element—switch). The adaptation element switch = $(t_{switch}, b_{switch}, V_{switch}, PV_{switch}, M_1, \ldots, M_n)$ is a presentation element with $t_{switch} = Switch \in OT$, $M_i$ denoting sets of attribute-value pairs, $V_{switch} = \{v_1, \ldots, v_n, v_{default}\}$, and $n \in \mathbb{N}$.
The presentation semantics of the switch element is that upon its presentation, the metadata available with the GP is evaluated against the sets of metadata $M_i$, $i = 1 \ldots n$, of the switch. Let $M_j$, $j \in \{1, \ldots, n\}$, be the set of metadata which best matches the GP. Then, the fragment bound to $v_j$, i.e., the presentation alternative best matching the current presentation context, is presented. If there is no suitable set of metadata among $M_1, \ldots, M_n$, the presentation element bound to $v_{default}$ is selected for presentation. The metadata of the switch element is continuously evaluated against the current, possibly changing global profile, i.e., a changing presentation context like varying bandwidth. In this case, during the presentation of the switch element, the presentation environment can select another, more suitable alternative due to a changed context, e.g., switching from a video to a slide show due to decreasing network bandwidth. The presentation of the switch element finally terminates when the presentation of the selected presentation element is finished.
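How "best matching" is determined is up to the presentation environment. The sketch below uses one plausible interpretation, selecting an alternative only if all of its attribute-value pairs are satisfied by the global profile and preferring the most specific match; both choices are assumptions, not part of the model.

```python
def best_alternative(alternatives, gp):
    """Select the switch alternative (Definition 28) that best matches the
    global profile GP; 'alternatives' maps variable names v_i to their
    metadata sets M_i, given as dicts of attribute-value pairs."""
    best_v, best_score = "v_default", 0
    for v, metadata in alternatives.items():
        if all(gp.get(attr) == value for attr, value in metadata.items()):
            score = len(metadata)            # prefer the most specific match
            if score > best_score:
                best_v, best_score = v, score
    return best_v

gp = {"user_group": "professor", "bandwidth": "low"}
alternatives = {
    "v1": {"user_group": "professor", "bandwidth": "high"},
    "v2": {"user_group": "professor", "bandwidth": "low"},
    "v3": {"user_group": "student",   "bandwidth": "low"},
}
print(best_alternative(alternatives, gp))    # v2
```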
For cases in which an author does not want to allow this
kind of continuous adaptation, the model provides the
decide element. The usage of a decide element instead of the
switch element would, e.g., make the presentation stay with
the video, once selected, instead of switching to an
alternative slide show. The definition of the decide element
is given in Definition 29:
Definition 29 (Adaptation Element—decide). The adaptation element decide = $(t_{decide}, b_{decide}, V_{decide}, PV_{decide}, M_1, \ldots, M_n)$ is a presentation element with $t_{decide} = Decide \in OT$, $M_i$ denoting sets of attribute-value pairs, $V_{decide} = \{v_1, \ldots, v_n, v_{default}\}$, and $n \in \mathbb{N}$.
The presentation semantics of the decide element is the
same as that of the switch element. However, the evaluation
of the sets of metadata against the current global profile GP
and the selection of the best match is made only once at the
beginning of the presentation of the decide element.
For cases in which the presentation alternatives of a document are not known at authoring time, the query element (Definition 30) is provided. The query element is just a "placeholder" for a fragment. It specifies a "query" which selects a fragment just before presentation time from all available fragments. The resulting fragment replaces the query element in the ZYX document.

For the definition of the query element, we enhance the definition of a fragment as given in Definition 5 such that a fragment specification also includes metadata, i.e., $f = (P, C, M)$ with $M$ being a set of attribute-value pairs. This metadata describes both the content of a fragment $f$, like the topics covered, and technical features of the fragment, like the network bandwidth needed for its presentation.
Definition 30 (Adaptation Element—query). The adaptation element query = $(t_{query}, b_{query}, V_{query}, PV_{query}, M)$ is a presentation element with $t_{query} = Query \in OT$, $M$ denoting a set of attribute-value pairs, and $V_{query} = \emptyset$.
The semantics of the query element is that before the actual presentation, the metadata of the query element and the global profile, specified by $M \cup GP$, is evaluated against the metadata given with all fragments known to the system. Then, the fragment with the best match with respect to $M$ and the profile GP is selected and the query element is replaced by the selected fragment. The query element allows us to dynamically select the most suitable fragment at presentation time, taking into account the actual user interest and system environment.
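A sketch of this resolution step: the union of the query's metadata $M$ and the global profile GP is matched against the metadata of every fragment in the pool, and the best-scoring fragment replaces the query node. The overlap count used as a score is illustrative, since the query semantics are left to the application.

```python
def resolve_query(query_metadata, gp, fragment_pool):
    """Replace a query element (Definition 30) by the best-matching fragment.
    'fragment_pool' maps fragment ids to their metadata sets M."""
    wanted = {**query_metadata, **gp}        # M united with the global profile
    def score(meta):
        return sum(1 for attr, value in wanted.items() if meta.get(attr) == value)
    return max(fragment_pool, key=lambda fid: score(fragment_pool[fid]))
```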
5 APPLICATION OF ZYX AND IMPLICATIONS TO AUTHORING AND PRESENTATION
We have made clear how important we consider the support for reuse and adaptation by a multimedia document model, requirements we were aiming to meet with the ZYX model. This section illustrates the application of these two specific features, which we elaborated on in detail in Section 2.1, to the authoring and presentation of ZYX multimedia documents. We start out by presenting the many different kinds of reuse of ZYX elements and fragments in Section 5.1 before we come to the various possibilities to employ ZYX for adaptation in Section 5.2. Looking at the impact of new document models like ZYX on multimedia content production, we then point out the implications and positive effects this has on multimedia authoring.
5.1 Reuse
Applying ZYX for reuse means that, first, we show how identification and selection are supported by ZYX, as this forms the basis for efficient reuse of media elements, fragments, and documents. Then, we show the reuse of ZYX elements at different granularities and present structural vs. identical reuse in ZYX.
5.1.1 Identification and Selection
Support for identification and selection is obligatory for
content to be efficiently reused. Only if the content can be
easily retrieved within the authoring process can reuse of
material be possible. Hence, sophisticated metadata must be
associated with media elements, fragments, and documents.
The metadata for the media elements comes with the
modeling of the different media types. At the level of
fragments, a set of metadata describes the content of the
composition. This metadata is anchored in the definition of a ZYX fragment $f = (P, C, M)$ and relates especially to the content and targeted user group.
The available metadata concerning both the content and
the structure of fragments can be employed for the
browsing of fragments in an authoring environment and
to identify and select fragments for composition of
ZYX documents.
5.1.2 Different Granularity of Reuse
Equipped with the modeling of metadata of the media
elements and ZYX fragments, we illustrate how reuse of
media elements, fragments, and documents can be exten-
sively applied with ZYX.
Reuse of media elements. Atomic media elements represent the raw media data within ZYX documents. These elements can be reused entirely or only in part. Atomic media elements form the leaves of the document structure. One media element can be used in different branches of the tree. As the atomic media elements only represent the actual media data, an atomic media element may be used several times in the document while the underlying data exists only once. To select only a part of a media element, the selector elements are used. They select the desired scene, visual area, or sound sequence of a medium. In Fig. 16, two different scenes of the same video, showing the opening of a patient's chest before the actual operation on the open heart, as well as two different parts of the same text explaining the operative steps are composed in a ZYX document. The reuse of media elements, especially partial reuse, can avoid redundant preparation of media data for just one single application.

Fig. 16. Reuse of media elements in ZYX.
Reuse of fragments and complex media elements. The
composition of presentation elements leads to fragments
of arbitrary size and complexity. Fragments can be reused
as fragments themselves but also encapsulated within
complex media elements. Both the fragments and the
complex media elements can be bound to any other variable
during the composition of a (new) document. Exploiting
identification and selection as discussed in Section 5.1.1, an
authoring environment for ZYX here can offer the author
fragments or complex media elements relevant in the
desired context to be part of the newly composed docu-
ment. The only difference between reusing complex media
elements and fragments is that with the complex media
elements, the structure and complexity of the selected
subpart of the document are intendedly hidden from the
author. Rather, the semantics of the complex media element
is important, e.g., it comprises a slide show, and of how to
fill the unbound variables with presentation elements and
fragments. As the structure of all ZYX documents is
accessible and explicitly visible, authoring support could
go so far that a sophisticated content-based search algo-
rithm identifies those nodes (presentation elements) in
other documents that could be of interest to an author and
extracts the respective subtree (=fragment) for reuse.
The reuse of fragments and complex media elements of
arbitrary size is a feature that relieves an author from
cutting & pasting formerly composed documents but opens
the way to composition of multimedia documents much
like using a Lego or K'NEX unit construction set.
Fig. 17 illustrates the reuse of fragments and complex
media elements. In the example, the fragment already
introduced in Fig. 16 is reused in a course about operative
surgery. Additionally, an already existing complex media
element about a bypass operation is inserted as a digression
of the course into the specific domain of open heart surgery.
The fragment and the complex media element are, e.g., arranged in a sequential order and this sequence is then, as indicated by the dashed line, part of the entire course.
Reusable templates. With ZYX, an author can define
templates that cover, e.g., a didactic unit like a multimedia
course, a lecture, a technical guide, a tour through a
museum, and the like. Such a template is a regular
ZYX fragment but with unbound variables, i.e., the author leaves some of the leaves of the tree unbound. These templates give other authors a basic structure to start with for the composition of a new ZYX document. Consider the sample fragment in Fig. 18: It forms a sequence of five presentation elements, two of which are bound to a parallel operator. This fragment is encapsulated into a complex media element denoted aTemplate, which another author then uses to "plug in" the missing presentation elements and thereby form a new document. In Fig. 18, two complex media elements, a title and a summary, and two videos with captions are bound to the template aTemplate, e.g., in a semiautomatic authoring process. For this, the author only needs information about the usage of the complex media element, not necessarily about the explicit structure of the template.
Reuse of documents. As entire documents in ZYX are
nothing else but a (logically complete) fragment, documents
can be reused in any other ZYX document. Or, reuse can just mean that an author arbitrarily alters and thereby adjusts an existing ZYX document to his/her specific needs.
5.1.3 Identical versus Structural Reuse
Following one of ZYX's design ideas, the separation of structure from layout, a multimedia document can be reused with different layouts, e.g., a different look and feel. For example, if the layout designer of our Cardio-OP project changes the concept for the overall presentation of medical content in the project, hopefully only the layout of the documents must be changed without touching the documents' structure at all.
Another application is the change of the technical presenta-
tion medium. Consider a presentation with a screen layout.
What happens if the same presentation is to be presented at
a point of information with a touch screen? By exchanging
the layout, the same fragments can be used in different
presentation contexts. As each presentation element distin-
guishes between its variables and projector variables, the structural part can easily be separated from the layout part. An author, hence, can select to use only the structure and assign a new layout to the document or fragment.

Fig. 17. Reuse of fragments and complex media elements in ZYX.

Fig. 18. Templates—structural reuse of ZYX fragments.
With structural reuse of ZYX documents and fragments,
the adaptation of a document's appearance to the presenta-
tion context is possible—here, the relationship between
reuse and adaptation becomes obvious. Fig. 19 gives a
simple example of reusing the same fragment with two
different layouts. The presentation of the same fragment
then changes depending on the layout bound to it.
Structural reuse is also an application of the adaptation of
the layout of ZYX documents to a specific user context.
5.2 Adaptation
In the following, we describe the different adaptation
possibilities we have when exploiting the modeling
primitives of the ZYX model. The adaptation elements switch and query as well as ZYX templates play the key role in supporting adaptation.
5.2.1 Explicit Modeling of Presentation Alternatives
With the modeling of presentation alternatives, the author
of a ZYX document can explicitly model adaptivity to the
user context. For example, in the Cardio-OP context, a
switch can distinguish the alternatives for undergraduate
students, graduate students, and researchers. The switch
element allows us to define arbitrary discriminating values.
An alternative can also be "labeled" by a combination of discriminating values. This means that adaptation has as many dimensions as the author desires.

However, this means that the document, to be adaptable to many different presentation contexts, needs to model all the different presentation alternatives for the respective contexts under the document's switch elements. To relieve an author from such a time-consuming and somehow never-ending story, we propose providing mechanisms to (semi)automatically augment the document with the necessary alternatives, possibly guided by a user. The idea is that the author concentrates on the initial goal, to compose a multimedia document with a certain content, and then enriches the document, exploiting the switch primitive, with additional fragments for conveying the same information but in different presentation contexts. In the following, we only illustrate how this can be achieved; for further details, we refer the reader to [18].
Automatic generation of presentation alternatives—Augmentation. For a fine-grained adaptation to many different user contexts, it is mandatory that a high number of alternatives be available. However, if an author had to specify all possible alternatives, this would result in a very time-consuming composition effort and distract the author from the initial goal, namely, the composition of a sound presentation. To relieve the authors from this additional burden, we propose supporting the automation of the specification of the alternatives. We call this step augmentation of the multimedia document; it takes place after the document has been composed by the author. The augmentation process queries the underlying pool of fragments, exploiting the inherent technical data and the metadata the media elements have been annotated with, to retrieve potential presentation alternatives. The alternatives are then inserted into the document, i.e., the document is augmented by the alternatives to provide for adaptivity in different presentation contexts. However, the suggested alternatives cannot simply be inserted into the document but, to preserve the semantics of the presentation intended by the author, have to undergo a verification to assure that the augmented document is still valid with regard to the representation semantics.
Fig. 20 shows a small document which has been augmented by additional fragments. First, before the augmentation, the document contained the video Video1, indicated in bold face. Then, targeting the document at both a medical professor and a medical student and, at the same time, taking into account three different levels of available bandwidth for the presentation, the augmentation results in a switch element offering such different alternatives. From the technical side, the augmentation has introduced atomic media elements and fragments for medium and low bandwidth. Please note that this does not necessarily mean that there are only different qualities of the same medium. For example, for the professor, the alternative for Video1 at low bandwidth is a complex media element, a slide show SlideShow. Additionally, the document can also be used in the context of a medical student. Therefore, for each available bandwidth, a media element has been added that is targeted at the knowledge and background of a medical student but covers the same topic. The parameters of the switch element in Fig. 20 only indicate the discriminating attributes, as the actual parameter list is too long for this illustrative example.

Fig. 19. Reuse of structure with different layouts in ZYX.

Fig. 20. Augmentation of a ZYX fragment.
In a first step, we elaborated an augmentation scheme to (semi)automatically augment documents with respect to different system contexts, which mainly differ in the targeted bandwidth and system power, by providing presentation alternatives on the level of atomic media elements. We have formalized the verification of this kind of automatic augmentation of ZYX documents with presentation alternatives in [18]. A much more complicated effort is to automatically augment ZYX documents with semantically equivalent fragments that cover larger parts of a presentation. For example, can a subsection of a multimedia presentation intended for a medical doctor be automatically augmented such that an equivalent content is conveyed to a student who presumably has much less background in the field? Here, the annotation of multimedia content possibly must be carried out very carefully by the experts in the field to give an automatic augmentation model sufficient input to select and insert semantically equivalent presentation alternatives. Additionally, the process of augmentation will rather be semiautomatic, possibly guided by an author who is an expert in the field.
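On the level of atomic media elements, the augmentation step can be pictured as follows: for an element of the composed document, alternatives for further user and system contexts are retrieved from the fragment pool and collected under a switch element, with the original element as the default. The verify callback merely stands in for the validity check formalized in [18]; the whole sketch is an illustration, not the augmentation scheme itself.

```python
def augment_with_switch(element, element_metadata, pool, contexts, verify):
    """Collect presentation alternatives for 'element' under a switch.

    'pool' maps fragment ids to metadata; 'contexts' lists the targeted
    presentation contexts, e.g., {"user_group": "student", "bandwidth": "low"}.
    Returns the alternatives (context, fragment id) plus the default element.
    """
    alternatives = []
    for context in contexts:
        wanted = {**element_metadata, **context}
        for fid, meta in pool.items():
            if all(meta.get(attr) == value for attr, value in wanted.items()) \
                    and verify(fid, context):
                alternatives.append((context, fid))
                break                         # one alternative per context
    return alternatives, element              # element serves as v_default
```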
5.2.2 Declarative Modeling of Presentation Alternatives
There are two kinds of applications of query elements for
adaptation: The query elements can be used for the
dynamic binding of fragments just before presentation
and can also be used to support the authoring process.
The query element bears the metadata that is to be
evaluated for the selection of the best matching fragment.
The formal definition of the query element specifies a set of
metadata to be met by the fragment to replace the query
node. The query semantics, however, are not specified by
the model but left to the application.
Query elements can be used to automatically adjust
documents to the current context, i.e., the query elements
are used to select the element that best matches the query at
the latest point in time just before presentation. One of the
advantages of leaving parts of the document somehow a "black box" until the actual request for presentation is that the most up-to-date pool of fragments is always considered in the query evaluation. The
evaluation of a query element specified in a document can
be executed at authoring time to test the later result of the
presentation.
In combination with templates, the query element can be
applied for authoring support. Instead of leaving the
variables of a template unbound, one could bind these to
suitable query elements. The evaluation of the query
element at authoring time can then propose fragments to
be placed at that respective node. By this, a kind of content-
oriented browsing can be inserted in the documents, allowing, e.g., novice users to have an easy start with the model.
5.3 Implications to Authoring and Presentation
The approach we have taken for the modeling of multi-
media content significantly impacts the authoring and
presentation of the multimedia material. Traditional author-
ing systems usually aim at the creation of a preorchestrated
presentation addressing a dedicated user group. These
presentations usually do not allow us to exploit the logical
structure or layout definitions for adaptation of the
presentation during playout. Given our approach, the
authoring process has to focus much more on the structural
composition of multimedia material, separating the logical
structure of a multimedia presentation from its layout
specifications. The resulting composition is no longer a
fixed preorchestrated presentation. It allows for explicit
exploitation of the structural composition in order to adapt
the presentation to individual user needs. In consequence,
the authoring system needs to have access to the individual
media elements, fragments, and documents that should be
considered for composition. Hence, the authoring tool has
to offer browsing, navigation, and selection mechanisms to
the authors in order to identify those media elements in the
multimedia repository that should become part of the
presentation. Obviously, the annotation of media elements,
parts of media elements, fragments, and documents gives
the necessary support for the content-oriented browsing
such that an author can easily identify and select the
relevant parts. The authoring tool can either provide for the
construction of a ZYX document tree from scratch, or allow
for the completion of predefined ZYX templates.
The playout of a ZYX document can be realized in
different ways. As a first alternative, the ZYX document can
be transformed into a presentation format that can be
directly interpreted by existing players. This alternative
seems to be very interesting for the SMIL format, as first
SMIL players are already available. Obviously, the trans-
formation into another document format may result in the
loss of specific features or presentation information if the
target model does not provide the same level of semantic
expressiveness as available by the ZYX model. As a second
alternative, ZYX documents could be played out by a
ZYX-specific presentation engine that is capable of fully
exploiting all the features of the ZYX model with respect to
adaptation of a presentation. This allows for the integration
of new business models into the presentation environment.
For example, the end user can be billed for the actual quality
of the multimedia material s/he received. In the Cardio-OP
project, we developed a specific ZYX presentation engine.
In summary, the kind of structured authoring that results in adaptive multimedia documents and the presentation features of a ZYX-based presentation tool, both aiming at reuse and adaptation of multimedia material, allow for cost-effective multimedia authoring and customized presentations.
6 CONCLUSION AND FUTURE WORK
Starting out with the requirements of the Cardio-OP project, which calls for the support of reusability, adaptation, and presentation-neutral description of the structure and content of
multimedia documents, we sketched our analysis of
existing relevant multimedia document models. As these
models do not meet the project's requirements, we
introduced our new ZYX model that gives the necessary
support. We outlined the design considerations of the
ZYX model and the basic concepts followed by a formal
framework of the ZYX primitives. Finally, we illustrated the
applicability of ZYX for reuse and adaptation and the
challenges and implications of these advanced concepts
when using them for authoring and presentation environ-
ments for multimedia documents.
The ZYX model has been implemented as a DataBlade
module for the object-relational database system Informix
Dynamic Server/Universal Data Option under SUN
Solaris [19], following the architectural framework initi-
ally presented in [20], [21]. The formal description served
as the basis for the definition of an XML DTD for the
ZYX model. This will enable access to content stored in
the Cardio-OP repository by future XML-capable brow-
sers and we can also think about storing ZYX documents
in an SGML/XML-capable database system in the future,
following the approach taken in [22]. Furthermore, we
have developed a generic presentation engine for ZYX
documents which includes support for continuous MPEG
video streams based on an MPEG-specific extension of
the L/MRP buffer management technique [23].
For content-based managing and querying the under-
lying media data, we have been developing a Media
Integration DataBlade module [24] for the IDS/UD which
forms an integration layer offering uniform, homogeneous
access to the different types of media data. Supporting
multimedia authoring, this DataBlade allows for inter-
active content-based browsing in the multimedia material.
With the MediaWorkBench, we have been developing a
tool in Java on top of the Media Integration DataBlade
module for GUI-supported annotating and browsing the
media data.
With regard to the global profile describing the user
context, we have been developing a mathematical model for
the combination of different profiles describing different
aspects of a user like user group, user system environment
and the like into one semantically correct, conflict-free
global profile that can be exploited for presentation of
adaptive ZYX documents.
For adaptation support, we have developed a cross-
media adaptation scheme [18] that can be integrated with
the ZYX model and provides for the automatic augmenta-
tion of ZYX documents by semantically correct presentation
alternatives—a process which relieves the authors from the time-consuming task of comprehensively composing documents for different user and system contexts.
Given this ongoing work, one further goal is to develop
generic composition schemes and, exploiting the metadata
provided with the fragments and the global profile
describing the user context, to support (semi)automatic
composition of documents that are adapted and persona-
lized to the specific user context.
ACKNOWLEDGMENTS
The authors would like to thank Utz Westermann for his
contributions to the design and implementation of the
ZYX model. The authors would like to thank Jochen Wandel
for his contributions to the formal framework to support
automatic augmentation of multimedia document models.
The authors would also like to thank Christian Heinlein for
his valuable comments on the paper.
REFERENCES
[1] W. Klas, C. Greiner, and R. Friedl, "Cardio-OP—Gallery of Cardiac Surgery," Proc. IEEE Int'l Conf. Multimedia Computing and Systems (ICMCS '99), 1999.
[2] D. Raggett, A. Le Hors, and I. Jacobs, HTML 4.0 Specification—W3C Recommendation, revised on 24-April-1998, W3C, URL: http://www.w3.org/TR/1998/REC-html40-19980424, Apr. 1998.
[3] ISO/IEC JTC1/SC29, Information Technology—Coding of Multimedia and Hypermedia Information—Part 1: MHEG Object Representation, ISO/IEC 13522-1, ISO/IEC IS, 1997.
[4] ISO/IEC JTC1/SC29/WG12, Information Technology—Coding of Multimedia and Hypermedia Information—Part 6: Support for Enhanced Interactive Applications, ISO/IEC IS 13522-6, ISO/IEC, 1996.
[5] ISO/IEC JTC1/SC29/WG12, Information Technology—Coding of Multimedia and Hypermedia Information—Part 5: Support for Base-Level Interactive Applications, ISO/IEC IS 13522-5, ISO/IEC, 1995.
[6] ISO/IEC, Information Technology—Hypermedia/Time-Based Structuring Language (HyTime), ISO/IEC IS 10744, 1992.
[7] S.R. Newcomb, N.A. Kipp, and V.T. Newcomb, "HyTime—The Hypermedia/Time-Based Document Structuring Language," Comm. ACM, vol. 34, no. 11, Nov. 1991.
[8] P. Hoschka, S. Bugaj, D. Bulterman et al., Synchronized Multimedia Integration Language—W3C Working Draft 2-February-98, W3C, URL: http://www.w3.org/TR/1998/WD-smil-0202, Feb. 1998.
[9] S. Boll, W. Klas, and U. Westermann, "Multimedia Document Formats—Sealed Fate or Setting Out for New Shores?" Proc. IEEE Int'l Conf. Multimedia Computing and Systems (ICMCS '99), 1999.
[10] S. Boll, W. Klas, and U. Westermann, "A Comparison of Multimedia Document Models Concerning Advanced Requirements," Technical Report—Ulmer Informatik-Berichte Nr. 99-01, Univ. Ulm, Germany, http://www.informatik.uni-ulm.de/dbis/Cardio-OP/publications/TR99-01.ps.gz, Feb. 1999.
[11] S. Boll, W. Klas, and U. Westermann, "Multimedia Document Formats—Sealed Fate or Setting Out for New Shores?" Multimedia—Tools and Applications, vol. 11, no. 2, pp. 267-279, Aug. 2000.
[12] T.D.C. Little and A. Ghafoor, "Interval-Based Conceptual Models for Time-Dependent Multimedia Data," IEEE Trans. Knowledge and Data Eng., vol. 5, no. 4, Aug. 1993.
[13] T. Wahl and K. Rothermel, "Representing Time in Multimedia Systems," Proc. IEEE Int'l Conf. Multimedia Computing and Systems, pp. 538-543, 1994.
[14] A. Duda and C. Keramane, "Structured Temporal Composition of Multimedia Data," Proc. IEEE Int'l Workshop Multimedia Database Management Systems, 1995.
[15] N. Hirzalla, B. Falchuk, and A. Karmouch, "A Temporal Model for Interactive Multimedia Scenarios," IEEE Multimedia, vol. 2, no. 3, pp. 24-31, Fall 1995.
[16] D. Papadias, Y. Theodoridis, T. Sellis, and M.J. Egenhofer, "Topological Relations in the World of Minimum Bounding Rectangles: A Study with R-Trees," Proc. ACM SIGMOD Conf. Management of Data, 1995.
[17] M.J. Egenhofer and R. Franzosa, "Point-Set Topological Spatial Relations," Int'l J. Geographic Information Systems, vol. 5, no. 2, Mar. 1991.
[18] S. Boll, W. Klas, and J. Wandel, "A Cross-Media Adaptation Strategy for Multimedia Presentations," Proc. ACM Multimedia '99, 1999.
[19] S. Boll, W. Klas, and U. Westermann, "Exploiting OR-DBMS Technology to Implement the ZYX Data Model for Multimedia Documents and Presentations," Proc. Datenbanksysteme in Büro, Technik und Wissenschaft (BTW '99), GI-Fachtagung, 1999.
[20] W. Klas and K. Aberer, "Multimedia and Its Impact on Database System Architectures," Multimedia Databases in Perspective, P.M.G. Apers, H.M. Blanken, and M.A.W. Houtsma, eds., 1997.
[21] S. Boll, W. Klas, and M. Löhr, "Integrated Database Services for Multimedia Presentations," Multimedia Information Storage and Management, S.M. Chung, ed., 1996.
[22] K. Böhm, K. Aberer, and W. Klas, "Building a Hybrid Database Application for Structured Documents," Multimedia—Tools and Applications, vol. 8, no. 1, 1999.
[23] F. Moser, A. Kraiß, and W. Klas, "L/MRP: A Buffer Management Strategy for Interactive Continuous Data Flows in a Multimedia DBMS," Proc. Very Large Data Bases, 1995.
[24] U. Westermann and W. Klas, "Architecture of a DataBlade Module for the Integrated Management of Multimedia Assets," Proc. First Int'l Workshop Multimedia Intelligent Storage and Retrieval Management (MISRM), 1999.
Susanne Boll received the diploma degree in
computer science at the Technical University of
Darmstadt, Germany, in 1995. She currently
pursues her PhD studies working as a research
assistant for Professor Klas. She is a member of
the Institute for Computer Science and Business
Informatics at the University of Vienna, Austria.
Until 2000, she was a member of the Database
and Systems (DBIS) group at the University of
Ulm, Germany. Her research interests lie in the
areas of database-driven Internet-based multimedia information sys-
tems and e-commerce systems. Currently, she works on flexible,
adaptive multimedia document models and support for context-specific
multimedia presentation generation.
Wolfgang Klas is a professor at the Institute for
Computer Science and Business Informatics at
the University of Vienna, Austria. Until 2000, he
was a professor in the Computer Science
Department at the University of Ulm, Germany.
Until 1996, he was head of the Distributed
Multimedia Systems Research Division
(DIMSYS) at GMD-IPSI, Darmstadt, Germany,
and directed many research projects and in-
dustrial collaborations in the fields of object-
oriented database technology, multimedia information systems, inter-
operable database systems, and cooperative systems. In 1991/1992,
Dr. Klas was a visiting fellow at the International Computer Science
Institute (ICSI) at the University of California at Berkeley. His research
interests are currently in multimedia information systems and Internet-
based applications. He currently serves on the editorial board of the
Very Large Data Bases Journal and has been a member and chair of
program committees of many conferences.
Article
The mulsemedia (Multiple Sensorial Media (MulSeMedia)) concept has been explored to provide users with new sensations using other senses beyond sight and hearing. The demand for producing such applications has motivated various studies in the mulsemedia authoring phase. To encourage researchers to explore new solutions for enhancing the mulsemedia authoring, this survey article reviews several mulsemedia authoring tools and proposals for representing sensory effects and their characteristics. The article also outlines a set of desirable features for mulsemedia authoring tools. Additionally, a multimedia background is discussed to support the proposed study in the mulsemedia field. Open challenges and future directions regarding the mulsemedia authoring phase are also discussed.
Conference Paper
We present an analysis of a large corpus of multimedia documents obtained from the web. From this corpus of documents, we have extracted the media assets and the relation information between the assets. In order to conduct our analysis, the assets and relations are represented using a formal ontology. The ontology not only allows for representing the structure of multimedia documents but also to connect with arbitrary background knowledge on the web. The ontology as well as the analysis serve as basis for implementing a novel search engine for multimedia documents on the web.
Article
Methods for authoring Web-based multimedia presentations have advanced considerably with the improvements provided by HTML5. However, authors of these multimedia presentations still lack expressive, declarative language constructs to encode synchronized multimedia scenarios. The SMIL Timesheets language is a serious contender to tackle this problem as it provides alternatives to associate a declarative timing specification to an HTML document. However, in its current form, the SMIL Timesheets language does not meet important requirements observed in Web-based multimedia applications. In order to tackle this problem, this paper presents the ActiveTimesheets engine, which extends the SMIL Timesheets language by providing dynamic clientside modifications, temporal linking and reuse of temporal constructs in fine granularity. All these contributions are demonstrated in the context of a Web-based annotation and extension tool for multimedia documents.
Article
Full-text available
There are several education and training cases where multi-camera view is a traditional way to work: performing arts and news, medical surgical actions, sport actions, instruments playing, speech training, etc. In most cases, users need to interact with multi camera and multi audiovisual to create among audiovisual segments their own relations and annotations with the purpose of: comparing actions, gesture and posture; explaining actions; providing alternatives, etc. Most of the present solutions are based on custom players and/or specific applications which force to create custom streams from server side, thus leading to restrictions on the user activity as to establishing dynamically additional relations. Web based solutions would be more appreciated and are complex to be realized for the problems related to the video desynchronization. In this paper, MyStoryPlayer/ECLAP solution is presented. The major contributions to the state of the art are related to: (i) the semantic model to formalize the relationships and play among audiovisual determining synchronizations, (ii) the model and modality to save and share user experiences in navigating among lessons including several related and connected audiovisual, (iii) the design and development of algorithm to shorten the production of relationships among media, (iv) the design and development of the whole system including its user interaction model, and (v) the solution and algorithm to keep the desynchronizations limited among media in the event of low network bandwidth. The proposed solution has been developed for and it is in use within ECLAP (European Collected Library of Performing Arts) for accessing and commenting performing arts training content. The paper also reports validation results about performance assessment and tuning, and about the usage of tools on ECLAP services. In ECLAP, users may navigate in the audiovisual relationships, thus creating and sharing experience paths. The resulting solution includes a uniform semantic model, a corresponding semantic database for the knowledge, a distribution server for semantic knowledge and media, and the MyStoryPlayer Client for web applications.
Article
Structured document plays a vital role in information carrier for realizing information exchange and dissemination in digital community. However, there is no prior work on discussing structured document model which appropriates for describing special characteristics of the structured document in digital community. In this paper, we present an appropriate structured document model which is elaborately described by formal method and based on the analysis of the special live characteristics of the structured document in digital community. And then, we design some suitable constraints in order to construct the well-formed structured document model.
Article
Full-text available
Practical needs in geographic information systems (GIS) have led to the investigation of formal and sound methods of describing spatial relations. After an introduction to the basic ideas and notions of topology, a novel theory of topological spatial relations between sets is developed in which the relations are defined in terms of the intersections of the boundaries and interiors of two sets. By considering empty and non-empty as the values of the intersections, a total of sixteen topological spatial relations is described, each of which can be realized in R 2. This set is reduced to nine relations if the sets are restricted to spatial regions, a fairly broad class of subsets of a connected topological space with an application to GIS. It is shown that these relations correspond to some of the standard set theoretical and topological spatial relations between sets such as equality, disjointness and containment in the interior.
Article
Full-text available
Existing multimedia document models like HTML, MHEG, SMIL, and HyTime lack appropriate modeling primitives to fit the needs of next generation multimedia applications which bring up requirements like reusability of multimedia content in different presentations and contexts, and adaptation to user preferences. In this paper, we motivate and present new requirements stemming from advanced multimedia applications and the resulting consequences for multimedia document models. Along these requirements, we discuss the document model standards HTML, HyTime, MHEG, SMIL, and ZYX, a new model that has been developed with special focus on reusability and adaptation. The analysis and comparison of the models show the limitations of existing models, point the way to the need for new flexible multimedia document models, and throw light on the many implications on authoring systems, multimedia content management, and presentation.
Conference Paper
Full-text available
Adaptation techniques for multimedia presentations are mainly concerned with switching between different qualities of single media elements to reduce the data volume and by this to adapt to limited presentation resources. This kind of adaptation, however, is limited to an inherent lower bound, i.e., the lowest acceptable technical quality of the respective media type. To overcome this limitation, we propose cross-media adaptation in which the presentation alternatives can be media elements of different media type, even different fragments. Thereby, the alternatives can extremely vary in media type and data volume and this enormously widens the possibilities to efficiently adapt to the current presentation resources. However, the adapted presentation must still convey the same content as the original one, hence, the substitution of media elements and fragments must preserve the presentation semantics. Therefore, our cross-media adaptation strategy provides models for the automatic augmentation of multimedia documents by semantically equivalent presentation alternatives. Additionally, during presentation, substitution models enforce a semantically correct information flow in case of dynamic adaptation to varying presentation resources. The cross-media adaptation strategy allows for flexible reuse of multimedia content in many different environments and, at the same time, maintains a semantically correct information flow of the presentation.
Working Paper
As multimedia systems deal with a variety of temporally interrelated media items, synchronization is an important issue in those systems. One part of synchronization is the representation of temporal information. In contrast to traditional computing tasks, multimedia imposes new requirements on the representation of time. Specifically, a fine-grained and a flexible temporal model is required. Therefore, a number of temporal models have been suggested by various authors. This paper evaluates and classifies a selection of the most common existing models applying fundamental statements of the time theory and temporal logic. Learning from the deficits of the existing models, a new temporal model based on interval operators is proposed for multimedia systems.
Article
Recent developments in spatial relations have led to their use in numerous applications involving spatial databases. This paper is concerned with the retrieval of topological relations in Minimum Bounding Rectangle-based data structures. We study the topological information that Minimum Bounding Rectangles convey about the actual objects they enclose, using the concept of projections. Then we apply the results to R-trees and their variations, R+-trees and R*-trees in order to minimise disk accesses for queries involving topological relations. We also investigate queries that involve complex spatial conditions in the form of disjunctions and conjunctions and we discuss possible extensions.
Article
This document specifies version 1 of the Synchronized Multimedia Integration Language (SMIL 1.0, pronounced "smile"). SMIL allows integrating a set of independent multimedia objects into a synchronized multimedia presentation. Using SMIL, an author can 1. describe the temporal behavior of the presentation 2. describe the layout of the presentation on a screen 3. associate hyperlinks with media objects This specification is structured as follows: Section 1 presents the specification approach. Section 2 defines the "smil" element. Section 3 defines the elements that can be contained in the head part of a SMIL document. Section 4 defines the elements that can be contained in the body part of a SMIL document. In particular, this Section defines the time model used in SMIL. Section 5 describes the SMIL DTD.
Conference Paper
Next generations of online multimedia training and education applications call for new approaches for the creation, storage, maintenance, commercial marketing, and publishing of multimedia content. The project “Gallery of Cardiac Surgery” (Cardio-OP) aims at the development of an Internet based database-driven multimedia information system for physicians, medical lecturers, students, and patients in the domain of cardiac surgery. The research project has a volume of about 3.3 Million Euro and constitutes a total effort of about 41 person years. Scientific contributions of Cardio-OP include a new approach towards the organization and online distribution of multimedia content, its creation and authoring, and its maintenance in a multimedia repository. The resulting information system is intended to be applicable to other application domains, such as continuous education and training programs for employees in production processes. The paper presents details on the project background and motivation, overall goals and objectives, and outlines some of the approaches taken and results achieved