Multimedia documents differ from text-based documents in that they have no predefined dominant media type; instead, they are composed of multiple media items of different media types, such as image, text, audio and video. The author of a multimedia document uses media items that are either specifically created or (re)used from existing resources to represent the message she intends to convey. Furthermore, a multimedia document has, besides two spatial dimensions, a temporal dimension. Consequently, the author of a multimedia document should, in addition to designing the spatial layout, synchronize media items in a meaningful way.
Authoring a multimedia document differs from authoring a text-based electronic document in several ways. First, modern text processors allow an author to abstract from typesetting details, such as hyphenation, kerning and leading. The text processor automatically formats the text so that it fits within the designated area, such as a page or screen. In contrast, the author of a multimedia document carefully designs the document so that it exactly fits the screen size it is intended for. One presentation may have been created for a screen with a width of 1024 pixels and a height of 768 pixels. A second presentation that conveys an identical message on a screen with a width of 640 pixels and a height of 800 pixels will typically require manual authoring.
Second, modern text processors often have the ability to include predefined styles (e.g. corporate identity), which allows an author to abstract from the styling of the document. Consequently, an author does not require design expertise to ensure a consistently formatted and aesthetically pleasing document. In contrast, modern authoring tools for multimedia documents require an author to make both authoring and design decisions.
Authoring and design are intertwined in the production of multimedia documents because the spatial layout of, and the temporal synchronization between, media items are semantically significant. Unlike text, where a sentence or word may be split to continue on the next line or page, breaking the spatio-temporal relations between media items in a multimedia document typically alters the message conveyed by the document. When the presentation does not fit the screen, the author carefully redesigns the presentation in order to maintain these relationships.
Although a multimedia document can be adapted to a particular context, and multiple multimedia documents can be consistently styled, this typically requires significant human investment. The costs involved in authoring and designing multimedia documents are therefore relatively high compared to those of textual documents. As a result, the production of multimedia documents is only viable in specific cases, which is unfortunate because multimedia documents are typically effective at conveying a particular message.
To address this discrepancy, we derived requirements for an extended document engineering model. These include requirements derived from the traditional document engineering model. However, the traditional model assumes generally applicable overflow strategies, which do not exist for multimedia documents. Therefore, the formatting of multimedia documents may, in contrast to that of text-based documents, fail. An extended document engineering model should thus detect constraint violations and propose alternative formatting when necessary. We defined such an extended model, expressing explicit knowledge about the properties of the delivery context and the form constructs that are relevant for detecting constraint violations. As a result, the knowledge requirements of an extended document engineering model are significantly greater than those of traditional document engineering. To reduce the associated costs, an extended document engineering model should support reuse and preserve existing knowledge where possible.
To evaluate the model, we have implemented a multimedia document engineering formatter using a Constraint Logic Programming approach. The formatter is embedded in a client-server framework so that it can be used in a web environment. Based on this framework, we demonstrate in three distinct use cases that the document engineering paradigm may be successfully applied to a number of multimedia documents that are representative of each use case. Successful in this context means, first, that a single set of style rules may be used to transform multiple structured documents and, second, that the intended output is automatically adapted to the delivery context without changing the function that is conveyed.
Compared to the traditional model, our model extends the notions of function, form and style to meet the specific requirements of multimedia documents. We include an explicit and parametrized delivery context that represents the constraints of the environment in which the document is played, and the specification of alternative style rules that are automatically invoked by the formatter if the resulting document form does not comply with the hard constraints imposed by the delivery context.
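The interplay between delivery context, hard constraints and alternative style rules can be illustrated with a minimal sketch. It is not the formatter described above, which uses Constraint Logic Programming; it is a plain Python approximation in which all names (`DeliveryContext`, `Box`, `side_by_side`, `stacked`, `violations`, `format_document`) are hypothetical, the temporal dimension is omitted, and the only hard constraint is that every media item fits within the screen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryContext:
    """Hypothetical delivery context: hard limits of the playback environment."""
    width: int   # available screen width in pixels
    height: int  # available screen height in pixels


@dataclass(frozen=True)
class Box:
    """A media item placed on screen (temporal dimension omitted for brevity)."""
    name: str
    x: int
    y: int
    w: int
    h: int


def side_by_side(items):
    """Preferred style rule (assumed): place items left to right on one row."""
    layout, x = [], 0
    for name, w, h in items:
        layout.append(Box(name, x, 0, w, h))
        x += w
    return layout


def stacked(items):
    """Alternative style rule (assumed): place items top to bottom in one column."""
    layout, y = [], 0
    for name, w, h in items:
        layout.append(Box(name, 0, y, w, h))
        y += h
    return layout


def violations(layout, ctx):
    """Return the names of items that break the hard screen-size constraints."""
    return [b.name for b in layout
            if b.x + b.w > ctx.width or b.y + b.h > ctx.height]


def format_document(items, rules, ctx):
    """Try each style rule in order; return the first layout without violations.

    Returns None when no rule satisfies the delivery context, which mirrors
    the observation that multimedia formatting may fail.
    """
    for rule in rules:
        layout = rule(items)
        if not violations(layout, ctx):
            return rule.__name__, layout
    return None


# Media items as (name, width, height); the sizes are illustrative only.
items = [("image", 600, 400), ("caption", 400, 300)]
rules = [side_by_side, stacked]

wide = format_document(items, rules, DeliveryContext(1024, 768))
tall = format_document(items, rules, DeliveryContext(640, 800))
tiny = format_document(items, rules, DeliveryContext(320, 240))
```

Under these assumptions, the 1024 by 768 context is served by the preferred rule, the 640 by 800 context triggers the alternative rule, and the 320 by 240 context yields `None` because neither rule satisfies the hard constraints. The same structured input and the same rule set are used in all three cases; only the delivery context parameter changes.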