Conference Paper

Data-independent sequencing with the timing object: a JavaScript sequencer for single-device and multi-device web media


Abstract

Media players and frameworks all depend on the ability to produce correctly timed audiovisual effects. More formally, sequencing is the process of translating timed data into a correctly timed presentation. Though sequencing logic is a central part of all multimedia applications, it tends to be tightly integrated with specific media formats, authoring models, timing/control primitives and/or predefined UI elements. In this paper, we present the Sequencer, a generic sequencing tool cleanly separated from data, timing/control and UI. Data-independent sequencing implies broad utility as well as simple integration of different data types and delivery methods in multimedia applications. UI-independent sequencing simplifies integration of new data types into visual and interactive components. Integration with an external timing object [7] ensures that media components based on the Sequencer may trivially be synchronized and remote-controlled, in single-page media presentations as well as global, multi-device media applications [5, 6, 7, 16]. A JavaScript implementation of the Sequencer, based on setTimeout, is provided, ensuring precise timing and reduced energy consumption. The implementation is open sourced as part of timingsrc [2, 3], a new programming model for precisely timed Web applications. The timing object and the Sequencer are proposed for standardization by the W3C Multi-device Timing Community Group [20].
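As a rough illustration of the timeout-based approach the abstract describes, a sequencer can compute the exact delay to the next cue boundary from the timing object's position and velocity, and arm a single setTimeout rather than polling at high frequency. The sketch below is ours, under assumed names and structure; it is not the timingsrc API.

```javascript
// Illustrative sketch only (not the timingsrc API).

// A deterministic media clock: position advances at `velocity` units/second.
class TimingObject {
  constructor(position = 0, velocity = 1, now = Date.now() / 1000) {
    this.p0 = position;
    this.v = velocity;
    this.t0 = now;
  }
  query(now = Date.now() / 1000) {
    return { position: this.p0 + this.v * (now - this.t0), velocity: this.v };
  }
}

// Pure helper: find the nearest cue boundary ahead of the current position
// (in the direction of motion) and the delay in seconds until it is reached.
function nextBoundary(cues, { position, velocity }) {
  if (velocity === 0) return null; // paused: nothing to schedule
  let best = null;
  for (const cue of cues) {
    for (const b of [cue.start, cue.end]) {
      const ahead = velocity > 0 ? b > position : b < position;
      if (ahead && (best === null ||
          Math.abs(b - position) < Math.abs(best - position))) {
        best = b;
      }
    }
  }
  if (best === null) return null;
  return { boundary: best, delay: Math.abs(best - position) / Math.abs(velocity) };
}

// Sequencer: cues are (interval, data) pairs; the data type is opaque.
// One timeout per boundary is precise and cheap compared to polling.
class Sequencer {
  constructor(timingObject, cues, onChange) {
    this.to = timingObject;
    this.cues = cues;
    this.onChange = onChange;
    this.timer = null;
    this.arm();
  }
  arm() {
    const next = nextBoundary(this.cues, this.to.query());
    if (next === null) return;
    this.timer = setTimeout(() => {
      this.onChange(next.boundary); // a cue is entered or exited here
      this.arm();                   // schedule the following boundary
    }, next.delay * 1000);
  }
}
```

Because the cues carry arbitrary data and the clock is external, the same scheduling logic serves any data type and any UI, which is the separation the paper argues for.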


... Research in media synchronization explored the limitations of audio and video synchronization on the Web platform, demonstrating that echoless synchronization was possible on high-end smartphones as early as 2015 [23]. Sequencing tools for dynamic datasets [24] demonstrated precise synchronization of time-dependent data. The Media State Vector [25] demonstrated a scalable solution for online media clocks, with global availability and precision down to a few milliseconds. ...
... Additionally, CdM extends the scope of assembly. Prior research in data sequencing [24] exclusively addressed timeline consistency for timed data. In CdM, timeline consistency is seen as part of a larger challenge: application state management. ...
Article
Full-text available
Abstract: Many media providers offer complementary products on different platforms to target a diverse consumer base. Online sports coverage, for instance, may include professionally produced audio and video channels, as well as Web pages and native apps offering live statistics, maps, data visualizations, social commentary and more. Many consumers also engage in parallel usage, setting up streaming products and interactive interfaces on available screens, laptops and handheld devices. This ability to combine products holds great promise, yet, with no coordination, cross-platform user experiences often appear inconsistent and disconnected. We present Control-driven Media (CdM), an extension of the current media model that adds support for coordination and consistency across interfaces, devices, products, and platforms while remaining compatible with existing services, technologies, and workflows. CdM promotes online media control as an independent resource type in multimedia systems. With control as a driving force, CdM offers a highly flexible model, opening up for further innovations in automation, personalization, multi-device support, collaboration and time-driven visualization. Furthermore, CdM bridges the gap between continuous media and Web/native apps, allowing the combined powers of these platforms to be seamlessly exploited as parts of a single, consistent user experience. Extensive research in time-dependent, multi-device, data-driven media experiences supports CdM. In particular, CdM requires a generic and flexible concept for online, timeline-consistent media control, for which a candidate solution (State Trajectory) has recently been published. This paper makes the case for CdM, bringing the significant potential of this model to the attention of research and industry.
... Trajectory cursors may be played back in precise synchrony with other rendering components also directed by timing objects, such as audio or video. To ensure precision and reduced energy consumption, cursor playback uses Sequencers [3,4] for activation of segments during playback. Sequencer execution is based on timeouts instead of high-frequency polling. ...
Chapter
Interactive applications are powerful tools for data exploration, visualization and collaboration. Applications featuring viewports are particularly expressive, offering controls for altering perspective by scrolling, panning, zooming or tilting a view. Still, interactivity is inherently live and manual, and often limited to a single interface. We propose to model interactivity as a data source. This way, interactivity may be transmitted from one interface to another, or broadcast to a distributed audience. Interactivity could also be created or edited by AI-based algorithms, recorded from manual input, stored and made available for on-demand playback, or shared in real time in a multi-view setup or among collaborators in a group. To facilitate such opportunities, we propose State Trajectory, a unifying concept for local and online interactivity. State trajectories extend regular program variables with a temporal dimension and provide built-in support for persistence, real-time sharing, time-consistent recording and playback, and gradual transitions. A concept implementation demonstrates that state trajectories encapsulate significant complexity, yet with a low performance overhead. Using trajectories, support for real-time collaboration and time-shifted replays could be added to a third-party map framework, with minimal modifications to the existing code base.
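The core idea of extending a program variable with a temporal dimension can be sketched in a few lines: keep the history of (time, value) samples so that any past state can be queried for replay or time-shifted views. This is a hedged illustration under assumed names, not the published State Trajectory API.

```javascript
// Illustrative sketch of the state-trajectory idea (names are our own).
// A variable whose assignment history is retained, so the value in effect
// at any point on the timeline can be recovered.
class Trajectory {
  constructor() {
    this.samples = []; // {time, value}, appended in time order
  }
  set(time, value) {
    this.samples.push({ time, value });
  }
  // Return the value in effect at `time`: the latest sample at or before it.
  query(time) {
    let result;
    for (const s of this.samples) {
      if (s.time <= time) result = s.value;
      else break;
    }
    return result; // undefined before the first sample
  }
}
```

Driving `query()` from a shared clock instead of wall time is what would make recording, playback and time-shifted sharing fall out of the same mechanism.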
... Timing Objects [13,14] are used in combination with sequencers [15] to ensure that the positioning data is tightly synchronised with video playback. This is very important, as re-positioning the video on the wrong frame at scene shifts is highly noticeable and creates full-screen flickering that detracts substantially from the user experience. ...
Conference Paper
Full-text available
Media is to a large extent consumed on devices that have non-standard aspect ratios, both physically and while rendering content. For example, social media platforms, televisions, tablets, and Android devices most commonly utilise varying aspect ratios of 1:1, 16:9, 4:3/3:4, and 16:9/9:16, respectively. Web pages tend to use responsive design and can therefore have almost any aspect ratio. As current solutions are static, multiple encoded versions of the content must be created to cater for different aspect ratios, increasing workload, storage space requirements and content management complexity. With this in mind, there is a case for client-side dynamic aspect ratios that adapt suitably to the user's device to improve their viewing experience based on a common encoded version of the content. In this paper we make the case for a client-side dynamic aspect ratio solution, present work on implementation and experimentation, and finally provide some insights into how such a system could be implemented and provided in real-world systems. Our solution was tested on content provided by NRK, including both drama series and TV debates.
... Finally, the original video scene is recomposed by alpha blending the processed signal bitstreams in the appropriate order (i.e., from back to foreground), on a frame-by-frame basis. For this approach to work, the playback of the scene's constituent video signals needs to be synchronized, which is achieved by leveraging the W3C Timing Object specification [1,4]. Both the chroma keying and alpha blending operations are implemented as WebGL shaders to profit from 3D hardware acceleration. ...
Conference Paper
Full-text available
Over-the-top (OTT) streaming services like YouTube and Netflix generate massive amounts of video data, thereby putting substantial pressure on network infrastructure. This paper describes a demonstration of the object-based video (OBV) methodology that allows for quality-variant MPEG-DASH streaming of, respectively, the background and foreground object(s) of a video scene. The OBV methodology is inspired by research into human visual attention and foveated compression, in that it adaptively and dynamically assigns bitrate to those portions of the visual scene that have the highest utility in terms of perceptual quality. Using a content corpus of interview-like video footage, the described demonstration proves the OBV methodology's potential to reduce video bitrate requirements while incurring at most marginal perceptual impact (i.e., in terms of subjective video quality). Thanks to its standards-compliant Web implementation, the OBV methodology is directly and broadly deployable without requiring capital expenditure.
... Though this mechanism is not optimized for precision, Web browsers may be precise down to a few milliseconds. The sequencer is presented in further detail in [4]. ...
Chapter
The Web is a natural platform for multimedia, with universal reach, powerful backend services, and a rich selection of components for capture, interactivity, and presentation. In addition, with a strong commitment to modularity, composition, and interoperability, the Web should allow advanced media experiences to be constructed by harnessing the combined power of simpler components. Unfortunately, with timed media this may be complicated, as media components require synchronization to provide a consistent experience. This is particularly the case for distributed media experiences. In this chapter we focus on temporal interoperability on the Web, how to allow heterogeneous media components to operate consistently together, synchronized to a common timeline and subject to shared media control. A programming model based on external timing is presented, enabling modularity, interoperability, and precise timing among media components, in single-device as well as multi-device media experiences. The model has been proposed within the W3C Multi-device Timing Community Group as a new standard, and this could establish temporal interoperability as one of the foundations of the Web platform.
Conference Paper
Full-text available
Media has been shaped by inherent limitations of the available distribution mechanisms since the advent of broadcasting. We seek to break free of this heritage by fundamentally reconstructing media. We promote the concept of motion as a fundamental building block in all media. By structuring and executing media according to shared motion, a world of opportunity opens up. In particular, our recent invention of highly scalable, cross-Internet motion synchronization implies that motion-based media is collaborative and multi-device by design. We envision a shift away from the current paradigm, where fixed pieces of content are produced, transmitted and consumed. Instead, media will be composed again and again from a continuously developing corpus of online content and online motion. In this world, viewing, navigation, interaction and authoring can all be collaborative, multi-device activities. We call this Composite Media.
Technical Report
Full-text available
In this report we analyze the quality of synchronization we can expect when synchronizing HTML5 audio and video on multiple devices using Shared Motion. We demonstrate that the concept of Shared Motion enables sub-frame synchronization for video, and near perfect synchronization for audio. Experiments are conducted in real world scenarios.
Conference Paper
Full-text available
Composition is a hallmark of the Web, yet it does not fully extend to linear media. This paper defines linear composition as the ability to form linear media by coordinated playback of independent linear components. We argue that native Web support for linear composition is a key enabler for Web-based multi-device linear media, and that precise multi-device timing is the main technical challenge. This paper proposes the introduction of an HTMLTimingObject as basis for linear composition in the single-device scenario. Linear composition in the multi-device scenario is ensured as HTMLTimingObjects may integrate with Shared Motion, a generic timing mechanism for the Web. By connecting HTMLMediaElements and HTMLTrackElements with a multi-device timing mechanism, a powerful programming model for multi-device linear media is unlocked.
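Connecting an HTMLMediaElement to an external timing mechanism, as the paper proposes, is commonly sketched as a skew-correction loop: small skews are closed by nudging playbackRate, large skews by seeking. The code below is purely illustrative (our own function names and thresholds, not the proposed HTMLTimingObject API); the pure decision function is separated out so the browser-only driver stays trivial.

```javascript
// Hedged sketch: deciding how to correct a media element's skew against an
// external timing source. Thresholds are illustrative assumptions.
function mediaSyncAction(skew, { seekThreshold = 0.5, maxRateAdjust = 0.1 } = {}) {
  // skew = mediaElement.currentTime - timing position, in seconds.
  if (Math.abs(skew) > seekThreshold) {
    return { seek: true, rate: 1.0 }; // too far off: jump to the target
  }
  // Nudge playback rate proportionally, clamped, to close the gap smoothly.
  const adjust = Math.max(-maxRateAdjust, Math.min(maxRateAdjust, -skew));
  return { seek: false, rate: 1.0 + adjust };
}

// Browser-only driver: periodically compare and apply the correction.
function slaveVideo(video, timingObject) {
  setInterval(() => {
    const skew = video.currentTime - timingObject.query().position;
    const { seek, rate } = mediaSyncAction(skew);
    if (seek) video.currentTime = timingObject.query().position;
    else video.playbackRate = rate;
  }, 1000);
}
```

Rate adjustment avoids the audible glitches of frequent seeking, which is why skew-correction schemes of this kind typically reserve seeks for large errors.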
Conference Paper
Full-text available
Using high-quality video cameras on mobile devices, it is relatively easy to capture a significant volume of video content for community events such as local concerts or sporting events. A more difficult problem is selecting and sequencing individual media fragments that meet the personal interests of a viewer of such content. In this paper, we consider an infrastructure that supports the just-in-time delivery of personalized content. Based on user profiles and interests, tailored video mash-ups can be created at view-time and then further tailored to user interests via simple end-user interaction. Unlike other mash-up research, our system focuses on client-side compilation based on personal (rather than aggregate) interests. This paper concentrates on a discussion of language and infrastructure issues required to support just-in-time video composition and delivery. Using a high school concert as an example, we provide a set of requirements for dynamic content delivery. We then provide an architecture and infrastructure that meets these requirements. We conclude with a technical and user analysis of the just-in-time personalized video approach.
Conference Paper
Full-text available
This paper provides an overview of the Ambulant Open SMIL player. Unlike other SMIL implementations, the Ambulant Player is a reconfigurable SMIL engine that can be customized for use as an experimental media player core. The Ambulant Player is a reference SMIL engine that can be integrated in a wide variety of media player projects. This paper starts with an overview of our motivations for creating a new SMIL engine, then discusses the architecture of the Ambulant Core (including the scalability and custom integration features of the player). We close with a discussion of our implementation experiences with Ambulant instances for Windows, Mac and Linux versions for desktop and PDA devices.
Conference Paper
Full-text available
In this paper we examine adaptive time-based web applications (or presentations). These are interactive presentations where time dictates the major structure, and that require interactivity and other dynamic adaptation. We investigate the current technologies available to create such presentations and their shortcomings, and suggest a mechanism for addressing these shortcomings. This mechanism, SMIL State, can be used to add user-defined state to declarative time-based languages such as SMIL or SVG animation, thereby enabling the author to create control flows that are difficult to realize within the temporal containment model of the host languages. In addition, SMIL State can be used as a bridging mechanism between languages, enabling easy integration of external components into the web application.
Conference Paper
This paper presents the concept of the Media State Vector (MSV), an implementation of uni-dimensional motion in real time. The MSV is intended as a general representation of media navigation and a basis for synchronization of multi-device media presentations. The MSV is motivated by the idea that media navigation can be decoupled from media content and visual presentation, and shared across a network. Implementation of the MSV concept for the Web allows us to construct navigable, synchronized, multi-device, multimedia presentations, spanning computers across the Internet. In particular, media presentations may be hosted by regular Web browsers on a range of devices, including smart phones, pads, laptops and smart TVs. Our proof of concept implementation bases its synchronization accuracy on primitive, centralized, ad-hoc, application-level clock synchronization. Still, inter-client synchronization error of about 33 ms is demonstrated between three screens in London (UK), synchronized via a server in Tromsø (Norway).
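The MSV's representation of uni-dimensional motion is naturally expressed as a small state vector plus a deterministic query function; a minimal illustration follows (field names and the function are our assumptions, not the paper's API). Given a snapshot (position, velocity, acceleration, timestamp), the state at any later time is computed, not polled.

```javascript
// Hedged sketch of Media State Vector evaluation. For a snapshot taken at
// time `timestamp`, position evolves as p + v*d + 0.5*a*d^2 and velocity
// as v + a*d, where d = t - timestamp.
function msvQuery(msv, t) {
  const d = t - msv.timestamp;
  return {
    position: msv.position + msv.velocity * d + 0.5 * msv.acceleration * d * d,
    velocity: msv.velocity + msv.acceleration * d,
    acceleration: msv.acceleration,
    timestamp: t,
  };
}
```

Because evaluation is purely a function of the vector and a synchronized clock, the vector can be shared across a network once and queried locally on every device, which is what makes the approach scale.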
References

I. M. Arntzen, F. Daoust, and N. T. Borch. Timing Object; Draft community group report. http://webtiming.github.io/timingobject/.

Mozilla. Popcorn.js: the HTML5 media framework. http://popcornjs.org/.

MPEG-4. http://mpeg.chiariglione.org/standards/mpeg-4.

MPEG-4 Systems. http://mpeg.chiariglione.org/standards/mpeg-4/systems.

I. M. Arntzen and N. T. Borch. Timingsrc: A programming model for timed web applications, based on the Timing Object. Precise timing, synchronization and control enabled for single-device and multi-device Web applications. http://webtiming.github.io/timingsrc/.

M. Shotton. HTML5 Video Compositor. https://github.com/bbc/html5-video-compositor.

T. Churnside. Object-Based Broadcasting. http://www.bbc.co.uk/rd/blog/2013-05-object-based-approach-to-broadcasting, 2013.

I. M. Arntzen and N. T. Borch. Timingsrc: Open source implementation. https://github.com/webtiming/timingsrc.