Multimed Tools Appl (2010) 47:207–224
A scalable and adaptive video streaming framework
over multiple paths
Ivan Lee · Jong Hyuk Park
Published online: 21 October 2009
© Springer Science + Business Media, LLC 2009
Abstract In this paper, we examine the frame loss probabilities for
multiple-description coded video transmitted over independent paths. We apply an
efficient multiple description coding technique for the analysis, and we investigate
the impact of drifting error in terms of the probability of receiving freeze frames
in the reconstructed video. To improve video delivery, we investigate an adaptive
video coding scheme that adjusts the length of the group-of-pictures. In addition, a
scalable video streaming framework is examined under client-server, centralized
peer-to-peer, and decentralized peer-to-peer network topologies. Analytical and
experimental results based on the Gilbert model are used to evaluate the performance
of the proposed adaptive and scalable video streaming framework.
This paper is an extended version of  and .
School of Computer and Information Engineering, University of South Australia,
Adelaide, South Australia, Australia
J. H. Park (✉)
Department of Computer Science and Engineering, Seoul National University of Technology,
Nowon-gu, Seoul, South Korea
1 Introduction
Video streaming delivers video over the Internet (or an intranet) to end users who
play back the video content in real time. The video can be either pre-recorded or
streamed live. The major challenge in video streaming is its timing requirement: a
packet is considered lost if its transmission fails to meet a real-time constraint.
This differs significantly from a play-after-download approach, where transmission
delay is not treated as a source of error. Another challenge for video streaming is
the best-effort nature of today’s Internet, where successful data delivery is not
guaranteed.
To resolve these problems, numerous techniques for transmission feedback control,
adaptive source encoding, efficient packetization, resource allocation, and error
control coding have been proposed to improve the quality of video communication
[21, 22]. In , Chow et al. proposed a variation of the Gilbert model  in which the
loss parameters of a path depend on the application’s transmission rate. Using this
model, the authors optimized the load distribution among multiple paths to improve
streaming quality. An optimization framework was proposed to minimize the aggregate
distortion of multiple video streams transmitted over a shared communication
channel . Distributed video streaming from multiple servers to a single receiver was
also studied in . The servers independently partitioned the media packets based on
bandwidth information, such that the resulting video quality at the receiver was
maximized.
Among the approaches proposed to address the challenges of streaming video over
today’s packet-based, best-effort Internet, distributed streaming over collaborative
peer-to-peer (P2P) overlay networks has recently attracted increasing attention from
both the research and industrial communities. Unlike the conventional client/server
infrastructures commonly used for content distribution networks (CDN), each node in
a P2P network acts as both a client and a server. This approach yields high
throughput and good tolerance to loss and delay caused by network congestion . P2P
multimedia streaming and caching services also reduce the initial playback delay,
and hence minimize jitter during playback . Tran et al.  investigated
application-layer multicast trees and proposed ZIGZAG, which features short
end-to-end delay, low control overhead, efficient join and failure recovery, and low
maintenance overhead. Another study, oStream , took advantage of the strong
buffering capabilities of end hosts; it is a tree-based overlay designed
specifically for one-to-many on-demand media distribution.
Due to the heterogeneous nature of today’s Internet, data transmission over a
massive number of channels is difficult to control at low cost with a centrally
managed solution. Multimedia content generally has a highly time-varying bandwidth
requirement, since media data encoded with modern coding techniques are variable bit
rate (VBR) in nature [6, 8]. In addition, streaming applications demand guaranteed
delivery in order to meet specified temporal and spatial constraints. The services
that provide guaranteed delivery in a Network Multimedia System (NMS) were improved
at the control management level of the host and the underlying network
architectures . Physically copying data is expensive for guaranteed network
transmission services. With the concept of Application Level Framing , integrated
processing loops perform the manipulation functions over a single common unit
instead of performing them serially. Media synchronization is also needed to
guarantee jitter-free playback [16, 18]. The high bandwidth requirement and the
real-time delivery constraint are the two major challenges for streaming video.
Driven by the goal of improving long-term system performance, dynamic resource
allocation schemes with application-specific adaptation capabilities are integrated
in the solution. Previous work enables complex adaptations by use of general models
of target systems . Given the large variations in the bandwidth requirements of
multimedia content, rate adaptation  adjusts the bandwidth used by a transmission
channel according to the existing network conditions.
Our work in this paper focuses on overcoming the limitations of conventional
client-server streaming applications by using a multi-path streaming infrastructure
and a spatial-domain multiple description coding (MDC)  technique. We evaluate the
performance of different real-time streaming systems in terms of the frame loss rate
of the reconstructed video, which reflects the probability of a frozen playback
video frame at the receiver device. Different scenarios are examined by adjusting
several factors, including whether the I-frames are aligned or unaligned. Based on
the multi-path video streaming framework, we also propose an adaptive system that
dynamically adjusts the GOP length of each sub-stream according to the network
conditions. Frame loss rates are investigated under three scenarios: (1)
sub-dividing a GOP into multiple sub-GOPs, (2) multiple MDC sub-streams with
different GOP lengths, and (3) an adaptive streaming system based on time-series
projection of the frame loss rate. In addition, this paper presents different
approaches to offloading bottleneck traffic by applying the MDC-compressed video
over a peer-to-peer network infrastructure. Three streaming infrastructures are
examined: client/server, centralized P2P streaming, and decentralized P2P streaming.
Extending traditional network performance analysis in terms of packet loss rate,
this paper further investigates the impact of lost traffic on the reconstructed
video quality due to drifting error. The frame loss rate, which indicates the video
frames that cannot be reconstructed at the receiver, is analyzed in this paper.
The remainder of this paper is organized as follows. In Section 2, the adaptive
multi-path video streaming scheme is introduced, followed by its frame loss pattern
analysis in Section 3. Aligned and unaligned I-frames are studied
in Section 4. In order to improve video delivery and reduce frame loss rates, an
adaptive source coding based on channel conditions is investigated in Section 5. In
Section 6, a scalable video streaming framework with different network topologies is
studied. Experiments are presented in Section 7.
2 Adaptive multi-path video streaming
MDC techniques are designed for path diversity. They encode media content into
multiple independent descriptions, also known as sub-streams. Once a sub-stream is
received at the client end, the granules within the stream can be decoded. The
overall quality of the recovered content depends on the number of successfully
delivered sub-streams: the more sub-streams are received, the higher the
reconstructed video quality. MDC provides loss tolerance, and it is therefore
beneficial for delay-sensitive, real-time streaming applications where data losses
are highly disruptive.
An efficient MDC technique proposed in  is applied in this paper. Figure 1
illustrates the concept of the MDC codec, which sub-samples each video frame
into multiple sub-frames over the spatial domain before encoding with an H.264
video encoder. The corresponding MDC decoder applies cubic-spline interpolation
to reconstruct the missing sub-frames. Thus, the system is capable of reconstructing
video with different quality levels depending on the number of MDC sub-streams
received at the destination.
Fig. 1 An efficient MDC codec with spatial diversity
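The spatial sub-sampling step can be sketched as follows. This is a minimal
illustration under stated assumptions, not the codec used in the paper: it assumes a
2×2 polyphase split of a frame stored as a list of pixel rows, the function names
`split_frame` and `merge_frames` are hypothetical, and pixels of a missing sub-frame
are simply copied from a received sub-frame instead of being reconstructed by
cubic-spline interpolation. The H.264 encoding of each sub-frame is omitted.

```python
def split_frame(frame, factor=2):
    """Split a frame (list of pixel rows) into factor*factor sub-frames.

    Sub-frame (r, c) holds the pixels at rows r, r+factor, ... and
    columns c, c+factor, ... of the original frame.
    """
    height, width = len(frame), len(frame[0])
    if height % factor or width % factor:
        raise ValueError("frame dimensions must be multiples of factor")
    return {
        (r, c): [row[c::factor] for row in frame[r::factor]]
        for r in range(factor)
        for c in range(factor)
    }


def merge_frames(subframes, height, width, factor=2):
    """Rebuild a frame from whichever sub-frames were received.

    Assumption: pixels of a missing sub-frame are copied from the co-located
    sample of a received sub-frame. This is a crude stand-in for the
    cubic-spline interpolation mentioned in the text.
    """
    if not subframes:
        raise ValueError("at least one sub-frame is required")
    fallback = subframes[min(subframes)]
    frame = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sub = subframes.get((y % factor, x % factor), fallback)
            frame[y][x] = sub[y // factor][x // factor]
    return frame


# Example: a 4x4 frame whose pixel values are 0..15.
frame = [[4 * y + x for x in range(4)] for y in range(4)]
subs = split_frame(frame)
full = merge_frames(subs, 4, 4)  # all four sub-streams arrive
partial = merge_frames({k: v for k, v in subs.items() if k != (1, 1)}, 4, 4)
```

With all four sub-frames, `full` equals the original frame; with one sub-frame
missing, `partial` is a degraded but complete frame, which mirrors the graceful
quality reduction that MDC is meant to provide.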
MDC is typically applied in scenarios where multiple nodes independently forward
video content to the client node over physically connected networks. Prior research
in  proposed applying MDC to a decentralized peer-to-peer streaming system, where
the MDC sub-streams ($M_1 + M_2 + M_3 + \cdots + M_n$) received at server nodes are
forwarded to the end user who requests those sub-streams. Therefore, the more
forwarding servers exist, the more paths can be used for transmission. Combining the
underutilized network bandwidth of multiple forwarding servers may give a broader
overall bandwidth for MDC streaming, hence yielding better reconstructed video
quality at the receiver.
Prior research in  and  assumes that MDC sub-streams are transmitted over
channels with identical delay and throughput characteristics, and that these
parameters remain unchanged for an entire streaming session. In this paper, we
consider the nature of today’s Internet, where delay and throughput vary
dynamically. An adaptive video streaming system is therefore required. We propose an
infrastructure that allows the receiver to report network conditions to the servers,
which adaptively adjust the encoded bitstreams to improve the performance of video
streaming. Figure 2 illustrates the workflow of the proposed infrastructure. The
receiver monitors the network condition and uses time-series projection to predict
the loss pattern of subsequent video frames. The predicted value is delivered to the
servers as feedback to adjust the GOP lengths. An in-depth analysis of the
performance in terms of frame loss rate under different scenarios, as well as the
details of the adaptive streaming infrastructure, is given in the subsequent
sections.
Fig. 2 Workflow of adaptive length of GOP
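The feedback loop described above can be illustrated with a short sketch. Everything
in it is an assumption made for illustration: the projection uses simple exponential
smoothing as a stand-in for the time-series projection, and the names
`project_loss_rate` and `choose_gop_length`, the smoothing factor `alpha`, the GOP
bounds, and the linear mapping from predicted loss to GOP length are all
hypothetical rather than taken from the paper.

```python
def project_loss_rate(history, alpha=0.5):
    """Project the next frame loss rate from past observations.

    Assumption: simple exponential smoothing, where `alpha` weights the most
    recent observation.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    estimate = history[0]
    for observed in history[1:]:
        estimate = alpha * observed + (1 - alpha) * estimate
    return estimate


def choose_gop_length(predicted_loss, min_len=4, max_len=32, loss_ceiling=0.2):
    """Map a predicted loss rate to a GOP length (hypothetical linear rule).

    A higher predicted loss gives a shorter GOP, so that a lost frame
    affects fewer following frames.
    """
    ratio = min(max(predicted_loss / loss_ceiling, 0.0), 1.0)
    return round(max_len - (max_len - min_len) * ratio)


# Receiver side: observed frame loss rates for recent measurement windows.
observed = [0.0, 0.1]
feedback = project_loss_rate(observed)

# Server side: adjust the GOP length of the sub-stream from the feedback.
gop_length = choose_gop_length(feedback)
```

The design choice shown here is only that the receiver sends a single projected
value and the server maps it to a GOP length; any monotone mapping could replace the
linear rule.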
3 Frame loss analysis for reconstructed video
We consider an H.264 profile that uses two types of frames: Intra (I)-frames and
Predictive (P)-frames. I-frames are encoded independently of prior frames. P-frames
are encoded with
respect to a prior I-frame or P-frame, where motion compensation techniques are
applied for improving the compression efficiency. A group of frames that starts with
an I-frame, followed by a set of P-frames and ends before the next I-frame is called a
Group of Pictures (GOP). Losing an I-frame or a P-frame will result in distortion of
the following frames within the same GOP. This is known as the drifting error.
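The effect of drifting error on playback can be sketched as follows. This is a
simplified illustration that assumes every frame following a loss within the same
GOP cannot be reconstructed (and is shown as a freeze frame) until the next I-frame
arrives; the function name `frozen_frames` is hypothetical.

```python
def frozen_frames(lost, gop_length):
    """Return per-frame flags that are True when a frame cannot be reconstructed.

    `lost[i]` is True when frame i was lost in transit. A loss affects the
    frame itself and every later frame of the same GOP; the I-frame that
    starts the next GOP (every `gop_length` frames) resets the error.
    """
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")
    frozen = []
    drifting = False
    for index, is_lost in enumerate(lost):
        if index % gop_length == 0:  # a new GOP starts with an I-frame
            drifting = False
        drifting = drifting or is_lost
        frozen.append(drifting)
    return frozen


# One lost frame (index 1) in a sequence of eight frames.
losses = [False, True, False, False, False, False, False, False]
long_gop = frozen_frames(losses, gop_length=4)
short_gop = frozen_frames(losses, gop_length=2)
```

In this example a single lost packet freezes three frames when the GOP length is 4
but only one frame when the GOP length is 2, which is why the GOP length is a useful
parameter to adapt.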
In this paper, we simulate the packet delivery of the video bitstreams using the
Gilbert model , which is a two-state Markov chain with a success state and a failure
state for packet delivery. To simplify the analysis, we assume that each packet
contains one encoded video sub-frame. Let $S_0$ (good) denote the good state, in
which IP network packets are received correctly and on time, and let $S_1$ (bad)
denote the bad state, in which packets are lost. The probabilities of the network
transitioning from $S_0$ to $S_1$ and from $S_1$ to $S_0$ are denoted as $P_{01}$
and $P_{10}$, respectively. The probabilities of staying in the same state are
denoted as $P_{00}$ for state $S_0$ and $P_{11}$ for state $S_1$. Steady-state
analysis shows that the overall probabilities of the good state, $P_{S_0}$, and the
bad state, $P_{S_1}$, are

$$P_{S_0} = \frac{P_{11} - 1}{P_{00} + P_{11} - 2} \qquad (1)$$

$$P_{S_1} = \frac{P_{00} - 1}{P_{00} + P_{11} - 2} \qquad (2)$$
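The two-state model can be sketched in a few lines. The code below computes the
steady-state probabilities of the good and bad states from the staying
probabilities and compares them with a simulated packet trace. The function names
and the example values 0.9 and 0.6 are illustrative assumptions, not parameters
from the paper.

```python
import random


def steady_state(p00, p11):
    """Return (P_S0, P_S1), the long-run probabilities of the good and bad states."""
    denominator = p00 + p11 - 2
    if denominator == 0:
        raise ValueError("p00 = p11 = 1 has no unique steady state")
    return (p11 - 1) / denominator, (p00 - 1) / denominator


def simulate_gilbert(p00, p11, n_packets, seed=0):
    """Simulate packet delivery; returns a list with True for each lost packet."""
    rng = random.Random(seed)
    p_good, _ = steady_state(p00, p11)
    good = rng.random() < p_good  # draw the initial state from the steady state
    lost = []
    for _ in range(n_packets):
        lost.append(not good)
        stay = p00 if good else p11  # probability of remaining in the current state
        if rng.random() >= stay:
            good = not good
    return lost


p_good, p_bad = steady_state(0.9, 0.6)  # example values, not taken from the paper
trace = simulate_gilbert(0.9, 0.6, 200_000)
empirical_bad = sum(trace) / len(trace)
```

For these example values the analytical result is a good-state probability of 0.8
and a bad-state probability of 0.2, and the fraction of lost packets in a long
simulated trace should be close to the latter.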
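Let $T$ denote the length of a GOP, and $M$ denote the number of sub-streams.
At any time $i$, the probability of a good frame transmission is
$P_{S_0} P_{00}^{i-1}$, for $i = 1, 2, \ldots, T$. The complement value, $\rho_i$,
reflects the probability of all combinations with a frame loss before or at time $i$:

$$\rho_i = 1 - P_{S_0} P_{00}^{i-1} \qquad (3)$$

Substituting (1) into (3), the mean frame loss rate can be written in terms of the
transition probabilities $P_{00}$ and $P_{11}$, as shown in (4).

$$\bar{\rho} = \frac{1}{T}\sum_{i=1}^{T}\left(1 - P_{S_0} P_{00}^{i-1}\right) = \frac{1}{T}\sum_{i=1}^{T}\left[1 - \frac{(P_{11} - 1)\,P_{00}^{i-1}}{P_{00} + P_{11} - 2}\right] \qquad (4)$$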
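As a numerical check, the per-frame loss probability and its mean over a GOP can be
evaluated directly. The sketch below assumes that the mean frame loss rate is the
average of the per-frame loss probabilities over the $T$ frame positions of a GOP;
the function names and example values are hypothetical.

```python
def frame_loss_probability(p00, p11, i):
    """Probability that frame i of a GOP (i = 1..T) cannot be reconstructed.

    The frame is good only if the chain starts in the good state and stays
    there for the i - 1 following transitions.
    """
    if i < 1:
        raise ValueError("frame positions start at 1")
    p_good = (p11 - 1) / (p00 + p11 - 2)  # steady-state probability of the good state
    return 1 - p_good * p00 ** (i - 1)


def mean_frame_loss_rate(p00, p11, gop_length):
    """Average the per-frame loss probabilities over one GOP (assumed definition)."""
    total = sum(frame_loss_probability(p00, p11, i) for i in range(1, gop_length + 1))
    return total / gop_length


# Example values, not taken from the paper.
rate_t1 = mean_frame_loss_rate(0.9, 0.6, 1)
rate_t2 = mean_frame_loss_rate(0.9, 0.6, 2)
rate_t16 = mean_frame_loss_rate(0.9, 0.6, 16)
```

Under this definition the mean frame loss rate grows with the GOP length, because
later frames in a GOP depend on a longer run of successful deliveries.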