Conference Paper

On the Delay Performance of Browser-based Interactive TCP Free-viewpoint Streaming

Abstract

In free-viewpoint video, arbitrary views of a scene or an object are rendered from a 3-dimensional scene representation that is obtained using multiple cameras or generated by computer graphics. The interactivity that is due to the viewpoint selection is particularly challenging in the case of networked applications, where a server renders the scene from a viewpoint that is chosen by a remote client. Relying on widely used standard browser-based video streaming technology, data transport is performed by the Transmission Control Protocol (TCP), implying a risk of potentially large delays. The magnitude, frequency, and origin of such delays are the focus of this work. To investigate the tail distribution of the delays, we use a controlled testbed environment and instrument the entire video streaming chain from the server-side renderer to the display at the client using various measurement points. We identify three major sources of delays: the video coders, the protocol stack, and the network. We investigate the causes of these delays and show a strong impact of network parameters, such as round-trip time and packet loss probability, on protocol stack delays. While stack delays can significantly exceed network delays, we find that stack delays can be reduced effectively by adapting the parameters of the video encoder.

Conference Paper
Recent advances in free-viewpoint rendering techniques as well as the continued improvements of the internet network infrastructure open the door for challenging new applications. In this paper, we present a framework for interactive free-viewpoint streaming with open standards and software. Network bandwidth, encoding strategy as well as codec support for open source browsers are key constraints to be considered for our interactive streaming applications. Our framework is capable of real-time server-side rendering and interactively streaming the output by means of open source streaming. To enable viewer interaction with the free-viewpoint video rendering back-end in a standard browser, user events are captured with Javascript and transmitted using WebSockets. The rendered video is streamed to the browser using the FFmpeg free software project. This paper discusses the applicability of open source streaming and presents timing measurements for video-frame transmission over network.
Technical Report
Recently, HTTP-based adaptive streaming has become the de facto standard for video streaming over the Internet. It allows clients to dynamically adapt media characteristics to varying network conditions in order to ensure a high quality of experience, that is, minimize play-back interruptions, while maximizing video quality at a reasonable level of quality changes. In the case of live streaming, this task becomes particularly challenging due to the latency constraints. The challenge further increases if a client uses a wireless access network, where the throughput is subject to considerable fluctuations. Consequently, live streams often exhibit latencies of up to 20 to 30 seconds. In the present work, we introduce an adaptation algorithm for HTTP-based live streaming called LOLYPOP (Low-Latency Prediction-Based Adaptation) that is designed to operate with a transport latency of few seconds. To reach this goal, LOLYPOP leverages TCP throughput predictions on multiple time scales, from 1 to 10 seconds, along with an estimate of the relative prediction error distribution. In addition to satisfying the latency constraint, the algorithm heuristically maximizes the quality of experience by maximizing the average video quality as a function of the number of skipped segments and quality transitions. In order to select an efficient prediction method, we studied the performance of several time series prediction methods in IEEE 802.11 wireless access networks. We evaluated LOLYPOP under a large set of experimental conditions, limiting the transport latency to 3 seconds, against a state-of-the-art adaptation algorithm from the literature, called FESTIVE. We observed that the average video quality is by up to a factor of 3 higher than with FESTIVE. We also observed that LOLYPOP is able to reach a broader region in the quality of experience space, and thus it is better adjustable to the user profile or service provider requirements.
Article
Transmission control protocol (TCP) is pervasively employed as the transport-layer solution in popular video applications (e.g., Skype, Google+, HTTP-based adaptive video streaming, etc.) for firewall traversal and network-friendliness. To remedy the shortcomings of the data retransmission mechanism in TCP, forward error correction (FEC) coding is commonly used as the application-layer error-resilient scheme in live media streaming systems. An important measurement study reveals that TCP exhibits delay-friendliness (i.e., delay-performance bias) towards traffic flows composed of small-size packets. Motivated by leveraging the delay-friendliness of TCP to optimize the real-time streaming video quality, we propose a novel FEC coding scheme dubbed Coded Live videO Streaming ovEr TCP (CLOSET). To achieve the optimal video quality over lossy communication networks, we analytically formulate the constrained optimization problem of joint FEC coding and packet interleaving to minimize the effective packet loss rate. Then, we provide an approximate analysis to derive the solution for online adaptation of FEC redundancy, packet size, and interleaving level. The performance of CLOSET is evaluated through extensive semi-physical emulations in Exata involving real-time encoded H.264 video streaming. Experimental results show that CLOSET outperforms the reference FEC coding schemes in terms of video peak signal-to-noise ratio (PSNR), end-to-end delay, ratio of overdue video frames, and goodput. Therefore, we recommend CLOSET for TCP-based real-time video communication systems.
Technical Report
Changing network conditions pose severe problems to video streaming in the Internet. HTTP adaptive streaming (HAS) is a technology employed by numerous video services that relieves these issues by adapting the video to the current network conditions. It enables service providers to improve resource utilization and Quality of Experience (QoE) by incorporating information from different layers in order to deliver and adapt a video in its best possible quality. Thereby, it allows taking into account end user device capabilities, available video quality levels, current network conditions, and current server load. For end users, the major benefits of HAS compared to classical HTTP video streaming are reduced interruptions of the video playback and higher bandwidth utilization, which both generally result in a higher QoE. Adaptation is possible by changing the frame rate, resolution, or quantization of the video, which can be done with various adaptation strategies and related client- and server-side actions. The technical development of HAS, existing open standardized solutions, but also proprietary solutions are reviewed in this paper as fundamental to derive the QoE influence factors that emerge as a result of adaptation. The main contribution is a comprehensive survey of QoE related works from human computer interaction and networking domains, which are structured according to the QoE impact of video adaptation. To be more precise, subjective studies that cover QoE aspects of adaptation dimensions and strategies are revisited. As a result, QoE influence factors of HAS and corresponding QoE models are identified, but also open issues and conflicting results are discussed. Furthermore, technical influence factors, which are often ignored in the context of HAS, affect perceptual QoE influence factors and are consequently analyzed. This survey gives the reader an overview of the current state of the art and recent developments. 
At the same time, it targets networking researchers who develop new solutions for HTTP video streaming or assess video streaming from a user-centric point of view. Therefore, this paper is a major step toward truly improving HAS.
Article
In interactive multiview video streaming (IMVS), a client receives and observes one of many available viewpoints of the same scene and periodically requests from the server view switches to neighboring views, as the video is played back in time uninterruptedly. One key technical challenge is to design a frame coding structure that facilitates periodic view switching and achieves an optimal tradeoff between storage cost and expected transmission rate. In this paper, we first propose three significant improvements over existing IMVS systems and then study the corresponding frame structure optimization. First, using depth-image-based rendering, the new IMVS system enables free viewpoint switching, i.e., by encoding and transmitting both texture and depth maps of captured views, a client can select and synthesize any virtual view from an almost continuum of viewpoints between the left-most and right-most captured views. Second, the IMVS system adopts a more realistic Markovian view-switching model with memory that more accurately captures user behaviors than previous memoryless models. A view-switching model is used in predicting a client's future view-switching patterns. Third, assuming that the round-trip-time (RTT) delay during server-client communication is nonnegligible, during an IMVS session, the IMVS system additionally transmits redundant frames RTT into future playback, so that zero-delay view switching can be achieved. Given these improvements, we formalize a new joint optimization of the frame coding structure, transmission schedule, and quantization parameters of the texture and depth maps of multiple camera views. We propose an iterative algorithm to achieve fast and near-optimal solutions. The convergence of the algorithm is also demonstrated. Experimental results show that the proposed optimized rate-allocation method requires 38% lower transmission rate than the fixed rate-allocation scheme.
In addition, with the same storage, the transmission rate of the optimized frame structure can be up to 55% lower than that of an I-frame-only structure and 27% lower than that of the structure without distributed source coding frames.
Article
Significant improvements in video compression capability have been demonstrated with the introduction of the H.264/MPEG-4 advanced video coding (AVC) standard. Since developing this standard, the Joint Video Team of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) has also standardized an extension of that technology that is referred to as multiview video coding (MVC). MVC provides a compact representation for multiple views of a video scene, such as multiple synchronized video cameras. Stereo-paired video for 3-D viewing is an important special case of MVC. The standard enables inter-view prediction to improve compression capability, as well as supporting ordinary temporal and spatial prediction. It also supports backward compatibility with existing legacy systems by structuring the MVC bitstream to include a compatible “base view.” Each other view is encoded at the same picture resolution as the base view. In recognition of its high-quality encoding capability and support for backward compatibility, the stereo high profile of the MVC extension was selected by the Blu-Ray Disc Association as the coding format for 3-D video with high-definition resolution. This paper provides an overview of the algorithmic design used for extending H.264/MPEG-4 AVC towards MVC. The basic approach of MVC for enabling inter-view prediction and view scalability in the context of H.264/MPEG-4 AVC is reviewed. Related supplemental enhancement information (SEI) metadata is also described. Various “frame compatible” approaches for support of stereo-view video as an alternative to MVC are also discussed. A summary of the coding performance achieved by MVC for both stereo- and multiview video is also provided. Future directions and challenges related to 3-D video are also briefly discussed.
Article
While much of multiview video coding focuses on the rate-distortion performance of compressing all frames of all views for storage or non-interactive video delivery over networks, we address the problem of designing a frame structure to enable interactive multiview streaming, where clients can interactively switch views during video playback. Thus, as a client is playing back successive frames (in time) for a given view, it can send a request to the server to switch to a different view while continuing uninterrupted temporal playback. Noting that standard tools for random access (i.e., I-frame insertion) can be bandwidth-inefficient for this application, we propose a redundant representation of I-, P-, and “merge” frames, where each original picture can be encoded into multiple versions, appropriately trading off expected transmission rate with storage, to facilitate view switching. We first present ad hoc frame structures with good performance when the view-switching probabilities are either very large or very small. We then present optimization algorithms that generate more general frame structures with better overall performance for the general case. We show in our experiments that we can generate redundant frame structures offering a range of tradeoff points between transmission and storage, e.g., outperforming simple I-frame insertion structures by up to 45% in terms of bandwidth efficiency at twice the storage cost.
Conference Paper
Adaptive (video) streaming over HTTP is gradually being adopted, as it offers significant advantages in terms of both user-perceived quality and resource utilization for content and network service providers. In this paper, we focus on the rate-adaptation mechanisms of adaptive streaming and experimentally evaluate two major commercial players (Smooth Streaming, Netflix) and one open source player (OSMF). Our experiments cover three important operating conditions. First, how does an adaptive video player react to either persistent or short-term changes in the available bandwidth of the underlying network? Can the player quickly converge to the maximum sustainable bitrate? Second, what happens when two adaptive video players compete for available bandwidth in the bottleneck link? Can they share the resources in a stable and fair manner? And third, how does adaptive streaming perform with live content? Is the player able to sustain a short playback delay? We identify major differences between the three players, and significant inefficiencies in each of them.
Conference Paper
With the rapid development of electronic and computing technology, multi-view video is attracting extensive interest recently due to its greatly enhanced viewing experience. In this paper, we present the system architecture for real-time capturing, processing, and interactive delivery of multi-view video. Unlike previous systems that mainly focus on multi-view video capturing, our system is designed to provide multi-view video service with high degree of interactivity in real time, which is still challenging in the current state of the technology. The proposed architecture tackles many practical problems in system calibration, object tracking, video compression, interactive delivery, etc. With the proposed system, users can interactively select their desired viewing directions and enjoy many exciting visual experiences, such as view switching, frozen moment and view sweeping, in real-time and with great freedom.
Conference Paper
Interactive multiview video streaming (IMVS) is an application that streams to a client one out of N available video views for observation, while the client can periodically request switches to neighboring views as the video is played back uninterrupted in time. Previous IMVS works focused on the design of a frame structure at encoding time, trading off expected transmission rate with storage, without knowing the exact view trajectory a client may select at stream time. None of the existing IMVS schemes, however, explicitly addressed the network delay problem, and so a client will suffer a round trip time (RTT) delay for each requested view-switch. In this paper, we optimize frame structure for a bounded RTT, so that a client can switch to neighboring views as the video is played back without view-switching delay. The key idea is to send additional views likely to be requested by a client within one RTT beyond the current requested view. Each required set of contiguous views (corresponding to a given current requested single view) is pre-encoded using frames of the previously transmitted set of views as predictors to lower the transmission rate. Using I-, P- and distributed source coding (DSC) frames, we first formulate the structure design problem as a Lagrangian minimization for a desired bandwidth/storage tradeoff. We then develop a low-complexity greedy algorithm to automatically generate a good structure. Experimental results show that for the same storage cost, the transmission rate of the proposed structure can be 42% lower than that of an I-frame-only structure, and 8% lower than that of the structure without DSC frames.
Article
We present a novel client-driven multi-view video streaming system that allows a user to watch 3-D video interactively with significantly reduced bandwidth requirements by transmitting a small number of views selected according to his/her head position. The user’s head position is tracked and predicted into the future to select the views that best match the user’s current viewing angle dynamically. Prediction of future head positions is needed so that views matching the predicted head positions can be prefetched in order to account for delays due to network transport and stream switching. The system allocates more bandwidth to the selected views in order to render the current viewing angle. Highly compressed, lower quality versions of some other views are also prefetched for concealment if the current user viewpoint differs from the predicted viewpoint. An objective measure based on the abruptness of the head movements and delays in the system is introduced to determine the number of additional lower quality views to be prefetched. The proposed system makes use of multi-view coding (MVC) and scalable video coding (SVC) concepts together to obtain improved compression efficiency while providing flexibility in bandwidth allocation to the selected views. Rate-distortion performance of the proposed system is demonstrated under different experimental conditions.
Article
TCP has traditionally been considered inappropriate for real-time applications. Nonetheless, popular applications such as Skype use TCP since UDP packets cannot pass through restrictive network address translators (NATs) and firewalls. Motivated by this observation, we study the delay performance of TCP for real-time media flows. We develop an analytical performance model for the delay of TCP. We use extensive experiments to validate the model and to evaluate the impact of various TCP mechanisms on its delay performance. Based on our results, we derive the working region for VoIP and live video streaming applications and provide guidelines for delay-friendly TCP settings. Our research indicates that simple application-level schemes, such as packet splitting and parallel connections, can reduce the delay of real-time TCP flows by as much as 30% and 90%, respectively.
Conference Paper
We present an interactive free-viewpoint video (FVV) streaming system that is based on the dynamic adaptive streaming over HTTP (DASH) standard. The system uses standard HTTP Web servers to achieve scalability with a large number of users and performs view synthesis and rate adaptation at the client-side to achieve high response time. We propose a rate adaptation logic based on sampled rate-distortion (R-D) values, which relate the distortion of synthesized view to the bit rates of the texture and depth components of the reference views, to maximize the quality of rendered virtual views. Initial results indicate that the proposed R-D-based rate adaptation strategy outperforms equal bit rate allocation among the reference streams components.
Article
With the rapid development of electronic and computing technology, multi-view video transmission has attracted significant attention from the research and industrial communities. A multi-view video service provides a user with various viewpoints according to the user’s input. Alongside the recent trend of shifting video streaming service platforms to HTTP adaptive streaming, Dynamic Adaptive Streaming over HTTP (DASH)-based multi-view video services have been widely studied. However, DASH-based multi-view video streaming suffers from high view-switching delay. The delay between a request for view switching and the rendering of the target view adversely affects the Quality of Experience (QoE). This paper provides an analysis of the view-switching delay of DASH-based multi-view video streaming. Based on the analysis, a novel DASH-based multi-view video streaming system is proposed. It can minimize the view-switching delay by employing buffer control, parallel streaming, and server push schemes. The results obtained from the implementation of this solution show that the proposed system significantly reduces the view-switching delay while guaranteeing the quality of multi-view video.
Article
Closed-loop flow control protocols, such as the prominent implementation transmission control protocol (TCP), are prevalent in the Internet, today. TCP has continuously been improved for greedy traffic sources to achieve high throughput over networks with large bandwidth delay products. Recently, the increasing use for streaming and interactive applications, such as voice and video, has shifted the focus toward its delay performance. Given the need for real-time communication of non-greedy sources via TCP, we present an estimation method for performance evaluation of closed-loop flow control protocols. We characterize an end-to-end connection by a service curve that provides statistical guarantees for arbitrary traffic. The estimation is based on end-to-end measurements at the application level that include all effects induced by the network and by the protocol stacks of the end systems. From our measurements, we identify different causes for delays. We show that significant delays are due to queueing in protocol stacks. Notably, this occurs even if the utilization is moderate. Using our estimation method, we compare the impact of fundamental mechanisms of TCP on delays at the application level: in detail, we analyze parameters relevant for network dimensioning, including buffer provisioning and active queue management, and parameters for server configuration, such as the congestion control algorithm. By applying our method as a benchmark, we find that a good selection can largely improve the delay performance of TCP.
Article
TCP is widely used in commercial media streaming systems, with recent measurement studies indicating that a significant fraction of Internet streaming media is currently delivered over HTTP/TCP. These observations motivate us to develop analytic performance models to systematically investigate the performance of TCP for both live and stored media streaming. We validate our models via ns simulations and experiments conducted over the Internet. Our models provide guidelines indicating the circumstances under which TCP streaming leads to satisfactory performance, showing, for example, that TCP generally provides good streaming performance when the achievable TCP throughput is roughly twice the media bitrate, with only a few seconds of startup delay.
Conference Paper
Recent advances in video capturing and rendering technologies have paved the way for new video streaming applications. Free-viewpoint video (FVV) streaming is one such application where users are able to interact with the scene by navigating to different viewpoints. Free-viewpoint videos are composed of multiple streams representing the captured scene and its geometry from different vantage points. Rendering non-captured views at the client requires transmitting multiple views with associated depth map streams, thereby increasing the network traffic requirements for such systems. Adding to the complexity of these systems is the fact that different component streams contribute differently to the quality of the final rendered view. In this paper, we present a free-viewpoint video streaming system based on HTTP adaptive streaming and the multi-view-plus-depth (MVD) representation. We propose a novel quality-aware rate adaptation method for FVV streaming based on a virtual view distortion model. This view distortion model represents the relation between the distortion of the texture and depth components of reference views and a target virtual view and enables the streaming client to find the best set of representations to request from the server. We have implemented the proposed rate adaptation method in a prototype FVV DASH-based streaming system and performed objective and subjective evaluation experiments. Our experimental results show that the proposed FVV streaming rate adaptation method improves the user's quality-of-experience and increases the visual quality of rendered virtual views by up to 4 dB for some video sequences. Moreover, users have rated the quality of videos streamed using our proposed method higher than videos streamed using other rate adaptation methods in the literature.
Conference Paper
This paper introduces a new TCP congestion control mechanism for rate-limited applications that transmit data in bursts and do not fully utilise their allowed transmission rate. We propose "new-CWV", a method that allows a TCP connection to restart quickly from either an idle or application-limited period. Simulation results show that this provides faster convergence to the rate requested by a rate-limited application, demonstrated by a higher throughput, and better utilisation of unused capacity compared to Standard TCP or TCP with Congestion Window Validation.
Conference Paper
HyperText Markup Language (HTML) is the main markup language for web pages. HTML5 is a new generation of HTML standards. With the development of HTML5, it has a wide range of applications in multimedia. This paper first describes the Video and Audio tags of the new standard, then discusses the elements of Video and Audio and points out the support of current browsers for Video and Audio formats. Then the advantages and disadvantages of HTML5 and Flash are discussed, and some applications that HTML5 multimedia cannot achieve are pointed out. The paper also introduces the working principle of HTTP Live Streaming. Finally, we discuss the direction of development of HTML5 multimedia in the future. HTTP Live Streaming, the Audio Data API, and peer-to-peer networking will become the focus of application of HTML5 multimedia.
Article
Delivering multiview video content over present packet networks poses multiple challenges. First, the best effort nature of the Internet exposes media packets to variable bandwidth, loss, and delay as they traverse the network. Second, the prediction dependencies employed to maximize compression efficiency make the reconstruction process at the client extremely vulnerable to missing data. Third, the heterogeneity of client devices in terms of computing power, display capabilities, and access link capacity necessitates customizing the streaming process per user. My article reviews existing opportunities for addressing these challenges from within each of the three main stages of the content delivery pipeline (i.e., encoding, transmission, and reconstruction). Concretely, I first describe adaptive source coding techniques that construct a compressed representation of the multiview video source that exhibits resilience to network bandwidth variations and client view selection uncertainty. Then I discuss intelligent methods for error protection, caching, and packet scheduling that organize the transmission of multiview data in a bandwidth-effective way. Here, I also review prospective multipath and cloud-assisted techniques for multiview video streaming. Finally, I identify robust client-side content reconstruction schemes and adaptive media playout methods that can minimize the impact of missing data and enhance the user's interactive experience. Then I proceed to describe community-driven streaming techniques for delivering interactive multiview content over a population of social peers. The article concludes with an outline of approaches for synergistic exploitation of the techniques I will present theretofore, jointly across the different layers of the network protocol stack at which they individually operate. Here, I also highlight the main deployment challenges for some of these techniques, and how their design should be addressed accordingly, to overcome them.
Conference Paper
We explore the process of transmitting real-time Internet video frames over TCP (Transmission Control Protocol) and discover that the delay of waiting in TCP sender-buffer is the critical factor that causes large end-to-end delay. We propose a multi-buffer scheduling model for decreasing the end-to-end delay by scheduling video frames among application-layer sender-buffer, TCP sender-buffer, TCP receiver-buffer and receiver playout-buffer. Based on the proposed model, we present a new rate adaptive scheme to dynamically deliver variant bit rate video frames according to available network bandwidth by adjusting frame rate as well as assuring video frames to be played at normal time. Our scheme does not require any modifications to the network infrastructure or TCP protocol stack and only needs an application-layer buffer of sender. The performance of the proposed solution is evaluated through extensive simulations using the NS-2 simulator.
Article
The Markovian analysis of telecommunication networks with several interacting connections is very difficult due to the explosion of the number of states in the model. Simulation analysis under the same conditions also fails, because of prohibitive simulation times due to the enormous amount of events that must be collected to reach statistically meaningful results. This paper presents a modeling technique based on closed queuing networks, that is suitable for the evaluation of adaptive window congestion control protocols in heavy load conditions, overcoming the problem of state dimension explosion. The technique is presented applying it to a class of protocols where the window is linearly increased when the transmission is successful, and reset to 1 when packets are lost; however, the technique is general and can be applied to almost any increase and decrease policy. Results are presented for the modeled protocol class, showing the behavior as the number of competing connections increases. Then, a modification on the protocol is presented, in order to show possible ways to ameliorate the performance and, most of all, how the modeling technique can help in evaluating protocol modifications before their deployment.
Article
This paper gives an end-to-end overview of 3D video and free viewpoint video, which can be regarded as advanced functionalities that expand the capabilities of a 2D video. Free viewpoint video can be understood as the functionality to freely navigate within real world visual scenes, as it is known for instance from virtual worlds in computer graphics. 3D video shall be understood as the functionality that provides the user with a 3D depth impression of the observed scene, which is also known as stereo video. In that sense as functionalities, 3D video and free viewpoint video are not mutually exclusive but can very well be combined in a single system. Research in this area combines computer graphics, computer vision and visual communications. It spans the whole media processing chain from capture to display and the design of systems has to take all parts into account, which is outlined in different sections of this paper giving an end-to-end view and mapping of this broad area. The conclusion is that the necessary technology including standard media formats for 3D video and free viewpoint video is available or will be available in the future, and that there is a clear demand from industry and user for such advanced types of visual media. As a consequence we are witnessing these days how such technology enters our everyday life.
Article
In this paper we develop an open multiclass queuing network model to describe the behavior of short-lived TCP connections sharing a common IP network. The queuing network model is paired with a simple model of the IP network, and the two models are solved through an iterative procedure. The combined models need as inputs only the primitive network parameters, and they produce estimates of the packet loss probability, the round trip time, the TCP connection throughput, and of the average TCP connection completion time (that is, of the average time necessary to transfer a file with given size over a TCP connection). We derive models for both TCP-Tahoe and TCP-NewReno. The Tahoe model is presented in detail, while the NewReno model is presented describing differences with respect to Tahoe. Results are shown for both models. The analytical performance predictions are validated against detailed simulation experiments in realistic networking scenarios, proving that the proposed modeling approach is accurate.
Article
Media streaming over TCP has become increasingly popular because TCP's congestion control provides remarkable stability to the Internet. Streaming over TCP requires adapting to bandwidth availability, but unfortunately, TCP can introduce significant latency at the application level, which causes unresponsive and poor adaptation. This article shows that this latency is not inherent in TCP but occurs as a result of throughput-optimized TCP implementations. We show that this latency can be minimized by dynamically tuning TCP's send buffer. Our evaluation shows that this approach leads to better application-level adaptation and that it allows supporting interactive and other low-latency applications over TCP.
Conference Paper
The dominance of the TCP protocol on the Internet and its success in maintaining Internet stability has led to several TCP-based stored media-streaming approaches. The success of these approaches raises the question of whether TCP can be used for low-latency streaming. Low-latency streaming allows responsive control operations for media streaming and can make interactive applications feasible. We examined adapting the TCP send buffer size based on TCP's congestion window to reduce application-perceived network latency. Our results show that this simple idea significantly improves the number of packets that can be delivered within 200 ms and 500 ms thresholds.
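The idea of sizing the send buffer from the congestion window can be sketched as follows. The abstract does not state the exact sizing rule, so the rule used here (one congestion window of full-sized segments, with a floor) is an assumption, as are the names `send_buffer_bytes` and `apply_send_buffer` and the default values. Reading the current congestion window is platform-specific and is left outside the sketch.

```python
import socket


def send_buffer_bytes(cwnd_packets, mss_bytes=1460, min_bytes=4096):
    """Assumed rule: buffer one congestion window of segments, never below a floor."""
    return max(min_bytes, cwnd_packets * mss_bytes)


def apply_send_buffer(sock, cwnd_packets, mss_bytes=1460):
    """Request the computed size via SO_SNDBUF and return what the OS reports.

    The operating system may round, double, or clamp the requested value,
    so the returned size can differ from the request.
    """
    size = send_buffer_bytes(cwnd_packets, mss_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


# With a congestion window of 10 segments the assumed rule asks for 14600 bytes.
print(send_buffer_bytes(10))
```

A smaller send buffer keeps less unsent data queued below the application, so a newly written frame waits behind fewer bytes before it reaches the network.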
M. Seufert, S. Egger, M. Slanina, T. Zinner, T. Hoßfeld, and P. Tran-Gia, "A survey on quality of experience of HTTP adaptive streaming," IEEE Commun. Surveys Tuts., vol. 17, no. 1, pp. 469-492, Mar. 2015.