Yong Yan’s research while affiliated with IEEE Computer Society and other places


Publications (40)


Fig. 1. SRB memory allocation: the initial buffer freezes its size.  
Fig. 2. Multiple running buffers.  
Fig. 3. SRB memory reclamation: different situations of session termination.  
Fig. 4. SRB algorithm.  
Fig. 5. PSRB caching example.  


Fast proxy delivery of multiple streaming sessions in shared running buffers
  • Article
  • Full-text available

January 2006 · 136 Reads · 2 Citations

IEEE Transactions on Multimedia

Songqing Chen · Yong Yan · [...] · Xiaodong Zhang

With the falling price of memory, an increasing number of multimedia servers and proxies are now equipped with a large memory space. Caching media objects in the memory of a proxy helps to reduce the network traffic, the disk I/O bandwidth requirement, and the data delivery latency. The running buffer approach and its alternatives are representative techniques for caching streaming data in memory. There are two limits in the existing techniques. First, although multiple running buffers for the same media object co-exist in a given processing period, data sharing among multiple buffers is not considered. Second, user access patterns are not insightfully considered in the buffer management. In this paper, we propose two techniques based on shared running buffers in the proxy to address these limits. Considering user access patterns and characteristics of the requested media objects, our techniques adaptively allocate memory buffers to fully utilize the currently buffered data of streaming sessions, with the aim of reducing both the server load and the network traffic. Experimental comparisons with several existing techniques show that the proposed techniques achieve significant performance improvement by effectively using the shared running buffers.
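The core idea of the shared-running-buffer approach can be illustrated with a small sketch. All names here (`RunningBuffer`, `admit`, the block-indexed window) are illustrative assumptions, not the paper's actual data structures: a buffer "runs" with a streaming session, keeping a sliding window of recently delivered blocks, and a later request for the same object is served from an existing buffer when its start position still lies inside that window, instead of opening a new server stream.

```python
from collections import deque

class RunningBuffer:
    """Toy running buffer: caches a sliding window of media blocks so that
    closely spaced requests for the same object can be served from the
    proxy instead of opening a new server stream."""

    def __init__(self, object_id, window):
        self.object_id = object_id
        self.window = window            # capacity in blocks
        self.blocks = deque()           # block indices currently cached
        self.sessions = 1               # sessions sharing this buffer

    def push(self, block_index):
        # The buffer "runs" with playback: new blocks arrive, and the
        # oldest block falls off once capacity is reached.
        self.blocks.append(block_index)
        if len(self.blocks) > self.window:
            self.blocks.popleft()

    def can_serve(self, start_block):
        # A new request can share this buffer only if its start position
        # still lies inside the cached window.
        return bool(self.blocks) and self.blocks[0] <= start_block <= self.blocks[-1]

def admit(buffers, object_id, start_block, window=4):
    """Share an existing buffer when possible; otherwise allocate a new one."""
    for buf in buffers:
        if buf.object_id == object_id and buf.can_serve(start_block):
            buf.sessions += 1
            return buf, True            # served from a shared buffer
    buf = RunningBuffer(object_id, window)
    buffers.append(buf)
    return buf, False                   # new server stream required

buffers = []
first, shared = admit(buffers, "movie-1", 0)
for blk in range(5):                    # stream advances; window keeps the last 4 blocks
    first.push(blk)
second, shared = admit(buffers, "movie-1", 2)   # arrives while block 2 is still cached
print(shared, second.sessions)                   # True 2
```

A real proxy would additionally decide when to freeze a buffer's size and when to reclaim it on session termination, as the SRB figures above suggest; this sketch only shows the sharing decision.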



Figure 3. Data Sharing among Buffers in SRB Algorithm (a) and Example of PSRB Algorithm (b) 
Figure 4. REAL: (a) Bandwidth Reduction and (b) Average Client Channel Requirement with 1GB Memory 
Figure 7. REAL: (a) Average Client Storage Requirement (%) and (b) Client Waste (%) with the Scale of 1/4
SRB: Shared Running Buffers in Proxy to Exploit Memory Locality of Multiple Streaming Media Sessions.

January 2004 · 62 Reads · 17 Citations

Proceedings - International Conference on Distributed Computing Systems

With the falling price of memory, an increasing number of multimedia servers and proxies are now equipped with a large DRAM memory space. Caching media objects in the memory of a proxy helps to reduce network traffic, disk I/O bandwidth requirement, and data delivery latency. The running buffer approach and its alternatives are representative techniques for caching streaming data in memory. However, there are two limits in the existing techniques. First, although multiple running buffers for the same media object co-exist in a given processing period, data sharing among the multiple buffers is not considered. Second, user access patterns are not insightfully considered in the buffer management. In this paper, we propose two techniques based on shared running buffers (SRB) in the proxy to address these limits. Considering user access patterns and characteristics of the requested media objects, our techniques adaptively allocate memory buffers to fully utilize the currently buffered data of streaming sessions, with the aim of reducing both the server load and the network traffic. Experimental comparisons with several existing techniques show that the proposed techniques achieve significant performance improvement by effectively using the shared running buffers.


mmGrid: distributed resource management infrastructure for multimedia applications

May 2003 · 73 Reads · 25 Citations

We are developing mmGrid (Multimedia Grid) as an extensible middleware architecture supporting multimedia applications in a grid computing environment. Our vision is to provide support for interactive applications from the following domains: graphics, visualization, streaming media, and tele-immersion. The initial deployment will be within an enterprise as a mechanism for provisioning computing resources. However, the scheduling system of mmGrid will be flexible and will allow interactive and batch jobs to use the grid-computing paradigm. This paper presents our argument for using remote display technology in this environment. We also report on the use cases we support at this point, the system architecture for mmGrid, and our research directions.


Shared Running Buffer Based Proxy Caching of Streaming Sessions

April 2003 · 37 Reads · 5 Citations

Keywords: shared running buffer, proxy caching, patching, streaming media delivery, VOD

With the falling price of memory, an increasing number of multimedia servers and proxies are now equipped with a large memory space. Caching media objects in the memory of a proxy helps to reduce the network traffic, the disk I/O bandwidth requirement, and the data delivery latency. The running buffer approach and its alternatives are representative techniques for caching streaming data in memory. There are two limits in the existing techniques. First, although multiple running buffers for the same media object co-exist in a given processing period, data sharing among multiple buffers is not considered. Second, user access patterns are not insightfully considered in the buffer management. In this paper, we propose two techniques based on shared running buffers (SRB) in the proxy to address these limits. Considering user access patterns and characteristics of the requested media objects, our techniques adaptively allocate memory buffers to fully utilize the currently buffered data of streaming sessions, with the aim of reducing both the server load and the network traffic. Experimental comparisons with several existing techniques show that the proposed techniques achieve significant performance improvement by effectively using the shared running buffers.


mmGrid: Distributed Resource Management

February 2003 · 10 Reads

We are developing mmGrid (Multimedia Grid) as an extensible middleware architecture supporting multimedia applications in a grid computing environment. Our vision is to provide support for interactive applications from the following domains: graphics, visualization, streaming media, and tele-immersion. The initial deployment will be within an enterprise as a mechanism for provisioning computing resources. However, the scheduling system of mmGrid will be flexible and will allow interactive and batch jobs to use the grid-computing paradigm. This paper presents our argument for using remote display technology in this environment. We also report on the use cases we support at this point, the system architecture for mmGrid, and our research directions.


Buffer Sharing for Proxy Caching of Streaming Sessions.

January 2003 · 49 Reads · 3 Citations

With the falling price of memory, an increasing number of multimedia servers and proxies are now equipped with a large memory space. Caching media objects in the memory of a proxy helps to reduce the network traffic, the disk I/O bandwidth requirement, and the data delivery latency. The running buffer approach and its alternatives are representative techniques for caching streaming data in memory. There are two limits in the existing techniques. First, although multiple running buffers for the same media object co-exist in a given processing period, data sharing among the multiple buffers is not considered. Second, user access patterns are not insightfully considered in the buffer management. In this study, we propose two techniques based on shared running buffers (SRB) in the proxy to address these limits. Considering user access patterns and characteristics of the requested media objects, our techniques adaptively allocate memory buffers to fully utilize the currently buffered data of streaming sessions to serve existing requests concurrently, with the aim of reducing both the server load and the network traffic. Experimental comparisons with several existing techniques show that the proposed techniques achieve significant performance improvement by effectively using the shared running buffers.


Journal Of Parallel And Distributed Computing 22, 392-410 (1994)

December 2000 · 5 Reads · 1 Citation

In this paper, we present an experimental metric using network latency for measuring and evaluating parallel program and architecture scalability. We first give the definitions of latency and scalability. Furthermore, we show the analytical relationships among the latency metric, the isoefficiency function, and the isospeed metric. Finally, we give a measurement method for using the latency metric. We include experimental measurements on the KSR-1 to show the effectiveness of the latency metric in predicting and evaluating the scalability.


Cacheminer: A runtime approach to exploit cache locality on SMP

May 2000 · 25 Reads · 16 Citations

IEEE Transactions on Parallel and Distributed Systems

Exploiting cache locality of parallel programs at runtime is a complementary approach to a compiler optimization. This is particularly important for those applications with dynamic memory access patterns. We propose a memory-layout oriented technique to exploit cache locality of parallel loops at runtime on Symmetric Multiprocessor (SMP) systems. Guided by application-dependent and targeted architecture-dependent hints, our system, called Cacheminer, reorganizes and partitions a parallel loop using the memory-access space of its execution. Through effective runtime transformations, our system maximizes the data reuse in each partitioned data region assigned in a cache, and minimizes the data sharing among the partitioned data regions assigned to all caches. The executions of tasks in the partitions are scheduled in an adaptive and locality-preserved way to minimize the execution time of programs by trading off load balance and locality. We have implemented the Cacheminer runtime library on two commercial SMP servers and a SimOS-simulated SMP. Our simulation and measurement results show that our runtime approach can achieve comparable performance with the compiler optimizations for programs with regular computation and memory-access patterns, whose load balance and cache locality can be well optimized by the tiling and other program transformations. However, our experimental results show that our approach is able to significantly improve the memory performance for the applications with irregular computation and dynamic memory access patterns. These types of programs are usually hard to optimize by static compiler optimizations.
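The partitioning step described above can be sketched in a few lines. This is a toy illustration of the general idea, with invented names (`partition_iterations`, the `(iteration, address)` pairs), not Cacheminer's actual interface: loop iterations are binned by the memory region they touch, so iterations that reuse the same cache-sized region are scheduled on the same processor.

```python
# Toy sketch of memory-layout-oriented task grouping in the spirit of
# Cacheminer: iterations of a parallel loop are binned by the region of
# the array they touch, so iterations that reuse the same cache-sized
# region run on the same processor.

def partition_iterations(accesses, region_size, num_procs):
    """accesses: list of (iteration, first_address_touched) pairs.
    Returns a per-processor list of iterations."""
    bins = {}
    for it, addr in accesses:
        bins.setdefault(addr // region_size, []).append(it)
    # Assign whole regions to processors round-robin; a real system
    # would balance load and respect cache capacity.
    schedule = [[] for _ in range(num_procs)]
    for i, region in enumerate(sorted(bins)):
        schedule[i % num_procs].extend(bins[region])
    return schedule

# Iterations 0..7 each touch address 64*i; with 256-byte regions,
# iterations {0..3} and {4..7} share regions and stay together.
accesses = [(i, 64 * i) for i in range(8)]
print(partition_iterations(accesses, 256, 2))   # [[0, 1, 2, 3], [4, 5, 6, 7]]
```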


Cacheminer: A Runtime Approach to Exploit Cache Locality on SMP.

April 2000 · 30 Reads · 15 Citations

IEEE Transactions on Parallel and Distributed Systems

Exploiting cache locality of parallel programs at runtime is a complementary approach to a compiler optimization. This is particularly important for those applications with dynamic memory access patterns. We propose a memory-layout oriented technique to exploit cache locality of parallel loops at runtime on Symmetric Multiprocessor (SMP) systems. Guided by application-dependent and targeted architecture-dependent hints, our system, called Cacheminer, reorganizes and partitions a parallel loop using the memory-access space of its execution. Through effective runtime transformations, our system maximizes the data reuse in each partitioned data region assigned in a cache, and minimizes the data sharing among the partitioned data regions assigned to all caches. The executions of tasks in the partitions are scheduled in an adaptive and locality-preserved way to minimize the execution time of programs by trading off load balance and locality. We have implemented the Cacheminer runtime library on two commercial SMP servers and a SimOS-simulated SMP. Our simulation and measurement results show that our runtime approach can achieve comparable performance with the compiler optimizations for programs with regular computation and memory-access patterns, whose load balance and cache locality can be well optimized by the tiling and other program transformations. However, our experimental results show that our approach is able to significantly improve the memory performance for the applications with irregular computation and dynamic memory access patterns. These types of programs are usually hard to optimize by static compiler optimizations.


Citations (25)


... This is mainly due to the longer streaming session on average in the VOD workload. We omit it for brevity; interested readers can refer to [21]. ...

Reference:

Fast proxy delivery of multiple streaming sessions in shared running buffers
Shared Running Buffer Based Proxy Caching of Streaming Sessions

... Several caching solutions for homogeneous clients have been brought to the light at that time, as outlined in [95]. One of them, called Sliding-Interval Caching [30], suggests that a cache should not store the entire video transmission passing through it, but only a short window of it -thus, in essence, a storage-limited LRU per each video instance. With the course of the playback, new video frames will arrive to the cache's "stack" which will cause the eviction of the oldest frames in the window. ...
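The Sliding-Interval Caching behavior described in the excerpt is simple enough to sketch directly. The class name and method names below are illustrative, not from the cited work: the cache keeps only a fixed window of the most recently seen frames per video, new frames evict the oldest ones in the window, and a request is a hit only if it falls inside the window.

```python
from collections import deque

# Toy sliding-interval cache: keep only a fixed window of the most
# recent frames of each video; a request arriving within the window is
# a hit, otherwise the stream must be fetched from the server again.

class SlidingIntervalCache:
    def __init__(self, window):
        self.window = window
        self.frames = {}                # video_id -> deque of frame numbers

    def on_frame(self, video_id, frame):
        q = self.frames.setdefault(video_id, deque())
        q.append(frame)
        if len(q) > self.window:        # evict the oldest frame in the window
            q.popleft()

    def hit(self, video_id, frame):
        q = self.frames.get(video_id)
        return bool(q) and q[0] <= frame <= q[-1]

cache = SlidingIntervalCache(window=3)
for f in range(6):
    cache.on_frame("v1", f)             # window now holds frames 3..5
print(cache.hit("v1", 4), cache.hit("v1", 1))   # True False
```

In essence this is the storage-limited, per-video LRU the excerpt describes: frame 1 misses because it has already slid out of the window.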

SRB: Shared Running Buffers in Proxy to Exploit Memory Locality of Multiple Streaming Media Sessions.

Proceedings - International Conference on Distributed Computing Systems

... The model for AvailCPU(Si) could be a dynamically supplied value, for example from the Network Weather Service [Wol96], or from an analytical model of contention, for example [FB96, LS93, ZY95]. Notice that in this case, AvailCPU could be a model or a supplied parameter. ...

A Framework of Performance Prediction of Parallel Computing on Nondedicated Heterogeneous NOW.
  • Citing Conference Paper
  • January 1995

... Speedup and efficiency are two useful metrics. The former evaluates the improvement in speed of execution of a parallel algorithm as the number of resources increases. The latter is a metric of the utilization of the resources of the improved system. ...
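The two metrics in the excerpt have standard definitions, shown here as a minimal sketch: speedup is the single-processor time divided by the p-processor time, and efficiency is speedup normalized by the number of processors.

```python
# Standard scalability metrics: speedup S(p) = T(1) / T(p),
# efficiency E(p) = S(p) / p.

def speedup(t1, tp):
    """Ratio of sequential time t1 to parallel time tp."""
    return t1 / tp

def efficiency(t1, tp, p):
    """Fraction of the p processors that is usefully employed."""
    return speedup(t1, tp) / p

t1, tp, p = 100.0, 25.0, 8
print(speedup(t1, tp), efficiency(t1, tp, p))   # 4.0 0.5
```

An ideally scalable program keeps efficiency near 1.0 as p grows; here the 8-processor run achieves only a 4x speedup, so half the machine's capacity is wasted.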

Measuring and Analyzing Parallel Computing Scalability
  • Citing Conference Paper
  • August 1994

... We intend to extend this study in several ways. First, we plan to investigate impact of efficient data delivery algorithms such as patching [15, 5] and stream merging [8] with the clip and block replacement techniques. Second, we are exploring a dynamic version of Domical that switches from one granularity of data placement (a block) to another (a clip) based on the observations reported in Section 4.2. ...

Buffer Sharing for Proxy Caching of Streaming Sessions.

... This paper proposes a parallel prediction model of clickthrough rate based on feature classification and Combine [29]. The experimental results show that the newly added model has an average increase of 0.93% in AUC and a decrease of 0.47% in Logloss compared to the benchmark model. ...

An Effective and Practical Performance Prediction Model for Parallel Computing on Nondedicated Heterogeneous NOW
  • Citing Article
  • October 1996

Journal of Parallel and Distributed Computing

... Although significant performance improvement can be acquired for some applications, the type of affinity exploited by this approach is not very popular because it does not take into consideration the relationship between memory references of different iterations. The basic idea of Cacheminer [13] method is to group and partition tasks through shrinking and partitioning the memory access space of parallel tasks. Shrinking the memory access space is to group tasks that have shared data accessing regions. ...

Cacheminer: A Runtime Approach to Exploit Cache Locality on SMP.
  • Citing Article
  • April 2000

IEEE Transactions on Parallel and Distributed Systems

... A legal route of up*/down* routing follows the rule: it must traverse zero or more links in the up direction followed by zero or more links in the down direction. Although up*/down* routing is simple, its performance is not good since there exist many traffic congestions at the root of a spanning tree, called hot spots [13][16]. To overcome the drawbacks of the up*/down* routing, the L-turn routing was proposed in [9] based on the 2D turn model [8]. ...

Comparative Performance Evaluation of Hot Spot Contention Between MIN-Based and Ring-Based Shared-Memory Architectures.
  • Citing Article
  • January 1995

... To cope with the aspect of content management, media or semantic grid tools are used. Among the several solutions: mmGrid supports interactive applications with graphics, rendering, streaming, and tele-immersion [Basu et al., 2003]; AXCP provides an integrated approach to perform both content management and semantic computing [Bellini et al., 2011]; GRISINO combines web services with intelligent content and grid computing, where its integration with workflow management is performed via web service [Toma et al., 2006]. GridCast is a service-oriented architecture for broadcasting media via IP [Harmer et al., 2005]. ...

mmGrid: distributed resource management infrastructure for multimedia applications
  • Citing Conference Paper
  • May 2003