FLoD: A Framework for Peer-to-Peer 3D Streaming
Shun-Yun Hu∗, Ting-Hao Huang†, Shao-Chen Chang∗, Wei-Lun Sung∗, Jehn-Ruey Jiang∗and Bing-Yu Chen‡
∗Department of Computer Science and Information Engineering
National Central University
†Department of Computer Science and Information Engineering
National Taiwan University
‡Department of Information Management and Graduate Institute of Networking and Multimedia
National Taiwan University
Abstract—Interactive 3D content on the Internet has yet
to become popular due to its typically large volume and the limited
network bandwidth. Progressive content transmission, or 3D
streaming, thus is necessary for real-time content interactions.
However, the heavy data and processing requirements of 3D
streaming challenge the scalability of client-server-based delivery
methods. We propose the use of peer-to-peer (P2P) networks
for 3D streaming and argue that due to the non-linear access
pattern of 3D content, P2P 3D streaming is a new class of
applications apart from existing P2P media streaming. We also
describe FLoD, the first P2P framework that allows clients in 3D
virtual globe or virtual environment (VE) applications to obtain
relevant data from other clients while minimizing server resource
usage. To demonstrate how FLoD applies to real-world scenarios,
a prototype system is built to adapt JPEG 2000 geometry image-
based streaming for the P2P delivery of 3D scenes. Experiments
show that server-side bandwidth can be conserved with P2P
streaming, while simulations indicate that P2P 3D streaming is
fundamentally more scalable than client-server approaches.
3D streaming refers to the continuous and real-time delivery
of 3D content (e.g. meshes, textures, animations, and scene
graphs) over networks to allow user interactions without an a
priori download. Similar to audio or video media streaming,
3D content needs to be fragmented into pieces on
a server, before it can be transmitted, reconstructed, and
displayed at the clients. But unlike media streaming, as users
often possess different visibility or interest areas due to
each user’s unique behavior, the transmission sequence in 3D
streaming varies from user to user, and requires individualized
visibility calculations.
Current 3D streaming schemes can be classified into four
main types: object streaming, scene streaming, visualization
streaming, and image-based streaming. In this paper we
look at scene streaming, which usually involves a collection
of 3D objects placed arbitrarily in space that are streamed
to clients according to user visibility or interests. The goal
of scene streaming is to provide a remote walkthrough (i.e.
navigation) or multi-user virtual environment (VE) experience,
where users navigate a 3D scene and possibly communicate
with one another in real time (e.g. walkthrough of virtual
museums). As many more objects may exist than what a user
can see at a moment, scene streaming generally has two stages:
object determination and object transmission. For the first
stage, the server employs visibility determination techniques to
cull away irrelevant objects, and uses visual quality estimates
to assign transmission priorities. For the second stage, data
reduction techniques such as progressive representations and
compression are used to send the object pieces using object
streaming methods. Scene streaming also benefits
from the reuse of cached content, so that objects need not be
sent again if revisited later.
Existing 3D streaming schemes adopt the client-server architecture
for content delivery. However, as 3D streaming is
both data- and processing-intensive, a prohibitively vast amount
of server-side bandwidth and CPU resources is required when
serving a large audience. 3D applications with large data
volume (e.g. today’s popular Massively Multiplayer Online
Games, or MMOGs) thus currently require users to obtain
the content through pre-installations via CDs or a priori
downloads. However, a priori installations are undesirable and
even impractical for two likely future scenarios:
Larger and more dynamic content. Today’s MMOGs have
a few gigabytes of relatively static content (e.g. World of War-
craft is 5GB and takes over an hour to download and install).
However, as content becomes larger and more dynamic, con-
tent streaming will save both the installation and update time.
In fact, a priori installation is already unsuitable for the social
MMOG Second Life, which depends on 3D streaming
to deliver over 34 TB of user-created models, textures and
behavior scripts. Also of note is that virtual globes such as
Google Earth and NASA World Wind currently have terabytes
of data (70TB and 4.6TB, respectively). Extensions into 3D
may only be a matter of time, as shown by initiatives such
as X3D Earth. Pre-installation thus would be impractical
given the data volume and the size of the userbase (i.e. 250
million+ Google Earth downloads). In realizing such Earth-
scale virtual environments, large-scale 3D streaming could be
the basis for next-generation virtual globe applications.
Larger number of environments. As imagined by proponents
of Web 3D, the future Internet could very well be three-
dimensional where diverse environments exist to offer various
learning, shopping, and socializing experiences. If millions of
3D sites were to exist, a priori installations for each one of
them would be frustratingly inconvenient and impractical.
TABLE I. P2P STREAMING COMPARISONS
Scalable and efficient 3D streaming thus may be an im-
portant enabler for diverse forms of new applications. We
propose the use of peer-to-peer (P2P) networks to improve
the scalability and affordability of 3D scene streaming, based
on the observation that users navigating through a 3D scene
may own similar content due to overlapped visibility. Users
thus might obtain relevant content from one another. Although
P2P-based media streaming has seen significant progress in
recent years, it is not directly applicable to 3D data due to the
different access patterns. To use an analogy, consider that a
user has left a certain navigation path through a VE. If other
users also go through the same path with the same starting
position and speed, the data stream would be the same for
everyone (i.e. similar to live media streaming). If other
users join the path at different locations but proceed with the
same speed, the data required resembles on-demand media
streaming. 3D scene streaming occurs when other users
proceed on different paths at different speeds, making the
streaming sequences individually unique.
The main difference between 3D scene streaming and
media streaming thus lies in the content access pattern due
to user behaviors. Media streaming views content as one-
dimensional (i.e. time) and sequentially accessible, whereas
3D streaming views content as stored in a multi-dimensional
space (i.e. the x and y axes, plus dimensions such as view
orientation) and accessed according to user behaviors. The
content access pattern thus is linear and more predictable
for media streaming, yet non-linear and less predictable for
3D streaming. Switching between various interest groups that
share data also would occur more frequently for 3D streaming
(Table I). A new understanding of the fundamental problems
involved and the design of new streaming techniques thus
are necessary for P2P 3D streaming. In this paper, we try to
answer the question: how can 3D scene streaming be realized
for millions of concurrent users within the same VE?
This paper builds on our earlier work to provide a
conceptual model for 3D scene streaming, and presents the
design and evaluation of FLoD (Flowing Level-of-Details),
the first P2P framework that supports 3D scene streaming
for MMOG or virtual globe applications. By separating the
graphics and the networking aspects of the problem, FLoD
also allows both fields to tackle each aspect independently.
The rest of the paper is organized as follows: A model
for P2P 3D scene streaming is first presented in Section II,
followed by FLoD’s design in Section III. To evaluate FLoD’s
applicability and scalability, Sections IV and V describe a
prototype system and related simulations. Section VI discusses
related work, and conclusions are given in Section VII.
II. P2P-BASED 3D SCENE STREAMING
A. System Model and Assumptions
We consider a remote walkthrough scenario where
3D objects of various sizes and shapes are placed in a large
scene with specific positions and orientations. Objects are
defined by polygonal meshes and their associated data, such as
textures, light maps, and animations, and the information
on their placements is stored within a scene description. Each
user navigates the scene through a client program and may
update his or her current position and view orientation via
movement commands (the terms user, node, client, and peer
will be used interchangeably from now on). As there are
potentially many objects, it is neither feasible nor necessary to
see or interact with all of them at once. Each user’s visibility
and interaction thus is limited to a circular area of interest
(AOI) centered at the user’s current location. For simplicity,
we assume that all objects are static in both their positions
and content. In this basic model, we also do not consider the
display of other users’ 3D representations (i.e. each user sees
only static objects, but not each other).
For a given 3D object, we assume that its mesh and other
data can be fragmented into a base piece and many refinement
pieces. The specific fragmentation is application-specific, but
whichever the mechanism, we assume that the user is provided
with a minimal working set of objects once the base pieces
are obtained, such that the scene can be rendered to allow
navigation. Progressive meshes and techniques such as geometry
images may be used for mesh fragmentation, while
progressive encodings of GIF, JPEG, or PNG may be used for
texture fragmentation. Note that different fragmentation
methods impose different types of dependency requirements
among the pieces when reconstructing the objects.
All content is initially stored at a server, and clients obtain
it through streaming from either the server or other clients.
Rendering and navigation may begin as soon as base pieces
of a few objects within the AOI are obtained.
From the user’s perspective, the main concern for 3D
streaming is its visual quality, which is captured by concepts
such as walkthrough quality or visual perception.
However, as visual quality can be a subjective judgement, a
more definable concept may be the streaming quality in terms
of “how much” and “how fast” a client obtains data. For the
former, one measure is the ratio between the data currently
owned and those necessary to render a view at an instant,
which we will call fill ratio. A ratio of 100% indicates the
best visual quality, as the rendered image would be the same
as if all content is locally stored. As for the latter, we may use
the following two measures: base latency, the time to obtain
the base piece of an object, and completion latency, the time
to download the complete data of an object. Note that the two
are similar to the latency time and response time defined in prior work. Base
latency indicates the delay for a user to see a basic view of an
object, while completion latency indicates the delay of being
able to fully inspect or manipulate an object. For clients, the
goal of 3D streaming thus is to optimize the streaming quality
by maximizing the fill ratio for every view and minimizing
the base and completion latency.
Fig. 1. A conceptual model for P2P-based 3D scene streaming.
From the server’s perspective, the main concern is to
improve the system’s scalability by distributing processing and
transmission loads to clients as much as possible. For trans-
missions, it is preferable if most content is delivered by clients.
This can be measured by the amount of server-side bandwidth
usage. For processing, it is desirable to minimize the server’s
role in calculating user visibility and deciding the transmission
strategy. Ideally, if these calculations are delegated to clients,
then server-side processing can be conserved for answering
data requests only. For servers, the goal of 3D streaming thus
is to minimize their CPU and bandwidth usage.
To meet the above requirements by utilizing client resources,
we identify two new issues to address:
Distributed visibility determination: Preferably, visibility
determination should be done without the server’s involvement
or any global knowledge of the scene. However, as only
the server initially possesses complete knowledge of object
placements (i.e. the scene descriptions, which are required
for visibility calculations), we need to partition and distribute
scene descriptions to clients so that visibility determination
can be done in a distributed manner efficiently.
Peer and piece selection: To optimize the visual (or streaming)
quality for a given bandwidth budget, clients should perform
peer selection to contact the proper peers and piece selection
to request the proper data pieces for object reconstructions. As
there may be multiple relevant data sources, factors such as
resource capacity, content availability and network conditions
need to be considered together. Interestingly, as 3D streaming
can be view-dependent and some data pieces may
be applied in arbitrary order during object reconstructions, 3D
streaming requires only a roughly sequential transfer order
as opposed to the strictly sequential transmissions in video
or audio streaming, as long as certain piece dependencies
are satisfied. Concurrent downloads can thus be exploited to
accelerate data retrievals for pieces that lack dependencies.
D. Conceptual Model
Given the above requirements and challenges, we summa-
rize the main tasks for P2P 3D scene streaming as follows:
Partition: The task of dividing the entire scene into blocks
or cells so that global knowledge of all object placements
is not required for visibility determination. Scene partition is
essential if visibility calculations are to be decentralized.
Fragmentation: The task of dividing a 3D object into pieces
so that it may be transmitted over the network and recon-
structed back progressively by a client. Progressive meshes or
textures are examples of fragmentation techniques.
Prefetching: The task of predicting data usage ahead of time
and generating object or scene requests so that latency due
to transmissions is masked from users. Predictions of user
movements or behaviors are often employed for this task.
Prioritization: The task of performing visibility determination
to generate the ordering for a client to obtain object pieces in
a scene. The goal is to produce the best streaming quality with
consideration of factors such as object distance, line of sight,
or the requesting client’s bandwidth.
Selection: The task of determining the proper peers to connect
and pieces to obtain based on considerations of peer capac-
ity, content availability and network conditions, in order to
efficiently fulfill requests from prefetching and prioritization.
Fig. 1 organizes the above tasks into a conceptual model
for P2P-based 3D scene streaming. For an interactive 3D
application, obtaining movement updates from the user and
performing rendering are the only steps when content is locally
available. Object preprocessing, determination, transmission,
and reconstruction are the additional stages in 3D streaming.
For client-server-based 3D streaming, only fragmentation,
prefetching, and prioritization need to be considered. Partition
of the scene and the selection of peers and pieces are new
issues introduced in P2P-based 3D streaming. A summary
comparison between the two approaches is shown in Table II.
TABLE II. CLIENT-SERVER AND P2P COMPARISONS.
Fig. 2. VON. Triangles are boundary neighbors, squares are enclosing neighbors, stars are both.
III. DESIGN OF FLOD
FLoD’s main design rationale is that as users in large-scale
VEs tend to see each other or crowd at certain hotspots, a
node might have overlapped visibility with its AOI neighbors
(i.e. other nodes whose positions fall within the node’s AOI).
It is thus likely that the neighbors already possess relevant 3D
content. By requesting data from the neighbors first, the server
can be relieved from serving the same data repetitively. Note
that neighbors here are based on proximity on the virtual map,
not the physical network. The discovery of AOI neighbors is in
fact the discovery of the proper interest groups for distributed
content sharing, and must be done efficiently. Fortunately,
recent research on P2P virtual environment (P2P-VE) overlays
allows information on AOI neighbors such as IDs, virtual
coordinates, and IP addresses to be learned given a position and
AOI-radius, without relying on a server. As a node moves
around, it can constantly notify the overlay of its position and
get refreshed information on AOI neighbors.
Our choice for the P2P overlay is Voronoi-based Overlay
Network (VON) (Fig. 2), as it has demonstrated scalability,
consistency, and reliability. VON requires each node to
connect directly with its AOI neighbors and organize them
into a Voronoi diagram. By identifying the boundary neighbors
according to Voronoi partitioning (i.e. nodes whose Voronoi
regions overlap with the AOI boundary), a node may learn
of new nodes via notifications from its boundary neighbors
as it moves around. To further constrain client-side bandwidth
usage, a node may also shrink its AOI if a certain connection
limit is exceeded. Note that other P2P-VE overlays can
also be used, as long as correct and timely information
on AOI neighbors is provided. One benefit of VON is that
when no AOI neighbors are present, connections with a few
enclosing neighbors are still kept, such that data requests
to peers are still possible.
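As a rough illustration of what "AOI neighbors" means, the sketch below finds them by brute force over a list of known node positions. This is only a stand-in for exposition: VON itself discovers neighbors through Voronoi diagrams and boundary-neighbor notifications, which are not reproduced here. The `Node` type and the `aoi_neighbors` function are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical record for a node's ID and virtual coordinates."""
    node_id: int
    x: float
    y: float


def aoi_neighbors(me, others, aoi_radius):
    """Return the nodes (other than `me`) whose positions fall within
    the circular AOI centered at `me`. Brute-force stand-in for the overlay."""
    return [
        n for n in others
        if n.node_id != me.node_id
        and math.hypot(n.x - me.x, n.y - me.y) <= aoi_radius
    ]
```

With a node at the origin and an AOI-radius of 5, a node at (3, 4) counts as a neighbor while a node at (10, 0) does not.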
To efficiently distribute scene descriptions to clients, we
partition the VE into fixed-size square cells (similar to Cyberwalk),
each with a small scene description specifying
the objects within. Each 3D object is specified by a unique
ID, location point, orientation and scale within the scene
description. Determining the visible objects to retrieve can
thus be done in a fully distributed manner, as each node is
able to locally determine the cells covered by its AOI (Fig. 3).
In Fig. 3, the big circle is the AOI of the star node,
triangles are other user nodes, and the various shapes are the 3D
objects, with their location points shown as dots. Cell IDs
can be calculated given the star node’s location coordinates,
the world dimensions and cell size. When entering a new
area, a client first prepares a scene request list to obtain scene
descriptions from its AOI neighbors or the server. Once scene
descriptions are obtained, the client then judges which objects
are in view and produces a piece request list to request
visible objects. Piece dependency, if any, is also specified in
the piece request list to ensure that data retrieval adheres to
the correct piece ordering for object reconstructions. Views
are rendered progressively as data pieces arrive from either the
peers or the server (which acts as the final data source if peers
cannot fulfill the requests). This iterative process of requesting
scene descriptions and object pieces is repeated continuously
as a client moves in the VE.
Fig. 3. Schematic of a VE divided into cells.
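The local cell computation described above can be sketched as follows. The row-major cell numbering and the function names are assumptions made for illustration; the paper only states that cell IDs follow from the node's coordinates, the world dimensions, and the cell size.

```python
import math


def cell_id(x, y, world_width, cell_size):
    """Row-major ID of the cell containing point (x, y) (assumed numbering)."""
    cols = math.ceil(world_width / cell_size)
    return int(y // cell_size) * cols + int(x // cell_size)


def cells_covered_by_aoi(x, y, aoi_radius, world_width, world_height, cell_size):
    """IDs of all cells that intersect the circular AOI centered at (x, y)."""
    cols = math.ceil(world_width / cell_size)
    rows = math.ceil(world_height / cell_size)
    # Bounding box of the AOI, clamped to the world.
    min_col = max(0, int((x - aoi_radius) // cell_size))
    max_col = min(cols - 1, int((x + aoi_radius) // cell_size))
    min_row = max(0, int((y - aoi_radius) // cell_size))
    max_row = min(rows - 1, int((y + aoi_radius) // cell_size))
    covered = []
    for row in range(min_row, max_row + 1):
        for col in range(min_col, max_col + 1):
            # Closest point of this cell to the AOI center.
            nearest_x = min(max(x, col * cell_size), (col + 1) * cell_size)
            nearest_y = min(max(y, row * cell_size), (row + 1) * cell_size)
            if (nearest_x - x) ** 2 + (nearest_y - y) ** 2 <= aoi_radius ** 2:
                covered.append(row * cols + col)
    return covered
```

The resulting list of cell IDs would form the scene request list; cells whose scene descriptions are already cached can be filtered out before sending it.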
To accommodate evolving policies and techniques, FLoD
separates the main client-side tasks into a graphics layer and a
networking layer (Fig. 4). The graphics layer performs object
determination (i.e. prefetching and prioritization) and object
reconstruction (i.e. de-partition and de-fragmentation), while
the networking layer is responsible for object transmission (i.e.
peer and piece selection). Prefetching is not yet considered in
this work, but is included for the sake of completeness. The
application sits on top of FLoD and performs the usual tasks
of taking user movement commands and performing rendering.
Fig. 4. FLoD’s client-side task flow and layers. Data flows: (A) scene request
list (B) scene descriptions (C) piece request list (D) data pieces. The numbers
are task labels in FLoD’s Procedures.
We now describe FLoD’s main procedures in more detail.
The numbers after the procedure names indicate the tasks (as
shown in Fig. 4) covered by the respective procedure:
Login: The joining node enters the VE system by specifying
a join location and AOI-radius to the P2P-VE overlay, which
returns an initial list of AOI neighbors. The VE’s dimensions
and cell size are also obtained from a gateway server. The Obtain
Scene Descriptions procedure is then called.
Obtain Scene Descriptions (2, 4): The requesting node
determines the cells that its AOI covers, and uses the Request
for Data procedure to get the cells’ scene descriptions by
passing a scene request list made of cell IDs. Once the scene
descriptions are obtained and analyzed, the node requests
3D objects with the Obtain Objects procedure.
Obtain Objects (5, 6, 7): Visibility determination produces
a prioritized piece request list, consisting of (object ID, piece
ID, depended-piece ID) tuples, for any missing visible data.
Pieces are obtained according to their priorities and depen-
dencies via the Request for Data procedure, and stored to a
cache once downloaded. A view is rendered from the cache
according to the specified location, orientation, and scale of
each object in the scene descriptions.
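A minimal sketch of how a prioritized piece request list with dependencies might be filtered before requests are issued. The tuple layout follows the (object ID, piece ID, depended-piece ID) form given above; the `PieceRequest` type, the `requestable_pieces` function, and the use of `None` to mark a base piece are assumptions.

```python
from typing import NamedTuple, Optional


class PieceRequest(NamedTuple):
    object_id: int
    piece_id: int
    depends_on: Optional[int]  # piece ID in the same object; None for a base piece


def requestable_pieces(request_list, cache):
    """Return, in priority order, the missing pieces whose dependency
    (if any) is already in the cache. `cache` is a set of
    (object_id, piece_id) pairs."""
    ready = []
    for req in request_list:
        if (req.object_id, req.piece_id) in cache:
            continue  # already downloaded
        if req.depends_on is None or (req.object_id, req.depends_on) in cache:
            ready.append(req)
    return ready
```

Pieces returned together have no unmet dependencies, so they could be requested concurrently; pieces held back become requestable once the piece they depend on arrives.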
Request for Data (3): If the local cache does not have
the desired data, requests are sent to the data source nodes
(composed of current AOI neighbors and the gateway server),
according to a certain peer selection policy. The actual data
exchanges are governed by a certain piece selection policy. As
the gateway server is part of the pool, requests will eventually
go to the server if peers fail to respond.
Move (1): A node moves by sending a user-generated
position update to the overlay, which forwards the update
to other AOI neighbors. Any new neighbors discovered via
the overlay will become part of the data source nodes. If the
node enters certain new cells whose scene descriptions are
unknown, Obtain Scene Descriptions is invoked.
Logout: A node simply disconnects from all neighbors
when leaving the system. As the system is distributed, failure
or departure of any single user node will not affect the system’s
operation. Other nodes will learn about the departure from the
overlay via an updated neighbor list.
The above procedures describe the general steps when
browsing a 3D scene. However, specific policies are still
needed for various streaming tasks, which are discussed below:
Content Discovery: Before a peer can request data, it
must first know which neighbors possess the desired con-
tent. Some methods include: 1) request data from neighbors
simultaneously; 2) request data from neighbors sequentially;
and 3) query the neighbors first, and send requests later. The
first option has the least latency, but is vulnerable to multiple
responses. The second option uses the least bandwidth, but
incurs more delays due to multiple attempts. We therefore
query the neighbors first, and only request the data from the
neighbors that respond positively.
Fig. 5. Screenshot of the 3D streaming client.
Peer Selection: Once a peer knows which other neighbors
possess a certain data piece, it could send a request to a
chosen peer. The choice of peer can either be random or based
on certain criteria (e.g. remote peer’s bandwidth capacity or
proximity). Our current choice is to pick neighbors randomly.
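The query-first discovery policy combined with random peer selection can be sketched as below. The `has_piece` callback stands in for the network query and its positive or negative reply; it and `select_peer` are hypothetical names, and real queries would of course be asynchronous messages rather than function calls.

```python
import random


def select_peer(piece, neighbors, has_piece, rng=random):
    """Query every neighbor for `piece`, then pick one positive
    responder at random. Returns None if no neighbor holds the piece."""
    holders = [n for n in neighbors if has_piece(n, piece)]
    return rng.choice(holders) if holders else None
```

A `None` result corresponds to the case where the requester must keep querying or, when permitted, fall back to the server.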
Piece Selection: For obtaining pieces, the request order need
not be strictly sequential as the piece dependencies for 3D
objects may follow that of a tree (e.g. geometry images)
or a forest (e.g. progressive meshes). For simplicity, we
currently adopt a sequential request policy for data pieces.
Server Request Condition: One key question for P2P delivery
is under what conditions clients should request data from the
server. The answer impacts both the number of requests to a
server, and the responsiveness for clients to obtain data. A
basic approach is to ask the server whenever other clients
cannot respond to requests. However, this could overload the
server with requests, especially when many clients are
joining concurrently. We thus allow a client to request from
the server only if it becomes the nearest node to an object.
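One way to read the nearest-node condition is sketched below. It assumes that "nearest" is judged against the positions of the client's currently known AOI neighbors and that a tie allows the request; neither detail is specified in the text, and `may_request_from_server` is a hypothetical name.

```python
import math


def may_request_from_server(my_pos, neighbor_positions, object_pos):
    """True if no known neighbor is strictly closer to the object than we are."""
    my_dist = math.dist(my_pos, object_pos)
    return all(my_dist <= math.dist(p, object_pos) for p in neighbor_positions)
```

Under this reading, of several clients that all lack an object, only the one closest to it contacts the server, and the others can later obtain the data from that client.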
Caching: We choose a cache size that will store roughly three
times the expected amount of data within an AOI. The farthest
objects from the user’s position are replaced first when the
cache is full.
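A minimal sketch of the farthest-first replacement policy, assuming eviction works at whole-object granularity and is triggered when a store pushes the cache over its capacity. The `DistanceCache` class and its methods are hypothetical.

```python
import math


class DistanceCache:
    """Object cache that evicts the objects farthest from the user first."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.entries = {}  # object_id -> (position, size_bytes)

    def used_bytes(self):
        return sum(size for _, size in self.entries.values())

    def store(self, object_id, position, size_bytes, user_pos):
        """Insert an object, then evict farthest objects until the cache fits."""
        self.entries[object_id] = (position, size_bytes)
        while self.entries and self.used_bytes() > self.capacity_bytes:
            farthest = max(
                self.entries,
                key=lambda oid: math.dist(self.entries[oid][0], user_pos),
            )
            del self.entries[farthest]
```

Following the sizing rule in the text, `capacity_bytes` would be set to roughly three times the expected data volume within one AOI.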
IV. PROTOTYPE IMPLEMENTATION
In order to demonstrate how FLoD may be used in actual
scenarios, we implement a prototype based on geometry image
streaming. Fig. 5 shows a screenshot of the 3D streaming
client, where two scenes are shown with multiple objects.
In this section, we describe the partition, fragmentation, pri-
oritization, selection, and object reconstruction techniques,
followed by the results of a LAN experiment. Prefetching as
noted earlier is not yet considered.
Partition: We adapt a small virtual village (a 3D Studio Max
scene file) from an actual game demo as the basic cell and
convert it into an X3D file. 3D objects are extracted from
the X3D file and converted to geometry images using a
geometry image extension for X3D, so that the X3D file
may serve only as the scene description. To create the full VE,
we duplicate the village 100 times by translating and tilting its
coordinates to fit on a larger plane until the total data volume
approximates that of a real game scene. The original data for
a cell is 514 KB (with scene description and models), and the
final VE is 51.8 MB. We use this approach because a large game
scene is hard to find. It should be noted, however, that it is common
for MMOG developers to build scenes in cell-based units,
so the approach partly resembles how MMOGs or VEs
are constructed today.
TABLE III. PROTOTYPE LAN EXPERIMENT STATISTICS (rows: total time (ms), send size (b), recv size (b), base latency (ms), avg AOI neighbor)
Fig. 6. Fragmentation of 3D models into geometry images.
Fragmentation: As the scene descriptions contain only the
unique IDs and bounding boxes of the objects, we use JPEG
2000 to store the actual models as geometry images due to its
support for multiple image layers. To facilitate P2P transmissions,
the images are further fragmented into pieces with JPIP
using the following criteria: 1) Resolution: as rendering
distant objects with a large number of polygons is unnecessary,
the image is divided into several resolution levels.
Each level, except the highest one, represents a simplified
version of the original 3D model. 2) View-dependency: when
a user is looking at the front of an object, transmission of
its rear-side data can be postponed. Each resolution level thus
is also divided into blocks that correspond to different surface
patches on the reconstructed model. Fig. 6 shows the resulting
geometry images of this procedure, and the corresponding
reconstructed models. In our prototype, each image is divided
into 5 resolution levels, each of which is further divided into
blocks of 8x8 pixels. Level 0 corresponds to the base piece
and other blocks correspond to the refinement pieces.
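To make the piece structure concrete, the sketch below counts the blocks per resolution level under two assumptions that are not stated in the text: the geometry image is square with a 128-pixel side, and each coarser level halves the side length. It is a simplified counting model, not a description of how JPEG 2000 or JPIP actually organizes data; `piece_layout` is a hypothetical name.

```python
def piece_layout(full_size=128, levels=5, block=8):
    """Return (level, side_in_pixels, number_of_blocks) per resolution level,
    assuming a square image whose side halves at each coarser level."""
    layout = []
    for level in range(levels):
        side = full_size >> (levels - 1 - level)
        blocks_per_side = max(1, side // block)
        layout.append((level, side, blocks_per_side ** 2))
    return layout
```

Under these assumptions, level 0 is a single 8x8 block, which matches the role of the base piece, and the finest level contributes 256 blocks of refinement data.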
Prioritization: As mentioned in Section III, after joining the
P2P network, each node first requests the scene descriptions
to perform distributed visibility determination. After scene descriptions
are obtained, prioritization then generates a piece request
list for locally unavailable data, based on each requested
object’s optimal resolution that fulfills the user’s perception
requirements. Here we adopt the concept of visual importance
in Cyberwalk to determine the optimal resolution, where
objects that are nearer to the viewer or closer to the center
of the field of view have higher visual importance. However,
our method differs from Cyberwalk
as follows: 1) Since each client has the scene description,
the calculation of visual importance is done by each client
instead of the server, lowering the server’s computational load. 2)
An estimate of recent bandwidth utilization is used to decide
how many data requests should be sent. A bandwidth quota is
also imposed on each requested object based on the object’s
visual importance, so that more important objects can utilize
a larger share of the available bandwidth.
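The sketch below illustrates the two ideas in this paragraph: an importance score that grows as an object gets nearer and more central in the view, and a bandwidth quota proportional to that score. The scoring formula is an invented placeholder, not Cyberwalk's actual definition, and both function names are hypothetical.

```python
import math


def visual_importance(viewer_pos, view_dir_rad, obj_pos, aoi_radius):
    """Placeholder score in [0, 1]: closeness times centrality (assumed formula)."""
    dx = obj_pos[0] - viewer_pos[0]
    dy = obj_pos[1] - viewer_pos[1]
    distance = math.hypot(dx, dy)
    if distance > aoi_radius:
        return 0.0
    if distance == 0:
        return 1.0
    # Absolute angle between the view direction and the object, in [0, pi].
    angle = abs((math.atan2(dy, dx) - view_dir_rad + math.pi) % (2 * math.pi) - math.pi)
    closeness = 1.0 - distance / aoi_radius
    centrality = 1.0 - angle / math.pi
    return closeness * centrality


def bandwidth_quotas(importances, budget_bytes):
    """Split a bandwidth budget across objects in proportion to importance."""
    total = sum(importances.values())
    if total == 0:
        return {oid: 0.0 for oid in importances}
    return {oid: budget_bytes * w / total for oid, w in importances.items()}
```

The budget passed to `bandwidth_quotas` would come from the client's estimate of recent bandwidth utilization, which this sketch leaves outside its scope.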
Selection: Once the networking layer receives the piece request
list, peer and piece selection then takes place to fulfill the
requests. We rely on asking the AOI neighbors to provide the
relevant 3D content. For each piece on the request list, a query
is first sent to all AOI neighbors to check for data availability.
Actual requests for object pieces are sent to a randomly chosen
neighbor that responds positively. For a given neighbor, there
are at most five ongoing requests at a time; new requests
are sent only after the previous requests have finished. This
ensures that neighbors with higher transfer rates can service
more requests. If none of the neighbors has a desired piece, the
requester will keep querying until the server request condition
described in Section III is met.
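The per-neighbor limit on ongoing requests can be sketched as a simple window. The limit of five is taken from the text; the assumption here is that a new request may go out as soon as any one earlier request to that neighbor finishes, which is one possible reading of the rule. `RequestWindow` is a hypothetical class.

```python
from collections import defaultdict

MAX_ONGOING_PER_NEIGHBOR = 5  # limit stated in the text


class RequestWindow:
    """Tracks ongoing piece requests per neighbor and enforces the limit."""

    def __init__(self):
        self.ongoing = defaultdict(set)  # neighbor -> set of piece IDs in flight

    def try_send(self, neighbor, piece):
        """Record the request and return True if the neighbor has a free slot."""
        if len(self.ongoing[neighbor]) >= MAX_ONGOING_PER_NEIGHBOR:
            return False
        self.ongoing[neighbor].add(piece)
        return True

    def on_finished(self, neighbor, piece):
        """Free the slot once the piece has been fully received."""
        self.ongoing[neighbor].discard(piece)
```

Because slots free up faster at neighbors with higher transfer rates, such neighbors naturally end up serving more requests over time.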
Object Reconstruction: To reconstruct a model for rendering,
we first expand the JPEG 2000 image from cache to a certain
resolution level, depending on the visual importance of the
object. For each object, we can obtain 6 grayscale images,
which correspond to the x, y, z coordinates and the x, y, z components
of the normal vector of each vertex. We then use a remesh
algorithm to rebuild the 3D model. When a user node
moves close to the boundary of the current cell, the node will
request new scene descriptions of the nearby cells, and the
procedures of prioritization and selection are repeated.
A. LAN Experiment
Our experiment with the prototype involves setting up a
Linux-based server that loads the initial scene data into memory
and responds to client requests as needed. To conduct the
experiment, we set up eight computers on a 100 Mbps LAN
to act as clients. During a forty-minute session, users log in
to the system and navigate around continuously to explore
the scenes. Statistics on 34 sessions regarding the clients’
performance are collected and shown in Table III.
As can be seen from the data, each client stays within the
scene for roughly three minutes per session, and has three
known neighbors on average. As clients could request data
from their neighbors, the average server request ratio (SRR)
(i.e. the ratio of the received data that comes from the server) is
about 41.7%. The average upload and download transmissions
of clients are roughly the same, indicating that most clients
serve their peers’ data needs well.
V. SIMULATION EVALUATION
The prototype demonstrates the feasibility of using P2P
streaming to save server resource usage. However, to see how
FLoD works on a larger scale requires simulation evaluations.
The main purpose of the simulation is to compare the scala-
bility and streaming quality between a P2P and a client-server
approach of 3D streaming. In this section, we present our
simulation metrics, setup and results regarding the scalability,
streaming quality, and limitations of FLoD.
A. Performance Metrics
To evaluate FLoD’s effectiveness, the following metrics are used:
Bandwidth usage: A fundamental requirement for scalable
systems is that resource usage at each system component (i.e.
server or client) is bounded without exceeding the compo-
nent’s capacity. Otherwise an overloaded component may fail
or degrade its service quality. Bandwidth usage at all nodes and
the server thus are important indicators for system scalability.
Fill ratio: 3D streaming aims to achieve a visual quality
matching that of locally stored content. We can measure the
ratio of data volumes between the client’s obtained data and
visible data (according to the server’s storage), to estimate a
client’s capability of rendering a view.
Base latency: We define the time between the initial query
and the time a base piece becomes available at a client as base
latency. It serves as an indicator for how soon a user can start
meaningful navigation when entering a new scene.
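The two streaming-quality metrics reduce to a ratio and a time difference. The following Python sketch is illustrative only: the function names, the byte and millisecond units, and the clamping of the ratio to 1.0 are assumptions, not part of FLoD.

```python
def fill_ratio(obtained_bytes: float, visible_bytes: float) -> float:
    """Ratio of the data a client holds to the data visible in its view."""
    if visible_bytes <= 0:
        return 1.0  # assumption: an empty view counts as fully filled
    # Assumption: clamp to 1.0 so that surplus data does not inflate the ratio.
    return min(obtained_bytes / visible_bytes, 1.0)


def base_latency_ms(query_time_ms: float, base_piece_time_ms: float) -> float:
    """Time from the initial query until a base piece is available."""
    return base_piece_time_ms - query_time_ms


# Hypothetical values for illustration.
example_fill = fill_ratio(obtained_bytes=7_500, visible_bytes=15_000)
example_latency = base_latency_ms(query_time_ms=1_000, base_piece_time_ms=1_600)
print(example_fill, example_latency)
```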
B. Simulation Setup
A custom discrete-time simulator is used for our simulations,
which proceed in time-steps of 100 ms each. Up to n
nodes are put inside the simulator to process and exchange
messages with other nodes at each step, under the following
pre-specified bandwidth limits: 1 Mbps download and 256
Kbps upload for typical broadband clients, and a 10 Mbps
symmetric connection for the server. Constant latency is also
assumed between all nodes: each message sent is
received in the next step, unless the transmission time is
prolonged due to bandwidth limits. The simulator runs on top
of VAST, an implementation of the P2P-VE overlay VON
(http://vast.sourceforge.net). To constrain the bandwidth used by
the overlay, a connection limit is also set for each node so as to
restrict the number of connected AOI neighbors.
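The stated bandwidth limits can be turned into per-step transfer budgets. The Python sketch below is a rough illustration, not the simulator’s code: the helper names are hypothetical, 1 Mbps is taken as 1000 Kbps, and the delivery rule (at least one step, more when the link is the bottleneck) is an assumed reading of the constant-latency model.

```python
import math

STEP_MS = 100  # simulation time-step, from the setup above


def kb_per_step(rate_kbps: float, step_ms: int = STEP_MS) -> float:
    """Convert a link rate in kilobits/s to a per-step budget in kilobytes.

    Assumes 1 Mbps = 1000 Kbps and 8 bits per byte.
    """
    return rate_kbps / 8.0 * step_ms / 1000.0


def steps_to_deliver(size_kb: float, budget_kb_per_step: float) -> int:
    """Steps until a message arrives: at least one, more if bandwidth-bound.

    Assumption: the link is otherwise idle and the budget is used in full.
    """
    return max(1, math.ceil(size_kb / budget_kb_per_step))


client_down = kb_per_step(1000)    # 1 Mbps download
client_up = kb_per_step(256)       # 256 Kbps upload
server_link = kb_per_step(10_000)  # 10 Mbps symmetric

print(client_down, client_up, server_link)
print(steps_to_deliver(20.0, client_down))
```

Under these assumptions a message larger than one step’s budget is delayed by additional steps, which is how bandwidth limits would translate into latency.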
To set up a VE, a number of objects are placed randomly on
a 2D map that is partitioned into square cells. For simplicity,
we assume that one set of data pieces is enough for each
object’s reconstruction. Based on the data size in our prototype
experiment, each object is set to 15 KB, where the base piece
is 3 KB and 10 refinement pieces are 1.2 KB each. The scene
descriptions are around 300 to 500 bytes each. To run the
simulation, a number of nodes are put randomly within the
VE, and stay at their joining locations until the system’s
average fill ratio exceeds 99%. This gives each node an initial
set of data to share. The nodes then move at constant speeds
using random way-points, and request scene descriptions
or data pieces as needed. All simulations proceed for 3000
steps, which is equivalent to 300 seconds at 100 ms per
step. Specific simulation parameters are shown in Table IV.
TABLE IV. Simulation parameters: world dimension (units); cell size (units); overlay connection limit; number of nodes (100 - 1000, in increments of 100); number of objects; node speed (units / step); client cache size (KB).
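The object size and run length quoted above can be checked with a few lines of arithmetic, and the random placement of objects on a cell-partitioned map can be sketched similarly. In the Python sketch below, the piece sizes and step counts come from the text, while the world dimension, cell size, object count, and helper names are hypothetical values chosen only for illustration.

```python
import random

BASE_PIECE_KB = 3.0
REFINEMENT_PIECES = 10
REFINEMENT_PIECE_KB = 1.2
STEP_MS = 100
TOTAL_STEPS = 3000

# One base piece plus all refinement pieces make up one object.
object_kb = BASE_PIECE_KB + REFINEMENT_PIECES * REFINEMENT_PIECE_KB
duration_s = TOTAL_STEPS * STEP_MS / 1000


def cell_of(x: float, y: float, cell_size: float) -> tuple:
    """Index of the square cell containing point (x, y)."""
    return int(x // cell_size), int(y // cell_size)


def place_objects(count: int, world_dim: float, cell_size: float, seed: int = 0) -> dict:
    """Scatter objects uniformly at random and group their ids by cell."""
    rng = random.Random(seed)
    cells = {}
    for object_id in range(count):
        x, y = rng.uniform(0, world_dim), rng.uniform(0, world_dim)
        cells.setdefault(cell_of(x, y, cell_size), []).append(object_id)
    return cells


# Hypothetical parameters: 1000-unit world, 100-unit cells, 500 objects.
cells = place_objects(count=500, world_dim=1000.0, cell_size=100.0)
print(object_kb, duration_s, len(cells))
```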
C. Simulation Results
Fig. 7(a) shows a time-series of the server’s upload for both
a client-server (C/S) system and FLoD during a 1000-node
run. After an initial period of waiting for the fill ratio to reach 99%,
the FLoD server’s upload increases as nodes begin to move, and
then stabilizes. As we are interested in the system’s
steady-state behavior, the rest of the discussion is based
on statistics collected during each simulation’s last 2000 steps.
Scalability The basic requirement for scalable systems is that
resource usage is bounded at all relevant system nodes. In
the context of a streaming system, this requirement means
that both the server’s and the clients’ bandwidth usage should
be bounded. Fig. 7(b) shows the upload
bandwidth for both a C/S server and a FLoD server. As the
bandwidth limit is 10 Mbps for the server, the C/S server’s
bandwidth is exhausted at 1250 KB/s when serving over 200
nodes. On the other hand, a FLoD server’s upload stays
relatively constant below 50 KB/s. The reduction in server-side
bandwidth is explained in Fig. 7(c), which shows that
the upload and download bandwidth usages of FLoD clients
converge, indicating that as the system scales (i.e. the number
of AOI neighbors increases), FLoD clients can become
self-sufficient in serving data to each other. In contrast, C/S
clients are each rationed less and less of the server’s bandwidth, and
their download sizes continuously decrease. The results also
show that FLoD clients are functional in today’s broadband
environment given our assumed content size and user behavior.
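The rationing of C/S clients follows from simple division once the server’s link is saturated. The Python sketch below is a back-of-the-envelope illustration that assumes the server’s upload is split evenly among clients and ignores protocol overhead; the function name is hypothetical.

```python
SERVER_UPLOAD_KB_S = 10_000 / 8  # 10 Mbps expressed in KB/s (1250.0)


def cs_share_kb_s(num_clients: int) -> float:
    """Upper bound on each C/S client's download once the server saturates.

    Assumption: the server's upload is split evenly among clients.
    """
    return SERVER_UPLOAD_KB_S / num_clients


for n in (200, 500, 1000):
    print(n, cs_share_kb_s(n))
```

Under these assumptions the per-client share falls from 6.25 KB/s at 200 clients to 1.25 KB/s at 1000 clients.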
Although FLoD significantly reduces and bounds server-side
bandwidth usage, client-side usage still grows logarithmically
as the number of users increases, which indicates that the system
still has a scalability limit. However, additional analysis reveals
that the increase is mostly due to the P2P-VE overlay, whose
bandwidth usage grows logarithmically with node density
under a connection limit. Bandwidth used by FLoD in fact
remains relatively constant, which should be expected given
our uniform object distribution and constant node speed (i.e.
the new content required by each node per second remains
constant on average). As the bandwidth growth depends on
user density, the total number of users can still grow scalably
if density is controlled. Note however that 1000 nodes in a
1000x1000 area is already a scenario with high user density.
This shows that P2P-based 3D streaming is fundamentally
more scalable than client-server approaches, as it prevents both
the server and the clients from becoming resource bottlenecks.
Fig. 7. Bandwidth usage comparisons (average transmission size per node per second). (a) server upload time-series (1000 nodes) (b) server upload (c) client upload and download
Fig. 8. Streaming quality comparisons. (a) fill ratio (b) base latency
Streaming Quality We use fill ratio and base latency to
measure the streaming quality of the system. Fig. 8(a) shows
the fill ratio for both C/S and FLoD clients. After an initially
high fill ratio (while the server still has enough bandwidth),
the fill ratio of C/S clients drops to 48% at 200 nodes, and then
declines continuously. On the other hand, a FLoD client has
a relatively stable fill ratio above 94%, independent of the number
of nodes. The degradation in service quality for C/S clients can
also be seen in Fig. 8(b), where their base latency is lower
than that of FLoD clients at 100 nodes, but jumps to 3 seconds and
then increases linearly afterwards. In contrast, FLoD
clients’ base latency remains relatively stable below 600 ms.
Limitations The basic assumption of FLoD is that clients can
obtain data from neighbors with shared visibility. An interesting
question thus is what happens if AOI neighbors do not exist.
To answer this question, we perform another set of simulations
to observe how the server request ratio changes as the number
of nodes varies from 2 to 512, with all other parameters fixed.
Fig. 9(a) shows that the server request ratio decreases as node
density (and hence the number of AOI neighbors) increases.
When there are hardly any AOI neighbors, most requests are
sent to the server, subjecting the system to constraints similar
to those of client-server systems. This indicates that FLoD
is functional only if sufficient AOI neighbors exist.
Even when there are AOI neighbors, what happens if the
amount of data required exceeds what the peers can provide?
In another set of experiments using 500 nodes, we reduce
the client’s upload capacity from 64 KB/s to 48, 32, 16 and
8 KB/s, to see how this affects both the fill ratio and
bandwidth usage. Note that by lowering the upload capacity,
we are essentially investigating the effect of data density on
FLoD’s performance. Fig. 9(b) shows that the fill ratio remains
high down to 32 KB/s, but decreases as the client’s upload capacity
gets smaller. This indicates that the amount of data needed
by peers within a period must be matched by the peers’ upload
capacities; otherwise either the server starts to receive
excessive requests, as shown by the increase in the server’s upload
size, or the service quality experienced by peers declines.
In another round of 500-node simulations, we examine how cache
size affects the streaming quality and bandwidth usage by
varying the cache size from 0.5 to 5 times the expected
content within an AOI (estimated to be 132 KB on average).
Fig. 9(c) reveals that the fill ratio degrades and the server’s
bandwidth usage surges if the cache stores less than one
AOI’s worth of content. The cache is adequate if it is twice
the AOI content, but fill ratio and bandwidth usage improve
only slightly beyond that, indicating that a larger cache is
unnecessary and improves performance only marginally.
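To give a sense of scale for the cache sizes tested, the multiples of the AOI content can be converted into kilobytes and whole objects. The Python sketch below uses the 132 KB AOI estimate and the 15 KB object size from the text; the function name is hypothetical, and counting only whole objects (ignoring scene descriptions and partially cached objects) is a simplifying assumption.

```python
AOI_CONTENT_KB = 132.0  # estimated average AOI content, from the text
OBJECT_KB = 15.0        # per-object size used in the simulation


def cache_capacity(multiple: float) -> tuple:
    """Return (cache size in KB, whole objects that fit) for a given multiple."""
    cache_kb = multiple * AOI_CONTENT_KB
    # Assumption: only complete objects are counted.
    return cache_kb, int(cache_kb // OBJECT_KB)


for multiple in (0.5, 1, 2, 5):
    print(multiple, cache_capacity(multiple))
```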
Fig. 9. Limitations of FLoD. (a) effect of node density (b) effect of upload capacity (c) effect of cache size
VI. RELATED WORK
Schmalstieg and Gervautz first introduced scene streaming,
where a server determines and transmits visible objects at
different levels of detail (LODs) to clients. Subsequent work
replaces discrete LODs with continuous (smooth) LODs.
Teler and Lischinski used pre-rendered image-based impostors
as the lowest LOD to allow faster initial visualizations.
Cyberwalk adopts progressive meshes to avoid the data
redundancy from sending multiple LODs, and focuses on
caching and prefetching to enhance visual perception. Deb
and Narayanan propose a geometry streaming system that
maintains interactive frame rates by adaptive data selection
according to the client and network conditions. Social
MMOGs such as ActiveWorlds, There.com, and Second Life
utilize scene streaming to support dynamic content, but little
is known about their mechanisms. Our work complements all the
above client-server work with distributed delivery. Cavagna
et al. recently introduced a level-of-detail description tree
for visibility determination to serve hierarchically-organized
urban scenes in a P2P environment. A few peer selection
strategies based on proximity or estimated content on peers
are evaluated. A self-regulated request strategy is also used
to support fair data distribution. However, its scalability and
the effect of the overlay have not yet been evaluated.
VII. CONCLUSION
We have formulated a conceptual model for P2P-based 3D
scene streaming by identifying the main tasks as the partition
of scenes, the fragmentation of objects, the prefetching of
potentially visible objects, the prioritization of transmission
order, and the selection of peers and pieces for delivery. We
have also presented FLoD, the first P2P 3D scene streaming
framework in which a P2P-VE overlay is used to discover
relevant peers to exchange content. We show the feasibility of
FLoD with a prototype, and how it achieves better scalability
by bounding the server’s and the clients’ bandwidth usage even
under today’s broadband conditions. An open source implementation
of FLoD is available at: http://ascend.sourceforge.net.
A number of directions exist for future work. For example,
the current design requires sufficient AOI neighbors,
yet nodes beyond the AOI may also possess relevant content,
so the matching of supply and demand may not be optimal.
We assume linear piece dependency, yet considering
non-linear piece dependency may provide better download
parallelism. We also have not investigated prefetching in depth,
but it is essential for any streaming scheme to be effective.
Real-time 3D content has yet to find its way to most Internet
users in spite of years of effort. While challenges remain
in areas such as format standards and the ease of content
creation, content streaming may effectively address the delivery
problem. 3D streaming on P2P networks thus is a topic
of interest to both graphics and networking professionals. By
identifying the basic issues, we hope to generate interest in
this promising direction for more accessible 3D content.
REFERENCES
Y. Cui et al., “oStream: Asynchronous streaming multicast in application-layer overlay networks,” IEEE JSAC, vol. 22, no. 1, pp. 91–106, 2004.
N. Magharei and R. Rejaie, “PRIME: Peer-to-peer receiver-driven mesh-based streaming,” in Proc. INFOCOM, 2007.
E. Teler and D. Lischinski, “Streaming of complex 3D scenes for remote walkthroughs,” EUROGRAPHICS, vol. 20, no. 3, 2001.
S.-Y. Hu, “A case for 3D streaming on peer-to-peer networks,” in Proc. Web3D, 2006, pp. 57–63.
S. Singhal and M. Zyda, Networked Virtual Environments: Design and Implementation. ACM Press, 1999.
J. Sahm et al., “Efficient representation and streaming of 3D scenes,” Computers & Graphics, vol. 28, no. 1, pp. 15–24, 2004.
H. Hoppe, “Progressive meshes,” in SIGGRAPH, 1996, pp. 99–108.
S.-Y. Hu et al., “VON: A scalable peer-to-peer network for virtual environments,” IEEE Network, vol. 20, no. 4, pp. 22–31, 2006.
P. Rosedale and C. Ondrejka, “Enabling player-created online worlds with grid computing and streaming,” Gamasutra Resource Guide, 2003.
J. Chim et al., “Cyberwalk: A web-based distributed virtual walkthrough environment,” IEEE TMM, vol. 5, no. 4, pp. 503–515, 2003.
N.-S. Lin, T.-H. Huang, and B.-Y. Chen, “3D model streaming based on JPEG 2000,” IEEE Trans. on Consumer Electronics, vol. 53, no. 1, 2007.
J.-E. Marvie et al., “Remote rendering of massively textured 3D scenes through progressive texture maps,” in Proc. VIIP, 2003, pp. 756–761.
J. Kim, S. Lee, and L. Kobbelt, “View-dependent streaming of progressive meshes,” in Proc. SMI’04, 2004, pp. 209–220.
S. Deb and P. J. Narayanan, “Design of a geometry streaming system,” in Proc. ICVGIP, 2004, pp. 296–301.
X. Gu, S. J. Gortler, and H. Hoppe, “Geometry images,” ACM TOG (SIGGRAPH 2002), vol. 21, no. 3, pp. 355–361, 2002.
D. Taubman and R. Prandolini, “Architecture, philosophy, and performance of JPIP,” in Proc. ISVCIP, 2003, pp. 791–805.
D. Schmalstieg and M. Gervautz, “Demand-driven geometry transmission for distributed virtual environments,” Computer Graphics Forum, vol. 15, no. 3, pp. 421–433, 1996.
G. Hesina and D. Schmalstieg, “A network architecture for remote rendering,” in Proc. DIS-RT, 1998, p. 88.
R. Cavagna, C. Bouville, and J. Royan, “P2P network for very large virtual environment,” in Proc. VRST, 2006, pp. 269–276.