A study of overheads and accuracy for efficient monitoring of wireless mesh networks.
ABSTRACT IEEE 802.11-based wireless mesh networks are being increasingly deployed in enterprise and municipal settings. A lot of work has been done on developing measurement-based schemes for resource provisioning and fault management in these networks. The above goals require an efficient monitoring infrastructure to be deployed, which can provide the maximum amount of information regarding the network status, while utilizing the least possible amount of network resources. However, network monitoring involves overheads, which can adversely impact performance from the perspective of the end user. The impact of monitoring overheads on data traffic has been overlooked in most of the previous works. It remains unclear how parameters such as the number of monitoring agents, or the frequency of reporting monitoring data, among others, impact the performance of a wireless network. In this work, we first evaluate the impact of monitoring overheads on data traffic, and show that even small amounts of overhead can cause a large degradation in the network performance. We then explore several different techniques for reducing monitoring overheads, while maintaining the objective (resource provisioning, fault management, and others) that needs to be achieved. Via extensive simulations and experiments, we validate the efficiency of our proposed approaches in reducing overheads, their impact on the quality of data collected from the network, and the impact they have on the performance of the applications using the collected data. Based on results, we conclude that it is feasible to make the current monitoring techniques more efficient by reducing the communication overheads involved while still achieving the desired application-layer objectives.
Pervasive and Mobile Computing 6 (2010) 93–111
Fast track article
Dhruv Gupta∗, Daniel Wu, Prasant Mohapatra, Chen-Nee Chuah
Graduate Group in Computer Science, University of California Davis, Davis, CA 95616, United States
Article info
Received 13 February 2009
Received in revised form 2 July 2009
Accepted 13 July 2009
Available online 17 July 2009
Wireless mesh networks
© 2009 Elsevier B.V. All rights reserved.
1. Introduction
IEEE 802.11-based wireless mesh networks (WMNs) have witnessed a tremendous growth over the last few years [1–3].
A lot of work has been done in terms of understanding the behavior of these networks and analyzing the impact of various
factors, such as interference and number of hops, on their performance [4–6]. A lot of focus has also been on providing
Quality-of-Service (QoS) in WMNs and several schemes have been proposed for admission control and QoS-based routing
[7–10]. There have been some recent studies on developing measurement-based schemes for resource management and
fault management in WMNs [7,11]. These schemes usually involve measuring certain parameters from the network (such
as packet loss rate or signal quality), and utilizing this data for the purpose of QoS provisioning (for example routing
algorithms using loss rate as metric) or for fault management (for example avoiding links with low signal quality). WMNs
have also found wide applications in enterprise and municipal networks. Such networks need to be monitored constantly
that can provide them with up-to-date network statistics. This is not a trivial task and the challenges involved in network
management and diagnosis have been addressed in the past [12–14].
Both the above goals (QoS provisioning and network diagnosis) require an efficient monitoring framework that can
provide accurate statistics about the network in a timely manner. An online network management system would require
transmitting measurement data from various locations to a central server, or the exchange of data among various mesh
nodes. However, in most cases, the same links are used for carrying both the user traffic and the monitoring data. In the case of
802.11-based wireless networks, an out-of-band channel may not always be available for transferring measurement data
from the mesh nodes to a central server. Deploying dedicated monitoring sniffers, with each node having a connection
to the wired backbone, is cost-prohibitive and may not always be feasible. As a result, the transmission of measurement
information will contend with the data traffic and reduce the channel time available to the end users. In a multi-hop
wireless network, even small amounts of monitoring traffic can therefore have a large impact on the existing data traffic in the
network, resulting in performance degradation for the end users. Thus, reducing the amount of monitoring overhead in a
wireless network is an important goal.
framework. An ideal case would be where each node in the wireless mesh network also acts as a monitoring agent,
periodically reporting measurement data to the central server. Such an approach gives us the advantage of having an
data may introduce large amounts of overhead in the network. In order to achieve our objective of reducing monitoring
we consider two different application scenarios and show how different techniques lend themselves to each scenario:
• QoS provisioning: A municipal ISP may have different service classes for its users, where each service class offers certain
guarantees in terms of network performance (such as bandwidth and delay). In order to make sure that each user gets
the specified performance, it is necessary for the ISP to continuously monitor the network. For example, the ISP may
infer end-to-end delay of a particular path by collecting delay information from the wireless routers along that path, and
then utilize this information to provide delay guarantees to end users. In such a scenario, each wireless router needs
to periodically report a certain set of parameters to a central server, where this information will be analyzed. For such
applications, we propose two different monitoring solutions:
– Monitor Selection approach: In this approach, we propose to reduce the overheads by reducing the number of wireless
routers used as monitoring agents in the wireless mesh network. We use a vertex-cover algorithm to locate the mesh
nodes that should also serve as monitoring agents. This helps us to reduce the number of monitors in the wireless
network, while still being able to maintain complete link coverage.
– Reporting Interval approach: Our second approach involves decreasing the frequency with which we report data to the
central server. By sending out monitoring packets at longer intervals, we reduce the contention in the network.
• Network diagnosis & fault management: An enterprise network administrator may be interested in maintaining the
performance of the entire network above a certain level, as opposed to providing service guarantees to individual users.
An efficient monitoring framework is required to report any events that might indicate a degradation in the network
performance. For example, the administrator may be interested in monitoring the signal quality on different links, in
order to decide whether a link should be used or not. In such a scenario, the administrator may not be interested in
periodic reports, but would instead like to get a measurement value if and only if the signal quality of a link falls below
a threshold value. For such applications, we propose the following approach:
– Threshold-based monitoring: In this approach, the nodes serving as monitoring agents will transmit measurement data
only when a certain pre-defined event occurs, as opposed to a periodic framework that reports data on a continuous
basis. An example of an event would be a desired network parameter crossing a pre-defined threshold.
We propose these techniques not as a replacement for existing network monitoring solutions, but to complement these
schemes by making them more efficient in terms of overhead. However, a simple reduction in network overheads is not our
final goal. It is also crucial for us to investigate what impact this reduction has on the quality of the information that is
being collected from the network. This is important because if our accuracy of estimating network statistics goes down,
then this will impact the performance of the application that is using the collected data. Hence, for each of the above
techniques, we also evaluate the trade-off involved in terms of reduction in overheads and the accuracy and quality of
measurement data. This analysis is of utmost importance, as a reduction in overheads will not be desirable if it causes a
work , we performed some basic simulations to test our proposed approaches. In this work, we perform more extensive
simulations, and also validate our schemes via extensive experiments on our laboratory test-bed. Our main contributions can
be summarized as follows:
• We investigate the effects of monitoring overheads on the forwarding of user data traffic. We show that even small
amounts of monitoring traffic can result in increased delays and packet loss for the end user.
• We propose and evaluate three different approaches: (a) monitor selection approach, (b) reporting interval
approach, and (c) threshold-based monitoring, for reducing overheads in WMNs. Instead of using one common
solution, we propose the concept of application-based monitoring, which involves using different monitoring
solutions for different applications.
• Our most important contribution involves evaluating the trade-offs between estimation accuracy and reduction in
of the application using the data.
behind our work. Section 3 explains the proposed methodology. Section 4 gives the performance evaluation of the proposed
schemes, along with the simulation results. Section 5 outlines the results obtained from our experimental study. Section 6
concludes the paper.
2. Related work and motivation
2.1. Related work in wired networks
The problem of efficient monitoring in a wired network (such as the Internet backbone) has been studied in the past.
Several works have studied how to use different polling mechanisms for lowering overheads [16,17]. Other works have
looked at improving the performance of reactive monitoring in wired networks. In  and , the authors look at how
to combine global polling with local event-based reporting for reducing monitoring overheads. However, these works did
not consider the impact of using these mechanisms on the functionality to be achieved. Other works such as  and 
have considered the impact of reducing the frequency of routing updates on routing overheads, as well as on end user
performance. However, such works focus only on the impact of delaying or reducing routing updates, while our goal is to
reduce the amount of monitoring data and evaluate its impact on the performance of a broader class of applications.
Several recent works have studied this problem from the aspect of jointly reducing the number of monitors and
controlling the sampling rate at the monitors, in order to bring down the monitoring cost while maximizing the monitoring
coverage in terms of the number of flows monitored. In , the authors consider the problem of minimizing cost (sum
of deployment and monitoring cost) and maximizing coverage (in terms of monitoring reward) under various budget
constraints. In , the authors look at the problem of placing a small set of active beacons in the Internet topology. They
show that the problem is NP-hard for a BGP-like routing topology and present the upper and lower bounds for the number
of beacons needed for a given network. In , the authors present active monitor placement as a combinatorial problem
and present a mixed integer programming solution. They propose algorithms to both minimize the number of monitoring
beacons and the sampling rate. In , the authors consider the problem of placing monitors and setting the sampling rate.
Like the previous works, they show the problem to be NP-hard and present approximation algorithms to solve it.
the variations in the network traffic. They propose an approach where a monitor is assumed to be present at every network
node. The problem is to decide which monitors to activate and what sampling rate to set at each monitor, in order to achieve
a measurement task with high accuracy and low resource consumption.
Unlike the above works, our focus is on multi-hop wireless networks. The primary difference that arises between these
two scenarios is the definition of the ‘‘cost’’ involved in network monitoring. In wired networks, the focus has been to
minimize the deployment and maintenance cost of the monitors (especially given the large size of the networks), while the
communication overheads usually do not present a problem. Second, since the number of links and flows to be monitored is
huge, choosing a sampling rate along with the placement of monitors becomes crucial. Hence, all the above works approach
the monitor placement problem in wired networks as a joint problem of minimizing cost and maximizing coverage, and
have presented heuristic solutions. On the other hand, in wireless networks, communication overheads pose a greater
problem due to interference and limited available bandwidth. By controlling the measurement overheads, we can increase
the channel time available to the end users. Hence, instead of looking at deployment and maintenance costs, we define
the ‘‘cost’’ for monitoring wireless networks to be in terms of the accuracy of measurement. We evaluate the proposed
schemes by studying their impact on the accuracy with which we can monitor data, and the impact the schemes have on
the application using the measurement data. Moreover, due to the smaller size of wireless networks, and lower link speeds,
it is usually not required to set a sampling rate on a wireless monitor. Hence, wireless networks call for different
techniques for efficient monitoring.
2.2. Related work in wireless networks
Wireless networks present a widely different scenario than wired networks. A wireless mesh network does not compare
for each monitoring node. Also, wireless networks have highly dynamic characteristics (interference and varying link
quality), which should be taken into account during the deployment of the monitoring infrastructure. Several previous
works have proposed both active probe-based and passive monitoring techniques for wireless mesh networks. However,
these works do not focus on the impact that monitoring overheads have on the transmission of data traffic. For example,
routing schemes based on metrics such as ETX  and ETT  rely on periodic broadcasts to estimate the link quality.
However, the impact of overheads on data traffic has not been quantified in these works. In , authors have evaluated the
impact of overheads associated with ETT-based routing. It was shown that the active probes used by ETT-based routing
protocol contend with the data flows for channel access and result in reduced throughput for the end users. Passive
monitoring techniques such as those proposed in  and  provide a better alternative to active monitoring. However,
even with such approaches, the transmission of measurement data from the monitoring nodes to the central server will
add overheads, which is not accounted for in these works. In , the authors perform some basic analyses to show how
low-rate management and control traffic can severely degrade the end user’s throughput, and propose a metric to capture
this reduction in throughput. Our work extends these studies in two ways. We investigate the impact of various factors such
as reporting interval and size of monitoring packet on measurement overheads. We further perform a careful analysis of
how reducing overheads impacts the quality of measurement data being collected from the network and the performance
of the application using that data.
Some other works such as  (and references therein) have focused on mobile ad hoc networks, and have evaluated
the impact of collecting coarse-grained mobility information on QoS provisioning and overheads. Our focus is on collection
of network-performance-related metrics in static wireless mesh networks, and their impact on the application using the
As explained in the previous sections, the issue of efficient monitoring in multi-hop wireless networks is an important
problem. Our goal is to evaluate different techniques that can help reduce the volume of monitoring data in the network,
while achieving the desired performance and functionality. Lower monitoring traffic will translate to lower contention and
interference in a wireless network, thereby providing better end-to-end performance to the clients. However, we not only
evaluate techniques for reducing the monitoring traffic in the network, but we also evaluate their impact on the accuracy
of measurement information and performance of various applications. Our work is complementary to the existing network
monitoring approaches for wireless mesh networks. We do not propose a new monitoring scheme itself, but instead focus on
how these various monitoring techniques can be made more efficient by reducing the overheads involved in the collection and
transmission of measurement data, and on what impact this would have on the accuracy of the data.
3. Proposed methodology
Our conjecture is that different application scenarios will need different forms of monitoring. Based on the objective for which
the monitoring data is being used, different techniques for reducing the volume of measurement traffic can be used. As part
of this study, we consider two different application scenarios, namely QoS provisioning and Network Diagnosis (outlined in
Section 1). These are only two of the several different application scenarios (such as QoS routing, load balancing, admission
the volume of measurement data generated by the monitoring framework will directly impact the performance of the end
3.1. Monitor selection approach
For the first application of QoS provisioning, we consider QoS-based routing. We consider a delay-based routing
algorithm, where the objective is to find a path with minimum delay for each client. In order to achieve this, we need
to monitor each link in the network, and report the associated delays to the central management server. The central server
would then use this information to estimate the end-to-end delays for various network paths. By utilizing this information,
the central server can assign a path with the least delay to an incoming client.
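As a rough illustration of the central server's role described above, the following sketch takes reported per-link delays and selects the path with the smallest summed delay using Dijkstra's algorithm. The function name `min_delay_path`, the dictionary layout, and the sample delay values are assumptions made for illustration; the text does not specify how the server actually computes paths.

```python
import heapq

def min_delay_path(link_delays, source, target):
    """Return (total_delay, path) for the least-delay path, or (inf, []) if unreachable.

    link_delays maps an undirected link (u, v) to its reported delay in ms.
    """
    # Build an adjacency list from the reported link delays.
    graph = {}
    for (u, v), delay in link_delays.items():
        graph.setdefault(u, []).append((v, delay))
        graph.setdefault(v, []).append((u, delay))

    best = {source: 0.0}      # smallest known delay to each node
    previous = {}             # predecessor on the best known path
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, delay in graph.get(node, []):
            candidate = dist + delay
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))

    if target not in best:
        return float("inf"), []
    # Walk the predecessor chain back from the target to the source.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]

# Hypothetical per-link delays (ms) reported by the monitoring agents.
delays = {("gw", "a"): 4.0, ("gw", "b"): 1.0, ("b", "a"): 2.0, ("a", "c"): 5.0}
total, path = min_delay_path(delays, "c", "gw")
```

With these sample values, a client attached at node `c` would be routed through `a` and `b` rather than over the direct but slower `a` to `gw` link. Stale or missing link reports would change this choice, which is why the accuracy of the monitored data matters for this application.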
In an ideal case, every node in the wireless mesh network would also be used as a monitoring agent. Using such a
framework would enable us to collect continuous measurement data from every link in the wireless network. However,
such a framework would also introduce large volumes of monitoring traffic in the network, thereby adversely affecting the
transmission of data traffic. In order to reduce the monitoring overheads, we propose to limit the number of wireless routers used
for monitoring purposes, while still achieving the goal of delay-based routing.
We decided to evaluate the performance of a vertex-cover algorithm for this purpose. We use this algorithm to locate the
network sites to be used for monitoring purposes. A mesh network can be modeled as a graph G = (V,E), where V is the
set of nodes, representing the mesh access points, and E is the set of edges, representing the links between the mesh access
points. We want to choose a set of k nodes, from N nodes in the network, to be used for monitoring. The above problem is
similar to the vertex-cover problem in graph theory. For our problem, if we can find a vertex cover for our network, then we
have a set of nodes which we can use as monitoring agents. This would ensure that we cover all the links in the network for
the purpose of monitoring, while using the minimum possible number of nodes.
A simple approximation algorithm for the vertex-cover problem picks a random edge from the graph and adds
both endpoints of that edge to the vertex cover. It then removes all the edges incident on these two vertices, as they have been
covered, and repeats the process until no edges remain. The running time of this algorithm is O(V + E). This algorithm is a
polynomial-time 2-approximation algorithm . However, we should not select any random network site to be used as a monitoring agent.
The selection process should take into account some network characteristics. We include the effects of network topology
in the monitor selection decision. To do this, we use the vertex cover approximation algorithm that chooses vertices in
decreasing order of their degrees. The rationale behind this approach is that the vertex with the maximum degree would
reflect the node that has the maximum number of links with other nodes in the mesh network. Thus, by choosing the nodes
with higher degrees, we will be monitoring a larger number of links in the network.
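The two selection strategies discussed in this subsection can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the tie-breaking rule, and the sample topology are assumptions, and the degree-based variant shown here recomputes degrees after each pick, which is only one possible reading of "decreasing order of their degrees". Note also that the 2-approximation guarantee applies to the edge-picking variant; the degree-based greedy heuristic does not carry the same worst-case bound.

```python
import random

def edge_picking_cover(edges, seed=None):
    """2-approximation: pick an uncovered edge, add both endpoints, repeat."""
    rng = random.Random(seed)
    remaining = [tuple(e) for e in edges]
    rng.shuffle(remaining)
    cover = set()
    for u, v in remaining:
        # An edge is still uncovered only if neither endpoint was chosen yet.
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

def degree_greedy_cover(edges):
    """Heuristic: repeatedly pick the node that covers the most uncovered links."""
    uncovered = {frozenset(e) for e in edges}
    cover = set()
    while uncovered:
        degree = {}
        for link in uncovered:
            for node in link:
                degree[node] = degree.get(node, 0) + 1
        # Ties are broken by node label so that the result is deterministic.
        best = max(degree, key=lambda n: (degree[n], str(n)))
        cover.add(best)
        uncovered = {link for link in uncovered if best not in link}
    return cover

def covers_all_links(edges, cover):
    """Check that every link has at least one endpoint acting as a monitor."""
    return all(u in cover or v in cover for u, v in edges)

# Hypothetical 5-node mesh in which node "c" is a hub.
mesh_links = [("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
monitors = degree_greedy_cover(mesh_links)
random_monitors = edge_picking_cover(mesh_links, seed=1)
```

On this small topology the degree-based heuristic selects the hub `c` first, so two monitors suffice to observe all four links, whereas the edge-picking variant always adds nodes in pairs.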
3.2. Reporting interval approach
our goal is to implement a delay-based routing scheme that involves monitoring the per-hop delay at each mesh node.
An important parameter in any monitoring framework is the frequency at which we monitor data. Reporting data at a
high frequency (such as per-second basis or lower) enables us to maintain a more accurate image of the network. However,
We consider a basic monitoring infrastructure that transmits one monitoring packet every second from each mesh node. In
order to reduce the overheads, we can use a framework where nodes maintain an average of various parameters and report
data at longer intervals. For example, instead of transmitting one monitoring packet every second, we can transmit one
the reporting interval should be selected appropriately, so as to not impact the desired functionality. We investigate the
performance of our delay-based routing protocol using different reporting intervals and evaluate the trade-offs in terms of
overheads and accuracy of monitored data.
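The node-side behaviour described above, where a node maintains an average and reports at longer intervals, could look roughly like the sketch below. The class name `AveragingReporter`, the use of a plain arithmetic mean, and the sample values are assumptions for illustration only; the text does not state which averaging method is used.

```python
class AveragingReporter:
    """Accumulates samples and emits one averaged report per reporting interval."""

    def __init__(self, interval_samples):
        if interval_samples < 1:
            raise ValueError("interval_samples must be at least 1")
        self.interval_samples = interval_samples
        self._buffer = []

    def add_sample(self, value):
        """Record one measurement; return the average when a report is due, else None."""
        self._buffer.append(value)
        if len(self._buffer) < self.interval_samples:
            return None
        average = sum(self._buffer) / len(self._buffer)
        self._buffer.clear()
        return average

# Hypothetical per-second delay samples (ms) measured at one mesh node.
samples = [10.0, 12.0, 11.0, 40.0, 13.0, 10.0, 11.0, 12.0, 10.0, 11.0]
reporter = AveragingReporter(interval_samples=5)
reports = [r for r in (reporter.add_sample(s) for s in samples) if r is not None]
```

Ten samples produce only two monitoring packets instead of ten, but the 40 ms spike in the first window is smoothed into a single averaged value. This is the overhead versus accuracy trade-off that the choice of reporting interval controls.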
3.3. Threshold-based monitoring
For the second application of detecting performance degradations in the network, we propose to use a threshold-based
monitoring framework in order to reduce the volume of monitoring traffic. Our objective here is to report data to the central
server only when a certain event occurs. Specifically, in our case, we consider the traffic load on a node as an indication of
congestion. If this load value crosses a certain threshold, then it might indicate an onset of congestion, and the administrator
may want to route packets around that node. We wish to capture this particular event in the network. Ideally, we could use
the periodic reporting mechanism, where every wireless router continuously sends monitoring data to the central server.
For our case, such a framework would transmit data irrespective of the traffic load on the mesh node. However, this would
generate large amounts of overhead and add to the contention in the network. We thus evaluate the performance of the
threshold-based reporting mechanism to achieve our objective of reducing monitoring overheads while maintaining the
desired functionality of identifying network performance anomalies.
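A minimal sketch of event-driven reporting is given below, under the assumption that the event of interest is the load rising above the threshold. The function name `threshold_reports`, the normalised load values, and the threshold of 0.8 are hypothetical and are not taken from the evaluation.

```python
def threshold_reports(load_samples, threshold):
    """Yield (index, load) only when the load crosses above the threshold."""
    above = False
    for index, load in enumerate(load_samples):
        if load > threshold and not above:
            yield index, load  # upward crossing: report the event
        above = load > threshold

# Hypothetical normalised traffic load observed at one mesh node.
loads = [0.2, 0.5, 0.9, 0.95, 0.6, 0.85]
events = list(threshold_reports(loads, threshold=0.8))
```

A periodic scheme would send six reports for these samples, while this sketch sends two, one for each onset of high load. Reporting every sample above the threshold, or also reporting the downward crossing, are equally plausible variants with different overheads.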
performance guarantees to end users, in terms of throughput or delay, we need continuous and periodic measurements
from the network. A threshold-based scheme will only report data when a certain event occurs and may not be able to
provide the fine-grained measurements required for QoS provisioning. We could modify the threshold in order to report
data more frequently. However, this might lead to large overheads in the network. Second, consider the case where a
value at every occurrence of the event. As a result, the latter case might cause the QoS application to over-react to transient
events, resulting in other problems such as route-flapping. Hence, a threshold-based approach may not be suitable for QoS
provisioning, and we need different types of monitoring schemes for different applications.
4. Performance evaluation
In this section, we describe the details of our simulations, followed by an analysis of the proposed approaches.
4.1. Simulation setup
In order to evaluate our proposed scheme, we have used the QualNet simulator. We use three topologies for comparing
the performance of various schemes. The first one is a twenty-five node uniform grid topology generated in QualNet
(hereafter referred to as the GRID topology). The distance between the nodes was set to be 100 m, while the communication
range was set to 200 m. We used the two-ray path loss model for this topology. The center node was selected as the gateway
node. The second topology (Fig. 1(a)) is a fifteen node network, derived from the indoor testbed used in  (hereafter
referred to as the INDOOR topology). It consists of 802.11 b/g nodes spread across two floors in a department building. We
used the indoor terrain model library in QualNet to simulate the indoor building environment. The width of the simulation
environment was set to be the same as the dimensions of the actual building which houses the testbed (about 187 feet).
Node 8 in the figure served as the gateway node for this topology. The third topology is also a fifteen node mesh network,
based on our outdoor mesh network testbed (hereafter referred to as the OUTDOOR topology). This testbed  consists of