
Monitoring VoIP call quality using improved simplified E-model



ITU-T Recommendation G.107 introduced the E-model, a repeatable way to assess whether a network is prepared to carry a VoIP call. Various studies show that the E-model is complex, with too many factors to be practical for monitoring purposes. Consequently, simplified versions of the E-model have been proposed that reduce the calculations and focus on the most important factors required for monitoring call quality. In this paper, we propose a simple correction to a simplified E-model; we show how to calculate the correction coefficients for 4 common codecs (G.711, G.723.1, G.726 and G.729A), and then, by implementing the corrected model in a monitoring application, we show that its predictions better match PESQ scores.
Haytham Assem, David Malone
Hamilton Institute, National University of Ireland,
Jonathan Dunne, Pat O’Sullivan
Systems and Performance Engineering, IBM Dublin,
Software Lab
Keywords-VoIP; E-model; PESQ; Monitoring
The evaluation of data networks depends on several factors, so it is argued that a single metric is not appropriate for evaluating their quality. Yet in the telephony world, a single number is typically given to rate call quality, and that value is used as a basis for monitoring and tuning the network. Voice over Internet Protocol (VoIP) is an example of such a data network application [1].
In recent years, VoIP has become an important application and is expected to carry more and more voice traffic over TCP/IP networks. In real-time voice applications, speech quality is impaired by packet loss, jitter, delay and limited bandwidth. Consequently, VoIP applications require low delay, low packet loss rates, low jitter and sufficient bandwidth so that the interaction between call participants is not affected.
VoIP runs over IP networks; however, IP networks typically provide only best-effort service and may not guarantee bounds on delay, packet loss and jitter [2]. Predicting voice quality in different environments and under different traffic loads is therefore an important part of network monitoring, in order to measure voice quality and prevent critical problems before they occur.
As measuring voice quality is important to both service providers and end users, ITU-T provides two kinds of test method: subjective and objective testing. Subjective testing was the earliest approach to evaluating speech quality and yields Mean Opinion Scores (MOS). The MOS test is one of the most widely accepted tests for rating speech quality. ITU-T Rec. P.800 [3] presents the MOS test procedures, in which users rate the speech quality on a scale from 1 (Bad) to 5 (Excellent). The number of listeners is an important factor in obtaining accurate scores. Thus, subjective testing using MOS is time consuming and expensive, and it does not allow real-time measurement. Consequently, in recent years new methods have been developed for estimating MOS scores objectively (without human listeners): PESQ [4], the E-model [5] and several others.
PESQ (Perceptual Evaluation of Speech Quality) is an objective method for predicting speech quality. It is an intrusive testing method that takes two signals as input: the reference signal and the degraded signal actually received. Both signals are passed through the PESQ algorithm, and the result is a PESQ score. Because the reference signal is required, this approach cannot be used to monitor real-time calls.
ITU-T G.107 [5] defines another objective method, the E-model, a mathematical model that combines all the impairment factors that affect voice quality into a single metric, the R value, which is then mapped to the MOS scale. The E-model was designed to estimate network quality and has been shown to be reasonably accurate for this purpose, but it has not been accepted as a valid measurement tool for live networks. ITU-T Recommendation G.107 [5] states at the outset that its output should be considered only as an estimate for transmission planning purposes and not as a prediction of actual customer opinion. PESQ [4], in contrast, was developed to model the subjective tests commonly used in telecommunications to assess voice quality as perceived by human beings.
Increasingly, and against ITU recommendations, the E-model is being used by industry and researchers as a live voice quality measurement tool. Simplified versions of the E-model [1, 6] have therefore been proposed to reduce the complexity of the original E-model [5] and focus on the most important factors that affect VoIP call quality.
The objective of our work is to provide a monitoring system that uses a simplified version of the E-model, corrected for 4 common codecs to better predict PESQ MOS scores, since PESQ is generally considered to provide more accurate predictions of user experience than the E-model.
This paper is organized as follows. Section 2 describes the proposed improved simplified E-model. In Section 3 we show how we derived the correction coefficients used in the improved simplified E-model. In Section 4 we present the results obtained by implementing the derived model in a monitoring application. Finally, we conclude and summarize the paper in Section 5.
In this section, we first give a brief description of the simplified E-model [6]. We then describe our proposed improvements to it, together with the method of calculating the various parameters used in the model so that it can be applied for monitoring purposes.
A. Simplified E-Model
The original E-model is very complex [7] and involves many factors. Moreover, the voice processing is not significantly related to the instantaneous judgment of QoS. Thus, a simplified version of the E-model [6] has been introduced to focus on the most important parts, and it was afterwards used in a monitoring system [2]. This model takes into account the codec and the current network conditions, which are the two main factors that affect voice quality. The simplified E-model computes the evaluation value R as expressed in equation (1).
$$R = R_0 - I_{codec} - I_{packetloss} - I_{delay} \tag{1}$$
where $R_0$ represents the basic signal-to-noise ratio, $I_{delay}$ represents the delay introduced from end to end, $I_{codec}$ is the codec factor and $I_{packetloss}$ is the packet loss rate within a particular period. Finally, the R value is mapped to a MOS score.
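As a minimal sketch (in Python, not the Java implementation used in this work), equation (1) can be written as a single function. The default R0 = 93.2 is the value adopted later in this paper; the impairment values in the usage line are illustrative placeholders, not values taken from G.113 or from our experiments.

```python
def simplified_r(i_codec, i_packetloss, i_delay, r0=93.2):
    """Simplified E-model, equation (1): R = R0 - Icodec - Ipacketloss - Idelay."""
    return r0 - i_codec - i_packetloss - i_delay


# Placeholder impairments chosen only to exercise the function.
r = simplified_r(i_codec=10.0, i_packetloss=5.0, i_delay=2.4)
print(round(r, 1))  # prints 75.8
```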
B. Improved simplified E-Model
The objective of this model is to determine the voice quality MOS rating using a simplified, modified version of the E-model described above. The computational model is a mathematical function of the parameters of the transmission system. The computation can be split into several elements and is expressed by equation (2).
$$R = R_y - I_{delay} + A \tag{2}$$
where $R_y$ is a second-order function corrected using curves fitted to PESQ scores (PESQ being the standard objective method defined by ITU-T Recommendation P.862 [4]), $I_{delay}$ is the average delay time within a specified period and A is the expectation factor due to the communication system. The description and method of calculating the parameters ($R_y$, $I_{delay}$ and A) in (2) are as follows:
1) $R_y$:
As mentioned above, $R_y$ is a second-order function corrected with PESQ scores to obtain more accurate results in our monitoring system. $R_y$ can be expressed by equation (3).
$$R_y = a R_x^2 + b R_x + c \tag{3}$$
where $R_x$ is the part of the simplified E-model (1) that is corrected with PESQ scores; it can be obtained from expression (4), and a, b, c are the codec coefficients shown in Table I and derived in Section III.
$$R_x = R_0 - I_{codec} - I_{packetloss} \tag{4}$$
Table I. Codec coefficients a, b and c for G.711, G.723.1 (5.3k), G.726 (24k) and G.729A
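The structure of the improved model can be sketched as follows, assuming it takes the form R = Ry - Idelay + A with Ry = a*Rx^2 + b*Rx + c and Rx = R0 - Icodec - Ipacketloss. The coefficient triples passed in below are hypothetical placeholders, not the Table I values, and the function name is ours.

```python
def improved_r(i_codec, i_packetloss, i_delay, coeffs, r0=93.2, advantage=0.0):
    """Improved simplified E-model, assuming the form of equations (2)-(4)."""
    a, b, c = coeffs                       # per-codec correction coefficients
    r_x = r0 - i_codec - i_packetloss      # uncorrected part, equation (4)
    r_y = a * r_x ** 2 + b * r_x + c       # second-order correction, equation (3)
    return r_y - i_delay + advantage       # equation (2)


# Hypothetical coefficients: (0, 1, 0) leaves Rx uncorrected, so the result
# coincides with the simplified E-model of equation (1).
IDENTITY = (0.0, 1.0, 0.0)
print(round(improved_r(10.0, 5.0, 2.4, IDENTITY), 1))  # prints 75.8
```

With the identity triple the correction is a no-op, which gives a convenient sanity check before real fitted coefficients are plugged in.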
1.1) $R_0$:
$R_0$ is the basic signal-to-noise ratio, including noise sources such as circuit and room noise. However, it is currently very difficult to calculate directly. ITU-T G.113 [8] therefore provides a common value for $R_0$. The inherent degradation that occurs when converting actual spoken conversation to a network signal and back reduces the theoretical maximum R value with no impairments (94.2) to 93.2 [5], so we set $R_0$ to 93.2.
1.2) $I_{codec}$:
$I_{codec}$ is the equipment impairment (codec quality) factor as defined in [8] and [9]. It represents the codec distortion, that is, the voice distortion and impairments arising from signal conversions. Its value is determined by looking up the codec in ITU-T Recommendation G.113 [8], of which Table II is an excerpt.
1.3) $I_{packetloss}$:
$I_{packetloss}$ is the packet loss percentage within a particular period, measured over a certain number of packets. A packet is counted as lost when it is sent by the sender but not received by the receiver. The percentage can be expressed by formula (5).
$$I_{packetloss} = \frac{DS - N}{DS} \times 100\% \tag{5}$$
where DS is the number of packets expected, obtained from the difference between the largest and smallest sequence numbers of the N packets received. Statistics gathered from the Real-time Transport Protocol (RTP) packets can be used to calculate DS by expression (6).
$$DS = LS - SS + 1 \tag{6}$$
where LS and SS are the largest and smallest sequence numbers respectively. They are extracted from the sequence number field of the RTP header of the captured packets.
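The packet loss calculation can be sketched as below. This is an illustration under two stated assumptions: the loss percentage is taken as (DS - N) / DS * 100, with N the number of distinct packets received, and the 16-bit RTP sequence number does not wrap around within the capture window. The function name is ours.

```python
def packet_loss_percent(seq_numbers):
    """Estimate packet loss (%) from the RTP sequence numbers of received packets."""
    received = set(seq_numbers)            # count each sequence number once
    if not received:
        raise ValueError("no packets captured")
    # Expected packet count, equation (6): DS = LS - SS + 1.
    ds = max(received) - min(received) + 1
    # Assumed loss ratio: packets expected but not seen, over packets expected.
    return 100.0 * (ds - len(received)) / ds


# Sequence numbers 102, 105 and 106 never arrived: 3 of 8 expected packets lost.
print(packet_loss_percent([100, 101, 103, 104, 107]))  # prints 37.5
```

Packets lost before the first or after the last captured packet are invisible to this estimate, which is one reason to capture a reasonably large window.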
2) $I_{delay}$:
The delay components contributing to $I_{delay}$ in ITU-T G.107 [5] are $T_a$, the average absolute one-way mouth-to-ear delay; T, the average one-way delay from the receive side to the point in the end-to-end path where signal coupling occurs as a source of echo; and $T_r$, the average round-trip delay in the 4-wire loop. G.107 [5] gives a fully analytical expression for the function $I_{delay}$ in terms of $T_a$, T, $T_r$ and parameters associated with a general reference connection describing various circuit-switched and packet-switched inter-working scenarios. Assuming perfect echo cancellation, all these delay factors can be collapsed into a single value as shown in (7), so that $I_{delay}(d)$ is a function only of the one-way delay d.
$$d = T_a = T = \frac{T_r}{2} \tag{7}$$
$I_{delay}(d)$ can be calculated by a series of complex equations in ITU-T G.107 [5]; the resulting curve of $I_{delay}$ versus one-way delay is plotted in Fig. 1.
The one-way delay d is the time it takes to get data across the network. Measured from one end of the network to the other, it is mainly composed of four components, as expressed in equation (8).
$$d = t_0 + t_1 + t_2 + t_3 \tag{8}$$
where $t_0$ is the propagation delay, $t_1$ is the transport delay, $t_2$ is the packetization delay and $t_3$ is the jitter buffer delay. In this paper we approximate these four components by measuring the response time (round-trip delay), since in most modern devices $t_1$ and $t_2$ should be small. Thus, ping should be adequate for this measurement.
In our model we used the simplified expression (9) provided in [10]. This model is accurate for one-way delays of less than 400 ms, as shown in Figure 1 (labeled "AT&T simplified model"). We found this model reasonable, as ITU-T recommends that the one-way delay should not exceed 150 ms for good speech quality [11].
$$I_{delay} = 0.024\,d + 0.11\,(d - 177.3)\,H(d - 177.3) \tag{9}$$
where d is the one-way delay in milliseconds and
$$H(x) = \begin{cases} 0, & x < 0 \\ 1, & x \ge 0 \end{cases}$$
Figure 1. $I_{delay}$ versus one-way delay
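The delay impairment can be sketched as follows, using the simplified expression from [10], Idelay = 0.024 d + 0.11 (d - 177.3) H(d - 177.3) with d in milliseconds. Halving a ping round-trip time to obtain d is our assumption here (it presumes a roughly symmetric path); both function names are ours.

```python
def delay_impairment(d_ms):
    """Delay impairment for a one-way delay d (ms), simplified model from [10]."""
    # H(x) is the unit step: 0 for x < 0, 1 for x >= 0.
    step = 1.0 if d_ms - 177.3 >= 0 else 0.0
    return 0.024 * d_ms + 0.11 * (d_ms - 177.3) * step


def one_way_delay_from_rtt(rtt_ms):
    """Approximate the one-way delay as half the measured round-trip time.

    Assumption: forward and return paths contribute equally.
    """
    return rtt_ms / 2.0


d = one_way_delay_from_rtt(300.0)        # 150 ms one way
print(round(delay_impairment(d), 3))     # prints 3.6
```

Below 177.3 ms the impairment grows slowly and linearly; above it the second term adds a steeper penalty, which matches the knee visible in curves of this kind.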
3) A:
The advantage factor A represents an "advantage of access", introduced into transmission planning for the E-model (ITU-T G.107) [5]. This value can be used directly as an input parameter to the E-model. Provisional A values are listed in [5], as shown in Table III. Assuming that our communication system is conventional, we neglect A.
Table III. Provisional A values [5]
Communication System | Maximum value of A
Conventional (wire bound) | 0
Mobility by cellular networks in a building | 5
Mobility in a geographical area or moving in a vehicle | 10
Access to hard-to-reach locations, e.g., via multi-hop satellite connections | 20
The R value of the E-model is finally transformed into a MOS score that reflects the user's level of satisfaction, as shown in Table IV. The transmission performance rating factor R theoretically ranges from 0 to 100, where R=0 represents the worst quality and R=100 the best. The estimated average MOS for a given R value is expressed by equation (10).
$$MOS = \begin{cases} 1, & R < 0 \\ 1 + 0.035R + 7 \times 10^{-6}\, R\,(R - 60)(100 - R), & 0 \le R \le 100 \\ 4.5, & R > 100 \end{cases} \tag{10}$$
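Table IV. User satisfaction level by R value
R value | Satisfaction Level
90-100 | Very satisfied
80-90 | Satisfied
70-80 | Some users dissatisfied
60-70 | Many users dissatisfied
50-60 | Nearly all users dissatisfied
0-50 | Not recommended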
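The R-to-MOS mapping can be sketched as a small function. It uses the standard G.107 conversion, MOS = 1 + 0.035 R + 7e-6 R (R - 60)(100 - R) for R between 0 and 100, clamped to 1 below that range and 4.5 above it; the function name is ours.

```python
def r_to_mos(r):
    """Map an E-model R value to an estimated MOS (G.107 conversion)."""
    if r < 0:
        return 1.0
    if r > 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


# R0 = 93.2 with no further impairments is the best case of the simplified model.
print(round(r_to_mos(93.2), 2))  # prints 4.41
```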
In this section we show how we derived the values of a, b
and c (Table I) used in our improved simplified E-model
described in the previous section.
In our experiment, shown in Figure 2, we developed a Java program that streams RTP packets using 4 main audio codecs (G.711, G.726, G.723.1 and G.729A). We recorded the voice at both ends and measured the PESQ scores under random packet loss rates ranging from 0% to 20%. For each packet loss rate, we repeated the experiment 10 times and took the average PESQ MOS score in order to increase the accuracy of the results as much as possible.
Figure 2. Deriving the codec coefficients a, b and c
The PESQ scores are converted from MOS to R values. This can be done with the complicated Cardano's formula as in [12] or with the simplified third-order polynomial fit [13] shown in (11).
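For illustration only, the MOS-to-R conversion can also be done numerically. The sketch below is neither Cardano's formula nor the polynomial fit of [13]: it inverts the standard G.107 R-to-MOS mapping by bisection, restricted to R in [6.52, 100], where that mapping is monotonically increasing and covers MOS values from about 1 to 4.5. Function names and the tolerance are our choices.

```python
def r_to_mos(r):
    """G.107 conversion from R to MOS, clamped outside 0..100."""
    if r < 0:
        return 1.0
    if r > 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


def mos_to_r(mos, lo=6.52, hi=100.0, tol=1e-6):
    """Invert r_to_mos by bisection on the interval where it is increasing."""
    if not 1.0 <= mos <= 4.5:
        raise ValueError("MOS must lie between 1.0 and 4.5")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if r_to_mos(mid) < mos:
            lo = mid          # target lies in the upper half
        else:
            hi = mid          # target lies in the lower half
    return (lo + hi) / 2.0


# Round trip: converting R = 75 to MOS and back recovers R up to the tolerance.
print(round(mos_to_r(r_to_mos(75.0)), 3))  # prints 75.0
```

Bisection trades a closed form for simplicity; about 27 iterations suffice for the default tolerance, which is negligible next to the cost of a PESQ run.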
The PESQ scores converted using (11) are the $R_y$ values plotted on the y axis of the graphs below (Fig. 3 and Fig. 4). Since PESQ does not take the delay factor into account, we correct only the remaining part of the model, which we name $R_x$ (see equation 3) and plot on the x axis.
We found that the relationship is well matched by a second-order function, and we then derived the coefficients a, b and c in Table I using the least-squares fitting method. The graphs below (Figure 3 and Figure 4) show the correlation between the values converted from PESQ and the R values from the simplified E-model for the individual codecs at different loss rates.
Figure 3. Relationship between $R_x$ and $R_y$ (G.723.1 and G.711)
Figure 4. Relationship between $R_x$ and $R_y$ (G.726 and G.729A)
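The least-squares step can be sketched in plain Python by solving the normal equations for y = a x^2 + b x + c. This is a generic illustration of the fitting method, not the tool actually used to produce Table I, and the data points in the usage lines are synthetic.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c)."""
    # Power sums of x and moment sums of x^k * y for the normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system in the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    n = 3
    for col in range(n):
        # Partial pivoting, then eliminate this column from every other row.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(n))


# Synthetic points lying exactly on y = 2x^2 - 3x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x * x - 3 * x + 1 for x in xs]
a, b, c = fit_quadratic(xs, ys)
```

For x values spread over a range such as 0 to 100, subtracting the mean of x before fitting improves the numerical conditioning of the normal equations.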
The monitoring system can target a specific number of RTP packets to capture and then performs an effective MOS calculation. The system uses a coefficient database to look up the coefficients for the codec used in the call. This monitoring system was developed for monitoring VoIP quality at the network terminals, and the environment could be a personal or family network with voice quality monitoring.
The whole system works as follows. The system uses a network capture module to capture a certain number of packets sent to a specific IP address and port. Non-RTP packets are filtered out. When the packet capture completes, the system analyzes the data to obtain the delay and packet loss rate as described in Section II. The MOS score is then calculated to assess the voice call quality in this period of the call. We obtained our results online by introducing random packet loss into the network using Dummynet [14].
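The filtering and header-extraction step can be sketched as follows. This is an illustration rather than the capture module described above: it takes a UDP payload as bytes, applies the usual heuristic that an RTP packet has at least a 12-byte header with version 2 in the top two bits, and returns the payload type and sequence number. The payload-type table lists the static assignments from RFC 3551; G.726 is normally negotiated as a dynamic payload type and is therefore omitted. Names are ours.

```python
import struct

# Static RTP payload types (RFC 3551) for three of the codecs considered here.
STATIC_PAYLOAD_TYPES = {
    0: "G.711 (PCMU)",
    8: "G.711 (PCMA)",
    4: "G.723.1",
    18: "G.729",
}


def parse_rtp_header(payload):
    """Return (payload_type, sequence_number), or None if not RTP version 2."""
    if len(payload) < 12:                  # fixed RTP header is 12 bytes
        return None
    first, second, seq = struct.unpack("!BBH", payload[:4])
    if first >> 6 != 2:                    # version field must be 2
        return None
    return second & 0x7F, seq              # drop the marker bit from byte 1


# A hand-built header: version 2, payload type 18, sequence number 4660.
packet = bytes([0x80, 18]) + struct.pack("!H", 4660) + bytes(8)
payload_type, seq = parse_rtp_header(packet)
print(STATIC_PAYLOAD_TYPES.get(payload_type, "dynamic or unknown"), seq)
```

The version check alone can accept non-RTP traffic whose first byte happens to match, so restricting capture to the negotiated IP address and port remains the primary filter.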
We compared the MOS scores of our monitoring system, based on the codec coefficients (see Table I) derived for the 4 main codecs, with those of the simplified version of the E-model used for monitoring purposes [1, 6] and with the PESQ scores. The graphs (Figure 5 to Figure 8) show our results for the 4 codecs. It can be observed that the MOS scores of our improved simplified E-model based on the coefficient database (Table I) are very close to the PESQ scores, unlike those of the simplified E-model, which gives the corrected model an advantage for monitoring VoIP call quality.
Figure 5. Comparative Analysis (G.723.1)
Figure 6. Comparative Analysis (G.711)
Figure 7. Comparative Analysis (G.726)
Figure 8. Comparative Analysis (G.729A)
The E-model brings a new approach to the computation of estimated voice quality. The main advantage of the E-model is that it is an objective, non-intrusive method that can be applied in real time. Contrary to the ITU recommendation, simplified versions of the E-model have been introduced by researchers and industry for monitoring and predicting VoIP call quality.
Consequently, we have proposed an improved simplified E-model and shown how we derived the coefficients used in the model for 4 common codecs (G.711, G.723.1, G.726 and G.729A). We demonstrated its results by implementing it in a monitoring system; our system analyzes the impact of voice encoding factors under various network conditions and uses our improved simplified E-model to assess voice quality. The main advantage of our improved simplified version is that it is less complex than the original E-model and more accurate than the simplified versions.
We stress three outcomes of our work. First, as confirmed by the experiment, the simplified version of the E-model does not provide accurate results compared to PESQ scores. Second, the derived correction coefficients enhance the ability of the simplified E-model to monitor and predict call quality. Third, we propose a complete design of a monitoring system using our improved simplified E-model for 4 common codecs. Another output of our work is a Java application that streams RTP packets using a number of codecs.
The authors were supported by Science Foundation Ireland
(SFI) grants 07/SK/I1216a and 08/SRC/I1403.
[1] John Q. Walker, "Assessing VoIP Call Quality Using the E-model", NetIQ Corporation.
[2] Junsheng Zhang and Xiaohua Sun, "The VoIP phone QoS protection in the wide-area network", Computer Learning, 2006, no. 6, pp. 17-18.
[3] ITU-T Recommendation P.800, "Methods for subjective determination of transmission quality", Geneva, 08/1996.
[4] ITU-T Recommendation P.862, "Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs", February 2001.
[5] ITU-T Recommendation G.107, "The E-model: A computational model for use in transmission planning", Geneva, 04/2009.
[6] Chunlei Jiang and Peng Huang, "Research of Monitoring VoIP Voice QoS", International Conference on Internet Computing and Information Services, 2011.
[7] Psytechnics Limited, "The E-Model, R Factor and MOS", 23 Museum Street, Ipswich, Suffolk, United Kingdom, December 2003.
[8] ITU-T Recommendation G.113, "Transmission impairments due to speech processing", 2001.
[9] ITU-T Recommendation P.833, "Methodology for derivation of equipment impairment factors from subjective listening-only tests", 2001.
[10] R. G. Cole and J. Rosenbluth, "Voice over IP performance monitoring", ACM Comput. Commun. Rev., vol. 31, no. 2, pp. 9-24, April 2001.
[11] S. Pracht and D. Hardman, "Voice Quality in Converging Telephony and IP Networks", Agilent Technologies, White Paper.
[12] C. Hoene, H. Karl, and A. Wolisz, "A perceptual quality model for adaptive VoIP applications", Int. Symp. Performance Evaluation of Computer and Telecommunication Systems (SPECTS'04), San Jose.
[13] L. Sun, "Speech Quality Prediction for Voice over Internet Protocol Networks", Ph.D. dissertation, Univ. Plymouth, UK, Jan. 2004.
[14] Marta Carbone and Luigi Rizzo, "Dummynet Revisited", ACM SIGCOMM Computer Communication Review, vol. 40, no. 2, April 2010, pp. 12-20.