Exception Based Modeling and Forecasting
Igor Trubin, Ph.D., SunTrust Bank
How often does the need arise for modeling and forecasting? Should it be done manually, ad hoc, by project request, or automatically? What tools and techniques are best for that? When is a trending forecast enough, and when is a correlation with business drivers required? The answers to these questions are presented in this paper. The capacity management system should automatically provide a small list of resources that need to be modeled or forecasted; a simple spreadsheet tool can be used for that. This method is already implemented in the author's environment with thousands of servers.
1. Introduction
One of the essential jobs of a Capacity Planner is to
produce forecasts based on current and future
situation models. This type of job is a bit dangerous.
Just like a weather or stock market forecasting job, if
the prediction is correct nobody usually gives you
credit for it, but in case a prediction is wrong,
everybody notices and blames you!
My first and quite unpleasant experience with this
type of activity was during one of my dead-end
career paths when I was hired by a stock brokerage
company to develop a forecasting system. I
passionately tried to predict the unpredictable
market, and finally decided to give up because I
realized that the management did not really need
accuracy, but just nice pictures and graphs to
persuade the customers to buy something.
In spite of all those unpleasant side effects of
forecasting, I finally found this job to be fascinating
once I produced my very first precise forecast shown
on Figure 1. That was a rather simple model, based
on common benchmarks (TPM) to recalculate UNIX
box CPU usage for a few possible upgrade
scenarios.
After one of my recommendations was accepted, I
collected performance data on the newly upgraded
configurations and - bingo - the server worked as I
had predicted! That was a moment of truth for me,
but of course I did not get any special appreciation
from the customer.
(I have previously experienced similar excitement
during my teaching career when I would see some
sparks of understanding in the eyes of my students.
I hope all the readers of this paper have those
sparks as well.)
Figure 1 Old and Good Prediction Example
(Scanned Image from Personal Archive)
Since that first good forecast, our Capacity Planning
team kept a special focus on producing a different
type of forecasting/trending analysis, automating the
process as much as possible. The result was
automatically updated trend forecast charts for every
server for several metrics and most of the charts
looked nice and beautiful, but in most cases were
useless as you can see on Figure 2.
Figure 2 - Beautiful Trend-Forecast Chart
Why? Because in a good environment most of the production boxes are not trending much, and you could only enjoy this type of healthy future prediction. As a result, we had to scan all those thousands of charts manually (eyeballing) to select
only those that had real and possibly dangerous
trends.
Some of these experiences were discussed in previous CMG papers [1] and [2], where the authors used the SAS language to automate the production of resource projections from business driver inputs (see Figure 3). As noted there, “if projections are stored in a database, automatic ‘actual vs. projected’ comparisons can also be displayed.”
Figure 3 Business Driver Based Forecast
However, some challenges were reported [2]: “…the authors’ experience is that this approach appears to be more accurate for a single resource such as CPU. When applied to multiple file systems’ disk I/O, results may vary greatly…” And there were really great variations even for CPU resources! All in all, it appeared that the maintenance and usage of the “bulk” forecasting, with or without business driver correlations, becomes a nightmare.
Living with all those challenges, I dreamed of automating the process of selecting the really dangerous upward or downward trends of resource consumption, and finally discovered a way of doing so.
2. Basic Forecasting Rules
First of all, let’s review some basic and obvious rules for avoiding common mistakes in forecasting models:
A. Historical data should have the right
summarization, including data for
correlation (e.g. business drivers).
B. Do not mix shifts: forecasts should be done separately for working days/hours and for off-shifts.
C. The result depends on the statistical model
chosen.
D. The starting historical time point should take
into account the significant events such as
hardware upgrades; virtual machines,
databases and application migrations; LPAR
reconfigurations and so on.
E. “Bad” data points should be excluded from historical samples as “outliers”.
RULE A: “Right Summarization”. Data collection must provide very granular data (10-second to 15-minute samples at least). However, for analysis the data should be summarized by hour, day, week or month. It is not a good idea to try to produce a trend-forecast analysis against the raw, very granular data, even using good “time series” algorithms. On the other hand, if you have only hourly or daily snapshot-type data, the forecast could be very misleading even after summarization.
If correlation analysis is needed, all data (independent and dependent variables) should be normalized over the same interval, and usually the less granular the summarized data is, the better the correlation that can be found. In section 5 of this paper you can see an example of correlation analysis of CPU usage vs. Web hits that uses 5-minute interval data summarized by days, with a very good result (see Figures 16, 17 and 18).
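As an illustration of RULE A (not from the original paper, whose tooling is SAS and spreadsheet based), the summarization step can be sketched in a few lines; the timestamps and values below are made up:

```python
# RULE A sketch: average granular (e.g. 15-minute) samples by calendar day
# before any trending or correlation. Sample data are hypothetical.
from collections import defaultdict
from datetime import datetime

def summarize_daily(samples):
    """samples: iterable of (datetime, value); returns {date: daily average}."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts.date()].append(value)
    return {day: sum(vals) / len(vals) for day, vals in buckets.items()}

# A driver metric (e.g. Web hits) must be summarized over the SAME interval
# before correlating it with the resource metric.
cpu = summarize_daily([
    (datetime(2008, 1, 1, 0, 15), 40.0),
    (datetime(2008, 1, 1, 0, 30), 60.0),
    (datetime(2008, 1, 2, 0, 15), 80.0),
])
# cpu holds one average per day: 50.0 for 01/01 and 80.0 for 01/02
```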
RULE B: “Do Not Mix Shifts”. This is a pretty obvious rule. The real-data example in Figure 4 shows the difference between a trend forecast that includes weekends and one that excludes them. Note that the “no-weekends” forecast reaches the yellow zone sooner!
Figure 4 Trend forecast with and without
weekends
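RULE B amounts to partitioning the summarized history before fitting any trend. A minimal sketch, assuming a weekday/weekend split (a peak-hours vs. off-hours split would be handled the same way):

```python
# RULE B sketch: never fit one trend through mixed shifts; split first.
from datetime import date

def split_shifts(daily):
    """daily: {date: value}; returns (working_days, weekends) dicts."""
    workdays = {d: v for d, v in daily.items() if d.weekday() < 5}
    weekends = {d: v for d, v in daily.items() if d.weekday() >= 5}
    return workdays, weekends

daily = {date(2008, 3, 3): 30.0,   # a Monday
         date(2008, 3, 8): 5.0}    # a Saturday
work, offs = split_shifts(daily)   # then trend each series separately
```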
RULE C: “Statistical Model Choice”. Let’s leave the detailed discussion of this subject to real statisticians/mathematicians and here just formulate the basic rule: start with a linear trend. Play with other statistical algorithms if they are available, and use them only if absolutely necessary.
The chart in Figure 5 shows the same data with different future trends because three different algorithms were used. The figure shows that the 1st method tries to reflect weekly oscillation (seasonality), while the others try to keep up with the most recent trends; the last one is the most aggressive.
Figure 5 Different Statistical Forecasting
Algorithms Results Comparison
Using the SAS language it is easy to play with these models by changing the method and trend parameters on the following proc statement:

proc forecast
   data=<actual_dataset>
   interval=day lead=30
   method=<STEPAR|EXPO|WINTERS>
   trend=<1|2|3>
   out=<actual&predicted_dataset>;
run;

(method=stepar and trend=2 are the default values; they mean stepwise autoregressive with the “linear trend model”.)
RULE D: “Significant Events”. The standard forecast procedure (based on a time series algorithm) might work well where the history consistently reflects some natural growth. However, often due to upgrades, workload shifts or consolidations, the historical data consists of phases with different patterns. The forecasting method should be adjusted to take into consideration only the latest phase with a consistent pattern.
Figure 6 Trend Forecasts Based on Whole and
Partial Data History
For instance, if the history shown in Figure 6 began
in October instead of July, the future trend would be
more realistic as shown by the dashed lines on the
future side of the chart.
RULE E: “Outliers”. Unfortunately, a server can occasionally experience pathological workload events, such as run-away processes capturing all spare CPU resources, or memory-leak situations where the application consumes more memory than it can actually use.
No doubt these workload defects cause some problems for the server. Even if the system’s performance with a parasite process is still acceptable, the real resource usage is hidden and the capacity planning for resource usage becomes unpredictable. When the automatic future trend chart generator is used, historical data with pathologies causes inaccurate predictions, as shown on Figure 7. To improve the quality of trend-forecast analyses, such events, along with some unexpected outages, should be removed from the history as outliers.
Figure 7 Memory Leak Issue Spoiled the Forecast
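A crude approximation of RULE E can be sketched as a simple statistical trim (the paper's own system uses its pathology filters instead, so the three-sigma rule here is only an illustrative assumption):

```python
# RULE E sketch: drop points more than k standard deviations from the mean,
# e.g. run-away-process or memory-leak days, before fitting a trend.
from statistics import mean, stdev

def drop_outliers(values, k=3.0):
    """Return values with points beyond mean +/- k*stdev excluded."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= k * s]

history = [10.0] * 20 + [1000.0]   # one memory-leak-like spike
clean = drop_outliers(history)     # the spike is excluded
```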
3. Forecasting vs. Exception Detecting
About 7 years ago, the first manager of my first
Capacity Planning team faced a tough dilemma: he
needed to assign the following tasks, one to me and
one to another mathematically well educated
analyst:
- Forecasting System development, and
- Exception Detection System development.
Naturally, I dreamed of working on the first task, but unfortunately (or fortunately?) my manager assigned the second task to me because of my mechanical engineering background, in which the use of Statistical Process Control methodology is widespread.
My colleague (the co-author of my previous CMG paper [8]) did a great job developing a
forecasting system to produce trend-forecast and
business correlated forecast charts based on
performance data from hundreds of servers and
databases. His scripts are still used today to
generate high quality graphs. The charts shown in
Figures 3, 6 and 7 are based on code he wrote. I
learned from him how to produce similar charts and I
greatly appreciated his help.
He also implemented some of the basic forecasting
rules described above including RULE D
“Significant Events”. He developed scripts to
forecast when the future trend of the database used
space metric intersects with the “database current
allocation” metric. His script does this task three
times for short, medium and long history sample
data to show the best and the worst scenarios. As a
result, only a few problematic databases and table
spaces show up on his report which is automatically
colored yellow or red based on the future date
threshold intersections.
It is a known approach to mark a resource as potentially running out of capacity by using the intersection of its future trend with some obvious threshold. However, this approach does not work when the threshold is unknown. Below we will discuss another method that does not have this weakness.
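The threshold-intersection approach can be sketched as a least-squares trend solved against the threshold; the "database used space" numbers below are hypothetical, not from the paper:

```python
# Fit a least-squares line through a daily series, then solve
# trend(t) = threshold for t to get the predicted crossing day.
def linear_fit(ys):
    """Return (intercept, slope) of the least-squares line over days 0..n-1."""
    n = len(ys)
    mx, my = (n - 1) / 2, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in range(n))
    sxy = sum((x - mx) * (y - my) for x, y in enumerate(ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def days_until(ys, threshold):
    """Days after the last observation until the trend crosses threshold;
    None if the trend is flat or moving away from it."""
    intercept, slope = linear_fit(ys)
    if slope <= 0:
        return None
    return max(0.0, (threshold - intercept) / slope - (len(ys) - 1))

used_gb = [50.0 + d for d in range(30)]   # used space growing 1 GB/day
# days_until(used_gb, 100.0) gives the days left before a 100 GB allocation
```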
While my colleague was developing the forecasting
system, I worked on the Exception Detection System
using as a basis some of the CMG publications
starting with the first MASF paper [3]. The basic idea
I implemented was the following:
Take some representative historical reference data; set it as a baseline, and then compare it with the most recent actual data. If the actual data exceeds some statistical thresholds (e.g. the Upper (UCL) and Lower (LCL) Control Limits, the mean plus/minus 3 standard deviations), generate an exception (alert via e-mail) and build a control chart like the one shown on Figure 8 to publish on the web:
Figure 8 Exception Detection Control Chart
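The baseline-vs-actual comparison behind this kind of chart can be sketched as below, assuming the reference set is grouped into hour-of-week slots and the limits are the mean ± 3 standard deviations; the real system's grouping and filtering may differ:

```python
# MASF-style sketch: per-weekhour control limits from a reference set,
# then flag actual points falling outside their slot's [LCL, UCL] band.
from statistics import mean, stdev

def control_limits(reference, k=3.0):
    """reference: {weekhour 0..167: [historical values]} -> {weekhour: (lcl, ucl)}."""
    return {wh: (mean(vs) - k * stdev(vs), mean(vs) + k * stdev(vs))
            for wh, vs in reference.items()}

def exceptions(actual, limits):
    """actual: [(weekhour, value)]; return the points outside the band."""
    return [(wh, v) for wh, v in actual
            if not (limits[wh][0] <= v <= limits[wh][1])]

ref = {0: [10.0, 12.0, 11.0, 10.0, 11.0, 12.0]}   # one slot's history
alerts = exceptions([(0, 50.0), (0, 11.0)], control_limits(ref))
# only the 50.0 spike breaches the limits and would raise an alert
```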
This weekly control chart is a powerful tool. It shows
at a glance not only the last week’s worth of hourly
data, but also the daily and weekly profiles of the
metric’s behavior plus a historical view. It actually
shows you if the data are trending up or down and
when exactly the trend occurs. Gradually, I realized
that I had developed a trending analyzer in addition
to an Exception Detector!
While improving this approach, I also realized that
some of the basic forecasting rules already apply to
the Exception Detection System:
- RULE A: “Summarization” says that all
metrics and subsystems under the
Exception Detector should be consistently
summarized and the best summarization
level is a 6-8 month history of hourly data.
That, for instance, allows you to see where
system performance and business driver
metrics correlate simply by analyzing control
charts.
- RULE B: “Do Not Mix Shifts” is easily demonstrated by the weekly/hourly control chart, because it visualizes the separation of work or peak time and off time.
- RULE C: “Statistical Model Choice” means playing with different statistical limits (e.g. 1 standard deviation vs. 3 or more) to tune the system and reduce the rate of false positives.
- RULE D: “Significant Events” is another important tuning parameter of the system. RULE D is used to determine the depth of the reference set. Even with a constant reference-set depth (e.g. 6 months), the Exception Detector has the ability to adjust itself statistically to some events, because the historical period follows (moving forward with) the actual data, and every event will eventually be older than the oldest day in the reference set.
- RULE E: “Outliers” are easily found
statistically by the Exception Detector as all
workload pathologies are statistically
unusual. By adding some (non-statistical)
filters to the system, the most severe of
these pathologies could and should be
excluded from the history to keep the
reference set free from outliers. [4]
Finally, to increase the accuracy of the Exception
Detector and to reduce the number of the false
positive situations (false alerting), a new meta-metric
was added - ExtraValue (EV) of exception.
Geometrically speaking, it is the area between the
actual data curve (black line on a control chart) and
the statistical limit curves (red and yellow lines on
the control chart) (Figure 9). This metric is basically the magnitude of an exception and is equal to zero as long as the actual data fluctuates within statistically healthy boundaries. But if EV stays >0 or <0 for a while, that means there is an unusual growth (trending up) or drop (trending down), respectively.
(This metric was first introduced in a 2001 CMG
paper [7].) See APPENDIX for a mathematical
method for calculating EV.
Having this metric recorded to some exception
database allows for in-depth analysis of the system
behavior and some examples of this analysis were
published in another CMG paper [5].
But the most efficient use of this metric is to filter the
top most exceptional resources in terms of unusual
usage.
Publishing this top list in some way (e.g. bar charts
shown on Figure 9 or 16) along with links to control
charts significantly reduces the number of servers
that require the focus of Capacity Planning or
Performance Management analysts.
Figure 9 Top list of servers with unusual CPU
usage
4. Exception Based Forecasting
The Exception Detector provides a list of resources with highly unusual consumption. In so doing, the Exception Detector provides a targeted list and helps to apply all of the forecasting rules described above.
Here is the suggested method for implementing that
approach:
- The data for the Exception Detector and
forecasting system should be the same.
- The trend-forecast charts should be
generated only for resources listed in the
Exception Detector outputs.
- The data for trending analysis should be freed from outliers based on the Exception Detector pathology filters (e.g., free from run-away and memory-leak days or hours).
- The starting time point(s) in the historical
data for trending analysis can be found
based on exception database data with
ExtraValue (EV) metric records, as the most recent negative value of this metric indicates the time when the data actually started trending up.
To illustrate how this works, let’s look at a couple of
case studies with actual data. One day, server
number 9 has hit the exception list as shown on
Figure 9. Clicking on the control chart link, which the
web report should have on the same page, brings up
the control chart (shown on Figure 8). That indeed
shows some signs of exceptional server behavior:
- Some hourly exceptions occurred on
Monday.
- During the entire previous week the actual
CPU utilization was slightly higher than
average (green mean curve).
- On Friday the upper limit (red curve)
reached the 100% threshold for a few hours,
which indicates that in the past the actual
data might be at 100% level on other
Fridays; and Friday average curve is higher
than on the other days.
Based on this information, it is a good idea to look at
the historical trend. But which metric statistic is
better suited for that: daily average or average of
peak hour? Let’s look at Figure 10 where both
statistics are presented:
Figure 10 Trend Forecast Chart of Daily Average
vs. Daily Peak Hour Average CPU Utilization
Which presentation is better? It depends. The
common recommendation is to use the daily peak
for OLTP or web application servers and daily
average for back-up and other batch oriented
application servers.
But even looking at the daily-peak trend forecast, the future looks good. Why? Because RULE D is not applied and the entire history was used for forecasting. But the history is obviously more complicated, and it is a good idea to analyze only the last part of it. That can be seen clearly just by eyeballing the historical trend chart. Could that decision be made automatically? Certainly, if one looks at the history of the “ExtraValue” (EV) metric on Figure 12.
Note that the most recent negative value of
ExtraCPUtime metric (which is the EV meta-metric
derived from CPU utilization metric) points exactly to
the point of time when CPU utilization started to
grow. Basically, to find a good starting point for analyzing the history, one needs to find the roots of the following equation:

EV(t) = 0

where EV for this example is ExtraCPUtime (unusual CPU time used), a function of time t (in days). If the EV metric is recorded daily in some database, this equation can easily be solved by a simple program using one of the standard algorithms. The solution for the real-data example shown above is t =~ 04/22.
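In code, solving EV(t) = 0 for the most recent sign change reduces to scanning the recorded EV series backward for its last negative value; the daily EV values below are hypothetical:

```python
def trend_start(ev_series):
    """ev_series: daily EV values, oldest first. Return the index just
    after the most recent negative EV - the point where the history
    starts consistently trending up and is worth fitting a trend to."""
    for i in range(len(ev_series) - 1, -1, -1):
        if ev_series[i] < 0:
            return i + 1
    return 0   # no negative EV: use the whole history

ev = [0.0, 0.0, -2.0, 0.0, 0.0, 3.0, 5.0, 7.0]   # hypothetical ExtraCPUtime
start = trend_start(ev)   # trend only the history from this index on
```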
The final trend-forecast chart is shown on Figure 11.
It predicts that after 06/16 the server might be
running out of CPU capacity:
Figure 11 Corrected Trend Forecast
Figure 12 History of “ExtraValue” Metric (ExtraCPUtime) vs.
Daily Average of CPU Utilization
Another example is presented in Figures 13-15. It is
interesting that the EV metric somewhat mimics the
original metric but makes trending more obvious.
Figure 13 The Whole History Based Forecast
Figure 14 - ExtraTime data analysis
Some oscillations are seen around the most recent negative EV value, but those might be tuned out, as these cases use a 1-standard-deviation threshold, which is too sensitive. And of course, by the term “recent” one should assume at least a few weeks or more, to have enough data for a meaningful trend analysis.
Figure 15 The Short History Based Trend-forecast
But what about the opposite side of the exceptions
list that reports resources with unusually low usage?
Yes, the Exception Detector publishes that as well, and it makes perfect sense to build control and
trend-forecast charts for each resource from that list.
One example of such a web report is shown on
Figure 16 for the VM host from which some virtual
machines were recently moved to another host. The
Exception Detector captures that event perfectly and
gives an analyst the possibility to see how much
resource was released.
One of the unique parts of this method is the
following. If a metric does not have an obvious
threshold (e.g. I/O, paging or Web hits rates) the
approach works anyway and the trend-forecast will
be built only for resources (e.g. disk, memory or
Web application) that recently started dangerously
trending up. Additional modeling may be needed to
estimate what drives the increase and how to avoid
potential problems.
Figure 16 Web Report about Top Servers that Released CPU Resource
5. Exception Based Modeling and Forecasting
How often do we need to perform modeling? What
tools are good for that? Obviously, if there is a
project to upgrade hardware or to consolidate
resources or consumers (applications, servers, VMs
or databases), a good capacity management
process assumes that the capacity planner is
involved as a project resource and should model
various what-if scenarios. The capacity planner is
fortunate if he has good data and good modeling
tools (for instance, a queuing theory based analytical
tool).
A good capacity management service should be
able to initiate this type of project. The Exception
Detection System and the forecasting based on that
system can be very helpful in this process. In most
cases, if one has good data and some statistical and
capacity management experience, the modeling can
be done effectively with a spreadsheet.
Even control charts can be built using just a
spreadsheet, as demonstrated in one of the past
CMG sessions [6]. Trend analysis (forecasting) can also be done with a spreadsheet. (One such technique, the use of the Forecast(…) formula, was also demonstrated at CMG [1].) Let’s look at some real data in a case study to see how that works. Here is the most recent data for the earlier example mentioned in the introduction; see Figure 1.
One day, the Exception Detector notified me that some web application had an unusual number of web hits, producing a hits-rate control chart similar to the one on Figure 8, but without an obvious threshold. The trend forecast for CPU usage was automatically generated for the server hosting this application, as shown on Figure 10. Finally, both CPU and Web hits data were downloaded to a spreadsheet for modeling. Combining those metrics in one scatter chart (Figure 17) shows excellent correlation, with R2 = 0.96.
Figure 17 Correlation between CPU utilization and
Web hits rate
This simple correlation model shows us that the
maximum number of hits per second that this server
can handle is about 18. This is a meaningful result
and if the applications support team anticipates a
higher hit rate in the near future based on
specifications, stress test results, customer behavior,
forecast and/or business projections, this server will
need more processing capacity to meet the
requirements.
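A spreadsheet trendline is just a least-squares fit, so the same capacity estimate can be sketched in code; the hits/CPU pairs below are invented to resemble Figure 17, not the paper's actual measurements:

```python
# Fit CPU% = a*hits + b, report R^2, and solve for the hits rate at which
# the server reaches the given CPU ceiling (100% by default).
def fit_and_capacity(hits, cpu, ceiling=100.0):
    n = len(hits)
    mx, my = sum(hits) / n, sum(cpu) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(hits, cpu))
         / sum((x - mx) ** 2 for x in hits))
    b = my - a * mx
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(hits, cpu))
    ss_tot = sum((y - my) ** 2 for y in cpu)
    return 1.0 - ss_res / ss_tot, (ceiling - b) / a

hits = [2.0, 4.0, 6.0, 8.0, 10.0]      # hits/sec (hypothetical)
cpu = [20.0, 30.0, 41.0, 50.0, 60.0]   # CPU% (hypothetical)
r2, max_hits = fit_and_capacity(hits, cpu)
# with this made-up data the fit is near-perfect and max_hits lands near 18
```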
The model also shows (Figure 18) that if the pattern
of this application usage remains the same, the
server will be at capacity in about two months.
Figure 18 Current Capacity Usage Projection
But what if the number of anticipated hits is higher?
How much additional capacity would be needed?
This model can help to play out this type of scenario. For instance, Figure 19 shows that adding 25% more processing power to that server gives it the ability to handle 20 hits per second, and that this capacity will be reached in a period twice as long, in about 4 months.
Figure 19 Proposed Capacity Usage Projection
This model also allows for a more complex analysis. To apply the method explained in the previous section, calculate the historical starting point based on the most recent negative EV and look at the most recent trend, which is apparently the worst-case scenario, as shown on Figure 20:
Figure 20 Worst Case Scenario
What spreadsheet features were used for this modeling exercise? All are pretty standard and simple:
- Figure 17: the “XY (Scatter)” standard chart type plus the “Add trendline” wizard.
- Figures 18, 19: just the “Add trendline” wizard, with the coefficient = 0.75 to reflect the proposed capacity increase.
- Figure 20: in addition to the “Add trendline” wizard, the future data were populated by simply dragging down the selected range of historical data, as shown on Figure 21.
Figure 21 Future Data Population Technique
6. Summary
Capacity management in a large IT environment
should perform forecasting and modeling only when
it is really needed. This saves a lot of man-hours
and computer resources.
Exception Detection techniques along with an
Exception Database could be used to automate the
decision making process with regard to what needs
to be modeled/forecasted and when.
MASF Control Charts have the ability to uncover some trends by showing actual data deviations from an historical baseline. The most recent negative EV (the “ExtraValue of exception” meta-metric, first introduced in CMG’01) is an indicator of the moment in time when it is good to start the trending analysis of an historical sample.
A common way of raising a future capacity concern
by calculating future trend intersection with some
constant threshold does not work for metrics without
obvious thresholds. The Statistical Exception
Detection approach helps to produce the trending
analysis necessary for those cases.
Workload pathologies (e.g. run-aways or memory
leaks) should be excluded from an historical sample
in order to improve the forecasting. The Exception
Detector provides data (dates and hours) for that.
Application data (e.g. web-hits) vs. Server
performance data (e.g. CPU utilization) correlation
analysis gives a priceless opportunity to add some
meaning to forecasting/modeling studies and that
analysis can be done using standard spreadsheet
tools.
7. References
[1] Linwood Merritt, "Seeing the Forest AND the Trees: Capacity Planning for a Large Number of Servers", Proceedings of the United Kingdom Computer Measurement Group, 2003.
[2] Linwood Merritt and Igor Trubin, "Disk Subsystem Capacity Management, Based on Business Drivers, I/O Performance Metrics and MASF", CMG2003 Proceedings.
[3] Jeffrey Buzen and Annie Shum, "MASF - Multivariate Adaptive Statistical Filtering", CMG1995 Proceedings, pp. 1-10.
[4] Igor Trubin, "Capturing Workload Pathology by Statistical Exception Detection System", CMG2005 Proceedings.
[5] Igor Trubin, "Global and Application Levels Exception Detection System, Based on MASF Technique", CMG2002 Proceedings.
[6] Igor Trubin, "System Management by Exception, Part 6", CMG2006 Proceedings.
[7] Kevin McLaughlin and Igor Trubin, "Exception Detection System, Based on the Statistical Process Control Concept", CMG2001 Proceedings.
[8] Igor Trubin and Ray White, "System Management by Exception, Part Final", CMG2007 Proceedings.
8. APPENDIX:
ExtraVolume meta-metric calculation
ExtraVolume (let’s call it EV) is basically a
magnitude of exception that occurred at a particular
time with some metric [7].
Let’s look at a 2D model first. The flat and linear
model of some performance metric behavior is
shown on Figure 22, where U is the metric, t is time,
UCL is Upper Control Limit and LCL is Lower
Control Limit.
Figure 22 2D Model
For this model the formula for EV calculation is:

EV(t) = S+,  if U(t) - UCL(t) > 0
EV(t) = S-,  if U(t) - LCL(t) < 0
EV(t) = 0,   if LCL(t) <= U(t) <= UCL(t)

where S+ = U(t) - UCL(t) and S- = U(t) - LCL(t).
A three-dimensional model is more realistic: h - hours of a day (or week), or days of a week; t - days (or weeks); and U - the performance metric. By the way, a MASF control chart such as the one shown on Figure 8 is the 2D cut (projection) for a particular week. This 3D model was introduced in last year's CMG paper [8], and one example of that 3D view is shown here (Figure 23). For the full 3D model case the EV(t) calculation is a bit more complex:

EV(t) = S+ + S-
Figure 23 3D Model Example:
(Built by Spreadsheet Graph Wizard)
where

S+ = ∫ (U(t,h) - UCL(t,h)) dh over the intervals where U - UCL > 0, and S+ = 0 otherwise;
S- = ∫ (U(t,h) - LCL(t,h)) dh over the intervals where U - LCL < 0, and S- = 0 otherwise.
In a general case S+ and S- as shown on Figure 24
have the following geometrical meaning: it is the
area between the actual data curve (U) and the
statistical limit curves (UCL and LCL). They should
be calculated only on intervals where the actual
metric is outside of the UCL - LCL band. If the metric
U is within the band, then both S+ and S- as well as
EV are equal to zero.
Figure 24 EV Geometrical Meaning
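This geometric meaning translates directly into a discrete (hourly-sum) approximation of EV; the constant control band in the example below is illustrative only:

```python
def extra_value(u, ucl, lcl):
    """Signed area between the actual curve U and the [LCL, UCL] band,
    approximated by summing hourly differences: S+ above UCL (positive),
    S- below LCL (negative), zero while U stays inside the band."""
    s_plus = sum(ut - up for ut, up in zip(u, ucl) if ut > up)
    s_minus = sum(ut - lo for ut, lo in zip(u, lcl) if ut < lo)
    return s_plus + s_minus

# Two hours breach a constant [20, 80] band from above: EV = 10 + 15 = 25
ev = extra_value([50.0, 90.0, 95.0, 50.0], [80.0] * 4, [20.0] * 4)
```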
... SETDS is a methodology [2] of using statistical filtering, pattern recognition, active baselining, dynamic vs. static thresholds, Control Charts, Exception Value (EV) based reporting/smart alerting and EV based change points/trends detection to do Systems Capacity Management including Capacity Planning and Performance Engineering. ...
... Illustrate visually how SETDS analyzes data, which groups data by 168 weekhours, and compares two data sets: baseline or reference/learning set and the most recent 7 days of actual data points. Then Exception Values (EV) [2], which basically is an anomaly score (magnitudes) are calculated hourly or daily as a difference between statistical limits (UCL and/or LCL) and the actual data to keep that aside for additional analysis. Those visual charts show a workload pattern with weekly-daily seasonality and also when anomalies have happened. ...
... SETDS is a methodology [2] of using statistical filtering, pattern recognition, active baselining, dynamic vs. static thresholds, Control Charts, Exception Value (EV) based reporting/smart alerting and EV based change points/trends detection to do Systems Capacity Management including Capacity Planning and Performance Engineering. ...
... Illustrate visually how SETDS analyzes data, which groups data by 168 weekhours, and compares two data sets: baseline or reference/learning set and the most recent 7 days of actual data points. Then Exception Values (EV) [2], which basically is an anomaly score (magnitudes) are calculated hourly or daily as a difference between statistical limits (UCL and/or LCL) and the actual data to keep that aside for additional analysis. Those visual charts show a workload pattern with weekly-daily seasonality and also when anomalies have happened. ...
... In predicting hydrological modeling, Saudi et al. stressed that there are three important result were important, and those result were upper control limit (UCL), average value (AVG) and lower control limit (LCL) where the sigma in the control chart is represented within the value range of a set of data [16]. The control can uncover some trends and patterns showing how the actual data deviations from the historical baseline, displaying the best base lining and dynamic threshold as well as being able to capture unusual resource usage [19]. This analysis used is in Equation (3): ...
Article
Full-text available
This study was implemented to evaluate the flood risk pattern recognition in Golok River, Kelantan. Based on Spearman correlation test, it showed that water level and suspended solid was very strong and significant (p < 0.0001). There was also a weak correlation of coefficient of stream flow and rainfall with the changes of water level as the p-value close to 1. Suspended solid has strong corresponded in changing the rate of water level, as it described the rate of surface run-off that flowed into the water body. However, the risk of flood in study area is irrelevant to the monsoon season. Principle Component Analysis (PCA), the most sensitive parameters that contribute to the flood occurrence were identified with variability R² value of 0.812 and 0.764. Expansion and development by human activities contribute to the incline of stream flow and suspended solid in Golok River. Based on Statistical Process Control (SPC), water level of all parameters exceeded the Upper Control Limit (UCL), considered as high risk and vulnerable for flood and it is mostly due to manmade activities. It was not deniable that monsoon season played role in contributing flood occurrence as parameters of rainfall and water level have moderately positive factor loading.
... The sigma in the control chart is represented within the value range of a set of data. The control chart can uncover trends and patterns, showing actual data deviations from the historical baseline and a dynamic threshold, capturing unusual resource usage, and serving as the best baselining to show how the actual data deviate from the historical baseline [10]. Equation (2), used in this analysis, is shown below: ...
Article
Full-text available
This study focuses on constructing a flood risk index for the Johor river basin. The application of statistical methods such as factor analysis (FA), statistical process control (SPC) and artificial neural networks (ANN) revealed the most efficient flood risk index. The FA result showed that water level had a correlation coefficient of 0.738 and was the most practicable variable to use for the warning alert system. The upper control limit (UCL) for the water level in the Johor river basin is 4.423 m, and the risk index for the water level, set by this method, ranges from 0 to 100. The accuracy of prediction was evaluated using ANN, and the test result was R² = 0.96408 with RMSE = 2.5736. The future UCL for the Johor river basin was predicted to be 3.75 m. This model can show the current and future flood risk index in the Johor river basin and can help local authorities with flood control and prevention for the state of Johor. © 2015, Malaysian Society of Analytical Sciences. All rights reserved.
... The sigma in the control chart is represented within the range of values of a set of data. The control chart can uncover trends and patterns, showing actual data deviations from the historical baseline and a dynamic threshold, capturing unusual resource usage, and serving as the best baselining to show how actual data deviate from the historical baseline [20]. The equation used in this analysis is: ...
Article
Full-text available
This study looks into the downscaling of a statistical model to produce and predict hydrological modelling in the study area, based on secondary data derived from the Department of Drainage and Irrigation (DID) from 1982 to 2012. The combination of the chemometric method and time series analysis in this study showed that the monsoon season and rainfall did not affect the water level; rather, suspended solid, stream flow and water level revealed high correlation in the correlation test with p-value < 0.0001, which affected the water level. Factor analysis for the variables stream flow, suspended solid and water level showed a strong factor pattern with coefficients of more than 0.7: 0.987, 1.000 and 1.000, respectively. Based on Statistical Process Control (SPC), the Upper Control Limits for water level, suspended solid and stream flow were 21.110 m3/s, 4624.553 tonnes/day, and 8.224 m/s, while the Lower Control Limits were 20.711 m, 2538.92 tonnes/day and 2.040 m/s. This shows that human development in the area, and not the monsoon season, has a high impact on climate change and flood risk. Prediction was carried out using an Artificial Neural Network (ANN) to classify risks into their own classes, and the rate of accuracy for the prediction was 97.1%. This means that points in the time series analysis located beyond the Upper Control Limit were considered as the High Risk class, where the probability of flood occurrence is very high. The other classes classified in this prediction are Caution Zone, Low Risk and No Risk. This is important for setting a trigger for the warning system in an emergency response plan during flood.
... The method produces three important values (the Upper Control Limit (UCL), Average Value (AVG) and Lower Control Limit (LCL)) for the trend and prediction of future hydrological modelling, where the sigma is within the range of values of a set of data. A control chart can detect trends and patterns as actual data deviate from the historical baseline, capture unusual resource usage, determine the dynamic threshold, and serve as the best baselining to examine the actual data deviation from the historical baseline (Igor Trubin, 2008) [7]. The equation implemented in this analysis was: ...
Article
Full-text available
This study constructs a downscaling statistical model for analyzing hydrological modelling in the study area, which faces the risk of flood occurrence as an impact of climate change. The combination of the chemometric method and time series analysis in this study shows that even during the monsoon season, rainfall and stream flow are not the major contributors to the changing water level in the study area. The correlation test shows that suspended solid and water level have high correlation with p-value < 0.05. Factor analysis was carried out to determine the major contributor to the changes in water level, and the result shows that suspended solid has a strong factor pattern with a value of 0.829. Based on the Control Chart Builder for time series analysis, the Upper Control Limits for water level and suspended solid are 7.529 m and 1947.049 tons/day, and the Lower Control Limits are 6.678 m and 178.135 tons/day. This shows that human development in the area has a high impact on climate change and the risk of flood in the study area, which commonly faces flood during the monsoon season.
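The UCL/AVG/LCL computation that these citing works borrow from the control-chart approach can be sketched in a few lines. This is a minimal illustration, not code from any of the cited studies; the three-sigma multiplier and the clamping of the LCL at zero are assumptions of the sketch:

```python
import statistics

def control_limits(history, n_sigma=3):
    """Return (AVG, UCL, LCL) for a historical reference set,
    following the standard control-chart convention AVG +/- n*sigma.
    Clamping the LCL at zero is an assumption for non-negative metrics."""
    avg = statistics.mean(history)
    sigma = statistics.stdev(history)
    return avg, avg + n_sigma * sigma, max(avg - n_sigma * sigma, 0.0)

# Observations beyond the limits are flagged as exceptions (high risk).
history = [20.8, 21.0, 20.9, 21.1, 20.7, 21.0, 20.9]  # e.g. water levels, m
avg, ucl, lcl = control_limits(history)
exceptions = [x for x in [20.9, 23.5, 21.0] if x > ucl or x < lcl]
```

In practice the reference set would be grouped by hour of day and day of week (the MASF idea), giving a separate pair of limits per time slot rather than one global pair.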
... The concept of the human brain has been utilized in Artificial Intelligence and applied as a data analysis method known as the Artificial Neural Network. This concept was created by McCulloch and Pitts in 1943 by simulating the structure and performance of a biological neural network in a computing system [10]. ...
Article
Full-text available
Flooding is a major problem in the Johor river basin and normally happens during the monsoon season. However, this study shows that rainfall did not have a strong relationship with the changes in water level compared to suspended solid and stream flow, where both variables had p-values of <0.0001; these variables were also the main factors contributing to flood occurrence based on the factor analysis result. Time series analysis was carried out and, based on Statistical Process Control, limits were set up for mitigation in controlling flood. All data beyond the Upper Control Limit were predicted to be at High Risk of flood, and an Emergency Response Plan should be implemented to prevent complications and destruction from flood. The prediction of the risk level was carried out using an Artificial Neural Network (ANN), where the accuracy of the prediction was very high, at 96% for the prediction of risk class.
Patent
Full-text available
"This disclosure relates generally to system modeling, and more particularly to systems and methods for modeling computer resource metrics. In one embodiment, a processor-implemented computer resource metric modeling method is disclosed. The method may include detecting one or more statistical trends in aggregated interaction data for one or more interaction types, and mapping each interaction type to one or more devices facilitating the transactions. The method may further include generating one or more linear regression models of a relationship between device utilization and interaction volume, and calculating one or more diagnostic statistics for the one or more linear regression models. A subset of the linear regression models may be filtered out based on the one or more diagnostic statistics. One or more forecasts may be generated using the remaining linear regression models, from which a report may be generated and provided."
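The regression-and-filter pipeline described in this patent abstract can be sketched with ordinary least squares and R² as the diagnostic statistic. This is a hedged illustration of the general idea only; the function name, the R² threshold, and the sample data are assumptions, not details from the patent:

```python
from statistics import mean

def fit_and_filter(volume, utilization, r2_min=0.8):
    """Least-squares fit: utilization = a * volume + b.
    The model is kept only if its diagnostic statistic (R^2)
    clears the configured threshold; otherwise it is filtered out."""
    mx, my = mean(volume), mean(utilization)
    sxx = sum((x - mx) ** 2 for x in volume)
    sxy = sum((x - mx) * (y - my) for x, y in zip(volume, utilization))
    a = sxy / sxx
    b = my - a * mx
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(volume, utilization))
    ss_tot = sum((y - my) ** 2 for y in utilization)
    r2 = 1.0 - ss_res / ss_tot
    return (a, b, r2) if r2 >= r2_min else None

# Forecast device utilization at a projected interaction volume.
model = fit_and_filter([100, 200, 300, 400], [22, 41, 62, 80])
forecast = model[0] * 500 + model[1] if model else None
```

Filtering on a diagnostic statistic matters because a regression against a business driver that explains little of the variance produces forecasts worse than a plain trend.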
Conference Paper
Full-text available
The paper describes one site's experience of using Multivariate Adaptive Statistical Filtering (MASF) to produce web-based exception reports against SAS/ITSV performance databases for a large, multi-platform environment. In addition to global exceptions, the system can capture application-level exceptions by using standard workload characterization. The history of exceptions, kept in a separate database, is used to analyze seasonal stresses, treating them as a natural test to discover the weakest subsystem.
Conference Paper
Full-text available
The paper describes one site's experience of using Multivariate Adaptive Statistical Filtering (MASF) to automatically recognize some common computer system defects, such as run-away processes on one or multiple CPUs and memory leaks. A home-made SEDS (Statistical Exception Detection System) that captures global and application-level statistical exceptions was modified to recognize, report and alert on those defects.
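The two defect signatures named here lend themselves to simple heuristics: a run-away process pins a CPU near 100% for every interval in the window, while a memory leak shows a persistent upward trend rather than a single spike. The sketch below illustrates that distinction; the function names and thresholds are illustrative assumptions, not the actual rules used by SEDS:

```python
def looks_like_runaway_cpu(cpu_samples, busy_pct=95.0):
    """A run-away process keeps a CPU pegged near 100% busy
    for the whole observation window, not just one interval."""
    return all(s >= busy_pct for s in cpu_samples)

def looks_like_memory_leak(mem_samples, min_growth=0.0):
    """A memory leak shows strictly increasing usage across
    consecutive intervals, not just one excursion past the UCL."""
    return all(b - a > min_growth for a, b in zip(mem_samples, mem_samples[1:]))
```

A production system would combine signals like these with MASF control limits and a persistence requirement (several consecutive exception intervals) before raising an alert.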