Deep Learning-based Application for Fault Location Identification and Type Classification
in Active Distribution Grids
V. Rizeakos^a, A. Bachoumis^{a,*}, N. Andriopoulos^a, M. Birbas^a, A. Birbas^a
^a Dep. of Electrical and Computer Engineering, University of Patras, Rio Campus, 26504, Patras, Greece
Abstract
The high penetration of distributed energy resources, especially weather-dependent sources, even at the edge of the distribution
grids, has increased the power system uncertainties and drastically shifted the operational status quo for the system operators. For
the operators to ensure the uninterrupted electricity supply of the end-consumers, a fast and accurate response to fault events
is of critical importance. This paper proposes a data-driven fault location identification and type classification application based
on the continuous wavelet transformation and convolutional neural networks optimally configured through Bayesian optimization.
This application leverages the proliferation of high-resolution measurement devices in distribution networks. It can locate the exact
position of short-circuit faults and classify them into eleven different types. Its intrinsic models grasp the spatial characteristics,
as well as the temporal characteristics converted to the frequency domain, of the three-phase voltage and current timeseries
measurements stemming from the field devices, thus increasing the operators’ visibility of their networks in real-time. We conduct
simulations through synthetic data, which we provide in an open-source repository, that replicate a wide range of fault occurrence
scenarios with eleven different types, with the resistance ranging from 50 Ω to 2 kΩ and with duration from 20 ms to approximately
2 s, under noise conditions injected by devices and load variability. The results showcase the efficacy of the proposed method,
which reaches an accuracy of 91.4% for fault detection, 93.77% for correct branch identification, 94.93% for fault type
classification, and an RMSE value of 2.45% for location calculation.
Keywords: Active distribution grids, CNNs, Deep learning, Fault detection and location identification, Wavelet transformation
1. Introduction
In the ever-evolving environment of the active distribution
networks, the uninterrupted and high-quality supply of the end-
customer is threatened due to the intermittent nature of the
Renewable Energy Resources (RES). Grid operators shall ensure
power system reliability by maintaining high grid observability,
even at the edge of the Low-Voltage Distribution Grid (LVDG),
in real-time conditions to achieve rapid recovery after the
emergence of contingencies. The advancements in Information and
Communication Technology (ICT) through the emergence of
low-latency telecommunication networks and Advanced Meter-
ing Infrastructure (AMI) [1], along with the accelerated progress
in Machine Learning (ML) and especially in Deep Neural Net-
works (DNNs) [2], can act as a catalyst for alleviating the prob-
lems arising in rich distribution networks, i.e., highly RES-
penetrated with a significant number of prosumers [3, 4].
This research has been financed by the European Union, under the Horizon
2020 project 864537: Flexible Energy Production, Demand and Storage-based
Virtual Power Plants for Electricity Markets and Resilient DSO Operation
“FEVER” H2020-LC-SC3-2018-2019-2020.
*Corresponding author
Email addresses: up1053537@upatras.gr (V. Rizeakos),
abachoumis@ece.upatras.gr (A. Bachoumis),
nadriopoulos@ece.upatras.gr (N. Andriopoulos),
mbirbas@ece.upatras.gr (M. Birbas), birbas@ece.upatras.gr (A.
Birbas)
Active and smart LVDGs can self-heal in case a fault oc-
curs. Several strategies are followed by the grid operators to
react to contingencies and apply self-healing control practices
[5–8]. The cornerstone of these practices is the execution of
accurate and fast actions for fault diagnosis, i.e., fault detection
and identification of the specific location and type. However,
this activity is not trivial in LVDGs, due to particular character-
istics that impede the traditionally used methods. Specifically,
LVDGs have a high number of branches, are multi-phase, and
usually have an unbalanced operation due to single-phase
connected loads. Different types of conductors exist, connecting
the nodes with different characteristics and lengths, leading to
a wide range of resistance (R) and reactance (X) values with a
high R/X ratio. In addition, LVDGs have a radial structure and
operation, and only a limited number of AMI devices exist, which
reduces the overall observability [9]. Therefore, the development of a fault
diagnosis method for LVDGs shall consider from the initial de-
sign process all the above-mentioned inherent characteristics of
the LVDGs.
1.1. Literature
The approaches used for fault diagnosis in power systems
can be classified into three categories. The first category in-
cludes the classical approaches that use direct modeling tech-
niques, with the most important being the impedance-based
[10–12] and traveling-wave methods [13, 14]. System opera-
tors have widely used over the last decades these two methods
Preprint submitted to Applied Energy July 29, 2022
to perform fault diagnosis in high and medium voltage networks,
and both depend on the line parameters. However, for both
methods, their efficiency in LVDGs decreases significantly. For
the former, the ability to accurately detect the fault location
deteriorates due to the high number of branches at the LVDGs
[15]. The latter method lacks accuracy in LVDGs due to the
presence of several branches, which impedes the differentiation
between waves [16]. Identifying the exact fault location is
crucial in rich LVDGs to ensure the security of supply for the
end-consumers.
The second category consists of data-driven methods with
a wide spectrum of techniques from signal processing to
Artificial Intelligence (AI) domains. In both [17] and [18], fault
diagnosis methods using Support Vector Machines (SVMs) in
real distribution feeders are presented. The work in [19] proposes a fault
diagnosis method in LVDGs based on gradient boosting trees.
Moreover, several works with applications both at the trans-
mission and distribution levels leverage domain transformation-
based methods, such as wavelet transform, to conduct fault de-
tection [20–22]. In [23], a hybrid clustering algorithm based
on k-Nearest Neighbors (k-NN) and k-Means is developed, us-
ing as a preprocessing method the Matching Pursuit Decom-
position (MPD). A complementary clustering technique, the
Density-Based Spatial Clustering of Applications with Noise
(DBSCAN), has been implemented in [24] for fault diagnosis.
The feature selection and dimensionality reduction methods of
Principal Component Analysis (PCA) and Random Forest (RF)
have been used for active LVDGs in [25] and [26], respectively.
A Multi-Layer Perceptron (MLP) along with Extreme Learning
Machine (ELM) are deployed in [27], particularly for radial
grid topologies. DNNs are employed in [28] for identifying the
fault location and type for radial topologies in LVDGs. Finally,
Convolutional Neural Networks (CNNs) have also been lever-
aged for fault diagnosis in both transmission and distribution
levels [29, 30].
The last category includes methods that use a hybrid ap-
proach for fault diagnosis. These methods combine data-driven
and model-based approaches to conduct fault location and type
identification processes. In [31], both the impedance-based method
and DNNs are employed, whereas in [32], a rule-based fuzzy
logic system is developed, which can detect the difference between
simulated data and the actual measurements to extract the
accurate short-circuit fault location.
1.2. Work Contributions
From the above literature review, we can conclude that in the
context of LVDGs, data-driven methods have gained popularity
mainly for two reasons: (i) the limitations that model-driven
methods experience at the edge of the grid and (ii) the
accelerating integration of AMI even at the residential
end-consumer level. However, the number of data-driven works
that focus on the LVDGs is limited [33]. Contributing further in
that direction, this work proposes an application for smart, rich,
and radial LVDGs for Fault Location Identification and Type
Classification, called hereafter the FLITC application, based on
advanced DNN architectures. Specifically, Continuous Wavelet
Transformation (CWT) and CNN architectures are employed to
consider the spatio-temporal characteristics of the AMI
measurements. This work aims to detect the fault occurrence, its
type, and exact location in real-time conditions. This information
is used as an input into an advanced self-healing application
aiming at repairing the contingencies and performing energy
restoration efforts to reduce the impact of energy interruption
on the consumers, by decreasing the number (SAIFI) and duration
(SAIDI) of the interruptions [34].
The main contributions of this work can be concisely described
as follows:
• Proposal of an application of DNNs to identify, in real-time
conditions, eleven different types of short-circuit faults in
active LVDGs and find the exact location (feeder and branch)
and distance from the root node, considering the constraints
of the AMI devices’ resolution and the computational time,
• Leveraging of the CWT along with Dynamic Mode Decomposition
(DMD) techniques as preprocessing stages into the CNNs,
constituting the application’s cornerstone. To the authors’
knowledge, it is the first time in the literature that this
data-driven approach, empowered by methods of the signal
processing domain, is used for the fault detection and location
identification application in active LVDGs, under noisy
measurements that are introduced by the AMI and the variability
of loads and distributed generation,
• Dataset generation that simulates the LVDG operation under
fault occurrences for tuning and evaluation of the application,
• Optimal hyperparameter tuning in each model based on the
Bayesian Optimization (BO) algorithm and particularly on the
Tree-structured Parzen Estimator (TPE),
• Showcasing of the superiority of the proposed models compared
to benchmark models existing in the literature, and
• Empowerment of the research results’ reproducibility and
transparency across academia, by making available the source
code both for the dataset generation and the FLITC application
in an open-source repository [35].
1.3. Paper Outline
The paper is structured as follows: Section 2 includes a
thorough description of the problem. Section 3 presents the
mathematical building blocks and concepts upon which the FLITC
application is constructed. Section 4 provides the proposed
FLITC application, a deep analysis of the DNN-based archi-
tecture, the algorithm used for hyperparameter tuning, and the
employed loss functions. Section 5 includes a description of the
use case serving for validation purposes. Section 6 introduces
the extensive exploration of the proposed application’s results
to showcase its efficiency and applicability. We conclude our
work in Section 7, where a summary of the proposed applica-
tion and recommendations for further work are given.
2. Problem statement and specifications
This section includes the description of the fault detection
problem and the minimum requirements the FLITC application
has to meet to perform with high accuracy. Initially, the faults
considered in this work and their physical meaning are
described. Then, the specifications of the AMI and
the corresponding reporting rate are presented. Finally, based
on this analysis, we derive the functionalities of the FLITC ap-
plication.
2.1. Fault types
As reviewed in [36], failures in electrical grids are categorized
based on their origin. Natural causes are the most common
origin of faults in power grids and mainly comprise disturbances
due to extreme weather phenomena. Another category of faults
emerges from malfunctions in electric grid equipment, as well as
from human failure. Lastly, a portion of the occurring faults is
associated with man-made hazards, i.e., either cyber-attacks that
aim to affect the grid integrity or other forms of intentional
attacks.
Based on the fault origin, its severity has a wide range.
On the one hand, a lightning strike can cause instability in the
power flow of the electric grid, while on the other hand, it can
create a localized blackout due to the destruction of the grid
infrastructure. The aftermath of a fault occurring to the grid
might overlap with two or more of the possible causes dis-
cussed above, e.g., a localized blackout might stem from both
a lightning strike and a substation malfunction. This work in-
vestigates a subset of all the possible types of errors. More
specifically, the proposed methodology emphasizes the detec-
tion of single-phase (1Φ), double-phase (2Φ), and three-phase
(3Φ) short-circuits with respect to ground and phase-to-phase
ones. According to [37], the origin of those faults is mainly at-
tributed to physical contact between one or more phases with
the ground (i.e., tree fall, broken insulators, natural phenom-
ena such as lightning storms, hurricanes, floods, heatwaves,
etc.), overloading, corrosion or lack of maintenance of solar
plants and wind turbines, generators’ overheating, short-circuit
of generator rotor windings, etc.
According to the EN 50160 [38], IEEE Std 1159-1995 [39]
and IEEE Std 1250-1995 standards [40], the duration of the oc-
curring fault can range from half a cycle to up to three minutes,
during short interruptions, depending on both the fault cause
and the grid security level. However, most disruptions due to
phase-to-phase or phase-to-ground faults do not exceed 1 s, be-
yond which normal operation is restored. Power grid outages
longer than 3 minutes are usually owing to scheduled
maintenance of the grid’s components, or construction works in the
LVDG district.
The frequency of occurrence of each short-circuit fault depends
on its type. According to [41], the most common type is the 1Φ
short-circuit with the ground at a rate of 70%, while around 5%
concerns 3Φ short-circuits. The remaining 25% includes
phase-to-phase short-circuits and double-phase short-circuits
with the ground. For the purposes of this work, it is permissible
to consider that the respective frequency for each of the five
types of fault is as follows: 70% for 1Φ to ground, 17.5% for 2Φ
to ground, 7.5% for line-to-line, 3% for 3Φ to ground, and 2% for
3Φ fault. When one or two lines contribute to short-circuits, this
creates a significant imbalance in a most likely already
unsymmetrical power grid, where each phase’s power consumption
differs. When more phases are involved, the fault also restricts
the power flow of the affected lines. Thus, it is acceptable to
assume that the severity of a fault increases with each
contributing phase (Table 1).

Fault type   Frequency   Severity
L-G          70%         Low
L-L          7.5%        Medium
L-L-G        17.5%       Medium
L-L-L        2%          High
L-L-L-G      3%          High
Table 1: Occurring frequency and severity of different line fault types
2.2. Metering infrastructure
AMI enables the collection of data stemming from the grid’s
buses, such as 3Φ voltage (V_RMS) and 3Φ current (I_RMS),
henceforth denoted as 3Φ-V and 3Φ-I, respectively, and active
power (P), and transmits them to the operators’ data management
system. Since the duration of most faults does not exceed 1 s, the
proposed model is able to simulate faults ranging from one cycle,
i.e., from 20 ms, up to 2 s. The AMI should be able to support
transmitting data measurements from each bus. In addition,
since data arriving at such a high frequency are difficult to
handle, it is considered that the data are sent in packages,
e.g., every 5 or 20 s. Therefore, a framework is created for
every measurement unit at the LVDG nodes with a sampling period
of 20 ms, which accumulates the information in either 5 s or
20 s intervals. Data with high granularity can be provided by
measuring devices, such as Phasor Measurement Units (PMUs) or
Power Quality Meters (PQMs), which have a sampling rate of up
to 2880 and 100,000 samples/s, respectively.
2.3. FLITC specifications
Based on the above-described problem statement, the fol-
lowing requirements and specifications of the FLITC applica-
tion are derived:
• Data handling and decision timeframe: Because data arriving
at high frequency are difficult to handle, it is considered
that the application shall be capable of receiving and
analyzing data in packages, e.g., every 5 s or 20 s. Then, a
decision (model inference) is taken in real-time conditions
(sub-minute or sub-second scale based on the dimensions of the
grid topology),
• Faulty feeder detection: The model has the capability of
detecting the feeder in which the fault took place and
differentiating that feeder from the healthy ones,
• Faulty branch detection: The model has the capability of
accurately detecting the branch in which the fault has emerged,
• Faulty class identification: The model can identify the type of
the short-circuit fault that occurred in the LVDG. Eleven faulty
classes are included: A-G, B-G, C-G, A-B, B-C, A-C, A-B-G,
B-C-G, A-C-G, A-B-C, A-B-C-G, where A, B, and C denote the
corresponding phases and G the ground,
• Fault branch location: The model can calculate the distance of
the fault from the root node.
3. Technical Preliminaries
This section presents the theoretical framework that the FLITC
application is built upon. Initially, the CNN is introduced, which
is the employed DNN architecture and is adept at capturing
spatial changes in the input data. CNNs are commonly used in
image and video recognition algorithms and have further
variations depending on the specific task. Then, the Dynamic Mode
Decomposition (DMD) method is described, i.e., a completely
data-driven dimensionality reduction technique independent of
the underlying dynamics, first proposed in the hydrodynamics
domain. Finally, the CWT method is presented, which is used
in the data preprocessing stage for data transformation from the
time to the frequency domain.
3.1. Convolutional Neural Networks
CNNs are a widely used architecture of DNNs, which are
primarily used in image and video recognition due to their ad-
vantage of determining spatial features in the given data. CNNs
are feed-forward neural networks whose primary feature is the
convolution of the input data with a shared weight kernel. The
kernels slide across the input data and extract feature matrices
for each convolution. Depending on the datasets’ dimensions,
the CNN’s kernels or filters have a corresponding dimension.
At the same time, the output of their matrix multiplications pro-
duces a feature map with lower dimensions through employing
pooling. This process of consecutive convolution and pooling
layers can continue to the point of creating a vector layer, also
known as a dense layer. The input of a CNN model has dimen-
sions of the following structure:
Dataset Length × Input Height × Input Width × Input Channels
In the case of image classification, for example, the dimen-
sions correspond to the size of the dataset fed into the network,
the picture height in pixels, the picture width in pixels, and the
three-color channels (R, G, B) respectively. Then, a kernel with
dimensions smaller than the above slides through the input and
carries out matrix multiplication. The hyperparameters of ker-
nel size, number of filters, stride, and padding of the kernel are
all set by the user, according to the nature of the dataset. The in-
put of the following stage is the feature map extracted from the
previous one. This is passed through a pooling layer, which re-
duces the dimensionality of the processed data, either by using
max or average pooling, which yields the kernel’s maximum or
average value, respectively. The flattened data from the pooling
layer are fed into dense layers with a depth size determined by
the user.
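
As a rough illustration of how the kernel size, stride, padding, and pooling hyperparameters described above determine the feature-map dimensions, the following sketch applies the standard output-size relation floor((n + 2p - k) / s) + 1 along each spatial axis. The function names and the example values (a 32×32 input, a 3×3 kernel, 2×2 pooling) are illustrative assumptions and are not taken from the FLITC models.

```python
def conv_output_size(n: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length along one axis for a convolution or pooling window."""
    if n + 2 * padding < kernel:
        raise ValueError("kernel is larger than the padded input")
    return (n + 2 * padding - kernel) // stride + 1


def feature_map_shape(height, width, kernel, stride=1, padding=0, pool=2):
    """Shape after one convolution followed by non-overlapping pooling."""
    h = conv_output_size(height, kernel, stride, padding)
    w = conv_output_size(width, kernel, stride, padding)
    # Pooling window of size `pool` moved with stride `pool` (assumed setup).
    return conv_output_size(h, pool, pool), conv_output_size(w, pool, pool)


# Hypothetical layer: 32x32 input, 3x3 kernel, stride 1, no padding, 2x2 pooling.
shape = feature_map_shape(32, 32, kernel=3)
print(shape)
```

Stacking several such stages shrinks the feature map step by step until it is flattened and passed to the dense layers.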
The selection of the CNNs in this work is based on two factors:
(i) the CNN layer can capture the spatial information
inherently included in the measured 3Φ timeseries data stemming
from different locations of the LVDG, and (ii) the dimensions
of the grid’s branch data are similar to the CNN model input
dimensions. Hence, the CNN can classify measured V timeseries
data to a particular fault type by extracting data from the
grid’s three phases.
3.2. Dynamic mode decomposition
DMD is a dimensionality reduction technique that computes a set
of modes, each of which has a fixed oscillation frequency and a
decay/growth rate [42]. These modes and frequencies are
analogous to the normal modes of the system, but more generally,
they are approximations of the modes and eigenvalues of the
composition operator. Due to the intrinsic temporal behaviors
associated with each mode, DMD differs from dimensionality
reduction methods such as PCA, which computes orthogonal modes
that lack predetermined temporal behaviors. Because its modes
are not orthogonal, DMD-based representations can be less
parsimonious than those generated by PCA. However, they can
also be more physically meaningful because each mode is
associated with a damped sinusoidal behavior in time.
Given time series data $X$ of size $N \times T$, where $N$ is the
number of variables and $T$ is the number of time steps, the
first-order vector autoregression at any time step $t$ takes the
form:
$$x_t = A x_{t-1} + \epsilon_t \tag{1}$$
where $x_t$ denotes the snapshot vector at time $t$ with size
$N \times 1$, $A$ is the coefficient matrix of size $N \times N$,
and $\epsilon_t$ is the error term. To find a well-behaved
coefficient matrix and use it to represent temporal correlations,
we reformulate the above equation as:
$$X_2 \approx A X_1 \tag{2}$$
where $X_1 = [x_1, \dots, x_{T-1}]$ and $X_2 = [x_2, \dots, x_T]$
collect the snapshots shifted by one time step. Then, we employ
the singular value decomposition of $X_1$ to factorize it, writing:
$$X_1 = U \Sigma V^{\top} \tag{3}$$
where $U$ consists of the left singular vectors, $V$ consists of
the right singular vectors, and $\Sigma$ is diagonal. We define
$\tilde{A}$ as:
$$\tilde{A} = U^{\top} X_2 V \Sigma^{-1} \tag{4}$$
The eigenvalues and eigenvectors of $\tilde{A}$ are then computed
from the following equation:
$$\tilde{A} y = \Lambda y \tag{5}$$
Finally, the DMD mode corresponding to the DMD eigenvalue
$\Lambda$ is given by:
$$\Phi = U y \tag{6}$$
Figure 2: Indicative example of CWT transformation for Vrms timeseries.
This work uses the DMD method to reduce the dimensions of
the input dataset of the CNNs to facilitate the training and in-
ference of the models by decreasing the requirements both for
computational power and memory without significantly
affecting the model’s accuracy.
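
The DMD steps of Eqs. (2)-(6) can be sketched in a few lines with NumPy. This is a minimal illustration on a toy two-variable linear system, not the preprocessing code of the FLITC application: the function name `dmd`, the `rank` argument, and the toy matrix `A_true` are assumptions made for the example, and how the resulting modes are used to shrink the CNN input is not specified in this section.

```python
import numpy as np


def dmd(X, rank):
    """Rank-truncated DMD of a real snapshot matrix X with shape (N, T)."""
    X1, X2 = X[:, :-1], X[:, 1:]                        # shifted snapshots, Eq. (2)
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)   # Eq. (3)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]      # keep the leading components
    A_tilde = U.T @ X2 @ Vh.T @ np.diag(1.0 / s)        # Eq. (4)
    eigvals, Y = np.linalg.eig(A_tilde)                 # Eq. (5)
    modes = U @ Y                                       # Eq. (6)
    return eigvals, modes


# Toy system x_t = A_true x_{t-1} (hypothetical data, no noise term).
A_true = np.array([[0.9, 0.2], [0.0, 0.5]])
snapshots = [np.array([1.0, 1.0])]
for _ in range(19):
    snapshots.append(A_true @ snapshots[-1])
X = np.stack(snapshots, axis=1)                         # shape (2, 20)

eigvals, modes = dmd(X, rank=2)
print(np.sort(eigvals.real))
```

On this noise-free example the DMD eigenvalues coincide with the eigenvalues of `A_true`, which gives a quick sanity check of the implementation; choosing a `rank` smaller than the number of variables is what yields the dimensionality reduction.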
3.3. Continuous Wavelet Transformation
The CWT of a signal $s(t)$ can be defined as:
$$Z(\alpha, \beta) = \frac{1}{\sqrt{\alpha}} \int_{-\infty}^{\infty} s(t)\, \Psi^{*}\!\left(\frac{t - \beta}{\alpha}\right) dt \tag{7}$$
where $s(t)$ is a finite energy signal, $\Psi^{*}$ is the complex
conjugate of the mother wavelet $\Psi$, and $\alpha$ and $\beta$
are the scaling and translational factors of the wavelet,
respectively. Large scale values expand the wavelet in time,
revealing low-frequency information in the signal, while smaller
scale values shrink the wavelet and reveal high frequencies
present in the signal. The CWT is calculated by continuously
varying $\alpha$ and $\beta$ over the range of scales and the
length of the signal, respectively.
In recent years, many works have provided solutions to
classification and regression problems using CWT as a
preprocessing stage of the CNN models, especially for
applications in the medical engineering domain [43, 44].
In this work, the advantage of this approach is that it
transforms the noisy 3Φ-V and 3Φ-I timeseries signals into the
frequency domain and thus decomposes complex information and
patterns into elementary forms, boosting the ability of the CNN
models to capture the faulty states/occurrences. This work uses
Morlet as the mother wavelet; an indicative example of this CWT
conversion is illustrated in Fig. 2.
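
To make Eq. (7) concrete, the sketch below discretises the integral for a sampled signal using a complex Morlet mother wavelet and only the Python standard library. Several choices are assumptions made for illustration: the centre frequency `omega0 = 6`, the geometric grid of 32 scales, the truncation of the wavelet support, and the synthetic voltage dip; the FLITC implementation may differ on all of them.

```python
import cmath
import math


def morlet(u: float, omega0: float = 6.0) -> complex:
    """Complex Morlet mother wavelet (omega0 = 6 is an assumed, common choice)."""
    return math.pi ** -0.25 * cmath.exp(1j * omega0 * u) * math.exp(-0.5 * u * u)


def cwt_magnitude(signal, scales, dt):
    """Discretised Eq. (7): |Z(alpha, beta)| on a len(scales) x len(signal) grid."""
    n = len(signal)
    image = []
    for alpha in scales:
        # The Gaussian envelope is negligible beyond |u| > 5, so the sum is truncated.
        half_width = math.ceil(5.0 * alpha / dt)
        row = []
        for b in range(n):                               # translation beta = b * dt
            acc = 0j
            for k in range(max(0, b - half_width), min(n, b + half_width + 1)):
                acc += signal[k] * morlet((k - b) * dt / alpha).conjugate()
            row.append(abs(acc) * dt / math.sqrt(alpha))
        image.append(row)
    return image


# Synthetic 5 s V_RMS trace sampled every 20 ms with a dip between 2.0 s and 2.5 s.
dt = 0.02
signal = [0.6 if 100 <= k < 125 else 1.0 for k in range(250)]
scales = [0.04 * (0.5 / 0.04) ** (i / 31) for i in range(32)]   # 0.04 s to 0.5 s
image = cwt_magnitude(signal, scales, dt)
print(len(image), len(image[0]))
```

The result is a 32×250 grid of magnitudes, in which the flat parts of the trace produce almost no response while the edges of the dip stand out, which is the kind of localized pattern a CNN can pick up.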
4. FLITC application
4.1. Architecture
The FLITC application fulfills the following functionalities:
• Faulty feeder detection through the employment of a feeder
detection CNN model (FFNN),
• Faulty branch detection through the employment of a branch
detection CNN model (FBNN),
• Faulty type classification in eleven fault classes, as defined
in section 2.3, through the employment of a faulty class
detection CNN model (FCNN),
• Faulty distance calculation by estimating the distance from
the root node through the employment of a distance calculation
CNN model (FDNN).
The FLITC application’s architecture is illustrated in Fig. 3.
The developed components for the application’s materialization
are input, preprocessing, fault diagnosis, and output.
4.1.1. Input stage
The input stage handles the measurements stemming from the AMI
devices in the LVDG. Data are received in 20 s intervals in
order not to obstruct the AMI communication system and create
bandwidth issues. The accumulated 20 s data, i.e., feeder
3Φ-I measurements and each node’s 3Φ-V measurements, are
segmented into four 5 s batches, which are stored until all the
timeseries have been processed. The data are successively
forwarded to the next stage, which is responsible for the data
preprocessing.
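
The batching arithmetic of the input stage can be sketched as follows. With a 20 ms sampling period, a 20 s package holds 1000 samples and each 5 s batch holds 250 samples, which is consistent with the 250-sample time dimension mentioned for the CWT step. The constant and function names are illustrative assumptions.

```python
SAMPLE_PERIOD_S = 0.02   # 20 ms sampling period
PACKAGE_S = 20           # length of one received package in seconds
BATCH_S = 5              # length of one batch in seconds


def split_into_batches(package, batch_len):
    """Split one package of samples into consecutive, equally sized batches."""
    if len(package) % batch_len != 0:
        raise ValueError("package length must be a multiple of the batch length")
    return [package[i:i + batch_len] for i in range(0, len(package), batch_len)]


samples_per_package = round(PACKAGE_S / SAMPLE_PERIOD_S)
samples_per_batch = round(BATCH_S / SAMPLE_PERIOD_S)

# Placeholder package: sample indices stand in for one measured timeseries.
package = list(range(samples_per_package))
batches = split_into_batches(package, samples_per_batch)
print(len(batches), len(batches[0]))
```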
4.1.2. Preprocessing stage
The data preprocessing stage is an intermediate function
between the incoming data and the diagnostic tools. It
preprocesses the data as follows:
• Cleaning step: Checks the data for any missing measurements
and replaces them based on their position in the input
timeseries,
• Grouping step: Separates the I measurements from the V
measurements and isolates the latter depending on the branch
that each node belongs to,
• Interpolation step: As first introduced in [28], this step
interpolates the measurements of each node to create a branch
with a generalized number of virtual nodes, thus facilitating
the distance estimation functionality by reducing the input
vector. These virtual nodes are located at 5 different locations
of each branch, i.e., 0%, 25%, 50%, 75%, and 100% of the total
faulty branch distance,
• Normalization step: Normalizes the data to insert them in the
different models,
Figure 3: Architecture diagram of the FLITC application.
Figure 4: Dimensionality (in parentheses) and CWT transformation of the 3Φ-I
dataset during the preprocessing stage.
• CWT step: Transforms the data from the time domain to the
frequency domain to enhance the performance of the models.
Even though the result of this process is a 32×32 image, the
intermediate steps can vary. Therefore, two different CWT
approaches are explored: (i) the CWT of the timeseries is a
32×250 image, retaining the original time dimension while
extracting the data for the convolution via the mother wavelet
to a new dimension with equal size to the final one, i.e., 32,
and (ii) the CWT creates a 250×250 image, expanding the first
dimension to get a square matrix including further information
of the original timeseries. Each approach is used for different
datasets according to the volume of information each CNN model
requires. In this work, all the obtained best models (see
section 6) yield better results by employing the first approach
because the second one provides excessive information that
overwhelms the CNN layers,
• DMD step: Reduces the dimensionality of the input data to
alleviate problems provoked by memory overloading during the
hyperparameter tuning and training of the models without
significantly deteriorating the models’ efficacy. This work
uses DMD to reduce the dimensionality of the 3Φ-V timeseries.
Figs. 4 and 5 illustrate the preprocessing stage for the 3Φ-I and
3Φ-V input datasets, respectively, for the specific use case
presented below in section 5.
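
The interpolation step of the preprocessing list above can be illustrated with a small sketch that maps the measurements of the metered nodes of one branch, at a single time step, onto five virtual nodes at 0%, 25%, 50%, 75%, and 100% of the branch length. Piecewise-linear interpolation is an assumption made here; the scheme of [28] may differ. The function names, node positions, and voltage values are hypothetical.

```python
def interpolate_at(x, xs, ys):
    """Piecewise-linear interpolation, clamped to the first and last metered node."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            w = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + w * (ys[i] - ys[i - 1])


def virtual_nodes(positions, values, branch_length,
                  fractions=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Measurements at the virtual nodes placed at fixed fractions of the branch."""
    return [interpolate_at(f * branch_length, positions, values) for f in fractions]


# Hypothetical branch: four metered nodes (distance from branch start in metres)
# and their voltage at one time step.
positions = [0.0, 40.0, 100.0, 200.0]
voltages = [230.0, 228.0, 225.0, 220.0]
virtual = virtual_nodes(positions, voltages, branch_length=200.0)
print(virtual)
```

Whatever the number of physical nodes on a branch, the output always has five entries, so every branch presents the same input size to the distance model.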
4.1.3. Fault diagnosis stage
The DNN blocks are the main computational blocks of the system
since this is where the fault diagnosis of the LVDG is
conducted. Data are fed into each module as follows:
• The FFNN module is fed with the preprocessed 3Φ-I
measurements,
• The FBNN module is fed with the preprocessed 3Φ-V
measurements of the feeder classified as faulty in the previous
stage,
• The FCNN module is fed with the preprocessed 3Φ-V
measurements of the branch classified as faulty in the previous
stage,
• The FDNN module is fed with the preprocessed 3Φ-V
measurements of the branch classified as faulty in the previous
stage.
The FFNN module receives the preprocessed input data and
sends its classification results to the output. If the model
predicts that the state of the LVDG during these 5 s is healthy,
the fault diagnosis is concluded. Otherwise, if the model finds
a faulty feeder, the process continues. Since the number of
branches for each feeder can vary, the input shape of the FBNN
is not standard. Thus, within the DNN training stage, one model
is trained for each distinct number of branches per feeder.
Therefore, the number of the DNNs selected from the training
stage equals the number of different branch-size feeders.
Depending on which feeder is faulty, the corresponding FBNN
model is fed with the V data. The last stage involves both the
FCNN and FDNN, which can run independently of each other
since their input data have already been preprocessed.
Categorical cross-entropy loss function is used in the case
of multi-class classification problems. It is a so f tmax activation
function (Eq. (8)), followed by a cross-entropy loss (Eq. (9)).
f(s)i=esi
PC
jesj
(8)
CE =
C
X
i
tilog(f(s)i) (9)
Since the labels are in one-hot encoding, only the positive class $C_p$ keeps its term in the loss, so there is only one element of the target vector $t$ which is not zero, i.e., $t_i = t_p$. Hence, by discarding the zero elements in the summation, the formula of the categorical cross-entropy is derived as such:
$$CCE = -\log\left(\frac{e^{s_p}}{\sum_{j}^{C} e^{s_j}}\right) \tag{10}$$
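As a minimal numerical sketch of Eqs. (8)-(10), the snippet below evaluates the softmax and the cross-entropy for one sample and checks that the general form of Eq. (9) collapses to the single-term form of Eq. (10) for a one-hot target. The function names and the example scores are illustrative assumptions and are not taken from the FLITC implementation.

```python
import math


def softmax(scores):
    """Eq. (8): map raw class scores s to probabilities f(s)."""
    shift = max(scores)  # subtracting the maximum avoids overflow, result is unchanged
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(targets, probs):
    """Eq. (9): CE = -sum_i t_i * log(f(s)_i); zero targets contribute nothing."""
    return -sum(t * math.log(p) for t, p in zip(targets, probs) if t != 0)


def categorical_cross_entropy(scores, positive_index):
    """Eq. (10): CCE = -log(f(s)_p) for a one-hot target with positive class p."""
    return -math.log(softmax(scores)[positive_index])


scores = [2.0, 1.0, 0.1]  # hypothetical network outputs for C = 3 classes
one_hot = [1, 0, 0]       # the positive class is index 0
ce = cross_entropy(one_hot, softmax(scores))
cce = categorical_cross_entropy(scores, 0)
```

Both `ce` and `cce` hold the same value, which is the point of the derivation: with one-hot labels only the term of the positive class survives the summation.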
4.1.4. Output stage
The DNNs' output is provided in a comprehensive format and informs the user about the LVDG state. Specifically, it concatenates the results of all four DNNs as follows: (i) "No Fault Detected" when the model classifies the LV grid as healthy, and (ii) "C Fault Detected in (F, B, D)" otherwise, where C, F, B, and D denote the fault class, the faulty feeder, the faulty branch, and the estimated distance from the root node, respectively.
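The cascade of section 4.1.3 and the message format of section 4.1.4 can be sketched as follows. This is only an illustration under stated assumptions: the four models are represented by plain callables, and the function name, the dictionary layout of the voltage data, and the example values are all hypothetical stand-ins for the trained networks and the preprocessed measurements.

```python
def diagnose(current_features, voltage_by_feeder, ffnn, fbnn_by_feeder, fcnn, fdnn):
    """Run the four-stage cascade and return the output message.

    ffnn returns the faulty feeder id, or None for a healthy grid (assumed convention).
    fbnn_by_feeder maps a feeder id to the branch classifier for that feeder.
    voltage_by_feeder maps feeder id -> branch id -> voltage features.
    """
    feeder = ffnn(current_features)
    if feeder is None:
        return "No Fault Detected"
    # The feeder-specific FBNN sees the voltage data of the faulty feeder.
    branch = fbnn_by_feeder[feeder](voltage_by_feeder[feeder])
    # FCNN and FDNN are independent and both use the faulty branch data.
    branch_voltages = voltage_by_feeder[feeder][branch]
    fault_class = fcnn(branch_voltages)
    distance = fdnn(branch_voltages)
    return f"{fault_class} Fault Detected in ({feeder}, {branch}, {distance})"


# Hypothetical stand-ins for the trained models and the measurements.
voltages = {1: {2: [0.91, 0.88, 1.0]}}
message = diagnose(
    current_features=[1.2, 0.9, 1.0],
    voltage_by_feeder=voltages,
    ffnn=lambda x: 1,
    fbnn_by_feeder={1: lambda v: 2},
    fcnn=lambda v: "A-G",
    fdnn=lambda v: 0.35,
)
healthy = diagnose([1.0, 1.0, 1.0], voltages, lambda x: None, {}, None, None)
```

The early return mirrors the text: when the FFNN reports a healthy grid, none of the downstream models is evaluated.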
4.2. Modes of operation
Three different modes of operation can be defined for the FLITC application. Specifically:
Figure 5: Dimensionality (in parentheses) and transformations of the 3Φ-V dataset during the preprocessing stage for the FFNN (upper) and the FBNN (lower).
4.2.1. Configuration and hyperparameter tuning
The configuration mode concerns the configuration of parameters such as the input vector dimensions of each DNN model and the number of CNN models that will be applied for branch detection. This mode is executed whenever the FLITC application is deployed on a new LVDG. The data required for the configuration are the number of LVDG feeders, the number of branches per feeder, the number and location of the devices that measure the 3Φ-V values, and the length of each branch.
After the models' configuration, the hyperparameter tuning process is carried out. Hyperparameter tuning is the problem of optimizing a loss function over a graph-structured configuration space and thus calculating the optimal values of the model hyperparameters, i.e., the parameters that define the model's architecture and control the learning process. Computing the optimal hyperparameter values enhances the overall performance of the model. As described thoroughly in section 3, this work uses the CNN architecture as a foundation to perform the different tasks. The hyperparameters that need to be selected in this case are the batch size, the number of hidden layers and the units in each of them, the dropout rate for each layer, the window size of the incoming data, and the filter size.
However, this multi-dimensional hyperparameter space renders common methods for identifying the optimal values, such as random or grid search, inefficient and time-consuming. To tackle this issue, the concept of BO has been introduced, where the number of samples drawn from the hyperparameter search space is probabilistically guided and reduced, thus allowing for proper evaluations of the most promising candidates for hyperparameter selection [45]. In this work, an automated approach for hyperparameter tuning of each model is performed based on BO, namely the Tree-structured Parzen Estimator (TPE) method [46]. The TPE is a sequential model-based optimization (SMBO) approach, which sequentially constructs models to approximate the performance of hyperparameters based on historical measurements. It then selects new hyperparameters to test based on this model. This method handles categorical and conditional parameters and therefore increases the efficiency of the hyperparameter selection process [46]. In the literature, the TPE algorithm is broadly used in different domains and applications, such as image processing [47, 48], load forecasting [49–51], and solar irradiance forecasting [52].
The TPE algorithm models $p(\theta|y)$ by transforming the generative process of the configuration space, replacing the distributions of the configuration prior with non-parametric densities [46]. By using, in each iteration $t$, different observations $(\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(M)})$ in the non-parametric densities, a learning algorithm is obtained that can produce a variety of densities over the configuration space $\Theta$. The TPE defines $p(\theta|y)$ as:
$$p(\theta|y) = \begin{cases} k(\theta), & \text{if } y < y^{*} \\ l(\theta), & \text{if } y \geq y^{*} \end{cases} \tag{11}$$
where $k(\theta)$ is the density formed by using the observations $\theta^{(i)}$ whose corresponding loss $y = f(\theta^{(i)})$ is less than $y^{*}$, and $l(\theta)$ is the density formed by the remaining observations. The TPE algorithm chooses $y^{*}$ to be some quantile $\gamma$ of the observed $y$ values, so that $p(y < y^{*}) = \gamma$, but no specific model for $p(y)$ is necessary. The Expected Improvement (EI) criterion increases monotonically with the ratio $k(\theta)/l(\theta)$, so this ratio is maximized in order to select the $\theta^{(t+1)}$ set for the next iteration. The algorithm terminates when the maximum number of iterations is completed. The runtime of each iteration of the TPE algorithm can scale linearly in $|H|$, i.e., the size of the sorted history of observations, and linearly in the number of hyperparameters (dimensions) being optimized [46].
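The selection rule described above can be illustrated with a deliberately simplified, one-dimensional TPE loop. This is a sketch under stated assumptions and not the HyperOpt implementation used in this work: the densities are plain Gaussian kernel estimates with a fixed bandwidth, candidates are drawn around the good observations, and the function names, the quantile value, and the quadratic toy loss are hypothetical.

```python
import math
import random


def kde(points, bandwidth):
    """Gaussian kernel density estimate over a list of 1-D points."""
    norm = len(points) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm

    return density


def tpe_minimize(loss, low, high, n_startup=10, n_iter=40, gamma=0.25,
                 n_candidates=24, seed=0):
    """Simplified 1-D TPE loop; returns (best_loss, best_theta)."""
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # fixed bandwidth, an illustrative simplification
    history = []                     # list of (y, theta) observations
    for _ in range(n_startup):       # random start-up evaluations
        theta = rng.uniform(low, high)
        history.append((loss(theta), theta))
    for _ in range(n_iter):
        history.sort()                                    # ascending loss
        n_good = max(1, math.ceil(gamma * len(history)))  # y* is the gamma-quantile
        good = [th for _, th in history[:n_good]]         # observations with y < y*
        rest = [th for _, th in history[n_good:]]         # observations with y >= y*
        k_density = kde(good, bandwidth)
        l_density = kde(rest, bandwidth)
        # Draw candidates near the good observations and keep the one maximizing k/l.
        candidates = [min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
                      for _ in range(n_candidates)]
        theta = max(candidates, key=lambda c: k_density(c) / (l_density(c) + 1e-12))
        history.append((loss(theta), theta))
    return min(history)


best_loss, best_theta = tpe_minimize(lambda x: (x - 0.3) ** 2, 0.0, 1.0)
```

Each iteration sorts the history and evaluates both densities over all past observations, which reflects the linear dependence of the per-iteration cost on the history size mentioned in the text.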
4.2.2. Training and testing modes
After the optimal configuration of the parameters mentioned above, training is conducted using the stochastic gradient descent method, specifically the ADAM optimizer, to calculate the models' weights and calibrate them on the specific network's topology and number of metering devices. After the completion of the training process, the models' performance is evaluated on sample data. If the FLITC application's performance is acceptable to the operator, the application can be used in a real operational environment.
Figure 6: Aggregated local consumption and generation.
4.2.3. Operational mode
During the real operational mode, the FLITC application can be hosted either on a server of the operator's control center or on an edge server located near the LVDG, e.g., at the LV transformer, which has adequate computational resources to execute the application. The first implementation has the advantage of the computing capabilities of the cloud servers. However, high bandwidth is needed for the data transmission to the server, and issues of data loss and security arise. On the contrary, the edge implementation better protects the LVDG data from breaches. It also requires less transmission bandwidth because only the output is sent to the control center, without transmitting the LVDG raw data. Regardless of the hosting environment, the FLITC application can be integrated into a self-healing application, offering real-time fault diagnosis and thus allowing other grid control functionalities to be initiated in the LVDG for fault clearance and restoration of the consumers' power supply.
5. Use Case
5.1. Simulated Dataset
The dataset used for the CNN models' hyperparameter tuning, training, and evaluation process is generated in a simulation environment. For the dataset generation process, real data of local consumption and distributed production with 1 min resolution are retrieved from the publicly available Pecan Street Dataport dataset [53]. Fig. 6 illustrates the aggregated profile of the users' data. The penetration rate of the renewables for the case study is 25%. Through the simulations, data interpolation is conducted to replicate the operation of a real LVDG and investigate a large and diverse number of fault scenarios, which are absent from the openly available datasets that comprise real field measurements. Therefore, the simulated dataset can provide adequate information containing all the possible short-circuit-related faults that can occur in an LVDG. Furthermore, the generated data also include noise introduced by the measuring devices and the load and generation variability, replicating the operation of a common unbalanced, radial, active LVDG with high RES penetration.
The original LVDG, upon which this simulation model is based, is a Portuguese one given in [54] and shown in Fig. 7. It is a radial LVDG that encompasses multiple feeders and secondary branches starting from an MV grid equivalent. Each node of the described network is equipped with a device that measures 3Φ-V, while both the 3Φ-V and 3Φ-I measurements are monitored at the main feeders of the network. A variable-length distribution line is added between each pair of adjacent nodes, emulating distribution losses over LVDGs. Furthermore, the consumer loads consist of variable parallel-connected resistive and inductive loads. Lastly, local distributed generation is simulated by AC voltage sources with variable nominal power ratings.
The flowchart for data generation is illustrated in Fig. 8. It uses as input the fault characteristics of each simulation scenario, as well as the consumers' loads and the local production of the distributed generation. The consumers' reactive power loads Q are calculated using a power factor of cos ϕ = 0.95. In this model, during the fault event of each scenario, the resistance R between the 3Φ and the ground is drawn from a Log-Normal distribution, with an average value ranging from 50 Ω to 2 kΩ. The fault duration is drawn from a Weibull distribution, with a duration ranging from 20 ms to approximately 2 s. Fig. 9 illustrates the duration and resistance histograms of the generated faults. It is noteworthy that the fault inception time is uniformly distributed over the length of the simulation.
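The sampling of one fault scenario can be sketched with the Python standard library as shown below. The distribution families follow the description above, but every numerical parameter (log-normal location and spread, Weibull scale and shape, the 5 s simulation length, and the clipping to the quoted ranges) is an assumption made for illustration, since the text does not state them. The reactive power follows from the standard relation Q = P tan(arccos(0.95)).

```python
import math
import random


def reactive_power(active_power_kw, power_factor=0.95):
    """Q = P * tan(arccos(pf)) for a load with the given power factor."""
    return active_power_kw * math.tan(math.acos(power_factor))


def sample_fault(rng, sim_length_s=5.0):
    """Draw one fault scenario; all distribution parameters are illustrative guesses."""
    # Log-normal resistance, clipped to the 50 ohm to 2 kohm range quoted in the text.
    resistance = min(2000.0, max(50.0, rng.lognormvariate(math.log(300.0), 0.8)))
    # Weibull duration (scale 0.5 s, shape 1.5), clipped to 20 ms to 2 s.
    duration = min(2.0, max(0.02, rng.weibullvariate(0.5, 1.5)))
    # Fault inception time, uniformly distributed over the simulation length.
    start = rng.uniform(0.0, sim_length_s)
    return {"resistance_ohm": resistance, "duration_s": duration, "start_s": start}


rng = random.Random(42)
faults = [sample_fault(rng) for _ in range(100)]
q_kvar = reactive_power(10.0)  # reactive demand of a hypothetical 10 kW load
```

For a 10 kW load the resulting reactive power is roughly 3.29 kvar, since tan(arccos(0.95)) is about 0.329.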
5.2. Baseline models
Figure 7: Topology of the LVDG used as a testbed.
Figure 8: Flowchart of the dataset generation process.
For the validation of our method, we compare it with state-of-the-art models from the fault diagnosis and anomaly detection domains, which are:
• LSTM Autoencoder for unsupervised detection of the outliers, i.e., the short-circuit faults in our case. This technique is utilized in several domains for anomaly detection, such as supply chain management [55] or networks [56], where the model learns to forecast the signal of healthy states, then reconstructs the training data and calculates the Mean Absolute Error (MAE) for each training sample. The maximum MAE value is taken as a threshold. If the reconstruction loss for a test sample is greater than this threshold value, we can infer that the model is seeing an unfamiliar pattern, and the sample is thus characterized as a fault,
• Convolutional LSTM (ConvLSTM) for all the stages of the FLITC application, i.e., FFNN, FBNN, FCNN, and FDNN. We leverage ConvLSTM because it can consider the spatiotemporal characteristics of the data [57] and outperform MLP [58]. In this work, the data stemming from the measuring devices inherently contain the grid's topology characteristics.
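The thresholding rule of the LSTM Autoencoder baseline can be sketched independently of the network itself. In the snippet below the trained autoencoder is replaced by a hypothetical `reconstruct` callable that always returns a flat 1.0 p.u. profile, and the signals are made-up values; only the rule (maximum training MAE as threshold, larger reconstruction error means fault) comes from the description above.

```python
def mae(signal, reconstruction):
    """Mean Absolute Error between a signal and its reconstruction."""
    return sum(abs(a - b) for a, b in zip(signal, reconstruction)) / len(signal)


def fit_threshold(healthy_signals, reconstruct):
    """Threshold = maximum reconstruction MAE over the healthy training samples."""
    return max(mae(s, reconstruct(s)) for s in healthy_signals)


def is_fault(signal, reconstruct, threshold):
    """Flag a sample whose reconstruction error exceeds the threshold."""
    return mae(signal, reconstruct(signal)) > threshold


# Hypothetical stand-in for a trained autoencoder: always outputs 1.0 p.u.
reconstruct = lambda s: [1.0] * len(s)

healthy_train = [[1.0, 1.01, 0.99], [1.02, 0.98, 1.0]]
threshold = fit_threshold(healthy_train, reconstruct)

flag_sag = is_fault([1.0, 0.6, 0.55], reconstruct, threshold)     # deep voltage sag
flag_normal = is_fault([1.0, 1.0, 1.01], reconstruct, threshold)  # close to nominal
```

Because the threshold is the maximum over the training set, a single noisy healthy sample raises it for all later decisions, which is one reason such a detector can miss low-magnitude disturbances.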
5.3. Performance evaluation
For the evaluation of the model, a set of metrics is used to elucidate each layer's performance from different perspectives. Firstly, the confusion matrix is utilized for the case of a multi-class classification DNN model. Specifically, it is a matrix of n × n dimensions, where n denotes the number of classes the DNN is trained for. The matrix rows correspond to the true labels of the test data, while the columns correspond to the predicted ones. The diagonal of the matrix showcases the correct classifications of the DNN models. For binary confusion matrices, where there are only two classes to select from, each value of the matrix belongs to one of the following categories:
• True positives (TP): an outcome where the model correctly predicts the positive class,
• True negatives (TN): an outcome where the model correctly predicts the negative class,
• False positives (FP): an outcome where the model incorrectly predicts the positive class,
• False negatives (FN): an outcome where the model incorrectly predicts the negative class.
Beyond the confusion matrix of the model, the accuracy, precision, recall, and F1-score of the DNN are calculated. Accuracy is the fraction of correctly classified samples over the entire dataset. Precision is defined, for a particular class, as the fraction of correctly classified samples over the total number of samples predicted as that class. Recall is the fraction of correctly classified samples over all the samples that truly belong to that class. Lastly, the F1-score is the harmonic mean of the precision and recall. The formulas of the above-described metrics are:
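$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{12}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{13}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{14}$$
$$\text{F1-score} = 2\,\frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{15}$$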
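Eqs. (12)-(15) can be checked on a small worked example. The counts below are hypothetical and are not results of this work; the function name is likewise an illustrative choice.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score from binary confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Eq. (12)
    precision = tp / (tp + fp)                          # Eq. (13)
    recall = tp / (tp + fn)                             # Eq. (14)
    f1 = 2 * precision * recall / (precision + recall)  # Eq. (15)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 100 faulty samples (90 detected) and 100 healthy ones (80 kept).
metrics = binary_metrics(tp=90, tn=80, fp=20, fn=10)
```

With these counts the accuracy is 0.85, the precision is 90/110, the recall is 0.9, and the F1-score equals 180/210, which also matches the equivalent form 2TP / (2TP + FP + FN).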
For the regression problem, the Root Mean Square Error (RMSE) metric is used, i.e., a quadratic scoring rule that calculates the average magnitude of the error. It is the square root of the average of squared differences between prediction and actual observation, given by the formula:
$$RMSE = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \hat{x}_i)^2}{N}} \tag{16}$$
where $i$ indexes the different training samples, $N$ is the number of training data points, $x_i$ is the actual fault distance, and $\hat{x}_i$ is the distance estimated by the model. Since the errors are squared before they are averaged, the RMSE gives a relatively high weight to large errors. This means the RMSE is most useful when large errors are particularly undesirable. This applies directly to the FLITC case, since its primary objective is to reduce the time that the operators' inspection teams spend finding the exact fault location.
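The weighting of large errors can be made concrete with two hypothetical error profiles that share the same mean absolute error but differ in RMSE. The helper names and the numbers are illustrative assumptions.

```python
import math


def rmse(actual, predicted):
    """Eq. (16): square root of the mean squared difference."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mean_abs_error(actual, predicted):
    """Mean absolute difference, shown only for comparison."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [0.0, 0.0, 0.0, 0.0]
spread_out = [1.0, 1.0, 1.0, 1.0]  # four small errors
one_large = [0.0, 0.0, 0.0, 4.0]   # one large error, same total absolute error

rmse_spread = rmse(actual, spread_out)
rmse_large = rmse(actual, one_large)
```

Both profiles have a mean absolute error of 1, yet the RMSE doubles from 1 to 2 when the error is concentrated in a single sample, which is the behavior the paragraph above relies on.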
5.4. Computational environment
For the dataset generation process, the simulation model is developed in the MATLAB-based Simulink graphical programming environment. The FLITC application models are developed in Python, using the TensorFlow library with Keras as an API. For hyperparameter tuning, HyperOpt, an open-source library for large-scale automated machine learning, is used [59]. HyperOpt enables hyperparameter tuning through the TPE algorithm. All the computations were conducted on an AlmaLinux server with 8 CPU cores, 16 GB of RAM, and an NVIDIA RTX A6000 GPU card with 48 GB of memory.
Figure 9: Histograms of resistance and duration values of the generated faults.
Table 2: Derivation of the best models for the FLITC application from the hyperparameter tuning process.
| | FFNN | FBNN-1 | FBNN-2 | FBNN-3 | FCNN | FDNN |
|---|---|---|---|---|---|---|
| Hidden Layers | 2 | 1 | 1 | 2 | 4 | 3 |
| Hidden Layer Units | [128, 160] | 224 | 192 | [80, 240] | [144, 80, 144, 96] | [192, 48, 208] |
| Regularization | Dropout | Dropout, L2 | Dropout, L2 | Dropout | Dropout, L2 | Dropout |
| Dropout Rate | [0.05, 0.025, 0.0167, 0.0125] | [0.05, 0.025, 0.0167] | [0.05, 0.025, 0.0167] | [0.05, 0.025, 0.0167, 0.0125] | [0.05, 0.025, 0.0167, 0.0125, 0.01, 0.0083] | [0.05, 0.025, 0.0167, 0.0125, 0.01] |
| L2 Regularization Factor | - | 0.01 | 0.01 | - | 0.01 | - |
| Accuracy (%) | 90.49 | 85.4 | 93.77 | 95.02 | 94.93 | - |
| Loss (%) | 29.62 | 52.7 | 26.92 | 15.13 | 21.37 | 0.12 |
| F1-score (%) | 62.36 - 93.90 | 16.22 - 88.78 | 83.75 - 95.56 | 94.91 - 95.25 | 89.87 - 99.10 | - |
| RMSE (%) | - | - | - | - | - | 2.45 |
| Mean epoch time (s) | 1.82 | 1.24 | 3.7 | 0.73 | 9.6 | 2.52 |
6. Results
6.1. FLITC performance
Table 2 includes the best models for each stage of the FLITC application, as derived from the hyperparameter tuning process. The validation accuracy for the best models is 90.49% for the FFNN, 85.4%, 93.77%, and 95.02% for the feeder models FBNN-1, FBNN-2, and FBNN-3, respectively, and 94.93% for the FCNN. The RMSE value is equal to 2.45% for the FDNN model. The most computationally intensive derived model is the FCNN, which consists of 4 hidden layers and has a mean training time per epoch of 9.6 s.
Fig. 10 illustrates the confusion matrices for the best three FFNN models derived from the hyperparameter tuning process. As can be observed, even though the model with the highest accuracy is the one in the upper left, it exhibits worse performance in detecting the healthy states compared to the model depicted in the lower part (precision value for the healthy state of 84.97% and 90.62%, respectively). For the fault diagnosis process, the distinction between the healthy and faulty states is of crucial importance. The sensitivity of the FLITC application, i.e., the recall metric for binary classification, in identifying the faulty occurrences is equal to 91.4%, and the specificity is equal to 90.62%. In addition, as illustrated in Fig. 11, the performance of the FFNN model is identical regardless of the fault duration. On the other hand, Fig. 12 shows that the fault diagnosis performance decreases significantly as the fault resistance increases. This is mainly due to the inability of the models to capture the high-resistance faults, which produce smaller variations in the magnitude of the grid variables and have fewer cascading effects on neighboring branches and feeders.
Fig. 13 illustrates the confusion matrix for the best FBNN model concerning the feeder with the four branches. The low accuracy of that model compared to the rest of the FBNN models is mainly attributed to its inability to correctly classify the faults of the second branch due to the branch's small length (it contains only one node). This can be considered a limitation of the model. Other models derived from the hyperparameter tuning process have better accuracy in detecting the faults for that particular branch; however, their overall accuracy was lower.
Figure 10: Confusion matrices for the best FFNN models. The upper left is the best model with a total accuracy of 90.5%. The upper right is the second best model
with accuracy equal to 89.32%, and the lower one is the third best model with accuracy equal to 88.74%.
Figure 11: Average test performance for the different fault duration ranges of the best FFNN models.
Figure 12: Average test performance for the different classes of fault resistance of the best FFNN models.
Fig. 14 presents the confusion matrix for the FCNN model. The 2Φ and 3Φ to ground faults are grouped in the same category as the ones without ground. As can be seen, the model exhibits almost flawless performance in classifying the 2Φ and 3Φ faults, except for the A-B fault, where all the misclassified faults are categorized as 1Φ B to ground faults, probably due to the network asymmetry and the fact that 1Φ B has the most connected loads. For the 1Φ faults, the accuracy of the FCNN model drops due to the misclassification of the faults into the other 1Φ fault categories.
Figure 13: Confusion matrix of the FBNN model for the first feeder.
Figure 14: Confusion matrix for the best FCNN model.
Fig. 15 illustrates the density function of the absolute error, normalized by the maximum distance, as calculated from the FDNN model. It can be seen that the computed normalized error values loosely follow the beta distribution. The blue line represents the beta distribution with parameters a = 0.52 and b = 329.77, which has the best goodness of fit, i.e., the lowest Residual Sum of Squares (RSS) value (95.52), among 89 other univariate distributions fitted using the library from [60]. With 95% confidence, the error of the fault distances calculated by the FDNN model is below 7.21% of the normalized branch distance from the root node. Therefore, we can conclude that the FDNN model exhibits high performance, providing valuable assistance to the operators' maintenance crews in locating the fault correctly.
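A bound of this kind can be approximated directly from the data by taking an empirical quantile of the normalized absolute errors. The sketch below uses this simpler empirical approach as a stand-in and does not reproduce the beta-distribution fit from [60]; the helper names, the nearest-rank quantile rule, and the numbers are illustrative assumptions.

```python
import math


def normalized_abs_errors(actual, predicted, max_distance):
    """Absolute distance errors divided by the maximum distance."""
    return [abs(a - p) / max_distance for a, p in zip(actual, predicted)]


def empirical_quantile(values, q):
    """Nearest-rank quantile: smallest value with at least a fraction q at or below it."""
    ordered = sorted(values)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]


# Hypothetical distances in metres on a branch with a 500 m maximum distance.
example = normalized_abs_errors([100.0, 200.0], [110.0, 190.0], 500.0)

# Thirty made-up normalized errors: 0.001, 0.002, ..., 0.030.
errors = [i / 1000 for i in range(1, 31)]
bound_95 = empirical_quantile(errors, 0.95)
```

For the thirty made-up errors the 95% bound is the 29th smallest value, so only the single largest error exceeds it.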
Figure 15: Density function of the observed error values calculated from the FDNN model.
6.2. Comparison to benchmark models
We evaluate the proposed FLITC application against the benchmark models, as summarized below:
• Fault detection: The ability of the FLITC application to detect the faulty states is compared with that of an unsupervised method that is used broadly across academia for fault diagnosis and anomaly detection, namely the LSTM Autoencoder. Our proposed model exhibits slightly worse sensitivity than the LSTM Autoencoder in detecting the faulty states (91.4% compared to 91.6%),
• Feeder and branch identification: For the faulty feeder detection (if a fault exists), the best ConvLSTM model derived from hyperparameter tuning (batch size = 200, units = 180, dropout rate = 0.2, hidden layers = 4, window size = 50, and filter size = 256) achieves an accuracy of 87.12%, which is lower than the 90.49% achieved by the FFNN model of the FLITC application. For the branch identification, our model outperforms all the ConvLSTM models in all three different feeders, i.e., 85.4% > 67.43%, 93.77% > 74.65%, and 95.02% > 76.42%,
• Faulty class: Our model outperforms the best ConvLSTM model derived from hyperparameter tuning, which is used as a multi-class classifier. Particularly, the best accuracy achieved by the ConvLSTM (batch size = 150, units = 160, dropout rate = 0.2, hidden layers = 3, window size = 25, and filters = 64) is equal to 83.1%, which is significantly lower than the 94.93% achieved by the FLITC application,
• Distance calculation: Our model outperforms the best ConvLSTM model derived from hyperparameter tuning, which is used as a regressor. Particularly, the lowest RMSE value achieved by the ConvLSTM (batch size = 200, units = 80, dropout rate = 0.2, hidden layers = 2, window size = 25, and filters = 64) is equal to 21.54%, i.e., significantly higher than the 2.45% attained by the FLITC application.
6.3. Sensitivity analysis
Figure 16: Testing accuracy of the FBNN and FCNN models for different levels of measuring devices availability rate.
To explore the robustness of the FLITC application, we conduct a thorough analysis to quantify the performance of the best models in LVDGs against different availability rates of measuring units. For a particular network topology, we define the unit availability rate as the number of nodes in which 3Φ-V measuring units exist, divided by the total number of nodes (we consider that 3Φ-I measurement devices exist at the root of each feeder in all cases). This sensitivity study is essential for distribution system operators' planning and innovation strategies, since it provides the minimum requirements for infrastructure upgrades so as to integrate automatic fault diagnosis and self-healing practices.
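The availability rate defined above, and one possible way of creating a reduced-availability variant of a network, can be sketched as follows. The random selection of the retained nodes, the function names, and the 20-node example are assumptions for illustration; the text does not specify how the node subsets were drawn.

```python
import random


def availability_rate(nodes_with_unit, all_nodes):
    """Nodes equipped with a voltage measuring unit divided by all nodes."""
    return len(set(nodes_with_unit) & set(all_nodes)) / len(all_nodes)


def sample_available_nodes(all_nodes, rate, rng):
    """Randomly keep a share of the nodes as equipped with measuring units."""
    keep = round(rate * len(all_nodes))
    return sorted(rng.sample(all_nodes, keep))


nodes = list(range(1, 21))  # hypothetical 20-node LVDG
kept = sample_available_nodes(nodes, 0.6, random.Random(0))
rate = availability_rate(kept, nodes)
```

Here 12 of the 20 nodes keep their measuring unit, giving a rate of 0.6; repeating the draw with different seeds yields different node subsets for the same rate, which is the source of the sampling variation discussed in this section.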
In the results presented in section 6.1, we quantified the performance of the FLITC application for an availability rate of 100%. Fig. 16 showcases the performance of the models (except for the FFNN, which uses 3Φ-I values) for measuring unit availability rates ranging from 30% to 100%. As can be seen, there is no significant drop in the models' accuracy if the availability rate is higher than 60%. There is a trend in most of the models to have better accuracy with higher availability rates. However, the slight variations in the different models' performance can be attributed to aleatoric uncertainty [61], which is introduced by the variation in the input data (sampling data to create the different versions of the measuring unit availability rate), and thus it cannot be concluded that their performance significantly deteriorates. For the FDNN model, the normalized error limit for the 95% confidence interval reaches up to 23% for a measuring unit availability rate of 30%, compared to 7.21% for 100%.
6.4. Application limitations
The DNN-based approach followed in this work has three limitations that are discussed in this section. First, it requires a large amount of data for training, which are often not available, especially when it comes to real data. To the authors' knowledge, there is no publicly available dataset for fault diagnosis studies in LVDGs, because structural and infrastructure upgrades have only taken place in LVDGs over the last few years. Thus, obtaining curated high-resolution datasets from a real operational environment is not yet feasible. Since real data are unavailable, our method is based on synthetic data, which in turn requires a deep knowledge of the grid parameters to approximate faulty conditions as accurately as possible. In section 5, we have thoroughly presented the data generation process, while making the source code publicly available. Second, due to the supervised nature of the FLITC application, the models' accuracy is guaranteed neither in fault cases that are not included in the training phase nor when changes in the LVDG's topology or measuring infrastructure occur. The preprocessing stage employed in this work, particularly the interpolation and dimensionality reduction techniques, increases the generalizability of the model and its performance robustness, at the cost of also increasing the complexity and the computational time of the models. Last, the authors recognize the limitations of the model in detecting faults in mesh topologies that might exist even at the edge of the LVDGs.
7. Conclusions
The primary focus of this paper is the selection of efficient data-driven DNN-based methods to create an application that can be used as a fault diagnostic tool for smart LVDGs. Cornerstones of the proposed FLITC application are the CWT and CNN models, which are able to handle the large amount of data stemming from the measuring units and detect the fault patterns with high accuracy. Furthermore, the TPE algorithm is used to explore the most suitable hyperparameters for each aspect of the fault diagnosis tool, i.e., the faulty feeder, branch, class, and distance models. The results showcased its efficacy in providing fine-grained and accurate fault diagnosis analytics to the system operators, i.e., an accuracy of 91.4% for fault detection, 93.77% for correct branch identification, and 94.93% for fault type classification, and an RMSE value of 2.45% for location calculation. Further work could be conducted to make the FLITC application applicable in mesh network topologies and to explore both privacy-preserving and resource-constrained methods for fault diagnosis in LVDGs, leveraging the computational resources at the edge.
References
[1] A. E. Saldaña-González, A. Sumper, M. Aragüés-Peñalba, M. Smolnikar, Advanced distribution measurement technologies and data applications for smart grids: A review, Energies 13 (14) (2020). doi:10.3390/en13143730.
URL https://www.mdpi.com/1996-1073/13/14/3730
[2] S. Barja-Martinez, M. Aragüés-Peñalba, Í. Munné-Collado, P. Lloret-Gallego, E. Bullich-Massagué, R. Villafafila-Robles, Artificial intelligence techniques for enabling big data services in distribution networks: A review, Renewable and Sustainable Energy Reviews 150 (2021) 111459.
[3] L. Cipcigan, P. Taylor, Investigation of the reverse power flow require-
ments of high penetrations of small-scale embedded generation, IET Re-
newable Power Generation 1 (3) (2007) 160–166.
[4] M. Liserre, T. Sauter, J. Y. Hung, Future energy systems: Integrating re-
newable energy sources into the smart power grid through industrial elec-
tronics, IEEE industrial electronics magazine 4 (1) (2010) 18–37.
[5] E. Shittu, A. Tibrewala, S. Kalla, X. Wang, Meta-analysis of the strategies
for self-healing and resilience in power systems, Advances in Applied
Energy (2021) 100036.
[6] S. A. Arefifar, Y. A.-R. I. Mohamed, T. H. El-Fouly, Comprehensive op-
erational planning framework for self-healing control actions in smart
distribution grids, IEEE Transactions on Power Systems 28 (4) (2013)
4192–4200.
[7] E. Shirazi, S. Jadid, Autonomous self-healing in smart distribution grids
using agent systems, IEEE Transactions on Industrial Informatics 15 (12)
(2018) 6291–6301.
[8] M. Ramos, M. Resener, P. Oliveira, D. P. Bernardon, Self-healing in
power distribution systems, in: Smart Operation for Power Distribution
Systems, Springer, 2018, pp. 37–70.
[9] J.-H. Teng, A direct approach for distribution system load flow solutions,
IEEE Transactions on power delivery 18 (3) (2003) 882–887.
[10] J. Mora-Florez, J. Meléndez, G. Carrillo-Caicedo, Comparison of impedance based fault location methods for power distribution systems, Electric power systems research 78 (4) (2008) 657–666.
[11] R. Salim, K. Salim, A. Bretas, Further improvements on impedance-based
fault location for power distribution systems, IET Generation, Transmis-
sion & Distribution 5 (4) (2011) 467–478.
[12] F. Aboshady, D. Thomas, M. Sumner, A new single end wideband
impedance based fault location scheme for distribution systems, Electric
Power Systems Research 173 (2019) 263–270.
[13] P. N. Ayambire, Q. Huang, D. Cai, O. Bamisile, P. O. K. Anane, Real-time
and contactless initial current traveling wave measurement for overhead
transmission line fault detection based on tunnel magnetoresistive sen-
sors, Electric Power Systems Research 187 (2020) 106508.
[14] M. A. Aftab, S. S. Hussain, I. Ali, T. S. Ustun, Dynamic protection of
power systems with high penetration of renewables: A review of the trav-
eling wave based fault location techniques, International Journal of Elec-
trical Power & Energy Systems 114 (2020) 105410.
[15] A. Zidan, M. Khairalla, A. M. Abdrabou, T. Khalifa, K. Shaban, A. Ab-
drabou, R. El Shatshat, A. M. Gaouda, Fault detection, isolation, and ser-
vice restoration in distribution systems: State-of-the-art and future trends,
IEEE Transactions on Smart Grid 8 (5) (2016) 2170–2185.
[16] A. Bahmanyar, S. Jamali, A. Estebsari, E. Bompard, A comparison frame-
work for distribution system outage and fault location methods, Electric
Power Systems Research 145 (2017) 19–34.
[17] R. Agrawal, D. Thukaram, Identification of fault location in power dis-
tribution system with distributed generation using support vector ma-
chines, in: 2013 IEEE PES Innovative Smart Grid Technologies Con-
ference (ISGT), IEEE, 2013, pp. 1–6.
[18] W. Fei, P. Moses, Fault current tracing and identification via machine
learning considering distributed energy resources in distribution net-
works, Energies 12 (22) (2019) 4333.
[19] N. Sapountzoglou, J. Lago, B. Raison, Fault diagnosis in low voltage
smart distribution grids using gradient boosting trees, Electric Power Sys-
tems Research 182 (2020) 106254.
[20] A. Yadav, A. Swetapadma, A novel transmission line relaying scheme
for fault detection and classification using wavelet transform and linear
discriminant analysis, Ain Shams Engineering Journal 6 (1) (2015) 199–
209.
[21] F. Perez, E. Orduna, G. Guidi, Adaptive wavelets applied to fault classifi-
cation on transmission lines, IET generation, transmission & distribution
5 (7) (2011) 694–702.
[22] M. Shafiullah, M. A. Abido, Z. Al-Hamouz, Wavelet-based extreme
learning machine for distribution grid fault location, IET Generation,
Transmission & Distribution 11 (17) (2017) 4256–4263.
[23] H. Jiang, J. J. Zhang, W. Gao, Z. Wu, Fault detection, identification, and
location in smart grid based on data-driven computational methods, IEEE
Transactions on Smart Grid 5 (6) (2014) 2947–2956.
[24] R. Tervo, J. Karjalainen, A. Jung, Predicting electricity outages caused by
convective storms, in: 2018 IEEE Data Science Workshop (DSW), IEEE,
2018, pp. 145–149.
[25] L. Souto, J. Meléndez, S. Herraiz, Fault location in low voltage smart grids based on similarity criteria in the principal component subspace, in: 2020 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), IEEE, 2020, pp. 1–5.
[26] D. Chakraborty, U. Sur, P. K. Banerjee, Random forest based fault classification technique for active power system networks, in: 2019 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE), IEEE, 2019, pp. 1–4.
[27] Y. D. Mamuya, Y.-D. Lee, J.-W. Shen, M. Shafiullah, C.-C. Kuo, Application of machine learning for fault classification and location in a radial distribution grid, Applied Sciences 10 (14) (2020) 4965.
[28] N. Sapountzoglou, J. Lago, B. De Schutter, B. Raison, A generalizable and sensor-independent deep learning method for fault detection and location in low-voltage distribution grids, Applied Energy 276 (2020) 115299.
[29] P. Rai, N. D. Londhe, R. Raj, Fault classification in power system distribution network integrated with distributed generators using CNN, Electric Power Systems Research 192 (2021) 106914.
[30] W. Li, D. Deka, M. Chertkov, M. Wang, Real-time faulted line localization and PMU placement in power systems through convolutional neural networks, IEEE Transactions on Power Systems 34 (6) (2019) 4640–4651.
[31] R. H. Salim, K. R. C. de Oliveira, A. D. Filomena, M. Resener, A. S. Bretas, Hybrid fault diagnosis scheme implementation for power distribution systems automation, IEEE Transactions on Power Delivery 23 (4) (2008) 1846–1856.
[32] Z. Galijasevic, A. Abur, Fault location using voltage measurements, IEEE Transactions on Power Delivery 17 (2) (2002) 441–445.
[33] P. Stefanidou-Voziki, N. Sapountzoglou, B. Raison, J. Dominguez-Garcia, A review of fault location and classification methods in distribution grids, Electric Power Systems Research 209 (2022) 108031.
[34] H. Falaghi, M.-R. Haghifam, M. O. Tabrizi, Fault indicators effects on distribution reliability indices, in: CIRED 2005-18th International Conference and Exhibition on Electricity Distribution, IET, 2005, pp. 1–4.
[35] V. Rizeakos, A. Bachoumis, FLITC-application.
URL https://github.com/tombax7/FLITC-application
[36] A. Mar, P. Pereira, J. F. Martins, A survey on power grid faults and their origins: A contribution to improving power grid resilience, Energies 12 (24) (2019). doi:10.3390/en12244667.
URL https://www.mdpi.com/1996-1073/12/24/4667
[37] J. Hare, X. Shi, S. Gupta, A. Bazzi, Fault diagnostics in smart microgrids: A survey, Renewable and Sustainable Energy Reviews 60 (2016) 1114–1124. doi:10.1016/j.rser.2016.01.122.
[38] H. Markiewicz, Voltage characteristics of electricity supplied by public distribution systems (Jun 1999).
[39] IEEE recommended practice for monitoring electric power quality, IEEE Std 1159-1995 (1995) 1–80. doi:10.1109/IEEESTD.1995.79050.
[40] IEEE guide for service to equipment sensitive to momentary voltage disturbances, IEEE Std 1250-1995 (1995). doi:10.1109/IEEESTD.1995.122634.
[41] J. J. Grainger, Power system analysis, McGraw-Hill, 1999.
[42] P. J. Schmid, Dynamic mode decomposition of numerical and experimental data, Journal of Fluid Mechanics 656 (2010) 5–28.
[43] A. Meintjes, A. Lowe, M. Legget, Fundamental heart sound classification using the continuous wavelet transform and convolutional neural networks, in: 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), IEEE, 2018, pp. 409–412.
[44] R. Miao, Y. Gao, L. Ge, Z. Jiang, J. Zhang, Online defect recognition of narrow overlap weld based on two-stage recognition model combining continuous wavelet transform and convolutional neural network, Computers in Industry 112 (2019) 103115.
[45] H.-P. Nguyen, J. Liu, E. Zio, A long-term prediction approach based on long short-term memory neural networks with automatic parameter optimization by tree-structured Parzen estimator and applied to time-series data of NPP steam generators, Applied Soft Computing 89 (2020) 106116.
[46] J. Bergstra, R. Bardenet, Y. Bengio, B. Kégl, Algorithms for hyper-parameter optimization, Advances in Neural Information Processing Systems 24 (2011).
[47] L. F. Rodrigues, M. C. Naldi, J. F. Mari, Comparing convolutional neural networks and preprocessing techniques for HEp-2 cell classification in immunofluorescence images, Computers in Biology and Medicine 116 (2020) 103542.
[48] S. F. Chevtchenko, R. F. Vale, V. Macario, F. R. Cordeiro, A convolutional neural network with feature fusion for real-time hand posture recognition, Applied Soft Computing 73 (2018) 748–766.
[49] J. Lago, F. De Ridder, P. Vrancx, B. De Schutter, Forecasting day-ahead electricity prices in Europe: the importance of considering market integration, Applied Energy 211 (2018) 890–903.
[50] F. He, J. Zhou, L. Mo, K. Feng, G. Liu, Z. He, Day-ahead short-term load probability density forecasting method with a decomposition-based quantile regression forest, Applied Energy 262 (2020) 114396.
[51] M. N. Fekri, H. Patel, K. Grolinger, V. Sharma, Deep learning for load forecasting with smart meter data: Online adaptive recurrent neural network, Applied Energy 282 (2021) 116177.
[52] J. Lago, K. De Brabandere, F. De Ridder, B. De Schutter, Short-term forecasting of solar irradiance without local telemetry: A generalized model using satellite data, Solar Energy 173 (2018) 566–577.
[53] Dataport, Pecan Street Inc. (Nov 2020).
URL https://www.pecanstreet.org/dataport/
[54] N. Sapountzoglou, Détection et localisation des défauts dans les réseaux de distribution basse tension en présence de production décentralisée, Ph.D. thesis, Université Grenoble Alpes (ComUE) (2019).
[55] H. Nguyen, K. P. Tran, S. Thomassey, M. Hamad, Forecasting and anomaly detection approaches using LSTM and LSTM autoencoder techniques with the applications in supply chain management, International Journal of Information Management 57 (2021) 102282.
[56] M. Said Elsayed, N.-A. Le-Khac, S. Dev, A. D. Jurcut, Network anomaly detection using LSTM based autoencoder, in: Proceedings of the 16th ACM Symposium on QoS and Security for Wireless and Mobile Networks, 2020, pp. 37–45.
[57] S. Xingjian, Z. Chen, H. Wang, D.-Y. Yeung, W.-K. Wong, W.-c. Woo, Convolutional LSTM network: A machine learning approach for precipitation nowcasting, in: Advances in Neural Information Processing Systems, 2015, pp. 802–810.
[58] W. Luo, W. Liu, S. Gao, Remembering history with convolutional LSTM for anomaly detection, in: 2017 IEEE International Conference on Multimedia and Expo (ICME), IEEE, 2017, pp. 439–444.
[59] J. Bergstra, D. Yamins, D. Cox, Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures, in: International Conference on Machine Learning, PMLR, 2013, pp. 115–123.
[60] E. Taskesen, distfit - Probability density fitting (1 2020).
URL https://erdogant.github.io/distfit
[61] J. Gawlikowski, C. R. N. Tassi, M. Ali, J. Lee, M. Humt, J. Feng, A. Kruspe, R. Triebel, P. Jung, R. Roscher, et al., A survey of uncertainty in deep neural networks, arXiv preprint arXiv:2107.03342 (2021).
... Employing the equivalent transfer matrix for the cable's second section, B l−x , in conjunction with (11), the voltages and currents at the cable's receiving end are determined by (15). ...
... Step 4: Solving for r f By substituting (15) into (16), m can be represented in a compact format in (19). To calculate the sum of the squares of the voltage difference of the sheaths, the column vector m is multiplied by its conjugate transpose, as shown in (20) and (21). ...
Traditional fault location methods face challenges in grids with high penetration of inverter-based resources (IBRs) due to the non-sinusoidal and distorted nature of the fault currents. This paper presents a fault location method for such grids. It is designed for medium voltage cables, employing sheath current measurements and combining time-domain and phasor-domain analysis. The proposed method addresses the limitations of existing techniques by utilizing low-frequency local measurements. Simulation results on a 10kV test system demonstrate the method’s accuracy, achieving fault location errors below 1.81%. Sensitivity analysis further confirms the method’s robustness against measurement uncertainties. The proposed algorithm’s implementation on an embedded device using IEC 61850-9-2 protocol showcases its practical feasibility for real-time fault location.
... This effectiveness is evaluated using accuracy, precision-recall, and F1-score. The data-driven approach for fault detection and location in active distribution networks is presented in [100]. This application harnesses the widespread availability of high-resolution measurement devices throughout the network. ...
Fault detection is a critical process in ensuring the reliability and safety of modern power systems. With the evolution of the grid in the modern paradigm, the detection of faults has become complex, particularly due to the incorporation of renewable-based distributed generation. Thus, a detailed study of the available fault detection methods needs to be addressed while developing a fault detection model for the modern power system. In line with this, the paper provides a comprehensive overview of different fault detection techniques. The fundamentals of each technique have been elucidated in detail. Moreover, a thorough comparative analysis of each technique has been done on the basis of several terms to highlight the strengths and shortcomings of each method. Based on the review of these techniques, several prospective research gaps have been identified which will help researchers to advance the study of power system protection.
Accurate fault localization is critical for ensuring reliable power supply in active distribution networks, yet conventional state estimation (SE)-based methods fail to differentiate authentic fault responses from measurement distortions due to uncertainties in fault parameters. To overcome this limitation, a robust three-phase SE-driven fault localization methodology is proposed. First, a measurement transformation-based SE model is built for fault conditions, leveraging real-time voltage phasor measurements and pseudo-measurements derived from pre-fault SE results. Then, a robust fault SE model is built using the quadratic-constant-based generalized maximum likelihood estimation, solved through the iteratively reweighted least squares algorithm that postpones phasor measurement weight updates until after initial iterations to prevent residual contamination. Furthermore, a fault localization algorithm is proposed through the systematic traversal of candidate buses, where each potential fault localization is assessed by performing robust fault SE with the fault current injected into this bus. The matching index is designed, accounting for the weight disparity of different types of measurements and measurement placement. Extensive simulations on a 33-bus unbalanced distribution network validate the method’s effectiveness under various measurement noise levels, fault resistances and incorrect data severity. The approach maintains comparable accuracy to conventional SE under normal operating conditions, while it exhibits superior robustness against measurement anomalies and effectively preserves fault localization reliability when confronted with incorrect data.
Transmission lines are vital for delivering electricity over long distances, yet they face reliability challenges due to faults that can disrupt power supply and pose safety risks. This research introduces a novel approach for fault detection and classification by analyzing voltage and current patterns across transmission line phases. Leveraging a comprehensive dataset of diverse fault scenarios, various machine learning algorithms—including Random Forest (RF), K-Nearest Neighbors (KNN), and Long Short-Term Memory (LSTM) networks—are evaluated. An ensemble methodology, RF-LSTM Tuned KNN, is proposed to enhance detection accuracy and robustness. Results indicate that RF-LSTM Tuned KNN achieves a remarkable accuracy of 99.96% on a multi-label dataset, outperforming RF (97.50%) and KNN (96.55%). In binary classification, KNN attains the highest accuracy of 99.85%, closely followed by RF at 99.72%. This methodology provides significant advancements in fault detection capabilities, offering valuable insights for improving grid reliability and stability, and ensuring a more resilient power supply.
This paper provides a comprehensive and systematic review of fault diagnosis methods based on artificial intelligence (AI) in smart distribution networks described in the literature. For the first time, it systematically combs through the main fault diagnosis objectives and corresponding fault diagnosis methods for a smart distribution network from the perspective of combined signal processing and artificial intelligence algorithms. The paper provides an in-depth analysis of the advantages and disadvantages of various signal processing techniques and intelligent algorithms in different fault diagnosis tasks, focusing on the impact of different data dimensions on the effect of fault diagnosis. This paper points out that data security issues and the question of how to combine expert domain knowledge with artificial intelligence technology are essential directions for the future development of fault diagnosis in smart distribution network.
The high penetration rate of distributed energy resources, even at the edge of the distribution grids, drastically changes the operational status quo, mainly due to their highly intermittent nature. In order to ensure the uninterrupted electricity supply of the end-consumers, the fast and accurate response to fault events is of critical importance for the operators. This paper proposes a data-driven fault location identification and type classification application based on ConvLSTM models, which leverages the proliferation of advanced measurement devices in the distribution networks and can locate the exact position of the fault and classify it into eleven different types. These models grasp the spatiotemporal characteristics of the three-phase voltage and current timeseries measurements stemming from the field devices, increasing the operators' visibility of their networks in real-time conditions. The results obtained with synthetic data showcase the efficacy of this application, with accuracy in faulty feeder detection reaching 96% and in fault type classification exceeding 88%.
The evolution of the conventional power systems to smart grids has changed the way to conceive and operate them. The part of the grid evolving the most is the distribution grid where the installation of additional sensors and actuators has increased its observability and controllability. These have enabled the development of more accurate and automated processes including some critical ones such as the fault detection, isolation and restoration techniques. In this direction, unconventional methods, e.g. artificial intelligence, have been increasing in popularity over the last years. In this paper, fault location and fault classification methods are reviewed for both medium–voltage and the until recently unexplored case of low–voltage distribution grids. Different methods applied for both fault location and fault classification are being classified by the implemented technique. Such methods are explained and analyzed providing the main advantages and disadvantages of each category. Additionally, the research trends in both fields are analyzed and state–of–the–art methods from each category are thoroughly compared. Finally, the research gaps are identified.
Artificial intelligence techniques lead to data-driven energy services in distribution power systems by extracting value from the data generated by the deployed metering and sensing devices. This paper performs a holistic analysis of artificial intelligence applications to distribution networks, ranging from operation, monitoring and maintenance to planning. The potential artificial intelligence techniques for power system applications and needed data sources are identified and classified. The following data-driven services for distribution networks are analyzed: topology estimation, observability, fraud detection, predictive maintenance, non-technical losses detection, forecasting, energy management systems, aggregated flexibility services and trading. A review of the artificial intelligence methods implemented in each of these services is conducted. Their interdependencies are mapped, proving that multiple services can be offered as a single clustered service to different stakeholders. Furthermore, the dependencies between the AI techniques with each energy service are identified. In recent years there has been a significant rise of deep learning applications for time series prediction tasks. Another finding is that unsupervised learning methods are mainly being applied to customer segmentation, buildings efficiency clustering and consumption profile grouping for non-technical losses detection. Reinforcement learning is being widely applied to energy management systems design, although more testing in real environments is needed. Distribution network sensorization should be enhanced and increased in order to obtain larger amounts of valuable data, enabling better service outcomes. Finally, the future opportunities and challenges for applying artificial intelligence in distribution grids are discussed.
Electricity load forecasting has been attracting research and industry attention because of its importance for energy management, infrastructure planning, and budgeting. In recent years, the proliferation of smart meters and other sensors has created new opportunities for sensor-based load forecasting on the building and even individual household level. Machine learning approaches such as Recurrent Neural Networks (RNNs) have shown great successes in load forecasting, but these approaches employ offline learning: they are trained once and miss the opportunity to learn from newly arriving data. Moreover, they are not well suited for handling concept drift; for example, their predictive performance will degrade if the load changes due to the installation of new equipment. Consequently, this paper proposes Online Adaptive RNN, an approach for load forecasting capable of continuously learning from newly arriving data and adapting to new patterns. RNN is employed to capture time dependencies while the online aspect is achieved by updating the RNN weights according to new data. The performance is monitored; if it degrades, online tuning is activated to adapt the RNN hyperparameters to changes in data. The proposed approach was evaluated with data from five individual homes: the results show that the proposed approach achieves higher accuracy than the standalone offline long short term memory network and five other online algorithms. Moreover, the time to learn from new samples is only a fraction of the time needed to retrain the offline model.
Making appropriate decisions is indeed a key factor to help companies facing challenges from supply chains nowadays. In this paper, we propose two data-driven approaches that allow making better decisions in supply chain management. In particular, we suggest a Long Short Term Memory (LSTM) network-based method for forecasting multivariate time series data and an LSTM Autoencoder network-based method combined with a one-class support vector machine algorithm for detecting anomalies in sales. Unlike other approaches, we recommend combining external and internal company data sources for the purpose of enhancing the performance of forecasting algorithms using multivariate LSTM with the optimal hyperparameters. In addition, we also propose a method to optimize hyperparameters for hybrid algorithms for detecting anomalies in time series data. The proposed approaches will be applied to both benchmarking datasets and real data in fashion retail. The obtained results show that the LSTM Autoencoder based method leads to better performance for anomaly detection compared to the LSTM based method suggested in a previous study. The proposed forecasting method for multivariate time series data also performs better than some other methods based on a dataset provided by NASA.
Anomaly detection aims to discover patterns in data that do not conform to the expected normal behaviour. One of the significant issues for anomaly detection techniques is the availability of labeled data for training/validation of models. In this paper, we propose a hybrid approach based on a Long Short Term Memory (LSTM) autoencoder and a One-class Support Vector Machine (OC-SVM) to detect anomaly-based attacks in an unbalanced dataset, by training the models using only examples of normal classes. The LSTM-autoencoder is trained to learn the normal traffic pattern and the compressed representation of the input data (i.e. latent features), which is then fed to an OC-SVM. The hybrid model overcomes the shortcomings of the separate OC-SVM, namely its low capability to operate with massive and high-dimensional datasets. Additionally, we perform our experiments using the most recent dataset (InSDN) of Intrusion Detection Systems (IDSs) for SDN environments. The experimental results show that the proposed model provides a higher detection rate and reduces the processing time significantly. Hence, our method provides great confidence in securing SDN networks from malicious traffic.
The integration of advanced measuring technologies in distribution systems allows distribution system operators to have better observability of dynamic and transient events. In this work, the applications of distribution grid measurement technologies are explored in detail. The main contributions of this review are: (a) a comparison of eight advanced measurement devices for distribution networks, based on their technical characteristics, including reporting periods, measuring data, precision, and sample rate; (b) a review of the most recent applications of micro-Phasor Measurement Units, Smart Meters, and Power Quality Monitoring devices used in distribution systems, considering different novel methods applied for data analysis; and (c) an input-output table that relates measured quantities from micro-Phasor Measurement Units and Smart Meters needed for each specific application found in this extensive review. This paper aims to serve as an important guide for researchers and engineers studying smart grids.
Fault location with the highest possible accuracy has a significant role in expediting the restoration process after any kind of fault in power distribution grids. This paper provides fault detection, classification, and location methods using machine learning tools and advanced signal processing for a radial distribution grid. The three-phase current signals, one cycle before and one cycle after the inception of the fault, are measured at the sending end of the grid. A discrete wavelet transform (DWT) is employed to decompose the three-phase current signal. Standard statistical techniques are then applied to the DWT coefficients to extract the useful features. Among many features, mean, standard deviation (SD), energy, skewness, kurtosis, and entropy are evaluated and fed into the artificial neural network (ANN), multilayer perceptron (MLP), and extreme learning machine (ELM) to identify the fault type and its location. During the training process, all types of faults with variations in the loading and fault resistance are considered. The performance of the proposed fault locating methods is evaluated in terms of mean absolute percentage error (MAPE), root mean squared error (RMSE), Willmott’s index of agreement (WIA), coefficient of determination (R2), and Nash-Sutcliffe model efficiency coefficient (NSEC). The time it takes for training and testing is also considered. The proposed method, which combines discrete wavelet transforms with machine learning, is a very accurate and reliable method for fault classification and location in both balanced and unbalanced radial systems. 100% fault detection accuracy is achieved for all types of faults. Except for the slight confusion of three line to ground (3LG) and three line (3L) faults, 100% classification accuracy is also achieved. The performance measures show that both MLP and ELM are very accurate and comparable in locating faults.
The method can be further applied for meshed networks with multiple distributed generators. Renewable generations in the form of distributed generation units can also be studied.
This paper presents a survey of the literature on the strategies to enhance the resilience of power systems while shedding light on the research gaps. Using a deductive methodology on the literature covering the resilience of power systems, we reviewed more than two hundred peer-reviewed articles spanning the 2010–2019 decade. We find that there is a vacuum on the level of integration that considers the interdependence of local or decentralized decision making in an adaptive power system. This gap is widened by the absence of policies to enhance resilience in power networks. While there is significant coverage and convergence of research on algorithms for solving the multi-objective problem in optimization routines, there are still uncharted territories on how to incorporate system degradation while designing these self-restoration systems. We posit that a shift to a smarter, cleaner and more resilient power network requires sustained investments rather than disaster-induced responses.
Fault detection is the critical stage of the relaying system, and its successful completion in minimum time is expected for fault clearance. With the increasing usage of distributed generators (DGs) in a distribution network, the conventional relaying methods are becoming inappropriate due to changing fault current levels. This paper presents a deep learning algorithm, i.e. a Convolutional Neural Network (CNN), customized for fault classification in distribution networks integrated with DGs. This is the first time that a CNN has been used for fault detection using raw, sampled data of three-phase voltage and current signals of various fault classes and the no-fault class. 10-fold cross-validation is used to demonstrate the performance of the proposed model in terms of different metrics such as accuracy, sensitivity, specificity, precision, and F1 score. The proposed model has attained an average 10-fold cross-validation accuracy of 99.52% for all the tested fault cases. This featureless proposed method has been compared with conventional approaches from the literature and has shown better performance in terms of accuracy and computation burden. Further, a similar fault study is conducted on a mixed transmission line and a distribution network with PV as DG using the proposed method, with accuracies of 99.92% and 99.97%, respectively.