A review on the design and optimization of antennas using
machine learning algorithms and techniques
Hilal M. El Misilmani | Tarek Naous | Salwa K. Al Khatib
Department of Electrical and Computer
Engineering, Beirut Arab University,
Debbieh, Lebanon
Hilal M. El Misilmani, Department of
Electrical and Computer Engineering,
Faculty of Engineering, Beirut Arab
University, P.O. Box 11-5020 Beirut, Riad
El Solh, 1107 2809, Debbieh, Lebanon.
This paper presents a focused and comprehensive literature survey on the use
of machine learning (ML) in antenna design and optimization. An overview of
the conventional computational electromagnetics and numerical methods used
to gain physical insight into the design of the antennas is first presented. The
major aspects of ML are then presented, with a study of its different learning
categories and frameworks. An overview and mathematical briefing of regres-
sion models built with ML algorithms is then illustrated, with a focus on those
applied in antenna synthesis and analysis. An in-depth overview on the differ-
ent research papers discussing the design and optimization of antennas using
ML is then reported, covering the different techniques and algorithms applied
to generate antenna parameters based on desired radiation characteristics and
other antenna specifications. Various investigated antennas are sorted based
on antenna type and configuration to assist the readers who wish to work with
a specific type of antennas using ML.
antenna design, computational electromagnetics, machine learning, neural networks, regression
Over the past few decades, the art of machine learning
(ML) has taken the world by storm with its pervasive
applications in automating mundane tasks and offering
disruptive insights across all walks of science and engi-
neering. Though arguably still in its infancy, ML has all
but revolutionized the technology industry. ML practi-
tioners have managed to alter the foundations of count-
less industries and fields of study, including lately the
design and optimization of antennas. In the light of the
Big Data era the world is experiencing, ML has gar-
nered a lot of attention in this field. ML shows great
promise in the field of antenna design and antenna
behavior prediction, whereby the significant accelera-
tion of this process can be achieved while maintaining
high accuracy.
Known for their complex shapes, antennas typically
do not have closed-form solutions. Computational electromagnetics (CEM) methods are applied to model the interaction of electromagnetic fields with antennas using
Maxwell's equations. Approximate solutions are usually
used to gain physical insight into the design of the
antenna. With the advancements in numerical methods,
integral equations were used to solve linear antennas.
Later on, with the advancements in computers, it became
possible to solve Maxwell's equations using integral and
differential equation solvers. The method of moments (MoM) was then introduced to also solve the integral equations. For a more complicated antenna structure,
additional unknowns are added to the equations. Differ-
ential equation solvers were then developed with a sim-
pler implementation even though they contain a larger
number of unknowns. Memory and CPU usage are among the main drawbacks of the integral and differential equation solvers, since both scale with the size of the antenna. Fast integral equation solvers were then developed, in which the integral equations are solved using iterative methods with reduced memory requirements.
Received: 22 January 2020 Revised: 18 May 2020 Accepted: 23 June 2020
DOI: 10.1002/mmce.22356
Int J RF Microw Comput Aided Eng. 2020;e22356. © 2020 Wiley Periodicals LLC
The most widely known CEM methods in antenna design can be classified into numerical methods and high-frequency methods. Three numerical analysis methods commonly used in antenna simulations and testing are the finite difference time domain (FDTD) method, the finite element method (FEM), and MoM. Using the physical optics approximation method, the radiation field of high-frequency reflector antennas can also be obtained. Typically, most of the work involving antenna simulations requires solving partial differential equations, with defined boundary conditions, using computers. High-frequency methods include current-based physical optics (PO) and field-based geometric optics (GO). Other methods are also found, such as the generalized multipole technique (GMT), the multiple multipole program (MMP), the conjugate gradient method (CGM), and the transmission line matrix method (TLM).
The most widely used commercial CEM software packages for antenna design and simulation are ADS, HFSS, CST, and IE3D. Each of these tools lacks some important features. For instance, 3D structures cannot be modeled using ADS, structures with finite details cannot be simulated using IE3D, and the execution time of HFSS and CST is high and increases with the size of the antenna structure.
Due to the inherent nonlinearities of antennas, ML has been considered thoroughly as a complementary method to CEM in designing and optimizing various types of antennas, for several advantages that will be discussed further in this paper. ML is a large area within artificial intelligence (AI), as shown in Figure 1, that focuses on getting useful information out of data, which explains why ML has been frequently associated with statistics and data science. Indeed, the data-driven approach of ML has allowed us to design systems like never before, taking the world steps closer to building truly autonomous systems that can match, compete with, and sometimes outperform human capabilities and intuition. However, the success of ML approaches relies heavily on the quality, quantity, and availability of data, which can be challenging to obtain in certain cases. From an antenna design perspective, these data need to be acquired, if not already available, since no standardized dataset for antennas, such as the ones available for computer vision, is yet available. This can be achieved by simulating the desired antenna over a wide range of values using CEM simulation software. Based on the obtained results, a dataset can be created and divided into training, cross-validation, and test sets, for the purpose of training an ML model and validating whether this model succeeds in generalizing to new inputs. At this point, it is up to the designer's clairvoyance and expertise to know how to diagnose the model to improve performance. Some common steps to follow in this regard would be to plot the learning curves and to check the values for the bias and variance. Typically, a large part of optimizing a model's performance depends on the intuition of the designer, specifically when using neural networks, where the best possible architecture and hyper-parameters need to be found for optimal performance.
FIGURE 1 Relationship between artificial intelligence, machine learning, and deep learning
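As a minimal sketch of the dataset preparation described above, the following Python snippet shuffles a set of samples and divides it into training, cross-validation, and test sets. The `split_dataset` helper, the 70/15/15 ratio, and the toy geometry grid are assumptions made for illustration; in practice each sample would be paired with the response obtained from a CEM simulation.

```python
import random

def split_dataset(samples, train_frac=0.7, cv_frac=0.15, seed=0):
    """Shuffle samples and split them into train, cross-validation, and test lists.

    The default 70/15/15 ratio is an illustrative assumption, not a recommendation.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(train_frac * len(shuffled))
    n_cv = int(cv_frac * len(shuffled))
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]   # whatever remains is held out for testing
    return train, cv, test

# Hypothetical (length_mm, width_mm) geometries standing in for simulated designs.
samples = [(length_mm, width_mm)
           for length_mm in range(10, 20)
           for width_mm in range(5, 15)]
train, cv, test = split_dataset(samples)
print(len(train), len(cv), len(test))
```

Keeping the test set untouched until the end is what allows it to indicate whether the trained model generalizes to new inputs.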
This paper presents and investigates the use of ML in
antenna design and optimization and provides a comprehensive survey of the antenna designs found in the
literature that have employed different ML techniques. It
serves as a guide to researchers in the antenna commu-
nity with minimal ML expertise seeking to employ this
technology in their work. The different antenna design
papers investigated are sorted according to the type and
category of the antenna, which makes it simpler for
readers interested in beginning research on antenna
design and optimization using ML.
The rest of this paper is organized as illustrated in
Figure 2 and as follows: a detailed overview of the CEM
methods is presented in Section 2. Section 3 presents an
overview on ML covering the different categories of
learning, in addition to ML frameworks and applications.
Section 4 investigates the regression models built with
ML algorithms and used for antenna design. Section 5 presents the in-depth overview on the different works in the
literature discussing the design and optimization of
antenna parameters using ML. Section 6 presents another
aspect of the literature, where ML was used to enhance
different types of optimization algorithms in designing
antennas. Concluding remarks, challenges, and future directions follow in Section 7. A list of most of the acronyms used in this paper is also presented in Table 1.
Using central-difference approximations, FDTD discretizes the space and time partial derivatives of the time-dependent Maxwell's equations. The computational domain is represented by a grid of points with boundary conditions, and physical quantities are obtained from the field equations through post-processing. In FEM, linear equations are formed by meshing the computational domain and applying the weighted residual method. As for MoM, the computational area is split into various segments. Each segment is then meshed and evaluated using basis functions. The current of each segment and the strength of each moment are studied using Green's functions.
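To illustrate how central differences turn Maxwell's equations into update rules, the following is a minimal one-dimensional FDTD sketch in Python. It assumes free space, normalized field units, a Courant number of 0.5, a Gaussian soft source, and perfectly conducting walls at both ends; the function name and all parameter values are hypothetical choices for illustration and are not taken from the surveyed works.

```python
import math

def fdtd_1d(n_cells=200, n_steps=150, courant=0.5, source_cell=100):
    """Normalized 1D FDTD (leapfrog) in free space with a soft Gaussian source."""
    ez = [0.0] * n_cells          # electric field at integer grid points
    hy = [0.0] * (n_cells - 1)    # magnetic field at half grid points
    for step in range(n_steps):
        # Central difference of Ez in space advances Hy by one time step.
        for k in range(n_cells - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # Central difference of Hy in space advances Ez; the end cells stay at 0.
        for k in range(1, n_cells - 1):
            ez[k] += courant * (hy[k] - hy[k - 1])
        # Soft source: add a Gaussian pulse in time at a single cell.
        ez[source_cell] += math.exp(-0.5 * ((step - 30) / 8.0) ** 2)
    return ez

ez = fdtd_1d()
print(max(abs(v) for v in ez))
```

With these settings the pulse splits into two waves traveling away from the source and does not reach the walls within the simulated time, so boundary reflections play no role in this toy run.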
Nevertheless, all of these methods suffer from several drawbacks that affect their results. For FDTD, the accuracy of computation is affected by reflections from the boundary. Truncation techniques can be used to reduce these reflections; however, the truncation also affects the accuracy of the computations. For high-order absorbing boundary conditions, the time and memory resources needed increase as the computational domain gets larger. Many methods were developed to remedy some of these drawbacks. For instance, a perfectly matched layer (PML) can be used to decrease the reflections by absorbing EM waves. However, this comes at the cost of increasing the required CPU time and computational domain. Staircase approximation was also proposed for discretization, but it also increases the reflection and affects the accuracy of computation. Another approach is to use 3D FDTD, which employs two different time step increments; however, strong electric fields largely affect the stability of this method. Other methods are also found, such as semi-implicit schemes (SIS), the sub-cell algorithm, the FDTD-alternating direction implicit method, dimensional-finite-difference time-domain method, the domain decomposition-Laguerre-FDTD method, and Runge-Kutta higher-order FDTD. Although each one of these methods has its own advantages, such as enhanced accuracy and reduced CPU time, most of them are still considered time consuming. They also have difficulties in modeling thin wires and frequency-dependent materials, and suffer from dispersion.
FIGURE 2 Diagrammatic view of the organization of this survey
As for FEM, which is widely used in modeling waveguides, Yagi-Uda antennas, horn antennas, and vehicular antennas, it also suffers from certain drawbacks. For instance, as a result of its unstructured mesh, large radiation problems are difficult to model using FEM, as they require excessive computation that could result in computational errors. Several methods have been proposed with FEM to remedy some of its drawbacks. For instance, a direct FE solver has been proposed for better accuracy with 3D structures, but it also suffers from CPU time and memory storage requirements. Dual prime, which can be used with 3D structure problems, Vivaldi arrays, and other array problems, has a faster convergence time but also suffers from a trade-off between accuracy and computational cost. Element tearing and interconnecting full-dual-primal has also been proposed for the analysis of large-scale 3D problems, but it also suffers from memory and CPU time requirements. Despite its parallelization difficulty, the finite element-boundary integral-multilevel fast multipole algorithm (FE-BI-MLFMA) has also been proposed and used in biomedical and space applications, in addition to antenna arrays, as a result of its efficiency and accuracy. Other methods are also found, such as non-conforming FETI and the domain decomposition based preconditioner (FE-BI-MLFMA) algorithm, but they also suffer from memory requirements and have difficulty working with lossless 3D objects with high permittivity and permeability. Additionally, FEM also has difficulties modeling thin wires.
As for MoM, errors can occur as a result of the choice of the testing and basis functions. Typically, many issues are associated with MoM, such as low-frequency breakdown and singularity. In addition, MoM is not efficient for inhomogeneous and composite structures. Although some solutions are found for several drawbacks of MoM, such as the use of pre-conditioners to solve the low-frequency breakdown, recovery to solve the charge cancelation problem, and the multi-resolution approach to improve the spectrum of MoM, the computational cost, memory, and CPU time required can be further enhanced. MoM is also considered computationally expensive since it requires solving dense systems of equations to solve the integral equations.
TABLE 1 List of acronyms
Acronym   Definition
ANN       Artificial neural network
BR        Bayesian regularization
BRANN     Bayesian regularized artificial neural network
CEM       Computational electromagnetics
DE        Differential evolution
FDTD      Finite difference time domain
FEM       Finite element method
FFBP      Feed forward backpropagation
GA        Genetic algorithm
GD        Gradient descent
GPR       Gaussian process regression
K-NN      K-Nearest neighbors
LASSO     Least absolute shrinkage and selection operator
LBE       Learning-by-example
LM        Levenberg-Marquardt
LR        Linear regression
ML        Machine learning
MLP       Multi-layer perceptron
MoM       Method of moments
MoM-LP    Method of moments based on local periodicity
MSE       Mean squared error
PIFA      Planar inverted-F antenna
PSO       Particle swarm optimization
RBF       Radial basis function
RPROP     Resilient backpropagation
SGD       Stochastic gradient descent
SIW       Substrate integrated waveguide
SOM       Self-organizing map
SVM       Support vector machines
SVR       Support vector regression
VSWR      Voltage standing wave ratio
In conjunction with the standard CEM methods, artificial neural networks (ANNs) can be used to minimize the energy function obtained by FEM. Due to their stability, ANNs have also been used as a solution to MoM in Reference 53. Taking advantage of today's advances in distributed computing, ANNs can be used to efficiently solve large and complex EM problems, as well as integral equations, due to their parallel and distributed processing capabilities. For the solution of EM problems, ANNs were also used with FDTD and proved to increase the computational speed. For instance, they were used in Reference 55 to provide a global modeling approach for the design of microwave and millimeter-wave circuits, in a much faster approach than the traditional FDTD.
Generally speaking, ANN models possess characteristics that are beneficial in solving EM problems. They are characterized by their ability to approximate nonlinear input-output mappings, which optimizes the relation between the input data and the required output, their adaptivity to changes in the environment, their uniformity of analysis and design, and their neurobiological analogy.
One of ML's major advantages in this field is the reduction of the large computational times found in the presented CEM techniques, especially when several parameters are to be optimized, or when a large structure is to be designed. The formulations of several antenna geometries, especially those with innovative structuring, complex geometries, or nonlinear loads, are still difficult to treat analytically with known antenna theories, especially since some of them still suffer from low accuracy. ML can be applied to model and predict scattering problems and to analyze and optimize antennas in real time. ANNs can be easily realized using several available frameworks, implemented on high-performance computers, and can efficiently model electromagnetic structures in much less time, with very low computational resources and negligible degrees of error. In the antenna design sense, where closed-form solutions are hard to find, ML can be the perfect solution to eliminate the time consumed in trial-and-error simulations when optimizing geometrical parameters to achieve some specific design requirements such as the desired radiation characteristics, especially if some of these characteristics are to be modified in real time.
Although the idea behind ML dates back several decades, recent times have witnessed an unanticipated surge of interest in ML algorithms. This interest has been stimulated by the large availability of data in the digital age the world has been witnessing, the access to high performance computing, and the better mathematical formulation and comprehension of learning techniques. Having revolutionized many aspects of research and industry, ML has seen multiple breakthroughs, such as deep reinforcement learning and generative adversarial networks (GANs). Although some ML algorithms, specifically deep neural networks (DNNs), are perceived as "black box" tools, they work very well in practice and have outperformed some well-disciplined approaches.
3.1 | Categories of learning
ML can be generally divided into three key categories: supervised learning, unsupervised learning, and reinforcement learning, shown in Figure 3.
3.1.1 | Supervised learning
Supervised learning is a learning task in which a model generalizes from a set of labeled input-output pairs to consequently make predictions on unseen input. There is a distinction between training and testing data in supervised learning, where training samples are associated with labels or targets which the test samples are missing. Supervised learning can be divided into two parts:
Regression: It is a supervised learning problem in which data are used to predict real-valued labels of unseen data. Regression algorithms include linear regression (LR), kernel ridge regression, support vector regression (SVR), and least absolute shrinkage and selection operator (LASSO).
Classification: In classification, the goal is to label data from a finite set of classes. Binary classification refers to classification based on a set of two classes, and multi-class classification refers to classification based on a set of three or more classes.
3.1.2 | Unsupervised learning
After receiving an unlabeled dataset, an unsupervised learning model predicts certain labels for new data. Unlike the case in supervised learning, there is no distinction between train and test data in unsupervised learning. Two learning problems are recognized in unsupervised learning:
Clustering: Often used for large datasets, clustering is a learning problem that aims to identify regions or groups within these datasets.
Dimensionality reduction: Also known as manifold learning, it is the process of reducing the dimensions in which data are represented while maintaining some principal features of the initial representation.
3.1.3 | Reinforcement learning
Reinforcement learning is a learning paradigm in which the learner, also referred to as the agent, actively interacts with the learning environment to achieve a common goal. Used in control theory, optimization, and cognitive sciences, this paradigm depends on the notion of rewards given to the agent in amounts proportional to its achievements, which the agent aims to maximize. A model that is widely adopted in this field is the Markov decision process (MDP), which represents the environment and the interactions with it. Since the transition and reward probabilities do not rely on the entire history of the model and only on its current state, the model is considered Markovian.
3.2 | Machine learning frameworks
Numerous open-source frameworks are available to apply machine and deep learning concepts for solving real world problems. These platforms, which are based on optimized code written in Python, R, Java, or other programming languages, offer flexible and fast usage of several algorithms, thus making them essential tools in research and development. These libraries include, but are not limited to, Microsoft CNTK and many others. In addition to these, off-the-shelf tools such as the WEKA software are available for people with domain expertise but minimal ML experience, who would only have the task of acquiring data and tuning the hyperparameters.
3.3 | Applications
There is an abundance of areas where ML can have an
impact ranging from molecular dynamics for predicting
atomic behavior,
to serving as an analysis tool in
or building reliable financial predic-
In the realm of electrical and computer engineer-
ing, a plethora of works presented in the literature can
be found where ML has contributed to the enhancement
of previous systems, or in finding new approximate solu-
tions to recurring problems. These techniques have also
been widely employed in communication technology,
among which we mention: deep learning based detec-
tion and decoding,
antenna selection in MIMO,
wireless and cellular networks,
cognitive radios,
wireless sensor networks,
FIGURE 3 The three main categories of ML. ML, machine learning
Regression algorithms are the essential tools needed when applying ML in the design of antennas. By using these algorithms and a dataset of considerable size, a model representing the mapping function of the nonlinear relationship between the antenna's geometrical parameters and characteristics can be derived. The most widely used ML algorithms for antenna design are ANNs and SVR. Other regression methods that are less widely used are LR, LASSO, Gaussian process regression (GPR), and Kriging regression. This section provides a mathematical briefing of these ML algorithms that are applied in antenna design.
4.1 | Linear regression
Considered to be one of the simplest regression algorithms, LR is a statistical tool used to trace a linear relationship between some variables and their respective numeric target values. For an unknown stochastic environment, we consider a set of labeled examples $\{(x_i, y_i)\}_{i=1}^{N}$ for the goal of building a model
$$f_{w,b}(x) = wx + b \qquad (1)$$
where $N$ is the size of the set, $x_i$ is the $D$-dimensional vector of example $i = 1, \ldots, N$, $y_i \in \mathbb{R}$ is the numeric target value, $w$ is a $D$-dimensional vector of unknown, but fixed parameters, and $b$ is a real number. For the model to produce the most accurate prediction of $y$, the optimal values of $w$ and $b$ need to be reached. To that end, we consider the following cost function to be minimized:
$$J(w,b) = \frac{1}{N}\sum_{i=1}^{N}\left(f_{w,b}(x_i) - y_i\right)^2 \qquad (2)$$
This squared error loss function represents the average loss, or empirical risk, obtained after applying the model to the training data. It accounts for the average penalty for mispredicting the examples $i = 1, \ldots, N$. The gradient descent (GD) optimization algorithm is used to minimize the cost function. GD is used in LR to iteratively find the minimum of the function by gradually taking steps in the direction of the negative gradient. The first step of GD is calculating the partial derivative of the cost function with respect to every parameter as follows:
$$\frac{\partial J}{\partial w} = \frac{2}{N}\sum_{i=1}^{N} x_i\left(wx_i + b - y_i\right) \qquad (3)$$
$$\frac{\partial J}{\partial b} = \frac{2}{N}\sum_{i=1}^{N}\left(wx_i + b - y_i\right) \qquad (4)$$
where the partial derivatives were calculated using the chain rule. The parameters $w_0$ and $b_0$ are initialized to zero. It is worth noting that the correct initialization of parameters is integral to the success of the optimization algorithm. After initializing the parameters, the training data $(x_i, y_i)$ are iterated through, where in each iteration the parameters are updated as follows:
$$w_i = w_{i-1} - 2\alpha\, x_i\left(w_{i-1}x_i + b_{i-1} - y_i\right) \qquad (5)$$
$$b_i = b_{i-1} - 2\alpha\left(w_{i-1}x_i + b_{i-1} - y_i\right) \qquad (6)$$
where $\alpha$ denotes the learning rate, and $w_i$ and $b_i$ denote the respective values of $w$ and $b$ after using the training example $(x_i, y_i)$. The algorithm stops iterating when the values of the parameters remain relatively constant upon the end of an epoch, where an epoch is a pass over all training examples.
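The procedure described above (zero initialization, one update per training example, and stopping once an epoch leaves the parameters essentially unchanged) can be sketched in Python as follows. The scalar-input simplification, the function name `fit_linear_sgd`, the learning rate, and the tolerance are assumptions made for illustration only.

```python
def fit_linear_sgd(xs, ys, alpha=0.1, epochs=500, tol=1e-9):
    """Fit y ~ w*x + b for scalar x with per-example gradient steps on squared error."""
    w, b = 0.0, 0.0                        # parameters initialized to zero
    for _ in range(epochs):
        w_old, b_old = w, b
        for x, y in zip(xs, ys):
            error = (w * x + b) - y        # prediction error on this example
            w -= alpha * 2.0 * error * x   # step against d(error^2)/dw
            b -= alpha * 2.0 * error       # step against d(error^2)/db
        # Stop when a full epoch leaves both parameters essentially unchanged.
        if abs(w - w_old) < tol and abs(b - b_old) < tol:
            break
    return w, b

xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2.0 * x + 1.0 for x in xs]           # noiseless toy data generated with w=2, b=1
w, b = fit_linear_sgd(xs, ys)
print(w, b)
```

Because the toy data are noiseless, the recovered parameters approach the generating values; with simulated antenna data the same loop would instead settle near the least-squares fit.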
4.2 | Least absolute shrinkage and selection operator
The LASSO algorithm, also known as sparse linear regression, integrates L1 regularization and mean-squared error with a linear model. L1 regularization is known to result in a sparse solution, where sparsity refers to having parameters with an optimal value of zero. Thus, this algorithm can be used for feature selection. The LASSO estimate is defined by
$$\min_{w,b}\ \frac{1}{N}\sum_{i=1}^{N}\left(f_{w,b}(x_i) - y_i\right)^2 + \lambda\|w\|_1 \qquad (7)$$
where $\lambda$ is the regularization parameter, and $\|w\|_1$ is the L1 norm obtained by $\sum_{j=1}^{D}|w_j|$.
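To make the sparsity effect of the L1 penalty concrete, the sketch below minimizes a mean-squared error plus an L1 term with proximal gradient steps (soft-thresholding). This solver is one common choice and is an assumption here, since the text does not prescribe a particular one; the helper names, step size, and toy data are likewise hypothetical.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero and clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_ista(X, y, lam=0.1, step=0.1, iters=2000):
    """Minimize mean squared error + lam * ||w||_1 by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        errors = [sum(wj * xj for wj, xj in zip(w, row)) + b - yi
                  for row, yi in zip(X, y)]
        grad_w = [2.0 / n * sum(e * row[j] for e, row in zip(errors, X))
                  for j in range(d)]
        grad_b = 2.0 / n * sum(errors)
        # Gradient step on the squared error, then shrinkage from the L1 penalty.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
        b -= step * grad_b                  # the bias is not penalized
    return w, b

# Toy data: the target depends on the first feature only.
X = [[-1.0, 1.0], [-0.5, -1.0], [0.0, 0.0], [0.5, -1.0], [1.0, 1.0]]
y = [3.0 * row[0] for row in X]
w, b = lasso_ista(X, y)
print(w, b)
```

In this toy run the weight of the irrelevant second feature is driven exactly to zero, which is the behavior that makes LASSO usable for feature selection, while the weight of the relevant feature is slightly shrunk by the penalty.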
4.3 | Artificial neural networks
A neurobiological analogy of the brain, an ANN is an ML technique that derives its computing power from the massive interconnections between its "neurons," which are the computing cells, and from its ability to generalize based on experiential knowledge. ANNs are known to be great function approximators and are widely used for regression problems. In general, an ANN consists of an input layer of nodes that is not counted since no computations occur at this layer, an output layer of computation nodes, and zero or more hidden layers whose computation nodes are referred to as hidden nodes. An example is shown in Figure 4, where a deep neural network with two hidden layers is sketched. The architecture of the network, in terms of the number of layers and the number of nodes at each layer, depends on the algorithm used in the learning process and the desired output of the network.
Consider the following nested function that represents an ANN:
$$y = f_{NN}(x) \qquad (8)$$
The internal functions of the nested function, indexed by the layer $l$, have the following form:
$$f_l(z) = g_l(W_l z + b_l) \qquad (9)$$
where $g_l$ represents an activation function, and $W_l$ and $b_l$ are the weights and biases of layer $l$. Activation functions are fixed non-linear functions used as tools to compute the output of a computation node, which is then fed as input to the subsequent nodes. We present three types of commonly used activation functions:
Logistic function: Also known as the sigmoid function, the logistic function is defined as follows
$$f(x) = \frac{1}{1 + e^{-x}} \qquad (10)$$
As shown in Figure 5, the logistic function saturates and becomes less sensitive to input at high or low values of $x$, while it exhibits sensitivity for values of $x$ near zero.
Hyperbolic tangent function: Also known as the tanh function, shown in Figure 6. It is characterized as
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (11)$$
ReLU function: Typically used in all hidden layers, the ReLU function is a rectified linear unit function shown in Figure 7 and defined as follows
$$\mathrm{relu}(x) = \begin{cases} 0, & x < 0 \\ x, & \text{otherwise} \end{cases} \qquad (12)$$
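The three activation functions listed above can be written in a few lines of Python. This is a minimal sketch using only the standard library; the function names are chosen here for illustration.

```python
import math

def logistic(x):
    """Sigmoid: maps any real x into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Hyperbolic tangent: maps any real x into the interval (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Rectified linear unit: zero for negative x, identity otherwise."""
    return max(0.0, x)

for x in (-5.0, 0.0, 5.0):
    print(x, logistic(x), tanh(x), relu(x))
```

Evaluating the functions far from zero shows the saturation of the logistic and tanh functions, whereas the ReLU output keeps growing linearly for positive inputs.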
To non-linearly estimate the gradient of the cost function of an ANN, which is the cross-entropy loss, we consider a popular, widely used training algorithm called Backpropagation.
Based on GD, Backpropagation is a computational
iterative procedure that aims to find a local minimum of
the cost function. It consists of forward and backward
passes. During the forward passes, the outputs of the acti-
vation functions are computed and stored to be used in
the following pass.
FIGURE 4 Schematic of a deep neural network with two hidden layers
FIGURE 5 Sigmoid function
During backward passes, partial derivatives of the cost function are calculated using the chain rule starting from the final layer to eventually update the parameters. The error is said to be "back propagated" from layer to layer. It is worth noting that the non-convex nature of the cost function in this case implies that a local rather than a global optimum is reached. In Backpropagation, the change $\Delta w_{ji}(k)$ in the weight of a connection between two neurons $i$ and $j$ is given by the following:
$$\Delta w_{ji}(k) = \alpha\,\delta_j x_i + \mu\,\Delta w_{ji}(k-1) \qquad (13)$$
where $x_i$ is the input, $\alpha$ is the learning rate, $\delta_j$ depends on whether the neuron $j$ is a hidden neuron or an output neuron, and $\mu$ is the momentum coefficient.
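The weight change with momentum described above can be sketched for a single connection as follows. The numeric values of the learning rate, the momentum coefficient, the error term `delta_j`, and the input `x_i` are arbitrary assumptions; in a real network `delta_j` would be recomputed by the backward pass at every iteration.

```python
def momentum_update(delta_j, x_i, prev_change, alpha=0.1, mu=0.9):
    """Weight change for one connection: learning-rate term plus momentum term."""
    return alpha * delta_j * x_i + mu * prev_change

change = 0.0
weight = 0.5
for _ in range(3):
    # delta_j and x_i are held fixed here purely for illustration.
    change = momentum_update(delta_j=0.2, x_i=1.0, prev_change=change)
    weight += change
print(weight)
```

Since the error term keeps the same sign over the three iterations, each change is larger than the previous one, which is how the momentum term accelerates movement along a consistent descent direction.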
4.4 | Support vector regression
Support vector machines (SVM), a widely popular modern ML algorithm used for classification, has inspired another algorithm used for regression: SVR. Similar to its classification counterpart, the idea behind SVR is to separate the data points into two sets: points which fit within a predefined tube of width $\epsilon > 0$ and which are not penalized, and points which fall outside this boundary and are thus penalized, as shown in Figure 8. Consider a set of linear hypothesis functions
$$H = \{x \mapsto w \cdot \Phi(x) + b\} \qquad (14)$$
where $\Phi$ is the feature mapping corresponding to a positive definite symmetric kernel function $K$, and $w \cdot \Phi(x)$ is the dot product of the feature mapping $\Phi(x)$ and $w$. Training this model involves reaching optimal values of $w$ and $b$ by minimizing the corresponding cost function. The cost function to be minimized is as follows
$$\frac{1}{2}\|w\|^2 + C\sum_{i=1}^{N}\left|y_i - \left(w \cdot \Phi(x_i) + b\right)\right|_{\epsilon} \qquad (15)$$
where $|\cdot|_{\epsilon}$ denotes the $\epsilon$-insensitive loss as shown in Figure 8, and $C$ is a constant that weights the loss term.
It is worth noting that the choice of the parameter $\epsilon$ plays a role in determining the sparsity and accuracy of the model, where assigning large values to $\epsilon$ results in sparser solutions.
The Gaussian kernel, otherwise known as the radial basis function (RBF), is a kernel $K$ defined over $\mathbb{R}^D$ by
$$K(x, x') = \exp\left(-\frac{\|x' - x\|^2}{2\sigma^2}\right) \qquad (16)$$
for any constant $\sigma > 0$. These kernels are the most commonly used kernels in this and other applications.
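The two ingredients of SVR discussed above, the epsilon-insensitive loss and the Gaussian (RBF) kernel, can be sketched in Python as follows. The function names and default values are illustrative assumptions, and the snippet shows only these building blocks, not a complete SVR solver.

```python
import math

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero for errors inside the epsilon tube, growing linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def rbf_kernel(x, x_prime, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

print(epsilon_insensitive_loss(1.0, 1.05))   # inside the tube, not penalized
print(epsilon_insensitive_loss(1.0, 1.5))    # outside the tube, penalized
print(rbf_kernel([0.0, 0.0], [1.0, 1.0]))
```

Points whose prediction error stays below epsilon contribute nothing to the cost, which is why larger values of epsilon leave fewer points influencing the solution.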
FIGURE 6 Tanh function
FIGURE 7 ReLU function
FIGURE 8 SVR epsilon-bounded data. SVR, support vector regression
4.5 | Gaussian process regression
In GPR, the objective function is considered as a sample of a Gaussian stochastic process. By using the available data samples, the distribution of the function value for new samples can be predicted. Considering a set of labeled examples $\{(x_i, y_i)\}_{i=1}^{n}$, new predictions at a certain input $x^*$ can be obtained by the following
$$\hat{y}(x^*) = \mu + r^{T}R^{-1}(y - I\mu) \qquad (17)$$
where $I$ is an $n \times 1$ vector of ones, $\mu$ is the mean of the predictive distribution, $R_{ij} = \mathrm{Corr}(x_i, x_j)$ is a correlation function with $i, j = 1, 2, \ldots, n$, and $r = [\mathrm{Corr}(x^*, x_1), \mathrm{Corr}(x^*, x_2), \ldots, \mathrm{Corr}(x^*, x_n)]^{T}$.
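A small numerical sketch of the GPR prediction described above is given below for scalar inputs. Several choices are assumptions made for illustration: the Gaussian correlation function with parameter `theta`, the use of the sample mean of the targets for the mean term, and the hand-written linear solver, which stands in for a library routine.

```python
import math

def corr(a, b, theta=1.0):
    """Gaussian correlation between two scalar inputs (an assumed choice)."""
    return math.exp(-theta * (a - b) ** 2)

def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [v] for row, v in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def gpr_predict(xs, ys, x_new):
    """Mean plus correlation-weighted correction from the training residuals."""
    mu = sum(ys) / len(ys)                       # assumption: sample mean used as the mean term
    R = [[corr(a, b) for b in xs] for a in xs]   # n x n correlation matrix
    r = [corr(x_new, a) for a in xs]             # correlations with the new input
    weights = solve(R, [y - mu for y in ys])     # R^{-1} (y - mean)
    return mu + sum(ri * wi for ri, wi in zip(r, weights))

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.8, 0.9, 0.1]
print(gpr_predict(xs, ys, 1.5))
```

At a training input the predictor reproduces the corresponding training target, since the vector of correlations then coincides with a column of the correlation matrix.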
4.6 | Kriging regression
A less widely used regression method in the design of antennas is the Kriging regression algorithm. In this method, the relationship between auxiliary variables and a target is modeled using known values of auxiliary variables. This algorithm can be defined as follows
$$\hat{z}(s_0) = \sum_{k=0}^{p}\hat{\beta}_k q_k(s_0) \qquad (18)$$
where $\hat{\beta}_k$ are the regression coefficients, $p$ is the number of auxiliary variables $q$, and $\hat{z}(s_0)$ is the predicted value of a target variable given an input $s_0$.
Optimization algorithms are an important aspect of ML, since they make it possible to find the optimal weight and bias parameters of the ML model. Strictly speaking, these algorithms are not ML algorithms, but they are used in the training process of an ML model to minimize the cost function and find the optimal values for the parameters. The optimization algorithm used has a direct impact on the performance of the ML model that results after training, and the choice of this optimizer is based purely on the type and amount of data available and on the designer's intuition. The most commonly used optimizers in antenna design can be listed as follows:
5.1 |Gradient descent
The GD algorithm, also known as batch GD, is known to be
slow since it updates the parameters only once after
calculating the gradient over the whole dataset. Another
drawback of GD is its vulnerability to getting stuck in local
minima before reaching the global minimum of a non-convex
surface. In the era of deep learning, where millions of data
samples may be available, vanilla GD is impractical. Hence,
several other optimization algorithms are available for use
in training an ML algorithm. Alternatives include stochastic
gradient descent (SGD), where the parameters are updated
for each training example, and mini-batch GD, where the
gradient over a small batch of data samples is computed
before each update.
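The difference between the three variants comes down to how many samples feed each gradient evaluation. A minimal sketch, assuming a one-dimensional linear model, a mean squared error cost, and toy noise-free data (all illustrative choices, not taken from the surveyed works):

```python
import random

def gradient(w, b, batch):
    """Gradient of the mean squared error of y = w*x + b over a batch."""
    n = len(batch)
    gw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    gb = sum(2 * (w * x + b - y) for x, y in batch) / n
    return gw, gb

def train(data, batch_size, lr=0.1, epochs=200, seed=0):
    """batch_size == len(data): batch GD; 1: SGD; anything in between: mini-batch GD."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            gw, gb = gradient(w, b, data[start:start + batch_size])
            w -= lr * gw  # one parameter update per batch
            b -= lr * gb
    return w, b

# Toy data generated from y = 2x + 1 (hypothetical values)
data = [(x / 10, 2 * (x / 10) + 1) for x in range(-10, 11)]
for name, size in [("batch GD", len(data)), ("SGD", 1), ("mini-batch GD", 4)]:
    print(name, train(data, size))
```

With the same number of epochs, batch GD performs one update per pass over the data, whereas SGD performs one update per sample.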
5.2 |Adaptive moment estimation
A more recent, computationally efficient, and faster
algorithm is the adaptive moment estimation (ADAM)
algorithm, where the learning rates are computed for each
parameter. This algorithm is especially useful in the
case of optimization problems with relatively large
amounts of data or with large numbers of parameters.
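The per-parameter learning rates come from running estimates of the first and second moments of the gradient. The sketch below follows the commonly published form of the ADAM update; the hyperparameter values and the toy cost function are assumptions chosen for illustration.

```python
import math

def adam_minimize(grad, theta, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Minimal ADAM loop: each parameter gets its own effective step size."""
    theta = list(theta)
    m = [0.0] * len(theta)  # running mean of the gradient (first moment)
    v = [0.0] * len(theta)  # running mean of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad(theta)
        for i in range(len(theta)):
            m[i] = beta1 * m[i] + (1 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
            m_hat = m[i] / (1 - beta1 ** t)  # bias correction for the zero initialization
            v_hat = v[i] / (1 - beta2 ** t)
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta

# Badly scaled toy cost f(a, b) = (a - 3)^2 + 100 * (b + 1)^2
grad = lambda p: [2 * (p[0] - 3), 200 * (p[1] + 1)]
print(adam_minimize(grad, [0.0, 0.0]))
```

Dividing by the square root of the second-moment estimate makes the step size largely insensitive to the scale of each gradient component, which is why the two badly scaled parameters above can share one base learning rate.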
5.3 |Levenberg-Marquardt algorithm
Used for nonlinear least-squares estimation problems,
the Levenberg-Marquardt (LM) algorithm is a batch-form
trust region optimization algorithm that is widely used in
a variety of disciplines to find the local minimum of a
function. The LM algorithm is a mix between Gauss-Newton
iterations and GD, making it faster in convergence than
vanilla GD. It is most efficient in cases of small or
medium-sized patterns and offers a solution for nonlinear
least squares minimization.
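The blend of Gauss-Newton and GD is controlled by a damping factor that shrinks after a successful step and grows after a failed one. A minimal sketch, assuming a two-parameter exponential model y = a·exp(b·x), a multiplicative damping rule with factor 10, and toy noise-free data (all illustrative assumptions):

```python
import math

def residuals_and_jacobian(params, xs, ys):
    """Residuals r_i = a*exp(b*x_i) - y_i and their Jacobian with respect to (a, b)."""
    a, b = params
    r = [a * math.exp(b * x) - y for x, y in zip(xs, ys)]
    J = [(math.exp(b * x), a * x * math.exp(b * x)) for x in xs]
    return r, J

def levenberg_marquardt(xs, ys, params, lam=1e-2, iters=200):
    cost = lambda r: sum(e * e for e in r)
    for _ in range(iters):
        r, J = residuals_and_jacobian(params, xs, ys)
        # Damped normal equations: (J^T J + lam * diag(J^T J)) delta = -J^T r
        h11 = sum(j[0] * j[0] for j in J)
        h12 = sum(j[0] * j[1] for j in J)
        h22 = sum(j[1] * j[1] for j in J)
        g1 = sum(j[0] * e for j, e in zip(J, r))
        g2 = sum(j[1] * e for j, e in zip(J, r))
        a11, a22 = h11 * (1 + lam), h22 * (1 + lam)
        det = a11 * a22 - h12 * h12
        d1 = (-g1 * a22 + g2 * h12) / det  # closed-form solution of the 2x2 system
        d2 = (-g2 * a11 + g1 * h12) / det
        if abs(d1) + abs(d2) < 1e-12:
            break  # step is negligible: converged
        trial = (params[0] + d1, params[1] + d2)
        r_new, _ = residuals_and_jacobian(trial, xs, ys)
        if cost(r_new) < cost(r):
            params, lam = trial, lam / 10  # accept: move toward Gauss-Newton behavior
        else:
            lam *= 10                      # reject: move toward small GD-like steps
    return params

# Toy data from y = 2 * exp(0.5 * x) (hypothetical values)
xs = [0.5 * i for i in range(9)]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
print(levenberg_marquardt(xs, ys, (1.0, 0.3)))
```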
5.4 |Bayesian regularization
Bayesian regularization (BR) is mostly used to train
ANNs instead of error backpropagation, with the main
advantage of bypassing the need for lengthy
cross-validation. Bayesian regularized artificial neural
networks (BRANNs) are known to be difficult to
over-train and over-fit, making them an attractive
choice.
5.5 |Evolutionary algorithms
Evolutionary algorithms are a category of algorithms that
are inspired by the biological behavior and evolutionary
process of living creatures. This class of algorithms,
which includes genetic algorithms (GA), differential
evolution (DE), particle swarm optimization (PSO), and others,
is usually used in global optimization. It has been
extensively used in electromagnetic optimization
and can also be used to train ML models in the case of
antenna design.
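As one concrete member of this class, a basic PSO loop can be sketched as follows. The inertia and acceleration coefficients, the swarm size, and the toy cost function are assumed values for illustration; in antenna work the cost would typically come from a simulation or from the training error of an ML model.

```python
import random

def pso(cost, bounds, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic particle swarm optimization over a box-bounded search space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]        # best position seen by each particle
    best_val = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # best position seen by the swarm
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = cost(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

# Toy cost with its minimum at (1, -2)
cost = lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2
best, value = pso(cost, [(-5.0, 5.0), (-5.0, 5.0)])
print(best, value)
```

No gradient of the cost is needed, which is what makes this family usable when the cost is only available through simulation.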
A large body of literature exists where ML has been used
to design and optimize antennas. Most of these works
have employed ANNs to find direct relationships between
different antenna parameters, such as between the
geometrical properties of the antenna and the antenna
characteristics. As the complexity of an antenna's structure
increases, the number of geometrical parameters increases,
and it becomes hard to derive relationships between these
parameters and values for the resonant frequency and other
radiation characteristics. The usual approach for optimizing
a design is to simulate the antenna repeatedly until the
desired values are reached, a process described as
computationally heavy and time demanding. Instead, ML can
accelerate the design process by providing a mapping between
the desired inputs and outputs. In general, the following
procedure can be adopted:
1. Numeric values corresponding to the desired inputs
with their respective outputs are obtained by
simulations and are stored in a database.
2. Once this dataset is created, it is split into training,
cross-validation, and test sets, where the percentage of
each depends on the number of data samples.
3. An ML algorithm is chosen to learn from this data. The
choice of the algorithm relies on the complexity of the
problem, the amount of data at hand, and the
mathematical formulation of the algorithm.
4. After training and testing the model, it can be used to
predict output values for the desired inputs.
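The four steps above can be sketched end to end as follows. Everything specific in this example is an assumption made for illustration: the `simulate` function is a hypothetical stand-in for a full-wave simulation, the split is 70/15/15, and a k-nearest-neighbor regressor plays the role of the chosen ML algorithm.

```python
import random

def simulate(length_mm):
    """Hypothetical stand-in for a simulator: patch length (mm) -> frequency (GHz)."""
    return 150.0 / length_mm

def knn_predict(train, x, k):
    """Average the outputs of the k training samples closest to x."""
    nearest = sorted(train, key=lambda s: abs(s[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, samples, k):
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in samples) / len(samples)

# Step 1: build the dataset from "simulations"
inputs = [20 + 0.2 * i for i in range(200)]
dataset = [(x, simulate(x)) for x in inputs]

# Step 2: split into training, cross-validation, and test sets
random.Random(0).shuffle(dataset)
train, val, test = dataset[:140], dataset[140:170], dataset[170:]

# Step 3: choose the model setting (here k) using the cross-validation set
best_k = min([1, 2, 3, 5, 8], key=lambda k: mse(train, val, k))

# Step 4: check the error on the test set, then predict for a new desired input
test_mse = mse(train, test, best_k)
print(best_k, test_mse, knn_predict(train, 33.3, best_k))
```

The test set is only used once, after the model setting has been fixed, so that the reported error is not biased by the selection step.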
Although this process demands running simulations to
create a dataset for training, once a model is
obtained, predictions can be made for any desired inputs
at very high speeds, and within very low error margins
compared to simulated results. Several metrics have been
used in the literature to quantify this error, among which are:
The output error: obtained by calculating the difference
between the output obtained by simulations and the
output predicted by the ML model. The unit of this error
depends on the parameter being predicted and could be in
dB, Hz, mm, or any other unit. It is expressed by:

$$e_i = y_i - \hat{y}_i \tag{19}$$

where e_i is the output error, y_i is the desired output, and
ŷ_i is the output predicted by the ML model.
The mean squared error (MSE), expressed by:

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} e_i^2 \tag{20}$$

where N is the number of training samples.
The error percentage, obtained by the following:

$$\text{Error}\,\% = \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100 \tag{21}$$
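A short sketch of the three metrics follows. It assumes that the output error is the desired output minus the predicted one, and that the error percentage is the absolute error relative to the desired output; the sample values are hypothetical.

```python
def output_errors(desired, predicted):
    """Output error per sample: desired output minus predicted output."""
    return [y - y_hat for y, y_hat in zip(desired, predicted)]

def mean_squared_error(desired, predicted):
    """Mean of the squared output errors."""
    errors = output_errors(desired, predicted)
    return sum(e * e for e in errors) / len(errors)

def error_percentages(desired, predicted):
    """Absolute error relative to the desired output, in percent (assumed reading)."""
    return [abs((y - y_hat) / y) * 100 for y, y_hat in zip(desired, predicted)]

simulated = [2.40, 2.45, 2.50]  # hypothetical resonant frequencies in GHz
predicted = [2.41, 2.44, 2.52]
print(output_errors(simulated, predicted))
print(mean_squared_error(simulated, predicted))
print(error_percentages(simulated, predicted))
```

The output error keeps the unit of the predicted quantity, the MSE has that unit squared, and the error percentage is dimensionless, which is why reported values across papers are only comparable when the same metric is used.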
In this section, we review the papers found in the
literature on the design and optimization of antennas
using ML. These papers are sorted according to the type
and configuration of the antennas, starting with the typical
rectangular and circular patch antennas, followed by fractal
and elliptical patch antennas, monopole and dipole antennas,
planar inverted-F antennas (PIFA), substrate integrated
waveguide (SIW) antennas, special patch designs, and
reflectarray antennas, in addition to some other types of
antennas.
6.1 |Microstrip antennas
6.1.1 |Rectangular patch
The simplest form of antenna design using ML is the
design of rectangular patch antennas. In References 124
and 125, multi-layer perceptron (MLP) neural networks
have been used for the synthesis and analysis of
rectangular microstrip antennas. During the synthesis phase,
the height and permittivity of the substrate, denoted by
H and ε_r in Figure 9, in addition to the resonance
frequency of the antenna, are used to generate the length
and width of the rectangular patch, denoted by L and
W in Figure 10. During the analysis phase, the width and
length, in addition to the height and effective permittivity
of the substrate, are used to generate the resonance
frequency. The obtained neural network results were
compared to those available in the literature where an MSE
of 10 was obtained, showing good agreement. In Reference
125, RBF networks were also used in the proposed
approach, where results showed that the RBF network
gave the best results with an error percentage of 0.91%
compared with the MLP approach that reached 3.47%.
FIGURE 9 Substrate configuration
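To make the two mappings concrete, the sketch below uses the widely used transmission-line model of a rectangular patch in both directions: synthesis (frequency, height, permittivity to width and length) and analysis (width, length, height, permittivity to frequency). This closed-form model is only an assumed stand-in for the data that a neural network would learn from; it is not the method of References 124 and 125, and the substrate values are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def eps_eff(w, h, eps_r):
    """Effective permittivity of the microstrip patch (transmission-line model)."""
    return (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 * h / w)

def delta_l(w, h, eps_r):
    """Length extension caused by the fringing fields at each radiating edge."""
    e = eps_eff(w, h, eps_r)
    return 0.412 * h * ((e + 0.3) * (w / h + 0.264)) / ((e - 0.258) * (w / h + 0.8))

def synthesize(f_r, h, eps_r):
    """Synthesis: (f_r, h, eps_r) -> patch width W and length L in metres."""
    w = C / (2 * f_r) * math.sqrt(2 / (eps_r + 1))
    l = C / (2 * f_r * math.sqrt(eps_eff(w, h, eps_r))) - 2 * delta_l(w, h, eps_r)
    return w, l

def analyze(w, l, h, eps_r):
    """Analysis: (W, L, h, eps_r) -> resonant frequency in Hz."""
    return C / (2 * (l + 2 * delta_l(w, h, eps_r)) * math.sqrt(eps_eff(w, h, eps_r)))

# Hypothetical design: 2.4 GHz on a 1.6 mm substrate with eps_r = 4.4
w, l = synthesize(2.4e9, 1.6e-3, 4.4)
print(w * 1e3, l * 1e3)                   # width and length in mm
print(analyze(w, l, 1.6e-3, 4.4) / 1e9)   # recovers the design frequency in GHz
```

Sampling `synthesize` or `analyze` over a range of inputs would yield the kind of input-output pairs on which a synthesis or analysis network is trained.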
Other works have employed SVRs in the design of
rectangular microstrip antennas. The optimization of the
resonant frequency, operation bandwidth (BW), and
input impedance of a rectangular microstrip patch
antenna using SVR was presented in References 126 and
97. The results obtained from the proposed approach
were compared to those obtained from an ANN-based
approach. It was determined that SVR computed the
above design parameters with higher accuracy than the
ANN approach. SVR error percentages reached 1.21% for
the resonant frequency, 2.15% for the BW, and 0.2% for the
input impedance, while the ANN approach achieved error
percentages of 1.67% for the resonant frequency, 1.19% for
the BW, and 1.13% for the input impedance.
Similarly, a rectangular patch antenna was designed
using SVR with a Gaussian kernel in Reference 127. The
training and test sets were obtained by FDTD simulations,
where accurate values of the antenna's performance
parameters such as the resonant frequency, gain, and
voltage standing wave ratio (VSWR) were obtained with
the corresponding values for the width and length of the
rectangular patch. This data was then used to train the
SVR, where the geometrical properties of the antenna
are predicted based on desired values for the performance
parameters that are given as the input.
In Reference 128, the resonance magnitude of a
rectangular patch antenna with a two-section feed was
predicted using SVR. The patch antenna, which has
dimensions of 50.7 × 39.4 mm² and an operating
frequency of 1.8 GHz, has two feeds of about 20 mm in
length. A total of 23 samples were obtained by varying
the widths of the feeds, out of which 21 were used for
training and 2 for testing. During training, the two width
values were taken as input parameters, and the resonance
frequency as the output. Different kernel configurations,
including linear, polynomial of order 3, sigmoid, and radial
kernels, were tested, and the radial kernel was used. It was
shown that the average error between the simulation
results and the predicted values was around 3 dB.
In Reference 129, the slot-position and slot-size of
a rectangular microstrip antenna were predicted using an
SVR model and an ANN model. Two asymmetrical and two
symmetrical slots were inserted on the radiating and
grounding surfaces, respectively, after which the models
were used to predict the slot-size and slot-position.
Analytical results showed that SVR was more accurate and
time efficient than the ANN, where the SVR was ~10%
more accurate and had a speedup rate of 416 times.
Using ANN, the radiation characteristics of a slotted
rectangular patch antenna, including its resonance
frequency, gain, and directivity, have been used to generate
the required slot-size and substrate air-gap dimensions in
Reference 130. Multiple optimization algorithms have
been tested to train the ANN, with the LM algorithm
proving to be the most efficient by providing the most
accurate results in the shortest training time and the least
number of iterations. A prototype antenna was also
fabricated to validate the accuracy of the obtained model.
The measured results showed good agreement with
the simulated and predicted ones, with a low percentage
error of 0.208%.
More recently, in Reference 131, a rectangular patch
antenna was designed using a new ANN architecture.
ANNs, based on feed forward backpropagation (FFBP)
algorithm, resilient backpropagation (RPROP) algorithm,
LM algorithm, and RBF, were trained and tested using
MATLAB. The input parameters of the models were the
dielectric constant, width and length of the patch, and
the substrate thickness, with the output being the reso-
nance frequency of the antenna. After comparing the per-
formance error of the four algorithms used, it was
concluded that the RBF-based network produced the
most accurate results with a value of 3.49886 ×10
the error.
In Reference 132, PSO was used to train an ANN in
the design of rectangular patch antennas. The ANN used
the sigmoid function as an activation function, with input
units representing the resonance frequency and the height
and permittivity of the substrate material, and output
units representing the dimensions of the patch. It was
shown that training required less than 5 minutes of
computation time. In addition, an RBF ANN was also used to
produce the value of the inset feed distance d, shown in
Figure 11, corresponding to the suitable normalized input
resistance. The results produced by the proposed
approach and those obtained from conventional
simulations were determined to be in good agreement with an
MSE value of 0.104.
In Reference 133, a GA was used to train an ANN
instead of backpropagation for the purpose of optimizing
a rectangular microstrip antenna. The ANN was able to
predict the resonant frequency of the antenna, having the
substrate dielectric constant ε_r, the width W and length
L of the patch, and the shorting post position as inputs.
The results obtained showed good agreement with the
experimental ones, with an average error of 0.013545 GHz
for the resonant frequency. However, it was concluded that,
despite optimizing the parameters accurately, using GA to
train the ANN was not very time efficient, and that the same
result could have been achieved in less time by
employing backpropagation instead.
6.1.2 |Circular patch
Another well-known and simple type of microstrip
antenna is the circular patch antenna, shown in
Figure 12. The design of a circular patch antenna with
thin and thick substrates, using ANN, was presented in
Reference 134. The ANN took as input the radius of the
patch and the height and permittivity of the substrate to
generate the resonance frequency using MLP and RBF
networks. The effectiveness of five learning algorithms in
the training of MLPs was investigated: the delta-bar-delta
(DBD), the extended delta-bar-delta (EDBD), the
quick-propagation (QP), the directed random search (DRS),
and the GA. After comparing the training, test, and total
errors of the mentioned algorithms, it was deduced that EDBD
attained the best results with a test error of 2 MHz
compared with 13, 142 and 271 MHz in error for the DBD,
DRS, and GA approaches respectively. As for the RBF-
based network, its learning strategy was used for train-
ing. Additionally, a neural network trained by EDBD
and backpropagation was used to compute the charac-
teristic impedance and the effective permittivity of
asymmetric coplanar waveguide (ACPW) backed with a
In Reference 135, ANNs were used in the design and
determination of the feed position for a circular microstrip
antenna. The first network, an MLP neural model with
two hidden layers, was used to predict the radius a, the
effective radius, and the directivity of the patch. The
inputs were the thickness of the substrate h, the relative
dielectric constant of the substrate, and the resonant
frequency, and the optimization algorithm used was the LM
algorithm. The trained network was tested on 45 various
samples, and the reported
MSE was 9.70 ×10
, 9.80 ×10
, and 7.76 ×10
the respective inputs. The second network, an RBF neu-
ral model with one hidden layer, was used to predict the
input impedance. The input was a representation of the
various radial distances from the center of the patch. The
network, which was trained with 200 input-output pairs,
had an MSE of 2.69 ×10
upon testing.
An MLP ANN was used in Reference 136 to model,
simulate, and optimize multilayer circular microstrip
antennas. Chosen from 11 tested learning algorithms, the
LM algorithm was used to train the network. The
resonance frequency was calculated for arbitrary values
of the patch radius and of the dielectric constants and
thicknesses of the different layers. The results showed good
agreement with reference results, where the average error
percentage of the resonant frequency was 0.35%, 0.065%, 0.43%,
and 0.066% for circular microstrip antenna with and
without cover, spaced dielectric antenna, and microstrip
antenna with two superstrates, respectively.
In Reference 137, the resonance frequency of a circu-
lar patch antenna was also modeled using a conjugate
gradient model of an ANN. A closed form expression of
the resonance frequency of the antenna, based on the cir-
cular patch radius, height, and permittivity of the sub-
strate, was used to generate data for ANN modeling and
testing. Forward modeling and reverse modeling were
used to either predict the resonance frequency of the
antenna or the circular patch radius. A comparison
between the simulated results and the results predicted
by the ANN showed 0.10721% error for the resonant fre-
quency and 0.1956% error for the patch radius.
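The forward and reverse directions can be illustrated with a commonly used cavity-model expression for the dominant-mode resonance frequency of a circular patch. The source does not state which closed-form expression Reference 137 used, so the formula below is an assumption, as are the bisection bounds and the example dimensions.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def resonant_frequency(a, h, eps_r):
    """Forward model: (radius, height, permittivity) -> dominant-mode frequency in Hz."""
    # Effective radius that accounts for fringing fields
    a_eff = a * math.sqrt(1 + 2 * h / (math.pi * a * eps_r)
                          * (math.log(math.pi * a / (2 * h)) + 1.7726))
    return 1.8412 * C / (2 * math.pi * a_eff * math.sqrt(eps_r))

def patch_radius(f_r, h, eps_r, hi=1.0, iters=100):
    """Reverse model: find the radius by bisection (frequency falls as radius grows)."""
    lo = 0.1 * h  # assumed lower bound, above which the forward model is monotonic
    for _ in range(iters):
        mid = (lo + hi) / 2
        if resonant_frequency(mid, h, eps_r) > f_r:
            lo = mid  # frequency too high: the radius must be larger
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical patch: radius 5.25 mm on a 1.588 mm substrate with eps_r = 2.2
f = resonant_frequency(5.25e-3, 1.588e-3, 2.2)
print(f / 1e9)                                  # frequency in GHz
print(patch_radius(f, 1.588e-3, 2.2) * 1e3)     # recovers the radius in mm
```

A reverse-model network replaces the iterative inversion with a single evaluation, which is the practical appeal of training it on data generated by the forward expression.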
FIGURE 10 Rectangular patch antenna with line feed
FIGURE 11 Rectangular patch antenna with insert feed
The optimization of various design parameters of
a circular microstrip patch antenna using ANN
trained with LM algorithm was presented in Refer-
ence 138. A FFBP neural network was used to esti-
mate the following seven parameters: return loss
(RL), VSWR, resonance frequency, BW, gain, direc-
tivity, and antenna efficiency. The input parameters
were the patch radius, in addition to the height and
permittivity of the substrate. Results from testing the
model were in good agreement with simulated results
and achieved an MSE of 9.96 ×10
6.1.3 |Fractal patch
Fractal patch antennas are another type of microstrip
antennas whose design procedure is dominated by
ANNs. In Reference 139, the resonance frequency, RL,
and the gain of a coaxial fed elliptical fractal patch
antenna were calculated using an ANN with the
backpropagation algorithm. IE3D software was used to
generate the dataset for different values of feed positions
and for different iterations of the antenna fractal shape.
These values were used for training a model to find the
position of the feed point of the coaxial feed for optimized
impedance matching of the antenna.
In another work, the optimization of a square fractal
antenna using ANN was presented in Reference 140.
With the aim to reduce the size of the broadband reso-
nance antenna, selected iterated structures resulting from
the ANN were simulated in HFSS to obtain optimal reso-
nance characteristics. Also, the design of quasi-fractal
patch antennas using ANNs has been presented in Refer-
ence 141. Several values of the antenna parameters that
allow operation at a specific resonance frequency have
been obtained by simulations. The dataset was then used
to train the network, which resulted in a prediction
model that provides a mapping between parameters and
frequency of operation.
6.1.4 |Elliptical patch
ANNs were used in two works for the design of elliptical
patch antennas. In Reference 142, ANNs using RBF were
utilized in the design of elliptical microstrip patch
antenna. The resonance frequency for even mode, sub-
strate height and permittivity, and the eccentricity of
elliptical patch were used as input parameters to compute
the resonance frequency for odd mode and the semi-
major axis. When comparing the obtained results with
the results of conventional simulations, the error percent-
age reached as low as 0.006% and 0.043%.
In a different approach, the design and modeling of
an elliptical microstrip patch antenna using ANN was
presented in.
For the purpose of computing the RL
and the gain of the antenna, a FFBP neural network was
trained in MATLAB using the three major axes of the
connected ellipses as input parameters. A dataset was
obtained from CST simulations. The obtained results
were compared to the simulated and measured
results of a fabricated antenna, and good agreement
was revealed, with error values as low as 0.0202 dB for
the gain and 0.2014 dB for the RL.
6.1.5 |Monopole and dipole antennas
The design of a circular monopole antenna was
facilitated using an ANN in Reference 144. The feed-gap, a
design parameter required for the antenna to operate
within a specific frequency band, was calculated using an
ANN trained with a dataset obtained from IE3D
simulations. The model was later tested with five input-output
pairs, and the resultant error percentage was determined
to be between 0.4% and 4.6%.
The LASSO technique, a sparse LR method, was used
in Reference 145 to design a reference dual band double
T-shaped monopole antenna. Five design parameters that
cooperatively represent the shape and structural
geometry of the antenna were repeatedly obtained from HFSS
simulations. After the model was fitted using the LASSO
method, which is characterized by variable selection and
regularization, optimum predicted design parameters
were reached. The resulting model was able to predict
values for 495 616 design points after being trained on a
dataset that consists of only 450 training samples. Even
given the relatively small size of the dataset, the model
was able to analyze a far larger number of
data points in a very brief amount of time without the
need to perform further electromagnetic simulations.
This work was later developed in Reference 146, where
two more ML techniques, namely ANNs and the
k-nearest neighbor (k-NN) algorithm, were used to
optimize the same antenna. It was shown that by using
ANNs or LASSO, better predictions can be achieved than
with the k-NN approach, which had an error percentage
of 2.90%.
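The variable selection property of LASSO can be seen in a small sketch that fits a linear model by cyclic coordinate descent with soft thresholding. The synthetic data (five candidate variables, only two of which influence the output), the penalty value, and the solver itself are assumptions made for illustration and do not reproduce the setup of Reference 145.

```python
import random

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, sweeps=200):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    resid = list(y)  # residual y - X beta, starting from beta = 0
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != beta[j]:
                for i in range(n):
                    resid[i] += X[i][j] * (beta[j] - new)
                beta[j] = new
    return beta

# Synthetic data: the output depends only on variables 0 and 2
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [3 * row[0] - 2 * row[2] for row in X]
beta = lasso_cd(X, y, lam=0.1)
print(beta)
```

The coefficients of the irrelevant variables are driven to exactly zero, while the relevant ones are kept with a small shrinkage toward zero caused by the penalty.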
The first took the microstrip line impedance Z as
input, and the line width as output. The second took Z
and the substrate dielectric constant as input, and the
width and height of the substrate as output. The third
took Z as input, and the substrate dielectric constant, width,
and height as output. The synthesis ANN was further
tested on a printed dipole antenna with integrated balun,
where the results of the proposed neuro-computational
model were compared to those of a developed FDTD
analysis tool. The input parameters to the model were
three voltage standing-wave ratio numbers and two fre-
quencies, and the output was two geometric parameters.
6.1.6 |Planar inverted-F antenna
The optimization of the design parameters of a PIFA with
a magneto-dielectric nano-composite substrate using ANNs
trained with the BR algorithm was presented in Reference
147. The model was trained on two databases obtained
from CST simulations. Taking as inputs the particle radius
and volume fraction of the nano-magnetic material,
different antenna parameters, such as gain, BW, radiation
efficiency, and resonance frequency, can be generated using
neural networks with error percentages close to zero.
Working with the same antenna, the algorithm used in
Reference 147 has been further optimized in
for the
same input and output parameters. In addition, a reverse
technique has been also addressed using ML, for which the
corresponding design space of possible material parameters
can be generated based on given antenna parameters.
6.1.7 |Substrate integrated waveguide
ANNs were used to predict the geometrical parameters of a
SIW patch antenna in Reference 149, taking as inputs the
desired resonance frequency and the RL. A feed-forward MLP
with backpropagation was trained in MATLAB, using a
dataset obtained from HFSS simulations.
In Reference 150, the design of a broadband
millimeter-wave SIW cavity-backed slot (CBS) antenna using ML was
presented. An ML-assisted optimization method (MLOM) and an
ML-assisted method with additional feature (MLOMAF),
which utilizes the population-based metaheuristic
optimization method, were used. HFSS was used in the design
and analysis process of the antenna structures. Following
database initialization, the initial training set was sampled
using Latin hypercube sampling (LHS), and the resultant data
was exploited in building a Gaussian process (GP) surrogate
model. It was shown that the MLOMAF was able to reach
the stopping criterion 12 iterations before the MLOM did,
and that the proposed antenna exhibited notable features
related to BW and ease of fabrication.
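Latin hypercube sampling spreads the initial design points so that every variable is covered evenly. A minimal sketch, where the two design variables and their ranges are hypothetical and the surrogate-model step is not shown:

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Split each variable's range into n_samples equal strata and
    sample every stratum exactly once, in a random order per variable."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across variables
        width = (hi - lo) / n_samples
        columns.append([lo + (s + rng.random()) * width for s in strata])
    return [list(point) for point in zip(*columns)]

# Hypothetical design variables: slot length and slot width in mm
bounds = [(1.0, 5.0), (0.1, 0.5)]
samples = latin_hypercube(10, bounds)
for point in samples:
    print(point)
```

Each sampled point would then be simulated, and the resulting input-output pairs would form the initial training set of the surrogate model.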
6.1.8 |Special patch designs
Special patch antennas have also been
designed in the literature using different ML techniques.
The analysis and design of a frequency re-configurable
planar antenna using an MLP ANN and a self-organizing
map (SOM) neural network, respectively, was presented in
Reference 151. In the analysis phase, the operational
frequency bands of the antenna at different reconfigured
conditions were located by the trained MLP network. In
the design phase, switches to be turned on for a specific
desired frequency response were identified using a SOM
neural network, trained using the Kohonen learning
algorithm. Frequency responses from the output of the MLP
network were used to feed the input layer, whereas the
output of the network was four clusters of frequency
responses, used to approximate the position of the
switches to be turned on.
In Reference 152, the design of a three-layer annular
microstrip ring antenna with pre-specified operational
features was facilitated using ANNs, where structural
design parameters were computed. A dataset of reflec-
tion coefficient vs frequency was used to train an MLP
ANN to generate the geometrical properties of the patch
as well as the physical properties of the substrate. Upon
testing, the root MSE of the model was determined to
be 1%.
In Reference 153, the design of a two-slot rectangular
microstrip patch antenna was facilitated using MLP and
RBF networks trained with different learning
algorithms. The MLP-based networks were trained using
five learning algorithms: LM algorithm, scaled conjugate
gradient backpropagation, Fletcher-Powell CG
backpropagation, gradient descent with momentum, and
adaptive gradient descent. It was determined that the LM
algorithm resulted in the least MSE compared to the
other MLP-based algorithms, but the RBF-based network
resulted in a lower error percentage of 0.09%.
A spiral microstrip antenna has been designed in Ref-
erence 154 using ANN. Antenna parameters were
mapped to characteristics such as the resonance fre-
quency, RL, and VSWR. After training the network on
data samples obtained by the simulator, it was shown
that accurate prediction results may be obtained which
allows bypassing the computational burdens of conven-
tional simulation methods.
In Reference 155, the design of an aperture-coupled
microstrip antenna using an ANN with a hybrid network
architecture was presented. The RBF and the
backpropagation algorithm were combined to develop the
hybrid network. Using the hybrid ANN, the antenna
parameters, including the dimensions of the ground
plane, the aperture, and the radiating element, in
addition to the dimensions of the feed and its position, were
determined based on different resonance frequencies.
The obtained results were compared with those of
standalone backpropagation and RBF models, and the hybrid
ANN showed superior performance with an error
percentage of 0.27%.
The design of a single-feed circularly polarized square
microstrip antenna (CPSMA) with truncated corners has
been facilitated in Reference 156 using an ANN synthesis
model. A total of 5000 data samples were generated by
calculating the resonance frequency, in addition to the
Q-factor of the antenna, by analytical formulations, out of
which 3500 were used for training the model. The LM
algorithm was used for training, which resulted in a
faster and simpler structure. During the synthesis
process, for a desired operating frequency and a given
substrate, the size of the truncated corners can be obtained
for CP operation. An ANN with three hidden layers was
found to give the highest accuracy. The average relative
error and the maximal relative error have been calculated
to test the accuracy of the model. The proposed model
was compared with simulation results. It was shown that
the antenna with the calculated parameters achieved
circular polarization with less than 2 dB of axial ratio,
with some discrepancy in the frequency of operation of
less than 5%. Eight different CPSMAs were also fabricated
and tested, for which the measured results showed an
axial ratio of less than 1 dB, with a discrepancy between
the measured and synthesis values of 3.6% for the
physical dimensions, and 2.3% for the frequency of operation,
with an average relative error of less than 1%.
In Reference 157, the optimization of the design
parameters of a tulip-shaped microstrip patch antenna using
ANN was presented. Taking as input the resonance
frequency band and the RL of the lower and higher
resonance frequencies, the ANN was used to generate the
patch dimensions. Backpropagation was used to train the
ANN in MATLAB using a dataset obtained from HFSS
simulations.
The design of a pentagonal-shaped flexible antenna,
shown in Figure 13, for ultra-wide band (UWB) wireless
applications such as WLAN, 5G, and WiMAX
using ANN was presented in Reference 158. The
aim was to determine the two frequencies representing
the structure BW from the radius of the pentagonal shape.
This approach was used due to the complexity of finding
the non-linear relationship between these parameters and
representing it in an equation, and due to time and cost
concerns. For this, an ANN based on the LM algorithm was
trained and tested using a dataset obtained from the Ansoft
simulator. The error resulting from learning, validation,
and testing was 6%.
In Reference 159, the resonance frequency of rectan-
gular patch antennas printed on isotropic or uniaxially
anisotropic substrate, with or without air gap, was
modeled using ANN. Spectral dyadic Green's function
was used in conjunction with a developed single neural
network. To reduce the computational complexities,
required time, and amount of data needed to maintain
the accuracy of the ANN model, a single matrix was used
to present the effective parameters. Two types of anten-
nas were tested with two ANN models: a circular patch
antenna where the radius of the patch has been deter-
mined based on the resonance frequency, substrate thick-
ness, and permittivity, in addition to a modified PIFA
antenna with a chip resistance where the feed position
was determined based on the input impedance.
In Reference 160, an ANN has been used in the analy-
sis and synthesis of Short-Circuited Ring-Patch Antennas
(SCRP). The importance of the ANN in this work was to
solve the drawbacks of the analytical calculations that do
not accurately model the effect of the thickness and per-
mittivity of the substrate on the resonance frequency of
the antenna in TM
mode. For the training process, the
internal and external radius, in addition to the substrate
permittivity and thickness, were varied to obtain
275 training samples from simulations. During the
training phase, which required less than 1 minute, a least
squares cost function with a backpropagation method was
used. In the analysis case, the resonance frequency was
obtained from the ANN given the other patch dimensions
for 30 test cases. The comparison between the frequency
obtained by the trained ANN and that obtained by
simulations showed a percentage error of less than 0.3%. In
the synthesis case, the importance of using ANN for this
type of antennas was seen in estimating the value needed
for the external radius from other parameters and for a
desired resonance frequency. Usually, the external radius
is estimated via analytical formulations and adjusted by
trial and error using the simulation software. In this case,
the comparison between the external radius obtained by
the trained ANN and the simulations achieved an error
of less than 3.2%. An SCRP has also been fabricated based
on the trained ANN, for which the measured results
were closer to those estimated by the ANN than
to the simulated ones.
FIGURE 12 Circular patch antenna
Some works have explored the design of multiple
patch types at the same time. In Reference 161, a combi-
nation of ANN and adaptive-network-based fuzzy infer-
ence system (ANFIS) was used to calculate the resonance
frequencies of rectangular, circular, and triangular micro-
strip antennas. The MLP ANN, trained with the BR algo-
rithm, was utilized to compute the resonance frequencies
of the antennas. As for the ANFIS, it was trained using
the hybrid learning algorithm, which is a combination of
least square method and backpropagation. The inputs to
this hybrid model were the geometrical parameters of the
patch and the dielectric constant of the substrate,
whereas the calculated output was the resonant frequen-
cies. The MLP ANN was used in computing the resonant
frequencies, and the ANFIS was used in compensating
for the inaccuracies in the ANN results. Finally, the
results were compared to those of the single neural
models, conventional methods, and approaches based on
GA and Tabu search algorithm (TSA). It was determined
that the proposed hybrid method provided results of
higher accuracy.
6.2 |Reflectarray antennas
Several works focusing on the accelerated design and
analysis of very large reflectarrays, such as the one shown in
Figure 14, using ML, have been presented in the
literature. Among these, many have employed ANNs as the
main design and analysis tool. In Reference 162, an ANN
was utilized in the optimization of the microstrip patch
unit cell parameters of broadband reflectarray antennas
with Malta cross unit cell configuration, as shown in
Figure 15. To this end, the ANN was used to accurately
characterize the non-linear relationship between the
phase behavior of the patch radiator and its geometric
parameters. The proposed network was an MLP, and the
hyperbolic tangent was chosen as the activation function for
the two-layered network. The model was trained using
the error backpropagation algorithm. It was shown that,
when compared to direct evaluation, the ANN approach
attained results of similar accuracy at an
enhanced speed.
This work was later expanded in Reference 163, where
the results of a modified ANN were compared with those
of a full-wave method of moments based on local
periodicity (MoM-LP) approach. The sigmoid function was used
here as the activation function. The reflection coefficients
corresponding to any re-radiating element of a
reflectarray, as approximated by the two approaches, were in
agreement, further showing that the ANN method can
maintain the desired level of accuracy while significantly
reducing the computational cost.
In Reference 164, an ANN was used to characterize
the elements of a rectangular planar surface reflectarray
composed of 70 × 74 elements, for satellite applications
covering the Eutelsat footprint. Each ANN took as input
the angle of incidence and patch dimensions, in addition
to the resonance frequency as a constant parameter.
Feed-forward MLP topology was used, along with the
backpropagation training algorithm, to optimize the
reflectarray patch dimensions by training the model on a
dataset obtained from MoM-LP-based computations. The
obtained ANN results of the gain pattern and phase distribution
of the reflectarray were compared with those
obtained from MoM-LP computations and showed good
agreement while having a speed up factor of 2 ×10
FIGURE 13 Pentagonal-shaped CPW antenna. CPW, coplanar waveguide
FIGURE 14 Reflectarray with unit cells
An MLP-ANN was used in Reference 165 to optimize
and speed up the design and analysis of
reflectarrays. A model of the reflection phase characteristics
of the unit cell element was first trained and tested using CST.
A dataset of 990 samples was used for training, while
660 data samples were used for testing. In the analysis
model, the edge length of the patch, the ratio of the cavity
to the edge length, the substrate thickness, and the resonance
frequency were taken as inputs.
The LM algorithm was used for training. The MSE was
calculated to be 3.5992 ×10 for training and
4.0192 ×10 for testing. The reconstructed phase variations
and the target ones were highly similar, validating
the efficiency of the proposed model. A reflectarray
with the optimized Minkowski elements was then tested
to validate the overall optimized performance of the
antenna (Figure 15).
The design of reflectarrays composed of second-order
phoenix cells was presented in Reference 166. Fast characterization
of these cells was made possible by using
ANNs, which made it possible to obtain a spherical mapping
that complies with the results obtained by full-wave simulations
with the local periodicity assumption.
A reflectarray antenna using modified Malta-Cross
cells was designed using an ANN in Reference 167.
Starting from a dataset obtained through full-wave simulations,
the ANN was trained using error backpropagation,
resulting in a model that allows the computation of
reflection coefficients from any input value for the geometrical
and re-radiating field parameters of the reflectarray
in both cases of horizontal and vertical
polarizations. The obtained ANN model delivers
high-accuracy predictions with lower memory usage and less
computation time and load.
A contour-shaped reflectarray antenna was analyzed
in Reference 168. A trained ANN is used to predict the
complex reflection coefficient's amplitude and phase by
taking six geometrical parameters, the incident angle in
terms of azimuth and elevation, and the frequency as
inputs. The results were compared to full-wave electro-
magnetic computations and showed great agreement
while having a speed up factor of 700.
Other techniques have also been employed in the
design of reflectarray antennas. Using an advanced
learning-by-example (LBE) method, namely the Kriging
method, the design of high-performance reflectarrays
was presented in Reference 169. The problem of
predicting the scattering matrix of complex reflectarray
elements was addressed using this LBE algorithm that, if
trained on a set of known input-output relationships, can
accurately predict the output for new inputs.
The Kriging method is not only proficient in dealing with
deterministic noiseless processes but can also facilitate
vectorized outputs. A set of preliminary numerical results
was used to validate the accuracy and time-efficiency of
the proposed model. It was confirmed that this method,
while maintaining a prediction error below 5%, allowed
for a time saving of 99.9% when compared to
standard full-wave approaches.
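The works surveyed in this section report computational gains either as a speed-up factor or as a time saving percentage. The two are related by saving = 100 × (1 − 1/speed-up), which the following short sketch evaluates; the function names are illustrative only.

```python
def time_saving_percent(speedup):
    """Percentage of runtime saved when a task runs `speedup` times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 100.0 * (1.0 - 1.0 / speedup)


def speedup_from_saving(saving_percent):
    """Inverse relation: speed-up factor implied by a saving below 100%."""
    if not 0.0 <= saving_percent < 100.0:
        raise ValueError("saving must lie in [0, 100)")
    return 1.0 / (1.0 - saving_percent / 100.0)


# Speed-up factors quoted in this section, expressed as time savings.
for factor in (700, 880):
    print(factor, round(time_saving_percent(factor), 2))

# A 99.9% time saving corresponds to a speed-up of about three orders
# of magnitude.
print(round(speedup_from_saving(99.9)))
```

Expressed this way, a 99.9% time saving and speed-up factors of several hundred describe gains of a comparable order.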
The prediction of the electromagnetic response of
complex-shaped reflectarray elements was presented in
Reference 170, where the authors presented an innovative
LBE method based on Ordinary Kriging to obtain
reliable predictions. Full-wave simulations were used
to generate a training set composed of the elevation
angle, azimuth, operating frequency, and the degrees-of-freedom
(DoF) for each array element, along with the
corresponding field distribution. The relationship
between such parameters is highly non-linear, which
encourages the usage of ML techniques to find an accurate
input/output mapping of these parameters without
having to go through simulations. The authors compared
the performance of their approach to the performance of
SVR and Augmented RBF neural networks used for the
same problem. In addition, several unit-cell shapes were
considered, such as the cross-slot, ring-slot,
and square/rectangular Phoenix shapes. Results showed
that the customized LBE approach achieved lower error
rates for the same number of training examples compared
with SVR and Augmented RBF neural networks.
FIGURE 15 Malta cross patch
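To make the Ordinary Kriging idea more concrete, the following one-dimensional sketch predicts an unseen response as a weighted sum of training responses, with the weights obtained from the bordered covariance system and constrained to sum to one. The Gaussian covariance, its length scale, the sinusoidal toy data, and all names are assumptions for illustration; the multi-dimensional formulation of Reference 170 is not reproduced here.

```python
import math


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def covariance(p, q, length_scale=0.3):
    """Gaussian covariance between two scalar inputs (assumed kernel)."""
    return math.exp(-((p - q) / length_scale) ** 2)


def ordinary_kriging(xs, ys, x_new):
    """Return the prediction at x_new and the kriging weights."""
    n = len(xs)
    # Covariance block bordered by the unbiasedness (sum-to-one) constraint.
    a = [[covariance(xs[i], xs[j]) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [covariance(xi, x_new) for xi in xs] + [1.0]
    weights = solve(a, b)[:n]  # the last unknown is the Lagrange multiplier
    return sum(w * y for w, y in zip(weights, ys)), weights


# Hypothetical training set: five samples of a smooth response.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [math.sin(2.0 * math.pi * x) for x in xs]
estimate, weights = ordinary_kriging(xs, ys, 0.6)
```

In this noiseless setting the predictor reproduces the training responses exactly at the training inputs, which matches the deterministic nature of simulation data.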
A framework that employs an ML technique in its
architecture was proposed in Reference 171, where the
algorithm focuses mainly on improving the antenna performance.
A surrogate model was obtained using
SVMs. SVM was used in References 172-174 to design
shaped-beam reflectarrays and in Reference 175 for modeling
dual-polarized reflectarray unit cells. Using
SVM, the computational burden resulting from the use of
Full-Wave Local-Periodicity for the design and analysis
was reduced. This was tested and confirmed, with an
acceleration factor of 880 achieved compared to simulations
based on the MoM-LP, and with error percentages as
low as 0.43%.
6.3 | Other antenna types
The indirect use of an ANN for predicting the input
impedance of broadband antennas through a parametric
frequency model was proposed in Reference 176. While
the antenna geometry parameters and frequency are rou-
tinely used as inputs of the ANN, the resistance was ini-
tially parametrized by a Gaussian model and the ANN
was later used to reach an approximate non-linear rela-
tionship between the antenna geometry and the model
parameters. This novel method was used to obtain a
smaller network size with a smaller number of hidden
units for an ultimately faster training time. For testing, a
loop-based broadband antenna with three tuning arms
was used, where the results were compared to those of a
direct approach. It was found that the proposed model
was considerably more time efficient, as it required
ten times fewer electromagnetic computations
when training the ANN.
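The benefit of the indirect approach can be seen from a small sketch: once a resistance curve is described by a handful of model parameters, an ANN only has to map the geometry to those parameters instead of to every frequency sample. The single-Gaussian form, the parameter names, the closed-form fit, and the numerical values below are assumptions for illustration; the exact parametrization of Reference 176 is not given in this survey.

```python
import math


def gaussian_resistance(freq, peak, center, width):
    """Assumed single-Gaussian model of input resistance versus frequency."""
    return peak * math.exp(-((freq - center) / width) ** 2)


def fit_gaussian(freqs, values):
    """Recover (peak, center, width) from uniformly spaced, positive samples.

    The logarithm of a Gaussian is a parabola, so the three samples around
    the maximum determine the three parameters in closed form.
    """
    k = max(range(1, len(freqs) - 1), key=lambda i: values[i])
    step = freqs[k + 1] - freqs[k]
    lo, mid, hi = (math.log(values[i]) for i in (k - 1, k, k + 1))
    curvature = lo - 2.0 * mid + hi  # equals -2 * step^2 / width^2
    width = step * math.sqrt(-2.0 / curvature)
    center = freqs[k] + step * (lo - hi) / (2.0 * curvature)
    peak = math.exp(mid + ((freqs[k] - center) / width) ** 2)
    return peak, center, width


# Hypothetical simulated curve: 31 frequency samples (GHz) of resistance.
freqs = [1.0 + 0.1 * i for i in range(31)]
curve = [gaussian_resistance(f, 120.0, 2.43, 0.5) for f in freqs]

# Three numbers now summarize the whole curve and can serve as ANN targets.
peak, center, width = fit_gaussian(freqs, curve)
```

In this toy case, 31 frequency samples are replaced by three training targets per geometry, which is consistent with the smaller network and shorter training time reported above.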
The design of a loop antenna was facilitated using a
competitive learning ANN in Reference 177. The aim was
to determine the physical dimensions for frequencies in
the range of 200 to 300 MHz by calculating the best combination
of conductor thickness and loop radius using a
SOM. Eleven sets of efficiency values over frequency,
corresponding to 11 pairs of loop radius
a and wire radius b, were used to train the SOM. The
SOM was later used to produce the desired set of (a, b)
that has the required radiation efficiency, which was verified
by comparison to theoretical results. The design was
shown to respond well to input parameter changes
of 50%.
An MLP-ANN was used to model and predict the
radar cross section of a non-linearly loaded antenna in
Reference 57. After training the MLP-ANN with
backpropagation, the slope information of the resulting
model was used to optimize the antenna. Theoretical for-
mulations have been proposed for this aim, verified by
numerical simulation results. A nonlinear loaded dipole
antenna was used for simplicity as an example. The harmonic
balance technique was used to calculate
101 data samples for training and 100 data samples for
testing. Comparing the values predicted by the
MLP-ANN with those calculated from the harmonic balance
technique, it was concluded that the proposed
method is accurate and can obtain the required results in
less time.
A multi-grade ANN model was proposed in Reference
179 for the design of finite periodic arrays. To take into
consideration the mutual coupling and the array environment,
this work introduced an innovative approach
where two sub-ANNs were used. The first-grade ANN,
called the element-ANN, provides the non-linear
relationship between the geometrical parameters and the
electromagnetic behavior of the array element, represented
by the coefficients of a certain transfer function (TF),
without considering mutual coupling. The output of this
element-ANN is then fed as the input to the second-grade
ANN, called the array-ANN. The array-ANN is then capable
of producing outputs of the electromagnetic behavior
of the whole array, with mutual coupling considered.
This approach makes it possible to obtain the mapping between the
geometrical parameters of the element and the electromagnetic
response of the whole array, while separating
array and element information. Several array types were
used to verify the effectiveness of the proposed approach,
including a linear phased array, a six-element printed
dipole array, and a U-slot microstrip array. Results
showed training and testing errors smaller than previous
approaches that do not use the multi-grade ANN.
In Reference 180, ANNs were used to optimize the
parameters of a pyramidal horn antenna. The ANN used
RBF as the activation function in its layers and was
trained on data obtained by a full wave simulator. Taking
as inputs the desired frequency of operation and gain, the
ANN generated the required antenna dimensions such as
the height and width of the flared end, the height and
width of the waveguide, and the length of the horn
antenna. Results showed that the trained model can give
very accurate results compared to those obtained by a
simulator with an error percentage as low as 1.3%.
In Reference 181, an ANN was used for analysis and
synthesis simulations of profiled corrugated circular
horns, and then compared with the conventional mode
matching-combined field integral equation (CFIE) technique.
During analysis, the ANN takes as inputs the aperture
radius, the horn length, the corrugation height, the
metal-void ratio, and the number of corrugations
per wavelength. The outputs of the ANN were the
RL and the co- and cross-polar patterns, limited
to the 0° to 40° range with a 2° step. To accelerate the analysis,
several ANNs are used. The input space is formed of
10 hypercubes, with each single hypercube mapped to a
subspace of the output space. This approach has
also been presented in Reference 182. The ANN was then
trained in the synthesis procedure to approximate the
function that relates the main beam width and the
maximum cross-polar level to the corrugated
horn geometrical parameters. The RL was not taken into
consideration as an input to the ANN during analysis,
since with this type of horn, low levels of RL can be easily
obtained. In addition, some of the geometrical parameters
were assumed to be constant, and not varied during
the synthesis process. An example of an optimized profiled
corrugated horn using the proposed ANN was
also fabricated and measured, and the results were compared
with the traditional electromagnetic analysis. The
results showed less accuracy than those
obtained from a careful optimization process. Nevertheless,
the cost and the time needed to design the antenna
as per the required parameters were greatly reduced.
ANNs were also used in the design of a W-Band slotted
waveguide array antenna in Reference 183. The
model was trained, cross-validated, and tested on a dataset
obtained by HFSS simulations. The seven design parameters
that were used as input were the lengths and orientation
angles of the coupling slots, in addition to the length
of the radiating slots. The antenna was later fabricated
using stereolithography 3D printing techniques, and the
measured and simulated results were compared and found
to be in good agreement, with slight errors.
In Reference 184, a novel multibranch ANN modeling
technique was proposed as a solution to the non-uniqueness
problem in the design of antenna arrays. The
non-uniqueness issue can be defined as the case where
the desired output can be mapped to several inputs,
resulting in conflicting output values for similar inputs in
a dataset, and leading to a large training error and poor
ML model accuracy. This work presented a novel technique
based on calculus to provide a solution for the non-uniqueness
problem, where the training data is separated
into several groups after obtaining the data boundary
locations and monotonicity of this data. These groups
can be used separately to train different ANN branches,
forming one multibranch ANN model that can predict
the antenna's geometrical or physical parameters based
on the desired input EM characteristics.
This technique was tested on a short dipole planar
array, where the obtained model had training and test set
errors of 0.25% and 0.28%, respectively, a
significant improvement on the conventional non-multibranch
approach, which had a 20.38% training set error
and a 20.40% test set error. Further testing was made on a
sparse linear dipole array, resulting in 0.37% and 0.68%
training and test set errors, respectively.
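The non-uniqueness problem and the data-splitting remedy can be illustrated with a deliberately simple sketch. A non-monotonic forward response is split at the samples where its slope changes sign, and each monotonic group is then inverted on its own. Linear interpolation stands in here for the per-branch ANNs, and the quadratic toy response and all names are assumptions; the boundary detection used in Reference 184 is not reproduced.

```python
def split_monotonic(xs, ys):
    """Split samples (sorted by x) into runs on which y is monotonic."""
    branches, start = [], 0
    for i in range(1, len(xs) - 1):
        rising_before = ys[i] > ys[i - 1]
        rising_after = ys[i + 1] > ys[i]
        if rising_before != rising_after:  # slope changes sign at sample i
            branches.append((xs[start:i + 1], ys[start:i + 1]))
            start = i
    branches.append((xs[start:], ys[start:]))
    return branches


def invert_branch(branch, y_target):
    """Linearly interpolate x for y_target on one monotonic branch."""
    bx, by = branch
    for k in range(len(bx) - 1):
        y0, y1 = by[k], by[k + 1]
        if y0 != y1 and min(y0, y1) <= y_target <= max(y0, y1):
            return bx[k] + (bx[k + 1] - bx[k]) * (y_target - y0) / (y1 - y0)
    return None  # the target lies outside this branch's range


# Hypothetical geometry parameter x and a non-monotonic EM response y = x^2:
# the same response value is produced by two different geometries.
xs = [i / 10.0 - 2.0 for i in range(41)]
ys = [x * x for x in xs]

branches = split_monotonic(xs, ys)
solutions = [invert_branch(branch, 1.0) for branch in branches]
```

A single inverse model trained on all samples would be asked to return both solutions for the same target, whereas each branch sees a one-to-one relationship and returns one of them.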
As a summary of the in-depth investigation of the different
antenna design papers using ML presented in the
literature and studied in this work, Table 2 lists the
investigated papers, sorted by ML algorithm and by
antenna type and configuration.
Another line of research focuses on embedding ML
models inside an optimization algorithm that is used to
reach optimal parameters and performance of an
antenna. By integrating an ML model within the optimizer,
the design and optimization process is sped
up, since fewer simulations are required. This
section presents the work that has been done in this
regard, along with the various results obtained. A summary
of these antennas along with the used algorithms
can be found in Table 3.
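The general pattern shared by these works can be sketched as a loop in which the search runs on a cheap model and the expensive simulator is called only once per iteration. In the sketch below, a parabola through the three best samples and an exhaustive grid search are simple stand-ins for the ML models and the evolutionary or swarm optimizers used in the surveyed papers; the objective function and all names are hypothetical.

```python
def simulate(x):
    """Hypothetical stand-in for an expensive full-wave simulation."""
    return (x - 0.3) ** 2 + 0.5 * (x - 0.3) ** 4


def quadratic_surrogate(points):
    """Return the parabola through three (x, y) samples (Lagrange form)."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def model(x):
        return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
                + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
                + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

    return model


def surrogate_assisted_search(max_iters=20, grid_size=1001):
    """Minimize `simulate` on [0, 1] while counting simulator calls."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    data = {x: simulate(x) for x in (0.0, 0.5, 1.0)}  # initial simulations
    for _ in range(max_iters):
        best_three = sorted(data.items(), key=lambda item: item[1])[:3]
        model = quadratic_surrogate(best_three)
        candidate = min(grid, key=model)  # cheap search on the surrogate
        if candidate in data:  # the surrogate proposes nothing new: stop
            break
        data[candidate] = simulate(candidate)  # one expensive call per loop
    best_x = min(data, key=data.get)
    return best_x, len(data)


best_x, n_simulations = surrogate_assisted_search()
```

The optimizer evaluates the surrogate about a thousand times per iteration, while the number of simulator calls, returned as `n_simulations`, stays small; this is the source of the savings reported in the works below.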
Interpolation combined with GA for the design
of a UWB ring monopole antenna was presented in References
186 and 187, where fitness function behaviors
such as the BW, the RL, and the central frequency division
(CFD) were estimated. After optimizing those
parameters, a comparison was made between a simulated
antenna and a real prototype manufactured from the
obtained values.
Datasets of different sizes were used in training
the model, and it was determined that the perception of
the behavior of the objectives (BW, RL, and CFD)
improves as the size of the dataset increases.
The design of stacked patch antennas using ANNs
was presented in References 188-191, where a trained
ANN embedded in PSO was used to obtain multi-band
characteristics. After the geometrical parameters of the
antenna had been decided upon by the PSO, a function-mapping
"black box" was built by the ANN, relating the
frequencies and associated bandwidths to the
dimensional antenna parameters. The obtained ANN
results were then compared with measured results of
fabricated antennas, where good agreement was
revealed with an error of order 10
In Reference 192, DE and the Kriging algorithm were
used in the design optimization of an E-shaped antenna.
Six antenna variables were optimized, namely the feed
position, the slot position, the length and width of the
patch, and the slot width and length. The model exhibited
good prediction accuracy once optimal solutions were
reached. It was concluded that the
proposed approach reduced the number of necessary simulations
significantly.
In Reference 193, a new algorithm named SADEA,
based on surrogate model assisted DE (SMA-DE) and GPR,
was proposed. This method was found to be efficient in the
design of antennas; it was tried on three types
of antennas, namely an inter-chip antenna, a
four-element linear array, and a 2-D array. It was shown
that SADEA can speed up the design and optimization
procedure by more than four times compared with DE.
Slot antennas were optimized in References 194 and
195 by using Space Mapping as an optimization engine.
Computational costs were reduced by implementing
Bayesian SVR (BSVR) as the coarse response surface
model instead of relying on electromagnetic simulations.
The parameters of a CPW-fed Slot Dipole Antenna and a
CPW-fed T-shaped Slot Antenna were optimized using
this procedure, which resulted in satisfactory designs.
TABLE 2 Investigated antennas designed using ML

ML algorithm   Antenna type                          References
ANN            Rectangular patch                     [124,125,130-133]
               Circular patch                        [134-138]
               Fractal patch                         [139-141]
               Elliptical patch                      [142,143]
               Monopole antenna                      [144]
               Dipole antenna                        [95,185]
               PIFA                                  [147,148]
               SIW                                   [149]
               Special patch structures              [151-161]
               Reflectarrays                         [162-168]
               Broadband antenna                     [176]
               Loop antenna                          [177]
               Non-linearly loaded dipole antenna
               Antenna arrays                        [179]
               Corrugated circular horn
               Pyramidal horn antenna                [180]
               Slotted waveguide antenna array
SVR/SVM        Rectangular patch                     [97,126-129]
               Reflectarrays                         [171-175]
GPR            SIW                                   [150]
               Reflectarrays                         [169,170]
LASSO          Monopole antenna                      [145,146]

Abbreviations: ANN, artificial neural networks; GPR, Gaussian process
regression; LASSO, least absolute shrinkage and selection operator;
ML, machine learning; PIFA, planar inverted-F antenna; SIW,
substrate integrated waveguide; SVR/SVM, support vector
regression/support vector machines.

TABLE 3 Investigated antennas designed using ML assisted

ML algorithm    Optimization algorithm   Antenna type            References
Interpolation   GA                       Ring monopole antenna   [186,187]
ANN             PSO                      Stacked patch antenna   [188-191]
Kriging         DE                       E-shaped antenna        [192]
GPR             SMA-DE                   Inter-chip antenna      [193]
                                         Four-element array
                                         2-D array
BSVR            Space mapping            Slots antenna           [194,195]

Abbreviations: ANN, artificial neural network; BSVR, Bayesian SVR; DE, differential evolution;
GA, genetic algorithm; GPR, Gaussian process regression; ML, machine learning; PSO, particle
swarm optimization; SMA-DE, surrogate model assisted differential evolution.

The aforementioned discussion has highlighted the
importance and usefulness of using ML techniques in the
design and analysis of many antennas. However, many
challenges arise when adopting this approach instead of
relying on computational electromagnetics. The first
challenge relates to the lack of standardized datasets for
antenna structures that can be used directly to train a
certain model and obtain results. Instead, data need to be
generated by simulations beforehand to create a database
of selected input and output variables. This can be a
tedious and time-consuming task since the initial goal of
using ML in the context of antennas is obtaining an
accelerated design and characterization process while
maintaining high accuracy. Having to go through simulations
to obtain a dataset also translates into a heavier
computational load.
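Because every training sample costs one simulation, the design points at which the simulator is run are usually chosen with some care. The following sketch uses Latin hypercube sampling, one common space-filling scheme; it is shown here as an assumption, since the surveyed works do not prescribe a particular sampling strategy. The design variables and their bounds are hypothetical.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that each variable's range is split into
    n_samples equal strata and every stratum is used exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # pair strata across variables at random
        width = (high - low) / n_samples
        columns.append([low + (s + rng.random()) * width for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical design variables: patch length (mm), patch width (mm),
# and substrate relative permittivity.
bounds = [(10.0, 30.0), (10.0, 40.0), (2.2, 4.4)]
designs = latin_hypercube(50, bounds)
```

Each tuple in `designs` would then be passed to the simulator, and the resulting input-output pairs would form the training database.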
Another aspect to be considered is selecting the best
model hyperparameters that can lead to the optimal
results. It can be clearly deduced that ANNs have domi-
nated this research area by being the most popular choice
of ML technique with many frameworks and software
packages available for their quick and efficient employ-
ment, and by showing resilience in providing highly accu-
rate results compared to conventional CEM approaches.
The importance of ANNs in antenna design becomes more
recognizable as the complexity of the antenna structure
increases. Therefore, it is necessary to investigate what
type of training and optimization method, network archi-
tecture, regularization techniques, choice of activation
functions, and similar factors that affect the model's per-
formance, would be most suitable for each antenna type.
While ML stands out as an attractive antenna design
and analysis tool that can perform predictions with high
accuracy in a shorter period of time compared to simulation
approaches, having to generate the training data
can seem unattractive and demanding. For this reason,
a good approach to address this issue is the development
of antenna design software based purely on ML
models to replace simulators. Such a tool would of course
be limited in terms of designer flexibility and would have
to be targeted at specific antenna types and structures,
but may be extended to cover a large number of antennas.
Having a fast, accurate, and optimized design tool
would allow quick characterization of the selected
antenna type, where the user would only need to input
the design requirement to obtain the geometrical predictions.
However, this software falls short in cases where a
special structure is desired, which forces the designer
to go through simulations.
This paper provided a comprehensive survey on the
usage of ML in antenna design and analysis. ML is
expected to reduce the computational burdens imposed
by simulators and accelerate the design process. The dif-
ferent research papers presented in the literature that
have employed ML algorithms in their design have been
investigated. An overview of a variety of ML concepts
has also been presented, thus equipping readers who are
interested in antenna research but have minimal ML
expertise with the fundamental understanding
needed to use these effective tools in their projects.
1. Volakis JL, Johnson RC, Jasik H. Antenna Engineering Hand-
book. New York: McGraw-Hill; 2007.
2. Sumithra P, Thiripurasundari D. Review on computational
electromagnetics. Adv Electromagn. 2017;6:42-55.
3. Tayli D. Computational Tools for Antenna Analysis and
Design. Electromagnetic Theory Department of Electrical and
Information Technology, Lund University; 2018.
4. Gibson WC. The Method of Moments in Electromagnetics. CRC
Press; 2014.
5. Reineix A, Jecko B. Analysis of microstrip patch antennas
using finite difference time domain method. IEEE Trans
Antenna Propag. 1989;37:1361-1369.
6. Tirkas PA, Balanis CA. Finite-difference time-domain method
for antenna radiation. IEEE Trans Antenna Propag. 1992;40:
7. Maloney JG, Smith GS, Scott WR. Accurate computation of
the radiation from simple antennas using the finite-difference
time-domain method. IEEE Trans Antenna Propag. 1990;38:
8. Volakis JL, Chatterjee A, Kempel LC. Finite Element Method
Electromagnetics: Antennas, Microwave Circuits, and Scatter-
ing Applications. Vol 6. John Wiley & Sons; 1998.
9. Lou Z, Jin JM. Modeling and simulation of broad-band anten-
nas using the time-domain finite element method. IEEE Trans
Antenna Propag. 2005;53:4099-4110.
10. Sarkar TK, Djordjevic AR, Kolundzija BM. Method of
moments applied to antennas. Handbook of Antennas in Wire-
less Communications; 2000:239-279.
11. Rawle W, Smiths A. The method of moments: a numerical
technique for wire antenna design. High Freq Electron. 2006;5:
12. Wu YM. The contour deformation method for calculating the
high frequency scattered fields by the Fock current on the sur-
face of the 3-D convex cylinder. IEEE Trans Antenna Propag.
13. Xu Q, Huang Y, Zhu X, Xing L, Duxbury P, Noonan J. Build-
ing a better anechoic chamber: a geometric optics-based sys-
tematic solution, simulated and verified [measurements
corner]. IEEE Antennas Propag Mag. 2016;58(2):94-119.
14. Weston D. Electromagnetic Compatibility: Principles and Appli-
cations. 2nd ed. (Revised and Expanded) CRC Press; 2017.
15. Testolina P, Lecci M, Rebato M, et al. Enabling simulation-
based optimization through machine learning: a case study on
antenna design. arXiv Preprint. 2019;1908:11225.
16. Ledesma S, Ruiz-Pinales J, Garcia-Hernandez M, et al. A
hybrid method to design wire antennas: design and optimiza-
tion of antennas using artificial intelligence. IEEE Antennas
Propag Mag. 2015;57:23-31.
17. Misilmani HME, Naous T. Machine learning in antenna
design: an overview on machine learning concept and algo-
rithms. Paper presented at: International Conference on High
Performance Computing & Simulation; Dublin, Ireland; 2019.
18. Zhang Q-J, Gupta KC, Devabhaktuni VK. Artificial neural
networks for RF and microwave design-from theory to prac-
tice. IEEE Trans Microw Theory Tech. 2003;51:1339-1350.