Application of a Neural Network to Radar Detection
Diego ANDINA and José L. SANZ-GONZALEZ
Departamento de Señales, Sistemas y Radiocomunicaciones, ETSI de Telecomunicación, Universidad Politécnica de Madrid,
Ciudad Universitaria s/n, 28040 Madrid, Spain. Tel/Fax: +34 [1] 549 5700 (ext. 384) / 543 9652.
E-Mail: andina@ics.upm.es
Abstract. The application of Neural Networks to radar detection has many open questions, and some of them are solved
in this paper. First, we propose a network structure useful for the problem of binary detection. We model the input as signal
plus noise (given by its complex envelope), and the binary output is 1 or 0. We evaluate different structures and their
dependence on the training signal-to-noise ratio and the threshold value. Then, we evaluate the detector's performance by
Monte Carlo trials. We present its Receiver Operating Characteristic (ROC) and detection curves.
1. Introduction
The binary detection problem can be briefly stated as having to decide whether an input complex value (the complex envelope
involving signal and noise) should be classified as one of two outputs, 0 or 1. Neural networks have proved their abilities in
classification problems and could have interesting nonlinear capabilities for detection when the input is affected by non-Gaussian
noise. Obviously, the binary detection problem is highly dependent on each application, but we can try to model each
characteristic as a learning parameter, or as a modification of the network structure. For example, the need to process complex
signals with back-propagation learning [2] can be addressed by complex weights and an adapted sigmoidal function, or simply
by separating the inputs into their real and imaginary parts and doubling the number of input nodes. Also, the presence at the output
of only 0 or 1 does not imply that the nodes in our network must have hard limiters: we can establish a threshold at the output
and assign the two binary values to the two sides separated by this threshold.
In this paper, we model the input at time t as a complex vector composed by

$$x_i = s_i + n, \qquad i = 1, 2 \tag{1}$$

where x_i is the input vector, s_i is the signal vector (s_1 corresponds to "0" and s_2 corresponds to "1") and n is the noise vector. Each
component of x_i corresponds to the complex envelope of the received signal.
At the neural network output we will have values ranging in [0, 1]. We then have to choose a threshold value T ∈ [0, 1]
so that output values in [0, T) will be considered as binary output 0, and values in [T, 1] will represent value 1.
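As a small illustration of this input/output convention, the sketch below (illustrative Python, not from the original paper; the default threshold T = 0.5 is an arbitrary placeholder, since the paper later chooses T to meet design requirements) splits a complex input into its real and imaginary parts and thresholds the continuous output:

```python
import numpy as np

def to_real_input(x):
    """Split a complex input vector into its real and imaginary parts,
    doubling the number of input nodes as described above."""
    return np.concatenate([x.real, x.imag])

def decide(y, T=0.5):
    """Map the continuous network output y in [0, 1] to a binary
    decision: values in [0, T) give 0, values in [T, 1] give 1."""
    return 1 if y >= T else 0
```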
2. The Neural Network as Detector
For the present study, we have chosen one representative algorithm for supervised learning: the multilayer perceptron (MLP) with back-
propagation. This choice is mainly motivated by the fact that it is an efficient technique, widely used for classification tasks.
Back-propagation has performed well for many real problems and has become the most popular learning algorithm for multilayer
networks. This learning method yields a good approximation to the detection problem.
One of the parameters to choose is the learning rate, which can be the same for every weight in the network, different for each
layer, different for each node, or different for each weight in the network. In general, determining the best learning rate is not
an easy task, so we have chosen a general solution proposed in [3], making the learning rate for each node inversely proportional
to the average magnitude of the vectors feeding into the node.
In the basic algorithm to update each weight, we add the well-known momentum term [3] as a simple approach to adapting the
learning rate as a function of the local curvature of the error surface.
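A minimal sketch of this update rule, for the weight matrix of a single layer whose nodes are all fed by the same vector v; the base rate eta0 and the momentum constant alpha are illustrative values, not ones reported in the paper:

```python
import numpy as np

def backprop_step(W, grad, v, prev_dW, eta0=0.1, alpha=0.9):
    """One weight update for a layer whose nodes are fed by the vector v.

    The learning rate is made inversely proportional to the average
    magnitude of the vector feeding into the node, as in [3], and a
    momentum term (factor alpha) reuses the previous weight change.
    W, grad, prev_dW have shape (nodes, inputs); v has shape (inputs,).
    """
    eta = eta0 / (np.mean(np.abs(v)) + 1e-12)  # rate from input magnitude
    dW = -eta * grad + alpha * prev_dW         # gradient step plus momentum
    return W + dW, dW
```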
For stopping the training algorithm there are several methods. You can terminate it when the magnitude of the gradient is
sufficiently small, since by definition the gradient is zero at a minimum. You can stop the algorithm when the estimation
error falls below a fixed threshold that fulfils the starting requirements. Or you can stop when a fixed number of iterations has
been performed, although there is little guarantee that this stopping condition will terminate the algorithm at a minimum. In fact,
with these solutions one does not optimize the net, terminating the learning algorithm prematurely.
The method we have chosen is "cross validation": we split the data into two sets, a training set which is used to train the
network, and a test set which is used to estimate the error probability (Pe) of the neural network detector. During learning, the
performance of the network on the training data will continue to improve, but its performance on the test data will improve only up to
a point, beyond which it will start to degrade. It is at this point, where the network starts to be overtrained, that the learning
algorithm is terminated. Although more computationally intensive, this method avoids premature termination, improving the
generalization performance of the network. For low values of Pe you can have estimation problems, because very low
probability values need extremely large testing sets. For better estimation of such probabilities you could then use techniques
such as Importance Sampling, although it complicates the algorithms.
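In code, this stopping rule might look as follows. This is an illustrative sketch only: the `net` object and its methods `train_one_epoch`, `error_probability`, `copy_weights` and `set_weights` are a hypothetical interface, not an API from the paper.

```python
import numpy as np

def train_with_cross_validation(net, train_set, test_set, max_epochs=10000):
    """Train while the error probability Pe estimated on the test set
    keeps improving; stop, keeping the best weights, once it degrades
    (the point where overtraining begins)."""
    best_pe = np.inf
    best_weights = net.copy_weights()
    for _ in range(max_epochs):
        net.train_one_epoch(train_set)        # one back-propagation pass
        pe = net.error_probability(test_set)  # estimate Pe on held-out data
        if pe < best_pe:
            best_pe, best_weights = pe, net.copy_weights()
        else:
            break                             # Pe starts to degrade: stop
    net.set_weights(best_weights)
    return best_pe
```

Note that stopping at the first increase of Pe is the simplest form of the rule; with noisy Monte Carlo estimates of Pe one would in practice tolerate a few non-improving epochs before terminating.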
If you find that the rate of convergence is too slow, you can use learning-rate adaptation methods as proposed in [4]. The
essence of this method is to trace the curvature of the error surface. It increases the learning rate if the error
performance surface is flat at the current point in the parameter space; otherwise, the learning rate is decreased to avoid potential
oscillations. The drawback of this method is the increase in computational complexity, since every network weight has its own
learning rate to be computed. However, exploiting the fact that the weight changes (due to each training pattern) are usually small
compared with the magnitudes of the weights, we can combine a gradient reuse method [5] with the learning-rate adaptation
method.
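A sketch of the adaptation rule of [4] (often called the delta-bar-delta rule) is given below; the constants kappa, phi and theta are illustrative, and the gradient-reuse part of [5] is omitted for brevity:

```python
import numpy as np

def delta_bar_delta_step(w, grad, eta, dbar, kappa=0.01, phi=0.5, theta=0.7):
    """Element-wise learning-rate adaptation in the spirit of [4]:
    every weight has its own rate, increased additively (by kappa)
    where the gradient agrees in sign with a decaying trace of past
    gradients (flat error surface), and decreased multiplicatively
    (by 1 - phi) where it disagrees (potential oscillation).

    w, grad, eta and dbar are arrays of the same shape.
    """
    eta = np.where(dbar * grad > 0, eta + kappa, eta)        # flat: speed up
    eta = np.where(dbar * grad < 0, eta * (1.0 - phi), eta)  # oscillating: slow down
    dbar = (1.0 - theta) * grad + theta * dbar               # exponential trace
    return w - eta * grad, eta, dbar
```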
Although the net will deal with complex signals, it will be an all-real-coefficient one, as we have indicated in the previous
section. To provide generalization, it will have at least three layers. The number of layers and the number of nodes in them will be discussed
in the next section. The last layer will have only one node. The output will be "1" or "0" after thresholding.
3. Application to Radar
For the radar case we can build a simplified model of the input: the complex envelope constituted by a sequence of M
complex samples (the radar azimuth samples, referred to the same range bin). We define the two detection hypotheses as

$$H_0:\quad x(kT_0) = n(kT_0) \tag{2a}$$
$$H_1:\quad x(kT_0) = S\,e^{j\Theta} + n(kT_0) \tag{2b}$$

where T0 is the pulse repetition period, k varies from 0 to M−1 (MT0 being the time on target), S is the signal amplitude, Θ
indicates an initial phase (constant within the same sequence) and n(kT0) represents an uncorrelated zero-mean Gaussian variable
with variance σ². Each complex input is separated into its real and imaginary parts, yielding two real inputs to the net, so the
number of input units must be 2M. This input model allows us to generate as many training and test pairs as we need in our
analysis. Then we compare different MLP structures. Taking M = 8 as a typical value for radar detection, the input layer has 16
nodes. We choose the threshold value T to achieve a given (Pd, Pfa) relationship.
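Under this model, training and test sequences can be generated along the following lines. This is an illustrative sketch assuming hypotheses (2a)/(2b) as reconstructed above; the mapping from SNR in dB to the amplitude S, and the split of the noise variance σ² equally between real and imaginary parts, are common conventions assumed here rather than stated in the paper:

```python
import numpy as np

def radar_sequence(M=8, snr_db=None, sigma=1.0, rng=np.random.default_rng()):
    """Generate one sequence of M complex azimuth samples.

    snr_db=None gives a noise-only sequence (2a); otherwise a constant
    amplitude S with a random initial phase, constant within the
    sequence, is added to the noise (2b).
    """
    # complex zero-mean Gaussian noise with total variance sigma**2
    n = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) * sigma / np.sqrt(2)
    if snr_db is None:
        return n
    S = sigma * 10.0 ** (snr_db / 20.0)    # amplitude for the requested SNR
    theta = rng.uniform(0.0, 2.0 * np.pi)  # initial phase, constant in the sequence
    return S * np.exp(1j * theta) + n
```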
The training and testing pairs have been simulated in the computer with noise standard deviation σ = 1, for different values
of the training signal-to-noise ratio (TSNR). This TSNR is one of the parameters that has shown the most influence on the performance of the
net.
The learning procedure consists of alternately presenting sequences of noise-only and signal-plus-noise samples with a given
TSNR, so that the desired output alternates between 0 and 1 at each iteration. The desired output value is 0 for noise only and
1 in the presence of signal plus noise. We vary the phase Θ randomly in the interval [0, 2π) to force the network to
generalize over the input phases during the learning procedure.
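Reusing the `radar_sequence` sketch above, the alternating presentation of patterns could be implemented as follows (illustrative fragment):

```python
import numpy as np

def training_pattern(iteration, tsnr_db, M=8):
    """Return one (input, desired_output) pair: noise only (output 0)
    on even iterations, signal plus noise at the given TSNR (output 1)
    on odd ones; the phase is drawn anew inside radar_sequence."""
    target = iteration % 2
    x = radar_sequence(M, snr_db=tsnr_db if target == 1 else None)
    # real/imaginary split: 2M real inputs (16 for M = 8)
    return np.concatenate([x.real, x.imag]), target
```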
4. Computer Results
In our experiment, we have chosen an MLP with 8 nodes in the hidden layer (the input layer having 16 nodes) as a result of
a thorough study of the net structure, where we have seen that no significant improvements are achieved by increasing the
complexity of the net, at least for this type of model. More complexity will probably be needed for more complex
models of signal and noise, and for increasing the robustness of the detector. But this simple net could be good enough if the
statistical characteristics of the input do not change. Or, due to its quick training, it could present interesting adaptation
properties to changes in input distributions if continuously (real-time) trained.
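For reference, the forward pass of such a 16-8-1 all-real MLP is sketched below; the logistic sigmoid is assumed as the node nonlinearity, and the weights would come from the back-propagation training of Section 2:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_16_8_1(x, W1, b1, W2, b2):
    """Forward pass of the structure studied here: 16 inputs, one
    hidden layer of 8 sigmoidal nodes, one sigmoidal output in [0, 1].
    Shapes: W1 (8, 16), b1 (8,), W2 (1, 8), b2 (1,)."""
    h = sigmoid(W1 @ x + b1)        # hidden layer
    return sigmoid(W2 @ h + b2)[0]  # scalar output before thresholding
```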
4.1. ROC curves.
First we study how the network performs under changes in the training signal-to-noise ratio (TSNR), by means of the
Receiver Operating Characteristic (ROC). What we could expect is that a net trained with a given TSNR is good at detecting inputs
whose Signal-to-Noise Ratio (SNR) is within a certain range of values, close to its TSNR. We can see in Figure 1(a) that this
is generally true for an SNR = 6 dB. For low values of this SNR, we see that the value of Pfa imposes a limit on the last statement.
In Figure 1(b), with SNR = 0 dB, the net trained with 3 dB presents worse characteristics than the one trained with 6 dB, because for low values
of TSNR the value of Pfa imposes a threshold value too high to get a good Pd.
Figure 1. Receiver Operating Characteristic (ROC) curves: Detection Probability (Pd) vs. False Alarm Probability (Pfa) for an MLP with
8 nodes in 1 hidden layer and 3 different Training Signal-to-Noise Ratios (TSNR). (a) Input Signal-to-Noise Ratio (SNR) = 6 dB; (b) SNR = 0 dB.
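ROC curves of this kind can be estimated by Monte Carlo trials, sweeping the threshold T over the network outputs. The sketch below assumes the `radar_sequence` generator and an `mlp` callable such as the forward pass sketched earlier:

```python
import numpy as np

def net_output(mlp, M=8, snr_db=None):
    """Network output for one randomly generated input sequence."""
    x = radar_sequence(M, snr_db)
    return mlp(np.concatenate([x.real, x.imag]))

def roc_points(mlp, snr_db, trials=10000, M=8):
    """Monte Carlo ROC estimate: sweep the threshold T over [0, 1] and
    count false alarms (noise-only trials) and detections
    (signal-plus-noise trials) whose output exceeds T."""
    y0 = np.array([net_output(mlp, M) for _ in range(trials)])          # H0 outputs
    y1 = np.array([net_output(mlp, M, snr_db) for _ in range(trials)])  # H1 outputs
    thresholds = np.linspace(0.0, 1.0, 101)
    pfa = np.array([(y0 >= T).mean() for T in thresholds])
    pd = np.array([(y1 >= T).mean() for T in thresholds])
    return pfa, pd
```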
4.2. Detection curves.
In this section we study how the network performs under changes in the training signal-to-noise ratio (TSNR), by means of
the detection curves. As we can see in Figure 2, if you train with a very low TSNR, the value of Pfa will limit the
detection capabilities of the net. If you train with a very high TSNR, the value of Pfa will not be a limitation for the detection
capabilities, but it does not give better detection performance than a lower TSNR (≈ 6 dB). In other words, train your net
in adverse conditions (low TSNR) and it will be a better detector for the same SNR than if you train it in favourable conditions
(high TSNR). But if the training conditions are too adverse, you will never get a Pd high enough for practical purposes.
Figure 2. Detection Probability (Pd) vs. Signal-to-Noise Ratio (SNR) for an MLP with 8 nodes in 1 hidden layer and 4
different Training Signal-to-Noise Ratios (TSNR). (a) Pfa = 0.01; (b) Pfa = 0.001.
In this experiment, the net that generally performs best is the one trained with 6 dB. This has been found empirically. Finding
an optimal net, or an analytical expression for the TSNR-detection-performance relationship, is still an open question.
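A detection curve at fixed Pfa can be estimated by first setting the threshold from noise-only trials and then measuring Pd over a grid of SNR values (illustrative sketch reusing `net_output` from the ROC fragment; for very low Pfa, far more noise-only trials, or Importance Sampling as mentioned in Section 2, would be needed):

```python
import numpy as np

def detection_curve(mlp, pfa_target, snr_grid_db, trials=10000, M=8):
    """Pd vs. SNR at a fixed false alarm probability: the threshold T
    is set as the (1 - Pfa) quantile of the noise-only outputs, and Pd
    is then the fraction of signal-plus-noise outputs above T."""
    y0 = np.array([net_output(mlp, M) for _ in range(trials)])
    T = np.quantile(y0, 1.0 - pfa_target)  # threshold meeting the design Pfa
    pd = []
    for snr_db in snr_grid_db:
        y1 = np.array([net_output(mlp, M, snr_db) for _ in range(trials)])
        pd.append((y1 >= T).mean())
    return np.array(pd)
```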
5. Conclusions
We can summarize this paper in the following points:
a) A three-layer neural network is probably sufficient to achieve your design requirements in terms of detection and false
alarm probabilities (or error probability). The net can be an all-real-coefficient one with an even number of inputs (twice the number of
input samples) and one node in the output layer.
b) Increasing the complexity of the network does not improve its performance for a specific task. It is better to
optimize the relationship among the signal-to-noise ratio for training (TSNR), the detection probability (Pd) and the false alarm probability
(Pfa). Increasing the complexity of the network likely improves the robustness of the detector, or it could be necessary for more
complex models; but training time increases.
c) Training the net with a low training signal-to-noise ratio (TSNR) will improve its detection performance. But if the TSNR is
too low, the detection capabilities are seriously degraded. On the other hand, if you train your net with a high training signal-to-
noise ratio, it will only be efficient for high input signal-to-noise ratios, which is not desirable.
d) As a general rule, the net should be trained with an intermediate TSNR (the minimum TSNR depends on Pfa), and the threshold value
should then be adjusted to achieve the design requirement on Pfa. If this is impossible, a higher training signal-to-noise ratio
has to be used.
6. References
[1] D.E. Rumelhart, G.E. Hinton, and R.J. Williams, "Learning Internal Representations by Error Propagation", in D.E.
Rumelhart and J.L. McClelland (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition.
Vol. 1: Foundations, MIT Press, 1986.
[2] M.S. Kim and C.C. Guest, "Modification of Back-propagation Networks for Complex-Valued Signal Processing in
Frequency Domain", IEEE Proceedings of IJCNN, Vol. III, pp. 27-31, San Diego, June 1990.
[3] D.R. Hush and B.G. Horne, "Progress in Supervised Neural Networks: What's New Since Lippmann?", IEEE Signal
Processing Magazine, pp. 8-36, January 1993.
[4] R.A. Jacobs, "Increased Rates of Convergence Through Learning Rate Adaptation", Neural Networks, vol. 1, no. 4,
pp. 295-307, 1988.
[5] D.R. Hush and J.M. Salas, "Improving the Learning Rate of Back-propagation with the Gradient Reuse Algorithm",
Proc. IEEE Second Int. Conf. on Neural Networks, vol. I, pp. 441-447, San Diego, 1988.