Application of a Neural Network to Radar Detection
Diego ANDINA and José L. SANZ-GONZALEZ
Departamento de Señales, Sistemas y Radiocomunicaciones, ETSI de Telecomunicación Universidad Politécnica de
Madrid, Ciudad Universitaria s/n, 28040 Madrid, Spain, Tel/Fax: +34 [1] 549 5700 (ext. 384) / 543 9652,
E-Mail: andina@ics.upm.es
Abstract. The application of neural networks to radar detection raises many open questions, and some of them are addressed
in this paper. First, we propose a network structure suited to the problem of binary detection. We model the input as signal
and noise (given by its complex envelope), and the binary output is 1 or 0. We evaluate different structures and their
dependence on the training signal-to-noise ratio and the threshold value. Then, we evaluate the detector's performance by Monte Carlo
trials. We present its Receiver Operating Characteristic (ROC) and detection curves.
1. Introduction
The binary detection problem can be briefly stated as deciding whether an input complex value (the complex envelope
involving signal and noise) should be classified into one of two outputs, 0 or 1. Neural networks have proved their abilities in
classification problems and could have interesting nonlinear capabilities for detection when the input is affected by non-Gaussian
noise. Obviously, the binary detection problem is highly dependent on each application, but we can try to model each
characteristic as a learning parameter, or as a modification of the network structure. For example, the need to process complex
signals with back-propagation learning [2] can be met by using complex weights and an adapted sigmoidal function, or simply
by separating the inputs into their real and imaginary parts and doubling the number of input nodes. Also, the presence of only
0 or 1 at the output does not imply that the nodes in our network must have hard limiters: we can establish a threshold at the output
and assign the two binary values to the two sides separated by this threshold.
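The two modelling choices just described (splitting each complex input into real and imaginary parts, and thresholding the single output) can be sketched as follows; the example vector and the threshold value 0.5 are illustrative assumptions, not values from the paper.

```python
import numpy as np

def complex_to_real_features(x):
    """Split a vector of M complex samples into 2M real inputs
    (real parts followed by imaginary parts)."""
    return np.concatenate([x.real, x.imag])

def threshold_output(y, T=0.5):
    """Map a network output in [0,1] to a binary decision:
    values in [0,T) -> 0, values in [T,1] -> 1."""
    return 1 if y >= T else 0

# illustrative input: M=2 complex samples -> 4 real input nodes
x = np.array([1 + 2j, -0.5 + 0.3j])
features = complex_to_real_features(x)
```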
In this paper, we model the input at time t as a complex value composed by
were xi is the input vector, si is the signal vector (s1 corresponds to "0" and s2 corresponds to "1") and n is the noise vector. Each
component of xi corresponds to the complex envelope of the received signal.
At the neural network output we will have values ranging in [0,1]. Then we will have to choose a threshold value T ∈ [0,1]
so that output values in [0,T) will be considered as binary output 0 and values in [T,1] will represent the value 1.
2. The Neural Network as Detector
For the present study, we have chosen one representative algorithm for supervised learning: the multilayer perceptron with back-
propagation. This choice is mainly motivated by the fact that it is an efficient technique, widely used for classification tasks.
Back-propagation has performed well on many real problems and has become the most popular learning algorithm for multilayer
networks. This learning method yields a good approximation to the detection problem.
One of the parameters to choose is the learning rate, which can be the same for every weight in the network, different for each
layer, different for each node, or different for each weight. In general, determining the best learning rate is not
an easy task, so we have adopted a general solution proposed in [3], making the learning rate for each node inversely proportional
to the average magnitude of the vectors feeding into the node.
In the basic algorithm to update each weight, we add the well-known momentum term [3] as a simple approach to adapting the
learning rate as a function of the local curvature of the error surface.
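The update rule described above (a per-node learning rate scaled inversely with the average input magnitude, plus a momentum term) can be sketched as follows; the base rate `eta0` and momentum constant `alpha` are illustrative assumptions.

```python
import numpy as np

def update_weights(w, grad, prev_delta, fan_in_mag, eta0=0.1, alpha=0.9):
    """One back-propagation step for a node's weight vector.
    The effective learning rate is scaled inversely with the average
    magnitude of the vectors feeding into the node, and the momentum
    term alpha*prev_delta smooths the trajectory on the error surface."""
    eta = eta0 / max(fan_in_mag, 1e-8)        # per-node learning rate
    delta = -eta * grad + alpha * prev_delta  # gradient step + momentum
    return w + delta, delta
```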
There are several methods for stopping the training algorithm. One can terminate it when the magnitude of the gradient is
sufficiently small, since by definition the gradient is zero at a minimum. One can stop the algorithm when the estimation
error falls below a fixed threshold that fulfils the design requirements. Or one can stop when a fixed number of iterations has
been performed, although there is little guarantee that this stopping condition will terminate the algorithm at a minimum. In fact,
with these solutions one does not optimize the net, prematurely terminating the learning algorithm.
The method we have chosen is cross-validation: we split the data into two sets, a training set used to train the
network and a test set used to estimate the error probability (Pe) of the neural network detector. During learning, the
performance of the network on the training data will continue to improve, but its performance on the test data will improve only
up to a point, beyond which it will start to degrade. It is at this point, where the network starts to be overtrained, that the learning
algorithm is terminated. Although more computationally intensive, this method avoids premature termination, improving the
generalization performance of the network. For low values of Pe, estimation problems can arise because very low
probability values require extremely large test sets. For better estimation of such probabilities one could use techniques
such as Importance Sampling, although this complicates the algorithms.
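The cross-validation stopping rule can be sketched generically as follows; `train_step` and `eval_error` stand in for the actual back-propagation pass and the test-set Pe estimate, and the `patience` value is an illustrative assumption.

```python
def train_with_early_stopping(train_step, eval_error, max_epochs=1000, patience=5):
    """Stop when the test-set error has not improved for `patience`
    consecutive epochs, i.e. at the onset of overtraining.
    Returns the epoch of the best test error and that error."""
    best_err, best_epoch, bad = float('inf'), 0, 0
    for epoch in range(max_epochs):
        train_step()           # one pass over the training set
        err = eval_error()     # estimated Pe on the test set
        if err < best_err:
            best_err, best_epoch, bad = err, epoch, 0
        else:
            bad += 1
            if bad >= patience:
                break
    return best_epoch, best_err
```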
If the rate of convergence is too slow, one can use learning rate adaptation methods as proposed in [4]. The
essence of this method is to trace the curvature of the error surface. It increases the learning rate if the error
performance surface is flat at the current point in the parameter space; otherwise, the learning rate is decreased to avoid potential
oscillations. The drawback of this method is the increase in computational complexity, since every network weight has its own
learning rate to be computed. However, exploiting the fact that the weight changes (due to each training pattern) are usually small
compared with the magnitudes of the weights, we can combine a gradient reuse method [5] with the learning rate adaptation
method.
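A per-weight rate adaptation in the spirit of [4] can be sketched as follows; the increase and decrease factors are illustrative assumptions.

```python
import numpy as np

def adapt_rates(rates, grad, prev_grad, up=1.05, down=0.7):
    """Increase a weight's learning rate while successive gradient
    components keep the same sign (a flat, consistent error surface),
    and decrease it when the sign flips (potential oscillation)."""
    same_sign = grad * prev_grad > 0
    return np.where(same_sign, rates * up, rates * down)
```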
Although the net will deal with complex signals, it will have all-real coefficients, as we have indicated in the previous
section. To provide generalization, it will have at least three layers. The number of layers and the number of nodes in them will be discussed
in the next section. The last layer will have only one node. The output will be "1" or "0" after thresholding.
3. Application to radar
For the radar case we can build a simplified model of the input: the complex envelope is constituted by a sequence of M
complex samples (the radar azimuth samples, referred to the same range bin). We define the two detection hypotheses

H0: x(kT0) = n(kT0)                        (2a)
H1: x(kT0) = S·exp(jΘ) + n(kT0)            (2b)

where T0 is the pulse repetition period, k varies from 0 to M-1 (MT0 being the time on target), S is the signal amplitude, Θ
indicates an initial phase (constant within the same sequence) and n(kT0) represents an uncorrelated zero-mean Gaussian variable
with variance σ². Each complex input is separated into its real and imaginary parts, yielding two real inputs to the net, so the
number of input units must be 2M. This input model allows us to generate as many training and test pairs as we need in our
analysis. Then we compare different MLP structures. Taking M=8 as a typical value for radar detection, the input layer has 16
nodes. We choose the threshold value T to achieve a given (Pd, Pfa) relationship.
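The input model above can be simulated as sketched here, using M=8 and σ=1 from the text; the amplitude-to-SNR relation S = σ·10^(SNR_dB/20) is an assumption for illustration, since the paper does not state its SNR definition.

```python
import numpy as np

def make_sample(signal_present, snr_db, M=8, sigma=1.0, rng=None):
    """One input vector of M complex azimuth samples:
    noise only (hypothesis H0) or S*exp(j*Theta) plus noise (H1),
    returned as 2M real inputs (real parts, then imaginary parts)."""
    if rng is None:
        rng = np.random.default_rng()
    # zero-mean Gaussian noise on each component
    n = rng.normal(0, sigma, M) + 1j * rng.normal(0, sigma, M)
    x = n
    if signal_present:
        S = sigma * 10 ** (snr_db / 20)       # assumed amplitude/SNR relation
        theta = rng.uniform(0, 2 * np.pi)     # phase constant within the sequence
        x = S * np.exp(1j * theta) + n
    return np.concatenate([x.real, x.imag])
```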
The training and testing pairs have been simulated on the computer with noise standard deviation σ=1, for different values
of the training signal-to-noise ratio (TSNR). This training signal-to-noise ratio is one of the parameters that have shown the
most influence on the performance of the net.
The learning procedure consists of alternately presenting sequences of noise-only and signal-plus-noise samples, with a given
TSNR, so that the desired output alternates between 0 and 1 at each iteration. The desired output value is 0 for noise only and
1 in the presence of signal plus noise. We vary the phase Θ randomly in the interval [0,2π) to force the network to
generalize over the input phases during the learning procedure.
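The alternating presentation with random phase can be sketched as a generator of training pairs; as before, the amplitude-to-TSNR relation is an assumed convention, and the seed is only for reproducibility.

```python
import numpy as np

def training_pairs(tsnr_db, M=8, sigma=1.0, seed=0):
    """Yield (input, desired_output) pairs, alternating noise-only
    (target 0) and signal-plus-noise (target 1) at each iteration,
    with the initial phase drawn uniformly in [0, 2*pi) per sequence."""
    rng = np.random.default_rng(seed)
    S = sigma * 10 ** (tsnr_db / 20)   # assumed amplitude/TSNR relation
    target = 0
    while True:
        n = rng.normal(0, sigma, M) + 1j * rng.normal(0, sigma, M)
        x = n
        if target == 1:
            theta = rng.uniform(0, 2 * np.pi)
            x = S * np.exp(1j * theta) + n
        yield np.concatenate([x.real, x.imag]), target
        target = 1 - target
```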
4. Computer Results
In our experiment, we have chosen an MLP with 8 nodes in the hidden layer (and 16 nodes in the input layer) as a result of
a thorough study of the net structure, in which we have seen that no significant improvements are achieved by increasing the
complexity of the net, at least for this type of model. A more complex net will probably be needed for more complex
models of signal and noise, and for increasing the robustness of the detector. But this simple net could be good enough if the
statistical characteristics of the input do not change. Or, due to its quick training, it could present interesting adaptation
properties to changes in input distributions, if continuously trained (in real time).
4.1. ROC curves.
First we study how the network performs under changes in the training signal-to-noise ratio (TSNR), by means of the
Receiver Operating Characteristic (ROC). What we could expect is that a net trained with a given TSNR is good at detecting
inputs whose Signal-to-Noise Ratio (SNR) is within a certain range of values, close to its TSNR. We can see in Figure 1(a) that
this is generally true for SNR = 6 dB. For low values of this SNR, we see that the value of Pfa imposes a limit on the last
statement. In Figure 1(b), with SNR = 0 dB, the net trained with 3 dB presents worse characteristics than the one trained with
6 dB, because for low values of TSNR the value of Pfa imposes a threshold value too high to get a good Pd.

Figure 1. Receiver Operating Characteristic (ROC) curves: detection probability (Pd) vs. false alarm probability (Pfa) for an
MLP with 8 nodes in 1 hidden layer and 3 different training signal-to-noise ratios (TSNR). (a) Input Signal-to-Noise Ratio
(SNR) = 6 dB; (b) SNR = 0 dB.
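The (Pfa, Pd) points of such ROC curves can be estimated from Monte Carlo detector outputs as sketched below; the output values in the usage example are illustrative, not data from the paper.

```python
import numpy as np

def roc_points(outputs_h0, outputs_h1, thresholds):
    """Estimate (Pfa, Pd) pairs from Monte Carlo trials:
    Pfa is the fraction of noise-only detector outputs at or above
    the threshold, Pd the fraction of signal-plus-noise outputs."""
    outputs_h0 = np.asarray(outputs_h0)
    outputs_h1 = np.asarray(outputs_h1)
    return [(float(np.mean(outputs_h0 >= T)), float(np.mean(outputs_h1 >= T)))
            for T in thresholds]

# illustrative usage with made-up detector outputs
pts = roc_points([0.1, 0.2, 0.6, 0.9], [0.4, 0.7, 0.8, 0.95], [0.5])
```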
4.2. Detection curves.
In this section we study how the network performs under changes in the training signal-to-noise ratio (TSNR), by means of
the detection curves. As we can see in Figure 2, if you train with a very low TSNR, the value of Pfa will limit the
detection capabilities of the net. If you train with a very high TSNR, the value of Pfa will not be a limitation for the detection
capabilities, but the net does not present better detection performance than one trained with a lower TSNR (around 6 dB). In
other words, train your net in adverse conditions (low TSNR), and it will be a better detector for the same SNR than if you train
it in favourable conditions (high TSNR). But if the training conditions are too adverse you will never get a Pd high enough for
practical purposes.

Figure 2. Detection probability (Pd) vs. Signal-to-Noise Ratio (SNR) for an MLP with 8 nodes in 1 hidden layer and 4
different training signal-to-noise ratios (TSNR). (a) Pfa = 0.01; (b) Pfa = 0.001.

In this experiment, the net that generally performs best is the one trained with 6 dB. This has been found empirically. Finding
an optimal net or an analytical expression for the TSNR-detection performance relationship is still an open question.
5. Conclusions
We can summarize this paper in the following points.
a) A three-layer neural network is probably sufficient to achieve your design requirements in terms of detection and false
alarm probabilities (or error probability). The net can be an all-real-coefficient one with an even number of inputs (twice the
number of input samples) and one node in the output layer.
b) Increasing the complexity of the network does not improve the performance of your net for a specific task. It is better to
optimize the relationship among the signal-to-noise ratio for training (TSNR), the detection probability (Pd) and the false alarm
probability (Pfa). Increasing the complexity of the network likely improves the robustness of the detector, or it could be
necessary for more complex models; but the training time increases.
c) Training the net with a low training signal-to-noise ratio (TSNR) will improve its detection performance. But if the TSNR
is too low, the detection capabilities are seriously degraded. On the other hand, if you train your net with a high training signal-to-
noise ratio, it will only be efficient for high input signal-to-noise ratios, and that is not desirable.
d) As a general rule, the net should be trained with an intermediate TSNR (the minimum TSNR depends on Pfa) and the
threshold value then adjusted to achieve the design requirement on Pfa. If this is impossible, a higher training signal-to-noise
ratio has to be used.
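The threshold adjustment in point (d) can be sketched as choosing T from the empirical distribution of noise-only detector outputs so that the design Pfa is met; the linearly spaced outputs in the usage example are purely illustrative.

```python
import numpy as np

def threshold_for_pfa(noise_outputs, pfa):
    """Pick the output threshold T so that approximately a fraction
    `pfa` of noise-only outputs exceeds it (Monte Carlo estimate:
    T is the (1 - pfa) quantile of the noise-only outputs)."""
    return float(np.quantile(np.asarray(noise_outputs), 1.0 - pfa))

# illustrative usage: uniformly spread noise-only outputs in [0, 1]
T = threshold_for_pfa(np.linspace(0.0, 1.0, 101), pfa=0.1)
```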
6. References
[1] D.E. Rumelhart, G.E. Hinton, R.J. Williams, "Learning Internal Representations by Error Propagation", in D.E.
Rumelhart & J.L. McClelland (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition,
Vol. 1: Foundations, MIT Press, 1986.
[2] M.S. Kim and C.C. Guest, "Modification of Backpropagation Networks for Complex-Valued Signal Processing in
Frequency Domain", IEEE Proceedings of IJCNN, Vol. III, pp. 27-31, San Diego, June 1990.
[3] D.R. Hush and B.G. Horne, "Progress in Supervised Neural Networks: What's New Since Lippmann?", IEEE Signal
Processing Magazine, pp. 8-36, January 1993.
[4] R.A. Jacobs, "Increased Rates of Convergence Through Learning Rate Adaptation", Neural Networks, vol. 1, pp.
295-307, 1988.
[5] D.R. Hush and J.M. Salas, "Improving the Learning Rate of Back-propagation with the Gradient Reuse Algorithm",
Proc. IEEE Second Int. Conf. on Neural Networks, vol. I, pp. 441-447, San Diego, 1988.