LengthNet: Length Learning for Planar Euclidean Curves - Preprint
Barak Or and Ido Amos
ALMA Technologies LTD, Haifa, Israel
barak@almatechnologies.com, ido@almatechnologies.com
Abstract
In this work, we use a deep learning (DL) model to address a fundamental problem in differential geometry. The literature offers many closed-form expressions for calculating curvature, length, and other geometric properties. Since these properties are well understood, we are highly motivated to reconstruct them using DL models; in this framework, our goal is to learn geometric properties from many examples. The simplest geometric object is a curve, and one of its most fundamental properties is length. Therefore, this work focuses on learning the length of planar sampled curves created by simulation. The fundamental length axioms were reconstructed using a supervised learning approach, and following these axioms, a DL-based model, named LengthNet, was established. For simplicity, we focus on planar Euclidean curves.
1. Introduction
The calculation of curve length is a significant component of many classical and modern problems involving numerical differential geometry [GP90, HC98]. Several numerical constraints affect the quality of the length calculation: additive noise, discretization error, and partial information. One use case for length calculation is handwritten signature verification, which involves computing the length along the curve [OTPH16]. This use case, and many others, must cope with the numerical constraints mentioned above; hence, a robust approach for handling them is required.
Recent works explore the possibilities of classical machine learning (ML) and deep learning (DL) based approaches, which have achieved great success in solving many classification, regression, and anomaly detection tasks [LBH15]. Evidence of the effectiveness of DL in solving such tasks has been shown repeatedly in recent years [LBH15]. An efficient DL architecture finds intrinsic properties by using convolutional operators (and more sophisticated nonlinear operators) and generalizes them. This success stems from the enormous amount of available data and the capability to optimize complicated models with the high computational resources now available.
Related papers in the literature mainly address higher-level geometric information with DL approaches [BBL17, BN18]. That said, a fundamental property was reconstructed by a DL model in [PWK16], where a curvature-based invariant signature was learned using a Siamese network configuration [Chi20]. The authors presented the advantages of using a DL model to reconstruct the curvature signature, mainly robustness to noise and sampling errors.
Given the powerful functionality of DL models, we are highly motivated to use them to reconstruct fundamental geometric properties. Specifically, we focus on reconstructing the length property of curves in the two-dimensional Euclidean domain by designing a DL-based model. The task was formulated in a supervised learning setup, where a data-dependent learning-based approach was applied by feeding examples through our DL-based model and minimizing a unique loss function that satisfies the length axioms. For that, we created four anchor shapes and applied translations, rotations, and additional operations to cover a wide range of geometric representations. The resulting trained DL model is called LengthNet. It receives as input a two-dimensional array representing samples of a planar Euclidean curve and outputs its length.
The main contribution of this work is the reconstruction of the length property. For that, a DL architecture based on classical Convolutional Neural Networks (CNNs) was designed.
The remainder of the paper is organized as follows: Section 2 summarizes the geometric background of the length property. Section 3 provides a detailed description of the learning approach, where the architectures are presented. Section 4 presents the results, followed by a discussion. Section 5 gives the conclusions.
2. Geometric Background of Length
In this section, the length properties are presented and the discretization error is reviewed.
2.1. Length Properties
Consider a planar parametric differentiable curve in the Euclidean space, $C(p) = \{x(p), y(p)\} \in \mathbb{R}^2$, where $x$ and $y$ are the curve coordinates parameterized by $p \in [0, N]$, where $N$ is a partition parameter. The Euclidean length of the curve is given by

$$l(p) = \int_0^p \left| C_{\tilde{p}}(\tilde{p}) \right| d\tilde{p} = \int_0^p \sqrt{x_{\tilde{p}}^2 + y_{\tilde{p}}^2} \, d\tilde{p}, \tag{1}$$

where $x_p = \frac{dx}{dp}$ and $y_p = \frac{dy}{dp}$. Summing all the increments results in the total length of $C$, given by

$$L = \int_0^N \left| C_{\tilde{p}}(\tilde{p}) \right| d\tilde{p}. \tag{2}$$
Following the length definition, the main length axioms are provided.

Additivity: Length is additive with respect to concatenation, where for any $C_1$ and $C_2$ the following holds:

$$L(C_1) + L(C_2) = L(C_1 \cup C_2). \tag{3}$$

Invariance: Length is invariant with respect to rotation ($R$) and translation ($T$):

$$L(T[R[C]]) = L(C). \tag{4}$$

Monotonic: Length is monotone, where for any $C_1$ and $C_2$ the following holds:

$$L(C_1) \leq L(C_2) \quad \forall \, C_1 \subseteq C_2. \tag{5}$$

Non-negativity: The length of any curve is non-negative:

$$L(C) \geq 0. \tag{6}$$
2.2. Discretization Error
In order to reconstruct the length property with the DL model, a discretization of the curve should be applied. As a consequence, the computation is prone to errors. The curve $C$ lies on a closed interval $[\alpha, \beta]$. In order to find the length by a discretized process, a partition of the interval is made, where

$$P = \{\alpha = p_0 < p_1 < p_2 < \cdots < p_N = \beta\}. \tag{7}$$

For every partition $P$, the curve length can be approximated by the sum

$$s(P) = \sum_{n=1}^{N} \left| C(p_n) - C(p_{n-1}) \right|. \tag{8}$$

The discretization error is given by

$$e_d = L - s(P) = \int_0^N \left| C_p(p) \right| dp - \sum_{n=1}^{N} \left| C(p_n) - C(p_{n-1}) \right|, \tag{9}$$

where $e_d \to 0$ as $N \to \infty$ (for further reading, the reader is referred to [DC16]). Fig. 1 illustrates a general curve with its discretized representation for better error visualization.
Figure 1: Discretization.
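To make the discretization error concrete, here is a minimal sketch (our illustration, not part of the original experiments) that approximates the circumference of a unit circle by the chordal sum $s(P)$ of (8) and compares it against the true length $2\pi$:

```python
import numpy as np

# Approximate the length of a unit circle by the chordal sum s(P) of Eq. (8).
for N in (10, 100, 1000):
    p = np.linspace(0.0, 2.0 * np.pi, N + 1)          # partition of [0, 2*pi]
    curve = np.stack([np.cos(p), np.sin(p)], axis=1)  # sampled curve C(p)
    s_P = np.sum(np.linalg.norm(np.diff(curve, axis=0), axis=1))
    e_d = 2.0 * np.pi - s_P                           # discretization error, Eq. (9)
    print(f"N={N:5d}  s(P)={s_P:.6f}  e_d={e_d:.2e}")
```

As expected from (9), $e_d$ shrinks toward zero as the partition is refined.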
3. Learning Approach
3.1. Motivation
The motivation for using a DL model for this task lies in the core challenge of implementing equations (1) and (2) in real-life scenarios. These equations involve non-linearity and derivatives, and poor sampling and additive noise might lead to numerical errors [QW93]. The differential and integral operators can be obtained using convolution filters [PWK16], and the summation can be represented using linear layers, which are highly common in DL models. The differential invariants can be interpreted as a high-pass filter and the integration as a low-pass filter. Hence, it is convenient to use a CNN-like model for our task. Another approach to this task involves Recurrent Neural Networks (RNNs), where the curve is considered a time series [TB97, SJ19]. Our suggested architecture is based on a simplified CNN. As we aim to reconstruct the length axioms, (3)-(6), each of them is considered in the model establishment pipeline: from the unique dataset generation (Section 3.2), through the loss function design (Section 3.3), to the architecture structure (Section 3.4).
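As a small illustration of this filter analogy (our sketch, not taken from the paper), a finite-difference kernel applied by convolution acts as the derivative (high-pass) operator, while a running summation acts as the integral (low-pass) operator:

```python
import numpy as np

# Sampled coordinate x(p) on a uniform grid over [0, 1].
p = np.linspace(0.0, 1.0, 200)
x = np.sin(2.0 * np.pi * p)

# Derivative as a convolution with a difference (high-pass) kernel.
dp = p[1] - p[0]
x_p = np.convolve(x, [1.0, -1.0], mode="valid") / dp  # approximates dx/dp

# Integration as a running (low-pass) summation, cf. Eq. (1).
arc_x = np.cumsum(np.abs(x_p)) * dp                   # accumulates |dx/dp| dp
print(arc_x[-1])                                      # total variation of x, ~4.0
```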
3.2. Dataset Generation
The reconstruction of the length properties was made in a supervised learning approach, where many curve examples with their lengths as labels were synthetically created. Each curve is represented by a 2 × N array for the x and y coordinates, with a fixed number of points N. We created a dataset of 500,000 examples to enable DL-based model establishment. This large number of examples aimed to cover curve transformations and different patterns. The curves were created from four standard anchor geometric shapes (circles, straight lines, triangles, and rectangles), as shown in Figure 2. We scaled, rotated, translated, and then segmented them randomly into two segments, as shown in Figure 3. We performed 10 different splits for each anchor curve and sampled it in the forward and backward directions. In order to increase the dataset variety, an oscillating vector with random frequency and amplitude was added; its direction was defined as perpendicular to the shape's bounding curve. These steps were applied to the curve's parametric analytic representation with uniform sampling defined on the [0,1] interval. To enforce smoothness, each curve is convolved with a Gaussian kernel, creating differentiable curves even in the case of rectangles and triangles. The ground truth length (label) of the shapes was calculated via a first-order approximation between every two successive samples, where we applied an oversampling of 2 × 100N. This process allows the reconstruction of the Additivity and Invariance axioms.

Figure 2: Dataset generation: curve examples of four anchor shapes.

Figure 3: Curve transformation example.
3.3. Loss Function
The optimization problem was formulated as a supervised learning task. The loss function was designed to meet the Additivity axiom. For each example, we aim to minimize

$$J_k = \left\| L(s_1) - O(s_2) - O(s_3) \right\|_\delta + \lambda \left\| \Theta_{ij} \right\|_2^2, \tag{10}$$

where $s_1$, $s_2$, and $s_3$ are input curves that satisfy the equality $L(s_1) = L(s_2) + L(s_3)$, $O$ is the DL model output, $k$ is the example index, $\lambda$ is a regularization parameter, $\Theta_{ij}$ are the model weights, and $\|\cdot\|_\delta$ is the norm, where $\delta \in \{1, 2\}$. The optimization task is to reconstruct the equality

$$L(s_1) = O(s_2) + O(s_3). \tag{11}$$
$J_k$ is minimized by passing many examples from the dataset (Section 3.2) through the model and tuning its weights iteratively until (10) and (11) coincide on the test set. The resulting optimized model is characterized by the optimal weights provided by

$$\Theta_{ij}^{*} = \underset{\Theta_{ij}}{\arg\min} \sum_{k} J_k. \tag{12}$$

In this work, we considered two options for the loss function norm $\delta$: the $L_1$ norm and the $L_2$ norm. The $L_2$ norm is commonly used to minimize the mean squared error (MSE) between the true label and the predicted one. The $L_1$ norm (Manhattan distance) is commonly used to minimize the least absolute deviations between the true label and the predicted one. We trained two models, one with the $L_1$ norm and the second with the $L_2$ norm. The motivation for including the $L_1$ norm in the loss is our interest in a model that calculates length, so the relative error is of primary concern. Our hypothesis was that $L_1$ would outperform $L_2$, as indeed confirmed by the results in Section 4.
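A minimal PyTorch-style sketch of this additivity loss (our illustration; the variable names and batching convention are assumptions), shown for a generic $\delta$:

```python
import torch

def additivity_loss(model, s1_length, s2, s3, lam=0.01, delta=1):
    """Eq. (10): s2 and s3 are the two segments of a curve s1 with known
    ground-truth length L(s1); the model output O should reconstruct
    L(s1) = O(s2) + O(s3), plus an L2 penalty on the weights."""
    residual = s1_length - (model(s2) + model(s3)).squeeze(-1)
    data_term = residual.abs().pow(delta).mean()   # the ||.||_delta term
    reg = lam * sum(w.pow(2).sum() for w in model.parameters())
    return data_term + reg
```

With `delta=1` this is the $L_1$ variant preferred by the paper; `delta=2` gives the MSE variant.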
3.4. LengthNet Architecture
A simplified CNN was designed as the predictor for this supervised learning task. The baseline model receives as input a 200 × 2 array representing discrete samples along the curve. The input is passed through two one-dimensional convolution layers (Conv1D), both with a small kernel of size 3, the first with 24 filters and the second with 12 filters, connected by a Rectified Linear Unit (ReLU) activation function. The two Conv1D layers are followed by another ReLU and a linear layer of 2,352 neurons, which finally outputs the length. The architecture is shown in Figure 4.
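A minimal PyTorch sketch consistent with this description (our reconstruction from the text, not the authors' released code); note that a 200-sample input passed through two kernel-3 convolutions leaves 12 × 196 = 2,352 features, matching the stated linear layer size:

```python
import torch
import torch.nn as nn

class LengthNet(nn.Module):
    """Sketch: Conv1D(24) -> ReLU -> Conv1D(12) -> ReLU -> Linear -> length."""
    def __init__(self, n_points: int = 200):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(2, 24, kernel_size=3),    # (batch, 2, 200) -> (batch, 24, 198)
            nn.ReLU(),
            nn.Conv1d(24, 12, kernel_size=3),   # -> (batch, 12, 196)
            nn.ReLU(),
            nn.Flatten(),                       # -> (batch, 2352)
            nn.Linear(12 * (n_points - 4), 1),  # 2,352 features -> scalar length
        )

    def forward(self, curve: torch.Tensor) -> torch.Tensor:
        # The curve arrives as (batch, 200, 2); Conv1d expects channels first.
        return self.net(curve.transpose(1, 2))

model = LengthNet()
print(model(torch.randn(8, 200, 2)).shape)  # torch.Size([8, 1])
```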
We tested several variations upon this baseline, e.g., models with additional batch normalization layers. Still, we found that for this simple task, a simple and shallow model achieves satisfactory results.
3.5. LengthNet Training
Our training set consisted of 400,000 examples created from 400 shapes (100 per anchor shape). Each shape was rotated and segmented in various ways to achieve invariance over those transformations, creating the final dataset of size 400,000. The test set consisted of 100,000 examples created from 100 different shapes in the same manner. Training of this architecture for both loss functions, (10), was done using the ADAM optimizer [KB14] with a learning rate of 1e-3, a constant decay rate of 0.99, and a batch size of 128, which were set after parameter tuning. Both models (with the $L_1$ norm and the $L_2$ norm) were trained by passing examples in small batches with the back-propagation method, for 400 epochs.
Figure 4: LengthNet architecture: a general curve is inserted into the model for length estimation. Two 1D convolutional layers followed by one linear layer are used, with two ReLU activation functions.
Figure 5: LengthNet training
The chosen CNN-based architecture with the $L_1$ norm loss function was named LengthNet. Various parameters are provided in Table 1. Figure 5 shows the train and test losses as a function of the number of epochs.
Table 1: Learning Parameters

Description                Symbol   Value
Number of examples         K        500,000
Train/test ratio           -        80/20
Regularization parameter   λ        0.01
Partition parameter        N        200
Batch size                 -        128
Learning rate              η        0.001
Decay rate                 -        0.99
Epochs                     -        400
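A training-loop sketch matching Table 1 (our illustration, reusing the `LengthNet` and `additivity_loss` sketches above; the random placeholder tensors and the exponential decay schedule are assumptions about unstated details):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random placeholders standing in for the real dataset of Section 3.2:
# segment pairs (s2, s3) and the ground-truth length of the parent curve s1.
K = 1024
s2, s3 = torch.randn(K, 200, 2), torch.randn(K, 200, 2)
s1_length = torch.rand(K) * 10.0

loader = DataLoader(TensorDataset(s2, s3, s1_length), batch_size=128, shuffle=True)
model = LengthNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.99)  # decay rate 0.99

for epoch in range(400):
    for b2, b3, blen in loader:
        opt.zero_grad()
        loss = additivity_loss(model, blen, b2, b3, lam=0.01, delta=1)
        loss.backward()
        opt.step()
    sched.step()  # constant per-epoch learning-rate decay
```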
4. Results and Discussion
The LengthNet was well established after 400 epochs. In order to validate it, we used the root MSE (RMSE) measure, as well as the RMSE-over-length (ROL) measure, defined as

$$\mathrm{ROL} = \frac{\mathrm{RMSE}}{L}. \tag{13}$$

This measure provides a normalized error with respect to the curve's length. As we deal with various curves of different lengths, we must weigh their errors appropriately. Figure 6 provides the ROL histograms of the $L_1$-norm-based model and the $L_2$-norm-based model. As shown, the $L_1$ loss based model clearly outperforms the $L_2$ loss based model according to the ROL measure.
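Since Figure 6 reports ROL histograms, we read (13) per curve, in which case the RMSE of a single example reduces to its absolute error; under that assumption (ours), the computation is simply:

```python
import numpy as np

def rol(pred, true):
    """Per-curve ROL of Eq. (13): for a single curve the RMSE reduces to
    the absolute error, so ROL becomes the relative error |e| / L."""
    return np.abs(pred - true) / true

print(rol(np.array([6.40, 3.10]), np.array([6.28, 3.00])))
```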
4.1. Architectures Comparison
We compared the LengthNet performance with several architectures. Batch normalization was added to the LengthNet architecture after every convolutional layer, once with the L1 loss and once with the L2 loss. This modification did not improve the test loss relative to the LengthNet test loss. We also tried the long short-term memory (LSTM) architecture, which is commonly used for sequential data. LengthNet outperformed the LSTM with both the L1 and L2 losses. Results are summarized in Table 2.
Table 2: Architectures Comparison

Model                         Test loss   ROL
LengthNet (CNN + L1 loss)     0.239       0.021
CNN + L2 loss                 0.254       0.029
CNN + L1 loss + Batch-Norm    0.360       0.039
CNN + L2 loss + Batch-Norm    11.36       0.230
LSTM + L1 loss                1.768       0.166
LSTM + L2 loss                2.370       0.099
4.2. Monotonic Property
A linear relation was established between the true length and the LengthNet prediction (Fig. 7). The x-axis represents the ground truth length, and the y-axis the length predicted by LengthNet. This result shows the generalization capability and, in particular, the success of reconstructing the Monotonic axiom.
Figure 6: ROL histograms, L1 norm vs. L2 norm comparison. The L1 loss based model clearly outperforms the L2 loss based model.
Figure 7: LengthNet monotonic property assessment
4.3. Comparison to First-Order Spline Interpolation
A classical approach for length calculation uses first-order spline interpolation (yielding a length calculation equivalent to (8)). A comparison between first-order spline interpolation and LengthNet was made: the lengths of all test set examples were calculated, once by first-order spline interpolation and once by LengthNet. The results are presented in Figure 8, where the relative error histogram of each is shown. The LengthNet histogram is mostly below 2.5%, while for the first-order linear spline, about half of the relative errors are concentrated around 2.5% with a relatively wide spread.
Figure 8: LengthNet vs. spline interpolation relative error histograms.
4.4. Noise Robustness
We checked the capability of LengthNet to estimate the length of curves with additive white Gaussian noise. We set the standard deviation of the noise to $\lambda \bar{d}$, where $\bar{d}$ is the mean distance between two successive points along the curve and $\lambda \in [0, 1]$ is the noise magnitude parameter. Figure 9 shows two curves with their associated noisy curves, with $\lambda = 0.5$. We compared the performance of our model with linear spline interpolation. The sensitivity of LengthNet and the linear spline to additive noise is presented in Figure 10 (blue and green plots). For a low level of additive noise, LengthNet predicts the curve length quite well; only for a noise magnitude of 0.3 does the relative error exceed 10%. The relative error of LengthNet is much lower than that of the linear approximation for most of the data. This robustness to noise, even though LengthNet never saw a noisy example during training, is a valuable capability.

In addition, we added a low-pass filter (LPF) to smooth the noise for both approaches (orange and purple plots in Figure 10). We obtained better results, with the suggested LengthNet outperforming the linear spline for noise magnitudes up to $\lambda = 0.7$.
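A sketch of this noise model and smoothing step (our illustration; the paper does not specify the LPF, so a simple moving average is assumed):

```python
import numpy as np

def add_noise(curve, lam, rng):
    """Additive white Gaussian noise with std = lam * d_bar, where d_bar is
    the mean distance between successive points along the curve."""
    d_bar = np.mean(np.linalg.norm(np.diff(curve, axis=0), axis=1))
    return curve + rng.normal(0.0, lam * d_bar, size=curve.shape)

def moving_average_lpf(curve, window=5):
    """Simple low-pass filter: per-coordinate moving average."""
    kernel = np.ones(window) / window
    return np.stack([np.convolve(curve[:, i], kernel, mode="same")
                     for i in range(2)], axis=1)

rng = np.random.default_rng(0)
p = np.linspace(0.0, 2.0 * np.pi, 200)
circle = np.stack([np.cos(p), np.sin(p)], axis=1)
smooth = moving_average_lpf(add_noise(circle, lam=0.5, rng=rng))
```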
4.5. Generalization for Unseen Examples: Lamé Curves
In order to evaluate the model's generalization capabilities, we present the Lamé curve family (superellipses), given by

$$\left| \frac{x}{a} \right|^{r} + \left| \frac{y}{b} \right|^{r} = 1, \tag{14}$$

where $a$ and $b$ were set to 1, for simplicity, and $r$ is the shape parameter. Note that when $r \leq 1$ the curve has non-differentiable points.
Figure 9: Noisy vs. original shapes, noise magnitude λ = 0.5.
Figure 10: LengthNet vs. linear approximation sensitivity to additive noise, with and without LPF.
Some curves from the family are shown in Figure 11. We implemented the parametric equation for 39 different curves with $r \in [0.5, 10]$ in steps of 0.25, and passed each curve through LengthNet. Results are shown in Figure 12.
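The sampling used here can be sketched as follows (our reconstruction; the standard superellipse parameterization is assumed):

```python
import numpy as np

def lame_curve(r, n=200, a=1.0, b=1.0):
    """Sample the Lame curve |x/a|^r + |y/b|^r = 1 via the standard
    parameterization x = a*sign(cos t)*|cos t|^(2/r), likewise for y."""
    t = np.linspace(0.0, 2.0 * np.pi, n)
    x = a * np.sign(np.cos(t)) * np.abs(np.cos(t)) ** (2.0 / r)
    y = b * np.sign(np.sin(t)) * np.abs(np.sin(t)) ** (2.0 / r)
    return np.stack([x, y], axis=1)

# 39 curves with r in [0.5, 10] in steps of 0.25, as described above.
curves = [lame_curve(r) for r in np.arange(0.5, 10.25, 0.25)]
print(len(curves))  # 39
```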
5. Conclusions
A learning-based approach for reconstructing the length of curves was presented, demonstrating the power of a deep learning based model to reconstruct the fundamental length axioms. A very simple architecture was designed to deal with the sequential data.
Figure 11: Lamé curve family.
Figure 12: Lamé curves: relative error of the LengthNet as a function of r.
We have shown that the $L_1$ norm is more appropriate for this problem than the common $L_2$ norm in the loss function formulation. Furthermore, the comparison to the linear approximation shows how LengthNet deals with noisy examples, even though it was not trained on noisy data. Currently, LengthNet does not deal well with high noise magnitudes. As for future challenges, we aim to generalize the DL model to take a level set from a given image and perform an accurate length calculation, using more examples such as the outline of the human figure. For that, we aim to formulate the problem as an unsupervised (or self-supervised) learning task. We may also include additional transformations, such as affine, equi-affine, and homography transformations.
Acknowledgements
The first author would like to thank Prof. Roni Kimmel, from the Technion - Israel Institute of Technology, for introducing him to this fascinating problem. Both authors would like to thank Dr. Maxim Freydin, from ALMA Technologies LTD, and Dr. Chaim Baskin, from the Technion - Israel Institute of Technology, for assistance with editing this paper.
References
[BBL17] Bronstein M. M., Bruna J., LeCun Y., Szlam A., Vandergheynst P.: Geometric deep learning: going beyond Euclidean data. IEEE Signal Processing Magazine 34, 4 (2017), 18–42.

[BN18] Berg J., Nyström K.: A unified deep artificial neural network approach to partial differential equations in complex geometries. Neurocomputing 317 (2018), 28–41.

[Chi20] Chicco D.: Siamese neural networks: An overview. Artificial Neural Networks (2020), 73–94.

[DC16] Do Carmo M. P.: Differential geometry of curves and surfaces: revised and updated second edition. Courier Dover Publications, 2016.

[GP90] Guenter B., Parent R.: Computing the arc length of parametric curves. IEEE Computer Graphics and Applications 10, 3 (1990), 72–78.

[HC98] Hellweg H.-B., Crisfield M.: A new arc-length method for handling sharp snap-backs. Computers & Structures 66, 5 (1998), 704–709.

[KB14] Kingma D. P., Ba J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).

[LBH15] LeCun Y., Bengio Y., Hinton G.: Deep learning. Nature 521, 7553 (2015), 436–444.

[OTPH16] Ooi S. Y., Teoh A. B. J., Pang Y. H., Hiew B. Y.: Image-based handwritten signature verification using hybrid methods of discrete Radon transform, principal component analysis and probabilistic neural network. Applied Soft Computing 40 (2016), 274–282.

[PWK16] Pai G., Wetzler A., Kimmel R.: Learning invariant representations of planar curves. arXiv preprint arXiv:1611.07807 (2016).

[QW93] Qian S., Weiss J.: Wavelets and the numerical solution of partial differential equations. Journal of Computational Physics 106, 1 (1993), 155–175.

[SJ19] Smagulova K., James A. P.: A survey on LSTM memristive neural network architectures and applications. The European Physical Journal Special Topics 228, 10 (2019), 2313–2324.

[TB97] Tsoi A. C., Back A.: Discrete time recurrent neural network architectures: A unifying review. Neurocomputing 15, 3-4 (1997), 183–223.