Tohru Nitta
Rikkyo University · Graduate School of Artificial Intelligence and Science

Doctor of Engineering

About

66 Publications
8,637 Reads
1,665 Citations

Publications

Publications (66)
Preprint
Full-text available
In this paper, we theoretically prove that deep ReLU neural networks do not lie in spurious local minima in the loss landscape under the Neural Tangent Kernel (NTK) regime, that is, in the gradient-descent training dynamics of deep ReLU neural networks whose parameters are initialized by a normal distribution in the limit as the widths of t...
Conference Paper
Full-text available
In this paper, we investigate a feedforward neural network extended to dual numbers (dual-NN). It is found that the dual-NN has properties different from those of the real-NN and the complex-NN. Specifically, the weights are trained under the constraint of a shearing motion. Experimental results show that the generalization ability is higher than those...
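A minimal sketch of the dual-number arithmetic underlying such a network. Dual numbers have the form a + bε with ε² = 0, so multiplication acts as a shearing motion on the plane; the Dual class and the split activation below are illustrative assumptions, not the paper's implementation.

```python
# Dual numbers: a + b*eps with eps**2 = 0.  Multiplication by a dual
# number acts as a shearing motion, which constrains the weights.
import math

class Dual:
    def __init__(self, a, b):
        self.a, self.b = a, b            # real part a, dual part b

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a1 + b1*eps)(a2 + b2*eps) = a1*a2 + (a1*b2 + b1*a2)*eps
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

def split_sigmoid(z):
    """Logistic function applied to each component separately (an assumption)."""
    s = lambda t: 1.0 / (1.0 + math.exp(-t))
    return Dual(s(z.a), s(z.b))

def neuron(weights, inputs, theta):
    """One dual-valued neuron: f(sum_k w_k * x_k + theta)."""
    u = theta
    for w, x in zip(weights, inputs):
        u = u + w * x
    return split_sigmoid(u)

print(neuron([Dual(0.5, 1.0)], [Dual(1.0, 2.0)], Dual(0.0, 0.0)).a)
```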
Conference Paper
Full-text available
In this paper, we formulate a commutative quaternion neuron model with parameters represented by an orthogonal coordinate, and make clear the differences among the commutative quaternion neuron, a real-valued neuron, and a complex-valued neuron.
Preprint
Full-text available
In this paper, we prove that no bad local minimum exists in deep nonlinear neural networks with sufficiently large widths of the hidden layers if the parameters are initialized by the He initialization method. Specifically, in the deep ReLU neural network model with sufficiently large widths of the hidden layers, the following four statements hold...
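For reference, a minimal sketch of the He initialization referenced above: for a ReLU network, weights are drawn from N(0, 2/fan_in), which preserves activation variance across layers. The layer widths below are illustrative only.

```python
# He initialization for a deep ReLU network: W ~ N(0, 2 / fan_in).
import numpy as np

def he_init(layer_sizes, rng=np.random.default_rng(0)):
    params = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))
        b = np.zeros(fan_out)
        params.append((W, b))
    return params

def forward(params, x):
    for W, b in params[:-1]:
        x = np.maximum(W @ x + b, 0.0)   # ReLU hidden layers
    W, b = params[-1]
    return W @ x + b                      # linear output layer

params = he_init([784, 1024, 1024, 10])  # widths are illustrative
print(forward(params, np.ones(784)).shape)   # (10,)
```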
Preprint
In this paper, we propose a new weight initialization method called even initialization for wide and deep nonlinear neural networks with the ReLU activation function. We prove that no poor local minimum exists in the initial loss landscape in the wide and deep nonlinear neural network initialized by the even initialization method that we propose. S...
Article
It has been reported that training deep neural networks is more difficult than training shallow neural networks. Hinton et al. proposed deep belief networks with a learning algorithm that trains one layer at a time. Much better generalization can be achieved when each layer is pre-trained with an unsupervised learning algorithm. Since then, deep ne...
Article
Full-text available
In this paper, we first extend the Wirtinger derivative, which is defined for complex functions, to hyperbolic functions, and derive the hyperbolic gradient operator yielding the steepest descent direction by using it. Next, we derive the hyperbolic backpropagation learning algorithms for some multilayered hyperbolic neural networks (NNs) using the h...
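For reference, the classical Wirtinger derivatives for a complex variable $z = x + iy$ are
\[
\frac{\partial}{\partial z} = \frac{1}{2}\Big(\frac{\partial}{\partial x} - i\frac{\partial}{\partial y}\Big), \qquad
\frac{\partial}{\partial \bar z} = \frac{1}{2}\Big(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\Big).
\]
One natural hyperbolic analogue (a sketch; the paper's operators may be normalized differently) uses a hyperbolic variable $z = x + uy$ with $u^2 = +1$ and conjugate $\bar z = x - uy$; requiring $\partial z/\partial z = 1$ and $\partial \bar z/\partial z = 0$ gives
\[
\frac{\partial}{\partial z} = \frac{1}{2}\Big(\frac{\partial}{\partial x} + u\frac{\partial}{\partial y}\Big), \qquad
\frac{\partial}{\partial \bar z} = \frac{1}{2}\Big(\frac{\partial}{\partial x} - u\frac{\partial}{\partial y}\Big).
\]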
Conference Paper
Full-text available
In this paper, we analyze a deep neural network model from the viewpoint of singularities. First, we show that the hierarchical structure of the deep neural network introduces a large number of critical points, which form straight lines in parameter space. Next, we derive sufficient conditions for the deep neural network to have no critical points introduced by a...
Article
Full-text available
We present a theoretical analysis of singular points of artificial deep neural networks, yielding deep neural network models that have no critical points introduced by a hierarchical structure. Such deep neural network models are considered to have favorable properties for gradient-based optimization. First, we show that there exist a large n...
Article
Full-text available
This letter investigates the characteristics of the complex-valued neuron model with parameters represented by polar coordinates (called polar variable complex-valued neuron). The parameters of the polar variable complex-valued neuron are unidentifiable. The plateau phenomenon can occur during learning of the polar variable complex-valued neuron. F...
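A quick way to see the unidentifiability (an illustration, not the letter's full argument): write a polar-coordinate weight as $w = re^{i\varphi}$ with $r \ge 0$. When $r = 0$, the phase $\varphi$ has no effect on the neuron's output, so the entire one-parameter family $\{(0, \varphi)\}$ realizes the same function. Such flat directions in parameter space are a standard source of plateaus during gradient-based learning.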
Article
The quaternion widely linear (WL) estimator has been recently introduced for optimal second-order modeling of the generality of quaternion data, both second-order circular (proper) and second-order noncircular (improper). Experimental evidence exists of its performance advantage over the conventional strictly linear (SL) as well as the semi-WL (SWL...
Article
The ability of the 1-n-1 complex-valued neural network to learn 2D affine transformations has been applied to the estimation of optical flows and the generation of fractal images. The complex-valued neural network has adaptability and generalization ability as inherent properties. This is the point of greatest difference between the ability of the 1-n-...
Article
Full-text available
In this paper, the natural gradient descent method for the multilayer stochastic complex-valued neural networks is considered, and the natural gradient is given for a single stochastic complex-valued neuron as an example. Since the space of the learnable parameters of stochastic complex-valued neural networks is not the Euclidean space but a curved...
Article
In this paper, the characteristics of the complex-valued neuron model with parameters represented by polar coordinates (called polar variable complex-valued neuron) are investigated. The main results are as reported below. The polar variable complex-valued neuron is unidentifiable: there exists a parameter that does not affect the output value of t...
Article
Full-text available
A critical point is a point at which the derivatives of an error function are all zero. It has been shown in the literature that the critical points caused by the hierarchical structure of the real-valued neural network can be local minima or saddle points, whereas most of the critical points caused by the hierarchical structure are saddle point...
Article
Full-text available
A critical point is a point at which the derivatives of an error function are all zero. It has been shown in the literature that critical points caused by the hierarchical structure of a real-valued neural network (NN) can be local minima or saddle points, although most critical points caused by the hierarchical structure are saddle points in the c...
Article
Full-text available
We survey the development of Clifford's geometric algebra and some of its engineering applications during the last 15 years. Several recently developed applications and their merits are discussed in some detail. We thus hope to clearly demonstrate the benefit of developing problem solutions in a unified framework for algebra and geometry with the w...
Chapter
Introduction · Neuron Models with High-Dimensional Parameters · N-Dimensional Vector Neuron · Discussion · Conclusion
Article
Most of the local minima caused by the hierarchical structure can be resolved by extending the real-valued neural network to complex numbers. It was proved in 2000 that a critical point of the real-valued neural network with H-1 hidden neurons always gives many critical points of the real-valued neural network with H hidden neurons. These critical poin...
Article
Full-text available
In this paper, we will give a theoretical foundation for a quaternion-valued widely linear estimation framework. The estimation error obtained with the quaternion-valued widely linear estimation method is proved to be smaller than that obtained using the usual quaternion-valued linear estimation method.
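For context, a hedged sketch of the quaternion widely linear model in the notation of Took and Mandic's formulation: the strictly linear estimate $\hat y = \mathbf h^H \mathbf x$ is augmented with the three quaternion involutions $\mathbf x^{(i)} = -i\mathbf x i$, $\mathbf x^{(j)} = -j\mathbf x j$, $\mathbf x^{(k)} = -k\mathbf x k$:
\[
\hat y = \mathbf h^H \mathbf x + \mathbf g^H \mathbf x^{(i)} + \mathbf u^H \mathbf x^{(j)} + \mathbf v^H \mathbf x^{(k)} .
\]
Setting $\mathbf g = \mathbf u = \mathbf v = 0$ recovers the strictly linear estimator, so the widely linear mean square error can never exceed the strictly linear one.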
Article
This chapter reviews the widely linear estimation for complex numbers, quaternions, and geometric algebras (or Clifford algebras) and their application examples. It was proved mathematically effective in 1995 to add x̄, the complex conjugate of x, as an explanatory variable in the estimation of complex-valued data. Thereafter, the technique has be...
Conference Paper
Full-text available
In this paper, we formulate a Clifford-valued widely linear estimation framework. A Clifford number is a hypercomplex number that generalizes real numbers, complex numbers, quaternions, and higher-dimensional numbers. As a first step, we also give a theoretical foundation for a quaternion-valued widely linear estimation framework. The estimation er...
Article
The ability of the 1-n-1 complex-valued neural network to learn 2D affine transformations has been applied to the estimation of optical flows and the generation of fractal images. The complex-valued neural network has adaptability and generalization ability as inherent properties. This is the point of greatest difference between the ability of the 1-n-...
Book
Recent research indicates that complex-valued neural networks whose parameters (weights and threshold values) are all complex numbers are in fact useful, containing characteristics bringing about many significant applications. Complex-Valued Neural Networks: Utilizing High-Dimensional Parameters covers the current state-of-the-art theories and appl...
Conference Paper
In this paper, the basic properties, especially the decision boundary, of the hyperbolic neurons used in hyperbolic neural networks are investigated. In addition, a non-split hyperbolic sigmoid activation function is proposed.
Article
This letter introduces a novel neural network whose input signals, output signals, and threshold values are all 3-dimensional real-valued vectors, and whose weights are all 3-dimensional orthogonal matrices, together with the related back-propagation learning algorithm. The algorithm allows new spatial characteristics to be treated.
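A minimal sketch of such a neuron: each weight is a 3×3 orthogonal matrix acting on a 3D input vector. The rotation construction and the componentwise tanh activation below are illustrative assumptions, not the letter's exact formulation.

```python
# 3D vector neuron: inputs, output, and threshold are 3D vectors;
# each weight is a 3x3 orthogonal matrix (here a rotation about z).
import numpy as np

def rotation_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])   # orthogonal: R @ R.T = I

def neuron(weight_mats, inputs, theta):
    u = theta.copy()
    for R, x in zip(weight_mats, inputs):
        u += R @ x                        # orthogonal weight action
    return np.tanh(u)                     # componentwise activation (assumption)

Rs = [rotation_z(np.pi / 4), rotation_z(-np.pi / 6)]
xs = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
print(neuron(Rs, xs, np.zeros(3)))
```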
Article
This letter will clarify the fundamental properties of a quaternary neuron whose weights, threshold values, input and output signals are all quaternions, which is an extension of the usual real-valued neuron to quaternions. The main results of this letter are summarized as follows. A quaternary neuron has an orthogonal decision boundary. The 4-bit par...
Article
This letter presents some results of an analysis on the decision boundaries of complex-valued neural networks whose weights, threshold values, input and output signals are all complex numbers. The main results may be summarized as follows. (1) A decision boundary of a single complex-valued neuron consists of two hypersurfaces that intersect orthogo...
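The orthogonality can be checked directly in the simplest case (a sketch with one complex input, not the letter's general proof). With input $z = x + iy$, weight $w = a + ib$, and threshold $\theta$, the net input is $u = wz + \theta$, so
\[
\operatorname{Re} u = ax - by + \operatorname{Re}\theta, \qquad \operatorname{Im} u = bx + ay + \operatorname{Im}\theta .
\]
The two boundary hypersurfaces $\operatorname{Re} u = 0$ and $\operatorname{Im} u = 0$ have normal vectors $(a, -b)$ and $(b, a)$, whose inner product is $ab - ba = 0$; hence they intersect orthogonally.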
Article
A complex neural network is obtained from an ordinary network by extending the (real-valued) parameters, such as the weights and the thresholds, to complex values. Applications to problems involving complex numbers, such as communications systems, are expected. This paper presents the following uniqueness theorem. When a complex function is given,...
Chapter
This chapter presents some results of an analysis on the decision boundaries of the complex-valued neural networks whose weights, threshold values, input and output signals are all complex numbers. The main results can be summarized as follows. (a) Decision boundary of a single complex-valued neuron consists of two hypersurfaces which intersect ort...
Article
Full-text available
This letter presents some results on the computational power of complex-valued neurons. The main results may be summarized as follows. The XOR problem and the detection of symmetry problem, which cannot be solved with a single real-valued neuron (i.e., a two-layered real-valued neural network), can be solved with a single complex-valued neuron (i.e....
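One concrete construction that illustrates the XOR claim (the encoding used in the letter may differ): pack the inputs into z = x1 + i·x2 and output 1 exactly when the real and imaginary parts of wz + θ differ in sign. The neuron's two orthogonal boundary lines split the plane into quadrants with alternating labels, which is exactly what XOR requires.

```python
# A single complex-valued neuron solving XOR -- one illustrative
# construction.  Inputs (x1, x2) are packed into z = x1 + i*x2; the
# output is 1 iff Re(u) and Im(u) differ in sign.
w = 1.0 + 0.0j
theta = -0.5 - 0.5j          # boundary lines x1 = 0.5 and x2 = 0.5

def complex_neuron(x1, x2):
    u = w * complex(x1, x2) + theta
    return int((u.real > 0) != (u.imag > 0))   # XOR of the two half-planes

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", complex_neuron(x1, x2))
# prints 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```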
Article
This paper shows the differences between the real-valued neural network and the complex-valued neural network by analyzing their fundamental properties from the viewpoint of architecture. The main results may be summarized as follows: (a) A single complex-valued neuron with n-inputs is equivalent to two real-valued neurons with 2n-inputs which have a r...
Conference Paper
There exist some problems that cannot be solved with conventional 2-layered real-valued neural networks (i.e., a single real-valued neuron), such as the XOR problem and the detection of symmetry. In this paper, it will be proved that such problems can be solved by a 2-layered complex-valued neural network (i.e., a single complex-valued neuron)...
Conference Paper
The properties of the critical points caused by the hierarchical structure of complex-valued neural networks are investigated. If the loss function used is not regular as a complex function, the critical points caused by the hierarchical structure are all saddle points.
Article
In this letter, we will clarify the redundancy of the parameters of the complex-valued neural network. The results may be summarized as follows. There exist four transformations that can cause the redundancy of the parameters of the complex-valued neural network, including the two transformations that can cause the redundancy of the parameter...
Article
Full-text available
This paper presents some results of an analysis on the decision boundaries of complex-valued neurons. The main results may be summarized as follows. (a) Weight parameters of a complex-valued neuron have a restriction which is concerned with two-dimensional motion. (b) The decision boundary of a complex-valued neuron consists of two hypersurfaces wh...
Conference Paper
In this paper, we propose a computational model of personality (called the personality model) for the purpose of implementing non-intellectual functions of the human mind on computer systems. The personality model will be formulated based on psychoanalysis, assuming that the defense mechanism plays an essential role in personality. Inductive probab...
Conference Paper
Describes a model of a neuron, called neuronoid, that can detect the “coincidence with delays”. A neuronoid is modeled as a collection of chemicals. Each chemical belongs to either the “rover” category or “borderer” category. The interaction of chemicals between these two categories leads to the decision of a neuronoid to fire or not
Conference Paper
This paper presents some results of an analysis on the decision boundaries of complex valued neural networks. The main results may be summarized as follows. (a) Weight parameters of a complex valued neuron have a restriction which is concerned with two-dimensional motion. (b) The decision boundary of a complex valued neuron consists of two hypersur...
Article
This paper presents a complex-valued version of the back-propagation algorithm (called 'Complex-BP'), which can be applied to multi-layered neural networks whose weights, threshold values, input and output signals are all complex numbers. Some inherent properties of this new algorithm are studied. The results may be summarized as follows. The updat...
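For reference, the steepest-descent update for a real-valued error $E$ of a complex weight $w = a + ib$ (equivalent to updating $a$ and $b$ jointly) can be written with the Wirtinger derivative as
\[
\Delta w = -\eta\Big(\frac{\partial E}{\partial a} + i\frac{\partial E}{\partial b}\Big) = -2\eta\,\frac{\partial E}{\partial \bar w},
\]
so the update of a complex weight couples its real and imaginary parts through the two-dimensional geometry of the complex plane. This is the general form of complex gradient descent, stated here as background rather than as the paper's exact update rule.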
Conference Paper
It has been discovered through computational experiments that the “complex-BP” algorithm can transform geometrical figures (e.g., rotation, similarity transformation, and parallel displacement), and it has been reported that this ability can be successfully applied to computer vision. In this paper, the ability of the “complex-BP” algorithm to learn similar transformati...
Conference Paper
A quaternary version of the back-propagation algorithm is proposed for multilayered neural networks whose weights, threshold values, input and output signals are all quaternions. This new algorithm can be used to learn patterns consisting of quaternions in a natural way. An example was used to successfully test the new formulation.
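A minimal sketch of the quaternion (Hamilton) product that such a quaternary weight-input multiplication uses. The product rule itself is standard; the example usage is illustrative only.

```python
# Hamilton product of quaternions q = (w, x, y, z) = w + xi + yj + zk.
def qmul(p, q):
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

# i * j = k, illustrating the non-commutative product
print(qmul((0, 1, 0, 0), (0, 0, 1, 0)))   # (0, 0, 0, 1)
```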
Conference Paper
The characteristics of the learning rule in the “Complex-BP”, a complex-numbered version of the backpropagation algorithm, are investigated. The results of this study may be summarized as follows: the error backpropagation has a structure which is concerned with two-dimensional motion; the unit of learning is complex-valued signals flowing in neural...
Conference Paper
This paper presents some results of an analysis on the decision boundaries of the complex valued neural networks. The main results may be summarized as follows. (a) Weight parameters of a complex valued neuron have a restriction which is concerned with two-dimensional motion. (b) The decision boundary of a complex valued neuron consists of two hype...
Conference Paper
The 3D vector version of the back-propagation algorithm (3DV-BP) is a natural extension of the complex-valued version of the back-propagation algorithm (Complex-BP). The Complex-BP can be applied to multilayered neural networks whose weights, threshold values, input and output signals are all complex numbers, and the 3DV-BP can be applied to multil...
Conference Paper
A 3D vector version of the backpropagation algorithm is proposed for multilayered neural networks in which a vector product operation is performed, and whose weights, threshold values, input and output signals are all 3D real-numbered vectors. This new algorithm can be used to learn patterns consisting of 3D vectors in a natural way. A 3D example w...
Conference Paper
A 3D vector version of the backpropagation algorithm is proposed for multilayered neural networks in which a vector product operation is performed, and whose weights, threshold values, input and output signals are all 3D real-numbered vectors. This new algorithm can be used to learn patterns consisting of 3D vectors in a natural way. The XOR problem w...
Conference Paper
This paper introduces a complex numbered version of the backpropagation algorithm, which can be applied to neural networks whose weights, threshold values, input and output signals are all complex numbers. This new algorithm can be used to learn complex numbered patterns in a natural way. We show that "complex-BP" can transform geometrical figures.
Article
In this paper, we deal with a stochastic resource allocation model. Suppose that we have resource $X_n$ at time $n$, and allocate $A_{n+1}X_n$ out of $X_n$ for production and $(1-A_{n+1})X_n$ for consumption at time $n$, where $A_{n+1}$ is the proportion of the resource that is allocated for production at time $n$. Then utility $U_{n+1}f((1-A_{n+1})X_n)$ is obtained and...
Article
This paper deals with stochastic models of an optimal sequential allocation of resources between consumption and production. We obtain the following results by means of the theory of martingales. For a model of resource allocation we establish the dynamic programming equation and show that a supermartingale characterizes the composition of the mode...

Projects

Projects (2)
Project
Clarifying singularities in neural networks.
Project
Investigating the properties of neural networks with high-dimensional parameters, especially finding out their inherent properties.