This paper uses a neural network as the predictive model and a genetic algorithm as the online optimization algorithm to simulate noise processing of a Chinese-English parallel corpus. Drawing on the powerful randomized global search mechanism of the genetic algorithm, it studies the principle and process of noise processing in Chinese-English parallel corpora. For the task of speaker-independent isolated-word recognition, and in view of the shortcomings of the standard genetic algorithm and of standard neural network algorithms, the paper proposes a fast algorithm that trains the network with a genetic algorithm. Through simulation, the effects of different feature parameters, the number of training samples, background noise, and speaker dependence on the recognition result are analyzed and discussed, and the method is compared with the traditional dynamic time warping method. The paper also introduces the idea of reinforcement learning, using different reward mechanisms to resolve the mismatch between the loss function and the evaluation metric, and different decoding methods to alleviate the problem of exposure bias. The genetic algorithm uses simple genetic operations and a survival-of-the-fittest selection mechanism to guide the learning process and determine the search direction, and it can search multiple regions of the solution space at the same time. It also has the advantage of not being bound by restrictive conditions on the search space (such as differentiability, continuity, and unimodality). In addition, a method is given for initializing the parameters of the translation model with English subword vectors. The results show that the genetic-algorithm-based neural network recognition method proposed in this paper learns network weights quickly and outperforms the standard genetic algorithm and the standard neural network algorithm in all aspects of performance. With its high recognition rate and distinctive application advantages, it achieves gains in both time and efficiency.
1. Introduction
Existing high-accuracy Chinese-English parallel corpus noise processing systems are still time-consuming, costly, and inconvenient to use [1]. A practical voice recognition system requires real-time Chinese-English parallel corpus noise processing on a general-purpose computer with limited resources [2]. The development of fast recognition algorithms has therefore been important in research on noise processing of Chinese-English parallel corpora. Chinese-English parallel corpus noise processing technology uses computers to analyze speech signals in order to understand human speech automatically [3]. Speech recognition technology has become a very active research field in information science and, as an interdisciplinary subject, is gradually becoming a key technology for human-computer interaction in information technology [4]. Speech signal processing is the discipline that studies the use of digital signal processing techniques to deal with noise in Chinese-English parallel corpora. The purpose of processing is to obtain parameters for efficient transmission or storage, or for particular applications such as speech synthesis, Chinese-English parallel corpus noise processing, and speech enhancement [5]. Speech is not only an effective and convenient means of information exchange but also an important tool for humans to operate machines. In language communication between humans and machines, the noise processing of Chinese-English parallel corpora, and especially the digital processing of voice signals, plays a particularly important role [6]. Once voice recognition and voice synthesis technology are combined, people can leave the keyboard and have machines receive voice commands and perform operations [7].
Mohammad [8] proposed a neural network machine translation architecture that is built entirely from neural networks and is divided into two parts. The encoder converts the source-language text into a set of context vectors, and the decoder then decodes this set of vectors into target-language text. This structure breaks completely with the earlier statistical machine translation architecture. The model no longer includes explicit word alignment and translation rule extraction steps, which simplifies the complicated feature design work caused by the complexity and variability of natural language itself. With the attention mechanism proposed by Mojrian [9], the ability of neural network machine translation to process long sentences was further improved. The attention mechanism computes alignment information between corresponding parts of the source sequence and the target sequence through a distribution of weights, so that the model "targets" the specified part during both training and prediction. Later, Lazli [10] and others studied the attention mechanism further, replacing the entire sentence with a fixed-length window and thereby reducing the amount of computation it requires. The attention mechanism made the results of neural network machine translation comparable to those of traditional statistical machine translation. As a result, neural network-based methods have become the mainstream approach in machine translation research. At this stage, to overcome the vanishing and exploding gradient problems that can arise in the classic recurrent neural network model, the nodes of the network usually use complex structures such as LSTM (Long Short-Term Memory) and its variant GRU (Gated Recurrent Unit), which makes model training slow.
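The weight distribution that the attention mechanism computes for one decoding step can be illustrated with a minimal sketch. It is not the formulation of [9] or [10]: the dot-product scoring function, the function names `softmax` and `attend`, and the toy vectors are all assumptions made here for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    Scoring is a plain dot product (an assumption of this sketch).
    """
    # One alignment score per source position.
    scores = [sum(d * h for d, h in zip(decoder_state, enc))
              for enc in encoder_states]
    # Normalise the scores into a weight distribution over the source.
    weights = softmax(scores)
    # Context vector: weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Hypothetical toy example: three source positions, two-dimensional states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

Source positions whose encoder state aligns with the current decoder state receive larger weights. A windowed variant would apply the same computation to a fixed-length slice of `encoder_states` rather than to the whole sentence.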
Subsequently, to improve model training, Sheta [11] introduced a translation model based on convolutional neural networks, which extracts sentence features in windows and hierarchically while retaining the accuracy of recurrent neural networks; model training is then accelerated through parallel computing. Poddar [12] implemented an English-Chinese machine translation model based on a sample neural network with an attention mechanism, used different Word2Vec models to generate English word vectors, and optimized the English-Chinese neural network machine translation model. Other scholars have implemented English-Chinese machine translation models based on convolutional neural networks and on the Transformer, adding pretrained word vectors to the translation model to improve its quality by providing prior information [13–15].
This article analyzes neural networks and genetic algorithms in detail, starting from their respective shortcomings, and examines the necessity and feasibility of combining them. Using the generation gap operator and a crossover operator based on convex set theory, an improved genetic algorithm for learning neural network weights is formed and applied to the verification of progressive voice. At the same time, the artificial neural network method can help in designing and implementing the genetic algorithm. The impulse response or step response curve of the object is comparatively easy to obtain in the process; the series of its values at the sampling instants is taken as the information describing the dynamic characteristics of the object and forms a predictive model. Because this nonparametric model is easy to obtain and simple to compute, it is also more robust. The structure and characteristics of the multilayer feedforward neural network are analyzed and summarized, together with its computing power and function approximation capability. Several methods for selecting the number of hidden nodes are presented, followed by two heuristic algorithms and their implementation process. Finally, a detailed design of a genetic algorithm model is given, and related tests and performance analysis are carried out.
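The idea of learning network weights with a genetic algorithm that has a generation gap and a convex-combination crossover can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm evaluated in this paper: the XOR toy data, the network size, tournament selection, Gaussian mutation, and every parameter value are hypothetical, and the convex-set-based crossover is read here as an ordinary arithmetic (convex combination) crossover.

```python
import math
import random

# Toy data (XOR) standing in for real feature vectors; purely illustrative.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
        ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
N_HIDDEN = 3
# Each hidden unit has 2 input weights + 1 bias; the output unit has
# N_HIDDEN weights + 1 bias.
N_WEIGHTS = N_HIDDEN * 3 + (N_HIDDEN + 1)

def forward(w, x):
    """Feedforward pass: tanh hidden layer, sigmoid output."""
    hidden = []
    for j in range(N_HIDDEN):
        a, b, bias = w[3 * j: 3 * j + 3]
        hidden.append(math.tanh(a * x[0] + b * x[1] + bias))
    out = w[3 * N_HIDDEN:]
    z = sum(o * h for o, h in zip(out, hidden)) + out[-1]
    z = max(-60.0, min(60.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def mse(w):
    """Mean squared error of weight vector w on the toy data (lower is fitter)."""
    return sum((forward(w, x) - y) ** 2 for x, y in DATA) / len(DATA)

def convex_crossover(p1, p2, rng):
    """Child is a convex combination of its parents (arithmetic crossover)."""
    lam = rng.random()
    return [lam * a + (1.0 - lam) * b for a, b in zip(p1, p2)]

def mutate(w, rng, rate=0.1, sigma=0.3):
    """Add Gaussian noise to each weight with probability `rate`."""
    return [v + rng.gauss(0.0, sigma) if rng.random() < rate else v for v in w]

def tournament(pop, rng, k=3):
    """Pick the fittest of k randomly drawn individuals."""
    return min(rng.sample(pop, k), key=mse)

def evolve(pop_size=40, generations=150, gap=0.8, seed=0):
    """Run the GA; `gap` is the fraction of the population replaced per generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(N_WEIGHTS)]
           for _ in range(pop_size)]
    n_new = int(gap * pop_size)
    history = []
    for _ in range(generations):
        pop.sort(key=mse)
        history.append(mse(pop[0]))
        survivors = pop[:pop_size - n_new]  # fittest individuals carried over
        children = [mutate(convex_crossover(tournament(pop, rng),
                                            tournament(pop, rng), rng), rng)
                    for _ in range(n_new)]
        pop = survivors + children
    best = min(pop, key=mse)
    history.append(mse(best))
    return best, history

best_weights, error_history = evolve()
```

Because the fittest individuals survive each generation unchanged, the best error recorded in `error_history` never increases. The search also needs no gradient of the error function, which matches the point that the genetic algorithm is not restricted to differentiable search spaces.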
2. Chinese-English Parallel Corpus Noise Processing Model Based on Multilayer Perceptron Genetic Algorithm Neural Network
2.1. Multilayer Perceptron Hierarchical Distribution
Digital Chinese-English parallel corpus noise processing covers three aspects: the digital representation of Chinese-English parallel corpus noise, the theory, methods, and techniques of its digital processing, and their practical applications in various fields [16]. Figure 1 shows the hierarchical spatial distribution of multilayer perceptrons.