September 2023
Over the last few years, the field of automatic speech recognition (ASR) has grown rapidly, driven by the diverse applications and solutions it offers. This paper presents a multiclass language classifier based on recurrent convolutional neural networks (CRNNs), whose objective is to classify the audio recordings of the KALAKA-3 database according to their language. To meet this objective, the mel-frequency cepstral coefficients (MFCCs) were extracted from each recording in the database and used as the input features for training. A CRNN was built for this task, reaching an accuracy of 98% on the test data and 40% on the evaluation (Eval) data. This work sets a precedent for improving real-time translators: in the future, a system could listen to a few seconds of a conversation, identify its language, and automatically translate it, which would be useful in a variety of applications.

Keywords: Artificial intelligence, Neural networks, CRNN, Database, KALAKA-3
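
As an illustration of the feature-extraction step mentioned above, the following is a minimal sketch of how MFCCs can be computed for a single audio frame: windowing, power spectrum, triangular mel filterbank, logarithm, and a DCT-II. It is not the authors' pipeline. The sample rate, frame length, number of filters, number of coefficients, and all function names here are assumptions chosen for illustration, and in practice an audio library routine would normally replace the naive DFT used below.

```python
import math


def hz_to_mel(f):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame, naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n)
                 for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    # n_filters + 2 edge frequencies, mapped to DFT bin indices.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        filt = [0.0] * n_bins
        for k in range(lo, mid):   # rising slope
            filt[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling slope, equal to 1.0 at the center
            filt[k] = (hi - k) / (hi - mid)
        bank.append(filt)
    return bank


def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs of one frame: log mel-band energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Floor the energies to avoid log(0) on empty bands.
    log_energies = [math.log(max(sum(w * p for w, p in zip(filt, spec)), 1e-10))
                    for filt in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energies))
            for c in range(n_coeffs)]


# Hypothetical input: one 256-sample frame of a 440 Hz tone at 8 kHz
# (both values are assumptions, not taken from KALAKA-3).
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 440 * i / SAMPLE_RATE) for i in range(256)]
coeffs = mfcc_frame(frame, SAMPLE_RATE)
print(len(coeffs))
```

Applying `mfcc_frame` to successive overlapping frames of a recording yields a coefficients-by-time matrix, which is the kind of two-dimensional input a convolutional front end followed by recurrent layers can consume.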