Recent progress in the management of judicial proceedings, in particular the introduction of audio/video recording facilities, makes it possible to tackle the identification of emotional states in the courtroom. Discovering the affective states embedded in speech signals could support the semantic retrieval of multimedia clips and, in turn, a deeper understanding of the mechanisms behind courtroom debates and the decision-making processes of judges and jurors. This paper makes two main contributions: (1) a collection of real-world human emotions drawn from courtroom audio recordings; (2) an investigation of a hierarchical classification system, based on a risk minimization method, that recognizes emotional states from speech signatures. The proposed classification approach, named Multilayer Support Vector Machines, is evaluated by comparing its performance with traditional machine learning approaches on both benchmark datasets and real courtroom recordings. The recognition results obtained by the proposed technique outperform those achieved by traditional approaches such as SVM, k-Nearest Neighbors, Naïve Bayes, Decision Trees and Bayesian Networks.
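To make the idea of a hierarchical, risk-minimizing classifier concrete, the following minimal sketch routes a sample through two layers of linear SVMs: a root classifier picks a group of emotions, and a leaf classifier then decides within that group. It is an illustration under stated assumptions, not the architecture evaluated in the paper: the two-level tree, the group names (`high_arousal`, `low_arousal`), the four emotion labels, the 2-D toy features and the training routine (batch subgradient descent on the regularized hinge loss) are all hypothetical choices made for this example.

```python
# Illustrative two-layer hierarchy of linear SVMs (pure Python, no dependencies).
# All names, groups and data below are assumptions made for this sketch.

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + mean hinge loss by batch subgradient descent.

    X is a list of feature vectors, y a list of labels in {-1, +1}.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - eta * gj for wj, gj in zip(w, grad_w)]
        b -= eta * grad_b
    return w, b

def train_hierarchy(X, labels, groups):
    """Train one root SVM (group A vs group B) and one leaf SVM per group."""
    names = list(groups)  # exactly two groups, two labels each, in this sketch
    root_y = [1 if lab in groups[names[0]] else -1 for lab in labels]
    model = {"names": names, "root": train_linear_svm(X, root_y), "leaves": {}}
    for name in names:
        pos, neg = groups[name]
        pairs = [(x, lab) for x, lab in zip(X, labels) if lab in (pos, neg)]
        Xg = [x for x, _ in pairs]
        yg = [1 if lab == pos else -1 for _, lab in pairs]
        model["leaves"][name] = train_linear_svm(Xg, yg)
    return model

def predict(model, groups, x):
    """Route x through the root classifier, then through the chosen leaf."""
    w, b = model["root"]
    name = model["names"][0] if dot(w, x) + b >= 0 else model["names"][1]
    pos, neg = groups[name]
    w, b = model["leaves"][name]
    return pos if dot(w, x) + b >= 0 else neg

# Hypothetical grouping of emotions and toy 2-D "speech features".
GROUPS = {"high_arousal": ["anger", "fear"], "low_arousal": ["neutral", "sadness"]}
CENTERS = {"anger": (2.0, 2.0), "fear": (2.0, -2.0),
           "neutral": (-2.0, 2.0), "sadness": (-2.0, -2.0)}
OFFSETS = [(-0.2, -0.2), (0.2, -0.2), (-0.2, 0.2), (0.2, 0.2), (0.0, 0.0)]

X, labels = [], []
for lab, (cx, cy) in CENTERS.items():
    for dx, dy in OFFSETS:
        X.append([cx + dx, cy + dy])
        labels.append(lab)

model = train_hierarchy(X, labels, GROUPS)
print(predict(model, GROUPS, [2.1, 1.9]))
```

In this toy setting the first feature separates the two groups and the second separates the emotions inside each group, so a query close to the `anger` cluster is sent to the `high_arousal` leaf before the final label is chosen. A flat multi-class SVM would instead decide among all four labels in a single step; real speech features and the grouping of emotions would have to come from the data.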