Although central to the teaching of academic language skills, video-recorded lectures have received little research attention regarding what makes them difficult for language learners. As part of a larger program to develop automated videotext complexity measures, this study examines how selected dimensions of linguistic complexity contribute to overall videotext difficulty. Based on English language learners' ratings of 320 video lectures, we built regression models to predict subjective estimates of video lecture difficulty. The results demonstrate that a 4-component partial least squares regression model explains 52% of the variance in video difficulty and significantly outperforms a baseline model in predicting the difficulty of videos in an out-of-sample test set. These results support the use of linguistic complexity features for predicting overall videotext difficulty and raise the possibility of developing automated systems for measuring video difficulty, akin to those already available for estimating the readability of written materials.
The value of understanding video difficulty is hardly controversial, although just how video difficulty should be theoretically conceptualized and empirically investigated is less clear. Taking a multidisciplinary, evidence-based approach to modeling video difficulty, this study investigates the impact of multimodal complexity on language learners' self-ratings of video difficulty, while accounting for the effects of individual differences and video production styles. Specifically, 279 B1-level EFL learners watched a corpus of 320 instructional videos and rated their difficulty. The corpus was then analyzed using natural language processing and computer vision algorithms to extract and compute a wide range of multimodal complexity indices. Results of linear mixed-effects modeling demonstrated that pitch variation and academic spoken formulaic sequences facilitate viewing comprehension, whereas infrequent words, image clutter, the number of visual objects, salient objects, visual texts, shots, and moving objects impede it. The study concludes with directions for future research.
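A linear mixed-effects analysis of this kind, with a random intercept per learner to absorb individual differences, can be sketched with statsmodels. The data frame below is a synthetic stand-in with two illustrative predictors (one facilitative, one impeding); the variable names are assumptions for the example, not the study's actual index names.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Illustrative stand-in for the rating data in long format: learners x videos
n_learners, n_videos = 30, 20
df = pd.DataFrame(
    [(l, v) for l in range(n_learners) for v in range(n_videos)],
    columns=["learner", "video"],
)
df["pitch_variation"] = rng.normal(size=len(df))
df["infrequent_words"] = rng.normal(size=len(df))

# Per-learner random intercepts model individual differences in ratings
learner_effect = rng.normal(scale=0.5, size=n_learners)
df["difficulty"] = (
    -0.4 * df["pitch_variation"]      # facilitative: lowers perceived difficulty
    + 0.6 * df["infrequent_words"]    # impeding: raises perceived difficulty
    + learner_effect[df["learner"]]
    + rng.normal(scale=0.8, size=len(df))
)

# Fixed effects for the complexity indices, random intercept for each learner
model = smf.mixedlm(
    "difficulty ~ pitch_variation + infrequent_words",
    data=df,
    groups="learner",
).fit()
print(model.summary())
```

The signs of the fixed-effect coefficients correspond to the facilitative versus impeding pattern reported in the abstract.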
Automated assessment of text difficulty has been recognized as one method for helping language teachers, textbook publishers, curriculum specialists, test developers, and researchers make more informed decisions when selecting texts for instruction and assessment. While there is a substantial body of work on written and spoken texts, research on videotext difficulty is scarce. Through a series of studies, this research program pursued two aims: to investigate what makes a videotext difficult for language learners and to develop automated measures that help predict videotext difficulty. The Second Language Video Complexity (SLVC) corpus, constructed for this thesis, contains 320 academic lectures and 320 government advertisements, annotated by 322 intermediate language learners. Study 1 examined the relative contribution of verbal complexity to videotext difficulty. The results demonstrated that videotext difficulty was predicted by variation in pitch, lexical frequency and sophistication, and syntactic complexity. Study 2 investigated the impact of visual complexity on learners' perception of videotext difficulty. To this end, novel computational measures of visual complexity in videotexts were developed and integrated into the Automated Video Analysis (AUVANA) software. The findings suggested that visual complexity contributes to videotext difficulty and that its impact is on a par with that of verbal complexity. Moreover, principal component analysis indicated that visual complexity is a multifaceted, multidimensional construct rather than a unitary one. While Studies 1 and 2 examined verbal and visual complexity independently, Study 3 integrated multimodal complexity features into ensemble machine learning models.
The findings showed that ensemble multimodal models outperformed unimodal models in predicting difficulty in both video genres. Finally, Study 4 developed an unsupervised approach for forecasting video segment difficulty in real time. Leveraging advanced AI algorithms, several neural network models were trained to forecast difficulty in a corpus of 34,363 video segments. Quantitative and qualitative analyses showed that the trained models performed well in forecasting the difficulty of unseen video segments. Taken together, this thesis makes clear contributions to videotext difficulty assessment. The findings demonstrate the usefulness of automated measures for assessing and predicting videotext difficulty. The thesis also introduces new measures of videotext complexity, along with computational tools for analyzing, computing, and visualizing complexity in videotexts, which may help researchers perform fine-grained analyses of videotext complexity.
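The multimodal-versus-unimodal comparison from Study 3 can be sketched with a scikit-learn stacking ensemble. The verbal and visual feature blocks below are synthetic stand-ins, and the choice of base learners is an assumption for illustration; the point is that a model seeing both modalities can recover signal that a verbal-only model cannot.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)

# Illustrative stand-ins for verbal and visual complexity feature blocks
X_verbal = rng.normal(size=(320, 10))
X_visual = rng.normal(size=(320, 10))
X_multimodal = np.hstack([X_verbal, X_visual])

# Synthetic difficulty: one verbal and one visual feature carry signal
y = X_verbal[:, 0] - X_visual[:, 0] + rng.normal(scale=0.5, size=320)

# Stacked ensemble: tree-based and linear base learners, linear meta-learner
ensemble = StackingRegressor(
    estimators=[
        ("forest", RandomForestRegressor(n_estimators=100, random_state=0)),
        ("ridge", Ridge()),
    ],
    final_estimator=Ridge(),
)

multi_r2 = cross_val_score(ensemble, X_multimodal, y, cv=5).mean()
verbal_r2 = cross_val_score(ensemble, X_verbal, y, cv=5).mean()
print(f"multimodal R^2={multi_r2:.2f}  verbal-only R^2={verbal_r2:.2f}")
```

Because part of the signal lives only in the visual block, the multimodal model's cross-validated R^2 exceeds the unimodal one, mirroring the pattern the study reports.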
Visual complexity is widely considered an important variable underlying visual perception. Although videos have become increasingly versatile in their use of visual imagery, surprisingly little research has been devoted to understanding the impact of visual complexity. In this paper, we present Automated Video Analysis (AUVANA), an open-source software tool for extracting, computing, and visualizing visual complexity in digital videos. Leveraging sophisticated computer vision and video processing algorithms, AUVANA automatically extracts and computes 78 video visual complexity indices. Results of exploratory analyses demonstrated that, rather than a unitary construct, video visual complexity is more likely a multidimensional and multifaceted phenomenon. We conclude the paper with a discussion of potential applications of the software.
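To give a concrete flavor of what visual complexity indices can look like, here is a minimal NumPy sketch of two toy measures: an edge-density clutter proxy and a frame-difference measure of the kind often used to flag shot cuts. These are illustrative simplifications, not AUVANA's actual 78 indices, and the synthetic frames stand in for decoded video.

```python
import numpy as np

def edge_density(gray: np.ndarray, threshold: float = 30.0) -> float:
    """Fraction of pixels with a strong intensity gradient -- a crude clutter proxy."""
    gy, gx = np.gradient(gray.astype(np.float32))
    magnitude = np.hypot(gx, gy)
    return float(np.count_nonzero(magnitude > threshold)) / magnitude.size

def frame_difference(a: np.ndarray, b: np.ndarray) -> float:
    """Mean absolute pixel change between consecutive frames; spikes suggest shot cuts."""
    return float(np.abs(a.astype(np.float32) - b.astype(np.float32)).mean())

# Synthetic grayscale frames stand in for decoded video frames
flat_gray = np.full((240, 320), 128, dtype=np.uint8)                 # uncluttered slide
rng = np.random.default_rng(3)
noisy_gray = rng.integers(0, 256, size=(240, 320), dtype=np.uint8)   # highly cluttered

print(f"flat edge density:  {edge_density(flat_gray):.3f}")
print(f"noisy edge density: {edge_density(noisy_gray):.3f}")
print(f"frame difference:   {frame_difference(flat_gray, noisy_gray):.1f}")
```

A flat frame yields zero edge density while a cluttered one scores high, which is the direction of effect (clutter impeding comprehension) reported in the studies above.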