Computerized adaptive testing using neural networks

... There are also different approaches to the choice of network type. Both "classical" feed-forward neural networks, in which the signal passes sequentially from layer to layer [2,3], and recurrent neural networks (RNNs), in which there is feedback between neurons and the output signal can be passed to the inputs of neurons in a previous layer [4], are used. Among the interesting ideas published in a number of works, it is worth noting the application of open-systems methods to the construction of neural networks, in particular building ANNs on a modular principle [4][5]. ...
Article
Purpose of the study. The aim of the study is to create neural network models of the modules of an adaptive testing system in order to design an individual testing trajectory. The article discusses the implementation of an adaptive testing system that incorporates artificial neural network modules. These modules choose the topic and the difficulty of the next question, taking into account the previous answers and the difficulty of previously asked questions, as well as the relatedness of topics and the response time (as an indicator of guessing or of searching for an answer), thereby forming an individual testing trajectory. Materials and methods. In the course of the study, the data that affect the quality of the solution were analyzed, a general modular structure of the system was proposed, and the main data flows arriving at the input of an artificial neural network (ANN) were described. To choose the difficulty of a question, a feed-forward network is proposed, and various ANN architectures and training parameters (weight update algorithms, loss functions, number of training epochs, batch sizes) are compared. As an alternative, a recurrent LSTM (Long Short-Term Memory) network is considered. All results were obtained using the high-level Keras library, which makes it possible to get started quickly in the early stages of research and obtain first results. The SGD, Adam, NAdam and RMSprop optimizers implemented in Keras were compared with respect to convergence speed. Adam gave the best accuracy; the MSE (mean squared error) loss function was used with the optimizer. Training was carried out for a large number of epochs, and plots of accuracy against the number of epochs were obtained experimentally for different numbers of neurons in the hidden layer. Results.
Based on the study, we can conclude that the 80-85% accuracy obtained by the feed-forward network is sufficient for its use in the adaptive testing system. However, it remains an open question whether the efficiency of the implemented network needs to be improved, which calls for research on methods of improving network efficiency, including finer tuning of parameters, learning algorithms, and architecture. A well-known drawback of LSTMs is their demand for hardware and resources, both during training (which takes a significant amount of time) and at run time. In our case this is compounded by increased requirements for the training sample, which casts doubt on the advisability of further study of LSTM networks for this task. Conclusion. The proposed tools will make it possible to implement an adaptive testing system with intelligent selection of questions depending on the demonstrated level of knowledge of the test taker, forming an individual testing trajectory that reliably determines the test taker's level of knowledge with an optimal number of questions.
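To make the feed-forward setup in the abstract above more concrete, here is a minimal sketch under stated assumptions. It is not the authors' Keras model: it is a dependency-free stand-in with one hidden layer, trained with plain SGD (not Adam) on squared error. The function names, the two input features and the toy data are all hypothetical and serve only to illustrate predicting the difficulty of the next question from the previous answer.

```python
import math
import random


def train_difficulty_net(samples, hidden=4, epochs=2000, lr=0.5, seed=0):
    """Fit a one-hidden-layer feed-forward net with plain SGD on squared error.

    samples: list of (features, target) pairs, with target scaled to [0, 1].
    Returns a function mapping a feature list to a predicted difficulty.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        # tanh hidden layer followed by a sigmoid output in (0, 1)
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        z = sum(w * hi for w, hi in zip(w2, h)) + b2
        return h, 1.0 / (1.0 + math.exp(-z))

    for _ in range(epochs):
        for x, target in samples:
            h, y = forward(x)
            dz = (y - target) * y * (1.0 - y)  # gradient of 0.5 * error^2 w.r.t. z
            for j in range(hidden):
                dh = dz * w2[j] * (1.0 - h[j] ** 2)  # backpropagate through tanh
                w2[j] -= lr * dz * h[j]
                b1[j] -= lr * dh
                for i in range(n_in):
                    w1[j][i] -= lr * dh * x[i]
            b2 -= lr * dz

    return lambda x: forward(x)[1]


def mse(predict, samples):
    """Mean squared error of a predictor over (features, target) pairs."""
    return sum((predict(x) - t) ** 2 for x, t in samples) / len(samples)


# Hypothetical data: [previous answer correct?, previous difficulty] -> next difficulty
data = [([1.0, 0.2], 0.4), ([1.0, 0.6], 0.8), ([0.0, 0.6], 0.4), ([0.0, 0.2], 0.1)]
untrained = train_difficulty_net(data, epochs=0)
trained = train_difficulty_net(data)
print(mse(untrained, data), mse(trained, data))
```

Calling the function with `epochs=0` returns the randomly initialized network, so the two printed values compare the error before and after training from the same starting weights.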
Article
The article discusses the advantages and prospects for the implementation of an adaptive approach in the tasks of computerized ability testing, standardization of diagnostic methods and development of simulators for teaching professional skills in the zone of proximal development (development of "soft skills" and "hard skills"). The results of the analysis of the reliability of tests using an adaptive approach and comparison of the obtained results with the classical paper and computer form of ability diagnostics are presented. An assessment of the effectiveness and advantages of this approach to determining the level of complexity of test items using the method of convolution of applied Markov models into quantum representations is presented. The effectiveness of the method on small samples has been proved.
Article
A Bayesian procedure to estimate the three-parameter normal ogive model and a generalization of the procedure to a model with multidimensional ability parameters are presented. The procedure is a generalization of a procedure by Albert (1992) for estimating the two-parameter normal ogive model. The procedure supports analyzing data from multiple populations and incomplete designs. It is shown that restrictions can be imposed on the factor matrix for testing specific hypotheses about the ability structure. The technique is illustrated using simulated and real data.
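For orientation, the item response function of the three-parameter normal ogive model can be sketched as follows. This shows only the response probability, not the Bayesian estimation procedure described in the abstract. The parameterization c + (1 - c) Φ(a(θ - b)) is one common form and is an assumption here; the function and parameter names are illustrative.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def p_correct_3pno(theta, a, b, c):
    """Probability of a correct response under a three-parameter normal ogive.

    theta: ability, a: discrimination, b: difficulty, c: guessing parameter.
    Setting c = 0 gives the two-parameter normal ogive model.
    """
    return c + (1.0 - c) * normal_cdf(a * (theta - b))


print(p_correct_3pno(theta=0.0, a=1.0, b=0.0, c=0.2))
```

When ability equals difficulty, the probability lies halfway between the guessing level c and 1.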
Article
Two algorithms for producing multiple imputations for missing data are evaluated with simulated data. Software using a propensity score classifier with the approximate Bayesian bootstrap produces badly biased estimates of regression coefficients when data on predictor variables are missing at random or missing completely at random. On the other hand, a regression-based method employing the data augmentation algorithm produces estimates with little or no bias.
Book
This graduate-level textbook is a tutorial for item response theory that covers both the basics of item response theory and the use of R for preparing graphical presentation in writings about the theory. Item response theory has become one of the most powerful tools used in test construction, yet one of the barriers to learning and applying it is the considerable amount of sophisticated computational effort required to illustrate even the simplest concepts. This text provides the reader access to the basic concepts of item response theory freed of the tedious underlying calculations. It is intended for those who possess limited knowledge of educational measurement and psychometrics. Rather than presenting the full scope of item response theory, this textbook is concise and practical and presents basic concepts without becoming enmeshed in underlying mathematical and computational complexities. Clearly written text and succinct R code allow anyone familiar with statistical concepts to explore and apply item response theory in a practical way. In addition to students of educational measurement, this text will be valuable to measurement specialists working in testing programs at any level and who need an understanding of item response theory in order to evaluate its potential in their settings. • Combines clearly written text and succinct R code • Utilizes a building-block approach from simple to complex, enabling readers to develop a clinical feel for item response theory and how its concepts are interrelated • Includes downloadable R functions that implement various facets of item response theory Frank B. Baker, Ph.D., is Professor Emeritus of the Department of Educational Psychology at the University of Wisconsin-Madison. He is author of numerous publications dealing with item response theory and statistical methodology. He received his B.S., M.S., and Ph.D. degrees from the University of Minnesota, Minneapolis. 
Seock-Ho Kim, Ph.D., is Professor in the Department of Educational Psychology at the University of Georgia. He is author of numerous publications in psychometrics and applied statistics and is a member of the American Educational Research Association, the American Statistical Association, the National Council on Measurement in Education, and the Psychometric Society, among other organizations. He received his B.A. from Korea University and his M.S. and Ph.D. degrees from the University of Wisconsin-Madison.
Chapter
A generalized partial credit model (GPCM) was formulated by Muraki (1992) based on Masters’ (1982, this volume) partial credit model (PCM) by relaxing the assumption of uniform discriminating power of test items. However, the difference between these models is not only the parameterization of item characteristics but also the basic assumption about the latent variable. An item response model is viewed here as a member of a family of latent variable models which also includes the linear or nonlinear factor analysis model, the latent class model, and the latent profile model (Bartholomew, 1987).
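As a rough illustration of how the item-specific discriminating power enters the model, the category probabilities of the GPCM can be sketched as below. This assumes the usual form in which the probability of category k is proportional to the exponential of the cumulative sum of a(θ - b_v) over the first k steps; the function and argument names are hypothetical.

```python
import math


def gpcm_probs(theta, a, steps):
    """Category probabilities for one item under the generalized partial credit model.

    theta: latent trait, a: item discrimination (slope),
    steps: step parameters b_1..b_m, giving m + 1 score categories 0..m.
    """
    # Cumulative sums of a * (theta - b_v); category 0 has an empty sum of 0.
    cum = [0.0]
    for b in steps:
        cum.append(cum[-1] + a * (theta - b))
    top = max(cum)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in cum]
    total = sum(weights)
    return [w / total for w in weights]


print(gpcm_probs(theta=0.5, a=1.2, steps=[-1.0, 0.0, 1.0]))
```

Fixing `a` to the same value for every item removes the relaxation described above and gives the partial credit model up to a rescaling of the trait.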
Chapter
The graded response model represents a family of mathematical models that deals with ordered polytomous categories. These ordered categories include rating such as letter grading, A, B, C, D, and F, used in the evaluation of students’ performance; strongly disagree, disagree, agree, and strongly agree, used in attitude surveys; or partial credit given in accordance with an examinee’s degree of attainment in solving a problem.
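A minimal sketch of the ordered-category probabilities, assuming the common logistic form of the graded response model: the probability of responding in category k or higher is a logistic curve in θ, and each category probability is the difference between adjacent cumulative curves. The names and the five-category example (standing in for grades F to A) are illustrative.

```python
import math


def grm_probs(theta, a, thresholds):
    """Category probabilities for one item under a logistic graded response model.

    theta: latent trait, a: item discrimination (a > 0),
    thresholds: ascending boundaries b_1 < ... < b_m for m + 1 ordered categories.
    """
    # P(response >= k): 1 for the lowest category, 0 beyond the highest.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Five ordered categories, e.g. letter grades F, D, C, B, A
print(grm_probs(theta=0.0, a=1.0, thresholds=[-1.5, -0.5, 0.5, 1.5]))
```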
Article
A perennial problem for language testers is the need to construct and select test items with 'good' properties. The difficulty lies in the need to assess the properties of items by trying them out on a sample of subjects whose abilities, in turn, it ought to be possible to measure by observing their response to the items. This paper discusses the more important concepts of item response theory (IRT), a technique, or set of techniques, developed over the last 25 years, mainly by psychometricians. (An application of IRT was discussed in a recent issue of this journal (Henning, 1984).) Basic concepts are introduced and their implications considered by concentrating on the simplest IRT tool, the Rasch (1960) Model.
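The Rasch model mentioned above can be sketched in a few lines, assuming its standard form in which the log-odds of a correct response equal the difference between person ability and item difficulty. The function and parameter names are illustrative.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model.

    The log-odds of success are ability - difficulty, so a person whose
    ability equals the item difficulty succeeds with probability 0.5.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


for ability in (-1.0, 0.0, 1.0):
    print(ability, rasch_probability(ability, difficulty=0.0))
```

Because only the difference between the two parameters matters, persons and items are placed on the same scale, which is what allows item properties and subject abilities to be assessed from the same responses.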