
Jorge González
Pontificia Universidad Católica de Chile | UC · Departamento de Estadística
PhD in Sciences
About
44
Publications
6,011
Reads
379
Citations
Introduction
Additional affiliations
March 2011 - September 2015
January 2004 - December 2007
Education
January 2004 - December 2007
Publications (44)
Mixed-format tests contain items with different formats such as dichotomously scored and polytomously scored items. The aim of this study was to examine the impact of item discrimination, sample size, and proportion of polytomously scored items on item response theory (IRT) kernel equating of mixed-format tests under the equivalent groups design. A...
We define the concept of Global Marginal Effect as the marginal effect observed through the conditional expectation of a partitioned population. The related mathematical expression is characterised by the law of total probability and is shown to be a function of the independent variable. We discuss the interpretation of the new measure of global...
Liou and Cheng (J Educ Behav Stat 20(3):259–286, 1995) discussed large-sample approximations for the standard error of equating (SEE) using the results of Bahadur (Ann Math Stat 37(3):577–580, 1966) and Ghosh (Ann Math Stat 42(6):1957–1961, 1971) on the asymptotic representation of sample quantiles. In this paper we revisit the Bahadur representati...
Assessing the predictive capacity of selection tests is challenging because the response variable is observed only in selected individuals. In this paper we propose to evaluate the predictive capacity of selection tests through marginal effects under a partial identification approach. Identification bounds are defined for the marginal effects under...
Test equating methods are widely used to make different test forms, administered on different occasions to different test takers, comparable. Although software for test equating is currently available, in this paper we focus on four R packages that can facilitate test equating for researchers and test developers. Thi...
This proceedings volume highlights the latest research and developments in psychometrics and statistics. It represents selected and peer reviewed presentations given at the 84th Annual International Meeting of the Psychometric Society (IMPS), organized by Pontificia Universidad Católica de Chile and held in Santiago, Chile during July 15th to 19th,...
Using the well-known strategy in which parameters are linked to the sampling distribution via an identification analysis, we offer an interpretation of the item parameters in the one-parameter logistic with guessing model (1PL-G) and the nested Rasch model. The interpretations are based on measures of informativeness that are defined in terms of od...
Gaussian kernel continuization of the score distributions has been the standard choice in kernel equating. In this paper we illustrate the use of both the Epanechnikov and adaptive kernels in the actual equating step using the R package SNSequate (González, J Stat Softw 59(7):1–30, 2014). The two new kernel equating methods are compared with each o...
This research examines empirically the relationship between two measures of teacher quality: one based on professional standards and a second one using teacher value-added estimates. It also studies the extent to which teacher observable characteristics, such as teacher training variables, are associated with better performance on either of these mea...
This book describes how to use test equating methods in practice. The non-commercial software R is used throughout the book to illustrate how to perform different equating methods when score data are collected under different data collection designs, such as equivalent groups design, single group design, counterbalanced design and non equivalent g...
This chapter describes the kernel equating framework. The five steps that characterize kernel equating are illustrated using the Math20EG, Math20SG, CBdata, and KB36 data sets that were introduced in Chap. 2 and which have been previously analyzed in the literature. We also illustrate the methods using the ADM admissions test data set. The R packag...
Depending on the equating data collection design, score data will be in the form of either univariate or bivariate distributions. In this chapter, we describe how to prepare the score distributions in order to read them into the different R packages that will be used. Presmoothing the score distributions as a first step in equating is also discusse...
In this chapter, different methods of Item Response Theory (IRT) linking and equating will be discussed and illustrated using the SNSequate (González, J Stat Softw 59(7):1–30, 2014) and equateIRT (Battauz, J Stat Softw 68(7):1–22, 2015) packages. Other useful packages include ltm (Rizopoulos, J Stat Softw 17(5):1–25, 2006) and mirt (Chalmers, J Sta...
This chapter describes traditional equating methods and their implementation in R. The equate package (Albano, J Stat Softw 74(8):1–36, 2016) will be used the most, although the possibility to use SNSequate (González, J Stat Softw 59(7):1–30, 2014) for traditional equating methods will also be explored. The methods included in this chapter are mean...
This chapter briefly describes some recent developments in test equating and provides examples of how they are performed using the R packages kequate (Andersson et al., J Stat Softw 55(6):1–25, 2013) and SNSequate (González, J Stat Softw 59(7):1–30, 2014). The chapter begins with recent developments within the kernel equating framework, including d...
This chapter provides a general overview of equating and offers a conceptual and formal mathematical definition of equating. The roles of random variables, probability distributions, and parameters in the equating statistical problem are described. Different data collection designs are introduced, and an overview of some of the equating methods tha...
Local equating (van der Linden (2011) Local observed-score equating. In: von Davier A (ed) Statistical models for test equating, scaling, and linking. Springer, New York, pp 201–223) can be seen as an attempt to obtain a fairer equating in comparison to the traditional equating methods described in previous chapters. In this chapter, the concept of...
Equating methods make use of an appropriate transformation function to map the scores of one test form into the scale of another so that scores are comparable and can be used interchangeably. The equating literature shows that the ways of judging the success of an equating (i.e., the score transformation) might differ depending on the adopted frame...
The Poisson binomial (PB) distribution is the probability distribution of the number of successes in independent but not necessarily identically distributed binary trials. The independent non-identically distributed case emerges naturally in the field of item response theory, where answers to a set of binary items are conditionally independent given the level...
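As an illustration of the definition above, the following is a minimal Python sketch (not taken from the paper) that computes the Poisson binomial probabilities by adding one trial at a time. The function names `poisson_binomial_pmf` and `rasch_prob`, and the use of a Rasch-type success probability to generate the item probabilities, are assumptions made for this example only.

```python
import math


def rasch_prob(theta, b):
    """Success probability for ability theta and item difficulty b
    (a Rasch-type form, assumed here only to produce example probabilities)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def poisson_binomial_pmf(probs):
    """Return [P(S = 0), ..., P(S = n)] for S = number of successes in
    n independent Bernoulli trials with success probabilities `probs`."""
    pmf = [1.0]  # with zero trials, S = 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial fails: count stays at k
            nxt[k + 1] += mass * p       # this trial succeeds: count moves to k + 1
        pmf = nxt
    return pmf


# Hypothetical example: three items of different difficulty, one ability level.
item_probs = [rasch_prob(0.0, b) for b in (-1.0, 0.0, 1.0)]
score_dist = poisson_binomial_pmf(item_probs)
```

In this sketch `score_dist[k]` is the probability of a sum score of `k` given the ability level, so the list sums to one.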
In education studies, value-added is by and large defined in terms of a test-score distribution mean. Therefore, all but a particular summary of the test score distribution is ignored. Developing a value-added definition that incorporates the entire conditional distribution of students' scores given school effects and control variables would produce...
Condition 2 of Theorem 2 was incorrect in the published version. The correct condition 2 appears in this erratum. Theorem 2. Suppose that I ≥ 3 for the fixed-effects 3PL model. If ω1 = (α1, β1, c1) is fixed at (1, 0, 0), then 1. The person parameters are identified by the observations. 2. The item parameters (α2:J, β2:J, c2:J) are not identified by...
The paper offers a general review of the basic concepts of both statistical model and parameter identification, and revisits the conceptual relationships between parameter identification and both parameter interpretability and properties of parameter estimates. All these issues are then exemplified for the 1PL, 2PL and 1PL-G fixed-effects models. Fo...
This paper aims to contribute to the discussion about teacher quality in Chile through two approaches: (a) one based on estimates of the value added by teachers or teacher effect on students' learning, gauged by exploring the role of teachers' characteristics and the context in which they work (school and municipality); and (b) another which seeks...
Equating is an important step in the process of collecting, analyzing, and reporting test scores in any program of assessment. Methods of equating utilize functions to transform scores on two or more versions of a test, so that they can be compared and used interchangeably. In common practice, traditional methods of equating use either parametric o...
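To make the idea of a score transformation concrete, here is a minimal Python sketch of linear equating, one of the traditional methods: a score on form X is mapped to the form Y scale by matching means and standard deviations. It is not the method proposed in the paper, and the function name `linear_equate` and the score lists are hypothetical.

```python
from statistics import mean, pstdev


def linear_equate(x_scores, y_scores):
    """Return a function mapping a form-X score onto the form-Y scale using
    y = mu_Y + (sd_Y / sd_X) * (x - mu_X)."""
    mu_x, mu_y = mean(x_scores), mean(y_scores)
    sd_x, sd_y = pstdev(x_scores), pstdev(y_scores)
    if sd_x == 0:
        raise ValueError("form X scores have zero variance")
    slope = sd_y / sd_x

    def to_y_scale(x):
        return mu_y + slope * (x - mu_x)

    return to_y_scale


# Hypothetical score samples from two test forms.
form_x = [10, 12, 14, 16, 18]
form_y = [20, 24, 28, 32, 36]
equate = linear_equate(form_x, form_y)
```

With these made-up samples, `equate(14)` returns the form Y mean, because 14 is the form X mean.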
This study aims to contribute to the discussion on teacher quality in Chile through two approaches: (a) one based on estimates of teacher value added, or the teacher effect on learning, exploring the role of the characteristics of teachers and of the context in which they work (school and municipality); and (b) another that attempts to predict the perform...
Equating methods utilize functions to transform scores on two or more versions of a test, so that they can be compared and used interchangeably. In common practice, traditional methods of equating use parametric models where, apart from the test scores themselves, no additional information is used for the estimation of the equating transformation....
Equating is a family of statistical models and methods that are used to adjust scores on two or more versions of a test, so that the scores from different tests may be used interchangeably. In this paper we present the R package SNSequate which implements both standard and nonstandard statistical models and methods for test equating. The package co...
Linear mixed modeling is a useful approach for double mixed factorial designs with covariates. It is explained how these designs are appropriate for the study of human behavior as a function of characteristics of persons and situations and stimuli in the situations. The behavior of subjects nested in types of persons responding to stimuli nested in...
A power analysis of seven normality tests against the Ex-Gaussian distribution (EGd) is presented. The EGd is selected on the basis that it is a particularly well-suited distribution to accommodate positively skewed distributions such as those observed in reaction time data. A pre-assessment of the power of the selected tests across various types...
Local equating (LE) is based on Lord's criterion of equity. It defines a family of true transformations that aim at the ideal of equitable equating. van der Linden (this issue) offers a detailed discussion of common issues in observed-score equating relative to this local approach. By assuming an underlying item response theory model, one of the ma...
Based on Lord's criterion of equity of equating, van der Linden (this issue) revisits the so-called local equating method and offers alternative as well as new thoughts on several topics including the types of transformations, symmetry, reliability, and population invariance appropriate for equating. A remarkable aspect is to define equating as a s...
In social network studies, most often only a single relation (or link) between the actors is investigated. When more than one link has been recorded, the two-way sociomatrix becomes a three-way array with the set of links being the third way. In this paper, we present a model which simultaneously accounts for the three ways in the data. Random eff...
Bayesian methods have become increasingly popular in social sciences due to their flexibility in accommodating numerous models from different fields. The domain of item response theory is a good example of fruitful research, incorporating in recent years new developments and models, which are being estimated using the Bayesian approach. This is pa...
Using the concept of reduction by sufficiency of a Bayesian model, the issue of Bayesian identifiability is discussed. Various statements given in the literature on Bayesian identifiability are revised. Particular attention is paid to the possibility of updating unidentified parameters. This issue is discussed under a general framework and also car...
Structural equation models are commonly used to analyze 2-mode data sets, in which a set of objects is measured on a set of variables. The underlying structure within the object mode is evaluated using latent variables, which are measured by indicators coming from the variable mode. Additionally, when the objects are measured under different condit...
This paper analyzes the sum score based (SSB) formulation of the Rasch model, where items and sum scores of persons are considered as factors in a logit model. After reviewing the evolution leading to the equality between their maximum likelihood estimates, the SSB model is then discussed from the point of view of pseudo-likelihood and of misspecif...
The present study investigates the nature of classroom teaching discourse among primary school language teachers in Chile. To this end, it reanalyzes video data collected in a previous study in 2003. The statistical model used to analyze this information makes it possible to place on a single interval scale the characterist...
Marginal maximum likelihood estimation is commonly used to estimate logistic-normal models. In this approach, the contribution of random effects to the likelihood is represented as an intractable integral over their distribution. Thus, numerical methods such as Gauss–Hermite quadrature (GH) are needed. However, as the dimensionality increases, the...
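To show what Gauss–Hermite quadrature does in the one-dimensional case, the sketch below approximates the marginal success probability of a logistic model with a normally distributed random intercept. It is only an illustration, not the paper's method: it uses the fixed 3-point rule (whose nodes and weights are known in closed form), and the names `marginal_success_prob`, `beta` and `sigma` are assumptions of this example.

```python
import math

# Exact nodes and weights of the 3-point Gauss-Hermite rule for the
# weight function exp(-x^2).
GH3_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH3_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def marginal_success_prob(beta, sigma):
    """Approximate E[logistic(beta + u)] for u ~ N(0, sigma^2) with the
    3-point rule, after the change of variable u = sqrt(2) * sigma * x."""
    total = 0.0
    for x, w in zip(GH3_NODES, GH3_WEIGHTS):
        u = math.sqrt(2.0) * sigma * x
        total += w * logistic(beta + u)
    return total / math.sqrt(math.pi)  # normalising constant of the Gaussian
```

A practical implementation would use more nodes; with several random effects the number of evaluation points grows as the product of the per-dimension node counts, which is the dimensionality issue the abstract refers to.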
A cluster is a collection of subunits on which observations are made. The clusters are the units. Another usual form of clustering arises when data are repeated observations on the same unit. Examples of the same units being observed are students measured with different items and items presented to different persons, and examples of clusters with s...
Projects
Project (1)
In the most recent international mathematics assessments, TIMSS 2015 and PISA 2015, Chile did not obtain the expected results: although average performance rose relative to earlier assessments, when Chile is compared with countries of similar GDP, its mathematics performance falls short of what young Chileans need to function in the world of the 21st century.
These results are more than sufficient reason to keep working to improve mathematics learning. That improvement must take place within the regulatory framework of the Chilean educational system, the National System for Quality Assurance in Education (Sistema Nacional de Aseguramiento de la Calidad de la Educación, SAC), whose main objective is the continuous improvement of student learning. Achieving this objective requires strengthening teachers' capacities. The Education Quality Agency (Agencia de Calidad de la Educación, ACE) provides concrete tools and support so that these capacities are used efficiently in the classroom: workshops and resources for school improvement, learning visits, and progressive assessments.
This project aims to improve the progressive assessments so that they become a pedagogical resource for measuring mathematics learning in 8th grade of primary education and 1st year of secondary education, locating the knowledge state of a student or group of students while also identifying the states of greater knowledge and the possible learning paths for reaching them.
To achieve this, a new conception of learning must be introduced, around which the set of topics making up a curriculum is to be organized. This conception is one of non-linear learning, which means recognizing that not all students learn mathematics in the same way. It is based on the psychometric theory of Knowledge Spaces. Although there are international experiences in which this theory has been applied in various school subjects, they cannot be applied directly in Chile because they depend on a specific curriculum, in our case that of Mathematics. Hence the first product this project will generate is the mathematical knowledge space for 8th grade of primary education and 1st year of secondary education. This space will be constructed empirically and corroborated by a knowledge space built from expert judgment.
Given the knowledge space, it becomes necessary to locate the knowledge state to which each student belongs. The second product of this project is the Knowledge State Location Test (Prueba de Localización de Estado de Conocimiento). Along with locating a student's knowledge state, it is possible to identify all the states of greater knowledge and the possible learning paths leading to each of them. This makes the diversity of learning in the classroom explicit and provides a tool for planning pedagogical actions for different groups of students.
Both products are intended to improve the Progressive Assessments, so they seek to fit within the regulatory framework defined by the SAC and thus serve as tools that will improve the assessment and school-planning capacities of Chilean teachers.
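The notions of knowledge state, states of greater knowledge, and learning paths used in this project description can be sketched in a few lines of Python. The structure below is a made-up toy example with three items (`a`, `b`, `c`), not the project's knowledge space, and the function names are hypothetical. The sketch also assumes, for simplicity, that a learning path adds exactly one item per step.

```python
def greater_states(structure, state):
    """States in the structure that strictly contain `state`."""
    return [s for s in structure if state < s]


def learning_paths(structure, start, goal):
    """All paths from `start` to `goal` that stay inside the structure
    and add exactly one item at each step (a simplifying assumption)."""
    if start == goal:
        return [[start]]
    paths = []
    for nxt in structure:
        one_item_more = start < nxt and len(nxt) == len(start) + 1
        if one_item_more and nxt <= goal:
            for tail in learning_paths(structure, nxt, goal):
                paths.append([start] + tail)
    return paths


# Toy knowledge structure: each state is the set of items a student masters.
toy_structure = [
    frozenset(),
    frozenset("a"),
    frozenset("b"),
    frozenset("ab"),
    frozenset("abc"),
]

above_a = greater_states(toy_structure, frozenset("a"))
paths_to_full = learning_paths(toy_structure, frozenset(), frozenset("abc"))
```

In this toy structure a student who masters nothing can reach full mastery either through item `a` first or through item `b` first, which is the kind of diversity of learning paths the description refers to.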