Article
Publisher preview available

Modeling Multiple Response Processes in Judgment and Choice

Authors:
Ulf Böckenholt (Kellogg School of Management, Northwestern University)

Abstract

In this article, I show how item response models can be used to capture multiple response processes in psychological applications. Intuitive and analytical responses, agree–disagree answers, response refusals, socially desirable responding, differential item functioning, and choices among multiple options are considered. In each of these cases, I show that the response processes can be measured via pseudoitems derived from the observed responses. The estimation of these models via standard software programs that allow for missing data is also discussed. The article concludes with two detailed applications that illustrate the prevalence of multiple response processes.
Keywords: item response models, missing data, multiple-choice items
Supplemental materials: http://dx.doi.org/10.1037/2325-9965.1.S.83.supp
A key challenge in quantitative psychology is to develop models that parsimoniously capture how individuals differ in arriving at their judgments or choices. Successful examples include item response and discrete choice models (Böckenholt, 2006; van der Linden & Hambleton, 1997). Both classes of models have in common that they postulate a single response process that leads to the observed judgments and choices. In item response models, the probability of a correct response to an item depends on the difference between the test taker’s ability and the item difficulty. In choice models, the probability of a choice depends on the differences in utility between the choice options. Thus, in both cases, a single response process, formalized as a difference between ability and difficulty for item response models or as a difference between utilities for choice models, is postulated to hold for all respondents. In this article, I go beyond the notion of a single response process and consider applications in which respondents may arrive at their answers via multiple response processes.
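The two single-process assumptions can be made concrete in a few lines. The sketch below is illustrative only (the function names and example numbers are mine, not the article's): a Rasch-type item response probability driven by one ability-minus-difficulty difference, and a multinomial logit choice probability driven by utility differences.

```python
import math

def rasch_prob(theta, b):
    """P(correct) under a one-parameter (Rasch) item response model:
    determined solely by the difference between ability theta and
    item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def logit_choice_probs(utilities):
    """Choice probabilities under a multinomial logit model:
    determined solely by differences among the options' utilities."""
    exps = [math.exp(u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

print(rasch_prob(0.5, -0.2))                 # ability 0.5, difficulty -0.2 -> ~0.67
print(logit_choice_probs([1.0, 0.2, -0.5]))  # ~[0.60, 0.27, 0.13]
```

Adding a constant to every utility leaves the logit probabilities unchanged, which is the sense in which only differences matter in both models.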
Multiple response processes abound in psychological research. For example, there are a considerable number of dual-response theories in judgment and choice applications that are based on System 1 and System 2 distinctions (Evans, 2008). System 1 processes are characterized as unconscious, rapid, effortless, and automatic, whereas System 2 processes are characterized as conscious, slow, effortful, and deliberative. Each system can lead to different answers, as illustrated by the following test item: “A bat and a ball cost $1.10. The bat costs $1 more than the ball. How much does the ball cost?” (Frederick, 2005, p. 26). A typical immediate answer is “10 cents” because $1.10 can be divided easily into $1 and 10 cents, and 10 cents seems to be a reasonable price for a ball. However, after a moment of reflection and deliberation, a respondent may realize that the difference between $1 and 10 cents is less than $1 and give the correct answer instead.
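The deliberate answer follows from one line of algebra: letting $x$ be the price of the ball,

$$x + (x + 1.00) = 1.10 \;\Rightarrow\; 2x = 0.10 \;\Rightarrow\; x = 0.05,$$

so the ball costs 5 cents and the bat $1.05.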
Similarly, when asked questions about personal or sensitive issues, respondents may want to give honest answers but also want to present themselves in a favorable light, with the result that items measure both the actual behaviors of the respondents as well as the respondents’ tendency to edit their responses. To identify which response process gives rise to the observed answer, social desirability scales (Paulhus, 1984) have been developed that measure the degree to which respondents tend to present themselves favorably. However, success in using these scales to correct for respondents’ response- ...
Ulf Böckenholt, Kellogg School of Management, Northwestern University.
I am grateful to Carolyn Roux and Jacques Nantel for the rating data presented in the Applications section of this article.
Correspondence concerning this article should be addressed to Ulf Böckenholt, 2001 Sheridan Road, Evanston, IL 60208. E-mail: u-bockenholt@kellogg.northwestern.edu
This article is reprinted from Psychological Methods, 2012, Vol. 17, No. 4, 665–678.
Decision, 2013, Vol. 1(S), 83–103. © 2013 American Psychological Association. 2325-9965/13/$12.00. DOI: 10.1037/2325-9965.1.S.83
... For example, Roberts and colleagues (2000) developed the Generalized Graded Unfolding Model (GGUM) to better model the unfolding response process, which has been successfully applied to many non-cognitive assessments. Further, item response tree (IRTree) models have been proposed and well received as a flexible tool for handling RSs (e.g., Böckenholt, 2012, 2017, 2019; Henninger & Meiser, 2020; Lang et al., 2019; Lang & Tay, 2021; Lievens et al., 2018; Sun et al., 2021). However, the two lines of research have been developing largely in parallel, so that researchers concerned with the response process often do not account for RSs, and vice versa. ...
... Among the many approaches to handling RSs, the IRTree model (Böckenholt, 2012) exhibits the advantage of dissociating the focal trait from RSs while modeling them simultaneously. In the IRTree model, an item with polytomous responses is decomposed into several sub-items given a tree structure, where the sub-items are represented by nodes and the item categories are represented by leaves (i.e., end-nodes). ...
Article
Full-text available
Two research streams on responses to Likert-type items have been developing in parallel: (a) unfolding models and (b) individual response styles (RSs). To accurately understand Likert-type item responding, it is vital to parse unfolding responses from RSs. Therefore, we propose the Unfolding Item Response Tree (UIRTree) model. First, we conducted a Monte Carlo simulation study to examine the performance of the UIRTree model compared to three other models for Likert-type responses: Samejima’s Graded Response Model, the Generalized Graded Unfolding Model, and the Dominance Item Response Tree (DIRTree) model. Results showed that when data followed an unfolding response process and contained RSs, AIC was able to select the UIRTree model, while BIC was biased towards the DIRTree model in many conditions. In addition, model parameters in the UIRTree model could be accurately recovered under realistic conditions, and mis-specifying the item response process or wrongly ignoring RSs was detrimental to the estimation of key parameters. Then, we used datasets from empirical studies to show that the UIRTree model could fit personality datasets well and produced more reasonable parameter estimates compared to competing models. A strong presence of RS(s) was also revealed by the UIRTree model. Finally, we provided examples with R code for UIRTree model estimation to facilitate the modeling of responses to Likert-type items in future studies.
... For example, Roberts and colleagues (2000) developed the Generalized Graded Unfolding Model (GGUM) to better model the unfolding response process, which has been successfully applied to many non-cognitive assessments. Further, item response tree (IRTree) models have been proposed and well received as a flexible tool for handling RSs (e.g., Böckenholt, 2012, 2017, 2019; Henninger & Meiser, 2020; Jeon & De Boeck, 2016; Lang et al., 2019; Lang & Tay, 2019; Lievens et al., 2018; Sun et al., 2021). However, the two lines of research have been developing in parallel, and researchers concerned about the response process often overlook the issue of RSs, and vice versa. ...
... Among the many approaches to handling RSs, the IRTree model (Böckenholt, 2012; De Boeck & Partchev, 2012) ... [consider] a four-point item with categories 0, 1, 2, and 3 corresponding to “strongly disagree,” “disagree,” “agree,” and “strongly agree,” respectively. We can use an IRTree structure to separate the item into three binary sub-items that measure whether an observed response is negative (i.e., category 0 or 1) or positive (i.e., category 2 or 3); if negative, whether it is extreme (i.e., category 0) or not (i.e., category 1); and if positive, whether it is extreme (i.e., category 3) or not (i.e., category 2). ...
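To make the decomposition in this excerpt concrete, here is a minimal sketch of the pseudoitem coding it describes; the function name and coding scheme are mine, and actual IRTree implementations parameterize the nodes in various ways.

```python
def pseudoitems(response):
    """Map a 4-point category in {0, 1, 2, 3} = {strongly disagree,
    disagree, agree, strongly agree} to three binary pseudoitems:
    (direction, negative-extremity, positive-extremity).
    None marks a pseudoitem that is structurally missing."""
    direction = 0 if response in (0, 1) else 1                    # negative vs. positive
    neg_extreme = (1 if response == 0 else 0) if response in (0, 1) else None
    pos_extreme = (1 if response == 3 else 0) if response in (2, 3) else None
    return direction, neg_extreme, pos_extreme

for cat in range(4):
    print(cat, pseudoitems(cat))
# 0 -> (0, 1, None)   strongly disagree
# 1 -> (0, 0, None)   disagree
# 2 -> (1, None, 0)   agree
# 3 -> (1, None, 1)   strongly agree
```

Because each response activates only a subset of the pseudoitems, the pseudoitem data contain structural missingness, which is why estimation relies on standard software that allows for missing data.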
Preprint
Full-text available
Many researchers have found that unfolding models may better represent how respondents answer Likert-type items, and response styles (RSs) often have a moderate to strong presence in responses to such items. However, the two research lines have been growing largely in parallel. The present study proposed an unfolding item response tree (UIRTree) model that can account for an unfolding response process and RSs simultaneously. An empirical illustration showed that the UIRTree model could fit a personality dataset well and produced more reasonable parameter estimates. A strong presence of the extreme response style (ERS) was also revealed by the UIRTree model. We further conducted a Monte Carlo simulation study to examine the performance of the UIRTree model compared to three other models for Likert-scale responses: Samejima’s graded response model, the generalized graded unfolding model, and the dominance item response tree (DIRTree) model. Results showed that when data followed an unfolding response process and contained the ERS, AIC was able to select the UIRTree model, while BIC was biased towards the DIRTree model in many conditions. In addition, model parameters in the UIRTree model could be accurately recovered under realistic conditions, and misspecifying the item response process or ignoring RSs was detrimental to the estimation of key parameters. In general, the UIRTree model is expected to aid the theoretical understanding of responses to Likert-type items and to contribute to better scale development in practice. Future studies on multi-trait UIRTree models and on UIRTree models accounting for different types of RSs are expected.
... The TVTM approach consists of using IRT tree models, and more specifically the three-process model (Böckenholt, 2012), to quantify systematic individual differences in response patterns. TVTM assumes that the choice of an answer option on a 5-point Likert scale can be decomposed into three subdecisions, represented by three pseudoitems in the statistical model. ...
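Under the same hedged reading, one common ordering of the three subdecisions for a 5-point scale (take the midpoint or not; if not, disagree vs. agree; moderate vs. extreme) can be coded as three pseudoitems as sketched below. The function and ordering are illustrative, not the TVTM authors' code.

```python
def three_process(response):
    """Map a 5-point category in {1, ..., 5} with midpoint 3 to the
    three pseudoitems (midpoint, direction, extremity); direction
    and extremity are undefined (None) once the midpoint is chosen."""
    midpoint = 1 if response == 3 else 0                                # middle category?
    direction = None if midpoint else (1 if response > 3 else 0)        # agree side?
    extremity = None if midpoint else (1 if response in (1, 5) else 0)  # endpoint?
    return midpoint, direction, extremity

for cat in (1, 2, 3, 4, 5):
    print(cat, three_process(cat))
# 1 -> (0, 0, 1)   2 -> (0, 0, 0)   3 -> (1, None, None)
# 4 -> (0, 1, 0)   5 -> (0, 1, 1)
```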
Article
Full-text available
Prior research on the value of personality traits for predicting negotiation outcomes is rather inconclusive. Building on prior research and in light of recent personality and negotiation theories, we discuss why the traditional approach to personality traits has had limited success and propose an alternative approach to predicting negotiation outcomes from personality assessments. More specifically, we argue that negotiations are tasks in which performance is conditioned by the ability to adjust one’s mental states and behaviors according to situational demands. We therefore hypothesize that it is especially individual differences in within-person variability in personality – that is, the variability trait – that can be expected to predict negotiation outcomes, rather than individual differences in average traits. We show in two empirical studies involving dyads that the variability trait is indeed a better predictor of economic gains and satisfaction than average traits. Implications for theory, education, and practice are discussed.
... Person-fit type indices are based on item response theory (IRT; Böckenholt, 2013; Lang, Lievens, De Fruyt, Zettler, & Tackett, 2019; Sijtsma & Molenaar, 2002) and weigh the relative probability of responses to each item rather than assume that responses to each item are equally probable. Person-fit indices, then, provide a measure of the extent to which a specific participant's pattern of item responses fits with the overall sample's pattern of responses. ...
Article
Full-text available
The response entropy (RE) index is proposed as a new method for flagging careless response patterns and is determined by calculating the balance of proportions of response types endorsed by participants on Likert-scaled surveys. In the first study, performance of the RE index was compared to other commonly used post hoc indices for detecting careless responding (CR), such as the Mahalanobis distance (MD) and the psychometric synonym (PS) index. Three different types of Bogus Sets (BS) were generated: (1) uniform random values produced by computer (n = 100); (2) normally distributed random values produced by computer (n = 100); and (3) purposefully careless responses produced by human participants (n = 100). The BS data were then implanted in a true, cleaned social science dataset (n = 500). Multinomial logistic regression determined that the RE index made contributions to the prediction of BS independent of the other indices. Latent variable analyses suggest that the variability-type RE index may be tapping constructs distinct from regression-type indices such as the PS index. In study 2, potential cultural bias in CR indices was examined with a true social science dataset (n = 302) comprised of racially diverse participants. Unlike other post hoc indices of CR, the RE index was unrelated to participant race. Further analyses demonstrated that racial differences on other indices of CR could be accounted for by culturally different styles of survey responding. For example, Asian participants' higher MD scores relative to White participants' were mediated by a culturally specific acquiescent survey response style. These findings point to the usefulness of the RE index for detecting CR while also avoiding the conflation of CR with culturally different responding.
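The abstract does not spell out the RE formula, but a "balance of proportions" index of this kind is naturally read as the Shannon entropy of a respondent's response-category proportions. The sketch below is that reading, not the authors' published code.

```python
import math
from collections import Counter

def response_entropy(responses):
    """Shannon entropy of the proportions with which a respondent
    uses each response category. Straight-lining yields 0; spreading
    responses evenly across categories yields the maximum, the log
    of the number of categories used."""
    counts = Counter(responses)
    n = len(responses)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

print(response_entropy([3] * 20))             # 0.0: one category only
print(response_entropy([1, 2, 3, 4, 5] * 4))  # ~1.61 = log(5): maximally balanced
```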
Article
Full-text available
Historically, the “?” response category (i.e., the question mark response category) has been criticized because of the ambiguity of its interpretation. Previous empirical studies of the appropriateness of the “?” response category have generally used methods that cannot disentangle the response style from target psychological traits and have also exclusively focused on Western samples. To further develop our understanding of the “?” response category, we examined the differing use of the “?” response category in the Job Descriptive Index (JDI) between U.S. and Korean samples by using the recently proposed item response tree (IRTree) models. Our research showed that the Korean group more strongly prefers the “?” response category, while the U.S. group more strongly prefers the directional response category (i.e., Yes). In addition, the Korean group tended to interpret the “?” response category as mild agreement, while the U.S. group tended to interpret it as mild disagreement. Our study adds to the scientific body of knowledge on the “?” response category in a cross‐cultural context. We hope that our findings presented herein provide valuable insights for researchers and practitioners who want to better understand the “?” response category and develop various psychological assessments in cross‐cultural settings.
Article
Modeling fuzziness and imprecision in human rating data is a crucial problem in many research areas, including applied statistics and the behavioral, social, and health sciences. Because of the interplay between cognitive, affective, and contextual factors, the process of answering survey questions is a complex task that can hardly be captured by standard (crisp) rating responses. Fuzzy rating scales have progressively been adopted to overcome some of the limitations of standard rating scales, including their inability to disentangle decision uncertainty from individual responses. The aim of this article is to provide a novel fuzzy scaling procedure that uses Item Response Theory trees (IRTrees) as a psychometric model for the stage-wise latent response process. In so doing, fuzziness of rating data is modeled using the rater's overall pattern of responses instead of being computed with a single-item-based approach. This offers a consistent system for interpreting fuzziness in terms of individual-based decision uncertainty. A simulation study and two empirical applications are used to assess the characteristics of the proposed model and provide converging results about its effectiveness in modeling fuzziness and imprecision in rating data.
Article
Full-text available
Response styles are a source of contamination in questionnaire ratings, and therefore they threaten the validity of conclusions drawn from marketing research data. In this article, the authors examine five forms of stylistic responding (acquiescence and disacquiescence response styles, extreme response style/response range, midpoint responding, and noncontingent responding) and discuss their biasing effects on scale scores and correlations between scales. Using data from large, representative samples of consumers from 11 countries of the European Union, the authors find systematic effects of response styles on scale scores as a function of two scale characteristics (the proportion of reverse-scored items and the extent of deviation of the scale mean from the midpoint of the response scale) and show that correlations between scales can be biased upward or downward depending on the correlation between the response style components. In combination with the apparent lack of concern with response styles evidenced in a secondary analysis of commonly used marketing scales, these findings suggest that marketing researchers should pay greater attention to the phenomenon of stylistic responding when constructing and using measurement instruments.
Article
Full-text available
People approach pleasure and avoid pain. To discover the true nature of approach–avoidance motivation, psychologists need to move beyond this hedonic principle to the principles that underlie the different ways that it operates. One such principle is regulatory focus, which distinguishes self-regulation with a promotion focus (accomplishments and aspirations) from self-regulation with a prevention focus (safety and responsibilities). This principle is used to reconsider the fundamental nature of approach–avoidance, expectancy–value relations, and emotional and evaluative sensitivities. Both types of regulatory focus are applied to phenomena that have been treated in terms of either promotion (e.g., well-being) or prevention (e.g., cognitive dissonance). Then, regulatory focus is distinguished from regulatory anticipation and regulatory reference, 2 other principles underlying the different ways that people approach pleasure and avoid pain.
Article
This monograph is a part of a more comprehensive treatment of estimation of latent traits, when the entire response pattern is used. The fundamental structure of the whole theory comes from the latent trait model, which was initiated by Lazarsfeld as the latent structure analysis [Lazarsfeld, 1959], and also by Lord and others as a theory of mental test scores [Lord, 1952]. Similarities and differences in their mathematical structures and tendencies were discussed by Lazarsfeld [Lazarsfeld, 1960] and the recent book by Lord and Novick with contributions by Birnbaum [Lord & Novick, 1968] provides the dichotomous case of the latent trait model in the context of mental measurement.
Chapter
The model considered in this chapter is suited for a special type of response. First, the response should be from ordered categories, i.e., a graded response. Second, the categories or levels should be recorded successively in a stepwise manner. An example illustrates this type of item, which is often found in practice: Wright and Masters (1982) consider the item $\sqrt{9.0/0.3 - 5} = ?$ The following levels of performance may be distinguished: no subproblem solved (Level 0), 9.0/0.3 = 30 solved (Level 1), 30 − 5 = 25 solved (Level 2), and $\sqrt{25} = 5$ solved (Level 3). The important feature is that each level in a solution to the problem can be reached only if the previous level is reached.
Chapter
The partial credit model (PCM) by Masters (1982, this volume) is a unidimensional item response model for analyzing responses scored in two or more ordered categories. The model has some very desirable properties: it is an exponential family, so minimal sufficient statistics for both the item and person parameters exist, and it allows conditional-maximum likelihood (CML) estimation. However, it will be shown that the relation between the response categories and the item parameters is rather complicated. As a consequence, the PCM may not always be the most appropriate model for analyzing data.
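For reference, the standard PCM category probability (not quoted in the chapter abstract, but the formula whose step parameters the chapter discusses) for an item with $m$ steps is

$$P(X = x \mid \theta) \;=\; \frac{\exp\!\left(\sum_{k=1}^{x} (\theta - \delta_k)\right)}{\sum_{h=0}^{m} \exp\!\left(\sum_{k=1}^{h} (\theta - \delta_k)\right)}, \qquad x = 0, 1, \ldots, m,$$

with the empty sum for $x = 0$ (or $h = 0$) defined as zero. The complication noted above is visible here: each category's probability involves all step parameters up to that category rather than a single threshold, while the person score $x$ remains the minimal sufficient statistic that makes CML estimation possible.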
Book
Examines the psychological processes involved in answering different types of survey questions. The book proposes a theory about how respondents answer questions in surveys, reviews the relevant psychological and survey literatures, and traces out the implications of the theories and findings for survey practice. Individual chapters cover the comprehension of questions, recall of autobiographical memories, event dating, questions about behavioral frequency, retrieval and judgment for attitude questions, the translation of judgments into responses, special processes relevant to the questions about sensitive topics, and models of data collection. The text is intended for: (1) social psychologists, political scientists, and others who study public opinion or who use data from public opinion surveys; (2) cognitive psychologists and other researchers who are interested in everyday memory and judgment processes; and (3) survey researchers, methodologists, and statisticians who are involved in designing and carrying out surveys.
Article
The compromise effect denotes the finding that brands gain share when they become the intermediate rather than extreme option in a choice set. Despite the robustness and importance of this phenomenon, choice modelers have neglected to incorporate the compromise effect in formal choice models and to test whether such models outperform the standard value maximization model. In this article, the authors suggest four context-dependent choice models that can conceptually capture the compromise effect. Although the models are motivated by theory from economics and behavioral decision research, they differ with respect to the particular mechanism that underlies the compromise effect (e.g., contextual concavity versus loss aversion). Using two empirical applications, the authors (1) contrast the alternative models and show that incorporating the compromise effect by modeling the local choice context leads to superior predictions and fit compared with the traditional value maximization model and a stronger (naive) model that adjusts for possible biases in utility measurement, (2) generalize the compromise effect by demonstrating that it systematically affects choice in larger sets of products and attributes than has been previously shown, (3) show the theoretical and empirical equivalence of loss aversion and local (contextual) concavity, and (4) demonstrate the superiority of models that use a single reference point over "tournament models" in which each option serves as a reference point. They discuss the theoretical and practical implications of this research as well as the ability of the proposed models to predict other behavioral context effects.