Article
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

Multiple-choice tests are commonly used in educational settings but with unknown effects on students' knowledge. The authors examined the consequences of taking a multiple-choice test on a later general knowledge test in which students were warned not to guess. A large positive testing effect was obtained: Prior testing of facts aided final cued-recall performance. However, prior testing also had negative consequences. Prior reading of a greater number of multiple-choice lures decreased the positive testing effect and increased production of multiple-choice lures as incorrect answers on the final test. Multiple-choice testing may inadvertently lead to the creation of false knowledge.


... They report the usual testing effect for repeated questions but no effects when the same conceptual information was tested through different questions. Roediger III and Marsh (2005) examined the benefits and pitfalls of using multiple-choice testing upon a final cued-recall knowledge test (Roediger III and Marsh, 2005). They found evidence for a typical testing effect. ...
... They found evidence for a typical testing effect. However, exposure to a larger number of multiple-choice lures increased incorrect answers and the possible creation of false knowledge (Roediger III and Marsh, 2005). ...
Article
Full-text available
Building domain knowledge is essential to a student's success in any course. Chemistry, similar to other STEM disciplines, has a strong cumulative element ( i.e. , topic areas continuously build upon prior coursework). We employed the testing effect, in the form of post-exam retrieval quizzes, as a way to improve students’ understanding of chemistry over an entire semester. Students ( n = 146) enrolled in Introduction to Chemistry were presented with retrieval quizzes released one week after each during-term exam (that covered that exam's content). We measured students’ level of quiz participation, during-term exam scores (a control variable), and cumulative final exam scores to determine the effectiveness of implementing a post-exam retrieval quiz system. Most critically, students completing more than 50% of the retrieval quizzes performed significantly better ( i.e. , more than a half letter grade) on the cumulative final exam than those who were below 50% participation as determined by one-way between-subjects ANOVA and planned follow-up analyses. We found no significant differences between the participating groups on during-term exam scores, suggesting that high achieving students were not more likely than struggling students to participate in the practice testing (and thus benefit from it).
... The testing effect has been observed in studies using various methods of immediate test, most usually with cued recall (e.g., Karpicke & Smith, 2012), but also with other methods such as multiple-choice (e.g., Roediger & Marsh, 2005). There are several possibilities as to why certain methods of immediate testing may be more beneficial for future retention. ...
... They may also provide an opportunity for additional learning of some items through the process of elimination of foils (Marsh et al., 2007), even in the absence of feedback on response choice. However, foil answers in multiple-choice tests may also lead to learning of incorrect information (Butler et al., 2006; Marsh et al., 2007; Roediger & Marsh, 2005). ...
... Importantly, Experiment 3 showed that testing memory immediately after training using either cued recall or multiple-choice alone was sufficient to result in a significant testing effect. This is consistent with studies that have found a testing effect arising from an immediate cued recall test (Karpicke & Smith, 2012) or an immediate test using multiple-choice questions (Roediger & Marsh, 2005). ...
Preprint
Full-text available
This study investigated how word meanings can be learned from natural story reading. Three experiments with adult participants compared naturalistic incidental learning with intentional learning of new meanings for familiar words, and examined the role of immediate tests in maintaining memory of new word meanings. In Experiment 1, participants learned new meanings for familiar words through incidental (story reading) and intentional (definition training task) conditions. Memory was tested with cued recall of meanings and multiple-choice meaning-to-word matching immediately and 24 h later. Results for both measures showed higher accuracy for intentional learning, which was also more time efficient than incidental learning. However, there was reasonably good learning from both methods, and items learned incidentally through stories appeared less susceptible to forgetting over 24 h. It was possible that retrieval practice at the immediate test may have aided learning and improved memory of new word meanings 24 h later, especially for the incidental story reading condition. Two preregistered experiments then examined the role of immediate testing in long-term retention of new meanings for familiar words. There was a strong testing effect for word meanings learned through intentional and incidental conditions (Experiment 2), which was non-significantly larger for items learned incidentally through stories. Both cued recall and multiple-choice tests were each individually sufficient to enhance retention compared to having no immediate test (Experiment 3), with a larger learning boost from multiple-choice. This research emphasises (i) the resilience of word meanings learned incidentally through stories and (ii) the key role that testing can play in boosting vocabulary learning from story reading.
... Increased use of new technologies, larger student groups, reduced resources and, with the COVID-19 pandemic, the rise of online learning have resulted in increased use of multiple-choice questions (MCQs) for assessment in higher education. Many experiments and analyses have been conducted to investigate positive and negative consequences of using MCQs for assessment (see for example [1]). The focus in this paper is more specific: to investigate the effect of the number of distractors (the number of incorrect answers) used in MCQs as well as the use of the "None of the above" (NOTA) or "All of the above" (AOTA) options. ...
... However, prior exposure to multiple-choice distractors has been shown to decrease the positive testing effect on later exams. This negative effect has been shown to increase with the number of distractors on previously seen MCQ tests [1]. This is of some concern, since having more distractors on a given test decreases the probability that a guessing student marks the correct answer by luck. ...
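The guessing argument in the excerpt above can be made concrete. The following is a minimal sketch, assuming each question has exactly one correct option and that a guessing student picks uniformly at random among all options; the function name is hypothetical and does not come from the cited paper.

```python
from fractions import Fraction


def chance_of_lucky_guess(num_distractors: int) -> Fraction:
    """Probability that a uniform blind guess hits the single correct option.

    Assumption (not from the source): one correct answer per question,
    so the total number of options is num_distractors + 1.
    """
    if num_distractors < 0:
        raise ValueError("num_distractors must be non-negative")
    return Fraction(1, num_distractors + 1)


# The chance of a lucky guess shrinks as distractors are added.
for k in (1, 3, 5):
    print(f"{k} distractors: {chance_of_lucky_guess(k)}")
```

Under these assumptions, moving from one distractor to five lowers the chance of a lucky guess from 1/2 to 1/6, which is the trade-off the excerpt sets against the added exposure to lures.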
... The results of the analyses performed here and results from other studies indicate that including as many plausible distractors as one can in drilling systems such as the tutor-web is a good thing. Since elaborated feedback is provided to the students after they answer questions, the negative effect of having many distractors noted in [1] is not of concern. It should, however, be kept in mind that the distractors need to be well thought out, as discussed in [18], which can be time-consuming. ...
Preprint
Full-text available
Multiple-choice questions (MCQs) are commonly used for assessment in higher education. With the increased use of on-line examination, the usage of MCQs is likely to grow further in the years to come. It is therefore of interest to examine some characteristics of this type of question, such as the effect of the number of distractors used and of the "None of the above" (NOTA) or "All of the above" (AOTA) options. The tutor-web is an open-source, on-line drilling system that is freely available to anyone having access to the Internet. The system was designed to be used for teaching mathematics and statistics but can in principle be used for other subjects as well. The system offers thousands of multiple-choice questions at high school and university level. In addition to being a tool used by students for learning, it has also been used as a testbed for research on web-assisted education. The tutor-web system was used both as a learning tool and as a testing tool in a university course on mathematical statistics in the spring of 2020. Around 300 students were enrolled in the course, providing tens of thousands of answers to MCQs designed to investigate the effect of the number of distractors and the use of NOTA and AOTA options in questions. The main findings of the study were that the probability of answering a question correctly was highest when an AOTA option was used as a distractor and when NOTA and AOTA were not used in questions. The probability of answering a question correctly decreased with the number of distractors.
... Yet, it is still only rarely implemented in digital (low-stakes) summative assessments of learning, such as those conducted in large-scale assessment programs. However, task-level feedback might help to overcome two common challenges in testing: (1) Multiple-choice items, which are frequently used in tests to facilitate scoring (Butler, 2018), have been shown to enhance the risk that students remember incorrect information and form misconceptions due to the exposure to distractor lures (Roediger & Marsh, 2005). Providing task-level feedback seems to reduce this effect (Butler & Roediger, 2008). ...
... In test situations where students cannot receive immediate or even delayed feedback, for example, by discussing tasks and correct responses with their teachers, students might form misconceptions based on erroneous responses and, specifically, due to the lures presented in multiple-choice items (Butler & Roediger, 2008). Research has shown that exposure to multiple-choice distractors can actually promote the reproduction of lures in a cued-recall test (Fazio, Agarwal, et al., 2010; Roediger & Marsh, 2005). This effect was shown to be more pronounced the more lures were presented (Butler & Roediger, 2008). ...
... Our additional exploratory score analyses revealed that recall improvement was related to a significant increase in error correction rates of the items that were initially solved incorrectly in the feedback groups compared to in the no-feedback control group. This finding is in line with our assumption that not only feedback that directly provides the correct response may reduce the number of misconceptions students form as a result of the multiple-choice lures (see, e.g., Butler & Roediger, 2008); even simple KR feedback that only informs students about the correctness or incorrectness of a given response may also achieve this (see, e.g., Roediger & Marsh, 2005) by immediately raising error awareness. This very simple type of learning related to KR feedback (i.e., error correction of already known items) therefore seems to have the potential to reduce unwanted cognitive side effects of multiple-choice testing. ...
Article
Full-text available
Immediate Knowledge of Results (KR) feedback may motivate low-stakes test takers by showing that their answers matter, while appealing feedback cues may help to prevent negative emotions in lower performers who receive a higher amount of negative feedback. In this experiment, we varied the presence of KR feedback and the feedback delivery mode in a 1×5 between-subjects design (i.e., no feedback vs. text, color, sound, or animation feedback) to investigate effects on learning outcomes, and affective-motivational measures. Our sample included 661 fifth and sixth graders who solved two computer-based low-stakes multiple-choice science tests. First, students worked on an 18-item treatment test (with experimental feedback manipulation). Students repeatedly rated their effort, enjoyment, pride, and boredom during the test, as well as their expectancy of success and attainment value after the test. Subsequently, they worked on a posttest (without feedback) that assessed recall and near-transfer learning. All KR feedback conditions significantly increased recall, but there was no evidence for near-transfer learning. Feedback had a significant, negative effect on attainment value, whereas significant interactions between the feedback conditions and students’ treatment performance revealed that feedback effects on several affective-motivational dimensions (i.e., expectancy of success, enjoyment, pride, and boredom) were performance-dependent. Feedback benefited higher performers’ motivation and affect but showed negative effects on some affective-motivational measures for lower performers. The pattern of results indicated that color/sound/animation feedback may have reduced the effect of performance on emotional feedback perception to some extent. However, none of the feedback conditions improved affective-motivational outcomes independent of students’ performance.
... In a similar vein, Brown (1988) and Jacoby and Hollingshead (1990) showed that exposing students to misspelled words increased misspelling of those words on a later oral test. Roediger and Marsh (2005) asked whether giving a multiple-choice test (without feedback) would lead to a kind of misinformation effect (Loftus et al., 1978). That is, if students take a multiple-choice test on a subset of facts, and then take a short answer test on all facts, will prior testing increase intrusions of multiple-choice lures on the final test? ...
... Correspondingly, under conditions of relatively poor multiple-choice performance, the positive testing effect will be diminished and the negative effect will be increased. These conclusions hold over more recent experiments, and also agree with the third experiment conducted, the one that did appear in the Roediger and Marsh (2005) paper. In this study, we replicated Experiment 2 but changed the instruction on the final short answer test from forced recall with confidence ratings to recall with a strong warning against guessing. ...
... Research on negative suggestibility is just beginning, and only a few variables have been systematically investigated. Three classes of variables are likely to be interesting: ones that affect how likely subjects are to select multiple-choice lures (e.g., reading related material, a penalty for wrong answers on the MC test), ones that affect the likelihood that selected multiple-choice lures are integrated with related world knowledge (e.g., corrective feedback), and ones that affect monitoring at test (e.g., the warning against guessing on the final test used in Roediger & Marsh, 2005). The negative testing effect could change in size for any of these reasons. ...
... Multiple-choice exams are claimed not to provide an accurate measurement of students' knowledge (Roediger & Marsh, 2005). Even when students score highly on a multiple-choice test, their scores do not reflect a fair mastery of the knowledge and content they have learnt. ...
... Notwithstanding, MCT leads to false knowledge. Roediger and Marsh (2005) conducted research on the positive and negative effects of MCT and concluded that it may unintentionally lead to the creation of false knowledge. They explained that when students are exposed to wrong answers, there is a high possibility that those statements will be judged true later. ...
Article
Full-text available
Now more than ever, there exists a plethora of empirical evidence to uphold that examinations used in educational institutions have a backwash effect, a well-recognized phenomenon among applied linguists, educators and teachers, which is the effect of test on teaching and learning (Alderson & Wall, 1993; Bailey, 1999; Messick, 1996; Widen et al., 1997; Hughes, 2003; Yi-Ching, 2009). This article essentially targets this phenomenon in Moroccan higher education. It seeks to provide a concise theoretical framework to render the reader au fait with such an unfamiliar term. It aims at examining the extent to which higher education assessments affect EFL students’ academic achievements through sketching examples from the summative assessment practices used by faculty instructors at Ibn Zohr University, Agadir, Morocco. It also aims at suggesting some pedagogical implications to harness teaching and learning in Moroccan higher education. Key words: backwash (washback); summative assessment; higher education
... However, students in this experiment were not given a criterial test that assessed memory, so whether they would have produced incorrect information presented in the false statements on a later memory test is unclear. Findings from studies employing multiple-choice tests, however, suggest that exposure to incorrect information on practice tests may in fact increase the likelihood of producing it on later tests (Roediger & Marsh, 2005). In addition, one classroom study demonstrated that, although multiple-choice or essay preliminary tests led to better exam performance than no tests, true-false preliminary tests did not yield any benefits (Jersild, 1929). ...
... On the true-false quizzes, students were presented with incorrect statements half of the time, whereas students in the control group only saw correct statements. In the absence of correct-answer feedback, quizzed students may have accepted erroneous information and carried it through to the criterial test (Roediger & Marsh, 2005). In fact, when we examined incorrect responses on the criterial test of participants who were quizzed during the first session-only considering the items presented as false statements-we found that misinformation was carried to the criterial test about 14% of the time. ...
Article
Testing with various formats enhances long-term retention of studied information; however, little is known whether true-false tests produce this benefit despite their frequent use in the classroom. We conducted four experiments to explore the retention benefits of true-false tests. College students read passages and reviewed them by answering true-false questions or by restudying correct information from the passages. They then took a criterial test 2 days later that consisted of short-answer questions (Experiments 1 and 2) or short-answer and true-false questions (Experiments 3 and 4). True-false tests enhanced retention compared to rereading correct statements and compared to typing those statements while rereading (the latter in a mini meta-analysis). Evaluating both true and false statements yielded a testing effect on short-answer criterial tests, whereas evaluating only true statements produced a testing effect on true-false criterial tests. Finally, a simple modification that asked students to correct statements they marked as false on true-false tests improved retention of those items when feedback was provided. True-false tests can be an effective and practical learning tool to improve students' retention of text material. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
... The low construct validity suggests that the format does not measure the subject matter of Economics as a construct very well, because its nature allows for negative suggestibility. This is similar to the results of various researchers (Roediger & Marsh, 2005; Fazio, Agarwal, Marsh, & Roediger, 2010; Marsh, Roediger, Bjork, & Bjork, 2007; the SAT; Marsh, Agarwal, & Roediger, 2009; Butler & Roediger, 2008) on test items, which show that negative suggestibility is real, at least on true/false and multiple-choice tests. Roediger and Marsh (2005) discovered that when test-takers answered erroneously, the negative suggestibility effect occurred, thereby affecting their performance. ...
Article
Full-text available
The study investigated the influence of test item formats on the construct validity of Economics achievement tests in Osun State secondary schools. It further determined whether a significant difference existed in the performance of test-takers across the various test formats used in the Economics Achievement Test. The study adopted a descriptive survey research design. Multi-stage sampling was used to select a sample of 300 Senior Secondary School class two (2) Economics students and 36 Economics teachers. The instrument used for data collection was titled ‘Economics Achievement Test’ (EAT) and was validated with an alpha reliability coefficient of 0.68. The data collected were analyzed using principal component analysis, a scree plot and one-way analysis of variance. The results showed that test item format has a significant influence on the construct validity of the Economics Achievement Test in Osun State secondary schools. Finally, there was a significant difference in the performance of test-takers across the various test formats used in the Economics Achievement Test (F(4, 1495) = 290.25, p < 0.05). The study concluded that test item formats as a facet have an effect on the construct validity of the Economics Achievement Test in Osun State secondary schools, and recommended that test constructors and classroom teachers understand the characteristics of each format and select the format that most appropriately serves the purpose of a test in each context.
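The two degrees of freedom attached to the F statistic above (4 and 1495) can be read back into a design. This is a minimal sketch, assuming they are the between-groups and within-groups degrees of freedom of a standard one-way ANOVA. The function names are hypothetical, and the abstract itself does not state how many formats were compared.

```python
from typing import Tuple


def one_way_anova_df(num_groups: int, total_observations: int) -> Tuple[int, int]:
    """Return (between-groups df, within-groups df) for a one-way ANOVA."""
    return num_groups - 1, total_observations - num_groups


def groups_and_n_from_df(df_between: int, df_within: int) -> Tuple[int, int]:
    """Invert the relation: recover (number of groups, total observations)."""
    num_groups = df_between + 1
    return num_groups, df_within + num_groups


# Degrees of freedom as reported in the abstract.
print(groups_and_n_from_df(4, 1495))  # -> (5, 1500)
```

On this reading the analysis compared five groups over 1,500 scores, which would be consistent with each of the 300 sampled students contributing one score per format. That interpretation is an inference from the reported numbers, not something the abstract confirms.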
... In the new structure, the number of items was three correct items out of 12 options for each question. With the new structure, the number of correct items in each question was 25% following a standard four-alternative MCQ test where three alternatives are wrong and one is correct (Roediger & Marsh, 2005). ...
Article
Full-text available
Aims To develop and psychometrically test the Reasoning Skills (ReSki) test assessing undergraduate nursing applicants' reasoning skills for student selection purposes. Design A methodological cross‐sectional design was applied for the psychometric testing. Methods The ReSki test was developed as part of a wider electronic entrance examination. The ReSki test included a case followed by three question sections assessing nursing applicants' reasoning skills according to the reasoning process. Item response theory was used for psychometric testing to assess item discrimination, difficulty and pseudoguessing parameters. The ReSki test was taken by 1056 nursing applicants in six Finnish Universities of Applied Sciences (28 May 2019). Results In the development process, the expert evaluations indicated acceptable content validity. In the psychometric testing, the test reliability was supported by item variance, the theoretical structure was supported by the correlation coefficients and the applicant mean performance supported an acceptable overall test difficulty. The item response theory indicated variance between the items’ difficulty and discrimination ranges. However, most of the wrong items failed at being functional distractors. Conclusion The ReSki test is a new and valid objective assessment of undergraduate nursing applicants' reasoning skills. The item response theory provided item‐level information that can be used for further development of the test, especially related to the revisions needed for the distractor items to achieve the desired level of difficulty. Impact What problem did the study address? The assessment of nursing applicants' reasoning skills is suggested, but there is a lack of admission tools. What were the main findings? The results provided support for the reliability and validity of the ReSki test. Item response theory indicated the need for further item‐level improvement. Where and on whom will the research have an impact? 
The results may benefit higher education institutions and researchers when developing a test and/or student selection processes.
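The three item parameters named in the abstract above (discrimination, difficulty and pseudoguessing) are the parameters of the standard three-parameter logistic (3PL) item response model. Below is a minimal sketch of that textbook model. The abstract does not state the exact parameterization used in the study (for example, whether a scaling constant was applied), so the function and the example values are illustrative assumptions only.

```python
import math


def three_pl_probability(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the textbook 3PL model.

    theta: test-taker ability
    a: item discrimination
    b: item difficulty
    c: pseudoguessing parameter (lower asymptote), 0 <= c < 1
    """
    if not 0.0 <= c < 1.0:
        raise ValueError("c must lie in [0, 1)")
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: when ability equals difficulty, the probability sits
# halfway between the guessing floor c and 1.
print(three_pl_probability(theta=0.0, a=1.2, b=0.0, c=0.25))  # -> 0.625
```

In this model a distractor that nobody finds plausible raises the effective guessing floor c for the item, which is one way to see why non-functional distractors matter for the difficulty the abstract refers to.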
... Studies from the field of learning and memory have examined the effects of test taking on memory and learning, and have pointed to the positive role of answering questions to subsequent performance, a phenomenon named the testing effect (Roediger & Karpicke, 2006). In one of these studies, Roediger and Marsh (2005) asked undergraduate students to read 18 expository passages (about animals, historical events, etc.) and to answer a multiple-choice test with either two, four or six answers. Results demonstrated a large positive testing effect, but negative consequences were also found. ...
Article
Full-text available
Multiple text integration is a challenging task, but it is essential for studying and developing digital literacy, and is therefore critical to promote among students. However, the conditions that may support multiple text integration for different reader profiles have not been thoroughly studied. Therefore, based on single text research, the current study examined whether adding reading comprehension questions in-between texts would improve subsequent text integration for good versus poor readers. Another aim was to assess multiple text integration of complementary texts, where information does not conflict, but supplements, and to examine separately two components of integration: selecting the relevant information and forming inter-text links. Two experiments were carried out: in the first, 124 university students read multiple texts with or without embedded multiple-choice questions. Results showed that adding comprehension questions promoted inter-text links, but hindered references to single text ideas. In the second experiment, 59 university students read multiple texts paired with either open-ended questions or multiple-choice questions embedded between texts. Open-ended questions resulted in higher integration scores compared to multiple-choice questions only among participants with low comprehension scores. Overall, reading comprehension skills uniquely contributed to text integration. Theoretical and practical implications are discussed.
... The study by Enders et al. exemplifies an important line of research aimed at maximizing the benefits of retrieval practice and feedback in formative testing settings. It is notable that students increasingly acquire fact-based knowledge through quizzing in addition to, or even as an alternative to, attending lectures and reading the course materials (Marsh et al., 2007; Roediger & Marsh, 2005). Future research needs to generalize the positive effects of elaborate feedback with different materials and outcome variables (learning transfer, cf. ...
Article
Full-text available
Students and instructors are looking for effective study and instructional strategies that enhance student achievement across a range of content and conditions. The current Special Issue features seven articles and one report, which used varied methodologies to investigate the benefits of practising retrieval and providing feedback for learning. This editorial serves as an introduction and conceptual framework for these papers. Consistent with trends in the broader literature, the research in this Special Issue goes beyond asking whether retrieval practice and feedback enhance learning, but rather, when, for whom, and under what conditions. The first set of articles examined the benefits of retrieval practice compared to restudy (i.e., the testing effect) and various moderators of the testing effect, including participants’ cognitive and personality characteristics ( Bertilsson et al., 2021 ) as well as the timing of the practice test and sleep ( Kroneisen & Kuepper-Tetzel, 2021 ). The second set of articles examined the efficacy of different types of feedback, including complex versus simple feedback ( Enders et al., 2021 ; Pieper et al., 2021 ) and positively or negatively valenced feedback ( Jones et al., 2021 ). Finally, the third set of articles to this Special Issue examined practical considerations of implementing both retrieval practice and feedback with educationally relevant materials and contexts. Some of the practical issues examined included when students should search the web to look for answers to practice problems ( Giebl et al., 2021 ), whether review quizzes should be required and contribute to students’ final grades ( den Boer et al., 2021 ), and how digital learning environments should be designed to teach students to use effective study strategies such as retrieval practice ( Endres et al., 2021 ). 
In short, retrieval and feedback practices are effective and robust tools to enhance learning and teaching, and the papers in the current Special Issue provide insight into ways for students and teachers to implement these strategies.
... According to Roediger (2005), the advantage of multiple-choice tests is that, although they are difficult to construct, they are easy to score, which makes them a method of evaluation for large classes. An additional benefit is that they improve students' performance on later tests; both multiple-choice and true-false tests routinely expose students to wrong answers (misinformation). ...
Article
Full-text available
This study is classroom action research. Its purpose was to improve student activity and learning outcomes through the application of the learning strategy Everyone is a Teacher Here. The research subjects were students of class X.I at SMA Negeri 3 Sengkang. The research was carried out over two cycles, with data collected in the second semester of the 2016/2017 academic year. The study measured learning activity and learning outcomes. Learning activity was measured with an observation sheet, and learning outcomes were measured with a written test consisting of 20 multiple-choice items. Data on student learning activities were collected during the learning process by 3 observers using the observation sheet; learning outcomes data were obtained through an evaluation held at the third meeting, after teaching in the first and second meetings. This pattern was repeated in each cycle. The data were analyzed qualitatively for student learning activities and quantitatively for student learning outcomes. Student learning activity averaged 55.6% in the first cycle and 67.2% in the second cycle, an increase of 11.6 percentage points. Student learning outcomes also increased, with average scores of 69.6 in the observation data, 64.4 in the first cycle, and 74.8 in the second cycle. It can be concluded that applying the learning strategy Everyone is a Teacher Here can increase student activity and learning outcomes. Keywords: everyone is a teacher here learning strategy, learning activities, learning outcomes.
... Based on laboratory studies (Rowland, 2014) and in the context of learning and teaching psychology (Schwieren et al., 2017), meta-analyses reported positive additive effects of feedback beyond the learning benefits engendered by testing on retention, and on achievement in higher education in general (d = .47; Schneider & Preckel, 2017). Furthermore, feedback decreases the risk of students acquiring incorrect knowledge by correcting memory errors (Ecker et al., 2020; Marsh et al., 2007; Roediger & Marsh, 2005). Especially in the multiple-choice and true-false question format, students are repeatedly exposed to lure test items, which can result in students remembering false answers as they become familiar with the statements (negative suggestion effect, Toppino & Brochin, 1989). ...
Article
Online quizzes are an economic and objective method for formative assessment in universities. However, closed questions have been criticized for promoting shallow learning and often resulting in poor learning outcomes. These disadvantages can be overcome by embedding closed questions in effective instructional designs involving feedback. In the present field study, a final sample of N = 496 students completed the same online quiz, consisting of 60 true–false statements on the biological bases of psychology, in two sessions. In order to enhance the benefit of formative testing on students’ test achievement in Session 2, students received elaborate feedback (i.e., explanations for the in-/correctness) for half of their answers in Session 1, and corrective feedback (i.e., just indicating the in-/correctness) for the other half. The results showed that students scored higher in Session 2 if elaborate feedback had been provided in Session 1, compared with when corrective feedback was provided. More specifically, students profited more from elaborate feedback on incorrect answers in Session 1 than from feedback on correct answers. As a practical recommendation, self-administered formative tests with closed-question format should at least provide explanations why students’ answers are incorrect.
... Specifically, identified the Short answer test item format, Completion test item format, Multiple choice test item format and Essay test item format as enjoying acceptably good levels of reliability while the True/False test item format has the lowest level of reliability in the EAT. Similarly, Roediger and Marsh (2005) found in their study that True-False and Multiple-choice test items show negative suggestibility, thereby affecting students' performance. In contrast to the above findings, Hopkins and Stanley (1981) strongly believe that True-False test items can compare favourably with Multiple-Choice test items and can be used in achievement tests when properly constructed. ...
Article
Full-text available
were selected using multistage sampling techniques. In each of the three senatorial districts, two Local Government Areas (LGAs) were randomly selected and from each LGA, two schools were randomly selected to make a total of 12 schools selected for the study. The study made use of three research instruments: Economics Achievement Test (EAT), an Observation Record Sheet (OBS) and a Test Item Format Checklist (TIFC). The EAT was content validated using the test blueprint and a Cronbach's Alpha reliability coefficient of 0.68 was obtained. Data collected were analysed using percentage, Relative Significant Index (RSI) and simple linear regression. The results showed that the multiple-choice test item format is the most frequently used test item format for classroom assessment (RSI= 1.0, 36). Also, the Economics tests constructed and used by the Economics teachers have a moderate predictive validity of scores (R=0.408, F (1, 298) = 59.45, p<0.05) obtained from such Economics tests. The study concluded that the Multiple-choice is the most frequently used tests format while the achievement tests used by Economics teachers in Osun state secondary schools are of moderate quality. It was recommended that classroom teachers should make an appropriate use of the other various test formats during their classroom assessment.
... To assess a student's quantitative orientation, we gathered data on their preference for objective vis-à-vis subjective test questions as well as their comfort with mathematical problem solving. Specifically, we gather data on students' preference for multiple choice exam questions (Roediger & Marsh, 2005), true-false exam questions (Grosse & Wright, 1985), and matching test questions (Cross & Paris, 1987). We also consider students' self-assessed math skills (Lent, Lopez, & Bieschke, 1991) and confidence with mathematical problem solving (Lent et al., 1991). ...
Article
Full-text available
In this study, we identify factors associated with the choice to seek admission into a limited-enrollment undergraduate finance program. Using a sample of 796 students at a large university, we assess the impact of potential explanatory factors on the likelihood students will seek admission to a finance major. The data are gathered prior to the admissions decision. Thus, we avoid the selection/treatment endogeneity of most previous studies. In univariate and pairwise difference tests, we find positive correlation between a student's desire to study finance and their level of motivation, work ethic, quantitative strength, and pro-finance social support. In a multivariate specification which adds demographic control variables, the data show motivation, work ethic, age, gender, and marital status to be statistically significant determinants. Our findings suggest that the main attributes of students who seek to become finance majors follow a workhorse paradigm: motivated and hard working.
... The second explanation pertains to the nature of multiple-choice questions by which respondents are required to process response options and therefore are presented with erroneous information. Roediger and Marsh (2005) showed that multiple-choice testing may lead participants to answer later criterial tests with false information, but feedback following multiple-choice questions might reduce these negative effects and increase beneficial effects (Butler & Roediger, 2008). ...
Article
Full-text available
Proponents of the testing effect claim that answering questions about the learning content benefits retention more than does additional restudying—even without corrective feedback. In educational contexts, evidence for this claim is scarce and points toward differential effects for different question formats: Benefits emerged for short-answer questions but not for multiple-choice questions. The present study implemented an experimentally controlled, minimal intervention design in five sessions of an existing lecture. In each session, participants reviewed lecture content by answering short-answer questions, multiple-choice questions, or reading summarising statements. An unannounced test measured the retention of learning content. Bayesian analyses revealed a positive testing effect for short-answer questions that was strongest for difficult practice questions. Analyses also provided evidence for the absence of a testing effect for multiple-choice questions. These results suggest that short-answer testing is more beneficial than multiple-choice testing in a higher education context when feedback is not provided.
... The present findings, that fact-checking and tweet-deletion seem to differ in their efficiency as countermeasures against misinformation, should be interpreted against psychological models of belief formation and human memory-especially those that consider psychological mechanisms for the formation and persistence of false memories or false knowledge [69,70]. For instance, the relative inefficacy of fact-checking at later stages of spread might indicate that to be effective, fact-checking needs to take place before much memory consolidation of misinformation has occurred. ...
Article
Full-text available
Conspiracy theories in social networks are considered to have adverse effects on individuals' compliance with public health measures in the context of a pandemic situation. A deeper understanding of how conspiracy theories propagate through social networks is critical for the development of countermeasures. The present work focuses on a novel approach to characterize the propagation of conspiracy theories through social networks by applying epidemiological models to Twitter data. A Twitter dataset was searched for tweets containing hashtags indicating belief in the "5GCoronavirus" conspiracy theory, which states that the COVID-19 pandemic is a result of, or enhanced by, the enrollment of the 5G mobile network. Despite the absence of any scientific evidence, the "5GCoronavirus" conspiracy theory propagated rapidly through Twitter, beginning at the end of January, followed by a peak at the beginning of April, and ceasing/disappearing approximately at the end of June 2020. An epidemic SIR (Susceptible-Infected-Removed) model was fitted to this time series with acceptable model fit, indicating parallels between the propagation of conspiracy theories in social networks and infectious diseases. Extended SIR models were used to simulate the effects that two specific countermeasures, fact-checking and tweet-deletion, could have had on the propagation of the conspiracy theory. Our simulations indicate that fact-checking is an effective mechanism in an early stage of conspiracy theory diffusion, while tweet-deletion shows only moderate efficacy but is less time-sensitive. More generally, an early response is critical to gain control over the spread of conspiracy theories through social networks. We conclude that an early response combined with strong fact-checking and a moderate level of deletion of problematic posts is a promising strategy to fight conspiracy theories in social networks. Results are discussed with respect to their theoretical validity and generalizability.
... This effect is called negative suggestion (Marsh, Agarwal, & Roediger, 2009; Roediger & Butler, 2011) and can result in the learning of incorrect content. However, this generally occurs only when no feedback is given on whether the question was answered correctly (Roediger & Marsh, 2005). ...
Article
Full-text available
Retrieval practice, or the testing effect, is a study technique that involves trying to remember information to which we were previously exposed. Although this practice increases the long-term-retention of information compared to traditional study techniques, among several other advantages with ample scientific evidence, this strategy is not usually the most often used among students. Educators should help students to use this technique in their daily lives. In order to optimize its applicability, this article discusses which factors interfere in this practice, including: the importance of feedback, the way in which retrieval practice is carried out and the response format, the number of repetitions of retrieval attempts to recall information and the interval between these repetitions. The appropriation of knowledge about these factors positively influences the implementation of this technique in the classroom, thus promoting evidence-based education.
... Both the reading material and questions were taken from various standardized practice examinations and have been used in prior research [12,13]. Perception of learning was measured in several ways, all using a five-point Likert scale (Appendix 1). These measures were adopted from a variety of measures used within the literature. ...
Article
Objective. To determine whether perception of student learning equates to learning gains. Methods. Two-hundred seventy-seven college-aged students and student pharmacists participated in the study. Participants were assessed before and after completing a reading intervention and reported their perceptions of learning by responding to various Likert-scale questions. Relationships between perception and performance were assessed by correlation analysis, trend analysis, and using measures of metacognitive accuracy. Results. There was a lack of correlation between measures of the perception of learning and actual gains in knowledge. There were weak correlations between the perception of learning and post-reading scores. Comparing student-pharmacists to college-aged individuals, both had similar metacognitive accuracy and there were little differences after the intervention. Conclusion. Perceptions of learning may not reflect knowledge gains, and perception data should be used cautiously as a surrogate for evidence of actual learning.
... Many studies involving retrieval practice have been conducted in laboratory settings, and while some progress has been made in applying the research in classroom contexts using educationally relevant materials, there are concerns that the testing effect decreases or disappears as the complexity of authentic learning materials increases, what is referred to as 'element interactivity' [68]. Research in classroom contexts has tended to concentrate on test items requiring the retrieval of facts [32,69,70] and has tended to focus on recall with identical or similar test items in both practice and criterion tests. In a number of studies the final test material is identical to the practice test [32,71], or consists of modified or rephrased versions of the same test [8]. ...
Chapter
The application of retrieval practice to electrical science education has been shown to be effective for student learning. While research is beginning to emerge in classroom contexts, the learning approach of students taking electrical science has not been considered as a factor when participating in retrieval practice. This research paper addresses this gap and presents a study of n = 207 students in a within-group design and the impact of retrieval practice within a practice testing learning framework on their subsequent performance in a high-stakes unseen criterion test. The Revised Two-Factor Study Process Questionnaire (R-SPQ-2F) was administered, with n = 88 responses, to determine learning approach before retrieval practice participation, with an average score (standard deviation) on Deep Approach = 29.32 (9.06) and on Surface Approach = 22.53 (7.72). Students reported using a mix of deep and surface approaches, with retrieval practice enhancing performance. The findings from this study support the application of retrieval practice to enhance learning in electrical science and provide guidelines for future educational research on retrieval practice in electrical science and other domains.
... In behavioral studies, the testing effect has been seen at both direct (Carpenter, Pashler, Wixted, & Vul, 2008; Wiklund-Hörnqvist et al., 2014) and delayed tests (Karpicke & Roediger, 2008; Roediger & Butler, 2011). Providing feedback during retrieval practice adds a formative component: through feedback the students can monitor their learning, thereby reducing repeated failures and preventing erroneous learning from occurring (Roediger & Marsh, 2005). Also, the teacher is continuously being informed about the students' progress (Roediger & Karpicke, 2006). ...
... 8. False knowledge: When test-takers do not know the correct answer or have no clue, they might start to look for a closer answer or familiar terms among distractors. If the feedback is delayed or not provided, this case might lead to false knowledge and students will consequently learn wrongly (Roediger & Marsh, 2005). ...
Thesis
Computational thinking, a form of thinking and problem solving, is defined as a mental process for abstracting problems and formulating solutions. Computational thinking is considered to be an essential skill for everyone and has become the centre of attention in education settings. There is a limited number of tools to measure computational thinking skills by multiple-choice questions, and limited research on the relationship between computational thinking and other domains. The purpose of this research is to investigate the relationship between computational thinking performance, perception of computational thinking skills and school achievement of secondary school students. Computational thinking performance of secondary school students in Kazakhstan is measured by using a bespoke multiple-choice test, which focuses on the following elements of computational thinking: logical thinking, abstraction and generalisation. The perceptions of computational thinking skills are self-reported using a pre-existing questionnaire, which covers the following factors: creativity, algorithmic thinking, cooperation, critical thinking and problem solving. The General Knowledge Test results that contain scores for 14 different subjects are used as indicators of students’ school achievement, with further sub-scores for the science subjects, language subjects and humanities. The sample group of 775 grade eight students are drawn from 28 secondary schools across Kazakhstan. The validity and reliability of the multiple-choice questions are established by using Item Response Theory models. The item difficulty, discrimination and guessing coefficients are calculated; and the item characteristic curves for each question and test information functions for each quiz are obtained. As a result, the multiple-choice questions are concluded as a valid and reliable tool to measure the computational thinking performance of students. 
Multiple regression is used to examine the relationship between computational thinking performance, perception of computational thinking and school achievement sub-scores. The results of the data analysis show that science subjects, language subjects and perception of computational thinking skills are significant predictors for computational thinking performance, showing a moderate relationship between computational thinking performance and school achievement. However, no significant relationship is found between humanities subject scores and computational thinking performance. This study also adds to the literature for the studies that investigate the relationship between computational thinking skills and other variables. This research contributes to the development of validated tools to measure computational thinking performance by using multiple-choice questions. This study investigates the relationship between computational thinking performance and general school achievement of secondary school students, and its findings shed light on the measurement of children’s cognitive development. The findings can help in designing better curricula by adjusting subjects that enhance children’s higher-order thinking abilities. The findings obtained in this research also adds to the literature for the studies that investigate the relationship between computational thinking skills and other variables.
... The material consisted of three text passages, which were taken from Roediger and Marsh (2005) and translated into German. Each passage covered a topic from United States history ("Dorothea Dix": 279 words; "Fallingwater": 276 words, "Georgia O'Keefe": 297 words). ...
Article
Full-text available
List-method directed forgetting (LMDF) is the demonstration that people can intentionally forget previously studied information when they are asked to forget what they have previously learned and remember new information instead. In addition, recent research demonstrated that people can selectively forget when cued to forget only a subset of the previously studied information. Both forms of forgetting are typically observed in recall tests, in which the to-be-forgotten and to-be-remembered information is tested independent of original cuing. Thereby, both LMDF and selective directed forgetting (SDF) have been studied mostly with unrelated item materials (e.g., word lists). The present study examined whether LMDF and SDF generalize to prose material. Participants learned three prose passages, which they were cued to remember or forget after the study of each passage. At the time of testing, participants were asked to recall the three prose passages regardless of original cuing. The results showed no significant differences in recall of the three lists as a function of cuing condition. The findings suggest that LMDF and SDF do not occur with prose material. Future research is needed to replicate and extend these findings with (other) complex and meaningful materials before drawing firm conclusions. If the null effect proves to be robust, this would have implications regarding the ecological validity and generalizability of current LMDF and SDF findings.
... That is, if one continually rehearses incorrect information, then that erroneous information will be more readily available during later tests. That perseveration of errors can occasionally result in an impairment in final recall resulting from practiced retrieval relative to additional study; this is known as the "negative testing effect" (a.k.a. the reverse testing effect; Chan et al., 2009; Roediger & Marsh, 2005; Peterson & Mulligan, 2013) and is especially prevalent in multiple-choice tests, which provide attractive lures. A series of experiments providing students (college and high school) with practice SAT tests reported that if initial test performance was sufficiently low, final performance was actually hindered by practice tests. ...
Article
Full-text available
Taking a test of previously studied material has been shown to improve long-term subsequent test performance in a large variety of well controlled experiments with both human and nonhuman subjects. This phenomenon is called the testing effect. The promise that this benefit has for the field of education has biased research efforts to focus on applied instances of the testing effect relative to efforts to provide detailed accounts of the effect. Moreover, the phenomenon and its theoretical implications have gone largely unacknowledged in the basic associative learning literature, which historically and currently focuses primarily on the role of information processing at the time of acquisition while ignoring the role of processing at the time of testing. Learning is still widely considered to be something that happens during initial training, prior to testing, and tests are viewed as merely assessments of learning. However, the additional processing that occurs during testing has been shown to be relevant for future performance. The present review offers an introduction to the historical development, application, and modern issues regarding the role of testing as a learning opportunity (i.e., the testing effect). We conclude that the testing effect is seen to be sufficiently robust across tasks and parameters to serve as a compelling challenge for theories of learning to address. Our hope is that this review will inspire new research, particularly with nonhuman subjects, aimed at identifying the basic underlying mechanisms which are engaged during retrieval processes and will fuel new thinking about the learning-performance distinction. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
... Consistent with the "retrieval hypothesis," short-answer (SA) questions require recall of information, and not merely recognition of it (McDaniel et al., 2007; Roediger & Butler, 2011). There is evidence for a long-term cognitive disadvantage of MC relative to SA questions (e.g., Bangert-Drowns et al., 1991; Karpicke & Roediger, 2008; Leeming, 2002; McDaniel et al., 2007; Roediger & Butler, 2011; Roediger & Marsh, 2005). For example, conducted a study of the effects of testing material from a video-recorded lecture immediately following the lecture and four weeks later. ...
... With regard to the types of quiz questions used, both the question format and, partly related to that, the complexity and the level of processing induced by the question matter. There is evidence of testing effects for both open-format questions, such as word pairs (e.g., Carpenter, 2009) and short answer questions (e.g., Butler et al., 2013; Kang et al., 2007), and closed-format questions such as multiple or single choice questions (e.g., Greving et al., 2020; Marsh et al., 2007; Roediger & Marsh, 2005) and true-false tests (e.g., Uner et al., 2021). It has been argued that open-format questions are more likely to trigger effortful retrieval processes, as they offer less retrieval support than closed format questions, which would make them the better choice. ...
... Third, lab studies have shown beneficial effects of feedback when practicing multiple-choice questions, but this effect can also avert the negative effects that could arise because of the exposure to incorrect information in the form of lures or distractors (Butler & Roediger, 2008; Roediger & Marsh, 2005). Therefore, research is needed to investigate whether feedback adds to the unmediated effects of multiple-choice practice tests in educational settings. ...
Article
Full-text available
Background: Retrieval practice promotes retention of learned information more than restudying the information. However, benefits of multiple-choice testing over restudying in real-world educational contexts and the role of practically relevant moderators such as feedback and learners’ ability to retrieve tested content from memory (i.e., retrievability) are still underexplored. Objective: The present research examines the benefits of multiple-choice questions with an experimental design that maximizes internal validity, while investigating the role of feedback and retrievability in an authentic educational setting of a university psychology course. Method: After course sessions, students answered multiple-choice questions or restudied course content and afterward could choose to revisit learning content and obtain feedback in a self-regulated way. Results: Participants on average obtained corrective feedback for 9% of practiced items when practicing course content. In the criterial test, practicing retrieval was not superior to reading summarizing statements in general, but a testing effect emerged for questions that targeted information that participants could easily retrieve from memory. Conclusion: Feedback was rarely sought. However, even without feedback, participants profited from multiple-choice questions that targeted easily retrievable information. Teaching Implications: Caution is advised when employing multiple-choice testing in self-regulated learning environments in which students are required to actively obtain feedback.
... When MC questions are used, students may mistakenly learn the misconceptions presented in the distractors if they receive no feedback on the correctness of their answers (Marsh et al., 2007; Roediger & Marsh, 2005). This risk is especially elevated when they have not engaged with the learning material before the test but instead use the MC questions themselves as learning material. ...
Book
Full-text available
The annual issue of 'die hochschullehre' 2021 contains all publications of this year. Downloadable versions of single papers can also be found on the homepage https://www.wbv.de/die-hochschullehre/beitraege/special/jahrgang/2021.html#cc14230
... For example, whereas laboratory paradigms often tightly control exposure to information, often limiting exposure to a single session (e.g., Glover, 1989;Kang et al., 2007), classroom learning typically involves repeated, varied, and spaced presentation of integrated content via multiple methods (e.g., lecture, homework, reading material). Additionally, though laboratory research often utilizes identical questions for "learning" tests and postintervention criterial tests (e.g., Kang et al., 2007;Roediger & Marsh, 2005), course instructors may desire alteration of questions between measures in order to determine if students are learning content or simply memorizing question-response pairing. A recent survey of nearly 200 instructors of college level introductory psychology courses suggests that identical quiz and exam questions are highly atypical in a college classroom environment (Wooldridge, et al., 2014). ...
... Rather, the effects are prominent over lengthier retention intervals, for instance, when students are tested a week after the learning phase (Karpicke and Roediger, 2007). Although the effects of retrieval practice on memory retention occur independently of feedback (Roediger and Butler, 2011), the inclusion of feedback strengthens learning and provides a formative component through which students can monitor their accuracy and thus prevent erroneous learning (Roediger and Marsh, 2005). The mechanisms underlying retrieval practice remain unclear (see Rowland, 2014 for an overview), but the effectiveness has been studied and confirmed in different experimental settings, educational contexts, across a range of materials and by brain imaging studies (Dunlosky et al., 2013; van den Broek et al., 2016; Adesope et al., 2017). ...
Article
Full-text available
Online quizzes building upon the principles of retrieval practice can have beneficial effects on learning, especially long-term retention. However, it is unexplored how interindividual differences in relevant background characteristics relate to retrieval practice activities in e-learning. Thus, this study sought to probe for this research question on a massive open online course (MOOC) platform where students have the optional possibility to quiz themselves on the to-be-learned materials. Altogether 105 students were assessed with a cognitive task tapping on reasoning, and two self-assessed personality measures capturing need for cognition (NFC), and grittiness (GRIT-S). Between-group analyses revealed that cognitively high performing individuals were more likely to use the optional quizzes on the platform. Moreover, within-group analyses (n = 56) including those students using the optional quizzes on the platform showed that reasoning significantly predicted quiz performance, and quiz processing speed. NFC and GRIT-S were unrelated to each of the aforementioned retrieval practice activities.
... Research has found that retention of studied material can be enhanced by testing (Kang, McDermott & Roediger, 2007). Recent studies have also demonstrated that taking a test on studied material promotes learning and conceptual understanding (Cranney, Ahn, McKinnon, Morris & Watts, 2009; Roediger & Marsh, 2005). Roediger and Karpicke (2006) reported that testing is a powerful means by which to improve student learning rather than just assessing students' knowledge levels. ...
Article
The purpose of this study was to determine preservice chemistry teachers’ achievement on different types of test and investigate the effect of different types of test on their achievement related to “Chemical Bonding”. The participants of this study consisted of 26 preservice chemistry teachers in the Department of Chemistry Education, Faculty of Education, Hacettepe University, in the fall semester of the 2010–2011 academic year. The mean age of the preservice chemistry teachers was 20 years. In the study, Chemistry Achievement Tests, designed to involve four different item types (Two-Tier Multiple-Choice, Multiple-Choice, Essay, Correct/Incorrect) targeting the same behavioural objectives and administered to the same group of students, were used as data collection tools. The hypotheses were tested using one-way ANOVA. A significant difference was found between preservice chemistry teachers’ mean achievement scores on the “Two-Tier Multiple-Choice Test”, “Multiple-Choice Test”, “Essay Test” and “Correct/Incorrect Test”. The results showed that preservice chemistry teachers were most successful on the “Multiple-Choice Test” and “Correct/Incorrect Test”, then on the “Two-Tier Multiple-Choice Test”, and least successful on the “Essay Test”. Key words: achievement, chemical bonding, correct/incorrect test, essay test, multiple-choice test, two-tier multiple-choice test.
... Since their inception over 100 years ago (Goodenough, 1950), tests and exams constructed using Multiple Choice Questions (MCQs) have been widely used as a form of assessment in higher education due to their high level of reliability, versatility, efficiency and ease of marking (Roediger III & Marsh, 2005). Despite the wide usage of MCQs and detailed published guidelines for their construction (Haladyna, Downing, & Rodriguez, 2002), there remain inherent limitations such as the difficulty of detecting the guessing of answers (Burton, 2001), the inability to test higher-level cognitive functions and the lack of opportunity for a student to show the working used to obtain the selected answer in order to obtain partial credit (McAllister & Guidice, 2012). ...
... Along with the discussion regarding the valid uses of in-class content examinations, there has been much debate about the virtues and vices of using MCQs as a means to evaluate student content mastery (Roediger and Marsh 2005; Fazio, Agarwal, Marsh, and Roediger 2010). Comparisons of MCQs are typically made with more subjective techniques such as essay and open response questions as a way of evaluating student knowledge (Bridgeman 1992). ...
Article
Prior non-accounting research has generally concluded that students obtain performance benefits from self-generating multiple choice questions (MCQs). We examine accounting students completing an extra-credit assignment to self-generate MCQs and its association with examination performance gains. Using students from a large public and small/medium-sized private university, across multiple courses and semesters, we find that students completing the assignment did not outperform students in the same courses, with the same instructors, not offered the assignment. We find that these results hold across students of all initial performance quartiles. Our results suggest that prior educational research may overestimate the benefits of MCQ self-generation by not performing appropriate control group comparisons. We provide evidence that voluntary self-generation of MCQs may be a way to identify students seeking to improve their course performance, but in and of itself it may not be an effective method to improve student performance on MCQ examinations. Data Availability: Data are available upon request, after the completion of a related study. JEL Classifications: M49.
... This phenomenon has been termed the "testing-effect" and has been replicated with diverse materials, such as words (Bouwmeester & Verkoeijen, 2011; Carpenter, 2009), texts (Callender & McDaniel, 2009), and pictorial stimuli (Carpenter & Pashler, 2007; Jacoby, Wahlheim, & Coane, 2010; Weinstein, McDermott, & Szpunar, 2011). More recently, several studies have demonstrated that such an effect can be applied to actual classroom contexts (Carpenter, Pashler, & Cepeda, 2009; Cranney, Ahn, McKinnon, Morris, & Watts, 2009; Jaeger, Eisenkraemer, & Stein, 2015; Lipowski, Pyc, Dunlosky, & Rawson, 2014; Roediger & Marsh, 2005; for a review see Moreira, Pinto, Starling, & Jaeger, 2019), and for individuals with diverse cognitive skills (Moreira, Pinto, Justi, & Jaeger, 2019; Minear, Coane, Boland, Cooney, & Albat, 2018). Because in most testing-effect experiments questions are placed after the study materials (i.e., postquestions), it remains unclear to what extent questions placed before the study materials (i.e., prequestions) benefit learning. Studies investigating the effects of prequestions are more sparse than studies on the effects of postquestions, but they have already shown that prequestions are beneficial for learning when educationally relevant materials are adopted, such as prose passages (Frase, 1967; Peeck, 1970), textbook chapters (Pressley, Tanenbaum, McDaniel, & Wood, 1990), lectures (Carpenter, Rahman, & Perkins, 2018; cf., Geller et al., 2017), and video presentations (Carpenter & Toftness, 2017). ...
Article
Prior research revealed that answering questions after study benefits learning in children, but it is unclear whether answering questions before study (i.e., prequestions) produces similar effects. Here, we report two experiments investigating this issue in 4th and 5th grade children. In both experiments, target-words from an encyclopedic text were either prequestioned, postquestioned, or reread. To assess memory retention, cued recall and multiple choice tests were administered after a 7-day interval, when children also rated their confidence on their responses. Both prequestions and postquestions resulted in overall greater memory retention than rereading, although postquestions resulted in greater cued recall performance than prequestions, a finding that was mirrored by the confidence data (i.e., postquestion > prequestion > reread). Thus, although both prequestions and postquestions were more beneficial for memory retention than rereading, postquestions seem to have promoted more recollection-based retrieval than prequestions, a finding we discuss from a dual-process model perspective.
... However, even if the format of summative assessments is not multiple-choice, giving multiple-choice practice exams helps students perform better on those assessments (Roediger & Karpicke, 2006; Smith & Karpicke, 2014). One problem with multiple-choice practice exams, however, is that students can sometimes remember misinformation simply by being exposed to the incorrect answer choices (Marsh, Roediger, Bjork, & Bjork, 2007; Roediger & Marsh, 2005). Providing relatively immediate feedback (e.g., right after the exam is over) can help to mitigate this effect (Butler & Roediger, 2008). ...
Chapter
While teaching psychology is always demanding, teaching courses about the psychology of learning presents unique challenges for instructors. Learning courses have specialized language and procedures not found in other areas of psychology, students tend to enter courses with certain misconceptions, and published materials related to teaching learning can be lacking. This chapter discusses these and other challenges and potential ways to overcome them. Being aware of these pitfalls can help instructors to understand any confusion students might have or develop about the material and take actions to correct it. Also included is a brief history of learning as a field, and proposed core content and learning outcomes for learning courses. Evidence-based teaching and assessment strategies are discussed in general, along with specific examples pertaining to learning courses. A way of approaching the teaching of operant conditioning based on common student difficulties is also outlined. Lastly, some general teaching tips as well as teaching resources (some general, some specific to learning courses) are provided. Though this chapter is aimed at instructors of learning courses (or those looking for guidance in teaching the learning portion of an introductory psychology course), many of the strategies can be applied widely.
... Well-written multiple-choice and true-false questions work well to measure factual knowledge, basic comprehension, and cognitive application [83,84]. Open-ended questions are more appropriate for measuring reinterpreted and applied learning through daily living, especially when such application can vary widely [83,85]. ...
Article
Objectives: Learning modern pain biology concepts can improve important clinical outcomes for people with chronic pain. The primary purpose of this scoping review is to examine and report characteristics of chronic musculoskeletal pain education programs from an instructional design perspective. Methods: Following PRISMA-ScR guidelines, PubMed, Medline, and CINAHL databases were systematically searched. Articles included expert recommendations and those reporting pain education programs used in clinical trials enrolling adults with chronic neuromusculoskeletal pain and published in English between 1990 and 2021. Three authors independently evaluated articles for eligibility through title, abstract, and full text review. Instructional design characteristics such as learning outcomes, support materials, learning assessment methods, and key concepts communicated were summarized. Results: The search revealed 5260 articles of which 40 were included: 27 clinical studies, 7 expert recommendations, and 6 articles reporting on pain education from participant perspectives. Detailed reporting of instructional design characteristics informing replication in subsequent studies is sparse. Most included trials used only lecture and did not assess participant learning. Conclusions: More comprehensive reporting of pain education programs is needed to facilitate replicability. Practical implications: This article proposes detailed and standardized reporting of trials using pain education programs employing a modified version of the TIDieR checklist.
... While MC questions are favored because they can be rapidly graded, there are drawbacks: MC questions may cause students to rely on memorization (Stanger-Hall, 2012) or use test-taking strategies (Kim and Goetz, 1993). MC questions may also overestimate student understanding (Nehm and Schonfeld, 2008), fail to capture mixed thinking (Brassil and Couch, 2019), or cause students to consider the incorrect alternatives as correct in later testing (Roediger and Marsh, 2005). Thus, MC questions may not provide a complete picture of student thinking. ...
Article
The focus of biology education has shifted from memorization to conceptual understanding of core biological concepts such as matter and energy relationships. To examine undergraduate learning about matter and energy, we incorporated constructed-response (CR) questions into an interactive computer-based tutorial. The objective of this tutorial is to teach students about matter and energy and help dispel common misconceptions through the context of cellular respiration. We used a constructed-response classifier (CRC) tool to categorize ideas in responses to three CR questions and measure changes in student thinking about cellular respiration. Our data set includes 841 undergraduates from 19 geographically diverse institutions including two-year colleges, primarily undergraduate institutions, and research-intensive colleges and universities. We found students from all institution types included more scientific ideas in CRs post-tutorial. Students used an average of 2.1 ideas in CRs and frequently used both scientific and developing ideas. We found this mixed thinking persisted after the tutorial regardless of institution type. Students' multiple-choice (MC) selections were correlated with their CRs, but CRs revealed more mixed thinking than would be inferred from MC responses. Our study shows a CRC tool can measure student learning after a computer-based tutorial and provides more complete information than MC responses.
Article
Across three experiments featuring naturalistic concepts (psychology concepts) and naïve learners, we extend previous research showing an effect of the sequence of study on learning outcomes, by demonstrating that the sequence of examples during study changes the representation the learner creates of the study materials. We compared participants' performance in test tasks requiring different representations and evaluated which sequence yields better learning in which type of tests. We found that interleaved study, in which examples from different concepts are mixed, leads to the creation of relatively interrelated concepts that are represented by contrast to each other and based on discriminating properties. Conversely, blocked study, in which several examples of the same concept are presented together, leads to the creation of relatively isolated concepts that are represented in terms of their central and characteristic properties. These results argue for the integrated investigation of the benefits of different sequences of study as depending on the characteristics of the study and testing situation.
Conference Paper
Proceeding: English in the Professional World: Issues of the teaching of English language and culture in the professional world. Editors: G.M. Adhyanggono, Antonius Suratno. Foreword: This book is a compilation of various topics highlighting "English in the Professional World: Issues of the teaching of English language and culture in the professional world." In this proceeding, linguists, teachers, researchers as well as practitioners of English for Specific Purposes express their ideas and interests on salient issues pertaining to the teaching of English language and culture for diverse fields. As communications between nations are increasing, so are the demands for university graduates and students who are capable of using English to perform their job-related tasks and communicate their professional goals. Education institutions have great responsibilities to bridge the needs of the professional world and the education of qualified graduates. Such awareness is reflected in the articles covering diverse topics that range from research and teaching to work experience.
Conference Paper
What happens when we transform graduate students’ learning environments into simultaneous warp zones? It enables us to take what we usually define as distinct learning spaces and mode of delivery and make them interchangeable interaction and collaboration dimensions. In this presentation we will share two cases in which we applied the concept of “warp zones” in learning experience design and the pedagogical approach we adopted, and discuss the challenges we faced and the lessons we learned.
Conference Paper
Automating assessment of person’s skills is an important area of study in artificial intelligence and natural language processing. In this work we conduct empirical study of a recently proposed Reverse Turing Test for Knowledge Assessment approach—a completely automated domain agnostic method of knowledge assessment that can operate completely without human assessor involvement. Our study involved 53 participants and three different knowledge domains. We conclude that this method can reliably differentiate between expertise levels and therefore can be a compelling alternative to human grading and multiple-choice tests in many domains.
Article
Background: The multiple choice question (MCQ) has been widely used as a method to assess the achievement of learning outcomes. MCQs as an assessment instrument may have both expected and unexpected impacts. The objective of this research is to identify the impacts of MCQs, in terms of structure, content, information and regulation, on the learning process of students at the Medical Faculty of Hasanuddin University. Method: The study was a descriptive survey involving 505 medical students from Hasanuddin University who were still in the academic phase, classes of 2010, 2011 and 2012. A preliminary study was carried out to explore learning impacts caused by MCQs. Based on the results of the preliminary study, a questionnaire using a rating scale was developed, consisting of 92 items of possible learning impacts. Open-ended questions were added to obtain free responses from the students. The answers were classified into expected and unexpected learning impacts. Results: The structure, content, information and regulation of the MCQ method gave expected impacts such as learning from many sources (74.4%), group studying (96.2%), mind mapping (37.9%) and re-discussing the exam materials (96.2%). It also gave unexpected impacts such as guessing the answer (44.8%), only studying previous exams (93.5%), cheating (33.7%) and taking pictures of the exam papers (38.2%). Conclusion: Unexpected impacts may occur from the MCQ method, which structurally contains item flaws, such as only assessing memorizing skills rather than the application of knowledge and incomplete information in the stem. The regulation, in the form of a summative exam, encourages students to prepare themselves more seriously than for a formative exam.
Article
Multiple-choice testing with dichotomous scoring is one of the most common assessment methods utilized in undergraduate education. Determining students’ perceptions toward different types of multiple-choice testing formats is important for effective assessment. The present study compared two alternative multiple-choice testing formats used in a second-year required chemistry course: (1) The Immediate Feedback Assessment Technique (IFAT®) and (2) Personal Point Allocation (PPA). Both testing methods allow for partial credit but only the IFAT® provides immediate feedback on students’ responses. Both survey and interview data indicated that, overall, most students preferred IFAT® to the PPA testing method. These positive ratings were related to potential increase in reward, ease of use, and confidence. IFAT® was also perceived to be less stress producing, and anxiety provoking than PPA. Interview data supported these findings but also indicated individual differences in preference for each of these two methods. Additionally, students’ feedback on strategies used for either testing method and suggestions on how to improve the methods are discussed.
Article
Taking a test on learned items enhances long-term retention of these items. However, it is believed that good performance in a test contributes to subsequent high retention of the tested items while poor performance does not. Recent studies have sought to find the optimal way to make up for this poor performance, and have indicated that giving the subsequent learning session soon after the test is one such way. This study is different from previous studies in that we used L1–L2 word pairs to examine whether restudying immediately after the failure in the test is useful for long-term retention. First, in the initial study session, all the participants (n = 52) were shown and asked to remember 20 English and Japanese word pairs (e.g., deceit:詐欺). A week later, Group A took the first test session (Initial Test) before the restudy session. On the contrary, Group B took the restudy session before the Initial Test. An hour after this session, both groups took Posttest 1. Then, Posttest 2 was conducted a week after Posttest 1. The results showed that Group A had significantly lower scores than Group B in the Initial Test (2% vs. 55%). However, the results were reversed in Posttest 1 (84.2% vs. 53.2%) and Posttest 2 (55% vs. 43.5%). This study found that a restudy session soon after poor performance in the Initial Test enhanced long-term L2 vocabulary retention because learners benefited from the indirect effects of testing. Thus, English teachers should take such effects into consideration when organizing vocabulary quizzes and restudy sessions.
Article
This explanatory case study analyses student selection to English teaching programs in Turkey and suggests a new procedure. Students, teacher educators, and teachers from a language teaching department at a state university made up the study participants. Data were gathered through surveys and interviews, which were analyzed by descriptive and inferential statistics and content analysis. The results presented the strengths and weaknesses of the selection system and suggestions and implications for improvement.
Article
Frequent repetition of vocabulary is essential for effective language learning. To increase exposure to learning content, this work explores the integration of vocabulary tasks into the smartphone authentication process. We present the design and initial user experience evaluation of twelve prototypes, which explored three learning tasks and four common authentication types. In a three-week within-subject field study, we compared the most promising concept as mobile language learning (MLL) applications to two baselines: We designed a novel (1) UnlockApp that presents a vocabulary task with each authentication event, nudging users towards short frequent learning session. We compare it with a (2) NotificationApp that displays vocabulary tasks in a push notification in the status bar, which is always visible but learning needs to be user-initiated, and a (3) StandardApp that requires users to start in-app learning actively. Our study is the first to directly compare these embedding concepts for MLL, showing that integrating vocabulary learning into everyday smartphone interactions via UnlockApp and NotificationApp increases the number of answers. However, users show individual subjective preferences. Based on our results, we discuss the trade-off between higher content exposure and disturbance, and the related challenges and opportunities of embedding learning seamlessly into everyday mobile interactions.
Article
This study investigated how word meanings can be learned from natural story reading. Three experiments with adult participants compared naturalistic incidental learning with intentional learning of new meanings for familiar words, and examined the role of immediate tests in maintaining memory of new word meanings. In Experiment 1, participants learned new meanings for familiar words through incidental (story reading) and intentional (definition training task) conditions. Memory was tested with cued recall of meanings and multiple-choice meaning-to-word matching immediately and 24 h later. Results for both measures showed higher accuracy for intentional learning, which was also more time efficient than incidental learning. However, there was reasonably good learning from both methods, and items learned incidentally through stories appeared less susceptible to forgetting over 24 h. It was possible that retrieval practice at the immediate test may have aided learning and improved memory of new word meanings 24 h later, especially for the incidental story reading condition. Two preregistered experiments then examined the role of immediate testing in long-term retention of new meanings for familiar words. There was a strong testing effect for word meanings learned through intentional and incidental conditions (Experiment 2), which was non-significantly larger for items learned incidentally through stories. Both cued recall and multiple-choice tests were each individually sufficient to enhance retention compared to having no immediate test (Experiment 3), with a larger learning boost from multiple-choice. This research emphasises (i) the resilience of word meanings learned incidentally through stories and (ii) the key role that testing can play in boosting vocabulary learning from story reading.
Article
Two experiments investigated the impact of responding to recognition test items that do not include a correct alternative. In Experiment 1, subjects who were given exclusively incorrect response alternatives were less likely than control subjects to favor the correct alternatives on a second recognition test. Analysis of subjects’ responses indicated that commitment, rather than distractor familiarity, was the main source of this effect. In Experiment 2, an impairing effect of committing to an incorrect alternative was observed even when the initial distractors were excluded from the final test. Thus, this decreased performance cannot simply be attributed to a bias toward remaining consistent. One interpretation of these results is that committing to a distractor causes subjects to remember a false detail that can interfere with their later ability to access the original information. Other potential theoretical and applied implications of these results are explored.
Article
Subjects rated how certain they were that each of 60 statements was true or false. The statements were sampled from areas of knowledge including politics, sports, and the arts, and were plausible but unlikely to be specifically known by most college students. Subjects gave ratings on three successive occasions at 2-week intervals. Embedded in the list were a critical set of statements that were either repeated across the sessions or were not repeated. For both true and false statements, there was a significant increase in the validity judgments for the repeated statements and no change in the validity judgments for the non-repeated statements. Frequency of occurrence is apparently a criterion used to establish the referential validity of plausible statements.
Article
Four experiments tested the hypothesis that successful retrieval of an item from memory affects retention only because the retrieval provides an additional presentation of the target item. Two methods of learning paired associates were compared. In the pure study trial (pure ST condition) method, both items of a pair were presented simultaneously for study. In the test trial/study trial (TTST condition) method, subjects attempted to retrieve the response term during a period in which only the stimulus term was present (and the response term of the pair was presented after a 5-sec delay). Final retention of target items was tested with cued-recall tests. In Experiment 1, there was a reliable advantage in final testing for nonsense-syllable/number pairs in the TTST condition over pairs in the pure ST condition. In Experiment 2, the same result was obtained with Eskimo/English word pairs. This benefit of the TTST condition was not apparently different for final retrieval after 5 min or after 24 h. Experiments 3 and 4 ruled out two artifactual explanations of the TTST advantage observed in the first two experiments. Because performing a memory retrieval (TTST condition) led to better performance than pure study (pure ST condition), the results reject the hypothesis that a successful retrieval is beneficial only to the extent that it provides another study experience.
Article
Numerous studies have demonstrated that exposure to misinformation about a witnessed event can lead to false memories in both children and adults. The present study extends this finding by identifying forced confabulation as another potent suggestive influence. Participants from 3 age groups (1st grade, 3rd/4th grade, and college age) viewed a clip from a movie and were "forced" to answer questions about events that clearly never happened in the video they had seen. Despite evidence that participants would not have answered these questions had they not been coerced into doing so, 1 week later participants in all age groups came to have false memories for the details they had knowingly fabricated earlier. The results also showed that children were more prone to this memory error than were adults.
Article
The effect of an initial forced recall test on later recall and recognition tests was examined in younger and older adults. Subjects were presented with categorized word lists and given an initial test under standard cued recall instructions (with a warning against guessing) or forced recall instructions (that required guessing); subjects were later given a cued recall test for the original list items. In 2 experiments, initial forced recall resulted in higher levels of illusory memories on subsequent tests (relative to initial cued recall), especially for older adults. Older adults were more likely to say they remembered rather than knew that forced guesses had occurred in the original study episode. The effect persisted despite a strong warning against making errors in Experiment 2. When a source monitoring test was given, older adults had more difficulty than younger adults in identifying the source of items they had originally produced as guesses. If conditions encourage subjects to guess on a first memory test, they are likely to recollect these guesses as actual memories on later tests. This effect is exaggerated in older adults, probably because of their greater source monitoring difficulties. Both dual process and source monitoring theories provide insight into these findings.
Article
Numerous studies have demonstrated that exposure to misinformation about a witnessed event can lead to false memories in both children and adults. The present study extends this finding by identifying forced confabulation as another potent suggestive influence. Participants from 3 age groups (1st grade, 3rd/4th grade, and college age) viewed a clip from a movie and were "forced" to answer questions about events that clearly never happened in the video they had seen. Despite evidence that participants would not have answered these questions had they not been coerced into doing so, 1 week later participants in all age groups came to have false memories for the details they had knowingly fabricated earlier. The results also showed that children were more prone to this memory error than were adults.
Article
High school students completed both multiple-choice and constructed response exams over an 845-word narrative passage on which they either took notes or underlined critical information. A control group merely read the text. In addition, half of the learners in each condition were told to expect either a multiple-choice or constructed response test following reading. Overall, note takers showed superior posttest recall, and notetaking without test instructions yielded the best group performance. Notetaking also required significantly more time than the other conditions. Underlining for a multiple-choice test led to better recall than underlining for a constructed response test. Although more multiple-choice than constructed response items were remembered, Test Mode failed to interact with the other factors.
Article
Sixty-four college students read a series of passages about 12 U.S. presidents and then took a true-false exam on the material. One week later, they took a second exam in which they rated each item on a 7-point scale, ranging from definitely false to definitely true. Some items on the second exam were repeated from the first exam, whereas other items (nonrepeated items) were not presented previously. Results indicated that mean validity ratings on the second exam were higher for repeated items than for nonrepeated items regardless of whether the items were true or false. That is, subjects were more likely to believe an item to be true if they had encountered it previously on the initial true-false exam. This occurrence for objectively false items constitutes evidence for a “negative suggestion effect” (Remmers & Remmers, 1926), in which students learn incorrect information as a result of exposure to false items. Practical and theoretical implications of these findings are discussed.
Article
This paper focuses on methodological problems arising in the study of the testing effect. These problems arise because processes that are correlated with, but logically independent of, the process of testing often differ across the study and test conditions. Carrier and Pashler (1992) recently reviewed these problems and proposed a paired-associate procedure for alleviating them. In this procedure, robust testing effects occurred, suggesting that the processes underlying testing may differ from those in a standard study condition. One potential problem with Carrier and Pashler's procedure is that opportunities to restudy previously tested items may be contributing to the testing effect. We use a modified Brown-Peterson paradigm (Peterson & Peterson, 1959) to provide converging evidence for Carrier and Pashler's (1992) conclusions. The results of Experiment 1 demonstrate testing effects even when there are no opportunities to restudy previously tested items. Experiment 2 examines whether, as suggested by Carrier and Pashler (1992), stronger interitem associations might be producing the testing effect arising from an intervening free recall test. The results demonstrate that testing effects occur even when single item lists are used, ruling out the view that testing effects are due solely to stronger inter-item associations.
Article
Two experiments examined the influence of test taking and feedback in promoting learning. Participants were shown a list of trivia facts during an incidental learning task. Some facts were later tested (plus feedback provided), whereas other facts were not presented for further processing. Tested facts were better recalled on a final criterion test than untested facts, showing the beneficial effects of testing. Tested facts were also better recalled than facts that were presented for additional study (Experiment 1). Although testing plus feedback enhanced learning, there were no effects of whether the participants were required simply to repeat the feedback or elaborate it.
Article
College students (N = 85) read a passage in which each sentence had been normatively assessed as to its importance to the overall meaning of the passage. Students expecting an essay examination took notes on sentences of higher structural importance than those anticipating a multiple choice test, even though there was no difference in the number of notes taken or in total test performance. The students took notes on 31% of the passage sentences and such notes were of high structural importance value. Most importantly, note taking seemed to serve as both an encoding device and as an external storage mechanism, with the latter being the more important function. The external storage function not only led to enhanced recall of the notes, but also facilitated the reconstruction of other parts of the passage.
Article
Previous research has demonstrated that exposure to false information on a true-false test increases individuals' confidence that the information is true when they later encounter a verbatim repetition of the test item (the negative suggestion effect). To investigate the generality of this effect, we had 160 college students read text passages and take an objective test on their material. After a week, they rated the validity of a series of statements, including some that had appeared on the initial test. The results indicated that subjects were more likely to believe that objectively false statements were true if they had appeared on the earlier test. This effect occurred regardless of whether items were repeated verbatim or in paraphrased form, and regardless of whether the initial test employed a true-false or a multiple-choice format. The findings suggest that the negative suggestion effect in actual test use may be widespread.
Article
In two experiments subjects viewed slides depicting a crime and then received a narrative containing misleading information about some items in the slides. Recall instructions were manipulated on a first test to vary the probability that subjects would produce details from the narrative that conflicted with details from the slides. Two days later subjects returned and took a second cued recall test on which they were instructed to respond only if they were sure they had seen the item in the slide sequence. Our interest was in examining subjects’ production of the misleading postevent information on the second cued recall test (on which they were instructed to ignore the postevent information) as a function of instructions given before the first test. In both experiments, robust misinformation effects occurred, with misrecall being greatest under conditions in which subjects had produced the wrong detail from the narrative on the first test. In this condition subjects were more likely to recall the wrong detail on the second test and were also more likely to say that they remembered its occurrence, when instructed to use Tulving's (1985) remember/know procedure, than in comparison conditions. We conclude that a substantial misinformation effect occurs in recall and that repeated testing increases the effect. False memories may arise through repeated retrieval.
Article
The "testing" phenomenon refers to the finding that students who take a test on material between the time they first study and the time they take a final test remember more of the material than students who do not take an intervening test. 4 experiments examined the testing phenomenon in students' memory for brief passages and labels for parts of flowers. Experiments 1a and 1b demonstrated the generality of the phenomenon to the methods and materials used in the current study. Experiment 2 ruled out an "amount of processing" hypothesis as a way of accounting for the testing phenomenon. The results of Experiment 3 seemed to indicate that the testing phenomenon resided in the number of complete retrieval events. Experiments 4a, 4b, and 4c focused on the completeness of retrieval events and indicated that the influence of retrieval on later memory performance was determined, at least in part, by the completeness of the initial retrieval event. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
3605 subjects in the 6th grade were given two articles to read and tests were appropriately spaced to determine effect of recall on retention, relation between rate of forgetting and ability, and effect of item difficulty on retention. Implications for education include: immediate recall aids retention; recall may aid in fixing erroneous ideas; tests are learning devices. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
This article examines the negative suggestion effect, or the impact of exposure to incorrect alternatives on memory for correct information. All experiments used the following design: (a) cued-recall test of general facts (e.g., "second smallest planet") with immediate correct feedback, (b) interpolated exposure to incorrect information related to Test 1 items, and (c) a second test over the same items as Test 1. Test 2 was either multiple choice or cued recall and was given either immediately or 1 week after interpolation. Three experiments confirmed the existence of negative suggestion: Exposure to misinformation hindered subsequent performance on those items, relative to noninterpolated control items. The magnitude of this decrement is unrelated to retention interval, type of second test, number of incorrect alternatives exposed, and number of repetitions of incorrect alternatives. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Surveys theories of human learning and memory, covering both the classical approaches and the newer information-processing approaches. Methodological principles, techniques and empirical laws of the field, and the solution of theoretical problems are integrated and discussed. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Examined the rate of forgetting of experimentally acquired associative information when the to-be-associated items are either associatively related or related in semantic memory in 288 undergraduates. Separate groups of 24 Ss studied lists of word pairs in which the members of the pairs were either unrelated or strong or weak associates. A single study trial was given. In addition, 1 group received 3 study presentations on the unrelated list. An immediate cued recall test for half of the pairs was followed by a second test on all pairs either 10 min, 48 hrs, or 1 wk later. The associated pairs, both strong and weak, were forgotten less rapidly than the nonassociated pairs, but the effect was largely restricted to previously tested items. The results do not appear to be due to differences in the amount of interference present, but they point to the importance of retrieval operations in the attenuation of forgetting. (French abstract) (9 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Three experiments with 126 undergraduates investigated the influence of different initial retrieval experiences on memory. In all conditions, Ss performed semantic and phonemic encoding tasks on a word list. Then Ss either received a cued-recall test that varied the type of cue (semantic vs phonemic), a 2-alternative forced-choice recognition test that varied the type of foil, no immediate test, or a 2nd encoding task. 24 hrs later, all Ss performed a final cued-recall test that varied the type of cue (semantic or phonemic). Immediate cued recall, and to a lesser extent a 2nd encoding, facilitated delayed recall primarily when the level of encoding and the type of delayed cue were mismatched. Immediate recognition, however, produced a different pattern of facilitation in delayed recall. Findings support the idea that initial cued-recall and 2nd encoding tasks produce an elaboration of an existing memory representation that increases the variability of encoded information. (27 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
"This study is concerned with the changes in the strength of first-list associations during the acquisition of a second list. A mixed-list design was used so that for half the syllable-adjective pairs in the two lists the paradigm of transfer was A-B, A-B', and for the other half, A-B, A-C. List 1 was learned to one perfect recitation and List 2 was practiced for 20 trials." A greater reduction was found in the availability of List 1 responses for the A-B, A-C than for the A-B, A-B' paradigm. Recall of List 2 was nearly perfect. These findings support the hypothesis of unlearning and make it unlikely that the differences in the availability of List 1 associations are a function of response sets characteristic of the 2 paradigms. The conventional anticipation method yields similar results. The amount of retroactive interference immediately after interpolated learning is determined largely if not entirely by the availability of List 1 associations. (18 ref.) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Conducted 5 experiments with a total of 323 undergraduates with several repetitive reinforcement-test (R-T) sequences varied with respect to m, the number of unreinforced T trials/replication in Condition RT1 . . . Tm. No limit was observed in the forgetting-prevention effects of Ts. Performance levels stayed strikingly constant within any block of consecutive Ts, indicating that neither significant learning nor forgetting occurred over as many as 19 successive unreinforced tests. In contrast, the potentiating effects of T trials were demonstrated to have an optimal point: the effects progressively increased from Condition RT (m = 1) to Condition RT1 . . . T7 (m = 7), where a maximum point was reached, decreased slightly as m increased further, and reached an asymptote soon thereafter. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Conducted 4 experiments with 303 undergraduates to examine the relationship between the rated truth of statements and prior study of parts of those statements. Findings from the 1st 2 experiments show that new details about familiar topics are rated truer than new details about unfamiliar topics. Consequently, recognition of a topic as familiar disposes Ss to accept new details as true. Results from the 3rd and 4th experiments show that statements initially studied under an affirmative bias are rated truer than statements originally studied under a negative bias. However, since even the negatively biased statements are rated truer than new ones, it is contended that Ss are not remembering the bias. Rather, different biases during study affect the probability that details will be encoded into memory. In contrast to differential biases, different study processes affect the likelihood that Ss will remember having studied the statements, but do not affect truth. Results are discussed in terms of the hypothesis that remembered factual details are the criterion of certitude against which tested statements are assessed. (French abstract) (38 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
A series of 3 experiments conducted with a total of 240 undergraduates reexamined the factors underlying the effect of recall trials on subsequent recall. In a free-recall memory task, Ss were given a series of presentation trials or recall trials following initial presentation and later tested for long-term retention. Overall results support the hypothesis that recall trials provide information about the recallability of each item and provide a re-presentation of each item recalled. These results are inconsistent with the notion that some retrieval process (e.g., learning to locate an item in memory) facilitates subsequent recall. Some apparently contradictory evidence is experimentally reconciled with this hypothesis, and some problems involved in interpreting other evidence are discussed. (16 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
The thesis is advanced that the law of disuse cannot account for the major phenomena of forgetting; first, because it lacks generality, since disuse often fails to produce forgetting; second, because even where forgetting and disuse are correlated, there is no evidence that it was the disuse that caused the forgetting, instead of other important factors which were present; third, because the principle of passive decay has no analogue anywhere else in science, and is illogical; and fourth, that experimental work with retroactive inhibition shows that forgetting varies with interpolated conditions rather than with disuse. Two principles are offered to account for forgetting: interpolated activities and altered stimulating conditions. Disuse is important only in that it gives these primary laws an opportunity to operate. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Experience with misspellings can be detrimental to subsequent spelling performance. Generating or being exposed to incorrect spellings between two successive spelling tests interfered with subsequent spelling accuracy of these same words in Experiments 1 and 2 (but not Experiment 3), as indicated by changes from correct to incorrect spellings (CI changes). Furthermore, significantly more CI changes occurred when a recognition test (with incorrect versions as distractors) followed a dictation test than when a second dictation test followed it. Repeatedly presented misspellings were rated as looking progressively more similar to the correct spelling across presentations (Experiment 3). These outcomes suggest that spelling tests that involve the discrimination of correct from incorrect versions may be ill advised. In addition, the instructional technique encouraging students to intentionally produce misspellings of words, for purposes of visual comparison, may be detrimental rather than helpful. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Examined the effects on spelling of reading and of reproducing correctly and incorrectly spelled words in 2 experiments involving 108 undergraduate students. Reading correctly and incorrectly spelled words influenced later spelling accuracy for those same words. Reproducing the spelling of words did not have any effects on later spelling accuracy beyond those produced by reading the words. However, reproducing a correctly spelled word did speed the production of a later correct spelling for the word, whereas reading did not speed later production. Effects on spelling accuracy were dissociated from recognition memory for previously presented words. (French abstract) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Previous research has shown that repeated statements are rated as more true than new ones. In Exp I, 98 undergraduates rated sentences for truth on 2 occasions, 3 wks apart. Results indicate that the repetition effect depends on Ss' detection of the fact that a statement is repeated: statements that are judged to be repeated are rated as truer than statements judged to be new, regardless of the actual status of the statements. Exp II with 64 undergraduates showed that repeated statements increment in credibility even if Ss were informed that they were repeated. It was further determined that statements that contradicted early ones were rated as relatively true if misclassified as repetitions but that statements judged to be changed were rated as relatively false: Ss were predisposed to believe statements that seemed to reaffirm existing knowledge and to disbelieve statements that contradicted existing knowledge. (15 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
The results of two free-recall learning experiments, in which the number and sequential combinations of input and output phases were systematically varied, indicated that (a) recall tests facilitate the overall immediate post-input recall approximately to the same extent as do the events occurring in input phases, (b) input and output events are approximately equally responsible for forgetting of individual items from one output phase to another, (c) the earlier contention that intratrial retention remains constant over trials is untenable, and (d) recall of individual items is quite labile even when overall recall is stable. The findings were interpreted in support of the view of memory as a limited-capacity retrieval system in which the limit is set by the number, but not by the nature of the contents, of accessible memory units.
Article
Two experiments are described in which subjects studied made-up, fantasy facts about well-known persons and then were asked to verify actual facts about these persons. Reaction time to the actual facts was longer the more fantasy propositions studied about a person. Reaction time was also longer when the verification test involved a mixture of actual and fantasy facts rather than just actual facts. A mathematical version of the ACT model (Anderson, 1976) was fit to the data. It provides a satisfactory fit, better than an alternate model. However, some of the parameter values estimated for the ACT model seemed unreasonable.
Article
Two experiments are reported in which subjects viewed films of automobile accidents and then answered questions about events occurring in the films. The question, “About how fast were the cars going when they smashed into each other?” elicited higher estimates of speed than questions which used the verbs collided, bumped, contacted, or hit in place of smashed. On a retest one week later, those subjects who received the verb smashed were more likely to say “yes” to the question, “Did you see any broken glass?”, even though broken glass was not present in the film. These results are consistent with the view that the questions asked subsequent to an event can cause a reconstruction in one's memory of that event.
Article
Forty words were given to Ss in Experiment I either for three study trials followed by a recall or recognition test, or for one study trial followed by three recall or recognition tests. Forty-eight hours later a free-recall test and a recognition test were administered to all Ss. Training conditions with greater item exposure (study and recognition trials) resulted in more effective recognition, while the conditions which encouraged retrieval (recall and recognition test trials) facilitated recall. In Experiment II first-day training consisted either of four study trials, or of one study trial followed by three recall tests. Long-term retention was significantly better in the former condition when measured by recognition, but better in the latter when measured by free recall. The data were discussed with reference to the two-stage theory of recall.
Article
Fifty-two students sat two 60-question multiple choice examinations 8 months apart. The second examination consisted of 30 of the questions used at the first sitting, together with 30 questions with repeated stems but different responses. Immediately after the first sitting, 27 of the students went through the paper with the examiner. They were given the correct answers, and any problems were discussed. For the 30 questions with new options, the increase in marks was similar in the two groups. The group given feedback had a significantly greater improvement in those questions repeated in the second examination than those given no feedback. It is concluded that feedback on examination answers does lead to learning of the specific items, but does not lead to a general increase of information in the same area.
Article
In two experiments, adults who witnessed a videotaped event subsequently engaged in face-to-face interviews during which they were forced to confabulate information about the events they had seen. The interviewer selectively reinforced some of the participants' confabulated responses by providing confirmatory feedback (e.g., "Yes, ______ is the correct answer") and provided neutral (uninformative) feedback for the remaining confabulated responses (e.g., "O.K. ______"). One week later participants developed false memories for the events they had earlier confabulated knowingly. However, confirmatory feedback increased false memory for forcibly confabulated events, increased confidence in those false memories, and increased the likelihood that participants would freely report the confabulated events 1 to 2 months later. The results illustrate the powerful role of social-motivational factors in promoting the development of false memories.
Article
Taking a memory test not only assesses what one knows, but also enhances later retention, a phenomenon known as the testing effect. We studied this effect with educationally relevant materials and investigated whether testing facilitates learning only because tests offer an opportunity to restudy material. In two experiments, students studied prose passages and took one or three immediate free-recall tests, without feedback, or restudied the material the same number of times as the students who received tests. Students then took a final retention test 5 min, 2 days, or 1 week later. When the final test was given after 5 min, repeated studying improved recall relative to repeated testing. However, on the delayed tests, prior testing produced substantially greater retention than studying, even though repeated studying increased students' confidence in their ability to remember the material. Testing is a powerful means of improving learning, not just assessing it.
Roediger, H. L., III, Wheeler, M. A., & Rajaram, S. (1993). Remembering, knowing, and reconstructing the past. In D. L. Medin (Ed.), The psychology of learning and motivation: Advances in research and theory (pp. 97-134). San Diego, CA: Academic Press.