Do electronic discussions create critical thinking spillovers?
Stephen B. DeLoach and Steven A. Greenlaw*
June 15, 2004
* This is a revision of a paper presented at the Western Economic Association International
conference in Seattle, WA, 2002. We are grateful for the thoughtful suggestions of Robert
Rycroft and Linda Manning during the evolution of this project. We have also benefited from
the comments of the anonymous referees and by numerous conference participants in the
NCEE/NAEE sessions at the Allied Social Science Associations conference in New Orleans,
2002, and the Eastern Economic Association conference in Washington, D.C, 2000. Financial
support for this project was provided by Elon University and Mary Washington College.
DeLoach: Associate Professor of Economics, Elon University, Campus Box 2075, Elon, NC
27244, Phone: 336-278-5943, E-Mail: deloach@elon.edu
Greenlaw: Professor of Economics, Mary Washington College, Fredericksburg, VA 22401,
Phone: 540-654-1483, E-mail: sgreenla@mwc.edu
ABSTRACT
Few academics question the relevance of critical thinking in higher education, yet there has been
little attempt to investigate which specific pedagogies aid in its development. In this study we
assess whether critical thinking can be taught effectively using electronic discussions. In most
discussions analyzed, the data show that the quality of a student's argument is positively
influenced by the quality of their peers' arguments (critical thinking spillovers). While the use of
this pedagogy is promising, best results require an appropriate topic as well as effective
management of the discussion.
I. INTRODUCTION
The importance of teaching critical thinking skills is gaining increased attention in the
economics literature. This attention can be traced back to Fels (1969), who identified it as an
area ripe for teaching and exploration by researchers. A number of articles propose specific
methods for promoting critical thinking in economics, including Cohen and Spencer (1993),
Feiner and Roberts (1995), Palmini (1996), Peterson and Bean (1998), and Elliot, Meisel and
Richards (1998). Although the value of critical thinking may be taken for granted in higher
education, there has been little attempt to assess the extent to which any of these exercises
achieve this goal.
One reason for the lack of assessment is the difficulty of measuring students’ critical
thinking.1 In recent work we developed a methodology for using electronic discussions to teach
critical thinking in economics courses (Greenlaw and DeLoach 2003). In that paper we
constructed a taxonomy for defining levels of critical thinking, as well as a procedure for using
that taxonomy to assess postings in an electronic discussion. While our earlier paper focused on
developing the methodology, the purpose of the present study is to test the effectiveness of this
new pedagogy for teaching critical thinking.
Generally speaking, electronic discussion can be defined as any collaborative class
activity organized to explore an issue, using an electronic medium such as electronic mail or
Web-based discussion lists. In our work, we require students to use a Web-based threaded
discussion list, over a period of two weeks, to investigate and develop a consensus about an
open-ended economics question. The asynchronous nature of the threaded discussion allows
participants the time to reflect on what others have said and how they wish to respond, while the
discrete time frame of the discussion focuses students’ attention and participation. Students
make multiple postings over the course of the typical discussion, usually averaging about six
posts per student. Ideally, these postings will provoke other students to weigh in with their own
opinions or challenge the logic of previous arguments. As a result, this interaction leads to
complex discussions that typically range from students trying to define terms and interpret the
question to ultimately evaluating competing opinions, based upon both economic theory and
empirical evidence.
The use of electronic discussion should be particularly effective because it increases the
efficiency of the learning process.2 While the medium is often viewed simply as an out-of-class
equivalent to standard classroom discussions, electronic discussions also possess many of the
advantages of “writing-to-learn” assignments. In electronic discussions, however, the crucial
feedback comes from students’ peers rather than from the instructor. Students are constantly
challenged to improve their answers by providing relevant backing for their opinions. Simply
put, there appears to be a critical thinking spillover effect.
In this paper we investigate whether student interaction in the typical electronic
discussion in upper-level macroeconomic courses (Intermediate Macroeconomics and Money
and Banking) leads to higher levels of critical thinking. Specifically, we test whether or not
critical thinking spillovers exist. Using the taxonomy of critical thinking skills outlined in
Greenlaw and DeLoach (2003) and estimating an educational production function along the lines
of Davisson and Bonello (1976), it is possible to test this hypothesis directly. In addition to
assessing the effectiveness of this particular technique, we also hope to shed light on a related
issue that has been debated in the economics education literature: the degree to which using the
internet enhances teaching and learning. Implicitly, the critical thinking spillover effect appears
to be at the core of much of the literature touting the advantages of online
activities.3 While anecdotal evidence abounds, there is little quantitative evidence on the
effectiveness of online activities. As Katz and Becker (1999) observe, “With the exception of a
few published pieces, such as Agarwal and Day (1998),4 we do not have much information on
the effects of the Internet on student outcome measures.”
II. MODELING CRITICAL THINKING IN ELECTRONIC DISCUSSIONS
The argument for the way in which electronic discussion enhances critical thinking
suggests a number of testable hypotheses. Students learn by challenging each other to produce
more coherent, lucid arguments and to provide evidence in support of their assertions. In a
regression model, a positive and significant coefficient on the lagged critical thinking score can
be interpreted as evidence of this intellectual spillover effect. In other words, holding individual
characteristics constant, the higher the previous score, the higher the expected score on
successive postings.5 Additionally, if the students as a group develop more sophisticated
arguments over the course of the discussion, we would expect to observe, on average, higher
scores over time, suggesting a positive trend in scores.
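To fix notation, these hypotheses can be written as a latent-variable model of the kind estimated in Section IV. The specification below is our own sketch, using the variable names defined later in the paper:

```latex
% Latent critical-thinking level of posting i; only the bracket SCORE_i is observed.
\begin{aligned}
  y_i^{*} &= \beta_1\,\text{LAGGED SCORE}_i + \beta_2\,\text{TREND}_i
             + \mathbf{x}_i'\boldsymbol{\gamma} + \varepsilon_i, \\
  \text{SCORE}_i &= j \quad\text{if } \mu_{j-1} < y_i^{*} \le \mu_j,
  \qquad j = 0, 1, \dots, 6,
\end{aligned}
```

where x_i collects the human capital, effort, and demographic controls introduced below. A critical thinking spillover corresponds to beta_1 > 0, and group-level learning over the course of the discussion corresponds to beta_2 > 0.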
In order to evaluate these hypotheses, it is necessary to specify a model of each student's
intellectual 'output' appropriate for an electronic discussion. According to Davisson and Bonello
(1976), there are three categories of inputs that enter the educational production function:
human capital, labor (effort), and technology (treatment).
(1) Output = f( K, L, A)
In our earlier paper (Greenlaw and DeLoach, 2003) we developed a methodology for assessing
the critical thinking in postings on electronic discussions, using a six-level taxonomy. This
taxonomy is summarized in Table 1. We used this approach to rate the postings in five
electronic discussions.6 Thus, our measure of output in an electronic discussion is the level of
critical thinking exhibited in each student posting.
Human Capital (K) depends upon individual characteristics. At any given time, we
expect some students to perform better than others based upon their previous performance,
exposure to a particular subject, or innate intellectual ability. In our model, we use the following
as a proxy for human capital:
(2) K = f( GPA, total credit hours completed, economics credit hours completed,
participation in previous discussions)
GPA measures the combination of past scholastic effort and innate ability.7 To the extent
that critical thinking is a skill used and learned across the disciplines, the quality of a student’s
postings may be higher if that student has completed more credit hours. At the same time,
discipline-specific knowledge clearly plays an important role. Since one of the higher levels of
critical thinking (see Table 1) involves constructing arguments based upon economic theory, the
more economics courses completed, the higher the expected level of critical thinking. We will
also control for whether or not the student has previously participated in a discussion of this
type.
The second factor in the production function is the labor, or effort component (L). This
is commonly couched in terms of the utilization of effort in the exercise. In this case the proper
measure would be the amount of time spent on the internet discussion. Unfortunately, this is not
directly observable in our data. Our only possible related measure is the number of postings
made by a particular student.
(3) L = f( number of postings)
There are some obvious shortcomings with the use of this proxy. Utilization should include the
time spent reading and thinking about postings as well as writing them. However, in the absence
of a survey instrument, we do not have access to that data. It will also be important to consider
the possibility of diminishing or even negative returns to the number of posts.
The third factor in the model accounts for how the technology, or treatment (A), affects
educational production. In other words, these are the factors associated with the increased
efficiency of this pedagogy.
(4) A = f( lagged score, days later, trend )
If a critical thinking spillover exists, the higher the level of critical thinking in the preceding
post, the greater the probability that the respondent will post at a high level. Therefore, one way
to assess the treatment is by using the lagged score, the level of critical thinking achieved by the
preceding post in the discussion, as a proxy. In addition to the lagged score effect, we think it is
important to consider the time it takes to respond to a posting. It is possible that the spillover
effect will be less potent if the respondent takes too long to make their posting. The discussion
moves fairly quickly at times and if a student waits too long, the discussion will have moved on
to another topic. This is captured by the number of days later a student responds. Moreover, it
is quite likely that as the discussion progresses, postings will reflect higher levels of critical
thinking as participants develop their argument and reach conclusions. For this reason, it is
important to include a thread trend variable in the model.
Finally, a number of demographic variables have been used in the economics education
literature in estimating educational production functions.8 In this study, the one demographic
that we are able to consider is gender.9 That gives us the following model:
(5) Output = f( GPA, total credit hours completed, economics credit hours
completed, participation in previous discussions, number of postings, lagged
score, days later, trend, gender)
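Although the original data were assembled by hand, the treatment proxies are mechanical enough to sketch in code. The following is a minimal illustration in Python/pandas under assumed column names (student, thread, date, score); it is not the study's actual pipeline:

```python
import pandas as pd

# Hypothetical posting log, one row per post; column names are ours,
# not the study's. Posts are ordered within each thread.
posts = pd.DataFrame({
    "student": ["s1", "s2", "s1", "s3", "s2"],
    "thread":  [1, 1, 1, 2, 2],
    "date":    pd.to_datetime(["1999-02-01", "1999-02-01", "1999-02-03",
                               "1999-02-02", "1999-02-05"]),
    "score":   [2, 3, 3, 1, 2],  # consensus critical thinking rating
}).sort_values(["thread", "date"]).reset_index(drop=True)

# LAGGED SCORE: rating of the immediately preceding post within the thread
# (per note 5, succession is defined by thread order, not clock time).
posts["lagged_score"] = posts.groupby("thread")["score"].shift(1)

# DAYS LATER: days elapsed since the preceding post in the thread.
posts["days_later"] = posts.groupby("thread")["date"].diff().dt.days

# TREND: position of the post within the overall discussion.
posts["trend"] = posts["date"].rank(method="first")

# # OF POSTINGS: how many posts the author has already made.
posts["n_prior_posts"] = posts.groupby("student").cumcount()

# Thread-opening posts have no predecessor and drop out of the sample,
# which is why usable observations fall short of the raw post counts.
reg_sample = posts.dropna(subset=["lagged_score"])
```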
III. DATA
To test for the existence of critical thinking spillovers, we analyzed individual student
postings from five separate, two-week long electronic discussions that were held during the
Spring 1999, Fall 1999 and Spring 2000 semesters.10 The five discussions consisted of 21, 31,
28, 60 and 25 students, respectively. Participants were students in Intermediate
Macroeconomics at our two institutions. The fourth discussion combined macro students from
one school with students taking Money and Banking at the other school.
Each assignment asked students to develop a consensus about a current issue in
macroeconomics. The instructions provided to the students for the discussion topics we use in
our analysis are given below. These questions are not much different from what one might
assign for a paper, except that students completed the assignment online in a discussion
format. The topics were chosen with the Greenlaw-DeLoach (2003) taxonomy in mind to
facilitate the development of critical thinking. These questions lend themselves to multiple
answers, requiring students to develop conclusions rather than merely look up facts.
In addition, the assignment explicitly prompts participants to provide evidence to support their
argument.
TOPIC 1: Discussions 1, 2 and 3
The objective of this discussion is for our class to develop a consensus view on a topic on
which the experts disagree: namely, what caused the decrease in productivity growth in
the U.S. economy that began about 1973. There are a variety of print and electronic
resources available to aid you in this endeavor, including your textbook. However,
before you do that, it would be a good idea to define your terms. Exactly what do we
mean by productivity growth? How do we measure it? What is the evidence that
productivity growth has declined? As you find information with a bearing on the
question, you should report it in the discussion so that other participants can check it out.
TOPIC 2: Discussion 4
The objective of this discussion is for participants to develop a consensus view on a topic
on which the experts disagree: namely, which concerns should the Fed take into
consideration when setting policy? The point of the discussion is to reach a conclusion,
not simply to post comments. Start by trying to agree on what the goals of the Fed
should be. After achieving consensus, consider the trade-offs that exist between the
various goals. Then discuss the issue of how the Fed should resolve the various trade-
offs. The first two issues should not take very long to resolve. The last issue is much
more involved. To argue your position you should use material from class, your
textbook, links from the professors' syllabi, the news, economic history, and your own
values. As you find information with a bearing on the question, you should report it in
the discussion so that other participants can check it out.
TOPIC 3: Discussion 5
The purpose of this electronic discussion is for participants to explore the use of fiscal
and monetary tools to find an “optimal” policy for the U.S. economy today. The U.S.
economy is currently coming out of a recession. How can we ensure a smooth recovery
without igniting other economic problems, for example, inflation? Your
objective in this discussion should be to develop a consensus on the appropriate stance
of monetary and fiscal policies for the near term. Specifically, you need to determine
two things: first, whether in light of the current state of the U.S. economy, fiscal policy
should be strongly expansionary, weakly expansionary, neutral, weakly contractionary,
or strongly contractionary, and second, whether monetary policy should be strongly
expansionary, weakly expansionary, neutral, weakly contractionary, or strongly
contractionary. There are a variety of print and electronic resources available to aid you
in this endeavor. You should consider economic and financial indicators, speeches of Fed
and other government officials, and the financial press, as well as your textbooks. But
while research on this topic can be helpful, what ultimately matters is your ability to
develop a persuasive argument supported by logic and empirical evidence.
Each student posting was scored holistically, following a widely used methodology for
assessing student writing developed by White (1985).11 Responses were rated 0 to 6 by two
graders working independently, following the taxonomy outlined in Table 1. In order to reduce
potential bias on the part of the graders, the posts were randomized prior to assessment. Table 2
summarizes the inter-rater reliability on the five different samples used in this study. While the
correlation coefficient may appear low, about 95% of the total posts were scored within one
level of each other. This meets typical standards of reliability (White 1985).
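For readers who wish to replicate the reliability check, the two statistics reported in Table 2 reduce to a correlation and a share of posts rated more than one level apart (our reading of the Percent column). A toy sketch:

```python
import numpy as np

# Two independent holistic ratings per post (toy values, not the study's data).
ratings_a = np.array([3, 2, 4, 1, 3, 2, 3, 0])
ratings_b = np.array([3, 3, 4, 2, 2, 2, 5, 1])

corr = np.corrcoef(ratings_a, ratings_b)[0, 1]
share_apart = np.mean(np.abs(ratings_a - ratings_b) > 1)  # >1 level apart

print(f"inter-rater correlation: {corr:.3f}")
print(f"rated more than one level apart: {share_apart:.1%}")
```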
The dependent variable, SCORE, is the consensus of the scores assigned by the two
raters. For example, if one rater assigns a 3 and the second a 4, the SCORE is coded as a 3, the
maximum level agreed upon by the two raters. The reason for choosing this method of scoring
rather than simply averaging the two rater scores is a practical one. Averaging the scores would
produce a limited dependent variable with up to thirteen possible outcomes, making
interpretation of the regression difficult at best. Precise definitions of all variables used in this
study are provided in Table 3.
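In code, the consensus rule is simply the elementwise minimum of the two ratings; a toy sketch with hypothetical values:

```python
import numpy as np

ratings_a = np.array([3, 2, 4, 1])
ratings_b = np.array([4, 3, 3, 1])

# SCORE: the highest level both raters agree the post reached, i.e., the
# elementwise minimum of the two ratings (a 3 and a 4 code as a 3).
score = np.minimum(ratings_a, ratings_b)  # -> array([3, 2, 3, 1])
```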
IV. ECONOMETRIC EVIDENCE
Due to the nature of the dependent variable (SCORE), an ordered logistic regression was
used to estimate the educational production function specified in the previous section. Each
individual response represents one observation. As a result, the regression estimates the
probability that a given individual posting achieves a particular level of critical thinking.12
Since there are potentially six levels of critical thinking in each sample, it is necessary to
estimate the marginal effects of each explanatory variable on the probability of a respondent
reaching each given level of critical thinking. Marginal effects are calculated at the means of
the explanatory variables.
results, along with the estimated marginal effects, are summarized in Tables 4-8.
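The estimation itself can be reproduced with modern tools. The sketch below uses statsmodels' OrderedModel, which postdates the original study; the variable names continue the hypothetical data frame from Section II, and the finite-difference step is one way to approximate the dy/dx columns of Tables 4-8:

```python
from statsmodels.miscmodels.ordinal_model import OrderedModel

# reg_sample: posting-level data frame sketched at the end of Section II,
# augmented with student covariates; all column names here are illustrative.
X = reg_sample[["lagged_score", "days_later", "trend", "n_prior_posts",
                "gpa", "total_credits", "econ_ratio", "male"]]
y = reg_sample["score"].astype(int)

# Ordered logit; the estimated cut points play the role of the intercepts
# reported in Tables 4-8, so no constant is included in X.
res = OrderedModel(y, X, distr="logit").fit(method="bfgs", disp=False)
print(res.summary())

# Marginal effect of LAGGED SCORE on each level's probability, evaluated
# at the sample means via a central finite difference.
xbar = X.mean().to_frame().T
eps = 1e-4
up, dn = xbar.copy(), xbar.copy()
up["lagged_score"] += eps
dn["lagged_score"] -= eps
dydx = (res.model.predict(res.params, exog=up)
        - res.model.predict(res.params, exog=dn)) / (2 * eps)
print(dydx)  # one column per critical thinking level
```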
Overall, the regression results are consistent with the hypothesis that a critical thinking
spillover effect exists, but only for one of the three discussion topics. In each of the
“productivity” discussions (Tables 4,5,6), the level of critical thinking achieved by the preceding
post in the discussion (LAGGED SCORE) and the number of days the respondent takes to reply
(DAYS LATER) have the predicted effects on SCORE. In particular, LAGGED SCORE has a
positive and significant effect on the probability that the current respondent makes a clear,
logical argument (level 3). Looking at the marginal effects, an increase in the LAGGED
SCORE results in a 20% to 50% relative increase in the probability that the respondent will post
a “level 3” argument; in Table 4, for instance, the dy/dx of 0.064 against a baseline probability
of .31 amounts to roughly a 21% increase. In two of the cases (Tables 4 and
6) the probability that the respondent uses economic theory as the basis for that argument (level
4) increases by 36% to 45%. The probability of achieving a “level 5” posting increases by 35%
in one discussion (Table 4). As expected, higher levels of the LAGGED SCORE result in
significant decreases in the probability that subsequent authors post lower-level responses (levels
0-2) in the “productivity” discussions.
As indicated by equation 4, the ultimate impact of the critical thinking spillover is likely
to depend on the length of time between the initial argument and the response posted by
subsequent students. Specifically, it is our hypothesis that the longer one takes to post a
response, the less likely it is for that response to raise the level of argumentation. In each of the
discussions, the number of days a student waits until replying (DAYS LATER) has the
hypothesized negative coefficient, and is statistically significant in two of the samples (Tables 4
and 6). One thing to note is that the marginal effects are smaller than for the spillover effect
(LAGGED SCORE). While the spillover effect can increase the probability of higher-quality
postings by anywhere from 20% to 50%, delaying the response one day decreases the probability
of “level 3” postings by 7% at most. The probability that one uses theory as the basis for the
response (level 4) decreases by 10% to 15% if the posting comes a day later. The longer one
waits, the less impact the spillover has on subsequent readers. In order to reap the full benefits
from the group interaction, students must be actively engaged in the discussion. Often what
happens is that a student comes in much later and replies long after the group has moved on to
another topic. Therefore, it may be important for the instructor to build in an adequate incentive
structure in the grading in order to encourage students to keep up with the discussion.13
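The paper does not prescribe a particular incentive scheme; as a purely hypothetical illustration of footnote 13's idea, points might decay with how late a reply arrives:

```python
# Hypothetical grading rule (ours, not the study's): credit each post, but
# discount posts made several days after the posting they respond to.
def participation_points(days_late, per_post=2.0, decay=0.5):
    """days_late: days elapsed, for each post, since the post it answers."""
    return sum(per_post * max(0.0, 1.0 - decay * d) for d in days_late)

print(participation_points([0, 0, 1, 3]))  # prompt replies earn full credit
```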
There seems to be some evidence that the quality of students’ postings increased on
average throughout some of these exercises. As the discussion progressed (TREND), the
probability of students posting at least a level 3 argument increased significantly in two of
the discussions (see Tables 4 and 8). While this does not necessarily indicate that students are
learning directly from each other (as is our hypothesis), it does show that learning is taking
place. Even in the last discussion (Table 8), where there was no evidence that critical thinking
spillovers existed, we do have evidence that students learned valuable critical thinking skills.
Interestingly, the degree of effort and prior experience14 with the exercise had little to do
with achievement in this assignment. In the samples (Tables 6 and 8) that necessitated this
dummy variable, prior experience with electronic discussions (EXPERIENCE) was
insignificant. Also, there is little evidence of a learning curve. Only in the last discussion
(Table 8) does increased utilization, or practice, lead to higher levels of critical thinking. In fact,
in one sample (Table 4) the number of previous postings (# OF POSTINGS) by a given author
has a negative effect on the probability of scoring higher levels of critical thinking. Caution
must be used in interpreting these results given the limitations of this measure. It does not
measure the amount of time the student spends working on the discussion. A large proportion of
the actual time spent is likely to be spent reading others’ postings, but there is no reliable way to
account for this in the samples used here. Furthermore, the number of postings is likely to be
endogenous: the better a student feels they are doing, the more postings they are likely to make.
The human capital variables had only limited impact upon the level of critical thinking.
If GPA is a measure of either ability or scholastic achievement, we would expect higher levels of
critical thinking out of students with higher GPAs. This was confirmed in most of the
regressions. However, neither total credit hours completed nor economics courses taken
(measured as a ratio of economics hours to total hours) had any significant effect on critical
thinking scores. The marginal effects, however, did have the expected signs on the various
levels of critical thinking. Obviously, we expected that relatively more exposure to economic
theory would increase the probability that a student would make clear and logical inferences
during discussions.
Of the various control variables tested (GENDER, CLASS), only gender had any
significant effects on the critical thinking score.15 It is particularly interesting that in some
discussions the males outperformed their female counterparts on average in the sense that the
coefficient on GENDER (Male=1) was positive. There may be a couple of reasons for this.
Males may be more willing to take a stand and defend their argument in the face of
disagreement. They may also be more willing to make a posting. On the other hand, the fact
that males may be more inclined to participate in these exercises should not be viewed as a
reason not to use this pedagogy. Electronic discussions are likely to induce students, both male
and female, who rarely speak up in classroom discussions to participate.
Interestingly, our evidence does not indicate any kind of an “instructor effect.” Each of
the two primary instructors involved in these experiments presided over some discussions that
succeeded in generating critical thinking spillovers and some that did not. However, it should be
noted that all of the professors involved in this study were experienced instructors of electronic
discussions. So, while our evidence suggests the topic has more to do with success than the
instructor, it is unlikely that an instructor attempting their first discussion of this type could be
expected to replicate these results. In short, there is likely to be a learning curve for instructors.
V. DISCUSSION AND PEDAGOGICAL IMPLICATIONS
Our assessment of this new pedagogy has revealed several key factors that affect the
relative success of these assignments for teaching critical thinking. Significant critical thinking
spillovers were only present in one of the three discussion topics tested. Even though all
discussions were constructed to facilitate critical thinking, the “causes of the productivity
slowdown” proved to be a significantly more effective topic for inducing the desired effects.
While the discussion topic appears to be the most crucial factor, the discussion must also be
effectively moderated. With respect to each factor, we suggest the following:
1. First, topics should be chosen based on student access to a sufficient amount of
economic literature relating directly to the issue. Second, there must be course-
appropriate economic theory that can be applied to the question by students
participating in the discussion. Third, if the goal of the discussion is to evaluate
competing arguments, there must be data readily available for students to use. While
all discussion topics met the first two criteria, the third appears to have been the
problem with the two policy debates (discussions 4 and 5). In these topics, students
were asked to decide on which concerns the Fed should take into consideration when
setting policy and on the appropriate stance of monetary and fiscal policies for the
near term. While better answers were based upon carefully constructed theory, it is
quite difficult for undergraduates to envision how they would evaluate the future
impacts of changes in monetary policy on the economy.
2. Good discussions need to be managed effectively. It is quite possible that there is
both an optimal size and an optimal length for discussions. With too many
participants, students are not able to “listen” to every comment; with too short a
discussion, there is not enough time to think about or reflect on what has been said.
Good discussions require 10 to 14 days, but running longer than two weeks seems
to be unproductive. Though there is a temptation to bring multiple classes together
for one “big” discussion, it appears such discussions are simply too large and
unmanageable for the participants. Both of the “failed” discussions appear to have
suffered one of these pitfalls. In Sample 4 (Table 7) there may have been too many
participants and too many postings. In Sample 5 (Table 8) the discussion ran for
only one week and was simply never fully developed. All other discussions ran for
at least two full weeks.
Finally, the results of this research have a number of implications for the allocation of
instructional resources. Overall, our evidence suggests that whatever resources exist for
instructors to use electronic discussions, the allocation should be relatively front-loaded. Most
of the work should occur before the discussion. First and foremost, instructors must think
carefully about the discussion assignment. Institutions can help by providing resources for
instructors to create such assignments. On the other hand, managing discussions, once
underway, is not as time-consuming as many imagine. Though instructors need to monitor the
progress of the discussion and be prepared to “step in” if the discussion becomes bogged down,
it is not necessary for instructors to comment on each and every posting. In fact, such a strategy
would likely have a chilling effect on the discussion, since students would cease to challenge
their peers’ postings and would instead sit back and wait for the instructor to tell them the
“right answer.” This
would likely nullify any critical thinking spillovers generated by the students themselves. We
see no evidence to suggest that significant time need be allocated for the purpose of managing or
grading the discussion.
VI. CONCLUSION
In this paper, we have attempted to assess whether or not critical thinking can be taught
effectively using electronic discussions. Under the right conditions, when the previous student’s
critical thinking score increases, the score obtained by the following student’s post increases as
well. This suggests the existence of what Greenlaw and DeLoach (2003) have called a “critical
thinking spillover effect.”
Our experimental design differs significantly from the typical literature on assessing the
relative worthiness of various pedagogical techniques, and has a number of advantages. First,
since we have not used the final course grade as our measure of success, we have avoided the
bias that plagues many similar studies. Second, since critical thinking cannot be assessed by
existing standardized multiple choice tests, scoring postings using a taxonomy of critical
thinking skills is appropriate. Third, unlike many studies on assessing various pedagogies, we
have collected data from two upper-level classes across several semesters, which were taught by
three different professors at two institutions. This variety has increased our understanding of the
relative strengths and weaknesses of this technique. This approach also has one notable
disadvantage. The question of whether observed spillovers have long-lasting effects on students’
thinking cannot be examined.
Even without such evidence of long-run effects, the results of this study are extremely
promising. Most importantly, the students in this study sharpened their critical thinking skills
without continuous feedback from the course instructor or the use of substantial class time.
Most pedagogies designed to promote critical thinking are highly labor and time intensive.
Electronic discussions provide a way to increase the amount of writing in a course without
imposing unreasonable grading burdens on the instructor. In short, this type of
exercise is not meant to replace instructor feedback, but rather complement it. Evidence shows
that it is not always easy to design structures that will lead to successful discussions. Rather,
careful time and effort must go into the design of the discussion question in order to stimulate
the hypothesized positive spillover effects.
TABLE 1
Greenlaw-DeLoach (2003) Taxonomy of Critical Thinking

LEVEL 0: Off-the-subject, or otherwise unscorable

LEVEL 1: Unilateral Descriptions
- Paraphrases information; repeats and restates the question
- Defines terms
- Simply repeats information
- Simple “good” or “bad” statements
- Adds little or nothing new to the issue or question

LEVEL 2: Simplistic Alternatives/Argument
- Takes a side but does not explore other alternatives; makes unsupported assertions and
  simplistic arguments
- An assertion, without evidence, often in the form of a question which modestly advances
  thinking; often synonymous with getting the discussion back on track
- Challenges an assertion, but without evidence
- Facts (beyond defining terms) relevant to the discussion, but no argument per se
- Simple explanations, e.g., giving an example
- Cites simple rules or “laws” as proof
- Does not address or explore conflicts with opposing views

LEVEL 3: Basic Analysis
- A serious attempt is made to analyze an argument or competing arguments and evaluate
  it/them with evidence
- Appeals to a recognized (appropriate) authority
- Casual observation, anecdote, datum (vs. data)
- Assertions with explicit evidence offered; or a reasoned challenge of another’s assertion,
  but without a clear logical framework
- A singular, Socratic-style question
- Often lists numerous factors as evidence, but does not integrate them within a logical
  framework
- No clear conclusion or choice between alternatives is made; e.g., when pressed for the
  “best” explanation, responds that both are equally valid

LEVEL 4: Theoretical Inference
- Employs (economic) theory to make a cohesive argument
- Logical statements based on the discipline’s accepted model/school(s) of thought
- Identifies assumptions
- Challenges a key assumption of another’s theory
- A series of logical, Socratic-style questions

LEVEL 5: Empirical Inference
- Adds to the level of sophistication by introducing empirical evidence to strengthen the
  theoretical argument
- Looks to appropriate historical data to “test” the validity of an argument
- Uses data to reach a clear conclusion, or to choose between alternative theories
- Requires at least an implicit logical framework
- Challenges the validity of another’s empirical measure/evidence

LEVEL 6: Merging Values with Analysis
- Moves beyond objective analysis to incorporate subjective interests
- May argue that while there is (positive) evidence to validate the use of a particular policy,
  there are other (normative) consequences that must be considered
- May select a particular policy, on some normative basis, from several which have positive
  evidence to support them
TABLE 2
Inter-rater Reliability of Electronic Discussions

                     Rater A   Rater B   Corr.   Ratings > 1 Level Apart
                                                 Percent     #     Total Posts
Disc 1   Mean         2.851     2.593    0.696   10.73%      19       177
         St. Dev.     1.227     1.298
Disc 2   Mean         2.692     2.711    0.511    4.45%       7       157
         St. Dev.     0.648     0.762
Disc 3   Mean         2.663     2.536    0.691    1.98%       5       253
         St. Dev.     0.774     0.839
Disc 4   Mean         2.712     2.485    0.636    2.02%       7       346
         St. Dev.     0.659     0.792
Disc 5   Mean         2.651     2.660    0.710    7.55%       8       106
         St. Dev.     1.015     1.077
Totals                                            4.3%       46     1,039
TABLE 3
Data Definitions
Score: combined score (0 to 6) obtained on each post. In other words the observations are
posting scores, not student scores. Obviously then, a single student is responsible for
numerous observations of the dependent variable.
Lagged Score: the score obtained on the posting immediately preceding the current post in its
thread. Note that this score typically measures the quality of a different student’s post,
though in rare cases a student may in fact respond to their own posting.
Days Later: the number of days that elapse between the preceding posting and the current one.
Trend: a linear trend variable to account for possible overall improvement throughout the
discussion.
# of Postings: number of previous postings made by the student in question.
GPA: cumulative GPA of the student prior to the semester.
Total Credits: total hours completed prior to the semester.
Econ Hours: ratio of credit hours previously completed in economics to total credit hours. The
ratio is used in order to avoid multicollinearity between economics credit hours and total
credit hours.
Gender: a 1,0 dummy to account for the author’s gender; Male = 1.
Class: a 1, 0 dummy to represent those students who were in the Intermediate Macroeconomics
class (0=Money and Banking) in the joint discussion held (Sample 4).
Experience: a 1,0 dummy variable to indicate whether the student had ever participated in a
previous electronic discussion prior to the current discussion.
TABLE 4
Ordered Logit Results from Sample Discussion 1
Marginal Effects dy/dx (with predicted probabilities)
Columns: Coefficient, Std. Error, Mean, then dy/dx for Level=0 (p=.06), Level=1 (p=.17),
Level=2 (p=.37), Level=3 (p=.31), Level=4 (p=.07), Level=5 (p=.02), Level=6 (p=.01)
Lagged Score 0.407** 0.124 2.68 -0.024** -0.049** -0.025 0.064** 0.025** 0.007* 0.002
Days later -0.119** 0.039 0.09 0.007** 0.014** 0.007 -0.019** -0.007** -0.002* -0.001
Trend 0.010** 0.003 89.85 -0.001** -0.001** -0.001 0.002** 0.001** 0.0002* 0.000
# of Postings -0.060* 0.031 6.02 0.004 0.007* 0.004 -0.009* -0.004* -0.001 -0.000
GPA -0.421 0.435 2.74 0.025 0.050 0.026 -0.066 -0.026 -0.007 -0.002
Total Credits -0.017** 0.008 68.37 0.001* 0.002** 0.001 -0.003** -0.001* -0.0003 -0.000
Econ Hours 0.723 1.979 0.25 -0.043 -0.086 -0.044 0.113 0.044 0.012 0.004
Male 0.755** 0.354 0.63
Intercept 1 -3.180* 1.775
Intercept 2 -1.677 1.748
Intercept 3 -0.076 1.734
Intercept 4 1.813 1.743
Intercept 5 3.308** 1.779
Intercept 6 4.775** 1.886
Notes: Number of usable posts = 175, Number of Students Responding = 21
LR Chi2(8) = 36.94, Log likelihood = -258.71, Pseudo R2 = 0.067
* and ** denotes significance at the 10% and 5% levels, respectively
TABLE 5
Ordered Logit Results from Sample Discussion 2
Marginal Effects dy/dx (with predicted probabilities)
Columns: Coefficient, Std. Error, Mean, then dy/dx for Level=0 (p=.00), Level=1 (p=.05),
Level=2 (p=.45), Level=3 (p=.49), Level=4 (p=.01), Level=5 (p=.00), Level=6 (p=.00)
Lagged Score 1.054** 0.306 2.69 - -0.055** -0.209** 0.251** 0.008 - -
Days later -0.025 0.100 0.05 - 0.001 0.005 -0.005 -0.000 - -
Trend 0.001 0.005 78.50 - -0.000 -0.000 0.000 0.000 - -
# of Postings -0.109 0.093 2.57 - 0.005 0.022 -0.026 -0.001 - -
GPA 1.384** 0.557 3.26 - -0.072* -0.274** 0.330** 0.011 - -
Total Credits 0.003 0.009 107.86 - -0.000 -0.001 0.001 0.000 - -
Econ Hours 0.782 2.757 0.22 - -0.041 -0.154 0.186 0.006 - -
Male 0.436 0.355 0.55
Intercept 1 4.824** 2.231
Intercept 2 7.681** 2.296
Intercept 3 12.08** 2.422
Intercept 4 13.21** 2.559
Intercept 5
Intercept 6
Notes: Number of usable observations = 154, Number of Students Responding = 31
LR Chi2 (8) = 22.95, Log likelihood = -135.71, Pseudo R2 = 0.078
TABLE 6
Ordered Logit Results from Sample Discussion 3
Marginal Effects dy/dx (with predicted probabilities)
Columns: Coefficient, Std. Error, Mean, then dy/dx for Level=0 (p=.01), Level=1 (p=.08),
Level=2 (p=.49), Level=3 (p=.39), Level=4 (p=.04), Level=5 (p<.01), Level=6 (p<.01)
Lagged Score 0.386** 0.174 2.58 -0.004 -0.027** -0.063** 0.083** 0.009* 0.001 0.001
Days later -0.123** 0.042 0.06 0.001 0.009** 0.020** -0.027** -0.003* -0.000 -0.000
Trend 0.002 0.002 125.5 -0.000 -0.000 -0.003 0.001 0.000 0.000 0.000
# of Postings -0.032 0.027 6.32 0.000 0.002 0.005 -0.007 -0.001 -0.000 -0.000
GPA 0.552* 0.335 2.90 -0.006 -0.039 -0.090 0.119 0.012 0.001 0.001
Total Credits 0.007 0.007 68.03 -0.000 -0.001 -0.001 0.001 0.000 0.000 0.000
Econ Hours 1.966 1.360 0.18 -0.022 -0.137 -0.320 0.424 0.044 0.000 0.005
Male 0.284 0.322 0.65
Experience 0.004 0.325 0.42
Intercept 1 -0.964 1.383
Intercept 2 1.160 1.289
Intercept 3 3.810 1.310
Intercept 4 7.013 1.368
Intercept 5 8.697 1.514
Intercept 6 9.405 1.673
Notes: Number of usable observations = 248, Number of Students Responding = 28
LR Chi2 (9) = 26.43, Log likelihood = -266.86, Pseudo R2 = 0.047
TABLE 7
Ordered Logit Results from Sample Discussion 4
Marginal Effects dy/dx (with predicted probabilities)
Columns: Coefficient, Std. Error, Mean, then dy/dx for Level=0 (p<.01), Level=1 (p=.04),
Level=2 (p=.58), Level=3 (p=.33), Level=4 (p=.05), Level=5 (p=.00), Level=6 (p=.00)
Lagged Score 0.121 0.171 2.58 -0.000 -0.005 -0.021 0.022 0.005 - -
Days later -0.021 0.029 0.64 0.000 0.001 0.004 -0.004 -0.001 - -
Trend -0.002 0.001 174.2 0.000 0.000 0.000 -0.000 -0.000 - -
# of Postings -0.044 0.056 2.81 0.000 0.002 0.008 -0.008 -0.002 - -
GPA 1.184** 0.263 3.07 -0.004 -0.050** -0.225** 0.222** 0.053** - -
Total Credits -0.020** 0.006 86.76 0.000 0.001** 0.004** -0.004** -0.001** - -
Econ Hours 0.753 1.402 0.20 -0.003 -0.032 -0.143 0.141 0.041 - -
Male 0.349 0.242 0.63
Class -0.304 0.313 0.52
Intercept 1 -4.099** 1.535
Intercept 2 -1.084 1.191
Intercept 3 2.447** 1.189
Intercept 4 4.889** 1.215
Intercept 5 8.072** 1.563
Intercept 6 - -
Notes: Number of usable observations = 342, Number of Students Responding = 60
LR Chi2 (9) = 43.62, Log likelihood = -339.59, Pseudo R2 = 0.060
TABLE 8
Ordered Logit Results from Sample Discussion 5
Marginal Effects dy/dx (with predicted probabilities)
Columns: Coefficient, Std. Error, Mean, then dy/dx for Level=0 (p=.01), Level=1 (p=.21),
Level=2 (p=.34), Level=3 (p=.39), Level=4 (p=.04), Level=5 (p=.02), Level=6 (p=.00)
Lagged Score -0.311 0.210 2.61 0.003 0.050 0.024 -0.059 -0.012 -0.006 -
Days later -0.100 0.097 0.10 0.001 0.016 0.0082 -0.018 -0.004 -0.002 -
Trend 0.020** 0.008 53.71 -0.000 -0.003* -0.001 0.004** 0.001* 0.000 -
# of Postings 0.157 0.097 2.53 -0.001 -0.025* -0.012 0.030* 0.006 0.003 -
GPA 0.560 0.490 2.93 -0.005 -0.090 -0.043 0.107 0.021 0.010 -
Total Credits -0.007 0.012 66.0 0.000 0.001 0.000 -0.001 -0.000 -0.000 -
Econ Hours 1.433 2.786 0.20 0.000 -0.231 -0.111 0.273 0.054 -0.027 -
Male 0.991* 0.569 0.74
Experience -0.565 0.519 0.66
Intercept 1 -2.520 1.873
Intercept 2 0.881 1.627
Intercept 3 2.373 1.640
Intercept 4 4.913** 1.692
Intercept 5 6.098** 1.747
Intercept 6 - -
Notes: Number of usable observations = 100, Number of Students Responding = 25
LR Chi2 (9) = 23.24, Log likelihood = -127.81, Pseudo R2 = 0.0834
REFERENCES

Agarwal, R. and E. Day. “The Impact of the Internet on Economic Education.” Journal of
Economic Education, 29(2), 1998, 99-110.

Chizmar, J. F. and M. S. Walbert. “Web-Based Learning Environments Guided by Principles of
Good Teaching Practice.” Journal of Economic Education, 30(3), 1999, 248-259.

Cohen, A. and J. Spencer. “Using Writing Across the Curriculum in Economics: Is Taking the
Plunge Worth It?” Journal of Economic Education, 24(3), 1993, 219-230.

Davisson, W. and F. Bonello. Computer-Assisted Instruction in Economics Education: A Case
Study. South Bend, IN: University of Notre Dame Press, 1976.

Elliot, D., J. Meisel, and W. Richards. “The Senior Project: Using the Literature of
Distinguished Economists.” Journal of Economic Education, 29(4), 1998, 312-320.

Feiner, S. and B. Roberts. “Using Alternative Paradigms to Teach about Race and Gender: A
Critical Thinking Approach to Introductory Economics.” American Economic Review, 85(2),
1995, 367-371.

Fels, R. “Hard Research on a Soft Subject: Hypothesis-Testing in Economic Education.”
Southern Economic Journal, 36(1), 1969, 1-9.

Greenlaw, S. A. “Using E-mail, Gopher, and Other Internet Tools to Enhance Your Teaching.”
Telecommunications in Education News, 6, 1995, 10-11.

Greenlaw, S. A. “Using Groupware to Enhance Teaching and Learning in Undergraduate
Economics.” Journal of Economic Education, 30(1), 1999, 33-42.

Greenlaw, S. A. and S. B. DeLoach. “Critical Thinking and Electronic Discussion.” Journal of
Economic Education, 34(1), 2003, 36-52.

Katz, A. and W. E. Becker. “Technology and the Teaching of Economics to Undergraduates.”
Journal of Economic Education, 30(3), 1999, 194-199.

Manning, L. “Economics and the Internet: Electronic Mail in the Classroom.” Journal of
Economic Education, 27(3), 1996, 201-204.

Palmini, D. “Using Rhetorical Cases to Teach Writing Skills and Enhance Economic Learning.”
Journal of Economic Education, 27(3), 1996, 205-216.

Peterson, D. and J. C. Bean. “Using a Conceptual Matrix to Organize a Course in the History of
Economic Thought.” Journal of Economic Education, 29(3), 1998, 262-273.

Siegfried, J. and R. Fels. “Research on Teaching College Economics: A Survey.” Journal of
Economic Literature, 17(3), 1979, 923-959.

Sosin, K. and W. E. Becker. “Online Teaching Resources: A New Journal Section.” Journal of
Economic Education, 31(1), 2000, 3-7.

Thoma, G. A. “The Perry Framework and Tactics for Teaching Critical Thinking in
Economics.” Journal of Economic Education, 24(2), 1993, 128-136.

Vachris, M. A. “Teaching Principles of Economics without ‘Chalk and Talk’: The Experience of
CNU Online.” Journal of Economic Education, 30(3), 1999, 292-302.

White, E. M. Teaching and Assessing Writing. San Francisco: Jossey-Bass, 1985.

NOTES

1. Fels (1969), Thoma (1993), and Peterson and Bean (1998) have all noted the difficulty of
measuring critical thinking. One reason for this is that higher-order cognitive skills are difficult
to assess with the traditional protocol of pre-testing and post-testing using multiple-choice
instruments (Fels 1969, Katz and Becker 1999, and Sosin and Becker 2000).

2. Most other authors outline various written assignments in which students learn to think better
over time only after getting feedback on their critical thinking process from a highly qualified
instructor experienced in teaching this type of activity. The drawback is that this is a very
labor-intensive process. Unfortunately, many professors will give up any serious attempt to do
this except in the case of small, seminar-style classes. In fact, it is highly likely that this type of
learning is left for senior seminars and upper-division courses.

3. See Greenlaw (1995), Manning (1996), Greenlaw (1999), Vachris (1999), and Chizmar and
Walbert (1999).

4. One important way the present study differs from Agarwal and Day (1998) is in the
experimental design. Following the typical approach in the economics education literature,
Agarwal and Day (1998) assess the use of a particular pedagogical tool by comparing two
groups of students, a treatment group and a control group. Siegfried and Fels (1979) point out
that such experiments fail to capture the efficiency gains realized by students, resulting in a bias
toward failure to reject the null hypothesis. We avoid this pitfall by testing only for the
existence of the critical thinking spillover effect, rather than testing to see if that leads
ultimately to higher test scores on the final exam.

5. Note that an electronic discussion has two distinct dimensions, time and thread. A thread is a
distinct line of conversation between participants. Each posting provokes one or more
responses, which themselves provoke responses. When we talk about “successive postings” we
mean the order within a thread, rather than date and time per se; since an electronic discussion
often has multiple threads, a posting that is successive in time may be unrelated to the previous
posting if the two are from different threads. Usually, subsequent postings are made by other
students.

6. This approach is explained in more detail in the next section of the paper.

7. As Siegfried and Fels (1979) note, SAT scores are typically found to be positively related to
achievement in economics courses. GPA is often assumed to measure overall student effort,
especially when combined with SAT scores, which measure aptitude. As a practical matter,
however, not all students had SAT scores available for the discussions we tested. As a result,
SATs cannot be used in this study without losing observations. The reason lost observations are
such a problem in this case is that we are in essence dealing with time series data; since we are
estimating the effect of a lagged dependent variable, it is vital to maintain the integrity of the
series.

8. See Siegfried and Fels (1979).

9. The age of the respondent was also considered initially, but there was not enough variation
across students to make a significant difference in the regressions. Because of this, age has been
omitted from the regressions reported in this paper. It is also important to note that in these
classes there was a significant lack of ethnic diversity which, like age, kept us from considering
these factors in our models.

10. One discussion lasted only one week. This is the one listed in the results as the fifth
discussion. As a result, this discussion has far fewer usable observations (100) than the other
samples.

11. See Greenlaw and DeLoach (2003) for a complete exposition of this scoring methodology.

12. Since each student posts multiple responses during a discussion, a given student may post
comments at several different levels of critical thinking over the course of the discussion. So,
while the student’s human capital has not changed significantly over this two-week period, they
do receive different “treatments” at various stages in the form of peer interaction.

13. One way we have done this in subsequent discussions is to provide homework points based
not only on the total number of posts during the discussion, but also on their timing.

14. In the third and fifth samples, a control dummy variable was included to account for
students who had done this type of exercise in previous classes. Usually these were students in
Money and Banking who had participated in discussions in their Intermediate Macroeconomics
class the previous semester, or vice versa. These were the only cases in which any participants
had prior experience with electronic discussions. Unexpectedly, this prior experience does not
appear to have contributed to higher levels of critical thinking on average.

15. CLASS was used only in sample four, the only case in which two separate classes
participated in the discussion. There was no significant difference between postings made by
students in the Intermediate Macroeconomics and the Money and Banking classes.