The MIT Deliberatorium: Enabling Large-Scale Deliberation About Complex Systemic Problems

Mark Klein
MIT Center for Collective Intelligence
Cambridge, MA 02139, U.S.A.
Abstract: Humanity now finds itself faced with highly complex challenges – ranging from climate change and the
spread of disease to international security and scientific collaborations – that require effective collective
multi-disciplinary decision making with large communities that are distributed in time and space. While
social computing tools (e.g. web forums, wikis, email, instant messaging, media sharing sites, social
networking, and so on) have created unprecedented opportunities for connecting and sharing on a massive
scale, they still fare poorly when applied to deliberation, i.e. the systematic identification, evaluation, and
convergence upon solutions to complex problems. Internet-mediated discussions are instead all-too-often
characterized by widely varying quality, poor signal-to-noise ratios, spotty coverage, and scattered content,
as well as dysfunctional dynamics for controversial issues.
This talk will present a novel integration of ideas taken from social computing and argumentation theory
that has, we believe, the potential to address these limitations. We will describe the underlying concepts, the
results of our evaluations to date, and some promising directions for future work.
Humanity now finds itself faced with a range of
highly complex problems – such as climate change,
the spread of disease, international security,
scientific collaborations, product development, and
so on - that call upon us to bring together large
numbers of experts and stakeholders to deliberate
collectively on a global scale. Collocated meetings
can however be impractically expensive, severely
limit the concurrency and thus breadth of
interaction, and are prone to serious dysfunctions
such as polarization and hidden profiles (Sunstein
2006). Social media such as email, blogs, wikis, chat
rooms, and web forums provide unprecedented
opportunities for interacting on a massive scale, but
have yet to realize their potential for helping people
deliberate effectively, typically generating poorly-
organized, unsystematic and highly redundant
contributions of widely varying quality. Large-scale
argumentation systems represent a promising
approach for addressing these challenges, by virtue
of providing a simple systematic structure that
radically reduces redundancy and encourages clarity.
They do, however, raise an important challenge.
How can we ensure that the attention of the
deliberation participants is drawn, especially in large
complex argument maps, to where it can best serve
the goals of the deliberation? How can users, for
example, find the issues they can best contribute to,
assess whether some intervention is needed, or
identify the results that are mature and ready to
“harvest”? Can we enable, for large-scale distributed
discussions, the ready understanding that
participants typically have about the progress and
needs of small-scale, collocated discussions?
This paper will address these important questions,
discussing (1) the strengths and limitations of
current deliberation technologies, (2) how large-
scale argumentation can help address these
limitations, and (3) how we can use novel
deliberation metrics to enhance the effectiveness of
deliberations mediated by argumentation systems.
Let us define deliberation as a process where
communities (1) identify possible solutions for a
problem, and (2) select the solution(s) from this
space that best meet their diverse needs (Walton and
Krabbe 1995) (Eemeren and Grootendorst 2003).
How well do existing technologies meet this need?
A wide range of social computing technologies have
emerged in the past few decades, including email,
chat, web forums, wikis like wikipedia, media
sharing sites like youtube and flickr, open source
software development efforts such as Linux, solution
competitions, idea-sharing systems, peer-filtering
sites such as Slashdot, group decision support (GDSS)
systems (Pervan and Atkinson 1995) (Luppicini
2007) (Gopal and Prasad 2000) (Macaulay and
Alabdulkarim 2005) (Powell, Piccoli et al. 2004),
and scientific collaboratories (Finholt 2002).
Experience with such systems has shown that they
foster, by virtue of reducing the cost of participation,
voluntary contributions at a vast scale, which in turn
can lead to remarkably powerful emergent
phenomena (Tapscott and Williams 2006) (Sunstein
2006) (Surowiecki 2005) (Gladwell 2002) that include:
Idea Synergy: the ability for users to share their
creations in a common forum can enable a
synergistic explosion of creativity, since people
often develop new ideas by forming novel
combinations and extensions of ideas that have
been put out by others.
The Long Tail: social computing systems enable
access to a much greater diversity of ideas than
they would otherwise: “small voices” (the tail of
the frequency distribution) that would otherwise
not be heard can now have significant impact.
Many Eyes: social computing efforts can
produce remarkably high-quality results by
virtue of the fact that there are multiple
independent verifications - many eyes
continuously checking the shared content for
errors and correcting them.
Wisdom of the Crowds: large groups of
(appropriately independent, motivated and
informed) contributors can collectively make
better judgments than those produced by the
individuals that make them up, often exceeding
the performance of experts, because their
collective judgment cancels out the biases and
gaps of the individual members.
To understand the strengths and limitations of
these technologies, it is helpful to divide them up
based on how they structure content. One category is
time-centric tools, i.e. tools like email, chat rooms,
and web forums where content is organized based on
when a post was contributed. Such systems enable
large communities to weigh in on topics of interest,
but they face serious shortcomings from the
perspective of enabling collective deliberation
(Sunstein 2006):
Scattered Content. The content in time-centric
tools is typically widely scattered, so it’s hard to
find all the contributions on a topic of interest.
This also fosters unsystematic coverage, since
users are often unable to quickly identify which
areas are well-covered, and which need more attention.
Low Signal-to-noise Ratio. The content captured
by time-centric tools is notorious for being
voluminous and highly repetitive. This is a self-
reinforcing phenomenon: since it can be
difficult to find out whether a point has already
been made in a large existing corpus, it’s more
likely that minor variants will be posted again
and again by different people. Some authors
may do so simply hoping to win arguments by
sheer repetition. This low signal-to-noise ratio
makes it difficult to uncover the novel
contributions that inspire people to generate
creative new ideas of their own.
Balkanization. Users of time-centric systems
often tend to self-assemble into groups that
share the same opinions – there is remarkably
little cross-referencing, for example, between
liberal and conservative blogs and forums – so
they tend to see only a subset of the issues,
ideas, and arguments potentially relevant to a
problem. This tends to lead people to take on
more extreme, but not more broadly informed,
versions of the opinions they already had.
Dysfunctional Argumentation. Time-centric
systems do not inherently encourage or enforce
any standards concerning what constitutes valid
argumentation, so postings are often bias- rather
than evidence- or logic-based.
Enormous effort is typically required to
“harvest” the corpuses created by time-centric tools
to identify the most important issues, ideas, and
arguments. Intel, to give a typical example, ran a
web forum on organizational health that elicited a
total of 1000 posts from 300 participants. A post-
discussion analysis team invested over 160 person-
hours to create a useful summary of these
contributions (at 10 minutes a post, probably longer
than it took to write many of the posts in the first
place). The team found that there was lots of
redundancy, little genuine debate, and few
actionable ideas, so that in the end many of the ideas
they reported came from the analysis team
members themselves, rather than the forum.1
Figure 1: A screenshot from the Deliberatorium, a large-scale argumentation system.
It could be argued that many of these concerns
are less prominent in topic-centric tools such as
wikis and idea-sharing systems. In wikis, for
example, all the content on a given topic is captured
in a single article. But wikis are deeply challenged
by deliberations on complex and controversial topics
(Kittur, Suh et al. 2007) (Viegas, Wattenberg et al.
2004). They capture, by their nature, the “least-
common-denominator” consensus between many
authors (any non-consensus element presumably
being edited out by those that do not agree with it),
and the controversial core of deliberations are
typically moved to massive talk pages for the article,
which are essentially time-centric venues prone to
all the limitations we noted above.
1 Based on personal communication with Catherine Spence,
Information Technology Enterprise Architect, Computing
Director/Manager at Intel.
Idea-sharing tools – such as those run by Dell, the
Obama administration (the Open for Questions web
site), and Google – are organized
around questions: one or more questions are posted
and the community is asked to contribute, rate, and
comment on proposed solutions. Such sites can elicit
huge levels of activity – the Obama site for example
elicited 70,000 ideas and 4 million votes in three
weeks – but they are prone to several serious
shortcomings. One is redundancy: in all of these
sites, many of the ideas represent minor variations of
each other. When there are thousands of posts
submitted, manually pruning this list to consolidate
equivalent posts is a massive undertaking. In
Google’s case, for example, the company had to
recruit 3,000 employees to filter and consolidate the
150,000 ideas they received in a process that put
them 9 months behind their original schedule.
Another issue is non-collaborativeness. Idea-sharing
sites tend to elicit many fairly simple ideas. The
ideas generated by the google project, for example,
(e.g. make government more transparent, help social
entrepreneurs, support public transport, create user-
generated news services) were in large part not
novel and light on detail. Surely that massive
amount of effort could have been used to compose a
smaller number of more deeply-considered ideas,
but idea-sharing sites provide little or no support (or
incentive) for this, because people cannot
collaboratively refine submitted ideas.
Large-scale argumentation represents a promising
approach to addressing the weaknesses with current
deliberation technologies. We describe this approach below.
Argumentation tools (Kirschner, Shum et al.
2005) (Moor and Aakhus 2006) (Walton 2005) take
an argument-centric approach based on allowing
groups to systematically capture their deliberations
as tree structures made up of issues (questions to be
answered), ideas (possible answers for a question),
and arguments (statements that support or detract
from an idea or argument) that define a space of
possible solutions to a given problem.
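The IBIS tree described above can be sketched as a small typed data structure. The class names and attachment rules below are an illustrative reading of the formalism, not the Deliberatorium's actual implementation:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class PostType(Enum):
    ISSUE = "issue"  # a question to be answered
    IDEA = "idea"    # a possible answer to an issue
    PRO = "pro"      # an argument supporting its parent
    CON = "con"      # an argument detracting from its parent

# Which parent types each post type may attach to, per the IBIS
# conventions (None marks a valid tree root).
VALID_PARENTS = {
    PostType.ISSUE: {None, PostType.IDEA, PostType.PRO, PostType.CON},
    PostType.IDEA: {PostType.ISSUE},
    PostType.PRO: {PostType.IDEA, PostType.PRO, PostType.CON},
    PostType.CON: {PostType.IDEA, PostType.PRO, PostType.CON},
}

@dataclass
class Post:
    title: str
    type: PostType
    parent: Optional["Post"] = None
    children: list = field(default_factory=list)

    def attach(self, child: "Post") -> "Post":
        """Attach a child post, enforcing the IBIS typing rules."""
        if self.type not in VALID_PARENTS[child.type]:
            raise ValueError(
                f"a {child.type.value} cannot attach to a {self.type.value}")
        child.parent = self
        self.children.append(child)
        return child
```

Note how the typing rules also allow an issue to be raised under an idea, which is what lets one user propose an idea and a second raise an implementation issue beneath it.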
Such tools have many advantages. Every unique
point appears just once, radically increasing the
signal-to-noise ratio. All posts must appear under
the posts they logically refer to, so all content on a
given question is co-located in the tree; this makes it
easy to find what has and has not been said on any
topic, fosters more systematic and complete
coverage, and counteracts balkanization by putting
all competing ideas and arguments right next to each
other. Careful critical thinking is encouraged,
because users are required to express the evidence
and logic in favor of the options they prefer (Carr
2003), and the community can rate each element of
their arguments piece-by-piece. Users, finally, can
collaboratively refine proposed solutions. One user
can, for example, propose an idea, a second raise an
issue concerning how some aspect of that idea can
be implemented, and a third propose possible
resolutions for that issue. The value of an argument
map can extend far beyond the deliberation it was
initially generated for, because it represents an entire
design space of possible solutions that can be readily
harvested, refined and re-combined by other
communities facing similar problems.
Most argumentation systems have been used by
individuals or in small-scale settings, relying in the
latter case on a facilitator to capture the free-form
interactions of a collocated group in the form of a
commonly-viewable argument map (Shum, Selvin et
al. 2006). Argumentation systems have also been
used, to a much lesser extent, to enable distributed
deliberations over the Internet (Jonassen and Jr
2005) (Chklovski, Ratnakar et al. 2005) (Lowrance,
Harrison et al. 2001) (Karacapilidis, Loukis et al.
2004) (Heng and de Moor 2003) (Rahwan 2008).
These maps tend to be poorly structured, however,
because many users are not skilled argument
mappers, and the scale of participation has been
small2, typically involving only a handful of authors
on any given task.
The author and his colleagues have investigated,
over the past several years, how an argument-centric
approach can be extended to operate effectively at
the same large scales as other social computing
systems. Our approach is simple. Users are asked to
create, concurrently, a network of posts organized
into an argument map. We use the IBIS
argumentation formalism (Conklin 2005) because it
is simple and has been applied successfully in
hundreds of collective decision-making contexts. A
set of community conventions (similar to those that
underlie other social computing systems like
Wikipedia and Slashdot) help ensure that the
argument map is well-organized. Each post should
represent a single issue, idea, pro, or con, should not
replicate a point that has been made elsewhere in the
argument map, and should be attached to the post it
logically refers to. A central tenet is the “live and let
live” rule: if one disagrees with an idea or argument,
the user should not change that post to undermine it,
but should rather create new posts that present their
alternative ideas or counter-arguments. Every
individual can thus present their own point of view,
using the strongest arguments they can muster,
without fear of sabotage by anyone else. This
process is supported by capabilities that have proven
invaluable in other social computing systems,
including rating (to help the community encourage
and identify important issues, ideas and arguments),
watchlists (which automatically notify users of
changes to posts they have registered interest in),
version histories (to allow users to roll back a post
to a previous version if it has been “damaged” by an
edit), and home pages (which allow users to
develop an online presence). The system also
2 The one exception we are aware of (the Open Meeting
Project’s mediation of the 1994 National Policy Review
(Hurwitz 1996)) was effectively a comment collection
system rather than a deliberation system, since the
participants predominantly offered reactions to a large set
of pre-existing policy documents, rather than interacting
with each other to create new policy options.
provides multiple forms of social translucence
(Erickson, Halverson et al. 2002) (i.e. visual cues
concerning who is doing what in the system),
thereby fostering a sense of belonging as well as
enabling self-organized attention mediation by the
community. See (Klein 2007) for further discussion
of the issues underlying the design of large-scale
argumentation capability. The system itself is
accessible at
Because good argument-mapping skills are not
universal, moderators help ensure that new posts are
correctly structured. Their job is part education, and
part quality control. Posts, when initially created, are
given a “pending” status and can only be viewed by
other authors. If a post doesn’t adequately follow the
argument map conventions, moderators will either
fix it or leave comments explaining what needs to be
done. Once a moderator has verified that a post
follows the conventions, the post is “certified” and
becomes available to be viewed, edited, commented
on, or rated by the general user population. The
certification process helps ensure well-structured
maps, and provides incentives for users to learn the
argument formalism. Moderators serve as honest
brokers in all this: their role is not to evaluate the
merits of a post, but simply to ensure that the content
is structured in a way that maximizes its utility to the
community at large.
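The pending-to-certified life cycle just described can be modeled as a simple state machine. This is a minimal sketch; the class and method names are hypothetical, but the behavior follows the workflow above:

```python
from enum import Enum

class Status(Enum):
    PENDING = "pending"      # visible only to other authors
    CERTIFIED = "certified"  # visible to the whole community

class ModeratedPost:
    """Sketch of a post moving through the moderation workflow."""

    def __init__(self, text):
        self.text = text
        self.status = Status.PENDING
        self.moderator_comments = []

    def review(self, follows_conventions, comment=None):
        """A moderator either certifies the post or, if it does not
        follow the argument-map conventions, explains what to fix."""
        if follows_conventions:
            self.status = Status.CERTIFIED
        elif comment:
            self.moderator_comments.append(comment)
        return self.status

    def visible_to(self, user_is_author):
        """Pending posts are hidden from the general user population."""
        return self.status is Status.CERTIFIED or user_is_author
```

Note that `review` never judges the merits of the post's content, only whether it follows the structural conventions, which mirrors the moderators' honest-broker role.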
We have implemented an initial version of these
ideas, in the form of a web-based tool called the
Deliberatorium (Klein and Iandoli 2008) (Iandoli,
Klein et al. 2009), and evaluated it to date with over
700 users deliberating on a wide range of topics. The
largest evaluation was performed at the University
of Naples with 220 masters students in the
information engineering program, who were asked
to use the system to deliberate, over a period of three
weeks, about the use of bio-fuels in Italy (Klein
2008). We observed a very high level of user
participation: all told, the students posted over 3,000
issues, ideas, and arguments, in addition to 1,900
comments. This is, to our knowledge, both the
largest single argument map ever created, as well as
(by far) the largest number of authors for a single
argument map. Roughly 1800 posts were eventually
certified, and about 70% of all posts could be
certified without changes, demonstrating that, even
after a relatively short usage period, most authors
were able to create properly-structured posts. The
certification ratio, in addition, increased over the
duration of the experiment. The breadth and depth of
coverage was, in the judgment of content experts,
quite good: this community of non-experts was able
to create a remarkably comprehensive map of the
current debate on bio-fuels, complete with
references, exploring everything from technology
and policy issues to environmental, economic and
socio-political impacts. We estimated, based on this
experience, that there needs to be about 1 moderator
for every 20 active authors, to ensure that posts are
checked and certified in a timely fashion without
undue burden on each moderator. This figure is well
within the bounds of the percentage of “power
users” that typically emerge in social computing user communities.
Other evaluations (including a deliberation with
120 students at the University of Zurich, with 73
users at Intel, and with 40 users at the US Federal
Bureau of Land Management), have explored the
efficacy of our large-scale argumentation tool for a
range of topics and incentive structures. These
evaluations support the idea that large-scale
argumentation can be applied effectively to complex
challenges. Substantial user communities with no
initial familiarity with argumentation formalisms
have been able, in a range of contexts, to rapidly
create substantive, useful, compact, and well-
organized maps on complex topics, while requiring
levels of moderator effort much lower than those
needed to harvest, post-hoc, discussions hosted by
such conventional social computing tools as web forums.
Our mathematical analyses show that the per-
moderator burden, as well as the cost-benefit ratio
for authors, should decrease substantially as the user
community grows, suggesting that the incentives
will be especially compelling for larger-scale deliberations.
While these results are promising, our work has
led us to conclude that, to fully realize
argumentation technology’s potential for supporting
large-scale deliberations, we need to address the
critical challenge of attention allocation. For the
kinds of topics that most require large-scale
deliberation, even a moderately large user
community can quickly generate large and rapidly
growing argument maps. How can we help users
identify the portions of the map that can best benefit
from their contributions, in maps that cover
hundreds of topics? How can the stakeholders for
such deliberations assess whether the deliberations
are progressing well, whether some intervention is
needed to help the deliberations work more
effectively, and when the results are mature and
ready to “harvest”? Can we foster, for large-scale
deliberations, the understanding that participants in
small-scale discussions typically have about where
the discussion has gone, what remains to be
addressed, and where they can best contribute?
Figure 2: Using metrics to enable attention mediation in large-scale deliberations.
Without this kind of big picture, we run the risk of
severely under-utilizing the collective intelligence
potentially provided by large-scale social media.
We can meet this challenge, we believe, by
developing a set of algorithms that can be used to
provide users with a personalized and continuously-
updated set of suggestions, based on deliberation
metrics, concerning which parts of the argument
map they should view, add to, or rate, and why.
Each user is free to accept or ignore suggestions
as they like, but they know that the suggestions are
based on an overview of the deliberation as a whole
and are intended to help them apply their unique
skills and perspectives to promising regions in the
map. If the suggestions are reasonably well-done,
the emergent effect is that the collective intelligence
of the user community is maximized because each
user contributes where they can do the most good.
How can such suggestions be generated? This
can be done, we believe, by building on a process we
call process-commitment-exception analysis (Klein
2003). First we define a normative model that
specifies what a good large-scale deliberation looks
like, including its main steps, commitments, and
failure modes (exceptions). Each commitment and
exception is then mapped to one or more metrics
intended to assess (by analyzing user activity data
and the emerging argument map structure) to what
extent the commitment is being achieved or,
conversely, to what extent the exception is taking
place. These metric values are then mapped, based
on a model of the users’ roles and interests, into
customized suggestions. We discuss these steps in
more detail in the paragraphs below.
A Normative Model of Large-Scale Deliberation
Our normative deliberation model formalizes a
straightforward view of what makes up a rational
decision-making process. According to this model,
deliberation consists of four key steps:
1. Identify the goals the deliberation is trying to achieve
2. Propose possible ways to achieve these goals
3. Evaluate the proposed ideas with respect to the
deliberation's goals
4. Select the best idea(s) from amongst the
proposed solutions
The commitments and exceptions in this
model include:
Figure 3: A (partial) normative model enumerating deliberation commitments and exceptions.
The commitments for the “identify goals” step,
for example, include identifying all relevant goals
for the deliberation, which in turn is enabled by
getting input from all stakeholders for the decision
being made.
This generic deliberation model is then
elaborated to include sub-steps that specify how
these main steps are implemented in the context of a
large-scale argumentation system. For example, the
commitment “easy to identify gaps” of the step
“identify possible decisions” is implemented, in an
argumentation system, by capturing possible
decisions as idea posts and placing them in the
correct part of the argument map so that all the ideas
for an issue are grouped together, making it easy to
see what has and has not been proposed for that
issue. Each of these additional steps may imply
additional commitments and exceptions.
Identifying Metrics
The next step is to identify metrics that can use the
information generated during a large-scale
argument-centric deliberation to assess whether the
deliberation commitments are being achieved, and
whether the potential exceptions are occurring. We
have identified over 100 possible metrics to date and
describe, below, a few illustrative examples,
highlighting those that take advantage of the
additional semantics provided by an argument map:
Balkanization: balkanization is the phenomenon
wherein a community divides itself into sub-
groups whose members agree with one another but
tend to reflexively ignore the inputs of other
groups that they do not agree
with. This can be viewed as a deliberation
dysfunction because it violates the goal
“individuals fully consider the options and
tradeoffs” of the “select the best decision” step
of our normative deliberation model. The
structure of the argument map makes it clear
which ideas represent alternatives for a given
issue, as well as which arguments support and
detract from these ideas, making it
straightforward to assess when groups are
ignoring the ideas, and supporting arguments,
for competing ideas.
Groupthink: groupthink can be defined as
occurring when a community prematurely
devotes an excessive proportion of its
attentional resources to a small subset of the
relevant issues, ideas and arguments. This is
straightforward to assess in an argument map
because we can readily measure when, for
example, one idea under an issue is receiving
the bulk of the community’s attention while
competing ideas and their underlying variants
and arguments have remained largely untouched.
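For illustration, such a groupthink measure could be computed as an attention-concentration score over the competing ideas under one issue. The entropy-based formula below is one plausible choice among many, not a formula the system is committed to:

```python
from math import log

def groupthink_score(attention_counts):
    """
    Attention concentration across the competing ideas under one issue.
    Returns a value in [0, 1]: 0 means attention is spread evenly,
    1 means all attention is on a single idea. Computed as one minus
    the normalized Shannon entropy of the attention distribution.
    `attention_counts` maps idea -> views/ratings/edits, an assumed
    proxy for community attention.
    """
    counts = [c for c in attention_counts.values() if c > 0]
    n = len(attention_counts)
    if n < 2 or not counts:
        return 0.0  # no competing ideas, or no activity yet
    total = sum(counts)
    entropy = -sum((c / total) * log(c / total) for c in counts)
    return 1.0 - entropy / log(n)
```

A score near 1 under an issue with several live alternatives would flag a candidate for moderator or stakeholder attention.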
Irrational Bias: we define irrational bias as
occurring when a user gives ratings for ideas or
arguments that are inconsistent with the ratings
they give the underlying arguments. We can use
simple techniques to propagate a user’s ratings
for arguments up to produce a predicted rating
for the higher level arguments/ideas, and then
compare that with the actual ratings they give
these posts.
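A minimal sketch of this propagation, assuming ratings on a 1 to 5 scale and an averaging scheme of our own choosing (the real propagation techniques may differ):

```python
def predicted_rating(pro_ratings, con_ratings, neutral=3.0):
    """
    Propagate a single user's argument ratings (1-5 scale assumed)
    up to a predicted rating for the parent idea: start from a
    neutral score and shift by how strongly the user rated the
    pros versus the cons. The weighting is an illustrative assumption.
    """
    pro = sum(pro_ratings) / len(pro_ratings) if pro_ratings else neutral
    con = sum(con_ratings) / len(con_ratings) if con_ratings else neutral
    # Strong pros push the prediction up, strong cons push it down.
    raw = neutral + (pro - neutral) / 2 - (con - neutral) / 2
    return max(1.0, min(5.0, raw))

def bias_score(actual_rating, pro_ratings, con_ratings):
    """Discrepancy between the user's actual rating and the one
    predicted from their own argument ratings; large values suggest
    an irrational bias."""
    return abs(actual_rating - predicted_rating(pro_ratings, con_ratings))
```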
Mature Topics: a mature topic is one where a
fairly exhaustive inventory has been made of
the relevant ideas and arguments. This can be
estimated in a number of ways, including tree
topology (more mature topics tend to have both
broader and deeper structures), activity history
(argument-centric deliberations tend to
transition, over time, from identifying issues to
proposing ideas to presenting arguments to
rating posts to quiescence), and so on.
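The tree-topology heuristic can be sketched as follows; the depth and breadth normalizing constants are purely illustrative assumptions:

```python
def maturity_score(issue, max_depth=4, max_branch=5):
    """
    Estimate topic maturity from tree topology alone: deeper and
    broader sub-trees suggest a more exhaustive inventory of ideas
    and arguments. `issue` is a nested dict {child_title: subtree}.
    The normalizing constants max_depth and max_branch are assumed.
    """
    def depth(node):
        return 1 + max((depth(c) for c in node.values()), default=0)

    def breadth(node):
        kids = list(node.values())
        return max([len(kids)] + [breadth(c) for c in kids]) if kids else 0

    d = min(depth(issue) / max_depth, 1.0)
    b = min(breadth(issue) / max_branch, 1.0)
    return (d + b) / 2  # in [0, 1]; higher suggests a more mature topic
```

In practice this topology signal would be combined with the activity-history signal (the issue-to-idea-to-argument-to-rating transition) rather than used alone.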
Controversial posts: we can identify
controversial posts by looking not
only for posts with many highly divergent
ratings, but also for posts that have polarized
rating distributions for the underlying
arguments. The fact that each post represents a
single logical point (issue, idea, or argument)
rather than (as is often the case with other social
media) a collection of points, means that the
ratings give a more accurate picture of the
community’s assessment of each point.
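One simple way to detect such polarized rating distributions, for illustration (the thresholds and the balance weighting are our assumptions):

```python
def polarization(ratings, low=1, high=5):
    """
    Flag a post as controversial when its ratings cluster at both
    extremes of the scale. Returns the fraction of ratings at the
    extremes, scaled by how evenly balanced the two camps are, so a
    uniformly loved (or hated) post scores 0.
    """
    if not ratings:
        return 0.0
    lows = sum(1 for r in ratings if r <= low + 1)
    highs = sum(1 for r in ratings if r >= high - 1)
    extreme = (lows + highs) / len(ratings)
    balance = 1 - abs(lows - highs) / (lows + highs) if lows + highs else 0.0
    return extreme * balance  # in [0, 1]
```

Running the same measure over the ratings of a post's underlying arguments, as suggested above, would catch controversies that the post's own ratings mask.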
A large-scale argumentation system requires that
users parse their contributions into topically-
organized structures of typed, individually-ratable
issues, ideas, and arguments. This structure provides,
as we can see, rich fodder for such powerful
techniques as social network analysis, belief
propagation, singular value decomposition, and so
on. This in turn makes it possible to define real-time
deliberation metrics that, for conventional social
media, would require an impractical level of hand-
coding for most settings.
Generating Suggestions
The final step of our approach involves generating
suggestions for users concerning which posts they
might want to look at in order to contribute most
effectively to the deliberation at hand. This is done
by identifying, based on a user model, which metrics
a user “should” be interested in, and then drawing
their attention to parts of the argument map where
these metrics have extreme values. The user’s
interests can be inferred based on their role, as well
as their past activity and that of other members of
the community. A topic manager (someone
responsible for ensuring a deliberation achieves
useful results) might, for example, be interested in
identifying parts of the deliberation that are mature
and ready to be “harvested” or, conversely, that are
dysfunctional (e.g. exhibiting balkanization or
groupthink) and need some kind of intervention. An
author might be interested in being notified of
controversies that have arisen in an area they
previously contributed to, of pet ideas whose support
has dropped and might be revived by the addition of
additional supportive arguments, or of posts whose
ratings appear to exhibit an irrational bias. In
our current implementation, users are presented
these suggestions in the form of an argument map
subset wherein the suggested posts are highlighted
and the reasons for the highlighting appear when
they roll over the post:
Figure 4: The personalized suggestions display.
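The role-to-metric mapping described above can be sketched as follows. The role names, metric names, and ranking scheme are illustrative assumptions, chosen to show how a suggestion carries its own explanation:

```python
# Which metrics each role "should" be interested in (assumed mapping).
ROLE_METRICS = {
    "topic_manager": ["maturity", "balkanization", "groupthink"],
    "author": ["controversy", "support_drop", "irrational_bias"],
}

def suggest(role, metric_values, top_n=3):
    """
    metric_values: {post_id: {metric_name: value in [0, 1]}}.
    Rank posts by the most extreme value among the metrics relevant
    to this role, returning (post_id, metric, value) triples so the
    UI can explain *why* each post is highlighted.
    """
    relevant = ROLE_METRICS.get(role, [])
    scored = []
    for post_id, metrics in metric_values.items():
        hits = [(m, v) for m, v in metrics.items() if m in relevant]
        if hits:
            metric, value = max(hits, key=lambda mv: mv[1])
            scored.append((post_id, metric, value))
    return sorted(scored, key=lambda s: s[2], reverse=True)[:top_n]
```

Because each suggestion names the metric that triggered it, the highlighted posts can carry roll-over explanations of the kind shown in Figure 4.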
The emergent effect of these automatically-
generated suggestions, we believe, will be to help
ensure that each part of the deliberation receives
attention from, and is fully developed by, the participants
with the most interest and knowledge on the topic.
The key contribution of this work is to explore how
automated algorithms can generate real-time metrics
that help users allocate their deliberation efforts, in
an argument map context, to where they can do the
most good. This approach, if executed well,
synergistically harnesses the creativity and judgment
of human communities along with the ability of
computational systems to rapidly summarize and
visualize large data sets.
While there has been substantial effort devoted
to manually-coded, post-hoc metrics on the efficacy
of on-line deliberations (Steenbergen, Bächtinger et
al. 2003) (Stromer-Galley 2007) (Trénel 2004)
(Cappella, Price et al. 2002) (Spatariu, Hartley et al.
2004) (Nisbet 2004), existing deliberation
technologies have made only rudimentary use of
automated real-time metrics to foster better
emergent outcomes during the deliberations
themselves. The core reason for this lack is that, in
existing deliberation tools, the content takes the
form of unstructured natural language text, limiting
the possible deliberation metrics to the analysis of
word frequency statistics, which is a poor proxy for
the kind of semantic understanding that would be
necessary to adequately assess deliberation quality.
One of the important advantages of using argument
maps to mediate deliberation is that they allow us,
by virtue of their additional semantics, to
automatically derive metrics that would require
resource-intensive manual coding for more
conventional social media such as web forums. We
are aware of one other effort to develop real-time
deliberation metrics for large-scale argument
mapping, but this work (Shum, Liddo et al. in press)
is based on measuring how well the deliberations
adhere (e.g. in terms of audibility, simultaneity of
messages, and mobility of the participants) to a
normative model of small-scale, physically
collocated conversations (Clark and Brennan 1991).
Our work is unique, we believe, in how it attempts
to assess (and improve) how effectively large groups
are deliberating (i.e. exploring and converging on
problem solutions) rather than just how well
individuals are conversing.
Our work to date has been largely conceptual,
focusing on identifying what kinds of metrics could
foster better emergent properties in large-scale
argumentation-based deliberations. Our future work
will focus on the empirical, analytic, and
computational (simulation-based) assessment of the
emergent impact of these metrics.
I would like to gratefully acknowledge the many
useful conversations I have had on the topic of
deliberation metrics with Prof. Ali Gurkan (Ecole
Centrale Paris), Prof. Luca Iandoli (University of
Naples), and Prof. Hajo Reijers (Eindhoven
University of Technology).
Cappella, J. N., V. Price and L. Nir (2002). "Argument
Repertoire as a Reliable and Valid Measure of Opinion
Quality: Electronic Dialogue During Campaign 2000."
Political Communication 19(1): 73 - 93.
Carr, C. S. (2003). Using computer supported argument
visualization to teach legal argumentation. Visualizing
argumentation: software tools for collaborative and
educational sense-making. P. A. Kirschner, S. J. B.
Shum and C. S. Carr, Springer-Verlag: 75-96.
Chklovski, T., V. Ratnakar and Y. Gil (2005). "User
interfaces with semi-formal representations: a study of
designing argumentation structures." Proceedings of
the 10th international conference on Intelligent user
interfaces: 130-136.
Clark, H. H. and S. E. Brennan (1991). Grounding in
communication. Perspectives on socially shared
cognition. L. B. Resnick, J. M. Levine and S. D.
Teasley. Washington, DC, US, American
Psychological Association,: 127-149.
Conklin, J. (2005). Dialogue Mapping: Building Shared
Understanding of Wicked Problems, John Wiley and
Sons, Ltd.
Eemeren, F. H. v. and R. Grootendorst (2003). A
Systematic Theory of Argumentation: The Pragma-
dialectical Approach, Cambridge University Press.
Erickson, T., C. Halverson, W. A. Kellogg, M. Laff and T.
Wolf (2002). "Social Translucence: Designing Social
Infrastructures that Make Collective Activity Visible."
Communications of the ACM 45(4): 40-44.
Finholt, T. A. (2002). "Collaboratories." Annual Review of
Information Science and Technology 36(1): 73-107.
Gladwell, M. (2002). The Tipping Point: How Little
Things Can Make a Big Difference, Back Bay Books.
Gopal, A. and P. Prasad (2000). "Understanding GDSS in
Symbolic Context: Shifting the Focus from
Technology to Interaction." MIS Quarterly 24(3): 509-
Heng, M. S. H. and A. de Moor (2003). "From Habermas's
communicative theory to practice on the internet."
Information Systems Journal 13(4): 331-352.
Iandoli, L., M. Klein and G. Zollo (2009). "Enabling on-
line deliberation and collective decision-making
through large-scale argumentation: a new approach to
the design of an Internet-based mass collaboration
platform." International Journal of Decision Support
System Technology 1(1): 69-91.
Jonassen, D. and H. R. Jr (2005). "Mapping alternative
discourse structures onto computer conferences."
International Journal of Knowledge and Learning
1(1/2): 113-129.
Karacapilidis, N., E. Loukis and S. Dimopoulos (2004).
"A Web-Based System for Supporting Structured
Collaboration in the Public Sector." Lecture Notes in
Computer Science: 218-225.
Kirschner, P. A., S. J. B. Shum and C. S. Carr, Eds. (2005).
"Visualizing Argumentation: Software tools for
collaborative and educational sense-making."
Information Visualization 4: 59-60.
Kittur, A., B. Suh, B. A. Pendleton and E. H. Chi (2007).
He says, she says: conflict and coordination in
Wikipedia. SIGCHI Conference on Human Factors in
Computing Systems. San Jose, California, USA, ACM.
Klein, M. (2003). A Knowledge-Based Methodology for
Designing Reliable Multi-Agent Systems. Agent-
Oriented Software Engineering IV. P. Giorgini, J. P.
Mueller and J. Odell, Springer-Verlag. 2935: 85 - 95.
Klein, M. (2007). The MIT Collaboratorium: Enabling
Effective Large-Scale Deliberation for Complex
Problems, MIT Sloan School of Management.
Klein, M. and L. Iandoli (2008). Supporting Collaborative
Deliberation Using a Large-Scale Argumentation
System: The MIT Collaboratorium. Directions and
Implications of Advanced Computing; Conference on
Online Deliberation (DIAC-2008/OD2008).
University of California, Berkeley.
Lowrance, J. D., I. W. Harrison and A. C. Rodriguez
(2001). Capturing Analytic Thought. First
International Conference on Knowledge Capture: 84-
Luppicini, R. (2007). "Review of computer mediated
communication research for education." Instructional
Science 35(2): 141-185.
Macaulay, L. A. and A. Alabdulkarim (2005). Facilitation
of e-Meetings: State-of-the-Art Review. e-Technology,
e-Commerce and e-Service (EEE'05).
Moor, A. d. and M. Aakhus (2006). "Argumentation
Support: From Technologies to Tools."
Communications of the ACM 49(3): 93.
Nisbet, D. (2004). "Measuring the Quantity and Quality of
Online Discussion Group Interaction " Journal of
eLiteracy 1: 122-139.
Pervan, G. P. and D. J. Atkinson (1995). "GDSS research:
An overview and historical analysis." Group Decision
and Negotiation 4(6): 475-483.
Powell, A., G. Piccoli and B. Ives (2004). "Virtual teams:
a review of current literature and directions for future
research." ACM SIGMIS Database 35(1): 6 - 36.
Rahwan, I. (2008). "Mass argumentation and the semantic
web." Journal of Web Semantics 6(1): 29-37.
Shum, S. B., A. D. Liddo, L. Iandoli and I. Quinto (in
press). "A Debate Dashboard to Support the Adoption
of Online Knowledge Mapping Tools." VINE Journal
of Information and Knowledge Management Systems.
Shum, S. J. B., A. M. Selvin, M. Sierhuis, J. Conklin and
C. B. Haley (2006). Hypermedia Support for
Argumentation-Based Rationale: 15 Years on from
gIBIS and QOC. Rationale Management in Software
Engineering. A. H. Dutoit, R. McCall, I. Mistrik and
B. Paech, Springer-Verlag.
Spatariu, A., K. Hartley and L. D. Bendixen (2004).
"Defining and Measuring Quality in Online
Discussions." The Journal of Interactive Online
Learning 2(4).
Steenbergen, M. R., A. Bächtiger, M. Spörndli and J.
Steiner (2003). "Measuring political deliberation: a
discourse quality index." Comparative European
Politics 1(1): 21-48.
Stromer-Galley, J. (2007). "Measuring deliberation's
content: a coding scheme." Journal of Public
Deliberation 3(1).
Sunstein, C. R. (2006). Infotopia: How Many Minds
Produce Knowledge, Oxford University Press.
Surowiecki, J. (2005). The Wisdom of Crowds, Anchor.
Tapscott, D. and A. D. Williams (2006). Wikinomics: How
Mass Collaboration Changes Everything, Portfolio.
Trénel, M. (2004). Measuring the quality of online
deliberation: Coding scheme 2.4. Berlin, Germany,
Social Science Research Center.
Viegas, F. B., M. Wattenberg and K. Dave (2004).
Studying cooperation and conflict between authors
with history flow visualizations. SIGCHI conference
on Human factors in computing systems. Vienna,
Austria, ACM.
Walton, D. N. (2005). Fundamentals of Critical
Argumentation (Critical Reasoning and
Argumentation). Cambridge, Cambridge University
Press.
Walton, D. N. and E. C. W. Krabbe (1995). Commitment
in dialogue: Basic concepts of interpersonal
reasoning. Albany, NY, State University of New York
Press.
Dr. Mark Klein is a
Principal Research Scientist at the MIT Center for
Collective Intelligence, as well as an Affiliate at the
MIT Computer Science and AI Lab (CSAIL) and the
New England Complex Systems Institute (NECSI).
His research focuses on understanding how
computer technology can help groups, especially
large ones, make better decisions about complex
problems. He has made contributions in the areas of
computer-supported conflict management for
collaborative design, design rationale capture,
business process re-design, exception handling in
workflow and multi-agent systems, service
discovery, negotiation algorithms, 'emergent'
dysfunctions in distributed systems and, more
recently, 'collective intelligence' systems to help
people collaboratively solve complex problems like
global warming.