Justify: A Web-Based Tool for Consensus Decision Making
Christopher Fry
MIT Media Lab
20 Ames St., Cambridge, MA, USA
cfry@media.mit.edu
ABSTRACT
Traditional unstructured-debate-then-vote democracy
leaves a lot to be desired. Consensus decision-making
processes may encourage better decisions but suffer from
real-time constraints. Online discussions can help, but
large-scale unstructured online discussions are unwieldy,
redundant, and ambiguous.
The Justify system helps participants clearly state concepts
and organize them in a meaningful structure. It hides detail
until desired, and automatically performs summarization at
all levels. These characteristics facilitate the emergence of
the best ideas along with the rationale for each, crucial for
buy-in.
Justify is a language for expressing rationale and a
development environment for analyzing discussion. There
are 150 kinds of "points", which enable users to express
questions, answers, background information, support,
opposition, votes, math and more.
Points share three essential characteristics:
1. Each point must contain exactly one idea,
providing an unambiguous target for critiques.
2. Each point must live in a particular spot in the
hierarchy of points, clarifying context.
3. Each point declares its intent as a type, benefitting
both humans and conclusion-generating algorithms.
Author Keywords
Argumentation, decision support, consensus, voting.
ACM Classification Keywords
H5.m. Information interfaces and presentation (e.g., HCI):
Miscellaneous.
General Terms
Human factors, Languages.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise,
or republish, to post on servers or to redistribute to lists, requires prior
specific permission and/or a fee.
IUI 2012, February 14-17, Lisbon, Portugal.
Copyright 2012 ACM xxx-x-xxxx-xxxx-x/xx/xx...$10.00.
Henry Lieberman
MIT Media Lab
20 Ames St., Cambridge, MA, USA
lieber@media.mit.edu
THE PROBLEM
We have widespread discontent in the world today. Both
rich and poor countries in the Middle East and Africa are
engaged in civil war. Protests from both the right (e.g., the
Tea Party in the USA) and the left (e.g., Occupy your-city-here)
are widespread. These movements are characterized by
bottom-up decision-making and aversion to a formal
organization with elected leaders. There are few rational
plans for how to fix what's wrong. Rather than have
decisions motivated by which protesters are the loudest, we
need to encourage processes for rational debate.
NONLINEAR ARGUMENTATION
The fundamental problem with debate is this: arguments are
tree-structured, but debate is sequential. Any position in an
argument can have its pros and cons, which in turn can be
argued for or against; any position can be defended or
refuted in a variety of ways, thus generating a tree structure.
But people arguing verbally or online are constrained by a
linear structure. Any participant must choose to address
only a single point at any moment in time. Short sequences
where one person replies to another are fine, but as the
argument grows, limitations of human short-term memory
and attention often derail the debate. People can't remember
what the speaker is referring to, what all the arguments for
and against the speaker's point are, whether the speaker really
answered the question he or she was asked, or what the
speaker neglected (intentionally or not) to say.
A traditional way of linearizing a tree structure of text is the
outliner. Each node of the tree is summarized as a line of
text. Nodes at greater depths are indented. Each node can be
expanded from a single-line summary to the full text of that
node. Justify takes the outliner as its core interface
metaphor.
Justify adds two things to the basic outliner interface. First,
it has a very rich ontology of types of points raised in the
argument and their relationship to other points. This
encourages good behavior by the participants. Their
declaration of the intent of raising each point helps guard
against hidden agendas and "debate tactics" intended to
mislead. It keeps the argument structured at all times.
Figure 1. The Justify user interface is displayed in a browser window. The buttons in the upper right allow the showing
of documentation in the lower right pane. The lower right pane is used for documentation or detail. Now shown in the
lower right pane are the initial instructions. The points hierarchy is shown in the large lower left pane. A point is
represented by a single line in this pane. The top point is the justify_repository. The first child of that point is Justify
Help.
Second, it enables automatically generating summarizations
of the argument at every level. It enables using diverse
summarization techniques such as logical deduction
techniques or voting.
PROBLEM ANALYSIS THROUGH THE UI LENS
We can apply the discipline of application design, including
its user interface, to analyze decision making.
• We need to collect enough of the relevant
  information and be confident that we have done
  so.
• We need to filter out the wrong or useless
  information so that it doesn’t distract us.
• We need to organize this information in the right
  presentation, while realizing that different views
  are optimal for different tasks.
• We need to summarize at numerous levels so as
  not to miss the forest for the trees.
CONCEPT GRANULARITY
In nearly every traditional ‘view’ of an argument, the
granularity of information is too coarse. Take what is called
a “debate” on US national TV. A candidate is given a few
minutes to talk, another a few minutes to respond. Or,
candidates may be asked a question about a complex topic
and allowed to give a lengthy answer. It is not simply that
candidates rarely answer the question they are asked. It’s
that their response contains so many different ideas that a
challenger is faced with picking which ones to address. Not
only does the challenger fail to remember all the points, they
select even fewer to respond to. Listeners of the debate fare
no better in recalling which ideas were strategically
omitted. Dangling questions never get resolved. Incomplete
rebuttals distract listeners from more substantial issues and
we’re left with a miasma of disconnected information.
Consensus meetings don’t address this problem either.
Usually too many people want to speak at a given time. A
“stack” of pending people is formed. Once an individual
“gets the floor”, they cram in as many ideas into their “air
time” as possible since it will be a while before they’re
allowed to speak again. Participants wishing to respond to
just one of a speaker’s ideas are told to wait their turn. That
might take 10 minutes or more, with several intervening
speakers, after which the context of their response is largely
forgotten as the discussion meanders to new areas.
THE ADVANTAGE OF ASYNCHRONOUS COMMUNICATION
Real time constraints make the above problems nearly
unavoidable. When our discussion goes on-line, we are
freed from the tyranny of the clock. We can take as long as
we want to compose our utterance, allowing the
conscientious author to perfect meaning and concision.
Even more important, numerous users can speak (by which
I mean “type”) at the same time without colliding at the
listeners’ ears (by which I mean “readers’ eyes”). There is
no need for the frustrations of waiting your turn in “the
stack”.
Unfortunately traditional on-line forums solve only part of
the problem. Comments typically fall within one of two
extremes. Either they are too terse to be useful or
unambiguous, or they contain numerous points, only a
small percentage of which are addressed by subsequent
comments. Those subsequent comments are usually “out of
place” because, by the time a thoughtful responder submits
their relevant reflection on the latest comment, often several
other less-thoughtful responders have interjected comments
that distance the thoughtful comment from its intended
context.
JUSTIFY POINTS
Justify encourages a much finer granularity of utterances
than real time discussions or the significant comments in
on-line forums. Each Justify point contains just ONE idea.
This permits comments on it to unambiguously address just
that one idea. Justify encourages points to be so constrained
by:
• Making its primary representation one line of text (i.e.,
  too small to contain more than one idea)
• Requiring the point’s author to declare a specific
  type for the point, making two ideas of different
  types within a point impossible
• Providing other users a direct way to critique a
  point because it contains more than one idea.
To avoid the “too terse to be unambiguous” condition that
short on-line forum comments often exhibit, a Justify point
cannot be created without assigning a highly specific type
to it that captures its core semantics (pro, con, clarifying
question, etc.)
THE PROBLEM OF MYRIAD NIT-PICKING LITTLE POINTS
We might structure a paragraph or an utterance to be a list
of ideas with a certain coherency. If we interject comments
about each idea in-line, we break the coherency of that list.
Our rich annotations obscure the basic list. So we really
need two views, one with all the detail and another simple
and straightforward. Justify’s outline UI is ideally suited to
this task. As in any decent hierarchy browser, you can
display or hide any level.
One problem with most outline editors is that, when viewing a
list of items, its context (i.e., its true position in the hierarchy)
is scrolled off the top of the screen. Justify’s focus
operation hides the uncles, granduncles, etc. of a point,
showing only its direct line ancestors. For deeply nested
lists, even that can be too much, particularly as they eat up
the indentation whitespace on the left, leaving too narrow a
column of real content. Justify’s set top operation makes the
current point display as if it’s the root, at the top left of the
screen, gaining back all that whitespace on the left.
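The focus and set top operations described above can be illustrated with a minimal Python sketch. The `Node` class, the `render`, `focus` and `set_top` functions, and the arrangement of the sample tree are all hypothetical names and data chosen for illustration; they are not taken from Justify's implementation. The sketch also assumes that focus hides the point's own siblings, which the text does not specify.

```python
class Node:
    """A hypothetical outline node; not Justify's actual data model."""

    def __init__(self, title):
        self.title = title
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def render(node, depth=0):
    """Return the indented outline lines for node and all its descendants."""
    lines = ["  " * depth + node.title]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def focus(point):
    """Show only the direct-line ancestors of point, then point's subtree.

    Uncles, granduncles, etc. are hidden. Hiding the point's own siblings
    as well is an assumption made to keep the sketch short.
    """
    chain = []
    node = point.parent
    while node is not None:
        chain.append(node)
        node = node.parent
    chain.reverse()
    lines = ["  " * depth + a.title for depth, a in enumerate(chain)]
    return lines + render(point, len(chain))


def set_top(point):
    """Display point as if it were the root, reclaiming the left margin."""
    return render(point, 0)


# Illustrative arrangement only, not the actual repository layout.
repo = Node("justify_repository")
help_point = repo.add(Node("Justify Help"))
playground = repo.add(Node("Justify Playground"))
question = playground.add(Node("Does democracy work?"))
question.add(Node("not by itself"))

print("\n".join(focus(question)))
print("\n".join(set_top(question)))
```

In this sketch, focus keeps the indentation of the ancestor chain, while set top discards it entirely, which is the whitespace trade-off the two operations address.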
Paste lets you insert a cut or copied point anywhere in the
hierarchy. Pastewrap lets you paste a point on top of
another point, effectively replacing a point with the pasted
point but making the replaced point become a child of the
pasted point, pushing it down one level. Siblings to
subpoints makes all the siblings of a point into subpoints of
the point whereas subpoints to siblings does the opposite.
These novel editing utilities make reconfiguration of a
complex argument easy.
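As a rough illustration of these editing utilities, the following Python sketch implements pastewrap, siblings to subpoints, and subpoints to siblings on a simple tree. The `Node` class and function names are hypothetical. Where the moved points end up (appended after existing subpoints, or inserted directly after the point) is an assumption, since the text does not specify ordering.

```python
class Node:
    """A hypothetical outline node; not Justify's actual data model."""

    def __init__(self, title, children=None):
        self.title = title
        self.parent = None
        self.children = []
        for child in children or []:
            self.add(child)

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def outline(self, depth=0):
        lines = ["  " * depth + self.title]
        for child in self.children:
            lines.extend(child.outline(depth + 1))
        return lines


def paste_wrap(pasted, target):
    """Put pasted where target was; target becomes a child of pasted."""
    parent = target.parent
    if parent is None:
        raise ValueError("cannot wrap the root point")
    index = parent.children.index(target)
    parent.children[index] = pasted
    pasted.parent = parent
    pasted.add(target)


def siblings_to_subpoints(point):
    """Move every sibling of point underneath point (appended, by assumption)."""
    parent = point.parent
    siblings = [c for c in parent.children if c is not point]
    parent.children = [point]
    for sibling in siblings:
        point.add(sibling)


def subpoints_to_siblings(point):
    """Move every subpoint of point up one level, directly after point."""
    parent = point.parent
    index = parent.children.index(point)
    moved, point.children = point.children, []
    for child in moved:
        child.parent = parent
    parent.children[index + 1:index + 1] = moved


root = Node("root", [Node("a"), Node("b"), Node("c")])
wrapper = Node("w")
paste_wrap(wrapper, root.children[1])   # "b" is pushed down one level under "w"
print("\n".join(root.outline()))
```

Because each operation only relinks parent and child references, an arbitrarily large subtree moves along with the point it hangs from.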
JUSTIFY INTERFACE
See the illustration of the Justify interface in Figure 1.
The Justify UI includes documentation not just in prose
documents, but as points, so that you can hierarchically
browse them. Justify Documentation contains children that
describe each of the point types. Justify Comments lets
users give feedback about Justify itself, permitting other
users to support or oppose those comments, all using
regular Justify points. We have expanded the Justify
Playground discussion to display an area where users can
try out the UI.
Let’s examine the point Does democracy work? from left to
right. First a button that allows us to show or hide the
point’s children. Next is a pull down menu of operations for
editing and displaying the point. The brown rectangle to its
right is the conclusion of the point. Next is question, the
type of the point followed by pro_or_con, the subtype of
this point. Does democracy work? is the title of the point.
Following that is admin, the author of the point. Lastly we
have id=2, an identifier that other points can use to refer to
this point. Most of these point characteristics are clickable
to reveal additional detail in the lower right pane.
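The characteristics just listed suggest a simple record structure. The sketch below is one possible Python rendering, not Justify's actual representation: the `Point` class, its field names, and the `summary_line` method are assumptions introduced for illustration. The validation reflects two constraints stated earlier in the paper, namely that a point must declare a type and that its primary representation is one line of text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Point:
    """A hypothetical record for one Justify point (illustrative only)."""

    id: int
    type: str
    title: str
    author: str
    subtype: Optional[str] = None
    conclusion: Optional[str] = None
    children: List["Point"] = field(default_factory=list)

    def __post_init__(self):
        # A point cannot be created without a type.
        if not self.type:
            raise ValueError("a point must declare a type")
        # The primary representation of a point is a single line of text.
        if "\n" in self.title:
            raise ValueError("a point's title must fit on one line")

    def summary_line(self) -> str:
        """Render the point left to right: conclusion, type, subtype, title, author, id."""
        parts = [f"[{self.conclusion or 'no conclusion'}]", self.type]
        if self.subtype:
            parts.append(self.subtype)
        parts += [self.title, self.author, f"id={self.id}"]
        return " ".join(parts)


question = Point(
    id=2,
    type="question",
    subtype="pro_or_con",
    title="Does democracy work?",
    author="admin",
    conclusion="con",
)
print(question.summary_line())
```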
In the top pane, the Preferences button lets us selectively
hide each of these characteristics for all points, decreasing
screen clutter.
AUTOMATIC SUMMARIZATION AT ALL LEVELS
As discussions grow deep and wide, it becomes
increasingly difficult to ascertain the gist of the discussion.
Are there unanswered questions? Has a point been refuted?
What’s a vote tally? Each point type has an algorithm for
computing its conclusion based on its subpoints and
attributes. Examine the point in figure 1 of Does democracy
work? This question is a ‘yes or no’ question. For
consistency with rationale and to maximize point
interoperability, we label it pro or con and expect its
conclusion to be a pro or con point.
There are many ways to summarize an argument or to draw
conclusions from an argument. Some of these may
themselves be controversial. They range from formal
techniques like proof in first-order logic, to various voting
schemes. Justify allows for multiple summarization
techniques, so that the convenience of automatic inference
can be obtained, but its methods can be examined or
debated if necessary.
Here is one simple algorithm: If all the subpoints have pro
conclusions, the conclusion is pro. If all subpoints have con
conclusions, the conclusion is con. If there’s a mix, the
conclusion is undecided.
Does democracy work? has just one subpoint, not by itself.
Since that subpoint has no subpoints and it is of type con,
the conclusion of this subpoint is con and that forces the
conclusion for Does democracy work? to be an
unfortunately accurate con (ref Erdman & Susskind).
However, participants in this discussion are likely to add
additional rationale which could well change the question’s
conclusion to a green pro or a yellow undecided.
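The simple algorithm above can be written down in a few lines of Python. This is a minimal sketch, not Justify's code: the `Point` class and `conclude` function are hypothetical, and the treatment of a childless point that is neither pro nor con (it is reported as undecided) is an assumption the paper does not address.

```python
class Point:
    """A hypothetical point with a type, a title and subpoints."""

    def __init__(self, type, title, children=None):
        self.type = type
        self.title = title
        self.children = list(children or [])


def conclude(point):
    """Return 'pro', 'con' or 'undecided' for a pro_or_con style point."""
    if not point.children:
        # A point with no subpoints concludes to its own type if it is
        # pro or con; anything else is treated as undecided (assumption).
        return point.type if point.type in ("pro", "con") else "undecided"
    conclusions = {conclude(child) for child in point.children}
    if conclusions == {"pro"}:
        return "pro"
    if conclusions == {"con"}:
        return "con"
    return "undecided"  # a mix of conclusions


question = Point("question", "Does democracy work?",
                 [Point("con", "not by itself")])
print(conclude(question))
```

Adding a single pro subpoint to the question would turn its conclusion from con to undecided, which mirrors how new rationale can change a conclusion in the discussion.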
One way to browse a hierarchy is to expand only those
points whose conclusions you disagree with. You may learn
new rationale that causes you to change your mind, or
detect an omission. By adding subpoints to a point, you can
perhaps change the conclusion to what you originally
expected or even to a conclusion that neither you nor the
previous authors expected. Thus Justify can help users
discover new solutions to problems that were hidden by
complexity.
A more sophisticated tack is to employ algorithms such as
Truth Maintenance [Doyle 79] (aka Reason Maintenance or
Belief Revision). Particularly relevant might be the multi-agent
distributed truth maintenance of [Mason and Johnson
89]. These algorithms compute dependency relations
amongst conclusions with multiple supports and propagate
the effect of changes to beliefs.
Since each reasoning step that Justify takes is immediately
viewable, users can easily verify the validity of the
conclusion. If you decide a conclusion is not valid, you
can’t simply edit it. You must add points that indicate
additional rationale for Justify’s conclusion algorithm to
consider.
At times you may realize that it is not an omission of
rationale that’s causing a sub-optimal conclusion, but that
the algorithm itself is at fault. For instance, changing from
consensus to majority rule could be warranted. Such
modifications should be principled, must be definitive, can
only be made by a point’s author or the discussion
moderator, will be easily viewable by all and may, like all
other changes, be critiqued using additional Justify
subpoints.
SCENARIO
Imagine you are the judge deciding the best paper at a user
interface conference. You have narrowed down the
candidates to a paper on Touch and one on Smell. You have
two criteria, Writing Style and Innovative. The reviewers
have given you scores for each paper for each criterion. We
can represent this information in Justify like so:
Figure 2. An argument evaluating candidates for a Best
Paper Award.
Our top level prioritize point represents each criterion as
another prioritize subpoint. Each prioritize subpoint has one
subpoint, a box that packages up the evaluations from the
reviewers as numbers. Under Writing Style Evaluations we
get a score of 75 for Touch and a score of 25 for Smell. For
this criterion, Touch is the clear winner and is represented as
the conclusion for Clearest Writing Style. For the Most
Innovative, the scores are nearly reversed so that Smell wins
as is indicated in its conclusion. Our top level prioritize
adds up the scores from the criteria for each paper giving
Touch an overall score of 75 + 30 = 105 and Smell an
overall score of 25 + 70 = 95. Thus Touch is the conclusion
of Best User Interface Paper.
The conference chair complains to the judge that the theme
of this year’s conference is innovation, so that the Smell
paper should win. We can represent the preference for
innovation by wrapping a Scale point around Most
Innovative and giving it a multiplier of 2. This effectively
doubles the scores under Innovative Evaluations,
increasing Touch from 30 to 60 and Smell from 70 to 140.
Since the initial score for Smell is higher, its increment is
greater than the increment to Touch. Now Smell’s overall
score exceeds that of Touch and the conference chair is
happy.
This stylized discussion can be expanded with many other
point types. We could use the math.sum point to add up
review scores for instance. We might use a con point to
nullify the effect of a lousy reviewer. We can even critique
the scale factor of 2 since perhaps innovation should be
given a greater or lesser weight. More significant, we can
break down any criterion into sub-criteria by adding
prioritize subpoints, giving us a higher resolution
evaluation.
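The arithmetic of the Best Paper scenario can be checked with a short Python sketch. The functions `prioritize` and `scale` below are hypothetical stand-ins for the prioritize and Scale points, written only to reproduce the numbers given in the text (75 and 25 for Writing Style, 30 and 70 for Innovative, and a multiplier of 2). How Justify breaks ties is not stated; this sketch simply returns the first candidate with the highest total.

```python
def scale(scores, multiplier):
    """Multiply every candidate's score, as a Scale point wrapping a criterion would."""
    return {candidate: score * multiplier for candidate, score in scores.items()}


def prioritize(criteria):
    """Add up each candidate's scores across criteria and pick the highest total."""
    totals = {}
    for scores in criteria:
        for candidate, score in scores.items():
            totals[candidate] = totals.get(candidate, 0) + score
    # Tie-breaking is an assumption: the first highest-scoring candidate wins.
    winner = max(totals, key=totals.get)
    return winner, totals


writing_style = {"Touch": 75, "Smell": 25}
innovative = {"Touch": 30, "Smell": 70}

# Unweighted: Touch 75 + 30 = 105, Smell 25 + 70 = 95.
print(prioritize([writing_style, innovative]))

# Innovation weighted by 2: Touch 75 + 60 = 135, Smell 25 + 140 = 165.
print(prioritize([writing_style, scale(innovative, 2)]))
```

Keeping the weighting in a separate `scale` step mirrors the paper's design, where the multiplier is itself a point that other participants can inspect and critique.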
CARRYING OUT DECISIONS
Making the best decision is crucial but meaningless if it
isn’t acted upon. Justify allows decisions to be grouped
together under agenda points. We can also add action
points that bubble up from deeply nested discussions to tell
actors how to accomplish the decisions in an agenda. That
makes clear what should be done.
Harder, though, is providing the motivation for doing an
action. People are frequently unmotivated to perform an
action if they don’t understand why it’s important. A
complex decision (whether Justify is used or not) might
entail many points. We see summaries of why a course of
action is appropriate in Supreme Court opinions or
referenda question descriptions. These are often too long
for us to read or too short to capture the detail we’re
interested in. They are impossible to write optimally for a
broad audience as different readers will want different
levels of detail for specific aspects of the decision. This is
where Justify’s dynamic presentation of a decision shines.
Conclusions at each level are obvious and concise. More
detail for a particular point is just a click away. Whether a
decision was commissioned by a busy CEO or created by a
legislature, those that must act on it or at least are affected
by it, can benefit from the multi-level summaries of Justify.
VOTING
Voting is considered to be the decision-making process of
first resort in modern democratic societies. Justify considers
voting the process of last resort, to be used only when
reasoning fails.
Justify offers several voting strategies.
The crudest are one answer, like the common ‘pick one
candidate, majority wins’, and pro or con, as is common on
referenda. Better than either of these for most situations is
preferential voting, which allows voters to order candidates.
Justify represents this as a prioritize point.
A more novel kind of voting is how_much. Voters specify a
number within a pre-determined range. The numbers from
each ballot point are combined to form the conclusion of
the vote. How the numbers are combined is indicated in an
attribute of the vote point. Possibilities are: average,
median, highest, lowest, sum and product.
Here we use the how_much combiner attribute of sum as is
typical in the pork barrel legislature process of decision
making.
Decision making processes where the participants don’t
have to make trade-offs frequently lead to poor long term
decisions (as in the above example employing
vote.how_much). A novel voting point in Justify that forces
voters to make trade-offs is apportion, where each voter
must allocate pieces of a fixed budget among a set
number of candidates. The sum of the pieces on each ballot
must not exceed the budget. The amounts for each piece
from each ballot are averaged together, making staying
within a budget easy. If a legislator’s ballot were public, their
constituents would have an easy way to evaluate the
legislator’s performance, making the desirability of such a
voting technique among legislators low.
THE POWER OF LANGUAGE
Justify does not prescribe a set methodology for making
decisions. Rather it is a toolbox with 150 different point
types useful in a wide variety of processes. Fundamentally
it is a language, where, like a good programming language,
the pieces are optimized to be combinable in a maximum
number of configurations, tailored to the problem at hand.
Points represent both instances of their type, and methods
that are called to return a conclusion.
Figure 3. A question with two subpoints.
As with a programming language, you need a development
environment to facilitate creating and debugging your
programs/discussions. Justify also provides the user with
multiple views of their discussion to ease understanding.
Figure 4. A different view of the above that uses natural
language generation to present the points and stepping to
navigate the hierarchy.
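To make the idea of vote points as methods that return a conclusion more concrete, here is a minimal Python sketch of the how_much and apportion votes described under VOTING. The function names, argument names, and the dictionary-based ballot format are assumptions made for illustration; only the six combiners and the budget rule come from the text.

```python
import math
import statistics

# The six ways of combining how_much ballots named in the paper.
COMBINERS = {
    "average": statistics.mean,
    "median": statistics.median,
    "highest": max,
    "lowest": min,
    "sum": sum,
    "product": math.prod,
}


def how_much(ballots, combiner, low, high):
    """Combine numeric ballots, each of which must lie within [low, high]."""
    for value in ballots:
        if not low <= value <= high:
            raise ValueError(f"ballot {value} is outside the range {low}..{high}")
    return COMBINERS[combiner](ballots)


def apportion(ballots, budget):
    """Average each candidate's allocation across ballots.

    Every ballot maps candidates to amounts and may not exceed the budget,
    so the averaged allocations cannot exceed the budget either.
    """
    for ballot in ballots:
        if sum(ballot.values()) > budget:
            raise ValueError("a ballot allocates more than the budget")
    candidates = sorted({name for ballot in ballots for name in ballot})
    return {
        name: sum(ballot.get(name, 0) for ballot in ballots) / len(ballots)
        for name in candidates
    }


print(how_much([10, 20, 30], "sum", low=0, high=100))
print(apportion(
    [{"roads": 60, "schools": 40}, {"roads": 20, "schools": 80}],
    budget=100,
))
```

With the sum combiner, every additional ballot can only raise the total, whereas apportion keeps the combined result within the budget no matter how many voters take part.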
FUTURE WORK
Justify has not yet been tried on a large group of users. This
is necessary to find out what works best in the visual user
interface and to extend and refine the point set to improve
the conceptual user interface. The tool has a modular design
to ease such improvements.
Automatic classification of points
The potential exists to incorporate AI techniques for
automatically classifying point types and argument types
from natural language. Justify’s ontology of points is
complex, and users may have trouble initially learning the
ontology or correctly classifying points. Some automatic
suggestion of point types could go a long way towards
reducing the cognitive burden of point identification for
users. We look toward natural language processing
techniques based on Commonsense Knowledge, such as []
and [] to help in this regard.
RELATED WORK
Argumentation systems have a long history. The paper
[Conklin, et al 2003] is a survey that includes landmark
systems from Doug Engelbart’s work on Augmentation and
Hypertext from 1963 through NoteCards, gIBIS [Conklin
1988] and QuestMap through Compendium [Conklin
2003].
Conklin’s work on Compendium incorporates the best ideas
of the previous systems so I will concentrate my analysis on
it. Compendium uses various question and answer types as
does Justify, including templates for filling out such nodes
in a network, as does Justify. A crucial aspect of “lessons
learned” here is about being able to quickly record informal
statements and, as needed formalize them. Justify permits
this since you can create a “generic” point such as idea or
information and change its type later, say to a pro or con.
Pros and cons can further be refined to more specific types
should the author like.
The display in Compendium is that of a 2-D graph of “icons
on strings” where the strings represent typed links between
the icon nodes. This notation is semantically flexible, but
requires more work in graphical arrangement and declaring
link types than Justify’s outline/hierarchy. The
expand/contract nature of a good outliner makes hiding of
detail particularly easy.
No reference to education has been made in the body of this
paper though the authors believe that structuring rationale
can be a high-leverage learning tool. We would like to
acknowledge Buckingham’s work on Cohere and the
conceptual framework described in [Buckingham Shum
2010].
Conklin, Buckingham Shum and other researchers have
done groundbreaking work in many aspects of knowledge
representation. To their credit, they have tackled difficult
issues in real-time meeting knowledge capture, a use case
for Justify that I hope some day it can support.
We would also like to credit a project named SIBYL (ref
Lee) done by Jintae Lee at the Center for Coordination
Science directed by Thomas Malone. Fry worked in the
early 1990’s at this center. The SIBYL project was
instrumental in introducing Fry to the field of formal
representation of argumentation. Malone’s work of planet-wide
importance continues at MIT’s Center for Collective
Intelligence.
Iyad Rahwan (ref Rahwan) tackles representing
argumentation in the Semantic Web technologies of XML,
RDF and OWL. This highly structured work promises to
make an ontology that can be standardized and shared
across the web. Although Justify is implemented using a
programming language based on XML and is deployed on
the web, I have not attempted to make a shared ontology,
nor used a more traditional reasoning engine such as OWL.
CONCLUSION
Clearly we need better ways to select government
representatives than the process employed in the United
States of “incumbent + large campaign donors = victory”
[Lessig 11]. But even with the most altruistic
representatives, traditional debate and deliberation are poor
at synthesizing and selecting the best solutions.
Our inability to choose wisely is by no means limited to
government. Businesses, non-profits, universities,
households and individuals could all use help in decision
making. In fact, one way to characterize a human being is
“a complex decision-making animal”. The distinction
between an individual and a group may seem stark given all
the problems we have cooperating, yet if you accept Marvin
Minsky’s “Society of Mind” thesis, [Minsky 06] intra-head
competing agents may, in many ways, mimic inter-head
agents.
To recap, the contributions of Justify are:
• Extensions to outline editors that facilitate
  navigation, viewing and editing described in the
  body of the paper.
• A greatly expanded set of point types including not
  just questions, answers and rationale but
  conditional nodes, math nodes, various error/status
  nodes plus an enlarged set of voting, question, and
  rationale nodes, most of which are not described
  in this paper due to space constraints.
• Automatic computation and propagation of
  summarizations which facilitate determination of
  completeness and error checking.
REFERENCES
1. Buckingham Shum, Simon and De Liddo, Anna (2010).
Collective intelligence for OER sustainability. In:
OpenED2010: Seventh Annual Open Education
Conference, 2-4 Nov 2010, Barcelona, Spain.
2. Conklin, J., Selvin, A., Buckingham Shum, S. and
Sierhuis, M. (2003) Facilitated Hypertext for Collective
Sensemaking: 15 Years on from gIBIS. Keynote
Address, Proceedings LAP'03: 8th International
Working Conference on the Language-Action
Perspective on Communication Modelling, (Eds.) H.
Weigand, G. Goldkuhl and A. de Moor. Tilburg, The
Netherlands, July 1-2, 2003. [www.uvt.nl/lap2003]
3. Jeff Conklin and Michael L. Begeman. 1988. gIBIS: a
hypertext tool for exploratory policy discussion. In
Proceedings of the 1988 ACM conference on Computer-supported cooperative work (CSCW '88). ACM, New
York, NY, USA, 140-152.
4. J. Doyle. A Truth Maintenance System. AI. Vol. 12. No
3, pp. 251–272. 1979.
5. [reference deleted for blind review]
6. Jintae Lee. 1991. SIBYL: A qualitative decision
management system. In Artificial intelligence at MIT
expanding frontiers, Patrick Henry Winston and Sarah
Alexandra Shellard (Eds.). MIT Press, Cambridge, MA,
USA 104-133.
7. Lessig, Lawrence, Republic Lost: How Money Corrupts
Congress and a Plan to Stop It, Grand Central
Publishing, 2011.
8. Malone, T. W., Lai, K. Y., & Fry, C. Experiments with
Oval: A radically tailorable tool for cooperative work.
ACM Transactions on Information Systems, 1995, 13, 2
(April), 177-205.
9. Mason, C. and Johnson, R. DATMS: A Framework for
Assumption Based Reasoning, in Distributed Artificial
Intelligence, Vol. 2, Morgan Kaufmann Publishers, Inc.,
1989.
10. Malone, T. W., and Klein, M., Harnessing Collective
Intelligence to Address Global Climate Change,
Innovations, Summer 2007, Vol. 2, No. 3, Pages 15-26.
11. Minsky, Marvin, The Society of Mind, Simon &
Schuster, New York, 1988.
12. I. Rahwan, B. Banihashemi, C. Reed, D. Walton and S.
Abdallah (in press). Representing and Classifying
Arguments on the Semantic Web. The Knowledge
Engineering Review. (to appear)
13. Susskind, L., The Cure for Our Broken Political
Process: How We Can Get Our Politicians to Stop
Fighting and Start Resolving the Issues that Truly
Matter, with Sol Erdman, (Potomac Publishers), 2008.