Buzz: Mining and Presenting Interesting Stories Sara Owsley Sood

advertisement
Int. J. Arts and Technology, Vol. 1, No. 1, 2008
131
Buzz: Mining and Presenting Interesting Stories
Sara Owsley Sood
Department of Computer Science,
Pomona College,
185 East Sixth Street, Claremont, CA 91711, USA
E-mail: sara.sood@pomona.edu
Abstract: Living in a world where the machine and the internet are ubiquitous,
many people work and play online, in a world that is, ironically, often isolated
and lonesome. While the internet, as intended, connects us to information,
products and services, it often draws us away from the rich connections that are
created through interpersonal communication. The goal of this work is to use
the machine to connect people. Not only to information, products or services,
but also to each other. The author proposes to use the machine, the very
machine that pulls us apart, to bring us together, connecting people through
stories. People tell stories as a way to be less lonesome, reaching out to people
that they can relate to. The system that the author describes in this article is
intended to facilitate and amplify connections between people through stories.
Buzz is a digital theatre installation that autonomously finds and exposes the
most emotional stories from the millions of blog postings on the web, bringing
the stories to life with a cast of virtual actors using speech synthesis. This
system serves the goal of using the machine to connect people, not only to
information or products, but also to each other.
Keywords: digital art; digital theatre; emotion; narrative; story generation;
story telling; weblogs.
Reference to this paper should be made as follows: Sood, S.O. (2008) ‘Buzz:
Mining and Presenting Interesting Stories’, Int. J. Arts and Technology, Vol. 1,
No. 1, pp.131–155.
Biographical note: Sara Owsley Sood is an Assistant Professor in the
Computer Science Department at the Pomona College. Her research spans
network arts, human computer interaction, text classification and information
retrieval. Her work on Buzz has been exhibited both as artistic installations in
Chicago’s Second City Theater and Wired Magazine’s NextFest, and as an
informative marketing research tool at the Wrigley Global Marketing Summit.
She earned her Bachelor’s degree with honours in Mathematics and Computer
Science from the DePauw University, and her Master’s and PhD in Computer
Science from the Northwestern University in 2007.
1
Introduction
Imagine a world without stories. A world where history is lost and mistakes are repeated.
A world where learning is restricted to theory without practice. A world where we are
alone with our experiences, with no way to express our fears, gain compassion or
empathise with the common experiences of others. A world in which nothing is imagined
and creativity is stifled. A world where negative experiences are bottled up with no venue
for catharsis or confession, and positive happenings go uncelebrated. A world where selfCopyright © 2008 Inderscience Enterprises Ltd.
132
S.O. Sood
expression lies in actions and facts. A world in which the chronicle of our life is
forgotten. Such a world is not only hard to imagine, but also is inhuman.
“But a world without stories is fundamentally inhuman.” –Roald Hoffmann
(Hoffmann, 2000)
The importance of stories in our lives is immeasurable. Stories connect us. They are the
structure of all human communication. They provide a medium through which we can
form relationships with other human beings. These relationships and connections bring
purpose and meaning to our lives. Without stories, we are lonesome.
“We are lonesome animals. We spend all our life trying to be less lonesome.
One of our ancient methods is to tell a story begging the listener to say – and to
feel – Yes, that’s the way it is, or at least that’s the way I feel it. You're not as
alone as you thought.” –John Steinbeck (Steinbeck 1977)
Playing such a variety of roles, stories take many different forms. Non-fiction stories are
written to pass on knowledge, often peppered with legend and myth that evolves with the
passing of time. News stories are reported to recreate the events of the day, making
people aware of current issues facing others around the world. Fictional stories are told
by parents to help develop creativity and fantasy in their children’s minds. Beyond these
types and many others, overwhelmingly, the most commonly told story is a first person
experience.
It is first person stories that provide us with the most powerful way to make
meaningful connections with the people around us. We can express our goals and fears,
our experiences and struggles, our triumphs and happiness and gain comfort, empathy,
criticism, and support from others. We tell first person stories to put our thoughts out in
the world and pass on knowledge and ideas, but most importantly, we seek emotional
connections to other people. We want to know that others are having similar experiences,
joys and frustrations in their lives and the world that we live in. It is not surprising that
we listen to and read first person stories for reasons that are strikingly similar to why we
tell them; for entertainment, to be touched, to gain comfort, to empathise and to learn
about and connect to the author.
“To be a person is to have a story to tell.” –Isak Dinesen (Simmons, 2002)
Our lives are filled with stories. We love to talk about what happens to us. The most
straightforward reason that we talk about our experiences is that it is a way for us to
remember things or to clarify for ourselves what has happened. Even when the listener or
audience is inactive or unresponsive, simply telling a story can serve as catharsis and
therapy. In an extreme case, Tom Hanks’ character, in his 2000 movie Cast Away, gains
comfort as he tells stories to an inanimate volleyball who he lovingly calls ‘Wilson’.
“Hey [to Wilson], you want to hear something funny? My dentist’s name is
James Spalding.” –Tom Hanks in Cast Away (Zemeckis, 2000)
Living in a world where the machine and the internet are ubiquitous, many people work
and play online, in a world that is, ironically, often isolated and lonesome. While the
internet, as intended, connects us to information, products and services, it often draws us
away from the rich connections that are created through interpersonal communication.
The goal of this work is to use the machine to connect people. Not only to
information, products or services, but also to each other. The author proposes to use the
machine, the very machine that pulls us apart, to bring us together, connecting people
Buzz: Mining and Presenting Interesting Stories
133
through stories. Much like the editor of a literary journal, the author proposes a system
that mediates a connection between people through stories.
“A literary journal is intended to connect writer with reader; the role of the
editor is to mediate.” –John Barton
2
Existing systems
Such a system requires many innovations along with the use of existing technology. First,
we need a venue for people to publicly read and write stories. Luckily, such a venue
already exists in the form of the weblog (blog for short). The popular rise of blogs came
in 2001 as a venue for information, commentary, opinions, narrative and stories, posted
by everyday people in a journal format. Blogs are vastly published, with 70 million total
blogs, and 15.5 million currently active, or updated in the past 90 days (Business Week,
2007; Technorati, 2007). The widespread use of blogs is a testament to people’s desire to
tell their stories.
Next, we need a way for people to find the stories written in blogs that might be
interesting to them, or might provide the connection that they are seeking. Current blog
search engines provide users with a way to search for topical content. While these search
engines span the entire blogosphere and return topically relevant blog entries, they fall
short in scaffolding connections between people. First, they only return the most popular
or authoritative blogs. Given the vastness of the blogosphere, if the most popular blogs
are always returned, then millions will remain unseen or unread, unconnected. Also, blog
search engines do just what their name implies, search blogs, and not all blogs take the
form of stories. Current blog search engines on their own will not meet our goals of
connecting people through stories.
“Nobody wants to listen to what happened to you today unless you can make
what happened appear interesting. The process of livening up an experience can
involve simply telling that experience in such a way as to eliminate the dullest
parts, or it also can involve ‘jacking up’ the dull parts by playing with the
facts.” –Roger Schank (Schank, 1990).
We also need the system to find, not only just stories, but also interesting or compelling
stories. Current blog search engines measure importance through popularity; these
systems do not evaluate the emotional impact or interestingness of the blogs that they
return. The system must be able to find stories that people will want to read or hear, not
only just the ones that are popular, but also the ones that are emotional and compelling.
Finally, we need a way to present these stories to people; a way that compels them to
listen and enables them to embed themselves in the narrative. While a textual format
(book, magazines, blogs, etc.) has historically been a common way to communicate
stories, hearing a story told by a person is a much more powerful and expressive
experience and allows the listener to truly empathise with and connect to the person
telling the story. We must keep this in mind in the presentation of stories to people.
3
Buzz
To address these issues we have built a system called Buzz that exists to enable emotional
connections, connecting people with each other via stories. It reaches deep down into the
blogosphere, beyond the popular and authoritative, and finds compelling stories; from
134
S.O. Sood
real people who are telling stories and begging to be heard. These stories are emotional,
touching, funny, surprising, comforting, eye-opening, etc. They expose people’s fears,
dreams, experiences and opinions. The system changes the definition of relevance from
similarity and popularity to emotional impact and interestingness. By connecting people
to these stories, the system, Buzz, amplifies their voices. Being heard is not only
beneficial for the blogger, but also the viewer can gain comfort, have a laugh, relate to an
experience, get advice, share in the blogger’s joy or sorrow, take a walk in the their
shoes, feel less lonely, hear an opinion or perspective or simply be entertained. Buzz
connects people.
Below (in Table 1) are three stories retrieved by the Buzz system. These stories were
extracted from actual blog posts, using Buzz’s retrieval, filtering and modification engine.
While the three are all first person stories, it is clear that their authors had differing goals
while writing the stories; the first and third, a ‘fight’ and a ‘dream’, are a therapeutic
telling of their experiences, while the second, a ‘confession’, was likely intended more as
a self-descriptive narrative, exposing the blogger’s political stance. While differing in
content and goals, these three stories are all highly affective and compelling, and most
importantly, they involve situations that everyone can relate to: fights, confessions,
dreams, etc.
Table 1
A set of stories retrieved and presented by the Buzz system
My husband and I got into a fight on Saturday night; he was drinking and neglectful, and I was
feeling tired and pregnant and needy. It’s easy to understand how that combination could escalate,
and it ended with hugs and sorries, but now I’m feeling fragile. Like I need more love than I’m
getting, like I want to be hugged tight for a few hours straight and right now, like I want a dozen
roses for no reason, like a vulnerable little kid without a safety blankie. Fragile and little and I’m
not eating the crusts on my sandwich because they’re yucky. I want to pout and stomp until I get
attention and somebody buys me a toy to make it all better. maybe I’m resentful that he hasn’t gone
out of his way to make it up to me, hasn’t done little things to show me he really loves me, and so
the bad feeling hasn’t been wiped away. I shouldn’t feel that way. It’s stupid; I know he loves me
and is devoted and, etc. yet I just want a little something extra to make up for what I lost through
our fighting. I just want a little extra love in my cup, since some of it drained
I have a confession. It’s getting harder and harder to blindly love the people who made George W
Bush president. It’s getting harder and harder to imagine a day when my heart won’t ache for what
has been lost and what we could have done to prevent it. It’s getting harder and harder to accept
excuses for why people I respect and in some cases dearly love are seriously and perhaps
deliberately uninformed about what matters most to them in the long run
I had a dream last night where I was standing on the beach, completely alone, probably around
dusk, and I was holding a baby. I had it pulled close to my chest, and all I could feel was this
completely overwhelming, consuming love for this child that I was holding, but I didn’t seem to
have any kind of intellectual attachment to it. I have no idea whose it was, and even in the dream, I
don’t think it was mine, but I wanted more than anything to just stand and hold this baby
In addition to finding interesting stories, the Buzz system presents these stories in a novel
way. Instead of showing the stories in their original form, as plain text, the system
embodies the blogger with an avatar and generated voice (as depicted in Figure 1). These
avatars present the stories by reading them aloud, with each actor attentive to the one
currently reading. This enables a stronger connection with the viewer as it simulates the
blogger telling you their story. Much like the difference between printed newspaper and
television news, Buzz presents the stories in a different and more engaging manner.
Buzz: Mining and Presenting Interesting Stories
135
Figure 1
An installation of Buzz in the ford engineering design centre at Northwestern University
(see online version for colours)
Figure 2
A close-up of the central screen of a Buzz installation (see online version for colours)
3.1 Experiencing Buzz
To understand what it is like to experience Buzz, consider a viewer who approaches it in a
public installation. As he draws near the installation, he sees five monitors on a wall in
the shape of an X (Figure 1). He notices four faces occupying the outer monitors; and
hears one of these characters speaking, appearing to tell a story to the others. The
character is speaking about a dream she had last night (see Table 2 for the full dream),
and the others are closely listening to this story, turned and facing her. As she speaks, her
words fall on the central screen (Figure 2) – love, life, dream and friend – highlighting
the most moving words in her story. As her story concludes, the character on the screen
above her takes over, lightheartedly confessing about his poor singing skills (see Table 2
for the full confession). The performance goes on in this manner indefinitely as the
characters continue to tell compelling stories.
136
Table 2
S.O. Sood
Three sample stories found and presented by the Buzz system
So, I had this dream last night of someone who used to be in my life. And I really, truly loved him.
And he’s been away now for more than twice as long as I even knew him. But I still miss him. I
still love him. And I’m fairly sure there is a part of him that still loves me. He was my best friend
I have a confession to make. I can’t sing, I can’t dance, I can’t do nothing. Cause I had no
professional training, neither do I have the talent to begin with. I can’t hit the high notes and worst
of all, I don’t know how to sing any song. McDave and Famezgay, on the other hand, being
karaoke veterans, sung up lots of songs with ease, while I had to wail and screech to keep up with
the melody. So sorry for making you guys suffer from my preposterous vocals. It was very
unfortunate that I love to sing but I can’t sing
Keith and I got into a fight last night when I got home from Kentucky. … He said I was just like
my father…. which honestly is the worst insult any human being can bestow upon me. … I tried to
roll over and just go to sleep, but it hurt, I won’t lie, it really hurt to have to hear those words. …
‘I’m just like my father’. … OUCH. Before long though, back up the stairs he came, said he was
sorry for saying that and then he told me he was going to sleep on the couch…but it occurred to
him that this would be the last time he’d have the opportunity to lay beside me in OUR bed. It was
to be the last time we’d sleep side by side. That statement was very final
3.2 The Buzz architecture
Figure 3 illustrates how Buzz works from the formation of queries to the final
presentation of stories. To find compelling stories, the system mines the blogosphere
(the global corpus of blogs), collecting posts where the author describes a dramatic and
compelling situation: a dream, a nightmare, a fight, an apology, a confession, etc. After
retrieving these pages, the system extracts candidate stories from the entries. The system
then takes the stories through a set of modifiers, aimed at transforming the candidates to
make them look more like stories, be more emotionally amplified, and sound appropriate
when spoken by animated characters. After being transformed, the candidate stories are
passed through a set of filters, aimed at focusing the system on candidates that take the
syntactic form of a story, are emotional amplified, and will sound appropriate when
presented by an avatar; filtering out candidates that do not meet these criteria. Overall,
the filtering engine is highly selective, discarding 98% of the retrieved candidate stories.
These techniques are critical to the final performance as they ensure that the stories found
will engage the audience.
“A good story cannot be devised; it has to be distilled.” –Raymond Chandler
After passing through the filtering and modification engine, the resulting story selections
are emotion-laden and compelling. Next, the stories go through a markup stage, to
prepare them for performance by an animated character. Several techniques are used
to give the presentation of the stories a realistic feel and to make performances engaging
to an audience. The story is marked up for speech and animation cues in a number of
ways. It is marked up at a sentence level by a mood classifier, providing cues to the
avatar and generated voice as to the affective state of the story as it progresses. This
markup also includes emphasis and timing cues to yield better cadence and prosody from
computer-generated voices.
Finally, the performance is planned and guided by dramatic Adaptive Retrieval
Charts (ARCs), which are used to provide a higher level control of the performance,
similar to that of a director. These ARCs allow for various performance types from the
most basic – a single virtual actor telling an individual story, for example as part of an
Buzz: Mining and Presenting Interesting Stories
137
online system – to more complex – for example an ongoing performance of multiple
virtual actors in a physical installation. These charts specify performance needs at the
level of story type, topic, gender, length, etc. and are used to drive the retrieval of the
stories. While these charts drive the performance, the virtual actors themselves are also
driven by instructions, in a sense, on ‘how to act’. For example, they are attentive to the
actor currently speaking by turning and looking at them, and they pause in the
appropriate places to make the performance feel more realistic.
Figure 3
An architecture diagram of the Buzz system (see online version for colours)
138
S.O. Sood
3.3 Finding compelling stories: the retrieval, filtering and modification model
A first pass at building Buzz revealed that the content of blogs is incredibly wide-ranging,
but unfortunately often very dull. Buzz succeeded in finding stories that were on point to
any provided topic, but the results were not compelling. People blog about a wide range
of topics; for example, their class schedule, what they are eating for lunch, how to install
a wireless router, what they wore today and a list of their 45 favourite ice cream flavours.
While this is interesting to observe from a sociological point of view, it does not make for
a compelling performance. Not only are the blogs on these topics boring, but also the
length of the blog posts vary widely from one sentence to pages upon pages, and most do
not take the form of a story or narrative.
We need to give the system strategies for finding stories that will be compelling and
engaging to an audience. To do so, the system employs a model for the aesthetic qualities
of a compelling story. These qualities include but are not limited to:
1
on an interesting topic
2
emotionally charged
3
complete and of an appropriate length to hold the audience’s attention
4
familiar to an audience
5
involving dramatic situations
6
comprised of developed characters.
We designed Buzz to find stories with all of these qualities.
In building Buzz, the author developed a model for the retrieval, filtering and
modification of stories that takes advantage of the vast size of the blogosphere,
aggressively filtering the retrieval of stories. First, the system retrieves a large set of
blogs using existing blog search engines. The retrieval process includes a query
formation stage, retrieval of blog using existing search engines, result processing and the
extraction of candidate blog posts. Following this stage, the candidate blog posts are sent
to the filtering and modification engine. In this engine, candidate stories are extracted
from the posts and filtered using many different metrics. The stories that pass through all
these filters are known to be impactful and appropriate stories for presentation in a Buzz
performance.
There are three functional categories for Buzz’s filtering strategies:
x
Story filters are those which narrow the blogosphere down to those blog posts that
include stories, including strategies that make use of punctuation, relevance to topics,
inclusion of phrasal story cues and completeness to indicate a text that is likely to
have a dramatic point. For example, the List filter removes candidate stories that
contain more than three commas per sentence as they are perceived as taking the
form of a list as opposed to a story.
x
Content or impact filters are used to find interesting and appropriate stories – those
with elevated emotion, familiar and relevant content that is free of profanity and
other unwanted language use. For example, the Affect filter runs each candidate story
through an emotional classifier and discards those stories that are classified as
emotionally neutral. This filter enables the system to present emotional stories.
Buzz: Mining and Presenting Interesting Stories
x
139
Presentation filters are used to focus on content that will sound appropriate when
spoken through a computer generated voice, and presented by an avatar of the
appropriate gender. For example, the URL filter discards candidate stories that
contain URLs as they sound awkward when presented by a computer generated
voice.
In addition to filters, there is also a set of modifiers that alter the text of the retrieved
stories.
x
Story modifiers alter the text so that the structure looks more like a story. For
example, the Structural Story Cue modifier truncates the story such that the structural
story cue appears in the first sentence of the story; these cues are intended as natural
starting points for stories.
x
Presentation modifiers change the text to make it sound more appropriate in spoken
as opposed to written form. For example, the Abbreviation modifier transforms
abbreviations to the expanded form to ensure that they sound more appropriate when
spoken by the avatar.
All filters and modifiers are configurable to adjust to different installations or
deployments. The integration of this model into the overall Buzz system can be seen in
Figure 3. In this diagram, notice that the modifiers and filters exist in a single module,
illustrating the integrated nature in which these pieces interact. The following sections
describe the use of these filters, modifiers and retrieval strategies in the overall system.
Before diving into the details of this architecture, the would like to address why we
chose to handle this problem the way we did, as there are many other approaches we
could have taken. Given the vastness of the blogosphere, our system typically has on the
order of hundreds or thousands of candidate stories for any one topic. This allows us to
use simple techniques (predominantly search technology), building simple filters to be
highly selective among these candidates. While these filters will have low recall in
retaining good candidates, the precision is quite high in that the stories that are retained
are quite good. We could have approached this problem using more complex techniques,
but the vastness of the corpus of stories made that unnecessary.
3.3.1
Query formation
Two types of query formation strategies are used in the Buzz system; one strategy that
uses popular topics found daily on the web, and another uses a library of structural story
cues to seek texts that take the form of a story. While both are described below, we have
found the latter to be significantly more effective in yielding stories that are of interest to
a large audience.
3.3.1.1
Topics of interest
A compelling story is generally about a compelling topic, one that interests the audience.
For this reason, we chose the day’s most popular searches from Yahoo!, provided by
Yahoo Buzz (Yahoo!, 2006) as topics. Search engines recently began to provide a log of
their most frequently used query topics. Yahoo! takes this a step further to provide the
most frequently queried topics in a set of categories. Their categories currently include:
overall, actors, movies, music, sports, TV, and video games. In the actors category, the
140
S.O. Sood
top three topics from March 7, 2007 are ‘April Scott’, ‘Lindsay Lohan’ and ‘Jessica
Alba’. In the overall category, the top three topics from March 7, 2007 are ‘Britney
Spears’, ‘Antonella Barba’ and ‘Anna Nicole Smith’. These feeds worked well as a seed
to story discovery, as we were using the topics that people were searching for most and
discovering people’s thoughts and opinions on these topics. The intuition here is that the
topics that people search for most are likely the topics that they are most interested in.
We found Wikipedia (2005) to be another source for topics of interest as the site
maintains a list of ‘controversial topics’. The list shows topics that are in ‘edit wars’ on
Wikipedia as contributors are unable to agree on the subject matter. This list includes
topics such as ‘apartheid’, ‘overpopulation’, ‘ozone depletion’ and ‘censorship’. These
topics, by their nature, are topics that people are passionate about. One March 7, 2007,
Wikipedia’s ‘List of Controversial Issues’ included such topics as ‘Bill O’Reilly’,
‘Abortion’, ‘Osama bin Laden’, ‘Stem Cell Research’, ‘Censorship’, ‘Polygamy’ and
‘MySpace’.
Using these two sites as sources for topics, these topics are used as queries and sent to
a set of existing blog search engines. Using topics of interest as the source of topic
keywords and blogs as the target, we are able to discover what was being said today
about what people were most interested in today.
3.3.1.2
Structural cues
Through experiencing Buzz in the world and watching audiences’ reactions and responses
to stories, we discovered more generalised traits of compelling stories. The most
compelling stories to watch or hear were those in which someone is laying his or her
feelings on the table, exposing a dream or a nightmare that they had, making a confession
or apology to a close friend, or regretting an argument that they had with their mother or
spouse.
Codifying these qualities, we built our story discovery engine to seek out these types
of stories. We added a component to the retrieval that forms queries based on a structural
story cue. These cues are designed to find instances in which a writer is starting to tell a
story in the form of a dream, nightmare, fight, apology, confession or any other
emotionally fraught situation. Such cues include phrases such as ‘I had a dream last
night’, ‘I must confess’, ‘I had a terrible fight’, ‘I feel awful’, ‘I’m so happy that’, ‘I’m so
sorry’, etc. The most straightforward structural story cue would be if the author said,
‘I have a story to tell you’ or even ‘Once upon a time’.
This realisation was an important turning point in our system’s capabilities with
regard to retrieving compelling stories. The newest instance of Buzz no longer focuses on
the popular or contentious topics, but instead focuses on stories in different types of
emotion laden situations (dreams, fights, confessions, etc.).
These stories are more interesting as the blogger is not merely talking about a popular
product on the market, or ranting about a movie; they are relaying a personal experience
from their life, which typically makes them more emotionally charged. The experiences
they describe are often frightening, funny, touching or surprising. They describe
situations which have a common element in all of our lives, allowing the audience to
embed themselves in the narrative and truly connect with the writer, whereas the topically
based approach excluded the portion of the audience that was not familiar with the topic
at hand (a popular actress, story in the news, etc.).
Buzz: Mining and Presenting Interesting Stories
141
In the 19th century, a French Writer, Georges Polti enumerated 36 situational
categories into which all stories or dramas will fall. These included modern categories
such as vengeance, pursuit, abduction, murderous adultery, mistaken jealousy and loss of
loved ones (Polti and Ray, 1940). While the language describing these situations now
sounds somewhat dated, the concepts behind these situational categories bear a
resemblance to the types of stories that might be interesting to hear.
Including structural story cues as a search parameter not only gets us to more
interesting story topics and content, but we also tend to see more character depth and
development in the stories. As writers describe dramatic situations in their lives, more
pieces of their personality and personal issues with themselves and others around them
are revealed as a result.
3.3.2
Blog finding and result processing
The queries formed in the previous stage, such as ‘I had a dream last night’, are sent to a
set of existing blog search engines. The system collects the top n results (where n is
configurable and is currently 1,000) from these engines. Each result contains a title,
summary and URL of a blog related to the given query. The system filters duplicate
results and non-blog results (i.e. user profile pages). Next, the HTML content for each
blog result is retrieved.
3.3.3
Candidate extraction
The content for each blog result may contain multiple posts which may or may not be
relevant to the query. In order to generate a set of candidate stories, the system must
be capable of extracting the individual posts that are relevant to the query. To identify
the relevant posts within the blog result, all ‘text’ tags in the HTML of the blog entry are
removed, that is, tags used to alter the look of text such as the italics tags, the bold tag,
the underline tag and the anchor tag. After removing these tags, the system finds all
occurrences of the given query terms and structural story cues on the page. For each
occurrence, it searches for the last previous occurrence of and the next occurrence of a
natural breaking point. The section between these two points is taken as a candidate story.
The natural breaking points before and after a piece of text will often be tags that divide
paragraphs, so the algorithm will accomplish the goal of finding the relevant paragraphs.
In combination with the filters described in the following sections, this algorithm works
quite well as a generalised method for extracting candidate stories from online sources.
Following the candidate extraction step, what remains is a set of candidate stories,
ready to be sent through the filtering and modification engine.
3.3.4
Story modifiers
Story modifiers are modification strategies aimed at transforming the candidate story into
a more story-like structure. The main strategy in this category involves the structural
story cues described in the query formation section. While these cues are initially used as
a method to retrieve stories, they are also used to truncate the blog post into the section
that structurally is most like a story. Often blog posts are retrieved that include the
structural story cue, but it occurs in the middle of a paragraph. Since the stories are
initially divided by paragraphs in this system, story cues would not actually occur at the
142
S.O. Sood
beginning of the candidate story. To change this, this modifier truncates the story to begin
with the sentence that includes the structural story cue. The end result are stories that take
the form laid out in the structural story template, beginning with phrases such as ‘I had a
dream last night’ or ‘I got into a fight with’.
3.3.5
Story filters
Story filters are those which narrow the blogosphere down to those blog posts that
include stories, including strategies that make use of punctuation, relevance to topics,
inclusion of phrasal story cues and completeness.
3.3.5.1
Relevance to topics of interest and inclusion of structural story cues
This filter is intended to evaluate the relevance of candidate stories to the topics of
interest and/or the structural story cue used in their retrieval, filtering irrelevant
candidates. In the case of a topic of interest query, the candidate stories are phrasally
analysed, eliminating posts that do not contain at least one of the two word phrases
(non-stopwords) from the topic. For example, given a topic of ‘Star Wars: Revenge of the
Sith,’ entries that contained the phrase ‘star wars’ were acceptable, but not entries that
merely had the word ‘star’ or ‘wars.’ The remaining blog entries were thought to be
relevant to the current popular topic. In the case of a structural story cue query, the
candidate story is analysed to ensure that the story queue is present and occurs in the first
sentence of the story. This ensures that the structural cue is used as intended, to start the
story.
3.3.5.2
Complete passages
In finding stories, the system must ensure that it finds complete stories, that is, ones that
outline a complete thought. Finding stories that are complete passages involves finding
complete thoughts or stories of a length that can keep the audience engaged. For the most
part, we found that blog authors format their entries in a way such that each paragraph
contains one distinct thought. Under this assumption, the paragraph where the structural
story cue and/or topic are mentioned with the greatest frequency will suffice as a
complete story for our system. Given the method described to extract candidate stories
from blogs, these candidate stories will likely take the form of a complete paragraph. If
this paragraph is of an ideal length (between a minimum and maximum character and
word threshold), determined by viewing Buzz with stories at many different lengths, then
it is proposed as a candidate story. For a public installation of Buzz, we found that stories
between 150 and 600 characters long were ideal. Again, given the large volume of blogs
on the web, letting many blogs fall through the cracks because they are too long or too
short is acceptable for our purposes.
3.3.5.3
Filtering retrieval by syntax
In our first pass at retrieving stories from blogs, we noticed that the system often found
lists or surveys instead of text in paragraph form. For example, one blogger posted an
exhaustive list of lip balm flavours. Others posted answers to a survey about themselves
(their favourite vacation spot, favourite colour, favourite band and actor, etc.). These are
clearly not good candidates for stories to be presented in a performance.
Buzz: Mining and Presenting Interesting Stories
143
To solve this problem, we chose to filter the retrieved candidate stories by syntax.
Stories that met any of the following indicators were removed as they often signify a list:
1
Too many newline characters (currently more than six in a 400 character block).
2
Too many commas (currently more than three in a sentence or more than 1 in 15
characters).
3
Too many numbers (currently more than one number – no longer than four
continuous digits – in a sentence).
This method successfully filters blog entries that contain a list or survey of some sort.
While the recall of stories that pass through this filter based on syntax is lower than other
methods, the system is optimised for precision so that we are confident that the remaining
stories do not contain lists or surveys. Given the large volume of blogs on the web
updated every minute, letting some potentially good blogs fall through the cracks sufficed
for the system’s purposes.
3.3.6
Content or impact modifiers
Content or impact modifiers are modification strategies aimed at transforming the
candidate story to make it more compelling or impactful. The current strategy in this
category involves removing extraneous content in order to focus on the spines of the
stories. For example, if the story contains any parenthetical, bracketed or braced content,
it is removed. This includes any remaining html or xml tags. This is based on the notion
that if you were reading this post to a friend, you might ignore such content as it breaks
up the flow of the story.
In addition to the above mentioned Content/Impact modifiers, the author envisions
another modifier in this category, called an amplifier. An amplifier would alter the
candidate stories, so that they are more impactful, emotional or colloquial. This system
would transform words that occurred in a story to more emotional words with the same
connotation. The end result would be a story that conveyed the same meaning, yet with
more emotional impact than in its original form.
This could be implemented with a combination of a part of speech tagger, a
connected thesaurus and our Naïve Bayes sentiment classification model. The system
would attempt to replace adjectives in the candidate story, namely, ones that have only
one sense in the connected thesaurus, making the word unambiguous. From the synonym
set, it could chose the synonym with a higher ‘sentiment magnitude’ from the sentiment
classification Naïve Bayes model. This ‘sentiment magnitude’ is a calculation of how
emotion-bearing a term is. This system will scale and be configurable for how much to
amplify a story.
3.3.7
Content or impact filters
Content or impact filters are used to find interesting and appropriate stories; those with
elevated emotion, familiar and relevant content that, if desired, is free of profanity and
other unwanted language use.
144
3.3.7.1
S.O. Sood
Filtering retrieval by affect
Given that our initial version of Buzz was reading blogs that were boring or uninteresting,
and since such a large volume of blogs exist on the web, we strove to filter the retrieved
relevant stories by affect, giving us the ability to portray the strongest emotional stories.
We accomplish this using a sentiment classification system. Sentiment analysis is a
modern text classification area in which systems are trained to judge the sentiment
(defined in a variety of ways) of any document. Since we simply want to know if a story
is emotional or not, we define sentiment as valence, that is, how positive or negative a
selection of text is.
In our sentiment classification system, a combination of case-based reasoning,
machine learning and information retrieval approaches are used (Owsley, Sood and
Hammond, 2006; Sood, Owsley and Hammond, 2007). Using a training set of labelled
data (movie and product reviews), our system is able to judge how emotional words are
based on their appearance in the training data, and then use the most emotional words in a
story as a query to a case base of labelled data. While many others have used such data to
build Naïve Bayes sentiment classifiers, we find that using a case based approach
preserves the large differences in affective connotation of words across domains. For
example, while the word ‘cold’ has a very negative connotation in describing a person,
being ‘cold’ is seen as a positive attribute of a beverage.
We collected a case base of 106,000 movie and product reviews labelled with a star
rating between one and five (one being negative and five being positive). We omitted
reviews with a score of three as those were seen as neutral. We built a Naïve Bayes
statistical representation of these reviews, separating them into two groups, positive (four
or five stars) and negative (one or two stars). Given a target document, the system creates
an ‘affect query’ as a representation of the document. The query is created by selecting
the words with the greatest statistical variance between positive and negative documents
in the Naïve Bayes model. The system uses this query to retrieve ‘affectively similar’
documents from the case base. The labels from the retrieved documents are used to
derive an affect score between –2 and 2 for the target document. This tool was found to
be 73.39% accurate.
For Buzz, blogs which score between –1 and 1 are seen as neutral and not good
candidates for a performance. When using the emotional filtering tool, Buzz is
considerably more compelling. The actors are able to retrieve stories from the web based
on emotional stance, enabling the theatrical agents to juxtapose positive and negative
stories on the same topic.
Future work on this classification tool includes creating a model of affect based on
Ekman’s six emotion model (happiness, sadness, anger, disgust, fear and surprise)
(Ortony, Clore and Foss, 1987; Ekman, 2003; Liu, Lieberman and Selker, 2003). This
would allow for greater control of the flow of the performance arc through emotional
states.
3.3.7.2
Colloquial filtering
Shamma et al. (2004a) began exploring the use of Csikszentmihalyi Flow State
(Csikszentmihalyi, 1991) as a method of keeping the audience engaged through
audiovisual interaction. In Buzz, for an audience to stay engaged, they must understand
the content of the stories that they are hearing. That is, the story can not involve topics
Buzz: Mining and Presenting Interesting Stories
145
that the audience is unfamiliar with or contain jargon particular to some field. The story
must be colloquial. The story must also not be too familiar as they audience could get
bored or lose interest.
To determine how familiar a story is, the author built a classifier that makes use of
page frequencies on the web. For each word in the story, the system looks at the number
of pages in which this word appears on the web, a frequency that is obtained through a
simple web search. Applying Zipf’s Law (Zipf, 1949) the system can determine how
colloquial each word is (Shamma et al., 2004b). A story is then classified to be as
colloquial as the language used in it. Given a set of possible stories, colloquial thresholds
(high and low) are generated dynamically based on the distribution of scores. If more
than n percent of the words in a story fall below the minimum threshold (where n is
configurable, currently n is 5), that story is seen as being too obscure and is discarded.
3.3.7.3
Language
It is important that that language used in a candidate story is appropriate for presentation
through Buzz. For example, it would sound awkward for a Buzz digital actor to present a
story that begins ‘In my last post’ as the story is taken out of the context of the blog; or
for an actor in a public installation to read a story that contains profanity. For this reason,
Buzz uses a language filter that can be configured to remove stories which include
profanity, or even stories which include words that expose the fact that it was extracted
from a blog. For example, some blog posts are often started with the phrase ‘In my last
post. …’ While this is appropriate when a reader understands that what they are reading
is a blog, etc. this is inappropriate or awkward when taken out of the context of the blog
posting, and presented through an embodied avatar.
To filter out stories with such language, this filter uses a dictionary based approach.
It can be provided with a list of words to filter based on. From there, the system can be
configured to only filter based on those words, or also to include stems of those terms for
broader coverage. As with all other Buzz filters, this filter may be turned ‘on’ or ‘off’
when appropriate.
3.3.8
Presentation modifiers
In addition to presentation filters, the author built a set of presentation modifiers aimed at
altering the text to make it more appropriate for presentation through a computer
generated voice. Upon reaching the presentation modifiers, the candidate stories have
passed through the three major filter sets (story filters, content or impact filters and
presentation filters) as well as the story modifiers. The next step is to prepare them to be
spoken by a speech generation engine.
Adjacent punctuation is condensed as the speech engines use this punctuation for
pauses, so adjacent punctuation would result in long pauses. Any remaining numbers,
dates and monetary amounts are altered to be readable by the speech engines. Finally,
abbreviations are substituted to their expanded form, and any remaining acronyms or
abbreviations are expanded to instruct the speech engine correctly. For example, ‘APA’
would be expanded to ‘A. P. A’. so that the speech engine spells out the acronym as
opposed to treating it as a word.
146
S.O. Sood
Upon completing all of these modifications, the story candidate is passed through all
filters a second time. This ensures that any transformations made on the text did not
change its value or quality as a story, or how appropriate it is for presentation.
3.3.9 Presentation filters
Presentation filters are used to focus on content that will sound appropriate when spoken
through a computer generated voice, and presented by an avatar of the appropriate
gender.
3.3.9.1
Presentation syntax filter
While syntax filtering was included in the ‘story filters’, it is also important in the
presentation filters, due to the limitations of computer generated speech. Blogs, by their
nature, are often casually punctuated and structured. While this is not generally a problem
for the reader, it poses a problem when presented through a text-to-speech engine.
Text-to-speech engines use punctuation as cues for prosody and cadence (Sproat,
Ostendorf and Hunt, 1998). For this reason, when a story is poorly punctuated, or it
contains too many numbers, numbers with many digits, URLs, links or e-mail addresses
which sound bad when presented by a text-to-speech engine, they are filtered by the
presentation syntax filter.
This filter also removes stories that contain a direct quote which makes up more than
one third of the story. We found that lengthy direct quotes are awkward when read by a
computer generated voice. When a person reads a direct quote, they often have a change
of inflection to indicate a different speaker. This change does not occur in computer
generated voices, often causing the listener some confusion. For this reason, candidate
stories that fall into this category are discarded.
3.3.9.2
Detecting gender-specific stories
One problem encountered in a first pass of building Buzz was that gender-specific stories
were occasionally read by actors of the incorrect gender. For example, if a blogger
describes their experiences during pregnancy, it is awkward to have this story performed
by a male actor. Conversely, if a blogger talks about their day at work as a steward,
having this read by a female could also be slightly distracting.
As a solution to this problem, the author sought to detect and classify gender-specific
stories. Unlike previous gender classification systems (Koppel, Argamon and Shimoni,
2003) it was not necessary for our system to classify all stories as either male or female.
Rather, it was only important for the system to detect stories where the author’s gender is
evident, thus classifying stories as male, female, neutral (in the case where genderspecificity is not evident in the passage) or ambiguous (in the case where both male and
female indicators are present).
To do this, the system looks for specific indicators that the story is written by a male
or a female. These indicators include self-referential roles (roles in a family and job
titles), physical states and relationships. These three types of indicators are treated as
three separate rules for gender detection in the system.
To detect self-referential roles in a blog, the system looks for ‘I’ references including
‘I am’, ‘I was’, ‘I’m’, ‘being’ and ‘as a’. These phrases indicate gender-specificity if they
Buzz: Mining and Presenting Interesting Stories
147
are followed within five words (if none of these five words are pronouns) by
a female-only or male-only role such as wife, mother, groom, aunt, waitress, mailman,
sister, etc. Such roles were collected from various sources and enumerated as such. This
rule set is meant to detect cases like ‘I am a waitress’, which would indicate that the
speaker is a female. Excluding extra pronouns between the self-reference and the role
eliminates false positives like ‘I was close to his girlfriend’, where the additional ‘his’
ensures that this rule is not applied.
To detect physical states that carry gender connotations, the system again looks for ‘I’
references, as above, followed within five words by a gender-specific physical state such
as ‘pregnant’. This rule is meant to detect cases like ‘I am pregnant’. As in detecting
roles, we also ignore cases with extraneous pronouns between the ‘I’ reference and the
physical state. This eliminates false positives like ‘I was amazed by her pregnancy’.
To detect male or female-only relationships, the system looks for use of the word
‘my’ followed within five words by a male or female only relationship such as husband,
ex-girlfriend, etc. This rule is intended to catch cases such as ‘my ex-husband’. Again,
cases with extraneous pronouns are ignored to eliminate false positives like ‘my feelings
towards his girlfriend’. In this our first pass at a gender specific story classification
system, we make the assumption of heterosexual relationships, which we hope to relax in
a future system.
If any of three above indicators exists in a story, and they agree on a male/female
classification, then the story is classified as such. If they disagree, it is classified as
‘ambiguous.’ If no indicators exist, it is classified as ‘neutral.’
This gender detection tool was evaluated using a corpus of 96 stories retrieved by
Buzz. These stories were retrieved from an indexed corpus of stories found by Buzz. They
were selected by queries for words that often indicate gender-specificity (‘pregnant’,
‘mom’, ‘mother’, ‘dad’, ‘father’, ‘girlfriend’, ‘boyfriend’, ‘husband’, ‘wife’ and
‘daughter’). They were manually sorted into three groups, stories written by females,
males or neutral (written by males or females). This sorting was based on textual cues
that gave a clear indication of gender, and was verified unanimously by from five
participants.
While this gender-classification system is still simple, it does an admirable job.
Results showed that the gender detection tool performed very well, as seen in the
precision and recall scores in Table 3. Overall precision and recall were both
approximately 91.67%. Enabling Buzz with the ability to detect and handle gender
specific stories has created a more realistic performance, without the distraction of an
actor performing a gender-mismatched story.
Table 3
Precision and recall scores for detection of gender specific stories
Document type
Precision (%)
Recall (%)
Female-specific
92.59
86.21
Male-specific
100
84.62
Gender-neutral
89.66
96.30
Overall
91.67
91.67
148
3.3.10
S.O. Sood
Evaluation
An example of three stories discovered by Buzz can be seen in Table 2. The stories shown
were retrieved and passed through all above-mentioned filters. To evaluate the
effectiveness of Buzz’s filters in finding compelling stories, the author conducted a user
study including 12 participants. Each participant was given five stories to score on a scale
from 1 to 10 (uninteresting to interesting). The stories were chosen at random from a set
of stories selected by Buzz as good candidates for a performance, and a set of stories
retrieved by Buzz, but removed as they did not pass one of the filters.
On a scale of 1–10 (uninteresting to interesting), the study participants found Buzz
selected stories to be an average 7.13 and Buzz rejected stories to be an average of 4.3. A
graph of the frequencies of participant scores across Buzz accepted and Buzz rejected
stories can be seen in Figure 4.
While the author is quite pleased with the results of the Buzz story discovery system
as a method for autonomously discovering compelling stories, the author thinks it has
even broader implications and contributions. Consider how people tell stories – they
never ‘create’ or ‘invent’ new stories, instead they recall the most interesting stories they
have heard or experiences they have had, they may merge different experiences together,
and they alter the details to make a more compelling story. In a sense, we have codified
how people tell or create stories, building a system to do just that. In a way, the author
thinks this system has made contributions to the decades old problem of story generation
in Artificial Intelligence.
3.4 Creating a performance
While finding compelling stories is an important aspect of Buzz, conveying them to an
audience in an engaging way is just as crucial. The author found several aspects of the
presentation to be critical. The performance must follow a dramatic arc that keeps
the audience engaged. Text-to-speech technology and graphics must be believable
Figure 4
The results of an evaluation of the Buzz story discovery engine, where participants
judged stories on a scale from 1 to 10 (uninteresting to interesting) (see online version
for colours)
Buzz: Mining and Presenting Interesting Stories
149
(or suitable) and evocative. While these issues are a subset of those critical to an
engaging performance, the author chose to address these directly as the author feels that
our findings can generalise to other performance systems.
3.4.1
Embodied presentation of stories
Some theories of knowledge representation cite stories, also called cases, as the core
representation schema in our minds (Schank, 1999). They explain that these stories are
indexed in our memory by their details (locations, situations, etc.). In the following
passage from Dynamic Memory Revisited, Schank explains the power of stories.
“People tell stories because they know that others like to hear them, although
the reason people like to hear stories is not transparent to them. People need a
context to help them relate what they have heard to what they already know.
We understand events in terms of events we have already understood. When a
decision-making heuristic, or rule of thumb, is presented to us without a
context, we cannot decide the validity of the rule we have heard, nor do we
know where to store it in our memories. Thus, what we are presented with is
both difficult to evaluate and difficult to remember, making it virtually useless.
People fail to couch what they have to say in memorable stories will have their
rules fall on deaf ears despite their best intentions and despite the best
intentions of the listeners. A good teacher is not one who merely explains
things correctly, but one who couches explanations in a memorable (i.e., an
interesting) format.” –Roger Schank (Schank, 1999)
We feel that to truly understand and connect to people, stories are essential. Furthermore,
presenting these stories in an interesting way is a critical part of understanding. We feel
that embodying the bloggers with avatars allows the viewer to more easily embed
themselves in the narrative and internalise its impact.
3.4.2
The display
The current Buzz installations include five flat panel monitors in the shape of an ‘x’. The
four outer monitors display actors represented by different adaptations of the graphics
from Ken Perlin’s Responsive Face technology (Perlin, 1996; Perlin and Goldberg,
1996). These faces are synchronised with voice generation technology (NeoSpeech,
2006) controlled through the Microsoft Speech API, matching mouth positions on the
faces to viseme events, lip position cues output by the MSAPI. Within this configuration,
the actors are able to read stories and turn to face the actor currently speaking.
The central screen (Figure 2) displays emotionally evocative words, pulled from the
text currently being spoken, falling in constant motion. These words are extracted from
the stories using our emotion classification technology. The most emotional words are
extracted by finding the words with the largest disparity between positive and negative
probabilities in a Naïve Bayes statistical model of valence labelled reviews. I have found
this display to be a good addition to the actors to give the audience additional context in
the performance and amplify the impact of the emotional words.
3.4.3
Director level control
Given the above classifiers and filters, the system is able to retrieve a set of compelling
stories. These filters and classifiers also give us a level of control of the performance
150
S.O. Sood
similar to that of a director. Having information about each story such as its ‘emotional
point of view’, its ‘familiarity’ and the likely gender of its author, the structure of an
ongoing performance or individual story presentation in an online system can be planned
out from a high level view before retrieving the performance content, giving the
performance a flow, based not only on content, but also on emotion, familiarity, on-point
vs. tangential, etc. Given a topic, when the system is presenting multiple stories, the
system can juxtapose stories with different emotional stances, different levels of
familiarity, and on-point vs. off-point. These affordances give a meaningful structure to
the performance.
To provide a high level control of the performance, we created an architecture for
driving the retrieval of performance content. The structures, called ARCs, provide high
level instructions to the Buzz engine as to what is needed, where to find it, how to find it,
how to evaluate it, how to modify queries if needed and how to adapt the results to fit the
current goal set. To get an idea of how the ARCs interact with the blog search and
filters, see Figure 5.
The pictured ARC (Figure 5) defines a point/counterpoint/dream interaction between
agents. The three modules define three different information needs, as well as the sources
for retrieval to fulfil these needs. The first module specifies that we want a blog entry that
is on point to a specified topic, has passed through the syntax and colloquial filters, and is
generally happy on the topic. The module specifies using Google Blog Search
(Google, 1996) as a source. The source node specifies to form queries by single words as
well as phrases related to the topic. If too few results are returned from this source, we
have specified that queries are to be continually modified by lexical expansion and
stemming.
The ARC extensible framework allows for interactions from directors with little
knowledge of the underlying system. In a future system, we will accomplish this via a
range of possible interfaces from storyboarding and affect manipulation to a natural
language interface.
Figure 5
3.4.4
A sample dramatic adaptive retrieval chart used to drive a Buzz performance
(see online version for colours)
Compelling speech
While text-to-speech systems have made great strides in improving believability of
generated speech, these systems are not perfect (Black, 2002). Their focus has been on
telephony systems, where the length of time of spoken speech is limited and emotional
Buzz: Mining and Presenting Interesting Stories
151
speech is unnecessary. In watching a performance of Buzz using such text-to-speech
systems, the voices tend to drone monotonously during stories longer than one to two
sentences. An additional problem is caused by the stream of consciousness nature of
some blogs, resulting in casual formatting with poor or limited punctuation. As
mentioned earlier, text-to-speech systems generally rely on punctuation to provide natural
pauses in the speech. In blogs where limited punctuation was present, the voices tended
to drone on even more.
In response to these issues, we created a model for emotional speech emphasis.
Others have created models for how to emphasise words in generated speech (Raux and
Black, 2003) and which words to emphasise. While these models are successful, we
strove to create a simple model that would scale to our needs and capture the emotional
element of the stories. Our system also includes a model for emotional speech emphasis
at the sentence level. First, the system uses a sentence level emotion classifier to
determine which sentences in a story are highly affective, and which emotion they are
characterised by. In the exemplary installation of Buzz, the text is marked up at the
sentence level for its emotional content (happy, sad, angry, neutral, etc.). This can also be
done in larger spans such as at the paragraph or story level, or in smaller spans such as
the word or phrase level. The models of emotion used can be replaced by a more or less
detailed model of emotion.
Many speech engines allow XML markup to control the volume, rate and pitch of the
voices, as well as to insert pauses of different periods (specified in milliseconds) in the
speech. We use this XML markup, in combination with an off-the-shelf audio processing
toolkit, to alter the sound of the speech according to its emotional markup. For example,
to handle a happy sentence, the pitch will be raised, rate will be increased and the pitch of
the voice will rise slightly at the end of the sentence. These changes in speech attributes
to portray emotion are informed by a study of emotional human speech (Cahn, 1990). For
each of Ekman’s six emotions, Cahn’s study provides dimensions on which generated
speech can be altered in order to render that emotion.
An off the shelf audio processing toolkit was necessary for this system as some
speech engines to not support the necessary parameter of control (pitch, rate, volume,
etc.). In addition, using an audio tool as post processing often conserves the quality of the
voice more than using the speech engine’s internal controls.
In addition to using a model of emotional emphasis, the system inserts pauses into the
audio stream at natural breaking points. This technique tends to improve performance on
blogs with limited punctuation. While our model of emotional speech emphasis is
simplistic, we have found it to be an effective way to enhance the Buzz experience. We
expect to further tweak our emphasis model in response to audience or user feedback.
3.5 Buzz installations
Using filters, modifiers and information retrieval strategies that focus on finding the
interesting has resulted in an engaging theatrical installation. While finding compelling
stories to present is a very important part of the Buzz performance, presenting these
stories in a way that is meaningful and engaging is equally important. The author found
issues of embodiment, gender-specificity, voice prosody and presentation flow and order
to be the aspects of a Buzz performance with which the author could make great strides in
improving.
152
S.O. Sood
Figure 6
Pictures of four public Buzz installations (see online version for colours)
Enabling Buzz with the ability to discover compelling stories has produced great results.
Buzz has changed from a research project accessing stories that were unbearably dull,
exposing the boring nature of many blogs, to a system that engages its viewers. The
performance is now not driven simply by the relevance of on-line content, but by the
blogger’s emotional state. The highly emotional content engages the audience and creates
a high visibility installation.
Buzz debuted from April 22nd through May 1st 2005 at the Athenaeum Theater as a
part of the 8th Annual Chicago Improv Festival. It was well-received by actors, writers,
producers and theatre-goers alike during this 10 day installation. Buzz was installed in the
lobby of Chicago’s Second City Theater at 1616 N. Wells Street in Chicago on August
24th, 2005 for a one year installation. Buzz was also exhibited at Wired NextFest in New
York City from September 29th to October 1st, 2006. See Figure 6 for still shots of the
four major installations.
4
Related work
Buzz is situated in a larger area of work called Network Arts (Shamma et al., 2004a;
Shamma, 2005) an area focused on the creation of informative and compelling
performance installations that find, use and expose the massively interconnected content
that exists on the web. Systems in this area both inform viewers of the current state of the
web, and enlighten them with associations and points of view. These systems are
important because they bridge the gap between the internet and digital art, an area that
traditionally uses the machine as an instrument which is utilised by a human artist. Using
the internet as a resource, systems in this area are scalable and autonomously generate
creative and entertaining experiences that provide a reflection of the world that we
live in.
Artistically, story telling and online communication have been externalised within
several installations. Of the more well known in this area, Listening Post (Hansen and
Rubin, 2002, 2004) is an art installation that exposes content from thousands of chat
Buzz: Mining and Presenting Interesting Stories
153
rooms and other online forums through an audio and visual display. The audio
component is made up of synthesised voice of the content which is displayed visually on
more than 200 tiny screens. Listening Post exposes the real-time Buzz of chat rooms on
the web, in a manner that was designed to convey ‘the magnitude and diversity of online
communication’.
Inspired by Listening Post, We Feel Fine (Harris and Kamvar, 2007) is an ongoing
online art installation that was created in August of 2005 by Jonathan Harris and Sep
Kamvar. The backend for the installation mines the blogosphere for sentences containing
the phrase ‘I feel’ or ‘I am feeling’. It discovers 15,000–20,000 new sentences per day
and for each sentence, when possible, extracts demographic information (age, gender,
location, etc.) and any image associated with the blog entry. Using the time of the post
and the location information, it gathers information about the current weather at the
blogger’s location. It also does a simple association of the sentence to zero or more of
5,000 hand coded emotions. The associations are determined by mere presence of the
emotion term in the sentence. The system has mined and stored these sentences since
August of 2005 and visualised them in several interesting ways through a web applet.
Like both Listening Post and We Feel Fine, Buzz externalises online communication,
through a context refined via search and extraction. The Buzz filters and retrieval
strategies focus the presentation on stories and specifically on compelling and emotional
stories while these two installations are more broadly oriented to represent or portray the
breadth of online communication, while also targeting key phrases. Listening Post
demonstrates online communication as a whole, while Buzz focuses on singular voices
from the blogosphere, grounded in current popular topics and/or dramatic situations. Buzz
also differs in that the bloggers are embodied by avatars to give voice and body to the
stories.
5
Conclusions and the future
In this article, the author has described a system that connects people through stories.
This system amplifies the stories that people tell and facilitates and strengthens
connections between people. In addition to autonomously finding stories and judging
interestingness, a concept that extends to search by interestingness, the Buzz story
discovery engine has broader implications and contributions. Consider how people tell
stories – they never ‘create’ or ‘invent’ new stories, instead they recall the most
interesting stories they have heard or experiences they have had, they may merge
different experiences together, and they alter the details to make a more compelling story.
In a sense, we have codified how people tell or create stories, building a system to do just
that. In a way, the author thinks these systems made contributions to the decades old
problem of story generation in artificial intelligence.
The clear next step for Buzz includes moving it online, where instead of connecting
thousands of people in a public installation, it could impact millions. Current and future
work in this area involves creating a destination entertainment site (www.buzz.com),
where viewers can see interesting stories presented to them by a diverse set of realistic
and engaging avatars. The stories will exist in a variety of high level categories including
compelling stories involving dramatic situations (dreams, fights, confessions, etc.) topical
stories (around new products or hot topics), and news related points of view (evolving
daily). Each day, Buzz.com will produce and host approximately 1,400 new 45–90 sec
154
S.O. Sood
video stories based on the updated daily news events, topics, etc. The videos can be
navigated via topical search or browsed through a set of hierarchical categories. The site
will allow users to comment on videos, rate them, and recommend them to friends. We
hope that Buzz.com will be successful in connecting millions of people through stories of
experiences and points of view.
References
Black, A. (2002) ‘Perfect synthesis for all of the people all of the time’, IEEE TTS Workshop,
Santa Monica, CA.
Cahn, J.E. (1990) ‘The generation of affect in synthesized speech’, Journal of the American Voice
I/O Society, Vol. 8, pp.1–19.
Csikszentmihalyi, M. (1991) Flow: The Psychology of Optimal Experience. New York, NY: Harper
Perennial.
Ekman, P. (2003) Emotions Revealed: Recognizing Faces and Feelings to Improve Communication
and Emotional Life. New York, NY: Henry Holt and Company.
Google (1996) Google, Available at: http://www.google.com.
Green, H. (2007) With 15.5 Million Active Blogs, New Technorati Data Shows that Blogging
Growth Seems to be Peaking; appears in Business Week.com.
Hansen, M. and Rubin, B. (2002) ‘Listening post: giving voice to online communications’,
International Conference on Auditory Display, Kyoto, Japan.
Hansen, M. and Rubin, B. (2004) Listening Post, Available at: http://earstudio.com/projects/
listeningpost.html.
Harris, J. and Kamvar, S. (2007) We Feel Fine, Available at: http://www.wefeelfine.org/.
Hoffmann, R. (2000) ‘Narrative’, American Scientist Online, Vol. 88, p.310.
Koppel, M., Argamon, S. and Shimoni, A. (2003) ‘Automatically categorizing written texts by
author gender’, Literary and Linguistic Computing, Vol. 17, pp.401–412..
Liu, H., Lieberman, H. and Selker, T. (2003) ‘A model of textual affect sensing using real-world
knowledge’, Intelligent User Interfaces (pp.125–132). New York, NY: ACM Press.
Neospeech, Inc. (2006) Available at: http://www.neospeech.com/.
Ortony, A., Clore, G.L. and Foss, M.A. (1987) ‘The referential structure of the affective lexicon’,
Cognitive Science, Vol. 11, pp.341–362.
Owsley, S. and Sood, S. and Hammond, K.J. (2006) ‘Domain specific affective classification of
documents’, AAAI Spring Symposium on Computational Approaches to Analyzing Weblogs,
Palo Alto, California.
Perlin, K. (1996) Responsive Face Project. Available at: http://mrl.nyu.edu/projects/improv/.
Perlin, K. and Goldberg, A. (1996) ‘Improv: a system for scripting interactive actors in virtual
worlds’, Computer Graphics, Vol. 29, pp.205–216.
Polti, G. and Ray, L. (1940) The Thirty-Six Dramatic Situations. Boston, MA: Writer.
Raux, A. and Black, A. (2003) ‘A unit selection approach to f0 modeling and its application to
emphasis’, ASRU, St. Thomas, US, Virgin Islands.
Schank, R.C. (1990) Tell Me A Story. Evanston, IL: Northwestern University Press.
Schank, R.C. (1999) Dynamic Memory Revisited. Cambridge, UK: Cambridge University Press.
Shamma, D.A. (2005) ‘Network arts: defining emotional interaction in media arts and information
retrieval’, Computer Science Department, Evanston, IL: Northwestern University, Doctor of
Philosophy.
Shamma, D., Owsley, S. Bradshaw, S. and Hammond, K.J. (2004a) ‘Using web frequency within
multimedia exhibitions’, ACM Multimedia. New York, NY: ACM Press.
Buzz: Mining and Presenting Interesting Stories
155
Shamma, D.A., Owsley, S., Bradshaw, S. and Hammond, K.J. (2004b) ‘Network arts: exposing
cultural reality’, International World Wide Web Conference, New York.
Simmons, A. (2002) The Story Factor: Inspiration, Influence, and Persuasion Through the Art of
Storytelling. London, UK: Perseus Books Group.
Sood, S., Owsley, S., Hammond, K. and Birnbaum, L. (2007) Reasoning Through Search: A Novel
Approach to Sentiment Classification. Northwestern University Tech Report Number
NWU-EECS-07-05.
Sproat, R., Ostendorf, M. and Hunt, A. (1998) The Need for Increased Speech Synthesis Research:
Report of the 1998 NSF Workshop for Discussing Research Priorities and Evaluation
Strategies in Speech Synthesis, NSF.
Steinbeck, J. (1977) ‘In awe of words’, The Exonian, 75th anniversary edition, in G. Plimpton
(Ed.). Devon, UK: Exeter University.
Technorati (2007) Technorati, Available at: www.technorati.com.
Wikipedia (2005) Wikipedia, Available at: http://www.wikipedia.org/.
Yahoo! (2006) Yahoo!, Available at: http://www.yahoo.com.
Zemeckis, R. (2000) Cast Away.
Zipf, G. (1949) Human Behavior and the Principle of Least-effort. Cambridge, MA:
Addison-Wesley.
Download