Online Consumer Review Helpfulness: Objectives & Cues

Helpfulness of Online Consumer Reviews: Readers' Objectives and Review Cues
Author(s): Hyunmi Baek, JoongHo Ahn and Youngseok Choi
Source: International Journal of Electronic Commerce, Winter 2012-13, Vol. 17, No. 2
(Winter 2012-13), pp. 99-126
Published by: Taylor & Francis, Ltd.
ABSTRACT: With the growth of e-commerce, online consumer reviews have increasingly
become important sources of information that help consumers in their purchase decisions.
However, the influx of online consumer reviews has caused information overload, making
it difficult for consumers to choose reliable reviews. For an online retail market to succeed,
it is important to lead product reviewers to write more helpful reviews, and for consumers
to get helpful reviews more easily by figuring out the factors determining the helpfulness
of online reviews. For this research, 75,226 online consumer reviews were collected from
Amazon.com using a Web data crawler. Additional information on review content was
also gathered by carrying out a sentiment analysis for mining review text. Our results
show that both peripheral cues, including review rating and reviewer's credibility, and
central cues, such as the content of reviews, influence the helpfulness of reviews. Based
on dual process theories, we find that consumers focus on different information sources of
reviews, depending on their purposes for reading reviews: online reviews can be used for
information search or for evaluating alternatives. Our findings provide new perspectives
to online market owners on how to manage online reviews on their Web sites.
KEY WORDS AND PHRASES: Consumer decision-making process, dual process theory,
eWOM (electronic word of mouth), online consumer review, review helpfulness.
Online consumer review, a form of electronic word of mouth (eWOM), draws
particular attention because of its effect on the purchasing decision of consumers.
When buying products from an online retail market, consumers find it
difficult to make purchase decisions based on information provided by sellers. Thus, consumers look for more detailed product information from online
reviews written by other consumers. This consumer-oriented information is
helpful in making purchase decisions because it provides indirect experiences
of products [45]. Consumer-oriented information may have greater credibility
and relevance than seller-oriented information [3]. Consequently, online consumer reviews can be used as a tool for gaining consumer trust. Kumar and
Benbasat [31] found that the perceived usefulness of online retail Web sites
increases when consumer reviews are available on Web sites.
However, not all online consumer reviews have the same effect on purchase decisions. Reviews that are considered more helpful by consumers have
stronger effects on consumer purchase decisions than other reviews [8, 12]. In
addition, McKnight and Kacmar [38] indicated that the most important factor
in eWOM adoption is information credibility. Therefore, an online retail market
should provide credible consumer reviews to achieve continued success.
Recent studies have investigated the factors affecting helpfulness of online
consumer reviews. Mudambi and Schuff [39] found that review extremity
and depth of review have different effects on review helpfulness, depending
on whether a product is a search good or an experience good. Ghose
and Ipeirotis [19] found that subjectivity, readability, and linguistic correctness of a review message affect its helpfulness. In addition, several studies
have focused on the factors affecting information credibility and knowledge
adoption, but are not directly related to online consumer reviews [4, 11, 61,
62]. These studies show the effects of source credibility, argument quality,
and information consistency on information credibility and, eventually, on
knowledge adoption.
The first objective of this research is to figure out the factors that determine
the helpfulness of online reviews, which is used as a reflection of information
credibility [8]. A large number of studies have been conducted to examine
the effect of online reviews on revenue [2, 8, 12, 15]. However, only a few
studies, including those by Ghose and Ipeirotis [19] and by Mudambi and
Schuff [39], have investigated the factors that determine which kind of reviews
are regarded as helpful. Before examining the influence of online reviews on
revenue, we need to figure out which factors are important for helpful online
reviews.
The second objective of this research is to study which factors, depending
on the purpose for reading reviews, are more important for helpful online
reviews. For this objective, the study is based on dual process theories, which
distinguish between two types of information processing, one of which takes
relatively more effort and is more extensive than the other [21]. In this research,
two prominent dual process theories, namely, the heuristic systematic model
(HSM) by Chaiken [6] and the elaboration likelihood model (ELM) by Petty and
Cacioppo [50], are used to classify the factors that influence the helpfulness of
a review into peripheral cues for heuristic information processing and central
cues for systematic information processing. The consumer decision-making
process is also applied in our research model. Consumers use online consumer
reviews in the stage of information search and evaluation of alternatives [38].
Consumers tend to focus on different information sources, either peripheral
or central cues of reviews, depending on their motivation for reading reviews.
Consumers focus on peripheral cues in the stage of information search, and on
central cues in the stage of evaluation of alternatives. To classify the motivation of reading reviews, this research divides products into (1) search versus
experience goods and (2) high-priced versus low-priced goods. This research
analyzes which information source is the most important deciding factor for
helpfulness of a review depending on what products consumers would like
to buy. The data on reviews were gathered from Amazon.com using a Web
data crawler, and their content was analyzed using sentiment analysis for
mining review text.
The remainder of this paper is organized as follows. The next section
reviews the theoretical foundation and presents the research model and hypotheses
development. The third section describes the research methodology, including data collection and sentiment analysis for mining review messages. The
fourth explains the results of the empirical analysis from actual Amazon.com
review data. Finally, the fifth section presents a short discussion of the results
and discusses limitations and future areas of research.
Theoretical Foundation and Model
Dual process theories examine the role played by both the information content
of the message and the factors of its context affecting message credibility [60].
These theories have been most influential in the field of persuasion and attitude
change [54] and are useful for explaining effective communication in group
opinions [4, 11, 60, 61]. The HSM and the ELM are the two most prominent
theories that use the dual process approach. Among the dual process theories,
HSM is most closely allied with ELM [9]. Chen and Chaiken [9] pointed out
that both theories maintain that "central" or "systematic" processing requires
capacity and motivation, whereas "peripheral" or "heuristic" processing may
occur with little of either. Of these two theories, HSM distinguishes between
systematic information processing, or when a subject exerts considerable
cognitive effort in performing the task, and heuristic information processing,
or when a subject exerts comparatively little effort in judging the validity of a
message [6]. Thus, people who engage in systematic information processing
attempt to comprehend and evaluate the arguments in a message, as well as
assess their validity in relation to the conclusion. By contrast, people who engage in heuristic information processing rely on more accessible information,
such as the source identity or other noncontent cues, in deciding whether to
accept the conclusion of a message rather than processing argumentation [6].
In contrast to HSM, ELM distinguishes between the central route, wherein a
subject considers an idea logically, and the peripheral route, wherein a subject
uses preexisting ideas and superficial qualities to be persuaded [50]. Persuasion
through the central route occurs when the message's recipient is motivated
and is able to think on the issue. By contrast, persuasion through the peripheral route occurs when either motivation or ability is low [49]. Both HSM and
ELM have been widely applied to understand how information processing by
individuals leads to their decision outcomes in online environments [33, 44, 46,
53, 57, 59, 60]. Sussman and Siegal [57] integrated the technology acceptance
model (TAM) with dual process theories to investigate how knowledge workers are influenced to adopt advice in computer-mediated contexts. In addition,
Zhang and Watts [60] investigated the factors influencing knowledge adoption
in online communities based on dual process theories. Zhang et al. [59] applied
HSM to explain the information-processing behavior of consumers in online
consumer review platforms. Based on ELM, prior studies have examined the
effect of online consumer review depending on consumer skepticism [53],
consumer involvement [33, 46], and consumer expertise [44]. This research
applies both HSM and ELM to classify information in online consumer reviews
into peripheral cues for heuristic information processing and central cues
for systematic information processing. Based on these theories, the current
research attempts to analyze which type of information processing occurs
depending on the purpose for reading online consumer reviews.
Two principles, the least-effort and the sufficiency principles, determine
whether a decision maker engages in heuristic or systematic processing [56].
The least-effort principle states that individuals are economy minded in that
they try to arrive at judgments and decisions as quickly and painlessly as possible [1, 56]. In this situation, people are drawn to heuristic processing [56].
Figure 1. Research Model
The sufficiency principle, in contrast, states that individuals are accuracy or
validity minded in that they want to feel sufficient confidence in their decisions [1, 56]. If heuristic processing yields sufficient confidence, then there is
no need to engage in systematic processing. However, decision makers clearly
have to perform systematic processing to reach a final decision when multiple
alternatives remain after the application of heuristic processing [56].
The consumer decision-making process can also be explained with these
principles. The process consists of several steps, namely, problem recognition, information search, evaluation of alternatives, product choice, and outcome [55]. Mudambi and Schuff [39] stated that consumers can use consumer
reviews for information search and evaluation of alternatives. In the information
search stage, consumers explore information to select several alternatives
to make better purchase decisions. At this stage, consumers utilize heuristic
information processing to reduce the amount of information they have to process
in making a decision [47]. Based on the least-effort principle, consumers
engage in heuristic information processing at this stage. They then evaluate
the alternatives based on their criteria and make a final choice in the stage
of evaluation of alternatives. An individual having a strong desire to reach
an accurate conclusion is more likely to engage in systematic processing [7].
Consumers engage in systematic information processing at this stage based
on the sufficiency principle.
In this regard, our model illustrates that consumers take into account both
peripheral and central cues in determining which review is helpful, as depicted
in Figure 1. In this model, the dependent variable is the helpfulness of a review.
Chen et al. [8] stated that review helpfulness, which can be indexed on how
helpful the community found the review, can be used as one of the measures
for review information credibility. Likewise, in this research, review helpfulness, the degree to which other consumers believe that the review is helpful
to make a purchase decision, is used as a reflection of information credibility.
Review helpfulness is measured as the ratio of the number of positive answers
to total answers to the question asking if the review is helpful. Independent
variables include the information cues recognized by consumers from online
reviews, namely, star rating of a review, information on a reviewer (such as
reviewer's ranking and real name exposure), and information on review
content, including the number of words and proportion of negative words
in the review content. Based on ELM, review star rating, reviewer's ranking,
and reviewer's real name exposure are classified as peripheral cues, whereas
the word count and proportion of negative words in the review message are
classified as central cues. In the current research model, we expect product
type to moderate the helpfulness of an online consumer review. Product types
are divided into the following categories: (1) search versus experience goods
and (2) high-priced versus low-priced goods.
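The dependent variable and the cue classification described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the variable labels, and the choice to express the ratio as a percentage (Table 2 reports helpfulness on a 0 to 100 scale) are assumptions made for the example.

```python
def review_helpfulness(helpful_votes: int, total_votes: int) -> float:
    """Share of positive answers among all answers to the
    'Was this review helpful?' question, expressed as a percentage."""
    if total_votes <= 0:
        raise ValueError("helpfulness is undefined without any votes")
    if not 0 <= helpful_votes <= total_votes:
        raise ValueError("helpful_votes must lie between 0 and total_votes")
    return 100.0 * helpful_votes / total_votes


# Cue classification from the research model; the labels are hypothetical names.
PERIPHERAL_CUES = ("review_star_rating", "reviewer_ranking", "real_name_exposure")
CENTRAL_CUES = ("word_count", "negative_word_proportion")
```

For example, a review with 18 helpful votes out of 24 total votes would have a helpfulness of 75 percent under this definition.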
Peripheral Cues of Online Consumer Reviews
Peripheral cues are noncontent cues used in a subjective manner in heuristic
information processing [6]. Peripheral route attitude changes are based on a
variety of attitude change processes that require less cognitive effort [51]. In
an online consumer review, these cues include rating, reviewer's ranking, and
reviewer's real name exposure, which are more accessible noncontent cues.
Rating Inconsistency
Mudambi and Schuff [39] attempted to analyze the relationship between review rating and review helpfulness. They indicated that review helpfulness
increases when the rating is low or high for search goods and moderate for
experience goods [39]. However, the main focus of the current research is not
on finding the relationship between rating extremity and review helpfulness;
rather, the hypothesis of this research is that one of the influential factors in
a review's helpfulness is the consistency of the rating with existing reviews'
average rating for a certain product. Zhang and Watts [60] indicated that
information consistency, or the extent to which the current message is consistent with the prior knowledge of the member, delivers a positive influence
on knowledge adoption in communities of practice. Cheung et al. [11] also
showed that the extent to which the eWOM recommendation is consistent
with other contributors' experiences concerning the same product evaluation
has a positive effect on perceived eWOM credibility. The average star rating
for a certain product may be the other consumers' congruent opinions on
the product. Thus, consumers exposed to the review may judge the review
whose rating is consistent with the average rating to be the most trustworthy
review, leading them to conclude that the review is helpful. We hypothesize
that higher rating inconsistency with the average rating lowers the review's
helpfulness by lowering its credibility.
Hypothesis 1: The higher the difference between the review star rating and
the product average rating, the lower the review helpfulness becomes.
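One way to operationalize the difference in Hypothesis 1 is sketched below. Taking the absolute difference, so that deviations above and below the average count equally, is an assumption of this example; the function and parameter names are hypothetical.

```python
def rating_inconsistency(review_rating: float, product_average_rating: float) -> float:
    """Distance between one review's star rating and the product's average rating.

    Assumption: the absolute difference is used, so a review that is 1.5 stars
    above the average is treated the same as one that is 1.5 stars below it.
    """
    return abs(review_rating - product_average_rating)
```

Under this sketch, a one-star review of a product averaging 4.5 stars has an inconsistency of 3.5, whereas a five-star review of the same product has an inconsistency of 0.5.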
Reviewer Credibility
Previous studies have found that source credibility plays a significant role
in adopting online information [4, 60]. In an online community, the exposure of a user's identity influences active participation in contributing one's
knowledge [37]. Moreover, the user's activity level has a positive effect on the
credibility of a review [11]. From these studies, the exposure of a reviewer's
identity and activity level can be assumed to have positive effects on the credibility of a review. Amazon.com bestows badges on reviewers who have been
ranked higher and have exposed their real names so that consumers reading
their reviews will be aware of the reputation and identity of these reviewers.
In this research, we use these badges as measures of reviewers' reputation.
We hypothesize that top-ranked reviewers bestowed with real-name badges
increase source credibility, resulting in increased review helpfulness for
Hypothesis 2a: A top-ranked reviewer's review increases review helpfulness.
Hypothesis 2b: A real-name reviewer's review increases review helpfulness.
Central Cues of Online Consumer Reviews
Central cues are the arguments contained in a message and used in systematic
information processing in an objective manner [6]. Systematic information
processing emphasizes detailed processing of message content and the role of
message-based cognitions in deciding to accept a message's conclusion [6]. In
an online consumer review, content cues, such as the number of words (word
count) and the proportion of negative words in the review contents, are used
as central cues.
Word Count
The number of arguments in a message has been found to affect agreements
by giving recipients more to think about [5, 6, 24, 49]. In the context of online
consumer reviews, Mudambi and Schuff [39] found a direct relationship between the number of words in a review and helpfulness of the review. Their
research indicated that a review provides information to help in the decision-making process of a consumer and that the helpfulness of a review increases
as the word count increases. However, according to the law of diminishing
marginal utility (Gossen's first law), which states that the marginal utility of
each unit decreases as the supply of units increases, adding more words to a
review message results in increasing the review's helpfulness at a decreasing
rate [20] (see Appendix A). Therefore, we hypothesize that the logarithm of a
review's word count has a positive effect on the helpfulness of the review.
Hypothesis 3: As the number of words in a review message increases, a review's helpfulness increases, but the effect on helpfulness gradually decreases
as the number of words increases.
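The diminishing effect stated in Hypothesis 3 can be illustrated with the logarithmic transformation of word count. The sketch below assumes the natural logarithm (the base is not specified in the text) and uses hypothetical function names; it shows that each additional block of words adds less to the transformed value than the previous one.

```python
import math


def log_word_count(word_count: int) -> float:
    """Natural logarithm of the review's word count (base is an assumption)."""
    if word_count < 1:
        raise ValueError("word_count must be at least 1")
    return math.log(word_count)


def marginal_gain(word_count: int, extra_words: int = 100) -> float:
    """Increase in the log-transformed value from adding `extra_words` words."""
    return log_word_count(word_count + extra_words) - log_word_count(word_count)
```

Adding 100 words to a 100-word review raises the transformed value by ln 2, whereas adding 100 words to a 1,000-word review raises it by much less, which is the pattern the hypothesis describes.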
Proportion of Negative Words
Kanouse and Hanson [28] asserted that people tend to have negative bias,
wherein they put more emphasis on negative than on positive information.
People tend to believe that negative information is more credible [27], and
people recognize negative information to be a more important source and thus
to have a more persuasive effect [25]. This may be because people feel normative pressure to speak of only positive things; thus, people may be inclined
to believe those who express negative feelings [26]. Many studies related to
online reviews have indicated that a negative review is more influential than
a positive review [2, 10, 12, 43]. In this research, content analysis of a review
message is carried out to obtain data on the proportion of negative words. We
hypothesize that an online review's helpfulness may increase as the review's
message contains more negative content, and that a negative review may
deliver a more persuasive message to its readers.
Hypothesis 4: A review's helpfulness increases as the review message contains more negative words.
Product Types
Consumers may read online consumer reviews from different perspectives
depending on their purpose for reading online reviews during information
search or evaluation of alternatives. The purpose for reading online reviews
can be different depending on what products consumers intend to buy. The
current research looks into factors that decide a review's helpfulness for consumers depending on various kinds of products. Consumer products in the
current research are divided into two categories: (1) experience versus search
goods and (2) high-priced versus low-priced products. In each category the
research analyzes how central or peripheral cues used in processing review
information play a significant role in deciding the helpfulness of a review.
Experience Good Versus Search Good
In the consumer decision-making process, consumers show different behaviors,
depending on which product type they intend to buy as follows:
Mr. K wants to buy a baby book (experience good) that costs about $10
for his daughter and a USB (search good) that costs about $10. Mr. K
accesses Amazon.com to buy the USB, looks at the product descriptions
provided by sellers, determines three USBs as a consideration set, reads
online reviews for the three USBs meticulously, and finally buys a USB.
To buy the baby book, in contrast, Mr. K simply searches for online reviews, as it is not easy to determine a consideration set only with product
descriptions, determines three books as a consideration set, reads the
online reviews, and then buys a book.
An experience good is a product whose characteristics are difficult to observe
in advance, whereas a search good is a product whose characteristics are easily evaluated before purchase. According to Nelson [41], consumers depend
on different information sources in purchasing search and experience goods.
Peterson et al. [48] asserted that a decision to purchase an experience good
is based on subjective judgment, whereas a decision to purchase a search
good is decided based on outside information that may be objective. Thus, a
search good can be judged based on the sellers' information, but the sellers'
information is not sufficient to make a purchase decision for an experience
good. Therefore, for search goods, consumers use sellers' information in the
information search stage and use online consumer reviews to evaluate alternatives (systematic information processing). By contrast, consumers who are
inclined to buy experience goods mainly use consumer reviews in the information search stage (heuristic information processing) because the information
provided by sellers is not sufficient. Thus, central information processing for
the review of search goods and peripheral information processing for experience goods are more influential for review helpfulness.
Hypothesis 5: In the review of search goods, central cues are more influential in deciding a review's helpfulness than peripheral cues; for experience
goods, peripheral cues are more influential than central cues.
High-Priced Product Versus Low-Priced Product
In the consumer decision-making process, consumers show different behaviors,
depending on how much they intend to pay for the product as follows:
Mr. A wants to buy a television set that costs about $2,000 and had already determined three candidates (consideration set) through expert
reviews in blogs or discussion rooms before he searched for information
on TVs from Amazon.com. Mr. A looks for the three kinds of TVs to buy
via Amazon.com, verifies each product through online consumer
reviews related to the product to determine whether the information he
obtained is actually accurate, and finally buys a TV. Mr. B wants to buy
a telephone that costs about $30. Mr. B accesses Amazon.com, simply
explores the information about the telephone through online consumer
reviews, selects three kinds of telephones as candidates, verifies them
through related online consumer reviews, and finally determines which
kind of telephone to buy.
Petty and Cacioppo [50] suggested that people use either central or peripheral
cues depending on the importance of information processing. For instance,
people use central cues in processing information if they are highly concerned
with the information given. By contrast, people use peripheral cues when they
are less motivated and connected to the information. Generally, people are
more concerned with high-priced than low-priced goods. Thus, people read
reviews in detail for high-priced goods to enhance their purchase decisions.
This phenomenon can also be explained from the perspective of the consumer
decision-making process. Consumers are more likely to engage in complex
and extensive information search and evaluation when they want to buy
high-priced goods. In the extensive information search, people tend to exert
more effort in obtaining information not only from the online retail market but
also from other sources (magazine, newspaper, consumer report, brochure,
advertisement, online community, expert review, word of mouth, and so on).
For example, before buying a car, the consumer may ask for friends' opinions,
read reviews in consumer reports, consult several Web sites, and visit several
dealerships. Therefore, consumers who want to buy a high-priced good use
online consumer review mainly for evaluation of alternatives, rather than for
information search. But consumers are more likely to use very simple or limited
search and evaluation tactics when they want to buy low-priced goods [52].
Thus, consumers who want to buy low-priced goods use online consumer
reviews mainly in the information search stage.
As a result, we hypothesize that consumers looking for expensive goods
may focus on text messages that require systematic information processing, whereas consumers looking for low-priced goods may focus on review
rating or reviewer reputation, which only requires heuristic information
processing.
Hypothesis 6: A review's helpfulness is more affected by central cues for
high-priced product reviews and by peripheral cues for low-priced product
reviews.
Research Methodology
Data Collection
Amazon.com is one of the largest online retail markets and has extensive consumer review systems [8, 15]. Therefore, we collected actual online consumer
review data from Amazon.com. We conducted Web data mining in October
2010 with the aim of discovering useful information from Web hyperlinks,
page contents, and usage logs [35]. In the current research, a crawler, developed using Python 2.6, was used to download Web pages of consumer reviews,
reviewers, and product information from Amazon.com. Another Python-based system was developed to parse HTML Web pages into a database. To
collect random review data, we first selected 23 different kinds of products
by considering various product categories in Amazon.com and chose reviewers who had written the reviews for these products. Eventually, information
related to all the reviews written by these reviewers was collected.
Table 1. Data Collected from Amazon.com.
Data collected | Definition | Instrumentation of model variables
Reviewer ranking | If a reviewer is one of Amazon's top 10,000 reviewers or not | [...] top 10,000, 0 - out of top 10,000)
Real-name exposure | Whether or not reviewers exposed their real names | Numerical value (1 - real name exposed, 0 - real name not exposed)
Product review number | A number of cumulative existing reviews [...] product | [...]
Average rating | Average star rating on products | Numerical value (scale)
Price | Product price | Numerical value (scale)
Category | A category in which each product belongs to (one of the 28 categories from Amazon.com) | [...]
Review rating | A star rating value on a review | Numerical value (1, 2, 3, [...]
Word count | Number of words in a review message | Numerical value (scale)
Contents | Contents of a review message | Textual description
Subjective word % | Proportion of subjective words in a review | Numerical value (scale)
Positive word % | Proportion of positive words in a review message | Numerical value (scale)
Negative word % | Proportion of negative words in a review message | Numerical value (scale)
Total vote | Total number of answers to question asking if the review is helpful | Numerical value (scale)
Helpful vote | Number of positive answers to question asking if the review is helpful | Numerical value (scale)
Helpfulness | Proportion of positive answers to total answers to question asking if the review is helpful | Numerical value (scale)
A total of 75,226 online consumer reviews written by 4,613 reviewers on different kinds
of products were obtained. A review was excluded if the total answers to a
question asking whether the review was helpful were less than or equal to
five, because the helpfulness was not meaningful [29, 58]. After elimination,
an analysis of 15,059 reviews written by 1,796 reviewers was conducted. The
collected data, as shown in Table 1, contains information on reviewers (e.g.,
reviewer's ranking, authenticity of reviewer's name), information on reviews
(e.g., star rating, review message, helpful voting counts, total voting counts),
and information on products (e.g., category, number of reviews written, average rating, price). Based on the studies by Nelson [41, 42], among the 28
product categories from Amazon.com, the products that belonged to either
search good or experience good were categorized. The rest were excluded
from the data analysis for H5 because they were too vague to be categorized.1
The product categories were classified into Clothing and Accessories, Jewelry,
Shoes, Office Products and Supplies, Sports and Outdoors, and Health and
Personal Care (search goods); they were then further classified into Books,
Movies and TV, MP3 Downloads, Music, Musical Instruments, and Video
Games (experience goods).
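The sample selection and product classification described above can be sketched as follows. The vote threshold and the category lists come from the text; the function names, and returning None for categories that were left uncategorized, are assumptions of this illustration.

```python
from typing import Optional

# Category lists as given in the text.
SEARCH_GOOD_CATEGORIES = {
    "Clothing and Accessories", "Jewelry", "Shoes",
    "Office Products and Supplies", "Sports and Outdoors",
    "Health and Personal Care",
}
EXPERIENCE_GOOD_CATEGORIES = {
    "Books", "Movies and TV", "MP3 Downloads", "Music",
    "Musical Instruments", "Video Games",
}


def keep_review(total_votes: int) -> bool:
    """A review is excluded when it has five or fewer helpfulness votes."""
    return total_votes > 5


def product_type(category: str) -> Optional[str]:
    """Return 'search', 'experience', or None for uncategorized products."""
    if category in SEARCH_GOOD_CATEGORIES:
        return "search"
    if category in EXPERIENCE_GOOD_CATEGORIES:
        return "experience"
    return None  # excluded from the search-versus-experience analysis
```

Reviews of products whose category maps to None would remain in the full sample but be left out of the analysis for H5.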
Method for Analyzing Review Contents
This research attempts to perform sentiment analysis for mining review text,
also called opinion mining, which quantifies subjective opinions in consumer
feedback [32]. Dave et al. [14] provided semantic classification for positive
and negative reviews using natural language processing and various learning algorithms. Hu and Liu [22] suggested the review-summarizing method
based on opinion mining, which provides a method for summarizing features
of subjective opinions. Zhuang et al. [63] created a list of words that express
subjective opinions and attempted to quantify reviews into summary based
on learning algorithms and WordNet in mining for movie reviews. Recently,
Ghose and Ipeirotis [19] indicated that reviews with more subjective words
are recognized as more helpful reviews through opinion mining.
Given that opinion mining used in existing studies has been characterized
by a summary of a product's characteristics, we used a general process to
recognize features of a product and mine the texts related to such features [14,
63]. However, in our research the purpose of the analysis was not to find the
features of a product but to quantify the degree of subjective words (positive,
negative). Moreover, we detected emotional words using SentiWordNet [16,
17], which is a library that summarizes the degree of subjectiveness and
objectiveness of words and is based on WordNet. The degree is devised from
negative to positive, 1.0 being the maximum degree, enabling the comparison
of relativity between words. This research analyzes how many subjective
adjectives appear in review contents, based on SentiWordNet.2
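The word-proportion measures can be sketched with a small stand-in lexicon. The sketch below does not call SentiWordNet: the lexicon entries and their scores are invented for illustration, the tokenizer is a simple regular expression, and no part-of-speech filtering for adjectives is attempted. All names are hypothetical. A word is counted as positive or negative according to whichever of its two scores is larger, and the subjective share is the sum of the two, as in Table 3.

```python
import re

# Hypothetical toy lexicon: word -> (positive score, negative score), each in [0, 1].
# These values are made up for illustration; they are not SentiWordNet entries.
TOY_LEXICON = {
    "great": (0.75, 0.0),
    "good": (0.625, 0.0),
    "poor": (0.0, 0.625),
    "terrible": (0.0, 0.875),
}


def word_percentages(text: str) -> dict:
    """Percentage of positive, negative, and subjective words in a review."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {"positive": 0.0, "negative": 0.0, "subjective": 0.0}
    positive = negative = 0
    for token in tokens:
        pos_score, neg_score = TOY_LEXICON.get(token, (0.0, 0.0))
        if pos_score > neg_score:
            positive += 1
        elif neg_score > pos_score:
            negative += 1
    total = len(tokens)
    return {
        "positive": 100.0 * positive / total,
        "negative": 100.0 * negative / total,
        "subjective": 100.0 * (positive + negative) / total,
    }
```

For a ten-word review containing one positive and two negative lexicon words, this yields 10 percent positive, 20 percent negative, and 30 percent subjective words.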
Table 2 shows the descriptive statistics of the full data used in this research.
The average review rating is generally high, with a mean value at 3.83, which
is in accordance with those obtained in previous studies [12, 23].
Zhu and Zhang [62] stated that reviews with six to eight points on a ten-point scale have the longest message in terms of the average length of the
reviews. We also found that a review with four out of five points has the
highest percentage of negative words and contains the most information
on the pros and cons of products, with an average word count of 310.6. As
shown in Table 3, the percentage of positive words has a tendency to increase
as the rating goes up, whereas the percentage of negative words is widely
distributed regardless of its rating. Table 3 also shows that the percentage of
positive words is higher than the percentage of negative words in all rating
categories. As mentioned previously, most people tend to write only positive
things under normative pressure [26].
A total of 60.1 percent of the 1,796 reviewers were given a "real-name badge"
because they exposed their real names, and 56 percent of the 15,059 reviews
were written by reviewers with the badge. In addition, 11.2 percent of the
1,796 reviewers were ranked in the top 10,000 based on the number of reviews,
helpful votes, and recent activities, among other criteria. These reviewers
Table 2. Descriptive Statistics for Reviews.
Variable                    N        Minimum   Maximum   Mean      Std. deviation
Review rating               15,059   1         5         3.83      1.376
Review word count           15,059   1         4,831     273.86    256.195
Review total votes          15,059   6         980       22.53     41.366
Review helpful votes        15,059   0         965       17.96     37.606
Review helpfulness          15,059   0         100       76.98     25.634
Review subjective word %    15,059   0.00      100.00    11.12     7.644
Review positive word %      15,059   0.00      29.41     6.27      2.505
Review negative word %      15,059   0.00      100.00    4.84      7.724
Table 3. Average Length and Word Content of Reviews.
Rating              1         2         3         4         5         p-value*
Word count          190.67    264.31    283.37    310.60    275.80    0.000
Subjective word %   10.32%    10.87%    10.82%    11.24%    11.36%    0.000
                    (100%)    (100%)    (100%)    (100%)    (100%)
Positive word %     5.55%     6.07%     6.13%     6.29%     6.50%     0.000
                    (53.8%)   (55.8%)   (56.7%)   (56.0%)   (57.2%)
Negative word %     4.77%     4.80%     4.69%     4.95%     4.86%     0.807
                    (46.2%)   (44.2%)   (43.3%)   (44.0%)   (42.8%)
* Using one-way analysis of variance.
were given the "top 10,000 reviewer badge," and a total of
the reviews were written by reviewers with this badge. W
reviewer's ranking as an independent variable because consu
to check whether a reviewer is in the top 10,000 reviewers
"top 10,000 reviewer badge" on a review page.
The average price of products in the collected data was $
16.31 percent of the reviews were of products that cost more t
comparison of the descriptive statistics of subsamples betw
and an experience good, we observed that an experience g
review message in terms of word count, a smaller number of h
and lower helpfulness. The results correspond with those
Schuff [39], except that there was no significant difference
tion, an experience good had lower use of subjective word
comparison of descriptive statistics of subsamples for a se
experience good.
Table 4. Descriptive Statistics and Comparison of Means of Subsamples.
Search Experience
(n = 672) (n = 9,570)
Variable M (SD) M (SD) p-value
3.8) 3.80
(1.489) (1.354)
(213.867) (247.331)
Helpfulness votes 18.24 15.26 0.045
(37.876) (28.466)
Helpfulness % 82.00 73.34 0.000
(25.252) (25.858)
Subjective word % 12.19 10.67 0.000
(10.211) (5.667)
(2.898) (2.311)
Negative word % 5.62 4.42 0.004
(10.584) (5.527)
Ranking 10,000 0.54 0.68 0.000
(0.499)" (0.468 )b
Notes: * Using the t-test.
We performed hierarchical regression using PASW 18.0 to test the hypotheses.
In hierarchical regression, the researcher enters predictor variables in
successive blocks, in an order chosen according to the research purpose, so
that the additional variance explained by each block can be assessed.
In Model 1, we considered control variables to be the number of total votes,
product type, product price, and the number of product reviews. In Model 2,
review rating factors, including rating2 (i.e., quadratic term) and rating incon-
sistency were entered as independent variables. The rating2 was included as
an independent variable based on Appendix A. In addition, Mudambi and
Schuff [39] found that there is a nonlinear relationship between rating and
helpfulness. They included star ratings, both as a linear term and as a quadratic term, as independent variables in their study. In Model 3, the reviewer's
reputation variables, including real-name factor and ranking under 10,000,
were added as independent variables. We found that the real-name factor does
not affect review helpfulness. Moreover, in Model 4, the logarithm of word
count was considered as an independent variable to support the hypothesis,
which states that while a review's helpfulness increases as the word count
in a review's message increases, the actual effect on helpfulness decreases
gradually. This result indicates that helpfulness no longer increases when the
word count reaches around 1,000 to 1,500 words, as shown in Appendix A.
In Model 5, the proportion of negative words in a review message was added
as an independent variable.
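The block-wise entry described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic, the variable names and the helper functions `r_squared` and `hierarchical_r2` are hypothetical, only three of the blocks are mimicked, and plain R2 is reported rather than the adjusted R2 shown in Table 5. The study itself used PASW 18.0, not this code.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X (intercept added here)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def hierarchical_r2(blocks, y):
    """Enter predictor blocks cumulatively; return (name, R^2, R^2 change) per step."""
    results, previous_r2, entered = [], 0.0, []
    for name, block in blocks:
        entered.append(block)
        r2 = r_squared(np.column_stack(entered), y)
        results.append((name, r2, r2 - previous_r2))
        previous_r2 = r2
    return results

# Synthetic data for illustration only; these are NOT the Amazon.com reviews.
rng = np.random.default_rng(0)
n = 500
total_votes = rng.poisson(20, n).astype(float)      # control variable
rating_inconsistency = rng.uniform(0, 4, n)         # peripheral cue
log_word_count = np.log(rng.integers(10, 2000, n))  # central cue
helpfulness = (80 - 5 * rating_inconsistency + 3 * log_word_count
               + rng.normal(0, 10, n))

blocks = [
    ("Step 1: control", total_votes),
    ("Step 2: + rating inconsistency", rating_inconsistency),
    ("Step 3: + log(word count)", log_word_count),
]
steps = hierarchical_r2(blocks, helpfulness)
for name, r2, change in steps:
    print(f"{name}: R2 = {r2:.3f}, R2 change = {change:.3f}")
```

Because each step nests the previous one, R2 can only stay the same or rise; the R2 change column is what indicates whether a newly entered block adds explanatory power.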
As a result of the analysis, H1, H2a, H3, and H4 are supported. Thus, a
review's helpfulness is decided by how congruent a review rating is with the
average rating for a certain product; whether the review was written by a
high-ranked reviewer; the length of the review message; and the number of
negative words included in the review message. Moreover, although Forman et
al. [18] asserted that review helpfulness increases if identity-descriptive review-
ers write reviews, our study found that reviewers who had their real names
exposed do not have much effect on a review's helpfulness. This finding shows
that a reviewer's ranking information may be considered an important aspect
in deciding perceived helpfulness, but the reviewer's real-name exposure is
not recognized as a significant source of information. The outputs from these
hierarchical regression models are included in Tables 5 and 6.
To find how information processing occurs depending on the purpose
for reading reviews, we analyzed the moderating effects of product types.
We analyzed the moderating effect of various product types (experience vs.
search goods, and high-priced vs. low-priced products) on the relationships
between online review factors (independent variables in the research model:
peripheral cues vs. central cues) and the helpfulness of a review (dependent
variable in the research model). As stated earlier, search goods and experience
goods are classified according to the classification suggested by Nelson [41,
42]. Based on the average value of the total samples, the goods were classified
into goods above and below $100. Using formula (1) suggested by Chin [13],
we verified the existence of moderating effects depending on each type of
product. Table 7 contains the results after testing H5 and H6.
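$$
t = \frac{p_1 - p_2}{\sqrt{\dfrac{(n_1 - 1)^2}{n_1 + n_2 - 2}\, SE_1^2 + \dfrac{(n_2 - 1)^2}{n_1 + n_2 - 2}\, SE_2^2}\;\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \qquad (1)
$$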
where p, is the coefficient of pa
the standard error of path i. As
Consumers decide review helpfu
goods and high-priced products.
fulness with peripheral cues wh
purchase situations, and how du
Although word count significa
good reviews, rating inconsisten
Amazon.com are more influential
ness of word count on review he
experience goods, which parallels t
found that rating is also one of th
which is applied differently fo
that review helpfulness increase
low or high) for search goods, a
is moderate) for experience good
unlike the findings of Mudambi
of rating inconsistency on revie
Table 5. Output of Hierarchical Regression Model.
Variable                        Model 1    Model 2    Model 3    Model 4    Model 5
Total votes                     0.054**    0.084**    0.073**    0.053**    0.044**
Product type: search            -0.018*    -0.011     -0.009     -0.006     -0.006
Product type: experience good   -0.218**   -0.225**   -0.237**   -0.250**   -0.246**
Product price                   -0.089**   -0.109**   -0.091**   -0.105**   -0.107**
Product review number           -0.105**   -0.079**   -0.071**   -0.074**   -0.075**
Rating2                                    0.223**    0.217**    0.219**    0.220**
Rating inconsistency                       -0.332**   -0.305**   -0.290**   -0.286**
10,000 ranking                                        0.157**    0.114**    0.100**
Log (word count)                                                 0.163**    0.206**
Negative word %                                                             0.092**
Adjusted R2                     0.053      0.297      0.320      0.344      0.350
R2 change                       0.053**    0.244**    0.023**    0.024**    0.007**
N                               14,169     14,169     14,169     14,169     14,169
than for search goods. In summary, cons
use online consumer reviews mainly for i
not obtain sufficient information from t
Nakayama et al. [40], the Web transforms
Hence, the reviews of experience goods in
information that sellers give, thereby maki
similar to those for search goods. As a re
be utilized as substitute information fro
of consumers considering experience goo
for search goods may not encounter any
decisions because of sellers' information
consumer reviews to evaluate alternative
the sellers' information.
The rating inconsistency of an online consumer review has a stronger effect
on a review's helpfulness for products priced under $100, whereas word count
and negative word percentage have a more significant influence on a review's
helpfulness for products priced over $100. Consumers looking for high-priced
goods try to collect as much information as possible and evaluate each product
alternative [55]. In this case, they are actively involved in gathering information on the product from other external information sources. Thus, consumers
tend to utilize reviews mainly to evaluate alternatives that were already chosen
from other information sources for high-priced products. Consumers looking
for low-priced goods, in contrast, tend to minimize the time and energy they
Table 7. Analysis Results of Moderating Effects.
                         Search vs. experience good       Product price
Path                     t-value    Moderating effect     t-value    Moderating effect
Rating inconsistency →   -2.94**    Supported             -6.41**    Supported
10,000 ranking →         3.42**     Supported             1.78       Not supported
Log (word count) →       -2.84**    Supported             -7.31**    Supported
Negative word →          -1.35      Not supported         -4.07**    Supported
** Significant at 0.01.
Table 8. Findings on Moderating Effects.
                    Peripheral cues         Central cues
                    Review     Reviewer     Word      Negative
Product type        rating     ranking      count     word        HSM
Search good                                 High                  Systematic
Experience good     High       High                               Heuristic
Good below $100     High                                          Heuristic
Good above $100                             High      High        Systematic
* High: relatively high coefficient in review's helpfulness.
spend on purchasing decisions [55]. Thus, the
mainly for information search because informat
requires minimum time and effort. In conclusion
priced good tend to use central cues of onlin
looking for low-priced good tend to use perip
result supports Petty and Cacioppo's claim [5
in processing information if they are highly
given, whereas people use peripheral cues wh
connected to the information. The same clai
consumer reviews.
Discussion and Conclusion
In this research we attempted to find which review factors affect review cred-
ibility by analyzing online consumer review data from Amazon.com. Our
findings bring important extensions to previous research [19, 38] on the relationships between online review factors and helpfulness of an online review.
First, the rate of subjective and positive words becomes higher as the review
rating increases, whereas the rate of negative words has nothing to do with
rating. Thus, we found that review star rating and review message are not
always congruent. In particular, with the star rating of five being the highest,
four-star-rated reviews contain the longest message, with an average word
count of 310.6, and the highest rate of negative word content. Four-star-rated
reviews provide ample information, with both positive and negative features
of the reviewed products.
Second, our results indicate that people find reviews most helpful when the
review rating is consistent with the product's average rating; when
reviews are written by high-ranked reviewers and are lengthy; and
when there is frequent use of negative words. Existing studies have supported
the hypothesis that a review's helpfulness differs depending on the review
rating extremity. Our research focuses on the effect of rating inconsistency
on the helpfulness of a review. Explanatory power (R2 value) is significantly
increased by 5 percent (Appendix B) when considering rating inconsistency,
instead of rating extremity, as an independent variable. This research also found
a reviewer's real-name exposure does not have a significant influence on the
review's helpfulness. This result suggests that reviewers who are ranked in
the top 10,000 on Amazon.com are more credible to readers, but mere real-name exposure alone does not enhance credibility for readers. In addition, as
a review message lengthens, review helpfulness increases. However, helpfulness does not increase further if the word count reaches about 1,000 to 1,500
words (Appendix A). Previous studies have asserted that review helpfulness
and review length are related. However, we found that the logarithm of word
count and review helpfulness are positively related. Compared with
former research models, explanatory power (R2 value) also increases
(Appendix B). Cheung and Lee [10] claimed in their hypothesis that negatively
framed eWOM is more credible than positively framed eWOM. However, the
result of this analysis indicates that the hypothesis is not supported. Instead,
our study shows that the proportion of negative words in a review is positively associated
with the review's helpfulness. According to Kanouse and Hanson's negative
bias [28], a negative review acts as a more powerful message than a positive
review, and thus exerts higher persuasive power. Therefore, people tend to
recognize reviews with a significant proportion of negative words to be more
helpful reviews.
Third, we found that the effectiveness of the influencing factor on a review's
helpfulness differs depending on the purpose for which the reader uses the
review. Peripheral cues in processing reviews play a significant role in review
helpfulness if online reviews are used for information search. Experience goods
and low-priced products belong to this category. By contrast, central cues in
the reviews play a key role in review helpfulness if online reviews are used for
evaluating alternatives that have already been identified in the stage of information
search. Search goods and high-priced products belong to this category. Thus,
we can conclude that information processing for reviews occurs in two ways,
depending on the purpose for reading the reviews. Peripheral information
Table 9. Summary of Findings.
Description Result
H 1 Peripheral The higher the difference betwe
cues average rating, the lower the review
H2 A top-ranked reviewer's review incre
A real-name reviewer's review increases
H 3 Central As the number of words in a rev
cues helpfulness increases, but the effect o
decreases as the number of words increase
H4 A review's helpfulness increases as th
more negative words.
H5 Moderating In the review of search goods,
effect in deciding a review's helpfulne
experience goods,
central cues.
peripheral cues are more
H6 A review's helpfulness is more affecte
priced product reviews and by periphe
product reviews.
processing narrows down possible cho
search [56]. Central information process
is making a final choice in the stage of
the summary of findings for all the hy
The contributions of our research a
contributes to the methodological
reviewers, and products from online
gathered and analyzed. Quantified in
secured by analyzing sentiment analys
to analyze factors that affect review h
this process.
Second, this research also makes a contribution to theory. Based on dual
process theories, this research revealed that information processing occurs
in either heuristic (peripheral route) processing or systematic (central route)
processing when consumers read online reviews. To validate this, we needed
to link reading reviews and dual process theories. The purpose for reading
reviews depends on the stage of the consumer decision-making process. The
decision-making process is linked with dual process theories by using the least-effort and sufficiency principles. On the basis of these theoretical backgrounds,
our study focused not only on finding the factors that affect review helpfulness
but also the moderating effect of product type on the relationship between
review factors and review helpfulness. As a result, our research extends previous approaches by considering two ways of looking at online reviews.
Finally, the current research contributes to practice. Our research may
eventually help online market owners recognize what factors constitute
helpful online reviews for online markets. Using the findings of the research,
online market designers can devise ways for their Web sites to expose helpful
reviews more easily and lessen information overload for consumers. Online
market owners may also indirectly lead consumers to write more helpful
online reviews, which may become valuable assets to their success. Furthermore, the results of the current research can be used to design Web sites by
considering certain review factors that affect review helpfulness, depending
on which products consumers intend to buy. For example, for a high-priced
product and a search good, consumers could be encouraged to provide more
detailed review messages. For a low-priced product and an experience good,
review rating and reviewer credibility are more important. For these goods,
highlighting these review factors is more helpful for consumers. This research
also shows that negatively framed reviews do not harm the success of online
retail markets. To maximize the effect of word of mouth, some online retail
markets encourage consumers to write a positively framed review by offering
incentives, such as discounts. However, when consumers read an online con-
sumer review, they focus on negative words in a review message. Therefore,
it is more helpful to encourage consumers to write straightforward reviews
for the success of online retail markets.
This research has several limitations. First, because the sample data were
collected from online consumer reviews from Amazon.com, whether the
findings of our research can be used to generalize online reviews from other
online markets is not yet confirmed. To overcome this limitation, we need to
analyze actual review data from several online retail markets. Moreover, as
Mudambi and Schuff [39] pointed out, because review helpfulness is measured only by those who vote on whether the review is helpful or not, there
is uncertainty on whether the findings can be generalized for those who do
not vote on review helpfulness. To overcome this limitation, we need to use
other research methods, such as a survey or an experiment, to answer the
research questions.
There are several directions in which this research could be extended.
Instead of executing content analysis to measure the quality of review content,
future research could measure perceived quality of review in experimental
settings. Longitudinal data may also be collected for online consumer reviews
to determine the changes in the degree of effectiveness of review helpfulness,
depending on time. Furthermore, research on the relationship between review
helpfulness and product sales, depending on the products, may be also valu-
able in further research.
1. The category classification of Amazon.com is not specific and is rather broad.
Therefore, there are many products that are not classified into either search goods or
experience goods.
2. When review content mining is conducted, negatives are dealt with as follows.
In cases where the negatives are in a form of "be not (are not, is not, am not, were
not, was not, not be, etc.)" that negates adjectives within the three words before the
adjective, we changed the positive adjective to a negative adjective and the negative
adjective to a positive adjective. For example, in review contents with "The Gillette
Fusion is not worth the price," "worth" is considered as a negative adjective, though
it is viewed as a positive adjective according to SentiWordNet.
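The negation rule in note 2 can be illustrated with a short Python sketch. Several things here are assumptions made for illustration and are not taken from the study: the toy lexicon `TOY_POLARITY` stands in for SentiWordNet, the function names are hypothetical, the rule is read as "a be-form followed by 'not', or 'not be', found among the three words before the adjective," and word percentages are taken over all words in the review.

```python
import re

# Hypothetical toy lexicon; the study used SentiWordNet scores instead.
TOY_POLARITY = {"worth": "positive", "good": "positive",
                "bad": "negative", "poor": "negative"}
BE_FORMS = {"be", "is", "are", "am", "was", "were"}
FLIP = {"positive": "negative", "negative": "positive"}

def has_be_negation(tokens):
    """True if the tokens contain '<be-form> not' or 'not be' as adjacent words."""
    pairs = zip(tokens, tokens[1:])
    return any((a in BE_FORMS and b == "not") or (a == "not" and b == "be")
               for a, b in pairs)

def tag_adjectives(text, window=3):
    """Return (word, polarity) pairs, flipping polarity after a be-negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    tagged = []
    for i, token in enumerate(tokens):
        polarity = TOY_POLARITY.get(token)
        if polarity is None:
            continue
        # Look only at the `window` words immediately before the adjective.
        if has_be_negation(tokens[max(0, i - window):i]):
            polarity = FLIP[polarity]
        tagged.append((token, polarity))
    return tagged

def polarity_percentages(text):
    """Share of positive and negative words among all words, in percent."""
    total = len(re.findall(r"[a-z']+", text.lower()))
    if total == 0:
        return {"positive": 0.0, "negative": 0.0}
    tagged = tag_adjectives(text)
    return {label: 100.0 * sum(1 for _, p in tagged if p == label) / total
            for label in ("positive", "negative")}

example = "The Gillette Fusion is not worth the price"
print(tag_adjectives(example))        # [('worth', 'negative')]
print(polarity_percentages(example))  # {'positive': 0.0, 'negative': 12.5}
```

With the note's own example, "worth" is flipped from positive to negative because "is not" falls within the three preceding words.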
Appendix A: Relationship Among Rating Inconsistency,
Word Count, and Helpfulness
Based on Mudambi and Schuff [39], we expect a nonlinear relationship between rating inconsistency and helpfulness. Based on Gossen's marginal utility
theory [20], we also expect a nonlinear relationship between word count and
helpfulness. The three-dimensional graph in Figure A1, which was analyzed
using Minitab 16, shows how rating inconsistency and word count affect the
helpfulness of a review.
As shown in Figure A2, helpfulness increases as rating inconsistency
decreases. In particular, if the rating of a review is higher than a product's aver-
age rating, helpfulness tends to drop slowly with an increase of rating inconsistency. If the rating is lower than the average rating, however, helpfulness
Figure A1. Relationship Among Rating Inconsistency, Word Count, and Helpfulness
Figure A2. Relationship Between Rating Inconsistency and Helpfulness
rapidly declines as rating inconsistency increases. There is a nonlinear relationship between rating inconsistency and helpfulness.
Figure A3 shows that as the number of words of a review increases, helpfulness of the review also increases. However, when the number of words of
a review reaches about 1,000 to 1,500 words, helpfulness no longer increases.
There is a logarithmic relationship between word count and helpfulness.
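A small Python sketch can make the diminishing effect concrete. It assumes only that helpfulness is modeled as linear in the natural logarithm of word count; the function name `log_increment` and the example review lengths are hypothetical, and no coefficient from the study is used.

```python
import math

def log_increment(words_before, words_after):
    """Change in ln(word count) when a review grows from one length to another."""
    return math.log(words_after) - math.log(words_before)

# Adding the same 50 words moves the log-transformed predictor less and less
# as the review gets longer, so the modeled gain in helpfulness shrinks too.
for before in (50, 500, 1000, 1500):
    after = before + 50
    print(f"{before} -> {after} words: change in ln(word count) = "
          f"{log_increment(before, after):.4f}")
```

Under this specification, doubling a review from 100 to 200 words shifts the predictor by the same amount as doubling it from 1,000 to 2,000 words, which is one way to read the flattening of the curve at higher word counts.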
Appendix B: Increase in R2 from Previous Research to
This Research
Table B1 shows the significant increase in R2 value when considering rating
inconsistency instead of rating extremity as an independent variable in the
research model.
Table B2 shows the significant increase in R2 value when considering logarithm of word count instead of word count as an independent variable in the
research model.
Figure A3. Relationship Between Word Count and Helpfulness
Table B1. Comparison of Two Regression Analyses with Rating and
Rating Inconsistency as an Independent Variable*
                                Rating                     Rating inconsistency
                                Standardized               Standardized
Variable                        coefficient    t-value     coefficient    t-value
Rating2                         -0.063         -1.535      0.220          26.249
Rating inconsistency                                       -0.287         -33.821
10,000 ranking                  0.118          15.525      0.100          13.549
Log (word count)                0.216          25.803      0.206          25.616
Negative word %                 0.096          12.115      0.092          12.122
Product type: search            -0.007         -0.967      -0.006         -0.839
Product type: experience good   -0.236         -00.135     -0.245         -32.404
Product price                   -0.104         -13.861     -0.107         -14.727
Product review number           -0.082         -11.661     -0.075         -10.940
Table B2. Comparison of Two Regression Analyses with Word Count
and Log(Word Count) as an Independent Variable.
                                Word count                 Log (word count)
                                Standardized               Standardized
Variable                        coefficient    t-value     coefficient    t-value
Rating2                         0.217          25.490      0.220          26.249
Rating inconsistency            -0.299         -34.603     -0.287         -33.821
10,000 ranking                  0.139          19.019      0.100          13.549
Word count                      0.086          11.819
Log (word count)                                           0.206          25.616
Negative word %                 0.013          1.921       0.092          12.122
Product type: search            -0.007         -1.033      -0.006         -0.839
Product type: experience good   -0.239         -30.978     -0.245         -32.404
Product price                   -0.099         -13.384     -0.107         -14.727
Product review number           -0.073         -10.567     -0.075         -10.940
