Content analysis

Web Content Evaluation
An application of evaluative methods
Today’s topics
• Feature analysis
• Link analysis
• Image analysis
S519
Web Content Analysis
• WebCA is a methodologically plural paradigm
• But overall coherent, in that methods are informed by
the general principles of CA
– must enable “objective, systematic, and quantitative
description of the content of [web] communication”
(Baran, 2002, p. 410)
– follows a general coding and counting procedure, except when
identification of phenomena is automated
Herring (2010)
Genre
• Definitions
– “typified rhetorical action based in recurrent
situations” (Miller, 1984)
– “a distinctive type of communicative action,
characterized by a socially recognized
communicative purpose and common aspects of
form” (Yates & Orlikowski, 1994)
– A style or type, in this case relating specifically to
Web format and content
Examples
• Campaign websites: Ron Paul, Mitt Romney, Michele Bachmann, etc.
• Shopping websites: Target, Wal-Mart, Amazon, etc.
• iSchool websites: IU-SLIS, UNC-SILS, UIUC-GSLIS
• Any other examples?
Web content analysis/evaluation is usually conducted on a group of websites of the same or a similar genre.
Feature Analysis
• Analysis of the presence/absence, the range,
or the frequency of objective website features
• Types of web features
– Content: Text, figures, images, video clips, ads,
badges, etc.
– Navigation: Facilities for accessing information and
moving around the site
– Presentation: The way in which content and
navigation are presented to the user, e.g., layout
patterns
(Calero, Ruiz, & Piattini, 2004)
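A minimal sketch of presence/absence coding, assuming each site is coded True/False per feature. The site names, feature names, and values are hypothetical placeholders, and treating "exceptions" as the sites in the minority for a feature is an assumption about how such a table might be read, not a documented rule.

```python
from typing import Dict, List, Tuple

# Hypothetical coding sheet: site -> feature -> present (True) / absent (False).
coding: Dict[str, Dict[str, bool]] = {
    "site_a": {"search": True, "blog": True, "store": False},
    "site_b": {"search": False, "blog": True, "store": False},
    "site_c": {"search": False, "blog": True, "store": True},
    "site_d": {"search": False, "blog": False, "store": False},
}


def summarize(sheet: Dict[str, Dict[str, bool]]) -> Dict[str, Tuple[int, float, List[str]]]:
    """Return feature -> (frequency, percentage, exceptions).

    Exceptions (an assumption of this sketch) are the sites in the minority:
    those lacking a feature most sites have, or having one most sites lack.
    """
    sites = sorted(sheet)
    features = sorted({name for row in sheet.values() for name in row})
    summary = {}
    for feature in features:
        have = [s for s in sites if sheet[s].get(feature, False)]
        lack = [s for s in sites if s not in have]
        frequency = len(have)
        percentage = 100.0 * frequency / len(sites)
        exceptions = lack if frequency > len(lack) else have
        summary[feature] = (frequency, percentage, exceptions)
    return summary


for feature, (frequency, percentage, exceptions) in summarize(coding).items():
    print(f"{feature}: {frequency} ({percentage:.0f}%), exceptions: {exceptions}")
```

The same sheet can be extended with counts (for example, number of images) when the range or frequency of a feature matters more than its presence.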
Examples
• Web Genres
– background images
– email links
– forms
– images
– links inward
– links outward
– lists
– top-level domain name
• Personal Homepages
– awards
– colors other than white and gray
– copyright notice
– guestbook
– hit counter
– last update date
– link structure
– ‘new’ icon
– personal info (various types)
– ‘under construction’ sign
– video
– welcome message
• Blogs
– ads
– archives
– badges
– blogger information
– blogroll
– calendar
– comments
– entries
– images
– length of entries
– permalink
– temporal measures
– time stamp
From S. Herring S642
Which features to analyze
• From prescriptive guidelines
• From previous studies wholesale or modified
• Grounded theory approach (Glaser & Strauss,
1968)
– allow relevant features to emerge from the data
AN EXAMPLE: CAMPAIGN WEBSITES
Selection of websites
• The candidates’ websites of the 2012 presidential
election
• One Democratic candidate - Barack Obama; and nine
Republican candidates
• To select the most “popular” Republican candidates
– Google Trends (GT)
– Michele Bachmann as the baseline (a GT score of one)
– The time period: past 30 days (query time: September 23)
– The top four candidates based on GT scores are Ron Paul (1.96), Rick Perry (1.78), Michele Bachmann (1.00), and Mitt Romney (0.42).
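The baseline scoring can be sketched as dividing each candidate's search interest by the baseline candidate's. The raw interest values below are hypothetical, chosen only so that the ratios reproduce the scores listed above, and "Candidate X" is a made-up entry to show the cut-off. The function names are illustrative.

```python
from typing import Dict, List

# Hypothetical raw search-interest values; only their ratios matter.
raw_interest: Dict[str, float] = {
    "Ron Paul": 49.0,
    "Rick Perry": 44.5,
    "Michele Bachmann": 25.0,
    "Mitt Romney": 10.5,
    "Candidate X": 5.0,  # made-up extra candidate
}


def relative_scores(raw: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Express every value relative to the baseline (baseline score = 1)."""
    base = raw[baseline]
    if base == 0:
        raise ValueError("baseline interest must be non-zero")
    return {name: value / base for name, value in raw.items()}


def top_candidates(scores: Dict[str, float], n: int) -> List[str]:
    """Return the n names with the highest relative scores."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


scores = relative_scores(raw_interest, "Michele Bachmann")
for name in top_candidates(scores, 4):
    print(f"{name}: {scores[name]:.2f}")
```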
Purpose and audience
• To facilitate the candidates’ electoral campaigns
– Publicize political views
– Establish connections with voters
– Request donations and volunteers
• Potential U.S. voters in the 2012 election
– covers the majority of adult U.S. citizens
– non-U.S. citizens (especially those who are interested in politics and American culture)
– Michele Bachmann’s campaign website (http://www.michelebachmann.com/): 22% of the website visitors came from outside the U.S.
Genre
• (Presidential) electoral campaign websites.
– The chosen websites share similar forms and topics, and serve the same group of people with the same purposes
• Related to other campaign websites, both
political and non-political
• Related to politicians’ websites, such as
mayors’ websites and governors’ websites
Features
• Linking structure
• Involvement
• Multimedia
• Functions
• Social Media
Features – linking structure
Features | Mean | Range | Mode
No. of top level navigation links | 7 | 5(RP,BO)~10(RP) | 5
No. of web pages | 1,770 | 233(MB)~3,140(BO) | -
No. of links in the past week (as of September 25, 2011) | 60 | 22(MB)~138(BO) | -
Features - involvement
Features | Frequency | Percentage | Exception
Have “Donate” in the homepage | 5 | 100% | -
Have “Volunteer” in the homepage | 4 | 80% | Mitt Romney
Store | 4 | 80% | Rick Perry
Get Email Updates in the homepage | 3 | 60% | Mitt Romney; Barack Obama
Sign in | 2 | 40% | Barack Obama; Mitt Romney
Join groups | 1 | 20% | Barack Obama
Features - multimedia
Features | Frequency | Percentage | Exception
Background color used | 3 | 60% | Barack Obama; Ron Paul
Picture as background image | 1 | 20% | Michele Bachmann
Leave comments | 1 | 20% | Barack Obama

Features | Mean | Range | Mode
No. of images in the homepage | 7 | 3(BO)~12(RP) | -
No. of images in the whole website (larger than 300*400) | 350 | 19(MB)~1,350(BO) | -
No. of videos in the homepage | 2 | 1(RP)~3(BO) | 1
Feature - functions
Features | Frequency | Percentage | Exception
Search function | 1 | 20% | Mitt Romney
Calendar | 2 | 40% | Barack Obama; Ron Paul
Blog | 3 | 60% | Rick Perry; Michele Bachmann
Other languages | 1 | 20% | Barack Obama
“In Your State” function | 3 | 60% | Rick Perry; Mitt Romney
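Feature – social media
(Column labels are not preserved; the seven columns are numbered 1 to 7.)
Candidate | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Sum
Barack Obama | Y | Y | Y | - | Y | - | Y | 5
Michele Bachmann | Y | Y | Y | Y | Y | Y | Y | 7
Ron Paul | Y | Y | Y | - | Y | Y | Y | 6
Rick Perry | Y | Y | Y | - | Y | - | Y | 5
Mitt Romney | Y | Y | Y | Y | Y | Y | Y | 7
Sum | 5 | 5 | 5 | 2 | 5 | 3 | 5 |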
Exercise
• Find features that you think are important
and/or relevant for
– iSchool websites
– Shopping websites
Link Analysis
• The web is hypertextual
– Text links to other text via hyperlinks
• Four basic types of links:
– outbound, inbound, internal, reciprocal
Four basic linking patterns
• Internal (e.g., navigation links within a site)
• Outbound (“outlinks”) (e.g., the SLIS site links to the JASIST site)
• Inbound (“inlinks”) (e.g., the JASIST site links to the SLIS site)
• Reciprocal (e.g., the SLIS and JASIST sites link to each other)
From S. Herring S642
Calculating links
• Inbound links
– Estimate through Yahoo!
• Outbound links
– Estimate through web page source
• Number of web pages
– Estimate through Google
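Estimating outbound links from the page source can be sketched with the Python standard library alone. This is a rough illustration, not the procedure used in the example that follows: it assumes a link is internal only when its host exactly matches the page's host (so subdomains count as outbound), and the HTML snippet and example.org URLs are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def classify_links(html, page_url):
    """Split a page's links into (internal, outbound) lists of absolute URLs."""
    collector = LinkCollector()
    collector.feed(html)
    site_host = urlparse(page_url).netloc.lower()
    internal, outbound = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if parsed.netloc.lower() == site_host:
            internal.append(absolute)
        else:
            outbound.append(absolute)
    return internal, outbound


# Made-up page source for illustration.
sample_html = """
<a href="/issues">Issues</a>
<a href="http://www.example.org/donate">Donate</a>
<a href="http://video.example.com/watch?v=1">Video</a>
<a href="mailto:info@example.org">Email</a>
"""
internal, outbound = classify_links(sample_html, "http://www.example.org/")
print(len(internal), "internal;", len(outbound), "outbound")
```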
AN EXAMPLE: CAMPAIGN WEBSITES
Internal vs. outbound
Candidate | Internal | Outbound
Michele Bachmann | 17 | 20
Barack Obama | 28 | 16
Ron Paul | 52 | 11
Mitt Romney | 30 | 8
• Coherent entities: they neither contain “communities” inside nor sit beneath a higher-level territory/domain.
• Limited number of outbound links
• Not link-intensive; may favor internal links over outbound links
Destination
• Michele Bachmann: 6 YouTube, 4 FB, 8 Flickr, 2 Twitter
• Barack Obama: 1 FB, 5 Twitter, 4 YouTube, 1 Flickr, 4 Whitehouse.org, 1 OnGuardOnline.gov
• Ron Paul: 5 YouTube, 3 FB, 3 Twitter
• Mitt Romney: 1 obamaisntworking.com, 2 YouTube, 1 FB, 1 Twitter, 1 Flickr, 2 http://www.thevillagesdailysun.com
Code book
• Outbound link types
– Social media (0); News sites (1); Government sites
(2); Academic sites (3); Blog sites (4); Wiki sites
(5); Personal sites (6); Forums (7); Web directory
(8); NGO sites (9); Other (10)
• Outbound link affiliation types
– Positive (0); Negative (1); Neutral (2); Mixed (3)
• Destination size
– Small sites (0); Medium sites (1); Large sites (2)
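Once each sampled link has been assigned a code, the counts and percentages per type can be tallied as sketched below. The code-to-label mapping follows the code book above; the coded sample is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Outbound link types, as listed in the code book.
LINK_TYPES: Dict[int, str] = {
    0: "Social media", 1: "News sites", 2: "Government sites",
    3: "Academic sites", 4: "Blog sites", 5: "Wiki sites",
    6: "Personal sites", 7: "Forums", 8: "Web directory",
    9: "NGO sites", 10: "Other",
}


def tally(codes: List[int]) -> Dict[str, Tuple[int, int]]:
    """Return label -> (count, rounded percentage) for the codes that occur."""
    unknown = sorted({c for c in codes if c not in LINK_TYPES})
    if unknown:
        raise ValueError(f"codes not in the code book: {unknown}")
    if not codes:
        return {}
    counts = Counter(codes)
    total = len(codes)
    return {
        LINK_TYPES[code]: (counts[code], round(100 * counts[code] / total))
        for code in sorted(counts)
    }


# Hypothetical coded sample of ten outbound links.
sample_codes = [0, 0, 0, 0, 0, 0, 0, 2, 2, 2]
for label, (count, percent) in tally(sample_codes).items():
    print(f"{label}: {count} ({percent}%)")
```

Rejecting unknown codes up front keeps a typo in the coding sheet from silently disappearing into the totals.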
Outbound link types
Candidate | Social media | News sites | Government sites
Michele Bachmann | 10 (100%) | 0 | 0
Barack Obama | 7 (70%) | 0 | 3 (30%)
Ron Paul | 10 (100%) | 0 | 0
Mitt Romney | 5 (71%) | 2 (29%) | 0
Sum | 32 (86%) | 2 (5%) | 3 (9%)
• Heavy reliance on social media sites: Web 2.0? The candidates can communicate with potential voters only on social media sites, not on their campaign websites.
Outbound link affiliation types
Categories: Positive, Negative, Neutral, Mixed
Michele Bachmann (10): 0; 5 (50%); 5 (50%)
Barack Obama (10): 0; 4 (40%); 6 (60%)
Ron Paul (10): 0; 2 (20%); 8 (80%)
Mitt Romney (7): 1 (12.5%); 4 (50%); 3 (37.5%)
Sum (37): 1 (2%); 15 (40%); 22 (58%)
• The 37 outbound links are all on the positive side
• For a clearer analysis, I distinguished neutral from mixed sites by the way users interact with the websites.
Destination size
Candidate | Small sites | Medium sites | Large sites
Michele Bachmann | 0 | 0 | 10 (100%)
Barack Obama | 1 (10%) | 2 (20%) | 7 (70%)
Ron Paul | 0 | 0 | 10 (100%)
Mitt Romney | 1 (12.5%) | 2 (25%) | 5 (62.5%)
Sum | 2 (5%) | 4 (9%) | 32 (86%)
• A site is small if it has fewer than 1,000 web pages, medium if it has 1,000 to 50,000, and large if it has more than 50,000.
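The size thresholds translate directly into a small coding function. The function name is illustrative; the codes 0, 1, and 2 follow the code book's destination size categories.

```python
def destination_size(page_count: int) -> int:
    """Code a destination site by its estimated number of web pages.

    0 = small (fewer than 1,000), 1 = medium (1,000 to 50,000),
    2 = large (more than 50,000).
    """
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    if page_count < 1_000:
        return 0
    if page_count <= 50_000:
        return 1
    return 2


for pages in (250, 1_000, 50_000, 2_000_000):
    print(pages, "->", destination_size(pages))
```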
[Chart: Michele Bachmann, link counts by type. Categories: Social media, News, Government, Academic, Blog, Wiki, Personal, Forums, Web directory, NGO. Values: 0 (0%), 16 (32%), 0 (0%), 1 (2%), 19 (38%), 4 (8%), 3 (6%), 2 (4%), 4 (8%), 1 (2%).]
Inbound links
• Number of inbound links: 31,231
• With “except from this domain”: 30,648 (no noise found)
• Inbound-to-outbound ratio: 30,648:20, or roughly 1,500:1
• High popularity: the site gives out only a limited number of links but receives a very large number.
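The ratio is a single division, shown here as a worked check. The variable names are illustrative; 20 is read as the outbound link count, which matches the figure listed for Michele Bachmann in the internal vs. outbound table.

```python
total_inbound = 31_231      # all inbound links reported
external_inbound = 30_648   # with "except from this domain" applied
outbound = 20               # outbound links from the same site

same_domain_inbound = total_inbound - external_inbound
ratio = external_inbound / outbound

print(f"inbound links from the site's own domain: {same_domain_inbound}")
print(f"inbound-to-outbound ratio: {ratio:,.1f} : 1")
```

The exact quotient is 1,532.4, which the slide rounds to roughly 1,500:1.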
Reciprocality
Candidate | Out/In (Source->Destination) | In/Out (Destination->Source)
Michele Bachmann | 10/7 (70%) | 10/0 (0%)
Barack Obama | 10/4 (40%) | 10/1 (10%)
Ron Paul | 10/6 (60%) | 10/0 (0%)
Mitt Romney | 8/8 (100%) | 10/2 (20%)
Sum | 48/25 (52%) | 40/3 (7.5%)
• Source to destination: a higher level of reciprocality
• Destination to source: the reciprocality is quite low
• Contributing factor: the different types of outbound links and inbound links
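A minimal sketch of the source-to-destination check, assuming links are recorded as (linking site, linked site) pairs: for each site the source links to, see whether that site links back. All site names are hypothetical.

```python
from typing import Set, Tuple

Link = Tuple[str, str]  # (linking site, linked site)


def reciprocity(source: str, links: Set[Link]) -> Tuple[int, int, float]:
    """Return (links returned, outbound targets, share returned) for a source."""
    targets = {dst for src, dst in links if src == source and dst != source}
    returned = {t for t in targets if (t, source) in links}
    share = len(returned) / len(targets) if targets else 0.0
    return len(returned), len(targets), share


# Hypothetical link data.
links = {
    ("campaign.example", "video.example"),
    ("campaign.example", "photos.example"),
    ("campaign.example", "news.example"),
    ("campaign.example", "social.example"),
    ("video.example", "campaign.example"),
    ("social.example", "campaign.example"),
    ("blog.example", "campaign.example"),
}
back, out, share = reciprocity("campaign.example", links)
print(f"Out/In: {out}/{back} ({share:.0%})")
```

The destination-to-source direction is the same test run from a sample of inbound links: for each site linking in, check whether the source links back to it.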
Image analysis
• First define relevant variables
– dimensions or range of options of a similar type (e.g.,
size, color, setting)
– logically independent of one another
• Then distinguish values for each variable
– elements of the same logical kind; can be substituted
for one another
– Should be mutually exclusive and exhaustive
• Classify content according to defined values on
specific variables (code data)
Sample variables
• Social distance - intimate, close personal, far
personal, close social, far social, public
• Visual modality - high, medium, low
• Behavior - offer/ideal, demand/affiliation,
demand/submission, demand/seduction,
other
Celebrity images
• A) Sexual dress
– demure (0), suggestive (1), partially clad (2), nude (3)
• B) Gender role
– decorative (1), traditional (2), progressive (3), other (4)
• C) Social distance
– intimate (1), close personal (2), far personal (3), close
social (4), far social (5), public (6)
• D) Behavior
– offer/ideal (1), demand/affiliation (2), demand/submission
(3), demand/seduction (4), other (5)
(A & B from Lambiase, 2003)
AN EXAMPLE: CAMPAIGN WEBSITES
Codebook
• Size: icon (0); small (1); medium (2); large (3)
• Abstractness: abstract (0); figurative (1)
• Object: people (0); place (1); thing (2)
• Gender: male (0); female (1)
• Age: child (0); adult (1)
• Race: white (0); black (1); Asian (2); other (3)
• Social distance: intimate (0); close personal (1); far personal (2); close social (3); far social (4); public (5)
• Color saturation: high (0); medium (1); low (2)
• Realism: high (0); medium (1); low (2)
• Behavior: offer/ideal (0); demand/affiliation (1); demand/submission (2); demand/seduction (3); other (4)
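A codebook like this can be held as a mapping and used to check coded images before counting. The sketch below includes only four of the variables for brevity; the image records are hypothetical, and leaving a variable uncoded when it does not apply (for example, gender for an image of a place) is an assumption of this sketch.

```python
from collections import Counter
from typing import Dict, List

# Subset of the codebook: variable -> code -> value label.
CODEBOOK: Dict[str, Dict[int, str]] = {
    "size": {0: "icon", 1: "small", 2: "medium", 3: "large"},
    "object": {0: "people", 1: "place", 2: "thing"},
    "gender": {0: "male", 1: "female"},
    "color_saturation": {0: "high", 1: "medium", 2: "low"},
}


def validate(record: Dict[str, int]) -> List[str]:
    """Return a list of problems found in one coded image (empty if none)."""
    problems = []
    for variable, code in record.items():
        if variable not in CODEBOOK:
            problems.append(f"unknown variable: {variable}")
        elif code not in CODEBOOK[variable]:
            problems.append(f"invalid code {code} for {variable}")
    return problems


def distribution(records: List[Dict[str, int]], variable: str) -> Dict[str, int]:
    """Count how often each value of a variable occurs across coded images."""
    counts = Counter(r[variable] for r in records if variable in r)
    return {CODEBOOK[variable][code]: n for code, n in sorted(counts.items())}


# Hypothetical coded images.
images = [
    {"size": 3, "object": 0, "gender": 0, "color_saturation": 0},
    {"size": 1, "object": 0, "gender": 1, "color_saturation": 1},
    {"size": 2, "object": 1, "color_saturation": 0},
]
for image in images:
    assert validate(image) == []
print(distribution(images, "object"))
```

Because each variable maps a code to exactly one label, a coded image can never carry two values for the same variable, which is the mutual-exclusivity requirement in practice.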