Web Content Evaluation
An application of evaluative methods

Today’s topics
• Feature analysis
• Link analysis
• Image analysis

S519

Web Content Analysis
• WebCA is a methodologically plural paradigm
• But it is coherent overall, in that its methods are informed by the general principles of content analysis (CA)
  – must enable “objective, systematic, and quantitative description of the content of [web] communication” (Baran, 2002, p. 410)
  – follows a general coding and counting procedure, except when identification of phenomena is automated
Herring (2010)

Genre
• Definitions
  – “typified rhetorical action based in recurrent situations” (Miller, 1984)
  – “a distinctive type of communicative action, characterized by a socially recognized communicative purpose and common aspects of form” (Yates & Orlikowski, 1994)
  – A style or type, in this case relating specifically to Web format and content

Examples
• Campaign websites: Ron Paul, Mitt Romney, Michele Bachmann, etc.
• Shopping websites: Target, Wal-Mart, Amazon, etc.
• iSchool websites: IU-SLIS, UNC-SILS, UIUC-GSLIS
• Any other examples?
Web content analysis/evaluation is usually conducted on a group of websites of the same or a similar genre.

Feature Analysis
• Analysis of the presence/absence, the range, or the frequency of objective website features
• Types of web features
  – Content: Text, figures, images, video clips, ads, badges, etc.
  – Navigation: Facilities for accessing information and moving around the site
  – Presentation: The way in which content and navigation are presented to the user, e.g., layout patterns
(Calero, Ruiz, & Piattini, 2004)

Examples
• Web Genres: background images; email links; forms; images; links inward; links outward; lists; top-level domain name
• Personal Homepages: awards; colors other than white and gray; copyright notice; guestbook; hit counter; last update date; link structure; ‘new’ icon; personal info (various types); ‘under construction’ sign; video; welcome message
• Blogs: ads; archives; badges; blogger information; blogroll; calendar; comments; entries; images; length of entries; permalink; temporal measures; time stamp
From S. Herring, S642

Which features to analyze
• From prescriptive guidelines
• From previous studies, wholesale or modified
• Grounded theory approach (Glaser & Strauss, 1968)
  – allow relevant features to emerge from the data

AN EXAMPLE ON CAMPAIGN WEBSITES

Selection of websites
• The candidates’ websites for the 2012 presidential election
• One Democratic candidate, Barack Obama, and nine Republican candidates
• To select the most “popular” Republican candidates
  – Google Trends (GT)
  – Michele Bachmann as the baseline (a GT score of one)
  – The time period: past 30 days (query time: September 23)
  – The top four candidates based on GT scores are Ron Paul (1.96), Rick Perry (1.78), Michele Bachmann (1.00), and Mitt Romney (0.42).

Purpose and audience
• To facilitate the candidates’ electoral campaigns
  – Publicize political views
  – Establish connections with voters
  – Request donations and volunteers
• Potential U.S. voters in the 2012 election
  – covers the majority of adult U.S. citizens
  – non-U.S. citizens (especially those who are interested in politics and American culture)
  – Michele Bachmann’s campaign website (http://www.michelebachmann.com/): 22% of the website’s visitors came from outside the U.S.

Genre
• (Presidential) electoral campaign websites.
  – The chosen websites share similar forms and topics, and serve the same group of people with the same purposes
• Related to other campaign websites, both political and non-political
• Related to politicians’ websites, such as mayors’ websites and governors’ websites

Features
• Linking structure
• Involvement
• Multimedia
• Functions
• Social media

Features – linking structure

| Feature | Mean | Range | Mode |
|---|---|---|---|
| No. of top-level navigation links | 7 | 5 (RP, BO) ~ 10 (RP) | 5 |
| No. of web pages | 1,770 | 233 (MB) ~ 3,140 (BO) | - |
| No. of links in the past week (as of September 25, 2011) | 60 | 22 (MB) ~ 138 (BO) | - |

Features – involvement

| Feature | Frequency | Percentage | Exception |
|---|---|---|---|
| “Donate” on the homepage | 5 | 100% | - |
| “Volunteer” on the homepage | 4 | 80% | Mitt Romney |
| Store | 4 | 80% | Rick Perry |
| Get Email Updates on the homepage | 3 | 60% | Mitt Romney; Barack Obama |
| Sign in | 2 | 40% | Barack Obama; Mitt Romney |
| Join groups | 1 | 20% | Barack Obama |

Features – multimedia

| Feature | Frequency | Percentage | Exception |
|---|---|---|---|
| Background color used | 3 | 60% | Barack Obama; Ron Paul |
| Picture as background image | 1 | 20% | Michele Bachmann |
| Leave comments | 1 | 20% | Barack Obama |

| Feature | Mean | Range | Mode |
|---|---|---|---|
| No. of images on the homepage | 7 | 3 (BO) ~ 12 (RP) | - |
| No. of images in the whole website (larger than 300*400) | 350 | 19 (MB) ~ 1,350 (BO) | - |
| No. of videos on the homepage | 2 | 1 (RP) ~ 3 (BO) | 1 |

Features – functions

| Feature | Frequency | Percentage | Exception |
|---|---|---|---|
| Search function | 1 | 20% | Mitt Romney |
| Calendar | 2 | 40% | Barack Obama; Ron Paul |
| Blog | 3 | 60% | Rick Perry; Michele Bachmann |
| Other languages | 1 | 20% | Barack Obama |
| “In Your State” function | 3 | 60% | Rick Perry; Mitt Romney |

Features – social media
[Candidate-by-platform matrix of Y marks; the seven platform column headings are not recoverable from the source. Row and column sums are kept.]

| Candidate | Sum (of 7 platforms) |
|---|---|
| Barack Obama | 5 |
| Michele Bachmann | 7 |
| Ron Paul | 6 |
| Rick Perry | 5 |
| Mitt Romney | 7 |

Column sums, in the original column order: 5, 5, 5, 2, 5, 3, 5

Exercise
• Find features that you think are important and/or relevant for
  – iSchool websites
  – Shopping websites

Link Analysis
• The web is hypertextual
  – Text links to other text via hyperlinks
• Four basic types of links:
  – outbound, inbound, internal, reciprocal

Four basic linking patterns
[Diagram of Site A and Site B for each pattern]
• Internal (e.g., navigation links within a site)
• Outbound (“outlinks”) (e.g., the SLIS site links to the JASIST site)
• Inbound (“inlinks”) (e.g., the JASIST site links to the SLIS site)
• Reciprocal (e.g., the SLIS and JASIST sites link to each other)
From S. Herring, S642

Calculating links
• Inbound links
  – Estimate through Yahoo!
• Outbound links
  – Estimate through web page source
• Number of web pages
  – Estimate through Google

AN EXAMPLE ON CAMPAIGN WEBSITES

Internal vs. outbound

| Candidate | Internal | Outbound |
|---|---|---|
| Michele Bachmann | 17 | 20 |
| Barack Obama | 28 | 16 |
| Ron Paul | 52 | 11 |
| Mitt Romney | 30 | 8 |

• Coherent entities: the sites neither contain “communities” inside nor sit beneath a higher-level territory/domain.
• Limited number of outbound links
• Not link-intensive; may favor internal links over outbound links

Destination
• Michele Bachmann: 6 YouTube, 4 FB, 8 Flickr, 2 Twitter
• Barack Obama: 1 FB, 5 Twitter, 4 YouTube, 1 Flickr, 4 Whitehouse.org, 1 OnGuardOnline.gov
• Ron Paul: 5 YouTube, 3 FB, 3 Twitter
• Mitt Romney: 1 obamaisntworking.com, 2 YouTube, 1 FB, 1 Twitter, 1 Flickr, 2 http://www.thevillagesdailysun.com

Code book
• Outbound link types
  – Social media (0); News sites (1); Government sites (2); Academic sites (3); Blog sites (4); Wiki sites (5); Personal sites (6); Forums (7); Web directory (8); NGO sites (9); Other (10)
• Outbound link affiliation types
  – Positive (0); Negative (1); Neutral (2); Mixed (3)
• Destination size
  – Small sites (0); Medium sites (1); Large sites (2)

Outbound link types

| Candidate | Social media | News sites | Government sites |
|---|---|---|---|
| Michele Bachmann | 10 (100%) | 0 | 0 |
| Barack Obama | 7 (70%) | 0 | 3 (30%) |
| Ron Paul | 10 (100%) | 0 | 0 |
| Mitt Romney | 5 (71%) | 2 (29%) | 0 |
| Sum | 32 (86%) | 2 (5%) | 3 (9%) |

• Heavy reliance on social media sites: Web 2.0?
• The candidates can communicate with potential voters only on social media sites, not on their campaign websites.

Outbound link affiliation types
Categories: Positive, Negative, Neutral, Mixed
[The source gives three values per row under these four category headings; the column each value belongs to is not recoverable.]
• Michele Bachmann (10): 0; 5 (50%); 5 (50%)
• Barack Obama (10): 0; 4 (40%); 6 (60%)
• Ron Paul (10): 0; 2 (20%); 8 (80%)
• Mitt Romney (7): 1 (12.5%); 4 (50%); 3 (37.5%)
• Sum (37): 1 (2%); 15 (40%); 22 (58%)

• The 37 outbound links are all on the positive side.
• For a clearer analysis, I distinguished neutral from mixed sites by the way users interact with the websites.
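
The coding-and-counting step behind the outbound link types table can be sketched in Python. This is a minimal illustration, not the procedure actually used for the slides: the type codes come from the code book, the per-candidate code lists are written out to match the counts in the table rather than taken from the original coded data, and the names `tally` and `coded_links` are introduced here for illustration only.

```python
from collections import Counter

# Outbound link type codes, copied from the code book slide.
LINK_TYPES = {
    0: "Social media", 1: "News sites", 2: "Government sites",
    3: "Academic sites", 4: "Blog sites", 5: "Wiki sites",
    6: "Personal sites", 7: "Forums", 8: "Web directory",
    9: "NGO sites", 10: "Other",
}


def tally(codes):
    """Return {type label: (count, rounded percentage)} for a list of codes."""
    counts = Counter(codes)
    total = len(codes)
    return {
        LINK_TYPES[code]: (n, round(100 * n / total))
        for code, n in sorted(counts.items())
    }


# Assumed input: one type code per coded link, reconstructed from the
# counts in the "Outbound link types" table (not the original data file).
coded_links = {
    "Michele Bachmann": [0] * 10,
    "Barack Obama": [0] * 7 + [2] * 3,
    "Ron Paul": [0] * 10,
    "Mitt Romney": [0] * 5 + [1] * 2,
}

for candidate, codes in coded_links.items():
    print(candidate, tally(codes))
```

For Mitt Romney this gives 5 (71%) social media and 2 (29%) news sites, as in the table. Pooling all 37 links gives 86%, 5%, and 8%, whereas the Sum row shows 9% for government sites.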

Destination size

| Candidate | Small sites | Medium sites | Large sites |
|---|---|---|---|
| Michele Bachmann | 0 | 0 | 10 (100%) |
| Barack Obama | 1 (10%) | 2 (20%) | 7 (70%) |
| Ron Paul | 0 | 0 | 10 (100%) |
| Mitt Romney | 1 (12.5%) | 2 (25%) | 5 (62.5%) |
| Sum | 2 (5%) | 4 (9%) | 32 (86%) |

A site is small if it has fewer than 1,000 web pages, medium if it has 1,000 to 50,000, and large if it has more than 50,000.

Inbound links (Michele Bachmann)
[Table of inbound link types; the alignment of values to columns is not recoverable from the source.]
• Link types shown: Social media, News, Government, Academic, Blog, Wiki, Personal, Forums, Web directory, NGO
• Counts, in source order: 0, 16, 0, 1, 19, 4, 3, 2, 4, 1
• Percentages, in source order: 0%, 32%, 0%, 2%, 38%, 8%, 6%, 4%, 8%, 2%

• The number of inbound links: 31,231
• “Except from this domain”: 30,648 (no noise found)
• Ratio: 30,648:20, or roughly 1,500:1
• Higher popularity, in that the site gave out only a limited number of links but received a huge number of links.

Reciprocality

| Candidate | Source → Destination (Out/In) | Destination → Source (In/Out) |
|---|---|---|
| Michele Bachmann | 10/7 (70%) | 10/0 (0%) |
| Barack Obama | 10/4 (40%) | 10/1 (10%) |
| Ron Paul | 10/6 (60%) | 10/0 (0%) |
| Mitt Romney | 8/8 (100%) | 10/2 (20%) |
| Sum | 48/25 (52%) | 40/3 (7.5%) |

• Source to destination: a higher level of reciprocality
• Destination to source: the reciprocality is quite low
• Contributing factor: the different types of outbound links and inbound links

Image analysis
• First define relevant variables
  – dimensions or range of options of a similar type (e.g., size, color, setting)
  – logically independent of one another
• Then distinguish values for each variable
  – elements of the same logical kind; can be substituted for one another
  – should be mutually exclusive and exhaustive
• Classify content according to defined values on specific variables (code data)

Sample variables
• Social distance - intimate, close personal, far personal, close social, far social, public
• Visual modality - high, medium, low
• Behavior - offer/ideal, demand/affiliation, demand/submission, demand/seduction, other

Celebrity images
• A) Sexual dress – demure (0), suggestive (1), partially clad (2), nude (3)
• B) Gender role – decorative (1), traditional (2), progressive (3), other (4)
• C) Social distance – intimate (1), close personal (2), far personal (3), close social (4), far social (5), public (6)
• D) Behavior – offer/ideal (1), demand/affiliation (2), demand/submission (3), demand/seduction (4), other (5)
(A & B from Lambiase, 2003)

AN EXAMPLE ON CAMPAIGN WEBSITES

Codebook
• Size: icon (0); small (1); medium (2); large (3)
• Abstractness: abstract (0); figurative (1)
• Object: people (0); place (1); thing (2)
• Gender: male (0); female (1)
• Age: child (0); adult (1)
• Race: white (0); black (1); Asian (2); other (3)
• Social distance: intimate (0); close personal (1); far personal (2); close social (3); far social (4); public (5)
• Color saturation: high (0); medium (1); low (2)
• Realism: high (0); medium (1); low (2)
• Behavior: offer/ideal (0); demand/affiliation (1); demand/submission (2); demand/seduction (3); other (4)
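
One way to apply the codebook above is to store it as data and check each coded image against it, so that every variable receives exactly one valid value. The Python sketch below is illustrative only: the variable keys, the helper names `check_record` and `decode`, and the sample record are assumptions introduced here, not part of the original study. It also assumes every variable is coded for every image; the slide does not say how Gender, Age, or Race are handled for images without people.

```python
# Image codebook from the slide: each variable maps to its value labels,
# and a label's position in the list is its numeric code.
CODEBOOK = {
    "size": ["icon", "small", "medium", "large"],
    "abstractness": ["abstract", "figurative"],
    "object": ["people", "place", "thing"],
    "gender": ["male", "female"],
    "age": ["child", "adult"],
    "race": ["white", "black", "Asian", "other"],
    "social_distance": ["intimate", "close personal", "far personal",
                        "close social", "far social", "public"],
    "color_saturation": ["high", "medium", "low"],
    "realism": ["high", "medium", "low"],
    "behavior": ["offer/ideal", "demand/affiliation", "demand/submission",
                 "demand/seduction", "other"],
}


def check_record(record):
    """Return a list of coding problems; an empty list means the record is valid."""
    problems = []
    for variable, labels in CODEBOOK.items():
        if variable not in record:
            problems.append(f"{variable}: missing")
            continue
        code = record[variable]
        if not (isinstance(code, int) and 0 <= code < len(labels)):
            problems.append(f"{variable}: invalid code {code!r}")
    for variable in record:
        if variable not in CODEBOOK:
            problems.append(f"{variable}: not in codebook")
    return problems


def decode(record):
    """Translate a valid record's numeric codes into their labels."""
    return {variable: CODEBOOK[variable][code] for variable, code in record.items()}


# Hypothetical coded image, for illustration only (not data from the study).
sample = {
    "size": 3, "abstractness": 1, "object": 0, "gender": 0, "age": 1,
    "race": 0, "social_distance": 3, "color_saturation": 0, "realism": 0,
    "behavior": 1,
}
print(check_record(sample))
print(decode(sample)["social_distance"])
```

Keeping the codebook as a single data structure means the validity check and the label lookup both follow from the slide's definitions, so a change to the codebook needs to be made in only one place.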