Microsoft PowerPoint - the NCRM EPrints Repository

Virtual Knowledge Studio (VKS)
Information Studies
What is Webometrics?
Mike Thelwall
Statistical Cybermetrics Research Group
University of Wolverhampton, UK
1. Introduction
□ Webometrics is concerned with gathering
data on and measuring aspects of the Web:
□ web sites
□ web pages
□ hyperlinks
□ web search engine results
□ YouTube video commenter networks
□ MySpace Friend networks
□ …for very varied social science purposes
New problems: Web-based
phenomena
□ Webometrics can be applied to
understanding web-based
phenomena
□ Why do web sites interlink?
□ Which web sites interlink?
□ What interlinking patterns exist?
□ What topics are frequently blogged
about?
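The question of which web sites interlink can be explored with a small link counter. The following is a minimal sketch, not the method used in the studies mentioned here: the page URLs and HTML are made up, the helper names (LinkExtractor, site_of, count_intersite_links) are hypothetical, and a "site" is crudely taken to be the host name of a URL.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def site_of(url):
    """Return the lower-cased host name of a URL as a crude 'site' label."""
    return urlparse(url).netloc.lower()


def count_intersite_links(pages):
    """pages maps a page URL to its HTML.

    Returns a Counter keyed by (source_site, target_site); links that stay
    within one site, or that have no host (e.g. mailto:), are ignored.
    """
    counts = Counter()
    for page_url, html in pages.items():
        parser = LinkExtractor()
        parser.feed(html)
        source = site_of(page_url)
        for href in parser.hrefs:
            # Resolve relative links against the page's own URL.
            target = site_of(urljoin(page_url, href))
            if target and target != source:
                counts[(source, target)] += 1
    return counts


# Hypothetical pages, invented for illustration only.
pages = {
    "http://a.example/index.html":
        '<a href="http://b.example/x">B</a> <a href="/local">home</a>',
    "http://b.example/x":
        '<a href="http://a.example/">A</a> <a href="http://c.example/">C</a>',
}
link_counts = count_intersite_links(pages)
for (source, target), n in sorted(link_counts.items()):
    print(source, "->", target, n)
```

Counting by host name is a design shortcut; a real study would need to decide how to group subdomains and how to treat duplicate pages.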
Old problems: Offline
phenomena reflected online
□ Some offline phenomena have
measurable online reflections
□ International communication
□ Inter-university collaboration
□ University-business collaboration
□ The impact or spread of ideas
□ Public opinion
2. Examples
Blog searching - blogpulse.com
Example: Identifying and tracking
public science concerns in blogs
Over 100,000 blogs and other sources tracked daily
via RSS feeds
Objective: to identify and track public concerns about
science
E.g., “Schiavo” identified and tracked as potential
public science concern
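Tracking a term such as "Schiavo" over time amounts to asking what share of each day's posts mention it. The sketch below is illustrative only: it assumes the posts have already been collected as (date, text) pairs, the posts and dates are invented placeholders, and the function name daily_mention_share is hypothetical rather than part of any tool named in these slides.

```python
from collections import defaultdict
from datetime import date
import re


def daily_mention_share(posts, term):
    """posts: iterable of (date, text) pairs.

    Returns {date: share of that day's posts that mention term},
    matching the term as a whole word, case-insensitively.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    totals = defaultdict(int)
    hits = defaultdict(int)
    for day, text in posts:
        totals[day] += 1
        if pattern.search(text):
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in sorted(totals)}


# Invented posts with placeholder dates, for illustration only.
posts = [
    (date(2000, 1, 1), "Nothing much happened today"),
    (date(2000, 1, 1), "Reading about the Schiavo case"),
    (date(2000, 1, 2), "Schiavo is in the news again"),
    (date(2000, 1, 2), "The schiavo debate continues"),
]
shares = daily_mention_share(posts, "Schiavo")
for day, share in shares.items():
    print(day.isoformat(), share)
```

Using the share of posts rather than the raw count guards against days on which more sources happened to be collected.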
Example: The online impact of
research groups (NetReAct)
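Example: Links between EU universities
[Figure: network diagram of links between EU universities. Normalised linking, smallest countries removed. Annotation: "Geopolitical connected". Countries shown: Austria, Switzerland, Belgium, Germany, France, Spain, NL, UK, Norway, Italy, Poland, Finland, Sweden]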
International biofuels research network
Example: MySpace age profiles
Percentage of profiles containing swearing

Group               Moderate  Strong  Very strong  Sample size
US males 16-19      10%       47%     2%           1,530
US females 16-19    11%       38%     2%           1,287
UK males 16-19      33%       33%     8%           171
UK females 16-19    18%       38%     3%           130
(typical sample size 20-148 for non-web swearing research)
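Sample size matters because it controls how much uncertainty surrounds each percentage. As a rough sketch, assuming the profiles can be treated as a simple random sample (which the slides do not claim), a Wilson score interval can be put around a reported proportion. The function name wilson_interval is hypothetical, and the two inputs are the moderate-swearing figures for UK males (33% of 171) and US males (10% of 1,530).

```python
from math import sqrt


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for a proportion p_hat observed in n cases.

    z=1.96 corresponds to an approximate 95% confidence level.
    """
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Figures taken from the moderate-swearing percentages and sample sizes.
uk_males = wilson_interval(0.33, 171)
us_males = wilson_interval(0.10, 1530)

print("UK males: %.3f to %.3f" % uk_males)
print("US males: %.3f to %.3f" % us_males)
```

The interval for the smaller UK sample is noticeably wider than the one for the larger US sample, which is the practical point behind comparing sample sizes.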
emphatic adverb/adjective OR adverbial booster
OR premodifying intensifying negative adjective
(36% of swearing)
□ and we r guna go to town again n make a ryt
fuckin nyt of it again lol
□ see look i'm fucking commenting u back
□ lol and stop fucking tickleing me!!
□ Thanks for the party last night it was fucking good
and you are great hosts.
□ That 50's rock and roll weekender was fucking
mint!
□ Fuckin my space, my arse
□ 1/2 d ppl cudnt even speak fuckin english!
□ yeah so me and sarah broke up and everythings
fucking shit
YouTube – Video poster ages
YouTube
friend network
Online impact - Keywords in web
pages mentioning IWRM
Data Gathering/Processing Tools
□ Blogpulse.com – blog network
diagrams
□ LexiURL Searcher – links, web text,
YouTube, Flickr, Technorati
□ Issue Crawler, Google TouchGraph links
Discussion points for online data
□ Validity – is the underlying meaning of the
text/video/picture readily apparent to the
researcher?
□ Possibly not to any great degree for teenagers’ MySpace
comments or very personal YouTube videos
□ Reliability – are search engines accurate/good at
returning the correct results?
□ Google blog search shows unreliability – very variable over
time
□ Researchers can triangulate across similar search
engines, or across time, to test reliability
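One simple way to put a number on reliability over time is to repeat the same query on several occasions and compare how much the reported hit counts vary. The sketch below is an assumption-laden illustration: the engine names and hit counts are invented, no real search API is called, and the coefficient of variation is just one possible stability measure.

```python
from statistics import mean, pstdev


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean of the counts.

    Returns 0.0 when the mean is zero to avoid dividing by zero.
    """
    m = mean(counts)
    if m == 0:
        return 0.0
    return pstdev(counts) / m


# Invented hit counts for one query repeated on three occasions.
hit_counts = {
    "engine_a": [100, 100, 100],  # perfectly stable
    "engine_b": [50, 100, 150],   # varies a lot between occasions
}

variation = {
    engine: coefficient_of_variation(counts)
    for engine, counts in hit_counts.items()
}
for engine, cv in variation.items():
    print(engine, round(cv, 3))
```

A value near zero suggests stable results for that query; larger values flag a source whose counts should be treated with caution or cross-checked against another engine.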
Discussion points for online data
□ Coverage – to what extent are all the
phenomena of interest covered by the
source (e.g., search engine) used?
□ Sample bias – are certain types of
people over-represented? (e.g., the
more literate, the more vocal, the
more politically active, youth,
educated, creative types…)
Summary
□ The web contains a wide variety of
interesting web and “web 2.0” content
posted by many different people in
many different formats
□ Webometric methods can give insights
into this data
Books
□ Thelwall, M. (2009). Introduction to
webometrics: Quantitative web research for
the social sciences. New York: Morgan &
Claypool.
□ Rogers, R. (2005). Information politics on the
Web. Cambridge, MA: MIT Press.
□ http://lexiurl.wlv.ac.uk
□ http://webometrics.wlv.ac.uk
□ http://www.issuecrawler.net
Important considerations
□ Data accuracy
□ Data cleaning
□ Context to help interpret results
□ Report results carefully
Example: Analysis of the
accuracy of search engine results
Live Search results analysis