Google Analytics Blog - University of Warwick

Public Engagement with Research Online
Appendix N: Evaluating the impact of research online with Google Analytics
The web provides extensive opportunities for raising awareness and discussion
of research findings and issues. As a commonly used channel for communication,
the web can also provide a source of data for evidencing the impact of research
dissemination, public engagement and knowledge transfer activities. For
example, the number and location of people accessing a research report can be
used as an indicator of reach, and favorable quotations from practitioner
discussion forums citing research can illustrate significance.
Web analytics refers to the study of user data collected on websites. Online
commerce has been the main application area driving the development of web
analytics in recent years. Nonetheless, the general goal of web analytics is to
capture and analyse data on how websites are used. Here we present an
overview of the historical background and the technologies used for tracking
user behavior. We also highlight the features of Google Analytics and how it can
be set up to monitor and evidence the impact of research online.
Background
The World Wide Web was proposed in March 1989 and the first web browser
was developed in December 1990. As use of the web spread, people became
interested in who was accessing their pages and for what purposes, and by the
mid-1990s commercial companies (such as WebTrends and Analog) were
emerging that provided reports of log file data. This was the start of web
analytics: the measurement, analysis and reporting of user behavior on the web.
From a technical perspective, a web server program logs each request for an
HTML element, recording the Internet Protocol (IP) address of the client
computer (i.e. browser), the date, the time, the element requested and the
status of the request. Each request in the log file is referred to as a ‘hit’.
As the IP address of each web browser can be attributed to a geographical
location, the summary reports typically identify the number of hits for
specific time periods (e.g. hourly, weekly, monthly) per location (e.g. Europe,
America, Asia).
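The logging described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the server writes the widely used Common Log Format, and the sample lines, IP addresses and function names are invented for the example. Attributing an IP address to a location would need an external lookup database and is not shown.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout (Common Log Format):
# host ident authuser [date] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def parse_hit(line):
    """Return one logged request (a 'hit') as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    hit["time"] = datetime.strptime(hit["time"], "%d/%b/%Y:%H:%M:%S %z")
    hit["status"] = int(hit["status"])
    return hit


def hits_per_hour(lines):
    """Count hits for each (date, hour) period."""
    counts = Counter()
    for line in lines:
        hit = parse_hit(line)
        if hit is not None:
            counts[(hit["time"].date().isoformat(), hit["time"].hour)] += 1
    return counts


# Invented sample data using documentation-only IP addresses.
SAMPLE_LOG = [
    '192.0.2.1 - - [04/Oct/2012:09:15:02 +0100] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.1 - - [04/Oct/2012:09:15:03 +0100] "GET /logo.png HTTP/1.1" 200 2048',
    '198.51.100.7 - - [04/Oct/2012:10:01:44 +0100] "GET /report.pdf HTTP/1.1" 200 90210',
]
```

Calling hits_per_hour(SAMPLE_LOG) counts two hits in the 09:00 hour and one in the 10:00 hour on 4 October 2012.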
In 1996 hit-counters started to appear on pages showing the number of requests
for that page. However, by 1997 the number of hits logged no longer represented
the number of page requests because multiple elements (e.g. text and images)
were being used to create a page. This led to the use of JavaScript tags that can be
included as a page element to explicitly log each page request. JavaScript tags are
still the most commonly used way of explicitly logging page request data.
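The gap between hits and page requests can be illustrated with a short Python sketch. The rule used here to decide what counts as a page (no file suffix, or .html / .htm) is an assumption made for the example, not a standard, and the request list is invented.

```python
from pathlib import PurePosixPath

# Assumption: only these suffixes (or no suffix at all) indicate a page
# rather than an element embedded in a page.
PAGE_SUFFIXES = {"", ".html", ".htm"}


def is_page_request(path):
    """Guess whether a requested path is a page rather than an embedded element."""
    path_only = path.split("?", 1)[0]  # ignore any query string
    return PurePosixPath(path_only).suffix.lower() in PAGE_SUFFIXES


def count_hits_and_pages(paths):
    """Return (number of hits, number of those hits that look like page requests)."""
    hits = len(paths)
    pages = sum(1 for path in paths if is_page_request(path))
    return hits, pages


# One page built from several elements, plus a second page and a download.
REQUESTS = ["/index.html", "/style.css", "/logo.png", "/photo.jpg", "/about/", "/report.pdf"]
```

For REQUESTS the function returns six hits but only two page requests, which is why raw hit counts overstate page use. A JavaScript tag avoids this guesswork because it runs once per page.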
In 2004 the Web Analytics Association was formed as a professional association
for web analytics, which changed its name to the Digital Analytics Association in
2012 to reflect a broader approach to multi-channel analytics. In 2005, Google
launched the Google Analytics platform. This uses JavaScript tags to log page
requests with Google, and the logged data can be accessed as online reports
through the Google Analytics website. Other commercial providers, such as
ClickTale, also host customer analytics services. As well as third-party
services, a range of server software for logging and reporting user behavior is
available for installing on a web server; this includes open source solutions
such as Piwik.
Web analytics for online commerce
Online commerce has been one of the primary drivers for web analytics and
there has been significant interest in using analytics to help identify the
impact of website design and marketing initiatives on online sales and
popularity. Google Analytics, for example, currently presents reports on:
audience, advertising, traffic sources, content and conversions.
The audience reports include:
• demographics (in terms of the location and browser language setting),
• visitor behavior (i.e. the number of new and returning visits, and the
  duration of their page visits),
• technology used (i.e. the browser version, operating system and the
  network service provider),
• mobile (i.e. the number of visitors via specific phone or other mobile
  devices), and
• visitors’ flow (i.e. the pathways commonly used through the website).
The reports on advertising can be used to describe access associated with
AdWords. Traffic source reports identify the source and frequency of referral
links from other websites, access via search engines including the search terms
used, and the number of visitors accessing the website directly by entering the
web address. The content reports identify the relative popularity of pages within
the website. Finally, the conversions reports can be used to help indicate the
performance of the website with regard to e-commerce.
Web analytics for evaluating research impact
The potential of web analytics to help evidence impact has recently drawn
considerable interest from the academic research community, where
researchers are increasingly required to account for the impact of their research
in terms of the reach and significance of their work outside of their research
communities. The following section gives a brief overview of the process of
setting up a Google Analytics account and the types of report that are produced.
Other web analytic platforms are available, but Google Analytics is currently the
most widely used free service.
Getting started: Setting up Google Analytics
To set up a Google Analytics account you first need to sign in to the website
(http://www.google.com/analytics, see Figure 1 left) with a Google user account,
which can be created at http://accounts.google.com. After signing
in, new Google Analytics accounts can be added, edited and deleted under the
‘Admin’ area (accessed by clicking on the ‘Admin’ tab in the top navigation bar of
the webpage, see Figure 1 right).
Figure 1. Google Analytics sign in page (left) and account administration page (right).
Figure 2. Google Analytics profile administration page.
As previously noted, JavaScript tags are used to record user behavior in web
analytics platforms. In Google Analytics this is referred to as the ‘tracking code’.
The tracking code is unique to each Google Analytics account. Once signed in,
the code can be copied from the profile page within the ‘Admin’ area (see Figure
2) and pasted into every web page that is to be tracked under that account (for
further guidance see the Google Analytics support pages [1]).
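Where a site has many static pages, the pasting step could be scripted. The Python sketch below is illustrative only: TRACKING_SNIPPET is a placeholder, not real tracking code (the real code must be copied from the ‘Admin’ area), the function name is hypothetical, and placing the snippet just before the closing head tag is an assumption that should be checked against the Google Analytics support pages.

```python
import re

# Placeholder only: replace with the tracking code copied from the 'Admin' area.
TRACKING_SNIPPET = "<script>/* paste the Google Analytics tracking code here */</script>"

# Matches the closing head tag regardless of letter case.
HEAD_CLOSE = re.compile(r"</head\s*>", flags=re.IGNORECASE)


def add_tracking_code(html, snippet=TRACKING_SNIPPET):
    """Insert the snippet before the closing head tag, at most once per page."""
    if snippet in html:
        return html  # page is already tagged
    match = HEAD_CLOSE.search(html)
    if match is None:
        raise ValueError("no closing head tag found")
    return html[:match.start()] + snippet + "\n" + html[match.start():]


PAGE = "<html><head><title>CAGE</title></head><body><p>Research</p></body></html>"
TAGGED_PAGE = add_tracking_code(PAGE)
```

Running the function a second time on TAGGED_PAGE leaves it unchanged, so the script can safely be re-run over a whole site.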
Once a Google Analytics account has been set up and the tracking code has been
inserted into the web pages that are to be monitored, the resulting data can then
be accessed via the Google Analytics account page. A range of reports is
automatically generated (see Figure 3). The date range for a report can be
modified; under each type of report an initial overview is presented with more
specific reports available under each section.
Figure 3. Google Analytics overview reports for audience (left) and traffic sources (right).
The specific metrics used within each report are explained in the Google
Analytics help pages. A brief description of each metric is also displayed as
mouse-over pop-ups within the reports. A set of explanatory video clips,
provided as preparation for the Google Analytics Individual Qualification
test [2], also supports the interpretation of reports.
Example use case: Using Google Analytics to evaluate impact
When web pages are part of a research dissemination, public engagement or
knowledge transfer activity, the use made of those pages can provide an insight
into the impact of the activity. The audience report in Google Analytics, for
example, provides information on the number and location of website visitors
over a specified time period, which can be used to evidence reach.
The following examples, drawn from the work of the Centre for Competitive
Advantage in the Global Economy (CAGE) at the University of Warwick,
illustrate how Google Analytics can be deployed to inform engagement
activities and evidence impact. They show how the impact of dissemination,
engagement and knowledge transfer activities can be informed by (and evidenced
through) the audience, traffic source and content reports, and how dashboards
can be configured to provide specific report features. Finally, we consider how
these reports can provide an overview of user behavior, in terms of their
online activity, and how the reports can be further explored to extract
specific details.
[1] Google Analytics Support Page: Manage Google Analytics – Basic web tracking
setup ‘How to set up the web tracking code’
http://support.google.com/analytics/bin/answer.py?hl=en&answer=1008080
(last accessed November 2012).
[2] The Google Analytics Individual Qualification test and associated preparatory
video clips are available from http://www.google.com/intl/en/analytics/iq.html
(last accessed November 2012).
Audience reports
The audience reports present data on the number and location of people visiting
the website (see Figure 4). The number of visitors is recorded in several forms,
including the number of visits, the number of unique visitors, the number of
page views, and the average number of pages per visit. It is important to
remember the form of data being collected when interpreting it: there is no
explicit way to identify each person; instead, the Internet address and details
of each web browser are recorded. Selecting the appropriate form of visitor
metric is dependent on the question that you need to answer. For example, the
number of visits includes repeat visits, whereas the number of unique visitors
does not. The location data refers to the registered city of the Internet
Service Provider. As a result of the form of data being captured, audience
reports tend to give a good indication of the frequency and location of visits,
which helps evidence the reach of the work being communicated through the
website.
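The difference between these visitor metrics can be made concrete with a small Python sketch. The record layout (browser identifier, visit number, page) and the sample data are assumptions for illustration; Google Analytics does not expose its data in this form.

```python
def audience_metrics(page_views):
    """Summarise a list of (browser_id, visit_number, page) records."""
    visits = {(browser, visit) for browser, visit, _ in page_views}
    unique_visitors = {browser for browser, _, _ in page_views}
    return {
        "visits": len(visits),                    # repeat visits are counted
        "unique_visitors": len(unique_visitors),  # each browser counted once
        "page_views": len(page_views),
        "pages_per_visit": len(page_views) / len(visits) if visits else 0.0,
    }


# Invented sample: browser-a visits twice, browser-b visits once.
SAMPLE_PAGE_VIEWS = [
    ("browser-a", 1, "/"),
    ("browser-a", 1, "/research"),
    ("browser-a", 2, "/"),
    ("browser-b", 1, "/research"),
]

METRICS = audience_metrics(SAMPLE_PAGE_VIEWS)
```

Here there are three visits but only two unique visitors, because the second visit from browser-a is a repeat visit. A "unique visitor" in this sketch is a browser, not a person, which mirrors the caveat above.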
In the audience overview report shown in Figure 4 there is a clear spike in the
number of visits on October 4th with over 67% of visits during that month being
from browsers where the language setting was set to en-us (i.e. English - United
States) and over 10% with the language setting of en-gb (i.e. English - Great
Britain). By default, the demographic data refers to the language rather than the
location (i.e. country / territory or city). In this case, 739 of the 1,421 visits in
the month (52%) were from the United Kingdom and 215 (15%) were from the
United States.
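The percentages quoted above follow directly from the visit counts. As a quick check, using a helper function named only for this example:

```python
def percentage_share(part, total):
    """Return part as a percentage of total, rounded to the nearest whole number."""
    return round(100 * part / total)


TOTAL_VISITS = 1421  # visits in October 2012, as reported in the text

uk_share = percentage_share(739, TOTAL_VISITS)  # United Kingdom
us_share = percentage_share(215, TOTAL_VISITS)  # United States
```

This gives 52 for the United Kingdom and 15 for the United States, matching the figures in the text.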
Figure 4. An audience report for the CAGE website at the University of Warwick for a one month time
period (i.e. October 2012).
Of the 1,421 visits during October 2012, 733 were from new visitors (i.e. people
who had not accessed the website before) and 866 were from returning visitors.
The 1,421 visits came from 863 unique visitors,
indicating that some of the visitors were accessing the website through more
than one browser (either on the same or multiple devices).
Traffic source reports
The traffic source reports present data on the visits resulting from: search
engines (i.e. the results of a search query); referrals (i.e. links) from other
websites; directly entering the web address; and from campaigns (i.e. online
promotion through paid campaign search keywords and adverts). The traffic
source overview indicates the pathways used to access the website, although some of
the details regarding search term access are not available or may be withheld
(e.g. by not agreeing to browser cookies) (see Figure 5).
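The first three traffic source categories can be approximated from the referrer address that a browser sends with a page request. The Python sketch below is a simplified illustration, not how Google Analytics itself classifies traffic: the list of search engine host fragments is a short assumed sample, and campaign traffic (which depends on specially tagged links) is not handled.

```python
from urllib.parse import urlparse

# Assumption: a short, illustrative list of search engine host fragments.
SEARCH_HOST_FRAGMENTS = ("google.", "bing.", "yahoo.")


def classify_source(referrer):
    """Classify a visit as 'direct', 'search' or 'referral' from its referrer URL."""
    if not referrer:
        return "direct"  # no referrer, e.g. the address was typed in
    host = (urlparse(referrer).hostname or "").lower()
    if any(fragment in host for fragment in SEARCH_HOST_FRAGMENTS):
        return "search"
    return "referral"


# Invented referrer addresses for illustration.
EXAMPLES = {
    "": classify_source(""),
    "http://www.google.co.uk/search?q=cage": classify_source(
        "http://www.google.co.uk/search?q=cage"
    ),
    "http://edition.cnn.com/some-article": classify_source(
        "http://edition.cnn.com/some-article"
    ),
}
```

The three examples are classified as direct, search and referral respectively.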
Figure 5. A traffic source report for the CAGE website at the University of Warwick for a one month
time period (i.e. October 2012).
For campaigns that pay for search keywords and adverts, the source of search
traffic (i.e. the search engine providers) and the search keywords used will be an
important indicator of the value of the campaign. For both paid campaigns and
organic search, the keywords used by people that visit the website can also
provide a useful indicator of the concepts associated with the website and the
work that it communicates.
The other part of the traffic source data that can be useful for informing
dissemination, engagement and knowledge transfer activities is the source of
referral traffic. Much like the most commonly used search engines or search
keywords, the more common sources of website referrals (reported in the
Referral Traffic Source data table) can give a clear indication of the publics that
are engaging with the website. In the example used here, the spike in website
visits around October 4th can be attributed to an article on Andrew Oswald’s
work published on the cnn.com news website, which linked to pages on the CAGE
website.
Content reports
The content report overview provides data regarding the specific pages (i.e.
website content) accessed during the selected time period (see Figure 6). As with
the audience report, the number of page views is measured in terms of the total
number (including returns to a previously viewed page) and the number of
unique page views. Based on the time between the page views within a visit, the
average time on the page is also presented for the given time period. The
overview also displays the bounce rate and the percentage exit. The bounce rate
is the percentage of single page visits to the website within the selected time
period (i.e. landing on the website and ‘bouncing off’ again). The percentage
exit is the percentage of page views that were the last in a visit (i.e. page
views that were not followed by another page view within the same website).
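These two measures can be computed from the ordered list of pages viewed in each visit. The following Python sketch follows the definitions given in the text; the visit data and function names are invented for the example.

```python
def bounce_rate(visits):
    """Percentage of visits that consisted of a single page view."""
    if not visits:
        return 0.0
    single_page = sum(1 for pages in visits if len(pages) == 1)
    return 100.0 * single_page / len(visits)


def exit_percentage(visits, page):
    """Percentage of views of `page` that were the last page view in a visit."""
    views = sum(pages.count(page) for pages in visits)
    exits = sum(1 for pages in visits if pages and pages[-1] == page)
    return 100.0 * exits / views if views else 0.0


# Each inner list is one visit, in the order the pages were viewed.
VISITS = [
    ["/"],
    ["/", "/research"],
    ["/research", "/", "/people"],
    ["/people"],
]
```

For VISITS the bounce rate is 50.0, because two of the four visits were single page visits. The percentage exit for "/research" is also 50.0, because one of its two page views ended a visit.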
Figure 6. A content report for the CAGE website at the University of Warwick for a one month time
period (i.e. October 2012).
Dashboards
As well as the themed reports, Google Analytics provides ‘dashboards’ as part of
the ‘home’ section of the Google Analytics website. A dashboard is a set of
reporting widgets that can be added, moved and deleted by the user in order to
create their own reports using any of the Google Analytics metrics. By default, an
account includes an initial ‘my dashboard’ (see Figure 7), which can be viewed,
edited and deleted; additional dashboards can also be created.
Figure 7. A (default) dashboard report for the CAGE website at the University of Warwick for a one
month time period (i.e. October 2012).
Overviews and specifics
The Google Analytics reports are live and interactive web pages that can be used
to generate monthly or annual reports. These can be archived as (static) reports,
but one of their strengths is that they provide interactive visualisations of the
recorded user data, which can be actively explored. For example, the site content
reports in the content area can be presented as a line graph showing changes in a
selected metric over a chosen reporting period (see Figure 6), or they can be
‘played’ as frame animations of scatterplots or bar charts showing daily changes
in the metrics over the selected time period (see Figure 8).
The user can configure the visualisations included in the reports. For example,
they can select the metrics plotted on the y axis of the timeline graphs; in the
scatterplot, the user can select the metrics plotted on the x and y axes, and the
metrics to be represented by the point colour and size; and for the bar charts, the
user can select the metrics plotted on the two axes and the bar colour. Individual
data values are also displayed in each of the visualisations when the user moves
the cursor over each data item.
Figure 8. Two interactive visualisations from the ‘All Pages’ section of the content report for the
CAGE website at the University of Warwick for a one month time period (i.e. October 2012).
A further example of the ways in which the Google Analytics reports can be
explored interactively is shown in Figure 9. This screen image shows a visitor
flow report, which illustrates the order of page views through the website for a
selected segment of the visitors (such as the country / territory demographic).
Moving the mouse cursor over the banded links between the blocks of the
diagram reveals the number of visits and the percentage of total traffic that took
that path through the website. Although this is a complex diagram, it gives an
overview of how the website is being navigated, and can therefore be a useful
source of evidence for exploring the effect of changes to the website structure or
navigation mechanisms.
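The counts behind a flow diagram of this kind can be sketched in Python by tallying the steps from one page to the next within each visit. This is an illustration of the idea only, using invented visit data; it is not the method Google Analytics uses to build its report.

```python
from collections import Counter


def flow_counts(visits):
    """Count, for each (from_page, to_page) step, the visits that took it."""
    counts = Counter()
    for pages in visits:
        steps = set(zip(pages, pages[1:]))  # count each step once per visit
        counts.update(steps)
    return counts


def flow_percentages(visits):
    """Express each step as a percentage of all visits."""
    total = len(visits)
    if total == 0:
        return {}
    return {step: 100.0 * count / total for step, count in flow_counts(visits).items()}


# Each inner list is one visit, in the order the pages were viewed.
FLOW_VISITS = [
    ["/", "/research"],
    ["/", "/research", "/people"],
    ["/", "/people"],
    ["/research"],
]
```

In FLOW_VISITS the step from "/" to "/research" was taken in two of the four visits, which is 50.0 percent of the traffic.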
Figure 9. An interactive visualisation from the ‘Visitors Flow’ section of the audience report for the
CAGE website at the University of Warwick for a one month time period (i.e. October 2012).
Summary and further information
While Google Analytics reports can be used to evidence the impact of research
with regard to reach, additional process information regarding how the research
work has been used or the changes that have happened as a result of the
research for specific groups is needed to evidence significance. So, although
Google Analytics does not currently provide a complete view of research impact,
the information on user behavior captured through web analytics can help
identify the demographics, interests and access routes of those engaging with the
research through the web. These in turn can be used to help guide and inform
how the significance of the research impact could be monitored and facilitated.
Interest in web analytics has been increasing since the creation of the web.
Specialist research communities are now emerging around application areas,
such as learning analytics (e.g. the Society for Learning Analytics Research [3]) and
research analytics (e.g. Altmetrics [4] and ImpactStory [5]). How analytic tools might
be adapted or extended through these initiatives is difficult to predict.
Nonetheless, the tracking and analysis of online behavior as part of evaluating
research dissemination, public engagement and knowledge transfer is clearly an
important aspect of recognizing the role the web plays in research, and of
informing how researchers and research institutions use the web to share their
research and engage with publics.
[3] The Society for Learning Analytics Research (SoLAR) is an international
network of researchers exploring how analytics can be used for teaching,
learning, training and development. For further details see
http://www.solaresearch.org (last accessed November 2012).
[4] Altmetrics refers to the development and use of social web metrics for
analysing and informing scholarship. For further details see
http://altmetrics.org/manifesto (last accessed November 2012).
[5] ImpactStory is a toolset for altmetrics. For further details see
http://impactstory.org (last accessed November 2012).
Online resources
The following resources provide information and support on web analytics and
their application (links last accessed November 2012).
The history of web analytics
Brice Bottegal blog – Definition and history of web analytics (March 2012)
http://en.bricebottegal.com/definition-history-web-analytics
Web analytics and usability blog – A brief history of web analytics (November
2010)
http://blog.clicktale.com/2010/11/17/a-brief-history-of-web-analytics
Online video tutorials
Google Analytics Walkthrough (February 2012)
http://www.youtube.com/watch?v=XZDUWd_ezcI
Google Analytics - Getting Started with Google Analytics (May 2012)
http://www.youtube.com/watch?v=l9joLoZOjK4
Google Analytics information and reference materials
Google Analytics Blog
http://analytics.blogspot.co.uk
Google Analytics Help Centre
http://support.google.com/analytics
Google Analytics YouTube channel
http://www.youtube.com/googleanalytics