Transcript Document

Sustainability: Web Site Statistics
Marieke Napier
UKOLN
University of Bath
Bath, BA2 7AY
UKOLN is supported by:
Email: [email protected]
URL: http://www.ukoln.ac.uk/
Web Site Statistics
This presentation will:
• Give a (very) brief overview of what Web
statistics are
• Consider why we need them
• Focus on the analysis of usage data created
by your Web site
• Look at what other criteria, besides Web
server statistics, can be used to provide
performance indication
Sustainability Workshop, London, 16 May 2002
What are Web Statistics?
• Web statistics are produced by the Web
server software
• Information (such as IP address, name of
resource) is recorded in log files
• It is also possible to configure your server to
record more information (such as referrer
details)
• The log files produced are mostly accurate
• However, interpretation of the statistics can be
misleading
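As a rough illustration of what a log entry holds, the sketch below parses one line into named fields. It assumes the server writes the Apache-style Combined Log Format (which includes the referrer); the slides do not say which format is in use, and the sample line, host and URLs are invented for the example.

```python
import re

# Assumed layout (Combined Log Format):
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<resource>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Invented sample entry, for illustration only.
sample = ('192.0.2.10 - - [16/May/2002:10:15:32 +0100] '
          '"GET /index.html HTTP/1.0" 200 5120 '
          '"http://www.example.org/links.html" "Mozilla/4.0"')

record = parse_line(sample)
print(record["host"], record["resource"], record["referrer"])
```

The referrer and user-agent fields are optional in the pattern, so lines from a server that has not been configured to record them should still parse, with those fields set to None.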
Why do we Need Them?
• They indicate how popular your site is
• They show how successful your marketing strategy
has been
• They can be used in management reports
• They can identify gaps in service provision
• They help you predict and plan for future load patterns
• They allow you to monitor performance levels
• They can be used in consideration of deployment of
new technologies
• They can inform and motivate contributors
• They can show who your users are
• NOF have asked for them
The HTTP Process
• A user clicks a link or enters a URL
• The browser downloads the HTML page from the
remote Web server
• The HTML page is interpreted and any inline
objects are also downloaded:
– Each image (occurrence of <IMG SRC="image1">)
– Background image or sound
– External JavaScript or stylesheet files etc.
• The user follows a path through the site making
new requests till they leave your site
Summary
Each individual user's request for a page can produce multiple
requests at the remote server and generate multiple hits.
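To make the summary concrete, the sketch below estimates how many hits a single page view could generate by counting the inline objects referenced in its HTML. The sample page and file names are invented, and the count is an approximation: it assumes the browser fetches each distinct object once and has nothing cached.

```python
from html.parser import HTMLParser


class InlineObjectCollector(HTMLParser):
    """Collect URLs of inline objects that a browser would also request."""

    def __init__(self):
        super().__init__()
        self.objects = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.objects.append(attrs["src"])
        elif tag == "body" and attrs.get("background"):
            self.objects.append(attrs["background"])
        elif (tag == "link" and attrs.get("href")
              and (attrs.get("rel") or "").lower() == "stylesheet"):
            self.objects.append(attrs["href"])


def estimate_hits(html):
    """One hit for the page itself plus one per distinct inline object."""
    collector = InlineObjectCollector()
    collector.feed(html)
    return 1 + len(set(collector.objects))


# Invented sample page, for illustration only.
page = """
<html><head>
<link rel="stylesheet" href="style.css">
<script src="menu.js"></script>
</head>
<body background="bg.gif">
<IMG SRC="image1.gif"><IMG SRC="image2.gif"><IMG SRC="image1.gif">
</body></html>
"""

print(estimate_hits(page))
```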
Viewing Web Statistics
• Server log files are available
to view…but may not make
a lot of sense on first look
• The Analog program
(Cambridge University) was
one of the first packages to
provide a graphical
summary of Web log files.
http://www.statslab.cam.ac.uk/~sret1/stats/stats.html
Web Statistics: Terms Used
Hit
• Any information requested from a site - this
includes HTML pages, pictures, forms, scripts and
files downloaded
• Can be affected by redesign, robots, caching etc.
Page Views (or requests/impressions)
• The number of pages viewed
• Usually identified by extensions such as .htm, .html, .asp etc.
User Sessions
• Series of requests from a unique IP address within a
period of time (more accurate if users are registered)
• Issues with firewalls, institutional caches etc.
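The three terms can be compared on a small set of log records. The following is a minimal sketch, not how any particular analysis package works: the 30-minute session timeout is an assumed value, the extension list is taken from the slide, and the records are invented.

```python
from datetime import datetime, timedelta

PAGE_EXTENSIONS = (".htm", ".html", ".asp")
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed cut-off between sessions


def summarise(records):
    """Count hits, page views and user sessions.

    records: iterable of (ip, timestamp, resource) tuples in time order.
    """
    hits = page_views = sessions = 0
    last_seen = {}  # ip -> time of that address's most recent request
    for ip, when, resource in records:
        hits += 1  # every request is a hit
        path = resource.split("?", 1)[0].lower()
        if path.endswith(PAGE_EXTENSIONS):
            page_views += 1  # only page-like resources count as views
        previous = last_seen.get(ip)
        if previous is None or when - previous > SESSION_TIMEOUT:
            sessions += 1  # first request, or a long gap, starts a session
        last_seen[ip] = when
    return {"hits": hits, "page_views": page_views, "user_sessions": sessions}


# Invented sample records, for illustration only.
start = datetime(2002, 5, 16, 10, 0)
records = [
    ("192.0.2.10", start, "/index.html"),
    ("192.0.2.10", start + timedelta(seconds=1), "/logo.gif"),
    ("192.0.2.10", start + timedelta(minutes=2), "/about.htm"),
    ("192.0.2.20", start + timedelta(minutes=5), "/index.html"),
    ("192.0.2.10", start + timedelta(minutes=45), "/index.html"),
]

print(summarise(records))
```

Because sessions are keyed on IP address alone, several users behind one firewall or institutional cache would be merged into a single session here, which is the issue noted in the last bullet.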
Interpretation Issues
Profiling users - can we track users easily?
• You can’t tell the exact identity of your users
• Using IP addresses, domain names of visitors
• Following paths – entering and exiting the site
• Registration
Caching
• Browser caching and institutional/ISP caching
Robots
• Necessary to enable your resources to be found
• Robots generate hits
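One way to see how much of the traffic comes from robots is to separate requests by user-agent string. This is only a sketch: it assumes the user-agent is recorded in the log, the marker list is illustrative rather than exhaustive, and robots that do not identify themselves will still be counted as human visitors.

```python
# Illustrative substrings only; real robot lists are much longer.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_robot(user_agent):
    """Guess whether a user-agent string belongs to a robot."""
    agent = (user_agent or "").lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def split_hits(records):
    """Split (resource, user_agent) pairs into human and robot requests."""
    human, robot = [], []
    for resource, agent in records:
        (robot if is_robot(agent) else human).append(resource)
    return human, robot


# Invented sample records, for illustration only.
records = [
    ("/index.html", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
    ("/robots.txt", "Googlebot/2.1"),
    ("/index.html", "Googlebot/2.1"),
]

human, robot = split_hits(records)
print(len(human), "human hits,", len(robot), "robot hits")
```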
Quality??
Log Analysis Tools
There are many tools available:
• Analog: free and easily automated. However, it has few data-mining capabilities and its management graphs are limited.
• WebTrends: Popular desktop package. Several
versions. May be expensive for reporting on multiple
Web sites.
• Webalizer, WebVisit, HitList, Reportmagic etc.
• A list is available at
http://uk.dir.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/Log_Analysis_Tools/
Externally-Hosted Services
Two services have been used
extensively by UKOLN:
SiteMeter and NedStat
http://www.sitemeter.com/
• Advantages:
– No software to buy, install,
configure and run or powerful
PC to run software on
– No log files to manage
– Uses "cache-busting" images
– Can monitor extra features
• Disadvantages:
– Limited data-mining
– Loss of ownership of data
– Dependency on external service
– Fails to monitor text browsers
http://www.nedstats.com/
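The "cache-busting" images listed among the advantages rely on giving every page view a unique image URL, so that browser and proxy caches cannot answer the request themselves. The sketch below shows the general idea only; the counter URL and parameter names are hypothetical and are not those used by SiteMeter or NedStat.

```python
import random
import time
from urllib.parse import urlencode


def counter_image_tag(counter_url, page):
    """Build an <img> tag whose URL changes on every call.

    counter_url and the 'page'/'nocache' parameter names are hypothetical.
    """
    # Time plus a random number makes the URL effectively unique per view.
    token = f"{int(time.time())}-{random.randint(0, 999999)}"
    query = urlencode({"page": page, "nocache": token})
    return f'<img src="{counter_url}?{query}" width="1" height="1" alt="">'


tag = counter_image_tag("http://counter.example.org/hit.gif", "/index.html")
print(tag)
```

Because the count depends on an image being fetched, a text-only browser never triggers it, which is the last disadvantage listed above.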
Other Performance Indicators
Links to Your Site
• Indicators that people are interested in your
service (and can deliver traffic)
Search Engines Coverage
• Indicators that users can find resources on
your Web site
User Feedback
• Comments, voting, etc.
Technical Indicators
• Browser support, broken links, server-uptime,
etc.
Links To Your Site
• Links are an indication of
potential use of your Web site
• Search engines can be used
to report on the numbers of
links to a Web site
• LinkPopularity.com provides
an interface to 3 search
engines
• Monthly reports can be
obtained
http://www.linkpopularity.com
Coverage By Search Engines
• Have you promoted your
Web site?
• Can your Web site be
accessed by search
engines?
• Are you near the top of
the search results?
• Search engines can
report on their coverage
of your Web site
• Coverage is an indication
of potential use of your
Web site
For information on how to ensure that your web site has been
indexed see the section on Promotion of your Project Web
Technical Indicators
Broken Links
• How many links are there on your Web site
(internal and external)?
• How many broken links are there?
• Use services like linkalarm.com
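For a single page, the broken-link count can be approximated with a short script. This is a minimal sketch and not a description of how linkalarm.com works: the function names and the sample page are invented, only <a href> links are followed, and the example run uses a stand-in checker so that it needs no network access.

```python
import http.client
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def is_reachable(url, timeout=10):
    """Return True if the URL can be fetched without an error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (OSError, ValueError, http.client.HTTPException):
        # URLError and HTTPError are subclasses of OSError.
        return False


def find_broken_links(html, base_url, check=is_reachable):
    """Return the absolute URLs linked from html for which check() fails."""
    collector = LinkCollector()
    collector.feed(html)
    absolute = sorted({urljoin(base_url, href) for href in collector.links})
    return [url for url in absolute if not check(url)]


# Invented sample page; the lambda stands in for a real network check.
page = ('<a href="/ok.html">ok</a> <a href="missing.html">gone</a> '
        '<a href="http://other.example.com/">external</a>')
broken = find_broken_links(page, "http://www.example.org/dir/page.html",
                           check=lambda url: "missing" not in url)
print(len(broken), "broken link(s):", broken)
```

Leaving out the check argument makes the function use is_reachable, which performs real requests against each link.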
Server Availability
• Recording down time
• Email alerting
• Use services like InternetSeer.com
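Recorded down time can be turned into a simple availability figure. The sketch below assumes the server is polled at regular intervals and that each check is stored as a (timestamp, is_up) pair; the 10-minute interval and the sample results are invented, and this is not how InternetSeer.com reports its data.

```python
from datetime import datetime, timedelta


def summarise_availability(checks):
    """Return (percentage of successful checks, list of outages).

    checks: list of (timestamp, is_up) pairs in time order.
    Each outage is (first failed check, first later successful check),
    or (first failed check, None) if the server was still down at the end.
    """
    outages = []
    outage_start = None
    for when, is_up in checks:
        if not is_up and outage_start is None:
            outage_start = when  # first failed check of an outage
        elif is_up and outage_start is not None:
            outages.append((outage_start, when))  # server is back
            outage_start = None
    if outage_start is not None:
        outages.append((outage_start, None))
    up_count = sum(1 for _, is_up in checks if is_up)
    percent_up = 100.0 * up_count / len(checks) if checks else 0.0
    return percent_up, outages


# Invented sample: six checks, ten minutes apart, two of them failing.
start = datetime(2002, 5, 16, 9, 0)
results = [True, True, False, False, True, True]
checks = [(start + timedelta(minutes=10 * i), ok)
          for i, ok in enumerate(results)]

percent_up, outages = summarise_availability(checks)
print(f"{percent_up:.1f}% of checks succeeded, {len(outages)} outage(s)")
```

An email alert could be sent at the point where an outage is first detected, which corresponds to the branch that sets outage_start.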
Conclusions
• Web statistics can be difficult to interpret
• Analysis of Web statistics is needed for lots of
reasons
• Think about the tools you will need (and the
resource implications in using them)
• Besides analysis of log files there are other
performance indicators which may be of use
• Analysis will also help in monitoring the
performance of your Web site and planning future
developments
Any Questions?
This presentation is loosely based on the
Information Paper on Web Site Performance
Monitoring available at:
http://www.ukoln.ac.uk/nof/support/help/papers/performance/