Understanding library users you don't see
Techniques for tracking and analyzing library Web resources
Marshall Breeding
Director for Innovative Technologies and Research
Vanderbilt University
http://staffweb.library.vanderbilt.edu/breeding
Saturday June 24
Theme
For many libraries, the number of visitors to their Web site and
electronic resources exceeds the number who visit their physical
premises. It's vital for libraries to understand how these remote
visitors approach the Web site, not only to measure use but to
improve the resources themselves. Marshall Breeding will present a
number of practical techniques that libraries can use to better
understand the use of their Web-based resources.
Topics will include the basics of analyzing the server logs of the
library's Web site, transaction logs from the OPAC, the complexities
of measuring use of subscription-based electronic resources, and
techniques for enhancing applications to better record how they are
used.
Understanding remote users
Vital to providing relevant library services
More patrons may use library resources
remotely through the Web than from
physical library facilities
Must work harder to ensure that Web-based services meet patron needs
Move beyond hit counters and raw
statistics to more sophisticated analysis
and assessment
Analysis goals
Improve usability
Web site diagnostics
Understand user needs
Content selection decisions
Improve quality of service
Marketing
Budget justification
Strategy to increase interest and activity
Data sources for tracking remote use
Web server logs
Application logs
Remote tracking data (Google Analytics)
Vendor-provided use statistics (e-resources)
Enterprise approach to analytics
Multiplicity of Resources to track
Web Servers
OPACs
E-Resources
Databases
Repositories
Important to track the flow of use among all the library’s
Web-based resources
Beyond the library: study flow to and from higher-level
Web sites and portals (University -> Courseware ->
Library)
Web server logs
Web servers are routinely configured to record
detailed information about each request.
Common elements include:
File requested
Date / time stamp
Status code
Request method (GET, POST, HEAD)
Referrer (where the user came from)
User agent (browser and platform data)
Example Web log
Raw data for analysis process
2006-06-20 05:01:43 129.59.150.105 GET /index.pl - 80 - c-69-250-131199.hsd1.md.comcast.net
Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322)
http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive
200 0 0 11752
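A minimal sketch of how a log entry like the one above could be split into named fields. The field names are assumptions inferred from the sample's layout (they are not given in the slides), and the meaning of the three numbers after the status code is a guess; a real log normally declares its own field order.

```python
# Field names are ASSUMED from the layout of the sample entry, not taken
# from the slides. In particular the last three names are guesses.
FIELDS = [
    "date", "time", "server_ip", "method", "uri_stem", "uri_query",
    "port", "username", "client_host", "user_agent", "referrer",
    "status", "substatus", "win32_status", "bytes",
]

SAMPLE = (
    "2006-06-20 05:01:43 129.59.150.105 GET /index.pl - 80 - "
    "c-69-250-131199.hsd1.md.comcast.net "
    "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322) "
    "http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive "
    "200 0 0 11752"
)


def parse_log_line(line):
    """Split one space-delimited log entry into a dict of named fields."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    # The user agent is logged with '+' in place of spaces.
    record["user_agent"] = record["user_agent"].replace("+", " ")
    record["status"] = int(record["status"])
    return record


record = parse_log_line(SAMPLE)
print(record["method"], record["uri_stem"], record["status"])
```

Once each entry is a dictionary, counting by status code, page, or referrer becomes a matter of grouping on one key.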
Exploiting referral data
The query string component of the referrer can
be parsed to reveal search terms and other
interesting information
http://www.google.com/search?hl=en&lr=&safe=off&q=september+11+television+archive
User typed “september 11 television archive” in Google to find our site
Important to study how users get to your site
(Example: TV News Public Web queries vs. OpenWeb)
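As an illustration of parsing the referrer's query string, the sketch below uses Python's standard `urllib.parse` module. The function name `search_terms` and the default parameter name `q` are assumptions for this example; other search engines may carry the query in a different parameter.

```python
from urllib.parse import urlsplit, parse_qs


def search_terms(referrer, param="q"):
    """Return the search phrase carried in a referrer URL, or None.

    `param` is the query-string key holding the search terms; "q" matches
    the Google referrer shown on the slide and is an assumption elsewhere.
    """
    query = urlsplit(referrer).query
    values = parse_qs(query).get(param, [])
    return values[0] if values else None


referrer = (
    "http://www.google.com/search?hl=en&lr=&safe=off"
    "&q=september+11+television+archive"
)
print(search_terms(referrer))
```

`parse_qs` decodes the `+` characters back into spaces, so the result is the phrase as the user typed it.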
Analysis methodology
Go beyond simply counting pages
Identify Sessions
Categorize users
Determine use patterns
Measure interest
Time spent on Web site
Bounce rate
Page overlay analysis
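The session and bounce-rate steps above can be sketched as follows. The 30-minute inactivity timeout is a common convention and an assumption here, not a figure from the slides; the function names and the `(visitor, timestamp, page)` input shape are likewise hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# ASSUMPTION: a gap of more than 30 minutes starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def build_sessions(requests):
    """Group (visitor, timestamp, page) tuples into per-visitor sessions.

    Returns a list of sessions; each session is a list of (timestamp, page).
    """
    by_visitor = defaultdict(list)
    for visitor, timestamp, page in requests:
        by_visitor[visitor].append((timestamp, page))

    sessions = []
    for hits in by_visitor.values():
        hits.sort()
        current = [hits[0]]
        for hit in hits[1:]:
            if hit[0] - current[-1][0] > SESSION_TIMEOUT:
                sessions.append(current)
                current = [hit]
            else:
                current.append(hit)
        sessions.append(current)
    return sessions


def bounce_rate(sessions):
    """Fraction of sessions that consist of a single page request."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if len(s) == 1) / len(sessions)


def session_duration(session):
    """Time between the first and last request of a session."""
    return session[-1][0] - session[0][0]
```

Time on site measured this way undercounts, because the time spent reading the last page of a session is never recorded in the log.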
Move from measurement to impact
Establish site goals
Benchmark current use
Implement goal oriented improvements
Measure impact
Repeat as needed
(Example: enhancement of TV News
OpenWeb)
Appropriate data filtering
Requests from indexing bots (crawlers) can
skew statistics
Count user requests and bot requests
separately
Performance monitors
Link checkers
Monitoring crawler activity is an important
component of SEO and Web site discoverability
strategies.
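One simple way to count user and bot requests separately is to match on the user-agent string, as in this sketch. The marker list is hypothetical and deliberately short; substring matching will miss bots that disguise themselves and needs tuning against a site's own logs.

```python
from collections import Counter

# HYPOTHETICAL markers; extend from the user agents seen in your own logs.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp", "linkcheck", "monitor")


def classify(user_agent):
    """Label a request as 'bot' or 'user' from its user-agent string."""
    agent = user_agent.lower()
    return "bot" if any(marker in agent for marker in BOT_MARKERS) else "user"


def count_by_kind(user_agents):
    """Tally bot and user requests separately."""
    return Counter(classify(agent) for agent in user_agents)


agents = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)",
]
print(count_by_kind(agents))
```

Keeping the bot tally rather than discarding it preserves the crawler activity data that the discoverability work depends on.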
Resource Discovery
How do users get to your site?
Track performance of the Web site relative
to major search engines
SEO – Search engine optimization
Few users begin with library Web sites
Troubling statistic
Where do you typically begin your
search for information on a
particular topic?
College Students Response:
89% Search engines (Google 62%)
2% Library Web Site (total respondents -> 1%)
2% Online Database
1% E-mail
1% Online News
1% Online bookstores
0% Instant Messaging / Online Chat
OCLC. Perceptions of Libraries and Information Resources
(2005) p. 1-17.
Library Discovery Model
Web
Library Web Site / Catalog
Library as search Destination
TV News OpenWeb project
Dramatic increase in Web site activity and
loan requests through systematic and
controlled exposure of metadata to Google
and other search engines
SEO (Search Engine Optimization)
strategy
Helped the Archive become financially
self-sufficient.
Examples of Web reporting and analysis tools
Selected utilities
Analog – free, open source
NetTracker – enterprise level Web analysis
application
Google utilities
Sitemaps – process for submitting Web pages for optimized indexing by Google, with some assessment capabilities
Analytics – sophisticated approach for measuring Web site performance
Analog
Free Open Source application
Basic Web statistics application
Includes fairly full set of static metrics
Command line utility – generates Web
report
Windows, Unix, Linux, etc.
NetTracker
Unica Corporation
Enterprise level Web analytics
http://www.sane.com/
NetTracker Executive Dashboard
NetTracker Bandwidth Trends
NetTracker Content
NetTracker Keyword Summary
NetTracker Referrers
NetTracker Pages Viewed
Google SiteMaps
XML specification for systematically
submitting URLs that represent a Web site
Makes indexing more efficient but does
not affect PageRank
SiteMap interface provides utilities for
monitoring how the site has been indexed
with some analytical information on terms
used to find your Web site.
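A minimal sketch of generating such an XML file with Python's standard library. It uses the sitemaps.org 0.9 namespace, which is an assumption: that protocol was published after this presentation, and the schema Google used at the time differed. The URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: sitemaps.org 0.9 namespace, not the schema in use in 2006.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs for illustration only.
print(build_sitemap([
    "http://library.example.edu/",
    "http://library.example.edu/record?id=1&view=full",
]))
```

ElementTree escapes characters such as `&` in the URLs, which the sitemap format requires.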
Google SiteMaps Top Searches
Google SiteMaps Page Analysis
Google Analytics
Available at no cost from Google
Must receive invitation code
Slanted toward e-commerce
“Conversion University” – training on how to
optimize Web site for high conversion rates.
Allows Webmasters to establish site goals and
measure performance
Google Analytics main
Google Analytics overview
Google Analytics Browser Versions
Google Analytics Top Content
Google Analytics Entrance-Bounce Rates
Google Analytics Navigational Analysis
Google Analytics Goal tracking
Application-level reporting and analysis
Content management systems and other
dynamically driven Web environments can
provide additional usage information.
Can offer additional information beyond raw
Web logs
More capabilities for identifying use based on
user categories
Reporting can be built into the business logic of
the application
Examples from the TV News Web Site
Reports of use by user category and institution
Statistics on resource use
Data on search types, query terms, etc.
Ability to track all aspects of business activity
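To illustrate reporting built into the application itself, here is a small sketch of an in-memory usage log that records who did what and summarizes by user category or institution. The class, its methods, and the category values are all hypothetical; they are not the TV News site's implementation, and a production system would write to a database instead.

```python
from collections import Counter


class UsageLog:
    """HYPOTHETICAL application-level usage recorder (in memory only)."""

    def __init__(self):
        self.events = []

    def record(self, user_category, institution, action, detail=""):
        """Store one event, e.g. a search together with its query terms."""
        self.events.append({
            "user_category": user_category,
            "institution": institution,
            "action": action,
            "detail": detail,
        })

    def count_by(self, field):
        """Tally events by any recorded field, such as 'institution'."""
        return Counter(event[field] for event in self.events)


log = UsageLog()
log.record("faculty", "Example University", "search", "september 11")
log.record("student", "Example University", "loan_request")
log.record("faculty", "Other College", "search", "election coverage")
print(log.count_by("user_category"))
```

Because the application knows the authenticated user, it can record categories that a raw Web server log cannot supply.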
Other sources of Use data
ILS OPAC Logs
Proxy Server logs and reports
Link resolver logs and reports
Limitations
Can’t know the intent of the user
User success can only be estimated
Difficult to obtain trends by user type
More aggressive reporting might intrude on
privacy
Few libraries require the level of user
authentication needed to determine use by type
of patron
Additional Information
Breeding, Marshall. Strategies for Measuring
and Implementing E-use. ALA TechSource. May-June 2002. 79 pages.
Breeding, Marshall. “Analyzing Web server logs
to improve a site’s usage.” Computers in
Libraries. Information Today. Medford, CT.
October 2005.
Handout
Presentation will be available after the
conference at:
http://staffweb.library.vanderbilt.edu/breeding/presentations/ala2006.ppt