Promoting Your Project Web Site
Brian Kelly
UK Web Focus
UKOLN
University of Bath
Bath, BA2 7AY
England
Email
[email protected]
URL
http://www.ukoln.ac.uk/
Project Manager for
Exploit Interactive web magazine
http://www.exploit-lib.org/
UKOLN is funded by the Library and Information Commission, the Joint Information
Systems Committee (JISC) of the Higher Education Funding Councils, as well as by
project funding from the JISC and the European Union.
UKOLN also receives support from the University
of Bath where it is based.
Approaches
What approaches can we take to raising the
profile of our web site?
• Tell our friends and colleagues (at conferences
in exotic places)
• Give away pens and bags
• Let it happen automatically
• Submitting resources
• Perhaps giving parts of our web site away?
Automated Indexing
Many users use search engines such as
AltaVista, HotBot, Northern Light, etc. to find
resources.
Issues:
• Will my site be indexed?
• Will it be near the top of a sensible search query?
• How can I improve things?
Problems in Being Indexed
Size of Index
Search engines are failing to keep up with the
growth of the web
Not all pages on a web site will be indexed
Typically a 500 page sample will be indexed
Frames (and "splash screens")
Many indexing robots can't access framed
web sites or web sites which use "splash
screens"
Improving Indexing of
Key Resources
How to ensure that quality pages are indexed:
• Don't publish non-work pages on the server
• Move from a single large institutional server to
multiple (real or virtual) servers:
Instead of <www.ukoln.ac.uk/exploit/>
use <exploit.ukoln.ac.uk/> or (even better)
<exploit-lib.org/>
• Avoid use of frames (or provide link to alternative
entry point)
These approaches will improve chances of
more complete indexing of the web site
Improving Indexing (2)
Do you know if your project web site uses the Robots
Exclusion Protocol (REP) - a /robots.txt file?
# Following apply to all robots
User-agent: *
# Don't index /cgi-bin directory
Disallow: /cgi-bin/
# Don't index /tmp directory
Disallow: /tmp/
Use the REP to:
• Prevent junk (old or draft versions,
experimentation, etc) from being indexed
Check your /robots.txt file to:
• Ensure that your web site can be indexed
Tools are available to help you manage the robots.txt file. For
example RoboGen: <http://www.rietta.com/robogen/>
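One way to check how a robot would read a /robots.txt file is Python's standard urllib.robotparser module. The following is a minimal sketch, assuming the two Disallow rules shown on this slide; the function name allowed, the robot name ExampleBot and the paths tested are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the slide. In practice you would fetch the live
# /robots.txt file from your own server instead.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
"""


def allowed(path, agent="ExampleBot"):
    """Return True if the (hypothetical) robot `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths: one public area and two excluded areas.
    for path in ("/exploit/", "/tmp/draft.html", "/cgi-bin/search"):
        print(path, "indexable" if allowed(path) else "excluded")
```

Running a check like this before publishing a new /robots.txt file helps to confirm that key resources have not been excluded by mistake.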
Improving Indexing (3)
Updating the /robots.txt file may be difficult.
The (new) <META> feature allows HTML authors to
control robots.
<META NAME="robots" CONTENT="noindex, nofollow">
Use this in key menu pages for resources you don't
want indexed.
[Diagram: site areas labelled deliverables, reports, draft and personal]
See <http://info.webcrawler.com/mak/projects/robots/meta-user.html> and
<http://www.kollar.com/robots.html>
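To audit which pages carry a robots <META> element, the page source can be scanned with Python's standard html.parser module. This is a sketch only: the class and function names are hypothetical, and the sample page is the element shown on this slide wrapped in a minimal document.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of any <META NAME="robots"> element."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


PAGE = ('<HTML><HEAD><META NAME="robots" CONTENT="noindex, nofollow">'
        '</HEAD><BODY>Draft</BODY></HTML>')

if __name__ == "__main__":
    print(sorted(robots_directives(PAGE)))
```

An empty result means the page places no restrictions on robots through this mechanism.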
Some Solutions (3)
Getting Your Web Site Indexed (cont)
Several search engines allow URLs to be submitted.
Turnaround time ranges from a few days to several months.
And what about bulk submission services?
Some Solutions (4)
Some Submission Engines
http://www.webposition.com/
http://www.netsubmitter.com/
http://www.registerpro.com/
http://www.pegasoweb.com/engenius/
http://www.exploit.com/wizard/
There are products for submitting sites to multiple search engines
(and analysing your pages, reporting on your position in
search engines, etc.). But:
• How good are they?
• How ethical are they?
• How cost-effective are they?
Has It Worked?
How do you know if robots are visiting your web site?
The free BotWatch Perl program will analyse your log files
and generate a report on visits by robots.
BotWatch is available at
<http://www.tardis.ed.ac.uk/~sxw/robots/botwatch.html>
See also <http://www.botspot.com/>
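The general idea of a log-based robot report can be sketched in a few lines of Python. This is not how BotWatch itself works; it is an illustration that assumes the server writes the Combined Log Format (user agent as the last quoted field) and that robots can be recognised by a hypothetical list of substrings in the user agent. The sample log lines are invented.

```python
import re
from collections import Counter

# Assumption: Combined Log Format, i.e. each line ends with
# "referer" "user agent".
LOG_TAIL = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')

# Hypothetical substrings that mark a user agent as a robot.
ROBOT_HINTS = ("bot", "crawler", "spider", "scooter", "slurp")


def count_robot_visits(lines):
    """Return a Counter mapping robot user agents to request counts."""
    visits = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if not match:
            continue  # not in the assumed format; skip the line
        agent = match.group("agent")
        if any(hint in agent.lower() for hint in ROBOT_HINTS):
            visits[agent] += 1
    return visits


# Invented sample entries for illustration.
SAMPLE = [
    '10.0.0.1 - - [18/Nov/1999:10:00:00 +0000] "GET /robots.txt HTTP/1.0" 200 68 "-" "Scooter/2.0"',
    '10.0.0.2 - - [18/Nov/1999:10:00:05 +0000] "GET /issue3/ HTTP/1.0" 200 5120 "-" "Mozilla/4.0"',
    '10.0.0.1 - - [18/Nov/1999:10:00:09 +0000] "GET /issue3/ HTTP/1.0" 200 5120 "-" "Scooter/2.0"',
]

if __name__ == "__main__":
    for agent, hits in count_robot_visits(SAMPLE).most_common():
        print(agent, hits)
```

A request for /robots.txt is itself a useful signal: well-behaved robots fetch it before indexing anything else.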
Problems in Ranking
Typically large numbers of hits are obtained.
Metadata may help
<META NAME="keywords" CONTENT="exploit, web magazine, TAP, telematics">
<META NAME="description" CONTENT="Exploit Interactive is a ..">
<META NAME="DC.Title" CONTENT="Exploit ..">
But:
• "AltaVista" and Dublin Core metadata are not
supported by all (many?) search engines
• Issues about maintenance of metadata
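Metadata is easier to maintain if the <META> elements are generated from a single record rather than typed into each page by hand. The following is a minimal sketch, assuming the record is held as a Python dictionary; the function name meta_tags and the DC.Title value are hypothetical and are not taken from the Exploit Interactive site.

```python
from html import escape


def meta_tags(record):
    """Render a dict of metadata as <META> elements, one per line."""
    return "\n".join(
        '<META NAME="{}" CONTENT="{}">'.format(escape(name), escape(value))
        for name, value in record.items()
    )


# Keywords are those shown on the slide; the title is a placeholder.
RECORD = {
    "keywords": "exploit, web magazine, TAP, telematics",
    "DC.Title": "Exploit Interactive",  # hypothetical value
}

if __name__ == "__main__":
    print(meta_tags(RECORD))
```

Escaping the values matters because a stray quotation mark in a description would otherwise break the attribute.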
Some Solutions
Use of "AltaVista" metadata is a must for key pages
Use of Dublin Core:
• Could be used in specialist applications (domain-specific
search engines, current awareness services, B2B, etc.)
• Think about additional benefits to you (e.g. local searching,
auditing)
• Scope for discussions with search engine vendors?
• Need to think about deployment and maintenance
The Exploit Interactive web magazine uses Dublin Core metadata
to enhance local searching. The metadata can also be used by
3rd parties
Analysis of NFP Web Sites
Report of an analysis of NFP (National Focal
Point) web sites published in Exploit Interactive
issue 3. Of the 10 web sites:
• No significant use of metadata on main entry point
• Six made no use of REP, one disallowed all robots and three
made sensible use
• No use of separate domain names
• One framed site
http://www.exploit-lib.org/issue3/nfp-websites/
Web Directories
Web directories (e.g. Yahoo!) provide manually-compiled classifications of the web
Benefits to Projects:
• Additional place to be found
• "61% reach in UK Search engine market"
• Can be sensibly classified e.g. Ariadne magazine is in
<http://www.yahoo.co.uk/Reference/Libraries/Professional_Resources/Internet_in_Libraries/>
Problems:
• Time-consuming for cataloguers
• Entries can be submitted, but this can be time-consuming
• "..sub-domains have difficulties in getting into Yahoo!"
Compare:
www.ukoln.ac.uk/projects/eu/exploit/
www.ukoln.ac.uk/~exploit/
www.exploit-lib.org
www.ukoln-exploit.ac.uk
Submission to Web Directories
It might be worth submitting to web directories such as Yahoo!
Remember that the information will be processed by humans.
See <http://www.searchenginewatch.com/webmasters/>
Give Your Web Site Away
Another way to promote your web site is to give it away!
You could give away:
• Parts of the site to robots (e.g. metadata)
• Parts of the interface
• The entire site
You could give away the interface to:
• your local indexer
• a remote indexing service e.g. HotBot
See <www.ariadne.ac.uk/issue21/webwatch/>
Search interface embedded in Exploit Interactive article at
<http://www.exploit-lib.org/issue3/nfp-websites/>
Give Part of Your Site Away
OMNI gives an example of a site hosting remote search interfaces.
Enhances remote interface, but several issues.
See article at <http://www.ariadne.ac.uk/issue21/webwatch/> for discussion
http://www.omni.ac.uk/other-search/
Give Your Web Site Away
Why not have your web site mirrored? Mirrors in, say,
USA and Australia will help to promote your service.
Issues: Is your web site easily mirrored?
• Are relative URLs used?
• Do you use directory structures to delineate areas
of your web site?
• If you use server-side scripting for management
purposes, do you hide unusual URLs:
/issue1/mag-features.asp # Problems
/issue1/mag-features/default.asp
/issue1/mag-features/ # Usable on Unix
(also techniques such as Apache rewrites)
If your web site can't be mirrored, can it be preserved?
See AlertBox column at <http://www.useit.com/alertbox/990321.html>
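The question "Are relative URLs used?" can be tested mechanically. The sketch below lists links in a page that hard-code the site's own host name, since those would point back to the original server from a mirror. The class and function names and the sample page are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the values of HREF and SRC attributes."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def absolute_internal_links(html, site_host):
    """Return links that name `site_host` explicitly.

    These should normally be rewritten as relative URLs so that a
    mirror does not send readers back to the original server.
    """
    collector = LinkCollector()
    collector.feed(html)
    return [link for link in collector.links
            if urlparse(link).netloc.lower() == site_host.lower()]


# Invented page fragment for illustration.
PAGE = ('<A HREF="http://www.exploit-lib.org/issue1/">Issue 1</A> '
        '<A HREF="../issue2/">Issue 2</A> '
        '<A HREF="http://www.ukoln.ac.uk/">UKOLN</A>')

if __name__ == "__main__":
    for link in absolute_internal_links(PAGE, "www.exploit-lib.org"):
        print("Hard-coded internal link:", link)
```

Links to other sites are left alone; only links naming the site's own host are reported.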
Citation
Is your project web site address easy to remember?
Issues:
• Short domain names are a winner
• Short URLs are desirable (try to avoid org. structure)
• Try to cite directories (shorter and less ambiguous):
www.exploit.org/issue1/pride/article.htm (article.html, article.asp)
www.exploit.org/issue1/pride/ # pride/default.asp
• Very important for web site home page
• Try to avoid use of tilde (~)
• Avoid citing binary files (inaccessible, lack of metadata,
alternative versions, etc.)
"Promoting Web Site" Talk
Given on 18 Nov 1999
Slides: [HTML] – [PowerPoint]
Let's Not Forget Publications
Getting published in a web magazine (such as Exploit Interactive)
can have many benefits:
• Visibility to (variety of) readers
• Web magazine may submit its pages to search services
• Links in web magazine may be harvested
• Web magazine may be made available on CD-ROM, free text
system, etc.
• May submit its resources to search engines
http://www.exploit-lib.org/issue3/
Measuring Your Success
Link popularity is growing in importance as search engines
make use of citation analysis ("this site is best, as there are
lots of links to it" or "this site is linked to by important sites").
LinkPopularity.com lets you check on the number of sites
linking to your web site
"I tried [LinkPopularity.com], pointing out to a potential advertiser that EEVL had,
according to HotBot, 1099 sites linking to it, whilst there were only 18 sites linking to
their site, and suggested that what they needed was more exposure.
It seems to have worked, as they have agreed to buy an ad on the soon to be
released new design EEVL site."
Roddy McLeod, EEVL (posting to lis-elib list)
Don't Forget Your Stats
You will produce graphs of your web statistics
for project reports
Do the graphs indicate:
• A healthy growth
• Growth in the number of robots
• Growth in the wrong community
Look beneath the surface
Think about "enterprise analysis packages"
referer: "" # Entered directly
referer: "www.foo.fr/goodstuff/" # Followed link
If you record the referrer field you will be able to see the
links users follow to arrive at your web site
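Once the referrer field is being recorded, a short script can summarise where visitors come from. This is a sketch under the assumption that the server writes the Combined Log Format, in which the referrer is the second-to-last quoted field; the function name and the sample log lines are invented.

```python
import re
from collections import Counter

# Assumption: Combined Log Format, i.e. each line ends with
# "referer" "user agent".
LOG_TAIL = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')


def referrer_counts(lines):
    """Return a Counter of referrers; empty or "-" means entered directly."""
    counts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line)
        if not match:
            continue  # not in the assumed format; skip the line
        referer = match.group("referer")
        key = "(entered directly)" if referer in ("", "-") else referer
        counts[key] += 1
    return counts


# Invented sample entries for illustration.
SAMPLE = [
    '10.0.0.3 - - [18/Nov/1999:11:00:00 +0000] "GET / HTTP/1.0" 200 900 "" "Mozilla/4.0"',
    '10.0.0.4 - - [18/Nov/1999:11:00:02 +0000] "GET / HTTP/1.0" 200 900 "http://www.foo.fr/goodstuff/" "Mozilla/4.0"',
    '10.0.0.5 - - [18/Nov/1999:11:00:04 +0000] "GET / HTTP/1.0" 200 900 "http://www.foo.fr/goodstuff/" "Mozilla/4.0"',
]

if __name__ == "__main__":
    for source, hits in referrer_counts(SAMPLE).most_common():
        print(hits, source)
```

Grouping by referrer shows which external links actually bring users to the site, which the raw growth graphs do not.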
Universal Design
Many of the guidelines provided will have
additional benefits:
• Robots and people with disabilities (e.g. blind users)
have similar characteristics i.e. can't follow images,
may not be able to access framed sites, etc.
• Indexing programs may index ALT attributes in
<IMG> elements
• Sensibly-structured web sites can be more easily
archived and mirrored.
• Metadata for general resource discovery can be
reused for other applications (e.g. current
awareness services).
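Since both indexing programs and blind users rely on ALT attributes, it is worth checking pages for <IMG> elements that lack them. The following is a minimal sketch using Python's standard html.parser module; the class and function names and the sample fragment are hypothetical.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Record the SRC of every <IMG> with a missing or empty ALT."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # An empty ALT is flagged too: it gives an indexing robot
        # nothing to work with, even if it is valid for decoration.
        if not (attr.get("alt") or "").strip():
            self.missing.append(attr.get("src") or "(no src)")


def images_missing_alt(html):
    """Return the SRC values of images that have no useful ALT text."""
    checker = AltChecker()
    checker.feed(html)
    return checker.missing


# Invented page fragment for illustration.
PAGE = '<IMG SRC="logo.gif"><IMG SRC="map.gif" ALT="Site map">'

if __name__ == "__main__":
    for src in images_missing_alt(PAGE):
        print("No ALT text:", src)
```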
Conclusions
To conclude:
• There are approaches to the web site architectural
design which can help in promoting your project web
site, including:
– Project-specific domains
– Use of the robots.txt file
– Accessible web design
– Short URLs
– Metadata
• Once you have the correct architecture, you can
assist in the promotion process through various
submission tools
• Many of the solutions will have additional benefits
• Ideally the solutions will be implemented at the start
of the project!
• Dialogue with your server administrator is important
Further Information
Book Reviews
<http://www.hw.ac.uk/libWWW/irn/irn58/irn58d.html#recent>
<http://www.hw.ac.uk/libWWW/irn/irn59/irn59d.html#recent>
Search Engine Watch
<http://www.searchenginewatch.com/>
Deadlock
<http://www.deadlock.com/promote/>
Did-it
<http://www.did-it.com/>
VirtualPromote
<http://www.virtualpromote.com/promotea.html>
Pegasoweb
<http://www.pegasoweb.com/>
Yahoo!
<http://dir.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Information_and_Documentation/Site_Announcement_and_Promotion/>
Broadcaster – URL submission service
<http://www.broadcaster.co.uk/>
Submit-it – URL submission service
<http://www.submit-it.com/>