Transcript Chapter 7
Greenstone Internals
How to Build a Digital Library
Ian H. Witten and David Bainbridge
Macrolanguage and
Collection Information Database
To understand the structure of the runtime
system, you need to understand the following
terminology:
Macrolanguage – how all web pages are expressed
internally
Collection Information Database – records
information produced during the build operation for
use at runtime
Processes
Two processes:
• Receptionists – responsible for the user interface
• Collection Servers – abstractly handle the contents of a collection
Receptionists
Point of contact with the digital library
Accept user input, analyze it, and dispatch the request to the appropriate
collection server(s)
Different users can share a receptionist
Collection Servers
Interact with the data structures produced by the building process
Locate the requested information and return it to the receptionist for
transmission to the user
Different collections can share a collection server
Receptionists communicate with collection servers
through a defined protocol
NULL Protocol
Simplest, most common protocol
1 receptionist and 1 collection server run on the
same computer
All communication between the receptionist and
collection server goes through the protocol
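The arrangement above can be sketched in a few lines. This is only an illustration, assuming an in-process "protocol" object that forwards calls directly. The class names, the `calls` log, and the sample data are hypothetical and are not Greenstone's actual code.

```python
class CollectionServer:
    """Hypothetical collection server: looks up built data for one collection."""

    def __init__(self, documents):
        self.documents = documents  # document id -> text

    def get_document(self, doc_id):
        return self.documents.get(doc_id)


class NullProtocol:
    """Forwards each call directly to a server object in the same process."""

    def __init__(self, servers):
        self.servers = servers  # collection name -> CollectionServer
        self.calls = []         # record of every call, for illustration only

    def get_document(self, collection, doc_id):
        self.calls.append(("get_document", collection, doc_id))
        server = self.servers.get(collection)
        return None if server is None else server.get_document(doc_id)


class Receptionist:
    """Accepts a request and reaches the collection only via the protocol."""

    def __init__(self, protocol):
        self.protocol = protocol

    def handle(self, collection, doc_id):
        text = self.protocol.get_document(collection, doc_id)
        return text if text is not None else "not found"


protocol = NullProtocol({"demo": CollectionServer({"doc1": "Chapter 1 text"})})
receptionist = Receptionist(protocol)
print(receptionist.handle("demo", "doc1"))
print(receptionist.handle("demo", "missing"))
```

Because the receptionist holds only a protocol object, a protocol that talks to a remote server could in principle be swapped in without changing the receptionist.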
CORBA Protocol Implementation
Common Object Request Broker Architecture
Unified object-oriented paradigm
Allows processes on different computers to
communicate over the Internet
Processes can be implemented on different
computer platforms and in different
programming languages
Two software components
necessary for runtime system
Macro language
creates all pages in the user interface
macros – inline scripts; each macro name is textually replaced
by its content
Collection information database
communicates information about collections
between building and delivery phases
Reasons for Using Macros
• The interface comes in many different languages
• All text fragments are stored as macro definitions
• If a macro has no definition in a particular language, the
English version is substituted by default
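The English fallback can be pictured as a two-level lookup. The sketch below is an assumption about the behaviour, not the library's code. The macro names, the language codes, and the `lookup` helper are made up for illustration.

```python
# Hypothetical macro store: macro name -> {language code -> text}.
MACROS = {
    "_textsearch_": {"en": "Search", "fr": "Rechercher"},
    "_texthelp_": {"en": "Help"},  # no French version defined
}


def lookup(name, language, default="en"):
    """Return the macro text for a language, falling back to English."""
    versions = MACROS[name]
    return versions.get(language, versions[default])


print(lookup("_textsearch_", "fr"))
print(lookup("_texthelp_", "fr"))
```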
Information About Macros
Macro files end with the extension “.dm”
Macro names begin and end with an underscore
Macro content is defined using curly braces
Content can be plain text, html, macro names, etc.
Macros can contain conditionals
_if_(x,y,z)
x = condition (e.g. ne for not equal)
y = text to use if x is true
z = text to use if x is false
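As a rough illustration of textual replacement and of the conditional, the sketch below expands macro names with a regular expression and evaluates a condition of the form "left ne right". It is written in Python rather than in .dm syntax. The macro names, the `expand` and `macro_if` helpers, and the `eq` operator are assumptions, since the source mentions only `ne`.

```python
import re

# Hypothetical definitions, written in Python rather than in a .dm file.
MACROS = {
    "_greeting_": "Welcome to _collectionname_",
    "_collectionname_": "the demo collection",
}

MACRO_NAME = re.compile(r"_[A-Za-z]+_")


def expand(text, macros, depth=0):
    """Replace each macro name with its content, expanding nested macros."""
    if depth > 10:  # guard against definitions that refer to themselves
        return text

    def substitute(match):
        name = match.group(0)
        if name not in macros:
            return name  # leave unknown names untouched
        return expand(macros[name], macros, depth + 1)

    return MACRO_NAME.sub(substitute, text)


def macro_if(condition, if_true, if_false):
    """Evaluate 'left ne right' (or the assumed 'left eq right')."""
    left, operator, right = condition.split()
    if operator == "ne":
        result = left != right
    elif operator == "eq":
        result = left == right
    else:
        raise ValueError("unsupported operator: " + operator)
    return if_true if result else if_false


print(expand("_greeting_", MACROS))
print(macro_if("fr ne en", "translated page", "English page"))
```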
Page Parameters
Page parameters affect how every page in the
interface is generated
Three page parameters:
• l – interface language
• c – current collection name
• v – whether the page is expanded in graphical or text mode
Making Macros Work
main.cfg – lists the macro files that the
system should read
Home page can be changed by editing the
main.cfg file
Responding to User Requests
Standard arguments used in URLs to exchange
info with the library program
Collection (c)
Action (a)
Page (p)
Retrieving Documents
a=d retrieves a document
d specifies document number (and page)
hl specifies highlight query terms
gt specifies page to display within a book
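A request of this kind can be taken apart with Python's standard URL tools. The sketch below is illustrative only. The host, the path, and the argument values are made up, and `parse_request` is a hypothetical helper.

```python
from urllib.parse import urlsplit, parse_qs


def parse_request(url):
    """Return the CGI arguments of a library URL as a flat dictionary."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical URL: retrieve a document (a=d) from collection "demo".
args = parse_request(
    "http://example.org/cgi-bin/library?a=d&c=demo&d=HASH01&hl=1&gt=2"
)
print(args)
```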
Using Protocol
A complete description of the protocol is needed for
interaction between receptionist and collection server
Protocol Calls Include:
Get_protocol_name() – returns name of protocol
Get_collection_list() – returns list of collections
Get_collection() – obtains general info about collections
Has_collection() – returns true if the collection can be communicated with
Ping() – returns true if a successful connection exists
Filter() – supports searching and browsing
Get_filterinfo() – gets list of filters for a collection
Get_filteroptions() – gets all options of filter for a collection
Get_document() – gets documents
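One way to read this list is as an abstract interface that every protocol must implement. The sketch below models a subset of the calls as a Python abstract base class. The lowercase method names, the argument lists, and the `InMemoryProtocol` class are assumptions for illustration, not the real signatures.

```python
from abc import ABC, abstractmethod


class Protocol(ABC):
    """A subset of the calls listed above; argument lists are assumptions."""

    @abstractmethod
    def get_protocol_name(self): ...

    @abstractmethod
    def get_collection_list(self): ...

    @abstractmethod
    def has_collection(self, collection): ...

    @abstractmethod
    def ping(self, collection): ...


class InMemoryProtocol(Protocol):
    """Hypothetical implementation backed by a set of collection names."""

    def __init__(self, collections):
        self.collections = set(collections)

    def get_protocol_name(self):
        return "inmemory"

    def get_collection_list(self):
        return sorted(self.collections)

    def has_collection(self, collection):
        return collection in self.collections

    def ping(self, collection):
        # Nothing can be offline in this sketch, so ping mirrors has_collection.
        return self.has_collection(collection)


protocol = InMemoryProtocol(["demo", "archive"])
print(protocol.get_collection_list())
```

A class that omits any of the abstract methods cannot be instantiated, which is the point of declaring the calls in a base class.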
Actions
Null protocol receptionists use actions derived from a single base
action through virtual inheritance
Actions – base class
Page – web page
Document – retrieve items from collection server
Query - search
Authen – authenticate user
Users – add/delete users and their permissions
Collector – generate pages
Status – generate admin pages
Extlink – connects users to external site
Ping – verifies a collection is online
Tip – provides a random tip for the user
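The inheritance pattern can be sketched as a base class plus a lookup table keyed on the `a` argument. Only `a=d` comes from the source. The `ping` value, the class names, and the response strings are hypothetical.

```python
class Action:
    """Base class: every action turns URL arguments into a response."""
    name = ""

    def run(self, args):
        raise NotImplementedError


class DocumentAction(Action):
    name = "d"  # matches a=d in the source

    def run(self, args):
        return "document " + args.get("d", "") + " from " + args.get("c", "")


class PingAction(Action):
    name = "ping"  # assumed value; the source does not give it

    def run(self, args):
        return "collection " + args.get("c", "") + " is online"


# Registry of action objects, keyed by the value of the "a" argument.
ACTIONS = {cls.name: cls() for cls in (DocumentAction, PingAction)}


def dispatch(args):
    """Pick the action named by the 'a' argument and run it."""
    action = ACTIONS.get(args.get("a"))
    if action is None:
        return "unknown action"
    return action.run(args)


print(dispatch({"a": "d", "c": "demo", "d": "HASH01"}))
```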
Site Configuration
Configuration files
set variables that are used by library software / Web
server at runtime
Lines in gsdlsite.cfg
gsdlhome – path of Greenstone home directory
httpprefix – web address of Greenstone home directory
httpimage – web address of directory with images
gwcgi – web address of library CGI script
maxrequests – for Fast-CGI
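A file of this kind can be read with a few lines of code. The sketch below assumes one setting per line, with the name and value separated by whitespace, and `#` starting a comment. That layout, the sample paths, and the `parse_site_config` helper are assumptions, not taken from the source.

```python
def parse_site_config(text):
    """Parse 'name value' lines into a dictionary, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        key = parts[0]
        value = parts[1].strip().strip('"') if len(parts) > 1 else ""
        settings[key] = value
    return settings


# Hypothetical file contents; every path and number here is made up.
SAMPLE = """
# site configuration
gsdlhome    "/usr/local/gsdl"
httpprefix  /gsdl
gwcgi       /cgi-bin/library
maxrequests 10000
"""

config = parse_site_config(SAMPLE)
print(config["gsdlhome"])
```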