Transcript: Chapter 7

Greenstone Internals
How to Build a Digital Library
Ian H. Witten and David Bainbridge

Macro Language and Collection Information Database
To understand the structure of the runtime system, you need to understand the following terminology:
• Macro language – how all web pages are expressed internally
• Collection information database – records information produced during the build operation for use at runtime

Processes
Two processes:
• Receptionists – responsible for the user interface
   • Point of contact with the digital library
   • Accepts user input, analyzes it, and dispatches the request to the appropriate collection server(s)
   • Different users can share receptionists
• Collection servers – abstractly handle the contents of a collection
   • Interact with data structures produced by the building processes
   • Locate the requested information and return it to the receptionist for transmission to the user
   • Different collections can share a collection server
• Receptionists communicate with collection servers through a defined protocol

NULL Protocol
• Simplest, most common protocol
• 1 receptionist and 1 collection server run on the same computer
• All communication between the receptionist and collection server goes through the protocol
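
As a rough illustration of the null-protocol arrangement, the sketch below puts one receptionist and one collection server in the same process, with the receptionist reaching the server only through a protocol object. All class and method names here are hypothetical and chosen for the example; they are not taken from the Greenstone source.

```python
class CollectionServer:
    """Hypothetical stand-in for a collection server holding one collection."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # maps document id -> text

    def get_document(self, doc_id):
        return self.documents.get(doc_id)


class NullProtocol:
    """In-process 'protocol': every call is a plain method call, no network."""

    def __init__(self, server):
        self._servers = {server.name: server}

    def has_collection(self, collection):
        return collection in self._servers

    def get_document(self, collection, doc_id):
        return self._servers[collection].get_document(doc_id)


class Receptionist:
    """Handles a user request; talks to the server only via the protocol."""

    def __init__(self, protocol):
        self.protocol = protocol

    def handle(self, collection, doc_id):
        if not self.protocol.has_collection(collection):
            return "unknown collection"
        text = self.protocol.get_document(collection, doc_id)
        return text if text is not None else "unknown document"


server = CollectionServer("demo", {"D1": "Hello from the demo collection"})
receptionist = Receptionist(NullProtocol(server))
print(receptionist.handle("demo", "D1"))
```

Because the receptionist never holds a reference to the server itself, the protocol object could in principle be swapped for one that forwards calls to another machine without changing the receptionist.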

CORBA Protocol Implementation
• Common Object Request Broker Architecture
• Unified object-oriented paradigm
• Allows processes on different computers to communicate over the Internet
• Processes can be implemented on different computer platforms and in different programming languages

Two software components necessary for the runtime system
• Macro language
   • Creates all pages in the user interface
   • A macro is an inline script: the macro name is textually replaced by its definition
• Collection information database
   • Communicates information about collections between the building and delivery phases

Reasons for Using Macros
• Interface comes in many different languages
• All text fragments are stored as macro definitions
• If a particular language is not present for a macro, the macro will be substituted with the English version by default
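
The English fallback can be pictured as a two-level lookup. This is a minimal sketch, assuming a macro table keyed by macro name and then by language code; the macro names, the table layout, and the `lookup` helper are made up for illustration.

```python
# Hypothetical macro table: macro name -> {language code -> text}.
MACROS = {
    "_textsearch_": {"en": "Search", "fr": "Rechercher"},
    "_texthelp_": {"en": "Help"},  # no French version defined
}


def lookup(name, language, default_language="en"):
    """Return the text for `language`, or the English text if it is missing."""
    variants = MACROS[name]
    return variants.get(language, variants[default_language])


print(lookup("_textsearch_", "fr"))  # French version exists
print(lookup("_texthelp_", "fr"))    # falls back to English
```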

Information About Macros
• Macro files end with the extension “.dm”
• Macro names begin and end with an underscore
• Macro content is defined using curly braces
• Content can be plain text, HTML, macro names, etc.
• Macros can contain conditionals: _if_(x,y,z)
   • x = condition (e.g. ne for not equal)
   • y = text to use if x is true
   • z = text to use if x is false
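
To make the textual-replacement idea concrete, here is a toy expander. It is a sketch under stated assumptions, not Greenstone's macro engine: definitions live in a Python dictionary rather than a .dm file, the conditional is spelled `_if_` as on the slide, only `eq` and `ne` conditions are understood, and the three parts of a conditional may not contain commas or parentheses. The macro names `_user_`, `_greeting_`, and `_banner_` are invented for the example.

```python
import re

# Hypothetical definitions; in a .dm file each would be written as
#   _name_ {content}
DEFINITIONS = {
    "_user_": "guest",
    "_greeting_": "Hello _user_",
    "_banner_": "_if_(_user_ ne guest,Welcome back _user_,Please log in)",
}

MACRO_NAME = re.compile(r"_[A-Za-z]+_")
CONDITIONAL = re.compile(r"_if_\(([^,()]*),([^,()]*),([^,()]*)\)")


def evaluate(condition):
    """Evaluate 'a eq b' or 'a ne b'; other non-empty text counts as true."""
    parts = condition.split()
    if len(parts) == 3 and parts[1] in ("eq", "ne"):
        equal = parts[0] == parts[2]
        return equal if parts[1] == "eq" else not equal
    return bool(condition.strip())


def expand(text, definitions, depth=0):
    """Replace macro names by their content, then resolve conditionals."""
    if depth > 10:
        raise RecursionError("macro definitions appear to be circular")
    # Unknown names (including _if_ itself) are left untouched.
    replaced = MACRO_NAME.sub(
        lambda m: definitions.get(m.group(0), m.group(0)), text
    )
    if replaced != text:
        return expand(replaced, definitions, depth + 1)
    # No macro names left to substitute: pick y or z for each conditional.
    return CONDITIONAL.sub(
        lambda m: m.group(2) if evaluate(m.group(1)) else m.group(3), replaced
    )


print(expand("_greeting_", DEFINITIONS))
print(expand("_banner_", DEFINITIONS))
```

Changing the definition of `_user_` changes which branch of the conditional is chosen, which is the point of keeping page text in macros rather than in fixed HTML.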

Page Parameters
• Page parameters affect how every page in the interface is generated
• 3 types of page parameters:
   • l – interface language
   • c – current collection name
   • v – whether the page is generated in graphical or text mode

Making Macros Work
• main.cfg – lists the macro files that the system should read
• Home page can be changed by editing the main.cfg file

Responding to User Requests
Standard arguments used in URLs to exchange information with the library program:
• Collection (c)
• Action (a)
• Page (p)

Retrieving Documents
• a=d retrieves a document
• d specifies the document number (and page)
• hl specifies whether to highlight query terms
• gt specifies the page to display within a book
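
The arguments above travel in the query string of the URL. The sketch below pulls them apart with Python's standard `urllib.parse`; the host, script path, and argument values in the sample URL are invented, and `parse_request` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit


def parse_request(url):
    """Return the URL's query arguments as a flat name -> value dictionary."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs yields a list per argument; keep the first value of each.
    return {name: values[0] for name, values in query.items()}


# Hypothetical request: retrieve a document from the "demo" collection.
url = "http://localhost/cgi-bin/library?a=d&c=demo&d=D1&hl=1&gt=0&l=en"
args = parse_request(url)

if args.get("a") == "d":
    print("document request for", args["d"], "in collection", args["c"])
```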

Using Protocol
• Complete description of the protocol needed for interaction between receptionist and collection server
• Protocol calls include:
   • Get_protocol_name() – returns the name of the protocol
   • Get_collection_list() – returns the list of collections
   • Get_collection() – obtains general information about a collection
   • Has_collection() – returns true if the collection can be communicated with
   • Ping() – returns true if a successful connection exists
   • Filter() – supports searching and browsing
   • Get_filterinfo() – gets the list of filters for a collection
   • Get_filteroptions() – gets all options of a filter for a collection
   • Get_document() – gets documents
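
One way to read the call list is as an abstract interface that each protocol implementation fills in. The sketch below is an assumption-laden illustration: method names follow the list in lowercase, but every argument list and return type is a guess, and the in-memory implementation with its single "query" filter is invented for the example.

```python
from abc import ABC, abstractmethod


class Protocol(ABC):
    """The nine calls from the list above; argument lists are assumptions."""

    @abstractmethod
    def get_protocol_name(self): ...
    @abstractmethod
    def get_collection_list(self): ...
    @abstractmethod
    def get_collection(self, collection): ...
    @abstractmethod
    def has_collection(self, collection): ...
    @abstractmethod
    def ping(self, collection): ...
    @abstractmethod
    def filter(self, collection, filter_name, options): ...
    @abstractmethod
    def get_filterinfo(self, collection): ...
    @abstractmethod
    def get_filteroptions(self, collection, filter_name): ...
    @abstractmethod
    def get_document(self, collection, doc_id): ...


class InMemoryProtocol(Protocol):
    """Hypothetical implementation backed by a dict of collections."""

    def __init__(self, collections):
        self.collections = collections  # name -> {doc_id: text}

    def get_protocol_name(self):
        return "inmemory"

    def get_collection_list(self):
        return sorted(self.collections)

    def get_collection(self, collection):
        return {"name": collection, "num_docs": len(self.collections[collection])}

    def has_collection(self, collection):
        return collection in self.collections

    def ping(self, collection):
        return self.has_collection(collection)

    def filter(self, collection, filter_name, options):
        # Only a simple substring "query" filter is provided in this sketch.
        if filter_name != "query":
            raise ValueError(f"unknown filter: {filter_name}")
        term = options["term"].lower()
        docs = self.collections[collection]
        return [d for d, text in sorted(docs.items()) if term in text.lower()]

    def get_filterinfo(self, collection):
        return ["query"]

    def get_filteroptions(self, collection, filter_name):
        return ["term"]

    def get_document(self, collection, doc_id):
        return self.collections[collection][doc_id]


proto = InMemoryProtocol({"demo": {"D1": "digital library", "D2": "macro files"}})
print(proto.get_collection_list())
print(proto.filter("demo", "query", {"term": "macro"}))
```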

Actions
Null protocol receptionists use actions derived from a single base action through virtual inheritance:
• Actions – base class
• Page – web page
• Document – retrieves items from the collection server
• Query – search
• Authen – authenticates the user
• Users – adds/deletes users and their permissions
• Collector – generates pages
• Status – generates admin pages
• Extlink – connects users to an external site
• Ping – verifies a collection is online
• Tip – provides a random tip for the user
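
The action hierarchy can be sketched as a base class with one overridable method, plus a table that picks the action from the `a` argument. This is only an illustration in Python, where method overriding plays the role of the C++ virtual functions; the class names, the `do_action` method, and the `a=p` and `a=ping` values are assumptions (only `a=d` appears in the slides).

```python
class Action:
    """Base class: each action handles one value of the 'a' argument."""

    name = ""

    def do_action(self, args):
        raise NotImplementedError


class PageAction(Action):
    name = "p"  # assumed argument value

    def do_action(self, args):
        return f"page:{args.get('p', 'home')}"


class DocumentAction(Action):
    name = "d"  # a=d retrieves a document

    def do_action(self, args):
        return f"document:{args.get('d', '')}"


class PingAction(Action):
    name = "ping"  # assumed argument value

    def do_action(self, args):
        return f"ping:{args.get('c', '')}"


# The receptionist looks the action up by name and calls it polymorphically.
ACTIONS = {cls.name: cls() for cls in (PageAction, DocumentAction, PingAction)}


def dispatch(args):
    action = ACTIONS.get(args.get("a", "p"))
    if action is None:
        return "unknown action"
    return action.do_action(args)


print(dispatch({"a": "d", "d": "D1"}))
```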

Site Configuration
• Configuration files
   • Set variables that are used by the library software / Web server at runtime
• Lines in gsdlsite.cfg
   • gsdlhome – path of Greenstone home directory
   • httpprefix – web address of Greenstone home directory
   • httpimage – web address of directory with images
   • gwcgi – web address of library CGI script
   • maxrequests – for Fast-CGI
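
A small reader for such a file might look like the sketch below. It assumes each line is a variable name followed by whitespace and a value, with `#` starting a comment line; that layout, the sample paths, and the `parse_site_config` helper are assumptions for illustration rather than a statement of the actual file format.

```python
# Hypothetical file contents; every value here is made up.
SAMPLE = """\
# site configuration
gsdlhome    /usr/local/gsdl
httpprefix  /gsdl
httpimage   /gsdl/images
gwcgi       /gsdl/cgi-bin/library
maxrequests 10000
"""


def parse_site_config(text):
    """Collect 'name value' lines into a dict, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        settings[parts[0]] = parts[1] if len(parts) > 1 else ""
    return settings


settings = parse_site_config(SAMPLE)
print(settings["gsdlhome"])
```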