Fedora Repository
Object
Datastreams
RUCORE
Rutgers Community Repository
Pre-indexing Program
The pre-indexing program populates the Search database with
sorting and path information from the Fedora database as well as
collection information from the Objects.
[Diagram: Fedora Objects and the Fedora Database feed the Search Database]
Create Indexes
The indexing program uses the Search database to find Objects
for particular collections and then combines descriptive
metadata with XML full text datastreams to create “search
objects” for indexing with Amberfish.
[Diagram: Search Database -> Search Objects -> Filter -> Atomic Amberfish Indexes]
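The "search object" step above can be sketched as follows. This is a minimal illustration in Python, not the RUcore indexing program: the element names (`searchObject`, `metadata`, `fulltext`), the function name and the sample identifiers are all assumptions made for the example. It only shows the idea of wrapping an object's descriptive metadata and its XML full text in a single document that a text indexer could then consume.

```python
import xml.etree.ElementTree as ET


def build_search_object(object_id, collection_id, descriptive_xml, fulltext_xml=None):
    """Combine descriptive metadata and optional full text into one XML document.

    All element and attribute names here are hypothetical.
    """
    root = ET.Element("searchObject", {"id": object_id, "collection": collection_id})
    # Descriptive metadata is always present.
    metadata = ET.SubElement(root, "metadata")
    metadata.append(ET.fromstring(descriptive_xml))
    # Full text is only present for objects that have an XML full text datastream.
    if fulltext_xml:
        fulltext = ET.SubElement(root, "fulltext")
        fulltext.append(ET.fromstring(fulltext_xml))
    return ET.tostring(root, encoding="unicode")


example = build_search_object(
    "example:1234",  # hypothetical object id
    "coll1",         # hypothetical collection id
    "<mods><title>Annual Report</title></mods>",
    "<text><page n='1'>Budget summary</page></text>",
)
print(example)
```

Keeping one search object per repository object, grouped by collection, is what would let each collection be indexed separately.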
Amberfish®
Amberfish is text retrieval software distributed as open source software under the
terms of version 2 of the GNU General Public License (GPL).
- Automatic searching across multiple databases (allowing modular indexing). We refer
to this as atomic indexes.
- Indexing/search of semi-structured text (i.e. both free text and multiply nested fields)
- Boolean queries, right truncation, phrase searching, relevance ranking, support for
multiple documents per file, incremental indexing.
Read more - http://www.etymon.com/tr.html
Searching…
[Diagram: User Search Interface ([Coll1][Coll2][Coll3]) -> Ambersearch.php -> Coll1 Index + Coll2 Index + Coll3 Index; Search Database -> Sort Results -> Search Results]
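The search flow on this slide can be sketched roughly as below. This is a Python stand-in under stated assumptions, not Ambersearch.php and not the Amberfish API: the per-collection indexes are plain dictionaries, and `INDEXES`, `SORT_TITLES` and `search` are hypothetical names. It shows only the shape of the idea: query the indexes of the selected collections, merge the hits, then sort them using information held outside the indexes.

```python
from operator import itemgetter

# Hypothetical per-collection ("atomic") indexes: term -> [(object id, score)].
INDEXES = {
    "Coll1": {"railroad": [("obj:1", 0.9), ("obj:2", 0.4)]},
    "Coll2": {"railroad": [("obj:7", 0.7)]},
    "Coll3": {"canal": [("obj:9", 0.8)]},
}

# Hypothetical sort keys, standing in for what the Search database holds.
SORT_TITLES = {
    "obj:1": "Zephyr Line",
    "obj:2": "Atlantic Route",
    "obj:7": "Morris Yard",
    "obj:9": "Canal Survey",
}


def search(term, collections, sort_by="relevance"):
    """Search only the selected collections and return one merged, sorted list."""
    hits = []
    for name in collections:
        for object_id, score in INDEXES.get(name, {}).get(term, []):
            hits.append({"id": object_id, "collection": name, "score": score})
    if sort_by == "title":
        hits.sort(key=lambda hit: SORT_TITLES.get(hit["id"], ""))
    else:
        hits.sort(key=itemgetter("score"), reverse=True)
    return hits


for hit in search("railroad", ["Coll1", "Coll2"]):
    print(hit["id"], hit["collection"], hit["score"])
```

Because each collection has its own index, adding or re-indexing one collection does not require touching the others.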
Collection Hierarchy
A MySQL database is used to create and display parent/child collection relationships. The database is
a compact relational model. A class was written to build collection hierarchies in the search interface
and create structure maps of the collection when needed.
Collection Hierarchy Search Interface
A start point (collection id) and a maximum depth are passed in a function call. The collection
tree is then built in the search interface.
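A tree build driven by a start point and a maximum depth might look like the sketch below. It is written in Python with an in-memory SQLite database standing in for MySQL; the table `coll_hierarchy`, its columns and the function `build_coll_tree` are assumptions for illustration and are not the actual RUcore schema or the BuildCollTree method. The sketch also assumes the hierarchy contains no cycles.

```python
import sqlite3


def build_coll_tree(conn, coll_id, max_depth, depth=0):
    """Return a nested dict for coll_id, descending at most max_depth levels."""
    node = {"id": coll_id, "children": []}
    if depth >= max_depth:
        return node  # depth limit reached: do not look up further children
    rows = conn.execute(
        "SELECT child_id FROM coll_hierarchy WHERE parent_id = ? ORDER BY child_id",
        (coll_id,),
    ).fetchall()
    for (child_id,) in rows:
        node["children"].append(build_coll_tree(conn, child_id, max_depth, depth + 1))
    return node


# Hypothetical parent/child table: 1 -> (2, 3), 2 -> 4, 4 -> 5.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coll_hierarchy (parent_id INTEGER, child_id INTEGER)")
conn.executemany(
    "INSERT INTO coll_hierarchy VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 4), (4, 5)],
)

tree = build_coll_tree(conn, 1, max_depth=2)
print(tree)
```

With a maximum depth of 2, collection 5 is left out of the tree even though it exists in the table, which is how a search interface could keep a deep hierarchy compact.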
Structure Map (SMAP) Generation
When a collection object's structure/hierarchy is changed, a structure map (XML) can be generated and
stored in the collection object in the repository. If a collection's hierarchy ever needs to be rebuilt,
its lineage is therefore preserved in the repository. To create the SMAP, only a start point
(collection id) needs to be defined. A function then probes the database to determine the collection's
maximum depth. Once this is known, an appropriate SMAP is generated and appended to the
object.
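The two steps described above (probe for the maximum depth, then generate the XML) can be sketched as follows. This is a Python illustration with SQLite in place of MySQL; the table `coll_hierarchy`, the element names `structMap` and `div`, and all function names are assumptions, since the slide does not show the actual SMAP format. The sketch assumes the hierarchy has no cycles.

```python
import sqlite3
import xml.etree.ElementTree as ET


def children_of(conn, coll_id):
    """Return the direct child collection ids of coll_id."""
    rows = conn.execute(
        "SELECT child_id FROM coll_hierarchy WHERE parent_id = ? ORDER BY child_id",
        (coll_id,),
    ).fetchall()
    return [row[0] for row in rows]


def max_depth(conn, coll_id):
    """Probe the database for the number of levels below coll_id."""
    kids = children_of(conn, coll_id)
    if not kids:
        return 0
    return 1 + max(max_depth(conn, kid) for kid in kids)


def add_divs(conn, parent_el, coll_id, remaining):
    """Add one nested element per collection, down to the probed depth."""
    div = ET.SubElement(parent_el, "div", {"collection": str(coll_id)})
    if remaining > 0:
        for kid in children_of(conn, coll_id):
            add_divs(conn, div, kid, remaining - 1)


def build_smap(conn, start_id):
    """Only the start point is supplied; the depth is discovered first."""
    depth = max_depth(conn, start_id)
    root = ET.Element("structMap", {"depth": str(depth)})
    add_divs(conn, root, start_id, depth)
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE coll_hierarchy (parent_id INTEGER, child_id INTEGER)")
conn.executemany("INSERT INTO coll_hierarchy VALUES (?, ?)", [(1, 2), (1, 3), (2, 4)])

smap = build_smap(conn, 1)
print(smap)
```

Storing the resulting XML with the collection object means the hierarchy could be reconstructed from the repository alone if the relational database were lost.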
Collection Hierarchy Methods
A PHP class of methods is used by the WMS, dlr/EDIT and search interfaces to add, update, delete and
display collection hierarchies.
List of methods
AddChild
AddNewCollection
AddSearchTerm
AreRelated
BuildCollTree
BuildQueryCollHierarchy
ChangeChild
ChangeParent
DeleteChild
DeleteCollection
DeleteDeadChildren
DeleteOrphans
DeleteSearchTerm
DisplayCollectionSearchTerms
GetCollectionInfo
GetCollectionSearchTerms
GetCollectionStructureInfo
GetCollidMySQL
GetCollidMySQLByFedoraID
GetCollidWMS
GetCollidWMSByFedoraID
GetDeadChildren
GetDirectChildren
GetDirectParents
GetOrphans
GetSearchTermFields
UpdateSearchTerm
Partner Portals
Background
- Allow partners, other institutions and individuals to attach the repository
search engine, with selected collections, to their own website.
- Built on the existing search code used on the NJDH and RUcore sites.
- An extension offered to NJDH partners and RUcore participants.
- Minimal system requirements.
- Simple setup and maintenance for partners, who are assumed not to be technically oriented.
- Partners can customize their collection list by subscribing to collections.
How it works…
- A username/password and a unique key are generated and assigned to the partner.
- The partner can subscribe to collections of their choice.
- The partner embeds a URI, in an IFrame, on their web site to give visitors access to the search.
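The steps above can be sketched in Python as follows. Everything specific here is hypothetical: the base URL `https://repository.example.org/portal/search`, the query parameter names `key` and `coll`, and the function names are made up for illustration, and password handling is left out. The sketch only shows a unique key being generated, collections being subscribed to, and an IFrame snippet being produced for the partner to paste into their page.

```python
import html
import secrets
from urllib.parse import urlencode


def new_partner(username):
    """Create a partner record with a random, URL-safe unique key."""
    return {"username": username, "key": secrets.token_urlsafe(16), "collections": []}


def subscribe(partner, collection_id):
    """Add a collection to the partner's list, ignoring duplicates."""
    if collection_id not in partner["collections"]:
        partner["collections"].append(collection_id)


def iframe_snippet(partner, base_url="https://repository.example.org/portal/search"):
    """Build the HTML a partner would embed; URL and parameters are hypothetical."""
    query = urlencode({"key": partner["key"], "coll": ",".join(partner["collections"])})
    src = html.escape(f"{base_url}?{query}", quote=True)
    return f'<iframe src="{src}" width="100%" height="600"></iframe>'


partner = new_partner("example-library")
subscribe(partner, "coll1")
subscribe(partner, "coll2")
subscribe(partner, "coll1")  # duplicate, ignored
snippet = iframe_snippet(partner)
print(snippet)
```

Keeping the partner's side down to one pasted snippet fits the stated goal of simple setup for partners who are not technically oriented.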
Contact
Chad Mills: [email protected]
Jeffery Triggs: [email protected]