kiorgaard_hider2008_0

Vale main entry
Differences from AACR2
Philip Hider
Deirdre Kiorgaard
ACOC
What they’re based on
 AACR was based on (and influenced) ISBD
-- based more on convention
 RDA is based primarily on FRBR/FRAD
-- based more on principles
(though inheriting quite a lot of convention)
No semi-colons!
 RDA doesn’t prescribe format/display
 Independent of format standards
 But compatible with ISBD and MARC
 Still have ISBD punctuation, etc. in appendix
FRBR/FRAD model
 Different terminology
 More systematic
 More integrated
 Clearly-defined elements and sub-elements
 Covers authority records
 FRBRization – resources defined in terms of
item/manifestation/expression/work levels
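A minimal sketch of the four levels as nested Python dataclasses. The class names, field names and sample values are illustrative assumptions, not RDA element names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    barcode: str  # one concrete copy


@dataclass
class Manifestation:
    publisher: str
    year: int
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    language: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)


def count_items(work: Work) -> int:
    """Count every item held under a work, across all expressions and manifestations."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)


# Hypothetical sample data: one work, two expressions, three copies in total.
work = Work(
    title="Hamlet",
    expressions=[
        Expression("English", [Manifestation("Publisher A", 2003, [Item("0001"), Item("0002")])]),
        Expression("German", [Manifestation("Publisher B", 1999, [Item("0003")])]),
    ],
)
print(count_items(work))
```

Nesting the levels this way lets a catalogue answer a work-level question (how many copies exist in any language or edition) without the user having to know about individual editions.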
Carrier and content differences
 RDA will be primarily online
 RDA has a very different organization
 no AACR2 pt 1 division by class of material
 AACR2 pt 2 dispersed
 first-order division:
recording attributes -- recording relationships
 second-order division: user tasks
Contents
Recording attributes
Section 1 - Recording attributes of manifestation and item
Section 2 - Recording attributes of work and expression
Section 3 - Recording attributes of person, family, and corporate body
Section 4 - Recording attributes of concept, object, event, and place
Recording relationships
Section 5 - Recording primary relationships between work, expression,
manifestation, and item
Section 6 - Recording relationships to persons, families, and corporate bodies
associated with a resource
Section 7 - Recording subject relationships
Section 8 - Recording relationships between works, expressions, manifestations and
items
Section 9 - Recording relationships between persons, families, and corporate bodies
Section 10 - Recording relationships between concepts, objects, events, and places
Section 1
Chapter 1. General guidelines on recording attributes of
manifestations and items
Chapter 2. Identifying manifestations and items
Chapter 3. Describing carriers
Chapter 4. Providing acquisition and access information
FRBR/FRAD model
 Persons, families, corporate bodies
 More coverage of relationships
 Separation of content and carrier
(expression/work) vs (manifestation/item)
 Far fewer rules pertaining to particular ‘material types’
 Generalisation
 Some material-specific rules still, at the element level (legal,
music, religious, serials, etc.)
GMDs/SMDs gone!
 Content type
 Media type & Carrier type
 Closed lists of values
 Multiple terms allowed
 Look for the attributes, not the resource
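A minimal sketch of how the three elements might be recorded, assuming a small subset of terms for each closed list. The function name `describe` and the record layout are hypothetical.

```python
# Small illustrative subsets of the closed lists, not the full vocabularies.
CONTENT_TYPES = {"text", "still image", "spoken word", "notated music"}
MEDIA_TYPES = {"unmediated", "audio", "computer", "video"}
CARRIER_TYPES = {"volume", "audio disc", "online resource", "videodisc"}


def describe(content, media, carrier):
    """Check each list of terms against its closed list and return a record."""
    record = {}
    for name, terms, allowed in (
        ("content_type", content, CONTENT_TYPES),
        ("media_type", media, MEDIA_TYPES),
        ("carrier_type", carrier, CARRIER_TYPES),
    ):
        unknown = [t for t in terms if t not in allowed]
        if unknown:
            raise ValueError(f"{name}: not in closed list: {unknown}")
        record[name] = list(terms)
    return record


# A book with an accompanying audio CD: more than one term per element.
book_with_cd = describe(
    content=["text", "spoken word"],
    media=["unmediated", "audio"],
    carrier=["volume", "audio disc"],
)
print(book_with_cd)
```

Recording several terms per element describes the attributes the resource actually has, rather than forcing it into a single material class.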
Relationship designators
 Inter-resource relationships,
e.g. sequel, translation, adaptation
 Group 1-2 relationships,
e.g. author, photographer, publisher
 Relationships between Group 2 entities
New elements
 Elements/sub-elements instead of notes,
e.g. language, copyright date, file format
 Resource elements mostly coincide with ISBD
 Elements given definitions
 Some new Group 2 elements (e.g. gender)
Old terminology
 No more
Heading
Added Entry
Authorized heading
See references
Main Entry
Uniform title
New terminology
Element
Carrier
Manifestation
Expression
Access point
Preferred access point
Variant access point
Preferred title
Principle of ‘user-friendliness’
 Principle of representation –
‘take what you see’ or
‘what you see is what you type’
 More transcription, less editing
 Transcribe volume numbers, etc.
 No abbreviations in transcribed elements
Principle of ‘user-friendliness’
 No [sic] records
 No [44] p. etc. for ‘recorded elements’
 No square brackets for statements of responsibility in the
resource
 A lot less Latin,
e.g. ‘Publisher not identified’ instead of s.n.
Rule of three
 Record all of them,
though first named only still an option
 Conversely, just the first for preferred access point,
but option to use all of them
Principle of ‘flexibility’
 Less prescriptive, less ‘case law’
 ‘Core’ elements rather than ‘required’ or specific levels
 Internationalisation -- primacy no longer given to
English
 Preferred access point option, no ‘main entry’
RDA and web resources
Better description of resources with multiple
characteristics
Improved treatment of online resources
 online resource as a carrier type
 improved technical description
 introduction of persistent identifiers & URLs
Other things
 Changes requiring a new description:
change in mode of issuance or in media type;
issuance of a new base set (integrating resources)
 Standard numbers covered together with other
identifiers (e.g. publisher numbers)
 Introductory phrases included in titles proper
Other things
Works with no collective title -- separate access
points for each work in compilation
Fewer additions to names (‘Sir’ ‘Rev’ etc.)
(but Jr. now included)
Bible uniform titles
If you get lost…
 Concordance of AACR rules to RDA
 RDA will also point to the applicable MARC
fields/subfields
 Help will be at hand!
Towards a semantic web
Philip Hider
This talk
 The Semantic Web vision
 Scenarios
 Standards
 Semantic Web & RDA
Web 1.0, 2.0, 3.0
 Internet to WWW (Web 1.0)
 Web 1.0 allows people to navigate the Internet easily,
through hyperlinks
 Web 2.0 allows people to collaborate more on the Web
 Web 3.0 allows computers to find and use the data
contained in Web documents
 Web 3.0 = the Semantic Web vision
The Semantic Web vision
 It will allow computers to make sense of the content of
Web documents, so that they can find and use this data
independently
 Basis of SW already developed, with standards such as
XML and RDF
 Like Web 1.0, it represents a bottom-up, distributed
approach
How would it work?
 Computers would be able to identify and ‘understand’
particular data in a Web document according to the
metadata associated with that data
 metadata could be inside or outside the document
 Computers (agents) would then be able to relate that data to
other data in other documents (or the same document)
according to specified schemas, ontologies and rules
 They could then independently integrate data and process
information according to tasks set by their human users
A Semantic Web scenario
 User asks ‘Trip Agent’ to purchase the ‘best’ deal for a
trip to New Zealand with date range x, family members
y, time of day z, etc. etc.
 ‘Trip agent’ searches the Web for flights and
accommodation, and is able to look up databases and
specify conditions according to what it ‘knows’ about
user’s preferences
Semantic Web scenario
 Agent is able to ‘understand’ the deals available on
different websites by integrating data from different
sources, e.g. looking up geographic information systems
(how far from the sea, shops, etc.), weather forecasts,
family members’ calendars, etc., and ultimately
suggesting the optimal combination of flight, hotel,
tours, etc.
Another scenario
User asks if the latest Stephen King
book is available in a nearby library,
can’t remember what it’s called
‘Library Agent’ searches the Web for nearby libraries
with books by ‘Stephen King’, finds a few different
Stephen Kings, confirms with user which Stephen
King, then identifies the latest novel via the official
Stephen King website, but chooses the second-nearest
library (by car) which holds it because of
availability/format/library opening hours, etc.
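The final step of this scenario, choosing among libraries that hold the book, can be sketched as a filter followed by a nearest-first choice. Everything here is hypothetical: the `Holding` fields, the library names and the driving times stand in for data a real agent would gather from library catalogues and mapping services.

```python
from dataclasses import dataclass


@dataclass
class Holding:
    library: str
    drive_minutes: int
    available: bool  # a copy is on the shelf
    open_now: bool   # the library is currently open


def choose_library(holdings):
    """Return the nearest library (by driving time) whose copy is usable now, or None."""
    usable = [h for h in holdings if h.available and h.open_now]
    return min(usable, key=lambda h: h.drive_minutes, default=None)


holdings = [
    Holding("Nearest Library", 5, available=False, open_now=True),
    Holding("Second-Nearest Library", 12, available=True, open_now=True),
    Holding("Distant Library", 40, available=True, open_now=True),
]
choice = choose_library(holdings)
print(choice.library)
```

The nearest library is skipped because its copy is on loan, which is the kind of reasoning the slide attributes to the agent.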
What do SW agents need?
 Information about the data, i.e. metadata,
in a machine-readable format
 Including a shared understanding of the structure of that
metadata and its relationship to other knowledge
structures (ontologies)
 Some clever programming
Standards for the Semantic Web
Resource Description Framework
Uniform Resource Identifiers
XML
Unicode
Schemas (such as XML schemas)
Ontologies written in e.g. OWL
Rules written in RIF, etc.
SPARQL
Resource Description Framework
 W3C standard
 A model used to structure resource descriptions
 Can be used to structure data about any kind of resource
 could be a book, or a car, or a flight ticket, or an experiment, etc.
 Based on ‘triples’, i.e.
Resource – Property – Value
(Subject – Predicate – Object)
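A minimal sketch of the triple model in plain Python, with tuples standing in for triples. The property names `dc:title`, `dc:creator` and `foaf:name` come from Dublin Core and FOAF; the `book:` and `person:` identifiers and the literal values are made up for illustration, and the `match` helper is hypothetical.

```python
# Each triple is (subject, predicate, object).
triples = [
    ("book:A", "dc:title", "Apples of the World"),
    ("book:A", "dc:creator", "person:1"),
    ("person:1", "foaf:name", "Jane Example"),
    ("book:B", "dc:creator", "person:1"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in store
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Which resources were created by person:1?
books_by_person = [s for (s, _, _) in match(triples, predicate="dc:creator", obj="person:1")]
print(books_by_person)
```

Because the object of one triple can be the subject of another, a set of triples forms a graph that can be followed from book to person to name.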
Uniform Resource Identifiers
For example, URLs
And ISBNs
People don’t have them yet
OCLC working on ‘work identifiers’
Properties and some values are referenced as part of
particular schemas, ontologies, etc.
eXtensible Markup Language (XML)
 Another W3C standard
 More flexible than HTML, XHTML
 Can be used to encode any data
 Data can be in the same Web document or another document
 Can be used to express RDF, i.e. RDF/XML
 RDF/XML basis for metadata structures such as schemas and
ontologies
Schemas
 Standardised structures of resource description that
define property elements in a taxonomic way
 Mostly based on a particular domain,
e.g. pertaining to bibliographic data, or geospatial data,
or flight booking data, or used car data, etc.
Schemas
 Two main groups of schemas –
XML schemas and RDFS (RDF schemas)
 Superseding Document Type Definitions (DTDs)
 Specific well-known schemas include
 Dublin Core
 ONIX
 RSS
Some metadata encoded in RDF/XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
    <foaf:primaryTopic>
      <foaf:Person>
        <foaf:name>Tony Benn</foaf:name>
      </foaf:Person>
    </foaf:primaryTopic>
  </rdf:Description>
</rdf:RDF>
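A minimal sketch of how a program might read the metadata above, using Python's standard XML parser. This is a general-purpose XML parser, not an RDF parser, so the sketch only handles the simple literal-valued properties of this one example; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"

doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
    <foaf:primaryTopic><foaf:Person><foaf:name>Tony Benn</foaf:name></foaf:Person></foaf:primaryTopic>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(doc)
triples = []
for desc in root.findall(f"{{{RDF}}}Description"):
    subject = desc.get(f"{{{RDF}}}about")
    for prop in desc:
        # Keep only properties whose value is plain text; nested nodes are skipped.
        if len(prop) == 0:
            triples.append((subject, prop.tag, prop.text))

# The nested person's name is found with a namespace-qualified search.
name = root.find(f".//{{{FOAF}}}name").text

for triple in triples:
    print(triple)
print(name)
```

Each property element becomes one Resource – Property – Value triple, with the namespace URI making the property name unambiguous across schemas.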
Ontologies
More sophisticated than schemas, formalising
more complex relationships between elements
Also usually domain-specific
Use extra languages, such as OWL, on top of
RDF/XML etc.
Ontologies give more scope for agents to be ‘clever’
Dublin Core can be expressed as an ontology or a
schema
What about MARC?
 MARC files are rather flat and do not readily define
relationships between elements
 But can be expressed as an XML schema,
i.e. MARCXML
 MODS is a lite version of MARCXML
 Mappings between MARCXML and other schemas
(e.g. DC)
Mappings
 Lots of them!
 Between different schemas, ontologies, languages, etc.
 AKA crosswalks
 By UKOLN, LC, OCLC, etc. etc.
 The more standards and adaptations, the more
crosswalks
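A minimal sketch of what a crosswalk does, assuming a deliberately tiny mapping table from MARC tags to Dublin Core elements. Published crosswalks are far more detailed and work at subfield level; the `crosswalk` function and the sample record are hypothetical.

```python
# Simplified, illustrative mapping: MARC tag -> Dublin Core element.
MARC_TO_DC = {
    "100": "creator",
    "245": "title",
    "260": "publisher",
    "650": "subject",
}


def crosswalk(marc_fields):
    """Map (tag, value) pairs to a dict of Dublin Core element -> list of values.

    Tags missing from the mapping table are returned separately so the loss is visible.
    """
    dc = {}
    unmapped = []
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            unmapped.append(tag)
        else:
            dc.setdefault(element, []).append(value)
    return dc, unmapped


record = [
    ("100", "Example, Jane"),
    ("245", "Growing apples"),
    ("650", "Apples"),
    ("650", "Fruit-culture"),
    ("300", "200 pages"),
]
dc_record, unmapped = crosswalk(record)
print(dc_record)
print(unmapped)
```

Returning the unmapped tags makes the cost of moving to a simpler schema explicit: whatever the target schema cannot express is dropped.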
Value sets
 Resource – Property – Value
 Schemas and ontologies may point to particular value sets,
e.g.
Book A – hasaSubjectcalled – DCterms:LCSH Apples
where Apples is a value in the set of values known as LCSH
 In other words, they may point to controlled vocabularies
SKOS
 Simple Knowledge Organization System
 SW standard for expressing controlled vocabularies
such as subject thesauri
 http://www.w3.org/2004/02/skos
 Might promote use of LCSH, etc.
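A minimal sketch of the kind of structure SKOS expresses, using a plain Python dictionary. The keys mirror the SKOS properties `skos:prefLabel`, `skos:altLabel` and `skos:broader`; the `ex:` identifiers, the two concepts and both helper functions are hypothetical.

```python
concepts = {
    "ex:apples": {"prefLabel": "Apples", "altLabel": ["Apple"], "broader": "ex:fruit"},
    "ex:fruit": {"prefLabel": "Fruit", "altLabel": [], "broader": None},
}


def find_concept(label):
    """Return the id of the concept whose preferred or alternative label matches, else None."""
    wanted = label.casefold()
    for concept_id, concept in concepts.items():
        labels = [concept["prefLabel"], *concept["altLabel"]]
        if any(wanted == candidate.casefold() for candidate in labels):
            return concept_id
    return None


def broader_chain(concept_id):
    """Follow broader links upward and return the preferred labels encountered."""
    chain = []
    while concept_id is not None:
        chain.append(concepts[concept_id]["prefLabel"])
        concept_id = concepts[concept_id]["broader"]
    return chain


print(find_concept("apple"))
print(broader_chain("ex:apples"))
```

An agent that can resolve a variant label to a concept, and then walk up to broader concepts, can widen or narrow a subject search without human help.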
Semantic Web & cataloguing
 More sophisticated use of library catalogues if they can
be understood by Semantic Web agents
 Library resources more likely to be used in conjunction
with non-library web resources
 SW about agents using cataloguing,
not replacing cataloguing
Semantic Web & RDA
 RDA is therefore aligning itself with DC and RDF
 RDA elements mapped to DC, ONIX, etc.
 DCMI/RDA Task Group
 RDA-DC application profile
 http://dublincore.org/dcmirdataskgroup
Prospects for SW
 Examples of Semantic Web developments:
http://www.w3.org/2001/sw/sweo/public/UseCases
 A lot of standards now in place, technology not so
much of an issue
 With RDA, bibliographic domain ripe for SW take-up
Pre-SW library work
Post-SW library work
Thank you.