DDC metadata

EDUG 2012 Symposium
26 April 2012
Boston Spa, UK
Michael Panzer
Assistant Editor, DDC
OCLC
[email protected]
Types of DDC data
- Usually, Dewey numbers provide metadata for describing
other resources
- DDC as value vocabulary for metadata element sets
- Instead, the following focuses on cases where Dewey
numbers and DDC editions are the resources described
- Two levels of DDC metadata
- Number-level metadata (focus on bibliographic records)
- Edition-level metadata (focus on classification records)
DDC metadata
Metadata about
- Dewey numbers (082, 083, 085 fields in MARC
Bibliographic)
- Provenance of machine-generated classification data
- Dewey number components in linked 085 fields
- Dewey editions (084, 686 fields in MARC Classification)
- Interplay between class- and edition-level metadata rendered
in MARC Classification format
Agenda
Scenario | Context
1. Provenance of machine-generated data | Proposal for MARBI; (metadata) provenance initiatives at W3C / DCMI
2. Edition-level metadata | Relationship between translations and other “versions”
3. Metadata about Dewey number components | Enhancing Dewey numbers for retrieval
MARBI proposal
- Drafted over the last two months in cooperation with colleagues
from DNB and LC
- To be presented at MARBI meeting at ALA Annual Conference
2012
- Two options
- Option 1: Addresses the immediate needs of documenting
information about machine generation of classification data
- Defines additional subfields in 082, 083, 084
- Option 2: Proposes a more general way of dealing with metadata
provenance
- Applicable to all MARC variable fields (in principle)
- Heeds the distinction between provenance in general and metadata
provenance in particular
Option 1
Defined for 082, 083, 084
$i - Method of assignment designator
Fully machine-generated (m)
Not fully machine-generated (x)
$u - Process of assignment
May contain a URI, a process name, or some other
description of process designated in $i
$1 - Confidence value
Confidence of the assigning agency in relation to the
process described in $u. Contains value from the interval
[0,1]
$q - Assigning agency (already defined)
Examples
DDC 23 number assigned by LC using AutoDewey. The
AutoDewey process involves machine assistance followed
by intellectual review:
082 00 $a829/.3$223$ix$uautodewey$11
Fictitious example of DDC 22 number assigned by OCLC in a
fully automated way using information in Classify:
082 04 $a394.12$222$im$uclassify$10.5$qOCoLC
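The Option 1 subfields can be read mechanically. The following is a minimal sketch, not part of the proposal: the helper names `parse_subfields` and `describe_assignment` are hypothetical, and the parser assumes the compact notation of the examples above (one-character subfield codes, no literal `$` inside values). Dropping the `/` segmentation mark from `$a` is likewise an illustrative choice.

```python
def parse_subfields(data: str) -> list[tuple[str, str]]:
    """Split '$a829/.3$223' into [('a', '829/.3'), ('2', '23')]."""
    pairs = []
    for chunk in data.split("$")[1:]:  # anything before the first '$' is ignored
        chunk = chunk.strip()
        if chunk:
            pairs.append((chunk[0], chunk[1:].strip()))
    return pairs


def describe_assignment(data: str) -> dict:
    """Summarize the provenance subfields proposed under Option 1."""
    sub = dict(parse_subfields(data))
    methods = {"m": "fully machine-generated", "x": "not fully machine-generated"}
    confidence = float(sub["1"]) if "1" in sub else None
    if confidence is not None and not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence value must lie in the interval [0,1]")
    return {
        "number": sub.get("a", "").replace("/", ""),  # drop the segmentation mark
        "edition": sub.get("2"),
        "method": methods.get(sub.get("i"), "no information"),
        "process": sub.get("u"),
        "confidence": confidence,
        "agency": sub.get("q"),
    }


lc = describe_assignment("$a829/.3$223$ix$uautodewey$11")
oclc = describe_assignment("$a394.12$222$im$uclassify$10.5$qOCoLC")
print(lc)
print(oclc)
```

A consumer could use the `method` and `confidence` values to decide, for example, whether a number is shown to end users or queued for intellectual review.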
Option 2
883 - Data provenance (R)
First Indicator: Method of assignment
# - No information provided
0 - Fully machine-generated
1 - Not fully machine-generated
$d - Date on which the linked field was generated
$u - Process used to generate linked field
$q - Agency using the process/activity to generate the linked field
$1 - Confidence value
$x - Ending date of validity
$0 - Authority record control number or standard number
$8 - Field link and sequence number (with new field link type “p – Data
provenance”)
Examples
082 00 $81\p$a829/.3$223
883 1# $81\p$uautodewey$d20120407$qDLC$11
082 04 $81\p$a394.12$222$qOCoLC
883 0# $81\p$uclassify$d20120407$qOCoLC$10.5
Examples (2)
082 04 $81\p$a004$222/ger$qNO-OsNB
883 0# $81\p$udeweyclassifierv0.1$d20120101$x20141231$qNO-OsNB$10.25$0(DE-101)040268942
082 04 $81\p$a004$222/ger$qDE-101
883 0# $81\p$uparallelrecordcopy$d20120101$x20141231$qNO-OsNB
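Under Option 2, provenance lives in a separate 883 field tied to the described field through `$8`. The sketch below shows one way the link might be resolved within a single record. It is an illustration under stated assumptions: the function names are hypothetical, fields are given as `(tag, indicators, data)` tuples, and links are matched on the whole `$8` value rather than on parsed link number and type.

```python
from collections import defaultdict


def parse_subfields(data: str) -> list[tuple[str, str]]:
    """Split compact MARC notation into (code, value) pairs."""
    return [(c[0], c[1:].strip()) for c in data.split("$")[1:] if c.strip()]


def provenance_for(record, tag="082"):
    """Pair each field carrying `tag` with the 883 fields sharing its $8 link."""
    by_link = defaultdict(list)
    for ftag, indicators, data in record:
        if ftag != "883":
            continue
        subs = parse_subfields(data)
        for code, value in subs:
            if code == "8" and value.endswith("\\p"):  # link type p: data provenance
                info = {"indicators": indicators}
                info.update({c: v for c, v in subs if c != "8"})
                by_link[value].append(info)
    result = []
    for ftag, _indicators, data in record:
        if ftag == tag:
            subs = parse_subfields(data)
            links = [v for c, v in subs if c == "8"]
            prov = [p for link in links for p in by_link.get(link, [])]
            result.append((dict(subs).get("a"), prov))
    return result


# Raw strings keep the backslash in the $8 link value intact.
record = [
    ("082", "04", r"$81\p$a394.12$222$qOCoLC"),
    ("883", "0#", r"$81\p$uclassify$d20120407$qOCoLC$10.5"),
]
linked = provenance_for(record)
print(linked)
```

Keeping provenance in its own field means the same lookup would work for any linked variable field, which is the generality Option 2 aims for.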
Edition-level metadata
- Edition registry: capturing information about editions and
translations in a centralized manner outside of MARC
records
- Storing additional metadata about editions/translations in
MARC records
- Better management of translation data and other versions
- MARC does not offer edition-level records
- Edition information has to be carried in individual records, even when
it applies to the whole edition
- Relevant fields:
084 - Classification Scheme and Edition
686 - Relationship to Source Note
DDC translations:
Anatomy of an edition
[Diagram; labels as they appear on the slide]
- English DDC 22
- DDC 22 translations: Italian, German, Swedish (Mixed), French
- DDC Summaries: Afrikaans, Arabic, Chinese, French, German, Norwegian, Portuguese, Russian, Scots Gaelic, Spanish, Swedish
- DDC Sachgruppen: (German), French, Italian, Rhaeto-Romansch
- 200 Religion Class
- Guide (French)
- A14: Vietnamese, French, Hebrew, Spanish, Italian
Types of editions
- Related to an edition, with relationships not captured at
record level
Examples: sdnb, DDC Summaries, Guide
versus
- Related to an edition, with relationships captured at
record level
Examples: 200 Religion, translations, A15engind
Tracking edition-to-edition relationships
Translation of standard edition
084 1# $a ddc $c 15 $e ind
Source edition
084 1# $a ddc $c 15 $e eng
Authorized derivative version of standard edition
084 8# $a ddc $c 22sdnb $d 22 $e ger
Source edition
084 0# $a ddc $c 22 $e eng
- Not explicitly full or abridged; “8” is used as the value of the first indicator
- $n should be automatically populated with relevant information
about changes relative to the source edition.
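The edition-to-edition link can be followed programmatically from the 084 subfields. The sketch below assumes the spaced notation used on this slide and reads the first indicator as 0 = full, 1 = abridged, 8 = other, which matches the examples here; the function names `parse_084` and `is_derived_from` are hypothetical.

```python
EDITION_TYPES = {"0": "full", "1": "abridged", "8": "other"}


def parse_084(line: str) -> dict:
    """Parse a line such as '084 8# $a ddc $c 22sdnb $d 22 $e ger'."""
    tag, indicators, rest = line.split(maxsplit=2)
    if tag != "084":
        raise ValueError(f"expected an 084 field, got {tag}")
    sub = {}
    for chunk in rest.split("$")[1:]:
        code, _, value = chunk.strip().partition(" ")
        sub[code] = value.strip()
    return {
        "scheme": sub.get("a"),
        "edition": sub.get("c"),
        "source_edition": sub.get("d"),  # only present on derivative versions
        "language": sub.get("e"),
        "type": EDITION_TYPES.get(indicators[0], "unknown"),
    }


def is_derived_from(derived: dict, source: dict) -> bool:
    """True if `derived` names `source` as its source edition via $d."""
    return (
        derived["scheme"] == source["scheme"]
        and derived["source_edition"] is not None
        and derived["source_edition"] == source["edition"]
    )


sdnb = parse_084("084 8# $a ddc $c 22sdnb $d 22 $e ger")
ddc22 = parse_084("084 0# $a ddc $c 22 $e eng")
print(sdnb)
print(is_derived_from(sdnb, ddc22))
```

The translation example (15/ind versus 15/eng) carries no `$d`, so in this sketch it would have to be matched on scheme and edition identifier with differing language codes instead.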
Tracking record-to-record relationships
1. Record has been modified
Translation of standard edition
084 1# $a ddc $c 15 $e ind
686 3# $i modified
Source record
084 1# $a ddc $c 15 $e eng
Tracking record-to-record relationships (2)
2. Record was created for translation
Translation of standard edition
084 1# $a ddc $c 15 $e ind
686 1# $b 305.899
Source record
[does not exist]
Tracking record-to-record relationships (3)
3. Unmodified record from different source edition
Translation of standard edition
084 1# $a ddc $c 15 $e ind
686 0# $2 23
Source record
084 0# $a ddc $c 23 $e eng
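The three record-to-record cases can be told apart by the first indicator of the 686 field. The sketch below encodes only what the three examples above show (3, 1, 0); it is a reading of these slides, not a restatement of the MARC field definition, and `describe_686` is a hypothetical helper.

```python
# Mapping read off the three slide examples; other indicator values are not covered.
RELATIONSHIPS = {
    "3": "record has been modified relative to the source record",
    "1": "record was created for the translation (no source record)",
    "0": "unmodified record from a different source edition",
}


def describe_686(line: str) -> str:
    """Describe the relationship expressed by a 686 line such as '686 3# $i modified'."""
    tag, indicators, *_rest = line.split(maxsplit=2)
    if tag != "686":
        raise ValueError(f"expected a 686 field, got {tag}")
    return RELATIONSHIPS.get(indicators[0], "not covered by the slide examples")


for example in ("686 3# $i modified", "686 1# $b 305.899", "686 0# $2 23"):
    print(example, "->", describe_686(example))
```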
085 - Synthesized Classification Number
Components
- 085 fields provide information about components of Dewey
numbers in linked 082 or 083 fields
- Mirror 765 fields in MARC Classification format
- Vital for faceted retrieval driven by Dewey numbers
- Further enhancements possible by utilizing mappings of
Dewey numbers that occur prominently as components, e.g.,
geographic data, time periods
- Definition of new indexes is required for retrieval
use with WorldCat data
Exploiting Dewey facets in WorldCat
Das Highlander-Kochbuch
082 04 $8 1\x $a 641.594115 $q DE-101 $2 22/ger
085 ## $8 1\x $b 641.59
085 ## $8 1\x $z 2 $s 4115
641.593-.599 Cooking characteristic of specific continents,
countries, localities
T2—4115
Highland
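The relationship between the 082 number and its 085 components can be checked by rebuilding the number from its parts. This is a deliberately simplified sketch: real number building follows the add instructions in the schedules, whereas the hypothetical `build_number` helper below just concatenates digits and re-inserts the decimal point after the third digit.

```python
def build_number(base: str, *parts: str) -> str:
    """Append component digits to a base number and restore the decimal point."""
    digits = base.replace(".", "") + "".join(parts)
    if len(digits) <= 3:
        return digits
    return f"{digits[:3]}.{digits[3:]}"


# Base number from 085 $b and Table 2 notation from 085 $s (with $z 2).
built = build_number("641.59", "4115")
print(built)
```

For this record the result agrees with the `$a` value of the linked 082 field, 641.594115.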
Proposed new indexes (083 fields)
“Dewey additional” index
da index:
Add $z and $c ($y) to elements already in dd
index
Pattern:
[z--]a[-c][:a[-c]]
Proposed new indexes (085 fields)
“Dewey components” index
dc index:
Index $s and $t concatenated with full
address
Pattern:
[z--]rs|w[-c][:t]
“Dewey synthesized” index
ds index:
Index all components
Pattern:
[z--]a|b|rs|u|w[-c][:a|b|t|u|v[-c]]
Proposed new indexes (082/083/085 fields)
“Dewey general” index
dg index:
Index all elements in Dewey numbers
Pattern:
Combine dd, da, and ds indexes
Example: History of Cologne during WWII
Built number: 943.55140864
9 History & geography
+ T2—435514 Cologne
+ 943.0864 Period of World War II, 1939-1945
082 00 $8 1\x $a 943/.55140864 $2 22
085 0# $8 1\x $b 9 $a 930 $c 990 $z 2 $s 435514 $u 943.5514
085 0# $8 1\x $b 943.5514 $a 930 $c 990 $v 01 $c 09 $f 0 $r 943.0 $s 864 $u 943.55140864
Access points / findability
082 00 $8 1\x $a 943/.55140864 $2 22
085 0# $8 1\x $b 9 $a 930 $c 990 $z 2 $s 435514 $u 943.5514
085 0# $8 1\x $b 943.5514 $a 930 $c 990 $v 01 $c 09 $f 0 $r 943.0 $s 864 $u 943.55140864
Components in dc index:
2--435514
943.0864
Synthesized components in ds index:
2--435514, 9, 930-990, 930-990:01-09, 943.0864,
943.5514, 943.55140864
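The dc entries listed above can be derived from the 085 fields. The sketch below covers only the `[z--]rs` part of the dc pattern (table prefix from `$z`, root from `$r`, added digits from `$s`); the `w`, `-c` and `:t` alternatives are left out, and `dc_entries` is a hypothetical name.

```python
def parse_subfields(data: str) -> list[tuple[str, str]]:
    """Split spaced MARC notation ('$b 9 $a 930 ...') into (code, value) pairs."""
    pairs = []
    for chunk in data.split("$")[1:]:
        code, _, value = chunk.strip().partition(" ")
        pairs.append((code, value.strip()))
    return pairs


def dc_entries(fields_085: list[str]) -> list[str]:
    """Build 'Dewey components' index entries from 085 field data."""
    entries = []
    for data in fields_085:
        # Repeated codes such as $c collapse to the last value; z, r and s are not repeated here.
        sub = dict(parse_subfields(data))
        if "s" not in sub:
            continue
        prefix = f"{sub['z']}--" if "z" in sub else ""
        entries.append(prefix + sub.get("r", "") + sub["s"])
    return entries


fields = [
    r"$8 1\x $b 9 $a 930 $c 990 $z 2 $s 435514 $u 943.5514",
    r"$8 1\x $b 943.5514 $a 930 $c 990 $v 01 $c 09 $f 0 $r 943.0 $s 864 $u 943.55140864",
]
components = dc_entries(fields)
print(components)
```

For the Cologne record this yields the same two entries as the slide: the Table 2 notation for Cologne and the period number for World War II.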
Scenarios / Use cases
- Components / facets can be varied independently of each
other
- Allows for expanding, but also "morphing" the query by
changing individual components
- Integration of mapped vocabularies into Dewey-driven
discovery process
- Using terms that have been mapped to any number
components
- Usage of local hierarchies of number components instead
of just the hierarchical relationships of the base number
Example: Dewey-driven discovery
Number components + mapped GeoNames
[Diagram: starting from 394.120954 (394.12 + T2—54), related numbers are reached by varying individual components. Node labels: 394.125 Meals; 395.54 Table manners; 394.13 Drinking of alcoholic beverages; 394.13 + T2—51. Edge labels: notational, structural, gn:neighbour.]
Neighboring countries:
- China: T2—51
- Pakistan: T2—5491
- Bangladesh: T2—5492
- Nepal: T2—5496
- Bhutan: T2—5498
- Myanmar: T2—591
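"Morphing" a query by swapping the geographic component might look like the sketch below. It uses naive string substitution on the end of the number and takes the neighbour notations from the list above. A real implementation would presumably read the component from 085 `$z`/`$s` and match variants against the proposed indexes, and `morph_area` is a hypothetical helper.

```python
# Table 2 notations for the neighbouring countries named on the slide.
NEIGHBOURS = {
    "China": "51",
    "Pakistan": "5491",
    "Bangladesh": "5492",
    "Nepal": "5496",
    "Bhutan": "5498",
    "Myanmar": "591",
}


def morph_area(number: str, current_area: str, new_area: str) -> str:
    """Replace a trailing Table 2 notation in a built number with another one."""
    if not number.endswith(current_area):
        raise ValueError(f"{number} does not end in area notation {current_area}")
    return number[: -len(current_area)] + new_area


variants = {
    country: morph_area("394.120954", "54", notation)
    for country, notation in NEIGHBOURS.items()
}
for country, number in variants.items():
    print(f"{country}: {number}")
```

Because only the geographic component changes, the topical part of the query (394.12) stays fixed while the search moves across neighbouring areas.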
Thank You!
Questions? Comments? Ideas?
Some useful links
DDC 23
http://www.oclc.org/us/en/dewey/versions/print/default.htm
Abridged Edition 15
http://www.oclc.org/us/en/dewey/versions/abridged/default.htm
WebDewey 2.0
http://dewey.org/webdewey
dewey.info
http://dewey.info
Dewey webinars & presentations
http://www.oclc.org/us/en/dewey/news/conferences/default.htm
025.431: The Dewey blog
http://ddc.typepad.com
Classify
http://classify.oclc.org/classify2/
Questions?
[email protected] (Dewey Editorial Office)
[email protected] (Licensing, group purchases, LIS program)