VIVO Cornell: Lessons from the Field

Kathy Chiang, Jon Corson-Rikert, Elizabeth Hines, Joseph McEnerney
Stella Mitchell, Christopher Westling, Tim Worrall
Abstract
VIVO is:
• an organic approach to reflecting the research at an institution;
• the antithesis of a static set of web pages.
VIVO is not:
• a ‘set and forget’ collection of automated data ingests.
VIVO Cornell engages in a continual process of environmental monitoring, with both major and minor
overhauls, as we strive to deliver high-quality, useful results.
We are:
• Monitoring how Cornell researchers are representing their work on the Web and designing
displays that complement those efforts.
• Developing tools to identify substandard data and assist in their cleanup.
• Scanning the Cornell Web to encourage the reuse of VIVO data and deliver on the promise of
Linked Open Data.
[Figures: Current production VIVO; Revised view: Mockups 1, 2 and 3; Mockup 3: Expanded; Publications management: PubMed, Researcher ID, Google Scholar]
1.
We coordinate and integrate our contribution to
Cornell’s information discovery goals as
researchers’ information practices change and
competing and complementary information
products are introduced. Some researchers pay
close attention to their web presence; others do
not. With VIVO 1.5 we are designing display
options to meet the varying needs of our
researchers. We are looking at how publications
could be managed to meet both individual and
institutional goals.
[Figure: Faculty lab web page]
2.
VIVO Cornell data come from heterogeneous,
overlapping sources reflecting the diversity and
complexity of our institution. In addition to manual
data entry (with all its attendant quality issues), the
Cornell data systems of record deliver duplicate
and contradictory data that must be cleaned and
reconciled.
We have developed a tool that semi-automates
this process. We apply heuristic matching
algorithms to VIVO data to cluster similar names
(of people, journal titles and organizations). The
URI Tool presents those results for manual review
and resolution. We identify, or create from online
sources such as Ulrich’s, an authoritative version
and then merge all the variants into that name.
Distinct journal titles can differ by a single word, and we
have researchers with the same name in different
Colleges, so this manual step is the only reliable way
to clean a dataset of this size.
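The clustering step described in panel 2 can be illustrated with a minimal sketch. This is not the URI Tool's actual algorithm, which the poster does not specify; it assumes a simple string-similarity heuristic from Python's standard library, and the function names, the 0.9 threshold, and the sample titles are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())


def similar(a: str, b: str, threshold: float = 0.9) -> bool:
    """True when two normalized names are close enough to review together."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


def cluster_names(names, threshold: float = 0.9):
    """Greedy clustering: each name joins the first cluster whose
    first member it resembles, otherwise it starts a new cluster."""
    clusters = []
    for name in names:
        for cluster in clusters:
            if similar(name, cluster[0], threshold):
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Made-up sample titles, for illustration only.
journals = [
    "Journal of Dairy Science",
    "Journal of Dairy Science.",
    "Journal of Animal Science",
]
for group in cluster_names(journals):
    print(group)  # each group would go to a human reviewer
```

A looser threshold catches more abbreviations and typos but also starts to group titles that are genuinely different, which is why the output is only a list of candidates for manual review rather than an automatic merge.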
[Figure: Data integrity: URI Tool]
3.
Because we are required to take data from Cornell’s
systems of record, we cannot ‘clean up’ the data in
VIVO; corrections must be made at the source. For example,
several Colleges at Cornell use Activity Insight
from Digital Measures. It is difficult to identify
missing or malformed values using the Activity
Insight administrative interface. We are developing
a tool that presents the information in a format that
College administrators can use to correct the data,
which will then feed into VIVO.
[Figure: Activity Insight (AI) feedback form]
4.
We must regularly monitor and coordinate with our
data providers to keep our processes up to date.
We also pay attention to the continual changes in
the information landscape. In an institution the
size of Cornell it is easy for potentially duplicative
initiatives to emerge. We have taken organizational
approaches to maintaining accurate processes
and minimizing wasteful duplication of effort.
This is a time-consuming process, but essential if
VIVO is to be an accurate representation of
research at Cornell.
[Figure: VIVO data sources and data feeds]
Here is a sample of the groups we meet with:
• The campus-level Activity Insight Users
Group and the Activity Insight implementation
teams at the College level
• The Communications staff of the Vice Provost
for Research and in the Colleges
• The Web Manager for the College of Arts and
Sciences (the College does not use Activity Insight)
• The appropriate staff from our data sources:
Human Resources, the Registrar (for
courses), the Office of Sponsored Programs,
Cornell Cooperative Extension
• The Office of Land Grant Affairs
• Our institutional sponsors
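
The Activity Insight feedback tool described in panel 3 flags missing or malformed values so that College administrators can correct them at the source. The sketch below shows one way such a check might look. It is an assumption-laden illustration, not the tool itself: the field names, the validation patterns, and the sample records are all hypothetical, since the poster does not describe the Activity Insight export format.

```python
import re
from collections import defaultdict

# Hypothetical validation rules; real field names and formats are not
# given in the poster and would come from the Activity Insight export.
RULES = {
    "netid": re.compile(r"^[a-z]{2,3}\d+$"),     # assumed: letters then digits
    "pub_year": re.compile(r"^(19|20)\d{2}$"),   # assumed: four-digit year
    "title": re.compile(r"\S"),                  # any non-blank text
}


def check_record(record: dict) -> list:
    """Return a list of problems found in one record (empty if clean)."""
    problems = []
    for field, pattern in RULES.items():
        value = (record.get(field) or "").strip()
        if not value:
            problems.append(f"{field}: missing")
        elif not pattern.search(value):
            problems.append(f"{field}: malformed ({value!r})")
    return problems


def build_report(records) -> dict:
    """Group problem records by College so each administrator sees their own."""
    report = defaultdict(list)
    for record in records:
        problems = check_record(record)
        if problems:
            report[record.get("college", "Unknown")].append((record.get("id"), problems))
    return dict(report)


# Made-up sample records, for illustration only.
records = [
    {"id": 1, "college": "College A", "netid": "abc12", "pub_year": "2011", "title": "Soil health"},
    {"id": 2, "college": "College A", "netid": "", "pub_year": "211", "title": "Crop yields"},
    {"id": 3, "college": "College B", "netid": "xy9", "pub_year": "2010", "title": "  "},
]
for college, items in build_report(records).items():
    print(college, items)
```

Grouping the problems by College mirrors the workflow in panel 3: the corrections are made in the source system by the people responsible for it, and the cleaned data then flow into VIVO through the regular feed.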