Use of Python for VIVO Application Programming

Why Python?
Making Reports
Adding Papers to VIVO
Python [1,2] is a popular, easy-to-learn language that is well
suited for use with VIVO and the semantic web. Python is available
for Mac, Windows, and Linux, and has simple procedural syntax and
clear syntax for object-oriented development. Python is trivial
to install, and standard installations include an integrated
development environment, IDLE.
To add papers to VIVO, Python scripts read BibTeX, use SPARQL
calls to find available VIVO URIs, and use templates to generate
RDF. UF authors are identified and papers are linked to their
profiles. Journals, authors, and publishers are created if
needed. Python string functions improve and standardize text.
Reports summarize the actions taken.
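The text clean-up and template steps for papers can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the UF scripts: `string.Template` stands in for a template library such as Tempita, the `standardize_title` and `paper_rdf` helpers, the `example.org` URIs, and the `publishedIn` predicate are hypothetical, and Python 3 is assumed.

```python
import re
from string import Template

# Hypothetical N-Triples template. The publishedIn predicate is a
# placeholder, not an actual VIVO ontology term.
PAPER_TEMPLATE = Template(
    '<$paper_uri> <http://www.w3.org/2000/01/rdf-schema#label> "$title" .\n'
    '<$paper_uri> <http://example.org/ontology#publishedIn> <$journal_uri> .\n'
)


def standardize_title(raw):
    """Strip BibTeX braces, collapse whitespace, and drop a trailing period."""
    text = raw.replace("{", "").replace("}", "")
    text = re.sub(r"\s+", " ", text).strip()
    return text.rstrip(".")


def paper_rdf(paper_uri, raw_title, journal_uri):
    """Fill the template with a cleaned, escaped title."""
    title = standardize_title(raw_title)
    # Escape backslashes and double quotes so the literal stays valid.
    title = title.replace("\\", "\\\\").replace('"', '\\"')
    return PAPER_TEMPLATE.substitute(
        paper_uri=paper_uri, title=title, journal_uri=journal_uri
    )


rdf = paper_rdf(
    "http://example.org/individual/n1",
    "  {A}  Sample   Title. ",
    "http://example.org/individual/n2",
)
print(rdf)
```

In a real ingest the paper and journal URIs would come from SPARQL look-ups rather than being passed in by hand.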
Python is open source and has a strong development
community. Many libraries are included in the standard
distribution, and many more are available through
standard Python archives. Installing an additional library
typically requires a single command.
Python has a very short learning curve. Experienced
programmers can install Python and write programs on their
first day.
Python has outstanding support for data structures, the
Internet, exception handling, XML, string manipulation, CSV
files, and interaction with other systems.
Python is very fast to compile and execute. A 200,000-line
Excel CSV file can be read and summarized in a few seconds.
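Reading and summarizing a CSV file takes only a few lines with the standard `csv` module. The sketch below assumes Python 3; the `summarize_column` helper, the column names, and the sample rows are hypothetical and stand in for a real export.

```python
import csv
import io
from collections import Counter


def summarize_column(csv_file, column):
    """Count how often each value appears in one column of a CSV file."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[row[column]] += 1
    return counts


# Hypothetical sample data. A real run would pass an open file instead,
# for example open("people.csv", newline="").
sample = io.StringIO(
    "name,department\n"
    "Person A,Biology\n"
    "Person B,Chemistry\n"
    "Person C,Biology\n"
)
summary = summarize_column(sample, "department")
for department, count in summary.most_common():
    print(department, count)
```

Because `DictReader` streams one row at a time, the same function works on large files without loading them into memory.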
Python and VIVO
Making Web Pages
Simple Python functions can make SPARQL queries, and
template libraries can be used to make RDF. Python associative
arrays (dictionaries) can store data from VIVO and provide
extremely efficient look-up. A single query can return all
people in VIVO, who can then be placed in a dictionary.
Subsequent code can refer to the dictionary without having to
make additional queries. Python code can quickly retrieve RDF,
parse it, find additional URIs, and retrieve additional RDF,
thereby following semantic web graphs and identifying data
properties and values.
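The query-once, look-up-many pattern can be sketched as follows. This is an illustration under assumptions rather than the UF code: it assumes Python 3, a SPARQL endpoint that accepts a `query` parameter and returns the W3C SPARQL JSON results format, and a hypothetical endpoint address, query, and sample response. No network call is made here.

```python
import json
import urllib.parse

# Hypothetical query: every foaf:Person with its rdfs:label.
QUERY = """
SELECT ?uri ?name WHERE {
  ?uri a <http://xmlns.com/foaf/0.1/Person> ;
       <http://www.w3.org/2000/01/rdf-schema#label> ?name .
}
"""


def query_url(endpoint, query):
    """Build a GET URL for a SPARQL endpoint (the address is an assumption)."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def people_by_name(results_json):
    """Map each person's name to a URI from SPARQL JSON results."""
    data = json.loads(results_json)
    return {
        binding["name"]["value"]: binding["uri"]["value"]
        for binding in data["results"]["bindings"]
    }


# Hypothetical response standing in for what an endpoint would return.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["uri", "name"]},
    "results": {"bindings": [
        {"uri": {"type": "uri", "value": "http://example.org/individual/n101"},
         "name": {"type": "literal", "value": "Doe, Jane"}},
        {"uri": {"type": "uri", "value": "http://example.org/individual/n102"},
         "name": {"type": "literal", "value": "Roe, Richard"}},
    ]},
})

people = people_by_name(SAMPLE_RESPONSE)
# Later code consults the dictionary instead of querying again.
print(people.get("Doe, Jane"))
print(people.get("Nobody, Known"))
```

With a live endpoint, the response text would be fetched from `query_url(...)` with `urllib.request.urlopen` and passed to `people_by_name` unchanged.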
At the University of Florida, Python is used to report from VIVO
logs, generate web pages, and create RDF for ingest of people,
papers, and positions held. The techniques demonstrated here
can be used to ingest and report on any data in VIVO. Code
from these examples will be available at VIVO repositories.
Mike Conlon, Nicholas Rejack and Laura Guazzelli
UF Clinical and Translational Science Institute, University of Florida
Adding People to VIVO
From a spreadsheet, RDF can be generated by Python to
add people to VIVO, linking them to their home
department. Once people are in VIVO and identified by UFID,
subsequent scripts can attach grants, papers, photos,
courses taught, and positions held.
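Turning spreadsheet rows into RDF can be sketched as below. The sketch assumes Python 3 and a CSV export of the spreadsheet; the column names, the `example.org` URI scheme, and the `ufid` and `homeDepartment` predicates are hypothetical placeholders, not the terms used at UF.

```python
import csv
import io

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
# Hypothetical predicates, used here only for illustration.
EX_UFID = "http://example.org/ontology#ufid"
EX_HOME_DEPT = "http://example.org/ontology#homeDepartment"


def literal(text):
    """Quote a string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"%s"' % escaped


def person_triples(row, dept_uris, base="http://example.org/individual/"):
    """Return N-Triples lines for one spreadsheet row."""
    person = "<%sufid%s>" % (base, row["ufid"])  # hypothetical URI scheme
    triples = [
        "%s <%s> <%s> ." % (person, RDF_TYPE, FOAF_PERSON),
        "%s <%s> %s ." % (person, RDFS_LABEL, literal(row["name"])),
        "%s <%s> %s ." % (person, EX_UFID, literal(row["ufid"])),
    ]
    # Link to the home department only when its URI is already known.
    dept_uri = dept_uris.get(row["department"])
    if dept_uri:
        triples.append("%s <%s> <%s> ." % (person, EX_HOME_DEPT, dept_uri))
    return triples


# Hypothetical spreadsheet export and department look-up table.
SHEET = io.StringIO('ufid,name,department\n12345678,"Doe, Jane",Biology\n')
DEPT_URIS = {"Biology": "http://example.org/individual/n200"}

lines = []
for row in csv.DictReader(SHEET):
    lines.extend(person_triples(row, DEPT_URIS))
print("\n".join(lines))
```

The department dictionary plays the same role as the people dictionary built from a single SPARQL query: one look-up table, consulted once per row.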
See the UF Implementation Poster [3] for additional information
on processes used at UF to generate VIVO data to represent
scholarship at UF.
Useful Libraries
Some useful Python libraries for use with VIVO:
pybtex – read BibTeX files into Python structures
tempita – simple, flexible templates
csv – read and write CSV files
minidom (xml.dom.minidom) – read, manage, write XML data
re – regular expressions in Python
datetime – ISO standard datetime processing
pyRTF – make RTF documents
urllib – create URLs, fetch web content
Entrez (Bio.Entrez, part of Biopython) – query, read, process PubMed files
rdflib – tools for working with RDF
vivotools – UF tools for SPARQL query, generate VIVO URIs
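As one illustration of the standard-library entries in this list, `xml.dom.minidom` can pull linked URIs out of RDF/XML, which is the first step in following a graph from one resource to the next. This is a minimal sketch assuming Python 3; the `linked_uris` helper, the sample document, and its `ex:` predicates are hypothetical.

```python
from xml.dom import minidom

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def linked_uris(rdf_xml):
    """Return the URIs named in rdf:resource attributes, in document order."""
    doc = minidom.parseString(rdf_xml)
    uris = []
    for node in doc.getElementsByTagName("*"):
        if node.hasAttributeNS(RDF_NS, "resource"):
            uris.append(node.getAttributeNS(RDF_NS, "resource"))
    return uris


# Hypothetical RDF/XML standing in for a document retrieved from VIVO.
SAMPLE_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ex="http://example.org/ontology#">
  <rdf:Description rdf:about="http://example.org/individual/n101">
    <ex:homeDepartment rdf:resource="http://example.org/individual/n200"/>
    <ex:authorOf rdf:resource="http://example.org/individual/n300"/>
  </rdf:Description>
</rdf:RDF>"""

for uri in linked_uris(SAMPLE_RDF):
    print(uri)
```

Each URI returned could then be fetched and parsed in turn. A dedicated RDF library such as rdflib handles more serializations and is the more robust choice for anything beyond simple cases.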
Getting Started
Visit www.python.org, download Python, and click to install.
Get a good, quick-read Python book. Spend a day writing code.
Spend a day studying code examples. Write something simple and
make it work. Write something more sophisticated. Ask
questions. Use Google to find Python examples and additional
libraries. Use libraries to build on existing functions.
The UF code examples use Python 2.7.3. We use Python 2.7.3
because it is supported by Google App Engine [4]. Using Google
App Engine, you can create online Python web sites and
applications using Google infrastructure at no cost.
References
1. Python Home Page. www.python.org. Accessed 8/16/2012.
2. Ceder, VL. The Quick Python Book, 2nd ed. Greenwich, CT:
Manning, 2010, 336 pp. ISBN 9781935182207.
3. Conlon, M, Barnes, CP, Sposato, V, Rejack, N, Schmidt, E,
Collante, W, Guazzelli, L, Williams, S, Raum, N. Implementation
of VIVO at the University of Florida. Conference Poster, VIVO
2012, Miami, FL.
4. Google App Engine Home Page.
https://appengine.google.com. Accessed 8/16/2012.