C2B2 retreat presentation - Programming Systems Lab


genSpace: Community-Driven Knowledge Sharing
for Biological Scientists
Gail Kaiser’s Programming Systems Lab
Columbia University
Computer Science
Introduction

- Scientists collaborating in the same lab on the same project share:
  - Data: specimens, samples, materials, analyses
  - Tools: instruments, software, hardware
  - Knowledge: open discussion, whiteboard
- However, there are temporal (time) and physical (space) constraints
- This model does not scale to communities of scientists who work on different projects but could learn from each other’s expertise, experience, etc.

CSCW Approaches
- Most current-generation Computer-Supported Cooperative Work (CSCW) systems enable data sharing and/or tool sharing (e.g., PNNL Collaboratories, UIUC BioCoRE)
- However, these systems support relatively limited knowledge sharing
  - how/when/where/why to use tools and data
- Knowledge sharing is partially enabled through labor-intensive approaches (publications, email lists, wikis, chat, shared displays, etc.), which may be outdated and require active participation
- We seek to enable automatic knowledge sharing without requiring “extra work” by scientists

Social Networking Metaphor

- Some online social networking is a form of CSCW that is potentially enjoyable and profitable but requires “extra work”, with dynamism limited by explicit user participation
  - Facebook, MySpace, LinkedIn, Twitter, etc.
- Other social networking automatically records, aggregates, data mines and disseminates what people do online in an enjoyable and profitable fashion, with no “extra work” required
  - Collaborative filtering: “people like you …”

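The “people like you …” idea can be illustrated with a minimal user-based collaborative filtering sketch. Everything in it is an assumption made for illustration: the user ids, the tool names, and the choice of Jaccard similarity are hypothetical and are not taken from genSpace or any particular site.

```python
def jaccard(a, b):
    """Similarity of two tool sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(usage, user, top_n=3):
    """Suggest tools that users similar to `user` have run but `user` has not.

    `usage` maps a user id to the set of tools that user has run.
    Each candidate tool is scored by the summed similarity of the users who ran it.
    """
    mine = usage[user]
    scores = {}
    for other, tools in usage.items():
        if other == user:
            continue
        sim = jaccard(mine, tools)
        if sim == 0.0:
            continue  # no overlap: this user is not "like you"
        for tool in tools - mine:
            scores[tool] = scores.get(tool, 0.0) + sim
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_n]


# Hypothetical usage data (user ids and tool names are made up).
usage = {
    "alice": {"normalize", "cluster"},
    "bob": {"normalize", "cluster", "heatmap"},
    "carol": {"normalize", "annotate"},
    "dave": {"align"},
}
print(recommend(usage, "alice"))
```

No rating or tagging step is needed here: the recommendation is derived only from what each user has already done, which is the “no extra work” property the slide describes.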
genSpace

- We combine implicit and explicit social networking (and collective intelligence) concepts in our approach to knowledge sharing
- Prototype implemented as a set of plugins for geWorkbench, MAGNet’s platform for analysis and visualization tools for integrated genomics
- Records, aggregates, data mines and disseminates geWorkbench users’ activities with tools and tool sequences (workflows)
- Users can opt in or opt out

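As a rough illustration of recording tool activity only for users who have opted in, and then grouping that activity into tool sequences, consider the sketch below. The `ToolEvent` and `ActivityLog` names, their fields, and the tool names are hypothetical; this is not the genSpace plugin code (which is built on the Java-based geWorkbench), only a compact Python model of the idea.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolEvent:
    user: str
    session: str
    tool: str
    timestamp: float


class ActivityLog:
    """Records tool invocations, but only for users who have opted in."""

    def __init__(self):
        self._opted_in = set()
        self._events = []

    def opt_in(self, user):
        self._opted_in.add(user)

    def opt_out(self, user):
        # Assumption: opting out stops future recording; past events are kept.
        self._opted_in.discard(user)

    def record(self, user, session, tool, timestamp):
        """Store the event and return True, or drop it and return False."""
        if user not in self._opted_in:
            return False
        self._events.append(ToolEvent(user, session, tool, timestamp))
        return True

    def workflows(self):
        """Group events by (user, session) into time-ordered tool sequences."""
        grouped = defaultdict(list)
        for event in sorted(self._events, key=lambda e: e.timestamp):
            grouped[(event.user, event.session)].append(event.tool)
        return dict(grouped)


log = ActivityLog()
log.opt_in("alice")
log.record("alice", "s1", "normalize", 1.0)
log.record("alice", "s1", "cluster", 2.0)
dropped = log.record("bob", "s2", "align", 3.0)  # bob has not opted in
print(log.workflows())
```

Whether opting out should also purge previously recorded events is a policy decision that this sketch does not settle.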
geWorkbench – A platform for
Integrated Genomics
- Integrated genomics analysis application
  - Support for gene expression data, sequences, pathways, structure.
  - 50+ visualization and analysis modules.
  - Access to local and remote data sources and analytical services.
  - Integration with biological annotation sources.
- Development platform
  - Open source, Java-based.
  - Component architecture, facilitating customization.
- www.geworkbench.org

Questions genSpace Can Answer
- What do I do first?
- Which tools work well together?
- Where does this tool fit in a typical workflow?
- Who do I know who also uses this tool?
- How do I get help (from an expert who is online right now)?
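The first three questions can be approximated with simple counts over logged workflows. The sketch below assumes each workflow is available as an ordered list of tool names; the function names and sample data are hypothetical, and the actual genSpace data mining may work differently.

```python
from collections import Counter


def first_tools(workflows):
    """Count how often each tool starts a workflow ("What do I do first?")."""
    return Counter(wf[0] for wf in workflows if wf)


def next_tools(workflows, tool):
    """Count which tools directly follow `tool` ("Which tools work well together?")."""
    counts = Counter()
    for wf in workflows:
        for current, following in zip(wf, wf[1:]):
            if current == tool:
                counts[following] += 1
    return counts


def typical_position(workflows, tool):
    """Average zero-based position of `tool` across workflows, or None if unused."""
    positions = [i for wf in workflows for i, t in enumerate(wf) if t == tool]
    return sum(positions) / len(positions) if positions else None


# Hypothetical logged workflows (tool names are made up).
workflows = [
    ["normalize", "cluster", "heatmap"],
    ["normalize", "annotate"],
    ["normalize", "cluster"],
    ["align", "annotate"],
]
print(first_tools(workflows).most_common(1))
print(next_tools(workflows, "normalize").most_common())
print(typical_position(workflows, "cluster"))
```

The last two questions (who else uses a tool, and which expert is online now) depend on user identity and presence information, which this sketch leaves out.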

Contributions

- We investigate an approach to collaborative knowledge sharing that is based on data mining and social networking and requires little or no “extra work” by scientists
- We have developed a prototype implementation, genSpace, built on the geWorkbench platform
- Logging and data mining of geWorkbench user activities, together with tool/workflow recommendation and visualization, are already included in the local prerelease repository
- Planned for next external release

Future Work

- More precise monitoring: specific analysis parameters and options, visualization activities
- Privacy and confidentiality: leverage collaborative networks to restrict dissemination
- Address “concept drift” as user participation, tool/workflow usage, and privacy settings change
- Scaling up to hundreds of users and hundreds of thousands of logs: caching at client and server, incremental update, offline access
- genSpace APIs enabling easy port to other tool integration frameworks beyond geWorkbench

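One way to picture the “incremental update” item is an aggregate that folds in only the log entries added since the last pass instead of rescanning the whole log. The sketch below is an assumption about how such an update could look, not a description of planned genSpace code; the `TransitionCounts` class and the append-only list of workflows are hypothetical.

```python
from collections import Counter


class TransitionCounts:
    """Keeps tool-to-tool transition counts up to date as new logs arrive."""

    def __init__(self):
        self.counts = Counter()
        self.processed = 0  # number of log entries already folded in

    def update(self, log):
        """Fold in only the entries of `log` added since the last call.

        `log` is an append-only list of workflows (lists of tool names).
        Returns the number of newly processed workflows.
        """
        new_entries = log[self.processed:]
        for wf in new_entries:
            self.counts.update(zip(wf, wf[1:]))
        self.processed = len(log)
        return len(new_entries)


log = [["normalize", "cluster"]]
agg = TransitionCounts()
first_pass = agg.update(log)

log.append(["normalize", "cluster", "heatmap"])
second_pass = agg.update(log)  # only the appended workflow is scanned
print(agg.counts)
```

The same bookkeeping (a position marker plus a running aggregate) could sit behind a client-side or server-side cache, so that cached results are refreshed without recomputing from scratch.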
Integration with publication “tagging” in Ken Ross lab

- Ross Lab
  - Semantic Ranking and Result Visualization for PubMed Search
  - Social Network Aware Search in Collaborative Tagging Sites
- 2 posters & demo (Julia Stoyanovich)

genSpace:
Community-Driven Knowledge Sharing for the
Discovery and Visualization of Workflows in
geWorkbench
Gail Kaiser
www.psl.cs.columbia.edu/genspace/