Macromolecular Structure Database group


EMBL-EBI
MSD-mine
MSD-mine overview
 Web application for online data analysis and mining
For the advanced MSDSD researcher
Interactive ad-hoc queries
Exploitation of integrated knowledge
Analysis, charts and data drill
 Combining information through multiple joins
 Generic, but customised for the MSDSD
Characteristics
 Not an overview visualisation of hits from predefined queries
 Online analysis of homogenised data
 Arbitrary queries on
100 entities (tables) in 9 sections (marts)
Restrictions and results over 2000 attributes
Entities combined through 450 relations
 Operability safeguards
Rejects long-running queries and oversized result sets
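The result-overload safeguard could be sketched as follows. This is a minimal Python illustration that uses the standard `sqlite3` module as a stand-in database. The names `run_capped` and `TooManyRows`, the `demo` table and the row limits are hypothetical and are not taken from MSD-mine.

```python
import sqlite3


class TooManyRows(Exception):
    """Raised when a query would return more rows than the cap allows."""


def run_capped(conn, sql, params=(), max_rows=1000):
    """Run a query, rejecting it if it yields more than max_rows rows."""
    cursor = conn.execute(sql, params)
    # Fetch one row beyond the cap so an overflow can be detected
    # without loading the complete result set.
    rows = cursor.fetchmany(max_rows + 1)
    if len(rows) > max_rows:
        raise TooManyRows(f"query returns more than {max_rows} rows")
    return rows


# Invented sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (n INTEGER)")
conn.executemany("INSERT INTO demo VALUES (?)", [(i,) for i in range(5)])

small = run_capped(conn, "SELECT n FROM demo ORDER BY n", max_rows=10)
try:
    run_capped(conn, "SELECT n FROM demo", max_rows=3)
    rejected = False
except TooManyRows:
    rejected = True
```

Rejecting long-running queries would need a separate mechanism, such as a server-side time limit, which this sketch does not cover.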
Exploring MSDSD
 Explores and explains MSDSD
With context sensitive help and descriptions
With links to MSDSD documentation
 Helps to understand the structure of MSDSD
 Helps in learning to write SQL for advanced custom queries
Filter build page
 Page areas
Entities (entities and relations)
Restrictions
Filter (entities joined)
Description (context-sensitive)
MSDSD marts
 MSDSD is organised in sections (marts)
 A mart is a closely related set of tables
Click to expand & use
Click for documentation
Use in your query
Define Restrictions
 Select the attribute
 Choose the operator
 Type in the value or select one from a sample list
 Add the new
restriction
value
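The attribute, operator and value steps above map naturally onto a parameterised SQL condition. The sketch below is one possible way to express that in Python. The function `build_restriction`, the operator list and the attribute names are assumptions made for illustration and do not describe how MSD-mine builds its filters.

```python
ALLOWED_OPERATORS = {"=", "<>", "<", "<=", ">", ">=", "LIKE"}


def build_restriction(attribute, operator, value, known_attributes):
    """Turn an (attribute, operator, value) choice into a SQL condition.

    Returns the condition text with a "?" placeholder plus its parameters,
    so the value is never pasted into the SQL string.
    """
    if attribute not in known_attributes:
        raise ValueError(f"unknown attribute: {attribute}")
    if operator not in ALLOWED_OPERATORS:
        raise ValueError(f"unsupported operator: {operator}")
    return f"{attribute} {operator} ?", (value,)


# Hypothetical attribute names for an entry table.
condition, params = build_restriction(
    "resolution", "<", 1.2, known_attributes={"resolution", "title"}
)
print(condition, params)
```

Checking the attribute and operator against known lists keeps the generated text predictable, while the placeholder leaves value quoting to the database driver.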
Combine entities
 Using one of its relations
 Relations are organised per mart
 Understand cardinality
 Choose the working node and follow its relations
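Why cardinality matters when combining entities can be seen in a small join. The sketch below uses Python's `sqlite3` module with two invented tables, `entry` and `chain`, linked one-to-many. The table and column names are hypothetical and are not the real MSDSD schema.

```python
import sqlite3

# Invented one-to-many example: one entry can have several chains.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entry (entry_id TEXT PRIMARY KEY);
CREATE TABLE chain (chain_id TEXT, entry_id TEXT REFERENCES entry(entry_id));
INSERT INTO entry VALUES ('e001'), ('e002');
INSERT INTO chain VALUES ('A', 'e001'), ('B', 'e001'), ('A', 'e002');
""")

# Following the relation produces one row per chain, not one per entry.
joined_rows = conn.execute(
    "SELECT COUNT(*) FROM entry JOIN chain ON chain.entry_id = entry.entry_id"
).fetchone()[0]

# Counting distinct entries recovers the number of entries involved.
distinct_entries = conn.execute(
    "SELECT COUNT(DISTINCT entry.entry_id) "
    "FROM entry JOIN chain ON chain.entry_id = entry.entry_id"
).fetchone()[0]

print(joined_rows, distinct_entries)
```

Statistics computed over the joined rows therefore weight entries by their number of chains, unless the analysis accounts for the one-to-many relation.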
MSDSD preferences
 Constraint shortcuts
 Important for correct
analysis
All/Representative assembly
Asymmetric unit
All/Representative model
One chain per sequence
All entries
SCOP or DALI entries
Custom set of entries
Execute query
 View-Navigate results
 Load all records
 Result based constraints
 View details
 Relation links
 Export: Text-XML-script
Data analysis
 Complete or Sample
 Range or Value
 Fully customisable
 Context-sensitive chart
 Data drill operations
Analysis over a base attribute
 Choose base attribute
 Choose grouping operation for analysis attribute
 Options and data-drill operations supported
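An analysis over a base attribute corresponds roughly to a SQL `GROUP BY` with an aggregate on the analysis attribute. The following Python sketch shows that idea under stated assumptions. The `analyse` helper, the list of grouping operations and the `assembly` table with its columns are all hypothetical.

```python
import sqlite3

GROUPING_OPERATIONS = {"COUNT", "AVG", "MIN", "MAX", "SUM"}


def analyse(conn, table, base_attribute, analysis_attribute, operation):
    """Group rows by the base attribute and aggregate the analysis attribute.

    Identifiers are interpolated into the SQL text, so they are assumed to
    come from a trusted schema listing rather than from free user input.
    """
    if operation not in GROUPING_OPERATIONS:
        raise ValueError(f"unsupported grouping operation: {operation}")
    sql = (
        f"SELECT {base_attribute}, {operation}({analysis_attribute}) "
        f"FROM {table} GROUP BY {base_attribute}"
    )
    return dict(conn.execute(sql))


# Invented sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assembly (assembly_type TEXT, chain_count INTEGER)")
conn.executemany(
    "INSERT INTO assembly VALUES (?, ?)",
    [("DIMERIC", 2), ("DIMERIC", 4), ("MONOMERIC", 1)],
)

average_chains = analyse(conn, "assembly", "assembly_type", "chain_count", "AVG")
largest_chains = analyse(conn, "assembly", "assembly_type", "chain_count", "MAX")
```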
Basic example
 Find the entries with resolution < 1.2
 Select the “Structure”
mart
 Choose the Entry table
 Set restriction on
resolution
 Browse the
results
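Written as SQL, the basic example amounts to a single restriction on one table. The sketch below runs such a query from Python against an in-memory `sqlite3` database. The `entry` table, its columns and the sample rows are invented for illustration, and the real MSDSD names may differ.

```python
import sqlite3

# Hypothetical stand-in for the Entry table of the "Structure" mart.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (entry_id TEXT PRIMARY KEY, title TEXT, resolution REAL)"
)
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?)",
    [
        ("e001", "HIGH RESOLUTION EXAMPLE", 0.95),
        ("e002", "MEDIUM RESOLUTION EXAMPLE", 1.80),
        ("e003", "ANOTHER HIGH RESOLUTION EXAMPLE", 1.10),
    ],
)


def entries_below(conn, max_resolution):
    """Return entry ids whose resolution is strictly below max_resolution."""
    rows = conn.execute(
        "SELECT entry_id FROM entry WHERE resolution < ? ORDER BY entry_id",
        (max_resolution,),
    )
    return [entry_id for (entry_id,) in rows]


high_res = entries_below(conn, 1.2)
print(high_res)
```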
Filter Expressions
 Entries with resolution<1.2 related to
HEMOGLOBIN
 Add restriction on
resolution
 “Or” sub-expression
 Title contains the word
“HEMO” or “HAEMO” or “GLOBIN”
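The filter expression above combines one restriction with an “or” sub-expression, which in SQL becomes an `AND` with a parenthesised group of `OR` conditions. A minimal Python sketch follows, assuming a hypothetical `entry` table with `title` and `resolution` columns and invented sample rows.

```python
import sqlite3

# Invented sample data; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (entry_id TEXT, title TEXT, resolution REAL)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?)",
    [
        ("e001", "DEOXY HAEMOGLOBIN", 1.0),
        ("e002", "HEMOGLOBIN VARIANT", 2.1),
        ("e003", "LYSOZYME", 0.9),
        ("e004", "MYOGLOBIN MUTANT", 1.15),
    ],
)

# The parentheses keep the three title alternatives together, so the
# resolution restriction applies to every one of them.
FILTER_SQL = """
SELECT entry_id FROM entry
WHERE resolution < :max_res
  AND (title LIKE '%HEMO%' OR title LIKE '%HAEMO%' OR title LIKE '%GLOBIN%')
ORDER BY entry_id
"""

matches = [row[0] for row in conn.execute(FILTER_SQL, {"max_res": 1.2})]
print(matches)
```

Without the parentheses, `AND` would bind more tightly than `OR`, and the resolution restriction would only apply to the first title condition.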
Simple distribution chart
 Find the distribution of assembly types
 Use table “Assembly”
 Execute the query
 Analysis for the attribute
“Assembly type”
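The data behind such a distribution chart is a count per distinct value. The sketch below computes it in Python with `sqlite3`. The `assembly` table, the `assembly_type` column and the sample values are assumptions for illustration only.

```python
import sqlite3

# Invented sample data; the real Assembly table may look different.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assembly (assembly_id INTEGER, assembly_type TEXT)")
conn.executemany(
    "INSERT INTO assembly VALUES (?, ?)",
    [
        (1, "MONOMERIC"),
        (2, "DIMERIC"),
        (3, "DIMERIC"),
        (4, "TETRAMERIC"),
        (5, "DIMERIC"),
    ],
)

# One (value, count) pair per assembly type, collected into a dictionary.
distribution = dict(
    conn.execute(
        "SELECT assembly_type, COUNT(*) FROM assembly GROUP BY assembly_type"
    )
)
print(distribution)
```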
Relations - external links
 Entries related to “cell death”: follow their GO mappings
 “Entries” where the title contains the word “death”
 GO mappings for an entry
 Links to the GO database
A more complex example
 Linearity of helices that are part of beta-alpha-beta motifs and have active site contacts
 Start with “Motif” table
 Combine with “Helix”
and “Residue Contacts”
 Add a restriction
 View results and
statistics for the
helix linearity
 Focus (drill) on
an area of interest
Saving results and exporting
 Binding sites of “kinked”
residues
 Combining “Residue”,
“Helix” and “Site”
 Save the results to a local file
 Export the results as XML, as text or as a script
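As a rough idea of what text and XML exports of a result set might contain, the sketch below serialises a few rows with Python's standard library. The element names `results` and `record`, the column names and the helper functions are hypothetical. The actual MSD-mine export formats are not described here.

```python
import xml.etree.ElementTree as ET


def rows_to_text(columns, rows):
    """Serialise rows as tab-separated text with a header line."""
    lines = ["\t".join(columns)]
    lines += ["\t".join(str(value) for value in row) for row in rows]
    return "\n".join(lines)


def rows_to_xml(columns, rows):
    """Serialise rows as XML, one <record> element per row."""
    root = ET.Element("results")
    for row in rows:
        record = ET.SubElement(root, "record")
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Invented result rows, for illustration only.
columns = ["residue_id", "helix_id"]
rows = [(1, 7), (2, 7)]

text_export = rows_to_text(columns, rows)
xml_export = rows_to_xml(columns, rows)
```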
Preferences - representative sets
 Find the distribution of
number of crystals in
experiments
 Use the “XRay-data”
table
 View the distribution of
number of crystals
 For the whole PDB
 For the DALI set
 For a custom
representative set
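The effect of a representative set can be sketched by computing the same distribution with and without a restriction on the entries. The Python example below uses an invented `xray_data` table. The column names, the sample rows and the `crystal_distribution` helper are assumptions, and the custom set is just an arbitrary list of entry ids.

```python
import sqlite3

# Invented sample data; names do not reflect the real "XRay-data" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xray_data (entry_id TEXT, crystal_count INTEGER)")
conn.executemany(
    "INSERT INTO xray_data VALUES (?, ?)",
    [("e001", 1), ("e002", 1), ("e003", 2), ("e004", 3)],
)


def crystal_distribution(conn, entry_ids=None):
    """Count entries per number of crystals, optionally within a set of entries."""
    sql = "SELECT crystal_count, COUNT(*) FROM xray_data"
    params = ()
    if entry_ids is not None:
        ids = sorted(entry_ids)
        placeholders = ", ".join("?" for _ in ids)
        sql += f" WHERE entry_id IN ({placeholders})"
        params = tuple(ids)
    sql += " GROUP BY crystal_count"
    return dict(conn.execute(sql, params))


whole_set = crystal_distribution(conn)
custom_set = crystal_distribution(conn, {"e001", "e003"})
```

Comparing the two dictionaries shows how the choice of entry set changes the distribution, which is why the preference matters for correct analysis.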
Custom filters and results
 Percentage of residues that interact in helix interactions, for helices of similar size
 “Helix interaction” table
 Custom “normalised
interaction factor”
result item
 Custom restriction “one helix is at most double the size of the other”
 View the distribution of
the “interaction factor”
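A custom result item and a custom restriction can both be written as SQL expressions. The sketch below is one possible reading of this example in Python with `sqlite3`. The `helix_interaction` table, its columns and the sample rows are invented, and the formula used for the interaction factor (interacting residues as a percentage of the combined helix size) is an assumption, since the slides do not define it.

```python
import sqlite3

# Invented sample data; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE helix_interaction ("
    "interaction_id INTEGER, helix1_size INTEGER, "
    "helix2_size INTEGER, interacting_residues INTEGER)"
)
conn.executemany(
    "INSERT INTO helix_interaction VALUES (?, ?, ?, ?)",
    [(1, 10, 12, 11), (2, 8, 30, 5), (3, 20, 40, 15)],
)

# Custom result item: an assumed "interaction factor" in percent.
# Custom restriction: the larger helix is at most twice the smaller one.
CUSTOM_SQL = """
SELECT interaction_id,
       100.0 * interacting_residues / (helix1_size + helix2_size)
           AS interaction_factor
FROM helix_interaction
WHERE MAX(helix1_size, helix2_size) <= 2 * MIN(helix1_size, helix2_size)
ORDER BY interaction_id
"""

factors = conn.execute(CUSTOM_SQL).fetchall()
print(factors)
```

The second row is filtered out because one helix (30 residues) is more than double the size of the other (8 residues).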