Introduction to Bioinformatics 1

Overview of Bioinformatics 1
Module
Denis Manley.
Contact Details
• Lecturer Name: Denis Manley
• Room number: KE-1-013a
• Email : [email protected]
• Website: www.comp.dit.ie/dmanley
• Phone: 01 402 4949
What is bioinformatics?
• Bioinformatics is the use of computers and
computational methods to analyse large sets of
molecular biological data. It is used for:
– The investigation of living organisms and their
evolution.
– The discovery of genes, gene regulation, genetic
networks and protein functionality, which can be used
to understand human disease, human development
(conception to adulthood), etc.
– Applying these results to improve our understanding
of diseases such as cystic fibrosis and to suggest
therapies and cures, for example through drug
development or viral therapy.
Reading DNA novels: “bioinformatics”
• Analysing large sets of data is “equivalent” to
reading and understanding a book
(computational linguistics). The syntax:
– Reading involves looking at letters [including
spaces and punctuation] to determine the words.
Bioinformatics applied to DNA reads 4 letters
(A, T, G and C) and can help determine the
location of genes or other important elements of
DNA (which correspond to words).
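As a rough illustration of "reading the letters", the sketch below counts the four DNA letters in a sequence and lists where a short "word" starts. It is written in Python purely for illustration (the module itself uses Perl). The function names and the example sequence are made up for this sketch, and ATG is used only as an example motif.

```python
from collections import Counter

def base_counts(dna):
    """Count how often each of the four DNA letters occurs."""
    counts = Counter(dna.upper())
    return {base: counts.get(base, 0) for base in "ATGC"}

def find_motif(dna, motif="ATG"):
    """Return every 0-based position where the motif starts (overlaps allowed)."""
    dna = dna.upper()
    positions = []
    start = dna.find(motif)
    while start != -1:
        positions.append(start)
        start = dna.find(motif, start + 1)
    return positions

sequence = "CCATGGCATGAAATGC"  # made-up example sequence
print(base_counts(sequence))   # {'A': 5, 'T': 3, 'G': 4, 'C': 4}
print(find_motif(sequence))    # [2, 7, 12]
```

Finding a motif is only the first step: a real gene finder applies many more rules than a single three-letter match.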
Reading DNA novels: “bioinformatics”
– The next step in reading involves determining
whether the words are nouns, verbs, adverbs, etc. In
general there are rules: “what are they?”
– Bioinformatics involves determining what the
“important elements” correspond to, e.g. genes,
gene promoters.
– However, the rules for identifying “genes”
and other elements differ from those of a natural
language and, more importantly, are sometimes
revised (if something new is discovered).
Reading DNA novels: “bioinformatics”
– Syntax:
– The next step is determining the sequence of the
words, e.g. should it be “what are the rules of
English grammar” or “are what the rules of
grammar English”?
– Bioinformatics involves determining the sequence
of “important elements”, e.g. promoters are
upstream of genes and not the other way
around.
Reading DNA novels: “bioinformatics”
– Semantics:
– What does the set of words (sentence) mean? E.g. “what is
your purpose?”: what processes do humans use to
interpret this sentence?
– Bioinformatics attempts to analyse the function of
DNA/genetic sequences by, e.g.:
1. Comparing the sequences to sequences whose
function is already known.
2. Converting the sequence into its equivalent
“protein” and comparing it to known proteins.
3. Determining the 3-D structure of proteins and looking
for known structural components.
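The first approach, comparing a sequence to sequences whose function is already known, can be sketched very crudely as a position-by-position comparison. This is a minimal Python illustration (the module itself uses Perl), not how real database searches work. The sequence names, the sequences, and the helper functions are all hypothetical, and the sequences are assumed to have equal length.

```python
def percent_identity(a, b):
    """Percentage of positions carrying the same letter (equal lengths assumed)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)

def best_match(query, known):
    """Return (name, identity) of the known sequence most similar to the query."""
    scored = [(name, percent_identity(query, seq)) for name, seq in known.items()]
    return max(scored, key=lambda item: item[1])

# Hypothetical "known" sequences, invented for this example.
known_sequences = {
    "hypothetical_gene_A": "ATGGCCATTG",
    "hypothetical_gene_B": "ATGTTTCCGG",
}
print(best_match("ATGGCCTTTG", known_sequences))  # ('hypothetical_gene_A', 90.0)
```

Sequences of different lengths need an alignment step first, which is why alignment appears later in the syllabus.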
Reading DNA novels: “bioinformatics”
• Bioinformatics also focuses on the computational
aspects of the discipline, such as:
– Setting up databases
– Writing code to perform analysis
– Identifying and using known computational
techniques to improve analysis of the biological data.
• Bioinformatics covers a very large area, but this
particular module will focus on the
“computational analysis of genetic systems” and
will be referred to as Bioinformatics 1.
Bioinformatics 1: module syllabus.
• Part 1: Fundamentals of genetic systems:
• Principles of inheritance and evolution: essential criteria for
our evolution and existence.
• Basic molecular cell biology: DNA, genes and amino acids
(proteins).
• The relationship between a gene and its physical
manifestation (proteins); the central “dogma” of genetics:
DNA -> RNA -> Proteins
• Introduction to structural elements of genetic systems
• Examples of gene “expression” regulation
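The DNA -> RNA -> Proteins flow can be sketched in a few lines. This is a simplified Python illustration (the module itself uses Perl). It assumes the input is the coding strand, so transcription reduces to replacing T with U, and it uses only a small excerpt of the standard codon table. Codons missing from the excerpt are shown as "X".

```python
# Small excerpt of the standard genetic code; "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M", "UUU": "F", "GGC": "G", "GCC": "A", "AAA": "K", "UGG": "W",
    "UAA": "*", "UAG": "*", "UGA": "*",
}

def transcribe(dna):
    """DNA coding strand -> mRNA (assumption: input is the coding strand)."""
    return dna.upper().replace("T", "U")

def translate(rna):
    """mRNA -> one-letter protein string, reading codons from position 0."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino = CODON_TABLE.get(rna[i:i + 3], "X")  # "X": codon not in the excerpt
        if amino == "*":
            break  # stop codon ends translation
        protein.append(amino)
    return "".join(protein)

rna = transcribe("ATGGCCAAATTTTGA")  # made-up example sequence
print(rna)             # AUGGCCAAAUUUUGA
print(translate(rna))  # MAKF
```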
Bioinformatics 1: module syllabus.
• Part 2: Programming in Perl: a common scripting
language used in the field of bioinformatics
• Fundamentals of Perl: read/write, loops, etc.
• Fundamental Perl data structures: “bioinformatics” data files,
dynamic arrays and hash tables.
• Perl pattern-matching techniques (regular expressions) used in
bioinformatics: searching for a pattern (e.g. ATG); extracting a
pattern from a sequence; substituting one pattern for another (e.g.
replacing T with U)
• Creating Perl subroutines and Perl modules and using them in other
Perl programs
• Development of “basic” analytical tools for bioinformatics data
sequences using Perl and core computational algorithms [these
algorithms will be covered in the computational element of the
module].
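The three pattern-matching tasks named above (search, extract, substitute) look roughly as follows. The sketch uses Python's `re` module for illustration; the module itself teaches the equivalent Perl operators. The function names, the example sequence, and the choice of "ATG up to the first in-frame stop codon" as the extracted pattern are assumptions made for this example.

```python
import re

def find_start(dna):
    """Search: 0-based position of the first ATG, or -1 if absent."""
    m = re.search(r"ATG", dna)
    return m.start() if m else -1

def extract_orf(dna):
    """Extract: ATG, then whole codons, up to the first in-frame stop codon."""
    m = re.search(r"ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)", dna)
    return m.group(0) if m else None

def to_rna(dna):
    """Substitute: replace every T with U."""
    return re.sub(r"T", "U", dna)

dna = "GGATGAAACCCTAGTT"  # made-up example sequence
print(find_start(dna))    # 2
print(extract_orf(dna))   # ATGAAACCCTAG
print(to_rna(dna))        # GGAUGAAACCCUAGUU
```

In Perl the same ideas are usually written with the `m//` and `s///` operators.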
Bioinformatics 1: module syllabus.
– Part 3a: Introduction to online bioinformatics
resources:
• How and where to obtain “bioinformatics” DNA data
sequences and data relevant to these sequences
• Explanation of the different elements of these data sets
(“data annotation” or metadata).
• Fundamentals of common online DNA analytical tools
(such as sequence alignment measurement)
Bioinformatics 1: module syllabus.
• Part 3b:
– Computational bioinformatics for:
• DNA pattern matching: global/local/multiple
• Aligning DNA sequences, e.g. pairwise alignment
• Applying alignment principles using basic
computational methods
• Reconstructing genomes (large DNA sequences) using
“shot-gun” alignment techniques
• Principles of searching for “matching” DNA (gene)
sequences in large online databases.
• How to utilise and interpret the findings of DNA database
searches, e.g. gene functionality.
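As one possible illustration of pairwise global alignment, the sketch below computes an alignment score with the classic dynamic-programming approach (Needleman-Wunsch). The slides do not name a specific algorithm, so this choice is an assumption, as are the scoring values (match +1, mismatch -1, gap -1). It is written in Python for illustration (the module itself uses Perl).

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of a and b (Needleman-Wunsch, linear gap cost)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and row: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap    # gap inserted in b
            left = score[i][j - 1] + gap  # gap inserted in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]

# ACGT vs AC-T: three matches and one gap.
print(global_alignment_score("ACGT", "ACT"))  # 2
```

Local alignment follows the same table-filling idea but never lets a cell drop below zero and takes the best cell anywhere in the table.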
Assignment and exam
• 1 Assignment (40%):
– Developing an application to analyse “small” DNA
data sequences
– A report discussing the findings of the online
applications when applied to known DNA sequences
• Exam (60%): question 1 plus 2 out of 4 other
questions
– Question 1 is compulsory: bioinformatics Perl
programming.
– The other questions relate to the other areas in the
module.
Proposed schedule
• Week 2 to Week 6 (Thursday 18:00 to 20:00):
“Part 1: Fundamentals of genetic systems”
• Week 2 to Week 6 (Thursday 20:15 to 21:15):
– Perl programming for bioinformatics
• Week 7: review week [submit assignment part
1]
Proposed schedule
• Week 8 to Week 12 (Thursday 18:00 to 20:00):
“Computational techniques and their
application to bioinformatics”
• Week 8 to Week 12 (Monday 20:15 to 21:15):
– Online bioinformatics databases and analytical
applications (approx. 2 weeks).
– Development of fundamental computational
applications using Perl (approx. 3 weeks)
• Week 10: submission of assignment part 2
• Week 13: review of course and sample exam
paper
Assignment content
• Assignment 1:
– A report analysing the biological impact of
developing a bioinformatics application.
– Development of the fundamental functionality of the
application based on the findings of the report.
• Assignment 2:
– Using the application from assignment 1: analysis and
development of the analytical component
(“computational analysis”) of the application.
– A report on the findings of applying the final
application to a given dataset obtained from online
bioinformatics databases.