S. cerevisiae

IB404 - 4. Saccharomyces cerevisiae - Jan 30
1. Major model system for molecular genetics. For example, if you have a
mutant, you can clone the gene encoding the affected protein simply by
transforming yeast with a plasmid genomic library and selecting for
colonies that are restored to wild type - plasmid rescue.
2. Genome sequenced by an international consortium of over 600
scientists from over 100 labs. Largely by manual sequencing, based on a
clone-by-clone physical map of cosmids, like E. coli; published 1997.
3. Genome is 12 Mbp in 16 chromosomes encoding about 6,000 genes.
4. So now we have about 1 gene per 2 kb, versus roughly 1 per kb in E. coli.
The difference is mostly larger promoter regions, since there are few
introns in yeast genes.
5. Introns are thought to have been lost by recombination of cDNA
copies of mRNAs with genes - called gene conversion.
6. Ty1 and Ty2 retrotransposons can contribute the reverse transcriptase
needed to make these cDNA copies.
7. No operons, and again genes are “randomly” arranged on each strand.
8. The genome appears to have been duplicated in a polyploidization
event long ago, and large duplications remain, although most duplicate
genes were lost and the rest diverged.
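The gene-density figure in point 4 is simple arithmetic, sketched below. The yeast numbers come from these notes; the E. coli genome size and gene count are approximate values added here for comparison and are an assumption, not part of this lecture.

```python
def kb_per_gene(genome_bp: int, n_genes: int) -> float:
    """Average genomic span per gene, in kilobases."""
    return genome_bp / n_genes / 1_000


# Yeast: 12 Mbp and about 6,000 genes, as stated in the notes.
yeast = kb_per_gene(12_000_000, 6_000)

# E. coli: roughly 4.6 Mbp and 4,300 genes (approximate, assumed here).
ecoli = kb_per_gene(4_600_000, 4_300)

print(f"S. cerevisiae: {yeast:.1f} kb per gene")
print(f"E. coli:       {ecoli:.1f} kb per gene")
```

The roughly two-fold difference is the gap that the notes attribute to larger promoter regions.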
Duplicated regions between the 16 chromosomes
Knocking out all 6000 protein-coding genes
Precise knockouts or replacements of genes can be done easily in yeast
because of their high rates of homologous recombination (below left).
Roughly 20% of such complete gene deletions (knockouts) are not viable
on rich medium, so those genes are required in all conditions.
These essential genes encode generally more conserved proteins, with
almost none of them duplicated in the genome.
The knockout studies also identified 466 genes affecting cell shape (below right).
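A toy sketch of what a precise replacement by homologous recombination does to the sequence may help here. This is an illustration only, not the protocol used in the knockout project: the function name `replace_by_homology`, the 8-base arm length, and all sequences (including the `MARKER` placeholder) are invented for the example.

```python
def replace_by_homology(genome: str, cassette: str, arm_len: int) -> str:
    """Swap the region between two homology arms for the cassette payload.

    The cassette is left_arm + payload + right_arm. Each arm must match the
    genome exactly once; everything between the two matches is replaced.
    """
    left, right = cassette[:arm_len], cassette[-arm_len:]
    payload = cassette[arm_len:-arm_len]
    if genome.count(left) != 1 or genome.count(right) != 1:
        raise ValueError("each homology arm must match the genome exactly once")
    start = genome.index(left) + arm_len  # first base after the left arm
    end = genome.index(right)             # first base of the right arm
    if end < start:
        raise ValueError("homology arms are in the wrong order")
    return genome[:start] + payload + genome[end:]


# Toy sequences, invented for illustration only.
upstream, orf, downstream = "ACGTTGCA", "ATGAAACCCGGGTAA", "GGATCCTT"
genome = "TTTT" + upstream + orf + downstream + "CCCC"
marker = "nnnnMARKERnnnn"  # stands in for a selectable marker gene
cassette = upstream + marker + downstream

knockout = replace_by_homology(genome, cassette, arm_len=8)
print(knockout)
```

The open reading frame between the two arms is removed exactly, base for base, and the marker sits in its place, which is what makes the knockout "precise".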
How many genes and RNA-only genes
1. As we shall see for the other genomes we consider, deciding how many
protein-encoding genes are present can be quite difficult, especially
when there are many pseudogenes and genes with alternative splicing or
alternative promoters, but yeast is simple, with few introns.
2. From the start it was also obvious there were many genes that do not
encode proteins, that is, RNA-only genes. The obvious ones were the
rRNA, tRNA, snRNA (small nuclear RNAs that mediate splicing of
introns from mRNAs), and snoRNAs (small nucleolar RNAs that
mediate cutting of the rRNA precursor molecules into subunits).
3. Then in the 1990s we learned of the RNA interference (RNAi) systems
that involve short 20-25 nucleotide single-stranded RNAs targeted at the
3’ UTRs (untranslated regions) of mRNAs. These were called
microRNAs, and there are hundreds in most genomes. S. cerevisiae,
however, has no functioning RNAi system and so has no microRNAs.
Related yeasts do have RNAi systems, and when their RNAi genes are
moved into S. cerevisiae, they can mediate RNAi.
4. Animals turn out to have quite a variety of other RNA-only genes, and
the class of RNAs they produce is called ncRNA (non-protein-coding
RNA), including long non-coding or lncRNAs that are 100bp-10kb in
length, intergenic or antisense, and perform a wide diversity of roles (e.g.
mediate X inactivation in mammals), but yeast has few of these.
5. Eventually two new technologies revealed an even stranger
phenomenon, pervasive transcription, meaning that essentially
all of each genome is transcribed to at least some extent. First,
genome-wide tiling microarrays were developed in which overlapping
oligonucleotide probes were placed on the array covering the entire
genome sequence, not just the predicted gene models. Second,
Illumina sequencing was employed to do extremely deep sequencing
of RNA (after reverse transcription into DNA). The majority of the newly
detected transcripts are medium-length ncRNAs that cluster around the
5’ and 3’ ends of protein genes, both sense and antisense. They have been
given all sorts of different names in different organisms, but in yeast
they are called CUTs (cryptic unstable transcripts) and SUTs (stable
unannotated transcripts). They are intermediate in length (200-700bp),
generally unstable, and their role in transcription or other functions
remains unclear.
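The idea of a tiling array, overlapping probes that cover the whole sequence rather than only predicted genes, can be sketched as follows. This is a minimal illustration: the function name `tiling_probes`, the probe length, the step, and the toy sequence are all assumptions for the example, and real arrays use much longer probes and also cover both strands, which is not shown.

```python
def tiling_probes(sequence: str, probe_len: int, step: int) -> list[tuple[int, str]]:
    """Return (start, probe) pairs tiled across the sequence.

    Probes overlap whenever step < probe_len. A final probe is anchored to
    the end of the sequence so that the last bases are always covered.
    """
    if probe_len <= 0 or step <= 0:
        raise ValueError("probe_len and step must be positive")
    if len(sequence) < probe_len:
        return []
    last = len(sequence) - probe_len
    starts = list(range(0, last + 1, step))
    if starts[-1] != last:
        starts.append(last)
    return [(s, sequence[s:s + probe_len]) for s in starts]


toy = "ACGTACGGTTCAGCTAGGCTA"  # invented toy sequence
probes = tiling_probes(toy, probe_len=8, step=4)
for start, probe in probes:
    print(start, probe)
```

Because the probes are laid down by position alone, any transcribed region gives a signal, annotated or not, which is how transcripts outside the gene models showed up.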
A. Schematic representation of cryptic unstable transcripts (CUTs) and
stable unannotated transcripts (SUTs) relative to an mRNA.
B. Distribution of the 3' ends of CUTs relative to mRNA transcription
start sites (TSSs). At the top, sense CUTs relative to the associated
mRNAs are shown; at the bottom, antisense (divergent) CUTs relative to
the associated mRNAs are shown. The zero on the x axis represents the
position of the mRNA TSSs. The orange arrows indicate the approximate
positions of sense and antisense CUTs.