G33 - Spatial Database Group


Bacteria Divide People Into 3 Types
Article Source
Published in The New York Times
Written by Carl Zimmer
Research Source
Published in Nature
Research Authors:
Manimozhiyan Arumugam, Jeroen Raes, Eric Pelletier, Denis Le Paslier, Takuji Yamada, Daniel R.
Mende, Gabriel R. Fernandes, Julien Tap, Thomas Bruls, Jean-Michel Batto, Marcelo Bertalan, Natalia
Borruel, Francesc Casellas, Leyden Fernandez, Laurent Gautier, Torben Hansen, Masahira Hattori,
Tetsuya Hayashi, Michiel Kleerebezem, Ken Kurokawa, Marion Leclerc, Florence Levenez,
Chaysavanh Manichanh, H. Bjørn Nielsen, Trine Nielsen, Nicolas Pons, Julie Poulain, Junjie Qin,
Thomas Sicheritz-Ponten, Sebastian Tims, David Torrents, Edgardo Ugarte, Erwin G. Zoetendal, Jun
Wang, Francisco Guarner, Oluf Pedersen, Willem M. de Vos, Søren Brunak, Joel Doré, MetaHIT
Consortium (additional members), Jean Weissenbach, S. Dusko Ehrlich & Peer Bork
Article Topic




Researchers looked at which bacteria are found in people's guts
Discovered that each person is host to one of three bacterial ecosystems
The discovery was made by analyzing the bacterial DNA found in test subjects' stool samples
The DNA data was examined using clustering analysis
Connection to Class ...
The topic of the article does not relate directly to any material covered in class
However, it is interesting because it demonstrates using data to solve real-world problems
Background



Every human is host to 100 trillion bacteria
The researchers were looking for DNA related to 1,511 bacterial species
The researchers did not know in advance what they would find:
“We didn't have any hypothesis. Anything that came out would be new.” -Dr. Bork
Classification Analysis
Trying to group things into known categories
Examples:
Grouping donated blood by blood type
Sorting patients into low, medium, and high risk for heart disease
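The idea of sorting items into categories that are fixed in advance can be sketched in a few lines. This is only an illustration: the function name `classify_risk`, the score scale, and the cutoff values are assumptions made up for the example, not clinical thresholds and not anything taken from the article.

```python
# Classification: the categories (low, medium, high) are known before looking at the data.
# The cutoffs below are hypothetical and chosen only to illustrate the idea.
def classify_risk(score: float) -> str:
    """Assign a risk score between 0 and 1 to one of three predefined categories."""
    if score < 0.1:
        return "low"
    if score < 0.2:
        return "medium"
    return "high"

# Made-up patients and scores.
patients = {"patient_a": 0.05, "patient_b": 0.15, "patient_c": 0.40}
labels = {name: classify_risk(score) for name, score in patients.items()}
print(labels)
```

The key point is that the three labels exist before any patient is examined; the data only decides which label each patient receives.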
Clustering Analysis
Looking for groups in data
Figure: Clustering Analysis of Blood Groups by Percent of Population Each Can Donate To/From (points: AB, A, B, O; axes: Donate To and Donate From, each 0% to 100%)
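Looking for groups in data can be sketched with k-means, one common clustering method. It is used here only for illustration and is not claimed to be the method used in the study. The points are made up, and the function name `kmeans` and its parameters are assumptions for this example.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means (Lloyd's algorithm) on 2-D points with given starting centroids."""
    centroids = list(centroids)
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment, centroids

# Made-up measurements, not data from the study.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (5.0, 5.2), (5.1, 4.9), (4.8, 5.0),
          (9.0, 1.0), (9.2, 1.1), (8.9, 0.8)]
# Start from one point in each region; a real run would usually try several random starts.
assignment, centroids = kmeans(points, centroids=[points[0], points[3], points[6]])
print(assignment)
```

Unlike classification, no labels are supplied here: the algorithm only receives the points and the number of groups to look for, and the groups emerge from the data.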
Clustering Analysis: Gut Microbes
M Arumugam et al. Nature 000, 1-7 (2011) doi:10.1038/nature09944
Why Use Clustering Analysis



Clustering analysis highlights the existence of distinct groups in data
It can be used when there is a lot of data but little knowledge of how to organize it
It can provide enough information about a subject to allow more interesting questions to be asked
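One way to see how clustering can reveal distinct groups without prior knowledge is to try several group counts and compare how tightly each fits. The sketch below is an illustration only: it uses a simple 1-D k-means and the common "elbow" heuristic, neither of which is taken from the article, and the values and function names are made up for the example.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain 1-D k-means; starting centers are spread evenly over the sorted values."""
    ordered = sorted(values)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        # Assign every value to its nearest center.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Move each center to the mean of its group (unchanged if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

def within_cluster_spread(centers, groups):
    """Sum of squared distances from each value to its cluster center."""
    return sum((v - c) ** 2 for c, g in zip(centers, groups) for v in g)

# Made-up values with three visible clumps; not data from the study.
values = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8, 9.0, 9.3, 8.9]
spread = {k: within_cluster_spread(*kmeans_1d(values, k)) for k in range(1, 5)}
for k, s in spread.items():
    print(k, round(s, 2))
```

In this toy data the spread falls sharply up to three groups and barely improves with a fourth, which is the kind of signal that suggests three distinct groups and prompts the next question of what distinguishes them.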
Summary



Classification analysis groups data into known groups
Clustering analysis looks for unknown groups in data
Clustering analysis is most useful when not much is known about a subject
Questions?