Advanced Data Analysis
for the
Physical Sciences
Dr Martin Hendry
Dept of Physics and Astronomy
University of Glasgow
[email protected]
SUPA Advanced Data Analysis Course, Jan 5th – 6th 2010
1. Introduction and Theoretical Foundations
Reasonable thinking?…

PREFACE
"The goal of science is to unlock nature's secrets… Our understanding comes through the development of theoretical models capable of explaining the existing observations as well as making testable predictions… Statistical inference provides a means for assessing the plausibility of one or more competing models, and estimating the model parameters and their uncertainties. These topics are commonly referred to as 'data analysis'."

The most we can hope to do is to make the best inference based on the experimental data and any prior knowledge that we have available.
"A decision was wise, even though it led to disastrous consequences, if the evidence at hand indicated it was the best one to make; and a decision was foolish, even though it led to the happiest possible consequences, if it was unreasonable to expect those consequences."

Herodotus, c. 500 BC
"Probability theory is nothing but common sense reduced to calculation."

Pierre-Simon Laplace (1749 – 1827)
Plausible reasoning?…

We need to think about the difference between deductive and inductive logic.
Deductive logic:
Cause (theory) → Effects or outcomes (predictions of theory)

Inductive logic:
Possible causes (competing theories or models) ← Observations

How do we decide which model is most plausible?
An example of deductive logic

Statement A: All red-haired students drink Irn Bru
Statement B: Student X has red hair
Statement C: Student X drinks Irn Bru

Let's suppose that A is true (our theory). Then:
o If B is true, then C is true
o If C is false, then B is false

C is a logical consequence of A and B.
If we set 'true' = 1 and 'false' = 0, we can use the rules of George Boole (1854) to carry out logical operations. We define:

Negation: $\bar{A}$ means 'A is false'
Logical product: $AB$ means 'both A and B are true'
Logical sum: $A + B$ means 'at least one of A or B is true'

Then

$A(B + C) = AB + AC$
$A + AB = A$
$A + \bar{A} = 1$
$A + BC = (A + B)(A + C)$
$A\bar{A} = 0$

etc.
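These identities can be checked exhaustively over the two truth values. A minimal Python sketch (the operator definitions are ours, with the logical sum implemented as inclusive-or):

```python
from itertools import product

# Boolean operations with 'true' = 1 and 'false' = 0
NOT = lambda a: 1 - a               # negation: A-bar
AND = lambda a, b: a * b            # logical product: AB
OR  = lambda a, b: a + b - a * b    # logical sum: A + B (so that 1 + 1 = 1)

for A, B, C in product((0, 1), repeat=3):
    assert AND(A, OR(B, C)) == OR(AND(A, B), AND(A, C))   # A(B+C) = AB + AC
    assert OR(A, AND(A, B)) == A                          # A + AB = A
    assert OR(A, NOT(A)) == 1                             # A + A-bar = 1
    assert OR(A, AND(B, C)) == AND(OR(A, B), OR(A, C))    # A + BC = (A+B)(A+C)
    assert AND(A, NOT(A)) == 0                            # A * A-bar = 0

print("All Boolean identities hold.")
```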
An example of inductive logic

Statement A: All red-haired students drink Irn Bru
Statement B: Student X has red hair
Statement C: Student X drinks Irn Bru

What can we say about B if A and C are true? (Statement A didn't say that all students who drink Irn Bru have red hair.)

We might say, however:
o If C is true, then B is more plausible
In the 1940s and 50s, Cox, Polya and Jaynes formalised the mathematics of inductive logic as plausible reasoning:

If we assign degrees of plausibility real numbers between 0 and 1, then the rules for combining and operating on inductive logical statements are identical to those of deductive logic: Boolean algebra.

Ed Jaynes (1922 – 1998)
Laplace (1812): a mathematical framework for probability as a basis for plausible reasoning.

Probability measures our degree of belief that something is true:

Prob(X) = 1 ⟹ we are certain that X is true
Prob(X) = 0 ⟹ we are certain that X is false
Our degree of belief always depends on the available background information. We write

Prob(X | I) = "probability that X is true, given I"

where I is the background information. The vertical line denotes a conditional probability: our state of knowledge about X is conditioned by the background information I.
Rules for combining probabilities

$p(X \mid I) + p(\bar{X} \mid I) = 1$

where $\bar{X}$ denotes the proposition that X is false. Note: the background information is the same in both cases.
$p(X, Y \mid I) = p(X \mid Y, I)\, p(Y \mid I)$

where X, Y denotes the proposition that X and Y are both true, and

p(X | Y, I) = Prob(X is true, given that Y is true)
p(Y | I) = Prob(Y is true, irrespective of X)
Also

$p(Y, X \mid I) = p(Y \mid X, I)\, p(X \mid I)$

but $p(Y, X \mid I) = p(X, Y \mid I)$. Hence

$p(Y \mid X, I) = \dfrac{p(X \mid Y, I)\, p(Y \mid I)}{p(X \mid I)}$
Bayes' theorem:

$p(Y \mid X, I) = \dfrac{p(X \mid Y, I)\, p(Y \mid I)}{p(X \mid I)}$

Laplace rediscovered the work of Rev. Thomas Bayes (1702 – 1761), published posthumously in 1763: Bayesian inference.

In terms of models and data:

$p(\text{model} \mid \text{data}, I) = \dfrac{p(\text{data} \mid \text{model}, I) \times p(\text{model} \mid I)}{p(\text{data} \mid I)}$

We can calculate these terms:
o Posterior, p(model | data, I): what we know now.
o Likelihood, p(data | model, I): the influence of our observations.
o Prior, p(model | I): what we knew before.
o Evidence: p(data | I).
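As a toy illustration of the theorem in its model/data form (the coin models and all numbers below are our own, not from the slides):

```python
# Toy Bayesian update: is a coin fair (theta = 0.5) or biased (theta = 0.8)?
priors = {"fair": 0.5, "biased": 0.5}     # p(model | I)
theta  = {"fair": 0.5, "biased": 0.8}     # p(heads | model, I)

data = ["H", "H", "T", "H"]               # observed coin flips

posterior = dict(priors)
for flip in data:
    # likelihood of this flip under each model
    like = {m: (theta[m] if flip == "H" else 1 - theta[m]) for m in posterior}
    evidence = sum(like[m] * posterior[m] for m in posterior)  # p(data | I)
    posterior = {m: like[m] * posterior[m] / evidence for m in posterior}

print(posterior)   # approx {'fair': 0.38, 'biased': 0.62}
```

Each observation multiplies the prior by the likelihood and renormalises by the evidence, exactly as the boxed formula above prescribes.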
Bayesian probability theory is simultaneously a very old and a very young field.

Old: the original interpretation of Bernoulli, Bayes, Laplace…
Young: the 'state of the art' in data analysis.

But BPT was rejected for several centuries: probability as degree of belief was seen as too subjective. Hence the frequentist approach.
Probability = the 'long-run relative frequency' of an event, which, it was thought, can in principle be measured objectively.

e.g. rolling a die: what is p(1)?

If the die is 'fair' we expect

$p(1) = p(2) = \cdots = p(6) = \tfrac{1}{6}$

These probabilities are fixed (but unknown) numbers. We can imagine rolling the die M times; the number rolled is a random variable, with a different outcome each time. We define

$p(1) = \lim_{M \to \infty} \dfrac{n(1)}{M}$

and if $p(1) = \tfrac{1}{6}$ the die is 'fair'.

But objectivity is an illusion:
o The definition assumes each outcome is equally likely (i.e. equally probable).
o It also assumes an infinite series of identical trials; why can't probabilities change?
o What can we say about the fairness of the die after (say) 5 rolls, or 10, or 100? (See the simulation sketch below.)
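A quick simulation sketch of the long-run relative frequency (illustrative; uses Python's random module):

```python
import random

# Long-run relative frequency of rolling a 1 with a fair die:
# n(1)/M should approach 1/6 as M grows, but says little for small M.
random.seed(42)
for M in (5, 10, 100, 10_000, 1_000_000):
    n1 = sum(1 for _ in range(M) if random.randint(1, 6) == 1)
    print(f"M = {M:>9,}   n(1)/M = {n1 / M:.4f}")   # compare with 1/6 = 0.1667
```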
In the frequentist approach, a lot of mathematical machinery is defined to let us address this type of question (see later).
Bayesian versus frequentist statistics: who is right?

Frequentists are correct to worry about the subjectiveness of assigning probabilities. Bayesians worry about this too!

"Probability is subjective; it depends on the available information."
Ed Jaynes (1922 – 1998)

But subjective ≠ arbitrary: given the same background information, two observers should assign the same probabilities.
More Theoretical Foundations: Marginalisation

Suppose there is a set of M propositions $\{x_k : k = 1, \ldots, M\}$. Then

$\sum_{k=1}^{M} p(x_k \mid I) = 1$

Suppose we introduce some additional proposition y. Using the product rule (Bayes' theorem):

$p(x_1, y \mid I) = p(x_1 \mid y, I)\, p(y \mid I)$
⋮
$p(x_M, y \mid I) = p(x_M \mid y, I)\, p(y \mid I)$

Then

$\sum_{k=1}^{M} p(x_k, y \mid I) = \left[ \sum_{k=1}^{M} p(x_k \mid y, I) \right] p(y \mid I)$

and the bracketed sum equals 1, which gives the marginal probability

$p(y \mid I) = \sum_{k=1}^{M} p(x_k, y \mid I)$
This extends to the continuum limit, where x can take infinitely many values:

$p(y \mid I) = \int_{-\infty}^{\infty} p(x, y \mid I)\, dx$

Here $p(x, y \mid I)$ is no longer a probability but a probability density:

$\mathrm{Prob}(a \le x \le b \text{ and } y \text{ is true} \mid I) = \int_a^b p(x, y \mid I)\, dx$

with the obvious extension to the continuum limit for y. Also

$\int_{-\infty}^{\infty} p(x \mid y, I)\, dx = 1$   (normalisation condition)
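A numerical sketch of marginalisation on a grid, using an assumed correlated bivariate Gaussian whose marginal over x is known exactly:

```python
import numpy as np

# Numerical marginalisation: p(y | I) = integral of p(x, y | I) over x.
# Assumed joint: bivariate standard normal with correlation rho = 0.6,
# whose exact marginal in y is the standard normal.
rho = 0.6
x = np.linspace(-8, 8, 1601)
y = np.linspace(-8, 8, 1601)
dx = x[1] - x[0]
X, Y = np.meshgrid(x, y, indexing="ij")

norm = 1.0 / (2 * np.pi * np.sqrt(1 - rho**2))
p_xy = norm * np.exp(-(X**2 - 2 * rho * X * Y + Y**2) / (2 * (1 - rho**2)))

p_y = p_xy.sum(axis=0) * dx                      # marginalise over x
exact = np.exp(-0.5 * y**2) / np.sqrt(2 * np.pi)
print(np.max(np.abs(p_y - exact)))               # ~ 0 (grid error only)
```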
Probabilities are never negative, so $p(x) \ge 0$ for all x. We compute probabilities by measuring the area under the pdf curve:

$\mathrm{Prob}(a \le x \le b) = \int_a^b p(x)\, dx$

'Normalisation':

$\int_{-\infty}^{\infty} p(x)\, dx = 1$

[Figure: pdf p(x), with the shaded area between a and b giving Prob(a ≤ x ≤ b)]
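A minimal sketch of computing Prob(a ≤ x ≤ b) as an area under a pdf, here assuming the standard normal and a simple midpoint rule:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)

def prob_between(a, b, n=10_000):
    # midpoint-rule approximation to the integral of the pdf from a to b
    h = (b - a) / n
    return sum(normal_pdf(a + (i + 0.5) * h) for i in range(n)) * h

print(prob_between(-1, 1))    # approx 0.6827
print(prob_between(-10, 10))  # approx 1.0 (normalisation)
```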
Some important pdfs:

1) Poisson pdf (discrete case)

e.g. the number of photons per second counted by a detector, or the number of galaxies per square degree counted by a survey.

r = number of detections

$p(r) = \dfrac{\lambda^r e^{-\lambda}}{r!}$

The Poisson pdf assumes detections are independent, and that there is a constant rate λ. One can show that

$\sum_{r=0}^{\infty} p(r) = 1$

[Figure: Poisson pdf p(r) versus r]
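A short check of the Poisson pdf in Python (λ = 3.5 is an arbitrary choice):

```python
import math

# Poisson pmf: p(r) = lam**r * exp(-lam) / r!
def poisson_pmf(r, lam):
    return lam**r * math.exp(-lam) / math.factorial(r)

lam = 3.5
print(sum(poisson_pmf(r, lam) for r in range(100)))       # approx 1 (normalisation)
print(sum(r * poisson_pmf(r, lam) for r in range(100)))   # mean, approx lam
```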
2) Binomial pdf (discrete case)

The number of 'successes' from N observations, for two mutually exclusive outcomes ('heads' and 'tails').

r = number of 'successes'
θ = probability of 'success' for a single observation

$p_N(r) = \dfrac{N!}{r!\,(N-r)!}\; \theta^r (1-\theta)^{N-r}$

One can show that

$\sum_{r=0}^{N} p_N(r) = 1$
Some important pdfs (continuous case):

1) Uniform pdf

$p(x) = \begin{cases} \dfrac{1}{b-a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$

[Figure: uniform pdf p(x), constant at 1/(b − a) between a and b, zero outside]
2) Central, or normal pdf (also known as Gaussian)

$p(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\dfrac{1}{2}\left(\dfrac{x-\mu}{\sigma}\right)^2\right] \equiv N(\mu, \sigma)$

[Figure: normal pdfs p(x) with σ = 0.5 and σ = 1, both centred on μ]
Cumulative distribution function (CDF):

$P(a) = \int_{-\infty}^{a} p(x)\, dx = \mathrm{Prob}(x < a)$

[Figure: CDF P(x) versus x, rising from 0 to 1]
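For the normal pdf the CDF can be written in closed form using the error function; a minimal sketch:

```python
from math import erf, sqrt

# CDF of N(mu, sigma): P(a) = Prob(x < a)
def normal_cdf(a, mu=0.0, sigma=1.0):
    return 0.5 * (1 + erf((a - mu) / (sigma * sqrt(2))))

print(normal_cdf(0.0))   # 0.5 (half the probability lies below the mean)
print(normal_cdf(1.0))   # approx 0.8413
```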
Question 1:

In the figures below, the blue curves show a normal distribution with mean zero and σ = 1. Which of the pink curves shows a normal distribution with mean zero and σ = 0.5?

[Figure: four panels A, B, C and D, each plotting p(z) against z for −3 ≤ z ≤ 3]
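One way to check your answer is to plot the two curves yourself (a matplotlib sketch; halving σ doubles the peak height and halves the width):

```python
import numpy as np
import matplotlib.pyplot as plt

# N(0, sigma): peak height is 1/(sqrt(2*pi)*sigma), so the sigma = 0.5
# curve is twice as tall and half as wide as the sigma = 1 curve.
z = np.linspace(-3, 3, 301)
for sigma in (1.0, 0.5):
    p = np.exp(-0.5 * (z / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    plt.plot(z, p, label=f"sigma = {sigma}")
plt.xlabel("z")
plt.ylabel("p(z)")
plt.legend()
plt.show()
```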
Measures and moments of a pdf

The nth moment of a pdf is defined as:

$\langle x^n \rangle = \sum_{x=0}^{\infty} x^n\, p(x \mid I)$   (discrete case)

$\langle x^n \rangle = \int_{-\infty}^{\infty} x^n\, p(x \mid I)\, dx$   (continuous case)
The 1st moment is called the mean or expectation value:

$E(x) = \langle x \rangle = \sum_{x=0}^{\infty} x\, p(x \mid I)$   (discrete case)

$E(x) = \langle x \rangle = \int_{-\infty}^{\infty} x\, p(x \mid I)\, dx$   (continuous case)
The 2nd moment is called the mean square:

$\langle x^2 \rangle = \sum_{x=0}^{\infty} x^2\, p(x \mid I)$   (discrete case)

$\langle x^2 \rangle = \int_{-\infty}^{\infty} x^2\, p(x \mid I)\, dx$   (continuous case)
The variance is defined as:

$\mathrm{var}(x) = \sum_{x=0}^{\infty} \left( x - \langle x \rangle \right)^2 p(x \mid I)$   (discrete case)

$\mathrm{var}(x) = \int_{-\infty}^{\infty} \left( x - \langle x \rangle \right)^2 p(x \mid I)\, dx$   (continuous case)

It is often written as $\sigma^2$, and $\sigma = \sqrt{\sigma^2}$ is called the standard deviation. In general,

$\mathrm{var}(x) = \langle x^2 \rangle - \langle x \rangle^2$
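A grid-based check of var(x) = ⟨x²⟩ − ⟨x⟩², using the uniform pdf on an assumed interval [a, b] = [2, 5]:

```python
import numpy as np

# Uniform pdf on [a, b]: mean (a+b)/2 and variance (b-a)^2/12
a, b = 2.0, 5.0
x = np.linspace(a, b, 100_001)
p = np.full_like(x, 1.0 / (b - a))
dx = x[1] - x[0]

mean    = np.sum(x * p) * dx          # first moment <x>
mean_sq = np.sum(x**2 * p) * dx       # second moment <x^2>
print(mean)                           # approx (a + b)/2 = 3.5
print(mean_sq - mean**2)              # approx (b - a)^2 / 12 = 0.75
```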
Summary of means and variances:

o Poisson, $p(r) = \lambda^r e^{-\lambda} / r!$ : mean $= \lambda$, variance $= \lambda$
o Binomial, $p_N(r) = \frac{N!}{r!\,(N-r)!}\,\theta^r (1-\theta)^{N-r}$ : mean $= N\theta$, variance $= N\theta(1-\theta)$
o Uniform, $p(x) = \frac{1}{b-a}$ : mean $= \frac{1}{2}(a+b)$, variance $= \frac{1}{12}(b-a)^2$
o Normal, $p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$ : mean $= \mu$, variance $= \sigma^2$
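These entries can be spot-checked by sampling (the parameter values below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Sample-based check of the mean/variance summary above.
N_SAMP = 1_000_000
samples = {
    "Poisson  (lam = 4)":         rng.poisson(4.0, N_SAMP),
    "Binomial (N = 20, th = 0.3)": rng.binomial(20, 0.3, N_SAMP),
    "Uniform  (a = 0, b = 2)":     rng.uniform(0.0, 2.0, N_SAMP),
    "Normal   (mu = 1, sig = 2)":  rng.normal(1.0, 2.0, N_SAMP),
}
for name, s in samples.items():
    print(f"{name}: mean = {s.mean():.3f}, var = {s.var():.3f}")
# Expect: 4, 4 | 6, 4.2 | 1, 0.333 | 1, 4
```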
The median divides the CDF into two equal halves:

$P(x_{\mathrm{med}}) = \int_{-\infty}^{x_{\mathrm{med}}} p(x')\, dx' = 0.5$

so that Prob(x < x_med) = Prob(x > x_med) = 0.5.

[Figure: CDF P(x) versus x, with x_med marked where P = 0.5]
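A sketch of locating x_med numerically, by bisecting P(x) − 0.5 for the standard normal (whose median is 0):

```python
from math import erf, sqrt

# Standard normal CDF
def cdf(x):
    return 0.5 * (1 + erf(x / sqrt(2)))

# Bisection on P(x_med) = 0.5
lo, hi = -10.0, 10.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if cdf(mid) < 0.5:
        lo = mid
    else:
        hi = mid
print(0.5 * (lo + hi))   # approx 0.0, the median of N(0, 1)
```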
The mode is the value of x for which the pdf is a maximum.

[Figure: normal pdfs with σ = 0.5 and σ = 1; both peak at x = μ]

For a normal pdf, mean = median = mode = μ.
This expression is the basis for the 'error propagation' formulae we use in e.g. undergraduate physics labs. See also the SUPAIDA course.
Question 2:

Which expression correctly approximates the error on the natural logarithm of a variable x?

A: $\sigma_{\ln x} \sim \sigma_x^2 / x^2$
B: $\sigma_{\ln x} \sim \sigma_x\, x$
C: $\sigma_{\ln x} \sim \sigma_x / x$
D: $\sigma_{\ln x} \sim x / \sigma_x$
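The standard first-order propagation result is σ_ln x ≈ σ_x/x (option C as reconstructed above), and it is easy to verify by Monte Carlo (the values of x and σ_x below are illustrative):

```python
import numpy as np

# Monte Carlo check of sigma_lnx ~ sigma_x / x:
# d(ln x)/dx = 1/x, so the scatter in ln x is sigma_x / x to first order.
rng = np.random.default_rng(0)
x0, sigma_x = 50.0, 2.0
x = rng.normal(x0, sigma_x, 1_000_000)

print(np.log(x).std())   # empirical scatter of ln x, approx 0.04
print(sigma_x / x0)      # first-order prediction, = 0.04
```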
Multivariate Distributions

Joint pdf: $p(x, y \mid I)$

Marginal Distributions:

$p(x \mid I) = \int_{-\infty}^{\infty} p(x, y \mid I)\, dy \qquad p(y \mid I) = \int_{-\infty}^{\infty} p(x, y \mid I)\, dx$

Conditional Distributions:

$p(x \mid y, I) = \dfrac{p(x, y \mid I)}{p(y \mid I)}$

Similarly

$p(y \mid x, I) = \dfrac{p(x, y \mid I)}{p(x \mid I)}$

Statistical Independence: x and y are statistically independent if

$p(x, y \mid I) = p(x \mid I)\, p(y \mid I)$

(a grid-based sketch of these definitions follows below).
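```python
import numpy as np

# Grid versions of the definitions above, for an assumed separable joint
# pdf p(x, y) proportional to exp(-(x + y)) on 0 <= x, y (so x, y independent).
x = np.linspace(0, 20, 1001)
y = np.linspace(0, 20, 1001)
dx = dy = x[1] - x[0]
X, Y = np.meshgrid(x, y, indexing="ij")

p_xy = np.exp(-(X + Y))
p_xy /= p_xy.sum() * dx * dy              # discrete normalisation

p_x = p_xy.sum(axis=1) * dy               # marginal p(x): integrate over y
p_y = p_xy.sum(axis=0) * dx               # marginal p(y): integrate over x
cond = p_xy[:, 500] / p_y[500]            # conditional p(x | y = y[500])

print(np.allclose(p_xy, np.outer(p_x, p_y)))   # True: p(x,y) = p(x) p(y)
print(np.allclose(cond, p_x))                  # True: conditioning changes nothing
```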
Question 3:

Which of the following joint pdfs describe variables x and y which are statistically independent?

A: $p(x, y) = \tfrac{1}{2}(x + y)$;  $0 \le x, y < \infty$
B: $p(x, y) = \exp\left[-\tfrac{1}{2}(x + y)\right]$;  $0 \le x, y < \infty$
C: $p(x, y) = \log(x + y)$;  $0 \le x, y < \infty$
D: $p(x, y) = \exp\left[-\tfrac{1}{2}(x + y)\right]$;  $0 \le x \le y$, $0 \le y < \infty$
The bivariate normal distribution

[Figure: surface plot of the joint pdf p(x, y) over the (x, y) plane]

In fact, for any two variables x and y, we define the covariance

$\mathrm{cov}(x, y) = E\left[\{x - E(x)\}\{y - E(y)\}\right]$
Isoprobability contours for the bivariate normal pdf

ρ > 0: positive correlation; y tends to increase as x increases.
ρ < 0: negative correlation; y tends to decrease as x increases.

[Figure: contour panels of y versus x for ρ = 0.0, 0.3, 0.5, 0.7, −0.7 and −0.9]

Contours become narrower and steeper as |ρ| → 1, i.e. stronger (anti)correlation between x and y: given the value of x, the value of y is tightly constrained.
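A sketch of generating correlated Gaussian pairs with a chosen ρ (via the Cholesky factor of the covariance matrix, one standard construction) and recovering ρ from the samples:

```python
import numpy as np

# Sample a bivariate normal with unit variances and correlation rho,
# then check the sample correlation and covariance (rho is illustrative).
rng = np.random.default_rng(7)
rho = 0.7
cov = np.array([[1.0, rho],
                [rho, 1.0]])

L = np.linalg.cholesky(cov)               # cov = L @ L.T
z = rng.standard_normal((2, 100_000))     # independent standard normals
x, y = L @ z                              # correlated pair

print(np.corrcoef(x, y)[0, 1])            # approx 0.7
print(np.cov(x, y)[0, 1])                 # cov(x, y) approx rho (unit variances)
```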
The bivariate normal distribution (see also Section 2)