Statistics 359a
Regression Analysis
Necessary Background
Knowledge - Statistics
• expectations of sums
• variances of sums
• distributions of sums of normal random
variables
• t distribution – assumptions and use
• calculation of confidence intervals
• simple tests of hypotheses and p-values
Necessary Background
Knowledge – Linear Algebra
• multiplication of conformable matrices
• transpose of a matrix
• determinant of a square matrix
• inverse of a square matrix
• eigenvalues of a square matrix
• quadratic forms
Origin of Least Squares
Introduction of the metric system and the length of
a meter
• 1790 – French National Assembly commissions
the French Academy of Sciences to design a
simple decimal-based system of weights and
measures
• 1791 – French Academy defines the meter to be
10^-7, or one ten-millionth, of the length of the
meridian through Paris from the north pole to the
equator.
Adrien-Marie Legendre
• Legendre serves on the French
commission in 1792 to
determine the length of
the meridian quadrant
• measurements of latitude
made in 1795
• complex calculations
made from the
measurements in 1799
• Legendre proposes the
method of least squares
in 1805 to determine the
length of a meter
Data
• old French units of measurement: 1 module = 2 toises
• old French to imperial English: 1 toise = 6.395 feet
• metric to imperial: 1 meter = 3.2808 feet
From Spherical Geometry
$L - L' = \frac{S}{28500} + C\,\frac{S}{28500} + \alpha \sin(L - L') \cos(L + L')$
S = arc length
C is related to the length of the meridian quadrant (90D):
$D = 28500/(1 + C)$ = length of one degree of an arc, in modules
$\alpha$ is related to the ellipticity of the earth
Including measurement errors,
the data and model reduce to:
 1  0.003398 C ( 4.912)   (0.590)
 2  0.000475 C ( 2.720)   (0.027)
 3   0.002625 C (0.048)   (0.324)
 4   0.001529 C ( 2.914)   (0.277)
 5  0.000279 C ( 4.765)   (0.014)
Solution is:
D = 28497.78 modules
90D = 2564800.2 modules = length of the
meridian quadrant
Therefore
1 meter = 0.256480 modules
= 0.512960 toises
= 3.280 feet
modern meter = 3.2808 feet
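The unit conversions above can be checked with a few lines of arithmetic. This is only a sketch that uses the figures quoted on the slides (D = 28497.78 modules, 1 module = 2 toises, 1 toise = 6.395 feet); the variable names are illustrative and not part of the original material.

```python
# Figures quoted on the slides; the variable names are illustrative.
D_MODULES_PER_DEGREE = 28497.78   # least-squares value of D, in modules
TOISES_PER_MODULE = 2.0           # 1 module = 2 toises
FEET_PER_TOISE = 6.395            # 1 toise = 6.395 feet

# Meridian quadrant = 90 degrees of arc.
quadrant_modules = 90 * D_MODULES_PER_DEGREE

# The meter is defined as 10^-7 of the quadrant.
meter_modules = quadrant_modules * 1e-7
meter_toises = meter_modules * TOISES_PER_MODULE
meter_feet = meter_toises * FEET_PER_TOISE

# Value of C implied by D = 28500 / (1 + C).
C = 28500 / D_MODULES_PER_DEGREE - 1

print(f"quadrant = {quadrant_modules:.1f} modules")
print(f"1 meter  = {meter_modules:.6f} modules")
print(f"         = {meter_toises:.6f} toises")
print(f"         = {meter_feet:.3f} feet")
print(f"implied C = {C:.3e}")
```

The result differs from the modern value of 3.2808 feet only in the fourth decimal place.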
Origin of the Term “Regression”
• Francis Galton, 1886,
‘Regression towards
mediocrity in hereditary
stature.’ Journal of the
Anthropological Institute,
15: 246 – 263
• See JSTOR under UWO
library databases
Data on Heights of Children and
Parents
‘Regression Line’
Theoretical Basis
For X and Y bivariate normal with equal means $\mu$, equal variances and correlation $\rho$:
$E(Y \mid X = x) = \mu + \rho (x - \mu)$
$E(Y \mid X = x) = x + (\rho - 1)(x - \mu)$
For $0 < \rho < 1$:
$E(Y \mid X = x) < x$ for $x > \mu$ and
$E(Y \mid X = x) > x$ for $x < \mu$
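A small numerical illustration of the conditional-mean formula may help. The sketch below assumes a common mean of 68 and a correlation of 0.5; both numbers are hypothetical and chosen only for illustration, not taken from Galton's data.

```python
def conditional_mean(x, mu, rho):
    """E(Y | X = x) for bivariate normal X, Y with common mean mu,
    equal variances and correlation rho."""
    return mu + rho * (x - mu)


# Hypothetical values, for illustration only.
mu = 68.0    # common mean of X and Y
rho = 0.5    # correlation between X and Y

for x in (64.0, 68.0, 72.0):
    y_hat = conditional_mean(x, mu, rho)
    # Equivalent form: x + (rho - 1) * (x - mu)
    print(f"x = {x:5.1f}  E(Y | X = x) = {y_hat:5.1f}  shift from x = {y_hat - x:+.1f}")
```

For x above the mean the expected value of Y falls below x, and for x below the mean it rises above x: the expected value is pulled back toward the mean, which is the sense of the word "regression" here.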
Example in Data Analysis
Through Regression
• Relationship between
the price of a violin
bow and its attributes
such as age, shape
and ornamentation on
the bow
Violin Bow Example
The following data on violin bows made by W.E. Hill and Sons of London, England are taken
from the internet site www.maestronet.com/pricehist.html. The data show the prices of the bows
sold at auction at Sotheby’s auction house for the years 1994-97. Also given are data on various
factors that may affect the price of the bow. These include: the year of the sale (in case of price
inflation or deflation); the year of manufacture (or age – are antique bows more or less
valuable?); weight of the bow in grams (do buyers like heavier or lighter bows?); the shape of
the bow (is there an aesthetic effect to the price?); presence or absence of ornamental gold;
presence or absence of ornamental pearl; and whether the bow has a tortoiseshell frog or an
ebony frog. Only the bows for which the approximate year of manufacture has been given are
included in the data set. Prices from other auction houses and for other bow makers, as well as
violins, are available at the same site, but only Sotheby’s gives the year of manufacture. A
Minitab file of the data is at O:\359\bows.mtb.
Price (U.S. Dollars)  Year of Sale  Year Made  Weight (g)  Shape  Gold Accessories  Tortoiseshell Frog  Pearl Accessories
1874                  1997          1957       59.0        O      N                 N                   N
2436                  1997          1935       62.0        R      N                 N                   N
7498                  1997          1920       62.0        R      Y                 Y                   N
1142                  1996          1945       59.5        O      N                 N                   Y
1935                  1996          1890       57.5        R      N                 N                   N
1759                  1996          1900       56.0        O      N                 N                   N
5278                  1996          1950       57.0        O      Y                 Y                   Y
4905                  1995          1920       58.0        R      Y                 N                   N
7994                  1995          1920       60.0        O      Y                 Y                   Y
2543                  1995          1926       62.5        R      N                 N                   Y
1769                  1994          1935       61.0        R      N                 N                   N
1592                  1994          1960       61.0        R      N                 N                   Y
3716                  1994          1935       55.0        O      Y                 Y                   Y
2477                  1994          1925       59.0        R      N                 N                   Y
2654                  1994          1930       58.0        R      N                 N                   N
3362                  1994          1935       58.0        R      N                 Y                   Y
Shape: O = octagonal, R = round
Price and Date of Sale
• 1995 seems to be a more expensive year
• Is the effect confounded with some other attribute
common to 1995?
[Figure: Violin Bows - Price and Sale Date (Price vs. Year Sold)]
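One rough way to look at the 1995 question is to tabulate, for each sale year, the mean price together with the share of bows carrying gold accessories. The sketch below assumes that the price and year-of-sale columns of the data listing line up row by row with the attribute columns; the data structure and names are illustrative.

```python
from collections import defaultdict

# (price in US$, year sold, gold accessories present?) for the 16 bows.
# Assumes the listing's columns align row by row.
bows = [
    (1874, 1997, False), (2436, 1997, False), (7498, 1997, True),
    (1142, 1996, False), (1935, 1996, False), (1759, 1996, False),
    (5278, 1996, True),
    (4905, 1995, True), (7994, 1995, True), (2543, 1995, False),
    (1769, 1994, False), (1592, 1994, False), (3716, 1994, True),
    (2477, 1994, False), (2654, 1994, False), (3362, 1994, False),
]

# Group the bows by year of sale.
by_year = defaultdict(list)
for price, year, gold in bows:
    by_year[year].append((price, gold))

# Mean price and share of bows with gold, per sale year.
summary = {}
for year, rows in sorted(by_year.items()):
    mean_price = sum(p for p, _ in rows) / len(rows)
    gold_share = sum(g for _, g in rows) / len(rows)
    summary[year] = (mean_price, gold_share)
    print(f"{year}: n = {len(rows)}, mean price = {mean_price:7.1f}, "
          f"share with gold = {gold_share:.2f}")
```

Under that alignment assumption, 1995 has both the highest mean price and the highest share of bows with gold, so the apparent year effect could partly reflect gold. With only three sales in 1995, this is suggestive rather than conclusive.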
Price and Year of Manufacture
• Is there anything special about 1920?
• Is there a quadratic trend in the data?
[Figure: Violin Bows - Price and Year of Manufacture (Price vs. Year Made)]
Price and Weight of the Bow
• Is there any trend with respect to the
weight?
[Figure: Violin Bows - Price and Weight in Grams (Price vs. Weight)]
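The question of a trend in weight can be examined with a simple least-squares line of price on weight. The sketch below uses the standard closed-form formulas for the slope, intercept and correlation; it assumes the price and weight columns of the data listing line up row by row, and the function name is illustrative. A single-predictor fit ignores the other attributes, so it is a first look only.

```python
# Weight in grams and price in US$ for the 16 bows
# (assumes the listing's columns align row by row).
weights = [59.0, 62.0, 62.0, 59.5, 57.5, 56.0, 57.0, 58.0,
           60.0, 62.5, 61.0, 61.0, 55.0, 59.0, 58.0, 58.0]
prices = [1874, 2436, 7498, 1142, 1935, 1759, 5278, 4905,
          7994, 2543, 1769, 1592, 3716, 2477, 2654, 3362]


def simple_least_squares(x, y):
    """Return (intercept, slope, correlation) for the line y = a + b*x
    that minimizes the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    correlation = sxy / (sxx * syy) ** 0.5
    return intercept, slope, correlation


intercept, slope, r = simple_least_squares(weights, prices)
residuals = [p - (intercept + slope * w) for w, p in zip(weights, prices)]

print(f"price = {intercept:.1f} + {slope:.1f} * weight")
print(f"correlation between weight and price: {r:.3f}")
```

A correlation near zero would agree with a scatter plot showing no clear trend; a value well away from zero would suggest that weight deserves a place in the model.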
Octagonal vs. Round Bows
• No apparent trend
[Figure: Violin Bows - Price and Shape (Shape vs. Price; 1 = round, 0 = octagonal)]
The Gold Standard?
• The presence of gold on a bow generally
makes it more expensive
[Figure: Violin Bows - Price and Gold Accessories (Gold vs. Price; 1 = present, 0 = absent)]
Tortoise Shell Frogs
• Some evidence of added expense for
tortoise shell
[Figure: Violin Bows - Price and Tortoise Shell Frogs (Frog vs. Price; 1 = present, 0 = absent)]
Price and Pearl Accessories
• No apparent effect
[Figure: Violin Bows - Price and Pearl Accessories (Pearl vs. Price; 1 = present, 0 = absent)]
Prediction
• Can we use the model built with the current data to predict the future price of a bow?
• Example: some 1999 data from auctions
• 1920 bow, 60.5 g, round, with gold and pearl accessories: $4098
• 1933 bow, 61 g, octagonal, with pearl accessories only: $2421