DATA, DECISIONS, UNCERTAINTY
A HISTORY
Kevin S. Robinson, PhD
[email protected]
Girolamo Cardano - 1501 to 1576 - Italian
Cool Fact:
his father, Fazio Cardano, was consulted by Leonardo da Vinci on questions of geometry
In addition to his major contributions to algebra, Cardano also made important
contributions to probability, hydrodynamics, mechanics and geology. His book on games of
chance, Liber de Ludo Aleae, was probably completed by 1563 but was not published
until 1663.
Cardano made the first-ever foray into the until-then-untouched realm of probability
theory. His book is the first study of things such as dice rolling, based on the premise that
fundamental scientific principles govern the likelihood of achieving the elusive 'double
six', beyond mere luck or chance.
The Unfinished Game:
Pascal, Fermat, and the Seventeenth-Century Letter
that Made the World Modern
In the early seventeenth century, the outcome of
something as simple as a dice roll was consigned to the
realm of unknowable chance. Mathematicians largely
agreed that it was impossible to predict the probability
of an occurrence. Then, in 1654, Blaise Pascal wrote to
Pierre de Fermat explaining that he had discovered how
to calculate risk. The two collaborated to develop what
is now known as probability theory, a concept that
allows us to think rationally about decisions and events.
Jacob Bernoulli - 1654 to 1705 - Swiss
Cool Fact:
Jacob had always found the
properties of the logarithmic spiral to
be almost magical and he had
requested that it be carved on his
tombstone with the Latin
inscription Eadem Mutata
Resurgo meaning "I shall arise the
same though changed".
By 1689 he had published important work on infinite series and his law of large
numbers in probability theory. The relative-frequency interpretation of probability says
that if an experiment is repeated a large number of times then the relative frequency with
which an event occurs equals the probability of the event. The law of large numbers is a
mathematical interpretation of this result.
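In modern notation (a standard statement, not taken from the original slides): if an event with
probability p occurs S_n times in n independent trials, Bernoulli's law of large numbers says the
relative frequency S_n / n settles down to p:
\[
\lim_{n \to \infty} P\!\left( \left| \frac{S_n}{n} - p \right| \ge \varepsilon \right) = 0
\quad \text{for every } \varepsilon > 0 .
\]
For a fair die, for instance, the proportion of sixes in a long run of throws approaches 1/6.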
Normal Distribution (1808)
Carl Friedrich Gauss
&
Pierre-Simon Laplace
A very readable article about the Normal Distribution: http://tinyurl.com/normal-ksr
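For reference, the density behind the name is given below in standard notation; the symbols
\mu (mean) and \sigma (standard deviation) are introduced here and do not appear in the
original slides:
\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}} .
\]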
Misconception:
Something is “wrong” if the distribution is non-normal ...
Often, distributions other than the normal
are more appropriate for a given set of data.
Normality is a myth;
there never was, and never will be, a normal distribution.
Roy C. Geary (1896 - 1983)
Roy C. Geary was undoubtedly the most eminent Irish
statistician and economist of the twentieth century.
Francis Galton - 1822 to 1911 - English
one of the most exceptional statisticians of his time
his scientific achievements were substantial and his
influence on statistics is still felt strongly today.
Galton introduced many important statistical concepts
that are now standard in many statistical analyses,
including correlation, regression and percentiles.
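As a brief sketch in standard notation (not drawn from the original slides): the least-squares line
for predicting y from x that grew out of Galton's regression idea has slope tied to the
correlation coefficient r,
\[
\hat{y} - \bar{y} = r \, \frac{s_y}{s_x} \,(x - \bar{x}),
\qquad
r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}} .
\]
Because |r| is at most 1, the predicted y sits fewer standard deviations from its mean than x does
from its own, which is Galton's "regression to the mean".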
Cool Fact:
Galton's career bore remarkable similarities to that of his cousin Charles Darwin. Like
Darwin, Galton attended Cambridge, but did not do exceptionally well. He spent a period of
traveling before settling down to scientific work. And like Darwin, Galton had caught hold of
controversial ideas, which he realized could only be adequately proved by careful
scientific investigation (Forrest, 1995). Galton placed an extremely high value upon science.
Karl Pearson - 1857 to 1936 - English
Galton's statistical heir, Pearson was a major player in the
early development of statistics as a serious scientific
discipline in its own right. He founded the Department of
Applied Statistics (now the Department of Statistical
Science) at University College London in 1911; it was the
first university statistics department in the world.
He had become interested in the idea that sound mathematics could be applied to natural
phenomena not only under the category of causation, but also under the broader category
of correlation, and had in 1893 contributed his first statistical paper to the Royal Society, of
which he was elected a Fellow in 1896. Whatever his other distinctions, he will be most
widely known to posterity as the inspirer and largely the creator of a body of statistical
theory concerning frequency curves, correlation, goodness of fit, etc., most of which has
appeared in Biometrika, begun in 1901 after the Royal Society had "resolved that
mathematics and biology should not be mixed," as he himself phrased it.
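One concrete piece of that body of theory, stated here in standard form rather than quoted from
the slides, is Pearson's goodness-of-fit statistic of 1900, which compares observed counts O_i
with the counts E_i expected under a hypothesized frequency curve across k categories:
\[
X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} .
\]
When the hypothesized model is correct, X^2 is approximately chi-squared distributed with
k - 1 degrees of freedom (fewer if parameters are estimated from the data).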
William Gosset - 1876 to 1937 - English
He invented the t-test (1908) to handle small samples for
quality control in brewing, publishing under the pseudonym
"Student". He discovered the form of the t distribution by a
combination of mathematical and empirical work with random
numbers, an early application of the Monte Carlo method.
To many in the statistical world "Student" was regarded as a statistical advisor to Guinness's
brewery; to others he appeared to be a brewer devoting his spare time to statistics. There is
some truth in both these ideas, but they miss the central point, which was the intimate
connection between his statistical research and the practical problems on which he was engaged.
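For reference, in standard notation (not part of the original transcript), the one-sample
t statistic compares the mean \bar{x} of n observations with a hypothesized mean \mu_0, using
the sample standard deviation s:
\[
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} .
\]
For small samples from a roughly normal population this follows Student's t distribution with
n - 1 degrees of freedom, whose heavier tails guard against overconfident conclusions drawn
from little data.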
Sir Ronald Aylmer Fisher - 1890 to 1962 - English
a genius who almost single-handedly created
the foundations for modern statistical science
&
the greatest biologist since Darwin
… his important contributions to statistics include
the analysis of variance (ANOVA),
the method of maximum likelihood, and
experimental design.
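As a small textbook-style illustration of maximum likelihood (not drawn from the slides): if k
successes are observed in n independent trials with unknown success probability p, the likelihood
\[
L(p) = \binom{n}{k} p^{k} (1-p)^{n-k}
\]
is maximized at \hat{p} = k/n, the sample proportion, found by setting the derivative of
\log L(p) to zero.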
Cool Fact:
Fisher had poor eyesight, which made reading difficult;
so he learned through listening to others read aloud to
him. He started out studying mathematics (and excelling
at it) but began to focus on statistics because of his
interest in evolutionary theory.
John Tukey - 1915 to 2000 - American
one of the most influential statisticians of the last 50 years
and a wide-ranging thinker … spent decades as both a
professor at Princeton University and a researcher at AT&T's
Bell Laboratories. In 1977, Tukey published "Exploratory Data
Analysis," which gave new ways to analyze and present data
clearly; its tools, including the stem-and-leaf display and the boxplot,
continue to appear in high-school and higher-education curricula.
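A minimal sketch of Tukey's boxplot idea, assuming Python with NumPy and Matplotlib available
(the data below are made up purely for illustration):

import numpy as np
import matplotlib.pyplot as plt

# Illustrative data only: 100 draws from a right-skewed distribution
rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=100)

# The five-number summary a boxplot displays: min, Q1, median, Q3, max
q1, median, q3 = np.percentile(data, [25, 50, 75])
print(f"min={data.min():.2f}  Q1={q1:.2f}  median={median:.2f}  "
      f"Q3={q3:.2f}  max={data.max():.2f}")

# Tukey-style boxplot: box spans Q1..Q3, whiskers reach points within 1.5*IQR,
# anything beyond the whiskers is flagged as a potential outlier
plt.boxplot(data, vert=False)
plt.title("Boxplot in the spirit of Exploratory Data Analysis")
plt.savefig("boxplot.png")

The plot makes skewness and outliers visible at a glance, the kind of pattern-and-anomaly
description Tukey argued for.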
The best thing about being a statistician is that you get to play in everyone's backyard.
Tukey believed that, after the first 40 years, the practitioners of statistics lost sight of its
original objectives of finding methods of analyzing data that described patterns, trends, and
relationships, and detected anomalies. In 1962, he maintained that mathematical statistics
was ignoring real-world data analysis. He urged a return to the origins of scientific statistics,
using modern methods in which the statistical description of the data was paramount.
Cool Fact: In 1947, Tukey coined the term "bit," an abbreviation of "binary digit," to
describe the 1s and 0s that are the basis of the binary code in which digital computer
programs are written. He is also credited with coining the word "software".
Hal Ronald Varian (born March 18, 1947, in Wooster, Ohio) is an
economist specializing in microeconomics and information
economics. He is the chief economist at Google and he holds the title
of emeritus professor at the University of California, Berkeley where
he was founding dean of the School of Information.
I keep saying the sexy job in the next ten years will be statisticians.
People think I'm joking, but the ability to take data -- to be able to understand it, to
process it, to extract value from it, to visualize it, to communicate it -- that's going to be a
hugely important skill in the next decades, not only at the professional level but even at the
educational level for elementary school kids, for high school kids, for college kids.
Because now we really do have essentially free and ubiquitous data.
So the complementary scarce factor is the ability to understand that data …
Statistics is a general intellectual method that applies wherever data, variation,
and chance appear. It is a fundamental method because data, variation, and
chance are omnipresent in modern life.
However, working with data is an art as well as a science. We learn it not simply
by mastering formal methods but by following examples set by our current
teachers and by past masters. In this, learning statistics is like learning to perform
music, another subject in which students develop practical wisdom and critical
evaluation through context and example. We learn in this way because technique
alone does not make an outstanding statistician any more than an outstanding
musician. Interpretation in the specific context is always important.
Technology is not enough … technology empowers but thinking enables …
Reading Arithmetic Writing
Think Show Tell
Recognition Computation Interpretation