12 - Rice University

Download Report

Transcript 12 - Rice University

Feedback on Project and Mini-Project I
& Statistics
Krishna.V.Palem
Kenneth and Audrey Kennedy Professor of Computing
Department of Computer Science, Rice University
1
Contents
 Feedback on Projects
 Discussion of Mini-Project I solution
 History of Statistics
 Basic terms involved in Statistics
 Sampling
2
Biography of Blaise Pascal
 Blaise Pascal was a French mathematician, physicist,
inventor, writer and Catholic philosopher.
 Pascal's earliest work was in the natural and applied sciences
 Made important contributions to the study of fluids,
 clarified the concepts of pressure and vacuum
 Pascal was a mathematician of the first order.
 helped create two major new areas of research.
 wrote a significant treatise on the subject of projective geometry at the age of
sixteen
 corresponded with Pierre de Fermat on probability theory,
 strongly influenced the development of modern economics and social
science.
3
Biography of Pierre de Fermat
 Pierre de Fermat was a French lawyer at the Parlement
of Toulouse, France, and a mathematician
 He is given credit for early developments that led to infinitesimal
calculus.
 In particular, he is recognized for his discovery of an original
method of finding the greatest and the smallest ordinates of
curved lines
 which is analogous to that of the then unknown differential calculus,
 He made notable contributions to analytic geometry, probability,
and optics as well as his research into number theory.
 He is best known for Fermat's Last Theorem,
 It was described in a note at the margin of a copy of Diophantus'
4
Arithmetica.
Mini-Project Discussion
 The problem being modeled in the mini-project is the
problem of points
 Also called the problem of division of the stakes
 One of the famous problems that motivated the beginnings
of modern probability theory in the 17th century
 Through this discussion Pascal and Fermat developed concepts
that continue to be fundamental in probability to this day.
 Along with coming up with a convincing, self-consistent solution to the
division of the stakes
 It also led Pascal to the first explicit reasoning about what
today is known as an expectation value.
5
Mini-Project Discussion
 The problem concerns a game of chance with two players who
have equal chances of winning each round.
 The players contribute equally to a prize pot, and agree in advance that the
first player to have won a certain number of rounds will collect the entire
prize.
 Now suppose that the game is interrupted by external circumstances before
either player has achieved victory. How does one then divide the pot fairly? \
 Inspiration for the people predicting outcomes of the Cricket game
 It is tacitly understood that the division should depend
somehow on the number of rounds won by each player, such
that a player who is close to winning will get a larger part of
the pot.
 But the problem is not merely one of calculation; it also includes deciding
6
what a "fair" division should mean in the first place.
Basic Premise of the Problem
 To understand the basic premise of the problem, let us
construct an example (albeit imaginary) based on the
problem description
 Pascal and Fermat are sitting in a cafe in Paris and decide,
to play by flipping a coin.
 If the coin comes up heads, Fermat gets a point.
 If it comes up tails, Pascal gets a point.
 The first to get 10 points wins.
 They each bet 50 Francs and 'winner takes all'.
7
Basic Premise of the Problem
 But then an unexpected thing happens. Fermat is
winning, 8 points to 7, when he receives an urgent
message that a friend is sick, and he must rush to his
home.
 The carriage man who has delivered the message offers to
take him, but only if they leave immediately.
 Of course Pascal understands, but later, in correspondence,
the problem arises:
How should the 100 Francs be divided?
8
Solution to the problem by Fermat
 Seeing as I(Fermat) needed only two points to win the game,
and you needed 3, I think we can establish that after four
more tosses of the coin, the game would have been over
 All the possible ways of this happening are as follows:
 I think you will agree that all of these outcomes are equally likely.
 Based on these possible outcomes, I believe that we should
divide the stakes by the ration 11:5 in my favor
 I should receive (11/16)*100 = 68.75 Francs, while you should receive
9
31.25 Francs.
A More General Solution by Pascal
 Pascal was able to use it as a starting point for a
developing even slicker computation methods.
 He reasoned that any outcome that featured two or more
heads turning up meant a win for you (Fermat).
 The total number of such outcomes is equivalent to
 the number of ways to choose two objects from four, plus
 the number of ways to choose three objects from four, plus
 the number of ways to choose four objects from four.
 Here the 'events' of the coin coming up in your favor become
'objects' in our counting terminology.
10
A More General Solution by Pascal
 Let us denote the number of ways to choose to choose r
objects from n objects as nCr. Thus I think the likelihood
that you would have won the game is given by:
4C2 + 4C3 + 4C4
----------------------total outcomes
Exercise: Verify that this value is similar to the one derived by Fermat on
Slide 17
11
A More General Solution by Pascal
 Generalizing the problem : Let us suppose that two players
are playing, and the game is stopped after a certain number of
rounds
 The first player needs n more points to win, while the second
needs m more points.
 We solve this problem by constructing a Pascal’s triangle as
described below
 The rows of Pascal's triangle are conventionally enumerated starting with
row n = 0 at the top.
 The entries in each row are numbered from the left beginning with k = 0
and are usually staggered relative to the numbers in the adjacent rows.
12
Pascal’s Triangle
 A simple construction of the triangle proceeds in the following manner.
 On row 0, write only the number 1.
 Then, to construct the elements of following rows, add the number
directly above and to the left with the number directly above and to
the right to find the new value.
 If either the number to the right or left is not present, substitute a zero in its
place.
 For example, the first number in the first row is 0 + 1 = 1, whereas the numbers
1 and 3 in the third row are added to produce the number 4 in the fourth row.
 Expressing the numbers in terms of combinations, we get
Pascal’s Triangle
13
A More General Solution by Pascal
 Recall that the first player needs n points to win, while the second
needs m.
 To compute how the stakes should be divided, one computes row
(n+m) of the triangle, and then adds up the first m entries.
 This number is corresponds to the first player's (who needs n points
to win) share of the stake.
 The remaining entries should be added up to determine the second
players share.
Exercise: Apply this general proof to solve the specific case (that Fermat
requires 2 points to win and Pascal requires 3 points to win) and verify that you
get the same solution as before
Note: Full solution will be posted online
14
Contents
 Feedback on Projects
 Discussion of Mini-Project I solution
 History of Statistics
 Basic terms involved in Statistics
 Sampling
15
Feedback on Projects
 Four main aspects of these projects are:
Description of the Problem or the Model
2. Problem Statement
1.


Evaluation Criteria
3.

4.
16
Given something
Find “best” of something else
Prove that your strategy is best, pretty good etc.

Quantifying how good a strategy is
Lessons learned
Model
 There are 3 main characteristics of a model:
Constraints
1.

17
A concise list of constraints involved

Constraints which can be subsumed by others should be
appropriately identified

Some of the constraints can be slacked/tightened if it can
significantly ease the solving of the problem

For example,
o in the game of ‘Settlers of Katan’, one of the important
constraint would be that – a player cannot build a new city unless
there is a connecting path to one of his existing cities
o in the “Cricket project”, one of the constraint would be that –
there can be no more than 300 balls that can be bowled in an
innings or there can be no more than 10 wickets that can be
taken
Model
Configurations or legal states
2.

18
Presents the values of nodes/variables at a given state

A concise list of nodes/variable that need to be included in a
configuration must be identified

For example,
o in the game of ‘Settlers of Katan’, the nodes that can be included
in a configuration are initial map position, number of cards of
each type etc.
o in the game of ‘Cricket’, the nodes of a configuration would
include number of bowls left, number of batsmen left etc.
Model
3.
Possibilities for transitions
 Defines legal moves between different configurations
 For example,

in the game of ‘Settlers of Katan’, possibilities of transitions might
include one of the ways to trade-in cards
o the actual transition would depend on the roll of the dice
 As a general rule, any model shouldn’t be too cumbersome
and should be easy to apply
19
Problem Statement
 Problem statement is a concise description of the issues that need
to be addressed in the attempt to solve the problem.
 The primary purpose of a problem statement is to focus the
attention of the problem solving team.
 However, if the focus of the problem is too narrow or the scope of the solution too
limited the creativity and innovation of the solution can be limited.
 A good problem statement should have the following aspects:
 Concise description of the goal
 can use the configuration to define goals
 Description of problem to be solved
 finding the “best” or “good enough” solution to the problem
 This problem statement could be finding a “strategy” for the board
games or a good “predictor” for the cricket game
20
Evaluation Criteria
 Evaluation criteria has two main aspects to it:
 Defining the metrics for quantifying your success
 Use the metrics to prove that your strategy is the “best”, “good enough”
etc. quantitatively
 Framework for evaluation
 Can either be experimental or mathematical
 For example, experimental framework might involve a programming
environment while a mathematical framework might involve a proof or
a closed form analytical solution
21
Lessons Learned
 A detailed description of important lessons learnt in this
project
 Involves ways to generalize the solution to problems broadly in
its category
 For example, the strategy being developed in board games can be
generalized to war strategy simulations
22
Contents
 Feedback on Projects
 Discussion of Mini-Project I solution
 History of Statistics
 Basic terms involved in Statistics
 Sampling
23
Birth of Statistics
 Statistics arose from the need of states to collect data on their
people and economies
 for administrative purposes
 started in 18th century
 Bayes theorem provided the mathematical basis for this branch
 initial intuition was given by Francis Bacon
 Thomas Bayes provided the first mathematical basis to this branch of logic
 Its meaning broadened in the early 19th century to include the
collection and analysis of data in general.
 today statistics is widely employed in government, business, and the natural
and social sciences.
24
Probability Theory Vs Statistics
 Probability theory computes the probability that future (and hence
presently unknown) samples out of a known population turn out to
have stated characteristics
 Statistics looks at the present and hence known sample taken out of
an unknown population, and makes estimates of what the population
is likely to be, compares likelihood of various populations and tells
how confident you have a right to be about these estimates
25
What is Statistics?
 Statistics is a mathematical science pertaining to the
collection, analysis, interpretation or explanation, and
presentation of data.
 provides tools for prediction and forecasting based on data.
 applicable to a wide variety of academic disciplines
 from the natural and social sciences to the humanities, government and
business.
26
What is Statistics?
27