#### Transcript What is the mean?

Guidelines for Assessment and Instruction in Statistics Education
A Curriculum Framework for Pre-K-12 Statistics Education
The GAISE Report (2007)
The American Statistical Association
http://www.amstat.org/education/gaise/
Christine Franklin & Henry Kranendonk
NCSM Conference, March 19, 2007

Outline of Presentation
• Overview of the GAISE Report
• The Evolution of a Statistical Concept
  - The Mean as Fair Share/Variation from Fair Share
  - The Mean as the Balance Point/Variation from the Mean
  - The Sampling Distribution of the Mean/Variation in Sample Means
• Summary

Benchmarks in Statistical Education in the United States (1980-2007)
• The Quantitative Literacy Project (ASA/NCTM Joint Committee, Early 1980’s)
• Curriculum and Evaluation Standards for School Mathematics (NCTM, 1989)
• Principles and Standards for School Mathematics (NCTM, 2000)
• Mathematics and Statistics College Board Standards for College Success (2006)
• The GAISE Report (2005, 2007)

GOALS of the GAISE Report
• Promote and develop statistical literacy
• Provide links with the NCTM Standards
• Discuss differences between Mathematics and Statistics*
• Clarify the role of probability in statistics*
• Illustrate concepts associated with the data analysis process*
• Present the statistics curriculum for grades Pre-K-12 as a cohesive and coherent curriculum strand*
• Provide developmental sequences of learning experiences*

Stakeholders
• Writers of state standards
• Writers of assessment items
• Curriculum directors
• Pre K-12 teachers
• Educators at teacher preparation programs

STATISTICAL THINKING versus MATHEMATICAL THINKING
• The Focus of Statistics on Variation in Data
• The Importance of Context in Statistics

PROBABILITY
Randomization
• Sampling -- "select at random from a population"
• Experiments -- "assign at random to a treatment"

THE FRAMEWORK
Underlying Principles

PROBLEM SOLVING PROCESS
Formulate Questions
• clarify the problem at hand
• formulate one (or more) questions that can be
answered with data

Collect Data
• design a plan to collect appropriate data
• employ the plan to collect the data

Analyze Data
• select appropriate graphical or numerical methods
• use these methods to analyze the data

Interpret Results
• interpret the analysis taking into account the scope of inference based on the data collection design
• relate the interpretation to the original question

Developmental Levels
• The GAISE Report proposes three developmental levels for evolving statistical concepts: Levels A, B, and C

The Framework Model
A Two-Dimensional Model
• One dimension is the four components of the statistical problem-solving process, along with the nature of and the focus on variability
• The second dimension is comprised of three developmental levels (A, B, and C)

THE FRAMEWORK MODEL

| Process Component | Level A | Level B | Level C |
| --- | --- | --- | --- |
| Formulate Question | Beginning awareness of the statistics question distinction | Increased awareness of the statistics question distinction | Students can make the statistics question distinction |
| Collect Data | Do not yet design for differences | Awareness of design for differences | Students make designs for differences |
| Analyze Data | Use particular properties of distributions in context of specific example | Learn to use particular properties of distributions as tools of analysis | Understand and use distributions in analysis as a global concept |
| Interpret Results | Do not look beyond the data | Acknowledge that looking beyond the data is feasible | Able to look beyond the data in some contexts |

THE FRAMEWORK MODEL
Nature of Variability
• Measurement variability
• Natural variability
• Induced variability
• Sampling variability
• Chance variability
Focus on Variability
• Variability within a group
• Variability within a group and variability between groups
• Variability in model fitting
• Co-variability

Activity Based Learning
• The GAISE Report promotes active learning of statistical content and concepts

Two Types of Learning Activities
• Problem Solving Activities
• Concept Activities
The STN article illustrates a Problem Solving Activity across the three developmental levels.

The evolution of a statistical concept --
• What is the mean?
• Quantifying variation in data from the mean

Level A Activity
The Family Size Problem
A Conceptual Activity for:
• Developing an Understanding of the Mean as the “Fair Share” value
• Developing a Measure of Variation from “Fair Share”

A Question
How large are families today?
• Nine children were asked how many people are in their family.
• Each child represented her/his family size with a collection of snap cubes.

Snap Cube Representation for Nine Family Sizes
How might we examine the data on the family sizes for these nine children?

Ordered Snap Cube & Numerical Representations of Nine Family Sizes
2 3 3 4 4 5 6 7 9

Notice that the family sizes vary. What if we used all our family members and tried to make all families the same size, in which case there is no variability? How many people would be in each family? How can we go about creating these new families? We might start by separating all the family members into one large group.

All 43 Family Members

Step 1
Have each child select a snap cube to represent her/him-self. These cubes are indicated in red.

Create Nine “New” Families/Step 1

Step 2
Next have each child select one family member from the remaining group. These new family members are shown in red.

Create Nine “New” Families/Step 2

Continue this process until there are not enough family members left for each child to select one.

Create Nine “New” Families/Step 4

Discuss results
• The fair share value
Note that this is developing the division algorithm and, eventually, the algorithm for finding the mean.

A New Problem
What if the fair share value for nine children is 6? What are some different snap cube representations that might produce a fair share value of 6?
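The fair-share procedure for the nine families (Steps 1 through 4) can be sketched in a few lines of Python. This is only an illustration of the division idea named on the slide, not part of the presentation itself; the variable names are made up for the example.

```python
# The nine family sizes shown on the slides.
family_sizes = [2, 3, 3, 4, 4, 5, 6, 7, 9]

# Pool every family member into one large group.
total_members = sum(family_sizes)
num_families = len(family_sizes)

# Each full "round" hands one member to every new family (Step 1, Step 2, ...).
# divmod gives the number of full rounds and the members left over afterwards.
full_rounds, left_over = divmod(total_members, num_families)

# Sharing the left-over members as fractions gives the mean.
mean_size = total_members / num_families

print(f"{total_members} members, {full_rounds} full rounds, {left_over} left over")
print(f"fair share (mean) is about {mean_size:.2f}")
```

The four full rounds correspond to the four steps on the slides; the members left over are the reason the process stops there, and dividing them evenly leads to the usual formula for the mean.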
Snap Cube Representation of Nine Families, Each of Size 6

Have Groups of Children Create New Snap Cube Representations
For example, following are two different collections of data with a fair share value of 6.

Two Examples with Fair Share Value of 6
Which group is “closer” to being “fair?”
How might we measure “how close” a group of numeric data is to being fair?

Which group is “closer” to being “fair?”
The upper group in blue is closer to fair since it requires only one “step” to make it fair. The lower group requires two “steps.”

How do we define a “step?”
• One step occurs when a snap cube is removed from a stack higher than the fair share value and placed on a stack lower than the fair share value.
• A measure of the degree of fairness in a snap cube distribution is the “number of steps” required to make it fair.
Note -- Fewer steps indicate a distribution closer to fair

Number of Steps to Make Fair: 8
Number of Steps to Make Fair: 9

Students completing Level A understand:
• the notion of “fair share” for a set of numeric data
• the fair share value is also called the mean value
• the algorithm for finding the mean
• the notion of “number of steps” to make fair as a measure of variability about the mean
• the fair share/mean value provides a basis for comparison between two groups of numerical data with different sizes (thus can’t use total)

Level B Activity
The Family Size Problem
• How large are families today?
A Conceptual Activity for:
• Developing an Understanding of the Mean as the “Balance Point” of a Distribution
• Developing Measures of Variation about the Mean

Level B Activity
How many people are in your family?
Nine children were asked this question. The following dot plot is one possible result for the nine children:
[Dot plot of nine family sizes on an axis from 2 to 10]

Have groups of students create different dot plot representations of nine families with a mean of 6.
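Looking back at Level A, the “number of steps” measure can be sketched as a small function. The stack heights below are hypothetical (the slides' own snap cube pictures are not reproduced in this transcript); they were chosen only so that the fair share value is 6. The function name `steps_to_fair` is likewise made up for this illustration.

```python
def steps_to_fair(stacks):
    """Count the single-cube moves needed to level every stack at the fair share.

    One step moves one cube from a stack above the fair share value to a stack
    below it, so the number of steps equals the total excess above fair share.
    Assumes the cubes divide evenly among the stacks.
    """
    total, count = sum(stacks), len(stacks)
    if total % count != 0:
        raise ValueError("fair share is not a whole number of cubes")
    fair_share = total // count
    return sum(height - fair_share for height in stacks if height > fair_share)


# Hypothetical example: nine families, 54 members, fair share value 6.
example_stacks = [2, 4, 5, 5, 6, 7, 8, 8, 9]
print(steps_to_fair(example_stacks))

# A group that is already fair needs no steps.
print(steps_to_fair([6] * 9))
```

Counting only the excess above the fair share is enough, because every cube removed from a tall stack lands on a short one.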
[Two dot plots of nine family sizes, each with a mean of 6, on an axis from 2 to 10]

In which group do the data (family sizes) vary (differ) more from the mean value of 6?

[Dot plot of Distribution 1 on an axis from 2 to 10; distances from the mean marked on the dots: 1 4 2 1 2 0 1 2 3]
[Dot plot of Distribution 2 on an axis from 2 to 10; distances from the mean marked on the dots: 0 0 4 3 2 0 2 3 4]

In Distribution 1, the Total Distance from the Mean is 16.
In Distribution 2, the Total Distance from the Mean is 18.
Consequently, the data in Distribution 2 differ more from the mean than the data in Distribution 1.

[Dot plot of Distribution 1 on an axis from 2 to 10; distances from the mean marked on the dots: 1 4 2 1 2 0 1 2 3]

Note that the total distance for the values below the mean of 6 is 8, the same as the total distance for the values above the mean. For this reason, the distribution will “balance” at 6 (the mean).

The SAD is defined to be: The Sum of the Absolute Deviations
Note the relationship between SAD and Number of Steps to Fair from Level A:
SAD = 2 × Number of Steps

Number of Steps to Make Fair: 8
Number of Steps to Make Fair: 9

An Illustration where the SAD doesn’t work!
[Dot plot of two values on an axis from 2 to 10; distances from the mean: 4 4]
[Dot plot of eight values on an axis from 2 to 10; distances from the mean: 1 1 1 1 1 1 1 1]
The SAD is 8 for each distribution, but in the first distribution the data vary more from the mean.
Why doesn’t the SAD work?

Adjusting the SAD for group sizes yields the:
MAD = Mean Absolute Deviation

Measuring Variation about the Mean
• SAD = Sum of Absolute Deviations
• MAD = Mean of Absolute Deviations
• Variance = Mean of Squared Deviations
• Standard Deviation = Square Root of Variance

Summary of Level B and Transitions to Level C
• Mean as the balance point of a distribution
• Mean as a “central” point
• Various measures of variation about the mean

The Mean at Level C
• At Level C, the notion of the “Sampling Distribution of the Sample Mean” is developed.
• This development connects probability and statistics and provides the link between the descriptive statistics students have learned at Levels A and B and concepts of inferential statistics they will learn at Level C.

Eighty Circles/What is the Mean Diameter?
Activity
• Students select samples of 10 circles they consider to be representative of the 80 circles. The mean for each sample is determined.
• Students select simple random samples of 10 circles. The mean for each sample is determined.
How do the results from self-selection compare with random sampling?
• Following are results from two introductory-level statistics classes (50 students).

[Dotplot of Random Selection versus Self Selection: sample means on an axis from 1.0 to 2.2, one row of dots per selection method; Population Mean = 1.25]

Sampling Distributions provide the link to two important concepts in statistical inference.
• Margin of Error
• Statistical Significance
The STN article in your packet provides an illustration of how a sampling distribution is used to develop these statistical concepts.

SUMMARY: GOALS of GAISE Report
• Promote and develop statistical literacy
• Provide links with the NCTM Standards
• Discuss differences between Mathematics and Statistics
• Clarify the role of probability in statistics
• Illustrate concepts associated with the data analysis process
• Present the statistics curriculum for grades Pre-K-12 as a cohesive and coherent curriculum strand
• Provide developmental sequences of learning experiences
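As a closing illustration, the Level B measures of variation (SAD, MAD, variance, standard deviation) can be computed directly from their definitions on the slides. The two data sets below are an assumption: they are inferred from the “SAD doesn’t work” illustration (two values at distance 4 from a mean of 6, and eight values at distance 1), since the dot plots themselves are not reproduced in this transcript. The function name `variation_summary` is made up for the sketch.

```python
from math import sqrt


def variation_summary(data):
    """Return the mean and the Level B measures of variation about it."""
    count = len(data)
    mean = sum(data) / count
    sad = sum(abs(x - mean) for x in data)                 # Sum of Absolute Deviations
    mad = sad / count                                      # Mean of Absolute Deviations
    variance = sum((x - mean) ** 2 for x in data) / count  # Mean of Squared Deviations
    return {
        "mean": mean,
        "SAD": sad,
        "MAD": mad,
        "variance": variance,
        "standard deviation": sqrt(variance),
    }


# Assumed data, inferred from the distances shown on the slide.
two_families = [2, 10]
eight_families = [5, 5, 5, 5, 7, 7, 7, 7]

for label, data in [("two families", two_families), ("eight families", eight_families)]:
    print(label, variation_summary(data))
```

Both groups have the same SAD, but dividing by the group size separates them, which is the adjustment the MAD slide describes. Note that the variance here divides by the number of values, following the slide's “mean of squared deviations” wording, rather than by one less than that number.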