Defining and measuring fairness of districting plans
Some things to talk about
• Social and political polarization
• A cool dynamic network simulation (which we haven’t done yet)
• Statistical cutoffs and p-values (work of Wald, Berger, …)
• Survey weighting and poststratification
Studying social and political
polarization
Andrew Gelman
Departments of Statistics and Political Science, Columbia University
7 Feb 2009
Also: Tian Zheng, Thomas DiPrete, Julien Teitler, Jiehua Chen, Tyler McCormick,
Rozlyn Redd, Juli Simon Thomas, Delia Baldassarri, David Park, Yu-Sung Su,
Matt Salganik, Duncan Watts, Sharad Goel
Studying social and political
polarization
• Questions from sociology
• Questions from political science
• Sources of data
• Statistical challenges
Questions from sociology
• The “degree distribution”
• Characteristics of “the social network”
• Homophily
• Quantifying segregation
• Knowing and trusting
Questions from political science
• Polarization of Democrats and Republicans
• Polarization of political discourse
• How are people swayed by news media, talk radio, each other, …
• Geographic polarization
• Polarization and the perception of polarization
Sources of data
• Complete data on small social networks (schools, monks, …)
• Very sparse data on large social networks (Framingham, …)
• Complete data on other networks (scientific coauthors, …)
• Other network datasets (email, Facebook, …)
• From random sample surveys
• Questions about close contacts (GSS 1985/2004, NES 2000)
• Questions about acquaintances (“How many X’s do you know?”)
Statistical challenges:
Misconceptions of others
• Examples
• Name
• Disease status
• Sexual preference
• Political leanings
• Challenge/opportunity: attributed and perceived attributes
• Appearance vs. reality
• How large is the “footprint” of a group?
Statistical challenges: Learning
about small and large groups
• 1500 respondents x 750 acquaintances ≈ 1 million
• Potential to learn about small groups
• Potential to learn about people you can’t interview
• Difficulty with large groups
• For example, “How many Democrats do you know”
• #known is too high to quickly estimate
• Potential solution: look at subnetworks
• “Cube model” (individuals x groups x subnetworks)
• Need main effects and two-way interactions
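One common way to turn "How many X's do you know?" answers into estimates is the basic network scale-up estimator. The sketch below is not taken from the slides, and it is simpler than a model with overdispersion or subnetworks; the function names and all numbers are hypothetical and serve only to show the arithmetic.

```python
def scale_up_degree(known_counts, group_sizes, population_size):
    """Estimate one respondent's network size (#known).

    known_counts[k] is how many people the respondent knows in group k,
    group_sizes[k] is the (known) size of group k in the population.
    """
    return population_size * sum(known_counts) / sum(group_sizes)


def scale_up_group_size(hidden_counts, degrees, population_size):
    """Estimate the size of a group that cannot be counted directly.

    hidden_counts[i] is how many members of the group respondent i knows,
    degrees[i] is respondent i's estimated network size.
    """
    return population_size * sum(hidden_counts) / sum(degrees)


# Hypothetical inputs: a population of 300 million and three groups
# (for example, first names) whose sizes are assumed known.
POPULATION = 300_000_000
degree = scale_up_degree([2, 1, 0], [1_000_000, 500_000, 500_000], POPULATION)
group = scale_up_group_size([1, 0, 2], [450, 300, 750], POPULATION)
print(degree, group)
```

The same ratio idea explains the difficulty with large groups: if a respondent cannot count how many Democrats they know, the numerator is unreliable no matter how large the sample is.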
Statistical challenges: Network
structure
• Social network is patterned
• Sex, age, ethnicity, SES, location
• Names, occupations, attitudes
• Correct for non-uniform patterns by using a mix of names
• Estimate non-uniform patterns using a conditional
probability matrix for ages
• Overdispersion to model unexplained variation
• Can’t do much with triangles, 4-cycles, etc.
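To make the overdispersion point concrete, here is a small simulation sketch. It assumes, purely for illustration, that the number of people known in a group is Poisson with a rate that itself varies across respondents according to a gamma distribution; the parameter values are made up.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) value by multiplying uniforms (Knuth's method)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def dispersion_ratio(counts):
    """Variance divided by mean: about 1 for Poisson, above 1 if overdispersed."""
    return statistics.pvariance(counts) / statistics.mean(counts)


rng = random.Random(0)
n = 5000

# Everyone shares the same expected count of 5.
plain = [poisson_draw(rng, 5.0) for _ in range(n)]

# Expected count varies by person: gamma with shape 2 and scale 2.5 (mean 5).
mixed = [poisson_draw(rng, rng.gammavariate(2.0, 2.5)) for _ in range(n)]

print(dispersion_ratio(plain), dispersion_ratio(mixed))
```

Under these assumptions the mixed counts have the same mean as the plain ones but a variance of about mean + mean²/shape, which is the extra spread an overdispersion parameter is meant to absorb.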
Statistical challenges: Recall bias
• Some people are easier to recall than others
• David, Olga, Sharad
• For some sets of names, can be quantified:
Nicole/Christine/Michael
• Sliding definitions
• Who are your friends?
• Estimates of average #known range from 300 to 750 to …
• Estimates of average #trusted range from 1.5 to 15 to 150
Statistical challenges: Returning to
the social science questions
• Polarization as political segregation in the social network
• Comparing polarization to perceived polarization
• Answering conjectures such as: People in big cities know
more people but trust fewer people
• Getting geography back in the picture
Forming Voting Blocs and Coalitions as a
Prisoner's Dilemma: A Possible Theoretical
Explanation for Political Instability
Dynamic network model for
political coalitions
Mathematics of coalitions
Forming a coalition helps the subgroup (or they wouldn’t do it)
But it hurts the general population (negative externality)
Coalitions are inherently unstable
Coalitions of coalitions
Opportunistic acts of secession, poaching, and dissolution
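The claim that a coalition helps its members and hurts everyone else can be checked on a toy example. The sketch below assumes 9 voters who each vote by fair coin flip, simple majority rule, and a bloc whose members all cast the bloc's internal majority vote; "power" is the probability that changing one voter's own vote changes the outcome. These assumptions are for illustration and are not taken from the slides.

```python
from itertools import product


def decisive_prob(n_voters, bloc, voter):
    """Probability that flipping `voter`'s vote flips the majority outcome.

    `bloc` is a tuple of voter indices who vote as a unit (odd size, or
    empty for no coalition). All individual preferences are fair coin flips.
    """

    def outcome(prefs):
        votes = list(prefs)
        if bloc:
            # The bloc casts all of its votes for its internal majority.
            bloc_vote = 1 if sum(prefs[i] for i in bloc) > 0 else -1
            for i in bloc:
                votes[i] = bloc_vote
        return 1 if sum(votes) > 0 else -1

    decisive = 0
    total = 0
    for prefs in product((-1, 1), repeat=n_voters):
        flipped = list(prefs)
        flipped[voter] = -flipped[voter]
        if outcome(prefs) != outcome(flipped):
            decisive += 1
        total += 1
    return decisive / total


no_bloc = decisive_prob(9, (), 0)            # 0.2734375
member = decisive_prob(9, (0, 1, 2), 0)      # 0.390625
outsider = decisive_prob(9, (0, 1, 2), 3)    # 0.15625
print(no_bloc, member, outsider)
```

In this toy setting the bloc members gain power (0.39 versus 0.27) while each of the other six voters loses power (0.16 versus 0.27), which is the negative externality. The outsiders then have an incentive to form blocs of their own, which is where the instability comes from.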
The simulation I want to do:
Set up a political setting: “agents” with attributes and locations
Payoff function for agents
Locally optimal moves
Scheduling
Implementation
Statistical cutoffs and p-values
Setting a cutoff for selecting
patterns for further study
Old problem in statistics: Neyman, Wald, Berger, …
Also of interest to biologists!
Some different goals:
Finding patterns that are “statistically significant”
Classifying into those to study further, and those to set aside
Mathematical framework: distribution of a “score”
Solution depends upon:
Distribution of the score among “uninteresting” cases
Distribution of the score among “interesting” cases
Number of uninteresting and interesting cases
Cost of follow-up of uninteresting cases
Cost of follow-up of interesting cases
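The ingredients listed above can be combined into a small decision calculation. The sketch below makes several assumptions that are not in the slides: scores are normal with unit variance (mean 0 for uninteresting cases, mean `mu` for interesting ones), following up an uninteresting case costs `cost_null`, and following up an interesting case yields a net benefit `gain_real`. The cutoff is chosen to maximize expected net gain.

```python
import math


def normal_tail(x, mean=0.0, sd=1.0):
    """P(score > x) for a normal distribution."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))


def expected_net_gain(cutoff, n_null, n_real, mu, cost_null, gain_real):
    """Expected payoff of following up every case whose score exceeds cutoff."""
    followed_null = n_null * normal_tail(cutoff)
    followed_real = n_real * normal_tail(cutoff, mean=mu)
    return gain_real * followed_real - cost_null * followed_null


def best_cutoff_grid(n_null, n_real, mu, cost_null, gain_real):
    """Search cutoffs from -2 to 8 in steps of 0.01."""
    grid = [i / 100 for i in range(-200, 801)]
    return max(
        grid,
        key=lambda c: expected_net_gain(c, n_null, n_real, mu, cost_null, gain_real),
    )


def best_cutoff_closed_form(n_null, n_real, mu, cost_null, gain_real):
    """Cutoff where the likelihood ratio equals the cost-weighted odds."""
    return mu / 2 + math.log((n_null * cost_null) / (n_real * gain_real)) / mu


# Hypothetical numbers: 9900 uninteresting and 100 interesting cases.
params = dict(n_null=9900, n_real=100, mu=3.0, cost_null=1.0, gain_real=10.0)
print(best_cutoff_grid(**params), best_cutoff_closed_form(**params))
```

Under these assumptions the best cutoff depends on the ratio of counts and costs, not on a fixed significance level: making interesting cases rarer or follow-up more expensive pushes the cutoff up.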
Survey weighting and poststratification
Survey weighting and poststratification
General framework for adjusting for differences between
sample and population
Population estimate = avg over poststratification cells
You might have to model:
The survey response
Size of poststratification cells
Probabilities of selection
Respondent-driven sampling example:
Cells determined by “gregariousness” and “distance”
Could approximate correlations using clustering
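The statement that the population estimate is an average over poststratification cells can be written as a size-weighted mean of cell estimates. The sketch below uses two hypothetical cells with made-up estimates and population sizes; in practice both the cell estimates and the cell sizes may themselves come from models.

```python
def poststratify(cell_estimates, cell_sizes):
    """Population estimate: cell estimates weighted by population cell size."""
    total = sum(cell_sizes.values())
    return sum(cell_sizes[c] * cell_estimates[c] for c in cell_sizes) / total


# Hypothetical example: the survey response averages 0.6 among the young
# and 0.4 among the old, and the population is 30% young and 70% old.
estimates = {"young": 0.6, "old": 0.4}
sizes = {"young": 300, "old": 700}

population_estimate = poststratify(estimates, sizes)
print(population_estimate)
```

If the sample happened to be half young and half old, its raw mean would be 0.5, while the poststratified estimate is 0.46; the difference is the adjustment for the mismatch between sample and population.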