Transcript PROJEKT II
Sociological methodology
Quantification
Petr Soukup
Outline of the lecture
1. Data, variables, values in quantitative research
2. Data preparation, data entry, coding of variables
3. Presentation of data, basics of statistics (Quantification)
4. Advanced statistical techniques – review
5. Statistical software
6. Data archives and their usage
1. Data, variables
• Variables – 3 types (nominal, ordinal, interval)
• Variables and questions
• Variable = 1 question?
• Numerical data – necessary for computer processing
Data, variables
• Values of variables – natural for interval variables; artificial (assigned by the researcher) for ordinal and nominal variables
• Data matrix – rows (respondents), columns (variables)
• Technical names of variables (questionnaire)
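The data matrix described above can be sketched in Python as a list of rows, one per respondent, with one column per variable. The technical variable names (`q1_sex`, `q2_educ`, `q3_age`), the codes, and all values are invented for illustration; they do not come from any real questionnaire.

```python
# Hypothetical technical variable names, one per questionnaire item.
variables = ["q1_sex", "q2_educ", "q3_age"]

# Each row is one respondent; each column is one variable.
# q1_sex:  nominal, artificial codes (1 = male, 2 = female)
# q2_educ: ordinal, artificial codes (1 = primary, 2 = secondary, 3 = tertiary)
# q3_age:  interval, natural values (age in years)
data_matrix = [
    [1, 2, 34],
    [2, 3, 51],
    [2, 1, 27],
    [1, 2, 45],
]


def column(matrix, names, name):
    """Return all values of one variable (one column of the matrix)."""
    index = names.index(name)
    return [row[index] for row in matrix]


ages = column(data_matrix, variables, "q3_age")
print(len(data_matrix), "respondents,", len(variables), "variables")
print("q3_age:", ages)
```

Only `q3_age` carries values with a natural numeric meaning; the codes for sex and education are arbitrary labels chosen by the researcher.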
2. Open-ended questions and the problem of coding
1) Predefined coding schemes
Examples – ISCO, EGP, ISEI
2) Creating our own coding scheme – example
Collapsing of categories
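A minimal sketch of coding an open-ended question and then collapsing categories might look as follows. The question, the categories, and the numeric codes are made up for this example; in particular, they are not ISCO, EGP, or ISEI codes.

```python
# Hypothetical coding scheme for the open-ended question
# "What is your occupation?" (codes invented, NOT ISCO codes).
coding_scheme = {
    "teacher": 1,
    "nurse": 2,
    "doctor": 3,
    "bricklayer": 4,
    "plumber": 5,
}
OTHER = 99  # code for answers not covered by the scheme

# Collapsing: several detailed codes are merged into one broader category.
collapse = {1: 1, 2: 2, 3: 2, 4: 3, 5: 3, OTHER: 9}
collapsed_labels = {1: "education", 2: "health care", 3: "manual trades", 9: "other"}


def code_answer(answer):
    """Assign a numeric code to a raw text answer."""
    return coding_scheme.get(answer.strip().lower(), OTHER)


answers = ["Teacher", " nurse", "Plumber", "astronaut", "doctor"]
codes = [code_answer(a) for a in answers]
collapsed = [collapse[c] for c in codes]

print(codes)                                     # detailed codes
print(collapsed)                                 # collapsed codes
print([collapsed_labels[c] for c in collapsed])  # labels of collapsed codes
```

Keeping the detailed codes and deriving the collapsed ones from them means the collapsing can be changed later without recoding the raw answers.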
3. Basics of statistics (Quantification)
1) Frequencies
2) Central tendency
3) Crosstabulations
Basics of statistics – Frequencies
Frequencies – absolute and relative (percentages)
Interpretation
Example in MS Excel (data EVS)
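Outside Excel, absolute and relative frequencies can be sketched in a few lines of Python. The answers below are invented and are not taken from the EVS data.

```python
from collections import Counter

# Made-up answers to one question (1 = yes, 2 = no, 3 = don't know).
values = [1, 2, 1, 1, 3, 2, 1, 2, 1, 1]

# Absolute frequencies: how many times each value occurs.
absolute = Counter(values)

# Relative frequencies: share of each value in percent.
n = len(values)
relative = {value: 100 * count / n for value, count in absolute.items()}

for value in sorted(absolute):
    print(f"{value}: {absolute[value]:3d}  {relative[value]:5.1f} %")
```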
Basics of statistics – Central tendencies
MEAN – arithmetic mean of values (sum of values divided by the number of values)
MEDIAN – the value in the middle of the ordered data
MODE – the most frequent value in our data
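The three measures can be compared on a small invented data set using Python's standard `statistics` module. The values are illustrative only; one extreme value is included on purpose to show that it pulls the mean but not the median or the mode.

```python
import statistics

# Made-up values with one extreme observation (200).
values = [20, 22, 22, 25, 30, 31, 200]

mean = statistics.mean(values)      # sum of values / number of values
median = statistics.median(values)  # middle value of the ordered data
mode = statistics.mode(values)      # most frequent value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```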
Basics of statistics – Dispersion
Range – difference between maximum and minimum
Standard deviation – dispersion for interval variables (interpretation of standard deviation)
Examples in Excel
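As a sketch of the same computations outside Excel, the range and the standard deviation can be obtained in Python as shown below. The data are invented. Note the assumption about which formula is meant: `statistics.stdev` divides by n - 1 (sample standard deviation, which is also what Excel's STDEV computes), while `statistics.pstdev` divides by n (population standard deviation).

```python
import statistics

# Made-up values of an interval variable.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: difference between maximum and minimum.
value_range = max(values) - min(values)

# Standard deviation: typical distance of the values from the mean.
sample_sd = statistics.stdev(values)       # divides by n - 1
population_sd = statistics.pstdev(values)  # divides by n

print("range:", value_range)
print("sample SD:", round(sample_sd, 3))
print("population SD:", round(population_sd, 3))
```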
Bivariate statistics
Nominal and ordinal variables:
Crosstabulations – total counts, row and column percentages
Chi-square test of independence
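A crosstabulation with row percentages and the chi-square statistic can be sketched as follows. The counts are invented, the statistic is computed without any continuity correction, and the comparison value 3.841 is the usual 5 % critical value for one degree of freedom.

```python
from collections import Counter

# Made-up pairs (sex, answer) for 100 respondents.
pairs = (
    [("male", "yes")] * 30 + [("male", "no")] * 20
    + [("female", "yes")] * 20 + [("female", "no")] * 30
)

counts = Counter(pairs)  # observed counts in each cell of the table
rows = sorted({r for r, _ in pairs})
cols = sorted({c for _, c in pairs})
n = len(pairs)

row_totals = {r: sum(counts[(r, c)] for c in cols) for r in rows}
col_totals = {c: sum(counts[(r, c)] for r in rows) for c in cols}

# Row percentages: each cell as a share of its row total.
row_pct = {(r, c): 100 * counts[(r, c)] / row_totals[r] for r in rows for c in cols}

# Chi-square statistic: sum of (observed - expected)^2 / expected,
# where expected = row total * column total / n under independence.
chi_square = 0.0
for r in rows:
    for c in cols:
        expected = row_totals[r] * col_totals[c] / n
        chi_square += (counts[(r, c)] - expected) ** 2 / expected

degrees_of_freedom = (len(rows) - 1) * (len(cols) - 1)

print("row percentages:", row_pct)
print("chi-square:", chi_square, "df:", degrees_of_freedom)
print("independence rejected at 5 % level:", chi_square > 3.841)
```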
Interval variables:
Correlation
Correlation coefficient and its values
Examples in Excel (EVS data)
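For interval variables, the Pearson correlation coefficient can be sketched directly from its definition. The function name `pearson_r` and the data are assumptions made for this illustration; the values are not from the EVS data. The coefficient ranges from -1 (perfect negative linear relationship) through 0 (no linear relationship) to +1 (perfect positive linear relationship).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up values of two interval variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print("r =", round(r, 3))
```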
Functions in Excel:
FREQUENCY – prepares a frequency table
AVERAGE – computes the arithmetic mean of values (sum of values divided by the number of values)
MEDIAN – computes the median (the value in the middle of the ordered data)
MODE – computes the mode, the most frequent value in our data
MAX – maximum of variable
MIN – minimum of variable
STDEV – standard deviation
4. Advanced statistics - review
More than two variables
Usually called modeling
Examples:
Loglinear modeling
Regression modeling
Structural Equation modeling
Etc.
Usually called Multivariate or Multidimensional Statistics (try Google)
5. Statistical software – review
A. General statistical software
Can prepare data (enter data, add labels, clean and transform data) and also compute individual
statistical procedures. They include many statistical procedures, so they can be used in nearly
every situation.
List of the most common:
SPSS (Statistical Package for the Social Sciences)
Origin: USA
13 modules; for more information see http://www.spss.com/ or http://www.acrea.cz/
Current version: 20
Price: approx. 10 thousand USD for the full system
Trial version: for 15 days
SAS (Statistical Analysis System)
Origin: USA
More info: http://www.sas.com/offices/europe/czech/index/index.html
Again sold as individual modules
Current version: 10
Price: approx. 50 thousand USD
Trial version: NO
STATISTICA
Origin: USA
More info: http://www.statsoft.cz/page/index.php
Again sold as individual modules
Current version: 8
Trial version: for 30 days
Price: approx. 5 thousand USD
StatSoft Textbook: see http://www.statsoft.cz/page/index2.php?pg=navigace&nav=31
STATA
Origin: USA
More info: http://www.stata.com/capabilities/statisticalcap.html
Again sold as individual modules
Current version: 10
Trial version: NO
Price: approx. 3 thousand USD
R (R project)
FREEWARE, OPEN-SOURCE SOFTWARE
Download: http://www.r-project.org/
Current version: 2.7
Disadvantages: no menu, weaker graphical abilities
Advantages: the quickest to include new statistical procedures, very low hardware
requirements, many forums about the software
B. Special statistical software
Can be used for specialized (advanced) statistical procedures. Usually it is necessary to prepare
the data in some general statistical software (see above).
Examples:
AMOS – software for Structural Equation Modeling
HLM – software for Multilevel Modeling
lEM – software for Latent Class Analysis
etc. There are hundreds of such programs; some of them are freeware, some are commercial
(and usually not very cheap).
6. Data archives in Social sciences
Store data from quantitative surveys (sometimes also from qualitative ones)
- Usually national archives
- There are many archive associations
What can be reached via data archives?
- Original data – in many formats (for SPSS, SAS, Excel etc.)
- Original Questionnaire
- Codebook – information about individual variables, their values and their labels (sometimes
also frequency tables for all variables included in the data file)
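The role of a codebook can be illustrated with a small Python sketch that maps variable names to their labels and value labels. The variable names, wordings, and codes are hypothetical and do not come from any archived data file.

```python
# Hypothetical codebook: variable name -> variable label and value labels.
codebook = {
    "q1_sex": {
        "label": "Sex of respondent",
        "values": {1: "male", 2: "female"},
    },
    "q5_trust": {
        "label": "Most people can be trusted",
        "values": {1: "agree", 2: "disagree", 9: "no answer"},
    },
}


def describe(variable, code):
    """Translate a numeric code from the data file into readable text."""
    entry = codebook[variable]
    value_label = entry["values"].get(code, "undocumented code")
    return f"{entry['label']}: {code} = {value_label}"


print(describe("q1_sex", 2))
print(describe("q5_trust", 9))
print(describe("q5_trust", 7))  # code missing from the codebook
```

Without the codebook, the numeric values in the data file for nominal and ordinal variables cannot be interpreted.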
6. Data archives in Social sciences - Additional services
• Possibility to find data on an individual topic
• Possibility to find the list of books and articles based on a selected data file
• Possibility to compute basic statistics without downloading the data (frequencies, crosstabulations, correlations)
• etc.
Note: Some operations are free; for others it is necessary to register or pay.
6. Data archives in Social sciences
Example:
Czech data archive (http://archiv.soc.cas.cz)
Uses the NESSTAR system
Other archives and their associations – see LINKS
Thanks for your attention