Open an Excel Worksheet in SPSS


Why Statistics?
Statistical literacy is a necessary precondition for an
educated citizenship in a technological democracy.
Understanding risks and asking critical questions can also
shape the emotional climate in a society so that hopes and
anxieties are no longer as easily manipulated from outside
and citizens can develop a better-informed and more
relaxed attitude toward their health.
Gigerenzer, G., Gaissmaier, W., Kurz-Milcke, E., Schwartz,
L. M. and Woloshin, S. (2007). “Helping doctors and patients
make sense of health statistics.” Psychological Science
in the Public Interest, 8(2), 53–96.
Tuesday, 21 July 2015
5:49 AM
Are you statistically challenged?
1. What percentage of drivers are better than average?
Calculated Risks: How to Know When Numbers Deceive You
By Gerd Gigerenzer, Simon & Schuster, New York.
Around 63%, when “average” is determined by number of
accidents. This is so because the distribution of accidents
is asymmetrical; bad drivers account for more accidents
than good ones, so most drivers have fewer than the
average number of accidents.
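The effect of a skewed distribution can be checked with a small sketch. The accident counts below are hypothetical, chosen only to illustrate how a few high values pull the mean above what most drivers experience; they are not real data.

```python
from statistics import mean

# Hypothetical accident counts for 100 drivers (illustrative only):
# most drivers have no accidents, a few have several.
accidents = [0] * 63 + [1] * 25 + [3] * 8 + [6] * 4

average = mean(accidents)

# "Better than average" here means fewer accidents than the mean.
better_than_average = sum(1 for count in accidents if count < average)
share = better_than_average / len(accidents)

print(f"mean accidents per driver: {average:.2f}")
print(f"share of drivers below the mean: {share:.0%}")
```

With a symmetric distribution the share would sit near 50%; the skew is what moves it.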
Are you statistically challenged?
2. If men with high cholesterol have a 50% higher risk
of heart attack than men with normal cholesterol, should
you panic if your cholesterol level is high?
Probably not. The 50% sounds frightening only because it is
given in relative terms: 6 out of 100 men with high
cholesterol will have a heart attack in 10 years, versus
4 out of 100 men with normal levels. In absolute terms,
the increased risk is only 2 out of 100, or 2 percentage
points. Look at it this way: even in the high cholesterol
category, 94% of the men won’t have heart attacks.
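The two ways of expressing the same difference can be computed side by side. This is a minimal sketch using the 6-in-100 and 4-in-100 figures from the answer; the function name is made up for illustration.

```python
def risk_summary(risk_exposed, risk_baseline):
    """Return (absolute increase, relative increase) between two risks."""
    absolute_increase = risk_exposed - risk_baseline
    relative_increase = absolute_increase / risk_baseline
    return absolute_increase, relative_increase


# 10-year heart attack risk: high cholesterol vs normal cholesterol.
absolute, relative = risk_summary(6 / 100, 4 / 100)

print(f"absolute increase: {absolute * 100:.0f} percentage points")
print(f"relative increase: {relative:.0%}")
print(f"high-cholesterol men without a heart attack: {1 - 6 / 100:.0%}")
```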
Are you statistically challenged?
3. HIV tests are 99.9 percent accurate. You test
positive for HIV, although you have no known risk factors.
What is the likelihood that you have AIDS, if 0.01 percent
of men with no known risk behaviour are infected?
Fifty-fifty. Most people assume the probability is much
higher, an illustration of the “illusion of certainty.” The
correct answer is clear if the problem is framed in
frequencies: Take 10,000 men with no known risk factors. 1
of these men has AIDS; he will almost certainly test
positive. Of the remaining 9,999 men, 1 will also test
positive (this assumes a false-positive rate of about 0.01
percent, reading the 99.9 percent figure as the chance that
an infected man tests positive). Thus, the likelihood that
you have AIDS given a positive test is 1 out of 2. A positive
AIDS test, although cause for concern, is far from a death
sentence.
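The frequency argument is Bayes' rule in disguise, and can be sketched as below. The false-positive rate of 0.01% is an assumption needed to reproduce "1 of the remaining 9,999 also tests positive"; the slide itself only states "99.9 percent accurate", which is treated here as the sensitivity. The function name is hypothetical.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Probability of infection given a positive test (Bayes' rule)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Assumed inputs: base rate 0.01%, sensitivity 99.9%,
# false-positive rate 0.01% (an assumption, see the note above).
ppv = positive_predictive_value(0.0001, 0.999, 0.0001)

# For comparison: if "99.9 percent accurate" instead meant a
# false-positive rate of 0.1%, the answer would be much lower.
ppv_alt = positive_predictive_value(0.0001, 0.999, 0.001)

print(f"P(infected | positive), 0.01% false positives: {ppv:.2f}")
print(f"P(infected | positive), 0.1% false positives:  {ppv_alt:.2f}")
```

The comparison shows how strongly the result depends on the false-positive rate when the base rate is this low.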
Are you statistically challenged?
4. The blood found under the fingernails of a murdered
woman matches the defendant’s blood type, which only 17.3
percent of the population shares. The blood on the defendant’s
shoes matches the victim’s type, which only 15.7 percent of
the population shares. An expert witness at trial testified
that multiplying these two probabilities together gives a
joint probability of 2.7 percent that these two matches would
occur by chance, and that there was, therefore, a 97.3
percent chance that the defendant is the murderer. What is
the flaw in the expert’s reasoning?
This is an example of the “prosecutor’s fallacy”:
namely, the erroneous assumption that the random match
probability (2.7%) is the probability that the defendant is
innocent, which would leave a 97.3% probability of guilt.
The actual probability that the defendant is the murderer
based solely on these two matches is very small. Frequency
analysis again shows why: Assume that any of the 100,000 men
in the city where the murder took place could be the
murderer. One of them, the murderer, will show both matches
with practical certainty. Of the other 99,999 residents, we
can expect about 2,700 (2.7%) to show the same matches. Thus,
the probability that a man with both matches is the
murderer is about 1 in 2,700, less than one-tenth of 1 percent.
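The frequency analysis can be reproduced in a few lines. This sketch follows the slide's assumptions: 100,000 equally likely suspects, the two blood-type matches treated as independent, and no other evidence. Variable names are illustrative.

```python
population = 100_000          # men who could be the murderer (slide's assumption)
p_match_fingernails = 0.173   # share of population with the defendant's blood type
p_match_shoes = 0.157         # share of population with the victim's blood type

# Multiplying assumes the two matches are independent events.
p_both_by_chance = p_match_fingernails * p_match_shoes

# Innocent men expected to show both matches purely by chance.
expected_innocent_matches = (population - 1) * p_both_by_chance

# The murderer is one man among everyone who shows both matches.
p_guilty_given_matches = 1 / (1 + expected_innocent_matches)

print(f"random match probability: {p_both_by_chance:.1%}")
print(f"expected innocent matches: {expected_innocent_matches:.0f}")
print(f"P(murderer | both matches): {p_guilty_given_matches:.3%}")
```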
Are you statistically challenged?
5. In his argument to the court to exclude evidence
that O.J. Simpson (1995) had battered his wife, Alan
Dershowitz successfully argued that the evidence was
irrelevant because, although there were between 2.5 and 4
million incidents of abuse of domestic partners, there were
only 1,432 homicides. Thus, he argued, “an infinitesimal
percentage – certainly fewer than 1 of 2,500 – of men who
slap or beat their domestic partners go on to murder
them.” Dershowitz’s argument is outrageously incorrect:
the likelihood that a batterer murdered his partner, given
that she has been murdered, is 8 out of 9, or around 90%.
What is missing from Dershowitz’s analysis?
Either Dershowitz was confused, or he purposely
hoodwinked the court, in much the same way that the
tobacco industry seeks to obscure the risks of smoking. His
analysis omits a key element: what proportion of battered
women are killed each year by someone other than their
partners? The answer is around 0.005%. Now, think of
100,000 battered women. 40 will be murdered this year by
their partners. 5 will be murdered by someone else. Thus,
40 of the 45 battered women who are murdered will be killed
by their batterers; in only 1 of 9 cases is the murderer
someone other than the batterer.
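The same counts show both Dershowitz's figure and the relevant one. A minimal sketch using the numbers from the answer; the point is that the question conditions on the woman having been murdered.

```python
from fractions import Fraction

battered_women = 100_000
murdered_by_partner = 40   # per year, out of 100,000 battered women
murdered_by_other = 5      # per year, out of 100,000 battered women

# Dershowitz's figure: chance that a battered woman is murdered by her partner.
p_murder_given_battering = Fraction(murdered_by_partner, battered_women)

# Relevant figure: chance the batterer is the murderer,
# given that a battered woman has been murdered.
p_batterer_given_murder = Fraction(
    murdered_by_partner, murdered_by_partner + murdered_by_other
)

print(f"P(murdered by partner | battered) = {p_murder_given_battering}")
print(f"P(batterer did it | battered and murdered) = {p_batterer_given_murder}")
```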
How to Open an Excel Worksheet in SPSS
Data preparation in Excel.
Each column is a variable.
The data type and width for each variable are
determined by the data type (text, numeric
etc.) and width in the Excel file.
Blank Cells
For numeric variables, blank cells are converted
to the system-missing value, indicated by a
period.
Variable Names
The name must begin with a letter. The remaining characters
can be any letter, any digit, a period, or the symbols @, #,
_ or $.
Variable names can be defined with any mixture of uppercase
and lowercase characters, and case is preserved for display
purposes.
Variable names cannot end with a period.
The length of the name cannot exceed 64 characters.
Blanks and special characters (for example !, ?, ' and *)
cannot be used.
Reserved keywords such as ALL, AND, BY, EQ, GE, GT, LE, LT,
NE, NOT, OR, TO, WITH cannot be used as variable names.
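The naming rules above can be checked before import with a short helper. This is a sketch of the rules as stated on the slides, not SPSS's own validator: it assumes ASCII letters only and treats reserved keywords as case-insensitive, and the function name is made up for illustration.

```python
import re

# Reserved keywords listed on the slide.
RESERVED = {
    "ALL", "AND", "BY", "EQ", "GE", "GT", "LE",
    "LT", "NE", "NOT", "OR", "TO", "WITH",
}

# A letter first, then letters, digits, period, @, #, _ or $.
# Assumption: only ASCII letters are considered here.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9.@#_$]*")


def is_valid_variable_name(name):
    """Check a column header against the naming rules listed above."""
    if len(name) > 64:
        return False
    if not NAME_PATTERN.fullmatch(name):
        return False
    if name.endswith("."):
        return False
    if name.upper() in RESERVED:
        return False
    return True


for candidate in ["age", "score_1", "1st_score", "my var", "x.", "AND"]:
    print(candidate, is_valid_variable_name(candidate))
```

Running it over the first row of the worksheet catches problem headers before SPSS renames or rejects them.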
Data Entry
For simplicity, enter your data into Excel with
the first cell in each column containing the
variable name.
It is essential that all column widths are
stretched to accommodate their data.
It is important that you “proof” your data.
For instance, check the minimum and maximum
value in each column.
You can do this either in Excel (not forgetting
to delete the extra rows you added for these checks)
or in SPSS (not forgetting to make the corrections
to the Excel version).
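The minimum/maximum check can also be sketched outside Excel and SPSS. The column names and values below are hypothetical, with one deliberate typing error (an age of 230) to show what the check catches; blank cells are represented as None and skipped.

```python
def column_ranges(columns):
    """Return {column name: (minimum, maximum)}, ignoring blank cells."""
    ranges = {}
    for name, values in columns.items():
        present = [value for value in values if value is not None]
        ranges[name] = (min(present), max(present))
    return ranges


# Hypothetical worksheet contents, one list per column.
columns = {
    "age": [23, 31, 45, 230, None, 28],   # 230 is a keying error
    "sex": [0, 1, 1, 0, 1, 0],
}

ranges = column_ranges(columns)
for name, (low, high) in ranges.items():
    print(f"{name}: min={low}, max={high}")
```

An implausible maximum such as 230 for age points straight at the cell that needs correcting in the Excel version.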
Loading Into SPSS
In SPSS, choose File > Open > Data.
In the Open Data dialog, select Excel as the file type.
Once the worksheet is open in SPSS, note that sex has been
recoded as 0/1, as preferred by SPSS.
SPSS Tips
Perhaps the simplest way to ease yourself into using
SPSS syntax is to click on the Paste button instead
of the OK button after you have set up your analysis.
This will paste the code that SPSS uses to run your
analysis into a syntax window. A syntax file is
nothing more than a text file; hence, you can type
code and comments into it, and you can cut-and-paste
in it as you would in any text editor.
Select Paste.
What action does this code perform?
To run the code that you have pasted, simply
highlight it and click on the right-pointing arrow
(green triangle) at the top. Your results will be
displayed in the output window just as if
you had used the point-and-click interface. Another
option is to right-click on the highlighted
commands and choose “Run All”.
Activate
Alternatively, to activate, return via the drop-down
menus to your intended command. Your previous
selections will be intact, so you can now select OK as
usual.
Either way, you have preserved your syntax for use on
future occasions, for instance to run a full analysis
after having conducted a pilot study.
The OUTPUT EXPORT command exports output from an open
Viewer document to an external file format, such as
Word. By default, the contents of the designated
Viewer document are exported, but a different
Viewer document can be specified by name. The
target file and format can be selected.
Export can also be started by right-clicking within the
output shown in the Statistics Viewer and selecting
Export.
Export
Destination
Now you should go and try for yourself.
Each week our cluster (5.05) is booked for 2 hours
after this session. This will enable you to come and go
as you please.
Obviously other timetabled sessions for this module
take precedence.
Now to consider Power