STATISTICS PROJECT
Priya Mariam Simon
Aparna Rajeev
Sudhit Sethi
Jinto Antony Kurian
Objective
The main objective of our project is to acquire an in-depth
understanding of the collection, organization and interpretation of
numerical facts for making managerial decisions. The data for the
project was collected from the Outlook (India) site, using its list
of the top 100 engineering colleges in India.
Variables
The variables used are nominal variables, ordinal variables and
numerical score variables. The nominal variables are Name of the
Institution, City and G/P (Government/Private). The ordinal variable
is Rank. The score variables are IC (Intellectual Capital), I&F
(Infrastructure and Facilities), PS (Pedagogic Systems), II (Industry
Interface) and P (Placement).
Attributes
IC (Intellectual Capital) represents the quality of the students each
institute possesses.
I&F (Infrastructure and Facilities) includes land, buildings and the
other facilities an institute possesses.
PS (Pedagogic Systems) refers to the instructional methods used for
educational purposes.
II (Industry Interface) means corporate interaction with the college.
P (Placements) refers to campus recruitment.
Frequency Distribution Table
Class interval   Midpoint (x)   Frequency (f)   cf    fx       d        d^2        fd^2
50-55            52.5           0               0     0        -18.05   325.8025   0
55-60            57.5           5               5     287.5    -13.05   170.3025   851.5125
60-65            62.5           21              26    1312.5   -8.05    64.8025    1360.853
65-70            67.5           33              59    2227.5   -3.05    9.3025     306.9825
70-75            72.5           18              77    1305     1.95     3.8025     68.445
75-80            77.5           10              87    775      6.95     48.3025    483.025
80-85            82.5           5               92    412.5    11.95    142.8025   714.0125
85-90            87.5           2               94    175      16.95    287.3025   574.605
90-95            92.5           5               99    462.5    21.95    481.8025   2409.013
95-100           97.5           1               100   97.5     26.95    726.3025   726.3025
Total                           100                   7055                         7494.75
Statistical details
Mean    Median   Mode   S.D.
70.55   66.65    65.1   8.66
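The mean and S.D. reported above can be reproduced from the midpoints and frequencies in the frequency distribution table. The sketch below is illustrative only: the function name is hypothetical, and it assumes the S.D. is the population form (dividing by N = 100), which is the form that matches the reported 8.66. The median and mode are not recomputed here.

```python
import math

# Class midpoints (x) and frequencies (f) from the frequency distribution table.
midpoints = [52.5, 57.5, 62.5, 67.5, 72.5, 77.5, 82.5, 87.5, 92.5, 97.5]
frequencies = [0, 5, 21, 33, 18, 10, 5, 2, 5, 1]


def grouped_mean_sd(xs, fs):
    """Return (mean, population S.D.) for grouped data."""
    n = sum(fs)                                        # total frequency, N
    mean = sum(f * x for x, f in zip(xs, fs)) / n      # sum(fx) / N
    sum_fd2 = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs))  # sum(f * d^2)
    return mean, math.sqrt(sum_fd2 / n)


mean, sd = grouped_mean_sd(midpoints, frequencies)
print(round(mean, 2), round(sd, 2))  # 70.55 8.66
```

Here sum(fx) = 7055 and sum(fd^2) = 7494.75, the two totals in the table.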
Frequency distribution
[Figure: histogram of frequency (y-axis) against class midpoints (x-axis, 52.5 to 97.5).]
The histogram above shows the frequency of each class in the distribution.
For example, 5 colleges fall in the class 55-60.
Frequency Polygon
[Figure: frequency polygon of frequency (y-axis) against class midpoints (x-axis, labelled "Total Value").]
The frequency polygon shows the outline of the data pattern more clearly.
Less than Ogive
[Figure: less-than ogive of cumulative frequency (y-axis, 0 to 120) against class midpoints (x-axis, labelled "Total value").]
The Ogive curve above is a graphical representation of the cumulative
frequency distribution.
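The cumulative frequencies plotted on the ogive are running totals of the class frequencies. A minimal sketch of that step, using the class intervals and frequencies from the frequency distribution table (the variable names are illustrative):

```python
from itertools import accumulate

# Class intervals and frequencies from the frequency distribution table.
classes = ["50-55", "55-60", "60-65", "65-70", "70-75",
           "75-80", "80-85", "85-90", "90-95", "95-100"]
frequencies = [0, 5, 21, 33, 18, 10, 5, 2, 5, 1]

# Running total: each entry counts the colleges in that class or any lower class.
cumulative = list(accumulate(frequencies))

for label, cf in zip(classes, cumulative):
    print(f"{label:>7}  {cf:>3}")
```

The last running total equals the total number of colleges, 100.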
Pie Chart showing distribution of Engineering Institutes
[Figure: pie chart "Distribution of colleges": Government (G) 60%, Private (P) 40%.]
Table showing Correlation and Regression between IC & P
Correlation   0.81
Slope         0.57
Intercept     0.71
R^2           0.649154
R             0.805701
Correlation & Regression line
[Figure: scatter plot "Correlation between IC & P" with a fitted linear trend line; x-axis: Intellectual Capital, y-axis: Placements.]
y = 0.5707x + 0.7104
R^2 = 0.6492
Table showing Correlation and Regression between II & P
Correlation coefficient   0.630374
Slope                     0.853955
Intercept                 3.843427
r^2                       0.397371
r                         0.630374
[Figure: scatter plot "Correlation between II & P" with a fitted linear trend line; x-axis: Industry Interface, y-axis: Placements.]
y = 0.854x + 3.8434
R^2 = 0.3974
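The two fitted lines can be used to estimate a Placement score from an IC or II score, and the reported r and r^2 values can be checked against each other. The sketch below is illustrative: the function name and the input scores (20 and 10) are hypothetical and are not taken from the data set.

```python
# Fitted lines as reported on the two scatter plots: (slope, intercept).
LINES = {
    "IC": (0.5707, 0.7104),   # P = 0.5707 * IC + 0.7104
    "II": (0.854, 3.8434),    # P = 0.854 * II + 3.8434
}


def predict_placement(predictor, score):
    """Estimate the Placement score from an IC or II score."""
    slope, intercept = LINES[predictor]
    return slope * score + intercept


# Hypothetical scores chosen only for illustration.
print(round(predict_placement("IC", 20), 2))  # 12.12
print(round(predict_placement("II", 10), 2))  # 12.38

# r^2 is the square of the correlation coefficient r.
r_ic, r_ii = 0.805701, 0.630374
print(round(r_ic ** 2, 4), round(r_ii ** 2, 4))  # 0.6492 0.3974
```

On these figures, IC accounts for about 65% of the variation in P and II for about 40%, so IC is the stronger linear predictor of the two.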
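Rank correlation
$r = 1 - \dfrac{6\left[\sum D^2 + \sum \frac{m^3 - m}{12}\right]}{n(n^2 - 1)}$
where D is the difference between ranks, m is the number of times a rank
repeats and n is the number of colleges ranked.
Sum of squares of the differences, $\sum D^2$    14768
Repetition term, $\sum (m^3 - m)$                2382
Rank correlation, r                              0.910192019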
The rank correlation is calculated from the sum of the squares of the
differences between ranks. Since some ranks are repeated, a modified
formula with a correction for ties is used, as shown in the table above.
Using this approach, the rank correlation is found to be 0.9102.
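The reported rank correlation can be reproduced from the tabulated figures. The sketch below rests on two assumptions that are not stated explicitly in the slides: n = 100 (the number of colleges), and the tabulated 2382 is the sum of (m^3 - m) over the repeated ranks, so that the correction term is 2382/12. With those assumptions the result matches the reported value. The function name is hypothetical.

```python
def rank_correlation_with_ties(sum_d2, sum_m3_minus_m, n):
    """Spearman rank correlation with the correction term for repeated ranks."""
    correction = sum_m3_minus_m / 12            # sum of (m^3 - m) / 12
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))


# Assumptions: n = 100 colleges; 2382 is read as sum(m^3 - m).
r = rank_correlation_with_ties(sum_d2=14768, sum_m3_minus_m=2382, n=100)
print(round(r, 6))  # 0.910192
```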
Probability
Frequency
Range     Government   Private   Total
50-60     5            13        18
60-70     28           23        51
70-80     17           3         20
80-90     7            1         8
90-100    3            0         3
Total     60           40        100
What is the probability that a college selected at random falls in the
range 70-80?
P(E) = 20/100 = 0.2
What is the probability that a college selected at random falls in the
range 60-70, given that it is a private college?
P(E|P) = 23/40 = 0.575
(The joint probability that a college is both private and in the range
60-70 is 23/100 = 0.23.)
What is the probability that a college selected at random is a
government college or falls in the range 80 to 90?
P(G ∪ E) = P(G) + P(E) - P(G ∩ E) = 60/100 + 8/100 - 7/100 = 61/100
= 0.61
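The three probabilities can be checked directly from the Government/Private frequency table. The following is a minimal sketch with illustrative variable names; exact fractions are used so that no rounding enters the arithmetic.

```python
from fractions import Fraction

# (government, private) counts per range, copied from the table above.
counts = {
    "50-60": (5, 13),
    "60-70": (28, 23),
    "70-80": (17, 3),
    "80-90": (7, 1),
    "90-100": (3, 0),
}

total_gov = sum(g for g, _ in counts.values())    # 60
total_priv = sum(p for _, p in counts.values())   # 40
total = total_gov + total_priv                    # 100

# P(range 70-80): marginal probability of the range.
p_70_80 = Fraction(sum(counts["70-80"]), total)

# P(range 60-70 | private): conditional probability, restricted to private colleges.
p_60_70_given_private = Fraction(counts["60-70"][1], total_priv)

# P(government or range 80-90): addition rule, subtracting the overlap once.
p_gov_or_80_90 = (Fraction(total_gov, total)
                  + Fraction(sum(counts["80-90"]), total)
                  - Fraction(counts["80-90"][0], total))

print(float(p_70_80), float(p_60_70_given_private), float(p_gov_or_80_90))
# 0.2 0.575 0.61
```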
Normal Distribution
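• The normal distribution is a continuous distribution.
• It shows the distribution of data as an area under the curve.
• From the data, 72% of colleges lie between 65 and 90.
• The curve is right-skewed: the mean (70.55) exceeds the median (66.65)
and the mode (65.1).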
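The share of colleges between 65 and 90 can be approximated as an area under a normal curve. The sketch below assumes a normal curve with the mean (70.55) and S.D. (8.66) reported earlier; whether the 72% figure was obtained this way is an assumption, and the helper name is hypothetical.

```python
import math


def normal_cdf(x, mean, sd):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


mean, sd = 70.55, 8.66   # values from the statistical details above

# Area between 65 and 90 = area left of 90 minus area left of 65.
p_65_to_90 = normal_cdf(90, mean, sd) - normal_cdf(65, mean, sd)
print(round(p_65_to_90, 3))
```

This gives an area of roughly 0.727, in line with the 72% quoted.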