Evidence-based management: preview of key ideas

Download Report

Transcript Evidence-based management: preview of key ideas

Correlated subqueries
Tutorial 3, Q2
d) Produce a table showing the total population for each LA and the area it covers.
Compute the population density. Order your output from highest to lowest density.
e) Extend the previous query to include the total number of crimes in each LA.
Part e) is not a simple extension of d). Bringing in violent_crime in a 3 way join will
not work, and this will mean the information about population density and ward
are is repeated multiple times. The answer is to use a subquery:
SELECT LA_code, sum(population)/sum(wardarea) AS density,
(SELECT count(*) FROM wards_by_LA WLA2, violent_crime
WHERE WLA2.ward_code =violent_crime.ward_code AND
WLA2.LA_code= WLA1.LA_code ) AS n_crimes
FROM ward_profile, wards_by_LA WLA1
WHERE ward_profile.ward_code= WLA1.ward_code
GROUP BY LA_code
ORDER BY sum(population)/sum(wardarea) DESC;
Note the use of the subquery as an output field. This type of subquery is called a
correlated subquery because it makes an external reference (WLA1) to the main query,
and has to be re-evaluated for each iteration of the main query, i.e. each value of LA_code
More SQL
All the joins we have been considering so far are called inner joins. SQL provides
an alternative syntax:
SELECT * FROM violent_crime INNER JOIN ward_profile
ON violent_crime.ward_code= ward_profile.ward_code
Inner joins only include combinations of rows when there is a match in both
tables. This can be problematic – the above join, for instance, would not
produce any rows for wards where no crimes were committed. Sometimes we
may want to include all the rows from one table even if there is no matching
row in the other table. This is called an outer join. For example:
SELECT ward_profile.ward_code, count(crime_id) AS n_crime
FROM ward_profile LEFT OUTER JOIN violent_crime
ON ward_profile.ward_code = violent_crime.ward_code
GROUP BY ward_profile.ward_code
ORDER BY count(crime_id) DESC
LEFT just means all
rows from the first
table are included
(vice versa for RIGHT)
FINALLY – note that once you have stored an Access query, you can query it just like a
table, e.g. SELECT * from Query1. This allows very complex queries to be built up. You
can think of the query as providing a particular VIEW of the database.
Chen is not the
only notation
MORE ERM
Relationships can be a) three way and b) recursive.
SUPPLY
SUPPLIER
PROJECT
a) three way
PART
EMPLOYEE
N
SUPERVISOR
1
b) Recursive relationship. Note that the
double line means every employee has to
have a supervisor. This is called TOTAL
PARTICIPATION in the relationship.
ERM – finale: Wozzie’s Gym
USER
ID
NAME
This notation is used to indicate subclasses of
an entity. Note that the subclasses “inherit” all
the attributes of the “parent class”.
BOOKS
USER
d
CASUAL
MEMBER
FULL
MEMBER
BANK
DETAILS
FACILITY
Two implementation
options: 1) Combine casual
and full members in same
table, with an extra
attribute to indicate status.
Alternatively, have a separate table for each and then use the UNION
operator to combine them:
SELECT user_id FROM full_member
UNION
SELECT user_id FROM casual_member
Business Intelligence (BI)
David Wastell
The production of timely, accurate, high
value and actionable information
• A rational approach to continuous improvement
based on:
– Gathering/analyzing operational (transactional) data
– Making decisions & taking actions based on that data
– Measuring results according to predetermined metrics
(Key performance indicators – KPIs)
– Feeding lessons from one decision into the next
Data warehousing
A Data Warehouse is a centrally managed and integrated database containing
structured data from operational sources in an organization. Data extracts are
validated, cleansed and transformed into a common, stable, relatable view of data.
ETL = Extract Transform Load
Dashboards and KPIs
Business dashboards provide a “control panel” for monitoring the vital
functions of the business, supplying immediate information, indicating
when and where performance is lagging…..
BI in Action: Crime Policy
MADE
Police
Monthly
Reports
County Council
Fire service
Ambulance Service
Probation
Other
ETL
Datamining
(policy
research)
3000
no of serious crimes
2500
2000
alcohol involved
alc not involved
total
1500
1000
500
0
time of day
600
serious violent crime
Data mining example:
the aetiology of alcoholrelated violence
500
400
Town 1
Town 2
300
Rural 1
Rural 2
Town 3
200
100
0
time of day
Correlation = 0.67, very sig.
Evidence-based Policy: crime control
% of total in target zone
25
20
15
Guess what…..
Street drinking ban
10
5
0
Month
% total serious violent crime
committed within target zone (reduced
14.8% to 12.3%)
BUT … any validity concerns,
alternative explanations??
Ambulance
Incidents
(monthly)
Target
zone
County
demand
Before ban
9.1
516.2
After ban
8.3
541.1
Change
-9%
+5%
All effects
stat. sig.
OLAP: Online analytical processing
Crime data: relational view
Crime as a multi-dimensional
“cube”
Dimensions are
typically hierarchies
of categories
Slicing and dicing – and drilling down