Census Hub Project
Download
Report
Transcript Census Hub Project
National Case Study
The first EU Member State,
which conducted its population
count in 2011 (1-28 February).
Incredible success – more
than 41% enumerated via
Internet!
For the first time - eCensus via
Internet was held from 1st to 9th
February 2011.
On average 300 000 citizens
were enumerated a day and
in the last day - more than
half a million!
eCensus Architecture
eCensus Architecture
Inernal Users of NSI
Internet Users
INTERNET
Cisco ASA 5540
NSI LAN
DMZ2
DB eCensus
Active
DB eCensus
Passive
Adm,
AddrReg
Logistics
Switch
DMZ1
MS SQL Server 2008 R2
EE
Load Balancer Cisco ACE 4710
Application
Server 1
Application
Server 2
….…..
….…..
….…..
Application
Server 6
SAN
BG
Traffik
All servers are based on Windows Server 2008 R2 x64 Enterprise Edition.
The database software is Microsoft SQL Server 2008 R2 Enterprise Edition.
Microsoft Windows 2008 R2 Standard Edition;
Microsoft SQL Server 2008 R2 Enterprise Edition;
Microsoft Internet Information Server 7.5;
Microsoft .NET Framework 4.0;
Microsoft Visual Studio 2010;
Microsoft C# 4.0;
Microsoft ASP.NET MVC 2.0;
Microsoft Hyper-V.
A modern and innovative technical
solution for the transmission and
dissemination of 2011 Census data of
European countries via the Eurostat
website in accordance with the
European Commission Regulation
A conceptually new system based on the
concept of data sharing, where a group
of partners agree on providing access to
their data according to standard
processes, formats and technology.
The Hub is based on 60
Hypercubes, regulated by the EU
program for 2011 Census data. The
Hypercubes are stored in the
national Dissemination Databases
and are derived when a user wants
them.
For this purpose, SDMX
standards for extraction,
processing and exchanging
data and metadata are used.
We had already conducted our 2011 Census.
8 cubes were designed and loaded by transformations of
Census data from Operational to the Analytical DB using MS
SQL Server Analysis Services. (MsOlap cubes).
Dwellings cube;
Buildings cube;
Collective Households cube;
Families cube;
Households cube;
Population cube;
Total Dwellings cube (including uninhabited ones);
Total Population cube (including temporarily residing).
We analysed the possibility to use part of the existing hardware
and software intended for the Census 2011.
A Windows Server 2008 machine acting both as WEB Application
server and Database server.
A Database server machine containing the Census 2011 data – the
Operational Database and the Analitycal (OLAP) Database with the 8
cubes , mentioned above.
For the WEB Application server:
MS Win 2008 R2 running IIS 7.5
Microsoft .NET Framework 4.0
For the Database server:
MS SQL Server 2008 R2
Enterprice Edition
The national implementation
The purpose of the project
The development of a web service which is compatible with the CENSUS
HUB WEB application and which would process the SDMX queries received
from Eurostat.
Two possible approaches to achieve the Census Hub objective:
To develop our own SDMX Reference Infrastructure;
To use already developed SDMX Infrastructure.
Choosing this aproach, we had 2 possibilities:
SDMX NSI Reference Infrastructure, developed by Eurostat;
SDMX ISTAT Framework, developed by Statistics Italy.
Our decision:
After reading the Eurostat documentation, published on
Circa, our decision was to install the SDMX NSI Reference
Infrastructure, developed by Eurostat.
Regarding the data, our decision was to use Eurostat’s
approach with the Mapping Store Database. This was the
way to be sure all data would be in the format expected
by Eurostat.
NSI Web Service v 2.1.5;
Mapping Assistant v 2.6.0;
Test Client v 2.1.6. to test the whole setup at the end.
To choose the right approach concerning the
SDMX Refference Architecture in BNSI
We took the right decision choosing the SDMX NSI
Reference Infrastructure, developed by Eurostat.
Issues associated with performance, security and other
constrains, related with IT
To provide the security, Cisco ASA 5540 was used, where separated
demilitarized zones (DMZ) for the application servers and database
servers are configured.
A server certificate has been generated and submitted to Eurostat for
allowing the connection with the WS.
Developed as a module of the Census 2011 Information system.
Through the process of WS building, all the recommendations of Eurostat on the
architecture and functionality have been met.
The purpose of the WS is to retrieve the necessary data from the Dissemination
Database (DDB), using mappings between DSD, and DDB, stored in the Mapping
Store Database.
We installed the WS on the WEB Application server and registered it in IIS under
the name CensusHub.
A server certificate has been generated and submitted to Eurostat for allowing the
connection with the WS.
The certificate has been assigned to the website censushub.nsi.bg using IIS.
SSL bindings have been configured in IIS.
The site has also been configured to use Basic HTTP Authentication.
All the Databases (the Mapping Store Database, the
Dissemination Database (DDB) and the OLAP Database)
are installed on one machine with MS SQL Server 2008 R2
Enterprise Edition.
The Mapping Store Database is generated and supported by Mapping Assistant.
It keeps the mappings between the SDMX components and the DDB.
The WS uses this information to retrieve from the DDB the data,
required by the SDMX-ML Query, and transform it to an SDMX-ML
message.
The DDB is the local data storage of the NSI dissemination environment .
We chose to generate the DDB, using the most simple schema (TYPE A)
and the same structure for the tables with the hypercubes,
recommended by Eurostat.
Currently our DDB contains only one table (H06) – the table,
corresponding to HyperCube 6 according to the Eurostat
specifications.
HC06
(The same as in the Eurostat specifications)
HC06_BG
The 8 structural
dimensions defined by
legislation
(from GEO to AGE)
The attributes, defined at
the CELL (OBS_VALUE,
OBS_STATUS,
OBS_LEVEL) and at the
GROUP level (TIME,
TIME_FORMAT, FREQ)
ID
int
GEO
nvarchar(10)
SEX
nvarchar(10)
FST
nvarchar(10)
LMS
nvarchar(10)
CAS
nvarchar(10)
POB
nvarchar(10)
COC
nvarchar(10)
AGE
nvarchar(10)
OBS_VALUE
int
OBS_STATUS
nvarchar(10)
OBS_LEVEL
nvarchar(10)
TIME
nvarchar(10)
TIME_FORMAT
nvarchar(50)
FREQ
nvarchar(10)
The OLAP Database contains cubes and dimensions,
generated and fed with data from the Census 2011 for the
needs of NSI.
We used the OLAP Database as a source for populating the DDB.
The OLAP Database contains all the data that has to be extracted, transformed and
loaded into the DDB.
The whole ETL process is done through a combination of SQL and MDX scripts.
The result from the MDX scripts is then loaded into the DDB.
Connected as
a Linked Server
DDB
Collecting Data
SQL script,
executing
the MDX
Scripts
against the
OLAP
Database
Writing to the DDB table
Before running the scripts, the OLAP Database has to be connected to the
Relational Database as a Linked Server.
Mapping of dimensions between the OLAP database and the Eurostat DSDs is
done through declarations of calculated members in the MDX script.
There is a separate MDX query for each sub-cube of the Hypercube, containing the
calculated member declarations and a cross-join for getting the Cartesian product
of aggregated data over the members of the included dimensions.
Connected as
a Linked Server
DDB
Collecting Data
SQL script,
executing
the MDX
Scripts
against the
OLAP
Database
Writing to the DDB table
The SQL script executes the queries from the MDX scripts against the Linked
Server.
The results from the MDX script are saved into a temporary table variable.
Some additional modifications are done.
At the end the data are loaded into the DDB table.
At the end of August 2011, we were ready to send to Eurostat all the necessary information
for testing our WEB Service:
the URL;
the certificate for authorization;
the User name and Password for the certificate and for the Basic HTTP Authentication
user on the Web server.
Soon after that, we were happy to be welcomed on board for the Census Hub project with
an official letter from Eurostat.
Our participation in the Census Hub project was a very good occasion to gain experience
from such a modern and innovative technical solution and to start using SDMX formats.
Plamen Minkov, Director of
Information and Communication
Technology Department,
NSI, BULGARIA.
Tel. 00359 2 9857 528
E-mail: [email protected]