Data Management Technologies
Ohm Sornil
Department of Computer Science
National Institute of Development Administration
1
Information Architecture
[Diagram labels: Application + Database; Data Warehouse; OLAP tool; Data Mining tool; Statistical tool; The Internet; leased line to other organizations]
2
Web-Survey System
3
Survey Creation
4
Create New Questions
5
Create Question (Multi-choice)
6
Multi-choice Question
7
Create Question (Matrix)
8
Matrix Question
9
Databases
• A database is a structured collection of records or data
stored in a computer so that a computer program
can consult it to answer queries
• The computer program used to manage
and query a database is known as a
database management system (DBMS).
10
Database Design and E-R Diagram
11
12
SQL
• It is the standard language for relational systems
• Supports
– Data definition
• CREATE TABLE, ALTER TABLE
– Data manipulation
• SELECT, INSERT, DELETE, UPDATE
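The statements listed above can be tried with Python's built-in sqlite3 module. This is a minimal sketch: the `respondent` table and its columns are hypothetical names chosen for illustration and do not come from the slides.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Data definition: create a table, then alter it by adding a column.
cur.execute("CREATE TABLE respondent (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("ALTER TABLE respondent ADD COLUMN age INTEGER")

# Data manipulation: insert, update, delete, then select what remains.
cur.executemany(
    "INSERT INTO respondent (id, name, age) VALUES (?, ?, ?)",
    [(1, "Ann", 31), (2, "Bob", 45), (3, "Cho", 28)],
)
cur.execute("UPDATE respondent SET age = 46 WHERE id = 2")
cur.execute("DELETE FROM respondent WHERE id = 3")

rows = cur.execute(
    "SELECT id, name, age FROM respondent ORDER BY id"
).fetchall()
print(rows)
conn.close()
```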
13
Business Intelligence (BI)
• Makes use of enterprise-wide data to
enable strategic decision making
14
Data Warehousing
• A database
– is designed (and optimized) to record
– Running complex SQL queries takes a lot of time
on such a system
• A data warehouse
– is designed (and optimized) to respond to
analysis questions that are critical for your
business (i.e., read-optimized)
15
E-R Diagram (DB Data Model)
Dimension Model (DW Data Model)
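A dimension model is commonly laid out as a star schema: one fact table of measurements surrounded by dimension tables that describe them. The sketch below assumes a hypothetical sales example; `fact_sales`, `dim_date`, and `dim_product` are illustrative names, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the context of each measurement.
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
-- The fact table holds the measurements, keyed by the dimensions.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
INSERT INTO dim_date VALUES (1, 2006, 1), (2, 2006, 2);
INSERT INTO dim_product VALUES (10, 'Pen', 'Stationery'), (20, 'Lamp', 'Furniture');
INSERT INTO fact_sales VALUES (1, 10, 5.0), (1, 20, 40.0), (2, 10, 7.5), (2, 20, 20.0);
""")

# A typical analysis question: total sales per quarter and product category.
totals = conn.execute("""
    SELECT d.quarter, p.category, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY d.quarter, p.category
    ORDER BY d.quarter, p.category
""").fetchall()
print(totals)
conn.close()
```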
16
Data Warehousing
• Keeping the warehouse separate from application databases
ensures that the business intelligence (BI)
solution is scalable
• Answers questions far more efficiently and
frequently
– Reduces the 'cost-per-analysis'
17
Multi-Tiered Architecture
[Diagram: Data Sources (operational DBs, other sources) → Extract, Transform, Load → Data Storage (data warehouse) → OLAP Engine (OLAP server, serves) → Front-End Tools (analysis, query, reports, data mining)]
18
A Data Warehouse
• is a subject-oriented, integrated, time-variant, non-updatable collection of data
used in support of management
decision-making processes
(W.H. Inmon, 1980)
19
Data Warehouse Implementation
• Dimension modeling
• Extraction
• Transformation
• Data Quality
• Loading
20
Extraction, Transformation, Loading (ETL)
21
Extraction Issues
22
Transformation Issues
• Format Revisions
• Decoding of Fields
• Calculated and Derived Values
• Splitting of Single Fields
• Merging of Information
• Character Set Conversion
• Conversion of Units of Measurements
• Date/Time Conversion
• Summarization
• Key Restructuring
• Deduplication
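A few of the transformation types listed above can be sketched in plain Python. The source records, the `GENDER_CODES` lookup table, and the field names are all hypothetical, made up for illustration.

```python
from datetime import datetime

# Hypothetical code table and source records, for illustration only.
GENDER_CODES = {"M": "Male", "F": "Female"}

source_rows = [
    {"id": "001", "name": "Somchai Dee", "gender": "M", "joined": "08/05/2006", "height_in": "70"},
    {"id": "002", "name": "Mali Suk", "gender": "F", "joined": "15/01/2005", "height_in": "64"},
    {"id": "001", "name": "Somchai Dee", "gender": "M", "joined": "08/05/2006", "height_in": "70"},
]

def transform(row):
    """Apply several field-level transformations to one source record."""
    first, last = row["name"].split(" ", 1)  # splitting of single fields
    return {
        "id": int(row["id"]),                    # format revision: text to integer
        "first_name": first,
        "last_name": last,
        "gender": GENDER_CODES[row["gender"]],   # decoding of fields
        # date/time conversion: day/month/year text to ISO 8601
        "joined": datetime.strptime(row["joined"], "%d/%m/%Y").date().isoformat(),
        # conversion of units of measurement: inches to centimetres
        "height_cm": round(float(row["height_in"]) * 2.54, 1),
    }

def deduplicate(rows):
    """Keep the first record seen for each id."""
    seen, unique = set(), []
    for r in rows:
        if r["id"] not in seen:
            seen.add(r["id"])
            unique.append(r)
    return unique

clean_rows = deduplicate([transform(r) for r in source_rows])
for r in clean_rows:
    print(r)
```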
23
Loading Issues
• Initial Load: populating all the data warehouse
tables for the very first time
• Incremental Load: applying ongoing changes
as necessary in a periodic manner
• Full Refresh: completely erasing the contents of
one or more tables and reloading with fresh data
(initial load is a refresh of all the tables)
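The three load types can be illustrated against a single warehouse table. This is a sketch under assumptions: the `dw_customer` table is hypothetical, and the incremental load uses SQLite's INSERT OR REPLACE as one possible way to apply changes, not a method prescribed by the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical warehouse table for illustration.
conn.execute("CREATE TABLE dw_customer (id INTEGER PRIMARY KEY, city TEXT)")

def initial_load(rows):
    """Populate the table for the very first time."""
    conn.executemany("INSERT INTO dw_customer VALUES (?, ?)", rows)

def incremental_load(changes):
    """Apply ongoing changes: new ids are inserted, existing ids overwritten."""
    conn.executemany("INSERT OR REPLACE INTO dw_customer VALUES (?, ?)", changes)

def full_refresh(rows):
    """Erase the table contents and reload with fresh data."""
    conn.execute("DELETE FROM dw_customer")
    conn.executemany("INSERT INTO dw_customer VALUES (?, ?)", rows)

def snapshot():
    return conn.execute("SELECT id, city FROM dw_customer ORDER BY id").fetchall()

initial_load([(1, "Bangkok"), (2, "Chiang Mai")])
after_initial = snapshot()

incremental_load([(2, "Phuket"), (3, "Khon Kaen")])
after_incremental = snapshot()

full_refresh([(1, "Bangkok")])
after_refresh = snapshot()

print(after_initial, after_incremental, after_refresh, sep="\n")
```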
24
Loading Issues
(Paulraj Ponniah, 2001)
25
Data Quality
• Accuracy
• Domain Integrity
• Consistency
• Redundancy
• Conformance to Business Rules
• Structural Definiteness
• Data Anomaly
• Clarity
• Timeliness
• Usefulness
26
27
OLAP
• Is a category of software technology that
enables analysts, managers and executives
to gain insight into data through fast,
consistent, interactive access in a wide
variety of possible views of information that
has been transformed from raw data to
reflect the real dimensionality of the
enterprise as understood by the user
(The OLAP council)
28
Multidimensional Concept
29
A Multidimensional View
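A multidimensional view can be approximated with two basic operations: rolling measurements up along chosen dimensions and slicing on one dimension value. The sketch below uses made-up sales facts; `roll_up` and `slice_` are hypothetical helper names, not the API of any OLAP tool.

```python
from collections import defaultdict

# Hypothetical fact records: (year, region, product, sales).
facts = [
    (2005, "North", "Pen", 10),
    (2005, "South", "Pen", 20),
    (2005, "North", "Lamp", 5),
    (2006, "North", "Pen", 15),
    (2006, "South", "Lamp", 30),
]
DIMS = ("year", "region", "product")

def roll_up(rows, keep):
    """Sum sales over every dimension not named in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

def slice_(rows, dim, value):
    """Keep only the rows where one dimension has a fixed value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]

by_year = roll_up(facts, ["year"])
by_year_region = roll_up(facts, ["year", "region"])
north_by_product = roll_up(slice_(facts, "region", "North"), ["product"])

print(by_year)
print(by_year_region)
print(north_by_product)
```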
30
OLAP Tool
31
OLAP Tool
32
Thought Process and OLAP
33
Another OLAP Session
34
Computer Security
• Processes and technologies that ensure
confidentiality, integrity, and availability
(CIA) of information-system assets
• Assets
– Hardware, software, firmware, and
information being processed, stored, and
communicated
35
How Are Computers and Networks Attacked?
• Attackers take advantage of vulnerabilities in
operating systems, applications,
protocols, communication channels, and
humans
36
Motivations of Attackers
• Money
• Entertainment
• Entrance to social groups/status
• Cause/malice
Source: Kilger, M., Arkin, O., and Stutzman, J., "Profiling," in The Honeynet Project, Know Your
Enemy: Learning about Security Threats (2nd ed.). Boston: Addison-Wesley, 2004.
37
Internal Security Attacks
• Far greater cost per occurrence and total
potential cost than attacks from outside
• Employees, ex-employees, contractors and
business partners
• Trust and physical access
• Motives
– Challenge/curiosity
– Revenge
– Financial gain
Source: Kristin Gallina Lovejoy (April 2006)
http://www.csoonline.com/read/040106/caveat041206_pf.html
38
Common Internal Attacks
• Sabotage of information or systems
• Theft of information or computing assets
• Introduction of bad code: time bombs or logic bombs
• Viruses
• Installation of unauthorized software or hardware
• Manipulation of protocol design flaws
• Manipulation of operating system design flaws
• Social engineering
Source: Kristin Gallina Lovejoy (April 2006)
http://www.csoonline.com/read/040106/caveat041206_pf.html
39
Attacking Phases
40
IPP Printer Overflow Attack
41
42
IPP Printer Overflow Attack
43
IPP Printer Overflow Attack
44
Malicious Programs
45
Virus Structure
46
Compression Viruses
47
Inherent Technology Weaknesses
• Many of these problems can be traced back
to weaknesses in the technology
• Hackers have exploited many vulnerabilities
found in network protocols
– For example (TCP/IP)
• Inability to verify the identity of communicating parties
• Inability to protect the privacy of data on a network
• Some products also have inherent security
weaknesses (because not all product
developers make security a design priority)
48
Configuration Weaknesses
• Insecure user accounts (such as guest
logins or expired user accounts)
• System accounts with widely known
default, unchanged passwords
• Misconfigured Internet services
• Insecure default settings within products
49
Policy Weaknesses
• A policy is a set of rules by which we operate computer
systems
• Policies generally include
– Physical access controls
– Logical access controls
– Security administration
– Security monitoring and audit
– Software and hardware change management
– Disaster recovery and backup
– Business continuity
• No single solution should be viewed as providing all
the protection you need
50
51
Goals of Computer Security
• Confidentiality
• Integrity
• Availability
• Two additional requirements from electronic
commerce
– Authentication
– Nonrepudiation
52
Planning for Security
• Security is more about process than
technology
• Chief Security Officer (CSO)
• Plan-Protect-Respond (PPR) cycle
53
54
Security Planning
• Risk Analysis
• Establish policies considering
– Risk analysis
– Corporate business goals
– Corporate technology strategy
• Actions
– Selecting technology
– Procedures to make technology effective
55
56
Risk Assessment
57
Operational Model of Computer Security
Protection = Prevention + (Detection + Response)
Prevention:
• Access control
• Firewalls
• Encryption
Detection:
• Audit logs
• Intrusion Detection Systems
• Honeypots
Response:
• Backups
• Incident response teams
• Computer forensics
58
Layered Security
[Diagram labels: Physical Security; Network Security; Host Security; Audit Logs (Detection); Access Controls; Intrusion Detection Systems (Detection); Firewall (Prevention); Access cards, biometric authentication]
59
Common Network Architecture
DMZ
Semitrusted Zone
Untrusted Zone
Trusted Network Zone
IDS
Web Server
DB Server
The Internet
Inner Firewall
Outer Firewall
DNS Server
Application Server
60
Public Key Infrastructure (PKI)
• Data Encryption
• Digital Signature
• Certificate Authority
61
Digital Signature
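The idea behind a digital signature can be sketched with textbook RSA: the signer transforms a hash of the message with a private key, and anyone holding the public key can check it. Everything here is a toy built on assumptions: the key is constructed from two well-known Mersenne primes, is far too small for real use, and omits the padding that real signature schemes require. Production systems rely on vetted cryptographic libraries and certificates issued by a certificate authority.

```python
import hashlib

# Toy key material (assumption for illustration; NOT secure).
P = 2**61 - 1                      # Mersenne prime
Q = 2**89 - 1                      # Mersenne prime
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)

def digest(message: bytes) -> int:
    """Hash the message and reduce it into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sign with the private key: raise the digest to the private exponent."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Verify with the public key: the recovered value must equal the digest."""
    return pow(signature, E, N) == digest(message)

msg = b"Transfer 100 baht to account 42"
sig = sign(msg)
ok = verify(msg, sig)
tampered_ok = verify(b"Transfer 900 baht to account 42", sig)
print(ok, tampered_ok)
```

A signature that verifies gives evidence that the message was not altered (integrity) and that it came from the holder of the private key (authentication and nonrepudiation).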
62
63
Intrusion Detection System Premise
64
Responding
• Planning for response
• Incident detection and determination
– Procedures for reporting suspicious situations
– Determination that an attack really is occurring
– Description of the attack
• Containment and recovery
– Containment: stop the attack
– Repair the damage
• Punishment
– Forensics
– Prosecution
• Fixing the vulnerability that allowed the attack
65
Business Continuity Planning
66
Trends of Security Attacks
• Scott Berinato in CIO magazine
– “today's sloppiness will become tomorrow's chaos”
– In 2010 alone, 100,000 new software vulnerabilities
– Incidents worldwide will swell to about 400,000 a year
– Another half-a-billion users are connected to the Internet.
– A few of them will be bad guys, and they'll be able to pick and
choose which of those 2 million bugs they feel like exploiting.
• Stallings [2005]
– More sophisticated attacks while less knowledge required
• Panko [2004]
– Growing attack frequency
– Growing randomness in victim selection
– Growing malevolence
– Growing attack automation
67
Trends of Security Mechanisms
• Integrated solutions
• Intelligent mechanisms
• Outsourcing security services
68
Managed Security Service Provider (MSSP)
[Diagram labels: Firm; Security Manager; Log File; MSSP; MSSP Logging Server; 2. Encrypted & Compressed Log Data; 3. Analysis; 4. Small Number of Alerts; 5. Vulnerability Test]
69
Thailand’s Security Weaknesses
• Budgeting
• Management support
• Low awareness of potential danger
• Laws and enforcement
• Human competency development
• Limited number of security research projects
• Security curriculum
Source: A Brain Storming Session on ICT Security Planning, Ministry of ICT, May 8, 2006.
70
Thailand’s ICT Security Plan
Scope
– Information security policy
– National PKI management
– Cryptographic technology development
– Advanced system and network security technology development
– Information security technology standardization
– Standards for government agency security
– IT security product evaluation
– Response to hacking and virus attacks
– Security consulting service for critical information infrastructure
– Manpower capacity building
– Online game management
71