System Management Issues
for the Future Real-Time
University Environment
Tom Board
September 22, 2004
Northwestern University Information Technology
About the “Real-Time Enterprise”
• Application availability
• Information integrity
• Transaction transparency
Thesis:
A real-time enterprise is too complex to
manage with our current methods. To
keep users productive, to avoid security
breaches, and to meet overall expectations
we need new approaches and tools.
About System Management
• Goal: User productivity
• Measured by:
– Predictable and reliable transactions
– Confident security of all information assets
– Minimal application downtime
• While enabling:
– Efficient operations
– Effective application of resources
Item: Transaction Transparency
• For a single user transaction, all expected
secondary transactions between systems
take place without intervention
• “Real-time” means these transactions complete within
the time it takes the user to move between the
systems they affect
Transaction Transparency
[Diagram: a Hiring Event in the Human Resources System triggers these secondary transactions]
• Queue to ERP
• Provision access
• Provision ETES
• Notify supervisor
• Encumber salary and benefits
• Notify unit funds mgr
• Provision NetID
• Provision Wildcard
• Schedule training
• Subscribe to email lists
• Provision directory
• Provision calendar
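The hiring-event fan-out above can be pictured as a small publish/subscribe dispatcher. The following is only a minimal sketch in Python: the `EventBus` class, the handler functions, and the sample names are hypothetical illustrations, not part of any system described in these slides.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Hypothetical in-memory bus: one primary event fans out to many handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], str]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], str]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> List[str]:
        # Run every secondary transaction without user intervention.
        return [handler(payload) for handler in self._handlers[event_type]]


# Hypothetical secondary transactions, named after boxes in the diagram.
def provision_netid(hire: dict) -> str:
    return f"NetID provisioned for {hire['name']}"


def notify_supervisor(hire: dict) -> str:
    return f"Supervisor {hire['supervisor']} notified"


def encumber_salary(hire: dict) -> str:
    return f"Salary and benefits encumbered for {hire['name']}"


bus = EventBus()
for handler in (provision_netid, notify_supervisor, encumber_salary):
    bus.subscribe("hiring", handler)

results = bus.publish("hiring", {"name": "A. Example", "supervisor": "B. Example"})
```

In a real deployment each handler would be a call to another system, so a failure in any one of them is what breaks transparency for the user.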
How: Service-Oriented Architecture
• Virtual application integration
• “Structured application architecture”
defines services and eases maintenance
[Diagram: applications A, B, C, D, E, F and X attached to a Web Services Software Bus (SOAP/XML over HTTP/HTTPS), together with a Registry (UDDI) and a Description (WSDL)]
Item: Information Integrity
• Authoritative information is current
• Current information can be accessed in
real-time (what is the fund balance?)
• Consistent data item semantics
• Data capture is reliable and audited
• Business Continuity requirements call for
frequent restore points
– Can we afford to lose a day’s (or an hour’s) transactions?
Threats to Information Integrity & Security
• Lack of security awareness
– Information sensitivity
– Legal requirements
– Opportunity risks
• Poor software configurations
– Open file permissions
– Open preset accounts
– Unpatched software
– Weak or non-existent passwords
– Auto-login settings
• Exploitation threats
– Social vulnerabilities
• Compromised identities
– Unlocked file cabinets
– Post-It™ password reminders
– Shared NetIDs
• Poor Business Continuity practices
– No information backup process
– No off-site backups
– Too infrequent backups
Answers to Information Integrity Threats
• Lack of security awareness – education;
newsletters; required quiz before access
• Poor software configurations – desktop
scanning; controlled intrusion attempts
• Exploitation threats – education; auto
scanning of e-mail; desktop scanning
• Compromised identities – common identity
and reduced sign-on; two-factor methods
• Poor Business Continuity practices –
education; audit reports; table-top drills
Item: Application Availability
• Most important: user-perceived availability
– Up-time
– Response time
• Service provider availability
– Up-time outside of maintenance windows
– Response time
– Simultaneous sessions
• Transaction transparency makes any
service only as reliable as the weakest link
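The "weakest link" point can be made concrete with a little arithmetic. As a sketch, assume every component in a chain must be up for the user's transaction to succeed and that failures are independent; the end-to-end availability is then the product of the component availabilities, which can never exceed the weakest one. The component names and figures below are made up for illustration.

```python
from math import prod
from typing import Iterable


def end_to_end_availability(component_availabilities: Iterable[float]) -> float:
    """Availability of a serial chain, assuming independent failures."""
    return prod(component_availabilities)


# Hypothetical availabilities for services touched by one user transaction.
chain = {
    "hr_system": 0.999,
    "erp": 0.995,
    "directory": 0.999,
    "email_lists": 0.990,
}

overall = end_to_end_availability(chain.values())
weakest = min(chain.values())
```

Here `overall` comes out below `weakest`: each added dependency lowers what the user actually experiences, even when every individual service looks healthy.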
Availability is Measured End-to-End
[Diagram: the application path, in order: User Workstation, Local Network, Backbone Network, Security and Access Control, Data Center Network, Web Servers, App Servers, Database Servers, Storage Network]
• We must measure availability, performance,
response time, etc., end-to-end.
– This quantifies perceived experience
– Requires monitoring the complete application path
• Transaction measurements and trends are more
important than volume metrics
– Instead of how many – what was the wait?
– Instead of worst response time – distribution and
trend of response times
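One way to report "distribution and trend" rather than a single worst case is sketched below, using only the Python standard library. The function names and the choice of median, 95th percentile, and least-squares slope are assumptions made for illustration; the slides do not prescribe specific statistics.

```python
from statistics import median, quantiles
from typing import Dict, Sequence


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Describe the distribution of response times, not just the worst one."""
    cuts = quantiles(samples, n=100, method="inclusive")  # 99 percentile cut points
    return {"p50": median(samples), "p95": cuts[94], "max": max(samples)}


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against their position in the sequence."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical daily median response times in milliseconds.
daily_medians = [410.0, 420.0, 445.0, 470.0, 520.0]
slope_ms_per_day = trend_slope(daily_medians)
```

A positive slope on the daily medians signals a degrading service well before any single response time looks alarming.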
Threats to Application Availability
• Physical
• Malicious code
• Denial-of-Service
• Poor software quality assurance
• Poor capacity planning
If an application is available this hour,
then what must we do to ensure that
it is available next hour?
Threats to Application Availability
[Diagram: threats and safeguards surrounding the Multi-Tier Application Production Platform]
• Physical threats: Fire, Power failure, Storm/Flood, Environmental controls failure
• Malicious threats: Hackers, Identity Thieves, Port Scanning / Denial of Service Attacks, Viruses, Trojan Horses, Keystroke Loggers, Intrusion
• Demand: Seasonal changes in demand
• Safeguards and services: Data Centers, Mirroring, Backup Site, Firewalls, Change Control & Backup, Application Monitoring, Network Monitoring, Network Behavior modeling, Intrusion detection, Authentication service, IP address service, Name service, Patch & AV scanning, Regression Testing & Load Testing
• Other labels: User Workstation, Development
Capacity - Monitoring is Crucial
[Chart: Response Time or Transaction Time plotted against Time, with an SLA goal line. Annotations: "Perceived", "Take corrective action?", "What is the interval?"]
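The questions on this chart (when to take corrective action, and over what interval) can be approached by projecting the observed trend forward to the SLA goal. The sketch below assumes a simple linear trend; the function name and the sample numbers are hypothetical.

```python
import math
from typing import Optional


def intervals_until_breach(
    current: float, slope_per_interval: float, sla_goal: float
) -> Optional[int]:
    """Estimate how many measurement intervals remain before the SLA goal is reached.

    Returns 0 if the goal is already reached, and None if the trend is flat
    or improving (no breach is projected under a linear trend).
    """
    if current >= sla_goal:
        return 0
    if slope_per_interval <= 0:
        return None
    return math.ceil((sla_goal - current) / slope_per_interval)


# Hypothetical figures: 1200 ms now, rising 100 ms per week, SLA goal of 2000 ms.
weeks_left = intervals_until_breach(current=1200, slope_per_interval=100, sla_goal=2000)
```

With these made-up figures the projection leaves eight weeks to act, which is the kind of lead time that makes corrective action planned rather than reactive.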
Dealing with Peak Demands
Static provisioning for peak demand leaves resources idle.
Conservative estimates create excess capacity. Both
contribute to increased costs.
[Chart: Transactions / unit versus Time, showing Actual Demand against the SLA level, with Idle Capacity and Excess Capacity regions marked]
Dynamic Provisioning
[Diagram: User Workstation, End-to-End Measurement, Performance Monitoring Data, Load Balancing, Web Server Tier, Application Server Tier, Database Server Tier, Server "pool", Management System]
Detecting a performance problem in the Web Server
Tier, the Management System configures and logically
deploys a server from the "pool" into the Web Tier.
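The behavior described on this slide can be sketched as a single decision step: if a tier's measured response time exceeds a threshold and the pool has a spare server, move one server into the tier. The class names, the threshold, and the server names below are hypothetical, and real provisioning software would also configure the server and update load balancing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tier:
    name: str
    servers: List[str] = field(default_factory=list)


@dataclass
class ManagementSystem:
    pool: List[str]        # spare servers not yet assigned to any tier
    threshold_ms: float    # response time above which the tier needs help

    def rebalance(self, tier: Tier, response_time_ms: float) -> bool:
        """Deploy one pooled server into the tier if the tier is too slow."""
        if response_time_ms > self.threshold_ms and self.pool:
            tier.servers.append(self.pool.pop())
            return True
        return False


web = Tier("web", ["web-1", "web-2"])
manager = ManagementSystem(pool=["spare-1", "spare-2"], threshold_ms=500)
deployed = manager.rebalance(web, response_time_ms=820)
```

A fuller version would also return servers to the pool once demand falls, which is what keeps idle capacity low.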
Using Dynamic Provisioning
Dynamic provisioning for peak demand reduces idle capacity
and eliminates excess capacity. Result: cost savings.
[Chart: Transactions / unit versus Time, showing Actual Demand against the SLA level and the Allocated pool capacity, with the Idle Capacity region marked]
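The cost argument can be illustrated by comparing idle capacity under the two approaches for the same demand curve. This is a sketch with made-up numbers: it assumes static provisioning holds a fixed peak capacity at all times, while dynamic provisioning allocates whole servers of a fixed size to cover each period's demand.

```python
import math
from typing import Sequence


def idle_capacity_static(demand: Sequence[int], peak_capacity: int) -> int:
    """Total unused capacity when peak capacity is held for every period."""
    return sum(peak_capacity - d for d in demand)


def idle_capacity_dynamic(demand: Sequence[int], server_capacity: int) -> int:
    """Total unused capacity when just enough whole servers are allocated per period."""
    idle = 0
    for d in demand:
        servers = math.ceil(d / server_capacity)
        idle += servers * server_capacity - d
    return idle


# Hypothetical demand in transactions per unit of time, with one peak period.
demand = [120, 300, 950, 400, 150]
static_idle = idle_capacity_static(demand, peak_capacity=1000)
dynamic_idle = idle_capacity_dynamic(demand, server_capacity=100)
```

For this demand curve the static approach leaves 3080 units idle across the five periods, against 180 for the dynamic one.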
Answers to Availability Threats
• Physical – redundancy and diversity
• Malicious code – vulnerability scanning
and intrusion detection
• Denial-of-Service – session behavior
modeling
• Poor software quality assurance – new
development methods and regression
testing
• Poor capacity planning – load testing,
monitoring and dynamic provisioning
Work In Progress
• Continuing requests for load testing and
regression testing software
• ITCS is experimenting with dynamic
provisioning and end-to-end monitoring
software
• Dormitory scanning software is under
study for possible wider deployment
• ADC working on data access policies and
role-based security frameworks
• Identity management system replacement
Summary
• The University will become a real-time
enterprise under a Service-Oriented
Architecture
• Information integrity and real-time access
are vital to support distributed business
processes
• User productivity will be dependent upon
many inter-operating systems – a single
degraded service will affect processes
throughout the University
Summary (cont’d)
• We need increased security awareness and
systems to automatically detect and remediate
threats – the network must defend itself
• This new environment will overwhelm “seat of
the pants” monitoring or uncoordinated
approaches
• End-to-end monitoring, dynamic provisioning,
software authoring tools, and move-to-production
testing tools are necessary for NUIT
to be both proactive and efficient
Questions?