Tripwire Presentation Sample Headline
If Maslow Were Here…
Why Is IT In So Much Pain?
(Or, Why Data Centers Are Like Meat Packing Plants?)
Gene Kim
[email protected]
Chief Technology Officer, Tripwire, Inc.
June 2001
First, Who is Maslow?
Psychologist who proposed the “hierarchy of needs” in the
1940s
Air >> Water >> Food >> Sex >> Happiness
Acknowledgements
Dr. Gene Spafford, CERIAS
Eric Hemmendinger, Aberdeen Group
Dava Sobel, Author, Longitude
Agenda
State of the world: 1700 AD vs. 2000 AD
Lifecycle challenges
Why is IT in so much pain?
Recasting integrity problem statement
IT Has A Lot Of Pain…
Data centers are like meat packing factories 150
years ago
Dangerous
Unhygienic
Driven by capacity demands
To satisfy demand, we accept the unacceptable
Why Is IT in So Much Pain?
Pain seems obvious…
Security threat environment
Mismatch in attacker/defender lifecycles
IT security budgets at all-time high
But, leading indicators are troubling…
Perceived shortfall in skilled IT labor
High security tool abandonment rate
Are we merely treating the security symptom, while the
underlying disease remains untreated?
No Perfect Change Control Board
Mainframe era: Change control was imperfect
Post-mainframe: Problems are magnified
a thousandfold
Mainframes had several advantages:
Experienced and highly skilled technicians
Centralized control
MIS lifecycle
Capacity Expansion Too Easy?
Entire new class of infrastructure created:
Web server farms
File servers
Desktop clients
Appliances
Technology changes make planning difficult
Repeatable builds often lost long, long ago…
It’s like a plane that has lost its ability to land and take off – we can
merely refuel it in mid-air and attach new engines.
Accelerating Lifecycles
Classic IT and software engineering lifecycles:
Time, Resources, Quality
“IS professionals face demands for new
applications with more functionality that must be
delivered to distressingly compressed time
schedules.” Aberdeen.
IT Audit Makes It Worse?
IT Audit can identify controls – but how do we maintain
control?
Does it perpetuate the “break/fix” cycle?
Does it actually improve security posture?
Does rapid capacity expansion introduce additional risks?
Can audit increase IT efficiency?
IT Audit often becomes at odds with IT and business
units
Perpetual non-compliance is a clue…
No Server Undo: We Can Add, but
Not Subtract…
Software installation is an irreversible function
“Give me root on your favorite Linux machine – I want to make it run
better…”
“Let me borrow your Windows 2000 laptop for a day. I want to install
some software.”
Would you be able to undo the damage?
How is this any different from recovering from an attack?
This is science?
We can add, but we can’t subtract – best we can do is
start over, and add quickly…
The Story of Longitude
Snapshot of 1700 Shipping:
Approx. 300 ships a year were
crossing the Atlantic Ocean
One Spanish galleon carried a
significant percentage of
English GNP
Navigational tools made each transit risky
Safe passage more luck than skill. Why?
The Story of Longitude
Snapshot of 1700 Navigational Technology:
Shipping lanes were parallel to equator. Why?
Numerous techniques were used for determining latitude
But, no accurate way to determine longitude!
Whenever a ship lost sight of land, it was lost
The Story of Longitude
Maritime navigation was unpredictable:
An unusual return voyage from India
British warships sunk 6 miles from English coastline
Financial cost became too high:
English Parliament passed Longitude Act of 1714
Offered $1 million prize for any practical solution
Nearly 60 years later, an inventor finally claimed the prize
Are We Running Aground?
Too many instances of expensive recovery efforts:
Microsoft is a good example
Damage assessment and recovery is very costly
Once integrity is suspect, you’re lost in a sea of change
Which ones are good and bad?
When things go really wrong, server OS must be
reinstalled—a very expensive proposition, when…
No repeatable builds
No automated provisioning system
No clue to avoid doing software archaeology
An Analogy: 1700 vs 2000
Consequences:
1700: When ships inaccurately estimated their longitude,
they risked plunder or total loss
2000: When servers residing in data centers drift off course,
they must often be torched and rebuilt from scratch
The Perplexing Problem
Theory and disciplines for proper server hygiene
are understood
Change management and Tripwire are already parts of
best practices
Servers are inherently dangerous to operate
Servers are rebuilt more often than we want…
Agenda
State of the world: 1700 AD vs. 2000 AD
Lifecycle challenges
Why is IT in so much pain?
Recasting integrity problem statement
Accelerating Lifecycles
Source: Aberdeen
Moving Faster
Infrastructure is often fragile
A spelling mistake is silently corrected, and suddenly the
enterprise mail service goes down.
A maintenance patch is applied to an operating system,
and a production server goes down, affecting millions.
Predicting runaway success is difficult
Lack of safety net is sometimes obvious
Growth rates
Look at Napster:
Millions of active users.
User base growing at 2-5% per day! (Fortune)
Innovator’s Dilemma and adoption time frames
Radio, TV, Color TV, Cable TV, Cell phones, Internet.
10 million users in less than 1 year?!
Large potential security implications
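To put the Napster figure above in perspective: a constant daily growth rate r implies a doubling time of ln 2 / ln(1 + r). The sketch below is a rough illustration only. It assumes growth compounds daily at a steady rate, which real adoption curves rarely do, and the function name is made up for this example.

```python
import math

def doubling_time_days(daily_growth_rate: float) -> float:
    """Days needed for a user base to double at a constant daily growth rate.

    Assumes steady compounding growth, which is a simplification.
    """
    if daily_growth_rate <= 0:
        raise ValueError("daily_growth_rate must be positive")
    return math.log(2) / math.log(1 + daily_growth_rate)

# The 2% and 5% bounds quoted on the slide.
for rate in (0.02, 0.05):
    days = doubling_time_days(rate)
    print(f"{rate:.0%} per day -> doubles in about {days:.0f} days")
```

Under these assumptions the user base doubles roughly every two to five weeks, which leaves very little time for capacity planning or security review.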
Agenda
State of the world: 1700 AD vs. 2000 AD
Lifecycle challenges
Why is IT in so much pain?
Recasting integrity problem statement
Emergence of Client/Servers
[Chart: Hardware Costs. Capacity/$ over Time, Client/Server vs. Mainframes]
Amazing performance gains and price competition have
made PCs and client/servers irresistible
No Free Lunch?
[Chart: Capital Cost & Cost of Operation (normalized), Mainframe vs.
Client/Server. Bar segments: Capital, Service Contracts, IT Staff, and “???”]
Mainframes require lots of care and feeding
What are the uncounted costs of necessary maintenance for
client/servers?
Compensating Controls
[Chart: Outage Costs vs. Effort & Compensating Controls. Axes: Consequence ($)
and Effort (Compensating Controls). Curves: Desired, Actual. Examples plotted:
Home Firewall, Enterprise Server]
Gap can be closed with “process and discipline”,
but it is still too easy to make big mistakes!
Need a Technology Solution
Are File Systems Inherently
Dangerous?
Why do all file systems still have general
read/write semantics?
Configuration files should have inherent version
control
Audit
Explicit commit
Rollback
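One way to picture the three properties above (audit, explicit commit, rollback) is a configuration object whose writes are only staged until someone commits them. This is a minimal in-memory sketch, not a real file system feature or a Tripwire API. The class and method names are hypothetical.

```python
import time

class VersionedConfig:
    """In-memory stand-in for a config file with built-in version control."""

    def __init__(self, baseline: str = ""):
        self._versions = [baseline]   # committed history; index 0 is the baseline
        self._staged = None           # written but not yet live
        self.audit_log = []           # (timestamp, who, action) tuples

    def _audit(self, who: str, action: str) -> None:
        self.audit_log.append((time.time(), who, action))

    def read(self) -> str:
        """Return the live (last committed) content."""
        return self._versions[-1]

    def write(self, who: str, content: str) -> None:
        """Stage a change; it does not take effect until commit()."""
        self._staged = content
        self._audit(who, "staged a change")

    def commit(self, who: str) -> None:
        """Explicitly make the staged change live."""
        if self._staged is None:
            raise RuntimeError("nothing staged to commit")
        self._versions.append(self._staged)
        self._staged = None
        self._audit(who, f"committed version {len(self._versions) - 1}")

    def rollback(self, who: str) -> None:
        """Discard the latest committed version and return to the previous one."""
        if len(self._versions) == 1:
            raise RuntimeError("already at the baseline version")
        self._versions.pop()
        self._audit(who, f"rolled back to version {len(self._versions) - 1}")

cfg = VersionedConfig()
cfg.write("alice", "port=25")
cfg.commit("alice")
cfg.write("bob", "port=2525")
cfg.commit("bob")
cfg.rollback("carol")
print(cfg.read())  # port=25
```

With plain read/write semantics, Bob's edit would have overwritten Alice's with no record and no way back. Here every step is logged and the bad change can be subtracted again.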
Integrity Drift
[Diagram: Desired IT Infrastructure Deployment Process: Planning, Design,
Implementation, Operation, with repeatable builds]
[Diagram: Actual IT Infrastructure Deployment Process: Planning, Design,
Implementation, Testing, Repeatable build, Operation, then repeated
Maintenance, Capacity upgrade, and Tuning]
Integrity drift! Does not match!
Integrity Lifecycle Goals
When dealing with executives:
Use primary colors
Use small numbers
Characterize an IT organization by its ability to
deliver services with known loss characteristics.
Safety, Services, Capacity Expansion, Efficient BPA,
Accurate Forecasting?
The IT Safety Index
Where does an organization fit?
(Kim/Spafford)
5 Can you scale economically?
4 Can you get early warning of threats?
3 Can you detect changes?
2 Do you have repeatable builds?
1 Can you inventory critical biz processes?
0 Can you name a critical biz process?
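The index above can be read as a ladder. The sketch below scores an organization under one possible interpretation, namely that levels are cumulative, so a level counts only if every lower question is also answered "yes". That rule is an assumption made for illustration and is not stated on the slide. The function name is hypothetical.

```python
# Questions from the slide, ordered from level 0 up to level 5.
QUESTIONS = [
    "Can you name a critical biz process?",       # level 0
    "Can you inventory critical biz processes?",  # level 1
    "Do you have repeatable builds?",             # level 2
    "Can you detect changes?",                    # level 3
    "Can you get early warning of threats?",      # level 4
    "Can you scale economically?",                # level 5
]

def safety_level(answers):
    """Return the highest level reached, or None if even level 0 fails.

    answers[i] is True when the organization answers "yes" to QUESTIONS[i].
    Assumption: levels are cumulative, so counting stops at the first "no".
    """
    if len(answers) != len(QUESTIONS):
        raise ValueError(f"expected {len(QUESTIONS)} answers")
    level = None
    for i, yes in enumerate(answers):
        if not yes:
            break
        level = i
    return level

# Can detect changes, but has no repeatable builds: still level 1.
print(safety_level([True, True, False, True, False, False]))  # 1
```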
Level 1: An Example (Before)
Case study: An “Akamai-like” content caching service
500 servers distributed around the world
Company doubled in size every six months
Symptoms:
Customers knew about problems before IT did
Servers were being hacked faster than they were being deployed
Operational costs for a customer exceeded revenue
Infrastructure was not viable and neither was the business
Level 1: Consequences
What this meant to the CIO
Surprises were frequent, often from things no one heard of
Service quality was poor and inconsistent
Always in firefighting mode – busy, but not productive!
And riddled with “security” problems…
Does this remind you of anyone you know?
It’s not just dot-coms…
“Pilot programs” that go into production,
and stay there…
Level 3: CIO Saves The Day
(After)
Solution strategy
Overhauled the software deployment process
“Locked down” all infrastructure with Tripwire
All unauthorized changes triggered server replacement – like a fuse!
Benefits
Very low and predictable remediation times
Far more productive IT staff
Decreased number of platforms
Lower capacity expansion costs
Fixed “hacker” problems
Level 3: Restoring Defender Advantage
Attackers have more options than defenders
Defender can deploy only once per month
Attackers are numerous, and have many moves
Inherent mismatch!
Why don’t defenders lose all the time?
Level 3: Restoring Defender Advantage
Defenders can still win, but only when you know where
your pieces are, and you keep watching the board!
“Yes, we get hacked like everyone else, but unlike the people
that get splashed on the front of the NY Times, we quickly fix
the problems… and then we’re up at 3am watching the
Hong Kong news feeds to see if anyone noticed.”
Agenda
State of the world: 1700 AD vs. 2000 AD
Lifecycle challenges
Why is IT in so much pain?
Recasting integrity problem statement
Server Lifecycle
[Diagram: server lifecycle steps; relative effort shown as “+” marks]
Acquire Hardware +
Install OS ++
Install Web Server +++
Test +++++
Configure Tripwire +++
Test +++++
INCIDENT
Deploy Server +++
Configure Tripwire +
Tripwire Audit (continuous) +
Deploy Server +
Assess Damage ++++++++++
Install Patch +++++
Test +++++
Repair Damage +++++
Re-Deploy Server +++
The obvious conclusion: The status quo is difficult
to manage and too much work!
Transfer Pain/Responsibility
IT currently must glean file behavior
How are they supposed to know how files
behave?
Operating systems
Infrastructure software
Gazillions of ISV tools…
So, Who Does Know?
Application developers wrote the software, so…
“config.h” contains a wealth of information
Installers hold another portion
Retain binding information
Configuration files
Log files
Application binaries
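The binding information above could, for instance, travel with the application as a small manifest that states how each file is expected to behave. The sketch below is purely illustrative. The manifest format, the paths, and the behavior names are hypothetical, and this is not Tripwire policy syntax.

```python
from enum import Enum

class Behavior(Enum):
    STATIC = "static"            # application binaries: never change
    CONTROLLED = "controlled"    # configuration files: change only when approved
    APPEND_ONLY = "append_only"  # log files: may grow, never shrink or be rewritten

# Hypothetical manifest a supplier might ship with its installer.
MANIFEST = {
    "/opt/exampleapp/bin/exampled": Behavior.STATIC,
    "/etc/exampleapp/exampled.conf": Behavior.CONTROLLED,
    "/var/log/exampleapp/exampled.log": Behavior.APPEND_ONLY,
}

def is_violation(behavior, old, new, change_approved=False):
    """Decide whether an observed change breaks the declared behavior.

    old and new are (size_in_bytes, content_hash) tuples taken at two
    points in time. The append-only check is a heuristic: it accepts any
    change that grows the file and cannot prove the old bytes are intact.
    """
    if old == new:
        return False                      # nothing changed
    if behavior is Behavior.STATIC:
        return True                       # any change is a violation
    if behavior is Behavior.CONTROLLED:
        return not change_approved        # only approved changes are allowed
    return new[0] <= old[0]               # APPEND_ONLY: must have grown

log = MANIFEST["/var/log/exampleapp/exampled.log"]
print(is_violation(log, (100, "aa"), (160, "bb")))  # False
print(is_violation(log, (100, "aa"), (40, "cc")))   # True
```

The point of the design is that IT no longer has to guess: the people who built the software declare the expected behavior once, and every site that deploys it inherits that knowledge.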
Resulting strategy
Move upstream to IT suppliers
Get closer to where OSes and applications are built
IT gets infrastructure that is “Tripwire ready”
Lower deployment effort
Tripwire integrity anchors everywhere
File behaviors
File baselines
What is an Integrity Anchor?
A documented baseline (out-of-the-box)
“desired good state” of an infrastructure
item:
• Configuration files
• System Registry
• Other system files
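As a rough illustration of the idea, a baseline can be recorded as a map from file path to content hash and later compared against the live system. This is a minimal sketch that assumes plain files under a single directory. It is not how Tripwire itself is implemented, it does not cover the system registry, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path

def snapshot(root):
    """Map each file under root (by relative path) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def drift(baseline, current):
    """Compare two snapshots and report paths added, removed, and changed."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "changed": sorted(p for p in shared if baseline[p] != current[p]),
    }
```

Taking `snapshot()` right after a clean install gives the "desired good state". Running it again later and passing both results to `drift()` shows exactly which items have moved away from that state.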
Quick story
Tripwire has been integrated into financial audit
models
In a major stock exchange:
Tripwire runs every six minutes
Tripwire runs at end of trading day
Tripwire runs at end of each shift
Bounds integrity drift to a maximum of 8 hours
Don’t forget
Show the poster…
Closing Thoughts
Smart people have been using Tripwire for almost ten
years – I want to understand why…
Many IT problems get blamed on security – in
reality, they hinge on stability and safety
Security is often “moving deck chairs on the Titanic” –
maybe it’s time to get a new boat?
Fix/break cycles happen even in top-tier IT shops
Keep moving responsibility and accountability upstream
Come Get Free Stuff!
If you want a copy of any of these…
1. Book: Longitude
2. Poster: Servers Under Siege: A Day in the Life of an IT
Defender
3. This PowerPoint presentation
…leave a business card or email me!
[email protected]