1-14 Business Continuity-Disaster Recovery AT&T Paul Brennan
Business Continuity/Disaster Recovery
California State Universities – November 2, 2006
Paul Brennan
Business Continuity Consultant
Copyright © 2006 AT&T. All rights Reserved.
Agenda
• Drivers and Trends
• Business Continuity Methodology
• Pandemic Problem & Preparedness
• Business Continuity Best Practices
• AT&T Business Continuity Portfolio
Business Drivers for Business Continuity and Compliance Storage Solutions
• Customer/Market Trends
– On-Line Business
– 24 x 7, Always on
– Globalization
• Stakeholder Trends
– Volatile Markets
– Officer Liability
– GLB / HIPAA
– OHS / SEC / Comptroller
• Financial Trends
– Cost Reduction
– TCO/ROI Focus
– CapEx Reduction
• Risk Trends
– Broader Threats
– Increased Vulnerability
– Greater Risk and Exposure
• Organizational Trends
– Productivity Focus
– Scarce Qualified Resources
– Internet Time
– Consolidation
• Technology Trends
– Improved Network Capability
– Exponential Storage Growth
– Emerging Protocols
– Application Management
All drive the demand for a highly available infrastructure.
Legislative Drivers for Business Continuity Solutions
Ruling: SEC 17a-4 | 17a-3
Who Is Impacted: Broker/Dealers
Regulatory Challenges: Email and instant messaging must be stored for 3 yrs, in 2 separate & distinct places, & must be easily accessible

Ruling: NASD 3010
Who Is Impacted: Broker/Dealers
Regulatory Challenges: Requires ‘supervision’, including implementation of a formalized review process of incoming/outgoing emails & instant messaging

Ruling: Federal Reserve/SEC/OCC
Who Is Impacted: Financial Institution
Regulatory Challenges: Specific Business Resumption Recommendations
– Resume business within 2 hrs.
– Recover financial transactions within next business day
– Ensure Back-Up Facilities are “Out Of Region” from Primary Site
– Perform cross organization tests to assure compatibility

Ruling: Sarbanes-Oxley
Who Is Impacted: CEO, CFO, Public Firms
Regulatory Challenges: A broad auditing, financial disclosure & corporate governance law. Imposes substantive rules as to the conduct of a publicly-held company

Ruling: HIPAA
Who Is Impacted: Healthcare, Insurance
Regulatory Challenges: Standardize the use & transfer of oral, printed, & electronic records. Privacy & Security driven (protects the disclosure of all patient health information)

Ruling: Financial Services Modernization Act (Gramm-Leach-Bliley)
Who Is Impacted: Financial
Regulatory Challenges: Requires banks to develop privacy notices & give their customers the option to prohibit banks from sharing customer information

Ruling: FDA - Title 21 CFR Part 11
Who Is Impacted: Pharmaceutical
Regulatory Challenges: Outlines criteria for acceptance by the FDA of electronic records, electronic signatures, and handwritten signatures executed to electronic records as equivalent to paper records and handwritten signatures executed on paper

Ruling: Basel II
Who Is Impacted: Financial
Regulatory Challenges: Relates to where and how a financial institution’s information and data is provided and controlled

Ruling: California SB 1386
Who Is Impacted: Any Entity Dealing w/ CA Residents
Regulatory Challenges: Businesses must inform residents if their name, SS#, Driver’s license, Credit Card, or Bank account were compromised
Business Continuity Professional Services
Methodology
Practices
• Managed Risk Services
– Business Impact Analysis
– Risk Assessment
• Business Continuity Strategy and Planning
– Mitigation Strategy Development
– BC Strategy Development
– BC Plan Development
– BC Plan Testing
– BC Plan Certification
– Emergency Response Planning
– Emergency Response Testing
• Business Continuity Program Management
– BC Standards Development
– BC Program Metrics
– BC Program Review
Have you performed a BIA and Risk Assessment?
Do you have a Plan and do you exercise it regularly?
If not, why not?
Why is a Pandemic Scenario Difficult to Plan For?
When will a pandemic emerge?
Where will it start?
How virulent will it be?
How fast will it spread?
How effective will control mechanisms be?
People impact vs. infrastructure.
Implications of Pandemic Flu
• Loss of a large and random element of the work force
for a sustained period of time
• Impact of “Worried Well” Syndrome on human behavior
• Limited warning of the actual pandemic
• Global nature of a pandemic makes most Disaster
Recovery Plans irrelevant
Range of Potential Pandemic Impacts
Potential Business Interruption: + | +++ | +++++
Personnel Impacts: <15% Absenteeism | 25% Absenteeism | >40% Absenteeism
Geographic Impacts: Isolated Outbreaks | Bi-Regional Outbreaks | Simultaneous Global Outbreaks
Mobility Impacts: Airline Travel Restrictions | Local Transportation Restrictions | Martial Law Declared
Infrastructure Impacts: Food/Fuel Rationing | Retail Chain Disruption | Food & Fuel Scarcity | Power Failures
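The absenteeism levels above can be turned into a quick staffing check. The following is a minimal sketch, assuming each critical function has a known headcount and a minimum number of staff needed to keep it running; the function names, example data, and the use of 15% and 40% as representative values for the "<15%" and ">40%" bands are illustrative assumptions, not figures from AT&T.

```python
from typing import Dict, List, Tuple

# Absenteeism levels (percent) from the slide; 15 and 40 stand in for the
# "<15%" and ">40%" bands. These boundary values are an assumption.
SCENARIOS = {"low": 15, "moderate": 25, "severe": 40}


def staff_available(headcount: int, absenteeism_pct: int) -> int:
    """Expected number of people still working, rounded down."""
    return headcount * (100 - absenteeism_pct) // 100


def functions_at_risk(functions: Dict[str, Tuple[int, int]], absenteeism_pct: int) -> List[str]:
    """Return the functions whose available staff falls below their minimum.

    `functions` maps a function name to (headcount, minimum_staff_needed).
    """
    return sorted(
        name
        for name, (headcount, minimum) in functions.items()
        if staff_available(headcount, absenteeism_pct) < minimum
    )


# Hypothetical example data, not taken from the presentation.
EXAMPLE = {
    "payroll": (10, 7),
    "help_desk": (20, 12),
    "network_ops": (8, 7),
}

for label, pct in SCENARIOS.items():
    print(label, functions_at_risk(EXAMPLE, pct))
```

A function that shows up even in the low scenario is a candidate for cross-training, since it has almost no staffing slack to begin with.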
AT&T Pandemic Preparedness
• Team commissioned October 2006 to address Pandemic
Preparedness
• Global operations closely monitoring Avian flu developments
• All mission-critical work functions within AT&T have
documented Disaster Recovery Plans.
• AT&T’s unique Network Disaster Recovery capability is
specially trained for rapid service recovery during a range of
disaster scenarios.
• Existing plans are being reviewed for a Pandemic Scenario
and supplemented accordingly.
• Initial emphasis on Asia/Pacific region
Customer Considerations in Network Preparedness for Increased Telecommuting
• Baseline current network access configurations for telecommuting and current levels of telecommuting
– Data Network/VPN Access
– Voice Access
– Conferencing, Voice, Video, Multimedia
• Estimate expected maximum increase in telecommuting (employees enabled for some form of end user telecommuting vs. actual daily average usage).
– Plan desired increased telecommuting to be supported.
– Review access facilities between AT&T and major customer data centers/communication hubs.
– Review VPN gateway capacity.
• Consider special access arrangements (special VPN, voice bridges etc.) for critical corporate control functions that may need to be run remotely
• End user preparedness for telecommuting
– Access Method (recommend having both primary and back-up)
  – Dedicated access, DSL, etc. - Preferred primary
  – Dial-up access
  – Wireless data and voice access
• Review CDC College and Universities Pandemic Influenza Planning Checklist
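The baseline-versus-maximum comparison described above comes down to simple arithmetic. Here is a rough sketch, assuming VPN gateway capacity is measured in concurrent sessions and that peak usage can be expressed as a percentage of enabled employees; all names and figures are hypothetical and should be replaced with measured values.

```python
def peak_sessions(enabled_employees: int, peak_share_pct: int) -> int:
    """Concurrent remote sessions at the busiest moment, rounded up."""
    return (enabled_employees * peak_share_pct + 99) // 100


def gateway_headroom(gateway_capacity: int, enabled_employees: int, peak_share_pct: int) -> int:
    """Spare sessions left on the VPN gateway; a negative value is a shortfall."""
    return gateway_capacity - peak_sessions(enabled_employees, peak_share_pct)


# Hypothetical figures for illustration only.
ENABLED = 5000       # employees enabled for some form of telecommuting
CAPACITY = 2000      # concurrent sessions the VPN gateway can support
BASELINE_PCT = 10    # share of enabled employees online on a normal day
PANDEMIC_PCT = 60    # planned maximum share during a pandemic

print("baseline headroom:", gateway_headroom(CAPACITY, ENABLED, BASELINE_PCT))
print("pandemic headroom:", gateway_headroom(CAPACITY, ENABLED, PANDEMIC_PCT))
```

With these example numbers the gateway has ample room on a normal day but falls short under the pandemic plan, which is the kind of gap the VPN gateway capacity review is meant to surface.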
Best Practices – Leadership & Program Office
• Chancellor / President support
• Due Diligence
• Risk Management / Mitigation / Acceptance
• BC/DR Program Office
– BC/DR Standards
– BC/DR Metrics / Communication Plan
• BC/DR by design; funding as part of business case
Best Practices - Implementation
• Separate the truly critical from the merely important
• It’s the university’s mission; not just IT
• Ongoing process, not a project
• 5 P’s – People, Procedures, Practice, Practice, Practice!
• Apply lessons learned from events and exercises
• Establish incident command process
• Integrate into daily functions & operations
• Planning Considerations:
– Assign tasks to positions, not people
– Wide scale disruption vs. a single building
– Longer outage duration
– Plan for “worst case”; respond to the event
“In preparing for battle I
have always found that
plans are useless, but
planning is indispensable.”
-Dwight D. Eisenhower
AT&T Business Continuity Services
• Professional Services
– Business Continuity Professional Services
• Recovery Services
– Enterprise Recovery Services
• Network Services
– Ultravailable® Network Services
– Network Disaster Recovery Services
– StorageConnect℠ Services
• Data Protection Services
– Tape and Disk Backup in IDCs
– Storage Services in IDCs
– AT&T Remote Vault℠
Data Protection Services - Continuum
• AT&T Ultravailable® Storage (Continuity: Mission Critical, Business Continuity Options)
– NAS, SAN with Remote Replication – Data Mirroring
– Ready Access
– Increased Redundancy w/ Recovery Options
– 99.99% SLA Guarantee
• AT&T Storage Plus (Backup & Redundancy: Mid-Range Storage Options)
– SAN
– Ready Access, Redundant
– Point In Time Copies, Replication
– 99.9% SLA Guarantee
• AT&T Remote Vault℠
– PC/Server Remote Backup and Restore
– Protect specified work stations
– 99.9% SLA
• AT&T IDC Disk Backup & Restore (DB&R)
– Backup & Restore
– 9 to 20 times the speed of current tape technology
– 99.9% SLA Guarantee
• AT&T StorageConnect℠
– Storage Connectivity
– Used for replicating, archiving, mirroring
– Up to 99.999% SLA
• AT&T IDC Tape Backup & Restore (TB&R)
– Tape Based Backup & Restore
– Off Site Storage
– Archiving Tool or Supplement to UV & Storage Plus
– 99.9% SLA Guarantee
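To compare the SLA levels quoted above, it can help to convert each availability percentage into the downtime it permits. This is a small sketch of that arithmetic, assuming the SLA is measured over a 365-day year; the actual measurement window and exclusions in any AT&T contract are not stated in this presentation.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(sla_percent: float) -> float:
    """Minutes of downtime per year that an availability SLA still permits."""
    return MINUTES_PER_YEAR * (1.0 - sla_percent / 100.0)


for sla in (99.9, 99.99, 99.999):
    print(f"{sla}% -> {allowed_downtime_minutes(sla):.1f} minutes/year")
```

Each additional "nine" cuts the permitted downtime by a factor of ten: roughly 526 minutes a year at 99.9%, 53 minutes at 99.99%, and 5 minutes at 99.999%.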
Enterprise Recovery Services
The chosen Corporate Disaster Recovery Strategy must support the RTO
& RPO requirements documented in the Business Impact Analysis (BIA)
Hot Site - An alternate facility that already has in place the computer,
telecommunications, and environmental infrastructure required to recover critical
business functions or information systems.
Warm Site – An alternate processing site equipped with some hardware, communications
interfaces, and electrical and environmental conditioning, which is capable of providing
backup only after additional provisioning, software, or customization is performed.
Cold Site – An alternate facility that already has in place the environmental
infrastructure required to recover critical business functions or information systems,
but does not have any pre-installed computer hardware, telecommunications
equipment, communication lines, etc. These must be provisioned at time of disaster.
Mobile Recovery - A mobilized resource purchased or contracted for the purpose of
business recovery. The mobile recovery center might include computers, workstations,
telephones, electrical power, etc.
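The rule that the recovery strategy must support the RTO documented in the BIA can be sketched as a simple lookup. The example below assumes each site type has a typical activation time and picks the slowest option that still meets the RTO; the hour values are illustrative placeholders, not AT&T figures, and Mobile Recovery is left out because its activation time depends on the contract. RPO would be checked separately against backup or replication frequency.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SiteOption:
    name: str
    typical_activation_hours: float  # illustrative assumption, not from the slides


# Ordered from slowest to fastest to bring online (hypothetical values).
OPTIONS: List[SiteOption] = [
    SiteOption("Cold Site", 168.0),
    SiteOption("Warm Site", 48.0),
    SiteOption("Hot Site", 4.0),
]


def choose_site(rto_hours: float) -> Optional[str]:
    """Return the slowest site type that can still be active within the RTO.

    Returns None when no listed option is fast enough, which signals that a
    different approach (for example, live replication) would be needed.
    """
    for option in OPTIONS:
        if option.typical_activation_hours <= rto_hours:
            return option.name
    return None


for rto in (200, 72, 8, 1):
    print(f"RTO {rto}h -> {choose_site(rto)}")
```

Picking the slowest option that meets the RTO reflects the usual assumption that faster sites cost more to maintain; a real selection would also weigh cost, RPO, and test results.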
30 AT&T Internet Data Centers Worldwide
United States - 16
Atlanta Area, US
Boston Area, US
Chicago Area, US
Dallas Area, US (2)
Los Angeles Area, US (2)
New York Area, US
New York Metro Area, US
Orlando Area, US
Phoenix Area, US
San Diego Area, US
San Francisco Area, US
San Jose Area, US
Seattle Area, US
Washington DC Area, US
Europe - 6
Amsterdam, NL
Birmingham, UK
Frankfurt, DEU
London, UK
Nice, FR
Paris, FR
Asia Pacific - 8
Hong Kong, CH
Osaka, JP
Shanghai, CH
Singapore, SG
Sydney, AU
Tokyo, JP (3)
• Scope: Full Portfolio of Services
• UK Management Center Opened 2000
• Japan and Singapore Management Centers
• Alpharetta (GA) integrated Management Center
• Amsterdam and Nice Opened 2003
• Frankfurt, Paris and London Opened 2004
• Japan III Opened 2004
• Singapore Opened 2005
• Shanghai Opened 2006
AT&T Network Survivability Protocol
Our survivability protocol is built from four layers:
• Disaster Recovery - a best-of-class fleet of recovery trailers and a well-practiced response team
• Transport - the physical elements of our network and its construction
• Switching - the intelligence of the network and its applications
• Process - how our Network Operations field staff complete their regular assignments
On an average business day, the AT&T Global Network carries 6.7 petabytes of
data* and an average of 329 million voice calls*. In 2005, our network had a
reliability performance rating of between 99.992 percent and 99.999 percent.
*As of July 2006
AT&T Disaster Response Process
• AT&T’s global network continually monitored in our
Global Network Operations Center (GNOC).
• When anomaly occurs, response managed by
GNOC staff through a practiced and proven incident
command process called 3CP:
“Command, Control, and Communications”
• Incident command team led by Network Duty
Officer in GNOC
• GNOC coordinates network incident response
across AT&T organizations, assessing impact of
event in near-real time and prioritizing restoration
efforts.
• In response to a catastrophic event, GNOC activates
AT&T’s Network Disaster Recovery Team and
monitors its response.
AT&T Network Disaster Recovery Strategy
When activated, NDR pulls a current profile of the lost or failed office and deploys
recovery equipment trailers that mirror the type of technology that was housed in the
impacted network office.
The recovery compound is spliced into the AT&T network and assumes the identity and
functions of the lost office.
AT&T NDR — Emergency Communications
NDR establishes broadband voice and data connectivity from disaster sites
using one or more Emergency Communications Vehicles (ECVs). The ECV is a
four-wheel drive van equipped with generators, a 1.2 meter satellite antenna, a
Ku-band satellite modem, and voice/data compression equipment. Once
deployed, the ECV provides a recovery site with a mix of 96 voice/data
channels and IP connectivity to the AT&T network.
In addition to supporting network events, including NDR’s WTC response, our
ECVs have also been deployed for humanitarian relief missions.
AT&T NDR — Exercises
The NDR team has conducted field exercises three or four times a year since its
formation in 1992. The exercises test as many of the NDR processes as
possible, from the initial call-out, to equipment transportation and setup, to
technology turn-up and testing. At these exercises, team members are given
hands-on training on new technologies and the recovery equipment is operated
in field conditions. The drills are held throughout the United States in a wide
variety of weather and settings and using a variety of recovery scenarios.
NDR’s 2006 exercise schedule:
Dallas, TX
Landover, MD
Miami, FL (happening now)
NDR’s 2007 dates and locations being scheduled
AT&T NDR — Recent Deployments
Date | Event | Location | Equipment
August & September 2005 | Hurricane Katrina | Louisiana and Mississippi | Four ECVs, one fly-away satellite comm unit, Command Center, and Cable/Café Trailer (humanitarian relief and network support)
October 2003 | California Wildfires | San Diego, CA | Emergency Communications Vehicle (ECV) (humanitarian relief)
October 2003 | California Wildfires | Poway, CA | Technology Trailers (held in reserve)
September 2003 | Flooding | Poquoson, VA | ECV (humanitarian relief)
June 2003 | AAFES/ACS Support | Kuwait/Iraq | Deployable Satellite Calling Centers
September & October 2001 | WTC Disaster | New York City, NY; Jersey City, NJ | Technology Trailers and ECVs (office recovery/humanitarian relief)
June 2001 | Flooding | Houston, TX | ECV (humanitarian relief)
AT&T-The Disaster Recovery Solution
References
CDC Colleges and Universities Pandemic Planning Checklist
http://www.pandemicflu.gov/plan/pdf/colleges_universities.pdf
Design and Performance Issues Relating to Higher Education Facilities (earthquakes)
http://www.fema.gov/pdf/plan/prevent/rms/389/fema389_ch11.pdf
Building a Disaster Resistant University
http://www.fema.gov/pdf/institution/dru_report.pdf
Building Partnerships to Reduce Hazard Risks, Tips for Community Officials, Colleges and Universities
http://www.fema.gov/pdf/library/femacollege_bro_final.pdf
Boston Consortium for Higher Education (Lessons Learned from Universities responding to 9/11)
http://web.mit.edu/community/resources/learning_history.pdf
The Chronicle of Higher Education – Hurricane Experiences
http://chronicle.com/indepth/katrina/
Possible Funding Sources
FEMA-Pre-Disaster Mitigation Grant Program
www.fema.gov/government/grant/pdm/index.shtm
Distance Learning and Telemedicine Program
http://www.usda.gov/rus/telecom/dlt/dlt.htm
Chronicle of Philanthropy www.philanthropy.com
The Grantsmanship Center www.tgci.com
Catalog of Federal Domestic Assistance www.cfda.gov
U.S. Department of Education www.ed.gov
Planned Giving Today www.pgtoday.com
The Foundation Center www.foundationcenter.org
Backup / Support Slides
AT&T NDR — Hurricane Katrina Response
The NDR team began its activation on August 28, 2005, before Hurricane Katrina made landfall.
Operations Team members traveled to staging locations and a small set of equipment was
predeployed to reduce the team’s travel time to the impacted area.
On Tuesday, August 30, 2005, after the storm passed out of the Gulf Coast of Louisiana and
Mississippi, the first of four ECVs went into service providing emergency communications for the
Louisiana State Police in Mandeville, LA, near where the northwest eye wall made landfall.
On August 30 and August 31, NDR Ops Team members provided hands-on support to AT&T’s
outside plant teams working to restore a fiber route between New Orleans and a regen building in
Logtown, MS. On Wednesday night, August 31, an Ops member took down the last restoration
patches in Gulfport, MS, returning the AT&T Network to its pre-Katrina status.
Because AT&T’s network did not require further restoration, the NDR team began deploying with
ECVs at the request of the U.S. Federal government in Mississippi and Louisiana. Emergency
communications capabilities were provided to National Guard command locations, to the Louisiana
State Patrol, to a temporary jail in New Orleans, and to Stennis International Airport. The last ECV
went out of service on September 22, 2005 when the team began preparations for Hurricane Rita.