SLAC Windows Storage - Stanford University

SLAC Windows Infrastructure
Brian Scott
May 2003
Windows Environment

1700 Windows computer accounts

3600 Windows user accounts

91% standard Dell desktop hardware
Old NT Environment
BABAR
BSDHUB1
SSRL
CONTROLS
SLAC
SLD-NT
ESH
Ragamuffin
KLYSTRON
MFD-HUB
MDCAD
New Windows 2000 Environment
SSRL
CONTROLS
SLAC
Single forest and domain with multiple
domain controllers (DC). FSMO roles
reside on SLAC’s DC’s. Global catalog
replicated to remote DC’s.
Windows 2000 Active Directory

Finished rollout of Active Directory in September 2002
Choices
– Migration tools and SID history
– Double ACL all resources
– Re-ACL to new domain and cutover
– In-place Upgrade
Upgrade Path 1: Migration Tools/SID
Go to Native Mode
Use migration tools to migrate user and machine accounts (NetIQ, Quest, ADMT)
Rely on SID history for access to old resources
Log into “SLAC” (NT) and “WIN” (XP)
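The reliance on SID history can be illustrated with a toy model. This is a minimal sketch, not Windows' actual access-check code: the `Account` class, the `can_access` helper and the SID strings are all hypothetical. It assumes only that a migrated account's logon token carries its new SID together with the SIDs kept in its SID history, so resources whose ACLs still name the old-domain SID stay reachable.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Toy account: one current SID plus any SIDs carried over by migration."""
    name: str
    sid: str
    sid_history: list = field(default_factory=list)

    def token_sids(self):
        # The logon token holds the current SID and every SID in the history.
        return {self.sid, *self.sid_history}


def can_access(account, acl_sids):
    """Grant access if any SID in the token appears in the resource's ACL."""
    return bool(account.token_sids() & set(acl_sids))


# Hypothetical SIDs: "S-OLD-1001" was issued by the old NT domain,
# "S-NEW-5001" and "S-NEW-5002" by the new domain.
old_acl = ["S-OLD-1001"]  # resource that has not been re-ACLed
migrated = Account("jdoe", "S-NEW-5001", sid_history=["S-OLD-1001"])
fresh = Account("asmith", "S-NEW-5002")  # created in the new domain, no history

print(can_access(migrated, old_acl))  # True
print(can_access(fresh, old_acl))     # False
```

In the same toy model, double ACLing (Upgrade Path 2) reaches the same result from the other side: the new-domain SID is added to the resource's ACL instead of the old SID being carried in the token.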

Upgrade Path 1: Migration Tools/SID
Pros
– Easily reversible
Cons
– Migration tools not working as expected
– Many migration steps and overhead
– Things will break
– Migration spans 1 year
Upgrade Path 2: Double ACL
Go to Native Mode
Double ACL all resources with ACL migration tool
Continue to double ACL manually after migration with any addition or change
Log into “SLAC” (NT) and “WIN” (XP)

Upgrade Path 2: Double ACL

Pros
– Easily reversible
Cons
– Need to re-ACL resource domains
– Very confusing, things will break
– Migration spans 1 year
Upgrade Path 3: Re-ACL/Big Bang!
Go to Native Mode
Re-ACL for new domain
One day everyone logs into new domain (WIN), NT, W2K and XP alike

Upgrade Path 3: Re-ACL/Big Bang!

Pros
– Migrate over a weekend
Cons
– Not easily reversible
– Re-ACL resource domains
– Things will break
– Chaos for 1-2 weeks
Upgrade Path 4: In-place Upgrade
In-place Upgrade
Go to mixed mode; after 3-4 months, upgrade to Native Mode
Log into “SLAC” (NT and XP) or use UPN “win.slac.stanford.edu” (XP)

Upgrade Path 4: In-place Upgrade

Pros
– No re-ACL
– No new domain
– No migration tools
– Less likely to break
– Less overhead
Cons
– Not native mode
– Will need to migrate off of upgraded DC at some point
– No nested groups
Windows 2000 Active Directory
Chose in-place upgrade over going straight to Native Mode
Upgrade was fast (few hours) and no accounts needed to be migrated
Environment supports XP, Windows 2000 and Windows NT
All SLAC Windows accounts are in Active Directory and managed by SCS Help Desk
Windows XP and 2000 Server OS

Operating System installation via Boot CD

Boot CD provides automated installation of the OS using Windows Preinstallation Environment (Windows PE) and Visual Basic
Two versions of CD
– OS install files stored on the network
– OS install files stored on CD
Software Delivery and GPO’s

Software rolled out to workstations via Group Policy Objects (GPO’s) rather than SMS
– No clear decision from Microsoft on software delivery
– Rollout via SMS could take 24 hours or longer
– Little or no documentation from MS on GPO usage
Software repackaged as MSI’s
– Created MSI wrapper for GPO installs
All software that was part of boot-floppy installations now installed via GPO’s
– Office XP, SMS, Realplayer, Acrobat, Hypersnap, WS_FTP, TeraTerm, GS Tools and Aladdin Expander, etc…
SMS used for software and hardware inventory and remote access to desktops
Minimum Standard for Joining Domain
Software rolled out immediately upon joining SLAC domain via GPO
– XP Service Pack 1
– InoculateIT Anti-virus
– Registry Seed
– Office XP
– SMS
SUS Hotfix Delivery
Microsoft Windows XP hotfixes rolled out via Microsoft Software Update Services (SUS)
Rollout schedule is monthly
During the month, users can install hotfixes themselves
Over the last few days of the month, hotfixes are installed automatically for those who have not applied them themselves
Immediate rollout available for urgent hotfixes
Servers patched once a month as well
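The monthly schedule can be read as a small decision rule. The sketch below is only an illustration of that rule, not how SUS itself is configured: the function names are hypothetical, and the three-day forced window is an assumed value standing in for "the last few days of the month".

```python
import calendar
from datetime import date


def in_forced_window(today, forced_days=3):
    """Return True if `today` falls in the last `forced_days` days of its month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return today.day > days_in_month - forced_days


def rollout_action(today, already_patched, urgent=False, forced_days=3):
    """Pick the action for one workstation under the monthly schedule."""
    if already_patched:
        return "none"
    if urgent or in_forced_window(today, forced_days):
        return "install automatically"
    return "offer to user"


print(rollout_action(date(2003, 5, 12), already_patched=False))  # offer to user
print(rollout_action(date(2003, 5, 30), already_patched=False))  # install automatically
print(rollout_action(date(2003, 5, 12), already_patched=False, urgent=True))  # install automatically
```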
Windows 2000 Environment
Utilize Dell hardware (1550, 1650, 2550, 2650, 6300)
Print services reside on central print servers
Central account domain in SLAC
User and machine accounts in department OU’s
– Administration delegated to departments
Centralized WINS servers
Delegated DNS zone win.slac.stanford.edu running as “Integrated Zone” on DC’s
Remote access via PPTP/VPN and ICA/Citrix
Anti-virus via CA eTrust InoculateIT
Recently finished migration of IIS to Windows 2000
Monitoring Solution

Implementing new monitoring solution. Recent purchase of NetIQ AppManager and NetIQ Administration Suite
– Current monitoring solution: network “ping” and manual health checks
– Reviewed HP Network Node Manager, MOM, Quest Software and NetIQ
– NetIQ is extensible using VBScript and Perl
– Integrates with TelAlert
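For reference, a "ping"-style reachability check of the kind described as the current solution might look roughly like the sketch below. It is an assumption-laden illustration, not SLAC's script: the function names and host names are made up, and treating a zero exit status from the system `ping` command as "reachable" is a simplification.

```python
import platform
import subprocess


def ping_once(host, timeout_s=2):
    """Send one ICMP echo with the system `ping`; True if it exits with status 0."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def check_hosts(hosts, probe=ping_once):
    """Probe every host and return {host: reachable}."""
    return {host: probe(host) for host in hosts}


def unreachable(results):
    """List the hosts whose probe failed, in sorted order."""
    return sorted(host for host, ok in results.items() if not ok)


# Stub probe so the example runs without network access; host names are made up.
fake_status = {"dc1.example": True, "print1.example": False}
results = check_hosts(fake_status, probe=fake_status.get)
print(unreachable(results))  # ['print1.example']
```

A check like this only shows that a host answers on the network, which is one reason a product that also inspects services and applications was being brought in.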
NetIQ
NetIQ GPO
NetIQ File and Storage Admin
Windows Environment

Implement new backup solution.
– Current solution: Veritas Backup Exec
– Reviewing Legato, Veritas NetBackup, TSM, etc…
– May look to disk for main backups and off-site storage via tapes
– Look to implement SAN based backup architecture
Upgrade of Citrix MetaFrame 1.8 on NT TSE to Citrix XPe on Windows 2000 underway
Windows Storage at SLAC
[Chart: Total Windows Data 2002 - All Types. Y axis: GB (0 to 3000). X axis: Date (Jan-02 to Dec-02). Series: Misc Servers, Mac Server, BSD Users, SAN Users, SAN Groups, SAN Pub, SAN Project, Web Servers, Exchange All, RM Data, Grand Total.]
Windows Storage


Dell SAN solution utilized
Storage Outages
– 2 storage outages in 2001 lasted a total of 6 days
– Recent outage in March 2003 lasted 28 hours
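To put the outage figures in perspective, they can be converted to availability percentages. The helper below is a hypothetical illustration; the only inputs taken from the slide are the 6 days of outage in 2001 and the 28-hour outage in March 2003.

```python
def availability_pct(downtime_hours, period_hours):
    """Percentage of the period during which the service was up."""
    return 100.0 * (1 - downtime_hours / period_hours)


HOURS_2001 = 365 * 24       # 2001 was not a leap year
HOURS_MARCH_2003 = 31 * 24  # March has 31 days

print(round(availability_pct(6 * 24, HOURS_2001), 2))    # 98.36
print(round(availability_pct(28, HOURS_MARCH_2003), 2))  # 96.24
```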
Dell Storage System
Backup
StorageTek L180
1st Tier and 2nd Tier

1st Tier Storage
– The 1st tier storage offering would always be kept small enough that data can be restored within 4 hours after a catastrophic failure. It would provide high-end functionality such as non-disruptive upgrades and point-in-time copy.
2nd Tier Storage
– The 2nd tier storage offering would take full advantage of reliable low-cost storage technology. Recovery times after a major failure may be days rather than hours. The 2nd tier system would be comparable to the current storage system.
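The 4-hour restore target implies an upper bound on how much data tier 1 can hold: capacity is limited by sustained restore throughput times the restore window. The sketch below works through that arithmetic. The function names are hypothetical and the 30 MB/s restore rate is an illustrative assumption, not a measured SLAC figure.

```python
def max_tier1_capacity_gb(restore_rate_mb_per_s, restore_window_hours=4.0):
    """Largest data set (GB) restorable within the window at a sustained rate."""
    seconds = restore_window_hours * 3600
    return restore_rate_mb_per_s * seconds / 1000  # 1 GB = 1000 MB


def restore_hours(data_gb, restore_rate_mb_per_s):
    """Hours needed to restore `data_gb` at a sustained rate."""
    return data_gb * 1000 / restore_rate_mb_per_s / 3600


# Assumed sustained restore rate of 30 MB/s (illustrative only).
print(max_tier1_capacity_gb(30))          # 432.0
print(round(restore_hours(1000, 30), 1))  # 9.3
```

Under that assumed rate, anything beyond roughly 432 GB would miss the 4-hour target, which is the reasoning behind keeping tier 1 small and moving bulk data to tier 2.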
Quotas
In order to help facilitate future storage planning, a quota system will be proposed
Increases of storage capacity would be allowed on an as needed basis.
Allow regular planning discussions surrounding storage best practices.
Storage Evaluation
Completed storage evaluation March 2002
Looked at NAS, SAN and Direct Attached
Reviewed
– Sun
– Hitachi
– EMC
– IBM
– Compaq
– Network Appliance
– StorageTek
Storage
Purchased Hitachi 9980
– Recently migrated ALL Windows data onto Hitachi solution
Hitachi 9980
– Brocade 3800
– Emulex 2 Gbps HBA’s
– Hitachi Dynamic Link Manager
– Hitachi’s ShadowImage (point-in-time copy)
In the process of purchasing Tier 2 Solution
– Evaluating usual suspects
– Will migrate most of information onto tier 2
New Storage Solution
[Diagram labels: LAN switch; LAN connection; web server, VPN servers, etc.; file servers; e-mail servers; backup server; 2 Gbps SAN fabric connections; two Brocade 3800 SAN switches; Tier 1 Hitachi 9980; Tier 2 storage solution; StorageTek L180 LTO.]
Reporting Storage Trends

Purchased Veritas StorageCentral SRM Tools for end-users to better understand and control their storage needs
– Files being stored
– Usage of those files
– Growth of repository
– Size of repository
– Active e-mail sent with information
Currently being tested for rollout
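As a rough idea of the kind of figures such a report contains, the sketch below totals file sizes under a directory by extension and computes growth between two scans. It is a generic illustration with hypothetical function names, not the StorageCentral product or its API, and it covers only repository size and growth, not file usage.

```python
import os
from collections import defaultdict


def summarize_tree(root):
    """Walk `root` and return (total_bytes, {extension: bytes})."""
    by_ext = defaultdict(int)
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            ext = os.path.splitext(name)[1].lower() or "(none)"
            by_ext[ext] += size
            total += size
    return total, dict(by_ext)


def growth_gb(previous_bytes, current_bytes):
    """Growth between two scans, in GB (1 GB = 10**9 bytes)."""
    return (current_bytes - previous_bytes) / 10**9
```

Calling `summarize_tree` on a share's root once per reporting period and passing consecutive totals to `growth_gb` gives a simple size and growth trend per repository.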
Veritas StorageCentral
Exchange
Current production system is Exchange 5.5
Exchange 2000 is in production for Windows administrators
Waiting for additional storage before rolling out Exchange 2000
Exchange 2000 will reside on Hitachi 9980 solution
Exchange 2000
Hitachi solution will take snapshots of the Exchange database every 24 hours
In the event of corrupted data, the snapshot volume will be mounted and logs replayed to recover email
Anticipated outage less than 4 hours
Over the next year…

Authentication
– Provide single user name and password to user
– Single place to change user name and password
Implement new Extra Private Network (EPN)
– Integrate Unix, Windows, PeopleSoft, Oracle, Remedy, etc…
– Utilize firewall technology to protect core business information (PeopleSoft, Oracle databases, etc…)
Migrate Windows NT infrastructure to Active Directory (incorporated with Authentication project)
Implement similar firewall technology to segment business community utilizing the SSRL’s Beamline
New Backup Architecture
Content Management System
Future Direction of EPN Architecture