Introduction - Northern Kentucky University

CIT 470: Advanced Network and
System Administration
Workstations and Servers
CIT 470: Advanced Network and System Administration
Slide #1
Topics
1. Machine Lifecycle
2. Automated Installs
3. Server Hardware
4. Services
Machine Lifecycle
Workstation Management
States of Machines
New
A new machine
Clean
OS installed, but not yet configured for environment.
Configured
Configured correctly for the operating environment.
Unknown
Misconfigured, broken, newly discovered, etc.
Off
Retired/surplussed
State Transitions
Build
Set up hardware and install OS.
Initialize
Configure for environment; often part of build.
Update
Install new software.
Patch old software.
Change configurations.
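The states and transitions above can be read as a small state machine. The sketch below is illustrative only: the slides name the transitions, but the exact from/to pairs (for example, that Update leaves a machine in the Configured state) are assumptions, and the names `State`, `TRANSITIONS`, and `apply_transition` are hypothetical.

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    CLEAN = "clean"
    CONFIGURED = "configured"
    UNKNOWN = "unknown"
    OFF = "off"


# (action, current state) -> next state.
# Assumption: only the three transitions named on the slide are modeled.
TRANSITIONS = {
    ("build", State.NEW): State.CLEAN,
    ("initialize", State.CLEAN): State.CONFIGURED,
    ("update", State.CONFIGURED): State.CONFIGURED,
}


def apply_transition(state, action):
    """Return the next state, or raise if the action is not valid here."""
    try:
        return TRANSITIONS[(action, state)]
    except KeyError:
        raise ValueError(f"cannot {action} a machine in state '{state.value}'")


# Walk a new machine through its normal lifecycle.
machine = State.NEW
for step in ("build", "initialize", "update"):
    machine = apply_transition(machine, step)
    print(step, "->", machine.value)
```

Rejecting unlisted pairs makes the point of the lifecycle model explicit: an update is only meaningful on a machine whose state is known.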
Automated Installs
Why Automate Installs?
1. Save time.
Boot the computer, then go do something else.
2. Ensure consistency.
No chance of entering wrong input during install.
Avoid user requests due to mistakes in config.
What works on one desktop, works on all.
3. Fast system recovery.
Rebuild system with auto-install vs. slow tapes.
Trusting the Vendor Installation
Always reload the OS on new machines.
– You need to configure the host for your env.
– Eventually you’ll reload the OS on a desktop,
leaving you with two platforms to support: the
vendor OS install and your OS install.
– Vendors change their OS images from time to
time, so systems you bought today have a
different OS from systems bought 6 months ago.
Install Types
1. Hard Disk Imaging
Duplicate hard disk of installed system.
Advantages: fast, simple.
Disadvantages: need identical hardware, leads to
many images, all of which must be updated
manually when you make a change
2. Scripted Installs
Installer accepts input from script.
Advantages: flexible, systems can be different
Disadvantages: more effort to set up initially
Auto-Install Features
1. Unattended
Requires little or no human interaction.
2. Concurrent
Multiple installs can be performed at once.
3. Scalable
New clients added easily.
4. Flexible
Configurable to do custom install types.
Auto-Install Components
Boot Component
Media (floppy or CD)
Network (PXE)
Network Configuration
DHCP: IP addresses, netmasks, DNS
Install Configuration
Media (floppy or CD)
Network (tftp, ftp, http, NFS)
Install Data and Programs
Network (tftp, ftp, http, NFS)
PXE
Preboot eXecution Environment
Intel standard for booting over the network.
PXE BIOS loads kernel over network.
Applications
Diskless clients (use NFS for root disk.)
Booting install program.
How it works
1. Asks DHCP server for config (IP, net, TFTP server).
2. Downloads pxelinux boot loader from tftp server.
3. pxelinux loads and boots the kernel.
4. Kernel uses tftp’d filesystem image or NFS filesystem.
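Once pxelinux is running, it fetches a configuration file from the `pxelinux.cfg/` directory on the tftp server, trying progressively less specific names. The sketch below lists those names for one client; the function name and the sample MAC address are hypothetical, and the optional client-UUID lookup that newer PXELINUX versions try first is left out.

```python
import ipaddress


def pxelinux_config_names(mac, ip):
    """List config file names in the order PXELINUX looks for them."""
    # 1. ARP hardware type (01 = Ethernet) plus the MAC address,
    #    lowercase and dash-separated.
    names = ["01-" + mac.lower().replace(":", "-")]
    # 2. The client IPv4 address as 8 uppercase hex digits, then the same
    #    string with one more digit dropped from the end each time.
    hex_ip = format(int(ipaddress.IPv4Address(ip)), "08X")
    names.extend(hex_ip[:n] for n in range(8, 0, -1))
    # 3. The catch-all file shared by every client.
    names.append("default")
    return names


for name in pxelinux_config_names("88:99:AA:BB:CC:DD", "192.168.1.54"):
    print("pxelinux.cfg/" + name)
```

The truncated hex names let one file cover a whole address range, so a lab subnet can share an install configuration while a single host overrides it by MAC address.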
Disk Imaging
1. Set up ftp server.
2. Install OS image on a test client.
3. Verify test client OS.
4. Copy image to server.
5. Boot clients with imaging media.
6. Clients pull image from ftp server.
[Diagram: ftp server (1); test client (2-3) copies image to server (4); deployments #1 and #2 boot (5) and pull image (6).]
Clonezilla
g4u
Scripted Install Tools
Red Hat distributions, incl. CentOS
– Kickstart
– Cobbler
Debian distributions, incl. Ubuntu
– FAI
– Preseed
Mandriva Linux
– DrakX
Solaris
– Jumpstart
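As an illustration of why scripted installs are flexible, the sketch below renders a per-host Kickstart file from one template, so hosts can differ only where they need to. The directives used are standard Kickstart commands, but the template is a minimal assumption, not a tested production file; the repository URL, the root password placeholder, and the names `KICKSTART` and `render_kickstart` are hypothetical.

```python
from string import Template

# Hypothetical minimal Kickstart template; adjust for your release.
KICKSTART = Template("""\
url --url=$repo_url
lang en_US.UTF-8
keyboard us
timezone $timezone
network --bootproto=dhcp --hostname=$hostname
rootpw --iscrypted $root_hash
clearpart --all --initlabel
autopart
reboot
%packages
@core
$extra_packages
%end
""")


def render_kickstart(hostname, packages,
                     repo_url="http://install.example.com/centos",
                     timezone="America/New_York",
                     root_hash="CHANGE_ME"):
    """Fill in the template for one host; all defaults are placeholders."""
    return KICKSTART.substitute(
        repo_url=repo_url,
        timezone=timezone,
        hostname=hostname,
        root_hash=root_hash,
        extra_packages="\n".join(packages),
    )


print(render_kickstart("web01", ["httpd", "mod_ssl"]))
```

Keeping one template and a short per-host parameter list avoids the "many images" problem noted for disk imaging: a change is made once and applies to every future install.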
Network Configuration
What’s so bad about manual net settings?
– It’s only an IP address and netmask.
– What happens if you need to renumber?
Use DHCP instead of manual settings
– Make all changes on a single server.
– Easy to change settings for entire network.
– DHCP can assign static IPs as well as dynamic.
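To show how static addresses can still be managed from a single server, the sketch below generates `host` declarations in ISC dhcpd syntax from one table. The choice of ISC dhcpd, the host name, and the MAC address are assumptions for illustration; `dhcpd_host_entries` is a hypothetical helper.

```python
def dhcpd_host_entries(hosts):
    """Render ISC dhcpd 'host' declarations from {name: (mac, ip)}."""
    blocks = []
    for name, (mac, ip) in sorted(hosts.items()):
        blocks.append(
            f"host {name} {{\n"
            f"  hardware ethernet {mac};\n"
            f"  fixed-address {ip};\n"
            f"}}"
        )
    return "\n".join(blocks)


# Hypothetical inventory: one server that always gets the same address.
inventory = {"romero": ("00:16:3e:00:00:01", "192.168.1.54")}
print(dhcpd_host_entries(inventory))
```

Renumbering then means editing the inventory and regenerating the file, instead of visiting every machine.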
Servers vs. Desktops
How are Servers different?
• 1000s of clients depend on server.
• Requires high reliability.
• Requires tighter security.
• Often expected to last longer.
• Investment amortized over many clients, longer lifetime.
Vendor Product Lines
Home
– Cheapest purchase price.
– Components change regularly based on cost.
Business
– Focuses on Total Cost of Ownership (TCO).
– Slower hardware changes, longer lifetime.
Server
– Lowest cost per performance metric (nfs, web)
– Easy to service rack-mountable chassis.
– Higher quality (MIL-SPEC) components.
Server Hardware
• More internal space.
• More CPU/Memory.
– More / high-end CPUs.
– More / faster memory.
• High performance I/O.
– PCIe vs PCI
– SCSI/FC-AL vs. IDE
• Rack mounted.
• Redundancy
– RAID
– Hot-swap, hot-spares
Rack Mounting
Efficient space utilization.
– Simple, rectangular shape measured in RUs.
– Repair and upgrade while mounted in rack.
– No side access required.
Requirements
– Cooling through back, not sides.
– Drives in front, cables in back.
– Remote management (serial console, hardware sensors, VM MUI)
Server Memory
Servers need more RAM than desktops.
– x86 supports up to 64GB with PAE.
– x86-64 allows up to 4 PB (52-bit physical addresses); current CPUs implement less.
Servers need faster RAM than desktops.
– Higher memory speeds.
– Multiple DIMMs accessed in parallel.
– Larger CPU caches.
Server CPUs
Intel Xeon
• Up to 8 cores with 2 threads each @ 1.8 to 3.3 GHz
• Up to 18 MB L3 cache
AMD Opteron
• 4, 6, 8, or 12 cores @ 1.4 to 3.2 GHz
• Up to 12 MB L3 cache
IBM Power 7
• 4, 6, or 8 cores with 4 threads each @ 3.0 to 4.25 GHz
• 4 MB L3 cache per core (up to 32MB for 8-core)
Sun Niagara 3
• 16 cores with 8 threads each @ 1.67 GHz
• 6 MB L2 cache
Xeon vs Pentium/Core CPUs
Xeon based on Pentium/Core with changes that vary by model:
– Allows more CPUs
– Has more cores
– Better hyperthreading
– Faster/larger CPU caches
– Faster/larger RAM support
System Buses
Servers need high I/O throughput.
– Fast peripherals: SCSI-3, Gigabit ethernet
– Often use multiple and/or faster buses.
PCI
– Desktop: 32-bit 33 MHz, 133 MB/s
– Server: 64-bit 66 MHz, 533 MB/s
PCI-X (backward compatible)
– v1.0: 64-bit 133 MHz, 1.06 GB/s
– v2.0: 64-bit 533 MHz, 4.3 GB/s
PCI Express (PCIe)
– Serial architecture, v3.0 up to 16 GB/s
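The bandwidth figures above follow from simple arithmetic, sketched below. For the parallel buses, peak rate is bus width in bytes times the transfer rate; the nominal 33/66/133/533 MHz clocks are rounded from 33.33, 66.66, 133.33, and 533.33. For PCIe 3.0 the sketch assumes the 16 GB/s figure refers to a x16 slot, using 8 GT/s per lane with 128b/130b encoding. The function names are hypothetical.

```python
def parallel_bus_mb_s(width_bits, clock_mhz):
    """Peak MB/s for a parallel bus: one transfer of width_bits per clock."""
    return width_bits / 8 * clock_mhz


def pcie3_gb_s(lanes):
    """Peak GB/s per direction for PCIe 3.0 (8 GT/s/lane, 128b/130b)."""
    payload_bits_per_s = lanes * 8e9 * (128 / 130)
    return payload_bits_per_s / 8 / 1e9


for label, width, clock in [
    ("PCI desktop", 32, 33.33),
    ("PCI server", 64, 66.66),
    ("PCI-X 1.0", 64, 133.33),
    ("PCI-X 2.0", 64, 533.33),
]:
    print(f"{label}: {parallel_bus_mb_s(width, clock):.0f} MB/s")
print(f"PCIe 3.0 x16: {pcie3_gb_s(16):.2f} GB/s")
```

These are theoretical peaks; real throughput is lower once protocol overhead and contention on a shared bus are taken into account.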
Hardware Redundancy
Disks are most likely component to fail.
– Use RAID for disk redundancy.
– Covered in detail in the Disks lecture.
Power supplies second most likely to fail.
– Use redundant power supplies.
– Many servers need 2 power supplies
normally.
– Need 3 power supplies for redundancy.
– Use separate power cord and UPS for each
power supply.
Full and n+1 Redundancy
n+1 Redundancy: One component can fail, but the system is
still functional.
– Ex: RAID 5, dual NICs with failover
Full Redundancy: Two complete sets of hardware configured
with failover mechanism.
– Manual: SA switches to 2nd system on noticing a failure.
– Automatic: The second system monitors the first and switches over
automatically on failure.
– Load-sharing: Both systems serve users, sharing load, but each has
capacity to handle entire load on its own. When one fails, other
automatically handles entire load.
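The difference between the two schemes comes down to how many units you buy. The sketch below counts components for each, using a hypothetical server that draws 900 W from 500 W power supplies; the function names and wattages are assumptions chosen to match the "2 normally, 3 for redundancy" case from the Hardware Redundancy slide.

```python
import math


def units_for_load(load, unit_capacity):
    """Minimum number of units needed to carry the load with no spares."""
    return math.ceil(load / unit_capacity)


def n_plus_k_units(load, unit_capacity, k=1):
    """n+k redundancy: enough units for the load, plus k extra."""
    return units_for_load(load, unit_capacity) + k


def full_redundancy_units(load, unit_capacity):
    """Full redundancy: two complete sets, each able to carry the load."""
    return 2 * units_for_load(load, unit_capacity)


print(n_plus_k_units(900, 500))         # n+1 power supplies
print(full_redundancy_units(900, 500))  # two complete sets
```

n+1 is cheaper but survives only a single failure; full redundancy costs a whole second set and tolerates the loss of either complete system.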
Hot-swap Components
Hot-swap components
– Components can be replaced while running.
– Need n+1 redundancy for this to be useful.
– Don’t need to schedule a downtime.
Issues
– Which parts are hot-swappable?
– May require a few seconds to reconfigure.
– Be sure components are hot-swap, not hot-plug.
Hot Plug and Hot Spare
Hot Plug
– Electrically safe to replace component.
– Part may not be recognized until next reboot.
– Requires downtime, unlike hot swap.
Hot Spare
– Spare component already plugged into system.
– System automatically uses hot spare when disk/CPU
board etc. fails.
– Provides n+2 redundancy.
Separate Administrative Network
Reliability
– Allows access to machines even when network
is down.
Performance
– Backups require so much bandwidth that they’re
often done over their own network.
Security
– Network security monitoring data and logs sent
across network should be secured.
Maintenance Contracts
• All machines eventually break.
• Vendors offer variety of maint contracts.
• Non-critical: Next-day or 2-day contract.
• Clusters: If you have many similar hosts (CPU or web farm), then on-site spares may be cheaper than maintenance contract.
• Controlled Model: Use small # of machine types for all servers, so you can afford a spares kit.
• Critical Host: Same-day response or on-site spares.
• Highly Critical: On-site technician + dup machine.
Data Protection
• Avoid desktop backups by storing data on servers.
Easy on UNIX, harder on Windows.
• Use RAID for server hardware failures.
– Mirror root disk, higher RAID levels for data.
– Some servers use 16GB Flash drives for root disk.
– Doesn’t protect against software mistakes.
• Server backups
– Use specialized admin network to keep load off main
network.
– Use specialized tape jukeboxes to fully automate
backups of large data servers (DBs, fileservers).
Keep Servers in Data Center
Data center necessary for server reliability.
– Power (enough power, UPS)
– Climate control (temperature, humidity)
– Fire protection
– High-speed network
– Physical security
Server Operating Systems
Server OS Image
Need greater reliability, security than desktop.
– Remove unnecessary OS components.
– Configure for best security & performance.
Install and config specialized server software.
– Server software: web, db, nfs, dns, ldap, etc.
– May need monitoring software too.
– Configuration: disk space, networking
Server OS install should be automated too.
Remote Administration
Servers must be accessible remotely.
– Allows SA to fix problems quickly at 3am.
– Allows SA to work outside machine room.
Remote Administration
– Serial console and concentrator (UNIX)
– Networked KVM (Windows)
– Remote power control.
– Important to secure remote admin facilities.
Server Appliances
Dedicated hardware + software
– Fileserver (NetApp, Auspex)
– Print servers
– Routers
Advantages
– Performance
– Reliability
– Easy to setup
– Extra capabilities
Disadvantages
– Cost
Many Inexpensive Workstations
Why buy server hardware?
– Buy two cheap rackmount PCs + failover software.
– Works if two PCs cheaper than server.
– Google’s approach with ~450,000 servers.
Blade Servers
• High-density servers on a board.
– CPU
– Memory
– Disk
• Each blade lives in a blade chassis.
Blade Chassis
• Blade chassis provides power, network, remote management.
• Typically hot-swappable, hot-spare.
• Racks can only support 1 svr/RU.
• Blades are higher density, but also require more power and cooling.
Services
Servers vs Services
A server is a piece of hardware.
A service is the function that is provided by
one or more servers.
Services
• Distinguish structured computing environment
from some standalone PCs.
• Large orgs linked through shared services to ease
communication and optimize resources.
• Typical environments have many services
– Fundamental: net, DNS, email, auth, printing.
– Typical: DHCP, backup, directory, file, license.
• Services often depend on other services
– Almost everything depends on DNS.
Providing a Service
A service is more than hardware + software.
A service must be
1. Reliable.
2. Scalable.
3. Monitored.
4. Maintained.
5. Supported.
Servers and Services
For a service to be reliable, servers should:
– Be as simple as possible.
– Have minimum software to run service.
– Depend on as few other services as possible.
– Depend only on services that are at least as reliable as the service running on the server.
– Have access restricted to SAs.
– Be as few as needed for performance + reliability.
Customer Requirements
Customers are the reason for the service.
– How do they intend to use it?
– What features do they need?
– What features would they like to have?
– How critical is the service?
– What levels of availability and support are needed?
Service Level Agreement (SLA)
– Enumerates services.
– Defines level of support.
– Commits to response times for problem types.
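When an SLA states an availability level, it helps to translate the percentage into permitted downtime before committing to it. The sketch below does that arithmetic; the function name and the sample availability levels are illustrative assumptions, not values from the lecture.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_pct, period_hours=HOURS_PER_YEAR):
    """Hours of downtime permitted in the period at a given availability."""
    return (1 - availability_pct / 100) * period_hours


for pct in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(pct)
    print(f"{pct}% -> {hours:.2f} hours/year ({hours * 60:.0f} minutes)")
```

Each extra "nine" cuts the downtime budget by a factor of ten, which is why the required availability level drives decisions about redundancy and maintenance contracts.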
Operational Requirements
Essential to designing a reliable service
– What services does it depend upon?
– What other services will depend upon it?
– How does it interoperate with other services?
– How can it be integrated with auth/dir services?
– How does the service scale?
– How can the service be upgraded?
• Downtime requirements.
• What systems are affected?
Open Architecture
Service should be built around open standards
– Check IETF RFCs to see if it’s an open protocol.
– Example service: SMTP
– Example products: exim, postfix, qmail, sendmail.
– Open standards don’t require open source.
Allows vendors to make interoperable products.
– Avoids vendor lock-in.
– Allows vendor competition (cheaper prices for you.)
– Decouples client selection from server selection.
– Avoids need for protocol gateways.
Requests for Comments (RFCs)
Documentation for Internet protocols,
technologies, and methodologies.
– Standards track RFCs describe Internet standards
(TCP, IP, SMTP) and must be approved by IETF.
– Experimental RFCs may become standards.
– Best Current Practice RFCs describe how to run
services or use protocols.
– Informational RFCs are a catch-all including
proprietary protocols, April Fool’s jokes, etc.
Available from http://www.rfc-editor.org/
Principles for Designing a Reliable Service
Simplicity
– The more features, the more bugs.
– Simplicity increases reliability, ease of
maintenance.
Vendor Relations
– Can be helpful about configuring service.
– Let vendors compete for your business.
– Stick to vendors who develop for your platform.
Machine Independence
Will eventually move service to new host.
– Want to avoid having a downtime.
– Want to avoid reconfiguring every desktop.
Use generic DNS alias for machine
– Mail server has name romero
– DNS alias is smtp
Use virtual IP addresses for non-name svcs
– Machine has usual IP address: 192.168.1.54
– Virtual: ifconfig eth0:0 192.168.1.5
Dedicated Machines
Put each service on its own machine(s).
– If a server crashes, only impacts one service.
– Easier to debug if only one service running.
– Performance tuning easier with one service.
– If you can’t afford a new machine, use a VM.
Environment
Safe environment
– Improves reliability: AC, UPS, physical security.
– Data center usually provides faster network too.
– Only rely on services provided by data center.
Restricted access
– Customers should not need to login to servers.
– More logins decrease stability, performance.
– Even Windows can be stable w/o user logins.
Principles for Designing a Reliable Service
Service components should be tightly coupled.
– Other than redundant components.
– Share same power source, network.
– Reduces service dependencies (single points of
failure.)
Centralize management of service
– Managed by one set of SAs.
– Support for service by single helpdesk.
– Document service.
Performance
Latency vs throughput
– Latency is delay before data received.
– Throughput is how much data sent per second.
– A performance problem typically affects only one of the two.
– Increasing the other will not solve your problem.
Remote sites
– May have high latency to main site.
– Do you need secondary servers at remote sites?
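A simple model makes the latency/throughput distinction concrete: total time is roughly the latency plus the payload size divided by throughput. The sketch below applies it to two hypothetical workloads over a link with 100 ms latency; all sizes and link speeds are made-up illustration values, and the model ignores protocol overhead and round trips.

```python
def transfer_time_s(size_bytes, latency_s, throughput_bps):
    """Approximate time to deliver a payload: latency plus transmit time."""
    return latency_s + size_bytes * 8 / throughput_bps


LATENCY = 0.100  # 100 ms to a remote site (assumed)

workloads = {"1 KB query": 1_000, "1 GB backup": 1_000_000_000}
for label, size in workloads.items():
    slow = transfer_time_s(size, LATENCY, 100e6)  # 100 Mbit/s link
    fast = transfer_time_s(size, LATENCY, 1e9)    # 1 Gbit/s link
    print(f"{label}: {slow:.4f} s at 100 Mbit/s, {fast:.4f} s at 1 Gbit/s")
```

The small query is dominated by latency, so a tenfold faster link barely changes it, while the backup is dominated by throughput and speeds up almost tenfold. This is the case for a secondary server at a remote site: it reduces latency, which more bandwidth cannot.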
Capacity Planning
Estimate capacity from testing.
– Test server at 100 qps, 200 qps, until slow.
– Identify resources used by each query
• RAM
• Disk
• Network
• CPU
Can service be split onto multiple servers?
– Can it be done w/o users noticing?
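Once per-query resource use has been measured, a rough capacity estimate is the smallest ratio of available resource to per-query cost; the resource with that smallest ratio is the bottleneck. The sketch below uses hypothetical measurements and a hypothetical function name, and assumes cost scales linearly with load, which real services only approximate.

```python
def estimate_max_qps(per_query, capacity_per_s):
    """Return (max sustainable qps, bottleneck resource name)."""
    limits = {
        resource: capacity_per_s[resource] / cost
        for resource, cost in per_query.items()
        if cost > 0
    }
    bottleneck = min(limits, key=limits.get)
    return limits[bottleneck], bottleneck


# Hypothetical measurements from load testing one server.
per_query = {"cpu_s": 0.002, "disk_io": 1.5, "net_bytes": 20_000}
capacity_per_s = {"cpu_s": 4.0, "disk_io": 600, "net_bytes": 125_000_000}

qps, bottleneck = estimate_max_qps(per_query, capacity_per_s)
print(f"about {qps:.0f} qps, limited by {bottleneck}")
```

Knowing the bottleneck tells you what to upgrade first, and whether splitting the service across servers would relieve the limiting resource at all.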
Principles for Designing a Reliable Service
Monitoring
– Availability, problems, performance.
– Auto-alert front line support.
– Customers shouldn’t discover problems before SA.
– Capacity planning: CPU, mem, disk, network, licenses.
Service Rollout
– First impressions are difficult to change.
– Be ready for support: docs, trained helpdesk.
– Use one, some, many technique.
Key Points
Desktop Lifecycle: New, clean, configured, unknown states.
Automated Installs
– Why: consistency, fast recovery, saves time.
– Install types: imaging vs. scripted.
– Components: boot, network, config, data.
– Think about how Principles of SA apply.
Servers vs desktops
– Requirements and hardware differences.
Redundancy
– Full vs n+k redundancy.
– Hot plug vs hot spare.
Services
– Requirements: service, server, customer, operational.
– Machine independence and open architectures.
Performance: Latency vs. throughput.
References
1. Mark Burgess, Principles of System and Network Administration, Wiley, 2000.
2. Aeleen Frisch, Essential System Administration, 3rd edition, O’Reilly, 2002.
3. R. Evard, "An analysis of unix system configuration," Proceedings of the 11th Systems Administration conference (LISA), page 179, 1997. http://www.usenix.org/publications/library/proceedings/lisa97/full_papers/20.evard/20_html/main.html
4. Thomas Limoncelli, Christine Hogan, Strata Chalup, The Practice of System and Network Administration, 2nd ed., Addison-Wesley, 2007.
5. Evi Nemeth et al., UNIX System Administration Handbook, 3rd edition, Prentice Hall, 2001.