CIT 470: Advanced Network and System Administration
Administration Fundamentals
Topics
1. The Nature of System Administration
2. Organizations and Certifications
3. Change Management
4. Remote Administration
What is System Administration?
What is a system?
System: An organized collection of computers interacting with a group of users.
[Diagram: PCs, Network, Servers, Services, Users; connectors labeled "run on" (twice) and "help to accomplish work"]
System State
System policy: specification of a system’s configuration and its acceptable usage.
System state S(t): the current configuration (files, kernel, memory or CPU usage) of a system.
Ideal states S*(t): states of the system that match the system policy. Over time, the system state shifts away from the ideal state.
System administration: modifying the system to bring it closer to S*(t).
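The idea of drift between S(t) and S*(t) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the system state is reduced to a flat dictionary of settings, no setting ever has the value None, and the names drift and converge are hypothetical, not part of any tool mentioned in the course.

```python
def drift(ideal, current):
    """Return {setting: (ideal_value, current_value)} for every difference.

    A value of None means the setting is absent on that side.
    """
    keys = sorted(set(ideal) | set(current))
    return {
        k: (ideal.get(k), current.get(k))
        for k in keys
        if ideal.get(k) != current.get(k)
    }


def converge(ideal, current):
    """Return a copy of the current state moved back to the ideal state."""
    fixed = dict(current)
    for key, (want, _have) in drift(ideal, current).items():
        if want is None:
            fixed.pop(key, None)   # setting is not allowed by the policy
        else:
            fixed[key] = want      # setting is missing or has drifted
    return fixed


if __name__ == "__main__":
    policy_state = {"sshd_root_login": "no", "ntp": "on"}
    observed = {"sshd_root_login": "yes", "ntp": "on", "telnetd": "on"}
    print(drift(policy_state, observed))
    print(converge(policy_state, observed))
```

Configuration management tools apply the same compare-then-repair loop to files, packages, and services rather than to a dictionary.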
What do sysadmins do?
Small org: sysadmin can be entire IT staff
– Phone support
– Order and install software and hardware
– Fix anything that breaks from phones to servers
– Develop software
Large org: sysadmin is one of many IT staff
– Specialists instead of “jack of all trades”
– Database admin, Network admin, Fileserver admin, Help desk worker, Programmers, Logistics
Common Activities
1. Add and remove users.
2. Add and remove hardware.
3. Perform backups.
4. Install new software systems.
5. Troubleshooting.
6. System monitoring.
7. Auditing security.
8. Help users.
9. Communicate.
User Management
Creating user accounts
– Consistency requires automation
– Startup (dot) files
Namespace management
– Usernames and UIDs
– Multiple namespaces or SSI?
Removing user accounts
– Consistency requires automation
– Many accounts across different systems
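One way to see why namespace management needs automation is to check usernames and UIDs across hosts. The sketch below is an assumption-laden example: it takes passwd-format text per host (the host names and accounts in the usage section are made up) and reports UIDs shared by different usernames and usernames that map to different UIDs. The function names are hypothetical.

```python
from collections import defaultdict


def parse_passwd(text):
    """Yield (username, uid) pairs from passwd-format text."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        yield fields[0], int(fields[2])


def namespace_conflicts(passwd_by_host):
    """Return (uid_clashes, name_clashes) across all hosts.

    uid_clashes:  {uid: [usernames]} where one UID has several usernames.
    name_clashes: {username: [uids]} where one username has several UIDs.
    """
    names_for_uid = defaultdict(set)
    uids_for_name = defaultdict(set)
    for text in passwd_by_host.values():
        for name, uid in parse_passwd(text):
            names_for_uid[uid].add(name)
            uids_for_name[name].add(uid)
    uid_clashes = {u: sorted(n) for u, n in names_for_uid.items() if len(n) > 1}
    name_clashes = {n: sorted(u) for n, u in uids_for_name.items() if len(u) > 1}
    return uid_clashes, name_clashes


if __name__ == "__main__":
    hosts = {
        "hostA": "alice:x:1001:1001::/home/alice:/bin/sh\n"
                 "bob:x:1002:1002::/home/bob:/bin/sh",
        "hostB": "alice:x:1001:1001::/home/alice:/bin/sh\n"
                 "carol:x:1002:1002::/home/carol:/bin/sh",
    }
    print(namespace_conflicts(hosts))
```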
Hardware Management
Adding and removing hardware
– Configuration, cabling, etc.
Purchase
– Evaluate and purchase servers + other hardware
Capacity planning
– How many servers? How much bandwidth, storage?
Data Center management
– Power, racks, environment (cooling, fire alarm)
Virtualization
– When can virtual servers be used vs. physical?
Backups
Backup strategy and policies
– Scheduling: when and how often?
– Capacity planning
– Location: on-site vs. off-site.
Monitoring backups
– Checking logs
– Verifying media
Performing restores when requested
Software Installation
Automated consistent OS installs
– Desktop vs. server OS image needs.
Installation of software
– Purchase, find, or build custom software.
Managing software installations
– Distributing software to multiple hosts.
– Managing multiple versions of a software pkg.
Patching and updating software
Troubleshooting
Problem identification
– By user notification
– By log files or monitoring programs
Tracking and visibility
– Ensure users know you’re working on problem
– Provide an ETA if possible
Finding the root cause of problems
– Provide temporary solution if necessary
– Solve the root problem to eliminate it permanently
System Monitoring
Automatically monitor systems for
– Problems (disk full, error logs, security)
– Performance (CPU, mem, disk, network)
Provides data for capacity planning
– Determine need for resources
– Establish case to bring to management
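As a small illustration of automatic monitoring for a "disk full" problem, the sketch below uses Python's standard shutil.disk_usage call. The 90% threshold and the function names are assumptions chosen for the example; a production setup would more likely use a monitoring system than a standalone script.

```python
import shutil


def is_over_threshold(total, used, threshold=0.90):
    """True when used space is at or above the given fraction of total."""
    return total > 0 and used / total >= threshold


def full_filesystems(paths, threshold=0.90):
    """Return [(path, percent_used)] for paths at or above the threshold."""
    alerts = []
    for path in paths:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        if is_over_threshold(usage.total, usage.used, threshold):
            alerts.append((path, round(100 * usage.used / usage.total, 1)))
    return alerts


if __name__ == "__main__":
    for path, percent in full_filesystems(["."]):
        print(f"WARNING: {path} is {percent}% full")
```

Recording the same percentages over time, rather than only alerting on them, gives the data needed for capacity planning.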
Helping Users
Request tracking system
– Ensures that you don’t forget problems.
– Ensures users know you’re working on their problem; reduces interruptions and status queries.
– Lets management know what you’ve done.
User documentation and training
– Policies and procedures
Schedule and communicate downtimes
Communicate
Customers
– Keep customers apprised of progress.
• When you’ve started working on a request with ETA.
• When you make progress, need feedback.
• When you’re finished.
– Communicate system status.
• Uptime, scheduled downtimes, failures.
– Meet regularly with customer managers.
Managers
– Meet regularly with your manager.
– Write weekly status reports.
Specialized Skills
Heterogeneous Environments
Integrating multiple OSes, hardware types, network protocols, and distributed sites.
Databases
SQL RDBMS
Networking
Complex routing, high speed networks, voice.
Security
Firewalls, authentication, NIDS, cryptography.
Storage
NAS, SANs, cloud storage.
Virtualization and Cloud Computing
VMware, cloud architectures.
Qualities of a Successful Sysadmin
Customer oriented
– Ability to deal with interrupts, time pressure
– Communication skills
– Service provider, not system police
Technical knowledge
– Hardware, network, and software knowledge
– Proficiency with at least one scripting language
– Debugging and troubleshooting skills
Time management
– Automate everything possible.
– Ability to prioritize tasks: urgency and importance.
First Steps to Better SA
Use a request system.
– Customers know what you’re doing.
– You know what you’re doing.
Manage quick requests right
– Handle emergencies quickly.
– Use request system to avoid interruptions.
Policies
– How do people get help?
– What is the scope of responsibility for SA team?
– What is our definition of emergency?
Start every host in a known state.
Principles of SA
Simplicity
– Choose the simplest solution that solves the entire problem.
– Work towards a predictable system.
Clarity
– Choose a straightforward solution that’s easy to change, maintain,
debug, and explain to other SAs.
Generality
– Choose reusable solutions that scale up; use open protocols.
Automation
– Use software to replace human effort.
Communication
– Be sure that you’re solving the right problems and that people know
what you’re doing.
Basics First
– Solve basic infrastructure problems before advanced ones.
Organizations and Certifications
Organizations
USENIX: Advanced Computing Systems Association
LISA: Large Installation System Administration
SAGE: System Administrators Guild
LOPSA: League of Professional System Administrators
Types of Sites
Small
2-10 computers, 1 OS, 2-20 users.
Small staff size requires outsourcing to obtain most
specialized skills.
Midsized
11-100 computers, 1-3 OSes, 21-100 users.
Large
100+ computers, multiple OSes, 100+ users.
Outsources to reduce costs, some specializations.
Certifications
• CCNA, CCNP, CCIE (Cisco)
• cSAGE (SAGE)
• MCSA (Microsoft)
• RHCE (Red Hat)
• SCSA (Sun)
• VCP (VMware)
SAGE Job Descriptions
Novice
OS familiarity, help desk skills
Junior
Can use OS system administration tools (370)
Intermediate
Understanding of distributed computing, common servers,
automate small tasks, independent action
Senior
Understanding of scaling issues, including capacity
planning, solve problems by addressing root cause,
higher level programming abilities, write proposals for
purchasing, data center planning, etc.
SA Maturity Model (SAMM)
1. Ad Hoc
Ad-hoc non-repeatable solutions, firefighting.
2. Repeatable
Some repeatable processes.
3. Defined
Documented standard processes
4. Managed
Process effectiveness measured, adapted.
5. Optimized
Maturity and Complexity
[Figure: chart with axes "Maturity" and "Complexity: increasing numbers of systems and/or services". Region labels: "Low downtime, high efficiency"; "Scalable but time lost in process"; "Works, but hard to scale up"; "Constant firefighting, high downtime".]
Tool Maturity Levels
1. Ad Hoc
OS GUI, CLI, or web administration interfaces.
2. Repeatable
Version control (RCS, SVN, Git), request tracker
3. Defined
Automatic monitoring (Nagios, monit, god)
4. Managed
Configuration management (AutomateIt, cfengine)
5. Optimized
SAGE Code of Ethics
• Professionalism
• Personal Integrity
• Privacy
• Laws and Policies
• Communication
• System Integrity
• Education
• Social Responsibility
http://www.sage.org/ethics/
Terry Childs Case
Network administrator for San Francisco
– CCIE who built city’s FiberWAN network
Terry was only person with router passwords
– IT department acknowledges knowing that
– He was on-call 24x7x365 to resolve issues
Terry refused to give passwords to boss
– Cited fears that they would be misused by
management, outside contractors.
What was the right thing for Terry to do?
Change Management
Change Management
Effective planning and implementation of changes to systems.
Changes should
1. Be well documented.
2. Have a backout plan.
3. Be reproducible.
Why do we need Change Management?
March 26-29, 2006: BART trains halted to
avoid running into each other when
computer systems crashed.
• Crashes on Monday/Tuesday resulted from
software maintenance upgrades.
• Crash on Wednesday resulted from installing a
backup system to avoid future crashes.
• Thousands of passengers stranded for several
hours each time.
Change Management
1. Plan change.
2. Test change on single system.
3. Test change on multiple systems.
4. File a change request.
5. Change committee approves request.
6. Schedule change.
7. Communication with users/admins.
8. Change systems at scheduled time.
9. Post-event analysis.
Testing Changes
• Automated checks.
– Sanity checks like Samba testparm.
– Reboot system.
• Test on one system first.
• Then test on set of systems.
– Dedicated test systems.
– System admin workstations.
– Virtual machines.
When do you need a Change Proposal?
Does the change impact critical services?
Critical machines/services
– Business critical: e-commerce server, etc.
– Essential services: routers, DNS, NFS, auth.
Non-critical machines/services
– Individual desktops
– Internal news web server
Change Proposal
1. Description of the change.
2. Systems impacted by change.
3. Why the change is being made.
4. Risks presented by the change.
5. Test procedure.
6. Backout plans.
7. How long the change will require.
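The seven parts of a change proposal listed above can be treated as a checklist that a request tracker might enforce before a proposal goes to the change committee. The sketch below is one possible shape, assuming free-text fields; the class and field names are hypothetical and do not come from any particular tracking system.

```python
from dataclasses import dataclass, fields


@dataclass
class ChangeProposal:
    """One free-text field per section of the change proposal."""
    description: str = ""
    systems_impacted: str = ""
    rationale: str = ""
    risks: str = ""
    test_procedure: str = ""
    backout_plan: str = ""
    duration: str = ""

    def missing_sections(self):
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def is_complete(self):
        return not self.missing_sections()


if __name__ == "__main__":
    draft = ChangeProposal(description="Upgrade Samba", systems_impacted="fs1")
    print("Still missing:", ", ".join(draft.missing_sections()))
```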
Communication
Communicate change to impacted people.
– What change is being made (nontechnical).
– Which services will be unavailable.
– When and for how long they will be unavailable.
– What actions they need to take (if any).
Communication issues
– If you send too many notes, they’ll be ignored.
– Send notices only to those impacted.
– Push critical notices; use pull for non-critical.
Scheduling
Type       Scope                               When       Notification
Routine    Single host or user.                Anytime.   Personal.
Major      Many hosts or users.                Off-peak.  Push.
Sensitive  None, but major impact on failure.  Off-peak.  Pull.
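The scheduling categories on this slide amount to a small decision rule. The sketch below encodes one reading of it; the precedence (scope is checked before failure impact) and the function name classify_change are assumptions made for illustration, not something the slide states.

```python
SCHEDULE = {
    "Routine": {"when": "Anytime", "notification": "Personal"},
    "Major": {"when": "Off-peak", "notification": "Push"},
    "Sensitive": {"when": "Off-peak", "notification": "Pull"},
}


def classify_change(affected_count, major_impact_on_failure):
    """Return (change_type, schedule) for a proposed change.

    affected_count: number of hosts or users whose service is interrupted.
    major_impact_on_failure: True if a failed change would hurt many users
    even though the change itself causes no planned outage.
    """
    if affected_count > 1:
        change_type = "Major"
    elif major_impact_on_failure:
        change_type = "Sensitive"
    else:
        change_type = "Routine"
    return change_type, SCHEDULE[change_type]


if __name__ == "__main__":
    print(classify_change(1, False))    # one desktop
    print(classify_change(250, True))   # mail server upgrade
    print(classify_change(0, True))     # no outage expected, risky if it fails
```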
Change Freezes
Time when only minor updates can be done.
– End of quarter or year.
– “Crunch time” for projects.
Backing Out
Decide back-out conditions before downtime
– Avoid the “just 5 more minutes” problem.
– Be sure that someone is keeping track of time.
Questions:
– How much time is required for back out?
– When is the latest time you can successfully back out?
– Will backing out this change prevent other changes from being committed?
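The second question has a simple arithmetic answer: the latest safe moment to start backing out is the end of the maintenance window minus the time the back-out takes. The sketch below works this through with Python's datetime module; the window end and the 45-minute back-out time are made-up example values, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def latest_backout_start(window_end, backout_duration):
    """Latest time a back-out can begin and still finish inside the window."""
    return window_end - backout_duration


def must_back_out_now(now, window_end, backout_duration):
    """True once the back-out deadline has been reached."""
    return now >= latest_backout_start(window_end, backout_duration)


if __name__ == "__main__":
    window_end = datetime(2024, 1, 7, 6, 0)   # service must be back at 06:00
    backout = timedelta(minutes=45)           # measured during testing
    print("Back out no later than", latest_backout_start(window_end, backout))
```

Fixing this deadline before the downtime starts is what prevents the "just 5 more minutes" problem.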
Backing Out: How to do it?
Service-level changes
Use revision control system to revert config.
Machine-level changes
Soft cutover: Old service is still running.
Hard cutover: Power up old server or restore from backups.
Snapshots
Snapshot VM before making change.
Revert to the snapshot if you need to back out.
Issues
Data migration and compatibility
Automatic Checks
Check integrity of critical files before use.
– Some services provide checks: LDAP, SMB.
– Check startup files by rebooting machine.
– Write your own checks for other files.
• Most people only do this after they have a problem.
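For files without a built-in checker, a home-grown check can be quite short. The sketch below validates passwd-format text before it is installed; the rules it applies (seven colon-separated fields, a unique non-empty username, numeric UID and GID) are a minimal set chosen for this example, and the function name is hypothetical.

```python
def check_passwd_text(text):
    """Return a list of problems found in passwd-format text (empty if none)."""
    errors = []
    seen = set()
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            errors.append(f"line {lineno}: blank line")
            continue
        fields = line.split(":")
        if len(fields) != 7:
            errors.append(f"line {lineno}: expected 7 fields, found {len(fields)}")
            continue
        name, _password, uid, gid = fields[:4]
        if not name:
            errors.append(f"line {lineno}: empty username")
        elif name in seen:
            errors.append(f"line {lineno}: duplicate username {name}")
        seen.add(name)
        if not (uid.isdigit() and gid.isdigit()):
            errors.append(f"line {lineno}: UID and GID must be numeric")
    return errors


if __name__ == "__main__":
    candidate = "root:x:0:0:root:/root\nbob:x:abc:100::/home/bob:/bin/sh\n"
    for problem in check_passwd_text(candidate):
        print(problem)
```

Running such a check before copying the file into place catches the error while the old, working file is still in use.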
Remote Administration
1. Network Access
2. SSH
3. Key-based Authentication
4. Console Access
5. X-Windows
6. VNC and NX
7. SSH tunneling
Network Access
Most tasks can be done from the shell.
File management.
Disk/volume management.
Troubleshooting and viewing logs.
Installing/removing software.
Start/stop network services.
Reboot/shutdown.
All we need is a way to invoke a shell across the network.
telnet
Ubiquitous network terminal protocol
telnet hostname
Similar protocols
rlogin -l user hostname
rsh -l user hostname command
Insecure
Data, including passwords, sent in the clear.
rlogin/rsh use ~/.rhosts for access w/o passwords.
ssh
Secure Shell
Replaces
telnet
ftp
rlogin
rsh
rcp
SSH Security Features
SSH: Protocols and Products
• SSH v1
– Insecure, obsolete.
– Do not use.
• SSH v2
– Current version.
• OpenSSH
• SSH Tectia
• F-Secure SSH
• PuTTY
• WinSCP
SSH Features
Secure login
ssh -l user host
Secure remote command execution
ssh -l user host command
Secure file transfer
sftp user@host
scp file user@host:/tmp/myfile
Port forwarding
ssh -L 110:localhost:110 mailhost
The Problem of Passwords
1. Good passwords are hard to remember.
2. Password transferred to remote system.
3. Automating remote access with passwords is difficult.
Public Key Cryptography
Two keys
– Private key known only to owner.
– Public key available to anyone.
Applications
– Confidentiality:
• Sender enciphers using recipient’s public key,
• Receiver deciphers using their private key.
– Integrity/authentication:
• Sender enciphers using own private key,
• Recipient deciphers using sender’s public key.
Key-based Authentication
SSH uses public-key authentication
Private key stored on your machine.
Public key stored on remote machines.
Public-key login protocol
1. Client sends server a login request.
2. Server issues a challenge.
3. Client responds with computation based on challenge and private key.
4. Server checks response with public key.
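The four-step login protocol can be imitated with a deliberately tiny example. The sketch below uses textbook RSA with two small primes purely to show the challenge and response flow; it is not how SSH computes its signatures, the key is trivially breakable, and every name in it is hypothetical.

```python
import hashlib
import secrets

# Toy key pair: textbook RSA with tiny primes (61 and 53). NOT secure.
N = 61 * 53   # modulus, part of both keys (3233)
E = 17        # public exponent, stored on the server
D = 2753      # private exponent, kept on the client; (E * D) % 3120 == 1


def digest(challenge):
    """Reduce the challenge to a number smaller than the toy modulus."""
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N


def server_issue_challenge():
    """Step 2: the server sends fresh random bytes."""
    return secrets.token_bytes(16)


def client_respond(challenge, private_exponent=D):
    """Step 3: the client computes a response using its private key."""
    return pow(digest(challenge), private_exponent, N)


def server_verify(challenge, response, public_exponent=E):
    """Step 4: the server checks the response using only the public key."""
    return pow(response, public_exponent, N) == digest(challenge)


if __name__ == "__main__":
    challenge = server_issue_challenge()
    response = client_respond(challenge)
    print("login accepted:", server_verify(challenge, response))
```

The private exponent never leaves the client, which is the property the protocol depends on.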
Using key-based authentication
1. Generate a public/private key pair.
ssh-keygen
Key files: id_rsa (private key, encrypted if a passphrase is set), id_rsa.pub (public key)
2. Copy public key to remote host
Append to ~/.ssh/authorized_keys.
3. Log in to remote host
ssh -l user remote
Keys are more secure than Passwords
1. Need to have two items to log in: key file and passphrase.
2. Neither private key nor passphrase is sent to remote host.
3. Machine-generated cryptographic keys are infeasible to guess, unlike passwords.
SSH Agent
Problem: you have to enter passphrase to decrypt the key each time you use ssh.
Solution: SSH Agent
> ssh-agent $SHELL
> ssh-add
Enter passphrase for /home/jw/.ssh/id_dsa: ********
Identity added: /home/jw/.ssh/id_dsa (/home/jw/.ssh/id_dsa)
> ssh -l jw host
SSH Agent Features
Agent support for entire session.
Start ssh-agent on initial shell.
X: ~/.xsession (Often enabled by default.)
Multiple keys
ssh-add keyfile
ssh-add -l
Remove keys
ssh-add -d keyfile
ssh-add -D
Remote Access when Server is Down
Problem: No network access to host.
Solutions:
– Go to computer room and bring host up.
– Specialized hardware (network boot / power).
– Virtual machines.
– Console servers.
Console Servers
Console
– Main I/O device for computer.
– Historically: serial terminal.
– Typically: keyboard/mouse/screen.
Server allows access to multiple consoles.
– Console access: BIOS, Bootloader, Kernel
– Eliminates need for keyboards, mice, monitors.
– Serial line to each machine from server.
– One user has R/W, other users have R access.
Console Hardware
Console server solutions
– Commercial: Cisco, Cyclades, Xyplex
– Open source: Conserver + serial expander card
Hardware issues
– Connectors: DB-9, DB-25, RJ-45
– Encoding: 8N1, 7E1
– Speeds: 9600 bps to 230 kbps
X-Windows
Server
– Handles user input and graphical display.
– Runs on the machine with display unit.
Clients (applications)
– Can run on a different machine than server.
• Set DISPLAY env var.
• Use -display option.
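The DISPLAY value has the form host:display.screen, where an empty host means the local machine and the screen number defaults to 0. The sketch below parses that form and computes the conventional X11 TCP port (6000 plus the display number). It handles only this simple form, and the function names are hypothetical.

```python
X11_BASE_PORT = 6000


def parse_display(value):
    """Split a DISPLAY string into (host, display_number, screen_number)."""
    host, sep, rest = value.rpartition(":")
    if not sep or not rest:
        raise ValueError(f"not a DISPLAY value: {value!r}")
    display, _, screen = rest.partition(".")
    return host, int(display), int(screen) if screen else 0


def x11_tcp_port(display_number):
    """TCP port an X server conventionally listens on for this display."""
    return X11_BASE_PORT + display_number


if __name__ == "__main__":
    for value in ("unix3:1", ":0.1"):
        host, display, screen = parse_display(value)
        print(value, "->", host or "(local)", display, screen, x11_tcp_port(display))
```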
Window Manager
X client that provides features like:
– Move, resize, iconify, and kill windows.
– Window title bars.
– Popup menus.
Example window managers
– twm: Tab, primitive early window manager
– mwm: Motif, found on commercial UNIXes
– fvwm: Free, fast, very customizable.
– WindowMaker: NeXT-like, see also AfterStep.
[Screenshots: TWM, FVWM, WindowMaker]
Desktops
CDE
Common desktop env for commercial UNIXes.
GNOME
Standard Linux desktop based on GTK+.
KDE
Windows-like free desktop based on Qt.
Xfce
Lightweight desktop, also based on GTK+.
VNC: Virtual Network Computing
Why VNC?
1. Remote desktop access.
2. Helpdesk: control a remote desktop.
3. Persistent desktop.
4. Use same desktop from multiple clients.
5. Need Linux access from Windows.
6. Need Windows access from Linux.
What is VNC?
• Open remote desktop protocol.
• Many implementations
– RealVNC: VNC from original researchers.
– TightVNC: VNC with high compression.
– VNCj: Java VNC, can run within web browser.
– PalmVNC: VNC for Palm Pilots.
– UltraVNC: enhanced VNC, only for Windows.
Using VNC
1. Start VNC server
UNIX: vncserver
Win: Start menu>Programs>RealVNC>VNCServer
2. Write down server name and display number.
It will look something like unix3:1
3. Start VNC client
UNIX: vncviewer
Win: Start menu>Programs>RealVNC>VNCViewer
4. Enter server and display to connect to (from step 2).
5. A VNC remote desktop should appear.
Configuring and Troubleshooting
• On UNIX, VNC stores files under ~/.vnc
• Configuration: xstartup
– Indicates which X clients to start with server.
– Typically includes vncconfig application.
• Configuration: passwd
– Contains VNC server session password.
• Log files: host:display#.log
– Any errors should appear in these logs.
Securing VNC
VNC does not provide encryption.
Use ssh tunneling to encrypt login + data:
ssh -L 5901:remotehost:5901 remotehost
vncviewer localhost:1
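The port 5901 in the command above follows from the VNC convention that display number N is served on TCP port 5900 + N. The sketch below derives both commands from a host name and display number. It only builds the argument lists and runs nothing; the helper names are hypothetical.

```python
VNC_BASE_PORT = 5900


def vnc_port(display):
    """TCP port used by a VNC server for the given display number."""
    if display < 0:
        raise ValueError("display number must not be negative")
    return VNC_BASE_PORT + display


def tunnel_plan(remote_host, display):
    """Return the ssh and vncviewer argument lists for a tunnelled session."""
    port = vnc_port(display)
    return {
        # Forward the same port number locally, mirroring the slide.
        "ssh_args": ["ssh", "-L", f"{port}:{remote_host}:{port}", remote_host],
        # The viewer connects to the local end of the tunnel.
        "viewer_args": ["vncviewer", f"localhost:{display}"],
    }


if __name__ == "__main__":
    plan = tunnel_plan("remotehost", 1)
    print(" ".join(plan["ssh_args"]))
    print(" ".join(plan["viewer_args"]))
```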
Tunneling
Tunneling: Encapsulation of one network protocol in another protocol
– Carrier Protocol: protocol used by the network through which the information is travelling
– Encapsulating Protocol: protocol (GRE, IPsec, L2TP) that is wrapped around the original data
– Passenger Protocol: the original protocol whose data is being carried
ssh Tunneling
SSH can tunnel TCP connections
– Carrier Protocol: IP
– Encapsulating Protocol: ssh
– Passenger Protocol: TCP on a specific port
POP-3 forwarding
ssh -L 110:pop3host:110 -l user pop3host
– Uses ssh to login to pop3host as user
– Creates tunnel from port 110 (leftmost port #) on localhost to port 110 (rightmost port #) of pop3host
– User configures mail client to use localhost as POP3 server, then proceeds as normal
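To make the roles of the leftmost and rightmost port numbers concrete, the sketch below splits a forwarding specification of the form local_port:target_host:target_port and rebuilds the ssh command from its parts. It assumes the three-part form shown on the slide (no bind address, no IPv6 literals), and the function names are hypothetical. The command is only printed, not run.

```python
def parse_forward_spec(spec):
    """Split 'local_port:target_host:target_port' into its three parts."""
    parts = spec.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected local_port:host:port, got {spec!r}")
    local_port, target_host, target_port = parts
    return int(local_port), target_host, int(target_port)


def build_ssh_tunnel(local_port, target_host, target_port, login_host, user):
    """Return the ssh argument list for a local port forward."""
    spec = f"{local_port}:{target_host}:{target_port}"
    return ["ssh", "-L", spec, "-l", user, login_host]


if __name__ == "__main__":
    local_port, host, remote_port = parse_forward_spec("110:pop3host:110")
    print(f"listen on localhost:{local_port}, deliver to {host}:{remote_port}")
    print(" ".join(build_ssh_tunnel(local_port, host, remote_port, "pop3host", "user")))
```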
NX
Advantages over VNC:
Speed: fast enough to use over dialup.
Built-in ssh encryption.
Disadvantages
Immature code; hard to install + set up.
GPL client/server for Linux only.
Free Windows client; commercial server.
Key Points
Principles of System Administration
– Simplicity
– Clarity
– Generality
– Automation
– Communication
– Basics First
Changes should be
– Documented
– Reproducible (usually automated)
– Backed by a backout plan (use VM snapshots and version control)
– Approved with a formal process
Remote Administration includes
– ssh (default)
– console (for when network is down)
– remote desktop (rarely needed)
Tunneling via ssh can secure insecure protocols.