Network Administration - Pravin Shetty > Resume

Network Administration
Research and Analysis
Week-7
Theory of Network Admin
Burgess – Ch.11
• Science vs. Technology
• Studying Complex Systems
• Purpose of Observation
• Evaluation Methods and Problems
• Evaluating a Hierarchical System
• Deterministic and Stochastic Behaviour
• Observational Errors
• Strategic Analyses

A Scientific Basis for System Administration
• System admin has always involved experimentation
• Development of networks has led to an exponential increase in system complexity and a corresponding increase in the difficulty of management
• A purely mechanical approach may no longer be adequate: time for a theoretical basis…
• World-wide interest, encouraged by professional organisations (SAGE, USENIX, ACM, IEEE, ACS)
Network Admin Research
Science vs Technology
• System Admin studies are mostly “Applied Research”, which results in the development of a specialised toolset that solves a local/specific problem
• Some workers have attempted to collate results to form a more general technology of more permanent or global value.
• But this is not Science!
What is “Science”?
The Scientific Method
• Knowledge is advanced by a series of studies that either verify or falsify a hypothesis
• A study may be theoretical or practical, but all contribute to a larger ongoing discussion that leads to progress
• A single study is rarely the end of the discussion
• Each study is usually repeated and verified or challenged by other researchers
• Reproducibility is very important

Scientific Method
• Motivation – statement of context and objectives
• Appraisal of problems
• Theoretical Model – used to understand or solve problems and provide a framework for comparison and measurement
• Design an Experiment – the Approach
• Perform an Experiment – obtain Results
• Evaluation or Verification of Approach and Results

Scientific Method
• Science is a dialog of Theories
• Science proceeds by Experiment
• Need Theory to interpret observations
• Need observations to disprove Theory
Network Admin Research:
Studying Complex Systems
• Areas of study in System Admin have been Technical and/or Behavioural and include:
– Reliability studies
– Finding and evaluating methods for system integrity
– Observations which apply to non-linear behaviour
– Issues related to strategy and planning
• Mostly Empirical or Qualitative case study
Purpose of Observation
• Gather info about a Problem to enable development of a Technology which solves it
• To evaluate the Technology for effectiveness (i.e. whether it fulfils its design goals)
• But evaluation of SysAdmin experiments is difficult due to Vested Interests and lack of clearly defined metrics
Evaluation Methods and some Problems
• Ideally there should be a repeatable test yielding measurements
• The trouble is that, while a good system administrator could do this heuristically, the results are hard to evaluate:
– Very difficult to quantify
– Different SysAdmins work in different ways
– Extreme variability in systems and users
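A repeatable test that yields measurements could be sketched as follows. This is only an illustration under simple assumptions: the task being timed is a stand-in, and the names `repeat_measurement` and `summarize` are hypothetical rather than taken from Burgess.

```python
import time
from statistics import mean, stdev


def repeat_measurement(task, repetitions=5):
    """Run the same task several times and return each run's duration in seconds."""
    durations = []
    for _ in range(repetitions):
        start = time.perf_counter()
        task()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Mean and sample standard deviation of the measured durations."""
    return mean(durations), stdev(durations)


if __name__ == "__main__":
    # Stand-in workload; a real study would time an administrative task.
    runs = repeat_measurement(lambda: sum(range(100_000)), repetitions=10)
    average, spread = summarize(runs)
    print(f"mean {average:.6f} s, standard deviation {spread:.6f} s")
```

Reporting the spread alongside the mean matters here: if two administrators or two systems differ by less than the spread, the test has not really distinguished them.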
Some Research Topics
• Efficiency & Automation
• Network Administration methods/models
• Reliability Studies
– Fault management
– Metrics
– Patterns of events
• Prediction & performance
E.g. A Common Research Topic and the Problems
• Ways to relieve Administrators of tedious work, so they can use their talents better in other ways. What sort of experiment is needed?
• Measure time spent working on a system
– but the time required usually expands to occupy the time available!
• Record actions of an automatic system and compare with those of a human administrator
– but this depends on the person: different people do things in different ways
Network Admin Research:
Effect of Vested Interests…
• SysAdmins require tools…
• Such tools often acquire a dedicated following of users who grow to like them regardless of what the tools allow them to achieve
• Marketing skills of one software vendor might be better than others and create a bias in the marketplace that affects the perceived usefulness of a particular tool
• So one cannot estimate the effectiveness of a tool based just on the number of those who use it

Evaluating Hierarchical System
• What level of detailed decomposition of levels within the hierarchy is appropriate?
• Building a model of the hierarchy is often the best way to address complexity – focus on what’s important or practical
• Experiments based on this model might then involve
– Measurements
– Simulations
– Case studies
– User surveys
Faults
IEEE classify software anomalies as:
• O/S crash
• Program hang
• Program crash
• Input problem
• Output problem
• Failed required performance
• Perceived total failure
• System error message
• Service Degraded
• Wrong output
• No output

Most common faults for SysAdmin are:
• Input Problem
– Missing or inappropriate configuration
• Failed performance
– Usually through loss of resources
• Software problems can be eliminated by re-evaluation of individual software components
Reliability and Redundancy
• $R = \dfrac{\text{Mean Uptime}}{\text{Total Elapsed Time}}$
• Average (Mean) time before failure
• With parallel or redundant components: $\dfrac{1}{R_{\text{parallel}}} = \dfrac{1}{R_1} + \dfrac{1}{R_2} + \dots + \dfrac{1}{R_n}$
• With serial or dependent components: $R_{\text{series}} = R_1 + R_2 + R_3 + \dots$
• Probability of Failure: $P(t) = \exp(-t/R)$
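The uptime ratio and the series/parallel combination rules can be tried out with a few lines of Python. This is a minimal sketch, not code from Burgess: it assumes each component value is read as a failure rate that combines like an electrical resistance (values add in series, reciprocals add in parallel), and the function names are hypothetical.

```python
def reliability(mean_uptime, total_elapsed):
    """Fraction of the elapsed time during which the system was up."""
    if total_elapsed <= 0:
        raise ValueError("total_elapsed must be positive")
    return mean_uptime / total_elapsed


def series(rates):
    """Serial (dependent) components: the individual values add."""
    return sum(rates)


def parallel(rates):
    """Parallel (redundant) components: the reciprocals add."""
    return 1.0 / sum(1.0 / r for r in rates)


if __name__ == "__main__":
    # Illustrative numbers only.
    print("uptime fraction:", reliability(99.0, 100.0))
    print("two components in series:", series([0.1, 0.2]))
    print("two components in parallel:", parallel([0.2, 0.2]))
```

Under this reading, adding a dependency in series always makes the combined value worse, while adding a redundant component in parallel always makes it better.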
MTBF and Computers
• Computer system MTBF doesn’t account for:
– Dependency – not all systems have the same attachments
– Fail-over and latency of service: systems may fail, then recover after a single delay, and this may occur repeatedly!
– Patterns of usage: user behaviour may bias the outcome
Some Metrics
• Net
– Total number of packets
– Amount of IP fragmentation
– Density of Broadcast messages
– Number of Collisions
– Number of Sockets(TCP) in and out
– Number of malformed packets
Some Metrics
• Storage
– Disk Usage in Bytes
– Disk Operations per Second
– Paging rate (free memory and thrashing)
Fig 11.2 Daily paging data
Error bars exceed variation of data!
Fig 11.3 Weekly paging data
Also showing extreme variation
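One way to check a statement like "error bars exceed variation of data" is to compare the typical standard deviation within each time bin against the spread of the bin means. The sketch below does this for made-up numbers; the sample values and function names are hypothetical and are not the data behind Fig 11.2 or Fig 11.3.

```python
from statistics import mean, stdev


def summarize_by_bin(samples):
    """Map each time bin to (mean, sample standard deviation) of its measurements."""
    return {label: (mean(values), stdev(values)) for label, values in samples.items()}


def error_bars_exceed_variation(summary):
    """True when the average error bar is larger than the range of the bin means."""
    means = [m for m, _ in summary.values()]
    spread_of_means = max(means) - min(means)
    typical_error_bar = mean(s for _, s in summary.values())
    return typical_error_bar > spread_of_means


if __name__ == "__main__":
    # Hypothetical paging rates, keyed by hour of day.
    paging = {0: [5, 40, 12], 6: [8, 35, 20], 12: [10, 50, 15], 18: [7, 45, 18]}
    print(error_bars_exceed_variation(summarize_by_bin(paging)))
```

When the check returns true, any apparent daily or weekly pattern in the averages is smaller than the scatter of the measurements and should not be over-interpreted.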
Some Metrics
• Processes
– Number of privileged processes
– Number of non-privileged processes
– Maximum percentage CPU used in processes
Some Metrics
• Users
– Number logged on
– Total Number
– Average time spent logged on per user
– Load Average
– Disk Usage rise per session per user per hour
– Latency of Services
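Two of the metrics above, disk usage and load average, can be sampled with the Python standard library alone. The sketch assumes a Unix-like host (`os.getloadavg` is not available on Windows); the function names are hypothetical, and the rise is computed per hour only, not per session or per user.

```python
import os
import shutil
import time


def sample_metrics(path="/"):
    """Take one timestamped sample of disk usage and load average."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    load_1min, _load_5min, _load_15min = os.getloadavg()
    return {
        "timestamp": time.time(),
        "disk_used_bytes": usage.used,
        "load_average_1min": load_1min,
    }


def disk_usage_rise(first, second):
    """Growth in disk usage between two samples, in bytes per hour."""
    hours = (second["timestamp"] - first["timestamp"]) / 3600.0
    return (second["disk_used_bytes"] - first["disk_used_bytes"]) / hours


if __name__ == "__main__":
    print(sample_metrics())
```

Collecting such samples at regular intervals gives the time series from which averages and error bars can later be computed.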
Distributions
• Delta – constant X
• Uniform – constant Y
• Gaussian or Random
• Normal – “bell curve”
• Black-Body or Planck – approx exponential
• Poisson – random arrival with mean rate
• Pareto – Power Law
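Several of these distributions can be sampled with Python's standard `random` module, which is a quick way to see how differently they behave. The sketch below is illustrative only: the parameter values are arbitrary, the function names are hypothetical, and Poisson arrivals are generated by the usual construction of summing exponentially distributed gaps.

```python
import random


def poisson_arrivals(rate, horizon, rng):
    """Arrival times in [0, horizon) for random arrivals with the given mean rate."""
    times = []
    t = rng.expovariate(rate)  # gap before the first arrival
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)  # exponential gap to the next arrival
    return times


def draw_samples(n, rng):
    """Draw n values each from a uniform, a Gaussian and a Pareto distribution."""
    return {
        "uniform": [rng.uniform(0.0, 1.0) for _ in range(n)],
        "gaussian": [rng.gauss(0.0, 1.0) for _ in range(n)],
        "pareto": [rng.paretovariate(2.0) for _ in range(n)],
    }


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so that runs are reproducible
    arrivals = poisson_arrivals(rate=2.0, horizon=1000.0, rng=rng)
    print("arrivals per unit time:", len(arrivals) / 1000.0)
    samples = draw_samples(10_000, rng)
    print("largest Gaussian value:", max(samples["gaussian"]))
    print("largest Pareto value:", max(samples["pareto"]))
```

Comparing the largest Gaussian and Pareto values shows the practical meaning of a power law: rare but very large values occur far more often than a bell curve would suggest.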
Theory of System Admin
(end)
