PPT - Eurecom
Download
Report
Transcript PPT - Eurecom
Network Modeling (NetMod): Mon. 1:30-4:45pm
Instructor: Thrasyvoulos Spyropoulos
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
A Few Words About Your Teacher
1995-2000: Undergraduate studies in Greece
National Technical University of Athens (NTUA)
Specialization: Telecommunications and Networking
2000-2006: MSc and PhD in Los Angeles, California
University of Southern California (USC)
Thesis: Perf. Analysis and Protocols for Wireless Networks
2006-2007: INRIA, Sophia-Antipolis (next door)
Post-doc at PLANETE group (now DIANA)
2007-2010: ETH, Zurich
Senior Researcher/Lecturer
2010-present: EURECOM
Thrasyvoulos Spyropoulos / [email protected]
Professor
Eurecom, Sophia-Antipolis
2
A Few Words About the Class
Goal: to teach some mathematical tools that are valuable, when
trying to understand real networks…in fact real systems!
STOCHASTIC PROCESSES
NETWORK SCIENCE
Learn to deal with Randomness
Modeling Large Networks
Cloud Computing
APPLICATIONS
Web Server Farms
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
4G/5G Networks
Web Server Farms
3
Why Randomness?
Many things in nature are random => same for networks
more precisely: increased complexity described as randomness
Randomness in:
Propagation phenomena (coding, diversity)
Location and mobility of nodes (handoff)
Traffic/Service arrival patterns (cellular capacity allocation)
Next link to be clicked on a webpage (browser prefetch)
Size of files downloaded (cache sizing)
Computing job arrivals and duration (cloud computing)
Number of (facebook) friends per user (advertising)
…
Computer Science Approach
• Design algorithm
• Deal with worst case
Thrasyvoulos Spyropoulos / [email protected]
Electrical Engineering Approach
• Optimize for probable cases
• Ignore rare events
Eurecom, Sophia-Antipolis
4
Why Large Networks?
Network of Internet Routers Online Social Nets (FaceBook)
Mesh Networks
Most networks can be modeled
as a large graph
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
5
Need a Science of (Social/Large/Complex) Networks
Difficult to study/model a specific graph
Specific graph: an instance of a random graph
with specific qualitative properties
“Complex/Social Network Analysis or Network
Science” the study of qualitative properties
of large graphs/networks
Degree distribution, diameter, connectivity, clusters
(a) WHY do these properties arise? (Scientist)
(b) HOW can they be exploited? (Engineer)
Degree distribution searching, security
Clustering advertising, information
spreading
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
6
Where Do They All Come Together
Many interesting problems on Networks can be modeled as a
Random Process on a (Random) Graph
Searching
Resource Allocation in 4G/5G
Malware Spread
A timely course! Few courses around on these topics
Performance Analysis classes from CalTech and Carnegie Mellon Univ
Complex/Social Networks classes from CalTech and Cornell University
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
7
Course Textbooks - Reading Material
PART I: Stochastic Processes and Queueing
“Performance Modeling and Design of Computer Systems”
by Mor-Harchol-Balter – shared copies in library – online:
http://proquestcombo.safaribooksonline.com/book/electricalengineering/computer-engineering/9781139610834
“Stochastic Processes” by S. Ross – library
“Introduction to Probability Models” by S. Ross – library
PART II: Complex Network Analysis
“Networks, Crowds, and Markets: Reasoning About a
Highly Connected World” by D. Easley and T. Kleinberg –
pdf freely available online
“Networks: An Introduction” by M. Newman – shared copy
in library
Additional reference material (tutorials, articles)
per topic
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
8
Course Evaluation (Grading)
Regular Homeworks and 2 Labs --- 20% of Grade
Midterm Exam (after Part I) --- 30% of Grade
Final Exam --- 50% of Grade
Participation --- extra credit!
Office Hours: TBD
Class Web Site:
http://www.eurecom.fr/~spyropou/netmod2016.htm
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
9
Course Expectations
What to expect from me:
To make the class entertaining
Many examples and application
Interaction Interaction Interaction!
To teach you key insights
Not just “tools” when to use which tool
why it works
What I expect from you:
To study the assigned book chapters!
To work hard on your homeworks
Interaction Interaction Interaction!!!
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
10
Course Prerequisites
Introductory Probability Theory
Distributions (bernoulli, geometric, binomial, gaussian, poisson)
Expectations, Variance, etc.
Conditional probabilities and expectations
Independence and Correlation
Review Reference: “Introduction to Prob. Models” or “A First
Course in Probability” by Sheldon Ross – available in library
(very!) Elementary Linear Algebra
Matrix multiplication
Solving Linear Systems
Eigenvalues
Check out Gilbert Strang’s online lectures for a refresher
(excellent!)
A tiny bit of MatLab
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
11
Probability Refresher: Playing the Odds
Betting on the Roulette
18 red
18 black
2 green
John observes the roulette and counts
5 reds in a row
Q: In the next roll should he bet on red
or black?
Q: What if John sees 20 reds in a row??
On August 18, 1913, at the casino in Monte Carlo, black came up a record
John: “Itimes
should
bet on black!
21 reds
in awas
row
are VERY unlikely!
twenty-six
in succession
in roulette…
There
a near-panicky
rush to
bet
on red, “No,
beginning
about the
black hadRolls
comeare
up aindependent!”
phenomenal fifteen
James:
it makes
no time
difference!
times. …players doubled and tripled their stakes, led to believe after black came
Q:
Prob{20 reds in a row followed by a black}?
up the twentieth time that there was not a chance in a million of another repeat.
In the
end thereds
unusual
runrow}?
enriched the Casino by some millions of francs.
Prob{21
in a
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
12
Probability Refresher: Change or Not?
1. Pick a door: 1 door has a prize! The other 2 have goats
Q: A door with a goat is opened: do you want to change your
chosen door, or stay with the one you have?
A: Switching gives you a 2/3 chance to win! Look up Monty Hall
problem – Conditional probabilities
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
13
Probability Refresher: You Be The Judge
A terrible crime has occurred in city A, and John is one
suspect
A DNA matching that of John is found in the crime scene
This is the only evidence against John
Two DNAs matching have a 1 in a million chance
The prosecutor and jury conclude John is guilty
Q: Were they right?
City A has about 10 million people.
Q: What is the chance that John is innocent?
A: John is innocent with a 90% chance!!!
BAYES RULE: P[Innocent | Evidence]
Thrasyvoulos Spyropoulos / [email protected]
P[Evidence | Innocence]P[Innocence]
P[Evidence]
Eurecom, Sophia-Antipolis
14
Probability Refresher: Network Bootstrapping
Sensor node bootstrapping
Each node has a unique ID
Goal: each node needs to broadcast
x.y.z
its ID to all other nodes
Protocol
Node X picks a slot n uniformly
in [1,N] P{n = i} = 1/N
Broadcasts its ID in slot n
t
SUCCESS: no other node picked n
COLLISION: 2 or more nodes picked n nodes fail and stay “off”
Tradeoff: Low N many collisions || High N long delay
Q: If 30 nodes, what is the minimum N P{collision} < 10%?
A: 200? 500?
N 1000?
> 45005000?
(look up “birthday paradox”)
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
15
Why Modeling and Performance Analysis?
Amazon Cloud
Traffic
from
different
clients
Throughput?
Delay per Client?
Best Algorithm? E.g. Resource
Allocation (CPU/Network/Memory)
Identify bottlenecks: Improve Eurecom WiFi:
a)
b)
Install more Access Points? (network is too sparse)
Propose a better channel selection algorithm? (high interference)
Knowing perf. analysis can make you good consultants :-)
“All models are bad! But some are useful”
by statistician George Box
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
16
Upgrading the Company’s Web Server
Job requests /
per second
2x
Jobs served / per second
?
Your company has just got very positive publicity The
incoming load (arrival rate of requests) to the web server
is expected to double as a result
Your boss tells you that you need to upgrade the server
with a faster one (higher service rate μ) to ensure the
same mean response time E[T]
Q: How much should you increase the service rate?
a)
b)
c)
Double the server speed?
More than double the speed?
Less than double the speed?
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
17
Supermarket Queues and FDMA vs. CSMA
30 cust/hour
10 cust/hour
OPTION 1
3 slow cashiers
3 lines randomly
choose line and stay there
Each cashier: 10
customers per hour
Thrasyvoulos Spyropoulos / [email protected]
OPTION 2
1 fast cashier
single line
30 customers per
hour
Eurecom, Sophia-Antipolis
18
Supermarket Queues and FDMA vs. CSMA
30 cust/hour
10 cust/hour
Option 1
Option 2
Q: Which options has the smallest waiting time?
A: Option 2 is 3x faster!
- Option 1 or Option 2?
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
19
19
Supermarket Queues and FDMA vs. CSMA (2)
30 cust/hour
10 cust/hour
Option 2
Option 3
Q: Which options has the smallest waiting time?
A: Similar delay (for high load)
- Option 2 or Option 3?
- Low load: Option 2 or Option 3? A: Option 2 up to 3x faster
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
20
Supermarket Queues and FDMA vs. CSMA (3)
How does all these relate to networking??
600KHz
OPTION 1: FDMA
Separate 200KHz channel to each
Flows do not compete
600KHz
OPTION 2: CSMA
Each node senses the channel first
If idle transmit pkt using 600KHz
If busy queue (wait)
Q: Which option would you prefer for data?
Q: Which option would you prefer for voice?
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
21
Google’s PageRank: Searching for Web Pages
What lies behind this simple box??
Searching the Web
Step 1: crawl all web pages and create index of {keywords-web pages}
Step 2: User enters keywords (e.g. “Network Modeling”)
Step 3: Google finds all web pages matching these keywords
(these 3 steps are generic to almost every search engine)
Step 4: Return a list of matches ranked by importance. HOW???
Intuition 1: page important if many web pages refer to it
Intuition 2: page important if important web pages refer to it
Solution: PageRank algorithms solves an appropriate Markov Chain
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
22
Discrete Time Markov Chains
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
(Discrete-Time) Markov Process
X(n): the state/value of a process at the n-th period (time slot)
X(n) is random and takes values in a finite or countable set
The sequence X(1), X(2), …, X(n) could be
a)
b)
c)
The values of a stock each day
The web page a user is currently browsing
The Access Point(AP) a moving user is currently associated with
We would like a probabilistic model for X(1),X(2),…,X(n)
a)
b)
X(i) are independent => not so realistic (for above examples)
X(i+1) depends on X(i) => not a bad model to start with
P{X(n+1) = j | X(n) = i,X(n-1) = in-1,…,X(1) = X0} = P{X(n+1) = j | X(n) = i} = pij
(Markov Property)
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
24
Transition Matrix and Properties
Assume state space: 0, 1, 2, …
Pij : P{S(n+1) = j | S(n) = i}
p
ij
1, i
j
Stationary: P{S(n+1) = j | S(n) = i} = P{S(1) = j | S(0) = i}
for any n
pij define a transition matrix P =
Thrasyvoulos Spyropoulos / [email protected]
p00
p10
p20
Eurecom, Sophia-Antipolis
p01 p02
p11 p12
p 21 p22
25
Graphical Representation of Markov Chains
Simple weather model
X(n): weather after n days
Simple channel error model
q
p
(total) prob of error = 0.1
Case 1: q = 0.1, p=0.9 (uncorrelated)
Case 2: q = 0.9 (burst of errors)
1-p
good
bad
1-q
What is the transition
matrix P in both cases?
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
26
Markov Chain Models (cont’d)
A simple handover model
cell -> state
need to find transition probabilities
depend on road structure, user profile,
statistics
0.8
0.2
user on the phone
0.6
0.5
0.3
0.4
0.4
0.2
0.7
0.15
0.2
Cellular Network
Thrasyvoulos Spyropoulos / [email protected]
Markov Chain
Eurecom, Sophia-Antipolis
27
2-step transition probabilities
So far we know probabilities for next step: S(n)->S(n+1)
Probabilities for S(1) -> S(n)?
PS(2) j | S(0) i
PS(2) j | S(1) k,S(0) i PS(1) k | S(0) i
k
P{S(2) j | S(0) i} pik pkj
k
If P(2) = {pij(2)} denotes the 2-step trans. probability matrix,
then P(2) = P • P
Pij(2): multiply row I with column j
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
28
n-step transition probabilities
Generalizing for n steps:
(l)
PS(n) j | S(0) i p(m)
p
(m l n)
ik
kj ,
k
Kolmogorov-Chapman equations
P(n) = P•P…P = (P)n (n-step transition probabilities)
P(n) = P(m) • P(l)
e.g. P(5) = P(4) • P(1) = P(2) • P(3) = P(2) • P • P(2)
Initial state vector s(0): e.g. (0,1,0,0) or = (0.2,0,0.6,0.2)
Q: Where after n steps?
A: s(n) = s(0) • P(n)
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
29
An example: Pre-Fetching Web Pages
Your iPhone browser can pre-fetch pages to improve speed
The current Web Page being browsed has 3 links
Study of click statistics have shown that
A user clicks link 1 next with prob 0.8
Link 2 with prob 0.1
Link 3 with prob 0.1
It seems reasonable that the browser pre-fetches link 1
Assume now a tiny subset of the Web with the following
transition matrix
A B C D
Start at page A
S(0) = [1, 0, 0, 0]
S(1) = [0.2, 0.2, 0.5, 0.1]
S(2) = [0.13, 0.29, 0.24, 0.34]
page A 0.2
page B 0.1
page C 0.1
page
D
0.2
0.1
0.2 0.1 0.6
0.4 0.2 0.3
0.1 0.2 0.5
0.2 0.5
Q: Which web page should the browser pre-fetch?
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
30
Limiting distribution
Q: What happens to pij(n) as n goes to infinity?
Q: What is the lim P(n) as n->∞?
Q: Does it always converge?? (we’ll see this later)
If limit exists, then lim pijn πj for any initial state i
n
π = {π0, π1,…, πm} is called the stationary distribution
IF π• P = π 𝜋𝑗 =
𝜋𝑖 ∙ 𝑝𝑖𝑗
and Σiπi = 1
The above equation can be used to find π
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
31
Getting the Stationary Distribution: Examples
The weather system
0.8 0.2 (π
sunny , πrainy ) (3/4,1/4)
P
0.6 0.4
Data rates supported for outdoor user
e.g. (128Kbps, 320Kbps, 1 Mbps)
0.4 0.4 0.2
2 3 2
P 0.2 0.6 0.2 (π128, π320, π1024) ( , , )
7 7 7
0.3 0.2 0.5
What about this one?
0.4 0.6 0
P 0.3 0.7 0
0
0 1
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
32
A Definition Prelude (Part I)
0.4 0.6 0
P 0.3 0.7 0
0
0 1
(Reachability) State j is reachable by i Pij(n) > 0 for some n
denote ij
(Communication) states i and j communicate (i <-> j) if ij and ji
(Recurrence/Transience) Denote fi the probability to ever return to i,
starting from i
(n)
Transient) state is transient if fi < 1 n0 pii (finite num of returns)
number of returns is geometric (fi): why?
Recurrent) state i is recurrent if fi = 1 n0 pii(n) (infinite num of returns)
Positive recurrent) expected time between visits to i is finite
Null recurrent) expected time between visits is infinite (strange!?)
Periodicity) state i is periodic, if exists d such that pii(n) = 0, if n ≠ kd
Largest d with above property is called period
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
33
A Definition Prelude (Part II)
0.4 0.6 0
P 0.3 0.7 0
0
0 1
Definition: A subset of states S: {for all i,j in S => i <-> j} is called a class
Lemma 1: recurrence is a class property
if i is recurrent and i <-> j, then j is recurrent
Proof (by contradiction): if j transient => can never return to j after some time =>
cannot be at i either (since there is always a chance to go from i to j)
Lemma 2: transience is a class property
Similar argument
Periodicity and positive/null-recurrence are also class properties
Irreducible) A Markov Chain is irreducible if it has only 1 class
A finite MC which is irreducible is always positive-recurrent
Theorem: If a DTMC is aperiodic, irreducible and positive recurrent
(“ergodic”) then it has a stationary distribution π = {π1,…,πN}
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
34
A Motivating Example: Google PageRank Algorithm
Search for “Network Modeling”
1000s of pages (courses, companies, etc.) contain these keywords
Goal: Make the page I’m interested appear in at least the top 10 pages
shown
Q: How should pages be ranked?
A1: A page is important if many links to this page
Q: What is wrong with this metric?
A: - link from Yahoo more important than link from /~spyropou
- can easily “fool” this! (create many dummy pages pointing to mine)
Q: How to fix this?
A: weigh a link from x to page p with the number of links into x
Q: Can this system be fooled?
A: Yes! Create 1000 dummy pages pointing to each other and mine
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
35
Google’s Solution
A page p has high rank (is important) if the pages pointing to it also
have high rank
Q: How is this different from before?
A: All the dummy pages pointing to each other would still be low rank, if no
“external” link from a high rank page
Q: Is a link from a page with 1000 other links as important as a link from a
page with 5 other links?
A: No! If page i has a link to j and k total outgoing links Pij = 1/k
Q: Where does this lead? How do we calculate the rank?
A: rank of page j = Σi { rank of page i * Prob of link ij}
i.e. rj = Σi ri Pij
(basis of) Page Rank Algorithm:
rank of pages = stationary prob. of an MC with transitions Pij
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
36
Page Rank Algorithm Pitfalls
Q: What about this tiny WWW?
A: πA = πN = 2/5, πM = 1/5
Q: But what about these two?
A: These two chains are reducible
and do not have a stationary distr.
Google’s heuristic: introduce additional arrows to avoid such
loops and dead ends (not necessarily optimal!)
Q: How does Google solve the huge system of eq (million pages)?
Thrasyvoulos Spyropoulos / [email protected]
Eurecom, Sophia-Antipolis
37