Columbia University Department of Computer Science
Scaling SIP Servers
Sankaran Narayanan
Joint work with CINEMA team
IRT Group Meeting – April 17, 2002
Agenda
Introduction
Issues in scaling
Facets of sipd architecture
Some results
Conclusion and Future Work
Introduction – SIP servers
SIP signaling – proxy, redirect
Proxies
Call routing by contact location
UDP/TCP/TLS
Stateful or stateless
Programmable scripts
User location – Registrars
SQL database
What is scale ?
Large call volumes, commodity hardware [Schu0012:Industrial]
Response times (mean, deviation), turnaround time
Goals
Delay budget [SIPstone]
R1 < 500 ms
R2 < 2 s
Class-5 switches handle > 750K BHCA
[Figure: REGISTER / 200 OK exchange and INVITE / 180 / 200 / ACK call flow, with the delays R1 and R2 marked]
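To put the Class-5 figure in per-second terms, the conversion from busy-hour call attempts (BHCA) to an average attempt rate can be sketched as below. The function name is hypothetical, and the result is an hourly average that says nothing about bursts within the hour.

```python
def calls_per_second(bhca: float) -> float:
    """Convert busy-hour call attempts to an average call-attempt rate."""
    return bhca / 3600.0  # one busy hour = 3600 seconds


class5_rate = calls_per_second(750_000)
print(f"750K BHCA is about {class5_rate:.0f} call attempts per second")
```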
Limits to scaling
Not CPU bound
OS resource limits
Network I/O – blocking
Wait for responses
Latency: Contact, DNS lookups
Open files (<= 1024 on Unix)
LWPs (Solaris) vs. user-kernel threads (Linux, Windows)
Try not to…
Customize and recompile the OS
Move (parts of) the server into the kernel (khttpd, AFPA, …)
The problem
Scaling CPU-bound jobs (throughput=1/delay)
Hardware: CPU speed, RAM, …
Software: better OS, scheduler, …
Algorithm: optimize protocol processing
Blocking (Network, Disk I/O) is expensive
Hypothesis
Make the server CPU-bound rather than I/O-bound by reducing blocking
Optimized resource usage – stability at high loads
Facets of sipd architecture
Blocking
Process models
Socket management
Protocol processing
Blocking
Mutex, event (socket, timeout), fread
Queue builds up
Potentially high variability
Tandem queue system
Easy to fix
Non-blocking calls (event driven, later!)
Move queue to different thread (lazy logger)
Logger
{
lock;
write;
unlock;
}
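The "lazy logger" idea above can be sketched as follows: request threads only enqueue a line, and a single background thread performs the slow write. This is a minimal illustration, not sipd's actual logger; the class name `LazyLogger` and its methods are hypothetical.

```python
import io
import queue
import threading


class LazyLogger:
    """Request threads enqueue log lines; one background thread does the I/O."""

    def __init__(self, sink):
        self._sink = sink                  # any object with write(), e.g. an open file
        self._lines = queue.Queue()        # unbounded in this sketch
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log(self, line: str) -> None:
        # Request path: put() takes a short internal lock but never waits on the sink.
        self._lines.put(line)

    def _drain(self) -> None:
        while True:
            line = self._lines.get()
            if line is None:               # sentinel pushed by close()
                break
            self._sink.write(line + "\n")

    def close(self) -> None:
        self._lines.put(None)
        self._worker.join()                # wait until queued lines are written


sink = io.StringIO()
logger = LazyLogger(sink)
logger.log("INVITE sip:bob@example.com")
logger.close()
```

The queue now builds up in the logger thread rather than in front of the request threads, so a slow disk delays log lines instead of call processing.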
Blocking (2)
Call routing involves one or more contact lookups
10 ms per query (approx.)
Cache
Works well for sipd-style servers
Fetch-on-demand with replacement (harder)
Loading the entire database is easy
Needs refresh: servers are long-lived
Potentially useful for DNS SRV lookups (?)
[Figure: SQL database feeding the cache through periodic refresh; cache lookup < 1 ms]
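A minimal sketch of the load-everything cache with periodic refresh, assuming a `load_all` callable that stands in for the SQL query returning every user's contacts. The class and method names are hypothetical; the real sipd cache may differ.

```python
import threading
from typing import Callable, Dict, List, Optional


class ContactCache:
    """In-memory copy of the contact table, reloaded wholesale on a timer."""

    def __init__(self, load_all: Callable[[], Dict[str, List[str]]]):
        self._load_all = load_all          # assumed to run the SQL query
        self._table = load_all()           # initial full load
        self._timer: Optional[threading.Timer] = None

    def lookup(self, user: str) -> List[str]:
        # Pure memory access: no database round trip on the call-routing path.
        return self._table.get(user, [])

    def refresh(self) -> None:
        # Build the new table first, then swap the reference in one assignment,
        # so readers never see a half-loaded table.
        self._table = self._load_all()

    def start_periodic_refresh(self, interval_s: float) -> None:
        def tick() -> None:
            self.refresh()
            self.start_periodic_refresh(interval_s)   # re-arm the timer

        self._timer = threading.Timer(interval_s, tick)
        self._timer.daemon = True
        self._timer.start()

    def stop(self) -> None:
        if self._timer is not None:
            self._timer.cancel()


# Usage with a dictionary standing in for the database.
db = {"alice": ["sip:alice@host1.example.com"]}
cache = ContactCache(lambda: {user: list(c) for user, c in db.items()})
db["bob"] = ["sip:bob@host2.example.com"]
stale = cache.lookup("bob")      # not visible until the next refresh
cache.refresh()
fresh = cache.lookup("bob")
```

The trade-off is staleness: a contact registered after the last load is invisible until the next refresh, which is why the refresh interval matters for long-lived servers.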
REGISTER performance
Single CPU Sun Ultra10
Response time is constant for Cache (FastSQL)
Process models (1)
One thread per request
Doesn’t scale
Too many threads over a short timescale
Stateless proxy: 2-4 threads per transaction
High load affects throughput
[Figure: incoming requests R1-4, each handled by its own thread; plot of throughput vs. load]
Process models (2)
Thread pool + Queue
Thread overhead less; more useful processing
Overload management: drop requests over responses, drop tail
Not enough if holding time is high
Each request holds (blocks) a thread
Fixed number of threads
[Figure: throughput vs. load]
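One way to read the overload policy above is sketched here: the queue in front of the thread pool is bounded, new requests are dropped when it is full (drop tail), and an arriving response may displace the most recently queued request. This is an interpretation of the bullet, not sipd's code; `OverloadQueue` and `is_response` are hypothetical names, and the lock that real worker threads would need is omitted to keep the sketch short.

```python
from collections import deque
from typing import Optional


def is_response(message: str) -> bool:
    # A SIP response starts with the version, e.g. "SIP/2.0 200 OK".
    return message.startswith("SIP/2.0")


class OverloadQueue:
    """Bounded work queue: drop tail, but prefer dropping requests over responses."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = deque()
        self.dropped = 0

    def offer(self, message: str) -> bool:
        if len(self._items) < self.capacity:
            self._items.append(message)
            return True
        if is_response(message):
            # Queue is full: evict the newest queued request to make room.
            for i in range(len(self._items) - 1, -1, -1):
                if not is_response(self._items[i]):
                    del self._items[i]
                    self._items.append(message)
                    self.dropped += 1
                    return True
        self.dropped += 1              # drop tail: the new arrival is discarded
        return False

    def take(self) -> Optional[str]:
        # Called by a free pool thread to fetch the next message.
        return self._items.popleft() if self._items else None


q = OverloadQueue(capacity=2)
q.offer("INVITE sip:a@example.com SIP/2.0")
q.offer("REGISTER sip:example.com SIP/2.0")
accepted_request = q.offer("INVITE sip:b@example.com SIP/2.0")
accepted_response = q.offer("SIP/2.0 200 OK")
```

Favoring responses makes sense because a response completes work that upstream elements have already invested in, while a dropped request will usually be retransmitted by the client.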
Stateless proxy (Solaris)
Turnaround time is almost constant for the stateless proxy
The sudden increase in response time is a client problem
UDP losses on Ultra10 at (120 * 6 * 500 * 8) bps = 2.88 Mb/s
Stateless proxy (Linux)
Request turnaround time breaks down
Response turnaround time is constant
Effect of high holding times and thread scheduling
How to set queue size – investigate?
Queue evolution for sipd
Number of requests (y-axis) waiting in the queue for a free thread on Solaris (left) and Linux (right) over a period of up-time (x-axis).
Process models (3)
Blocking thread model needs “too many” threads
Stateful transaction stays for 30 s
Return thread to free pool instead of blocking
Event-driven architectures
State transition triggered by a global event scheduler
OnIncoming1xx(), OnInviteTimeout(), …
SIP-CGI: pre-forked multiple processes
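The event-driven alternative can be sketched as a transaction object that holds only state, plus a scheduler that fires handlers such as `on_incoming_1xx()` and `on_invite_timeout()`. No thread is parked for the 30 s lifetime of a stateful transaction. The state names, the `Scheduler` class, and its logical clock are assumptions made for illustration; they are not sipd's design.

```python
import heapq


class InviteTransaction:
    """Holds transaction state only; no thread blocks while it waits."""

    def __init__(self, branch: str):
        self.branch = branch
        self.state = "calling"             # hypothetical state names

    def on_incoming_1xx(self) -> None:
        if self.state == "calling":
            self.state = "proceeding"

    def on_incoming_2xx(self) -> None:
        if self.state in ("calling", "proceeding"):
            self.state = "completed"

    def on_invite_timeout(self) -> None:
        if self.state != "completed":
            self.state = "timed_out"


class Scheduler:
    """Global event scheduler with a timer heap and a logical clock."""

    def __init__(self):
        self._timers = []                  # heap of (due_time, seq, callback)
        self._seq = 0                      # tie-breaker for equal due times
        self.now = 0.0

    def call_at(self, due: float, callback) -> None:
        heapq.heappush(self._timers, (due, self._seq, callback))
        self._seq += 1

    def advance(self, now: float) -> None:
        # Fire every timer that is due at or before the new time.
        self.now = now
        while self._timers and self._timers[0][0] <= now:
            _, _, callback = heapq.heappop(self._timers)
            callback()


sched = Scheduler()
tx = InviteTransaction("z9hG4bK-example")
sched.call_at(30.0, tx.on_invite_timeout)  # 30 s transaction lifetime
tx.on_incoming_1xx()                       # a 180 arrives from the network
sched.advance(10.0)
state_at_10s = tx.state
sched.advance(30.0)
state_at_30s = tx.state
```

In a real server the calls to `on_incoming_1xx()` would come from the socket loop and `advance()` would be driven by the wall clock; the point is that a small, fixed set of threads can drive any number of pending transactions.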
Socket management
Problem: open sockets limit (1024), “liveness” detection, retransmission
One socket per transaction does not scale
Global socket if downstream server is alive, soft state – works for UDP
Hard for TCP/TLS – connections
Worse for Java servers – no select, poll
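With one global UDP socket, responses for all transactions arrive on the same descriptor and have to be matched back to their transaction. A minimal sketch of that demultiplexing step follows, assuming transactions are keyed by the `branch` parameter of the topmost Via header. The names `SharedSocketDemux` and `top_via_branch` are hypothetical, and the socket itself is left out so the example stays self-contained.

```python
import re
from typing import Callable, Dict, Optional

_BRANCH = re.compile(r";\s*branch=([^;\s,]+)", re.IGNORECASE)


def top_via_branch(message: str) -> Optional[str]:
    """Return the branch parameter of the first Via header, if any."""
    for line in message.split("\r\n"):
        if line == "":                     # blank line: headers are over
            break
        name = line.split(":", 1)[0].strip().lower()
        if name in ("via", "v"):           # "v" is the compact form
            match = _BRANCH.search(line)
            return match.group(1) if match else None
    return None


class SharedSocketDemux:
    """Routes datagrams from one shared socket to per-transaction handlers."""

    def __init__(self):
        self._pending: Dict[str, Callable[[str], None]] = {}

    def register(self, branch: str, handler: Callable[[str], None]) -> None:
        self._pending[branch] = handler

    def unregister(self, branch: str) -> None:
        self._pending.pop(branch, None)

    def dispatch(self, datagram: str) -> bool:
        # In a server this would be called for each recvfrom() on the global socket.
        handler = self._pending.get(top_via_branch(datagram) or "")
        if handler is None:
            return False                   # no matching transaction
        handler(datagram)
        return True


demux = SharedSocketDemux()
seen = []
demux.register("z9hG4bKabc", seen.append)
ringing = ("SIP/2.0 180 Ringing\r\n"
           "Via: SIP/2.0/UDP proxy.example.com;branch=z9hG4bKabc\r\n\r\n")
matched = demux.dispatch(ringing)
unmatched = demux.dispatch(ringing.replace("z9hG4bKabc", "z9hG4bKzzz"))
```

This uses one file descriptor regardless of the number of open transactions, which sidesteps the 1024 open-file limit for UDP. TCP and TLS still need a descriptor per connection.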
Optimizing protocol processing
Not too useful if CPU is not the bottleneck
Text protocol – parsing, formatting overheads
Order of headers matters (Via)
Other optimizations (parse-on-demand, date formatting)
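Parse-on-demand can be illustrated with a message wrapper that splits header lines once but parses a header's value only when some code asks for it, then caches the result. This is a simplified sketch under stated assumptions: it ignores folded header lines and compact header names, and the `LazyMessage` class is hypothetical.

```python
from typing import Dict, List


class LazyMessage:
    """Splits a SIP message into lines up front; parses headers only on request."""

    def __init__(self, raw: str):
        head, _, self.body = raw.partition("\r\n\r\n")
        lines = head.split("\r\n")
        self.start_line = lines[0]
        self._raw_headers = lines[1:]      # kept unparsed
        self._parsed: Dict[str, List[str]] = {}
        self.parse_count = 0               # instrumentation for the example

    def header(self, name: str) -> List[str]:
        key = name.lower()
        if key not in self._parsed:
            self.parse_count += 1
            values = []
            for line in self._raw_headers:
                field, sep, value = line.partition(":")
                if sep and field.strip().lower() == key:
                    values.append(value.strip())
            self._parsed[key] = values     # cache for later calls
        return self._parsed[key]


msg = LazyMessage(
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.example.com:5060;branch=z9hG4bK1\r\n"
    "To: <sip:bob@example.com>\r\n"
    "\r\n"
)
via_first = msg.header("Via")
via_again = msg.header("via")              # served from the cache
```

A stateless proxy that only needs Via and the request URI never pays for parsing the remaining headers.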
...
Conclusion
Unlike web servers: can be stateful, less disk I/O, less impact of TCP stack/behavior, …
Pros: UDP, stateless routing, load-balancing using DNS, …
Challenges: scaling the state machine
Towards 2.5M BHCA (3600 messages/s)
Event driven architecture (SEDA?)
Resource management (file limits, threads)
Tuning operating system (scheduler, …)
Future work
Stateful proxy performance
Evaluate event driven architecture
Effect of request forking (> 1 contacts) on server behavior
Programmable scripts
Queue management and overload control
Other types of servers (conference servers, media servers, etc.)
References
CINEMA web page. http://www.cs.columbia.edu/IRT/cinema
[Schu0012:Industrial] H. Schulzrinne. “Industrial strength internet telephony,” Presentation at 6th SIP bakeoff, Dec. 2000.
[SIPstone] H. Schulzrinne et al. “SIPstone – Benchmarking SIP server performance,” CS Technical report, Columbia University.