Multicasting


TDC561
Network Programming
Week 8:
Multicasting; Socket Options;
Camelia Zlatea, PhD
Email: [email protected]
References
 W. Richard Stevens, UNIX Network Programming, Volume 1: Networking
APIs: Sockets and XTI, 2nd edition, 1998 (ISBN 0-13-490012-X)
– Chap. 7, 11, 19, 21, 22
Network Programming (TDC561)
Winter 2003
Page 2
Addressing in the Internet
 Addressing tied to reachability
– Every host interface has its own IP address
– Router interfaces usually have their own IP addresses
 IP is version 4 (IPv4 addresses)
– 4 bytes long
– two part hierarchy
» network number and host number
– different types of boundary indicator
» class, subnet mask, prefix
– Goal of boundaries is address aggregation
Address classes
 Historical first choice
– fixed network-host partition, with 8 bits of network number
 Generalization
– Class A addresses have 8 bits of network number
– Class B addresses have 16 bits of network number
– Class C addresses have 24 bits of network number
 Distinguished by leading bits of address
– leading 0 => class A (first byte < 128)
– leading 10 => class B (first byte in the range 128-191)
– leading 110 => class C (first byte in the range 192-223)
– leading 1110 => class D (multicast)
– leading 1111 => class E (reserved)
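The leading-bit test above can be sketched in C as follows. This is only an illustration: the names ipv4_class and ipv4_class_demo are made up here, and the address is assumed to be in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Return the historical class letter of an IPv4 address (host byte
 * order) by testing the leading bits of its first byte. */
char ipv4_class(uint32_t addr)
{
    uint8_t first = (uint8_t)(addr >> 24);
    if ((first & 0x80) == 0x00) return 'A';  /* 0xxxxxxx: 0-127   */
    if ((first & 0xC0) == 0x80) return 'B';  /* 10xxxxxx: 128-191 */
    if ((first & 0xE0) == 0xC0) return 'C';  /* 110xxxxx: 192-223 */
    if ((first & 0xF0) == 0xE0) return 'D';  /* 1110xxxx: 224-239 */
    return 'E';                              /* 1111xxxx: 240-255 */
}

/* Usage example with two addresses that appear later in these notes. */
void ipv4_class_demo(void)
{
    assert(ipv4_class(0x8CC00108u) == 'B');  /* 140.192.1.8 */
    assert(ipv4_class(0xE1010203u) == 'D');  /* 225.1.2.3   */
}
```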
Address evolution
 Class based scheme was too inflexible
 Two problems
– Too many routes
– Too few addresses
 Four extensions
– Subnetting (flexible boundaries within network)
– CIDR (flexible grouping of networks- Classless Interdomain Routing)
– Dynamic host configuration (reuse of addresses)
– A bigger address (IPv6)
 One issue
– Network address translation
What is Multicast?
 Multicast is a communication paradigm
– 1 source, multiple destinations
 Applications:
– bulk-data distribution to subscribers
» (e.g., newspaper, software, and video tapes distribution),
– connection-time-based charging data distribution
» (e.g., financial data, stock market information, and news ticker
broadcasting),
– streaming (e.g., video/audio real-time distribution),
– push applications, web-casting,
– distance learning, conferencing, collaborative work, distributed
simulation, and interactive games.
The Internet group model
– multicast/group communications means...
» 1 → n as well as n → m
– a group is identified by a class D IP address
(224.0.0.0 to 239.255.255.255)
» abstract notion does not identify any host
[Figure: multicast group 225.1.2.3, from the logical view (one source, several receivers) to the physical view (hosts host_1, host_2, host_3 on Ethernets at site 1 and site 2, connected across the Internet by multicast routers that form the multicast distribution tree). Addresses shown: 140.192.1.8, 140.192.1.6, 216.47.143.60.]
IP Multicast: Basic Idea
 Multicast groups: abstract “rendez-vous” points.
 Set up optimal spanning tree spanning participants for
each group.
 Make it cheap by not providing strong guarantees: send
out packets and hope for the best.
The Internet group model (cont’)
 the group model is an open model
– anybody can belong to a multicast group
» no authorization is required
– a host can belong to many different groups
» no restriction
– a source can send to a group, no matter whether it
belongs to the group or not
» membership not required
– the group is dynamic, a host can subscribe to or leave
at any time
– a host (source/receiver) does not know the
number/identity of members of the group
Mapping IP Multicast onto Ethernet Multicast
 IP Multicast (class D IP address):
– Class D: 224.x.x.x-239.x.x.x (in hex: Ex.xx.xx.xx): 28-bit group identifier
– No further structure (unlike Class A, B, or C)
– Not addresses but identifiers of groups
– Some of them are assigned by the IANA to permanent host groups
 Mapping a class D IP adr. into an Ethernet multicast adr.
– The low-order 23 bits of the Class D address are copied into the low-order
23 bits of the Ethernet multicast address
– Many-to-one mapping: 5 of the 28 group bits are not used, so 32 IP groups
share each Ethernet multicast address
– More filtering has to be done at the IP level
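A minimal sketch of the mapping just described, assuming the group address is given in host byte order. The function name mcast_ip_to_mac is hypothetical; the 01:00:5e prefix is the one given for the Ethernet multicast range in these notes.

```c
#include <assert.h>
#include <stdint.h>

/* Fill mac[6] with the Ethernet multicast address for a class D IPv4
 * address (host byte order). Returns 0, or -1 if the address is not
 * class D. */
int mcast_ip_to_mac(uint32_t group, uint8_t mac[6])
{
    assert(mac != 0);
    if ((group & 0xF0000000u) != 0xE0000000u)
        return -1;                             /* leading bits are not 1110 */
    mac[0] = 0x01;                             /* fixed prefix 01:00:5e */
    mac[1] = 0x00;
    mac[2] = 0x5E;
    mac[3] = (uint8_t)((group >> 16) & 0x7F);  /* keep only 7 bits here */
    mac[4] = (uint8_t)((group >> 8) & 0xFF);
    mac[5] = (uint8_t)(group & 0xFF);
    return 0;
}
```

For example, 224.1.2.3 and 239.129.2.3 both map to 01:00:5e:01:02:03, which is why the IP layer still has to filter.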
Ethernet Multicast
 Ethernet is a broadcast medium
– Every frame can potentially be seen by every host
 Ethernet cards have a unique Ethernet address
 Broadcast address:
– ff:ff:ff:ff:ff:ff
 Ethernet Multicast address range for IP:
– 01:00:5e:00:00:00 -to- 01:00:5e:7f:ff:ff
 Mapping IP Multicast onto Ethernet Multicast
The Internet group model (cont’)
 local-area multicast
» use the potential diffusion capabilities of the physical
layer (e.g. Ethernet)
» efficient and straightforward
 wide-area multicast
» requires going through multicast routers, using
IGMP/multicast routing/...
» routing in the same administrative domain is simple and
efficient
» inter-domain routing is complex, not fully operational
Multicast and the TCP/IP layered model
[Figure: multicast in the TCP/IP layered model. User space: the application with its building blocks (reliability mgmt, congestion control, other building blocks) and higher-level services. Socket layer. Kernel space: TCP, UDP, ICMP, IGMP and multicast routing over IP / IP multicast, over the device drivers.]
What is Multicast?
 Several applications need efficient means to transmit data
to multiple destinations with:
– less bandwidth
– higher throughput
– lower delay
– higher reliability
 Classification
– Data dissemination
– Transactions
– Large Scale Virtual Environments
 Build on top of the existing Internet and take into account
group communication constraints
– Manage groups
– Create and maintain multicast routes
– Efficient end-to-end delay (reliability, flow control, time constraints)
Ideal Multicast
 Senders (S) and Receivers (R) not aware of each other’s
position in the network.
 Scalable.
 Low latency (join, data propagation).
 Low bandwidth and processing overhead.
 “Reliable”, if this is cheap (“end-to-end”?)
 Easy to join/leave.
Why IP multicast?
 scalability...
– scales to an unlimited number of users
 reduced costs...
– cheaper equipment and access line
 increased speed...
– increases the delivery speed
[Figure: use unicast, or multicast? In both cases a content server reaches several clients across its access line and the ISP and Internet.]
Multicast Features: Multicast Scope Control
 Who gets which packets?
– Send everything to everybody ..
 TTL scope
– Keeps multicast traffic within an administrative domain by setting
TTL thresholds on the interfaces of the border router
 Administratively scoped addresses
– A multicast boundary can be set up
on the borders for addresses in the range
239.0.0.0–239.255.255.255
– Better than TTL scope
Multicasting: Receiving multicast message
 For a process to receive multicast messages it needs to
perform the following steps:
1. Create a UDP socket msd
msd = socket(AF_INET, SOCK_DGRAM, 0);
2. Bind it to a UDP port, e.g., 1234.
All processes must bind to the same port in order
to receive the multicast messages.
struct sockaddr_in groupHost;
memset(&groupHost, 0, sizeof(groupHost));
groupHost.sin_family = AF_INET;
groupHost.sin_port = htons(UDPport);
groupHost.sin_addr.s_addr = htonl(INADDR_ANY);
bind(msd, (struct sockaddr *) &groupHost, sizeof(groupHost));
Multicasting: Receiving multicast message
 (cont’)
3. Join a multicast group address GroupIPaddress ,
e.g., 224.111.112.113
joinGroup (msd, GroupIPaddress);
4. Use recv or recvfrom to read the messages, e.g.,
nbytes = recv(msd, recvBuf, BufLen,0);
Multicast Groups and Addresses
 Every IP multicast group has a group address.
 IP multicast provides only open groups
– it is not necessary to be a member of a group in order to send
datagrams to the group.
 Multicast addresses look like the IP addresses used for single
hosts and are written in the same way: A.B.C.D.
– Multicast addresses will never clash with host addresses because
a portion of the IP address space is specifically reserved for
multicast. 224.0.0.0 to 239.255.255.255.
– Multicast addresses from 224.0.0.0 to 224.0.0.255 are reserved for
multicast routing information;
– Application programs should use multicast addresses outside this
range.
Multicasting: Receiving multicast message
/* This function sets the socket option to make the local host join the multicast
group */
void joinGroup(int s, char *group)
{
    struct ip_mreq mreq; /* multicast group info structure */
    in_addr_t addr = inet_addr(group);
    if (addr == INADDR_NONE) {
        printf("error in inet_addr\n"); exit(-1);
    }
    /* check if group address is indeed a Class D address */
    if (!IN_MULTICAST(ntohl(addr))) {
        printf("%s is not a multicast address\n", group); exit(-1);
    }
    mreq.imr_multiaddr.s_addr = addr;
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    if ( setsockopt(s,IPPROTO_IP,IP_ADD_MEMBERSHIP,(char *) &mreq,
                    sizeof(mreq)) == -1 )
    {
        printf("error in joining group \n"); exit(-1);
    }
}
Receiving Multicast Datagrams
 Join a particular multicast group. This is done using
another call to setsockopt:
struct ip_mreq mreq;
setsockopt(sock,IPPROTO_IP,IP_ADD_MEMBERSHIP,&mreq,sizeof(mreq));
 The definition of struct ip_mreq is as follows:
struct ip_mreq {
struct in_addr imr_multiaddr; /* multicast group to join */
struct in_addr imr_interface; /* interface to join on */
};
Multicasting: Receiving multicast message
/* This function removes the process from the group */
void leaveGroup(int recvSock,char *group)
{
    struct ip_mreq dreq; /* multicast group info structure */
    in_addr_t addr = inet_addr(group);
    if (addr == INADDR_NONE) {
        printf("error in inet_addr\n");
        exit(-1);
    }
    dreq.imr_multiaddr.s_addr = addr;
    dreq.imr_interface.s_addr = htonl(INADDR_ANY);
    if( setsockopt(recvSock,IPPROTO_IP,IP_DROP_MEMBERSHIP,
                   (char *) &dreq,sizeof(dreq)) == -1 )
    {
        printf("error in leaving group \n");
        exit(-1);
    }
    printf("process quitting multicast group %s \n",group);
}
Multicasting: Sending multicast message
For a process to send multicast messages it needs to
perform the following:
1. use the UDP socket msd for sending multicast messages
struct sockaddr_in dest;
memset(&dest, 0, sizeof(dest));
dest.sin_family = AF_INET;
dest.sin_port = htons(UDPport); /* port must be in network byte order */
dest.sin_addr.s_addr = inet_addr(GroupIPaddress);
sendto (msd, sendBuf, BufLen, 0, (struct sockaddr *) &dest,
sizeof(dest));
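The sending step can be wrapped in a small function. This is a sketch under assumptions: send_to_group is a made-up name, and the check that the destination is class D is an addition for illustration, not something sendto requires.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send one datagram to a multicast group on an
 * existing UDP socket. Returns 0 on success, -1 on error. */
int send_to_group(int fd, const char *group, unsigned short port,
                  const void *buf, size_t len)
{
    struct sockaddr_in dest;
    ssize_t n;

    assert(group != NULL && buf != NULL);
    memset(&dest, 0, sizeof(dest));
    dest.sin_family = AF_INET;
    dest.sin_port = htons(port);              /* network byte order */
    if (inet_pton(AF_INET, group, &dest.sin_addr) != 1 ||
        !IN_MULTICAST(ntohl(dest.sin_addr.s_addr)))
        return -1;                            /* not a class D address */

    n = sendto(fd, buf, len, 0, (struct sockaddr *)&dest, sizeof(dest));
    return (n == (ssize_t)len) ? 0 : -1;
}
```

Because the group model is open, the socket does not need to have joined the group before calling this.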
Multicasting: Sending multicast message
 (cont’)
2. If the process also wants to receive, join the multicast group address
GroupIPaddress, e.g., 224.111.112.113 (sending alone does not require membership)
joinGroup (msd, GroupIPaddress);
3. Use recv or recvfrom to read the messages, e.g.,
nbytes = recv(msd, recvBuf, BufLen, 0);
Multicasting
 Time-to-live
– control how far the messages can go, e.g., 2 lets a message cross at most
one router. (default is 1, which will result in multicast packets
going only to other hosts on the local network.)
u_char TimeToLive;
TimeToLive = 2;
setTTLvalue (s, &TimeToLive);
/* This function sets the Time-To-Live value */
void setTTLvalue(int s,u_char *ttl_value)
{
    if( setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL, (char *) ttl_value,
                   sizeof(u_char)) == -1 )
    {
        printf("error in setting TTL value\n");
    }
}
Multicasting
 Time-to-live
– To provide meaningful scope control, multicast routers enforce the
following "thresholds" on forwarding based on the TTL field:
0 restricted to the same host
1 restricted to the same subnet
32 restricted to the same site
64 restricted to the same region
128 restricted to the same continent
255 unrestricted
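To pick one of these scopes and confirm what the socket will use, the TTL can be set and read back. A minimal sketch with made-up helper names (set_mcast_ttl, get_mcast_ttl); it passes the value as an unsigned char, as the slides do.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Set the TTL used for outgoing multicast datagrams. Returns 0 or -1. */
int set_mcast_ttl(int fd, unsigned char ttl)
{
    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl));
}

/* Read the current multicast TTL. Returns 0..255, or -1 on error. */
int get_mcast_ttl(int fd)
{
    unsigned char ttl = 0;
    socklen_t len = sizeof(ttl);

    assert(fd >= 0);
    if (getsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, &len) < 0)
        return -1;
    return ttl;
}
```

For example, set_mcast_ttl(fd, 32) would keep traffic within the site under the thresholds listed above.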
Multicasting
 Loop-back
– to control whether the process gets a copy of its own transmissions we use:
u_char loop;
loop = 1;
setLoopback (s, loop);
loop = 1 /* means enable loopback (default) */
loop = 0 /* means disable loopback */
– By default, messages sent to the multicast group are looped back to the
local host; calling setLoopback with loop = 0 disables that.
/* This function enables or disables loopback */
void setLoopback(int s, u_char loop)
{
    if( setsockopt(s,IPPROTO_IP,IP_MULTICAST_LOOP,(char *) &loop,
                   sizeof(u_char)) == -1 )
    {
        printf("error in setting loopback value\n");
    }
}
Multicasting
 Reuse-port
– allow multiple multicast processes to run on the same host
(call this before bind):
reusePort (s);
/*
This function sets a socket option that allows multiple processes
to bind to the same port
*/
void reusePort(int s)
{
    int one=1;
    if ( setsockopt(s,SOL_SOCKET,SO_REUSEADDR,(char *) &one,sizeof(one)) == -1 )
    {
        printf("error in setsockopt,SO_REUSEADDR \n");
        exit(-1);
    }
}
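The ordering matters: the option has to be set before bind. The sketch below shows that order; bind_shared_udp and bound_port are made-up names, and the extra SO_REUSEPORT call is an assumption for BSD-derived systems, not part of the slides.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: UDP socket bound to `port` (0 lets the kernel
 * pick one) with address reuse enabled before bind(). Returns fd or -1. */
int bind_shared_udp(unsigned short port)
{
    struct sockaddr_in local;
    int one = 1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)) < 0) {
        close(fd);
        return -1;
    }
#ifdef SO_REUSEPORT
    /* Some systems want SO_REUSEPORT for fully duplicate bindings;
     * a failure here is deliberately not treated as fatal. */
    (void)setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));
#endif
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_port = htons(port);
    local.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Port the socket is actually bound to, or -1 on error. */
int bound_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    assert(fd >= 0);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```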
Multicasting - Example
 http://condor.depaul.edu/~czlatea/TDC561/LectureNotes/TDC561_week8/
 multicast.h
 multicastUtilities.c
 multicastChat.c
Reliable One-One Communication
 Use reliable transport protocols (TCP) or handle at the application layer
 Client/Server semantics in the presence of failures
 Possibilities
– Client unable to locate server
– Lost request messages
– Server crashes after receiving request
– Lost reply messages
– Client crashes after sending request
Reliable One-Many Communication
 Reliable multicast
– Lost messages => need
to retransmit
 Possibilities
– ACK-based schemes
» Sender can become
bottleneck
– NACK-based schemes
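In a NACK-based scheme the receiver stays silent until it notices a gap in the sequence numbers. The sketch below illustrates only that gap detection; the struct and function names are made up, sequence numbers are assumed to start at 0 and never wrap, and the actual sending of the NACK is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Receiver-side state: the next sequence number we expect to see. */
struct nack_state { uint32_t next_expected; };

/* Process an arriving sequence number. Writes up to `max` missing
 * sequence numbers into `missing` and returns how many were written
 * (0 means there is nothing to NACK). Packets older than next_expected,
 * such as duplicates and retransmissions, are ignored here. */
size_t nack_on_arrival(struct nack_state *st, uint32_t seq,
                       uint32_t *missing, size_t max)
{
    size_t n = 0;
    uint32_t s;

    assert(st != NULL && (missing != NULL || max == 0));
    if (seq < st->next_expected)
        return 0;
    for (s = st->next_expected; s < seq && n < max; s++)
        missing[n++] = s;            /* gap: these were never received */
    st->next_expected = seq + 1;
    return n;
}
```

If packets 0, 1 and then 4 arrive, the third call reports 2 and 3 as missing. The sender hears nothing while delivery is going well, which avoids the ACK bottleneck noted above.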
Atomic Multicast
 Reliable Group Communication
– Processes can fail
– Atomicity of Multicast is required
» Atomicity?
 Group Membership
– Multicast and a corresponding group of recipients
– Failures of processes can be viewed as changes to group membership.
 System Model
– Separating receiving a message and delivering it to an application
– Group View: a list of processes associated with a message
 View Change
– A special multicast message
– Race between m and vc
 Condition
– Either m is delivered to all processes before any process is delivered a new vc
– Or, m is not delivered at all.
Atomic Multicast
 Atomic multicast: a guarantee that all processes receive the message or
none at all
– Replicated database example
 Problem: how to handle process crashes?
 Solution: group view
– Each message is uniquely associated with a group of processes
» View of the process group when message was sent
» All processes in the group should have the same view (and agree on it)
[Figure: Virtually Synchronous Multicast]
Reliable Multicast Transport Protocol (RMTP)
Smart “session manager”
elects DR’s and sets
parameters. How? Just
like that...
• S, R use windows
• Designated Receivers
eliminate ACK implosion
• ACK’s sent to DR’s
• DR’s and S cache data and
retransmit it when needed.
RMTP(2)
 After set up S starts sending data. Receivers send periodic
ACK’s after first packet received.
 If no ACK’s for a long time, connection terminates.
 DR’s or S retransmit info using unicast or multicast, depending
on number of errors.
 Immediate TX request sent to DR’s, for receivers that join the
session.
 Sender window advance determined by slowest receiver.
 ACK’s must not be repeated too often. Measure RTT to AP.
 S adjusts (decreases) send window to 1 if many errors; then
increases linearly.
 DR’s are fixed, but each R chooses its DR. (DR sends
SND_ACK_TOME with TTL fixed to a known value).
Socket Options
 Various attributes that are used to determine the
behavior of sockets.
 Setting options tells the OS/Protocol Stack the
behavior we want.
 Support for generic options (apply to all sockets)
and protocol specific options.
Option types
 Many socket options are Boolean flags indicating
whether some feature is enabled (1) or disabled (0).
 Other options are associated with more complex
types including int, timeval, in_addr,
sockaddr, etc.
 Read-Only Socket Options
– Some options are readable only (we can’t set the value).
Setting and Getting option values
getsockopt() gets the current value of a socket option.
setsockopt() is used to set the value of a socket option.
#include <sys/socket.h>
getsockopt()
int getsockopt( int sockfd,
int level,
int optname,
void *optval,
socklen_t *optlen);
level specifies whether the option is a general option or a
protocol specific option (what level of code should
interpret the option).
setsockopt()
int setsockopt( int sockfd,
int level,
int optname,
const void *optval,
socklen_t optlen);
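For options whose value is a plain int, the two calls can be wrapped as below. A minimal sketch; the wrapper names get_int_option and set_int_option are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Read an int-valued socket option into *value. Returns 0 or -1. */
int get_int_option(int fd, int level, int optname, int *value)
{
    socklen_t len = sizeof(*value);

    assert(value != NULL);
    return getsockopt(fd, level, optname, value, &len);
}

/* Set an int-valued (or Boolean) socket option. Returns 0 or -1. */
int set_int_option(int fd, int level, int optname, int value)
{
    return setsockopt(fd, level, optname, &value, sizeof(value));
}
```

For example, set_int_option(fd, SOL_SOCKET, SO_KEEPALIVE, 1) turns keepalive on, and a later get_int_option call reports a nonzero value.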
General Options
 Protocol independent options.
 Handled by the generic socket system code.
 Some general options are supported only by specific
types of sockets (SOCK_DGRAM, SOCK_STREAM).
Some Generic Options
SO_BROADCAST
SO_DONTROUTE
SO_ERROR
SO_KEEPALIVE
SO_LINGER
SO_RCVBUF,SO_SNDBUF
SO_REUSEADDR
SO_BROADCAST
 Boolean option: enables/disables sending of
broadcast messages.
 Underlying DL layer must support broadcasting!
 Applies only to SOCK_DGRAM sockets.
 Prevents applications from inadvertently sending
broadcasts (OS looks for this flag when
broadcast address is specified).
SO_DONTROUTE
 Boolean option: enables bypassing of normal routing.
 Used by routing daemons.
SO_ERROR
 Integer value option.
 The value is an error indicator value (similar to
errno).
 Readable only
 Reading (by calling getsockopt()) clears any
pending error.
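Reading the option looks like this. The helper name take_pending_error is made up; the sketch returns errno itself when the getsockopt call fails.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Fetch and clear the socket's pending error. Returns 0 if there is
 * none, otherwise an errno-style value. */
int take_pending_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return errno;      /* getsockopt itself failed */
    assert(len == sizeof(err));
    return err;            /* 0 means no pending error */
}
```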
SO_KEEPALIVE
 Boolean option: enabled means that STREAM
sockets should send a probe to peer if no data flow
for a “long time”.
 Used by TCP - allows a process to determine
whether peer process/host has crashed.
 Consider what would happen to an open telnet
connection without keepalive.
SO_LINGER
Value is of type:
struct linger {
int l_onoff; /* 0 = off */
int l_linger; /* time in seconds */
};
 Used to control whether and how long a call to close
will wait for pending ACKS.
 connection-oriented sockets only.
SO_LINGER usage
 By default, calling close() on a TCP socket will
return immediately.
 The closing process has no way of knowing whether
or not the peer received all data.
 Setting SO_LINGER means the closing process can
determine that the peer machine has received the
data (but not that the data has been read() !).
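Turning the option on could look like the sketch below. The helper name set_linger is made up, and the timeout is whatever the caller chooses.

```c
#include <assert.h>
#include <sys/socket.h>

/* Make close() on fd wait up to `seconds` for queued data to be sent
 * and acknowledged. Returns 0 or -1. */
int set_linger(int fd, int seconds)
{
    struct linger lg;

    assert(seconds >= 0);
    lg.l_onoff = 1;            /* nonzero = linger on */
    lg.l_linger = seconds;     /* how long close() may wait */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

With this set, the return value of close() should be checked, since it is how a failure to deliver within the linger time is reported.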
shutdown() vs SO_LINGER
 How you can use shutdown() to find out when the peer
process has read all the sent data [R.Stevens, 7.5]
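The idea can be sketched as two steps: half-close after writing, then read until end-of-file. The helper names are made up, and the sketch assumes the peer closes its side only after it has read everything.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send all of buf, then half-close: the peer sees end-of-file after
 * the data, but we can still read from the socket. Returns 0 or -1. */
int send_then_half_close(int fd, const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;              /* interrupted, try again */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return shutdown(fd, SHUT_WR);      /* sends our FIN */
}

/* Block until the peer closes its side (read returns 0). Data the
 * peer sends in the meantime is discarded. Returns 0 or -1. */
int wait_for_peer_close(int fd)
{
    char scratch[256];

    for (;;) {
        ssize_t n = read(fd, scratch, sizeof(scratch));
        if (n == 0)
            return 0;                  /* peer's FIN: it has closed */
        if (n < 0 && errno != EINTR)
            return -1;
    }
}
```

When wait_for_peer_close returns 0, the peer application has seen our end-of-file and closed, which is a stronger statement than SO_LINGER can give.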
TCP Connection Termination
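[Figure: sequence diagram between client and server]
Client: write, close; close returns at once
1. Client → Server: FIN, SN=X (data queued by TCP at the server)
2. Server → Client: ACK=X+1
Server: app. reads queued data and FIN, ..., then close
3. Server → Client: FIN, SN=Y
4. Client → Server: ACK=Y+1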
TCP Connection Termination close w/ SO_LINGER
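[Figure: sequence diagram between client and server]
Client: write, close (close blocks)
1. Client → Server: FIN, SN=X (data queued by TCP at the server)
2. Server → Client: ACK=X+1
Client: close returns once this ACK arrives
Server: app. reads queued data and FIN, ..., then close
3. Server → Client: FIN, SN=Y
4. Client → Server: ACK=Y+1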
TCP Connection Termination w/ shutdown
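[Figure: sequence diagram between client and server]
Client: write, shutdown WR, then read blocks
1. Client → Server: FIN, SN=X (data queued by TCP at the server)
2. Server → Client: ACK=X+1
Server: app. reads queued data and FIN, ..., then close
3. Server → Client: FIN, SN=Y
Client: read returns 0
4. Client → Server: ACK=Y+1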
SO_RCVBUF and SO_SNDBUF
 Integer values options - change the receive and send
buffer sizes.
 Can be used with STREAM and DGRAM sockets.
 With TCP, this option affects the window size used for flow
control - must be established before connection is made.
SO_REUSEADDR
 Boolean option: enables binding to an address (port)
that is already in use.
 Used by servers that are transient - allows binding a
passive socket to a port currently in use (with active
sockets) by other processes.
 Can be used to establish separate servers for the
same service on different interfaces (or different IP
addresses on the same interface).
 Virtual Web Servers can work this way.
IP Options (IPv4)
 IP_HDRINCL: used on raw IP sockets when we want to
build the IP header ourselves.
 IP_TOS: allows us to set the “Type-of-service” field in an
IP header.
 IP_TTL: allows us to set the “Time-to-live” field in an IP
header.
TCP socket options
 TCP_KEEPALIVE: set the idle time used when
SO_KEEPALIVE is enabled.
 TCP_MAXSEG: set the maximum segment size sent by a
TCP socket.
 TCP_NODELAY: can disable TCP’s Nagle algorithm that
delays sending small packets if there is unACK’d data
pending.
 TCP_NODELAY does not change delayed ACKs, which are a separate
receiver-side mechanism (TCP ACKs are cumulative).
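Setting a TCP-level option uses the IPPROTO_TCP level instead of SOL_SOCKET. A minimal sketch for TCP_NODELAY; the helper name set_nodelay is made up.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Turn the Nagle algorithm off (on = 1) or back on (on = 0) for a TCP
 * socket. Returns 0 or -1. */
int set_nodelay(int fd, int on)
{
    assert(on == 0 || on == 1);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```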