ppt - Jon Schipp


What's Under Your Hood?
Implementing a Network Monitoring System
4/9/2015
[email protected]
1
Who am I?

Jon Schipp

Unix Admin

Linux & Unix User Group

Southern Indiana Computer Klub
and...
I like computers a lot
What's Network Monitoring?
Monitoring?

Monitoring your network

Collecting data, i.e. network traffic

Interpreting the data
Why?

Network issues

Attack detection

Record keeping

Fun
Focus

Small/Medium size business

Basement endeavors

Cheap goods

Working with what you have
where the magic happens
gimme the data

hubs

monitor/SPAN ports, port mirroring

taps

ip forwarding/relaying/tunneling, whatev
Forwarding/Relaying


Wireshark Remote Feature
NetworkMiner Professional: Pcap-over-IP
tcpdump -nni eth0 -s0 -w - | nc 192.168.1.254 33246
SSL/Encryption: ssh, socat, ncat, cryptcat, stunnel

Netfilter's Iptables
iptables -t mangle -A PREROUTING -p tcp -m multiport --dports 80,443,22,20,21 -i eth0 -j TEE --gateway 192.168.1.254
iptables -t mangle -A POSTROUTING -p tcp -m multiport --dports 80,443,22,20,21 -o eth0 -j TEE --gateway 192.168.1.254

OpenBSD's PF
pass out on em0 dup-to (em1 192.168.1.254) proto tcp from any to any port { 80, 443, 22, 20, 21 }
pass in on em0 dup-to (em1 192.168.1.254) proto tcp from any to any port { 80, 443, 22, 20, 21 }
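
The tcpdump-to-nc pipeline above needs something listening on the monitoring host. Here is a minimal sketch of such a receiver, assuming the sender writes a classic pcap stream with `-w -`. The names `receive_pcap` and `serve_once` are made up for illustration, and the port is the one from the slide.

```python
import io
import socket

# Magic numbers of a classic (microsecond) pcap file as they appear on the
# wire, for little-endian and big-endian writers respectively.
PCAP_MAGICS = {b"\xd4\xc3\xb2\xa1", b"\xa1\xb2\xc3\xd4"}


def receive_pcap(conn: socket.socket, out: io.BufferedIOBase) -> int:
    """Copy a pcap stream from a connected socket to `out`; return bytes written."""
    header = b""
    while len(header) < 4:
        chunk = conn.recv(4 - len(header))
        if not chunk:
            raise ValueError("stream ended before the pcap magic number")
        header += chunk
    if header not in PCAP_MAGICS:
        raise ValueError("not a classic pcap stream")
    out.write(header)
    total = len(header)
    while True:
        chunk = conn.recv(65536)
        if not chunk:  # sender closed the connection
            return total
        out.write(chunk)
        total += len(chunk)


def serve_once(port: int, path: str) -> int:
    """Accept one sender on `port` and save its stream to `path` (hypothetical helper)."""
    with socket.create_server(("", port)) as srv:
        conn, _ = srv.accept()
        with conn, open(path, "wb") as out:
            return receive_pcap(conn, out)


if __name__ == "__main__":
    # Pairs with: tcpdump -nni eth0 -s0 -w - | nc 192.168.1.254 33246
    print(serve_once(33246, "remote.pcap"), "bytes saved")
```

This sketch is unencrypted, like plain nc; wrapping it in ssh or stunnel is the same idea as the SSL/Encryption bullet.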
Architecture
High Speed Packet Capture

High-end equipment is expensive

DIY: tuning and compiling

Hardware is pretty fast nowadays, but we are using software that isn't designed for efficient packet capture
NICs

Get a quality card

NAPI is good

DMA is good

Intel PRO/1000 MT Gigabit models are generally good, $30 on eBay
PCI buses
(bus speed in MHz) * (bus width in bits) / 8 = speed in megabytes/second

Bus           Clock      Width                     Throughput
PCI           66 MHz   * 32 bit             / 8 =  264 MB/s
PCI-X         66 MHz   * 64 bit             / 8 =  ~400 MB/s (after 20% overhead)
PCI-X         133 MHz  * 64 bit             / 8 =  ~850 MB/s (after 20% overhead)
PCI-X         266 MHz  * 64 bit             / 8 =  ~1700 MB/s (after 20% overhead)
PCI-X         533 MHz  * 64 bit             / 8 =  ~3400 MB/s (after 20% overhead)
PCIe v1 x1    2500 MHz * 1 1-bit lane       / 8 =  250 MB/s (after 20% overhead)
PCIe v2 x1    5000 MHz * 1 1-bit lane       / 8 =  500 MB/s (after 20% overhead)
PCIe v2 x2    5000 MHz * 2 1-bit lanes      / 8 =  1000 MB/s (after 20% overhead)
PCIe v2 x4    5000 MHz * 4 1-bit lanes      / 8 =  2000 MB/s (after 20% overhead)
PCIe v2 x8    5000 MHz * 8 1-bit lanes      / 8 =  4000 MB/s (after 20% overhead)
PCIe v2 x16   5000 MHz * 16 1-bit lanes     / 8 =  8000 MB/s (after 20% overhead)
PCIe v2 x32   5000 MHz * 32 1-bit lanes     / 8 =  16000 MB/s (after 20% overhead)
PCIe v3 x32   8000 MHz * 32 1-bit lanes     / 8 =  ~31500 MB/s (after 1.5% overhead)

Gigabit Ethernet:    1000 Mb/s / 8 = 125 megabytes/second
10 Gigabit Ethernet: 10000 Mb/s / 8 = 1250 megabytes/second
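
The bus formula is easy to check in code. This is a small sketch, assuming the overhead is a simple fraction subtracted from the raw figure; the function names are made up for illustration. Computed values for PCI-X come out slightly above the rounded slide figures.

```python
def bus_throughput_mb_s(clock_mhz: float, width_bits: int, overhead: float = 0.0) -> float:
    """Usable bus throughput in MB/s: clock * width / 8, less an overhead fraction."""
    return clock_mhz * width_bits / 8 * (1.0 - overhead)


def link_rate_mb_s(megabits_per_second: float) -> float:
    """Convert a network link rate from Mb/s to MB/s."""
    return megabits_per_second / 8


if __name__ == "__main__":
    gige = link_rate_mb_s(1000)
    for name, clock, width, overhead in [
        ("PCI 66 MHz / 32 bit", 66, 32, 0.0),
        ("PCI-X 133 MHz / 64 bit", 133, 64, 0.20),
        ("PCIe v2 x4", 5000, 4, 0.20),
    ]:
        bus = bus_throughput_mb_s(clock, width, overhead)
        # How many saturated gigabit ports the bus could carry in theory.
        print(f"{name}: {bus:.0f} MB/s, about {bus / gige:.1f} GigE links")
```

A plain 32-bit PCI slot has room for roughly two saturated gigabit ports in theory, which is why the bus matters before 10GigE is even on the table.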
Other things




Decent commodity CPU, e.g. Opteron beats Xeon in capture

SMP is good

If you plan on storing the data, writing to disk will be a bottleneck

RAID striping, SATA? For sure

SSD (maybe?) Nah
Typical Frame Processing








Frame reaches NIC
Ethernet preamble is removed
FCS is calculated; if it is bad, the frame is dropped
If interface is set in promiscuous mode, capture all
Else, only process when dst MAC is me (unicast), or broadcast, or multicast (if on)
FIFO to kernel ring buffer, CPU or DMA
NIC generates an interrupt, interrupt handler is called
Passed to host stack → ip_input module → tcp/udp module → userspace
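
The accept/drop decision in the steps above can be sketched as a small function. This is a simplification for illustration only: a real NIC filters multicast per subscribed group rather than all-or-nothing, and the name `nic_accepts` is made up.

```python
BROADCAST = b"\xff" * 6


def is_multicast(mac: bytes) -> bool:
    """True when the group bit (lowest bit of the first octet) is set."""
    return bool(mac[0] & 0x01)


def nic_accepts(
    dst_mac: bytes,
    own_mac: bytes,
    promiscuous: bool = False,
    multicast_enabled: bool = False,
    fcs_ok: bool = True,
) -> bool:
    """Decide whether a received frame is handed to the kernel."""
    if not fcs_ok:
        return False  # bad checksum: dropped before anything else
    if promiscuous:
        return True  # capture everything on the wire
    if dst_mac == own_mac or dst_mac == BROADCAST:
        return True  # unicast to me, or broadcast
    return multicast_enabled and is_multicast(dst_mac)


if __name__ == "__main__":
    me = bytes.fromhex("0002b39ac203")
    other = bytes.fromhex("001122334455")
    print(nic_accepts(other, me))                    # frame for another host
    print(nic_accepts(other, me, promiscuous=True))  # same frame while sniffing
```

This is the reason a capture interface has to be in promiscuous mode: without it, traffic for other hosts never reaches the ring buffer.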
Frame Processing
Specimen

FreeBSD 8.2-RELEASE

Ubuntu Server 10.04
mbuf kernel structure

FreeBSD - data and headers are stored in mbufs and mbuf clusters
$ netstat -m | head -n 3
82/653/735 mbufs in use (current/cache/total)
0/648/648/25600 mbuf clusters in use (current/cache/total/max)
0/256 mbuf+clusters out of packet secondary zone in use (current/cache)
man mbuf: The total size of an mbuf, MSIZE, is a constant defined in <sys/param.h>.
$ grep -H -n MSIZE /sys/sys/param.h
sys/sys/param.h:145:#define MSIZE 256 /* size of an mbuf */
sysctl kern.ipc.nmbclusters=25600 (default)
$ vmstat -z | grep mbuf_cluster
mbuf_cluster: 2048,    25600
              ^size^   ^limit^
sk_buff kernel structure

Linux - data and headers are stored in sk_buffs
/usr/include/linux/skbuff.h
Problems

Each packet generates an interrupt; this can lead to receive livelock (an interrupt storm)

Context switches

System Calls
Solutions

Device Polling

NAPI

Shared memory, mmap(), and Zero Copy

Bypassing host stack
Solutions, less so

Checksum offloading

Large Receive Offload (LRO)

Larger on-board memory size

More data descriptors
Capture Mechanisms/Subsystems

Berkeley Packet Filter (BPF)
Filter packets before they get to user space

Linux Socket Filter (LSF)
Extended BPF (kinda)


and PF_RING (Linux)

Others: CSPF, NDIS, xPF, MPF, DPF, Swift and so on...
libpcap

C library for packet capture
Provides link layer access to data available on the network through
interfaces attached to the system.

Runs on almost all the modern Unices
WinPcap for Windows

When data reaches user space, it's stored in the libpcap buffer; applications read from it
FreeBSD Frame Processing
FreeBSD Processing cont.

3 copies due to double buffer

Deals with smaller buffers compared to Linux

Half of the double buffer is copied to user space


Packet is passed to each BPF device, /dev/bpf[0-9] (which the application binds to via libpcap)

App reads from the HOLD buffer; data is copied from the STORE buffer into the HOLD buffer
Linux Frame Processing
Linux Processing cont.

2 copies

Deals with larger buffers compared to FreeBSD

Smart queue, pointers


Packets are copied individually, not as whole buffers full of packets

If packets are available, user space (libpcap) is woken up to grab data from LSF
Tuning: Interrupt Livelock

Interrupt usage high?

Most modern Linux kernels are compiled with device polling

FreeBSD does not have it on by default
options DEVICE_POLLING
options HZ=1000
make buildkernel KERNCONF=NEWKERN
make installkernel KERNCONF=NEWKERN
ifconfig em0 polling

Get a New API (NAPI) card
Tuning: Buffers

Kernel dropping lots of packets?

Increase the size of your kernel buffers

FreeBSD
sysctl net.bpf.bufsize=4096
sysctl net.bpf.maxbufsize=524288

Linux
sysctl net.core.rmem_default=114688
sysctl net.core.rmem_max=131071
sysctl net.core.netdev_max_backlog=1000

Increase kernel virtual memory size
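
To see why buffer size matters, here is a back-of-the-envelope sketch of how long a buffer survives if nothing drains it. It assumes a link running at full rate and ignores per-packet header overhead inside the buffer; `fill_time_ms` is a made-up name.

```python
def fill_time_ms(buffer_bytes: int, link_mbit_s: float) -> float:
    """Milliseconds a link running at full rate needs to fill the buffer."""
    bytes_per_ms = link_mbit_s * 1_000_000 / 8 / 1000
    return buffer_bytes / bytes_per_ms


if __name__ == "__main__":
    # Buffer sizes taken from the sysctl values above, at 1 Gb/s.
    for label, size in [
        ("net.bpf.bufsize=4096", 4096),
        ("net.core.rmem_max=131071", 131071),
        ("net.bpf.maxbufsize=524288", 524288),
    ]:
        print(f"{label}: full after {fill_time_ms(size, 1000):.3f} ms")
```

A 4096-byte buffer fills in a few hundredths of a millisecond at gigabit speed, so the reader gets almost no slack before the kernel starts dropping.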
Tuning: Drivers

Bad NIC performance?

FreeBSD: man driver e.g. man em:
hw.em.rxd
Number of receive descriptors allocated by the driver. The
default value is 256. The 82542 and 82543-based adapters can
handle up to 256 descriptors, while others can have up to 4096.
echo hw.em.rxd=4096 >> /boot/loader.conf

Linux: ethtool, find driver README file (/usr/src/linux/)
ethtool -g eth0
ethtool -G eth0 rx 4096
tcpdump tests, average
6,000,000 packets in 60 seconds using iperf; average packet loss

OS defaults, hardware: Dell PowerEdge 2850, Xeon (Quad), 4GB RAM

Command                                  FreeBSD   Linux
tcpdump -nni em0 -w test96.pcap          0%        8%
tcpdump -nni em0 -w /dev/null            0%        0%
tcpdump -nni em0 -s0 -w test65535.pcap   1.6%      22%
tcpdump -nni em0 -s0 -w /dev/null        0%        0.02%
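
To put the percentages in perspective, here is a quick sketch that turns them into packet counts. It only does arithmetic on the figures quoted above; the function names are made up.

```python
def rate_pps(total_packets: int, seconds: float) -> float:
    """Average packet rate of the test run."""
    return total_packets / seconds


def packets_lost(total_packets: int, loss_percent: float) -> int:
    """Number of packets a given loss percentage corresponds to."""
    return round(total_packets * loss_percent / 100)


if __name__ == "__main__":
    total, duration = 6_000_000, 60
    print(f"offered load: {rate_pps(total, duration):.0f} packets/second")
    for label, loss in [
        ("FreeBSD, -s0 to disk", 1.6),
        ("Linux, default snaplen to disk", 8),
        ("Linux, -s0 to disk", 22),
    ]:
        print(f"{label}: {packets_lost(total, loss)} packets lost")
```

At this load, the gap between writing to disk and writing to /dev/null is the disk bottleneck showing up as lost packets.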
libpcap buffers

The libpcap library initializes its buffer to 32 KB if the BPF value is less than 32 KB
if ((ioctl(fd, BIOCGBLEN, (caddr_t)&v) < 0) || v < 32768)
    v = 32768;
Linux initializes its buffer size at 512 KB

Increase BPF buffer size globally, all apps, remember?
net.bpf.bufsize, net.bpf.maxbufsize

Libpcap will initialize its buffer to size in net.bpf.bufsize

Set buffer for tcpdump only: use -B 512 (512 KB; -B takes the size in KiB)
FreeBSD, interface drop counts
netstat
$ netstat -dI em0
Name  Mtu Network   Address             Ipkts Ierrs Idrop   Opkts Oerrs Coll Drop
em0  1500 <Link#2>  00:02:b3:9a:c2:03 2083316     0     0 1043607     0    0    0
$ netstat -B
  Pid  Netif   Flags      Recv      Drop     Match Sblen Hblen Command
90460    em0 p--s---       103         0       103   632     0 tcpdump
43960    em0 p--s---   3803363         0   3803363   712     0 ntop
$ sysctl dev.em.0.dropped
dev.em.0.dropped: 0
$ grep -R -H -n if_iqdrops /usr/src/
sys/dev/e1000/if_lem.c:3470: ifp->if_iqdrops++;
usr.bin/netstat/if.c:289: idrops = ifnet.if_iqdrops
Linux, interface drop counts
ifconfig
$ ifconfig -a | egrep -e "(^eth|drop)"
$ ethtool -S eth0
$ awk '{ print $1, $5 }' /proc/net/dev
Inter-|
face drop
lo: 0
br0: 3354
eth0: 0
eth1: 0
eth2: 0
eth3: 14
eth4: 0
eth5: 103395

static int get_dev_fields(char *bp, struct interface *ife)
{
    switch (procnetdev_vsn) {
    case 3:
        sscanf(bp,
               "%llu %llu %lu %lu %lu %lu %lu",
               &ife->stats.rx_bytes,
               &ife->stats.rx_packets,
               &ife->stats.rx_errors,
               &ife->stats.rx_dropped,
               ...
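
The awk one-liner above can be sketched in Python for use in a monitoring script. `SAMPLE` is made-up illustration data: only the drop counts for lo, br0 and eth5 come from the slide, and the other counters are invented. `rx_drops` is a made-up name.

```python
# Illustrative /proc/net/dev content; all counters except the rx drop
# column are invented for the example.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
   br0:  500000    4000    0 3354    0     0          0         0   200000    1500    0    0    0     0       0          0
  eth5: 9000000   80000    0 103395  0     0          0         0        0       0    0    0    0     0       0          0
"""


def rx_drops(proc_net_dev_text: str) -> dict[str, int]:
    """Map interface name to its receive drop counter."""
    drops = {}
    for line in proc_net_dev_text.splitlines()[2:]:  # skip the two header lines
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        # Receive columns: bytes, packets, errs, drop, ...
        drops[name.strip()] = int(fields[3])
    return drops


if __name__ == "__main__":
    # On a Linux host, read the live counters:
    # with open("/proc/net/dev") as f: print(rx_drops(f.read()))
    for iface, count in rx_drops(SAMPLE).items():
        print(iface, count)
```

Splitting on the colon first avoids the case where the interface name and the byte counter run together with no space between them.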
tcpdump/libpcap drops



“Packets captured” – Packets processed by tcpdump
“Received by filter” – Passed the filter (LSF, BPF)
“Dropped by kernel” – Not enough space in kernel buffer

FreeBSD (kernel drops):
libpcap gets its drop count from the kernel (BPF)
ps_drop from pcap_stats() is bs_drop from BIOCGSTATS

Linux (kernel drops):
libpcap gets its drop count from PF_PACKET’s PACKET_STATISTICS
ps_drop from pcap_stats()
ps_ifdrop – Ubuntu addendum/patch (Linux, Tru64 Unix only) from /proc/net/dev
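
The three counters tcpdump prints on exit are easy to scrape. This is a sketch that parses them and computes a drop percentage, assuming "received by filter" is the right denominator (what that counter includes varies by OS). The sample numbers are invented, and the function names are made up.

```python
import re

# Matches the summary lines tcpdump prints when it exits.
STAT_RE = re.compile(
    r"^(\d+) packets? (captured|received by filter|dropped by kernel)",
    re.MULTILINE,
)


def parse_tcpdump_stats(text: str) -> dict[str, int]:
    """Extract the exit counters from tcpdump's summary output."""
    return {label: int(count) for count, label in STAT_RE.findall(text)}


def kernel_drop_percent(stats: dict[str, int]) -> float:
    """Kernel drops as a percentage of packets received by the filter."""
    received = stats.get("received by filter", 0)
    if received == 0:
        return 0.0
    return 100.0 * stats.get("dropped by kernel", 0) / received


if __name__ == "__main__":
    # Invented sample output for illustration.
    summary = (
        "4680000 packets captured\n"
        "6000000 packets received by filter\n"
        "1320000 packets dropped by kernel\n"
    )
    stats = parse_tcpdump_stats(summary)
    print(stats, f"{kernel_drop_percent(stats):.1f}% dropped")
```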
PF_RING for Linux

Creates new socket called PF_RING
Works with existing PF_PACKET apps

Shared memory

Can bypass host stack, sniffing only

PF_RING aware drivers for faster capture: e1000, igb, ixgbe
PF_RING for Linux

Compile PF_RING

Compile PF_RING aware libpcap and tcpdump

Load PF_RING kernel module
modprobe pf_ring transparent_mode=2 enable_debug=0 enable_tx_capture=0 enable_ip_defrag=0 quick_mode=0

Recompile all apps to use new shared libraries, libpcap and PF_RING
./configure CPPFLAGS="-I/usr/local/include" LDFLAGS="-L/usr/local/lib -lpfring -lpcap" \
&& make && make install
PF_RING DNA

Direct NIC Access, pure speed

Map NIC memory and registers to user land



Packet copy from the NIC to the DMA ring is done by the NIC's NPU

One application at a time can use the DMA ring

Requires DNA driver
PF_RING TNAPI

Threaded NAPI
vPF_RING

Virtual PF_RING

Hypervisor bypass

Zero-Copy
netmap FreeBSD

mmap() shared memory

Uses fewer system calls

Creates new device, /dev/netmap


A 1 GHz CPU can generate the 14.8 Mpps that can saturate a 10GigE interface

Supports ixgbe, e1000, re
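
Where does 14.8 Mpps come from? It is the line rate for minimum-size Ethernet frames. This is a sketch of the arithmetic, assuming 64-byte frames plus the 8-byte preamble and 12-byte inter-frame gap that each frame costs on the wire; `max_pps` is a made-up name.

```python
PREAMBLE_BYTES = 8          # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def max_pps(link_bits_per_second: float, frame_bytes: int = 64) -> float:
    """Theoretical maximum frames per second on an Ethernet link."""
    wire_bytes = frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES
    return link_bits_per_second / (wire_bytes * 8)


if __name__ == "__main__":
    for label, bps in [("1GigE", 1e9), ("10GigE", 10e9)]:
        print(f"{label}: {max_pps(bps) / 1e6:.2f} Mpps at 64-byte frames")
        print(f"{label}: {max_pps(bps, 1518) / 1e6:.3f} Mpps at 1518-byte frames")
```

Small packets are the worst case for capture: per-packet costs (interrupts, system calls, copies) dominate, which is why the 64-byte figure is the one quoted.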
others to check out

Ringmap – FreeBSD – code.google.com/p/ringmap/

Zero-copy sockets – FreeBSD: man zero_copy
Requires specific NICs
Recompile kernel with “options ZERO_COPY_SOCKETS”
The zero copy send and zero copy receive code can be individually turned
off via the kern.ipc.zero_copy.send and kern.ipc.zero_copy.receive sysctl
variables respectively.

MMAP() libpcap – Linux - http://public.lanl.gov/cpw/
Interface Configuration
Linux
/etc/network/interfaces
auto eth0
iface eth0 inet manual
    up ifconfig eth0 0.0.0.0 -arp up
    up ip link set eth0 promisc on
    up ip link set eth0 multicast on
    up ip link set eth0 mtu 1514
    down ip link set eth0 promisc off
    down ifconfig eth0 down
auto eth1
iface eth1 inet manual
    up ifconfig eth1 0.0.0.0 -arp up
    up ip link set eth1 promisc on
    up ip link set eth1 multicast on
    up ip link set eth1 mtu 1514
    down ip link set eth1 promisc off
    down ifconfig eth1 down

FreeBSD
/etc/rc.conf
ifconfig_em0="inet 0.0.0.0 -arp promisc multicast mtu 1514 polling"
ifconfig_em1="inet 0.0.0.0 -arp promisc multicast mtu 1514 polling"

Bridging two interfaces (Linux)
brctl addbr br0
brctl addif br0 eth0 eth1
ifconfig br0 up
Useful Applications








snort, ntop, tcpdump, iftop
trafshow, wireshark, tshark, tcpick
tcpflow, etherape, ngrep, tcptrack
suricata, bro-ids, ttt
xplico, ifstat
iptraf, bmon, bwm-ng, slurm
dsniff, p0f, tcptrace, tcpreplay
ipsumdump, speedometer
ntop
ntop -d -L -u ntop --access-log-file=/var/log/ntop/access.log -b -C --output-packet-path=/var/log/ntopsuspicious.log --local-subnets 192.168.1.0/24,192.168.2.0/24,192.168.3.0/24 -o -M -p /etc/ntop/protocol.list -i br0,eth0,eth1,eth2,eth3,eth4,eth5 -o /var/log/ntop
netsniff-ng
Linux, libpcap independent, zero-copy mechanism
Kernel compiled with CONFIG_PACKET_MMAP
Daemonlogger
Packet Logger & Soft Tap
This is a libpcap-based program. It has two runtime modes:
1) It sniffs packets and spools them straight to the disk and can daemonize itself for background packet logging.
2) It sniffs packets and rewrites them to a second interface, essentially acting as a soft tap. It can also do this in daemon mode.
etherape
iftop
IPTraf
Trafshow
tcpick
tcpstat
speedometer
bmon
Contact


Questions, comments, criticism:
[email protected]
More info:
sickbits.networklabs.org/other/packetcapt
dclinux.org