    Add eBPF connection tracking without dependencies on kernel headers · 9920c4ea
    Iago López Galeiras authored
    Based on work from Lorenzo, updated by Iago, Alban, Alessandro and
    Michael.
    
    This PR adds connection tracking using eBPF. This feature is not enabled by default.
    For now, you can enable it by launching scope with the following command:
    
    ```
    sudo ./scope launch --probe.ebpf.connections=true
    ```
    
    This patch lets Scope get notified of every connection event without
    parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/*, which improves
    performance.
    
    We vendor https://github.com/iovisor/gobpf in Scope to load the
    pre-compiled eBPF program and https://github.com/weaveworks/tcptracer-bpf
    to guess the offsets of the structures we need in the kernel. This way
    we don't need a different pre-compiled eBPF object file per kernel.
    The pre-compiled eBPF program is included in the vendoring of
    tcptracer-bpf.
    
    The eBPF program uses kprobes/kretprobes on the following kernel functions:
    - tcp_v4_connect
    - tcp_v6_connect
    - tcp_set_state
    - inet_csk_accept
    - tcp_close
    
    It generates "connect", "accept" and "close" events containing the
    connection tuple as well as the pid and netns.
    Note: IPv6 events are not supported in Scope and are therefore not
    passed on.
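
As an illustration of the event shape described above, here is a minimal Go sketch of an event carrying the tuple, pid and netns, plus a filter that drops IPv6 events before they are passed on. All names (`ConnEvent`, `Family`, `forwardable`) are hypothetical and chosen for this note; they are not the tcptracer-bpf or Scope API.

```go
package main

import "fmt"

// EventType, Family and ConnEvent are hypothetical names used only for
// illustration; they are not taken from tcptracer-bpf or Scope.
type EventType string

const (
	EventConnect EventType = "connect"
	EventAccept  EventType = "accept"
	EventClose   EventType = "close"
)

type Family int

const (
	IPv4 Family = iota
	IPv6
)

// ConnEvent carries the connection tuple plus the pid and netns.
type ConnEvent struct {
	Type    EventType
	Family  Family
	SrcAddr string
	SrcPort uint16
	DstAddr string
	DstPort uint16
	PID     uint32
	NetNS   uint32
}

// forwardable returns the events that would be passed on, in order,
// skipping every IPv6 event.
func forwardable(events []ConnEvent) []ConnEvent {
	var out []ConnEvent
	for _, ev := range events {
		if ev.Family == IPv6 {
			continue
		}
		out = append(out, ev)
	}
	return out
}

func main() {
	events := []ConnEvent{
		{Type: EventConnect, Family: IPv4, SrcAddr: "10.0.0.1", SrcPort: 40000,
			DstAddr: "10.0.0.2", DstPort: 80, PID: 42, NetNS: 4026531957},
		{Type: EventAccept, Family: IPv6, SrcAddr: "::1", SrcPort: 40001,
			DstAddr: "::1", DstPort: 80, PID: 43, NetNS: 4026531957},
	}
	for _, ev := range forwardable(events) {
		fmt.Printf("%s %s:%d -> %s:%d pid=%d netns=%d\n", ev.Type,
			ev.SrcAddr, ev.SrcPort, ev.DstAddr, ev.DstPort, ev.PID, ev.NetNS)
	}
}
```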
    
    probe/endpoint/ebpf.go maintains the list of connections. Similarly to
    conntrack, it also keeps the dead connections for one iteration in order
    to report short-lived connections.
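
The "keep dead connections for one iteration" idea can be sketched roughly as follows. This is a simplified, single-goroutine illustration under assumed names (`tracker`, `handle`, `walk` are hypothetical), not the actual code in probe/endpoint/ebpf.go; a real tracker fed by asynchronous events would also need locking.

```go
package main

import "fmt"

// tuple, conn and tracker are hypothetical types for this sketch.
type tuple struct {
	srcAddr, dstAddr string
	srcPort, dstPort uint16
	netns            uint32
}

type conn struct {
	tuple tuple
	pid   uint32
}

type tracker struct {
	open   map[tuple]conn // connections currently believed to be open
	closed []conn         // connections closed since the last walk
}

func newTracker() *tracker {
	return &tracker{open: map[tuple]conn{}}
}

// handle applies one event. A "close" moves a known connection from the
// open set to the closed list instead of dropping it immediately.
func (t *tracker) handle(kind string, c conn) {
	switch kind {
	case "connect", "accept":
		t.open[c.tuple] = c
	case "close":
		if known, ok := t.open[c.tuple]; ok {
			delete(t.open, c.tuple)
			t.closed = append(t.closed, known)
		}
	}
}

// walk reports open connections and those closed since the previous walk,
// then forgets the closed ones, so each is reported for one iteration only.
func (t *tracker) walk(f func(conn)) {
	for _, c := range t.open {
		f(c)
	}
	for _, c := range t.closed {
		f(c)
	}
	t.closed = nil
}

func main() {
	t := newTracker()
	c := conn{tuple: tuple{"10.0.0.1", "10.0.0.2", 40000, 80, 1}, pid: 42}

	// A short-lived connection opens and closes between two reports.
	t.handle("connect", c)
	t.handle("close", c)

	for i := 1; i <= 2; i++ {
		n := 0
		t.walk(func(conn) { n++ })
		fmt.Printf("report %d: %d connection(s)\n", i, n)
	}
}
```

With this layout a connection that opens and closes between two reports still shows up in the next report, and disappears from the one after it.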
    
    The code for parsing /proc/$pid/net/tcp{,6} and /proc/$pid/fd/* is still
    there and is still used at start-up, because eBPF only gives us the
    events and not the initial state. However, the /proc parsing for the
    initial state is now done in the foreground instead of in the
    background, via newForegroundReader().
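
For context, entries in the /proc/net/tcp format encode each IPv4 endpoint as a hex `ADDR:PORT` pair. The sketch below decodes one such pair in Go. `parseHexAddr` is a hypothetical helper written for this note, not Scope's actual parser, and it assumes a little-endian host, where the address bytes appear reversed.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// parseHexAddr decodes an IPv4 endpoint such as "0100007F:1F90".
// Assumption: little-endian host, so the four address bytes are reversed.
func parseHexAddr(s string) (net.IP, uint16, error) {
	parts := strings.Split(s, ":")
	if len(parts) != 2 {
		return nil, 0, fmt.Errorf("malformed address %q", s)
	}
	raw, err := hex.DecodeString(parts[0])
	if err != nil {
		return nil, 0, err
	}
	if len(raw) != 4 {
		return nil, 0, fmt.Errorf("expected 4 address bytes, got %d", len(raw))
	}
	ip := net.IPv4(raw[3], raw[2], raw[1], raw[0])
	// The port is plain big-endian hex, e.g. 1F90 is 8080.
	port, err := strconv.ParseUint(parts[1], 16, 16)
	if err != nil {
		return nil, 0, err
	}
	return ip, uint16(port), nil
}

func main() {
	// Shortened sample entry: slot, local address, remote address, state.
	line := "   0: 0100007F:1F90 0200007F:C350 01"
	fields := strings.Fields(line)
	for _, f := range fields[1:3] {
		ip, port, err := parseHexAddr(f)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s:%d\n", ip, port)
	}
}
```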
    
    NAT resolution on connections from eBPF works in the same way as it did
    on connections from /proc: by using conntrack. One of the two conntrack
    instances is started only to get the initial state and is then stopped,
    since eBPF detects short-lived connections.
    
    Scope Docker image size comparison:
    - weaveworks/scope in current master: 22 MB (compressed), 68 MB
      (uncompressed)
    - weaveworks/scope with this patchset: 23 MB (compressed), 69 MB
      (uncompressed)
    
    Fixes #1168 (walking /proc to obtain connections is very expensive)
    
    Fixes #1260 (Short-lived connections not tracked for containers in
    shared networking namespaces)
    
    Fixes #1962 (Port ebpf tracker to Go)
    
    Fixes #1961 (Remove runtime kernel header dependency from ebpf tracker)