
The rise of cloud-native architectures has fundamentally changed how we think about container networking, and eBPF has emerged as a leading technology for building high-performance Kubernetes CNI implementations. Traditional networking built on iptables and virtual interfaces hits its scalability limits as clusters grow to thousands of nodes and tens of thousands of pods. This is where Cilium, powered by eBPF, delivers a step change in how we handle packet processing, security enforcement, and multi-cluster connectivity in Kubernetes environments.
Understanding the eBPF Revolution
eBPF, or Extended Berkeley Packet Filter, represents one of the most significant innovations in Linux kernel programming. Unlike traditional kernel modules that require compilation and loading as separate units, eBPF programs are verified by the kernel before execution and run in a sandboxed virtual machine inside the kernel itself. This allows for safe, dynamic insertion of custom code at virtually any point in the kernel's execution path.
For Kubernetes networking specifically, eBPF networking enables packet processing decisions to happen directly in the kernel context without the overhead of context switches to userspace. When a packet arrives at a node, an eBPF program attached to the network interface can inspect, modify, route, or drop that packet immediately. This eliminates the traditional path where packets traverse the kernel network stack, hit iptables rules sequentially, and potentially get forwarded to userspace proxies.
Traditional CNI plugins like Flannel and Calico's iptables mode rely on virtual Ethernet pairs, routing tables, and sequential rule evaluation. As your cluster scales, these rule chains grow linearly. A cluster with 5000 services might generate 50,000 iptables rules that every packet must traverse. Cilium replaces this entirely with eBPF maps that provide O(1) lookup complexity regardless of cluster size.
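The scaling difference is easy to model. The sketch below is a toy Python model, not Cilium code: it contrasts an iptables-style sequential rule chain with a hash-table lookup of the kind eBPF maps provide (service names and rule counts are illustrative).

```python
# Toy model: iptables-style linear rule scan vs. eBPF-map-style hash lookup.
# All keys and backend names are illustrative, not real cluster data.

def iptables_lookup(rules, dst):
    """Walk the rule chain top to bottom -- O(n) in the number of rules."""
    for i, (match, backend) in enumerate(rules):
        if match == dst:
            return backend, i + 1  # backend plus number of rules traversed
    return None, len(rules)

def ebpf_map_lookup(service_map, dst):
    """Single hash-table probe -- O(1) regardless of cluster size."""
    return service_map.get(dst)

# 5000 services, one rule / map entry each (ports keep the keys unique).
rules = [(f"10.96.0.{i % 256}:{i}", f"backend-{i}") for i in range(5000)]
service_map = dict(rules)

dst, _ = rules[-1]  # worst case: the last rule in the chain
backend, traversed = iptables_lookup(rules, dst)
print(backend, traversed)                  # backend-4999 5000
print(ebpf_map_lookup(service_map, dst))   # backend-4999
```

The linear scan touches every rule before matching the last one, while the map probe costs the same no matter how many services exist; this is the asymmetry that makes rule chains painful at scale.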
Cilium Architecture Deep Dive
At its core, Cilium runs an agent as a DaemonSet on every node, coordinated by a centralized operator. The Cilium agent compiles and loads eBPF programs onto network interfaces, manages eBPF maps for state storage, and watches the Kubernetes API server for pod, service, and network policy changes.
The key architectural components include:
- Cilium Agent: Runs on each node and handles eBPF program lifecycle, map management, and API synchronization. The agent translates Kubernetes resources into eBPF data structures.
- eBPF Programs: Compiled from C code using LLVM/Clang, these attach to kernel hooks including XDP (eXpress Data Path), TC (Traffic Control), socket operations, and cgroup networking. Each hook point serves a specific purpose in the packet journey.
- eBPF Maps: Hash tables stored in kernel memory that maintain state across packet processing. These store endpoint identities, policy rules, connection tracking, and load balancer backends with constant-time lookups.
- Hubble: The observability layer built on top of Cilium, providing flow logging, metrics export, and network debugging capabilities through eBPF-derived data.
Setting Up Cilium in Production
Here is a production-grade installation using the Cilium CLI:
# Install Cilium CLI
curl -L --remote-name-all https://github.com/cilium/cilium-cli/releases/latest/download/cilium-linux-amd64.tar.gz
sudo tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
rm cilium-linux-amd64.tar.gz
# Install Cilium with production options
cilium install \
--version 1.15.0 \
--set ipam.mode=kubernetes \
--set hubble.enabled=true \
--set hubble.relay.enabled=true \
--set hubble.ui.enabled=true \
--set prometheus.enabled=true \
--set operator.prometheus.enabled=true
# Verify installation
cilium status --wait
The ipam.mode=kubernetes setting uses the Kubernetes Node CIDR allocator, which works well with most cloud provider integrations. For on-premises deployments or advanced IPAM requirements, Cilium supports various modes including Cluster Pool, Multi-Pool, and CRD-backed allocation.
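As an example, a cluster-pool configuration can be expressed as Helm values; the pod CIDR and mask size below are placeholders to adapt to your own address plan.

```yaml
ipam:
  mode: cluster-pool
  operator:
    # Placeholder pod CIDR; must not overlap node or service ranges
    clusterPoolIPv4PodCIDRList:
      - "10.128.0.0/9"
    # A /24 per node gives roughly 254 pod IPs per node
    clusterPoolIPv4MaskSize: 24
```

With a Helm-based install, these values are applied via helm install cilium cilium/cilium -n kube-system -f values.yaml.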
Deploying Hubble alongside Cilium provides critical visibility into your network. The relay component aggregates flows from all agents, while the UI offers visual debugging of connectivity issues. Prometheus metrics enable building dashboards for network latency, policy drops, and connection statistics.
Advanced Network Policy with Layer 7 Filtering
Where Cilium truly differentiates itself from other Kubernetes CNI solutions is its Layer 7 policy capabilities. While standard Kubernetes NetworkPolicies only operate at L3/L4 (IP addresses and ports), Cilium can enforce policies based on HTTP paths, gRPC services, Kafka topics, and DNS queries.
Here is an example of an L7-aware policy that restricts HTTP access:
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: http-restrictions
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: frontend
  egress:
    - toEndpoints:
        - matchLabels:
            app: backend
            k8s:io.kubernetes.pod.namespace: production
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/api/v1/users/.*"
              - method: POST
                path: "/api/v1/orders"
    - toFQDNs:
        - matchPattern: "*.stripe.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
This policy permits the frontend service to make GET requests to /api/v1/users/* and POST requests to /api/v1/orders on the backend service. It also allows HTTPS connections to any Stripe subdomain. Attempts to access other paths or methods are dropped and logged through Hubble.
L3/L4 enforcement happens entirely within the kernel eBPF programs. L7 rules, however, require protocol parsing: Cilium uses eBPF to transparently redirect matching flows to a node-local Envoy proxy, which evaluates the HTTP rules and coordinates verdicts with the eBPF layer through shared maps. Only traffic covered by L7 rules takes this detour; everything else stays on the in-kernel fast path.
DNS-based policies are particularly powerful for egress control. Instead of maintaining lists of IP addresses for external services that change constantly, you can define policies based on fully qualified domain names. Cilium maintains a DNS cache and automatically updates allowed IPs as DNS records change.
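A minimal FQDN egress policy might look like the following; the workload label and domain are illustrative. Note the explicit DNS rule toward kube-dns: it routes lookups through Cilium's DNS proxy, which is how the agent learns which IPs currently back the pattern.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: egress-to-github
spec:
  endpointSelector:
    matchLabels:
      app: ci-runner          # illustrative workload label
  egress:
    # Allow DNS lookups via kube-dns so the DNS proxy can observe them
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
          rules:
            dns:
              - matchPattern: "*.github.com"
    # Allow HTTPS to whichever IPs the pattern currently resolves to
    - toFQDNs:
        - matchPattern: "*.github.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
```

Without the DNS rule, lookups bypass the proxy and the toFQDNs selector has no resolved IPs to match against, so the HTTPS egress would be dropped.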
Cluster Mesh: Multi-Cluster Networking Solved
Kubernetes deployments rarely stop at a single cluster. Organizations run multiple clusters for high availability, geographic distribution, and workload isolation. Cilium's Cluster Mesh capability enables seamless pod-to-pod connectivity across cluster boundaries with automatic service discovery.
Cluster Mesh works by routing traffic between cluster nodes over their existing network connectivity, with transparent encryption available as an additional layer. Each cluster maintains autonomy, but services can be exposed globally: when a service carries the appropriate annotation, Cilium's control plane propagates its endpoints across clusters.
Configuration requires that all clusters share a common Certificate Authority (CA) for mutual TLS authentication, and that each cluster was installed with a unique cluster name and ID. Here is a minimal setup for connecting two clusters:
# Share cluster1's Cilium CA with cluster2 (before installing Cilium there)
kubectl --context kind-cluster1 get secret -n kube-system cilium-ca -o yaml | \
  kubectl --context kind-cluster2 apply -f -
# Enable the Cluster Mesh API server in both clusters
cilium clustermesh enable --context kind-cluster1
cilium clustermesh enable --context kind-cluster2
# Connect the clusters
cilium clustermesh connect --context kind-cluster1 \
  --destination-context kind-cluster2
# Verify connectivity
cilium clustermesh status --context kind-cluster1 --wait
Once connected, services can use the global service annotation to become discoverable across clusters:
apiVersion: v1
kind: Service
metadata:
  name: payments-api
  annotations:
    io.cilium/global-service: "true"
spec:
  selector:
    app: payments
  ports:
    - port: 8080
This service is now accessible from any pod in any connected cluster using the standard Kubernetes DNS name. Cilium handles load balancing across all available backends transparently, with health checking to remove failed instances.
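By default, traffic to a global service is balanced across backends in every connected cluster. Recent Cilium releases also support an affinity annotation to prefer local backends and only fail over to remote clusters when local ones are unhealthy; a sketch using the newer service.cilium.io annotation names:

```yaml
metadata:
  annotations:
    # Newer form of the global-service annotation
    service.cilium.io/global: "true"
    # Prefer healthy local backends; fail over to remote clusters
    service.cilium.io/affinity: "local"
```

This keeps cross-cluster hops off the hot path while retaining the mesh as a fallback.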
Performance Tuning and Optimization
Achieving optimal performance with eBPF networking requires understanding several kernel-level tunables. The following configurations maximize Cilium throughput:
The stock Cilium DaemonSet already runs privileged and mounts the BPF filesystem at /sys/fs/bpf, so no manifest changes are needed; each option is toggled through the agent configuration (agents pick up changes after a restart):
# Enable native XDP acceleration for load balancing on supported NICs
cilium config set bpf-lb-acceleration native
# Configure BPF-based masquerading instead of iptables
cilium config set enable-bpf-masquerade true
# Enable Bandwidth Manager for traffic shaping
cilium config set enable-bandwidth-manager true
# Size dynamic BPF maps for high connection counts
# (fraction of system memory; 0.0025 is the default)
cilium config set bpf-map-dynamic-size-ratio 0.005
The enable-bpf-masquerade option replaces iptables SNAT rules with eBPF-based handling for pod-to-external traffic. This removes another iptables dependency and improves performance for egress-heavy workloads.
Bandwidth Manager enables kernel-native traffic shaping without requiring sidecars or external controllers. Pods can specify bandwidth limits through annotations, and the eBPF layer enforces these at the kernel level:
apiVersion: v1
kind: Pod
metadata:
  name: bandwidth-limited
  annotations:
    kubernetes.io/ingress-bandwidth: 10M
    kubernetes.io/egress-bandwidth: 10M
spec:
  containers:
    - name: app
      image: nginx:1.25
Observability with Hubble
Network troubleshooting in Kubernetes has historically been painful. Hubble changes this by providing flow-level visibility powered by eBPF. Every connection attempt, every policy decision, and every dropped packet can be observed in real-time.
Basic flow monitoring:
# Live view of all flows
hubble observe --follow
# Filter by namespace
hubble observe --namespace production --follow
# Show dropped flows with the drop reason
hubble observe --verdict DROPPED
# Enable Hubble metrics export for Prometheus
cilium config set hubble-metrics "drop tcp flow dns http"
Hubble can identify the specific network policy that caused a drop, the identity of both source and destination endpoints, and the L7 context when applicable. This eliminates trial-and-error debugging when connectivity issues arise.
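hubble observe also supports JSON output (-o json), which makes flows easy to post-process. The sketch below uses fabricated sample flows with a deliberately simplified schema (real Hubble flow objects carry many more fields) to count drops per destination namespace:

```python
import json
from collections import Counter

# Simplified stand-in for JSON flow output; real Hubble flows have a
# richer schema, but the verdict/destination fields shown here suffice
# for the aggregation idea.
sample = """
{"verdict": "DROPPED", "destination": {"namespace": "production"}}
{"verdict": "FORWARDED", "destination": {"namespace": "production"}}
{"verdict": "DROPPED", "destination": {"namespace": "staging"}}
{"verdict": "DROPPED", "destination": {"namespace": "production"}}
"""

def drops_per_namespace(lines):
    """Count DROPPED flows per destination namespace."""
    counts = Counter()
    for line in lines:
        flow = json.loads(line)
        if flow.get("verdict") == "DROPPED":
            counts[flow["destination"]["namespace"]] += 1
    return counts

print(drops_per_namespace(sample.strip().splitlines()))
# Counter({'production': 2, 'staging': 1})
```

In practice you would pipe hubble observe -o json into a script like this to spot which namespace is accumulating policy drops before digging into individual flows.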
Troubleshooting Common Issues
Policy drops often occur due to DNS resolution failures in FQDN policies. Ensure the Cilium DNS proxy is not being bypassed and that workloads are actually allowed to reach the CoreDNS pods. Use hubble observe --verdict DROPPED to identify the specific policy causing the denial.
For performance issues, check that eBPF programs loaded correctly by running cilium bpf endpoint list on each node. Missing programs indicate kernel incompatibility or resource constraints preventing compilation.
FAQ
Why do my backend pods see a node IP as the source address instead of the real client IP?
With the default SNAT load-balancing mode, traffic entering through a service is source-NATed at the node, so backends see a node IP. Enable Direct Server Return (DSR) mode with cilium config set bpf-lb-mode dsr to preserve client IPs. DSR requires Cilium's kube-proxy replacement and native routing between nodes, but it eliminates the translation step entirely, improving observability and compatibility with applications that log source IPs.
How do I migrate from Calico to Cilium without cluster downtime?
Use the migration guide from the Cilium documentation which leverages Kubernetes rolling updates. Set up Cilium alongside Calico temporarily, migrate workloads node by node, and only remove Calico after verifying Cilium handles all traffic. The cilium-dbg tool can verify endpoint state before proceeding with each node.
Can Cilium encrypt traffic between nodes without using a service mesh?
Yes. Enable WireGuard or IPsec transparent encryption at install time, for example cilium install --set encryption.enabled=true --set encryption.type=wireguard. This encrypts all pod-to-pod traffic crossing node boundaries automatically, with no application changes or sidecar injection. Recent Cilium releases can extend the same transparent encryption to Cluster Mesh traffic when every connected cluster enables it.