Kubernetes networking

After understanding the basic architecture of Kubernetes, the next step is to understand how Kubernetes networking works.

Kubernetes specifies a set of guidelines for networking:

  1. Every Pod in the cluster gets a unique IP.
  2. All Pods should be able to communicate with all other Pods, even the ones on other nodes in the cluster, without using NAT.
  3. The IP a Pod sees itself as is the same IP that others see it as.
  4. Containers within a Pod share the same network namespace.

Based on this set of guidelines, there are 4 networking scenarios:

  • Container-to-Container networking
  • Pod-to-Pod networking
  • Pod-to-Service networking
  • Internet-to-Service networking

Container-to-Container networking

  • It is possible to run multiple containers in a single Pod. When that happens, the containers communicate with each other via localhost and port numbers. This is possible because containers in the same Pod are in the same network namespace, so they share the same network resources.
  • A network namespace is a collection of network resources like network interfaces, firewall configuration, routing tables etc.
  • In Kubernetes, each Pod gets its own network namespace. As a result, all the containers within a Pod have the same IP address and port space and can communicate with each other via localhost.
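
To make the localhost idea concrete, here is a minimal sketch in Python. It is not Kubernetes code: two threads in one process stand in for two containers in one Pod, since both share the same loopback interface. The names run_sidecar and call_sidecar are hypothetical, chosen only for this illustration.

```python
import socket
import threading


def run_sidecar(server_sock: socket.socket) -> None:
    """Stand-in for container B: accept one connection and send a reply."""
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"sidecar got: " + data)


def call_sidecar(port: int, payload: bytes) -> bytes:
    """Stand-in for container A: reach container B via localhost and a port."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        return client.recv(1024)


def demo() -> bytes:
    # Port 0 lets the OS pick a free port for this demo; in a real Pod the
    # port is whatever the container is configured to listen on.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]

    worker = threading.Thread(target=run_sidecar, args=(server,))
    worker.start()
    reply = call_sidecar(port, b"ping")
    worker.join()
    server.close()
    return reply


if __name__ == "__main__":
    print(demo())
```

Because the port space is shared as well, two containers in the same Pod cannot both listen on the same port, just as a second bind to the same port would fail here.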

Pod-to-Pod networking

  • Every Pod has a real IP address and each Pod in the cluster communicates with other Pods using this IP address irrespective of whether the Pods are on the same node or different nodes in the cluster.
  • We already know that each Pod has its own network namespace.
  • Each node also has a root network namespace, which holds the node's own network interface and allows communication with the external network.
  • Let's look at how Pods on the same node communicate:
    - For each Pod, a Virtual Ethernet device (veth) pair consisting of two virtual interfaces is created.
    - One side of the veth is attached to the root network namespace and the other side to the Pod’s network namespace.
    - At this point, the Pods are attached to the root network namespace.
    - Now, to actually connect the Pod network namespaces to each other through the root network namespace, a virtual network bridge is created. A bridge joins two or more network segments at Layer 2.
    - So when a Pod sends traffic to another Pod, the packets first go to the default network device (typically eth0) in the source Pod’s network namespace.
    - From there, the packets cross the veth pair and reach the network bridge in the root namespace.
    - ARP is used on the bridge to resolve the destination Pod’s IP address to a MAC address, and the bridge forwards the packets to the virtual Ethernet device of the destination Pod.
    - All of this happens inside the node. The packets never leave it, since both Pods are on the same node.
  • When Pods are on different nodes:
    - Every node in the cluster is assigned a CIDR block for Pods running on that node.
    - In this case, when a Pod sends traffic to a Pod on a different node, the network bridge in the root namespace won’t find the destination Pod on the same node. So the bridge sends the packets out of the root network namespace’s default Ethernet device into the network.
    - The network then routes the packets to the destination node, chosen by the CIDR block that contains the destination Pod IP. On that node, the packets pass through the bridge in the root network namespace and the destination Pod’s virtual Ethernet device to reach the Pod.
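
The two cases above can be sketched as a toy model in Python. This is an illustration under stated assumptions, not how any particular network plugin is implemented: the Node class, the trace function, the interface names (veth0, eth0) and the 10.244.x.0/24 ranges are all hypothetical. Only the standard-library ipaddress module is used to test whether an IP belongs to a node's CIDR block.

```python
import ipaddress
from typing import Dict, List


class Node:
    """Toy node: a Pod CIDR block plus a bridge that knows its local Pods."""

    def __init__(self, name: str, pod_cidr: str) -> None:
        self.name = name
        self.pod_cidr = ipaddress.ip_network(pod_cidr)
        # Maps a local Pod IP to the root-namespace end of its veth pair.
        self.bridge_ports: Dict[str, str] = {}

    def add_pod(self, pod_ip: str) -> None:
        if ipaddress.ip_address(pod_ip) not in self.pod_cidr:
            raise ValueError(f"{pod_ip} is outside {self.pod_cidr}")
        self.bridge_ports[pod_ip] = f"veth{len(self.bridge_ports)}"


def trace(nodes: List[Node], src_ip: str, dst_ip: str) -> List[str]:
    """Return the hops a packet takes from src_ip to dst_ip."""
    src = next(n for n in nodes if src_ip in n.bridge_ports)
    hops = [f"{src.name}:{src.bridge_ports[src_ip]}", f"{src.name}:bridge"]

    if dst_ip in src.bridge_ports:
        # Same node: the bridge hands the packet straight to the other veth.
        hops.append(f"{src.name}:{src.bridge_ports[dst_ip]}")
        return hops

    # Different node: leave through the node's default device, then let the
    # "network" pick the node whose CIDR block contains the destination IP.
    hops.append(f"{src.name}:eth0")
    dst_addr = ipaddress.ip_address(dst_ip)
    dst = next(n for n in nodes if dst_addr in n.pod_cidr)
    hops += [
        f"{dst.name}:eth0",
        f"{dst.name}:bridge",
        f"{dst.name}:{dst.bridge_ports[dst_ip]}",
    ]
    return hops


def build_demo_cluster() -> List[Node]:
    node_a = Node("node-a", "10.244.1.0/24")
    node_a.add_pod("10.244.1.2")
    node_a.add_pod("10.244.1.3")
    node_b = Node("node-b", "10.244.2.0/24")
    node_b.add_pod("10.244.2.2")
    return [node_a, node_b]


if __name__ == "__main__":
    cluster = build_demo_cluster()
    print(trace(cluster, "10.244.1.2", "10.244.1.3"))  # same node
    print(trace(cluster, "10.244.1.2", "10.244.2.2"))  # different nodes
```

Note that no address is rewritten anywhere in the trace, which matches the guideline that Pods talk to each other without NAT.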

Pod-to-Service networking

  • Pods are ephemeral. The number of Pods can go up or down based on scaling requirements, node reboots, network outages etc. So the Pod IP addresses are not static.
  • Kubernetes uses Services to address this problem. A Service acts as an abstraction layer and assigns a single virtual IP address, called the Cluster IP, to a group of Pod IP addresses.
  • Any traffic sent to the Service’s Cluster IP will be routed to the set of Pods that are associated with the Service.
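
As a rough illustration of why a stable Cluster IP helps, the sketch below models a Service as a fixed virtual IP in front of a changing list of Pod IPs. The Service class, its method names and the round-robin choice are assumptions made for this example; how a real cluster picks a backend depends on how its proxying is configured.

```python
from typing import List


class Service:
    """Toy Service: one stable virtual IP in front of changing Pod IPs."""

    def __init__(self, cluster_ip: str, pod_ips: List[str]) -> None:
        self.cluster_ip = cluster_ip
        self.pod_ips = list(pod_ips)
        self._next = 0

    def replace_pods(self, pod_ips: List[str]) -> None:
        # Pods come and go; clients keep using the same Cluster IP.
        self.pod_ips = list(pod_ips)
        self._next = 0

    def route(self, dst_ip: str) -> str:
        """Return the Pod IP that receives traffic sent to the Cluster IP."""
        if dst_ip != self.cluster_ip:
            raise ValueError(f"{dst_ip} is not this Service's Cluster IP")
        if not self.pod_ips:
            raise RuntimeError("no Pods are associated with this Service")
        # Round-robin is an assumption made to keep the demo deterministic.
        backend = self.pod_ips[self._next % len(self.pod_ips)]
        self._next += 1
        return backend


if __name__ == "__main__":
    svc = Service("10.96.0.10", ["10.244.1.2", "10.244.2.2"])
    print([svc.route("10.96.0.10") for _ in range(3)])
    svc.replace_pods(["10.244.3.5"])  # old Pods are gone, the IP is not
    print(svc.route("10.96.0.10"))
```

The caller only ever needs to know 10.96.0.10 (an example address); the set of Pods behind it can change without the caller noticing.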

Internet-to-Service networking

  • So far we’ve seen how traffic is routed inside a cluster. For traffic between the cluster and the external network, there are two scenarios:

    #1: Egress: Sending traffic from a Kubernetes Service to the Internet
       ~ Some kind of gateway router is needed to route the traffic to the Internet.
       ~ The Pod IP is only recognisable within the cluster.
       ~ Data from the Pods is NATed at the Node so that the gateway router thinks the traffic is coming from the Node.
       ~ The gateway router then NATs the traffic again, translating the internal Node IP to an external IP, and sends the traffic to the Internet.
       ~ On receiving the response, the traffic follows the reverse path all the way back to the Pod.
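
The double translation in the egress path can be sketched with a small source-NAT table in Python. This is a simplified model, not real packet handling: the SourceNat class, the starting port 40000 and all IP addresses are hypothetical, and every outbound call allocates a fresh port instead of reusing an existing mapping.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class SourceNat:
    """Rewrites the source of outgoing packets and remembers how to undo it."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._next_port = first_port
        self._reverse: Dict[int, Endpoint] = {}

    def outbound(self, src: Endpoint) -> Endpoint:
        """Replace the source with this device's IP and a fresh port."""
        port = self._next_port
        self._next_port += 1
        self._reverse[port] = src
        return (self.public_ip, port)

    def inbound(self, dst: Endpoint) -> Endpoint:
        """Map a reply's destination back to the original source."""
        return self._reverse[dst[1]]


if __name__ == "__main__":
    node = SourceNat("192.168.0.11")    # hypothetical internal Node IP
    gateway = SourceNat("203.0.113.7")  # hypothetical external IP
    pod = ("10.244.1.2", 5000)

    seen_by_gateway = node.outbound(pod)
    seen_by_internet = gateway.outbound(seen_by_gateway)
    print(pod, "->", seen_by_gateway, "->", seen_by_internet)

    # The response retraces the path in reverse, all the way to the Pod.
    back_at_node = gateway.inbound(seen_by_internet)
    print(node.inbound(back_at_node))
```

The remote server only ever sees the external IP, which is why the Pod IP does not need to be meaningful outside the cluster.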

    #2: Ingress: Getting traffic from the Internet to a Kubernetes Service
       ~ Ingress in general is made up of two solutions that work at different layers of the network stack:
           -- Service LoadBalancer (Layer 4)
                - Specify a LoadBalancer Service type when the Service is created.
                - LoadBalancer is implemented by a cloud controller.
                - The LoadBalancer distributes the incoming traffic to all the nodes in the cluster.
                - Firewall rules on every node will filter and pass the traffic to the correct Pod.
                - The response from the Pod traverses the same reverse path and is NATed with the LoadBalancer IP address back to the source.
           -- Ingress controller (Layer 7)
                - Layer 7 is the application layer, where HTTP and HTTPS operate.
                - Specify a NodePort Service type when the Service is created.
                - The Kubernetes control plane allocates a port on every Node, and each Node proxies that port to the Service.
                - An Ingress object is used to expose a Node’s port to the Internet.
                - The traffic flows through an Ingress in much the same way as through a LoadBalancer.
                - There are two key differences. First, an Ingress is aware of the URL’s path, so it can route traffic to Services based on that path. Second, the initial connection between the Ingress and the Node goes through the port exposed on the Node for each Service.
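
Path-aware routing at Layer 7 can be sketched in a few lines of Python. This is a minimal illustration, not an Ingress controller: the rule table, the Service names and the function names are hypothetical, and the longest-matching-prefix tie-break is an assumption made for this example.

```python
from typing import Dict, Optional


def prefix_matches(prefix: str, path: str) -> bool:
    """True if path falls under prefix, compared on whole path segments."""
    if prefix == "/":
        return True
    base = prefix.rstrip("/")
    return path == base or path.startswith(base + "/")


def pick_service(rules: Dict[str, str], path: str) -> Optional[str]:
    """Return the Service name for a request path, or None if no rule fits."""
    candidates = [prefix for prefix in rules if prefix_matches(prefix, path)]
    if not candidates:
        return None
    # Assumption: the most specific (longest) matching prefix wins.
    return rules[max(candidates, key=len)]


if __name__ == "__main__":
    rules = {"/": "web", "/api": "api-svc", "/api/v2": "api-v2-svc"}
    for request_path in ("/index.html", "/api/users", "/api/v2/users"):
        print(request_path, "->", pick_service(rules, request_path))
```

A Layer 4 load balancer cannot make this decision, because it only sees IP addresses and ports, not the URL inside the HTTP request.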

For more detailed information, check out these links.