K8s Building Blocks

Nodes

  • Nodes are virtual identities assigned by Kubernetes to the systems that are part of the cluster - whether Virtual Machines, bare-metal servers, Containers, etc.

    • A node is a logical/virtual representation of a machine in a Kubernetes cluster that is used to run workloads.

    • Kubernetes doesn't directly interact with the hardware. Instead, it assigns virtual identities called "nodes" to these machines—whether they are:

      • Virtual Machines (VMs) in the cloud,

      • Bare-metal servers (physical machines),

      • Containers (in some setups).

  • These identities are unique to each system, and are used by the cluster for resource accounting and monitoring purposes, which helps with workload management throughout the cluster.

  • Each node is managed with the help of two Kubernetes node agents - kubelet and kube-proxy, while it also hosts a container runtime.

  • The container runtime is required to run all containerized workloads on the node - both control plane agents and user workloads.

  • The kubelet and kube-proxy node agents are responsible for executing all local workload management tasks - interacting with the runtime to run containers, monitoring container and node health, reporting any issues and node state to the API Server, and managing network traffic to containers.

  • Based on their predetermined functions, there are two distinct types of nodes - control plane and worker.

    • A typical Kubernetes cluster includes at least one control plane node, but it may include multiple control plane nodes for the High Availability (HA) of the control plane.

    • In addition, the cluster includes one or more worker nodes to provide resource redundancy in the cluster. There are cases when a single all-in-one cluster is bootstrapped as a single node on a single VM, bare-metal, or Container, when high availability and resource redundancy are not of importance.

    • Such an all-in-one node is a hybrid or mixed node, hosting both control plane agents and user workloads on the same system.

    • Minikube allows us to bootstrap multi-node clusters with distinct, dedicated control plane nodes, however, if our host system has a limited amount of physical resources (CPU, RAM, disk), we can easily bootstrap a single all-in-one cluster as a single node on a single VM or Container.

  • Node identities are created and assigned during the cluster bootstrapping process by the tool responsible for initializing the cluster agents.

  • Minikube uses the default kubeadm bootstrapping tool to initialize the control plane node during the init phase and to grow the cluster by adding worker or control plane nodes with the join phase.

Control Plane Node

  • The control plane nodes run the control plane agents, such as the API Server, Scheduler, Controller Managers, and etcd in addition to the kubelet and kube-proxy node agents, the container runtime, and add-ons for container networking, monitoring, logging, DNS, etc.

Worker Node

  • Worker nodes run the kubelet and kube-proxy node agents, the container runtime, and add-ons for container networking, monitoring, logging, DNS, etc.

Namespace

  • In Kubernetes, namespaces provide a mechanism for isolating groups of resources within a single cluster.

  • Names of resources need to be unique within a namespace, but not across namespaces.

  • Namespace-based scoping is applicable only for namespaced objects (e.g. Deployments, Services, etc.) and not for cluster-wide objects (e.g. StorageClass, Nodes, PersistentVolumes, etc.).

  • If multiple users and teams use the same Kubernetes cluster, we can partition the cluster into virtual sub-clusters using Namespaces.

  • Generally, Kubernetes creates four Namespaces out of the box: kube-system, kube-public, kube-node-lease, and default.

    • The kube-system Namespace contains the objects created by the Kubernetes system, mostly the control plane agents.

    • The default Namespace contains the objects and resources created by administrators and developers, and objects are assigned to it by default unless another Namespace name is provided by the user.

    • kube-public is a special Namespace, which is unsecured and readable by anyone, used for special purposes such as exposing public (non-sensitive) information about the cluster.

    • The newest Namespace is kube-node-lease which holds node lease objects used for node heartbeat data.

  • Good practice, however, is to create additional Namespaces, as desired, to virtualize the cluster and isolate users, developer teams, applications, or tiers.

  • Namespaces are one of the most desired features of Kubernetes, securing its lead against competitors, as they provide a solution to the multi-tenancy requirement of today's enterprise development teams.

  • Resource quotas help users limit the overall resources consumed within Namespaces.

  • LimitRanges help limit the resources consumed by individual Containers and their enclosing objects in a Namespace.
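
  • A minimal sketch of how a Namespace can be paired with a ResourceQuota and a LimitRange (the namespace name and all values below are illustrative assumptions, not taken from the notes above):

    apiVersion: v1
    kind: Namespace
    metadata:
      name: dev-team                # hypothetical team namespace
    ---
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: dev-team-quota
      namespace: dev-team
    spec:
      hard:
        pods: "10"                  # total number of Pods allowed in the namespace
        requests.cpu: "4"           # total CPU that all Pods may request
        requests.memory: 8Gi        # total memory that all Pods may request
    ---
    apiVersion: v1
    kind: LimitRange
    metadata:
      name: dev-team-limits
      namespace: dev-team
    spec:
      limits:
      - type: Container
        defaultRequest:             # request applied to containers that do not set their own
          cpu: 250m
          memory: 128Mi
        default:                    # limit applied to containers that do not set their own
          cpu: 500m
          memory: 256Mi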

Pods

  • A Pod is the smallest Kubernetes workload object.

  • It is the unit of deployment in Kubernetes, which represents a single instance of the application.

  • A Pod is a logical collection of one or more containers, enclosing and isolating them to ensure that they:

    • Are scheduled together on the same host as the Pod.

    • Share the same network namespace, meaning that they share a single IP address originally assigned to the Pod.

    • Have access to mount the same external storage (volumes) and other common dependencies.

  • Pods are ephemeral in nature, and they do not have the capability to self-heal.

    • That is the reason they are used with controllers, or operators (controllers/operators are used interchangeably), which handle Pods' replication, fault tolerance, self-healing, etc.

    • Examples of controllers are Deployments, ReplicaSets, DaemonSets, Jobs, etc.

    • When an operator is used to manage an application, the Pod's specification is nested in the controller's definition using the Pod Template.

  • An example of a stand-alone Pod object's definition manifest in YAML format, without an operator.

    • This represents the declarative method to define an object, and can serve as a template for a much more complex Pod definition manifest if desired:

    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-pod
      labels:
        run: nginx-pod
    spec:
      containers:
      - name: nginx-pod
        image: nginx:1.22.1
        ports:
        - containerPort: 80
    Field          Required   Description
    apiVersion     Y          Represents the API endpoint on the API server which we want to connect to. It must match an existing version for the object type defined.
    kind           Y          The type of object.
    metadata       Y          Holds the object's name, optional labels, namespaces, annotations, etc.
    spec           Y          Beginning of the block defining the desired state of the Pod object - also named the PodSpec.
    spec.template             In controller definitions, the Pod objects are created using the details defined in spec.template.

  • The contents of spec are evaluated for scheduling purposes, then the kubelet of the selected node becomes responsible for running the container image with the help of the container runtime of the node.

  • The Pod's name and labels are used for workload accounting purposes.

Labels

  • Labels are key-value pairs attached to Kubernetes objects such as Pods, ReplicaSets, Nodes, Namespaces, and Persistent Volumes.

  • Labels are used to organize and select a subset of objects, based on the requirements in place.

  • Many objects can have the same Label(s). Labels do not provide uniqueness to objects.

    • Controllers use Labels to logically group together decoupled objects, rather than using objects' names or IDs.

Label selector

  • In general, we expect many objects to carry the same label(s).

  • Via a label selector, the client/user can identify a set of objects. The label selector is the core grouping primitive in Kubernetes.

  • The API currently supports two types of selectors: equality-based and set-based.

    • Equality-Based Selectors

      • Equality-Based Selectors allow filtering of objects based on Label keys and values.

      • Matching is achieved using the =, == (equals, used interchangeably), or != (not equals) operators.

      • For example, with env==dev or env=dev we are selecting the objects where the env Label key is set to value dev.

    • Set-Based Selectors

      • Set-Based Selectors allow filtering of objects based on a set of values.

      • We can use the in and notin operators for Label values, and the exists/does not exist operators for Label keys.

      • For example, with env in (dev,qa) we are selecting objects where the env Label is set to either dev or qa; with !app we select objects with no Label key app.

  • A label selector can be made of multiple requirements which are comma-separated. In the case of multiple requirements, all must be satisfied so the comma separator acts as a logical AND (&&) operator.
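
  • A minimal sketch of both selector types as they appear in a controller's selector stanza (the label keys and values are illustrative); all listed requirements must be satisfied at once:

    selector:
      matchLabels:                  # equality-based requirement: app=guestbook
        app: guestbook
      matchExpressions:             # set-based requirements, ANDed with the one above
      - key: env
        operator: In                # env in (dev,qa)
        values:
        - dev
        - qa
      - key: legacy
        operator: DoesNotExist      # equivalent of !legacy - the key must be absent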

ReplicationControllers

  • Although no longer a recommended controller, a ReplicationController is a complex operator that ensures a specified number of Pod replicas running the desired version of the application container are running at any given time, by constantly comparing the actual state with the desired state of the managed application.

  • If there are more Pods than the desired count, the replication controller randomly terminates the number of Pods exceeding the desired count, and, if there are fewer Pods than the desired count, then the replication controller requests additional Pods to be created until the actual count matches the desired count.

  • Generally a Pod is not deployed independently, as it would not be able to restart itself if terminated in error, because a Pod lacks the much desired self-healing feature that Kubernetes otherwise promises. The recommended method is to use some type of an operator to run and manage Pods.

  • In addition to replication, the ReplicationController operator also supports application updates.

  • However, the default recommended controller is the Deployment which configures a ReplicaSet controller to manage application Pods' lifecycle.
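
  • A minimal ReplicationController sketch (the names, image, and replica count are illustrative); note the equality-based map selector, the only selector type it supports:

    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: nginx-rc
    spec:
      replicas: 3                   # desired number of Pod replicas
      selector:
        app: nginx-rc               # equality-based selector matching the Pod template labels
      template:
        metadata:
          labels:
            app: nginx-rc
        spec:
          containers:
          - name: nginx
            image: nginx:1.22.1
            ports:
            - containerPort: 80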

ReplicaSets

  • A ReplicaSet is, in part, the next-generation ReplicationController, as it implements the replication and self-healing aspects of the ReplicationController.

  • ReplicaSets support both equality- and set-based Selectors, whereas ReplicationControllers only support equality-based Selectors.

  • When a single instance of an application is running there is always the risk of the application instance crashing unexpectedly, or the entire server hosting the application crashing. If relying only on a single application instance, such a crash could adversely impact other applications, services, or clients.

  • To avoid such possible failures, we can run multiple instances of the application in parallel, hence achieving high availability. The lifecycle of the application defined by a Pod will be overseen by a controller - the ReplicaSet. With the help of the ReplicaSet, we can scale the number of Pods running a specific application container image. Scaling can be accomplished manually or through the use of an autoscaler (horizontal scaling).

  • ReplicaSets can be used independently as Pod controllers but they only offer a limited set of features.

  • A set of complementary features are provided by Deployments, the recommended controllers for the orchestration of Pods.

  • Deployments manage the creation, deletion, and updates of Pods.

    • A Deployment automatically creates a ReplicaSet, which then creates a Pod.

    • There is no need to manage ReplicaSets and Pods separately, the Deployment will manage them on our behalf.

        apiVersion: apps/v1
        kind: ReplicaSet
        metadata:
          name: frontend
          labels:
            app: guestbook
            tier: frontend
        spec:
          replicas: 3
          selector:
            matchLabels:
              app: guestbook
          template:
            metadata:
              labels:
                app: guestbook
            spec:
              containers:
              - name: php-redis
                image: gcr.io/google_samples/gb-frontend:v3

Deployments

  • A Deployment provides declarative updates to Pods and ReplicaSets.

  • The DeploymentController is part of the control plane node's controller manager, and as a controller it also ensures that the current state always matches the desired state of our running containerized application.

  • It allows for seamless application updates through rollouts and rollbacks, using the default RollingUpdate strategy, and it directly manages its ReplicaSets for application scaling. It also supports a disruptive, less popular update strategy known as Recreate (a sketch of the strategy fields follows the example below).

        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: nginx-deployment
          labels:
            app: nginx-deployment
        spec:
          replicas: 3
          selector:
            matchLabels:
              app: nginx-deployment
          template:
            metadata:
              labels:
                app: nginx-deployment
            spec:
              containers:
              - name: nginx
                image: nginx:1.20.2
                ports:
                - containerPort: 80
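
  • A minimal sketch of the update strategy fields that could be added under the Deployment's spec above (the values are illustrative); setting the type to Recreate instead terminates all old Pods before new ones are created:

        spec:
          replicas: 3
          strategy:
            type: RollingUpdate     # default update strategy
            rollingUpdate:
              maxUnavailable: 1     # at most one replica below the desired count during an update
              maxSurge: 1           # at most one extra replica above the desired count during an update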

DaemonSets

  • DaemonSets are operators designed to manage node agents.

  • They resemble ReplicaSet and Deployment operators when managing multiple Pod replicas and application updates, but DaemonSets present a distinct feature: they enforce a single Pod replica per Node, on all the Nodes or on a select subset of Nodes.

  • In contrast, the ReplicaSet and Deployment operators by default have no control over the scheduling and placement of multiple Pod replicas on the same Node.

  • DaemonSet operators are commonly used in cases when:

    • We need to collect monitoring data from all Nodes.

    • We need to run storage, networking, or proxy daemons on all Nodes.

    • We need to ensure that a specific type of Pod is running on all Nodes at all times.

  • They are critical API resources in multi-node Kubernetes clusters.

    • The kube-proxy agent running as a Pod on every single node in the cluster, or the Calico or Cilium networking node agent implementing the Pod networking across all nodes of the cluster, are examples of applications managed by DaemonSet operators.

  • Whenever a Node is added to the cluster, a Pod from a given DaemonSet is automatically placed on it.

    • Although it ensures an automated process, the DaemonSet's Pods are placed on all the cluster's Nodes by the controller itself, and not with the help of the default Scheduler.

    • When a Node crashes or is removed from the cluster, the respective DaemonSet operated Pods are garbage collected.

    • If a DaemonSet is deleted, all Pod replicas it created are deleted as well.

    • The placement of DaemonSet Pods is still governed by scheduling properties which may limit its Pods to be placed only on a subset of the cluster's Nodes.

    • This can be achieved with the help of Pod scheduling properties such as nodeSelectors, node affinity rules, taints and tolerations (a minimal fragment illustrating these follows the example below).

    • This ensures that Pods of a DaemonSet are placed only on specific Nodes, such as workers if desired.

    • However, the default Scheduler can take over the DaemonSet scheduling process if the corresponding feature is enabled, in which case node affinity rules are honored again.

        apiVersion: apps/v1
        kind: DaemonSet
        metadata:
          name: fluentd-agent
          namespace: default
          labels:
            k8s-app: fluentd-agent
        spec:
          selector:
            matchLabels:
              k8s-app: fluentd-agent
          template:
            metadata:
              labels:
                k8s-app: fluentd-agent
            spec:
              containers:
              - name: fluentd
                image: quay.io/fluentd_elasticsearch/fluentd:v4.5.2
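
  • A minimal fragment of the DaemonSet's spec.template.spec from the example above, showing the scheduling properties mentioned earlier (the node label and the toleration are illustrative assumptions):

        spec:
          nodeSelector:
            node-role: logging                            # hypothetical node label restricting placement
          tolerations:
          - key: node-role.kubernetes.io/control-plane    # tolerate the control plane taint if placement there is desired
            operator: Exists
            effect: NoSchedule
          containers:
          - name: fluentd
            image: quay.io/fluentd_elasticsearch/fluentd:v4.5.2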

Services

  • A containerized application deployed to a Kubernetes cluster may need to reach other such applications, or it may need to be accessible to other applications and possibly clients.

  • This is problematic because the container does not expose its ports to the cluster's network, and it is not discoverable either. The solution would be a simple port mapping, as offered by a typical container host.

    • However, due to the complexity of the Kubernetes framework, such a simple port mapping is not that "simple". The solution is much more sophisticated, with the involvement of the kube-proxy node agent, IP tables, routing rules, cluster DNS server, all collectively implementing a micro-load balancing mechanism that exposes a container's port to the cluster's network, even to the outside world if desired.

  • A Service is the recommended method to expose any containerized application to the Kubernetes network.

  • The benefits of the Kubernetes Service become more obvious when exposing a multi-replica application, when multiple containers running the same image need to expose the same port.

  • This is where the simple port mapping of a container host would no longer work, but the Service would have no issue implementing such a complex requirement.
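
  • A minimal Service sketch that could expose the nginx Deployment defined earlier (the Service name and the default ClusterIP type are assumptions; the selector and ports follow that example):

    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-service
    spec:
      type: ClusterIP               # default Service type, reachable only from within the cluster
      selector:
        app: nginx-deployment       # matches the Pod labels from the Deployment example
      ports:
      - protocol: TCP
        port: 80                    # port exposed by the Service
        targetPort: 80              # containerPort of the selected Pods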
