This is a summary of some key concepts in Kubernetes (K8s):
A K8s cluster is made up of two kinds of node: worker nodes and master nodes (the control plane).
A node can run multiple Pods, and a Pod can contain multiple containers.
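The containment relationship can be modelled with a few data classes. This is only an illustrative sketch: the class names and fields below are hypothetical and are not the Kubernetes API types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    name: str
    image: str


@dataclass
class Pod:
    name: str
    containers: List[Container] = field(default_factory=list)


@dataclass
class Node:
    name: str
    role: str  # "control-plane" or "worker" (labels chosen for this sketch)
    pods: List[Pod] = field(default_factory=list)


# A worker node running one pod that holds two containers.
web_pod = Pod("web", [Container("app", "nginx"), Container("log-shipper", "busybox")])
worker = Node("worker-1", "worker", [web_pod])

container_count = sum(len(pod.containers) for pod in worker.pods)
print(f"{worker.name} runs {len(worker.pods)} pod(s) and {container_count} container(s)")
```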
Kubectl – kubectl is the command-line interface (CLI) tool for working with a Kubernetes cluster
Minikube – minikube is known as a "local Kubernetes engine". It runs a single-node Kubernetes cluster on your personal computer (Windows, macOS, or Linux) so that you can try out Kubernetes or use it for daily development work.
Kubeadm – kubeadm is a tool used to build Kubernetes clusters.
Kubelet – the kubelet is the agent that runs on each Kubernetes node and creates, updates, and destroys that node's containers according to the Pod specs it receives.
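What the kubelet does can be thought of as a reconciliation loop: compare the containers that should run on the node with the ones that are actually running, then start or stop containers to close the gap. The sketch below is a deliberately simplified model of that idea, not kubelet code; the function name `reconcile` and the use of plain sets of container names are assumptions made for illustration.

```python
from typing import List, Set, Tuple


def reconcile(desired: Set[str], running: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (to_start, to_stop) so that `running` converges on `desired`."""
    to_start = sorted(desired - running)  # wanted but not running yet
    to_stop = sorted(running - desired)   # running but no longer wanted
    return to_start, to_stop


desired = {"app", "sidecar"}
running = {"app", "old-job"}

to_start, to_stop = reconcile(desired, running)
print("start:", to_start)
print("stop:", to_stop)
```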
Resource types in K8s: Job, Service, Deployment, ConfigMap, Pod, Secret, StatefulSet, Role, Namespace, Ingress, ...
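K8s communicates with container runtimes through the CRI (Container Runtime Interface). The CRI is defined by Kubernetes itself; the OCI (Open Container Initiative) standardizes the image and runtime formats that these runtimes implement. The two most common CRI runtimes are:
+ containerd: a container runtime that originated in Docker
+ CRI-O: developed by Red Hat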
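Because the kubelet talks only to the CRI, any runtime that implements the interface can be swapped in. The sketch below models that idea in Python. The recorded call names echo three CRI RPCs (`RunPodSandbox`, `CreateContainer`, `StartContainer`), but the real CRI is a gRPC API with much richer request and response messages; the classes `ContainerRuntime` and `FakeRuntime`, the function `start_pod`, and all signatures here are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import List


class ContainerRuntime(ABC):
    """Hypothetical, heavily simplified stand-in for the CRI runtime service."""

    @abstractmethod
    def run_pod_sandbox(self, pod_name: str) -> str: ...

    @abstractmethod
    def create_container(self, sandbox_id: str, image: str) -> str: ...

    @abstractmethod
    def start_container(self, container_id: str) -> None: ...


class FakeRuntime(ContainerRuntime):
    """In-memory runtime that only records the calls it receives."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def run_pod_sandbox(self, pod_name: str) -> str:
        self.calls.append(f"RunPodSandbox({pod_name})")
        return f"sandbox-{pod_name}"

    def create_container(self, sandbox_id: str, image: str) -> str:
        self.calls.append(f"CreateContainer({sandbox_id}, {image})")
        return f"{sandbox_id}-{image}"

    def start_container(self, container_id: str) -> None:
        self.calls.append(f"StartContainer({container_id})")


def start_pod(runtime: ContainerRuntime, pod_name: str, images: List[str]) -> None:
    """Start a pod through the interface without knowing which runtime is behind it."""
    sandbox_id = runtime.run_pod_sandbox(pod_name)
    for image in images:
        container_id = runtime.create_container(sandbox_id, image)
        runtime.start_container(container_id)


runtime = FakeRuntime()
start_pod(runtime, "web", ["nginx"])
print("\n".join(runtime.calls))
```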
Containerd architecture:
Workload: A workload is an application or service that you deploy and manage using Kubernetes. A workload runs as one or more pods.
Key Kubernetes objects arranged in a workload node:
Common workload types in K8S:
- Pods: The smallest deployable units in Kubernetes. A pod can contain one or more containers that share the same network namespace and storage volumes. Pods are often used to encapsulate tightly coupled application components.
- Deployments: A higher-level abstraction that manages the scaling and rolling updates of replicated pods. Deployments ensure that a specified number of replicas (identical instances) of a pod are running at all times, allowing for easy management and scalability.
- StatefulSets: Used for managing stateful applications that require stable network identities and storage. Each pod in a StatefulSet has a unique identity, and they are created and scaled in a controlled order.
- DaemonSets: Ensure that a copy of a pod is running on each node in the cluster. This is often used for cluster-level services such as log collection or monitoring agents.
- Jobs: Used for running short-lived, batch-like tasks. A Job creates one or more pods and retries them until a specified number complete successfully, at which point the Job is finished.
- CronJobs: A time-based job scheduler. CronJobs allow you to create and manage jobs that run on a scheduled basis, similar to the UNIX cron utility.
- ReplicaSets: A lower-level abstraction that ensures a specified number of replicas of a pod are running at all times. Deployments are built on top of ReplicaSets and provide additional features like rolling updates and scaling.
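As a concrete illustration of a workload object, the sketch below builds a minimal Deployment manifest as a Python dictionary and prints it as JSON, a format that `kubectl apply -f` accepts alongside YAML. The name `web`, the label `app: web`, the image `nginx:1.25`, and the replica count are placeholder values chosen for the example.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    """Build a minimal apps/v1 Deployment; all argument values are placeholders."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


manifest = deployment_manifest("web", "nginx:1.25", replicas=3)
print(json.dumps(manifest, indent=2))
```

Saving the output to a file such as `deployment.json` and running `kubectl apply -f deployment.json` would submit it to a cluster, where the Deployment creates a ReplicaSet that keeps three pods running.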
Storage in Kubernetes (often abbreviated as K8s) refers to the mechanisms and components used to manage persistent data storage for applications running in Kubernetes clusters. Kubernetes provides various options for managing storage that cater to different use cases and requirements. Here are some key concepts and components related to storage in Kubernetes:
- Volumes: Volumes are the basic unit of storage in Kubernetes. A volume is a directory accessible to the containers within a pod. Kubernetes supports several types of volumes, such as emptyDir (temporary storage), hostPath (using the host filesystem), and various cloud provider-specific volumes.
- Persistent Volumes (PVs) and Persistent Volume Claims (PVCs): PVs are cluster-wide storage resources, provisioned either manually by an administrator or dynamically through a Storage Class. They represent physical storage such as network-attached storage (NAS) or cloud-based storage. PVCs are how users (developers) request a specific amount and type of storage. A PVC is bound to a PV that satisfies its request and is then mounted by a pod, giving the pod access to the persistent data.
- Storage Classes: Storage Classes are used to dynamically provision Persistent Volumes based on the user's storage requirements. Each Storage Class defines the parameters for creating PVs, such as the type of storage, provisioner, and other attributes. This allows for a more dynamic and flexible storage provisioning process.
- Dynamic Provisioning: Kubernetes supports dynamic provisioning of storage, where a PVC can automatically create and bind to a PV without requiring manual intervention from the cluster administrator. This is facilitated by Storage Classes.
- StatefulSets: StatefulSets are a type of workload in Kubernetes designed for managing stateful applications that require stable and unique network identities, persistent storage, and ordered deployment/scaling.
- CSI (Container Storage Interface): The Container Storage Interface is a standard for exposing storage systems to containerized workloads in Kubernetes. It enables storage providers to develop plugins that seamlessly integrate with Kubernetes, allowing users to choose from a variety of storage solutions.
- Volume Plugins: Kubernetes supports various volume plugins that allow integration with different storage backends, including NFS, iSCSI, AWS EBS, Azure Disk, and more. These plugins enable pods to access different types of external storage.
- Stateful Workloads: For stateful applications like databases, Kubernetes provides mechanisms like StatefulSets and Operators to manage the deployment, scaling, and lifecycle of these applications, including their associated storage.
- CSI Drivers: CSI drivers are components that implement the Container Storage Interface and allow Kubernetes to communicate with various storage systems. These drivers enable the integration of external storage providers into Kubernetes.
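The relationship between claims, volumes, and storage classes can be sketched as a small matching routine: a claim binds to an existing free volume that is large enough and belongs to the same storage class, and if none fits, a new volume is provisioned on demand. This is a simplified model, not the actual controller logic (which also considers access modes, selectors, and volume binding modes); the classes and the `bind` function are hypothetical, and sizes are plain integers in GiB.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PersistentVolume:
    name: str
    capacity_gib: int
    storage_class: str
    claimed_by: Optional[str] = None


@dataclass
class PersistentVolumeClaim:
    name: str
    request_gib: int
    storage_class: str


def bind(claim: PersistentVolumeClaim, volumes: List[PersistentVolume]) -> PersistentVolume:
    """Bind the claim to the smallest free volume that fits, else provision one."""
    candidates = [
        pv for pv in volumes
        if pv.claimed_by is None
        and pv.storage_class == claim.storage_class
        and pv.capacity_gib >= claim.request_gib
    ]
    if candidates:
        pv = min(candidates, key=lambda v: v.capacity_gib)
    else:
        # Dynamic provisioning (modelled): create a volume sized to the request.
        pv = PersistentVolume(f"pv-for-{claim.name}", claim.request_gib, claim.storage_class)
        volumes.append(pv)
    pv.claimed_by = claim.name
    return pv


volumes = [PersistentVolume("pv-a", 5, "standard"), PersistentVolume("pv-b", 20, "standard")]

static = bind(PersistentVolumeClaim("db-data", 10, "standard"), volumes)
dynamic = bind(PersistentVolumeClaim("cache", 50, "fast"), volumes)
print(static.name, dynamic.name)
```

In this run the first claim binds to the pre-existing `pv-b`, while the second claim finds no volume in the `fast` class and triggers the modelled dynamic provisioning path.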
etcd is a distributed key-value store that is widely used as the primary data store for Kubernetes (K8s). It serves as the backbone of Kubernetes' control plane, storing configuration data, state information, and other critical data that the Kubernetes cluster needs to function.
Here's how etcd is used in Kubernetes:
- Cluster State Storage: Kubernetes uses etcd to store the entire state of the cluster. This includes information about nodes, pods, services, configurations, secrets, and more. This allows Kubernetes to maintain high availability and recover from failures.
- Configuration and Metadata: All configuration data and metadata that define the desired state of the cluster are stored in etcd. When you create or update resources (such as deployments, services, or pods), the API server writes the changes to etcd, which then serves as the single source of truth for the cluster's state.
- Consistency and Consensus: etcd employs a consensus algorithm (Raft) to ensure that data consistency is maintained across the distributed cluster. This means that changes are only considered valid if they are replicated and agreed upon by a majority of etcd nodes.
- High Availability: To ensure resilience, etcd can be run in a highly available configuration, with multiple etcd instances distributed across different nodes in the cluster. This minimizes the risk of data loss in case of node failures.
- Backup and Restore: Regularly backing up etcd data is crucial to recover from disasters or data corruption. Kubernetes administrators often perform routine backups of the etcd data to ensure data integrity.
- Security: As a critical component of the cluster, etcd must be secured. Access controls, encryption, and network segmentation are used to protect the etcd cluster from unauthorized access.
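The majority rule behind Raft consensus is simple arithmetic, and it determines how many etcd members can fail before the cluster stops accepting writes. The sketch below computes the quorum size and failure tolerance for a few cluster sizes; the function names are chosen for this example.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority remains."""
    return members - quorum(members)


for size in (1, 3, 4, 5):
    print(f"members={size} quorum={quorum(size)} tolerated_failures={fault_tolerance(size)}")
```

A 4-member cluster tolerates no more failures than a 3-member one, which is why odd cluster sizes are usually recommended for etcd.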
Prometheus is an open-source systems monitoring and alerting toolkit originally developed by SoundCloud. It is widely used to monitor and collect metrics from various components of software systems, helping operators gain insights into the performance, health, and behavior of their applications.
Key features of Prometheus include:
- Multi-dimensional Data Model: Prometheus stores data as time series, each identified by a metric name and a set of key-value pairs called labels. Labels let users slice the same metric along various dimensions, which facilitates effective monitoring and querying.
- PromQL: Prometheus Query Language (PromQL) enables users to query collected data for analytics and alerting purposes. PromQL supports various aggregation, filtering, and transformation functions.
- Time Series Collection: Prometheus continuously scrapes metrics over HTTP from configured targets using a pull model. Targets can be instrumented applications, exporters, or other Prometheus instances.
- Data Storage: Collected data is stored in a time-series database with efficient and compressed storage mechanisms. Prometheus uses its own storage format, which is optimized for time-based data.
- Alerting: Prometheus allows users to define custom alerting rules based on collected metrics. When a rule condition is met, an alert is triggered, and notifications can be sent to specified channels.
- Visualization: Although Prometheus itself does not provide native visualization tools, it can be integrated with various visualization and dashboarding tools like Grafana.
- Service Discovery: Prometheus has built-in support for service discovery, making it easier to dynamically discover and monitor new instances of services as they come online.
- Exporters: Prometheus can be extended using exporters, which are specialized agents that expose metrics from various systems and applications. Many third-party exporters are available for popular services like databases, web servers, and more.
- Scalability: Each Prometheus server runs independently. Larger setups scale by running several servers and, where needed, using federation, in which one Prometheus server scrapes selected time series from another.
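To make the data model and the pull model more concrete: a target exposes its current metric values as plain text (conventionally at `/metrics`), one sample per line, with the metric name followed by labels in braces and then the value. The sketch below renders samples in that style. It is a simplified illustration that ignores `# HELP` and `# TYPE` comment lines and label-value escaping; the function `render_sample` and the sample values are made up for the example.

```python
from typing import Dict


def render_sample(name: str, labels: Dict[str, str], value: float) -> str:
    """Render one sample as `name{label="value",...} value`."""
    if labels:
        label_text = ",".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
        return f"{name}{{{label_text}}} {value}"
    return f"{name} {value}"


line = render_sample("http_requests_total", {"method": "GET", "code": "200"}, 1027)
print(line)
```

Each distinct combination of metric name and label values is a separate time series, which is what allows queries to filter or aggregate by label.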
Prometheus is often used as part of a larger monitoring and observability ecosystem alongside other tools like Grafana, which provides powerful visualization capabilities, and Alertmanager, which handles alert notifications. The combination of these tools can help operators maintain a clear understanding of their systems' health and performance.
More information about Prometheus is available on its official website: https://prometheus.io/
Grafana is an open-source platform used for monitoring and observability. It allows you to visualize, analyze, and understand various metrics and data from different sources in real-time. Grafana supports a wide range of data sources, including databases, cloud services, and various monitoring systems. Some key features of Grafana include:
- Data Visualization: Grafana provides a flexible and customizable dashboarding system where you can create interactive visualizations such as graphs, charts, and tables to represent your data.
- Data Source Integration: Grafana supports integration with numerous data sources, including popular data stores like InfluxDB, Prometheus, Graphite, and Elasticsearch, as well as cloud platforms like AWS CloudWatch and Google Cloud Monitoring.
- Alerting: You can set up alerting rules in Grafana to notify you when certain conditions are met or thresholds are exceeded. This helps in proactive monitoring and quick response to issues.
- Templating: Grafana allows you to create dynamic dashboards by using variables that can be changed at runtime. This is useful for creating more interactive and flexible dashboards.
- Plugins and Extensions: Grafana supports a rich ecosystem of plugins and extensions that enable additional features and integrations. These plugins can extend the functionality and allow you to create custom visualizations or data source connectors.
- Community and Support: Grafana has a strong community of users and contributors, which means you can find a wealth of resources, tutorials, and documentation to help you get started and solve issues.
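The templating idea can be illustrated with ordinary string substitution: a query contains a `$variable` placeholder, and the dashboard fills in the currently selected value before sending the query to the data source. The sketch below uses Python's `string.Template`, which happens to share the `$name` syntax. It is only an analogy for how dashboard variables behave, not Grafana's interpolation engine; the variable name `job` and the query are hypothetical.

```python
from string import Template

# A query containing a dashboard-style variable named `job` (hypothetical).
query_template = Template('rate(http_requests_total{job="$job"}[5m])')

# Selecting a different value re-renders the same query for another target.
for job in ("api", "worker"):
    print(query_template.substitute(job=job))

api_query = query_template.substitute(job="api")
```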
References:
https://kubernetes.io/docs/concepts/overview/components/
https://azuredays.com/2020/12/09/understanding-kubernetes-workload-objects/