Kubernetes is a powerful tool for managing containerized applications. However, as with any complex system, errors can occur when using it. When problems arise, it's important to have effective troubleshooting techniques and tools.
In this article, we'll take the following steps to get you started with event collection:
- Retrieve the most recent events
- Use pods to simulate problems
- Store events in a pod backed by a persistent volume (PV)
Retrieve the most recent events
The first step in troubleshooting a Kubernetes cluster is to retrieve the latest events. Events in Kubernetes are generated by various components and objects in the cluster, such as pods, nodes, and services. They provide information about the status of the cluster and any issues that may occur.
To retrieve the latest events, you can use the kubectl get events command. This displays a list of recent events in the current namespace (add -A to see every namespace).
kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
78s Warning BackOff pod/bbb Back-off restarting failed container
72s Warning BackOff pod/bbb2 Back-off restarting failed container
12m Normal Pulling pod/bbb3 Pulling image "busybox"
12m Normal Created pod/bbb3 Created container bbb3
46m Normal Started pod/bbb3 Started container bbb3
As shown above, the command displays a list of recent events in the cluster. You can also add the -w (--watch) flag to stream events in real time as they occur. By watching events this way, you can quickly identify any issues as they happen.
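For example, the following leaves the connection open and prints each new event as it is recorded (press Ctrl+C to stop):

```shell
# Stream events as they happen; the current list is printed first,
# then new events are appended to the output in real time
kubectl get events -w
```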
While the kubectl get events command is helpful for retrieving events, its default output order makes it hard to pinpoint the most recent problem. To make issues easier to identify, you can sort events by metadata.creationTimestamp.
kubectl get events --sort-by=.metadata.creationTimestamp
LAST SEEN TYPE REASON OBJECT MESSAGE
104s Normal Pulling pod/busybox13 Pulling image "busybox"
88s Warning FailedScheduling pod/mysqldeployment-6f8b755598-phgzr 0/2 nodes are available: 2 Insufficient cpu. preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod.
104s Warning BackOff pod/busybox6 Back-off restarting failed container
82s Warning ProvisioningFailed persistentvolumeclaim/pv-volume storageclass.storage.k8s.io "csi-hostpath-sc" not found
82s Warning ProvisioningFailed persistentvolumeclaim/pv-volume-2 storageclass.storage.k8s.io "csi-hostpath-sc" not found
As shown above, the events in the cluster are now ordered by metadata.creationTimestamp. Sequencing events this way lets you quickly identify recent events and any issues that have just appeared.
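On a busy cluster it also helps to narrow the output to warnings only. One way to do this is with a field selector combined with the same sort:

```shell
# Show only Warning events, ordered by creation time
kubectl get events --field-selector type=Warning --sort-by=.metadata.creationTimestamp
```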
Use pods to simulate problems
If you find an issue related to networking or service discovery, terminating the kube-proxy pod may help. kube-proxy handles Service networking on every node, so deleting its pod and observing how the cluster reacts helps you identify issues related to these features.
To terminate a kube-proxy pod, you can use the kubectl delete pod command. To find the pod's name, list the pods in the kube-system namespace with the kubectl get pods command.
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-57575c5f89-66z2h 1/1 Running 1 (45h ago) 36d
coredns-57575c5f89-bcjdn 1/1 Running 1 (45h ago) 36d
etcd-k81 1/1 Running 1 (45h ago) 36d
fluentd-elasticsearch-5fdvc 1/1 Running 2 (45h ago) 60d
fluentd-elasticsearch-wx6x9 1/1 Running 1 (45h ago) 60d
kube-apiserver-k81 1/1 Running 1 (45h ago) 36d
kube-controller-manager-k81 1/1 Running 2 (45h ago) 36d
kube-proxy-bqpb5 1/1 Running 1 (45h ago) 36d
kube-proxy-q94sk 1/1 Running 1 (45h ago) 36d
kube-scheduler-k81 1/1 Running 2 (45h ago) 36d
metrics-server-5c59ff65b6-s4kms 1/1 Running 2 (45h ago) 58d
weave-net-56pl2 2/2 Running 3 (45h ago) 61d
weave-net-rml96 2/2 Running 5 (45h ago) 62d
As shown above, all pods in the kube-system namespace are listed, including the kube-proxy pods. Once you have the name of a kube-proxy pod, you can delete it using the kubectl delete pod command.
kubectl delete pod -n kube-system kube-proxy-q94sk
This removes the kube-proxy pod from the kube-system namespace. Because kube-proxy is managed by a DaemonSet, Kubernetes automatically creates a new kube-proxy pod to replace it.
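The replacement pod should appear within seconds. On kubeadm-based clusters the pods carry a k8s-app=kube-proxy label (an assumption; check your cluster's labels), so you can watch the rollout like this:

```shell
# Watch the DaemonSet schedule a replacement kube-proxy pod
kubectl get pods -n kube-system -l k8s-app=kube-proxy -w
```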
You can check for events with the following command:
kubectl get events -n=kube-system --sort-by=.metadata.creationTimestamp
LAST SEEN TYPE REASON OBJECT MESSAGE
4m59s Normal Killing pod/kube-proxy-bqpb5 Stopping container kube-proxy
4m58s Normal Scheduled pod/kube-proxy-cbkx6 Successfully assigned kube-system/kube-proxy-cbkx6 to k82
4m58s Normal SuccessfulCreate daemonset/kube-proxy Created pod: kube-proxy-cbkx6
4m57s Normal Pulled pod/kube-proxy-cbkx6 Container image "registry.k8s.io/kube-proxy:v1.24.11" already present on machine
Store events in a pod located on a PV
Storing events on a persistent volume from a dedicated pod is an effective way to keep a durable record of what happens in your Kubernetes cluster. Here's a step-by-step explanation of how to do it:
1. Add permissions to the pod
To connect to the Kubernetes API from a pod, you need to give its service account the appropriate permissions. Here's an example of a YAML file that binds permissions to the pod's default service account:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: event-logger
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: default
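Note that cluster-admin grants far more than event access; in production you would bind a narrower role. Assuming the manifest above is saved as rbac.yaml (a hypothetical filename), you can apply it and verify that the default service account can list events:

```shell
# Apply the binding, then impersonate the service account to check access
kubectl apply -f rbac.yaml
kubectl auth can-i list events --as=system:serviceaccount:default:default
```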
2. Create a Persistent Volume (PV) and a Persistent Volume Claim (PVC)
Now that we've set up the ClusterRoleBinding, we can create a persistent volume to store our events. Here's an example of a YAML file that uses hostPath to create a PV, along with a matching PVC:
# pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /mnt/data
---
# pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  volumeName: my-pv
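Apply both manifests and confirm the claim binds to the volume before creating the pod:

```shell
# Create the PV and PVC, then check that the PVC's STATUS column shows Bound
kubectl apply -f pv.yaml -f pvc.yaml
kubectl get pvc my-pvc
```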
3. Create a pod to collect events
Now that we've set up our PV and PVC, we can create a pod to collect events. Here's an example of a YAML file that creates a pod, connects to the Kubernetes API from inside it, and stores all events in the file events.log.
apiVersion: v1
kind: Pod
metadata:
  name: event-logger
spec:
  containers:
  - name: event-logger
    image: alpine
    command: ["/bin/sh", "-c"]
    args:
    - |
      # Install the tools, then poll the API and append events to the PV
      apk add --no-cache curl jq
      while true; do
        EVENTS=$(curl -s \
          --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
          -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
          https://${KUBERNETES_SERVICE_HOST}/api/v1/events | jq -r '.items[]')
        if [ -n "$EVENTS" ]; then
          echo "$EVENTS" >> /pv/events.log
        fi
        sleep 10
      done
    volumeMounts:
    - name: event-log
      mountPath: /pv
    - name: sa-token
      mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      readOnly: true
  volumes:
  - name: event-log
    persistentVolumeClaim:
      claimName: my-pvc
  - name: sa-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          expirationSeconds: 7200
      - configMap:
          name: kube-root-ca.crt
The pod runs a simple shell script with curl and jq installed, connects to the Kubernetes API using the permissions granted by the event-logger ClusterRoleBinding, and appends all events to /pv/events.log. Because the kube-root-ca.crt ConfigMap is projected into the service account directory, the script can verify the API server's certificate instead of connecting insecurely.
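Assuming the manifest above is saved as event-logger.yaml (a hypothetical filename), you can create the pod and wait for it to become ready; installing curl and jq takes a few seconds:

```shell
# Create the event-logger pod and block until it reports Ready
kubectl apply -f event-logger.yaml
kubectl wait --for=condition=Ready pod/event-logger --timeout=120s
```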
You can run the following command to check for events:
kubectl exec event-logger -- cat /pv/events.log
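Since each entry in events.log is a JSON object produced by jq, you can also post-process the file with jq. For example, to pull out only the warnings (a sketch; the type, reason, and message fields follow the Event API object):

```shell
# Extract reason and message for Warning events from the stored log
kubectl exec event-logger -- cat /pv/events.log \
  | jq -r 'select(.type == "Warning") | "\(.reason): \(.message)"'
```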
By using these troubleshooting techniques and tools, you can keep your Kubernetes cluster healthy and running smoothly. Retrieving the latest events, simulating issues, and storing events on persistent volumes are essential steps for effectively maintaining a cluster. As you become more experienced with Kubernetes, you can explore more advanced tools like Kibana, Prometheus, or Grafana for analyzing events, as well as centralized logging solutions like Elasticsearch or Fluentd.