Lab Exercises for Troubleshooting
Exercise 0 - Setup
- Prepare a cluster (Single node, kubeadm, k3s, etc)
- Open browser tabs to Kubernetes Documentation, Kubernetes GitHub and Kubernetes Blog (these are permitted as per the current guidelines)
- Ensure your etcd nodes have etcdctl installed
Exercise 1 - Evaluate cluster and node logging
- For your cluster type, determine how to acquire logs for your master nodes. They could be in the form of:
  - Services
  - Static Pods
  - Kubernetes Pods
- Using kubectl, get a list of events from your cluster
- Using etcdctl, determine the health of the etcd cluster
Answer
- Step 1 depends on how the cluster was built and, potentially, which operating systems were used. For example, if the Kubernetes components run as Kubernetes Pods:
kubectl logs <podname> -n <namespace>
kubectl get events
etcdctl --write-out=table --endpoints=$ENDPOINTS endpoint status
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://127.0.0.1:2379 | 4e30a295f2c3c1a4 | 3.5.0 | 8.1 MB | true | false | 3 | 7903 | 7903 | |
+------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
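How the logs are retrieved depends on which of the three forms listed in the exercise applies. The sketch below is one way to cover each case, together with a sorted event listing and an etcd health check. It assumes a systemd host with a unit named kubelet, a CRI runtime with crictl installed, and kubeadm's default etcd certificate paths under /etc/kubernetes/pki/etcd. The function names are invented for illustration, so adjust all of these for your cluster.

```shell
# Hypothetical helpers; nothing runs until a function is called.

# Components running as systemd services (the unit name is an assumption).
service_logs() {
  journalctl -u "${1:-kubelet}" --no-pager
}

# Components running as static Pods: read the container log on the node.
# $1 is a container name fragment such as kube-apiserver.
static_pod_logs() {
  id=$(crictl ps --name "$1" -q | head -n 1)
  crictl logs "$id"
}

# Components running as Kubernetes Pods, by default in kube-system.
pod_logs() {
  kubectl logs "$1" -n "${2:-kube-system}"
}

# Events from every namespace, oldest first.
sorted_events() {
  kubectl get events --all-namespaces --sort-by=.metadata.creationTimestamp
}

# etcd health over TLS (certificate paths assume a kubeadm layout).
etcd_health() {
  ETCDCTL_API=3 etcdctl \
    --endpoints="${ENDPOINTS:-https://127.0.0.1:2379}" \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    endpoint health
}
```

For example, pod_logs followed by a Pod name from kubectl get pods -n kube-system prints that component's log, and etcd_health reports whether each endpoint answers.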
Exercise 2 - Understand how to monitor applications
- Deploy the following manifest: counter-pod.yaml
- Acquire the logs from this pod - what do they mention?
Answer
kubectl logs counter
0: Mon Dec 9 13:15:08 UTC 2024
1: Mon Dec 9 13:15:09 UTC 2024
2: Mon Dec 9 13:15:10 UTC 2024
3: Mon Dec 9 13:15:11 UTC 2024
4: Mon Dec 9 13:15:12 UTC 2024
5: Mon Dec 9 13:15:13 UTC 2024
6: Mon Dec 9 13:15:14 UTC 2024
7: Mon Dec 9 13:15:15 UTC 2024
8: Mon Dec 9 13:15:16 UTC 2024
9: Mon Dec 9 13:15:17 UTC 2024
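Because this Pod logs one line per second, the full log grows quickly. A minimal sketch of narrowing the output with standard kubectl logs flags follows. The helper names are hypothetical, and the Pod name defaults to counter as in the answer above.

```shell
# Hypothetical helpers around kubectl logs; the Pod name defaults to counter.

# Last N lines only (default 5).
recent_logs() {
  kubectl logs "${2:-counter}" --tail="${1:-5}"
}

# Lines from the last duration (e.g. 30s, 5m), each prefixed with a timestamp.
logs_since() {
  kubectl logs "${2:-counter}" --since="${1:-1m}" --timestamps
}

# Logs from the previous container instance, useful after a restart.
previous_logs() {
  kubectl logs "${1:-counter}" --previous
}
```

Adding -f to kubectl logs streams new lines as they are written, which suits a continuously logging Pod like this one.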
Exercise 3 - Troubleshoot application failure
- Deploy the following manifest: brokenpod.yaml. It will create a Pod named nginx-pod. Determine why this Pod will not enter the Running state by using kubectl.
Answer
kubectl describe pod nginx-pod
..
Normal BackOff 4m55s (x7 over 6m56s) kubelet Back-off pulling image "nginx:invalidversion"
..
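The BackOff event points at an image tag that cannot be pulled. As a sketch, the waiting reason can also be read directly from the Pod status, and the image can be changed in place because a Pod's container image is one of its few mutable fields. The helper names are hypothetical, and nginx:stable is only an assumed replacement tag.

```shell
# Print why the first container of a Pod is waiting
# (typically ErrImagePull or ImagePullBackOff for a bad tag).
waiting_reason() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}'
}

# Point every container in the Pod at a valid image tag.
# nginx:stable is an assumption; use whichever tag you intend.
fix_image() {
  kubectl set image "pod/$1" "*=${2:-nginx:stable}"
}
```

Running waiting_reason nginx-pod and then fix_image nginx-pod should let the kubelet retry the pull with the new tag.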
Exercise 4 - Troubleshoot networking
- Deploy the following manifest: https://raw.githubusercontent.com/David-VTUK/CKAExampleYaml/master/nginx-svc-and-deployment-broken.yaml It will create deployment and service objects. Identify the DNS name of this service.
- Test resolution of this DNS record:
  - Create a Pod that has nslookup installed, e.g. kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
- Test sending traffic to this service:
  - Create a Pod that has curl installed, e.g. kubectl run curl --image=radial/busyboxplus:curl -i --tty
- Why does it fail?
- Rectify the identified error in step 3
Answer
nginx-service.default.svc.cluster.local
kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
kubectl exec -it dnsutils -- sh
/ # nslookup nginx-service.default.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: nginx-service.default.svc.cluster.local
Address: 10.99.41.254
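The name resolved above follows the pattern service.namespace.svc.cluster-domain. A small sketch of building it is shown here. The function name is made up, and the defaults (namespace default, domain cluster.local) are assumptions that match this lab but can differ on clusters with a custom cluster domain.

```shell
# Build the in-cluster DNS name of a Service.
# Defaults (namespace "default", domain "cluster.local") are assumptions.
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "${2:-default}" "${3:-cluster.local}"
}

svc_fqdn nginx-service   # prints nginx-service.default.svc.cluster.local
```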
kubectl run curl --image=radial/busyboxplus:curl -i --tty
If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$ curl nginx-service.default.svc.cluster.local
curl: (7) Failed to connect to nginx-service.default.svc.cluster.local port 80: Connection refused
Check service:
kubectl describe service nginx-service
Name: nginx-service
Namespace: default
Labels: <none>
Annotations: <none>
Selector: app=nginx
Type: ClusterIP
IP Families: <none>
IP: 10.99.41.254
IPs: 10.99.41.254
Port: <unset> 80/TCP
TargetPort: 8080/TCP
Endpoints: 10.244.1.18:8080,10.244.1.19:8080,10.244.1.20:8080
Session Affinity: None
Events: <none>
Note:
- Service is listening on port 80
- Service has an endpoint list, with a target port of 8080
- Test curl directly against a pod:
curl 10.244.1.18:8080
curl: (7) Failed to connect to 10.244.1.18 port 8080: Connection refused
Nothing is listening on port 8080. Check the pod config:
kubectl describe po nginx-deployment-5d59d67564-bk9xb | grep -i "port:"
Port: 80/TCP
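The two values can also be compared without reading the describe output by eye. The sketch below uses jsonpath queries. It assumes the Deployment is named nginx-deployment (inferred from the Pod name above) and that the first port entry on each object is the relevant one. Keep in mind that containerPort is declarative only, so a match is a strong hint but no proof that the process listens there.

```shell
# Service targetPort vs. the containerPort declared in the Deployment.
# Helper names are hypothetical; only the first port entry is inspected.
svc_target_port() {
  kubectl get service "$1" -o jsonpath='{.spec.ports[0].targetPort}'
}

deploy_container_port() {
  kubectl get deployment "$1" \
    -o jsonpath='{.spec.template.spec.containers[0].ports[0].containerPort}'
}

# Succeeds (exit status 0) only when both values are identical.
ports_match() {
  [ "$(svc_target_port "$1")" = "$(deploy_container_port "$2")" ]
}
```

For example, ports_match nginx-service nginx-deployment || echo "port mismatch" flags the misconfiguration found above.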
The service is trying to forward traffic to port 8080 on the container, but the container is only listening on port 80. Reconfigure the service object, e.g.:
kubectl edit service nginx-service
Replace targetPort: 8080 with targetPort: 80
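If an interactive editor is inconvenient, the same change can be applied with a JSON patch. This is a sketch that assumes the Service has a single port entry at index 0; the helper name is hypothetical.

```shell
# Set the targetPort of the first port entry on a Service.
# The JSON patch path /spec/ports/0 assumes the Service defines one port.
set_target_port() {
  kubectl patch service "$1" --type=json \
    -p "[{\"op\":\"replace\",\"path\":\"/spec/ports/0/targetPort\",\"value\":$2}]"
}
```

Calling set_target_port nginx-service 80 makes the same edit as the manual step above.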
Retest:
curl nginx-service.default.svc.cluster.local
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>