Lab K204 - Adding health checks with Probes

Adding health checks

Health checks in Kubernetes work the same way as traditional application health checks: they make sure that our application is ready to receive and process user requests. This lab works with two types of health checks:

* Liveness Probe
* Readiness Probe

A probe is simply a diagnostic action performed by the kubelet on a container. The kubelet can perform three types of actions:

ExecAction: Executes a command inside the container. Considered successful when the command exits with code 0.
TCPSocketAction: Attempts to open a TCP connection to a given port on the pod's IP. Considered successful when the port is open.
HTTPGetAction: Performs an HTTP GET request against the pod's IP. Considered successful when the status code is greater than or equal to 200 and less than 400.
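
The HTTP success rule can be sketched as a small shell helper. The name http_probe_ok is hypothetical and used only for illustration; it is not part of kubectl or the kubelet, it simply mirrors the rule that any 2xx or 3xx status counts as a pass.

```shell
# Hypothetical helper mirroring the kubelet's HTTP rule:
# a status code passes when 200 <= code < 400.
http_probe_ok() {
  code="$1"
  [ "$code" -ge 200 ] 2>/dev/null && [ "$code" -lt 400 ] 2>/dev/null
}
```

For example, `http_probe_ok 302` succeeds while `http_probe_ok 404` fails, so a probe that points at a missing page is reported as failed.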

When a diagnostic action fails, the kubelet reports the result back to the API server. Let us see how these health checks work in practice.

Adding Liveness/Readiness Probes

A liveness probe checks whether the container is still running properly. If the livenessProbe fails, the kubelet kills the container, which is then subject to the pod's restart policy. If no livenessProbe is configured, the default state is Success.

A readiness probe checks whether your application is ready to serve requests. When the readiness probe fails, the pod's IP is removed from the endpoints of every service that matches the pod. If no readinessProbe is configured, the default state is Success.

Let us add liveness/readiness probes to our vote deployment.

file: vote-deploy-probes.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vote
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 1
  revisionHistoryLimit: 4
  replicas: 12
  minReadySeconds: 20
  selector:
    matchLabels:
      role: vote
    matchExpressions:
      - {key: version, operator: In, values: [v1, v2, v3, v4, v5]}
  template:
    metadata:
      name: vote
      labels:
        app: python
        role: vote
        version: v1
    spec:
      containers:
        - name: app
          image: schoolofdevops/vote:v1
          resources:
            requests:
              memory: "64Mi"
              cpu: "50m"
            limits:
              memory: "128Mi"
              cpu: "250m"
          livenessProbe:
            tcpSocket:
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 3

where,

livenessProbe uses a simple TCP check to test whether the application is listening on port 80.
readinessProbe uses httpGet to actually fetch a page with the GET method and checks the HTTP response code.

Apply this configuration using,

kubectl apply -f vote-deploy-probes.yaml
kubectl get pods
kubectl describe svc vote
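
To confirm that the probes were actually stored in the deployment, something along these lines may help. This is a sketch: the function name check_vote_probes and the 180 second timeout are assumptions, and it expects the vote deployment to live in the namespace of the current context.

```shell
# Hypothetical helper: wait for the rollout (up to 180s), then print the
# liveness and readiness probes stored in the deployment's pod template.
check_vote_probes() {
  kubectl rollout status deployment/vote --timeout=180s
  kubectl get deployment vote \
    -o jsonpath='{.spec.template.spec.containers[0].livenessProbe}{"\n"}{.spec.template.spec.containers[0].readinessProbe}{"\n"}'
}
```

Calling `check_vote_probes` after the apply should print one line per probe, including any fields that Kubernetes filled in with default values.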

Testing livenessProbe

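Edit the deployment and change the livenessProbe port from 80 to 8888, a port on which the application does not listen.

kubectl edit deploy vote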

livenessProbe:
  failureThreshold: 3
  initialDelaySeconds: 5
  periodSeconds: 5
  successThreshold: 1
  tcpSocket:
    port: 8888
  timeoutSeconds: 1

Since you are using the edit command, the deployment is modified as soon as you save the file.

kubectl get pods
kubectl describe pod vote-xxxx

where vote-xxxx is one of the newly created pods.

Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  38s                default-scheduler  Successfully assigned instavote/vote-668579766d-p65xb to k-02
  Normal   Pulled     18s (x2 over 36s)  kubelet, k-02      Container image "schoolofdevops/vote:v1" already present on machine
  Normal   Created    18s (x2 over 36s)  kubelet, k-02      Created container
  Normal   Started    18s (x2 over 36s)  kubelet, k-02      Started container
  Normal   Killing    18s                kubelet, k-02      Killing container with id docker://app:Container failed liveness probe.. Container will be killed and recreated.
  Warning  Unhealthy  4s (x5 over 29s)   kubelet, k-02      Liveness probe failed: dial tcp 10.32.0.12:8888: connect: connection refused

What just happened?

Since the livenessProbe is failing, the kubelet keeps killing and recreating the containers. That's what you see in the description above.
When you list the pods, you should see them in the CrashLoopBackOff state, with the number of restarts incrementing over time.

e.g.

vote-668579766d-p65xb    0/1     CrashLoopBackOff   7          7m38s   10.32.0.12   k-02        <none>           <none>
vote-668579766d-sclbr    0/1     CrashLoopBackOff   7          7m38s   10.32.0.10   k-02        <none>           <none>
vote-668579766d-vrcmj    0/1     CrashLoopBackOff   7          7m38s   10.38.0.8    kube03-01   <none>           <none>
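
How quickly the first restart happens can be estimated from the probe settings. The sketch below uses a simplified formula that is an assumption of this example, not the kubelet's exact schedule: the first probe runs after initialDelaySeconds, and failureThreshold consecutive failures are needed, one per periodSeconds. The function name liveness_kill_estimate is hypothetical.

```shell
# Rough lower bound, in seconds, from container start to the first
# liveness-triggered kill. Ignores kubelet jitter and timeoutSeconds.
liveness_kill_estimate() {
  initial_delay="$1"; period="$2"; failure_threshold="$3"
  echo $(( initial_delay + (failure_threshold - 1) * period ))
}

# Values from the edited probe: initialDelaySeconds=5, periodSeconds=5,
# failureThreshold=3.
liveness_kill_estimate 5 5 3   # prints 15
```

This is in the same range as the events shown earlier, where the first container was started about 36 seconds before the snapshot and killed about 18 seconds before it.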

To fix it, revert the livenessProbe configuration by editing the deployment again.
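
As an alternative to editing by hand, the deployment can be rolled back one revision. This is a sketch that assumes the broken probe was the most recent change, so that the previous revision still has the working probe; the function name rollback_vote is hypothetical.

```shell
# Hypothetical helper: list the recorded revisions, then roll the
# deployment back to the revision just before the current one.
rollback_vote() {
  kubectl rollout history deployment/vote &&
  kubectl rollout undo deployment/vote
}
```

After running `rollback_vote`, `kubectl get pods` should show a fresh batch of pods that no longer restart.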

Readiness Probe

A readiness probe is configured just like a liveness probe, but this time we will use an httpGet request.

kubectl edit deploy vote

readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /test.html
    port: 80
    scheme: HTTP
  initialDelaySeconds: 5
  periodSeconds: 3
  successThreshold: 1

where readinessProbe.httpGet.path has been changed from / to /test.html, which is a non-existent path.

Check the result:

kubectl get deploy,rs,pods

[output snippet]

NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.extensions/vote    11/12   3            11          2m12s


vote-8cbb7ff89-6xvbc     0/1     Running   0          73s     10.38.0.10   kube03-01   <none>           <none>
vote-8cbb7ff89-6z5zv     0/1     Running   0          73s     10.38.0.5    kube03-01   <none>           <none>
vote-8cbb7ff89-hdmxb     0/1     Running   0          73s     10.32.0.12   k-02        <none>           <none>

kubectl describe pod vote-8cbb7ff89-hdmxb

where, vote-8cbb7ff89-hdmxb is one of the pods launched after changing readiness probe.

[output snippet]

Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  109s                 default-scheduler  Successfully assigned instavote/vote-8cbb7ff89-hdmxb to k-02
  Normal   Pulled     108s                 kubelet, k-02      Container image "schoolofdevops/vote:v1" already present on machine
  Normal   Created    108s                 kubelet, k-02      Created container
  Normal   Started    108s                 kubelet, k-02      Started container
  Warning  Unhealthy  39s (x22 over 102s)  kubelet, k-02      Readiness probe failed: HTTP probe failed with statuscode: 404

kubectl describe svc vote

What happened?

Since the readinessProbe failed, the newly launched batch of pods is Running but not ready (0/1).
The description of the pod shows it as Unhealthy due to the failed HTTP probe.
The deployment shows surged pods, with the number of ready pods lower than the number of desired replicas (e.g. 11/12).
The service does not send traffic to pods that are marked as unhealthy/not ready.
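
The last point can be checked directly against the service's Endpoints object. The sketch below assumes a service named vote in the current namespace, as used with kubectl describe svc above; the function name vote_endpoints is hypothetical.

```shell
# Hypothetical helper: print the pod IPs the "vote" service routes to,
# then the IPs held back because the pods are not ready.
vote_endpoints() {
  echo "ready:"
  kubectl get endpoints vote -o jsonpath='{.subsets[*].addresses[*].ip}{"\n"}'
  echo "not ready:"
  kubectl get endpoints vote -o jsonpath='{.subsets[*].notReadyAddresses[*].ip}{"\n"}'
}
```

While the readiness probe is failing, the IPs of the new pods would be expected under "not ready" rather than "ready".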

Reverting the changes to the readiness probe should bring the deployment back to a working state.