Network Troubleshooting Solution

  • Troubleshooting Test 1: A simple 2 tier application is deployed in the triton namespace. It must display a green web page on success. Click on the app tab at the top of your terminal to view your application. It is currently failed. Troubleshoot and fix the issue.
Stick to the given architecture. Use the same names and port numbers as given in the below architecture diagram. Feel free to edit, delete or recreate objects as necessary.


    Start by checking the pods in the namespace triton , they are stuck in ContainerCreating state.

root@controlplane ~ ➜  k get pods -n triton
NAME                            READY   STATUS              RESTARTS   AGE
mysql                           0/1     ContainerCreating   0          2m10s
webapp-mysql-7bd5857746-qhrrq   0/1     ContainerCreating   0          2m10s

    Check the pod details for more information:

root@controlplane ~ ➜  kubectl describe pod/mysql -n triton
Name:             mysql
Namespace:        triton
.
.
.

Events:
  Type     Reason                  Age                     From               Message
  ----     ------                  ----                    ----               -------
  Normal   Scheduled               5m34s                   default-scheduler  Successfully assigned triton/mysql to controlplane
  Warning  FailedCreatePodSandBox  5m34s                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "dd800900e6eb528af5afd679c1dfa02a7cbbcb9645014df37bac8e50270274c6": plugin type="weave-net" name="weave" failed (add): unable to allocate IP address: Post "http://127.0.0.1:6784/ip/dd800900e6eb528af5afd679c1dfa02a7cbbcb9645014df37bac8e50270274c6": dial tcp 127.0.0.1:6784: connect: connection refused
  Warning  FailedCreatePodSandBox  5m33s                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "fac1a8529953fca9a2c351091b882ea433df03ed421ec3efe5f9e5173c918dcf": plugin type="weave-net" name="weave" failed (add): unable to allocate IP address: Post "http://127.0.0.1:6784/ip/fac1a8529953fca9a2c351091b882ea433df03ed421ec3efe5f9e5173c918dcf": dial tcp 127.0.0.1:6784: connect: connection refused
  Warning  FailedCreatePodSandBox  5m32s                   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "30890b3fd51a9fb788472228357a56f81c43b3c9fc20d4f9b2081d57fb65fff5": plugin type="weave-net" name="weave" failed (add): unable to allocate IP address: Post "http://127.0.0.1:6784/ip/30890b3fd51a9fb788472228357a56f81c43b3c9fc20d4f9b2081d57fb65fff5": dial tcp 127.0.0.1:6784: connect: connection refused

    You will find that , The pod mysql in namespace triton initially failed multiple times to start due to Weave Net CNI plugin failures:

Failed to setup network for sandbox: plugin type="weave-net" ... dial tcp 127.0.0.1:6784: connect: connection refused

    This indicates issue with Weave Net CNI , So , Check it's Pods:

root@controlplane ~ ➜  kubectl get pods -A
NAMESPACE     NAME                                   READY   STATUS              RESTARTS   AGE
kube-system   coredns-668d6bf9bc-2nwwf               1/1     Running             0          2m59s
kube-system   coredns-668d6bf9bc-lhzgz               1/1     Running             0          2m59s
kube-system   etcd-controlplane                      1/1     Running             0          3m7s
kube-system   kube-apiserver-controlplane            1/1     Running             0          3m7s
kube-system   kube-controller-manager-controlplane   1/1     Running             0          3m7s
kube-system   kube-proxy-hspkg                       1/1     Running             0          2m59s
kube-system   kube-scheduler-controlplane            1/1     Running             0          3m7s
triton        mysql                                  0/1     ContainerCreating   0          107s
triton        webapp-mysql-7bd5857746-vpg4n          0/1     ContainerCreating   0          106s

There are no any Weave Pods , which indicates it's not installed.

    Install the Weave Net CNI using curl -L https://github.com/weaveworks/weave/releases/download/latest_release/weave-daemonset-k8s-1.11.yaml | kubectl apply -f - command.

root@controlplane ~ ➜  curl -L https://github.com/weaveworks/weave/releases/download/latest_release/weave-daemonset-k8s-1.11.yaml | kubectl apply -f -
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  6183  100  6183    0     0  14378      0 --:--:-- --:--:-- --:--:-- 14378
serviceaccount/weave-net created
clusterrole.rbac.authorization.k8s.io/weave-net created
clusterrolebinding.rbac.authorization.k8s.io/weave-net created
role.rbac.authorization.k8s.io/weave-net created
rolebinding.rbac.authorization.k8s.io/weave-net created
daemonset.apps/weave-net created

    check the pods again , they will be in running state:

root@controlplane ~ ➜  k get pods -n triton
NAME                            READY   STATUS    RESTARTS   AGE
mysql                           1/1     Running   0          4m11s
webapp-mysql-7bd5857746-qhrrq   1/1     Running   0          4m11s
  • Troubleshooting Test 2: The same 2 tier application is having issues again. It must display a green web page on success. Click on the app tab at the top of your terminal to view your application. It is currently failed. Troubleshoot and fix the issue.

Stick to the given architecture. Use the same names and port numbers as given in the below architecture diagram. Feel free to edit, delete or recreate objects as necessary.


    Check the pods in all namespaces using k get pods -A command.

You will find the app Pods are running , however the kube-proxy is having issues.

root@controlplane ~ ➜  k get pods -A
NAMESPACE     NAME                                   READY   STATUS             RESTARTS      AGE
kube-system   coredns-668d6bf9bc-4nbdj               1/1     Running            0             3m45s
kube-system   coredns-668d6bf9bc-zcddz               1/1     Running            0             3m45s
kube-system   etcd-controlplane                      1/1     Running            0             3m50s
kube-system   kube-apiserver-controlplane            1/1     Running            0             3m52s
kube-system   kube-controller-manager-controlplane   1/1     Running            0             3m50s
kube-system   kube-proxy-wx5f9                       0/1     CrashLoopBackOff   2 (18s ago)   38s
kube-system   kube-scheduler-controlplane            1/1     Running            0             3m50s
kube-system   weave-net-5bgqz                        2/2     Running            0             117s
triton        mysql                                  1/1     Running            0             38s
triton        webapp-mysql-7bd5857746-wwfhz          1/1     Running            0             38s

    Get the logs using k logs -n kube-system <POD_NAME> command .

root@controlplane ~ ➜  k logs -n kube-system kube-proxy-wx5f9
E0613 09:45:18.840160       1 run.go:74] "command failed" err="failed complete: open /var/lib/kube-proxy/configuration.conf: no such file or directory"

This means kube-proxy is failing because it cannot find its configuration file at /var/lib/kube-proxy/configuration.conf.

    Describe the failed pod using command k describe -n kube-system pod <POD_NAME>

root@controlplane ~ ➜  k describe -n kube-system pod kube-proxy-wx5f9
Name:                 kube-proxy-wx5f9
Namespace:            kube-system
...
    Mounts:
      /lib/modules from lib-modules (ro)
      /run/xtables.lock from xtables-lock (rw)
      /var/lib/kube-proxy from kube-proxy (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-gcxd6 (ro)
...
Volumes:
  kube-proxy:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kube-proxy
    Optional:  false
...

This confirms that kube-proxy expects its configuration to be mounted as a ConfigMap named kube-proxy into the /var/lib/kube-proxy directory.

    Check the ConfigMap using command : k get configmap -n kube-system kube-proxy -o yaml

root@controlplane /var/lib/kube-proxy ➜  k get configmap -n kube-system kube-proxy -o yaml
apiVersion: v1
data:
  config.conf: |-
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    bindAddress: 0.0.0.0
    bindAddressHardFail: false
    clientConnection:
      acceptContentTypes: ""
      burst: 0
      contentType: ""
      kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
...

    Check the Current configs of the Deamonset :

oot@controlplane /var/lib/kube-proxy ➜  kubectl -n kube-system describe ds kube-proxy
Name:           kube-proxy
Selector:       k8s-app=kube-proxy
...
  Containers:
   kube-proxy:
    Image:      registry.k8s.io/kube-proxy:v1.26.0
    Port:       <none>
    Host Port:  <none>
    Command:
      /usr/local/bin/kube-proxy
      --config=/var/lib/kube-proxy/configuration.conf
      --hostname-override=$(NODE_NAME)
    Environment:
      NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /lib/modules from lib-modules (ro)
      /run/xtables.lock from xtables-lock (rw)
      /var/lib/kube-proxy from kube-proxy (rw)
  Volumes:
   kube-proxy:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kube-proxy
    Optional:  false
...

Based on the logs (failed complete: open /var/lib/kube-proxy/configuration.conf: no such file or directory), the DaemonSet's --config argument points to configuration.conf, but the ConfigMap provides its main configuration under config.conf

Fix
We need to change the kube-proxy DaemonSet to point to the correct configuration file name that exists in the kube-proxy ConfigMap , which is : config.conf

    Edit the DaemonSet:

kubectl -n kube-system edit ds kube-proxy

    Find the command or args section.

command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/configuration.conf # <--- THIS IS THE LINE TO CHANGE
        - --hostname-override=$(NODE_NAME)

    Change configuration.conf to config.conf

command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf # <--- CHANGED!
        - --hostname-override=$(NODE_NAME)

Monitor kube-proxy pod status and verify application connectivity.

k get pods -n kube-system -w

Updated: