Kubernetes Troubleshooting
How to troubleshoot pod creation errors and execute into containers.
Executing into Containers
Pod execution logs are sometimes unavailable because the pod fails before it redirects its logs. To understand such errors, you need to investigate the container directly.
One of the methods we use is building our pod as usual, but giving it a very long-running command, such as sleep. While the pod is sleeping, you can investigate its container.
The following steps describe how you can troubleshoot the databand_examples pipeline running in the Kubernetes cluster.
- Run a task that does nothing but sleep for a long time.
In the following example, we run a task named long_task that sleeps for 600 seconds.
dbnd run dbnd_test_scenarios.scheduler_scenarios.long_task --task-version now --env kubernetes_cluster_env
Your pod is now successfully initialized into a sleeping state.
- Find the ID of the container where long_task runs:
docker ps | grep "long"
- Run bash inside the container:
docker exec -it <CONTAINER_ID> /bin/bash
You now have a shell inside the pod's container.
- Run a command that fails, and investigate the issue.
Tip
You can get the command line that fails from the log of the driver task.
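The steps above use docker, which assumes you can reach the Docker daemon on the node that runs the pod. If you only have access to the cluster API, a roughly equivalent flow with kubectl might look like the sketch below. The helper names, the NAMESPACE default, and the assumption that the pod name contains "long" are illustrative and should be adjusted to your cluster.

```shell
# Sketch only: NAMESPACE and the "long" name pattern are assumptions.
NAMESPACE="${NAMESPACE:-default}"

# Print the names of pods whose name contains the given pattern (default: "long").
find_task_pods() {
  kubectl get pods -n "$NAMESPACE" --no-headers \
    -o custom-columns=NAME:.metadata.name \
    | grep -- "${1:-long}"
}

# Open an interactive bash shell in the given pod.
shell_into_pod() {
  kubectl exec -it "$1" -n "$NAMESPACE" -- /bin/bash
}
```

Call find_task_pods long to list candidate pods, then shell_into_pod <POD_NAME> to get a shell. If the image does not ship bash, /bin/sh is the usual fallback.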
Pod Creation Errors
databand-secrets-env, which is deployed when you install DBND, includes a cluster-role-binding that gives your driver the permissions it needs to execute pods inside your namespace. If this deployment fails for any reason, your driver reports errors on pod creation. Ensure that the driver has the required permissions.
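One way to check the permissions is to ask the API server directly with kubectl auth can-i, impersonating the service account that the driver runs under. The sketch below is illustrative only: the helper names, the NAMESPACE and SERVICE_ACCOUNT values, and the list of verbs are assumptions, not values taken from the DBND installation.

```shell
# Sketch only: NAMESPACE and SERVICE_ACCOUNT are hypothetical placeholders.
NAMESPACE="${NAMESPACE:-default}"
SERVICE_ACCOUNT="${SERVICE_ACCOUNT:-default}"

# Ask the API server whether the service account may perform a verb on pods.
# kubectl prints "yes" or "no".
can_driver() {
  kubectl auth can-i "$1" pods -n "$NAMESPACE" \
    --as="system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}"
}

# Report the result for each verb a driver would plausibly need (assumed list).
check_driver_permissions() {
  for verb in create get list watch delete; do
    printf '%s pods: %s\n' "$verb" "$(can_driver "$verb")"
  done
}
```

Running check_driver_permissions prints one line per verb. Note that the --as flag requires your own user to have impersonation rights in the cluster.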