Scenario-Based Kubernetes Interview Questions and Answers
1. A pod is in CrashLoopBackOff. What will you do?
Check pod logs with kubectl logs (use --previous to see output from the crashed container), run kubectl describe pod to inspect events and exit codes, verify resource limits and fix application errors.
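For example, a typical first pass with kubectl (the pod name here is a placeholder):

```shell
kubectl logs mypod --previous     # logs from the last crashed container
kubectl describe pod mypod        # events, restart count, last exit code
# Pull just the termination reason (e.g. OOMKilled, Error):
kubectl get pod mypod -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```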
2. Pods are not getting scheduled. What could be the reason?
Check node capacity, taints, node selectors, resource requests and cluster health.
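The scheduler reports its decision in the pod's events, so that is usually the fastest place to look (names are placeholders):

```shell
kubectl describe pod mypod        # "Events" section explains why scheduling failed
kubectl get nodes -o wide         # node status and roles
kubectl describe node mynode      # taints, allocatable resources, conditions
```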
3. Service is not accessible from outside.
Check service type (NodePort/LoadBalancer), endpoints, firewall rules and ingress configuration.
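A quick sketch of how to trace this, assuming a Service named myservice and an Ingress named myingress:

```shell
kubectl get svc myservice          # confirm type and exposed ports
kubectl get endpoints myservice    # empty endpoints = selector matches no ready pods
kubectl describe ingress myingress # rules, backends and assigned address
```

An empty endpoints list is the most common cause: either the Service selector does not match the pod labels, or the pods are failing their readiness probe.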
4. Pod is running but application is not responding.
Check container logs, liveness/readiness probes and application ports.
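A minimal container spec with both probes might look like this (the image, path and port are hypothetical):

```yaml
containers:
- name: app
  image: myapp:1.0
  ports:
  - containerPort: 8080
  readinessProbe:            # gates traffic to the pod
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 10
  livenessProbe:             # restarts the container if it hangs
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 15
    periodSeconds: 20
```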
5. Node goes to NotReady state.
Check kubelet status and logs, disk space and network connectivity; cordon and drain the node, then restart it if required.
6. Deployment update caused downtime.
Use rolling update strategy, increase replicas and configure readiness probes.
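A deployment strategy along these lines avoids downtime by never removing a pod before its replacement is ready:

```yaml
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # keep all existing pods until new ones are Ready
      maxSurge: 1         # add at most one extra pod during the rollout
```

This only works as intended if a readiness probe is configured; without one, new pods count as Ready immediately.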
7. High CPU usage in pods.
Enable HPA, optimize application and set proper resource limits.
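A Horizontal Pod Autoscaler can be created imperatively (deployment name and thresholds are illustrative; metrics-server must be installed):

```shell
kubectl autoscale deployment myapp --cpu-percent=70 --min=2 --max=10
kubectl top pods    # verify metrics are being collected
kubectl get hpa     # watch current vs target utilization
```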
8. Config change not reflecting in pod.
Restart pods or rollout restart deployment to reload ConfigMap or Secret.
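Since pods read a ConfigMap only at startup (unless it is mounted as a volume and the app re-reads it), a rollout restart is the usual fix:

```shell
kubectl rollout restart deployment myapp
kubectl rollout status deployment myapp   # wait for the new pods to become Ready
```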
9. Kubernetes cluster upgrade failed.
Roll back the upgrade, upgrade one node at a time (control plane before workers) and always test the upgrade in staging first.
10. Pod cannot access another service.
Check service name, DNS, network policies and namespace.
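One way to test DNS resolution from inside the cluster is a throwaway debug pod (service and namespace names are placeholders):

```shell
kubectl run dnstest --rm -it --image=busybox:1.36 --restart=Never -- \
  nslookup myservice.mynamespace.svc.cluster.local
kubectl get networkpolicy -n mynamespace   # a default-deny policy may be blocking traffic
```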
11. ImagePullBackOff error occurs.
Verify image name, tag, registry access and image pull secrets.
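For a private registry, the pull secret is created and then referenced from the pod spec (registry and credentials below are placeholders):

```shell
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=user \
  --docker-password=pass
```

The pod then references it with `imagePullSecrets: [{name: regcred}]` in its spec.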
12. Application needs persistent storage.
Use PersistentVolume and PersistentVolumeClaim with proper StorageClass.
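A minimal PersistentVolumeClaim sketch (the StorageClass name `standard` is an assumption; it varies by cluster):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: standard   # assumed; check `kubectl get storageclass`
  resources:
    requests:
      storage: 10Gi
```

The pod mounts it by listing `data-pvc` under `volumes` with a `persistentVolumeClaim` source.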
13. Too many pods on one node.
Use pod anti-affinity and increase nodes in cluster.
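A hard anti-affinity rule that spreads pods of one app across nodes could look like this (the `app: myapp` label is illustrative):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: myapp
      topologyKey: kubernetes.io/hostname   # one pod per node
```

Use `preferredDuringSchedulingIgnoredDuringExecution` instead if pods should still schedule when no spare node is available.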
14. Secrets leaked in logs.
Rotate the leaked credentials immediately, scrub them from log storage, and keep them in Kubernetes Secrets or an external secret manager such as Vault instead of application config.
15. Traffic spike causes failures.
Enable autoscaling, load balancing and tune resource limits.
16. Pod restarts randomly.
Check kubectl describe pod for an OOMKilled termination reason, review logs and adjust memory requests and limits.
17. Deployment rollback needed.
Use kubectl rollout undo to revert to previous version.
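The rollback workflow in practice (deployment name is a placeholder):

```shell
kubectl rollout history deployment myapp            # list revisions
kubectl rollout undo deployment myapp               # revert to the previous revision
kubectl rollout undo deployment myapp --to-revision=2   # or pick a specific one
```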
18. Namespace is consuming too many resources.
Apply ResourceQuota and LimitRange.
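A ResourceQuota caps the namespace as a whole (the numbers and namespace below are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: myteam
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"
```

A LimitRange complements it by setting per-container defaults, so individual pods without explicit requests still count sensibly against the quota.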
19. Unauthorized pod created.
Apply RBAC rules and audit cluster activity.
20. Logs not visible in monitoring tool.
Check log agent, permissions and disk space.
21. Kubernetes API server is slow.
Check etcd health, resource usage and network latency.
22. Stateful app losing data.
Use StatefulSet with persistent volume and proper storage class.
23. Pods cannot communicate across nodes.
Check CNI plugin, routing and firewall rules.
24. CI/CD deployment stuck.
Check kubectl permissions, kubeconfig and deployment status.
25. Cluster cost is high.
Scale down unused nodes, use autoscaling and optimize resource usage.
26. Pod stuck in Pending state.
Check events, node resources and scheduling constraints.
27. Health checks failing.
Verify endpoints, ports and timeout values.
28. Cluster backup needed.
Backup etcd and application volumes regularly.
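An etcd snapshot with etcdctl, assuming the default kubeadm certificate paths (adjust for your cluster):

```shell
ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
```

For application data, volume snapshots or a dedicated backup tool cover what etcd does not: etcd holds cluster state, not the contents of PersistentVolumes.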
29. Ingress not routing traffic.
Check ingress controller, rules and DNS.
30. How do you ensure high availability?
Use multiple replicas, autoscaling, health checks and multi-node clusters.
Source: sureshtechlabs.com