Scenario-Based AWS DevOps Interview Questions and Answers
1. EC2 instance is not reachable. What will you check?
Check security group, network ACL, instance state, public IP and route table.
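As a minimal sketch of the security group part of this check, the function below tests whether a set of inbound rules allows a port from a given client IP. It assumes the rules are in the `IpPermissions` shape returned by boto3's `describe_security_groups`; the name `port_is_open` and the sample rule are hypothetical, and only IPv4 CIDR ranges are considered (no referenced security groups or prefix lists).

```python
import ipaddress

def port_is_open(ip_permissions, port, source_ip, protocol="tcp"):
    """Return True if any inbound rule allows `port` from `source_ip`."""
    source = ipaddress.ip_address(source_ip)
    for rule in ip_permissions:
        proto = rule.get("IpProtocol")
        if proto == "-1":
            port_ok = True  # "-1" means all protocols and all ports
        elif proto == protocol:
            port_ok = rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)
        else:
            continue
        if not port_ok:
            continue
        for ip_range in rule.get("IpRanges", []):
            if source in ipaddress.ip_network(ip_range["CidrIp"]):
                return True
    return False

# Hypothetical sample rule: SSH allowed only from the 10.0.0.0/8 range.
rules = [{"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
          "IpRanges": [{"CidrIp": "10.0.0.0/8"}]}]
```

A rule that passes here can still be blocked by a network ACL or a missing route, so this covers only one item on the checklist.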
2. Application is slow in AWS.
Check CloudWatch metrics, CPU, memory, disk I/O, load balancer and database performance.
3. AWS bill suddenly increased.
Check Cost Explorer to see which service, account or region drove the increase, identify unused resources, stop idle instances and enable autoscaling.
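One common kind of unused resource is an EBS volume that is no longer attached to any instance. The sketch below lists such volumes; it assumes `ec2_client` is a boto3 EC2 client (for example `boto3.client("ec2")`), the function name is hypothetical, and pagination is not handled.

```python
def find_unattached_volumes(ec2_client):
    """Return (volume_id, size_gib) pairs for volumes in the 'available' state."""
    response = ec2_client.describe_volumes(
        Filters=[{"Name": "status", "Values": ["available"]}]
    )
    return [(vol["VolumeId"], vol["Size"]) for vol in response["Volumes"]]
```

Passing the client in as a parameter keeps the function easy to test with a stub before it is run against a real account.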
4. S3 object deleted accidentally.
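If versioning was enabled before the deletion, remove the delete marker or restore the previous version. If it was not, recover from a backup or replica, then enable versioning to protect against future deletions.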
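On a bucket that already had versioning enabled, a simple delete only adds a delete marker, and removing that marker makes the previous version current again. A minimal sketch, assuming `s3_client` is a boto3 S3 client; the function name is hypothetical and only the first page of versions is inspected.

```python
def undelete_object(s3_client, bucket, key):
    """Remove the latest delete marker for `key`; return True if one was removed."""
    response = s3_client.list_object_versions(Bucket=bucket, Prefix=key)
    for marker in response.get("DeleteMarkers", []):
        # Prefix matching can return other keys, so compare the key exactly.
        if marker["Key"] == key and marker["IsLatest"]:
            s3_client.delete_object(
                Bucket=bucket, Key=key, VersionId=marker["VersionId"]
            )
            return True
    return False
```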
5. RDS database is down.
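If the instance is Multi-AZ, confirm that the automatic failover to the standby completed; otherwise check CloudWatch alarms and RDS events, and restore from a backup or snapshot if required.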
6. Load balancer shows unhealthy targets.
Check health check path, security group rules and application status.
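The reason a target is marked unhealthy is usually the fastest clue. The sketch below summarizes the non-healthy targets of a target group; it assumes `elbv2_client` is a boto3 `elbv2` client, and the function name is hypothetical.

```python
def unhealthy_targets(elbv2_client, target_group_arn):
    """Return (target_id, state, reason) for every target that is not healthy."""
    response = elbv2_client.describe_target_health(TargetGroupArn=target_group_arn)
    problems = []
    for desc in response["TargetHealthDescriptions"]:
        health = desc["TargetHealth"]
        if health["State"] != "healthy":
            problems.append(
                (desc["Target"]["Id"], health["State"], health.get("Reason", ""))
            )
    return problems
```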
7. Auto Scaling not scaling out.
Check scaling policies, CloudWatch alarms and instance limits.
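Two quick things to rule out are a group that is already at its maximum size and a suspended scaling process. A rough sketch, assuming `group` is one entry of `AutoScalingGroups` from boto3's `describe_auto_scaling_groups`; the function name and the choice of which processes to flag are assumptions.

```python
def scale_out_blockers(group):
    """Return human-readable reasons why the group may not be scaling out."""
    blockers = []
    if group["DesiredCapacity"] >= group["MaxSize"]:
        blockers.append("desired capacity is already at MaxSize")
    for proc in group.get("SuspendedProcesses", []):
        # Launch and AlarmNotification both affect alarm-driven scale-out.
        if proc["ProcessName"] in ("Launch", "AlarmNotification"):
            blockers.append("process suspended: " + proc["ProcessName"])
    return blockers
```

An empty result does not prove the group is fine; alarm thresholds and account-level instance limits still need a separate look.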
8. IAM user accessed unauthorized resource.
Review CloudTrail logs to see what was accessed, review IAM policies, revoke access and rotate credentials.
9. Lambda function timing out.
Increase timeout, optimize code and check downstream services.
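When the work is a batch, one way to avoid a hard timeout is to stop early and report what is left. The sketch below uses the Lambda context method `get_remaining_time_in_millis()`; the event shape (`items`), the `process` helper and the 5-second margin are hypothetical.

```python
SAFETY_MARGIN_MS = 5000  # assumed margin; tune to the slowest single item

def process(item):
    """Hypothetical stand-in for the real per-item work."""
    return item.upper()

def handler(event, context):
    processed = []
    for item in event["items"]:
        # Stop before Lambda kills the invocation mid-item.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        processed.append(process(item))
    remaining = event["items"][len(processed):]
    return {"processed": processed, "remaining": remaining}
```

The caller can re-invoke the function with the `remaining` items, which keeps each invocation within the configured timeout.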
10. CloudWatch logs not appearing.
Check IAM role permissions and log group configuration.
11. EC2 disk is full.
Extend the EBS volume and then grow the partition and file system, clean up old logs and enable log rotation.
12. Application needs zero downtime deployment.
Use ALB with Auto Scaling, blue-green or rolling deployment strategy.
13. VPC peering not working.
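Check that route tables on both sides point to the peering connection, that the VPC CIDR blocks do not overlap, and that security group rules allow the traffic.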
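The CIDR overlap check can be done offline with the standard library. A small sketch; the function name and the sample ranges are hypothetical.

```python
import ipaddress

def cidrs_overlap(cidr_a, cidr_b):
    """Return True if the two CIDR blocks share any address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))

print(cidrs_overlap("10.0.0.0/16", "10.0.1.0/24"))  # True
print(cidrs_overlap("10.0.0.0/16", "10.1.0.0/16"))  # False
```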
14. EKS pods cannot access internet.
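Check that the pod subnets route 0.0.0.0/0 to a NAT gateway, that the NAT gateway sits in a public subnet with a route to an internet gateway, and that security groups and network ACLs allow outbound traffic.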
15. Secrets leaked in code.
Move secrets to AWS Secrets Manager or Parameter Store and rotate keys.
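Once a secret lives in Secrets Manager, the application reads it at runtime instead of from source code. A minimal sketch, assuming `secrets_client` is a boto3 `secretsmanager` client and that the secret was stored as a JSON string; the function name and the secret ID in the usage note are hypothetical.

```python
import json

def load_credentials(secrets_client, secret_id):
    """Fetch a secret and parse its JSON payload into a dict."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])
```

For example, `load_credentials(boto3.client("secretsmanager"), "prod/db")` would return a dict such as `{"username": ..., "password": ...}` if the secret was saved in that form.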
16. ALB returns 502 error.
Check target group health, app logs and listener rules.
17. ECS service keeps restarting.
Check task logs, CPU/memory limits and health checks.
18. CloudFormation stack failed.
Check error events, fix template and redeploy.
19. S3 is publicly accessible.
Block public access, fix bucket policy and enable encryption.
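Blocking public access on a single bucket can be sketched as below. It assumes `s3_client` is a boto3 S3 client; the function name is hypothetical, and turning on all four settings is an assumption that suits buckets with no intended public content.

```python
def block_public_access(s3_client, bucket):
    """Turn on all four S3 Block Public Access settings for one bucket."""
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
```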
20. DynamoDB throttling errors.
Increase capacity or enable on-demand mode.
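Switching a table to on-demand mode can be sketched as follows. It assumes `dynamodb_client` is a boto3 DynamoDB client; the function name is hypothetical, and the check relies on `BillingModeSummary` being absent for tables that have always used provisioned capacity.

```python
def switch_to_on_demand(dynamodb_client, table_name):
    """Switch the table to on-demand billing; return False if already on-demand."""
    table = dynamodb_client.describe_table(TableName=table_name)["Table"]
    summary = table.get("BillingModeSummary", {})
    if summary.get("BillingMode") == "PAY_PER_REQUEST":
        return False
    dynamodb_client.update_table(
        TableName=table_name, BillingMode="PAY_PER_REQUEST"
    )
    return True
```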
21. EC2 instance lost data after restart.
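Data on instance store volumes is lost when the instance is stopped, hibernated or terminated (it survives a plain reboot). Keep persistent data on EBS volumes and take regular backups.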
22. Route 53 not resolving domain.
Check hosted zone records and DNS propagation.
23. Backup strategy needed.
Use AWS Backup, snapshots and lifecycle policies.
24. Multi-region disaster recovery needed.
Use replication, Route 53 failover and regular DR testing.
25. Application logs are huge.
Use CloudWatch retention policies and log rotation.
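Log groups keep data indefinitely unless a retention policy is set, so a simple sweep can apply a default to groups that have none. A sketch assuming `logs_client` is a boto3 `logs` client; the function name and the 30-day default are assumptions, and only the first page of log groups is read.

```python
def set_missing_retention(logs_client, days=30):
    """Apply a retention policy to log groups that never expire; return their names."""
    updated = []
    for group in logs_client.describe_log_groups()["logGroups"]:
        # Groups without 'retentionInDays' keep their logs forever.
        if "retentionInDays" not in group:
            logs_client.put_retention_policy(
                logGroupName=group["logGroupName"], retentionInDays=days
            )
            updated.append(group["logGroupName"])
    return updated
```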
26. User complains of high latency.
Use CloudFront CDN and optimize backend services.
27. EBS snapshot taking too long.
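EBS snapshots are incremental, so the first snapshot of a volume is the slowest and later ones copy only changed blocks. Check the volume size, the amount of changed data and the snapshot schedule.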
28. API Gateway returns 403.
Check IAM permissions, the authorizer, the resource policy, and any API key requirement or AWS WAF rules.
29. Infrastructure drift detected.
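Review the drift first (terraform plan or CloudFormation drift detection), then either reapply Terraform or CloudFormation to restore the desired state or update the code if the manual change should be kept.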
30. How do you ensure AWS reliability?
Use multi-AZ setup, autoscaling, monitoring and backups.
Source: sureshtechlabs.com