




How To Fix - "DockerTimeoutError" Error in AWS Jobs?



In this post, we will explore how to fix the "DockerTimeoutError" error in AWS jobs. Error logs -


"DockerTimeoutError: Could not transition to started; timed out after waiting xm0s".


"DockerTimeoutError: Could not transition to started; timed out after waiting 3m0s"


CannotInspectContainerError: Could not transition to inspecting; timed out after waiting xs

 

The default timeout for the AWS ECS container agent is four minutes. If a Docker operation takes longer than that, AWS Batch returns a DockerTimeoutError. First things first, check the basic details of Docker.


$ docker info


$ docker --debug info


$ docker system info

 

There might be various causes for this error. Below are a few checks and corresponding guidelines to work through. Try them and see if that helps.

Checks and Guidelines:

 

  • Have all previously stopped containers been deleted to free up space?
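If not, a quick prune on the container instance reclaims that space. A minimal sketch (the `command -v` guard and `|| true` fallbacks just let it skip cleanly on hosts without a working Docker daemon):

```shell
# Prune stopped containers and dangling images to free up disk space.
# The guard skips cleanly on hosts where the Docker CLI is unavailable.
if command -v docker >/dev/null 2>&1; then
  docker container prune -f || true   # delete all stopped containers
  docker image prune -f || true       # delete dangling (untagged) images
  docker system df || true            # show how much space is in use now
else
  echo "docker CLI not found; run this on the ECS container instance"
fi
cleanup_done=1
```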
 

  • Have you included the ECS cleanup process in the AMI? You can use the environment variables below to automate image cleanup:
  ECS_IMAGE_CLEANUP_INTERVAL - Specifies how frequently the automated image cleanup process checks for images to delete. The default is every 30 minutes; you can reduce it to 10 minutes to remove images more frequently.
  ECS_IMAGE_MINIMUM_CLEANUP_AGE - Specifies the minimum amount of time between when an image was pulled and when it may become a candidate for removal. The default is 1 hour; this prevents cleaning up images that have just been pulled.
  ECS_NUM_IMAGES_DELETE_PER_CYCLE - Specifies how many images may be removed during a single cleanup cycle. The default is 5 and the minimum is 1.
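On the EC2 launch type these agent variables live in `/etc/ecs/ecs.config` on the container instance. A sample fragment (the values here are illustrative, not prescriptive - tune them for your workload):

```shell
# /etc/ecs/ecs.config -- ECS agent image-cleanup tuning (example values)
ECS_IMAGE_CLEANUP_INTERVAL=10m       # check for deletable images every 10 minutes
ECS_IMAGE_MINIMUM_CLEANUP_AGE=30m    # only images pulled >30 minutes ago are candidates
ECS_NUM_IMAGES_DELETE_PER_CYCLE=10   # remove up to 10 images per cleanup cycle
```

After editing the file, restart the ECS agent so it picks up the new values.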

  • What launch type are you using - EC2 or Fargate? The Fargate launch type might cause issues running Windows containers, so you might have to use EC2 with Windows.
  • Are you using VPC endpoints? You have to use VPC endpoints if the tasks run in a private subnet (with no NAT gateway/NAT instance). To download images from ECR, the container instance requires access to the ECR and S3 endpoints. So if your subnet is private, either use the PrivateLink option or a NAT gateway to reach the ECR endpoints.
 

  • Do you have a VPC endpoint for your Fargate tasks?
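For pulling from ECR out of a private subnet you typically need the `com.amazonaws.<region>.ecr.api` and `com.amazonaws.<region>.ecr.dkr` interface endpoints, an S3 gateway endpoint, and (if you use the awslogs driver) `com.amazonaws.<region>.logs`. A quick guarded check of what already exists in a VPC - the VPC id below is a placeholder:

```shell
# List existing VPC endpoints so you can spot missing ECR/S3/logs endpoints.
# VPC_ID is a placeholder -- substitute your own. The guard keeps this
# sketch from failing on machines without the AWS CLI or credentials.
VPC_ID="vpc-0123456789abcdef0"   # placeholder, replace with your VPC id
if command -v aws >/dev/null 2>&1; then
  aws ec2 describe-vpc-endpoints \
    --filters "Name=vpc-id,Values=${VPC_ID}" \
    --query 'VpcEndpoints[].ServiceName' \
    --output text || echo "aws call failed (check credentials/region)"
else
  echo "aws CLI not found"
fi
endpoint_check_done=1
```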
 

  • What is your log driver? In the AWS ECS console, within your task definition's container definition, under Log Configuration, set the log driver to awslogs.
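In the container definition this corresponds to a `logConfiguration` block like the following (the log group name and region are placeholders for illustration):

```json
{
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-group": "/ecs/my-task",
      "awslogs-region": "us-east-1",
      "awslogs-stream-prefix": "ecs"
    }
  }
}
```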
 

  • Check your service's security group. Does it allow the required port? You might have to add an ingress rule to allow the port; otherwise this can also cause a timeout.
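Adding the ingress rule from the CLI looks roughly like this - the security group id, port, and CIDR are all placeholders you would replace with your own values:

```shell
# Add an ingress rule for the container port (all values are placeholders).
SG_ID="sg-0123456789abcdef0"   # placeholder security group id
PORT=8080                      # placeholder container/host port
if command -v aws >/dev/null 2>&1; then
  aws ec2 authorize-security-group-ingress \
    --group-id "${SG_ID}" \
    --protocol tcp \
    --port "${PORT}" \
    --cidr 10.0.0.0/16 || echo "aws call failed (check ids/credentials)"
else
  echo "aws CLI not found"
fi
ingress_done=1
```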
 

  • Have you run out of EBS burst credits? Check the BurstBalance metric for your EC2 instance.
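One way to check is to pull the BurstBalance metric from CloudWatch for the instance's volume - a value near 0 means the volume has exhausted its IO burst credits. The volume id below is a placeholder:

```shell
# Pull the recent EBS BurstBalance (%) for a volume. A value near 0
# means the burst credits are exhausted. VOLUME_ID is a placeholder.
VOLUME_ID="vol-0123456789abcdef0"
if command -v aws >/dev/null 2>&1; then
  aws cloudwatch get-metric-statistics \
    --namespace AWS/EBS \
    --metric-name BurstBalance \
    --dimensions Name=VolumeId,Value="${VOLUME_ID}" \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time   "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 300 \
    --statistics Minimum || echo "aws call failed (check ids/credentials)"
else
  echo "aws CLI not found"
fi
burst_check_done=1
```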
 

  • What is the volume IO utilization? If it is very high for the EC2 instances, Docker operations can time out. In that case, use a larger volume or a different volume type such as a solid state drive (SSD).
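You can spot-check IO utilization on the instance itself with `iostat` (from the sysstat package) - sustained %util near 100 suggests the volume is the bottleneck:

```shell
# Spot-check volume IO utilization on the instance. Sustained %util near
# 100 means Docker operations may be starved of IO.
if command -v iostat >/dev/null 2>&1; then
  iostat -x 1 3    # extended device stats, 3 samples, 1 second apart
else
  echo "iostat not found (install the sysstat package)"
fi
iostat_done=1
```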
Hope this helps to fix the AWS issue.
