How can I resolve the "dockertimeouterror unable transition start timeout after wait 3m0s" error in Amazon ECS for Fargate?

I get the "dockertimeouterror unable transition start timeout after wait 3m0s" error on my Amazon Elastic Container Service (Amazon ECS) tasks for AWS Fargate.

Short description

This error occurs when your Fargate tasks have a networking configuration issue. For Fargate, the default start timeout is 3 minutes. If a task doesn't transition from the PENDING state to the RUNNING state within 3 minutes, then the task fails and moves to the STOPPED state.

If your Fargate tasks run in a private subnet with no NAT instance or NAT gateway configured, then you must configure the required Amazon Virtual Private Cloud (Amazon VPC) endpoints. You need endpoints for the following services:

  • Amazon Elastic Container Registry (Amazon ECR): This is required for pulling the image from the ECR repository.
  • Amazon Simple Storage Service (Amazon S3): This is required because Amazon ECR uses Amazon S3 to store your image layers. When your containers download images from Amazon ECR, the containers must access Amazon ECR to get the image manifest and Amazon S3 to download the actual image layers.
  • AWS Secrets Manager or AWS Systems Manager: These are required if your task definitions reference Secrets Manager secrets or Systems Manager Parameter Store parameters to inject sensitive data into your containers. You must create the interface VPC endpoints for Secrets Manager or Systems Manager so that those tasks can reach those services. You need endpoints only for the specific service (Secrets Manager or Systems Manager) that hosts your sensitive data.
  • Amazon CloudWatch: This is required when the Fargate tasks use awslogs as the logging driver, because those tasks export their logs to CloudWatch. If you're using awslogs and the VPC endpoint for CloudWatch exists but isn't configured correctly, then your tasks can't reach the endpoint. You receive the following error: "DockerTimeoutError: Could not transition to started; timed out after waiting 3m0s."
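
The list above can be turned into a quick checklist of endpoint service names. The following is a minimal sketch, not an official tool: the function name and its flags are hypothetical, and the Amazon ECR service names (ecr.api and ecr.dkr) are an assumption taken from the Amazon ECR interface endpoint documentation rather than from this article. Recent Fargate platform versions are assumed to need both ECR endpoints.

```python
def required_endpoint_services(region, uses_secrets_manager=False,
                               uses_parameter_store=False, uses_awslogs=False):
    """Return the VPC endpoint service names that a Fargate task in a
    private subnet (no NAT) is likely to need. Hypothetical helper."""
    prefix = f"com.amazonaws.{region}"
    services = [
        f"{prefix}.ecr.api",  # Amazon ECR API calls (assumed name)
        f"{prefix}.ecr.dkr",  # Docker registry calls, image manifest (assumed name)
        f"{prefix}.s3",       # image layers stored in Amazon S3
    ]
    if uses_secrets_manager:
        services.append(f"{prefix}.secretsmanager")
    if uses_parameter_store:
        services.append(f"{prefix}.ssm")
    if uses_awslogs:
        services.append(f"{prefix}.logs")  # CloudWatch Logs
    return services


# Example: a task in us-east-1 that uses the awslogs driver only.
for name in required_endpoint_services("us-east-1", uses_awslogs=True):
    print(name)
```

You can compare the printed names against the Service name values shown for your endpoints in the Amazon VPC console.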

Resolution

Check if your task definition uses the awslogs logging driver

  1. Open the Amazon ECS console.
  2. In the navigation pane, choose Task Definitions.
  3. Choose the task definition that's used by your task or service, and then choose your task definition name.
  4. In the Container Definitions section of your task definition, choose the expander icon for your container in the Container Name column.
  5. In the Log Configuration subsection, check if Log driver is set to awslogs.
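
If you prefer to check the log driver outside the console, the same test can be run against the task definition JSON. The following is a minimal sketch that assumes the dictionary has the shape of the taskDefinition object returned by the ECS DescribeTaskDefinition API. The function name and the sample container names are hypothetical.

```python
def containers_using_awslogs(task_definition):
    """Return the names of containers whose log driver is awslogs."""
    names = []
    for container in task_definition.get("containerDefinitions", []):
        # logConfiguration is optional, so tolerate a missing or null value.
        log_config = container.get("logConfiguration") or {}
        if log_config.get("logDriver") == "awslogs":
            names.append(container.get("name"))
    return names


# Hypothetical sample data in the shape of a task definition.
sample = {
    "containerDefinitions": [
        {"name": "web", "logConfiguration": {"logDriver": "awslogs"}},
        {"name": "sidecar"},
    ]
}
print(containers_using_awslogs(sample))
```

A non-empty result means that the task needs a route to CloudWatch Logs, either through NAT or through a VPC endpoint.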

Important: You must use VPC endpoints if your tasks are running in a private subnet with no NAT gateway or NAT instance.

Confirm that you have a VPC endpoint for your Fargate tasks

  1. Open the Amazon VPC console.
  2. In the navigation pane, choose Endpoints.
  3. Check if com.amazonaws.region.logs exists in the Service name field, where region is the code of the AWS Region that your tasks run in (for example, us-east-1).

If the endpoint doesn't exist, then create a new endpoint.

If the endpoint does exist, then confirm that the endpoint is in the same VPC where the Fargate tasks are running. To do this in the VPC console, choose the endpoint, and then look for the VPC ID in the Details tab of the endpoint.

If the endpoint isn't in the same VPC as the Fargate tasks, then create a new endpoint in that VPC.
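
The decision steps above can be sketched as a small function. This is an illustration only: it assumes a list of endpoint dictionaries in the shape of the VpcEndpoints array returned by the EC2 DescribeVpcEndpoints API, and the function name, return values, and sample IDs are hypothetical.

```python
def check_logs_endpoint(endpoints, region, task_vpc_id):
    """Classify the CloudWatch Logs endpoint setup for the tasks' VPC.

    Returns "missing" if no logs endpoint exists, "wrong-vpc" if one
    exists only in another VPC, and "ok" if one is in the tasks' VPC.
    """
    service_name = f"com.amazonaws.{region}.logs"
    matches = [e for e in endpoints if e.get("ServiceName") == service_name]
    if not matches:
        return "missing"
    if any(e.get("VpcId") == task_vpc_id for e in matches):
        return "ok"
    return "wrong-vpc"


# Hypothetical sample data; the IDs are placeholders.
endpoints = [
    {"ServiceName": "com.amazonaws.us-east-1.logs", "VpcId": "vpc-aaaa"},
    {"ServiceName": "com.amazonaws.us-east-1.s3", "VpcId": "vpc-bbbb"},
]
print(check_logs_endpoint(endpoints, "us-east-1", "vpc-bbbb"))
```

A "missing" or "wrong-vpc" result maps to the "create a new endpoint" steps above. An "ok" result means that the security groups are the next thing to check.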

If the endpoint is in the same VPC as the Fargate tasks, then check the security groups for the following:

  • The security group associated with the VPC endpoint must have an ingress rule that allows traffic on port 443 from the Fargate tasks.
  • The security group associated with the Fargate task must have an egress rule that allows traffic on port 443 to the VPC endpoint.
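
The two rules above can be checked against exported security group data. The following is a rough sketch that assumes rule lists in the shape of the IpPermissions and IpPermissionsEgress arrays returned by the EC2 DescribeSecurityGroups API. The function name, group IDs, and IP address are hypothetical, and the check ignores IPv6 ranges and prefix lists.

```python
import ipaddress


def allows_https(permissions, peer_group_id=None, peer_ip=None):
    """Return True if any rule covers TCP port 443 for the given peer.

    The peer is matched by security group ID, by IPv4 address, or both.
    """
    for perm in permissions:
        protocol = perm.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and all ports
            port_ok = True
        elif protocol == "tcp":
            port_ok = perm.get("FromPort", 0) <= 443 <= perm.get("ToPort", 65535)
        else:
            port_ok = False
        if not port_ok:
            continue
        if peer_group_id is not None:
            pairs = perm.get("UserIdGroupPairs", [])
            if any(p.get("GroupId") == peer_group_id for p in pairs):
                return True
        if peer_ip is not None:
            address = ipaddress.ip_address(peer_ip)
            for ip_range in perm.get("IpRanges", []):
                network = ipaddress.ip_network(ip_range["CidrIp"], strict=False)
                if address in network:
                    return True
    return False


# Hypothetical sample rules; sg-task and 10.0.1.25 are placeholders.
endpoint_ingress = [{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
                     "UserIdGroupPairs": [{"GroupId": "sg-task"}], "IpRanges": []}]
task_egress = [{"IpProtocol": "-1", "UserIdGroupPairs": [],
                "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]

print(allows_https(endpoint_ingress, peer_group_id="sg-task"))
print(allows_https(task_egress, peer_ip="10.0.1.25"))
```

Pass the endpoint security group's ingress rules with the task's security group ID, and the task security group's egress rules with one of the endpoint's private IP addresses. Both calls should return True before the tasks can reach the endpoint.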

Now, your Fargate tasks can reach the CloudWatch endpoints that you created.


Related information

Amazon ECR interface VPC endpoints (AWS PrivateLink)

AWS OFFICIAL
Updated 3 years ago