How do I delay the termination of an unhealthy Amazon EC2 instance so that I can troubleshoot issues?

My Amazon Elastic Compute Cloud (Amazon EC2) instance was marked unhealthy and was terminated before I could determine the cause of the issue.

Short description

To troubleshoot an unhealthy EC2 instance before termination, add an Amazon EC2 Auto Scaling lifecycle hook to move the instance's status from Terminating to Terminating:Wait.

By default, an instance remains in the Terminating:Wait state for 3,600 seconds (1 hour). To increase this time, use the --heartbeat-timeout parameter of the put-lifecycle-hook command. The maximum time that you can keep an instance in the Terminating:Wait state is 48 hours or 100 times the heartbeat timeout, whichever is smaller.
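The limit above is simple arithmetic, and it can help to compute it before you choose a timeout. The following is a minimal sketch; max_wait_seconds is a hypothetical helper name and is not part of the AWS CLI:

```shell
# Upper bound on the Terminating:Wait time for a given heartbeat timeout:
# the smaller of 48 hours (172,800 seconds) and 100 x the heartbeat timeout.
# max_wait_seconds is a hypothetical helper, not an AWS CLI command.
max_wait_seconds() {
  heartbeat=$1
  cap=172800                          # 48 hours in seconds
  by_heartbeat=$((heartbeat * 100))   # 100 x heartbeat timeout
  if [ "$by_heartbeat" -lt "$cap" ]; then
    echo "$by_heartbeat"
  else
    echo "$cap"
  fi
}

max_wait_seconds 3600   # default timeout: 360000 > 172800, so the 48-hour cap applies
max_wait_seconds 600    # 60000 seconds, so the heartbeat limit applies
```

With the default 3,600-second timeout, the 48-hour cap is the binding limit. With shorter timeouts, the 100-heartbeat limit is reached first.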

Resolution

Use the AWS Command Line Interface (AWS CLI) to configure a lifecycle hook.

Note: If you receive errors when you run AWS CLI commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.

Create an Amazon SNS topic

To create an Amazon Simple Notification Service (Amazon SNS) topic, complete the following steps:

  1. Create an SNS topic where the EC2 Auto Scaling group sends lifecycle notifications. The following example runs the create-topic command to create the ASNotifications topic:

    $ aws sns create-topic --name ASNotifications

    The output returns an ARN that's similar to the following:

    "TopicArn": "arn:aws:sns:us-west-2:123456789012:ASNotifications"
  2. Create a subscription to the topic. You must have a subscription to receive the LifecycleActionToken that you can use to extend the heartbeat timeout of the wait state or to complete the lifecycle action. The following example runs the subscribe command to create a subscription that uses the email protocol with the user@amazon.com endpoint email address:

    $ aws sns subscribe --topic-arn arn:aws:sns:us-west-2:123456789012:ASNotifications --protocol email --notification-endpoint user@amazon.com
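If you script these two steps, you might capture the topic ARN directly instead of copying it from the JSON output. The following is a sketch under that assumption. The function names create_topic_arn and subscribe_email are hypothetical. The --query and --output options are standard AWS CLI global options.

```shell
# Create the topic and print only its ARN.
# create-topic returns the existing ARN if you already own a topic with this name.
# create_topic_arn is a hypothetical helper name.
create_topic_arn() {
  aws sns create-topic --name "$1" --query TopicArn --output text
}

# Subscribe an email address to the topic.
# The recipient must confirm the subscription from the confirmation email
# before Amazon SNS delivers notifications to that address.
# subscribe_email is a hypothetical helper name.
subscribe_email() {
  aws sns subscribe --topic-arn "$1" --protocol email --notification-endpoint "$2"
}

# Example usage (requires configured AWS credentials):
#   TOPIC_ARN=$(create_topic_arn ASNotifications)
#   subscribe_email "$TOPIC_ARN" user@amazon.com
```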

Configure IAM permissions

Configure an AWS Identity and Access Management (IAM) role that grants the Amazon EC2 Auto Scaling service permission to publish to the SNS topic. To complete this task, create a text file that contains the role's trust policy. Then, reference the file in the create-role command.

  1. Use a text editor, such as vi, to create the text file:

    $ vi assume-role.txt
  2. Enter the following information into the text file, and then save the file:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "Service": "autoscaling.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }
  3. Run the create-role command to create the AS-Lifecycle-Hook-Role IAM role with the trust policy that's saved in assume-role.txt:

    $ aws iam create-role --role-name AS-Lifecycle-Hook-Role --assume-role-policy-document file://assume-role.txt

    The output contains the role's ARN. Note the ARN of both the IAM role and the SNS topic.

  4. Add permissions to the role to allow EC2 Auto Scaling to send SNS notifications when a lifecycle hook event occurs. The following example runs the attach-role-policy command to attach the AutoScalingNotificationAccessRole AWS managed policy to the AS-Lifecycle-Hook-Role IAM role:

    $ aws iam attach-role-policy --role-name AS-Lifecycle-Hook-Role --policy-arn arn:aws:iam::aws:policy/service-role/AutoScalingNotificationAccessRole

    The preceding managed policy grants the following permissions:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Resource": "*",
          "Action": [
            "sqs:SendMessage",
            "sqs:GetQueueUrl",
            "sns:Publish"
          ]
        }
      ]
    }

    Important: The AutoScalingNotificationAccessRole AWS managed policy allows EC2 Auto Scaling to publish to all SNS topics and to send messages to all Amazon Simple Queue Service (Amazon SQS) queues. To restrict EC2 Auto Scaling's access to a specific SNS topic, use a policy similar to the following sample instead. Because the resource is an SNS topic, only the sns:Publish action applies:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Resource": "arn:aws:sns:us-west-2:123456789012:ASNotifications",
          "Action": [
            "sns:Publish"
          ]
        }
      ]
    }
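The article doesn't show how to apply a topic-scoped policy to the role. One possible approach, sketched below, is to add it as an inline policy with the put-role-policy command. The helper names, the sns-publish-policy.json file name, and the SNSPublishToASNotifications policy name are all hypothetical. The sketch grants only sns:Publish because the notification target in this walkthrough is an SNS topic.

```shell
# Print an IAM policy that allows publishing only to the given SNS topic ARN.
# sns_publish_policy is a hypothetical helper name.
sns_publish_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sns:Publish",
      "Resource": "$1"
    }
  ]
}
EOF
}

# Save the policy to a file and add it to the role as an inline policy.
# $1 = role name, $2 = SNS topic ARN
# The file name and policy name below are hypothetical.
attach_restricted_policy() {
  sns_publish_policy "$2" > sns-publish-policy.json
  aws iam put-role-policy \
    --role-name "$1" \
    --policy-name SNSPublishToASNotifications \
    --policy-document file://sns-publish-policy.json
}

# Example usage (requires configured AWS credentials):
#   attach_restricted_policy AS-Lifecycle-Hook-Role \
#     arn:aws:sns:us-west-2:123456789012:ASNotifications
```

If you already attached the AutoScalingNotificationAccessRole managed policy, the broader permissions stay in effect until you remove it, for example with the detach-role-policy command.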

Configure the lifecycle hook

Next, run the put-lifecycle-hook command to configure the lifecycle hook:

aws autoscaling put-lifecycle-hook \
    --lifecycle-hook-name AStroubleshoot \
    --auto-scaling-group-name MyASGroup \
    --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
    --notification-target-arn arn:aws:sns:us-west-2:123456789012:ASNotifications \
    --role-arn arn:aws:iam::123456789012:role/AS-Lifecycle-Hook-Role

Note: Replace the example values with your EC2 Auto Scaling group name, SNS target ARN, and IAM role ARN.

The put-lifecycle-hook command completes the following functions:

  • Names the lifecycle hook (AStroubleshoot)
  • Identifies the EC2 Auto Scaling group that's associated with the lifecycle hook (MyASGroup)
  • Configures the hook for the instance termination lifecycle stage (EC2_INSTANCE_TERMINATING)
  • Specifies the SNS topic's ARN (arn:aws:sns:us-west-2:123456789012:ASNotifications)
  • Specifies the IAM role's ARN (arn:aws:iam::123456789012:role/AS-Lifecycle-Hook-Role)
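The command above uses the default heartbeat timeout. If you need a longer or shorter wait, a variant such as the following sketch sets the timeout explicitly and then lists the group's hooks to confirm the result. The function names are hypothetical, and the 7200-second value in the usage comment is only an illustrative choice. The --heartbeat-timeout and --default-result options and the describe-lifecycle-hooks command are part of the AWS CLI.

```shell
# Create or update the termination hook with an explicit heartbeat timeout.
# $1 = Auto Scaling group name, $2 = SNS topic ARN,
# $3 = IAM role ARN, $4 = heartbeat timeout in seconds
# put_termination_hook is a hypothetical helper name.
put_termination_hook() {
  aws autoscaling put-lifecycle-hook \
    --lifecycle-hook-name AStroubleshoot \
    --auto-scaling-group-name "$1" \
    --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
    --notification-target-arn "$2" \
    --role-arn "$3" \
    --heartbeat-timeout "$4" \
    --default-result CONTINUE
}

# List the lifecycle hooks on the group to confirm the settings.
# show_hooks is a hypothetical helper name.
show_hooks() {
  aws autoscaling describe-lifecycle-hooks --auto-scaling-group-name "$1"
}

# Example usage (requires configured AWS credentials; 7200 is illustrative):
#   put_termination_hook MyASGroup \
#     arn:aws:sns:us-west-2:123456789012:ASNotifications \
#     arn:aws:iam::123456789012:role/AS-Lifecycle-Hook-Role 7200
#   show_hooks MyASGroup
```

The --default-result value determines what happens when the timeout elapses without a response. For a terminating instance, both CONTINUE and ABANDON end with the instance being terminated, but CONTINUE lets any other lifecycle hooks on the group complete first.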

Test the lifecycle hook

To test the lifecycle hook, first choose an instance. Then, run the terminate-instance-in-auto-scaling-group command to forcefully terminate the instance. After the instance moves to the Terminating:Wait state, run the record-lifecycle-action-heartbeat command to keep the instance in this state. Or, run the complete-lifecycle-action command to let the termination finish:

aws autoscaling complete-lifecycle-action \
    --lifecycle-hook-name AStroubleshoot \
    --auto-scaling-group-name MyASGroup \
    --lifecycle-action-result CONTINUE \
    --instance-id i-0e7380909ffaab747
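The other commands in the test sequence aren't shown above, so the following sketch illustrates how they might look. The function names are hypothetical, and the hook name AStroubleshoot is assumed to match the hook that you created earlier. Each function identifies the instance by ID, so no LifecycleActionToken is needed in this sketch.

```shell
# Terminate one instance through its Auto Scaling group.
# --no-should-decrement-desired-capacity keeps the desired capacity unchanged,
# so the group launches a replacement instance.
# terminate_for_test is a hypothetical helper name.
terminate_for_test() {
  aws autoscaling terminate-instance-in-auto-scaling-group \
    --instance-id "$1" \
    --no-should-decrement-desired-capacity
}

# Print the instance's current lifecycle state, for example Terminating:Wait.
# lifecycle_state is a hypothetical helper name.
lifecycle_state() {
  aws autoscaling describe-auto-scaling-instances \
    --instance-ids "$1" \
    --query 'AutoScalingInstances[0].LifecycleState' \
    --output text
}

# Restart the heartbeat timer to keep the instance in Terminating:Wait.
# $1 = Auto Scaling group name, $2 = instance ID
# extend_wait is a hypothetical helper name.
extend_wait() {
  aws autoscaling record-lifecycle-action-heartbeat \
    --lifecycle-hook-name AStroubleshoot \
    --auto-scaling-group-name "$1" \
    --instance-id "$2"
}

# Example usage (requires configured AWS credentials):
#   terminate_for_test i-0e7380909ffaab747
#   lifecycle_state i-0e7380909ffaab747
#   extend_wait MyASGroup i-0e7380909ffaab747
```

Each heartbeat restarts the timeout period, so you can repeat extend_wait while you troubleshoot, up to the maximum wait time for the hook.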

Related information

Amazon EC2 Auto Scaling lifecycle hooks

Creating a role to delegate permissions to an AWS service

Creating an Amazon SNS topic

AWS OFFICIAL - Updated 3 months ago