How do I troubleshoot Systems Manager Run Command failures?

When I use AWS Systems Manager Run Command to run commands on my managed Amazon Elastic Compute Cloud (Amazon EC2) instance, the process fails.

Resolution

Prerequisites

Before you can use Run Command to manage EC2 instances, you must configure an AWS Identity and Access Management (IAM) user policy for every user who runs commands. To verify your setup, complete the following steps:

  1. Verify that an IAM instance profile role for Systems Manager is attached to your EC2 instances. For more information, see Configure instance permissions for Systems Manager.
  2. Review the IAM policy created for the role or user. The policy must include permissions for ec2messages API calls because the ec2messages endpoint is required to send and receive commands.
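
The policy review in step 2 can be sketched as a small offline check on a policy document that you already exported as JSON. This is a minimal sketch, not an official validator. It assumes that the six actions listed in the code are the ec2messages actions you expect (they are the ones granted by the AWS managed policy AmazonSSMManagedInstanceCore), and it looks only at Allow statements. The function name missing_ec2messages_actions is hypothetical.

```python
from fnmatch import fnmatchcase

# Assumption: the ec2messages actions granted by the AWS managed policy
# AmazonSSMManagedInstanceCore. Adjust the list if your setup differs.
EC2MESSAGES_ACTIONS = [
    "ec2messages:AcknowledgeMessage",
    "ec2messages:DeleteMessage",
    "ec2messages:FailMessage",
    "ec2messages:GetEndpoint",
    "ec2messages:GetMessages",
    "ec2messages:SendReply",
]


def _as_list(value):
    """IAM accepts a single string or a list for Action; normalize to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def missing_ec2messages_actions(policy):
    """Return the ec2messages actions that no Allow statement in `policy` covers.

    Simplification: Deny statements, NotAction, Resource, and Condition
    elements are ignored, so treat the result as a hint, not a verdict.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    # Collect every action pattern from Allow statements (matching is case-insensitive).
    allowed_patterns = [
        pattern.lower()
        for statement in statements
        if statement.get("Effect") == "Allow"
        for pattern in _as_list(statement.get("Action"))
    ]

    return [
        action
        for action in EC2MESSAGES_ACTIONS
        if not any(fnmatchcase(action.lower(), p) for p in allowed_patterns)
    ]
```

An empty list means that every listed action is matched by at least one Allow statement. Any returned action is a candidate for a missing permission.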

Note: When you configure Default Host Management Configuration, Systems Manager automatically manages EC2 instances that don't have an IAM instance profile. All the associated EC2 instances must use Instance Metadata Service Version 2 (IMDSv2) and run AWS Systems Manager Agent (SSM Agent) version 3.2.582.0 or later. Configure AWSSystemsManagerDefaultEC2InstanceManagementRole as the default IAM role. This role contains the minimum set of permissions necessary to manage EC2 instances with Systems Manager. The policy attached to the IAM role must allow ec2messages API calls because the ec2messages endpoint is required to send and receive commands.
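
The version requirement in the preceding note can be checked with a numeric comparison. The following is a minimal sketch that assumes the agent version is a dot-separated string of integers, such as the minimum version quoted in the note. The function names are hypothetical.

```python
# Minimum SSM Agent version for Default Host Management Configuration,
# taken from the note above.
MIN_DHMC_AGENT_VERSION = "3.2.582.0"


def parse_version(version):
    """Turn a dotted version string such as '3.2.582.0' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def supports_default_host_management(agent_version):
    """Return True when agent_version is at least MIN_DHMC_AGENT_VERSION.

    Tuples are compared component by component, so '3.2.99.0' correctly
    sorts below '3.2.582.0' (a plain string comparison would get this wrong).
    """
    return parse_version(agent_version) >= parse_version(MIN_DHMC_AGENT_VERSION)
```

You can read the installed agent version from the Fleet Manager node details and pass it to supports_default_host_management.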

Troubleshoot Run Command failures

In Systems Manager, under Fleet Manager, the EC2 instances must be listed under Managed nodes, and the SSM Agent ping status must be Online. If Run Command fails, then try these troubleshooting options:

Review Run Command status details

To review the Run Command status details, complete the following steps:

  1. Open the Systems Manager console, and then choose Run Command from the navigation pane.
  2. Choose the hyperlinked Command ID to open the Command status page.
  3. From the Targets and outputs section, choose the hyperlinked Instance ID, and then review the output.
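
The same status details are also available programmatically. The following is a minimal sketch that assumes boto3 is installed and that your credentials allow the ssm:GetCommandInvocation action. The function names fetch_invocation and summarize_invocation are hypothetical, and the set of statuses treated as failures is an illustrative choice.

```python
# Illustrative choice: statuses that this sketch treats as a failed run.
FAILURE_STATUSES = {"Failed", "TimedOut", "Cancelled"}


def summarize_invocation(invocation):
    """Pick the troubleshooting fields out of a GetCommandInvocation response."""
    status = invocation.get("Status")
    return {
        "status": status,
        "status_details": invocation.get("StatusDetails"),
        "exit_code": invocation.get("ResponseCode"),
        "stderr": invocation.get("StandardErrorContent", ""),
        "failed": status in FAILURE_STATUSES,
    }


def fetch_invocation(command_id, instance_id, region_name=None):
    """Call GetCommandInvocation for one command on one instance."""
    import boto3  # imported lazily so summarize_invocation works without boto3

    client = boto3.client("ssm", region_name=region_name)
    return client.get_command_invocation(
        CommandId=command_id, InstanceId=instance_id
    )
```

For example, summarize_invocation(fetch_invocation(command_id, instance_id)) returns the status, the exit code, and the captured standard error for that invocation.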

If the output is truncated, then connect to the EC2 instance (for example, using SSH), and open the following files to see the full error details. Note the exit status codes, and then see Troubleshooting Systems Manager Run Command for additional troubleshooting steps.

For Linux and macOS:

  • /var/lib/amazon/ssm/<instance-id>/document/orchestration/<command-id>/<Plugin-name>/<Step-name>/stdout
  • /var/lib/amazon/ssm/<instance-id>/document/orchestration/<command-id>/<Plugin-name>/<Step-name>/stderr

For Windows:

  • %ProgramData%\Amazon\SSM\InstanceData\<ManagedInstance-ID>\document\orchestration\<Command-ID>\<plug-in>\<step_number.plug-in>\stdout
  • %ProgramData%\Amazon\SSM\InstanceData\<ManagedInstance-ID>\document\orchestration\<Command-ID>\<plug-in>\<step_number.plug-in>\stderr
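
The path templates above can be filled in with a small helper. The following is a minimal sketch that only builds the path strings from the templates shown in this article and doesn't read any files. The function names are hypothetical, and you must supply the plugin and step names from your own command document.

```python
from pathlib import PurePosixPath, PureWindowsPath

STREAMS = ("stdout", "stderr")


def linux_output_paths(instance_id, command_id, plugin_name, step_name):
    """Build the Linux and macOS stdout/stderr paths from the template above."""
    base = PurePosixPath(
        "/var/lib/amazon/ssm", instance_id, "document", "orchestration",
        command_id, plugin_name, step_name,
    )
    return {stream: str(base / stream) for stream in STREAMS}


def windows_output_paths(instance_id, command_id, plugin, step):
    """Build the Windows stdout/stderr paths from the template above.

    %ProgramData% is left unexpanded so that the result can be pasted into
    File Explorer or cmd.exe on the instance.
    """
    base = PureWindowsPath(
        "%ProgramData%", "Amazon", "SSM", "InstanceData", instance_id,
        "document", "orchestration", command_id, plugin, step,
    )
    return {stream: str(base / stream) for stream in STREAMS}
```

Both functions return a dictionary with the keys stdout and stderr, so you can print the values or pass them to a tool such as cat or type on the instance.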

Review SSM Agent logs

Review the SSM Agent logs for more details about the failure.

For Linux and macOS, locate the logs at the following paths:

  • /var/log/amazon/ssm/amazon-ssm-agent.log
  • /var/log/amazon/ssm/errors.log
  • /var/log/amazon/ssm/audits/amazon-ssm-agent-audit-YYYY-MM-DD

For Windows, locate the logs at the following paths:

  • %PROGRAMDATA%\Amazon\SSM\Logs\amazon-ssm-agent.log
  • %PROGRAMDATA%\Amazon\SSM\Logs\errors.log
  • %PROGRAMDATA%\Amazon\SSM\Logs\audits\amazon-ssm-agent-audit-YYYY-MM-DD
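
To narrow a long agent log down to the entries that matter, you can filter it by log level. The following is a minimal sketch that assumes each log line carries its level as a space-delimited token such as ERROR or WARN. Verify that assumption against your own log files before you rely on it. The function names are hypothetical.

```python
def lines_at_levels(lines, levels=("ERROR", "WARN")):
    """Return the lines that contain one of the given level tokens.

    Assumption: the level appears surrounded by spaces, for example
    '<timestamp> ERROR <message>'. Lines without such a token are skipped.
    """
    tokens = tuple(f" {level} " for level in levels)
    return [
        line.rstrip("\n")
        for line in lines
        if any(token in line for token in tokens)
    ]


def read_matching(path, levels=("ERROR", "WARN")):
    """Read a log file and keep only the lines at the given levels."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return lines_at_levels(handle, levels)
```

For example, read_matching("/var/log/amazon/ssm/amazon-ssm-agent.log") returns the matching lines from the main agent log on a Linux instance.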

If the SSM Agent logs don't provide the information that you need to resolve the error, then turn on debug logging and reproduce the issue.

Related information

AWS Systems Manager documents

Setting up AWS Systems Manager

AWS OFFICIAL. Updated 10 months ago.