Why is my Lambda IteratorAge metric increasing, and how do I decrease it?

I see an increase (or spikes) in my AWS Lambda function's IteratorAge metric.

Short description

A Lambda function's iterator age increases when the function can't efficiently process the data that's written to the streams that invoke the function. To decrease your function's IteratorAge metric, you must increase your stream processing throughput.

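The following factors influence a function's IteratorAge metric: runtime duration, batch size, invocation errors, throttling, the distribution of records across shards, shard count, and the parallelization factor and enhanced fan-out settings.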

Review this article to understand how each factor affects iterator age. Then, reconfigure your function or data stream to decrease your function's iterator age based on your use case. For more information about Lambda invocations, see Working with AWS Lambda function metrics.

Note: For Amazon DynamoDB streams, see Why is the Lambda IteratorAge metric increasing for my Amazon DynamoDB Streams?

Resolution

Decrease your function's runtime duration

A high runtime duration increases a function's iterator age. Decreasing the duration increases a function's throughput, which decreases a function's iterator age.
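The relationship between duration and iterator age can be illustrated with a rough model. The following is a minimal sketch, not an AWS formula: it assumes one invocation at a time per shard, full batches, and a constant arrival rate. The function name and the numbers are hypothetical.

```python
def shard_backlog_growth(arrival_rate, batch_size, duration_s):
    """Return how fast the backlog grows (records per second) for one shard.

    A positive result means records arrive faster than sequential
    invocations can process them, so the iterator age keeps rising.
    Assumption: one invocation at a time per shard, always a full batch.
    """
    if batch_size <= 0 or duration_s <= 0:
        raise ValueError("batch_size and duration_s must be positive")
    processing_rate = batch_size / duration_s  # records handled per second
    return arrival_rate - processing_rate


# Hypothetical numbers: 500 records/s arriving, batches of 100 records.
slow = shard_backlog_growth(arrival_rate=500, batch_size=100, duration_s=0.25)
fast = shard_backlog_growth(arrival_rate=500, batch_size=100, duration_s=0.125)
print(slow)  # positive: the function falls behind
print(fast)  # negative: the function can catch up
```

In this sketch, halving the duration moves the shard from falling behind to catching up, even though nothing else changes.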

To decrease your function's runtime duration, do one or both of the following:

1.    Increase the amount of memory allocated to the function. Lambda allocates CPU power in proportion to the memory.

2.    Optimize your function code so that less time is needed to process records.

Note: To get more information on your Lambda duration and performance, see Using AWS Lambda and AWS X-Ray.

Increase your stream's batch size

If your function's runtime duration is independent of the number of records in an event, then increasing your function's batch size decreases the iterator age.

To increase your function's batch size, follow the instructions in Configuring a stream as an event source.

Note: If your function's duration is dependent on the number of records in an event, then increasing your function's batch size doesn't decrease the iterator age. For more information, see Working with streams.
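The difference between the two cases can be sketched with a simple linear duration model. This is an illustration under assumptions, not a measurement: it treats duration as a fixed overhead per invocation plus a cost per record, and all names and values are hypothetical.

```python
def records_per_second(batch_size, fixed_overhead_s, per_record_s):
    """Throughput of one invocation under a simple linear duration model."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    duration_s = fixed_overhead_s + batch_size * per_record_s
    if duration_s <= 0:
        raise ValueError("modelled duration must be positive")
    return batch_size / duration_s


# Hypothetical function dominated by fixed overhead (for example, one
# downstream call per batch): larger batches raise throughput.
overhead_small = records_per_second(100, fixed_overhead_s=0.5, per_record_s=0.0)
overhead_large = records_per_second(500, fixed_overhead_s=0.5, per_record_s=0.0)

# Hypothetical function dominated by per-record work: throughput stays flat.
per_record_small = records_per_second(100, fixed_overhead_s=0.0, per_record_s=0.01)
per_record_large = records_per_second(500, fixed_overhead_s=0.0, per_record_s=0.01)

print(overhead_small, overhead_large)
print(per_record_small, per_record_large)
```

Most real functions sit somewhere between these two extremes, so it's worth measuring duration at a few batch sizes before you change the setting.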

Make sure that your function gracefully handles invocation errors

Invocation errors can cause a Lambda function to take longer to process an event, or process the same event more than once. Because event records are read sequentially, a function can't progress to later records if a record batch causes an error each time that it's retried. In this situation, the iterator age increases linearly as those records age.

If your function returns an error, Lambda retries the batch. The retries continue until processing succeeds, the maximum number of retry attempts is reached, or the data expires from the stream.

Make sure that your function gracefully handles records written to the stream. As you develop your function, logging and instrumenting your code can help you diagnose errors.
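One way to keep a single bad record from blocking a shard is to catch errors per record, log them, and tell Lambda where to resume. The following handler is a minimal sketch. It assumes that the event source mapping has the ReportBatchItemFailures response type enabled, which this article doesn't cover. Without that setting, the returned value isn't used this way. The process_record function and its "id" check are hypothetical placeholders for your own logic.

```python
import base64
import json
import logging

logger = logging.getLogger(__name__)


def process_record(payload):
    """Hypothetical business logic; raises on payloads it can't handle."""
    if "id" not in payload:
        raise ValueError("record is missing 'id'")


def handler(event, context):
    """Process Kinesis records in order and report the first failure.

    Assumption: ReportBatchItemFailures is enabled on the event source
    mapping, so Lambda retries from the returned sequence number onward.
    """
    for record in event["Records"]:
        sequence_number = record["kinesis"]["sequenceNumber"]
        try:
            # Kinesis record data arrives base64-encoded.
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            process_record(payload)
        except Exception:
            # Log enough context to diagnose the bad record later.
            logger.exception("Failed to process record %s", sequence_number)
            # Stop here so that later records aren't processed out of order.
            return {"batchItemFailures": [{"itemIdentifier": sequence_number}]}
    # An empty list reports that the whole batch succeeded.
    return {"batchItemFailures": []}
```

Stopping at the first failure means that records already processed in the batch aren't retried, while the failing record and everything after it are.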

For more information, see the following:

Manage your throttle occurrence

Because event records are read sequentially, a function can't progress to the next record if the current invocation is throttled. In this situation, the iterator age increases while Lambda retries the throttled invocations.

To manage throttling for your Lambda function, you can request a concurrency increase or review performance issues in the function.

Evenly distribute records in the stream

Partition keys in each data record determine the shards in the stream that the records are written to. An increase in traffic to your stream with records containing the same partition key causes a shard to receive a disproportionate number of records. This results in a hot shard.

Kinesis enhanced shard-level metrics allow you to verify the ingestion rate into each shard and troubleshoot a hot shard.

For more information, see Under the hood: scaling your Kinesis data streams.
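To see how partition keys produce a hot shard, it can help to simulate the mapping. Kinesis hashes each partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The sketch below assumes that the shards split the hash range evenly, which might not match your stream after resharding. The key names are hypothetical.

```python
import hashlib
from collections import Counter


def shard_index(partition_key, shard_count):
    """Map a partition key to a shard, assuming evenly split hash ranges."""
    # Kinesis hashes the partition key with MD5 into a 128-bit integer.
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    hash_key = int(digest, 16)
    range_size = 2**128 // shard_count
    # Clamp so that the top of the range falls into the last shard.
    return min(hash_key // range_size, shard_count - 1)


def records_per_shard(partition_keys, shard_count):
    """Count how many records each shard would receive."""
    return Counter(shard_index(key, shard_count) for key in partition_keys)


# Hypothetical traffic: 1,000 records in each case, four shards.
skewed = records_per_shard(["tenant-a"] * 1000, shard_count=4)
spread = records_per_shard([f"device-{i}" for i in range(1000)], shard_count=4)
print(skewed)  # every record lands on one shard
print(spread)  # records are spread across the shards
```

A higher-cardinality partition key, such as a device or request ID instead of a tenant ID, usually gives a more even distribution.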

Increase your stream's shard count

A low number of shards in a stream increases a function's iterator age. Increasing the number of shards in a stream increases the number of concurrent Lambda invocations that consume from your stream, which decreases a function's iterator age.

To increase your stream's shard count, follow the instructions in Resharding a stream.

Note: Shard splitting doesn't have an immediate effect on a function's iterator age. Existing records remain in the shards that they were written to. Those shards must catch up on their backlog before the iterator age for the shards decreases. For more information, see Working with streams.

Increase your stream processing throughput by testing different parallelization factor settings and using enhanced fan-out

To test parallelization factor settings

You can improve stream processing by configuring your function's parallelization factor to increase the number of concurrent Lambda invocations for each shard of a stream. This is done by specifying the number of concurrent batches that Lambda polls from each shard. This is configured on the event source configuration.

The maximum number of concurrent invocations is calculated as follows: concurrent invocations = number of shards x concurrent batches per shard (parallelization factor).

For example, if your parallelization factor is set to 10, you can have up to 50 concurrent Lambda invocations to process five Kinesis data shards.
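The calculation above can be written as a small helper for capacity planning. This is a sketch only. The function name is hypothetical, and the 1 to 10 range check reflects the documented limits for the parallelization factor setting.

```python
def max_concurrent_invocations(shard_count, parallelization_factor=1):
    """Upper bound on concurrent invocations for a stream event source."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Lambda accepts a parallelization factor from 1 (default) to 10.
    if not 1 <= parallelization_factor <= 10:
        raise ValueError("parallelization_factor must be between 1 and 10")
    return shard_count * parallelization_factor


# The example from this article: five shards, parallelization factor 10.
print(max_concurrent_invocations(5, 10))  # 50
```

Compare the result with your function's available concurrency. If the upper bound exceeds it, a higher parallelization factor can lead to throttling instead of more throughput.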

For more information, see Using AWS Lambda with Amazon Kinesis and New AWS Lambda scaling controls for Kinesis and Amazon DynamoDB event sources.

Note: When you increase the number of concurrent batches per shard, Lambda still ensures in-order processing at the partition key level.

To use enhanced fan-out

You can reduce latency and increase read throughput by creating a data stream consumer with enhanced fan-out. Stream consumers get a dedicated connection to each shard that doesn't impact other applications that are also reading from the stream.

For more information, see Developing custom consumers with dedicated throughput (enhanced fan-out) and Using AWS Lambda with Amazon Kinesis.

Note: It's a best practice to use enhanced fan-out if you have many applications reading the same data, or if you're reprocessing streams with large records.

Related information

Best practices for working with Lambda functions

Lambda event source mappings

AWS Lambda function scaling

Resharding, scaling, and parallel processing

Configuring your data stream and function

AWS OFFICIAL
Updated a year ago