Comparing Two Ways to Trigger Lambda from S3
Update 30 November 2021: There is now a third way to trigger Lambda (and many more services) from S3. This third method uses EventBridge but without CloudTrail: https://aws.amazon.com/blogs/aws/new-use-amazon-s3-event-notifications-with-amazon-eventbridge/.
There are two primary methods to trigger Lambda when an object is added to an S3 bucket — S3 Notifications and EventBridge. Using S3 notifications is more typical but there is one big drawback for me. For this reason, let’s look at EventBridge as an alternative.
An S3 notification is part of the bucket’s NotificationConfiguration, a property of the bucket itself. When using CloudFormation, this generally means that the notification must be created or modified as part of the bucket resource. In a serverless application, you can imagine having some sort of shared bucket with separately deployable serverless services triggered by objects being created or deleted with specific prefixes. An upload to the /uploads/images path might trigger a different function from an upload to /uploads/metadata. CloudFormation, and therefore its derivatives like CDK and AWS SAM, doesn’t allow you to modify a resource from outside the CloudFormation stack in which the resource resides. The Serverless Framework works around this limitation by using a custom resource to create the notification outside of the bucket’s stack.
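To make the coupling concrete, here is a minimal sketch of the notification configuration a shared bucket would need for the two prefixes mentioned above. The object shape follows the S3 PutBucketNotificationConfiguration API. The helper name buildNotificationConfiguration and the function ARNs are hypothetical placeholders, not part of the original application.

```javascript
// Sketch: build the single NotificationConfiguration that belongs to one bucket.
// The ARNs passed in below are hypothetical placeholders.
function buildNotificationConfiguration(routes) {
  return {
    LambdaFunctionConfigurations: routes.map(({ prefix, functionArn }) => ({
      LambdaFunctionArn: functionArn,
      Events: ['s3:ObjectCreated:*'],
      Filter: {
        Key: { FilterRules: [{ Name: 'prefix', Value: prefix }] },
      },
    })),
  };
}

const config = buildNotificationConfiguration([
  {
    prefix: 'uploads/images/',
    functionArn: 'arn:aws:lambda:eu-west-1:123456789012:function:image-handler',
  },
  {
    prefix: 'uploads/metadata/',
    functionArn: 'arn:aws:lambda:eu-west-1:123456789012:function:metadata-handler',
  },
]);

console.log(JSON.stringify(config, null, 2));
```

Both routes end up in one object attached to the bucket, which is why separately deployed services cannot each own their own trigger.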
The alternative we will explore is to use EventBridge. To receive events from EventBridge, you create a rule with targets. A rule is a separate resource from your bucket and is also separate from the event recipients. This facilitates the loose coupling we want. Many AWS services publish EventBridge events. Those that don’t can still be integrated using CloudTrail events. S3 publishes CloudTrail events for resource modifications but can also be configured to publish data events. Data events relate to creating, modifying and removing objects within a bucket. To receive such events via EventBridge, a CloudTrail trail must exist and be configured for data events on the bucket to be monitored.
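As a rough illustration, a rule for PutObject data events under one prefix could use an event pattern like the one below. The bucket name is a hypothetical placeholder. The matches function is only a simplified stand-in that handles exact values and prefix filters; it is not EventBridge’s real matching engine and is included just so the pattern can be tried locally.

```javascript
// Event pattern for S3 PutObject data events delivered via CloudTrail.
// 'my-shared-bucket' is a hypothetical bucket name.
const eventPattern = {
  source: ['aws.s3'],
  'detail-type': ['AWS API Call via CloudTrail'],
  detail: {
    eventSource: ['s3.amazonaws.com'],
    eventName: ['PutObject'],
    requestParameters: {
      bucketName: ['my-shared-bucket'],
      key: [{ prefix: 'uploads/images/' }],
    },
  },
};

// Simplified local matcher: supports only exact values and { prefix } filters.
function matches(pattern, event) {
  return Object.entries(pattern).every(([field, expected]) => {
    const actual = event == null ? undefined : event[field];
    if (Array.isArray(expected)) {
      // An array lists alternatives; any one of them may match.
      return expected.some((candidate) =>
        candidate !== null && typeof candidate === 'object'
          ? typeof actual === 'string' &&
            typeof candidate.prefix === 'string' &&
            actual.startsWith(candidate.prefix)
          : candidate === actual
      );
    }
    // A nested object means "descend into this field".
    return matches(expected, actual);
  });
}

const sampleEvent = {
  source: 'aws.s3',
  'detail-type': 'AWS API Call via CloudTrail',
  detail: {
    eventSource: 's3.amazonaws.com',
    eventName: 'PutObject',
    requestParameters: {
      bucketName: 'my-shared-bucket',
      key: 'uploads/images/cat.png',
    },
  },
};

console.log(matches(eventPattern, sampleEvent)); // true
```

Because the pattern lives in the rule rather than on the bucket, each service can deploy its own rule without touching the bucket’s stack.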
To run a comparison, we’ll use AWS SAM to create a simple application with two Lambda functions. One will react to EventBridge S3 CloudTrail data events and the other will use the S3 Notification approach. Both functions are defined in a single SAM template (follow this link for the code).
The implementations for each function are similar. They use the AWS SDK’s HeadObject method to get the object’s LastModified property. We will log this along with the current time and any other event timings. These can be used to get a reasonable idea of the latency between object creation and Lambda execution. The handler code can be seen in full here in the GitHub repository for this application.
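The sketch below is not the repository’s handler code; it only illustrates the two pieces of logic such a handler needs. The helper names extractObjectRef and intervalMs are hypothetical, and the HeadObject call is left out so the example runs without AWS credentials. Taking the CloudTrail key as-is, with no decoding, is an assumption.

```javascript
// Extract the bucket and key from either event shape.
function extractObjectRef(event) {
  if (Array.isArray(event.Records)) {
    // S3 notification: keys arrive URL-encoded, with spaces sent as '+'.
    const { bucket, object } = event.Records[0].s3;
    return {
      bucket: bucket.name,
      key: decodeURIComponent(object.key.replace(/\+/g, ' ')),
    };
  }
  // CloudTrail data event delivered through EventBridge (key used as-is).
  const { bucketName, key } = event.detail.requestParameters;
  return { bucket: bucketName, key };
}

// Interval between the object's LastModified time and "now", in milliseconds.
function intervalMs(lastModified, now = new Date()) {
  return now.getTime() - new Date(lastModified).getTime();
}

const notificationEvent = {
  Records: [
    {
      s3: {
        bucket: { name: 'my-shared-bucket' },
        object: { key: 'uploads/images/my+cat.png' },
      },
    },
  ],
};

console.log(extractObjectRef(notificationEvent));
console.log(
  intervalMs('2021-01-01T00:00:00.000Z', new Date('2021-01-01T00:00:01.250Z'))
); // 1250
```

In a real handler, the LastModified value passed to intervalMs would come from the HeadObject response for the extracted bucket and key.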
A simple shell script will copy a file repeatedly to different keys with varying prefixes in series. This is not a proper benchmark by any means.
Since we are using structured JSON logging with Pino to log our timings, the results can be extracted and aggregated using CloudWatch Logs Insights. I first let the script create 2100 objects.
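For readers without access to Logs Insights, the same kind of aggregation can be sketched locally over JSON log lines. The field names trigger and interval are assumptions about the log format, and the sample values are invented purely for illustration; they are not measurements from this experiment.

```javascript
// Invented sample log lines; "trigger" and "interval" are assumed field names.
const logLines = [
  '{"level":30,"trigger":"s3-notification","interval":850}',
  '{"level":30,"trigger":"s3-notification","interval":1450}',
  '{"level":30,"trigger":"eventbridge","interval":3200}',
  '{"level":30,"trigger":"eventbridge","interval":41000}',
];

// Group by trigger type and compute count, min, max and average interval.
function summarize(lines) {
  const groups = new Map();
  for (const line of lines) {
    const { trigger, interval } = JSON.parse(line);
    const g =
      groups.get(trigger) ?? { count: 0, min: Infinity, max: -Infinity, sum: 0 };
    g.count += 1;
    g.min = Math.min(g.min, interval);
    g.max = Math.max(g.max, interval);
    g.sum += interval;
    groups.set(trigger, g);
  }
  return Object.fromEntries(
    [...groups].map(([trigger, { count, min, max, sum }]) => [
      trigger,
      { count, min, max, avg: sum / count },
    ])
  );
}

console.log(summarize(logLines));
```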
First, we use Insights to check how many events were received by each service.
We see that the CloudTrail-EventBridge method received an extra event. We can put this down to EventBridge’s at-least-once delivery. This could also happen with S3. Only recently, S3’s documented delivery guarantee changed from probably once(!) to at least once.
Now, let’s take a look at the timings. I’m only looking at the interval between the object’s reported LastModified time and the current time captured in the Lambda.
We’ve now got some significant differences between S3 notification timings and EventBridge/CloudTrail. Note that we have a negative value for the minimum interval with notifications. This is down to the fact that S3’s LastModified value has second precision only. There may also be clock synchronization differences between services, but we will assume those to be negligible given the size of the intervals being observed.
Switching to CloudWatch Logs Insights’ Visualization tab gives us the following comparison bar chart.
EventBridge has some really long intervals pushing the maximum value high. Looking at the maximum for each minute, there are clearly a few incidents of intervals in the 30–50 second range.
In general, we can conclude that if event latency is critical, S3 notifications are still the way to go and we have to accept the CloudFormation resource ownership limitation. If we can accept delays of close to a minute, EventBridge gives us better separation of triggering infrastructure.
Note that the official line on S3 Notification delivery timing is “Amazon S3 event notifications typically deliver events in seconds but can sometimes take a minute or longer.” Also note that if you want to avoid quick successive events for the same object resulting in missed events, you need to enable object versioning in the bucket. See https://docs.aws.amazon.com/en_gb/AmazonS3/latest/dev/NotificationHowTo.html.
Summary of Differences
It’s clear that the delivery latency with CloudTrail and EventBridge is higher than with S3 Notifications. What else should inform a decision on which method to use? Let’s finish with a summary of the other known differences between the two.
CloudTrail/EventBridge
- CloudTrail Data events are charged at $0.10 per 100,000 events. There is no charge for EventBridge rules or delivery to targets. Lambda targets are subject to normal AWS Lambda billing.
- EventBridge supports a larger number of target services. See https://docs.aws.amazon.com/eventbridge/latest/APIReference/API_PutTargets.html.
- EventBridge rules now support content filtering, which allows more complex pattern matching than S3 Notifications support. This allows prefix matching, as done in the example above, but suffix matching is not supported!
- EventBridge allows you to extract and transform part of the event before it is sent to the target. See https://docs.aws.amazon.com/eventbridge/latest/APIReference/API_Target.html.
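To put the CloudTrail charge in perspective, here is a small cost estimate based on the $0.10 per 100,000 data events figure quoted above. The function name is hypothetical, and the sketch assumes every recorded data event is billed at that flat rate, so check current CloudTrail pricing before relying on it.

```javascript
// Price quoted in this article; verify against current CloudTrail pricing.
const PRICE_PER_100K_DATA_EVENTS_USD = 0.1;

// Estimated CloudTrail data event charge in USD for a given number of events.
function cloudTrailDataEventCostUsd(eventCount) {
  return (eventCount / 100000) * PRICE_PER_100K_DATA_EVENTS_USD;
}

// The 2100 objects created in the test above.
console.log(cloudTrailDataEventCostUsd(2100).toFixed(4)); // 0.0021

// Ten million object events.
console.log(cloudTrailDataEventCostUsd(10000000).toFixed(2)); // 10.00
```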
S3 Notifications
- There is no charge for S3 Notifications. Lambda targets are subject to normal AWS Lambda billing.
- S3 Notifications can target Lambda, SNS and SQS.
- Object events can filter based on a prefix or suffix or both.
- You cannot have multiple notifications with overlapping prefixes.
- Object versioning is required to avoid missed events.
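The overlapping-prefix restriction can be checked before deployment with something like the sketch below. It is a simplification that looks at prefixes only: two prefixes are treated as overlapping when one starts with the other. S3’s own validation also takes suffixes and event types into account, and the function names here are hypothetical.

```javascript
// Two prefixes overlap when one is a prefix of the other.
function prefixesOverlap(a, b) {
  return a.startsWith(b) || b.startsWith(a);
}

// Return every overlapping pair found in a list of prefixes.
function findOverlaps(prefixes) {
  const overlaps = [];
  for (let i = 0; i < prefixes.length; i += 1) {
    for (let j = i + 1; j < prefixes.length; j += 1) {
      if (prefixesOverlap(prefixes[i], prefixes[j])) {
        overlaps.push([prefixes[i], prefixes[j]]);
      }
    }
  }
  return overlaps;
}

console.log(findOverlaps(['uploads/images/', 'uploads/metadata/'])); // []
console.log(findOverlaps(['uploads/', 'uploads/images/', 'reports/']));
```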
Eoin is the CTO of fourTheorem, an AWS Partner, and author of AI as a Service from Manning.