|
| 1 | +# Kinesis Firehose Data Transformation with Lambda (Terraform) |
| 2 | + |
| 3 | +The purpose of this pattern is to deploy the infrastructure necessary to enable Kinesis Data Firehose data transformation. |
| 4 | + |
| 5 | +Kinesis Data Firehose can invoke a Lambda function to transform incoming source data and deliver the transformed data to destinations. In this architecture, Kinesis Data Firehose buffers incoming records and invokes the specified Lambda function with each buffered batch using the AWS Lambda synchronous invocation mode. |
| 6 | + |
| 7 | +The transformed data is sent from Lambda to Kinesis Data Firehose. Kinesis Data Firehose then sends it to the destination S3 bucket when the specified destination buffering size or buffering interval is reached, whichever happens first. |
| 8 | + |
| 9 | +In this project, the data transformation Lambda function changes the value 'HEALTHCARE' to 'MODERN_HEALTHCARE' for demonstration purposes. |
| 10 | + |
| 11 | +Learn more about this pattern at [Serverless Land Patterns](https://serverlessland.com/patterns/firehose-transformation-terraform). |
| 12 | + |
| 13 | +Important: this application uses various AWS services and there are costs associated with these services after the Free Tier usage - please see the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example. |
| 14 | + |
| 15 | +## Requirements |
| 16 | + |
| 17 | +* [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources. |
| 18 | +* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured |
| 19 | +* [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) installed |
| 20 | +* [Terraform](https://learn.hashicorp.com/tutorials/terraform/install-cli?in=terraform/aws-get-started) installed |
| 21 | + |
| 22 | +## Deployment Instructions |
| 23 | + |
| 24 | +1. Clone the project to your local working directory |
| 25 | + |
| 26 | + ```sh |
| 27 | + git clone https://github.com/aws-samples/serverless-patterns/ |
| 28 | + ``` |
| 29 | + |
| 30 | +2. Change the working directory to this pattern's directory |
| 31 | + |
| 32 | + ```sh |
| 33 | + cd serverless-patterns/firehose-transformation-terraform |
| 34 | + ``` |
| 35 | + |
| 36 | +3. From the command line, initialize Terraform to download and install the providers defined in the configuration: |
| 37 | + ```sh |
| 38 | + terraform init |
| 39 | + ``` |
| 40 | +
|
| 41 | +4. From the command line, apply the configuration in the main.tf file: |
| 42 | + ```sh |
| 43 | + terraform apply |
| 44 | + ``` |
| 45 | +
|
| 46 | +5. When prompted: |
| 47 | + - Enter `yes` to approve the changes |
| 48 | +
|
| 49 | +## How it works |
| 50 | +
|
| 51 | + |
| 52 | +
|
| 53 | +This pattern deploys a Kinesis Data Firehose delivery stream, a transformation Lambda function, a destination S3 bucket, and the supporting resources they require. |
| 54 | +
|
| 55 | +Kinesis Data Firehose can invoke a Lambda function to transform incoming source data and deliver the transformed data to destinations. In this architecture, Kinesis Data Firehose buffers incoming records and invokes the specified Lambda function with each buffered batch using the AWS Lambda synchronous invocation mode. |
| 56 | +
|
| 57 | +The transformed data is sent from Lambda to Kinesis Data Firehose. Kinesis Data Firehose then sends it to the destination S3 bucket when the specified destination buffering size or buffering interval is reached, whichever happens first. |
| 58 | +
|
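The transformation step described above can be sketched as a small Python handler. This is a minimal illustration rather than the function shipped with this pattern: it assumes each record is a single JSON object, and the names `transform_payload`, `OLD_VALUE`, and `NEW_VALUE` are hypothetical. The `recordId`, `result`, and `data` fields follow the record format that Kinesis Data Firehose expects back from a transformation function.

```python
import base64
import json

# Values taken from this README; the constant names are illustrative.
OLD_VALUE = "HEALTHCARE"
NEW_VALUE = "MODERN_HEALTHCARE"


def transform_payload(payload: bytes) -> bytes:
    """Replace any top-level JSON value equal to OLD_VALUE with NEW_VALUE."""
    record = json.loads(payload)
    for key, value in record.items():
        if value == OLD_VALUE:
            record[key] = NEW_VALUE
    # Newline-terminate so records stay one per line in the delivered object.
    return (json.dumps(record) + "\n").encode("utf-8")


def lambda_handler(event, context):
    """Return one output record for every incoming recordId."""
    output = []
    for record in event["records"]:
        try:
            payload = base64.b64decode(record["data"])
            transformed = transform_payload(payload)
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(transformed).decode("utf-8"),
            })
        except (ValueError, AttributeError):
            # Payload is not a JSON object: hand it back unchanged, marked as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Matching on the value rather than on a specific key avoids assuming the field names of the sample data.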
| 59 | +**Note:** The default region is `us-east-1`; it can be changed using the `region` variable. |
| 60 | +
|
| 61 | +**Note:** Variables can be supplied in several ways, for example with the `-var` command-line option, a `.tfvars` file, or `TF_VAR_` environment variables. See the [Terraform documentation](https://developer.hashicorp.com/terraform/language/values/variables) for more details. |
| 62 | +
|
| 63 | +## Testing |
| 64 | +
|
| 65 | +To test this project, follow these steps: |
| 66 | +
|
| 67 | +1. Generate data: |
| 68 | + Open [Testing Your Delivery Stream Using Sample Data](https://docs.aws.amazon.com/firehose/latest/dev/test-drive-firehose.html?icmpid=docs_console_unmapped) and follow the steps in the section 'Test Using Amazon S3 as the Destination'. |
| 69 | +
|
| 70 | +2. Wait a few minutes, then stop the data generation process. |
| 71 | +  |
| 72 | +
|
| 73 | +3. Once the Firehose buffering interval is reached, open S3 in the AWS console and select the destination bucket created by this pattern. You should see GZIP-compressed sample data under a prefix. |
| 74 | +
|
| 75 | +4. Select the S3 object and then choose 'Actions > Query with S3 Select'. Specify the following input settings: |
| 76 | + ``` |
| 77 | + Format : JSON |
| 78 | + JSON Content Type : Lines |
| 79 | + Compression : GZIP |
| 80 | + ``` |
| 81 | +
|
| 82 | + Leave the rest of the settings at their defaults and then select 'Run SQL Query'. |
| 83 | +
|
| 84 | +5. Under 'Query Results', you should now see the results. Specifically, values that were 'HEALTHCARE' should now read 'MODERN_HEALTHCARE'. This demonstrates that the data transformation Lambda function processed the data successfully. |
| 85 | +  |
| 86 | +
|
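As an alternative to S3 Select, the same check can be run locally on a downloaded object. The sketch below is an illustration under stated assumptions, not part of this pattern: it assumes the object is GZIP-compressed with one JSON object per line (matching the 'Lines' setting above), and the helper name `count_values` is hypothetical.

```python
import gzip
import json
from collections import Counter


def count_values(gzipped: bytes,
                 target_values=("HEALTHCARE", "MODERN_HEALTHCARE")) -> Counter:
    """Count how often each target value appears in GZIP-compressed JSON lines."""
    counts = Counter()
    text = gzip.decompress(gzipped).decode("utf-8")
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for value in record.values():
            if value in target_values:
                counts[value] += 1
    return counts
```

If the transformation worked, the count for 'MODERN_HEALTHCARE' should be greater than zero and the count for 'HEALTHCARE' should be zero.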
| 87 | +## Cleanup |
| 88 | +
|
| 89 | +1. Change directory to the pattern directory: |
| 90 | + ```sh |
| 91 | + cd serverless-patterns/firehose-transformation-terraform |
| 92 | + ``` |
| 93 | +
|
| 94 | +2. Delete all created resources |
| 95 | + ```sh |
| 96 | + terraform destroy |
| 97 | + ``` |
| 98 | +
|
| 99 | +3. When prompted: |
| 100 | + * Enter `yes` to confirm the deletion |
| 101 | +
|
| 102 | +4. Confirm that all created resources have been deleted: |
| 103 | + ```sh |
| 104 | + terraform show |
| 105 | + ``` |
| 106 | +
|
| 107 | +## Reference |
| 108 | +- [AWS Lambda - the Basics](https://docs.aws.amazon.com/lambda/latest/dg/getting-started.html) |
| 109 | +- [Lambda Function Handler](https://docs.aws.amazon.com/lambda/latest/dg/python-handler.html) |
| 110 | +- [Amazon Kinesis Firehose](https://aws.amazon.com/kinesis/data-firehose/) |
| 111 | +
|
| 112 | +---- |
| 113 | +Copyright 2023 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
| 114 | +
|
| 115 | +SPDX-License-Identifier: MIT-0 |