Commit 42ecd89

Merge pull request aws-samples#1589 from nareshrajaram2017/nareshrd-feature-firehose-transformation-terraform
New serverless pattern - firehose-transformation-terraform
2 parents 64e186f + 1689458 commit 42ecd89

21 files changed

Lines changed: 602 additions & 0 deletions
Lines changed: 115 additions & 0 deletions
# Kinesis Firehose Data Transformation with Lambda (Terraform)

The purpose of this pattern is to deploy the infrastructure necessary to enable Kinesis Data Firehose data transformation.

Kinesis Data Firehose can invoke a Lambda function to transform incoming source data and deliver the transformed data to destinations. In this architecture, Kinesis Data Firehose invokes the specified Lambda function with each buffered batch, using the AWS Lambda synchronous invocation mode.

The transformed data is sent from Lambda back to Kinesis Data Firehose, which then delivers it to the destination S3 bucket when the specified destination buffering size or buffering interval is reached, whichever happens first.

In this project, the data transformation Lambda function changes the value 'HEALTHCARE' to 'MODERN_HEALTHCARE' for demonstration purposes.

Learn more about this pattern at [Serverless Land Patterns](https://serverlessland.com/patterns/firehose-transformation-terraform).

Important: this application uses various AWS services and there are costs associated with these services after the Free Tier usage - please see the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example.

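The Lambda source itself is not shown in this excerpt of the diff. As a rough sketch only, a Firehose transformation handler performing this demo rewrite might look like the following; the `SECTOR` field name is an assumption based on the Firehose demo stock-ticker data, and the handler name is illustrative:

```python
import base64
import json

def lambda_handler(event, context):
    """Firehose invokes this handler with a batch of base64-encoded records.

    Each returned record must echo its recordId, report a result status,
    and carry the re-encoded, transformed payload.
    """
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        # Demo transformation: rewrite the HEALTHCARE sector value.
        if payload.get("SECTOR") == "HEALTHCARE":
            payload["SECTOR"] = "MODERN_HEALTHCARE"
        data = (json.dumps(payload) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(data).decode("utf-8"),
        })
    return {"records": output}
```

The `recordId`/`result`/`data` contract is the standard Firehose data transformation record format; records returned with `result` other than `Ok` are treated as processing failures.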
## Requirements

* [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make the necessary AWS service calls and manage AWS resources.
* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured
* [Git installed](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
* [Terraform](https://learn.hashicorp.com/tutorials/terraform/install-cli?in=terraform/aws-get-started) installed

## Deployment Instructions

1. Clone the project to your local working directory:
    ```sh
    git clone https://github.com/aws-samples/serverless-patterns/
    ```

2. Change the working directory to this pattern's directory:
    ```sh
    cd serverless-patterns/firehose-transformation-terraform
    ```

3. From the command line, initialize Terraform to download and install the providers defined in the configuration:
    ```sh
    terraform init
    ```

4. From the command line, apply the configuration in the main.tf file:
    ```sh
    terraform apply
    ```

5. During the prompts:
    - Enter yes

## How it works

![Reference Architecture](./images/firehose_data_transformation_lambda.png)

This pattern deploys a Kinesis Firehose Delivery Stream, a transformation Lambda function, a destination S3 bucket, and all of the additional required infrastructure services.

Kinesis Data Firehose can invoke a Lambda function to transform incoming source data and deliver the transformed data to destinations. In this architecture, Kinesis Data Firehose invokes the specified Lambda function with each buffered batch, using the AWS Lambda synchronous invocation mode.

The transformed data is sent from Lambda back to Kinesis Data Firehose, which then delivers it to the destination S3 bucket when the specified destination buffering size or buffering interval is reached, whichever happens first.

Note: The default region is `us-east-1`; it can be changed using the variable `region`.

**Note:** Variables can be supplied in several ways; check the [Terraform documentation](https://developer.hashicorp.com/terraform/language/values/variables) for more details.

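The "size or interval, whichever happens first" flush rule can be illustrated with a toy model. This is not the actual Firehose implementation; the class name and default thresholds are invented for illustration:

```python
import time

class DeliveryBuffer:
    """Toy model of destination buffering: flush when either the size
    threshold is reached or the interval elapses, whichever comes first."""

    def __init__(self, max_bytes=5 * 1024 * 1024, max_seconds=300, clock=time.monotonic):
        self.max_bytes = max_bytes      # buffering size threshold
        self.max_seconds = max_seconds  # buffering interval threshold
        self.clock = clock              # injectable clock for testing
        self.records = []
        self.size = 0
        self.started = None

    def add(self, record: bytes):
        # The interval timer starts when the first record arrives.
        if self.started is None:
            self.started = self.clock()
        self.records.append(record)
        self.size += len(record)

    def should_flush(self) -> bool:
        if self.started is None:
            return False
        return (self.size >= self.max_bytes
                or (self.clock() - self.started) >= self.max_seconds)

    def flush(self):
        batch, self.records, self.size, self.started = self.records, [], 0, None
        return batch
```

Either condition alone triggers a flush, which is why a low-volume stream still delivers objects to S3 once the buffering interval expires.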
## Testing

To test this project, follow the steps below:

1. Generate data:
   Go to [Testing Your Delivery Stream Using Sample Data](https://docs.aws.amazon.com/firehose/latest/dev/test-drive-firehose.html?icmpid=docs_console_unmapped) and follow the steps in the section 'Test Using Amazon S3 as the Destination'.

2. Wait a few minutes and then stop the data generation process.
   ![Firehose Sample Data Generation](./images/firehose_sample_data_generation.png)

3. Once the Kinesis buffering interval threshold is reached, go to S3 in the AWS console and select the bucket that was created. You should now see sample data (GZIP) under a prefix.

4. Select the S3 object and then choose 'Action > Query with S3 Select'. Specify the input settings as follows:
   ```
   Format : JSON
   JSON Content Type : Lines
   Compression : GZIP
   ```
   Leave the rest of the settings as default and then select 'Run SQL Query'.

5. Under 'Query Results', you should now be able to view the results. Specifically, the values for 'HEALTHCARE' should now be 'MODERN_HEALTHCARE'. This demonstrates that the data was successfully transformed by the data transformation Lambda function.
   ![Firehose Data Transformation Results](./images/firehose_data_transformation_results.png)

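As an alternative to S3 Select, a downloaded object can also be inspected locally. A minimal sketch (the helper name is hypothetical; it assumes the GZIP object contains newline-delimited JSON with a `SECTOR` field, as in the Firehose demo data):

```python
import gzip
import json

def count_sectors(path):
    """Tally SECTOR values in a GZIP-compressed, newline-delimited JSON file."""
    counts = {}
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            sector = json.loads(line).get("SECTOR", "UNKNOWN")
            counts[sector] = counts.get(sector, 0) + 1
    return counts
```

After a successful transformation, 'MODERN_HEALTHCARE' should appear in the tally and 'HEALTHCARE' should not.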
## Cleanup

1. Change directory to the pattern directory:
    ```sh
    cd serverless-patterns/firehose-transformation-terraform
    ```

2. Delete all created resources:
    ```sh
    terraform destroy
    ```

3. During the prompts:
    * Enter yes

4. Confirm all created resources have been deleted:
    ```sh
    terraform show
    ```

## Reference
- [AWS Lambda - the Basics](https://docs.aws.amazon.com/lambda/latest/dg/getting-started.html)
- [Lambda Function Handler](https://docs.aws.amazon.com/lambda/latest/dg/python-handler.html)
- [Amazon Kinesis Firehose](https://aws.amazon.com/kinesis/data-firehose/)

----
Copyright 2023 Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: MIT-0
Lines changed: 53 additions & 0 deletions

{
  "title": "Kinesis Firehose Data Transformation with Lambda",
  "description": "Transform incoming source data and deliver the transformed data to destinations.",
  "language": "",
  "level": "200",
  "framework": "Terraform",
  "introBox": {
    "headline": "How it works",
    "text": [
      "The purpose of this pattern is to deploy the infrastructure necessary to enable Kinesis Data Firehose data transformation.",
      "Kinesis Data Firehose can invoke a Lambda function to transform incoming source data and deliver the transformed data to destinations. In this architecture, Kinesis Data Firehose invokes the specified Lambda function with each buffered batch, using the AWS Lambda synchronous invocation mode. The transformed data is sent from Lambda back to Kinesis Data Firehose, which then delivers it to the destination S3 bucket when the specified destination buffering size or buffering interval is reached, whichever happens first."
    ]
  },
  "gitHub": {
    "template": {
      "repoURL": "https://github.com/aws-samples/serverless-patterns/tree/main/firehose-transformation-terraform",
      "templateURL": "serverless-patterns/firehose-transformation-terraform",
      "projectFolder": "firehose-transformation-terraform",
      "templateFile": "firehose-transformation-terraform/main.tf"
    }
  },
  "resources": {
    "bullets": [
      {
        "text": "Amazon Kinesis Data Firehose Data Transformation",
        "link": "https://docs.aws.amazon.com/firehose/latest/dev/data-transformation.html"
      }
    ]
  },
  "deploy": {
    "text": [
      "terraform init",
      "terraform plan",
      "terraform apply"
    ]
  },
  "testing": {
    "text": [
      "See the README in the GitHub repo for detailed testing instructions."
    ]
  },
  "cleanup": {
    "text": [
      "terraform destroy",
      "terraform show"
    ]
  },
  "authors": [
    {
      "name": "Naresh Rajaram",
      "image": "",
      "bio": "Cloud Infrastructure Architect, AWS",
      "linkedin": "https://www.linkedin.com/in/naresh-rajaram-25bb106/"
    }
  ]
}
Lines changed: 23 additions & 0 deletions

/* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. */

# --- main.tf ---

# Create Amazon S3 destination bucket
module "s3_destination_bucket" {
  source = "./modules/s3"
}

# Create data transformation Lambda function
module "data_transformation_lambda" {
  source                       = "./modules/lambda"
  firehose_delivery_stream_arn = module.firehose_delivery_stream.firehose_delivery_stream_arn
}

# Create Amazon Kinesis Firehose delivery stream
module "firehose_delivery_stream" {
  source                         = "./modules/firehose"
  s3_destination_bucket_arn      = module.s3_destination_bucket.bucket_arn
  data_transformation_lambda_arn = module.data_transformation_lambda.data_transformation_lambda_arn
}
Lines changed: 31 additions & 0 deletions

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:AbortMultipartUpload",
        "s3:GetBucketLocation",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:ListBucketMultipartUploads",
        "s3:PutObject"
      ],
      "Resource": [
        "${s3_destination_bucket_arn}",
        "${s3_destination_bucket_arn}/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "lambda:InvokeFunction",
        "lambda:GetFunctionConfiguration"
      ],
      "Resource": [
        "${data_transformation_lambda_arn}"
      ]
    }
  ]
}
Lines changed: 12 additions & 0 deletions

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "firehose.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Lines changed: 49 additions & 0 deletions

/* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. */

# --- modules/firehose/main.tf ---

# Create Kinesis Firehose delivery stream
resource "aws_kinesis_firehose_delivery_stream" "firehose_delivery_stream" {
  name        = var.kinesis_firehose_name
  destination = var.kinesis_firehose_destination

  extended_s3_configuration {
    bucket_arn         = var.s3_destination_bucket_arn
    role_arn           = aws_iam_role.iam_role_for_firehose.arn
    buffering_interval = var.buffer_interval
    buffering_size     = var.buffer_size
    compression_format = var.compression_format

    processing_configuration {
      enabled = "true"

      processors {
        type = var.kinesis_firehose_processing_type

        parameters {
          parameter_name  = "LambdaArn"
          parameter_value = var.data_transformation_lambda_arn
        }
      }
    }
  }
}

# IAM role for Kinesis Firehose
resource "aws_iam_role" "iam_role_for_firehose" {
  name               = "iam_role_for_firehose"
  assume_role_policy = file("${path.module}/iam_role_for_firehose.json")
}

# IAM policy for Kinesis Firehose
resource "aws_iam_policy" "iam_policy_for_firehose" {
  name = "iam_policy_for_firehose"
  policy = templatefile("${path.module}/iam_policy_for_firehose.tftpl", {
    s3_destination_bucket_arn      = var.s3_destination_bucket_arn,
    data_transformation_lambda_arn = var.data_transformation_lambda_arn
  })
}

# Attach IAM policy to the Kinesis Firehose IAM role
resource "aws_iam_role_policy_attachment" "attach_iam_role_with_iam_policy" {
  policy_arn = aws_iam_policy.iam_policy_for_firehose.arn
  role       = aws_iam_role.iam_role_for_firehose.name
}
Lines changed: 9 additions & 0 deletions

/* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. */

# --- modules/firehose/outputs.tf ---

# Kinesis Firehose delivery stream ARN
output "firehose_delivery_stream_arn" {
  value       = aws_kinesis_firehose_delivery_stream.firehose_delivery_stream.arn
  description = "ARN of the Kinesis Firehose delivery stream"
}
