- Creates a personal S3 bucket for you
- You'll upload a small movie metadata file to this bucket
See the CF Template for this step here.
From the project root folder, run:
./deploy.sh step1-s3
This will create a bucket with a name like: movie-data-bucket-123456789012-ap-southeast-2
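The example name looks like a fixed prefix followed by an AWS account ID and a region. That pattern is an assumption inferred from the example, not something confirmed by the template. Under that assumption, a minimal sketch of how the name is composed (`bucket_name` is a hypothetical helper, not part of this repository):

```shell
# Hypothetical helper: compose a bucket name from an account ID and a region.
# Assumption: the pattern is movie-data-bucket-<account-id>-<region>, inferred
# from the example name. The CloudFormation template is the source of truth.
bucket_name() {
  # $1 = AWS account ID, $2 = AWS region
  printf 'movie-data-bucket-%s-%s\n' "$1" "$2"
}

# Example with the placeholder values used in this guide:
bucket_name 123456789012 ap-southeast-2
```

With real values, the account ID can be read with `aws sts get-caller-identity --query Account --output text` and the configured region with `aws configure get region`.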
To verify it, either locate the bucket in the AWS Console or run this command:
aws cloudformation describe-stacks \
--stack-name step1-s3 \
--query "Stacks[0].Outputs[?OutputKey=='BucketName'].OutputValue" \
--output text
Press q to exit the view.
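To reuse the bucket name in later commands without copying it by hand, the query above can be wrapped in a small function. This is a sketch, assuming AWS CLI v2 and configured credentials. `get_bucket_name` and the `BUCKET` variable are hypothetical names, and `--no-cli-pager` is added so the output prints directly instead of opening a pager view:

```shell
# Hypothetical helper: print the BucketName output of a CloudFormation stack.
# Assumes AWS CLI v2 (for --no-cli-pager) and configured credentials.
get_bucket_name() {
  # $1 = stack name (defaults to the stack created in this step)
  aws cloudformation describe-stacks \
    --stack-name "${1:-step1-s3}" \
    --query "Stacks[0].Outputs[?OutputKey=='BucketName'].OutputValue" \
    --output text \
    --no-cli-pager
}

# Usage (needs AWS access, so it is left commented out):
#   BUCKET="$(get_bucket_name)"
#   echo "Bucket: $BUCKET"
```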
In a real-life scenario, another system would drop the file into the bucket. For this demo, we upload it ourselves.
aws s3 cp data/movies_metadata.csv \
s3://<your-bucket-name>/raw/movies/
Example:
aws s3 cp data/movies_metadata.csv \
s3://movie-data-bucket-123456789012-ap-southeast-2/raw/movies/
💡 Using an object key like raw/movies/movies_metadata.csv instead of uploading directly to the bucket root helps organise data into logical "folders", making it easier to manage, automate, and query in tools like AWS Glue or Athena. It also supports better permission control and aligns with best practices for building scalable data pipelines.
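Because the destination in the upload command ends with a slash, the file name is appended to the prefix to form the object key. The sketch below shows how that full destination is built. `s3_dest` is a hypothetical helper for illustration, and the `raw/movies` default mirrors the prefix used in this step:

```shell
# Hypothetical helper: print the S3 URI a local file gets under a key prefix.
s3_dest() {
  # $1 = bucket, $2 = local file, $3 = key prefix (defaults to raw/movies)
  prefix="${3:-raw/movies}"
  # Strip one trailing slash from the prefix, then append the file name.
  printf 's3://%s/%s/%s\n' "$1" "${prefix%/}" "$(basename "$2")"
}

# Example with the placeholder bucket name used in this guide:
s3_dest movie-data-bucket-123456789012-ap-southeast-2 data/movies_metadata.csv
```

The printed URI can also be passed to `aws s3 cp` as an explicit destination, which makes the resulting object key visible in the command itself.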
- Go to the Amazon S3 console and locate the bucket.
- Click Upload
- Create or choose a folder path: raw/movies/
- Upload the file from this repository.
aws s3 ls s3://<your-bucket-name>/raw/movies/
Example:
aws s3 ls s3://movie-data-bucket-123456789012-ap-southeast-2/raw/movies/
You should see movies_metadata.csv listed in the output.
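For a scripted check, the listing can be replaced by a direct lookup of the object. This is a sketch, assuming configured credentials with permission to read the object. `object_exists` is a hypothetical helper that relies on `aws s3api head-object` exiting with a non-zero status when the object cannot be found:

```shell
# Hypothetical helper: report whether an object exists in a bucket.
# Assumes configured credentials with permission to read the object.
object_exists() {
  # $1 = bucket, $2 = object key
  if aws s3api head-object --bucket "$1" --key "$2" >/dev/null 2>&1; then
    echo "found: s3://$1/$2"
  else
    echo "missing: s3://$1/$2"
    return 1
  fi
}

# Usage (needs AWS access, so it is left commented out):
#   object_exists "<your-bucket-name>" "raw/movies/movies_metadata.csv"
```

The non-zero return value on a missing object lets a deployment script stop early instead of continuing with an empty bucket.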
- Go to the Amazon S3 console and locate your bucket.
- Navigate to the raw/movies/ folder.
- Confirm that movies_metadata.csv appears in the list of files.
Continue to Step 2 - Set Up Glue Crawler and Query with Athena