Google Cloud Platform (GCP) : Cloud-based auto-scaling platform by Google
Google Cloud Storage (GCS) : Data Lake
BigQuery : Data Warehouse
Terraform : Infrastructure-as-Code (IaC)
Docker : Containerization
SQL : Data Analysis & Exploration
Prefect : Workflow Orchestration
dbt : Data Transformation
Spark : Distributed Processing
Kafka : Streaming
Docker and Docker-Compose
Python 3 (e.g. via Anaconda)
Google Cloud SDK
Terraform
Introduction to GCP
Docker and docker-compose
Running Postgres locally with Docker
Setting up infrastructure on GCP with Terraform
Preparing the environment for the course
Workflow orchestration
Introduction to Prefect
ETL with GCP & Prefect
Parametrizing workflows
Prefect Cloud and additional resources
BigQuery
Partitioning and clustering
BigQuery best practices
Internals of BigQuery
BigQuery Machine Learning
dbt (data build tool)
BigQuery and dbt
Postgres and dbt
dbt models
Testing and documenting
Deployment to the cloud and locally
Visualizing the data with Google Data Studio and Metabase
Batch processing
Spark Dataframes
Spark SQL
Internals: GroupBy and joins
Introduction to Kafka
Schemas (Avro)
Kafka Streams
Kafka Connect and KSQL