Commit 6d68ef9 ("format")
1 parent dc8aa1b

1 file changed: docs/index.md (36 additions & 36 deletions)
@@ -8,66 +8,66 @@ The experience of developing and deploying data pipelines is more uncertain and
 Here are some challenges that data teams run into, especially when data sizes increase or the number of data users expands:
 
-### Data pipelines are fragmented and fragile
-Data pipelines generally consist of Python or SQL scripts that implicitly depend upon each other through tables. Changes to upstream scripts that break downstream dependencies are usually only detected at run time.
+1. Data pipelines are fragmented and fragile
+* Data pipelines generally consist of Python or SQL scripts that implicitly depend upon each other through tables. Changes to upstream scripts that break downstream dependencies are usually only detected at run time.

-### Data quality checks are not sufficient
-The data community has settled on data quality checks as the "solution" for testing data pipelines. Although data quality checks are great for detecting large unexpected data changes, they are expensive to run, and they have trouble validating exact logic.
+1. Data quality checks are not sufficient
+* The data community has settled on data quality checks as the "solution" for testing data pipelines. Although data quality checks are great for detecting large unexpected data changes, they are expensive to run, and they have trouble validating exact logic.

-### It's too hard and too costly to build staging environments for data
-Validating changes to data pipelines before deploying to production is an uncertain and sometimes expensive process. Although branches can be deployed to environments, when merged to production, the code is re-run. This is wasteful and generates uncertainty because the data is regenerated.
+1. It's too hard and too costly to build staging environments for data
+* Validating changes to data pipelines before deploying to production is an uncertain and sometimes expensive process. Although branches can be deployed to environments, when merged to production, the code is re-run. This is wasteful and generates uncertainty because the data is regenerated.

-### Silos transform data lakes to data swamps
-The difficulty and cost of making changes to core pipelines can lead to duplicate pipelines with minor customizations. The inability to easily make and validate changes causes contributors to follow the "path of least resistance". The proliferation of similar tables leads to additional costs, inconsistencies, and maintenance burden.
+1. Silos transform data lakes to data swamps
+* The difficulty and cost of making changes to core pipelines can lead to duplicate pipelines with minor customizations. The inability to easily make and validate changes causes contributors to follow the "path of least resistance". The proliferation of similar tables leads to additional costs, inconsistencies, and maintenance burden.

 ## What is SQLMesh?
 SQLMesh consists of a CLI, a Python API, and a Web UI to make data pipeline development and deployment easy, efficient, and safe.
 
 ### Core principles
 SQLMesh was built on three core principles:

-#### Correctness is non-negotiable
-Bad data is worse than no data. SQLMesh guarantees that your data will be consistent even in heavily collaborative environments.
+1. Correctness is non-negotiable
+* Bad data is worse than no data. SQLMesh guarantees that your data will be consistent even in heavily collaborative environments.

-#### Change with confidence
-SQLMesh summarizes the impact of changes and provides automated guardrails empowering everyone to safely and quickly contribute.
+1. Change with confidence
+* SQLMesh summarizes the impact of changes and provides automated guardrails empowering everyone to safely and quickly contribute.

-#### Efficiency without complexity
-SQLMesh automatically optimizes your workloads by reusing tables and minimizing computation, saving you time and money.
+1. Efficiency without complexity
+* SQLMesh automatically optimizes your workloads by reusing tables and minimizing computation, saving you time and money.

 ### Key features
-#### Efficient dev/staging environments
-SQLMesh builds a virtual data mart using views, which allows you to seamlessly roll back or roll forward your changes. Any data computation you run for validation purposes is actually not wasted — with a cheap pointer swap, you re-use your “staging” data in production. This means you get unlimited copy-on-write environments that make data exploration and preview of changes fun and safe.
+* Efficient dev/staging environments
+* SQLMesh builds a virtual data mart using views, which allows you to seamlessly roll back or roll forward your changes. Any data computation you run for validation purposes is actually not wasted — with a cheap pointer swap, you re-use your “staging” data in production. This means you get unlimited copy-on-write environments that make data exploration and preview of changes fun and safe.

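The view-based pointer swap described above can be sketched with the standard-library `sqlite3` module. This is only an illustration of the idea of promoting validated data by repointing a view; the table and view names are invented, and it is not SQLMesh's actual mechanism:

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Each model version is materialized once, under a versioned physical name.
con.execute("CREATE TABLE orders__v1 (id INTEGER, amount INTEGER)")
con.execute("INSERT INTO orders__v1 VALUES (1, 10)")

# The "environment" is just a view pointing at a physical table.
con.execute("CREATE VIEW prod_orders AS SELECT * FROM orders__v1")

# A change is computed and validated as a new physical table...
con.execute("CREATE TABLE orders__v2 (id INTEGER, amount INTEGER)")
con.execute("INSERT INTO orders__v2 VALUES (1, 10), (2, 25)")

# ...and "deployed" with a cheap pointer swap: no data is recomputed.
con.execute("DROP VIEW prod_orders")
con.execute("CREATE VIEW prod_orders AS SELECT * FROM orders__v2")

rows = con.execute("SELECT COUNT(*) FROM prod_orders").fetchone()[0]
print(rows)  # 2
```

Because consumers only ever query the view, the staging data becomes the production data the instant the view is redefined.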
-#### Automatic DAG generation by semantically parsing and understanding SQL or Python scripts
-No need to manually tag dependencies — SQLMesh was built with the ability to understand your entire data warehouse’s dependency graph.
+* Automatic DAG generation by semantically parsing and understanding SQL or Python scripts
+* No need to manually tag dependencies — SQLMesh was built with the ability to understand your entire data warehouse’s dependency graph.

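A toy version of dependency extraction shows the idea: derive each model's upstream tables from its query text and build the DAG from that. SQLMesh semantically parses SQL; the regex below is a deliberate simplification (it ignores aliases, CTEs, and subqueries), and the model names are invented:

```python
import re

def referenced_tables(sql: str) -> set[str]:
    # Naively grab every identifier following FROM or JOIN.
    return set(re.findall(r"(?:from|join)\s+([a-z_][a-z0-9_.]*)", sql, re.IGNORECASE))

models = {
    "stg_orders": "SELECT * FROM raw.orders",
    "stg_customers": "SELECT * FROM raw.customers",
    "order_totals": (
        "SELECT c.id, SUM(o.amount) FROM stg_orders o "
        "JOIN stg_customers c ON o.cust_id = c.id GROUP BY c.id"
    ),
}

# DAG: model -> the models it depends on (external tables filtered out).
dag = {name: referenced_tables(sql) & models.keys() for name, sql in models.items()}
print(dag["order_totals"])  # {'stg_orders', 'stg_customers'}
```

With the graph in hand, a scheduler can run models in topological order and flag downstream models affected by an upstream edit.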
-#### Informative change summaries
-Before making changes, SQLMesh will determine what has changed and show the entire graph of affected jobs.
+* Informative change summaries
+* Before making changes, SQLMesh will determine what has changed and show the entire graph of affected jobs.

-#### CI-Runnable Unit and Integration tests
-Tests can be easily defined in YAML and run in CI. SQLMesh can optionally transpile your queries to DuckDB so that your tests can be self-contained.
+* CI-Runnable Unit and Integration tests
+* Tests can be easily defined in YAML and run in CI. SQLMesh can optionally transpile your queries to DuckDB so that your tests can be self-contained.

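The concept of a self-contained SQL unit test (fixtures in, query run on a local engine, rows compared) can be sketched as follows. Here `sqlite3` stands in for DuckDB, and the fixture shape is invented for illustration, not SQLMesh's YAML format:

```python
import sqlite3

# A unit test is just: input fixtures, the query under test, expected rows.
test_case = {
    "input": {"orders": [(1, 10), (1, 5), (2, 7)]},
    "query": "SELECT cust_id, SUM(amount) FROM orders GROUP BY cust_id ORDER BY cust_id",
    "expected": [(1, 15), (2, 7)],
}

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (cust_id INTEGER, amount INTEGER)")
con.executemany("INSERT INTO orders VALUES (?, ?)", test_case["input"]["orders"])

actual = con.execute(test_case["query"]).fetchall()
assert actual == test_case["expected"], f"got {actual}"
print("test passed")
```

Because the data lives entirely in the test definition and the engine is embedded, such tests need no warehouse connection and can run on every CI push.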
-#### Smart change categorization
-Column-level lineage automatically determines whether changes are “breaking” or “non-breaking”, allowing you to correctly categorize changes and to skip expensive backfills.
+* Smart change categorization
+* Column-level lineage automatically determines whether changes are “breaking” or “non-breaking”, allowing you to correctly categorize changes and to skip expensive backfills.

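The intuition behind breaking vs. non-breaking categorization can be shown with a toy check: a change that only adds columns leaves existing consumers untouched, while removing or altering columns does not. SQLMesh does this with real column-level lineage; the string comparison below is only a simplified illustration:

```python
import re

def select_columns(sql: str) -> set[str]:
    # Pull the column list out of a simple "SELECT ... FROM ..." query.
    match = re.match(r"select\s+(.*?)\s+from", sql, re.IGNORECASE | re.DOTALL)
    return {col.strip() for col in match.group(1).split(",")}

def categorize(old_sql: str, new_sql: str) -> str:
    old, new = select_columns(old_sql), select_columns(new_sql)
    # Purely additive changes do not break downstream consumers.
    return "non-breaking" if old <= new else "breaking"

print(categorize("SELECT id, amount FROM orders",
                 "SELECT id, amount, tax FROM orders"))  # non-breaking
print(categorize("SELECT id, amount FROM orders",
                 "SELECT id FROM orders"))               # breaking
```

A non-breaking change lets downstream models keep their existing data, which is what makes skipping backfills safe.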
-#### Easy incremental loads
-Loading tables incrementally is as easy as a full refresh. SQLMesh transparently handles the complexity of tracking which intervals need loading, so all you have to do is specify a date filter.
+* Easy incremental loads
+* Loading tables incrementally is as easy as a full refresh. SQLMesh transparently handles the complexity of tracking which intervals need loading, so all you have to do is specify a date filter.

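Interval tracking for incremental loads can be sketched like this: the framework records which dates are already loaded and asks the model only for the missing ones. The function names here are illustrative, not SQLMesh's API:

```python
from datetime import date, timedelta

loaded: set[date] = set()  # intervals already materialized

def load_partition(ds: date) -> str:
    # A real model would run its query with this date filter applied;
    # here we just return the filter that would be used.
    return f"WHERE event_date = '{ds.isoformat()}'"

def run(start: date, end: date) -> list[str]:
    """Load every missing day in [start, end]; already-loaded days are skipped."""
    requested = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    filters = [load_partition(d) for d in requested if d not in loaded]
    loaded.update(requested)
    return filters

first = run(date(2023, 1, 1), date(2023, 1, 3))   # loads 3 days
second = run(date(2023, 1, 2), date(2023, 1, 4))  # only Jan 4 is new
print(len(first), len(second))  # 3 1
```

From the model author's perspective, writing the date filter is the whole job; the bookkeeping of which intervals still need loading lives in the framework.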
-#### Integrated with Airflow
-You can schedule jobs with our simple built-in scheduler or use your existing Airflow cluster. SQLMesh can dynamically generate and push Airflow DAGs. We aim to support other schedulers like Dagster and Prefect in the future.
+* Integrated with Airflow
+* You can schedule jobs with our simple built-in scheduler or use your existing Airflow cluster. SQLMesh can dynamically generate and push Airflow DAGs. We aim to support other schedulers like Dagster and Prefect in the future.

-#### Notebook / CLI
-Interact with SQLMesh using whatever tool you’re comfortable with.
+* Notebook / CLI
+* Interact with SQLMesh using whatever tool you’re comfortable with.

-#### Web-based IDE (in development)
-Edit, run, and visualize queries in your browser.
+* Web-based IDE (in development)
+* Edit, run, and visualize queries in your browser.

-#### GitHub CI/CD bot (in development)
-A bot to tie your code directly to your data.
+* GitHub CI/CD bot (in development)
+* A bot to tie your code directly to your data.

-#### Table/Column level lineage visualizations (in development)
-Quickly understand the full lineage and sequence of transformations of any column.
+* Table/Column level lineage visualizations (in development)
+* Quickly understand the full lineage and sequence of transformations of any column.

 ## Next steps
 * [Jump right in with the quickstart](quick_start.md)
