Replies: 2 comments
-
A possible solution is splitting this into multiple DAGs (each trigger would start several DAG runs). But that is not practical in our case: it would require launching numerous DAG runs each time, making dependencies hard to visualize and resulting in a poor user experience.
-
Additional context: this is a deep learning use case. Our DAG contains a complete pipeline covering data preprocessing, feature engineering, and model training. Each time we trigger a DAG run, we need to selectively run specific stages: sometimes only data cleaning, sometimes only model training on existing cleaned data, and sometimes the full pipeline from scratch.
-
Hi Airflow community,
We're facing a challenge with a large DAG (150+ tasks) and would appreciate your advice on best practices.
Current Setup:
• We have a complex DAG with 150+ tasks
• Each DAG run only needs to execute a subset of these tasks
• Currently we use conditional logic to skip tasks that aren't needed
• The problem: when we want to run tasks near the end of the DAG, we first have to wait for all upstream tasks to be evaluated and skipped, which takes 10+ minutes due to the DAG's complexity
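For context on the current approach: a conf-driven skip setup usually boils down to a branch callable that maps the run's conf to the task IDs that should execute, with everything else cascade-skipped. A minimal sketch of that selection logic, where the stage and task names are illustrative assumptions rather than the actual DAG's:

```python
# Sketch of the selection logic behind a conf-driven branch task.
# In Airflow this function would be the callable of a @task.branch /
# BranchPythonOperator; stage names below are assumptions.
DEFAULT_STAGES = ["preprocess", "feature_engineering", "train"]

def pick_stage_entrypoints(conf, all_stages=None):
    """Return the entry task IDs the branch should follow for this run.

    `conf` is the dict passed at trigger time (dag_run.conf); a missing
    or empty "stages" key means "run the full pipeline".
    """
    all_stages = all_stages or DEFAULT_STAGES
    wanted = (conf or {}).get("stages") or all_stages
    unknown = set(wanted) - set(all_stages)
    if unknown:
        raise ValueError(f"unknown stages: {sorted(unknown)}")
    # Preserve pipeline order regardless of the order given in conf.
    return [f"{s}.start" for s in all_stages if s in wanted]
```

Note that even with this pattern, the scheduler still has to evaluate and skip every non-selected task, which is exactly the 10+ minute cost described above.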
Proposed Solution: We're considering DAG Versioning combined with Dynamic DAG Generation.
Is this an appropriate use case for DAG Versioning, or is there a better pattern?
Thanks in advance!