How to efficiently test my DAGs on a per-task basis in Airflow 3? #63941

GlenboLake · 2026-03-19T16:53:07Z

GlenboLake
Mar 19, 2026

I am in the process of upgrading DAGs from Airflow 2.10 to 3.x. For each of my DAGs, I tend to have one test for each task that checks the output. A test for a PythonOperator may check that the correct XComs were set, and a SQLExecuteQueryOperator will usually check the final contents of the output table. An example DAG and test might look something like this:

# my_dag.py
with DAG(dag_id='example') as dag:
    start = S3ListOperator(
        task_id="check_s3",
        bucket='data',
        prefix="incoming/{{ logical_date.year }}/{{ logical_date.month }}/",
        delimiter='/'
    )
    load = SQLExecuteQueryOperator(
        task_id='load',
        conn_id='redshift',
        sql=[
            "templates/create_target_table.sql",
            "COPY TO {{ params.target_table }} FROM '{{ ti.xcom_pull(task_ids='check_s3') }}"
        ],
        params={"target_table": "temp.loaded_data"}
    )
    process = SQLExecuteQueryOperator(
        task_id="process",
        conn_id="redshift",
        sql="INSERT INTO {{ params.prod_table }} SELECT * FROM {{ params.new_table }} WHERE some_condition = TRUE",
        params={
            "new_table": "temp.loaded_data",
            "prod_table": "final.example",
        }
    )
    
    start >> load >> process

# test_my_dag.py
from dags.my_dag import dag
from pendulum import UTC, DateTime

logical_date = DateTime(2026, 3, 19, tzinfo=UTC)

@pytest.fixture
def setup_process(db_hook: PostgresHook):
    db_hook.run('CREATE TABLE temp.loaded_data ...')
    data = [
        # fake data as result of load task
    ]
    db_hook.insert_rows("temp.loaded_data", data)
    existing_data = [
        # data that may already exist in final.example
    ]
    db_hook.insert_rows("final.example", existing_data)
    yield
    db_hook.run("DROP TABLE IF EXISTS temp.loaded_data, final.example")
    
@pytest.mark.usefixtures("setup_process")
@provide_session
def test_process(db_hook, session):
    run = dag.create_dagrun(state=DagRunState.RUNNING, execution_date=logical_date)
    ti = TaskInstance(
        task=dag.get_task("process"),
        run_id=run.run_id,
    )
    session.merge(ti)
    ti.run(ignore_all_deps=True)
    
    result = db_hook.get_records("SELECT * FROM final.example")
    expected = [...]
    assert result == expected

I have several DAGs, amounting to over 1000 tests (including pytest.mark.parametrize) with about 700 instances of ti.run as above. This takes about 7 minutes to run in Airflow 2.10.5. However, my attempts to migrate to Airflow 3 have resulted in the test suite taking a full 50 minutes now. I can only conclude that I have done something wrong.

I can share the helper functions that I've written for Airflow 3 if requested, but what I'd really like to know is: Is there a preferred or correct way to test individual tasks? If possible, I want to try to avoid fully mocking out functions like ti.xcom_pull (I want the test to fail if I pass an incorrect argument).

I am aware that the easiest thing would be for me to test my functions and SQL templates directly. However, a large part of that is rendering templates, which requires a template context; is there a test-compatible version of Context available, or would I have to spin up my own?

Sagargupta16 · 2026-03-27T00:48:27Z

Sagargupta16
Mar 27, 2026

Airflow 3 changed how task testing works. The test command and DagBag-based unit testing patterns from 2.x still work but need adjustments.

Per-task testing in Airflow 3

Option 1: Use `dag.test()` method

Airflow 3 added a built-in dag.test() that runs the DAG in a single process:

from airflow.models.dag import DAG
import importlib

def test_check_s3_task():
    mod = importlib.import_module("dags.my_dag")
    dag = mod.dag
    # Run just the specific task
    dag.test(task_id="check_s3")

Option 2: Use `create_task_instance_of_operator` fixture (pytest)

For more granular testing, especially template rendering:

import pytest
from airflow.providers.amazon.aws.operators.s3 import S3ListOperator

@pytest.mark.db_test
def test_s3_task(create_task_instance_of_operator):
    ti = create_task_instance_of_operator(
        S3ListOperator,
        dag_id="test_dag",
        task_id="check_s3",
        bucket="my-bucket",
        prefix="data/",
    )
    ti.render_templates()
    assert ti.task.bucket == "my-bucket"

Option 3: Direct operator execution

For PythonOperator and XCom assertions:

from airflow.models import DagBag

def test_python_task_xcom():
    dagbag = DagBag(dag_folder="dags/", include_examples=False)
    dag = dagbag.get_dag("example")
    
    # Run the task and check XCom
    dr = dag.test()
    ti = dr.get_task_instance("my_python_task")
    result = ti.xcom_pull(task_ids="my_python_task", key="return_value")
    assert result == expected_value

For SQL tasks

Mock the database connection and verify the queries:

from unittest.mock import patch, MagicMock

@patch("airflow.providers.common.sql.operators.sql.BaseSQLOperator.get_db_hook")
def test_sql_task(mock_hook):
    mock_conn = MagicMock()
    mock_hook.return_value = mock_conn
    
    dag.test(task_id="run_query")
    
    # Verify the SQL was executed
    mock_conn.run.assert_called_once()
    executed_sql = mock_conn.run.call_args[0][0]
    assert "INSERT INTO" in executed_sql

The key change in Airflow 3 is that dag.test() returns a DagRun object, making it much easier to inspect task results and XComs after execution.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to efficiently test my DAGs on a per-task basis in Airflow 3? #63941

Uh oh!

{{title}}

Uh oh!

Replies: 1 comment

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

How to efficiently test my DAGs on a per-task basis in Airflow 3? #63941

Uh oh!

GlenboLake Mar 19, 2026

Replies: 1 comment

Uh oh!

Sagargupta16 Mar 27, 2026

Per-task testing in Airflow 3

Option 1: Use dag.test() method

Option 2: Use create_task_instance_of_operator fixture (pytest)

Option 3: Direct operator execution

For SQL tasks

GlenboLake
Mar 19, 2026

Sagargupta16
Mar 27, 2026

Option 1: Use `dag.test()` method

Option 2: Use `create_task_instance_of_operator` fixture (pytest)