In the short term, a DAG will be created that uses the copy_data_wo_format command from the inc-kf-src-ingest package to ingest data files directly into the warehouse.
The DAG will be created with an optional parameter to include a data dictionary if one can be identified. This will be useful
In a future iteration, we will switch to the copy_data command from the inc-kf-src-ingest package to ingest data in a more structured way. copy_data uses a data dictionary to create tables, rather than having Python/pandas try to infer data types and trusting the CSVs to be properly formatted. This method will ensure a clean warehouse, capture issues early on, and produce tables that harmonizers can rely on.
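The difference between inferring types and enforcing a data dictionary can be illustrated with a minimal sketch. This is not the inc-kf-src-ingest implementation or its API. The column names, the DATA_DICTIONARY mapping, and the load_with_dictionary helper are all hypothetical, and the sketch uses only the Python standard library rather than pandas so that it stays self-contained. It assumes a data dictionary can be reduced to a mapping from column name to a converter function.

```python
import csv
import io
from datetime import date

# Hypothetical data dictionary: column name -> converter for that column.
# A real data dictionary would likely carry more detail (descriptions,
# allowed values, nullability); this is an assumption for illustration.
DATA_DICTIONARY = {
    "participant_id": str,
    "age_at_enrollment": int,
    "enrollment_date": date.fromisoformat,
}


def load_with_dictionary(csv_text, dictionary):
    """Cast every CSV value using the dictionary; fail on the first problem."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []

    # Reject files whose columns do not match the dictionary exactly.
    missing = sorted(set(dictionary) - set(header))
    unexpected = sorted(set(header) - set(dictionary))
    if missing or unexpected:
        raise ValueError(
            f"column mismatch: missing={missing}, unexpected={unexpected}"
        )

    rows = []
    for raw in reader:
        typed = {}
        for column, convert in dictionary.items():
            value = raw[column]
            if value is None:  # row has fewer fields than the header
                raise ValueError(
                    f"line {reader.line_num}: no value for column {column!r}"
                )
            try:
                typed[column] = convert(value)
            except ValueError as exc:
                raise ValueError(
                    f"line {reader.line_num}, column {column!r}: "
                    f"cannot convert {value!r}"
                ) from exc
        rows.append(typed)
    return rows
```

With a loader of this shape, a value such as "twelve" in an integer column stops the ingest with the offending line and column, instead of silently turning the whole column into text as type inference might.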