P1 - Proposed Short-term vs Long-term data ingest strategy #39

Description

@brendagutman

In the short term, we will create a DAG that uses the copy_data_wo_format command from the inc-kf-src-ingest package to ingest data files directly into the warehouse.

The DAG will be created with an optional parameter to include a data dictionary if one can be identified. This will be useful

In a future iteration, we will switch to the copy_data command from the inc-kf-src-ingest package, which ingests data in a more structured way: it uses a data dictionary to create tables, rather than having Python/pandas infer datatypes and trusting the CSVs to be properly formatted. This approach will keep the warehouse clean, surface issues early, and produce tables that harmonizers can rely on.
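To illustrate the difference between inferring types and declaring them, here is a minimal sketch using only the Python standard library. It is not the inc-kf-src-ingest implementation; the `DATA_DICTIONARY` layout, the column names, and the `load_with_dictionary` helper are all hypothetical and exist only for this example.

```python
import csv
import io
from datetime import date

# Hypothetical data dictionary: column name -> parser for the declared type.
# A real data dictionary would come from the study, not be hard-coded.
DATA_DICTIONARY = {
    "participant_id": str,                   # keep IDs as text (preserves "001")
    "age_at_enrollment": int,
    "enrollment_date": date.fromisoformat,   # expects YYYY-MM-DD
}


def load_with_dictionary(csv_text, dictionary):
    """Parse CSV rows with declared types, collecting bad rows instead of guessing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = set(dictionary) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"columns missing from file: {sorted(missing)}")

    rows, errors = [], []
    for line_no, raw in enumerate(reader, start=2):  # line 1 is the header
        parsed = {}
        try:
            for column, parse in dictionary.items():
                parsed[column] = parse(raw[column])
        except (ValueError, TypeError):
            # Record where the file disagrees with the dictionary.
            errors.append((line_no, column, raw[column]))
            continue
        rows.append(parsed)
    return rows, errors


SAMPLE = (
    "participant_id,age_at_enrollment,enrollment_date\n"
    "001,12,2021-03-04\n"
    "002,unknown,2021-05-06\n"
)

rows, errors = load_with_dictionary(SAMPLE, DATA_DICTIONARY)
print(rows)
print(errors)
```

In this sketch the malformed age on line 3 is reported rather than silently loaded, and the leading zeros in participant_id survive because the column is declared as text. With type inference, pandas would typically read that ID column as integers and turn the age column into generic strings.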

Metadata

Labels

enhancement: New feature or request
