Pipeline Tweaks, New Commands, and Packaging Fixes
This pasteur release tweaks pipeline generation to better segment ingestion and synthesis.
It introduces the new commands ingest_dataset (or id) and ingest_view (or iv), which perform only the dataset and view ingest steps. This makes it easier to iterate on new datasets and views, since only their ingest code is re-run.
By default, pipe now skips the view ingestion steps, which can be cumbersome for out-of-core datasets, and begins from filtering onward. Running pipe --all still executes the whole pipeline.
A new view option, fit_global, is introduced. It fits the transformers and encoders on the whole view (at the cost of increased overhead), which fixes issues with rare categorical values not being recognized because they were missing from the work set.
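The rare-category problem can be illustrated with a minimal sketch. This is not pasteur's transformer or encoder API: fit_categories, encode, the sample data, and the way the work set is sliced are all hypothetical, chosen only to show why fitting on a subset can miss a value that fitting on the whole view would catch.

```python
# Illustrative toy encoder; NOT pasteur's actual transformer/encoder API.

def fit_categories(values):
    """Map each distinct category to an integer code, in sorted order."""
    return {cat: code for code, cat in enumerate(sorted(set(values)))}


def encode(values, mapping, unknown=-1):
    """Encode values; categories unseen during fitting become `unknown`."""
    return [mapping.get(v, unknown) for v in values]


# Hypothetical view column: "rare" occurs once and falls outside the work set.
whole_view = ["a", "b", "a", "b", "a", "rare"]
work_set = whole_view[:4]  # assumption: fitting uses only this subset

subset_mapping = fit_categories(work_set)    # never sees "rare"
global_mapping = fit_categories(whole_view)  # analogous to fitting globally

subset_codes = encode(whole_view, subset_mapping)
global_codes = encode(whole_view, global_mapping)

print(subset_codes)  # the last value is encoded as unknown (-1)
print(global_codes)  # the last value gets its own code
```

Fitting on the whole view requires a pass over all of the data, which is where the extra overhead mentioned above would come from.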
Two bugs were also fixed: TabularDataset required pandas but did not import it, and the mlflow default style was not included in the PyPI package.