  <img alt="Pasteur Logo with text. Tagline reads: 'Sanitize Your Data'" src="./res/logo/logo_text_light.svg" width="90%">
  </picture>
 </h1>
-
-Pasteur is a system for data synthesis.
-This readme is under construction.
+Pasteur is a library for performing end-to-end data synthesis.
+Gather your raw data, then preprocess, synthesize, and evaluate it within a single
+project.
+Use the tools you're familiar with: NumPy, pandas, scikit-learn, SciPy, or any other.
+When your dataset grows, scale to out-of-core data by using Pasteur's parallelization
+and partitioning primitives, without changing your code or switching libraries.
 
 ## Reproducibility
 You can find the experiment files that can be used to reproduce the paper
@@ -30,4 +33,34 @@ PASTEUR_MODULES = get_recommended_modules()
 Currently, there is no template project to start from.
 This repository is a working Pasteur project and is what was used to develop it.
 The module `./src/project` is a Kedro project with configs in `./conf` and it is
-the one that was used to develop pasteur.
+the one that was used to develop Pasteur.
+
+## Contributing
+To contribute, clone this repository and install the frozen requirements.
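+```bash
+# Clone over HTTPS; without a scheme, git treats the argument as a local path
+git clone https://github.com/pasteur-dev/pasteur pasteur
+
+cd pasteur
+python3.11 -m venv venv
+# Activate the virtual environment so pip installs into it
+source venv/bin/activate
+pip install -r requirements.txt
+```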
+
+The requirements file installs Pasteur from this repository in editable
+mode, so you can begin modifying files right away.
+The requirements file can be regenerated with the following commands, which
+pull the latest versions of all packages.
+To ensure interoperability with other packages, Pasteur does not specify narrow
+ranges for supported package versions, so certain version combinations might
+cause issues.
+```bash
+# pip-compile is provided by the pip-tools package
+rm requirements.txt
+pip-compile --resolver=backtracking
+```
+
+This repository is a Pasteur project used for testing.
+You can start testing Pasteur by running the following commands.
+```bash
+pasteur download --accept adult
+pasteur p adult.ingest
+pasteur p tab_adult.privbayes --synth
+```