Tackling the issue of low overlap in meta-learners with adaptive regularization.
The project is built with the following Python libraries:
First, create a virtual environment and install the requirements:
pip3 install virtualenv
python3 -m virtualenv -p python3 --always-copy venv
source venv/bin/activate
pip3 install -r requirements.txt
To start an experiments server, run:
mlflow server --port=5000 --gunicorn-opts "--timeout 280"
To access the MLflow web UI with all the experiments, connect via ssh:
ssh -N -f -L localhost:5000:localhost:5000 <username>@<server-link>
Then open http://localhost:5000 in a local browser.
Before running semi-synthetic experiments, place datasets in the corresponding folders:
- IHDP100 dataset: ihdp_npci_1-100.test.npz and ihdp_npci_1-100.train.npz to data/ihdp100/
- ACIC 2016: to data/acic2016/
── data/acic_2016
    ├── synth_outcomes
    │   ├── zymu_<id0>.csv
    │   ├── ...
    │   └── zymu_<id14>.csv
    ├── ids.csv
    └── x.csv
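Before launching the semi-synthetic experiments, it can help to confirm that the files are where the training script will look for them. The following is a minimal sketch of such a check, not part of the repository: the helper name `missing_dataset_files` is hypothetical, and because the text above shows both data/acic2016/ and data/acic_2016 for the ACIC folder, the folder name is left as a parameter.

```python
from pathlib import Path


def missing_dataset_files(data_root, acic_dir="acic_2016"):
    """Return the expected dataset paths that are absent under data_root.

    Hypothetical helper: the expected layout is taken from the listing above,
    and acic_dir should be set to whichever ACIC folder name the code uses.
    """
    root = Path(data_root)
    expected = [
        root / "ihdp100" / "ihdp_npci_1-100.train.npz",
        root / "ihdp100" / "ihdp_npci_1-100.test.npz",
        root / acic_dir / "ids.csv",
        root / acic_dir / "x.csv",
    ]
    missing = [path for path in expected if not path.is_file()]

    # synth_outcomes should contain at least one zymu_<id>.csv file.
    outcomes = root / acic_dir / "synth_outcomes"
    if not (outcomes.is_dir() and any(outcomes.glob("zymu_*.csv"))):
        missing.append(outcomes / "zymu_<id>.csv")
    return missing


if __name__ == "__main__":
    for path in missing_dataset_files("data"):
        print(f"missing: {path}")
```

An empty result means every file named in the listing was found; the check says nothing about the file contents.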
The main training script is shared across methods and datasets. For details on mandatory arguments, see the main configuration file config/config.yaml and the other files in the config/ folder.
The generic command with logging and a fixed random seed is:
PYTHONPATH=. python3 runnables/train.py +dataset=<dataset> +model=<model> exp.seed=10
One needs to specify a dataset / dataset generator (and some additional parameters, e.g. the train size for the synthetic data, dataset.n_samples_train=250, or a subset index for the ACIC 2016 data, dataset.dataset_ix=0):
- Synthetic data (adapted from https://arxiv.org/abs/1810.02894): +dataset=synthetic
- IHDP dataset: +dataset=ihdp100
- ACIC 2016 dataset: +dataset=acic2016
- HC-MNIST dataset: +dataset=hcmnist
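When launching many runs (for example over several seeds), it can be convenient to assemble the command from Python. The sketch below only builds the argument list shown above; `build_train_command` is a hypothetical helper, not part of the repository, and the dataset names are the four listed in this section.

```python
import shlex

# Dataset names as listed above.
KNOWN_DATASETS = ("synthetic", "ihdp100", "acic2016", "hcmnist")


def build_train_command(dataset, model, seed=10, overrides=None):
    """Build the argument list for runnables/train.py (hypothetical helper).

    overrides maps config keys to values, e.g.
    {"dataset.n_samples_train": 250} or {"dataset.dataset_ix": 0}.
    """
    if dataset not in KNOWN_DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    args = [
        "python3",
        "runnables/train.py",
        f"+dataset={dataset}",
        f"+model={model}",
        f"exp.seed={seed}",
    ]
    for key, value in (overrides or {}).items():
        args.append(f"{key}={value}")
    return args


if __name__ == "__main__":
    cmd = build_train_command(
        "synthetic", "target_net", overrides={"dataset.n_samples_train": 250}
    )
    # Print the command; to execute it, pass cmd to subprocess.run with
    # PYTHONPATH=. set in the environment, as in the command above.
    print(shlex.join(cmd))
```

Keeping the arguments as a list rather than one string avoids shell-quoting problems if the list is later passed to `subprocess.run`.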
The best hyperparameters are already saved for each model-dataset pair. One can access them via: +<dataset>_hparams=<dataset> or +<dataset>_hparams=<dataset_ix> etc.
Stage 1 models are propensity networks (src/models/prop_nets.py) and outcome networks (src/models/mu_nets.py). To perform manual hyperparameter tuning, use the flags prop_net_cov.tune_hparams=True and mu_net_cov.tune_hparams=True.
Stage 2 models are defined in config/config.yaml and src/models/target_model.py. One needs to specify a second-stage model (target net +model=target_net or target kernel ridge regression +model=target_krr) and specific parameters of the regularization:
- target_net.regularization.adaptive: False - constant regularization, True - overlap-adaptive regularization (OAR)
- target_net.regularization.type: noise - noise regularization (for +model=target_net), dropout - dropout (for +model=target_net), l2 - RKHS norm (for +model=target_krr)
- target_net.regularization.coeff: mult - multiplicative regularization function ($\lambda_{\mathrm{m}}$), log - logarithmic regularization function ($\lambda_{\log}$), mult2 - squared multiplicative regularization function ($\lambda_{\mathrm{m}^2}$)
- target_net.regularization.efficient: False - OAR, True - dOAR
- target_net.regularization.base_value: constant value of regularization $\lambda/p$ for rescaling
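Since some regularization types apply only to one second-stage model, a small check before launching a run can catch invalid combinations early. The following is a sketch under the constraints listed above; `regularization_overrides` is a hypothetical helper, not part of the repository, and the key prefix is the one shown in this section (whether it is the same for +model=target_krr is an assumption).

```python
# Allowed regularization types per second-stage model, as listed above.
ALLOWED_TYPES = {
    "target_net": {"noise", "dropout"},
    "target_krr": {"l2"},
}
# Regularization functions listed above.
ALLOWED_COEFFS = {"mult", "log", "mult2"}


def regularization_overrides(model, adaptive, reg_type, coeff, efficient,
                             base_value,
                             prefix="target_net.regularization"):
    """Validate the options and return them as key=value overrides.

    Hypothetical helper; prefix defaults to the key prefix used in this
    section and is an assumption for models other than target_net.
    """
    if model not in ALLOWED_TYPES:
        raise ValueError(f"unknown model: {model!r}")
    if reg_type not in ALLOWED_TYPES[model]:
        raise ValueError(f"type {reg_type!r} is not listed for {model!r}")
    if coeff not in ALLOWED_COEFFS:
        raise ValueError(f"unknown coeff: {coeff!r}")
    return [
        f"{prefix}.adaptive={bool(adaptive)}",
        f"{prefix}.type={reg_type}",
        f"{prefix}.coeff={coeff}",
        f"{prefix}.efficient={bool(efficient)}",
        f"{prefix}.base_value={base_value}",
    ]


if __name__ == "__main__":
    # OAR with noise regularization and the multiplicative function.
    for item in regularization_overrides(
        "target_net", True, "noise", "mult", False, 0.5
    ):
        print(item)
```

The returned strings can be appended to the training command as additional arguments.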
Example of running target net with OAR:
CUDA_VISIBLE_DEVICES=<devices> PYTHONPATH=. python3 runnables/train.py -m +dataset=synthetic +model=target_net +synthetic_hparams=\'250\' exp.logging=True exp.device=cuda exp.seed=10 target_net.regularization.adaptive=True target_net.regularization.type=noise target_net.regularization.coeff=mult target_net.regularization.efficient=False target_net.regularization.base_value=0.5
Example of running target net with dOAR:
CUDA_VISIBLE_DEVICES=<devices> PYTHONPATH=. python3 runnables/train.py -m +dataset=ihdp100 +model=target_net +ihdp100_hparams=ihdp100 exp.logging=True exp.device=cuda exp.seed=10 dataset.dataset_ix=0 prop_net_cov.tune_hparams=True mu_net_cov.tune_hparams=True target_net.regularization.adaptive=True target_net.regularization.type=dropout target_net.regularization.coeff=log target_net.regularization.efficient=True target_net.regularization.base_value=0.1