
Commit 4673b31

Authored by Xylambda and Alejandro Pérez Sanjuán
ENH: regularization as callbacks (#42)
* WIP: add callbacks scheme before and after loss step
* DOC: add TODO
* ENH: add `predict` method to Trainer
* BUG: fix minor error in `predict`
* ENH, CLN: use black. Improve `predict` method. Fix bugs in preprocessing.
* CLN, BUG: fix bug in LR scheduler callback. Fix flake8 complaints.
* CLN: use isort
* CLN: minor code cleaning
* BUG: fix bug on synchronization through CUDA
* DOC: update docstring for `tabular_to_sliding_dataset`
* ENH: allow multiple inputs for the model
* ENH: add back `__predict_loader`. Remove some prediction callbacks
* BLD: update NumPy requirements
* CLN: remove Pandas
* ENH: add missing `__repr__`. Add method to return history dict in trainer. Minor changes
* ENH: add torch distribution example
* ENH, CLN: add loss update after loss step
* ENH: store only the number of each batch loss
* CLN: use isort
* ENH: regularization as callbacks. Remove regularization module
* BLD: update requirements to avoid conflicts
* TST: update utils tests
* CLN: update CHANGELOG

Co-authored-by: Alejandro Pérez Sanjuán <aperez@ETS.ES>
1 parent (2633a9f) · commit 4673b31

34 files changed: 1205 additions & 614 deletions

CHANGELOG.md

Lines changed: 27 additions & 0 deletions
```diff
@@ -4,6 +4,30 @@ All notable changes to this project will be documented in this file.
 
 The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 
+## [4.2.0] - 2022-08-08
+
+### Added
+- Add `torch.distribution` example, with code taken from [Romain Strock](https://romainstrock.com/blog/modeling-uncertainty-with-pytorch.html).
+- Add `predict` method to `Trainer`. #38
+- Add functions to freeze and unfreeze model. #43
+- Add function to transform dataset into time series dataset.
+
+### Fixed
+- Metrics are now moved to the execution device. #41
+- Log level is now used in the Trainer. #40
+- `LearningRateScheduler` no longer crashes in the first epoch when `on_train` is False. #36
+
+### Changed
+- Make regularization part of the callbacks system. #37
+- Divide utils into three submodules: `convenience`, `preprocessing` and `data`.
+- Update requirements to avoid conflicts.
+- Update some tests.
+
+### Removed
+
+- Remove old regularization module and all related code.
+
+
 ## [4.1.2] - 2021-12-24
 
 ### Fixed
@@ -88,6 +112,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Update tests with new testing methods.
 - Make some methods on Trainer and Manager private.
 
+
 ## [3.0.0] - 2021-07-27
 
 ### Fixed
@@ -113,6 +138,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Add testing utility to check gradients: `compute_forward_gradient`.
 - Add more functions to `utils`: `FastTensorDataLoader`, `check_model_on_cuda`.
 
+
 ## [2.0.2] - 2021-05-10
 
 ### Fixed
@@ -126,6 +152,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Change `_validate` in favour of `validation_step`.
 - Update tests to be correct.
 
+
 ## [2.0.1] - 2021-04-29
 
 ### Added
```
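The headline change in 4.2.0 is moving regularization into the callbacks system (#37). A minimal sketch of the new usage, pieced together from the example diffs in this commit (the linear model, loss and rate values are illustrative):

```python
import torch.nn as nn
import torch.optim as optim
from torchfitter.trainer import Trainer
from torchfitter.callbacks import L1Regularization

model = nn.Linear(in_features=1, out_features=1)

trainer = Trainer(
    model=model,
    criterion=nn.MSELoss(),
    optimizer=optim.Adam(model.parameters(), lr=0.005),
    # regularization is now just another callback; the old
    # `regularizer=...` argument to Trainer is gone
    callbacks=[L1Regularization(regularization_rate=0.01, biases=False)],
)
```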

README.md

Lines changed: 1 addition & 38 deletions
````diff
@@ -21,17 +21,14 @@ The library also provides a callbacks API that can be used to interact with
 the model during the training process, as well as a set of basic regularization
 procedures.
 
-Additionally, you will find the `Manager` class which allows you to run
-multiple experiments for different random seeds.
-
 ## Installation
 **Normal user**
 ```bash
 pip install torchfitter
 ```
 
 This library does not ship CUDA nor XLA. Follow the
-[official PyTorch documentarion](https://pytorch.org/get-started/locally/) for
+[official PyTorch documentation](https://pytorch.org/get-started/locally/) for
 more information about how to install CUDA binaries.
 
 **Developer**
@@ -130,40 +127,6 @@ trainer = Trainer(
 )
 ```
 
-
-## Regularization
-`TorchFitter` includes regularization algorithms but you can also create your
-own procedures. To create your own algorithms you just:
-1. Inherit from `RegularizerBase` and call the `super` operator appropiately.
-2. Implement the procedure in the `compute_penalty` method.
-
-Here's an example implementing L1 from scratch:
-
-```python
-import torch
-from torchfitter.regularization.base import RegularizerBase
-
-
-class L1Regularization(RegularizerBase):
-    def __init__(self, regularization_rate, biases=False):
-        super(L1Regularization, self).__init__(regularization_rate, biases)
-
-    def compute_penalty(self, named_parameters, device):
-        # Initialize with tensor, cannot be scalar
-        penalty_term = torch.zeros(1, 1, requires_grad=True).to(device)
-
-        for name, param in named_parameters:
-            if not self.biases and name.endswith("bias"):
-                pass
-            else:
-                penalty_term = penalty_term + param.norm(p=1)
-
-        return self.rate * penalty_term
-```
-
-Notice how the `penalty_term` is moved to the given `device`. This is necessary
-in order to avoid operations with tensors stored at different devices.
-
 ## Callbacks
 Callbacks allow you to interact with the model during the fitting process. They
 provide with different methods that are called at different stages. To create a
````
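Since the `RegularizerBase` recipe above is removed, the penalty itself can still be reproduced in plain PyTorch. A minimal, self-contained sketch of the same L1 term (the `l1_penalty` helper is illustrative, not part of the library):

```python
import torch
import torch.nn as nn


def l1_penalty(named_parameters, rate, biases=False):
    """Rate-scaled sum of parameter L1 norms, optionally skipping biases."""
    penalty = torch.zeros(1)  # tensor accumulator; gradients flow via param.norm
    for name, param in named_parameters:
        if not biases and name.endswith("bias"):
            continue  # mirror the removed example's `biases=False` behaviour
        penalty = penalty + param.norm(p=1)
    return rate * penalty


model = nn.Linear(in_features=3, out_features=1)
print(l1_penalty(model.named_parameters(), rate=0.01))
```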
Lines changed: 12 additions & 18 deletions
```diff
@@ -14,11 +14,11 @@
 from torchfitter.utils.data import DataWrapper
 from torchfitter.conventions import ParamsDict
 from sklearn.model_selection import train_test_split
-from torchfitter.regularization import L1Regularization
 from torchfitter.callbacks import (
     EarlyStopping,
     RichProgressBar,
     StochasticWeightAveraging,
+    L1Regularization
 )
 
 # -----------------------------------------------------------------------------
@@ -29,12 +29,19 @@
 
 
 def main():
+    # -------------------------------------------------------------------------
+    # argument parsing
+    parser = argparse.ArgumentParser("")
+    parser.add_argument("--epochs", type=int, default=5000)
+
+    args = parser.parse_args()
+    n_epochs = args.epochs
+
     # -------------------------------------------------------------------------
     X = np.load(DATA_PATH / "features.npy")
     y = np.load(DATA_PATH / "labels.npy")
     y = y.reshape(-1, 1)
 
-
     # simplest case of cross-validation
     X_train, X_val, y_train, y_val = train_test_split(
         X, y, test_size=0.33, random_state=42
@@ -43,7 +50,6 @@ def main():
     # -------------------------------------------------------------------------
     model = nn.Linear(in_features=1, out_features=1)
 
-    regularizer = L1Regularization(regularization_rate=0.01, biases=False)
     criterion = nn.MSELoss()
     optimizer = optim.Adam(model.parameters(), lr=0.005)
 
@@ -58,6 +64,7 @@
         EarlyStopping(patience=100, load_best=True),
         swa_callback,
         RichProgressBar(display_step=100, log_lr=False),
+        L1Regularization(regularization_rate=0.01, biases=False)
     ]
 
     metrics = [
@@ -80,27 +87,14 @@ def main():
         model=model,
         criterion=criterion,
         optimizer=optimizer,
-        regularizer=regularizer,
         callbacks=callbacks,
         metrics=metrics,
     )
 
     # -------------------------------------------------------------------------
-    # argument parsing
-    parser = argparse.ArgumentParser("")
-    parser.add_argument("--epochs", type=int, default=5000)
-
-    args = parser.parse_args()
-    n_epochs = args.epochs
-
-    # -------------------------------------------------------------------------
-    # fitting process
+    # fitting process and predictions
     history = trainer.fit(train_loader, val_loader, epochs=n_epochs)
-
-    # predictions
-    with torch.no_grad():
-        to_predict = torch.from_numpy(X_val).float()
-        y_pred = model(to_predict).cpu().numpy()
+    y_pred = trainer.predict(X_val, as_array=True)
 
     # -------------------------------------------------------------------------
     # plot predictions, losses and learning rate
```
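The manual `torch.no_grad()` block is replaced by the `predict` method added to `Trainer` in #38. A sketch of the contrast on dummy data, assuming `callbacks` and `metrics` are optional `Trainer` arguments (the torchdist example below omits `metrics`) and that `as_array=True` returns a NumPy array, as this diff suggests:

```python
import numpy as np
import torch
import torch.nn as nn
from torchfitter.trainer import Trainer

model = nn.Linear(in_features=1, out_features=1)
X_val = np.random.rand(10, 1).astype("float32")

# before this commit: manual no-grad inference
with torch.no_grad():
    y_pred_manual = model(torch.from_numpy(X_val)).cpu().numpy()

# after this commit: Trainer.predict wraps the same steps
trainer = Trainer(
    model=model,
    criterion=nn.MSELoss(),
    optimizer=torch.optim.Adam(model.parameters()),
)
y_pred = trainer.predict(X_val, as_array=True)
```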

examples/torchdist.py

Lines changed: 194 additions & 0 deletions
```python
"""
In this example, a regression model with the ability to predict a mean and
standard deviation is created and trained using torchfitter.

By predicting a mean and a std. one can define some sort of uncertainty
interval around the predictions (a.k.a. how sure is my model about the
prediction of this sample?).
"""

import torch
import argparse
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
from torchfitter.conventions import ParamsDict
from sklearn.datasets import make_regression
from torchfitter.utils.preprocessing import train_test_val_split, torch_to_numpy
from torchfitter.trainer import Trainer
from torch.utils.data import DataLoader
from torchfitter.utils.data import DataWrapper
from torchfitter.callbacks import RichProgressBar, EarlyStopping


class DeepNormal(nn.Module):
    """Neural network with parametrizable normal distribution as output.

    Taken from [1].

    References
    ----------
    .. [1] Romain Strock - Modeling uncertainty with PyTorch:
       https://romainstrock.com/blog/modeling-uncertainty-with-pytorch.html
    """
    def __init__(self, n_inputs, n_hidden):
        super().__init__()

        # Shared parameters
        self.shared_layer = nn.Sequential(
            nn.Linear(n_inputs, n_hidden),
            nn.ReLU(),
            nn.Dropout(),
        )

        # Mean parameters
        self.mean_layer = nn.Sequential(
            nn.Linear(n_hidden, n_hidden),
            nn.ReLU(),
            nn.Dropout(),
            nn.Linear(n_hidden, 1),
        )

        # Standard deviation parameters
        self.std_layer = nn.Sequential(
            nn.Linear(n_hidden, n_hidden),
            nn.ReLU(),
            nn.Dropout(),
            nn.Linear(n_hidden, 1),
            nn.Softplus(),  # enforces positivity
        )

    def forward(self, x):
        # Shared embedding
        shared = self.shared_layer(x)

        # Parametrization of the mean
        mean = self.mean_layer(shared)

        # Parametrization of the standard deviation
        std = self.std_layer(shared)

        return torch.distributions.Normal(mean, std)


class NLLLoss(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, output, target):
        """
        Assumes `output` is a distribution.
        """
        neg_log_likelihood = -output.log_prob(target)
        return torch.mean(neg_log_likelihood)


def main():
    # -------------------------------------------------------------------------
    # argument parsing
    parser = argparse.ArgumentParser("")
    parser.add_argument("--epochs", type=int, default=5000)

    args = parser.parse_args()
    n_epochs = args.epochs

    # -------------------------------------------------------------------------
    # generate dummy data
    X, y = make_regression(
        n_samples=5000, n_features=1, n_informative=1, noise=5, random_state=0
    )
    y = y.reshape(-1, 1)

    # split data into train, test and validation
    _tup = train_test_val_split(X, y)
    X_train, y_train, X_val, y_val, X_test, y_test = _tup

    # wrap data in Dataset
    train_wrapper = DataWrapper(
        X_train, y_train, dtype_X="float", dtype_y="float"
    )
    val_wrapper = DataWrapper(X_val, y_val, dtype_X="float", dtype_y="float")

    # torch Loaders
    train_loader = DataLoader(train_wrapper, batch_size=64, pin_memory=True)
    val_loader = DataLoader(val_wrapper, batch_size=64, pin_memory=True)

    # -------------------------------------------------------------------------
    # define model, optimizer and loss
    criterion = NLLLoss()
    model = DeepNormal(n_inputs=X.shape[1], n_hidden=15)
    optimizer = optim.AdamW(model.parameters(), lr=1e-3)

    # callbacks list
    callbacks = [
        EarlyStopping(patience=150, load_best=True),
        RichProgressBar(display_step=50)
    ]

    # instantiate Trainer object with all the configuration
    trainer = Trainer(
        model=model,
        criterion=criterion,
        optimizer=optimizer,
        callbacks=callbacks,
    )

    # train process
    history = trainer.fit(train_loader, val_loader, epochs=n_epochs)

    # -------------------------------------------------------------------------
    # this is a torch distribution
    distr_prediction = trainer.predict(X_test)

    # get mean and standard deviation for each sample in test
    y_pred = distr_prediction.mean
    y_pred_std = distr_prediction.stddev

    # to array
    y_pred = torch_to_numpy(y_pred)
    y_pred_std = torch_to_numpy(y_pred_std)

    # -------------------------------------------------------------------------
    # plot losses, mean predictions and lr
    fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(19, 4))
    epoch_hist = history[ParamsDict.EPOCH_HISTORY]

    ax[0].plot(epoch_hist[ParamsDict.LOSS]["train"], label="Train loss")
    ax[0].plot(
        epoch_hist[ParamsDict.LOSS]["validation"], label="Validation loss"
    )
    ax[0].set_title("Train and validation losses")
    ax[0].grid()
    ax[0].legend()

    ax[1].plot(X_test, y_test, ".", label="Real")
    ax[1].plot(X_test, y_pred, ".", label="Prediction")
    ax[1].set_title("Predictions")
    ax[1].grid()
    ax[1].legend()

    ax[2].plot(epoch_hist[ParamsDict.HISTORY_LR], label="Learning rate")
    ax[2].set_title("Learning Rate")
    ax[2].legend()
    ax[2].grid()
    plt.show()

    # -------------------------------------------------------------------------
    # create some upper and lower bounds
    lower = y_pred - 2 * y_pred_std
    upper = y_pred + 2 * y_pred_std

    fig, ax = plt.subplots(1, 1, figsize=(15, 8))

    ax.plot(X_test, y_test, "*k")
    ax.scatter(X_test.flatten(), y_pred, label="predicted means")

    ax.scatter(X_test.flatten(), lower)
    ax.scatter(X_test.flatten(), upper)

    ax.grid(True)
    ax.legend()
    plt.show()  # display the uncertainty-band figure as well


if __name__ == "__main__":
    main()
```
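A quick sanity check on the ±2·std bands drawn at the end of the script: for a Normal distribution, roughly 95% of the probability mass lies within two standard deviations of the mean, which is what `lower` and `upper` visualize. A minimal, self-contained sketch:

```python
import torch

standard_normal = torch.distributions.Normal(loc=0.0, scale=1.0)

# probability mass between -2 and +2 standard deviations
coverage = standard_normal.cdf(torch.tensor(2.0)) - standard_normal.cdf(
    torch.tensor(-2.0)
)
print(f"coverage within 2 std: {coverage.item():.4f}")  # ~0.9545
```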
