Commit a3f3252

updated README
1 parent 612d288 commit a3f3252

1 file changed

Lines changed: 15 additions & 12 deletions

README.md

Lines changed: 15 additions & 12 deletions
@@ -97,6 +97,19 @@ watcher.get_ESD()
9797
watcher.distances(model_1, model_2)
9898
```
9999

100+
## PEFT / LORA models
101+
102+
To analyze a PEFT / LORA fine-tuned model, specify the `peft` option.
103+
104+
- peft = True: Analyzes the base_model, the delta, and the combined layer weight matrices
105+
- peft = 'lora_only': Much faster, only analyzes the delta
106+
107+
For example:
108+
```
109+
details = watcher.analyze(peft=True)
110+
```
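
To make the three kinds of matrices concrete, here is a minimal NumPy sketch of a single layer under the usual LoRA formulation, where the delta is a scaled low-rank product added to the frozen base weights. The shapes, the `lora_alpha` scaling, the random `B`, and the `esd` helper are illustrative assumptions, not WeightWatcher internals; the ESD is taken here as the eigenvalues of `W^T W / N`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for one layer with a rank-4 adapter (illustration only).
d_out, d_in, r, lora_alpha = 64, 32, 4, 8

W_base = rng.normal(size=(d_out, d_in))   # frozen base_model weights
A = rng.normal(size=(r, d_in))            # LoRA down-projection
B = rng.normal(size=(d_out, r))           # LoRA up-projection (random here, for illustration)

delta = (lora_alpha / r) * (B @ A)        # the low-rank update ("the delta")
W_combined = W_base + delta               # the combined layer weight matrix


def esd(W):
    """Eigenvalues of W^T W / N, with N the larger dimension (assumed convention)."""
    N = max(W.shape)
    singular_values = np.linalg.svd(W, compute_uv=False)
    return singular_values ** 2 / N


# One spectrum per matrix: base, delta, and combined.
spectra = {
    "base": esd(W_base),
    "delta": esd(delta),
    "combined": esd(W_combined),
}

# The delta has rank at most r, so its ESD has at most r non-zero eigenvalues.
delta_rank = np.linalg.matrix_rank(delta)
nonzero_delta_eigs = int((spectra["delta"] > 1e-8).sum())
```

Because the delta has only `r` non-zero eigenvalues, its spectrum is much smaller than that of the full matrices, which is consistent with the `'lora_only'` option being described as much faster.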
111+
112+
100113
## Plotting and Fitting the Empirical Spectral Density (ESD)
101114

102115
WW creates plots for each layer weight matrix to observe how well the power law fits work
@@ -147,17 +160,6 @@ All of these attempt to measure how non-random and/or non-heavy-tailed the layer
147160
- (Truncated) PL quality of fit `D` : <img src="https://render.githubusercontent.com/render/math?math=\D"> (the Kolmogorov-Smirnov distance metric)
148161

149162

150-
#### PEFT / LORA models
151-
152-
To analyze a PEFT / LORA fine-tuned model, specify the `peft` option.
153-
154-
- peft = True: Analyzes the base_model, the delta, and the combined layer weight matrices
155-
- peft = 'lora_only': Much faster, only analyzes the delta
156-
157-
For example:
158-
```
159-
details = watcher.analyze(peft=True)
160-
```
161163

162164

163165
(advanced usage)
@@ -210,7 +212,8 @@ The summary statistics can be used to gauge the test error of a series of pre/tr
210212
- average `alpha` can be used to compare one or more DNN models with different hyperparameter settings **&theta;**, when depth is not a driving factor (e.g. transformer models)
211213
- average `log_spectral_norm` is useful to compare models of different depths **L** at a coarse grain level
212214
- average `alpha_weighted` and `log_alpha_norm` are suitable for DNNs of differing hyperparameters **&theta;** and depths **L** simultaneously (e.g. CV models like VGG and ResNet)
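
As a rough illustration of what these averages are, the sketch below computes per-metric means over per-layer rows. The rows are hypothetical placeholders standing in for the `details` table (the numbers are made up and carry no meaning), and `summarize` is an illustrative helper, not part of the WW API.

```python
from statistics import mean

# Hypothetical per-layer rows; real values would come from the analysis details.
details_rows = [
    {"layer_id": 0, "alpha": 2.0, "log_spectral_norm": 1.0,
     "alpha_weighted": 2.0, "log_alpha_norm": 2.5},
    {"layer_id": 1, "alpha": 4.0, "log_spectral_norm": 0.5,
     "alpha_weighted": 2.0, "log_alpha_norm": 3.5},
]

METRICS = ("alpha", "log_spectral_norm", "alpha_weighted", "log_alpha_norm")


def summarize(rows, metrics=METRICS):
    """Average each metric over all layers (illustrative helper)."""
    return {metric: mean(row[metric] for row in rows) for metric in metrics}


summary = summarize(details_rows)
```

Comparing two models then amounts to comparing their `summary` values for the metric that suits the situation described in the list above.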
213-
215+
216+
214217
#### Predicting the Generalization Error
215218

216219
