format code
https://aqs.epa.gov/aqsweb/documents/AQCSV_Format.html
diff activation in cuda lstm:
https://github.com/tensorflow/tensorflow/issues/24375
https://www.reddit.com/r/MLQuestions/comments/9an2y0/keras_cudnnlstm_is_it_worth_the_drawbacks/
dropout in cuda lstm:
https://github.com/keras-team/keras/issues/8935
Limitations we are facing:
CuDNNLSTM does not provide an option to change the activation functions (https://github.com/tensorflow/tensorflow/issues/24375). Yesterday I changed the activation function for the CPU version and mistakenly ran the GPU version. Also on the GPU, CuDNNLSTM does not apply dropout to the recurrent nodes. However, https://arxiv.org/pdf/1708.02182.pdf recommends using DropConnect instead of Dropout: "We propose the use of DropConnect (Wan et al., 2013) on the recurrent hidden to hidden weight matrices [...]. By performing DropConnect on the hidden-to-hidden weight matrices `[Ui,Uf,Uo,Uc]` within the LSTM, we can prevent overfitting from occurring on the recurrent connections of the LSTM."
On the one hand, using the CPU version allows us to change parameters in the NN and connect with other CPU layers. On the other hand, it is certainly slower than the GPU version.
whole discussion: https://github.com/keras-team/keras/issues/8935
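To make the DropConnect idea from the quote concrete, here is a minimal sketch in plain Python (not Keras or TensorFlow code). The function name `dropconnect`, its parameters, and the rescaling by 1/keep_prob are assumptions for illustration; the quote only says that the hidden-to-hidden weights themselves are dropped. The masked matrix would be sampled once per forward pass and then reused for every timestep.

```python
import random


def dropconnect(weights, drop_prob, rng):
    """Return a copy of `weights` with entries zeroed at random.

    weights   : list of rows (e.g. one hidden-to-hidden matrix such as Ui)
    drop_prob : probability of dropping each individual weight
    rng       : random.Random instance, passed in for reproducibility
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    keep_prob = 1.0 - drop_prob
    # Assumption: surviving weights are rescaled by 1/keep_prob
    # (the usual inverted-dropout convention).
    return [
        [w / keep_prob if rng.random() < keep_prob else 0.0 for w in row]
        for row in weights
    ]


rng = random.Random(0)
U_i = [[1.0] * 4 for _ in range(4)]   # toy hidden-to-hidden matrix
U_i_masked = dropconnect(U_i, 0.5, rng)
```

The difference from ordinary dropout is that the mask is applied to the weights, not to the activations, so the recurrent computation itself stays unchanged.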
LOW gpu usage:
https://www.reddit.com/r/MLQuestions/comments/76ivi9/gpu_utilization_is_low_during_training_batch_size/
https://discuss.pytorch.org/t/low-gpu-utilization-with-cuda/4207
https://stackoverflow.com/questions/42319786/low-gpu-usage-performance-with-tensorflow-rnns
https://stackoverflow.com/questions/44563418/low-gpu-usage-by-keras-tensorflow
https://stackoverflow.com/questions/46146757/very-low-gpu-usage-during-training-in-tensorflow
talos (nice idea, but lacks development):
https://towardsdatascience.com/hyperparameter-optimization-with-keras-b82e6364ca53
--
https://machinelearningmastery.com/how-to-develop-a-skilful-time-series-forecasting-model/
--
longest non-NaN sequence
https://stackoverflow.com/questions/41494444/pandas-find-longest-stretch-without-nan-values
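A rough sketch of the idea in plain Python, as a stand-in for the pandas approach in the linked question. The helper name `longest_non_nan_run` is hypothetical; it scans the values once and returns the start index and length of the longest stretch without NaN.

```python
import math


def longest_non_nan_run(values):
    """Return (start_index, length) of the longest run without NaN.

    Returns (0, 0) if `values` is empty or contains only NaN.
    Ties are resolved in favour of the earliest run.
    """
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, v in enumerate(values):
        if isinstance(v, float) and math.isnan(v):
            run_len = 0                 # NaN ends the current run
        else:
            if run_len == 0:
                run_start = i           # a new run begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
    return best_start, best_len


nan = float("nan")
series = [1.0, nan, 2.0, 3.0, 4.0, nan, 5.0]
start, length = longest_non_nan_run(series)
longest = series[start:start + length]   # [2.0, 3.0, 4.0]
```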