Dear Author, I see that in your DQN training, the episode handling, loss calculation, and neural network update do not follow the original DQN algorithm; instead, the updates are performed after a certain number of steps. Is there a paper or pseudocode that describes this approach?