@@ -353,12 +353,12 @@ <h2 id="lstm-details">LSTM details </h2>
353353element-wise multiplication, denoted by \( \odot \).
354354</p>
355355
356- <p>It follows</p>
356+ <p>Mathematically, we have (see also the figure below)</p>
357357$$
358- \mathbf{f}^{(t)} = \sigma(W_f \mathbf{x}^{(t)} + U_f \mathbf{h}^{(t-1)} + \mathbf{b}_f)
358+ \mathbf{f}^{(t)} = \sigma(W_{fx} \mathbf{x}^{(t)} + W_{fh} \mathbf{h}^{(t-1)} + \mathbf{b}_f)
359359$$
360360
361- <p>where \( W \) and \( U \) are the weights respectively.</p>
361+ <p>where \( W_{fx} \) and \( W_{fh} \) are the weight matrices to be trained and \( \mathbf{b}_f \) is the bias.</p>
362362
363363<!-- !split --><br><br><br><br><br><br><br><br><br><br>
364364<h2 id="comparing-with-a-standard-rnn">Comparing with a standard RNN </h2>
@@ -414,6 +414,11 @@ <h2 id="the-forget-gate">The forget gate </h2>
414414control the amount of information we want to take from the long-term
415415memory.
416416</p>
417+ $$
418+ \mathbf{f}^{(t)} = \sigma(W_{fx}\mathbf{x}^{(t)} + W_{fh}\mathbf{h}^{(t-1)} + \mathbf{b}_f)
419+ $$
420+
421+ <p>where \( W_{fx} \) and \( W_{fh} \) are the weight matrices to be trained and \( \mathbf{b}_f \) is the bias.</p>
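<p>The forget gate equation can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's reference implementation: the function names, the dimensions and the random initialization are assumptions made for the example.</p>

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def forget_gate(x_t, h_prev, W_fx, W_fh, b_f):
    """f^(t) = sigma(W_fx x^(t) + W_fh h^(t-1) + b_f)."""
    return sigmoid(W_fx @ x_t + W_fh @ h_prev + b_f)

# Illustrative sizes and random values (assumptions, not from the text).
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
x_t = rng.normal(size=n_in)                   # current input x^(t)
h_prev = rng.normal(size=n_hidden)            # previous hidden state h^(t-1)
W_fx = rng.normal(size=(n_hidden, n_in))      # input-to-gate weights
W_fh = rng.normal(size=(n_hidden, n_hidden))  # hidden-to-gate weights
b_f = np.zeros(n_hidden)                      # bias

f_t = forget_gate(x_t, h_prev, W_fx, W_fh, b_f)
print(f_t)  # one value between 0 and 1 per cell-state component
```

<p>Each component of the output lies between 0 and 1 and acts as the fraction of the corresponding long-term memory component that is kept.</p>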
417422
418423<!-- !split --><br><br><br><br><br><br><br><br><br><br>
419424<h2 id="basic-layout">Basic layout </h2>
@@ -438,15 +443,15 @@ <h2 id="input-gate">Input gate </h2>
438443
439444<p>We have</p>
440445$$
441- \mathbf{i}^{(t)} = \sigma_g(W_i \mathbf{x}^{(t)} + U_i \mathbf{h}^{(t-1)} + \mathbf{b}_i),
446+ \mathbf{i}^{(t)} = \sigma(W_{ix} \mathbf{x}^{(t)} + W_{ih} \mathbf{h}^{(t-1)} + \mathbf{b}_i),
442447$$
443448
444449<p>and</p>
445450$$
446- \mathbf{\tilde{c}} ^{(t)} = \tanh(W_c \mathbf{x}^{(t)} + U_c \mathbf{h}^{(t-1)} + \mathbf{b}_c ),
451+ \mathbf{g} ^{(t)} = \tanh(W_{gx} \mathbf{x}^{(t)} + W_{gh} \mathbf{h}^{(t-1)} + \mathbf{b}_g ),
447452$$
448453
449- <p>again the \( W \) and \( U \) are the weights.</p>
454+ <p>Again, the \( W \) matrices are the weights to be trained.</p>
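<p>The two equations above differ only in their parameters and in the activation function. A minimal NumPy sketch follows. The dictionary layout <code>params</code>, the function names and the sizes are hypothetical choices made for this illustration.</p>

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def input_gate_and_candidate(x_t, h_prev, params):
    """Return the input gate i^(t) and the candidate g^(t)."""
    # Input gate: sigmoid squashes each component into (0, 1).
    i_t = sigmoid(params["W_ix"] @ x_t + params["W_ih"] @ h_prev + params["b_i"])
    # Candidate values: tanh squashes each component into (-1, 1).
    g_t = np.tanh(params["W_gx"] @ x_t + params["W_gh"] @ h_prev + params["b_g"])
    return i_t, g_t

# Illustrative sizes and random parameters (assumptions, not from the text).
rng = np.random.default_rng(1)
n_in, n_hidden = 3, 4
params = {
    "W_ix": rng.normal(size=(n_hidden, n_in)),
    "W_ih": rng.normal(size=(n_hidden, n_hidden)),
    "b_i": np.zeros(n_hidden),
    "W_gx": rng.normal(size=(n_hidden, n_in)),
    "W_gh": rng.normal(size=(n_hidden, n_hidden)),
    "b_g": np.zeros(n_hidden),
}
x_t = rng.normal(size=n_in)
h_prev = rng.normal(size=n_hidden)

i_t, g_t = input_gate_and_candidate(x_t, h_prev, params)
print(i_t)
print(g_t)
```

<p>The gate decides how much of each candidate component is written to the long-term memory, while the candidate itself can be positive or negative.</p>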
450455
451456<!-- !split --><br><br><br><br><br><br><br><br><br><br>
452457<h2 id="short-summary">Short summary </h2>
@@ -462,7 +467,7 @@ <h2 id="forget-and-input">Forget and input </h2>
462467
463468<p>Together, the forget gate and the input gate update the cell state according to</p>
464469$$
465- \mathbf{c}^{(t)} = \mathbf{f}^{(t)} \otimes \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \otimes \mathbf{\tilde{c} }^{(t)},
470+ \mathbf{c}^{(t)} = \mathbf{f}^{(t)} \odot \mathbf{c}^{(t-1)} + \mathbf{i}^{(t)} \odot \mathbf{g}^{(t)},
466471$$
467472
468473<p>where \( \mathbf{f}^{(t)} \) and \( \mathbf{i}^{(t)} \) are the outputs of the forget gate and the input gate, respectively.</p>
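<p>As a small worked example, the cell-state update is an element-wise combination of four vectors. The numbers below are made up for illustration, and the function name <code>update_cell_state</code> is hypothetical.</p>

```python
import numpy as np

def update_cell_state(f_t, i_t, g_t, c_prev):
    """c^(t) = f^(t) * c^(t-1) + i^(t) * g^(t), all products element-wise."""
    return f_t * c_prev + i_t * g_t

# Hand-picked values (assumptions) that make the role of each gate visible.
c_prev = np.array([1.0, -2.0, 0.5])  # previous cell state c^(t-1)
f_t = np.array([1.0, 0.0, 0.5])      # keep all, forget all, keep half
i_t = np.array([0.0, 1.0, 0.5])      # write nothing, write all, write half
g_t = np.array([0.9, -0.3, 0.2])     # candidate values

c_t = update_cell_state(f_t, i_t, g_t, c_prev)
print(c_t)  # approximately [1.0, -0.3, 0.35]
```

<p>The first component keeps the old memory unchanged, the second is fully replaced by the candidate, and the third blends the two.</p>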