
LSTM Structure

I won't rehash the basics of the LSTM structure here; for a simple, intuitive introduction I recommend:

[Translation] Understanding LSTM Networks

That article is well written: it lays out clearly what each of the LSTM gates does and how the cell state is updated. What it lacks is a description at the level of matrix computations, which is exactly the question that had been nagging me all day: is each gate a single neuron or a whole layer of neurons? Going back to the original LSTM paper did not settle it either, and its figures are harder to follow than the article above. The only option left was to consult the TensorFlow source code.

1 - First, starting from

from tensorflow.contrib import rnn
rnn.BasicLSTMCell()
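
A small aside: if you don't want to hunt for the path by hand, Python's inspect module can locate the defining file (this assumes a TF 1.x install, where BasicLSTMCell lives in rnn_cell_impl.py):

import inspect
from tensorflow.contrib import rnn

# print the file that defines BasicLSTMCell,
# e.g. ...\tensorflow\python\ops\rnn_cell_impl.py on TF 1.x
print(inspect.getsourcefile(rnn.BasicLSTMCell))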
           

I found the source in C:\Anaconda3\Lib\site-packages\tensorflow\python\ops\rnn_cell_impl.py:

class BasicLSTMCell(RNNCell):
  ......
  def call(self, inputs, state):
    """Long short-term memory cell (LSTM).

    Args:
      inputs: `2-D` tensor with shape `[batch_size x input_size]`.
      state: An `LSTMStateTuple` of state tensors, each shaped
        `[batch_size x self.state_size]`, if `state_is_tuple` has been set to
        `True`.  Otherwise, a `Tensor` shaped
        `[batch_size x 2 * self.state_size]`.

    Returns:
      A pair containing the new hidden state, and the new state (either a
        `LSTMStateTuple` or a concatenated state, depending on
        `state_is_tuple`).
    """
    sigmoid = math_ops.sigmoid
    # Parameters of gates are concatenated into one multiply for efficiency.
    if self._state_is_tuple:
      c, h = state
    else:
      c, h = array_ops.split(value=state, num_or_size_splits=2, axis=1)

    concat = _linear([inputs, h], 4 * self._num_units, True)

    # i = input_gate, j = new_input, f = forget_gate, o = output_gate
    i, j, f, o = array_ops.split(value=concat, num_or_size_splits=4, axis=1)
    print('i shape:%s'%i.shape)
    new_c = (
        c * sigmoid(f + self._forget_bias) + sigmoid(i) * self._activation(j))
    new_h = self._activation(new_c) * sigmoid(o)

    if self._state_is_tuple:
      new_state = LSTMStateTuple(new_c, new_h)
    else:
      new_state = array_ops.concat([new_c, new_h], 1)
    return new_h, new_state
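
The key line is the _linear call: inputs and h are concatenated along the feature axis and multiplied by one weight matrix of shape [input_size + num_units, 4 * num_units], so all four gates come out of a single matmul and are then split apart. Below is a minimal NumPy sketch of the same computation (the variable names are mine; num_units = 100 and batch_size = 128 match the shapes printed further down, the other sizes are arbitrary):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

batch_size, input_size, num_units = 128, 28, 100   # input_size is an arbitrary illustrative value

x = np.random.randn(batch_size, input_size)    # inputs at one time step
h = np.random.randn(batch_size, num_units)     # previous hidden state
c = np.random.randn(batch_size, num_units)     # previous cell state

# _linear: one weight matrix and bias shared by all four gates, applied to [inputs, h]
W = np.random.randn(input_size + num_units, 4 * num_units)
b = np.zeros(4 * num_units)
concat = np.concatenate([x, h], axis=1) @ W + b    # shape (batch_size, 4 * num_units)

# i = input_gate, j = new_input, f = forget_gate, o = output_gate
i, j, f, o = np.split(concat, 4, axis=1)           # each is (batch_size, num_units)

forget_bias = 1.0
new_c = c * sigmoid(f + forget_bias) + sigmoid(i) * np.tanh(j)
new_h = np.tanh(new_c) * sigmoid(o)
print(i.shape)   # (128, 100): every gate is a vector of num_units activations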
           

With that print statement in place, I ran code along the lines of the example in "Deep Learning with TensorFlow":

# n_hidden = 100 and batch_size = 128 match the shapes printed below;
# n_steps, n_input and n_classes are assumed MNIST-style values (28 steps of 28 features, 10 classes)
n_hidden = 100
batch_size = 128
n_steps = 28
n_input = 28
n_classes = 10

x = tf.placeholder("float", [batch_size, n_steps, n_input])
y = tf.placeholder("float", [batch_size, n_classes])

weights = {
    'out': tf.Variable(tf.random_normal([n_hidden, n_classes]))
}
biases = {
    'out': tf.Variable(tf.random_normal([n_classes]))
}

def RNN(x, weights, biases):
    # (batch, steps, input) -> (steps, batch, input) -> list of n_steps tensors of shape (batch, input)
    x = tf.transpose(x, [1, 0, 2])
    x = tf.reshape(x, [-1, n_input])
    x = tf.split(axis=0, num_or_size_splits=n_steps, value=x)
    lstm_cell = rnn.BasicLSTMCell(n_hidden, forget_bias=1.0)
    outputs, states = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)
    return tf.matmul(outputs[-1], weights['out']) + biases['out']

pred = RNN(x, weights, biases)
           

which prints:

i shape:(128, 100)
i shape:(128, 100)
i shape:(128, 100)
i shape:(128, 100)
i shape:(128, 100)
i shape:(128, 100)
i shape:(128, 100)
i shape:(128, 100)
i shape:(128, 100)
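
Another way to confirm this is to inspect the trainable variables once the graph has been built: the cell's weight matrix has shape (n_input + n_hidden, 4 * n_hidden), i.e. four fully connected layers of n_hidden units stacked side by side. A small sketch (the exact variable name depends on the TensorFlow version):

# run after pred = RNN(x, weights, biases) has built the graph
for v in tf.trainable_variables():
    print(v.name, v.shape)
# The LSTM weight variable (named e.g. '.../basic_lstm_cell/kernel' in recent TF 1.x releases)
# has shape (n_input + n_hidden, 4 * n_hidden): one fully connected layer per gate.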
           

This shows that each gate is in fact a layer of neurons rather than a single neuron, i.e. the LSTM structure as drawn in the paper "Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge":

[Figure: LSTM cell structure as drawn in the Show and Tell paper]

It has an input gate i, an output gate o, a forget gate f, and so on. Taking the input gate alone, its structure is as in the figure below:

[Figure: a single gate drawn as a fully connected layer]

(==! Neither Visio nor CorelDRAW is installed, so no redrawn figure.)

In other words, a gate is just a fully connected layer.
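
To make that concrete, the input gate alone can be written as an ordinary fully connected layer over the concatenation of the current input and the previous hidden state (a minimal sketch; the names W_i and b_i are mine):

import numpy as np

n_input, n_hidden = 28, 100           # illustrative sizes
x_t = np.random.randn(n_input)        # current input
h_prev = np.random.randn(n_hidden)    # previous hidden state

# input gate = fully connected layer with n_hidden sigmoid neurons
W_i = np.random.randn(n_input + n_hidden, n_hidden)
b_i = np.zeros(n_hidden)
i_t = 1.0 / (1.0 + np.exp(-(np.concatenate([x_t, h_prev]) @ W_i + b_i)))
print(i_t.shape)   # (100,): one output per hidden unit, not a single scalar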
