Commonly Used TensorFlow RNN Functions

tf.nn.rnn_cell.BasicLSTMCell

__init__(
    num_units,
    forget_bias=1.0,
    state_is_tuple=True,
    activation=None,
    reuse=None,
    name=None
)

  • num_units: int, the number of units in the LSTM cell. (An LSTM is made of memory blocks, the small rectangles in the usual diagrams. A memory block contains cells and gates; a standard LSTM memory block has one cell and three gates, but a block can contain several cells with their corresponding gates. num_units is how many cells, with their gates, one memory block contains.)
  • forget_bias: float, 0.0 or 1.0 (default 1.0), added to the forget gate bias
  • state_is_tuple: bool, default True, meaning the state is returned as a (cell_state, hidden_state) tuple. If False the two are concatenated, but that option is deprecated.
  • activation: activation function of the inner states, default tanh
  • reuse; name
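A minimal usage sketch (TensorFlow 1.x assumed; the sizes here are made up, not from the original post). A cell object exposes state_size, output_size and zero_state, and calling it advances the state by one time step:

import tensorflow as tf

cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=64, forget_bias=1.0, state_is_tuple=True)
print(cell.state_size)    # LSTMStateTuple(c=64, h=64) because state_is_tuple=True
print(cell.output_size)   # 64

x_t = tf.placeholder(tf.float32, [32, 128])               # one time step: [batch_size, input_dim]
state = cell.zero_state(batch_size=32, dtype=tf.float32)  # initial (c, h) filled with zeros
output, new_state = cell(x_t, state)                      # output shape: [32, 64]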

tf.nn.rnn_cell.LSTMCell


__init__(
    num_units,
    use_peepholes=False,
    cell_clip=None,
    initializer=None,
    num_proj=None,
    proj_clip=None,
    num_unit_shards=None,
    num_proj_shards=None,
    forget_bias=1.0,
    state_is_tuple=True,
    activation=None,
    reuse=None,
    name=None
)

  • num_units: int, the number of units in the LSTM cell
  • use_peepholes: bool; if True, the cell uses peephole connections from the internal cells to the gates, which lets it learn precise output timing (see the sketch after this list)
  • cell_clip: float, optional; the cell state is clipped to this value before the cell output activation
  • initializer: initializer for the weight and projection matrices
  • num_proj: (optional) int, output dimensionality of the projection matrices
  • proj_clip: (optional) float; if num_proj > 0 and proj_clip is provided, the projected values are clipped elementwise to [-proj_clip, proj_clip]
  • num_unit_shards; num_proj_shards: deprecated
  • forget_bias: default 1.0
  • state_is_tuple; activation; reuse; name: as above
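A minimal sketch (TensorFlow 1.x assumed; the sizes and the initializer choice are made up) showing peepholes, clipping, and a projection layer used together:

import tensorflow as tf

cell = tf.nn.rnn_cell.LSTMCell(
    num_units=128,
    use_peepholes=True,                     # gates can look at the cell state
    cell_clip=3.0,                          # clip the cell state before the output activation
    num_proj=64,                            # project the 128-dim hidden state down to 64
    proj_clip=3.0,                          # clip the projected values to [-3, 3]
    initializer=tf.orthogonal_initializer())
print(cell.output_size)   # 64, because of num_proj
print(cell.state_size)    # LSTMStateTuple(c=128, h=64)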

For details on peephole connections, the projection layer and cell clipping, see the following papers:

Learning Precise Timing with LSTM Recurrent Networks

Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition

tf.nn.rnn_cell.MultiRNNCell

An RNN composed of several simpler cells stacked in sequence.

__init__(
    cells,
    state_is_tuple=True
)

  • cells: a list of RNN cells; the RNN is composed of these cells in this order
  • state_is_tuple: bool; if True (default), accepts and returns n-tuples of states, where n = len(cells). False is deprecated.
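A minimal sketch (TensorFlow 1.x assumed; the layer count and sizes are made up). Note that each layer needs its own cell object; do not reuse the same instance:

import tensorflow as tf

num_layers, num_units = 3, 128
cells = [tf.nn.rnn_cell.BasicLSTMCell(num_units) for _ in range(num_layers)]
stacked_cell = tf.nn.rnn_cell.MultiRNNCell(cells, state_is_tuple=True)
print(stacked_cell.state_size)   # a 3-tuple, one LSTMStateTuple per layer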

tf.nn.rnn_cell.GRUCell

Parameters:

  • num_units: number of units in the GRU cell
  • activation: nonlinearity, default tanh
  • reuse
  • kernel_initializer: (optional) initializer for the weight and projection matrices
  • bias_initializer: (optional) initializer for the bias
  • name: name of the layer; layers with the same name share weights, and to avoid errors this requires reuse=True

Reference: Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
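A small sketch (TensorFlow 1.x assumed; the sizes are made up): unlike an LSTM cell, a GRUCell keeps a single state tensor rather than an LSTMStateTuple:

import tensorflow as tf

cell = tf.nn.rnn_cell.GRUCell(num_units=64, bias_initializer=tf.zeros_initializer())
print(cell.state_size)    # 64 -- a single tensor, not an LSTMStateTuple
print(cell.output_size)   # 64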

tf.nn.rnn_cell.DropoutWrapper


Applies dropout to the inputs and outputs of the given cell.

__init__(
    cell,
    input_keep_prob=1.0,
    output_keep_prob=1.0,
    state_keep_prob=1.0,
    variational_recurrent=False,
    input_size=None,
    dtype=None,
    seed=None,
    dropout_state_filter_visitor=None
)

  • cell: an RNNCell
  • input_keep_prob: float between 0 and 1; if 1, no dropout is added to the inputs
  • output_keep_prob: float between 0 and 1; if 1, no dropout is added to the outputs
  • state_keep_prob: float between 0 and 1; if 1, no dropout is added to the outgoing state. Dropping out the cell state itself additionally requires setting dropout_state_filter_visitor (by default the cell state is never dropped out)
  • variational_recurrent: if True, the same dropout pattern is applied at every time step; if this is set, input_size must also be provided (see the sketch below)
  • input_size: nested TensorShape(s); only used when variational_recurrent=True and input_keep_prob < 1
  • dtype; seed: optional
  • dropout_state_filter_visitor: by default, everything except the cell state may be dropped out

By default, dropout is only applied between layers; when variational_recurrent is True it is applied not only between layers but also between time steps. For a detailed treatment of variational recurrent dropout, see the paper below:

A Theoretically Grounded Application of Dropout in Recurrent Neural Networks
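The sketch below is my own example (TensorFlow 1.x assumed; the feature size 50 and the keep probability handling are made up). It wraps a BasicLSTMCell with variational recurrent dropout:

import tensorflow as tf

# Feeding keep_prob through a placeholder lets dropout be switched off at test time.
keep_prob = tf.placeholder_with_default(1.0, shape=[])
base_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=128)
cell = tf.nn.rnn_cell.DropoutWrapper(
    base_cell,
    input_keep_prob=keep_prob,
    output_keep_prob=keep_prob,
    state_keep_prob=keep_prob,
    variational_recurrent=True,          # same dropout mask at every time step
    input_size=tf.TensorShape([50]),     # required when variational_recurrent=True; 50 = assumed input depth
    dtype=tf.float32)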

tf.nn.dynamic_rnn


dynamic_rnn(
    cell,
    inputs,
    sequence_length=None,
    initial_state=None,
    dtype=None,
    parallel_iterations=None,
    swap_memory=False,
    time_major=False,
    scope=None
)

  • cell: an RNNCell
  • inputs: the RNN inputs; within a batch all examples are padded to the same length, but different batches may use different lengths
  • sequence_length: a list/vector of the true lengths of the sequences in the batch

Below is a Stack Overflow answer about the outputs of dynamic_rnn: Analysis of the output from tf.nn.dynamic_rnn tensorflow function.

tf.nn.dynamic_rnn provides two outputs, outputs and state.

outputs contains the output of the RNN cell at every time instant. Assuming the default time_major == False, let's say you have an input composed of 10 examples with 7 time steps each and a feature vector of size 5 for every time step. Then your input would be 10x7x5 (batch_size x max_time x features). Now you give this as an input to a RNN cell with output size 15. Conceptually, each time step of each example is input to the RNN, and you would get a 15-long vector for each of those. So that is what outputs contains, a tensor in this case of size 10x7x15 (batch_size x max_time x cell.output_size) with the output of the RNN cell at each time step. If you are only interested in the last output of the cell, you can just slice the time dimension to pick just the last element (e.g. outputs[:, -1, :]).

state contains the state of the RNN after processing all the inputs. Note that, unlike outputs, this doesn't contain information about every time step, but only about the last one (that is, the state after the last one). Depending on your case, the state may or may not be useful. For example, if you have very long sequences, you may not want/be able to process them in a single batch, and you may need to split them into several subsequences. If you ignore the state, then whenever you give a new subsequence it will be as if you are beginning a new one; if you remember the state, however (e.g. outputting it or storing it in a variable), you can feed it back later (through the initial_state parameter of tf.nn.dynamic_rnn) in order to correctly keep track of the state of the RNN, and only reset it to the initial state (generally all zeros) after you have completed the whole sequences. The shape of state can vary depending on the RNN cell that you are using, but, in general, you have some state for each of the examples (one or more tensors with size batch_size x state_size, where state_size depends on the cell type and size).

The very last sentence about outputs is not quite right: when taking the output at the last time step, some sentences in the batch may be shorter than max_time, so outputs[:, -1, :] can be all zeros for them. To get the output at each example's true last step, use state.h instead.

import tensorflow as tf
import numpy as np

x = np.random.randn(2, 5, 3)    # batch_size=2, max_time=5, features=3
x[1, 4:] = 0                    # the second example only has 4 valid steps; zero-pad the last one
x_len = [5, 4]                  # true sequence lengths

cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=1, state_is_tuple=True)  # num_units can be larger
outputs, last_states = tf.nn.dynamic_rnn(cell=cell, inputs=x, dtype=tf.float64,
                                         sequence_length=x_len)
result = tf.contrib.learn.run_n({"outputs": outputs, "last_states": last_states},
                                n=1, feed_dict=None)
print(result[0])
[{'outputs': array([[[-0.02468451],
        [-0.09230891],
        [-0.00885939],
        [-0.08525897],
        [-0.25602909]],

       [[-0.05933624],
        [ 0.12028753],
        [ 0.02201308],
        [ 0.10565564],
        [ 0.        ]]]), 
        'last_states':LSTMStateTuple(c=array([[-0.54874369],
       [ 0.21746937]]), h=array([[-0.25602909],
       [ 0.10565564]]))}]
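Building on the example above, here is one way to recover the output at each example's true last step. Using state.h is the approach from the note above; the tf.gather_nd variant is an additional, cell-agnostic alternative (a sketch, not from the original post):

# 1) For an LSTM cell, last_states.h already holds the output at each example's true last step.
last_output = last_states.h                               # [batch_size, num_units]

# 2) Cell-agnostic: gather from `outputs` at index (length - 1) for every example.
seq_len = tf.constant(x_len, dtype=tf.int32)
batch_range = tf.range(tf.shape(outputs)[0])
indices = tf.stack([batch_range, seq_len - 1], axis=1)    # [[0, 4], [1, 3]]
last_output_gathered = tf.gather_nd(outputs, indices)     # [batch_size, num_units]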
      

tf.nn.bidirectional_dynamic_rnn


tf.nn.bidirectional_dynamic_rnn(
    cell_fw,
    cell_bw,
    inputs,
    sequence_length=None,
    initial_state_fw=None,
    initial_state_bw=None,
    dtype=None,
    parallel_iterations=None,
    swap_memory=False,
    time_major=False,
    scope=None
)

  • cell_fw, cell_bw: instances of RNNCell (forward and backward)
  • inputs: the RNN inputs
  • sequence_length: the true lengths of the input sequences

Returns an (outputs, output_states) tuple. By default, outputs is an (output_fw, output_bw) tuple, where output_fw and output_bw are tensors of shape [batch_size, max_time, output_size]. output_states is an (output_state_fw, output_state_bw) tuple containing the final states of the forward and backward RNNs.
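A minimal sketch (TensorFlow 1.x assumed; the shapes and cell sizes are made up) of a bidirectional LSTM, with the usual follow-up of concatenating the forward and backward results:

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 50, 100])    # [batch, max_time, features]
seq_len = tf.placeholder(tf.int32, [None])              # true length of each example
cell_fw = tf.nn.rnn_cell.LSTMCell(num_units=128)
cell_bw = tf.nn.rnn_cell.LSTMCell(num_units=128)
(output_fw, output_bw), (state_fw, state_bw) = tf.nn.bidirectional_dynamic_rnn(
    cell_fw, cell_bw, inputs, sequence_length=seq_len, dtype=tf.float32)

# Concatenate the two directions along the feature axis.
outputs = tf.concat([output_fw, output_bw], axis=-1)          # [batch, max_time, 256]
final_state = tf.concat([state_fw.h, state_bw.h], axis=-1)    # [batch, 256]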

tf.layers.dense


dense(
    inputs,
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=None,
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    trainable=True,
    name=None,
    reuse=None
)

  • inputs: a tensor
  • units: int or long, dimensionality of the output space
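For example (a sketch; the shapes are made up), tf.layers.dense is often used to project RNN features to hidden units or class logits:

import tensorflow as tf

features = tf.placeholder(tf.float32, [None, 256])                     # e.g. the final RNN state
hidden = tf.layers.dense(features, units=128, activation=tf.nn.relu)   # fully connected + ReLU
logits = tf.layers.dense(hidden, units=10)                             # linear output layer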

tf.layers.dropout

dropout(
    inputs,
    rate=0.5,
    noise_shape=None,
    seed=None,
    training=False,
    name=None
)

  • inputs: a tensor
  • rate: the dropout rate; e.g. rate=0.1 drops out 10% of the input units, and the remaining units are scaled up by 1/(1 - rate)
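A small sketch (shapes made up): tf.layers.dropout is only active when training=True, which is typically fed through a placeholder so it can be turned off at test time:

import tensorflow as tf

is_training = tf.placeholder(tf.bool, shape=[])
x = tf.placeholder(tf.float32, [None, 128])
h = tf.layers.dense(x, 64, activation=tf.nn.relu)
h = tf.layers.dropout(h, rate=0.3, training=is_training)   # drops 30% of units, scales the rest by 1/0.7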

tf.contrib.rnn.AttentionCellWrapper

__init__(
    cell,
    attn_length,
    attn_size=None,
    attn_vec_size=None,
    input_size=None,
    state_is_tuple=True,
    reuse=None
)

Adds basic attention to an RNN cell.

  • cell: the RNNCell to add attention to
  • attn_length: integer, the size of the attention window
  • attn_size: integer, the size of an attention vector; defaults to cell.output_size
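A minimal sketch (TensorFlow 1.x assumed; the cell size and window length are made up) of wrapping a cell with an attention window:

import tensorflow as tf

base_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units=128)
attn_cell = tf.contrib.rnn.AttentionCellWrapper(base_cell,
                                                attn_length=10,       # attend over the last 10 states
                                                state_is_tuple=True)
# attn_cell can now be used anywhere an RNNCell is expected, e.g. with tf.nn.dynamic_rnn.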
