
Show, Reward and Tell

This is an AAAI 2018 paper that uses a GAN and reinforcement learning (RL) for photo stream storytelling. Paper link: https://pdfs.semanticscholar.org/977b/eecdf0b5c3487d03738cff501c79770f0858.pdf. I have not yet found the authors' homepage or any released code. The full title is Show, Reward and Tell: Automatic Generation of Narrative Paragraph from Photo Stream by Adversarial Training.

Personal take: two reasons for reading this paper

  • This task is one of the more interesting cross-media tasks.
  • The paper uses both a GAN and reinforcement learning (RL).

What the paper does (Photo Stream Story Telling):

Input: photo stream (several images)    Output: paragraph

An example shown in the paper is given in the figure below.

[figure: example narrative paragraph generated from a photo stream]

A comparison with state-of-the-art methods and the ground truth is shown below.

[figure: qualitative comparison with state-of-the-art methods and ground truth]

Method

The paper's framework is shown below.

[figure: framework overview]

A few key points of the paper:

Multi-modal Discriminator (sentence-level): encourages the generator to produce sentences relevant to the image. It classifies the concatenation of the image feature with the paired sentence, an unpaired sentence, or a generated sentence (discrimination over image-sentence concatenations); a rough sketch follows.
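
To make this concrete, here is a minimal PyTorch-style sketch of a three-way sentence-level discriminator. It is not the paper's implementation: the feature dimensions, the LSTM sentence encoder, and the class ordering {paired, unpaired, generated} are my own assumptions about how such a discriminator over image-sentence concatenations could look.

```python
# Minimal sketch, assuming pre-extracted image features and an LSTM sentence
# encoder; dimensions and class ordering are illustrative assumptions.
import torch
import torch.nn as nn

class MultiModalDiscriminator(nn.Module):
    def __init__(self, img_dim=2048, vocab_size=10000, embed_dim=512, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.sent_encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # classifier over the image-sentence concatenation:
        # 3 classes = {paired, unpaired, generated}
        self.classifier = nn.Sequential(
            nn.Linear(img_dim + hidden_dim, 512),
            nn.ReLU(),
            nn.Linear(512, 3),
        )

    def forward(self, img_feat, sentence_ids):
        # img_feat: (batch, img_dim), sentence_ids: (batch, seq_len)
        emb = self.embed(sentence_ids)
        _, (h, _) = self.sent_encoder(emb)
        sent_feat = h[-1]                          # (batch, hidden_dim)
        fused = torch.cat([img_feat, sent_feat], dim=1)
        return self.classifier(fused)              # logits over {paired, unpaired, generated}
```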

Language-style Discriminator (paragraph-level): pushes the generator toward human-level stories. It classifies ground-truth stories (gt), random combinations of ground-truth sentences (random), and the generated narrative paragraphs (generated); a sketch follows.
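
For the paragraph-level discriminator, a common way to realize this kind of classifier is a hierarchical encoder (word-level LSTM, then a sentence-level LSTM over sentence vectors). The sketch below is an assumption in that spirit, not the paper's architecture; only the three target classes {gt, random, generated} come from the paper.

```python
# Minimal sketch of a paragraph-level, three-way language-style discriminator;
# the hierarchical encoder and all sizes are illustrative assumptions.
import torch
import torch.nn as nn

class LanguageStyleDiscriminator(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=512, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.word_rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.sent_rnn = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, 3)   # {gt, random, generated}

    def forward(self, paragraph_ids):
        # paragraph_ids: (batch, num_sents, seq_len) word indices
        b, n, t = paragraph_ids.shape
        emb = self.embed(paragraph_ids.reshape(b * n, t))
        _, (h, _) = self.word_rnn(emb)
        sent_feats = h[-1].reshape(b, n, -1)         # one vector per sentence
        _, (hp, _) = self.sent_rnn(sent_feats)
        return self.classifier(hp[-1])               # logits over {gt, random, generated}
```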

Reward Function: assigns rewards for relevant sentences and human-level stories, which drive the reinforcement-learning update of the generator.
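
As a rough sketch of how the two discriminators could be turned into a reward, the function below combines the probability that each sentence is judged "paired" with its image and the probability that the whole paragraph is judged a ground-truth story. The weighting scheme, the use of softmax probabilities, and the class index 0 for "paired"/"gt" are my assumptions, not the paper's exact formulation.

```python
# Illustrative reward combining the two discriminators; alpha and the class
# indexing are assumptions made for the sketch.
import torch
import torch.nn.functional as F

def story_reward(mm_disc, ls_disc, img_feats, sentence_ids, alpha=0.5):
    """Reward for one generated story of num_sents sentences.
    img_feats:    (num_sents, img_dim)  one image feature per sentence
    sentence_ids: (num_sents, seq_len)  the sampled/generated sentences
    """
    with torch.no_grad():
        # sentence-level: probability each sentence is judged "paired" with its image
        p_paired = F.softmax(mm_disc(img_feats, sentence_ids), dim=1)[:, 0]
        # paragraph-level: probability the whole story is judged a ground-truth story
        p_gt = F.softmax(ls_disc(sentence_ids.unsqueeze(0)), dim=1)[0, 0]
    # weighted combination of the two reward signals
    return alpha * p_paired.mean() + (1 - alpha) * p_gt
```

In the RL view, this scalar replaces a differentiable loss: the generator samples a story, receives this reward, and is updated with a policy-gradient estimator such as REINFORCE, i.e. the loss is the negative (reward minus a baseline) times the log-probabilities of the sampled words.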