Reading the Faster R-CNN Source Code: The RoI Pooling Layer

2023-07-02 20:24:26

First, the code for the RoI pooling layer lives under model/roi_pooling.

The main file is functions/roi_pool.py.

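Initialization parameters of the RoI pooling layer:

1. pooled_height: height of the feature map after pooling

2. pooled_width: width of the feature map after pooling

3. spatial_scale: ratio of the feature map size to the original image size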

Process:

The actual algorithm is implemented in functions/roi_pooling.c (CPU) and roi_pooling_cuda.c (GPU).

Below is the forward-pass code.

1 Multiply the RoI coordinates by spatial_scale to map each RoI onto the feature map, which gives the bbox coordinates on the feature map.
        // Index within the batch of the image this RoI belongs to
        int roi_batch_ind = rois_flat[index_roi + 0];
        // Position of the RoI on the feature map
        int roi_start_w = round(rois_flat[index_roi + 1] * spatial_scale);
        int roi_start_h = round(rois_flat[index_roi + 2] * spatial_scale);
        int roi_end_w = round(rois_flat[index_roi + 3] * spatial_scale);
        int roi_end_h = round(rois_flat[index_roi + 4] * spatial_scale);
        //      CHECK_GE(roi_batch_ind, 0);
        //      CHECK_LT(roi_batch_ind, batch_size);
        // Height and width of the RoI on the shared conv feature map (at least 1)
        int roi_height = fmaxf(roi_end_h - roi_start_h + 1, 1);
        int roi_width = fmaxf(roi_end_w - roi_start_w + 1, 1);
        // Size of each pooling bin when the RoI is split into pooled_height x pooled_width bins
        float bin_size_h = (float)(roi_height) / (float)(pooled_height);
        float bin_size_w = (float)(roi_width) / (float)(pooled_width);
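
The coordinate mapping above can be checked by hand with a small Python sketch. The function names `c_round` and `map_roi`, the sample RoI, and the spatial_scale of 1/16 are assumptions made for illustration and do not come from the repository. One detail to watch: C's round() rounds halves away from zero, while Python's built-in round() rounds halves to the nearest even integer, so the sketch uses its own helper.

```python
import math


def c_round(x):
    """Round half away from zero, like C's round().

    Python's built-in round() rounds halves to the nearest even integer.
    """
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def map_roi(roi, spatial_scale, pooled_height, pooled_width):
    """Map one RoI (batch_index, x1, y1, x2, y2), given in image pixels,
    onto the feature map and compute the size of each pooling bin."""
    batch_ind = int(roi[0])
    start_w = c_round(roi[1] * spatial_scale)
    start_h = c_round(roi[2] * spatial_scale)
    end_w = c_round(roi[3] * spatial_scale)
    end_h = c_round(roi[4] * spatial_scale)
    # A degenerate RoI is forced to cover at least one cell.
    roi_height = max(end_h - start_h + 1, 1)
    roi_width = max(end_w - start_w + 1, 1)
    bin_size_h = roi_height / pooled_height
    bin_size_w = roi_width / pooled_width
    return batch_ind, (start_w, start_h, end_w, end_h), (bin_size_h, bin_size_w)


# Hypothetical RoI and scale, chosen only for illustration.
batch_ind, box, bins = map_roi((0, 32, 48, 255, 303), 1.0 / 16, 7, 7)
print(batch_ind, box, bins)
```

With these sample values the RoI covers 17 x 15 cells on the feature map, so each of the 7 x 7 bins spans a fractional number of cells (17/7 high, 15/7 wide).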

2 Find the max: loop over every point in the bin and take the largest value as the pooled output for that bin.
        if (data_flat[index_data + index] > output_flat[pool_index + c * output_area])
        {
            output_flat[pool_index + c * output_area] = data_flat[index_data + index];
        }
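
To show how the comparison above fits into the full loop, here is a minimal single-channel sketch in Python. The excerpt does not show how the bin boundaries are computed, so the floor/ceil-and-clip scheme used here is an assumption borrowed from the usual Caffe-style RoI pooling. The function name `roi_max_pool` and the recorded `argmax` table are also illustrative.

```python
import math


def roi_max_pool(feature, roi_box, pooled_height, pooled_width):
    """Max-pool one RoI of a single-channel feature map (a 2-D list).

    roi_box is (start_w, start_h, end_w, end_h) in feature-map cells.
    Returns the pooled output and the flat index of each maximum.
    """
    height, width = len(feature), len(feature[0])
    start_w, start_h, end_w, end_h = roi_box
    roi_height = max(end_h - start_h + 1, 1)
    roi_width = max(end_w - start_w + 1, 1)
    bin_size_h = roi_height / pooled_height
    bin_size_w = roi_width / pooled_width

    output = [[0.0] * pooled_width for _ in range(pooled_height)]
    argmax = [[-1] * pooled_width for _ in range(pooled_height)]
    for ph in range(pooled_height):
        # Assumed boundaries: floor for the start, ceil for the end,
        # shifted by the RoI origin and clipped to the feature map.
        hstart = min(max(math.floor(ph * bin_size_h) + start_h, 0), height)
        hend = min(max(math.ceil((ph + 1) * bin_size_h) + start_h, 0), height)
        for pw in range(pooled_width):
            wstart = min(max(math.floor(pw * bin_size_w) + start_w, 0), width)
            wend = min(max(math.ceil((pw + 1) * bin_size_w) + start_w, 0), width)
            best = None
            for h in range(hstart, hend):
                for w in range(wstart, wend):
                    # Same comparison as in the C excerpt: keep the larger value.
                    if best is None or feature[h][w] > best:
                        best = feature[h][w]
                        argmax[ph][pw] = h * width + w
            if best is not None:  # an empty bin keeps output 0 and argmax -1
                output[ph][pw] = best
    return output, argmax


# 4x4 feature map whose values equal their flat indices 0..15.
feature = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pooled, argmax = roi_max_pool(feature, (0, 0, 3, 3), 2, 2)
print(pooled)
print(argmax)
```

Recording the position of each maximum is what makes the backward pass possible, since the gradient only flows to the cell that produced the maximum.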

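Now look at the inputs to forward:

1. feature maps: the features passed in from the shared convolutional layers

2. rois: the boxes selected by the RPN layer

Output: a feature map of fixed size w*h for each RoI

The forward-pass code is roughly the same in all three files.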

Next, the backward pass.

The backward pass in roi_pool.py mainly calls the backward-pass code in roi_pooling_cuda.

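        // Iterate over the pooled-output rows that can contain this point
        for (int ph = phstart; ph < phend; ++ph) {
            // Iterate over the pooled-output columns that can contain this point
            for (int pw = pwstart; pw < pwend; ++pw) {
                if (offset_argmax_data[(c * pooled_height + ph) * pooled_width + pw] == index)
                {
                    // This point was the max of that bin: accumulate the gradient
                    gradient += offset_top_diff[(c * pooled_height + ph) * pooled_width + pw];
                }
            }
        }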
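
The gradient routing in the loop above can be sketched in Python for a single channel and a single RoI. The function name `roi_pool_backward` and the sample values are illustrative assumptions. For simplicity the sketch scans every pooled cell instead of restricting the search to the [phstart, phend) and [pwstart, pwend) ranges used in the kernel, and it does not sum over multiple RoIs.

```python
def roi_pool_backward(grad_output, argmax, height, width):
    """Route pooled gradients back to the feature-map cells that won the max.

    grad_output and argmax are pooled_height x pooled_width 2-D lists;
    argmax holds flat indices (h * width + w) into the feature map,
    or -1 for an empty bin.
    """
    grad_input = [[0.0] * width for _ in range(height)]
    for h in range(height):
        for w in range(width):
            index = h * width + w
            gradient = 0.0
            # Gather form, as in the kernel: each input cell collects the
            # gradient of every pooled cell whose max came from it.
            for ph in range(len(argmax)):
                for pw in range(len(argmax[ph])):
                    if argmax[ph][pw] == index:
                        gradient += grad_output[ph][pw]
            grad_input[h][w] = gradient
    return grad_input


# Two pooled cells share the same argmax (flat index 5), so their gradients add up.
argmax_example = [[5, 5], [13, 15]]
grad_out = [[1.0, 2.0], [3.0, 4.0]]
grad_in = roi_pool_backward(grad_out, argmax_example, 4, 4)
print(grad_in)
```

Cells that were never the maximum of any bin receive zero gradient, which is why the backward pass only needs the argmax table and the incoming gradient.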

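In the backward pass:

Input: grad_output, the gradient with respect to this layer's output, which roi_pooling_cuda uses to compute the accumulated gradient. How exactly it is passed in and what it represents can only be seen by reading the code of the layer that follows.

Output: the accumulated gradient with respect to the input feature map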

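OK, that completes the RoI layer. It maps the rois onto the feature map, divides each one into a fixed m * n grid of bins, outputs the maximum of each bin, and so pools every RoI into a fixed m*n map.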
