Libra R-CNN: Towards balanced learning for object detection

* Authors: [[Jiangmiao Pang]], [[Kai Chen]], [[Jianping Shi]], [[Huajun Feng]], [[Wanli Ouyang]], [[Dahua Lin]]


TL;DR

The paper tackles sample imbalance in object detection training from three angles (sampling, feature maps, and the loss function), proposing one component for each: IoU-balanced sampling, balanced feature pyramid, and balanced L1 loss.

Method

A good detector depends on three things: (1) whether the selected region samples are representative, (2) whether the extracted visual features are fully utilized, and (3) whether the designed objective function is optimal.

However, few methods do well on all three at once. For example, OHEM shifts the model's attention to hard samples, but it is also sensitive to noisy labels and adds considerable computation and memory cost; Focal Loss works well for one-stage detectors but brings little gain to R-CNN-style methods, because the two-stage pipeline has already filtered out most easy samples.

The overall framework of the proposed method combines the three components described below.

IoU-balanced Sampling

Through experiments the authors find that more than 60% of hard negatives have an IoU with the ground truth greater than 0.05, yet random sampling yields only about 30% of training samples in that range.

Suppose N negative samples are needed out of M candidates. IoU-balanced sampling evenly splits the sampling interval into K bins according to IoU with the ground truth and then samples uniformly within each bin, so the selection probability of a candidate in bin k (which contains $M_k$ candidates) is:

$$ p_{k}=\frac{N}{K} \cdot \frac{1}{M_{k}}, \quad k \in[0, K) $$
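A minimal NumPy sketch of this sampling rule, assuming the candidates' IoUs with their nearest ground truth are already computed; the function name, the handling of empty bins, and the random top-up step are my own choices, not the authors' released code:

```python
import numpy as np

def iou_balanced_sample(ious, num_needed, num_bins=3, hi_thr=0.5):
    """Pick `num_needed` negatives so that each IoU bin in [0, hi_thr)
    contributes roughly N/K samples (p_k = N/K * 1/M_k per candidate)."""
    ious = np.asarray(ious)
    edges = np.linspace(0.0, hi_thr, num_bins + 1)
    per_bin = num_needed // num_bins                 # N / K
    picked = []
    for k in range(num_bins):
        in_bin = np.where((ious >= edges[k]) & (ious < edges[k + 1]))[0]
        if len(in_bin) == 0:
            continue                                 # empty bin, top up later
        take = min(per_bin, len(in_bin))             # uniform sampling inside the bin
        picked.extend(np.random.choice(in_bin, take, replace=False))
    # top up with random leftovers if some bins had too few candidates
    rest = np.setdiff1d(np.arange(len(ious)), picked)
    short = num_needed - len(picked)
    if short > 0 and len(rest) > 0:
        picked.extend(np.random.choice(rest, min(short, len(rest)), replace=False))
    return np.asarray(picked, dtype=int)
```

For example, `iou_balanced_sample(ious, num_needed=128, num_bins=3)` tries to draw about 42 negatives from each of the three IoU bins and fills any remainder at random.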

Balanced Feature Pyramid

All FPN features are first rescaled to the same resolution and integrated (averaged) to gather semantic information from every level:

$$ C=\frac{1}{L} \sum_{l=l_{\min}}^{l_{\max}} C_{l} $$

A non-local module then refines the integrated feature, and the refined result is up/down-sampled back to the original levels and added to the per-level features for the subsequent steps.
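A rough PyTorch sketch of the integrate / refine / redistribute steps, assuming all FPN levels share the same channel count; `refine` is a placeholder for the paper's non-local block, and nearest-neighbour rescaling in both directions is a simplification of the official implementation:

```python
import torch
import torch.nn.functional as F

def balanced_feature_pyramid(feats, refine=None):
    """feats: list of FPN maps ordered from high to low resolution,
    all with the same channel count."""
    ref_size = feats[len(feats) // 2].shape[-2:]      # intermediate level as reference

    # 1) rescale every level to the reference size and average: C = 1/L * sum_l C_l
    gathered = [F.interpolate(f, size=ref_size, mode="nearest") for f in feats]
    integrated = torch.stack(gathered).mean(dim=0)

    # 2) refine the integrated map (the paper uses a non-local block here)
    if refine is not None:
        integrated = refine(integrated)

    # 3) rescale back to each original level and add residually
    return [f + F.interpolate(integrated, size=f.shape[-2:], mode="nearest")
            for f in feats]
```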

Balanced L1 Loss

Object detection has to handle classification and regression at the same time. With a plain L1 loss, hard samples that lie far from the ground truth produce large losses that dominate the regression gradient, so the authors propose a clipped formulation in the spirit of smooth L1 loss.

Samples with a regression loss below 1.0 are defined as inliers, and those with a loss of 1.0 or more as outliers.

As shown in Fig. 5(a) of the paper, smooth L1 loss clips the gradient of outliers to a constant 1.0; balanced L1 loss keeps a similar cap on the outlier gradient but promotes the gradients of inliers. Its gradient is defined as:

$$ \frac{\partial L_{b}}{\partial x}=\begin{cases}\alpha \ln (b|x|+1) & \text{if } |x|<1 \\ \gamma & \text{otherwise}\end{cases} $$

A smaller $\alpha$ promotes the inlier gradients more strongly, while $\gamma$ adjusts the upper bound of the outlier gradient. Integrating this gradient gives the final loss:

$$ L_{b}(x)=\begin{cases}\frac{\alpha}{b}(b|x|+1) \ln (b|x|+1)-\alpha|x| & \text{if } |x|<1 \\ \gamma|x|+C & \text{otherwise}\end{cases} $$

where $b$ is fixed by requiring the gradients of the two branches to match at $|x| = 1$:

$$ \alpha \ln (b+1)=\gamma $$
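A small PyTorch sketch of the loss above; `balanced_l1_loss` is my own helper name, `alpha=0.5, gamma=1.5` are the defaults reported in the paper, and the constant `C` follows from making the two branches meet at $|x| = 1$:

```python
import math
import torch

def balanced_l1_loss(pred, target, alpha=0.5, gamma=1.5):
    """Balanced L1 loss with the inlier/outlier threshold at 1.0."""
    x = torch.abs(pred - target)
    b = math.exp(gamma / alpha) - 1.0          # from alpha * ln(b + 1) = gamma
    C = gamma / b - alpha                      # makes the two branches meet at |x| = 1
    inlier = alpha / b * (b * x + 1) * torch.log(b * x + 1) - alpha * x
    outlier = gamma * x + C
    return torch.where(x < 1.0, inlier, outlier).mean()
```

Compared with smooth L1, the inlier branch rises faster near $|x| = 1$, so easy samples contribute a larger share of the regression gradient while the outlier gradient stays capped at $\gamma$.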
