LaneATT Paper Study Notes

Paper: "Keep your Eyes on the Lane: Real-time Attention-guided Lane Detection"

Link: https://arxiv.org/abs/2010.12035v2

Code: https://github.com/lucastabelini/LaneATT

Overall architecture

[Figure: LaneATT overall architecture]

Lane representation

$Lane=\{(x_i,y_i)\}^{N_{pts}-1}_{i=0},\ \ \ \ y_i=i\cdot\frac{H_{image}}{N_{pts}-1}$

s: start index of the lane

e: end index of the lane
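The representation above can be sketched in a few lines of Python. This is only an illustration of the formula $y_i=i\cdot\frac{H_{image}}{N_{pts}-1}$ and of how $s$ and $e$ select the valid points. The function names `lane_y_coords` and `valid_points` and the example values are assumptions of this sketch, not taken from the paper's code.

```python
def lane_y_coords(n_pts, img_h):
    """Return y_i = i * H_image / (N_pts - 1) for i = 0 .. N_pts - 1."""
    if n_pts < 2:
        raise ValueError("n_pts must be at least 2")
    step = img_h / (n_pts - 1)
    return [i * step for i in range(n_pts)]


def valid_points(xs, ys, s, e):
    """Keep only the points whose index lies in [s, e] (both inclusive)."""
    return [(xs[i], ys[i]) for i in range(s, e + 1)]


# Hypothetical example: 5 sample rows on a 360-pixel-high image.
ys = lane_y_coords(n_pts=5, img_h=360.0)
print(ys)  # [0.0, 90.0, 180.0, 270.0, 360.0]
```

Because the y-coordinates are fixed, a lane is fully described by its x-coordinates together with $s$ and $e$.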

Anchor representation

An origin point $O=(x_{orig}, y_{orig})$ located on a border of the image
A direction angle $\theta$

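Model outputs

$K + 1$ probabilities: one for each of the $K$ lane types plus one for the background
$N_{pts}$ offsets, each measuring the horizontal distance between the prediction and the anchor
The length $l$, i.e. the number of valid offsets

The start index $s$ is determined by $y_{orig}$, the y-coordinate of the anchor origin.

The end index is $e=s+\lfloor l\rfloor -1$.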
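A minimal sketch of how these outputs could be decoded into lane points. It assumes that the predicted x-coordinate is the anchor's x-coordinate plus the offset (the sign convention is an assumption) and that $e$ is clamped to the last valid index (also an assumption). `end_index` and `decode_proposal` are hypothetical names.

```python
import math


def end_index(s, length, n_pts):
    """e = s + floor(l) - 1, clamped to n_pts - 1 (clamping is an assumption)."""
    e = s + math.floor(length) - 1
    return min(e, n_pts - 1)


def decode_proposal(anchor_xs, offsets, s, length):
    """Return (s, e, xs), where xs = anchor x + offset for indices s..e."""
    n_pts = len(anchor_xs)
    e = end_index(s, length, n_pts)
    xs = [anchor_xs[i] + offsets[i] for i in range(s, e + 1)]
    return s, e, xs


# Hypothetical values: 6 sample rows, start index 1, predicted length 2.9.
print(decode_proposal([0, 1, 2, 3, 4, 5], [0.5] * 6, 1, 2.9))
# (1, 2, [1.5, 2.5])
```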

Loss function

Focal Loss for classification and Smooth L1 Loss for regression
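For reference, the two losses can be written out per element as below. This is a sketch of the standard definitions, not the repository's implementation. The defaults `alpha=0.25`, `gamma=2.0` and `beta=1.0` are common choices assumed here, and the binary form of the focal loss is used for simplicity.

```python
import math


def focal_loss(p, target, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one predicted probability p and a 0/1 target."""
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def smooth_l1(pred, target, beta=1.0):
    """Quadratic for small errors, linear for large ones."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


print(smooth_l1(0.5, 0.0))  # 0.125
print(smooth_l1(3.0, 0.0))  # 2.5
```

The factor $(1-p_t)^{\gamma}$ down-weights well-classified examples, which helps when most anchors are background.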

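Backbone

Any common CNN can serve as the backbone. The paper additionally applies a $1\times 1$ Conv to the backbone output $\pmb{F}_{back}\in \mathbb{R}^{C'_F\times H_F\times W_F}$ to reduce its channel dimension, which yields $\pmb{F}\in \mathbb{R}^{C_F\times H_F\times W_F}$.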
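A $1\times 1$ convolution is a per-pixel linear map over channels, so the spatial size $H_F\times W_F$ is unchanged. The pure-Python sketch below illustrates the operation on nested lists. The name `conv1x1` and the toy values are assumptions; in PyTorch the same operation would typically be an `nn.Conv2d` with `kernel_size=1`.

```python
def conv1x1(feat, weight, bias=None):
    """feat: [C_in][H][W], weight: [C_out][C_in] -> output: [C_out][H][W]."""
    c_in, h, w = len(feat), len(feat[0]), len(feat[0][0])
    out = []
    for c_out, w_row in enumerate(weight):
        b = bias[c_out] if bias is not None else 0.0
        # Each output pixel is a weighted sum of the input channels at that pixel.
        plane = [[b + sum(w_row[k] * feat[k][y][x] for k in range(c_in))
                  for x in range(w)]
                 for y in range(h)]
        out.append(plane)
    return out


# Toy example: reduce 2 channels to 1 on a 1x2 feature map.
f_back = [[[1.0, 2.0]], [[3.0, 4.0]]]
print(conv1x1(f_back, [[0.5, 0.5]]))  # [[[2.0, 3.0]]]
```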

Anchor-based Feature Pooling

Project the image-space anchor $(x_{orig},y_{orig},\theta)$ onto the feature map to obtain the points $\{(x_j,y_j)\ |\ y_j=0,1,2,...,H_F-1\}$:

$\begin{aligned} x_j&=\lfloor\frac{1}{\tan\theta}(y_j-\frac{y_{orig}}{stride_{backbone}})+\frac{x_{orig}}{stride_{backbone}}\rfloor\\ y_j&=0,\ \ 1,\ \ 2,\ \ ...,\ \ H_F-1 \end{aligned}$
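The projection formula can be checked with a small sketch. It assumes $\theta$ is given in radians with $\tan\theta\neq 0$, and it adds an `inside` flag for points that fall outside the feature map. The flag and the name `project_anchor` are assumptions of this sketch, useful for example if out-of-range positions are to be zero-padded.

```python
import math


def project_anchor(x_orig, y_orig, theta, stride, feat_h, feat_w):
    """Return one (x_j, y_j, inside) triple per feature-map row y_j."""
    inv_tan = 1.0 / math.tan(theta)
    x0, y0 = x_orig / stride, y_orig / stride  # origin in feature-map units
    points = []
    for y_j in range(feat_h):
        x_j = math.floor(inv_tan * (y_j - y0) + x0)
        inside = 0 <= x_j < feat_w
        points.append((x_j, y_j, inside))
    return points


# Hypothetical anchor: origin (80, 0), 45 degrees, backbone stride 32.
for point in project_anchor(80, 0, math.pi / 4, 32, feat_h=4, feat_w=5):
    print(point)
# (2, 0, True), (3, 1, True), (4, 2, True), (5, 3, False)
```

Each anchor therefore selects exactly one feature column per row, so its pooled feature has $H_F$ entries per channel.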

Attention mechanism

Local anchor features are used to build additional, global anchor features that aggregate information from all other anchors.

$\begin{aligned} w_{i,j}&=\begin{cases} \text{softmax}(L_{att}(\pmb{a}^{local}_i))_j,&\text{if}\ j<i\\ 0,&\text{if}\ j=i\\ \text{softmax}(L_{att}(\pmb{a}^{local}_i))_{j-1},&\text{if}\ j>i\\ \end{cases}\\ \pmb{a}^{global}_{i}&=\Sigma_{j}w_{i,j}\pmb{a}^{local}_{j} \end{aligned}$
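The index shift in the formula ($j$ versus $j-1$) exists because $L_{att}$ produces only $N-1$ scores per anchor, one for every other anchor. The sketch below takes those scores as given instead of computing them with a learned layer, which is a simplification of this example. All function names are hypothetical.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def attention_weights(logits):
    """logits[i] holds N-1 scores for anchor i; returns an N x N matrix w
    with w[i][i] = 0 and the softmax values placed around the diagonal."""
    n = len(logits)
    weights = []
    for i, row in enumerate(logits):
        probs = softmax(row)
        weights.append([0.0 if j == i else probs[j if j < i else j - 1]
                        for j in range(n)])
    return weights


def global_features(weights, local_feats):
    """a_global[i] = sum_j w[i][j] * a_local[j]."""
    dim = len(local_feats[0])
    return [[sum(w_row[j] * local_feats[j][d] for j in range(len(local_feats)))
             for d in range(dim)]
            for w_row in weights]


# Toy example: 3 anchors with uniform scores and 1-dimensional local features.
w = attention_weights([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
print(w)  # [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
print(global_features(w, [[1.0], [3.0], [5.0]]))  # [[4.0], [3.0], [2.0]]
```

Each row of $w$ sums to 1 and has a zero on the diagonal, so the global feature of an anchor is a weighted average of the other anchors' local features.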