Action Recognition (Part 2): SlowFast_pytorch, SlowFast Action Classification

Author: 运维做开发 | 2024-07-22 07:09:33



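1 The SlowFast model has two pathways: a Slow pathway and a Fast pathway. The Slow pathway uses a low temporal sampling rate and a large number of channels, mainly to capture spatial semantics. The Fast pathway uses a high temporal sampling rate and a small number of channels (mainly to reduce computation) to capture temporal features. Both pathways use a 3D ResNet as the backbone for feature extraction. The basic network structure is shown below:
[Figure: SlowFast network architecture]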
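As a rough illustration of the two sampling rates, the sketch below picks frame indices for each pathway from one raw clip. The Slow stride `TAU = 16` and the clip length `NUM_FRAMES = 64` are assumed values chosen for illustration (they are not given in this post), and `sample_indices` is a hypothetical helper, not a function from any SlowFast codebase.

```python
def sample_indices(num_frames, stride):
    """Return the indices of every `stride`-th frame of a raw clip."""
    return list(range(0, num_frames, stride))


ALPHA = 8        # Fast/Slow frame-rate ratio (the value quoted later in this post)
TAU = 16         # assumed temporal stride of the Slow pathway
NUM_FRAMES = 64  # assumed raw clip length

# The Slow pathway sees sparse frames; the Fast pathway sees ALPHA times more.
slow_idx = sample_indices(NUM_FRAMES, TAU)
fast_idx = sample_indices(NUM_FRAMES, TAU // ALPHA)

print(len(slow_idx), slow_idx)      # 4 [0, 16, 32, 48]
print(len(fast_idx), fast_idx[:6])  # 32 [0, 2, 4, 6, 8, 10]
```

With these assumed values the Fast pathway receives 32 frames for the Slow pathway's 4, and every Slow frame is also a Fast frame.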
2 Neither pathway uses temporal downsampling. If the feature shape in the Slow pathway is $\{T,S^2,C\}$, the corresponding feature shape in the Fast pathway is $\{\alpha T,S^2,\beta C\}$. In the Slow pathway, non-degenerate temporal convolutions (temporal kernel size > 1), i.e. a $3\times1^2$ kernel, are used only in res4 and res5, because using them in the two earlier res stages was found to reduce accuracy. The paper's explanation: "We argue that this is because when objects move fast and the temporal stride is large, there is little correlation within a temporal receptive field unless the spatial receptive field is large enough (i.e., in later layers)". The Fast pathway uses non-degenerate temporal convolutions throughout, because this "pathway holds fine temporal resolution for the temporal convolutions to capture detailed motion".
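The shape relation and the placement of temporal convolutions can be written down as a small bookkeeping sketch. The helper names `temporal_kernel` and `fast_shape` are hypothetical, the Slow shape `(4, 56, 256)` is an assumed example, and the Fast temporal kernel size of 3 is taken from the paper's architecture table rather than from this post.

```python
from fractions import Fraction

STAGES = ["res2", "res3", "res4", "res5"]


def temporal_kernel(pathway, stage):
    """Temporal kernel size used in a residual stage of the given pathway."""
    if pathway == "fast":
        return 3  # non-degenerate temporal conv in every stage
    # Slow pathway: temporal kernel > 1 only in res4 and res5
    return 3 if stage in ("res4", "res5") else 1


def fast_shape(slow_shape, alpha, beta):
    """Map a Slow feature shape (T, S, C) to the Fast shape (alpha*T, S, beta*C)."""
    t, s, c = slow_shape
    return (alpha * t, s, int(c * beta))


slow = (4, 56, 256)  # assumed example: T=4 frames, S=56, C=256 channels
fast = fast_shape(slow, alpha=8, beta=Fraction(1, 8))

print(fast)                                             # (32, 56, 32)
print([temporal_kernel("slow", s) for s in STAGES])     # [1, 1, 3, 3]
print([temporal_kernel("fast", s) for s in STAGES])     # [3, 3, 3, 3]
```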
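3 Lateral connections fuse the Fast pathway's features into the Slow pathway's features. The paper offers three methods:
(i) Time-to-channel: directly reshape $\{\alpha T,S^2,\beta C\}$ into $\{T,S^2,\alpha\beta C\}$.
(ii) Time-strided sampling: a form of downsampling that keeps one frame out of every $\alpha$ frames, giving $\{T,S^2,\beta C\}$.
(iii) Time-strided convolution: a 3D convolution with a $5\times1^2$ kernel, $2\beta C$ output channels, stride $=\alpha$, and padding $=2$.
The two feature maps are finally fused by summation or concatenation. The experiments show that time-strided convolution combined with concatenation works best, so it is used as the default setting.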
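To check that all three lateral variants bring the Fast features back to $T$ frames, here is a shape-only sketch in plain Python. It does not build any network; it only applies the standard convolution output-length formula. The function names and the example shapes `(32, 56, 32)` and `(4, 56, 256)` are assumptions for illustration, and padding is assumed to apply along the time axis only.

```python
def conv_out_len(length, kernel, stride, padding):
    """Standard output-length formula for a strided convolution."""
    return (length + 2 * padding - kernel) // stride + 1


def time_to_channel(fast, alpha):
    """(alpha*T, S, beta*C) -> (T, S, alpha*beta*C): fold alpha frames into channels."""
    t, s, c = fast
    return (t // alpha, s, c * alpha)


def time_strided_sampling(fast, alpha):
    """(alpha*T, S, beta*C) -> (T, S, beta*C): keep one frame out of every alpha."""
    t, s, c = fast
    return (len(range(0, t, alpha)), s, c)


def time_strided_conv(fast, alpha):
    """(alpha*T, S, beta*C) -> (T, S, 2*beta*C): 5x1x1 kernel, stride alpha, padding 2."""
    t, s, c = fast
    return (conv_out_len(t, kernel=5, stride=alpha, padding=2), s, 2 * c)


def concat(slow, lateral):
    """Concatenate along channels; time and space must already match."""
    assert slow[:2] == lateral[:2], "T and S must agree before fusion"
    return (slow[0], slow[1], slow[2] + lateral[2])


ALPHA = 8
fast = (32, 56, 32)   # assumed (alpha*T, S, beta*C)
slow = (4, 56, 256)   # assumed (T, S, C)

print(time_to_channel(fast, ALPHA))                  # (4, 56, 256)
print(time_strided_sampling(fast, ALPHA))            # (4, 56, 32)
print(time_strided_conv(fast, ALPHA))                # (4, 56, 64)
print(concat(slow, time_strided_conv(fast, ALPHA)))  # (4, 56, 320)
```

In a PyTorch model, variant (iii) would typically be a `torch.nn.Conv3d` with `kernel_size=(5, 1, 1)`, `stride=(alpha, 1, 1)` and `padding=(2, 0, 0)`; that mapping is an assumption about how one might implement it, not something stated in this post.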
4 For the hyperparameters $\alpha$ and $\beta$, the paper sets $\alpha=8$. For $\beta$, values from 1/32 to 1/4 were tried, and 1/8 was finally chosen.
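To get a feel for what the $\beta$ range means in practice, the sketch below computes Fast pathway channel widths for the two endpoints and the chosen value. The Slow widths in `SLOW_CHANNELS` are assumed ResNet-50 style stage widths used only for illustration.

```python
from fractions import Fraction

# Assumed ResNet-50 style stage widths for the Slow pathway (illustrative only).
SLOW_CHANNELS = {"res2": 256, "res3": 512, "res4": 1024, "res5": 2048}


def fast_channels(beta):
    """Scale every Slow stage width by the channel ratio beta."""
    return {stage: int(width * beta) for stage, width in SLOW_CHANNELS.items()}


for beta in (Fraction(1, 32), Fraction(1, 8), Fraction(1, 4)):
    print(beta, fast_channels(beta))
# 1/32 {'res2': 8, 'res3': 16, 'res4': 32, 'res5': 64}
# 1/8 {'res2': 32, 'res3': 64, 'res4': 128, 'res5': 256}
# 1/4 {'res2': 64, 'res3': 128, 'res4': 256, 'res5': 512}
```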
5 For the Fast pathway, the paper also tries weaker spatial inputs: half spatial resolution, gray-scale, "time difference" frames, and optical flow. Gray-scale input was found to perform close to RGB input.
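Three of these weaker inputs are simple enough to sketch on tiny nested-list "frames". Everything here is an assumption made for illustration: the BT.601 luma weights for gray-scale, naive pixel skipping for half resolution, and differences between consecutive frames for "time difference". The post does not say how the paper computes them, and optical flow is left out because it needs a dedicated estimator.

```python
def to_gray(frame):
    """H x W grid of (r, g, b) tuples -> H x W grid of luma values (BT.601 weights, assumed)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in frame]


def half_resolution(frame):
    """Keep every second row and column (naive subsampling, assumed)."""
    return [row[::2] for row in frame[::2]]


def time_difference(frames):
    """List of T gray frames -> list of T-1 frames, each frame minus its predecessor."""
    return [
        [[b - a for a, b in zip(row_prev, row_next)] for row_prev, row_next in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]


# Two 2x2 single-channel frames with easy-to-check values.
gray_clip = [
    [[10, 20], [30, 40]],
    [[11, 22], [33, 44]],
]

print(time_difference(gray_clip))     # [[[1, 2], [3, 4]]]
print(half_resolution(gray_clip[0]))  # [[10]]
print(to_gray([[(0, 0, 0)]]))         # [[0.0]]
```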
