CVPR 2021 Papers and Open-Source Projects Collection (Papers with Code)

机器学习AI算法工程


 · 2021-03-15






[CVPR 2021 Open-Source Paper Index]

https://github.com/amusi/CVPR2021-Papers-with-Code


  • Backbone

  • NAS

  • GAN

  • Visual Transformer

  • Self-Supervised Learning

  • Object Detection

  • Instance Segmentation

  • Panoptic Segmentation

  • Video Understanding / Action Recognition

  • Face Recognition

  • Face Anti-Spoofing

  • Deepfake Detection

  • Face Age Estimation

  • Human Parsing

  • Super-Resolution

  • Image Restoration

  • 3D Object Detection

  • 3D Semantic Segmentation

  • 3D Object Tracking

  • 3D Point Cloud Registration

  • 6D Pose Estimation

  • Depth Estimation

  • Adversarial Examples

  • Image Retrieval

  • Zero-Shot Learning

  • Visual Reasoning

  • Human-Object Interaction (HOI) Detection

  • Shadow Removal

  • Datasets

  • Others

  • Acceptance Unconfirmed (Not Sure)



Backbone

Coordinate Attention for Efficient Mobile Network Design

  • Paper: https://arxiv.org/abs/2103.02907

  • Code: https://github.com/Andrew-Qibin/CoordAttention

Inception Convolution with Efficient Dilation Search

  • Paper: https://arxiv.org/abs/2012.13587

  • Code: None

RepVGG: Making VGG-style ConvNets Great Again

  • Paper: https://arxiv.org/abs/2101.03697

  • Code: https://github.com/DingXiaoH/RepVGG


NAS

Inception Convolution with Efficient Dilation Search

  • Paper: https://arxiv.org/abs/2012.13587

  • Code: None


GAN

Training Generative Adversarial Networks in One Stage

  • Paper: https://arxiv.org/abs/2103.00430

  • Code: None

Closed-Form Factorization of Latent Semantics in GANs

  • Homepage: https://genforce.github.io/sefa/

  • Paper: https://arxiv.org/abs/2007.06600

  • Code: https://github.com/genforce/sefa

Anycost GANs for Interactive Image Synthesis and Editing

  • Paper: https://arxiv.org/abs/2103.03243

  • Code: https://github.com/mit-han-lab/anycost-gan

Image-to-image Translation via Hierarchical Style Disentanglement

  • Paper: https://arxiv.org/abs/2103.01456

  • Code: https://github.com/imlixinyang/HiSD


Visual Transformer

End-to-End Video Instance Segmentation with Transformers

  • Paper(Oral): https://arxiv.org/abs/2011.14503

  • Code: None

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

  • Paper(Oral): https://arxiv.org/abs/2011.09094

  • Code: https://github.com/dddzg/up-detr

End-to-End Human Object Interaction Detection with HOI Transformer

  • Paper: https://arxiv.org/abs/2103.04503

  • Code: https://github.com/bbepoch/HoiTransformer

Transformer Interpretability Beyond Attention Visualization

  • Paper: https://arxiv.org/abs/2012.09838

  • Code: https://github.com/hila-chefer/Transformer-Explainability


Self-Supervised Learning

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

  • Paper: https://arxiv.org/abs/2011.09157

  • Code: https://github.com/WXinlong/DenseCL


Object Detection

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

  • Paper(Oral): https://arxiv.org/abs/2011.09094

  • Code: https://github.com/dddzg/up-detr

General Instance Distillation for Object Detection

  • Paper: https://arxiv.org/abs/2103.02340

  • Code: None

Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

  • Paper: https://arxiv.org/abs/2103.01903

  • Code: None

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

  • Homepage: http://rl.uni-freiburg.de/research/multimodal-distill

  • Paper: https://arxiv.org/abs/2103.01353

  • Code: http://rl.uni-freiburg.de/research/multimodal-distill

Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection

  • Paper: https://arxiv.org/abs/2011.12885

  • Code: https://github.com/implus/GFocalV2

Multiple Instance Active Learning for Object Detection

  • Paper: https://github.com/yuantn/MIAL/raw/master/paper.pdf

  • Code: https://github.com/yuantn/MIAL

Towards Open World Object Detection

  • Paper: https://arxiv.org/abs/2103.02603

  • Code: https://github.com/JosephKJ/OWOD


Instance Segmentation

End-to-End Video Instance Segmentation with Transformers

  • Paper(Oral): https://arxiv.org/abs/2011.14503

  • Code: None

Zero-shot instance segmentation (Not Sure)

  • Paper: None

  • Code: https://github.com/CVPR2021-pape-id-1395/CVPR2021-paper-id-1395


Panoptic Segmentation

Cross-View Regularization for Domain Adaptive Panoptic Segmentation

  • Paper: https://arxiv.org/abs/2103.02584

  • Code: None


Video Understanding / Action Recognition

TDN: Temporal Difference Networks for Efficient Action Recognition

  • Paper: https://arxiv.org/abs/2012.10071

  • Code: https://github.com/MCG-NJU/TDN


Face Recognition

WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

  • Homepage: https://www.face-benchmark.org/

  • Paper: https://arxiv.org/abs/2103.04098

  • Dataset: https://www.face-benchmark.org/

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

  • Paper(Oral): https://arxiv.org/abs/2103.01520

  • Code: https://github.com/Hzzone/MTLFace

  • Dataset: https://github.com/Hzzone/MTLFace


Face Anti-Spoofing

Cross Modal Focal Loss for RGBD Face Anti-Spoofing

  • Paper: https://arxiv.org/abs/2103.00948

  • Code: None


Deepfake Detection

Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain

  • Paper: https://arxiv.org/abs/2103.01856

  • Code: None

Multi-attentional Deepfake Detection

  • Paper: https://arxiv.org/abs/2103.02406

  • Code: None


Face Age Estimation

PML: Progressive Margin Loss for Long-tailed Age Classification

  • Paper: https://arxiv.org/abs/2103.02140

  • Code: None


Human Parsing

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

  • Paper: https://arxiv.org/abs/2103.04570

  • Code: https://github.com/tfzhou/MG-HumanParsing


Super-Resolution

ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

  • Paper: https://arxiv.org/abs/2103.04039

  • Code: https://github.com/Xiangtaokong/ClassSR

AdderSR: Towards Energy Efficient Image Super-Resolution

  • Paper: https://arxiv.org/abs/2009.08891

  • Code: None


Image Restoration

Multi-Stage Progressive Image Restoration

  • Paper: https://arxiv.org/abs/2102.02808

  • Code: https://github.com/swz30/MPRNet


3D Object Detection

SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud

  • Paper: None

  • Code: https://github.com/Vegeta2020/SE-SSD

Center-based 3D Object Detection and Tracking

  • Paper: https://arxiv.org/abs/2006.11275

  • Code: https://github.com/tianweiy/CenterPoint

Categorical Depth Distribution Network for Monocular 3D Object Detection

  • Paper: https://arxiv.org/abs/2103.01100

  • Code: None


3D Semantic Segmentation

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

  • Homepage: https://github.com/QingyongHu/SensatUrban

  • Paper: http://arxiv.org/abs/2009.03137

  • Code: https://github.com/QingyongHu/SensatUrban

  • Dataset: https://github.com/QingyongHu/SensatUrban


3D Object Tracking

Center-based 3D Object Detection and Tracking

  • Paper: https://arxiv.org/abs/2006.11275

  • Code: https://github.com/tianweiy/CenterPoint


3D Point Cloud Registration

PREDATOR: Registration of 3D Point Clouds with Low Overlap

  • Paper: https://arxiv.org/abs/2011.13005

  • Code: https://github.com/ShengyuH/OverlapPredator


6D Pose Estimation

FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation

  • Paper: https://arxiv.org/abs/2103.02242

  • Code: https://github.com/ethnhe/FFB6D


Depth Estimation

Depth from Camera Motion and Object Detection

  • Paper: https://arxiv.org/abs/2103.01468

  • Code: https://github.com/griffbr/ODMD

  • Dataset: https://github.com/griffbr/ODMD


Adversarial Examples

Natural Adversarial Examples

  • Paper: https://arxiv.org/abs/1907.07174

  • Code: https://github.com/hendrycks/natural-adv-examples


Image Retrieval

QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval

  • Paper: https://arxiv.org/abs/2103.02927

  • Code: None


Zero-Shot Learning

Counterfactual Zero-Shot and Open-Set Visual Recognition

  • Paper: https://arxiv.org/abs/2103.00887

  • Code: https://github.com/yue-zhongqi/gcm-cf


Visual Reasoning

Transformation Driven Visual Reasoning

  • Homepage: https://hongxin2019.github.io/TVR/

  • Paper: https://arxiv.org/abs/2011.13160

  • Code: https://github.com/hughplay/TVR


Human-Object Interaction (HOI) Detection

End-to-End Human Object Interaction Detection with HOI Transformer

  • Paper: https://arxiv.org/abs/2103.04503

  • Code: https://github.com/bbepoch/HoiTransformer


Shadow Removal

Auto-Exposure Fusion for Single-Image Shadow Removal

  • Paper: https://arxiv.org/abs/2103.01255

  • Code: https://github.com/tsingqguo/exposure-fusion-shadow-removal


Datasets

Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food

  • Paper: https://arxiv.org/abs/2103.03375

  • Dataset: None

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

  • Homepage: https://github.com/QingyongHu/SensatUrban

  • Paper: http://arxiv.org/abs/2009.03137

  • Code: https://github.com/QingyongHu/SensatUrban

  • Dataset: https://github.com/QingyongHu/SensatUrban

When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

  • Paper(Oral): https://arxiv.org/abs/2103.01520

  • Code: https://github.com/Hzzone/MTLFace

  • Dataset: https://github.com/Hzzone/MTLFace

Depth from Camera Motion and Object Detection

  • Paper: https://arxiv.org/abs/2103.01468

  • Code: https://github.com/griffbr/ODMD

  • Dataset: https://github.com/griffbr/ODMD

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

  • Paper: https://arxiv.org/abs/2012.02206

  • Code: https://github.com/daveredrum/Scan2Cap

  • Dataset: https://github.com/daveredrum/ScanRefer

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

  • Paper: https://arxiv.org/abs/2103.01353

  • Code: http://rl.uni-freiburg.de/research/multimodal-distill

  • Dataset: http://rl.uni-freiburg.de/research/multimodal-distill


Others

Knowledge Evolution in Neural Networks

  • Paper(Oral): https://arxiv.org/abs/2103.05152

  • Code: https://github.com/ahmdtaha/knowledge_evolution

Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

  • Paper: https://arxiv.org/abs/2103.02148

  • Code: https://github.com/guopengf/FLMRCM

SGP: Self-supervised Geometric Perception

  • Paper(Oral): https://arxiv.org/abs/2103.03114

  • Code: https://github.com/theNded/SGP

Diffusion Probabilistic Models for 3D Point Cloud Generation

  • Paper: https://arxiv.org/abs/2103.01458

  • Code: https://github.com/luost26/diffusion-point-cloud

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

  • Paper: https://arxiv.org/abs/2012.02206

  • Code: https://github.com/daveredrum/Scan2Cap

  • Dataset: https://github.com/daveredrum/ScanRefer

There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

  • Paper: https://arxiv.org/abs/2103.01353

  • Code: http://rl.uni-freiburg.de/research/multimodal-distill

  • Dataset: http://rl.uni-freiburg.de/research/multimodal-distill


Acceptance Unconfirmed (Not Sure)

CT Film Recovery via Disentangling Geometric Deformation and Photometric Degradation: Simulated Datasets and Deep Models

  • Paper: None

  • Code: https://github.com/transcendentsky/Film-Recovery

Toward Explainable Reflection Removal with Distilling and Model Uncertainty

  • Paper: None

  • Code: https://github.com/ytpeng-aimlab/CVPR-2021-Toward-Explainable-Reflection-Removal-with-Distilling-and-Model-Uncertainty

DeepOIS: Gyroscope-Guided Deep Optical Image Stabilizer Compensation

  • Paper: None

  • Code: https://github.com/lhaippp/DeepOIS

Exploring Adversarial Fake Images on Face Manifold

  • Paper: None

  • Code: https://github.com/ldz666666/Style-atk

Uncertainty-Aware Semi-Supervised Crowd Counting via Consistency-Regularized Surrogate Task

  • Paper: None

  • Code: https://github.com/yandamengdanai/Uncertainty-Aware-Semi-Supervised-Crowd-Counting-via-Consistency-Regularized-Surrogate-Task

Temporal Contrastive Graph for Self-supervised Video Representation Learning

  • Paper: None

  • Code: https://github.com/YangLiu9208/TCG

Boosting Monocular Depth Estimation Models to High-Resolution via Context-Aware Patching

  • Paper: None

  • Code: https://github.com/ouranonymouscvpr/cvpr2021_ouranonymouscvpr

Fast and Memory-Efficient Compact Bilinear Pooling

  • Paper: None

  • Code: https://github.com/cvpr2021kp2/cvpr2021kp2

Identification of Empty Shelves in Supermarkets using Domain-inspired Features with Structural Support Vector Machine

  • Paper: None

  • Code: https://github.com/gapDetection/cvpr2021

Estimating A Child's Growth Potential From Cephalometric X-Ray Image via Morphology-Aware Interactive Keypoint Estimation

  • Paper: None

  • Code: https://github.com/interactivekeypoint2020/Morph

https://github.com/ShaoQiangShen/CVPR2021

https://github.com/gillesflash/CVPR2021

https://github.com/anonymous-submission1991/BaLeNAS

https://github.com/cvpr2021dcb/cvpr2021dcb

https://github.com/anonymousauthorCV/CVPR2021_PaperID_8578

https://github.com/AldrichZeng/FreqPrune

https://github.com/Anonymous-AdvCAM/Anonymous-AdvCAM

https://github.com/ddfss/datadrive-fss



