CVPR 2021 Papers and Open-Source Projects (Papers with Code)
[CVPR 2021 Papers with Code: Table of Contents]
https://github.com/amusi/CVPR2021-Papers-with-Code
Backbone
NAS
GAN
Visual Transformer
Self-Supervised Learning
Object Detection
Instance Segmentation
Panoptic Segmentation
Video Understanding / Action Recognition
Face Recognition
Face Anti-Spoofing
Deepfake Detection
Face Age Estimation
Human Parsing
Super-Resolution
Image Restoration
3D Object Detection
3D Semantic Segmentation
3D Object Tracking
3D Point Cloud Registration
6D Pose Estimation
Depth Estimation
Adversarial Examples
Image Retrieval
Zero-Shot Learning
Visual Reasoning
Human-Object Interaction (HOI) Detection
Shadow Removal
Datasets
Others
Not Sure (acceptance unconfirmed)
Backbone
Coordinate Attention for Efficient Mobile Network Design
Paper: https://arxiv.org/abs/2103.02907
Code: https://github.com/Andrew-Qibin/CoordAttention
Inception Convolution with Efficient Dilation Search
Paper: https://arxiv.org/abs/2012.13587
Code: None
RepVGG: Making VGG-style ConvNets Great Again
Paper: https://arxiv.org/abs/2101.03697
Code: https://github.com/DingXiaoH/RepVGG
NAS
Inception Convolution with Efficient Dilation Search
Paper: https://arxiv.org/abs/2012.13587
Code: None
GAN
Training Generative Adversarial Networks in One Stage
Paper: https://arxiv.org/abs/2103.00430
Code: None
Closed-Form Factorization of Latent Semantics in GANs
Homepage: https://genforce.github.io/sefa/
Paper: https://arxiv.org/abs/2007.06600
Code: https://github.com/genforce/sefa
Anycost GANs for Interactive Image Synthesis and Editing
Paper: https://arxiv.org/abs/2103.03243
Code: https://github.com/mit-han-lab/anycost-gan
Image-to-image Translation via Hierarchical Style Disentanglement
Paper: https://arxiv.org/abs/2103.01456
Code: https://github.com/imlixinyang/HiSD
Visual Transformer
End-to-End Video Instance Segmentation with Transformers
Paper(Oral): https://arxiv.org/abs/2011.14503
Code: None
UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
Paper(Oral): https://arxiv.org/abs/2011.09094
Code: https://github.com/dddzg/up-detr
End-to-End Human Object Interaction Detection with HOI Transformer
Paper: https://arxiv.org/abs/2103.04503
Code: https://github.com/bbepoch/HoiTransformer
Transformer Interpretability Beyond Attention Visualization
Paper: https://arxiv.org/abs/2012.09838
Code: https://github.com/hila-chefer/Transformer-Explainability
Self-Supervised Learning
Dense Contrastive Learning for Self-Supervised Visual Pre-Training
Paper: https://arxiv.org/abs/2011.09157
Code: https://github.com/WXinlong/DenseCL
Object Detection
UP-DETR: Unsupervised Pre-training for Object Detection with Transformers
Paper(Oral): https://arxiv.org/abs/2011.09094
Code: https://github.com/dddzg/up-detr
General Instance Distillation for Object Detection
Paper: https://arxiv.org/abs/2103.02340
Code: None
Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection
Paper: https://arxiv.org/abs/2103.01903
Code: None
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
Homepage: http://rl.uni-freiburg.de/research/multimodal-distill
Paper: https://arxiv.org/abs/2103.01353
Code: http://rl.uni-freiburg.de/research/multimodal-distill
Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection
Paper: https://arxiv.org/abs/2011.12885
Code: https://github.com/implus/GFocalV2
Multiple Instance Active Learning for Object Detection
Paper: https://github.com/yuantn/MIAL/raw/master/paper.pdf
Code: https://github.com/yuantn/MIAL
Towards Open World Object Detection
Paper: https://arxiv.org/abs/2103.02603
Code: https://github.com/JosephKJ/OWOD
Instance Segmentation
End-to-End Video Instance Segmentation with Transformers
Paper(Oral): https://arxiv.org/abs/2011.14503
Code: None
Zero-shot instance segmentation(Not Sure)
Paper: None
Code: https://github.com/CVPR2021-pape-id-1395/CVPR2021-paper-id-1395
Panoptic Segmentation
Cross-View Regularization for Domain Adaptive Panoptic Segmentation
Paper: https://arxiv.org/abs/2103.02584
Code: None
Video Understanding / Action Recognition
TDN: Temporal Difference Networks for Efficient Action Recognition
Paper: https://arxiv.org/abs/2012.10071
Code: https://github.com/MCG-NJU/TDN
Face Recognition
WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition
Homepage: https://www.face-benchmark.org/
Paper: https://arxiv.org/abs/2103.04098
Dataset: https://www.face-benchmark.org/
When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
Paper(Oral): https://arxiv.org/abs/2103.01520
Code: https://github.com/Hzzone/MTLFace
Dataset: https://github.com/Hzzone/MTLFace
Face Anti-Spoofing
Cross Modal Focal Loss for RGBD Face Anti-Spoofing
Paper: https://arxiv.org/abs/2103.00948
Code: None
Deepfake Detection
Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain
Paper: https://arxiv.org/abs/2103.01856
Code: None
Multi-attentional Deepfake Detection
Paper: https://arxiv.org/abs/2103.02406
Code: None
Face Age Estimation
PML: Progressive Margin Loss for Long-tailed Age Classification
Paper: https://arxiv.org/abs/2103.02140
Code: None
Human Parsing
Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing
Paper: https://arxiv.org/abs/2103.04570
Code: https://github.com/tfzhou/MG-HumanParsing
Super-Resolution
ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic
Paper: https://arxiv.org/abs/2103.04039
Code: https://github.com/Xiangtaokong/ClassSR
AdderSR: Towards Energy Efficient Image Super-Resolution
Paper: https://arxiv.org/abs/2009.08891
Code: None
Image Restoration
Multi-Stage Progressive Image Restoration
Paper: https://arxiv.org/abs/2102.02808
Code: https://github.com/swz30/MPRNet
3D Object Detection
SE-SSD: Self-Ensembling Single-Stage Object Detector From Point Cloud
Paper: None
Code: https://github.com/Vegeta2020/SE-SSD
Center-based 3D Object Detection and Tracking
Paper: https://arxiv.org/abs/2006.11275
Code: https://github.com/tianweiy/CenterPoint
Categorical Depth Distribution Network for Monocular 3D Object Detection
Paper: https://arxiv.org/abs/2103.01100
Code: None
3D Semantic Segmentation
Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
Homepage: https://github.com/QingyongHu/SensatUrban
Paper: http://arxiv.org/abs/2009.03137
Code: https://github.com/QingyongHu/SensatUrban
Dataset: https://github.com/QingyongHu/SensatUrban
3D Object Tracking
Center-based 3D Object Detection and Tracking
Paper: https://arxiv.org/abs/2006.11275
Code: https://github.com/tianweiy/CenterPoint
3D Point Cloud Registration
PREDATOR: Registration of 3D Point Clouds with Low Overlap
Paper: https://arxiv.org/abs/2011.13005
Code: https://github.com/ShengyuH/OverlapPredator
6D Pose Estimation
FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation
Paper: https://arxiv.org/abs/2103.02242
Code: https://github.com/ethnhe/FFB6D
Depth Estimation
Depth from Camera Motion and Object Detection
Paper: https://arxiv.org/abs/2103.01468
Code: https://github.com/griffbr/ODMD
Dataset: https://github.com/griffbr/ODMD
Adversarial Examples
Natural Adversarial Examples
Paper: https://arxiv.org/abs/1907.07174
Code: https://github.com/hendrycks/natural-adv-examples
Image Retrieval
QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval
Paper: https://arxiv.org/abs/2103.02927
Code: None
Zero-Shot Learning
Counterfactual Zero-Shot and Open-Set Visual Recognition
Paper: https://arxiv.org/abs/2103.00887
Code: https://github.com/yue-zhongqi/gcm-cf
Visual Reasoning
Transformation Driven Visual Reasoning
Homepage: https://hongxin2019.github.io/TVR/
Paper: https://arxiv.org/abs/2011.13160
Code: https://github.com/hughplay/TVR
Human-Object Interaction (HOI) Detection
End-to-End Human Object Interaction Detection with HOI Transformer
Paper: https://arxiv.org/abs/2103.04503
Code: https://github.com/bbepoch/HoiTransformer
Shadow Removal
Auto-Exposure Fusion for Single-Image Shadow Removal
Paper: https://arxiv.org/abs/2103.01255
Code: https://github.com/tsingqguo/exposure-fusion-shadow-removal
Datasets
Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food
Paper: https://arxiv.org/abs/2103.03375
Dataset: None
Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges
Homepage: https://github.com/QingyongHu/SensatUrban
Paper: http://arxiv.org/abs/2009.03137
Code: https://github.com/QingyongHu/SensatUrban
Dataset: https://github.com/QingyongHu/SensatUrban
When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework
Paper(Oral): https://arxiv.org/abs/2103.01520
Code: https://github.com/Hzzone/MTLFace
Dataset: https://github.com/Hzzone/MTLFace
Depth from Camera Motion and Object Detection
Paper: https://arxiv.org/abs/2103.01468
Code: https://github.com/griffbr/ODMD
Dataset: https://github.com/griffbr/ODMD
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
Paper: https://arxiv.org/abs/2012.02206
Code: https://github.com/daveredrum/Scan2Cap
Dataset: https://github.com/daveredrum/ScanRefer
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
Paper: https://arxiv.org/abs/2103.01353
Code: http://rl.uni-freiburg.de/research/multimodal-distill
Dataset: http://rl.uni-freiburg.de/research/multimodal-distill
Others
Knowledge Evolution in Neural Networks
Paper(Oral): https://arxiv.org/abs/2103.05152
Code: https://github.com/ahmdtaha/knowledge_evolution
Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning
Paper: https://arxiv.org/abs/2103.02148
Code: https://github.com/guopengf/FLMRCM
SGP: Self-supervised Geometric Perception
Paper(Oral): https://arxiv.org/abs/2103.03114
Code: https://github.com/theNded/SGP
Diffusion Probabilistic Models for 3D Point Cloud Generation
Paper: https://arxiv.org/abs/2103.01458
Code: https://github.com/luost26/diffusion-point-cloud
Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
Paper: https://arxiv.org/abs/2012.02206
Code: https://github.com/daveredrum/Scan2Cap
Dataset: https://github.com/daveredrum/ScanRefer
There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge
Paper: https://arxiv.org/abs/2103.01353
Code: http://rl.uni-freiburg.de/research/multimodal-distill
Dataset: http://rl.uni-freiburg.de/research/multimodal-distill
Not Sure (acceptance unconfirmed)
CT Film Recovery via Disentangling Geometric Deformation and Photometric Degradation: Simulated Datasets and Deep Models
Paper: None
Code: https://github.com/transcendentsky/Film-Recovery
Toward Explainable Reflection Removal with Distilling and Model Uncertainty
Paper: None
Code: https://github.com/ytpeng-aimlab/CVPR-2021-Toward-Explainable-Reflection-Removal-with-Distilling-and-Model-Uncertainty
DeepOIS: Gyroscope-Guided Deep Optical Image Stabilizer Compensation
Paper: None
Code: https://github.com/lhaippp/DeepOIS
Exploring Adversarial Fake Images on Face Manifold
Paper: None
Code: https://github.com/ldz666666/Style-atk
Uncertainty-Aware Semi-Supervised Crowd Counting via Consistency-Regularized Surrogate Task
Paper: None
Code: https://github.com/yandamengdanai/Uncertainty-Aware-Semi-Supervised-Crowd-Counting-via-Consistency-Regularized-Surrogate-Task
Temporal Contrastive Graph for Self-supervised Video Representation Learning
Paper: None
Code: https://github.com/YangLiu9208/TCG
Boosting Monocular Depth Estimation Models to High-Resolution via Context-Aware Patching
Paper: None
Code: https://github.com/ouranonymouscvpr/cvpr2021_ouranonymouscvpr
Fast and Memory-Efficient Compact Bilinear Pooling
Paper: None
Code: https://github.com/cvpr2021kp2/cvpr2021kp2
Identification of Empty Shelves in Supermarkets using Domain-inspired Features with Structural Support Vector Machine
Paper: None
Code: https://github.com/gapDetection/cvpr2021
Estimating A Child's Growth Potential From Cephalometric X-Ray Image via Morphology-Aware Interactive Keypoint Estimation
Paper: None
Code: https://github.com/interactivekeypoint2020/Morph
https://github.com/ShaoQiangShen/CVPR2021
https://github.com/gillesflash/CVPR2021
https://github.com/anonymous-submission1991/BaLeNAS
https://github.com/cvpr2021dcb/cvpr2021dcb
https://github.com/anonymousauthorCV/CVPR2021_PaperID_8578
https://github.com/AldrichZeng/FreqPrune
https://github.com/Anonymous-AdvCAM/Anonymous-AdvCAM
https://github.com/ddfss/datadrive-fss