ICCV 2023 Paper Digest (2023.8.16): Multiple Studies on Diffusion Models and SAM

AI算法与图像处理 (AI Algorithms and Image Processing)


2023-08-18 21:13


Compiled by: AI算法与图像处理
CVPR 2023 papers and code collection: https://github.com/DWCTOD/CVPR2023-Papers-with-Code-Demo
Follow the WeChat official account AI算法与图像处理 for more practical content.


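Hi everyone. I have recently been improving the weekly CVPR paper roundup and am currently considering grouping the papers by category, so that readers working in different areas can more easily pick out the ones that interest them.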

Latest results and demos:


Helping Hands: An Object-Aware Ego-Centric Video Recognition Model

  • Paper: http://arxiv.org/pdf/2308.07918

  • Code: https://github.com/chuhanxx/helping_hand_for_egocentric_videos

Memory-and-Anticipation Transformer for Online Action Understanding

  • Paper: http://arxiv.org/pdf/2308.07893

  • Code: https://github.com/echo0125/memory-and-anticipation-transformer

ObjectSDF++: Improved Object-Compositional Neural Implicit Surfaces

  • Paper: http://arxiv.org/pdf/2308.07868

  • Code: https://github.com/qianyiwu/objectsdf_plus

StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models

  • Paper: http://arxiv.org/pdf/2308.07863

  • Code: None

ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition

  • Paper: http://arxiv.org/pdf/2308.07815

  • Code: https://github.com/cool-xuan/imbalanced_sam

Learning to Identify Critical States for Reinforcement Learning from Videos

  • Paper: http://arxiv.org/pdf/2308.07795

  • Code: https://github.com/ai-initiative-kaust/videorlcs

Identity-Consistent Aggregation for Video Object Detection

  • Paper: http://arxiv.org/pdf/2308.07737

  • Code: None

UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation

  • Paper: http://arxiv.org/pdf/2308.07732

  • Code: https://github.com/haiyang-w/unitr

DiffGuard: Semantic Mismatch-Guided Out-of-Distribution Detection using Pre-trained Diffusion Models

  • Paper: http://arxiv.org/pdf/2308.07687

  • Code: https://github.com/cure-lab/diffguard

Boosting Multi-modal Model Performance with Adaptive Gradient Modulation

  • Paper: http://arxiv.org/pdf/2308.07686

  • Code: https://github.com/lihong2303/agm_iccv2023

Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval

  • Paper: http://arxiv.org/pdf/2308.07648

  • Code: None

Backpropagation Path Search On Adversarial Transferability

  • Paper: http://arxiv.org/pdf/2308.07625

  • Code: None

Story Visualization by Online Text Augmentation with Context Memory

  • Paper: http://arxiv.org/pdf/2308.07575

  • Code: None

3DHacker: Spectrum-based Decision Boundary Generation for Hard-label 3D Point Cloud Attack

  • Paper: http://arxiv.org/pdf/2308.07546

  • Code: None

Boosting Semi-Supervised Learning by bridging high and low-confidence predictions

  • Paper: http://arxiv.org/pdf/2308.07509

  • Code: None

DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation

  • Paper: http://arxiv.org/pdf/2308.07498

  • Code: https://github.com/hanqingwangai/Dreamwalker

Probabilistic MIMO U-Net: Efficient and Accurate Uncertainty Estimation for Pixel-wise Regression

  • Paper: http://arxiv.org/pdf/2308.07477

  • Code: https://github.com/antonbaumann/mimo-unet

PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects

  • Paper: http://arxiv.org/pdf/2308.07391

  • Code: None

DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding

  • Paper: http://arxiv.org/pdf/2308.07787

  • Code: https://github.com/joannahong/diffv2s

