ICCV 2023 Paper Digest (2023.8.16): Several Studies on Diffusion Models and SAM
2023-08-18 21:13
Latest results and demos:
Helping Hands: An Object-Aware Ego-Centric Video Recognition Model
Paper: http://arxiv.org/pdf/2308.07918
Code: https://github.com/chuhanxx/helping_hand_for_egocentric_videos
Memory-and-Anticipation Transformer for Online Action Understanding
Paper: http://arxiv.org/pdf/2308.07893
Code: https://github.com/echo0125/memory-and-anticipation-transformer
ObjectSDF++: Improved Object-Compositional Neural Implicit Surfaces
Paper: http://arxiv.org/pdf/2308.07868
Code: https://github.com/qianyiwu/objectsdf_plus
StyleDiffusion: Controllable Disentangled Style Transfer via Diffusion Models
Paper: http://arxiv.org/pdf/2308.07863
Code: None
ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition
Paper: http://arxiv.org/pdf/2308.07815
Code: https://github.com/cool-xuan/imbalanced_sam
Learning to Identify Critical States for Reinforcement Learning from Videos
Paper: http://arxiv.org/pdf/2308.07795
Code: https://github.com/ai-initiative-kaust/videorlcs
Identity-Consistent Aggregation for Video Object Detection
Paper: http://arxiv.org/pdf/2308.07737
Code: None
UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation
Paper: http://arxiv.org/pdf/2308.07732
Code: https://github.com/haiyang-w/unitr
DiffGuard: Semantic Mismatch-Guided Out-of-Distribution Detection using Pre-trained Diffusion Models
Paper: http://arxiv.org/pdf/2308.07687
Code: https://github.com/cure-lab/diffguard
Boosting Multi-modal Model Performance with Adaptive Gradient Modulation
Paper: http://arxiv.org/pdf/2308.07686
Code: https://github.com/lihong2303/agm_iccv2023
Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval
Paper: http://arxiv.org/pdf/2308.07648
Code: None
Backpropagation Path Search On Adversarial Transferability
Paper: http://arxiv.org/pdf/2308.07625
Code: None
Story Visualization by Online Text Augmentation with Context Memory
Paper: http://arxiv.org/pdf/2308.07575
Code: None
3DHacker: Spectrum-based Decision Boundary Generation for Hard-label 3D Point Cloud Attack
Paper: http://arxiv.org/pdf/2308.07546
Code: None
Boosting Semi-Supervised Learning by Bridging High and Low-Confidence Predictions
Paper: http://arxiv.org/pdf/2308.07509
Code: None
DREAMWALKER: Mental Planning for Continuous Vision-Language Navigation
Paper: http://arxiv.org/pdf/2308.07498
Code: https://github.com/hanqingwangai/Dreamwalker
Probabilistic MIMO U-Net: Efficient and Accurate Uncertainty Estimation for Pixel-wise Regression
Paper: http://arxiv.org/pdf/2308.07477
Code: https://github.com/antonbaumann/mimo-unet
PARIS: Part-level Reconstruction and Motion Analysis for Articulated Objects
Paper: http://arxiv.org/pdf/2308.07391
Code: None
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding
Paper: http://arxiv.org/pdf/2308.07787
Code: https://github.com/joannahong/diffv2s