Submitted by akhaliq 29 InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding · 18 authors 511 5
Submitted by akhaliq 27 LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement · 9 authors 193 2
Submitted by akhaliq 16 Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance · 8 authors 4.25k 2
Submitted by akhaliq 15 ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars · 5 authors 219 1
Submitted by akhaliq 13 SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series · 2 authors 219 1
Submitted by akhaliq 11 DragAPart: Learning a Part-Level Motion Prior for Articulated Objects · 4 authors 1
Submitted by akhaliq 11 FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions · 8 authors 52 1
Submitted by akhaliq 10 AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models · 15 authors 2