VLingNav: Embodied Navigation with Adaptive Reasoning and Visual-Assisted Linguistic Memory • Paper • 2601.08665 • Published 1 day ago • 6
SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices • Paper • 2601.08303 • Published 1 day ago • 7
JudgeRLVR: Judge First, Generate Second for Efficient Reasoning • Paper • 2601.08468 • Published 1 day ago • 5
Dr. Zero: Self-Evolving Search Agents without Training Data • Paper • 2601.07055 • Published 3 days ago • 10
SketchJudge: A Diagnostic Benchmark for Grading Hand-drawn Diagrams with Multimodal Large Language Models • Paper • 2601.06944 • Published 3 days ago • 1
X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests • Paper • 2601.06953 • Published 3 days ago • 39
IIB-LPO: Latent Policy Optimization via Iterative Information Bottleneck • Paper • 2601.05870 • Published 5 days ago • 2
Over-Searching in Search-Augmented Large Language Models • Paper • 2601.05503 • Published 6 days ago • 5
VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction • Paper • 2601.05966 • Published 5 days ago • 21
Goal Force: Teaching Video Models To Accomplish Physics-Conditioned Goals • Paper • 2601.05848 • Published 5 days ago • 13
GenCtrl -- A Formal Controllability Toolkit for Generative Models • Paper • 2601.05637 • Published 6 days ago • 3
VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control • Paper • 2601.05138 • Published 6 days ago • 16
VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice • Paper • 2601.05175 • Published 6 days ago • 32