Mid-Training with Self-Generated Data Improves Reinforcement Learning in Language Models Paper • 2605.08472 • Published 25 days ago • 5
AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration Paper • 2605.20025 • Published 14 days ago • 185
A2RBench: An Automatic Paradigm for Formally Verifiable Abstract Reasoning Benchmark Generation Paper • 2605.17278 • Published 16 days ago • 4
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 21 days ago • 195
Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization Paper • 2605.15980 • Published 18 days ago • 36
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 27 days ago • 101
Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing Paper • 2604.22782 • Published Apr 3 • 8
Beyond Prompts: Unconditional 3D Inversion for Out-of-Distribution Shapes Paper • 2604.14914 • Published Apr 16 • 5
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 242
Qualixar OS: A Universal Operating System for AI Agent Orchestration Paper • 2604.06392 • Published Apr 7 • 19
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published Apr 9 • 291
EgoSim: Egocentric World Simulator for Embodied Interaction Generation Paper • 2604.01001 • Published Apr 1 • 38
PixelPrune: Pixel-Level Adaptive Visual Token Reduction via Predictive Coding Paper • 2604.00886 • Published Apr 1 • 6
Dynin-Omni: Omnimodal Unified Large Diffusion Language Model Paper • 2604.00007 • Published Mar 9 • 19
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 343
When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning Paper • 2603.21289 • Published Mar 22 • 35
Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning Paper • 2602.11748 • Published Feb 12 • 38
Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models Paper • 2603.17051 • Published Mar 17 • 109
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published Mar 17 • 248