PyVision-RL: Forging Open Agentic Vision Models via RL Paper • 2602.20739 • Published 15 days ago • 29
LongCLI-Bench: A Preliminary Benchmark and Study for Long-horizon Agentic Programming in Command-Line Interfaces Paper • 2602.14337 • Published 24 days ago • 13
Yume-1.5: A Text-Controlled Interactive World Generation Model Paper • 2512.22096 • Published Dec 26, 2025 • 60
TIR-Bench: A Comprehensive Benchmark for Agentic Thinking-with-Images Reasoning Paper • 2511.01833 • Published Nov 3, 2025 • 16
Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search Paper • 2509.07969 • Published Sep 9, 2025 • 59
Symbolic Graphics Programming with Large Language Models Paper • 2509.05208 • Published Sep 5, 2025 • 47
SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning Paper • 2509.02479 • Published Sep 2, 2025 • 84
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21, 2025 • 268
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers Paper • 2508.14704 • Published Aug 20, 2025 • 43
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning Paper • 2508.08221 • Published Aug 11, 2025 • 50
Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning Paper • 2507.16746 • Published Jul 22, 2025 • 34
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning Paper • 2506.10521 • Published Jun 12, 2025 • 73
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention Paper • 2506.13585 • Published Jun 16, 2025 • 273
No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL Article • Published Jun 3, 2025 • 100