UI-KOBE: Knowledge-Oriented Behavior Exploration for Lightweight Graph-Guided GUI Agents Paper • 2605.29534 • Published 10 days ago • 15
MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research Paper • 2605.26114 • Published 13 days ago • 64
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing Paper • 2603.09877 • Published Mar 10 • 49
PIRA-Bench: A Transition from Reactive GUI Agents to GUI-based Proactive Intent Recommendation Agents Paper • 2603.08013 • Published Mar 9 • 16
CLI-Gym: Scalable CLI Task Generation via Agentic Environment Inversion Paper • 2602.10999 • Published Feb 11 • 11
FeatureBench: Benchmarking Agentic Coding for Complex Feature Development Paper • 2602.10975 • Published Feb 11 • 18
MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments Paper • 2602.06075 • Published Feb 3 • 14
FullStack-Agent: Enhancing Agentic Full-Stack Web Coding via Development-Oriented Testing and Repository Back-Translation Paper • 2602.03798 • Published Feb 3 • 10
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos Paper • 2601.00393 • Published Jan 1 • 132
MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning Paper • 2510.14958 • Published Oct 16, 2025 • 23
DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving Paper • 2510.12796 • Published Oct 14, 2025 • 13
WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning Paper • 2509.22644 • Published Sep 26, 2025 • 21
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing Paper • 2509.22651 • Published Sep 26, 2025 • 23
A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence Paper • 2507.21046 • Published Jul 28, 2025 • 85
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Paper • 2506.09350 • Published Jun 11, 2025 • 48
UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents Paper • 2505.21496 • Published May 27, 2025 • 38
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning Paper • 2505.10557 • Published May 15, 2025 • 50
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch Paper • 2505.03733 • Published May 6, 2025 • 17
LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects Paper • 2504.19838 • Published Apr 28, 2025 • 23