Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation • Paper • 2602.12125 • Published 3 days ago • 55
LawThinker: A Deep Research Legal Agent in Dynamic Environments • Paper • 2602.12056 • Published 3 days ago • 31
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration • Paper • 2602.05400 • Published 10 days ago • 309
GISA: A Benchmark for General Information-Seeking Assistant • Paper • 2602.08543 • Published 6 days ago • 26
DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation • Paper • 2601.09688 • Published Jan 14 • 126
Toward Efficient Agents: Memory, Tool learning, and Planning • Paper • 2601.14192 • Published 26 days ago • 54
ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback • Paper • 2601.10156 • Published Jan 15 • 26
ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration • Paper • 2601.06860 • Published Jan 11 • 16
EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis • Paper • 2601.05808 • Published Jan 9 • 36
Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting • Paper • 2601.02151 • Published Jan 5 • 109
ROI-Reasoning: Rational Optimization for Inference via Pre-Computation Meta-Cognition • Paper • 2601.03822 • Published Jan 7 • 24
ProcessBench: Identifying Process Errors in Mathematical Reasoning • Paper • 2412.06559 • Published Dec 9, 2024 • 86
Seed-Prover 1.5: Mastering Undergraduate-Level Theorem Proving via Learning from Experience • Paper • 2512.17260 • Published Dec 19, 2025 • 52