admarcosai's Collections: Agentics
GAIA: a benchmark for General AI Assistants
Paper
• 2311.12983
• Published
• 246
ToolTalk: Evaluating Tool-Usage in a Conversational Setting
Paper
• 2311.10775
• Published
• 9
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems
Paper
• 2311.11315
• Published
• 7
An Embodied Generalist Agent in 3D World
Paper
• 2311.12871
• Published
• 8
Pearl: A Production-ready Reinforcement Learning Agent
Paper
• 2312.03814
• Published
• 15
CogAgent: A Visual Language Model for GUI Agents
Paper
• 2312.08914
• Published
• 31
AppAgent: Multimodal Agents as Smartphone Users
Paper
• 2312.13771
• Published
• 54
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Paper
• 2401.04658
• Published
• 27
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Paper
• 2402.01622
• Published
• 38
Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios
Paper
• 2401.17167
• Published
• 1
Language Models, Agent Models, and World Models: The LAW for Machine Reasoning and Planning
Paper
• 2312.05230
• Published
Large Language Models as Zero-shot Dialogue State Tracker through Function Calling
Paper
• 2402.10466
• Published
• 18
An Interactive Agent Foundation Model
Paper
• 2402.05929
• Published
• 30
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Paper
• 2412.14161
• Published
• 51