agents & frameworks

jun 02 · agents · When Agent Skill Libraries Scale, Dependency-Aware Retrieval Beats Flat Search
jun 02 · agents · Can Instruction-Tuned Retrievers Fix Agentic Search's Retrieval Gap?
jun 02 · agents · Bandit-Based Prompt Optimization Targets Multi-Agent Systems Like CrewAI and AutoGen
may 31 · agents · What Breaks When Claude Code Writes Production Code: A New Failure Catalog
may 31 · agents · More Agents, Worse Results: Why Multi-Agent LLM Teams Hold Experts Back
may 28 · agents · Multi-Agent LLM Coordination: Why Attention Steering Beats Full Broadcast
may 28 · agents · DataClawBench: AI Agents Fail at Exploratory Financial Analysis Across 492 Tasks
may 28 · agents · Agentic RAG Has a Credit-Assignment Problem That Subgoaling Tries to Fix
may 27 · agents · SkillOpt Treats Agent Skill Libraries as an Executive Scheduling Problem, Not a Memory Bank
may 27 · agents · How Claude's Honesty Layer Prevents Cascade Failures in Agentic Loops
may 27 · agents · Claude Code Dynamic Workflows: Spawning 100 Parallel Subagents on Opus 4.8
may 26 · agents · Claude Code Configs in the Wild: New Study Maps How Developers Actually Use It
may 26 · agents · Penetration Testing Multi-Agent LLM Systems: A Failure Catalog Vendors Don't Document
may 26 · agents · Claude Code, Cursor, Copilot: How Agentic Coding Assistants Get Weaponized as Attacker Shells
may 25 · agents · Microsoft Bolts Governance Onto Agent Framework as Stack Sprawl Persists
may 25 · agents · GovernSpec Contractual Skills Make Agent Governance Auditable Before Runtime
may 25 · agents · Indirect Prompt Injection Benchmarks Were Too Easy: LivePI Adds Realism
may 25 · agents · Routing LLM Agents: Why TwinRouterBench Splits Static and Live Evaluation
may 22 · agents · SpecBench Exposes Reward Hacking in Long-Horizon Coding Agents
may 22 · agents · GraphFlow Lifts LLM-Agent Workflows Into Schedulable Graphs to Optimize Serving
may 22 · agents · Learning to Configure Agentic AI Systems Exposes a Gap in CrewAI and AutoGen Template Libraries
may 22 · agents · Microsoft's 2026 Cost Math Forces CrewAI and LangGraph Users to Audit Token Spend Per Agent
may 22 · agents · PBT-Bench Asks Whether AI Coding Agents Can Actually Write Property-Based Tests
may 22 · agents · SpecBench Catches Long-Horizon Coding Agents Gaming Reward Signals
may 22 · agents · Beyond Text-to-SQL: New Agentic Architecture Routes Enterprise Analytics Through Governed APIs
may 22 · agents · AI Agents That Learn New Skills Without a Human Curator
may 18 · agents · Trojan Hippo Plants Dormant Payloads in Agent Memory, Hits 85-100% Exfiltration on Frontier Models
may 18 · agents · A New Trust Schema Exposes Why Agent Skill Registries Fail Enterprise Audit Requirements
may 17 · agents · LangGraph 1.2.0 Makes Error-Handler Resume Crash-Durable: With Conditions
may 17 · agents · CrewAI vs AutoGen vs LangGraph 2026: The Real Trade-Off After Maintenance Mode
may 17 · agents · FormulaCode's 957-Task Benchmark Catches Frontier Agents Failing at Real-Codebase Performance Optimization
may 17 · agents · Spectral Analysis of LLM Agent Graphs Predicts Three Failure Modes: r=1.0, 0.5, and -1.0 on Qwen2.5
may 16 · agents · IFPV's Adversarial Cognitive Simulation Cuts Multi-Agent Operational Cost 41.7% Over Single-Step LLMs
apr 28 · agents · LLM Agent for Iterative Chart Refinement Exposes a Logging Gap in CrewAI and AutoGen
apr 28 · agents · CrewAI 1.14.2 Lands Checkpoint TUI with Tree View, Fork Support, and Lineage Tracking
apr 28 · agents · Council Mode Cuts Multi-Agent LLM Hallucination 35.9% at 4.2x Token Cost on HaluEval
apr 28 · agents · Salesforce TDX 2026: Headless 360 Ships 60+ MCP Tools and Agentforce Vibes 2.0 With Claude Sonnet 4.5
apr 23 · agents · Cloudflare Agents Week Moved Sandbox Execution, Private Networking, and Memory to Network Primitives
apr 22 · agents · Diversity Collapse in Multi-Agent LLM Systems: Structural Coupling, Not Topology, Breaks Open-Ended Ideation
apr 21 · agents · ml-intern's 32% GPQA Gain on One H100 Exposes the Assumption That Post-Training Still Needs a Human Researcher
mar 26 · agents · InsForge: The Backend Framework Built for Agentic Applications
mar 14 · agents · AI Agents That Actually Learn: The Architecture Behind Hindsight Memory
feb 27 · agents · Superpowers: The Agentic Framework Replacing Your Dev Process
feb 26 · agents · How AI Agents Remember: Memory Architectures That Work
feb 17 · agents · Function Calling Best Practices: LLMs That Actually Use APIs Correctly
feb 11 · agents · CrewAI vs AutoGen: A Developer's Guide to Multi-Agent AI Frameworks
feb 11 · agents · Are AI-Generated PRs Killing Open Source?
feb 10 · agents · Pydantic AI vs LangChain: A Developer's Guide to the New Generation of Agent Frameworks
feb 10 · agents · How to Build Your First Autonomous Coding Agent with OpenHands SDK