AI agents still need humans to teach them

AI agents need skills (specific procedural knowledge) to perform tasks well, but they can’t teach themselves, new research suggests.

The researchers developed a new benchmark, SkillsBench, which evaluates agentic AI performance on 84 tasks across 11 domains, including healthcare, manufacturing, cybersecurity and software engineering. They ran each task under three conditions: no skills (the agent receives instructions only), curated skills (the agent is provided with a directory, code snippets and resources to help it) and self-generated skills (the agent has no skills but is prompted to develop them).
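For readers who think in code, the three conditions can be pictured roughly as follows. This is an illustrative sketch only, not SkillsBench's actual implementation: the names `Condition`, `Task` and `build_agent_input`, the dictionary layout and the wording of the prompt are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Condition(Enum):
    """The three evaluation conditions described in the article."""
    NO_SKILLS = "no_skills"
    CURATED_SKILLS = "curated_skills"
    SELF_GENERATED_SKILLS = "self_generated_skills"


@dataclass
class Task:
    """Hypothetical task record; field names are assumptions."""
    name: str
    instructions: str
    # Hypothetical curated material: file name -> contents.
    curated_skills: dict[str, str] = field(default_factory=dict)


def build_agent_input(task: Task, condition: Condition) -> dict:
    """Assemble what the agent would see under one condition."""
    # Every condition starts from the task instructions alone.
    agent_input = {"instructions": task.instructions, "skills": {}, "preamble": ""}
    if condition is Condition.CURATED_SKILLS:
        # The agent is handed human-written skills up front.
        agent_input["skills"] = dict(task.curated_skills)
    elif condition is Condition.SELF_GENERATED_SKILLS:
        # No skills are supplied; the agent is only prompted to write its own.
        agent_input["preamble"] = (
            "Before starting, write down the skills you need for this task."
        )
    return agent_input
```

Calling `build_agent_input(task, condition)` once per member of `Condition` would give three inputs for the same task. They differ only in whether skills are supplied, withheld, or left to the agent to write, and that difference is the comparison the benchmark draws.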

Typical tasks included auditing npm dependencies for security vulnerabilities and analyzing differential protein expression in cancer cell line data.