Snowflake Releases Agent World Model: Generating 1,000 Environments with Code to Train AI Agents
Snowflake's research team introduces the Agent World Model, generating 1,000 synthetic environments via code to address the challenge of environment scarcity in AI agent training. This approach achieved significant improvements across three out-of-distribution benchmarks.

Training AI agents requires a vast number of environments, but acquiring real-world environments is costly. Snowflake's research team took a different approach, using code to generate 1,000 synthetic environments.
These environments are not virtual worlds simulated by LLMs but executable, code-driven systems. Each environment exposes an average of 35 tools, for a total of roughly 35,000 tools and 10,000 tasks across the set. The key difference lies in the reliability of state transitions: because every tool call runs real code against real state, code environments provide stable learning signals and avoid the uncertainty inherent in LLM simulations.
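To make the contrast concrete, here is a minimal sketch of what a code-driven environment could look like. It is not taken from the Agent World Model repository: the class `TodoEnv`, its two tools, and the `verify` check are hypothetical names chosen for illustration, and the in-memory SQLite database simply stands in for the kind of database-backed state the article describes.

```python
import sqlite3


class TodoEnv:
    """Hypothetical toy environment: state lives in SQLite, tools are plain methods."""

    def __init__(self):
        # In-memory database so every episode starts from the same clean state.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE tasks ("
            "id INTEGER PRIMARY KEY, title TEXT NOT NULL, done INTEGER DEFAULT 0)"
        )

    # --- tools the agent may call -------------------------------------
    def add_task(self, title):
        """Insert a task and return its id."""
        cur = self.db.execute("INSERT INTO tasks (title) VALUES (?)", (title,))
        self.db.commit()
        return cur.lastrowid

    def complete_task(self, task_id):
        """Mark a task as done; return True only if the task exists."""
        cur = self.db.execute("UPDATE tasks SET done = 1 WHERE id = ?", (task_id,))
        self.db.commit()
        return cur.rowcount == 1

    # --- reward check ---------------------------------------------------
    def verify(self):
        """Return 1.0 if at least one task exists and all tasks are done, else 0.0."""
        total, done = self.db.execute(
            "SELECT COUNT(*), COALESCE(SUM(done), 0) FROM tasks"
        ).fetchone()
        return 1.0 if total > 0 and total == done else 0.0
```

Because the same sequence of tool calls always yields the same database state, the reward from `verify` is reproducible, which is the property the article attributes to code-driven environments.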
The research team generated all environments automatically from just 100 seed names, with every environment backed by a real SQLite database. They then used the GRPO algorithm to train Qwen3 models (4B, 8B, and 14B parameters) with reinforcement learning, running 1,024 environment instances in parallel per training step.
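For readers unfamiliar with GRPO, its core idea is to score each rollout relative to the other rollouts sampled for the same task, rather than against a learned value function. The sketch below shows only that group-relative advantage step. The function name, the `eps` constant, and the use of the population standard deviation are assumptions for illustration; the article does not describe the team's actual training code or group size.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one group of rollouts for the same task.

    Each advantage is (reward - group mean) / (group std + eps).
    eps (an assumed value) avoids division by zero when all rewards are equal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: four rollouts of one task, two of which passed the environment check.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Successful rollouts receive positive advantages and failed ones negative advantages. When every rollout in a group gets the same reward, all advantages are zero and that group contributes no gradient signal.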
The results are strong. On the BFCLv3 benchmark, the 8B model's score rose from 53.83 to 65.94, a gain of 12.11 points. More importantly, the method improved results on three entirely different out-of-distribution benchmarks, whereas traditional methods are often tuned to a specific benchmark.
A user on Hugging Face commented that this is currently the only method that surpasses the baseline across all three benchmarks. Code-driven environments not only offer more stable learning signals but also boast execution efficiency several orders of magnitude higher than LLM-simulated environments.
All code, environments, and models have been open-sourced, and researchers can access the dataset on Hugging Face. The approach offers a new paradigm for large-scale agent training: instead of waiting for real environments, why not create them with code?
Paper: https://arxiv.org/abs/2602.10090
Code Repository: https://github.com/Snowflake-Labs/agent-world-model
Dataset: https://huggingface.co/datasets/Snowflake/AgentWorldModel-1K
Published: 2026-02-16 09:07