
Towards Self-Improving Agents via the Traveling Salesman Problem - an experimental open environment


Traveling Salesman RL Environment is a Prime Intellect residency project that hardens a classic 10-city TSP into a reusable, open-source RL and evaluation benchmark for LLMs. It uses tool-free prompts and a lenient parser that scores the validity of an answer rather than its format, and it comes with published hub runs for Gemini, Grok, Claude, Qwen, and Kimi. The environment lives on Prime Intellect’s Environment Hub (setrf/traveling-salesman) and on GitHub, and it shows how custom RL environments and synthetic data can push model reasoning beyond what the usual scaling tricks deliver.
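To make the idea of a lenient parser that scores validity instead of format more concrete, here is a minimal Python sketch. It is not the project's actual code: the function names (`parse_tour`, `tour_length`, `score`), the 0-based city numbering, and the reward formula (reference length divided by tour length, capped at 1) are all assumptions chosen for illustration.

```python
import math
import re


def parse_tour(text, n_cities=10):
    """Hypothetical lenient parser: pull every integer out of free-form
    model output and accept it if it visits each city exactly once.
    Assumes cities are numbered 0..n_cities-1 and that the answer text
    contains no other numbers (a real parser would need to be smarter)."""
    ids = [int(tok) for tok in re.findall(r"\d+", text)]
    # Accept a closed tour that repeats the start city at the end.
    if len(ids) == n_cities + 1 and ids[0] == ids[-1]:
        ids = ids[:-1]
    # Validity check: the tour must be a permutation of all cities.
    if sorted(ids) != list(range(n_cities)):
        return None
    return ids


def tour_length(tour, coords):
    """Euclidean length of the closed tour, including the return leg."""
    legs = zip(tour, tour[1:] + tour[:1])
    return sum(math.dist(coords[a], coords[b]) for a, b in legs)


def score(text, coords, reference_length):
    """Assumed reward: 0 for an invalid tour, otherwise the ratio of a
    reference tour length to the proposed tour length, capped at 1."""
    tour = parse_tour(text, len(coords))
    if tour is None:
        return 0.0
    length = tour_length(tour, coords)
    if length == 0:
        return 1.0
    return min(1.0, reference_length / length)


if __name__ == "__main__":
    # Four cities on a unit square; the perimeter (length 4) is optimal.
    square = [(0, 0), (0, 1), (1, 1), (1, 0)]
    print(score("Tour: [0, 1, 2, 3, 0]", square, 4.0))
    print(score("0 -> 2 -> 1 -> 3", square, 4.0))
    print(score("I am not sure.", square, 4.0))
```

Under these assumptions, the same tour written as a bracketed list, an arrow chain, or plain prose receives the same score, because only the extracted sequence of cities is checked.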
