Training an RL Agent to Play Slay the Spire 2

A distributed PPO (Proximal Policy Optimization) agent that learns to play Slay the Spire 2 combat from scratch, using a custom C# game mod, TCP bridge, and Python training pipeline.

Python · PyTorch · PPO · C# + Harmony · Godot · TCP Sockets · Distributed Training
Current run · Stage 12
Curriculum stages reached: 12 / 13 (Stage 12 in progress)
Input vector dimensions: 583 (61-action head)
PPO updates: 36k+ (across 11 workers)
Combats simulated: 800k+ (across 13 curriculum stages)
STS2 combat — Silent, Stage 12 (Slay the Spire 2 © Mega Crit Games; not affiliated.)

What is this?

Slay the Spire 2 is a roguelike deck-building game where you build a deck of cards and fight through a series of increasingly difficult enemies. This project trains a neural network to play the combat portion of the game using reinforcement learning (RL).

The agent sees the same information a human player does — cards in hand, enemy HP and intent, active buffs/debuffs, relics — and decides which card to play or whether to end its turn. It learns entirely through self-play, starting from random actions and gradually improving through PPO.
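For readers unfamiliar with PPO, the core of each update is the clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The following is a minimal plain-Python sketch of that standard objective, not the project's PyTorch implementation; the function name and the `clip_eps=0.2` default are illustrative assumptions.

```python
import math


def ppo_clip_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss (a value to minimize).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current policy and under the policy that collected the rollout.
    advantages: advantage estimates for those actions.
    clip_eps: clipping range (0.2 is a common default, assumed here).
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Ratio restricted to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate because optimizers minimize, while PPO maximizes the surrogate.
    return -total / len(advantages)
```

When the ratio is 1 (the policy has not changed), the loss is simply the negated mean advantage. Once the ratio leaves the clipping range in the direction the advantage favors, the objective stops rewarding further movement.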

How it works

Architecture

The system runs across multiple machines. Each worker laptop runs several headless game instances connected to a local actor proxy. The actor proxies communicate with a central learner that collects rollouts and trains the shared policy.
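The wire format used between game instances, actor proxies, and the learner is not described here. As a rough illustration of how messages can be exchanged over a raw TCP stream, the sketch below assumes length-prefixed JSON: a 4-byte big-endian length followed by a UTF-8 JSON body. The helper names `send_msg` and `recv_msg` and the message fields are hypothetical.

```python
import json
import socket
import struct


def send_msg(sock, payload):
    """Send one JSON message preceded by a 4-byte big-endian length (assumed framing)."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack(">I", len(body)) + body)


def _recv_exact(sock, n):
    """Read exactly n bytes; TCP may deliver a message in several chunks."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Receive one length-prefixed JSON message and decode it."""
    (length,) = struct.unpack(">I", _recv_exact(sock, 4))
    return json.loads(_recv_exact(sock, length).decode("utf-8"))


if __name__ == "__main__":
    # Local demonstration with a connected socket pair instead of real machines.
    game_side, proxy_side = socket.socketpair()
    send_msg(game_side, {"type": "action_request", "legal_actions": [0, 4, 60]})
    print(recv_msg(proxy_side))
    game_side.close()
    proxy_side.close()
```

An explicit length prefix matters because TCP is a byte stream with no message boundaries: without it, the receiver cannot tell where one observation or action ends and the next begins.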

Training Curriculum

Training follows a 13-stage curriculum (stages 0 through 12) that gradually increases difficulty.

Stages 0–2: Starter deck first, then basic attack/block and debuff buckets, against single Act 1 normal enemies.
Stages 3–5: Draw (stage 3), poison (stage 4), and shiv (stage 5) buckets added one at a time; multi-enemy encounters unlock at stage 3.
Stages 6–7: Full non-power card pool against all normal enemies.
Stages 8–12: Elite enemies introduced at stage 8, scaling-power buckets at stage 9, and the combo bucket at stage 10.
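The unlock schedule in this list can be expressed as a small lookup table. The sketch below encodes only the stage numbers stated explicitly above and leaves out anything the list does not pin to a specific stage, such as which of stages 1 and 2 introduces which early bucket. The names `UNLOCK_STAGE` and `unlocked` are hypothetical, not taken from the project's code.

```python
# First stage at which each card bucket or encounter type appears,
# limited to the stage numbers given explicitly in the list above.
UNLOCK_STAGE = {
    "multi_enemy": 3,
    "draw": 3,
    "poison": 4,
    "shiv": 5,
    "elites": 8,
    "scaling_power": 9,
    "combo": 10,
}

NUM_STAGES = 13  # stages are numbered 0 through 12


def unlocked(stage):
    """Return the sorted names of everything available at the given stage."""
    if not 0 <= stage < NUM_STAGES:
        raise ValueError(f"stage must be in 0..{NUM_STAGES - 1}, got {stage}")
    return sorted(name for name, first in UNLOCK_STAGE.items() if first <= stage)


if __name__ == "__main__":
    for stage in (0, 4, 12):
        print(stage, unlocked(stage))
```

Keeping the schedule as data rather than branching logic makes it easy to audit which buckets and encounter types a given stage exposes to the agent.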

Current Results

The agent is currently training on Stage 12 (the final stage) with the full Silent card pool against Act 1 elite enemies. Check the blog tabs for detailed write-ups on specific challenges and technical deep dives.

Tech Stack

Python 3.12 · PyTorch · PPO · C# / .NET 9 · Harmony · Godot Engine · TCP Sockets · Chart.js · GitHub Actions