2601.03335v1

Digital Red Queen: Adversarial Program Evolution in Core War with LLMs

Abstract

Large language models (LLMs) are increasingly being used to evolve solutions to problems in many domains, in a process inspired by biological evolution. However, unlike biological evolution, most LLM-evolution frameworks are formulated as static optimization problems, overlooking the open-ended adversarial dynamics that characterize real-world evolutionary processes. Here, we study Digital Red Queen (DRQ), a simple self-play algorithm that embraces these so-called “Red Queen” dynamics via continual adaptation to a changing objective. DRQ uses an LLM to evolve assembly-like programs, called warriors, which compete against each other for control of a virtual machine in the game of Core War, a Turing-complete environment studied in artificial life and connected to cybersecurity. In each round of DRQ, the model evolves a new warrior to defeat all previous ones, producing a sequence of adapted warriors. Over many rounds, we observe that warriors become increasingly general (relative to a set of held-out human warriors). Interestingly, warriors also become less behaviorally diverse across independent runs, indicating a convergence pressure toward a general-purpose behavioral strategy, much like convergent evolution in nature. This result highlights a potential value of shifting from static objectives to dynamic Red Queen objectives. Our work positions Core War as a rich, controllable sandbox for studying adversarial adaptation in artificial systems and for evaluating LLM-based evolution methods. More broadly, the simplicity and effectiveness of DRQ suggest that similarly minimal self-play approaches could prove useful in other more practical multi-agent adversarial domains, like real-wor…

Authors

SakanaAI

Summary

The paper introduces Digital Red Queen (DRQ), a self-play evolutionary framework that uses LLMs as mutation operators to navigate the strategic complexity of Core War. By replacing static fitness functions with a dynamic adversarial lineage, DRQ induces “Red Queen” dynamics where agents must continually innovate to maintain relative fitness. The study demonstrates that iterative adaptation against a historical pool of adversaries leads to the emergence of robust, generalist combat strategies and reveals a phenomenon of phenotypic convergence in the space of Turing-complete programs.

Mechanism

Adversarial Self-Play Loop: Let $\mathcal{W}$ be the space of Redcode programs. DRQ constructs a sequence of warriors $w_1, w_2, \ldots$ where $w_t$ is optimised against the history pool $H_t = \{w_{t-k}, \ldots, w_{t-1}\}$. In round $t$, the objective is to find:

$$w_t = \arg\max_{w \in \mathcal{W}} \; \frac{1}{|H_t|} \sum_{w' \in H_t} f(w, w'),$$

where $f$ is the battle fitness. History length $k$ controls the look-back window; $k = t - 1$ (full DRQ, i.e. the entire history) mitigates cyclic Rock-Paper-Scissors dynamics.
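The outer loop above can be sketched in a few lines of Python. Here `llm_evolve` and `battle_fitness` are hypothetical stand-ins (not the paper's implementation) for the LLM mutation step and the Core War simulator:

```python
import random

def battle_fitness(warrior, opponent):
    """Hypothetical stand-in for a Core War battle; returns a score in [0, 1]."""
    random.seed(hash((warrior, opponent)) % (2 ** 32))
    return random.random()

def llm_evolve(history, n_candidates=4):
    """Hypothetical stand-in for LLM-driven mutation: propose candidate warriors."""
    return [f"warrior-gen{len(history)}-cand{i}" for i in range(n_candidates)]

def drq(rounds=5, seed_warrior="MOV 0, 1"):
    """Digital Red Queen loop: each round, keep the candidate that best
    defeats the entire history pool (full DRQ, unbounded look-back)."""
    history = [seed_warrior]
    for _ in range(rounds):
        candidates = llm_evolve(history)
        # Mean battle fitness against every previous warrior in the pool.
        best = max(candidates,
                   key=lambda w: sum(battle_fitness(w, h) for h in history) / len(history))
        history.append(best)
    return history

lineage = drq()
print(len(lineage))  # seed warrior + one adapted warrior per round
```

The seed warrior here is the classic Core War "Imp" (`MOV 0, 1`), chosen purely for illustration.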

Fitness Function: For $n$ warriors in a battle of $T$ cycles, fitness is the cumulative time-distributed reward:

$$f_i = \sum_{t=1}^{T} \frac{a_i(t)}{\sum_{j=1}^{n} a_j(t)},$$

where $a_i(t) \in \{0, 1\}$ denotes whether warrior $i$ is active at time $t$. This incentivises both longevity and the rapid termination of competitors.
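A minimal sketch of this time-distributed reward, assuming each warrior's per-cycle activity is recorded as a row of 0/1 flags:

```python
def time_distributed_fitness(activity):
    """activity[i][t] is 1 if warrior i is alive at cycle t, else 0.
    Each cycle's unit reward is split evenly among the active warriors;
    a warrior's fitness is its cumulative share across all cycles."""
    n = len(activity)
    T = len(activity[0])
    fitness = [0.0] * n
    for t in range(T):
        active = sum(activity[i][t] for i in range(n))
        if active == 0:
            continue  # no survivors this cycle: no reward to distribute
        for i in range(n):
            fitness[i] += activity[i][t] / active
    return fitness

# Warrior 0 survives all 4 cycles; warrior 1 dies after cycle 1.
print(time_distributed_fitness([[1, 1, 1, 1],
                                [1, 0, 0, 0]]))  # [3.5, 0.5]
```

The example shows both incentives at once: warrior 0 earns a full point per cycle once it is alone, so outliving (or quickly killing) an opponent directly raises its score.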

Intra-round Optimisation (MAP-Elites): Within each round, search is conducted via MAP-Elites to preserve strategic diversity, with each warrior binned by a low-dimensional behavioural descriptor extracted from its runtime behaviour. The LLM (GPT-4.1 mini) performs “informed” mutations and crossovers on elite solutions, with a Redcode manual provided in the system prompt to keep edits within the functional program subspace.
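A minimal MAP-Elites sketch, with `describe`, `evaluate`, and `mutate` as hypothetical stand-ins for the behavioural-descriptor extraction, battle fitness, and LLM mutation call (the toy usage below searches over integers rather than Redcode programs):

```python
import random

def map_elites(seed, evaluate, describe, mutate, iterations=200):
    """Keep one elite per behavioural-descriptor cell; mutate randomly
    chosen elites and replace a cell's occupant only when the newcomer
    scores higher. Returns the final archive."""
    archive = {}  # descriptor cell -> (fitness, solution)
    archive[describe(seed)] = (evaluate(seed), seed)
    for _ in range(iterations):
        _, parent = random.choice(list(archive.values()))
        child = mutate(parent)
        cell = describe(child)
        fit = evaluate(child)
        if cell not in archive or fit > archive[cell][0]:
            archive[cell] = (fit, child)
    return archive

# Toy usage: integers as "programs", descriptor = value mod 5.
random.seed(0)
archive = map_elites(seed=1,
                     evaluate=lambda x: -abs(x - 17),
                     describe=lambda x: x % 5,
                     mutate=lambda x: x + random.choice([-3, -1, 1, 3]))
print(sorted(archive))  # at most 5 occupied descriptor cells
```

The archive structure is what preserves diversity: a mediocre warrior with an unusual behavioural signature survives in its own cell instead of being crowded out by the global best.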

Findings

Generality Emergence: DRQ warriors exhibit superior zero-shot performance against held-out human-written warriors compared to static-optimisation baselines. Generality increases monotonically with the round index $t$.

Convergent Evolution: As the number of rounds grows, independent DRQ runs exhibit phenotypic convergence: variance in the behavioural descriptor vector decreases across runs, while genotypic variance (pairwise distance in the embedding space of the source code) remains high. This mirrors biological convergent evolution, where similar functional traits emerge from distinct genetic lineages.
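One way to quantify this contrast, sketched with toy data (the vectors and metrics here are illustrative assumptions, not the paper's measurements): phenotypic spread as the mean per-dimension variance of behavioural vectors across runs, and genotypic spread as the mean pairwise distance between code embeddings.

```python
import math

def mean_variance(vectors):
    """Mean per-dimension variance across runs (phenotypic spread)."""
    n, d = len(vectors), len(vectors[0])
    total = 0.0
    for j in range(d):
        col = [v[j] for v in vectors]
        mu = sum(col) / n
        total += sum((x - mu) ** 2 for x in col) / n
    return total / d

def mean_pairwise_distance(vectors):
    """Mean Euclidean distance over all pairs (genotypic spread)."""
    n = len(vectors)
    dists = [math.dist(vectors[i], vectors[j])
             for i in range(n) for j in range(i + 1, n)]
    return sum(dists) / len(dists)

# Toy data: behaviours cluster tightly while embeddings stay spread out,
# the signature of convergence in phenotype but not genotype.
behaviours = [[0.90, 0.10], [0.92, 0.11], [0.88, 0.09]]
embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
print(mean_variance(behaviours) < mean_variance(embeddings))  # True
```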

Generality Prediction: A linear probe trained on text embeddings of Redcode source code predicts a warrior's generality. This suggests that the strategic utility of a program is partially encoded in its static structural features, allowing for potential surrogate-based search acceleration.
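Such a probe amounts to ordinary least squares on frozen embeddings. A sketch with synthetic data standing in for real code embeddings and generality scores (the linear-plus-noise construction is an assumption made so the probe has something to recover):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 100 "warriors", 16-dim frozen text embeddings, and a
# generality score that is (by construction) a noisy linear readout.
X = rng.normal(size=(100, 16))
true_w = rng.normal(size=16)
y = X @ true_w + 0.1 * rng.normal(size=100)

# Linear probe: least-squares fit with an appended bias column.
Xb = np.hstack([X, np.ones((100, 1))])
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
pred = Xb @ w

# Coefficient of determination of the fitted probe.
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(float(r2), 3))
```

If a cheap probe like this tracks generality well, it can serve as a surrogate fitness, screening LLM-proposed candidates before spending full Core War battles on them.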