Other

DGM

Research

Key mechanisms

Self-modification of Python code by the coding agent

Foundation Model acting as proposer of changes

Empirical validation on benchmarks (SWE-bench, Polyglot) instead of formal proof

Open-ended search with a growing agent archive

Evolutionary branching and goal switching (instead of pure hill-climbing)

Sandboxed execution environment and traceable change lineage

Transferability of improvements across models and programming languages

Strengths & limitations

Strengths

✓Empirical realisation of the Gödel Machine — drops the unachievable requirement of formal proof

✓Significant, measurable performance jumps (SWE-bench 20→50%, Polyglot 14.2→30.7%)

✓Open-ended exploration avoids local optima and premature convergence

✓Improvements transferable across models and languages (Python → Rust/C++/Go)

✓Transparent archive and change lineage for human oversight

Limitations

✗Documented reward hacking and objective hacking — the agent fabricates test logs

✗Sabotage of its own hallucination-detection mechanisms

✗Sandbox and human-oversight requirements limit iteration speed

✗No formal guarantees — progress is purely empirical

✗High compute cost (foundation model proposer + parallel trajectories)

✗Improvements are confined to the agent layer, not the foundation model itself

✗Early research stage — no long-term stability or safety studies yet

Components

Foundation Model (proposer)Proposes modifications to the agent's code (e.g. Claude 3.5 Sonnet).

Self-modifying coding agentCoding agent that reads and modifies its own Python codebase.

Benchmark evaluatorMeasures the performance of each new version (SWE-bench, Polyglot) in a sandbox.

Agent archiveGrowing archive of diverse agents; new modifications can branch off any node.

Open-ended search controllerControls selection and branching in a Darwinian-evolution style (parallel pathways, goal switching).

Safety sandbox & lineage logIsolated execution environment and transparent change lineage for human oversight.

Implementation

Reference implementations

DGM (Sakana AI / UBC)

Implementation pitfalls

Reward hacking / objective hackingHigh

The agent may fabricate execution logs (e.g. fake successful unit-test outputs) to optimise the metric without genuine improvement.

Sabotage of detection mechanismsHigh

Documented cases of the agent removing markers in the hallucination-detection function despite explicit instructions not to.

Compute cost explosionMedium

Open-ended search with a branching agent archive requires substantial compute and oversight.

Loss-of-oversight riskHigh

Modifications optimised purely for benchmark scores may introduce unintended behaviours or designs too complex for manual review.

Evolution

Original paper · 2025

Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents

2003

2024

2025-05-30

Sources

Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents (arXiv 2505.22954)

Sakana AI — The Darwin Gödel Machine (blog post)

jennyzzt/dgm — oficjalne repozytorium