Promptbio v0.1 — Prompt Genome Mapper

Map a prompt's genome.

Paste a prompt. The mapper decomposes it into 14 functional loci, scores each, surfaces missing or weak genes, predicts likely failure modes, and proposes a mutation plan. The analysis is a static heuristic: no live LLM is called, and the output is deterministic given the input.

v0.1 scope per the handoff: static analyzer only. No live LLM evaluation, no evolutionary search, no governance loop. See /theory for the framework and the scope-boundary paragraph.

Input

Genome score

0.00

14-loci map

Locus	Status	Evidence	Risk if missing	Suggested mutation

Mutation plan

Concrete edits the operator can apply. Re-analyze the mutated prompt and verify the score increases.

Tests to run

Promptbio v0.2 — Prompt Genome Diff

Compare two prompts.

Paste an ancestor and a descendant prompt. The diff computes a content-addressed mutation ledger over the 14-locus vector, classifies each change into one of six mutation kinds, projects a 7-axis phenotype shift, and flags regressions and new risks. Cosmetic rewording yields ΔG ≈ 0 — diffs live at the genotype layer, not the text layer.
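A minimal sketch of why cosmetic rewording gives ΔG ≈ 0: if the distance is computed over the locus-status vector rather than over the text, two prompts that map to the same vector are at distance zero however different their wording. The metric (mean absolute per-locus difference), the locus names, and the 0 to 1 status encoding are all assumptions for illustration; the page does not state the actual ΔG formula.

```python
from typing import Dict


def delta_g(ancestor: Dict[str, float], descendant: Dict[str, float]) -> float:
    """Mean absolute per-locus status difference (hypothetical metric)."""
    loci = sorted(set(ancestor) | set(descendant))
    if not loci:
        return 0.0
    total = sum(abs(descendant.get(l, 0.0) - ancestor.get(l, 0.0)) for l in loci)
    return total / len(loci)


# Hypothetical locus vectors (status in [0, 1]); the real mapper uses 14 loci.
p0 = {"role": 1.0, "constraints": 0.5, "output_format": 0.0}
p0_reworded = dict(p0)  # same genes, different wording: identical vector
p1 = {"role": 1.0, "constraints": 0.5, "output_format": 1.0}  # one locus gained

cosmetic = delta_g(p0, p0_reworded)
structural = delta_g(p0, p1)
```

Under these assumptions `cosmetic` is exactly zero and `structural` is positive, because only the second pair differs at a locus.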

Ancestor (P₀)

Descendant (P₁)

ΔG (genomic distance)

0.00

ΔZ (predicted phenotype shift)

Each axis is the heuristic projection from locus-status changes onto the 7-axis phenotype model. Range [-2, +2]; zero means no predicted change.

Mutation ledger — genomic diff

Content-addressed (ledger_id = sha256(locus | kind | after_fragment)). v0.3 Evolution Loop will reference these ids in lineage edges.
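The id formula above can be sketched in a few lines. This is an illustration under assumptions: the three fields are joined with a bare "|", encoded as UTF-8, and the digest is rendered as hex. The locus name and fragment are hypothetical; the kind names are borrowed from the v0.3 taxonomy.

```python
import hashlib


def ledger_id(locus: str, kind: str, after_fragment: str) -> str:
    """Content-addressed id: sha256 over 'locus|kind|after_fragment'."""
    # Assumption: bare "|" separator, UTF-8 encoding, hex digest.
    payload = "|".join((locus, kind, after_fragment)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical ledger entries.
first = ledger_id("output_format", "addition", "Respond in JSON.")
again = ledger_id("output_format", "addition", "Respond in JSON.")
other = ledger_id("output_format", "substitution", "Respond in JSON.")
```

Because the id depends only on content, `first` and `again` are identical, while changing any one field (here the kind) yields a different id. That property is what lets later lineage edges refer to a mutation by id.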

Locus	Kind	Before	After	Rationale	Ledger id

Regressions & new risks

Next single highest-leverage mutation

Issue 07 — Substrate Abstraction · v0.7.33

Simulate evolution.

Run the same five core engine moves the biological simulator runs (truncation, balanced, drift, OCS-like, introgression) over the prompt-organism substrate. Each run uses a population of 14-locus PromptOrganisms and a deterministic placeholder judge (the sum of weighted locus statuses, not a real LLM grader), and reports a per-strategy mean-fitness trajectory and a Pareto front.

v0.7.33 ships the substrate abstraction as a working contract: the engine kernel does not import promptbio; promptbio implements the engine.Individual / FitnessFunc / MutationOp / RecombinationOp interfaces. A real LLM-judge integration is queued (post-v0.8). Use the v0.1 Mapper above to inspect the ancestor's locus profile first.
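One way to picture the contract, as a sketch rather than the shipped code: the engine side declares the four interfaces, and the prompt side implements them without the engine ever importing it. The four interface names come from the text above; every signature, the `genome()` method, the locus names, and the helper names are assumptions. The judge follows the description given earlier (sum of weighted locus statuses).

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


class Individual(Protocol):
    """What the engine kernel needs to know about any organism (assumed shape)."""
    def genome(self) -> Dict[str, float]: ...


# Hypothetical signatures for the remaining three interfaces.
FitnessFunc = Callable[[Individual], float]
MutationOp = Callable[[Individual, random.Random], Individual]
RecombinationOp = Callable[[Individual, Individual, random.Random], Individual]


@dataclass(frozen=True)
class PromptOrganism:
    """Prompt-side implementation: locus name -> status in [0, 1] (assumed)."""
    loci: Dict[str, float]

    def genome(self) -> Dict[str, float]:
        return dict(self.loci)


def make_placeholder_judge(weights: Dict[str, float]) -> FitnessFunc:
    """Deterministic judge: sum of weighted locus statuses, no LLM involved."""
    def judge(ind: Individual) -> float:
        return sum(weights.get(locus, 0.0) * status
                   for locus, status in ind.genome().items())
    return judge


def strengthen_one_locus(ind: Individual, rng: random.Random) -> Individual:
    """Example MutationOp: set one randomly chosen locus to full strength."""
    genome = ind.genome()
    genome[rng.choice(sorted(genome))] = 1.0
    return PromptOrganism(genome)


def uniform_crossover(a: Individual, b: Individual, rng: random.Random) -> Individual:
    """Example RecombinationOp; assumes both parents share the same loci."""
    ga, gb = a.genome(), b.genome()
    return PromptOrganism({k: (ga[k] if rng.random() < 0.5 else gb[k]) for k in ga})


judge = make_placeholder_judge({"role": 2.0, "constraints": 1.0})
ancestor = PromptOrganism({"role": 1.0, "constraints": 0.5})
mutant = strengthen_one_locus(ancestor, random.Random(0))
```

With a seeded `random.Random`, the same ancestor and seed give the same mutant, which is the property a deterministic simulator relies on.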

Ancestor prompt · Population size · Generations · Selection % · Mutation rate · Replicates · Seed

Best risk-adjusted strategy

Per-strategy outcome

Final gain = mean Δfitness over replicates; risk = stdev of replicate end-state fitness. Substrate label is promptbio for every row (biology runs would tag biology).

Code	Name	Final gain	Final risk	Pareto?	Summary
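The gain and risk columns, and the Pareto flag, can be sketched as follows. Assumptions: Δfitness is end-state minus starting fitness for each replicate, risk uses the sample standard deviation, and a strategy is on the Pareto front when no other strategy has gain at least as high and risk at least as low with one of the two strictly better. The strategy codes and numbers are made up.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def summarize(start: float, end_states: List[float]) -> Tuple[float, float]:
    """Return (final gain, final risk) for one strategy."""
    gain = mean(end - start for end in end_states)  # mean delta-fitness
    risk = stdev(end_states)                        # spread of replicate end states
    return gain, risk


def pareto_front(outcomes: Dict[str, Tuple[float, float]]) -> List[str]:
    """Codes of strategies not dominated on (higher gain, lower risk)."""
    front = []
    for code, (gain, risk) in outcomes.items():
        dominated = any(
            g >= gain and r <= risk and (g > gain or r < risk)
            for other, (g, r) in outcomes.items() if other != code
        )
        if not dominated:
            front.append(code)
    return sorted(front)


# Hypothetical replicate end-state fitness, all starting from fitness 1.0.
outcomes = {
    "T": summarize(1.0, [2.0, 3.0, 4.0]),  # high gain, high risk
    "B": summarize(1.0, [2.0, 2.0, 2.0]),  # modest gain, no spread
    "D": summarize(1.0, [1.0, 2.0, 3.0]),  # modest gain, high risk
}
front = pareto_front(outcomes)
```

Here "D" drops off the front because "T" matches its risk with more gain, while "T" and "B" both stay: neither beats the other on both axes.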

Trajectories (mean fitness per generation)

Honesty layer

What could be wrong:

Promptbio v0.3 — Prompt Evolution Loop · v0.7.34

Evolve a prompt family.

From a single ancestor prompt, grow a population of variants, score them across 3 canonical niches (Core breadth, Epistemic depth, Safety-first), select per-niche specialists + global top-K, iterate N generations, and emit a lineage tree whose edges carry content-addressed v0.2 mutation ledger entries.

v0.3 (v0.7.34) ships the deterministic core: placeholder judge, 4-kind mutation taxonomy (addition / deletion / substitution / amplification), synchronous endpoint. Real LLM-judge integration, the 15-operator v5.7 catalogue, and the async /start + /status pattern are queued behind v0.4 Ecology.

Ancestor prompt · Generations · Population size · Selection % · Seed

Run summary

Niche specialists (final generation)

Each niche is a different per-locus weight profile. A specialist in one niche may not be globally best — that's exactly the point.
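A small sketch of niche-weighted selection, assuming each niche score is a weighted sum of locus statuses and the global rank is the mean score across niches. The three niche names are from the text; the locus names, weights, variants, and the global-score rule are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple

Genome = Dict[str, float]

# Hypothetical per-locus weight profiles for the three canonical niches.
NICHES: Dict[str, Dict[str, float]] = {
    "Core breadth":    {"task": 2.0, "evidence": 1.0, "guardrails": 1.0},
    "Epistemic depth": {"task": 1.0, "evidence": 3.0, "guardrails": 1.0},
    "Safety-first":    {"task": 1.0, "evidence": 1.0, "guardrails": 3.0},
}


def niche_score(genome: Genome, weights: Dict[str, float]) -> float:
    return sum(weights.get(locus, 0.0) * status for locus, status in genome.items())


def select(population: Dict[str, Genome], top_k: int) -> Tuple[Dict[str, str], List[str]]:
    """Return (best variant per niche, global top-K by mean niche score)."""
    specialists = {
        niche: max(population, key=lambda name: niche_score(population[name], weights))
        for niche, weights in NICHES.items()
    }

    def global_score(name: str) -> float:
        return mean(niche_score(population[name], w) for w in NICHES.values())

    top = sorted(population, key=global_score, reverse=True)[:top_k]
    return specialists, top


population = {
    "v1": {"task": 1.0, "evidence": 0.25, "guardrails": 0.0},
    "v2": {"task": 0.0, "evidence": 1.0, "guardrails": 0.0},
    "v3": {"task": 0.5, "evidence": 0.5, "guardrails": 0.5},
}
specialists, top = select(population, top_k=1)
```

In this toy population v1 wins Core breadth and v2 wins Epistemic depth, yet the balanced v3 takes the global top slot, so keeping specialists alongside the global top-K preserves variants that a single ranking would discard.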

Generation changelog

Gen	Mean fitness	Best	Best variant	Pareto front

Lineage edges (sample)

Each row is one parent → child mutation. Ledger id is content-addressed (sha256(locus | kind | after_fragment)) and stable across runs.

Parent	Child	Locus	Kind	Ledger id

Honesty layer

Promptbio v2.7 — Epistemology & Truth Maintenance · v0.7.35 (Issue 30)

Triage the raw context. Get a belief state.

Y = Generate(P, B_t, C_t). Every output must come from a managed belief state, not from raw context. This module classifies incoming statements into the 12-type claim ontology, ranks them on the 10-tier source hierarchy, tags 5-axis confidence, and renders the four-pane inspector (Known facts · Working assumptions · Unknowns · Contradictions).

v0.7.35 ships the full plan endpoint (16 sections), the runtime gate (10 checks + 9 anti-pattern detectors), and the belief-update protocol with deprecate + propagate. Decision Theory layer (v2.8) consumes the belief state.

Use case · Risk level

Raw context (one statement per line; first token is the source kind: user/document/tool/memory/web/RAG)

Epistemic diagnosis & score

EpistemicScore: —

Belief state — 4-pane inspector

Source hierarchy: current_user_correction > current_user_fact > verified_tool_result > authoritative_document > confirmed_memory > older_memory > unverified_user_claim > retrieved_snippet > model_inference > assumption.
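The hierarchy above is a total order, so ranking a claim is a list lookup. The sketch below assumes a contradiction between two claims about the same subject is settled by tier alone; the real module also tracks confidence and freshness, which this ignores. The example claims are invented.

```python
from typing import List, Tuple

# Highest-priority source first, exactly as listed above.
SOURCE_HIERARCHY = [
    "current_user_correction",
    "current_user_fact",
    "verified_tool_result",
    "authoritative_document",
    "confirmed_memory",
    "older_memory",
    "unverified_user_claim",
    "retrieved_snippet",
    "model_inference",
    "assumption",
]
TIER_RANK = {name: rank for rank, name in enumerate(SOURCE_HIERARCHY)}

Claim = Tuple[str, str]  # (source tier, content)


def rank_claims(claims: List[Claim]) -> List[Claim]:
    """Order claims from most to least authoritative source."""
    return sorted(claims, key=lambda claim: TIER_RANK[claim[0]])


def resolve(conflicting: List[Claim]) -> Claim:
    """Pick the claim from the highest tier (assumed tie-free)."""
    return rank_claims(conflicting)[0]


# Hypothetical contradiction about the same subject.
conflict = [
    ("older_memory", "User lives in Berlin"),
    ("model_inference", "User probably lives in Germany"),
    ("current_user_correction", "User lives in Lisbon"),
]
winner = resolve(conflict)
```

The correction from the current user outranks both memory and inference, so `winner` is the Lisbon claim; the losing claims would be the ones to deprecate.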

Known facts

Working assumptions

Unknowns

Contradictions

Classified claims

ID	Type	Source tier	Conf (src/intp/inf/act/fresh)	State	Content

Runtime gate result

10 checks

Anti-pattern hits

Epistemic status block

Reference blocks (drop-in)

EPISTEMIC PROTOCOL (9 clauses)

PSL epistemology: block

Promptbio v1.3 — Prompt Evaluation Lab · v0.7.36 (Issue 16)

Prove the new prompt is actually better.

Run the candidate through 10 test types × 9 environments × 8 specialised judges. The score matrix tells you where it wins; the reaction norm tells you where it breaks; the ablation tells you which modules earn their place; the regression report tells you what improved and what got worse; the decision engine emits one of accept / accept_as_specialist / reject / split_into_profiles / mutate_again.

v0.7.36 ships a deterministic genome-map-driven judging stack so the canonical P₀ → P₁.₁ → P₁.₂ ladder is reproducible. Real LLM-as-judge integration lands with v1.4 Runtime. F_net = F_quality − λ·Cost.

Target prompt (P_new) · Ancestor prompt (P_old, optional)

Task family · Cost λ

Deployment decision

verdict

F_quality: F_net: (F_net = F_quality − λ·Cost)
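A worked instance of F_net = F_quality − λ·Cost shows how the cost weight can reverse a ranking. The quality values, cost units, and λ settings are illustrative assumptions only.

```python
def f_net(f_quality: float, cost: float, lam: float) -> float:
    """Net fitness: quality minus lambda-weighted cost."""
    return f_quality - lam * cost


# Hypothetical prompts: P_new scores higher on quality but costs three times as much.
p_old = {"f_quality": 0.70, "cost": 1.0}
p_new = {"f_quality": 0.80, "cost": 3.0}

# With a small cost weight the better-quality prompt wins...
cheap_lambda = 0.01
new_wins = f_net(**p_new, lam=cheap_lambda) > f_net(**p_old, lam=cheap_lambda)

# ...with a larger one the cheaper prompt wins.
pricey_lambda = 0.10
old_wins = f_net(**p_old, lam=pricey_lambda) > f_net(**p_new, lam=pricey_lambda)
```

Both comparisons come out true under these numbers, which is why λ belongs to the task setup rather than the judging stack: the same score matrix can justify either prompt depending on how much cost matters.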

Score matrix (mean per environment)

Geometric mean of 8 judges across the seeded test bank, per environment. Critical failure column flags injection / drift / constraint_leakage misses.

Env	Mean score	Pass rate	Critical?
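To see what a geometric mean across judges does compared with an arithmetic one, here is a short sketch. The eight scores are invented, and all are assumed strictly positive, since a geometric mean is not meaningful with negative values. `statistics.geometric_mean` needs Python 3.8 or later.

```python
from statistics import geometric_mean, mean

# Hypothetical scores from 8 judges in one environment:
# seven judges are satisfied, one judge flags a serious problem.
judge_scores = [0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9, 0.1]

arithmetic = mean(judge_scores)
geometric = geometric_mean(judge_scores)
```

The geometric mean lands well below the arithmetic mean here, so a single failing judge drags the environment score down instead of being averaged away.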

Reaction norm (per-trait stability across environments)

Stable: variance < 0.04. Floating: variance ≥ 0.04. Niche fit:

Trait (judge)	Mean	Variance	Worst env	Worst score
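The stable/floating rule can be sketched per trait. The 0.04 threshold is from the text; treating the variance as a population variance over environments, plus the trait and environment names, are assumptions.

```python
from statistics import mean, pvariance
from typing import Dict

STABLE_VARIANCE = 0.04  # threshold stated above


def reaction_norm(scores_by_env: Dict[str, float]) -> Dict[str, object]:
    """Summarize one trait's scores across environments."""
    values = list(scores_by_env.values())
    variance = pvariance(values)  # assumption: population variance
    worst_env = min(scores_by_env, key=scores_by_env.get)
    return {
        "mean": mean(values),
        "variance": variance,
        "label": "stable" if variance < STABLE_VARIANCE else "floating",
        "worst_env": worst_env,
        "worst_score": scores_by_env[worst_env],
    }


# Hypothetical per-environment scores for two traits.
formatting = reaction_norm({"clean": 0.8, "noisy": 0.8, "adversarial": 0.9})
injection = reaction_norm({"clean": 0.9, "noisy": 0.9, "adversarial": 0.2})
```

The formatting trait barely moves across environments and is labelled stable; the injection trait collapses in one environment, so it is labelled floating and that environment is reported as the worst.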

Ablation plan

Knock out each module; if removing it costs > 0.05 in mean score, keep it.

Module	Expected loss	Actual loss	Keep?
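The keep rule reads directly as a threshold on score loss. In this sketch the baseline and knocked-out scores are supplied by hand; in the lab they would come from re-running the test bank with each module removed. The module names and numbers are hypothetical.

```python
from typing import Dict

KEEP_THRESHOLD = 0.05  # threshold stated above


def ablation_report(baseline: float, knocked_out: Dict[str, float]) -> Dict[str, Dict[str, object]]:
    """For each module, the mean-score loss when it is removed and the keep decision."""
    report = {}
    for module, score_without in knocked_out.items():
        loss = baseline - score_without
        report[module] = {"loss": loss, "keep": loss > KEEP_THRESHOLD}
    return report


# Hypothetical mean scores with each module removed (baseline 0.80).
report = ablation_report(0.80, {"few_shot_examples": 0.62, "persona_preamble": 0.79})
```

Removing the examples costs about 0.18, so that module stays; removing the persona preamble costs about 0.01, so it is a candidate for deletion.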

Next mutations

Promptbio v3.0 — Unified Prompt Organism Architecture · v0.7.37 (Issue 33)

Design the whole organism. One spec, sixteen organs.

v3.0 is the architecture spine: 16 organs (Genome / Constitutional / Context / Epistemic / Metabolic / Immune / Decision / Planning / Runtime / Memory / Tool / Observability / Evaluation / Governance / Reproductive / Autopoietic), 7 information flows, 6 control loops, 10 architectural principles. Pick the inputs; get a prompt_organism spec, the anatomy diagram, the §14 8-step minimum-viable template, the §23 16-step production path, the top risks, and a validation report against all 10 principles.

Agent = PromptOrganism + ActionLoop + Tools + Permissions. The validator classifies your spec as workflow / organism / agent / agent_incomplete / model_reference_only.

Use case · Target phenotype

Deployment · Risk · Tools · Memory · Quality

Validation verdict

verdict · size

Anatomy diagram

4-column anatomy view

Identity

Genome & Runtime

Belief & Decision

Observability & Evolution

Top risks

Failure mode	Repair

§14 Minimum Viable Organism (8 steps)

§23 Full Production Path (16 steps)

Promptbio v3.1 — Prompt Organism Design Patterns · v0.7.38 (Issue 34)

Right pattern before right prompt.

v3.1 is zoology over the v3.0 anatomy: 12 canonical pattern cards (Explainer Cell · Research Synthesizer · Strategy Advisor · Strategy Critic · Code Repair · Document Review · Data Analysis · Eval Judge · Memory-Aware Companion · Agentic Tool-Using · High-Assurance Advisor · Autopoietic Ecosystem), 34-row selection matrix, 8 composition rules, 10 anti-patterns. Pick inputs → get pattern + size + agency + required organs + minimal prompt skeleton + required tests.

Task niche · Target phenotype

Deployment · Risk · Tools · Memory · Quality

Prove the new prompt is actually better.

Deployment decision

Score matrix (mean per environment)

Reaction norm (per-trait stability across environments)

Regression report (P_old → P_new)

Improved

Degraded

New risks

Ablation plan

Next mutations

Right pattern before right prompt.

Recommended pattern

Minimal prompt skeleton (copy-paste starter)

Required tests for this pattern

12 canonical patterns