Guardrails

Ralph is intentionally conservative.

Fresh context

Each iteration starts from a fresh agent session, then re-anchors against repo state.

Receipt gates

Ralph does not trust a model saying “done.” It checks receipts written by review commands.
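
A minimal sketch of what a receipt gate can look like. It assumes a JSON receipt with `task` and `verdict` fields and a passing value of `"pass"`; those names and the function names are hypothetical and are not Ralph's actual receipt schema. The point it illustrates is that the gate reads an artifact and fails closed when the artifact is missing, malformed, or for another task.

```python
import json
from pathlib import Path


def verdict_from_text(raw: str, expected_task: str) -> bool:
    """Return True only for a well-formed receipt that passes the expected task.

    The "task" and "verdict" field names are assumptions made for this sketch.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False  # a bare "done" string is not a receipt
    if not isinstance(data, dict):
        return False
    return data.get("task") == expected_task and data.get("verdict") == "pass"


def receipt_passes(receipt_path: Path, expected_task: str) -> bool:
    """Fail closed: a missing or unreadable receipt never counts as a pass."""
    try:
        raw = receipt_path.read_text(encoding="utf-8")
    except OSError:
        return False
    return verdict_from_text(raw, expected_task)
```

A model's own claim of completion never reaches this check; only a file written by a review command does.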

Multi-model review

Implementation is checked by a different model through the configured backend. Findings introduced by the change under review gate the verdict.
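
The rule can be sketched as follows. This reads "introduced" as "attributed to the change under review, not pre-existing", and assumes that pre-existing findings are reported without blocking; both are assumptions. The `Finding` type, the `introduced` flag, and the `"pass"`/`"fail"` strings are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Finding:
    summary: str
    introduced: bool  # assumption: True if the change under review caused it


def review_verdict(implementer_model: str, reviewer_model: str,
                   findings: List[Finding]) -> str:
    """Return "fail" if any introduced finding exists, otherwise "pass"."""
    if reviewer_model == implementer_model:
        # The review is only meaningful when a different model performs it.
        raise ValueError("reviewer must be a different model than the implementer")
    blocking = [f for f in findings if f.introduced]
    return "fail" if blocking else "pass"
```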

No autonomous product decisions

/flow-next:strategy, /flow-next:prospect, and /flow-next:capture are user-triggered surfaces. Ralph does not decide what to build.

Test first

Run one iteration before overnight mode:

scripts/ralph/ralph_once.sh

Guardrail stack

flowchart TB
  Spec["Reviewed spec"] --> Ready["Ready task only"]
  Ready --> Fresh["Fresh session"]
  Fresh --> Evidence["Evidence commands"]
  Evidence --> Receipt["Review receipt"]
  Receipt --> Continue["Continue or stop"]
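
The same stack, sketched as a loop. This is an illustration of the control flow only, not Ralph's implementation: `Task`, `run_guarded`, and the two callbacks are hypothetical names, and the callbacks stand in for "fresh session plus evidence commands" and "receipt check" respectively.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Task:
    task_id: str
    ready: bool  # assumption: set only when the task comes from a reviewed spec


def run_guarded(tasks: Iterable[Task],
                run_iteration: Callable[[Task], None],
                receipt_ok: Callable[[Task], bool],
                max_iterations: int) -> List[str]:
    """Run ready tasks one at a time; stop at the cap or at the first bad receipt."""
    completed: List[str] = []
    iterations = 0
    for task in tasks:
        if not task.ready:
            continue  # only ready tasks are eligible
        if iterations >= max_iterations:
            break  # iteration cap reached
        iterations += 1
        run_iteration(task)  # stands in for: fresh session, then evidence commands
        if not receipt_ok(task):
            break  # no valid receipt: stop instead of continuing
        completed.append(task.task_id)
    return completed
```

Stopping on the first failed receipt, instead of skipping to the next task, keeps a bad iteration from compounding across an unattended run.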

Human-owned boundaries

Ralph should not decide:

whether a feature belongs in the product
whether a risky migration is acceptable
whether to merge
whether to ignore a security finding
whether undocumented behavior is acceptable

Those decisions belong in the spec, the review notes, or the human merge decision.

Production defaults

For production use:

Require plan review for risky specs.
Require implementation review with a real backend.
Keep iteration caps low until the repo has strong tests.
Prefer one spec branch per Ralph run.
Review generated PR bodies before merge.
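
Some of these defaults can be checked mechanically before a run. The sketch below is a hypothetical preflight check: `RunSettings`, its field names, and the `LOW_CAP` threshold of 5 are invented for illustration and are not Ralph configuration keys or recommended values.

```python
from dataclasses import dataclass
from typing import List, Optional

LOW_CAP = 5  # arbitrary illustrative threshold for a "low" iteration cap


@dataclass(frozen=True)
class RunSettings:
    # All field names are hypothetical, chosen for this sketch.
    spec_is_risky: bool
    plan_review_enabled: bool
    review_backend: Optional[str]  # None means no real backend is configured
    max_iterations: int
    repo_has_strong_tests: bool


def production_warnings(settings: RunSettings) -> List[str]:
    """Return one warning per production default the settings violate."""
    warnings: List[str] = []
    if settings.spec_is_risky and not settings.plan_review_enabled:
        warnings.append("risky spec without plan review")
    if not settings.review_backend:
        warnings.append("implementation review has no real backend")
    if not settings.repo_has_strong_tests and settings.max_iterations > LOW_CAP:
        warnings.append("iteration cap is high for a repo without strong tests")
    return warnings
```

The branch-per-run and PR-body items stay as human checks; a flag in a settings object cannot confirm them.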

Ralph is powerful when the workflow is already disciplined. Without crisp specs and review gates, it only makes drift faster.