harden(go-migration): require real cutover evidence by mrjf · Pull Request #116 · githubnext/apm

mrjf · 2026-06-09T21:07:04Z

harden(go-migration): require real cutover evidence

TL;DR

This PR changes the Go migration completion gate so it can no longer declare success from representative help output, obsolete Python tests, or placeholder mappings. It adds an explicit option parity gate, requires legacy Python tests to map to existing Go-only behavior tests, and wires strict coverage enforcement into the migration workflow. The important result is that the current migration now fails strict completion with concrete evidence instead of reporting “done.”

Important

This PR intentionally proves the migration is not deletion-grade ready yet; the report-mode workflow can still collect evidence without blocking non-crane PRs.

Problem (WHY)

The scorer had no first-class gate for CLI option parity, so commands could look present while Python options were still missing.
Python test coverage could be marked obsolete or mapped to help/surface tests and still look complete.
The Go-only cutover replay did not prove that mapped Go tests existed or that they performed real Go-only behavior.
The cutover doc still claimed deletion-grade readiness even though strict checks expose missing behavior.

Why these matter: the migration gate is supposed to transform generated progress into verifiable action, not trust labels or naming conventions. That matches the PROSE principle that “Grounding outputs in deterministic tool execution transforms probabilistic generation into verifiable action.” It also follows the Agent Skills guidance that agents “pattern-match well against concrete structures” and that validation should “do the work, run a validator ... fix any issues, and repeat until validation passes.”

Approach (WHAT)

#	Fix	Principle
1	Add `option_parity` as an explicit scorer ratio gate and require it for deletion-grade readiness.	Deterministic tool execution
2	Make Python option inventory tests emit counted pass/total data and hard-fail under `APM_ENFORCE_COMPLETION_GATES=1`.	Concrete structures
3	Reject obsolete Python tests by default; allow them only in report mode with `--allow-obsolete-python-tests`.	Validator loop
4	Require Go cutover coverage mappings to point at existing `TestGoCutoverReal...` tests.	Real behavior evidence
5	Add real state/behavior fixtures that currently catch missing config, MCP, marketplace, and runtime behavior.	Regression traps
6	Update Actions so crane PRs and manual strict runs fail on incomplete coverage, while ordinary PRs still collect evidence.	Low-noise CI

Implementation (HOW)

.crane/scripts/score.go -- Adds OptionParity to the score schema, parses the new option_parity gate event, and requires it for cutover_ready.
.github/workflows/migration-ci.yml -- Exports APM_ENFORCE_PYTHON_BEHAVIOR_CONTRACTS=1 for strict completion runs and only passes report-mode escape hatches outside strict mode.
cmd/apm/python_behavior_contracts_test.go -- Counts every Python CLI option from the extracted command inventory, emits option_parity, and fails strict mode with the exact missing options.
cmd/apm/go_cutover_coverage_test.go -- Discovers actual Go test functions, rejects stale mapping names, and only counts existing TestGoCutoverReal... mappings as final cutover evidence.
cmd/apm/real_behavior_test.go -- Adds real temporary-project fixtures for persisted config, MCP manifests, marketplace mutation/validation, and runtime removal.
scripts/ci/python_behavior_contracts.py and tests/parity/test_python_behavior_contracts.py -- Treat python_tests.obsolete as report-only debt and hard-fail strict coverage.
Docs and manifests -- Update CUTOVER.md, parity README text, and coverage manifest descriptions so the documented gate matches the enforced gate.
Unit tests -- Update scorer and workflow tests to assert the new strict gate wiring.
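
As a rough illustration of the scorer change, the sketch below parses gate events in the JSON shape shown in this PR's validation output and treats a gate that was never reported as a failure. It is not the code in `.crane/scripts/score.go`: the `gateEvent` struct, `parseGates`, and `cutoverReady` are hypothetical names, and the list of required gates is assumed for the example.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// gateEvent mirrors the JSON gate lines quoted in this PR (hypothetical type name).
type gateEvent struct {
	Crane   string `json:"crane"`
	Name    string `json:"name"`
	Passing int    `json:"passing"`
	Total   int    `json:"total"`
}

// parseGates keeps the last event seen for each gate name and skips
// every line that is not a crane gate event.
func parseGates(output string) map[string]gateEvent {
	gates := make(map[string]gateEvent)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if !strings.HasPrefix(line, "{") {
			continue
		}
		var ev gateEvent
		if err := json.Unmarshal([]byte(line), &ev); err != nil || ev.Crane != "gate" {
			continue
		}
		gates[ev.Name] = ev
	}
	return gates
}

// cutoverReady is true only when every required gate was reported and is
// fully passing. A missing or empty gate counts as a failure.
func cutoverReady(gates map[string]gateEvent, required []string) bool {
	for _, name := range required {
		ev, ok := gates[name]
		if !ok || ev.Total == 0 || ev.Passing != ev.Total {
			return false
		}
	}
	return true
}

func main() {
	output := `{"crane":"gate","name":"option_parity","passing":134,"total":273}
HARD-GATE FAILED: Go help is missing 139/273 Python CLI options.
{"crane":"gate","name":"functional","passing":20,"total":26}`

	// Assumed required set for the example; the real scorer may require more gates.
	required := []string{"option_parity", "functional", "state_diff"}
	fmt.Println("cutover_ready:", cutoverReady(parseGates(output), required))
}
```

Counting an unreported gate as failing is the design choice that matters: a gate that silently stops emitting events cannot raise the score.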

Diagrams

Legend: The diagram shows how this PR turns collected parity evidence into strict completion gates before the scorer can mark the migration ready.

flowchart LR
    subgraph Evidence[Evidence]
        GoEvents["go test events"]
        PyInventory["Python behavior inventory"]
        Coverage["coverage manifests"]
    end
    subgraph Gates[Strict gates]
        Option["option_parity"]
        Behavior["python_behavior_contracts"]
        Real["functional and state_diff"]
    end
    subgraph Score[Completion score]
        Scorer["score.go"]
        Ready["deletion_grade_ready"]
    end
    PyInventory --> Option
    Coverage --> Behavior
    GoEvents --> Real
    Option --> Scorer
    Behavior --> Scorer
    Real --> Scorer
    Scorer --> Ready
    classDef new stroke-dasharray: 5 5;
    class Option,Behavior,Real new;

Trade-offs

Strict failure instead of optimistic completion. Chose to make the current migration fail strict gates; rejected preserving the old green score because it hid missing work.
Report-mode escape hatches remain. Chose --allow-obsolete-python-tests only for collection/reporting; rejected using it in strict mode.
Go-only behavior prefix is intentionally narrow. Chose TestGoCutoverReal... as the deletion-grade evidence prefix; rejected counting Python-vs-Go completion or help tests as final proof.
This PR does not implement the missing Go behavior. It makes the missing work visible and blocked; the next PRs should fix the concrete command gaps.
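
The narrow-prefix rule can be sketched as follows. This is a minimal illustration, not the code in `cmd/apm/go_cutover_coverage_test.go`: it assumes test discovery by parsing Go source with `go/parser`, and the helper names, the sample Go test names, and the Python test IDs in the mapping are all hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
	"strings"
)

// realPrefix is the deletion-grade evidence prefix named in this PR.
const realPrefix = "TestGoCutoverReal"

// discoverTests returns the names of top-level Test functions found in src.
func discoverTests(src string) (map[string]bool, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example_test.go", src, 0)
	if err != nil {
		return nil, err
	}
	found := make(map[string]bool)
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if ok && fn.Recv == nil && strings.HasPrefix(fn.Name.Name, "Test") {
			found[fn.Name.Name] = true
		}
	}
	return found, nil
}

// invalidMappings lists the Python tests whose mapped Go test either does
// not exist or does not carry the real-behavior prefix.
func invalidMappings(mapping map[string]string, existing map[string]bool) []string {
	var bad []string
	for pyTest, goTest := range mapping {
		if !existing[goTest] || !strings.HasPrefix(goTest, realPrefix) {
			bad = append(bad, pyTest)
		}
	}
	sort.Strings(bad)
	return bad
}

func main() {
	// Hypothetical Go test file contents.
	src := `package apm

import "testing"

func TestGoCutoverRealConfigPersists(t *testing.T) {}
func TestHelpSurface(t *testing.T) {}
`
	existing, err := discoverTests(src)
	if err != nil {
		panic(err)
	}
	// Hypothetical Python test IDs mapped to Go test names.
	mapping := map[string]string{
		"tests/test_config.py::test_set": "TestGoCutoverRealConfigPersists",
		"tests/test_help.py::test_help":  "TestHelpSurface",         // exists, wrong prefix
		"tests/test_mcp.py::test_add":    "TestGoCutoverRealMCPAdd", // stale name
	}
	fmt.Println(invalidMappings(mapping, existing))
}
```

Checking existence and prefix together means a mapping cannot pass by pointing at a real help test, or by naming a correctly prefixed test that was never written.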

Benefits

migration_score = 1.0 now requires option parity in addition to help/surface parity.
Obsolete Python tests no longer count as completion evidence in strict mode.
Stale or placeholder Go test names no longer satisfy the all-Go cutover replay.
Strict mode now exposes current gaps with counted evidence: 134/273 option parity, 17204/23771 behavior-backed mappings, and 20/26 real behavior fixtures.
The cutover document no longer says the Go port is deletion-grade ready while the strict gate disagrees.

Validation

uv run pytest tests/unit/test_crane_score.py tests/unit/test_migration_ci_workflow.py -q:

27 passed in 107.78s (0:01:47)

uv run pytest tests/parity/test_python_behavior_contracts.py -q --tb=short:

2 passed, 136 skipped, 1 xfailed in 23.88s

APM_PYTHON_BIN="$PWD/.venv/bin/apm" go test ./cmd/apm -run '^TestParityPythonOptionsFromSource$' -count=1:

ok  	github.com/githubnext/apm/cmd/apm	6.410s

Expected strict failures proving the gate now blocks false completion

APM_ENFORCE_COMPLETION_GATES=1 APM_PYTHON_BIN="$PWD/.venv/bin/apm" go test ./cmd/apm -run '^TestParityPythonOptionsFromSource$' -count=1:

{"crane":"gate","name":"option_parity","passing":134,"total":273}
HARD-GATE FAILED: Go help is missing 139/273 Python CLI options.
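
A minimal sketch of the report-versus-strict split behind this output, assuming the gate always emits its counted event and turns the gap into an error only when `APM_ENFORCE_COMPLETION_GATES=1`. The function names are hypothetical, and which options Go is missing in the sample data is invented for illustration; only the option strings themselves appear elsewhere in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// gateEvent matches the JSON shape of the gate line above (hypothetical type name).
type gateEvent struct {
	Crane   string `json:"crane"`
	Name    string `json:"name"`
	Passing int    `json:"passing"`
	Total   int    `json:"total"`
}

// optionParity counts how many Python options the Go CLI also exposes
// and returns the ones it does not.
func optionParity(pythonOptions []string, goOptions map[string]bool) (gateEvent, []string) {
	var missing []string
	for _, opt := range pythonOptions {
		if !goOptions[opt] {
			missing = append(missing, opt)
		}
	}
	ev := gateEvent{
		Crane:   "gate",
		Name:    "option_parity",
		Passing: len(pythonOptions) - len(missing),
		Total:   len(pythonOptions),
	}
	return ev, missing
}

// enforce returns an error only in strict mode; report mode just records the gap.
func enforce(ev gateEvent, missing []string, strict bool) error {
	if strict && len(missing) > 0 {
		return fmt.Errorf("HARD-GATE FAILED: Go help is missing %d/%d Python CLI options: %v",
			len(missing), ev.Total, missing)
	}
	return nil
}

func main() {
	strict := os.Getenv("APM_ENFORCE_COMPLETION_GATES") == "1"

	// Illustrative inventory; the missing entries are assumed, not measured.
	pythonOptions := []string{"init --yes", "targets --json", "audit --ci", "pack --output"}
	goOptions := map[string]bool{"init --yes": true, "targets --json": true}

	ev, missing := optionParity(pythonOptions, goOptions)
	line, _ := json.Marshal(ev)
	fmt.Println(string(line)) // the event is emitted in both modes

	if err := enforce(ev, missing, strict); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Emitting the event before deciding whether to fail keeps report-mode runs useful as evidence while still letting strict runs block.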

go test ./cmd/apm -run '^TestGoCutover' -count=1:

{"crane":"gate","name":"python_behavior_contracts","passing":17204,"total":23771}
Go cutover coverage is not behavior-backed: 6567/23771 Python tests do not map to a real Go-only cutover behavior test.
{"crane":"gate","name":"functional","passing":20,"total":26}
{"crane":"gate","name":"state_diff","passing":20,"total":26}

APM_ENFORCE_PYTHON_BEHAVIOR_CONTRACTS=1 uv run pytest tests/parity/test_python_behavior_contracts.py::test_python_contract_coverage_manifest_is_complete -q --tb=short:

obsolete-python-test-coverage: 24177
1 failed in 16.66s

Additional checks:

ruff check ...                         All checks passed!
ruff format --check ...                3 files already formatted
git diff --check                       passed

Scenario Evidence

#	Scenario (user promise)	Principle(s)	Test(s) proving it	Type
1	Crane cannot mark the Go migration complete while Python CLI options are missing.	DevX, Governed by policy	`cmd/apm/python_behavior_contracts_test.go::TestParityPythonOptionsFromSource`	integration
2	Legacy Python tests must be replaced by behavior-backed Go evidence, not obsolete labels or help-only mappings.	Governed by policy, OSS / community-driven	`cmd/apm/go_cutover_coverage_test.go::TestGoCutoverPythonTestConversionCoverage` `tests/parity/test_python_behavior_contracts.py::test_python_contract_coverage_manifest_is_complete`	integration
3	Real commands must mutate or read real project state before deletion-grade cutover can pass.	Portability by manifest, DevX	`cmd/apm/real_behavior_test.go::TestGoCutoverRealFunctionalAndStateDiffContracts`	integration
4	Strict migration CI fails on incomplete evidence, but report-mode CI still produces summaries.	Governed by policy, DevX	`tests/unit/test_migration_ci_workflow.py::test_migration_ci_enforces_completion_for_crane_prs_and_explicit_manual_runs`	unit

How to test

Run uv run pytest tests/unit/test_crane_score.py tests/unit/test_migration_ci_workflow.py -q and expect all tests to pass.
Run APM_PYTHON_BIN="$PWD/.venv/bin/apm" go test ./cmd/apm -run '^TestParityPythonOptionsFromSource$' -count=1 and expect report mode to pass.
Run APM_ENFORCE_COMPLETION_GATES=1 APM_PYTHON_BIN="$PWD/.venv/bin/apm" go test ./cmd/apm -run '^TestParityPythonOptionsFromSource$' -count=1 and expect the option_parity hard gate to fail with missing options.
Run go test ./cmd/apm -run '^TestGoCutover' -count=1 and expect the cutover gate to fail, reporting gaps in behavior-backed coverage and in real-command fixtures.
In Actions, run “Migration Parity and Benchmarks” with enforce_completion=true and expect strict coverage failures until the Go implementation actually closes the gaps.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-09T21:14:47Z

Migration Benchmark Results

Commit: 3aa11ef0c8b5c12908c8bc0d4b63706842c16dbc
Run: https://github.com/githubnext/apm/actions/runs/27235890993

Migration CLI Benchmark

Includes fixture-backed commands that must read, write, execute, or fail against real project state. The installed-project fixture contains apm.yml, apm.lock.yaml, apm_modules packages, local .apm primitives, target directories, deployed prompt files, and sample source files.
The harness checks return-code parity for each command. Detailed stdout/stderr byte counts are kept in the JSON samples, but this is not an output-parity test.

Max allowed Go/Python median ratio: 5.00

Benchmark	Command	Fixture	Python median	Go median	Go/Python	Result	Return codes
init scaffold	`init --yes`	empty-project	0.4992s	0.0013s	0.00x	377.79x faster	{'python': [0], 'go': [0]}
targets json	`targets --json`	installed-project	0.4711s	0.0016s	0.00x	293.68x faster	{'python': [0], 'go': [0]}
script list	`list`	installed-project	0.4972s	0.0017s	0.00x	301.19x faster	{'python': [0], 'go': [0]}
deps list	`deps list`	installed-project	0.4875s	0.0015s	0.00x	329.04x faster	{'python': [0], 'go': [0]}
deps tree	`deps tree`	installed-project	0.4789s	0.0015s	0.00x	327.19x faster	{'python': [0], 'go': [0]}
install local package	`install --no-policy ./packages/local-tools`	local-install-project	0.5259s	0.0018s	0.00x	290.91x faster	{'python': [0], 'go': [0]}
compile copilot target	`compile --target copilot`	compilation-project	0.5099s	0.0015s	0.00x	332.20x faster	{'python': [0], 'go': [0]}
pack output	`pack --output dist`	installed-project	0.4959s	0.0017s	0.00x	287.55x faster	{'python': [0], 'go': [0]}
run script	`run stamp`	runnable-project	0.4784s	0.0025s	0.01x	192.66x faster	{'python': [0], 'go': [0]}
audit hidden unicode	`audit --ci`	audit-finding-project	0.4970s	0.0017s	0.00x	288.90x faster	{'python': [1], 'go': [1]}
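
The two checks applied to each row, a median ratio within the allowed maximum and matching return codes, can be sketched as below. This is an assumption about how such a harness might work, not the benchmark's actual code: the type and function names are hypothetical, and the timing samples are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// benchmark holds raw timing samples and return codes for one command (hypothetical type).
type benchmark struct {
	Name          string
	PythonSeconds []float64
	GoSeconds     []float64
	PythonCodes   []int
	GoCodes       []int
}

// result summarizes the two checks for one benchmark.
type result struct {
	Ratio        float64
	WithinBudget bool
	CodesMatch   bool
}

// median returns the middle value of the samples without modifying the input.
func median(samples []float64) float64 {
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	n := len(s)
	if n == 0 {
		return 0
	}
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// sameCodes reports whether both runs produced identical return-code lists.
func sameCodes(a, b []int) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// evaluate computes the Go/Python median ratio and checks it against maxRatio.
func evaluate(b benchmark, maxRatio float64) result {
	py, goMed := median(b.PythonSeconds), median(b.GoSeconds)
	ratio := math.Inf(1) // no usable Python baseline means the budget check fails
	if py > 0 {
		ratio = goMed / py
	}
	return result{
		Ratio:        ratio,
		WithinBudget: ratio <= maxRatio,
		CodesMatch:   sameCodes(b.PythonCodes, b.GoCodes),
	}
}

func main() {
	// Illustrative samples only; not taken from the run above.
	b := benchmark{
		Name:          "audit hidden unicode",
		PythonSeconds: []float64{0.50, 0.48, 0.52},
		GoSeconds:     []float64{0.002, 0.001, 0.003},
		PythonCodes:   []int{1},
		GoCodes:       []int{1},
	}
	r := evaluate(b, 5.00)
	fmt.Printf("%s: %.2fx within=%t codes=%t\n", b.Name, r.Ratio, r.WithinBudget, r.CodesMatch)
}
```

A non-zero return code is not a failure here. The `audit --ci` row passes because both implementations return 1.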

Workloads

init scaffold: Creates a new apm.yml in an otherwise empty project directory.
targets json: Reads configured project targets from apm.yml and emits machine output.
script list: Reads apm.yml scripts and renders the runnable script inventory.
deps list: Scans apm_modules package directories and apm.lock.yaml metadata.
deps tree: Builds a dependency tree from apm.lock.yaml and installed package metadata.
install local package: Installs a local package and materializes lock/module state.
compile copilot target: Discovers local primitives and writes the Copilot target artifact.
pack output: Resolves local package contents and writes a distributable artifact.
run script: Executes a project script and writes the script's side-effect file.
audit hidden unicode: Scans a real installed file and fails on planted hidden Unicode.

harden Go migration completion gates

3aa11ef

mrjf merged commit e96b795 into main Jun 9, 2026
13 checks passed

mrjf deleted the codex/rock-solid-go-parity-gate branch June 9, 2026 21:40
