harden(go-migration): require real cutover evidence #116
Merged
Migration Benchmark Results
Migration CLI Benchmark

Includes fixture-backed commands that must read, write, execute, or fail against real project state. The installed-project fixture contains `apm.yml`, `apm.lock.yaml`, `apm_modules` packages, local `.apm` primitives, target directories, deployed prompt files, and sample source files. Max allowed Go/Python median ratio:
TL;DR
This PR changes the Go migration completion gate so it can no longer declare success from representative help output, obsolete Python tests, or placeholder mappings. It adds an explicit option parity gate, requires legacy Python tests to map to existing Go-only behavior tests, and wires strict coverage enforcement into the migration workflow. The important result is that the current migration now fails strict completion with concrete evidence instead of reporting “done.”
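The mapping requirement can be pictured with a short Go sketch. It is only an illustration, not the PR's code: it assumes mappings are plain lists of test names, and the sample test names, `discoverTests`, and `classify` are hypothetical. Only the `TestGoCutoverReal` prefix comes from this PR. A mapping counts as evidence only if the named function really exists in the parsed source and carries that prefix.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// realPrefix is the deletion-grade evidence prefix named in this PR.
const realPrefix = "TestGoCutoverReal"

// discoverTests parses Go source and returns the names of the top-level
// Test functions it actually declares (hypothetical helper).
func discoverTests(src string) (map[string]bool, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample_test.go", src, 0)
	if err != nil {
		return nil, err
	}
	found := map[string]bool{}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if ok && fn.Recv == nil && strings.HasPrefix(fn.Name.Name, "Test") {
			found[fn.Name.Name] = true
		}
	}
	return found, nil
}

// classify counts a mapped name only when the test exists and has the
// real-behavior prefix; stale or wrongly named mappings are rejected.
func classify(mapped []string, existing map[string]bool) (counted, rejected []string) {
	for _, name := range mapped {
		if existing[name] && strings.HasPrefix(name, realPrefix) {
			counted = append(counted, name)
		} else {
			rejected = append(rejected, name)
		}
	}
	return counted, rejected
}

// sampleSrc is made-up test source used only for this illustration.
const sampleSrc = `package apm

import "testing"

func TestGoCutoverRealConfigPersist(t *testing.T) {}
func TestHelpOutput(t *testing.T) {}
`

func main() {
	existing, err := discoverTests(sampleSrc)
	if err != nil {
		panic(err)
	}
	mapped := []string{
		"TestGoCutoverRealConfigPersist", // exists, right prefix
		"TestHelpOutput",                 // exists, but not real-behavior evidence
		"TestGoCutoverRealDeleted",       // right prefix, but stale (no such test)
	}
	counted, rejected := classify(mapped, existing)
	fmt.Println("counted:", counted)
	fmt.Println("rejected:", rejected)
}
```

Parsing the source rather than trusting the manifest means a renamed or deleted test stops counting as soon as it disappears.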
Important
This PR intentionally proves the migration is not deletion-grade ready yet; the report-mode workflow can still collect evidence without blocking non-crane PRs.
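The split between report mode and strict mode can be sketched as follows. This is a minimal illustration under stated assumptions: `gateResult`, `strictMode`, and `evaluate` are hypothetical names, and treating only the value `1` as "enabled" is an assumption. The environment variable name and the `134/273` figure are taken from this PR.

```go
package main

import (
	"fmt"
	"os"
)

// gateResult is a hypothetical record of one gate's counted evidence.
type gateResult struct {
	Name    string
	Covered int
	Total   int
}

// strictMode reports whether completion gates should block.
// Assumption: only the exact value "1" enables strict mode.
func strictMode(getenv func(string) string) bool {
	return getenv("APM_ENFORCE_COMPLETION_GATES") == "1"
}

// evaluate returns an error only when the gate is incomplete AND strict
// mode is on. In report mode the gap is printed but does not block.
// An empty inventory (Total == 0) is treated as incomplete.
func evaluate(g gateResult, strict bool) error {
	if g.Total > 0 && g.Covered >= g.Total {
		return nil
	}
	msg := fmt.Sprintf("%s: %d/%d covered", g.Name, g.Covered, g.Total)
	if strict {
		return fmt.Errorf("gate failed: %s", msg)
	}
	fmt.Println("report-only:", msg)
	return nil
}

func main() {
	g := gateResult{Name: "option_parity", Covered: 134, Total: 273}
	if err := evaluate(g, strictMode(os.Getenv)); err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
}
```

Run without the variable set, the sketch reports the gap and exits successfully. With `APM_ENFORCE_COMPLETION_GATES=1` the same evidence produces a non-zero exit.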
Problem (WHY)
Why these matter: the migration gate is supposed to transform generated progress into verifiable action, not trust labels or naming conventions. That matches the PROSE principle that “Grounding outputs in deterministic tool execution transforms probabilistic generation into verifiable action.” It also follows the Agent Skills guidance that agents “pattern-match well against concrete structures” and that validation should “do the work, run a validator ... fix any issues, and repeat until validation passes.”
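To make the "counted evidence, not labels" idea concrete, here is a hedged Go sketch of a readiness check that depends on an option parity ratio. The `score` struct, its field names, and the `10/10` surface figure are hypothetical simplifications, not the schema in `.crane/scripts/score.go`. The `134/273` figure is the option parity count reported in this PR.

```go
package main

import "fmt"

// score is a simplified, hypothetical stand-in for the scorer's schema.
type score struct {
	SurfaceParity float64 // help/surface parity ratio
	OptionParity  float64 // option parity ratio
}

// ratio converts counted evidence into a gate ratio.
// No inventory (total <= 0) yields 0, so missing evidence can never pass.
func ratio(covered, total int) float64 {
	if total <= 0 {
		return 0
	}
	return float64(covered) / float64(total)
}

// cutoverReady requires every ratio gate to be fully satisfied.
func (s score) cutoverReady() bool {
	return s.SurfaceParity >= 1.0 && s.OptionParity >= 1.0
}

func main() {
	s := score{
		SurfaceParity: ratio(10, 10),   // hypothetical: surface parity complete
		OptionParity:  ratio(134, 273), // option parity count reported in this PR
	}
	fmt.Printf("option_parity=%.3f cutover_ready=%t\n", s.OptionParity, s.cutoverReady())
}
```

Because readiness is derived from counts, full surface parity alone cannot mark the migration ready while options are still missing.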
Approach (WHAT)
- Add `option_parity` as an explicit scorer ratio gate and require it for deletion-grade readiness.
- `APM_ENFORCE_COMPLETION_GATES=1`.
- `--allow-obsolete-python-tests`.
- `TestGoCutoverReal...` tests.

Implementation (HOW)
- `.crane/scripts/score.go` -- Adds `OptionParity` to the score schema, parses the new `option_parity` gate event, and requires it for `cutover_ready`.
- `.github/workflows/migration-ci.yml` -- Exports `APM_ENFORCE_PYTHON_BEHAVIOR_CONTRACTS=1` for strict completion runs and only passes report-mode escape hatches outside strict mode.
- `cmd/apm/python_behavior_contracts_test.go` -- Counts every Python CLI option from the extracted command inventory, emits `option_parity`, and fails strict mode with the exact missing options.
- `cmd/apm/go_cutover_coverage_test.go` -- Discovers actual Go test functions, rejects stale mapping names, and only counts existing `TestGoCutoverReal...` mappings as final cutover evidence.
- `cmd/apm/real_behavior_test.go` -- Adds real temporary-project fixtures for persisted config, MCP manifests, marketplace mutation/validation, and runtime removal.
- `scripts/ci/python_behavior_contracts.py` and `tests/parity/test_python_behavior_contracts.py` -- Treat `python_tests.obsolete` as report-only debt and hard-fail strict coverage.
- `CUTOVER.md`, parity README text, and coverage manifest descriptions so the documented gate matches the enforced gate.

Diagrams
Legend: The diagram shows how this PR turns collected parity evidence into strict completion gates before the scorer can mark the migration ready.
```mermaid
flowchart LR
    subgraph Evidence[Evidence]
        GoEvents["go test events"]
        PyInventory["Python behavior inventory"]
        Coverage["coverage manifests"]
    end
    subgraph Gates[Strict gates]
        Option["option_parity"]
        Behavior["python_behavior_contracts"]
        Real["functional and state_diff"]
    end
    subgraph Score[Completion score]
        Scorer["score.go"]
        Ready["deletion_grade_ready"]
    end
    PyInventory --> Option
    Coverage --> Behavior
    GoEvents --> Real
    Option --> Scorer
    Behavior --> Scorer
    Real --> Scorer
    Scorer --> Ready
    classDef new stroke-dasharray: 5 5;
    class Option,Behavior,Real new;
```

Trade-offs
- `--allow-obsolete-python-tests` only for collection/reporting; rejected using it in strict mode.
- `TestGoCutoverReal...` as the deletion-grade evidence prefix; rejected counting Python-vs-Go completion or help tests as final proof.

Benefits
- `migration_score = 1.0` now requires option parity in addition to help/surface parity.
- `134/273` option parity, `17204/23771` behavior-backed mappings, and `20/26` real behavior fixtures.

Validation
- `uv run pytest tests/unit/test_crane_score.py tests/unit/test_migration_ci_workflow.py -q`:
- `uv run pytest tests/parity/test_python_behavior_contracts.py -q --tb=short`:
- `APM_PYTHON_BIN="$PWD/.venv/bin/apm" go test ./cmd/apm -run '^TestParityPythonOptionsFromSource$' -count=1`:

Expected strict failures proving the gate now blocks false completion
- `APM_ENFORCE_COMPLETION_GATES=1 APM_PYTHON_BIN="$PWD/.venv/bin/apm" go test ./cmd/apm -run '^TestParityPythonOptionsFromSource$' -count=1`:
- `go test ./cmd/apm -run '^TestGoCutover' -count=1`:
- `APM_ENFORCE_PYTHON_BEHAVIOR_CONTRACTS=1 uv run pytest tests/parity/test_python_behavior_contracts.py::test_python_contract_coverage_manifest_is_complete -q --tb=short`:

Additional checks:
Scenario Evidence
- `cmd/apm/python_behavior_contracts_test.go::TestParityPythonOptionsFromSource`
- `cmd/apm/go_cutover_coverage_test.go::TestGoCutoverPythonTestConversionCoverage`
- `tests/parity/test_python_behavior_contracts.py::test_python_contract_coverage_manifest_is_complete`
- `cmd/apm/real_behavior_test.go::TestGoCutoverRealFunctionalAndStateDiffContracts`
- `tests/unit/test_migration_ci_workflow.py::test_migration_ci_enforces_completion_for_crane_prs_and_explicit_manual_runs`

How to test
- Run `uv run pytest tests/unit/test_crane_score.py tests/unit/test_migration_ci_workflow.py -q` and expect all tests to pass.
- Run `APM_PYTHON_BIN="$PWD/.venv/bin/apm" go test ./cmd/apm -run '^TestParityPythonOptionsFromSource$' -count=1` and expect report mode to pass.
- Run `APM_ENFORCE_COMPLETION_GATES=1 APM_PYTHON_BIN="$PWD/.venv/bin/apm" go test ./cmd/apm -run '^TestParityPythonOptionsFromSource$' -count=1` and expect the `option_parity` hard gate to fail with missing options.
- Run `go test ./cmd/apm -run '^TestGoCutover' -count=1` and expect the cutover gate to fail with behavior-backed coverage and real-command fixture gaps.
- `enforce_completion=true` and expect strict coverage failures until the Go implementation actually closes the gaps.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>