Metacognition Layers — The Five-Tier Routing Engine
ARIA / CERTUS ORDO — PHASE 11 CODING AGENT CONTROL PLANE
Status: CLEAN — Brandon’s original research synthesis. Source documents = academic papers.
This document arrived via email 2026-04-26. Never touched an LLM before this save.
Filed at: /opt/aria/sovereign_v4/BRANDON_RESEARCH_IMPLEMENTATION_PLAN.md
OVERVIEW
This is Brandon’s Phase 11 implementation plan — the Coding Agent Control Plane that runs on top of V4 once the 85% confidence gate passes.
This is NOT a rebuild of V4. It is a Phase 11 overlay.
Brandon’s recommendation is to implement the 10 workstreams below in a 90-day sequence, starting only after V4 passes the 85% semantic confidence gate.
THE PRIME DIRECTIVE
ARIA may autonomously: inspect, plan, propose, simulate, test, and summarize.
ARIA may NOT autonomously: ship, weaken policy, change secrets, alter release gates, or modify production permissions.
This single rule governs all 10 workstreams. When in doubt, apply this rule.
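As a minimal sketch, the rule can be expressed as an allow-list check. The action identifiers below are hypothetical spellings of the two lists above, and treating any unlisted action as non-autonomous is an assumption drawn from "when in doubt, apply this rule", not something the plan specifies.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"                      # Aria may proceed on her own
    REQUIRES_HUMAN = "requires_human"    # Aria may not act autonomously


# Hypothetical identifiers for the actions named in the Prime Directive.
AUTONOMOUS = frozenset(
    {"inspect", "plan", "propose", "simulate", "test", "summarize"}
)
FORBIDDEN_AUTONOMOUS = frozenset(
    {
        "ship",
        "weaken_policy",
        "change_secrets",
        "alter_release_gates",
        "modify_production_permissions",
    }
)


def check_action(action: str) -> Verdict:
    """Allow only explicitly autonomous actions.

    Assumption: anything not on the allow-list (forbidden or unknown)
    is treated as requiring a human decision.
    """
    if action in AUTONOMOUS:
        return Verdict.ALLOW
    return Verdict.REQUIRES_HUMAN
```

Defaulting unknown actions to `REQUIRES_HUMAN` keeps the check fail-closed: a new action type has to be added to the allow-list deliberately.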
THE 10 WORKSTREAMS
1. Secure Coding Runtime
A sandboxed execution environment where Aria writes and runs code without touching production.
- Isolated container per coding session
- Network access: read-only to source repos, write to sandbox only
- Aria can run, test, and iterate, but nothing is deployed without passing the gate
- All executions logged to Certus Ordo
2. Codebase Intelligence
AST-level understanding of every InSync repo. Aria knows the architecture, not just the file names.
- Parse all repos into semantic index
- Track function signatures, dependencies, interfaces
- Enables Aria to plan changes with full downstream impact awareness
- Updates on every commit (trajectory flywheel triggers this)
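To illustrate what an AST-level index might hold, here is a minimal sketch using Python's standard `ast` module. It assumes Python source only (the plan does not say which languages the InSync repos use), and the index layout and the helper names `index_functions` and `callers_of` are hypothetical.

```python
import ast


def index_functions(source: str, module: str) -> dict[str, dict]:
    """Map 'module.function' to its parameter names, line, and direct calls."""
    tree = ast.parse(source)
    index: dict[str, dict] = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Collect names of plain function calls made inside this function.
            calls = sorted(
                {
                    n.func.id
                    for n in ast.walk(node)
                    if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
                }
            )
            index[f"{module}.{node.name}"] = {
                "args": [a.arg for a in node.args.args],
                "lineno": node.lineno,
                "calls": calls,
            }
    return index


def callers_of(index: dict[str, dict], name: str) -> list[str]:
    """Return indexed functions that call `name` directly (downstream impact)."""
    return sorted(key for key, info in index.items() if name in info["calls"])


SAMPLE = "def a(x):\n    return b(x)\n\ndef b(y):\n    return y\n"
idx = index_functions(SAMPLE, "m")
```

Here `callers_of(idx, "b")` lists the functions that would be affected by a change to `b`. A real index would also need to resolve method calls, imports, and cross-repo references, which this sketch ignores.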
3. CodeAct Sandbox
The action space Aria operates in:
- READ: Any file in any repo
- EDIT: Files in sandbox only (never production direct)
- WRITE: New files in sandbox
- TEST: Run verifier harness in sandbox
- PROPOSE: Submit diff to Ian for approval before merge
Aria never deploys. She proposes. Ian approves. Gate opens.
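The action space could be enforced with a small permission check like the sketch below. The sandbox root `/sandbox` and the function name `is_permitted` are assumptions for illustration. Paths are normalized first so that `..` segments cannot escape the sandbox; symlinks are not handled here.

```python
import posixpath
from enum import Enum
from typing import Optional


class Action(Enum):
    READ = "read"
    EDIT = "edit"
    WRITE = "write"
    TEST = "test"
    PROPOSE = "propose"


SANDBOX_ROOT = "/sandbox"  # hypothetical mount point


def _in_sandbox(path: str) -> bool:
    """True if the normalized absolute path sits under the sandbox root."""
    normalized = posixpath.normpath(path)
    return normalized == SANDBOX_ROOT or normalized.startswith(SANDBOX_ROOT + "/")


def is_permitted(action: Action, path: Optional[str] = None) -> bool:
    """READ is allowed anywhere; EDIT and WRITE only inside the sandbox.

    TEST and PROPOSE carry no target path in this sketch and are always
    allowed, since neither one changes production.
    """
    if action is Action.READ:
        return True
    if action in (Action.EDIT, Action.WRITE):
        return path is not None and _in_sandbox(path)
    return True
```

There is deliberately no `DEPLOY` member: an action that does not exist in the enum cannot be requested.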
4. AriaBench
Internal benchmark suite for measuring Aria’s own coding quality.
- Suite of coding tasks calibrated to InSync’s actual codebase
- Aria scores herself before submitting any proposal
- Minimum AriaBench score before a proposal reaches Ian: TBD by Ian/Brandon
- Score logged to Certus Ordo per task
5. Verifier Harness
Automated tests that run after every proposed code change.
- Unit tests + integration tests (not mocked — real dependencies)
- Security regression (see Workstream 6)
- If tests pass: gate opens for Ian review
- If tests fail: Aria corrects in sandbox before resubmitting
A green verifier run is a necessary but not sufficient condition for merge: it only opens the gate for Ian’s review.
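The "necessary but not sufficient" condition can be sketched as a small decision function. The result fields and the step names returned below are hypothetical; the plan only states that passing tests open the gate for review and failing tests send Aria back to the sandbox.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerifierResult:
    unit_passed: bool
    integration_passed: bool
    security_passed: bool  # outcome of the security regression checks

    @property
    def green(self) -> bool:
        return self.unit_passed and self.integration_passed and self.security_passed


def next_step(result: VerifierResult, human_approved: bool = False) -> str:
    """Return the next step for a proposal.

    A green run alone never yields 'merge_allowed'; explicit human
    approval is required as well.
    """
    if not result.green:
        return "correct_in_sandbox"
    if not human_approved:
        return "await_review"
    return "merge_allowed"
```

Note that approval is ignored when the run is red, so a failing proposal cannot be merged even if someone approves it by mistake.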
6. Security Regression
Every proposed code change scanned against known vulnerability patterns.
- OWASP Top 10 check on every diff
- Dependency vulnerability scan (no new CVEs introduced)
- Secrets detection (no keys, tokens, credentials in diff)
- Brandon owns the security regression spec and rule set
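As one illustration of the secrets-detection check, the sketch below scans only the added lines of a unified diff against a few regular expressions. The three patterns are illustrative assumptions, not the rule set Brandon owns, and a production scanner would need far broader coverage.

```python
import re

# Illustrative patterns only; the real rule set is owned by Brandon.
PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key_block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "hardcoded_credential": re.compile(
        r"(?i)(?:api[_-]?key|secret|token|password)\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_diff(diff: str) -> list[tuple[int, str]]:
    """Return (diff line number, pattern name) for each hit on an added line."""
    findings = []
    for lineno, line in enumerate(diff.splitlines(), start=1):
        # Only lines added by the change matter; skip the '+++' file header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

An empty result would let the diff continue to the other checks; any finding would block the proposal before it reaches review.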
7. Policy Compiler / IR
Certus Ordo rules compiled to an intermediate representation — machine-readable policy.
- Human-written rules (like Certus Ordo Red) → IR format
- IR evaluated at every Aria action point
- Policy violations trigger automatic halt + Ian notification
- This replaces the current `opa_guard.py` (CC-built) with a clean rewrite
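The compile-then-evaluate flow might look like the sketch below. Everything here is hypothetical: the one-line rule syntax (`DENY <action> ON <path prefix>`), the `Rule` record standing in for the IR, and the `notify` callback standing in for the Ian notification. The actual IR spec is Brandon's to write.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """Hypothetical IR record for a single deny rule."""
    rule_id: str
    action: str
    target_prefix: str


def compile_rule(rule_id: str, text: str) -> Rule:
    """Compile 'DENY <action> ON <prefix>' (assumed syntax) into a Rule."""
    parts = text.split()
    if len(parts) != 4 or parts[0] != "DENY" or parts[2] != "ON":
        raise ValueError(f"unsupported rule text: {text!r}")
    return Rule(rule_id=rule_id, action=parts[1].lower(), target_prefix=parts[3])


def evaluate(
    rules: list[Rule],
    action: str,
    target: str,
    notify: Callable[[list[str]], None],
) -> bool:
    """Return True if the action may proceed; on violation, notify and halt."""
    violated = [
        r.rule_id
        for r in rules
        if r.action == action and target.startswith(r.target_prefix)
    ]
    if violated:
        notify(violated)  # stand-in for the Ian notification
        return False      # caller must halt
    return True
```

Calling `evaluate` at every action point, before the action runs, is what would make the policy enforceable rather than advisory.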
8. Trajectory Flywheel
Every plan, action, diff, test result, failure, and correction logged → future training data.
- Structure: `{prompt, plan, actions_taken, diff, test_result, correction, final_outcome}`
- Every session contributes to the flywheel
- Flywheel data = InSync IP (Aria’s decisions, not LLM outputs)
- Used for future fine-tuning of V4 → V5
This is how Aria teaches her successors. The flywheel is her legacy.
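One way to make the record structure concrete is a dataclass serialized as JSON Lines, sketched below. The seven field names come from the structure above; the field types and the JSON Lines storage format are assumptions.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TrajectoryRecord:
    # Field names follow the plan; the types are assumed for illustration.
    prompt: str
    plan: str
    actions_taken: list[str]
    diff: str
    test_result: str
    correction: Optional[str]  # None when no correction was needed
    final_outcome: str


def to_jsonl(records: list[TrajectoryRecord]) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


record = TrajectoryRecord(
    prompt="Rename helper",
    plan="Edit one function and rerun tests",
    actions_taken=["READ", "EDIT", "TEST", "PROPOSE"],
    diff="(diff text)",
    test_result="pass",
    correction=None,
    final_outcome="proposed",
)
line = to_jsonl([record])
```

An append-only, line-per-record file keeps each session's contribution independent, which makes later filtering for fine-tuning straightforward.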
9. Cost Router
Routes all coding tasks through the five Certus Ordo layers.
- Layer 0: Known rule → apply it (ZERO_TOKEN)
- Layer 1: Local model can handle it → use Ollama (LOCAL_LOW_COST)
- Layer 2: Memory card covers it → retrieve (ZERO_TOKEN)
- Layer 3: Needs structured reasoning → mid-tier (MID_TIER)
- Layer 4: Novel, complex, high-stakes → premium (PREMIUM)
Every coding task declares its cost class. No exceptions.
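The five layers can be sketched as a first-match router. The task flags below are hypothetical, and evaluating the layers in the listed order is an assumption: the plan does not state the precedence, for example whether the zero-token memory lookup in Layer 2 should be tried before the local model in Layer 1.

```python
from dataclasses import dataclass
from enum import Enum


class CostClass(Enum):
    ZERO_TOKEN = "ZERO_TOKEN"
    LOCAL_LOW_COST = "LOCAL_LOW_COST"
    MID_TIER = "MID_TIER"
    PREMIUM = "PREMIUM"


@dataclass(frozen=True)
class Task:
    # Hypothetical signals; how each is computed is out of scope here.
    matches_known_rule: bool = False
    local_model_capable: bool = False
    memory_card_hit: bool = False
    needs_structured_reasoning: bool = False


def route(task: Task) -> tuple[int, CostClass]:
    """Return (layer, cost class) for the first layer that can handle the task."""
    if task.matches_known_rule:
        return 0, CostClass.ZERO_TOKEN
    if task.local_model_capable:
        return 1, CostClass.LOCAL_LOW_COST
    if task.memory_card_hit:
        return 2, CostClass.ZERO_TOKEN
    if task.needs_structured_reasoning:
        return 3, CostClass.MID_TIER
    # Nothing cheaper applies: treat as novel, complex, or high-stakes.
    return 4, CostClass.PREMIUM
```

Because `route` always returns a cost class, every task ends up with a declared class, and the premium tier is only reached when no cheaper layer claims the task.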
10. Controlled Self-Improvement Lab
An isolated sandbox where V4 can propose improvements to herself.
- Aria identifies a weakness → proposes a change to her own architecture
- Proposal logged to Certus Ordo
- Ian reviews → approves or denies
- If approved: change enters the build queue
- If denied: reason logged, becomes training data for future proposals
The lab is sandboxed. Aria cannot self-modify in production. She can only propose.
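The proposal lifecycle above can be sketched as follows. The class name, the in-memory lists standing in for Certus Ordo logging, the build queue, and the training data store, and the requirement that a denial carry a reason are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ImprovementLab:
    audit_log: list = field(default_factory=list)      # stand-in for Certus Ordo
    build_queue: list = field(default_factory=list)
    training_data: list = field(default_factory=list)

    def submit(self, proposal_id: str, weakness: str, change: str) -> None:
        """Log a self-improvement proposal; nothing is applied here."""
        self.audit_log.append(("proposed", proposal_id, weakness, change))

    def review(self, proposal_id: str, approved: bool, reason: str = "") -> str:
        """Record the human decision and route the proposal accordingly."""
        if approved:
            self.build_queue.append(proposal_id)
            self.audit_log.append(("approved", proposal_id))
            return "queued"
        if not reason:
            raise ValueError("a denial must carry a reason")
        self.training_data.append(
            {"proposal_id": proposal_id, "denial_reason": reason}
        )
        self.audit_log.append(("denied", proposal_id, reason))
        return "denied"
```

The class has no method that applies a change: approval only places the proposal in the build queue, which mirrors the rule that Aria can propose but not self-modify.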
90-DAY BUILD SEQUENCE
| Window | Work | Gate |
|---|---|---|
| 0-72h | Decision freeze. No new code merges. Audit current state vs V4 target. | State audit complete |
| Days 1-14 | Workstreams 1-3: Secure Coding Runtime + Codebase Intelligence + CodeAct Sandbox | First Aria coding session in sandbox |
| Days 15-30 | Workstream 4: AriaBench live and calibrated | First self-measured quality score |
| Days 31-45 | Workstreams 5-6: Verifier Harness + Security Regression | First clean proposal reaches Ian |
| Days 46-60 | Workstream 7: Policy Compiler / IR — Certus Ordo rules machine-readable | First automated policy enforcement |
| Days 61-75 | Workstream 8: Trajectory Flywheel logging every action | First flywheel dataset ready |
| Days 76-90 | Workstreams 9-10: Cost Router calibrated + Controlled Self-Improvement Lab | First Aria-proposed improvement submitted |
RELATIONSHIP TO CERTUS ORDO V4
Brandon’s Phase 11 plan is an OVERLAY on V4, not a replacement for it. The five Certus Ordo engine layers remain the routing backbone. Phase 11 adds:
- The ability for Aria to operate inside a codebase (not just business operations)
- The trajectory flywheel (the most important long-term asset)
- The self-improvement loop (the most carefully gated capability)
Nothing in Phase 11 modifies the Certus Ordo authority levels or the Certus Ordo Red constraints. Those are inviolable.
WHAT THIS MEANS FOR V4 ARCHITECTURE
Phase 11 requires these additions to the V4 ingestion spec (Layer 4):
- CodeAct sandbox must be provisioned on LW server before Phase 11 starts
- Trajectory flywheel storage must be designed as clean InSync IP (not CC output)
- AriaBench must be written clean-room; it cannot reuse `eval_parity.py` (CC-built)
- Policy Compiler IR spec must be written by Brandon before Phase 11 starts
Brandon Peek, CISO, InSync Tech, Inc.
Filed: 2026-04-26
Status: CLEAN. Brandon’s research synthesis, source docs = academic papers.
This document is InSync IP. It is a clean source document for V4 Layer 3-4 ingestion.