Pact - Contracts Before Code

Built for the age of AI-generated code

When agents write the code, the contracts, tests, and verification become the real product.

🏗️

Contract-First Decomposition

Tasks decompose into 2-7 components. Each gets a typed interface contract and executable tests before any implementation begins.

⚡

Parallel Implementation

Independent components implement concurrently. No agent waits on another. Semaphore-limited concurrency keeps costs predictable.

🏁

Competitive Agents

N agents race on the same component. Best implementation wins — scored by test pass rate and execution time. The contract is the judge.

🌍

Python, TypeScript & JavaScript

Generate contracts, stubs, and implementations in Python, TypeScript, or plain JavaScript ES6 modules with JSDoc. Vitest for TS/JS, pytest for Python.

🔌

Multi-Provider

Route different roles to different LLMs or coding agents. Anthropic, OpenAI, Gemini, Claude Code, and Codex are supported, including full iterative workspace sessions.

💰

Budget-Aware

Per-project spend tracking with content-aware token estimation. Set a budget, let agents work. Multi-window caps prevent runaway costs.

📋

Plan-First by Default

Stop after contracts and tests, then let the active Claude or Codex agent implement. Use --implement or pact build when Pact should own the code.

🤖

Claude + Codex Native

Shared coding-agent envelopes preserve Kindex context, durable task state, session identity, structured outputs, and contract reconciliation across Claude Code and Codex.

🧭

Adversarial Review Gate

pact review runs Advocate against the implementation and Pact's packaged Simulacrum against architecture and done claims, then persists auditable review artifacts.

🔍

Spec-Compliance Audit

Run pact audit after implementation to verify every requirement in your spec is covered. Get a gap report showing what's covered, partial, or missing.

🚨

Dysmemic Pressure Detection

The pipeline monitors its own coordination health. Detects the $50-planning-zero-output pattern, cascade failures, and budget stalls. Proposes remedies — you decide.

🔧

User-Controlled Remedies

When health degrades, Pact pauses and proposes fixes via FIFO directive. No silent config changes. The system never reduces its own degrees of freedom without asking.

🌀

Wavefront Scheduling

Dependency-driven fan-out. Each component advances through its own phase pipeline as soon as deps are satisfied. No phase-locked waiting.

🧠

Prompt Caching

Static prompt prefixes cached across API calls. 50-70% input token savings. Cache hit rates tracked in budget metrics. Research results persisted and reused across phases.

🎯

Hidden Acceptance Criteria

Goodhart tests: adversarial hidden tests the agent never sees. Catches hardcoded returns, missing validation, and invariants that hold only for visible inputs. Graduated-disclosure remediation on failure.

🛡️

Drift Detection

SHA256 baselines for contracts, tests, and implementations. Detects when artifacts change without version bumps. Staleness tracking classifies components as fresh, aging, or stale.

💡

Retrospective Learning

Post-run analysis: cost distribution, failure patterns, largest test suites, actionable lessons. Each run gets smarter from the last.

🛑

Contract Quality Gates

Anti-cliche enforcement flags vague contract language. Typed side-effect declarations. Optional performance budgets with p95 latency and Big-O constraints.

📐

Canonical Types with Validators

Contracts define data structures with domain-specific validators — range, regex, length, custom rules. Tests verify acceptance and rejection. Implementations render as Pydantic models, Zod schemas, or validated constructors. Not every field needs a validator — only those with domain semantics worth encoding.

🔄

Resume & Error Classification

Transient errors retry with backoff. Systemic failures pause with actionable recommendations. pact resume recovers from any failure without manual state editing.

🧩

Processing Register

Establishes the cognitive mode (rigorous-analytical, exploratory-generative, etc.) before any domain content. Contracts carry the register. Handoff protocol primes it. Health system monitors drift.

🎯

North-Star Validation

Checks that composed contracts actually fulfill the original task. Extracts action verbs from your spec and verifies coverage. Catches "all tests pass but the system can't do anything."

📤

Handoff Brief Inspector

pact handoff renders and validates what each agent actually sees. Check context fences, primer ordering, token budgets, and dependency coverage. Debug coordination at the prompt level.

✨

MCP Server

Built-in MCP server for Claude Code and other stdio MCP-compatible clients. Inspect status, validate contracts, check budgets, and resume runs from your agent. pip install pact-agents[mcp]

🧪

Smoke Test Generation

Mechanical smoke tests from AST analysis — no LLM required. pact adopt extracts every public module-level function signature and generates import + callable checks in tests/smoke/. Filters out methods, private functions, and nested functions.

🏗

Architectural Assessment

Mechanical codebase analysis for structural friction — no LLM required. pact assess detects hub dependencies, shallow modules, tight coupling (mutual imports + SCCs), scattered logic, and test coverage gaps. Uses Python ast and Tarjan's algorithm. Point it at any Python directory — no project setup needed.

📂

Fully Visible Projects

All project knowledge lives in the project tree — contracts, source, tests, decomposition, Goodhart tests, standards, learnings. When a teammate clones the repo, they see everything. .pact/ contains only ephemeral per-run state that gets regenerated.

🔎

Tool Index

Optional enrichment from ctags (symbol index), cscope (call graph), tree-sitter (full CST, error-tolerant, cross-language), and kindex (knowledge graph). Agents get richer context about class hierarchies, callers, existing project knowledge, and durable task state. All tools optional — graceful degradation. pip install pact-agents[analysis]

Condition	Pass Rate	Cost
Claude Code (single-shot)	79%	$0.60
Claude Code (iterative, 5x)	92%	$1.26
Pact	100%	~$13

Up and running in 60 seconds

Python 3.12+, two dependencies. pip install pact-agents and go. Generates Python, TypeScript, or JavaScript.

Describe your task in task.md, set your standards in sops.md, and let Pact decompose, contract, and test. The active Claude or Codex agent implements after the plan-only handoff; add --implement for a Pact-managed build.

The interview now confirms operational maturity, security, privacy, compliance, gating, testing, and monitoring up front. An AI can provide the same choices in build_spec.yaml or via pact init --spec. Build specs are tracked planning input, not a place for secrets or machine-local paths. If scope materially changes after interview, update the spec and rerun interview before regenerating contracts.

For higher-bar production work, add the optional production/ pack. It keeps trust assertions, control mapping, threat model, live validation, N/A, and done-gate evidence file-backed and machine-checkable without changing ordinary Pact runs. Its static layer checks secrets, dependency inventory, SBOM, and OpenAPI shape without invoking live browser or LLM probes. Set production_artifact_dir when the tracked pack should live elsewhere. The manifest fingerprint and validation age keep stale evidence from passing. Live validation remains required because the pack is a deployment gate, not a runtime substitute.

View on GitHub

# Install from PyPI
pip install pact-agents

# Or with all LLM backends
pip install pact-agents[all-backends]

# With code analysis tools (tree-sitter)
pip install pact-agents[analysis]

# With packaged Simulacrum review support
pip install pact-agents[review]

# Create a project
pact init my-project
# Or initialize from an AI-authored build spec
pact init my-project --spec ai-build-spec.yaml
# Edit my-project/task.md with your task
# Edit my-project/sops.md with your standards

# Plan-first default: contracts + tests, then agent handoff
pact run my-project

# Optional production-readiness pack
pact production init my-project
pact production fingerprint my-project
pact production validate my-project

# Opt into Pact-managed implementation
pact run my-project --implement

# Review implementation and done claim
pact review . --claim "All contract tests pass."

# Or build a single component
pact components my-project
pact build my-project auth_module

Contracts before code.
Tests as law.

LLMs are unreliable reviewers.
Tests are perfectly reliable judges.

Plan first. Implement deliberately. Verify mechanically.

Built for the age of AI-generated code

Contract-First Decomposition

Parallel Implementation

Competitive Agents

Python, TypeScript & JavaScript

Multi-Provider

Budget-Aware

Plan-First by Default

Claude + Codex Native

Adversarial Review Gate

Spec-Compliance Audit

Dysmemic Pressure Detection

User-Controlled Remedies

Wavefront Scheduling

Prompt Caching

Hidden Acceptance Criteria

Drift Detection

Retrospective Learning

Contract Quality Gates

Canonical Types with Validators

Resume & Error Classification

Processing Register

North-Star Validation

Handoff Brief Inspector

MCP Server

Smoke Test Generation

Architectural Assessment

Fully Visible Projects

Tool Index

Benchmarked on ICPC World Finals

One piece of a larger stack

Up and running in 60 seconds

Built on the ideas in Beyond Code

From contracts to running services: Baton

Contracts before code.Tests as law.

LLMs are unreliable reviewers.Tests are perfectly reliable judges.

Plan first. Implement deliberately. Verify mechanically.

Built for the age of AI-generated code

Contract-First Decomposition

Parallel Implementation

Competitive Agents

Python, TypeScript & JavaScript

Multi-Provider

Budget-Aware

Plan-First by Default

Claude + Codex Native

Adversarial Review Gate

Spec-Compliance Audit

Dysmemic Pressure Detection

User-Controlled Remedies

Wavefront Scheduling

Prompt Caching

Hidden Acceptance Criteria

Drift Detection

Retrospective Learning

Contract Quality Gates

Canonical Types with Validators

Resume & Error Classification

Processing Register

North-Star Validation

Handoff Brief Inspector

MCP Server

Smoke Test Generation

Architectural Assessment

Fully Visible Projects

Tool Index

Benchmarked on ICPC World Finals

One piece of a larger stack

Up and running in 60 seconds

Built on the ideas in Beyond Code

From contracts to running services: Baton

Contracts before code.
Tests as law.

LLMs are unreliable reviewers.
Tests are perfectly reliable judges.