Contributor Guide

This guide helps new contributors get started with the OpenAI Agents Python repository. It covers repo structure, how to test your work, available utilities, and guidelines for commits and PRs.

Views0
PublishedFeb 23, 2026

Loading actions...

5 minBeginnerpromptSingle file

Skill content

Main instructions and any bundled files for this skill.

markdown

Contributor Guide

This guide helps new contributors get started with the OpenAI Agents Python repository. It covers repo structure, how to test your work, available utilities, and guidelines for commits and PRs.

Location: AGENTS.md at the repository root.

Table of Contents

  1. Policies & Mandatory Rules
  2. Project Structure Guide
  3. Operation Guide

Policies & Mandatory Rules

Mandatory Skill Usage

$code-change-verification

Run $code-change-verification before marking work complete when changes affect runtime code, tests, or build/test behavior.

Run it when you change:

  • src/agents/ (library code) or shared utilities.
  • tests/ or add or modify snapshot tests.
  • examples/.
  • Build or test configuration such as pyproject.toml, Makefile, mkdocs.yml, docs/scripts/, or CI workflows.

You can skip $code-change-verification for docs-only or repo-meta changes (for example, docs/, .agents/, README.md, AGENTS.md, .github/), unless a user explicitly asks to run the full verification stack.

$openai-knowledge

When working on OpenAI API or OpenAI platform integrations in this repo (Responses API, tools, streaming, Realtime API, auth, models, rate limits, MCP, Agents SDK or ChatGPT Apps SDK), use $openai-knowledge to pull authoritative docs via the OpenAI Developer Docs MCP server (and guide setup if it is not configured).

$implementation-strategy

Before changing runtime code, exported APIs, external configuration, persisted schemas, wire protocols, or other user-facing behavior, use $implementation-strategy to decide the compatibility boundary and implementation shape. Judge breaking changes against the latest release tag, not unreleased branch-local churn. Interfaces introduced or changed after the latest release tag may be rewritten without compatibility shims unless they define a released or explicitly supported durable external state boundary, or the user explicitly asks for a migration path. Unreleased persisted formats on main may be renumbered or squashed before release when intermediate snapshots are intentionally unsupported.

$pr-draft-summary

Before every final response for a task that changed runtime code, tests, examples, build/test configuration, or docs with behavior impact, invoke $pr-draft-summary to generate the required PR summary block, branch suggestion, title, and draft description. Determine whether to invoke it from the changed files, not from a subjective assessment of change size.

Skip $pr-draft-summary only for trivial or conversation-only tasks, repo-meta/doc-only tasks without behavior impact, or when the user explicitly says not to include the PR draft block.

Producing the PR draft block is part of the local final handoff. It is required for eligible local-only or uncommitted changes and does not authorize creating a branch, committing, pushing, or opening a pull request.

ExecPlans

Call out compatibility risk early in your plan only when the change affects behavior shipped in the latest release tag or a released or explicitly supported durable external state boundary, and confirm the approach before implementing changes that could impact users.

Use an ExecPlan when work is multi-step, spans several files, involves new features or refactors, or is likely to take more than about an hour. Start with the template and rules in PLANS.md, keep milestones and living sections (Progress, Surprises & Discoveries, Decision Log, Outcomes & Retrospective) up to date as you execute, and rewrite the plan if scope shifts. Call out compatibility risk only when the plan changes behavior shipped in the latest release tag or a released or explicitly supported durable external state boundary. Do not treat branch-local interface churn or unreleased post-tag changes on main as breaking by default; prefer direct replacement over compatibility layers in those cases, and renumber or squash unreleased persisted schemas before release when the intermediate snapshots are intentionally unsupported. If you intentionally skip an ExecPlan for a complex task, note why in your response so reviewers understand the choice.

Public API Compatibility

Treat the parameter and dataclass field order of exported runtime APIs as a compatibility contract.

  • For public constructors (for example RunConfig, FunctionTool, AgentHookContext), preserve existing positional argument meaning. Do not insert new constructor parameters or dataclass fields in the middle of existing public order.
  • When adding a new optional public field/parameter, append it to the end whenever possible and keep old fields in the same order.
  • If reordering is unavoidable, add an explicit compatibility layer and regression tests that exercise the old positional call pattern.
  • Prefer keyword arguments at call sites to reduce accidental breakage, but do not rely on this to justify breaking positional compatibility for public APIs.
  • Treat intended import paths and __all__ membership as compatibility contracts. When adding or moving a public symbol, update the owning module, intended top-level or subpackage re-exports, and an import regression test. Keep top-level imports free of optional-dependency failures and runtime side effects; use lazy exports when needed.

Platform, Docs, and Security Review

  • Documentation is published to the live site, so coordinate SDK behavior changes and docs carefully. If docs describe behavior that is not released yet, either delay the docs change until the SDK release is available or split it into a follow-up PR.
  • Treat runnable docs snippets as API compatibility checks. Before adding OpenAI API, provider, Responses, Realtime, WebSocket, or SDK constructor examples, verify the shown arguments and call shape against the actual implementation.
  • Do not let untrusted sandbox manifests opt themselves out of host filesystem or base-directory boundaries. Escape hatches for local source materialization must be controlled by trusted application code at the call site, not by serialized manifest data.
  • When documenting sandbox or security grants, verify the actual implementation path enforces the grant or boundary. Do not claim a grant applies to LocalDir, LocalFile, archive extraction, or other materialization paths unless those paths actually consult it.
  • When redacting OpenAI tool, MCP, model, or provider payloads, consider traceback display, exception chaining, __context__, logs, and telemetry. Suppressing display with raise ... from None is not enough if the original exception object still carries sensitive input data.
  • For OpenAI platform or SDK-specific docs changes, prefer $openai-knowledge for authoritative platform behavior and inspect the local code path for SDK behavior. Do not rely on generic API assumptions when documenting Responses, Chat Completions, Realtime, tools, MCP, or provider adapters.
  • For Realtime tracing changes, read Realtime tracing architecture before proposing SDK spans. Realtime API server traces and Agents SDK client traces are separate; group_id can correlate them but does not create a shared trace hierarchy.

Project Structure Guide

Overview

The OpenAI Agents Python repository provides the Python Agents SDK, examples, and documentation built with MkDocs. Use uv run python ... for Python commands to ensure a consistent environment.

Repo Structure & Important Files

  • src/agents/: Core library implementation.
  • tests/: Test suite; see tests/README.md for snapshot guidance.
  • examples/: Sample projects showing SDK usage.
  • docs/: MkDocs documentation source; do not edit translated docs under docs/ja, docs/ko, or docs/zh (they are generated).
  • docs/scripts/: Documentation utilities, including translation and reference generation.
  • mkdocs.yml: Documentation site configuration.
  • Makefile: Common developer commands.
  • pyproject.toml, uv.lock: Python dependencies and tool configuration.
  • .github/PULL_REQUEST_TEMPLATE/pull_request_template.md: Pull request template to use when opening PRs.
  • .agents/references/: Durable SDK maintainer architecture references. Start with the reference map and open only the files relevant to the affected runtime boundary.
  • site/: Built documentation output.

Agents Core Runtime Guidelines

  • For Agent fields, cloning, dynamic instructions, enabled tools or handoffs, output schemas, run context wrappers, usage aggregation, or public-versus-internal agent identity, read Agent definition and run context.
  • src/agents/run.py is the runtime entrypoint (Runner, AgentRunner). Keep it focused on orchestration and public flow control. Put new runtime logic under src/agents/run_internal/ and import it into run.py.
  • When run.py grows, refactor helpers into run_internal/ modules (for example run_loop.py, turn_resolution.py, tool_execution.py, session_persistence.py) and leave only wiring and composition in run.py.
  • For turn accounting, guardrail ordering, handoffs, interruptions, cancellation, hooks, or streaming behavior, read Runner lifecycle. Keep streaming and non-streaming paths behaviorally aligned.
  • For new model output, tool call, approval, or run item variants, read Run item lifecycle and update every applicable processing, event, replay, persistence, tracing, and serialization surface.
  • For function-tool parameter schemas, Annotated or Field metadata, strict JSON schema conversion, or structured output schemas, read Function and output schema.
  • For function-tool naming, namespacing, lookup, approvals, tracing, or call-ID changes, read Tool identity and routing and use the canonical helpers in src/agents/_tool_identity.py instead of adding local normalization rules.
  • For function-tool planning, approval ordering, tool guardrails, concurrency, cancellation, timeouts, hooks, or failure conversion, read Tool execution lifecycle.
  • For local MCP connection ownership, MCPServerManager, request serialization, tool caching or filtering, transport retries, cancellation, or cleanup, read Local MCP server lifecycle.
  • For trace or span context, processors, export, flush, shutdown, sensitive data, or resumed trace state, read Tracing lifecycle.
  • For RealtimeSession lifecycle, background-task, handoff, listener, connection, or cleanup changes, read Realtime session lifecycle and verify both normal and failure-path resource ownership.
  • For VoicePipeline, streamed audio input, STT session ownership, TTS task ordering, voice lifecycle events, PCM framing, or voice tracing changes, read Voice pipeline lifecycle.
  • For server-managed conversation (conversation_id, previous_response_id, auto_previous_response_id), read Conversation state ownership before changing continuation, filtering, retry, compaction, handoffs, or resume behavior.
  • For client-managed session input, per-turn saves, retry rewind, backend atomicity, or compaction replacement, read Session persistence.
  • For model resolution, ModelSettings, provider adapters, Responses versus Chat Completions capabilities, request conversion, terminal events, transport reuse, or model retries, read Model and provider boundaries.
  • If the serialized RunState shape changes, read RunState schema and resume boundary and follow its release-boundary, schema-version, backward-read, and regression-test rules.
  • For sandbox session ownership, agent preparation, manifests, host-path materialization, snapshots, resume state, or cleanup, read Sandbox runtime boundary.

Operation Guide

Prerequisites

  • Python 3.10+.
  • uv installed for dependency management (uv sync) and uv run for Python commands.
  • make available to run repository tasks.

Development Workflow

  1. Sync with main and create a feature branch:
    git checkout -b feat/<short-description>
    
  2. If dependencies changed or you are setting up the repo, run make sync.
  3. Implement changes and add or update tests alongside code updates.
  4. Highlight compatibility or API risks in your plan before implementing changes that alter the latest released behavior or a released or explicitly supported durable external state boundary.
  5. Build docs when you touch documentation:
    make build-docs
    
  6. When $code-change-verification applies, run it to execute the full verification stack before marking work complete.
  7. Commit with concise, imperative messages; keep commits small and focused, then open a pull request.
  8. Before reporting eligible code changes as complete, invoke $pr-draft-summary as the final handoff step unless the task falls under the documented skip cases. Do not omit it based on perceived change size or because the work remains local or uncommitted.

Testing & Automated Checks

Before submitting changes, ensure relevant checks pass and extend tests when you touch code.

When $code-change-verification applies, run it to execute the required verification stack from the repository root. Rerun the full stack after applying fixes.

Unit tests and type checking

  • Run the full test suite:
    make tests
    
  • Run a focused test:
    uv run pytest -s -k <pattern>
    
  • Type checking:
    make typecheck
    

Snapshot tests

Some tests rely on inline snapshots; see tests/README.md for details. Re-run make tests after updating snapshots.

  • Fix snapshots:
    make snapshots-fix
    
  • Create new snapshots:
    make snapshots-create
    

Coverage

  • Generate coverage (fails if coverage drops below threshold):
    make coverage
    

Formatting, linting, and type checking

  • Formatting and linting use ruff; run make format (applies fixes) and make lint (checks only).
  • Type hints must pass make typecheck.
  • Write comments as full sentences ending with a period.
  • Imports are managed by Ruff and should stay sorted.

Mandatory local run order

When $code-change-verification applies, run the full sequence in order (or use the skill scripts):

make format
make lint
make typecheck
make tests

Utilities & Tips

  • Install or refresh development dependencies:
    make sync
    
  • Run tests against the oldest supported version (Python 3.10) in an isolated environment:
    UV_PROJECT_ENVIRONMENT=.venv_310 uv sync --python 3.10 --all-extras --all-packages --group dev
    UV_PROJECT_ENVIRONMENT=.venv_310 uv run --python 3.10 -m pytest
    
  • Documentation workflows:
    make build-docs      # build docs after editing docs
    make serve-docs      # preview docs locally
    make build-full-docs # run translations and build
    
  • Snapshot helpers:
    make snapshots-fix
    make snapshots-create
    
  • Use examples/ to see common SDK usage patterns.
  • Review Makefile for common commands and use uv run for Python invocations.
  • Explore docs/ and docs/scripts/ to understand the documentation pipeline.
  • Consult tests/README.md for test and snapshot workflows.
  • Check mkdocs.yml to understand how docs are organized.

Pull Request & Commit Guidelines

  • Use the template at .github/PULL_REQUEST_TEMPLATE/pull_request_template.md; include a summary, test plan, and issue number if applicable.
  • Add tests for new behavior when feasible and update documentation for user-facing changes.
  • Run make format, make lint, make typecheck, and make tests before marking work ready.
  • Commit messages should be concise and written in the imperative mood. Small, focused commits are preferred.

Review Process & What Reviewers Look For

  • ✅ Checks pass (make format, make lint, make typecheck, make tests).
  • ✅ Tests cover new behavior and edge cases.
  • ✅ Code is readable, maintainable, and consistent with existing style.
  • ✅ Public APIs and user-facing behavior changes are documented.
  • ✅ Examples are updated if behavior changes.
  • ✅ History is clean with a clear PR description.
Share: