Introduction

pctx generates LLM-ready context from your codebase. It packages source files into a single document, applying formatting, truncation, and filtering so the result fits comfortably in an AI assistant's context window.

Motivation

When working with modern AI coding harnesses, agents often need to make multiple sequential tool calls just to explore, read, and understand the layout of your project. This back-and-forth is slow, spends tokens on tool-call overhead, and can easily derail an agent’s train of thought before it even starts writing code.

pctx solves this by providing a unified, pre-packaged snapshot of your project’s context. By generating a single, intelligently filtered and truncated document, you can:

Reduce Latency: Eliminate the need for the AI to “poke around” your filesystem using search and read commands.
Improve Accuracy: Provide the AI with an immediate, holistic view of the project’s structure and relevant files.
Save Tokens: Filter out noise (like binaries, build artifacts, and vendor directories) and smartly truncate long files so you only pay for the context that matters.
Maintain Control: Keep sensitive or irrelevant information out of the context window using standard gitignore syntax.

Features

Smart file discovery: Respects .gitignore, excludes binary files, and filters common non-source directories
Multiple output formats: Markdown (default), XML, and plain text
Intelligent truncation: Preserves file head and tail when truncating large files
Flexible filtering: Include/exclude patterns with gitignore-style syntax
Multiple destinations: stdout, clipboard, or file output
JSON mode: Structured output for programmatic use and CI/CD integration
Stdin support: Read file lists from pipes for integration with other tools
Token estimation: Approximate token counts for various LLMs

Installation

You can install pctx from crates.io using Cargo:

cargo install pctx

Build from Source

Alternatively, you can build from source:

git clone https://github.com/mc-marcocheng/pctx
cd pctx
cargo build --release

The compiled binary will be available at target/release/pctx.

Usage

Quick Start

# Generate context for current directory
pctx

# Copy to clipboard
pctx --clipboard

# Write to file
pctx --output context.md

# JSON output for scripts
pctx --json

# Filter specific files
pctx --include "*.rs" --include "*.toml"
pctx --exclude "*.test.ts" --exclude "__tests__"

# Pipe file list from other tools
find . -name "*.rs" -mtime -7 | pctx --stdin
pctx files list --quiet | grep -v test | pctx --stdin

# Preview without generating
pctx --dry-run

# Include file tree in output
pctx --tree

# Disable truncation for full file contents
pctx --no-truncation

Basic Commands

# Default: generate context from current directory
pctx [OPTIONS] [PATHS...]

# List files that would be included
pctx files list [OPTIONS]

# Show file tree structure
pctx files tree [OPTIONS]

# Configuration management
pctx config show      # Show current config
pctx config init      # Create .pctx.toml
pctx config defaults  # List default excludes

# Generate shell completions
pctx completions bash
pctx completions zsh
pctx completions fish

Output Options

Flag	Description
`--clipboard`, `-c`	Copy output to system clipboard
`--output FILE`, `-o`	Write to file (use `--force` to overwrite)
`--format`, `-f`	Output format: `markdown`, `xml`, `plain`
`--tree`, `-t`	Include file tree at beginning of output
`--stats`, `-s`	Show statistics summary
`--json`	Structured JSON output (for scripts)
`--stdin`	Read file paths from stdin (one per line)

Filtering Options

Flag	Description
`--exclude PATTERN`, `-e`	Exclude files matching pattern (repeatable)
`--include PATTERN`, `-i`	Include only files matching pattern (repeatable)
`--hidden`	Include hidden files (starting with `.`)
`--no-default-excludes`	Disable built-in exclusions
`--no-gitignore`	Do not apply `.gitignore` rules
`--max-size KB`	Maximum file size in KB (default: 1024)
`--max-depth N`, `-d`	Limit directory recursion depth

Truncation Options

Flag	Description
`--no-truncation`	Disable all truncation
`--max-lines N`	Max lines per file before truncating (default: 500, 0 = unlimited)
`--head-lines N`	Lines to keep at file start (default: 20)
`--tail-lines N`	Lines to keep at file end (default: 10)
`--max-line-length N`	Max chars per line (default: 500, 0 = unlimited)
`--head-chars N`	Chars to keep at line start (default: 200)
`--tail-chars N`	Chars to keep at line end (default: 100)
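
The interaction between `--max-lines`, `--head-lines`, and `--tail-lines` can be pictured with a small Rust sketch. This is not pctx's actual code: the function name `truncate_lines` and the wording of the omission marker are made up for illustration, and the real tool may format the gap differently.

```rust
/// Keep the first `head` and last `tail` lines of `text` when it has more
/// than `max_lines` lines. `max_lines == 0` means "never truncate".
/// Hypothetical helper for illustration; not part of pctx's public API.
fn truncate_lines(text: &str, max_lines: usize, head: usize, tail: usize) -> String {
    let lines: Vec<&str> = text.lines().collect();

    // Return the text unchanged when truncation is disabled, when the file
    // is short enough, or when head + tail would already cover every line.
    if max_lines == 0 || lines.len() <= max_lines || head + tail >= lines.len() {
        return text.to_string();
    }

    let omitted = lines.len() - head - tail;
    let mut out: Vec<String> = lines[..head].iter().map(|l| l.to_string()).collect();
    // The marker text below is an assumption of this sketch.
    out.push(format!("... [{} lines omitted] ...", omitted));
    out.extend(lines[lines.len() - tail..].iter().map(|l| l.to_string()));
    out.join("\n")
}
```

Under these assumptions and the defaults listed above (500, 20, 10), a 1,000-line file would shrink to 31 lines: 20 from the start, one marker line, and 10 from the end.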

Stdin Mode

The --stdin flag allows reading file paths from standard input, enabling powerful integrations:

# Process only recently modified Rust files
find . -name "*.rs" -mtime -1 | pctx --stdin

# Process files from a list
cat files_to_review.txt | pctx --stdin

# Chain with pctx's own file listing
pctx files list --quiet | grep -v _test | pctx --stdin

# Use with git to process only changed files
git diff --name-only HEAD~5 | pctx --stdin

# Process files matching complex criteria
fd -e rs -e toml --changed-within 2weeks | pctx --stdin

When using --stdin:

Empty lines and whitespace-only lines are ignored
Non-existent files are skipped; a warning is shown in verbose mode
Directories in the input are expanded recursively
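
For readers who want to reproduce the first rule in their own tooling, here is a minimal Rust sketch of reading a path list. The helper name `read_path_list` is hypothetical, and trimming surrounding whitespace from each path is an assumption of this sketch rather than documented pctx behavior. Existence checks and directory expansion are left out.

```rust
use std::io::BufRead;
use std::path::PathBuf;

/// Collect candidate paths from a line-oriented reader, one path per line.
/// Empty and whitespace-only lines are skipped.
/// Hypothetical helper; pctx's own implementation may differ.
fn read_path_list<R: BufRead>(reader: R) -> std::io::Result<Vec<PathBuf>> {
    let mut paths = Vec::new();
    for line in reader.lines() {
        let line = line?;
        // Assumption: surrounding whitespace is not part of the path.
        let trimmed = line.trim();
        if trimmed.is_empty() {
            continue;
        }
        paths.push(PathBuf::from(trimmed));
    }
    Ok(paths)
}
```

To read from a real pipe, pass `std::io::stdin().lock()` as the reader.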

Configuration

Config File

Create a .pctx.toml file in your project root:

pctx config init

Example configuration:

# Patterns to exclude (in addition to defaults)
exclude = [
    "*.generated.ts",
    "vendor/",
    "__snapshots__",
]

# Patterns to include (if specified, only these are included)
include = [
    "*.rs",
    "*.toml",
]

# Truncation settings
max_lines = 500
head_lines = 20
tail_lines = 10
max_line_length = 500

Configuration is loaded from .pctx.toml in the current directory or any parent directory. If the config file exists but has syntax errors, a warning is printed and the file is skipped.

Configuration Precedence

Settings are applied in this order (highest priority first):

Command-line arguments
Config file (.pctx.toml)
Built-in defaults
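
One common way to implement this kind of layering in Rust is to model each source as a set of optional values and fall through from the most specific to the least specific. The sketch below assumes that approach. The `Layer` and `Resolved` types and the `resolve` function are hypothetical, while the field names and default values are taken from the truncation settings documented above.

```rust
/// One layer of settings; `None` means "not set at this layer".
/// Hypothetical type for illustration only.
#[derive(Clone, Copy, Default)]
struct Layer {
    max_lines: Option<usize>,
    head_lines: Option<usize>,
    tail_lines: Option<usize>,
}

/// Fully resolved settings after all layers are merged.
#[derive(Debug, PartialEq)]
struct Resolved {
    max_lines: usize,
    head_lines: usize,
    tail_lines: usize,
}

/// Command-line values win over config-file values, which win over defaults.
fn resolve(cli: Layer, file: Layer) -> Resolved {
    Resolved {
        max_lines: cli.max_lines.or(file.max_lines).unwrap_or(500),
        head_lines: cli.head_lines.or(file.head_lines).unwrap_or(20),
        tail_lines: cli.tail_lines.or(file.tail_lines).unwrap_or(10),
    }
}
```

With `--head-lines 50` on the command line and `max_lines = 1000`, `head_lines = 30` in `.pctx.toml`, this sketch resolves to 1000, 50, and 10 respectively.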

Default Exclusions

Common directories and files are excluded by default:

Version control: .git, .svn, .hg
Dependencies: node_modules, vendor, target, .venv
Build outputs: dist, build, out, bin, obj
IDE/Editor: .idea, .vscode, .vs
Caches: __pycache__, .cache, .pytest_cache
Lock files: package-lock.json, yarn.lock, Cargo.lock, etc.

See all defaults with: pctx config defaults

Pattern Syntax

Patterns follow gitignore-style syntax:

Pattern	Matches
`*.log`	All `.log` files
`test_*`	Files starting with `test_`
`**/tests/`	Any `tests` directory at any level
`/src/generated`	`src/generated` at root only
`docs/`	`docs` directory

Limitations:

Negation patterns (!pattern) are not supported; using one produces a warning
Character classes ([abc]) depend on glob crate support
Some edge cases with **/ patterns may differ from git behavior

Architecture & Developer Guide

pctx is built in Rust with a modular design aimed at fast file discovery and robust content formatting. It provides both a command-line interface and a library crate that you can use programmatically.

Core Modules

The library (src/lib.rs) is divided into several focused modules:

1. `cli`

Defines the command-line interface using the clap crate. It parses commands, options (like --max-depth, --json), and truncation thresholds. The CLI provides structured output (--json mode) for programmatic consumers and human-readable output otherwise.

2. `config`

Manages configuration resolution. Settings are prioritized in the following order:

Command-line arguments.
Configuration file (.pctx.toml).
Built-in defaults.

3. `scanner`

Responsible for file discovery and validation.

Uses robust tools like walkdir to traverse directories recursively.
Respects depth limits.
Validates that paths exist and are accessible.

4. `filter`

The filtering engine handles .gitignore logic and custom gitignore-style glob patterns.

Applies standard built-in exclusions (e.g., node_modules, .git, target).
Excludes binary files or overly large files.
Evaluates --include and --exclude rules dynamically.
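
How pctx decides that a file is binary is not described here. A widely used heuristic, similar to the one Git applies, is to look for a NUL byte near the start of the file. The Rust sketch below assumes that heuristic. Both function names and the 8 KiB sample size are choices made for this example, not pctx internals.

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

/// Treat a byte sample as binary when it contains a NUL byte.
/// Heuristic only: UTF-16 text, for example, would be misclassified.
fn looks_binary(sample: &[u8]) -> bool {
    sample.contains(&0)
}

/// Read up to 8 KiB from the start of `path` and apply the heuristic.
/// The sample size is an assumption of this sketch.
fn file_looks_binary(path: &Path) -> std::io::Result<bool> {
    let mut buf = Vec::new();
    File::open(path)?.take(8192).read_to_end(&mut buf)?;
    Ok(looks_binary(&buf))
}
```

Sampling only the first few kilobytes keeps the check cheap even for very large files.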

5. `content` & `truncation`

Once files are discovered and filtered, these modules process the raw text.

Reads file contents efficiently.
Applies line and character truncation thresholds so the output fits within an LLM’s context window.
Automatically preserves the “head” and “tail” of files or long lines to provide maximum semantic context.
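
Head-and-tail preservation for a single long line might look like the Rust sketch below. It counts characters rather than bytes so that multi-byte UTF-8 text is never split mid-character. The function name `truncate_line` and the marker format are hypothetical, and pctx's real output may differ.

```rust
/// Shorten one line to its first `head` and last `tail` characters when it
/// is longer than `max_len` characters. `max_len == 0` disables the limit.
/// Hypothetical helper for illustration only.
fn truncate_line(line: &str, max_len: usize, head: usize, tail: usize) -> String {
    // Work on chars, not bytes, so slicing cannot land inside a code point.
    let chars: Vec<char> = line.chars().collect();
    if max_len == 0 || chars.len() <= max_len || head + tail >= chars.len() {
        return line.to_string();
    }

    let omitted = chars.len() - head - tail;
    let start: String = chars[..head].iter().collect();
    let end: String = chars[chars.len() - tail..].iter().collect();
    // The marker text is an assumption of this sketch.
    format!("{}...[{} chars omitted]...{}", start, omitted, end)
}
```

This is useful for minified bundles or embedded data blobs, where a single line can run to many thousands of characters.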

6. `output`

Handles the final formatting and writing.

Formats context into Markdown, XML, or Plain text.
Outputs via stdout, clipboard (arboard), or file output.
Generates JSON structures for the --json option.
Optionally produces a filesystem tree via the tree module.

7. `stats`

Calculates usage statistics (e.g., total files, token estimation, total bytes) to help users gauge the size of the generated context.
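
The estimator pctx uses is not specified here. As a rough illustration, a common rule of thumb for English text and code is about four characters per token, which is the assumption behind the Rust sketch below. The `Stats` struct, its field names, and both functions are hypothetical.

```rust
/// Summary figures for a generated context. Hypothetical type.
#[derive(Debug, PartialEq)]
struct Stats {
    files: usize,
    bytes: usize,
    estimated_tokens: usize,
}

/// Rough estimate: one token per four characters, rounded up.
/// This ratio is a rule of thumb, not a model-specific tokenizer.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Aggregate counts over the contents of every included file.
fn summarize(contents: &[&str]) -> Stats {
    Stats {
        files: contents.len(),
        bytes: contents.iter().map(|c| c.len()).sum(),
        estimated_tokens: contents.iter().map(|c| estimate_tokens(c)).sum(),
    }
}
```

An estimate like this is only good for sizing a prompt against a context limit. Exact counts require the tokenizer of the target model.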
