Introduction
pctx generates LLM-ready context from your codebase. It packages source files into a single document, applying formatting, filtering, and truncation so the result is compact enough for an AI assistant to consume.
Motivation
When working with modern AI coding harnesses, agents often need to make multiple, sequential tool calls just to explore, read, and understand the layout of your project. This back-and-forth communication is slow, consumes excess tokens for system overhead, and can easily derail an agent’s train of thought before it even starts writing code.
pctx solves this by providing a unified, pre-packaged snapshot of your project’s context. By generating a single, intelligently filtered and truncated document, you can:
- Reduce Latency: Eliminate the need for the AI to “poke around” your filesystem using search and read commands.
- Improve Accuracy: Provide the AI with an immediate, holistic view of the project’s structure and relevant files.
- Save Tokens: Filter out noise (like binaries, build artifacts, and vendor directories) and smartly truncate long files so you only pay for the context that matters.
- Maintain Control: Keep sensitive or irrelevant information out of the context window using standard gitignore syntax.
Features
- Smart file discovery: Respects .gitignore, excludes binary files, and filters common non-source directories
- Multiple output formats: Markdown (default), XML, and plain text
- Intelligent truncation: Preserves file head and tail when truncating large files
- Flexible filtering: Include/exclude patterns with gitignore-style syntax
- Multiple destinations: stdout, clipboard, or file output
- JSON mode: Structured output for programmatic use and CI/CD integration
- Stdin support: Read file lists from pipes for integration with other tools
- Token estimation: Approximate token counts for various LLMs
Installation
You can install pctx from crates.io using Cargo:
cargo install pctx
Build from Source
Alternatively, you can build from source:
git clone https://github.com/mc-marcocheng/pctx
cd pctx
cargo build --release
The compiled binary will be available at target/release/pctx.
Usage
Quick Start
# Generate context for current directory
pctx
# Copy to clipboard
pctx --clipboard
# Write to file
pctx --output context.md
# JSON output for scripts
pctx --json
# Filter specific files
pctx --include "*.rs" --include "*.toml"
pctx --exclude "*.test.ts" --exclude "__tests__"
# Pipe file list from other tools
find . -name "*.rs" -mtime -7 | pctx --stdin
pctx files list --quiet | grep -v test | pctx --stdin
# Preview without generating
pctx --dry-run
# Include file tree in output
pctx --tree
# Disable truncation for full file contents
pctx --no-truncation
Basic Commands
# Default: generate context from current directory
pctx [OPTIONS] [PATHS...]
# List files that would be included
pctx files list [OPTIONS]
# Show file tree structure
pctx files tree [OPTIONS]
# Configuration management
pctx config show # Show current config
pctx config init # Create .pctx.toml
pctx config defaults # List default excludes
# Generate shell completions
pctx completions bash
pctx completions zsh
pctx completions fish
Output Options
| Flag | Description |
|---|---|
| --clipboard, -c | Copy output to system clipboard |
| --output FILE, -o | Write to file (use --force to overwrite) |
| --format, -f | Output format: markdown, xml, plain |
| --tree, -t | Include file tree at beginning of output |
| --stats, -s | Show statistics summary |
| --json | Structured JSON output (for scripts) |
| --stdin | Read file paths from stdin (one per line) |
Filtering Options
| Flag | Description |
|---|---|
| --exclude PATTERN, -e | Exclude files matching pattern (repeatable) |
| --include PATTERN, -i | Include only files matching pattern (repeatable) |
| --hidden | Include hidden files (starting with .) |
| --no-default-excludes | Disable built-in exclusions |
| --no-gitignore | Ignore .gitignore rules |
| --max-size KB | Maximum file size in KB (default: 1024) |
| --max-depth N, -d | Limit directory recursion depth |
Truncation Options
| Flag | Description |
|---|---|
| --no-truncation | Disable all truncation |
| --max-lines N | Max lines per file before truncating (default: 500, 0 = unlimited) |
| --head-lines N | Lines to keep at file start (default: 20) |
| --tail-lines N | Lines to keep at file end (default: 10) |
| --max-line-length N | Max chars per line (default: 500, 0 = unlimited) |
| --head-chars N | Chars to keep at line start (default: 200) |
| --tail-chars N | Chars to keep at line end (default: 100) |
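To make the interaction between --max-lines, --head-lines, and --tail-lines concrete, here is a minimal sketch of head/tail line truncation. It is an illustration under assumptions, not pctx's actual implementation: the function name truncate_lines and the wording of the omission marker are hypothetical.

```rust
/// Keep the first `head` and last `tail` lines when `text` has more than
/// `max_lines` lines. `max_lines == 0` means unlimited, mirroring the flag table.
/// The marker line is a hypothetical placeholder, not pctx's real output.
pub fn truncate_lines(text: &str, max_lines: usize, head: usize, tail: usize) -> String {
    let lines: Vec<&str> = text.lines().collect();
    // Leave the text alone if it is unlimited, under the limit,
    // or if head + tail would already cover every line.
    if max_lines == 0 || lines.len() <= max_lines || head + tail >= lines.len() {
        return text.to_string();
    }
    let omitted = lines.len() - head - tail;
    let mut out: Vec<String> = Vec::with_capacity(head + tail + 1);
    out.extend(lines[..head].iter().map(|l| l.to_string()));
    out.push(format!("... [{} lines omitted] ...", omitted));
    out.extend(lines[lines.len() - tail..].iter().map(|l| l.to_string()));
    out.join("\n")
}
```

With the documented defaults, a 2,000-line file would be reduced to its first 20 lines, one marker line, and its last 10 lines.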
Stdin Mode
The --stdin flag allows reading file paths from standard input, enabling powerful integrations:
# Process only recently modified Rust files
find . -name "*.rs" -mtime -1 | pctx --stdin
# Process files from a list
cat files_to_review.txt | pctx --stdin
# Chain with pctx's own file listing
pctx files list --quiet | grep -v _test | pctx --stdin
# Use with git to process only changed files
git diff --name-only HEAD~5 | pctx --stdin
# Process files matching complex criteria
fd -e rs -e toml --changed-within 2weeks | pctx --stdin
When using --stdin:
- Empty lines and whitespace-only lines are ignored
- Nonexistent files are skipped; a warning is shown in verbose mode
- Directories in the input are expanded recursively
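The first rule above can be sketched in a few lines of standard-library Rust. This is a rough illustration, not pctx's code: the name read_paths is hypothetical, and trimming surrounding whitespace from each path is an assumption.

```rust
use std::io::BufRead;

/// Collect candidate paths from a line-oriented reader, dropping empty and
/// whitespace-only lines. Trimming each remaining line is an assumption.
pub fn read_paths<R: BufRead>(reader: R) -> Vec<String> {
    reader
        .lines()
        .filter_map(|line| line.ok()) // skip lines that fail to decode
        .map(|line| line.trim().to_string())
        .filter(|line| !line.is_empty())
        .collect()
}
```

In a real program the reader would typically be std::io::stdin().lock(); checking whether each path exists and expanding directories would follow as separate steps.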
Configuration
Config File
Create a .pctx.toml file in your project root:
pctx config init
Example configuration:
# Patterns to exclude (in addition to defaults)
exclude = [
"*.generated.ts",
"vendor/",
"__snapshots__",
]
# Patterns to include (if specified, only these are included)
include = [
"*.rs",
"*.toml",
]
# Truncation settings
max_lines = 500
head_lines = 20
tail_lines = 10
max_line_length = 500
Configuration is loaded from .pctx.toml in the current directory or any parent directory. If the config file exists but has syntax errors, a warning is printed and the file is skipped.
Configuration Precedence
Settings are applied in this order (highest priority first):
- Command-line arguments
- Config file (.pctx.toml)
- Built-in defaults
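One way to picture this precedence is as a chain of optional values, where the first one present wins. The sketch below assumes each layer exposes a setting as an Option; the helper name resolve is hypothetical and is not taken from pctx's source.

```rust
/// Resolve one setting: the command-line value wins, then the config file
/// value, then the built-in default.
pub fn resolve<T>(cli: Option<T>, file: Option<T>, default: T) -> T {
    cli.or(file).unwrap_or(default)
}
```

For example, if --max-lines is not passed but .pctx.toml sets max_lines = 300, then resolve(None, Some(300), 500) yields 300.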
Default Exclusions
Common directories and files are excluded by default:
- Version control: .git, .svn, .hg
- Dependencies: node_modules, vendor, target, .venv
- Build outputs: dist, build, out, bin, obj
- IDE/Editor: .idea, .vscode, .vs
- Caches: __pycache__, .cache, .pytest_cache
- Lock files: package-lock.json, yarn.lock, Cargo.lock, etc.
See all defaults with: pctx config defaults
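As a rough illustration of how name-based exclusions like these can be applied, the sketch below checks every component of a path against a small list. The list is only a subset of the defaults named above, and the function name is_default_excluded is hypothetical; pctx's real matching may work differently.

```rust
use std::path::{Component, Path};

/// A small subset of the default exclusions listed in this README.
const DEFAULT_EXCLUDES: &[&str] = &[".git", "node_modules", "target", "dist", "__pycache__"];

/// True if any component of `path` is exactly one of the excluded names.
pub fn is_default_excluded(path: &Path) -> bool {
    path.components().any(|component| match component {
        Component::Normal(name) => name
            .to_str()
            .map_or(false, |n| DEFAULT_EXCLUDES.iter().any(|e| *e == n)),
        _ => false, // root, prefix, "." and ".." never match
    })
}
```

Matching whole components rather than substrings keeps a file such as src/targeted.rs from being excluded just because its name contains "target".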
Pattern Syntax
Patterns follow gitignore-style syntax:
| Pattern | Matches |
|---|---|
| *.log | All .log files |
| test_* | Files starting with test_ |
| **/tests/** | Any tests directory at any level |
| /src/generated | src/generated at root only |
| docs/ | docs directory |
Limitations:
- Negation patterns (!pattern) are not supported and will show a warning
- Character classes ([abc]) depend on glob crate support
- Some edge cases with **/ patterns may differ from git behavior
Architecture & Developer Guide
pctx is built in Rust with a modular design aimed at fast file discovery and robust content formatting. It provides both a command-line interface and a library crate that you can use programmatically.
Core Modules
The library (src/lib.rs) is divided into several focused modules:
1. cli
Defines the command-line interface using the clap crate. It parses commands, options (like --max-depth, --json), and truncation thresholds. The CLI emits structured output in --json mode for programmatic consumers and human-readable output otherwise.
2. config
Manages configuration resolution. Settings are prioritized in the following order:
- Command-line arguments.
- Configuration file (.pctx.toml).
- Built-in defaults.
3. scanner
Responsible for file discovery and validation.
- Uses robust tools like walkdir to traverse directories recursively.
- Respects depth limits.
- Validates that paths exist and are accessible.
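The idea of depth-limited traversal can be sketched with the standard library alone. Note that this deliberately uses std::fs::read_dir instead of walkdir to stay dependency-free, and that the function name walk and the exact depth semantics (files directly in the starting directory are at depth 0) are assumptions for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect files under `dir`. With `max_depth = Some(n)`, the walk
/// descends at most `n` directory levels below the starting directory.
pub fn walk(
    dir: &Path,
    max_depth: Option<usize>,
    depth: usize,
    out: &mut Vec<PathBuf>,
) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Only descend while the current depth is below the limit.
            if max_depth.map_or(true, |limit| depth < limit) {
                walk(&path, max_depth, depth + 1, out)?;
            }
        } else {
            out.push(path);
        }
    }
    Ok(())
}
```

A caller would start with walk(root, limit, 0, &mut files); an inaccessible directory surfaces as an io::Error rather than being silently ignored.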
4. filter
The filtering engine handles .gitignore logic and custom gitignore-style glob patterns.
- Applies standard built-in exclusions (e.g., node_modules, .git, target).
- Excludes binary files or overly large files.
- Evaluates --include and --exclude rules dynamically.
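This README does not say how binary files are detected. A common heuristic, shown here purely as an assumption, is to look for a NUL byte near the start of the file. The size check mirrors the --max-size flag, taking 1 KB as 1024 bytes, which is also an assumption. Both function names are hypothetical.

```rust
/// Heuristic (an assumption, not confirmed as pctx's method): treat content
/// as binary if a NUL byte appears within the first 8 KiB.
pub fn looks_binary(sample: &[u8]) -> bool {
    sample.iter().take(8192).any(|&b| b == 0)
}

/// True if a file of `size_bytes` exceeds `max_size_kb` kilobytes,
/// taking 1 KB as 1024 bytes.
pub fn too_large(size_bytes: u64, max_size_kb: u64) -> bool {
    size_bytes > max_size_kb * 1024
}
```

Sampling only the start of the file keeps the check cheap, at the cost of misclassifying unusual files whose first NUL byte appears later.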
5. content & truncation
Once files are discovered and filtered, this module processes the raw text.
- Reads file contents efficiently.
- Applies line and character truncation thresholds to ensure LLM prompt windows aren’t overwhelmed.
- Automatically preserves the “head” and “tail” of files or long lines to provide maximum semantic context.
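Head/tail preservation for a single long line can be sketched as follows. The parameters correspond to --max-line-length, --head-chars, and --tail-chars; the function name truncate_line and the "..." marker are hypothetical, and pctx's real marker may differ.

```rust
/// Shorten one line to its first `head` and last `tail` characters when it is
/// longer than `max_len` characters. `max_len == 0` means unlimited.
pub fn truncate_line(line: &str, max_len: usize, head: usize, tail: usize) -> String {
    // Work on chars, not bytes, so multi-byte UTF-8 text is never split mid-character.
    let chars: Vec<char> = line.chars().collect();
    if max_len == 0 || chars.len() <= max_len || head + tail >= chars.len() {
        return line.to_string();
    }
    let start: String = chars[..head].iter().collect();
    let end: String = chars[chars.len() - tail..].iter().collect();
    format!("{}...{}", start, end)
}
```

Counting characters instead of bytes matters for source files containing non-ASCII identifiers or string literals, where slicing by byte index could panic.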
6. output
Handles the final formatting and writing.
- Formats context into Markdown, XML, or Plain text.
- Outputs via stdout, clipboard (arboard), or file output.
- Generates JSON structures for the --json option.
- Optionally produces a filesystem tree via the tree module.
7. stats
Calculates usage statistics (e.g., total files, token estimation, total bytes) to help users gauge the size of the generated context.
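A minimal sketch of such a summary is shown below. The struct and function names are hypothetical, and the token estimate uses a rough rule of thumb of about four bytes per token, which is an assumption here and not necessarily how pctx estimates tokens for any particular model.

```rust
/// Hypothetical summary of a generated context.
#[derive(Debug, Default, PartialEq)]
pub struct Stats {
    pub files: usize,
    pub bytes: usize,
    pub est_tokens: usize,
}

/// Summarize a set of file contents. The estimate assumes roughly 4 bytes per
/// token (rounded up); real tokenizers vary by model and by language.
pub fn summarize(contents: &[&str]) -> Stats {
    let bytes: usize = contents.iter().map(|c| c.len()).sum();
    Stats {
        files: contents.len(),
        bytes,
        est_tokens: (bytes + 3) / 4,
    }
}
```

An estimate like this is only useful for gauging order of magnitude, for example whether a context is closer to 5,000 or 500,000 tokens.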