Chapter I · AI agent stack

Compaction & Context Tuning

Tested on
OpenClaw 2026.4.x with GPT 5.5 (main), Claude Code tmux relay for review escalation, and ACP compatibility; applies to any main model
Last updated
2026-06-05

How to configure OpenClaw’s compaction, memory flush, context pruning, and session search so your agent doesn’t lose its mind (or personality) during long sessions.

The Problem

Long sessions kill context. Here’s the progression:

  1. Early in session: Agent follows all your AGENTS.md rules, maintains personality, remembers what you said 10 messages ago.
  2. Mid session: Context window fills up. Compaction kicks in and summarizes older exchanges. Some nuance gets lost.
  3. Late session: Behavioral rules from AGENTS.md get compacted away. Personality drifts. The agent goes generic. Critical decisions from earlier in the session are gone.

The default compaction settings are conservative. They protect the model from running out of context, but they don’t protect your context from being destroyed. These five config changes fix that.

Important: Use config.patch for all of these. Never config.apply unless you’re intentionally replacing your entire config file.

Pre-flight: Schema Validation

Before patching anything, confirm the fields exist in your OpenClaw version:

config.schema.lookup → agents.defaults.compaction
config.schema.lookup → memory

If a field doesn’t exist in the schema, your OpenClaw version may not support it yet. Check the docs or update.


Step 1: Compaction Tuning

What it does: Controls how aggressively OpenClaw summarizes older conversation history to free up context window space.

Path: agents.defaults.compaction

{
  "mode": "safeguard",
  "keepRecentTokens": 20000,
  "recentTurnsPreserve": 4,
  "maxHistoryShare": 0.7,
  "reserveTokens": 30000
}

What each field does:

FieldValueWhy
mode"safeguard"Only compacts when context pressure is real, not preemptively
keepRecentTokens20000Protects the last 20K tokens from summarization
recentTurnsPreserve4Always keeps the last 4 full exchanges intact
maxHistoryShare0.7History can use up to 70% of the context window
reserveTokens30000Keeps 30K tokens free for the model’s response and tool use

Tuning notes:


Step 2: Memory Flush

What it does: Before compaction destroys context, memory flush tells the agent to write important information to disk. Think of it as an auto-save before the game overwrites your checkpoint.

Path: agents.defaults.compaction.memoryFlush

{
  "enabled": true,
  "softThresholdTokens": 32000,
  "forceFlushTranscriptBytes": "2mb",
  "prompt": "Write a durable session note to memory/YYYY-MM-DD.md (use today's date). Capture: decisions made, corrections given, constraints or preferences learned, open questions with owners, new projects or resources discovered, any state that would break the plan if forgotten. If nothing meaningful happened, write NO_FLUSH.",
  "systemPrompt": "Be terse. Prefer bullet points. Do not rewrite the conversation."
}

What each field does:

FieldValueWhy
enabledtrueTurns on auto-flush
softThresholdTokens32000Triggers flush when context exceeds 32K tokens
forceFlushTranscriptBytes"2mb"Forces flush when raw transcript exceeds 2MB
prompt(see above)Tells the agent exactly what to capture
systemPrompt(see above)Keeps the flush output concise

Tuning the prompt: The default prompt above captures the most critical session state. Customize it based on what matters to your workflow:

Tuning the threshold:


Step 3: Post-Compaction Rule Re-injection

What it does: After compaction summarizes old context, your behavioral rules from AGENTS.md may get compressed or lost. This setting re-injects specific sections from AGENTS.md after every compaction, so your agent’s personality and operating rules survive indefinitely.

Path: agents.defaults.compaction.postCompactionSections

First, check what sections exist in your AGENTS.md:

grep -n '^## \|^### ' AGENTS.md

Then pick the sections that contain critical behavioral rules:

["Every Session", "Memory", "Safety"]

How to choose sections:

Why this matters: Without re-injection, a 4-hour session will gradually lose your agent’s personality. It starts following instructions loosely, forgets writing rules (like “no em dashes”), and reverts to generic assistant behavior. Re-injection prevents this drift.

Sizing note: Each re-injected section costs tokens on every turn after compaction. Keep the total under ~2000 tokens. If a section is too large, extract the critical rules into a dedicated smaller section.


Step 4: Context Pruning

What it does: Trims old, bloated tool outputs (web fetches, file reads, large API responses) from your context window. These outputs served their purpose when they were fresh but are dead weight 30 minutes later.

Path: agents.defaults.contextPruning

{
  "mode": "cache-ttl",
  "ttl": "5m",
  "keepLastAssistants": 3,
  "softTrimRatio": 0.3,
  "hardClearRatio": 0.5,
  "minPrunableToolChars": 20000,
  "softTrim": {
    "maxChars": 3000,
    "headChars": 1000,
    "tailChars": 1000
  },
  "hardClear": {
    "enabled": true,
    "placeholder": "[tool output pruned]"
  }
}

What each field does:

FieldValueWhy
mode"cache-ttl"Time-based pruning of stale content
ttl"5m"Aggressive: tool outputs older than 5 min get pruned
keepLastAssistants3Always keeps the last 3 assistant messages intact
minPrunableToolChars20000Only prunes tool outputs larger than 20K chars
softTrim.maxChars3000When soft-trimming, keep ~3K chars (1K head + 1K tail)
hardClear.enabledtrueAfter TTL expires, replace content with placeholder

Why cache-ttl: A single web_fetch can dump 50K characters into your context. Multiply that by a few research queries and you’ve burned half your window on content you referenced once. Cache-TTL prunes these after the configured time while keeping recent tool outputs available for follow-up questions.

Tuning notes:

Known bug (2026-04, upstream ticket queued): Context pruning can split tool_use/tool_result pairs. When the conversation grows past the pruning threshold, the pruner may remove a tool_result while keeping its tool_use, causing a hard Anthropic API error (tool_use ids were found without tool_result blocks). The only workaround today is starting a fresh conversation. Upstream fix should enforce atomic pair handling.


What it does: Enables semantic search over raw conversation transcripts from past sessions. Without this, memory_search only searches your MEMORY.md and knowledge card files. With it, your agent can find specific things said in previous conversations.

Path: memory.qmd.sessions

{
  "enabled": true,
  "retentionDays": 90
}

Also confirm this is set:

// memory.qmd.includeDefaultMemory should be true

Why 90 days: Session transcripts are large. Retaining them indefinitely bloats your search index and slows queries. 90 days gives you three months of searchable history, which covers most “what did we decide about X last month?” questions. Increase to 180 if you have long-running projects with infrequent check-ins.


Verification

After applying all patches, verify with config.get:

config.get → agents.defaults.compaction
config.get → memory

Confirm:


How These Work Together

Here’s the lifecycle during a long session:

  1. Early session (< 32K tokens): Everything works normally. Full context available.
  2. Approaching threshold (32K tokens): Memory flush triggers. Agent writes key decisions and state to memory/YYYY-MM-DD.md.
  3. Context pressure builds: Context pruning removes stale tool outputs older than 2 hours.
  4. Compaction triggers: Safeguard mode compresses older exchanges but protects the last 20K tokens and 4 exchanges.
  5. Post-compaction: Behavioral rules from AGENTS.md are re-injected. Agent retains personality and operating rules.
  6. Next session: Agent searches QMD transcripts and memory files to recover context from previous sessions.

The result: your agent maintains personality across multi-hour sessions, preserves critical decisions to disk before they’re lost, and can recall past conversations through semantic search.


Common Mistakes