Advanced AI agent memory system with hybrid search (BM25 + vector), Cohere reranking, embedding cache, and incremental sync. ChromaDB + Ollama backed, 100% self-hosted (Cohere reranking optional). Use when managing agent long-term memory, indexing memory files, running memory health checks, or searching agent memories with hybrid retrieval.
# Self-Memory — Advanced Agent Memory System
Hybrid retrieval memory layer for OpenClaw agents. Combines ChromaDB vector search with BM25 keyword search, optional Cohere reranking, embedding cache, and incremental sync.
## Architecture
```
Query → [BM25 keyword search] ─────┐
                                   ├─ Weighted Fusion (70/30) → Cohere Rerank → Results
Query → [ChromaDB vector search] ──┘
```
- **Vector search**: ChromaDB + Ollama mxbai-embed-large (semantic meaning)
- **Keyword search**: BM25Okapi (exact matches, error codes, function names)
- **Fusion**: 70% vector + 30% BM25 weighted score
- **Reranking**: Cohere rerank-v3.5 (optional, graceful fallback)
- **Cache**: SHA256-based embedding cache (skip redundant Ollama calls)
- **Sync**: Incremental file-hash sync (only changed files re-indexed)
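The 70/30 fusion can be sketched roughly as follows. This is a minimal illustration, not the scripts' actual code: the min-max normalization step, the function names `min_max` and `fuse`, and the assumption that vector scores are similarities (higher is better) are all hypothetical choices made here so the two score scales become comparable.

```python
VECTOR_WEIGHT = 0.7  # matches the documented default
BM25_WEIGHT = 0.3    # matches the documented default


def min_max(scores: dict) -> dict:
    """Scale scores to [0, 1]; if all scores are equal, map them to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse(vector_scores: dict, bm25_scores: dict) -> list:
    """Combine normalized scores; a document missing from one list scores 0 there."""
    v = min_max(vector_scores)
    b = min_max(bm25_scores)
    fused = {
        doc_id: VECTOR_WEIGHT * v.get(doc_id, 0.0) + BM25_WEIGHT * b.get(doc_id, 0.0)
        for doc_id in set(v) | set(b)
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

If the vector backend returns distances rather than similarities, they would need to be inverted before being passed to a function like this.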
## Prerequisites
1. **ChromaDB** running locally:
```bash
docker run -d --name chromadb -p 8100:8000 chromadb/chroma:latest
```
2. **Ollama** with embedding model:
```bash
ollama pull mxbai-embed-large
```
3. **Python deps**:
```bash
pip install rank-bm25 cohere
```
4. **Optional**: `COHERE_API_KEY` env var for reranking
## Scripts
### Reindex — `scripts/chromadb-reindex.py`
Index all memory files into ChromaDB with embedding cache + incremental sync.
```bash
# Incremental (only changed files)
python3 scripts/chromadb-reindex.py
# Full reindex (rebuild everything)
python3 scripts/chromadb-reindex.py --force
```
**Features:**
- Embedding cache: `tmp/embedding-cache.json` — skips Ollama for unchanged chunks
- Sync state: `tmp/chromadb-sync-state.json` — tracks file hashes
- BM25 corpus: `tmp/bm25-corpus.pkl` — saved for hybrid search
- Logs: X changed, Y unchanged, Z deleted
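A rough sketch of how a SHA256-keyed embedding cache and a file-hash sync state might work together. The helper names (`sha256_text`, `cached_embedding`, `diff_sync_state`) and the in-memory dict layout are assumptions for illustration; the real scripts persist their state in the JSON files listed above, and their format is not documented here.

```python
import hashlib


def sha256_text(text: str) -> str:
    """Hex SHA256 digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cached_embedding(chunk: str, cache: dict, embed):
    """Return the embedding for a chunk, calling `embed` only on a cache miss.

    `embed` stands in for whatever function calls the embedding model.
    """
    key = sha256_text(chunk)
    if key not in cache:
        cache[key] = embed(chunk)
    return cache[key]


def diff_sync_state(previous: dict, current: dict):
    """Compare {path: file_hash} maps and classify each path.

    Returns (changed, unchanged, deleted); new files count as changed.
    """
    changed = sorted(p for p, h in current.items() if previous.get(p) != h)
    unchanged = sorted(p for p, h in current.items() if previous.get(p) == h)
    deleted = sorted(p for p in previous if p not in current)
    return changed, unchanged, deleted
```

Only the paths in `changed` would need re-chunking and re-embedding, and chunks whose text is identical to a cached one skip the model call even inside a changed file.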
**Indexed paths**: MEMORY.md, SOUL.md, USER.md, IDENTITY.md, TOOLS.md, GOALS.md, AGENTS.md, HEARTBEAT.md, memory/, agents/, .learnings/
### Hybrid Search — `scripts/chromadb-hybrid-search.py`
Search with BM25 + vector fusion + optional Cohere reranking.
```bash
# Default (5 results, with reranking)
python3 scripts/chromadb-hybrid-search.py "search phrase"
# Custom top-N
python3 scripts/chromadb-hybrid-search.py "query" --top 10
# Without reranking
python3 scripts/chromadb-hybrid-search.py "query" --no-rerank
```
**Pipeline**: 20 candidates from each search → fusion → rerank → top N results.
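The final stage of the pipeline (rerank, then cut to the top N, with a graceful fallback when reranking is unavailable) might look roughly like this. The `finalize` function and the `reranker` callable signature are hypothetical; the callable stands in for a call to a reranking service and is assumed to return candidate indices in the new order.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical reranker signature: (query, candidates) -> indices in new order.
Reranker = Callable[[str, Sequence[str]], List[int]]


def finalize(
    query: str,
    fused_candidates: Sequence[str],
    top_n: int = 5,
    reranker: Optional[Reranker] = None,
) -> List[str]:
    """Apply the optional reranker, then return the top N candidates.

    If no reranker is configured, or it raises, the fused order is kept.
    """
    candidates = list(fused_candidates)
    if reranker is not None:
        try:
            order = reranker(query, candidates)
            candidates = [candidates[i] for i in order]
        except Exception:
            # Graceful fallback: keep the fusion ranking unchanged.
            pass
    return candidates[:top_n]
```

Catching a broad `Exception` is a deliberate simplification here so that a missing API key or a network error never blocks a search.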
### Benchmark — `scripts/chromadb-benchmark.py`
Measure search quality with standardized test queries.
```bash
# Vector-only baseline
python3 scripts/chromadb-benchmark.py --save
# Hybrid (BM25 + vector)
python3 scripts/chromadb-benchmark.py --hybrid --save
# Hybrid + Cohere reranking (best quality)
python3 scripts/chromadb-benchmark.py --hybrid --rerank --save
```
**Metrics**: Precision@3, MRR (Mean Reciprocal Rank), latency. Results saved to `tmp/benchmark-*.json`.
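For reference, the two quality metrics can be computed as in the sketch below. These are the standard definitions; the function names and the data shapes (ranked lists of document IDs plus a set of relevant IDs) are assumptions and may differ from what the benchmark script uses internally.

```python
def precision_at_k(ranked: list, relevant: set, k: int = 3) -> float:
    """Fraction of the top-k results that are relevant."""
    top = ranked[:k]
    return sum(1 for doc_id in top if doc_id in relevant) / k


def reciprocal_rank(ranked: list, relevant: set) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is found."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: list) -> float:
    """Average reciprocal rank over (ranked, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)
```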
### Health Check — `scripts/chromadb-health.py`
```bash
# Quick smoke test (for heartbeats — ~2 sec)
python3 scripts/chromadb-health.py --quick
# Full health report (weekly)
python3 scripts/chromadb-health.py
```
**Quick mode checks** (exit code 1 if any fail):
1. ChromaDB server reachable
2. Ollama embedding responds
3. Collection has documents
4. Search returns results
5. Last reindex < 48 hours ago
**Full mode adds**: source coverage, query quality tests, latency benchmark, hybrid search test, overall grade (A-D).
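The quick-mode contract (run each check, exit with code 1 if any fails) could be structured along these lines. This is only a sketch: `run_checks` and `reindex_is_fresh` are hypothetical names, and the real checks would call ChromaDB and Ollama instead of the placeholder callables used here.

```python
from typing import Callable, List, Tuple


def run_checks(checks: List[Tuple[str, Callable[[], bool]]]) -> int:
    """Run every check and return a process exit code (0 = all passed).

    A check that raises is treated as a failure rather than aborting the run.
    """
    exit_code = 0
    for name, check in checks:
        try:
            ok = bool(check())
        except Exception:
            ok = False
        print(f"{'PASS' if ok else 'FAIL'}  {name}")
        if not ok:
            exit_code = 1
    return exit_code


def reindex_is_fresh(last_reindex_ts: float, now_ts: float,
                     max_age_hours: float = 48.0) -> bool:
    """True if the last reindex happened less than `max_age_hours` ago."""
    return (now_ts - last_reindex_ts) < max_age_hours * 3600
```

A caller would typically pass the result to `sys.exit()` so that a heartbeat can react to the exit code.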
## Configuration
All scripts use these constants (edit at top of each file):
| Setting | Default | Description |
|---------|---------|-------------|
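| `CHROMA_URL` | `http://localhost:8000` | ChromaDB endpoint (the Docker command in Prerequisites publishes host port 8100, so use `http://localhost:8100` with that setup) |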
| `OLLAMA_URL` | `http://localhost:11434` | Ollama endpoint |
| `COLLECTION_ID` | (your collection UUID) | ChromaDB collection |
| `VECTOR_WEIGHT` | `0.7` | Vector score weight in fusion |
| `BM25_WEIGHT` | `0.3` | BM25 score weight in fusion |
| `TOP_CANDIDATES` | `20` | Candidates per search before fusion |
| `FINAL_RESULTS` | `5` | Results after reranking |
| `MAX_CHUNK_SIZE` | `1500` | Chars per chunk |
| `OVERLAP` | `200` | Overlap between chunks |
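To illustrate how `MAX_CHUNK_SIZE` and `OVERLAP` interact, here is a minimal character-window chunker. It assumes plain fixed-size windows; the actual reindex script may split on headings or paragraphs instead, and `chunk_text` is a hypothetical name.

```python
MAX_CHUNK_SIZE = 1500  # characters per chunk (documented default)
OVERLAP = 200          # characters shared by consecutive chunks (documented default)


def chunk_text(text: str, size: int = MAX_CHUNK_SIZE, overlap: int = OVERLAP) -> list:
    """Split text into windows of `size` chars, each starting `size - overlap`
    chars after the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks
```

With the defaults, a 3000-character file yields three chunks starting at offsets 0, 1300, and 2600.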
### Compaction Guard — `scripts/compaction-guard.py`
Protect against context loss during OpenClaw's automatic memory compaction.
Inspired by the Redis buffer approach, it saves critical files before compaction wipes the raw context.
```bash
# Auto backup (skips if nothing changed or too soon)
python3 scripts/compaction-guard.py
# Force backup now
python3 scripts/compaction-guard.py --force
# List existing backups
python3 scripts/compaction-guard.py --list
# Cleanup backups older than 7 days
python3 scripts/compaction-guard.py --cleanup 7
```
**How it works:**
1. Runs automatically on every heartbeat (15-minute cooldown between backups)
2. Computes SHA256 hash of all critical files (HOT_MEMORY, WARM_MEMORY, MEMORY.md, today's log, HEARTBEAT.md)
3. If content changed → creates timestamped backup in `memory/session-backups/`
4. Each backup includes metadata (timestamp, files, hash)
5. Weekly cleanup removes backups older than 7 days
**Why this matters:** OpenClaw compacts context when it gets too full, producing a summary but losing the raw conversation details. This guard preserves the full state, much like a Redis buffer but using plain files.
**Backup location:** `memory/session-backups/YYYYMMDD-HHMMSS/`
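The change-detection and cooldown logic described above might be sketched as follows. The function names, the way file names are mixed into the hash, and the state that is passed in (`last_hash`, `last_backup_ts`) are assumptions for illustration, not the script's actual implementation.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Optional

COOLDOWN_SECONDS = 15 * 60  # 15-minute cooldown between backups


def combined_hash(paths: Iterable[Path]) -> str:
    """SHA256 over the names and contents of all existing files, in sorted order."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        if path.is_file():
            digest.update(path.name.encode("utf-8"))
            digest.update(path.read_bytes())
    return digest.hexdigest()


def should_backup(current_hash: str, last_hash: Optional[str],
                  last_backup_ts: Optional[float], now_ts: float,
                  force: bool = False) -> bool:
    """Decide whether to create a new backup.

    Skips when nothing changed or when the previous backup is too recent,
    unless `force` is set.
    """
    if force:
        return True
    if current_hash == last_hash:
        return False
    if last_backup_ts is not None and now_ts - last_backup_ts < COOLDOWN_SECONDS:
        return False
    return True
```

Missing files are skipped rather than treated as errors, on the assumption that not every critical file exists in every workspace.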
## Heartbeat Integration
Add to HEARTBEAT.md for continuous monitoring:
```markdown
## EVERY heartbeat (mandatory):
- [ ] Compaction Guard: `python3 scripts/compaction-guard.py` → auto-saves session state
- [ ] ChromaDB smoke test: `python3 scripts/chromadb-health.py --quick` → if FAIL, alert!
## Rotating checks:
- [ ] ChromaDB full health (weekly): `python3 scripts/chromadb-health.py`
- [ ] ChromaDB reindex (when new memory files): `python3 scripts/chromadb-reindex.py`
- [ ] Compaction backup cleanup (weekly): `python3 scripts/compaction-guard.py --cleanup 7`
```
## Documentation
- **Detailed specification**: `references/SPEC.md` (architecture, components, data flows, roadmap)
- **Memory tier system**: `references/memory-tiers.md` (HOT/WARM/COLD tier details)