- Python 99.6%
- Dockerfile 0.4%
| .woodpecker | ||
| docs | ||
| scripts | ||
| tests | ||
| .gitignore | ||
| bao.yml | ||
| CHANGELOG.md | ||
| CLAUDE.md | ||
| config.yaml.example | ||
| Containerfile | ||
| jobs.py | ||
| main.py | ||
| README.md | ||
mem0-api-server
Unified AI memory + document RAG API for the aienv stack. Wraps:
- Qdrant (vector embeddings, semantic search)
- OpenSearch (BM25 full-text)
- Neo4j (entity relationship graph)
- PostgreSQL (relational metadata)
Hybrid search (semantic + BM25 with RRF merge), persistent memory via the mem0ai library, and an OpenAI-compatible /v1/chat/completions endpoint that injects RAG context before forwarding to Ollama.
Endpoints
| Path | Purpose |
|---|---|
GET /health |
Service + dependency health |
POST /v1/memories/ |
Add a memory |
POST /v1/memories/search/ |
Semantic memory search |
GET /v1/memories/?user_id=... |
List memories |
POST /v1/conversations/ |
Ingest a conversation transcript |
POST /v1/conversations/search/ |
Search conversation history |
POST /v1/documents/index |
Index a document (chunks + embeds + indexes in OS) |
POST /v1/documents/query |
Hybrid search (Qdrant + BM25 → RRF) |
POST /v1/documents/query-llm |
RAG query → LLM answer with citations |
GET /v1/documents/collections |
List document collections |
POST /v1/chat/completions |
OpenAI-compatible chat with implicit RAG injection |
POST /mcp/ |
Streamable HTTP MCP server (rag_query, mem0_search, ...) |
Deploy
Container image is published to git.loop-coop.net/loco/workloads-mem0-api-server:vX.Y.Z AND :latest by Woodpecker on every tag push. Consumers (the aienv-mem0 qubes module) pull this image with Pull=newer — no local builds, no version pinning. A new release lands by tagging this repo and restarting the consumer:
edit + commit + push to main
→ Woodpecker bump.yml auto-tags vX.Y.Z (Conventional Commits)
→ release.yml builds + pushes :vX.Y.Z + :latest
mem0 qube:
→ systemctl restart mem0
→ Pull=newer fetches the new digest, container starts on it
No aienv-salt --sync, no apply mem0 for routine releases.
Runtime config is a YAML mounted at /app/config.yaml (Qdrant/OpenSearch/Neo4j hosts + ports, embedder/LLM model). See config.yaml.example.
The service is agent-neutral — MCP tools default user_id to "default". Each MCP client (Claude Code, opencode, custom agents) passes its own user_id so memories segment per agent.
Tools (MCP)
The /mcp/ endpoint exposes Streamable HTTP MCP tools so any MCP-aware client (Claude Code, opencode, custom agents) can scope RAG queries and read/write mem0 directly. The server is agent-neutral — no client is the default; identity is whatever the caller passes as user_id. Six tools: rag_query, rag_query_llm, rag_list_collections, mem0_search, mem0_add, mem0_list.
Client config:
"rag": {"type": "http", "url": "http://localhost:8090/mcp"}
See docs/MCP.md for the full tool reference, smoke-test commands, implementation notes, and the comparison between explicit /mcp/ calls and the implicit RAG injection in /v1/chat/completions.