No description

Python 99.6%
Dockerfile 0.4%

Find a file

woodpecker-bump b107b49d21 chore(release): v0.14.1 [skip ci]		2026-05-18 13:30:12 +00:00
.woodpecker	fix: pin tree-sitter<0.25 + tree-sitter-language-pack<1.0	2026-05-13 13:12:10 +02:00
docs	docs: deployment topology + central jobs queue notes	2026-05-09 18:25:32 +02:00
scripts	feat(coverage): index Forgejo issues + PRs into the planning collection (#9 )	2026-04-25 18:42:30 +02:00
tests	feat(jobs): /v1/jobs/_/metrics Prometheus exposition	2026-05-08 16:09:03 +02:00
.gitignore	chore(ci): consolidate to single pipeline + unified bao-checks	2026-05-01 17:17:36 +02:00
bao.yml	chore(ci): pilot ci-shared consolidation (#11 )	2026-05-01 15:05:11 +00:00
CHANGELOG.md	chore(release): v0.14.1 [skip ci]	2026-05-18 13:30:12 +00:00
CLAUDE.md	docs: git push origin main (the k8s remote convention isn't instantiated)	2026-05-10 09:29:06 +02:00
config.yaml.example	fix(retrieval): move expand_llm.model out of llm.config (mem0 OllamaConfig rejects unknown kwargs)	2026-05-08 10:20:58 +02:00
Containerfile	fix: pin tree-sitter<0.25 + tree-sitter-language-pack<1.0	2026-05-13 13:12:10 +02:00
jobs.py	feat(jobs): /v1/jobs/_/metrics Prometheus exposition	2026-05-08 16:09:03 +02:00
main.py	feat(jobs): central queue surface (HTTP + MCP + scheduler)	2026-05-08 11:51:30 +02:00
README.md	chore: scrub registry paths to loco/<group>-<project>	2026-05-01 19:44:24 +02:00

README.md

mem0-api-server

Unified AI memory + document RAG API for the aienv stack. Wraps:

Qdrant (vector embeddings, semantic search)
OpenSearch (BM25 full-text)
Neo4j (entity relationship graph)
PostgreSQL (relational metadata)

Hybrid search (semantic + BM25 with RRF merge), persistent memory via the mem0ai library, and an OpenAI-compatible /v1/chat/completions endpoint that injects RAG context before forwarding to Ollama.

Endpoints

Path	Purpose
`GET /health`	Service + dependency health
`POST /v1/memories/`	Add a memory
`POST /v1/memories/search/`	Semantic memory search
`GET /v1/memories/?user_id=...`	List memories
`POST /v1/conversations/`	Ingest a conversation transcript
`POST /v1/conversations/search/`	Search conversation history
`POST /v1/documents/index`	Index a document (chunks + embeds + indexes in OS)
`POST /v1/documents/query`	Hybrid search (Qdrant + BM25 → RRF)
`POST /v1/documents/query-llm`	RAG query → LLM answer with citations
`GET /v1/documents/collections`	List document collections
`POST /v1/chat/completions`	OpenAI-compatible chat with implicit RAG injection
`POST /mcp/`	Streamable HTTP MCP server (rag_query, mem0_search, ...)

Deploy

Container image is published to git.loop-coop.net/loco/workloads-mem0-api-server:vX.Y.Z AND :latest by Woodpecker on every tag push. Consumers (the aienv-mem0 qubes module) pull this image with Pull=newer — no local builds, no version pinning. A new release lands by tagging this repo and restarting the consumer:

edit + commit + push to main
  → Woodpecker bump.yml auto-tags vX.Y.Z (Conventional Commits)
  → release.yml builds + pushes :vX.Y.Z + :latest
mem0 qube:
  → systemctl restart mem0
  → Pull=newer fetches the new digest, container starts on it

No aienv-salt --sync, no apply mem0 for routine releases.

Runtime config is a YAML mounted at /app/config.yaml (Qdrant/OpenSearch/Neo4j hosts + ports, embedder/LLM model). See config.yaml.example.

The service is agent-neutral — MCP tools default user_id to "default". Each MCP client (Claude Code, opencode, custom agents) passes its own user_id so memories segment per agent.

Tools (MCP)

The /mcp/ endpoint exposes Streamable HTTP MCP tools so any MCP-aware client (Claude Code, opencode, custom agents) can scope RAG queries and read/write mem0 directly. The server is agent-neutral — no client is the default; identity is whatever the caller passes as user_id. Six tools: rag_query, rag_query_llm, rag_list_collections, mem0_search, mem0_add, mem0_list.

Client config:

"rag": {"type": "http", "url": "http://localhost:8090/mcp"}

See docs/MCP.md for the full tool reference, smoke-test commands, implementation notes, and the comparison between explicit /mcp/ calls and the implicit RAG injection in /v1/chat/completions.