web-api
Local search + fetch + extract coordinator for the aienv stack.
Replaces the legacy ollama-qube web-fetch service (port :8081, retired).
claude/flex qube ──qrexec tunnel──▶ api qube :8097 (web-api)
                                    │
                                    ├── http://localhost:8098  SearXNG (loopback)
                                    │     └── Google / Brave / DDG / Bing / Wikipedia
                                    │
                                    └── outbound HTTPS via sys-wg → page fetch + Trafilatura
- Federated search through a local SearXNG container (no provider API keys needed in v1).
- Boilerplate-stripped Markdown via Trafilatura.
- SQLite cache for both searches and pages (TTLs configurable). Keeps re-runs of the same research cheap and stops Claude from re-hitting upstreams on retry.
- One-shot research endpoint: search → fetch top-N → bundle.
- Image search through SearXNG categories=images: preserves img_src / thumbnail_src / resolution / img_format.
- MCP server mounted at /mcp with four tools: web_search, web_image_search, web_fetch, web_research.
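The cache and the one-shot research flow listed above can be sketched roughly as follows. This is a minimal illustration and not the code in web_api/: the TTLCache class, the research function, the table layout, and the shape of the returned bundle are all assumptions made for the example, and the search and fetch callables stand in for SearXNG and the Trafilatura-backed fetcher.

```python
import sqlite3
import time
from typing import Callable, Dict, List, Optional


class TTLCache:
    """Hypothetical SQLite-backed key/value cache with per-entry expiry."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL, expires_at REAL NOT NULL)"
        )

    def get(self, key: str, now: Optional[float] = None) -> Optional[str]:
        # Only rows that have not expired yet count as hits.
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT value FROM cache WHERE key = ? AND expires_at > ?", (key, now)
        ).fetchone()
        return row[0] if row else None

    def put(self, key: str, value: str, ttl: float, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO cache (key, value, expires_at) VALUES (?, ?, ?)",
            (key, value, now + ttl),
        )
        self.db.commit()


def research(
    query: str,
    search: Callable[[str], List[str]],  # stand-in: returns ranked result URLs
    fetch: Callable[[str], str],         # stand-in: returns extracted Markdown
    cache: TTLCache,
    top_n: int = 3,
    ttl: float = 3600.0,
    fresh: bool = False,
) -> Dict[str, object]:
    """search -> fetch top-N -> bundle, reading the cache before each fetch."""
    pages = []
    for url in search(query)[:top_n]:
        text = None if fresh else cache.get(url)
        if text is None:
            text = fetch(url)
            cache.put(url, text, ttl)
        pages.append({"url": url, "markdown": text})
    return {"query": query, "pages": pages}
```

Running research twice with the same query only calls fetch on the first pass; passing fresh=True skips the cache read and re-fetches, which is one plausible reading of the fresh= parameter on the HTTP routes.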
Not a web crawler. No JS execution (yet — Playwright fallback is a v2 candidate). No write surface. Only http(s) URLs accepted.
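The http(s)-only rule can be illustrated with a small check like the one below. It is a sketch using only the standard library; the name is_fetchable is hypothetical and the real validation in web_api/ may be stricter (for example, it may also reject private address ranges).

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_fetchable(url: str) -> bool:
    """Return True only for http(s) URLs that name a host (illustrative helper)."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit raises on malformed input such as an unclosed IPv6 bracket.
        return False
    return parts.scheme.lower() in ALLOWED_SCHEMES and bool(parts.netloc)
```

With this check, file:///etc/passwd, javascript:alert(1), and ftp:// URLs are rejected before any outbound request is made.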
Endpoints
| Route | Method | Notes |
|---|---|---|
| /health | GET | version + SearXNG probe + cache row counts |
| /v1/search | GET | ?q=&engines=&count=&language=&safesearch=&fresh= |
| /v1/images/search | GET | ?q=&engines=&count=&language=&safesearch=&fresh= (image-shaped results) |
| /v1/fetch | GET | ?url=&max_chars=&fresh= |
| /v1/research | POST | JSON body, see web_api/app.py |
| /v1/cache/stats | GET | row counts |
| /mcp/ | * | FastMCP Streamable HTTP |
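As a rough client-side illustration of the GET routes above, the helpers below build request URLs from the documented query parameters. The base URL assumes the service is reachable on localhost:8097 (the port shown in the diagram), the helper names are hypothetical, and the accepted value formats (for example how engines or fresh are encoded) are not specified here, so treat the values as placeholders.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:8097"  # assumption: web-api port from the diagram


def _with_query(path: str, params: dict) -> str:
    # Drop parameters the caller left unset so the server applies its defaults.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}{path}?{query}"


def search_url(
    q: str,
    *,
    engines: Optional[str] = None,
    count: Optional[int] = None,
    language: Optional[str] = None,
    safesearch: Optional[int] = None,
    fresh: Optional[int] = None,
    images: bool = False,
) -> str:
    """Build a /v1/search (or /v1/images/search) URL."""
    path = "/v1/images/search" if images else "/v1/search"
    return _with_query(path, {
        "q": q, "engines": engines, "count": count,
        "language": language, "safesearch": safesearch, "fresh": fresh,
    })


def fetch_url(url: str, *, max_chars: Optional[int] = None, fresh: Optional[int] = None) -> str:
    """Build a /v1/fetch URL; the target URL is percent-encoded as a parameter."""
    return _with_query("/v1/fetch", {"url": url, "max_chars": max_chars, "fresh": fresh})
```

The resulting strings can be passed to urllib.request.urlopen or any HTTP client; the response schema is defined by the service and is not assumed here.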
Full design: docs/design.md.
Run locally
pip install -r requirements.txt
SEARXNG_URL=http://localhost:8098 CACHE_DB=/tmp/web-api.sqlite \
python3 -m web_api
Tests:
pip install -r requirements.txt pytest pytest-asyncio respx
pytest tests/
Deploy
Quadlet at ~/aienv/api/services/web-api/web-api.container pulls
git.loop-coop.net/loco/workloads-web-api:latest and depends on
searxng.service. SearXNG settings live alongside in
~/aienv/api/services/searxng/.
Versioning + release
Same flow as media-api / mem0-api-server: write Conventional Commits and push to
main; bump.yml tags the release, release.yml builds the image and pushes it to
the local Forgejo registry, and the api qube auto-pulls it via Pull=newer.
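For orientation, the usual Conventional Commits to SemVer mapping (fix → patch, feat → minor, breaking change → major) can be sketched as below. This only illustrates the convention; what bump.yml actually does is not shown in this README, and the function names bump_for and next_version are hypothetical.

```python
import re
from typing import Optional

# Header shape from the Conventional Commits spec: type(scope)!: description
_HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<bang>!)?: .+")
_BREAKING = re.compile(r"^BREAKING[ -]CHANGE: ", re.MULTILINE)


def bump_for(message: str) -> Optional[str]:
    """Return 'major', 'minor', 'patch', or None for a commit message."""
    lines = message.splitlines()
    match = _HEADER.match(lines[0]) if lines else None
    if match is None:
        return None
    if match["bang"] or _BREAKING.search(message):
        return "major"
    if match["type"] == "feat":
        return "minor"
    if match["type"] == "fix":
        return "patch"
    return None  # docs:, chore:, etc. do not trigger a release in this sketch


def next_version(version: str, bump: str) -> str:
    """Apply a bump to a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    if bump == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump: {bump}")
```

Under this mapping, a feat(cache): ... commit on top of 1.4.2 would produce 1.5.0.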