LIVE RAG · Retrieval

PflegeLotse

GDPR-compliant RAG assistant for German long-term-care law (SGB XI): source-grounded answers, conversational follow-ups with streaming, abstention instead of hallucination, and an eval harness. Live.

Problem & context

Care law is complex — and answers must be verifiable

Care providers and families lose time searching the SGB XI. A generic chatbot hallucinates — unacceptable for legal questions. What is needed is an assistant that answers only from the statute, cites every claim, and abstains honestly when unsure.

Solution

Retrieval-augmented generation with strict grounding

Section-precise (§-level) chunking, retrieval via pgvector and a grounding prompt: every claim is cited with its section, and when retrieval hits are weak the system abstains rather than guessing. It stays conversational, with history for follow-ups and token-by-token streaming, without giving up grounding.
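A minimal sketch of how such a grounding gate could look, not the project's actual code. It assumes similarity scores where higher is better and uses the 0.78 threshold named in the process history. The `Hit` type, the prompt wording, and the rule "abstain when no hit reaches the threshold" are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold taken from the process history; the score scale (higher = better)
# is an assumption of this sketch.
ABSTAIN_BELOW = 0.78


@dataclass(frozen=True)
class Hit:
    """One retrieved chunk (hypothetical shape)."""
    section: str   # e.g. "§ 39 SGB XI"
    text: str
    score: float


def build_grounded_prompt(question: str, hits: list[Hit]) -> Optional[str]:
    """Return a grounding prompt, or None when the system should abstain."""
    # Keep only hits that clear the threshold; weak hits are never shown to the LLM.
    strong = [h for h in hits if h.score >= ABSTAIN_BELOW]
    if not strong:
        return None  # caller answers with an honest "not covered" message
    sources = "\n\n".join(f"[{h.section}]\n{h.text}" for h in strong)
    return (
        "Answer only from the sources below. Cite the section for every claim. "
        "If the sources do not cover the question, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
```

Returning `None` instead of a weaker prompt keeps the abstention decision outside the language model, so it can be tested deterministically.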

PflegeLotse interface: a question about respite care with a source-grounded answer (§ 39, § 42a SGB XI) and a sources panel.

Architecture

Clean Architecture, four layers

domain

Entities, ports & rules — framework-free

application

Use cases: ingest, answer, conversation+streaming

infrastructure

pgvector, mistral-embed / E5, Mistral/Ollama

api

FastAPI + Jinja2/HTMX + SSE streaming
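The layering above can be illustrated with a small sketch, assuming the domain defines ports as `typing.Protocol` classes and the application layer depends only on those. All names here (`Chunk`, `Embedder`, `ChunkIndex`, `RetrieveSections`) are hypothetical, not taken from the project.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class Chunk:
    """Domain entity: one piece of statute text with its section label."""
    section: str
    text: str


class Embedder(Protocol):
    """Port: an infrastructure adapter (e.g. an embedding API) would implement this."""
    def embed(self, text: str) -> Sequence[float]: ...


class ChunkIndex(Protocol):
    """Port: a vector store adapter (e.g. backed by pgvector) would implement this."""
    def search(self, vector: Sequence[float], k: int) -> list[tuple[Chunk, float]]: ...


class RetrieveSections:
    """Application use case: knows the ports, not the frameworks behind them."""

    def __init__(self, embedder: Embedder, index: ChunkIndex) -> None:
        self._embedder = embedder
        self._index = index

    def __call__(self, question: str, k: int = 5) -> list[tuple[Chunk, float]]:
        # Embed the question, then ask the index for the k nearest chunks.
        return self._index.search(self._embedder.embed(question), k)
```

Because the use case only sees the protocols, swapping embedding models or running against an in-memory fake in tests needs no change to the domain or application code.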

Process history

From plan to deploy — six phases

  1. Setup & architecture

    DONE

    Clean-architecture skeleton, Docker, CI (ruff + mypy --strict + pytest). ADR-0001: Python/HTMX over Next.js.

  2. Data & ingestion

    DONE

    Parsed SGB XI (public domain), section-precise chunking (235 sections), embeddings → pgvector. Including the § number and title in the embedded text measurably raised recall.

  3. Retrieval & grounding

    DONE

    Retrieval via pgvector, source-grounded answers, and abstention when the retrieval score is below 0.78 rather than a hallucinated answer.

  4. Eval harness

    DONE

    Golden set (26 cases) with measured recall@k, abstention accuracy and latency: quality is measured rather than judged by feel.

  5. Conversation, streaming & UI

    DONE

    Jinja2/HTMX + SSE: history (follow-ups via query rewriting), token-by-token answers, source panel, cookie banner (TDDDG), accessibility (BFSG/WCAG), disclaimer (RDG).

  6. Deploy & docs

    DONE

    Live on a self-managed VPS behind Traefik (automatic HTTPS), with a lightweight image (mistral-embed, no torch). Project documentation and design presentation are public.
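The ingestion phase notes that adding the § and title to the embedding raised recall. One way this could be done is sketched below, assuming the section marker and title are simply prepended to the body before it is sent to the embedding model. The `Section` type, the `embedding_text` helper and the exact text layout are assumptions; the page does not describe the project's format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    """One parsed statute section (hypothetical shape)."""
    number: str   # e.g. "39"
    title: str    # official section heading
    body: str     # section text


def embedding_text(section: Section) -> str:
    """Build the string that gets embedded: marker and title first, then the body."""
    header = f"§ {section.number} SGB XI {section.title}"
    return f"{header}\n{section.body}"


example = Section(
    number="39",
    title="Häusliche Pflege bei Verhinderung der Pflegeperson",
    body="(section text)",  # placeholder, not the statute wording
)
text = embedding_text(example)
```

The intuition is that user questions often name the topic of a section rather than quote its wording, so carrying the heading into the vector gives the retriever a closer match.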

Results

Made measurable

Recall@5 (real questions): 85 %
Abstention on traps: 100 % · 0 hallucinations
Latency p50 (end-to-end): ~1.8 s

Measured against a golden set (26 cases). Details in the project documentation.
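How such figures can be computed is sketched below. This is an illustration under assumptions, not the project's harness. Recall@k is taken here as the share of expected sections found in the top k results, averaged over answerable cases, and trap questions are modelled as cases with no expected section. The `Case` type and the toy data are hypothetical and are not the golden set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """One evaluated question (hypothetical shape)."""
    relevant: frozenset[str]     # expected sections; empty for trap questions
    retrieved: tuple[str, ...]   # ranked sections the system returned
    abstained: bool              # whether the system declined to answer


def recall_at_k(case: Case, k: int) -> float:
    """Share of the expected sections that appear in the top-k results."""
    return len(set(case.retrieved[:k]) & case.relevant) / len(case.relevant)


def mean_recall_at_k(cases: list[Case], k: int = 5) -> float:
    """Average recall@k over answerable cases only."""
    answerable = [c for c in cases if c.relevant]
    if not answerable:
        raise ValueError("no answerable cases")
    return sum(recall_at_k(c, k) for c in answerable) / len(answerable)


def abstention_rate_on_traps(cases: list[Case]) -> float:
    """Share of trap questions on which the system abstained."""
    traps = [c for c in cases if not c.relevant]
    if not traps:
        raise ValueError("no trap cases")
    return sum(c.abstained for c in traps) / len(traps)


# Toy data for illustration only.
toy_cases = [
    Case(frozenset({"§ 39"}), ("§ 39", "§ 42"), abstained=False),
    Case(frozenset({"§ 37", "§ 38"}), ("§ 37", "§ 36"), abstained=False),
    Case(frozenset(), ("§ 1",), abstained=True),  # trap question
]
```

On the toy data, `mean_recall_at_k(toy_cases)` averages a full hit and a half hit, and `abstention_rate_on_traps(toy_cases)` covers the single trap case.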

Stack & compliance

Python 3.12 · FastAPI · pgvector · mistral-embed / E5 · Mistral / Ollama · HTMX + SSE · Docker · Traefik

GDPR & EU AI Act: no personal data, EU or local LLM, citations instead of free generation. Disclaimer: no legal advice (RDG).
