code header
parent ec759dd1dc
commit 83bb18b6a7
@@ -1,17 +1,10 @@
"""
app - mindnet API package

Purpose:
Marks 'app/' as a Python package so that 'from app.main import create_app'
works in tests and scripts.
Compatibility:
Python 3.12+
Version:
0.1.0 (initial version)
Date:
2025-10-07
Notes:
No logic, package initialization only.
FILE: app/__init__.py
DESCRIPTION: Package initialization.
VERSION: 0.1.0
STATUS: Active
DEPENDENCIES: None
LAST_ANALYSIS: 2025-12-15
"""

__version__ = "0.1.0"

@@ -1,6 +1,10 @@
"""
app/config.py - central configuration
Version: 0.4.0 (WP-06 Complete)
FILE: app/config.py
DESCRIPTION: Central Pydantic configuration (env vars for Qdrant, LLM, retriever).
VERSION: 0.4.0
STATUS: Active
DEPENDENCIES: os, functools, pathlib
LAST_ANALYSIS: 2025-12-15
"""
from __future__ import annotations
import os
@@ -1,11 +1,11 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
app/core/chunk_payload.py (Mindnet V2 - types.yaml authoritative)
- neighbors_prev / neighbors_next are lists ([], [id]).
- retriever_weight / chunk_profile come from types.yaml (frontmatter is ignored).
- Fallbacks: defaults.* in types.yaml; otherwise 1.0 / "default".
- WP-11 Update: Injects 'title' into chunk payload for Discovery Service.
FILE: app/core/chunk_payload.py
DESCRIPTION: Builds the JSON object for 'mindnet_chunks'. Includes neighbor IDs (prev/next) and title.
VERSION: 2.0.0
STATUS: Active
DEPENDENCIES: yaml, os
EXTERNAL_CONFIG: config/types.yaml
LAST_ANALYSIS: 2025-12-15
"""
from __future__ import annotations
from typing import Any, Dict, List, Optional
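The chunk payload rules in this header (neighbor lists, values taken from types.yaml, fallbacks of 1.0 / "default") can be illustrated with a minimal sketch. It assumes the registry has already been loaded into a dict. The function name `build_chunk_payloads`, the `note_id#index` chunk-ID scheme, and the `chunk_id`, `note_id`, and `text` keys are illustrative assumptions, not the module's confirmed API.

```python
from typing import Any, Dict, List


def build_chunk_payloads(
    note_id: str,
    title: str,
    note_type: str,
    chunk_texts: List[str],
    registry: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Hypothetical helper: build one payload dict per chunk."""
    type_cfg = (registry.get("types") or {}).get(note_type) or {}
    defaults = registry.get("defaults") or {}

    # types.<type>.* wins, then defaults.*, then the hard-coded fallbacks.
    weight = type_cfg.get("retriever_weight", defaults.get("retriever_weight", 1.0))
    profile = type_cfg.get("chunk_profile", defaults.get("chunk_profile", "default"))

    # Assumed ID scheme: "<note_id>#<index>".
    ids = [f"{note_id}#{i}" for i in range(len(chunk_texts))]

    payloads: List[Dict[str, Any]] = []
    for i, text in enumerate(chunk_texts):
        payloads.append({
            "chunk_id": ids[i],
            "note_id": note_id,
            "title": title,
            "text": text,
            # Neighbors are always lists: empty at the boundaries, one ID otherwise.
            "neighbors_prev": [ids[i - 1]] if i > 0 else [],
            "neighbors_next": [ids[i + 1]] if i + 1 < len(ids) else [],
            "retriever_weight": float(weight),
            "chunk_profile": profile,
        })
    return payloads
```

Keeping the neighbor fields as lists even when they hold a single ID means consumers never have to distinguish between a missing value and a scalar.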
@@ -1,3 +1,13 @@
"""
FILE: app/core/chunker.py
DESCRIPTION: Splits texts into chunks (sliding window or by headings). Orchestrates the Smart-Edge-Allocation via SemanticAnalyzer.
VERSION: 2.5.0
STATUS: Active
DEPENDENCIES: app.services.semantic_analyzer, app.core.derive_edges, markdown_it, yaml, asyncio
EXTERNAL_CONFIG: config/types.yaml
LAST_ANALYSIS: 2025-12-15
"""

from __future__ import annotations
from dataclasses import dataclass
from typing import List, Dict, Optional, Tuple, Any, Set
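The sliding-window strategy named in the chunker description can be sketched as follows. This is a word-based illustration only: the function name, the default `size` and `overlap`, and the choice of whitespace-separated words as the unit are assumptions, and the real chunker may count tokens or characters instead.

```python
from typing import List


def sliding_window_chunks(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")

    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks
```

The early `break` avoids emitting a trailing window that would contain only words already covered by the previous one.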
@@ -1,26 +1,11 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Module: app/core/derive_edges.py
Purpose:
- Preserves the existing edge logic (belongs_to, prev/next, references, backlink)
- Adds type-based default edges (edge_defaults from config/types.yaml)
- Supports "typed inline relations":
    * [[rel:KIND | Target]]
    * [[rel:KIND Target]]
    * rel: KIND [[Target]]
- Supports Obsidian callouts:
    * > [!edge] KIND: [[Target]] [[Target2]] ...
Compatibility:
- build_edges_for_note(...) signature unchanged
- rule_id values:
    * structure:belongs_to
    * structure:order
    * explicit:wikilink
    * inline:rel
    * callout:edge
    * edge_defaults:<type>:<relation>
    * derived:backlink
FILE: app/core/derive_edges.py
DESCRIPTION: Extracts graph edges from text. Supports wikilinks, inline relations ([[rel:type|target]]), and Obsidian callouts.
VERSION: 2.0.0
STATUS: Active
DEPENDENCIES: re, os, yaml, typing
EXTERNAL_CONFIG: config/types.yaml
LAST_ANALYSIS: 2025-12-15
"""

from __future__ import annotations
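The three inline relation forms and the callout form listed in this docstring could be recognized roughly as below. This is a sketch under stated assumptions: the regular expressions, the function name `extract_typed_edges`, and the shape of the returned dicts are hypothetical. Only the syntaxes and the `inline:rel` / `callout:edge` rule IDs come from the docstring.

```python
import re
from typing import Dict, List

_KIND = r"(?P<kind>[A-Za-z_][\w-]*)"

# [[rel:KIND | Target]] and [[rel:KIND Target]]
_REL_IN_LINK = re.compile(
    r"\[\[\s*rel:" + _KIND + r"\s*(?:\|\s*|\s+)(?P<target>[^\]]+?)\s*\]\]"
)
# rel: KIND [[Target]]
_REL_PREFIX = re.compile(
    r"\brel:\s*" + _KIND + r"\s+\[\[(?P<target>[^\]|#]+)[^\]]*\]\]"
)
# > [!edge] KIND: [[Target]] [[Target2]] ...
_CALLOUT = re.compile(
    r"^\s*>\s*\[!edge\]\s*" + _KIND + r"\s*:\s*(?P<rest>.*)$", re.MULTILINE
)
# Plain wikilink; anchor (#...) and label (|...) are dropped.
_WIKILINK = re.compile(r"\[\[([^\]|#]+)[^\]]*\]\]")


def extract_typed_edges(text: str) -> List[Dict[str, str]]:
    """Return one dict per typed relation found in `text`."""
    edges: List[Dict[str, str]] = []
    for pattern in (_REL_IN_LINK, _REL_PREFIX):
        for m in pattern.finditer(text):
            edges.append({
                "kind": m["kind"],
                "target": m["target"].strip(),
                "rule_id": "inline:rel",
            })
    for m in _CALLOUT.finditer(text):
        # A single callout line may list several targets.
        for target in _WIKILINK.findall(m["rest"]):
            edges.append({
                "kind": m["kind"],
                "target": target.strip(),
                "rule_id": "callout:edge",
            })
    return edges
```

Tagging each edge with its `rule_id` at extraction time keeps the provenance available for later scoring or debugging.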
@@ -1,17 +1,10 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
app/core/graph_adapter.py - adjacency construction & subgraph expansion

Purpose:
Builds a lightweight in-memory graph from Qdrant edges (collection: *_edges).

Compatibility:
- WP-04a: Provides scores (edge_bonus, centrality).
- WP-04b: Now also provides structural data for explanations (reverse lookup).

Version:
0.4.0 (update for WP-04b: reverse adjacency for explainability)
FILE: app/core/graph_adapter.py
DESCRIPTION: Loads edges from Qdrant and builds an in-memory subgraph for scoring (centrality) and explanation.
VERSION: 0.4.0
STATUS: Active
DEPENDENCIES: qdrant_client, app.core.qdrant
LAST_ANALYSIS: 2025-12-15
"""

from __future__ import annotations

@@ -1,8 +1,11 @@
"""
app/core/ingestion.py

Central service for transforming Markdown files into Qdrant objects.
Version: 2.5.2 (Full Feature: Change Detection + Robust IO + Clean Config)
FILE: app/core/ingestion.py
DESCRIPTION: Main ingestion logic. Reads Markdown, checks hashes (change detection), splits into chunks, and writes to Qdrant.
VERSION: 2.5.2
STATUS: Active
DEPENDENCIES: app.core.parser, app.core.note_payload, app.core.chunker, app.core.derive_edges, app.core.qdrant*, app.services.embeddings_client
EXTERNAL_CONFIG: config/types.yaml
LAST_ANALYSIS: 2025-12-15
"""
import os
import logging
@@ -1,17 +1,11 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Module: app/core/note_payload.py
Version: 2.1.0 (WP-11 Update: Aliases support)

Purpose
-------
Produces a robust note payload. Values such as `retriever_weight`, `chunk_profile`
and `edge_defaults` are determined in the following order of priority:
1) Frontmatter (note)
2) Type registry (config/types.yaml: types.<type>.*)
3) Registry defaults (config/types.yaml: defaults.*)
4) ENV defaults (MINDNET_DEFAULT_RETRIEVER_WEIGHT / MINDNET_DEFAULT_CHUNK_PROFILE)
FILE: app/core/note_payload.py
DESCRIPTION: Builds the JSON object for 'mindnet_notes'. Applies inheritance for configs (Frontmatter > Type > Default).
VERSION: 2.1.0
STATUS: Active
DEPENDENCIES: yaml, os, json, pathlib
EXTERNAL_CONFIG: config/types.yaml
LAST_ANALYSIS: 2025-12-15
"""

from __future__ import annotations
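The four-level priority order described in this docstring can be sketched as a single lookup function. The function name `resolve_setting` and its signature are assumptions for illustration. The two environment variable names are taken from the docstring. Note that in this sketch ENV values come back as strings, so numeric conversion is left to the caller.

```python
import os
from typing import Any, Dict, Mapping, Optional

# ENV variable names as listed in the docstring; edge_defaults has no ENV fallback there.
_ENV_VARS = {
    "retriever_weight": "MINDNET_DEFAULT_RETRIEVER_WEIGHT",
    "chunk_profile": "MINDNET_DEFAULT_CHUNK_PROFILE",
}


def resolve_setting(
    key: str,
    frontmatter: Dict[str, Any],
    note_type: str,
    registry: Dict[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Any:
    """Hypothetical resolver: frontmatter > types.<type> > defaults > ENV."""
    env = os.environ if env is None else env
    type_cfg = (registry.get("types") or {}).get(note_type) or {}
    defaults = registry.get("defaults") or {}

    # The first source that defines the key wins.
    for source in (frontmatter, type_cfg, defaults):
        value = source.get(key)
        if value is not None:
            return value

    env_name = _ENV_VARS.get(key)
    return env.get(env_name) if env_name else None
```

Passing `env` explicitly keeps the function easy to test without touching the process environment.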
@@ -1,43 +1,10 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Module: app/core/parser.py
Version: 1.7.1 (fault-tolerant, API-compatible)
Date: 2025-10-01

Purpose
-------
Fault-tolerant reading of Markdown files with YAML frontmatter.
Compatible with the previous parser API, but robust against non-UTF-8 files:
- Tries in order: utf-8 → utf-8-sig → cp1252 → latin-1.
- On fallback, a JSON warning is printed to stdout; the import does NOT abort.
- YAML frontmatter is recognized by '---' at the start and '---' as the terminator.
- extract_wikilinks() normalizes [[id#anchor|label]] → 'id'.

Public API (compatible):
- class ParsedNote(frontmatter: dict, body: str, path: str)
- read_markdown(path) -> ParsedNote | None
- normalize_frontmatter(fm) -> dict
- validate_required_frontmatter(fm, required: tuple[str,...]=("id","title")) -> None
- extract_wikilinks(text) -> list[str]
- FRONTMATTER_RE (compatibility constant; regex for '---' lines)

Examples
--------
from app.core.parser import read_markdown, normalize_frontmatter, validate_required_frontmatter
parsed = read_markdown("./vault/30_projects/project-demo.md")
fm = normalize_frontmatter(parsed.frontmatter)
validate_required_frontmatter(fm)
body = parsed.body

from app.core.parser import extract_wikilinks
links = extract_wikilinks(body)

Dependencies
------------
- PyYAML (yaml)

License: MIT (project-internal)
FILE: app/core/parser.py
DESCRIPTION: Reads Markdown files fault-tolerantly (encoding fallback). Separates frontmatter (YAML) from the body.
VERSION: 1.7.1
STATUS: Active
DEPENDENCIES: yaml, re, dataclasses, json, io, os
LAST_ANALYSIS: 2025-12-15
"""
from __future__ import annotations
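The encoding fallback chain and the wikilink normalization described in this docstring might look roughly like the following. This is a minimal sketch: `decode_with_fallback`, the JSON warning fields, and the regular expression are assumptions, and only the encoding order and the `[[id#anchor|label]]` → `id` rule come from the docstring. The sketch works on bytes so it can be tried without a file.

```python
import json
import re
from typing import List, Tuple

# Order as documented: utf-8 → utf-8-sig → cp1252 → latin-1.
_ENCODINGS = ("utf-8", "utf-8-sig", "cp1252", "latin-1")

# [[id]], [[id#anchor]], [[id|label]], [[id#anchor|label]]
_WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def decode_with_fallback(raw: bytes, source: str = "<memory>") -> Tuple[str, str]:
    """Decode `raw` with the first encoding that succeeds; return (text, encoding)."""
    for index, encoding in enumerate(_ENCODINGS):
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        if index > 0:
            # Warn on stdout, but keep going: a fallback is not an error.
            print(json.dumps({"warn": "encoding_fallback", "path": source, "encoding": encoding}))
        return text, encoding
    # Not reachable in practice: latin-1 maps every byte value.
    raise ValueError(f"could not decode {source}")


def extract_wikilinks(text: str) -> List[str]:
    """Return the target IDs of all wikilinks, without anchor or label."""
    return [target.strip() for target in _WIKILINK_RE.findall(text)]
```

Because latin-1 accepts any byte sequence, it works as a last resort that never raises, at the cost of possibly producing wrong characters for files in other encodings.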
@@ -1,28 +1,10 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
app/core/qdrant.py
Version: 2.2.0 (2025-11-11)

Task
----
- Central Qdrant access (client, config)
- Collection creation (notes/chunks/edges)
- **Ensure payload indexes** (idempotent)

Note
----
This file is intended as a drop-in replacement in case your project does not yet
have a robust ensure_payload_indexes() implementation. The signatures
remain compatible with scripts.import_markdown and scripts.reset_qdrant.

API notes
---------
- Payload indexes are created with `create_payload_index`.
- Types come from `qdrant_client.http.models.PayloadSchemaType`:
  KEYWORD | TEXT | INTEGER | FLOAT | BOOL | GEO | DATETIME
- For frequently used filter fields (note_id, kind, scope, type, tags, ...) we create
  indexes. According to the Qdrant documentation, this is best practice for performant filtering.
FILE: app/core/qdrant.py
DESCRIPTION: Qdrant client factory and schema management. Creates collections and payload indexes.
VERSION: 2.2.0
STATUS: Active
DEPENDENCIES: qdrant_client, dataclasses, os
LAST_ANALYSIS: 2025-12-15
"""
from __future__ import annotations
@@ -1,18 +1,10 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
app/core/qdrant_points.py - robust points helpers for Qdrant

- Single source of truth for building PointStruct for notes/chunks/edges
- Backward-compatible payloads for edges
- Handles both Single-Vector and Named-Vector collections
- Deterministic overrides via ENV to avoid auto-detection traps:
    * NOTES_VECTOR_NAME, CHUNKS_VECTOR_NAME, EDGES_VECTOR_NAME
    * MINDNET_VECTOR_NAME (fallback)
    > Set to a concrete name (e.g. "text") to force Named-Vector with that name
    > Set to "__single__" (or "single") to force Single-Vector

Version: 1.5.0 (2025-11-08)
FILE: app/core/qdrant_points.py
DESCRIPTION: Object mapper for Qdrant. Converts JSON payloads (notes, chunks, edges) into PointStructs and generates deterministic UUIDs.
VERSION: 1.5.0
STATUS: Active
DEPENDENCIES: qdrant_client, uuid, os
LAST_ANALYSIS: 2025-12-15
"""
from __future__ import annotations
import os
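The ENV override rules and the deterministic IDs mentioned in this header can be sketched with the standard library alone. The function names, the `("auto" | "single" | "named", name)` return shape, the case-insensitive marker check, and the use of `uuid.uuid5` with `uuid.NAMESPACE_URL` are all assumptions. The header only states that IDs are deterministic and names the environment variables.

```python
import os
import uuid
from typing import Mapping, Optional, Tuple

_SINGLE_MARKERS = {"__single__", "single"}


def resolve_vector_mode(
    kind: str, env: Optional[Mapping[str, str]] = None
) -> Tuple[str, Optional[str]]:
    """Resolve the vector mode for `kind` in ("notes", "chunks", "edges").

    The collection-specific variable wins over MINDNET_VECTOR_NAME.
    """
    env = os.environ if env is None else env
    raw = (env.get(f"{kind.upper()}_VECTOR_NAME") or env.get("MINDNET_VECTOR_NAME") or "").strip()
    if not raw:
        return ("auto", None)      # nothing forced: fall back to auto-detection
    if raw.lower() in _SINGLE_MARKERS:
        return ("single", None)    # force an unnamed single vector
    return ("named", raw)          # force a named vector with this name


def deterministic_point_id(raw_id: str) -> str:
    """Map an arbitrary string ID to a stable UUID string (assumed scheme: UUIDv5)."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, raw_id))
```

A name-based UUID yields the same point ID for the same input on every run, so re-importing a note overwrites its existing points instead of creating duplicates.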
@@ -1,8 +1,10 @@
"""
app/core/retriever.py - hybrid search algorithm

Version:
0.5.3 (WP-06 Fix: Populate 'payload' in QueryHit for meta-data access)
FILE: app/core/retriever.py
DESCRIPTION: Implements the hybrid search (vector + graph expansion) and the scoring model (explainability).
VERSION: 0.5.3
STATUS: Active
DEPENDENCIES: app.config, app.models.dto, app.core.qdrant*, app.services.embeddings_client, app.core.graph_adapter
LAST_ANALYSIS: 2025-12-15
"""
from __future__ import annotations
@@ -1,30 +1,11 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Module: app/core/type_registry.py
Version: 1.0.0
Date: 2025-11-08

Purpose
-------
Loads an optional type registry (config/types.yaml) and provides
convenient accessor functions. The registry is *optional*:
- If the file is missing or the YAML is broken, a conservative
  default (type "concept") is used and a warning is emitted.
- Changes to the file take effect after a process restart.

Public API
----------
- load_type_registry(path: str = "config/types.yaml") -> dict
- get_type_config(note_type: str, reg: dict) -> dict
- resolve_note_type(fm_type: str | None, reg: dict) -> str
- effective_chunk_profile(note_type: str, reg: dict) -> str | None
- profile_overlap(profile: str | None) -> tuple[int,int] # overlap recommendation only

Note
----
The registry does NOT introduce any breaking changes. Without a file/type, the
behavior remains exactly as in release 20251105.
FILE: app/core/type_registry.py
DESCRIPTION: Loader for types.yaml. Caution: in the current pipeline it is mostly bypassed by local loaders in 'ingestion.py' or 'note_payload.py'.
VERSION: 1.0.0
STATUS: Deprecated (Redundant)
DEPENDENCIES: yaml, os, functools
EXTERNAL_CONFIG: config/types.yaml
LAST_ANALYSIS: 2025-12-15
"""
from __future__ import annotations
@@ -1,23 +0,0 @@
"""
Version 0.1

"""

from __future__ import annotations
from typing import List
from functools import lru_cache

from .config import get_settings

@lru_cache
def _load_model():
    from sentence_transformers import SentenceTransformer
    settings = get_settings()
    model = SentenceTransformer(settings.MODEL_NAME, device="cpu")
    return model

def embed_texts(texts: List[str]) -> list[list[float]]:
    model = _load_model()
    texts = [t if isinstance(t, str) else str(t) for t in texts]
    vecs = model.encode(texts, normalize_embeddings=True, convert_to_numpy=False)
    return [list(map(float, v)) for v in vecs]
@@ -1,3 +1,12 @@
"""
FILE: app/frontend/ui.py
DESCRIPTION: Main entry point for Streamlit. Router that loads the modules (Chat, Editor, Graph) based on the sidebar selection.
VERSION: 2.6.0
STATUS: Active
DEPENDENCIES: streamlit, ui_config, ui_sidebar, ui_chat, ui_editor, ui_graph_service, ui_graph*, ui_graph_cytoscape
LAST_ANALYSIS: 2025-12-15
"""

import streamlit as st
import uuid

@@ -1,3 +1,12 @@
"""
FILE: app/frontend/ui_api.py
DESCRIPTION: Wrapper for backend calls (chat, ingest, feedback). Encapsulates requests and error handling.
VERSION: 2.6.0
STATUS: Active
DEPENDENCIES: requests, streamlit, ui_config
LAST_ANALYSIS: 2025-12-15
"""

import requests
import streamlit as st
from ui_config import CHAT_ENDPOINT, INGEST_ANALYZE_ENDPOINT, INGEST_SAVE_ENDPOINT, FEEDBACK_ENDPOINT, API_TIMEOUT

@@ -1,3 +1,12 @@
"""
FILE: app/frontend/ui_callbacks.py
DESCRIPTION: Event handlers for UI interactions. Implements the transition from the graph to the editor (state transfer).
VERSION: 2.6.0
STATUS: Active
DEPENDENCIES: streamlit, os, ui_utils
LAST_ANALYSIS: 2025-12-15
"""

import streamlit as st
import os
from ui_utils import build_markdown_doc

@@ -1,3 +1,12 @@
"""
FILE: app/frontend/ui_chat.py
DESCRIPTION: Chat UI. Renders the message history and source expanders with feedback buttons, and delegates to the editor when needed.
VERSION: 2.6.0
STATUS: Active
DEPENDENCIES: streamlit, ui_api, ui_editor
LAST_ANALYSIS: 2025-12-15
"""

import streamlit as st
from ui_api import send_chat_message, submit_feedback
from ui_editor import render_draft_editor

@@ -1,3 +1,12 @@
"""
FILE: app/frontend/ui_config.py
DESCRIPTION: Central configuration for the frontend. Defines API endpoints, timeouts, and graph styles (colors).
VERSION: 2.6.0
STATUS: Active
DEPENDENCIES: os, hashlib, dotenv, pathlib
LAST_ANALYSIS: 2025-12-15
"""

import os
import hashlib
from dotenv import load_dotenv

@@ -1,3 +1,11 @@
"""
FILE: app/frontend/ui_editor.py
DESCRIPTION: Markdown editor with live preview and metadata fields. Supports intelligence features (link suggestions) and distinguishes between create and update mode.
VERSION: 2.6.0
STATUS: Active
DEPENDENCIES: streamlit, uuid, re, datetime, ui_utils, ui_api
LAST_ANALYSIS: 2025-12-15
"""
import streamlit as st
import uuid
import re

@@ -1,3 +1,12 @@
"""
FILE: app/frontend/ui_graph.py
DESCRIPTION: Legacy graph explorer (Streamlit-Agraph). Implements physics simulation (BarnesHut) and a direct jump to the editor.
VERSION: 2.6.0
STATUS: Maintenance (Active Fallback)
DEPENDENCIES: streamlit, streamlit_agraph, qdrant_client, ui_config, ui_callbacks
LAST_ANALYSIS: 2025-12-15
"""

import streamlit as st
from streamlit_agraph import agraph, Config
from qdrant_client import models

@@ -1,3 +1,12 @@
"""
FILE: app/frontend/ui_graph_cytoscape.py
DESCRIPTION: Modern graph explorer (Cytoscape.js). Features: COSE layout, deep linking (URL params), Active Inspector pattern (CSS styling without re-render).
VERSION: 2.6.0
STATUS: Active
DEPENDENCIES: streamlit, st_cytoscape, qdrant_client, ui_config, ui_callbacks
LAST_ANALYSIS: 2025-12-15
"""

import streamlit as st
from st_cytoscape import cytoscape
from qdrant_client import models

@@ -1,3 +1,12 @@
"""
FILE: app/frontend/ui_graph_service.py
DESCRIPTION: Data layer for the graph. Accesses Qdrant directly (performance) to load nodes/edges and reconstruct texts ("stitching").
VERSION: 2.6.0
STATUS: Active
DEPENDENCIES: qdrant_client, streamlit_agraph, ui_config, re
LAST_ANALYSIS: 2025-12-15
"""

import re
from qdrant_client import QdrantClient, models
from streamlit_agraph import Node, Edge

@@ -1,3 +1,12 @@
"""
FILE: app/frontend/ui_sidebar.py
DESCRIPTION: Renders the sidebar. Controls the mode switch (Chat/Editor/Graph) and global settings (Top-K).
VERSION: 2.6.0
STATUS: Active
DEPENDENCIES: streamlit, ui_utils, ui_config
LAST_ANALYSIS: 2025-12-15
"""

import streamlit as st
from ui_utils import load_history_from_logs
from ui_config import HISTORY_FILE

@@ -1,3 +1,12 @@
"""
FILE: app/frontend/ui_utils.py
DESCRIPTION: String utilities. Parsers for Markdown/YAML (LLM healing) and helpers for history loading.
VERSION: 2.6.0
STATUS: Active
DEPENDENCIES: re, yaml, unicodedata, json, datetime
LAST_ANALYSIS: 2025-12-15
"""

import re
import yaml
import unicodedata
@@ -1,172 +0,0 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Module: app/graph/service.py
Version: 0.1.0
Date: 2025-09-10

Purpose
-------
Lightweight graph layer on top of Qdrant:
- get_note(note_id)
- get_chunks(note_id)
- neighbors(source_id, kinds=[...], scope=['note','chunk'], depth=1)
- walk_bfs(source_id, kinds, max_depth)
- context_for_note(note_id, max_neighbors): heuristic context collection

Notes
-----
- Uses the existing collections <prefix>_notes/_chunks/_edges.
- Edges are queried via payload fields (`kind`, `source_id`, `target_id`).
"""
from __future__ import annotations
from typing import List, Dict, Any, Optional, Iterable, Set, Tuple
from qdrant_client.http import models as rest
from app.core.qdrant import QdrantConfig, get_client

def _cols(prefix: str):
    return f"{prefix}_notes", f"{prefix}_chunks", f"{prefix}_edges"

class GraphService:
    def __init__(self, cfg: Optional[QdrantConfig] = None, prefix: Optional[str] = None):
        self.cfg = cfg or QdrantConfig.from_env()
        if prefix:
            self.cfg.prefix = prefix
        self.client = get_client(self.cfg)
        self.notes_col, self.chunks_col, self.edges_col = _cols(self.cfg.prefix)

    # ------------------------ fetch helpers ------------------------
    def _scroll(self, col: str, flt: Optional[rest.Filter] = None, limit: int = 256):
        out = []
        nextp = None
        while True:
            pts, nextp = self.client.scroll(
                collection_name=col,
                with_payload=True,
                with_vectors=False,
                limit=limit,
                offset=nextp,
                scroll_filter=flt,
            )
            if not pts:
                break
            out.extend(pts)
            if nextp is None:
                break
        return out

    # ------------------------ public API ---------------------------
    def get_note(self, note_id: str) -> Optional[Dict[str, Any]]:
        f = rest.Filter(must=[rest.FieldCondition(key="note_id", match=rest.MatchValue(value=note_id))])
        pts, _ = self.client.scroll(self.notes_col, with_payload=True, with_vectors=False, limit=1, scroll_filter=f)
        return (pts[0].payload or None) if pts else None

    def get_chunks(self, note_id: str) -> List[Dict[str, Any]]:
        f = rest.Filter(must=[rest.FieldCondition(key="note_id", match=rest.MatchValue(value=note_id))])
        pts = self._scroll(self.chunks_col, f)
        # Sort order analogous to the export
        def key(pl):
            p = pl.payload or {}
            s = p.get("seq") or 0
            ci = p.get("chunk_index") or 0
            n = 0
            cid = p.get("chunk_id") or ""
            if isinstance(cid, str) and "#" in cid:
                try:
                    n = int(cid.rsplit("#", 1)[-1])
                except Exception:
                    n = 0
            return (int(s), int(ci), n)
        pts_sorted = sorted(pts, key=key)
        return [p.payload or {} for p in pts_sorted]

    def neighbors(self, source_id: str, kinds: Optional[Iterable[str]] = None,
                  scope: Optional[Iterable[str]] = None, depth: int = 1) -> Dict[str, List[Dict[str, Any]]]:
        """
        Returns incoming & outgoing neighbors (filtered by kind only).
        depth==1: direct edges.
        """
        kinds = list(kinds) if kinds else None
        must = [rest.FieldCondition(key="source_id", match=rest.MatchValue(value=source_id))]
        if kinds:
            must.append(rest.FieldCondition(key="kind", match=rest.MatchAny(any=kinds)))
        f = rest.Filter(must=must)
        edges = self._scroll(self.edges_col, f)
        out = {"out": [], "in": []}
        for e in edges:
            out["out"].append(e.payload or {})
        # Inverse direction (incoming)
        must_in = [rest.FieldCondition(key="target_id", match=rest.MatchValue(value=source_id))]
        if kinds:
            must_in.append(rest.FieldCondition(key="kind", match=rest.MatchAny(any=kinds)))
        f_in = rest.Filter(must=must_in)
        edges_in = self._scroll(self.edges_col, f_in)
        for e in edges_in:
            out["in"].append(e.payload or {})
        return out

    def walk_bfs(self, source_id: str, kinds: Iterable[str], max_depth: int = 2) -> Set[str]:
        visited: Set[str] = {source_id}
        frontier: Set[str] = {source_id}
        kinds = list(kinds)
        for _ in range(max_depth):
            nxt: Set[str] = set()
            for s in frontier:
                neigh = self.neighbors(s, kinds=kinds)
                for e in neigh["out"]:
                    t = e.get("target_id")
                    if isinstance(t, str) and t not in visited:
                        visited.add(t)
                        nxt.add(t)
            frontier = nxt
            if not frontier:
                break
        return visited

    def context_for_note(self, note_id: str, kinds: Iterable[str] = ("references","backlink"), max_neighbors: int = 12) -> Dict[str, Any]:
        """
        Heuristic context: own chunks + neighbors by edge kind, deduplicated.
        """
        note = self.get_note(note_id) or {}
        chunks = self.get_chunks(note_id)
        neigh = self.neighbors(note_id, kinds=list(kinds))
        targets = []
        for e in neigh["out"]:
            t = e.get("target_id")
            if isinstance(t, str):
                targets.append(t)
        for e in neigh["in"]:
            s = e.get("source_id")
            if isinstance(s, str):
                targets.append(s)
        # de-dupe
        seen = set()
        uniq = []
        for t in targets:
            if t not in seen:
                seen.add(t)
                uniq.append(t)
        uniq = uniq[:max_neighbors]
        neighbor_notes = [self.get_note(t) for t in uniq]
        return {
            "note": note,
            "chunks": chunks,
            "neighbors": [n for n in neighbor_notes if n],
            "edges_out": neigh["out"],
            "edges_in": neigh["in"],
        }

# Optional: Mini-CLI
if __name__ == "__main__":  # pragma: no cover
    import argparse, json
    ap = argparse.ArgumentParser()
    ap.add_argument("--prefix", help="Collection prefix (overrides ENV)")
    ap.add_argument("--note-id", required=True)
    ap.add_argument("--neighbors", action="store_true", help="Show neighbors only")
    args = ap.parse_args()
    svc = GraphService(prefix=args.prefix)
    if args.neighbors:
        out = svc.neighbors(args.note_id, kinds=["references","backlink","prev","next","belongs_to"])
    else:
        out = svc.context_for_note(args.note_id)
    print(json.dumps(out, ensure_ascii=False, indent=2))
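The depth-limited traversal in `walk_bfs` from the removed module can be tried without a Qdrant instance by running the same logic over an in-memory edge list. The sketch below is a hypothetical standalone variant, not the removed implementation. It assumes edges are plain dicts with the `kind`, `source_id`, and `target_id` fields named in the module docstring, and it follows outgoing edges only, as the original loop does.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def walk_bfs(
    edges: Iterable[Dict[str, str]],
    source_id: str,
    kinds: Iterable[str],
    max_depth: int = 2,
) -> Set[str]:
    """Return all node IDs reachable from `source_id` within `max_depth` hops."""
    allowed = set(kinds)

    # Build the adjacency list once instead of querying per visited node.
    outgoing: Dict[str, List[str]] = defaultdict(list)
    for e in edges:
        src, tgt = e.get("source_id"), e.get("target_id")
        if e.get("kind") in allowed and isinstance(src, str) and isinstance(tgt, str):
            outgoing[src].append(tgt)

    visited: Set[str] = {source_id}
    frontier: Set[str] = {source_id}
    for _ in range(max_depth):
        nxt: Set[str] = set()
        for node in frontier:
            for target in outgoing.get(node, []):
                if target not in visited:
                    visited.add(target)
                    nxt.add(target)
        frontier = nxt
        if not frontier:
            break  # nothing new was discovered at this depth
    return visited
```

The `visited` set serves two purposes: it is the result, and it stops cycles such as backlinks from being expanded again.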
@@ -1,6 +1,12 @@
"""
app/main.py - mindnet API bootstrap
FILE: app/main.py
DESCRIPTION: Bootstrap of the FastAPI application. Includes routers and middleware.
VERSION: 0.6.0
STATUS: Active
DEPENDENCIES: app.config, app.routers.* (embed, qdrant, query, graph, tools, feedback, chat, ingest, admin)
LAST_ANALYSIS: 2025-12-15
"""

from __future__ import annotations
from fastapi import FastAPI
from .config import get_settings

@@ -1,14 +1,10 @@
"""
app/models/dto.py - Pydantic models (DTOs) for WP-04/WP-05/WP-06

Purpose:
Runtime models for FastAPI (requests/responses).
WP-06 Update: Intent & Intent-Source in ChatResponse.

Version:
0.6.2 (WP-06: Decision Engine Transparency, extension of the Feedback Request)
Date:
2025-12-09
FILE: app/models/dto.py
DESCRIPTION: Pydantic models (DTOs) for request/response bodies. Defines the API schema.
VERSION: 0.6.2
STATUS: Active
DEPENDENCIES: pydantic, typing, uuid
LAST_ANALYSIS: 2025-12-15
"""

from __future__ import annotations

@@ -1,20 +1,10 @@
"""
app/routers/admin.py - admin/monitoring endpoints (optional)

Purpose:
Provides simple metrics on collections (counts) and config.
Compatibility:
Python 3.12+, FastAPI 0.110+, qdrant-client 1.x
Version:
0.1.0 (initial version)
Date:
2025-10-07
Related:
- Qdrant Collections: *_notes, *_chunks, *_edges
Usage:
app.include_router(admin.router, prefix="/admin", tags=["admin"])
Change history:
0.1.0 (2025-10-07) - initial version.
FILE: app/routers/admin.py
DESCRIPTION: Monitoring endpoint. Shows Qdrant collection counts and the loaded config.
VERSION: 0.1.0
STATUS: Active (Optional)
DEPENDENCIES: qdrant_client, app.config
LAST_ANALYSIS: 2025-12-15
"""

from __future__ import annotations

@@ -1,6 +1,11 @@
"""
app/routers/chat.py - RAG endpoint
Version: 2.5.0 (Fix: Question Detection protects against False-Positive Interviews)
FILE: app/routers/chat.py
DESCRIPTION: Main chat interface (RAG & interview). Contains the intent router (keywords/LLM) and prompt construction.
VERSION: 2.5.0
STATUS: Active
DEPENDENCIES: app.config, app.models.dto, app.services.llm_service, app.core.retriever, app.services.feedback_service
EXTERNAL_CONFIG: config/decision_engine.yaml, config/types.yaml
LAST_ANALYSIS: 2025-12-15
"""

from fastapi import APIRouter, HTTPException, Depends

@@ -1,5 +1,10 @@
"""
Version 0.1
FILE: app/routers/embed_router.py
DESCRIPTION: Exposes the local embedding function as an API endpoint.
VERSION: 0.1.0
STATUS: Active
DEPENDENCIES: app.embeddings, pydantic
LAST_ANALYSIS: 2025-12-15
"""

from __future__ import annotations

@@ -1,6 +1,10 @@
"""
app/routers/feedback.py
Endpoint for user feedback (WP-04c).
FILE: app/routers/feedback.py
DESCRIPTION: Endpoint for explicit user feedback (WP-04c).
VERSION: 0.1.0
STATUS: Active
DEPENDENCIES: app.models.dto, app.services.feedback_service
LAST_ANALYSIS: 2025-12-15
"""
from fastapi import APIRouter, HTTPException
from app.models.dto import FeedbackRequest

@@ -1,21 +1,10 @@
"""
app/routers/graph.py - graph endpoints (WP-04)

Purpose:
Returns the neighborhood of a note/ID as a JSON graph (nodes/edges/stats).
Compatibility:
Python 3.12+, FastAPI 0.110+, qdrant-client 1.x
Version:
0.1.0 (initial version)
Date:
2025-10-07
Related:
- app/core/graph_adapter.py
- app/models/dto.py
Usage:
app.include_router(graph.router, prefix="/graph", tags=["graph"])
Change history:
0.1.0 (2025-10-07) - initial version.
FILE: app/routers/graph.py
DESCRIPTION: Provides graph data (nodes/edges) for UI visualizations based on a seed ID. (WP4)
VERSION: 0.1.0
STATUS: Active
DEPENDENCIES: qdrant_client, app.models.dto, app.core.graph_adapter, app.config
LAST_ANALYSIS: 2025-12-15
"""

from __future__ import annotations

@@ -1,8 +1,12 @@
"""
app/routers/ingest.py
API endpoints for WP-11 (Discovery & Persistence).
Delegates to services.
FILE: app/routers/ingest.py
DESCRIPTION: Endpoints for WP-11. Accepts Markdown, drives ingestion and discovery (link suggestions).
VERSION: 0.6.0
STATUS: Active
DEPENDENCIES: app.core.ingestion, app.services.discovery, fastapi, pydantic
LAST_ANALYSIS: 2025-12-15
"""

import os
import time
import logging

@@ -1,23 +1,12 @@
"""
app/routers/query.py - query endpoints (WP-04)
FILE: app/routers/query.py
DESCRIPTION: Classic search endpoints (semantic & hybrid). Initiates asynchronous feedback logging and calls the appropriate retriever mode.
VERSION: 0.2.0
STATUS: Active
DEPENDENCIES: app.models.dto, app.core.retriever, app.services.feedback_service
LAST_ANALYSIS: 2025-12-15
"""

Purpose:
Provides POST /query and calls the appropriate retriever mode.
Compatibility:
Python 3.12+, FastAPI 0.110+
Version:
0.1.0 (initial version)
Date:
2025-10-07
Related:
- app/core/retriever.py
- app/models/dto.py
Usage:
app.include_router(query.router, prefix="/query", tags=["query"])
Change history:
0.2.0 (2025-12-07) - update for WP04c feedback
0.1.0 (2025-10-07) - initial version.
"""
from __future__ import annotations
from fastapi import APIRouter, HTTPException, BackgroundTasks
from app.models.dto import QueryRequest, QueryResponse
@ -1,21 +1,10 @@
|
|||
"""
|
||||
app/routers/tools.py: tool definitions for Ollama/n8n/MCP (read-only)

Purpose:
Provides function schemas (OpenAI-/Ollama-compatible tool JSON) for:
- mindnet_query -> POST /query
- mindnet_subgraph -> GET /graph/{note_id}
Compatibility:
Python 3.12+, FastAPI 0.110+
Version:
0.1.1 (either query OR query_vector is accepted)
As of:
2025-10-07
Usage:
app.include_router(tools.router, prefix="/tools", tags=["tools"])
Changelog:
0.1.1 (2025-10-07) - mindnet_query: oneOf(query, query_vector).
0.1.0 (2025-10-07) - Initial creation.
FILE: app/routers/tools.py
DESCRIPTION: Provides JSON schemas for integration as 'tools' in agents (Ollama/OpenAI). Read-only.
|
||||
VERSION: 0.1.1
|
||||
STATUS: Active
|
||||
DEPENDENCIES: fastapi
|
||||
LAST_ANALYSIS: 2025-12-15
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
|
|
|||
|
|
@ -1,12 +1,11 @@
|
|||
"""
|
||||
app/services/discovery.py
Service for link suggestions and knowledge discovery (WP-11).

Features:
- Sliding-window analysis for long texts.
- Footer scan for project references.
- 'Matrix-Logic' for intelligent edge types (Experience -> Value = based_on).
- Async and compatible with Nomic embeddings.
FILE: app/services/discovery.py
DESCRIPTION: Service for WP-11. Analyzes texts, finds entities, and suggests typed connections ("Matrix-Logic").
|
||||
VERSION: 0.6.0
|
||||
STATUS: Active
|
||||
DEPENDENCIES: app.core.qdrant, app.models.dto, app.core.retriever
|
||||
EXTERNAL_CONFIG: config/types.yaml
|
||||
LAST_ANALYSIS: 2025-12-15
|
||||
"""
|
||||
import logging
|
||||
import asyncio
|
||||
|
|
|
|||
|
|
@ -1,12 +1,10 @@
|
|||
"""
|
||||
app/services/embeddings_client.py: text→embedding service

Purpose:
Unified client for embeddings via Ollama (Nomic).
Ensures that both the async path (ingestion) and the sync path (retriever)
use the same vector space (768 dimensions).

Version: 2.5.0 (Unified Ollama)
FILE: app/services/embeddings_client.py
DESCRIPTION: Unified embedding client. Uses the Ollama API (HTTP). Replaces local sentence-transformers.
|
||||
VERSION: 2.5.0
|
||||
STATUS: Active
|
||||
DEPENDENCIES: httpx, requests, app.config
|
||||
LAST_ANALYSIS: 2025-12-15
|
||||
"""
|
||||
from __future__ import annotations
|
||||
import os
|
||||
|
|
|
|||
|
|
@ -1,9 +1,10 @@
|
|||
"""
|
||||
app/services/feedback_service.py
Service for logging search queries and feedback (WP-04c).
Stores data as JSONL for later self-tuning (WP-08).

Version: 1.1 (Chat-Support)
FILE: app/services/feedback_service.py
DESCRIPTION: Writes search and feedback logs to JSONL files.
|
||||
VERSION: 1.1
|
||||
STATUS: Active
|
||||
DEPENDENCIES: app.models.dto
|
||||
LAST_ANALYSIS: 2025-12-15
|
||||
"""
|
||||
import json
|
||||
import os
|
||||
|
|
|
|||
|
|
@ -1,6 +1,11 @@
|
|||
"""
|
||||
app/services/llm_service.py: LLM client
Version: 2.8.0 (Configurable Concurrency Limit)
FILE: app/services/llm_service.py
DESCRIPTION: Asynchronous client for Ollama. Manages prompts and background load (semaphore).
|
||||
VERSION: 2.8.0
|
||||
STATUS: Active
|
||||
DEPENDENCIES: httpx, yaml, asyncio, app.config
|
||||
EXTERNAL_CONFIG: config/prompts.yaml
|
||||
LAST_ANALYSIS: 2025-12-15
|
||||
"""
|
||||
|
||||
import httpx
|
||||
|
|
|
|||
|
|
@ -1,6 +1,10 @@
|
|||
"""
|
||||
app/services/semantic_analyzer.py: Edge Validation & Filtering
Version: 2.0 (Update: Background Priority for Batch Jobs)
FILE: app/services/semantic_analyzer.py
DESCRIPTION: AI-assisted edge validation. Uses the LLM (background priority) to assign edges precisely to a chunk.
|
||||
VERSION: 2.0.0
|
||||
STATUS: Active
|
||||
DEPENDENCIES: app.services.llm_service, json, logging
|
||||
LAST_ANALYSIS: 2025-12-15
|
||||
"""
|
||||
|
||||
import json
|
||||
|
|
|
|||
|
|
@ -9,145 +9,200 @@ context: "Umfassender Guide für Entwickler: Architektur, Modul-Interna (Deep Di
|
|||
|
||||
# Mindnet Developer Guide & Workflow
|
||||
|
||||
**Sources:** `developer_guide.md`, `dev_workflow.md`
**Sources:** `developer_guide.md`, `dev_workflow.md`, `Architecture_Audit_v2.6`

This guide combines a technical understanding of the modules with the operational workflow between Windows (dev) and Linux (runtime).
This guide is the central technical reference for Mindnet v2.6. It combines a technical understanding of the modules with the operational workflow between Windows (dev) and Linux (runtime).
|
||||
|
||||
---
|
||||
|
||||
## 1. The Physical Architecture
## 1. Introduction & System Overview

### What is Mindnet?
Mindnet is a **hybrid knowledge management system** that connects classic notes (Markdown) with AI-assisted analysis. It combines **RAG** (Retrieval-Augmented Generation) with a **knowledge graph** stored in Qdrant, so that knowledge can be found semantically and also linked structurally.

### Core Philosophy
1. **Filesystem First:** The source of truth is always on disk (Markdown files). The database is a derived index.
2. **Hybrid Retrieval:** Relevance is derived from text similarity (semantics) + graph connections (edges) + importance (centrality).
3. **Active Intelligence:** The system does not just wait for queries; it proactively suggests connections while you write ("Matrix Logic").
4. **Local Privacy:** All AI computations (Ollama) run locally. No cloud dependency for inference.
|
||||
|
||||
---
|
||||
|
||||
## 2. Architecture

### 2.1 High-Level Overview
The system strictly separates the frontend (Streamlit) from the backend (FastAPI). One performance-critical path, the graph visualization, is optimized by reading directly from Qdrant.
|
||||
|
||||
```mermaid
graph TD
    User((User))

    subgraph "Frontend Layer (Streamlit)"
        UI["ui.py Router"]
        ViewChat["Chat View"]
        ViewGraph["Graph View"]
        ViewEditor["Editor View"]
        Logic["Callbacks & State"]
    end

    subgraph "Backend Layer (FastAPI)"
        API["main.py"]
        RouterChat["Chat / RAG"]
        RouterIngest["Ingest / Write"]
        CoreRet["Retriever Engine"]
        CoreIngest["Ingestion Pipeline"]
    end

    subgraph "Infrastructure & Services"
        LLM["Ollama (Phi3/Nomic)"]
        DB[("Qdrant Vector DB")]
        FS["File System (.md)"]
    end

    User <--> UI
    UI -->|"REST (Chat, Save, Feedback)"| API
    UI -.->|"Direct Read (Graph Viz Performance)"| DB
    API -->|"Embeddings & Completion"| LLM
    API -->|"Read/Write"| DB
    API -->|"Read/Write (Source of Truth)"| FS
```
|
||||
|
||||
### 2.2 Data Flow Patterns

1. **Ingestion (Write):**
   `Markdown` -> `Parser` -> `Chunker` -> `SemanticAnalyzer (LLM)` -> `Embedder` -> `Qdrant (Points)`
2. **Retrieval (Read):**
   `Query` -> `Embedding` -> `Hybrid Search (Vector + Graph)` -> `Re-Ranking` -> `LLM Context`
3. **Visualization (Graph):**
   `UI` -> `GraphService` -> `Qdrant (Edges Collection)` -> `Cytoscape`
|
||||
|
||||
---
|
||||
|
||||
## 3. Physical Architecture

Mindnet runs in a distributed environment (post-WP15 setup).

* **Windows 11 (VS Code):** This is where you write code. **Never** work directly on `main`!
* **Beelink (Runtime):** The server. This is where the software runs. We use **systemd services**:
    * **PROD:** API (8001) + UI (8501). Folder: `~/mindnet`.
    * **DEV:** API (8002) + UI (8502). Folder: `~/mindnet_dev`.
* **Gitea:** The "safe" (Raspberry Pi). Stores the code and manages versions.
* **Windows 11 (VS Code):** Development environment. **Never** work directly on `main`!
* **Beelink (Runtime):** The server hosts two instances via systemd:
    * **PROD:** API (port 8001) + UI (port 8501). Home: `~/mindnet`.
    * **DEV:** API (port 8002) + UI (port 8502). Home: `~/mindnet_dev`.
* **Gitea (Raspberry Pi):** Version control (the "safe"). Stores the code.
|
||||
|
||||
---
|
||||
|
||||
## 2. Project Structure & Reference
## 4. Project Structure & Module Reference (Deep Dive)

### 2.1 Directory Tree
The system is modular. The following sections analyze all components in detail.

### 4.1 Directory Tree
|
||||
|
||||
```text
|
||||
mindnet/
|
||||
├── app/
|
||||
│ ├── core/ # Ingestion, Chunker, Qdrant Wrapper
|
||||
│ ├── routers/ # FastAPI Endpoints
|
||||
│ ├── services/ # Ollama Client, Traffic Control
|
||||
│ ├── core/ # Business Logic & Algorithms
|
||||
│ ├── routers/ # API Interface (FastAPI)
|
||||
│ ├── services/ # External Integrations (LLM, DB)
|
||||
│ ├── models/ # Pydantic DTOs
|
||||
│ └── frontend/ # Streamlit UI Module
|
||||
├── config/ # YAML Configs (Single Source of Truth)
|
||||
├── scripts/ # CLI Tools (Import, Diagnose, Reset)
|
||||
├── tests/ # Pytest Suite & Smoke Scripts
|
||||
└── vault/ # Lokaler Test-Content
|
||||
│ └── frontend/ # UI Logic (Streamlit)
|
||||
├── config/ # Configuration Files (YAML)
|
||||
├── scripts/ # CLI Tools (Ops & Maintenance)
|
||||
└── vault/ # Local Content Storage
|
||||
```
|
||||
|
||||
### 2.2 Complete File Reference (Auto-Scan)
### 4.2 Frontend Architecture (`app/frontend/`)

An overview of all scripts and modules in the system.
The frontend is a Streamlit app that behaves like a single-page application (SPA).

| File/Path | Type | Description |
| Module | Status | Responsibility |
| :--- | :--- | :--- |
|
||||
| **Backend Core** | | |
| `app/main.py` | Script | Bootstraps the FastAPI API. |
| `app/config.py` | Config | Central configuration (Pydantic Settings). |
| `app/core/ingestion.py` | Core module | Async ingestion service & change detection. |
| `app/core/chunker.py` | Core module | Smart chunker orchestrator. |
| `app/core/retriever.py` | Core module | Hybrid search algorithm (semantics + graph). |
| `app/core/ranking.py` | Core module | Combined scoring (WP-04). |
| `app/core/graph_adapter.py` | Core module | Adjacency construction & subgraph expansion. |
| `app/core/qdrant.py` | Core module | Qdrant client wrapper. |
| `app/core/qdrant_points.py` | Core module | Robust point helpers for Qdrant (retry logic). |
| `app/core/derive_edges.py` | Core module | Edge generation from Markdown. |
| `app/core/edges.py` | Core module | Data structures for edges. |
| `app/core/edges_writer.py` | Core module | Writes edges to the DB. |
| `app/core/note_payload.py` | Core module | Builder for note metadata. |
| `app/core/chunk_payload.py` | Core module | Builder for chunk payloads. |
| `app/core/type_registry.py` | Core module | Logic for loading `types.yaml`. |
| `app/core/schema_loader.py` | Core module | Loads JSON schemas for validation. |
| `app/core/env_vars.py` | Core module | Environment variable constants. |
| **API Router** | | |
| `app/routers/chat.py` | API router | RAG endpoint & hybrid router. |
| `app/routers/query.py` | API router | Query endpoints (WP-04). |
| `app/routers/graph.py` | API router | Graph endpoints (WP-04). |
| `app/routers/ingest.py` | API router | Ingestion trigger & analysis. |
| `app/routers/feedback.py` | API router | Feedback endpoint. |
| `app/routers/tools.py` | API router | Tool definitions for Ollama/n8n/MCP. |
| `app/routers/admin.py` | API router | Admin/monitoring endpoints. |
|
||||
| **`ui.py`** | 🟢 Core | **Main Router.** Initializes session state and decides, based on the sidebar selection, which view is rendered. |
| **`ui_config.py`** | 🟢 Config | **Constants.** Central place for colors (`GRAPH_COLORS`), API URLs, and timeouts. Look-and-feel changes happen here. |
| **`ui_chat.py`** | 🟢 View | **Chat UI.** Renders the message history, intent badges, source expanders, and feedback buttons. |
| **`ui_editor.py`** | 🟢 View | **Editor UI.** Markdown editor with live preview. Integrates "Intelligence" (AI link suggestions). |
| **`ui_graph_cytoscape.py`** | 🟢 View | **Modern Graph.** Interactive graph based on Cytoscape.js (COSE layout). |
| **`ui_graph.py`** | 🟡 Legacy | **Graph UI (Fallback).** Old implementation using `streamlit-agraph`. |
| **`ui_callbacks.py`** | 🟢 Logic | **State Controller.** Handles complex state transitions (e.g. Graph -> Editor). |
| **`ui_utils.py`** | 🟢 Logic | **Helper.** Contains the **Healing Parser** (`parse_markdown_draft`), which repairs broken JSON/YAML produced by LLMs. |
| **`ui_api.py`** | 🟢 Data | **API Client.** Wrapper for backend REST calls. |
| **`ui_graph_service.py`** | 🟢 Data | **Performance Hack.** Accesses Qdrant directly (bypassing the API) to load graphs quickly. |
|
||||
|
||||
#### Frontend Design Patterns (Important!)

1. **Active Inspector Pattern (`ui_graph_cytoscape.py`)**
   To avoid re-renders in the graph, we use CSS classes. When a node is clicked, only its CSS class (`.inspected`) changes; the physics simulation does not restart. This keeps the UI feeling stable.

2. **Resurrection Pattern (`ui_editor.py`)**
   Streamlit tends to "forget" inputs on re-runs. The editor aggressively syncs its content into `session_state`.
    * Logic: `if widget_key not in session_state: restore_from_data_key()`.
    * Result: Text input survives tab switches.

3. **Filesystem First (`ui_callbacks.py`)**
   When you click "Edit" in the graph:
    1. The system first tries to read the **real file** from disk.
    2. Only if that fails is the text reconstructed from the database chunks ("stitching").
   This prevents stale database states from overwriting the real files.
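
The fallback order of the Filesystem First pattern can be sketched as follows. This is a minimal illustration and not the code in `ui_callbacks.py`: the function name `load_note_text`, the chunk dictionaries with `index` and `text` keys, and joining chunks with blank lines are all assumptions.

```python
from pathlib import Path
from typing import Any, Iterable, Mapping, Optional


def load_note_text(path: Optional[str], chunks: Iterable[Mapping[str, Any]]) -> str:
    """Return the note text, preferring the file on disk over database chunks."""
    if path:
        try:
            # 1. Filesystem first: the Markdown file is the source of truth.
            return Path(path).read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            pass  # fall through to the database reconstruction
    # 2. Fallback ("stitching"): rebuild the text from chunks in index order.
    ordered = sorted(chunks, key=lambda chunk: int(chunk["index"]))
    return "\n\n".join(str(chunk["text"]) for chunk in ordered)
```

Stitched text is only an approximation of the original file, which is why the sketch reaches for it last.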
|
||||
|
||||
---
|
||||
|
||||
### 4.3 Backend Architecture (`app/`)

The backend exposes its logic through a REST API.

| Module | Type | Responsibility |
| :--- | :--- | :--- |
| **Core Engine** | | |
| `core/ingestion.py` | Engine | **Pipeline Controller.** Coordinates the 13-step import, parsing, hash check, and DB upserts. |
| `core/retriever.py` | Engine | **Search Engine.** Computes the hybrid score: `(Semantic * W) + (Edge Bonus * 0.25) + (Centrality * 0.05)`. |
| `core/chunker.py` | Engine | **Segmentation.** Splits text intelligently. Orchestrates `SemanticAnalyzer` for Smart Edges. |
| `core/derive_edges.py` | Engine | **Link Extractor.** Finds wikilinks, callouts, and typed relations in the text. |
| `core/qdrant_points.py` | Mapper | **Object Mapper.** Converts payloads into Qdrant `PointStruct`s. |
| `core/graph_adapter.py` | Algo | **Graph Logic.** Builds in-memory graphs for re-ranking and path analyses. |
| **Router (API)** | | |
| `routers/chat.py` | Router | **Hybrid Router.** Decides between a RAG answer and interview mode. |
| `routers/ingest.py` | Router | **Write API.** Accepts Markdown and drives ingestion and discovery analysis. |
| `routers/query.py` | Router | **Search API.** Classic hybrid retriever endpoint. |
| `routers/graph.py` | Router | **Viz API.** Provides nodes/edges for the frontend. |
|
||||
| **Services** | | |
| `app/services/llm_service.py` | Service | LLM client with traffic control. |
| `app/services/llm_ollama.py` | Service | Legacy: Ollama integration & prompt construction. |
| `app/services/embeddings_client.py` | Service | Async text→embedding service. |
| `app/services/semantic_analyzer.py` | Service | Smart edge validation & filtering. |
| `app/services/discovery.py` | Service | Backend intelligence (matrix logic). |
| `app/services/feedback_service.py` | Service | Writes JSONL logs. |
| **Frontend** | | |
| `app/frontend/ui.py` | Frontend | Entrypoint (Streamlit). |
| `app/frontend/ui_editor.py` | Frontend | Editor view & logic. |
| `app/frontend/ui_chat.py` | Frontend | Chat view. |
| `app/frontend/ui_graph_cytoscape.py` | Frontend | Graph visualization (modern). |
| `app/frontend/ui_graph.py` | Frontend | Graph visualization (legacy). |
| `app/frontend/ui_graph_service.py` | Frontend | Data preparation for graphs. |
| `app/frontend/ui_callbacks.py` | Frontend | Event handlers. |
| `app/frontend/ui_api.py` | Frontend | Backend bridge. |
| `app/frontend/ui_utils.py` | Frontend | Helpers (Healing Parser). |
| `app/frontend/ui_config.py` | Frontend | Constants (colors, URLs). |
| **CLI & Scripts** | | |
| `scripts/import_markdown.py` | Script | Main importer CLI. |
| `scripts/reset_qdrant.py` | Script | Deletes collections (`--mode wipe`). |
| `scripts/payload_dryrun.py` | Script | Shows payloads BEFORE the upsert. |
| `scripts/edges_dryrun.py` | Script | Generates edges without a DB write. |
| `scripts/edges_full_check.py` | Script | Checks graph integrity. |
| `scripts/resolve_unresolved_references.py` | Script | Resolves wikilinks after the fact. |
| `scripts/audit_vault_vs_qdrant.py` | Script | Consistency check, files vs. DB. |
| `scripts/audit_edges_vs_expectations.py` | Script | Checks edges against expected values. |
| `scripts/setup_mindnet_collections.py` | Script | Sets up the collections initially. |
| `scripts/export_markdown.py` | Script | Exports Qdrant back to Markdown. |
| `scripts/wp04_smoketest.py` | Script | Quick E2E test of the WP-04 endpoints. |
| `scripts/health_check_mindnet.py` | Script | System health check. |
| `scripts/report_hashes.py` | Script | Overview of multiple-hash cases. |
| `scripts/make_test_vault.py` | Script | Creates a minimal test vault. |
| `scripts/ollama_tool_runner.py` | Script | Minimal tool caller for Ollama. |
| `services/llm_service.py` | Service | **Traffic Control.** Async client for Ollama. Uses a **semaphore** to throttle background jobs (import). |
| `services/discovery.py` | Service | **Intelligence.** "Matrix Logic" for link suggestions (WP-11). |
| `services/semantic_analyzer.py` | Service | **Filter.** AI validation of edges in the background. |
| `services/feedback_service.py` | Service | **Logging.** Writes interaction logs (JSONL). |
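
The hybrid score formula listed for `core/retriever.py` can be illustrated with a small function. The weights 0.25 and 0.05 come from the table; the function name, the parameter names, and the example values are assumptions, and `W` is taken here to be a per-type weight such as `retriever_weight`, which the source does not state explicitly.

```python
EDGE_BONUS_WEIGHT = 0.25   # from the formula in the table
CENTRALITY_WEIGHT = 0.05   # from the formula in the table


def hybrid_score(semantic: float, edge_bonus: float, centrality: float,
                 type_weight: float = 1.0) -> float:
    """Compute (semantic * W) + (edge_bonus * 0.25) + (centrality * 0.05)."""
    return (semantic * type_weight
            + edge_bonus * EDGE_BONUS_WEIGHT
            + centrality * CENTRALITY_WEIGHT)


# Hypothetical values: a strong semantic hit on a note type weighted 0.9.
score = hybrid_score(semantic=0.8, edge_bonus=1.0, centrality=0.5, type_weight=0.9)
print(f"{score:.3f}")
```

With these weights the semantic term dominates; graph signals mainly reorder candidates whose semantic scores are close.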
|
||||
|
||||
---
|
||||
|
||||
## 3. Core Modules in Detail (Architecture)
### 4.4 Scripts & Tooling (The Admin Toolbox)

This section explains *how* the most important components work under the hood.
The `scripts/` folder contains verified tools for operations.

### 3.1 The Importer (`scripts.import_markdown`)
This is the most complex module.
* **Orchestration:** It calls `app.core.chunker` for text segmentation and `app.services.semantic_analyzer` for Smart Edges.
* **Idempotency:** The importer can run any number of times. It uses deterministic IDs (UUIDv5) and consistently overwrites existing entries.
* **Robustness:** `ingestion.py` implements mechanisms such as change detection (hash comparison) and robust file I/O.
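
The idempotency property rests on deterministic IDs. Here is a minimal sketch using the standard library's `uuid.uuid5`; the namespace and the `note_id#chunk_index` key format are assumptions, not the importer's actual scheme.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
MINDNET_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/mindnet")


def chunk_point_id(note_id: str, chunk_index: int) -> str:
    """Derive the same point ID for the same note/chunk on every run."""
    return str(uuid.uuid5(MINDNET_NAMESPACE, f"{note_id}#{chunk_index}"))


first_run = chunk_point_id("project-alpha", 0)
second_run = chunk_point_id("project-alpha", 0)
assert first_run == second_run                          # a re-import targets the same point
assert first_run != chunk_point_id("project-alpha", 1)  # other chunks get other IDs
```

Because the ID is a pure function of its inputs, an upsert on a repeated import overwrites the existing entry instead of creating a duplicate.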
|
||||
|
||||
### 3.2 The Hybrid Router (`app.routers.chat`)
This module contains the logic for intent detection (WP06) and interview mode (WP07).
* **Question Detection:** First checks with rules whether the input is a question (`?`, German interrogatives or "W-words"). If so -> RAG.
* **Keyword Match:** Checks keywords from `decision_engine.yaml` and `types.yaml`.
* **Priority:** Calls `llm_service` with `priority="realtime"` to bypass the import queue.
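
The rule-based question check can be approximated like this. The W-word list and the function name `looks_like_question` are assumptions; the actual rules in `chat.py` may differ.

```python
import re

# Assumed list of German interrogatives ("W-words"); extend as needed.
W_WORDS = ("wer", "was", "wann", "wo", "warum", "wie", "wieso",
           "weshalb", "welche", "welcher", "welches")

# Match a W-word only as a whole word at the start of the input.
_W_WORD_RE = re.compile(r"^(?:" + "|".join(W_WORDS) + r")\b", re.IGNORECASE)


def looks_like_question(text: str) -> bool:
    """Return True if the input ends with '?' or starts with a W-word."""
    stripped = text.strip()
    return stripped.endswith("?") or bool(_W_WORD_RE.match(stripped))
```

A check like this runs before any LLM call, so obvious questions can be routed to RAG cheaply.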
|
||||
|
||||
### 3.3 The Retriever (`app.core.retriever`)
This is where scoring happens (WP04a).
* **Hybrid Search:** The chat endpoint enforces `mode="hybrid"`.
* **Strategic Retrieval:** In `chat.py`, the retriever may be called *twice* when an intent (e.g. `DECISION`) requires an injection (`value`).

### 3.4 The Frontend (`app.frontend.ui`)
A Streamlit app (WP10/19).
* **Resurrection Pattern:** The UI uses dedicated state management (`st.session_state`) to preserve inputs across tab switches (Chat <-> Editor). Widgets synchronize via callbacks.
* **Healing Parser:** The `parse_markdown_draft` function automatically repairs broken YAML frontmatter (e.g. a missing `---`) produced by the LLM.
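
One of the repairs named for the Healing Parser, a missing closing `---`, can be sketched as follows. This is not the real `parse_markdown_draft`: the function name `heal_frontmatter` and the heuristic (frontmatter ends at the first blank line) are assumptions.

```python
def heal_frontmatter(draft: str) -> str:
    """Insert a missing closing '---' after a YAML frontmatter block."""
    lines = draft.splitlines()
    if not lines or lines[0].strip() != "---":
        return draft  # no frontmatter opener: nothing to heal
    if any(line.strip() == "---" for line in lines[1:]):
        return draft  # a closing delimiter is already present
    # Assumed heuristic: the frontmatter ends at the first blank line.
    end = len(lines)
    for i, line in enumerate(lines[1:], start=1):
        if not line.strip():
            end = i
            break
    return "\n".join(lines[:end] + ["---"] + lines[end:])
```

A known limitation of this sketch: a horizontal rule (`---`) in the note body is mistaken for the closing delimiter, so such drafts are returned unchanged.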
|
||||
|
||||
### 3.5 Traffic Control (`app.services.llm_service`)
New in v2.6. Ensures that batch processes (import) do not slow down the live chat.
* **Method:** `generate_raw_response(..., priority="background")` activates a semaphore.
* **Limit:** Configurable via `MINDNET_LLM_BACKGROUND_LIMIT` (default: 2).
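
The throttling idea can be sketched with `asyncio.Semaphore`. The class `ThrottledClient`, its `generate` method, and the simulated LLM call are hypothetical stand-ins; only the environment variable name, its default of 2, and the `priority` values come from the text.

```python
import asyncio
import os

# Limit from the environment, defaulting to 2 as described in the text.
BACKGROUND_LIMIT = int(os.getenv("MINDNET_LLM_BACKGROUND_LIMIT", "2"))


class ThrottledClient:
    """Hypothetical stand-in for the LLM client; only the throttling is shown."""

    def __init__(self, limit: int = BACKGROUND_LIMIT) -> None:
        self._background = asyncio.Semaphore(limit)
        self.active = 0
        self.peak = 0

    async def _call_llm(self, prompt: str) -> str:
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)  # placeholder for the HTTP request to Ollama
        self.active -= 1
        return f"response to: {prompt}"

    async def generate(self, prompt: str, priority: str = "realtime") -> str:
        if priority == "background":
            async with self._background:  # at most `limit` background calls at once
                return await self._call_llm(prompt)
        return await self._call_llm(prompt)  # realtime calls skip the semaphore


async def demo() -> int:
    client = ThrottledClient(limit=2)
    jobs = (client.generate(f"job {i}", priority="background") for i in range(6))
    await asyncio.gather(*jobs)
    return client.peak  # highest number of concurrent calls observed
```

Running `asyncio.run(demo())` shows that six background jobs never exceed the configured limit, while realtime calls would pass straight through.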
|
||||
| Script | Status | Purpose | Key argument |
| :--- | :--- | :--- | :--- |
| **`import_markdown.py`** | 🟢 Prod | **Master Sync.** The central importer. | `--apply`, `--purge-before-upsert` |
| **`reset_qdrant.py`** | ⚠️ Ops | **Wipe.** Deletes collections for rebuilds. | `--mode wipe`, `--yes` |
| **`export_markdown.py`** | 🟢 Backup | **Backup.** Exports DB content back to Markdown. | -- |
| **`health_check_mindnet.py`** | 🟢 Ops | **Monitoring.** Checks whether the API and DB are running. | (Exit code 0/1) |
| **`payload_dryrun.py`** | 🟢 Test | **Audit.** Simulates the import (schema check). | -- |
| **`edges_full_check.py`** | 🟢 Test | **Integrity.** Checks graph logic. | -- |
| **`resolve_unresolved.py`** | 🟡 Maint | **Repair.** Attempts to heal broken links. | -- |
|
||||
|
||||
---
|
||||
|
||||
## 4. Local Setup (Development)
## 5. Maintenance & "Kill List"

The following files were identified in audit v2.6 as outdated, redundant, or "zombie code" and should be removed or archived.

| File | Diagnosis | Recommended action |
| :--- | :--- | :--- |
| `app/embed_server.py` | **Zombie.** Old standalone server. | 🗑️ Delete |
| `app/embeddings.py` | **Zombie.** Outdated local library. | 🗑️ Delete |
| `app/core/edges.py` | **Redundant.** Replaced by `derive_edges.py`. | 🗑️ Delete |
| `app/core/ranking.py` | **Redundant.** Logic integrated into `retriever.py`. | 🗑️ Delete |
| `app/core/type_registry.py` | **Redundant.** Logic integrated into `ingestion.py`. | 🗑️ Delete |
| `app/core/env_vars.py` | **Outdated.** Replaced by `config.py`. | 🗑️ Delete |
| `app/routers/qdrant_router.py` | **Deprecated.** No logic, only CRUD. | 📂 Move to `scripts/archive/` |

---

## 6. Local Setup (Development)

**Prerequisites:** Python 3.10+, Docker, Ollama.
|
||||
|
||||
|
|
@ -170,16 +225,18 @@ ollama pull nomic-embed-text
|
|||
**Configuration (`.env`):**
|
||||
```ini
|
||||
QDRANT_URL="http://localhost:6333"
|
||||
MINDNET_OLLAMA_URL="http://localhost:11434"
|
||||
MINDNET_LLM_MODEL="phi3:mini"
|
||||
MINDNET_EMBEDDING_MODEL="nomic-embed-text"
|
||||
COLLECTION_PREFIX="mindnet_dev"
|
||||
VECTOR_DIM=768
|
||||
MINDNET_LLM_BACKGROUND_LIMIT=2
|
||||
MINDNET_API_URL="http://localhost:8002"
|
||||
MINDNET_LLM_TIMEOUT=300.0
|
||||
```
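
To show how these variables might be consumed, here is a minimal sketch that reads them with the defaults from the example. The `Settings` dataclass and `load_settings` function are assumptions for illustration; the real `app/config.py` may be structured differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    qdrant_url: str
    ollama_url: str
    llm_model: str
    embedding_model: str
    collection_prefix: str
    vector_dim: int
    llm_background_limit: int
    api_url: str
    llm_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables from the example, falling back to its values."""
    return Settings(
        qdrant_url=env.get("QDRANT_URL", "http://localhost:6333"),
        ollama_url=env.get("MINDNET_OLLAMA_URL", "http://localhost:11434"),
        llm_model=env.get("MINDNET_LLM_MODEL", "phi3:mini"),
        embedding_model=env.get("MINDNET_EMBEDDING_MODEL", "nomic-embed-text"),
        collection_prefix=env.get("COLLECTION_PREFIX", "mindnet_dev"),
        vector_dim=int(env.get("VECTOR_DIM", "768")),
        llm_background_limit=int(env.get("MINDNET_LLM_BACKGROUND_LIMIT", "2")),
        api_url=env.get("MINDNET_API_URL", "http://localhost:8002"),
        llm_timeout=float(env.get("MINDNET_LLM_TIMEOUT", "300.0")),
    )
```

Converting `VECTOR_DIM` and the timeout at load time surfaces a malformed `.env` at startup rather than on the first request.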
|
||||
|
||||
---
|
||||
|
||||
## 5. The Development Cycle (Workflow)
## 7. The Development Cycle (Workflow)

### Phase 1: Windows (Code)
1. **Update your base:** `git checkout main && git pull`.
|
||||
|
|
@ -219,7 +276,7 @@ Wenn alles getestet ist:
|
|||
|
||||
---
|
||||
|
||||
## 6. Extension Guide: "Teach-the-AI"
## 8. Extension Guide: "Teach-the-AI"

Mindnet does not learn through training (fine-tuning) but through **configuration** and **linking**.
|
||||
|
||||
|
|
@ -227,6 +284,7 @@ Mindnet lernt nicht durch Training (Fine-Tuning), sondern durch **Konfiguration*
|
|||
1. **Physics (`config/types.yaml`):**
```yaml
risk:
  chunk_profile: sliding_short
  retriever_weight: 0.90 # Very important
  edge_defaults: ["blocks"] # Automatic edge
  detection_keywords: ["gefahr", "risiko"]
|
||||
|
|
@ -238,21 +296,20 @@ Mindnet lernt nicht durch Training (Fine-Tuning), sondern durch **Konfiguration*
|
|||
```
|
||||
*Result:* When the `DECISION` intent is detected, the system now also actively searches for risks.

### Workflow B: Adjusting the interview schema (WP07)
If Mindnet should ask new questions (e.g. "Budget" for projects):
1. **Schema (`config/types.yaml`):**
```yaml
project:
  schema:
    - "Titel"
    - "Ziel"
    - "Budget (Neu)"
```
2. **No code required:** The `One-Shot Extractor` (prompt template) reads this list dynamically.
### Workflow B: Changing graph colors
1. Open `app/frontend/ui_config.py`.
2. Edit the `GRAPH_COLORS` dictionary.

```python
GRAPH_COLORS = {
    "project": "#FF4B4B",
    "risk": "#8B0000"  # New
}
```
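
A lookup with a fallback keeps the graph renderable when a note type has no color entry yet. In this sketch, `DEFAULT_COLOR` and `color_for` are hypothetical names, not part of `ui_config.py`.

```python
GRAPH_COLORS = {
    "project": "#FF4B4B",
    "risk": "#8B0000",
}

DEFAULT_COLOR = "#808080"  # hypothetical neutral fallback for unknown types


def color_for(note_type: str) -> str:
    """Return the configured color for a note type, or the fallback."""
    return GRAPH_COLORS.get(note_type.lower(), DEFAULT_COLOR)
```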
|
||||
|
||||
---
|
||||
|
||||
## 7. Tests & Debugging
|
||||
## 9. Tests & Debugging
|
||||
|
||||
**Unit Tests (Pytest):**
|
||||
```bash
|
||||
|
|
@ -280,7 +337,7 @@ python tests/test_feedback_smoke.py --url http://localhost:8002/query
|
|||
|
||||
---
|
||||
|
||||
## 8. Troubleshooting & One-Liners
|
||||
## 10. Troubleshooting & One-Liners
|
||||
|
||||
**Reset the DB completely (caution!):**
|
||||
```bash
|
||||
|
|
@ -300,5 +357,9 @@ journalctl -u mindnet-ui-dev -f
|
|||
```
|
||||
|
||||
**"UnicodeDecodeError in .env":**
* Cause: Umlauts or special characters in the `.env`.
* Fix: Clean the file (ASCII only) and make sure it is saved as UTF-8 without a BOM.
* **Cause:** Umlauts or special characters in the `.env`.
* **Fix:** Clean the file (ASCII only) and make sure it is saved as UTF-8 without a BOM.
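
A small check can locate the offending bytes before the application starts. The function name `check_env_file` is an assumption; the sketch uses only the standard library.

```python
import codecs
from pathlib import Path
from typing import List


def check_env_file(path: str) -> List[str]:
    """Return a list of problems that commonly cause a UnicodeDecodeError."""
    raw = Path(path).read_bytes()
    problems: List[str] = []
    if raw.startswith(codecs.BOM_UTF8):
        problems.append("file starts with a UTF-8 BOM")
        raw = raw[len(codecs.BOM_UTF8):]
    for number, line in enumerate(raw.splitlines(), start=1):
        if not line.isascii():
            problems.append(f"line {number} contains non-ASCII bytes")
    return problems
```

An empty list means the file is plain ASCII without a BOM, which is the state the fix asks for.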
|
||||
|
||||
**"Read timed out" in the frontend:**
* **Cause:** Smart Edges take longer than 60s.
* **Fix:** Set `MINDNET_API_TIMEOUT=300.0` in `.env`.
|
||||