WP24c - Agentic Edge Validation & Chunk-Aware Multigraph-System (v4.5.8) #22

Lars · 2026-01-12T10:52:40+01:00

feat: Phase 3 Agentic Edge Validation & Chunk-Aware Multigraph-System (v4.5.8)

Phase 3 Agentic Edge Validation

Final validation gate for edges with the candidate: prefix
LLM-based semantic check against context (note scope vs. chunk scope)
Differentiated error handling: transient errors let the edge through, permanent errors reject it
Context optimization: note scope uses the note summary/text, chunk scope uses the specific chunk text
Implemented in app/core/ingestion/ingestion_validation.py (v2.14.0)
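
The gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in ingestion_validation.py: the `Edge` dataclass, the two exception classes, the `judge` callback, and the choice to strip the prefix on acceptance are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

CANDIDATE_PREFIX = "candidate:"


class TransientLLMError(Exception):
    """Hypothetical marker for recoverable failures (timeout, rate limit)."""


class PermanentLLMError(Exception):
    """Hypothetical marker for unrecoverable failures (bad config, bad response)."""


@dataclass
class Edge:
    kind: str
    target_id: str
    rule_id: str
    scope: str = "chunk"  # "note" or "chunk"


def pick_context(edge: Edge, note_summary: Optional[str],
                 note_text: str, chunk_text: str) -> str:
    """Note-scope edges are judged against the note, chunk-scope edges against their chunk."""
    if edge.scope == "note":
        return note_summary or note_text
    return chunk_text


def validate_edge(edge: Edge, judge: Callable[[Edge, str], bool],
                  note_summary: Optional[str], note_text: str,
                  chunk_text: str) -> bool:
    """Return True if the edge may be kept."""
    if not edge.rule_id.startswith(CANDIDATE_PREFIX):
        return True  # only candidate edges go through the gate
    context = pick_context(edge, note_summary, note_text, chunk_text)
    try:
        accepted = bool(judge(edge, context))
    except TransientLLMError:
        return True  # transient failure: allow the edge, rule_id left untouched
    except PermanentLLMError:
        return False  # permanent failure: reject the edge
    if accepted:
        # Assumption: a verified edge drops its candidate prefix.
        edge.rule_id = edge.rule_id[len(CANDIDATE_PREFIX):]
    return accepted
```

In a real pipeline `judge` would wrap the LLM call; passing it in as a callback keeps the error policy testable without a model.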

Automatic Mirror Edges (Inverse Logic)

Mirror edges are generated automatically for explicit connections
Phase 2 batch injection at the end of the import
Authority check: explicit edges take precedence (no duplicates)
Provenance firewall: system edges cannot be overwritten manually
Implemented in app/core/ingestion/ingestion_processor.py (v2.13.12)
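
A minimal sketch of the inverse logic with the authority check might look like this. The `INVERSE_KIND` vocabulary, the dictionary field names, and the `"system"` provenance value are hypothetical; the project's actual mapping and payload layout may differ.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical inverse vocabulary, for illustration only.
INVERSE_KIND: Dict[str, str] = {
    "supports": "supported_by",
    "supported_by": "supports",
    "related_to": "related_to",
}

EdgeKey = Tuple[str, str, str]  # (source_id, target_id, kind)


def build_mirror_edges(explicit_edges: List[dict]) -> List[dict]:
    """Create inverse edges for explicit ones, never shadowing an explicit edge."""
    explicit_keys: Set[EdgeKey] = {
        (e["source_id"], e["target_id"], e["kind"]) for e in explicit_edges
    }
    emitted: Set[EdgeKey] = set()
    mirrors: List[dict] = []
    for edge in explicit_edges:
        inverse = INVERSE_KIND.get(edge["kind"])
        if inverse is None:
            continue  # no known inverse for this kind
        key = (edge["target_id"], edge["source_id"], inverse)
        # Authority check: an explicit edge with the same key wins.
        if key in explicit_keys or key in emitted:
            continue
        emitted.add(key)
        mirrors.append({
            "source_id": key[0],
            "target_id": key[1],
            "kind": inverse,
            "provenance": "system",  # assumed label for generated edges
            "virtual": True,
        })
    return mirrors
```

Collecting all explicit keys first and emitting mirrors afterwards mirrors the two-phase idea: explicit edges are settled before any generated edge is considered.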

Note-Scope Zones (v4.2.0)

Global connections for entire notes (scope: note)
Header names configurable via ENV variables
Highest priority among duplicates
Phase 3 validation uses the note summary/text for better precision
Implemented in app/core/graph/graph_derive_edges.py (v1.1.2)

Chunk-Aware Multigraph-System

Section-based links: [[Note#Section]] is split precisely into target_id and target_section
Multigraph support: multiple edges between the same nodes are possible (different sections)
Semantic deduplication based on the src->tgt:kind@sec key
Metadata persistence: target_section, provenance, and confidence are preserved
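
To illustrate the section split and the deduplication key, here is a small sketch. The function names and the keep-the-first-edge policy are assumptions made for the example; only the `src->tgt:kind@sec` key shape is taken from the description above.

```python
from typing import Dict, List, Optional, Tuple


def split_link(raw: str) -> Tuple[str, Optional[str]]:
    """Split 'Note#Section' (optionally wrapped in [[...]]) into id and section."""
    target = raw.strip()
    if target.startswith("[[") and target.endswith("]]"):
        target = target[2:-2]
    note, _, section = target.partition("#")
    return note.strip(), (section.strip() or None)


def dedup_key(src: str, tgt: str, kind: str, section: Optional[str]) -> str:
    """Build the src->tgt:kind@sec key; an empty section keeps the trailing '@'."""
    return f"{src}->{tgt}:{kind}@{section or ''}"


def deduplicate(edges: List[dict]) -> List[dict]:
    """Keep one edge per key. Assumption: the first edge seen wins."""
    kept: Dict[str, dict] = {}
    for edge in edges:
        key = dedup_key(edge["source_id"], edge["target_id"],
                        edge["kind"], edge.get("target_section"))
        kept.setdefault(key, edge)
    return list(kept.values())
```

Because the section is part of the key, two links from the same note to different sections of one target survive as separate edges, which is what makes the graph a multigraph.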

Code Components

app/core/ingestion/ingestion_validation.py: v2.14.0 (Phase 3 validation, context optimization)
app/core/ingestion/ingestion_processor.py: v2.13.12 (automatic mirror edges, authority check)
app/core/graph/graph_derive_edges.py: v1.1.2 (note-scope zones, LLM validation zones)
app/core/chunking/chunking_processor.py: v2.13.0 (LLM validation zone detection)
app/core/chunking/chunking_parser.py: v2.12.0 (header-level detection, zone extraction)

Configuration

New ENV variables for configurable headers:
- MINDNET_LLM_VALIDATION_HEADERS (Default: "Unzugeordnete Kanten,Edge Pool,Candidates")
- MINDNET_LLM_VALIDATION_HEADER_LEVEL (Default: 3)
- MINDNET_NOTE_SCOPE_ZONE_HEADERS (Default: "Smart Edges,Relationen,Global Links,Note-Level Relations,Globale Verbindungen")
- MINDNET_NOTE_SCOPE_HEADER_LEVEL (Default: 2)
config/llm_profiles.yaml: ingest_validator profile for Phase 3 validation (temperature 0.0)
config/prompts.yaml: edge_validation prompt for Phase 3 validation
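
As a rough sketch of how these four variables could be read with their documented defaults: the variable names and default values come from the list above, while the function name, the result keys, and the fallback on invalid levels are assumptions.

```python
import os
from typing import List, Mapping

DEFAULTS = {
    "MINDNET_LLM_VALIDATION_HEADERS": "Unzugeordnete Kanten,Edge Pool,Candidates",
    "MINDNET_LLM_VALIDATION_HEADER_LEVEL": "3",
    "MINDNET_NOTE_SCOPE_ZONE_HEADERS": (
        "Smart Edges,Relationen,Global Links,"
        "Note-Level Relations,Globale Verbindungen"
    ),
    "MINDNET_NOTE_SCOPE_HEADER_LEVEL": "2",
}


def _csv(value: str) -> List[str]:
    """Split a comma-separated list and drop empty entries."""
    return [part.strip() for part in value.split(",") if part.strip()]


def _level(value: str, fallback: int) -> int:
    """Parse a Markdown header level (1-6); fall back on invalid input."""
    try:
        level = int(value)
    except ValueError:
        return fallback
    return level if 1 <= level <= 6 else fallback


def load_header_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read the header settings, using the documented defaults when unset."""
    def get(name: str) -> str:
        return env.get(name, DEFAULTS[name])

    return {
        "llm_validation_headers": _csv(get("MINDNET_LLM_VALIDATION_HEADERS")),
        "llm_validation_level": _level(get("MINDNET_LLM_VALIDATION_HEADER_LEVEL"), 3),
        "note_scope_headers": _csv(get("MINDNET_NOTE_SCOPE_ZONE_HEADERS")),
        "note_scope_level": _level(get("MINDNET_NOTE_SCOPE_HEADER_LEVEL"), 2),
    }
```

Passing the environment in as a mapping makes it possible to exercise the parsing with a plain dictionary instead of mutating `os.environ`.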

Documentation

01_knowledge_design.md: automatic mirror edges, Phase 3 validation, note-scope zones
NOTE_SCOPE_ZONEN.md: Phase 3 validation integrated
LLM_VALIDIERUNG_VON_LINKS.md: Phase 3 instead of global_pool, context optimization
02_concept_graph_logic.md: Phase 3 validation, automatic mirror edges, note scope vs. chunk scope
03_tech_data_model.md: candidate: prefix, verified status, virtual flag, scope field
03_tech_configuration.md: new ENV variables documented
04_admin_operations.md: troubleshooting for Phase 3 validation and note-scope links
05_testing_guide.md: WP-24c test scenarios added
00_quality_checklist.md: WP-24c features added to the checklist
README.md: version updated to v4.5.8, WP-24c features linked

Breaking Changes

No breaking changes for end users
Full backward compatibility
Existing notes work without changes

Migration

No migration required
The system works without changes
Optional: ENV variables can be set for custom headers

Status: ✅ WP-24c is 100% implemented and audited.
Next step: WP-25c (context budgeting & extended prompt optimization).


---

## Summary

This merge introduces **Phase 3 Agentic Edge Validation** and the **Chunk-Aware Multigraph-System** into MindNet. The system now automatically validates edges with the `candidate:` prefix, generates mirror edges for explicit connections, and supports note-scope zones for global connections.

**Core features:**
- Phase 3 Agentic Edge Validation (final validation gate)
- Automatic mirror edges (inverse logic)
- Note-scope zones (global connections)
- Chunk-Aware Multigraph-System (section-based links)

**Technical integrity:**
- All edges with the candidate: prefix pass through Phase 3 validation
- Mirror edges are generated automatically (Phase 2)
- Note-scope links have the highest priority
- Context optimization for better validation accuracy

**Documentation:**
- All relevant documents fully updated
- New ENV variables documented
- Troubleshooting guide extended
- Test scenarios added

**Deployment:**
- No breaking changes
- Optional: configure ENV variables for custom headers
- The system works without changes

Lars added 71 commits 2026-01-12 10:52:40 +01:00

Integrate symmetric edge logic and discovery API: Update ingestion processor and validation to support automatic inverse edge generation. Enhance edge registry for dual vocabulary and schema management. Introduce new discovery endpoint for proactive edge suggestions, improving graph topology and edge validation processes. 4802eba27b

Update ingestion processor to version 3.1.0: Fix bidirectional edge injection for Qdrant, streamline edge validation by removing symmetry logic from the validation step, and enhance inverse edge generation in the processing pipeline. Improve logging for symmetry creation in edge payloads. 9b3fd7723e

Implement origin-based purge logic in ingestion_db.py to prevent accidental deletion of inverse edges during re-imports. Enhance logging for error handling and artifact checks. Update ingestion_processor.py to support redundancy checks and improve symmetry logic for edge generation, ensuring bidirectional graph integrity. Version bump to 3.1.2. 5e2a074019

Update type_registry, graph_utils, ingestion_note_payload, and discovery services for dynamic edge handling: Integrate EdgeRegistry for improved edge defaults and topology management (WP-24c). Enhance type loading and edge resolution logic to ensure backward compatibility while transitioning to a more robust architecture. Version bumps to 1.1.0 for type_registry, 1.1.0 for graph_utils, 2.5.0 for ingestion_note_payload, and 1.1.0 for discovery service. a392dc2786

Update ingestion_processor.py to version 3.1.4: Implement semantic cross-note redundancy checks to enhance edge generation logic. Refactor redundancy validation to distinguish between local and cross-note redundancies, ensuring improved bidirectional graph integrity. Adjust versioning and documentation accordingly. 61a319a049

Update ingestion_processor.py to version 3.1.5: Implement database-aware redundancy checks to prevent overwriting explicit edges by virtual symmetries. Enhance edge validation logic to include real-time database queries, ensuring improved integrity in edge generation. Adjust versioning and documentation accordingly. d5d6987ce2

Update ingestion_db.py and ingestion_processor.py to version 2.2.0 and 3.1.6 respectively: Integrate authority checks for Point-IDs and enhance edge validation logic to prevent overwriting explicit edges by virtual symmetries. Introduce new function to verify explicit edge presence in the database, ensuring improved integrity in edge generation. Adjust versioning and documentation accordingly. 2c18f8b3de

Update ingestion_processor.py to version 3.1.7: Enhance authority enforcement for explicit edges by implementing runtime ID protection and database checks to prevent overwriting. Refactor edge generation logic to ensure strict authority compliance and improve symmetry handling. Adjust versioning and documentation accordingly. 9cb08777fa

Update ingestion_processor.py to version 3.1.8: Enhance ID validation to prevent HTTP 400 errors and improve edge generation robustness by excluding known system types. Refactor edge processing logic to ensure valid note IDs and streamline database interactions. Adjust versioning and documentation accordingly. 72cf71fa87

Update graph_utils.py and ingestion_processor.py to versions 1.2.0 and 3.1.9 respectively: Transition to deterministic UUIDs for edge ID generation to ensure Qdrant compatibility and prevent HTTP 400 errors. Enhance ID validation and streamline edge processing logic to improve robustness and prevent collisions with known system types. Adjust versioning and documentation accordingly. 7ed82ad82e

Refactor graph_utils.py and ingestion_processor.py: Update documentation for deterministic UUIDs to enhance Qdrant compatibility. Improve logging and ID validation in ingestion_processor.py, including adjustments to edge processing logic and batch import handling for better clarity and robustness. Version updates to 1.2.0 and 3.1.9 respectively. 008a470f02

Update ingestion_processor.py to version 3.2.0: Enhance logging stability and improve edge validation by addressing KeyError risks. Implement batch import with symmetry memory and modularized schema logic for explicit edge handling. Adjust documentation and versioning for improved clarity and robustness. 7e4ea670b1

Refactor ingestion_processor.py for version 3.2.0: Integrate Mixture of Experts architecture, enhance logging stability, and improve edge validation. Update batch import process with symmetry memory and modularized schema logic. Adjust documentation for clarity and robustness. 00264a9653

Update ingestion_db.py and ingestion_processor.py: Refine documentation and enhance logging mechanisms. Improve edge validation logic with robust ID resolution and clarify comments for better understanding. Version updates to 2.2.1 and 3.2.1 respectively. 4318395c83

Update ingestion_processor.py to version 3.3.0: Integrate global authority mapping and enhance two-pass ingestion workflow. Improve logging mechanisms and edge validation logic, ensuring robust handling of explicit edges and authority protection. Adjust documentation for clarity and accuracy. c9ae58725c

Enhance ingestion_db.py and ingestion_processor.py: Integrate authority checks for Point-IDs and improve edge validation logic. Update logging mechanisms and refine batch import process with two-phase writing strategy. Adjust documentation for clarity and accuracy, reflecting version updates to 2.2.0 and 3.3.0 respectively. e2c40666d1

Update ingestion_db.py and ingestion_processor.py to version 3.3.1: Enhance documentation for clarity, refine edge validation logic, and improve logging mechanisms. Implement strict separation of explicit writes and symmetry validation in the two-phase ingestion workflow, ensuring data authority and integrity. Adjust comments for better understanding and maintainability. 981b0cba1f

Update ingestion_processor.py to version 3.3.2: Implement two-phase write strategy and API compatibility fix, ensuring data authority for explicit edges. Enhance logging clarity and adjust batch import process to maintain compatibility with importer script. Refine comments for improved understanding and maintainability. 114cea80de

Refactor ingestion_db.py and ingestion_processor.py: Simplify comments and documentation for clarity, enhance artifact purging logic to protect against accidental deletions, and improve symmetry injection process descriptions. Update versioning to reflect changes in functionality and maintainability. 29e334625e

Refactor ingestion_db.py and ingestion_processor.py: Enhance documentation for clarity, improve symmetry injection logic, and refine artifact purging process. Update versioning to 3.3.5 to reflect changes in functionality and maintainability, ensuring robust handling of explicit edges and authority checks. 3f528f2184

Update import_markdown.py to version 2.5.0: Implement global two-phase write strategy, enhance folder filtering to exclude system directories, and refine logging for improved clarity. Adjusted processing phases for better organization and error handling during markdown ingestion. 7953acf3ee

Refactor ingestion_db.py and ingestion_processor.py: Enhance documentation and logging clarity, integrate cloud resilience and error handling, and improve artifact purging logic. Update versioning to 3.3.6 to reflect changes in functionality, including strict phase separation and authority checks for explicit edges. 57656bbaaf

Update ingestion_db.py, ingestion_processor.py, and import_markdown.py: Enhance documentation and logging clarity, improve artifact purging and symmetry injection logic, and implement stricter authority checks. Update versioning to 2.6.0 and 3.3.7 to reflect changes in functionality and maintain compatibility with the ingestion service. ec89d83916

Update ingestion_processor.py to version 3.3.8: Address Ghost-ID issues, enhance Pydantic safety, and improve logging clarity. Refine symmetry injection logic and ensure strict phase separation for authority checks. Adjust comments for better understanding and maintainability. 7e00344b84

RESTART from scratch with a fresh codebase 7cc823e2f4

Update qdrant_points.py, graph_utils.py, ingestion_db.py, ingestion_processor.py, and import_markdown.py: Enhance UUID generation for edge IDs, improve error handling, and refine documentation for clarity. Implement atomic consistency in batch upserts and ensure strict phase separation in the ingestion workflow. Update versioning to reflect changes in functionality and maintain compatibility with the ingestion service.

Update graph_utils.py to version 1.6.1: Restore '_edge' function to address ImportError, revert to UUIDv5 for Qdrant compatibility, and maintain section logic in ID generation. Enhance documentation for clarity and refine edge ID generation process. c33b1c644a

Update qdrant_points.py, graph_utils.py, ingestion_processor.py, and import_markdown.py: Enhance ID generation and error handling, centralize identity logic to prevent ID drift, and improve documentation clarity. Update versioning to reflect changes in functionality and maintain compatibility across modules. b0f4309a29

Update ingestion_processor.py to version 3.4.3: Remove incompatible edge_registry initialization, maintain strict two-phase strategy, and fix ID generation issues. Enhance logging and comments for clarity, ensuring compatibility and improved functionality in the ingestion workflow. 8fd7ef804d

Update qdrant_points.py, graph_utils.py, graph_derive_edges.py, and ingestion_processor.py to version 4.0.0: Implement GOLD-STANDARD identity with strict 4-parameter ID generation, eliminating rule_id and variant from ID calculations. Enhance documentation for clarity and consistency across modules, addressing ID drift and ensuring compatibility in the ingestion workflow. a852975811

Update graph_derive_edges.py and graph_utils.py to version 4.1.0: Enhance edge ID generation by incorporating target_section into the ID calculation, allowing for distinct edges across different sections. Update documentation to reflect changes in ID structure and improve clarity on edge handling during de-duplication. 2da98e8e37
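
For illustration, a deterministic edge ID that includes the target section could be derived with UUIDv5 along these lines. Qdrant accepts only unsigned integers or UUIDs as point IDs, which is why a UUID is used. The namespace, the parameter set (kind, source, target, scope), and the string layout are assumptions; the commits do not spell out the exact inputs.

```python
import uuid
from typing import Optional

# Hypothetical namespace; any fixed UUID works as long as it never changes.
EDGE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/mindnet/edges")


def edge_id(kind: str, source_id: str, target_id: str, scope: str,
            target_section: Optional[str] = None) -> str:
    """Derive a stable UUIDv5 string from the identifying edge fields."""
    base = f"{kind}:{source_id}->{target_id}#{scope}"
    if target_section:
        # Different sections of the same target yield different IDs.
        base += f"@{target_section}"
    return str(uuid.uuid5(EDGE_NAMESPACE, base))
```

The same inputs always produce the same ID, so re-importing a note overwrites its edges in place instead of creating duplicates, while edges to different sections stay distinct.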

Update qdrant_points.py, ingestion_processor.py, and import_markdown.py to version 4.1.0: Enhance edge ID generation by incorporating target_section for improved multigraph support and symmetry integrity. Update documentation and logging for clarity, ensuring consistent ID generation across phases and compatibility with the ingestion workflow. be2bed9927

Update graph_db_adapter.py, graph_derive_edges.py, graph_subgraph.py, graph_utils.py, ingestion_processor.py, and retriever.py to version 4.1.0: Introduce Scope-Awareness and Section-Filtering features, enhancing edge retrieval and processing. Implement Note-Scope Zones extraction from Markdown, improve edge ID generation with target_section, and prioritize Note-Scope Links during de-duplication. Update documentation for clarity and consistency across modules. 39fd15b565

Implement WP-24c v4.2.0: Introduce configurable header names and levels for LLM validation and Note-Scope zones in the chunking system. Update chunking models, parser, and processor to support exclusion of edge zones during chunking. Enhance documentation and configuration files to reflect new environment variables for improved flexibility in Markdown processing. 003a270548

Update graph_derive_edges.py to version 4.2.1: Implement Clean-Context enhancements, including consolidated callout extraction and smart scope prioritization. Refactor callout handling to avoid duplicates and improve processing efficiency. Update documentation to reflect changes in edge extraction logic and prioritization strategy. dfff46e45c

Update graph_derive_edges.py to version 4.2.2: Implement semantic de-duplication with improved scope decision-making. Enhance edge ID calculation by prioritizing semantic grouping before scope assignment, ensuring accurate edge representation across different contexts. Update documentation to reflect changes in edge processing logic and prioritization strategy. 6131b315d7

Update ingestion_processor.py to version 4.2.4: Implement hash-based change detection for content integrity verification. Restore iterative matching based on content hashes, enhancing the accuracy of change detection. Update documentation to reflect changes in the processing logic and versioning. 4d43cc526e

Enhance chunking system with WP-24c v4.2.6 and v4.2.7 updates: Introduce is_meta_content flag for callouts in RawBlock, ensuring they are chunked but later removed for clean context. Update parse_blocks and propagate_section_edges to handle callout edges with explicit provenance for chunk attribution. Implement clean-context logic to remove callout syntax post-processing, maintaining chunk integrity. Adjust get_chunk_config to prioritize frontmatter overrides for chunking profiles. Update documentation to reflect these changes. 55b64c331a

Update chunking_utils.py to include Optional type hint: Add Optional to the import statement for improved type annotations, enhancing code clarity and maintainability. 1d66ca0649

Enhance chunking functionality in version 4.2.8: Update callout pattern to support additional syntax for edge and abstract callouts. Modify get_chunk_config to allow fallback to chunk_profile if chunking_profile is not present. Ensure explicit passing of chunk_profile in make_chunk_payloads for improved payload handling. Update type hints in chunking_parser for better clarity. 20fb1e92e2

Fix regex pattern in parse_edges_robust to support multiple leading '>' characters for edge callouts, enhancing flexibility in edge parsing. f51e1cb2c4
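
A pattern that tolerates several leading '>' characters might look like the sketch below. The callout syntax `[!edge]` / `[!abstract]` and the function name are assumptions for the example; this is not the pattern used in parse_edges_robust.

```python
import re
from typing import Optional, Tuple

# One or more '>' markers, each optionally followed by whitespace,
# then a callout tag and the rest of the line.
CALLOUT_RE = re.compile(
    r"^\s*(?:>\s*)+\[!(edge|abstract)\]\s*(.*?)\s*$",
    re.IGNORECASE,
)


def parse_callout_header(line: str) -> Optional[Tuple[str, str]]:
    """Return (callout_type, remainder) or None if the line is no callout header."""
    match = CALLOUT_RE.match(line)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2)
```

The repeated group `(?:>\s*)+` accepts both `>>` and `> >`, so nested blockquotes are handled the same way as a single-level callout.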

Enhance edge processing in graph_derive_edges.py for version 4.2.9: Finalize chunk attribution with synchronization to "Semantic First" signal. Collect callout keys from candidate pool before text scan to prevent duplicates. Update callout extraction logic to ensure strict adherence to existing chunk callouts, improving deduplication and processing efficiency. a780104b3c

Refine edge parsing and chunk attribution in chunking_parser.py and graph_derive_edges.py for version 4.2.9: Ensure current_edge_type persists across empty lines in callout blocks for accurate link processing. Implement two-phase synchronization for chunk authority, collecting explicit callout keys before the global scan to prevent duplicates. Enhance callout extraction logic to respect existing chunk callouts, improving deduplication and processing efficiency. 727de50290

Update graph_derive_edges.py and ingestion_chunk_payload.py for version 4.3.0: Introduce debug logging for data transfer audits and candidate pool handling to address potential data loss. Ensure candidate_pool is explicitly retained for accurate chunk attribution, enhancing traceability and reliability in edge processing. 3a17b646e1

Update graph_derive_edges.py to version 4.3.1: Introduce precision prioritization for chunk scope, ensuring chunk candidates are favored over note scope. Adjust confidence values for explicit callouts and enhance key generation for consistent deduplication. Improve edge processing logic to reinforce the precedence of chunk scope in decision-making. ee91583614

Enhance logging and debugging in chunking_processor.py, graph_derive_edges.py, and ingestion_chunk_payload.py for version 4.4.0: Introduce detailed debug statements to trace chunk extraction, global scan comparisons, and payload transfers. Improve visibility into candidate pool handling and decision-making processes for callout edges, ensuring better traceability and debugging capabilities. c91910ee9f

Refactor logging in graph_derive_edges.py and ingestion_chunk_payload.py: Remove redundant logging import and ensure consistent logger initialization for improved debugging capabilities. This change enhances traceability in edge processing and chunk ingestion. f8506c0bb2

Refactor logging in graph_derive_edges.py for version 4.4.0: Move logger initialization to module level for improved accessibility across functions. This change enhances debugging capabilities and maintains consistency in logging practices. d7d6155203

Enhance compatibility in chunking and edge processing for version 4.4.1: Harmonize handling of "to" and "target_id" across chunking_processor.py, graph_derive_edges.py, and ingestion_processor.py. Ensure consistent validation and processing of explicit callouts, improving integration and reliability in edge candidate handling. 2d87f9d816

Enhance logging capabilities across multiple modules for version 4.5.0: Introduce detailed debug statements in decision_engine.py, retriever_scoring.py, retriever.py, and logging_setup.py to improve traceability during retrieval processes. Implement dynamic log level configuration based on environment variables, allowing for more flexible debugging and monitoring of application behavior. 3e27c72b80
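
Environment-driven log levels can be sketched as follows. The variable name `LOG_LEVEL` and the truthy values accepted for `DEBUG` are assumptions; the commits only state that the level is configured through environment variables.

```python
import logging
from typing import Mapping

_LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(env: Mapping[str, str]) -> int:
    """Pick a log level from the environment, defaulting to INFO."""
    name = env.get("LOG_LEVEL", "").strip().upper()  # hypothetical variable name
    if name in _LEVELS:
        return _LEVELS[name]
    # Fallback: a truthy DEBUG flag switches on debug logging.
    if env.get("DEBUG", "").strip().lower() in {"1", "true", "yes", "on"}:
        return logging.DEBUG
    return logging.INFO
```

A caller would typically pass `os.environ` and hand the result to `logging.basicConfig(level=...)` once at startup, after any `.env` file has been loaded.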

Update logging in retriever.py for version 4.5.1: Modify edge count logging to utilize the adjacency list instead of the non-existent .edges attribute in the subgraph, enhancing accuracy in debug statements related to graph retrieval processes. 47fdcf8eed

Implement chunk-aware graph traversal in hybrid_retrieve: Extract both note_id and chunk_id from hits to enhance seed coverage for edge retrieval. Combine direct and additional chunk IDs for improved accuracy in subgraph expansion. Update debug logging to reflect the new seed and chunk ID handling, ensuring better traceability in graph retrieval processes. 2445f7cb2b
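
The seed collection described in this commit could be sketched like this. The hit structure (plain dictionaries with `note_id` and `chunk_id` keys) and the function name are assumptions; this is not the code in hybrid_retrieve.

```python
from typing import Iterable, List, Mapping, Optional


def collect_seeds(hits: Iterable[Mapping[str, Optional[str]]],
                  extra_chunk_ids: Iterable[str] = ()) -> List[str]:
    """Gather unique note and chunk IDs, preserving first-seen order."""
    seeds: List[str] = []
    seen = set()

    def add(value: Optional[str]) -> None:
        if value and value not in seen:
            seen.add(value)
            seeds.append(value)

    for hit in hits:
        add(hit.get("note_id"))   # note-level seed
        add(hit.get("chunk_id"))  # chunk-level seed for chunk-scoped edges
    for chunk_id in extra_chunk_ids:
        add(chunk_id)
    return seeds
```

Seeding the expansion with chunk IDs as well as note IDs lets edges attached to a specific chunk be found even when only the chunk, not the note as a whole, matched the query.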

Update EdgeDTO to support extended provenance values and modify explanation building in retriever.py to accommodate new provenance types. This enhances the handling of edge data for improved accuracy in retrieval processes. 1df89205ac

Update logging in decision_engine.py and retriever.py to use node_id as chunk_id and total_score instead of score for improved accuracy in debug statements. This change aligns with the new data structure introduced in version 4.5.4, enhancing traceability in retrieval processes. 3dc81ade0f

Enhance decision_engine.py to support context reuse during compression failures. Implement error handling to return original content when compression fails, ensuring robust fallback mechanisms without re-retrieval. Update logging for better traceability of compression and fallback processes, improving overall reliability in stream handling. 716a063849
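
The fallback behaviour can be illustrated with a small wrapper. The function name and the `compress` callback are hypothetical, and treating an empty result as a failure is an assumption.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def compress_with_fallback(content: str, compress: Callable[[str], str]) -> str:
    """Try to compress the context; on failure reuse the original content."""
    try:
        compressed = compress(content)
    except Exception as exc:  # any compression error triggers the fallback
        logger.warning("Compression failed, reusing original context: %s", exc)
        return content
    # Assumption: an empty result is treated like a failure.
    return compressed if compressed else content
```

Because the original content is still in hand, the fallback needs no second retrieval round trip.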

Add LLM validation zone extraction and configuration support in graph_derive_edges.py c8c828c8a8

Implement functions to extract LLM validation zones from Markdown, allowing for configurable header identification via environment variables. Enhance the existing note scope zone extraction to differentiate between note scope and LLM validation zones. Update edge building logic to handle LLM validation edges with a 'candidate:' prefix, ensuring proper processing and avoiding duplicates in global scans. This update improves the overall handling of edge data and enhances the flexibility of the extraction process.
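
A zone extractor of this kind might be sketched as follows. The function name, the case-insensitive title match, and the rule that a zone ends at the next header of the same or a higher level are assumptions, not a description of graph_derive_edges.py.

```python
import re
from typing import Iterable, List

HEADER_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def extract_zone_lines(markdown: str, headers: Iterable[str], level: int) -> List[str]:
    """Collect non-empty lines that sit under one of the configured zone headers."""
    wanted = {h.strip().lower() for h in headers}
    in_zone = False
    lines: List[str] = []
    for line in markdown.splitlines():
        match = HEADER_RE.match(line)
        if match:
            depth = len(match.group(1))
            title = match.group(2).lower()
            if depth == level and title in wanted:
                in_zone = True   # a configured zone starts here
            elif depth <= level:
                in_zone = False  # same or higher level header ends the zone
            continue             # header lines themselves are not content
        if in_zone and line.strip():
            lines.append(line.strip())
    return lines
```

Links found in the returned lines could then be emitted as edges whose rule ID carries the `candidate:` prefix, so that they are picked up by the later validation phase.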

Enhance LLM validation zone extraction in graph_derive_edges.py ea0fd951f2

Implement support for H2 headers in LLM validation zone detection, allowing for improved flexibility in header recognition. Update the extraction logic to track zones during callout processing, ensuring accurate differentiation between LLM validation and standard zones. This enhancement improves the handling of callouts and their associated metadata, contributing to more precise edge construction.

Refine LLM validation zone handling in graph_derive_edges.py f2a2f4d2df

Enhance the extraction logic to store the zone status before header updates, ensuring accurate context during callout processing. Initialize the all_chunk_callout_keys set prior to its usage to prevent potential UnboundLocalError. These improvements contribute to more reliable edge construction and better handling of LLM validation zones.

Implement LLM validation for candidate edges in ingestion_processor.py 9b0d8c18cb

Enhance the edge validation process by introducing logic to validate edges with rule IDs starting with "candidate:". This includes extracting target IDs, validating against the entire note text, and updating rule IDs upon successful validation. Rejected edges are logged for traceability, improving the overall handling of edge data during ingestion.

Refactor edge validation process in ingestion_processor.py b19f91c3ee

Remove LLM validation from the candidate edge processing loop, shifting it to a later phase for improved context handling. Introduce a new validation mechanism that aggregates note text for better decision-making and optimizes the validation criteria to include both rule IDs and provenance. Update logging to reflect the new validation phases and ensure rejected edges are not processed further. This enhances the overall efficiency and accuracy of edge validation during ingestion.

Implement Phase 3 Agentic Edge Validation in ingestion_processor.py and related documentation updates 742792770c

Introduce a new method for persisting rejected edges for audit purposes, enhancing traceability and validation logic. Update the decision engine to utilize a generic fallback template for improved error handling during LLM validation. Revise documentation across multiple files to reflect the new versioning, context, and features related to Phase 3 validation, including automatic mirror edges and note-scope zones. This update ensures better graph integrity and validation accuracy in the ingestion process.

Enhance ingestion_processor.py with path normalization and strict change detection 78fbc9b31b

Implement path normalization to ensure consistent hash checks by converting file paths to absolute paths. Update change detection logic to handle hash comparisons more robustly, treating missing hashes as content changes for safety. This prevents redundant processing and improves efficiency in the ingestion workflow.
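
The change detection described here can be sketched in a few lines. SHA-256 over the serialized frontmatter plus body is an assumption chosen for the example, as are the function names; the commit only states that paths are made absolute and that a missing hash counts as a change.

```python
import hashlib
import json
import os
from typing import Optional


def normalize_path(path: str) -> str:
    """Absolute, normalized path so the same file always maps to the same key."""
    return os.path.abspath(path)


def content_hash(body: str, frontmatter: dict) -> str:
    """Hash body and frontmatter; sorted keys make the result order-independent."""
    payload = json.dumps(frontmatter, sort_keys=True, ensure_ascii=False) + "\n" + body
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def has_changed(new_hash: str, old_hash: Optional[str]) -> bool:
    """A missing stored hash is treated as a change, erring on the safe side."""
    if not old_hash:
        return True
    return new_hash != old_hash
```

Treating a missing hash as a change costs one extra re-import at worst, whereas treating it as unchanged could silently skip a note that was never fully ingested.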

Refactor edge processing in graph_derive_edges.py and ingestion_processor.py for consistency and efficiency 6047e94964

Implement deterministic sorting of semantic groups in graph_derive_edges.py to ensure consistent edge extraction across batches. Update ingestion_processor.py to enhance change detection logic, ensuring that hash checks are performed before artifact checks to prevent redundant processing. These changes improve the reliability and efficiency of the edge building and ingestion workflows.
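Deterministic ordering matters because Python sets have no guaranteed iteration order across runs, so edges derived from them can appear in a different sequence in each batch. A minimal sketch, assuming semantic groups are held as a mapping from group name to a set of target IDs (the real data structure in graph_derive_edges.py is not shown here):

```python
from typing import Dict, Iterator, List, Set, Tuple


def iter_groups_deterministic(
    groups: Dict[str, Set[str]],
) -> Iterator[Tuple[str, List[str]]]:
    """Yield groups and their members in sorted order, independent of insertion order."""
    for name in sorted(groups):
        yield name, sorted(groups[name])
```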

Enhance logging in ingestion_processor.py for improved change detection diagnostics 7cb8fd6602

Add detailed debug and warning logs to the change detection process, covering hash comparisons and artifact checks. This makes ingestion easier to trace and debug, particularly when hashes have changed or are missing.

Update logging levels in ingestion_processor.py and import_markdown.py for improved visibility de5db09b51

Change debug logs to info and warning levels in ingestion_processor.py to enhance the visibility of change detection processes, including hash comparisons and artifact checks. Additionally, ensure .env is loaded before logging setup in import_markdown.py to correctly read the DEBUG environment variable. These adjustments aim to improve traceability and debugging during ingestion workflows.
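The ordering issue can be sketched like this: if logging is configured before .env is read, a `DEBUG` value defined only in that file is not yet visible. The loader below is a small stand-in written for this example and is not the one used in import_markdown.py; the accepted truthy values for `DEBUG` are also an assumption.

```python
import logging
import os
from pathlib import Path


def load_env_file(path: str) -> None:
    """Tiny stand-in for a .env loader: KEY=VALUE lines, '#' comments ignored."""
    env_path = Path(path)
    if not env_path.is_file():
        return
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Variables already set in the process environment take precedence.
        os.environ.setdefault(key.strip(), value.strip().strip('"'))


def resolve_log_level() -> int:
    debug = os.environ.get("DEBUG", "").lower() in {"1", "true", "yes"}
    return logging.DEBUG if debug else logging.INFO


def setup(env_path: str = ".env") -> int:
    load_env_file(env_path)       # 1. load the environment first
    level = resolve_log_level()   # 2. only then read DEBUG
    logging.basicConfig(level=level)
    return level
```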

Enhance logging in ingestion_processor.py for detailed change detection diagnostics c613d81846

Add comprehensive logging for hash input, body length comparisons, and frontmatter key checks in the change detection process. This update aims to improve traceability and facilitate debugging by providing insights into potential discrepancies between new and old payloads during ingestion workflows.

Refactor hash input and body/frontmatter handling in ingestion_processor.py for improved accuracy 43641441ef

Update the ingestion process to utilize the parsed object instead of note_pl for hash input, body, and frontmatter extraction. This change ensures that the correct content is used for comparisons, enhancing the reliability of change detection diagnostics and improving overall ingestion accuracy.

Refactor hash input handling in ingestion_processor.py to use dictionary format e52eed40ca

Update the ingestion process to convert the parsed object to a dictionary before passing it to the hash input function. This change ensures compatibility with the updated function requirements and improves the accuracy of hash comparisons during ingestion workflows.

Enhance logging in ingestion_processor.py to include normalized file path and note title f9118a36f8

Update the logging statement to provide additional context during the ingestion process by including the normalized file path and note title. This change aims to improve traceability and debugging capabilities in the ingestion workflow.

Implement ID collision detection and enhance logging in ingestion_processor.py ec9b3c68af

Add a check for ID collisions during the ingestion process to prevent multiple files from using the same note_id. Update logging levels to DEBUG for detailed diagnostics on hash comparisons, body lengths, and frontmatter keys, improving traceability and debugging capabilities in the ingestion workflow.
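A collision check of this kind can be sketched with a single pass over the files. The function name and the `(note_id, path)` input shape are assumptions; the actual check lives inside the ingestion processor and may track more state.

```python
from typing import Dict, Iterable, List, Tuple


def find_id_collisions(
    files: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """files yields (note_id, path); returns (note_id, first_path, conflicting_path)."""
    seen: Dict[str, str] = {}
    collisions: List[Tuple[str, str, str]] = []
    for note_id, path in files:
        first = seen.setdefault(note_id, path)
        if first != path:
            # A second, different file claims an already-used note_id.
            collisions.append((note_id, first, path))
    return collisions
```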

Add dedicated logging for ID collisions in ingestion_processor.py c42a76b3d7

Implement a new method to log ID collisions into a separate file (logs/id_collisions.log) for manual analysis. This update captures relevant metadata in JSONL format, enhancing traceability during the ingestion process. The logging occurs when a conflict is detected between existing and new files sharing the same note_id, improving error handling and diagnostics.
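JSONL means one JSON object per line, which keeps the file appendable and easy to filter line by line. The sketch below writes to the path named in the commit; the field names and the function `log_id_collision` are hypothetical, since the commit does not list the captured metadata.

```python
import json
import os
from datetime import datetime, timezone
from typing import Dict


def log_id_collision(
    note_id: str,
    existing_path: str,
    new_path: str,
    log_path: str = "logs/id_collisions.log",
) -> Dict[str, str]:
    """Append one collision record as a single JSON line."""
    record = {
        # Field names are assumptions made for this example.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "note_id": note_id,
        "existing_path": existing_path,
        "new_path": new_path,
    }
    os.makedirs(os.path.dirname(log_path) or ".", exist_ok=True)
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

Each line can later be read back with `json.loads`, one record at a time.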

Refactor ID collision logging in ingestion_processor.py for improved clarity and structure 1056078e6a

Update the logging mechanism for ID collisions to include more structured metadata, enhancing the clarity of logged information. This change aims to facilitate easier analysis of conflicts during the ingestion process and improve overall traceability.

Lars merged commit d0012355b9 into main

2026-01-12 10:53:20 +01:00

Lars deleted branch WP24c

2026-01-12 10:53:20 +01:00

Lars referenced this issue from a commit

2026-01-12 10:53:21 +01:00

Merge pull request 'WP24c - Agentic Edge Validation & Chunk-Aware Multigraph-System (v4.5.8)' (#22) from WP24c into main

Reference: Lars/mindnet#22