- Introduced a `virtual` attribute in the graph edge definitions to mark automatically generated section transitions and backlinks.
- Updated documentation to reflect the addition of the `virtual` attribute for both section transitions and backlinks, clarifying its implications for scoring in the retriever.
- Specified that automatically generated (`virtual`) edges receive a penalty during scoring.
- Added functionality to automatically create inverse backlinks for intra-note edges at the chunk level, ensuring that backlinks are generated only when they do not already exist.
- Updated the documentation to outline the requirements and rules for backlink creation, including conditions for deduplication and scope.
- Introduced unit tests to validate the creation of backlinks and ensure correct behavior when existing backlinks are present.
- Incremented version to 4.4.0 to reflect the new feature addition.
- Introduced a new function `load_graph_schema_full` to parse and cache both typical and prohibited edge types from the graph schema.
- Updated `load_graph_schema` to utilize the full schema for improved edge type extraction.
- Added `get_topology_info` to retrieve typical and prohibited edges for source/target pairs.
- Implemented `validate_intra_note_edge` and `validate_edge_against_schema` for schema validation of intra-note edges.
- Enhanced logging for schema validation outcomes and edge handling.
- Updated documentation to reflect new validation features and testing procedures.
- Introduced configurable edge scoring with internal and external boosts for intra-note edges.
- Added aggregation configuration to support note-level and chunk-level retrieval strategies.
- Updated retriever and graph subgraph modules to utilize new scoring and aggregation logic.
- Enhanced YAML configuration to include new parameters for edge scoring and aggregation levels.
- Added boolean indexing for filtering based on edge properties in the setup script.
- Implemented WP-26 v1.1: a section-type change now forces a split even in SMART MODE (step 2), improving the chunking logic.
- Updated `parse_link_target` to extract block IDs from section strings, ensuring accurate handling of links with block references.
- Added unit tests to validate section-type change behavior and block ID extraction functionality, enhancing overall reliability.
- Implemented WP-26 v1.1: a section-type change always forces a new chunk, ensuring consistent chunking behavior across different section_types.
- Introduced automatic intra-note edges between sections of different types to capture semantic relationships.
- Updated graph utilities to support automatic edge type derivation based on section transitions.
- Added unit tests for section-type changes and automatic edge generation to ensure functionality and reliability.
- Updated provenance priorities and introduced a mapping from internal provenance values to EdgeDTO-compliant literals.
- Added a new function `normalize_provenance` to standardize internal provenance strings.
- Enhanced the `_edge` function to include an `is_internal` flag and provenance normalization.
- Modified the `EdgeDTO` model to include a new `source_hint` field for detailed provenance information and an `is_internal` flag for intra-note edges.
- Reduced the provenance options in `EdgeDTO` to valid literals, improving data integrity.
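
The backlink rule from the list above (inverse backlinks for intra-note edges, created only when they do not already exist, and flagged as `virtual`) could be sketched roughly as follows. The `Edge` dataclass, the `add_backlinks` helper, and the `"backlink"` kind are hypothetical names for illustration. The sketch also assumes that any existing edge in the reverse direction counts as an existing backlink, which may be stricter or looser than the actual deduplication rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    # Hypothetical minimal edge shape; the real edge payload has more fields.
    source: str
    target: str
    kind: str
    virtual: bool = False


def add_backlinks(edges, backlink_kind="backlink"):
    """Return edges plus one virtual inverse edge per pair that lacks one."""
    # Track every (source, target) pair that already exists.
    existing = {(e.source, e.target) for e in edges}
    result = list(edges)
    for e in edges:
        inverse = (e.target, e.source)
        # Skip self-loops and pairs that already have a reverse edge.
        if e.source == e.target or inverse in existing:
            continue
        result.append(Edge(e.target, e.source, backlink_kind, virtual=True))
        existing.add(inverse)
    return result
```

Because the generated edges carry `virtual=True`, a scorer can apply the penalty mentioned above without having to guess which edges were authored and which were derived.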
Implement deterministic sorting of semantic groups in graph_derive_edges.py to ensure consistent edge extraction across batches. Update ingestion_processor.py to enhance change detection logic, ensuring that hash checks are performed before artifact checks to prevent redundant processing. These changes improve the reliability and efficiency of the edge building and ingestion workflows.
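
A minimal sketch of what deterministic group ordering can look like, assuming semantic groups are keyed by a string and hold `(source, target)` pairs. The function names and the tuple layout are illustrative assumptions, not the actual structures in graph_derive_edges.py.

```python
from collections import defaultdict


def group_edges(edges):
    """Group (kind, source, target) triples by kind (hypothetical layout)."""
    groups = defaultdict(list)
    for kind, source, target in edges:
        groups[kind].append((source, target))
    return groups


def iter_groups_sorted(groups):
    """Yield groups and their members in a stable, input-independent order."""
    for kind in sorted(groups):
        yield kind, sorted(groups[kind])
```

Sorting both the group keys and the members means two batches that contain the same edges in a different order produce the same extraction sequence.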
Enhance the extraction logic to store the zone status before header updates, ensuring accurate context during callout processing. Initialize the all_chunk_callout_keys set before its first use to prevent a potential UnboundLocalError. These improvements contribute to more reliable edge construction and better handling of LLM validation zones.
Implement support for H2 headers in LLM validation zone detection, allowing for improved flexibility in header recognition. Update the extraction logic to track zones during callout processing, ensuring accurate differentiation between LLM validation and standard zones. This enhancement improves the handling of callouts and their associated metadata, contributing to more precise edge construction.
Implement functions to extract LLM validation zones from Markdown, allowing for configurable header identification via environment variables. Enhance the existing note scope zone extraction to differentiate between note scope and LLM validation zones. Update edge building logic to handle LLM validation edges with a 'candidate:' prefix, ensuring proper processing and avoiding duplicates in global scans. This update improves the overall handling of edge data and enhances the flexibility of the extraction process.
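
One way the zone extraction described here might work is sketched below. The environment variable name `LLM_VALIDATION_HEADER`, its default value, and both helper functions are assumptions for illustration only. The sketch matches H1 and H2 headings and attaches the `candidate:` prefix to a plain label, which may differ from where the real code applies it.

```python
import os
import re

# Hypothetical variable name and default; the real configuration may differ.
ZONE_TITLE = os.environ.get("LLM_VALIDATION_HEADER", "LLM Validation")

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def extract_zone_lines(markdown, zone_title=ZONE_TITLE):
    """Collect non-empty body lines under H1/H2 headings titled zone_title."""
    wanted = zone_title.strip().lower()
    in_zone = False
    zone_level = 0
    body = []
    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            level = len(match.group(1))
            title = match.group(2).strip().lower()
            if level <= 2 and title == wanted:
                in_zone, zone_level = True, level
            elif in_zone and level <= zone_level:
                # A heading of the same or higher rank closes the zone.
                in_zone = False
            continue  # heading lines themselves are never part of the body
        if in_zone and line.strip():
            body.append(line.strip())
    return body


def as_candidate(label):
    """Prefix a label with 'candidate:' once, never twice."""
    return label if label.startswith("candidate:") else f"candidate:{label}"
```

Keeping the prefix idempotent is one simple way to avoid duplicates if the same edge is seen again during a global scan.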
Update qdrant_points.py, graph_utils.py, ingestion_db.py, ingestion_processor.py, and import_markdown.py: Enhance UUID generation for edge IDs, improve error handling, and refine documentation for clarity. Implement atomic consistency in batch upserts and ensure strict phase separation in the ingestion workflow. Update versioning to reflect changes in functionality and maintain compatibility with the ingestion service.
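
For illustration, edge IDs can be derived deterministically with `uuid.uuid5`, so that re-ingesting the same edge yields the same point ID. This is a sketch under the assumption that IDs are built from source, target, and kind. The namespace string and the `edge_id` helper are hypothetical and not taken from qdrant_points.py.

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
EDGE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "example.invalid/edges")


def edge_id(source, target, kind):
    """Build a stable UUID string from the fields that identify an edge."""
    # The unit separator avoids collisions such as ("a", "bc") vs ("ab", "c").
    key = "\x1f".join((source, target, kind))
    return str(uuid.uuid5(EDGE_NAMESPACE, key))
```

A stable ID makes repeated upserts idempotent: writing the same edge twice overwrites the existing point instead of creating a duplicate.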