- Implemented WP-26 v1.1: Section-Type-Wechsel erzwingt immer einen neuen Chunk, um konsistente Chunking-Verhalten bei unterschiedlichen section_types zu gewährleisten.
- Introduced automatic Intra-Note-Edges zwischen Sektionen mit unterschiedlichen Typen, um semantische Beziehungen zu erfassen.
- Updated graph utilities to support automatic edge type derivation based on section transitions.
- Added unit tests for section-type changes and automatic edge generation to ensure functionality and reliability.
- Updated provenance priorities and introduced a mapping from internal provenance values to EdgeDTO-compliant literals.
- Added a new function `normalize_provenance` to standardize internal provenance strings.
- Enhanced the `_edge` function to include an `is_internal` flag and provenance normalization.
- Modified the `EdgeDTO` model to include a new `source_hint` field for detailed provenance information and an `is_internal` flag for intra-note edges.
- Reduced the provenance options in `EdgeDTO` to valid literals, improving data integrity.
- Replaced 'sliding_smart_edges' with 'structured_smart_edges' for multiple types to improve data processing.
- Added detection keywords for 'goal', 'concept', 'task', 'journal', 'source', 'glossary', 'person', and 'event' to enhance retrieval capabilities.
- Adjusted retriever weights for consistency across types.
- Introduced 'upholds', 'violates', 'aligned_with', 'conflicts_with', 'supports', and 'contradicts' attributes to improve the decision engine's relationship handling.
- Added 'followed_by' and 'preceded_by' attributes to the facts_stream for better context in data relationships.
- Added additional spacing for improved readability in the document.
- Ensured consistent formatting throughout the section on causal retrieval for Mindnet.
- Introduced final validation gate for edges with candidate: prefix.
- Enabled automatic generation of mirror edges for explicit connections.
- Added support for Note-Scope zones to facilitate global connections.
- Enhanced section-based links in the multigraph system for improved edge handling.
- Updated documentation and added new ENV variables for configuration.
- Ensured no breaking changes for end users, maintaining full backward compatibility.
- Increased the timeout for LLM calls from 30 to 60 seconds to accommodate longer processing times.
- Added informative messages for potential timeout causes and troubleshooting tips to improve user awareness.
- Updated error handling to provide clearer feedback on query failures, emphasizing the resolution of the EdgeDTO issue.
- Added a verification step in the health endpoint to check if the service supports 'explicit:callout' for EdgeDTO, providing clearer diagnostics.
- Updated the health response to include messages based on the EdgeDTO version support status, enhancing user awareness of potential issues.
- Adjusted the test query endpoint to reflect the correct path for improved functionality.
- Implemented a check in the health endpoint to determine if EdgeDTO supports explicit callouts, enhancing diagnostic capabilities.
- The check is non-critical and handles exceptions gracefully, ensuring the health response remains robust.
- Updated health response to include the new `edge_dto_supports_callout` field for better insight into EdgeDTO capabilities.
- Updated the .env loading process to first check for an explicit path, improving reliability in different working directories.
- Added logging for successful .env loading and fallback mechanisms.
- Enhanced EdgeDTO creation with robust error handling, including fallbacks for unsupported provenance values and logging of errors for better traceability.
Update the logging mechanism for ID collisions to include more structured metadata, enhancing the clarity of logged information. This change aims to facilitate easier analysis of conflicts during the ingestion process and improve overall traceability.
Implement a new method to log ID collisions into a separate file (logs/id_collisions.log) for manual analysis. This update captures relevant metadata in JSONL format, enhancing traceability during the ingestion process. The logging occurs when a conflict is detected between existing and new files sharing the same note_id, improving error handling and diagnostics.
Add a check for ID collisions during the ingestion process to prevent multiple files from using the same note_id. Update logging levels to DEBUG for detailed diagnostics on hash comparisons, body lengths, and frontmatter keys, improving traceability and debugging capabilities in the ingestion workflow.
Update the logging statement to provide additional context during the ingestion process by including the normalized file path and note title. This change aims to improve traceability and debugging capabilities in the ingestion workflow.
Update the ingestion process to convert the parsed object to a dictionary before passing it to the hash input function. This change ensures compatibility with the updated function requirements and improves the accuracy of hash comparisons during ingestion workflows.
Update the ingestion process to utilize the parsed object instead of note_pl for hash input, body, and frontmatter extraction. This change ensures that the correct content is used for comparisons, enhancing the reliability of change detection diagnostics and improving overall ingestion accuracy.
Add comprehensive logging for hash input, body length comparisons, and frontmatter key checks in the change detection process. This update aims to improve traceability and facilitate debugging by providing insights into potential discrepancies between new and old payloads during ingestion workflows.
Change debug logs to info and warning levels in ingestion_processor.py to enhance the visibility of change detection processes, including hash comparisons and artifact checks. Additionally, ensure .env is loaded before logging setup in import_markdown.py to correctly read the DEBUG environment variable. These adjustments aim to improve traceability and debugging during ingestion workflows.
Add detailed debug and warning logs to the change detection process, providing insights into hash comparisons and artifact checks. This update aims to facilitate better traceability and debugging during ingestion, particularly when handling hash changes and missing hashes. The changes ensure that the ingestion workflow is more transparent and easier to troubleshoot.
Implement deterministic sorting of semantic groups in graph_derive_edges.py to ensure consistent edge extraction across batches. Update ingestion_processor.py to enhance change detection logic, ensuring that hash checks are performed before artifact checks to prevent redundant processing. These changes improve the reliability and efficiency of the edge building and ingestion workflows.
Implement path normalization to ensure consistent hash checks by converting file paths to absolute paths. Update change detection logic to handle hash comparisons more robustly, treating missing hashes as content changes for safety. This prevents redundant processing and improves efficiency in the ingestion workflow.
Introduce a new method for persisting rejected edges for audit purposes, enhancing traceability and validation logic. Update the decision engine to utilize a generic fallback template for improved error handling during LLM validation. Revise documentation across multiple files to reflect the new versioning, context, and features related to Phase 3 validation, including automatic mirror edges and note-scope zones. This update ensures better graph integrity and validation accuracy in the ingestion process.
Remove LLM validation from the candidate edge processing loop, shifting it to a later phase for improved context handling. Introduce a new validation mechanism that aggregates note text for better decision-making and optimizes the validation criteria to include both rule IDs and provenance. Update logging to reflect the new validation phases and ensure rejected edges are not processed further. This enhances the overall efficiency and accuracy of edge validation during ingestion.
Enhance the edge validation process by introducing logic to validate edges with rule IDs starting with "candidate:". This includes extracting target IDs, validating against the entire note text, and updating rule IDs upon successful validation. Rejected edges are logged for traceability, improving the overall handling of edge data during ingestion.
Enhance the extraction logic to store the zone status before header updates, ensuring accurate context during callout processing. Initialize the all_chunk_callout_keys set prior to its usage to prevent potential UnboundLocalError. These improvements contribute to more reliable edge construction and better handling of LLM validation zones.
Implement support for H2 headers in LLM validation zone detection, allowing for improved flexibility in header recognition. Update the extraction logic to track zones during callout processing, ensuring accurate differentiation between LLM validation and standard zones. This enhancement improves the handling of callouts and their associated metadata, contributing to more precise edge construction.
Implement functions to extract LLM validation zones from Markdown, allowing for configurable header identification via environment variables. Enhance the existing note scope zone extraction to differentiate between note scope and LLM validation zones. Update edge building logic to handle LLM validation edges with a 'candidate:' prefix, ensuring proper processing and avoiding duplicates in global scans. This update improves the overall handling of edge data and enhances the flexibility of the extraction process.