- Incremented version numbers to reflect updates: authoring guidelines to 1.4.0, knowledge design manual to 4.6.0, and Obsidian integration guide to 1.1.0.
- Introduced detailed documentation on Section Types and Intra-Note-Edges, including syntax, rules, and examples for implementation.
- Enhanced context in the knowledge design manual to include new features related to section types and intra-note edges.
- Updated scopes in all relevant documents to reflect the addition of section types and intra-note edges, ensuring comprehensive coverage of the new functionalities.
- Introduced a `virtual` attribute for automatically generated section transitions and backlinks in the graph edge definitions, indicating their virtual nature.
- Updated documentation to reflect the addition of the `virtual` attribute for both section transitions and backlinks, clarifying its implications for scoring in the retriever.
- Enhanced the understanding of edge types by specifying that these automatically generated edges will receive a penalty during scoring.
- Added functionality to automatically create inverse backlinks for intra-note edges at the chunk level, ensuring that backlinks are generated only when they do not already exist.
- Updated the documentation to outline the requirements and rules for backlink creation, including conditions for deduplication and scope.
- Introduced unit tests to validate the creation of backlinks and ensure correct behavior when existing backlinks are present.
- Incremented version to 4.4.0 to reflect the new feature addition.
- Updated the `make_chunk_payloads` function calls to use a structured `note` parameter instead of separate `frontmatter` and `chunks` arguments, improving clarity and consistency in payload construction.
- Added comments to clarify the function signature for better understanding of the parameters being passed.
- Updated the function signature for `strategy_by_heading` to accept a configuration dictionary and note ID, enhancing flexibility in chunking strategies.
- Introduced a configuration object to specify parameters such as max tokens and smart edge allocation, improving code clarity and maintainability.
- Improved edge type extraction by refining the `load_graph_schema` function to utilize a comprehensive schema.
- Added new functions for validating intra-note edges against the schema and retrieving topology information.
- Enhanced logging for validation processes and updated documentation to reflect these changes.
- Introduced a new function `load_graph_schema_full` to parse and cache both typical and prohibited edge types from the graph schema.
- Updated `load_graph_schema` to utilize the full schema for improved edge type extraction.
- Added `get_topology_info` to retrieve typical and prohibited edges for source/target pairs.
- Implemented `validate_intra_note_edge` and `validate_edge_against_schema` for schema validation of intra-note edges.
- Enhanced logging for schema validation outcomes and edge handling.
- Updated documentation to reflect new validation features and testing procedures.
- Introduced configurable edge scoring with internal and external boosts for intra-note edges.
- Added aggregation configuration to support note-level and chunk-level retrieval strategies.
- Updated retriever and graph subgraph modules to utilize new scoring and aggregation logic.
- Enhanced YAML configuration to include new parameters for edge scoring and aggregation levels.
- Added boolean indexing for filtering based on edge properties in the setup script.
- Added a new function `_propagate_section_type_backwards` to ensure that the section_type is correctly assigned to all blocks within a heading section, even if the [!section] callout appears later in the text.
- Updated the `parse_blocks` function to call this new method, enhancing the accuracy of section-type assignments.
- Modified chunking strategies to reflect the changes in section-type handling, simplifying logic related to section-type transitions.
- Expanded unit tests to validate the backward propagation of section_type, ensuring comprehensive coverage of the new functionality.
- Modified the assertion in the test for edge types to include 'references' as a valid fallback option alongside 'foundation_for', 'guides', and 'related_to'.
- This change enhances the test coverage for edge type validation, ensuring more comprehensive checks for edge cases.
- Implemented WP-26 v1.1: Section-Type-Wechsel erzwingt Split auch in SMART MODE (Schritt 2) zur Verbesserung der Chunking-Logik.
- Updated `parse_link_target` to extract block IDs from section strings, ensuring accurate handling of links with block references.
- Added unit tests to validate section-type change behavior and block ID extraction functionality, enhancing overall reliability.
- Implemented WP-26 v1.1: Section-Type-Wechsel erzwingt immer einen neuen Chunk, um konsistente Chunking-Verhalten bei unterschiedlichen section_types zu gewährleisten.
- Introduced automatic Intra-Note-Edges zwischen Sektionen mit unterschiedlichen Typen, um semantische Beziehungen zu erfassen.
- Updated graph utilities to support automatic edge type derivation based on section transitions.
- Added unit tests for section-type changes and automatic edge generation to ensure functionality and reliability.
- Updated provenance priorities and introduced a mapping from internal provenance values to EdgeDTO-compliant literals.
- Added a new function `normalize_provenance` to standardize internal provenance strings.
- Enhanced the `_edge` function to include an `is_internal` flag and provenance normalization.
- Modified the `EdgeDTO` model to include a new `source_hint` field for detailed provenance information and an `is_internal` flag for intra-note edges.
- Reduced the provenance options in `EdgeDTO` to valid literals, improving data integrity.
- Replaced 'sliding_smart_edges' with 'structured_smart_edges' for multiple types to improve data processing.
- Added detection keywords for 'goal', 'concept', 'task', 'journal', 'source', 'glossary', 'person', and 'event' to enhance retrieval capabilities.
- Adjusted retriever weights for consistency across types.
- Introduced 'upholds', 'violates', 'aligned_with', 'conflicts_with', 'supports', and 'contradicts' attributes to improve the decision engine's relationship handling.
- Added 'followed_by' and 'preceded_by' attributes to the facts_stream for better context in data relationships.
- Added additional spacing for improved readability in the document.
- Ensured consistent formatting throughout the section on causal retrieval for Mindnet.
- Introduced final validation gate for edges with candidate: prefix.
- Enabled automatic generation of mirror edges for explicit connections.
- Added support for Note-Scope zones to facilitate global connections.
- Enhanced section-based links in the multigraph system for improved edge handling.
- Updated documentation and added new ENV variables for configuration.
- Ensured no breaking changes for end users, maintaining full backward compatibility.
- Increased the timeout for LLM calls from 30 to 60 seconds to accommodate longer processing times.
- Added informative messages for potential timeout causes and troubleshooting tips to improve user awareness.
- Updated error handling to provide clearer feedback on query failures, emphasizing the resolution of the EdgeDTO issue.
- Added a verification step in the health endpoint to check if the service supports 'explicit:callout' for EdgeDTO, providing clearer diagnostics.
- Updated the health response to include messages based on the EdgeDTO version support status, enhancing user awareness of potential issues.
- Adjusted the test query endpoint to reflect the correct path for improved functionality.
- Implemented a check in the health endpoint to determine if EdgeDTO supports explicit callouts, enhancing diagnostic capabilities.
- The check is non-critical and handles exceptions gracefully, ensuring the health response remains robust.
- Updated health response to include the new `edge_dto_supports_callout` field for better insight into EdgeDTO capabilities.
- Updated the .env loading process to first check for an explicit path, improving reliability in different working directories.
- Added logging for successful .env loading and fallback mechanisms.
- Enhanced EdgeDTO creation with robust error handling, including fallbacks for unsupported provenance values and logging of errors for better traceability.
Update the logging mechanism for ID collisions to include more structured metadata, enhancing the clarity of logged information. This change aims to facilitate easier analysis of conflicts during the ingestion process and improve overall traceability.
Implement a new method to log ID collisions into a separate file (logs/id_collisions.log) for manual analysis. This update captures relevant metadata in JSONL format, enhancing traceability during the ingestion process. The logging occurs when a conflict is detected between existing and new files sharing the same note_id, improving error handling and diagnostics.
Add a check for ID collisions during the ingestion process to prevent multiple files from using the same note_id. Update logging levels to DEBUG for detailed diagnostics on hash comparisons, body lengths, and frontmatter keys, improving traceability and debugging capabilities in the ingestion workflow.
Update the logging statement to provide additional context during the ingestion process by including the normalized file path and note title. This change aims to improve traceability and debugging capabilities in the ingestion workflow.
Update the ingestion process to convert the parsed object to a dictionary before passing it to the hash input function. This change ensures compatibility with the updated function requirements and improves the accuracy of hash comparisons during ingestion workflows.
Update the ingestion process to utilize the parsed object instead of note_pl for hash input, body, and frontmatter extraction. This change ensures that the correct content is used for comparisons, enhancing the reliability of change detection diagnostics and improving overall ingestion accuracy.
Add comprehensive logging for hash input, body length comparisons, and frontmatter key checks in the change detection process. This update aims to improve traceability and facilitate debugging by providing insights into potential discrepancies between new and old payloads during ingestion workflows.
Change debug logs to info and warning levels in ingestion_processor.py to enhance the visibility of change detection processes, including hash comparisons and artifact checks. Additionally, ensure .env is loaded before logging setup in import_markdown.py to correctly read the DEBUG environment variable. These adjustments aim to improve traceability and debugging during ingestion workflows.
Add detailed debug and warning logs to the change detection process, providing insights into hash comparisons and artifact checks. This update aims to facilitate better traceability and debugging during ingestion, particularly when handling hash changes and missing hashes. The changes ensure that the ingestion workflow is more transparent and easier to troubleshoot.
Implement deterministic sorting of semantic groups in graph_derive_edges.py to ensure consistent edge extraction across batches. Update ingestion_processor.py to enhance change detection logic, ensuring that hash checks are performed before artifact checks to prevent redundant processing. These changes improve the reliability and efficiency of the edge building and ingestion workflows.
Implement path normalization to ensure consistent hash checks by converting file paths to absolute paths. Update change detection logic to handle hash comparisons more robustly, treating missing hashes as content changes for safety. This prevents redundant processing and improves efficiency in the ingestion workflow.
Introduce a new method for persisting rejected edges for audit purposes, enhancing traceability and validation logic. Update the decision engine to utilize a generic fallback template for improved error handling during LLM validation. Revise documentation across multiple files to reflect the new versioning, context, and features related to Phase 3 validation, including automatic mirror edges and note-scope zones. This update ensures better graph integrity and validation accuracy in the ingestion process.
Remove LLM validation from the candidate edge processing loop, shifting it to a later phase for improved context handling. Introduce a new validation mechanism that aggregates note text for better decision-making and optimizes the validation criteria to include both rule IDs and provenance. Update logging to reflect the new validation phases and ensure rejected edges are not processed further. This enhances the overall efficiency and accuracy of edge validation during ingestion.
Enhance the edge validation process by introducing logic to validate edges with rule IDs starting with "candidate:". This includes extracting target IDs, validating against the entire note text, and updating rule IDs upon successful validation. Rejected edges are logged for traceability, improving the overall handling of edge data during ingestion.
Enhance the extraction logic to store the zone status before header updates, ensuring accurate context during callout processing. Initialize the all_chunk_callout_keys set prior to its usage to prevent potential UnboundLocalError. These improvements contribute to more reliable edge construction and better handling of LLM validation zones.
Implement support for H2 headers in LLM validation zone detection, allowing for improved flexibility in header recognition. Update the extraction logic to track zones during callout processing, ensuring accurate differentiation between LLM validation and standard zones. This enhancement improves the handling of callouts and their associated metadata, contributing to more precise edge construction.
Implement functions to extract LLM validation zones from Markdown, allowing for configurable header identification via environment variables. Enhance the existing note scope zone extraction to differentiate between note scope and LLM validation zones. Update edge building logic to handle LLM validation edges with a 'candidate:' prefix, ensuring proper processing and avoiding duplicates in global scans. This update improves the overall handling of edge data and enhances the flexibility of the extraction process.