mitai-jinkendo/.claude/docs/technical/PLACEHOLDER_DEVELOPMENT_GUIDE.md

# Placeholder Development Guide

**Version:** 1.0
**Created:** March 28, 2026
**Audience:** Developers, Claude Code

---

## Overview

This document describes how to add, test, and document new AI ("KI") placeholders.

**Important for Phase 0c:** After the refactoring to the multi-layer architecture, all placeholders use the data layer. This guide describes both architectures.

---

## Phase 0b Architecture (Current, until Phase 0c)

### Anatomy of a Placeholder

```python
# backend/placeholder_resolver.py

def resolve_weight_28d_trend_slope(profile_id: str) -> str:
    """
    Returns kg/week slope for 28-day weight trend.

    This function:
    1. Retrieves data from database
    2. Performs calculation
    3. Formats result for KI consumption

    Args:
        profile_id: User profile ID

    Returns:
        Formatted string (e.g., "0.23 kg/Woche")
        or "Nicht genug Daten" if insufficient data
    """
    with get_db() as conn:
        cur = get_cursor(conn)

        # 1. DATA RETRIEVAL
        cur.execute("""
            SELECT date, weight
            FROM weight_log
            WHERE profile_id = %s
              AND date >= NOW() - INTERVAL '28 days'
            ORDER BY date
        """, (profile_id,))
        rows = cur.fetchall()

        # 2. VALIDATION
        if len(rows) < 18:  # Confidence threshold
            return "Nicht genug Daten"

        # 3. CALCULATION
        x = [(row[0] - rows[0][0]).days for row in rows]
        y = [float(row[1]) for row in rows]  # DB returns Decimal

        # Linear regression (least squares)
        n = len(x)
        sum_x = sum(x)
        sum_y = sum(y)
        sum_xy = sum(xi * yi for xi, yi in zip(x, y))
        sum_x2 = sum(xi ** 2 for xi in x)

        denominator = n * sum_x2 - sum_x ** 2
        if denominator == 0:  # all entries share the same date
            return "Nicht genug Daten"

        slope = (n * sum_xy - sum_x * sum_y) / denominator
        slope_per_week = slope * 7

        # 4. FORMATTING
        return f"{slope_per_week:.2f} kg/Woche"
```
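
The regression in step 3 can also be factored into a small pure function, which makes the zero-denominator case explicit and lets the math be checked without a database. The following is a minimal sketch under that assumption; `weekly_slope` is a hypothetical helper name and not part of `placeholder_resolver.py`.

```python
from datetime import date
from typing import Optional, Sequence, Tuple


def weekly_slope(points: Sequence[Tuple[date, float]]) -> Optional[float]:
    """Least-squares slope of weight over time, in kg per week.

    Hypothetical helper: returns None when no slope can be computed
    (fewer than two points, or all points on the same day).
    """
    if len(points) < 2:
        return None

    first_day = points[0][0]
    x = [(d - first_day).days for d, _ in points]
    y = [float(w) for _, w in points]  # float() also accepts Decimal

    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        return None

    slope_per_day = (n * sum_xy - sum_x * sum_y) / denominator
    return slope_per_day * 7


# Usage: ten daily measurements, gaining 0.1 kg per day
sample = [(date(2026, 3, 1 + i), 80.0 + 0.1 * i) for i in range(10)]
slope = weekly_slope(sample)
```

Returning `None` instead of a formatted string keeps the helper free of presentation concerns; the caller decides whether to map it to `"Nicht genug Daten"`.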

### Steps to Add a New Placeholder (Phase 0b)

#### 1. Implement the function

**File:** `backend/placeholder_resolver.py`

**Naming convention:**
- `resolve_<placeholder_name>(profile_id: str) -> str`
- snake_case
- Always take `profile_id` as the parameter
- Always return `str`
**Template:**
```python
def resolve_my_new_metric(profile_id: str) -> str:
    """
    [Description of what the placeholder returns]

    Args:
        profile_id: User profile ID

    Returns:
        [Description of the return format]
    """
    with get_db() as conn:
        cur = get_cursor(conn)

        # 1. DATA RETRIEVAL
        cur.execute("""
            SELECT ...
            FROM ...
            WHERE profile_id = %s
        """, (profile_id,))

        # 2. VALIDATION
        if <insufficient_data_condition>:
            return "Nicht genug Daten"

        # 3. CALCULATION
        result = ...

        # 4. FORMATTING
        return f"{result}"
```

#### 2. Register in the mapping

**File:** `backend/placeholder_resolver.py`

**Find the `PLACEHOLDER_FUNCTIONS` dictionary:**
```python
PLACEHOLDER_FUNCTIONS = {
    # ... existing placeholders ...

    # Add your new placeholder:
    "my_new_metric": resolve_my_new_metric,
}
```

**Naming:**
- Key = placeholder name (snake_case)
- Value = function reference (without parentheses!)
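
How the mapping is consumed is not shown in this guide. As a rough illustration only, a resolver could look up each `{{name}}` token in the dictionary and call the registered function. The regex, the handling of unknown names, the stub resolver, and the name `resolve_placeholders_sketch` are all assumptions here, not the actual `resolve_placeholders` implementation.

```python
import re
from typing import Callable, Dict


def resolve_my_new_metric(profile_id: str) -> str:
    # Stub standing in for a real resolver that would query the database
    return "0.23 kg/Woche"


PLACEHOLDER_FUNCTIONS: Dict[str, Callable[[str], str]] = {
    "my_new_metric": resolve_my_new_metric,  # reference, not a call
}

# Matches {{name}} with optional whitespace inside the braces
_TOKEN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def resolve_placeholders_sketch(template: str, profile_id: str) -> str:
    """Replace every {{name}} token with the output of its registered function."""

    def substitute(match: "re.Match[str]") -> str:
        resolver = PLACEHOLDER_FUNCTIONS.get(match.group(1))
        if resolver is None:
            # Assumption: unknown tokens are left untouched
            return match.group(0)
        return resolver(profile_id)

    return _TOKEN.sub(substitute, template)


resolved = resolve_placeholders_sketch(
    "Trend: {{my_new_metric}} ({{unknown}})", "test_profile_id"
)
```

Storing the function reference rather than its result is what allows the lookup to be deferred until a template is resolved for a specific `profile_id`.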

#### 3. Document in the catalog

**File:** `backend/placeholder_resolver.py`

**Find the `get_placeholder_catalog()` function:**
```python
def get_placeholder_catalog(profile_id: str) -> Dict[str, List[Dict[str, str]]]:
    placeholders = {
        'Körper': [
            # ... existing ...
            ('my_new_metric', 'Description of the placeholder'),
        ],
        # ...
    }
```

**Categories:**
- `Profil`
- `Körper`
- `Ernährung`
- `Training`
- `Schlaf & Erholung`
- `Vitalwerte`
- `Scores (Phase 0b)`
- `Focus Areas`
- `Zeitraum`

#### 4. Test

**Manual test:**
```python
# In a Python REPL or test script:
from placeholder_resolver import resolve_my_new_metric

result = resolve_my_new_metric("test_profile_id")
print(result)  # Should return formatted string
```

**Integration Test:**
```python
# Test in actual prompt
from placeholder_resolver import resolve_placeholders

template = "Dein {{my_new_metric}} ist ..."
result = resolve_placeholders(template, "test_profile_id")
print(result)  # Should have placeholder replaced
```

---

## Phase 0c Architecture (After Refactoring)

### Anatomy of a Placeholder (3-Layer)

```python
# Layer 1: DATA LAYER
# backend/data_layer/body_metrics.py

def get_weight_trend_data(profile_id: str, days: int = 90) -> dict:
    """
    Returns weight trend data with slopes and projections.

    This is pure data retrieval and calculation.
    NO FORMATTING. NO STRINGS.

    Args:
        profile_id: User profile ID
        days: Analysis window

    Returns:
        {
            "raw_values": [(date, weight), ...],
            "rolling_median_7d": [(date, value), ...],
            "slope_7d": float,
            "slope_28d": float,
            "slope_90d": float,
            "confidence": str,
            ...
        }
    """
    with get_db() as conn:
        cur = get_cursor(conn)

        # DATA RETRIEVAL
        cur.execute("""...""", (profile_id, days))
        rows = cur.fetchall()

        # VALIDATION + CONFIDENCE
        from data_layer.utils import calculate_confidence
        confidence = calculate_confidence(len(rows), days, "trend")

        if confidence == 'insufficient':
            return {
                "confidence": "insufficient",
                "slope_28d": 0.0,
                # ... minimal data
            }

        # CALCULATION
        # ... (same logic as before)

        # RETURN STRUCTURED DATA (not formatted!)
        return {
            "raw_values": rows,
            "slope_7d": slope_7d,
            "slope_28d": slope_28d,
            "confidence": confidence,
            # ... all data as dict/list/float
        }


# Layer 2a: KI LAYER
# backend/placeholder_resolver.py

from data_layer.body_metrics import get_weight_trend_data

def resolve_weight_28d_trend_slope(profile_id: str) -> str:
    """
    Formats weight trend slope for KI consumption.

    This function is now THIN - just calls data layer and formats.
    """
    data = get_weight_trend_data(profile_id, days=28)

    if data['confidence'] == 'insufficient':
        return "Nicht genug Daten"

    return f"{data['slope_28d']:.2f} kg/Woche"
```

### Steps to Add a New Placeholder (Phase 0c)

#### 1. Implement the data layer function

**File:** The matching module in `backend/data_layer/`
- Body metrics → `body_metrics.py`
- Nutrition → `nutrition_metrics.py`
- Activity → `activity_metrics.py`
- Recovery → `recovery_metrics.py`
- Health → `health_metrics.py`
- Goals → `goals.py`
- Correlations → `correlations.py`

**Template:**
```python
# backend/data_layer/<module>.py

def get_<metric>_data(
    profile_id: str,
    days: int = 28,
    **kwargs
) -> dict:
    """
    [Description of the data]

    Args:
        profile_id: User profile ID
        days: Analysis window
        **kwargs: Additional parameters

    Returns:
        {
            "<field>": <value>,
            "confidence": str,  # ALWAYS include!
            "data_points": int,  # ALWAYS include!
        }
    """
    with get_db() as conn:
        cur = get_cursor(conn)

        # 1. DATA RETRIEVAL
        cur.execute("""...""", (profile_id,))
        rows = cur.fetchall()

        # 2. CONFIDENCE CALCULATION
        from data_layer.utils import calculate_confidence
        confidence = calculate_confidence(
            len(rows),
            days,
            "general"  # or "correlation" or "trend"
        )

        # 3. VALIDATION
        if confidence == 'insufficient':
            return {
                "confidence": "insufficient",
                "data_points": len(rows),
                # Return minimal safe data
            }

        # 4. CALCULATION
        # ... your logic here ...

        # 5. RETURN STRUCTURED DATA
        return {
            # All data as primitives: dict, list, float, int, str, bool
            # NO FORMATTING (no "0.23 kg/Woche" - just 0.23)
            "result": result_value,
            "confidence": confidence,
            "data_points": len(rows),
        }
```

**IMPORTANT:**
- ❌ No strings with units: `"0.23 kg/Woche"`
- ✅ Numbers only: `0.23`
- ❌ No formatting for humans
- ✅ Structured data for machines
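
One reason to keep units out of the data layer is that a single structured result can then feed more than one consumer. The sketch below uses a hard-coded dict in place of a real data layer call; `format_for_chart` is a hypothetical second consumer that this guide does not define.

```python
import json

# Stand-in for the dict a data layer function would return
data = {"slope_28d": 0.23, "confidence": "medium", "data_points": 14}


def format_for_ki(data: dict) -> str:
    """KI layer: turn the number into a human-readable string with a unit."""
    if data.get("confidence") == "insufficient":
        return "Nicht genug Daten"
    return f"{data['slope_28d']:.2f} kg/Woche"


def format_for_chart(data: dict) -> str:
    """Hypothetical second consumer: needs the raw number, not a string."""
    return json.dumps({"y": data["slope_28d"], "confidence": data["confidence"]})


ki_text = format_for_ki(data)
chart_json = format_for_chart(data)
```

Had the data layer returned `"0.23 kg/Woche"`, the chart consumer would have to parse the number back out of the string.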

#### 2. Create the KI layer wrapper

**File:** `backend/placeholder_resolver.py`

```python
from data_layer.<module> import get_<metric>_data

def resolve_<placeholder_name>(profile_id: str) -> str:
    """
    [Description of what is returned]

    Phase 0c: Uses data_layer.<module>.get_<metric>_data()
    """
    data = get_<metric>_data(profile_id)

    if data['confidence'] == 'insufficient':
        return "Nicht genug Daten"

    # FORMAT for KI consumption
    return f"{data['<field>']:.2f} <unit>"
```

#### 3. Register in the mapping

**UNCHANGED, same as Phase 0b:**
```python
PLACEHOLDER_FUNCTIONS = {
    "my_new_metric": resolve_my_new_metric,
}
```

#### 4. Document in the catalog

**UNCHANGED, same as Phase 0b:**
```python
def get_placeholder_catalog(profile_id: str):
    placeholders = {
        'Körper': [
            ('my_new_metric', 'Description'),
        ],
    }
```

#### 5. Test

**Unit test for the data layer:**
```python
# backend/tests/test_data_layer.py

def test_get_metric_data_sufficient():
    data = get_<metric>_data("test_profile_1", days=28)

    assert data['confidence'] in ['high', 'medium', 'low', 'insufficient']
    assert 'data_points' in data
    assert isinstance(data['<field>'], float)

def test_get_metric_data_insufficient():
    data = get_<metric>_data("profile_no_data", days=28)

    assert data['confidence'] == 'insufficient'
```

**Integration test for the KI layer:**
```python
# backend/tests/test_placeholders.py

def test_resolve_placeholder():
    result = resolve_<placeholder_name>("test_profile_1")

    assert isinstance(result, str)
    assert result != "Nicht genug Daten"
```

---

## Best Practices

### 1. Confidence Scoring

**ALWAYS use `calculate_confidence()`:**
```python
from data_layer.utils import calculate_confidence

confidence = calculate_confidence(
    data_points=len(rows),
    days_requested=days,
    metric_type="general"  # or "correlation" or "trend"
)
```

**Confidence Thresholds:**
- General (28d): high >= 18, medium >= 12, low >= 8
- Correlation: high >= 28, medium >= 21, low >= 14
- Trend: high >= (days * 0.7), medium >= (days * 0.5)
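
To make the thresholds concrete, here is a rough sketch of how such a function could map point counts to levels. It is not the code in `data_layer.utils`. Two assumptions are made: the "general" thresholds are applied as fixed numbers regardless of `days_requested` (the list only states them for 28 days), and since no "low" threshold is listed for trends, anything below "medium" is treated as insufficient.

```python
def calculate_confidence_sketch(
    data_points: int,
    days_requested: int,
    metric_type: str = "general",
) -> str:
    """Illustrative mapping of data point counts to confidence levels."""
    if metric_type == "correlation":
        high, medium, low = 28, 21, 14
    elif metric_type == "trend":
        high = days_requested * 0.7
        medium = days_requested * 0.5
        # ASSUMPTION: no separate "low" threshold is documented for trends,
        # so "low" is never returned for this metric type
        low = medium
    else:  # "general"
        # ASSUMPTION: documented for a 28-day window, used here for any window
        high, medium, low = 18, 12, 8

    if data_points >= high:
        return "high"
    if data_points >= medium:
        return "medium"
    if data_points >= low:
        return "low"
    return "insufficient"


general_level = calculate_confidence_sketch(12, 28, "general")
trend_level = calculate_confidence_sketch(20, 28, "trend")
```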

### 2. Decimal → Float Conversion

**PostgreSQL `NUMERIC` columns arrive in Python as `Decimal`, so always convert to float:**
```python
# ❌ WRONG:
value = row['column']

# ✅ CORRECT:
value = float(row['column']) if row['column'] else 0.0
```
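
The one-liner above maps a missing value to `0.0`. Where "missing" and "zero" need to stay distinguishable, a small helper with a configurable default may be clearer. `to_float` is a hypothetical name used only for this sketch.

```python
import json
from decimal import Decimal
from typing import Optional


def to_float(value, default: Optional[float] = 0.0) -> Optional[float]:
    """Convert a DB value (Decimal, int, float or None) to float."""
    if value is None:
        return default
    return float(value)


# Stand-in for a row returned by a dict-style cursor
row = {"weight": Decimal("81.40"), "body_fat_pct": None}

payload = {
    "weight": to_float(row["weight"]),
    # Keep None when the absence of a value carries meaning
    "body_fat_pct": to_float(row["body_fat_pct"], default=None),
}
encoded = json.dumps(payload)  # would raise TypeError with a raw Decimal
```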

### 3. Safe Dict Access

**Never access a key directly without a fallback:**
```python
# ❌ WRONG:
value = data['key']  # KeyError if missing

# ✅ CORRECT:
value = data.get('key', default_value)
```

### 4. Date Serialization

**Python `date` objects are not JSON-serializable:**
```python
from datetime import date

from data_layer.utils import serialize_dates

data = {
    "date": date(2026, 3, 28),
    "values": [...]
}

# Serialize before returning from API
return serialize_dates(data)
```
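
The implementation of `serialize_dates` is not shown in this guide. As an illustration of what such a helper might do, the sketch below walks nested dicts, lists, and tuples and converts dates to ISO 8601 strings. The name `serialize_dates_sketch` and the choice of ISO format are assumptions.

```python
import json
from datetime import date, datetime
from typing import Any


def serialize_dates_sketch(obj: Any) -> Any:
    """Recursively convert date/datetime values to ISO 8601 strings."""
    if isinstance(obj, (date, datetime)):
        return obj.isoformat()
    if isinstance(obj, dict):
        return {key: serialize_dates_sketch(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Tuples become lists, which is what JSON would produce anyway
        return [serialize_dates_sketch(item) for item in obj]
    return obj


data = {
    "date": date(2026, 3, 28),
    "values": [(date(2026, 3, 27), 81.2), (date(2026, 3, 28), 81.0)],
}
encoded = json.dumps(serialize_dates_sketch(data))
```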

### 5. SQL Parameter Binding

**ALWAYS use parameter binding, NEVER string concatenation:**
```python
# ✅ CORRECT:
cur.execute("SELECT * FROM t WHERE id = %s", (id,))

# ❌ WRONG (SQL Injection Risk):
cur.execute(f"SELECT * FROM t WHERE id = {id}")
```
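
The difference can be demonstrated without a PostgreSQL instance. The sketch below uses the standard library's `sqlite3` purely so that it runs anywhere; note that `sqlite3` uses `?` as its placeholder where the examples in this guide use `%s`. The table contents and the malicious input are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weight_log (profile_id TEXT, weight REAL)")
conn.executemany(
    "INSERT INTO weight_log VALUES (?, ?)",
    [("alice", 80.0), ("bob", 95.0)],
)

malicious_id = "alice' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the OR clause matches every row
unsafe_rows = conn.execute(
    f"SELECT profile_id FROM weight_log WHERE profile_id = '{malicious_id}'"
).fetchall()

# Safe: the value is passed separately and compared as one plain string
safe_rows = conn.execute(
    "SELECT profile_id FROM weight_log WHERE profile_id = ?",
    (malicious_id,),
).fetchall()

conn.close()
```

With the f-string, the query returns rows for both profiles; with binding, it returns none, because no profile has that literal ID.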

### 6. Column Name Consistency

**Check the schema BEFORE using a column name:**
```sql
-- ❌ WRONG (assumed name):
SELECT bf_jpl FROM caliper_log;

-- ✅ CORRECT (check schema first):
SELECT body_fat_pct FROM caliper_log;
```

**Check the schema:**
```sql
\d caliper_log  -- in psql
-- or
SELECT column_name FROM information_schema.columns
WHERE table_name = 'caliper_log';
```

---

## Error Handling

### 1. Insufficient Data

**Return value when there is too little data:**
```python
# Data Layer:
return {
    "confidence": "insufficient",
    "data_points": 0,
    # All other fields with safe defaults (0.0, [], etc.)
}

# KI Layer:
if data['confidence'] == 'insufficient':
    return "Nicht genug Daten"
```

### 2. Missing Optional Data

**When optional data is missing (e.g. no vitals):**
```python
# Data Layer:
return {
    "hrv": None,  # or 0.0, depending on semantic
    "confidence": "low",  # downgrade confidence
}

# KI Layer:
if data['hrv'] is None:
    return "Keine HRV-Daten verfügbar"
```

### 3. Calculation Errors

**For math errors (division by zero, etc.):**
```python
try:
    result = numerator / denominator
except ZeroDivisionError:
    result = 0.0  # or None, depending on semantic
```
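
Whether `0.0` or `None` is the right fallback depends on whether a zero would be mistaken for a real measurement. The sketch below takes the `None` route and leaves the wording to the KI layer; `safe_ratio` and `format_ratio` are hypothetical names used only for illustration.

```python
from typing import Optional


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the ratio is undefined."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None


def format_ratio(value: Optional[float]) -> str:
    # The KI layer decides how an undefined result is worded
    if value is None:
        return "Nicht genug Daten"
    return f"{value:.2f}"


defined = format_ratio(safe_ratio(1, 4))
undefined = format_ratio(safe_ratio(1, 0))
```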

---

## Documentation Requirements

### 1. Docstring

**Every function needs a docstring:**
```python
def get_metric_data(profile_id: str, days: int = 28) -> dict:
    """
    [One-line summary]

    [Detailed description if needed]

    Args:
        profile_id: User profile ID
        days: Analysis window (default 28)

    Returns:
        {
            "field": value,
            "confidence": str,
            "data_points": int
        }

    Confidence Rules:
        - high: >= X points
        - medium: >= Y points
        - low: >= Z points
        - insufficient: < Z points
    """
```

### 2. Inline Comments

**Only for non-obvious logic:**
```python
# Calculate trimmed mean (remove top/bottom 10%)
sorted_values = sorted(values)
trim_count = len(values) // 10
# Explicit end index: [trim_count:-trim_count] would be empty when trim_count is 0
trimmed = sorted_values[trim_count:len(sorted_values) - trim_count]
result = sum(trimmed) / len(trimmed)
```

### 3. Type Hints

**ALWAYS use type hints:**
```python
from typing import Any, Dict, List, Optional, Tuple

def get_data(
    profile_id: str,
    days: int = 28,
    include_raw: bool = False
) -> Dict[str, Any]:
    ...
```

---

## Testing Strategy

### 1. Unit Tests (Data Layer)

**Test each data layer function in isolation:**
```python
# backend/tests/test_data_layer.py

import pytest
from data_layer.body_metrics import get_weight_trend_data

@pytest.fixture
def test_profile():
    # Setup test data in database
    ...
    yield profile_id
    # Teardown
    ...

def test_weight_trend_sufficient_data(test_profile):
    data = get_weight_trend_data(test_profile, days=28)

    assert data['confidence'] in ['high', 'medium']
    assert data['slope_28d'] != 0.0
    assert len(data['raw_values']) >= 18

def test_weight_trend_insufficient_data():
    data = get_weight_trend_data("no_data_profile", days=28)

    assert data['confidence'] == 'insufficient'
```

### 2. Integration Tests (KI Layer)

**Test placeholder resolution:**
```python
# backend/tests/test_placeholders.py

from placeholder_resolver import resolve_placeholders, resolve_weight_28d_trend_slope

def test_placeholder_resolution(test_profile):
    result = resolve_weight_28d_trend_slope(test_profile)

    assert isinstance(result, str)
    assert "kg/Woche" in result or "Nicht genug Daten" in result

def test_placeholder_in_template(test_profile):
    template = "Trend: {{weight_28d_trend_slope}}"
    result = resolve_placeholders(template, test_profile)

    assert "{{" not in result  # All placeholders resolved
    assert result.startswith("Trend:")
```

### 3. Manual Testing Checklist

```
[ ] Test the function with different days parameters
[ ] Test with complete data
[ ] Test with incomplete data
[ ] Test with NO DATA
[ ] Edge cases: extreme values, outliers
[ ] Performance: < 500ms for typical queries
[ ] Memory: no leak with large datasets
```

---

## Checklist: New Placeholder

### Phase 0b (Current):
```
[ ] Function implemented in placeholder_resolver.py
[ ] resolve_<name>(profile_id: str) -> str signature
[ ] Docstring complete
[ ] Confidence check implemented
[ ] Registered in PLACEHOLDER_FUNCTIONS
[ ] Documented in get_placeholder_catalog()
[ ] Tested manually
[ ] Tested in a real prompt
```

### Phase 0c (After Refactoring):
```
[ ] Data layer function implemented
    [ ] Correct module chosen
    [ ] get_<metric>_data(profile_id, ...) -> dict signature
    [ ] Returns structured data (dict/list/primitives)
    [ ] NO formatting, NO strings with units
    [ ] Confidence calculation included
    [ ] Docstring complete
[ ] KI layer wrapper implemented
    [ ] resolve_<name>(profile_id: str) -> str signature
    [ ] Calls data_layer function
    [ ] Formats result for KI
[ ] Registered in PLACEHOLDER_FUNCTIONS
[ ] Documented in get_placeholder_catalog()
[ ] Unit test for the data layer written
[ ] Integration test for the KI layer written
[ ] Manual testing done
```

---

## Common Mistakes (Learnings from Phase 0b)

### 1. Forgetting the float() Conversion
```python
# SYMPTOM: "Object of type Decimal is not JSON serializable"
# FIX:
value = float(row['column']) if row['column'] else 0.0
```

### 2. Hardcoded Column Names
```sql
-- SYMPTOM: "column bf_jpl does not exist"
-- FIX: Check schema first
SELECT column_name FROM information_schema.columns
WHERE table_name = 'caliper_log';
```

### 3. KeyError on Missing Data
```python
# SYMPTOM: "KeyError: 'hrv'"
# FIX: Use .get() with default
hrv = data.get('hrv', 0.0)
```

### 4. Confidence Not Calculated
```python
# SYMPTOM: Placeholder returns data with < 3 points
# FIX: use calculate_confidence()
from data_layer.utils import calculate_confidence
confidence = calculate_confidence(len(rows), days, "general")
```

### 5. Date Not Serialized
```python
# SYMPTOM: "Object of type date is not JSON serializable"
# FIX:
from data_layer.utils import serialize_dates
return serialize_dates(data)
```

### 6. SQL Injection Risk
```python
# SYMPTOM: Security Scanner warnt
# FIX: ALWAYS use parameter binding
cur.execute("SELECT * FROM t WHERE id = %s", (id,))
```

---

## Next Steps

### After implementing a new placeholder:

1. **Commit Message:**
   ```
   feat: add {{my_new_metric}} placeholder

   - Implements resolve_my_new_metric() in placeholder_resolver.py
   - Adds entry to PLACEHOLDER_FUNCTIONS
   - Documents in get_placeholder_catalog()
   - Tested with profile XYZ

   Category: <Körper/Ernährung/Training/etc.>
   Returns: <description>
   ```

2. **Update documentation:**
   - `CLAUDE.md` - List the new placeholders
   - `docs/api/PLACEHOLDERS.md` - API documentation

3. **Testing:**
   - At least 1 manual test with a real profile
   - Optional: add a unit test

4. **Review:**
   - Check whether the placeholder makes sense in the prompt library
   - Test with different prompts
   - Performance check (< 500ms)

---

**Author:** Claude Sonnet 4.5
**Version:** 1.0
**Last updated:** March 28, 2026