mitai-jinkendo/.claude/docs/working/issue-21-universal-csv-parser-analysis.md

# Issue #21: Universal CSV Parser (Requirements Analysis & Concept)

**As of:** 2026-04-09
**Author:** Claude Code Agent
**Status:** Concept phase (awaiting user approval)

---

## 1. Current State

### 1.1 Existing CSV Import Implementations

There are currently **4 separate CSV import functions**:

| Module | File | Format | Notes |
|--------|------|--------|-------|
| **Nutrition** | `nutrition.py:34` | FDDB | Delimiter `;`, hardcoded columns, aggregation by day |
| **Activity** | `activity.py:344` | Apple Health | **Learning mapping** via `activity_type_mappings`, update-or-insert |
| **Blood Pressure** | `blood_pressure.py:293` | Omron | Multiple column-name variants (DE/EN), context tagging |
| **ZIP Import** | `importdata.py:30` | Custom format | Profile.json + CSV bundle |

### 1.2 Shared Patterns (already in place)

✅ **Encoding detection:**
```python
try:
    text = raw.decode('utf-8')
except UnicodeDecodeError:  # latin-1 accepts any byte sequence, so the fallback cannot fail
    text = raw.decode('latin-1')
if text.startswith('\ufeff'):
    text = text[1:]  # strip the byte order mark (BOM)
```

✅ **Duplicate detection:**
- Nutrition: `ON CONFLICT (profile_id, date) DO UPDATE`
- Activity: `SELECT WHERE profile_id=%s AND date=%s AND start_time=%s`
- Blood Pressure: timestamp-based

✅ **Type conversion** (scattered across modules):
- Date formats: FDDB (`dd.mm.yyyy`), Apple Health (ISO), Omron (several)
- Decimal separator: `,` → `.`
- Units: kJ → kcal

❌ **Missing patterns:**
- No **unified mapping system** (except Activity)
- No **user interface for adjusting mappings**
- No **automatic format detection**
- No **suggestions for unknown columns**

---

## 2. Requirements (from the user request)

### 2.1 Functional Requirements

| # | Requirement | Priority |
|---|-------------|----------|
| **F1** | **Universal parser:** One parser for all modules (Nutrition, Activity, Weight, Circumference, Caliper, Vitals, Sleep) | MUST |
| **F2** | **Learning system:** Automatic detection of known CSV structures based on column signatures | MUST |
| **F3** | **User-adjustable mapping:** UI for manually assigning CSV columns to DB fields | MUST |
| **F4** | **Smart suggestions:** The system proposes mappings based on column names, sample data, and statistics | SHOULD |
| **F5** | **Type conversion:** Automatic conversion of date formats, decimal separators, text→number, units | MUST |
| **F6** | **Mapping persistence:** Saved mappings can be reused (per user, per module, global) | MUST |
| **F7** | **Format templates:** Predefined templates for known formats (FDDB, Apple Health, Omron, Garmin, etc.) | SHOULD |
| **F8** | **Validation:** Pre-import validation with error report and preview (first 5 rows) | SHOULD |
| **F9** | **Rollback:** Faulty imports can be undone | NICE |

### 2.2 Non-functional Requirements

| # | Requirement | Priority |
|---|-------------|----------|
| **NF1** | **Backward compatibility:** Existing CSV import endpoints keep working (as wrappers around the new parser) | MUST |
| **NF2** | **Performance:** Import of 1000 rows in < 5 seconds | SHOULD |
| **NF3** | **Extensibility:** New modules/fields can be added without changes to the parser itself (registry pattern) | MUST |
| **NF4** | **Security:** Users can only see/modify their own mappings (except admins) | MUST |

---

## 3. Data Model

### 3.1 New DB Tables

#### **`csv_field_mappings`** (central mapping registry)

```sql
CREATE TABLE csv_field_mappings (
    id                  SERIAL PRIMARY KEY,
    profile_id          INTEGER REFERENCES profiles(id),  -- NULL = system template
    is_system           BOOLEAN DEFAULT false,            -- true = read-only template
    module              VARCHAR(50) NOT NULL,             -- 'nutrition', 'activity', etc.
    mapping_name        VARCHAR(100) NOT NULL,            -- "FDDB Export", "Apple Health"
    description         TEXT,                             -- "Standard format for FDDB CSV exports"

    -- CSV signature (for auto-detection)
    column_signature    TEXT[],                           -- column names (sorted, normalized)
    delimiter           VARCHAR(10) DEFAULT ',',          -- CSV delimiter
    encoding            VARCHAR(20) DEFAULT 'utf-8',
    has_header          BOOLEAN DEFAULT true,

    -- Mapping definition (JSONB)
    field_mappings      JSONB NOT NULL,                   -- { "csv_column": "db_field" }
    type_conversions    JSONB,                            -- { "db_field": {"type": "date", "format": "dd.mm.yyyy"} }

    -- Statistics (for ranking)
    usage_count         INTEGER DEFAULT 0,
    last_used_at        TIMESTAMP,
    success_rate        FLOAT DEFAULT 1.0,                -- successful imports / total

    created_at          TIMESTAMP DEFAULT NOW(),
    updated_at          TIMESTAMP DEFAULT NOW(),

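    -- Note: PostgreSQL treats NULLs as distinct, so this constraint does not
    -- prevent duplicate names among rows with profile_id IS NULL.
    UNIQUE(profile_id, module, mapping_name),
    CHECK (
        -- System templates must have profile_id = NULL. Non-system rows are
        -- either user-owned (profile_id set) or global (profile_id NULL).
        (is_system = true AND profile_id IS NULL) OR
        (is_system = false AND profile_id IS NOT NULL) OR
        (is_system = false AND profile_id IS NULL)
    )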
);

CREATE INDEX idx_csv_mappings_lookup ON csv_field_mappings(module, profile_id);
CREATE INDEX idx_csv_mappings_signature ON csv_field_mappings USING GIN (column_signature);
CREATE INDEX idx_csv_mappings_system ON csv_field_mappings(is_system, module) WHERE is_system = true;

COMMENT ON TABLE csv_field_mappings IS 'Mapping registry: system templates (is_system=true) + user mappings (profile_id NOT NULL)';
COMMENT ON COLUMN csv_field_mappings.is_system IS 'System templates are read-only and available to all users';
COMMENT ON COLUMN csv_field_mappings.profile_id IS 'NULL = system template or global mapping, NOT NULL = user-specific mapping';
```

**Example entries:**

**System template (available to all users):**
```json
{
  "id": 1,
  "profile_id": null,
  "is_system": true,
  "module": "nutrition",
  "mapping_name": "FDDB Export (Standard)",
  "description": "Standard format for FDDB.de CSV exports (German)",
  "column_signature": ["datum_tag_monat_jahr_stunde_minute", "fett_g", "kh_g", "kj", "protein_g"],
  "delimiter": ";",
  "encoding": "utf-8",
  "has_header": true,
  "field_mappings": {
    "datum_tag_monat_jahr_stunde_minute": "date",
    "kj": "kcal",
    "fett_g": "fat_g",
    "kh_g": "carbs_g",
    "protein_g": "protein_g"
  },
  "type_conversions": {
    "date": {
      "type": "date",
      "format": "dd.mm.yyyy HH:MM",
      "extract": "date_only"
    },
    "kcal": {
      "type": "float",
      "source_unit": "kJ",
      "target_unit": "kcal",
      "conversion_factor": 0.239
    },
    "fat_g": {
      "type": "float",
      "decimal_separator": ","
    }
  },
  "usage_count": 1523,
  "success_rate": 0.99
}
```

**User-specific mapping (only for user ID 42):**
```json
{
  "id": 123,
  "profile_id": 42,
  "is_system": false,
  "module": "nutrition",
  "mapping_name": "Mein FDDB Export (angepasst)",
  "description": "FDDB export with note column",
  "column_signature": ["datum_tag_monat_jahr_stunde_minute", "fett_g", "kh_g", "kj", "notiz", "protein_g"],
  "delimiter": ";",
  "encoding": "utf-8",
  "has_header": true,
  "field_mappings": {
    "datum_tag_monat_jahr_stunde_minute": "date",
    "kj": "kcal",
    "fett_g": "fat_g",
    "kh_g": "carbs_g",
    "protein_g": "protein_g",
    "notiz": "note"
  },
  "type_conversions": {
    "date": {
      "type": "date",
      "format": "dd.mm.yyyy HH:MM",
      "extract": "date_only"
    },
    "kcal": {
      "type": "float",
      "source_unit": "kJ",
      "target_unit": "kcal",
      "conversion_factor": 0.239
    }
  },
  "usage_count": 8,
  "success_rate": 1.0
}
```

#### **`csv_import_log`** (import history for rollback)

```sql
CREATE TABLE csv_import_log (
    id                  SERIAL PRIMARY KEY,
    profile_id          INTEGER REFERENCES profiles(id),
    mapping_id          INTEGER REFERENCES csv_field_mappings(id),
    module              VARCHAR(50) NOT NULL,

    filename            VARCHAR(255),
    rows_total          INTEGER,
    rows_imported       INTEGER,
    rows_updated        INTEGER,
    rows_skipped        INTEGER,
    rows_errors         INTEGER,

    error_details       JSONB,                            -- [{"row": 5, "error": "Invalid date"}]

    started_at          TIMESTAMP DEFAULT NOW(),
    finished_at         TIMESTAMP,
    status              VARCHAR(20) DEFAULT 'running',    -- 'running', 'success', 'failed'

    -- For rollback
    affected_ids        JSONB                             -- {"nutrition_log": [123, 456, ...]}
);

CREATE INDEX idx_csv_import_profile ON csv_import_log(profile_id, module);
```
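
A rollback based on `affected_ids` could be sketched as follows. The helper name `build_rollback_statements`, the table whitelist, and the assumption that every data table has `id` and `profile_id` columns are illustrative and not taken from the existing code. The statements use psycopg-style `%s` placeholders, as in the existing Activity duplicate check.

```python
# Hypothetical whitelist: table names are interpolated into SQL below,
# so only known tables may ever be accepted.
ALLOWED_TABLES = {"nutrition_log", "activity_log"}


def build_rollback_statements(affected_ids: dict, profile_id: int) -> list:
    """Turn csv_import_log.affected_ids into (sql, params) pairs.

    affected_ids has the shape {"nutrition_log": [123, 456, ...]}.
    """
    statements = []
    for table, ids in affected_ids.items():
        if table not in ALLOWED_TABLES:
            raise ValueError(f"Unexpected table in import log: {table}")
        if not ids:
            continue
        sql = f"DELETE FROM {table} WHERE profile_id = %s AND id = ANY(%s)"
        statements.append((sql, (profile_id, list(ids))))
    return statements


for sql, params in build_rollback_statements({"nutrition_log": [123, 456]}, 42):
    print(sql, params)
```

Deleting by ID only undoes rows that the import inserted. Rows that were updated (duplicate strategy `update`) would need their previous values stored in the log to be restorable.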

### 3.2 System Templates (Seed Data)

**The following system templates are created during installation/migration:**

#### **Nutrition**

1. **FDDB Export (Standard)**
   - Delimiter: `;`
   - Encoding: `utf-8`
   - Columns: `datum_tag_monat_jahr_stunde_minute`, `kj`, `fett_g`, `kh_g`, `protein_g`
   - Special handling: kJ → kcal conversion

2. **MyFitnessPal Export**
   - Delimiter: `,`
   - Encoding: `utf-8`
   - Columns: `Date`, `Calories`, `Carbohydrates (g)`, `Fat (g)`, `Protein (g)`

3. **Cronometer Export**
   - Delimiter: `,`
   - Encoding: `utf-8`
   - Columns: `Day`, `Energy (kcal)`, `Protein (g)`, `Net Carbs (g)`, `Fat (g)`

#### **Activity**

1. **Apple Health Workout Export (English)**
   - Delimiter: `,`
   - Encoding: `utf-8`
   - Columns: `Workout Type`, `Start`, `End`, `Duration`, `Distance (km)`, `Active Energy (kcal)`, `Heart Rate Average (bpm)`
   - Special handling: automatic training-type mapping

2. **Apple Health Workout Export (German)**
   - Delimiter: `,`
   - Encoding: `utf-8`
   - Columns: `Trainingsart`, `Start`, `Ende`, `Dauer`, `Strecke (km)`, `Aktive Energie (kcal)`, `Durchschnittliche Herzfrequenz (bpm)`

3. **Garmin Connect Export**
   - Delimiter: `,`
   - Encoding: `utf-8`
   - Columns: `Activity Type`, `Date`, `Time`, `Duration`, `Distance`, `Calories`, `Avg HR`

#### **Blood Pressure**

1. **Omron Export (German)**
   - Delimiter: `,`
   - Encoding: `utf-8`
   - Columns: `Datum`, `Zeit`, `Systolisch (mmHg)`, `Diastolisch (mmHg)`, `Puls (bpm)`

2. **Omron Export (English)**
   - Delimiter: `,`
   - Encoding: `utf-8`
   - Columns: `Date`, `Time`, `Systolic (mmHg)`, `Diastolic (mmHg)`, `Pulse (bpm)`

#### **Vitals**

1. **Apple Health Vitals Export**
   - Delimiter: `,`
   - Encoding: `utf-8`
   - Columns: `Start`, `Resting Heart Rate (bpm)`, `Heart Rate Variability (ms)`, `Respiratory Rate (breaths/min)`, `Oxygen Saturation (%)`

#### **Weight**

1. **Apple Health Weight Export**
   - Delimiter: `,`
   - Encoding: `utf-8`
   - Columns: `Start`, `Body Mass (kg)`

2. **Withings Export**
   - Delimiter: `,`
   - Encoding: `utf-8`
   - Columns: `Date`, `Weight (kg)`, `Body Fat (%)`, `Muscle Mass (kg)`

**TOTAL:** ~12-15 system templates initially (11 are listed above)

**Migration:** `backend/migrations/XXX_csv_parser_seed_templates.sql`

### 3.3 Module Registry (Backend Code)

**`backend/csv_parser/module_registry.py`**

Defines for each module:
- Available DB fields
- Data types
- Validation
- Required fields
- Duplicate strategy

```python
MODULE_DEFINITIONS = {
    "nutrition": {
        "table": "nutrition_log",
        "fields": {
            "date": {"type": "date", "required": True},
            "kcal": {"type": "float", "required": True, "min": 0, "max": 10000},
            "protein_g": {"type": "float", "required": False, "min": 0},
            "fat_g": {"type": "float", "required": False, "min": 0},
            "carbs_g": {"type": "float", "required": False, "min": 0},
            "note": {"type": "string", "required": False, "max_length": 500}
        },
        "duplicate_key": ["profile_id", "date"],  # ON CONFLICT
        "duplicate_strategy": "update"  # "update" | "skip" | "error"
    },
    "activity": {
        "table": "activity_log",
        "fields": {
            "date": {"type": "date", "required": True},
            "start_time": {"type": "time", "required": False},
            "activity_type": {"type": "string", "required": True},
            "duration_min": {"type": "float", "required": True, "min": 0},
            "kcal_active": {"type": "float", "required": False},
            "distance_km": {"type": "float", "required": False},
            "hr_avg": {"type": "int", "required": False, "min": 30, "max": 220}
        },
        "duplicate_key": ["profile_id", "date", "start_time"],
        "duplicate_strategy": "update"
    },
    # ... further modules
}
```
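
As an illustration of how such a registry entry could drive validation, here is a minimal sketch. The function `validate_row` and the reduced `NUTRITION_FIELDS` excerpt are hypothetical; the sketch assumes values have already been converted to their target types.

```python
from datetime import date

# Hypothetical excerpt of the registry, reduced to three nutrition fields.
NUTRITION_FIELDS = {
    "date": {"type": "date", "required": True},
    "kcal": {"type": "float", "required": True, "min": 0, "max": 10000},
    "note": {"type": "string", "required": False, "max_length": 500},
}


def validate_row(row: dict, fields: dict) -> list:
    """Return validation errors for one converted row (empty list = valid)."""
    errors = []
    for name, spec in fields.items():
        value = row.get(name)
        if value is None:
            if spec.get("required"):
                errors.append(f"{name}: required field is missing")
            continue
        if "min" in spec and value < spec["min"]:
            errors.append(f"{name}: {value} is below minimum {spec['min']}")
        if "max" in spec and value > spec["max"]:
            errors.append(f"{name}: {value} is above maximum {spec['max']}")
        if "max_length" in spec and len(value) > spec["max_length"]:
            errors.append(f"{name}: longer than {spec['max_length']} characters")
    return errors


print(validate_row({"date": date(2024, 1, 1), "kcal": 2000.0}, NUTRITION_FIELDS))
print(validate_row({"kcal": -5.0}, NUTRITION_FIELDS))
```

Because the rules live in data rather than in per-module code, adding a field would only require a new registry entry.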

---

## 4. Architecture

### 4.1 System Components

```
┌─────────────────────────────────────────────────────────────┐
│                      Frontend (React)                        │
├─────────────────────────────────────────────────────────────┤
│  1. CSV upload component                                     │
│     - File upload + format detection                         │
│     - Preview (first 5 rows)                                 │
│                                                               │
│  2. Mapping-Editor                                           │
│     - Column-to-field assignment (Drag & Drop)              │
│     - Type conversion configuration                          │
│     - Preview of converted values                            │
│                                                               │
│  3. Mapping library                                          │
│     - View/select saved mappings                             │
│     - Templates (FDDB, Apple Health, etc.)                   │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    Backend (FastAPI)                         │
├─────────────────────────────────────────────────────────────┤
│  1. CSV-Parser-Engine                                        │
│     - Encoding-Detection (UTF-8, Latin-1, etc.)             │
│     - Delimiter-Detection (`,` `;` `\t`)                    │
│     - Column signature computation                           │
│                                                               │
│  2. Mapping-Engine                                           │
│     - Auto-Detection (columns → known mappings)              │
│     - Intelligent Suggestions (Fuzzy-Match, Sample analysis) │
│     - Mapping persistence (save to/load from DB)            │
│                                                               │
│  3. Type-Converter                                           │
│     - Date-Parser (20+ formats)                              │
│     - Number-Parser (decimal/thousands separators)          │
│     - Unit-Converter (kJ↔kcal, km↔mi, etc.)                │
│     - Text-Normalizer (Trim, Lowercase, etc.)               │
│                                                               │
│  4. Validator                                                │
│     - Type-Validation (INT, FLOAT, DATE, etc.)              │
│     - Range-Validation (min/max)                             │
│     - Required-Field-Check                                   │
│     - Custom-Validators pro Modul                            │
│                                                               │
│  5. Import-Executor                                          │
│     - Batch-Insert mit Transaction                           │
│     - Duplicate handling (Update/Skip/Error)                │
│     - Rollback on error                                      │
│     - Progress-Tracking (for large files)                    │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      PostgreSQL                              │
├─────────────────────────────────────────────────────────────┤
│  - csv_field_mappings (Mapping-Registry)                    │
│  - csv_import_log (import history)                          │
│  - nutrition_log, activity_log, ... (data tables)          │
└─────────────────────────────────────────────────────────────┘
```
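
The delimiter detection step of the parser engine could look roughly like this. It is a minimal sketch that counts candidate delimiters in the header line; the function names are hypothetical, and a real implementation would also need to handle delimiters inside quoted header cells.

```python
import csv
import io

# Candidate delimiters named in the architecture overview.
CANDIDATE_DELIMITERS = [",", ";", "\t"]


def detect_delimiter(text: str) -> str:
    """Pick the candidate that occurs most often in the header line."""
    lines = text.splitlines()
    header = lines[0] if lines else ""
    counts = {d: header.count(d) for d in CANDIDATE_DELIMITERS}
    best = max(counts, key=counts.get)
    # Fall back to a comma for single-column files.
    return best if counts[best] > 0 else ","


def read_rows(text: str, delimiter: str) -> list:
    """Parse the decoded CSV text into one dict per data row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


sample = "Datum;kJ;Fett (g)\n01.01.2024;8000;60\n"
delimiter = detect_delimiter(sample)
print(repr(delimiter), read_rows(sample, delimiter))
```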

### 4.2 Workflow (Happy Path)

```
1. User selects a file
   ↓
2. Frontend: POST /api/csv/analyze
   - Upload the file
   - Backend: detect encoding + delimiter
   - Backend: compute column signature
   - Backend: auto-detection
     1. Search user mappings (profile_id = current_user)
     2. Search system templates (is_system = true)
   ↓
3. Backend responds:
   {
     "detected_mapping": {
       "id": 1,
       "name": "FDDB Export (Standard)",
       "is_system": true,
       "confidence": 0.98,
       "match_type": "exact_signature"
     },
     "columns": ["date", "kcal", "protein"],
     "sample_rows": [...],
     "suggestions": {
       "date": ["date", "created_at"],  // suggestions
       "kcal": ["kcal", "energy"]
     }
   }
   ↓
4. Frontend: Mapping-Editor
   - User sees: "System template detected: FDDB Export (Standard)"
   - User can adjust the mapping (which then creates a user copy)
   - User tests the type conversion (preview)
   ↓
5. Frontend: POST /api/csv/import
   {
     "mapping_id": 1,              // use an existing mapping, OR:
     "mapping": {...},             // custom mapping
     "module": "nutrition",
     "save_mapping": true,         // save as user mapping?
     "mapping_name": "MyFitnessPal Export"
   }
   ↓
6. Backend: run the import
   - Validation
   - Start transaction
   - Import row by row
   - On error: rollback
   - On success: usage_count++ for the mapping used
   ↓
7. Backend: response
   {
     "success": true,
     "imported": 100,
     "updated": 5,
     "skipped": 2,
     "errors": [{"row": 7, "error": "Invalid date"}],
     "import_log_id": 456  // for rollback
   }
```

### 4.3 System Templates vs. User Mappings

**Hierarchy (auto-detection order):**

1. **User mappings** (profile_id = current_user)
   - Highest priority
   - Exact match → use immediately
   - Partial match → offer as suggestion

2. **System templates** (is_system = true, profile_id = NULL)
   - Fallback when no user mapping matches
   - Read-only (users cannot modify them)
   - Users can, however, **create a copy** and adjust it

**Permissions:**

| Action | User mappings | System templates |
|--------|---------------|------------------|
| **View** | ✅ Own | ✅ All |
| **Use** | ✅ Own | ✅ All |
| **Create** | ✅ Yes | ❌ Admin/migration only |
| **Modify** | ✅ Own | ❌ No (create a copy) |
| **Delete** | ✅ Own | ❌ No |
| **Copy** | ✅ Yes | ✅ Yes → user mapping |

**Workflow "Adjust a system template":**

```
User selects system template "FDDB Export (Standard)"
  → User changes the mapping (e.g. adds a column)
  → Frontend asks: "System templates cannot be modified.
                    Create a copy? [Yes] [Cancel]"
  → User clicks [Yes]
  → New user mapping with is_system=false, profile_id=current_user
```
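
The copy step at the end of this workflow might be sketched as follows. The function name, the reset of `usage_count`, and the `" - Copy"` name suffix are assumptions for illustration.

```python
import copy


def copy_template_for_user(template: dict, profile_id: int) -> dict:
    """Create an editable user mapping from a read-only system template."""
    user_mapping = copy.deepcopy(template)  # never mutate the template itself
    user_mapping.pop("id", None)            # the database assigns a new id on insert
    user_mapping.update(
        profile_id=profile_id,
        is_system=False,
        mapping_name=f"{template['mapping_name']} - Copy",
        usage_count=0,                      # assumption: statistics start fresh
    )
    return user_mapping


template = {
    "id": 1,
    "profile_id": None,
    "is_system": True,
    "mapping_name": "FDDB Export (Standard)",
    "field_mappings": {"kj": "kcal"},
    "usage_count": 1523,
}
print(copy_template_for_user(template, profile_id=42))
```

The deep copy matters because `field_mappings` is nested: a shallow copy would let edits to the user mapping leak back into the template object.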

---

## 5. Smart Features

### 5.1 Auto-Detection (Column-Signature Matching)

**Algorithm:**

1. **Exact signature:** Column names (normalized, sorted) are identical → 100% match
   ```
   ["date", "kcal", "protein_g"] → Mapping-ID 123
   ```

2. **Partial match:** ≥70% overlap → suggestion
   ```
   CSV: ["date", "calories", "protein"]
   DB:  ["date", "kcal", "protein_g"]
   → Overlap: 66% at best (if "protein" is counted as "protein_g"),
     which is below the 70% threshold → no suggestion from this step
   ```

3. **Fuzzy match:** Levenshtein distance < 3 (after lowercasing)
   ```
   "Datum" → "date" (distance: 2)
   "Kalorien" → "kcal" (distance too large, no fuzzy match)
   ```
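
The three steps above can be sketched as follows. This is a minimal illustration, not the final mapping engine: all function names are hypothetical, and "overlap" is assumed to mean the share of the known signature's columns that also occur in the CSV, which the concept does not yet define. The simple normalization would also strip umlauts.

```python
import re


def normalize(name: str) -> str:
    """Lowercase and collapse runs of non-alphanumerics into '_'."""
    return re.sub(r"[^a-z0-9]+", "_", name.strip().lower()).strip("_")


def signature(columns: list) -> list:
    """Normalized, sorted column names, as stored in column_signature."""
    return sorted(normalize(c) for c in columns)


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def classify_match(csv_columns: list, known_signature: list, threshold: float = 0.7):
    """Return 'exact_signature', 'partial', or None for one known mapping."""
    sig = signature(csv_columns)
    if sig == sorted(known_signature):
        return "exact_signature"
    if not known_signature:
        return None
    overlap = len(set(sig) & set(known_signature)) / len(known_signature)
    return "partial" if overlap >= threshold else None


def fuzzy_fields(column: str, db_fields: list, limit: int = 3) -> list:
    """DB fields whose edit distance to the normalized column is below limit."""
    name = normalize(column)
    return [f for f in db_fields if levenshtein(name, f) < limit]


known = ["date", "kcal", "protein_g"]
print(classify_match(["Date", "kcal", "Protein (g)"], known))
print(classify_match(["date", "calories", "protein"], known))
print(fuzzy_fields("Datum", ["date", "kcal"]))
```

Pure edit distance does not connect translated names such as "Kalorien" and "kcal", so those cases would depend on the sample-based suggestions or on manual mapping.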

### 5.2 Smart Suggestions

**Sample-based analysis:**

1. **Date detection:** Regex patterns for 20+ formats
   ```python
   SAMPLES = ["01.01.2024", "02.01.2024", "03.01.2024"]
   # → Pattern: dd.mm.yyyy
   # → Suggestion: column "Datum" → field "date"
   ```

2. **Number detection:** Statistics over the sample values
   ```python
   SAMPLES = ["1500,5", "2000,3", "1800,0"]
   # → Decimal separator: ","
   # → Range: 1000-3000 → fits "kcal"
   ```

3. **Unit detection:** Keyword search in column names
   ```python
   # "Active Energy (kJ)" → unit: kJ → field: kcal (with conversion)
   ```
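
A minimal sketch of these three detectors, assuming a small regex table instead of the planned 20+ formats. All names are hypothetical, and the unit detector only looks at a trailing parenthesized suffix in the column name.

```python
import re

# Two example patterns; keys are the matching strptime formats.
DATE_REGEXES = {
    "%d.%m.%Y": re.compile(r"^\d{2}\.\d{2}\.\d{4}$"),
    "%Y-%m-%d": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}


def detect_date_format(samples: list):
    """Return the first format that matches every sample, else None."""
    for fmt, pattern in DATE_REGEXES.items():
        if samples and all(pattern.match(s) for s in samples):
            return fmt
    return None


def detect_decimal_separator(samples: list) -> str:
    """Guess ',' only if commas occur and dots never do."""
    has_comma = any("," in s for s in samples)
    has_dot = any("." in s for s in samples)
    return "," if has_comma and not has_dot else "."


def detect_unit(column_name: str):
    """Extract a trailing unit such as 'kJ' from 'Active Energy (kJ)'."""
    match = re.search(r"\(([^)]+)\)\s*$", column_name)
    return match.group(1) if match else None


print(detect_date_format(["01.01.2024", "02.01.2024", "03.01.2024"]))
print(detect_decimal_separator(["1500,5", "2000,3", "1800,0"]))
print(detect_unit("Active Energy (kJ)"))
```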

### 5.3 Type Conversion (20+ Formats)

**Date formats:**
```python
DATE_PATTERNS = [
    "%Y-%m-%d",           # 2024-01-15 (ISO)
    "%d.%m.%Y",           # 15.01.2024 (DE)
    "%d/%m/%Y",           # 15/01/2024 (UK)
    "%m/%d/%Y",           # 01/15/2024 (US); ambiguous with the UK pattern for days <= 12
    "%Y-%m-%d %H:%M:%S",  # Full timestamp
    "%d.%m.%Y %H:%M",     # FDDB format
    # ... 15 more
]
```
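
One way to apply such a pattern list is to try each format in order and return the first that parses. The sketch below uses only the six patterns shown above; `parse_date` is a hypothetical name.

```python
from datetime import date, datetime

DATE_PATTERNS = [
    "%Y-%m-%d",
    "%d.%m.%Y",
    "%d/%m/%Y",
    "%m/%d/%Y",
    "%Y-%m-%d %H:%M:%S",
    "%d.%m.%Y %H:%M",
]


def parse_date(value: str, patterns=DATE_PATTERNS) -> date:
    """Try each pattern in order; return the date part of the first match."""
    for pattern in patterns:
        try:
            return datetime.strptime(value.strip(), pattern).date()
        except ValueError:
            continue  # this pattern does not fit, try the next one
    raise ValueError(f"Unrecognized date format: {value!r}")


print(parse_date("15.01.2024 08:30"))
print(parse_date("01/02/2024"))
```

Because the list is ordered, `01/02/2024` is read as 1 February (day first). This is why a mapping should store the detected format explicitly rather than rely on trial parsing for every row.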

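**Number conversion:**
```python
def parse_number(value: str, decimal_sep: str = ',', thousands_sep: str = '.') -> float:
    """Parse a localized number string, e.g. "1.500,50" -> 1500.50 (German defaults)."""
    # Pass the separators detected for the file: with these defaults,
    # an English-style "1500.5" would be misread as 15005.0.
    value = value.strip().replace(thousands_sep, '')
    value = value.replace(decimal_sep, '.')
    return float(value)
```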

**Unit-Conversion:**
```python
UNIT_CONVERSIONS = {
    ("kJ", "kcal"): lambda x: x / 4.184,
    ("kcal", "kJ"): lambda x: x * 4.184,
    ("km", "mi"): lambda x: x * 0.621371,
    ("mi", "km"): lambda x: x * 1.60934,
    ("kg", "lb"): lambda x: x * 2.20462,
    ("lb", "kg"): lambda x: x * 0.453592,
}
```
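
A lookup helper around such a table might look like this. `convert_unit` is a hypothetical name and only the energy entries are repeated here. The `conversion_factor` of 0.239 used in the template examples is the rounded value of 1/4.184.

```python
UNIT_CONVERSIONS = {
    ("kJ", "kcal"): lambda x: x / 4.184,
    ("kcal", "kJ"): lambda x: x * 4.184,
}


def convert_unit(value: float, source_unit: str, target_unit: str) -> float:
    """Convert value between units; identical units pass through unchanged."""
    if source_unit == target_unit:
        return value
    try:
        converter = UNIT_CONVERSIONS[(source_unit, target_unit)]
    except KeyError:
        raise ValueError(f"No conversion from {source_unit} to {target_unit}") from None
    return converter(value)


print(round(convert_unit(8000, "kJ", "kcal"), 1))
```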

---

## 6. API Endpoints

### 6.1 New Endpoints

#### **POST /api/csv/analyze**

Analyzes an uploaded CSV file and suggests mappings.

**Request:**
```
Content-Type: multipart/form-data

file: <csv-file>
module: "nutrition"
```

**Response:**
```json
{
  "encoding": "utf-8",
  "delimiter": ";",
  "columns": ["Datum", "Kalorien (kJ)", "Protein (g)", "Fett (g)"],
  "sample_rows": [
    {"Datum": "01.01.2024", "Kalorien (kJ)": "8000", "Protein (g)": "80", "Fett (g)": "60"},
    {"Datum": "02.01.2024", "Kalorien (kJ)": "9000", "Protein (g)": "90", "Fett (g)": "70"}
  ],
  "detected_mappings": [
    {
      "mapping_id": 123,
      "mapping_name": "FDDB Export",
      "confidence": 0.95,
      "match_type": "exact_signature"
    }
  ],
  "suggestions": {
    "Datum": {
      "suggested_field": "date",
      "confidence": 0.98,
      "type": "date",
      "detected_format": "dd.mm.yyyy",
      "sample_conversions": ["2024-01-01", "2024-01-02"]
    },
    "Kalorien (kJ)": {
      "suggested_field": "kcal",
      "confidence": 0.85,
      "type": "float",
      "requires_conversion": true,
      "source_unit": "kJ",
      "target_unit": "kcal",
      "sample_conversions": [1912.0, 2151.1]
    }
  },
  "available_fields": {
    "date": {"type": "date", "required": true},
    "kcal": {"type": "float", "required": true, "min": 0, "max": 10000},
    "protein_g": {"type": "float", "required": false},
    "fat_g": {"type": "float", "required": false},
    "carbs_g": {"type": "float", "required": false}
  }
}
```

#### **POST /api/csv/import**

Runs the import with the confirmed mapping.

**Request:**
```json
{
  "file_data": "<base64-encoded-csv>",  // or file_id from /analyze
  "module": "nutrition",
  "mapping": {
    "field_mappings": {
      "Datum": "date",
      "Kalorien (kJ)": "kcal",
      "Protein (g)": "protein_g"
    },
    "type_conversions": {
      "date": {"type": "date", "format": "dd.mm.yyyy"},
      "kcal": {"type": "float", "source_unit": "kJ", "conversion_factor": 0.239}
    }
  },
  "save_mapping": true,
  "mapping_name": "FDDB Export 2024"
}
```

**Response:**
```json
{
  "success": true,
  "import_log_id": 456,
  "stats": {
    "total_rows": 100,
    "imported": 95,
    "updated": 3,
    "skipped": 2,
    "errors": 0
  },
  "error_details": [],
  "duration_ms": 1234
}
```

#### **GET /api/csv/mappings**

Lists saved mappings (user mappings + system templates).

**Query params:**
- `module`: filter by module (optional)
**Response:**
```json
{
  "system_templates": [
    {
      "id": 1,
      "module": "nutrition",
      "name": "FDDB Export (Standard)",
      "description": "Standard format for FDDB.de CSV exports",
      "is_system": true,
      "usage_count": 1523,
      "success_rate": 0.99,
      "created_at": "2024-01-01T00:00:00"
    },
    {
      "id": 2,
      "module": "activity",
      "name": "Apple Health Workout Export",
      "description": "Apple Health CSV export (English)",
      "is_system": true,
      "usage_count": 5043,
      "success_rate": 0.98,
      "created_at": "2024-01-01T00:00:00"
    }
  ],
  "user_mappings": [
    {
      "id": 123,
      "module": "nutrition",
      "name": "Mein FDDB Export (angepasst)",
      "description": "FDDB with notes",
      "is_system": false,
      "usage_count": 8,
      "success_rate": 1.0,
      "last_used_at": "2024-01-15T10:30:00",
      "created_at": "2024-01-10T12:00:00"
    }
  ]
}
```

**Sorting:**
- System templates: by `usage_count DESC` (most popular first)
- User mappings: by `last_used_at DESC` (most recently used first)

#### **POST /api/csv/mappings/{mapping_id}/copy**

Creates a user copy of a system template (for adjustments).

**Response:**
```json
{
  "new_mapping_id": 124,
  "message": "Copy created: 'FDDB Export (Standard)' → 'FDDB Export (Standard) - Copy'"
}
```

#### **DELETE /api/csv/mappings/{mapping_id}**

Deletes a saved mapping.

**Permissions:**
- Users can only delete their **own** mappings (profile_id = current_user)
- System templates (is_system = true) can **not** be deleted
- Admins can delete all mappings (except system templates)
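
These three rules reduce to a small predicate. The sketch below assumes a mapping is available as a dict with `is_system` and `profile_id` keys; the function name `can_delete_mapping` is hypothetical.

```python
def can_delete_mapping(mapping: dict, current_profile_id: int, is_admin: bool = False) -> bool:
    """Apply the delete rules: system templates never, own mappings always, admins all others."""
    if mapping["is_system"]:
        return False  # read-only, even for admins
    return is_admin or mapping["profile_id"] == current_profile_id


own = {"is_system": False, "profile_id": 42}
foreign = {"is_system": False, "profile_id": 7}
template = {"is_system": True, "profile_id": None}

print(can_delete_mapping(own, 42))                     # own mapping
print(can_delete_mapping(foreign, 42))                 # someone else's mapping
print(can_delete_mapping(foreign, 42, is_admin=True))  # admin override
print(can_delete_mapping(template, 42, is_admin=True)) # system template
```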

#### **POST /api/csv/rollback/{import_log_id}**

Undoes an import (deletes the imported entries).

**NICE-TO-HAVE:** Only if time permits.

### 6.2 Existing Endpoints (Wrappers)

The existing endpoints **remain functional** as thin wrappers:

```python
# backend/routers/nutrition.py

@router.post("/import-csv")
async def import_nutrition_csv(file: UploadFile, ...):
    """
    LEGACY: FDDB-specific import (backward compatibility).
    Internally uses the universal parser with the predefined FDDB template.
    """
    # Wrapper around the universal parser:
    from csv_parser import universal_import

    mapping = get_predefined_mapping("nutrition", "fddb")
    result = await universal_import(
        file=file,
        module="nutrition",
        mapping=mapping,
        profile_id=pid  # pid: profile id from the elided auth parameters
    )

    # Keep the legacy response format:
    return {
        "imported": result["stats"]["imported"],
        "skipped": result["stats"]["skipped"]
    }
```

---

## 7. Frontend UI (Sketch)

### 7.1 CSV Upload Page

```
┌─────────────────────────────────────────────────────────┐
│  Import data › CSV upload                                │
├─────────────────────────────────────────────────────────┤
│                                                           │
│  Step 1: Upload file                                     │
│  ┌─────────────────────────────────────────────────┐   │
│  │  [📁 Choose file]  nutrition-export.csv         │   │
│  └─────────────────────────────────────────────────┘   │
│                                                           │
│  Step 2: Select module                                   │
│  ○ Nutrition   ○ Activity   ○ Weight   ○ Vitals        │
│                                                           │
│  [Next →]                                                │
└─────────────────────────────────────────────────────────┘
```

### 7.2 Mapping-Editor

```
┌──────────────────────────────────────────────────────────────┐
│  CSV import › Edit mapping                                   │
├──────────────────────────────────────────────────────────────┤
│                                                               │
│  ✓ Format detected: FDDB Export (95% match)                 │
│                                                               │
│  Column assignment:                                          │
│  ┌────────────────────────────────────────────────────────┐ │
│  │  CSV column                    → Database field        │ │
│  ├────────────────────────────────────────────────────────┤ │
│  │  "Datum"                       → [date ▼]     ✓        │ │
│  │  "Kalorien (kJ)"               → [kcal ▼]     ⚠️       │ │
│  │    └─ Conversion: kJ → kcal (÷4.184)                  │ │
│  │  "Protein (g)"                 → [protein_g ▼] ✓       │ │
│  │  "Fett (g)"                    → [fat_g ▼]    ✓        │ │
│  │  "Produkt"                     → [-- do not map --]    │ │
│  └────────────────────────────────────────────────────────┘ │
│                                                               │
│  Preview (first 3 rows):                                     │
│  ┌────────────────────────────────────────────────────────┐ │
│  │ date       │ kcal    │ protein_g │ fat_g │            │ │
│  ├────────────────────────────────────────────────────────┤ │
│  │ 2024-01-01 │ 1912.0  │ 80.0      │ 60.0  │            │ │
│  │ 2024-01-02 │ 2151.1  │ 90.0      │ 70.0  │            │ │
│  └────────────────────────────────────────────────────────┘ │
│                                                               │
│  ☐ Save mapping as: [FDDB Export 2024________]              │
│                                                               │
│  [← Back]  [Start import →]                                 │
└──────────────────────────────────────────────────────────────┘
```

### 7.3 Import Progress

```
┌─────────────────────────────────────────────────────────┐
│  CSV import running...                                   │
├─────────────────────────────────────────────────────────┤
│                                                           │
│  ████████████████████░░░░░░░░  80% (80/100 rows)        │
│                                                           │
│  ✓ 75 entries imported                                   │
│  ↻  3 entries updated                                    │
│  ⊗  2 errors                                             │
│                                                           │
│  [Cancel]                                                │
└─────────────────────────────────────────────────────────┘
```

---

## 8. Implementation Phases

### Phase 1: Foundation (Week 1) **← START HERE**

**Goal:** Parser engine + module registry + system templates

- [ ] **Migration:**
  - `XXX_csv_parser_tables.sql`: `csv_field_mappings` and `csv_import_log` tables
  - `XXX_csv_parser_seed_templates.sql`: create 12-15 system templates
- [ ] **Backend:**
  - `csv_parser/core.py`: encoding/delimiter detection
  - `csv_parser/module_registry.py`: module definitions
  - `csv_parser/type_converter.py`: date/number/unit converters (20+ formats)
  - `csv_parser/permissions.py`: system-template read-only check
- [ ] **Testing:** Unit tests for the type converter + system-template seed

**Output:**
- Working parser (without auto-detection, without UI)
- 12-15 system templates available in the DB
- Users can load templates (but not modify them)

---

### Phase 2: Mapping System (Week 2)

**Goal:** Auto-detection + mapping persistence

- [ ] **Backend:**
  - `csv_parser/mapping_engine.py`: auto-detection, fuzzy match
  - `csv_parser/suggestions.py`: smart suggestions
  - API: `/api/csv/analyze`, `/api/csv/mappings`, `/api/csv/mappings/{id}/copy`
- [ ] **Permissions:** System-template read-only enforcement
- [ ] **Testing:**
  - Auto-detection tests with real CSV files (all system templates)
  - User vs. system permissions (users cannot modify system templates)
  - Copy workflow (system template → user mapping)

**Output:**
- Auto-detection works (user mappings take priority over system templates)
- Users can copy system templates and adjust the copies
- Permissions are correct (read-only for system templates)

---

### Phase 3: Import Executor + API (Week 2-3)

**Goal:** Complete import workflow

- [ ] **Backend:**
  - `csv_parser/executor.py`: batch insert, validation, rollback
  - API: `/api/csv/import`, `/api/csv/mappings`
- [ ] **Migration:** Switch existing import endpoints to wrappers
- [ ] **Testing:** End-to-end tests (Nutrition, Activity)

**Output:** Import works via the API, legacy endpoints remain functional

---

### Phase 4: Frontend (Week 3-4)

**Goal:** User interface for the mapping editor

- [ ] **Frontend:**
  - `CSVUploadPage.jsx`: upload + module selection
  - `CSVMappingEditor.jsx`: column-to-field assignment
  - `CSVImportProgress.jsx`: progress display
  - `CSVMappingLibrary.jsx`: view/select saved mappings
- [ ] **UX:** Drag & drop for column assignment
- [ ] **Testing:** E2E tests (Playwright)

**Output:** Complete UI, users can create their own mappings

---

### Phase 5: Rollout (Week 4)

**Goal:** All modules migrated, legacy code removed

- [ ] All modules migrated to the universal parser (Weight, Circumference, Caliper, Sleep)
- [ ] Legacy import code removed (after a deprecation phase)
- [ ] Documentation updated
- [ ] Gitea Issue #21 closed

---

## 9. Open Questions (for user approval)

1. **Scope:** All modules at once or step by step? (Recommendation: start with Nutrition + Activity)
2. **Rollback:** Important enough for Phases 1-3, or NICE-TO-HAVE?
3. **UI complexity:** Drag & drop or simple dropdowns? (Recommendation: dropdowns first, D&D later)
4. **Performance:** Import limit per file? (Recommendation: 10,000 rows, then batch upload)
5. **Migration:** Wrap legacy endpoints immediately or run them in parallel?

---

## 10. Effort Estimate

| Phase | Effort | Components |
|-------|--------|------------|
| **Phase 1** | 8-12h | Parser engine, type converter, migrations |
| **Phase 2** | 6-8h | Auto-detection, mapping engine, suggestions |
| **Phase 3** | 8-10h | Import executor, API endpoints, wrappers |
| **Phase 4** | 12-16h | Frontend UI (3-4 components) |
| **Phase 5** | 4-6h | Migration of all modules, cleanup |
| **TOTAL** | **38-52h** | ~5-7 working days |

**Critical path:** Phase 1 → Phase 2 → Phase 3 (the backend must be complete before the frontend)

---

## 11. Risks & Mitigations

| Risk | Likelihood | Impact | Mitigation |
|------|------------|--------|------------|
| **Date-format variety:** 20+ formats are hard to parse | HIGH | MEDIUM | Fall back to manual input; the user can specify the format |
| **Performance:** Large files (>10k rows) are slow | MEDIUM | MEDIUM | Batch processing + background job (Celery) |
| **Backward compatibility:** Legacy code breaks | LOW | HIGH | Parallel operation + feature flag |
| **UX complexity:** Mapping editor too complex | MEDIUM | LOW | Wizard flow, step by step, good defaults |

---

## 12. Success Criteria

✅ **Users can upload a CSV file without coding knowledge**
✅ **The system detects known formats automatically (≥80% accuracy)**
✅ **Users can save and reuse their own mappings**
✅ **Import error rate < 5% for valid data**
✅ **Performance: 1000 rows in < 5 seconds**
✅ **All existing CSV imports keep working (wrappers)**

---

**Next step:** User approval of the concept + start of Phase 1 (Foundation)

**Estimated start-to-finish:** 5-7 working days (assuming focused work without interruptions)