New server manual

parent 84e319a42f, commit bc06ce5b90
docs/04_Operations/04_server_operation_manual.md (normal file, 198 lines)
@@ -0,0 +1,198 @@
---
doc_type: operations_manual
audience: system_admin, devops
scope: server_lifecycle, disaster_recovery, maintenance, backup
status: active
version: 1.6
hostname: llm-node
ip_address: 192.168.2.144
context: "Central documentation for host configuration, the Mindnet application, Gitea, and the backup strategy."
created_date: 2025-12-14
last_updated: 2025-12-14
---

# Server Operations Manual: llm-node (192.168.2.144)

This document describes the operation, maintenance, and disaster recovery of the server "llm-node".

## 1. System Overview & Network

### 1.1 Host Details
* **Hostname:** `llm-node`
* **IP address:** `192.168.2.144`
* **OS:** Ubuntu Server
* **Qdrant host volume path:** `/home/llmadmin/docker/qdrant/qdrant_data` (bind mount)

### 1.2 Storage & NAS Mount
The system backs up to a Synology NAS via NFSv3.

* **Mount point:** `/mnt/nas_backup`
* **Source:** `192.168.2.63:/volume1/Backup_LLM`
* **Fstab entry (reference for restore):**
    ```text
    192.168.2.63:/volume1/Backup_LLM /mnt/nas_backup nfs vers=3,_netdev,x-systemd.automount,nofail 0 0
    ```
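
To confirm that a restored `/etc/fstab` still carries this entry and that the share is actually mounted, a small check along the following lines may help. This is only a sketch: the function names `fstab_options` and `nas_is_mounted` are illustrative and not part of the server setup, and `mountpoint` (util-linux) is assumed to be installed. `fstab_options /etc/fstab /mnt/nas_backup` prints the options field of the entry; `nas_is_mounted /mnt/nas_backup` can be run before a backup or restore.

```bash
# Print the mount options that an fstab-style file lists for a mount point.
# Usage: fstab_options <fstab-file> <mount-point>
fstab_options() {
    awk -v mp="$2" '$1 !~ /^#/ && $2 == mp { print $4 }' "$1"
}

# Succeed only if the given path is currently a mount point.
# With x-systemd.automount the share is mounted on first access,
# so the directory is listed once before the check.
nas_is_mounted() {
    ls "$1" >/dev/null 2>&1
    mountpoint -q "$1"
}
```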

---

## 2. Services & Port Matrix

### 2.1 Native Services (systemd)

| Service | Port | User | Description |
| :--- | :--- | :--- | :--- |
| **SSH** | 22 | root | Remote access |
| **Gitea** | 3000 | `git` | Git version control |
| **Ollama** | 11434 | `ollama` | AI model server |
| **Mindnet Prod API** | 8001 | `llmadmin` | Backend application |
| **Mindnet Dev API** | 8002 | `llmadmin` | Backend application (development) |
| **Mindnet Prod UI** | 8501 | `llmadmin` | Frontend application |
| **Mindnet Dev UI** | 8502 | `llmadmin` | Frontend application (development) |
| **Act Runner** | 38703 | `root` | Gitea CI/CD runner |
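
The ports in the table above can be compared against what is actually listening on the host. The following is a minimal sketch, not an existing script on the server: the variable `EXPECTED_PORTS` and the function `missing_ports` are hypothetical names, and the input is assumed to be the output of `ss -H -tln` (listening TCP sockets, no header line).

```bash
# TCP ports taken from the native services table.
EXPECTED_PORTS="22 3000 11434 8001 8002 8501 8502 38703"

# Read `ss -H -tln` output on stdin and print every expected port
# that has no listening socket.
missing_ports() {
    local listening port
    # Column 4 is "address:port"; the port is the part after the last colon.
    listening=$(awk '{ n = split($4, a, ":"); print a[n] }' | sort -u)
    for port in $EXPECTED_PORTS; do
        printf '%s\n' "$listening" | grep -qx "$port" || echo "$port"
    done
}
```

Run it as `ss -H -tln | missing_ports`; empty output means every listed port has a listener.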

### 2.2 Docker Containers

Overview of the containers (`docker ps`).

| Container Name | Port | Description | Backup Strategy |
| :--- | :--- | :--- | :--- |
| **qdrant** | 6333 | Vector database (container name: `qdrant`) | **Stop -> Tar -> Start** (consistent snapshot) |
| **mindnet-embed** | 8990 | Embeddings service | Volume backup |
| **code-server** | 8443 | VS Code (web) | Volume backup |
| **silverbullet** | 13000 | Knowledge base | Volume backup |

---

## 3. Application Management

### 3.1 Gitea (Git Server)
* **User:** `git`
* **Data path:** `/var/lib/gitea`
* **Backup:** A consistent dump (`gitea-dump.zip`) is created by the `before_backup` hook.

### 3.2 Ollama (AI Models)
* **User:** `ollama`
* **Model storage:** `/usr/share/ollama/.ollama/models`
* **Backup:** Metadata only. Blobs are excluded.

### 3.3 Mindnet (AI App)
* **User:** `llmadmin`
* **Qdrant storage:** `/home/llmadmin/docker/qdrant/qdrant_data`
* **Cron job:** Regular import (hourly), run as user `llmadmin`.

---

## 4. Backup Strategy (Borgmatic)

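**Status:** Automated via the systemd timer `borgmatic.timer` (`systemctl start borgmatic.timer`; use `systemctl enable --now borgmatic.timer` so the timer is also active after a reboot).
**Target:** `/mnt/nas_backup/borg_repo`

### 4.1 Consistency Strategy
All writing services are prepared for the backup by the `before_backup` hooks:

* **Ollama:** the model list is written to `/root/ollama_models_backup.txt`.
* **Gitea:** a fresh dump is written to `/var/lib/gitea/gitea-dump.zip`.
* **Qdrant:** the container is stopped, `qdrant_data` is archived with `tar`, and the container is started again.

### 4.2 Configuration (`/etc/borgmatic/config.yaml`)
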
```yaml
source_directories:
    - /etc
    - /home
    - /var
    - /opt
    - /root

repositories:
    - path: /mnt/nas_backup/borg_repo
      label: nas

exclude_patterns:
    - /mnt/*
    - /var/lib/gitea/data/tmp/*
    # Ollama blobs (save space)
    - /usr/share/ollama/.ollama/models/blobs/*
    - /home/*/.ollama/models/blobs/*
    # Exclude Qdrant live data
    - /home/llmadmin/docker/qdrant/qdrant_data/*

keep_daily: 7
keep_weekly: 4
keep_monthly: 6

before_backup:
    - echo "--- Start Pre-Backup Hooks ---"

    # 1. Save the Ollama model list
    - ollama list > /root/ollama_models_backup.txt

    # 2. Gitea dump
    - echo "Creating Gitea dump..."
    - rm -f /var/lib/gitea/gitea-dump.zip
    - runuser -u git -- gitea dump -c /etc/gitea/app.ini -f /var/lib/gitea/gitea-dump.zip

    # 3. Qdrant snapshot
    - echo "Creating Qdrant snapshot..."
    - docker stop qdrant
    - rm -f /home/llmadmin/qdrant_backup_latest.tar.gz
    # Create the tar archive in the home directory (it is backed up by Borg)
    - tar -czf /home/llmadmin/qdrant_backup_latest.tar.gz -C /home/llmadmin/docker/qdrant qdrant_data
    - docker start qdrant

after_backup:
    - echo "Backup finished successfully."
    # After a successful backup, the temporary tar archive is deleted
    - rm -f /home/llmadmin/qdrant_backup_latest.tar.gz

on_error:
    - echo "ERROR occurred! Trying to restart the container..."
    - docker start qdrant
```
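
For a manual snapshot outside of Borgmatic, the same stop -> tar -> start sequence can be wrapped in one function that restarts the container even when `tar` fails. This is a sketch under assumptions: the function name `qdrant_snapshot` and the `DOCKER` override variable are hypothetical (the override exists only so the sequence can be dry-run with `DOCKER=echo`). A call matching the hooks above would be `qdrant_snapshot qdrant /home/llmadmin/docker/qdrant qdrant_data /home/llmadmin/qdrant_backup_latest.tar.gz`.

```bash
# Archive a directory while its container is stopped.
# The container is started again even if tar fails; the first
# failure is reported through the return code.
# Usage: qdrant_snapshot <container> <parent-dir> <subdir> <archive>
qdrant_snapshot() {
    local container="$1"
    local parent="$2"
    local subdir="$3"
    local archive="$4"
    local rc=0

    # Do not archive if the container could not be stopped.
    "${DOCKER:-docker}" stop "$container" || return 1

    tar -czf "$archive" -C "$parent" "$subdir" || rc=$?

    # Always try to bring the container back up.
    "${DOCKER:-docker}" start "$container" || rc=$?
    return "$rc"
}
```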

---

## 5. Disaster Recovery (Restore)

#### Step 1: Base system
1. Install Ubuntu Server (IP: `192.168.2.144`).
2. Install tools: `apt install borgmatic docker.io git nfs-common`
3. Mount the NAS (requires `nfs-common` from the previous step): `mkdir -p /mnt/nas_backup && mount -t nfs 192.168.2.63:/volume1/Backup_LLM /mnt/nas_backup`
4. Create users: `llmadmin`, `git`, `ollama`.

#### Step 2: Data restore (Borg)
```bash
# Connect the repo (left commented out: init creates a new repository,
# which is not needed when restoring from an existing one)
# borgmatic init --encryption=none /mnt/nas_backup/borg_repo
# Restore all data
borgmatic extract --archive latest --path / --destination /
```
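
Because the extract writes directly to `/`, it can be worth checking first that the NAS is mounted and the repository directory is really there. The guard below is a sketch that assumes a Borg 1.x repository layout (a `config` file and a `data` directory at the top level); the function name `looks_like_borg_repo` is hypothetical.

```bash
# Succeed only if the directory has the top-level entries of a
# Borg 1.x repository (assumption: "config" file and "data" directory).
looks_like_borg_repo() {
    [ -f "$1/config" ] && [ -d "$1/data" ]
}
```

Usage: `looks_like_borg_repo /mnt/nas_backup/borg_repo && borgmatic extract --archive latest --path / --destination /`.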

#### Step 3: Restore services

**A. Gitea (from dump):**
1. Unpack the dump: `unzip /var/lib/gitea/gitea-dump.zip -d /tmp/gitea_restore`
2. Start the service: `systemctl enable --now gitea`

**B. Qdrant (from snapshot):**
The tar archive is in the home directory (`/home/llmadmin/`). It must be unpacked into the volume directory.

1. Change to the host directory of the volume:
    ```bash
    cd /home/llmadmin/docker/qdrant
    ```
2. Unpack the backed-up tar (restores the `qdrant_data` folder):
    ```bash
    tar -xzf /home/llmadmin/qdrant_backup_latest.tar.gz
    ```
3. Start the container (the volume mount is already correct):
    ```bash
    docker start qdrant
    # Or recreate the container if necessary
    ```

**C. Ollama (Smart Restore):**
1. Install Ollama.
2. Re-pull the models listed in `/root/ollama_models_backup.txt`.
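
Since `/root/ollama_models_backup.txt` holds plain `ollama list` output (a header line followed by one model per row, name in the first column), re-pulling can be scripted roughly as follows. This is a sketch, not the script used on the server: the function names `models_from_list` and `pull_models` and the `OLLAMA` override variable are hypothetical (`OLLAMA=echo` gives a dry run). Run it as `pull_models /root/ollama_models_backup.txt`.

```bash
# Print the model names from saved `ollama list` output:
# first column of every non-empty row, header line skipped.
models_from_list() {
    awk 'NR > 1 && NF > 0 { print $1 }' "$1"
}

# Pull every model named in the saved list.
pull_models() {
    models_from_list "$1" | while read -r model; do
        # </dev/null keeps the pull from consuming the remaining model names.
        "${OLLAMA:-ollama}" pull "$model" </dev/null
    done
}
```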

---

## 6. Log and Error Analysis

* **Backup logs:** `journalctl -u borgmatic`
* **Qdrant container logs:** `docker logs qdrant`