docs: add comprehensive project documentation

- Replace original Subgen README with TranscriptorIO documentation
- Add docs/API.md documenting 45+ REST endpoints
- Add docs/ARCHITECTURE.md with backend component details
- Add docs/FRONTEND.md with Vue 3 frontend structure
- Add docs/CONFIGURATION.md with settings system documentation
- Remove outdated backend/README.md
2026-01-16 15:10:41 +01:00
parent 9655686a50
commit 8373d8765f
6 changed files with 3109 additions and 435 deletions

docs/API.md (new file, 1195 lines; diff suppressed because it is too large)

docs/ARCHITECTURE.md (new file, 613 lines)

@@ -0,0 +1,613 @@
# TranscriptorIO Backend Architecture
Technical documentation of the backend architecture, components, and data flow.
## Table of Contents
- [Overview](#overview)
- [Directory Structure](#directory-structure)
- [Core Components](#core-components)
- [Data Flow](#data-flow)
- [Database Schema](#database-schema)
- [Transcription vs Translation](#transcription-vs-translation)
- [Worker Architecture](#worker-architecture)
- [Queue System](#queue-system)
- [Scanner System](#scanner-system)
- [Settings System](#settings-system)
- [Graceful Degradation](#graceful-degradation)
- [Thread Safety](#thread-safety)
- [Important Patterns](#important-patterns)
---
## Overview
TranscriptorIO is built with a modular architecture consisting of:
- **FastAPI Server**: REST API with 45+ endpoints
- **Worker Pool**: Multiprocessing-based transcription workers (CPU/GPU)
- **Queue Manager**: Persistent job queue with priority support
- **Library Scanner**: Rule-based file scanning with scheduler and watcher
- **Settings Service**: Database-backed configuration system
```
┌─────────────────────────────────────────────────────────┐
│ FastAPI Server │
│ ┌─────────────────────────────────────────────────┐ │
│ │ REST API (45+ endpoints) │ │
│ │ /api/workers | /api/jobs | /api/settings │ │
│ │ /api/scanner | /api/system | /api/setup │ │
│ └─────────────────────────────────────────────────┘ │
└──────────────────┬──────────────────────────────────────┘
┌──────────────┼──────────────┬──────────────────┐
│ │ │ │
▼ ▼ ▼ ▼
┌────────┐ ┌──────────┐ ┌─────────┐ ┌──────────┐
│ Worker │ │ Queue │ │ Scanner │ │ Database │
│ Pool │◄──┤ Manager │◄──┤ Engine │ │ SQLite/ │
│ CPU/GPU│ │ Priority │ │ Rules + │ │ Postgres │
└────────┘ │ Queue │ │ Watcher │ └──────────┘
└──────────┘ └─────────┘
```
---
## Directory Structure
```
backend/
├── app.py # FastAPI application + lifespan
├── cli.py # CLI commands (server, db, worker, scan, setup)
├── config.py # Pydantic Settings (from .env)
├── setup_wizard.py # Interactive first-run setup
├── core/
│ ├── database.py # SQLAlchemy setup + session management
│ ├── models.py # Job model + enums
│ ├── language_code.py # ISO 639 language code utilities
│ ├── settings_model.py # SystemSettings model (database-backed)
│ ├── settings_service.py # Settings service with caching
│ ├── system_monitor.py # CPU/RAM/GPU/VRAM monitoring
│ ├── queue_manager.py # Persistent queue with priority
│ ├── worker.py # Individual worker (Process)
│ └── worker_pool.py # Worker pool orchestrator
├── transcription/
│ ├── __init__.py # Exports + WHISPER_AVAILABLE flag
│ ├── transcriber.py # WhisperTranscriber wrapper
│ ├── translator.py # Google Translate integration
│ └── audio_utils.py # ffmpeg/ffprobe utilities
├── scanning/
│ ├── __init__.py # Exports (NO library_scanner import!)
│ ├── models.py # ScanRule model
│ ├── file_analyzer.py # ffprobe file analysis
│ ├── language_detector.py # Audio language detection
│ ├── detected_languages.py # Language mappings
│ └── library_scanner.py # Scanner + scheduler + watcher
└── api/
├── __init__.py # Router exports
├── workers.py # Worker management endpoints
├── jobs.py # Job queue endpoints
├── scan_rules.py # Scan rules CRUD
├── scanner.py # Scanner control endpoints
├── settings.py # Settings CRUD endpoints
├── system.py # System resources endpoints
├── filesystem.py # Filesystem browser endpoints
└── setup_wizard.py # Setup wizard endpoints
```
---
## Core Components
### 1. WorkerPool (`core/worker_pool.py`)
Orchestrates CPU/GPU workers as separate processes.
**Key Features:**
- Dynamic add/remove workers at runtime
- Health monitoring with auto-restart
- Thread-safe multiprocessing
- Each worker is an isolated Process
```python
from backend.core.worker_pool import worker_pool
from backend.core.worker import WorkerType
# Add GPU worker on device 0
worker_id = worker_pool.add_worker(WorkerType.GPU, device_id=0)
# Add CPU worker
worker_id = worker_pool.add_worker(WorkerType.CPU)
# Get pool stats
stats = worker_pool.get_pool_stats()
```
### 2. QueueManager (`core/queue_manager.py`)
Persistent SQLite/PostgreSQL queue with priority support.
**Key Features:**
- Job deduplication (no duplicate `file_path`)
- Row-level locking with `skip_locked=True`
- Priority-based ordering (higher first)
- FIFO within same priority (by `created_at`)
- Auto-retry failed jobs
```python
from backend.core.queue_manager import queue_manager
from backend.core.models import QualityPreset
job = queue_manager.add_job(
file_path="/media/anime.mkv",
file_name="anime.mkv",
source_lang="jpn",
target_lang="spa",
quality_preset=QualityPreset.FAST,
priority=5
)
```
### 3. LibraryScanner (`scanning/library_scanner.py`)
Rule-based file scanning system.
**Three Scan Modes:**
- **Manual**: One-time scan via API or CLI
- **Scheduled**: Periodic scanning with APScheduler
- **Real-time**: File watcher with watchdog library
```python
from backend.scanning.library_scanner import library_scanner
# Manual scan
result = library_scanner.scan_paths(["/media/anime"], recursive=True)
# Start scheduler (every 6 hours)
library_scanner.start_scheduler(interval_minutes=360)
# Start file watcher
library_scanner.start_file_watcher(paths=["/media/anime"], recursive=True)
```
### 4. WhisperTranscriber (`transcription/transcriber.py`)
Wrapper for stable-whisper and faster-whisper.
**Key Features:**
- GPU/CPU support with auto-device detection
- VRAM management and cleanup
- Graceful degradation (works without Whisper installed)
```python
from backend.transcription.transcriber import WhisperTranscriber
transcriber = WhisperTranscriber(
model_name="large-v3",
device="cuda",
compute_type="float16"
)
result = transcriber.transcribe_file(
file_path="/media/episode.mkv",
language="jpn",
task="translate" # translate to English
)
result.to_srt("episode.eng.srt")
```
### 5. SettingsService (`core/settings_service.py`)
Database-backed configuration with caching.
```python
from backend.core.settings_service import settings_service
# Get setting
value = settings_service.get("worker_cpu_count", default=1)
# Set setting
settings_service.set("worker_cpu_count", "2")
# Bulk update
settings_service.bulk_update({
"worker_cpu_count": "2",
"scanner_enabled": "true"
})
```
---
## Data Flow
```
1. LibraryScanner detects file (manual/scheduled/watcher)
2. FileAnalyzer analyzes with ffprobe
- Audio tracks (codec, language, channels)
- Embedded subtitles
- External .srt files
- Duration, video info
3. Rules Engine evaluates against ScanRules (priority order)
- Checks all conditions (audio language, missing subs, etc.)
- First matching rule wins
4. If match → QueueManager.add_job()
- Deduplication check (no duplicate file_path)
- Assigns priority based on rule
5. Worker pulls job from queue
- Uses with_for_update(skip_locked=True)
- FIFO within same priority
6. WhisperTranscriber processes with model
- Stage 1: Audio → English (Whisper translate)
- Stage 2: English → Target (Google Translate, if needed)
7. Generate output SRT file(s)
- .eng.srt (always)
- .{target}.srt (if translate mode)
8. Job marked completed ✓
```
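Steps 6 and 7 determine the output file names. As a small illustration of that naming rule (a sketch using `pathlib`; the backend's actual helper may differ):
```python
from pathlib import Path

def output_paths(media_file: str, mode: str, target: str) -> list[Path]:
    """Sketch of the step-7 naming rule; not the backend's actual helper."""
    src = Path(media_file)
    outputs = [src.with_suffix(".eng.srt")]  # Stage 1 always writes English
    if mode == "translate" and target != "eng":
        outputs.append(src.with_suffix(f".{target}.srt"))  # Stage 2 adds the target
    return outputs

print(output_paths("/media/anime.mkv", "translate", "spa"))
# [PosixPath('/media/anime.eng.srt'), PosixPath('/media/anime.spa.srt')]
```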
---
## Database Schema
### Job Table (`jobs`)
```sql
id VARCHAR PRIMARY KEY
file_path VARCHAR UNIQUE -- Ensures no duplicates
file_name VARCHAR
status VARCHAR -- queued/processing/completed/failed/cancelled
priority INTEGER
source_lang VARCHAR
target_lang VARCHAR
quality_preset VARCHAR -- fast/balanced/best
transcribe_or_translate VARCHAR -- transcribe/translate
progress FLOAT
current_stage VARCHAR
eta_seconds INTEGER
created_at DATETIME
started_at DATETIME
completed_at DATETIME
output_path VARCHAR
srt_content TEXT
segments_count INTEGER
error TEXT
retry_count INTEGER
max_retries INTEGER
worker_id VARCHAR
vram_used_mb INTEGER
processing_time_seconds FLOAT
```
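The `jobs` table maps onto a SQLAlchemy model in `core/models.py`. A minimal sketch of how the key columns might be declared (column names come from the schema above; types, defaults, and indexes are assumptions):
```python
from sqlalchemy import Column, DateTime, Float, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Job(Base):
    __tablename__ = "jobs"

    id = Column(String, primary_key=True)
    file_path = Column(String, unique=True, nullable=False)  # enforces deduplication
    file_name = Column(String)
    status = Column(String, index=True)   # queued/processing/completed/failed/cancelled
    priority = Column(Integer, default=0)
    progress = Column(Float, default=0.0)
    created_at = Column(DateTime)
    error = Column(Text)
    retry_count = Column(Integer, default=0)
```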
### ScanRule Table (`scan_rules`)
```sql
id INTEGER PRIMARY KEY
name VARCHAR UNIQUE
enabled BOOLEAN
priority INTEGER -- Higher = evaluated first
-- Conditions (all must match):
audio_language_is VARCHAR -- ISO 639-2
audio_language_not VARCHAR -- Comma-separated
audio_track_count_min INTEGER
has_embedded_subtitle_lang VARCHAR
missing_embedded_subtitle_lang VARCHAR
missing_external_subtitle_lang VARCHAR
file_extension VARCHAR -- Comma-separated
-- Action:
action_type VARCHAR -- transcribe/translate
target_language VARCHAR
quality_preset VARCHAR
job_priority INTEGER
created_at DATETIME
updated_at DATETIME
```
### SystemSettings Table (`system_settings`)
```sql
id INTEGER PRIMARY KEY
key VARCHAR UNIQUE
value TEXT
description TEXT
category VARCHAR -- general/workers/transcription/scanner/bazarr
value_type VARCHAR -- string/integer/boolean/list
created_at DATETIME
updated_at DATETIME
```
---
## Transcription vs Translation
### Understanding the Two Modes
**Mode 1: `transcribe`** (Audio → English subtitles)
```
Audio (any language) → Whisper (task='translate') → English SRT
Example: Japanese audio → anime.eng.srt
```
**Mode 2: `translate`** (Audio → English → Target language)
```
Audio (any language) → Whisper (task='translate') → English SRT
→ Google Translate → Target language SRT
Example: Japanese audio → anime.eng.srt + anime.spa.srt
```
### Why Two Stages?
**Whisper Limitation**: Whisper can only translate TO English, not between other languages.
**Solution**: Two-stage process:
1. **Stage 1 (Always)**: Whisper converts audio to English using `task='translate'`
2. **Stage 2 (Only for translate mode)**: Google Translate converts English to target language
### Output Files
| Mode | Target | Output Files |
|------|--------|--------------|
| transcribe | spa | `.eng.srt` only |
| translate | spa | `.eng.srt` + `.spa.srt` |
| translate | fra | `.eng.srt` + `.fra.srt` |
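Putting the two stages together, a condensed sketch of the translate-mode pipeline (the `translate_srt` helper is a hypothetical name; the real integration lives in `transcription/translator.py`):
```python
from backend.transcription.transcriber import WhisperTranscriber

def process(file_path: str, mode: str, target_lang: str, source_lang: str | None = None):
    transcriber = WhisperTranscriber(model_name="medium", device="cpu", compute_type="auto")
    # Stage 1 (always): Whisper translates the audio into English
    result = transcriber.transcribe_file(file_path=file_path, language=source_lang, task="translate")
    eng_srt = file_path.rsplit(".", 1)[0] + ".eng.srt"
    result.to_srt(eng_srt)
    # Stage 2 (translate mode only): English SRT -> target language via Google Translate
    if mode == "translate" and target_lang != "eng":
        from backend.transcription.translator import translate_srt  # hypothetical name
        translate_srt(eng_srt, target_lang)
```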
---
## Worker Architecture
### Worker Types
| Type | Description | Device |
|------|-------------|--------|
| CPU | Uses CPU for inference | None |
| GPU | Uses NVIDIA GPU | cuda:N |
### Worker Lifecycle
```
┌─────────────┐
│ CREATED │
└──────┬──────┘
│ start()
┌─────────────┐
┌──────────│ IDLE │◄─────────┐
│ └──────┬──────┘ │
│ │ get_job() │ job_done()
│ ▼ │
│ ┌─────────────┐ │
│ │ BUSY │──────────┘
│ └──────┬──────┘
│ │ error
│ ▼
│ ┌─────────────┐
└──────────│ ERROR │
└─────────────┘
```
### Process Isolation
Each worker runs in a separate Python process:
- Memory isolation (VRAM per GPU worker)
- Crash isolation (one worker crash doesn't affect others)
- Independent model loading
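A minimal illustration of this model, stripped down to bare `multiprocessing` (the real `Worker` class in `core/worker.py` adds the job loop, health reporting, and model management):
```python
import multiprocessing as mp

def worker_main(worker_id: str) -> None:
    # Model loading and the job loop happen entirely inside this process;
    # if it crashes, the parent and sibling workers are unaffected.
    print(f"{worker_id}: running in pid {mp.current_process().pid}")

if __name__ == "__main__":
    procs = [mp.Process(target=worker_main, args=(f"worker-{i}",)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```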
---
## Queue System
### Priority System
```python
# Priority values
BAZARR_REQUEST = base_priority + 10 # Highest (external request)
MANUAL_REQUEST = base_priority + 5 # High (user-initiated)
AUTO_SCAN = base_priority # Normal (scanner-generated)
```
### Job Deduplication
Jobs are deduplicated by `file_path`:
- If a job with the same `file_path` already exists, the new one is rejected
- `add_job()` returns `None` in that case (see the sketch below)
- This prevents the same file from being processed twice
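A sketch of the caller-side check (`add_job()` may require more arguments than shown here):
```python
from backend.core.queue_manager import queue_manager

job = queue_manager.add_job(
    file_path="/media/anime.mkv",
    file_name="anime.mkv",
)
if job is None:
    print("Rejected: a job for this file_path already exists")
```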
### Concurrency Safety
```python
# Row-level locking prevents race conditions
job = session.query(Job).filter(
Job.status == JobStatus.QUEUED
).with_for_update(skip_locked=True).first()
```
---
## Scanner System
### Scan Rule Evaluation
Rules are evaluated in priority order (highest first):
```python
# Pseudo-code for rule matching
for rule in rules.order_by(priority.desc()):
if rule.enabled and matches_all_conditions(file, rule):
create_job(file, rule.action)
break # First match wins
```
### Conditions
All conditions must match (AND logic):
| Condition | Match If |
|-----------|----------|
| audio_language_is | Primary audio track language equals |
| audio_language_not | Primary audio track language NOT in list |
| audio_track_count_min | Number of audio tracks >= value |
| has_embedded_subtitle_lang | Has embedded subtitle in language |
| missing_embedded_subtitle_lang | Does NOT have embedded subtitle |
| missing_external_subtitle_lang | Does NOT have external .srt file |
| file_extension | File extension in comma-separated list |
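A hedged sketch of how that AND logic might look; the `FileAnalysis` field names (`primary_audio_language`, `audio_track_count`, `embedded_subtitle_langs`, `extension`) are assumptions, not the actual attributes:
```python
def matches_all_conditions(analysis, rule) -> bool:
    if rule.audio_language_is and analysis.primary_audio_language != rule.audio_language_is:
        return False
    if rule.audio_language_not:
        blocked = {c.strip() for c in rule.audio_language_not.split(",")}
        if analysis.primary_audio_language in blocked:
            return False
    if rule.audio_track_count_min and analysis.audio_track_count < rule.audio_track_count_min:
        return False
    if rule.missing_embedded_subtitle_lang and \
            rule.missing_embedded_subtitle_lang in analysis.embedded_subtitle_langs:
        return False  # subtitle already present, so the "missing" condition fails
    if rule.file_extension:
        allowed = {e.strip().lower() for e in rule.file_extension.split(",")}
        if analysis.extension.lower() not in allowed:
            return False
    return True  # unset conditions are skipped; every configured condition matched
```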
---
## Settings System
### Categories
| Category | Settings |
|----------|----------|
| general | operation_mode, library_paths, log_level |
| workers | cpu_count, gpu_count, auto_start, healthcheck_interval |
| transcription | whisper_model, compute_type, vram_management |
| scanner | enabled, schedule_interval, watcher_enabled |
| bazarr | provider_enabled, api_key |
### Caching
Settings service implements caching:
- Cache invalidated on write
- Thread-safe access
- Lazy loading from database
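One way such a cache might look (illustrative only; the real `settings_service` wires this to the `system_settings` table):
```python
import threading

class CachedSettings:
    """Illustrative cache shape; not the actual settings_service implementation."""

    def __init__(self, load_all, write_one):
        self._load_all = load_all    # callable reading all rows from system_settings
        self._write_one = write_one  # callable persisting one key/value pair
        self._lock = threading.Lock()
        self._cache = None           # lazy: filled on first read

    def get(self, key, default=None):
        with self._lock:
            if self._cache is None:
                self._cache = self._load_all()
            return self._cache.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._write_one(key, value)
            self._cache = None       # invalidate on write
```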
---
## Graceful Degradation
The system can run WITHOUT Whisper/torch/PyAV installed:
```python
# Pattern used everywhere
try:
import stable_whisper
WHISPER_AVAILABLE = True
except ImportError:
stable_whisper = None
WHISPER_AVAILABLE = False
# Later in code
if not WHISPER_AVAILABLE:
raise RuntimeError("Install with: pip install stable-ts faster-whisper")
```
**What works without Whisper:**
- Backend server starts normally
- All APIs work fully
- Frontend development
- Scanner and rules management
- Job queue (jobs just won't be processed)
**What doesn't work:**
- Actual transcription (throws RuntimeError)
---
## Thread Safety
### Database Sessions
Always use context managers:
```python
with database.get_session() as session:
# Session is automatically committed on success
# Rolled back on exception
job = session.query(Job).filter(...).first()
```
### Worker Pool
- Each worker is a separate Process (multiprocessing)
- Communication via shared memory (Manager)
- No GIL contention between workers
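A minimal illustration of sharing worker status through a `multiprocessing.Manager` dict (a sketch; the actual pool shares richer state):
```python
import multiprocessing as mp

def worker_main(status, worker_id):
    status[worker_id] = "busy"   # visible to the parent immediately
    # ... pull a job, transcribe, write SRT ...
    status[worker_id] = "idle"

if __name__ == "__main__":
    with mp.Manager() as manager:
        status = manager.dict()
        p = mp.Process(target=worker_main, args=(status, "cpu-0"))
        p.start()
        p.join()
        print(dict(status))  # {'cpu-0': 'idle'}
```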
### Queue Manager
- Uses SQLAlchemy row locking
- `skip_locked=True` prevents deadlocks
- Transactions are short-lived
---
## Important Patterns
### Circular Import Resolution
**Critical**: `backend/scanning/__init__.py` MUST NOT import `library_scanner`:
```python
# backend/scanning/__init__.py
from backend.scanning.models import ScanRule
from backend.scanning.file_analyzer import FileAnalyzer, FileAnalysis
# DO NOT import library_scanner here!
```
**Why?**
```
library_scanner → database → models → scanning.models → database (circular!)
```
**Solution**: Import `library_scanner` locally where needed:
```python
def some_function():
from backend.scanning.library_scanner import library_scanner
library_scanner.scan_paths(...)
```
### Optional Imports
```python
try:
import pynvml
NVML_AVAILABLE = True
except ImportError:
pynvml = None
NVML_AVAILABLE = False
```
### Database Session Pattern
```python
from backend.core.database import database
with database.get_session() as session:
# All operations within session context
job = session.query(Job).filter(...).first()
job.status = JobStatus.PROCESSING
# Commit happens automatically
```
### API Response Pattern
```python
from pydantic import BaseModel
class JobResponse(BaseModel):
id: str
status: str
# ...
@router.get("/{job_id}", response_model=JobResponse)
async def get_job(job_id: str):
with database.get_session() as session:
job = session.query(Job).filter(Job.id == job_id).first()
if not job:
raise HTTPException(status_code=404, detail="Not found")
return JobResponse(**job.to_dict())
```

docs/CONFIGURATION.md (new file, 402 lines)

@@ -0,0 +1,402 @@
# TranscriptorIO Configuration
Complete documentation for the configuration system.
## Table of Contents
- [Overview](#overview)
- [Configuration Methods](#configuration-methods)
- [Settings Categories](#settings-categories)
- [All Settings Reference](#all-settings-reference)
- [Environment Variables](#environment-variables)
- [Setup Wizard](#setup-wizard)
- [API Configuration](#api-configuration)
- [Python Usage](#python-usage)
---
## Overview
TranscriptorIO uses a **database-backed configuration system**. All settings are stored in the `system_settings` table and can be managed through:
1. **Setup Wizard** (first run)
2. **Web UI** (Settings page)
3. **REST API** (`/api/settings`)
4. **CLI** (for advanced users)
This approach provides:
- Persistent configuration across restarts
- Runtime configuration changes without restart
- Category-based organization
- Type validation and parsing
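As a sketch of what that type parsing might look like for the `value_type` column (`string`/`integer`/`boolean`/`list`); the real service may differ:
```python
def parse_value(raw: str, value_type: str):
    if value_type == "integer":
        return int(raw)
    if value_type == "boolean":
        return raw.strip().lower() in ("true", "1", "yes")
    if value_type == "list":
        return [item.strip() for item in raw.split(",") if item.strip()]
    return raw  # plain string

assert parse_value("2", "integer") == 2
assert parse_value("true", "boolean") is True
assert parse_value("/media/anime,/media/movies", "list") == ["/media/anime", "/media/movies"]
```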
---
## Configuration Methods
### 1. Setup Wizard (Recommended for First Run)
```bash
# Runs automatically on first server start
python backend/cli.py server
# Or run manually anytime
python backend/cli.py setup
```
The wizard guides you through:
- **Operation mode selection** (Standalone or Bazarr provider)
- **Library paths configuration**
- **Initial scan rules**
- **Worker configuration** (CPU/GPU counts)
- **Scanner schedule**
### 2. Web UI (Recommended for Daily Use)
Navigate to **Settings** in the web interface (`http://localhost:8000/settings`).
Features:
- Settings grouped by category tabs
- Descriptions for each setting
- Change detection (warns about unsaved changes)
- Bulk save functionality
### 3. REST API (For Automation/Integration)
```bash
# Get all settings
curl http://localhost:8000/api/settings
# Get settings by category
curl http://localhost:8000/api/settings?category=workers
# Update a setting
curl -X PUT http://localhost:8000/api/settings/worker_cpu_count \
-H "Content-Type: application/json" \
-d '{"value": "2"}'
# Bulk update
curl -X POST http://localhost:8000/api/settings/bulk-update \
-H "Content-Type: application/json" \
-d '{
"settings": {
"worker_cpu_count": "2",
"worker_gpu_count": "1"
}
}'
```
---
## Settings Categories
| Category | Description |
|----------|-------------|
| `general` | Operation mode, library paths, API server |
| `workers` | CPU/GPU worker configuration |
| `transcription` | Whisper model and transcription options |
| `subtitles` | Subtitle naming and formatting |
| `skip` | Skip conditions for files |
| `scanner` | Library scanner configuration |
| `bazarr` | Bazarr provider integration |
| `advanced` | Advanced options (path mapping, etc.) |
---
## All Settings Reference
### General Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `operation_mode` | string | `standalone` | Operation mode: `standalone`, `provider`, or `standalone,provider` |
| `library_paths` | list | `""` | Comma-separated library paths to scan |
| `api_host` | string | `0.0.0.0` | API server host |
| `api_port` | integer | `8000` | API server port |
| `debug` | boolean | `false` | Enable debug mode |
### Worker Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `worker_cpu_count` | integer | `0` | Number of CPU workers to start on boot |
| `worker_gpu_count` | integer | `0` | Number of GPU workers to start on boot |
| `concurrent_transcriptions` | integer | `2` | Maximum concurrent transcriptions |
| `worker_healthcheck_interval` | integer | `60` | Worker health check interval (seconds) |
| `worker_auto_restart` | boolean | `true` | Auto-restart failed workers |
| `clear_vram_on_complete` | boolean | `true` | Clear VRAM after job completion |
### Transcription Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `whisper_model` | string | `medium` | Whisper model: `tiny`, `base`, `small`, `medium`, `large-v3`, `large-v3-turbo` |
| `model_path` | string | `./models` | Path to store Whisper models |
| `transcribe_device` | string | `cpu` | Device: `cpu`, `cuda`, `gpu` |
| `cpu_compute_type` | string | `auto` | CPU compute type: `auto`, `int8`, `float32` |
| `gpu_compute_type` | string | `auto` | GPU compute type: `auto`, `float16`, `float32`, `int8_float16`, `int8` |
| `whisper_threads` | integer | `4` | Number of CPU threads for Whisper |
| `transcribe_or_translate` | string | `transcribe` | Default mode: `transcribe` or `translate` |
| `word_level_highlight` | boolean | `false` | Enable word-level highlighting |
| `detect_language_length` | integer | `30` | Seconds of audio for language detection |
| `detect_language_offset` | integer | `0` | Offset for language detection sample |
### Whisper Models
| Model | Size | Speed | Quality | VRAM |
|-------|------|-------|---------|------|
| `tiny` | 39M | Fastest | Basic | ~1GB |
| `base` | 74M | Very Fast | Fair | ~1GB |
| `small` | 244M | Fast | Good | ~2GB |
| `medium` | 769M | Medium | Great | ~5GB |
| `large-v3` | 1.5G | Slow | Excellent | ~10GB |
| `large-v3-turbo` | 809M | Fast | Excellent | ~6GB |
### Subtitle Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `subtitle_language_name` | string | `""` | Custom subtitle language name |
| `subtitle_language_naming_type` | string | `ISO_639_2_B` | Naming type: `ISO_639_1`, `ISO_639_2_T`, `ISO_639_2_B`, `NAME`, `NATIVE` |
| `custom_regroup` | string | `cm_sl=84_sl=42++++++1` | Custom regrouping algorithm |
**Language Naming Types:**
| Type | Example (Spanish) |
|------|-------------------|
| ISO_639_1 | `es` |
| ISO_639_2_T | `spa` |
| ISO_639_2_B | `spa` |
| NAME | `Spanish` |
| NATIVE | `Español` |
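Spanish happens to have identical T and B codes, so the distinction is invisible in the table above. Using the third-party `pycountry` library for illustration (the backend's own `core/language_code.py` utilities may differ), German shows where they diverge:
```python
import pycountry  # pip install pycountry

de = pycountry.languages.get(alpha_2="de")
print(de.alpha_2)   # 'de'  -> ISO_639_1
print(de.alpha_3)   # 'deu' -> ISO_639_2_T (terminological)
print(getattr(de, "bibliographic", de.alpha_3))  # 'ger' -> ISO_639_2_B (bibliographic)
print(de.name)      # 'German' -> NAME
```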
### Skip Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `skip_if_external_subtitles_exist` | boolean | `false` | Skip if any external subtitle exists |
| `skip_if_target_subtitles_exist` | boolean | `true` | Skip if target language subtitle exists |
| `skip_if_internal_subtitles_language` | string | `""` | Skip if internal subtitle in this language |
| `skip_subtitle_languages` | list | `""` | Pipe-separated language codes to skip |
| `skip_if_audio_languages` | list | `""` | Skip if audio track is in these languages |
| `skip_unknown_language` | boolean | `false` | Skip files with unknown audio language |
| `skip_only_subgen_subtitles` | boolean | `false` | Only skip SubGen-generated subtitles |
### Scanner Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `scanner_enabled` | boolean | `true` | Enable library scanner |
| `scanner_cron` | string | `0 2 * * *` | Cron expression for scheduled scans |
| `scanner_schedule_interval_minutes` | integer | `360` | Scan interval in minutes (6 hours) |
| `watcher_enabled` | boolean | `false` | Enable real-time file watcher |
| `auto_scan_enabled` | boolean | `false` | Enable automatic scheduled scanning |
### Bazarr Provider Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `bazarr_provider_enabled` | boolean | `false` | Enable Bazarr provider mode |
| `bazarr_url` | string | `http://bazarr:6767` | Bazarr server URL |
| `bazarr_api_key` | string | `""` | Bazarr API key (auto-generated) |
| `provider_timeout_seconds` | integer | `600` | Provider request timeout |
| `provider_callback_enabled` | boolean | `true` | Enable callback on completion |
| `provider_polling_interval` | integer | `30` | Polling interval for jobs |
### Advanced Settings
| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `force_detected_language_to` | string | `""` | Force detected language to specific code |
| `preferred_audio_languages` | list | `eng` | Pipe-separated preferred audio languages |
| `use_path_mapping` | boolean | `false` | Enable path mapping for network shares |
| `path_mapping_from` | string | `/tv` | Path mapping source |
| `path_mapping_to` | string | `/Volumes/TV` | Path mapping destination |
| `lrc_for_audio_files` | boolean | `true` | Generate LRC files for audio-only files |
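Path mapping is a simple prefix substitution. A hedged sketch of how it might be applied (the actual implementation may differ):
```python
def map_path(path: str, enabled: bool, src: str, dst: str) -> str:
    # Rewrite the configured prefix; leave non-matching paths untouched.
    if enabled and path.startswith(src):
        return dst + path[len(src):]
    return path

assert map_path("/tv/Show/S01E01.mkv", True, "/tv", "/Volumes/TV") == "/Volumes/TV/Show/S01E01.mkv"
assert map_path("/movies/film.mkv", True, "/tv", "/Volumes/TV") == "/movies/film.mkv"
```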
---
## Environment Variables
The **only** environment variable required is `DATABASE_URL` in the `.env` file:
```bash
# SQLite (default, good for single-user)
DATABASE_URL=sqlite:///./transcriptarr.db
# PostgreSQL (recommended for production)
DATABASE_URL=postgresql://user:password@localhost:5432/transcriptarr
# MariaDB/MySQL
DATABASE_URL=mariadb+pymysql://user:password@localhost:3306/transcriptarr
```
**All other configuration** is stored in the database and managed through:
- Setup Wizard (first run)
- Web UI Settings page
- Settings API endpoints
This design ensures:
- No `.env` file bloat
- Runtime configuration changes without restart
- Centralized configuration management
- Easy backup (configuration is in the database)
---
## Setup Wizard
### Standalone Mode
For independent operation with local library scanning.
**Configuration Flow:**
1. Select library paths (e.g., `/media/anime`, `/media/movies`)
2. Create initial scan rules (e.g., "Japanese audio → Spanish subtitles")
3. Configure workers (CPU count, GPU count)
4. Set scanner interval (default: 6 hours)
**API Endpoint:** `POST /api/setup/standalone`
```json
{
"library_paths": ["/media/anime", "/media/movies"],
"scan_rules": [
{
"name": "Japanese to Spanish",
"audio_language_is": "jpn",
"missing_external_subtitle_lang": "spa",
"target_language": "spa",
"action_type": "transcribe"
}
],
"worker_config": {
"count": 1,
"type": "cpu"
},
"scanner_config": {
"interval_minutes": 360
}
}
```
### Bazarr Slave Mode
For integration with Bazarr as a subtitle provider.
**Configuration Flow:**
1. Select Bazarr mode
2. System auto-generates API key
3. Displays connection info for Bazarr configuration
**API Endpoint:** `POST /api/setup/bazarr-slave`
**Response:**
```json
{
"success": true,
"message": "Bazarr slave mode configured successfully",
"bazarr_info": {
"mode": "bazarr_slave",
"host": "127.0.0.1",
"port": 8000,
"api_key": "generated_api_key_here",
"provider_url": "http://127.0.0.1:8000"
}
}
```
---
## API Configuration
### Get All Settings
```bash
curl http://localhost:8000/api/settings
```
### Get by Category
```bash
curl "http://localhost:8000/api/settings?category=workers"
```
### Get Single Setting
```bash
curl http://localhost:8000/api/settings/worker_cpu_count
```
### Update Setting
```bash
curl -X PUT http://localhost:8000/api/settings/worker_cpu_count \
-H "Content-Type: application/json" \
-d '{"value": "2"}'
```
### Bulk Update
```bash
curl -X POST http://localhost:8000/api/settings/bulk-update \
-H "Content-Type: application/json" \
-d '{
"settings": {
"worker_cpu_count": "2",
"worker_gpu_count": "1",
"scanner_enabled": "true"
}
}'
```
### Create Custom Setting
```bash
curl -X POST http://localhost:8000/api/settings \
-H "Content-Type: application/json" \
-d '{
"key": "my_custom_setting",
"value": "custom_value",
"description": "My custom setting",
"category": "advanced",
"value_type": "string"
}'
```
### Delete Setting
```bash
curl -X DELETE http://localhost:8000/api/settings/my_custom_setting
```
### Initialize Defaults
```bash
curl -X POST http://localhost:8000/api/settings/init-defaults
```
---
## Python Usage
```python
from backend.core.settings_service import settings_service
# Get setting with default
cpu_count = settings_service.get("worker_cpu_count", default=1)
# Set setting
settings_service.set("worker_cpu_count", 2)
# Bulk update
settings_service.bulk_update({
"worker_cpu_count": "2",
"scanner_enabled": "true"
})
# Get all settings in category
worker_settings = settings_service.get_by_category("workers")
# Initialize defaults (safe to call multiple times)
settings_service.init_default_settings()
```

docs/FRONTEND.md (new file, 666 lines)

@@ -0,0 +1,666 @@
# TranscriptorIO Frontend
Technical documentation for the Vue 3 frontend application.
## Table of Contents
- [Overview](#overview)
- [Technology Stack](#technology-stack)
- [Directory Structure](#directory-structure)
- [Development Setup](#development-setup)
- [Views](#views)
- [Components](#components)
- [State Management](#state-management)
- [API Service](#api-service)
- [Routing](#routing)
- [Styling](#styling)
- [Build and Deployment](#build-and-deployment)
- [TypeScript Interfaces](#typescript-interfaces)
---
## Overview
The TranscriptorIO frontend is a Single Page Application (SPA) built with Vue 3, featuring:
- **6 Complete Views**: Dashboard, Queue, Scanner, Rules, Workers, Settings
- **Real-time Updates**: Polling-based status updates
- **Dark Theme**: Tdarr-inspired dark UI
- **Type Safety**: Full TypeScript support
- **State Management**: Pinia stores for shared state
---
## Technology Stack
| Technology | Version | Purpose |
|------------|---------|---------|
| Vue.js | 3.4+ | UI Framework |
| Vue Router | 4.2+ | Client-side routing |
| Pinia | 2.1+ | State management |
| Axios | 1.6+ | HTTP client |
| TypeScript | 5.3+ | Type safety |
| Vite | 5.0+ | Build tool / dev server |
---
## Directory Structure
```
frontend/
├── public/ # Static assets (favicon, etc.)
├── src/
│ ├── main.ts # Application entry point
│ ├── App.vue # Root component + navigation
│ │
│ ├── views/ # Page components (routed)
│ │ ├── DashboardView.vue # System overview + resources
│ │ ├── QueueView.vue # Job management
│ │ ├── ScannerView.vue # Scanner control
│ │ ├── RulesView.vue # Scan rules CRUD
│ │ ├── WorkersView.vue # Worker pool management
│ │ └── SettingsView.vue # Settings management
│ │
│ ├── components/ # Reusable components
│ │ ├── ConnectionWarning.vue # Backend connection status
│ │ ├── PathBrowser.vue # Filesystem browser modal
│ │ └── SetupWizard.vue # First-run setup wizard
│ │
│ ├── stores/ # Pinia state stores
│ │ ├── config.ts # Configuration store
│ │ ├── system.ts # System status store
│ │ ├── workers.ts # Workers store
│ │ └── jobs.ts # Jobs store
│ │
│ ├── services/
│ │ └── api.ts # Axios API client
│ │
│ ├── router/
│ │ └── index.ts # Vue Router configuration
│ │
│ ├── types/
│ │ └── api.ts # TypeScript interfaces
│ │
│ └── assets/
│ └── css/
│ └── main.css # Global styles (dark theme)
├── index.html # HTML template
├── vite.config.ts # Vite configuration
├── tsconfig.json # TypeScript configuration
└── package.json # Dependencies
```
---
## Development Setup
### Prerequisites
- Node.js 18+ and npm
- Backend server running on port 8000
### Installation
```bash
cd frontend
# Install dependencies
npm install
# Start development server (with proxy to backend)
npm run dev
```
### Development URLs
| URL | Description |
|-----|-------------|
| http://localhost:3000 | Frontend dev server |
| http://localhost:8000 | Backend API |
| http://localhost:8000/docs | Swagger API docs |
### Scripts
```bash
npm run dev # Start dev server with HMR
npm run build # Build for production
npm run preview # Preview production build
npm run lint # Run ESLint
```
---
## Views
### DashboardView
**Path**: `/`
System overview with real-time resource monitoring.
**Features**:
- System status (running/stopped)
- CPU usage gauge
- RAM usage gauge
- GPU usage gauges (per device)
- Recent jobs list
- Worker pool summary
- Scanner status
**Data Sources**:
- `GET /api/status`
- `GET /api/system/resources`
- `GET /api/jobs?page_size=10`
### QueueView
**Path**: `/queue`
Job queue management with filtering and pagination.
**Features**:
- Job list with status icons
- Status filter (All/Queued/Processing/Completed/Failed)
- Pagination controls
- Retry failed jobs
- Cancel queued/processing jobs
- Clear completed jobs
- Job progress display
- Processing time display
**Data Sources**:
- `GET /api/jobs`
- `GET /api/jobs/stats`
- `POST /api/jobs/{id}/retry`
- `DELETE /api/jobs/{id}`
- `POST /api/jobs/queue/clear`
### ScannerView
**Path**: `/scanner`
Library scanner control and configuration.
**Features**:
- Scanner status display
- Start/stop scheduler
- Start/stop file watcher
- Manual scan trigger
- Scan results display
- Next scan time
- Total files scanned counter
**Data Sources**:
- `GET /api/scanner/status`
- `POST /api/scanner/scan`
- `POST /api/scanner/scheduler/start`
- `POST /api/scanner/scheduler/stop`
- `POST /api/scanner/watcher/start`
- `POST /api/scanner/watcher/stop`
### RulesView
**Path**: `/rules`
Scan rules CRUD management.
**Features**:
- Rules list with priority ordering
- Create new rule (modal)
- Edit existing rule (modal)
- Delete rule (with confirmation)
- Toggle rule enabled/disabled
- Condition configuration
- Action configuration
**Data Sources**:
- `GET /api/scan-rules`
- `POST /api/scan-rules`
- `PUT /api/scan-rules/{id}`
- `DELETE /api/scan-rules/{id}`
- `POST /api/scan-rules/{id}/toggle`
### WorkersView
**Path**: `/workers`
Worker pool management.
**Features**:
- Worker list with status
- Add CPU worker
- Add GPU worker (with device selection)
- Remove worker
- Start/stop pool
- Worker statistics
- Current job display per worker
- Progress and ETA display
**Data Sources**:
- `GET /api/workers`
- `GET /api/workers/stats`
- `POST /api/workers`
- `DELETE /api/workers/{id}`
- `POST /api/workers/pool/start`
- `POST /api/workers/pool/stop`
### SettingsView
**Path**: `/settings`
Database-backed settings management.
**Features**:
- Settings grouped by category
- Category tabs (General, Workers, Transcription, Scanner, Bazarr)
- Edit settings in-place
- Save changes button
- Change detection (unsaved changes warning)
- Setting descriptions
**Data Sources**:
- `GET /api/settings`
- `PUT /api/settings/{key}`
- `POST /api/settings/bulk-update`
---
## Components
### ConnectionWarning
Displays warning banner when backend is unreachable.
**Props**: None
**State**: Uses `systemStore.isConnected`
### PathBrowser
Modal component for browsing filesystem paths.
**Props**:
- `show: boolean` - Show/hide modal
- `initialPath: string` - Starting path
**Emits**:
- `select(path: string)` - Path selected
- `close()` - Modal closed
**API Calls**:
- `GET /api/filesystem/browse?path={path}`
- `GET /api/filesystem/common-paths`
### SetupWizard
First-run setup wizard component.
**Props**: None
**Features**:
- Mode selection (Standalone/Bazarr)
- Library path configuration
- Scan rule creation
- Worker configuration
- Scanner interval setting
**API Calls**:
- `GET /api/setup/status`
- `POST /api/setup/standalone`
- `POST /api/setup/bazarr-slave`
- `POST /api/setup/skip`
---
## State Management
### Pinia Stores
#### systemStore (`stores/system.ts`)
Global system state.
```typescript
interface SystemState {
isConnected: boolean
status: SystemStatus | null
resources: SystemResources | null
loading: boolean
error: string | null
}
// Actions
fetchStatus() // Fetch /api/status
fetchResources() // Fetch /api/system/resources
startPolling() // Start auto-refresh
stopPolling() // Stop auto-refresh
```
#### workersStore (`stores/workers.ts`)
Worker pool state.
```typescript
interface WorkersState {
workers: Worker[]
stats: WorkerStats | null
loading: boolean
error: string | null
}
// Actions
fetchWorkers() // Fetch all workers
fetchStats() // Fetch pool stats
addWorker(type, deviceId?) // Add worker
removeWorker(id) // Remove worker
startPool(cpuCount, gpuCount) // Start pool
stopPool() // Stop pool
```
#### jobsStore (`stores/jobs.ts`)
Job queue state.
```typescript
interface JobsState {
jobs: Job[]
stats: QueueStats | null
total: number
page: number
pageSize: number
statusFilter: string | null
loading: boolean
error: string | null
}
// Actions
fetchJobs() // Fetch with current filters
fetchStats() // Fetch queue stats
retryJob(id) // Retry failed job
cancelJob(id) // Cancel job
clearCompleted() // Clear completed jobs
setStatusFilter(status) // Update filter
setPage(page) // Change page
```
#### configStore (`stores/config.ts`)
Settings configuration state.
```typescript
interface ConfigState {
settings: Setting[]
loading: boolean
error: string | null
pendingChanges: Record<string, string>
}
// Actions
fetchSettings(category?) // Fetch settings
updateSetting(key, value) // Queue update
saveChanges() // Save all pending
discardChanges() // Discard pending
```
---
## API Service
### Configuration (`services/api.ts`)
```typescript
import axios from 'axios'
const api = axios.create({
baseURL: '/api',
timeout: 30000,
headers: {
'Content-Type': 'application/json'
}
})
// Response interceptor for error handling
api.interceptors.response.use(
response => response,
error => {
console.error('API Error:', error)
return Promise.reject(error)
}
)
export default api
```
### Usage Example
```typescript
import api from '@/services/api'
// GET request
const response = await api.get('/jobs', {
params: { status_filter: 'queued', page: 1 }
})
// POST request
const job = await api.post('/jobs', {
file_path: '/media/video.mkv',
target_lang: 'spa'
})
// PUT request
await api.put('/settings/worker_cpu_count', {
value: '2'
})
// DELETE request
await api.delete(`/jobs/${jobId}`)
```
---
## Routing
### Route Configuration
```typescript
const routes = [
{ path: '/', name: 'Dashboard', component: DashboardView },
{ path: '/workers', name: 'Workers', component: WorkersView },
{ path: '/queue', name: 'Queue', component: QueueView },
{ path: '/scanner', name: 'Scanner', component: ScannerView },
{ path: '/rules', name: 'Rules', component: RulesView },
{ path: '/settings', name: 'Settings', component: SettingsView }
]
```
### Navigation
Navigation is handled in `App.vue` with a sidebar menu.
```vue
<nav class="sidebar">
<router-link to="/">Dashboard</router-link>
<router-link to="/workers">Workers</router-link>
<router-link to="/queue">Queue</router-link>
<router-link to="/scanner">Scanner</router-link>
<router-link to="/rules">Rules</router-link>
<router-link to="/settings">Settings</router-link>
</nav>
<main class="content">
<router-view />
</main>
```
---
## Styling
### Dark Theme
The application uses a Tdarr-inspired dark theme defined in `assets/css/main.css`.
**Color Palette**:
| Variable | Value | Usage |
|----------|-------|-------|
| --bg-primary | #1a1a2e | Main background |
| --bg-secondary | #16213e | Card background |
| --bg-tertiary | #0f3460 | Hover states |
| --text-primary | #eaeaea | Primary text |
| --text-secondary | #a0a0a0 | Secondary text |
| --accent-primary | #e94560 | Buttons, links |
| --accent-success | #4ade80 | Success states |
| --accent-warning | #fbbf24 | Warning states |
| --accent-error | #ef4444 | Error states |
### Component Styling
Components use scoped CSS with CSS variables:
```vue
<style scoped>
.card {
background: var(--bg-secondary);
border-radius: 8px;
padding: 1.5rem;
}
.btn-primary {
background: var(--accent-primary);
color: white;
border: none;
padding: 0.5rem 1rem;
border-radius: 4px;
cursor: pointer;
}
.btn-primary:hover {
opacity: 0.9;
}
</style>
```
---
## Build and Deployment
### Production Build
```bash
cd frontend
npm run build
```
This creates a `dist/` folder with:
- `index.html` - Entry HTML
- `assets/` - JS, CSS bundles (hashed filenames)
### Deployment Options
#### Option 1: Served by Backend (Recommended)
The FastAPI backend automatically serves the frontend from `frontend/dist/`:
```python
# backend/app.py
frontend_path = Path(__file__).parent.parent / "frontend" / "dist"
if frontend_path.exists():
app.mount("/assets", StaticFiles(directory=str(frontend_path / "assets")))
@app.get("/{full_path:path}")
async def serve_frontend(full_path: str = ""):
return FileResponse(str(frontend_path / "index.html"))
```
**Access**: http://localhost:8000
#### Option 2: Nginx Reverse Proxy
```nginx
server {
listen 80;
server_name transcriptorio.local;
# Frontend
location / {
root /var/www/transcriptorio/frontend/dist;
try_files $uri $uri/ /index.html;
}
# Backend API
location /api {
proxy_pass http://localhost:8000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
}
}
```
#### Option 3: Docker
```dockerfile
# Build frontend
FROM node:18-alpine AS frontend-builder
WORKDIR /app/frontend
COPY frontend/package*.json ./
RUN npm ci
COPY frontend/ ./
RUN npm run build
# Final image
FROM python:3.12-slim
COPY --from=frontend-builder /app/frontend/dist /app/frontend/dist
# ... rest of backend setup
```
---
## TypeScript Interfaces
### Key Types (`types/api.ts`)
```typescript
// Job
interface Job {
id: string
file_path: string
file_name: string
status: 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled'
priority: number
progress: number
// ... more fields
}
// Worker
interface Worker {
worker_id: string
worker_type: 'cpu' | 'gpu'
device_id: number | null
status: 'idle' | 'busy' | 'stopped' | 'error'
current_job_id: string | null
jobs_completed: number
jobs_failed: number
}
// Setting
interface Setting {
id: number
key: string
value: string | null
description: string | null
category: string | null
value_type: string | null
}
// ScanRule
interface ScanRule {
id: number
name: string
enabled: boolean
priority: number
conditions: ScanRuleConditions
action: ScanRuleAction
}
```