# TranscriptorIO Backend Architecture

Technical documentation of the backend architecture, components, and data flow.
## Table of Contents

- [Overview](#overview)
- [Directory Structure](#directory-structure)
- [Core Components](#core-components)
- [Data Flow](#data-flow)
- [Database Schema](#database-schema)
- [Transcription vs Translation](#transcription-vs-translation)
- [Worker Architecture](#worker-architecture)
- [Queue System](#queue-system)
- [Scanner System](#scanner-system)
- [Settings System](#settings-system)
- [Graceful Degradation](#graceful-degradation)
- [Thread Safety](#thread-safety)
- [Important Patterns](#important-patterns)

---
## Overview

TranscriptorIO is built with a modular architecture consisting of:

- **FastAPI Server**: REST API with 45+ endpoints
- **Worker Pool**: Multiprocessing-based transcription workers (CPU/GPU)
- **Queue Manager**: Persistent job queue with priority support
- **Library Scanner**: Rule-based file scanning with scheduler and watcher
- **Settings Service**: Database-backed configuration system

```
┌──────────────────────────────────────────────────────────┐
│                      FastAPI Server                      │
│  ┌──────────────────────────────────────────────────┐    │
│  │             REST API (45+ endpoints)             │    │
│  │  /api/workers | /api/jobs | /api/settings        │    │
│  │  /api/scanner | /api/system | /api/setup         │    │
│  └──────────────────────────────────────────────────┘    │
└──────────────────┬───────────────────────────────────────┘
                   │
    ┌──────────────┼──────────────┬─────────────┐
    │              │              │             │
    ▼              ▼              ▼             ▼
┌────────┐   ┌──────────┐   ┌─────────┐   ┌──────────┐
│ Worker │   │  Queue   │   │ Scanner │   │ Database │
│  Pool  │◄──┤ Manager  │◄──┤ Engine  │   │ SQLite/  │
│ CPU/GPU│   │ Priority │   │ Rules + │   │ Postgres │
└────────┘   │  Queue   │   │ Watcher │   └──────────┘
             └──────────┘   └─────────┘
```

---
## Directory Structure

```
backend/
├── app.py                  # FastAPI application + lifespan
├── cli.py                  # CLI commands (server, db, worker, scan, setup)
├── config.py               # Pydantic Settings (from .env)
├── setup_wizard.py         # Interactive first-run setup
│
├── core/
│   ├── database.py             # SQLAlchemy setup + session management
│   ├── models.py               # Job model + enums
│   ├── language_code.py        # ISO 639 language code utilities
│   ├── settings_model.py       # SystemSettings model (database-backed)
│   ├── settings_service.py     # Settings service with caching
│   ├── system_monitor.py       # CPU/RAM/GPU/VRAM monitoring
│   ├── queue_manager.py        # Persistent queue with priority
│   ├── worker.py               # Individual worker (Process)
│   └── worker_pool.py          # Worker pool orchestrator
│
├── transcription/
│   ├── __init__.py             # Exports + WHISPER_AVAILABLE flag
│   ├── transcriber.py          # WhisperTranscriber wrapper
│   ├── translator.py           # Google Translate integration
│   └── audio_utils.py          # ffmpeg/ffprobe utilities
│
├── scanning/
│   ├── __init__.py             # Exports (NO library_scanner import!)
│   ├── models.py               # ScanRule model
│   ├── file_analyzer.py        # ffprobe file analysis
│   ├── language_detector.py    # Audio language detection
│   ├── detected_languages.py   # Language mappings
│   └── library_scanner.py      # Scanner + scheduler + watcher
│
└── api/
    ├── __init__.py             # Router exports
    ├── workers.py              # Worker management endpoints
    ├── jobs.py                 # Job queue endpoints
    ├── scan_rules.py           # Scan rules CRUD
    ├── scanner.py              # Scanner control endpoints
    ├── settings.py             # Settings CRUD endpoints
    ├── system.py               # System resources endpoints
    ├── filesystem.py           # Filesystem browser endpoints
    └── setup_wizard.py         # Setup wizard endpoints
```

---
## Core Components

### 1. WorkerPool (`core/worker_pool.py`)

Orchestrates CPU/GPU workers as separate processes.

**Key Features:**

- Dynamically add/remove workers at runtime
- Health monitoring with auto-restart
- Thread-safe multiprocessing
- Each worker is an isolated Process

```python
from backend.core.worker_pool import worker_pool
from backend.core.worker import WorkerType

# Add GPU worker on device 0
worker_id = worker_pool.add_worker(WorkerType.GPU, device_id=0)

# Add CPU worker
worker_id = worker_pool.add_worker(WorkerType.CPU)

# Get pool stats
stats = worker_pool.get_pool_stats()
```
### 2. QueueManager (`core/queue_manager.py`)

Persistent SQLite/PostgreSQL queue with priority support.

**Key Features:**

- Job deduplication (no duplicate `file_path`)
- Row-level locking with `skip_locked=True`
- Priority-based ordering (higher first)
- FIFO within the same priority (by `created_at`)
- Automatic retry of failed jobs

```python
from backend.core.queue_manager import queue_manager
from backend.core.models import QualityPreset

job = queue_manager.add_job(
    file_path="/media/anime.mkv",
    file_name="anime.mkv",
    source_lang="jpn",
    target_lang="spa",
    quality_preset=QualityPreset.FAST,
    priority=5
)
```
### 3. LibraryScanner (`scanning/library_scanner.py`)

Rule-based file scanning system.

**Three Scan Modes:**

- **Manual**: One-time scan via API or CLI
- **Scheduled**: Periodic scanning with APScheduler
- **Real-time**: File watcher using the watchdog library

```python
from backend.scanning.library_scanner import library_scanner

# Manual scan
result = library_scanner.scan_paths(["/media/anime"], recursive=True)

# Start scheduler (every 6 hours)
library_scanner.start_scheduler(interval_minutes=360)

# Start file watcher
library_scanner.start_file_watcher(paths=["/media/anime"], recursive=True)
```
### 4. WhisperTranscriber (`transcription/transcriber.py`)

Wrapper around stable-whisper and faster-whisper.

**Key Features:**

- GPU/CPU support with automatic device detection
- VRAM management and cleanup
- Graceful degradation (the module imports even when Whisper is not installed)

```python
from backend.transcription.transcriber import WhisperTranscriber

transcriber = WhisperTranscriber(
    model_name="large-v3",
    device="cuda",
    compute_type="float16"
)

result = transcriber.transcribe_file(
    file_path="/media/episode.mkv",
    language="jpn",
    task="translate"  # translate to English
)

result.to_srt("episode.eng.srt")
```
### 5. SettingsService (`core/settings_service.py`)

Database-backed configuration with caching.

```python
from backend.core.settings_service import settings_service

# Get setting
value = settings_service.get("worker_cpu_count", default=1)

# Set setting
settings_service.set("worker_cpu_count", "2")

# Bulk update
settings_service.bulk_update({
    "worker_cpu_count": "2",
    "scanner_enabled": "true"
})
```

---
## Data Flow

```
1. LibraryScanner detects file (manual/scheduled/watcher)
        ↓
2. FileAnalyzer analyzes with ffprobe
   - Audio tracks (codec, language, channels)
   - Embedded subtitles
   - External .srt files
   - Duration, video info
        ↓
3. Rules Engine evaluates against ScanRules (priority order)
   - Checks all conditions (audio language, missing subs, etc.)
   - First matching rule wins
        ↓
4. If match → QueueManager.add_job()
   - Deduplication check (no duplicate file_path)
   - Assigns priority based on rule
        ↓
5. Worker pulls job from queue
   - Uses with_for_update(skip_locked=True)
   - FIFO within same priority
        ↓
6. WhisperTranscriber processes with model
   - Stage 1: Audio → English (Whisper translate)
   - Stage 2: English → Target (Google Translate, if needed)
        ↓
7. Generate output SRT file(s)
   - .eng.srt (always)
   - .{target}.srt (if translate mode)
        ↓
8. Job marked completed ✓
```

---
## Database Schema

### Job Table (`jobs`)

```sql
id                          VARCHAR PRIMARY KEY
file_path                   VARCHAR UNIQUE   -- Ensures no duplicates
file_name                   VARCHAR
status                      VARCHAR          -- queued/processing/completed/failed/cancelled
priority                    INTEGER
source_lang                 VARCHAR
target_lang                 VARCHAR
quality_preset              VARCHAR          -- fast/balanced/best
transcribe_or_translate     VARCHAR          -- transcribe/translate
progress                    FLOAT
current_stage               VARCHAR
eta_seconds                 INTEGER
created_at                  DATETIME
started_at                  DATETIME
completed_at                DATETIME
output_path                 VARCHAR
srt_content                 TEXT
segments_count              INTEGER
error                       TEXT
retry_count                 INTEGER
max_retries                 INTEGER
worker_id                   VARCHAR
vram_used_mb                INTEGER
processing_time_seconds     FLOAT
```

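The table is mapped by the `Job` model in `core/models.py`. A minimal sketch of how the central columns might be declared (column names follow the schema above; defaults and indexes here are assumptions, and the real model also defines the status/preset enums and helpers such as `to_dict()`):

```python
# Minimal sketch of the Job mapping, assuming a standard SQLAlchemy
# declarative model; the real class in core/models.py carries the full
# column list from the schema above plus enums and helpers like to_dict().
from datetime import datetime

from sqlalchemy import Column, DateTime, Float, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Job(Base):
    __tablename__ = "jobs"

    id = Column(String, primary_key=True)
    file_path = Column(String, unique=True, nullable=False)  # dedup key
    file_name = Column(String, nullable=False)
    status = Column(String, default="queued", index=True)
    priority = Column(Integer, default=0)
    source_lang = Column(String)
    target_lang = Column(String)
    quality_preset = Column(String, default="balanced")
    progress = Column(Float, default=0.0)
    error = Column(Text)
    retry_count = Column(Integer, default=0)
    max_retries = Column(Integer, default=3)
    created_at = Column(DateTime, default=datetime.utcnow)
```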
### ScanRule Table (`scan_rules`)

```sql
id                              INTEGER PRIMARY KEY
name                            VARCHAR UNIQUE
enabled                         BOOLEAN
priority                        INTEGER     -- Higher = evaluated first

-- Conditions (all must match):
audio_language_is               VARCHAR     -- ISO 639-2
audio_language_not              VARCHAR     -- Comma-separated
audio_track_count_min           INTEGER
has_embedded_subtitle_lang      VARCHAR
missing_embedded_subtitle_lang  VARCHAR
missing_external_subtitle_lang  VARCHAR
file_extension                  VARCHAR     -- Comma-separated

-- Action:
action_type                     VARCHAR     -- transcribe/translate
target_language                 VARCHAR
quality_preset                  VARCHAR
job_priority                    INTEGER

created_at                      DATETIME
updated_at                      DATETIME
```

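As an illustration, a rule that queues translation jobs for Japanese-audio files lacking Spanish subtitles could be stored like this. This is a hedged sketch: the fields are taken from the schema above, but passing them as constructor keywords is an assumption about the `ScanRule` model, and rules are normally managed through the scan-rules API.

```python
# Hypothetical rule row, assuming ScanRule accepts its column names as
# keyword arguments (typical for SQLAlchemy models); values follow the
# schema above.
from backend.core.database import database
from backend.scanning.models import ScanRule

rule = ScanRule(
    name="Japanese audio, no Spanish subs",
    enabled=True,
    priority=10,                           # evaluated before lower-priority rules
    audio_language_is="jpn",
    missing_external_subtitle_lang="spa",
    file_extension="mkv,mp4",
    action_type="translate",
    target_language="spa",
    quality_preset="fast",
    job_priority=5,
)

with database.get_session() as session:
    session.add(rule)
```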
### SystemSettings Table (`system_settings`)

```sql
id          INTEGER PRIMARY KEY
key         VARCHAR UNIQUE
value       TEXT
description TEXT
category    VARCHAR     -- general/workers/transcription/scanner/bazarr
value_type  VARCHAR     -- string/integer/boolean/list
created_at  DATETIME
updated_at  DATETIME
```

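Values are stored as text, so `value_type` drives how they are deserialized. A hedged sketch of that coercion (illustrative only; comma-separated storage for `list` values is an assumption, and the real logic lives in `core/settings_service.py`):

```python
# Illustrative coercion of stored text values based on value_type.
# Comma-separated storage for "list" values is an assumption.
def coerce(value: str, value_type: str):
    if value_type == "integer":
        return int(value)
    if value_type == "boolean":
        return value.strip().lower() in ("true", "1", "yes")
    if value_type == "list":
        return [item.strip() for item in value.split(",") if item.strip()]
    return value  # "string" and unknown types pass through unchanged


assert coerce("2", "integer") == 2
assert coerce("true", "boolean") is True
assert coerce("/media/anime,/media/tv", "list") == ["/media/anime", "/media/tv"]
```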
---

## Transcription vs Translation

### Understanding the Two Modes

**Mode 1: `transcribe`** (Audio → English subtitles)

```
Audio (any language) → Whisper (task='translate') → English SRT
Example: Japanese audio → anime.eng.srt
```

**Mode 2: `translate`** (Audio → English → Target language)

```
Audio (any language) → Whisper (task='translate') → English SRT
                     → Google Translate → Target language SRT
Example: Japanese audio → anime.eng.srt + anime.spa.srt
```

### Why Two Stages?

**Whisper Limitation**: Whisper can only translate *to* English, not between arbitrary language pairs.

**Solution**: A two-stage process (sketched below):

1. **Stage 1 (Always)**: Whisper converts audio to English using `task='translate'`
2. **Stage 2 (Translate mode only)**: Google Translate converts the English subtitles to the target language

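Put together, the pipeline looks roughly like this. Stage 1 uses the `WhisperTranscriber` API shown earlier; the Stage 2 call is illustrative only, since `translate_srt` is an assumed helper name (see `transcription/translator.py` for the actual interface):

```python
# Sketch of the two-stage flow. Stage 1 mirrors the WhisperTranscriber
# usage above; translate_srt() in Stage 2 is an assumed helper name.
from backend.transcription.transcriber import WhisperTranscriber

transcriber = WhisperTranscriber(model_name="large-v3", device="cuda", compute_type="float16")

# Stage 1 (always): audio -> English subtitles via Whisper's translate task
result = transcriber.transcribe_file(
    file_path="/media/anime.mkv",
    language="jpn",
    task="translate",
)
result.to_srt("/media/anime.eng.srt")

# Stage 2 (translate mode only): English -> target language via Google Translate
mode, target_lang = "translate", "spa"
if mode == "translate" and target_lang != "eng":
    from backend.transcription.translator import translate_srt  # assumed name

    translate_srt("/media/anime.eng.srt", "/media/anime.spa.srt", target_lang=target_lang)
```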
### Output Files

| Mode       | Target | Output Files            |
|------------|--------|-------------------------|
| transcribe | spa    | `.eng.srt` only         |
| translate  | spa    | `.eng.srt` + `.spa.srt` |
| translate  | fra    | `.eng.srt` + `.fra.srt` |

---
## Worker Architecture

### Worker Types

| Type | Description            | Device   |
|------|------------------------|----------|
| CPU  | Uses CPU for inference | None     |
| GPU  | Uses NVIDIA GPU        | `cuda:N` |

### Worker Lifecycle

```
           ┌─────────────┐
           │   CREATED   │
           └──────┬──────┘
                  │ start()
                  ▼
           ┌─────────────┐
┌──────────│    IDLE     │◄─────────┐
│          └──────┬──────┘          │
│                 │ get_job()       │ job_done()
│                 ▼                 │
│          ┌─────────────┐          │
│          │    BUSY     │──────────┘
│          └──────┬──────┘
│                 │ error
│                 ▼
│          ┌─────────────┐
└──────────│    ERROR    │
           └─────────────┘
```

### Process Isolation

Each worker runs in a separate Python process (see the sketch after this list):

- Memory isolation (VRAM per GPU worker)
- Crash isolation (one worker crash doesn't affect others)
- Independent model loading

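A simplified sketch of the pattern, not the actual `Worker` class from `core/worker.py`: heavy imports and model loading happen inside the child process, and `get_next_job()` is an assumed method name for pulling work from the queue.

```python
# Simplified illustration of per-process isolation; the real worker lives
# in core/worker.py. queue_manager.get_next_job() is an assumed name.
import multiprocessing as mp
import time


def worker_loop(worker_id: str, device: str) -> None:
    # Imports inside the child: each process owns its own model instance
    # and, for GPU workers, its own VRAM allocation.
    from backend.core.queue_manager import queue_manager
    from backend.transcription.transcriber import WhisperTranscriber

    transcriber = WhisperTranscriber(model_name="large-v3", device=device)
    while True:
        job = queue_manager.get_next_job(worker_id)  # assumed method name
        if job is None:
            time.sleep(5)
            continue
        transcriber.transcribe_file(job.file_path, language=job.source_lang)


if __name__ == "__main__":
    # The pool keeps one Process per worker; a crash in one process cannot
    # corrupt the other workers or the FastAPI parent.
    proc = mp.Process(target=worker_loop, args=("worker-1", "cuda"))
    proc.start()
```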
---

## Queue System

### Priority System

```python
# Priority values
BAZARR_REQUEST = base_priority + 10   # Highest (external request)
MANUAL_REQUEST = base_priority + 5    # High (user-initiated)
AUTO_SCAN      = base_priority        # Normal (scanner-generated)
```

### Job Deduplication

Jobs are deduplicated by `file_path` (see the sketch after this list):

- If a job with the same `file_path` already exists, the new job is rejected
- `add_job()` returns `None` in that case
- Prevents duplicate processing

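Callers should therefore treat a `None` return as "already queued" rather than as an error. A minimal sketch using the documented `add_job()` call:

```python
# add_job() returns None when a job with the same file_path already exists.
from backend.core.queue_manager import queue_manager
from backend.core.models import QualityPreset

job = queue_manager.add_job(
    file_path="/media/anime.mkv",
    file_name="anime.mkv",
    source_lang="jpn",
    target_lang="spa",
    quality_preset=QualityPreset.FAST,
    priority=5,
)
if job is None:
    print("Skipped: this file_path is already queued")
else:
    print(f"Queued job {job.id}")
```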
### Concurrency Safety

```python
# Row-level locking prevents race conditions
job = session.query(Job).filter(
    Job.status == JobStatus.QUEUED
).with_for_update(skip_locked=True).first()
```

---
## Scanner System

### Scan Rule Evaluation

Rules are evaluated in priority order (highest first):

```python
# Pseudo-code for rule matching
for rule in rules.order_by(priority.desc()):
    if rule.enabled and matches_all_conditions(file, rule):
        create_job(file, rule.action)
        break  # First match wins
```

### Conditions

All conditions set on a rule must match (AND logic); a sketch of the matching logic follows the table:

| Condition                      | Match if…                                                 |
|--------------------------------|-----------------------------------------------------------|
| audio_language_is              | Primary audio track language equals the value             |
| audio_language_not             | Primary audio track language is NOT in the list           |
| audio_track_count_min          | Number of audio tracks >= value                           |
| has_embedded_subtitle_lang     | File has an embedded subtitle in the language             |
| missing_embedded_subtitle_lang | File does NOT have an embedded subtitle in the language   |
| missing_external_subtitle_lang | File does NOT have an external .srt in the language       |
| file_extension                 | File extension is in the comma-separated list             |

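A hedged sketch of that AND logic (illustrative only: the attribute names on the analysis object are assumptions; see `scanning/file_analyzer.py` for the real `FileAnalysis` fields):

```python
# Illustrative AND-matching of one file against one rule. The attribute
# names on `analysis` (audio_languages, embedded_subtitle_langs,
# external_subtitle_langs, extension) are assumed for illustration.
def matches_all_conditions(analysis, rule) -> bool:
    primary_audio = analysis.audio_languages[0]
    if rule.audio_language_is and primary_audio != rule.audio_language_is:
        return False
    if rule.audio_language_not:
        if primary_audio in {lang.strip() for lang in rule.audio_language_not.split(",")}:
            return False
    if rule.audio_track_count_min and len(analysis.audio_languages) < rule.audio_track_count_min:
        return False
    if rule.missing_embedded_subtitle_lang:
        if rule.missing_embedded_subtitle_lang in analysis.embedded_subtitle_langs:
            return False  # subtitle already present, so the "missing" condition fails
    if rule.missing_external_subtitle_lang:
        if rule.missing_external_subtitle_lang in analysis.external_subtitle_langs:
            return False
    if rule.file_extension and analysis.extension not in rule.file_extension.split(","):
        return False
    # has_embedded_subtitle_lang follows the same pattern, without the negation.
    return True
```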
---

## Settings System

### Categories

| Category      | Settings                                               |
|---------------|--------------------------------------------------------|
| general       | operation_mode, library_paths, log_level               |
| workers       | cpu_count, gpu_count, auto_start, healthcheck_interval |
| transcription | whisper_model, compute_type, vram_management           |
| scanner       | enabled, schedule_interval, watcher_enabled            |
| bazarr        | provider_enabled, api_key                              |

### Caching

The settings service caches values in memory (see the sketch after this list):

- Cache is invalidated on every write
- Thread-safe access
- Lazy loading from the database

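A simplified sketch of that pattern (not the actual implementation in `core/settings_service.py`):

```python
# Simplified write-invalidated, lazily loaded settings cache; the real
# implementation lives in core/settings_service.py.
import threading


class CachedSettings:
    def __init__(self, load_all, write_db):
        self._load_all = load_all      # callable reading all key/value rows from the DB
        self._write_db = write_db      # callable persisting one key/value pair
        self._cache = None
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:
            if self._cache is None:    # lazy load on first access
                self._cache = self._load_all()
            return self._cache.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._write_db(key, value)  # persist first
            self._cache = None          # invalidate; reloaded on next get()
```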
---

## Graceful Degradation

The system can run without Whisper/torch/PyAV installed:

```python
# Pattern used everywhere
try:
    import stable_whisper
    WHISPER_AVAILABLE = True
except ImportError:
    stable_whisper = None
    WHISPER_AVAILABLE = False

# Later in code
if not WHISPER_AVAILABLE:
    raise RuntimeError("Install with: pip install stable-ts faster-whisper")
```

**What works without Whisper:**

- Backend server starts normally
- All API endpoints remain available
- Frontend development
- Scanner and rules management
- Job queue (jobs are queued but not processed)

**What doesn't work:**

- Actual transcription (raises `RuntimeError`)

---
## Thread Safety

### Database Sessions

Always use context managers:

```python
with database.get_session() as session:
    # Session is automatically committed on success
    # and rolled back on exception
    job = session.query(Job).filter(...).first()
```

### Worker Pool

- Each worker is a separate Process (multiprocessing)
- Communication via shared memory (`multiprocessing.Manager`), sketched below
- No GIL contention between workers

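A minimal sketch of sharing worker status through a `multiprocessing.Manager` proxy (illustrative only; the actual shared structures are defined in `core/worker_pool.py`):

```python
# Illustrative cross-process status sharing via a Manager dict; the
# actual shared structures live in core/worker_pool.py.
import multiprocessing as mp


def do_work(status, worker_id):
    status[worker_id] = "busy"      # update is visible to the parent process
    # ... transcribe ...
    status[worker_id] = "idle"


if __name__ == "__main__":
    manager = mp.Manager()
    status = manager.dict()         # proxy object usable from any process

    proc = mp.Process(target=do_work, args=(status, "worker-1"))
    proc.start()
    proc.join()
    print(dict(status))             # {'worker-1': 'idle'}
```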
### Queue Manager

- Uses SQLAlchemy row-level locking
- `skip_locked=True` lets workers skip rows locked by other workers instead of blocking on them
- Transactions are short-lived

---
## Important Patterns

### Circular Import Resolution

**Critical**: `backend/scanning/__init__.py` MUST NOT import `library_scanner`:

```python
# backend/scanning/__init__.py
from backend.scanning.models import ScanRule
from backend.scanning.file_analyzer import FileAnalyzer, FileAnalysis
# DO NOT import library_scanner here!
```

**Why?**

```
library_scanner → database → models → scanning.models → database  (circular!)
```

**Solution**: Import `library_scanner` locally where needed:

```python
def some_function():
    from backend.scanning.library_scanner import library_scanner
    library_scanner.scan_paths(...)
```
### Optional Imports

```python
try:
    import pynvml
    NVML_AVAILABLE = True
except ImportError:
    pynvml = None
    NVML_AVAILABLE = False
```
### Database Session Pattern

```python
from backend.core.database import database
from backend.core.models import Job, JobStatus

with database.get_session() as session:
    # All operations within session context
    job = session.query(Job).filter(...).first()
    job.status = JobStatus.PROCESSING
    # Commit happens automatically
```
### API Response Pattern

```python
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel

from backend.core.database import database
from backend.core.models import Job

router = APIRouter()


class JobResponse(BaseModel):
    id: str
    status: str
    # ...


@router.get("/{job_id}", response_model=JobResponse)
async def get_job(job_id: str):
    with database.get_session() as session:
        job = session.query(Job).filter(Job.id == job_id).first()
        if not job:
            raise HTTPException(status_code=404, detail="Not found")
        return JobResponse(**job.to_dict())
```