Transcriptarr Backend Architecture
Technical documentation of the backend architecture, components, and data flow.
Table of Contents
- Overview
- Directory Structure
- Core Components
- Data Flow
- Database Schema
- Transcription vs Translation
- Worker Architecture
- Queue System
- Scanner System
- Settings System
- Graceful Degradation
- Thread Safety
- Important Patterns
Overview
Transcriptarr is built with a modular architecture consisting of:
- FastAPI Server: REST API with 45+ endpoints
- Worker Pool: Multiprocessing-based transcription workers (CPU/GPU)
- Queue Manager: Persistent job queue with priority support
- Library Scanner: Rule-based file scanning with scheduler and watcher
- Settings Service: Database-backed configuration system
┌─────────────────────────────────────────────────────────┐
│ FastAPI Server │
│ ┌─────────────────────────────────────────────────┐ │
│ │ REST API (45+ endpoints) │ │
│ │ /api/workers | /api/jobs | /api/settings │ │
│ │ /api/scanner | /api/system | /api/setup │ │
│ └─────────────────────────────────────────────────┘ │
└──────────────────┬──────────────────────────────────────┘
│
┌──────────────┼──────────────┬──────────────────┐
│ │ │ │
▼ ▼ ▼ ▼
┌────────┐ ┌──────────┐ ┌─────────┐ ┌──────────┐
│ Worker │ │ Queue │ │ Scanner │ │ Database │
│ Pool │◄──┤ Manager │◄──┤ Engine │ │ SQLite/ │
│ CPU/GPU│ │ Priority │ │ Rules + │ │ Postgres │
└────────┘ │ Queue │ │ Watcher │ └──────────┘
└──────────┘ └─────────┘
Directory Structure
backend/
├── app.py # FastAPI application + lifespan
├── cli.py # CLI commands (server, db, worker, scan, setup)
├── config.py # Pydantic Settings (from .env)
├── setup_wizard.py # Interactive first-run setup
│
├── core/
│ ├── database.py # SQLAlchemy setup + session management
│ ├── models.py # Job model + enums
│ ├── language_code.py # ISO 639 language code utilities
│ ├── settings_model.py # SystemSettings model (database-backed)
│ ├── settings_service.py # Settings service with caching
│ ├── system_monitor.py # CPU/RAM/GPU/VRAM monitoring
│ ├── queue_manager.py # Persistent queue with priority
│ ├── worker.py # Individual worker (Process)
│ └── worker_pool.py # Worker pool orchestrator
│
├── transcription/
│ ├── __init__.py # Exports + WHISPER_AVAILABLE flag
│ ├── transcriber.py # WhisperTranscriber wrapper
│ ├── translator.py # Google Translate integration
│ └── audio_utils.py # ffmpeg/ffprobe utilities
│
├── scanning/
│ ├── __init__.py # Exports (NO library_scanner import!)
│ ├── models.py # ScanRule model
│ ├── file_analyzer.py # ffprobe file analysis
│ ├── language_detector.py # Audio language detection
│ ├── detected_languages.py # Language mappings
│ └── library_scanner.py # Scanner + scheduler + watcher
│
└── api/
├── __init__.py # Router exports
├── workers.py # Worker management endpoints
├── jobs.py # Job queue endpoints
├── scan_rules.py # Scan rules CRUD
├── scanner.py # Scanner control endpoints
├── settings.py # Settings CRUD endpoints
├── system.py # System resources endpoints
├── filesystem.py # Filesystem browser endpoints
└── setup_wizard.py # Setup wizard endpoints
Core Components
1. WorkerPool (core/worker_pool.py)
Orchestrates CPU/GPU workers as separate processes.
Key Features:
- Dynamic add/remove workers at runtime
- Health monitoring with auto-restart
- Thread-safe multiprocessing
- Each worker is an isolated Process
from backend.core.worker_pool import worker_pool
from backend.core.worker import WorkerType
# Add GPU worker on device 0
worker_id = worker_pool.add_worker(WorkerType.GPU, device_id=0)
# Add CPU worker
worker_id = worker_pool.add_worker(WorkerType.CPU)
# Get pool stats
stats = worker_pool.get_pool_stats()
2. QueueManager (core/queue_manager.py)
Persistent SQLite/PostgreSQL queue with priority support.
Key Features:
- Job deduplication (no duplicate file_path)
- Row-level locking with skip_locked=True
- Priority-based ordering (higher first)
- FIFO within the same priority (by created_at)
- Auto-retry for failed jobs
from backend.core.queue_manager import queue_manager
from backend.core.models import QualityPreset
job = queue_manager.add_job(
file_path="/media/anime.mkv",
file_name="anime.mkv",
source_lang="jpn",
target_lang="spa",
quality_preset=QualityPreset.FAST,
priority=5
)
3. LibraryScanner (scanning/library_scanner.py)
Rule-based file scanning system.
Three Scan Modes:
- Manual: One-time scan via API or CLI
- Scheduled: Periodic scanning with APScheduler
- Real-time: File watcher with watchdog library
from backend.scanning.library_scanner import library_scanner
# Manual scan
result = library_scanner.scan_paths(["/media/anime"], recursive=True)
# Start scheduler (every 6 hours)
library_scanner.start_scheduler(interval_minutes=360)
# Start file watcher
library_scanner.start_file_watcher(paths=["/media/anime"], recursive=True)
4. WhisperTranscriber (transcription/transcriber.py)
Wrapper for stable-whisper and faster-whisper.
Key Features:
- GPU/CPU support with auto-device detection
- VRAM management and cleanup
- Graceful degradation (works without Whisper installed)
from backend.transcription.transcriber import WhisperTranscriber
transcriber = WhisperTranscriber(
model_name="large-v3",
device="cuda",
compute_type="float16"
)
result = transcriber.transcribe_file(
file_path="/media/episode.mkv",
language="jpn",
task="translate" # translate to English
)
result.to_srt("episode.eng.srt")
5. SettingsService (core/settings_service.py)
Database-backed configuration with caching.
from backend.core.settings_service import settings_service
# Get setting
value = settings_service.get("worker_cpu_count", default=1)
# Set setting
settings_service.set("worker_cpu_count", "2")
# Bulk update
settings_service.bulk_update({
"worker_cpu_count": "2",
"scanner_enabled": "true"
})
Data Flow
1. LibraryScanner detects file (manual/scheduled/watcher)
↓
2. FileAnalyzer analyzes with ffprobe
- Audio tracks (codec, language, channels)
- Embedded subtitles
- External .srt files
- Duration, video info
↓
3. Rules Engine evaluates against ScanRules (priority order)
- Checks all conditions (audio language, missing subs, etc.)
- First matching rule wins
↓
4. If match → QueueManager.add_job()
- Deduplication check (no duplicate file_path)
- Assigns priority based on rule
↓
5. Worker pulls job from queue
- Uses with_for_update(skip_locked=True)
- FIFO within same priority
↓
6. WhisperTranscriber processes with model
- Stage 1: Audio → English (Whisper translate)
- Stage 2: English → Target (Google Translate, if needed)
↓
7. Generate output SRT file(s)
- .eng.srt (always)
- .{target}.srt (if translate mode)
↓
8. Job marked completed ✓
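The worker side of this flow (steps 5-8) reduces to a small loop. A minimal sketch, assuming the queue_manager and WhisperTranscriber interfaces shown above; get_next_job, mark_completed, and mark_failed are hypothetical names for the queue calls the worker makes:
from backend.core.queue_manager import queue_manager
from backend.transcription.transcriber import WhisperTranscriber

def process_next_job(transcriber: WhisperTranscriber) -> bool:
    """Pull one queued job, transcribe it, and record the outcome."""
    job = queue_manager.get_next_job()  # hypothetical: locked fetch (skip_locked)
    if job is None:
        return False                    # queue is empty
    try:
        # Stage 1: audio -> English subtitles
        result = transcriber.transcribe_file(
            file_path=job.file_path,
            language=job.source_lang,
            task="translate",
        )
        output_path = job.file_path.rsplit(".", 1)[0] + ".eng.srt"
        result.to_srt(output_path)
        queue_manager.mark_completed(job.id, output_path)   # hypothetical
    except Exception as exc:
        queue_manager.mark_failed(job.id, error=str(exc))   # hypothetical; feeds auto-retry
    return True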
Database Schema
Job Table (jobs)
id VARCHAR PRIMARY KEY
file_path VARCHAR UNIQUE -- Ensures no duplicates
file_name VARCHAR
status VARCHAR -- queued/processing/completed/failed/cancelled
priority INTEGER
source_lang VARCHAR
target_lang VARCHAR
quality_preset VARCHAR -- fast/balanced/best
transcribe_or_translate VARCHAR -- transcribe/translate
progress FLOAT
current_stage VARCHAR
eta_seconds INTEGER
created_at DATETIME
started_at DATETIME
completed_at DATETIME
output_path VARCHAR
srt_content TEXT
segments_count INTEGER
error TEXT
retry_count INTEGER
max_retries INTEGER
worker_id VARCHAR
vram_used_mb INTEGER
processing_time_seconds FLOAT
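For orientation, a minimal SQLAlchemy sketch covering a subset of the columns above; the real model in core/models.py uses the status/preset enums and carries the remaining metadata:
from sqlalchemy import Column, DateTime, Float, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Job(Base):
    __tablename__ = "jobs"

    id = Column(String, primary_key=True)
    file_path = Column(String, unique=True)  # UNIQUE constraint backs deduplication
    file_name = Column(String)
    status = Column(String)                  # queued/processing/completed/failed/cancelled
    priority = Column(Integer)
    source_lang = Column(String)
    target_lang = Column(String)
    progress = Column(Float)
    created_at = Column(DateTime)
    error = Column(Text)
    retry_count = Column(Integer)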
ScanRule Table (scan_rules)
id INTEGER PRIMARY KEY
name VARCHAR UNIQUE
enabled BOOLEAN
priority INTEGER -- Higher = evaluated first
-- Conditions (all must match):
audio_language_is VARCHAR -- ISO 639-2
audio_language_not VARCHAR -- Comma-separated
audio_track_count_min INTEGER
has_embedded_subtitle_lang VARCHAR
missing_embedded_subtitle_lang VARCHAR
missing_external_subtitle_lang VARCHAR
file_extension VARCHAR -- Comma-separated
-- Action:
action_type VARCHAR -- transcribe/translate
target_language VARCHAR
quality_preset VARCHAR
job_priority INTEGER
created_at DATETIME
updated_at DATETIME
SystemSettings Table (system_settings)
id INTEGER PRIMARY KEY
key VARCHAR UNIQUE
value TEXT
description TEXT
category VARCHAR -- general/workers/transcription/scanner/bazarr
value_type VARCHAR -- string/integer/boolean/list
created_at DATETIME
updated_at DATETIME
Transcription vs Translation
Understanding the Two Modes
Mode 1: transcribe (Audio → English subtitles)
Audio (any language) → Whisper (task='translate') → English SRT
Example: Japanese audio → anime.eng.srt
Mode 2: translate (Audio → English → Target language)
Audio (any language) → Whisper (task='translate') → English SRT
→ Google Translate → Target language SRT
Example: Japanese audio → anime.eng.srt + anime.spa.srt
Why Two Stages?
Whisper Limitation: Whisper can only translate TO English, not between other languages.
Solution: Two-stage process:
- Stage 1 (always): Whisper converts audio to English using task='translate'
- Stage 2 (translate mode only): Google Translate converts the English subtitles to the target language (sketched below)
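A condensed sketch of the two stages, assuming the WhisperTranscriber API shown earlier; translate_srt is a hypothetical stand-in for the Google Translate integration in transcription/translator.py, whose exact signature may differ:
from backend.transcription.transcriber import WhisperTranscriber

def subtitle_file(file_path: str, target_lang: str | None) -> None:
    transcriber = WhisperTranscriber(model_name="large-v3", device="cuda")
    base = file_path.rsplit(".", 1)[0]

    # Stage 1 (always): any-language audio -> English SRT
    result = transcriber.transcribe_file(file_path=file_path, task="translate")
    result.to_srt(f"{base}.eng.srt")

    # Stage 2 (translate mode only): English SRT -> target-language SRT
    if target_lang and target_lang != "eng":
        from backend.transcription.translator import translate_srt  # hypothetical helper
        translate_srt(f"{base}.eng.srt", f"{base}.{target_lang}.srt", target_lang)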
Output Files
| Mode | Target | Output Files |
|---|---|---|
| transcribe | spa | .eng.srt only |
| translate | spa | .eng.srt + .spa.srt |
| translate | fra | .eng.srt + .fra.srt |
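The output files sit next to the source media; deriving their paths is a one-liner per language. A minimal sketch:
from pathlib import Path

def srt_paths(media_file: str, target_lang: str | None = None) -> list[Path]:
    """Return the SRT paths produced for a media file."""
    stem = Path(media_file).with_suffix("")        # drop .mkv/.mp4/...
    paths = [Path(f"{stem}.eng.srt")]              # Stage 1 output, always written
    if target_lang and target_lang != "eng":
        paths.append(Path(f"{stem}.{target_lang}.srt"))  # Stage 2 output
    return paths

# srt_paths("/media/anime.mkv", "spa")
# -> [PosixPath('/media/anime.eng.srt'), PosixPath('/media/anime.spa.srt')]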
Worker Architecture
Worker Types
| Type | Description | Device |
|---|---|---|
| CPU | Uses CPU for inference | None |
| GPU | Uses NVIDIA GPU | cuda:N |
Worker Lifecycle
┌─────────────┐
│ CREATED │
└──────┬──────┘
│ start()
▼
┌─────────────┐
┌──────────│ IDLE │◄─────────┐
│ └──────┬──────┘ │
│ │ get_job() │ job_done()
│ ▼ │
│ ┌─────────────┐ │
│ │ BUSY │──────────┘
│ └──────┬──────┘
│ │ error
│ ▼
│ ┌─────────────┐
└──────────│ ERROR │
└─────────────┘
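One plausible reading of the diagram as a state machine; the real worker in core/worker.py may name states and transitions differently:
from enum import Enum

class WorkerState(Enum):
    CREATED = "created"
    IDLE = "idle"
    BUSY = "busy"
    ERROR = "error"

# Allowed transitions from the lifecycle diagram
TRANSITIONS = {
    WorkerState.CREATED: {WorkerState.IDLE},                   # start()
    WorkerState.IDLE: {WorkerState.BUSY},                      # get_job()
    WorkerState.BUSY: {WorkerState.IDLE, WorkerState.ERROR},   # job_done() / error
    WorkerState.ERROR: {WorkerState.IDLE},                     # pool auto-restart
}

def transition(current: WorkerState, new: WorkerState) -> WorkerState:
    if new not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new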
Process Isolation
Each worker runs in a separate Python process:
- Memory isolation (VRAM per GPU worker)
- Crash isolation (one worker crash doesn't affect others)
- Independent model loading
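A minimal sketch of that isolation using the standard multiprocessing module; worker_main is a hypothetical stand-in for the loop in core/worker.py. Because the model is loaded inside the child process, a crash or VRAM leak stays contained:
import multiprocessing as mp

def worker_main(worker_id: str, device: str) -> None:
    # Runs in its own process: model weights, VRAM, and crashes are isolated here.
    from backend.transcription.transcriber import WhisperTranscriber
    transcriber = WhisperTranscriber(model_name="large-v3", device=device)
    # ... pull jobs from the queue and transcribe until told to stop (omitted)

if __name__ == "__main__":
    proc = mp.Process(target=worker_main, args=("gpu-0", "cuda"), daemon=True)
    proc.start()
    proc.join(timeout=5)
    if not proc.is_alive():
        print(f"worker exited with code {proc.exitcode}")  # the pool would restart it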
Queue System
Priority System
# Priority values
BAZARR_REQUEST = base_priority + 10 # Highest (external request)
MANUAL_REQUEST = base_priority + 5 # High (user-initiated)
AUTO_SCAN = base_priority # Normal (scanner-generated)
Job Deduplication
Jobs are deduplicated by file_path:
- If a job with the same file_path already exists, the new job is rejected
- add_job() returns None in that case
- Prevents duplicate processing of the same file
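A sketch of how that check might look inside add_job(), with the UNIQUE file_path column as the backstop against races between the lookup and the insert:
from backend.core.database import database
from backend.core.models import Job

def add_job(file_path: str, **fields):
    with database.get_session() as session:
        # Reject if a job for this file already exists (any status)
        existing = session.query(Job).filter(Job.file_path == file_path).first()
        if existing is not None:
            return None  # caller sees None and skips the file
        job = Job(file_path=file_path, **fields)
        session.add(job)
        return job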
Concurrency Safety
# Row-level locking prevents race conditions
job = session.query(Job).filter(
Job.status == JobStatus.QUEUED
).with_for_update(skip_locked=True).first()
Scanner System
Scan Rule Evaluation
Rules are evaluated in priority order (highest first):
# Pseudo-code for rule matching
for rule in rules.order_by(priority.desc()):
if rule.enabled and matches_all_conditions(file, rule):
create_job(file, rule.action)
break # First match wins
Conditions
All conditions must match (AND logic):
| Condition | Match If |
|---|---|
| audio_language_is | Primary audio track language equals |
| audio_language_not | Primary audio track language NOT in list |
| audio_track_count_min | Number of audio tracks >= value |
| has_embedded_subtitle_lang | Has embedded subtitle in language |
| missing_embedded_subtitle_lang | Does NOT have embedded subtitle |
| missing_external_subtitle_lang | Does NOT have external .srt file |
| file_extension | File extension in comma-separated list |
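A sketch of the AND logic over those conditions, assuming a FileAnalysis object with the fields the table implies; attribute names here are illustrative, not the exact ones in file_analyzer.py:
def matches_all_conditions(analysis, rule) -> bool:
    """Every configured condition must pass; unset conditions are skipped."""
    if rule.audio_language_is and analysis.primary_audio_lang != rule.audio_language_is:
        return False
    if rule.audio_language_not:
        excluded = {code.strip() for code in rule.audio_language_not.split(",")}
        if analysis.primary_audio_lang in excluded:
            return False
    if rule.audio_track_count_min and analysis.audio_track_count < rule.audio_track_count_min:
        return False
    if rule.missing_embedded_subtitle_lang and \
            rule.missing_embedded_subtitle_lang in analysis.embedded_subtitle_langs:
        return False  # the subtitle exists, so the "missing" condition fails
    if rule.file_extension:
        allowed = {ext.strip().lstrip(".") for ext in rule.file_extension.split(",")}
        if analysis.extension.lstrip(".") not in allowed:
            return False
    return True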
Settings System
Categories
| Category | Settings |
|---|---|
| general | operation_mode, library_paths, log_level |
| workers | cpu_count, gpu_count, auto_start, healthcheck_interval |
| transcription | whisper_model, compute_type, vram_management |
| scanner | enabled, schedule_interval, watcher_enabled |
| bazarr | provider_enabled, api_key |
Caching
The settings service implements in-memory caching:
- Cache invalidated on write
- Thread-safe access
- Lazy loading from database
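A sketch of that strategy: a lock for thread safety, lazy loads on first read, and invalidation on write. The _load_from_db/_write_to_db helpers are hypothetical stand-ins for the SystemSettings queries:
import threading

class CachedSettings:
    def __init__(self):
        self._cache: dict[str, str] = {}
        self._lock = threading.Lock()

    def get(self, key: str, default=None):
        with self._lock:
            if key not in self._cache:          # lazy load on first access
                value = self._load_from_db(key)
                if value is None:
                    return default
                self._cache[key] = value
            return self._cache[key]

    def set(self, key: str, value: str) -> None:
        with self._lock:
            self._write_to_db(key, value)
            self._cache.pop(key, None)          # invalidate; next get() reloads from DB

    def _load_from_db(self, key):               # stand-in for the SystemSettings query
        return None

    def _write_to_db(self, key, value):         # stand-in for the UPSERT
        pass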
Graceful Degradation
The system can run WITHOUT Whisper/torch/PyAV installed:
# Pattern used everywhere
try:
import stable_whisper
WHISPER_AVAILABLE = True
except ImportError:
stable_whisper = None
WHISPER_AVAILABLE = False
# Later in code
if not WHISPER_AVAILABLE:
raise RuntimeError("Install with: pip install stable-ts faster-whisper")
What works without Whisper:
- Backend server starts normally
- All APIs work fully
- Frontend development
- Scanner and rules management
- Job queue (jobs just won't be processed)
What doesn't work:
- Actual transcription (throws RuntimeError)
Thread Safety
Database Sessions
Always use context managers:
with database.get_session() as session:
# Session is automatically committed on success
# Rolled back on exception
job = session.query(Job).filter(...).first()
Worker Pool
- Each worker is a separate Process (multiprocessing)
- Communication via shared memory (Manager)
- No GIL contention between workers
Queue Manager
- Uses SQLAlchemy row-level locking
- skip_locked=True prevents deadlocks between workers
- Transactions are short-lived
Important Patterns
Circular Import Resolution
Critical: backend/scanning/__init__.py MUST NOT import library_scanner:
# backend/scanning/__init__.py
from backend.scanning.models import ScanRule
from backend.scanning.file_analyzer import FileAnalyzer, FileAnalysis
# DO NOT import library_scanner here!
Why?
library_scanner → database → models → scanning.models → database (circular!)
Solution: Import library_scanner locally where needed:
def some_function():
from backend.scanning.library_scanner import library_scanner
library_scanner.scan_paths(...)
Optional Imports
try:
import pynvml
NVML_AVAILABLE = True
except ImportError:
pynvml = None
NVML_AVAILABLE = False
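Downstream code then guards on the flag instead of re-importing. A short usage sketch with the standard pynvml calls:
def get_gpu_stats() -> dict | None:
    if not NVML_AVAILABLE:
        return None  # degrade gracefully: no GPU metrics reported
    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        return {"vram_used_mb": mem.used // (1024 * 1024)}
    finally:
        pynvml.nvmlShutdown()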
Database Session Pattern
from backend.core.database import database
with database.get_session() as session:
# All operations within session context
job = session.query(Job).filter(...).first()
job.status = JobStatus.PROCESSING
# Commit happens automatically
API Response Pattern
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
from backend.core.database import database
from backend.core.models import Job

router = APIRouter()
class JobResponse(BaseModel):
id: str
status: str
# ...
@router.get("/{job_id}", response_model=JobResponse)
async def get_job(job_id: str):
with database.get_session() as session:
job = session.query(Job).filter(Job.id == job_id).first()
if not job:
raise HTTPException(status_code=404, detail="Not found")
return JobResponse(**job.to_dict())