docs: add comprehensive project documentation
- Replace original Subgen README with TranscriptorIO documentation
- Add docs/API.md with 45+ REST endpoint documentation
- Add docs/ARCHITECTURE.md with backend component details
- Add docs/FRONTEND.md with Vue 3 frontend structure
- Add docs/CONFIGURATION.md with settings system documentation
- Remove outdated backend/README.md
README.md
@@ -1,282 +1,265 @@
[](https://www.paypal.com/donate/?hosted_button_id=SU4QQP6LH5PF6)

# 🎬 TranscriptorIO

<img src="https://raw.githubusercontent.com/McCloudS/subgen/main/icon.png" width="200">

<details>

**AI-powered subtitle transcription service with REST API and Web UI**

<summary>Updates:</summary>

26 Aug 2025: Renamed environment variables to make them slightly easier to understand. Currently maintains backwards compatibility. See https://github.com/McCloudS/subgen/pull/229

[](https://www.python.org/)

[](https://fastapi.tiangolo.com/)

[](https://vuejs.org/)

[](LICENSE)

12 Aug 2025: Added distil-large-v3.5
TranscriptorIO is an AI-powered subtitle transcription service based on [Subgen](https://github.com/McCloudS/subgen), featuring a modern FastAPI backend with 45+ REST endpoints, a Vue 3 web interface, and a distributed worker pool architecture.
|
||||||
|
|
||||||
7 Feb: Fixed (V)RAM clearing, added PLEX_QUEUE_SEASON, and other extraneous fixes and refactoring.
|
---
|
||||||
|
|
||||||
23 Dec: Added PLEX_QUEUE_NEXT_EPISODE and PLEX_QUEUE_SERIES. Will automatically start generating subtitles for the next episode in your series, or queue the whole series.
|
## ✨ Features
|
||||||
|
|
||||||
4 Dec: Added more ENV settings: DETECT_LANGUAGE_OFFSET, PREFERRED_AUDIO_LANGUAGES, SKIP_IF_AUDIO_TRACK_IS, ONLY_SKIP_IF_SUBGEN_SUBTITLE, SKIP_UNKNOWN_LANGUAGE, SKIP_IF_LANGUAGE_IS_NOT_SET_BUT_SUBTITLES_EXIST, SHOULD_WHISPER_DETECT_AUDIO_LANGUAGE
|
### 🎯 Core Features
|
||||||
|
- **Whisper Transcription** - Support for faster-whisper and stable-ts
|
||||||
|
- **Translation** - Two-stage translation: Whisper to English, then Google Translate to target language
|
||||||
|
- **CPU/GPU Workers** - Scalable worker pool with CUDA support
|
||||||
|
- **Persistent Queue** - Priority-based queue manager with SQLite/PostgreSQL
|
||||||
|
- **Library Scanner** - Automatic scanning with configurable rules
|
||||||
|
- **REST API** - 45+ endpoints with FastAPI
|
||||||
|
- **Web UI** - Complete Vue 3 dashboard with 6 views
|
||||||
|
- **Setup Wizard** - Interactive first-run configuration
|
||||||
|
- **Real-time Monitoring** - File watcher, scheduled scans, and system resources
|
||||||
|
|
||||||
30 Nov 2024: Significant refactoring and handling by Muisje. Added a language code class for more robustness and flexibility, and the ability to separate audio tracks to make sure you get the one you want. New ENV Variables: SUBTITLE_LANGUAGE_NAMING_TYPE, SKIP_IF_AUDIO_TRACK_IS, PREFERRED_AUDIO_LANGUAGE, SKIP_IF_TO_TRANSCRIBE_SUB_ALREADY_EXIST
|
### 🔧 Technical Features
|
||||||
|
- **Multiprocessing**: Workers isolated in separate processes
|
||||||
|
- **Priority Queuing**: Queue with priorities and deduplication
|
||||||
|
- **Graceful Degradation**: Works without optional dependencies (Whisper, GPU)
|
||||||
|
- **Thread-Safe**: Row locking and context managers
|
||||||
|
- **Auto-retry**: Automatic retry of failed jobs
|
||||||
|
- **Health Monitoring**: Detailed statistics and health checks
|
||||||
|
- **Database-backed Settings**: All configuration stored in database, editable via Web UI
|
||||||
|
|
||||||
There will be some minor hiccups, so please identify them as we work through this major overhaul.
|
---
|
||||||
|
|
||||||
22 Nov 2024: Updated to support large-v3-turbo
|
## 🚀 Quick Start
|
||||||
|
|
||||||
30 Sept 2024: Removed webui
|
### 1. Install dependencies
|
||||||
|
|
||||||
5 Sept 2024: Fixed Emby response to a test message/notification. Clarified Emby/Plex/Jellyfin instructions for paths.
|
```bash
|
||||||
|
# Basic dependencies
|
||||||
|
pip install -r requirements.txt
|
||||||
|
|
||||||
14 Aug 2024: Cleaned up usage of kwargs across the board a bit. Added ability for /asr to encode or not, so you don't need to worry about what files/formats you upload.
|
# Transcription dependencies (optional - required for actual transcription)
|
||||||
|
pip install stable-ts faster-whisper
|
||||||
3 Aug 2024: Added SUBGEN_KWARGS environment variable which allows you to override the model.transcribe with most options you'd like from whisper, faster-whisper, or stable-ts. This won't be exposed via the webui, it's best to set directly.
|
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
|
||||||
|
pip install av>=10.0.0
|
||||||
21 Apr 2024: Fixed queuing with thanks to https://github.com/xhzhu0628 @ https://github.com/McCloudS/subgen/pull/85. Bazarr intentionally doesn't follow `CONCURRENT_TRANSCRIPTIONS` because it needs a time sensitive response.
|
|
||||||
|
|
||||||
31 Mar 2024: Removed `/subsync` endpoint and general refactoring. Open an issue if you were using it!
|
|
||||||
|
|
||||||
24 Mar 2024: ~~Added a 'webui' to configure environment variables. You can use this instead of manually editing the script or using Environment Variables in your OS or Docker (if you want). The config will prioritize OS Env Variables, then the .env file, then the defaults. You can access it at `http://subgen:9000/`~~
|
|
||||||
|
|
||||||
23 Mar 2024: Added `CUSTOM_REGROUP` to try to 'clean up' subtitles a bit.
|
|
||||||
|
|
||||||
22 Mar 2024: Added LRC capability, see: `'LRC_FOR_AUDIO_FILES' | True | Will generate LRC (instead of SRT) files for filetypes: '.mp3', '.flac', '.wav', '.alac', '.ape', '.ogg', '.wma', '.m4a', '.m4b', '.aac', '.aiff' |`
|
|
||||||
|
|
||||||
21 Mar 2024: Added a 'wizard' into the launcher that will help standalone users get common Bazarr variables configured. See below in the Launcher section. Removed 'Transformers' as an option. While I usually don't like to remove features, I don't think anyone is using this, and the results are wildly unpredictable and often cause out-of-memory errors. Added two new environment variables called `USE_MODEL_PROMPT` and `CUSTOM_MODEL_PROMPT`. If `USE_MODEL_PROMPT` is `True` it will use `CUSTOM_MODEL_PROMPT` if set, otherwise it will default to using the pre-configured language pairings, such as: `"en": "Hello, welcome to my lecture.", "zh": "你好,欢迎来到我的讲座。"` These pre-configured prompts are geared towards fixing audio that may not have punctuation; we can prompt the model to try to force the use of punctuation during transcription.
|
|
||||||
|
|
||||||
19 Mar 2024: Added a `MONITOR` environment variable. Will 'watch' or 'monitor' your `TRANSCRIBE_FOLDERS` for changes and run on them. Useful if you just want to paste files into a folder and get subtitles.
|
|
||||||
|
|
||||||
6 Mar 2024: Added a `/subsync` endpoint that can attempt to align/synchronize subtitles to a file. Takes audio_file, subtitle_file, language (2 letter code), and outputs an srt.
|
|
||||||
|
|
||||||
5 Mar 2024: Cleaned up logging. Added timestamps option (if Debug = True, timestamps will print in logs).
|
|
||||||
|
|
||||||
4 Mar 2024: Updated Dockerfile CUDA to 12.2.2 (From CTranslate2). Added endpoint `/status` to return Subgen version. Can also use distil models now! See variables below!
|
|
||||||
|
|
||||||
29 Feb 2024: Changed the default port to align with whisper-asr and deconflict other consumers of the previous port.
|
|
||||||
|
|
||||||
11 Feb 2024: Added a 'launcher.py' file for Docker to prevent huge image downloads. Now set UPDATE to True if you want to pull the latest version, otherwise it will default to what was in the image on build. Docker builds will still be auto-built on any commit. If you don't want to use the auto-update function, no action is needed on your part; continue to update docker images as before. Fixed bug where detect-language could return an empty result. Reduced useless debug output that was spamming logs and defaulted DEBUG to True. Added APPEND, which will add f"Transcribed by whisperAI with faster-whisper ({whisper_model}) on {datetime.now()}" at the end of a subtitle.
|
|
||||||
|
|
||||||
10 Feb 2024: Added some features from JaiZed's branch such as skipping if SDH subtitles are detected, functions updated to also be able to transcribe audio files, allow individual files to be manually transcribed, and a better implementation of forceLanguage. Added `/batch` endpoint (Thanks JaiZed). Allows you to navigate in a browser to http://subgen_ip:9000/docs and call the batch endpoint which can take a file or a folder to manually transcribe files. Added CLEAR_VRAM_ON_COMPLETE, HF_TRANSFORMERS, HF_BATCH_SIZE. Hugging Face Transformers boast '9x increase', but my limited testing shows it's comparable to faster-whisper or slightly slower. I also have an older 8gb GPU. Simplest way to persist HF Transformer models is to set "HF_HUB_CACHE" and set it to "/subgen/models" for Docker (assuming you have the matching volume).
|
|
||||||
|
|
||||||
8 Feb 2024: Added FORCE_DETECTED_LANGUAGE_TO to force a wrongly detected language. Fixed asr to actually use the language passed to it.
|
|
||||||
|
|
||||||
5 Feb 2024: General housekeeping, minor tweaks on the TRANSCRIBE_FOLDERS function.
|
|
||||||
|
|
||||||
28 Jan 2024: Fixed issue with ffmpeg python module not importing correctly. Removed separate GPU/CPU containers. Also removed the script from installing packages, which should help with odd updates I can't control (from other packages/modules). The image is a couple gigabytes larger, but allows easier maintenance.
|
|
||||||
|
|
||||||
19 Dec 2023: Added the ability for Plex and Jellyfin to automatically update metadata so the subtitles shows up properly on playback. (See https://github.com/McCloudS/subgen/pull/33 from Rikiar73574)
|
|
||||||
|
|
||||||
31 Oct 2023: Added Bazarr support via the Whisper provider.
|
|
||||||
|
|
||||||
25 Oct 2023: Added Emby (IE http://192.168.1.111:9000/emby) support and TRANSCRIBE_FOLDERS, which will recurse through the provided folders and generate subtitles. It's geared towards attempting to transcribe existing media without using a webhook.
|
|
||||||
|
|
||||||
23 Oct 2023: There are now two docker images, ones for CPU (it's smaller): mccloud/subgen:latest, mccloud/subgen:cpu, the other is for cuda/GPU: mccloud/subgen:cuda. I also added Jellyfin support and considerable cleanup in the script. I also renamed the webhooks, so they will require new configuration/updates on your end. Instead of /webhook they are now /plex, /tautulli, and /jellyfin.
|
|
||||||
|
|
||||||
22 Oct 2023: The script should have backwards compatibility with previous environment settings, but just to be sure, look at the new options below. If you don't want to manually edit your environment variables, just edit the script manually. While I have added GPU support, I haven't tested it yet.
|
|
||||||
|
|
||||||
19 Oct 2023: And we're back! Uses faster-whisper and stable-ts. Shouldn't break anything from previous settings, but adds a couple new options that aren't documented at this point in time. As of now, this is not a docker image on dockerhub. The potential intent is to move this eventually to a pure python script, primarily to simplify my efforts. Quick and dirty to meet dependencies: pip or `pip3 install flask requests stable-ts faster-whisper`
|
|
||||||
|
|
||||||
This potentially has the ability to use CUDA/Nvidia GPU's, but I don't have one set up yet. Tesla T4 is in the mail!
|
|
||||||
|
|
||||||
2 Feb 2023: Added Tautulli webhooks back in. Didn't realize Plex webhooks was PlexPass only. See below for instructions to add it back in.
|
|
||||||
|
|
||||||
31 Jan 2023 : Rewrote the script substantially to remove Tautulli and fix some variable handling. For some reason my implementation requires the container to be in host mode. My Plex was giving "401 Unauthorized" when attempt to query from docker subnets during API calls. (**Fixed now, it can be in bridge**)
|
|
||||||
|
|
||||||
</details>
|
|
||||||
|
|
||||||
# What is this?
|
|
||||||
|
|
||||||
This will transcribe your personal media on a Plex, Emby, or Jellyfin server to create subtitles (.srt) from audio/video files in the following languages: https://github.com/McCloudS/subgen#audio-languages-supported-via-openai and transcribe or translate them into English. It can also be used as a Whisper provider in Bazarr (see below for instructions). It technically has support to transcribe from a foreign language to itself (IE Japanese > Japanese, see [TRANSCRIBE_OR_TRANSLATE](https://github.com/McCloudS/subgen#variables)). It is currently reliant on webhooks from Jellyfin, Emby, Plex, or Tautulli. This uses stable-ts and faster-whisper, which can use both Nvidia GPUs and CPUs.
|
|
||||||
|
|
||||||
# Why?
|
|
||||||
|
|
||||||
Honestly, I built this for me, but saw the utility in other people maybe using it. This works well for my use case. Since having children, I'm either deaf or wanting to have everything quiet. We watch EVERYTHING with subtitles now, and I feel like I can't even understand the show without them. I use Bazarr to auto-download, and gap fill with Plex's built-in capability. This is for everything else. Some shows just won't have subtitles available for some reason or another, or in some cases on my H265 media, they are wildly out of sync.
|
|
||||||
|
|
||||||
# What can it do?
|
|
||||||
|
|
||||||
* Create .srt subtitles when a media file is added or played, triggered off of Jellyfin, Plex, or Tautulli webhooks. It can also be called via the Whisper provider inside Bazarr.
|
|
||||||
|
|
||||||
# How do I set it up?
|
|
||||||
|
|
||||||
## Install/Setup
|
|
||||||
|
|
||||||
### Standalone/Without Docker
|
|
||||||
|
|
||||||
Install python3 (Whisper supports Python 3.9-3.11), ffmpeg, and download launcher.py from this repository. Then run it: `python3 launcher.py -u -i -s`. You need to have matching paths relative to your Plex server/folders, or use USE_PATH_MAPPING. Paths are not needed if you are only using Bazarr. For GPU use you will need the appropriate NVIDIA drivers and a minimum of CUDA Toolkit 12.3 (12.3.2 is known working): https://developer.nvidia.com/cuda-toolkit-archive
|
|
||||||
|
|
||||||
Note: If you have previously had Subgen running in standalone, you may need to run `pip install --upgrade --force-reinstall faster-whisper git+https://github.com/jianfch/stable-ts.git` to force the install of the newer stable-ts package.
|
|
||||||
|
|
||||||
#### Using Launcher
|
|
||||||
|
|
||||||
launcher.py can launch subgen for you, automate the setup, and takes the following options:
|
|
||||||

|
|
||||||
|
|
||||||
Using `-s` for Bazarr setup:
|
|
||||||

|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
### Docker
|
|
||||||
|
|
||||||
The dockerfile is in the repo along with an example docker-compose file, and is also posted on dockerhub (mccloud/subgen).
|
|
||||||
|
|
||||||
If using Subgen without Bazarr, you MUST mount your media volumes in subgen the same way Plex (or your media server) sees them. For example, if Plex uses "/Share/media/TV:/tv" you must have that identical volume in subgen.
|
|
||||||
|
|
||||||
`"${APPDATA}/subgen/models:/subgen/models"` is just for storage of the language models. This isn't necessary, but you will have to redownload the models on any new image pulls if you don't use it.
|
|
||||||
|
|
||||||
`"${APPDATA}/subgen/subgen.py:/subgen/subgen.py"` If you want to control the version of subgen.py by yourself. Launcher.py can still be used to download a newer version.
|
|
||||||
|
|
||||||
If you want to use a GPU, you need to map it accordingly.
|
|
||||||
|
|
||||||
#### Unraid
|
|
||||||
|
|
||||||
While Unraid doesn't have an app or template for quick install, with minor manual work, you can install it. See [https://github.com/McCloudS/subgen/discussions/137](https://github.com/McCloudS/subgen/discussions/137) for pictures and steps.
|
|
||||||
|
|
||||||
## Bazarr
|
|
||||||
|
|
||||||
You only need to configure the Whisper Provider as shown below: <br>
|
|
||||||
 <br>
|
|
||||||
The Docker Endpoint is the IP address and port of your subgen container (IE http://192.168.1.111:9000). See https://wiki.bazarr.media/Additional-Configuration/Whisper-Provider/ for more info. **127.0.0.1 WILL NOT WORK IF YOU ARE RUNNING BAZARR IN A DOCKER CONTAINER!** I recommend not using the Bazarr provider together with other webhooks in Subgen, or you will likely generate duplicate subtitles. If you are using Bazarr, path mapping isn't necessary, as Bazarr sends the file over http.
|
|
||||||
|
|
||||||
**The defaults of Subgen will allow it to run in Bazarr with zero configuration. However, you will probably want to change, at a minimum, `TRANSCRIBE_DEVICE` and `WHISPER_MODEL`.**
|
|
||||||
|
|
||||||
## Plex
|
|
||||||
|
|
||||||
Create a webhook in Plex that will call back to your subgen address, IE: http://192.168.1.111:9000/plex see: https://support.plex.tv/articles/115002267687-webhooks/ You will also need to generate the token to use it. Remember, Plex and Subgen need to be able to see the exact same files at the exact same paths, otherwise you need `USE_PATH_MAPPING`.
|
|
||||||
|
|
||||||
## Emby
|
|
||||||
|
|
||||||
All you need to do is create a webhook in Emby pointing to your subgen IE: `http://192.168.154:9000/emby`, set `Request content type` to `multipart/form-data` and configure your desired events (Usually, `New Media Added`, `Start`, and `Unpause`). See https://github.com/McCloudS/subgen/discussions/115#discussioncomment-10569277 for screenshot examples.
|
|
||||||
|
|
||||||
Emby was really nice and provides good information in their responses, so we don't need to add an API token or server url to query for more information.
|
|
||||||
|
|
||||||
Remember, Emby and Subgen need to be able to see the exact same files at the exact same paths, otherwise you need `USE_PATH_MAPPING`.
|
|
||||||
|
|
||||||
## Tautulli
|
|
||||||
|
|
||||||
Create the webhooks in Tautulli with the following settings:
|
|
||||||
Webhook URL: http://yourdockerip:9000/tautulli
|
|
||||||
Webhook Method: Post
|
|
||||||
Triggers: Whatever you want, but you'll likely want "Playback Start" and "Recently Added"
|
|
||||||
Data: Under Playback Start, JSON Header will be:
|
|
||||||
```json
|
|
||||||
{ "source":"Tautulli" }
|
|
||||||
```
|
```
|
||||||
Data:
|
|
||||||
```json
|
### 2. First run (Setup Wizard)
|
||||||
{
|
|
||||||
"event":"played",
|
```bash
|
||||||
"file":"{file}",
|
# The setup wizard runs automatically on first start
|
||||||
"filename":"{filename}",
|
python backend/cli.py server
|
||||||
"mediatype":"{media_type}"
|
|
||||||
}
|
# Or run setup wizard manually
|
||||||
|
python backend/cli.py setup
|
||||||
```
|
```
|
||||||
Similarly, under Recently Added, Header is:
|
|
||||||
```json
|
The setup wizard will guide you through:
|
||||||
{ "source":"Tautulli" }
|
- **Standalone mode**: Configure library paths, scan rules, and workers
|
||||||
|
- **Bazarr mode**: Configure as Bazarr subtitle provider (in development)
|
||||||
|
|
||||||
|
### 3. Start the server
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Development (with auto-reload)
|
||||||
|
python backend/cli.py server --reload
|
||||||
|
|
||||||
|
# Production
|
||||||
|
python backend/cli.py server --host 0.0.0.0 --port 8000 --workers 4
|
||||||
```
|
```
|
||||||
Data:

```json
{
"event":"added",
"file":"{file}",
"filename":"{filename}",
"mediatype":"{media_type}"
}
```

### 4. Access the application

| URL | Description |
|-----|-------------|
| http://localhost:8000 | Web UI (Dashboard) |
| http://localhost:8000/docs | Swagger API Documentation |
| http://localhost:8000/redoc | ReDoc API Documentation |
| http://localhost:8000/health | Health Check Endpoint |
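
For a quick programmatic smoke test of a running instance, something like the following can hit the health check endpoint listed above (a minimal sketch: it assumes the server is reachable on localhost:8000, that `/health` returns JSON, and that the third-party `requests` package is installed):

```python
import requests

# Query the health check endpoint listed in the table above.
resp = requests.get("http://localhost:8000/health", timeout=5)
resp.raise_for_status()
print(resp.json())  # the exact payload shape depends on the server implementation
```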
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 📋 CLI Commands
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Server
|
||||||
|
python backend/cli.py server [options]
|
||||||
|
--host HOST Host (default: 0.0.0.0)
|
||||||
|
--port PORT Port (default: 8000)
|
||||||
|
--reload Auto-reload for development
|
||||||
|
--workers N Number of uvicorn workers (default: 1)
|
||||||
|
--log-level LEVEL Log level (default: info)
|
||||||
|
|
||||||
|
# Setup wizard
|
||||||
|
python backend/cli.py setup # Run setup wizard
|
||||||
|
|
||||||
|
# Database
|
||||||
|
python backend/cli.py db init # Initialize database
|
||||||
|
python backend/cli.py db reset # Reset (WARNING: deletes all data!)
|
||||||
|
|
||||||
|
# Standalone worker
|
||||||
|
python backend/cli.py worker --type cpu
|
||||||
|
python backend/cli.py worker --type gpu --device-id 0
|
||||||
|
|
||||||
|
# Manual scan
|
||||||
|
python backend/cli.py scan /path/to/media [--no-recursive]
|
||||||
```
|
|
||||||
## Jellyfin
|
|
||||||
|
|
||||||
First, you need to install the Jellyfin webhooks plugin. Then you need to click "Add Generic Destination", name it anything you want, webhook url is your subgen info (IE http://192.168.1.154:9000/jellyfin). Next, check Item Added, Playback Start, and Send All Properties. Last, "Add Request Header" and add the Key: `Content-Type` Value: `application/json`<br><br>Click Save and you should be all set!
|
---
|
||||||
|
|
||||||
Remember, Jellyfin and Subgen need to be able to see the exact same files at the exact same paths, otherwise you need `USE_PATH_MAPPING`.
|
## 🏗️ Architecture
|
||||||
|
|
||||||
## Variables
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────┐
|
||||||
|
│ FastAPI Server │
|
||||||
|
│ ┌─────────────────────────────────────────────────┐ │
|
||||||
|
│ │ REST API (45+ endpoints) │ │
|
||||||
|
│ │ /api/workers | /api/jobs | /api/settings │ │
|
||||||
|
│ │ /api/scanner | /api/system | /api/setup │ │
|
||||||
|
│ └─────────────────────────────────────────────────┘ │
|
||||||
|
└──────────────────┬──────────────────────────────────────┘
|
||||||
|
│
|
||||||
|
┌──────────────┼──────────────┬──────────────────┐
|
||||||
|
│ │ │ │
|
||||||
|
▼ ▼ ▼ ▼
|
||||||
|
┌────────┐ ┌──────────┐ ┌─────────┐ ┌──────────┐
|
||||||
|
│ Worker │ │ Queue │ │ Scanner │ │ Database │
|
||||||
|
│ Pool │◄──┤ Manager │◄──┤ Engine │ │ SQLite/ │
|
||||||
|
│ CPU/GPU│ │ Priority │ │ Rules + │ │ Postgres │
|
||||||
|
└────────┘ │ Queue │ │ Watcher │ └──────────┘
|
||||||
|
└──────────┘ └─────────┘
|
||||||
|
```
|
||||||
|
|
||||||
You can define the port via environment variables, but the endpoints are static.
|
### Data Flow

1. **LibraryScanner** detects files (manual/scheduled/watcher)
2. **FileAnalyzer** analyzes with ffprobe (audio tracks, subtitles)
3. **Rules Engine** evaluates against configurable ScanRules
4. **QueueManager** adds job to persistent queue (with deduplication)
5. **Worker** processes with WhisperTranscriber
6. **Output**: Generates `.eng.srt` (transcription) or `.{lang}.srt` (translation)

The following environment variables are available in Docker. They will default to the values listed below.

| Variable | Default Value | Description |
|---------------------------|------------------------|-------------|
| TRANSCRIBE_DEVICE | 'cpu' | Can transcribe via gpu (Cuda only) or cpu. Takes option of "cpu", "gpu", "cuda". |
| WHISPER_MODEL | 'medium' | Can be: 'tiny', 'tiny.en', 'base', 'base.en', 'small', 'small.en', 'medium', 'medium.en', 'large-v1', 'large-v2', 'large-v3', 'large', 'distil-large-v2', 'distil-large-v3', 'distil-large-v3.5', 'distil-medium.en', 'distil-small.en', 'large-v3-turbo' |
| CONCURRENT_TRANSCRIPTIONS | 2 | Number of files it will transcribe in parallel |
| WHISPER_THREADS | 4 | Number of threads to use during computation |
|
|
||||||
| MODEL_PATH | './models' | This is where the WHISPER_MODEL will be stored. This defaults to placing it where you execute the script in the folder 'models' |
|
|
||||||
| PROCESS_ADDED_MEDIA | True | will gen subtitles for all media added regardless of existing external/embedded subtitles (based off of SKIP_IF_INTERNAL_SUBTITLES_LANGUAGE) |
|
|
||||||
| PROCESS_MEDIA_ON_PLAY | True | will gen subtitles for all played media regardless of existing external/embedded subtitles (based off of SKIP_IF_INTERNAL_SUBTITLES_LANGUAGE) |
|
|
||||||
| SUBTITLE_LANGUAGE_NAME | 'aa' | allows you to pick what it will name the subtitle. Instead of using EN, I'm using AA, so it doesn't mix with existing external EN subs, and AA will populate higher on the list in Plex. This will override the Whisper detected language for a file name. |
|
|
||||||
| SKIP_IF_INTERNAL_SUBTITLES_LANGUAGE | 'eng' | Will not generate a subtitle if the file has an internal sub matching the 3 letter code of this variable (See https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) |
|
|
||||||
| WORD_LEVEL_HIGHLIGHT | False | Highlights each word as it's spoken in the subtitle. See example video @ https://github.com/jianfch/stable-ts |
|
|
||||||
| PLEX_SERVER | 'http://plex:32400' | This needs to be set to your local plex server address/port |
|
|
||||||
| PLEX_TOKEN | 'token here' | This needs to be set to your plex token found by https://support.plex.tv/articles/204059436-finding-an-authentication-token-x-plex-token/ |
|
|
||||||
| JELLYFIN_SERVER | 'http://jellyfin:8096' | Set to your Jellyfin server address/port |
|
|
||||||
| JELLYFIN_TOKEN | 'token here' | Generate a token inside the Jellyfin interface |
|
|
||||||
| WEBHOOK_PORT | 9000 | Change this if you need a different port for your webhook |
|
|
||||||
| USE_PATH_MAPPING | False | Similar to sonarr and radarr path mapping, this will attempt to replace paths on file systems that don't have identical paths. Currently only support for one path replacement. Examples below. |
|
|
||||||
| PATH_MAPPING_FROM | '/tv' | This is the path of my media relative to my Plex server |
|
|
||||||
| PATH_MAPPING_TO | '/Volumes/TV' | This is the path of that same folder relative to my Mac Mini that will run the script |
|
|
||||||
| TRANSCRIBE_FOLDERS | '' | Takes a pipe '\|' separated list (For example: /tv\|/movies\|/familyvideos) and iterates through and adds those files to be queued for subtitle generation if they don't have internal subtitles |
|
|
||||||
| TRANSCRIBE_OR_TRANSLATE | 'transcribe' | Takes either 'transcribe' or 'translate'. Transcribe will transcribe the audio in the same language as the input. Translate will transcribe and translate into English. |
|
|
||||||
| COMPUTE_TYPE | 'auto' | Set compute-type using the following information: https://github.com/OpenNMT/CTranslate2/blob/master/docs/quantization.md |
|
|
||||||
| DEBUG | True | Provides some debug data that can be helpful to troubleshoot path mapping and other issues. Fun fact, if this is set to true, any modifications to the script will auto-reload it (if it isn't actively transcoding). Useful to make small tweaks without re-downloading the whole file. |
|
|
||||||
| FORCE_DETECTED_LANGUAGE_TO | '' | This is to force the model to a language instead of the detected one, takes a 2 letter language code. For example, your audio is French but keeps detecting as English, you would set it to 'fr' |
|
|
||||||
| CLEAR_VRAM_ON_COMPLETE | True | This will delete the model and do garbage collection when queue is empty. Good if you need to use the VRAM for something else. |
|
|
||||||
| UPDATE | False | Will pull latest subgen.py from the repository if True. False will use the original subgen.py built into the Docker image. Standalone users can use this with launcher.py to get updates. |
|
|
||||||
| APPEND | False | Will add the following at the end of a subtitle: "Transcribed by whisperAI with faster-whisper ({whisper_model}) on {datetime.now()}" |
|
|
||||||
| MONITOR | False | Will monitor `TRANSCRIBE_FOLDERS` for real-time changes to see if we need to generate subtitles |
|
|
||||||
| USE_MODEL_PROMPT | False | When set to `True`, will use the default prompt stored in greetings_translations "Hello, welcome to my lecture." to try and force the use of punctuation in transcriptions that don't. Automatic `CUSTOM_MODEL_PROMPT` will only work with ASR, but can still be set manually like so: `USE_MODEL_PROMPT=True and CUSTOM_MODEL_PROMPT=Hello, welcome to my lecture.` |
|
|
||||||
| CUSTOM_MODEL_PROMPT | '' | If `USE_MODEL_PROMPT` is `True`, you can override the default prompt (See: https://medium.com/axinc-ai/prompt-engineering-in-whisper-6bb18003562d for great examples). |
|
|
||||||
| LRC_FOR_AUDIO_FILES | True | Will generate LRC (instead of SRT) files for filetypes: '.mp3', '.flac', '.wav', '.alac', '.ape', '.ogg', '.wma', '.m4a', '.m4b', '.aac', '.aiff' |
|
|
||||||
| CUSTOM_REGROUP | 'cm_sl=84_sl=42++++++1' | Attempts to regroup some of the segments to make a cleaner looking subtitle. See https://github.com/McCloudS/subgen/issues/68 for discussion. Set to blank if you want to use Stable-TS default regroups algorithm of `cm_sp=,* /,_sg=.5_mg=.3+3_sp=.* /。/?/?` |
|
|
||||||
| DETECT_LANGUAGE_LENGTH | 30 | Detect language on the first x seconds of the audio. |
|
|
||||||
| SKIP_IF_EXTERNAL_SUBTITLES_EXIST | False | Skip subtitle generation if an external subtitle with the same language code as NAMESUBLANG is present. Used for the case of not regenerating subtitles if I already have `Movie (2002).NAMESUBLANG.srt` from a non-subgen source. |
|
|
||||||
| SUBGEN_KWARGS | '{}' | Takes a kwargs python dictionary of options you would like to add/override. For advanced users. An example would be `{'vad': True, 'prompt_reset_on_temperature': 0.35}` |
|
|
||||||
| SKIP_SUBTITLE_LANGUAGES | '' | Takes a pipe separated `\|` list of 3 letter language codes to not generate subtitles for example 'eng\|deu'|
|
|
||||||
| SUBTITLE_LANGUAGE_NAMING_TYPE | 'ISO_639_2_B' | The type of naming format desired, such as 'ISO_639_1', 'ISO_639_2_T', 'ISO_639_2_B', 'NAME', or 'NATIVE', for example: ("es", "spa", "spa", "Spanish", "Español") |
|
|
||||||
| SKIP_SUBTITLE_LANGUAGES | '' | Takes a pipe separated `\|` list of 3 letter language codes to skip if the file has audio in that language. This could be used to skip generating subtitles for a language you don't want, like, I speak English, don't generate English subtitles (for example: 'eng\|deu')|
|
|
||||||
| PREFERRED_AUDIO_LANGUAGE | 'eng' | If there are multiple audio tracks in a file, it will prefer this setting |
|
|
||||||
| SKIP_IF_TARGET_SUBTITLES_EXIST | True | Skips generation of subtitle if a file matches our desired language already. |
|
|
||||||
| DETECT_LANGUAGE_OFFSET | 0 | Allows you to shift when to run detect_language, geared towards avoiding introductions or songs. |
|
|
||||||
| PREFERRED_AUDIO_LANGUAGES | 'eng' | Pipe separated list |
|
|
||||||
| SKIP_IF_AUDIO_TRACK_IS | '' | Takes a pipe separated list of ISO 639-2 languages. Skips generation of subtitle if the file has the audio file listed. |
|
|
||||||
| SKIP_ONLY_SUBGEN_SUBTITLES | False | Skips generation of subtitles if the file has "subgen" somewhere in the name |
|
|
||||||
| SKIP_UNKNOWN_LANGUAGE | False | Skips generation if the file has an unknown language |
|
|
||||||
| SKIP_IF_NO_LANGUAGE_BUT_SUBTITLES_EXIST | False | Skips generation if file doesn't have an audio stream marked with a language |
|
|
||||||
| SHOULD_WHISPER_DETECT_AUDIO_LANGUAGE | False | Should Whisper try to detect the language if there is no audio language specified via forced language |
|
|
||||||
| PLEX_QUEUE_NEXT_EPISODE | False | Will queue the next Plex series episode for subtitle generation if subgen is triggered. |
|
|
||||||
| PLEX_QUEUE_SEASON | False | Will queue the rest of the Plex season for subtitle generation if subgen is triggered. |
|
|
||||||
| PLEX_QUEUE_SERIES | False | Will queue the whole Plex series for subtitle generation if subgen is triggered. |
|
|
||||||
| SHOW_IN_SUBNAME_SUBGEN | True | Adds subgen to the subtitle file name. |
|
|
||||||
| SHOW_IN_SUBNAME_MODEL | True | Adds Whisper model name to the subtitle file name. |
|
|
||||||
|
|
||||||
### Images:
|
---
|
||||||
`mccloud/subgen:latest` is GPU or CPU <br>
|
|
||||||
`mccloud/subgen:cpu` is for CPU only (slightly smaller image)
|
|
||||||
<br><br>
|
|
||||||
|
|
||||||
# What are the limitations/problems?
|
## 🖥️ Web UI
|
||||||
|
|
||||||
* I made it and know nothing about formal deployment for python coding.
|
The Web UI includes 6 complete views:
|
||||||
* It's using trained AI models to transcribe, so it WILL mess up
|
|
||||||
|
|
||||||
# What's next?
|
| View | Description |
|
||||||
|
|------|-------------|
|
||||||
|
| **Dashboard** | System overview, resource monitoring (CPU/RAM/GPU), recent jobs |
|
||||||
|
| **Queue** | Job management with filters, pagination, retry/cancel actions |
|
||||||
|
| **Scanner** | Scanner control, scheduler configuration, manual scan trigger |
|
||||||
|
| **Rules** | Scan rules CRUD with create/edit modal |
|
||||||
|
| **Workers** | Worker pool management, add/remove workers dynamically |
|
||||||
|
| **Settings** | Database-backed settings organized by category |
|
||||||
|
|
||||||
Fix documentation and make it prettier!
|
---
|
||||||
|
|
||||||
# Audio Languages Supported (via OpenAI)
|
|
||||||
|
|
||||||
Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Marathi, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh.
|
## 🎛️ Configuration
|
||||||
|
|
||||||
# Known Issues
|
### Database-backed Settings
|
||||||
|
|
||||||
At this time, if you have high CPU usage when not actively transcribing on the CPU only docker, try the GPU one.
|
All configuration is stored in the database and manageable via:
|
||||||
|
- **Setup Wizard** (first run)
|
||||||
|
- **Settings page** in Web UI
|
||||||
|
- **Settings API** (`/api/settings`)
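
As an illustration, a script can read and update configuration through that Settings API (a hedged sketch only: the real routes and payload shapes are documented in [docs/API.md](docs/API.md); the `PUT` path and body below are assumptions, `worker_cpu_count` is one of the keys used elsewhere in these docs, and `requests` is a third-party package):

```python
import requests

BASE = "http://localhost:8000/api/settings"

# List current settings (response schema: see docs/API.md)
print(requests.get(BASE, timeout=5).json())

# Update a single setting (illustrative route/payload, not necessarily the real API shape)
requests.put(f"{BASE}/worker_cpu_count", json={"value": "2"}, timeout=5).raise_for_status()
```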
|
||||||
|
|
||||||
# Additional reading:
|
### Settings Categories
|
||||||
|
|
||||||
* https://github.com/openai/whisper (Original OpenAI project)
|
| Category | Settings |
|
||||||
* https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes (2 letter subtitle codes)
|
|----------|----------|
|
||||||
|
| **General** | Operation mode, library paths, log level |
|
||||||
|
| **Workers** | CPU/GPU worker counts, auto-start, health check interval |
|
||||||
|
| **Transcription** | Whisper model, compute type, skip existing files |
|
||||||
|
| **Scanner** | Enable/disable, schedule interval, file watcher |
|
||||||
|
| **Bazarr** | Provider mode (in development) |
|
||||||
|
|
||||||
# Credits:
|
### Environment Variables
|
||||||
* Whisper.cpp (https://github.com/ggerganov/whisper.cpp) for original implementation
|
|
||||||
* Google
|
Only `DATABASE_URL` is required in `.env`:
|
||||||
* ffmpeg
|
|
||||||
* https://github.com/jianfch/stable-ts
|
```bash
|
||||||
* https://github.com/guillaumekln/faster-whisper
|
# SQLite (default)
|
||||||
* Whisper ASR Webservice (https://github.com/ahmetoner/whisper-asr-webservice) for how to implement Bazarr webhooks.
|
DATABASE_URL=sqlite:///./transcriptarr.db
|
||||||
|
|
||||||
|
# PostgreSQL (production)
|
||||||
|
DATABASE_URL=postgresql://user:pass@localhost/transcriptarr
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 📚 Documentation
|
||||||
|
|
||||||
|
| Document | Description |
|
||||||
|
|----------|-------------|
|
||||||
|
| [docs/API.md](docs/API.md) | Complete REST API documentation (45+ endpoints) |
|
||||||
|
| [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) | Backend architecture and components |
|
||||||
|
| [docs/FRONTEND.md](docs/FRONTEND.md) | Frontend structure and components |
|
||||||
|
| [docs/CONFIGURATION.md](docs/CONFIGURATION.md) | Configuration system and settings |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🐳 Docker
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# CPU only
|
||||||
|
docker build -t transcriptorio:cpu -f Dockerfile.cpu .
|
||||||
|
|
||||||
|
# GPU (NVIDIA CUDA)
|
||||||
|
docker build -t transcriptorio:gpu -f Dockerfile .
|
||||||
|
|
||||||
|
# Run
|
||||||
|
docker run -d \
|
||||||
|
-p 8000:8000 \
|
||||||
|
-v /path/to/media:/media \
|
||||||
|
-v /path/to/data:/app/data \
|
||||||
|
--gpus all \
|
||||||
|
transcriptorio:gpu
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 📊 Project Status
|
||||||
|
|
||||||
|
| Component | Status | Progress |
|
||||||
|
|-----------|--------|----------|
|
||||||
|
| Core Backend | ✅ Complete | 100% |
|
||||||
|
| REST API (45+ endpoints) | ✅ Complete | 100% |
|
||||||
|
| Worker System | ✅ Complete | 100% |
|
||||||
|
| Library Scanner | ✅ Complete | 100% |
|
||||||
|
| Web UI (6 views) | ✅ Complete | 100% |
|
||||||
|
| Settings System | ✅ Complete | 100% |
|
||||||
|
| Setup Wizard | ✅ Complete | 100% |
|
||||||
|
| Bazarr Provider | ⏳ In Development | 30% |
|
||||||
|
| Testing Suite | ⏳ Pending | 0% |
|
||||||
|
| Docker | ⏳ Pending | 0% |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 🤝 Contributing
|
||||||
|
|
||||||
|
Contributions are welcome!
|
||||||
|
---
|
||||||
|
|
||||||
|
## 📝 Credits
|
||||||
|
|
||||||
|
Based on [Subgen](https://github.com/McCloudS/subgen) by McCloudS.
|
||||||
|
|
||||||
|
Architecture redesigned with:
|
||||||
|
- FastAPI for REST APIs
|
||||||
|
- SQLAlchemy for persistence
|
||||||
|
- Multiprocessing for workers
|
||||||
|
- Whisper (stable-ts / faster-whisper) for transcription
|
||||||
|
- Vue 3 + Pinia for frontend
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 📄 License
|
||||||
|
|
||||||
|
MIT License - See [LICENSE](LICENSE) for details.
|
backend/README.md (deleted)

@@ -1,185 +0,0 @@
# TranscriptorIO Backend
|
|
||||||
|
|
||||||
This is the redesigned backend for TranscriptorIO, a complete fork of SubGen with a modern asynchronous architecture.
|
|
||||||
|
|
||||||
## 🎯 Goal
|
|
||||||
|
|
||||||
Replace SubGen's synchronous non-persistent system with a modern Tdarr-inspired architecture:
|
|
||||||
- ✅ Persistent queue (SQLite/PostgreSQL/MariaDB)
|
|
||||||
- ✅ Asynchronous processing
|
|
||||||
- ✅ Job prioritization
|
|
||||||
- ✅ Complete state visibility
|
|
||||||
- ✅ No Bazarr timeouts
|
|
||||||
|
|
||||||
## 📁 Structure
|
|
||||||
|
|
||||||
```
|
|
||||||
backend/
|
|
||||||
├── core/
|
|
||||||
│ ├── database.py # Multi-backend database management
|
|
||||||
│ ├── models.py # SQLAlchemy models (Job, etc.)
|
|
||||||
│ ├── queue_manager.py # Asynchronous persistent queue
|
|
||||||
│ └── __init__.py
|
|
||||||
├── api/ # (coming soon) FastAPI endpoints
|
|
||||||
├── config.py # Centralized configuration with Pydantic
|
|
||||||
└── README.md # This file
|
|
||||||
```
|
|
||||||
|
|
||||||
## 🚀 Setup
|
|
||||||
|
|
||||||
### 1. Install dependencies
|
|
||||||
|
|
||||||
```bash
|
|
||||||
pip install -r requirements.txt
|
|
||||||
```
|
|
||||||
|
|
||||||
### 2. Configure .env
|
|
||||||
|
|
||||||
Copy `.env.example` to `.env` and adjust as needed:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
cp .env.example .env
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Database Options
|
|
||||||
|
|
||||||
**SQLite (default)**:
|
|
||||||
```env
|
|
||||||
DATABASE_URL=sqlite:///./transcriptarr.db
|
|
||||||
```
|
|
||||||
|
|
||||||
**PostgreSQL**:
|
|
||||||
```bash
|
|
||||||
pip install psycopg2-binary
|
|
||||||
```
|
|
||||||
```env
|
|
||||||
DATABASE_URL=postgresql://user:password@localhost:5432/transcriptarr
|
|
||||||
```
|
|
||||||
|
|
||||||
**MariaDB/MySQL**:
|
|
||||||
```bash
|
|
||||||
pip install pymysql
|
|
||||||
```
|
|
||||||
```env
|
|
||||||
DATABASE_URL=mariadb+pymysql://user:password@localhost:3306/transcriptarr
|
|
||||||
```
|
|
||||||
|
|
||||||
### 3. Choose operation mode
|
|
||||||
|
|
||||||
**Standalone Mode** (automatically scans your library):
|
|
||||||
```env
|
|
||||||
TRANSCRIPTARR_MODE=standalone
|
|
||||||
LIBRARY_PATHS=/media/anime|/media/movies
|
|
||||||
AUTO_SCAN_ENABLED=True
|
|
||||||
SCAN_INTERVAL_MINUTES=30
|
|
||||||
```
|
|
||||||
|
|
||||||
**Provider Mode** (receives jobs from Bazarr):
|
|
||||||
```env
|
|
||||||
TRANSCRIPTARR_MODE=provider
|
|
||||||
BAZARR_URL=http://bazarr:6767
|
|
||||||
BAZARR_API_KEY=your_api_key
|
|
||||||
```
|
|
||||||
|
|
||||||
**Hybrid Mode** (both simultaneously):
|
|
||||||
```env
|
|
||||||
TRANSCRIPTARR_MODE=standalone,provider
|
|
||||||
```
|
|
||||||
|
|
||||||
## 🧪 Testing
|
|
||||||
|
|
||||||
Run the test script to verify everything works:
|
|
||||||
|
|
||||||
```bash
|
|
||||||
python test_backend.py
|
|
||||||
```
|
|
||||||
|
|
||||||
This will verify:
|
|
||||||
- ✓ Configuration loading
|
|
||||||
- ✓ Database connection
|
|
||||||
- ✓ Table creation
|
|
||||||
- ✓ Queue operations (add, get, deduplicate)
|
|
||||||
|
|
||||||
## 📊 Implemented Components
|
|
||||||
|
|
||||||
### config.py
|
|
||||||
- Centralized configuration with Pydantic
|
|
||||||
- Automatic environment variable validation
|
|
||||||
- Multi-backend database support
|
|
||||||
- Operation mode configuration
|
|
||||||
|
|
||||||
### database.py
|
|
||||||
- Connection management with SQLAlchemy
|
|
||||||
- Support for SQLite, PostgreSQL, MariaDB
|
|
||||||
- Backend-specific optimizations
|
|
||||||
- SQLite: WAL mode, optimized cache
|
|
||||||
- PostgreSQL: connection pooling, pre-ping
|
|
||||||
- MariaDB: utf8mb4 charset, pooling
|
|
||||||
- Health checks and statistics
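
A minimal sketch of what the backend-specific tuning described above can look like with SQLAlchemy (illustrative only, not the actual contents of `database.py`):

```python
from sqlalchemy import create_engine, event

def build_engine(database_url: str):
    """Illustrative per-backend engine setup: WAL for SQLite, pooling elsewhere."""
    if database_url.startswith("sqlite"):
        engine = create_engine(database_url, connect_args={"check_same_thread": False})

        @event.listens_for(engine, "connect")
        def _sqlite_pragmas(dbapi_conn, _record):
            cur = dbapi_conn.cursor()
            cur.execute("PRAGMA journal_mode=WAL")   # better concurrency for readers/writers
            cur.execute("PRAGMA cache_size=-64000")  # ~64 MB page cache
            cur.close()

        return engine

    # PostgreSQL / MariaDB: pooled connections with pre-ping to drop stale ones
    return create_engine(database_url, pool_pre_ping=True, pool_size=5, max_overflow=10)
```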
|
|
||||||
|
|
||||||
### models.py
|
|
||||||
- Complete `Job` model with:
|
|
||||||
- States: queued, processing, completed, failed, cancelled
|
|
||||||
- Stages: pending, detecting_language, transcribing, translating, etc.
|
|
||||||
- Quality presets: fast, balanced, best
|
|
||||||
- Progress tracking (0-100%)
|
|
||||||
- Complete timestamps
|
|
||||||
- Retry logic
|
|
||||||
- Worker assignment
|
|
||||||
- Optimized indexes for common queries
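
As a rough illustration of the model described above (column names mirror the job schema in docs/ARCHITECTURE.md; everything else, including defaults, is an assumption rather than the actual `models.py`):

```python
from datetime import datetime
from sqlalchemy import Column, DateTime, Float, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Job(Base):
    """Illustrative sketch of the Job model, not the real implementation."""
    __tablename__ = "jobs"

    id = Column(String, primary_key=True)
    file_path = Column(String, unique=True, index=True)     # deduplication key
    file_name = Column(String)
    status = Column(String, default="queued", index=True)   # queued/processing/completed/failed/cancelled
    current_stage = Column(String, default="pending")        # pending/detecting_language/transcribing/...
    quality_preset = Column(String, default="balanced")      # fast/balanced/best
    progress = Column(Float, default=0.0)                     # 0-100%
    priority = Column(Integer, default=0, index=True)
    retry_count = Column(Integer, default=0)
    max_retries = Column(Integer, default=3)
    worker_id = Column(String, nullable=True)
    created_at = Column(DateTime, default=datetime.utcnow)
    started_at = Column(DateTime, nullable=True)
    completed_at = Column(DateTime, nullable=True)
```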
|
|
||||||
|
|
||||||
### queue_manager.py
|
|
||||||
- Thread-safe persistent queue
|
|
||||||
- Job prioritization
|
|
||||||
- Duplicate detection
|
|
||||||
- Automatic retry for failed jobs
|
|
||||||
- Real-time statistics
|
|
||||||
- Automatic cleanup of old jobs
|
|
||||||
|
|
||||||
## 🔄 Comparison with SubGen
|
|
||||||
|
|
||||||
| Feature | SubGen | TranscriptorIO |
|
|
||||||
|---------|--------|----------------|
|
|
||||||
| Queue | In-memory (lost on restart) | **Persistent in DB** |
|
|
||||||
| Processing | Synchronous (blocks threads) | **Asynchronous** |
|
|
||||||
| Prioritization | No | **Yes (configurable)** |
|
|
||||||
| Visibility | No progress/ETA | **Progress + real-time ETA** |
|
|
||||||
| Deduplication | Basic (memory only) | **Persistent + intelligent** |
|
|
||||||
| Retries | No | **Automatic with limit** |
|
|
||||||
| Database | No | **SQLite/PostgreSQL/MariaDB** |
|
|
||||||
| Bazarr Timeouts | Yes (>5min = 24h throttle) | **No (async)** |
|
|
||||||
|
|
||||||
## 📝 Next Steps
|
|
||||||
|
|
||||||
1. **Worker Pool** - Asynchronous worker system
|
|
||||||
2. **REST API** - FastAPI endpoints for management
|
|
||||||
3. **WebSocket** - Real-time updates
|
|
||||||
4. **Transcriber** - Whisper wrapper with progress callbacks
|
|
||||||
5. **Bazarr Provider** - Improved async provider
|
|
||||||
6. **Standalone Scanner** - Automatic library scanning
|
|
||||||
|
|
||||||
## 🐛 Troubleshooting
|
|
||||||
|
|
||||||
### Error: "No module named 'backend'"
|
|
||||||
|
|
||||||
Make sure to run scripts from the project root:
|
|
||||||
```bash
|
|
||||||
cd /home/dasemu/Hacking/Transcriptarr
|
|
||||||
python test_backend.py
|
|
||||||
```
|
|
||||||
|
|
||||||
### Error: Database locked (SQLite)
|
|
||||||
|
|
||||||
SQLite is configured with WAL mode for better concurrency. If you still have issues, consider using PostgreSQL for production.
|
|
||||||
|
|
||||||
### Error: pydantic.errors.ConfigError
|
|
||||||
|
|
||||||
Verify that all required variables are in your `.env`:
|
|
||||||
```bash
|
|
||||||
cp .env.example .env
|
|
||||||
# Edit .env with your values
|
|
||||||
```
|
|
||||||
|
|
||||||
## 📚 Documentation
|
|
||||||
|
|
||||||
See `CLAUDE.md` for complete architecture and project roadmap.
|
|
||||||
docs/API.md (new file, 1195 lines): diff suppressed because it is too large
docs/ARCHITECTURE.md (new file, 613 lines)
@@ -0,0 +1,613 @@
|
|||||||
|
# TranscriptorIO Backend Architecture
|
||||||
|
|
||||||
|
Technical documentation of the backend architecture, components, and data flow.
|
||||||
|
|
||||||
|
## Table of Contents
|
||||||
|
|
||||||
|
- [Overview](#overview)
|
||||||
|
- [Directory Structure](#directory-structure)
|
||||||
|
- [Core Components](#core-components)
|
||||||
|
- [Data Flow](#data-flow)
|
||||||
|
- [Database Schema](#database-schema)
|
||||||
|
- [Transcription vs Translation](#transcription-vs-translation)
|
||||||
|
- [Worker Architecture](#worker-architecture)
|
||||||
|
- [Queue System](#queue-system)
|
||||||
|
- [Scanner System](#scanner-system)
|
||||||
|
- [Settings System](#settings-system)
|
||||||
|
- [Graceful Degradation](#graceful-degradation)
|
||||||
|
- [Thread Safety](#thread-safety)
|
||||||
|
- [Important Patterns](#important-patterns)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
TranscriptorIO is built with a modular architecture consisting of:
|
||||||
|
|
||||||
|
- **FastAPI Server**: REST API with 45+ endpoints
|
||||||
|
- **Worker Pool**: Multiprocessing-based transcription workers (CPU/GPU)
|
||||||
|
- **Queue Manager**: Persistent job queue with priority support
|
||||||
|
- **Library Scanner**: Rule-based file scanning with scheduler and watcher
|
||||||
|
- **Settings Service**: Database-backed configuration system
|
||||||
|
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────────────┐
|
||||||
|
│ FastAPI Server │
|
||||||
|
│ ┌─────────────────────────────────────────────────┐ │
|
||||||
|
│ │ REST API (45+ endpoints) │ │
|
||||||
|
│ │ /api/workers | /api/jobs | /api/settings │ │
|
||||||
|
│ │ /api/scanner | /api/system | /api/setup │ │
|
||||||
|
│ └─────────────────────────────────────────────────┘ │
|
||||||
|
└──────────────────┬──────────────────────────────────────┘
|
||||||
|
│
|
||||||
|
┌──────────────┼──────────────┬──────────────────┐
|
||||||
|
│ │ │ │
|
||||||
|
▼ ▼ ▼ ▼
|
||||||
|
┌────────┐ ┌──────────┐ ┌─────────┐ ┌──────────┐
|
||||||
|
│ Worker │ │ Queue │ │ Scanner │ │ Database │
|
||||||
|
│ Pool │◄──┤ Manager │◄──┤ Engine │ │ SQLite/ │
|
||||||
|
│ CPU/GPU│ │ Priority │ │ Rules + │ │ Postgres │
|
||||||
|
└────────┘ │ Queue │ │ Watcher │ └──────────┘
|
||||||
|
└──────────┘ └─────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Directory Structure
|
||||||
|
|
||||||
|
```
|
||||||
|
backend/
|
||||||
|
├── app.py # FastAPI application + lifespan
|
||||||
|
├── cli.py # CLI commands (server, db, worker, scan, setup)
|
||||||
|
├── config.py # Pydantic Settings (from .env)
|
||||||
|
├── setup_wizard.py # Interactive first-run setup
|
||||||
|
│
|
||||||
|
├── core/
|
||||||
|
│ ├── database.py # SQLAlchemy setup + session management
|
||||||
|
│ ├── models.py # Job model + enums
|
||||||
|
│ ├── language_code.py # ISO 639 language code utilities
|
||||||
|
│ ├── settings_model.py # SystemSettings model (database-backed)
|
||||||
|
│ ├── settings_service.py # Settings service with caching
|
||||||
|
│ ├── system_monitor.py # CPU/RAM/GPU/VRAM monitoring
|
||||||
|
│ ├── queue_manager.py # Persistent queue with priority
|
||||||
|
│ ├── worker.py # Individual worker (Process)
|
||||||
|
│ └── worker_pool.py # Worker pool orchestrator
|
||||||
|
│
|
||||||
|
├── transcription/
|
||||||
|
│ ├── __init__.py # Exports + WHISPER_AVAILABLE flag
|
||||||
|
│ ├── transcriber.py # WhisperTranscriber wrapper
|
||||||
|
│ ├── translator.py # Google Translate integration
|
||||||
|
│ └── audio_utils.py # ffmpeg/ffprobe utilities
|
||||||
|
│
|
||||||
|
├── scanning/
|
||||||
|
│ ├── __init__.py # Exports (NO library_scanner import!)
|
||||||
|
│ ├── models.py # ScanRule model
|
||||||
|
│ ├── file_analyzer.py # ffprobe file analysis
|
||||||
|
│ ├── language_detector.py # Audio language detection
|
||||||
|
│ ├── detected_languages.py # Language mappings
|
||||||
|
│ └── library_scanner.py # Scanner + scheduler + watcher
|
||||||
|
│
|
||||||
|
└── api/
|
||||||
|
├── __init__.py # Router exports
|
||||||
|
├── workers.py # Worker management endpoints
|
||||||
|
├── jobs.py # Job queue endpoints
|
||||||
|
├── scan_rules.py # Scan rules CRUD
|
||||||
|
├── scanner.py # Scanner control endpoints
|
||||||
|
├── settings.py # Settings CRUD endpoints
|
||||||
|
├── system.py # System resources endpoints
|
||||||
|
├── filesystem.py # Filesystem browser endpoints
|
||||||
|
└── setup_wizard.py # Setup wizard endpoints
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Core Components
|
||||||
|
|
||||||
|
### 1. WorkerPool (`core/worker_pool.py`)
|
||||||
|
|
||||||
|
Orchestrates CPU/GPU workers as separate processes.
|
||||||
|
|
||||||
|
**Key Features:**
|
||||||
|
- Dynamic add/remove workers at runtime
|
||||||
|
- Health monitoring with auto-restart
|
||||||
|
- Thread-safe multiprocessing
|
||||||
|
- Each worker is an isolated Process
|
||||||
|
|
||||||
|
```python
|
||||||
|
from backend.core.worker_pool import worker_pool
|
||||||
|
from backend.core.worker import WorkerType
|
||||||
|
|
||||||
|
# Add GPU worker on device 0
|
||||||
|
worker_id = worker_pool.add_worker(WorkerType.GPU, device_id=0)
|
||||||
|
|
||||||
|
# Add CPU worker
|
||||||
|
worker_id = worker_pool.add_worker(WorkerType.CPU)
|
||||||
|
|
||||||
|
# Get pool stats
|
||||||
|
stats = worker_pool.get_pool_stats()
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2. QueueManager (`core/queue_manager.py`)
|
||||||
|
|
||||||
|
Persistent SQLite/PostgreSQL queue with priority support.
|
||||||
|
|
||||||
|
**Key Features:**
|
||||||
|
- Job deduplication (no duplicate `file_path`)
|
||||||
|
- Row-level locking with `skip_locked=True`
|
||||||
|
- Priority-based ordering (higher first)
|
||||||
|
- FIFO within same priority (by `created_at`)
|
||||||
|
- Auto-retry failed jobs
|
||||||
|
|
||||||
|
```python
|
||||||
|
from backend.core.queue_manager import queue_manager
|
||||||
|
from backend.core.models import QualityPreset
|
||||||
|
|
||||||
|
job = queue_manager.add_job(
|
||||||
|
file_path="/media/anime.mkv",
|
||||||
|
file_name="anime.mkv",
|
||||||
|
source_lang="jpn",
|
||||||
|
target_lang="spa",
|
||||||
|
quality_preset=QualityPreset.FAST,
|
||||||
|
priority=5
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
### 3. LibraryScanner (`scanning/library_scanner.py`)
|
||||||
|
|
||||||
|
Rule-based file scanning system.
|
||||||
|
|
||||||
|
**Three Scan Modes:**
|
||||||
|
- **Manual**: One-time scan via API or CLI
|
||||||
|
- **Scheduled**: Periodic scanning with APScheduler
|
||||||
|
- **Real-time**: File watcher with watchdog library
|
||||||
|
|
||||||
|
```python
|
||||||
|
from backend.scanning.library_scanner import library_scanner
|
||||||
|
|
||||||
|
# Manual scan
|
||||||
|
result = library_scanner.scan_paths(["/media/anime"], recursive=True)
|
||||||
|
|
||||||
|
# Start scheduler (every 6 hours)
|
||||||
|
library_scanner.start_scheduler(interval_minutes=360)
|
||||||
|
|
||||||
|
# Start file watcher
|
||||||
|
library_scanner.start_file_watcher(paths=["/media/anime"], recursive=True)
|
||||||
|
```
|
||||||
|
|
||||||
|
### 4. WhisperTranscriber (`transcription/transcriber.py`)
|
||||||
|
|
||||||
|
Wrapper for stable-whisper and faster-whisper.
|
||||||
|
|
||||||
|
**Key Features:**
|
||||||
|
- GPU/CPU support with auto-device detection
|
||||||
|
- VRAM management and cleanup
|
||||||
|
- Graceful degradation (works without Whisper installed)
|
||||||
|
|
||||||
|
```python
|
||||||
|
from backend.transcription.transcriber import WhisperTranscriber
|
||||||
|
|
||||||
|
transcriber = WhisperTranscriber(
|
||||||
|
model_name="large-v3",
|
||||||
|
device="cuda",
|
||||||
|
compute_type="float16"
|
||||||
|
)
|
||||||
|
|
||||||
|
result = transcriber.transcribe_file(
|
||||||
|
file_path="/media/episode.mkv",
|
||||||
|
language="jpn",
|
||||||
|
task="translate" # translate to English
|
||||||
|
)
|
||||||
|
|
||||||
|
result.to_srt("episode.eng.srt")
|
||||||
|
```
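
The graceful degradation mentioned above hinges on the `WHISPER_AVAILABLE` flag exported by `transcription/__init__.py`; the usual pattern looks roughly like this (a sketch, not the exact module contents):

```python
# Import guard in the spirit of transcription/__init__.py (illustrative)
try:
    import faster_whisper   # noqa: F401
    import stable_whisper   # noqa: F401  (PyPI package: stable-ts)
    WHISPER_AVAILABLE = True
except ImportError:
    # Whisper backends not installed: API, queue and scanner keep working,
    # but transcription jobs cannot actually be processed.
    WHISPER_AVAILABLE = False
```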
|
||||||
|
|
||||||
|
### 5. SettingsService (`core/settings_service.py`)
|
||||||
|
|
||||||
|
Database-backed configuration with caching.
|
||||||
|
|
||||||
|
```python
|
||||||
|
from backend.core.settings_service import settings_service
|
||||||
|
|
||||||
|
# Get setting
|
||||||
|
value = settings_service.get("worker_cpu_count", default=1)
|
||||||
|
|
||||||
|
# Set setting
|
||||||
|
settings_service.set("worker_cpu_count", "2")
|
||||||
|
|
||||||
|
# Bulk update
|
||||||
|
settings_service.bulk_update({
|
||||||
|
"worker_cpu_count": "2",
|
||||||
|
"scanner_enabled": "true"
|
||||||
|
})
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Data Flow
|
||||||
|
|
||||||
|
```
|
||||||
|
1. LibraryScanner detects file (manual/scheduled/watcher)
|
||||||
|
↓
|
||||||
|
2. FileAnalyzer analyzes with ffprobe
|
||||||
|
- Audio tracks (codec, language, channels)
|
||||||
|
- Embedded subtitles
|
||||||
|
- External .srt files
|
||||||
|
- Duration, video info
|
||||||
|
↓
|
||||||
|
3. Rules Engine evaluates against ScanRules (priority order)
|
||||||
|
- Checks all conditions (audio language, missing subs, etc.)
|
||||||
|
- First matching rule wins
|
||||||
|
↓
|
||||||
|
4. If match → QueueManager.add_job()
|
||||||
|
- Deduplication check (no duplicate file_path)
|
||||||
|
- Assigns priority based on rule
|
||||||
|
↓
|
||||||
|
5. Worker pulls job from queue
|
||||||
|
- Uses with_for_update(skip_locked=True)
|
||||||
|
- FIFO within same priority
|
||||||
|
↓
|
||||||
|
6. WhisperTranscriber processes with model
|
||||||
|
- Stage 1: Audio → English (Whisper translate)
|
||||||
|
- Stage 2: English → Target (Google Translate, if needed)
|
||||||
|
↓
|
||||||
|
7. Generate output SRT file(s)
|
||||||
|
- .eng.srt (always)
|
||||||
|
- .{target}.srt (if translate mode)
|
||||||
|
↓
|
||||||
|
8. Job marked completed ✓
|
||||||
|
```
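
Step 5 above relies on row-level locking so that two workers never claim the same job. A hedged sketch of that query with SQLAlchemy (the session handling and `Job` model are assumed from the components above; `SKIP LOCKED` only has an effect on PostgreSQL-style backends, while SQLite falls back to its WAL-based locking):

```python
from sqlalchemy import select

def claim_next_job(session, Job):
    """Pick the highest-priority queued job, skipping rows locked by other workers."""
    stmt = (
        select(Job)
        .where(Job.status == "queued")
        .order_by(Job.priority.desc(), Job.created_at)   # priority first, FIFO within a priority
        .limit(1)
        .with_for_update(skip_locked=True)               # concurrent workers skip this row
    )
    job = session.execute(stmt).scalars().first()
    if job is not None:
        job.status = "processing"
        session.commit()
    return job
```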
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Database Schema
|
||||||
|
|
||||||
|
### Job Table (`jobs`)
|
||||||
|
|
||||||
|
```sql
|
||||||
|
id VARCHAR PRIMARY KEY
|
||||||
|
file_path VARCHAR UNIQUE -- Ensures no duplicates
|
||||||
|
file_name VARCHAR
|
||||||
|
status VARCHAR -- queued/processing/completed/failed/cancelled
|
||||||
|
priority INTEGER
|
||||||
|
source_lang VARCHAR
|
||||||
|
target_lang VARCHAR
|
||||||
|
quality_preset VARCHAR -- fast/balanced/best
|
||||||
|
transcribe_or_translate VARCHAR -- transcribe/translate
|
||||||
|
progress FLOAT
|
||||||
|
current_stage VARCHAR
|
||||||
|
eta_seconds INTEGER
|
||||||
|
created_at DATETIME
|
||||||
|
started_at DATETIME
|
||||||
|
completed_at DATETIME
|
||||||
|
output_path VARCHAR
|
||||||
|
srt_content TEXT
|
||||||
|
segments_count INTEGER
|
||||||
|
error TEXT
|
||||||
|
retry_count INTEGER
|
||||||
|
max_retries INTEGER
|
||||||
|
worker_id VARCHAR
|
||||||
|
vram_used_mb INTEGER
|
||||||
|
processing_time_seconds FLOAT
|
||||||
|
```
|
||||||
|
|
||||||
|
### ScanRule Table (`scan_rules`)

```sql
id        INTEGER PRIMARY KEY
name      VARCHAR UNIQUE
enabled   BOOLEAN
priority  INTEGER  -- Higher = evaluated first

-- Conditions (all must match):
audio_language_is               VARCHAR  -- ISO 639-2
audio_language_not              VARCHAR  -- Comma-separated
audio_track_count_min           INTEGER
has_embedded_subtitle_lang      VARCHAR
missing_embedded_subtitle_lang  VARCHAR
missing_external_subtitle_lang  VARCHAR
file_extension                  VARCHAR  -- Comma-separated

-- Action:
action_type      VARCHAR  -- transcribe/translate
target_language  VARCHAR
quality_preset   VARCHAR
job_priority     INTEGER

created_at  DATETIME
updated_at  DATETIME
```

### SystemSettings Table (`system_settings`)

```sql
id           INTEGER PRIMARY KEY
key          VARCHAR UNIQUE
value        TEXT
description  TEXT
category     VARCHAR  -- general/workers/transcription/scanner/bazarr
value_type   VARCHAR  -- string/integer/boolean/list
created_at   DATETIME
updated_at   DATETIME
```

---

## Transcription vs Translation

### Understanding the Two Modes

**Mode 1: `transcribe`** (Audio → English subtitles)

```
Audio (any language) → Whisper (task='translate') → English SRT
Example: Japanese audio → anime.eng.srt
```

**Mode 2: `translate`** (Audio → English → Target language)

```
Audio (any language) → Whisper (task='translate') → English SRT
                     → Google Translate → Target language SRT
Example: Japanese audio → anime.eng.srt + anime.spa.srt
```

### Why Two Stages?

**Whisper Limitation**: Whisper can only translate TO English, not between other languages.

**Solution**: Two-stage process (sketched below):

1. **Stage 1 (Always)**: Whisper converts audio to English using `task='translate'`
2. **Stage 2 (Only for translate mode)**: Google Translate converts English to target language

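A minimal sketch of that two-stage flow; the two helper callables stand in for the Whisper and Google Translate steps, and all names here are illustrative rather than the actual TranscriptorIO internals:

```python
from pathlib import Path
from typing import Callable

def generate_subtitles(
    media_path: Path,
    mode: str,                                    # "transcribe" or "translate"
    target_lang: str,                             # ISO 639-2 code, e.g. "spa"
    whisper_to_english: Callable[[Path], str],    # stage 1: audio -> English SRT text
    translate_srt: Callable[[str, str], str],     # stage 2: English SRT -> target SRT
) -> list[Path]:
    """Return the list of SRT files written for one media file."""
    outputs: list[Path] = []

    # Stage 1 (always): Whisper runs with task='translate', producing English
    english_srt = whisper_to_english(media_path)
    eng_path = media_path.with_suffix(".eng.srt")
    eng_path.write_text(english_srt, encoding="utf-8")
    outputs.append(eng_path)

    # Stage 2 (translate mode only): English -> target language via Google Translate
    if mode == "translate" and target_lang != "eng":
        target_srt = translate_srt(english_srt, target_lang)
        target_path = media_path.with_suffix(f".{target_lang}.srt")
        target_path.write_text(target_srt, encoding="utf-8")
        outputs.append(target_path)

    return outputs
```

Stage 1 always produces the `.eng.srt` file, which is why it appears in every row of the table below.
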
### Output Files

| Mode | Target | Output Files |
|------|--------|--------------|
| transcribe | spa | `.eng.srt` only |
| translate | spa | `.eng.srt` + `.spa.srt` |
| translate | fra | `.eng.srt` + `.fra.srt` |

---

## Worker Architecture

### Worker Types

| Type | Description | Device |
|------|-------------|--------|
| CPU | Uses CPU for inference | None |
| GPU | Uses NVIDIA GPU | cuda:N |

### Worker Lifecycle

```
           ┌─────────────┐
           │   CREATED   │
           └──────┬──────┘
                  │ start()
                  ▼
           ┌─────────────┐
┌──────────│    IDLE     │◄─────────┐
│          └──────┬──────┘          │
│                 │ get_job()       │ job_done()
│                 ▼                 │
│          ┌─────────────┐          │
│          │    BUSY     │──────────┘
│          └──────┬──────┘
│                 │ error
│                 ▼
│          ┌─────────────┐
└──────────│    ERROR    │
           └─────────────┘
```

### Process Isolation

Each worker runs in a separate Python process:
- Memory isolation (VRAM per GPU worker)
- Crash isolation (one worker crash doesn't affect others)
- Independent model loading

---

## Queue System

### Priority System

```python
# Priority values
BAZARR_REQUEST = base_priority + 10   # Highest (external request)
MANUAL_REQUEST = base_priority + 5    # High (user-initiated)
AUTO_SCAN      = base_priority        # Normal (scanner-generated)
```

### Job Deduplication

Jobs are deduplicated by `file_path` (see the sketch below):
- If a job already exists with the same `file_path`, the new job is rejected
- `add_job()` returns `None` in that case
- Prevents duplicate processing

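A minimal sketch of that check inside `add_job()` (simplified; the real queue manager also assigns rule-based priority and other metadata, and `Job`/`JobStatus` are the models described above, imported as in the other examples in this document):

```python
from typing import Optional

def add_job(session, file_path: str, priority: int = 0) -> Optional["Job"]:
    """Queue a new job unless one already exists for this file_path."""
    existing = session.query(Job).filter(Job.file_path == file_path).first()
    if existing is not None:
        return None  # duplicate: caller sees None and skips the file

    job = Job(file_path=file_path, status=JobStatus.QUEUED, priority=priority)
    session.add(job)
    return job
```
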
### Concurrency Safety

```python
# Row-level locking prevents race conditions
job = session.query(Job).filter(
    Job.status == JobStatus.QUEUED
).with_for_update(skip_locked=True).first()
```

---

## Scanner System

### Scan Rule Evaluation

Rules are evaluated in priority order (highest first):

```python
# Pseudo-code for rule matching
for rule in rules.order_by(priority.desc()):
    if rule.enabled and matches_all_conditions(file, rule):
        create_job(file, rule.action)
        break  # First match wins
```

### Conditions

All conditions must match (AND logic); a sketch of the matcher follows the table.

| Condition | Match If |
|-----------|----------|
| audio_language_is | Primary audio track language equals |
| audio_language_not | Primary audio track language NOT in list |
| audio_track_count_min | Number of audio tracks >= value |
| has_embedded_subtitle_lang | Has embedded subtitle in language |
| missing_embedded_subtitle_lang | Does NOT have embedded subtitle |
| missing_external_subtitle_lang | Does NOT have external .srt file |
| file_extension | File extension in comma-separated list |

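A minimal sketch of `matches_all_conditions`; only a few conditions are shown, and the attribute names on the analysis object (`primary_audio_language`, `audio_tracks`, `external_subtitle_langs`, `extension`) are assumptions based on the FileAnalyzer description, not the actual field names:

```python
def matches_all_conditions(analysis, rule) -> bool:
    """Illustrative AND-combination of a ScanRule's configured conditions."""
    checks = []

    if rule.audio_language_is:
        checks.append(analysis.primary_audio_language == rule.audio_language_is)
    if rule.audio_language_not:
        excluded = [lang.strip() for lang in rule.audio_language_not.split(",")]
        checks.append(analysis.primary_audio_language not in excluded)
    if rule.audio_track_count_min:
        checks.append(len(analysis.audio_tracks) >= rule.audio_track_count_min)
    if rule.missing_external_subtitle_lang:
        checks.append(rule.missing_external_subtitle_lang not in analysis.external_subtitle_langs)
    if rule.file_extension:
        allowed = [ext.strip() for ext in rule.file_extension.split(",")]
        checks.append(analysis.extension in allowed)

    # AND logic: every configured condition must hold
    return all(checks)
```
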
---

## Settings System

### Categories

| Category | Settings |
|----------|----------|
| general | operation_mode, library_paths, log_level |
| workers | cpu_count, gpu_count, auto_start, healthcheck_interval |
| transcription | whisper_model, compute_type, vram_management |
| scanner | enabled, schedule_interval, watcher_enabled |
| bazarr | provider_enabled, api_key |

### Caching

The settings service implements caching (see the sketch below):
- Cache invalidated on write
- Thread-safe access
- Lazy loading from database

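A minimal sketch of that read-through cache, assuming a plain dict guarded by a lock; the real `settings_service` adds type parsing and default handling, and `SystemSettings` is the model described in the Database Schema section:

```python
import threading

class CachedSettings:
    """Illustrative read-through cache over the system_settings table."""

    def __init__(self, database):
        self._db = database
        self._cache: dict[str, str] = {}
        self._lock = threading.Lock()

    def get(self, key: str, default=None):
        with self._lock:                 # thread-safe read
            if key in self._cache:
                return self._cache[key]
        with self._db.get_session() as session:   # lazy load on cache miss
            row = session.query(SystemSettings).filter(SystemSettings.key == key).first()
        if row is None:
            return default
        with self._lock:
            self._cache[key] = row.value
        return row.value

    def set(self, key: str, value: str) -> None:
        with self._db.get_session() as session:   # persist first (upsert omitted)
            ...
        with self._lock:
            self._cache.pop(key, None)   # invalidate on write
```
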
---

## Graceful Degradation

The system can run WITHOUT Whisper/torch/PyAV installed:

```python
# Pattern used everywhere
try:
    import stable_whisper
    WHISPER_AVAILABLE = True
except ImportError:
    stable_whisper = None
    WHISPER_AVAILABLE = False

# Later in code
if not WHISPER_AVAILABLE:
    raise RuntimeError("Install with: pip install stable-ts faster-whisper")
```

**What works without Whisper:**
- Backend server starts normally
- All APIs work fully
- Frontend development
- Scanner and rules management
- Job queue (jobs just won't be processed)

**What doesn't work:**
- Actual transcription (throws RuntimeError)

---

## Thread Safety

### Database Sessions

Always use context managers:

```python
with database.get_session() as session:
    # Session is automatically committed on success
    # Rolled back on exception
    job = session.query(Job).filter(...).first()
```

### Worker Pool

Each worker runs as its own process (sketched below):
- Each worker is a separate Process (multiprocessing)
- Communication via shared memory (Manager)
- No GIL contention between workers

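A minimal sketch of that layout, using a `multiprocessing.Manager` dict for status reporting; the real pool also does health checks, auto-restart, and job hand-off:

```python
import multiprocessing as mp
import time

def worker_main(worker_id: str, status) -> None:
    # Runs in its own process: its own memory space, its own model, no shared GIL
    status[worker_id] = "idle"
    while True:
        # ... pull a job, set status to "busy", transcribe, set back to "idle" ...
        time.sleep(1)

if __name__ == "__main__":
    manager = mp.Manager()
    status = manager.dict()          # shared state visible to the parent process
    workers = [
        mp.Process(target=worker_main, args=(f"cpu-{i}", status), daemon=True)
        for i in range(2)
    ]
    for proc in workers:
        proc.start()
```
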
### Queue Manager

- Uses SQLAlchemy row locking
- `skip_locked=True` prevents deadlocks
- Transactions are short-lived

---

## Important Patterns

### Circular Import Resolution

**Critical**: `backend/scanning/__init__.py` MUST NOT import `library_scanner`:

```python
# backend/scanning/__init__.py
from backend.scanning.models import ScanRule
from backend.scanning.file_analyzer import FileAnalyzer, FileAnalysis
# DO NOT import library_scanner here!
```

**Why?**

```
library_scanner → database → models → scanning.models → database (circular!)
```

**Solution**: Import `library_scanner` locally where needed:

```python
def some_function():
    from backend.scanning.library_scanner import library_scanner
    library_scanner.scan_paths(...)
```

### Optional Imports

```python
try:
    import pynvml
    NVML_AVAILABLE = True
except ImportError:
    pynvml = None
    NVML_AVAILABLE = False
```

### Database Session Pattern

```python
from backend.core.database import database

with database.get_session() as session:
    # All operations within session context
    job = session.query(Job).filter(...).first()
    job.status = JobStatus.PROCESSING
    # Commit happens automatically
```

### API Response Pattern

```python
from pydantic import BaseModel

class JobResponse(BaseModel):
    id: str
    status: str
    # ...

@router.get("/{job_id}", response_model=JobResponse)
async def get_job(job_id: str):
    with database.get_session() as session:
        job = session.query(Job).filter(Job.id == job_id).first()
        if not job:
            raise HTTPException(status_code=404, detail="Not found")
        return JobResponse(**job.to_dict())
```

402
docs/CONFIGURATION.md
Normal file
@@ -0,0 +1,402 @@

# TranscriptorIO Configuration

Complete documentation for the configuration system.

## Table of Contents

- [Overview](#overview)
- [Configuration Methods](#configuration-methods)
- [Settings Categories](#settings-categories)
- [All Settings Reference](#all-settings-reference)
- [Environment Variables](#environment-variables)
- [Setup Wizard](#setup-wizard)
- [API Configuration](#api-configuration)

---

## Overview

TranscriptorIO uses a **database-backed configuration system**. All settings are stored in the `system_settings` table and can be managed through:

1. **Setup Wizard** (first run)
2. **Web UI** (Settings page)
3. **REST API** (`/api/settings`)
4. **CLI** (for advanced users)

This approach provides:
- Persistent configuration across restarts
- Runtime configuration changes without restart
- Category-based organization
- Type validation and parsing (see the sketch below)

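A minimal sketch of how the stored string could be parsed according to `value_type`; the list separator is an assumption, since some list settings in this document are comma-separated and others pipe-separated:

```python
def parse_setting(raw: str | None, value_type: str):
    """Convert the raw TEXT value from system_settings into a Python value (illustrative)."""
    if raw is None:
        return None
    if value_type == "integer":
        return int(raw)
    if value_type == "boolean":
        return raw.strip().lower() in ("true", "1", "yes")
    if value_type == "list":
        # Assumed separator: comma
        return [item.strip() for item in raw.split(",") if item.strip()]
    return raw  # "string" and anything unrecognized
```
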
---

## Configuration Methods

### 1. Setup Wizard (Recommended for First Run)

```bash
# Runs automatically on first server start
python backend/cli.py server

# Or run manually anytime
python backend/cli.py setup
```

The wizard guides you through:
- **Operation mode selection** (Standalone or Bazarr provider)
- **Library paths configuration**
- **Initial scan rules**
- **Worker configuration** (CPU/GPU counts)
- **Scanner schedule**

### 2. Web UI (Recommended for Daily Use)

Navigate to **Settings** in the web interface (`http://localhost:8000/settings`).

Features:
- Settings grouped by category tabs
- Descriptions for each setting
- Change detection (warns about unsaved changes)
- Bulk save functionality

### 3. REST API (For Automation/Integration)

```bash
# Get all settings
curl http://localhost:8000/api/settings

# Get settings by category
curl "http://localhost:8000/api/settings?category=workers"

# Update a setting
curl -X PUT http://localhost:8000/api/settings/worker_cpu_count \
  -H "Content-Type: application/json" \
  -d '{"value": "2"}'

# Bulk update
curl -X POST http://localhost:8000/api/settings/bulk-update \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "worker_cpu_count": "2",
      "worker_gpu_count": "1"
    }
  }'
```

---

## Settings Categories

| Category | Description |
|----------|-------------|
| `general` | Operation mode, library paths, API server |
| `workers` | CPU/GPU worker configuration |
| `transcription` | Whisper model and transcription options |
| `subtitles` | Subtitle naming and formatting |
| `skip` | Skip conditions for files |
| `scanner` | Library scanner configuration |
| `bazarr` | Bazarr provider integration |
| `advanced` | Advanced options (path mapping, etc.) |

---

## All Settings Reference

### General Settings

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `operation_mode` | string | `standalone` | Operation mode: `standalone`, `provider`, or `standalone,provider` |
| `library_paths` | list | `""` | Comma-separated library paths to scan |
| `api_host` | string | `0.0.0.0` | API server host |
| `api_port` | integer | `8000` | API server port |
| `debug` | boolean | `false` | Enable debug mode |

### Worker Settings

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `worker_cpu_count` | integer | `0` | Number of CPU workers to start on boot |
| `worker_gpu_count` | integer | `0` | Number of GPU workers to start on boot |
| `concurrent_transcriptions` | integer | `2` | Maximum concurrent transcriptions |
| `worker_healthcheck_interval` | integer | `60` | Worker health check interval (seconds) |
| `worker_auto_restart` | boolean | `true` | Auto-restart failed workers |
| `clear_vram_on_complete` | boolean | `true` | Clear VRAM after job completion |

### Transcription Settings

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `whisper_model` | string | `medium` | Whisper model: `tiny`, `base`, `small`, `medium`, `large-v3`, `large-v3-turbo` |
| `model_path` | string | `./models` | Path to store Whisper models |
| `transcribe_device` | string | `cpu` | Device: `cpu`, `cuda`, `gpu` |
| `cpu_compute_type` | string | `auto` | CPU compute type: `auto`, `int8`, `float32` |
| `gpu_compute_type` | string | `auto` | GPU compute type: `auto`, `float16`, `float32`, `int8_float16`, `int8` |
| `whisper_threads` | integer | `4` | Number of CPU threads for Whisper |
| `transcribe_or_translate` | string | `transcribe` | Default mode: `transcribe` or `translate` |
| `word_level_highlight` | boolean | `false` | Enable word-level highlighting |
| `detect_language_length` | integer | `30` | Seconds of audio for language detection |
| `detect_language_offset` | integer | `0` | Offset for language detection sample |

### Whisper Models

| Model | Size | Speed | Quality | VRAM |
|-------|------|-------|---------|------|
| `tiny` | 39M | Fastest | Basic | ~1GB |
| `base` | 74M | Very Fast | Fair | ~1GB |
| `small` | 244M | Fast | Good | ~2GB |
| `medium` | 769M | Medium | Great | ~5GB |
| `large-v3` | 1.5G | Slow | Excellent | ~10GB |
| `large-v3-turbo` | 809M | Fast | Excellent | ~6GB |

### Subtitle Settings

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `subtitle_language_name` | string | `""` | Custom subtitle language name |
| `subtitle_language_naming_type` | string | `ISO_639_2_B` | Naming type: `ISO_639_1`, `ISO_639_2_T`, `ISO_639_2_B`, `NAME`, `NATIVE` |
| `custom_regroup` | string | `cm_sl=84_sl=42++++++1` | Custom regrouping algorithm |

**Language Naming Types:**

| Type | Example (Spanish) |
|------|-------------------|
| ISO_639_1 | `es` |
| ISO_639_2_T | `spa` |
| ISO_639_2_B | `spa` |
| NAME | `Spanish` |
| NATIVE | `Español` |

### Skip Settings

A sketch combining two of these flags follows the table.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `skip_if_external_subtitles_exist` | boolean | `false` | Skip if any external subtitle exists |
| `skip_if_target_subtitles_exist` | boolean | `true` | Skip if target language subtitle exists |
| `skip_if_internal_subtitles_language` | string | `""` | Skip if internal subtitle in this language |
| `skip_subtitle_languages` | list | `""` | Pipe-separated language codes to skip |
| `skip_if_audio_languages` | list | `""` | Skip if audio track is in these languages |
| `skip_unknown_language` | boolean | `false` | Skip files with unknown audio language |
| `skip_only_subgen_subtitles` | boolean | `false` | Only skip SubGen-generated subtitles |

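A heavily simplified sketch of how two of these flags could combine during scanning; the attribute names on the analysis object are assumptions, and the real skip logic covers all of the settings above plus per-language checks:

```python
def should_skip(analysis, settings) -> bool:
    """Return True if the file should not be queued (illustrative only)."""
    # skip_if_external_subtitles_exist: any external subtitle next to the file is enough
    if settings.get("skip_if_external_subtitles_exist") == "true" and analysis.external_subtitles:
        return True

    # skip_unknown_language: drop files whose audio language could not be detected
    if settings.get("skip_unknown_language") == "true" and analysis.audio_language is None:
        return True

    return False
```
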
### Scanner Settings

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `scanner_enabled` | boolean | `true` | Enable library scanner |
| `scanner_cron` | string | `0 2 * * *` | Cron expression for scheduled scans (default: daily at 02:00) |
| `scanner_schedule_interval_minutes` | integer | `360` | Scan interval in minutes (6 hours) |
| `watcher_enabled` | boolean | `false` | Enable real-time file watcher |
| `auto_scan_enabled` | boolean | `false` | Enable automatic scheduled scanning |

### Bazarr Provider Settings

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `bazarr_provider_enabled` | boolean | `false` | Enable Bazarr provider mode |
| `bazarr_url` | string | `http://bazarr:6767` | Bazarr server URL |
| `bazarr_api_key` | string | `""` | Bazarr API key (auto-generated) |
| `provider_timeout_seconds` | integer | `600` | Provider request timeout |
| `provider_callback_enabled` | boolean | `true` | Enable callback on completion |
| `provider_polling_interval` | integer | `30` | Polling interval for jobs |

### Advanced Settings

Path mapping is sketched after the table.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `force_detected_language_to` | string | `""` | Force detected language to specific code |
| `preferred_audio_languages` | list | `eng` | Pipe-separated preferred audio languages |
| `use_path_mapping` | boolean | `false` | Enable path mapping for network shares |
| `path_mapping_from` | string | `/tv` | Path mapping source |
| `path_mapping_to` | string | `/Volumes/TV` | Path mapping destination |
| `lrc_for_audio_files` | boolean | `true` | Generate LRC files for audio-only files |

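A minimal sketch of how path mapping rewrites incoming paths, using the defaults from the table above (a plain prefix swap; the actual implementation may differ):

```python
def map_path(path: str, mapping_from: str = "/tv", mapping_to: str = "/Volumes/TV") -> str:
    """Rewrite a library path for this host, e.g. '/tv/Show/ep1.mkv' -> '/Volumes/TV/Show/ep1.mkv'."""
    if path.startswith(mapping_from):
        return mapping_to + path[len(mapping_from):]
    return path
```
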
---

## Environment Variables

The **only** environment variable required is `DATABASE_URL` in the `.env` file:

```bash
# SQLite (default, good for single-user)
DATABASE_URL=sqlite:///./transcriptarr.db

# PostgreSQL (recommended for production)
DATABASE_URL=postgresql://user:password@localhost:5432/transcriptarr

# MariaDB/MySQL
DATABASE_URL=mariadb+pymysql://user:password@localhost:3306/transcriptarr
```

**All other configuration** is stored in the database and managed through:
- Setup Wizard (first run)
- Web UI Settings page
- Settings API endpoints

This design ensures:
- No `.env` file bloat
- Runtime configuration changes without restart
- Centralized configuration management
- Easy backup (configuration is in the database)

---

## Setup Wizard

### Standalone Mode

For independent operation with local library scanning.

**Configuration Flow:**
1. Select library paths (e.g., `/media/anime`, `/media/movies`)
2. Create initial scan rules (e.g., "Japanese audio → Spanish subtitles")
3. Configure workers (CPU count, GPU count)
4. Set scanner interval (default: 6 hours)

**API Endpoint:** `POST /api/setup/standalone`

```json
{
  "library_paths": ["/media/anime", "/media/movies"],
  "scan_rules": [
    {
      "name": "Japanese to Spanish",
      "audio_language_is": "jpn",
      "missing_external_subtitle_lang": "spa",
      "target_language": "spa",
      "action_type": "translate"
    }
  ],
  "worker_config": {
    "count": 1,
    "type": "cpu"
  },
  "scanner_config": {
    "interval_minutes": 360
  }
}
```

### Bazarr Slave Mode

For integration with Bazarr as a subtitle provider.

**Configuration Flow:**
1. Select Bazarr mode
2. System auto-generates API key
3. Displays connection info for Bazarr configuration

**API Endpoint:** `POST /api/setup/bazarr-slave`

**Response:**
```json
{
  "success": true,
  "message": "Bazarr slave mode configured successfully",
  "bazarr_info": {
    "mode": "bazarr_slave",
    "host": "127.0.0.1",
    "port": 8000,
    "api_key": "generated_api_key_here",
    "provider_url": "http://127.0.0.1:8000"
  }
}
```

---

## API Configuration

### Get All Settings

```bash
curl http://localhost:8000/api/settings
```

### Get by Category

```bash
curl "http://localhost:8000/api/settings?category=workers"
```

### Get Single Setting

```bash
curl http://localhost:8000/api/settings/worker_cpu_count
```

### Update Setting

```bash
curl -X PUT http://localhost:8000/api/settings/worker_cpu_count \
  -H "Content-Type: application/json" \
  -d '{"value": "2"}'
```

### Bulk Update

```bash
curl -X POST http://localhost:8000/api/settings/bulk-update \
  -H "Content-Type: application/json" \
  -d '{
    "settings": {
      "worker_cpu_count": "2",
      "worker_gpu_count": "1",
      "scanner_enabled": "true"
    }
  }'
```

### Create Custom Setting

```bash
curl -X POST http://localhost:8000/api/settings \
  -H "Content-Type: application/json" \
  -d '{
    "key": "my_custom_setting",
    "value": "custom_value",
    "description": "My custom setting",
    "category": "advanced",
    "value_type": "string"
  }'
```

### Delete Setting

```bash
curl -X DELETE http://localhost:8000/api/settings/my_custom_setting
```

### Initialize Defaults

```bash
curl -X POST http://localhost:8000/api/settings/init-defaults
```

---

## Python Usage

```python
from backend.core.settings_service import settings_service

# Get setting with default
cpu_count = settings_service.get("worker_cpu_count", default=1)

# Set setting
settings_service.set("worker_cpu_count", 2)

# Bulk update
settings_service.bulk_update({
    "worker_cpu_count": "2",
    "scanner_enabled": "true"
})

# Get all settings in category
worker_settings = settings_service.get_by_category("workers")

# Initialize defaults (safe to call multiple times)
settings_service.init_default_settings()
```

666
docs/FRONTEND.md
Normal file
@@ -0,0 +1,666 @@

# TranscriptorIO Frontend

Technical documentation for the Vue 3 frontend application.

## Table of Contents

- [Overview](#overview)
- [Technology Stack](#technology-stack)
- [Directory Structure](#directory-structure)
- [Development Setup](#development-setup)
- [Views](#views)
- [Components](#components)
- [State Management](#state-management)
- [API Service](#api-service)
- [Routing](#routing)
- [Styling](#styling)
- [Build and Deployment](#build-and-deployment)

---

## Overview

The TranscriptorIO frontend is a Single Page Application (SPA) built with Vue 3, featuring:

- **6 Complete Views**: Dashboard, Queue, Scanner, Rules, Workers, Settings
- **Real-time Updates**: Polling-based status updates
- **Dark Theme**: Tdarr-inspired dark UI
- **Type Safety**: Full TypeScript support
- **State Management**: Pinia stores for shared state

---

## Technology Stack

| Technology | Version | Purpose |
|------------|---------|---------|
| Vue.js | 3.4+ | UI Framework |
| Vue Router | 4.2+ | Client-side routing |
| Pinia | 2.1+ | State management |
| Axios | 1.6+ | HTTP client |
| TypeScript | 5.3+ | Type safety |
| Vite | 5.0+ | Build tool / dev server |

---

## Directory Structure

```
frontend/
├── public/                        # Static assets (favicon, etc.)
├── src/
│   ├── main.ts                    # Application entry point
│   ├── App.vue                    # Root component + navigation
│   │
│   ├── views/                     # Page components (routed)
│   │   ├── DashboardView.vue      # System overview + resources
│   │   ├── QueueView.vue          # Job management
│   │   ├── ScannerView.vue        # Scanner control
│   │   ├── RulesView.vue          # Scan rules CRUD
│   │   ├── WorkersView.vue        # Worker pool management
│   │   └── SettingsView.vue       # Settings management
│   │
│   ├── components/                # Reusable components
│   │   ├── ConnectionWarning.vue  # Backend connection status
│   │   ├── PathBrowser.vue        # Filesystem browser modal
│   │   └── SetupWizard.vue        # First-run setup wizard
│   │
│   ├── stores/                    # Pinia state stores
│   │   ├── config.ts              # Configuration store
│   │   ├── system.ts              # System status store
│   │   ├── workers.ts             # Workers store
│   │   └── jobs.ts                # Jobs store
│   │
│   ├── services/
│   │   └── api.ts                 # Axios API client
│   │
│   ├── router/
│   │   └── index.ts               # Vue Router configuration
│   │
│   ├── types/
│   │   └── api.ts                 # TypeScript interfaces
│   │
│   └── assets/
│       └── css/
│           └── main.css           # Global styles (dark theme)
│
├── index.html                     # HTML template
├── vite.config.ts                 # Vite configuration
├── tsconfig.json                  # TypeScript configuration
└── package.json                   # Dependencies
```

---

## Development Setup

### Prerequisites

- Node.js 18+ and npm
- Backend server running on port 8000

### Installation

```bash
cd frontend

# Install dependencies
npm install

# Start development server (with proxy to backend)
npm run dev
```

### Development URLs

| URL | Description |
|-----|-------------|
| http://localhost:3000 | Frontend dev server |
| http://localhost:8000 | Backend API |
| http://localhost:8000/docs | Swagger API docs |

### Scripts

```bash
npm run dev      # Start dev server with HMR
npm run build    # Build for production
npm run preview  # Preview production build
npm run lint     # Run ESLint
```

---

## Views

### DashboardView

**Path**: `/`

System overview with real-time resource monitoring.

**Features**:
- System status (running/stopped)
- CPU usage gauge
- RAM usage gauge
- GPU usage gauges (per device)
- Recent jobs list
- Worker pool summary
- Scanner status

**Data Sources**:
- `GET /api/status`
- `GET /api/system/resources`
- `GET /api/jobs?page_size=10`

### QueueView

**Path**: `/queue`

Job queue management with filtering and pagination.

**Features**:
- Job list with status icons
- Status filter (All/Queued/Processing/Completed/Failed)
- Pagination controls
- Retry failed jobs
- Cancel queued/processing jobs
- Clear completed jobs
- Job progress display
- Processing time display

**Data Sources**:
- `GET /api/jobs`
- `GET /api/jobs/stats`
- `POST /api/jobs/{id}/retry`
- `DELETE /api/jobs/{id}`
- `POST /api/jobs/queue/clear`

### ScannerView

**Path**: `/scanner`

Library scanner control and configuration.

**Features**:
- Scanner status display
- Start/stop scheduler
- Start/stop file watcher
- Manual scan trigger
- Scan results display
- Next scan time
- Total files scanned counter

**Data Sources**:
- `GET /api/scanner/status`
- `POST /api/scanner/scan`
- `POST /api/scanner/scheduler/start`
- `POST /api/scanner/scheduler/stop`
- `POST /api/scanner/watcher/start`
- `POST /api/scanner/watcher/stop`

### RulesView

**Path**: `/rules`

Scan rules CRUD management.

**Features**:
- Rules list with priority ordering
- Create new rule (modal)
- Edit existing rule (modal)
- Delete rule (with confirmation)
- Toggle rule enabled/disabled
- Condition configuration
- Action configuration

**Data Sources**:
- `GET /api/scan-rules`
- `POST /api/scan-rules`
- `PUT /api/scan-rules/{id}`
- `DELETE /api/scan-rules/{id}`
- `POST /api/scan-rules/{id}/toggle`

### WorkersView

**Path**: `/workers`

Worker pool management.

**Features**:
- Worker list with status
- Add CPU worker
- Add GPU worker (with device selection)
- Remove worker
- Start/stop pool
- Worker statistics
- Current job display per worker
- Progress and ETA display

**Data Sources**:
- `GET /api/workers`
- `GET /api/workers/stats`
- `POST /api/workers`
- `DELETE /api/workers/{id}`
- `POST /api/workers/pool/start`
- `POST /api/workers/pool/stop`

### SettingsView

**Path**: `/settings`

Database-backed settings management.

**Features**:
- Settings grouped by category
- Category tabs (General, Workers, Transcription, Scanner, Bazarr)
- Edit settings in-place
- Save changes button
- Change detection (unsaved changes warning)
- Setting descriptions

**Data Sources**:
- `GET /api/settings`
- `PUT /api/settings/{key}`
- `POST /api/settings/bulk-update`

---

## Components

### ConnectionWarning

Displays warning banner when backend is unreachable.

**Props**: None

**State**: Uses `systemStore.isConnected`

### PathBrowser

Modal component for browsing filesystem paths.

**Props**:
- `show: boolean` - Show/hide modal
- `initialPath: string` - Starting path

**Emits**:
- `select(path: string)` - Path selected
- `close()` - Modal closed

**API Calls**:
- `GET /api/filesystem/browse?path={path}`
- `GET /api/filesystem/common-paths`

### SetupWizard

First-run setup wizard component.

**Props**: None

**Features**:
- Mode selection (Standalone/Bazarr)
- Library path configuration
- Scan rule creation
- Worker configuration
- Scanner interval setting

**API Calls**:
- `GET /api/setup/status`
- `POST /api/setup/standalone`
- `POST /api/setup/bazarr-slave`
- `POST /api/setup/skip`

---

## State Management

### Pinia Stores

#### systemStore (`stores/system.ts`)

Global system state.

```typescript
interface SystemState {
  isConnected: boolean
  status: SystemStatus | null
  resources: SystemResources | null
  loading: boolean
  error: string | null
}

// Actions
fetchStatus()      // Fetch /api/status
fetchResources()   // Fetch /api/system/resources
startPolling()     // Start auto-refresh
stopPolling()      // Stop auto-refresh
```

#### workersStore (`stores/workers.ts`)

Worker pool state.

```typescript
interface WorkersState {
  workers: Worker[]
  stats: WorkerStats | null
  loading: boolean
  error: string | null
}

// Actions
fetchWorkers()                 // Fetch all workers
fetchStats()                   // Fetch pool stats
addWorker(type, deviceId?)     // Add worker
removeWorker(id)               // Remove worker
startPool(cpuCount, gpuCount)  // Start pool
stopPool()                     // Stop pool
```

#### jobsStore (`stores/jobs.ts`)

Job queue state.

```typescript
interface JobsState {
  jobs: Job[]
  stats: QueueStats | null
  total: number
  page: number
  pageSize: number
  statusFilter: string | null
  loading: boolean
  error: string | null
}

// Actions
fetchJobs()              // Fetch with current filters
fetchStats()             // Fetch queue stats
retryJob(id)             // Retry failed job
cancelJob(id)            // Cancel job
clearCompleted()         // Clear completed jobs
setStatusFilter(status)  // Update filter
setPage(page)            // Change page
```

#### configStore (`stores/config.ts`)

Settings configuration state.

```typescript
interface ConfigState {
  settings: Setting[]
  loading: boolean
  error: string | null
  pendingChanges: Record<string, string>
}

// Actions
fetchSettings(category?)   // Fetch settings
updateSetting(key, value)  // Queue update
saveChanges()              // Save all pending
discardChanges()           // Discard pending
```

---

## API Service

### Configuration (`services/api.ts`)

```typescript
import axios from 'axios'

const api = axios.create({
  baseURL: '/api',
  timeout: 30000,
  headers: {
    'Content-Type': 'application/json'
  }
})

// Response interceptor for error handling
api.interceptors.response.use(
  response => response,
  error => {
    console.error('API Error:', error)
    return Promise.reject(error)
  }
)

export default api
```

### Usage Example

```typescript
import api from '@/services/api'

// GET request
const response = await api.get('/jobs', {
  params: { status_filter: 'queued', page: 1 }
})

// POST request
const job = await api.post('/jobs', {
  file_path: '/media/video.mkv',
  target_lang: 'spa'
})

// PUT request
await api.put('/settings/worker_cpu_count', {
  value: '2'
})

// DELETE request
await api.delete(`/jobs/${jobId}`)
```

---

## Routing

### Route Configuration

```typescript
const routes = [
  { path: '/', name: 'Dashboard', component: DashboardView },
  { path: '/workers', name: 'Workers', component: WorkersView },
  { path: '/queue', name: 'Queue', component: QueueView },
  { path: '/scanner', name: 'Scanner', component: ScannerView },
  { path: '/rules', name: 'Rules', component: RulesView },
  { path: '/settings', name: 'Settings', component: SettingsView }
]
```

### Navigation

Navigation is handled in `App.vue` with a sidebar menu.

```vue
<nav class="sidebar">
  <router-link to="/">Dashboard</router-link>
  <router-link to="/workers">Workers</router-link>
  <router-link to="/queue">Queue</router-link>
  <router-link to="/scanner">Scanner</router-link>
  <router-link to="/rules">Rules</router-link>
  <router-link to="/settings">Settings</router-link>
</nav>

<main class="content">
  <router-view />
</main>
```

---

## Styling

### Dark Theme

The application uses a Tdarr-inspired dark theme defined in `assets/css/main.css`.

**Color Palette**:

| Variable | Value | Usage |
|----------|-------|-------|
| --bg-primary | #1a1a2e | Main background |
| --bg-secondary | #16213e | Card background |
| --bg-tertiary | #0f3460 | Hover states |
| --text-primary | #eaeaea | Primary text |
| --text-secondary | #a0a0a0 | Secondary text |
| --accent-primary | #e94560 | Buttons, links |
| --accent-success | #4ade80 | Success states |
| --accent-warning | #fbbf24 | Warning states |
| --accent-error | #ef4444 | Error states |

### Component Styling

Components use scoped CSS with CSS variables:

```vue
<style scoped>
.card {
  background: var(--bg-secondary);
  border-radius: 8px;
  padding: 1.5rem;
}

.btn-primary {
  background: var(--accent-primary);
  color: white;
  border: none;
  padding: 0.5rem 1rem;
  border-radius: 4px;
  cursor: pointer;
}

.btn-primary:hover {
  opacity: 0.9;
}
</style>
```

---

## Build and Deployment

### Production Build

```bash
cd frontend
npm run build
```

This creates a `dist/` folder with:
- `index.html` - Entry HTML
- `assets/` - JS, CSS bundles (hashed filenames)

### Deployment Options

#### Option 1: Served by Backend (Recommended)

The FastAPI backend automatically serves the frontend from `frontend/dist/`:

```python
# backend/app.py
frontend_path = Path(__file__).parent.parent / "frontend" / "dist"

if frontend_path.exists():
    app.mount("/assets", StaticFiles(directory=str(frontend_path / "assets")))

@app.get("/{full_path:path}")
async def serve_frontend(full_path: str = ""):
    return FileResponse(str(frontend_path / "index.html"))
```

**Access**: http://localhost:8000

#### Option 2: Nginx Reverse Proxy

```nginx
server {
    listen 80;
    server_name transcriptorio.local;

    # Frontend
    location / {
        root /var/www/transcriptorio/frontend/dist;
        try_files $uri $uri/ /index.html;
    }

    # Backend API
    location /api {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

#### Option 3: Docker

```dockerfile
# Build frontend
FROM node:18-alpine AS frontend-builder
WORKDIR /app/frontend
COPY frontend/package*.json ./
RUN npm ci
COPY frontend/ ./
RUN npm run build

# Final image
FROM python:3.12-slim
COPY --from=frontend-builder /app/frontend/dist /app/frontend/dist
# ... rest of backend setup
```

---

## TypeScript Interfaces

### Key Types (`types/api.ts`)

```typescript
// Job
interface Job {
  id: string
  file_path: string
  file_name: string
  status: 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled'
  priority: number
  progress: number
  // ... more fields
}

// Worker
interface Worker {
  worker_id: string
  worker_type: 'cpu' | 'gpu'
  device_id: number | null
  status: 'idle' | 'busy' | 'stopped' | 'error'
  current_job_id: string | null
  jobs_completed: number
  jobs_failed: number
}

// Setting
interface Setting {
  id: number
  key: string
  value: string | null
  description: string | null
  category: string | null
  value_type: string | null
}

// ScanRule
interface ScanRule {
  id: number
  name: string
  enabled: boolean
  priority: number
  conditions: ScanRuleConditions
  action: ScanRuleAction
}
```