docs: add GUI app design spec

2026-03-20 11:12:40 +01:00 · 2026-03-20 11:12:40 +01:00 · ebfd751e8d
parent 2464081f4c
commit ebfd751e8d
1 changed files with 184 additions and 0 deletions
--- a/docs/superpowers/specs/2026-03-20-gui-app-design.md
+++ b/docs/superpowers/specs/2026-03-20-gui-app-design.md
@@ -0,0 +1,184 @@
 # Whisper Dictation — GUI App Design
 **Date:** 2026-03-20
 **Status:** Approved
 ## Overview
 Convert the existing `dictate.py` script into a proper packaged desktop application that:
 - Runs without a terminal window on Windows and Linux
 - Shows a compact log/status panel accessible from the tray icon
 - Can be integrated into the system (autostart, start menu, desktop shortcut)
 - Is distributed as a standalone binary via PyInstaller
 ## Module Structure
 ```
 whisper-dictation/
 ├── main.py                        # Entry point
 ├── build.py                       # PyInstaller build script (platform-specific)
 ├── whisper-dictation.spec         # PyInstaller spec (manual, for ctranslate2)
 ├── whisper_app/
 │   ├── __init__.py
 │   ├── app.py                     # AppState, central coordination, log queue
 │   ├── audio.py                   # sounddevice stream, audio_callback
 │   ├── transcriber.py             # WhisperModel, stop_and_transcribe()
 │   ├── hotkey.py                  # HotkeyListener (unchanged)
 │   ├── config.py                  # load/save config + vocab, path resolution
 │   ├── typer.py                   # type_text(), cross-platform
 │   ├── tray.py                    # pystray icon + menu
 │   ├── log_window.py              # Compact log panel
 │   ├── settings_window.py         # Settings dialog
 │   ├── vocab_window.py            # Vocabulary dialog
 │   ├── overlay.py                 # "Recording..." overlay
 │   └── installer.py               # System integration (autostart, start menu, shortcut)
 ├── config.json                    # Shared config (git-tracked)
 ├── vocabulary.json                # Shared vocabulary (git-tracked)
 ├── start.sh                       # Dev start (Linux)
 └── start.bat                      # Dev start (Windows)
 ```
 ## Path Handling
 All path resolution lives in `config.py`. It must work in two modes, as a frozen PyInstaller binary and in dev mode from the git checkout:
 ```python
 import sys, os
 def _app_dir() -> str:
    """Root dir for config.json and vocabulary.json."""
    if getattr(sys, "frozen", False):
        # PyInstaller binary: use directory containing the executable
        return os.path.dirname(sys.executable)
    else:
        # Dev mode: config.py lives in whisper_app/, so go up two levels
        # from this file to reach the git repo root
        return os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
 ```
 Machine-local config (`config_local.json`) is unchanged: it stays in `%LOCALAPPDATA%\WhisperDictation` (Windows) or `~/.local/share/WhisperDictation` (Linux).
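The machine-local directory could be resolved along the following lines. This is a minimal sketch: the helper name `local_config_dir` and its `platform` and `env` parameters are hypothetical and exist only to keep the function testable, and the fallback used when `LOCALAPPDATA` is unset is an assumption, not part of the spec.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def local_config_dir(
    platform: str = sys.platform,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Return the per-machine directory that holds config_local.json."""
    env = os.environ if env is None else env
    if platform.startswith("win"):
        # Assumption: fall back to ~/AppData/Local if LOCALAPPDATA is unset.
        base = env.get("LOCALAPPDATA") or str(Path.home() / "AppData" / "Local")
        return Path(base) / "WhisperDictation"
    # Linux: fixed location from the spec (XDG_DATA_HOME is not consulted).
    return Path.home() / ".local" / "share" / "WhisperDictation"
```

Called without arguments, the function uses the running platform and the real environment.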
 ## Logging Architecture
 `app.py` owns a `queue.Queue[str]` plus a buffer list that holds messages logged before the queue exists; `set_log_queue()` replays the buffer into the queue.
 ```python
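 import queue
 import threading
 _log_lock = threading.Lock()   # log() may be called from several threads
 _log_buffer: list[str] = []    # messages logged before the queue is ready
 _log_queue: queue.Queue[str] | None = None
 def log(msg: str) -> None:
    with _log_lock:
        if _log_queue is not None:
            _log_queue.put(msg)
        else:
            _log_buffer.append(msg)
 def set_log_queue(q: queue.Queue[str]) -> None:
    global _log_queue
    with _log_lock:
        # Flush buffered messages first so they keep their original order
        for msg in _log_buffer:
            q.put(msg)
        _log_buffer.clear()
        _log_queue = q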
 ```
 All `print()` calls in all modules are replaced with `app.log()`.
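Third-party libraries may still write to `sys.stdout`, and in a windowed PyInstaller build on Windows `sys.stdout` and `sys.stderr` can be `None`. One possible safety net, which is not part of the approved design, is a small file-like shim that forwards complete lines to the log function. `LogWriter` and its `sink` parameter are hypothetical names; in the app the sink would be `app.log`.

```python
from typing import Callable


class LogWriter:
    """File-like object that forwards each complete line to a log callable."""

    def __init__(self, sink: Callable[[str], None]) -> None:
        self._sink = sink
        self._pending = ""  # text received since the last newline

    def write(self, text: str) -> int:
        self._pending += text
        # Emit every finished line; keep the unfinished tail for later.
        *lines, self._pending = self._pending.split("\n")
        for line in lines:
            if line.strip():          # skip blank lines
                self._sink(line)
        return len(text)

    def flush(self) -> None:
        # Emit whatever is left, even without a trailing newline.
        if self._pending.strip():
            self._sink(self._pending)
        self._pending = ""
```

Installing it at startup would then be a single assignment such as `sys.stdout = sys.stderr = LogWriter(app.log)`.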
 ## Log Panel (Compact)
 Opened by left-clicking the tray icon or via the "Anzeigen" ("Show") menu item. Implemented in `log_window.py`.
 ```
 ┌─────────────────────────────────────┐
 │ ● WHISPER DICTATION    medium·de  ✕ │
 ├─────────────────────────────────────┤
 │ Model ready.                        │
 │ Hotkey: ctrl+shift+space            │
 │ ● Recording...                      │
 │ Audio: 2.1s  RMS: 0.048             │
 │ ✓ "Das ist ein Test"                │
 ├─────────────────────────────────────┤
 │ ⚙ Einstellungen  📚 Vokabular  🗑  │
 └─────────────────────────────────────┘
 ```
 - Size: ~380×220px, **not resizable**
 - `tk.Text` widget in read-only mode, max 200 lines (older lines discarded)
 - Auto-scroll to bottom on new messages
 - Color tags: green = ready/result, red = recording, yellow = transcribing, grey = info
 - Close button → `withdraw()` (does not quit the app)
 - 🗑 button → clears the text widget
 - Queue polled every 100 ms: `_poll_log_queue` reschedules itself via `root.after(100, _poll_log_queue)`
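The polling, trimming and colour-tag behaviour listed above might look roughly like this. It is a sketch under assumptions: the tag names and the prefix-based rules in `tag_for` are hypothetical, and `poll_log_queue` takes the Tk root, the `tk.Text` widget and the queue as parameters so that the pure helpers can be exercised without a display.

```python
import queue
from typing import List

MAX_LINES = 200          # from the spec: older lines are discarded
POLL_INTERVAL_MS = 100   # from the spec: root.after(100, ...)


def drain(q: "queue.Queue[str]") -> List[str]:
    """Remove and return every message currently waiting in the queue."""
    messages: List[str] = []
    while True:
        try:
            messages.append(q.get_nowait())
        except queue.Empty:
            return messages


def tag_for(msg: str) -> str:
    """Pick a colour tag for a message (hypothetical prefix-based rules)."""
    if msg.startswith("✓") or "ready" in msg.lower():
        return "ok"          # green
    if msg.startswith("●"):
        return "recording"   # red
    if "transcrib" in msg.lower():
        return "busy"        # yellow
    return "info"            # grey


def poll_log_queue(root, text, q) -> None:
    """Append queued messages to a tk.Text widget, then reschedule itself."""
    messages = drain(q)
    if messages:
        text.configure(state="normal")     # temporarily leave read-only mode
        for msg in messages:
            text.insert("end", msg + "\n", tag_for(msg))
        # Trim from the top until at most MAX_LINES lines remain.
        line_count = int(text.index("end-1c").split(".")[0]) - 1
        if line_count > MAX_LINES:
            text.delete("1.0", f"{line_count - MAX_LINES + 1}.0")
        text.configure(state="disabled")
        text.see("end")                    # auto-scroll to the newest line
    root.after(POLL_INTERVAL_MS, poll_log_queue, root, text, q)
```

The colours themselves would be attached once with `text.tag_configure(...)` when the window is built.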
 ## Settings Window — Installation Section
 A new "INSTALLATION" section is added to the existing settings window; the integration logic itself lives in `installer.py`.
 Each integration shows its status ("eingerichtet" / "nicht eingerichtet", i.e. set up / not set up) and two buttons: "Einrichten" (set up) and "Entfernen" (remove).
 | Feature | Windows | Linux |
 |---|---|---|
 | Autostart at login | `HKCU\Software\Microsoft\Windows\CurrentVersion\Run` | `~/.config/autostart/whisper-dictation.desktop` |
 | Start menu entry | `%APPDATA%\Microsoft\Windows\Start Menu\Programs\Whisper Dictation.lnk` | `~/.local/share/applications/whisper-dictation.desktop` |
 | Desktop shortcut | `%USERPROFILE%\Desktop\Whisper Dictation.lnk` | `~/Desktop/whisper-dictation.desktop` |
 Windows `.lnk` files are created via `pywin32` (`win32com.client.Dispatch("WScript.Shell")`).
 **Only available when running as a frozen binary.** In dev mode, the buttons are disabled and show the tooltip "Nur im gebauten Binary verfügbar" ("Only available in the built binary").
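For the Linux autostart row, the set up / remove / status trio could be sketched as follows. All function names are hypothetical, the `.desktop` keys shown are a minimal selection, and the `Exec` quoting is simplified (paths containing quotes or backslashes would need escaping). The Windows side, which goes through the registry and `pywin32`, is not shown.

```python
import sys
from pathlib import Path

AUTOSTART_FILE = Path.home() / ".config" / "autostart" / "whisper-dictation.desktop"


def is_frozen() -> bool:
    """True when running from a PyInstaller binary (same check as config.py)."""
    return bool(getattr(sys, "frozen", False))


def desktop_entry(exec_path: str) -> str:
    """Build the text of a minimal .desktop launcher for the given binary."""
    return "\n".join([
        "[Desktop Entry]",
        "Type=Application",
        "Name=Whisper Dictation",
        f'Exec="{exec_path}"',
        "Terminal=false",
        "",
    ])


def autostart_status(path: Path = AUTOSTART_FILE) -> str:
    """Status label shown next to the buttons."""
    return "eingerichtet" if path.exists() else "nicht eingerichtet"


def install_autostart(exec_path: str, path: Path = AUTOSTART_FILE) -> None:
    """Write the launcher, creating ~/.config/autostart if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(desktop_entry(exec_path), encoding="utf-8")


def remove_autostart(path: Path = AUTOSTART_FILE) -> None:
    """Delete the launcher; a missing file is not an error (Python 3.8+)."""
    path.unlink(missing_ok=True)
```

In the settings window, the buttons would be enabled only when `is_frozen()` returns true, and `install_autostart(sys.executable)` would register the running binary.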
 ## Icon
 The existing tray icon (64×64 PIL `Image`) also becomes the application icon:
 - At build time, `build.py` generates `icon.ico` (sizes: 16, 32, 48, 256) via Pillow
 - `icon.ico` is used as the executable icon (`icon=` in the `.spec`) and for `.lnk` shortcuts
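The icon step could be sketched as below. Pillow's ICO writer is documented to ignore requested sizes larger than the source image, so with a 64×64 source the 256 px entry would be dropped unless the image is upscaled (or rendered larger) first. The function name `write_ico` is hypothetical, and upscaling with LANCZOS is one possible choice, not something the spec prescribes.

```python
from PIL import Image

ICO_SIZES = [(16, 16), (32, 32), (48, 48), (256, 256)]  # sizes from the spec


def write_ico(source: Image.Image, out_path: str = "icon.ico") -> None:
    """Save a multi-resolution .ico built from the tray image."""
    largest = max(ICO_SIZES)
    # Upscale first so the 256 px entry is not skipped for exceeding the source.
    base = source.convert("RGBA").resize(largest, Image.LANCZOS)
    base.save(out_path, format="ICO", sizes=ICO_SIZES)
```

Drawing the tray image at 256×256 in the first place would give a sharper result than upscaling a 64×64 bitmap.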
 ## PyInstaller Build
 ### Manual `.spec` file (required for ctranslate2/faster-whisper)
 ```python
 # whisper-dictation.spec (key sections)
 a = Analysis(
    ['main.py'],
    hiddenimports=[
        'ctranslate2',
        'faster_whisper',
        'sounddevice',
        'pynput.keyboard._win32',   # Windows
        'pynput.keyboard._xorg',    # Linux
    ],
    datas=[
        ('config.json', '.'),
        ('vocabulary.json', '.'),
    ],
 )
 pyz = PYZ(a.pure)
 exe = EXE(pyz, a.scripts, ..., console=False, icon='icon.ico')
 ```
 ### `build.py`
 1. Generates `icon.ico` from the tray icon image via Pillow
 2. Runs PyInstaller with the `.spec` file
 3. Copies `config.json` and `vocabulary.json` into `dist/whisper-dictation/`
 ### Platform requirement
 PyInstaller cannot cross-compile. **Build must be run separately on each platform:**
 - Windows: `python build.py` → `dist/whisper-dictation/whisper-dictation.exe`
 - Linux: `python build.py` → `dist/whisper-dictation/whisper-dictation`
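Steps 2 and 3 of `build.py` might be organised like this. It is a sketch, not the actual script: the helper names are hypothetical, step 1 (icon generation) is left out, and `--noconfirm` is added on the assumption that overwriting an existing `dist/` folder without a prompt is wanted.

```python
import shutil
import subprocess
import sys
from pathlib import Path
from typing import List

SPEC_FILE = "whisper-dictation.spec"
DIST_DIR = Path("dist") / "whisper-dictation"
SHARED_FILES = ("config.json", "vocabulary.json")


def pyinstaller_command(spec: str = SPEC_FILE) -> List[str]:
    """Command line that runs PyInstaller with the current interpreter."""
    return [sys.executable, "-m", "PyInstaller", "--noconfirm", spec]


def copy_shared_files(src_dir: Path, dist_dir: Path) -> List[Path]:
    """Copy config.json and vocabulary.json next to the built executable."""
    dist_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src_dir / name, dist_dir / name)) for name in SHARED_FILES]


def main() -> None:
    # Step 1 (icon generation) is omitted in this sketch.
    subprocess.run(pyinstaller_command(), check=True)   # step 2: build from the spec
    copy_shared_files(Path.cwd(), DIST_DIR)             # step 3: ship shared config
```

Running through `sys.executable -m PyInstaller` keeps the build inside the active virtual environment on both platforms.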
 ## New Dependencies
 | Package | Purpose |
 |---|---|
 | `pywin32` | `.lnk` shortcut creation (Windows only) |
 Added to `requirements-windows.txt`. Not required on Linux.
 ## Out of Scope
 - Code signing / notarization
 - Auto-updater
 - Versioning
 - Cross-compilation
 ## Open Questions
 _(none — all resolved during design session)_