docs: add GUI app design spec
commit ebfd751e8d (parent 2464081f4c)

# Whisper Dictation — GUI App Design

**Date:** 2026-03-20
**Status:** Approved

## Overview

Convert the existing `dictate.py` script into a packaged desktop application that:

- Runs without a terminal window on Windows and Linux
- Shows a compact log/status panel accessible from the tray icon
- Can be integrated into the system (autostart, start menu, desktop shortcut)
- Is distributed as a standalone binary built with PyInstaller

## Module Structure

```
whisper-dictation/
├── main.py                    # Entry point
├── build.py                   # PyInstaller build script (platform-specific)
├── whisper-dictation.spec     # PyInstaller spec (manual, for ctranslate2)
├── whisper_app/
│   ├── __init__.py
│   ├── app.py                 # AppState, central coordination, log queue
│   ├── audio.py               # sounddevice stream, audio_callback
│   ├── transcriber.py         # WhisperModel, stop_and_transcribe()
│   ├── hotkey.py              # HotkeyListener (unchanged)
│   ├── config.py              # load/save config + vocab, path resolution
│   ├── typer.py               # type_text(), cross-platform
│   ├── tray.py                # pystray icon + menu
│   ├── log_window.py          # Compact log panel
│   ├── settings_window.py     # Settings dialog
│   ├── vocab_window.py        # Vocabulary dialog
│   ├── overlay.py             # "Recording..." overlay
│   └── installer.py           # System integration (autostart, start menu, shortcut)
├── config.json                # Shared config (git-tracked)
├── vocabulary.json            # Shared vocabulary (git-tracked)
├── start.sh                   # Dev start (Linux)
└── start.bat                  # Dev start (Windows)
```

## Path Handling

All path resolution lives in `config.py`. It must work in two modes: as a frozen PyInstaller binary and as a script run from the git checkout.

```python
import os
import sys


def _app_dir() -> str:
    """Root dir for config.json and vocabulary.json."""
    if getattr(sys, "frozen", False):
        # PyInstaller binary: use the directory containing the executable
        return os.path.dirname(sys.executable)
    # Dev mode: config.py lives in whisper_app/, so the git repo root is
    # the parent of this file's directory
    return os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
```

Machine-local config (`config_local.json`) continues to use `%LOCALAPPDATA%\WhisperDictation` (Windows) or `~/.local/share/WhisperDictation` (Linux); this is unchanged.
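
The machine-local location described above could be resolved along these lines. This is a minimal sketch, not the existing implementation: the helper names `local_config_dir` and `local_config_path` are hypothetical, and the fallback to the home directory when `LOCALAPPDATA` is unset is an assumption.

```python
import os
import sys
from typing import Mapping


def local_config_dir(platform: str = sys.platform,
                     env: Mapping[str, str] = os.environ) -> str:
    """Directory that holds config_local.json (hypothetical helper)."""
    if platform.startswith("win"):
        # Assumption: fall back to the home directory if LOCALAPPDATA is unset
        base = env.get("LOCALAPPDATA") or os.path.expanduser("~")
    else:
        base = os.path.join(os.path.expanduser("~"), ".local", "share")
    return os.path.join(base, "WhisperDictation")


def local_config_path() -> str:
    """Full path to config_local.json; creates the directory if needed."""
    directory = local_config_dir()
    os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, "config_local.json")
```

Passing `platform` and `env` as parameters keeps the function easy to exercise on either operating system without touching the real environment.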

## Logging Architecture

`app.py` owns a `queue.Queue[str]` and a list that buffers messages logged before the queue exists.

```python
import queue

_log_buffer: list[str] = []  # messages logged before the queue is ready
_log_queue: queue.Queue[str] | None = None


def log(msg: str) -> None:
    if _log_queue is not None:
        _log_queue.put(msg)
    else:
        _log_buffer.append(msg)


def set_log_queue(q: queue.Queue[str]) -> None:
    global _log_queue
    _log_queue = q
    # Flush everything that was logged before the queue existed
    for msg in _log_buffer:
        q.put(msg)
    _log_buffer.clear()
```

Every `print()` call in every module is replaced with `app.log()`.
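
If `log()` is called from worker threads (for example the audio callback or the hotkey listener; the spec does not say which threads log, so this is an assumption), a message logged while `set_log_queue()` is running could arrive out of order or be stranded in the buffer. A lock-guarded variant with the same two function names might look like the following sketch; the `_lock` name is illustrative.

```python
import queue
import threading

_lock = threading.Lock()
_log_buffer: list[str] = []  # messages logged before the queue is ready
_log_queue: "queue.Queue[str] | None" = None


def log(msg: str) -> None:
    # The lock makes the "queue or buffer?" decision atomic
    with _lock:
        if _log_queue is not None:
            _log_queue.put(msg)
        else:
            _log_buffer.append(msg)


def set_log_queue(q: "queue.Queue[str]") -> None:
    global _log_queue
    with _lock:
        # Flush buffered messages first so they keep their original order
        for msg in _log_buffer:
            q.put(msg)
        _log_buffer.clear()
        _log_queue = q
```

Because the queue is unbounded, `put()` never blocks, so holding the lock during the flush should not stall callers for long.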

## Log Panel (Compact)

Opened via tray icon left-click or the "Anzeigen" ("Show") menu item. Implemented in `log_window.py`.

```
┌─────────────────────────────────────┐
│ ● WHISPER DICTATION     medium·de ✕ │
├─────────────────────────────────────┤
│ Model ready.                        │
│ Hotkey: ctrl+shift+space            │
│ ● Recording...                      │
│ Audio: 2.1s RMS: 0.048              │
│ ✓ "Das ist ein Test"                │
├─────────────────────────────────────┤
│ ⚙ Einstellungen   📚 Vokabular   🗑  │
└─────────────────────────────────────┘
```

- Size: about 380×220 px, **not resizable**
- `tk.Text` widget in read-only mode, max 200 lines (older lines are discarded)
- Auto-scrolls to the bottom on new messages
- Color tags: green = ready/result, red = recording, yellow = transcribing, grey = info
- Close button → `withdraw()` (hides the window, does not quit the app)
- 🗑 button → clears the text widget
- Queue polled every 100 ms via `root.after(100, _poll_log_queue)`
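
The polling, trimming and auto-scroll behaviour listed above can be sketched as follows. This is an illustration under assumptions, not the planned `log_window.py`: the function names, the tag names (`ok`, `recording`, `busy`, `info`) and the string-matching rules in `tag_for` are all hypothetical.

```python
import queue


def drain(q: "queue.Queue[str]", limit: int = 50) -> list[str]:
    """Take up to `limit` pending messages without blocking the UI thread."""
    items: list[str] = []
    while len(items) < limit:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            break
    return items


def tag_for(msg: str) -> str:
    """Pick a color tag; these matching rules are illustrative only."""
    if msg.startswith("✓") or "ready" in msg.lower():
        return "ok"          # green
    if "Recording" in msg:
        return "recording"   # red
    if "Transcribing" in msg:
        return "busy"        # yellow
    return "info"            # grey


def poll_log_queue(root, text, q, max_lines: int = 200) -> None:
    """Append queued messages to a tk.Text and re-arm the 100 ms timer."""
    messages = drain(q)
    if messages:
        text.configure(state="normal")   # widget is read-only otherwise
        for msg in messages:
            text.insert("end", msg + "\n", tag_for(msg))
        # index("end-1c") is "<line>.<col>" of the final newline
        line_count = int(text.index("end-1c").split(".")[0]) - 1
        if line_count > max_lines:
            # Drop the oldest lines so at most max_lines remain
            text.delete("1.0", f"{line_count - max_lines + 1}.0")
        text.see("end")                  # auto-scroll to the newest line
        text.configure(state="disabled")
    root.after(100, poll_log_queue, root, text, q, max_lines)
```

Capping each poll at `limit` messages keeps a burst of log output from freezing the window; the remainder is picked up 100 ms later.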

## Settings Window — Installation Section

A new "INSTALLATION" section is added to the existing settings window. The integration logic is implemented in `installer.py`.

Each integration shows its status ("eingerichtet" / "nicht eingerichtet", i.e. set up / not set up) and two buttons: "Einrichten" (set up) and "Entfernen" (remove).

| Feature | Windows | Linux |
|---|---|---|
| Autostart on login | `HKCU\Software\Microsoft\Windows\CurrentVersion\Run` | `~/.config/autostart/whisper-dictation.desktop` |
| Start menu entry | `%APPDATA%\Microsoft\Windows\Start Menu\Programs\Whisper Dictation.lnk` | `~/.local/share/applications/whisper-dictation.desktop` |
| Desktop shortcut | `%USERPROFILE%\Desktop\Whisper Dictation.lnk` | `~/Desktop/whisper-dictation.desktop` |

Windows `.lnk` files are created via `pywin32` (`win32com.client.Dispatch("WScript.Shell")`).

**Only available when running as a frozen binary.** In dev mode, the buttons are disabled and show the tooltip "Nur im gebauten Binary verfügbar" ("Only available in the built binary").
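
For the Linux column of the table, each integration boils down to writing or deleting one `.desktop` file. The sketch below shows one possible shape for the install/remove/status trio; the function names and the entry contents are assumptions, and the Windows registry and `.lnk` paths are not covered.

```python
import os
import sys

# Minimal Desktop Entry; paths containing spaces would need quoting
DESKTOP_ENTRY = """[Desktop Entry]
Type=Application
Name=Whisper Dictation
Exec={exe}
Icon={icon}
Terminal=false
"""


def is_frozen() -> bool:
    """True only in a PyInstaller build; the buttons stay disabled otherwise."""
    return bool(getattr(sys, "frozen", False))


def autostart_path(home: str) -> str:
    return os.path.join(home, ".config", "autostart", "whisper-dictation.desktop")


def install_entry(path: str, exe: str, icon: str) -> None:
    """'Einrichten': write the .desktop file, creating parent directories."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(DESKTOP_ENTRY.format(exe=exe, icon=icon))
    # Some desktops only launch shortcuts that are marked executable
    os.chmod(path, 0o755)


def remove_entry(path: str) -> None:
    """'Entfernen': delete the file if it exists."""
    if os.path.exists(path):
        os.remove(path)


def is_installed(path: str) -> bool:
    """Status check behind 'eingerichtet' / 'nicht eingerichtet'."""
    return os.path.isfile(path)
```

In a frozen build, `exe` would typically be `sys.executable`; the same three functions can serve the start menu and desktop rows by passing a different `path`.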

## Icon

The existing tray icon (64×64 PIL `Image`) is extended:

- At build time, `build.py` generates `icon.ico` (sizes: 16, 32, 48, 256) via Pillow
- The generated file is used as the PyInstaller executable icon (`icon='icon.ico'` in the `.spec`) and for `.lnk` shortcuts
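
A minimal sketch of the icon generation step, assuming Pillow is installed. Pillow's ICO writer skips requested sizes that are larger than the source image, so the sketch renders the artwork at 256×256 instead of reusing the 64×64 tray image. The circle drawn here is placeholder artwork; the real icon design is not part of this spec, and the function names are hypothetical.

```python
from PIL import Image, ImageDraw

ICO_SIZES = [(16, 16), (32, 32), (48, 48), (256, 256)]


def make_base_icon(size: int = 256) -> Image.Image:
    """Placeholder artwork: a filled circle on a transparent background."""
    img = Image.new("RGBA", (size, size), (0, 0, 0, 0))
    draw = ImageDraw.Draw(img)
    margin = size // 8
    draw.ellipse((margin, margin, size - margin, size - margin),
                 fill=(220, 50, 50, 255))
    return img


def write_ico(path: str = "icon.ico") -> str:
    """Write a multi-resolution .ico containing every size in ICO_SIZES."""
    make_base_icon(256).save(path, format="ICO", sizes=ICO_SIZES)
    return path
```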

## PyInstaller Build

### Manual `.spec` file (required for ctranslate2/faster-whisper)

```python
# whisper-dictation.spec (key sections)
a = Analysis(
    ['main.py'],
    hiddenimports=[
        'ctranslate2',
        'faster_whisper',
        'sounddevice',
        'pynput.keyboard._win32',  # Windows
        'pynput.keyboard._xorg',   # Linux
    ],
    datas=[
        ('config.json', '.'),
        ('vocabulary.json', '.'),
    ],
)
exe = EXE(a.pure, ..., console=False, icon='icon.ico')
```

### `build.py`

1. Generates `icon.ico` from the PIL image
2. Runs PyInstaller with the `.spec` file
3. Copies `config.json` and `vocabulary.json` into `dist/whisper-dictation/`
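
Steps 2 and 3 above could be sketched as below. The constant and function names are hypothetical, step 1 (icon generation) is left out, and invoking PyInstaller through `python -m PyInstaller` is one option among several.

```python
import os
import shutil
import subprocess
import sys

SPEC_FILE = "whisper-dictation.spec"
DIST_DIR = os.path.join("dist", "whisper-dictation")
SHARED_FILES = ("config.json", "vocabulary.json")


def run_pyinstaller(spec: str = SPEC_FILE) -> None:
    """Step 2: build from the .spec file; raises if PyInstaller fails."""
    # --noconfirm replaces a previous dist/ output without prompting
    subprocess.run(
        [sys.executable, "-m", "PyInstaller", "--noconfirm", spec],
        check=True,
    )


def copy_shared_files(src_dir: str = ".", dist_dir: str = DIST_DIR) -> list[str]:
    """Step 3: place the shared JSON files next to the executable."""
    os.makedirs(dist_dir, exist_ok=True)
    copied = []
    for name in SHARED_FILES:
        dest = shutil.copy2(os.path.join(src_dir, name),
                            os.path.join(dist_dir, name))
        copied.append(dest)
    return copied


def main() -> None:
    run_pyinstaller()
    copy_shared_files()
```

Copying the files next to the executable matches `_app_dir()`, which resolves `config.json` and `vocabulary.json` relative to `sys.executable` in a frozen build.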

### Platform requirement

PyInstaller cannot cross-compile. **The build must be run separately on each platform:**

- Windows: `python build.py` → `dist/whisper-dictation/whisper-dictation.exe`
- Linux: `python build.py` → `dist/whisper-dictation/whisper-dictation`

## New Dependencies

| Package | Purpose |
|---|---|
| `pywin32` | `.lnk` shortcut creation (Windows only) |

`pywin32` is added to `requirements-windows.txt`. It is not required on Linux.

## Out of Scope

- Code signing / notarization
- Auto-updater
- Versioning
- Cross-compilation

## Open Questions

_(none; all resolved during the design session)_