# Sync Feature — Implementation Notes
## DB: better-sqlite3 (migrated from @libsql/client)
**Why migrated:** `@libsql/client` caused SIGSEGV (exit code 139) on Azure App Service because its Rust native module is incompatible with the seccomp profile.
**Key constraints of better-sqlite3:**
- All operations are **synchronous** — no `await` on db calls
- `db.transaction()` callback **must be synchronous**: `async (tx) => {}` throws `TypeError: Transaction function cannot return a promise`
- Drizzle ORM for better-sqlite3: `drizzle-orm/better-sqlite3` + `drizzle-orm/better-sqlite3/migrator`
- Result object uses `.changes` (not `.rowsAffected`) → use `(result as any).changes ?? (result as any).rowsAffected ?? 0`
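
The `.changes` fallback from the last bullet can be wrapped in a small helper so call sites do not repeat the `as any` casts. This is a minimal sketch; the names `affectedRows` and `WriteResult` are hypothetical and do not come from the codebase.

```typescript
// Hypothetical helper: normalizes the affected-row count across drivers.
// better-sqlite3 reports `.changes`; @libsql/client reported `.rowsAffected`.
type WriteResult = { changes?: number; rowsAffected?: number };

function affectedRows(result: unknown): number {
  const r = (result ?? {}) as WriteResult;
  return r.changes ?? r.rowsAffected ?? 0;
}
```
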
**Config:**
- `drizzle.config.ts`: `url` without `file:` prefix
- Docker base: `node:22-slim` (prebuilt binaries available)
- No build tools needed in Dockerfile (prebuilts used on Linux)
**Migration:** `migrate(db, { migrationsFolder })` called synchronously at startup in `connection.ts`.
## sync-service.ts
`pushChanges` processes entities in FK order: contexts → topics → historyEntries → ratings → imageBlobs → pages → notebooks → pageNotebooks.
`upsertEntity` logic: insert if not exists, update if `clientVersion > serverVersion`, else record conflict.
**No transaction wrapper** — the old `db.transaction(async ...)` was dead code and was removed. Each entity is upserted individually.
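
The `upsertEntity` decision rule can be sketched as a pure function. This only illustrates the three outcomes described above; `decideUpsert` and `UpsertOutcome` are hypothetical names, and the real function also performs the database writes.

```typescript
type UpsertOutcome = "insert" | "update" | "conflict";

// serverVersion is undefined when the row does not exist on the server yet.
function decideUpsert(
  clientVersion: number,
  serverVersion: number | undefined,
): UpsertOutcome {
  if (serverVersion === undefined) return "insert";
  // Only a strictly newer client version wins; equal versions are conflicts.
  return clientVersion > serverVersion ? "update" : "conflict";
}
```
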
## API Keys (M2M Auth)
Added `api_keys` table in schema.ts. Format: `ka_<base64url-32bytes>`. Only SHA-256 hash stored in DB.
Auth middleware (`auth.ts`) detects `ka_...` Bearer tokens **before** JWT path, looks up by hash, fires `lastUsedAt` update async.
Routes: `GET /api/api-keys`, `POST /api/api-keys`, `DELETE /api/api-keys/:id` (soft-delete).
Raw key returned **once** on creation — not recoverable after that.
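
A minimal sketch of the key format and hash-only storage, using Node's built-in `crypto` module. The function names are hypothetical, and two details are assumptions not stated above: the hash is hex-encoded, and it covers the full key including the `ka_` prefix.

```typescript
import { createHash, randomBytes } from "node:crypto";

const KEY_PREFIX = "ka_";

// Assumption: hex-encoded SHA-256 over the full raw key (prefix included).
function hashApiKey(rawKey: string): string {
  return createHash("sha256").update(rawKey).digest("hex");
}

// Returns the raw key (shown to the user once) and the hash (stored in the DB).
function generateApiKey(): { rawKey: string; keyHash: string } {
  const rawKey = KEY_PREFIX + randomBytes(32).toString("base64url");
  return { rawKey, keyHash: hashApiKey(rawKey) };
}

// Lets the middleware branch to the API-key path before trying JWT validation.
function isApiKeyToken(authorizationHeader: string): boolean {
  return authorizationHeader.startsWith("Bearer " + KEY_PREFIX);
}
```

On an incoming request the middleware would hash the presented token with `hashApiKey` and look up the row by that hash, so the raw key never needs to be stored.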
## Browser API Key (client-side)
`browserApiKey` store in `authStore.ts` backed by `localStorage['ka-note-browser-key']`.
`getAccessToken()` returns browser key first, then falls back to MSAL.
`isAuthenticated` derived from `account OR browserApiKey`.
UI in Settings → "Browser-Authentifizierung" section.
Use case: avoids MSAL silent refresh failures (Azure AD returning HTTP 400 on refresh token).
**401 auto-invalidation:** `authFetch` in `apiClient.ts` detects 401 responses when a browser key was used and calls `setBrowserApiKey(null)` — clears `localStorage` and the store. `isAuthenticated` becomes `false`, restoring the MSAL login UI. Prevents the stuck state where a revoked/expired key blocks both sync and manual login.
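
The token order and the 401 invalidation can be sketched as follows. The dependencies are injected so the example runs outside a browser; `AuthDeps`, `FetchLike`, and `MinimalResponse` are hypothetical stand-ins, and the real `authFetch` signature may differ.

```typescript
// Minimal stand-ins for fetch/Response and the auth store (all hypothetical).
type MinimalResponse = { status: number };
type FetchLike = (
  url: string,
  headers: Record<string, string>,
) => Promise<MinimalResponse>;

interface AuthDeps {
  getBrowserApiKey(): string | null;
  setBrowserApiKey(key: string | null): void; // null clears localStorage and the store
  getMsalToken(): Promise<string>;
  fetchImpl: FetchLike;
}

async function authFetch(url: string, deps: AuthDeps): Promise<MinimalResponse> {
  const browserKey = deps.getBrowserApiKey();
  // Browser key first, MSAL as fallback (same order as getAccessToken()).
  const token = browserKey ?? (await deps.getMsalToken());
  const res = await deps.fetchImpl(url, { Authorization: `Bearer ${token}` });
  // A 401 on a browser-key request means the key was revoked or expired.
  if (res.status === 401 && browserKey !== null) {
    deps.setBrowserApiKey(null);
  }
  return res;
}
```
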
## Deploy
`deploy.ps1` uses normal `docker build` with layer caching. Only changed layers are pushed to ACR.
`COPY server/ server/` triggers a cache miss whenever server source changes → tsc reruns automatically. `--no-cache` is not needed and was removed (it caused all layers to be rebuilt and pushed on every deploy, >40 MB).
## Delta Push (client-side filtering)
`pushAll(since)` only sends entities with `updatedAt > since.toISOString()`. A read-only client sends 0 entities and skips the HTTP request entirely.
`fullSync()` passes `since=null` → pushes everything (used after DB reset or first sync).
**Root cause for all-conflicts:** `upsertEntity` accepts only `clientVersion > serverVersion`. After a pull, the client holds `version == serverVersion`, so pushing an unchanged entity is recorded as a conflict. Delta push filters these equal-version entities out before they are sent. Without delta push, a read-only client would produce 700+ conflicts on every sync cycle.
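
The delta filter can be sketched as below. `filterChanged` and the `Versioned` shape are hypothetical; the sketch assumes `updatedAt` is stored in the same UTC format that `Date.prototype.toISOString()` produces, which is what makes plain string comparison chronological.

```typescript
interface Versioned {
  id: string;
  updatedAt: string; // assumed: ISO 8601 UTC, as produced by toISOString()
}

function filterChanged<T extends Versioned>(entities: T[], since: Date | null): T[] {
  if (since === null) return entities; // fullSync(): push everything
  const cutoff = since.toISOString();
  // Same-format ISO strings compare lexicographically in chronological order.
  return entities.filter((e) => e.updatedAt > cutoff);
}
```

If the filtered result is empty for every table, the caller can skip the HTTP request entirely.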
## Client identification (`X-Client-Id` header)
Each browser instance generates a stable random ID in `localStorage['ka-clientId']`. The header value is `Browser·OS·id` (e.g. `Edge·Win·a3f2`, `Safari·iOS·x9k2`).
Server logs `client:` field in `[sync/push]` and `[sync/pull]` entries.
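
A minimal sketch of the stable ID and the header value. The storage is injected so the example runs without a browser; `KeyValueStore`, `getStableClientId`, and `buildClientHeader` are hypothetical names, and how the browser and OS labels are detected is left out because the notes do not specify it.

```typescript
// Subset of the Web Storage API, so localStorage can be passed in directly.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Returns the stored ID, or creates and persists one on first use.
function getStableClientId(store: KeyValueStore, newId: () => string): string {
  const existing = store.getItem("ka-clientId");
  if (existing !== null) return existing;
  const id = newId();
  store.setItem("ka-clientId", id);
  return id;
}

// Joins the three parts with a middle dot, e.g. "Edge·Win·a3f2".
function buildClientHeader(browser: string, os: string, id: string): string {
  return [browser, os, id].join("·");
}
```
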
## ECONNRESET on POST /api/sync/push
`Error: aborted / ECONNRESET` logged as 500 is a **client-side disconnect**, not a server error. Hono logs 500 because response couldn't be sent. Causes: tab backgrounded on iOS, Azure idle timeout, browser request timeout.
Not actionable on server side unless response time is consistently >10s (indicates oversized payload, e.g. first full sync after DB reset).
## Paginated Pull & Push (since v1.2.6)
**Root cause:** Single pull response was 6.25 MB (all tables at once). iOS Safari aborted both push and pull on full sync (e.g. after local DB reset).
### Pull — table-by-table pagination
`GET /api/sync/pull?since=&table=historyEntries&limit=500&offset=0`
- `table` param routes to `pullTable()` in `sync-service.ts`
- `limit+1` trick detects `hasMore` without extra COUNT query
- **Stable sort**: `ORDER BY updatedAt ASC, id ASC` — mandatory for correct offset pagination (ties on `updatedAt` alone cause skipped/duplicate rows)
- Response: same `SyncPullResponse` shape + optional `hasMore` / `nextOffset`
- Legacy path (no `?table=`): `pullChanges()` unchanged, returns all at once
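
The `limit+1` trick and the stable sort can be illustrated in memory. This is a sketch, not the `pullTable()` implementation: the array slice stands in for a SQL query with `LIMIT limit+1 OFFSET offset`, and `Row`, `byUpdatedAtThenId`, and `pullPage` are hypothetical names.

```typescript
interface Row {
  id: string;
  updatedAt: string;
}

// Stable order: updatedAt ASC, then id ASC as tie-breaker.
function byUpdatedAtThenId(a: Row, b: Row): number {
  if (a.updatedAt !== b.updatedAt) return a.updatedAt < b.updatedAt ? -1 : 1;
  return a.id < b.id ? -1 : a.id > b.id ? 1 : 0;
}

function pullPage(all: Row[], limit: number, offset: number) {
  const sorted = [...all].sort(byUpdatedAtThenId);
  // Fetch one row more than requested; its presence proves another page exists.
  const fetched = sorted.slice(offset, offset + limit + 1);
  const hasMore = fetched.length > limit;
  const rows = hasMore ? fetched.slice(0, limit) : fetched;
  return { rows, hasMore, nextOffset: hasMore ? offset + limit : undefined };
}
```

Without the `id` tie-breaker, rows sharing an `updatedAt` value could be returned in a different order on each request, so a page boundary falling inside such a group could skip or repeat rows.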
**Pull order** (FK-safe, imageBlobs excluded — lazy loaded):
```
contexts → notebooks → topics → pageNotebooks → pages → ratings → historyEntries → tasks
```
Client loops per table until `hasMore === false`, merges each page into Dexie in a **small per-page transaction** (previously one big transaction across all 9 tables).
`lastSyncAt` is set only after all tables complete (last page of last table).
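
The client loop can be sketched as follows. `fetchPage` and `mergePage` are hypothetical injected functions standing in for the HTTP call and the per-page Dexie transaction; the real client code may be structured differently.

```typescript
interface PullPage<T> {
  rows: T[];
  hasMore: boolean;
  nextOffset?: number;
}
type FetchPage<T> = (table: string, offset: number, limit: number) => Promise<PullPage<T>>;
type MergePage<T> = (table: string, rows: T[]) => Promise<void>;

const PULL_ORDER = [
  "contexts", "notebooks", "topics", "pageNotebooks",
  "pages", "ratings", "historyEntries", "tasks",
];

// Returns the number of merged rows. The caller sets lastSyncAt only after
// this promise resolves, i.e. after the last page of the last table.
async function pullAllTables<T>(
  fetchPage: FetchPage<T>,
  mergePage: MergePage<T>,
  limit = 500,
): Promise<number> {
  let total = 0;
  for (const table of PULL_ORDER) {
    let offset = 0;
    let hasMore = true;
    while (hasMore) {
      const page = await fetchPage(table, offset, limit);
      if (page.rows.length > 0) {
        await mergePage(table, page.rows); // one small transaction per page
        total += page.rows.length;
      }
      hasMore = page.hasMore;
      offset = page.nextOffset ?? offset + page.rows.length;
    }
  }
  return total;
}
```

If any page fails, the promise rejects and `lastSyncAt` stays unchanged, so the next sync starts again from the previous `since` value.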
### Push — table-by-table batches
Client sends one table at a time in FK order, 500 entities per batch (10 for imageBlobs). Server `pushChanges()` already handles sparse changes (empty arrays skipped) — no server-side changes needed.
**Push order** (FK-safe, imageBlobs last):
```
contexts → topics → historyEntries → ratings → pages → notebooks → pageNotebooks → tasks → imageBlobs
```
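
A minimal sketch of how the client could split pending changes into per-table batches in this order. `planPushBatches`, `batchSizeFor`, and the `Changes` shape are hypothetical; only the table order and the batch sizes (500, or 10 for imageBlobs) come from the notes above.

```typescript
const PUSH_ORDER = [
  "contexts", "topics", "historyEntries", "ratings", "pages",
  "notebooks", "pageNotebooks", "tasks", "imageBlobs",
] as const;

type TableName = (typeof PUSH_ORDER)[number];
type Changes = Partial<Record<TableName, unknown[]>>;
type PushBatch = { table: TableName; entities: unknown[] };

// imageBlobs are large, so they travel in much smaller batches.
function batchSizeFor(table: TableName): number {
  return table === "imageBlobs" ? 10 : 500;
}

// One batch holds entities of a single table; tables follow FK order.
function planPushBatches(changes: Changes): PushBatch[] {
  const batches: PushBatch[] = [];
  for (const table of PUSH_ORDER) {
    const entities = changes[table] ?? [];
    const size = batchSizeFor(table);
    for (let i = 0; i < entities.length; i += size) {
      batches.push({ table, entities: entities.slice(i, i + size) });
    }
  }
  return batches;
}
```

Each batch would then be sent as its own `POST /api/sync/push` request with all other tables left empty.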
### imageBlobs — lazy on-demand
imageBlobs are **excluded from the pull loop** (1 blob = 13 MB base64, pagination alone doesn't scale).
- `GET /api/sync/blob/:id` — fetches single blob by ID, returns same `ImageBlob` format
- `getImageUrl(id)` in `imageStore.ts`: checks Dexie → if miss, fetches from server → stores in Dexie → returns objectURL
- Push remains eager (blobs must reach server so other devices can lazy-fetch)
- Offline: missing blob → `getImageUrl()` returns `null` → `ka-img:ID` stays unreplaced in markdown → no crash
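
The lookup order in `getImageUrl` can be sketched with injected dependencies. `ImageDeps` and `ImageBlobLike` are hypothetical stand-ins for Dexie, the `GET /api/sync/blob/:id` call, and the real `ImageBlob` type; the actual function takes only the ID.

```typescript
interface ImageBlobLike {
  id: string;
  data: string; // hypothetical field; the real ImageBlob shape is not shown here
}

interface ImageDeps {
  getLocal(id: string): Promise<ImageBlobLike | undefined>; // Dexie lookup
  putLocal(blob: ImageBlobLike): Promise<void>;             // Dexie write
  fetchRemote(id: string): Promise<ImageBlobLike | null>;   // GET /api/sync/blob/:id
  toObjectUrl(blob: ImageBlobLike): string;
}

async function getImageUrl(id: string, deps: ImageDeps): Promise<string | null> {
  const cached = await deps.getLocal(id);
  if (cached) return deps.toObjectUrl(cached);

  let remote: ImageBlobLike | null = null;
  try {
    remote = await deps.fetchRemote(id);
  } catch {
    remote = null; // offline or request failed: treat as a miss, do not crash
  }
  if (remote === null) return null;

  await deps.putLocal(remote); // cache so the next call is served locally
  return deps.toObjectUrl(remote);
}
```
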
### Concurrency guard
`_syncInProgress` flag in `syncService.ts` prevents overlapping sync runs (30s auto-interval + visibilitychange + manual can fire simultaneously; with pagination a sync takes much longer).
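
A minimal sketch of such a guard. `runSyncGuarded` is a hypothetical wrapper name; only the `_syncInProgress` flag comes from the notes above.

```typescript
let _syncInProgress = false;

// Returns false if another sync is already running, true after a completed run.
async function runSyncGuarded(doSync: () => Promise<void>): Promise<boolean> {
  if (_syncInProgress) return false; // another trigger got here first, skip
  _syncInProgress = true;
  try {
    await doSync();
    return true;
  } finally {
    _syncInProgress = false; // always release, even if doSync throws
  }
}
```

The check and the assignment run in the same synchronous step, so two triggers cannot both pass the guard. The `finally` block keeps a failed sync from blocking all later runs.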