added search

beo3000 2026-02-25 20:51:53 +01:00
parent cb0f38b7f8
commit 684519384b
15 changed files with 444 additions and 31 deletions

docs/feature-search.md Normal file

@ -0,0 +1,136 @@
# Full-Text Search
## Overview
Ka-Note implements a hybrid full-text search strategy: small in-memory corpora (contexts, page titles) are filtered client-side; the large corpus (history entry text, page body) is indexed server-side using SQLite FTS5 and queried via HTTP.
## Architecture
### Search tiers
| Entity | Where | Method |
|---|---|---|
| Contexts (name) | Client only | Substring on in-memory Svelte store |
| Pages (title) | Client only | Substring on in-memory Svelte store |
| HistoryEntries (text) | Server FTS5 | Debounced HTTP GET /api/search |
| Pages (body) | Server FTS5 | Debounced HTTP GET /api/search |
History entries are the primary scaling concern (years of daily journals → tens of thousands of rows). SQLite FTS5 with BM25 ranking handles this efficiently without additional infrastructure.
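The client-only tier amounts to a case-insensitive substring filter over data that is already in memory. A minimal sketch, assuming each store item exposes an `id` and a display `label`; the `NamedItem` type and the `filterLocal` name are hypothetical and not taken from the codebase:

```typescript
// Hypothetical item shape; the real store entries carry more fields.
interface NamedItem {
  id: string;
  label: string; // context name or page title
}

// Case-insensitive substring match over an in-memory list.
function filterLocal<T extends NamedItem>(items: readonly T[], query: string): T[] {
  const q = query.trim().toLowerCase();
  if (q.length < 2) return []; // same two-character minimum as the server tier
  return items.filter((item) => item.label.toLowerCase().includes(q));
}
```

Because this runs synchronously on every keystroke, it only suits corpora small enough to scan in full, which is why history text and page bodies go to the server instead.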
### Offline fallback
When the server is unreachable, CommandBar falls back to local results (contexts, page titles) only and shows the notice "Server nicht erreichbar — nur lokale Ergebnisse" (German for "Server unreachable, local results only").
---
## Server
### FTS5 tables
Migration: `server/drizzle/0013_fts_search.sql`
Two virtual tables using the `unicode61` tokenizer (case-insensitive; with its default settings it strips diacritics, so a query for `fur` also matches `für`; no stemming):
- `fts_history` — content table backed by `history_entries` (columns: `text`)
- `fts_pages` — content table backed by `pages` (columns: `title`, `body`)
Both tables are populated via `INSERT INTO fts_*(...) VALUES('rebuild')` on first migration run.
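To make the tokenizer behaviour concrete: with its default `remove_diacritics=1` setting, `unicode61` case-folds tokens and strips diacritics from Latin characters. The function below is only a rough TypeScript approximation of that folding for a single term, not the tokenizer's actual algorithm; `foldLikeUnicode61` is a hypothetical name.

```typescript
// Rough approximation of unicode61 folding (assumption: Latin-script input).
function foldLikeUnicode61(term: string): string {
  return term
    .normalize("NFD")                 // split "ü" into "u" + combining diaeresis
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining marks
    .toLowerCase();                   // case-fold
}
```

Under this approximation "Über" and "uber" fold to the same string, while "ß" is left untouched because it has no decomposition.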
### Index maintenance
FTS index is updated synchronously after every write, covering all server-side write paths:
| Write path | File | FTS update |
|---|---|---|
| Sync push (primary client sync) | `sync-service.ts` → `pushChanges()` | after each upsert |
| Trash / soft-delete | `routes/trash.ts` | after batch update |
| AI bundle upload (ZIP) | `ai-export-service.ts` → `applyOps()` | after each op |
| AI legacy JSON upload | `ai-export-service.ts` → `applyOps()` | after each op |
| Startup drift recovery | `index.ts` `setImmediate` | full rebuild if mismatch > 10 |
All paths use `better-sqlite3` prepared statements. Shared helper `applyOps()` in `ai-export-service.ts` handles both upload variants. Soft-deleted rows are removed from FTS; active rows are re-indexed via `INSERT OR REPLACE … SELECT`.
**Startup consistency check:** On each server start, row counts of `history_entries` (non-deleted) and `fts_history` are compared. If the difference exceeds 10, both FTS tables are rebuilt via `INSERT INTO fts_*(fts_*) VALUES('rebuild')`. This guards against index drift after DB restores or backup imports.
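The decision rule of the startup check can be isolated as a small pure function. This is a sketch rather than the server's code: `needsRebuild` and `DRIFT_THRESHOLD` are hypothetical names, and the two counts are assumed to have been read from the database beforehand.

```typescript
// Threshold taken from the description above; the name is an assumption.
const DRIFT_THRESHOLD = 10;

// True when the row counts of the content table and the FTS index
// differ by more than the tolerated drift, in either direction.
function needsRebuild(
  activeRows: number,
  indexedRows: number,
  threshold: number = DRIFT_THRESHOLD,
): boolean {
  return Math.abs(activeRows - indexedRows) > threshold;
}
```

Note that a difference of exactly 10 does not trigger a rebuild; only a larger gap does.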
### Raw SQLite access
File: `server/src/db/connection.ts`
The `better-sqlite3` instance is exported as `sqlite` alongside the Drizzle `db`. This is needed for FTS prepared statements (Drizzle has no FTS5 DSL).
### Search endpoint
```
GET /api/search?q=<query>&limit=<n>
Authorization: Bearer <token>
```
Response:
```json
{
"history": [
{ "id": "...", "topicId": "...", "date": "2025-01-15", "snippet": "...text..." }
],
"pages": [
{ "id": "...", "title": "Page Title", "snippet": "...body text..." }
]
}
```
- `q` must be ≥ 2 characters; shorter queries return empty results.
- `limit` defaults to 5 when omitted and is capped at 20 server-side.
- Each whitespace-separated word in `q` is stripped of double quotes, wrapped in quotes, and suffixed with `*` for prefix matching (`"term"*`).
- Results are ranked by BM25 (`ORDER BY rank`).
- FTS5 query errors (invalid syntax from special characters) return empty results instead of HTTP 500.
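- Soft-deleted entries are excluded twice over: they are removed from the FTS index on soft-delete, and the search query additionally filters out ids whose source row has `deleted_at` set.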
File: `server/src/routes/search.ts`
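The query rewriting and limit handling described above can be sketched as two pure helpers. `buildFtsQuery` follows the quoting rule from the list; dropping tokens that become empty after quote removal and rejecting non-numeric limits in `clampLimit` are extra hardening assumed here for illustration, and both function names are hypothetical.

```typescript
// Turn user input into an FTS5 prefix query: `foo bar` -> `"foo"* "bar"*`.
function buildFtsQuery(raw: string): string {
  return raw
    .trim()
    .split(/\s+/)
    .map((token) => token.replace(/"/g, "")) // strip quotes so a token cannot break out of its phrase
    .filter((token) => token.length > 0)      // assumption: skip tokens that were only quotes
    .map((token) => `"${token}"*`)
    .join(" ");
}

// Parse the `limit` query parameter: default 5, at least 1, at most 20.
function clampLimit(raw: string | undefined, fallback = 5, max = 20): number {
  const n = Math.floor(Number(raw ?? fallback));
  if (!Number.isFinite(n)) return fallback; // e.g. limit=abc
  return Math.min(Math.max(n, 1), max);
}
```

Quoting each token keeps FTS5 operators such as `AND` or `NEAR` in user input from being interpreted as query syntax, which reduces how often the empty-result fallback for syntax errors is needed.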
---
## Client
### Settings store
File: `client/src/lib/stores/settings.ts`
Generic key-value settings backed by a Dexie `settings` table (version 13). Provides:
- `getSetting<T>(key, default)` — async one-time read
- `setSetting<T>(key, value)` — async write
- `settingStore<T>(key, default)` — reactive Svelte store backed by `liveQuery`
The `searchResultsLimit` store (default: 3) controls how many server results are requested.
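The default-value semantics of the settings API can be illustrated without Dexie. The sketch below uses an in-memory `Map` as a stand-in for the `settings` table; `memorySettings`, `getSettingSync`, and `setSettingSync` are hypothetical names, and the real functions are asynchronous.

```typescript
// In-memory stand-in for the Dexie `settings` table (illustration only).
const memorySettings = new Map<string, unknown>();

// Return the stored value if the key exists, otherwise the default.
function getSettingSync<T>(key: string, defaultValue: T): T {
  return memorySettings.has(key) ? (memorySettings.get(key) as T) : defaultValue;
}

function setSettingSync<T>(key: string, value: T): void {
  memorySettings.set(key, value);
}
```

The default applies only while no row exists for the key. A stored falsy value such as `0` is returned as-is, so callers that need a minimum should enforce it themselves.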
### CommandBar integration
File: `client/src/lib/components/CommandBar.svelte`
In navigate mode (query ≥ 2 chars, not starting with `/`):
1. **Immediately (sync):** Filters `$contextsQuery` and `$pagesQuery` by substring match on name/title.
2. **After 250ms debounce:** Calls `authFetch('/api/search?q=...&limit=...')` using the existing `apiClient` helper.
3. **On success:** Server results are appended after local results. Pages already found by title match are deduplicated.
4. **On error:** `isOffline = true`, a footer notice is shown, local results remain visible.
5. **Total results** are capped at 10.
History results deep-link to `/context/daily-log?date=YYYY-MM-DD`.
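The merge in steps 3 and 5 can be sketched as a pure function. This is an illustration under simplifying assumptions: local and server results share one id space, and any server label may carry `<mark>` highlight tags from the snippet. `BarResult`, `stripTags`, and `mergeResults` are hypothetical names.

```typescript
interface BarResult {
  id: string;
  label: string;
}

// Remove highlight tags such as <mark>…</mark> from a server snippet.
function stripTags(snippet: string): string {
  return snippet.replace(/<[^>]+>/g, "");
}

// Local results first, then server results not already present, capped overall.
function mergeResults(local: BarResult[], server: BarResult[], cap = 10): BarResult[] {
  const seen = new Set(local.map((r) => r.id));
  const fresh = server
    .filter((r) => !seen.has(r.id))
    .map((r) => ({ ...r, label: stripTags(r.label) }));
  return [...local, ...fresh].slice(0, cap);
}
```

Placing local results first means the cap trims server results before local ones, so the list stays stable while the debounced request is still in flight.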
---
## Settings
| Key | Type | Default | Description |
|---|---|---|---|
| `searchResultsLimit` | number | 3 | Max server search results per entity type |
To change: write to Dexie via `setSetting('searchResultsLimit', 5)` or add a Settings UI field.
---
## Scaling notes
- FTS5 + BM25 scales to millions of rows. No action needed as data grows.
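- The `unicode61` tokenizer handles Unicode correctly. Stemming can be added later by recreating the FTS tables with `tokenize='porter unicode61'` in a new migration, since the tokenizer of an existing FTS5 table cannot be changed in place. Note that `porter` implements English stemming only, so it would not improve matching for German text.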
- If topic title search needs FTS in future, add `fts_topics` following the same pattern.
- Offline full-text search for history (e.g. via MiniSearch in a Web Worker) is a possible v2 enhancement.


@ -1 +1 @@
-1.1.75
+1.1.77


@ -15,6 +15,8 @@
pageNameExists,
} from "$lib/db/repositories";
import { newId, today } from "$lib/db/helpers";
import { authFetch } from "$lib/auth/apiClient";
import { searchResultsLimit } from "$lib/stores/settings";
const contextsQuery = allActiveContexts();
const pagesQuery = allPages();
@ -25,6 +27,77 @@
let recentContextIds = $state<string[]>([]);
let isMac = $state(false);
// Server FTS results
interface ServerResult {
id: string;
type: "nav-history" | "nav-wiki";
icon: string;
label: string;
badge: string;
action: () => void;
}
let serverResults = $state<ServerResult[]>([]);
let isOffline = $state(false);
let searchTimer: ReturnType<typeof setTimeout> | null = null;
$effect(() => {
const q = query.trim();
if (q.length < 2 || q.startsWith("/")) {
serverResults = [];
isOffline = false;
if (searchTimer) clearTimeout(searchTimer);
return;
}
if (searchTimer) clearTimeout(searchTimer);
searchTimer = setTimeout(async () => {
try {
const limit = $searchResultsLimit;
const res = await authFetch(`/api/search?q=${encodeURIComponent(q)}&limit=${limit}`);
if (!res.ok) throw new Error(`HTTP ${res.status}`);
const data = await res.json() as {
history: { id: string; topicId: string; date: string; snippet: string }[];
pages: { id: string; title: string; snippet: string }[];
};
const localPageIds = new Set(
($pagesQuery ?? [])
.filter((p) => p.title.toLowerCase().includes(q.toLowerCase()))
.map((p) => p.id),
);
const combined: ServerResult[] = [
...data.history.map((h) => ({
id: `hist-${h.id}`,
type: "nav-history" as const,
icon: "📓",
label: h.snippet.replace(/<[^>]+>/g, ""),
badge: `JOURNAL ${h.date}`,
action: () => {
closeBar();
goto(`/context/daily-log?date=${h.date}`);
},
})),
...data.pages
.filter((p) => !localPageIds.has(p.id))
.map((p) => ({
id: `page-fts-${p.id}`,
type: "nav-wiki" as const,
icon: "📄",
label: p.title,
badge: "WIKI",
action: () => {
closeBar();
goto(`/wiki/${p.id}`);
},
})),
];
serverResults = combined;
isOffline = false;
} catch {
isOffline = true;
serverResults = [];
}
}, 250);
});
onMount(() => {
isMac = navigator.platform.toUpperCase().indexOf("MAC") >= 0;
});
@ -474,7 +547,7 @@
}
}
-return searchResults;
+return [...searchResults, ...serverResults].slice(0, 10);
});
// Reset selection when query changes
@ -627,6 +700,11 @@
Keine Ergebnisse für "{query}"
</div>
{/if}
{#if isOffline}
<div class="px-4 py-1.5 text-[11px] text-zinc-500 border-t border-zinc-700">
Server nicht erreichbar — nur lokale Ergebnisse
</div>
{/if}
</div>
</div>
{/if}


@ -12,6 +12,12 @@ export interface ImageBlob {
version: number;
}
export interface AppSetting {
key: string;
value: unknown;
updatedAt: string;
}
export class KaNoteDB extends Dexie {
contexts!: EntityTable<AgendaContext, 'id'>;
topics!: EntityTable<Topic, 'id'>;
@ -22,6 +28,7 @@ export class KaNoteDB extends Dexie {
pages!: EntityTable<Page, 'id'>;
notebooks!: EntityTable<Notebook, 'id'>;
pageNotebooks!: EntityTable<PageNotebook, 'id'>;
settings!: EntityTable<AppSetting, 'key'>;
constructor() {
super('ka-note');
@ -126,6 +133,10 @@ export class KaNoteDB extends Dexie {
if (p.isFavorite === undefined) p.isFavorite = false;
});
});
this.version(13).stores({
settings: '&key',
});
}
}


@ -0,0 +1,32 @@
import { liveQuery } from 'dexie';
import { readable } from 'svelte/store';
import { db } from '$lib/db/schema.js';
function now(): string {
return new Date().toISOString();
}
export async function getSetting<T>(key: string, defaultValue: T): Promise<T> {
const row = await db.settings.get(key);
return row !== undefined ? (row.value as T) : defaultValue;
}
export async function setSetting<T>(key: string, value: T): Promise<void> {
await db.settings.put({ key, value, updatedAt: now() });
}
/** Reactive store for a single setting. Updates when the DB changes. */
export function settingStore<T>(key: string, defaultValue: T) {
return readable<T>(defaultValue, (set) => {
const subscription = liveQuery(async () => {
const row = await db.settings.get(key);
return row !== undefined ? (row.value as T) : defaultValue;
}).subscribe({
next: (v) => set(v),
error: (e) => console.error('[settings] liveQuery error', e),
});
return () => subscription.unsubscribe();
});
}
export const searchResultsLimit = settingStore<number>('searchResultsLimit', 3);


@ -0,0 +1,7 @@
CREATE VIRTUAL TABLE IF NOT EXISTS fts_history USING fts5(id UNINDEXED, user_id UNINDEXED, text, date UNINDEXED, topic_id UNINDEXED, content='history_entries', content_rowid='rowid', tokenize='unicode61');
--> statement-breakpoint
CREATE VIRTUAL TABLE IF NOT EXISTS fts_pages USING fts5(id UNINDEXED, user_id UNINDEXED, title, body, content='pages', content_rowid='rowid', tokenize='unicode61');
--> statement-breakpoint
INSERT INTO fts_history(fts_history) VALUES('rebuild');
--> statement-breakpoint
INSERT INTO fts_pages(fts_pages) VALUES('rebuild');


@ -92,6 +92,13 @@
"when": 1772004047537,
"tag": "0012_chunky_stature",
"breakpoints": true
},
{
"idx": 13,
"version": "6",
"when": 1772100000000,
"tag": "0013_fts_search",
"breakpoints": true
}
]
}

Binary file not shown.

Binary file not shown.


@ -21,6 +21,7 @@ sqlite.pragma('journal_mode = WAL');
sqlite.pragma('foreign_keys = ON');
export const db = drizzle(sqlite, { schema });
export { sqlite };
// Run migrations on startup
const migrationsFolder = path.resolve(__dirname, '../../drizzle');


@ -15,7 +15,9 @@ import adminRoutes from './routes/admin.js';
import pushRoutes from './routes/push.js';
import backupRoutes from './routes/backup.js';
import apiKeyRoutes from './routes/api-keys.js';
import searchRoutes from './routes/search.js';
import { runScheduledBackup, runIfMissed, checkIntegrity } from './lib/backup-service.js';
import { sqlite } from './db/connection.js';
const app = new OpenAPIHono();
@ -27,13 +29,26 @@ app.onError((err, c) => {
return c.json({ error: 'internal server error', detail: err.message }, 500);
});
-// Integrity check at startup
+// Integrity check + FTS consistency check at startup
setImmediate(() => {
try {
checkIntegrity();
} catch (e) {
console.error('[db] startup integrity_check threw:', e);
}
+try {
+const heCount = (sqlite.prepare('SELECT COUNT(*) AS n FROM history_entries WHERE deleted_at IS NULL').get() as { n: number }).n;
+const ftsCount = (sqlite.prepare('SELECT COUNT(*) AS n FROM fts_history').get() as { n: number }).n;
+if (Math.abs(heCount - ftsCount) > 10) {
+console.warn(`[fts] Index mismatch (history_entries=${heCount}, fts_history=${ftsCount}), rebuilding...`);
+sqlite.prepare("INSERT INTO fts_history(fts_history) VALUES('rebuild')").run();
+sqlite.prepare("INSERT INTO fts_pages(fts_pages) VALUES('rebuild')").run();
+console.log('[fts] Rebuild complete');
+}
+} catch (e) {
+console.error('[fts] startup check threw:', e);
+}
});
// Public routes
@ -81,6 +96,10 @@ app.route('/api/backup', backupRoutes);
app.use('/api/api-keys/*', authMiddleware);
app.route('/api/api-keys', apiKeyRoutes);
app.use('/api/search/*', authMiddleware);
app.use('/api/search', authMiddleware);
app.route('/api/search', searchRoutes);
// OpenAPI spec + Scalar UI
app.openAPIRegistry.registerComponent('securitySchemes', 'BearerAuth', {
type: 'http',


@ -1,5 +1,5 @@
import { randomUUID } from 'crypto';
-import { db } from '../db/connection.js';
+import { db, sqlite } from '../db/connection.js';
import { aiLocks, contexts, topics, historyEntries, ratings, imageBlobs, pages, notebooks, pageNotebooks } from '../db/schema.js';
import { eq, and, sql } from 'drizzle-orm';
import { zipSync, unzipSync, strToU8, strFromU8 } from 'fflate';
@ -8,6 +8,34 @@ import { generateAiReadme } from './ai-agent-readme.js';
const LOCK_EXPIRY_HOURS = Number(process.env.AI_LOCK_EXPIRY_HOURS ?? 24);
const stmtFtsHistoryUpsert = sqlite.prepare(`INSERT OR REPLACE INTO fts_history(rowid, id, user_id, text, date, topic_id) SELECT rowid, id, user_id, text, date, topic_id FROM history_entries WHERE id = ? AND user_id = ?`);
const stmtFtsHistoryDelete = sqlite.prepare(`DELETE FROM fts_history WHERE id = ? AND user_id = ?`);
const stmtFtsPagesUpsert = sqlite.prepare(`INSERT OR REPLACE INTO fts_pages(rowid, id, user_id, title, body) SELECT rowid, id, user_id, title, body FROM pages WHERE id = ? AND user_id = ?`);
const stmtFtsPagesDelete = sqlite.prepare(`DELETE FROM fts_pages WHERE id = ? AND user_id = ?`);
type AiTableDef = typeof contexts | typeof topics | typeof historyEntries | typeof ratings | typeof imageBlobs | typeof pages | typeof notebooks | typeof pageNotebooks;
async function applyOps(ops: Array<{ action: 'insert' | 'update'; table: AiTableDef; row: Record<string, unknown> }>, userId: string): Promise<number> {
let accepted = 0;
for (const op of ops) {
if (op.action === 'insert') {
await db.insert(op.table).values(op.row as never);
} else {
await db.update(op.table).set(op.row as never)
.where(and(sql`${op.table.id} = ${op.row.id}`, sql`${op.table.userId} = ${userId}`));
}
accepted++;
if (op.table === historyEntries) {
if (op.row.deletedAt) stmtFtsHistoryDelete.run(op.row.id, userId);
else stmtFtsHistoryUpsert.run(op.row.id, userId);
} else if (op.table === pages) {
if (op.row.deletedAt) stmtFtsPagesDelete.run(op.row.id, userId);
else stmtFtsPagesUpsert.run(op.row.id, userId);
}
}
return accepted;
}
function now(): string {
return new Date().toISOString();
}
@ -195,7 +223,7 @@ export interface AiUploadResult {
conflicts: Array<{ entityType: string; entityId: string; clientVersion: number; serverVersion: number }>;
}
-type TableDef = typeof contexts | typeof topics | typeof historyEntries | typeof ratings | typeof imageBlobs | typeof pages | typeof notebooks | typeof pageNotebooks;
+type TableDef = AiTableDef;
async function checkConflict(
table: TableDef,
@ -389,17 +417,7 @@ export async function applyUploadFromZip(
return { result: { accepted: 0, skipped, conflicts }, conflict: true };
}
-let accepted = 0;
-for (const op of ops) {
-if (op.action === 'insert') {
-await db.insert(op.table).values(op.row as never);
-} else {
-await db.update(op.table).set(op.row as never)
-.where(and(sql`${op.table.id} = ${op.row.id}`, sql`${op.table.userId} = ${userId}`));
-}
-accepted++;
-}
+const accepted = await applyOps(ops, userId);
return { result: { accepted, skipped, conflicts }, conflict: false };
}
@ -489,16 +507,6 @@ export async function applyUpload(
return { result: { accepted: 0, skipped, conflicts }, conflict: true };
}
-let accepted = 0;
-for (const op of ops) {
-if (op.action === 'insert') {
-await db.insert(op.table).values(op.row as never);
-} else {
-await db.update(op.table).set(op.row as never)
-.where(and(sql`${op.table.id} = ${op.row.id}`, sql`${op.table.userId} = ${userId}`));
-}
-accepted++;
-}
+const accepted = await applyOps(ops, userId);
return { result: { accepted, skipped, conflicts }, conflict: false };
}


@ -1,4 +1,4 @@
-import { db } from '../db/connection.js';
+import { db, sqlite } from '../db/connection.js';
import { contexts, topics, historyEntries, ratings, imageBlobs, pages, notebooks, pageNotebooks } from '../db/schema.js';
import { and, gt, eq, sql, isNotNull, lt } from 'drizzle-orm';
import type {
@ -8,6 +8,24 @@ import type {
type TableDef = typeof contexts | typeof topics | typeof historyEntries | typeof ratings | typeof imageBlobs | typeof pages | typeof notebooks | typeof pageNotebooks;
// FTS index maintenance (prepared statements for performance)
const stmtFtsHistoryUpsert = sqlite.prepare(`
INSERT OR REPLACE INTO fts_history(rowid, id, user_id, text, date, topic_id)
SELECT rowid, id, user_id, text, date, topic_id
FROM history_entries WHERE id = ? AND user_id = ?
`);
const stmtFtsHistoryDelete = sqlite.prepare(`
DELETE FROM fts_history WHERE id = ? AND user_id = ?
`);
const stmtFtsPagesUpsert = sqlite.prepare(`
INSERT OR REPLACE INTO fts_pages(rowid, id, user_id, title, body)
SELECT rowid, id, user_id, title, body
FROM pages WHERE id = ? AND user_id = ?
`);
const stmtFtsPagesDelete = sqlite.prepare(`
DELETE FROM fts_pages WHERE id = ? AND user_id = ?
`);
function now(): string {
return new Date().toISOString();
}
@ -242,7 +260,14 @@ export async function pushChanges(request: SyncPushRequest, userId: string): Pro
purgedAt: he.purgedAt ?? null,
version: he.version,
};
-if (await upsertEntity(historyEntries, row, conflicts, 'historyEntry', userId)) accepted++;
+if (await upsertEntity(historyEntries, row, conflicts, 'historyEntry', userId)) {
+accepted++;
+if (he.deletedAt) {
+stmtFtsHistoryDelete.run(he.id, userId);
+} else {
+stmtFtsHistoryUpsert.run(he.id, userId);
+}
+}
}
for (const rat of rats) {
@ -278,7 +303,14 @@ export async function pushChanges(request: SyncPushRequest, userId: string): Pro
for (const pg of pgs) {
const row = { id: pg.id, userId, title: pg.title, body: pg.body, isPrivate: pg.isPrivate, isFavorite: pg.isFavorite ?? false, sortOrder: pg.sortOrder, updatedAt: pg.updatedAt, deletedAt: pg.deletedAt, purgedAt: pg.purgedAt ?? null, version: pg.version };
-if (await upsertEntity(pages, row, conflicts, 'page', userId)) accepted++;
+if (await upsertEntity(pages, row, conflicts, 'page', userId)) {
+accepted++;
+if (pg.deletedAt) {
+stmtFtsPagesDelete.run(pg.id, userId);
+} else {
+stmtFtsPagesUpsert.run(pg.id, userId);
+}
+}
}
for (const nb of nbs) {


@ -0,0 +1,80 @@
import { Hono } from 'hono';
import { sqlite } from '../db/connection.js';
import type { AuthEnv } from '../middleware/auth.js';
const search = new Hono<AuthEnv>();
interface HistoryResult {
id: string;
topicId: string;
date: string;
snippet: string;
}
interface PageResult {
id: string;
title: string;
snippet: string;
}
const stmtSearchHistory = sqlite.prepare(`
SELECT
id,
topic_id AS topicId,
date,
snippet(fts_history, 2, '<mark>', '</mark>', '...', 12) AS snippet
FROM fts_history
WHERE fts_history MATCH ?
AND user_id = ?
AND id NOT IN (
SELECT id FROM history_entries WHERE deleted_at IS NOT NULL AND user_id = ?
)
ORDER BY rank
LIMIT ?
`);
const stmtSearchPages = sqlite.prepare(`
SELECT
id,
title,
snippet(fts_pages, 2, '<mark>', '</mark>', '...', 12) AS snippet
FROM fts_pages
WHERE fts_pages MATCH ?
AND user_id = ?
AND id NOT IN (
SELECT id FROM pages WHERE deleted_at IS NOT NULL AND user_id = ?
)
ORDER BY rank
LIMIT ?
`);
search.get('/', (c) => {
const auth = c.get('auth');
const userId = auth.userId;
const q = c.req.query('q')?.trim() ?? '';
const limit = Math.min(Number(c.req.query('limit') ?? 5), 20);
if (q.length < 2) {
return c.json({ history: [], pages: [] });
}
// Build FTS5 prefix query: each word gets a trailing * for prefix matching
const ftsQuery = q
.split(/\s+/)
.filter(Boolean)
.map((t) => `"${t.replace(/"/g, '')}"*`)
.join(' ');
try {
const history = stmtSearchHistory.all(ftsQuery, userId, userId, limit) as HistoryResult[];
const pages = stmtSearchPages.all(ftsQuery, userId, userId, limit) as PageResult[];
return c.json({ history, pages });
} catch (err) {
// FTS5 query syntax errors (e.g. special chars) → return empty rather than 500
console.warn('[search] FTS query error:', err instanceof Error ? err.message : err);
return c.json({ history: [], pages: [] });
}
});
export default search;


@ -1,5 +1,5 @@
import { Hono } from 'hono';
-import { db } from '../db/connection.js';
+import { db, sqlite } from '../db/connection.js';
import { contexts, topics, historyEntries, ratings } from '../db/schema.js';
import { and, eq, inArray, sql } from 'drizzle-orm';
import { handle } from '../lib/route-utils.js';
@ -74,6 +74,8 @@ trash.delete('/', handle('trash/delete', async (c) => {
if (historyIds.size > 0) {
await db.update(historyEntries).set({ deletedAt: ts, purgedAt: ts, updatedAt: ts, version: sql`${historyEntries.version} + 1` })
.where(and(eq(historyEntries.userId, userId), inArray(historyEntries.id, [...historyIds])));
const stmtDel = sqlite.prepare('DELETE FROM fts_history WHERE id = ? AND user_id = ?');
for (const id of historyIds) stmtDel.run(id, userId);
}
if (topicIds.size > 0) {
await db.update(topics).set({ deletedAt: ts, purgedAt: ts, updatedAt: ts, version: sql`${topics.version} + 1` })