Photo-based book cataloger with AI identification. Room → Cabinet → Shelf → Book hierarchy; FastAPI + SQLite backend; vanilla JS SPA; OpenAI-compatible plugin system for boundary detection, text recognition, and archive search.
Bookshelf — Technical Overview
Purpose
Photo-based book cataloger. Hierarchy: Room → Cabinet → Shelf → Book. AI plugins identify spine text; archive plugins supply bibliographic metadata.
Stack
- Server: FastAPI + SQLite (no ORM), Python 3.11+, Poetry (poetry run serve)
- Frontend: Single-file vanilla JS SPA (static/index.html)
- AI: OpenAI-compatible API (OpenRouter, OpenAI, etc.) via the openai library
- Images: Stored uncompressed in data/images/; Pillow used server-side for crops and AI prep
Directory Layout
app.py # FastAPI routes only
storage.py # DB schema/helpers, settings loading, photo file I/O
logic.py # Image processing, boundary helpers, plugin runners, batch pipeline
scripts.py # Poetry console entry points: fmt, presubmit
plugins/
    __init__.py # Registry: load_plugins(), get_manifest(), get_plugin()
    rate_limiter.py # Thread-safe per-domain rate limiter (one global instance)
    ai_compat/
        __init__.py # Exports the four AI plugin classes
        _client.py # Internal: AIClient (openai wrapper, JSON extractor)
        boundary_detector_shelves.py # BoundaryDetectorShelvesPlugin
        boundary_detector_books.py # BoundaryDetectorBooksPlugin
        text_recognizer.py # TextRecognizerPlugin
        book_identifier.py # BookIdentifierPlugin
    archives/
        openlibrary.py # OpenLibrary JSON API
        rsl.py # RSL AJAX JSON API
        html_scraper.py # Config-driven HTML scraper (rusneb, alib, shpl)
        sru_catalog.py # SRU XML catalog (nlr)
        telegram_bot.py # STUB (pending Telegram credentials)
static/index.html # Full SPA (no build step)
config/
    providers.default.yaml # Provider credentials (placeholder api_key)
    prompts.default.yaml # Default prompt templates
    plugins.default.yaml # Default plugin configurations
    ui.default.yaml # Default UI settings
    providers.user.yaml # ← create this with your real api_key (gitignored)
    *.user.yaml # Optional overrides for other categories (gitignored)
data/ # Runtime: books.db + images/
docs/overview.md # This file
Configuration System
Config is loaded from config/*.default.yaml merged with config/*.user.yaml overrides.
Deep merge: dicts are merged recursively; lists in user files replace default lists entirely.
Categories: providers, prompts, plugins, ui — each loaded from its own pair of files.
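The merge rule can be sketched in a few lines. This is a minimal illustration and not the project's actual loader: the function name deep_merge, the example base_url, and the order key are assumptions made up for the demo.

```python
from typing import Any


def deep_merge(default: dict[str, Any], user: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict: nested dicts merge recursively, everything else is replaced."""
    merged = dict(default)  # shallow copy so the defaults are not mutated
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from the user file replace the default outright.
            merged[key] = value
    return merged


# Hypothetical contents of a *.default.yaml / *.user.yaml pair after parsing.
defaults = {
    "providers": {"openrouter": {"base_url": "https://example.invalid/v1", "api_key": "placeholder"}},
    "order": ["a", "b"],
}
overrides = {
    "providers": {"openrouter": {"api_key": "sk-or-your-actual-key"}},
    "order": ["c"],
}
merged = deep_merge(defaults, overrides)
```

Here merged keeps the default base_url, takes the user's api_key, and holds only the user's list under order.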
Minimal setup — create config/providers.user.yaml:
providers:
  openrouter:
    api_key: "sk-or-your-actual-key"
Plugin System
Categories
| Category | Input | Output | DB field |
|---|---|---|---|
| boundary_detector (target=shelves) | cabinet image | {boundaries:[…], confidence:N} | cabinets.ai_shelf_boundaries |
| boundary_detector (target=books) | shelf image | {boundaries:[…]} | shelves.ai_book_boundaries |
| text_recognizer | spine image | {raw_text, title, author, …} | books.raw_text + candidates |
| book_identifier | raw_text | {title, author, …, confidence} | books.ai_* + candidates |
| archive_searcher | query string | [{source, title, author, …}, …] | books.candidates |
Universal plugin endpoint
POST /api/{entity_type}/{entity_id}/plugin/{plugin_id}
Routes to the correct runner function in logic.py based on plugin category.
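Category-based routing of this kind is often written as a lookup table from category to runner. The sketch below only illustrates that pattern: the runner names, the PLUGIN_CATEGORIES mapping, and the plugin id shelves_v1 are hypothetical stand-ins, not the functions in logic.py.

```python
from typing import Callable

Runner = Callable[[str, int, str], str]


# Hypothetical runner stubs; real runners would load images, call plugins, write to the DB.
def run_boundary_detector(entity_type: str, entity_id: int, plugin_id: str) -> str:
    return f"boundary_detector:{plugin_id} on {entity_type} {entity_id}"


def run_archive_searcher(entity_type: str, entity_id: int, plugin_id: str) -> str:
    return f"archive_searcher:{plugin_id} on {entity_type} {entity_id}"


RUNNERS: dict[str, Runner] = {
    "boundary_detector": run_boundary_detector,
    "archive_searcher": run_archive_searcher,
}

# Hypothetical manifest excerpt: plugin id -> category.
PLUGIN_CATEGORIES: dict[str, str] = {
    "shelves_v1": "boundary_detector",
    "openlibrary": "archive_searcher",
}


def run_plugin(entity_type: str, entity_id: int, plugin_id: str) -> str:
    """Look up the plugin's category, then call that category's runner."""
    category = PLUGIN_CATEGORIES.get(plugin_id)
    if category is None:
        raise KeyError(f"unknown plugin: {plugin_id}")
    return RUNNERS[category](entity_type, entity_id, plugin_id)
```

A call such as run_plugin("books", 7, "openlibrary") corresponds to POST /api/books/7/plugin/openlibrary.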
AI Plugin Configuration
- Providers (config/providers.*.yaml): connection credentials only (base_url, api_key).
- Per-plugin (config/plugins.*.yaml): provider, model, optionally max_image_px (default 1600), confidence_threshold (default 0.8).
- OUTPUT_FORMAT is a hardcoded class constant in each plugin class, not user-configurable. It is substituted into the prompt template as ${OUTPUT_FORMAT} by AIClient.call().
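The ${OUTPUT_FORMAT} placeholder uses the same syntax as Python's string.Template, so the substitution step can be sketched with it. Whether AIClient.call() actually uses string.Template is an assumption; the class name, the constant's value, and the prompt text below are invented for illustration.

```python
from string import Template


class TextRecognizerSketch:
    # Hypothetical value; each real plugin class defines its own constant.
    OUTPUT_FORMAT = '{"raw_text": "...", "title": "...", "author": "..."}'


def render_prompt(template: str, output_format: str) -> str:
    """Fill ${OUTPUT_FORMAT}; safe_substitute leaves any other placeholder untouched."""
    return Template(template).safe_substitute(OUTPUT_FORMAT=output_format)


prompt = render_prompt(
    "Read the spine text. Reply as JSON: ${OUTPUT_FORMAT}",
    TextRecognizerSketch.OUTPUT_FORMAT,
)
```

Keeping the format in the class rather than in the user-editable prompt means a customised prompt cannot drift away from the JSON shape the parser expects.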
Archive Plugin Interface
All archive plugins implement search(query: str) -> list[CandidateRecord].
CandidateRecord: TypedDict with {source, title, author, year, isbn, publisher}.
All archive plugins share the RATE_LIMITER singleton for per-domain throttling.
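A minimal sketch of this interface follows. The field names come from the list above, but the field types, the RateLimiter implementation, its wait() method, the 0.05 s interval, and the EchoArchive plugin are all assumptions for illustration, not the code in plugins/rate_limiter.py.

```python
import threading
import time
from typing import TypedDict


class CandidateRecord(TypedDict):
    source: str
    title: str
    author: str
    year: str  # assumed type; the real schema may differ
    isbn: str
    publisher: str


class RateLimiter:
    """Enforce a minimum interval between calls to the same domain."""

    def __init__(self, min_interval: float) -> None:
        self._min_interval = min_interval
        self._lock = threading.Lock()
        self._next_allowed: dict[str, float] = {}

    def wait(self, domain: str) -> float:
        """Block until the domain may be called again; return the seconds slept."""
        with self._lock:
            now = time.monotonic()
            slot = max(now, self._next_allowed.get(domain, now))
            # Reserve the next slot while holding the lock so threads queue up in order.
            self._next_allowed[domain] = slot + self._min_interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)
        return delay


RATE_LIMITER = RateLimiter(min_interval=0.05)


class EchoArchive:
    """Hypothetical plugin that echoes the query instead of calling a remote catalog."""

    domain = "example.invalid"

    def search(self, query: str) -> list[CandidateRecord]:
        RATE_LIMITER.wait(self.domain)
        return [CandidateRecord(source="echo", title=query, author="", year="", isbn="", publisher="")]
```

The sleep happens outside the lock, so a slow domain does not block requests to other domains.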
Auto-queue
- After text_recognizer completes → fires all archive_searchers with auto_queue: true in background thread pool.
- POST /api/batch → runs text_recognizers then archive_searchers for all unidentified books.
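The fan-out after text recognition can be sketched with concurrent.futures. Everything here is illustrative: the ArchivePlugin dataclass, the stub searchers, the pool size, and the choice of which plugins have auto_queue enabled are assumptions.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable

Candidate = dict[str, str]


@dataclass
class ArchivePlugin:
    plugin_id: str
    auto_queue: bool
    search: Callable[[str], list[Candidate]]


def make_stub_search(source: str) -> Callable[[str], list[Candidate]]:
    """Build a fake searcher that returns one candidate tagged with its source."""
    def search(query: str) -> list[Candidate]:
        return [{"source": source, "title": query}]
    return search


# Hypothetical configuration: rsl is excluded from the auto-queue.
PLUGINS = [
    ArchivePlugin("openlibrary", True, make_stub_search("openlibrary")),
    ArchivePlugin("rsl", False, make_stub_search("rsl")),
    ArchivePlugin("nlr", True, make_stub_search("nlr")),
]

POOL = ThreadPoolExecutor(max_workers=4)


def on_text_recognized(raw_text: str) -> list[Future[list[Candidate]]]:
    """Submit every auto_queue searcher to the pool without blocking the caller."""
    return [POOL.submit(p.search, raw_text) for p in PLUGINS if p.auto_queue]


futures = on_text_recognized("Anna Karenina")
candidates = [record for future in futures for record in future.result()]
```

In a request handler the futures would normally be left to finish in the background; the demo collects the results only to show what was queued.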
Database Schema (key fields)
| Table | Notable columns |
|---|---|
| cabinets | shelf_boundaries (JSON […]), ai_shelf_boundaries (JSON {pluginId:[…]}) |
| shelves | book_boundaries, ai_book_boundaries (same format) |
| books | raw_text, ai_title/author/year/isbn/publisher, candidates (JSON [{source,…}]), identification_status |
identification_status: unidentified → ai_identified → user_approved.
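Because there is no ORM, JSON columns are plain text that the helpers encode and decode by hand. The sketch below shows one way to update the per-plugin suggestion object; the reduced table definition, the TEXT column types, the helper name save_ai_boundaries, and the plugin id shelves_v1 are assumptions, not the schema in storage.py.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, assumed schema: only the two boundary columns from the table above.
conn.execute(
    "CREATE TABLE cabinets (id INTEGER PRIMARY KEY, shelf_boundaries TEXT, ai_shelf_boundaries TEXT)"
)
conn.execute(
    "INSERT INTO cabinets (id, shelf_boundaries, ai_shelf_boundaries) VALUES (?, ?, ?)",
    (1, json.dumps([0.33, 0.66]), json.dumps({})),
)


def save_ai_boundaries(db: sqlite3.Connection, cabinet_id: int, plugin_id: str, boundaries: list[float]) -> None:
    """Store one plugin's suggestion under its id, keeping other plugins' entries."""
    row = db.execute("SELECT ai_shelf_boundaries FROM cabinets WHERE id = ?", (cabinet_id,)).fetchone()
    suggestions = json.loads(row[0] or "{}")
    suggestions[plugin_id] = boundaries
    db.execute(
        "UPDATE cabinets SET ai_shelf_boundaries = ? WHERE id = ?",
        (json.dumps(suggestions), cabinet_id),
    )


save_ai_boundaries(conn, 1, "shelves_v1", [0.3, 0.7])
stored = json.loads(conn.execute("SELECT ai_shelf_boundaries FROM cabinets WHERE id = 1").fetchone()[0])
```

Keying suggestions by plugin id lets several detectors coexist without overwriting the user's own shelf_boundaries.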
Boundary System
N interior boundaries → N+1 segments. full = [0] + boundaries + [1]. Segment K spans full[K]..full[K+1].
- User boundaries: shelf_boundaries / book_boundaries (editable via canvas drag)
- AI suggestions: ai_shelf_boundaries / ai_book_boundaries (JSON object {pluginId: [fractions]})
- Shelf K image = cabinet photo cropped to (0, y_start, 1, y_end) unless override photo exists
- Book K spine = shelf image cropped to (x_start, *, x_end, *) with composed crop if cabinet-based
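The boundary arithmetic can be worked through in plain Python. The function names, the rounding rule, and the reading of "composed crop" as nesting the book's x-range inside the shelf's pixel box are assumptions; the boxes use the (left, upper, right, lower) order that Pillow's Image.crop expects.

```python
Box = tuple[int, int, int, int]


def segments(boundaries: list[float]) -> list[tuple[float, float]]:
    """N interior boundaries -> N+1 (start, end) fractions."""
    full = [0.0] + boundaries + [1.0]
    return list(zip(full, full[1:]))


def shelf_box(shelf_boundaries: list[float], k: int, width: int, height: int) -> Box:
    """Pixel box of shelf K inside a cabinet photo of the given size."""
    y_start, y_end = segments(shelf_boundaries)[k]
    return (0, round(y_start * height), width, round(y_end * height))


def spine_box(shelf: Box, book_boundaries: list[float], k: int) -> Box:
    """Pixel box of book K, composed with the shelf's box in cabinet coordinates."""
    left, top, right, bottom = shelf
    x_start, x_end = segments(book_boundaries)[k]
    shelf_width = right - left
    return (left + round(x_start * shelf_width), top, left + round(x_end * shelf_width), bottom)


# A 1000x2000 cabinet photo with two shelf boundaries, i.e. three shelves.
shelf = shelf_box([0.25, 0.5], 1, width=1000, height=2000)
spine = spine_box(shelf, [0.1, 0.4], 1)
```

With these inputs shelf is (0, 500, 1000, 1000) and spine is (100, 500, 400, 1000), so the spine can be cut from the cabinet photo in a single crop.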
Tooling
poetry run serve # start uvicorn on :8000
poetry run fmt # black (in-place)
poetry run presubmit # black --check + flake8 + pyright + pytest ← run before finishing any task
Line length: 120. Type checking: pyright strict mode. Pytest fixtures with yield use Iterator[T] return type.
Tests in tests/; use monkeypatch on storage.DB_PATH / storage.DATA_DIR for temp-DB fixtures.
Key API Endpoints
GET /api/config # UI config + plugin manifest
GET /api/tree # full nested tree
POST /api/{entity_type}/{entity_id}/plugin/{plugin_id} # universal plugin runner
PATCH /api/cabinets/{id}/boundaries # update shelf boundary list
PATCH /api/shelves/{id}/boundaries # update book boundary list
GET /api/shelves/{id}/image # shelf image (override or cabinet crop)
GET /api/books/{id}/spine # book spine crop
POST /api/books/{id}/process # run full auto-queue pipeline (single book)
POST /api/batch # start batch processing
GET /api/batch/status
POST /api/books/{id}/dismiss-field # dismiss a candidate suggestion
PATCH /api/{kind}/reorder # SortableJS drag reorder