Document Intelligence Layer

The Document Intelligence Layer, implemented in the memory-store-documents repository, is responsible for the structural parsing, coordinate mapping, and raw storage of source documents.

Core Responsibilities

Blob Storage: Managing raw binary data of documents (PDFs, images, text files) using MinIO.
Structural Parsing: Extracting the physical hierarchy of documents (pages, text elements, lines).
Coordinate Mapping: Storing the X/Y position and bounding box of every text element.
Source Location Finding: Providing fuzzy search capabilities to find the specific location (page + coordinates) of an extracted memory within its source document.

Architecture

graph TD
    Upload[User Upload] --> Ingest[Ingest API]
    Ingest[Ingest API] --> MinIO[MinIO Blob Storage]
    Ingest --> DB[(PostgreSQL)]
    Ingest --> Task[Structural Parser]
    Task --> Pages[Page & Text Element Extraction]
    Pages --> DB
    Pages --> Event[document.parsed Event]

Data Model (PostgreSQL)

DocumentModel

Base metadata for the document (title, hash, status, subject).

PageModel

Represents a physical page, including dimensions and rotation.

TextElementModel

The most granular unit, storing:

Text: The actual string content.
Coordinates: x, y, width, height (normalized 0.0-1.0).
Offsets: Character start/end within the page.
Style: Font, size, bold/italic markers.
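Because coordinates are stored normalized, a consumer has to scale them by the page dimensions before drawing a highlight. The following is a minimal sketch of that conversion, not the actual model code: the TextElement dataclass, the to_pixels helper, the top-left origin, and the exclusive end offset are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextElement:
    """Hypothetical in-memory mirror of a TextElementModel row."""
    text: str
    x: float       # left edge, normalized 0.0-1.0 (top-left origin is assumed)
    y: float       # top edge, normalized 0.0-1.0
    width: float   # normalized width
    height: float  # normalized height
    start: int     # character offset within the page (inclusive)
    end: int       # character offset within the page (assumed exclusive)


def to_pixels(el, page_width, page_height):
    """Scale a normalized box to page units as (left, top, right, bottom)."""
    left = el.x * page_width
    top = el.y * page_height
    right = left + el.width * page_width
    bottom = top + el.height * page_height
    return (left, top, right, bottom)


element = TextElement("Invoice", x=0.25, y=0.5, width=0.5, height=0.125, start=0, end=7)
box = to_pixels(element, page_width=800, page_height=1000)
```

Keeping the stored values normalized means the same record can be rendered at any zoom level or resolution; only the page dimensions passed to the conversion change.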

Integration Flow

Ingestion: memory-store-documents saves the blob and performs structural parsing.
Notification: It emits a document.parsed event via NATS.
Extraction: memory-curator-worker receives the event and fetches the structural data to perform LLM-based memory extraction.
Source Mapping: When a memory is created, it references the document_id. The visualizer then uses the SourceLocationFinder to highlight the specific text in the PDF.
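The notification step can be pictured as a small serialized message. The sketch below only illustrates the idea: the DocumentParsedEvent class and its fields are hypothetical, and using the event name as the NATS subject is an assumption, since the actual payload schema is not described here. NATS message payloads are raw bytes, so the event is encoded as UTF-8 JSON.

```python
import json
from dataclasses import asdict, dataclass

# Event name taken from the text; using it as the NATS subject is an assumption.
SUBJECT = "document.parsed"


@dataclass(frozen=True)
class DocumentParsedEvent:
    # Both fields are illustrative; the real payload schema may differ.
    document_id: str
    page_count: int


def encode(event):
    """Serialize the event to the bytes a NATS publish call would carry."""
    return json.dumps(asdict(event)).encode("utf-8")


def decode(payload):
    """Rebuild the event on the consumer side (e.g. in memory-curator-worker)."""
    return DocumentParsedEvent(**json.loads(payload.decode("utf-8")))


event = DocumentParsedEvent(document_id="doc-123", page_count=4)
payload = encode(event)
received = decode(payload)
```

Carrying only the document_id and a little metadata keeps the message small; the consumer then fetches the full structural data itself, as described in the Extraction step.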

Source Location Finder

The layer provides a specialized component, SourceLocationFinder, which allows the system to:

Receive an extracted snippet of text.
Search through the stored TextElementModel records using fuzzy matching.
Reconstruct a SourceLocation (Page # + Bounding Boxes) for highlighting.
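The three steps above can be sketched with the standard library. This is not the actual SourceLocationFinder implementation, whose matching algorithm is not described here: it substitutes difflib.SequenceMatcher for the fuzzy matcher, and the Element class, the find_source_location function, and the 0.8 threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for a stored TextElementModel record."""
    page: int
    text: str
    box: tuple  # (x, y, width, height), normalized 0.0-1.0


def normalize(text):
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return " ".join(text.lower().split())


def find_source_location(snippet, elements, threshold=0.8):
    """Return (page, boxes) for the best fuzzy match, or None below threshold."""
    target = normalize(snippet)
    best_score, best = 0.0, None
    for start in range(len(elements)):
        for end in range(start + 1, len(elements) + 1):
            window = elements[start:end]
            if window[-1].page != window[0].page:
                break  # in this sketch a match may not span pages
            candidate = normalize(" ".join(e.text for e in window))
            score = SequenceMatcher(None, target, candidate).ratio()
            if score > best_score:
                best_score, best = score, window
            if len(candidate) > 2 * len(target):
                break  # growing the window further can only hurt the score
    if best is None or best_score < threshold:
        return None
    return best[0].page, [e.box for e in best]


elements = [
    Element(1, "Total amount due:", (0.1, 0.1, 0.3, 0.02)),
    Element(1, "1,250.00 EUR", (0.45, 0.1, 0.2, 0.02)),
    Element(2, "Payment terms: 30 days", (0.1, 0.2, 0.4, 0.02)),
]
location = find_source_location("total amount due 1,250.00 EUR", elements)
missing = find_source_location("completely unrelated sentence here", elements)
```

The snippet here differs from the stored text in case and punctuation, which is typical of LLM-extracted memories, yet it still resolves to page 1 and the two boxes that cover it. The exhaustive window search is quadratic in the number of elements, so a real implementation would likely narrow the candidates first.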
