Document Intelligence Layer
The Document Intelligence Layer, implemented in the memory-store-documents repository, is responsible for the structural parsing, coordinate mapping, and raw storage of source documents.
Core Responsibilities
- Blob Storage: Managing raw binary data of documents (PDFs, images, text files) using MinIO.
- Structural Parsing: Extracting the physical hierarchy of documents (pages, text elements, lines).
- Coordinate Mapping: Storing the exact X/Y coordinates and bounding boxes for every piece of text.
- Source Location Finding: Providing fuzzy search capabilities to find the specific location (page + coordinates) of an extracted memory within its source document.
Architecture
graph TD
Upload[User Upload] --> Ingest[Ingest API]
Ingest[Ingest API] --> MinIO[MinIO Blob Storage]
Ingest --> DB[(PostgreSQL)]
Ingest --> Task[Structural Parser]
Task --> Pages[Page & Text Element Extraction]
Pages --> DB
Pages --> Event[document.parsed Event]
Data Model (PostgreSQL)
DocumentModel
Base metadata for the document (title, hash, status, subject).
PageModel
Represents a physical page, including dimensions and rotation.
TextElementModel
The most granular unit, storing:
- Text: The actual string content.
- Coordinates: x, y, width, height (normalized 0.0-1.0).
- Offsets: Character start/end within the page.
- Style: Font, size, bold/italic markers.
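As a rough illustration of how the three models relate, the sketch below mirrors them as plain Python dataclasses. The class names, field names, and the to_pixels helper are assumptions made for this example; the actual PostgreSQL schema in memory-store-documents may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TextElement:
    """Hypothetical in-memory mirror of TextElementModel."""
    text: str
    # Normalized bounding box: all four values lie in [0.0, 1.0].
    x: float
    y: float
    width: float
    height: float
    # Character offsets within the page text (end assumed exclusive).
    char_start: int
    char_end: int
    # Style markers; all optional in this sketch.
    font: Optional[str] = None
    font_size: Optional[float] = None
    bold: bool = False
    italic: bool = False

    def to_pixels(self, page_width: float, page_height: float) -> Tuple[float, float, float, float]:
        """Scale the normalized box to a rendered page of the given size."""
        return (
            self.x * page_width,
            self.y * page_height,
            self.width * page_width,
            self.height * page_height,
        )


@dataclass
class Page:
    """Hypothetical mirror of PageModel."""
    number: int
    width: float
    height: float
    rotation: int = 0
    elements: List[TextElement] = field(default_factory=list)


@dataclass
class Document:
    """Hypothetical mirror of DocumentModel."""
    title: str
    content_hash: str
    status: str
    subject: str
    pages: List[Page] = field(default_factory=list)
```

Because coordinates are normalized, a viewer can scale a box to any render size: an element with x=0.25, y=0.5, width=0.5, height=0.125 maps to (200.0, 500.0, 400.0, 125.0) on an 800 x 1000 canvas via to_pixels(800, 1000).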
Integration Flow
- Ingestion: memory-store-documents saves the blob and performs structural parsing.
- Notification: It emits a document.parsed event via NATS.
- Extraction: memory-curator-worker receives the event and fetches the structural data to perform LLM-based memory extraction.
- Source Mapping: When a memory is created, it references the document_id. The visualizer then uses the SourceLocationFinder to highlight the specific text in the PDF.
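The notification step could carry a small JSON payload along the lines of the sketch below. The payload fields (document_id, page_count) and the handler are hypothetical, since the event schema is not specified here, and the NATS client calls are deliberately left out; only the encode/decode logic is shown.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

SUBJECT = "document.parsed"  # event name from the integration flow


@dataclass
class DocumentParsedEvent:
    # Hypothetical payload fields; the real schema may differ.
    document_id: str
    page_count: int

    def encode(self) -> bytes:
        """Serialize the event to UTF-8 JSON bytes for publishing."""
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def decode(cls, data: bytes) -> "DocumentParsedEvent":
        """Rebuild the event from the bytes received by a subscriber."""
        return cls(**json.loads(data.decode("utf-8")))


def handle_message(subject: str, data: bytes) -> Optional[str]:
    """Consumer-side sketch: return the document ID to fetch, or None
    if the message belongs to another subject."""
    if subject != SUBJECT:
        return None
    return DocumentParsedEvent.decode(data).document_id
```

Keeping the payload down to an identifier matches the flow described above, in which the worker fetches the structural data itself after receiving the event.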
Source Location Finder
The layer provides a specialized component, SourceLocationFinder, which allows the system to:
- Receive an extracted snippet of text.
- Search through the stored TextElementModel records using fuzzy matching.
- Reconstruct a SourceLocation (Page # + Bounding Boxes) for highlighting.
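The three steps above can be sketched with Python's standard difflib module. This is a minimal illustration, not the actual SourceLocationFinder: the Element and SourceLocation shapes, the function name, the sliding-window strategy, and the 0.8 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # x, y, width, height (normalized)


@dataclass
class Element:
    """Hypothetical stand-in for a stored TextElementModel record."""
    page: int
    text: str
    box: Box


@dataclass
class SourceLocation:
    page: int
    boxes: List[Box]
    score: float


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences do not matter."""
    return " ".join(text.lower().split())


def find_source_location(
    snippet: str,
    elements: List[Element],
    threshold: float = 0.8,
    max_window: int = 5,
) -> Optional[SourceLocation]:
    """Slide a window of consecutive elements over the document and keep
    the window whose joined text is most similar to the snippet."""
    target = normalize(snippet)
    best: Optional[SourceLocation] = None
    for start in range(len(elements)):
        for size in range(1, max_window + 1):
            window = elements[start:start + size]
            # Stop at the end of the list or when the window crosses a page.
            if len(window) < size or window[-1].page != window[0].page:
                break
            candidate = normalize(" ".join(e.text for e in window))
            score = SequenceMatcher(None, target, candidate).ratio()
            if best is None or score > best.score:
                best = SourceLocation(window[0].page, [e.box for e in window], score)
    if best is not None and best.score >= threshold:
        return best
    return None
```

Given elements "The patient was admitted" and "on 12 March 2021." on page 1, the misspelled snippet "patient was admited on 12 March 2021" resolves to page 1 with both bounding boxes. The brute-force window scan is quadratic in the worst case, so a production finder would likely narrow the candidates first, for example by page or by character offsets.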