Working Memory
Working Memory provides a high-speed, Redis-backed session buffer for real-time access to recent context. While the main memory store (Qdrant + JanusGraph) handles long-term retrieval, Working Memory delivers sub-10ms access to the current conversation context.
Three-Tier Memory Architecture
The Memory Platform uses a tiered approach:
| Tier | Store | Latency | Use Case |
|---|---|---|---|
| Working Memory | Redis | Under 10ms | Current session, recent messages |
| Short-term Cache | Redis | 50-100ms | Last 24 hours of memories |
| Long-term Store | Qdrant + PostgreSQL | 100-500ms | Full semantic retrieval |
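One way to read the table is as a fall-through lookup: check the fastest tier first and move to a slower one only on a miss. The sketch below illustrates that idea with plain dictionaries standing in for Redis and the long-term store. The `tiered_get` helper and the tier names are hypothetical, and the source does not state that the platform resolves reads in exactly this order.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a tier: maps a key to a value, or None on a miss.
# Real tiers would wrap Redis and long-term store clients.
Lookup = Callable[[str], Optional[str]]


def tiered_get(key: str, tiers: list[tuple[str, Lookup]]) -> tuple[Optional[str], Optional[str]]:
    """Return (tier_name, value) from the first tier that holds the key."""
    for name, lookup in tiers:
        value = lookup(key)
        if value is not None:
            return name, value
    return None, None


# In-memory dictionaries used purely for illustration.
working = {"msg:1": "hello"}
short_term = {"mem:7": "yesterday's summary"}
long_term = {"mem:7": "yesterday's summary", "mem:1": "old fact"}

tiers = [
    ("working", working.get),
    ("short_term", short_term.get),
    ("long_term", long_term.get),
]

print(tiered_get("msg:1", tiers))  # served by the working tier
print(tiered_get("mem:1", tiers))  # falls through to the long-term tier
```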
Session Structure
A Working Memory session contains:
| Field | Description |
|---|---|
| Session ID | Unique identifier for the session |
| Tenant ID | Organization context |
| Subject | Who the session is about |
| Items | Recent messages/context items |
| Token Count | Total tokens in the buffer |
| Primed Memory IDs | Pre-loaded memory references |
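For client code, the fields above could be mirrored in a small data model. This is a minimal sketch, not the platform's actual schema: the class names, attribute names, and the per-item `tokens` field are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContextItem:
    role: str
    content: str
    tokens: int  # assumed to be computed by the caller's tokenizer


@dataclass
class WorkingMemorySession:
    session_id: str
    tenant_id: str
    subject: str
    items: list[ContextItem] = field(default_factory=list)
    primed_memory_ids: list[str] = field(default_factory=list)

    @property
    def token_count(self) -> int:
        # Total tokens in the buffer, derived from the items it holds.
        return sum(item.tokens for item in self.items)


session = WorkingMemorySession("chat-12345", "your-tenant-id", subject="alice")
session.items.append(ContextItem("user", "What was discussed?", tokens=5))
session.items.append(ContextItem("assistant", "The Q1 roadmap.", tokens=4))
print(session.token_count)
```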
Using Working Memory
In the Visualizer
Navigate to Working Memory in the sidebar to:
- View active session status
- Monitor buffer contents
- See token usage and item count
- Inspect individual context items
Session Management
Sessions are automatically created when needed. You can:
- View session contents in real-time
- Monitor token usage for context window management
- See which memories are "primed" for quick access
API Usage
Get or Create Session
GET /api/v1/working-memory/sessions/{session_id}
Headers:
x-tenant-id: your-tenant-id
If the session doesn't exist, create it:
POST /api/v1/working-memory/sessions
Headers:
x-tenant-id: your-tenant-id
Content-Type: application/json
{
  "session_id": "chat-12345",
  "tenant_id": "your-tenant-id",
  "ttl": 3600
}
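The get-then-create flow above might be wrapped in a client helper along these lines. This is a sketch under stated assumptions: `BASE_URL` is a placeholder host, the `send` callable is a hypothetical transport that returns `(status, parsed_json)`, and treating HTTP 404 as "session doesn't exist" is a guess, since the source does not say how a missing session is reported.

```python
import json
from typing import Callable
from urllib.request import Request

BASE_URL = "https://memory.example.com"  # hypothetical host
Send = Callable[[Request], tuple[int, dict]]  # returns (status, parsed JSON)


def get_request(session_id: str, tenant_id: str) -> Request:
    return Request(
        f"{BASE_URL}/api/v1/working-memory/sessions/{session_id}",
        headers={"x-tenant-id": tenant_id},
        method="GET",
    )


def create_request(session_id: str, tenant_id: str, ttl: int = 3600) -> Request:
    body = {"session_id": session_id, "tenant_id": tenant_id, "ttl": ttl}
    return Request(
        f"{BASE_URL}/api/v1/working-memory/sessions",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-tenant-id": tenant_id, "Content-Type": "application/json"},
        method="POST",
    )


def get_or_create(session_id: str, tenant_id: str, send: Send) -> dict:
    """Fetch the session, creating it if the GET reports it as missing."""
    status, payload = send(get_request(session_id, tenant_id))
    if status == 404:  # assumption: a missing session is reported as 404
        status, payload = send(create_request(session_id, tenant_id))
    return payload
```

Passing the transport in as `send` keeps the helper testable without a live server. In practice `send` could wrap `urllib.request.urlopen`, which raises `HTTPError` for a 404, so the wrapper would need to catch that and return the status code instead.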
Add Message to Session
POST /api/v1/working-memory/sessions/{session_id}/messages
Headers:
x-tenant-id: your-tenant-id
Content-Type: application/json
{
  "role": "user",
  "content": "What was discussed in our last meeting?",
  "timestamp": "2024-01-15T10:30:00Z"
}
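The message body above uses a UTC timestamp with a `Z` suffix. A small helper could produce that shape consistently. This is an illustrative sketch; `build_message` is a hypothetical name, and it assumes the API expects second precision exactly as in the example.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_message(role: str, content: str, when: Optional[datetime] = None) -> str:
    """Serialize a message body with a UTC timestamp ending in 'Z'."""
    # Default to the current time; convert any aware datetime to UTC.
    # Note: a naive datetime is interpreted as local time by astimezone().
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    timestamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")
    return json.dumps({"role": role, "content": content, "timestamp": timestamp})


body = build_message(
    "user",
    "What was discussed in our last meeting?",
    datetime(2024, 1, 15, 10, 30, tzinfo=timezone.utc),
)
print(body)
```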
Prime Memories
Pre-load specific memories for fast access:
POST /api/v1/working-memory/sessions/{session_id}/prime
Headers:
x-tenant-id: your-tenant-id
Content-Type: application/json
["mem_abc123", "mem_def456"]
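Note that the prime request body is a bare JSON array rather than an object. The sketch below builds such a request and drops duplicate IDs while keeping their order. `BASE_URL` and `prime_request` are hypothetical names, and de-duplicating on the client is a convenience assumed here, not a documented requirement.

```python
import json
from urllib.request import Request

BASE_URL = "https://memory.example.com"  # hypothetical host


def prime_request(session_id: str, tenant_id: str, memory_ids: list[str]) -> Request:
    """Build the prime call; the body is a plain JSON array of memory IDs."""
    unique_ids = list(dict.fromkeys(memory_ids))  # de-duplicate, keep order
    return Request(
        f"{BASE_URL}/api/v1/working-memory/sessions/{session_id}/prime",
        data=json.dumps(unique_ids).encode("utf-8"),
        headers={"x-tenant-id": tenant_id, "Content-Type": "application/json"},
        method="POST",
    )


req = prime_request("chat-12345", "your-tenant-id", ["mem_abc123", "mem_def456", "mem_abc123"])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```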
Configuration
Working Memory sessions are bounded by a configurable TTL (time-to-live), an item cap, and a token limit:
- Default TTL: 3600 seconds (1 hour)
- Maximum Items: 50 context items
- Token Limit: Configurable based on your LLM context window
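To make the item and token limits concrete, here is a sketch of how a client-side mirror of the buffer could enforce them. The eviction policy (oldest item first) and the `TOKEN_LIMIT` value are assumptions; the source does not describe how the platform trims a full buffer.

```python
from collections import deque

MAX_ITEMS = 50      # item cap from the configuration above
TOKEN_LIMIT = 4000  # hypothetical; set this from your LLM context window


def trim(items: deque, max_items: int = MAX_ITEMS, token_limit: int = TOKEN_LIMIT) -> int:
    """Evict the oldest (text, tokens) pairs until both limits hold.

    Returns the total token count of the items that remain.
    """
    total = sum(tokens for _, tokens in items)
    while items and (len(items) > max_items or total > token_limit):
        _, tokens = items.popleft()  # assumed policy: drop the oldest first
        total -= tokens
    return total


buffer = deque((f"m{i}", 100) for i in range(60))
remaining = trim(buffer)
print(len(buffer), remaining)
```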
Integration Tips
- Create sessions early: initialize sessions at conversation start
- Prime relevant memories: pre-load memories based on user or topic
- Monitor token count: stay within context window limits
- Let sessions expire: Redis automatically cleans up stale sessions