Memory & RAG
Search Memories
Hybrid semantic + keyword search with cross-encoder reranking
POST
Search memories in a collection using hybrid retrieval. Combines vector similarity (semantic) with BM25 keyword scoring, with optional cross-encoder reranking for higher precision. Optionally returns graph relationships.
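The service's exact fusion formula is not documented in this section. As a rough mental model only, the sketch below assumes a linear blend of a semantic score and a keyword score that have both been normalized to 0-1, using the semantic weight described under Body. The function names and the `semantic`/`keyword` fields are hypothetical.

```python
def hybrid_score(semantic: float, keyword: float, alpha: float) -> float:
    """Linear blend (an assumption): alpha=1 is semantic only, alpha=0 is keyword only."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * semantic + (1.0 - alpha) * keyword


def rank_hybrid(candidates: list, alpha: float, max_results: int) -> list:
    """Return candidate ids ordered by blended score, best first."""
    scored = [
        (hybrid_score(c["semantic"], c["keyword"], alpha), c["id"])
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cid for _, cid in scored[:max_results]]


# Two toy candidates: "a" matches the meaning, "b" matches the literal wording.
candidates = [
    {"id": "a", "semantic": 0.9, "keyword": 0.1},
    {"id": "b", "semantic": 0.3, "keyword": 0.95},
]
keyword_heavy = rank_hybrid(candidates, alpha=0.2, max_results=1)
semantic_heavy = rank_hybrid(candidates, alpha=0.9, max_results=1)
```

With a low weight the literal match ranks first; with a high weight the conceptual match does.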
Request
Headers
Authorization: Bearer token with your API key
Content-Type: application/json
Body
Search query text. Max 2,000 characters.
Collection to search. Defaults to the caller’s personal collection.
Search mode:
- fast: single-query dense retrieval (~100-200 ms). Best for simple lookups.
- thinking: fetches a wider candidate pool and applies cross-encoder reranking for higher precision (~200-400 ms). Best for complex or multi-faceted questions.
Maximum number of results. Capped at 50.
Weight of semantic search (0-1). 0 = keyword only, 1 = semantic only.
Weight given to newer memories (0-1).
Include knowledge-graph relationships in the response.
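The documented limits above can be checked client-side before sending a request. The following is a minimal sketch, not an official client: `mode`, `max_results`, `alpha`, `recency_bias`, and `graph_context` are names that appear elsewhere on this page, while `query` and `collection` are assumed field names that this section does not confirm. Clamping `max_results` to 50 locally is also an assumption about how the cap behaves.

```python
from typing import Any, Optional

MAX_QUERY_CHARS = 2000  # documented query length limit
MAX_RESULTS_CAP = 50    # documented cap on results
MODES = ("fast", "thinking")


def _check_unit_interval(name: str, value: Optional[float]) -> None:
    """Raise if a weight is set but falls outside 0-1."""
    if value is not None and not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0 and 1")


def build_search_body(
    query: str,
    collection: Optional[str] = None,
    mode: Optional[str] = None,
    max_results: Optional[int] = None,
    alpha: Optional[float] = None,
    recency_bias: Optional[float] = None,
    graph_context: Optional[bool] = None,
) -> dict[str, Any]:
    """Validate arguments and return a JSON-ready request body."""
    if not query or len(query) > MAX_QUERY_CHARS:
        raise ValueError("query must be 1 to 2,000 characters")
    if mode is not None and mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    if max_results is not None:
        if max_results < 1:
            raise ValueError("max_results must be at least 1")
        max_results = min(max_results, MAX_RESULTS_CAP)  # mirror the cap locally
    _check_unit_interval("alpha", alpha)
    _check_unit_interval("recency_bias", recency_bias)

    body: dict[str, Any] = {"query": query}  # "query" is an assumed field name
    optional = {
        "collection": collection,  # assumed field name
        "mode": mode,
        "max_results": max_results,
        "alpha": alpha,
        "recency_bias": recency_bias,
        "graph_context": graph_context,
    }
    # Leave unset fields out so the server-side defaults apply.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


body = build_search_body("refund policy", mode="thinking", max_results=80, alpha=0.7)
```

Omitting unset fields rather than sending nulls keeps the request compatible with whatever defaults the deployment uses.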
Advanced reranker knobs
These parameters override server-side defaults for the cross-encoder reranker. Omit to use the deployment default.
Max candidates the cross-encoder reranks (1-500). Default: server setting (30).
Hard timeout for the rerank call in milliseconds (50-5000). Default: server setting (500).
Drop results with rerank score below this threshold (0-1). Default: server setting (0.25).
In thinking mode, fetch N × max_results candidates before reranking (1-10). Default: server setting (3).
Response
Raw chunk-level search results with scores. Each chunk includes:
- score: dense vector similarity score (0-1)
- rerank_score: cross-encoder rerank score (0-1, present when the reranker is active, null otherwise)
Deduplicated source memories (one per unique memory_id)
Graph nodes, edges, and triplets (only if graph_context: true)
Total number of chunks returned
Search latency in milliseconds
Per-query diagnostic trace including stage timings, reranker meta, and active flag snapshot. Useful for debugging search quality.
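The response already includes deduplicated memories, but the relationship between chunk scores and memories can be useful when post-processing results yourself. The sketch below assumes each chunk is a dict carrying `memory_id`, `score`, and `rerank_score`. It prefers `rerank_score` when present and falls back to `score` when it is null. The helper names are hypothetical, and comparing the two score types across chunks in one response is a simplification.

```python
from typing import Any


def effective_score(chunk: dict[str, Any]) -> float:
    """Use rerank_score when the reranker ran; otherwise the dense score."""
    rerank = chunk.get("rerank_score")
    return chunk["score"] if rerank is None else rerank


def best_chunk_per_memory(chunks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the highest-scoring chunk for each memory_id, best first."""
    best: dict[str, dict[str, Any]] = {}
    for chunk in chunks:
        memory_id = chunk["memory_id"]
        current = best.get(memory_id)
        if current is None or effective_score(chunk) > effective_score(current):
            best[memory_id] = chunk
    return sorted(best.values(), key=effective_score, reverse=True)


# Illustrative data only, not real API output.
chunks = [
    {"memory_id": "m1", "score": 0.80, "rerank_score": 0.40},
    {"memory_id": "m1", "score": 0.60, "rerank_score": 0.90},
    {"memory_id": "m2", "score": 0.70, "rerank_score": 0.55},
]
top = best_chunk_per_memory(chunks)
```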
Example
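A minimal sketch of preparing the request with Python's standard library. The endpoint URL is not given in this section, so `SEARCH_URL` is a placeholder, and the `query` field name is an assumption. The other body fields use names that appear elsewhere on this page. The request is built but not sent.

```python
import json
import urllib.request

# Placeholder: replace with the real endpoint for your deployment.
SEARCH_URL = "https://api.example.com/memories/search"


def make_search_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build a POST request with the bearer token and a JSON body."""
    return urllib.request.Request(
        SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = make_search_request(
    "YOUR_API_KEY",
    {
        "query": "What did we decide about the Q3 launch?",  # assumed field name
        "mode": "thinking",
        "alpha": 0.7,
        "max_results": 5,
        "graph_context": True,
    },
)

# To send it:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```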
Billing
Flat rate per request. Every successful request returns:

| Header | Meaning |
|---|---|
| x-credit-balance | Wallet balance after this charge |
| x-credit-charged | Amount charged for this request (0.000300) |
| x-billing-tx | Audit row UUID |
On 402 INSUFFICIENT_CREDITS, the response includes details.shortfall so you can prompt the user to top up. See Pricing & Billing.
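The billing headers and the 402 case can be handled with two small helpers. This is a sketch under assumptions: the header values are taken to be decimal strings, and `details` is taken to sit at the top level of the JSON error body, which this section does not spell out. The function names are hypothetical.

```python
from decimal import Decimal
from typing import Mapping, Optional


def read_billing_headers(headers: Mapping[str, str]) -> dict:
    """Extract the three billing headers, ignoring header-name case."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "balance": Decimal(lowered["x-credit-balance"]),
        "charged": Decimal(lowered["x-credit-charged"]),
        "tx_id": lowered["x-billing-tx"],
    }


def shortfall_from_402(status: int, body: dict) -> Optional[Decimal]:
    """Return details.shortfall for a 402 response, or None otherwise."""
    if status != 402:
        return None
    details = body.get("details") or {}  # assumed location of "details"
    shortfall = details.get("shortfall")
    return None if shortfall is None else Decimal(str(shortfall))


# Illustrative values only.
billing = read_billing_headers(
    {
        "X-Credit-Balance": "4.999700",
        "X-Credit-Charged": "0.000300",
        "X-Billing-Tx": "00000000-0000-0000-0000-000000000000",
    }
)
missing = shortfall_from_402(402, {"details": {"shortfall": "0.000100"}})
```

`Decimal` avoids the rounding drift that floats would introduce when many small charges are summed.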
Tuning
Query types and recommended settings:

| Query type | Recommended settings |
|---|---|
| Exact phrase match | alpha: 0.2, mode: fast |
| Conceptual question | alpha: 0.9, mode: fast |
| Complex multi-faceted question | alpha: 0.7, mode: thinking |
| Latest-events focus | alpha: 0.6, recency_bias: 0.3 |
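
The table above can be captured as presets that are merged into a request body. The preset keys and the `settings_for` helper are hypothetical names chosen for this sketch. The values are copied from the table.

```python
# Recommended settings from the tuning table, keyed by hypothetical labels.
PRESETS = {
    "exact_phrase": {"alpha": 0.2, "mode": "fast"},
    "conceptual": {"alpha": 0.9, "mode": "fast"},
    "complex": {"alpha": 0.7, "mode": "thinking"},
    "latest_events": {"alpha": 0.6, "recency_bias": 0.3},
}


def settings_for(query_type: str, **overrides) -> dict:
    """Return a copy of the preset for query_type, with overrides applied last."""
    if query_type not in PRESETS:
        raise KeyError(f"unknown query type: {query_type!r}")
    return {**PRESETS[query_type], **overrides}


complex_settings = settings_for("complex")
recent_settings = settings_for("latest_events", max_results=10)
```

Treat these as starting points and adjust per collection.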