POST /memory/search
Search memories in a collection using hybrid retrieval. Combines vector similarity (semantic) with BM25 keyword scoring, with optional cross-encoder reranking for higher precision. Optionally returns graph relationships.
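As a rough illustration of how a semantic weight such as the `alpha` body parameter can combine the two signals, here is a minimal sketch. The exact fusion formula used by the service is not documented on this page; the linear blend and the `blend_scores` helper below are assumptions for illustration only.

```python
def blend_scores(semantic: float, keyword: float, alpha: float = 0.8) -> float:
    """Hypothetical linear blend: alpha weights the semantic score,
    (1 - alpha) weights the keyword (BM25-style) score.

    Assumes both inputs are already normalized to the 0-1 range.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * semantic + (1.0 - alpha) * keyword


# alpha = 1 keeps only the semantic score; alpha = 0 keeps only the keyword score.
semantic_only = blend_scores(0.9, 0.2, alpha=1.0)
keyword_only = blend_scores(0.9, 0.2, alpha=0.0)
```

Under this assumption, raising `alpha` moves results toward paraphrase-level matches and lowering it favors literal term overlap.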

Request

Headers

Authorization (string, required): Bearer token with your API key.
Content-Type (string, required): application/json

Body

query (string, required): Search query text. Max 2,000 characters.
collection (string): Collection to search. Defaults to the caller’s personal collection.
mode (string, default: "fast"): Search mode:
  • fast: single-query dense retrieval (~100-200 ms). Best for simple lookups.
  • thinking: fetches a wider candidate pool and applies cross-encoder reranking for higher precision (~200-400 ms). Best for complex or multi-faceted questions.
max_results (integer, default: 10): Maximum number of results. Capped at 50.
alpha (number, default: 0.8): Weight of semantic search (0-1). 0 = keyword only, 1 = semantic only.
recency_bias (number, default: 0.0): Weight given to newer memories (0-1).
graph_context (boolean, default: false): Include knowledge-graph relationships in the response.
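The documented limits can be checked on the client before a request is sent. The sketch below is one way to do that; `build_search_payload` is a hypothetical helper, not part of any official SDK. The page does not say whether the server clamps or rejects an oversized `max_results`, so this helper simply fails early.

```python
def build_search_payload(query, collection=None, mode="fast", max_results=10,
                         alpha=0.8, recency_bias=0.0, graph_context=False):
    """Build a request body for POST /memory/search, enforcing the documented limits."""
    if not query or len(query) > 2000:
        raise ValueError("query is required and limited to 2,000 characters")
    if mode not in ("fast", "thinking"):
        raise ValueError("mode must be 'fast' or 'thinking'")
    if not 1 <= max_results <= 50:
        raise ValueError("max_results must be between 1 and 50")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if not 0.0 <= recency_bias <= 1.0:
        raise ValueError("recency_bias must be between 0 and 1")

    payload = {
        "query": query,
        "mode": mode,
        "max_results": max_results,
        "alpha": alpha,
        "recency_bias": recency_bias,
        "graph_context": graph_context,
    }
    # Leave "collection" out entirely so the server falls back to the
    # caller's personal collection.
    if collection is not None:
        payload["collection"] = collection
    return payload


payload = build_search_payload(
    "What are my dietary preferences?",
    mode="thinking",
    max_results=5,
    recency_bias=0.1,
)
```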

Advanced reranker knobs

These parameters override server-side defaults for the cross-encoder reranker. Omit to use the deployment default.
rerank_top_k (integer): Max candidates the cross-encoder reranks (1-500). Default: server setting (30).
rerank_timeout_ms (integer): Hard timeout for the rerank call in milliseconds (50-5000). Default: server setting (500).
min_rerank_score (number): Drop results with rerank score below this threshold (0-1). Default: server setting (0.25).
fetch_multiplier (integer): In thinking mode, fetch N x max_results candidates before reranking (1-10). Default: server setting (3).
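To see how these knobs might interact, the sketch below works through the arithmetic with the listed defaults. How the server actually combines them is not specified here, so the ordering (fetch, cap at `rerank_top_k`, then filter by `min_rerank_score`) and both helper names are assumptions.

```python
def candidate_pool(max_results: int, fetch_multiplier: int = 3,
                   rerank_top_k: int = 30) -> tuple[int, int]:
    """Return (candidates fetched, candidates reranked) for a thinking-mode query."""
    if not 1 <= fetch_multiplier <= 10:
        raise ValueError("fetch_multiplier must be between 1 and 10")
    if not 1 <= rerank_top_k <= 500:
        raise ValueError("rerank_top_k must be between 1 and 500")
    fetched = fetch_multiplier * max_results
    # Assumed: the reranker never sees more than rerank_top_k candidates.
    reranked = min(fetched, rerank_top_k)
    return fetched, reranked


def filter_by_rerank_score(results, min_rerank_score=0.25, max_results=10):
    """Drop low-scoring results, then keep the best max_results by rerank score."""
    kept = [r for r in results if r["rerank_score"] >= min_rerank_score]
    kept.sort(key=lambda r: r["rerank_score"], reverse=True)
    return kept[:max_results]


# With the defaults, max_results=5 fetches 15 candidates and reranks all of them;
# max_results=50 fetches 150 but reranks only the top 30.
small = candidate_pool(5)
large = candidate_pool(50)
```

If this reading is right, raising `max_results` past 10 without also raising `rerank_top_k` leaves part of the fetched pool unreranked.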

Response

data.chunks (array): Raw chunk-level search results with scores. Each chunk includes:
  • score: dense vector similarity score (0-1)
  • rerank_score: cross-encoder rerank score (0-1, present when reranker is active, null otherwise)
data.sources (array): Deduplicated source memories (one per unique memory_id).
data.graph_context (object): Graph nodes, edges, and triplets (only if graph_context: true).
data.total_chunks (integer): Total number of chunks returned.
data.latency_ms (number): Search latency in milliseconds.
data.trace (object): Per-query diagnostic trace including stage timings, reranker meta, and active flag snapshot. Useful for debugging search quality.
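Because `rerank_score` is null when the reranker is inactive, client code usually needs a fallback to `score`. The following sketch shows one way to rank chunks from a parsed response; the helper names are hypothetical, and the sample dictionary is a trimmed copy of the example response on this page.

```python
def effective_score(chunk: dict) -> float:
    """Prefer the cross-encoder score; fall back to the dense score when it is null."""
    rerank = chunk.get("rerank_score")
    return rerank if rerank is not None else chunk["score"]


def top_chunks(response: dict, limit: int = 3) -> list:
    """Return the highest-scoring chunks from a parsed /memory/search response."""
    chunks = response["data"]["chunks"]
    return sorted(chunks, key=effective_score, reverse=True)[:limit]


# Trimmed copy of the example response (IDs are elided in the documentation).
sample_response = {
    "success": True,
    "data": {
        "chunks": [
            {
                "chunk_id": "c_01HV...",
                "source_id": "mem_01HV8K...",
                "text": "User prefers vegetarian food, lactose intolerant",
                "score": 0.912,
                "rerank_score": 0.9485,
            }
        ],
        "total_chunks": 1,
    },
}

best = top_chunks(sample_response, limit=1)[0]
```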

Example

```bash
curl -X POST https://api.60db.ai/memory/search \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are my dietary preferences?",
    "mode": "thinking",
    "max_results": 5,
    "alpha": 0.8,
    "recency_bias": 0.1
  }'
```

```json
{
  "success": true,
  "data": {
    "query": "What are my dietary preferences?",
    "chunks": [
      {
        "chunk_id": "c_01HV...",
        "source_id": "mem_01HV8K...",
        "text": "User prefers vegetarian food, lactose intolerant",
        "score": 0.912,
        "rerank_score": 0.9485,
        "metadata": { "title": "Dietary preferences" }
      }
    ],
    "sources": [
      {
        "source_id": "mem_01HV8K...",
        "title": "Dietary preferences",
        "text": "User prefers vegetarian food, lactose intolerant",
        "score": 0.912,
        "rerank_score": 0.9485
      }
    ],
    "total_chunks": 1,
    "total_sources": 1,
    "latency_ms": 287.4,
    "mode": "thinking",
    "alpha": 0.8,
    "trace": {
      "timings_ms": { "embed_ms": 112, "vector_search_ms": 52, "total_ms": 287 },
      "rerank": { "mode": "on", "ok": true, "latency_ms": 103, "top_score": 0.9485 }
    }
  }
}
```
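The same request can be issued from Python using only the standard library. This is a minimal sketch rather than an official client: `build_request` and `search` are hypothetical names, and the 10-second timeout is an arbitrary choice.

```python
import json
import urllib.request

API_URL = "https://api.60db.ai/memory/search"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def search(api_key: str, payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and return the parsed JSON body.

    Non-2xx responses raise urllib.error.HTTPError.
    """
    with urllib.request.urlopen(build_request(api_key, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_request(
    "your-api-key",
    {
        "query": "What are my dietary preferences?",
        "mode": "thinking",
        "max_results": 5,
        "alpha": 0.8,
        "recency_bias": 0.1,
    },
)
```

Calling `search("your-api-key", {...})` with a real key would perform the network call; `build_request` alone is useful for inspecting what would be sent.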

Billing

Flat **$0.0003 per query**, regardless of `max_results` or `mode`. A workload of 10,000 searches per month costs $3. Every successful request returns:
| Header | Meaning |
| --- | --- |
| x-credit-balance | Wallet balance after this charge |
| x-credit-charged | 0.000300 |
| x-billing-tx | Audit row UUID |
On 402 INSUFFICIENT_CREDITS, the response includes details.shortfall so you can prompt the user to top up. See Pricing & Billing.
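The flat rate makes cost estimation a single multiplication, and the billing headers can be read off any successful response. The sketch below assumes the two credit headers carry plain decimal strings; the helper names are hypothetical.

```python
from decimal import Decimal

PRICE_PER_QUERY = Decimal("0.0003")  # flat rate in USD, per this page


def estimate_cost(queries: int) -> Decimal:
    """Cost of a number of searches at the flat per-query rate."""
    if queries < 0:
        raise ValueError("queries must be non-negative")
    return PRICE_PER_QUERY * queries


def read_billing_headers(headers: dict) -> dict:
    """Extract the billing headers from a response (header names are case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        # Assumed: both credit values are plain decimal strings.
        "balance": Decimal(lowered["x-credit-balance"]),
        "charged": Decimal(lowered["x-credit-charged"]),
        "tx": lowered["x-billing-tx"],
    }


monthly = estimate_cost(10_000)  # 10,000 searches at $0.0003 each comes to $3
```

`Decimal` is used instead of `float` so that small per-query amounts add up without rounding drift.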

Tuning

Query types and recommended settings:
| Query type | Recommended settings |
| --- | --- |
| Exact phrase match | alpha: 0.2, mode: fast |
| Conceptual question | alpha: 0.9, mode: fast |
| Complex multi-faceted question | alpha: 0.7, mode: thinking |
| Latest-events focus | alpha: 0.6, recency_bias: 0.3 |
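These recommendations can be kept as named presets and merged into a request body. The preset keys and the `apply_preset` helper below are illustrative names, not part of the API; only the parameter values come from the recommendations above.

```python
# Recommended settings from this page, keyed by hypothetical preset names.
TUNING_PRESETS = {
    "exact_phrase": {"alpha": 0.2, "mode": "fast"},
    "conceptual": {"alpha": 0.9, "mode": "fast"},
    "complex": {"alpha": 0.7, "mode": "thinking"},
    "latest_events": {"alpha": 0.6, "recency_bias": 0.3},
}


def apply_preset(query: str, query_type: str) -> dict:
    """Build a request body from a query and one of the presets above."""
    if query_type not in TUNING_PRESETS:
        raise ValueError(f"unknown query type: {query_type}")
    # Parameters a preset does not set are omitted, so server defaults apply.
    return {"query": query, **TUNING_PRESETS[query_type]}


body = apply_preset("What changed in my diet this month?", "latest_events")
```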