Memory API for AI Agents

Your agents finally
remember everything

Persistent memory API for AI agents. Store and recall across runs.

One POST to remember. One GET to recall. Persistent semantic memory for every agent — no vector DB setup required.

20–100 calls/run typical agent usage
<50ms recall latency
Works with any framework
pgvector semantic search
api.memstore.dev/v1/memory
Store a memory
const res = await fetch('https://api.memstore.dev/v1/memory/remember', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    content: 'User prefers dark mode, uses React, timezone UTC-5',
    session: 'user_8821',
    ttl: 2592000 // 30 days
  })
});
Recall semantically
// GET /v1/memory/recall?q=user+preferences

// Response:
{
  "memories": [{
    "id": "mem_k9x2...",
    "content": "User prefers dark mode, uses React...",
    "score": 0.97,
    "session": "user_8821",
    "age": "2 hours ago"
  }],
  "tokens_used": 142
}

Two calls.
That's the core API.

Copy your key and run these in your terminal.

Try it in 60 seconds

Store a memory
POST
curl -X POST https://memstore.dev/v1/memory/remember \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content":"User prefers dark mode, uses React"}'
Recall semantically
GET
curl "https://memstore.dev/v1/memory/recall?q=user+settings" \
  -H "Authorization: Bearer YOUR_KEY"

# {"memories":[{"content":"User prefers dark mode, uses React","score":0.97,"age":"2h ago"}]}

Ready for production? Upgrade to Starter →

AI agents forget everything
between runs

Every session starts from zero. Your agent has no idea what the user said last week, what decisions were made, or what it already tried. That's not intelligence — that's amnesia.

🔁

Repeated questions

Agents ask users the same things over and over. Every run re-discovers what should have been remembered.

🌀

Context stuffing

Dumping entire conversation history into every prompt is expensive, slow, and hits context limits fast.

🎭

Hallucinating history

Without real memory, agents make up plausible-sounding facts about past interactions. Users notice.

Try it yourself

No API key needed. Select a memory, store it, then recall it semantically.

memstore.dev — live demo
01 Store
02 Recall
03 Agent
Step 1 — Store a memory
curl -X POST https://memstore.dev/v1/memory/remember \
  -H "Authorization: Bearer am_live_..." \
  -d '{"content": "...", "session": "user_8821"}'
Select a memory to store:
User prefers dark mode, uses React, timezone UTC-5
user_8821
Agent decided to use PostgreSQL for the database
task_42
Customer Sarah reported billing issue on March 15th
support_991
API rate limit is 1000 req/min for the Pro tier
shared
// Response 201 Created
Step 2 — Recall semantically
Try these queries (wording doesn't have to match exactly):
editor preferences
what database did we choose
billing complaint
API limits
UI theme settings
// Results ranked by semantic similarity
Step 3 — Inside your agent
# Before every agent run
memories = ms.recall("user preferences", session="user_8821")
context = "\n".join([m["content"] for m in memories])

# Inject into your LLM prompt
prompt = f"Context:\n{context}\n\nTask: {user_task}"

# After learning something new
ms.remember("User switched to Vue.js", session="user_8821")
Agent now has persistent memory across every run

Every run starts with full context. No re-explaining. No hallucinating past decisions.
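The loop above can be run end-to-end with a minimal in-memory stand-in for the client — `FakeMemstore` is purely illustrative (real calls go over HTTP, and real recall is semantic, not keyword overlap):

```python
class FakeMemstore:
    """In-memory stand-in for the Memstore client (illustrative only)."""

    def __init__(self):
        self._store = []

    def remember(self, content, session=None):
        self._store.append({"content": content, "session": session})

    def recall(self, query, session=None):
        # Real recall is semantic (pgvector); this stand-in just filters
        # by session and naive keyword overlap.
        words = set(query.lower().split())
        return [m for m in self._store
                if (session is None or m["session"] == session)
                and words & set(m["content"].lower().split())]


ms = FakeMemstore()
ms.remember("User prefers dark mode, uses React", session="user_8821")

# Before every agent run: pull context, inject into the prompt
memories = ms.recall("user preferences", session="user_8821")
context = "\n".join(m["content"] for m in memories)
prompt = f"Context:\n{context}\n\nTask: apply the user's UI settings"
```

Swapping `FakeMemstore` for real HTTP calls changes nothing about the shape of the loop: recall before the run, remember after it.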

What is persistent memory
for AI agents?

Persistent memory for AI agents is the ability to store facts, decisions, and context outside the agent runtime and retrieve them semantically on future runs. Unlike context window stuffing, persistent memory scales across sessions, reduces token costs, and gives agents long-term recall without manual state management.

Three steps.
Infinite memory.

01

Remember

POST any text — facts, decisions, user preferences, tool outputs. Memstore embeds it automatically and indexes it for semantic search.

02

Recall

GET with a natural language query. Returns the most relevant memories ranked by semantic similarity — not just keyword matches.

03

Run forever

Every agent run starts with full context. No loops, no repeated work, no hallucinating past decisions. Your agent gets smarter over time.

What your agent needs
to remember

Not a vector DB tutorial. A production-grade memory layer with agent-native primitives.

Semantic recall

pgvector cosine similarity search returns the right memories even when the query wording differs. Agents don't need exact matches to remember.

Session isolation

Tag memories by user, task, or run. Recall across sessions or within a single scope. No cross-contamination between different agent contexts.

TTL + auto-expiry

Set time-to-live on any memory. Short-lived task context expires automatically. Long-term user facts persist forever. You control the lifecycle.
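The `ttl` value is passed in seconds (the store example above uses 2592000 for 30 days); a tiny helper keeps that readable — `ttl_seconds` is our own convenience name, not part of the API:

```python
def ttl_seconds(days=0, hours=0, minutes=0):
    """Convert a human-friendly duration to the seconds the ttl field expects."""
    return ((days * 24 + hours) * 60 + minutes) * 60

ttl = ttl_seconds(days=30)  # 2592000, as in the store example above
```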

LLM-parseable errors

Every error response includes a machine-readable code, human-readable message, and a suggested fix. Agents retry correctly without human intervention.
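A retry handler built on such errors might look like this — the field names (`code`, `message`, `fix`) are assumptions based on the description here, so check the actual error-format reference before relying on them:

```python
import json

def handle_error(status, body):
    """Decide what an agent should do with a structured error response.

    Assumes an error shape like {"code": ..., "message": ..., "fix": ...};
    the real schema may differ.
    """
    err = json.loads(body)
    if status == 429:
        return ("retry_later", err.get("fix", "back off and retry"))
    if status in (400, 422):
        return ("fix_request", err.get("fix", err.get("message", "")))
    return ("give_up", err.get("message", "unknown error"))

action, hint = handle_error(
    429,
    '{"code":"rate_limited","message":"Monthly operation limit reached",'
    '"fix":"Upgrade your plan or wait for reset"}'
)
```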

Memory summarization

Pro tier automatically compresses aging memories into dense summaries. Slash token costs by 40–50% without losing long-term context.

Webhooks on update

Fire webhooks when memories are created, updated, or expire. Sync agent state across services or trigger downstream actions when context changes.

Built for every
agent type

From customer-facing bots to internal automation pipelines.

🎧

Customer support bots

Remember each customer's history, past tickets, and preferences. Resolve issues faster without making customers repeat themselves every time.

💼

Sales & outreach agents

Track every prospect interaction, objection, and follow-up. Agents remember deal context across weeks of back-and-forth.

🤖

Personal AI assistants

Build assistants that genuinely know the user — their preferences, projects, habits, and goals — getting more useful with every interaction.

🔗

Multi-agent pipelines

Share state across agent handoffs. One agent stores a decision; another recalls it 10 steps later — no message-passing spaghetti.

Get API Key →

Free tier · No credit card · First key in 30 seconds

Works with your
existing stack

No SDK required. Any language that can make an HTTP request works with Memstore.

Python
# LangChain / CrewAI / any agent
import requests

requests.post(
  "https://memstore.dev/v1/memory/remember",
  headers={"Authorization": f"Bearer {api_key}"},
  json={"content": "User prefers dark mode"}
)
Node.js
// Any JS agent or framework
await fetch(
  "https://memstore.dev/v1/memory/remember",
  { method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      content: "User prefers dark mode"
    })
  }
)
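Parsing the recall response is equally stack-agnostic; a small helper like this (our own convenience function, not part of any SDK) picks the best match, assuming the response shape shown in the sample earlier:

```python
def top_memory(response_json):
    """Return the highest-scoring memory from a recall response.

    Assumes the shape shown in the sample response:
    {"memories": [{"content": ..., "score": ...}, ...]}
    """
    memories = response_json.get("memories", [])
    return max(memories, key=lambda m: m["score"]) if memories else None

sample = {"memories": [
    {"content": "User prefers dark mode", "score": 0.97},
    {"content": "Uses React", "score": 0.81},
]}
best = top_memory(sample)
```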

You could. Here's what
that actually looks like.

Rolling your own agent memory sounds like a weekend project. It isn't.

01

Set up pgvector

Configure Postgres, enable the extension, tune ivfflat list count, manage migrations. Then do it again for staging.

02

Wire OpenAI embeddings

Call the embedding API on every store and recall. Handle rate limits, retries, model versioning, and dimension mismatches.

03

Build session scoping

Design an agent isolation model, session namespacing, and access controls. Get it wrong and agents bleed context into each other.

04

Implement TTL cleanup

Write cron jobs to expire stale memories. Handle edge cases. Don't delete things that haven't expired yet. Debug at 2am.

05

Add usage metering

Track ops per agent, enforce limits, handle overages gracefully. Wire into billing. Keep counts consistent under concurrent load.

06

Or just use Memstore

Two REST calls. Done in an afternoon. Focus on your agent's actual intelligence, not infrastructure plumbing.

Stop building infrastructure.
Start building agents.

Every alternative means hours of setup, ongoing maintenance, and specialized knowledge.

Feature         | Self-hosted pgvector | Pinecone / Weaviate | Memstore
Setup time      | 2–4 hours            | ~1 hour             | Under 5 minutes
Embedding logic | Manual               | Manual              | Automatic
Maintenance     | High                 | Medium              | None
API style       | SQL + drivers        | Heavy SDK           | Simple REST
Cost to start   | $25+/mo              | Usage + fees        | Free

Ship memory in minutes

Five endpoints. Bearer auth. JSON in, JSON out. Structured errors your agent can actually act on. No SDK required — though we have one.

POST   /v1/memory/remember
GET    /v1/memory/recall?q=...
DELETE /v1/memory/forget/:id
GET    /v1/memory/list
POST   /v1/memory/summarize
DB schema
Error format
-- memories table (Supabase pgvector)
CREATE TABLE memories (
  id        uuid PRIMARY KEY,
  agent_id  uuid REFERENCES agents,
  session   text,
  content   text NOT NULL,
  embedding vector(1536),
  metadata  jsonb,
  ttl       timestamptz,  -- expiry timestamp
  created_at timestamptz DEFAULT now()
);

-- cosine similarity index
CREATE INDEX ON memories
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
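Given that schema, a recall roughly reduces to one query — a sketch of pgvector cosine search, not Memstore's actual internals; `$1`/`$2` are placeholders for the query embedding and session tag:

```sql
-- Top-5 memories for one session, ranked by cosine similarity.
-- $1 = query embedding (vector(1536)), $2 = session tag.
SELECT id, content,
       1 - (embedding <=> $1) AS score
FROM memories
WHERE session = $2
  AND (ttl IS NULL OR ttl > now())
ORDER BY embedding <=> $1
LIMIT 5;
```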

Usage-based pricing. No seats. No surprises.

Pay for what your agents use. Free tier generous enough to build your first production agent.

1 operation = 1 store or recall call. Most agent runs use 20–100 ops.
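Under that definition, a quick back-of-envelope estimate shows what the free tier covers (illustrative arithmetic, not a pricing promise):

```python
FREE_TIER_OPS = 1_000  # free plan operations per month, per the pricing above
OPS_PER_RUN = 20       # low end of the typical 20-100 ops/run range

# Roughly how many agent runs the free tier covers per month:
runs = FREE_TIER_OPS // OPS_PER_RUN  # 50 runs at the low end
```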

Free
$0
forever
No credit card required
  • 1,000 operations/month
  • 50MB memory storage
  • Semantic recall
  • Session isolation
Get started free

Need more? Upgrade to Starter →

Pro
$49
per month
  • 500,000 operations/month
  • 10GB memory storage
  • Auto memory summarization
  • Priority support
  • Custom TTL policies
  • Usage analytics
Upgrade to Pro

Use the same email as your API key signup

Common questions

What frameworks does Memstore work with? +
Any framework that can make HTTP requests. LangGraph, CrewAI, AutoGen, custom MCP/A2A setups, plain Python, Node.js — if it can call a REST API, it works with Memstore. No framework-specific SDK required, though we ship optional ones for Python and Node.
How is this different from just using Supabase directly? +
Memstore handles embedding generation, index management, TTL enforcement, session scoping, memory summarization, and usage metering out of the box. Using Supabase raw requires you to wire all of that yourself — embedding model calls, vector index tuning, cleanup jobs. Memstore is the "it just works" layer so you ship your agent, not your infrastructure.
What embedding model does it use? +
We use OpenAI text-embedding-3-small by default (1536 dimensions, excellent performance/cost ratio). Pro tier can switch to text-embedding-3-large for higher recall accuracy on complex semantic queries.
How do I count an "operation"? +
Each API call counts as one operation — one remember, one recall, one forget, or one list. A summarize call counts as one operation regardless of how many memories it compresses. We log all usage in your dashboard in real time.
Is my agent's data private? +
Yes. Each API key is scoped to an isolated agent namespace. No memory is shared between different API keys. Data is encrypted at rest. We never use your agent's memory data to train any models.
What happens if I exceed my operation limit? +
Recall operations continue to work (read-only). New remember calls return a 429 with a clear JSON error your agent can parse. We'll email you at 80% usage so you can upgrade before hitting the limit.

Your data stays yours

Agent memory contains sensitive data. Memstore is built with privacy as a default.

🔒

Encrypted in transit and at rest

All memories are encrypted at rest and transmitted over TLS. Your data never moves over unencrypted channels.

🌍

Isolated namespaces

Every API key operates in a fully isolated namespace. Your agent's memories are never mixed with other accounts or used to train any models.

📦

Your data belongs to you

We never use your stored memories or queries to train models. Export or delete your data at any time via the API.

🔑

Revoke keys instantly

Rotate or revoke API keys at any time. Each key is scoped to a single agent namespace with no cross-account access.

Give your agents
a memory worth keeping

Free tier. No credit card. First API key in 30 seconds.

Get your free API key →
Free tier, no card required
5 min integration
Cancel anytime