Memory API for AI Agents

Your agents finally
remember everything

Persistent memory API for AI agents. Store and recall across runs.

One POST to remember. One GET to recall. Persistent semantic memory for every agent — no vector DB setup required.

20–100 calls/run typical agent usage
<50ms recall latency
Works with any framework
pgvector semantic search
api.memstore.dev/v1/memory
Store a memory
const res = await fetch('https://api.memstore.dev/v1/memory/remember', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    content: 'User prefers dark mode, uses React, timezone UTC-5',
    session: 'user_8821',
    ttl: 2592000 // 30 days
  })
});
Recall semantically
// GET /v1/memory/recall?q=user+preferences

// Response:
{
  "memories": [{
    "id": "mem_k9x2...",
    "content": "User prefers dark mode, uses React...",
    "score": 0.97,
    "session": "user_8821",
    "age": "2 hours ago"
  }],
  "tokens_used": 142
}

Two calls.
That's the core API.

Copy your key and run these in your terminal.

Try it in 60 seconds

Store a memory
POST
curl -X POST https://memstore.dev/v1/memory/remember \
  -H "Authorization: Bearer YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content":"User prefers dark mode, uses React"}'
Recall semantically
GET
curl "https://memstore.dev/v1/memory/recall?q=user+settings" \
  -H "Authorization: Bearer YOUR_KEY"

# {"memories":[{"content":"User prefers dark mode, uses React","score":0.97,"age":"2h ago"}]}

Ready for production? Upgrade to Starter →

AI agents forget everything
between runs

Every session starts from zero. Your agent has no idea what the user said last week, what decisions were made, or what it already tried. That's not intelligence — that's amnesia.

🔁

Repeated questions

Agents ask users the same things over and over. Every run re-discovers what should have been remembered.

🌀

Context stuffing

Dumping entire conversation history into every prompt is expensive, slow, and hits context limits fast.

🎭

Hallucinating history

Without real memory, agents make up plausible-sounding facts about past interactions. Users notice.

Try it yourself

No API key needed. Select a memory, store it, then recall it semantically.

memstore.dev — live demo
01 Store
02 Recall
03 Agent
Step 1 — Store a memory
curl -X POST https://memstore.dev/v1/memory/remember \
  -H "Authorization: Bearer am_live_..." \
  -d '{"content": "...", "session": "user_8821"}'
Select a memory to store:
User prefers dark mode, uses React, timezone UTC-5
user_8821
Agent decided to use PostgreSQL for the database
task_42
Customer Sarah reported billing issue on March 15th
support_991
API rate limit is 1000 req/min for the Pro tier
shared
// Response 201 Created
Step 2 — Recall semantically
Try these queries (wording doesn't have to match exactly):
editor preferences
what database did we choose
billing complaint
API limits
UI theme settings
// Results ranked by semantic similarity
Step 3 — Inside your agent
# Before every agent run
memories = ms.recall("user preferences", session="user_8821")
context = "\n".join([m["content"] for m in memories])

# Inject into your LLM prompt
prompt = f"Context:\n{context}\n\nTask: {user_task}"

# After learning something new
ms.remember("User switched to Vue.js", session="user_8821")
Agent now has persistent memory across every run

Every run starts with full context. No re-explaining. No hallucinating past decisions.
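The loop above can be run end-to-end with a minimal in-memory stand-in for the client — `FakeMemstore` is purely illustrative (real calls go over HTTP, and real recall is semantic, not keyword overlap):

```python
class FakeMemstore:
    """In-memory stand-in for the Memstore client (illustrative only)."""

    def __init__(self):
        self._store = []

    def remember(self, content, session=None):
        self._store.append({"content": content, "session": session})

    def recall(self, query, session=None):
        # Real recall is semantic (pgvector); this stand-in just filters
        # by session and naive keyword overlap.
        words = set(query.lower().split())
        return [m for m in self._store
                if (session is None or m["session"] == session)
                and words & set(m["content"].lower().split())]


ms = FakeMemstore()
ms.remember("User prefers dark mode, uses React", session="user_8821")

# Before every agent run: pull context, inject into the prompt
memories = ms.recall("user preferences", session="user_8821")
context = "\n".join(m["content"] for m in memories)
prompt = f"Context:\n{context}\n\nTask: apply the user's UI settings"
```

Swapping `FakeMemstore` for real HTTP calls changes nothing about the shape of the loop: recall before the run, remember after it.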

What is persistent memory
for AI agents?

Persistent memory for AI agents is the ability to store facts, decisions, and context outside the agent runtime and retrieve them semantically on future runs. Unlike context window stuffing, persistent memory scales across sessions, reduces token costs, and gives agents long-term recall without manual state management.

Three steps.
Infinite memory.

01

Remember

POST any text — facts, decisions, user preferences, tool outputs. Memstore embeds it automatically and indexes it for semantic search.

02

Recall

GET with a natural language query. Returns the most relevant memories ranked by semantic similarity — not just keyword matches.

03

Run forever

Every agent run starts with full context. No loops, no repeated work, no hallucinating past decisions. Your agent gets smarter over time.

What your agent needs
to remember

Not a vector DB tutorial. A production-grade memory layer with agent-native primitives.

Semantic recall

pgvector cosine similarity search returns the right memories even when the query wording differs. Agents don't need exact matches to remember.

Session isolation

Tag memories by user, task, or run. Recall across sessions or within a single scope. No cross-contamination between different agent contexts.

TTL + auto-expiry

Set time-to-live on any memory. Short-lived task context expires automatically. Long-term user facts persist forever. You control the lifecycle.
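The `ttl` value is passed in seconds (the store example above uses 2592000 for 30 days); a tiny helper keeps that readable — `ttl_seconds` is our own convenience name, not part of the API:

```python
def ttl_seconds(days=0, hours=0, minutes=0):
    """Convert a human-friendly duration to the seconds the ttl field expects."""
    return ((days * 24 + hours) * 60 + minutes) * 60

ttl = ttl_seconds(days=30)  # 2592000, as in the store example above
```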

LLM-parseable errors

Every error response includes a machine-readable code, human-readable message, and a suggested fix. Agents retry correctly without human intervention.
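A retry handler built on such errors might look like this — the field names (`code`, `message`, `fix`) are assumptions based on the description here, so check the actual error-format reference before relying on them:

```python
import json

def handle_error(status, body):
    """Decide what an agent should do with a structured error response.

    Assumes an error shape like {"code": ..., "message": ..., "fix": ...};
    the real schema may differ.
    """
    err = json.loads(body)
    if status == 429:
        return ("retry_later", err.get("fix", "back off and retry"))
    if status in (400, 422):
        return ("fix_request", err.get("fix", err.get("message", "")))
    return ("give_up", err.get("message", "unknown error"))

action, hint = handle_error(
    429,
    '{"code":"rate_limited","message":"Monthly operation limit reached",'
    '"fix":"Upgrade your plan or wait for reset"}'
)
```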

Memory summarization

Pro tier automatically compresses aging memories into dense summaries. Slash token costs by 40–50% without losing long-term context.

Webhooks on update

Fire webhooks when memories are created, updated, or expire. Sync agent state across services or trigger downstream actions when context changes.

Built for every
agent type

From customer-facing bots to internal automation pipelines.

🎧

Customer support bots

Remember each customer's history, past tickets, and preferences. Resolve issues faster without making customers repeat themselves every time.

💼

Sales & outreach agents

Track every prospect interaction, objection, and follow-up. Agents remember deal context across weeks of back-and-forth.

🤖

Personal AI assistants

Build assistants that genuinely know the user — their preferences, projects, habits, and goals — getting more useful with every interaction.

🔗

Multi-agent pipelines

Share state across agent handoffs. One agent stores a decision; another recalls it 10 steps later — no message-passing spaghetti.

Get API Key →

Free tier · No credit card · First key in 30 seconds

Works with your
existing stack

No SDK required. Any language that can make an HTTP request works with Memstore.

Python
# LangChain / CrewAI / any agent
import requests

requests.post(
  "https://memstore.dev/v1/memory/remember",
  headers={"Authorization": f"Bearer {api_key}"},
  json={"content": "User prefers dark mode"}
)
Node.js
// Any JS agent or framework
await fetch(
  "https://memstore.dev/v1/memory/remember",
  { method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      content: "User prefers dark mode"
    })
  }
)
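Parsing the recall response is equally stack-agnostic; a small helper like this (our own convenience function, not part of any SDK) picks the best match, assuming the response shape shown in the sample earlier:

```python
def top_memory(response_json):
    """Return the highest-scoring memory from a recall response.

    Assumes the shape shown in the sample response:
    {"memories": [{"content": ..., "score": ...}, ...]}
    """
    memories = response_json.get("memories", [])
    return max(memories, key=lambda m: m["score"]) if memories else None

sample = {"memories": [
    {"content": "User prefers dark mode", "score": 0.97},
    {"content": "Uses React", "score": 0.81},
]}
best = top_memory(sample)
```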

You could. Here's what
that actually looks like.

Rolling your own agent memory sounds like a weekend project. It isn't.

01

Set up pgvector

Configure Postgres, enable the extension, tune ivfflat list count, manage migrations. Then do it again for staging.

02

Wire OpenAI embeddings

Call the embedding API on every store and recall. Handle rate limits, retries, model versioning, and dimension mismatches.

03

Build session scoping

Design an agent isolation model, session namespacing, and access controls. Get it wrong and agents bleed context into each other.

04

Implement TTL cleanup

Write cron jobs to expire stale memories. Handle edge cases. Don't delete things that haven't expired yet. Debug at 2am.

05

Add usage metering

Track ops per agent, enforce limits, handle overages gracefully. Wire into billing. Keep counts consistent under concurrent load.

06

Or just use Memstore

Two REST calls. Done in an afternoon. Focus on your agent's actual intelligence, not infrastructure plumbing.

Stop building infrastructure.
Start building agents.

Every alternative means hours of setup, ongoing maintenance, and specialized knowledge.

Feature         | Self-hosted pgvector | Pinecone / Weaviate | Memstore
Setup time      | 2–4 hours            | ~1 hour             | Under 5 minutes
Embedding logic | Manual               | Manual              | Automatic
Maintenance     | High                 | Medium              | None
API style       | SQL + drivers        | Heavy SDK           | Simple REST
Cost to start   | $25+/mo              | Usage + fees        | Free

Ship memory in minutes

Five endpoints. Bearer auth. JSON in, JSON out. Structured errors your agent can actually act on. No SDK required — though we have one.

POST   /v1/memory/remember
GET    /v1/memory/recall?q=...
DELETE /v1/memory/forget/:id
GET    /v1/memory/list
POST   /v1/memory/summarize
DB schema
Error format
-- memories table (Supabase pgvector)
CREATE TABLE memories (
  id        uuid PRIMARY KEY,
  agent_id  uuid REFERENCES agents,
  session   text,
  content   text NOT NULL,
  embedding vector(1536),
  metadata  jsonb,
  ttl       timestamptz,  -- expiry timestamp
  created_at timestamptz DEFAULT now()
);

-- cosine similarity index
CREATE INDEX ON memories
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
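Given that schema, a recall roughly reduces to one query — a sketch of pgvector cosine search, not Memstore's actual internals; `$1`/`$2` are placeholders for the query embedding and session tag:

```sql
-- Top-5 memories for one session, ranked by cosine similarity.
-- $1 = query embedding (vector(1536)), $2 = session tag.
SELECT id, content,
       1 - (embedding <=> $1) AS score
FROM memories
WHERE session = $2
  AND (ttl IS NULL OR ttl > now())
ORDER BY embedding <=> $1
LIMIT 5;
```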

Usage-based pricing. No seats. No surprises.

Pay for what your agents use. Free tier generous enough to build your first production agent.

1 operation = 1 store or recall call. Most agent runs use 20–100 ops.
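Under that definition, a quick back-of-envelope estimate shows what the free tier covers (illustrative arithmetic, not a pricing promise):

```python
FREE_TIER_OPS = 1_000  # free plan operations per month, per the pricing above
OPS_PER_RUN = 20       # low end of the typical 20-100 ops/run range

# Roughly how many agent runs the free tier covers per month:
runs = FREE_TIER_OPS // OPS_PER_RUN  # 50 runs at the low end
```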

Free
$0
forever
No credit card required
  • 1,000 operations/month
  • 50MB memory storage
  • Semantic recall
  • Session isolation
Get started free

Need more? Upgrade to Starter →

Pro
$49
per month
  • 500,000 operations/month
  • 10GB memory storage
  • Auto memory summarization
  • Priority support
  • Custom TTL policies
  • Usage analytics
Upgrade to Pro

Use the same email as your API key signup

Common questions

What frameworks does Memstore work with? +
Any framework that can make HTTP requests. LangGraph, CrewAI, AutoGen, custom MCP/A2A setups, plain Python, Node.js — if it can call a REST API, it works with Memstore. No framework-specific SDK required, though we ship optional ones for Python and Node.
How is this different from just using Supabase directly? +
Memstore handles embedding generation, index management, TTL enforcement, session scoping, memory summarization, and usage metering out of the box. Using Supabase raw requires you to wire all of that yourself — embedding model calls, vector index tuning, cleanup jobs. Memstore is the "it just works" layer so you ship your agent, not your infrastructure.
What embedding model does it use? +
We use OpenAI text-embedding-3-small by default (1536 dimensions, excellent performance/cost ratio). Pro tier can switch to text-embedding-3-large for higher recall accuracy on complex semantic queries.
How do I count an "operation"? +
Each API call counts as one operation — one remember, one recall, one forget, or one list. A summarize call counts as one operation regardless of how many memories it compresses. We log all usage in your dashboard in real time.
Is my agent's data private? +
Yes. Each API key is scoped to an isolated agent namespace. No memory is shared between different API keys. Data is encrypted at rest. We never use your agent's memory data to train any models.
What happens if I exceed my operation limit? +
Recall operations continue to work (read-only). New remember calls return a 429 with a clear JSON error your agent can parse. We'll email you at 80% usage so you can upgrade before hitting the limit.

Your data stays yours

Agent memory contains sensitive data. Memstore is built with privacy as a default.

🔒

Encrypted in transit and at rest

All memories are encrypted at rest and transmitted over TLS. Your data never moves over unencrypted channels.

🌍

Isolated namespaces

Every API key operates in a fully isolated namespace. Your agent's memories are never mixed with other accounts or used to train any models.

📦

Your data belongs to you

We never use your stored memories or queries to train models. Export or delete your data at any time via the API.

🔑

Revoke keys instantly

Rotate or revoke API keys at any time. Each key is scoped to a single agent namespace with no cross-account access.

Give your agents
a memory worth keeping

Free tier. No credit card. First API key in 30 seconds.

Get your free API key →
Free tier, no card required
5 min integration
Cancel anytime