Persistent memory for AI agents is the ability to store facts, decisions, and context between agent runs and retrieve them semantically on demand. You can add it to any agent in minutes using Memstore's REST API — one POST to store a memory, one GET to recall the most relevant ones.
By default, every agent run starts from a blank slate. Your LangChain chain, CrewAI crew, or AutoGen pipeline has no idea what happened last session. It doesn't know the user's preferences, what decisions were already made, or what tools have already been tried.
The result: agents ask the same questions repeatedly, hallucinate past decisions, and can't personalise responses for returning users. This is the single biggest gap between demo agents and production agents.
The fix is straightforward: store important context after each run, and inject relevant memories at the start of the next one. Memstore makes this two REST calls.
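Before wiring up any API, the loop itself is worth seeing in isolation. This is a minimal in-memory sketch of the store-then-recall pattern, with no Memstore and no network; a real store ranks results semantically, while this toy version cheats with a substring match:

```python
# Minimal in-memory sketch of the store-then-recall loop.
# The pattern is the same whatever the backing store.
memories: dict[str, list[str]] = {}

def remember(session: str, fact: str) -> None:
    memories.setdefault(session, []).append(fact)

def recall(session: str, query: str) -> list[str]:
    # A real memory store ranks semantically; this toy version
    # uses a naive substring match just to show the shape.
    return [m for m in memories.get(session, []) if query.lower() in m.lower()]

# Run 1: the agent learns something and stores it.
remember("user-42", "Prefers concise answers")

# Run 2 (a fresh process in real life): recall before building the prompt.
context = recall("user-42", "concise")
print(context)  # ['Prefers concise answers']
```

The session key is what keeps one user's memories out of another user's prompts, which is why every call in the examples below passes `session=user_id`.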
Get your free API key at memstore.dev, then run these two curl commands to prove it works:
```shell
curl -X POST https://memstore.dev/v1/memory/remember \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers concise answers, works in fintech"}'
```
```shell
curl "https://memstore.dev/v1/memory/recall?q=user+preferences" \
  -H "Authorization: Bearer YOUR_API_KEY"
```
That's it. The second call returns the most semantically relevant memories ranked by cosine similarity — not keyword matching. Ask about "user preferences" and you'll surface memories about communication style, domain, and working habits.
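Memstore's embedding model and scoring are internal to the service, but cosine-similarity ranking itself is easy to illustrate. A toy sketch with made-up 3-dimensional "embeddings" (real ones have hundreds of dimensions) shows why a query about preferences surfaces a preference memory rather than an unrelated one:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings
query = [0.9, 0.1, 0.2]      # "user preferences"
memory_a = [0.8, 0.2, 0.3]   # "prefers concise answers"
memory_b = [0.1, 0.9, 0.1]   # "deployed on Tuesday"

ranked = sorted(
    [("memory_a", cosine_similarity(query, memory_a)),
     ("memory_b", cosine_similarity(query, memory_b))],
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked[0][0])  # memory_a ranks first: closest in embedding space
```

Because ranking happens in embedding space, the memory doesn't need to share any words with the query to be recalled.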
Install the Memstore SDK, then use two method calls to store and recall memory in any agent.
```shell
pip install memstore
```
```python
from memstore import Memstore

ms = Memstore(api_key="am_live_...")

def run_agent(user_id: str, user_message: str):
    # Recall relevant context before building the prompt
    memories = ms.recall(user_message, session=user_id)
    memory_block = "\n".join(m["content"] for m in memories)

    system_prompt = f"""You are a helpful assistant.

What you remember about this user:
{memory_block}

Use this context to personalise your response."""

    # ... call your LLM here ...
    response = llm.chat(system_prompt, user_message)

    # Store anything worth remembering after the run
    ms.remember(f"User asked: {user_message[:120]}", session=user_id)
    return response
```
```javascript
const BASE = 'https://memstore.dev/v1/memory';
const HEADERS = {
  'Authorization': `Bearer ${process.env.MEMSTORE_API_KEY}`,
  'Content-Type': 'application/json',
};

async function remember(content, session) {
  await fetch(`${BASE}/remember`, {
    method: 'POST',
    headers: HEADERS,
    body: JSON.stringify({ content, session }),
  });
}

async function recall(query, session, limit = 5) {
  const params = new URLSearchParams({ q: query, limit, ...(session && { session }) });
  const res = await fetch(`${BASE}/recall?${params}`, { headers: HEADERS });
  const data = await res.json();
  return data.memories.map(m => m.content);
}

module.exports = { remember, recall };
```
LangChain has built-in memory classes, but they're all in-process and ephemeral. Swap them for Memstore to get persistence across process restarts, deployments, and users.
```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage
from memstore import Memstore

ms = Memstore(api_key="am_live_...")
llm = ChatOpenAI(model="gpt-4o")

def langchain_agent(user_id: str, query: str) -> str:
    # Recall before the call; the fallback keeps the prompt valid on a first run
    memories = ms.recall(query, session=user_id)
    context = "\n".join(m["content"] for m in memories) or "No prior context."

    messages = [
        SystemMessage(content=f"User context:\n{context}"),
        HumanMessage(content=query),
    ]
    result = llm.invoke(messages)

    # Persist a trimmed copy of the answer for future runs
    ms.remember(result.content[:200], session=user_id)
    return result.content
```
CrewAI agents are stateless between crew runs. Inject Memstore recall into your task descriptions or agent backstory, and store key outputs after each run completes.
```python
from crewai import Agent, Task, Crew
from memstore import Memstore

ms = Memstore(api_key="am_live_...")

def run_crew(user_id: str, goal: str):
    # Inject prior research into the agent's backstory
    past = ms.recall(goal, session=user_id)
    context = "\n".join(m["content"] for m in past) or "First run — no prior context."

    researcher = Agent(
        role="Researcher",
        goal=goal,
        backstory=f"Prior research context:\n{context}",
        verbose=False,
    )
    task = Task(description=goal, agent=researcher, expected_output="Summary")
    result = Crew(agents=[researcher], tasks=[task]).kickoff()

    # Store a trimmed summary of the crew's output
    ms.remember(f"Crew result: {str(result)[:200]}", session=user_id)
    return result
```
AutoGen conversation state is lost when the script ends. Use Memstore to persist key decisions and user facts across AutoGen runs.
```python
import autogen
from memstore import Memstore

ms = Memstore(api_key="am_live_...")

def build_system_message(user_id: str, task: str) -> str:
    memories = ms.recall(task, session=user_id)
    context = "\n".join(m["content"] for m in memories)
    return f"""You are a helpful assistant.

Relevant history for this user:
{context}

Use this context to avoid repeating work already done."""

def run_autogen(user_id: str, task: str) -> None:
    user_proxy = autogen.UserProxyAgent(name="User")
    assistant = autogen.AssistantAgent(
        name="Assistant",
        system_message=build_system_message(user_id, task),
    )
    chat_result = user_proxy.initiate_chat(assistant, message=task)

    # After the run completes, store a trimmed summary for the next session
    ms.remember(str(chat_result.summary)[:200], session=user_id)
```
Free tier — 1,000 ops/month. No credit card. First API key in 30 seconds.
Get started at memstore.dev →