LangChain's built-in memory classes — ConversationBufferMemory, ConversationSummaryMemory, and friends — are all in-process. The moment your script ends, your Lambda function returns, or your container restarts, every memory is gone.
What this means in practice: your LangChain agent has no idea it already talked to this user. It can't remember the user's preferences, their project context, or decisions made in prior sessions. Every run starts from zero.
The fix is simple: stop relying on in-process memory classes and start writing important facts to an external store after each run — and reading them back at the start of the next one. Memstore gives you this in two REST calls, with no infrastructure to manage.
Get your free API key at memstore.dev and verify the API works before touching any LangChain code:
```shell
curl -X POST https://memstore.dev/v1/memory/remember \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "User is building a fintech app, prefers Python"}'
```

```shell
curl "https://memstore.dev/v1/memory/recall?q=user+background" \
  -H "Authorization: Bearer YOUR_API_KEY"
# Returns: {"memories":[{"content":"User is building a fintech app...","score":0.94}]}
```
Run `pip install memstore`. This guide assumes LangChain and your LLM provider's SDK are already installed.
1. Create a Memstore instance with your API key. Use `ms.remember()` and `ms.recall()` anywhere in your agent.
2. At the start of each chain or agent invocation, fetch the most relevant memories and inject them into your system message.
3. After the chain returns, persist anything worth remembering — the user's question, key facts surfaced, decisions made.
```shell
pip install memstore
```
```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage
from memstore import Memstore

ms = Memstore(api_key="am_live_...")
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.3)

def chat(user_id: str, user_message: str) -> str:
    # 1. Recall relevant context
    memories = ms.recall(user_message, session=user_id)
    context = "\n".join(f"- {m['content']}" for m in memories)

    # 2. Build prompt with memory injected
    system = f"""You are a helpful assistant.

What you remember about this user:
{context or "Nothing yet — this may be their first message."}

Use this context to give personalised, non-repetitive responses."""

    messages = [
        SystemMessage(content=system),
        HumanMessage(content=user_message),
    ]

    # 3. Run the chain
    response = llm.invoke(messages).content

    # 4. Store anything worth remembering
    ms.remember(f"User said: {user_message[:150]}", session=user_id)
    if "prefer" in user_message.lower() or "always" in user_message.lower():
        ms.remember(user_message, session=user_id)  # store explicit preferences

    return response
```
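The keyword check in step 4 is deliberately crude. If you want to refine it, factoring it into a standalone helper makes it easy to test and extend in isolation. A minimal sketch (`is_worth_remembering` is a hypothetical helper, not part of the Memstore SDK, and the signal list is just an example):

```python
def is_worth_remembering(message: str) -> bool:
    """Crude heuristic: keep messages that state preferences,
    standing instructions, or durable project facts."""
    signals = ("prefer", "always", "never", "i'm building", "my project")
    lowered = message.lower()
    return any(signal in lowered for signal in signals)

print(is_worth_remembering("I always want code samples in Python"))  # True
print(is_worth_remembering("What's the capital of France?"))         # False
```

In production you might replace the keyword list with a cheap LLM classification call, but a heuristic like this costs nothing and catches the obvious cases.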
Always pass a session parameter equal to your user's ID. This isolates one user's memories from another's — no cross-contamination even when thousands of users share the same API key.
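To see why this scoping matters, here is a toy in-memory stand-in for the client (`FakeMemstore` is an illustrative stub, not the real SDK; the real service scores recalls by relevance rather than returning everything):

```python
class FakeMemstore:
    """Toy stand-in mimicking remember()/recall() with per-session isolation."""
    def __init__(self):
        self._store = {}  # session id -> list of memory dicts

    def remember(self, content: str, session: str) -> None:
        self._store.setdefault(session, []).append({"content": content})

    def recall(self, query: str, session: str):
        # The real API ranks memories against the query; this stub
        # just returns everything stored under the same session.
        return list(self._store.get(session, []))

ms = FakeMemstore()
ms.remember("Prefers Python", session="user-alice")
ms.remember("Prefers Go", session="user-bob")

# Alice's recall never sees Bob's memories:
print([m["content"] for m in ms.recall("language", session="user-alice")])  # ['Prefers Python']
```

Because each `session` keys its own memory list, queries for one user can never surface another user's facts.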
If you're using LangChain Expression Language, wrap your chain invocation with the same recall-before / store-after pattern:
```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from memstore import Memstore

ms = Memstore(api_key="am_live_...")

prompt = ChatPromptTemplate.from_messages([
    ("system", "User context:\n{memory}\n\nBe concise and helpful."),
    ("human", "{input}"),
])
chain = prompt | ChatOpenAI(model="gpt-4o-mini")

def invoke(user_id: str, user_input: str) -> str:
    memories = ms.recall(user_input, session=user_id)
    result = chain.invoke({
        "memory": "\n".join(m["content"] for m in memories) or "No prior context.",
        "input": user_input,
    })
    ms.remember(user_input, session=user_id)
    return result.content
```
Free tier — 1,000 ops/month. No credit card. Works in 10 minutes.
Get your API key →