Microsoft's AutoGen is one of the most powerful frameworks for building multi-agent systems. You can spin up a ConversableAgent, hand it tools, and watch agents negotiate solutions autonomously. But there's a hard limit baked into every AutoGen setup: when the conversation ends, all context is gone.
The next conversation starts from zero. Your agent doesn't remember the user's preferences, the decisions it made last week, or the research it spent three API calls conducting. This tutorial shows you exactly how to fix that with Memstore's REST memory API.
AutoGen manages conversation state through its chat_messages attribute, a dictionary that maps each conversation partner to a list of message objects that grows during a session. This is perfect for in-conversation coherence, but it's fundamentally an in-memory Python data structure. Once the process exits, it's gone.
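To see why that ephemerality matters, here is a minimal plain-Python illustration of the shape of that state (roughly how AutoGen keys per-partner message lists; no LLM calls involved):

```python
from collections import defaultdict

# AutoGen keeps per-partner message lists in process memory, roughly like this:
chat_messages = defaultdict(list)
chat_messages["user"].append({"role": "user", "content": "My name is Alice."})
chat_messages["user"].append({"role": "assistant", "content": "Nice to meet you, Alice!"})

print(len(chat_messages["user"]))  # available only within this process
# When the process exits, this dict is garbage-collected.
# A new run starts with an empty dict: no Alice, no preferences, nothing.
```

Nothing here is persisted anywhere; restart the script and the agent has never heard of Alice.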
AutoGen does offer a TeachableAgent that can write facts to a local SQLite database, but this approach has real limitations in production: the database lives on a single machine, it can't be shared across processes or agents, and it disappears whenever its environment (a container, a serverless function) is torn down.
What you actually need is a dedicated memory layer that lives outside your agent process. That's exactly what Memstore provides.
The pattern is simple. Your AutoGen agents get two new tools: remember and recall. When an agent learns something worth keeping, it calls remember(). When it needs context at the start of a new session, it calls recall().
Memstore handles the storage, vector indexing, and semantic retrieval. Your agent code stays clean.
```shell
pip install pyautogen requests
```
Get your free API key at memstore.dev. The free tier gives you 1,000 memory operations per month — plenty to get started.
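The helper code below reads the key from the MEMSTORE_API_KEY environment variable rather than hardcoding it, so export it in your shell first (the value shown is a placeholder):

```shell
# Replace the placeholder with the key from memstore.dev
export MEMSTORE_API_KEY="your-key-here"
```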
```python
import os

import requests

MEMSTORE_KEY = os.getenv("MEMSTORE_API_KEY")
MEMSTORE_URL = "https://memstore.dev/v1/memory"
HEADERS = {"Authorization": f"Bearer {MEMSTORE_KEY}"}


def remember(content: str, tags: list[str] | None = None) -> str:
    """Store a memory in Memstore. Use this to save anything worth keeping
    between sessions: user preferences, decisions, research findings, facts."""
    payload = {"content": content}
    if tags:
        payload["tags"] = tags
    res = requests.post(
        f"{MEMSTORE_URL}/remember",
        json=payload,
        headers=HEADERS,
    )
    res.raise_for_status()
    return f"Memory stored (id: {res.json()['id']})"


def recall(query: str, limit: int = 5) -> str:
    """Search past memories relevant to a query. Returns the most semantically
    similar stored memories. Use this at the start of sessions or when you
    need context about a user or topic."""
    res = requests.get(
        f"{MEMSTORE_URL}/recall",
        params={"q": query, "limit": limit},
        headers=HEADERS,
    )
    res.raise_for_status()
    memories = res.json().get("memories", [])
    if not memories:
        return "No relevant memories found."
    return "\n".join(f"- {m['content']}" for m in memories)
```
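As written, these helpers raise on any transient network failure, which would abort the whole agent turn. In production you may want a small retry layer; the sketch below is generic and not part of Memstore or AutoGen, and the attempt count and backoff schedule are assumptions to tune:

```python
import time


def with_retries(fn, attempts=3, backoff=0.5, retry_on=(Exception,)):
    """Call fn(), retrying on the given exceptions with exponential backoff.

    Wrap the requests-based remember/recall calls with
    retry_on=(requests.RequestException,) so a transient network error
    is retried instead of failing the agent's tool call outright.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(backoff * (2 ** attempt))
```

Usage would look like `with_retries(lambda: remember("Alice prefers weekly summaries"), retry_on=(requests.RequestException,))`.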
```python
import autogen

from memory_tools import remember, recall

config_list = [{"model": "gpt-4o", "api_key": "YOUR_OPENAI_KEY"}]

assistant = autogen.AssistantAgent(
    name="assistant",
    llm_config={
        "config_list": config_list,
        "functions": [
            {
                "name": "remember",
                "description": remember.__doc__,
                "parameters": {
                    "type": "object",
                    "properties": {
                        "content": {"type": "string"},
                        "tags": {"type": "array", "items": {"type": "string"}},
                    },
                    "required": ["content"],
                },
            },
            {
                "name": "recall",
                "description": recall.__doc__,
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {"type": "string"},
                        "limit": {"type": "integer"},
                    },
                    "required": ["query"],
                },
            },
        ],
    },
)

user_proxy = autogen.UserProxyAgent(
    name="user",
    human_input_mode="NEVER",
    function_map={
        "remember": remember,
        "recall": recall,
    },
)
```
The real power shows up when you prime each new conversation with relevant context. Before starting any conversation, inject a recall call to load what your agent already knows.
```python
def start_conversation_with_memory(user_message: str, user_id: str):
    # Load relevant context before starting
    past_context = recall(f"context for user {user_id}: {user_message}")

    # Inject memory into the system prompt
    system_message = f"""You are a helpful assistant with persistent memory.

Relevant context from past sessions:
{past_context}

Use this context to personalize your responses. When you learn new
important facts, call remember() to store them for future sessions."""

    assistant.update_system_message(system_message)
    user_proxy.initiate_chat(
        assistant,
        message=user_message,
    )
```
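Priming works best when what you store is compact. A complementary pattern (a sketch; the function name and digest format are my own, not AutoGen's) is to condense the finished transcript before passing it to remember():

```python
def summarize_transcript(messages, max_items=5, max_chars=200):
    """Condense a chat transcript (a list of {'role': ..., 'content': ...}
    dicts, the shape AutoGen keeps per partner in chat_messages) into a
    short newline-separated digest suitable for remember()."""
    lines = []
    for m in messages[-max_items:]:           # keep only the tail of the chat
        content = (m.get("content") or "").strip()
        if content:                           # skip empty / function-call turns
            lines.append(f"{m.get('role', 'unknown')}: {content[:max_chars]}")
    return "\n".join(lines)
```

After a chat finishes, something like `remember(summarize_transcript(user_proxy.chat_messages[assistant]), tags=[f"user:{user_id}"])` stores a digest cheap enough to inject wholesale into the next session's system prompt.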
To keep memories separate per user in a multi-user deployment, tag each one (for example, remember(content, tags=["user:alice", "preference"])) and filter on the matching tag at recall time.

Because Memstore is a shared API, multiple AutoGen agents can read from and write to the same memory store. A researcher agent can save findings that a writer agent uses in the next run, with no manual coordination needed.
This is the fundamental difference from local memory solutions. Your memory layer is as scalable and accessible as any REST API — which means it works in distributed systems, serverless functions, and across team members' local environments.
One API. Persistent context across every session. Free to start.
Get your free API key →