
Using OpenAI Function Calling with Persistent Memory

OpenAI's function calling (now called "tool use") is one of the most powerful features in the API. It lets GPT-4 decide when to call external functions and how to use their results — turning a chat completion into an autonomous agent loop. But there's one capability that's conspicuously absent from every toy example: the agent can't remember anything between API calls.

In this tutorial, we'll wire up OpenAI function calling to Memstore's memory API. By the end, your agent will be able to store memories when it learns something important, and recall relevant context at the start of every new conversation.


Prerequisites

- Python 3.9+ with the openai and requests packages installed
- An OpenAI API key (set as the OPENAI_API_KEY environment variable)
- A Memstore API key (set as MEMSTORE_API_KEY)

How OpenAI Tool Use Works

When you pass a tools array to the OpenAI chat completions API, the model can respond with a tool_calls field instead of (or alongside) its message content. Your code executes the function, passes the result back, and the model continues its response.

The model decides autonomously when to call a tool. If you give it a remember tool with a good description, it will call it when it judges something worth storing. Same with recall.
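
To make that concrete, here's roughly what an assistant message carrying a tool call looks like, shown as a plain dict (the id and argument values are made up for illustration; in the Python SDK you access the same fields as attributes like msg.tool_calls[0].function.arguments):

```python
import json

# Hypothetical shape of an assistant message that requests a tool call.
# The model returns tool_calls instead of (or alongside) text content.
assistant_message = {
    "role": "assistant",
    "content": None,  # no text yet -- the model wants a tool result first
    "tool_calls": [
        {
            "id": "call_abc123",  # echo this back as tool_call_id
            "type": "function",
            "function": {
                "name": "recall",
                "arguments": '{"query": "user preferences", "limit": 5}'
            }
        }
    ]
}

# Note that arguments arrives as a JSON string, not a parsed object:
args = json.loads(assistant_message["tool_calls"][0]["function"]["arguments"])
print(args)  # {'query': 'user preferences', 'limit': 5}
```

The one detail that trips people up: function.arguments is a JSON-encoded string, so you must json.loads() it before calling your function.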


Building the Memory Agent

Step 1: Define the memory tool schemas

tools.py — tool definitions
MEMORY_TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "remember",
            "description": (
                "Store a memory for future conversations. Use this whenever you learn "
                "something important about the user, their preferences, goals, decisions, "
                "or anything else worth remembering across sessions."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "content": {
                        "type": "string",
                        "description": "The memory to store, written as a clear factual statement."
                    },
                    "importance": {
                        "type": "string",
                        "enum": ["low", "medium", "high"],
                        "description": "How important this memory is to retain."
                    }
                },
                "required": ["content"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "recall",
            "description": (
                "Search past memories for context relevant to the current conversation. "
                "Use this when you need to remember user preferences or past decisions."
            ),
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "What you want to remember. Use natural language."
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Maximum number of memories to return (default 5)."
                    }
                },
                "required": ["query"]
            }
        }
    }
]
Step 2: Implement the tool functions

memory.py — tool implementations
import json
import os

import requests

MEMSTORE_KEY = os.getenv("MEMSTORE_API_KEY")
BASE_URL = "https://memstore.dev/v1/memory"
HEADERS = {"Authorization": f"Bearer {MEMSTORE_KEY}"}

def remember(content: str, importance: str = "medium") -> dict:
    r = requests.post(
        f"{BASE_URL}/remember",
        json={"content": content, "metadata": {"importance": importance}},
        headers=HEADERS
    )
    r.raise_for_status()
    return {"status": "stored", "id": r.json()["id"]}

def recall(query: str, limit: int = 5) -> dict:
    r = requests.get(
        f"{BASE_URL}/recall",
        params={"q": query, "limit": limit},
        headers=HEADERS
    )
    r.raise_for_status()
    memories = r.json().get("memories", [])
    return {"memories": [m["content"] for m in memories]}

def dispatch_tool(name: str, arguments: str) -> str:
    args = json.loads(arguments)
    if name == "remember":
        result = remember(**args)
    elif name == "recall":
        result = recall(**args)
    else:
        result = {"error": f"Unknown tool: {name}"}
    return json.dumps(result)
Step 3: Build the agent loop

agent.py — the main agent loop
from openai import OpenAI
from tools import MEMORY_TOOLS
from memory import dispatch_tool

client = OpenAI()

SYSTEM_PROMPT = """You are a helpful personal assistant with long-term memory.

At the start of every conversation, recall relevant context about the user
and what you've discussed before. During conversations, remember new facts,
preferences, and decisions you learn about the user.

Use the remember() tool proactively — don't wait to be asked."""

def chat(user_message: str, conversation_history: list = None) -> str:
    messages = conversation_history or [{
        "role": "system",
        "content": SYSTEM_PROMPT
    }]
    messages.append({"role": "user", "content": user_message})

    while True:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            tools=MEMORY_TOOLS,
            tool_choice="auto"
        )
        msg = response.choices[0].message
        messages.append(msg)

        if not msg.tool_calls:
            return msg.content  # Done — return final response

        # Execute each tool call and append results
        for call in msg.tool_calls:
            result = dispatch_tool(call.function.name, call.function.arguments)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": result
            })
        # Loop: model sees tool results and continues

# Usage
if __name__ == "__main__":
    print(chat("Hi! I'm building a Python API, what should I use?"))
    # Agent recalls prior sessions, stores new facts automatically

The Tool Call Loop in Detail

The while True loop is the key insight. OpenAI's model may make multiple tool calls in a single "turn" — for example, calling recall to load context, generating a response, then calling remember to save something it learned. Each tool result gets appended to the message history, and the loop continues until the model returns a plain text response.
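
Concretely, here's how the message list might grow across one such turn (all contents and ids here are illustrative, not real API output):

```python
# Illustrative message history for one agentic turn (values are made up):
turn = [
    {"role": "system", "content": "You are a helpful assistant with memory."},
    {"role": "user", "content": "What database am I using?"},
    # First completion: the model requests a tool call instead of answering.
    {"role": "assistant", "content": None,
     "tool_calls": [{"id": "call_1", "type": "function",
                     "function": {"name": "recall",
                                  "arguments": '{"query": "user database"}'}}]},
    # Our code runs recall() and appends its result under the same call id.
    {"role": "tool", "tool_call_id": "call_1",
     "content": '{"memories": ["User uses PostgreSQL with Redis for caching."]}'},
    # Second completion: plain text content, no tool_calls -- the loop exits.
    {"role": "assistant",
     "content": "You're on PostgreSQL, with Redis for caching."},
]
print([m["role"] for m in turn])
# ['system', 'user', 'assistant', 'tool', 'assistant']
```

Every tool result must carry the tool_call_id of the call it answers; if the ids don't match up, the API rejects the request.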

Tool description quality matters: The model decides when to call your tools based entirely on their descriptions. Write them like docstrings for a human teammate — explain not just what the function does, but when to use it.

Testing the Memory

Verify memory persists across sessions
# Session 1
response1 = chat("I'm working on a FastAPI backend, using PostgreSQL and Redis.")
print(response1)  # Agent stores the tech stack

# Session 2 — fresh conversation history
response2 = chat("What database should I use for caching?")
print(response2)  # Agent recalls "using Redis", recommends Redis without being told again

Ready to build a memory-powered agent?

Get your Memstore API key and ship persistent memory in under an hour.

Get your free API key →