Open In Colab

Context and State

Tools are most powerful when they can access runtime information like conversation history, user data, and persistent memory. This section covers how to access and update this information from within your tools.

Tools can access runtime information through the ToolRuntime parameter, which provides:

  • Runtime Context: read-only configs passed at invocation time (e.g., user_id, session_info)
  • State: thread-level mutable data that exists for the current conversation (messages, counters, custom fields)
  • RunnableConfig: thread_id and other metadata to manage conversation state
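Conceptually, a tool receives one object bundling these three pieces of information. The sketch below is plain Python (not the real LangChain class, which carries more machinery) and uses made-up field values, but it shows the shape of what a tool can read:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of what a ToolRuntime exposes to a tool.
@dataclass
class RuntimeSketch:
    context: dict                                 # read-only config passed at invocation time
    state: dict                                   # mutable, thread-level conversation data
    config: dict = field(default_factory=dict)    # thread_id and other metadata

runtime = RuntimeSketch(
    context={"user_id": "user123"},
    state={"messages": [], "user_preferences": {"theme": "dark"}},
    config={"configurable": {"thread_id": "1"}},
)

print(runtime.context["user_id"])                   # user123
print(runtime.state["user_preferences"]["theme"])   # dark
print(runtime.config["configurable"]["thread_id"])  # 1
```

The real ToolRuntime is typed (e.g. ToolRuntime[UserContext]) and injected by the framework, as the examples below show.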

Let’s pick our chat model:

from pprint import pprint
import os
from langchain_openai import ChatOpenAI
from langchain.agents import create_agent
from langchain.messages import (
    HumanMessage
)


# https://openrouter.ai/nvidia/nemotron-3-nano-30b-a3b:free
model_nemotron3_nano = ChatOpenAI(
    model="nvidia/nemotron-3-nano-30b-a3b:free",
    temperature=0,
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ.get("OPENROUTER_API_KEY"),
)

Runtime Context

Runtime Context provides immutable configuration data that is passed at invocation time. Use it for user IDs, session details, or application-specific settings that shouldn’t change during a conversation.

Let’s assume we have the following database:

USER_DATABASE = {
    "user123": {
        "name": "Alice Johnson",
        "account_type": "Premium",
        "balance": 5000,
        "email": "alice@example.com"
    },
    "user456": {
        "name": "Bob Smith",
        "account_type": "Standard",
        "balance": 1200,
        "email": "bob@example.com"
    }
}

First, define the context as a @dataclass:

from dataclasses import dataclass

@dataclass
class UserContext:
    user_id: str

Tools access context through runtime.context, like so:

from langchain.tools import tool, ToolRuntime


@tool
def get_account_info(runtime: ToolRuntime[UserContext]) -> str:
    """Get the current user's account information."""
    user_id = runtime.context.user_id

    if user_id in USER_DATABASE:
        user = USER_DATABASE[user_id]
        return f"Account holder: {user['name']}\nType: {user['account_type']}\nBalance: ${user['balance']}"
    return "User not found"

When calling create_agent(), pass the dataclass as context_schema:

agent = create_agent(
    model=model_nemotron3_nano,
    tools=[get_account_info],
    context_schema=UserContext,
    system_prompt="You are a financial assistant."
)

When calling invoke(), pass a UserContext instance as context:

result = agent.invoke(
    {"messages": [HumanMessage("What's my current balance?")]},
    context=UserContext(user_id="user123")
)
for msg in result["messages"]:
    msg.pretty_print()
================================ Human Message =================================

What's my current balance?
================================== Ai Message ==================================
Tool Calls:
  get_account_info (call_fceccda42dba4f40b5c7b068)
 Call ID: call_fceccda42dba4f40b5c7b068
  Args:
================================= Tool Message =================================
Name: get_account_info

Account holder: Alice Johnson
Type: Premium
Balance: $5000
================================== Ai Message ==================================

Your current balance is **$5,000**.

Here's the full account summary:
- **Account holder**: Alice Johnson
- **Account type**: Premium

Let me know if you'd like additional details! 😊

Conversation State

Conversation state is persisted to a database (or temporary in-memory storage) using a checkpointer object, so a thread can be resumed at any time.

A thread organizes multiple interactions in a session, similar to the way email groups messages in a single conversation.
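Conceptually, a checkpointer is a store of per-thread state keyed by thread_id. This plain-Python sketch (not the real LangGraph API) illustrates the idea before we use the actual InMemorySaver:

```python
# Minimal sketch: a checkpointer maps thread_id -> saved conversation state.
class DictCheckpointer:
    def __init__(self):
        self._threads = {}

    def load(self, thread_id):
        # Unknown threads start with empty state.
        return self._threads.get(thread_id, {"messages": []})

    def save(self, thread_id, state):
        self._threads[thread_id] = state

cp = DictCheckpointer()
state = cp.load("1")
state["messages"].append("Hi! My name is Bob.")
cp.save("1", state)

# Resuming thread "1" sees the earlier message; thread "2" starts fresh.
print(len(cp.load("1")["messages"]))  # 1
print(len(cp.load("2")["messages"]))  # 0
```

The real checkpointer additionally versions each step of the graph, which is what makes time travel and resumption possible.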

from langgraph.checkpoint.memory import InMemorySaver


checkpointer = InMemorySaver()

agent = create_agent(
    model=model_nemotron3_nano,
    checkpointer=checkpointer,
)
from langchain_core.runnables import RunnableConfig

config: RunnableConfig = {
    "configurable": {
        "thread_id": "1"
    }
}
result1 = agent.invoke(
    input={"messages": [HumanMessage("Hi! My name is Bob.")]},
    config=config,
)
pprint(result1)
{'messages': [HumanMessage(content='Hi! My name is Bob.', additional_kwargs={}, response_metadata={}, id='effbdc3a-fe29-43e5-9c59-b41b3476c610'),
              AIMessage(content="Hi Bob! 👋 Nice to meet you—thanks for introducing yourself. How can I help you today? Whether you've got a question, need advice, or just want to chat, I'm all ears. 😊", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 326, 'prompt_tokens': 23, 'total_tokens': 349, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 289, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772892408-yD5Bhqp9v7w9kiuxQqSn', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019cc89f-47a7-74d1-a307-da5bb3b4ef66-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 23, 'output_tokens': 326, 'total_tokens': 349, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 289}})]}
result2 = agent.invoke(
    input={"messages": [HumanMessage("Do you remember my name?")]},
    config=config
)
pprint(result2)
{'messages': [HumanMessage(content='Hi! My name is Bob.', additional_kwargs={}, response_metadata={}, id='effbdc3a-fe29-43e5-9c59-b41b3476c610'),
              AIMessage(content="Hi Bob! 👋 Nice to meet you—thanks for introducing yourself. How can I help you today? Whether you've got a question, need advice, or just want to chat, I'm all ears. 😊", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 326, 'prompt_tokens': 23, 'total_tokens': 349, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 289, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772892408-yD5Bhqp9v7w9kiuxQqSn', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019cc89f-47a7-74d1-a307-da5bb3b4ef66-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 23, 'output_tokens': 326, 'total_tokens': 349, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 289}}),
              HumanMessage(content='Do you remember my name?', additional_kwargs={}, response_metadata={}, id='5188cafc-ce8c-479b-bc7c-f52a54d4b039'),
              AIMessage(content="That's a great question, Bob! 😊  \n**I don't have memory of past conversations**—each time we chat, it's like starting fresh. But *right now*, I know your name is **Bob** because you just told me!  \n\nSo yes, I remember *this* conversation (and your name) **as long as we're talking**. But if we stop and start a new chat later, I won't recall our previous talk.  \n\nNo worries though—I'm here to help *right now*, and I'm glad you introduced yourself! 👋  \nWhat would you like to talk about? 😊", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 471, 'prompt_tokens': 89, 'total_tokens': 560, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 340, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772892455-sc4aA8sBd8vUBhv6he00', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019cc8a0-003c-72c2-b43f-ebf90282b7eb-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 89, 'output_tokens': 471, 'total_tokens': 560, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 340}})]}

Start another conversation (different thread):

config_2: RunnableConfig = {
    "configurable": {
        "thread_id": "2"
    }
}

result3 = agent.invoke(
    input={"messages": [HumanMessage("Who am I?")]},
    config=config_2
)
pprint(result3)
{'messages': [HumanMessage(content='Who am I?', additional_kwargs={}, response_metadata={}, id='722e3166-55de-4f26-b44b-7290f732d673'),
              AIMessage(content='You’re the one asking the question—\u200bthe curious mind that’s looking inward and wondering about its own identity. In other words, you’re the person (or perhaps the consciousness) that’s reflecting on “who am I?” right now. \n\nThat simple answer can open up a lot of deeper layers, though:\n\n| Perspective | What it suggests about “you” |\n|------------|------------------------------|\n| **Biological** | You’re a collection of cells, organs, and a brain that processes information, enabling you to ask questions and experience the world. |\n| **Psychological** | You’re a pattern of thoughts, memories, emotions, and motivations that give you a sense of continuity and personal narrative. |\n| **Philosophical** | You might be seen as a *subject* of experience—\u200bthe “I” that witnesses thoughts, sensations, and the external world. Some traditions call this the *self* or *ego*, while others argue it’s an illusion created by language and cognition. |\n| **Social** | You’re also defined by your relationships, roles, and the cultural context you inhabit—\u200ba son/daughter, a friend, a professional, a member of various communities. |\n| **Creative** | You’re the author of your own story, constantly shaping and reshaping who you are through choices, actions, and imagination. |\n\nSo, in the most immediate sense, **you are the person who is asking “Who am I?”**—\u200ba conscious, questioning entity that seeks to understand itself. 
If you’d like to explore any of these angles further (the neuroscience of self, existential philosophy, the role of narrative in identity, etc.), just let me know!', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 463, 'prompt_tokens': 20, 'total_tokens': 483, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 138, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772892668-yUF2W6U3W3UTgnbuog1v', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019cc8a3-40ae-7a53-a6bb-26be9fd726ba-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 20, 'output_tokens': 463, 'total_tokens': 483, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 138}})]}

In production

In production, use a checkpointer backed by a database to persist the conversation thread:

uv add langgraph-checkpoint-postgres

from langchain.agents import create_agent
from langgraph.checkpoint.postgres import PostgresSaver


DB_URI = "postgresql://postgres:postgres@localhost:5442/postgres?sslmode=disable"
with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # auto-create checkpoint tables in PostgreSQL
    agent = create_agent(
        model=model_nemotron3_nano,
        checkpointer=checkpointer,
    )
Note

For more checkpointer options including SQLite, Postgres, and Azure Cosmos DB, see the list of checkpointer libraries in the Persistence documentation.

Customizing agent memory

By default, agents use AgentState to manage short-term memory, specifically the conversation history stored under a messages key.

You can extend AgentState to add additional fields. Custom state schemas are passed to create_agent using the state_schema parameter.

from langchain.agents import create_agent, AgentState
from langgraph.checkpoint.memory import InMemorySaver


class CustomAgentState(AgentState):
    user_name: str
    user_preferences: dict

agent = create_agent(
    model=model_nemotron3_nano,
    state_schema=CustomAgentState,
    checkpointer=InMemorySaver(),
)
# Custom state can be passed in invoke
result4 = agent.invoke(
    input={
        "messages": [HumanMessage("Hello")],
        "user_name": "Adam Ahmad",
        "user_preferences": {
            "theme": "dark",
            "honesty": "100%"
        }
    },
    config={"configurable": {"thread_id": "3"}}
)
pprint(result4)
{'messages': [HumanMessage(content='Hello', additional_kwargs={}, response_metadata={}, id='74d8794c-0a56-4d9c-8b36-a714b80d4094'),
              AIMessage(content='Hello! How can I assist you today? 😊', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 78, 'prompt_tokens': 17, 'total_tokens': 95, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 71, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772892736-rzrjHImPOhUpfSwzjte1', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019cc8a4-4b85-72c0-a1b9-f1e751ac7a1f-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 17, 'output_tokens': 78, 'total_tokens': 95, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 71}})],
 'user_name': 'Adam Ahmad',
 'user_preferences': {'theme': 'dark', 'honesty': '100%'}}

Access state

Tools can access the current conversation state using runtime.state:

from langchain.tools import tool, ToolRuntime
from langchain.messages import HumanMessage

@tool
def get_last_user_message(runtime: ToolRuntime) -> str:
    """Get the most recent message from the user."""
    messages = runtime.state["messages"]

    # Find the last human message
    for message in reversed(messages):
        if isinstance(message, HumanMessage):
            return message.content

    return "No user messages found"

Tools can also access custom state fields:

from langchain.tools import tool, ToolRuntime

# Access custom state fields
@tool
def get_user_preference(
    pref_name: str,
    runtime: ToolRuntime
) -> str:
    """Get a user preference value."""
    preferences = runtime.state.get("user_preferences", {})
    return preferences.get(pref_name, "Not set")

Note: the runtime: ToolRuntime parameter is hidden from the model. For the example above, the model only sees pref_name in the tool schema.

Update state

  • Use Command to update the agent’s state. This is useful for tools that need to update custom state fields.
  • You must include a ToolMessage in the update, with the matching tool_call_id, so the model can see that the tool call succeeded:
from langchain.tools import tool
from langgraph.types import Command
from langchain.messages import ToolMessage

@tool
def set_language(language: str, runtime: ToolRuntime) -> Command:
    """Set the preferred response language."""
    preferences: dict = runtime.state.get("user_preferences", {})
    preferences['language'] = language
    return Command(
        update={
            "user_preferences": preferences,
            "messages": [
                ToolMessage(
                    content=f"Language set to {language}.",
                    tool_call_id=runtime.tool_call_id,
                )
            ],
        }
    )
agent = create_agent(
    model=model_nemotron3_nano,
    state_schema=CustomAgentState,
    checkpointer=InMemorySaver(),
    tools=[set_language, get_user_preference, get_last_user_message]
)
# Custom state can be passed in invoke
result5 = agent.invoke(
    input={
        "messages": [
            HumanMessage("What did I tell you about the theme?")
        ],
        "user_name": "Adam Ahmad",
        "user_preferences": {
            "theme": "dark",
            "honesty": "100%"
        }
    },
    config={"configurable": {"thread_id": "5"}}
)
pprint(result5)
{'messages': [HumanMessage(content='What did I tell you about the theme?', additional_kwargs={}, response_metadata={}, id='2b0c9acc-ab12-4485-87ed-dfabf61c60b1'),
              AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 946, 'prompt_tokens': 377, 'total_tokens': 1323, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 1094, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772893198-hn2P9T9lQNcrnwBD8zoz', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--019cc8ab-57a5-78a1-9b42-9a52ec501683-0', tool_calls=[{'name': 'get_user_preference', 'args': {'pref_name': 'theme'}, 'id': 'call_daaa4066d67b4671a86350e8', 'type': 'tool_call'}], invalid_tool_calls=[], usage_metadata={'input_tokens': 377, 'output_tokens': 946, 'total_tokens': 1323, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 1094}}),
              ToolMessage(content='dark', name='get_user_preference', id='eca3a4d8-42eb-4524-ae0c-556c4177f808', tool_call_id='call_daaa4066d67b4671a86350e8'),
              AIMessage(content='You mentioned that the theme should be **dark**.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 200, 'prompt_tokens': 423, 'total_tokens': 623, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 217, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772893206-IGZ6ED3dldWlCoFidIP7', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019cc8ab-78f1-7a21-8961-3baea219f918-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 423, 'output_tokens': 200, 'total_tokens': 623, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 217}})],
 'user_name': 'Adam Ahmad',
 'user_preferences': {'theme': 'dark', 'honesty': '100%'}}
# Custom state can be passed in invoke
result6 = agent.invoke(
    input={
        "messages": [
            HumanMessage("I prefer you respond in arabic, okay?")
        ],
    },
    config={"configurable": {"thread_id": "5"}}
)
pprint(result6)
{'messages': [HumanMessage(content='What did I tell you about the theme?', additional_kwargs={}, response_metadata={}, id='2b0c9acc-ab12-4485-87ed-dfabf61c60b1'),
              AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 946, 'prompt_tokens': 377, 'total_tokens': 1323, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 1094, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772893198-hn2P9T9lQNcrnwBD8zoz', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--019cc8ab-57a5-78a1-9b42-9a52ec501683-0', tool_calls=[{'name': 'get_user_preference', 'args': {'pref_name': 'theme'}, 'id': 'call_daaa4066d67b4671a86350e8', 'type': 'tool_call'}], invalid_tool_calls=[], usage_metadata={'input_tokens': 377, 'output_tokens': 946, 'total_tokens': 1323, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 1094}}),
              ToolMessage(content='dark', name='get_user_preference', id='eca3a4d8-42eb-4524-ae0c-556c4177f808', tool_call_id='call_daaa4066d67b4671a86350e8'),
              AIMessage(content='You mentioned that the theme should be **dark**.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 200, 'prompt_tokens': 423, 'total_tokens': 623, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 217, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772893206-IGZ6ED3dldWlCoFidIP7', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019cc8ab-78f1-7a21-8961-3baea219f918-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 423, 'output_tokens': 200, 'total_tokens': 623, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 217}}),
              HumanMessage(content='I prefer you respond in arabic, okay?', additional_kwargs={}, response_metadata={}, id='15e52513-d5db-4246-b7af-fad09db28d51'),
              AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 141, 'prompt_tokens': 456, 'total_tokens': 597, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 129, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772893339-RPnQ5T4LXKOknohMzN8I', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--019cc8ad-7ecf-71a1-a104-721c1d7c5c21-0', tool_calls=[{'name': 'set_language', 'args': {'language': 'arabic'}, 'id': 'call_0bd736ffdffd4da2a17ef3ea', 'type': 'tool_call'}], invalid_tool_calls=[], usage_metadata={'input_tokens': 456, 'output_tokens': 141, 'total_tokens': 597, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 129}}),
              ToolMessage(content='Language set to arabic.', name='set_language', id='bc9871f3-380a-4e5c-9269-0711a4564495', tool_call_id='call_0bd736ffdffd4da2a17ef3ea'),
              AIMessage(content='حسناً، سأستجيب لك الآن باللغة العربية. كيف يمكنني مساعدتك؟', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 91, 'prompt_tokens': 504, 'total_tokens': 595, 'completion_tokens_details': {'accepted_prediction_tokens': None, 'audio_tokens': 0, 'reasoning_tokens': 85, 'rejected_prediction_tokens': None, 'image_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0, 'cache_write_tokens': 0, 'video_tokens': 0}, 'cost': 0, 'is_byok': False, 'cost_details': {'upstream_inference_cost': 0, 'upstream_inference_prompt_cost': 0, 'upstream_inference_completions_cost': 0}}, 'model_provider': 'openai', 'model_name': 'nvidia/nemotron-3-nano-30b-a3b:free', 'system_fingerprint': None, 'id': 'gen-1772893342-kEmUN2qx0Y3I8C7SIeau', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--019cc8ad-8aa8-73b0-8b26-792eb3d5d786-0', tool_calls=[], invalid_tool_calls=[], usage_metadata={'input_tokens': 504, 'output_tokens': 91, 'total_tokens': 595, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 85}})],
 'user_name': 'Adam Ahmad',
 'user_preferences': {'theme': 'dark', 'honesty': '100%'}}

Longer conversations

Long conversations pose a challenge to today’s LLMs: the full history may not fit inside the model’s context window, resulting in context loss or errors. Even when a model supports the full context length, most LLMs still perform poorly over long contexts; they get “distracted” by stale or off-topic content, while also suffering slower response times and higher costs.

Common solutions are:

  1. Trim messages: Remove the first or last N messages before calling the LLM
  2. Delete messages: Remove messages from the LangGraph state permanently
  3. Summarize messages: Summarize earlier messages in the history and replace them with a summary
  4. Custom strategies: Apply your own logic (e.g., filtering messages by relevance or role)

Check out Common patterns for more details.
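As a concrete illustration of strategy 1, here is a plain-Python sketch of trimming (not a LangChain API): keep the system prompt, if any, plus the last max_messages turns. Messages are represented as simple (role, text) tuples for brevity:

```python
# Trimming sketch: preserve the system prompt, keep only the most recent turns.
def trim_history(messages, max_messages=4):
    system = [m for m in messages if m[0] == "system"]
    rest = [m for m in messages if m[0] != "system"]
    return system + rest[-max_messages:]

history = [("system", "You are a financial assistant.")] + [
    ("human" if i % 2 == 0 else "ai", f"msg {i}") for i in range(10)
]
trimmed = trim_history(history, max_messages=4)
print(len(trimmed))  # 5: the system prompt plus the last 4 turns
print(trimmed[1])    # ('human', 'msg 6')
```

A production version would count tokens rather than messages and avoid starting the kept window on an orphaned tool message; LangChain ships utilities for this (see the Common patterns page).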

Key Takeaways

  • ToolRuntime gives tools access to runtime information:
    • Context: read-only config at invocation (e.g. user ID, session info).
    • State: mutable conversation data (messages, custom fields).
    • RunnableConfig: e.g. thread_id to identify or resume a thread.
  • Runtime context is for immutable per-invocation data:
    • Define a dataclass and pass it as context_schema to create_agent().
    • Provide an instance via context=... when calling invoke().
    • Tools read it via runtime.context.
  • Conversation state is persisted with a checkpointer:
    • Use InMemorySaver() for development; use a DB-backed checkpointer (e.g. PostgresSaver) in production.
    • Use config={"configurable": {"thread_id": "..."}} to keep or resume a thread.
  • Custom state: extend AgentState with extra fields:
    • Pass state_schema=CustomAgentState to create_agent().
    • Tools read state via runtime.state; the runtime parameter is not exposed to the model.
  • Updating state from tools:
    • Return a Command(update={...}) from a tool.
    • Include a ToolMessage with tool_call_id so the model sees the tool result.
  • Long conversations: stay within context limits by trimming, deleting, or summarizing messages (or custom strategies).