langchain_ollama.chat_models.ChatOllama

Note

ChatOllama implements the standard Runnable Interface. 🏃

The Runnable Interface has additional methods that are available on runnables, such as with_types, with_retry, assign, bind, get_graph, and more.

class langchain_ollama.chat_models.ChatOllama

Bases: BaseChatModel

Ollama chat model integration.

Setup

Install langchain-ollama and download any models you want to use from Ollama.

ollama pull llama3
pip install -U langchain-ollama

Key init args (completion params)

model: str: Name of the Ollama model to use.
temperature: float: Sampling temperature. Ranges from 0.0 to 1.0.
num_predict: Optional[int]: Max number of tokens to generate.

See the full list of supported init args and their descriptions in the params section.

Instantiate

from langchain_ollama import ChatOllama

llm = ChatOllama(
    model = "llama3",
    temperature = 0.8,
    num_predict = 256,
    # other params ...
)

Invoke

messages = [
    ("system", "You are a helpful translator. Translate the user sentence to French."),
    ("human", "I love programming."),
]
llm.invoke(messages)

AIMessage(content='J'adore le programmation. (Note: "programming" can also refer to the act of writing code, so if you meant that, I could translate it as "J'adore programmer". But since you didn't specify, I assumed you were talking about the activity itself, which is what "le programmation" usually refers to.)', response_metadata={'model': 'llama3', 'created_at': '2024-07-04T03:37:50.182604Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 3576619666, 'load_duration': 788524916, 'prompt_eval_count': 32, 'prompt_eval_duration': 128125000, 'eval_count': 71, 'eval_duration': 2656556000}, id='run-ba48f958-6402-41a5-b461-5e250a4ebd36-0')
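
The response_metadata above carries Ollama's timing counters. As a rough sketch, and assuming the duration fields are reported in nanoseconds (as Ollama's API documentation describes), generation throughput can be derived from eval_count and eval_duration. The helper name tokens_per_second is hypothetical.

```python
# Hypothetical helper: derive generation speed from Ollama's response metadata.
# Assumes the *_duration fields are reported in nanoseconds.
NANOSECONDS_PER_SECOND = 1_000_000_000


def tokens_per_second(response_metadata: dict) -> float:
    """Return generated tokens per second, or 0.0 if timing data is missing."""
    eval_count = response_metadata.get("eval_count", 0)
    eval_duration_ns = response_metadata.get("eval_duration", 0)
    if not eval_duration_ns:
        return 0.0
    return eval_count / (eval_duration_ns / NANOSECONDS_PER_SECOND)


# Values copied from the AIMessage shown above.
metadata = {"eval_count": 71, "eval_duration": 2656556000}
rate = tokens_per_second(metadata)
print(f"{rate:.1f} tokens/s")
```

The same arithmetic applies to prompt_eval_count and prompt_eval_duration if prompt processing speed is of interest.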

Stream

messages = [
    ("human", "Return the words Hello World!"),
]
for chunk in llm.stream(messages):
    print(chunk)

content='Hello' id='run-327ff5ad-45c8-49fe-965c-0a93982e9be1'
content=' World' id='run-327ff5ad-45c8-49fe-965c-0a93982e9be1'
content='!' id='run-327ff5ad-45c8-49fe-965c-0a93982e9be1'
content='' response_metadata={'model': 'llama3', 'created_at': '2024-07-04T03:39:42.274449Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 411875125, 'load_duration': 1898166, 'prompt_eval_count': 14, 'prompt_eval_duration': 297320000, 'eval_count': 4, 'eval_duration': 111099000} id='run-327ff5ad-45c8-49fe-965c-0a93982e9be1'

stream = llm.stream(messages)
full = next(stream)
for chunk in stream:
    full += chunk
full

AIMessageChunk(content='Je adore le programmation.(Note: "programmation" is the formal way to say "programming" in French, but informally, people might use the phrase "le développement logiciel" or simply "le code")', response_metadata={'model': 'llama3', 'created_at': '2024-07-04T03:38:54.933154Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 1977300042, 'load_duration': 1345709, 'prompt_eval_duration': 159343000, 'eval_count': 47, 'eval_duration': 1815123000}, id='run-3c81a3ed-3e79-4dd3-a796-04064d804890')

Async

messages = [
    ("human", "Hello how are you!"),
]
await llm.ainvoke(messages)

AIMessage(content="Hi there! I'm just an AI, so I don't have feelings or emotions like humans do. But I'm functioning properly and ready to help with any questions or tasks you may have! How can I assist you today?", response_metadata={'model': 'llama3', 'created_at': '2024-07-04T03:52:08.165478Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 2138492875, 'load_duration': 1364000, 'prompt_eval_count': 10, 'prompt_eval_duration': 297081000, 'eval_count': 47, 'eval_duration': 1838524000}, id='run-29c510ae-49a4-4cdd-8f23-b972bfab1c49-0')

messages = [
    ("human", "Say hello world!"),
]
async for chunk in llm.astream(messages):
    print(chunk.content)

HEL
LO
WORLD
!

# abatch takes a list of inputs; each input is its own list of messages
inputs = [
    [("human", "Say hello world!")],
    [("human", "Say goodbye world!")],
]
await llm.abatch(inputs)

[AIMessage(content='HELLO, WORLD!', response_metadata={'model': 'llama3', 'created_at': '2024-07-04T03:55:07.315396Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 1696745458, 'load_duration': 1505000, 'prompt_eval_count': 8, 'prompt_eval_duration': 111627000, 'eval_count': 6, 'eval_duration': 185181000}, id='run-da6c7562-e25a-4a44-987a-2c83cd8c2686-0'),
AIMessage(content="It's been a blast chatting with you! Say goodbye to the world for me, and don't forget to come back and visit us again soon!", response_metadata={'model': 'llama3', 'created_at': '2024-07-04T03:55:07.018076Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 1399391083, 'load_duration': 1187417, 'prompt_eval_count': 20, 'prompt_eval_duration': 230349000, 'eval_count': 31, 'eval_duration': 1166047000}, id='run-96cad530-6f3e-4cf9-86b4-e0f8abba4cdb-0')]

JSON mode

json_llm = ChatOllama(model="llama3", format="json")
messages = [
    ("human", "Return a query for the weather in a random location and time of day with two keys: location and time_of_day. Respond using JSON only."),
]
json_llm.invoke(messages).content

'{"location": "Pune, India", "time_of_day": "morning"}'
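
In JSON mode the content is still a string, so it has to be parsed before use. Below is a minimal sketch using only the standard library. The required-key check mirrors the two keys requested in the prompt, and parse_weather_query is a hypothetical name.

```python
import json

REQUIRED_KEYS = {"location", "time_of_day"}


def parse_weather_query(content: str) -> dict:
    """Parse the model's JSON string and check that the requested keys exist."""
    data = json.loads(content)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


# String copied from the output shown above.
query = parse_weather_query('{"location": "Pune, India", "time_of_day": "morning"}')
print(query["location"], query["time_of_day"])
```

JSON mode constrains the syntax of the reply, not its keys, so a check like this is still worth doing.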

Tool Calling

Warning

Ollama currently does not support streaming for tools.

from langchain_ollama import ChatOllama
from langchain_core.pydantic_v1 import BaseModel, Field

class Multiply(BaseModel):
    """Multiply two integers."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")

# Assumes a tool-capable model has been pulled, e.g. `ollama pull llama3.1`
chat = ChatOllama(model="llama3.1", temperature=0).bind_tools([Multiply])

ans = chat.invoke("What is 45*67")
ans.tool_calls

[{'name': 'Multiply',
'args': {'a': 45, 'b': 67},
'id': '420c3f3b-df10-4188-945f-eb3abdb40622',
'type': 'tool_call'}]
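
Each entry in tool_calls is a plain dict, so dispatching it to real code needs no extra machinery. The sketch below shows one way to do that. The multiply function and the TOOL_REGISTRY mapping are assumptions made for illustration and are not part of langchain-ollama.

```python
from typing import Any, Callable, Dict, List


def multiply(a: int, b: int) -> int:
    """Hypothetical implementation backing the Multiply schema."""
    return a * b


# Maps the tool name reported by the model to a local callable (assumed layout).
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {"Multiply": multiply}


def run_tool_calls(tool_calls: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Execute each tool call and return results keyed by tool-call id."""
    results = {}
    for call in tool_calls:
        func = TOOL_REGISTRY[call["name"]]  # KeyError if the model names an unknown tool
        results[call["id"]] = func(**call["args"])
    return results


# Tool call copied from the output shown above.
tool_calls = [
    {
        "name": "Multiply",
        "args": {"a": 45, "b": 67},
        "id": "420c3f3b-df10-4188-945f-eb3abdb40622",
        "type": "tool_call",
    }
]
results = run_tool_calls(tool_calls)
print(results)
```

Keying results by id keeps each result tied to the call that produced it, which matters when a reply contains several tool calls.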

param base_url: Optional[str] = None

Base url the model is hosted under.

param cache: Union[BaseCache, bool, None] = None

Whether to cache the response.

If True, will use the global cache.
If False, will not use a cache.
If None, will use the global cache if it's set, otherwise no cache.
If instance of BaseCache, will use the provided cache.

Caching is not currently supported for streaming methods of models.

param callback_manager: Optional[BaseCallbackManager] = None

[DEPRECATED] Callback manager to add to the run trace.

param callbacks: Callbacks = None

Callbacks to add to the run trace.

param client_kwargs: Optional[dict] = {}

Additional kwargs to pass to the httpx Client. For a full list of the params, see [this link](https://pydoc.dev/httpx/latest/httpx.Client.html)

param custom_get_token_ids: Optional[Callable[[str], List[int]]] = None

Optional encoder to use for counting tokens.

param format: Literal['', 'json'] = ''

Specify the format of the output (options: json).

param keep_alive: Optional[Union[int, str]] = None

How long the model will stay loaded into memory.

param metadata: Optional[Dict[str, Any]] = None

Metadata to add to the run trace.

param mirostat: Optional[int] = None

Enable Mirostat sampling for controlling perplexity. (Default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)

param mirostat_eta: Optional[float] = None

Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1)

param mirostat_tau: Optional[float] = None

Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)

param model: str [Required]

Model name to use.

param num_ctx: Optional[int] = None

Sets the size of the context window used to generate the next token. (Default: 2048)

param num_gpu: Optional[int] = None

The number of GPUs to use. On macOS it defaults to 1 to enable metal support, 0 to disable.

param num_predict: Optional[int] = None

Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context)

param num_thread: Optional[int] = None

Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores).

param rate_limiter: Optional[BaseRateLimiter] = None

An optional rate limiter to use for limiting the number of requests.

param repeat_last_n: Optional[int] = None

Sets how far back the model looks to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)

param repeat_penalty: Optional[float] = None

Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)

param seed: Optional[int] = None

Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt.

param stop: Optional[List[str]] = None

Sets the stop tokens to use.

param tags: Optional[List[str]] = None

Tags to add to the run trace.

param temperature: Optional[float] = None

The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)

param tfs_z: Optional[float] = None

Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (Default: 1)

param top_k: Optional[int] = None

Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)

param top_p: Optional[float] = None

Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
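
To make the interaction between temperature, top_k and top_p concrete, here is a toy sketch of the usual filtering order: scale the logits by temperature, keep the top_k most likely tokens, then keep the smallest prefix whose cumulative probability reaches top_p. This is the textbook procedure written in plain Python, not Ollama's actual sampler, and the function name and example logits are made up.

```python
import math
from typing import Dict


def filter_candidates(
    logits: Dict[str, float], temperature: float, top_k: int, top_p: float
) -> Dict[str, float]:
    """Return the renormalized token distribution left after top-k and top-p filtering."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    probs = {tok: value / total for tok, value in exps.items()}

    # top_k: keep only the k most probable tokens.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]

    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(prob for _, prob in kept)
    return {tok: prob / kept_total for tok, prob in kept}


logits = {"cat": 3.0, "dog": 2.0, "fish": 1.0, "rock": -1.0}
focused = filter_candidates(logits, temperature=0.8, top_k=40, top_p=0.5)
diverse = filter_candidates(logits, temperature=0.8, top_k=40, top_p=0.95)
print(sorted(focused), sorted(diverse))
```

With these example logits, the lower top_p leaves a single candidate while the higher one keeps three, which is the "focused" versus "diverse" contrast the parameter descriptions refer to.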

param verbose: bool [Optional]

Whether to print out response text.

__call__(messages: List[BaseMessage], stop: Optional[List[str]] = None, callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None, **kwargs: Any) → BaseMessage

Deprecated since version langchain-core==0.1.7: Use invoke instead.

Parameters

messages (List[BaseMessage]) –
stop (Optional[List[str]]) –
callbacks (Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]) –
kwargs (Any) –

Return type

BaseMessage

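async abatch(inputs: List[Input], config: Optional[Union[RunnableConfig, List[RunnableConfig]]] = None, *, return_exceptions: bool = False, **kwargs: Optional[Any]) → List[Output]

Default implementation runs ainvoke in parallel using asyncio.gather.

The default implementation of batch works well for IO-bound runnables.

Subclasses should override this method if they can batch more efficiently; e.g., if the underlying Runnable uses an API which supports a batch mode.

Parameters

inputs (List[Input]) – A list of inputs to the Runnable.
config (Optional[Union[RunnableConfig, List[RunnableConfig]]]) – A config to use when invoking the Runnable. The config supports standard keys like 'tags' and 'metadata' for tracing purposes, 'max_concurrency' for controlling how much work to do in parallel, and other keys. Please refer to the RunnableConfig for more details. Defaults to None.
return_exceptions (bool) – Whether to return exceptions instead of raising them. Defaults to False.
kwargs (Optional[Any]) – Additional keyword arguments to pass to the Runnable.

Returns

A list of outputs from the Runnable.

Return type

List[Output]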
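
The behaviour described for abatch (parallel ainvoke calls collected with asyncio.gather, optionally capped by max_concurrency and tolerant of failures via return_exceptions) can be imitated with the standard library alone. The sketch below is not LangChain's code. fake_ainvoke stands in for a model call and all names are illustrative.

```python
import asyncio
from typing import Any, List, Optional


async def fake_ainvoke(prompt: str) -> str:
    """Stand-in for an async model call; fails on empty input."""
    await asyncio.sleep(0.01)
    if not prompt:
        raise ValueError("empty prompt")
    return prompt.upper()


async def abatch_sketch(
    inputs: List[str],
    max_concurrency: Optional[int] = None,
    return_exceptions: bool = False,
) -> List[Any]:
    """Run fake_ainvoke over all inputs concurrently, preserving input order."""
    semaphore = asyncio.Semaphore(max_concurrency) if max_concurrency else None

    async def guarded(prompt: str) -> str:
        if semaphore is None:
            return await fake_ainvoke(prompt)
        async with semaphore:  # at most max_concurrency calls in flight
            return await fake_ainvoke(prompt)

    return await asyncio.gather(
        *(guarded(p) for p in inputs), return_exceptions=return_exceptions
    )


outputs = asyncio.run(
    abatch_sketch(["hello", "", "world"], max_concurrency=2, return_exceptions=True)
)
print(outputs)
```

With return_exceptions=True the failed input shows up as an exception object in its original position instead of aborting the whole batch.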

async abatch_as_completed(inputs: Sequence[Input], config: Optional[Union[RunnableConfig, Sequence[RunnableConfig]]] = None, *, return_exceptions: bool = False, **kwargs: Optional[Any]) → AsyncIterator[Tuple[int, Union[Output, Exception]]]

Run ainvoke in parallel on a list of inputs, yielding results as they complete.

Parameters

inputs (Sequence[Input]) – A list of inputs to the Runnable.
config (Optional[Union[RunnableConfig, Sequence[RunnableConfig]]]) – A config to use when invoking the Runnable. The config supports standard keys like 'tags' and 'metadata' for tracing purposes, 'max_concurrency' for controlling how much work to do in parallel, and other keys. Please refer to the RunnableConfig for more details. Defaults to None.
return_exceptions (bool) – Whether to return exceptions instead of raising them. Defaults to False.
kwargs (Optional[Any]) – Additional keyword arguments to pass to the Runnable.

Yields

A tuple of the index of the input and the output from the Runnable.

Return type

AsyncIterator[Tuple[int, Union[Output, Exception]]]

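async agenerate(messages: List[List[BaseMessage]], stop: Optional[List[str]] = None, callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None, *, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, run_name: Optional[str] = None, run_id: Optional[UUID] = None, **kwargs: Any) → LLMResult

Asynchronously pass a sequence of prompts to a model and return generations.

This method should make use of batched calls for models that expose a batched API.

Use this method when you:

want to take advantage of batched calls,
need more output from the model than just the top generated value,
are building chains that are agnostic to the underlying language model type (e.g., pure text completion models vs. chat models).

Parameters

messages (List[List[BaseMessage]]) – List of lists of messages.
stop (Optional[List[str]]) – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
callbacks (Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]) – Callbacks to pass through. Used for executing additional functionality, such as logging or streaming, throughout generation.
**kwargs (Any) – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
tags (Optional[List[str]]) –
metadata (Optional[Dict[str, Any]]) –
run_name (Optional[str]) –
run_id (Optional[UUID]) –

Returns

An LLMResult, which contains a list of candidate Generations for each input prompt and additional model provider-specific output.

Return type

LLMResult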
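
The stop semantics described in the parameter list (cut the output at the first occurrence of any stop substring) amount to a few lines of string handling. Below is a minimal sketch with a hypothetical helper name. Providers usually apply this on their side, so it is shown only to pin down the behaviour.

```python
from typing import List, Optional


def truncate_at_stop(text: str, stop: Optional[List[str]] = None) -> str:
    """Cut text at the earliest occurrence of any stop substring."""
    if not stop:
        return text
    # Positions of every stop substring that actually occurs in the text.
    positions = [idx for idx in (text.find(s) for s in stop) if idx != -1]
    return text[: min(positions)] if positions else text


sample = "Answer: 42\nObservation: done\nHuman: next?"
truncated = truncate_at_stop(sample, stop=["Human:", "\nObservation"])
print(repr(truncated))
```

Note that the earliest match wins regardless of the order in which the stop strings are listed.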

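async agenerate_prompt(prompts: List[PromptValue], stop: Optional[List[str]] = None, callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None, **kwargs: Any) → LLMResult

Asynchronously pass a sequence of prompts and return model generations.

This method should make use of batched calls for models that expose a batched API.

Use this method when you:

want to take advantage of batched calls,
need more output from the model than just the top generated value,
are building chains that are agnostic to the underlying language model type (e.g., pure text completion models vs. chat models).

Parameters

prompts (List[PromptValue]) – List of PromptValues. A PromptValue is an object that can be converted to match the format of any language model (string for pure text generation models and BaseMessages for chat models).
stop (Optional[List[str]]) – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
callbacks (Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]) – Callbacks to pass through. Used for executing additional functionality, such as logging or streaming, throughout generation.
**kwargs (Any) – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.

Returns

An LLMResult, which contains a list of candidate Generations for each input prompt and additional model provider-specific output.

Return type

LLMResult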

async ainvoke(input: LanguageModelInput, config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) → BaseMessage

Default implementation of ainvoke, which calls invoke from a thread.

The default implementation allows usage of async code even if the Runnable did not implement a native async version of invoke.

Subclasses should override this method if they can run asynchronously.

Parameters

input (LanguageModelInput) –
config (Optional[RunnableConfig]) –
stop (Optional[List[str]]) –
kwargs (Any) –

Return type

BaseMessage

async apredict(text: str, *, stop: Optional[Sequence[str]] = None, **kwargs: Any) → str

Deprecated since version langchain-core==0.1.7: Use ainvoke instead.

Parameters

text (str) –
stop (Optional[Sequence[str]]) –
kwargs (Any) –

Return type

str

async apredict_messages(messages: List[BaseMessage], *, stop: Optional[Sequence[str]] = None, **kwargs: Any) → BaseMessage

Deprecated since version langchain-core==0.1.7: Use ainvoke instead.

Parameters

messages (List[BaseMessage]) –
stop (Optional[Sequence[str]]) –
kwargs (Any) –

Return type

BaseMessage

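as_tool(args_schema: Optional[Type[BaseModel]] = None, *, name: Optional[str] = None, description: Optional[str] = None, arg_types: Optional[Dict[str, Type]] = None) → BaseTool

Beta

This API is in beta and may change in the future.

Create a BaseTool from a Runnable.

as_tool will instantiate a BaseTool with a name, description, and args_schema from a Runnable. Where possible, schemas are inferred from runnable.get_input_schema. Alternatively (e.g., if the Runnable takes a dict as input and the specific dict keys are not typed), the schema can be specified directly with args_schema. You can also pass arg_types to specify just the required arguments and their types.

Parameters

args_schema (Optional[Type[BaseModel]]) – The schema for the tool. Defaults to None.
name (Optional[str]) – The name of the tool. Defaults to None.
description (Optional[str]) – The description of the tool. Defaults to None.
arg_types (Optional[Dict[str, Type]]) – A dictionary of argument names to types. Defaults to None.

Returns

A BaseTool instance.

Return type

BaseTool

Typed dict input: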

from typing import List
from typing_extensions import TypedDict
from langchain_core.runnables import RunnableLambda

class Args(TypedDict):
    a: int
    b: List[int]

def f(x: Args) -> str:
    return str(x["a"] * max(x["b"]))

runnable = RunnableLambda(f)
as_tool = runnable.as_tool()
as_tool.invoke({"a": 3, "b": [1, 2]})

dict input, specifying schema via args_schema:

from typing import Any, Dict, List
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.runnables import RunnableLambda

def f(x: Dict[str, Any]) -> str:
    return str(x["a"] * max(x["b"]))

class FSchema(BaseModel):
    """Apply a function to an integer and list of integers."""

    a: int = Field(..., description="Integer")
    b: List[int] = Field(..., description="List of ints")

runnable = RunnableLambda(f)
as_tool = runnable.as_tool(FSchema)
as_tool.invoke({"a": 3, "b": [1, 2]})

dict input, specifying schema via arg_types:

from typing import Any, Dict, List
from langchain_core.runnables import RunnableLambda

def f(x: Dict[str, Any]) -> str:
    return str(x["a"] * max(x["b"]))

runnable = RunnableLambda(f)
as_tool = runnable.as_tool(arg_types={"a": int, "b": List[int]})
as_tool.invoke({"a": 3, "b": [1, 2]})

String input:

from langchain_core.runnables import RunnableLambda

def f(x: str) -> str:
    return x + "a"

def g(x: str) -> str:
    return x + "z"

runnable = RunnableLambda(f) | g
as_tool = runnable.as_tool()
as_tool.invoke("b")

New in version 0.2.14.

async astream(input: LanguageModelInput, config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) → AsyncIterator[BaseMessageChunk]

Default implementation of astream, which calls ainvoke. Subclasses should override this method if they support streaming output.

Parameters

input (LanguageModelInput) – The input to the Runnable.
config (Optional[RunnableConfig]) – The config to use for the Runnable. Defaults to None.
kwargs (Any) – Additional keyword arguments to pass to the Runnable.
stop (Optional[List[str]]) –

Yields

The output of the Runnable.

Return type

AsyncIterator[BaseMessageChunk]

astream_events(input: Any, config: Optional[RunnableConfig] = None, *, version: Literal['v1', 'v2'], include_names: Optional[Sequence[str]] = None, include_types: Optional[Sequence[str]] = None, include_tags: Optional[Sequence[str]] = None, exclude_names: Optional[Sequence[str]] = None, exclude_types: Optional[Sequence[str]] = None, exclude_tags: Optional[Sequence[str]] = None, **kwargs: Any) → AsyncIterator[Union[StandardStreamEvent, CustomStreamEvent]]

Beta

This API is in beta and may change in the future.

Generate a stream of events.

Use to create an iterator over StreamEvents that provide real-time information about the progress of the Runnable, including StreamEvents from intermediate results.

A StreamEvent is a dictionary with the following schema:

event: str - Event names are of the format on_[runnable_type]_(start|stream|end).
name: str - The name of the Runnable that generated the event.
run_id: str - Randomly generated ID associated with the given execution of the Runnable that emitted the event. A child Runnable that gets invoked as part of the execution of a parent Runnable is assigned its own unique ID.
parent_ids: List[str] - The IDs of the parent runnables that generated the event. The root Runnable will have an empty list. The order of the parent IDs is from the root to the immediate parent. Only available for the v2 version of the API. The v1 version of the API will return an empty list.
tags: Optional[List[str]] - The tags of the Runnable that generated the event.
metadata: Optional[Dict[str, Any]] - The metadata of the Runnable that generated the event.
data: Dict[str, Any]
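
Because standard event names follow the on_[runnable_type]_(start|stream|end) pattern, they can be split into a runnable type and a phase with a regular expression. Below is a small sketch. parse_event_name is a hypothetical helper, and names that do not fit the pattern return None.

```python
import re
from typing import Optional, Tuple

# on_[runnable_type]_(start|stream|end), as described in the schema above.
EVENT_NAME = re.compile(r"^on_(?P<runnable_type>[a-z_]+)_(?P<phase>start|stream|end)$")


def parse_event_name(event: str) -> Optional[Tuple[str, str]]:
    """Return (runnable_type, phase), or None if the name does not fit the pattern."""
    match = EVENT_NAME.match(event)
    if match is None:
        return None
    return match.group("runnable_type"), match.group("phase")


parsed = [parse_event_name(e) for e in ("on_chat_model_stream", "on_chain_end", "custom")]
print(parsed)
```

A split like this is handy for routing, for example forwarding only "stream" phases of chat models to a UI.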

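Below is a table that illustrates some events that might be emitted by various chains. Metadata fields have been omitted from the table for brevity. Chain definitions are included after the table.

Note: this reference table is for the v2 version of the schema.

event	name	chunk	input	output
on_chat_model_start	[model name]		{"messages": [[SystemMessage, HumanMessage]]}
on_chat_model_stream	[model name]	AIMessageChunk(content="hello")
on_chat_model_end	[model name]		{"messages": [[SystemMessage, HumanMessage]]}	AIMessageChunk(content="hello world")
on_llm_start	[model name]		{'input': 'hello'}
on_llm_stream	[model name]	'Hello'
on_llm_end	[model name]		'Hello human!'
on_chain_start	format_docs
on_chain_stream	format_docs	"hello world!, goodbye world!"
on_chain_end	format_docs		[Document(...)]	"hello world!, goodbye world!"
on_tool_start	some_tool		{"x": 1, "y": "2"}
on_tool_end	some_tool			{"x": 1, "y": "2"}
on_retriever_start	[retriever name]		{"query": "hello"}
on_retriever_end	[retriever name]		{"query": "hello"}	[Document(...), ..]
on_prompt_start	[template_name]		{"question": "hello"}
on_prompt_end	[template_name]		{"question": "hello"}	ChatPromptValue(messages: [SystemMessage, ...])

In addition to the standard events, users can also dispatch custom events (see example below).

Custom events will only be surfaced in the v2 version of the API!

A custom event has the following format:

Attribute	Type	Description
name	str	A user-defined name for the event.
data	Any	The data associated with the event. This can be anything, though we suggest making it JSON serializable.

Here are declarations associated with the standard events shown above: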

format_docs:

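from typing import List

from langchain_core.documents import Document
from langchain_core.runnables import RunnableLambda

def format_docs(docs: List[Document]) -> str:
    '''Format the docs.'''
    return ", ".join([doc.page_content for doc in docs])

format_docs = RunnableLambda(format_docs)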

some_tool:

from langchain_core.tools import tool

@tool
def some_tool(x: int, y: str) -> dict:
    '''Some_tool.'''
    return {"x": x, "y": y}

prompt:

from langchain_core.prompts import ChatPromptTemplate

template = ChatPromptTemplate.from_messages(
    [("system", "You are Cat Agent 007"), ("human", "{question}")]
).with_config({"run_name": "my_template", "tags": ["my_template"]})

Example:

from langchain_core.runnables import RunnableLambda

async def reverse(s: str) -> str:
    return s[::-1]

chain = RunnableLambda(func=reverse)

events = [
    event async for event in chain.astream_events("hello", version="v2")
]

# will produce the following events (run_id and parent_ids
# have been omitted for brevity):
[
    {
        "data": {"input": "hello"},
        "event": "on_chain_start",
        "metadata": {},
        "name": "reverse",
        "tags": [],
    },
    {
        "data": {"chunk": "olleh"},
        "event": "on_chain_stream",
        "metadata": {},
        "name": "reverse",
        "tags": [],
    },
    {
        "data": {"output": "olleh"},
        "event": "on_chain_end",
        "metadata": {},
        "name": "reverse",
        "tags": [],
    },
]

Example: Dispatch Custom Event

from langchain_core.callbacks.manager import (
    adispatch_custom_event,
)
from langchain_core.runnables import RunnableLambda, RunnableConfig
import asyncio


async def slow_thing(some_input: str, config: RunnableConfig) -> str:
    """Do something that takes a long time."""
    await asyncio.sleep(1) # Placeholder for some slow operation
    await adispatch_custom_event(
        "progress_event",
        {"message": "Finished step 1 of 3"},
        config=config # Must be included for python < 3.10
    )
    await asyncio.sleep(1) # Placeholder for some slow operation
    await adispatch_custom_event(
        "progress_event",
        {"message": "Finished step 2 of 3"},
        config=config # Must be included for python < 3.10
    )
    await asyncio.sleep(1) # Placeholder for some slow operation
    return "Done"

slow_thing = RunnableLambda(slow_thing)

async for event in slow_thing.astream_events("some_input", version="v2"):
    print(event)

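Parameters

input (Any) – The input to the Runnable.
config (Optional[RunnableConfig]) – The config to use for the Runnable.
version (Literal['v1', 'v2']) – The version of the schema to use, either v2 or v1. Users should use v2. v1 is for backwards compatibility and will be deprecated in 0.4.0. No default will be assigned until the API is stabilized. Custom events will only be surfaced in v2.
include_names (Optional[Sequence[str]]) – Only include events from runnables with matching names.
include_types (Optional[Sequence[str]]) – Only include events from runnables with matching types.
include_tags (Optional[Sequence[str]]) – Only include events from runnables with matching tags.
exclude_names (Optional[Sequence[str]]) – Exclude events from runnables with matching names.
exclude_types (Optional[Sequence[str]]) – Exclude events from runnables with matching types.
exclude_tags (Optional[Sequence[str]]) – Exclude events from runnables with matching tags.
kwargs (Any) – Additional keyword arguments to pass to the Runnable. These will be passed to astream_log, as this implementation of astream_events is built on top of astream_log.

Yields

An async stream of StreamEvents.

Raises

NotImplementedError – If the version is not v1 or v2.

Return type

AsyncIterator[Union[StandardStreamEvent, CustomStreamEvent]]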

batch(inputs: List[Input], config: Optional[Union[RunnableConfig, List[RunnableConfig]]] = None, *, return_exceptions: bool = False, **kwargs: Optional[Any]) → List[Output]

Default implementation runs invoke in parallel using a thread pool executor.

The default implementation of batch works well for IO-bound runnables.

Subclasses should override this method if they can batch more efficiently; e.g., if the underlying Runnable uses an API which supports a batch mode.

Parameters

inputs (List[Input]) –
config (Optional[Union[RunnableConfig, List[RunnableConfig]]]) –
return_exceptions (bool) –
kwargs (Optional[Any]) –

Return type

List[Output]

batch_as_completed(inputs: Sequence[Input], config: Optional[Union[RunnableConfig, Sequence[RunnableConfig]]] = None, *, return_exceptions: bool = False, **kwargs: Optional[Any]) → Iterator[Tuple[int, Union[Output, Exception]]]

Run invoke in parallel on a list of inputs, yielding results as they complete.

Parameters

inputs (Sequence[Input]) –
config (Optional[Union[RunnableConfig, Sequence[RunnableConfig]]]) –
return_exceptions (bool) –
kwargs (Optional[Any]) –

Return type

Iterator[Tuple[int, Union[Output, Exception]]]
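
As an illustration of yielding results as they complete, the following sketch pairs each result with the index of its input, which is what the Tuple[int, ...] return type conveys. It uses concurrent.futures directly and a stand-in fake_invoke function. It is not LangChain's implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Iterator, List, Tuple, Union


def fake_invoke(delay: float) -> str:
    """Stand-in for a blocking model call that takes `delay` seconds."""
    time.sleep(delay)
    return f"slept {delay}"


def batch_as_completed_sketch(
    inputs: List[float], max_workers: int = 4
) -> Iterator[Tuple[int, Union[str, Exception]]]:
    """Yield (input_index, output) pairs in completion order, not input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_index = {pool.submit(fake_invoke, x): i for i, x in enumerate(inputs)}
        for future in as_completed(future_to_index):
            index = future_to_index[future]
            try:
                yield index, future.result()
            except Exception as exc:  # mirrors return_exceptions=True
                yield index, exc


completed = list(batch_as_completed_sketch([0.3, 0.05, 0.15]))
print(completed)
```

The index is what lets a caller match each output back to its input when the slowest request happens to be the first one submitted.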

bind_tools(tools: Sequence[Union[Dict[str, Any], Type, Callable, BaseTool]], **kwargs: Any) → Runnable[Union[PromptValue, str, Sequence[Union[BaseMessage, List[str], Tuple[str, str], str, Dict[str, Any]]]], BaseMessage]

Bind tool-like objects to this chat model.

Assumes the model is compatible with the OpenAI tool-calling API.

Parameters

tools (Sequence[Union[Dict[str, Any], Type, Callable, BaseTool]]) – A list of tool definitions to bind to this chat model. Supports any tool definition handled by langchain_core.utils.function_calling.convert_to_openai_tool().
kwargs (Any) – Any additional parameters are passed directly to self.bind(**kwargs).

Return type

Runnable[Union[PromptValue, str, Sequence[Union[BaseMessage, List[str], Tuple[str, str], str, Dict[str, Any]]]], BaseMessage]
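
For orientation, this is roughly the OpenAI-style tool definition that a schema like Multiply from the tool-calling example corresponds to. The dict below is hand-written to show the general shape ("type": "function" wrapping a JSON Schema for the arguments). The description text is assumed, and the exact output of convert_to_openai_tool() may differ in detail. missing_arguments is a hypothetical helper.

```python
import json

# Hand-written illustration of an OpenAI-style tool definition (assumed values).
multiply_tool = {
    "type": "function",
    "function": {
        "name": "Multiply",
        "description": "Multiply two integers.",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "integer", "description": "First integer"},
                "b": {"type": "integer", "description": "Second integer"},
            },
            "required": ["a", "b"],
        },
    },
}


def missing_arguments(tool: dict, args: dict) -> list:
    """Return required argument names that are absent from a tool call's args."""
    required = tool["function"]["parameters"].get("required", [])
    return [name for name in required if name not in args]


print(json.dumps(multiply_tool, indent=2))
print(missing_arguments(multiply_tool, {"a": 45}))
```

Checking a model's args against the "required" list before executing a tool is a cheap guard against partially filled calls.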

call_as_llm(message: str, stop: Optional[List[str]] = None, **kwargs: Any) → str

Deprecated since version langchain-core==0.1.7: Use invoke instead.

Parameters

message (str) –
stop (Optional[List[str]]) –
kwargs (Any) –

Return type

str

configurable_alternatives(which: ConfigurableField, *, default_key: str = 'default', prefix_keys: bool = False, **kwargs: Union[Runnable[Input, Output], Callable[[], Runnable[Input, Output]]]) → RunnableSerializable[Input, Output]

Configure alternatives for Runnables that can be set at runtime.

Parameters

which (ConfigurableField) – The ConfigurableField instance that will be used to select the alternative.
default_key (str) – The default key to use if no alternative is selected. Defaults to "default".
prefix_keys (bool) – Whether to prefix the keys with the ConfigurableField id. Defaults to False.
**kwargs (Union[Runnable[Input, Output], Callable[[], Runnable[Input, Output]]]) – A dictionary of keys to Runnable instances or callables that return Runnable instances.

Returns

A new Runnable with the alternatives configured.

Return type

RunnableSerializable[Input, Output]

from langchain_anthropic import ChatAnthropic
from langchain_core.runnables.utils import ConfigurableField
from langchain_openai import ChatOpenAI

model = ChatAnthropic(
    model_name="claude-3-sonnet-20240229"
).configurable_alternatives(
    ConfigurableField(id="llm"),
    default_key="anthropic",
    openai=ChatOpenAI()
)

# uses the default model ChatAnthropic
print(model.invoke("which organization created you?").content)

# uses ChatOpenAI
print(
    model.with_config(
        configurable={"llm": "openai"}
    ).invoke("which organization created you?").content
)

configurable_fields(**kwargs: Union[ConfigurableField, ConfigurableFieldSingleOption, ConfigurableFieldMultiOption]) → RunnableSerializable[Input, Output]

Configure particular Runnable fields at runtime.

Parameters: **kwargs (Union[ConfigurableField, ConfigurableFieldSingleOption, ConfigurableFieldMultiOption]) – A dictionary of ConfigurableField instances to configure.
Returns: A new Runnable with the fields configured.
Return type: RunnableSerializable[Input, Output]

from langchain_core.runnables import ConfigurableField
from langchain_openai import ChatOpenAI

model = ChatOpenAI(max_tokens=20).configurable_fields(
    max_tokens=ConfigurableField(
        id="output_token_number",
        name="Max tokens in the output",
        description="The maximum number of tokens in the output",
    )
)

# max_tokens = 20
print(
    "max_tokens_20: ",
    model.invoke("tell me something about chess").content
)

# max_tokens = 200
print("max_tokens_200: ", model.with_config(
    configurable={"output_token_number": 200}
    ).invoke("tell me something about chess").content
)

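generate(messages: List[List[BaseMessage]], stop: Optional[List[str]] = None, callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None, *, tags: Optional[List[str]] = None, metadata: Optional[Dict[str, Any]] = None, run_name: Optional[str] = None, run_id: Optional[UUID] = None, **kwargs: Any) → LLMResult

Pass a sequence of prompts to the model and return model generations.

This method should make use of batched calls for models that expose a batched API.

Use this method when you:

want to take advantage of batched calls,
need more output from the model than just the top generated value,
are building chains that are agnostic to the underlying language model type (e.g., pure text completion models vs. chat models).

Parameters

messages (List[List[BaseMessage]]) – List of lists of messages.
stop (Optional[List[str]]) – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
callbacks (Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]) – Callbacks to pass through. Used for executing additional functionality, such as logging or streaming, throughout generation.
**kwargs (Any) – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.
tags (Optional[List[str]]) –
metadata (Optional[Dict[str, Any]]) –
run_name (Optional[str]) –
run_id (Optional[UUID]) –

Returns

An LLMResult, which contains a list of candidate Generations for each input prompt and additional model provider-specific output.

Return type

LLMResult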

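generate_prompt(prompts: List[PromptValue], stop: Optional[List[str]] = None, callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None, **kwargs: Any) → LLMResult

Pass a sequence of prompts to the model and return model generations.

This method should make use of batched calls for models that expose a batched API.

Use this method when you:

want to take advantage of batched calls,
need more output from the model than just the top generated value,
are building chains that are agnostic to the underlying language model type (e.g., pure text completion models vs. chat models).

Parameters

prompts (List[PromptValue]) – List of PromptValues. A PromptValue is an object that can be converted to match the format of any language model (string for pure text generation models and BaseMessages for chat models).
stop (Optional[List[str]]) – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.
callbacks (Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]]) – Callbacks to pass through. Used for executing additional functionality, such as logging or streaming, throughout generation.
**kwargs (Any) – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.

Returns

An LLMResult, which contains a list of candidate Generations for each input prompt and additional model provider-specific output.

Return type

LLMResult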

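get_num_tokens(text: str) → int

Get the number of tokens present in the text.

Useful for checking if an input fits in a model's context window.

Parameters: text (str) – The string input to tokenize.
Returns: The integer number of tokens in the text.
Return type: int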
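
A typical use of get_num_tokens is a guard that checks a prompt against the context window before sending it. The sketch below imitates that check with a deliberately naive whitespace encoder of the kind that could be supplied as custom_get_token_ids. Real tokenizers produce different counts, and the 2048 and 128 limits simply reuse the num_ctx and num_predict defaults quoted in the parameter list. All function names are hypothetical.

```python
from typing import Callable, Dict, List

_vocab: Dict[str, int] = {}


def whitespace_token_ids(text: str) -> List[int]:
    """Toy encoder: one id per whitespace-separated word (not a real tokenizer)."""
    return [_vocab.setdefault(word, len(_vocab)) for word in text.split()]


def fits_context(
    text: str,
    num_ctx: int = 2048,
    reserved_for_output: int = 128,
    encoder: Callable[[str], List[int]] = whitespace_token_ids,
) -> bool:
    """Check that the prompt plus the reserved output budget fits in num_ctx tokens."""
    return len(encoder(text)) + reserved_for_output <= num_ctx


short_ok = fits_context("I love programming.")
long_ok = fits_context("word " * 3000)
print(short_ok, long_ok)
```

Reserving room for the reply matters because the context window is shared between the prompt and the generated tokens.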

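get_num_tokens_from_messages(messages: List[BaseMessage]) → int

Get the number of tokens in the messages.

Useful for checking if an input fits in a model's context window.

Parameters: messages (List[BaseMessage]) – The message inputs to tokenize.
Returns: The sum of the number of tokens across the messages.
Return type: int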

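get_token_ids(text: str) → List[int]

Return the ordered ids of the tokens in a text.

Parameters

text (str) – The string input to tokenize.

Returns

A list of ids corresponding to the tokens in the text, in the order they occur in the text.

Return type

List[int]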

invoke(input: LanguageModelInput, config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) → BaseMessage

Transform a single input into an output. Override to implement.

Parameters

input (LanguageModelInput) – The input to the Runnable.
config (Optional[RunnableConfig]) – A config to use when invoking the Runnable. The config supports standard keys like 'tags' and 'metadata' for tracing purposes, 'max_concurrency' for controlling how much work to do in parallel, and other keys. Please refer to the RunnableConfig for more details.
stop (Optional[List[str]]) –
kwargs (Any) –

Returns

The output of the Runnable.

Return type

BaseMessage

predict(text: str, *, stop: Optional[Sequence[str]] = None, **kwargs: Any) → str

Deprecated since version langchain-core==0.1.7: Use invoke instead.

Parameters

text (str) –
stop (Optional[Sequence[str]]) –
kwargs (Any) –

Return type

str

predict_messages(messages: List[BaseMessage], *, stop: Optional[Sequence[str]] = None, **kwargs: Any) → BaseMessage

Deprecated since version langchain-core==0.1.7: Use invoke instead.

Parameters

messages (List[BaseMessage]) –
stop (Optional[Sequence[str]]) –
kwargs (Any) –

Return type

BaseMessage

stream(input: LanguageModelInput, config: Optional[RunnableConfig] = None, *, stop: Optional[List[str]] = None, **kwargs: Any) → Iterator[BaseMessageChunk]

Default implementation of stream, which calls invoke. Subclasses should override this method if they support streaming output.

Parameters

input (LanguageModelInput) – The input to the Runnable.
config (Optional[RunnableConfig]) – The config to use for the Runnable. Defaults to None.
kwargs (Any) – Additional keyword arguments to pass to the Runnable.
stop (Optional[List[str]]) –

Yields

The output of the Runnable.

Return type

Iterator[BaseMessageChunk]

to_json() → Union[SerializedConstructor, SerializedNotImplemented]

Serialize the Runnable to JSON.

Returns: A JSON-serializable representation of the Runnable.
Return type: Union[SerializedConstructor, SerializedNotImplemented]

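with_structured_output(schema: Union[Dict, Type], *, include_raw: bool = False, **kwargs: Any) → Runnable[LanguageModelInput, Union[Dict, BaseModel]]

Model wrapper that returns outputs formatted to match the given schema.

Parameters

schema (Union[Dict, Type]) –
The output schema. Can be passed in as:
- an OpenAI function/tool schema,
- a JSON Schema,
- a TypedDict class (support added in 0.2.26),
- or a Pydantic class.
If schema is a Pydantic class, the model output will be a Pydantic instance of that class, and the model-generated fields will be validated by the Pydantic class. Otherwise the model output will be a dict and will not be validated. See langchain_core.utils.function_calling.convert_to_openai_tool() for more on how to properly specify types and descriptions of schema fields when specifying a Pydantic or TypedDict class.

Changed in version 0.2.26: Added support for TypedDict class.
include_raw (bool) – If False, only the parsed structured output is returned. If an error occurs during model output parsing, it will be raised. If True, both the raw model response (a BaseMessage) and the parsed model response will be returned. If an error occurs during output parsing, it will be caught and returned as well. The final output is always a dict with keys "raw", "parsed", and "parsing_error".
kwargs (Any) –

Returns

A Runnable that takes the same inputs as a langchain_core.language_models.chat.BaseChatModel.

If include_raw is False and schema is a Pydantic class, the Runnable outputs an instance of schema (i.e., a Pydantic object).

Otherwise, if include_raw is False, the Runnable outputs a dict.

If include_raw is True, the Runnable outputs a dict with keys:

"raw": BaseMessage
"parsed": None if there was a parsing error, otherwise the type depends on the schema as described above.
"parsing_error": Optional[BaseException]

Return type

Runnable[LanguageModelInput, Union[Dict, BaseModel]]

Example: Pydantic schema (include_raw=False):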

from langchain_core.pydantic_v1 import BaseModel

class AnswerWithJustification(BaseModel):
    '''An answer to the user question along with justification for the answer.'''
    answer: str
    justification: str

llm = ChatModel(model="model-name", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustification)

structured_llm.invoke("What weighs more a pound of bricks or a pound of feathers")

# -> AnswerWithJustification(
#     answer='They weigh the same',
#     justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'
# )

Example: Pydantic schema (include_raw=True):

from langchain_core.pydantic_v1 import BaseModel

class AnswerWithJustification(BaseModel):
    '''An answer to the user question along with justification for the answer.'''
    answer: str
    justification: str

llm = ChatModel(model="model-name", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustification, include_raw=True)

structured_llm.invoke("What weighs more a pound of bricks or a pound of feathers")
# -> {
#     'raw': AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_Ao02pnFYXD6GN1yzc0uXPsvF', 'function': {'arguments': '{"answer":"They weigh the same.","justification":"Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ."}', 'name': 'AnswerWithJustification'}, 'type': 'function'}]}),
#     'parsed': AnswerWithJustification(answer='They weigh the same.', justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'),
#     'parsing_error': None
# }
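
With include_raw=True the caller is responsible for checking parsing_error. One possible pattern is sketched below on a plain dict with the three documented keys. unwrap_structured is a hypothetical helper, and the sample values are placeholders rather than real model output.

```python
from typing import Any, Dict


def unwrap_structured(result: Dict[str, Any]) -> Any:
    """Return the parsed value, or raise with the raw response attached for debugging."""
    error = result.get("parsing_error")
    if error is not None:
        raise ValueError(f"could not parse model output: {result.get('raw')!r}") from error
    return result["parsed"]


# Placeholder results shaped like the documented include_raw=True output.
ok = {"raw": "<AIMessage>", "parsed": {"answer": "They weigh the same"}, "parsing_error": None}
bad = {"raw": "<AIMessage>", "parsed": None, "parsing_error": KeyError("answer")}

answer = unwrap_structured(ok)
try:
    unwrap_structured(bad)
    failed = False
except ValueError:
    failed = True
print(answer, failed)
```

Chaining the original exception with `from error` keeps the parser's traceback available while still surfacing the raw response.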

Example: Dict schema (include_raw=False):

from langchain_core.pydantic_v1 import BaseModel
from langchain_core.utils.function_calling import convert_to_openai_tool

class AnswerWithJustification(BaseModel):
    '''An answer to the user question along with justification for the answer.'''
    answer: str
    justification: str

dict_schema = convert_to_openai_tool(AnswerWithJustification)
llm = ChatModel(model="model-name", temperature=0)
structured_llm = llm.with_structured_output(dict_schema)

structured_llm.invoke("What weighs more a pound of bricks or a pound of feathers")
# -> {
#     'answer': 'They weigh the same',
#     'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume and density of the two substances differ.'
# }
