LLM providers and calls

Docent uses a unified interface to call and aggregate results from different LLM providers.

Provider registry

Each LLM provider is specified through a ProviderConfig object, which requires three functions:

  • async_client_getter: Returns an async client for the provider
  • single_output_getter: Gets a single non-streaming completion from the provider; must satisfy the SingleOutputGetter protocol
  • single_streaming_output_getter: Gets a streaming completion from the provider; must satisfy the SingleStreamingOutputGetter protocol

We currently support anthropic, google, openai, and azure_openai.

Adding a new provider

  1. Create a new module in docent_core/_llm_util/providers/ (e.g., my_provider.py)
  2. Implement the functions required by ProviderConfig
  3. Add the provider to the PROVIDERS dictionary in registry.py
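
For orientation, here is a minimal sketch of what the new module and registry entry might look like. The provider name my_provider, the function names, and the import paths for ChatMessage, ToolInfo, and LLMOutput are illustrative assumptions, not actual Docent code; the keyword-only parameters follow the SingleOutputGetter protocol documented below.

# docent_core/_llm_util/providers/my_provider.py (hypothetical sketch)
from typing import Any, Literal

from docent.data_models.chat import ChatMessage, ToolInfo  # assumed import path
from docent_core._llm_util.data_models.llm_output import LLMOutput  # assumed import path


def get_my_provider_client_async() -> Any:
    """Return an async client for the provider (matches async_client_getter)."""
    ...


async def get_my_provider_chat_completion_async(
    client: Any,
    messages: list[ChatMessage],
    model_name: str,
    *,
    tools: list[ToolInfo] | None,
    tool_choice: Literal["auto", "required"] | None,
    max_new_tokens: int,
    temperature: float,
    reasoning_effort: Literal["low", "medium", "high"] | None,
    logprobs: bool,
    top_logprobs: int | None,
    timeout: float,
) -> LLMOutput:
    """Single non-streaming completion; must satisfy the SingleOutputGetter protocol."""
    ...

Then, after implementing the analogous streaming getter, the provider is added as one more entry in the PROVIDERS dictionary in registry.py:

"my_provider": ProviderConfig(
    async_client_getter=get_my_provider_client_async,
    single_output_getter=get_my_provider_chat_completion_async,
    single_streaming_output_getter=get_my_provider_chat_completion_streaming_async,
),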

Selecting models for Docent functions

Docent uses a preference system to determine which LLM models to use for different functions. ProviderPreferences maps each Docent function to an ordered list of preferred ModelOption objects:

@cached_property
def function_name(self) -> list[ModelOption]:
    """Get model options for the function_name function.

    Returns:
        List of configured model options for this function.
    """
    return [
        ModelOption(
            provider="anthropic",
            model_name="claude-3-7-sonnet-20250219",
            reasoning_effort="medium"  # only for reasoning models
        ),
        ModelOption(
            provider="openai",
            model_name="o1",
            reasoning_effort="medium"
        ),
    ]

Any function that calls an LLM API must have a corresponding cached property in ProviderPreferences that returns its ModelOption preferences. LLMManager tries the first ModelOption and falls back to subsequent ones on failure.
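
The fallback behavior can be pictured with the sketch below. This is illustrative only, not the actual LLMManager implementation; call_model is a hypothetical stand-in for dispatching to the selected provider's output getter.

from docent_core._llm_util.providers.preferences import ModelOption


async def call_model(option: ModelOption, **kwargs):
    """Hypothetical stand-in: dispatch to the provider named by `option`."""
    ...


async def call_with_fallback(options: list[ModelOption], **kwargs):
    """Illustrative only: try each ModelOption in order until one succeeds."""
    last_error: Exception | None = None
    for option in options:
        try:
            return await call_model(option, **kwargs)
        except Exception as e:
            last_error = e  # on failure, fall back to the next preferred option
    raise RuntimeError("all model options failed") from last_error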

Usage

To customize which models are used for a specific function:

  1. Locate docent_core/_llm_util/providers/preferences.py
  2. Find the cached property for the function you want to customize
  3. Specify the ModelOption objects in the returned list, in order of preference
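
For example, to make a hypothetical function prefer Azure OpenAI and fall back to Anthropic, its cached property could return the following (the function name is a placeholder; the model names are taken from elsewhere on this page):

@cached_property
def my_docent_function(self) -> list[ModelOption]:  # hypothetical function name
    return [
        ModelOption(provider="azure_openai", model_name="gpt-4o-2024-08-06"),  # tried first
        ModelOption(provider="anthropic", model_name="claude-3-7-sonnet-20250219"),  # fallback
    ]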

docent_core._llm_util.providers.registry

Registry for LLM providers with their configurations.

PROVIDERS module-attribute

PROVIDERS: dict[str, ProviderConfig] = {
    "anthropic": ProviderConfig(
        async_client_getter=get_anthropic_client_async,
        single_output_getter=get_anthropic_chat_completion_async,
        single_streaming_output_getter=get_anthropic_chat_completion_streaming_async,
    ),
    "google": ProviderConfig(
        async_client_getter=get_google_client_async,
        single_output_getter=get_google_chat_completion_async,
        single_streaming_output_getter=get_google_chat_completion_streaming_async,
    ),
    "openai": ProviderConfig(
        async_client_getter=get_openai_client_async,
        single_output_getter=get_openai_chat_completion_async,
        single_streaming_output_getter=get_openai_chat_completion_streaming_async,
    ),
    "azure_openai": ProviderConfig(
        async_client_getter=get_azure_openai_client_async,
        single_output_getter=get_openai_chat_completion_async,
        single_streaming_output_getter=get_openai_chat_completion_streaming_async,
    ),
}

Registry of supported LLM providers with their respective configurations.

SingleOutputGetter

Bases: Protocol

Protocol for getting non-streaming output from an LLM.

Defines the interface for async functions that retrieve a single non-streaming response from an LLM provider.

Source code in docent_core/_llm_util/providers/registry.py
class SingleOutputGetter(Protocol):
    """Protocol for getting non-streaming output from an LLM.

    Defines the interface for async functions that retrieve a single
    non-streaming response from an LLM provider.
    """

    async def __call__(
        self,
        client: Any,
        messages: list[ChatMessage],
        model_name: str,
        *,
        tools: list[ToolInfo] | None,
        tool_choice: Literal["auto", "required"] | None,
        max_new_tokens: int,
        temperature: float,
        reasoning_effort: Literal["low", "medium", "high"] | None,
        logprobs: bool,
        top_logprobs: int | None,
        timeout: float,
    ) -> LLMOutput:
        """Get a single completion from an LLM.

        Args:
            client: The provider-specific client instance.
            messages: The list of messages in the conversation.
            model_name: The name of the model to use.
            tools: Optional list of tools available to the model.
            tool_choice: Optional specification for tool usage.
            max_new_tokens: Maximum number of tokens to generate.
            temperature: Controls randomness in output generation.
            reasoning_effort: Optional control for model reasoning depth.
            logprobs: Whether to return log probabilities.
            top_logprobs: Number of most likely tokens to return probabilities for.
            timeout: Maximum time to wait for a response in seconds.

        Returns:
            LLMOutput: The model's response.
        """
        ...

__call__ async

__call__(client: Any, messages: list[ChatMessage], model_name: str, *, tools: list[ToolInfo] | None, tool_choice: Literal['auto', 'required'] | None, max_new_tokens: int, temperature: float, reasoning_effort: Literal['low', 'medium', 'high'] | None, logprobs: bool, top_logprobs: int | None, timeout: float) -> LLMOutput

Get a single completion from an LLM.

Parameters:

    client (Any, required): The provider-specific client instance.
    messages (list[ChatMessage], required): The list of messages in the conversation.
    model_name (str, required): The name of the model to use.
    tools (list[ToolInfo] | None, required): Optional list of tools available to the model.
    tool_choice (Literal['auto', 'required'] | None, required): Optional specification for tool usage.
    max_new_tokens (int, required): Maximum number of tokens to generate.
    temperature (float, required): Controls randomness in output generation.
    reasoning_effort (Literal['low', 'medium', 'high'] | None, required): Optional control for model reasoning depth.
    logprobs (bool, required): Whether to return log probabilities.
    top_logprobs (int | None, required): Number of most likely tokens to return probabilities for.
    timeout (float, required): Maximum time to wait for a response in seconds.

Returns:

    LLMOutput: The model's response.

Source code in docent_core/_llm_util/providers/registry.py
async def __call__(
    self,
    client: Any,
    messages: list[ChatMessage],
    model_name: str,
    *,
    tools: list[ToolInfo] | None,
    tool_choice: Literal["auto", "required"] | None,
    max_new_tokens: int,
    temperature: float,
    reasoning_effort: Literal["low", "medium", "high"] | None,
    logprobs: bool,
    top_logprobs: int | None,
    timeout: float,
) -> LLMOutput:
    """Get a single completion from an LLM.

    Args:
        client: The provider-specific client instance.
        messages: The list of messages in the conversation.
        model_name: The name of the model to use.
        tools: Optional list of tools available to the model.
        tool_choice: Optional specification for tool usage.
        max_new_tokens: Maximum number of tokens to generate.
        temperature: Controls randomness in output generation.
        reasoning_effort: Optional control for model reasoning depth.
        logprobs: Whether to return log probabilities.
        top_logprobs: Number of most likely tokens to return probabilities for.
        timeout: Maximum time to wait for a response in seconds.

    Returns:
        LLMOutput: The model's response.
    """
    ...

SingleStreamingOutputGetter

Bases: Protocol

Protocol for getting streaming output from an LLM.

Defines the interface for async functions that retrieve streaming responses from an LLM provider.

Source code in docent_core/_llm_util/providers/registry.py
class SingleStreamingOutputGetter(Protocol):
    """Protocol for getting streaming output from an LLM.

    Defines the interface for async functions that retrieve streaming
    responses from an LLM provider.
    """

    async def __call__(
        self,
        client: Any,
        streaming_callback: AsyncSingleLLMOutputStreamingCallback | None,
        messages: list[ChatMessage],
        model_name: str,
        *,
        tools: list[ToolInfo] | None,
        tool_choice: Literal["auto", "required"] | None,
        max_new_tokens: int,
        temperature: float,
        reasoning_effort: Literal["low", "medium", "high"] | None,
        logprobs: bool,
        top_logprobs: int | None,
        timeout: float,
    ) -> LLMOutput:
        """Get a streaming completion from an LLM.

        Args:
            client: The provider-specific client instance.
            streaming_callback: Optional callback for processing streaming chunks.
            messages: The list of messages in the conversation.
            model_name: The name of the model to use.
            tools: Optional list of tools available to the model.
            tool_choice: Optional specification for tool usage.
            max_new_tokens: Maximum number of tokens to generate.
            temperature: Controls randomness in output generation.
            reasoning_effort: Optional control for model reasoning depth.
            logprobs: Whether to return log probabilities.
            top_logprobs: Number of most likely tokens to return probabilities for.
            timeout: Maximum time to wait for a response in seconds.

        Returns:
            LLMOutput: The complete model response after streaming finishes.
        """
        ...

__call__ async

__call__(client: Any, streaming_callback: AsyncSingleLLMOutputStreamingCallback | None, messages: list[ChatMessage], model_name: str, *, tools: list[ToolInfo] | None, tool_choice: Literal['auto', 'required'] | None, max_new_tokens: int, temperature: float, reasoning_effort: Literal['low', 'medium', 'high'] | None, logprobs: bool, top_logprobs: int | None, timeout: float) -> LLMOutput

Get a streaming completion from an LLM.

Parameters:

    client (Any, required): The provider-specific client instance.
    streaming_callback (AsyncSingleLLMOutputStreamingCallback | None, required): Optional callback for processing streaming chunks.
    messages (list[ChatMessage], required): The list of messages in the conversation.
    model_name (str, required): The name of the model to use.
    tools (list[ToolInfo] | None, required): Optional list of tools available to the model.
    tool_choice (Literal['auto', 'required'] | None, required): Optional specification for tool usage.
    max_new_tokens (int, required): Maximum number of tokens to generate.
    temperature (float, required): Controls randomness in output generation.
    reasoning_effort (Literal['low', 'medium', 'high'] | None, required): Optional control for model reasoning depth.
    logprobs (bool, required): Whether to return log probabilities.
    top_logprobs (int | None, required): Number of most likely tokens to return probabilities for.
    timeout (float, required): Maximum time to wait for a response in seconds.

Returns:

    LLMOutput: The complete model response after streaming finishes.

Source code in docent_core/_llm_util/providers/registry.py
async def __call__(
    self,
    client: Any,
    streaming_callback: AsyncSingleLLMOutputStreamingCallback | None,
    messages: list[ChatMessage],
    model_name: str,
    *,
    tools: list[ToolInfo] | None,
    tool_choice: Literal["auto", "required"] | None,
    max_new_tokens: int,
    temperature: float,
    reasoning_effort: Literal["low", "medium", "high"] | None,
    logprobs: bool,
    top_logprobs: int | None,
    timeout: float,
) -> LLMOutput:
    """Get a streaming completion from an LLM.

    Args:
        client: The provider-specific client instance.
        streaming_callback: Optional callback for processing streaming chunks.
        messages: The list of messages in the conversation.
        model_name: The name of the model to use.
        tools: Optional list of tools available to the model.
        tool_choice: Optional specification for tool usage.
        max_new_tokens: Maximum number of tokens to generate.
        temperature: Controls randomness in output generation.
        reasoning_effort: Optional control for model reasoning depth.
        logprobs: Whether to return log probabilities.
        top_logprobs: Number of most likely tokens to return probabilities for.
        timeout: Maximum time to wait for a response in seconds.

    Returns:
        LLMOutput: The complete model response after streaming finishes.
    """
    ...

ProviderConfig

Bases: TypedDict

Configuration for an LLM provider.

Contains the necessary functions to create clients and interact with a specific LLM provider.

Attributes:

    async_client_getter (Callable[[], Any]): Function to get an async client for the provider.
    single_output_getter (SingleOutputGetter): Function to get a non-streaming completion.
    single_streaming_output_getter (SingleStreamingOutputGetter): Function to get a streaming completion.
Source code in docent_core/_llm_util/providers/registry.py
class ProviderConfig(TypedDict):
    """Configuration for an LLM provider.

    Contains the necessary functions to create clients and interact with
    a specific LLM provider.

    Attributes:
        async_client_getter: Function to get an async client for the provider.
        single_output_getter: Function to get a non-streaming completion.
        single_streaming_output_getter: Function to get a streaming completion.
    """

    async_client_getter: Callable[[], Any]
    single_output_getter: SingleOutputGetter
    single_streaming_output_getter: SingleStreamingOutputGetter
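
To illustrate how the pieces fit together, the sketch below looks up a ProviderConfig from PROVIDERS, builds a client, and requests a single completion. The model name and argument values are placeholders; the keyword arguments mirror the SingleOutputGetter protocol above.

from docent_core._llm_util.providers.registry import PROVIDERS


async def example_single_call(messages):
    """Illustrative only: one non-streaming call through the registry."""
    config = PROVIDERS["anthropic"]
    client = config["async_client_getter"]()
    return await config["single_output_getter"](
        client,
        messages,
        "claude-3-7-sonnet-20250219",  # placeholder model name
        tools=None,
        tool_choice=None,
        max_new_tokens=1024,
        temperature=0.0,
        reasoning_effort=None,
        logprobs=False,
        top_logprobs=None,
        timeout=60.0,
    )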

docent_core._llm_util.providers.preferences

Provides preferences for which LLM models to use for different Docent functions.

ModelOption

Bases: BaseModel

Configuration for a specific model from a provider.

Attributes:

    provider (str): The name of the LLM provider (e.g., "openai", "anthropic").
    model_name (str): The specific model to use from the provider.
    reasoning_effort (Literal['low', 'medium', 'high'] | None): Optional indication of computational effort to use.
Source code in docent_core/_llm_util/providers/preferences.py
class ModelOption(BaseModel):
    """Configuration for a specific model from a provider.

    Attributes:
        provider: The name of the LLM provider (e.g., "openai", "anthropic").
        model_name: The specific model to use from the provider.
        reasoning_effort: Optional indication of computational effort to use.
    """

    provider: str
    model_name: str
    reasoning_effort: Literal["low", "medium", "high"] | None = None

ProviderPreferences

Bases: BaseModel

Manages model preferences for different docent functions.

This class provides access to configured model options for each function that requires LLM capabilities in the docent system.

Source code in docent_core/_llm_util/providers/preferences.py
class ProviderPreferences(BaseModel):
    """Manages model preferences for different docent functions.

    This class provides access to configured model options for each
    function that requires LLM capabilities in the docent system.
    """

    @cached_property
    def handle_ta_message(self) -> list[ModelOption]:
        """Get model options for the handle_ta_message function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
            ),
            ModelOption(
                provider="google",
                model_name="gemini-2.5-flash-preview-05-20",
            ),
            ModelOption(
                provider="openai",
                model_name="o1",
            ),
        ]

    @cached_property
    def generate_new_queries(self) -> list[ModelOption]:
        """Get model options for the generate_new_queries function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
                reasoning_effort="medium",
            ),
            ModelOption(
                provider="google",
                model_name="gemini-2.5-flash-preview-05-20",
                reasoning_effort="medium",
            ),
            ModelOption(
                provider="openai",
                model_name="o1",
                reasoning_effort="medium",
            ),
        ]

    @cached_property
    def summarize_intended_solution(self) -> list[ModelOption]:
        """Get model options for the summarize_intended_solution function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
            ),
            ModelOption(
                provider="google",
                model_name="gemini-2.5-flash-preview-05-20",
            ),
            ModelOption(
                provider="openai",
                model_name="gpt-4o-2024-08-06",
            ),
        ]

    @cached_property
    def summarize_agent_actions(self) -> list[ModelOption]:
        """Get model options for the summarize_agent_actions function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
                reasoning_effort="low",
            ),
            ModelOption(
                provider="google",
                model_name="gemini-2.5-flash-preview-05-20",
                reasoning_effort="low",
            ),
            ModelOption(
                provider="openai",
                model_name="o1",
                reasoning_effort="low",
            ),
        ]

    @cached_property
    def group_actions_into_high_level_steps(self) -> list[ModelOption]:
        """Get model options for the group_actions_into_high_level_steps function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
                reasoning_effort="low",
            ),
            ModelOption(
                provider="google",
                model_name="gemini-2.5-flash-preview-05-20",
                reasoning_effort="low",
            ),
            ModelOption(
                provider="openai",
                model_name="o1",
                reasoning_effort="low",
            ),
        ]

    @cached_property
    def interesting_agent_observations(self) -> list[ModelOption]:
        """Get model options for the interesting_agent_observations function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
                reasoning_effort="medium",
            ),
            ModelOption(
                provider="google",
                model_name="gemini-2.5-flash-preview-05-20",
                reasoning_effort="medium",
            ),
            ModelOption(
                provider="openai",
                model_name="o1",
                reasoning_effort="medium",
            ),
        ]

    @cached_property
    def propose_clusters(self) -> list[ModelOption]:
        """Get model options for the propose_clusters function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
            ),
            ModelOption(
                provider="google",
                model_name="gemini-2.5-flash-preview-05-20",
            ),
            ModelOption(
                provider="openai",
                model_name="gpt-4o-2024-08-06",
            ),
        ]

    @cached_property
    def execute_search(self) -> list[ModelOption]:
        """Get model options for the execute_search function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
                reasoning_effort="medium",
            ),
            ModelOption(
                provider="google",
                model_name="gemini-2.5-flash-preview-05-20",
                reasoning_effort="medium",
            ),
            ModelOption(
                provider="openai",
                model_name="o1",
                reasoning_effort="medium",
            ),
        ]

    @cached_property
    def execute_search_paired(self) -> list[ModelOption]:
        """Get model options for the execute_search_paired function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="openai",
                model_name="o3",
                reasoning_effort="medium",
            ),
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
                reasoning_effort="medium",
            ),
        ]

    @cached_property
    def cluster_assign_o3_mini(self) -> list[ModelOption]:
        """Get model options for the cluster_assign_o3-mini function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="openai",
                model_name="o3-mini",
                reasoning_effort="medium",
            ),
        ]

    @cached_property
    def cluster_assign_sonnet_37_thinking(self) -> list[ModelOption]:
        """Get model options for the cluster_assign_sonnet-37-thinking function.

        Returns:
            List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
                reasoning_effort="medium",
            ),
        ]

    @cached_property
    def compare_transcripts(self) -> list[ModelOption]:
        """Get model options for the compare_transcripts function.

        Returns:
                List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="anthropic",
                model_name="claude-3-7-sonnet-20250219",
                reasoning_effort="medium",
            ),
        ]

    @cached_property
    def cluster_assign_gemini_flash(self) -> list[ModelOption]:
        """Get model options for the cluster_assign_gemini_flash function.

        Returns:
                List of configured model options for this function.
        """
        return [
            ModelOption(
                provider="google",
                model_name="gemini-2.5-flash-preview-05-20",
                reasoning_effort="medium",
            ),
        ]

handle_ta_message cached property

handle_ta_message: list[ModelOption]

Get model options for the handle_ta_message function.

Returns:

    list[ModelOption]: List of configured model options for this function.

generate_new_queries cached property

generate_new_queries: list[ModelOption]

Get model options for the generate_new_queries function.

Returns:

    list[ModelOption]: List of configured model options for this function.

summarize_intended_solution cached property

summarize_intended_solution: list[ModelOption]

Get model options for the summarize_intended_solution function.

Returns:

    list[ModelOption]: List of configured model options for this function.

summarize_agent_actions cached property

summarize_agent_actions: list[ModelOption]

Get model options for the summarize_agent_actions function.

Returns:

    list[ModelOption]: List of configured model options for this function.

group_actions_into_high_level_steps cached property

group_actions_into_high_level_steps: list[ModelOption]

Get model options for the group_actions_into_high_level_steps function.

Returns:

    list[ModelOption]: List of configured model options for this function.

interesting_agent_observations cached property

interesting_agent_observations: list[ModelOption]

Get model options for the interesting_agent_observations function.

Returns:

    list[ModelOption]: List of configured model options for this function.

propose_clusters cached property

propose_clusters: list[ModelOption]

Get model options for the propose_clusters function.

Returns:

    list[ModelOption]: List of configured model options for this function.

execute_search cached property

execute_search: list[ModelOption]

Get model options for the execute_search function.

Returns:

    list[ModelOption]: List of configured model options for this function.

execute_search_paired cached property

execute_search_paired: list[ModelOption]

Get model options for the execute_search_paired function.

Returns:

    list[ModelOption]: List of configured model options for this function.

cluster_assign_o3_mini cached property

cluster_assign_o3_mini: list[ModelOption]

Get model options for the cluster_assign_o3_mini function.

Returns:

    list[ModelOption]: List of configured model options for this function.

cluster_assign_sonnet_37_thinking cached property

cluster_assign_sonnet_37_thinking: list[ModelOption]

Get model options for the cluster_assign_sonnet_37_thinking function.

Returns:

    list[ModelOption]: List of configured model options for this function.

compare_transcripts cached property

compare_transcripts: list[ModelOption]

Get model options for the compare_transcripts function.

Returns:

    list[ModelOption]: List of configured model options for this function.

cluster_assign_gemini_flash cached property

cluster_assign_gemini_flash: list[ModelOption]

Get model options for the cluster_assign_gemini_flash function.

Returns:

    list[ModelOption]: List of configured model options for this function.