Transcript

A Transcript object represents a sequence of chat messages (user, assistant, system, tool) from the perspective of a single agent. See here for more details on the chat message schemas.

Action units

Action units are logical groupings of related messages in a conversation. They represent complete interaction cycles between users, AI assistants, and tools.

Action units are determined by the following rules:

System Messages: Each system message forms its own standalone action unit. System messages are expected to come before any other messages.
User-Assistant Exchanges:
- A new user message starts a new action unit (unless it follows another user message)
- An assistant message that follows a user or another assistant message stays in the same unit; one that follows a tool message starts a new unit
- Tool messages are always part of the current unit

For precise details on how action units are determined, refer to the _compute_units_of_action method implementation.

Conceptual Examples

Example 1: basic action units

Action Unit 0:
  [System] "You are a helpful AI assistant..."

Action Unit 1:
  [User] "What's the weather today?"
  [Assistant] "The weather in your area is sunny with a high of 72°F"

Action Unit 2:
  [User] "What should I wear?"
  [Assistant] "Given the sunny weather, light clothing would be appropriate"

Example 2: action units with tools

Action Unit 0:
  [System] "You are a coding assistant..."

Action Unit 1:
  [User] "Create a Python function to calculate Fibonacci numbers"
  [Assistant] "I'll create that function for you"
  [Tool] Code generation tool output

Action Unit 2:
  [Assistant] "Here's the function I created: ..."

Edge cases

Multiple consecutive user messages stay in the same unit

Action Unit 0:
  [User] "Hello"
  [User] "Can you help me with something?"
  [Assistant] "Yes, I'd be happy to help."
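
The grouping rules and examples above can be sketched as a standalone function. This is a minimal illustration that works on a plain list of role strings rather than on ChatMessage objects; the function name `compute_units_of_action` and the use of `ValueError` for a misplaced system message are assumptions of this sketch, not the library's API.

```python
from __future__ import annotations


def compute_units_of_action(roles: list[str]) -> list[list[int]]:
    """Group message indices into action units, given only each message's role."""
    units: list[list[int]] = []
    current: list[int] = []

    for i, role in enumerate(roles):
        prev = roles[i - 1] if i > 0 else None

        if role == "system":
            # A system message is its own unit and must precede other messages.
            if current:
                raise ValueError("System messages must come before other messages")
            units.append([i])
        elif role == "user":
            # Start a new unit unless this continues a run of user messages.
            if current and prev != "user":
                units.append(current)
                current = []
            current.append(i)
        elif role == "assistant":
            # Stay in the unit after a user or assistant message; otherwise
            # (for example after a tool message) start a new unit.
            if current and prev not in ("user", "assistant"):
                units.append(current)
                current = []
            current.append(i)
        elif role == "tool":
            # Tool messages always join the current unit.
            current.append(i)
        else:
            raise ValueError(f"Unknown message role: {role}")

    # Close the last open unit, if any.
    if current:
        units.append(current)
    return units


# Example 2 from above: the assistant message after the tool output opens a new unit.
print(compute_units_of_action(["system", "user", "assistant", "tool", "assistant"]))
```

Working on role strings keeps the sketch independent of the chat message schema; only the `role` of each message and of its predecessor matters to the grouping.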

docent.data_models.transcript

Transcript

Bases: BaseModel

Represents a transcript of messages in a conversation with an AI agent.

A transcript contains a sequence of messages exchanged between different roles (system, user, assistant, tool) and provides methods to organize these messages into logical units of action.

Attributes:

Name	Type	Description
`id`	`str`	Unique identifier for the transcript, auto-generated by default.
`name`	`str \| None`	Optional human-readable name for the transcript.
`description`	`str \| None`	Optional description of the transcript.
`messages`	`list[ChatMessage]`	List of chat messages in the transcript.
`metadata`	`BaseMetadata`	Additional structured metadata about the transcript.

Source code in docent/data_models/transcript.py

class Transcript(BaseModel):
    """Represents a transcript of messages in a conversation with an AI agent.

    A transcript contains a sequence of messages exchanged between different roles
    (system, user, assistant, tool) and provides methods to organize these messages
    into logical units of action.

    Attributes:
        id: Unique identifier for the transcript, auto-generated by default.
        name: Optional human-readable name for the transcript.
        description: Optional description of the transcript.
        messages: List of chat messages in the transcript.
        metadata: Additional structured metadata about the transcript.
    """

    id: str = Field(default_factory=lambda: str(uuid4()))
    name: str | None = None
    description: str | None = None

    messages: list[ChatMessage]
    metadata: BaseMetadata = Field(default_factory=BaseMetadata)

    _units_of_action: list[list[int]] | None = PrivateAttr(default=None)

    @field_serializer("metadata")
    def serialize_metadata(self, metadata: BaseMetadata, _info: Any) -> dict[str, Any]:
        """
        Custom serializer for the metadata field so the internal fields are explicitly preserved.
        """
        return metadata.model_dump(strip_internal_fields=False)

    @field_validator("metadata", mode="before")
    @classmethod
    def _validate_metadata_type(cls, v: Any) -> Any:
        if v is not None and not isinstance(v, BaseMetadata):
            raise ValueError(
                f"metadata must be an instance of BaseMetadata, got {type(v).__name__}"
            )
        return v

    @property
    def units_of_action(self) -> list[list[int]]:
        """Get the units of action in the transcript.

        A unit of action represents a logical group of messages, such as a system message
        on its own or a user message followed by assistant responses and tool outputs.

        Returns:
            list[list[int]]: List of units of action, where each unit is a list of message indices.
        """
        if self._units_of_action is None:
            self._units_of_action = self._compute_units_of_action()
        return self._units_of_action

    def __init__(self, *args: Any, **kwargs: Any):
        super().__init__(*args, **kwargs)
        self._units_of_action = self._compute_units_of_action()

    def _compute_units_of_action(self) -> list[list[int]]:
        """Compute the units of action in the transcript.

        A unit of action is defined as:
        - A system prompt by itself
        - A group consisting of a user message, assistant response, and any associated tool outputs

        Returns:
            list[list[int]]: A list of units of action, where each unit is a list of message indices.
        """
        if not self.messages:
            return []

        units: list[list[int]] = []
        current_unit: list[int] = []

        def _start_new_unit():
            nonlocal current_unit
            if current_unit:
                units.append(current_unit.copy())
            current_unit = []

        for i, message in enumerate(self.messages):
            role = message.role
            prev_message = self.messages[i - 1] if i > 0 else None

            # System messages are their own unit
            if role == "system":
                assert not current_unit, "System message should be the first message"
                units.append([i])

            # User message always starts a new unit UNLESS the previous message was a user message
            elif role == "user":
                if current_unit and prev_message and prev_message.role != "user":
                    _start_new_unit()
                current_unit.append(i)

            # Start a new unit if the previous message was not a user or assistant message
            elif role == "assistant":
                if (
                    current_unit
                    and prev_message
                    and prev_message.role != "user"
                    and prev_message.role != "assistant"
                ):
                    _start_new_unit()
                current_unit.append(i)

            # Tool messages are part of the current unit
            elif role == "tool":
                current_unit.append(i)

            else:
                raise ValueError(f"Unknown message role: {role}")

        # Add the last unit if it exists
        _start_new_unit()

        return units

    def get_first_block_in_action_unit(self, action_unit_idx: int) -> int | None:
        """Get the index of the first message in a given action unit.

        Args:
            action_unit_idx: The index of the action unit.

        Returns:
            int | None: The index of the first message in the action unit,
                        or None if the action unit index is out of range.
        """
        if not self._units_of_action:
            self._units_of_action = self._compute_units_of_action()

        if 0 <= action_unit_idx < len(self._units_of_action):
            unit = self._units_of_action[action_unit_idx]
            return unit[0] if unit else None
        return None

    def get_action_unit_for_block(self, block_idx: int) -> int | None:
        """Find the action unit that contains the specified message block.

        Args:
            block_idx: The index of the message block to find.

        Returns:
            int | None: The index of the action unit containing the block,
                        or None if no action unit contains the block.
        """
        if not self._units_of_action:
            self._units_of_action = self._compute_units_of_action()

        for unit_idx, unit in enumerate(self._units_of_action):
            if block_idx in unit:
                return unit_idx
        return None

    def set_messages(self, messages: list[ChatMessage]):
        """Set the messages in the transcript and recompute units of action.

        Args:
            messages: The new list of chat messages to set.
        """
        self.messages = messages
        self._units_of_action = self._compute_units_of_action()

    def to_str(
        self,
        transcript_idx: int = 0,
        agent_run_idx: int | None = None,
        highlight_action_unit: int | None = None,
    ) -> str:
        return self.to_str_with_token_limit(
            token_limit=sys.maxsize,
            agent_run_idx=agent_run_idx,
            transcript_idx=transcript_idx,
            highlight_action_unit=highlight_action_unit,
        )[0]

    def to_str_with_token_limit(
        self,
        token_limit: int,
        transcript_idx: int = 0,
        agent_run_idx: int | None = None,
        highlight_action_unit: int | None = None,
    ) -> list[str]:
        """Represents the transcript as a list of strings, each of which is at most token_limit tokens
        under the GPT-4 tokenization scheme.

        We'll try to split up long transcripts along message boundaries and include metadata.
        For very long messages, we'll have to truncate them and remove metadata.

        Returns:
            list[str]: A list of strings, each of which is at most token_limit tokens
            under the GPT-4 tokenization scheme.
        """
        if highlight_action_unit is not None and not (
            0 <= highlight_action_unit < len(self._units_of_action or [])
        ):
            raise ValueError(f"Invalid action unit index: {highlight_action_unit}")

        # Format blocks by units of action
        au_blocks: list[str] = []
        for unit_idx, unit in enumerate(self._units_of_action or []):
            unit_blocks: list[str] = []
            for msg_idx in unit:
                unit_blocks.append(
                    format_chat_message(
                        self.messages[msg_idx],
                        msg_idx,
                        transcript_idx,
                        agent_run_idx,
                    )
                )

            unit_content = "\n".join(unit_blocks)

            # Add highlighting if requested (compare against None so unit 0 can be highlighted)
            if highlight_action_unit is not None and unit_idx == highlight_action_unit:
                blocks_str_template = "<HIGHLIGHTED>\n{}\n</HIGHLIGHTED>"
            else:
                blocks_str_template = "{}"
            au_blocks.append(
                blocks_str_template.format(
                    f"<action unit {unit_idx}>\n{unit_content}\n</action unit {unit_idx}>"
                )
            )
        blocks_str = "\n".join(au_blocks)

        # Gather metadata
        metadata_obj = self.metadata.model_dump(strip_internal_fields=True)
        # Add the field descriptions if they exist
        metadata_obj = {
            (f"{k} ({d})" if (d := self.metadata.get_field_description(k)) is not None else k): v
            for k, v in metadata_obj.items()
        }

        yaml_width = float("inf")
        block_str = f"<blocks>\n{blocks_str}\n</blocks>\n"
        metadata_str = f"<metadata>\n{yaml.dump(metadata_obj, width=yaml_width)}\n</metadata>"

        if token_limit == sys.maxsize:
            return [f"{block_str}" f"{metadata_str}"]

        metadata_token_count = get_token_count(metadata_str)
        block_token_count = get_token_count(block_str)

        if metadata_token_count + block_token_count <= token_limit:
            return [f"{block_str}" f"{metadata_str}"]
        else:
            results: list[str] = []
            block_token_counts = [get_token_count(block) for block in au_blocks]
            ranges = group_messages_into_ranges(
                block_token_counts, metadata_token_count, token_limit
            )
            for msg_range in ranges:
                if msg_range.include_metadata:
                    cur_au_blocks = "\n".join(au_blocks[msg_range.start : msg_range.end])
                    results.append(f"<blocks>\n{cur_au_blocks}\n</blocks>\n" f"{metadata_str}")
                else:
                    assert (
                        msg_range.end == msg_range.start + 1
                    ), "Ranges without metadata should be a single message"
                    result = str(au_blocks[msg_range.start])
                    if msg_range.num_tokens > token_limit - 10:
                        result = truncate_to_token_limit(result, token_limit - 10)
                    results.append(f"<blocks>\n{result}\n</blocks>\n")

            return results
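
The string layout that to_str_with_token_limit builds in the source above can be sketched in isolation. This is a simplified illustration: the helper name `format_units` is hypothetical, each unit's text is assumed to be already formatted (the output of format_chat_message is not shown on this page), and the metadata section is omitted. The sketch compares the highlight index against None explicitly so that unit 0 can be highlighted.

```python
from __future__ import annotations


def format_units(unit_texts: list[str], highlight: int | None = None) -> str:
    """Wrap pre-formatted unit texts in action-unit tags, optionally highlighting one."""
    if highlight is not None and not (0 <= highlight < len(unit_texts)):
        raise ValueError(f"Invalid action unit index: {highlight}")

    blocks: list[str] = []
    for unit_idx, content in enumerate(unit_texts):
        block = f"<action unit {unit_idx}>\n{content}\n</action unit {unit_idx}>"
        # Only the requested unit is wrapped in the highlight tags.
        if highlight is not None and unit_idx == highlight:
            block = f"<HIGHLIGHTED>\n{block}\n</HIGHLIGHTED>"
        blocks.append(block)

    return "<blocks>\n" + "\n".join(blocks) + "\n</blocks>\n"


print(format_units(["[System] ...", "[User] ...\n[Assistant] ..."], highlight=1))
```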

units_of_action `property`

units_of_action: list[list[int]]

Get the units of action in the transcript.

A unit of action represents a logical group of messages, such as a system message on its own or a user message followed by assistant responses and tool outputs.

Returns:

Type	Description
`list[list[int]]`	List of units of action, where each unit is a list of message indices.
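
Because each unit is a list of message indices, the indices have to be resolved against the message list to recover the messages themselves. A minimal sketch, using plain role strings in place of ChatMessage objects; the helper name `roles_by_unit` is hypothetical, and with a real Transcript the same indexing would presumably apply to `transcript.messages` and `transcript.units_of_action`.

```python
from __future__ import annotations


def roles_by_unit(roles: list[str], units: list[list[int]]) -> list[list[str]]:
    """Resolve each unit's message indices back to the corresponding items."""
    return [[roles[i] for i in unit] for unit in units]


roles = ["system", "user", "assistant", "tool", "assistant"]
units = [[0], [1, 2, 3], [4]]  # same shape as units_of_action

for unit_idx, unit_roles in enumerate(roles_by_unit(roles, units)):
    print(unit_idx, unit_roles)
```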

serialize_metadata

serialize_metadata(metadata: BaseMetadata, _info: Any) -> dict[str, Any]

Custom serializer for the metadata field so the internal fields are explicitly preserved.

Source code in docent/data_models/transcript.py

@field_serializer("metadata")
def serialize_metadata(self, metadata: BaseMetadata, _info: Any) -> dict[str, Any]:
    """
    Custom serializer for the metadata field so the internal fields are explicitly preserved.
    """
    return metadata.model_dump(strip_internal_fields=False)

get_first_block_in_action_unit

get_first_block_in_action_unit(action_unit_idx: int) -> int | None

Get the index of the first message in a given action unit.

Parameters:

Name	Type	Description	Default
`action_unit_idx`	`int`	The index of the action unit.	required

Returns:

Type	Description
`int \| None`	The index of the first message in the action unit, or None if the action unit index is out of range.

Source code in docent/data_models/transcript.py

def get_first_block_in_action_unit(self, action_unit_idx: int) -> int | None:
    """Get the index of the first message in a given action unit.

    Args:
        action_unit_idx: The index of the action unit.

    Returns:
        int | None: The index of the first message in the action unit,
                    or None if the action unit index is out of range.
    """
    if not self._units_of_action:
        self._units_of_action = self._compute_units_of_action()

    if 0 <= action_unit_idx < len(self._units_of_action):
        unit = self._units_of_action[action_unit_idx]
        return unit[0] if unit else None
    return None
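
The lookup above can be sketched on a plain list of units. The function name `first_block_in_unit` is hypothetical. Note that, following the source, an out-of-range index (including a negative one) yields None rather than raising or wrapping around as ordinary list indexing would.

```python
from __future__ import annotations


def first_block_in_unit(units: list[list[int]], action_unit_idx: int) -> int | None:
    """Return the first message index of a unit, or None if the unit index is out of range."""
    if 0 <= action_unit_idx < len(units):
        unit = units[action_unit_idx]
        return unit[0] if unit else None
    return None


units = [[0], [1, 2, 3], [4]]
print(first_block_in_unit(units, 1))
print(first_block_in_unit(units, 5))
```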

get_action_unit_for_block

get_action_unit_for_block(block_idx: int) -> int | None

Find the action unit that contains the specified message block.

Parameters:

Name	Type	Description	Default
`block_idx`	`int`	The index of the message block to find.	required

Returns:

Type	Description
`int \| None`	The index of the action unit containing the block, or None if no action unit contains the block.

Source code in docent/data_models/transcript.py

def get_action_unit_for_block(self, block_idx: int) -> int | None:
    """Find the action unit that contains the specified message block.

    Args:
        block_idx: The index of the message block to find.

    Returns:
        int | None: The index of the action unit containing the block,
                    or None if no action unit contains the block.
    """
    if not self._units_of_action:
        self._units_of_action = self._compute_units_of_action()

    for unit_idx, unit in enumerate(self._units_of_action):
        if block_idx in unit:
            return unit_idx
    return None
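
The reverse lookup above is a linear scan over the units. A minimal sketch on plain lists follows, together with a precomputed index that may be preferable when many lookups are made against the same units; both helper names are hypothetical and the precomputed index is not part of the class shown here.

```python
from __future__ import annotations


def action_unit_for_block(units: list[list[int]], block_idx: int) -> int | None:
    """Scan the units and return the index of the one containing block_idx."""
    for unit_idx, unit in enumerate(units):
        if block_idx in unit:
            return unit_idx
    return None


def build_block_to_unit(units: list[list[int]]) -> dict[int, int]:
    """Precompute a message-index -> unit-index mapping for repeated lookups."""
    return {block_idx: unit_idx for unit_idx, unit in enumerate(units) for block_idx in unit}


units = [[0], [1, 2, 3], [4]]
print(action_unit_for_block(units, 3))
print(build_block_to_unit(units).get(3))
```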

set_messages

set_messages(messages: list[ChatMessage])

Set the messages in the transcript and recompute units of action.

Parameters:

Name	Type	Description	Default
`messages`	`list[ChatMessage]`	The new list of chat messages to set.	required

Source code in docent/data_models/transcript.py

def set_messages(self, messages: list[ChatMessage]):
    """Set the messages in the transcript and recompute units of action.

    Args:
        messages: The new list of chat messages to set.
    """
    self.messages = messages
    self._units_of_action = self._compute_units_of_action()
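
Judging from the source shown on this page, the units are computed at construction time and again in set_messages, so changing the message list in place would leave the cached units stale. The sketch below illustrates that pattern with a plain-Python stand-in; `MiniTranscript`, `set_roles`, and its simplified grouping rule are hypothetical and are not the docent classes.

```python
from __future__ import annotations


class MiniTranscript:
    """Plain-Python stand-in for illustration; not the docent Transcript class."""

    def __init__(self, roles: list[str]):
        self.roles = roles
        self._units = self._compute()

    def _compute(self) -> list[list[int]]:
        # Simplified grouping for the demo: every user message opens a new unit.
        units: list[list[int]] = []
        for i, role in enumerate(self.roles):
            if role == "user" or not units:
                units.append([i])
            else:
                units[-1].append(i)
        return units

    @property
    def units(self) -> list[list[int]]:
        # Returns the cached value; it is only refreshed by set_roles.
        return self._units

    def set_roles(self, roles: list[str]) -> None:
        # Replace the messages and recompute the cached units together.
        self.roles = roles
        self._units = self._compute()


t = MiniTranscript(["user", "assistant"])
t.roles.append("user")        # in-place change: the cache is not refreshed
stale = t.units
t.set_roles(t.roles)          # the setter recomputes the units
fresh = t.units
print(stale, fresh)
```

Routing every change through the setter keeps the message list and the cached grouping consistent.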

to_str_with_token_limit

to_str_with_token_limit(token_limit: int, transcript_idx: int = 0, agent_run_idx: int | None = None, highlight_action_unit: int | None = None) -> list[str]

Represents the transcript as a list of strings, each of which is at most token_limit tokens under the GPT-4 tokenization scheme.

Long transcripts are split along action unit boundaries, and metadata is included with each chunk where it fits. An action unit that is too long on its own is emitted without metadata and truncated to fit within the limit.

Returns:

Type	Description
`list[str]`	A list of strings, each of which is at most token_limit tokens under the GPT-4 tokenization scheme.

Source code in docent/data_models/transcript.py

def to_str_with_token_limit(
    self,
    token_limit: int,
    transcript_idx: int = 0,
    agent_run_idx: int | None = None,
    highlight_action_unit: int | None = None,
) -> list[str]:
    """Represents the transcript as a list of strings, each of which is at most token_limit tokens
    under the GPT-4 tokenization scheme.

    We'll try to split up long transcripts along message boundaries and include metadata.
    For very long messages, we'll have to truncate them and remove metadata.

    Returns:
        list[str]: A list of strings, each of which is at most token_limit tokens
        under the GPT-4 tokenization scheme.
    """
    if highlight_action_unit is not None and not (
        0 <= highlight_action_unit < len(self._units_of_action or [])
    ):
        raise ValueError(f"Invalid action unit index: {highlight_action_unit}")

    # Format blocks by units of action
    au_blocks: list[str] = []
    for unit_idx, unit in enumerate(self._units_of_action or []):
        unit_blocks: list[str] = []
        for msg_idx in unit:
            unit_blocks.append(
                format_chat_message(
                    self.messages[msg_idx],
                    msg_idx,
                    transcript_idx,
                    agent_run_idx,
                )
            )

        unit_content = "\n".join(unit_blocks)

        # Add highlighting if requested
        if highlight_action_unit is not None and unit_idx == highlight_action_unit:
            blocks_str_template = "<HIGHLIGHTED>\n{}\n</HIGHLIGHTED>"
        else:
            blocks_str_template = "{}"
        au_blocks.append(
            blocks_str_template.format(
                f"<action unit {unit_idx}>\n{unit_content}\n</action unit {unit_idx}>"
            )
        )
    blocks_str = "\n".join(au_blocks)

    # Gather metadata
    metadata_obj = self.metadata.model_dump(strip_internal_fields=True)
    # Add the field descriptions if they exist
    metadata_obj = {
        (f"{k} ({d})" if (d := self.metadata.get_field_description(k)) is not None else k): v
        for k, v in metadata_obj.items()
    }

    yaml_width = float("inf")
    block_str = f"<blocks>\n{blocks_str}\n</blocks>\n"
    metadata_str = f"<metadata>\n{yaml.dump(metadata_obj, width=yaml_width)}\n</metadata>"

    if token_limit == sys.maxsize:
        return [f"{block_str}" f"{metadata_str}"]

    metadata_token_count = get_token_count(metadata_str)
    block_token_count = get_token_count(block_str)

    if metadata_token_count + block_token_count <= token_limit:
        return [f"{block_str}" f"{metadata_str}"]
    else:
        results: list[str] = []
        block_token_counts = [get_token_count(block) for block in au_blocks]
        ranges = group_messages_into_ranges(
            block_token_counts, metadata_token_count, token_limit
        )
        for msg_range in ranges:
            if msg_range.include_metadata:
                cur_au_blocks = "\n".join(au_blocks[msg_range.start : msg_range.end])
                results.append(f"<blocks>\n{cur_au_blocks}\n</blocks>\n" f"{metadata_str}")
            else:
                assert (
                    msg_range.end == msg_range.start + 1
                ), "Ranges without metadata should be a single message"
                result = str(au_blocks[msg_range.start])
                if msg_range.num_tokens > token_limit - 10:
                    result = truncate_to_token_limit(result, token_limit - 10)
                results.append(f"<blocks>\n{result}\n</blocks>\n")

        return results
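
The splitting step relies on group_messages_into_ranges, whose implementation is not shown on this page. The sketch below is one plausible greedy strategy under stated assumptions, not the library's algorithm: the function name `pack_units` and the tuple layout are hypothetical, token counts are supplied directly, and the small overhead of the wrapper tags and joining newlines is ignored.

```python
from __future__ import annotations


def pack_units(
    unit_tokens: list[int], metadata_tokens: int, token_limit: int
) -> list[tuple[int, int, bool]]:
    """Greedily pack consecutive units into (start, end, include_metadata) ranges."""
    budget = token_limit - metadata_tokens  # room left for units once metadata is added
    ranges: list[tuple[int, int, bool]] = []
    start, used = 0, 0

    for i, n in enumerate(unit_tokens):
        if n > budget:
            # Oversized unit: flush the open range, then emit it alone without metadata.
            if i > start:
                ranges.append((start, i, True))
            ranges.append((i, i + 1, False))
            start, used = i + 1, 0
        elif used + n > budget:
            # Unit does not fit in the open range: close it and start a new one here.
            ranges.append((start, i, True))
            start, used = i, n
        else:
            used += n

    # Flush whatever remains in the open range.
    if start < len(unit_tokens):
        ranges.append((start, len(unit_tokens), True))
    return ranges


print(pack_units([30, 30, 200, 30], metadata_tokens=20, token_limit=100))
```

A range emitted without metadata always covers a single unit, which matches the assertion in the source above; the caller would then truncate that unit if it still exceeds the limit.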

TranscriptWithoutMetadataValidator

Bases: Transcript

A version of Transcript without the type-checking field validator on metadata (the source refers to it as the model_validator). It is needed for sending and receiving transcripts via JSON, since these incorrectly trip the validator on Transcript.

Source code in docent/data_models/transcript.py

class TranscriptWithoutMetadataValidator(Transcript):
    """
    A version of Transcript that doesn't have the model_validator on metadata.
    Needed for sending/receiving transcripts via JSON, since they incorrectly trip the existing model_validator.
    """

    @field_validator("metadata", mode="before")
    @classmethod
    def _validate_metadata_type(cls, v: Any) -> Any:
        # Bypass the model_validator
        return v

