@strands-agent
Contributor

Description

Add native audio content block support to the SDK's type system, following the established pattern for video, image, and document content.

Changes

  1. Types (src/strands/types/media.py):

    • Add AudioFormat literal type (mp3, wav, flac, ogg, webm)
    • Add AudioSource TypedDict with bytes attribute
    • Add AudioContent TypedDict with format and source attributes
  2. ContentBlock (src/strands/types/content.py):

    • Add audio: AudioContent field to the ContentBlock TypedDict
  3. Bedrock Provider (src/strands/models/bedrock.py):

    • Add audio handling in _format_request_message_content() following the video pattern
  4. LlamaCpp Provider (src/strands/models/llamacpp.py):

    • Update to use native AudioContent types instead of cast(Dict[str, Any], content)
  5. Tests (tests/strands/models/test_bedrock.py):

    • Add test_format_request_filters_audio_content_blocks test
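
As a rough sketch of what the type additions in items 1 and 2 might look like: the format list below follows item 1, while the trimmed-down ContentBlock (only text and audio, declared with total=False) is an assumption made for illustration. The actual definitions in src/strands/types/media.py and content.py may differ.

```python
from typing import Literal, TypedDict

# Formats as listed under "Types" above; the merged list may differ.
AudioFormat = Literal["mp3", "wav", "flac", "ogg", "webm"]


class AudioSource(TypedDict):
    """Raw audio payload."""

    bytes: bytes


class AudioContent(TypedDict):
    """An audio clip plus its container format."""

    format: AudioFormat
    source: AudioSource


class ContentBlock(TypedDict, total=False):
    """Trimmed-down stand-in for the SDK's ContentBlock (assumption)."""

    text: str
    audio: AudioContent


# A type checker can now verify the keys and the format literal.
example_block: ContentBlock = {
    "audio": {"format": "wav", "source": {"bytes": b"RIFF"}},
}
```

Mirroring the nested format/source layout used by the other media types is what lets providers treat audio the same way they treat video.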

Benefits

  • Type Safety: No more cast(Dict[str, Any], content) workarounds
  • Consistency: Audio follows the same pattern as video, image, and document
  • Model Support: Enables type-safe audio handling for Bedrock (Nova Sonic), LlamaCpp (Qwen2.5-Omni), and future providers
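
To illustrate the type-safety point, here is a minimal sketch of provider-side code reading a typed audio block without a cast. The helper name encode_audio_for_provider and the shape of the returned dict are hypothetical; they do not reflect the actual LlamaCpp or Bedrock request format.

```python
import base64
from typing import Any, Dict, Literal, TypedDict

AudioFormat = Literal["mp3", "wav", "flac", "ogg", "webm"]


class AudioSource(TypedDict):
    bytes: bytes


class AudioContent(TypedDict):
    format: AudioFormat
    source: AudioSource


def encode_audio_for_provider(audio: AudioContent) -> Dict[str, Any]:
    """Hypothetical helper: base64-encode typed audio for a JSON request body."""
    # Key access is checked by mypy; no cast(Dict[str, Any], content) is needed.
    data = base64.b64encode(audio["source"]["bytes"]).decode("utf-8")
    return {"format": audio["format"], "data": data}


payload = encode_audio_for_provider({"format": "wav", "source": {"bytes": b"abc"}})
```

With the typed signature, a misspelled key or an unsupported format string is reported by the type checker rather than surfacing at request time.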

Usage Example

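from pathlib import Path

from strands import Agent
from strands.types.content import ContentBlock

# Load the raw audio bytes (the file path is a placeholder)
audio_bytes = Path("sample.mp3").read_bytes()

audio_block: ContentBlock = {
    "audio": {
        "format": "mp3",
        "source": {"bytes": audio_bytes},
    }
}

# Assumes the configured model supports audio input
agent = Agent()
response = agent([audio_block, {"text": "What do you hear in this audio?"}])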

Related Issues

Closes #866

Documentation PR

No documentation changes required - this adds types that follow existing patterns.

Type of Change

New feature

Testing

  • I ran hatch run prepare (formatter + linter + mypy + tests)
  • Unit test passes: test_format_request_filters_audio_content_blocks
  • All existing tests continue to pass
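
For readers without the diff at hand, the filtering behaviour that the unit test checks can be sketched roughly as follows. format_audio_block is a hypothetical standalone stand-in for the audio branch of _format_request_message_content(), written on the assumption that it mirrors the video branch by keeping only the format and source bytes. The real test exercises BedrockModel itself.

```python
from typing import Any, Dict


def format_audio_block(content: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical stand-in for the audio branch of the Bedrock formatter.

    Keeps only 'format' and 'source.bytes' and drops any other keys.
    """
    audio = content["audio"]
    source = audio["source"]
    formatted_source = {"bytes": source["bytes"]} if "bytes" in source else {}
    return {"audio": {"format": audio["format"], "source": formatted_source}}


def test_format_request_filters_audio_content_blocks() -> None:
    block = {
        "audio": {
            "format": "mp3",
            "source": {"bytes": b"audio_data", "extraField": "dropped"},
            "extraField": "also dropped",
        }
    }
    formatted = format_audio_block(block)
    assert formatted == {
        "audio": {"format": "mp3", "source": {"bytes": b"audio_data"}}
    }


test_format_request_filters_audio_content_blocks()
```

Filtering to known keys keeps unexpected fields out of the request payload, which is the same reason the video branch does it.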

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Adds AudioContent TypedDict to support audio input in messages, following
the established pattern for image and video content.

Changes:
- Add AudioFormat Literal type with common audio formats (mp3, wav, flac, ogg, aac, webm)
- Add AudioSource TypedDict for audio binary content
- Add AudioContent TypedDict with format and source fields
- Add 'audio' field to ContentBlock TypedDict
- Add audio handling in BedrockModel._format_request_message_content()
- Add unit test for audio content block filtering

This enables type-safe audio input for model providers that support multimodal
audio content, such as Bedrock (Nova Sonic) and LlamaCpp (Qwen2.5-Omni).

Closes strands-agents#866

Development

Successfully merging this pull request may close these issues.

[FEATURE] Add Native Audio Support to ContentBlock Type