
OpenAI Whisper
pro
audio_video
Created Mar 29, 2025
$0.006 per minute
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition, translation, and language identification.
Technical Specifications
- Service Type: Speech-to-Text Generation
- Supported Formats:
  - Input: Audio file URL (supports various audio formats)
  - Output: Text transcription with optional English translation
Usage Examples
Convert speech in audio to text
User prompt:
use whisper speech to text https://storage.oaphub.ai/19/194371087193079808/138778300075511946903f4?k=ca694125
Result:
Result: " This is a short test of the speech generation system. How does it sound?"
Raw tool call (how an LLM might use this tool)
[
  {
    "name": "Whisper-Speech-To-Text",
    "arguments": {
      "audio": "https://storage.oaphub.ai/19/194371087193079808/138778300075511946903f4?k=ca694125"
    }
  }
]
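The raw tool call above can be assembled programmatically. The following is a minimal sketch, not the platform's client code: the helper name `build_whisper_call` and the `example.com` URL are hypothetical, and only the tool name and the `audio` and `translation` argument names come from this page.

```python
import json

TOOL_NAME = "Whisper-Speech-To-Text"  # tool name as listed on this page


def build_whisper_call(audio_url: str, translation: bool = False) -> list[dict]:
    """Build a tool-call payload shaped like the raw example on this page.

    Hypothetical helper for illustration; it does not send anything.
    """
    arguments = {"audio": audio_url}
    # Include the optional flag only when it differs from the documented default.
    if translation:
        arguments["translation"] = True
    return [{"name": TOOL_NAME, "arguments": arguments}]


if __name__ == "__main__":
    # Placeholder URL; replace with a publicly accessible audio file.
    call = build_whisper_call("https://example.com/sample.mp3")
    print(json.dumps(call, indent=2))
```

Omitting `translation` when it is false keeps the payload identical in shape to the example, which passes only `audio`.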
Raw tool result from the MCP server
[
  {
    "type": "text",
    "text": "Result: \" This is a short test of the speech generation system. How does it sound?\"",
    "annotations": null
  }
]
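A caller will usually want the bare transcript rather than the wrapped string. The sketch below assumes every result follows the shape of the sample above (a list of items with `"type": "text"` whose text starts with `Result:` and wraps the transcript in double quotes); the function name `extract_transcript` is hypothetical, and other result shapes are not handled.

```python
def extract_transcript(result_items: list[dict]) -> str:
    """Return the transcript from a tool result shaped like the sample.

    Assumes the 'Result: "..."' wrapper seen in the sample output.
    """
    # Keep only text items and join them in order.
    texts = [item["text"] for item in result_items if item.get("type") == "text"]
    raw = "\n".join(texts).strip()
    # Drop the leading 'Result:' label if present.
    if raw.startswith("Result:"):
        raw = raw[len("Result:"):].strip()
    # Drop one pair of surrounding double quotes if present.
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        raw = raw[1:-1]
    return raw.strip()


if __name__ == "__main__":
    sample = [
        {
            "type": "text",
            "text": 'Result: " This is a short test of the speech generation system. How does it sound?"',
            "annotations": None,
        }
    ]
    print(extract_transcript(sample))
```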
Tools
Whisper-Speech-To-Text
Usage: Convert speech in audio to text
Input Arguments:
Name | Type | Required | Description |
---|---|---|---|
audio | string | ✓ | URL to the audio file to transcribe. Must be a publicly accessible URL pointing to an audio file (MP3, WAV, M4A, etc.) |
translation | boolean | | When set to true, translates the transcribed text to English. Default: false. Useful for non-English audio content |
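The argument rules in the table can be checked before a call is made. This is a sketch under stated assumptions: the function name `validate_arguments` is hypothetical, and restricting `audio` to `http` or `https` URLs is an assumption drawn from "publicly accessible URL", not a rule stated by the service. It cannot verify that the URL is actually reachable or points to audio.

```python
from urllib.parse import urlparse


def validate_arguments(arguments: dict) -> dict:
    """Check arguments against the table and fill in the documented default."""
    audio = arguments.get("audio")
    # 'audio' is the only required argument.
    if not isinstance(audio, str) or not audio:
        raise ValueError("'audio' is required and must be a non-empty string")
    parsed = urlparse(audio)
    # Assumption: a publicly accessible URL means an http(s) URL with a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("'audio' must be an http(s) URL")
    # 'translation' is optional and defaults to false.
    translation = arguments.get("translation", False)
    if not isinstance(translation, bool):
        raise ValueError("'translation' must be a boolean")
    return {"audio": audio, "translation": translation}


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    print(validate_arguments({"audio": "https://example.com/talk.m4a", "translation": True}))
```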
© 2025 Open Agent Platform. All Rights Reserved.