OpenAI Whisper
pro
audio_video
Created Mar 29, 2025
$0.006 per minute
Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition, translation, and language identification.

Technical Specifications

  • Service Type: Speech-to-Text Generation
  • Supported Formats:
    • Input: Audio file URL (supports various audio formats)
    • Output: Text transcription with optional English translation

Usage Examples

Convert speech in audio to text

User prompt:

use whisper speech to text https://storage.oaphub.ai/19/194371087193079808/138778300075511946903f4?k=ca694125

Result:

Result: " This is a short test of the speech generation system. How does it sound?"

Raw tool call (how an LLM might use this tool)

[
  {
    "name": "Whisper-Speech-To-Text",
    "arguments": {
      "audio": "https://storage.oaphub.ai/19/194371087193079808/138778300075511946903f4?k=ca694125"
    }
  }
]
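
The tool call above could be assembled programmatically. The following is a minimal sketch, assuming the server is reached through the standard MCP JSON-RPC `tools/call` method; the helper name `build_tool_call` and the `request_id` parameter are hypothetical and not part of this listing. The optional `translation` argument is the one documented in the Tools section.

```python
import json


def build_tool_call(audio_url: str, translation: bool = False, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request that invokes Whisper-Speech-To-Text.

    Hypothetical helper: only the tool name and argument names come from
    the listing; the envelope assumes the generic MCP `tools/call` method.
    """
    arguments = {"audio": audio_url}
    # Only send `translation` when it differs from the documented default (false).
    if translation:
        arguments["translation"] = True
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "Whisper-Speech-To-Text", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_tool_call(
        "https://storage.oaphub.ai/19/194371087193079808/138778300075511946903f4?k=ca694125"
    )
    print(json.dumps(request, indent=2))
```

Leaving `translation` out when it is false keeps the request identical to the raw call shown above.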

Raw tool result from the MCP server

[
  {
    "type": "text",
    "text": "Result: \" This is a short test of the speech generation system. How does it sound?\"",
    "annotations": null
  }
]
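
A client usually wants the bare transcript rather than the wrapped string. The sketch below parses the raw result shown above and strips the `Result: ` prefix and the surrounding quotes. The function name `extract_transcript` is hypothetical, and the assumption that every response follows this exact `Result: "..."` shape is based only on the single sample in this listing.

```python
import json

# The raw tool result from the listing, kept as a raw string so the
# JSON escape sequences (\") survive unchanged.
RAW_RESULT = r'''
[
  {
    "type": "text",
    "text": "Result: \" This is a short test of the speech generation system. How does it sound?\"",
    "annotations": null
  }
]
'''


def extract_transcript(raw: str) -> str:
    """Return the transcript text from a raw tool result (hypothetical helper)."""
    items = json.loads(raw)
    # Keep only text content items; other content types are ignored.
    text = "\n".join(item["text"] for item in items if item.get("type") == "text")
    prefix = "Result: "
    if text.startswith(prefix):
        text = text[len(prefix):]
    # Drop the wrapping quotes and the leading space seen in the sample.
    return text.strip().strip('"').strip()


if __name__ == "__main__":
    print(extract_transcript(RAW_RESULT))
```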

Tools

Whisper-Speech-To-Text

Usage: Convert speech in audio to text

Input Arguments:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| audio | string | | URL to the audio file to transcribe. Must be a publicly accessible URL pointing to an audio file (MP3, WAV, M4A, etc.). |
| translation | boolean | | When set to true, translates the transcribed text to English. Default: false. Useful for non-English audio content. |
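
The argument rules above can be checked on the client before a call is sent. This is a minimal sketch, not the server's own validation: the function name `validate_arguments` is hypothetical, and it assumes `audio` is required (the listing does not show a value in the Required column) and that a "publicly accessible URL" means an `http` or `https` URL. It cannot confirm that the URL is actually reachable or points to audio.

```python
from urllib.parse import urlparse

ALLOWED_ARGUMENTS = {"audio", "translation"}


def validate_arguments(arguments: dict) -> list[str]:
    """Return a list of problems found in the tool arguments (empty if none)."""
    errors = []

    audio = arguments.get("audio")
    if not isinstance(audio, str) or not audio:
        # Assumption: `audio` is mandatory, since the tool has nothing to transcribe without it.
        errors.append("audio must be a non-empty string")
    else:
        parsed = urlparse(audio)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            errors.append("audio must be an http(s) URL")

    # `translation` is optional and defaults to false per the listing.
    if not isinstance(arguments.get("translation", False), bool):
        errors.append("translation must be a boolean")

    for name in sorted(set(arguments) - ALLOWED_ARGUMENTS):
        errors.append(f"unknown argument: {name}")

    return errors


if __name__ == "__main__":
    print(validate_arguments({"audio": "https://example.com/clip.mp3", "translation": True}))
    print(validate_arguments({"audio": "clip.mp3", "translation": "yes"}))
```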
© 2025 Open Agent Platform. All Rights Reserved.