We have hosted the application whisper in order to run this application in our online workstations with Wine or directly.


Quick description about whisper:

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.

A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.

Programming Language: Python.
Categories:
Speech Recognition, Speech to Text

Page navigation:

©2024. Winfy. All Rights Reserved.

By OD Group OU – Registry code: 1609791 -VAT number: EE102345621.