Speech 🔈

End-to-end, blazingly fast, streamable Speech-to-Text and Speech Translation with Speaker Diarization and OpenAI compatibility.


Transcribes audio input in Standard Malay, local Malay, Standard English, Manglish, Mandarin, Indonesian and Tamil.

Better Accuracy

🔈 Base achieved 100% on the prepared Hallucination-Free Test, mesolitica/speech-test-set/hallucination-free (higher is better).

We benchmarked on the Malaysian Speech Translation and Speech-to-Text test set, mesolitica/malaysian-stt-leaderboard (higher is better).

Developer Playground

You can try Speech 🔈 in the Dashboard:

  • 230 TPS / 50 Seconds per Second on 🔈 Base, 300 TPS / 80 Seconds per Second on 🔈 Small
  • Real-time SRT format
  • Record Mode and Upload Mode
  • File upload accepts MP3, WAV or FLAC, up to 30 MB
  • Real-time Speaker Diarization
  • Generates code for OpenAI Python, OpenAI Node.js, Python AIOHTTP Streaming and cURL
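
The upload limits in the list above can be checked on the client before sending a file. The following is a minimal sketch, not part of the product: the function name `check_upload` is hypothetical, and reading "30MB" as 30 * 1024 * 1024 bytes is an assumption (the service may count 30,000,000 bytes instead).

```python
from pathlib import Path

# Formats and size limit taken from the feature list.
# Assumption: "30MB" means 30 * 1024 * 1024 bytes.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".flac"}
MAX_UPLOAD_BYTES = 30 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_UPLOAD_BYTES}")
    return problems
```

This only inspects the file name and size, so a mislabeled file would still pass; the service itself remains the final check.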

Prepaid pricing

Best for either solo users or a team.

  • Speech translation to Malay or English
  • Max 30 MB upload
  • Real-time Speaker Diarization
  • Streaming capability

🔈 Base

0.4 USD
per 1 Hour

🔈 Small

0.25 USD
per 1 Hour

Frequently asked questions

How is the Speech API charged if I have long audio?

We charge for every 30 successfully processed seconds.
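
To make the 30-second billing rule concrete, here is a rough cost estimate using the hourly prices listed above. It is a sketch under assumptions: the keys `"base"` and `"small"` are illustrative labels rather than API model names, and whether a trailing partial 30-second block is billed is not stated here, so the `round_up` flag covers both readings.

```python
import math

# Prices per hour of audio, from the pricing section.
# The keys are illustrative labels, not API model names.
PRICE_PER_HOUR_USD = {"base": 0.4, "small": 0.25}
BLOCK_SECONDS = 30  # billing granularity stated in the FAQ


def estimate_cost_usd(audio_seconds: float, model: str, round_up: bool = True) -> float:
    """Estimate the charge for one audio file.

    round_up=True bills a trailing partial block as a full block (assumption);
    round_up=False bills only completed 30-second blocks.
    """
    blocks = audio_seconds / BLOCK_SECONDS
    billed_blocks = math.ceil(blocks) if round_up else math.floor(blocks)
    price_per_block = PRICE_PER_HOUR_USD[model] * BLOCK_SECONDS / 3600
    return billed_blocks * price_per_block
```

For example, a 45-minute recording is 90 blocks, which comes to about 0.30 USD on 🔈 Base and about 0.19 USD on 🔈 Small.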

What is the rate limit?

Currently we enforce a hard limit of 2 hours of audio per minute.
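
A client that submits many files can track its own usage to stay under this limit. The sketch below is a client-side helper with a hypothetical class name, `AudioBudget`; it assumes the limit counts audio duration over a sliding 60-second window, which may differ from how the service actually measures it.

```python
from collections import deque

LIMIT_AUDIO_SECONDS = 2 * 60 * 60  # "2 hours per minute" from the FAQ
WINDOW_SECONDS = 60.0  # assumption: a sliding one-minute window


class AudioBudget:
    """Client-side tracker of audio seconds submitted within the last minute."""

    def __init__(self) -> None:
        self._sent = deque()  # (timestamp, audio_seconds) pairs, oldest first

    def try_submit(self, audio_seconds: float, now: float) -> bool:
        """Record the submission and return True if it fits the budget at time `now`."""
        # Drop submissions that have left the 60-second window.
        while self._sent and now - self._sent[0][0] >= WINDOW_SECONDS:
            self._sent.popleft()
        used = sum(seconds for _, seconds in self._sent)
        if used + audio_seconds > LIMIT_AUDIO_SECONDS:
            return False
        self._sent.append((now, audio_seconds))
        return True
```

In practice `now` would come from `time.monotonic()`, and a `False` result means the caller should wait before sending more audio.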

How do I top up?

Just go to the billing page and top up! The minimum is 3 USD and the maximum is 1000 USD.

Interested in an Enterprise solution?

If you are interested in self-hosting in your virtual private network, either on-premises or in a private cloud, with a custom solution, email us at khalil@mesolitica.com or husein@mesolitica.com.