The End-to-End Malaysian Retrieval Engine!

Retrieval

An end-to-end multilingual Malaysian retrieval engine with an 8k context length and lower latency.

🇲🇾🤖🔍

Lower Latency

Converts your text to embeddings faster than OpenAI endpoints, at an average of 50 ms (lower is better).
Tested in the Singapore region with a single input string, stress-tested with 50 requests for 30 seconds at a spawn rate of 10 per second.

Better Embedding Accuracy

More accurate than OpenAI Embedding. We benchmarked on a Malaysian knowledge base, mesolitica/malaysian-embedding-leaderboard (higher is better).

Improve Retrieval Recall using Reranker

Re-sorting the Embedding Base results with Reranker Base improves the recall score (higher is better).
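The two-stage flow described above (retrieve with embeddings, then re-sort with a reranker) can be sketched roughly as follows. This is a minimal illustration, not the service's implementation: `rerank_candidates`, `score_fn` and `toy_overlap_score` are hypothetical names, and `score_fn` stands in for whatever call returns a Reranker Base relevance score for a (query, document) pair.

```python
from typing import Callable, List, Sequence, Tuple


def rerank_candidates(
    query: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Re-sort first-stage candidates by a second-stage relevance score.

    `candidates` is assumed to be the top-N list returned by an embedding
    search. `score_fn(query, doc)` is a hypothetical stand-in for a reranker
    call that returns a higher score for more relevant documents.
    """
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    # Highest score first. The sort is stable, so ties keep the original
    # embedding order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query: str, doc: str) -> float:
    """Toy scorer for demonstration only: fraction of query words found in doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / max(len(query_words), 1)
```

In practice the toy scorer would be swapped for a real reranker call. The point is only that the reranker sees a short candidate list, so its extra cost stays bounded.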

Playground

You can play around with Retrieval; try it at Nous App.

Try the API

The embedding engine is compatible with the OpenAI library; read more in the Nous LLM Router Documentation.
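As a rough sketch of what OpenAI-library compatibility could look like in practice, the snippet below points the `openai` Python client at a custom base URL and requests embeddings. The base URL and model name are placeholders invented for illustration, not real settings; take the actual values from the Nous LLM Router Documentation. `make_client` and `embed_texts` are hypothetical helper names.

```python
from typing import Any, List, Sequence

# Both values are placeholders (assumptions), not the service's real settings.
ASSUMED_BASE_URL = "https://router.example.invalid/v1"
ASSUMED_MODEL = "embedding-base"


def make_client(api_key: str, base_url: str = ASSUMED_BASE_URL) -> Any:
    """Create an OpenAI client pointed at an OpenAI-compatible endpoint."""
    # Imported lazily so embed_texts can be exercised without the package.
    from openai import OpenAI

    return OpenAI(api_key=api_key, base_url=base_url)


def embed_texts(
    client: Any, texts: Sequence[str], model: str = ASSUMED_MODEL
) -> List[List[float]]:
    """Return one embedding vector per input text, in input order."""
    response = client.embeddings.create(model=model, input=list(texts))
    # Each returned item carries an `index`; sort on it defensively.
    ordered = sorted(response.data, key=lambda item: item.index)
    return [list(item.embedding) for item in ordered]
```

Assuming the `openai` package is installed, usage would look like `client = make_client(api_key="...")` followed by `vectors = embed_texts(client, ["apa khabar"])`.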

Pricing

Prepaid, natively multilingual, and shares credits with MaLLaM 🌙.

Model name        Input price per 1M tokens
Embedding Base    MYR 1.00
Reranker Base     MYR 1.00
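For budgeting, the per-token arithmetic implied by the prices above can be sketched as below. This assumes billing is linear in input tokens with no minimum charge or rounding, which the page does not state; `estimate_cost_myr` is a hypothetical helper name.

```python
from decimal import Decimal

# Prices in MYR per 1M input tokens, as listed in the pricing table.
PRICE_PER_MILLION_MYR = {
    "Embedding Base": Decimal("1.00"),
    "Reranker Base": Decimal("1.00"),
}


def estimate_cost_myr(model_name: str, input_tokens: int) -> Decimal:
    """Estimate the cost in MYR, assuming strictly linear per-token billing."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    price = PRICE_PER_MILLION_MYR[model_name]
    return price * Decimal(input_tokens) / Decimal(1_000_000)
```

Under that assumption, embedding 250,000 tokens with Embedding Base would cost about MYR 0.25.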

Private

Self-host Retrieval in your private network for 100% privacy, either on-premise or in a private cloud; read more at MaLLaM 🌙 Self-hosted Enterprise.

Frequently asked questions


What is this retrieval engine?

Embedding and Reranker models are crucial components of LLMOps pipelines: they retrieve the right knowledge-base entries for user queries.

What is the rate limit?

Currently we enforce a hard limit of 100k tokens per minute.
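One way to stay under such a limit is to throttle on the client side. The sketch below keeps a sliding 60-second window of tokens sent. It is an assumption-laden illustration: the page does not say whether the server uses a fixed or sliding window or how it counts tokens, and `TokenRateLimiter` is a hypothetical class, not part of any SDK.

```python
import time
from collections import deque
from typing import Callable, Deque, Tuple


class TokenRateLimiter:
    """Client-side sliding-window budget for tokens per minute."""

    def __init__(
        self,
        max_tokens: int = 100_000,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self._clock = clock
        self._events: Deque[Tuple[float, int]] = deque()  # (timestamp, tokens)
        self._used = 0

    def _expire(self, now: float) -> None:
        # Drop usage records that have fallen out of the window.
        while self._events and now - self._events[0][0] >= self.window_seconds:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def wait_time(self, tokens: int) -> float:
        """Seconds to wait before `tokens` more would fit in the window."""
        if tokens > self.max_tokens:
            raise ValueError("request exceeds the per-window budget")
        now = self._clock()
        self._expire(now)
        excess = self._used + tokens - self.max_tokens
        if excess <= 0:
            return 0.0
        # Walk the oldest records until enough tokens would have expired.
        freed = 0
        for timestamp, used in self._events:
            freed += used
            if freed >= excess:
                return max(0.0, timestamp + self.window_seconds - now)
        return self.window_seconds  # Defensive fallback; not normally reached.

    def record(self, tokens: int) -> None:
        """Record tokens that were just sent."""
        now = self._clock()
        self._expire(now)
        self._events.append((now, tokens))
        self._used += tokens
```

A caller would estimate the token count of each request, sleep for `wait_time(n)` seconds if it is non-zero, send the request, and then call `record(n)`.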

How do I top up?

Just go to the billing page and top up! The minimum is RM3 and the maximum is RM99.