How is India's Sarvam AI Surpassing Global Leaders in OCR and Speech Technology?

Synopsis

Discover how Bengaluru-based Sarvam AI is setting new benchmarks in optical character recognition and text-to-speech technology for Indian languages, outpacing global giants like Google and ChatGPT. This innovation not only enhances AI accessibility in India but also showcases the country's technological prowess.

Key Takeaways

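Sarvam AI reports that its models outperform global competitors on OCR and text-to-speech benchmarks for Indian languages.
The startup supports all 22 scheduled Indian languages.
Sarvam Vision scored 93.28% overall on OmniDocBench v1.5 and 84.3% on olmOCR-Bench (English-only subsets).
The company focuses on making AI accessible to everyone in India.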
Innovative applications include complex table parsing and natural scene understanding.

Mumbai, Feb 9 (NationPress) The startup Sarvam AI, located in Bengaluru, has announced that its advanced vision and speech models have surpassed major global competitors like Google Gemini and ChatGPT in critical benchmarks for optical character recognition and text-to-speech tailored for Indian languages.

In a recent update on X, Pratyush Kumar, one of the co-founders of Sarvam AI, stated, "Sarvam Vision achieves a remarkable accuracy of 84.3% on the olmOCR-Bench (English only subset), surpassing cutting-edge models such as Gemini 3 Pro and recent OCR innovations like DeepSeek OCR 2."

On OmniDocBench v1.5 (English-only subset), Sarvam Vision recorded an overall score of 93.28%, performing particularly well on complex formula interpretation and layout parsing and approaching the current state of the art, Kumar noted.

He also said that Sarvam AI's Bulbul V3 text-to-speech model supports 35 voices across all 22 scheduled Indian languages, and that the company's models handle scans and content of varying quality.

"For Indian languages, Sarvam Vision stands out as the leading model, offering support for all 22 scheduled Indian languages," he asserted.

The Vision series features a 3-billion-parameter state-space model proficient in tasks such as image captioning, scene text recognition, chart interpretation, and intricate table parsing.

Sarvam AI emphasizes its commitment to making artificial intelligence accessible to every individual in India. "We aspire for India to engage confidently and with control in this significant technological evolution. Our goal is to develop foundational components and adapt them to cater to the country’s unique requirements," the company said.

Kumar showcased examples on social media where the platform successfully extracted technical terminology from complex tables with merged rows and columns. Additionally, it demonstrated the ability to extract data from a chart featured in the latest Economic Survey.

Beyond document processing, his posts illustrated Sarvam Vision’s ability to understand general natural scenes, including an accurate interpretation of a landscape photograph.

Union IT Minister Ashwini Vaishnaw remarked in a recent post on X that the achievements of this startup reflect the triumph of India’s AI mission.

Point of View

It is crucial to recognize the strides made by Sarvam AI, which showcase India's capability in the AI sector and align with the nation's vision for technological advancement. By prioritizing local language support and accessibility, Sarvam AI exemplifies a commitment to inclusivity and innovation that serves the broader goals of national progress.
NationPress
8 May 2026

Frequently Asked Questions

How does Sarvam AI's technology compare to global leaders?
Sarvam AI's recent models have shown superior performance against global competitors like Google Gemini and ChatGPT, achieving impressive accuracy rates in key benchmarks.
What languages does Bulbul V3 support?
The Bulbul V3 text-to-speech model supports 35 voices across all 22 scheduled Indian languages.
What are the applications of Sarvam Vision?
Sarvam Vision's applications include image captioning, scene text recognition, complex table parsing, and understanding natural scenes.
Why is Sarvam AI significant for India?
Sarvam AI represents a significant technological advancement for India, promoting accessibility and inclusivity in AI, which aligns with the nation’s broader goals for technological development.