BharatGen: India’s AI Manhattan Project in the Making

September 30, 2025

General Studies Paper-3

Context

The recent launch of BharatGen marks the start of a mission toward technological sovereignty and culturally rooted artificial intelligence systems, with the aim of embedding AI into the fabric of India’s digital future.

About BharatGen

It is the world’s first government-funded multimodal Large Language Model (LLM) initiative, launched under the IndiaAI Mission, focused on developing foundational AI models in:
- Text (LLMs for Indian languages);
- Speech (Text-to-Speech and Automatic Speech Recognition);
- Vision-language systems (for multimodal understanding).
It is spearheaded by the Department of Science and Technology (DST) under the National Mission on Interdisciplinary Cyber-Physical Systems (NM-ICPS).
The project is led by IIT Bombay, with a consortium of premier institutions including IIIT Hyderabad, IIT Madras, IIT Kanpur, IIT Mandi, IIT Hyderabad, and IIM Indore.
It is being executed by the Technology Innovation Hub (TIH) Foundation for IoT and IoE at IIT Bombay, which serves as the central coordination hub.
The hub oversees:
- Model development across modalities;
- Data collection and curation focused on Indian contexts;
- Ecosystem partnerships for compute, talent, and deployment;
- Governance and strategic planning.
Budgetary Support: BharatGen has secured ₹988.6 crore in funding from MeitY, making it the largest beneficiary of the ₹1,500 crore national AI budget under the IndiaAI Mission 2025.

Key Features & Importance

Language Coverage and Inclusivity: BharatGen models currently support 9 Indian languages, and the initiative aims to cover all 22 scheduled Indian languages by June 2026.
It has already launched Param-1, a bilingual LLM trained on 5 trillion tokens in English and Hindi.
Real-World Impact: BharatGen has already piloted applications in agriculture, governance, and defence. Once fully deployed, these applications are intended to scale across all states and districts, transforming public service delivery nationwide.
Redefining Digital Sovereignty Through AI: BharatGen seeks to redefine digital sovereignty through AI, much as the original Manhattan Project redefined global power dynamics through nuclear science.
- BharatGen is not just about building models; it is about building infrastructure, policy, and public-good ecosystems.
Supporting Digital Public Infrastructure: BharatGen can be "intelligence by participation", just as Aadhaar was "identity by participation".
It needs to evolve into a stack: APIs, developer toolkits, open frameworks, and deployment systems that seed innovation across state universities, startups, and grassroots hubs.

Related Concerns & Challenges

Lack of Robust AI Regulation: India still lacks comprehensive AI-specific legislation, despite the scale of BharatGen.
- The Digital Personal Data Protection Act (DPDPA) offers broad exemptions to the government, raising concerns about unchecked data processing and surveillance.
Infrastructure Gaps: Limited access to high-performance GPUs, data centers, and compute resources could slow down development.
Language and Inclusivity Challenges: Ensuring accuracy, cultural sensitivity, and regional relevance across hundreds of dialects is a massive challenge.
Talent and Ecosystem Readiness: Deep tech projects need interdisciplinary expertise, from linguistics to ethics to engineering, and this talent base is still developing in India.
Ethical and Governance Frameworks: Open questions remain about who controls the models, how decisions are made, and how citizens can challenge AI-driven outcomes.
- The risk of algorithmic bias, especially in sensitive domains like healthcare and governance, requires rigorous testing and redressal mechanisms.

Government Efforts & Progress to Overcome the Above Challenges

Strengthening AI Infrastructure: BharatGen has already secured 13,640 H100 GPUs under the IndiaAI Mission to support trillion-parameter models.
- IBM and BharatGen are co-developing open-source, Indic-specific data workflows to streamline model training.
Building Ethical and Regulatory Frameworks: IBM is helping implement enterprise-grade governance frameworks for responsible model development.
- Experts recommend involving civil society and academia in shaping AI ethics policies.
Enhancing Language Inclusivity: BharatGen is developing domain-specific Small Language Models (SLMs) for the agriculture, Ayurveda, legal, and finance sectors.
- IBM and BharatGen are building systems that switch seamlessly between Indian languages while preserving context.
Fostering Talent and Ecosystem Growth: BharatGen brings together IITs, IIITs, and IIMs to pool expertise across domains.
- Open-source models and solution templates allow Indian startups to build AI tools for local contexts.
Promoting Public Trust and Accessibility: BharatGen plans an open-source release of its models to ensure transparency and accessibility.
- IBM and BharatGen are creating templates for sectors like education, governance, and healthcare.
- AI-powered platforms will support vernacular languages for inclusive public service delivery.