India’s AI Opportunity Will Come From Data Digitisation, Says Wadhwani AI’s Alpan Raval

Wadhwani AI positions itself as an organisation dedicated to building AI solutions for underserved communities in developing nations.

India has a huge opportunity when it comes to building large language models for very narrow use cases in local languages, according to Wadhwani AI’s Chief Scientist of Artificial Intelligence and Machine Learning, Dr. Alpan Raval.

Wadhwani AI positions itself as an organisation dedicated to building AI solutions for underserved communities in developing nations. It has created solutions for the agriculture and health sectors. Its work includes a pest management system for small cotton farmers, as well as the use of AI to identify patients who might have drug-resistant tuberculosis.

The point is not that India needs to build and develop several homegrown large language models, but that it can build an ecosystem around them, specifically in speech recognition. Raval points to Bhashini, a government effort that aims to translate Indian languages to make it easier for people to access the internet and digital services.

“The great opportunity with the combination of LLMs and speech recognition is the opportunity we have to leapfrog into digitisation,” he said. A common complaint from companies building LLMs and the ecosystem is that India lacks usable data to train their AI models.

In fact, for decades now, data created in India has been on paper, tucked away in cabinets or in musty old rooms. Even today, while digitisation projects continue at a breakneck pace, large swathes of information, across industries and ecosystems, remain on paper.

India’s Digital Economy

India’s digitisation journey has been progressing at a breakneck pace, but the disparity is wide. The Indian Council for Research on International Economic Relations, an independent public policy organisation, in its State of India’s Digital Economy report 2024, proposed a new framework to measure where a nation’s digital economy stands, based on certain parameters.

“We have huge datasets. We’re the largest population in the world and the most diverse, but these datasets are mostly all on paper,” said Raval.

On an aggregate digital economy level, India ranks third in the world, behind the US and China, but it drops to the 12th spot when ranked by user score. The reason is that India makes consistent advances in the production of new technologies like AI and has seen increased investment in startups, but the country lags in the adoption of older, standard technology like broadband and internet accessibility.

The challenge is ensuring that our data is digital at source. That’s where speech recognition, combined with LLMs that understand what’s being spoken, can convert speech into structured databases and make for a very powerful application, Raval said.

The lack of gold standard data and structured databases has been a common complaint in India, particularly from those developing AI. Raval offers a different way of looking at it.

“You really have to ask the question, what do you really need gold standard data for?”

To train an AI model, you need data that fits the context. It must come from the same contextual setting that the model seeks to serve. For example, if you’re building a model for a large hospital setting with top quality equipment, curating and carving out gold standard datasets is possible.

That wouldn’t work so well in a rural, farm setting. There, building gold standard datasets is simply not possible. “Your AI should in fact be trained on a certain level of noisy data, so that it can understand ‘noise,’” Raval said.

Throwing in other information likely won’t be useful. Additionally, by the time a model makes its way into the hands of users, practical realities are starkly different from what the model was trained on.

The Global South Opportunity

While AI has several applications, much of the development is aimed at white-collar jobs, primarily in developed nations. In the Global South, AI applications could help address several developmental issues in healthcare and farming, and even serve as an aid to education.

The Global South comprises Latin America, Africa, the Caribbean, most of Asia (excluding Japan, Israel and South Korea) and Oceania (excluding New Zealand and Australia).

Once more, however, the problem circles back to data.

“We’ve got a lot of untapped potential just in terms of the size and diversity of our data sets,” said Raval, adding that if we can find a way to digitise all our data, India can make it available to people who can build models. “We can build models for the world. Given the diversity we have, our models will need very little fine tuning to work in the rest of the world.”

India already has some digital platforms that cater to specific use-cases. We have the NIKSHAY portal, an initiative that incentivises the private sector to notify tuberculosis cases to the government. An AI integration with the platform can provide insights across geographies, communications and even hotspots of the disease.

Similarly, integrating AI into POSHAN Tracker, which is used for last mile nutrition service delivery and for tracking malnutrition prevalence in children, could also yield promising results.

Agri Stack, an initiative by the Ministry of Agriculture & Farmers Welfare, aims to make it easier for farmers to get access to subsidies and schemes, as well as localised advice for their fields.

“All of these platforms are being built. If you already have a platform dynamically collecting data on the fly and integrate AI into it, there is already scale because the platform has scaled,” Raval said.
