
Bridging the digital divide: Why African languages must be included in AI

By Nkhensani Nkhwashu, ITWeb portals journalist.
Johannesburg, 18 Feb 2025
Neda Smith, chartered CIO, Agile Advisory Services at the ITWeb Artificial Intelligence Summit last week.

The lack of African language inclusion in AI systems has a devastating impact, shutting out millions of Africans from the digital world. This was highlighted by Neda Smith, chartered CIO of Agile Advisory Services at the ITWeb Artificial Intelligence Summit last week. Smith's discussion centred on bridging the digital divide: overcoming challenges in African language AI. Smith noted the pressing need to bring African languages to the forefront of AI innovation.

“Language is more than just words; it's a reflection of our identity, culture and access to information.” She said that, with over 2 000 languages spoken across the continent, Africa is home to staggering linguistic diversity. Despite this richness, Smith said, African languages are being left behind in the AI revolution.

Currently, AI systems are being trained on datasets that predominantly feature Western languages, she noted, effectively excluding millions of Africans from the digital landscape.

Smith said the exclusion has far-reaching consequences, particularly in education, communication and socio-economic growth. “PwC has estimated that AI could contribute up to $1.5 trillion to the African economy in 2030, but if millions of Africans cannot communicate with these tools, how are they going to form part of that economy?”

She also highlighted the importance of education and cultural preservation. “If AI tools and learning platforms can communicate with people in their own languages, it could revolutionise education systems across the continent. Moreover, African languages represent not just communication but also identity, culture, heritage and values. If these languages are not included in AI systems, there is a risk of losing this cultural heritage.”

Smith identified several key challenges facing African languages within AI, including data scarcity and data quality, linguistic diversity and complexity, limited investment and resource funding, a lack of standardised orthography and AI-friendly datasets, and specialised terminology.

“A significant portion of Africa's knowledge and literature remains undocumented, existing only in oral traditions or in textbooks and texts that have yet to be digitised.”

She added that the 2 000 languages spoken in Africa also have different dialects, very few of which have been documented, making it difficult to teach AI tools how to construct sentences in these languages.

To overcome these challenges, Smith emphasised the need for collective action. This includes expanding language models, digitising African literature and history, investing in local research and supporting local tech start-ups.

“Language barriers should not be a limitation. We need to work together to bridge the digital divide and bring African languages to the forefront of AI innovation.”
