This is a running list of tech resources and initiatives related to Manding language varieties like Bambara, Jula, Maninka and Mandinka.
It is NOT a list of resources for learning Manding varieties (or at least, they aren’t specifically meant for learning). For that, please see the General Resources page.
I made it as a simple way to keep track of all the projects, resources and initiatives that I keep coming across.
Corpora
Apps/Programs
Bambara Reference Corpus (Fr. Corpus Bambara de Référence) — A web-based text analysis or search engine app that lets you search across a huge collection of texts written in Latin-based Bambara. See this page of mine for some tips in English about how to use it.
Maninka Reference Corpus (Fr. Corpus Maninka de Référence) — A web-based text analysis and search engine app that lets you search across collections of Maninka texts written in both Latin-based and N’ko-based orthography.
N’ko Corpus — Another searchable corpus app. This one is N’ko only and distinct from the Maninka Reference Corpus listed above.
Machine Translation
Apps/Programs
Google Translate — Machine translation service that can handle Latin-based Bambara. See this page for my write-up, which includes links and a testing and review video. In June 2024, they added N’ko (which is Manding [typically closest to Maninka] written in the N’ko script) and Jula (which they call “Dyula”). I have yet to test these.
Bambara MT — A web-based app that is designed to “Translate text to Bambara and convert it to speech with options to enhance audio quality.” Uses a bunch of machine learning models and datasets, which are listed on Aboubacar Ouattara’s HuggingFace profile (and also elsewhere on this page).
Text-to-Speech (TTS)
Data Sets
bambara-tts — A dataset of scraped Bambara language texts, such as the dialogs from José Morales’s J’apprends le bambara and Charles Bailleul’s collection of Bambara proverbs.
Automatic Speech Recognition (ASR)
Apps/Programs
Bambara ASR — A web-based demo app to which you can upload Bambara language audio files and get a rough transcription.
Kouma Bi Boro [sic, Kuma b’i boro] — Android-based app for recognizing Jula as spoken in Côte d’Ivoire.
Open Moise — A software initiative based out of Abidjan that is focused on creating smartphones that can recognize Jula and other local languages of Côte d’Ivoire. For more information, see my video profile of one of the developers behind the project.
Data Sets
Jeli-ASR — A dataset of 30 hours of Bambara language storytelling that is recorded, transcribed and translated. Plus an ASR model. See this write-up for my assessment of the dataset/corpus.
bambara-asr — A dataset. Appears to use the data from Jeli-ASR (and perhaps other sources).
nicolingua — Includes Maninka language data from Guinea.
Articles
Deploying a Speech Recognition Model for Under-Resourced Languages: A Case Study on Dioula Wake Words 1, 2, 3, and 4 [LINK]
Misc
Nothing here yet