Dr. Idris Abdulmumin
Postdoctoral Fellow, University of Pretoria DSFSI.

Room 5-49 IT Building,
University of Pretoria,
Hatfield Campus
I am a Postdoctoral Fellow at the University of Pretoria’s Data Science for Social Impact Research Group, working on improving Natural Language Processing (NLP) for underrepresented African languages. I hold a PhD from Bayero University Kano, where I used domain awareness, quality estimation, and transfer learning to make better use of monolingual data for machine translation in low-resource languages.
I have led and contributed to a range of projects developing NLP resources and techniques for tasks such as building large language models (LLMs), sentiment analysis, emotion analysis, semantic relatedness, hate and offensive speech analysis, machine translation, named-entity recognition, and sentence alignment. I actively collaborate with international research communities such as Masakhane, HausaNLP, LITHME, and the Open Language Data Initiative, and have contributed to several Semantic Evaluation (SemEval) shared tasks: AfriSenti 2023, SemRel 2024, and BRIGHTER 2025.
I am passionate about mentoring students and early-career researchers, and I strive to integrate ethical, human-centric approaches into my work as part of my commitment to advancing NLP for social good. My research is driven by the belief that AI should be inclusive and accessible, and I aim to contribute to the field while addressing the needs of underrepresented communities.
selected publications
- [LREC] Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation. In Proceedings of the Language Resources and Evaluation Conference, Jun 2022
- [ICCAIT] Leveraging Closed-Access Multilingual Embedding for Automatic Sentence Alignment in Low Resource Languages. In Proceedings of the International Conference on Computing and Advances in Information Technology (ICCAIT 2023), Nov 2023
- [EMNLP] AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Dec 2023