Dr. Idris Abdulmumin

Postdoctoral Fellow, University of Pretoria DSFSI.

Room 5-49 IT Building,

University of Pretoria,

Hatfield Campus

I am a Postdoctoral Fellow at the University of Pretoria’s Data Science for Social Impact Research Group, working on improving Natural Language Processing (NLP) for underrepresented African languages. I hold a PhD from Bayero University Kano, where I used domain awareness, quality estimation, and transfer learning to make better use of monolingual data for machine translation in low-resource languages.

I have led and contributed to projects that develop NLP resources and techniques for tasks such as large language model (LLM) development, sentiment analysis, emotion analysis, semantic relatedness, hate and offensive speech analysis, machine translation, named-entity recognition, and sentence alignment. I actively collaborate with international research communities such as Masakhane, HausaNLP, LITHME, and the Open Language Data Initiative, and have contributed to several Semantic Evaluation (SemEval) shared tasks: AfriSenti 2023, SemRel 2024, and BRIGHTER 2025.

I’m passionate about mentoring students and early-career researchers, and I strive to integrate ethical, human-centric approaches into my work, aligning with my commitment to advancing NLP for social good. My research is driven by the belief that AI should be inclusive and accessible, and I aim to make meaningful contributions to the field while addressing the needs of underrepresented communities.

selected publications

  1. DiB
    ZASCA-Sum: A Dataset of the South Africa Supreme Courts of Appeal Judgments and Media Summaries for Legal Documents Summarization Research
    Idris Abdulmumin and Vukosi Marivate
    Data in Brief, 2025
  2. Mach. Trans.
    Tag-less back-translation
    Idris Abdulmumin, Bashir Shehu Galadanci, and Garba Aliyu
    Machine Translation, Dec 2021
  3. LREC
    Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation
    Idris Abdulmumin, Satya Ranjan Dash, Musa Abdullahi Dawud, and 7 more authors
    In Proceedings of the Language Resources and Evaluation Conference, Jun 2022
  4. ICCAIT
    Leveraging Closed-Access Multilingual Embedding for Automatic Sentence Alignment in Low Resource Languages
    Idris Abdulmumin, Auwal Abubakar Khalid, Shamsuddeen Hassan Muhammad, and 5 more authors
    In Proceedings of the International Conference on Computing and Advances in Information Technology (ICCAIT 2023), Nov 2023
  5. EMNLP
    AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
    Shamsuddeen Muhammad, Idris Abdulmumin, Abinew Ayele, and 24 more authors
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Dec 2023