Publications

My publications.

2025

  1. NAACL
    AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages
    Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, and 24 more authors
    In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), Apr 2025
  2. arXiv
    BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages
    Shamsuddeen Hassan Muhammad, Nedjma Ousidhoum, Idris Abdulmumin, and 45 more authors
    Apr 2025
  3. DiB
    ZASCA-Sum: A Dataset of the South Africa Supreme Courts of Appeal Judgments and Media Summaries for Legal Documents Summarization Research
    Idris Abdulmumin and Vukosi Marivate
    Data in Brief, Apr 2025

2024

  1. SemEval
    SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages
    Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, and 14 more authors
    In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), Jun 2024
  2. LREC-COLING
    Mitigating Translationese in Low-resource Languages: The Storyboard Approach
    Garry Kuwanto, Eno-Abasi E. Urua, Priscilla Amondi Amuok, and 21 more authors
    In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), May 2024
  3. SIGIR
    CIRAL: A Test Collection for CLIR Evaluation in African Languages
    Mofetoluwa Adeyemi, Akintunde Oladipo, Xinyu Zhang, and 20 more authors
    In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, Jul 2024
  4. SACAIR
    Analysing Public Transport User Sentiment on Low Resource Multilingual Data
    Rozina Myoya, Vukosi Marivate, and Idris Abdulmumin
    In Proceedings of the Fifth Southern African Conference for Artificial Intelligence Research, Jul 2024
  5. WOAH
    HausaHate: An Expert Annotated Corpus for Hausa Hate Speech Detection
    Francielle Vargas, Samuel Guimarães, Shamsuddeen Hassan Muhammad, and 6 more authors
    In Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024), Jun 2024
  6. WMT
    Correcting FLORES Evaluation Dataset for Four African Languages
    Idris Abdulmumin, Sthembiso Mkhwanazi, Mahlatse Mbooi, and 7 more authors
    In Proceedings of the Ninth Conference on Machine Translation, Nov 2024
  7. WMT
    Findings of WMT2024 English-to-Low Resource Multimodal Translation Task
    Shantipriya Parida, Ondřej Bojar, Idris Abdulmumin, and 2 more authors
    In Proceedings of the Ninth Conference on Machine Translation, Nov 2024
  8. ACL
    SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages
    Nedjma Ousidhoum, Shamsuddeen Muhammad, Mohamed Abdalla, and 24 more authors
    In Findings of the Association for Computational Linguistics: ACL 2024, Aug 2024

2023

  1. ACL
    HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language
    Shantipriya Parida, Idris Abdulmumin, Shamsuddeen Hassan Muhammad, and 7 more authors
    In Findings of the Association for Computational Linguistics: ACL 2023, Jul 2023
  2. SemEval
    HausaNLP at SemEval-2023 Task 10: Transfer Learning, Synthetic Data and Side-information for Multi-level Sexism Classification
    Saminu Mohammad Aliyu, Idris Abdulmumin, Shamsuddeen Hassan Muhammad, and 4 more authors
    In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Jul 2023
  3. ICCAIT
    Analyzing COVID-19 Vaccination Sentiments in Nigerian Cyberspace: Insights from a Manually Annotated Twitter Dataset
    Ibrahim Said Ahmad, Lukman Jibril Aliyu, Auwal Abubakar Khalid, and 6 more authors
    In Proceedings of the International Conference on Computing and Advances in Information Technology (ICCAIT 2023), Nov 2023
  4. ICCAIT
    Leveraging Closed-Access Multilingual Embedding for Automatic Sentence Alignment in Low Resource Languages
    Idris Abdulmumin, Auwal Abubakar Khalid, Shamsuddeen Hassan Muhammad, and 5 more authors
    In Proceedings of the International Conference on Computing and Advances in Information Technology (ICCAIT 2023), Nov 2023
  5. SemEval
    SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval)
    Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Seid Muhie Yimam, and 7 more authors
    In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), Jul 2023
  6. IJCNLP
    MasakhaNEWS: News Topic Classification for African languages
    David Ifeoluwa Adelani, Marek Masiak, Israel Abebe Azime, and 62 more authors
    In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, Nov 2023
  7. EMNLP
    AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
    Shamsuddeen Muhammad, Idris Abdulmumin, Abinew Ayele, and 24 more authors
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Dec 2023

2022

  1. LREC
    NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis
    Shamsuddeen Hassan Muhammad, David Ifeoluwa Adelani, Sebastian Ruder, and 8 more authors
    In Proceedings of the Language Resources and Evaluation Conference, Jun 2022
  2. NAACL
    A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for African News Translation
    David Adelani, Jesujoba Alabi, Angela Fan, and 42 more authors
    In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Jul 2022
  3. AfricaNLP
    NECAT-CLWE: A Simple But Efficient Parallel Data Generation Approach for Unsupervised and Semi-Supervised Neural Machine Translation
    Rabiu Abdullahi Ibrahim and Idris Abdulmumin
    In 3rd Workshop on African Natural Language Processing, Jul 2022
  4. AfricaNLP
    The African Stopwords Project: Curating Stopwords for African Languages
    Chris Chinenye Emezue, Hellina Hailu Nigatu, Cynthia Thinwa, and 12 more authors
    In 3rd Workshop on African Natural Language Processing, Jul 2022
  5. WiNLP
    Domain-Specific Lexicon-Based Sentiment Analysis using Contextual Shifter Patterns
    Shamsuddeen Muhammad, Pavel Brazdil, and Idris Abdulmumin
    In Proceedings of the Sixth Workshop on Widening Natural Language Processing, Dec 2022
  6. WiNLP
    HERDPhobia: A Dataset for Hate Speech Detection against Fulani Herdsmen in Nigeria
    Saminu Aliyu, Gregory Wajiga, Muhammad Murtala, and 3 more authors
    In Proceedings of the Sixth Workshop on Widening Natural Language Processing, Dec 2022
  7. EMNLP
    MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition
    David Ifeoluwa Adelani, Graham Neubig, Sebastian Ruder, and 42 more authors
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Dec 2022
  8. IEEE
    Quantity vs. Quality of Monolingual Source Data in Automatic Text Translation: Can It Be Too Little If It Is Too Good?
    Idris Abdulmumin, Bashir Shehu Galadanci, Shamsuddeen Hassan Muhammad, and 1 more author
    In 2022 IEEE Nigeria 4th International Conference on Disruptive Technologies for Sustainable Development (NIGERCON), Dec 2022
  9. LREC
    Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation
    Idris Abdulmumin, Satya Ranjan Dash, Musa Abdullahi Dawud, and 7 more authors
    In Proceedings of the Language Resources and Evaluation Conference, Jun 2022
  10. WMT
    Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages
    Idris Abdulmumin, Michael Beukman, Jesujoba Alabi, and 8 more authors
    In Proceedings of the Seventh Conference on Machine Translation, Dec 2022
  11. arXiv
    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
    Teven Le Scao, Angela Fan, Christopher Akiki, and 387 more authors
    Dec 2022

2021

  1. Mach. Trans.
    Tag-less back-translation
    Idris Abdulmumin, Bashir Shehu Galadanci, and Garba Aliyu
    Machine Translation, Dec 2021
  2. IAENG EL
    A hybrid approach for improved low resource neural machine translation using monolingual data
    Idris Abdulmumin, Bashir Shehu Galadanci, Abubakar Isah, and 2 more authors
    Engineering Letters, Nov 2021
  3. LNCS
    Data Selection as an Alternative to Quality Estimation in Self-Learning for Low Resource Neural Machine Translation
    Idris Abdulmumin, Bashir Shehu Galadanci, Ibrahim Said Ahmad, and 1 more author
    In Computational Science and Its Applications – ICCSA 2021, Nov 2021
  4. CCIS
    Enhanced Back-Translation for Low Resource Neural Machine Translation Using Self-training
    Idris Abdulmumin, Bashir Shehu Galadanci, and Abubakar Isa
    In Information and Communication Technology and Applications, Nov 2021

2019

  1. IEEE
    HauWE: Hausa Words Embedding for Natural Language Processing
    Idris Abdulmumin and Bashir Shehu Galadanci
    In 2019 2nd International Conference of the IEEE Nigeria Computer Chapter, NigeriaComputConf 2019, Nov 2019