I am Bonaventure F. P. Dossou, a United Nations University scholar and author. I am also a Computer Science Ph.D. student in the NLP group at McGill University, specializing in Natural Language Processing (NLP) for healthcare. I am advised by Professor Jackie Cheung. I hold a Bachelor of Science with honors in Mathematics from Kazan Federal University, Russia, and a Master of Science with honors in Computer Science and Data Engineering from Jacobs University Bremen, Germany. Recently, I was a Machine Learning Consultant and Research Scientist at Phagos Biotech, where I worked on building large-scale language models for genomic sequencing and bacteriophages.
My interests are in Natural Language Processing (Machine Translation, Large Language Modeling, Speech Recognition, Information Retrieval) for low-resourced languages and Machine Learning for Healthcare (Drug Discovery, small-molecule generation, gene therapy). I am the creator of several Afro-centric NLP systems, including FFRTranslate, AfroLM, and the Okwugbe ASR (Automatic Speech Recognition for low-resourced languages) Python library. My research on my native language, Fongbé, contributed significantly to its integration into Google Translate in July 2024. You can find my CV here.
Before my PhD, I was a research intern at the Mila Quebec AI Institute, working on Drug Discovery projects using Deep Learning (and GFlowNets) under the supervision of Yoshua Bengio and Dianbo Liu. More specifically, I worked on leveraging GFlowNets for Biological Sequence Design and for learning the posterior distribution over binary multimodal dropout masks (GFlowOut). Previously, I was also an NLP Data Scientist at Roche Canada and a Research Scientist at ModelisLabs, working on Health & Pharma-related challenges. In parallel, I work on language technologies, with a focus on low-resourced Sub-Saharan languages, at the Masakhane Research Foundation (and previously at Google Research).
Past Work and Research Experiences
1. Mila Scientist in Residence [Probe Medical]
2. Mila Scientist in Residence (AI for Chemical Compound Discovery) [Modelis]
3. ML Student Researcher [Google Research]
4. Research Intern (AI for Drug Discovery) [Mila Quebec AI Institute]
5. ML Research Scientist Consultant (AI for Drug Discovery) [Phagos]
6. NLP Research Intern [Roche Canada]
7. Senior Machine Learning Engineer [Omdena]
8. African NLP Researcher & Core Member [Masakhane]
9. Part-time Senior Data Scientist [Speeqo]
10. Fundamental Research Scientist [Lelapa AI]
All publications can be accessed through my Semantic Scholar and Google Scholar pages.
1. AfriMed-QA: Towards A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset. Tobi Olatunji, Charles Nimo, ..., Bonaventure F. P. Dossou, ... (under review at NeurIPS 2024)
2. Adapting Pretrained ASR Models to Low-resource Clinical Speech using Epistemic Uncertainty-based Data Selection. Bonaventure F. P. Dossou (SIGUL 2024, under review at NeurIPS 2024)
3. A Study of Acquisition Functions for Medical Imaging Deep Active Learning. Bonaventure F. P. Dossou (Deep Learning Indaba 2023)
4. FonMTL: Towards Multitask Learning for the Fon Language. Bonaventure F. P. Dossou et al. (EMNLP 2023)
5. AfriSpeechNames: Most ASR models "butcher" African Names. Tobi Olatunji, Tejumade Afonja, Bonaventure F. P. Dossou et al. (Interspeech 2023)
6. Pretrained Vision Models for Predicting High-Risk Breast Cancer Stage. Bonaventure F. P. Dossou et al. (ICLR 2023)
7. GFlowOut: Dropout with Generative Flow Networks. Dianbo Liu, Moksh Jain, Bonaventure F. P. Dossou et al. (ICML 2023)
8. AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages. Bonaventure F. P. Dossou et al. (EMNLP 2022)
9. Biological Sequence Design with GFlowNets. Moksh Jain, Emmanuel Bengio, Alex Hernandez-Garcia, Jarrid Rector-Brooks, Bonaventure F. P. Dossou et al. (ICML 2022)
10. MeSH2Matrix: Machine learning-driven biomedical relation classification based on the MeSH keywords of PubMed scholarly publications. Houcemeddine Turki, Bonaventure F. P. Dossou et al. (ECIR 2022)
11. GraphCC for Diverse and Novel Antimicrobial Peptides Generation and Selection. Bonaventure F. P. Dossou et al. (preprint)
12. OkwuGbé: End-to-End Speech Recognition for Fon and Igbo. Bonaventure F. P. Dossou et al. (EMNLP 2021)
13. MMTAfrica: Multilingual Machine Translation for African Languages. Chris C. Emezue and Bonaventure F. P. Dossou (EMNLP 2021)
14. FSER: Deep Convolutional Neural Networks for Speech Emotion Recognition. Bonaventure F. P. Dossou et al. (ICCV 2021)
15. Crowdsourced Phrase-Based Tokenization for Low-Resourced Neural Machine Translation: The Case of Fon Language. Bonaventure F. P. Dossou et al. (EACL 2021)
16. AfriVEC: Word Embedding Models for African Languages. Case Study of Fon and Nobiin. Bonaventure F. P. Dossou et al. (EACL 2021)
17. FFR v1.1: Fon-French Neural Machine Translation. Bonaventure F. P. Dossou et al. (ACL 2020)
Awards, Honours, Grants & Services
1. Borealis AI PhD Fellowship Award (2024)
2. University Scholars Leadership Symposium Delegate (2024)
3. Honorable Mention Solution for the Nightingale Contest for Detecting Active Tuberculosis Bacilli (2024)
4. Two Best Poster Awards at the Deep Learning Indaba Conference (2023)
5. Winner of AIM-AHEAD Health Equity Data Challenge (2023)
6. Winner of the Nightingale Predicting High-Risk Breast Cancer Contest (2022, 2023)
7. Mila Quebec AI Institute's Impact Annual Reports (2022, 2023)
8. McGill Engineering Doctoral Award (2022)
9. Innovation Award 2022 of the German African Diaspora (2022)
10. Jacobs University's Dean's Prize for outstanding Master's Thesis (2022)
11. Shuttleworth Flash Grant (2021)
12. Winner of the ViVaTech-Unesco Challenge for Cracking Language Barriers through Data and AI (2021)
13. Wikimedia Foundation Research of the Year Award 2021 (2021)
14. Grant "Lacuna Fund" for Named Entity Recognition for Fon with Masakhane Community (2021)
15. Jacobs University Community Award for Innovation, Cultural Understanding, and Diversity (2021)
16. Jacobs University Mobility Area's Scholarship & Jacobs University Faces (2020-2022)
17. Global Nominee and Benin's finalist with «Afro Num» - NASA's World Space Apps Challenge (2020)
18. Winner of the National Russian AI Hackathon (2019)
19. International interviews and articles on BBC, Voice of America, and German and Russian newspapers and TV (2020-)
20. Scientific presentations and publications, workshop organization, and reviewing services at ACL, EACL, NAACL, AACL, EMNLP, ICML, ICLR, NeurIPS (2020, 2021, 2022, 2023, 2024)