Colin Leong

Bachelor of Science in Computer Engineering, Wright State University, May 2013

Master of Science in Computer Science, Wright State University, December 2015

Current Program: PhD in Electrical Engineering

Research Topics:

Adviser: Dr. Asari


Bio

Colin Leong is a Machine Learning Engineer at the University of Dayton Research Institute. He received a Bachelor's degree in Computer Engineering from Wright State University in 2013, and a Master's degree in Computer Science from Wright State University in 2015.

His research interests include machine translation, natural language processing, unsupervised or self-supervised learning, computer vision, and reinforcement learning.

He is currently exploring how machine translation techniques can support Bible translation into "low-resource" languages, which lack large quantities of written material. Some of these languages have no orthography at all, leaving their speakers cut off from almost all of the world's informational resources. Improving the process of translating texts into these languages could change the lives of these speakers for the better in many ways.

Colin's favorite dinosaur is the Stegosaurus, because of the rather amusing story of how its spiked tail came to be known as a "thagomizer."


Publications

Christopher Menart, Colin Leong, Olga Mendoza-Schrock, Edmund Zelnio, "Characterization of CNN classifier performance with respect to variation in optical contrast, using synthetic electro-optical data," Proc. SPIE 10988, Automatic Target Recognition XXIX, 109880N (14 May 2019); https://doi.org/10.1117/12.2519494 

Colin Leong and Daniel Whitenack. 2022. Phone-ing it in: Towards Flexible Multi-Modal Language Model Training by Phonetic Representations of Data. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5306–5315, Dublin, Ireland. Association for Computational Linguistics.

Julia Kreutzer, Isaac Caswell, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, Sokhar Samb, Benoît Sagot, Clara Rivera, Annette Rios, Isabel Papadimitriou, Salomey Osei, Pedro Ortiz Suarez, Iroro Orife, Kelechi Ogueji, Andre Niyongabo Rubungo, Toan Q. Nguyen, Mathias Müller, André Müller, Shamsuddeen Hassan Muhammad, Nanda Muhammad, Ayanda Mnyakeni, Jamshidbek Mirzakhalov, Tapiwanashe Matangira, Colin Leong, Nze Lawson, Sneha Kudugunta, Yacine Jernite, Mathias Jenny, Orhan Firat, Bonaventure F. P. Dossou, Sakhile Dlamini, Nisansa de Silva, Sakine Çabuk Ballı, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur Bapna, Pallavi Baljekar, Israel Abebe Azime, Ayodele Awokoya, Duygu Ataman, Orevaoghene Ahia, Oghenefego Ahia, Sweta Agrawal, Mofetoluwa Adeyemi; Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets. Transactions of the Association for Computational Linguistics 2022; 10: 50–72. doi: https://doi.org/10.1162/tacl_a_00447

Meyer, Josh, David Ifeoluwa Adelani, Edresson Casanova, Alp Oktem, Daniel Whitenack, Julian Weber, Salomon Kabongo KABENAMUALU, Elizabeth Salesky, Iroro Orife, Colin Leong, Perez Ogayo, Chris C. Emezue, Jonathan Mukiibi, Salomey Osei, Apelete Agbolo, Victor Akinode, Bernard Opoku, Samuel Olayemi Olanrewaju, Jesujoba Oluwadara Alabi, and Shamsuddeen Hassan Muhammad. “BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus.” ArXiv abs/2207.03546 (2022): n. pag. https://doi.org/10.48550/arXiv.2207.03546

CONTACT

Vision Lab, Dr. Vijayan Asari, Director

Kettering Laboratories
300 College Park
Dayton, Ohio 45469-0232
937-229-1779