Colin Raffel: Curriculum Vitae

2012–2016

Columbia University, New York, NY
Ph.D. and M.Phil. in Electrical Engineering
Laboratory for the Recognition and Organization of Speech and Audio
Advisor: Daniel P. W. Ellis

2009–2010

Stanford University, Stanford, CA
M.A. in Music, Science, and Technology
Center for Computer Research in Music and Acoustics
Advisor: Julius O. Smith III

2005–2009

Oberlin College, Oberlin, OH
B.A. in Mathematics with Physics minor

2023–now

Associate Professor, Department of Computer Science, University of Toronto

2023–now

Associate Research Director, Vector Institute

2021–now

Faculty Researcher, Hugging Face

2020–2023

Assistant Professor, Department of Computer Science, UNC Chapel Hill

2020–2021

Staff Research Scientist, Google Brain

2018–2020

Senior Research Scientist, Google Brain

2017–2018

Research Scientist, Google Brain

2016–2017

Resident, Google Brain

2024

TMLR Outstanding Certification Finalist
for "Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models"

2024

ICBS Frontiers of Science Award in Theoretical Computer and Information Sciences
for "Extracting Training Data from Large Language Models"

2023

NeurIPS Outstanding Paper (Runner-Up)
for "Scaling Data-Constrained Language Models"

2023

Caspar Bowden Award for Outstanding Research in Privacy Enhancing Technologies (Runner-Up)
for "Extracting Training Data from Large Language Models"

2022

Best Paper Honorable Mention
NeurIPS Workshop on Broadening Research Collaborations in ML

2022

CACM Research Highlight
for "Learning-based Memory Allocation for C++ Server Workloads"

2022

NSF CAREER Award

2021

SIGPLAN Research Highlight
for "Learning-based Memory Allocation for C++ Server Workloads"

2021

Google Research Award

2018

Top 10 Reviewer Award
35th International Conference on Machine Learning

2016

National Science Foundation Student Travel Grant
41st IEEE International Conference on Acoustics, Speech, and Signal Processing

2015

Best Student Paper
16th International Society for Music Information Retrieval Conference

2015

Travel Grant
National Science Foundation Data Science Workshop

2014

Best Poster Presentation
15th International Society for Music Information Retrieval Conference

2014

Student Travel Award
15th International Society for Music Information Retrieval Conference

2013

SoundSoftware.ac.uk Prize for Reproducibility in Audio and Music Research

2012

NSF Integrative Graduate Education and Research Training Fellowship

Fengyuan Liu, Nikhil Kandpal, and Colin Raffel, “AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution”, 13th International Conference on Learning Representations, 2025 (to appear).

Prateek Yadav, Leshem Choshen, Colin Raffel, and Mohit Bansal, “ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization”, Transactions on Machine Learning Research (TMLR), 2025.

Prateek Yadav, Colin Raffel, Mohammed Muqeeth, Lucas Caccia, Haokun Liu, Tianlong Chen, Mohit Bansal, Leshem Choshen, and Alessandro Sordoni, “A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning”, Transactions on Machine Learning Research (TMLR), 2025.

Loubna Ben Allal, Anton Lozhkov, Elie Bakouch, Gabriel Martín Blázquez, and 18 others including Colin Raffel, “SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model”, arXiv preprint arXiv:2502.02737, 2025.

Guilherme Penedo, Hynek Kydlíček, Loubna Ben Allal, Anton Lozhkov, Margaret Mitchell, Colin Raffel, Leandro Von Werra, and Thomas Wolf, “The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale”, Neural Information Processing Systems 38 (NeurIPS), 2024.

Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, and 386 others including Colin Raffel, “BLOOM: A 176B-Parameter Open-Access Multilingual Language Model”, Journal of Machine Learning Research (JMLR), 2024.

Derek Tam, Yash Kant, Brian Lester, Igor Gilitschenski, and Colin Raffel, “Realistic Evaluation of Model Merging for Compositional Generalization”, arXiv preprint arXiv:2409.18314, 2024.

Ajay Patel, Colin Raffel, and Chris Callison-Burch, “DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows”, 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024.

Mohammed Muqeeth, Haokun Liu, Yufan Liu, and Colin Raffel, “Learning to Route Among Specialized Experts for Zero-Shot Generalization”, 41st International Conference on Machine Learning (ICML), 2024.

Mohammed Muqeeth, Haokun Liu, and Colin Raffel, “Soft Merging of Experts with Adaptive Routing”, Transactions on Machine Learning Research (TMLR), 2024.

Derek Tam, Mohit Bansal, and Colin Raffel, “Merging by Matching Models in Task Subspaces”, Transactions on Machine Learning Research (TMLR), 2024.

Colin Raffel, “A New Alchemy: Language Model Development as a Subfield?”, ICLR 2024 Blog Post Track, 2024.

Bowen Pan, Yikang Shen, Haokun Liu, Mayank Mishra, Gaoyuan Zhang, Aude Oliva, Colin Raffel, and Rameswar Panda, “Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models”, arXiv preprint arXiv:2404.05567, 2024.

Martin Maas, David G. Andersen, Michael Isard, Mohammad Mahdi Javanmard, Kathryn S. McKinley, and Colin Raffel, “Combining Machine Learning and Lifetime-based Resource Management for Memory Allocation and Beyond”, Communications of the Association for Computing Machinery (CACM), 2024.

Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, and 10 others including Colin Raffel, “A Survey on Data Selection for Language Models”, Transactions on Machine Learning Research (TMLR), 2024.

Alon Albalak, Liangming Pan, Colin Raffel, and William Yang Wang, “Efficient Online Data Mixing For Language Model Pre-Training”, NeurIPS 2023 Workshop on Robustness of Few-shot and Zero-shot Learning in Large Foundation Models, 2023.

Dhuvarakesh Karthikeyan, Colin Raffel, Benjamin Vincent, and Alex Rubinsteyn, “Conditional Generation of Antigen Specific T-cell Receptor Sequences”, NeurIPS 2023 Generative AI and Biology (GenBio) Workshop, 2023.

Alexander Borzunov, Dmitry Baranchuk, Tim Dettmers, Max Ryabinin, Younes Belkada, Artem Chumachenko, Pavel Samygin, and Colin Raffel, “Distributed Inference and Fine-tuning of Large Language Models Over The Internet”, Neural Information Processing Systems 37 (NeurIPS), 2023.

Prateek Yadav, Derek Tam, Leshem Choshen, Colin Raffel, and Mohit Bansal, “TIES-Merging: Resolving Interference When Merging Models”, Neural Information Processing Systems 37 (NeurIPS), 2023.

Niklas Muennighoff, Alexander M. Rush, Boaz Barak, Teven Le Scao, Aleksandra Piktus, Nouamane Tazi, Sampo Pyysalo, Thomas Wolf, and Colin Raffel, “Scaling Data-Constrained Language Models”, Neural Information Processing Systems 37 (NeurIPS), 2023.

Alon Albalak, Colin Raffel, and William Yang Wang, “Improving Few-Shot Generalization by Exploring and Exploiting Auxiliary Data”, Neural Information Processing Systems 37 (NeurIPS), 2023.

Haikang Deng and Colin Raffel, “Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model”, 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.

Almog Gueta, Elad Venezian, Colin Raffel, Noam Slonim, Yoav Katz, and Leshem Choshen, “Knowledge is a Region in Weight Space for Fine-tuned Language Models”, Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023.

Adam Roberts, Hyung Won Chung, Anselm Levskaya, Gaurav Mishra, and 39 others including Colin Raffel, “Scaling Up Models and Data with t5x and seqio”, Journal of Machine Learning Research (JMLR), 2023.

Michael Matena and Colin Raffel, “NPEFF: Non-Negative Per-Example Fisher Factorization”, arXiv preprint arXiv:2310.04649, 2023.

Marcos Treviso*, Tianchu Ji*, Ji-Ung Lee*, Betty van Aken, and 14 others including Colin Raffel, “Efficient Methods for Natural Language Processing: A Survey”, Transactions of the Association for Computational Linguistics (TACL), 2023.

Nikhil Kandpal*, Brian Lester*, Mohammed Muqeeth, Anisha Mascarenhas, Monty Evans, Vishal Baskaran, Tenghao Huang, Haokun Liu, and Colin Raffel, “Git-Theta: A Git Extension for Collaborative Development of Machine Learning Models”, 40th International Conference on Machine Learning, 2023.

Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, and Colin Raffel, “Large Language Models Struggle to Learn Long-Tail Knowledge”, 40th International Conference on Machine Learning, 2023.

Shachar Don-Yehiya, Elad Venezian, Colin Raffel, Noam Slonim, Yoav Katz, and Leshem Choshen, “ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning”, 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.

Derek Tam*, Anisha Mascarenhas*, Shiyue Zhang, Sarah Kwan, Mohit Bansal, and Colin Raffel, “Evaluating the Factual Consistency of Large Language Models Through Summarization”, Findings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.

Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, and 15 others including Colin Raffel, “Crosslingual Generalization through Multitask Finetuning”, 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023.

Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, and 440 others including Colin Raffel, “Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models”, Transactions on Machine Learning Research (TMLR), 2023.

Ajay Patel, Bryan Li, Mohammad Sadegh Rasooli, Noah Constant, Colin Raffel, and Chris Callison-Burch, “Bidirectional Language Models Are Also Few-shot Learners”, 11th International Conference on Learning Representations (ICLR), 2023.

Derek Tam, Colin Raffel, and Mohit Bansal, “Simple Weakly-Supervised Image Captioning via CLIP's Multimodal Embeddings”, AAAI Workshop on Creative AI Across Modalities, 2023.

Colin Raffel, “Building Machine Learning Models like Open-Source Software”, Communications of the Association for Computing Machinery (CACM), 2023.

Teven Le Scao*, Thomas Wang*, Daniel Hesslow*, Lucile Saulnier*, Stas Bekman*, and 13 others including Colin Raffel, “What Language Model to Train if You Have One Million GPU Hours?”, Findings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022.

Mohammed Muqeeth, Haokun Liu, and Colin Raffel, “Models with Conditional Computation Learn Suboptimal Solutions”, NeurIPS Workshop on Understanding Deep Learning Through Empirical Falsification (I Can't Believe It's Not Better), 2022.

Alexander Borzunov*, Dmitry Baranchuk*, Tim Dettmers*, Max Ryabinin*, Younes Belkada*, Artem Chumachenko, Pavel Samygin, and Colin Raffel, “Petals: Collaborative Inference and Fine-tuning of Large Models”, 61st Annual Meeting of the Association for Computational Linguistics (ACL) Demo Track and NeurIPS Workshop on Broadening Research Collaborations in Machine Learning, 2022.

Michael Matena and Colin Raffel, “A Combinatorial Perspective on the Optimization of Shallow ReLU Networks”, Neural Information Processing Systems 36 (NeurIPS), 2022.

Zhenlin Xu, Marc Niethammer, and Colin Raffel, “Compositional Generalization in Unsupervised Compositional Representation Learning: A Study on Disentanglement and Emergent Language”, Neural Information Processing Systems 36 (NeurIPS), 2022.

Haokun Liu*, Derek Tam*, Mohammed Muqeeth*, Jay Mohta, Tenghao Huang, Mohit Bansal, and Colin Raffel, “Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning”, Neural Information Processing Systems 36 (NeurIPS), 2022.

Michael Matena and Colin Raffel, “Merging Models with Fisher-Weighted Averaging”, Neural Information Processing Systems 36 (NeurIPS), 2022.

Jiaao Chen*, Derek Tam*, Colin Raffel, Mohit Bansal, and Diyi Yang, “An Empirical Survey of Data Augmentation for Limited Data Learning in NLP”, Transactions of the Association for Computational Linguistics (TACL), 2022.

Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, and 12 others, “Emergent Abilities of Large Language Models”, Transactions on Machine Learning Research (TMLR), 2022.

Thomas Wang*, Adam Roberts*, Daniel Hesslow, Teven Le Scao, Hyung Won Chung, Iz Beltagy, Julien Launay, and Colin Raffel, “What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?”, 39th International Conference on Machine Learning (ICML), 2022.

Nikhil Kandpal, Eric Wallace, and Colin Raffel, “Deduplicating Training Data Mitigates Privacy Risks in Language Models”, 39th International Conference on Machine Learning (ICML), 2022.

Stephen H. Bach*, Victor Sanh*, Zheng-Xin Yong, Albert Webson, Colin Raffel, and 22 others, “PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts”, 60th Annual Meeting of the Association for Computational Linguistics (ACL) Demo Track, 2022.

Victor Sanh*, Albert Webson*, Colin Raffel*, Stephen H. Bach*, and 37 others, “Multitask Prompted Training Enables Zero-Shot Task Generalization”, 10th International Conference on Learning Representations (ICLR), 2022.

Linting Xue*, Aditya Barua*, Noah Constant*, Rami Al-Rfou*, Sharan Narang, Mihir Kale, Adam Roberts, and Colin Raffel, “ByT5: Towards a token-free future with pre-trained byte-to-byte models”, Transactions of the Association for Computational Linguistics (TACL), 2022.

Sabrina J. Mielke, Zaid Alyafeai, Elizabeth Salesky, Colin Raffel, Manan Dey, Matthias Gallé, Arun Raja, Chenglei Si, Wilson Y. Lee, Benoît Sagot, and Samson Tan, “Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP”, arXiv preprint arXiv:2112.10508, 2021.

Yi-Lin Sung*, Varun Nair*, and Colin Raffel, “Training Neural Networks with Fixed Sparse Masks”, Neural Information Processing Systems 35 (NeurIPS), 2021.

Derek Tam*, Rakesh R Menon*, Mohit Bansal, Shashank Srivastava, and Colin Raffel, “Improving and Simplifying Pattern Exploiting Training”, 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.

Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, and 12 others including Colin Raffel, “Do Transformer Modifications Transfer Across Implementations and Applications?”, 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021.

Ching-Yuan Bai, Hsuan-Tien Lin, Colin Raffel, and Wendy Chih-wen Kan, “On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition”, 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021.

Nicholas Carlini, Florian Tramer, Eric Wallace, Matthew Jagielski, and 8 others including Colin Raffel, “Extracting Training Data from Large Language Models”, 30th USENIX Security Symposium, 2021.

Linting Xue*, Noah Constant*, Adam Roberts*, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, and Colin Raffel, “mT5: A Massively Multilingual Pre-Trained Text-to-Text Transformer”, 2021 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021.

Zhenlin Xu, Deyi Liu, Junlin Yang, Colin Raffel, and Marc Niethammer, “Robust and Generalizable Visual Representation Learning via Random Convolutions”, 9th International Conference on Learning Representations (ICLR), 2021.

Colin Raffel and Kevin P. Murphy (ed.), “Learning with Fewer Labeled Examples”, Book chapter in Probabilistic Machine Learning: An Introduction, 2021.

Kihyuk Sohn*, David Berthelot*, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, and Colin Raffel, “FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence”, Neural Information Processing Systems 34 (NeurIPS), 2020.

Samarth Sinha, Anirudh Goyal, Colin Raffel, and Augustus Odena, “Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad Samples”, Neural Information Processing Systems 34 (NeurIPS), 2020.

Sewon Min, Jordan Boyd-Graber, Chris Alberti, Danqi Chen, and 10 others including Colin Raffel, “NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned”, Proceedings of Machine Learning Research (PMLR), NeurIPS 2020 Competition and Demonstration Track, 2020.

Adam Roberts*, Colin Raffel*, and Noam Shazeer, “How Much Knowledge Can You Pack Into the Parameters of a Language Model?”, 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020.

Colin Raffel*, Noam Shazeer*, Adam Roberts*, Katherine Lee*, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu, “Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer”, Journal of Machine Learning Research (JMLR), 21(140), 2020.

David Berthelot, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Kihyuk Sohn, Han Zhang, and Colin Raffel, “ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring”, 8th International Conference on Learning Representations (ICLR), 2020.

Yao Qin*, Nicholas Frosst*, Sara Sabour, Colin Raffel, Garrison Cottrell, and Geoffrey Hinton, “Detecting and Diagnosing Adversarial Images with Class-Conditional Capsule Reconstructions”, 8th International Conference on Learning Representations (ICLR), 2020.

Sharan Narang*, Colin Raffel*, Katherine Lee, Adam Roberts, Noah Fiedel, and Karishma Malkan, “WT5?! Training Text-to-Text Models to Explain their Predictions”, arXiv preprint arXiv:2004.14546, 2020.

Martin Maas, David G. Andersen, Michael Isard, Mohammad Mahdi Javanmard, Kathryn S. McKinley, and Colin Raffel, “Learning-based Memory Allocation for C++ Server Workloads”, 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2020.

Yao Qin, Nicholas Frosst, Colin Raffel, Garrison Cottrell, and Geoffrey Hinton, “Deflecting Adversarial Attacks”, arXiv preprint arXiv:2002.07405, 2020.

David Berthelot, Nicholas Carlini, Ian Goodfellow, Nicolas Papernot, Avital Oliver, and Colin Raffel, “MixMatch: A Holistic Approach to Semi-Supervised Learning”, Neural Information Processing Systems 33 (NeurIPS), 2019.

Naveen Arivazhagan*, Colin Cherry*, Wolfgang Macherey, Chung-Cheng Chiu, Semih Yavuz, Ruoming Pang, Wei Li, and Colin Raffel, “Monotonic Infinite Lookback Attention for Simultaneous Machine Translation”, 57th Annual Meeting of the Association for Computational Linguistics (ACL), 2019.

Yao Qin, Nicholas Carlini, Ian Goodfellow, Garrison Cottrell, and Colin Raffel, “Imperceptible, Robust, and Targeted Adversarial Examples for Automatic Speech Recognition”, 36th International Conference on Machine Learning (ICML), 2019.

David Berthelot*, Colin Raffel*, Aurko Roy, and Ian Goodfellow, “Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer”, 7th International Conference on Learning Representations (ICLR), 2019.

Ishaan Gulrajani, Colin Raffel, and Luke Metz, “Towards GAN Benchmarks Which Require Generalization”, 7th International Conference on Learning Representations (ICLR), 2019.

Vaishnavh Nagarajan, Colin Raffel, and Ian J. Goodfellow, “Theoretical Insights into Memorization in GANs”, NeurIPS Workshop on Integration of Deep Learning Theories, 2018.

Avital Oliver*, Augustus Odena*, Colin Raffel*, Ekin D. Cubuk, and Ian J. Goodfellow, “Realistic Evaluation of Deep Semi-Supervised Learning Algorithms”, Neural Information Processing Systems 32 (NeurIPS), 2018.

Ian Simon, Adam Roberts, Colin Raffel, Jesse Engel, Curtis Hawthorne, and Douglas Eck, “Learning a Latent Space of Multitrack Measures”, 2nd NeurIPS Workshop on Machine Learning for Creativity and Design, 2018.

Curtis Hawthorne*, Erich Elsen*, Jialin Song*, Adam Roberts, Ian Simon, Colin Raffel, Jesse Engel, Sageev Oore, and Douglas Eck, “Onsets and Frames: Dual-Objective Piano Transcription”, 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.

Adam Roberts, Jesse Engel, Colin Raffel, Curtis Hawthorne, and Douglas Eck, “A Hierarchical Latent Vector Model for Learning Long-Term Structure in Music”, 35th International Conference on Machine Learning (ICML), 2018.

Augustus Odena, Jacob Buckman, Catherine Olsson, Tom B. Brown, Christopher Olah, Colin Raffel, and Ian Goodfellow, “Is Generator Conditioning Causally Related to GAN Performance?”, 35th International Conference on Machine Learning (ICML), 2018.

Chung-Cheng Chiu* and Colin Raffel*, “Monotonic Chunkwise Attention”, 6th International Conference on Learning Representations (ICLR), 2018.

Jacob Buckman*, Aurko Roy*, Colin Raffel, and Ian J. Goodfellow, “Thermometer Encoding: One Hot Way To Resist Adversarial Examples”, 6th International Conference on Learning Representations (ICLR), 2018.

Dieterich Lawson*, George Tucker*, Chung-Cheng Chiu*, Colin Raffel, Kevin Swersky, and Navdeep Jaitly, “Learning Hard Alignments with Variational Inference”, 43rd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018.

Colin Raffel, Minh-Thang Luong, Peter J. Liu, Ron J. Weiss, and Douglas Eck, “Online and Linear-Time Attention by Enforcing Monotonic Alignments”, 34th International Conference on Machine Learning (ICML), 2017.

Colin Raffel and Dieterich Lawson, “Training a Subsampling Mechanism in Expectation”, 5th International Conference on Learning Representations Workshop (ICLR), 2017.

Justin Gilmer, Colin Raffel, Samuel S. Schoenholz, Maithra Raghu, and Jascha Sohl-Dickstein, “Explaining the Learning Dynamics of Direct Feedback Alignment”, 5th International Conference on Learning Representations Workshop (ICLR), 2017.

Colin Raffel and Daniel P. W. Ellis, “Extracting Ground Truth Information from MIDI Files: A MIDIfesto”, 17th International Society for Music Information Retrieval Conference (ISMIR), 2016.

Colin Raffel, “Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching”, Ph.D. Thesis, 2016.

Colin Raffel and Daniel P. W. Ellis, “Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems”, 4th International Conference on Learning Representations Workshop (ICLR), 2016.

Colin Raffel and Daniel P. W. Ellis, “Pruning Subsequence Search with Attention-Based Embedding”, 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016.

Colin Raffel and Daniel P. W. Ellis, “Optimizing DTW-Based Audio-to-MIDI Alignment and Matching”, 41st IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016.

Nikolai Yakovenko, Liangliang Cao, Colin Raffel, and James Fan, “Poker-CNN: A Pattern Learning Strategy for Making Draws and Bets in Poker Games”, 30th AAAI Conference on Artificial Intelligence, 2016.

Colin Raffel and Daniel P. W. Ellis, “Accelerating Multimodal Sequence Retrieval with Convolutional Networks”, NeurIPS Multimodal Machine Learning Workshop, 2015.

Andreas Jansson, Colin Raffel, and Tillman Weyde, “This Is My Jam: Data Dump”, 16th International Society for Music Information Retrieval Conference Late Breaking and Demo Papers, 2015.

Colin Raffel and Daniel P. W. Ellis, “Large-Scale Content-Based Matching of MIDI and Audio Files”, 16th International Society for Music Information Retrieval Conference (ISMIR), 2015.

Brian McFee, Colin Raffel, Dawen Liang, Daniel P. W. Ellis, Matt McVicar, and Oriol Nieto, “librosa: Audio and Music Signal Analysis in Python”, 14th Python in Science Conference (SciPy), 2015.

Colin Raffel and Daniel P. W. Ellis, “Intuitive Analysis, Creation and Manipulation of MIDI Data with pretty_midi”, 15th International Society for Music Information Retrieval Conference Late Breaking and Demo Papers, 2014.

Colin Raffel, Brian McFee, Eric J. Humphrey, Justin Salamon, Oriol Nieto, Dawen Liang, and Daniel P. W. Ellis, “mir_eval: A Transparent Implementation of Common MIR Metrics”, 15th International Society for Music Information Retrieval Conference (ISMIR), 2014.

Colin Raffel and Daniel P. W. Ellis, “Estimating Timing and Channel Distortion Across Related Signals”, 39th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.

Colin Raffel, “Using Noise Substitution for Backwards-Compatible Audio Codec Improvement”, 129th Convention of the Audio Engineering Society (AES), 2010.

Colin Raffel and Julius O. Smith, “Practical Modeling of Bucket-Brigade Device Circuits”, 13th International Conference on Digital Audio Effects (DAFx), 2010.

Colin Raffel, Nick Kruge, Diane Douglas, Edgar Berdahl, and Wendy Ju, “The Lattice Harp: A New Hybrid Instrument and Controller”, 35th International Computer Music Conference (ICMC), 2010.

2024

Merging and MoErging for Compositional Generalization
NeurIPS 2024 Workshop on Compositional Learning and ETH Zurich SPCL_Bcast()

2024

The Most Expensive Part of an LLM Should Be Its Training Data
Toronto LLM x Law Hackathon

2024

Progress on a Permissively Licensed Text Dataset
Vector ML Security and Privacy Workshop and University of Pennsylvania CLunch

2023

Build an Ecosystem, Not a Monolith
Simons Institute Workshop on Large Language Models and Transformers, Google Responsible Machine Learning Reading Group, University of Edinburgh ILCC Seminar, Stanford NLP Seminar, UCSD AI Seminar, Yale CPSC 488/588 Lecture, University of Toronto CL Colloquium, and Open AGI Summit@EthCC

2023

Collaborative, Communal, & Continual Machine Learning
Faculty job talk

2022

Building Better Language Models: Insights from BigScience
Stanford Center for Research on Foundation Models

2022

Weird Things About Professorship
EMNLP Share Stories and Lessons Learned Workshop

2022

Building Better Language Models
Johns Hopkins University CSCI 601.771 Lecture, Mosaic.ml, and Vector Institute Research Symposium

2022

Infrastructure and Progress Towards the First Community-Built and Continually-Improved Model
Microsoft Research Efficient Large-Scale AI Workshop

2022

Building Machine Learning Models Like Open-Source Software
Microsoft Research Summit, World Artificial Intelligence Conference, Technische Universität Darmstadt, UT Austin Forum for Artificial Intelligence, Korea AI Summit, Stanford CS324 Lecture, Stanford MLSys Seminar Series, and MLSys Symposium on Decentralized and Collaborative Learning

2022

How to Be an Academic Machine Learning Researcher in the Era of Scale
CIFAR Deep Learning and Reinforcement Learning Summer School

2022

Less Data, More ___? Data Augmentation and Semi-Supervised Learning for Natural Language Processing
60th Annual Meeting of the Association for Computational Linguistics Tutorials

2021

A call to build models like we build open-source software
Cornell University Artificial Intelligence Seminar, Georgia Tech NLP Seminar, UMass Amherst Machine Learning & Friends Lunch, UC Santa Barbara NLP Seminar

2021

A few possibly controversial opinions about large language models
Carnegie Mellon University Language Technologies Topical Seminar

2021

The Sweet Lesson
SustaiNLP Workshop

2021

What do language models learn from language modeling?
Stanford University CS 330 Lecture and Advanced Language Processing Winter School

2021

How and why should(n't) we scale machine learning?
IBM AI Hardware Forum Keynote

2021

A better way to get language models to do what you ask
AKBC 2021 Unstructured and Structured Knowledge Bases Workshop and Cohere.ai

2021

Scaling up Models and Data
CIFAR Deep Learning and Reinforcement Learning Summer School, Nepal Winter School in AI, and Advanced Language Processing Winter School

2021

Explicit and Implicit Entropy Minimization in Proxy-Label-Based Semi-Supervised Learning
CVPR Workshop on Learning with Limited and Imperfect Data

2021

The benefits of unified frameworks for language understanding
Conceptual Understanding of Deep Learning Workshop

2020

T5 and large language models: The good, the bad, and the ugly
Stanford University CS 224n Lecture, CU Boulder Applied Mathematics Colloquium, Twitter Machine Learning Seminar, Google Graduate Symposium & TTIC NLP Seminar

2020

Responsible publication: NLP case study
Navigating the Broader Impacts of AI Research Workshop Panel

2020

What Can MIR Learn From Transfer Learning in NLP?
NLP for Music and Audio Workshop Keynote

2020

Transfer Learning for NLP: T5 and Beyond
Montreal Institute for Learning Algorithms Tea Talk & Spotify Research Seminar

2020

Answering Questions by Querying the Implicit Knowledge Base Inside T5
AKBC 2020 Unstructured and Structured Knowledge Bases Workshop

2019

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Allen Institute for Artificial Intelligence & New York University CILVR Seminar

2019

Outskirts of Deep Generative Modeling
Faculty Job Talk

2018

Why are GANs Interesting?
New York University CILVR Seminar

2018

A Few Unusual Autoencoders
Vector Institute, New York University & San Francisco State University

2017

Leveraging MIDI Files for Music Information Retrieval
18th International Society for Music Information Retrieval Conference Tutorials

2017

Doing Strange Things with Attention
AI With The Best & 1st USF Data Institute Conference

2016

The Lakh MIDI Dataset: How It Was Made, and How to Use It
BISH Bash Meetup, Centre for Digital Music Seminar & Jukedeck Lunch and Learn

2016

Learning-Based Methods for Comparing Sequences, with Applications to Audio-to-MIDI Alignment and Matching
2nd ICML Machine Learning for Music Discovery Workshop

2015

Accelerating Large-Scale Sequence Retrieval with Convolutional Networks
IIT Bombay Electrical Engineering Seminar

2015

Learning Efficient Representations for Sequence Retrieval
Boston Data Festival

2015

Using Convolutional Networks (with Attention) for Orders-of-Magnitude Speedup of DTW-Based Sequence Retrieval
Spotify Machine Learning Seminar

2015

Recurrent Networks in Lasagne
Mount Sinai Hammer Lab Seminar

2015

Lasagne Tutorial
Next.ml Boston

2015

Theano Tutorial
Next.ml Boston

2015

mir_eval
Objective Evaluation in Semantic Audio Analysis and Processing Panel at the 138th Convention of the Audio Engineering Society

2015

Large-Scale Content-Based Matching of Audio and MIDI Data
Stanford University DSP Seminar

2013

Advances and Challenges in Large-Scale Music Information Retrieval
Digital Music Research Network+8

2013

Quantifying Rhythmic Synchrony
Midwestern Music Cognition Symposium

2011

A Sequential Approach to Musical Event Detection
Carnegie Mellon University Music and Technology Seminar

2010

ROW-mp3: An Enhanced MP3-Compatible Audio Codec
Stanford University DSP Seminar

2010

An Effective Model of Bucket-Brigade Device-Based Audio Circuits
Stanford University DSP Seminar

2008

Voltage-Controlled Resistance: Modulate Anything
Circuitastrophe Circuit Bending Music Festival

2024

Instructor, Neural Networks and Deep Learning, University of Toronto

2023

Instructor, Neural Networks and Deep Learning, University of Toronto

2023

Instructor, Deep Learning, UNC Chapel Hill

2022

Instructor, Large Language Models, UNC Chapel Hill

2022

Instructor, Deep Learning, UNC Chapel Hill

2021

Instructor, Information Theory, UNC Chapel Hill

2021

Instructor, Deep Learning, UNC Chapel Hill

2020

Instructor, Learning from Limited Labeled Data, UNC Chapel Hill

2018

Instructor, Introduction to Machine Learning, Google TechExchange

2015

Teacher's Assistant, Deep Learning for Computer Vision and NLP, Columbia University

2014

Teacher's Assistant, Music Digital Signal Processing, Columbia University

2013

Teacher's Assistant, Music Digital Signal Processing, Columbia University

2009

Instructor, Electronics, Oberlin College

2006

Instructor, Circuit Bending, Oberlin ExCo

2024–now

Marco Ciccone, Postdoc at the Vector Institute

2025–now

Malikeh Eghaghi, PhD student at the University of Toronto

2024–now

Gyung Hyun Je, PhD student at the University of Toronto

2024–now

Gül Sena Altintaş, PhD student at the University of Toronto

2022–now

Brian Lester, PhD student at the University of Toronto

2021–now

Haokun Liu, PhD student at the University of Toronto

2020–now

Nikhil Kandpal, PhD student at the University of Toronto

2020–now

Derek Tam, PhD student at the University of Toronto

2020–now

Michael Matena, PhD student at UNC

2024–now

Fengyuan Liu, Undergraduate at the University of Toronto

2024–now

Yu Xin Li, Undergraduate at the University of Toronto

2024–2025

Wanru Zhao, Intern at the Vector Institute

2024–2025

Cam Bishop, Intern at the Vector Institute

2024

Weiwei Sun, Intern at the Vector Institute

2021–2023

Muqeeth Mohammed, Master's student at UNC

2022–2023

Yufan Liu, Undergraduate at UNC

2022–2023

Haikang Deng, Undergraduate at UNC

2021–2022

Anisha Mascarenhas, Master's student at UNC

2021–2022

Vishal Baskaran, Master's student at UNC

2022

Peiyu Li, Undergraduate at UNC

2022

Tara Ghorpadkar, Undergraduate at UNC

2021–2022

Mansi Sakarvadia, Undergraduate at UNC

2021–2022

Tenghao Huang, Undergraduate at UNC

2021–2022

Ellie Evans, Undergraduate at UNC

2021–2022

Monty Evans, Undergraduate at UNC

2020–2022

Zhenlin Xu, PhD student at UNC

2021

Varun Nair, Undergraduate at Duke University

2020–2021

Jay Mohta, Master's student at NC State

2018–2019

Yao Qin, Intern at Google Brain

2018

Vaishnavh Nagarajan, Intern at Google Brain

2017–2018

Ishaan Gulrajani, Resident at Google Brain

2017–2018

Avital Oliver, Resident at Google Brain

2017

Jacob Buckman, Resident at Google Brain

2025

Organizer, ICLR 2025 Workshop on Modularity for Collaborative, Decentralized, and Continual Deep Learning (MCDC)

2025

Workshop Chair, Conference on Language Modeling

2025

Senior Area Chair, International Conference on Learning Representations

2024–2025

Senior Area Chair, Neural Information Processing Systems

2023

Peer Reviewer, Canada CIFAR AI Chairs program

2022

Organizer, NeurIPS Workshop on Transfer Learning for NLP

2022–2023

Senior Area Chair, Conference on Empirical Methods in Natural Language Processing

2022

Panel Moderator, 7th Workshop on Representation Learning for NLP

2022

Area Chair, Conference on Computational Natural Language Learning

2022

Senior Area Chair, North American Chapter of the Association for Computational Linguistics

2022

Organizer, ICML 2022 Workshop on Pre-Training

2022

Panelist, National Science Foundation

2022

Area Chair, Annual Meeting of the Association for Computational Linguistics

2022–2025

Action Editor, Transactions on Machine Learning Research

2021

Member, ACL Working Group on Efficient NLP

2021

Action Editor, ACL Rolling Review

2021

Lead Organizer, ICLR 2021 Workshop on Enormous Language Models

2021

Area Chair, North American Chapter of the Association for Computational Linguistics

2021–2024

Area Chair, International Conference on Learning Representations

2021

Area Chair, AAAI Conference on Artificial Intelligence

2020

Mentor, Women in Machine Learning (WiML) Roundtable

2020

Organizer, NeurIPS Competition on Efficient Open-Domain Question Answering

2020–2023

Area Chair, Neural Information Processing Systems

2020

Panelist, Decoding Graduate Programs in CS

2020

Area Chair, Conference on Empirical Methods in Natural Language Processing

2020–2023

Reviewer, Journal of Machine Learning Research

2020

Reviewer, Transactions of the International Society for Music Information Retrieval

2020

Mentor, Black in AI

2019

Mentor, Women in Music Information Retrieval (WiMIR)

2018

Reviewer, Machine Learning for Creativity and Design Workshop

2018

Area Chair, International Society for Music Information Retrieval Conference

2018–2022

Reviewer, International Conference on Machine Learning

2018–2020

Reviewer, International Conference on Learning Representations

2018

Reviewer, International Journal of Computer Vision

2017–2019

Reviewer, Neural Information Processing Systems

2016

Late Breaking/Demo and Unconference Chair, 17th International Society for Music Information Retrieval Conference

2016

Reviewer, Journal of New Music Research

2015

Reviewer, EURASIP Journal on Audio, Speech, and Music Processing

2014–2016

Founder and Organizer, Neural Network Reading Group and Seminar Series at Columbia University

2014–2017

Reviewer, International Society for Music Information Retrieval Conference

2014

Founder and Organizer, Crucial Python Seminar Series at Columbia University

2014

Reviewer, IEEE International Symposium on Information Theory

2013–now

Founder and Organizer, Hacking Audio and Music Research (HAMR) hackathon series

2008–2009

Mathematics Tutor, Oberlin College

2025–now

Samarendra Chandan Bindu Dash, University of Toronto

2025–now

Baorun Mu, University of Toronto

2024–2025

Prateek Yadav, University of Toronto

2024–now

Dhuvarakesh Karthikeyan, University of North Carolina

2024–now

Zhenwei (Joseph) Tang, University of Toronto

2024–now

Lunjun Zhang, University of Toronto

2024–now

Ajay Patel, University of Pennsylvania

2024–now

Towaki Takikawa, University of Toronto

2024–now

Honghua Dong, University of Toronto

2024–now

Yangjun Ruan, University of Toronto

2024

Lucas Gomez, McGill University

2023–now

Gavin Guan, University of Toronto

2023–now

Alex Adams, University of Toronto

2023–2024

Zining Zhu, University of Toronto

2023–2024

Jiaao Chen, Georgia Institute of Technology

2022–2024

Andrew Freeman, University of North Carolina, Chapel Hill

2022–2023

Albert Webson, Brown University

2022–2023

Feng Cheng, University of North Carolina, Chapel Hill

2022–2023

Xiang Zhou, University of North Carolina, Chapel Hill

2022–2023

Chao Zhao, University of North Carolina, Chapel Hill

2022–2023

Tu Vu, University of Massachusetts, Amherst

2022–2023

Peirong Liu, University of North Carolina, Chapel Hill

2020–2023

Junhua Yan, University of North Carolina, Chapel Hill

2020–2022

Yang Li, University of North Carolina, Chapel Hill

2020–2022

Zhengyang Shen, University of North Carolina, Chapel Hill

2020–2022

Dan Korn, University of North Carolina, Chapel Hill

2021–2023

YoungJoong Kwon, University of North Carolina, Chapel Hill
