The Data Exchange Podcast: Michael Mahoney on meta-analysis and adversarial training, and predicting trends in machine learning.
In this episode of the Data Exchange I speak with Michael Mahoney, a researcher at UC Berkeley’s RISELab, ICSI, and Department of Statistics. Mike and his collaborators were recently awarded one of the best paper awards at NeurIPS 2020, one of the leading research conferences in machine learning.
We discussed three of Mike’s recent papers, and this led to a discussion about norms and practices that are common in the ML community. While these papers are somewhat technical in nature, they all have practical insights for data scientists and machine learning engineers in industry:
- Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method: Selecting a small but representative sample of column vectors from a large matrix has applications in machine learning, signal processing, and scientific computing. These are applications where you need to not just develop models but validate them. These are also situations where the columns might mean something, in which case you might also need models that make sense in terms of the domain where the data came from.
- Adversarially-Trained Deep Nets Transfer Better: Adversarial examples have become more common in fields like computer vision, and to a lesser extent in natural language models. These are usually carefully crafted perturbations meant to fool a model but are also being used to test model robustness. Mike and his colleagues show that adversarial training translates to models that transfer better.
- Predicting trends in the quality of state-of-the-art neural networks without access to training or testing data: This is a follow-up to a previous conversation Mike and I had where we discussed WeightWatcher: an open source project for predicting the accuracy of deep neural networks. In this paper they develop tools for the following very common scenario: suppose you are given a model that you did not train (and thus know little about how it was actually built). Can you tell if the model is terrible or mediocre or good or great? The usual approach to evaluating a model is to examine its training and testing error. But unless you have access to the data used to train the model, this standard approach is off the table. To answer this question, Mike and his collaborators examine a range of quantities that enable them to make assertions about the quality of a model.
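To make the "carefully crafted perturbations" from the second paper above more concrete, here is a minimal sketch of one well-known way to build an adversarial example, the fast gradient sign method (FGSM), applied to a tiny logistic regression model. This is an illustration only: the episode does not say which attack the paper uses, and the weights, input, and `eps` budget below are made-up toy values, as are the helper names `predict` and `fgsm_perturb`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm_perturb(w, b, x, y, eps):
    """Move each input coordinate by eps in the direction that increases the loss.

    For logistic regression with log loss, the gradient of the loss with
    respect to the input x is (p - y) * w, where p is the predicted probability.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Toy model and input (hypothetical values chosen for illustration).
w = [2.0, -1.0, 0.5]
b = 0.0
x = [0.3, 0.2, 0.1]
y = 1  # true label

x_adv = fgsm_perturb(w, b, x, y, eps=0.3)
p_clean = predict(w, b, x)
p_adv = predict(w, b, x_adv)

print(f"clean input:     p(class 1) = {p_clean:.3f}")
print(f"perturbed input: p(class 1) = {p_adv:.3f}")
```

With these toy values, a perturbation of at most 0.3 per coordinate is enough to push the prediction across the 0.5 decision boundary. Adversarial training, roughly speaking, feeds perturbed inputs like `x_adv` back into training so the model learns to resist them.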
We closed by discussing a recent paper from MIT – The Computational Limits of Deep Learning – which I featured in a previous episode. This in turn led to a discussion about the need for the ML community to consider things like adversarial training, meta-analysis, and other tools that might help produce more robust and more interpretable models:
- ❛ There’s a strong bias towards getting what’s called state-of-the-art results. This means some epsilon improvement on some previous model. And the easiest way to be epsilon better is not to have a new conceptual framework, or argue about generalization versus transfer and interpretability. … So a lot of progress in machine learning has been made because of contests. And when you have well defined contests, you have a number that people try and beat a current record, but then you may over optimize to that number. So I think the flip side of the progress that happens with a contest is that you get this culture that chases these sort of state-of-the-art results. If you have a broader context and are interested in other things like interpretability and robustness, those tend to fall by the wayside.
… Something that very few papers do but this one (“The Computational Limits of Deep Learning”) did is do something like a meta-analysis. So one way to think about predicting trends in state-of-the-art neural networks without access to training and testing data is to look at hundreds of publicly available pre-trained models. When we submitted our paper on predicting trends, the reviewer asked why we didn’t focus on training new models. We instead looked at publicly available models, which is sort of what the MIT paper did. And I think that’s common in biomedical analysis and a range of areas. I think it’s sorely lacking in machine learning here. There’s so many ML papers out there, doing a meta-analysis like that I think is valued. ❜
Related content and resources:
- A video version of this conversation is available on our YouTube channel.
- Neil Thompson: “The Computational Limits of Deep Learning”
- Michael Mahoney: “Understanding deep neural networks”
- Peter Warden: “Why TinyML will be huge”
- Navigate the road to Responsible AI
- Yishay Carmiel: “End-to-end deep learning models for speech applications”
- Dan Geer and Andrew Burt: “Security and privacy for the disoriented”
- Rumman Chowdhury: “The State of Responsible AI”