The Data Exchange Podcast: Sercan Arik on a canonical neural network architecture for tabular data.
This week’s guest is Sercan Arik, Research Scientist at Google Cloud AI. Sercan and his collaborators recently published a paper on TabNet, a deep neural network architecture for tabular data. It uses sequential attention to select features, is explainable, and based on tests Sercan and team have done spanning many domains, TabNet outperforms or is on par with other models (e.g., XGBoost) on classification and regression problems. A more recent paper suggests that better regularization techniques could yield further improvements to neural models for tabular data.
How does one get started with TabNet? While Google has only made TabNet available on Google Cloud, I have been experimenting with an open source PyTorch version from DreamQuark and have found the results encouraging. With that said, a recent study from Intel suggests that XGBoost outperforms TabNet and other deep neural models across a wide variety of datasets. So which technique are users supposed to focus on? The right strategy is to experiment and combine models as you see fit, based on the specifics of your dataset and application requirements.
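One simple way to act on the "experiment and combine" advice is to score each model on a held-out set, then check whether averaging their predicted probabilities does any better. The sketch below is a minimal illustration using only the Python standard library. The probability lists are stand-ins for the outputs of a boosted-tree model and a TabNet model, and the helper names `blend` and `accuracy` and every number are made up for the example rather than taken from any experiment.

```python
def blend(probs_a, probs_b, weight_a=0.5):
    """Weighted average of two models' positive-class probabilities."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same rows")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(probs_a, probs_b)]


def accuracy(probs, labels, threshold=0.5):
    """Fraction of rows whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


# Hypothetical validation-set outputs (illustrative numbers only).
labels = [1, 0, 1, 1, 0, 0]
tree_probs = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]    # stand-in for a boosted-tree model
tabnet_probs = [0.7, 0.3, 0.7, 0.3, 0.3, 0.2]  # stand-in for a TabNet model

candidates = {
    "trees": tree_probs,
    "tabnet": tabnet_probs,
    "blend": blend(tree_probs, tabnet_probs),
}
scores = {name: accuracy(p, labels) for name, p in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The toy numbers were chosen so that the two models make different mistakes, which is the situation where blending tends to help. On a real dataset the blend may well lose to one of the individual models, which is why it is worth measuring all three.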
At Google, we have very high value applications with structured data and some of them are a lot more important than some others. But if you can solve the problem in a canonical way, there is a greater value to people. So that is one motivation.
We looked at different things while contemplating what neural architecture to use. So one of them is that we know that decision tree based approaches, which are dominant in ML for tabular data, have some structure in them in how they process the data and create decision manifolds.
Next let me explain the inspiration for “attention” mechanisms. When humans make decisions, they look at the data, then they focus on some parts of the data, then they focus on some other parts of the data and so on. And eventually, they aggregate these pieces of information. That is also why attention-based architectures were proposed initially: one of the motivations for attention was mimicking the human perceptual system.
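The "focus on some parts, then on other parts" idea can be sketched in a few lines. The example below loosely follows the masking scheme described in the TabNet paper: each decision step turns feature scores into a sparse mask with sparsemax, and a running prior scales down features that earlier steps already used. This is a simplified illustration, not the model itself. In TabNet the scores come from learned layers, whereas here they are hand-picked, and the function name `sequential_masks` is an assumption for the example.

```python
def sparsemax(scores):
    """Project scores onto the probability simplex; unlike softmax, entries can be exactly 0."""
    ordered = sorted(scores, reverse=True)
    cumulative = 0.0
    tau = 0.0
    for k, z_k in enumerate(ordered, start=1):
        cumulative += z_k
        if 1.0 + k * z_k > cumulative:
            tau = (cumulative - 1.0) / k  # threshold for a support of size k
    return [max(s - tau, 0.0) for s in scores]


def sequential_masks(step_scores, gamma=1.0):
    """Return one feature mask per decision step.

    `gamma` controls reuse: with gamma=1.0 a feature that was fully
    selected at one step has its score scaled to zero at later steps.
    """
    prior = [1.0] * len(step_scores[0])
    masks = []
    for scores in step_scores:
        mask = sparsemax([p * s for p, s in zip(prior, scores)])
        masks.append(mask)
        prior = [p * (gamma - m) for p, m in zip(prior, mask)]
    return masks


# Hand-picked scores for three features, repeated over three steps.
masks = sequential_masks([[2.0, 1.0, 0.5]] * 3)
for step, mask in enumerate(masks, start=1):
    print(step, [round(m, 3) for m in mask])
```

Even though the raw scores are identical at every step, the masks shift from the first feature to the second and then the third, because the prior discounts whatever has already been attended to.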
We were also very interested in explainability, because for a lot of the applications where structured data is important – like in finance or healthcare – explainability is very important. So we were thinking of how we can integrate explainable AI into the architecture, especially given the fact that instance-wise feature importance can be very important, because the samples in these tabular data sets can have very different characteristics.
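To make "instance-wise feature importance" concrete, the sketch below combines per-step feature masks for a single row into one normalized importance vector, weighting each step by how much it contributed to the decision. This mirrors the aggregation idea in the TabNet paper in simplified form. The feature names, masks, and step weights are invented for illustration, and `instance_importance` is a hypothetical helper rather than part of any library.

```python
def instance_importance(masks, step_weights):
    """Combine per-step feature masks for one row into a normalized importance vector."""
    n_features = len(masks[0])
    raw = [
        sum(w * mask[j] for w, mask in zip(step_weights, masks))
        for j in range(n_features)
    ]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else raw


features = ["age", "income", "balance"]  # hypothetical column names

# Two rows whose decision steps attended to different features.
row_a = instance_importance(
    masks=[[1.0, 0.0, 0.0], [0.0, 0.75, 0.25]], step_weights=[2.0, 1.0]
)
row_b = instance_importance(
    masks=[[0.0, 0.0, 1.0], [0.5, 0.5, 0.0]], step_weights=[1.0, 1.0]
)

for name, row in (("row_a", row_a), ("row_b", row_b)):
    top = features[row.index(max(row))]
    print(name, [round(v, 3) for v in row], "top feature:", top)
```

The two rows end up with different top features, which is the point of explaining predictions per instance instead of reporting a single global ranking.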
- A video version of this conversation is available on our YouTube channel.
- “Model Monitoring Enables Robust Machine Learning Applications”
- Combine the development experience of a laptop with the scale of the cloud
- Travis Addair: “The Future of Machine Learning Lies in Better Abstractions”
- Nicolas Hohn: “Reinforcement Learning For the Win”
- Neil Thompson: “The Computational Limits of Deep Learning”
- Rumman Chowdhury: “Responsible AI meets Reality”
- Ram Shankar: “Securing machine learning applications”
[Image by Sarah Lötscher from Pixabay.]