The Data Exchange Podcast: Harish Doddi on building enterprise solutions for model operations and model governance.
In this episode of the Data Exchange I speak with Harish Doddi, cofounder of Datatron, a startup focused on helping companies operationalize machine learning. Over the past two years, Harish has worked closely with enterprises to understand their needs in the areas of model operations and model governance. Last year Harish and I, along with David Talby, wrote two articles on these topics. In the first article, we described these emerging areas (“What are model governance and model operations?”), and in the second we listed lessons that ML engineers can draw from two highly regulated industries (“Managing machine learning in the enterprise: Lessons from banking and health care”).
As machine learning becomes widely deployed, organizations will need to develop processes and tools to ensure that models behave as intended. This means having the right set of controls and validation steps in place.
Our conversation focused on model governance and related topics:
- We discussed three related areas: MLOps, model governance, and model observability.
- I asked Harish to describe how model governance is perceived and practiced in different industries.
- We discussed real-world examples of model governance, and organizational and staffing considerations that come into play.
- CI/CD for machine learning.
- Key enterprise features for model governance solutions.
Ben: I wanted to have you on the podcast, Harish, because last year we wrote a couple of posts, along with our friend David Talby, on the topics of model governance, machine learning ops (MLOps), and model observability. I want to spend this episode talking to you mostly about model governance because I feel like that’s not as discussed, but let’s start off by having you briefly describe each of these areas. If I’m a CTO or CIO, how should I think about model observability, MLOps, and model governance?
Harish: If you are an executive at the CIO or CDO level, or even CTO level, often your internal data science teams are assigned to different business problems. For example, some of them might be working on a fraud problem, some on a pricing problem, and some on a credit risk problem. The thing is, at some point in the model life cycle, the models need to go to a production-grade environment. And it is a continuous life cycle: what you learn in production, you take back to development to rethink the model, and then you put it back into production.
So, once the models are developed by data scientists internally, MLOps facilitates moving them to a production environment by automating the operations part. This includes things as simple as containerization of the model, the infrastructure where the model is getting deployed, management of the models, and even model deployment itself. Now, the other part is, if you are working in regulated industries like financial institutions, lending companies, healthcare, or telecommunications, it’s not just about taking models into production; you also have to make sure the models go through a proper, rigorous compliance process. This is where model governance really comes into the picture: it makes sure that your models adhere to the standards of these industries, and that you have the tools to enforce those standards so your models are governed properly.
Related content:
- Rajat Monga: “The evolution of TensorFlow and of machine learning infrastructure”
- Evan Sparks: “An open source platform for training deep learning models”
- Dean Wampler: “Scalable Machine Learning, Scalable Python, For Everyone”
- Edo Liberty: “How deep learning is being used for search and information retrieval”
- Morten Dahl: “The state of privacy-preserving machine learning”
- David Talby: “Building domain specific natural language applications”
[Image: Philippines Department of Justice, from Wikimedia.]