Although we like to think of ML workflows as straight-line narratives from experimentation to training to production and then monitoring, the reality at large companies is that all of these steps happen at once, in concert with other models, against shifting data and sometimes misaligned key feature inputs.
Moreover, regulated firms are required to track all of their models, the changes to them, and the impacts of those changes for compliance. Enter explainability supported by model monitoring, which is far from sleepy tracking of changes and anomalies. Today's ML monitoring and performance management requires the ability to identify changes and alert the right people, to assist in diagnosing issues, to create what-if scenarios, and to push models back into production in real time with proper governance.
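As a rough illustration of that first capability, the sketch below compares a live feature distribution against a training baseline using a population stability index (PSI) and fires an alert when drift crosses a threshold. The metric choice, the 0.2 threshold, and the `notify_owners` hook are illustrative assumptions, not Fiddler's implementation.

```python
# Minimal sketch of feature-drift detection with alerting, assuming a PSI
# metric, a 0.2 alert threshold, and a hypothetical notify_owners() hook.
import numpy as np

def psi(baseline: np.ndarray, live: np.ndarray, bins: int = 10) -> float:
    """Population stability index between training-time and live data."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    live_pct = np.histogram(live, bins=edges)[0] / len(live)
    # Floor the proportions to avoid division by zero and log(0).
    base_pct = np.clip(base_pct, 1e-6, None)
    live_pct = np.clip(live_pct, 1e-6, None)
    return float(np.sum((live_pct - base_pct) * np.log(live_pct / base_pct)))

def notify_owners(feature: str, score: float) -> None:
    # Placeholder for a paging/Slack/email integration.
    print(f"ALERT: drift on '{feature}' (PSI={score:.3f}) - route to model owners")

def check_feature_drift(feature: str, baseline: np.ndarray, live: np.ndarray,
                        threshold: float = 0.2) -> None:
    score = psi(baseline, live)
    if score > threshold:
        notify_owners(feature, score)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train = rng.normal(0.0, 1.0, 10_000)   # training-time distribution
    prod = rng.normal(0.5, 1.2, 10_000)    # shifted production distribution
    check_feature_drift("credit_utilization", train, prod)
```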
FiddlerAI is a startup focused on enterprise model performance management, tackling the unique challenges of building stable and secure in-house MLOps systems at scale. Today we interview Krishna Gade about trusting AI, the technical challenges of ML monitoring, and the real-world problems beyond compliance that explainability can address.
Sponsorship inquiries: sponsor@softwareengineeringdaily.com