Instrumenting Machine Learning pipelines

"Datapane is a speedy way to generate complex visualizations, and share these with non-technical people. It’s a cool tool that plugs right into our ML stack."

Company background

Proportunity is a leading UK real-estate finance company that uses cutting-edge machine learning models to power their signature product: the Proportunity Loan.

The Challenge: Inspecting an automated data pipeline

Proportunity needs an accurate assessment of a property’s value and the buyer’s creditworthiness in order to offer home loans at scale. To do so, they train and test many machine learning models each month. Since each model is complex to train and may have millions of parameters, all training and deployment runs in an automated data pipeline.

After training, each model is assessed for accuracy against historical data. If something goes wrong during the training step, it can be hard to identify the root cause from the trained model and log files alone. Before Datapane, the team manually took screenshots of the training step outputs and saved them as a record in case something went wrong, so training and testing each model took hours of manual work.

Proportunity report on Datapane showing the ’08 financial crisis impact on property prices

The Solution: Plug-in analytics reports

The Proportunity Machine Learning team added a Datapane reporting script as a step in their model pipeline after training. This script automatically generates an interactive report about the model’s training process and performance on their Datapane Teams account.
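
A reporting step of this kind could be sketched roughly as follows. This is not Proportunity's script, and it does not call the Datapane API: it is a stand-in that uses only the Python standard library to write a static HTML page plus the raw data, whereas the real step publishes an interactive report to the team's Datapane account. The function name `build_training_report`, its parameters, and the metric names are hypothetical.

```python
import html
import json
from pathlib import Path


def build_training_report(model_name, metrics, loss_history, out_dir):
    """Write a static HTML report and a raw-data JSON file for one training run.

    Hypothetical stand-in for the reporting step: it records top-level
    metrics and the per-epoch training loss so a run can be inspected later.
    Returns the path of the HTML report.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # One table row per top-level metric, sorted by name for stable output.
    metric_rows = "".join(
        f"<tr><td>{html.escape(name)}</td><td>{value:.4f}</td></tr>"
        for name, value in sorted(metrics.items())
    )

    # One table row per training epoch (epochs are numbered from 1).
    loss_rows = "".join(
        f"<tr><td>{epoch}</td><td>{loss:.4f}</td></tr>"
        for epoch, loss in enumerate(loss_history, start=1)
    )

    title = html.escape(f"Training report: {model_name}")
    page = (
        "<!DOCTYPE html>\n"
        f"<html><head><meta charset='utf-8'><title>{title}</title></head>\n"
        f"<body><h1>{title}</h1>\n"
        "<h2>Top-level metrics</h2>\n"
        f"<table><tr><th>Metric</th><th>Value</th></tr>{metric_rows}</table>\n"
        "<h2>Training loss by epoch</h2>\n"
        f"<table><tr><th>Epoch</th><th>Loss</th></tr>{loss_rows}</table>\n"
        "</body></html>\n"
    )

    report_path = out_dir / f"{model_name}_report.html"
    report_path.write_text(page, encoding="utf-8")

    # Keep the raw numbers next to the report so they can be re-analysed.
    raw_path = out_dir / f"{model_name}_raw.json"
    raw_path.write_text(
        json.dumps({"metrics": metrics, "loss_history": loss_history}, indent=2),
        encoding="utf-8",
    )
    return report_path
```

In a pipeline, a function like this would run immediately after the training step, for example `build_training_report("valuation_model", {"rmse": 0.12}, [0.9, 0.5, 0.3], "reports")`, where the model name and numbers are placeholders. Saving the raw data alongside the rendered report is what makes it possible to trace a problem back later without rerunning training.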

If the team tests a new model and sees that any top-level metrics have degraded, they can quickly trace the problem back to its root cause by looking at the Datapane report for that model, which contains detailed visuals and raw data on the training process. These interactive inspections have helped the team identify and fix issues much more quickly, and they also serve as a great medium for explaining the work of the Machine Learning team to the rest of the company.

Results: Higher accuracy, faster iterations

Adding Datapane has enabled Proportunity to fully automate their model training and deployment pipeline, confident that they can inspect any step along the way in detail. This has led to a 3x increase in the number of models tested each month, and ultimately to more accurate credit assessments and lower default rates.

