Balancing model accuracy and explainability of machine learning models
In early 2018, a team of ATB data scientists competed against some of the top teams in the world in a predictive analytics Kaggle competition. Although there are many differences between applying their skills in competition and in their everyday work, there were valuable lessons to be learned. Our team was able to explore methodologies, test a broad selection of feature engineering strategies, and try high-efficiency models like LightGBM and XGBoost.
There are key differences between professional application of data science and participating in a Kaggle data science competition.
In a Kaggle competition:
the focus is to win by building a highly predictive model. In real life, the goal is to get useful results that can be interpreted and acted on;
success is measured by ranking criteria such as the AUC (area under the curve) score. In real life, predictive models have real-world impacts, consequences, and both quantitative and qualitative value assessments;
any technique that increases accuracy is fair game, even if the final model cannot be fully explained. In real life, it is crucial to understand the input features and their interactions in a predictive model to ensure that the model is robust, compliant, and can be trusted; and
the jump from 96.83 per cent accuracy to 97.85 per cent might put you in the top five per cent of competitors. In real life, 85 per cent accuracy might be the target model fit, based on many factors to weigh, including ease of model deployment and management.
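The AUC mentioned above has a simple interpretation: it is the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. A minimal sketch, with made-up labels and scores for illustration:

```python
# AUC (area under the ROC curve) computed directly from its probabilistic
# definition: the fraction of positive/negative pairs where the positive
# example gets the higher score (ties count as half a win).
def auc(labels, scores):
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative data only: three positives, three negatives.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.3, 0.8, 0.4]
print(auc(labels, scores))  # 1.0: every positive outscores every negative
```

Note that AUC measures ranking quality only; two models with the same AUC can still differ greatly in calibration, deployment cost, and explainability, which is why it is rarely the sole criterion outside a competition.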
In banking, we are impacting people’s lives. So it is not just about putting the highest-accuracy model into production. Equally important, we must make sure that whatever we proceed with works to the benefit of our customers and Albertans: ethics, transparency, and compliance are pivotal model design paradigms.
While high-efficiency models can be exciting and beneficial, the availability of elastic compute engines and black-box solutions puts additional scrutiny on data scientists to ensure that the models being used to make important decisions have been thoroughly validated. Interpretability grows in importance when applying machine learning techniques to real-world problems.
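One widely used model-agnostic interpretability technique (a general method, not a description of ATB's specific practice) is permutation importance: shuffle one feature column, re-score the model, and see how much accuracy drops. A self-contained sketch with a toy model and illustrative data:

```python
import random

# Permutation importance: shuffle one feature at a time and measure the
# accuracy drop; a large drop means the model leans heavily on that feature.
# The toy "model" below just thresholds feature 0, so feature 0 should
# matter and feature 1 should not.
def toy_model(row):
    return 1 if row[0] > 0.5 else 0

def accuracy(model, X, y):
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature, n_repeats=30, seed=0):
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    drops = []
    for _ in range(n_repeats):
        col = [row[feature] for row in X]
        rng.shuffle(col)  # break the feature's relationship to the labels
        X_perm = [row[:feature] + [v] + row[feature + 1:]
                  for row, v in zip(X, col)]
        drops.append(baseline - accuracy(model, X_perm, y))
    return sum(drops) / n_repeats

X = [[0.9, 0.1], [0.8, 0.9], [0.2, 0.8], [0.1, 0.2]]  # made-up examples
y = [1, 1, 0, 0]
print(permutation_importance(toy_model, X, y, feature=0))  # clearly positive
print(permutation_importance(toy_model, X, y, feature=1))  # 0.0: unused feature
```

Because the technique treats the model as a black box, it applies equally to a gradient-boosted ensemble or a neural network, which is part of why it is a common first validation step for otherwise opaque models.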
For example, in a Kaggle competition, ensemble modeling is a commonly used technique that can provide a huge lift in the accuracy of a predictive model. If applied in a real-life setting, however, we would need to apply techniques to keep the model explainable, including continuous validation and quality assurance, so we can be sure that we are impacting our customers’ lives in a positive way.
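A simple form of the ensembling described above is soft voting: averaging the predicted probabilities of several base models. A minimal sketch; the model names and numbers below are stand-ins for illustration, not results from the competition:

```python
# Soft-voting ensemble: average the predicted probabilities of several
# base models, one probability per example per model.
def ensemble_average(prediction_sets):
    # prediction_sets: list of per-model probability lists
    n_models = len(prediction_sets)
    return [sum(preds) / n_models for preds in zip(*prediction_sets)]

# Stand-in predictions for three examples from three hypothetical models.
model_a = [0.91, 0.12, 0.78]   # e.g. a gradient-boosted tree model
model_b = [0.85, 0.20, 0.70]   # e.g. a logistic regression
model_c = [0.88, 0.10, 0.95]   # e.g. a neural network
blended = ensemble_average([model_a, model_b, model_c])
print(blended)
```

Averaging tends to cancel out the individual models' uncorrelated errors, which is where the accuracy lift comes from, but it also multiplies the validation burden: each base model, plus the blend itself, now has to be explained and monitored.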
One of the greatest benefits of competing in a Kaggle competition is the opportunity to expose ourselves to new thoughts and ideas, both in the application of algorithms and in expanding our understanding of the impact of AI on the world. That is why one of the focus areas of ATB’s AI Lab, launched in April 2018, is the ethical application of AI for people and society.
The AI Lab also builds on our learnings about the value of bringing together diverse thinkers. We are working with professors and graduate students from the University of Alberta across the fields of Math, Science, and Economics to help find the best solutions to customer problems at ATB.
In the same way that great breakthroughs can be achieved in Kaggle competitions by leveraging team members with diverse backgrounds and applying newer techniques and strategies, in real life we are bringing theoretical research, powered by a team from various disciplines, into production at our AI Lab. That diversity of thought is what leads to success.
We’re excited to continue our exploration of an AI-powered future while remaining keenly aware of the tension between the blind pursuit of model accuracy and the importance of keeping human ethics at the center of model design.
Are you interested in joining us as we work towards building an AI-powered future grounded in ethics? Explore our open positions here.