Card sort determined that data being used by a machine learning model was wildly inaccurate.

How might we improve the data collected by debt collection specialists so that a machine learning model can improve rates of collection across all reasons for delinquency? A remote card sort challenged collection specialists to apply predetermined Reason for Delinquency labels, such as bankruptcy, dispute, and unemployment, to common and uncommon customer scenarios. We found that the existing data being fed into the model was wrong, so the model was sending collection specialists inaccurate recommendations for improving their collection rates.

Problem Details

A previous research effort to develop a debt collection journey map uncovered conflicting definitions of the reasons for delinquency. I determined that we needed to dig into this finding further, as these reasons were crucial to the machine learning models used by that department.

Research goals: 

- Determine if the current state list of Reasons for Delinquency is sufficient for model building

- Understand if data captured by collections specialists is consistent and accurate

Ultimately, the hypothesis was that collections specialists find the reasons for delinquency to be confusing and vague and that the historical data collected is wrong.

Method

Because this was the beginning stage of verifying the data, I used a remote hybrid card sort. Participants were asked to sort common and uncommon customer scenarios (cards) into the current state Reason for Delinquency labels (categories), or into categories they created themselves, using Optimal Workshop’s card sorting tool. The participants were anonymous collections specialists of varying tenure and specialty.

Primary Research

Analysis

The diagrams illustrate the many different ways collection specialists interpreted the customer scenarios as reasons for delinquency, including the categories they created themselves. This showed that we can’t use these reasons for delinquency to build machine learning models until the interpretations are objective rather than subjective.

60% of collection specialists felt their interpretation of delinquency labels was accurate, but the data says otherwise.

Outcome

This phase of the study identified a great deal of incongruent use of Reason for Delinquency labels.

While the scenario cards were challenging, the variety of labels used per card indicates disagreement on proper usage. Lack of clarity when categorizing an account can skew the data captured and negatively impact the modeling.
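The "variety of labels per card" can be turned into a simple number. The sketch below is one possible way to do that, not the analysis used in the study: for each card it counts the distinct labels chosen and the share of participants who picked the most common label. The scenarios, the labels, and the function names `card_agreement` and `summarize` are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical card sort results: card -> label chosen by each participant.
# These scenarios and labels are invented for illustration; they are not study data.
sorts = {
    "Customer lost their job last month": [
        "unemployment", "unemployment", "hardship", "unemployment",
    ],
    "Customer says the balance is wrong": [
        "dispute", "billing error", "dispute", "refusal to pay",
    ],
}


def card_agreement(labels):
    """Return (number of distinct labels, share of participants using the top label)."""
    counts = Counter(labels)
    top_count = counts.most_common(1)[0][1]
    return len(counts), top_count / len(labels)


def summarize(results):
    """Compute the agreement summary for every card in the results."""
    return {card: card_agreement(labels) for card, labels in results.items()}


for card, (n_labels, agreement) in summarize(sorts).items():
    print(f"{card}: {n_labels} distinct labels, {agreement:.0%} agreement")
```

A card where nearly everyone picks the same label has a clear definition. A card with many distinct labels and a low top-label share points to a label whose meaning needs clarifying before retesting.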


Recommendation: Clarify the meaning of each reason for delinquency for the collection specialists, then retest to determine whether the data can be used for a machine learning model.
