When most insurers think of data science in their industry, they naturally think of numbers: turning abstract data into correlations that become actionable business intelligence. However, there is considerably more to data science than sifting through spreadsheets and building predictive models. It requires not just mathematical skill but also human judgment and sound decisions at pivotal moments.
When looking for an ideal candidate to take charge of the data that informs decisions with large-scale business impact, it’s important for insurers to identify data scientists who have an even mix of hard (technical) and soft (communication) skills.
All of the panelists (myself included) who spoke on a big data panel at the recent “Super Regional P/C Insurer Conference 2017” agreed on this point. Soft skills are crucial to moving data initiatives through your organization, getting approval from regulators and recognizing when a data correlation lacks context (which can ruin a model’s performance in production). It is important to always keep data-driven decisions in check in order to develop a sustainable data strategy within a business environment. Below are areas where human judgment is critical in data analysis.
A data scientist analyzes data with a combination of predictive modeling techniques, including generalized linear modeling (GLM), machine learning, classification trees, and other multivariate and univariate techniques. Regardless of approach, there will always be certain statistically supported variables that should not be used. A good life insurance example is whether an applicant has relatives who are felons. While this has proven to be a statistically viable metric, it cannot be used within a predictive model because it is ethically unsound.
One of the most widespread consumer-facing issues in insurance with regard to data and analytics has been price optimization. Confusion about how these models are constructed leads some to believe that unethical practices are at play, such as unfair weighting of socioeconomic factors in auto insurance rates. Deciding which variables to include or exclude is an important strategic decision shared between the data science team and the business units involved.
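As a minimal sketch of that decision, the Python below screens candidate variables against a prohibited list agreed with the business before any modeling happens. The variable names, the significance flags and the `select_variables` helper are all hypothetical, chosen only to mirror the life insurance example above; they are not drawn from any real rating plan.

```python
# Hypothetical candidate variables; True means the variable showed
# statistical support in testing (illustrative flags, not real results).
CANDIDATES = {
    "age": True,
    "smoker_status": True,
    "relative_is_felon": True,
    "favorite_color": False,
}

# Hypothetical list agreed between data science and the business units.
PROHIBITED = {"relative_is_felon"}


def select_variables(candidates, prohibited):
    """Split candidates into usable variables and rejected ones with a reason."""
    selected = []
    rejected = {}
    for name, statistically_supported in candidates.items():
        if name in prohibited:
            # Statistical support does not override the ethical judgment.
            rejected[name] = "excluded on ethical or regulatory grounds"
        elif not statistically_supported:
            rejected[name] = "not statistically supported"
        else:
            selected.append(name)
    return selected, rejected


selected, rejected = select_variables(CANDIDATES, PROHIBITED)
print("use:", selected)
for name, reason in rejected.items():
    print(f"drop {name}: {reason}")
```

Recording a reason for every rejected variable, rather than silently dropping it, is one way to keep the exclusion decision visible to the business units and to regulators.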
Correlation vs. Causation
It is the job of the data scientist to determine whether a relationship is causal. For example, one positive relationship that has survived the testing process is the one between global temperature and the increase in piracy. The two may correlate, but clearly one does not cause the other. This is an obvious example to demonstrate a point, but many cases aren’t that clear and require meticulous discretion and a lot of “connecting the dots” to determine whether there is causation.
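To illustrate how two unrelated measures can correlate simply because both trend over time, here is a small Python sketch on synthetic data. The two series, their slopes and the noise level are assumptions made up for the demonstration, and comparing period-to-period changes is only one simple sanity check, not a test for causation.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def first_differences(xs):
    """Change from one period to the next, which removes a linear trend."""
    return [later - earlier for earlier, later in zip(xs, xs[1:])]


rng = random.Random(0)  # fixed seed so the run is repeatable
periods = range(200)

# Two synthetic series that share nothing except an upward trend in time.
series_a = [1.0 * t + rng.gauss(0, 5) for t in periods]
series_b = [2.0 * t + rng.gauss(0, 5) for t in periods]

corr_levels = pearson(series_a, series_b)
corr_changes = pearson(first_differences(series_a), first_differences(series_b))

print(f"correlation of levels:  {corr_levels:.2f}")
print(f"correlation of changes: {corr_changes:.2f}")
```

The levels correlate strongly because time drives both series, while the changes show little relationship. Deciding whether a shared driver like this explains a real-world correlation still takes domain knowledge.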
Data Analysis Won’t Get Far Without Understanding
One of the most critical soft skills in data analysis is simply the ability to explain the variables the model uses and the recommendations its output supports. Transparency makes it simpler to obtain buy-in from all parties in the organization and to explain to regulators how models come to their decisions.
Take machine learning, for example. Machine learning can provide deep and accurate predictors and can often maximize model lift, but because data is continuously fed into the model without human intervention, you forgo transparency and the ability to show how the model arrived at its decisions. There are instances where this tradeoff is completely appropriate and others where it is problematic. Responsibility falls to the data scientist to know which modeling technique is appropriate to the use case and to provide clear explanations to all stakeholders involved.
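For contrast, here is a rough Python sketch of the kind of explanation a transparent model allows. It assumes a generalized linear model with a log link, in which the prediction is a base value multiplied by one factor per variable. The base premium, the coefficients and the `explain_premium` helper are hypothetical values invented for illustration, not fitted results.

```python
import math

# Hypothetical fitted values for a log-link GLM (illustrative only).
BASE_PREMIUM = 500.0
COEFFICIENTS = {"young_driver": 0.30, "prior_claim": 0.20}


def explain_premium(risk):
    """Return the premium and the multiplicative factor each variable contributed."""
    relativities = {
        name: math.exp(beta * risk.get(name, 0))
        for name, beta in COEFFICIENTS.items()
    }
    premium = BASE_PREMIUM
    for factor in relativities.values():
        premium *= factor
    return premium, relativities


premium, relativities = explain_premium({"young_driver": 1, "prior_claim": 1})
for name, factor in relativities.items():
    print(f"{name}: premium multiplied by {factor:.3f}")
print(f"final premium: {premium:.2f}")
```

Each line of output is something a stakeholder or regulator can question directly. A model that cannot produce this kind of breakdown needs a different explanation strategy.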
While data scientists will always be valued for their statistical abilities, it is important to understand that soft skills have immense value to the sustainability of data initiatives. This is particularly important to consider when hiring for the slew of data positions that are coming into the insurance industry. Data provides companies with a seemingly limitless arsenal of information that can be used to provide key business advantages, but it takes people to empirically decide which of that information is actionable and to be mindful of what will pass in a stringent regulatory environment.