There are two players in the tug of war that is the implementation of a catastrophe (CAT) model—the user and the vendor.

Users require a sophisticated CAT model that pulls together multiple datasets to highlight risks for their portfolio. To do this successfully and with confidence, these users need to test CAT models to assess parameter sensitivity and to get a practical understanding of how a model works. Their job is made easier if a model vendor is transparent about the data and methods underpinning their model. However, just how much transparency do model users actually need to make informed decisions and manage their risk?
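
As a rough illustration, a basic parameter sensitivity check might look something like the sketch below. The model interface and the parameter being swept are entirely hypothetical; in practice the user would be driving whatever API or tooling the vendor actually provides.

```python
# A minimal sketch of a parameter sensitivity sweep against a vendor model.
# `run_cat_model` and `vulnerability_scale` are hypothetical stand-ins for
# whatever interface the vendor actually exposes.

def run_cat_model(portfolio, vulnerability_scale=1.0):
    """Placeholder vendor model call: returns an average annual loss (AAL)."""
    base_loss = sum(site["value"] for site in portfolio) * 0.01
    return base_loss * vulnerability_scale

portfolio = [
    {"id": "site-1", "value": 1_000_000},
    {"id": "site-2", "value": 2_500_000},
]

# Sweep one assumption and watch how strongly the output responds to it.
for scale in (0.8, 0.9, 1.0, 1.1, 1.2):
    aal = run_cat_model(portfolio, vulnerability_scale=scale)
    print(f"vulnerability_scale={scale:.1f} -> AAL={aal:,.0f}")
```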

The more opaque a vendor is about the inner workings of a model, the more the user is forced to rely on black box testing. In software development, a black box test considers only the external behavior of the system; the internal workings of the software are not evaluated. In contrast, as a vendor becomes more transparent about their processes, a model user can begin to use white box testing, a method that takes the system's internal workings into account and therefore extracts more value and understanding from the process.
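
In code terms, the contrast might look something like the following sketch. The toy flood-depth function and its internal roughness assumption are invented purely for illustration and don't reflect any real vendor's model.

```python
import unittest

# Toy "model" with an internal assumption (a Manning's roughness coefficient)
# that a black box tester never sees but a white box tester can inspect.
# The function and the value are invented purely for illustration.
MANNINGS_N = 0.035

def flood_depth(rainfall_mm):
    """Toy hazard function: depth in meters grows with rainfall above a threshold."""
    return max(0.0, (rainfall_mm - 20.0) * 0.001 / MANNINGS_N)

class BlackBoxTest(unittest.TestCase):
    def test_more_rain_means_deeper_water(self):
        # Only external behavior is checked: inputs go in, outputs come out.
        self.assertGreater(flood_depth(100.0), flood_depth(50.0))

class WhiteBoxTest(unittest.TestCase):
    def test_roughness_assumption_is_plausible(self):
        # The internal assumption itself is examined and judged directly.
        self.assertTrue(0.01 <= MANNINGS_N <= 0.10)

if __name__ == "__main__":
    unittest.main()
```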

Should model users be white box or black box testing?

There are well-known advantages and disadvantages to both testing methods, but current demand in the CAT modeling world is leaning towards white box testing.

Through the Vendor’s Lens

Providing the user with the information needed to perform white box testing opens the vendor up to a lot of scrutiny, especially of individual decisions that are conditional on many other decisions made in a holistic manner. The solution is to allocate time and resources to transferring knowledge to the user, so that these interdependencies are fully understood and judged fairly. Aside from the expense, there is a risk that this makes the model seem overly complicated and therefore inaccessible, ultimately putting off users who don't want to drown in the details. The payoff may still be worth the upfront effort and cost, because the relationship forged between user and vendor in this process can support a continuous feedback loop of calibration and validation.

Many model vendors draw from similar datasets, often open source or easily accessible to anyone, when creating their models, and then process these datasets differently. Some vendors may be concerned that by disclosing their sources they are admitting there is overlap between their own datasets and their competitors'. There is nothing wrong with this overlap; there is only so much data available, and model vendors try to train their models on as much of it as possible. The value added by the vendor comes from the processing of that data and the implementation of their methods.

It is widely accepted that there are inherent limitations to catastrophe models (see this recent article by Professor Paul Bates for more information). Examples include:

  1. Boundary condition errors. If something happens frequently, it's more likely that multiple observational datasets will exist, which makes it easier to remove the impact of external factors or spot faults in a simulation. But for low-probability conditions (i.e. boundary conditions), validating results is much trickier: errors are indistinguishable, and therefore unavoidable.
  2. Validating models using incomplete data. Few studies have been done to validate our 'best' satellite data, and where they have been completed, the results show that the data isn't perfect. How can we validate the accuracy of one dataset against another when our best validation data can carry up to a 30 percent margin of error?
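
To make the second point concrete, here is a rough numerical sketch (all cell counts are invented for illustration) of how a 30 percent error band in the validation data blurs a simple skill score such as the critical success index.

```python
# A rough numerical illustration (invented cell counts, not real data) of how
# a ~30% error in the validation data itself blurs a flood-model skill score.

def critical_success_index(hits, false_alarms, misses):
    """CSI = hits / (hits + false alarms + misses); 1.0 is a perfect match."""
    return hits / (hits + false_alarms + misses)

# Cell-by-cell comparison of a modelled flood extent against a satellite map.
hits, false_alarms, misses = 700, 150, 150
nominal = critical_success_index(hits, false_alarms, misses)

err = 0.30  # assume up to 30% of the satellite classifications may be wrong

# Best case: the disputed cells are mostly observation errors, not model errors.
optimistic = critical_success_index(hits, false_alarms * (1 - err), misses * (1 - err))

# Worst case: some of the apparent agreement is down to observation errors too.
pessimistic = critical_success_index(hits * (1 - err), false_alarms + hits * err, misses)

print(f"nominal CSI:   {nominal:.2f}")
print(f"possible band: {pessimistic:.2f} to {optimistic:.2f}")
```

With these invented numbers, the same model could plausibly score anywhere from roughly 0.49 to 0.77, a spread wide enough to swallow most of the differences between competing models.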

These limitations call for more research in the catastrophe modeling field to better characterize uncertainties. A vendor's transparency about how their model is built goes hand in hand with exposing the model's fundamental limitations, too. In exposing these constraints, the vendor must put their faith in the user: that the user understands how these restrictions affect the interpretation of the model's outputs, and recognizes where these limits are industrywide rather than a shortcoming of one specific vendor.

Through the User’s Lens

White box testing helps the model user gain a better understanding of the model because they can perform more probing and directed testing from the get-go. For this type of testing to work, there must be clear and consistent communication between vendor and user, with a substantial amount of knowledge transferred in the course of scrutinizing the model. However, is there a limit to how transparent users want models to be? Understanding a model's inner workings and the reasoning behind its assumptions requires a lot of knowledge, and transferring that knowledge is expensive in time and resources for both players.

On the other hand, black box testing is cheaper and less time consuming because it doesn't involve testing on a parameter-by-parameter basis. There is also no cross-industry standard for vendor model validation and testing. Different models contain different and sometimes incomparable inputs, and are trained on different loss portfolios. This makes it difficult for a user to compare models effectively and fairly. One might argue that this makes white box testing redundant if it isn't possible to compare the outcomes: Does clarity matter if there's no way of checking whether the data is accurate?

So where does that leave us?

Model vendors are left packaging up their data as big grey boxes. Each vendor will package it differently, based on their own attitude towards transparency and external validation. This leaves model users with many different shades to choose from; call it 50 shades of grey-boxed model vendors. The market is hankering for these boxes to be of the palest grey tone, and has seen multiple organizations take a first step in this direction by adopting the Open Data Standards set out in the OASIS Loss Modelling Framework. Such a framework balances the goal of communicating model parameters, limitations and methodologies to users against the risks of overwhelming them with information too technical to digest or exposing a vendor's unique selling point (the approach or innovation that gives the vendor a competitive edge).

The easy thing to do would be to package models up as a black box. However, to get value out of a CAT model, its implementation needs to be a collaborative process, with the model packaged as transparently as possible. This helps both players maintain a continuous development cycle, and a model is far more valuable to both when each knows the subjective decisions that went into it.