For many businesses, one of the key questions that arose from the fallout of the COVID-19 pandemic was whether black swan events can be predicted with sophisticated models. We asked Maurizio Garro, Senior Lead IBOR Transition Programme, Lloyds Bank, to share his thoughts and expertise on the matter, and in this article, he analyses what black swan events are and how his models approach them.
In February 2002, Donald Rumsfeld, then US Secretary of Defense, stated:
There are known knowns. There are things we know that we know. There are known unknowns. That is to say, there are things that we now know we don't know. But there are also unknown unknowns. There are things we do not know we don't know.
Audiences initially thought this was nonsense, but the statement makes sense and expresses the concept of the black swan. Of course, the term black swan goes much further back in human history: the Roman poet Juvenal wrote of a “rara avis”: “A rare bird upon the earth, and very much like a black swan”.
The black swan has again become a hot topic since the COVID-19 pandemic, as many have described the outbreak as a black swan. According to Nassim Nicholas Taleb, it is not: he has criticised those who describe the COVID-19 outbreak as a black swan event. Put simply, Taleb argues that pandemics are predictable because they are statistically almost inevitable, given enough time, enough disruption of wild habitats, cruel breeding of animals (allowing their diseases to infect humans), and enough mass movement of people and goods around the world. Bill Gates, Laurie Garrett, and others have made a similar point. Finally, it is well known that many medical researchers are working to identify a universal vaccine to deal with the next pandemic.
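The "almost inevitable, given enough time" argument can be illustrated with a little arithmetic. The sketch below is not taken from Taleb: it assumes, purely for illustration, that a pandemic has the same small probability of starting in any given year and that years are independent of each other. The 2% annual figure is a hypothetical input, not an estimate.

```python
def probability_at_least_once(annual_probability, years):
    """Chance that an event occurs at least once over a horizon.

    Assumes a constant annual probability and independence between years,
    which is a simplification made for illustration only.
    """
    if not 0.0 <= annual_probability <= 1.0:
        raise ValueError("annual_probability must be between 0 and 1")
    # Probability of no event in any year, subtracted from 1
    return 1.0 - (1.0 - annual_probability) ** years


# Hypothetical 2% chance per year, viewed over longer and longer horizons
for horizon in (1, 10, 50, 100):
    chance = probability_at_least_once(0.02, horizon)
    print(f"{horizon:>3} years: {chance:.1%}")
```

Under these assumptions, an event that looks negligible in any single year becomes more likely than not over a long enough horizon, which is why such an event is hard to describe as unknowable.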
Now a step back: how do we define a black swan? Taleb describes a black swan as an event that
- is beyond normal expectations and so rare that even the possibility that it might occur is unknown,
- has a catastrophic impact when it does occur, and
- is explained in hindsight as if it were actually predictable.
Now we can focus on a key question: can a model predict a black swan event? If we apply Taleb’s definition, the answer is no. However, in recent years we have made good progress in the predictability of black swan-like events. Artificial intelligence (AI) and its subset, machine learning (ML), together with the availability of big data and supporting technology, have greatly improved our understanding of some events that could be classified as black swans. On this basis, I classify black swans into two categories: pure black swan events (e.g. 9/11) and temporary black swan events, which we could have predicted if we had leveraged the information available to us and/or had more advanced technology to analyse it.
To explain my view better, I will go back to the COVID-19 outbreak mentioned earlier. While no one could have predicted the spread of COVID-19 in China and then to the rest of the world, there are elements of globalisation, described above, that we can consider risk factors for the COVID-19 pandemic. This is the core of my view. We cannot predict the black swan per se, but we can predict the combined effects of the risk factors (proxies) of the black swan we are studying.
Of course, there are many challenges, including the classic one of how to treat outliers. Every time we develop a model, the starting point is to define the data we are going to use to train and test it. When we analyse the data, we first perform a data quality check to assess whether the data are sufficient and reliable, whether any values are missing, and so on. We also use statistics to identify outliers and decide whether to eliminate them from the sample. Here there is a trade-off between the explanatory power the outliers can provide and the bias they can introduce into the model’s outcome. This is the challenge, and it is particularly important when we want to predict a black swan. It is essential to determine whether an outlier is an extreme observation of a phenomenon or a random event that we can ignore.
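As a minimal sketch of the outlier step described above, the example below flags extreme observations with a robust score based on the median and the median absolute deviation (MAD), then compares the sample mean with and without them. This is one common technique chosen for illustration, not necessarily the method used in my own models; the data, the function names and the 3.5 threshold are all assumptions.

```python
from statistics import mean, median


def modified_z_scores(values):
    """Robust z-scores built from the median and the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; the robust score is undefined here")
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values, threshold=3.5):
    """Return one boolean per observation: True if it is flagged as an outlier."""
    return [abs(z) > threshold for z in modified_z_scores(values)]


# Hypothetical series of periodic changes with one extreme observation
returns = [0.4, -0.2, 0.1, 0.3, -0.5, 0.2, -12.0, 0.0, -0.1, 0.3]

flags = flag_outliers(returns)
kept = [r for r, is_outlier in zip(returns, flags) if not is_outlier]

print("flagged:", [r for r, is_outlier in zip(returns, flags) if is_outlier])
print("mean with outliers:   ", round(mean(returns), 3))
print("mean without outliers:", round(mean(kept), 3))
```

The two means make the trade-off visible: dropping the flagged point gives a calmer estimate, but it also removes the only observation that carries information about extreme behaviour. The code can flag a candidate; deciding whether it is an extreme observation of the phenomenon or random noise remains a judgement call.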
On this point, we can consider how current AI/ML models will treat the present period, in which the COVID-19 pandemic has changed the way everyone, from individuals to governments in every corner of the world, operates. There are different options, such as cutting this period out, recalibrating the current sample to smooth any outliers, or creating an ad hoc sample to build models for extreme events (not necessarily black swans). There is no right or wrong choice here; what matters most is consistency between the chosen approach and the scope and use of our research.
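The three options can be sketched side by side on a toy series. Everything here is hypothetical: the values, the flags marking the stress period, the winsorisation bounds and the function names are assumptions made for illustration, and capping values at fixed bounds is only one of several ways to "smooth" a sample.

```python
def exclude_period(observations, in_stress_period):
    """Option 1: cut the stress period out of the sample."""
    return [x for x, s in zip(observations, in_stress_period) if not s]


def winsorize(observations, lower, upper):
    """Option 2: keep every observation but cap it at chosen bounds."""
    return [min(max(x, lower), upper) for x in observations]


def split_regimes(observations, in_stress_period):
    """Option 3: build a separate ad hoc sample for extreme events."""
    normal = [x for x, s in zip(observations, in_stress_period) if not s]
    stress = [x for x, s in zip(observations, in_stress_period) if s]
    return normal, stress


# Hypothetical monthly changes; True marks months inside the stress period
monthly_change = [1.0, 0.8, 1.2, -9.5, -7.0, 4.0, 1.1, 0.9]
in_stress = [False, False, False, True, True, True, False, False]

trimmed = exclude_period(monthly_change, in_stress)
capped = winsorize(monthly_change, lower=-2.0, upper=2.0)
normal_sample, stress_sample = split_regimes(monthly_change, in_stress)

print("excluded:  ", trimmed)
print("winsorised:", capped)
print("normal:    ", normal_sample)
print("stress:    ", stress_sample)
```

Each option produces a different training sample, so the choice should follow from what the model is for: a model of normal conditions may tolerate the first two, while a model intended for extreme events needs the separate stress sample.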
It is important to leverage events like the COVID-19 pandemic and previous financial crises to learn about the general rather than the specific. It is essential to recognise that we live in a more dynamic, fast-moving, interconnected world. I would consider the following points important for improving the predictability of our models: intense and thorough data scrutiny, identification of the key risk factors (proxies of black swan events), robust and flexible calibration, and continuous monitoring of models’ performance and limitations.