By applying nonlinear neural networks and explainable machine learning methods (e.g., layer-wise relevance propagation or integrated gradients), we aim to disentangle forced climate patterns from internal variability in observations and large ensembles. In particular, we are interested in using these methods to detect biases and differences in how fully-coupled climate models simulate compound extreme events, internal variability, and forced trends. Explainable AI methods can be used as another tool to understand physical mechanisms in the climate system.
Comparing climate models and observations in the Arctic
The Arctic is warming 3-4x faster than the globally averaged mean temperature trend. In addition to sea-ice loss, numerous other feedbacks in the climate system contribute to this rapid warming (Druckenmiller et al. 2021). In fact, some studies have pointed out that the Arctic is transitioning to an entirely new state (Landrum and Holland, 2020). To understand the consequences of this rapid warming, it is especially important to evaluate the validity of current and future projections from climate models.
Since the development of the earliest forms of climate models, scientists have used statistical methods with different levels of complexity to evaluate their performance (e.g., root-mean-square error (RMSE), pattern correlations, model weighting, emergent constraints, principal component analysis, etc.) (see Gleckler et al. 2016). However, there can be issues with these approaches (e.g., Willmott et al. 2017), such as linear assumptions about the data or only considering point-by-point statistics. Given the recent successes in using machine learning to identify regional climate patterns (e.g., Barnes et al. 2020), we decided to consider whether this framework could also be used for comparing climate models and observations.
Due to the growing availability of computational resources, climate centers frequently run their climate model simulations many times, with each run differing only by small tweaks to the initial conditions. These types of experiments are called large ensembles, and they are useful for evaluating internal variability in the climate system (Maher et al. 2021). In other words, this is the randomness/chaos/noise of our atmosphere. Given that big datasets are needed to train, evaluate, and limit overfitting for useful machine learning models, climate model large ensemble experiments are particularly well-suited for addressing all sorts of scientific questions.
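As a rough illustration of why large ensembles help separate the forced response from internal variability, the sketch below averages across ensemble members and treats each member's departure from that average as noise. Everything in it is hypothetical: the ensemble size, the linear warming rate, and the noise amplitude are made-up values for a toy global-mean time series, not output from any climate model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy ensemble: 40 members, 100 years, one global-mean value per year.
n_members, n_years = 40, 100
years = np.arange(n_years)
forced_true = 0.02 * years                                  # assumed linear forced warming
noise = rng.normal(0.0, 0.15, size=(n_members, n_years))    # assumed internal variability
ensemble = forced_true + noise                              # shape (members, years)

# Averaging across members damps the uncorrelated noise,
# leaving an estimate of the forced response.
forced_estimate = ensemble.mean(axis=0)

# Each member's departure from the ensemble mean is its internal variability.
internal = ensemble - forced_estimate

# The across-member standard deviation summarizes the noise amplitude per year.
spread = internal.std(axis=0, ddof=1)

print(f"largest error in forced estimate: {np.abs(forced_estimate - forced_true).max():.3f}")
print(f"mean ensemble spread: {spread.mean():.3f}")
```

The more members there are, the smaller the residual noise left in the ensemble mean, which is one reason these experiments are attractive as training data.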
Considering these two pieces – Arctic climate change and machine learning – we evaluate climate model large ensembles and observations using a statistical method called neural networks. Specifically, in our new study (Labe and Barnes, 2022), we input yearly maps of near-surface air temperature from 7 different climate models and ask the neural network if it can identify which climate model produced each temperature map. Interestingly, we find that the neural network quickly learns to accurately match each temperature map to the climate model that produced it. To find out how the neural network is doing so, we leverage machine learning explainability methods to identify the regional temperature patterns that are unique to each climate model large ensemble. These regional patterns often align with known climate model biases at the ice-ocean interface, such as over the marginal ice zone of the Greenland and Barents Seas.
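The classification task described above can be sketched with a deliberately simplified stand-in. The example below is not the network from the study: it uses a multinomial logistic (softmax) classifier, which is equivalent to a neural network with no hidden layers, and it is trained on synthetic "maps" rather than climate model output. The number of models, the grid size, the bias patterns, and the learning rate are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

n_models, n_maps, n_grid = 3, 200, 64   # hypothetical sizes; 64 = flattened 8x8 map

# Assumption: each synthetic "climate model" has its own fixed regional bias pattern.
bias_patterns = rng.normal(0.0, 1.0, size=(n_models, n_grid))

# Labelled maps = bias pattern + random noise standing in for internal variability.
X = np.concatenate([bias_patterns[k] + rng.normal(0.0, 1.0, size=(n_maps, n_grid))
                    for k in range(n_models)])
y = np.repeat(np.arange(n_models), n_maps)

# Shuffle, then split into training and testing sets.
order = rng.permutation(len(y))
X, y = X[order], y[order]
n_train = int(0.8 * len(y))
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

W = np.zeros((n_grid, n_models))
b = np.zeros(n_models)
onehot = np.eye(n_models)[y_train]
learning_rate = 0.05

# Plain gradient descent on the mean cross-entropy loss.
for _ in range(500):
    probs = softmax(X_train @ W + b)
    grad_logits = (probs - onehot) / len(y_train)
    W -= learning_rate * (X_train.T @ grad_logits)
    b -= learning_rate * grad_logits.sum(axis=0)

test_probs = softmax(X_test @ W + b)
accuracy = (test_probs.argmax(axis=1) == y_test).mean()
print(f"test accuracy: {accuracy:.2f}")
```

Because each synthetic model carries a consistent spatial fingerprint, even this linear classifier can tell them apart; the columns of `W` then indicate which grid points matter for each class, which is the kind of information explainability methods recover for deeper networks.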
Finally, we input maps of yearly Arctic temperatures from observations and ask the neural network to associate each one with one of the climate models. As a method of evaluating the skill of each observational map classification, we sort the climate models according to the confidence of the neural network output. We also compare this ranking with traditional climate model evaluation methods, such as RMSE, and find they compare surprisingly well despite the large differences in statistical approaches.
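One way to picture the ranking step is shown below. It is a sketch under stated assumptions: the model names, the class probabilities, and the small "maps" are invented numbers, and averaging the network's output probabilities over all observed years is just one plausible way to turn confidence into a ranking, not necessarily the procedure used in the study.

```python
import numpy as np

model_names = ["model_A", "model_B", "model_C"]   # hypothetical labels

# Assumed network output: one row of class probabilities per observed yearly map.
obs_probs = np.array([
    [0.70, 0.20, 0.10],
    [0.55, 0.35, 0.10],
    [0.20, 0.60, 0.20],
    [0.65, 0.25, 0.10],
])

# Rank models by mean confidence across all observed maps (highest first).
mean_conf = obs_probs.mean(axis=0)
conf_rank = [model_names[i] for i in np.argsort(-mean_conf)]

# Traditional comparison: RMSE between an observed map and each model's mean map.
# These tiny four-point "maps" are made-up values.
obs_map = np.array([1.0, 2.0, 3.0, 4.0])
model_maps = np.array([
    [1.1, 2.1, 2.9, 4.2],
    [1.5, 2.5, 3.5, 4.5],
    [0.0, 1.0, 4.5, 6.0],
])
rmse = np.sqrt(((model_maps - obs_map) ** 2).mean(axis=1))
rmse_rank = [model_names[i] for i in np.argsort(rmse)]   # lowest error first

print("ranked by network confidence:", conf_rank)
print("ranked by RMSE:", rmse_rank)
```

In this toy case the two rankings agree by construction; with real data, the interesting question is how often and where they diverge.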
In summary, we leverage explainability methods in machine learning to identify differences between climate models and observations in the Arctic. One advantage of this approach is that we can address potential nonlinearities in the climate system and compare patterns of regional variability across the Arctic. In agreement with recent studies, we find that neural networks can be valuable tools for addressing patterns of climate change and variability in large ensemble modeling experiments.
Predicting temporary slowdowns in decadal warming
While there is a growing demand from government officials and other stakeholders for decadal climate predictions (e.g., Hermanson et al. 2022), forecasts are usually provided by modeling centers running computationally expensive Earth system models. Despite the increasing interest in explainable artificial intelligence (XAI) for weather and climate science (Boukabara et al. 2021), there has still been very little work on using machine learning methods for prediction on seasonal-to-decadal timescales (e.g., Gordon et al. 2021; Toms et al. 2021; Gordon and Barnes, 2022).
Meanwhile, the early 2000s temporary warming slowdown (also known as the “climate change hiatus/pause” – see Medhaug et al. 2017 and Wei et al. 2022) revealed a large gap between the multi-model mean from fully-coupled climate models and real-world observations. Thus, in our new study (Labe and Barnes, 2022), we were motivated to leverage recently adopted XAI tools for climate science and evaluate the potential predictability of similar warming slowdown events occurring in a climate model and observations. To assess whether these temporary slowdowns in decadal climate warming trends are predictable, we trained an artificial neural network on maps of upper ocean heat content anomalies to predict whether a slowdown event will occur within the next 10-year period. Specifically, we use data from the newly released CESM2 Large Ensemble, which is a climate model experiment designed to assess the influences of internal variability in the climate system.
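To make the prediction target concrete, the sketch below labels slowdown events in a synthetic global-mean temperature series by computing the linear trend over each forward-looking 10-year window. The threshold used here (half of the long-term trend) is purely an assumption for illustration, as are the warming rate and noise level; the study's actual event definition is not given in the text above.

```python
import numpy as np

def decadal_trends(series, window=10):
    """Least-squares slope over each forward-looking `window`-year span."""
    t = np.arange(window)
    n_windows = len(series) - window + 1
    return np.array([np.polyfit(t, series[i:i + window], 1)[0]
                     for i in range(n_windows)])

rng = np.random.default_rng(2)
years = np.arange(1990, 2090)
elapsed = years - years[0]

# Hypothetical global-mean temperature anomaly: steady warming plus year-to-year noise.
gmst = 0.03 * elapsed + rng.normal(0.0, 0.1, size=years.size)

trends = decadal_trends(gmst)                 # one trend per possible start year
long_term_trend = np.polyfit(elapsed, gmst, 1)[0]

# Assumed definition: a slowdown is a decade warming at less than half the long-term rate.
threshold = 0.5 * long_term_trend
slowdown = trends < threshold                 # True where the next 10 years are a slowdown

start_years = years[: len(trends)]
print("slowdown decades begin in:", start_years[slowdown])
```

In a prediction setup like the one described, each boolean in `slowdown` would serve as the label paired with the input map from the corresponding start year.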
Notably, despite only training on single maps of upper ocean heat content for each year, we find that the skill of our neural network is substantially better than random chance and higher than that of logistic regression models using predictors like global mean (sea) surface temperature or the Interdecadal Pacific Oscillation. We further use a machine learning explainability tool to peer into the “black box” and highlight that our neural network is using physically consistent mechanisms for its correct predictions, which resemble transitions in the phase of the Interdecadal Pacific Oscillation.
Lastly, we test the skill of the neural network on observations and find that it successfully predicts the early 2000s slowdown using single annual mean maps of upper ocean heat content. This suggests that patterns of ocean heat content variability in the CESM2 large ensemble may be consistent with the real world for some types of decadal warming slowdown events. In summary, we believe that even this simple neural network highlights the promising future for using machine learning tools in a wide variety of decadal climate prediction problems.
Disentangling aerosols and greenhouse gases
Aerosols (particles in the atmosphere) have an important influence on Earth’s climate (IPCC, 2013). On one hand, they can block incoming solar radiation, which acts as a cooling mechanism. On the other hand, some aerosols, like black carbon, can induce warming by absorbing solar energy. In general, aerosols remain a highly uncertain climate forcing (Bellouin et al. 2019). Recent studies have also shown that global climate model simulations are highly sensitive to the amount of human-caused (anthropogenic) aerosols emitted during the 20th century (Dittus et al. 2020; Fyfe et al. 2021).
Thanks to a growing number of supercomputers, climate modelers have started to run their simulations over and over again. These types of experiments are called large ensembles, and they are produced by slightly tweaking the initial conditions of a climate model using a small round-off error (Deser et al. 2020b). Large ensembles have the advantage of allowing scientists to explore internal variability – in other words, the noise in the climate system (an example is shown in the animation). These large ensembles are often prescribed with realistic greenhouse gases and anthropogenic aerosols over the 20th and 21st centuries (Deser et al. 2020a). Since both greenhouse gases and aerosols can have (non)linear interactions that affect regional and global climate change, it is difficult for scientists to directly attribute their individual causality. To address this issue, scientists at the National Center for Atmospheric Research (NCAR) have developed a new set of large ensemble simulations that are forced by different combinations of anthropogenic aerosols and greenhouse gases. For example, in one experiment, greenhouse gases evolve realistically from 1920 to 2080, while industrial aerosols are held fixed to 1920 levels. This combination of single-forcing simulations allows us to disentangle their influence on climate change and variability.
In our new study (Labe and Barnes, 2021), we use a novel pattern recognition-like method from an artificial neural network (ANN) to compare regional climate change signals across the single-forcing large ensembles. Our relatively shallow (few hidden layers) ANN is trained on inputs of surface temperature maps from a climate model, and then it outputs the year/decade of those maps as its prediction. As shown in earlier studies (Barnes et al. 2019; Barnes et al. 2020), the ANN architecture here uses regional patterns of information in order to predict the year of the input maps. To find these climate indicators, we use a method of explainable artificial intelligence (XAI) called layer-wise relevance propagation (LRP; Toms et al. 2020), which allows us to “see” the regions that the ANN relies on for its predictions. Finally, after training our model on the large ensemble simulations, we can test the ANN by inputting maps of real-world observations. By using an ANN, we consider potential nonlinearities in regional climate signals that evolve over time due to greenhouse gases and industrial aerosols, which may not be easily captured by traditional statistical methods.
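The idea behind layer-wise relevance propagation can be sketched on a tiny network. The example below applies the epsilon rule, one common LRP variant; which rule the study used is not stated in the text above, so treat that choice as an assumption. The network here is untrained, bias-free, and filled with random weights, and the layer sizes are hypothetical. It only demonstrates the mechanics: the output value is redistributed backward, layer by layer, onto the input grid points in proportion to each point's contribution.

```python
import numpy as np

rng = np.random.default_rng(3)

n_grid, n_hidden = 12, 6                 # hypothetical sizes
W1 = rng.normal(0.0, 0.5, size=(n_grid, n_hidden))   # input -> hidden weights
W2 = rng.normal(0.0, 0.5, size=(n_hidden, 1))        # hidden -> output weights
x = rng.normal(0.0, 1.0, size=n_grid)    # one flattened input "map"

# Forward pass through a bias-free network with one ReLU hidden layer.
a1 = np.maximum(x @ W1, 0.0)
out = a1 @ W2                            # shape (1,), e.g. the predicted year

def lrp_epsilon(a, W, R_out, eps=1e-9):
    """Redistribute relevance R_out from a layer's outputs onto its inputs `a`.

    Each input i receives a_i * w_ij / z_j of output j's relevance,
    where z_j = sum_i a_i * w_ij and eps guards against division by zero.
    """
    z = a @ W
    z_stab = z + eps * np.where(z >= 0, 1.0, -1.0)
    s = R_out / z_stab
    return a * (W @ s)

# Start with the output value as total relevance and propagate backward.
R_hidden = lrp_epsilon(a1, W2, out)
R_input = lrp_epsilon(x, W1, R_hidden)

print("network output:", out[0])
print("sum of input relevance:", R_input.sum())
print("most relevant grid point:", int(np.abs(R_input).argmax()))
```

For a bias-free network like this one, the input relevances sum (up to the small epsilon term) to the output, so `R_input` reshaped onto the map grid would give a heatmap of which regions drove the prediction.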
In summary, we find that Southeast Asia, the Southern Ocean, and the North Atlantic Ocean are key regions that the ANN relies on to make a prediction. The patterns of relevance also differ between the aerosol and greenhouse gas simulations. In agreement with recent work (e.g., Dagan et al. 2020), our LRP method reveals that anthropogenic aerosols have had an important role in surface temperature trends near the North Atlantic Warming Hole. Finally (and perhaps most interesting), we find that the yearly predictions based on real-world observations (from 1920 to 2015) correlate more closely with the actual data after training the ANN on the large ensemble with industrial aerosols held fixed to 1920 levels. This correlation is slightly higher than from the ANN trained on the more realistic large ensemble simulation (with both time-evolving aerosols and greenhouse gases). Broadly, this work supports recent studies that reveal how global climate models may be overly sensitive to aerosols when compared to observations in the 20th century. Our study shows how XAI methods can be a valuable tool for identifying the timing of emergence of regional climate change signals.
 Labe, Z.M. and E.A. Barnes (2022), Comparison of climate model large ensembles with observations in the Arctic using simple neural networks. Earth and Space Science, DOI:10.1029/2022EA002348
[Plain Language Summary]
 Labe, Z.M. and E.A. Barnes (2022), Predicting slowdowns in decadal climate warming trends with explainable neural networks. Geophysical Research Letters, DOI:10.1029/2022GL098173
[Plain Language Summary][DOE Research Highlight]
 Labe, Z.M. and E.A. Barnes (2021), Detecting climate signals using explainable AI with single-forcing large ensembles. Journal of Advances in Modeling Earth Systems, DOI:10.1029/2021MS002464
[Plain Language Summary][Data Skeptic Podcast]
 Po-Chedley, S., J.T. Fasullo, N. Siler, Z.M. Labe, E.A. Barnes, C.J.W. Bonfils, and B.D. Santer (2022). Internal variability and forcing influence model-satellite differences in the rate of tropical tropospheric warming. (submitted)
 Labe, Z.M., E.A. Barnes, and J. Hurrell. Detecting the regional emergence of climate signals with machine learning in a set of stratospheric aerosol injection simulations, 2022 American Geophysical Union Annual Meeting, Chicago, IL (Dec 2022).
 Po-Chedley, S., E.A. Barnes, C. Bonfils, J. Fasullo, Z.M. Labe, B. Santer, and N. Siler. Substantial contribution of internal variability to satellite-era tropospheric warming inferred from CMIP6 large ensembles, 2022 American Geophysical Union Annual Meeting, Chicago, IL (Dec 2022).
 Labe, Z.M. and E.A. Barnes. Temporary slowdowns in decadal warming predictions by a neural network, CLIVAR Climate Dynamics Panel (CDP) annual workshop: External versus internal variability on decadal and longer time scales, Virtual Workshop (Oct 2022).
 Po-Chedley, S., E.A. Barnes, C. Bonfils, J. Fasullo, Z.M. Labe, B. Santer, and N. Siler. Internal Variability and Forcing Influence Model-satellite Differences in the Rate of Tropical Tropospheric Warming, Asia Oceania Geosciences Society 19th Annual Meeting, Virtual Conference (Aug 2022).
 Labe, Z.M. Learning new climate science by thinking creatively with machine learning, GFDL/AOS Summer Internship Lecture Series, Princeton University, NJ (Jun 2022).
 Labe, Z.M. and E.A. Barnes. Using neural networks to predict temporary slowdowns in decadal climate warming trends, 27th Annual CESM Workshop, Virtual Workshop (Jun 2022).
 Po-Chedley, S., E.A. Barnes, C. Bonfils, J. Fasullo, Z.M. Labe, B. Santer, and N. Siler. Internal variability influences model-satellite differences in the rate of tropical tropospheric warming, US CLIVAR: The Pattern Effect: Coupling of SST Patterns, Radiative Feedbacks, and Climate Sensitivity Workshop, Boulder, CO (May 2022).
 Labe, Z.M. and E.A. Barnes. Using explainable neural networks for comparing climate model projections, 27th Conference on Probability and Statistics, Virtual Attendance (Jan 2022).
 Labe, Z.M. and E.A. Barnes. Using neural networks to explore regional climate patterns in single-forcing large ensembles, 2021 American Geophysical Union Annual Meeting, Virtual Attendance (Dec 2021) (Invited).
 Labe, Z.M. and E.A. Barnes. Evaluating global climate models using simple, explainable neural networks, 2021 American Geophysical Union Annual Meeting, Virtual Attendance (Dec 2021) (Invited).
 Labe, Z.M. Exploring climate change signals with explainable AI, NASA JPL Carbon Club, Pasadena, CA. Remote Presentation (Dec 2021) (Invited).
 Labe, Z.M. and E.A. Barnes. Decadal warming slowdown predictions by an artificial neural network, 2021 Young Scientist Symposium on Atmospheric Research (YSSAR), Colorado State University, CO (Oct 2021).
 Labe, Z.M. Assessing climate variability and change with explainable neural networks, GFDL, Princeton University, NJ. Remote Presentation (Oct 2021) (Invited).
 Labe, Z.M. Learning new climate science by opening the machine learning black box, Department of Psychology: Cognitive Brownbag Series, Colorado State University, CO (Sep 2021) (Invited).
 Labe, Z.M. and E.A. Barnes. Exploring climate model large ensembles with explainable neural networks, WCRP workshop on attribution of multi-annual to decadal changes in the climate system, Virtual Workshop (Sep 2021).
 Labe, Z.M. and E.A. Barnes. Climate model evaluation with explainable neural networks, 3rd NOAA Workshop on Leveraging AI in Environmental Sciences, Virtual Workshop (Sep 2021).
 Labe, Z.M. and E.A. Barnes. Using explainable neural networks for comparing historical climate model simulations, 2nd Workshop on Knowledge Guided Machine Learning (KGML2021), Virtual Workshop (Aug 2021).
 Labe, Z.M. and E.A. Barnes. Climate Signals in CESM1 Single-Forcing Large Ensembles Revealed by Explainable Neural Networks, 26th Annual CESM Workshop, Virtual Workshop (Jun 2021).
 Mayer, K.J., E. Gordon, Z.M. Labe, A. Mamalakis, Z. Martin, and E.A. Barnes. Explainable AI for Climate Science, Institute for Energy and Climate Research (IEK-7), Remote Presentation (Mar 2021).
 Labe, Z.M. Revealing climate change signals with explainable AI, 2021 Spring Postdoctoral Research Symposium, Remote Presentation (Mar 2021).
 Labe, Z.M. and E.A. Barnes. Disentangling Climate Forcing in Multi-Model Large Ensembles Using Neural Networks, 20th Conference on Artificial Intelligence for Environmental Science, Virtual Conference (Jan 2021).