Dear readers, today we present a study received from Dr. Franco Pavese, a former CNR researcher, who wished to contribute to our environmental outreach effort by offering a paper of his on forecast models and their reliability. Since our digital space is open to contributions from anyone wishing to take part, we judged his work to be trustworthy and capable of offering stimulating elements for the debate. Enjoy the reading.
Uncertainty in the case of lack of information: extrapolating data in time, with examples on climate forecast models
Research Director (former) in CNR.IMGC (INRiM from 2006) *, email@example.com
The main scientific tool for making a prediction is the "forecast model", a mathematical model supported by observations. Basically, it describes the evolution in time of some parameters of the law(s) considered, to date, of fundamental importance in the specific case. The relevant available data are obviously limited to a past period of time, in most cases a limited one, during which the law in question is considered valid and verified with sufficient precision, while no direct information can be available about the future trend. A (usually mathematical) set of functions is extrapolated ahead in time, to show to the present and next generations what they should expect to observe in the future. A problem arises from the fact that no set of mathematical functions that could be used for a model is infinitely "flexible", i.e. apt to "correctly" interpolate any cluster of data, and the less so the fewer the parameters of the functions. A data fit is considered good when it strikes a balance between merely "copying" a behaviour in time (as when the function has to follow a given profile) and satisfactorily "averaging" that behaviour, especially over longer periods of time, without "masking" change points. Further, the uncertainty attached to each datum is often absent from the information supplied with the extrapolations. Instead, the data uncertainty must be taken into account and the relevant information must always be supplied, since the quality of the fit to the available data is vital for the quality of the subsequent extrapolation. Therefore, the forecast should rather consist of an area (typically increasing in width with time) within which the future determinations are assumed to fall, with an assigned probability. Thus, it should be quite clear that the extrapolation of past data to future times, i.e. the present estimation that can be passed to the next generations, is affected by a high risk, and that careful precautions and limitations should be adopted.
Keywords: forecast; prediction; extrapolation; models; information; uncertainty; ignorance; climate
High-level studies in metrology allow the scientist to acquire a special competence in treating experimental data, and in understanding their features and the level of confidence that one can attach to them, especially as regards uncertainty. Scientists in other disciplines, instead, often lack the necessary awareness of the need to use specific tools and refined procedures when the data analysis goes beyond simple examination.

* IMEKO Chair IMEKO TC21 and representative in ISO TC69; National Representative in IUPAC Commission I-1.
This under-evaluation is not infrequent, and it often makes a procedure like data extrapolation critical when it is not founded on solid bases: an excessive extrapolation length, though appealing for non-scientific purposes, or an insufficient control of how well constrained the relevant portion of the fitted function is, are often found, especially when the uncertainty of the available data should be taken into account.
This paper, after recalling the foundations of science, addresses the very popular issue of extrapolating experimental data ahead in time for forecast purposes (i.e., into a region where no information is yet available), and presents the difficulties and limitations intrinsic to that task and the consequent risk of propagating false information, especially in the field of thermodynamics. As an example, the paper will explicitly address the specific discipline of climate forecasting, very popular today.
2. Science as a crossroads of disciplines
Science is a complex framework and a crossroads of different disciplines. The modern method, according to the Britannica definition, is: “Scientific method: mathematical and experimental technique employed in the sciences. More specifically, it is the technique used in the construction and testing of a scientific hypothesis. The process of observing, asking questions, and seeking answers through tests and experiments is not unique to any one field of science. In fact, the scientific method is applied broadly in science, across many different fields. Many empirical sciences … use mathematical tools borrowed from probability theory and statistics, together with outgrowths of these …”.
One crossroads is between experimental observations and theoretical studies, two disciplines that are interlaced, one providing the experimental data, the other being responsible for the inference of the underlying “laws”: it is difficult to state in general which comes first in the scientific process.
Another crossroads is with philosophy, where “…Philosophers of science have addressed general methodological problems, such as the nature of scientific explanation and the justification of induction”, again from Britannica.
In both cases, one needs the findings and hypotheses to be communicated to the Community in a lexical form (oral or written). Only the “language” differs: in the case of theory it is the mathematical one, considered a universal symbolic tool (carrying no ambiguity); in the case of data, it is a universally defined and accepted symbolic language (typically algebra); in the case of the philosophical foundations, a logical symbolic language or its own idiom, i.e., a specific local language.
In all cases, it is assumed that the next generations too will be able to correctly understand the past and present communications, directly or through historical education, and to compare them with their own novel findings.
This assumption means that the previously transmitted information must remain invariant in time, but it does not mean that the whole previous knowledge remains invariant in time, because the main goal of science is to progress in knowledge, possibly also correcting, or even contradicting (Kuhn’s revolutions), the previous one.
3. Prediction in science
One of the most popular expectations of the non-scientific Community is that science allows prediction. However, that expectation quite often overlooks the fact that a prediction is always limited to within a level of confidence, in the statistical meaning, never providing certainty. Scientists, on the other hand, are expected not to ignore this fact, as they are supposed never to ignore the uncertainty associated with any of their findings or thinking, the possibility of errors in them, or of evolution affecting them.
However, as amply discussed in the philosophy of science, the problem is how to take the doubt, or the (conditional) certainty, into account. This issue is particularly critical when it concerns the top level of the knowledge pyramid: the recognised “laws of nature”, even when expressed in their least ambiguous form, the mathematical one.
Basically, a law is assessed to be valid by inter-subjective consensus until a contrast is “demonstrated”, empirically (by observations) or formally (from mathematical contradictions). Actually, there is a third possible reason, connected to the human ways of communicating with each other: a contradiction with respect to the foundations of human logic, e.g., concerning the cause-effect principle.
In conclusion, approaching truth is a “vaste programme” (how can we understand whether we are approaching it if we do not know where it is, as observed again by Kuhn?), and the scientist should be humble in this respect, because here comes another basic feature of science. It consists in the fact that science, basically, is not looking for “truth”, but simply for a consistent explanation, satisfactory to us, of facts, by means of a sufficiently long roadmap made of observations showing a sufficient degree of repeatability, and of theoretical inferences.
Along the above roadmap, a diversity of positions almost invariably, and intrinsically, confront each other, requiring time, and often adjustment, for advancing in knowledge until an issue can converge univocally and be considered acceptable by the whole Community. Nobody today still believes that the Earth is flat, but it took centuries of disputes and of experimental evidence before the certainty was reached that the spherical model is the right one for us. In this case, the conclusion may have been made easier by the fact that the discipline of mechanics can be considered simpler to manage than others, namely thermodynamics.
One of the tools used by the supporters of each position is a well-known method, very much appreciated outside science: showing how good a prediction can be obtained from the asserted position, i.e. how well the yet “unknown” appears to follow the “known”. The main scientific tool for performing the prediction is called a “forecast model”, a mathematical model supported by observations. Basically, it describes the evolution in time of some parameters of the law(s) considered, to date, of fundamental importance in the specific case.
The relevant available data are obviously limited to a past period of time, in most cases a limited one, during which the law in question is considered valid and verified with sufficient precision. That is a risky job because, in most cases, the past precision has increased with time, this being a goal of experimental science, but the data are not always “weighted” for their precision, so confidence in the precision of the function fitted to these data could already be affected by precision inhomogeneity. Then the mathematical set of functions is extrapolated ahead in time, to show to the present and next generations what they should expect to observe in the future: this is commonly done for the weather forecast, normally for a subsequent period of a few days. Why not more?
4. Modern use of prediction
The expected duration of the prediction validity depends, first, on the observed law: in the case of mechanics, e.g. the orbits of large celestial bodies like the Earth, the forecast can confidently be made for extremely long periods of time. In the case of thermodynamics (see later), most of us know very well the uncertainty of weather forecasts, a branch of it.
With the rapid diffusion of informatics, the use of computer models has grown rapidly, and has become one of the preferred tools on social media for “informing” people. This has pushed an increasing number of scientists to exercise in this risky field, risky because the border between science and politics is almost invisible, and certainly quite uncertain.
The problem then arises that no set of mathematical functions that could be used for a model is infinitely “flexible”, i.e. apt to “correctly” interpolate any cluster of data, and the less so the fewer the parameters of the functions. There is a vast literature on this subject matter, for example on detecting the “change point” of a behaviour. A fit is considered good when it strikes a balance between merely “copying” a behaviour in time (as when the function has to follow a given profile) and satisfactorily “averaging” that behaviour, especially over longer periods of time, without “masking” change points.
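As a minimal sketch of this limited “flexibility”, consider a hypothetical data set (a piecewise-linear trend with a change point, not real observations) fitted with polynomials of increasing degree: all the fits look comparable inside the observed window, yet their extrapolations beyond it diverge sharply.

```python
import numpy as np

# Hypothetical "observed" series: a gentle trend with a change point at t = 10.
# Purely illustrative values, not real measurements.
t = np.arange(0.0, 15.0)
y = np.where(t < 10, 0.1 * t, 1.0 + 0.5 * (t - 10))

# Fit the same data with polynomials of increasing degree (more parameters).
fits = {deg: np.polyfit(t, y, deg) for deg in (1, 3, 7)}

# Inside the observed window the fitted values stay close to each other...
inside = {deg: np.polyval(c, 7.0) for deg, c in fits.items()}

# ...but extrapolated well beyond it, they spread out dramatically.
outside = {deg: np.polyval(c, 25.0) for deg, c in fits.items()}
spread = max(outside.values()) - min(outside.values())
print(inside, outside, spread)
```

The higher-degree polynomial “copies” the profile (including the change point) most closely, but it is also the one whose behaviour outside the data is least constrained, which is exactly the balance discussed above.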
As a consequence, there is normally a trade-off between the number of data available, and the length of the period during which they were taken, and the future period over which the function so obtained can “safely” be extrapolated to provide a reliable forecast, i.e. remaining accurate while behaving without any constraints except those (purely mathematical) set by the function itself.
For example, if the “safe” observation time is considered to be 50 years, it is hardly possible to imagine any sensible extrapolation to a further period of the same length; and the shorter the former, the more problematic the variation in the extrapolated period can become, especially for rapidly increasing (or decreasing) function derivatives, or for non-simple shapes of the function.
Further, the data uncertainty has not even been considered yet, a piece of information that is often absent from what is supplied with the extrapolations. Instead, the data uncertainty must be taken into account and the relevant information must always be supplied (which does not happen in many instances), since the quality of the fit to the available data is vital for the quality of the subsequent extrapolation. There are instances where the data uncertainty is so large that the fit alone is already sufficient to consider the data unreliable and the extrapolation meaningless. Fitting weighted data is always advised to limit this deficiency.
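The effect of weighting can be sketched as follows, on an assumed toy series whose early points carry a much larger stated uncertainty than the recent ones (the common case where past precision was poorer); all values and uncertainties here are invented for illustration only.

```python
import numpy as np

# Illustrative linear trend; early points are both noisier and systematically
# biased, late points are precise (assumed values, not real data).
t = np.arange(10.0)
noise = np.array([1.5, 1.2, 0.9, 0.8, 0.05, -0.03, 0.04, -0.02, 0.01, 0.0])
y = 0.5 * t + noise
sigma = np.array([2.0] * 4 + [0.1] * 6)   # assumed standard uncertainties

# np.polyfit accepts per-point weights w; w = 1/sigma gives the usual
# inverse-variance weighting for least squares.
unweighted = np.polyfit(t, y, 1)
weighted = np.polyfit(t, y, 1, w=1.0 / sigma)

# The two lines, extrapolated one observation window ahead, differ visibly.
print(np.polyval(unweighted, 20.0), np.polyval(weighted, 20.0))
```

The unweighted fit lets the imprecise early points distort the slope, while the weighted fit recovers a slope close to that of the precise recent data; the discrepancy then grows with the extrapolation length.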
More frequently, supplying the results of more than one model is preferred, as a multiplicity able to allow an indirect evaluation of the possible variability of the forecast. This comparison of models can certainly mitigate the risk of false extrapolations, if made with different fitting sets of equations, of different complexity, on the same data.
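A rough sketch of such a multi-model comparison, again on an assumed synthetic series: the same data are fitted with models of different complexity, and the spread of their extrapolations at a distant horizon serves as an indirect indicator of forecast variability.

```python
import numpy as np

# Synthetic series with a weak quadratic component plus noise
# (illustrative only; the coefficients are arbitrary assumptions).
t = np.arange(0.0, 20.0)
rng = np.random.default_rng(0)
y = 0.02 * t + 0.001 * t**2 + rng.normal(0.0, 0.05, t.size)

# Three candidate models of different complexity fitted to the same data.
models = {
    "linear": np.polyfit(t, y, 1),
    "quadratic": np.polyfit(t, y, 2),
    "cubic": np.polyfit(t, y, 3),
}

# Extrapolate three observation windows ahead: deliberately too far.
horizon = 60.0
forecasts = {name: np.polyval(c, horizon) for name, c in models.items()}
spread = max(forecasts.values()) - min(forecasts.values())
print(forecasts, spread)
```

All three models describe the observed window almost equally well, yet their forecasts at the distant horizon disagree strongly; the spread is the indirect variability estimate mentioned above.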
Therefore, the forecast almost always consists of an area (typically increasing in width with time) within which the future determinations are assumed to fall, with an assigned probability. Generally the trend is monotonic, because change points usually cannot be foreseen. In a few cases they are also foreseen: the extrapolation can then also show a change in sign of the first and/or second derivative (e.g., a future lowering of the local/world human population, or the exhaustion/birth of causes of the past/present trend).
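The widening of that area with time can be illustrated with the classical prediction-interval half-width for a straight-line least-squares fit (a simple 2-sigma sketch on invented data; a proper interval would use a Student's t quantile and a validated model).

```python
import numpy as np

# Invented linear series with noise, for illustration only.
t = np.arange(0.0, 30.0)
rng = np.random.default_rng(1)
y = 0.1 * t + rng.normal(0.0, 0.2, t.size)

coef = np.polyfit(t, y, 1)
resid = y - np.polyval(coef, t)
n = t.size
s = np.sqrt(np.sum(resid**2) / (n - 2))      # residual standard deviation
sxx = np.sum((t - t.mean()) ** 2)

def half_width(t0):
    # Classical prediction-interval half-width at a new point t0,
    # here with a fixed factor of 2 instead of a t quantile.
    return 2.0 * s * np.sqrt(1.0 + 1.0 / n + (t0 - t.mean()) ** 2 / sxx)

# The band widens monotonically as the lead time grows.
widths = [half_width(t0) for t0 in (30.0, 60.0, 120.0)]
print(widths)
```

The quadratic term in the square root guarantees that the band grows without bound as the extrapolation moves away from the centre of the observed data, which is the formal counterpart of the “area of increasing width” described above.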
5. A few examples on prediction on Earth thermodynamic parameters
A particularly risky field of prediction is that for thermodynamic phenomena, e.g. those dominating in our planet.
The field of mechanics is generally simpler to handle because it is basically deterministic and little time-dependent, even when having a dynamics, and because it is most often limited to studies on a few bodies.
The extension of studies to “many bodies” is a totally different affair (a bit less so in the astronomical field) when considering the “colligative” behaviour of a discrete ensemble of bodies, i.e. a behaviour depending only on the number of bodies and not on their chemical-physical nature. This is the case of thermodynamics. The difficulties are usually somewhat mitigated by considering its dynamics as a sequence of equilibrium states, in which case granularity is usually ignored and continuous mathematical functions are used to describe the time behaviour (stationary systems).
However, that is a simplification that cannot hold in very large systems, in which case the validity of models extrapolated ahead in time becomes more and more questionable, especially in non-homogeneous systems and in the case of complex (physical-chemical) interactions among bodies, or in the case of discrete systems. The development of science for the case of discreteness in physics and chemistry is accelerating, but at present it is still quite unsatisfactory. This theme has already been treated in a previous paper.
Thus, in the current situation, the increasing demand for reliable forecasts over a much longer time span than presently available for weather forecasts cannot be considered matched by the present development of sound mathematical and statistical tools, which need considerable progress to cover the expectations of the next generation. Currently unresolved problems in climate analysis have already been noted by authors in the philosophy of science and on the environment, e.g.: “Non-epistemic values pervade climate modelling, as is now well documented and widely discussed in the philosophy of climate science”, and “Internal variability in the climate system confounds assessment of human-induced climate change and imposes irreducible limits on the accuracy of climate change projections, especially at regional and decadal scales”.
A few examples follow to bring evidence of how critical the forecast of future natural behaviour over a long period can be, concerning popular parameters in climate forecasting, such as the Earth global temperature and the global sea-water level.
6. Final remarks
As should be quite clear from the above, the extrapolation of past data to future times, i.e. the information that can be passed to the next generations, is affected by a high risk. The risk level, however, is rarely used in predictions as a measure of the reliability that one can assign to them. In many fields of prediction this important parameter is not available, though it can vary greatly from case to case.
One basic reason is that a correct and full uncertainty analysis is not performed, in particular because an uncertainty budget is not compiled, as in the current case of the climate domain, or is not made available.
1. F. Pavese, Measurement in science: between Evaluation and Prediction, AMCTM XII (Pavese F., Forbes A.B., Zhang N.F., Chunovkina A.G., Eds.), Series on Advances in Mathematics for Applied Sciences, 2021, Singapore, World Scientific Publishing Co., pp. 346–363.
2. F. Pavese, About continuity and discreteness of quantities: examples from physics and metrology, Ukrainian Metrological Journal (2021) n.2, UDC 53.02:006.91 DOI: 10.24027/2306-7039.2.2021.236054.
3. J. Jebeile, M. Crucifix, Multi-model ensembles in climate science: mathematical structures and expert judgements, Studies in History and Philosophy of Science Part A, 2020, https://doi.org/10.1016/j.shpsa.2020.03.001. HAL Open Science, https://hal.archives-ouvertes.fr/hal-02541285
4. F. Giorgi, Uncertainties in climate change projections, from the global to the regional scale, EPJ Web of Conferences 9, 115–129 (2010) doi:10.1051/epjconf/201009009
5. Global Climate Report—Annual 2020. National Centers for Environmental Information, https://www.ncdc.noaa.gov/sotc/global/202013
6. M. Dobre, D. Sestan, A. Merlone, Air temperature measurement uncertainty associated to a mounting configuration temperature sensor-radiation shield, 2018, WMO Technical Conference on Meteorological and Environmental Instruments and Methods of Observation, https://www.wmocimo.net/wpcontent/uploads/O1_4_Dobre_Air_Temp_MU_Extended-Abstract- TECO2018-Dobre-Sestan-Merlone.pdf
7. P. Frank, Uncertainty in the global average surface temperature index: a representative lower limit, Energy & Environment, Vol. 21 No. 8, 2010–21
8. IPCC, 2019: Summary for Policymakers. In: IPCC Special Report on the Ocean and Cryosphere in a Changing Climate [H.O. Pörtner, D.C. Roberts, V. Masson-Delmotte, P. Zhai, M. Tignor, E. Poloczanska, K. Mintenbeck, M. Nicolai, A. Okem, J. Petzold, B. Rama, N. Weyer (eds.)].
9. VIM, Vocabulaire international de métrologie – Concepts fondamentaux et généraux et termes associés, 3rd edition, 2012, BIPM, Sèvres, France
10. M. Mengel, A. Levermann, K. Frieler, A. Robinson, B. Marzeion, R. Winkelmann, Future sea level rise constrained by observations and long-term commitment, www.pnas.org/cgi/doi/10.1073/pnas.1500515113
Figure 1 shows a prediction for the Mean Global Earth Surface Temperature 1880-2021 from NOAA, compared with the prediction from IPCC, obtained through an author's fit of the original data; the fit over the longer period indicates a distinctly lower annual increase, with a better fit s.d. (Note: the s.d. of the fit should not be confused with the accuracy of the temperature data, which, arising from a collation of meteorological stations' data, cannot be better than ± 0.5 °C [6-7].)
Figure 2 shows an IPCC prediction of the Mean Sea Level increase up to 2300 (!), according to different models: it is difficult to believe that the mathematical model can be accurate over so long a future period without a high risk, being based on a much shorter period of observations, during which the level increase was limited to less than +0.1 m (uncertainty unreported); the same risk affects the extrapolation to 2100.