Plotting Global Temperature

Using the satellite time series for the global temperature, someone concluded that there is a decline.

But the fit is unreliable: with p = 0.285 for the slope, we cannot say with any confidence whether the slope is positive or negative.

Call:
lm(formula = RSS ~ x, data = df)

Residuals:
     Min       1Q   Median       3Q      Max
-0.36741 -0.12314 -0.00018  0.09130  0.59246

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.919332   5.288187   1.119    0.264
x           -0.002830   0.002637  -1.073    0.285

Residual standard error: 0.1701 on 191 degrees of freedom
Multiple R-squared: 0.005992, Adjusted R-squared: 0.0007882
F-statistic: 1.151 on 1 and 191 DF, p-value: 0.2846
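To make "no confidence in the sign" concrete, here is a minimal sketch that turns the reported slope and standard error into a 95% confidence interval. The first part uses only numbers copied from the summary above. The second part repeats the check with `confint()` on synthetic noise, which is not the real satellite series; the start year 1997 and the noise level are assumptions made for illustration.

```r
# Slope estimate, standard error and residual degrees of freedom,
# copied from the summary above.
est <- -0.002830
se  <-  0.002637
dof <- 191

# 95% confidence interval for the slope: estimate +/- t quantile * standard error.
ci <- est + c(-1, 1) * qt(0.975, dof) * se
ci

# TRUE when the interval straddles zero, i.e. the sign of the slope is not determined.
straddles_zero <- ci[1] < 0 && ci[2] > 0

# The same check on a fitted model. The data here are synthetic noise,
# NOT the real satellite series; only the names df, x and RSS come from the post.
set.seed(1)
df     <- data.frame(x = 1997 + (0:192) / 12)   # decimal year; start year is assumed
df$RSS <- 0.25 + rnorm(nrow(df), sd = 0.17)
fit    <- lm(RSS ~ x, data = df)
confint(fit, "x", level = 0.95)
```

With the reported numbers the interval runs from roughly -0.0080 to +0.0024 per unit of x, so both a decline and a rise are compatible with the fit.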

So, that is not the way to do it. First, decompose the time series into the trend, the periodic part and the noise.

Then fit the linear regression to the trend component only.
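The two steps can be sketched as follows. The post does not say which decomposition was used. This sketch assumes R's classical `decompose()` on a monthly `ts` object, which fits the output below: a centred 12-month moving average leaves 6 missing values at each end, 12 in total. The series is synthetic, and the start date, trend, cycle amplitude and noise level are all assumptions. Only the name `ff` is taken from the regression call below.

```r
set.seed(2)
n  <- 193
tt <- seq_len(n)

# Synthetic monthly series: weak trend + annual cycle + noise (NOT the real data).
y   <- 0.3 - 0.0003 * tt + 0.1 * sin(2 * pi * tt / 12) + rnorm(n, sd = 0.1)
ser <- ts(y, start = c(1997, 1), frequency = 12)   # start date is an assumption

# Step 1: additive decomposition into trend, seasonal and random parts.
parts <- decompose(ser)
ff    <- as.numeric(parts$trend)   # centred moving average; NA at both ends
x     <- as.numeric(time(ser))     # decimal year

# Step 2: straight-line fit to the trend component only.
# lm() drops the NA rows by default (na.action = na.omit).
fit <- lm(ff ~ x)
summary(fit)
```

The summary reports 179 residual degrees of freedom, because 193 - 12 missing values - 2 coefficients = 179.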

Now things get better: the p-value of the slope drops from 0.285 to 0.045.

Call:
lm(formula = ff ~ df$x)

Residuals:
      Min        1Q    Median        3Q       Max
-0.200830 -0.080675  0.008528  0.058615  0.275443

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  8.121011   3.899699   2.082   0.0387 *
df$x        -0.003926   0.001945  -2.019   0.0450 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1139 on 179 degrees of freedom
(12 observations deleted due to missingness)
Multiple R-squared: 0.02227, Adjusted R-squared: 0.01681
F-statistic: 4.077 on 1 and 179 DF, p-value: 0.04497

Filtering out the seasonal change helps. The slope is now significant at the 5% level, though only just (p = 0.045).

However, R² remains very close to zero (0.022). The improved analysis still shows almost no correlation between temperature and time.
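As a small worked check, the R² of a one-predictor regression can be converted into a correlation coefficient, and it should agree with the reported F-statistic. The sketch uses only numbers copied from the second summary above.

```r
r2  <- 0.02227   # Multiple R-squared from the summary above
dof <- 179       # residual degrees of freedom from the summary above

# With a single predictor, R-squared is the squared Pearson correlation;
# the sign of r follows the sign of the slope (negative here).
r <- -sqrt(r2)
r

# Consistency check: with one predictor, F = R^2 / (1 - R^2) * residual df.
f_stat <- r2 / (1 - r2) * dof
f_stat
```

This gives r of about -0.15, so time accounts for only about 2% of the variance in the trend component, and an F of about 4.08, in line with the reported 4.077.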

Fundamental problems with both approaches

Both methods have the same fundamental problems:

(i) the choice of the period, (ii) the choice of linear regression as a predictor, and (iii) the reliability of the source data.

The source data consist of many averages, and we did not use the spread (standard deviation) of the underlying measurements. Our estimates of the reliability are therefore too optimistic.

There are two peaks in the time series that really “drive” the negative slope: the peaks in 1998 and 2010. Without 1998 the slope is positive. 1998 is an anomaly, and we even know the cause.

The remaining problem, (ii), is the value of linear regression as a predictive method. If the regression line predicts the future (2013 and 2014) with any accuracy, it should also be able to predict the past (1995-1997). But it predicts that past very poorly.
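Both checks are easy to script. The sketch below runs on a synthetic series with an artificial spike in 1998, not on the real data, so its numbers say nothing about the actual temperature record. The spike size, the noise level, the date range and the helper `slope()` are all assumptions made for illustration.

```r
set.seed(3)
# Synthetic monthly series with a warm spike in 1998 (NOT the real data).
all_data   <- data.frame(x = 1995 + (0:216) / 12)   # decimal year
all_data$y <- 0.25 + rnorm(nrow(all_data), sd = 0.1)
in_1998    <- all_data$x >= 1998 & all_data$x < 1999
all_data$y[in_1998] <- all_data$y[in_1998] + 0.4

# Hypothetical helper: slope of a straight-line fit to a data frame with x and y.
slope <- function(d) unname(coef(lm(y ~ x, data = d))["x"])

# (i) Choice of the period: refit with and without 1998, and for several start years.
win            <- all_data[all_data$x >= 1997, ]    # fitting window, 193 months
slope_all      <- slope(win)
slope_no_1998  <- slope(win[!(win$x >= 1998 & win$x < 1999), ])
slope_by_start <- sapply(1997:2002, function(s) slope(win[win$x >= s, ]))

# (ii) Predictive value: hindcast the two years before the window and compare
# the extrapolated line with what was actually observed there.
fit  <- lm(y ~ x, data = win)
past <- all_data[all_data$x < 1997, ]
hind <- predict(fit, newdata = past, interval = "prediction")
hindcast_rmse <- sqrt(mean((past$y - hind[, "fit"])^2))
```

If `slope_by_start` changes sign as the start year moves, or `hindcast_rmse` is large compared with the residual standard error of the fit, the regression line says more about the chosen window than about the series.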
