Analysis of Water Temperature Data
at
Woods Hole, Massachusetts
We are exploring the possibility of climate change by looking for changes in the average water temperature at the Woods Hole Oceanographic Institution, Massachusetts. The data are a listing of daily water temperatures recorded at the site in March of each year.
The temperatures cover the years from 1893 to 2009. The listing is complete for every year from 1966 through 2009. 1893 has only one data point (3/31), 1894 is complete, there are no listings for 1895-1897, 1899 is complete, 1900 is missing 3/31, 1901 has only 3/31, 1902 is missing, 1903 is missing 3/31, no data for 1904-1963, 1964 is missing 3/6, 3/7, 3/13, 3/14, 3/20, 3/21, and 3/28, and lastly 1965 is missing 3/5, 3/12, 3/19, 3/26, 3/31.
In looking at the total data set we have the following summary statistics for the mean of each March for which we have data:
Summary statistics:
Column | n | Mean | Variance | Std. Dev. | Std. Err. | Median | Range | Min | Max | Q1 | Q3 | IQR
Avg Temp | 52 | 36.825577 | 4.253629 | 2.0624328 | 0.28600797 | 36.825 | 8.42 | 32 | 40.42 | 35.59 | 38.575 | 2.985
The mean temperature is 36.83 °F (rounded to two decimal places), with a standard deviation of 2.06 °F and a standard error of the mean of 0.29 °F. Our sample size is n = 52: we have 52 years of March temperatures in our data set (approximately 1500 daily data points), with the gaps listed above. For this analysis we use only the mean of the March temperatures for each year.
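The reduction from daily readings to one value per year can be sketched as follows. This is a minimal illustration, not the procedure actually used for the report: the `readings` dictionary, its values, and the `march_means` helper are hypothetical, and a missing day is simply absent from that year's list.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical daily March readings in degrees Fahrenheit, keyed by year.
# Missing days are simply left out of a year's list.
readings = {
    1964: [34.0, 35.0, 36.0],
    1965: [33.0, 35.0],
    1966: [36.0, 37.0, 38.0, 37.0],
}

def march_means(daily_by_year):
    """Return {year: mean of the available March readings}."""
    return {year: mean(temps) for year, temps in daily_by_year.items() if temps}

yearly = march_means(readings)
values = list(yearly.values())

n = len(values)
s = stdev(values)        # sample standard deviation (n - 1 denominator)
std_err = s / sqrt(n)    # standard error of the mean
print(yearly, round(mean(values), 3), round(std_err, 3))

# The same formula applied to the reported standard deviation and n = 52
# gives the standard error listed in the summary table.
reported_se = 2.0624328 / sqrt(52)
print(round(reported_se, 4))
```

Because each year contributes a single mean, a year with only one reading counts as much as a complete year, which is worth keeping in mind for the sparse early years.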
The distribution of the entire data set is approximately normal, as shown by the histogram below:
Further analysis using a Q-Q plot to check for normality yields the following graph, which shows the data to be approximately normally distributed.
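For readers who want to see what a Q-Q plot compares, the sketch below pairs each sorted observation with the quantile a fitted normal distribution would predict. The sample values and the `qq_points` helper are hypothetical, and the plotting position (i - 0.5) / n is one common convention rather than necessarily the one used by the software that produced the graph.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Pair each sorted observation with its theoretical normal quantile."""
    data = sorted(sample)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    # Plotting positions (i - 0.5) / n avoid the undefined 0 and 1 quantiles.
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, data))

# Hypothetical yearly March means (degrees Fahrenheit), for illustration only.
sample = [33.9, 35.2, 35.8, 36.4, 36.9, 37.3, 38.1, 39.6]

for theo, obs in qq_points(sample):
    print(f"{theo:6.2f}  {obs:6.2f}")
```

If the data are close to normal, the printed pairs fall near the line y = x when plotted.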
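The confidence interval for the mean at the 95% level of confidence is (36.27, 37.39):
$\bar{x} \pm z_{0.025}\,\frac{s}{\sqrt{n}} = 36.826 \pm 1.960\,(0.286) = 36.826 \pm 0.561$
The CI for the mean at the 99% level of confidence is (36.09, 37.56):
$\bar{x} \pm z_{0.005}\,\frac{s}{\sqrt{n}} = 36.826 \pm 2.576\,(0.286) = 36.826 \pm 0.737$
The confidence interval for the variance at the 95% level of confidence is approximately (2.99, 6.54):
$\frac{(n-1)s^2}{\chi^2_R} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_L}$, with $n - 1 = 51$, $s^2 = 4.2536$, $\chi^2_R \approx 72.62$ and $\chi^2_L \approx 33.16$
Binomial Analysis:
The confidence interval for the probability of a March mean temperature being greater than 36.91 °F (the threshold used for this analysis, slightly above the overall mean of 36.83 °F) at the 95% level of confidence is (0.3073, 0.5773).
The proportion of years in which the average March temperature is greater than 36.91 degrees Fahrenheit is 23 out of 52, or $\hat{p} = 23/52 \approx 0.4423$ with $\hat{q} = 1 - \hat{p} \approx 0.5577$.
Note: We are using the normal distribution to approximate the binomial, as $n\hat{p} = 23 \ge 5$ and $n\hat{q} = 29 \ge 5$, and the data appear to be approximately normally distributed.
$\hat{p} \pm z_{0.025}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.4423 \pm 1.960\sqrt{\frac{(0.4423)(0.5577)}{52}} = 0.4423 \pm 0.1350$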
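The interval arithmetic in this section can be reproduced from the summary statistics with a short sketch. It assumes normal (z) critical values throughout; the helper name `z_interval` is illustrative and not part of the original analysis.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(center, std_err, confidence):
    """Normal-approximation interval: center +/- z * std_err."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return center - z * std_err, center + z * std_err

# Summary statistics reported for the 52 yearly March means.
n, x_bar, s = 52, 36.825577, 2.0624328
std_err = s / sqrt(n)  # standard error of the mean

mean_ci_95 = z_interval(x_bar, std_err, 0.95)
mean_ci_99 = z_interval(x_bar, std_err, 0.99)

# Proportion of years above the 36.91 threshold: 23 of 52.
p_hat = 23 / n
prop_se = sqrt(p_hat * (1 - p_hat) / n)
prop_ci_95 = z_interval(p_hat, prop_se, 0.95)

for name, ci in [("mean 95%", mean_ci_95),
                 ("mean 99%", mean_ci_99),
                 ("proportion 95%", prop_ci_95)]:
    print(name, tuple(round(v, 4) for v in ci))
```

The proportion interval rounds to the (0.3073, 0.5773) reported above. Note that the standard error already includes the division by the square root of n, so it must not be divided by it a second time.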
Hypothesis :
If we have an increase of Water Temperature as a result of climate change we should be able to see a difference in current temperatures as compared to the entire data set.
We claim that the current mean of the water temperatures is greater than the mean of the total data set.
Comparing the last ten years (2000 to 2009) of temperature data to the total data set, we look at the mean of each. First we need the statistics for the last ten-year period.
Summary statistics (years 2000-2009 inclusive):
Column | n | Mean | Variance | Std. Dev. | Std. Err. | Median | Range | Min | Max | Q1 | Q3 | Sum | IQR
2000-09 Avg Temps | 10 | 37.173 | 5.5649123 | 2.3590066 | 0.7459834 | 36.365 | 6.39 | 33.88 | 40.27 | 35.81 | 39.77 | 371.73 | 3.96
95% Confidence Interval for mean – 2000-2009 – (35.71, 38.64)
Boxplots comparing the two data sets (the temperatures for 2000 to 2009 inclusive and the total data set) are inconclusive.
Testing the hypothesis by comparing the temperatures of the last 10 years of data to the total data set at a confidence level of 95%, we have the following:
EMBED Equation.3
At α = 0.05 there is not enough evidence to support the claim that water temperatures at Woods Hole in 2000-2009 have risen relative to the full 52 years of data.
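A test of this kind can be sketched from the summary statistics alone. The sketch assumes a right-tailed one-sample t test that treats the overall mean as a fixed reference value; the original equation is not available, so the exact procedure used in the report may differ. The critical value is the standard t-table entry for 9 degrees of freedom.

```python
from math import sqrt

# Summary statistics reported above.
recent_n, recent_mean, recent_sd = 10, 37.173, 2.3590066  # 2000-2009
overall_mean = 36.825577                                  # all 52 years

# H0: mu <= overall_mean   versus   H1: mu > overall_mean (right-tailed)
std_err = recent_sd / sqrt(recent_n)
t_stat = (recent_mean - overall_mean) / std_err

# One-tailed critical value for alpha = 0.05 with 9 degrees of freedom,
# taken from a standard t table (assumption: a t test is appropriate here).
T_CRIT = 1.833

reject_h0 = t_stat > T_CRIT
print(f"t = {t_stat:.3f}, critical value = {T_CRIT}, reject H0: {reject_h0}")
```

The statistic falls well below the critical value, which agrees with the conclusion that the claim is not supported.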
From a Different perspective:
Taking a different approach, we divided our data set into two subsets: the first ten years of data and the last ten years of data (previously examined). We might be able to see a significant difference between the two time periods. The claim is that the last ten years of March average temperatures have a higher mean than the first ten years of March average temperatures.
The summary statistics for the first ten years of data (< 1968) are listed below.
Summary statistics (years < 1968):
Column | n | Mean | Variance | Std. Dev. | Std. Err. | Median | Range | Min | Max | Q1 | Q3 | Sum | IQR
First 10 Yrs | 10 | 35.21 | 6.1235776 | 2.4745865 | 0.78253293 | 35.01 | 7.23 | 32 | 39.23 | 33.42 | 37.16 | 352.1 | 3.74
95% CI for mean, years < 1968: (33.68, 36.74)
Boxplots of the data sets:
Comparing that with the last ten years (2000-2009) given earlier, we perform a two-sample t test of the hypothesis as follows:
EMBED Equation.3
At α = 0.05 there is enough evidence to support the claim that water temperatures at Woods Hole were higher in 2000-2009 than in the ten years of data prior to 1968.
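The two-sample comparison can likewise be sketched from the two summary tables. The report does not say whether the variances were pooled, so this sketch assumes Welch's unequal-variance form; with equal sample sizes the pooled statistic is numerically the same. The function name `welch_t` is illustrative, and the critical value is the standard t-table entry for 18 degrees of freedom.

```python
from math import sqrt

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    a, b = var1 / n1, var2 / n2
    t = (mean1 - mean2) / sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Group 1: 2000-2009. Group 2: first ten years of data (< 1968).
# H0: mu1 <= mu2   versus   H1: mu1 > mu2 (right-tailed)
t_stat, df = welch_t(37.173, 5.5649123, 10, 35.21, 6.1235776, 10)

# One-tailed critical value for alpha = 0.05 with 18 degrees of freedom,
# taken from a standard t table (df above is just under 18).
T_CRIT = 1.734

reject_h0 = t_stat > T_CRIT
print(f"t = {t_stat:.3f}, df = {df:.2f}, reject H0: {reject_h0}")
```

The statistic exceeds the critical value by a modest margin, so the result is significant at the 0.05 level but would not be at a much stricter level.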
Proportional Analysis:
Looking at the proportion of the temperatures greater than 36.93 (the upper limit for the CI of the mean at a 99% confidence level)
Total Data set (all years) EMBED Equation.3
The first ten years (<1968) EMBED Equation.3
The last ten years (2000-2009) EMBED Equation.3
There is a difference of 0.10 between the proportions for the first and last ten-year data sets, but both are less than the proportion for the total data set.
Regression Analysis:
Looking further, we made a scatter plot of temperature against year, with the year as the x value (horizontal axis) and the average March temperature as the y value.
Parameter Estimate
Intercept -16.100054
Slope 0.026778493
The relationship shows a slight positive correlation with a slope of 0.0268 degrees per year over the full sample.
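A least-squares line of this kind can be fitted with a few lines of code. The sketch below uses hypothetical (year, temperature) pairs to show the calculation, then applies the intercept and slope reported above to estimate the fitted change over the span of the data. The function names are illustrative.

```python
def least_squares(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical (year, March mean) pairs, for illustration only.
years = [2005, 2006, 2007, 2008, 2009]
temps = [36.0, 36.5, 36.2, 37.1, 37.0]
intercept, slope = least_squares(years, temps)
print(f"illustrative fit: slope = {slope:.3f} degrees per year")

# Fitted line reported above for the full data set.
REPORTED_INTERCEPT, REPORTED_SLOPE = -16.100054, 0.026778493

def fitted_temp(year):
    """Temperature predicted by the reported regression line."""
    return REPORTED_INTERCEPT + REPORTED_SLOPE * year

change = fitted_temp(2009) - fitted_temp(1893)
print(f"fitted change from 1893 to 2009: {change:.2f} degrees")
```

Under the reported slope, the fitted line rises by roughly 3.1 degrees Fahrenheit between 1893 and 2009. That figure describes the line only and ignores the long gaps in the early record.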
The first ten years of samples have a slope of 0.163484 for this same linear relationship.
Parameter Estimate
Intercept -0.25629535
Slope 0.163484
The last ten years of samples have a slope of 0.0367 for this same linear relationship.
Parameter Estimate
Intercept -36.325333
Slope 0.036666665
Comparing the slopes, we can see that the first ten years have a steeper slope (0.163) than the last ten years (0.037), which in turn is larger than the slope for the complete data set (0.027). Taking the remaining ten-year samples between the first and last ten-year groupings, we have two ten-year samples with a negative slope and one with a positive slope:
-0.02674 1990-1999
0.505353 1980-1989
-0.20369 1970-1979
There is a pattern of ten years of positive slope, then ten years of negative slope, followed by ten years of positive slope, then again ten years of negative slope, and finally our last ten-year period with positive slope. The periods of increase have been of greater magnitude than the periods of decrease.
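The alternation and the relative size of the rises and falls can be checked directly from the reported slopes. This is a small sketch over the five values given in the text; the variable names are illustrative.

```python
# Ten-year slopes reported above, in chronological order.
decade_slopes = [
    ("first ten years (< 1968)", 0.163484),
    ("1970-1979", -0.20369),
    ("1980-1989", 0.505353),
    ("1990-1999", -0.02674),
    ("2000-2009", 0.036666665),
]
slopes = [value for _, value in decade_slopes]

# Consecutive slopes have opposite signs when their product is negative.
alternates = all(a * b < 0 for a, b in zip(slopes, slopes[1:]))

rises = [s for s in slopes if s > 0]
falls = [-s for s in slopes if s < 0]
mean_rise = sum(rises) / len(rises)
mean_fall = sum(falls) / len(falls)

print(f"signs alternate: {alternates}")
print(f"mean rise = {mean_rise:.3f}, mean fall = {mean_fall:.3f}")
```

With only five periods, one of which spans a long gap in the record, the alternation is suggestive rather than evidence of a cycle.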
Summary and Conclusions:
Mean and Confidence Intervals:
Blue: Our total data set has a mean water temperature of 36.83 °F and a 95% CI of (36.27, 37.39)
Red: The first ten-year sample has a mean water temperature of 35.21 °F and a 95% CI of (33.68, 36.74)
Green: The last ten-year sample has a mean water temperature of 37.17 °F and a 95% CI of (35.71, 38.64)
Below are these values graphed. Note the overlap between the blue and green data, and the smaller overlap between the red and blue or the red and green data.
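The amount of overlap between the three intervals can be computed from their endpoints. The sketch below uses the intervals listed above; the `overlap` helper is illustrative. Overlapping confidence intervals are only a visual guide and do not replace the hypothesis tests.

```python
from itertools import combinations

# 95% confidence intervals for the mean, as listed above.
intervals = {
    "blue": (36.27, 37.39),   # total data set
    "red": (33.68, 36.74),    # first ten years
    "green": (35.71, 38.64),  # 2000-2009
}

def overlap(a, b):
    """Length of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

overlaps = {
    (a, b): overlap(intervals[a], intervals[b])
    for a, b in combinations(intervals, 2)
}

for (a, b), width in overlaps.items():
    print(f"{a} vs {b}: {width:.2f} degrees of overlap")
```

The blue and green intervals share the widest range, while the red and blue intervals share the narrowest.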
Boxplots for all three data sets:
Hypothesis Test Results:
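Mean of last ten years (2000-2009) against total data set: at α = 0.05, there is not enough evidence that the recent mean is higher.
Mean of last ten years (2000-2009) against first ten years (< 1968): at α = 0.05, there is enough evidence that the recent mean is higher.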
Regression analysis:
Slopes or Rate of change:
Slope | Time Period
0.037 | 2000-2009
-0.02674 | 1990-1999
0.505353 | 1980-1989
-0.20369 | 1970-1979
0.163 | < 1968
0.0268 | Total Data
Conclusions:
Overall we see an increase in water temperature from the start of the data to the end of March 2009. Although the increase is very slight (a fitted slope of about 0.027 degrees per year), it holds over the whole time period.
The first ten years of data seem significantly lower than the rest of the data set, but with a significantly steeper rate of change or slope. Since we have no data earlier than 1893, this could be a warming period following some event that lowered temperatures, from which the water was recovering.
It is notable that the mean of the most recent ten-year period is not significantly different (at an α of 0.05) from the mean of the total data set.
We seem to be experiencing a slow, gradual increase in water temperatures at Woods Hole, but the rate of change for the most recent ten-year period (0.037 degrees per year) is only slightly above the slope for the total data set (0.027) and far below that of the first ten years of data (0.163). Is the increase in temperatures slowing down? Also noted is the trend that the rate of change switches between increasing and decreasing every ten years. The long-term net change is positive, but that net positive includes the significantly larger positive rate of change for the first ten years of data.
It seems clear that the mean water temperatures at Woods Hole have increased over the previous 50 years, but the rate of that increase is by no means constant, nor is it accelerating.