In the last session we handled all the duplicates in GHCN’s v2.mean file. With well-defined functions for reading in the data and processing the duplicates, we can read a v2.mean file and process its duplicates in one line.
Now, if we could trust the “uncompress” package to uncompress files without an error, we could actually write the underlying functions to check the web for updates, download a fresh file if one exists, unzip it, check it against the local copy, and process it accordingly. And rebuild all your analysis and back up your old work! But for now we will do a few things by hand; all of that fancy checking for updates can be plumbed in later if we need it.

Since processing the raw file takes a while and we don’t need to do that math over and over, let’s save our processed file as an R object. First we will pick a file name for it. DP cleverly stands for Duplicates Processed. The Rdata extension is special. Do not confuse it with RData. Some fool using my name did this on the R help list.
OK. Now we save our object.
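A minimal sketch of that step, using base R’s save() and load(). The object name `v2meanDP`, the file name, and the toy columns other than Year are assumptions made for illustration, not the names used in this series. The sketch writes to a temporary directory only so that it leaves nothing behind; drop the file.path(tempdir(), ...) wrapper to write to the working directory instead.

```r
# Toy stand-in for the processed data (hypothetical layout; the real
# object comes out of the duplicate-processing step).
v2meanDP <- data.frame(Id   = c(1, 1, 2),
                       Year = c(1899, 1900, 1901),
                       Jan  = c(-1.2, 0.4, 3.1))

# Hypothetical file name, placed in a temp directory for this sketch.
dpFile <- file.path(tempdir(), "v2meanDP.Rdata")

# Write the object to disk as an R object file.
save(v2meanDP, file = dpFile)

# In a later session: load() restores the object under its original
# name and (invisibly) returns the names of what it restored.
rm(v2meanDP)
loaded <- load(dpFile)
```

Because load() brings the object back under the name it was saved with, there is no need to reprocess the raw file each time.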
And it’s saved to our working directory. Next up: the temperature data extends from 1701 to the present, and the file is very sparse before 1900 or so. GISS uses it back to 1880; CRU uses it back to 1850. Whichever you choose, you need a quick way to select that window of data. Now, R data classes like time series [ ts() ] have a function for windowing the data. It’s a function called window(). So we will imitate that function with our own window function. If I wanted to get fancy, I’d create a special class for v2mean data and just overload the window function. Maybe later, when I learn about OOP in R. For now, we will just make something that looks the same.
return(v2mean[which(v2mean$Year >= range[1] & v2mean$Year <= range[2]), ])
This function is pretty simple. In fact, many programmers would not turn this into a function. Still, I think the code is more readable with this turned into a function, so I do it that way. You will note an input variable of “range”. That’s a vector holding a start year and a stop year, in that order.
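For reference, here is a self-contained sketch of such a window function, run on toy data. The function name `windowV2mean`, the input check, and the toy data frame are assumptions for illustration; only the Year column and the two-element `range` argument come from the text above.

```r
# Hypothetical name; keeps the rows whose Year lies in
# [range[1], range[2]], both ends inclusive.
windowV2mean <- function(v2mean, range) {
  # range must be c(startYear, stopYear)
  stopifnot(length(range) == 2, range[1] <= range[2])
  keep <- v2mean$Year >= range[1] & v2mean$Year <= range[2]
  # which() drops any rows where Year is NA
  v2mean[which(keep), ]
}

# Toy data standing in for the processed v2.mean object
toy <- data.frame(Id   = c(1, 1, 1, 2, 2),
                  Year = c(1849, 1850, 1880, 1900, 2009),
                  Jan  = c(0.1, 0.2, 0.3, 0.4, 0.5))

# A GISS-style window starting in 1880
giss <- windowV2mean(toy, c(1880, 2009))

# A CRU-style window starting in 1850
cru <- windowV2mean(toy, c(1850, 2009))
```

Indexing the two ends explicitly as range[1] and range[2] matters: comparing Year against the whole two-element vector would recycle it down the column instead of testing each year against both bounds.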
And once we window the data we will save that away as well.
The code goes up tomorrow, and then we move on to selecting a base period for calculating anomalies and some other neat things in R.