Climate Reference Network: package crn 1.0
I’ve just finished and uploaded another climate data package for R. This one focuses on the CRN, the Climate Reference Network.
Here is their home page.
The package for now is really simple, but all of the packages I’m building are getting simpler. In the end (whenever that is) I think I’ll end up with a host of packages that manage the downloading of data and the formatting of it into “analysis friendly” formats. The CRN posed an interesting challenge: they have hourly data and over 30 measurands. Going forward I’m seeing a collection of packages that looks like this: a series of packages that take online climate data and reformat it into some standard formats that a few of us have been converging on. In an OOP design they will become the core objects. Then we have a series of functions for doing basic spatial and time series stats, and we have our spatial tools and time series tools. Here is the package lineup as of today:
1. RghcnV3: data formation and analysis
2. CHCN (Environment Canada data)
3. GhcnDaily: GHCN daily data
4. crn: Climate Reference Network
5. Metadata.
Over time the goal will be to refactor RghcnV3 and strip out the analysis part of it into a separate package. All the data formatting packages would thus have a common set of formats and objects, and the analysis code would be written as methods on those… eh well, that’s the dream.
So here is what you can do with the crn package today. There are three core functions: downloadCRN(), the collate* functions, and writeDataset(). The download function does all the heavy lifting to download both daily and hourly data from CRN. Data starts in the year 2000 and extends to today. downloadCRN() uses RCurl to get the directory listings from the FTP server and create the download lists. Then the process of downloading the 1,000+ files starts.
The function lets you control whether you want daily files or hourly; I just get both. The data comes in station files: one file for every station for every year. The next step we take is to collate these files into one monolithic file, one that contains the data for all the stations. For daily data we use collateDaily() and for hourly data we use collateHourly(). These functions have two side effects: they write a consolidated datafile and a metadata file that records station names, lat/lon and id number. In the case of the hourly data this file is quite large, over 1 GB. Moreover, the file contains all the variables: Tmin, Tmax, solar radiation, soil temperatures. The last function turns these monolithic files into what we are used to: files with one variable for all stations. That function is writeDataset(). It operates on either hourly or daily files and collects a single variable such as T_DAILY_MEAN. The function is defined like so:
writeDataset(filename, cnames = colnamesDaily, varname = "T_DAILY_MEAN")
The first argument, filename, points to either the monolithic daily data or the hourly data. The next argument, cnames, supplies the column names of the dataset; these are predefined as constants in the package. The last argument, varname, is the variable you want to collect. When you run this function, the side effect is that a file is written containing all the T_DAILY_MEAN data for every station. Effectively the package allows you to download the CRN data and build subsets from the huge collection. There are over 30 climate variables available, so I constructed the tool so that you can build datasets from the source data.
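Putting the three steps together, a minimal end-to-end sketch. Only writeDataset()’s signature is given above; the zero-argument calls to downloadCRN() and the collate functions, and the output filename, are my assumptions, so check the help pages before running:

```r
library(crn)

# 1. Download the per-station, per-year files (daily and hourly) from
#    the CRN FTP site. Argument defaults assumed; see ?downloadCRN.
downloadCRN()

# 2. Collate the station files into one monolithic file each, plus a
#    metadata file recording station names, lat/lon and id.
collateDaily()
collateHourly()

# 3. Pull a single variable out of the monolithic daily file for all
#    stations. "DailyData.csv" is a hypothetical filename.
writeDataset("DailyData.csv", cnames = colnamesDaily,
             varname = "T_DAILY_MEAN")
```

Note the download step pulls over a thousand files and the hourly collation writes a file over 1 GB, so expect this to take a while.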
Version 1.0 is posted to CRAN and should be up shortly. In the next installment I’ll probably add support for “zoo” objects and functions to create daily from hourly and monthly from daily. At that point it will be fully integrated with the RghcnV3 data structures.
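For the curious, the daily-from-hourly step might look something like this with zoo. This is a sketch of my own, not package code, and the values are made up:

```r
library(zoo)

# 48 hours of made-up hourly values starting 2000-01-01 (UTC)
times  <- seq(as.POSIXct("2000-01-01 00:00", tz = "UTC"),
              by = "hour", length.out = 48)
hourly <- zoo(1:48, order.by = times)

# Collapse hourly values to daily means by grouping on the calendar date
daily  <- aggregate(hourly, as.Date(index(hourly)), mean)
# daily holds two values: 12.5 and 36.5
```

The same aggregate() pattern, with as.yearmon() in place of as.Date(), would handle monthly from daily.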
Steven, I got the lyap_k package up and running. At this point it’s very black box, so I need to study the docs. Initial, zeroth-order results encouraging.
Do these stations have evaporation pan data?
Thanks
ps
Is there an R-to-F parser somewhere 🙂
I believe they may. It’s been a while since I looked at it. I would have to write some custom code to pull out whatever you like.
R to Fortran?? Oh man, that’s cruel. Doubt it. What do you really want to do? Describe that; maybe I have a different solution.
I downloaded your crn package, (thanks for the effort) but I’ve run into a few problems.
I’m running on an iMac with OS X 10.7
R 2.14.0
RCurl 1.6-10
It wouldn’t download anything. I went in and did some debugging and found some issues in the getUrlsCRN function.
the line:
ftpDir <- getURL(yearUrl, ftp.use.epsv = FALSE, ftplistonly = TRUE)
was throwing an error
"In mapCurlOptNames(names(.els), asNames = TRUE) :
Unrecognized CURL options: ftplistonly"
The RCurl documentation doesn't show this as a valid option. On the Mac, at least, the directory listing that comes back is the full version. Of course, your parsing code wasn't expecting that and that seems to be causing the failure.
I was able to get it to work by replacing that line with:
ftpDir <- getURL(yearUrl, .opts = list(CUSTOMREQUEST = "NLST"))
I also had to change the next line to
names <- strsplit(ftpDir, split = "\n")
to get it to work.
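For anyone hitting the same error, the two changes together look like this. This is a sketch based on my workaround above; the helper names and the sample listing in the comments are mine:

```r
library(RCurl)

# Ask the FTP server for a names-only listing via a custom NLST request,
# instead of passing the unrecognized 'ftplistonly' option.
listFtpNames <- function(url) {
  ftpDir <- getURL(url, .opts = list(customrequest = "NLST"))
  parseFtpListing(ftpDir)
}

# The listing comes back as one newline-separated string; split it and
# drop any empty entry left behind by a trailing newline.
parseFtpListing <- function(listing) {
  parts <- strsplit(listing, split = "\n")[[1]]
  parts[nzchar(parts)]
}

# e.g. parseFtpListing("2000\n2001\n2002\n")
#      returns c("2000", "2001", "2002")
```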
So I don't know if this is something to do with the Mac, the particular versions I'm using or what, but I thought I'd let you know.
I had the same problem with RCurl on the Mac.
Your libcurl is out of date and you need to upgrade it.
Nice workaround though.
One of the reasons I switched to PC was difficulties with libcurl on the Mac.
I’ve slowly been getting your code to work on my Mac. I’ve seen a few more problems in the collateDaily function that don’t seem to be connected with RCurl. They seem to be related to the apply function in R itself (it was fed a data frame and output a numeric vector).
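For reference, the usual apply-on-a-data-frame gotcha looks like this (my own illustration, not the crn code):

```r
# apply() coerces a data frame to a matrix first, so a frame with mixed
# column types silently becomes a character matrix.
df <- data.frame(id = 1:3, name = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

m <- apply(df, 1, function(row) row["id"])
class(m)   # "character", not "integer"

# Safer: operate on the column directly, or make sure everything handed
# to apply() is already numeric.
ids <- df$id   # stays integer
```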
I did want to mention a problem I found in the data.
The New Mexico site (3087) has two locations and the name is spelled differently.
3087 35.82 -106.32 NM_Santa_Fe_20_WNW
3087 35.78 -106.27 NM_Sante_Fe_20_WNW
Just FYI since I thought it was very odd. Don’t know who to contact on this or if this is even a bug.
Sorry, I found the metadata on the crn site and they did move it.
Ya, I just recently discovered that the data on this FTP site is actually more than CRN.
When I finish my current project I’ll have to get back to this and fix things up.
I went through their metadata and tried to strain out the actual CRN sites. I’ve got a file with my best guess at the real sites. (You would think they would have a list, since they make a point that there are 114 sites.) I winnowed out the sites outside the lower 48 and the sites that were identified as “test”. If you’d like to have it, email me.
I’m just getting up to speed w/R so nothing to contribute, (ie this is off topic) but thought you’d find this interesting if you hadn’t seen it:
http://www.twincities.com/opinion/ci_19367298
It would be interesting to brainstorm if there are any climate puzzles that could be crowd sourced via an internet game, or some such thing IMO. Feel free to bulldoze.
OT but might be of interest:
Recent article by statistician Grant Foster ( Tamino )
Global temperature evolution 1979–2010
Click to access 1748-9326_6_4_044022.pdf
it’s absolute nonsense!
Hi Steven
You need to publicise your blog; Dr. Curry doesn’t have your name on the roll call.
I see my post is still there, either you aren’t much of a censor or perhaps didn’t object to the content. As a sign of good will I would like to email you data on this very controversial graph, so I can have benefit of your comment. My email is on the graph, but as a blog keeper you should have it anyway.
http://www.vukcevic.talktalk.net/SSN-T.htm
Thanks
I don’t practice censorship of any kind. Grown people can decide for themselves what to put in their heads and what not to put in their heads.
I’m familiar with Grant’s work. In fact I work with some of his code.
I’ll comment on whatever you like, but I make all my comments in public so that people do not have to worry that I say one thing in private and another thing in public.
Thanks Steven, that’s fine. I would look for your comments here on your blog, which is far easier to access and use. My PC goes ‘nuts’ when I try to post something on Judith’s Climate Etc.
vuk
There is no “systematic causal relationship” simply because the greenhouse conjecture is not based on real world physics.
Prof Claes Johnson has proved in Computational Blackbody Radiation* that energy in radiation only gets converted to thermal energy if the peak frequency of the radiation from the source is above the peak frequency of the radiation from the target.
This essentially provides a mechanism which explains why the Second Law of Thermodynamics also applies for radiative heat transfer, as it does for heat transferred by conduction.
There seems no plausible alternative explanation for the observed Second Law, so I suggest we all heed what Johnson has deduced mathematically, being as he is, a Professor of Applied Mathematics.
It is not the net radiative flux (or even its direction) which determines whether (and in which direction) thermal energy is transferred. For example, if the emissivity of two bodies is very different, there can be more radiative flux from the cooler one. But all that flux will be scattered by the warmer one and not converted to thermal energy. Only the flux from the warmer one (no matter how weak) will be converted to thermal energy in the cooler one. This “ensures” that the Second Law is valid in all cases because it depends on peak frequency, which is proportional to absolute temperature – see http://en.wikipedia.org/wiki/Wien's_displacement_law
Thus the IPCC “backradiation” cannot affect the temperature of the surface and there can be no atmospheric radiative greenhouse effect.
* http://climate-change-theory.com/RadiationAbsorption.html
Sorry Doug, Claes is wrong. Even professors make mistakes. He’ll survive.