Climate Reference Network: package crn 1.0

September 21, 2011 Steven Mosher

I’ve just finished and uploaded another climate data package for R. This one focuses on CRN, the Climate Reference Network.

Here is their home page

The package is really simple for now, but then all of the packages I’m building are getting simpler. In the end (whenever that is) I think I’ll end up with a host of packages that manage downloading data and formatting it into “analysis friendly” formats. The CRN posed an interesting challenge: it has hourly data and over 30 measurands.

Going forward I see a collection of packages that looks like this: a series of packages that take online climate data and reformat it into the standard formats that a few of us have been converging on. In an OOP design these will become the core objects. On top of that sit a series of functions for basic spatial and time series statistics, plus our spatial and time series tools. Here is the package lineup as of today:

1. RghcnV3: data formatting and analysis

2. CHCN: Environment Canada data

3. Ghcndaily: GHCN daily data

4. crn: Climate Reference Network

5. Metadata

Over time the goal is to refactor RghcnV3 and strip the analysis part out into a separate package. All the data formatting packages would then share a common set of formats and objects, and the analysis code would be written as methods on those. Well, that’s the dream.

So here is what you can do with the crn package today. There are three core functions: downloadCRN(), collate*(), and writeDataset(). The download function does all the heavy lifting of downloading both daily and hourly data from CRN. The data start in the year 2000 and extend to today. downloadCRN() uses RCurl to get the directory listings from the FTP site and build the download lists. Then the process of downloading the 1000+ files starts.
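As a rough illustration of the listing step (not the package's actual code), the sketch below turns a names-only directory listing into a list of download URLs. The base URL, the file names, and the helper `buildDownloadList` are made up for the example; in real use the listing string would come from something like RCurl's `getURL()`.

```r
# Hypothetical helper: turn a names-only directory listing into full URLs.
# In real use `listing` would be the string returned by RCurl::getURL()
# for one year's directory; here it is a hard-coded stand-in.
buildDownloadList <- function(listing, yearUrl) {
  # Listings may use "\r\n" or "\n" line endings depending on the server.
  files <- strsplit(listing, split = "\r?\n")[[1]]
  files <- files[nzchar(files)]              # drop any empty entries
  files <- files[grepl("\\.txt$", files)]    # keep only the data files
  paste0(yearUrl, files)
}

yearUrl <- "ftp://example.org/crn/daily/2011/"   # placeholder, not the real CRN address
listing <- "CRND-2011-StationA.txt\r\nCRND-2011-StationB.txt\r\nreadme.html\r\n"
urls <- buildDownloadList(listing, yearUrl)
print(urls)
```

Each element of `urls` could then be handed to `download.file()`. Splitting on an optional carriage return keeps the parsing tolerant of both Unix-style and Windows-style listings.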

The function lets you control whether you want daily files or hourly files. I just get both. The data comes in station files: one file for every station for every year.

The next step is to collate these files into one monolithic file that contains the data for all the stations. For daily data we use collateDaily() and for hourly data we use collateHourly(). These functions have two side effects: they write a consolidated data file and a metadata file that records station names, lat/lon, and Id number. In the case of the hourly data this file is quite large, over 1GB. Moreover, the file contains all the variables: Tmin, Tmax, solar radiation, soil temperatures, and so on.

The last function turns these monolithic files into what we are used to: files with one variable for all stations. That function is writeDataset(). It operates on either hourly or daily files and collects a single variable such as T_DAILY_MEAN. The function is defined like so:
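To make the collate idea concrete, here is a minimal sketch under toy assumptions. The file layout, column names, and the helper `writeStationFile` are invented for the example and are not the real CRN format or the package's code. It writes a few per-station, per-year files to a temporary directory, stacks them into one table, and derives a station metadata table.

```r
# Toy stand-in for the collate step. The column layout is invented for the
# example; real CRN station files have a different, much wider layout.
tmp <- tempfile("crn_demo_")
dir.create(tmp)

# Hypothetical helper that writes one station-year file with two days of data.
writeStationFile <- function(id, name, lat, lon, year) {
  df <- data.frame(Id = id, Name = name, Lat = lat, Lon = lon,
                   Date = as.integer(paste0(year, c("0101", "0102"))),
                   T_DAILY_MEAN = c(1.5, 2.5),
                   stringsAsFactors = FALSE)
  write.table(df, file.path(tmp, paste0(name, "_", year, ".txt")),
              row.names = FALSE, quote = FALSE)
}
writeStationFile(1001L, "Station_A", 40.05, -105.25, 2010)
writeStationFile(1001L, "Station_A", 40.05, -105.25, 2011)
writeStationFile(1002L, "Station_B", 35.82, -106.32, 2011)

# Read every station-year file and stack them into one monolithic table.
files   <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)
pieces  <- lapply(files, read.table, header = TRUE, stringsAsFactors = FALSE)
allData <- do.call(rbind, pieces)

# One metadata row per distinct combination of Id, name, and location.
metadata <- unique(allData[, c("Id", "Name", "Lat", "Lon")])
print(metadata)
```

In this sketch both tables stay in memory; writing them out with `write.table()` would mirror the two side effects described above. For a file as large as the hourly collection, appending one station file at a time would probably be gentler on memory than a single `rbind`.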

writeDataset(filename, cnames = colnamesDaily, varname = "T_DAILY_MEAN")

The first argument, filename, points to either the monolithic daily file or the monolithic hourly file. The next argument, cnames, gives the column names of the dataset; these are predefined as constants in the package. The last argument, varname, is the variable you want to collect. When you run this function, the side effect is that a file is written containing all the T_DAILY_MEAN data for every station. Effectively, the package lets you download the CRN data and build subsets from the huge collection. There are over 30 climate variables available, so I built the tool so that you can construct your own datasets from the source data.
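The extraction step can be pictured with a small in-memory example. This is a sketch of the idea only, not the body of writeDataset(): the helper `extractVariable` and the toy table are assumptions, and the result is returned as a matrix rather than written to a file.

```r
# Hypothetical stand-in for the extraction step: pull one variable out of a
# monolithic table and lay it out with one row per station, one column per date.
extractVariable <- function(allData, varname = "T_DAILY_MEAN") {
  stopifnot(varname %in% names(allData))
  # tapply leaves NA for station/date combinations that have no observation.
  tapply(allData[[varname]], list(allData$Id, allData$Date), function(x) x[1])
}

# Toy monolithic table with two variables; real files carry 30+ columns.
allData <- data.frame(
  Id   = c(1001L, 1001L, 1002L),
  Date = c(20110101L, 20110102L, 20110101L),
  T_DAILY_MEAN = c(1.5, 2.5, -3.0),
  T_DAILY_MAX  = c(4.0, 6.0, 0.5)
)
tmean <- extractVariable(allData, "T_DAILY_MEAN")
print(tmean)
```

Calling `extractVariable(allData, "T_DAILY_MAX")` on the same table would give the matching matrix for daily maximum temperature, which is the sense in which one monolithic file can feed many single-variable datasets.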

Version 1.0 has been submitted to CRAN and should be up shortly. In the next installment I’ll probably add support for “zoo” objects and functions to create daily data from hourly and monthly data from daily. At that point it will be fully integrated with the RghcnV3 data structures.
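For readers who want monthly values before that support exists, one possible approach in base R is sketched below. The column names and the toy data are assumptions, and nothing here is taken from the package.

```r
# Sketch of a daily-to-monthly reduction using base R only.
daily <- data.frame(
  Id   = 1001L,
  Date = as.Date(c("2011-01-01", "2011-01-02", "2011-02-01")),
  T_DAILY_MEAN = c(1.0, 3.0, 5.0)
)
daily$Month <- format(daily$Date, "%Y-%m")   # e.g. "2011-01"

# Average the daily means within each station and month.
monthly <- aggregate(T_DAILY_MEAN ~ Id + Month, data = daily, FUN = mean)
print(monthly)
```

A real implementation would also need a rule for months with too few valid days, which this sketch ignores.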

Dan Hughes

October 19, 2011 at 1:37 AM

Steven, I got the lyap_k package up and running. At this point it’s very black box, so I need to study the docs. Initial, zeroth-order results encouraging.

Do these stations have evaporation pan and data?

Thanks

ps

Is there an R-to-F parser somewhere 🙂
- steven mosher
  
  October 19, 2011 at 3:01 AM
  
  I believe they may. been a while since I looked at it. I would have to do some custom code
  to pull out whatever you like.
  
  R to Fortran?? oh man. that’s cruel. Doubt it. What do you really want to do?
  
  describe that, maybe I have a different solution
BarryW

November 4, 2011 at 8:15 PM

I downloaded your crn package, (thanks for the effort) but I’ve run into a few problems.
I’m running on an iMac with OSX 10.7
R 2.14.0
RCurl 1.6-10

It wouldn’t download anything. I went in and did some debugging and found some issues in the getUrlsCRN function.

the line:
ftpDir <- getURL(yearUrl, ftp.use.epsv = FALSE, ftplistonly = TRUE)
was throwing an error

"In mapCurlOptNames(names(.els), asNames = TRUE) :
Unrecognized CURL options: ftplistonly"

The RCurl documentation doesn't show this as a valid option. On the Mac, at least, the directory listing that comes back is the full version. Of course, your parsing code wasn't expecting that and that seems to be causing the failure.

I was able to get it to work by replacing that line with:

ftpDir <- getURL(yearUrl,.opts = list(CUSTOMREQUEST = "NLST"))

I also had to change the next line to

names <- strsplit(ftpDir, split = "\n")

to get it to work.

So I don't know if this is something to do with the Mac, the particular versions I'm using or what, but I thought I'd let you know.
- Steven Mosher
  
  November 4, 2011 at 8:54 PM
  
  I had the same problem with RCurl on the MAC.
  
  your libcurl is out of date and you need to upgrade it.
  
  nice work around though
  
  one of the reasons I switched to PC was difficulties with libcurl on the MAC
BarryW

November 14, 2011 at 9:59 AM

I’ve slowly been getting your code to work on my MAC. I’ve seen a few more problems that don’t seem to be connected with RCurl in the collateDaily function. They seem to be in R itself and related to the apply function (it was fed a data frame and output a numeric vector).

I did want to mention a problem I found in the data.
The New Mexico site (3087) has two locations and the name is spelled differently.
3087 35.82 -106.32 NM_Santa_Fe_20_WNW
3087 35.78 -106.27 NM_Sante_Fe_20_WNW

Just FYI since I thought it was very odd. Don’t know who to contact on this or if this is even a bug.
BarryW

November 14, 2011 at 10:05 AM

Sorry, I found the metadata on the crn site and they did move it.
- steven mosher
  
  November 14, 2011 at 12:17 PM
  
  Ya, I just recently discovered that the data on this ftp is actually more than CRN.
  When I finish my current project I’ll have to get back to this and fix things up.
- BarryW
  
  November 21, 2011 at 6:16 AM
  
  I went through their metadata and tried to strain out the actual crn sites. I’ve got a file with my best guess at the real sites (you would think they would have a list, since they make a point that there are 114 sites). Winnowed out the non 48 and sites that were identified as “test”. If you’d like to have it, email me.
TerryMN

November 20, 2011 at 10:00 PM

I’m just getting up to speed w/R so nothing to contribute, (ie this is off topic) but thought you’d find this interesting if you hadn’t seen it:

http://www.twincities.com/opinion/ci_19367298

It would be interesting to brainstorm if there are any climate puzzles that could be crowd sourced via an internet game, or some such thing IMO. Feel free to bulldoze.
vukcevic

December 10, 2011 at 9:34 PM

OT but might be of interest:
Recent article by statistician Grant Foster ( Tamino )
Global temperature evolution 1979–2010

Click to access 1748-9326_6_4_044022.pdf

it’s absolute nonsense!
vukcevic

December 12, 2011 at 5:00 AM

Hi Steven
You need to publicise your blog, Dr. Curry hasn’t your name on the roll call.
I see my post is still there, either you aren’t much of a censor or perhaps didn’t object to the content. As a sign of good will I would like to email you data on this very controversial graph, so I can have benefit of your comment. My email is on the graph, but as a blog keeper you should have it anyway.
http://www.vukcevic.talktalk.net/SSN-T.htm
Thanks
steven mosher

December 12, 2011 at 9:13 AM

I don’t practice censorship of any kind. Grown people can decide for themselves what to put in their heads and what not to put in their heads.

I’m familiar with Grant’s work. In fact I work with some of his code.

I’ll comment on whatever you like but I make all my comments in public so that people do not have to worry that I say one thing in private and another thing in public.
- vukcevic
  
  December 12, 2011 at 2:15 PM
  
  Thanks Steven, that’s fine. I would look for your comments here on your blog, which is far easier to access and use. My pc goes ‘nuts’ when I try to post something on Judith’s Climate etc
  vuk
Doug Cotton

February 14, 2012 at 6:28 AM

There is no “systematic causal relationship” simply because the greenhouse conjecture is not based on real world physics.

Prof Claes Johnson has proved in Computational Blackbody Radiation* that energy in radiation only gets converted to thermal energy if the peak frequency of the radiation from the source is above the peak frequency of the radiation from the target.

This essentially provides a mechanism which explains why the Second Law of Thermodynamics also applies for radiative heat transfer, as it does for heat transferred by conduction.

There seems no plausible alternative explanation for the observed Second Law, so I suggest we all heed what Johnson has deduced mathematically, being as he is, a Professor of Applied Mathematics.

It is not the net radiative flux (or even its direction) which determines whether (and in which direction) thermal energy is transferred. For example, if the emissivity of two bodies is very different, there can be more radiative flux from the cooler one. But all that flux will be scattered by the warmer one and not converted to thermal energy. Only the flux from the warmer one (no matter how weak) will be converted to thermal energy in the cooler one. This “ensures” that the Second Law is valid in all cases because it depends on peak frequency which is proportional to absolute temperature – see http://en.wikipedia.org/wiki/Wien's_displacement_law

Thus the IPCC “backradiation” cannot affect the temperature of the surface and there can be no atmospheric radiative greenhouse effect.

* http://climate-change-theory.com/RadiationAbsorption.html
- steven mosher
  
  February 14, 2012 at 10:01 AM
  
  Sorry Doug. Claes is wrong. Even professors make mistakes. He’ll survive.