
Climate Reference Network: package crn 1.0

I’ve just finished and uploaded another climate data package for R. This one focuses on CRN, the Climate Reference Network.

Here is their home page


The package is really simple for now, but all of the packages I’m building are getting simpler. In the end (whenever that is) I think I’ll end up with a host of packages that manage the downloading of data and the formatting of it into “analysis friendly” formats. The CRN posed an interesting challenge: they have hourly data and over 30 measurands.

Going forward I’m seeing a collection that looks like this: a series of packages that take online climate data and reformat it into some standard formats that a few of us have been converging on. In an OOP design these will become the core objects. Then we have a series of functions for doing basic spatial and time series stats, along with our spatial tools and time series tools. Here is the package lineup as of today:

1. RghcnV3: data formatting and analysis

2. CHCN (Environment Canada data)

3. Ghcndaily: GHCN daily data

4. crn: Climate Reference Network

5. Metadata

Over time the goal will be to refactor RghcnV3 and strip out the analysis part of it into a separate package. All the data formatting packages would then have a common set of formats and objects, and analysis code would be written as methods on those. Eh, well, that’s the dream.

So here is what you can do with the crn package today. There are 3 core functions: downloadCRN(), collate*(), and writeDataset(). The download function does all the heavy lifting to download both daily and hourly data from CRN. Data starts in the year 2000 and extends to today. The downloadCRN function uses RCurl to get the directory listings from the FTP site and create the download lists. Then the process of downloading the 1000+ files starts.
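The listing step can be sketched roughly as follows. This is only an illustration, not the package's actual code: the helper names `listFtpDir` and `buildDownloadList`, the example URL, the file names and the `.txt` extension are all assumptions made for the sketch. Depending on the libcurl version RCurl was built against, the "names only" option may be called `dirlistonly` or `ftplistonly`.

```r
# Fetch a bare file listing for one FTP directory (needs RCurl and network access).
# Assumption: this RCurl build calls the option `dirlistonly`.
listFtpDir <- function(dirUrl) {
  RCurl::getURL(dirUrl, ftp.use.epsv = FALSE, dirlistonly = TRUE)
}

# Turn the raw listing text into full download URLs (hypothetical helper).
buildDownloadList <- function(listing, dirUrl) {
  files <- unlist(strsplit(listing, "\r?\n"))  # listings may use \n or \r\n
  files <- files[nzchar(files)]                 # drop empty entries
  files <- files[grepl("\\.txt$", files)]       # keep data files only (assumed extension)
  paste0(sub("/+$", "", dirUrl), "/", files)
}

# Offline example with a made-up listing and a placeholder URL
listing <- "CRND0102-2011-AK_Barrow_4_ENE.txt\r\nCRND0102-2011-AZ_Tucson_11_W.txt\r\n"
urls <- buildDownloadList(listing, "ftp://example.org/crn/daily/2011/")
print(urls)
```

Keeping the parsing separate from the network call means the download list can be checked offline before any of the files are fetched.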

The function lets you control whether you want daily files or hourly. I just get both. The data comes in station files: one file for every station for every year.

The next step is to collate these files into one monolithic file that contains the data for all the stations. For daily data we use collateDaily() and for hourly data we use collateHourly(). These functions have two side effects: they write a consolidated data file and a metadata file that records station names, lat/lon and Id number. In the case of the hourly data this file is quite large, over 1GB. Moreover, the file contains all the variables: Tmin, Tmax, solar radiation, soil temperatures.

The last function turns these monolithic files into what we are used to: files with one variable for all stations. That function is writeDataset(). It operates on either hourly or daily files and collects a single variable such as T_DAILY_MEAN. The function is defined like so:

writeDataset(filename, cnames = colnamesDaily, varname = "T_DAILY_MEAN")

The first argument, filename, points to either the monolithic daily data or the hourly data. The next argument, cnames, points to the column names of the dataset. These are predefined as constants for the package. The last argument, varname, is the variable you want to collect. When you run this function the side effect is that a file is written containing all the T_DAILY_MEAN data for every station. Effectively the package allows you to download the CRN data and build subsets of data from the huge collection. There are over 30 climate variables available, so I constructed the tool so that you can build datasets from the source data.
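The collate-and-extract flow described above can be sketched in a few lines of base R. This is a toy illustration under stated assumptions, not the package's implementation: the station Ids, the values, the column layout and the helper `extractVariable` are made up, and everything stays in memory rather than being written to files.

```r
# Two made-up station files (one data frame per station-year), already read in.
# Column names are illustrative, not the package's colnamesDaily constant.
st1 <- data.frame(Id = 1001, Date = c("2011-01-01", "2011-01-02"),
                  T_DAILY_MEAN = c(-3.2, -1.5), T_DAILY_MAX = c(0.4, 2.1))
st2 <- data.frame(Id = 1002, Date = c("2011-01-01", "2011-01-02"),
                  T_DAILY_MEAN = c(10.8, NA), T_DAILY_MAX = c(15.0, 16.3))

# Collate: stack the station files into one "monolithic" table.
monolithic <- do.call(rbind, list(st1, st2))

# Extract: keep one variable and lay it out as dates (rows) by stations (columns).
# extractVariable is a hypothetical helper for this sketch.
extractVariable <- function(data, varname) {
  tapply(data[[varname]], list(Date = data$Date, Id = data$Id),
         function(x) x[1])
}

tmean <- extractVariable(monolithic, "T_DAILY_MEAN")
print(tmean)
```

A call such as write.table(tmean, "tmean.dat") would then save the single-variable table, which mirrors the side effect that writeDataset() is described as having.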

Version 1.0 has been submitted to CRAN and should be up shortly. In the next installment I’ll probably add support for “zoo” objects and functions to create daily from hourly and monthly from daily. At that point it will be fully integrated with the RghcnV3 data structures.
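As a rough idea of what a daily-from-hourly step could look like, here is a minimal base R sketch. It is not the planned package function: the column name `T_HR`, the sample values and the choice to ignore missing hours are assumptions for illustration only.

```r
# Made-up hourly temperatures for one station over two days.
hourly <- data.frame(
  Id   = 1001,
  Date = rep(c("2011-01-01", "2011-01-02"), each = 3),
  Hour = rep(c(0, 12, 23), times = 2),
  T_HR = c(-4, 0, -2, 1, 5, NA)   # T_HR is a hypothetical column name
)

# Daily mean per station: average the hourly values within each day,
# skipping missing hours (one possible policy, not necessarily the package's).
daily <- aggregate(hourly["T_HR"],
                   by = list(Id = hourly$Id, Date = hourly$Date),
                   FUN = mean, na.rm = TRUE)
print(daily)
```

The same pattern, grouping by station and month instead of station and day, would give monthly values from daily ones.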




  1. October 19, 2011 at 1:37 AM

    Steven, I got the lyap_k package up and running. At this point it’s very black box, so I need to study the docs. Initial, zeroth-order results encouraging.

    Do these stations have evaporation pan and data?



    Is there an R-to-F parser somewhere 🙂

    • steven mosher
      October 19, 2011 at 3:01 AM

      I believe they may. It’s been a while since I looked at it. I would have to write some custom code
      to pull out whatever you like.

      R to Fortran?? oh man. that’s cruel. Doubt it. What do you really want to do?

      describe that, maybe I have a different solution

  2. BarryW
    November 4, 2011 at 8:15 PM

    I downloaded your crn package, (thanks for the effort) but I’ve run into a few problems.
    I’m running on an iMac with OSX 10.7
    R 2.14.0
    RCurl 1.6-10

    It wouldn’t download anything. I went in and did some debugging and found some issues in the getUrlsCRN function.

    the line:
    ftpDir <- getURL(yearUrl, ftp.use.epsv = FALSE, ftplistonly = TRUE)
    was throwing an error

    "In mapCurlOptNames(names(.els), asNames = TRUE) :
    Unrecognized CURL options: ftplistonly"

    The RCurl documentation doesn't show this as a valid option. On the Mac, at least, the directory listing that comes back is the full version. Of course, your parsing code wasn't expecting that and that seems to be causing the failure.

    I was able to get it to work by replacing that line with:

    ftpDir <- getURL(yearUrl, .opts = list(CUSTOMREQUEST = "NLST"))

    I also had to change the next line to

    names <- strsplit(ftpDir, split = "\n")

    to get it to work.

    So I don't know if this is something to do with the Mac, the particular versions I'm using or what, but I thought I'd let you know.

    • Steven Mosher
      November 4, 2011 at 8:54 PM

      I had the same problem with RCurl on the MAC.

      your libcurl is out of date and you need to upgrade it.

      nice work around though

      one of the reasons I switched to PC was difficulties with libcurl on the MAC

  3. BarryW
    November 14, 2011 at 9:59 AM

    I’ve slowly been getting your code to work on my MAC. I’ve seen a few more problems that don’t seem to be connected with RCurl in the collateDaily function. They seem to be in R itself and related to the apply function (it was fed a data frame and output a numeric vector).

    I did want to mention a problem I found in the data.
    The New Mexico site (3087) has two locations and the name is spelled differently.
    3087 35.82 -106.32 NM_Santa_Fe_20_WNW
    3087 35.78 -106.27 NM_Sante_Fe_20_WNW

    Just FYI since I thought it was very odd. Don’t know who to contact on this or if this is even a bug.

  4. BarryW
    November 14, 2011 at 10:05 AM

    Sorry, I found the metadata on the crn site and they did move it.

    • steven mosher
      November 14, 2011 at 12:17 PM

      Ya, I just recently discovered that the data on this ftp is actually more than CRN.
      When I finish my current project I’ll have to get back to this and fix things up.

    • BarryW
      November 21, 2011 at 6:16 AM

      I went through their metadata and tried to strain out the actual crn sites. I’ve got a file with my best guess at the real sites (you would think they would have a list, since they make a point that there are 114 sites). I winnowed out the non-lower-48 sites and the sites that were identified as “test”. If you’d like to have it, email me.

  5. TerryMN
    November 20, 2011 at 10:00 PM

    I’m just getting up to speed w/R so nothing to contribute, (ie this is off topic) but thought you’d find this interesting if you hadn’t seen it:


    It would be interesting to brainstorm if there are any climate puzzles that could be crowd sourced via an internet game, or some such thing IMO. Feel free to bulldoze.

  6. December 10, 2011 at 9:34 PM

    OT but might be of interest:
    Recent article by statistician Grant Foster ( Tamino )
    Global temperature evolution 1979–2010

    Click to access 1748-9326_6_4_044022.pdf

    it’s absolute nonsense!

  7. December 12, 2011 at 5:00 AM

    Hi Steven
    You need to publicise your blog, Dr. Curry hasn’t your name on the roll call.
    I see my post is still there, either you aren’t much of a censor or perhaps didn’t object to the content. As a sign of good will I would like to email you data on this very controversial graph, so I can have benefit of your comment. My email is on the graph, but as a blog keeper you should have it anyway.

  8. steven mosher
    December 12, 2011 at 9:13 AM

    I don’t practice censorship of any kind. Grown people can decide for themselves what to put in their heads and what not to put in their heads.

    I’m familiar with Grant’s work. In fact I work with some of his code.

    I’ll comment on whatever you like but I make all my comments in public so that people do not have to worry that I say one thing in private and another thing in public.

    • December 12, 2011 at 2:15 PM

      Thanks Steven, that’s fine. I would look for your comments here on your blog, which is far easier to access and use. My pc goes ‘nuts’ when I try to post something on Judith’s Climate etc

  9. February 14, 2012 at 6:28 AM

    There is no “systematic causal relationship” simply because the greenhouse conjecture is not based on real world physics.

    Prof Claes Johnson has proved in Computational Blackbody Radiation* that energy in radiation only gets converted to thermal energy if the peak frequency of the radiation from the source is above the peak frequency of the radiation from the target.

    This essentially provides a mechanism which explains why the Second Law of Thermodynamics also applies for radiative heat transfer, as it does for heat transferred by conduction.

    There seems no plausible alternative explanation for the observed Second Law, so I suggest we all heed what Johnson has deduced mathematically, being as he is, a Professor of Applied Mathematics.

    It is not the net radiative flux (or even its direction) which determines whether (and in which direction) thermal energy is transferred. For example, if the emissivity of two bodies is very different, there can be more radiative flux from the cooler one. But all that flux will be scattered by the warmer one and not converted to thermal energy. Only the flux from the warmer one (no matter how weak) will be converted to thermal energy in the cooler one. This “ensures” that the Second Law is valid in all cases because it depends
    on peak frequency which is proportional to absolute temperature – see http://en.wikipedia.org/wiki/Wien's_displacement_law

    Thus the IPCC “backradiation” cannot affect the temperature of the surface and there can be no atmospheric radiative greenhouse effect.

    * http://climate-change-theory.com/RadiationAbsorption.html

    • steven mosher
      February 14, 2012 at 10:01 AM

      Sorry Doug. Claes is wrong. Even professors make mistakes. He’ll survive.
