
RghcnV3 2.0

Well, version 2.0 is in the can and I’ll be uploading it to CRAN over the next couple of days. Let’s go over the highlights. Prior to version 2.0 we had basically three kinds of data flowing around the package: the V3 14-column format, zoo objects, and mts objects. The 14-column format has always been a PITA, and much of the code was designed to transform it into 2D zoo or mts objects with station data organized into columns. After reviewing some of Nick’s code it became clear that there was a way to get rid of the 14-column data and streamline a bunch of the code. Going forward there are three types of objects: Zoo and Mts, which are 2D representations of station data, and Nick’s 3D version, which is an array. On input you then select which style you like, and the readV3Data() function has been restructured for both speed and configurability:

readV3Data(filename = "foo", output = c("Array", "Zoo", "Mts")). On ingest you decide what format you want to work in. If you change your mind, there is a set of functions to handle conversion: asZoo(), asMts(), and asArray(), and of course a set of logical functions to test the types. The core analysis functions have also been rewritten to accept any of these three object types. So passesCamZoo(), passesCamMts(), etc. have all been replaced with one function, passesCam(), which accepts all three types of objects and just works. Some functions, such as Roman’s function, Tamino’s function, and rasterizeZoo(), still have limited input: they require, for example, an Mts input or a Zoo input. I’ll probably enhance those functions in another release, and then it’s done.
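
A sketch of how the new interface fits together. This is illustrative only: the function names are the ones given above, but the filename and the simplified argument lists are assumptions.

```r
# Illustrative sketch of the 2.0 interface (argument lists simplified;
# "v3.mean.dat" is a placeholder filename).
library(RghcnV3)

# Pick your working representation at ingest time:
Tarray <- readV3Data(filename = "v3.mean.dat", output = "Array")

# Change your mind later with the conversion functions:
Tzoo <- asZoo(Tarray)
Tmts <- asMts(Tzoo)

# The core analysis functions accept any of the three types,
# so the same call works on all of them:
cam <- passesCam(Tarray)   # same result starting from Tzoo or Tmts
```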

With 2.0, then, you have these kinds of paths:

readV3Data(); passesCam(); anomalize(); rasterizeZoo() and you have stations selected by CAM and area averaged.

readV3Data(); averageStations(); rasterizeCells() and you have station temperatures estimated by Roman’s regression, by grid cell.

readV3Data(); inverseDensity(); solveTemperature() and you have Nick Stokes’ solution.

And then you have Tamino’s approach as well. What we know from all this is that the various methods of computing averages for temperature stations yield the same global answers. There may, however, be slightly different answers if you look at smaller regions or have data that is too sparse in the temporal domain for CAM.
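
The three paths above, spelled out as pipelines. Again a sketch: the intermediate objects and the extra arguments are assumptions; only the function names and their order come from the post.

```r
library(RghcnV3)

# Path 1 (CAM): select stations by the common anomaly method,
# anomalize, then area-average on a raster.
dat   <- readV3Data("v3.mean.dat", output = "Zoo")
cam   <- passesCam(dat)
anom  <- anomalize(cam)
gavg  <- rasterizeZoo(anom)

# Path 2 (Roman's regression): average stations by grid cell.
dat2  <- readV3Data("v3.mean.dat", output = "Mts")
avg   <- averageStations(dat2)
cells <- rasterizeCells(avg)

# Path 3 (Nick Stokes): inverse-density weights, then solve.
dat3  <- readV3Data("v3.mean.dat", output = "Array")
w     <- inverseDensity(dat3)
sol   <- solveTemperature(dat3, w)
```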

Next Steps:

There are several things I want to do and a few things I have to do.

1. Start work on GHCN Daily.

2. Incorporate Zeke’s paired-station approach.

3. Do some more work on CHCN.

4. OOP: S4 classes and methods.

5. Incorporate more of Nick’s work.

6. Metadata package.

7. Demos and studies.


  1. cce
    August 8, 2011 at 12:39 PM

    Are you going to implement some kind of step-change detection algorithm at some point? I.e., for splitting station data into multiple series. Also, this isn’t exactly on the same page, but making the program flexible enough to process radiosonde data would also be useful. There are tons of experiments that can be done there as well.

    • Steven Mosher
      August 8, 2011 at 9:00 PM

      Hi cce. I hope what I am doing will allow the integration of strucchange-type packages or algorithms. For me it’s all about getting the interface down right. That’s why I’m hesitating on the OOP design. I will be looking at some structural-change stuff; the slicing may drive the design. I looked at the BEST work early on, pre-slicing, and it occurred to me that the data structures would have to change. I know that’s a bit cryptic. There may be some other step-change work that I do this fall; we will see. Radiosonde: ya, cool, I will look into it. If there is a working scientist with a real problem, that makes my task easier and more compelling. That Canada shit was pure crazy.

      I think I need to rename the packages: RghcnV3 is the wrong name.

  2. cce
    August 9, 2011 at 9:00 PM

    How about “RTemp”? Or, if that is too generic, “RGlobalTemp”.

    • Steven Mosher
      August 9, 2011 at 11:05 PM

      That’s a good idea. I need to think through a bunch of issues about package development, namespaces, and functionality. For radiosonde data, point me at some sources; I’ve been looking at Arctic buoy data as well.

      I’m thinking that I probably need to abstract out the following:

      1. Packages to get climate data and format it:
      A) GHCN, Env Canada, radiosonde, etc.

      2. Analysis tools: Nick’s stuff, Roman’s stuff.

      3. Station metadata package. I think this is fairly straightforward: basically
      lat/lons fed to GIS stuff. BUT the issue is the system of tying this back to

      Gotta run, but point me at the radiosonde data.

  3. cce
    August 10, 2011 at 6:25 AM

    This is a good place to start:

    I think the “slicing” method would be particularly good at correcting for instrument changes, but the density is so low in most areas that detecting undocumented changes might be difficult.

    • Steven Mosher
      September 5, 2011 at 10:26 PM

      Thanks cce. that’s the one I was looking at as well.

      I’m deep into GHCN Daily right now, but I’m taking your suggestion, and I will try to split the packages into data-formatting packages and then data-processing packages.

      So we would have: ghcnMonthly, canada, ghcn daily, and radiosonde,

      then one separate package with all the processing stuff,

      and a metadata package I’m finishing just now.

  4. Jan Verbesselt
    September 5, 2011 at 7:30 PM

    Nice package. Do you perhaps also know about tools/packages in R to download monthly rainfall data up to now (i.e. July 2011)? E.g. for East Africa or globally, for specific locations or at a coarse spatial resolution.
    Thanks, Jan

    • Steven Mosher
      September 5, 2011 at 10:22 PM

      I would start with GHCN Daily.

      I’ve posted a new package for that. However, I have not tested it for precipitation. It should work in principle, or with a couple of tweaks to the code. If you are interested, let me know.
      Also, CRU has a precip data product (I believe).

      If you have daily, monthly is easy.
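
      For that last point, a minimal base-R sketch of collapsing a daily series to monthly values. The data frame here is synthetic; with real GHCN Daily data you would read the station file first.

      ```r
      # Collapse daily precipitation to monthly totals (synthetic data).
      daily <- data.frame(
        date   = seq(as.Date("2011-01-01"), as.Date("2011-03-31"), by = "day"),
        precip = runif(90, min = 0, max = 10)   # placeholder values, in mm
      )
      daily$month <- format(daily$date, "%Y-%m")

      # Monthly totals (use mean() instead of sum() for temperature):
      monthly <- aggregate(precip ~ month, data = daily, FUN = sum)
      ```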

  5. nishadh
    August 16, 2012 at 4:54 PM

    RghcnV3 is a great help for working with temperature data, especially with Berkeley Earth Surface Temperature data. Thank you very much.
    I am working on finding regional average temperatures, and I learned that the regionalAverage.R function in RghcnV3 would be helpful for this. But I couldn’t find this function in the latest package, RghcnV3 2.9. Earlier versions, say 1.5, have it, and it worked very well with BEST data.
    Please clarify: was the function renamed or removed in the latest version of RghcnV3? And is the analysis on the right track using the earlier version’s regionalAverage.R function? Kindly enlighten me.

    • Steven Mosher
      August 16, 2012 at 9:55 PM

      Hi, thanks.

      I renamed that function to referenceStation(); the underlying code is the same as in V1.5.
      If that doesn’t look right to you, just email me at moshersteven AT gmail

      • nishadh
        August 18, 2012 at 12:31 PM

        Hello Steven Mosher,
        Got it, thank you very much. I should have read the RghcnV3 package manual.

        No issue with the function.

      • Steven Mosher
        August 19, 2012 at 12:28 AM

        Well, I do need to redo the entire manual. I’m at the point where I want to redo everything in S4.
