Home > Uncategorized > Download the SST Verification

Download the SST Verification

Update: code here

BUG: reader roj found a bug: this line and the next. Replacing zip. That’s what I get for staying up late

And another test is in the latest zip


should be.  ( including surrounding bits)


hadmonths<-hadmonths[1:totalMonths,1] # fixed

error<-hadmonths-results  #fixed



UPDATE: I heard back from Hadley. Nice. The difference arises from the fact that they take the average of the NH and the average of the SH and then take the mean of that.  Anyway, mystery solved. They sent along a file of them doing  it “my way” and we reconciled. wow. that was cool.

I’m going to walk through the SST verification. First just to show the work and more importantly to show ‘raster working. Going forward with the SST work I will start to use  some additions to ‘raster’ that the developer has been adding as i work through this application. For now I will be using “loops” just to give you the basic ideas and as we plow ahead i’ll add more “brick’ operations and some new features that are being added. Let’s just start with the basics a raster is conceptually a 2D grid of defined by latitude and longitude. You can perform operations on that grid. If you have a raster name “Land”, then you can invoke ‘area(Land) and you get back a raster of areas for each grid cell. Area<-area(Land) thus ‘gets’ a raster with values in each cells that represent the area of that grid. If we abstract further we get to the concept of a ‘brick’ a multi layered raster. There will then be operations we can perform on the brick, as a whole, or layer by layer.

To the code: See the library commands below. Make sure you have the latest versions of these installed. For right  now I won’t use the latest build of raster from Forge, but in the next few days I will switch to that as it will be on Cran shortly.

Start by making  the SSTvalidation folder your working directory. Install the packages from Cran if you haven’t already. onward..





source(“sstscripts.R”)  # just some file names and things you need

First off, I go out and download the file for you and unzip it. ITS A BIG FILE. be patient.

########### download AND UNZIP the file

if(!file.exists(HadSST2ncdf)) downloadHadSST2()

############ This only needs to be done once################

fixtheNA<-open.ncdf(HadSST2ncdf, write=TRUE)

set.missval.ncdf( fixtheNA, ‘sst’, -1e+30 )


As I noted before, they’ve set the metadata wrong for the file, so I open it, fix it and close the file.



we can read directly into the ‘brick’

######### raster can figure out the ‘Z’ variable

#### Thanks to Robert J Hijmans, raster maintainer/developer.for explaining this.

# we know however that it starts in jan 1850



And you should see the dates.

totalLayers<-nlayers(SST) # every layer in the brick is a month





And you should see the number of months, 1927. I’ll trim back the 2010 data to 2009. Next we will set up the loop paramets. Rememeber I’m just doing this for clarity at this point to unpack the operations. This goes away in the future, but for now, I do it this way. A slow clear implementation to get a valid approach and then make it elegant and I hope faster.



results will hold the output per month.next we loop over every raster

for(i in 1:loop){


First we get a layer


Then we create a mask. You get a T whereever there is a temperature. F where there is no temp. That’s a land cell OR a missing Observation. Next we get the area of the grid.. ALL cells will have an area


Then we multiply the area values times ZERO or 1.


Then we create weights. The weights are equal to the area in the cell divided by the total area of cells in the sample. Recall that A only contains areas where you have temperatures


Lastly we multiply weights by the temps. Recall that “mth” has temps or NAs. weights has fractional weights that sum to one. The weights multiply by the temps  and the other cells get NA!



Then we stuff the sum of all weighted temps (cellStats() into the result[]. done


The rest of the code, outputs data and draws charts. we have near perfect agreement. A few months out of wack. I checked one example by hand. It looks like they have an issue when the number of grids is small.

charts below. Zip in the morning. need some rest

Categories: Uncategorized
  1. Roj
    August 25, 2010 at 9:35 PM

    I downloaded your code and struggled through installing R and the packages. I had to change the following lines to get it to run. Don’t think this changed anything, ‘cos the output still looks the same (R noob, so not confident!)
    and ‘error<-hadmonths-month'&nbsp&nbsp&nbsp&nbspto&nbsp&nbsp&nbsp&nbsp'error<-hadmonths-results'Hope this has relevance.
    Thanks for going to the trouble of sharing this with us lurkers; R seems to do this type of stuff so easily, it’s a shame it looks as opaque as Egyptian hieroglyphs!

  2. Roj
    August 25, 2010 at 9:37 PM

    Hmmm. Well that didn’t work too well, and there was me trying to be clever with the formatting. Hey ho, back to lurking…

    • Steven Mosher
      August 25, 2010 at 10:48 PM

      Thanks for that, I did the stupid, “one more change before I hit the sack trick”

      There is a new zip with another test

  3. Steven Mosher
    August 25, 2010 at 9:45 PM

    Weird Roj,

    Lemme know if you can just type out what change you had to make, as opposed to trying a copy and paste from the editor.

    Hmm, if its opaque to you then I AM NOT doing a good job. i’ll walk you though line by line if you like, no worries as other jobs are running in the background.

    So starting at the top, which line confuses you first

  4. Roj
    August 26, 2010 at 2:17 AM

    The changes I made were

    ‘hadmonths<-hadmonths[1:totalMonths]' to ‘hadmonths<-hadmonths[1:totalMonths, 1]'

    and ‘error<-hadmonths-month' to ‘error<-hadmonths-results'

    Wonder what that will look like…

    I don't think I phrased my moan very well. I can follow what your code is doing, no problem with that. It was that R itself doesn't look very welcoming to a beginner. There seem to be large numbers of unintuitive functions with large numbers of unintuitive parameters to learn before one can do anything useful. But I suppose Z80 assembler looked like that when I started out using it!

    • Steven Mosher
      August 26, 2010 at 4:13 AM


      Until you get a hang of R lingo some stuff is hard to understand, I find the indexing ( as you saw in your bug) something that is easy
      to mess up and POWERFUL once you get it.

      Like a vector of numbers c(,1,3,5,6,8,11,12,13) can just be used to index a bigger object.
      Example: Say I have a matrix 15 by 15. I can get all the odd columns in the even rows like so

      bigmatrix<-matrix(1:225,ncol=15) # stuff sequence 1,2,..225 into a 15*15 matrix

      out<-bigmatrix[seq(2,14,by=2),seq(1,15,by=2)] #rows even, columns odd

  5. Roj
    August 26, 2010 at 2:17 AM

    That’s better – KISS!

  6. Roj
    August 26, 2010 at 2:54 AM

    …and if I use ‘hadmonths<-hadmonths[,1:totalMonths]' from the new weighttest.R, I get an error

    Error in `[.data.frame`(hadmonths, , 1:totalMonths) :
    undefined columns selected

    whereas ‘hadmonths<-hadmonths[1:totalMonths, 1]' seems to work?

    • Steven Mosher
      August 26, 2010 at 3:27 AM

      I must have an old object floating around in my workspace. Thanks for the patience. Mine ( hadmonths[,1:totalMonths] should NOT work, since its a datframe frame with one column. But it does. weird. Another zip is coming.

    • Steven Mosher
      August 26, 2010 at 4:03 AM


  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: