## Download the SST Verification

**Update: code here**

**BUG: reader roj found a bug: this line and the next. Replacing zip****. That’s what I get for staying up late**

**And another test is in the latest zip**

hadmonths<-hadmonths[1:totalMonths]

should be. ( including surrounding bits)

hadmonths<-read.table(Had2Monthly)

hadmonths<-hadmonths[1:totalMonths,1] # fixed

error<-hadmonths-results #fixed

print(max(error))

print(mean(error))

UPDATE: I heard back from Hadley. Nice. The difference arises from the fact that they take the average of the NH and the average of the SH and then take the mean of that. Anyway, mystery solved. They sent along a file of them doing it “my way” and we reconciled. wow. that was cool.

I’m going to walk through the SST verification. First just to show the work and more importantly to show ‘raster working. Going forward with the SST work I will start to use some additions to ‘raster’ that the developer has been adding as i work through this application. For now I will be using “loops” just to give you the basic ideas and as we plow ahead i’ll add more “brick’ operations and some new features that are being added. Let’s just start with the basics a raster is conceptually a 2D grid of defined by latitude and longitude. You can perform operations on that grid. If you have a raster name “Land”, then you can invoke ‘area(Land) and you get back a raster of areas for each grid cell. Area<-area(Land) thus ‘gets’ a raster with values in each cells that represent the area of that grid. If we abstract further we get to the concept of a ‘brick’ a multi layered raster. There will then be operations we can perform on the brick, as a whole, or layer by layer.

To the code: See the library commands below. Make sure you have the latest versions of these installed. For right now I won’t use the latest build of raster from Forge, but in the next few days I will switch to that as it will be on Cran shortly.

Start by making the SSTvalidation folder your working directory. Install the packages from Cran if you haven’t already. onward..

library(“R.utils”)

library(“raster”)

library(“ncdf”)

library(“zoo”)

source(“sstscripts.R”) # just some file names and things you need

First off, I go out and download the file for you and unzip it. ITS A BIG FILE. be patient.

########### download AND UNZIP the file

if(!file.exists(HadSST2ncdf)) downloadHadSST2()

############ This only needs to be done once################

fixtheNA<-open.ncdf(HadSST2ncdf, write=TRUE)

set.missval.ncdf( fixtheNA, ‘sst’, -1e+30 )

close.ncdf(fixtheNA)

As I noted before, they’ve set the metadata wrong for the file, so I open it, fix it and close the file.

seafile<-HadSST2ncdf

SST<-brick(seafile,varname=’sst’)

we can read directly into the ‘brick’

######### raster can figure out the ‘Z’ variable

#### Thanks to Robert J Hijmans, raster maintainer/developer.for explaining this.

# we know however that it starts in jan 1850

dates<-SST@zvalue

print(dates)

And you should see the dates.

totalLayers<-nlayers(SST) # every layer in the brick is a month

totalYears<-(2009-1850)+1

totalMonths<-totalYears*12

timeline<-as.yearmon(1850+seq(0,length=totalMonths)/12)

print(totalLayers)

And you should see the number of months, 1927. I’ll trim back the 2010 data to 2009. Next we will set up the loop paramets. Rememeber I’m just doing this for clarity at this point to unpack the operations. This goes away in the future, but for now, I do it this way. A slow clear implementation to get a valid approach and then make it elegant and I hope faster.

loop<-totalMonths

results<-vector(length=loop)

results will hold the output per month.next we loop over every raster

for(i in 1:loop){

mth<-raster(SST,layer=i)

First we get a layer

mask<-as.logical(mth)

Then we create a mask. You get a T whereever there is a temperature. F where there is no temp. That’s a land cell OR a missing Observation. Next we get the area of the grid.. ALL cells will have an area

Area<-area(mth)

Then we multiply the area values times ZERO or 1.

A<-Area*mask

Then we create weights. The weights are equal to the area in the cell divided by the total area of cells in the sample. Recall that A only contains areas where you have temperatures

weights<-Area/cellStats(A,sum)

Lastly we multiply weights by the temps. Recall that “mth” has temps or NAs. weights has fractional weights that sum to one. The weights multiply by the temps and the other cells get NA!

v<-mth*weights

results[i]<-cellStats(v,sum)

Then we stuff the sum of all weighted temps (cellStats() into the result[]. done

}

The rest of the code, outputs data and draws charts. we have near perfect agreement. A few months out of wack. I checked one example by hand. It looks like they have an issue when the number of grids is small.

charts below. Zip in the morning. need some rest

I downloaded your code and struggled through installing R and the packages. I had to change the following lines to get it to run. Don’t think this changed anything, ‘cos the output still looks the same (R noob, so not confident!)

‘hadmonths<-hadmonths[1:totalMonths]'    to    'hadmonths<-hadmonths[1:totalMonths,1]'

and ‘error<-hadmonths-month'    to    'error<-hadmonths-results'Hope this has relevance.

Thanks for going to the trouble of sharing this with us lurkers; R seems to do this type of stuff so easily, it’s a shame it looks as opaque as Egyptian hieroglyphs!

Hmmm. Well that didn’t work too well, and there was me trying to be clever with the formatting. Hey ho, back to lurking…

Thanks for that, I did the stupid, “one more change before I hit the sack trick”

There is a new zip with another test

Weird Roj,

Lemme know if you can just type out what change you had to make, as opposed to trying a copy and paste from the editor.

Hmm, if its opaque to you then I AM NOT doing a good job. i’ll walk you though line by line if you like, no worries as other jobs are running in the background.

So starting at the top, which line confuses you first

The changes I made were

‘hadmonths<-hadmonths[1:totalMonths]' to ‘hadmonths<-hadmonths[1:totalMonths, 1]'

and ‘error<-hadmonths-month' to ‘error<-hadmonths-results'

Wonder what that will look like…

I don't think I phrased my moan very well. I can follow what your code is doing, no problem with that. It was that R itself doesn't look very welcoming to a beginner. There seem to be large numbers of unintuitive functions with large numbers of unintuitive parameters to learn before one can do anything useful. But I suppose Z80 assembler looked like that when I started out using it!

Yes,

Until you get a hang of R lingo some stuff is hard to understand, I find the indexing ( as you saw in your bug) something that is easy

to mess up and POWERFUL once you get it.

Like a vector of numbers c(,1,3,5,6,8,11,12,13) can just be used to index a bigger object.

Example: Say I have a matrix 15 by 15. I can get all the odd columns in the even rows like so

bigmatrix<-matrix(1:225,ncol=15) # stuff sequence 1,2,..225 into a 15*15 matrix

out<-bigmatrix[seq(2,14,by=2),seq(1,15,by=2)] #rows even, columns odd

That’s better – KISS!

…and if I use ‘hadmonths<-hadmonths[,1:totalMonths]' from the new weighttest.R, I get an error

Error in `[.data.frame`(hadmonths, , 1:totalMonths) :

undefined columns selected

whereas ‘hadmonths<-hadmonths[1:totalMonths, 1]' seems to work?

I must have an old object floating around in my workspace. Thanks for the patience. Mine ( hadmonths[,1:totalMonths] should NOT work, since its a datframe frame with one column. But it does. weird. Another zip is coming.

Fixed: