Calculating Area Averaged Temperatures with Raster
Browsing through the r repository of packages a while back I happened upon ‘raster’ At the time I just used raster for getting “grid cells” for temperature stations, working with a landmask and area calculations and plotting. I didnt use raster for any substantial calculation. With the addition of SST it seems like the perfect tool. before getting to the problem with SST, I’ll turn back to the calculations made for the land. Basically to show some raster at work.
Lets recall the flow of processing the land code. Stations ( at lat/lon) are brought into a matrix that is ROWS of stations and columns of time (months). Then the routine as.Grid(), takes the station lon/lat and returns a cellnumber for the gridsize you use. A raster object of 3degreex3degrees is used to generate these cell numbers. basically the 2D description of lat/lon is turned into a unique cellnumber. for a 5degree world there are 36*72 cells. rows for lat, columns for longitude. Continuing, we then average the stations within each cell. If there are 3 stations in cell 1020, we average them. In a 3×3 world the stations per cell range from 0 to over 30. After this averaging we have a “cell” series. A 2d matrix of cells in the rows and time in the columns.
Then we proceed to “area average” these series. For each month we calculate the total area on the globe where land measurements are taken. No measurement, no area. We then calculate a weight for each cell based on its ratio to the total area. I use raster to do this, sort of. At the time it occurred to me that I could have raster do all this work, much clear and less prone to error. There are some challenges however, so I did some of those things by hand, knowing that I would return to raster later when working on SST, AND when combining the SST with the land.
Let’s start with the obvious problem in the land data. Co-located stations. The raster layer takes one value per cell. If we have multiple stations within a cell, then we have to reduce that problem. One way, is just to introduce a finer grid. What happen if we go down to a .25degree world? I’ll show that, just as a way to introduce how raster will solve some issues and make doing the SST easier. Remember with SST we have 1 measure per 5 degree bin.
First, lets build a fine grained world: .25dgrees per side of the gridcell. Fetch the Station Inventory, and build a synthetic temperature series.
world<-raster(nrows=rws,ncols=cls,xmn=-180, xmx=180, ymn=-90, ymx=90,
So, we have a dataframe with the longitude,latutude and fake temps. Next, lets see what gridcells those stations get assigned to
The point here is that we can see that the number of unique cells is smaller than the number of total cells. Look at the stations:
Cell Lon Lat Temp
5269 391763 -159.40 22.00 5.594263e-01
5268 391763 -159.35 21.98 1.525001e+00
5271 394648 -158.10 21.30 9.538807e-01
5272 394648 -158.07 21.32 -5.408068e-02
5273 394649 -157.93 21.35 9.353993e-01
5270 394649 -157.78 21.45 1.626147e+00
5279 397535 -156.47 20.83 1.428712e+00
5278 397535 -156.43 20.90 2.015386e+00
There are a couple ways to solve this, ( other suggestions welcomed) so for now, I’ll just remove all “co gridded”, cells there are only a couple hundred. The point to make here is how I can use raster to aggregate. So, lets remove the duplicates, for illustration only.
CellInv<-CellInv[which(CellInv$Cell %in% cells),]
Now, I have a dataframe, with one station per .25degree grid cell. What next? stick the temps in:
I apologize for the faint colors:
The fact=4, tells the package to combine cells 4:1, which gives us a 1 degree world. AND to use the function ‘mean’ to do this. NAs are handled without specifying na.rm. You get this:
What lies ahead? When it comes to working with SST I dont have to worry about aggregating. We have a 5 degree SST data set. for 1927 months! the question is how to process all those months. The key , I hope is in the “raster brick” A “brick” is a multilayer raster object. Each layer of the brick will represent a Time, a month.
Then for each month we have to.
1. get the area sampled ( that’s easy with masking and a water area map)
2. multiply every temp by the gridweight
3. cellStats(layer,sum) = temperature average for the month.
Once all 1927 months(layers) are done, then we aggregate the layers in groups of 12, taking the mean. Thats our annual average.
How to do that with a raster brick? I’ll play around some and see what happens. Also a question to the R help list, while I busy myself learning raster better