Home > Uncategorized > MoshTemp301: Location,Location,Location

MoshTemp301: Location,Location,Location

MoshTemp301 is all about Location. Putting stations on the map.

refreshGhcnInv()

Globe<-getLandMask()

invObj<-getGhcnInventory()

gridObj<-as.GriddedInventory(invObj,Globe)

gridInv<-gridObj$Data

stations<-table(gridInv$Cell)

hist(stations,breaks=40,labels=T)

gridInv[1:100,c(1,2,3,4,17,18,19)]

Just a short explanation here, and then more before I do the final drop of MT301. If you run MT301, you get the chart above and the output below

>gridInv[1:100,c(1,2,3,4,17,18,19)]
Id                    Name    Lat   Lon Cell      Area      Land
1   10160355000                  SKIKDA  36.93  6.95 2103  88477.57  27428.05
2   10160360000                  ANNABA  36.83  7.82 2103  88477.57  27428.05
3   10160390000            DAR-EL-BEIDA  36.72  3.25 2102  88477.57  23888.94
4   10160395001              FTNATIONAL  36.52  4.18 2102  88477.57  23888.94
5   10160400001              CAP CARBON  36.80  5.10 2102  88477.57  23888.94
6   10160402000                  BEJAIA  36.72  5.07 2102  88477.57  23888.94
7   10160403000                  GUELMA  36.47  7.47 2103  88477.57  27428.05
8   10160419000            CONSTANTINE   36.28  6.62 2103  88477.57  27428.05
9   10160425000                  CHLEF   36.22  1.33 2101  88477.57  13959.79
10  10160425001            ORLEANSVILLE  36.17  1.50 2101  88477.57  13959.79
11  10160430000                MILIANA   36.30  2.23 2101  88477.57  13959.79
12  10160444000          BORDJ BOU ARR   36.07  4.77 2102  88477.57  23888.94
13  10160445000                  SETIF   36.18  5.42 2102  88477.57  23888.94
14  10160457000          MOSTAGANEM VI   35.88  0.12 2221  91909.99  91195.14

To the code. The first function refreshGhcnInv() downloads the station inventory from NOAA and then cleans up the text fields in that file ( so they print a bit nicer..more  work here) and creates and writes a Rdata object to your working directory. Then we Globe<-getLandMask() which means we load the data in the land mask. Next we read in the station inventory object:invObj<-getGhcnInventory(). The next step is critical. We take that inventory and we feed it to a function which will add the extra geographical information we need. Basically the grid cell the station lives in, the AREA of that gridcell and the actual LAND area. The GHCN inventtory has data from surface ships and some tiny islands. They will register ZERO land.  There are not many and it doesnt matter how you treat these stations in the final analysis. I will drop them. But rest assured I tested it both ways before making that decision. Here is that function. Take the ghcnInv data and the Globe and return me a gridded inventory gridObj<-as.GriddedInventory(invObj,Globe). Recall that in MoshTemp202 I added some bits ( Parameters) to the output objects to make them more encapsulated self documenting objects. The DATA I want is in the $Data element. get that: gridInv<-gridObj$Data. There now I have a GHCN Inventory file that has been “gridded.” I’ve added geographical information to it that is KEYED to the gridding I selected for my land mask: Lets see how:

> gridObj$Parameter

$Title

[1] “BASELAND1900-2009”

$Cellsize

[1] 3

$Years

[1] NA

$Cam

[1] NA

$Landmask

[1] NA

$SourceData

[1] “v2.temperature.inv”

$SourceDate

[1] “2010-08-19 15:29:00 PDT”

So, whenever I am reading and writing these objects to file I have to make sure I update anything that changes. In this case when I created the gridded inventory I gave the source data, the DATE of that source data and since I took information that depends upon the cell size, I updated that as well. Recall the cell number has no INHERENT meaning. If the world is gridded in 3 degrees, then cell number 1 will cover a specific 3 degree by 3 degree grid (-180E,90N to -177E,87N) If I did a 5 degree grid cell one would cover a different grid. There are 7200 total grids in a 3degree world.
Next I do a simple summary to quick check that I ported by old code correctly  stations<-table(gridInv$Cell) and I make a histogram that shows a lot of cells with one station only. hist(stations,breaks=40,labels=T). The table function gives me a frequency count of stations in every cell. If a cell has 40 stations, the count is 40. 5 stations, the count is 5. The chart for that is above

lastly I’ll output a portion of the inventory and check gridInv[1:100,c(1,2,3,4,17,18,19)] I have selected the first 100 records, and output the first 4 fields and the last three fields. Those fields get appended when you call “as.GriddedInv”. They are $Cell, $Area, $Landarea.

so the first cell has a land area that is LESS than the grid cell area. WTF? ah, that means one thing, noob.
sucker must be near the coast and only PART of the 3 degree cell is land: Check for yourself:
>gridInv[1:1,c(1,2,3,4,5,6,7,8,9,10,11,17,18,19)]
Id    Name   Lat  Lon Altitude Grid_Alt Rural Population Topography Vegetation Coastal Cell     Area     Land
1 10160355000  SKIKDA 36.93 6.95        7       18     U        107         HI         NA      CO 2103 88477.57 27428.05

YUP! $Coastal=CO Hit Google to check that thing( later we output google earth tours wo0 hoo):


More later. I guess my inbox is filling

Categories: Uncategorized
  1. August 20, 2010 at 3:52 PM

    Where do you get your land mask from?

  2. August 24, 2010 at 1:50 PM

    Brill, thanks!

    • Steven Mosher
      August 24, 2010 at 9:09 PM

      David, I disovered an issue with Land masks last night while trying to replicate HAdiSSt2. Not really an issue, but something you need
      to be aware of. The Mask I used has “inland water”. It’s probably better to use the masks on that site that have inland water removed if you are doing a
      merge of Land and SST values. It depends upon how you do your Area weighting.

      Most people do there area weighting by simply looking at the “extent’ of a grid cell ( a simple function of latitude). If you want something more fine grained that that you use the percent of land/water in a grid. That’s what I do, But in selecting a land grid with inland water, you get the
      actual land area. Probably better to remove the inland water. Its a minor minor detail that doesnt impact the total averages
      rather just the accuracy as you weight each cell by the cell area/total area. When I finish the SST stuff I’ll go back and update
      accordingly

  3. August 25, 2010 at 2:53 PM

    At the moment I’m just concerned with ccc-gistemp. For any given cell (there are 8000) GISTEMP ends up selecting either the land series (which may be empty) or the ocean series. There is no weighting or blending or anything clever. I’m merely trying to represent that land/ocean decision as a mask. Recall that the GISTEMP grid cells are equal area, that leads to a tiny tiny simplification in the ccc-gistemp code.

    Concerns about lakes noted.

    • Steven Mosher
      August 25, 2010 at 7:35 PM

      Hi David, if you look at the site where I got the mask from you’ll note a data product that is a BINARY mask. The percentage maps
      are turned into a binary 0/1 map. So .GT. 50% gets a 1 for land. So just hunt around that URL and you’ll see the binary map.
      Zeke, and chad and I all tested with both choices: percentage maps and binary maps. No difference. Nick stokes tested as well.
      I actually like the equal area approach just for simplicity of downstream math.

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: