## Musical Interlude

After you download and execute MoshTemp202 (argh, I need to put this all under source control!) you can draw the nifty graph you see above. Remember, I hate writing 2D graphics code. Anyway, here is what you can do at the console:

**> temp <- rowMeans(V2Anomalies, na.rm=T)**

**> months <- zoo(temp, order.by=timeline)**

**> plot(months)**

**> plotAnomalySeries(getwd(), "test2", months)**

In MoshTemp202 we merely load the V2Anomalies object that we created in MT201 and wrote to file. For the folks who have read through GISSTEMP, I am taking a similar approach: document every data step with an output file. That lets me test every step and also share the data at every step.
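As a rough illustration of that one-file-per-data-step pattern (the values and the file path here are stand-ins, not the actual MoshTemp files), a save/load round trip in R looks like:

```r
# Sketch of the write-every-data-step pattern; the matrix content is invented.
V2Anomalies <- matrix(1:4, nrow = 2)          # stand-in for the real anomaly matrix
stepFile <- file.path(tempdir(), "V2Anomalies.Rdata")
save(V2Anomalies, file = stepFile)            # the earlier step ends by writing itself out
rm(V2Anomalies)
load(stepFile)                                # the next step begins by reloading it
```

Because each step round-trips through a file like this, any intermediate result can be inspected or shared without re-running the whole chain.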

V2Anomalies is the step after we have windowed the data and combined duplicates. See the comments section, where I explain JR's confusion over what a duplicate is. It's tricky, so be careful. Anyway, after we combine duplicates (I do it differently than Hansen) we create an anomaly. So V2Anomalies is a data structure that has COLUMNS of stations (each column NAME is a GHCN id) and a row for each month. Each row name is a time: 1900, 1900.083, and so on. The data in the matrix are temperatures in 1/10 C. No need to change that right now.
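To make that layout concrete, here is a tiny mock-up of the structure (the station ids and values are invented, not real GHCN data):

```r
# Mock V2Anomalies: columns are stations (GHCN-style ids), rows are months
# (decimal years), and values are anomalies in tenths of a degree C.
V2Anomalies <- matrix(c(5, -3, NA,
                        12, 8, 0),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("1900", "1900.083"),
                                      c("10160355000", "10160360000", "10160390000")))
V2Anomalies["1900", "10160355000"]   # one station, one month: 5, i.e. 0.5 C
```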

So, in MT201 we saw how to get a station count just by turning temperatures into T/F flags and summing. When you work in R you have to relearn some old tricks, like masking. Anyway, in today's chart and example I do the following: I take every ROW (month) and take the mean of all temps in that row. This is actually our FIRST estimate of the global average. See how? It's the average where ALL stations get the same weight. It's an important result: we have 5000 or so stations and we just average them all without regard to their spatial location.
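Both tricks, the flag-and-sum station count and the equal-weight row mean, can be sketched on a toy matrix (the values here are invented):

```r
V2Anomalies <- matrix(c(5, -3, NA, 12, 8, 0), nrow = 2, byrow = TRUE,
                      dimnames = list(c("1900", "1900.083"), c("A", "B", "C")))
# Station count per month: temps become TRUE/FALSE flags, then sum each row.
counts <- rowSums(!is.na(V2Anomalies))       # 2 stations report in month 1, 3 in month 2
# First global estimate: every reporting station gets equal weight.
temp <- rowMeans(V2Anomalies, na.rm = TRUE)
```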

Here is an analogy. You have a big swimming pool with 10 thermometers located in it. You don't know where. I ask you to estimate the average temperature of the pool. Your best guess is to average them all. That IS your best guess. There is no better guess; that guess has the smallest error. Now suppose I tell you that 9 of those thermometers are located within a 1-foot radius, while the other thermometer is 80 feet away. What's your best guess now? Well, clearly you want to take that spatial information into account. One approach is to average the 9 thermometers FIRST, and then average that result with the other thermometer.

What if I tell you that 5 thermometers are at the center of the pool in a 3-foot radius, 2 thermometers are at one end of the pool in a 3-foot radius, and the last 3 are at the other end in a 5-foot radius? Well, then you would average all thermometers within, say, a 5-foot radius, and you get 3 "blocks" or "cells" of "average" local temps. Then you average all the cells. What we are worried about is this: some spatial locations may warm or cool faster than others. By applying a spatial sampling we address that concern. If 9 of the ten were in a hot spot, that hot spot would get a FINAL weight of 1, IF we area-weight. I did this example in the comments, so let me repeat it here:

1. LA is 70F

2. SF is 50F

What's your best ESTIMATE of the average temperature for California? 60F. Now, is it 60F everywhere? Nope, that's not what the estimate MEANS. The estimate means this: given the information you have, your best estimate for an unknown location X,Y in California is… 60F. No guess will be better on average.
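The pool version of this, in R, shows how a flat average and a cluster-first average diverge (the readings are made up for illustration):

```r
# 9 thermometers cluster in one hot spot and read 80F; the tenth, far away, reads 60F.
readings <- c(rep(80, 9), 60)
flat <- mean(readings)                                 # 78: the cluster dominates
# Average the cluster first, then give each of the two locations equal weight:
spatial <- mean(c(mean(readings[1:9]), readings[10]))  # 70
```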

Now, I add some information. I tell you that it is 60F in Oakland. The chart above shows the following estimate: (50+60+70) = 180, divided by 3 = 60F. That's an estimate that takes no notice of the spatial information; Oakland is right next to SF. Our next step, therefore, is to apply our spatial information. We do that by gridding the stations onto the world. My gridding differs from Hansen's gridding. He uses equal-area squares. I use lat/lon grids and then adjust for the area in the grid. More on that later.
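Here is a minimal sketch of the lat/lon-grid-plus-area-adjustment idea, not the actual MoshTemp code: the 5-degree cell size is an assumption, the coordinates are approximate, and the cosine weight simply reflects that lat/lon cells shrink toward the poles.

```r
# Bin stations into 5x5 degree lat/lon cells, average within each cell,
# then weight each cell by cos(latitude) to adjust for cell area.
stations <- data.frame(name = c("LA", "SF", "Oakland"),
                       lat  = c(34.05, 37.77, 37.80),
                       lon  = c(-118.24, -122.42, -122.27),
                       temp = c(70, 50, 60))
stations$cell <- paste(floor(stations$lat / 5), floor(stations$lon / 5))
cellMean <- tapply(stations$temp, stations$cell, mean)  # SF and Oakland share a cell
cellLat  <- tapply(stations$lat,  stations$cell, mean)
w <- cos(cellLat * pi / 180)
sum(cellMean * w) / sum(w)    # area-weighted average: about 62.7F, not the flat 60F
```

SF and Oakland collapse into one cell mean of 55F, so LA's cell is no longer outvoted two to one; the grid, not the station count, sets the weights.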

The next step will involve bringing in the station inventories and "gridding" the inventory. So, a peek ahead:

Moshtemp30x: inventories spatialized

Moshtemp40x: area weighting

Moshtemp50x: final outputs

Then some project-related stuff, more directories, and sample studies. Beyond that? Big question. Bring in the sea? Incorporate Ron Broberg's metadata work? Statistical methods (Monte Carlo, paired testing, bootstrapping)? Or graphics code. Hehe. Do NOT expect great charts. I'll try to define an interface, but fonts and labels and legends and blah blah blah are not my cup of tea. I need a pretty assistant? Maybe TCO.

Please fetch Moshtemp2.zip at the drop.

You’re a denialist! Booga-Booga! You’re a skeptic fool! Wooga-wooga! Welcome to the club and have fun…

In your pool analogy, it would be depth that matters most, and I assume that’s the point of singing ‘Down in the Valley,’ right? Nice paper, btw.

Well, the point of the pool analogy is just to illustrate the concept of spatial sampling.

It's a wading pool, 3 feet deep all around. Thing is, you don't know that. WHAT is your best estimate GIVEN your GIVENS? We don't get to ask what-ifs. What if simians were expelled from the southern exit of my alimentary canal?

I’ll ask Jane Goodall next time we speak. I think they would be slightly annoyed…

ha.

Hey Mosh,

it's an interesting graph, but what is it representing? A ~10C increase over the C20th seems a bit drastic; is it supposed to be ~1C? Where did the 1940 spike go?

Read the text. The measurements are all in 1/10 C, like the source. When we get to the very end I will scale it properly. The choice has to be made to either scale at the front end of the stream or at the end. Currently I will ~~multiply~~ divide (doh) by 10 at the end. The 1940 spike? Recall you have an UNWEIGHTED average here, so you can't compare directly to GISS or CRU… YET.

OK. Thanks for all the hard work you’ve done on this.

thanks tallone