
Modis QC Bits

In the course of working through my MODIS LST project and reviewing the steps that Imhoff and Zhang took, as well as the data preparations other researchers have made (Neteler), the issue of MODIS quality control bits came up. Every MODIS HDF file comes with multiple SDS, or layers of data. For MOD11A1 there are layers for daytime LST, nighttime LST, the time of collection, the emissivity from channels 31 and 32, and two layers of quality bits. A list of the layers in a MOD11A1 file is available here. Or, if you are running the MODIS package, you can call getSds() and pass it the name of an HDF file.

As Neteler notes, the final LST files contain pixels of various quality. Pixels that are obscured by clouds are not produced, of course, but that does not entail that every pixel in the map is of the same quality. The SDS layers for QC, QC_Day and QC_Night, contain the QC bits for every lat/lon in the LST map. On the surface this seems pretty straightforward, but decoding the bits can be a bit of a challenge. To do that I considered several sources.

1. Neteler’s web post

2. The MODIS group’s user guide

3. The QA Tutorial

4. LP DAAC Tools, specifically the LDOPE tools provided by the guys at MODLAND

I will start with 4, the LDOPE tools. If you are a hard core developer I think these tools would be a great addition to your bag of tricks. I won’t go into a lot of detail, but the tools allow you to do a host of processing on HDF files from the command line. They provide source code in C, and at some point in the future I expect that I will want to either write an R interface to these tools or take the C code directly and wrap it in R. Since they have the code for reading HDF directly and quickly, it might be a source to use to support the direct import of HDF into raster. However, I found nothing to really help with understanding the issue of decoding the QC bits. You can manipulate the QC layer and do a count of the various flags in the data set. Using the following command on all the tiles for the US I was able to poke around and figure out what flags are typically reported:

comp_sds_values -sds=QC_Day C:\\Users\\steve\\Documents\\MODIS_ARC\\MODIS\\MOD11A1.005\\2005.07.01\\MOD11A1.A2005182.h08v05.005.2008040085443.hdf

However, I could do the same thing simply by loading a version of the whole US into raster and using the freq() function. Figuring out what values are in the QC file is easy, but we want to understand what they mean. Let’s begin with that list of values. By looking at the frequency count of the US raster and at a large collection of HDF files I found the following values for the QC layer:

values <-c(0,2,3,5,17,21,65,69,81,85,129,133,145,149,193)

The question is what they mean. The easiest one to understand is 0. A zero means that this pixel is of the highest quality. If we want only the highest quality pixels, then we can take the QC map, turn it into a logical mask and apply it to the LST map such that we only keep pixels that have 0 for their quality bits. There, job done!

Not so fast. We throw away many good pixels that way. To understand the QC bits let’s start with the table provided.


If you are not clear on how this works: every number in the QC map is an unsigned 8 bit integer, so the range of values is 0 to 255. The integers represent bit codes, which requires you to do some base 2 math. That is not the tricky part. The number 8, for example, is 1000 in binary and 9 is 1001. If you are unclear about binary representations of integers I suppose Google is your friend.
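For readers who want to see the binary form directly, here is a minimal sketch of a helper that prints an 8 bit value in the conventional order, most significant bit on the left. The name to_binary is my own for illustration and is not part of any package.

```r
# Convert an integer in 0..255 to an 8-character binary string,
# written the conventional way with the most significant bit on the left.
to_binary <- function(x) {
  bits <- as.integer(intToBits(x)[1:8])   # intToBits() gives bit 0 first
  paste(rev(bits), collapse = "")         # reverse so bit 7 comes first
}

to_binary(8)   # "00001000"
to_binary(9)   # "00001001"
```

The leading zeros are simply the unused high bits of the 8 bit value.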

Given the binary representation of the integer value we are then in a position to understand what the quality bits represent, but there is a minor complication which I’ll explain using an example. Let’s take the integer value of 65. In R we turn this into a binary representation by calling intToBits().

> intToBits(65)
[1] 01 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[25] 00 00 00 00 00 00 00 00

We only need 8 bits, so we should do the following

> intToBits(65)[1:8]
[1] 01 00 00 00 00 00 01 00

and for more clarity I will turn this into 0 and 1 like so

> as.integer(intToBits(65)[1:8])
[1] 1 0 0 0 0 0 1 0

So the first bit (bit 0) has a value of 1, and the seventh bit (bit 6) has a value of 1. We can check our math: 2^6 = 64 and 2^0 = 1, so it checks. Don’t forget that the 8 bits are numbered 0 to 7 and each represents a power of 2.
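The same check can be done in R rather than by hand. This is just a small sketch using base R: it lists which bit numbers are set and rebuilds the integer from the powers of 2.

```r
bits <- as.integer(intToBits(65)[1:8])   # bit 0 first, bit 7 last

# Bit numbers that are set (subtract 1 because R indexes from 1)
which(bits == 1) - 1                     # 0 6

# Rebuild the integer: each bit k contributes 2^k
sum(bits * 2^(0:7))                      # 65
```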

If you try to use this bit representation to understand the QC “state”, you will go horribly wrong! It’s wrong because intToBits() lists the least significant bit first, while the QC table is written the conventional way, with the least significant bit on the right. In the example above bit zero is on the left, so we need to flip the bits so that bit zero is on the right. I’ll loosely call the two orderings little endian and big endian here, although strictly speaking endianness refers to byte order, which doesn’t come into play for a single 8 bit value. Little Endian, as printed by intToBits(), goes left to right from 2^0 to 2^7. Big Endian goes left to right from 2^7 to 2^0.

Little Endian:   1 0 0 0 0 0 1 0   = 65       2^0 + 2^6

Big Endian  0 1 0 0 0 0 0 1         = 65        2^6 + 2^0

The difference is this:

If we took 1 0 0 0 0 0 1 0 and ran it through the table, the table would indicate that the first two bits, 10, mean the pixel was not generated, and the 1 in the 6th position would indicate an LST error of <= 3K. Now flip those bits:

> f<-as.integer(intToBits(65)[1:8])
> f[8:1]
[1] 0 1 0 0 0 0 0 1

This bit order has the first two bits as 01, which indicates that the pixel is produced but that other quality bits should be checked. Since bit 6 is 1, the pixel has an LST error greater than 2 K but no more than 3 K.
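As a cross-check on the flipping approach, the four 2-bit fields can also be pulled out arithmetically, with no bit ordering to get wrong. This is a sketch of my own, not something taken from the user guide; the function name decode_qc and the column names are made up for illustration. Each field comes back as a number from 0 to 3, which corresponds to the bit pairs 00, 01, 10 and 11 in the table.

```r
# Split a QC value into its four 2-bit fields using integer division
# and remainder. Field values 0..3 correspond to bit pairs 00, 01, 10, 11.
decode_qc <- function(qc) {
  qc <- as.integer(qc)
  data.frame(
    value        = qc,
    mandatory_qa = qc %% 4L,             # bits 1-0
    data_quality = (qc %/% 4L) %% 4L,    # bits 3-2
    emis_error   = (qc %/% 16L) %% 4L,   # bits 5-4
    lst_error    = (qc %/% 64L) %% 4L    # bits 7-6
  )
}

decode_qc(c(0, 65, 81))
```

For 65 this gives mandatory_qa = 1 (bits 01) and lst_error = 1 (bits 01), which matches what the flipped bit string shows.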

In short, to understand the QC bits we first turn the integer into bit notation, then flip the order of the bits, and then use the table to figure out the QC status. For MOD11A1 I wrote a quick little program to generate all the possible bit patterns and then add descriptive fields. I would probably do this for every MODIS project I worked on, especially since I don’t work in binary every day and I made about 100 mistakes trying to do this in my head.

## one row per possible QC value; bit columns run from Bit7 down to Bit0
QC_Data <- data.frame(Integer_Value = 0:255,
Bit7 = NA,
Bit6 = NA,
Bit5 = NA,
Bit4 = NA,
Bit3 = NA,
Bit2 = NA,
Bit1 = NA,
Bit0 = NA,
QA_word1 = NA,
QA_word2 = NA,
QA_word3 = NA,
QA_word4 = NA)

for(i in QC_Data$Integer_Value){
## intToBits() returns bit 0 first, so reverse to fill Bit7 ... Bit0
AsInt <- as.integer(intToBits(i)[1:8])
QC_Data[i+1,2:9]<- AsInt[8:1]
}

## bits 1-0: mandatory QA flag
QC_Data$QA_word1[QC_Data$Bit1 == 0 & QC_Data$Bit0==0] <- "LST GOOD"
QC_Data$QA_word1[QC_Data$Bit1 == 0 & QC_Data$Bit0==1] <- "LST Produced,Other Quality"
QC_Data$QA_word1[QC_Data$Bit1 == 1 & QC_Data$Bit0==0] <- "No Pixel,clouds"
QC_Data$QA_word1[QC_Data$Bit1 == 1 & QC_Data$Bit0==1] <- "No Pixel, Other QA"

## bits 3-2: data quality flag
QC_Data$QA_word2[QC_Data$Bit3 == 0 & QC_Data$Bit2==0] <- "Good Data"
QC_Data$QA_word2[QC_Data$Bit3 == 0 & QC_Data$Bit2==1] <- "Other Quality"
QC_Data$QA_word2[QC_Data$Bit3 == 1 & QC_Data$Bit2==0] <- "TBD"
QC_Data$QA_word2[QC_Data$Bit3 == 1 & QC_Data$Bit2==1] <- "TBD"

## bits 5-4: emissivity error flag
QC_Data$QA_word3[QC_Data$Bit5 == 0 & QC_Data$Bit4==0] <- "Emiss Error <= .01"
QC_Data$QA_word3[QC_Data$Bit5 == 0 & QC_Data$Bit4==1] <- "Emiss Err >.01 <=.02"
QC_Data$QA_word3[QC_Data$Bit5 == 1 & QC_Data$Bit4==0] <- "Emiss Err >.02 <=.04"
QC_Data$QA_word3[QC_Data$Bit5 == 1 & QC_Data$Bit4==1] <- "Emiss Err > .04"

## bits 7-6: LST error flag
QC_Data$QA_word4[QC_Data$Bit7 == 0 & QC_Data$Bit6==0] <- "LST Err <= 1"
QC_Data$QA_word4[QC_Data$Bit7 == 0 & QC_Data$Bit6==1] <- "LST Err > 2 LST Err <= 3"
QC_Data$QA_word4[QC_Data$Bit7 == 1 & QC_Data$Bit6==0] <- "LST Err > 1 LST Err <= 2"
QC_Data$QA_word4[QC_Data$Bit7 == 1 & QC_Data$Bit6==1] <- "LST Err > 3"

Which  looks like this


Next, I will remove those flags that don’t matter: all good pixels, all pixels that don’t get drawn, and those where the TBD bit is set high. What I want to do is select all those flags where the pixel is produced but the pixel quality is not the highest quality.

FINAL <- QC_Data[QC_Data$Bit1 == 0 & QC_Data$Bit0 ==1 & QC_Data$Bit3 !=1,]

Below see a part of this table


And then I can select those  that occur in my HDF files.


Looking at LST error, which matters most to me, the table indicates that I can use pixels that have a QC value of 0, 5, 17 and 21. I want to select only pixels where the LST error is no more than 1 K.
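The selection can be double checked against the list of values found in the HDF files. The sketch below keeps a value only if the pixel was produced (bits 1-0 are 00 or 01) and the LST error field (bits 7-6) is 00. The variable names produced and lst_field are mine, chosen for illustration.

```r
# QC values observed in the HDF files
values <- c(0, 2, 3, 5, 17, 21, 65, 69, 81, 85, 129, 133, 145, 149, 193)

produced  <- (values %% 4) <= 1       # bits 1-0 are 00 or 01: pixel was produced
lst_field <- (values %/% 64) %% 4     # bits 7-6: LST error flag, 0 means <= 1 K

values[produced & lst_field == 0]     # 0 5 17 21
```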

Very quickly I’ll show how one uses raster functions to apply the QC bits to LST. Using the MODIS R package and the function getHdf() I’ve downloaded all the tiles for the US for every day in July 2005. For July 1st, I’ve used MRT to mosaic and resample all the SDS. I use nearest neighbor resampling to ensure that I don’t average quality bits. The mosaic is then projected from a SIN projection to a geographic projection (WGS84) and written out in GeoTIFF format. Pixel size is set at 30 arc seconds.

The SDS are read in using R’s raster


library(raster)

##  get every layer of data in the July1 directory

SDS <- list.files(choose.dir(),full.names=TRUE)

##  for now I just work with the Day data

DayLst <- raster(SDS[7])
DayQC <- raster(SDS[11])

##  The fill value for LST is Zero. That means Zero represents a No Data pixel

##  So, we force those pixels to NA

DayLst[DayLst == 0] <- NA

##  Units in LST need to be scaled to be put into Kelvin. The scale factor is .02. See the docs for these values

##  every SDS has a different fill value and a different scaling value.
DayLst <- DayLst * .02
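To make the fill value and scaling steps concrete without needing the HDF files, here is a small sketch on a plain vector. The raw numbers are hypothetical, picked only to illustrate the arithmetic; the fill value of 0 and the factor of .02 are the ones given above.

```r
# Hypothetical raw LST values as stored in the file; 0 is the fill value
raw <- c(0, 14500, 15000, 15657)

raw[raw == 0] <- NA          # fill value becomes NA
kelvin  <- raw * 0.02        # scale to Kelvin
celsius <- kelvin - 273.15   # optional: convert to Celsius for a sanity check

kelvin                       # NA 290.00 300.00 313.14
```

Setting the fill value to NA before scaling matters, because otherwise a no data pixel would show up as a perfectly plausible looking 0 Kelvin.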

Now, we can plot the two rasters

plot(DayLst, main = "July1 2005 Day")

plot(DayQC, main="Day Quality Control")



And I note that the area around Texas has a nice variety of QC bits, so we can zoom in on that



Next we will use the QC layer to create a mask

m <- DayQC
m[m > 21]<-NA
m[ m == 2 | m== 3]<-NA

Good <- mask(DayLst, m)

For this mask I’ve set all those grid cells with QC values greater than 21 to NA. I’ve also set QC pixels with a value of 2 or 3 to NA. This step isn’t really necessary, since QC values of 2 and 3 mean that the pixel doesn’t get produced, but I want to draw a map of the mask which will show which pixels are NA and which are not. The mask() function works like so: “m” is laid over DayLst. Pixels that are NA in “m” will be NA in the destination (“Good”). If a pixel is not NA in “m” and has a value like 0 or 17 or 21, then the value of the source (DayLst) is written into the destination.
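The masking logic can be sketched on plain vectors, which makes it easy to see what happens pixel by pixel. The LST and QC numbers here are hypothetical. Note that this sketch selects the accepted QC values by name with %in% rather than with a threshold, which is an alternative to the m > 21 rule and would not let an unexpected value below 21 slip through.

```r
lst <- c(301.2, 299.8, 310.5, 295.0, 305.1)   # hypothetical LST in Kelvin
qc  <- c(0, 17, 65, 2, 21)                    # hypothetical QC value per pixel

keep <- qc %in% c(0, 5, 17, 21)               # QC values with LST error <= 1 K
good <- ifelse(keep, lst, NA)                 # source value where kept, NA elsewhere

good                                          # 301.2 299.8 NA NA 305.1
```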

Here are the values in my mask

> freq(m)
     value  count
[1,]     0 255179
[2,]     5     12
[3,]    17   8539
[4,]    21      3
[5,]    NA 189147

So the mask will “block” 189147 pixels in DayLst from being copied from source to destination, while allowing the source pixels that have QC bits of 0,5,17,21 to be written. These QC bits  are those that represent an LST error of less than 1K


When this mask is applied the colored pixels will be replaced by the values in the “source”

Good Lst

And to recall, these are all the pixels PRIOR to the application of the QC bits


As a side note, one thing I noticed when doing my SUHI analysis was that the LST data had a large variance over small areas. Even when I normalized for elevation and slope and land class there were pixels that were 10+ Kelvin warmer than the means. Essentially 6 sigma cases. That, of course, led me to have a look at the QC bits. Hopefully, this will improve some of the R^2 I was getting in my test regressions.

  1. January 27, 2013 at 7:08 AM | #1


    I don’t know where to find your email here, so I’m leaving this unrelated comment on the first available blog post.

    I’m writing you about a recent comment thread at WUWT, in which we had a disagreement about some of the arguments used by AGW in support of the GHG theory of warming. Willis Eschenbach made a comment, clarifying the IPCC’s approach on this. I’ll put WIllis’ comment in context here:

    Willis Eschenbach says:
    January 24, 2013 at 7:42 pm
    Steven Mosher says:
    January 24, 2013 at 6:59 pm

    [I had written, at the end of a previous comment] “And Steve Mosher, don’t be such a tool. It’s the AGW types who have claimed that it’s impossible to have warming such as we have now, without manmade GHGs being the driver. ”

    SM replied to me: “Wrong. no one says that is IMPOSSIBLE to have the warming we have now without GHGs. The argument is entirely different”

    [Willis continues] Thanks, Mosh. I suspect the argument he is referring to is the argument involving climate models. It says that because climate models do poorly when you remove the anthropological forcings, this means the anthropological forcings must be causing the temperature changes. See the IPCC AR4 version of the argument here.

    The clearest statement is from the IPCC TAR (emphasis mine):

    The United Nations International Panel on Climate Change (IPCC) produces a major scientific report involving up to 2500 scientists in the writing and reviewing process every 5th year. The IPCC Third Assessment Report (TAR) states: A climate model can be used to simulate the temperature changes that occur from both natural and anthropogenic causes. The simulations in a) were done with only natural forcings: solar variation and volcanic activity. In b) only anthropogenic forcings are included: greenhouse gases and sulfate aerosols. In c) both natural and anthropogenic forcings are included. The best match is obtained when both forcings are combined, as in c). Natural forcing alone cannot explain the global warming over the last 50 years.

    That sounds a whole lot more like what he said than to what you said … I don’t doubt that, as you say, there is another argument out there that “is entirely different”. But the argument he’s talking about has definitely been made, and by the IPCC no less.


    PS—If I were a tool, I think I’d be a Leatherman …

    Now, my question to you is, were you aware of this line of argument in the IPCC AR4 report, or similar arguments commonly made by AGW proponents? Or is this the first you’ve heard of it?

    Willis and I seem to have considerable disagreement about what your knowledge of climate science is. I’ve presumed that you do know about this line of argument, since you know so much about the climate wars, are so widely read, and even blog and frequently comment about it.

    So what’s the story? Did you know about this? And if so, why did you write that “no one says that is IMPOSSIBLE to have the warming we have now without GHGs”, when I think it’s clear that the IPCC does make that an explicit element in their rationale for concluding that our present warming MUST be due to man-made GHG emissions.

    Hope you weren’t overly insulted by my blunt language. But this is so basic an element of the pro-AGW arguments, I have a hard time imagining you weren’t aware of it.


    Conrad Goehausen
    (Broken Yogi)

    • steven mosher
      January 29, 2013 at 9:51 AM | #2

      “Now, my question to you is, were you aware of this line of argument in the IPCC AR4 report, or similar arguments commonly made by AGW proponents? Or is this the first you’ve heard of it?
      Willis and I seem to have considerable disagreement about what your knowledge of climate science is. I’ve presumed that you do know about this line of argument, since you know so much about the climate wars, are so widely read, and even blog and frequently comment about it.
      So what’s the story? Did you know about this? And if so, why did you write that “no one says that is IMPOSSIBLE to have the warming we have now without GHGs”, when I think it’s clear that the IPCC does make that an explicit element in their rationale for concluding that our present warming MUST be due to man-made GHG emissions.
      Hope you weren’t overly insulted by my blunt language. But this is so basic an element of the pro-AGW arguments, I have a hard time imagining you weren’t aware of it.

      1. Yes I am aware of the line of argument
      2. This is the exact argument I was referring to.
      3. There is a reason why I repeated your word ‘impossible’ and put it in bold.

      There is an important logical difference between arguing

      A) we did these simulations and it is IMPOSSIBLE to have the warming we have now absent GHGS

      B) we simulated with and without all currently known forcings and found that the warming could not explained by natural forcing only.

      The first makes a claim that no scientist ( no good scientist would make ) since science deals with the “likely” and the “unlikely”

      The whole point is your use of the word impossible. impossible suggests certainty and there is no certainty in science. Willis as usual is wrong in his assesment of what my point would be, and I’m not surprised he would miss the importance of my emphasis on the word IMPOSSIBLE. The actual argument doesn’t make this type of claim. I hope you see the point. No one says that its IMPOSSIBLE. The way to disprove this is not to find something where you think they imply its impossible, but to actually find something where they claim that it is, in fact, impossible and actually use the word. IMPOSSIBLE.

  2. February 3, 2013 at 2:21 AM | #3

    Thanks, Steve. I appreciate your clarification. Yes, I understand your emphasis on “impossible”. In the context of the debate, however, that’s not how it came off, or even that your qualification is accurate in relation to the work of the modelers. It is quite clear that the modelers really are saying that, within the context of their own modeling scenarios and the assumptions they have built into them, it’s impossible for them to construct a working model that 1) conforms to observational data, and 2) does not include sizeable assumptions about GHG warming and its feedbacks. From that, they conclude that it’s simply impossible to have warming as we have now, without GHG being the driver of much of it. I don’t think that’s a mischaracterization of their arguments. It’s why they call anyone who disagree with their conclusions “deniers”, rather than merely having an honest disagreement.

    And that’s why I accused you of being obtuse in your response.

    Sorry for any ill feelings I’ve aroused. Seems like Willis got a lot more bent out of shape about it than you did.

    No hard feelings.


    • steven mosher
      February 3, 2013 at 4:06 AM | #4

      Well conrad lets use willis’s test. quote their words. show me where they say EXACTLY that it is impossible. The problem you and Willis have is that you don’t know these guys and you’ve never talked to them in person and at length and you are imagining arguments that they never make. And I suspect that neither of you have done large scale physics modelling.
      Neither have you absorbed the meaning of the phrase “all models are wrong, but some are useful.” Modelers accept that all models are wrong. That is why, we would not say IMPOSSIBLE. of course you might imply that from the things people say, but only because that is self serving and not truth seeking on your part and Willis’ part. Put your belief to a test. Write to a modeler. Ask them directly if they believe it is impossible. WRT to feelings, of course WIllis got more bent out of shape. That is part of his act. Sheesh I thought everyone was on to his game of outrage

  3. February 3, 2013 at 2:41 AM | #5

    Steve, I’ve reposted your response to the WUWT thread in question, and added this comment:

    As to the issue of arguing in “bad faith”, I’d say that focusing on the semantics of my use of the word “impossible” distracts from the actual constraints put on modelers by their assumptions about the essential requirement for GHG warming to produce meaningful results, and the reliance on those models for the CAGW advocates to claim that GHG warming is the only possible scientific explanation for our recent warming trend. Since SM was very much familiar with these arguments, and the requirements of the models, I’d say it’s a disingenuous way of arguing, to assume that “impossible” refers to some existentialist eternality, rather than the practical matter of making the equations and the computer models add up.

    Reply if you care to.


    Here’s the relevant thread at WUWT:


    • steven mosher
      February 3, 2013 at 4:08 AM | #6

      conrad I would say you and willis are exercising bad faith. You are well aware of Willis’ demand to quote his words exactly when talking about his positions. yet here, you and he feel free to play fast and loose with terms. Shameful.

      • February 3, 2013 at 6:58 PM | #7

        I’m not sure why I need to quote Willis here, since I’m talking to you on your blog. As for “playing fast and loose with terms”, I’ve answered you on the “IMPOSSIBLE” question back at WUWT. Better to keep it over there, rather than on this unrelated blog post, I think. I’m not getting why you think I’m arguing in bad faith, when you understand very well what my terms mean.

  4. April Geffre
    February 3, 2013 at 8:46 AM | #8

    Hi Mr. Mosher,

    Sorry, this comment has nothing to do with your blog post, but I do not know how else to contact you. Are you the same Steven Mosher that spoke at the SEEK 2013 conference in Orlando? If you are, I went to one of your talks about abortion in China. I am a dental hygiene student, and I am writing a paper about China for one of my classes. We are allowed to use an interview as one of our sources, and I thought that you might be a great choice. Could you tell me about the overall culture of China? Also, if you would have any information or knowledge about healthcare in China, especially the dental aspect of it that would be great.

    Thank you!

    • steven mosher
      February 4, 2013 at 12:59 AM | #9

      Hi april. that is steven W mosher

  5. February 3, 2013 at 6:57 PM | #10


    I’m not sure why I need to quote Willis here, since I’m talking to you on your blog. As for
    “playing fast and loose with terms”, I’ve answered you on the “IMPOSSIBLE” question back at WUWT. Better to keep it over there, rather than on this unrelated blog post, I think. I’m not getting why you think I’m arguing in bad faith, when you understand very well what my terms mean.

    • steven mosher
      February 4, 2013 at 12:59 AM | #11

      Conrad, here is the point. This is not about quoting willis exactly. This is about willis’ rule.

      Like so.

      1. You claim they make an argument about impossibility.
      2. I say they dont.
      3. Willis says they kinda do.
      4. I invoke the willis rule ” show where they say that exactly” quote their words.


      • February 4, 2013 at 2:00 AM | #12


        I didn’t know Willis makes the rules here, but if you want to do things that way, fine. I already told you I posted a response to this at the WUWT thread. I’ll reprint it here for your convenience:


        Thanks for rejoining the discussion here, after that very weird interruption. Back to the science:

        So rather than telling us what you think modelers mean quote them exactly.

        I think the quote Willis produced from the AR4 is more than adequate to justify my remark:

        Natural forcing alone cannot explain the global warming over the last 50 years.

        Notice that this doesn’t say “It is difficult to explain the warming of the last 50 years by natural forcings,” or that “it’s likely the warming was caused by GHG forcings”. Instead, it used the word “cannot”. That is an uneqivocal statement. This is the grammatical equivalent of saying that it’s impossible. Though I grant you that scientists don’t like to use that word, the phrasing they choose has no ambiguity to it at all. I’m sure if you sat them down, they’d admit that it’s not literally impossible in the existential sense that they are wrong. But in the practical, working man’s scientific sense, yes, they are saying that. It’s why the climate debate has become so extreme. Many of these guys really are saying that the basic scientific issue is settled, and while there are some issues to work out, there’s simply no reasonable doubt in their minds that the warming of these past 50 years has been almost entirely due to GHG forcings.

        Is that unreasonable to suggest? The AR4 is hardly the product of some lone fanatic out there in the hinterlands. It’s the “scientific consensus” of the world’s largest international climate science body. I’m sure there are some even on the AGW side of the aisle who might be a little nervous about the lack of equivocation in that statement. But it’s there, in black and white, nevertheless, representing the scientific consensus. I’m glad you criticize it, but you can’t deny it’s existence as the voice of the international climate community. Sadly, it is, until something changes within that community. We can criticize the underlying justifications for statements like that, but we can’t pretend they aren’t out there dominating the scientific view on this matter, and of course strongly influencing all media and discussion of the subject.

      • February 4, 2013 at 2:12 AM | #13

        As a further note, here’s the online dictionary definition of “impossible”:

        im·pos·si·ble (ĭm-pŏs′ə-bəl)
        1. Incapable of having existence or of occurring.
        2. Not capable of being accomplished: an impossible goal.
        3. Unacceptable; intolerable: impossible behavior.
        4. Extremely difficult to deal with or tolerate: an impossible child; an impossible situation.

        I’ve highlighted the relevant aspect of the definition, “Incapable…of occurring.”

        When the AR4 report says, “Natural forcing alone cannot explain the global warming over the last 50 years,” I think the above definition of “impossible” covers the chances of natural forcing being the driver of recent warming, in the AR4′s view.

      • brokenyogi
        February 13, 2013 at 5:42 AM | #14

        Still no response, Steve? I did what you asked, but I get nothing back. That’s a bit rude, don’t you think?

      • Steven Mosher
        February 13, 2013 at 9:33 AM | #15

        “I think the quote Willis produced from the AR4 is more than adequate to justify my remark:”

        No it is not. because there is a difference between saying “its impossible” and noting that nothing in your models explains it.

        Try again. Quote them exactly.

      • Steven Mosher
        February 13, 2013 at 9:36 AM | #16

        Conrad? Rude?

        No rude, would be ignoring my argument which is what you are doing. All modelers live by the creed “all models are wrong, but some are useful” This is incompatible with your reading of their text. You were wrong to use the word impossible. It is easy to fix. Just do it.

  6. February 15, 2013 at 6:06 PM | #17

    Steven Mosher :
    “I think the quote Willis produced from the AR4 is more than adequate to justify my remark:”
    No it is not. because there is a difference between saying “its impossible” and noting that nothing in your models explains it.
    Try again. Quote them exactly.

    You’re playing semantics here.They are simply saying that it’s impossible in their understanding of the known science on climate, which they think is comprehensive enough to make blanket statements like this. You are taking my use of the word “impossible” to mean “utterly unimaginable”, which I think you know is not at all how I meant it.

    This is why I accuse you of arguing in bad faith. Which is what getting tied up in semantics amounts to.

    • Steven Mosher
      February 16, 2013 at 4:31 AM | #18

      It’s not playing semantics. I know its not playing semantics. They are not saying it is impossible. I made that clear by capitalizing the word IMPOSSIBLE.
      You may think that you get to define what they mean, unfortunately since I talk to them I can check your hypothesis about what they mean. That hypothesis has been falsified. They do not mean what you think they mean. unfool yourself

  7. February 15, 2013 at 6:08 PM | #19

    Steven Mosher :
    Conrad? Rude?
    No rude, would be ignoring my argument which is what you are doing. All modelers live by the creed “all models are wrong, but some are useful” This is incompatible with your reading of their text. You were wrong to use the word impossible. It is easy to fix. Just do it.

    Steve, you’re not making an argument. You haven’t challenged the AR4 quote at all. You are simply playing semantic games as to the meaning of the word “impossible”. WHen they say it the later 20th century warming cannot be explained except by GHG forcing, they are saying it is impossible to explain it otherwise. Impossible=cannot, as the dictionary demonstrates. What exactly is your argument, because you still haven’t offered one?

    • February 15, 2013 at 6:16 PM | #20

      Also, where on earth is this creed stated? You are just making things up. The AR4 makes no such equivocation in its language. Give me a quote from the AR4 backing up this “creed”.

      But if you can’t abide the word impossible, simply take it to mean what the dictionary takes it to mean, which is “cannot”. Good enough?

      This isn’t a minor point. The whole of climate science these days hinges on the notion that our recent warming cannot be explained by anything other than a change in GHG forcing. Debate that issue, rather than what you think the word “impossible” means.

    • February 15, 2013 at 6:23 PM | #21

      As the AR4 says:

      ““Natural forcing alone cannot explain the global warming over the last 50 years,””

      They don’t say “Our models, which are undoubtedly wrong, but which are useful for now, tell us that Natural forcing alone cannot explain the global warming over the last 50 years,”

      There’s a reason they phrase it as they did. They are trying to get across their certainty that it’s impossible for the present warming to be explained other than by GHG forcings. If they were uncertain about that matter, they would say so. Since they do not, I take them at their word. If they are lying, and deliberately overstating their certainty, that’s another matter.

      • Steven Mosher
        February 16, 2013 at 4:36 AM | #22

        ‘There’s a reason they phrase it as they did. They are trying to get across their certainty that it’s impossible for the present warming to be explained other than by GHG forcings.”

        Now you are a mind reader. This is very simple.
        Ask them; Is it possible that something else, something we dont understand could be causing the warming? Ask them
        is it within the realm of possibility that your models might be wrong? or is it impossible because
        A) your models are correct
        B) your models say its not possible.

    • Steven Mosher
      February 16, 2013 at 4:33 AM | #23

      “Steve, you’re not making an argument. You haven’t challenged the AR4 quote at all. ”

      Since the quote does not mean what you say it means I have challenged YOUR INTERPRETATION of what it means. This is not that hard. they dont use the word impossible. they dont mean impossible, and you are wrong. its pretty easy to prove otherwise. I’ll gladly put you in touch with any number of modelers and you can ask them for yourself.

  8. Brandon Shollenberger
    February 25, 2013 at 1:22 PM | #24

    Sorry for being off-topic, but I couldn’t find a contact page, and I don’t have your e-mail address.

    I’ve been trying to look at some things related to the BEST results, and I have a few questions/obstacles I wanted to ask about. The biggest one is tied to the fact I’ve never had or used Matlab. I use R, and I’ve tried opening the .mat files in the latest BEST release with the R.matlab package. Unfortunately, I get an error every time. Have you been able to open those files in R? If so, is there anything in particular I need to do?

    I have a couple other questions, but the most important step is just reading the data. I’ve done some work with an old release of data, but I don’t want to “publish” any results based on outdated data. Could you send me an e-mail so I could run a few things by you before putting them out for anyone to see?

  9. Frank
    April 6, 2013 at 10:06 AM | #25

    Sorry to be off-topic, but I was interested in your WUWT debate with Willis and WUWT isn’t a great place for serious comment. (I find your independent streak refreshing and appreciate your willingness to share your knowledge.)

    You made the following challenge:
    1. Define urban or rural ex ante in a way that is objectively measurable.
    2. I will divide stations into urban and rural per your definition.
    3. I will compute the difference.
    A cookie for anyone who can find the signal.

    That’s a safe bet that skirts the real issue. UHI’s clearly exist and have the potential to bias surface temperature trends. So does poor station siting, which is sometimes included in discussion of UHI. However, biases are only introduced into the trend when the amount of UHI or the quality of siting CHANGES appreciably over the period in question. Looking at stations TODAY tells us little about how much CHANGE occurred in the past or when it occurred. The DIFFERENCE in trend between “urban” and “rural” stations doesn’t tell us much about UHI biases unless you’ve got a group of urban stations whose UHI or siting has worsened and you’ve successfully excluded anthropogenic biases from the rural group. That’s hard to do with the limited existing information. The trend from a carefully selected group of rural stations is meaningful. IMO, BEST should simply report the trend from the most reliably rural stations they can find and stop suggesting that the difference between their rural and urban groups proves that UHI isn’t important.

    Excuse me for posting in your comment like the “voice of god”, but it’s easier than copying your long comment and responding. Let’s unpack some of your assumptions/arguments.
    1. Bias is only introduced when the quality changes over time. Agreed. I’ve often tried to make this argument at WUWT and other places.
    Note, however, that now we are restricting ourselves to siting changes that occur over time, like trees growing. Concrete around the site
    is either there or not there. An air conditioner is either there or not there; they don’t creep in over time. If these changes (site added
    concrete, site added AC, etc.) are real and significant, change point analysis will find them. That said, I agree.
    2. The difference in trend can tell you something. Let’s put aside the quality issue and just talk UHI for clarity.

    Below find the logical possibilities:
    A) rural site
    B) urban site
       1. went from rural to urban
       2. stayed roughly the same urban from T1 to Tn
       3. worsened urban
       4. improved urban (say depopulation, as in places like Detroit)

    The question on the table is this: the sum of all sites (A + B1-B4) is used as the global average.
    We may never be able to discriminate B1-B4 to everyone’s satisfaction. Question: does the use of urban sites introduce
    MEASURABLE bias in the sum of all sites (A and B1-B4)? Well, we know that B4 is a very small number of sites. The issue
    is B1 and B3, which may be hard to tease out. There are two ways to answer the question analytically: look at the difference
    between A and all the Bs (because it’s hard to figure out all the cases), or look at A alone and compare to A plus all the Bs.
    If life were Burger King and Steve had it his way, we would just report rural stations.

    3. UHI is important. But that is not the question. The scientific question is this: from 1750 to today we see a warming of ca. 1C.
    How much of this warming is due to sampling bias?
    A) spatial sampling bias
    B) temporal sampling bias
    C) measurement bias
       1. instrument
       2. location
       3. method
    UHI exists. It’s real. It is a problem in many cities and may get worse. All valid points. Different question: does it bias the record? (I think so.)
    Is that bias MEASURABLE? Hmm, I’m not so sure. Understanding why is fun (see some of your points about changes over time).

    “If someone really wants to understand the role of UHI in the surface temperature record, they should look for stations with changing UHI; not at inadequate, static measures of urbanity such as MODIS, current population, or night lights (or at current siting quality). Stations with changing UHI are going to have unusually large warming trends or overall temperature changes.”

    1. This presupposes what I have asked for: a definition of urban/rural, and it presupposes a metric for changing urban.
    2. I’ve looked at population changes and at changes in land class.
    3. We have MODIS over many years and impervious surface over many years; I could look at that.
    4. See above. These classes (B1 and B3) are already included in the analysis. They just are not broken out.
    If the effect were LARGE it should overwhelm the B2 and B4 classes. It doesn’t, which argues that
    A) the class (B1 and B3) is not large and/or
    B) the effect is not large.

    Start by dividing stations into two groups: a group to explore for signals characteristic of changing UHI (the exploration group) and a group to quantify the UHI bias in stations predicted to have changing UHI (the test group). First, you want to subtract from both groups all of the stations at high latitudes and elevations that might show amplified warming because of changes in albedo (seasonal snow cover) or Arctic amplification – their large warming trends are unlikely to be due to UHI.

    1. What basis is used to separate the classes?
    2. UHI is latitude dependent. By getting rid of high-latitude stations you’ll
    A) make your result non-generalizable
    B) eliminate stations with the highest UHI. Again, the idea is to see if UHI is in the actual record.
    3. Seasonal albedo changes are handled through deseasonalizing.

    One might exclude all stations poleward of 50 degrees latitude to be safe. Then study the 10% of the remaining stations which show the greatest warming (trend or change). Which of these rapidly warming stations are urban enough to possibly have UHI? Of course, some of the stations that show the greatest warming are going to have errors and artifacts in their data, but that won’t matter. You simply want to find some significant subset of rapidly warming stations with factors in common that seem to be related to UHI. What separates these rapidly warming urban stations from urban stations that show average or below average warming? Do urban stations with the highest warming rates tend to be surrounded by forests? Have they undergone the greatest population growth? Are they found near megacities? Does inland or coastal make a difference? Many days with little wind? Do they have many or few breakpoints or other artifacts in their records? Hopefully this rapidly warming subgroup will contain some (or all) of the stations where UHI is known or suspected to be important. (The BEST paper mentions the role of UHI in Tokyo.)

    Long ago I took this approach of finding sites with the greatest trend and then hunting for causes: data snooping.
    And it wasn’t very successful. Basically, I took trend and did a multiple regression on all geographic criteria.
    There were some data issues (like getting wind and cloud data). In the end you can find UHI: just pick the few
    cities where population is over 1 million. It’s a small number. When you throw this small number of stations in with
    39K other stations, the signal gets lost. The signal is real. UHI is real. The question is: are there ENOUGH of these sites
    to bias the TOTAL record? All tests say no. That does not mean UHI doesn’t exist. It does. Look, people over 7 feet
    exist. But there are not enough of them to move the average height.

    Once you have a “formula” based on factors related to UHI for selecting stations where changing UHI is most likely to be contributing to unusual warming, one can try that formula on the test group of stations. If your formula selects 5% of the stations in the test group and those 5% have an average of double the warming rate of the whole test group, then 5% of the warming in the whole group could easily be due to UHI. If those 5% have triple the warming, then UHI could account for 10% of the overall warming. If your formula identifies 25% of stations that have triple the average warming rate, you’ll have evidence that 50% of warming could be due to UHI – but this appears extremely improbable based on BEST and other studies.
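The percentages in the paragraph above follow from one line of arithmetic. Here is a minimal sketch in Python that reproduces them, assuming that all warming above the group mean at the flagged stations is attributed to UHI. The function name `share_of_warming_from_uhi` is made up for illustration and comes from neither BEST nor the comment.

```python
def share_of_warming_from_uhi(fraction_flagged, warming_ratio):
    """Share of the group-mean warming attributable to the flagged stations.

    fraction_flagged: fraction of stations selected by the formula (0..1)
    warming_ratio: mean warming of the flagged stations divided by the
                   mean warming of the whole group
    Assumption: everything above the group mean at flagged stations is UHI.
    """
    return fraction_flagged * (warming_ratio - 1.0)


# The three cases given in the comment.
for fraction, ratio in [(0.05, 2.0), (0.05, 3.0), (0.25, 3.0)]:
    share = share_of_warming_from_uhi(fraction, ratio)
    print(f"{fraction:.0%} of stations at {ratio:.0f}x the mean -> {share:.0%} of warming")
```

This is an upper bound in the spirit of "could be due to UHI": any part of the excess with another cause (latitude, snow cover, data artifacts) lowers the share.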

    Then one might try to relax the stringency of the formula used to select stations where UHI biases the trend. The most obvious 5% of stations might have twice the average warming trend. Maybe this will lead to identification of another 10% of stations with a warming rate 30% above average potentially attributable to UHI.

    BEST failed to find any difference between rural and urban warming rates – an incongruous, but not necessarily wrong, result. When BEST can’t find any difference, skeptics wonder if they have correctly separated stations biased by UHI (there clearly must be some!) from ordinary stations. A more convincing approach would be to say: Here is a small group of similar stations probably strongly biased by UHI. Using their characteristics, we identified a larger group probably weakly biased by UHI. However, the combined bias of these two groups of stations is only a small fraction of the warming seen in the surface record. Other rapidly warming stations don’t have characteristics anticipated for UHI bias: urbanity today, rural in the past with a surface that warms when the natural environment is replaced with urban structures. Urban stations without the characteristic signs of UHI bias have trends similar to rural stations.

    I’m not so convinced folks would buy an argument on that basis. For example, it was shown long ago that you can find UHI in cities over 1 million. Folks then pointed out that this number is really small, both the effect size and the number of stations. At that point people argued that
    “no UHI is caused by going from Zero people to 1 person”. That said, I do think I will take some time and show the lack of effect from DEPOPULATION.
    FWIW I think that will be publishable. Hmm, and when time permits perhaps I will revisit the approach I did, which kinda follows the one you laid out here.

  10. Frank
    April 11, 2013 at 2:32 AM | #26

    Steve: Thanks for the reply. I don’t mind “God” writing in bold font over my thoughts. I might learn something. I agree with much of what you wrote.

    I wanted to divide stations into two groups at random. If we mine the data from one group for a set of characteristics that point to stations with high trends because of UHI, we need to test those characteristic on a second group of stations to see if those characteristics actually point to a bias and how big that bias is. If I have 10 data points, I can randomly pick five points and get a perfect fit with a fourth degree polynomial. That’s meaningless! If I find that my fourth degree polynomial explains 80% of the variance in the remaining five data points, the fourth degree polynomial has some predictive value.
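The ten-point example can be sketched in a few lines of Python. Everything here is synthetic and assumed for illustration (the weak linear signal, the noise level, the seed). The only point is that a fourth-degree polynomial through five points fits them exactly, so only the held-out five say anything about predictive value.

```python
import numpy as np

rng = np.random.default_rng(42)

# Ten synthetic data points: a weak linear signal plus noise (assumed values).
x = np.linspace(-1.0, 1.0, 10)
y = 0.5 * x + rng.normal(0.0, 0.3, size=x.size)

# Random split into an exploration half and a test half.
idx = rng.permutation(x.size)
train, test = idx[:5], idx[5:]

# Five coefficients through five points: the fit interpolates them exactly.
coeffs = np.polyfit(x[train], y[train], deg=4)


def r_squared(x_vals, y_vals):
    """Fraction of variance in y_vals explained by the fitted polynomial."""
    pred = np.polyval(coeffs, x_vals)
    ss_res = np.sum((y_vals - pred) ** 2)
    ss_tot = np.sum((y_vals - y_vals.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


train_r2 = r_squared(x[train], y[train])
test_r2 = r_squared(x[test], y[test])
print(f"R^2 on the five fitted points:   {train_r2:.3f}")
print(f"R^2 on the five held-out points: {test_r2:.3f}")
```

The first number is 1 by construction and carries no information. The second can be anything, including negative, and is the one that matters.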

    Yes, at one point in the work I did with Zeke I took this approach for both warming and cooling stations.
    The issue was something like this: T = f(x,y,z,w,r,q) where T is temp and x, y, z etc. are variables which
    explain the temperature. Once you include latitude, altitude, and distance from water, a large portion of the variance
    is explained. In Berkeley Earth, for example, over 90% of the variance is explained by geography, which leaves
    precious little to be explained by UHI factors. Still, I can probably play around with it again. At best I
    could find a bias that was less than 0.1C per decade, just on the edge of detection.

    I’m under the impression (or delusion) that warming trends for all stations increase as one moves poleward. a) Models predict more warming near the poles. b) 20th century warming and black carbon have probably reduced the snow cover albedo at stations at higher latitudes and elevations. c) Modern weather forecasting programs apparently make some of their biggest mistakes with forecasts in the vicinity of the rain/snow line in the days following a storm. If my city gets more snow than predicted today, next week’s forecast is often much too warm – even though a new air mass has moved in. Those difficulties last until the snow melts, which will be longer than expected since it’s colder than expected. If one simply looks at stations with high trends, I assume that list will be biased towards stations at high latitudes and/or with decreasing snow cover. Those stations are less likely than average to be urban and warm-biased by UHI. If I want to find stations biased high by UHI among stations with high trends, I first want to get rid of stations that have high trends for other reasons. Multiple regression may be the proper way to develop a model capable of identifying stations biased high by UHI, but I’m thinking qualitatively.
    OK, but you have to distinguish between finding UHI, which you CAN, and finding a bias in the record. I can find UHI. Just look at the
    few dozen sites with population over 1 million.

    Cities greater than a million have slightly higher trends than average stations. What factors among cities greater than one million predict whether they will have an unusually high trend? Days of snow cover? Latitude? Can we eliminate these factors – that aren’t related to UHI – quantitatively (by regression) or qualitatively (looking only at cities at low latitudes or with little snow)?
    UHI is strongest in the summer months and less strong in the winter. I’d suggest you read Oke on the causes of UHI, or read through my posts
    where I cover the factors that cause UHI.

    I believe you’ve written that replacing forest with urban area has greater potential for UHI than any other type. Is the modestly higher trend associated with cities greater than one million concentrated in cities surrounded by forest while large cities in grassland tend towards average warming trends? (Or are wind or coastal location factors that prevent cities greater than one million from exhibiting high trends.) Perhaps we are getting somewhere. If we get somewhere with cities greater than one million, does the same pattern hold true for cities greater than one hundred thousand? Fifty thousand?

    See my posts on Imhoff. UHI is a function of the urban and the rural. A city surrounded by a cool forest will have higher UHI than
    the same city surrounded by desert. Why? Because UHI = urban - rural, and if rural is higher then UHI is lower. It’s subtraction.
    Take a city: call it 10C. Add 1C for UHI: it’s 11C. If the forest is at 8C, UHI looks like 3C. If it’s surrounded by croplands the UHI
    might be 1C. Not because the city is warmer than 11C, but because the croplands are 10C as opposed to the cool forest at 8C.
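The subtraction above, written out as a tiny Python sketch with the same numbers. The function name `apparent_uhi` is only illustrative.

```python
def apparent_uhi(urban_temp_c, rural_temp_c):
    """Apparent UHI as defined in the reply: urban minus rural temperature (deg C)."""
    return urban_temp_c - rural_temp_c


city = 10.0 + 1.0   # 10C base plus 1C of urban warming
forest = 8.0        # cool forest surroundings
cropland = 10.0     # warmer cropland surroundings

print(apparent_uhi(city, forest))    # 3.0
print(apparent_uhi(city, cropland))  # 1.0
```

The city is 11C in both cases; only the rural reference changes.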

    If you got to the point where you could identify stations with double the average trend because of factors related to UHI, you could go back to the BEST rural/urban separation and say: BEST did (or did not) successfully place all of these stations whose trends appear to be biased by UHI in the urban category. Unfortunately the urban category contains perhaps ten times as many stations that our analysis suggests will not show a strong UHI, even when they are large cities. We know how to find many stations where UHI is biasing the trends, but we can’t(?) use our knowledge to find enough stations with a large enough bias to demonstrate that the global record is biased.

    Alternatively, you might be able to predict that one group of modest towns could be biased high by UHI while a comparison group of similar cities or towns won’t. You may know what weather conditions or seasons will show the greatest UHI around any city. You keep the groups secret, but get Andy to recruit volunteers to test predictions through WUWT by measuring the rural/urban difference around the town during the right time of day.

    again its not about finding UHI. I can tell the exact kind of day to look for it. The issue is you dont get a lot of those types of days

  11. Frank
    April 13, 2013 at 1:58 AM | #27

    Steve: Again, thanks for the reply. I have read about Oke’s work, probably at your blog, but I haven’t retained as much as I’d like. (I hate when others write assertively and erroneously, so I try not to do so.)

    This discussion seems to be headed towards a multiple regression that explains the variance in the temperature trend (or overall temperature change during a fixed period). Predictive factors include the geographical factors you mentioned above plus at least two terms relevant to UHI: 1) a term involving population or population growth (log or linear, nightlights?), and 2) a term which depends upon how the surrounding terrain differs from urban. Is the albedo of the surrounding terrain an appropriate metric, or perhaps the difference in albedo between the surrounding terrain and typical urban terrain? (That doesn’t seem right. If it were, there could be some huge UHI effects during winter around towns with lots of surrounding snow cover.) Capacity for heat storage?

    I’ve tested population growth in the past and there is not much there of interest. People seem to forget that Oke’s original formulation
    was for UHI-MAX, and further, to get it to work he needed regional windspeed, and the exponents change with geography, which led him
    and others to realize that the type of building mattered, and focus shifted toward the size of the urban area, building height, etc.
    One metric for the difference might be emissivity, but emissivity changes with season as does albedo, and the problem with getting
    these values is that they depend on cloud-free conditions (e.g. they bias-select for days when UHI is higher).

    The impact of UHI on the global record could be assessed by calculating how much warming is reduced by setting the coefficients of the terms relevant to UHI in such a model to zero.

    If UHI is highly seasonal, it might be better to develop a model for the season when it is strongest and then generalize.

    I’m not sure how you generalize when a summer UHI may be .5C and the rest of the year zero

    You wrote: “again its not about finding UHI. I can tell the exact kind of day to look for it. The issue is you dont get a lot of those types of days.” Here, I respectfully disagree. The easiest way to not find something is to look in the wrong place. We can compare places with and without strong night lights, but an unambiguous relationship between night lights and UHI hasn’t been established.

    I’m not following you. We are not talking about the wrong place, we are talking about the temporal aspects. UHI is strongest on cloud-free,
    rain-free, calm days. For example, if you look at July in the US, you might find 15 days when all of the stations have cloud-free (UHI strong) conditions. If you are lucky those days would be wind-free as well. It’s pretty simple math. Assume a UHI signal of 1C. Now, what happens if there is
    only 1 day out of 365 when that happens? Well, I can find the one day, but in the yearly average that signal vanishes.
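The "pretty simple math" can be made explicit. This sketch assumes the signal is exactly zero on every other day of the year, which is a simplification. The helper name `annual_mean_contribution` is hypothetical.

```python
def annual_mean_contribution(signal_c, days_with_signal, days_in_year=365):
    """Contribution to the annual mean of a signal present on only some days.

    Assumes the signal is zero on all other days.
    """
    return signal_c * days_with_signal / days_in_year


one_day = annual_mean_contribution(1.0, 1)        # the 1-day-in-365 case
fifteen_days = annual_mean_contribution(1.0, 15)  # the 15 clear July days

print(f"{one_day:.4f} C")       # 0.0027 C
print(f"{fifteen_days:.4f} C")  # 0.0411 C
```

Even 15 ideal days at a full 1C contribute only about 0.04C to the annual mean.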

  12. Frank
    April 15, 2013 at 2:36 AM | #28

    Steve wrote: “I’m not sure how you generalize when a summer UHI may be .5C and the rest of the year zero.”

    Frank replies: Make a regression model explaining as much of the variance in summer temperature trends as possible in terms of geographic factors and UHI factors (population and surrounding terrain). See which UHI factors provide the best description of how much UHI contributes to variance (population vs log population vs population change vs night lights). The best model will provide coefficients (with confidence intervals) quantifying the contribution UHI makes to the trend. The confidence intervals hopefully will be narrower because UHI is most important during the summer. Now take the factors which define UHI best during the summer and use those factors for the whole year or for each season. During seasons when UHI doesn’t make a major contribution, the confidence intervals for the coefficients related to UHI may include zero.

    If you think cloud-free summer days with little wind have the most UHI, make a regression model that explains the variance of temperature trends on such days in terms of geographic factors and UHI-related factors. Then prove that these UHI factors explain much less variance when it’s cloudy, windy or not summer. Look in the “right place” first, then demonstrate that the factors that quantify the right place become negligible in general, assuming they do indeed become negligible. Don’t build a general model first and say that it proves UHI is irrelevant.
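A rough sketch of the kind of seasonal regression proposed above, in Python with NumPy. The data are entirely synthetic: the coefficients, noise level, and the choice of latitude and log population as the only predictors are assumptions made for illustration, not anything taken from BEST or station records. The function `fit_trend_model` is a made-up name.

```python
import numpy as np


def fit_trend_model(latitude, log_population, trend):
    """Ordinary least squares fit of trend on intercept, latitude, log population.

    Returns (coefficients, standard_errors), both ordered as
    [intercept, latitude, log_population].
    """
    X = np.column_stack([np.ones_like(latitude), latitude, log_population])
    coef, _, _, _ = np.linalg.lstsq(X, trend, rcond=None)
    residuals = trend - X @ coef
    dof = X.shape[0] - X.shape[1]
    sigma2 = residuals @ residuals / dof
    std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef, std_err


rng = np.random.default_rng(0)
n = 500
latitude = rng.uniform(0.0, 50.0, n)
log_population = rng.uniform(2.0, 7.0, n)  # synthetic log10 population

# Synthetic trends: the UHI term is present in summer and absent in winter.
TRUE_SUMMER_UHI = 0.03
summer_trend = (0.10 + 0.002 * latitude + TRUE_SUMMER_UHI * log_population
                + rng.normal(0.0, 0.02, n))
winter_trend = 0.10 + 0.002 * latitude + rng.normal(0.0, 0.02, n)

summer_coef, summer_se = fit_trend_model(latitude, log_population, summer_trend)
winter_coef, winter_se = fit_trend_model(latitude, log_population, winter_trend)

print(f"summer UHI coefficient: {summer_coef[2]:.4f} +/- {1.96 * summer_se[2]:.4f}")
print(f"winter UHI coefficient: {winter_coef[2]:.4f} +/- {1.96 * winter_se[2]:.4f}")
```

With real station data the hard parts are the ones raised in the replies: the choice of UHI predictors, and how little variance is left once geography is accounted for.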

  13. Frank
    April 15, 2013 at 12:03 PM | #29

    A discussion paper by McKitrick may have some relevance to this discussion. If one starts with a model that correctly identifies UHI bias in some circumstances, the chances of ending up with the type of misleading conclusions Ross describes might go down.


  14. p
    May 4, 2013 at 8:54 PM | #30

    Hello Steven,

    Can you tell me if there are any problems with CHCN (R)? :

    [1] 7678
    [1] “No empty files”
    Error in make.names(col.names, unique = TRUE) :

    • Steven Mosher
      May 9, 2013 at 1:01 AM | #31

      Which version are you working with

  15. p
    May 20, 2013 at 7:37 PM | #32

    First I thought that was 1.61, but now I see that it is 1.5 on CRAN. So now I think that was 1.5.

    • Steven Mosher
      May 20, 2013 at 8:51 PM | #33

      Ok, I need to get you the latest version. I’m behind on all my updates

  16. D J C
    February 25, 2014 at 2:00 AM | #34

    I make no bones about the fact that I am determined to stamp out the travesty of physics which is promulgated on warmist and luke warm climate blogs. This comment appears on several of them.

    Roy Spencer still cannot prove with any valid physics his crazy postulate that there would be isothermal conditions in Earth’s troposphere in the absence of water vapour and radiating gases. The greenhouse conjecture depends totally upon this garbage “fissics” that would violate the entropy conditions of the Second Law of Thermodynamics. All the models depend totally on this weird idea which is never observed anywhere on any planet or moon, not even on Uranus where the base of the nominal troposphere is hotter than Earth.

    Roy only needs to look at the data for the Uranus troposphere to realise that thermal gradients (aka “lapse rates”) evolve spontaneously at the molecular level. Radiating gases reduce the gradient (and thus cool the surface) due to inter-molecular radiation. They help energy escape faster up the troposphere and eventually to space. Radiation that strikes any warmer surface is just pseudo scattered.

    There is no need for advection (upward rising gases) or any direct solar radiation or a surface: the lapse rate just forms autonomously as gravity acts on molecules in free flight between collisions.

    That is why the (badly named) “lapse rate” on Earth, Venus, Uranus, the outer crust of Earth, the core of the Moon – everywhere – evolves spontaneously in solids, liquids and gases. That is why radiative forcing is not what is the primary determinant of any planet’s atmospheric or surface temperature – gravity is – gravity traps energy.

    Water vapour reduces the insulation effect – just consider the problem with moist air in double glazed windows. Moist regions are cooler than dry regions – I have proved that with real world temperature records.

    You’ll find the study in my book “Why it’s not carbon dioxide after all” available late April from Amazon etc. and from which I quote …

    “The world will one day look back upon a small slice of history that began in the 1980′s and sadly have to conclude that never in the name of science have so many people been so seriously misled by so few for so long. Never have so many careers, so much time and so much money been spent in the pursuit of such a misguided and ineffective goal to reduce human emissions of carbon dioxide, a harmless gas which comprises about one molecule in every two and a half thousand other molecules in the atmosphere of our planet, Earth.”


  17. Frank
    April 15, 2014 at 11:41 AM | #35

    Steve: You and others have convinced me that any bias introduced into the temperature record by development (including UHI) can’t be detected by looking at where stations are located today and how poorly sited they may be. However, a huge bias such as a change in time of observation can’t be detected by this method either. Without documentation, one can’t tell if a breakpoint was produced by a change in TOB, a station move, or maintenance that restores earlier station conditions. One should correct breakpoints arising from the first two problems (or split the record), but not the third.

    Has anyone ever tried to study how development might have distorted the trend at a particular location by studying the current “temperature field” around a town or city? Suppose we could place as many stations in and around an isolated city as we needed to unambiguously characterize its temperature field. Distant stations on “undisturbed” land would tell us what the “real temperature” would have been on land that has been developed. Stations on various types of currently developed land (farmed land, deforested land, low-density suburban, high-density urban, airports, parks in urban areas) would tell us how much the current development would bias the temperature trend if a station had been located there.

    Then we look at the history of development of that city and conclude that the temperature record would have suffered a 0.X degC upward bias if the station had been located in a region that became suburban from farm, or urban from suburban, or located at an airport that changed over the years from farm-like to urban. In some cases, we might know where the station was located at various times in the past (but not necessarily the factors like shadows or nearby asphalt that could distort its reading). If you have hundreds of stations, some of them are certain to
