Search Results

Keyword: ‘Nightlights’

rgdal + raster + RCurl = My next package

September 19, 2011 2 comments

UPDATE:  uploaded  Metadata 1.0 to CRAN this  am.

So quick demos when it hits

This package has been a long time in the making.  In the end it’s more of a data package than a functional package, but pulling all the pieces together required me to learn some really cool packages: raster ( which I already knew ) rgdal and RCurl.  I’ll provide a littlebit of an overview of what comes together on this package and why I built it, and then maybe some future directions.

In the course of looking at climate stations and the question of UHI I got kinda fascinated with the question of metadata: the variables that describe a stations location and physical features.  The goal of the project was to assemble a somewhat complete set of datasets to describe the geographical conditions of  any climate station.  I set out some requirements, first and foremost I wanted to use data that was open and not behind registration walls. That way I can write code to simply go get the data, download it and unpack it for the package user. I failed. In the end I ended up with a few datasets that require minimal registration headaches. I think with some help from RCurl experts I could tackle most of those issues. we will see.  Lets start by canvasing all the datasets I collected

1. GHRSST distance from coast dataset.  This 1km resolution file gives you the distance from the coast for all bodies of water.

2. NASA distance from Coast. This 1km dataset gives you distance from coast for  sea pixels only.

Dataset 1 and 2 along with some other files can be used to create a 1km land mask. Here I use them to tell if a station is on the coast or not.

3. Nightlights data.  radiance calibrated nightlights from  DSMP. a 1km dataset of radiance calibrated nightlights. This is the latest and greatest nightlights data with a much larger dynamic range than other products.

4. Impervious surface data. another 1km data product of impervious surfaces

5. Airports. 40K plus airport locations are rasterized into a 1km raster.  When I get time I’ll turn this into a distance map ( one call in raster) so that each cell contains distance to the nearest airport.

6. Harmonized Land Use data.  The package build a raster brick from 7 land use files  at 5 minutes resolution: urban land, cultivated land,irrigated land, rain irrigated land, forest, sparse vegetation, grassland.

7. Bluewater irrigation in 5 minute resolution: The amount of bluewater used for irrigation on the croplands in the grid.

8. Modis urban extent. 500 meter data. This file  requires registration and permission from the PI. Its freely given and I really wanted to use the new Moid data

7. Landcover. This is a previous Modis product that also requires registration, but its painless. Also cool because the data is in 72 tiles

8. Hyde population. Historical population density in 5 minute resolution. Also rural population counts and urban population counts. This requires regsitration

9. GPW population. Gridded population of the world in 2.5 minute resolution. Requires registration.

10. Grump urban extent. Urban extent at 1km resolution. requires registration.

With the exception of Modis I’m pretty sure if I were better with RCurl I could get those to download by filling in the registration forms programatically. Hyde is also easy except the server likes to download incomplete files and I should probably work on that a bit. For now, you have to download a few of the files manually.

The package has just a couple core functions that are tied to these files. The functions are primarily for doing complilations of datasets. The first function createRasters() just does the work of reading in files and creating native raster files. So for Hyde, ISA, landcover, airports, and Modis, I found it beneficial to reclass some of the datasets and save them as native rasters. That function takes a while to run but you only run it once. The next function if collateMetadata(). This function takes a dataframe of lat/lons and attaches metadata for every position. Obviously of you know raster you dont need any of this as you can just use the raster “extract” function to pull metadata from any of the assets.

test code just finished and it’s headed to CRAN in the morning.

Must sleep

Categories: Uncategorized

Ghcn V3 Metadata improvements

December 13, 2010 40 comments

The Global Historical Climate Network  (GHCN) is in it’s beta stage. On of the stated goals of the project is to improve the metadata that is provided for the station data.  Over the past few months several independent volunteers have been focusing on the issue of station metadata, each with their own focus. Ron Broberg deserves credit for taking the lead with applying GIS tools to the issue and Peter O’neill deserves credit for his station by station review of GISS inventories. A couple other folks are busy at work and I will leave it to them to discuss their efforts when the time is appropriate as the publication process precludes them from talking openly about it.

Here, my main focus has been on GHCN and more recently GHCN V3. The goal of the project is to provide a more accurate and more comprehensive inventory of station locations and station metadata.  A short recap of the importance of this. Imagine, if you would, an inventory of 2000 stations. We suspect that some are urban and some are rural. We want to estimate the difference between the urban and the rural with an eye toward assessing the impact of UHI on the record.  Further let’s suppose that the difference is large, suppose that urban warms are 1C per century ( from 1900 to 2010) while rural shows 0C warming. In that case the rule we use to separate urban from rural, while important, is not critical. For example, if we mis identify some urban sites as rural ( say 10%) then some fraction of urban warming will  “infect” the rural subset. The effect of urbanization will still be clear in our comparison. If our categorization is less accurate, say we get 50% of the urban wrong, then our ability to discriminate the signal will be reduced according. If we also mis label rural sites as urban, the effect will be compounded. We can see then that if the UHI effect is small, the need for a better discrimination function increases.  For example, if we think that the urban stations have warmed .8C over the course of 1900-2009 (+-.05C) while the rural have only warmed, say .6C (+-.05) misidentification will have more impact on our ability to find that UHI signal. One approach would be to take the 2000 stations and divide them into 3 groups. extreme rural, extreme urban, and “mixed”. This would, of course reduce the number of stations and the signal could then be lost in the noise that results from fewer stations. Still, that result would indicate that the UHI  effect is small. That happens to be my position.  Proving that, however, requires a diligent look at the metadata.

Metadata Improvements:

Data file is in the Box: named ExtV3Metadata.inv, a csv file is also included. here

The station information presented here is still in the beta stage. But it’s ready for a public release and some initial comments on what we can tell:  The process of improving the data and extending it is described below.


1. GHCN V3 Inventory.

2. Updated WMO station locations

3. Nightlights as used by Hansen 2010

4. Improved Nightlights as recommended by Nightlights principle Investigator

5. Nightlight Buffers

6. Gridded population density as provided by GPW

7. Gridded historical population density as provided by Hyde

8. Gridded Population Density as provided by GRUMP

9. Gridded Impervious Surface area.

10. Land masks provided by several sources.

Step One: The beta version of the GHCN v3 inventories are read into a R data frame. For the posted file that inventory was the matching inventory for the “adjusted” dataset. This inventory includes only 7279 stations as one appears to be dropped from the unadjusted data.  The data fields read in include

Id:  The Id field is the GHCN ID. It’s an 11 digit index of the form cccwwwwwddd. Where ccc indicates a country code, wwwww, indicates a WMO code, and ddd indicates a    IMOD number. There are several things to note. For the US stations, the WMO code does not appear to map to the WMO master list. For example, “42500046506” is listed as the GHCNID of Orland California. In GHCN v2 the ID for ORLAND is :  42572591004 ORLAND. And for USHCN it is:046506-02 For For WMO we have no entry for ORLAND. In V2 Orland was listed according the WMO number for nearby Red Bluff. The 004 in the Orland IMOD indicates that Orland is at a different location than the WMO it is reported under. Confusing? You bet. To be accurate the V3 readme will have to be changed to indicate that the new GHCN Id, does not reflect WMO numbers in the middle 5 digits in all cases. In the US, the USHCN ID is used as the last 5 digits. Basically there are USHCN stations that do not have WMO numbers. In V2 they were listed as IMODs of the closest WMO ( redbluff) in V3 they are listed according to their USHCN number. That makes comparing V2 to V3 a bit troublesome.

Lat: The latitude of the station is reported in degrees north from -90 to 90. In my inventory   the value  is one of two values: the value found in GHCN V3 or the value found in the recently updated WMO master list. The WMO has required countries to update the precision of the station location data and that process is underway. It’s not entirely complete. Consequently some of the GHCN V3 station locations remain the same. Those that have been updated by WMO are updated here

Lon: Longitude is degrees east, from -180 to 180. As with Latitude this field contains the corrections from the recent WMO updates.

Altitude: Altitude in meters from the Ghcn V3 inventory. Corrected altitudes from the WMO master list are not included here. That will come later.

Name: The station name from the GHCN V3 inventory. I am in the laborious process of cleaning up the name list to remove the following: country names, state designations, province designations, partial names, punctuation marks. The goal would be to have a list of names as well as alternative names. Countries, states, provinces can be added properly by geocoding and should not be in the station name field.

GridEl: Grid elevation. As taken from the GHCN inventory data. In meters this represents the average elevation of the grid at .5degrees. Once the position data is improved this could be supplanted with more accurate metadata from DEMs.

Rural: A designation R,S,U that indicates whether the station is Rural, Small Town or Urban. This characterization is made based on the population of the nearest town, where R is a town with less than 10K people and Urban is greater than 50,000. This is a dated measure of urbanity. It’s problematic because it does not tell us whether the town is densely populated or spread out.

Population: The population of the nearest town in 1000s.

Topography: type of topography in the environment surrounding the station, (Flat-FL,Hilly-HI,Mountain Top-MT,Mountainous Valley-MV).

Vegetation:type of vegetation in environment of station if station is Ruraland when it is indicated on the Operational Navigation Chart (Desert-DE,Forested-FO,Ice-IC,Marsh-MA).

Coastal: An indication if the site is a Coastal location (CO) or near a lake (LA) or more than 30km away from water.

DistanceToCoast: In the site is close to water this field indicates the distance in km.

Airport: a true false flag for whether the station is at an airport or not. This has not been corrected using WMO data, but there are discrepancies.

DistanceToTown: Distance in km for the airport

NDVI: Normalized Difference Vegetative index. This field indicates the type of vegetation in the area. Its the original V3 data and should be supplanted with improved data.

Light_Code: while the V3 read me does not include or explain this data, it was present in V2. Bascially it is an undocumented description of the sites urbanity

Step Two. In the second step the updated WMO master list is merged with GHCN V3. This is not straightforward. First the GHCN list must be reduced to those stations that are not IMODs. Where the GHCN ID is  cccwwwwwddd, the ddd field must be 000. Next the WMO file must be trimmed as well. It has multiple entries for stations. The multiple stations represent “air stations” that are collocated with the ground station. In the WMO index this is indicated by an indexSubNbr = 1. Next the GHCN V3 file is merged with The WMO file based on WMO   ID.  After this is completed distances can be calculated and the names can be checked for consistency. That process results in the following fields:

WmoName : The name used by the WMO is recorded. In certain cases the WMO name is spelled differently. In some cases it is entirely different.

WmoLon: The Longitude given by the updated WMO master list. These updates are in progress. Some mistakes remain as memeber nations are delivering partial results. The data is supposed to be accurate to degrees, mintutes and seconds.

WmoLat:the latitude given by the updated WMO master list. These updates are in progress. Some mistakes remain as memeber nations are delivering partial results. The data is supposed to be accurate to degrees, mintutes and seconds.

GhcnDistance: The distance between the old loaction given by GHCN V3 and the new location. As calculated by a Haversine distance calculation

NameMatch: A true false flag indicating if the name matched using a rather lax fuzzy name match criteria

GhcnLon: The Legacy longitude. This is the Longitude from the source GHCN V3 file.

GhcnLat: The legacy Latitude

Step Three: In step three the corrected inventory is passed to a metadata compilation function.  The lon lat is passed in and metadata associated with those positions is passed out, along with the LON and LAT passed in for consistency checking

Lon  : corrected Longitude same as field 1

Lat : corrected Latitude same as field 2

LandWater: The fraction of land in the 1/4 degree grid cell surrounding the station. This includes inland water.

LandOcean : The fraction of land in the 1/4 degree grid cell surrounding the station. This includes only ocean water.

CoastDistance: The distance the station is from the coast. If the station is over land this should equal 0.  If a station is in the water it returns the distance to the closest coast. This occurs when coastal stations or island stations are misplaced. The accuracy of the coast map is 30 arc seconds.

Lights: The value of nightlights using the same file that Hansen2010 uses. It should be noted that this file has been deprecated by the file creators. It represents nightlights at the station in the 1995-97 era.

LightsF16: The value of nightlights using the most recent analysis from 2006.  The raw data in the file has been processed to produce a DN number according to the file readme.

Bright3km: Every LightsF16 field surrounding the station has been processed to extract the brightest pixel within 3km.  Given that Nightlights positional accuracy is ~1-2km, a station with perfect location information may still be mis registered with the image because of positional errors in the nightlights data.

Bright5km: same as above with a 5km radius

Bright10km same as above with a 10km radius

Bright20km same as above with a 20km radius

Isa: Impervious surface percentage. The percentage of impervious surfaces estimated from 0-100% A negative number indicates the station is in the water. ISA is the result of a regression and is based on Nightlights data and Landscan population.

GpwDensity : population density ( humans per square km) from the GPW source

GDensity :population density ( humans per square km) from the GRUMP source

The following fields are derived from the HYDE historical population/land use project which is being used for Ar5. The figure is density of humans per sq km. Figures are given for every decade. The data has been  processed from 5 minute data.

Pop1850 ,Pop1860,  Pop1870, Pop1880, Pop1890,Pop1900,Pop1910,Pop1920
Pop1930,Pop1940,Pop1950,Pop1960,Pop1970 ,Pop1980,Pop1990,Pop2000

GrumpUrban: A flag indicating where the site is Urban (2) rural(1) or in water (0)

Categories: Uncategorized

Wetbulb Temperature

November 8, 2010 19 comments

This google map display is just one of 230 GHCN stations that is located in the water. After finding  instances of this phenomena over and over, it seemed an easy thing to find and analyze all such cases in GHCN. The issue matters for a two reasons:

  1. In my temperature analysis program I use a land/water mask to isolate land temperatures from sea temperatures and to weight the temperatures by the land area. An area that would be zero in the ocean, of course.
  2. Hansen2010 uses nightlights based on station location and in most cases the lights at a coastal location are brighter than those off shore. Although I have seen “blooming” even in radiance calibrated lights such that “water pixels” do on occasion have lights on them.

The process of finding “wet stations” is trivial in the “raster” package of R. All that is needed is high resolution land/sea mask. In my previous work, I used a ¼ degree base map. ¼ degree is roughly 25km at the equator.  I was able to find a 1km land mask used by satellites. That data is read in one line of code, and then it is simple matter to determine which stations are “wet”. Since NCDC is updating the GHCN V3 inventory I have alerted them to the problem and will, of course provide the code. I have yet to write NASA GISS. Since H2010 is already in the publishing process, I’m unsure of the correct path forward.

Looking through the 230 cases is not that difficult. It’s just time consuming.  We can identify several types of case: Atolls, Islands, and coastal locations. It’s also possible to put the correct locations in for some stations by referencing either WMO publications or other inventories which have better accuracy than either GHCN or GISS. We can also note that in some cases the “mislocation” may not matter to nightlights.  These are cases where you see no lights whatsover withing the  1/2 degree grid that I show. In the google maps presented below, I’ll show a sampling of all 230. The blue cross shows the GHCN station location and the contour lines show the contour of the nightlights raster. Pitch black locations have no contour.

I will also update this with a newer version of Nighlights. A google tour is available for folks who want it. The code is trivial and I can cover that if folks find it interesting. with the exception of the graphing it is as simple as this:

Ghcn<-readV2Inv() # read in the inventory

lonLat <- data.frame(Ghcn$Lon,Ghcn$Lat)

Nlight <- raster(hiResNightlights)

extent(Nlight)<-c(-180,180,-90,90) # fix the metadata error in nightlights

Ghcn<-cbind(Ghcn,Lights=extract(Nlight,lonLat)) # extract the lights using “points”

distCoast <-raster(coastDistanceFile,varname=”dst”) # get the special land mask

Ghcn <- cbind(Ghcn,CoastDistance=extract(distCoast,lonLat))

# for this mask, Water pixels are coded by their distance from land. All land pixels are 0

# make an inventory of just those land stations that appear in the water.

wetBulb <- Ghcn[which(Ghcn$CoastDistance>0),]


Some shots from the gallery. The 1km land/water mask is very accurate. You might notice one or two stations actually on land. Nightlights is less accurate, something H2010 does not recognize. Its pixels can be over 1km off true position. The small sample below should show the various cases. No attempt is made to ascertain if this causes an issue for identification of rural/urban categories. As it stands the inaccuracies in Nightlights and station locations suggests more work before that effort is taken up.

Categories: Uncategorized

Errors in Ghcn Inventories

October 31, 2010 17 comments

UPDATE!  the WMO is set to publish more accurate data on Nov 8, 2010. So the results here are SUBJECT TO CHANGE.  Stay tuned.

In the debate over the accuracy of the global temperature nothing is more evident than errors in the location data for stations in the GHCN inventory. That inventory is the primary source for all the temperature series. One question is “do these mistakes make a difference?” If one believes as I do that the record is largely correct, then it’s obvious that these mistakes cannot make a huge difference. If one believes, as some do, that the record is flawed, then it’s obvious that these mistakes could be part of the problem. Up until know that is where these two sides of the debate stand. Believers convinced that the small mistakes cannot make a difference; and dis-believers holding that these mistakes could in fact contribute to the bias in the record.  Before I get to the question of whether or not these mistakes make a difference, I need to establish the mistakes, show how some of them originate, correct them where I can and then do some simple evaluations of the impact of the mistakes. This is not a simple process. Throughout this process I think we can say two things that are unassailable: 1. the mistakes are real. 2. we simply don’t know if they make a difference. Some believe they cannot (but they haven’t demonstrated that) and some believe they will (but they haven’t demonstrated that). The demonstration of either position requires real work. Up to now no one has done this work.

This matters primarily because to settle the matter of UHI stations must be categorized  as urban or rural. That entails collecing some information about the character of the station, say it’s population or the characteristics of the land surface. So, location matters. Consider Nightlights which Hansen2010 uses to categorize stations into urban and rural. That determination is made by looking up the value of a pixel in an image. If it bright, the site is urban. If its dark (mis-located in the ocean) the site is rural.

In the GHCN metadata the station may be reported at location xyz.xyN yzx.yxE. In reality it can be many miles from this location. That means the nightlights lookup or ANY georeferenced data ( impervious surfaces, gridded population, land cover) may be wrong. One of my readers alerted me to a project to correct the data. That project can be found here. That resource led to other resources including a 2 year long project to correct the data for all weather stations. Its a huge repository. That led to the WMO documents one of the putative sources for GHCN. This source also has errors. Luckily the WMO has asked all member nations to report more accurate data back in 2009. That process has yet to be completed and when it is done we should have data that is reported down to the arc second. Until then we are stuck trying to reconcile various sources.

The first problem to solve is the loss of precision problem. The WMO has reports that are down to the arc minute. It’s clear that when GHCN uses this data and transforms it into decimal degrees that they round and truncate. These truncations, on occasion, will move a station.  I’ve documented that by examining the original WMO documents and the GHCN documents. In other cases it hard to see the exact error in GHCN, but they clearly dont track with WMO. First the WMO coordinates for WMO 60355 and then the GHCN coordinates:

WMO:   60355 SKIKDA 36 53N 06 54E  [36.8833333, 6.9000]

GHCN: 10160355000 SKIKDA 36.93 6.95

GHCN places the station in the ocean. WMO places it on land as seen above.

To start correcting these locations I started working through the various sources. In this post I will start the work by correcting the GHCN inventory using WMO information as the basis. Aware, of course that WMO may have it own issue. The task is complicated by the lack of any GHCN documents showing how they used WMO documents. In the first step I’ve done this. I compared the GHCN inventory with the WMO inventory and looked at those records where GHCN and WMO have the same  station number and station name. That is difficult in itself because of the way GHCN truncates names to fit a data field. It’s also complicated by the issue of re spelling, multiple names for each site and the issue of GHCN Imod flags and WMO station index sub numbers.

Here is what we find. If we start with the 7200 stations in the GHCN inventory and use the WMO identifier to look up the same stations in the WMO official inventory we get roughly 2500 matches. Here are the matching rules I used.

1. the WMO number must be the same

2. The GHCN name must match the WMO name (or alternate names match).

3. The GHCNID must not have any Imod variants. (no multiple stations per WMO)

4. The WMO station must not have any sub index variants. (107 WMO numbers have subindexes)

That’s a bit hard to explain but in short I try to match the stations that are unique in GHCN with those that are unique in the WMO records. Here is what a sample record looks like.WMO positions are translated from degrees and minutes to decimal degrees and the full precision is retained. You can check that against GHCN rounding. As we saw in previous posts slight movements in stations can move them from Bright to dark and from dark to bright pixels.

63401001000     JAN MAYEN 70.93 -8.67              1001    JAN MAYEN 70.93333 -8.666667

63401008000     SVALBARD LUFT 78.25 15.47    1008    SVALBARD AP 78.25000 15.466667

63401025000 TROMO/SKATTO      69.50 19.00    1025   TROMSO/LANGNES 69.68333 18.916667

63401028000 BJORNOYA                 74.52 19.02    1028    BJORNOYA 74.51667 19.016667

63401049000  ALTA LUFTHAVN 69.98 23.37    1049  ALTA LUFTHAVN 69.98333 23.366667

You also see some of the name matching difficulties where the two records have the same WMO and slightly different names. If we collate all differences on lat and lon in matching stations we get the following:

And when we check the worst record we find the following

WMO:  60581  HASSI-MESSAOUD             31.66667      6.15

GHCN:  10160581000 HASSI-MESSOUD 31.7               2.9

GHCN has the station at latitude ha!  longitude 2.9. According to GHCN the station is an airport:

The location in the WMO file

And the difference is roughly 300km.WMO is more correct than GHCN. GHCN is off by 300km

An old picture of the approach

And diagrams of the airfield

Now, why does this matter.  Giss uses GHCN inventories to get Nightlights. Nightlights uses the location information to determine if the pixel is dark (rural) or bright (urban)

NASA thinks this site is dark. They think it is pitch dark. Of course they are looking 300km away from the real site. From the inventory used in H2010.

10160581000 HASSI-MESSOUD   31.70    2.90  398  630R  HOT DESERT    A    0

Categories: Uncategorized

What’s that 5km from the station “location”

October 21, 2010 3 comments

In our last installment we looked at stations which were pitch black. The case I examined, Middlesboro Kentucky illustrated 1. The station location data used by Hansen2010 has inaccuracies. 2. While the purported station location was pitch dark, nearby within a couple 1/100ths of a degree there were urban lights. What this example illustrated was that you have no assurance that a station which has Zero lights is in fact in a rural location. the principle reason? station mislocation. The first screen I looked at was a 3km screen. That is I looked for pitch black stations with urban lights within 3km. There are only a handful

In this post I push the boundary out and examine stations that fit two criteria:

1. Nasanightlights =0

2 DMSP nightlights > 35 ( urban class 1) within 5km.

What we are essentially looking for are larger errors in the station location data.  The proceedure is exactly the same. We screen for these stations. We plot the google earth map and the light contour on top of each other ( to show the algorithm works) and then we investigate the details of the station by using a google earth tour that is loaded with all the station of this class.

here are some samples








There are several things that become clear looking at these examples.

1. Airport locations have the wrong coordinates in the ROW.

2. US location data has better quality, in fact, using the 4 digit accuracy that is available from NCDC we can see that some sites, when properly placed are STILL pitch dark.

3. Coastal locations are particularly sensitive to this type of error.

Those observations may afford us some remedies. In the USA the remedies are fairly easy. Since the location data is good ( if you use 4 digit accuracies we can see that the probability that location errors result in miscoding rural as urban is small.

for the ROW we have to be concerned with Coastal and airport stations. I will quickly survey the coastal issue and the airport in  subsequent posts.

Oh, and ISA has been added to metadata. more on that later


Categories: Uncategorized

Middlesboro Kentucky: Pitch Black?

October 19, 2010 2 comments

In his august draft of Hansen2010, Dr. Hansen makes the following claim:

“We present evidence here that the urban warming has little effect on our standard global

temperature analysis.  However, in the Appendix we carry out an even more rigorous test.

We show there that there are a sufficient number of stations located in “pitch black” regions,

i.e., regions with brightness below the satellite’s detectability limit (~1 µW/m2/sr/µm),

o allow global analysis with only the stations in pitch black regions defining long-term

trends.  The effect of this more stringent definition of rural areas on analyzed global

temperature change is immeasurably small (<0.01°C  per century).  The finding of a negligible

effect in this test (using only stations in pitch black areas) also addresses, to a substantial

degree, the question of whether movement of weather stations to airports has an important

effect on analyzed global temperature change.  The pitch black requirement eliminates not

only urban and peri-urban stations but also three-quarters of the stations in the more than

500 GHCN records that are identified as airports in the station name.  (The fact that

one-quarter of the airports are pitch black suggests that they are in extreme rural areas

and are shut down during the night.) Station location in the meteorological data records

is provided with a resolution of 0.01 degrees of latitude and longitude, corresponding to

a distance of about 1 km.  This resolution is useful for investigating urban effects on

regional atmospheric temperature.”

The are several claims here but I will only narrowly examine a few of them. I do not asses the claim about the role of UHI in the global record. That claim, in my mind cannot be assessed until the categorization of rural/urban is settled. So, my observations here have nothing to do with the effect of the issues that the case of Middlesboro will raise. In short, I still believe the world is warming and that man is the principle cause.  Instead on will focus on Dr. Hansen’s methodology. In particular, the assumption that the station locations are accurate to .01 degrees or 1 km ( at the equator) and his assumption that selecting “pitchblack stations” gives you a rural sample. Very simply, the station locations are not accurate to .01 degrees as we have seen repeatedly in this series.

To understand this problem in detail requires focusing on individual stations. That focus should neither convince people that the problem is widespread nor should it convince them that it is rare. What it should do is motivate those concerned to be more comprehensive and diligent in their work and their criticism. The conclusions I draw then are most narrow. First, station location data is too inaccurate to use with a simple look up into nightlights, second, a pitch black requirement does not eliminate the issue, and third nightlights is not a reliable indicator of the actual physical processes that cause UHI.

We will start with the GISS inventory data for this station: found here

42572326006 MIDDLESBORO 36.60 -83.73 358 469S 11HIxxno-9x-9COOL FOR./FIELD C2 0

decoding: the latitude is 36.60, longitude is -83.73. The “S” indicates it is a small town, 11 indicates a population of 11,000  and finally the last value  0, indicates that the station is pitch black by  nightlights. In H2010 this last value is apparently the one used to determine if a station is dark. Lets look  what our replicated inventory shows. it shows that Nightlights is 0, but it also indicates that there is a light with value 54 within 55km of the site. More importantly, the expanded inventory shows that within 3km of the station location there is a light with a value more than 35 DN. Simply, there are urban lights very close to the proported station location. Because I process all the pixels within a radius of every station I can locate these cases automatically. I merely sort for all the pitch dark stations and then sort for those with urban pixels within 3, 5, 10  20 km  all the way out to the 1degree cell boundary. Having identified this station as a possible issue the program then outputs the relevant google map with an overlay of nightlights contours. Like so: look at the pale blue cross. So, my algorithm works.

The program also outputs a kml file which then I can bring

up in Google earth and tour all the stations.

Not seeing anything that looks like a weather station at the location, perhaps at the airport?  Well, if  we check source data at NCDC we find the actual location(s)

And we can map all four which are all north of 36.60. In the bright zone

Checking close to the airport  36.61 -83.74 E


[1] 276750752

The Nightlights value  value at that location? not zero. its 33.

cellValues(hiResLights,cell=276750752)   33

To repeat. GISS have the station at  36.60,-83.73. The “lights at that location are Zero. But the actual station location is north of that in the bright zone .The lights at the airport are 33, which qualifies as Periurban, periurban type2. There are lights as high as 56 within the region. That qualifies as urban, urban type2 by Imhoff’s criteria.

The lights in the area near to the station suggest something btween periurban type2 or urban type 2. Urban type 1  is roughly 680  people per square km. The town in fact has 20 square km which translates into roughly 13K people. Checking back with the GISS inventory:

469S   11

11K people . You can check wikipedia. So Imhoffs nightlights did a good job of guessing the population, but if the station location is wrong you look up a dark pixel as opposed to the bright picels right next to them.

Hansen’s screen of pitch black stations is not adequate. A tighter screen, such as no dark pixels within the area of station location uncertainty would be better. We will work our way through that as we improve the tools here.

And in case you wondered about the temps?

Now there is one last thing I had to check. Hansen speaks of stations in pitch black “areas” Looking at his charts however it appears he picked stations at pitch black pixels. To check this the only think I can do is compare his view of USA stations  with my view. They match fairly well ( he shows fewer which may mean the stations drop for other reasons like short records), so I’ll assume  that he picked stations at pitch black “pixels”. As we have seen the value at the “pixel” of a station can be misleading because of very very minor location errors.

hansens graphic and then mine

For one final check, I produce a graphic of stations with periurban pixels within 3km ( marked by a cross) and those with periurban pixels within 5km of the site. Confirming the supposition that hansen has picked stations at pitch black pixels. he does not consider potential station location errors

Categories: Uncategorized

More Graphics with Google earth

October 19, 2010 5 comments

Dr. Paul, R graphics guru, blessed us with his rendition of transparent contour maps drawn on a google earth image: Cool stuff. I’ll be taking his code and turning it into a function and sharing it back: here is the picture his code creates:

That is just plain slick.  While I was looking over his code a couple more examples hit my inbox. One using ggplot which I havent got to ( see the comments on the previous post) and the other example came from the raster guru, Robert. First, lets take a look at his code:

r <- raster('world_avg.tif')
e <- extent(-122.6, -122.3, 37.75, 37.9)
rc <- crop(r, e)
g <- gmap(e, type='satellite')
mrc <- projectRaster(rc, g)
plot(g, interpolate=TRUE)
contour(mrc, add=TRUE, lwd=2, levels=1:7 * 20, col=rev(heat.colors(7)))
pt <- cbind(-122.45, 37.8)
mpt <- Mercator(pt)
points(mpt, col='light blue', pch='+', cex=2)

First we load the package dismo. Now ordinarily you would not go looking for resources to do graphics in a package that dealt with species distribution. In addition to working on raster Robert also works on dismo. It has a couple of tools we need. The first is g <- gmap(e, type=’satellite’).  The call to gmap replaces my calls to GetMap.bbox() from RgoogleMaps, in fact it is based on that code. I happen to like the interface of gmap better. Also, there is no annoying *rda file written for every map. And the pesky GDAL warnings are gone ( you could just suppress the warnings dummy) Anyway, gmap issues a call to the static map server and the pixels are returned to the object “g”. next comes a projection: mrc <- projectRaster(rc, g). This function projects the “rc” raster to the mrc raster using the coordinate system of “g” ( err, I think) basically, the g raster has a mercator projection, and rc is unprojected. so we need to get a mercator projection of rc. Then, we just plot g. On top of that we add a contour of mrc ( which is just rc projected in a mercator projection). Then we set the levels and select heat.colors. I have to study this more to understand how to select the number of levels. We’ll see how this selection plays out. ( I prolly need to vary it based on the range of the data)  Ideally, I’d like the levels to correspond to the Imhoff urbanity levels. Then we draw the cross hair. We pick a point ( where the station is at) then then we take that point and transform that point using a mercator projection. Then we draw the point and we are done.  Below find the three charts we did as samples in the previous post. The key to the first two was seeing that the nightlights data was high frequency so that position accuracy of the station matters. The third chart shows how minor errors in positioning of the station can lead to mis identification. The station location given by the inventory is not coincident with the location of the place name. If we look at the place name, where the station is likely to be positioned, we see different nightlights. So positional innacuries, even small ones, can change how we categorize sites Rural/periurban/urban. hence, it is important for  folks to either improve the accuracy of the data or employ exclusionary rules to preclude errors that may arise from mislocations.

Categories: Uncategorized

Kuwait Airport

October 14, 2010 2 comments


Kuwait International airport. Giss has it as nightlights =0, so do I. By looking at comparisons of nightlights with the station centered and a static google map with the station centered, there are mismatches between GISS and Me and between Nightlights and the  world. Subtle shift here and there. Annoying. Also, you can see how subtle shifts could turn a station from rural to urban.























Categories: Uncategorized