Wednesday, September 26, 2012

On the importance of metadata ...

If you are reading this then you will undoubtedly know that much contention arises over the siting of stations, particularly over the United States, where siting issues have been documented for the USHCN subset of the COOP network through the citizen science www.surfacestations.org effort. This effort, together with available modern station inventories, shows that most of the network now consists of MMTS sensors (the things that look like the heads of Daleks in Doctor Who, or stacked UFOs) rather than the Cotton Region Shelters (Stevenson Screens), the white-painted, ventilated wooden boxes that housed liquid-in-glass thermometers for the majority of the record. As an aside, very early in the US record, prior to the early twentieth century, a large number of approaches were used, as was the case elsewhere.

Given that:
  • We know that a change from Cotton Region Shelter to MMTS must have occurred at every USHCN station that is currently MMTS instrumented.
  • The MMTS has a short electrical lead, which will more often than not have required relocating the instrument closer to a power source (a building), with lower quality siting characteristics.
  • The MMTS has different measurement characteristics (a tendency to under-estimate daily maxima and over-estimate daily minima compared to Cotton Region Shelter instrumentation), verified by several side-by-side comparisons, some over several decades.
It is of interest to ask for what period of time the modern station configurations may have been 'representative', i.e. when such sites likely changed location and / or instrument. Both the very likely change in physical measurement location and the certain change in instrument characteristics are important considerations for the continuity of the station records, and they clearly cannot be divorced from one another on a site-by-site basis.

Fortunately, for the US we have good, although not complete, metadata. Based upon this it's possible to break down when important events such as instrument changes, changes in time of observation and other changes occurred, both for the USHCN subset and for the larger COOP network (although the metadata for the remainder of the COOP is somewhat poorer prior to the mid-twentieth century). This is shown below (courtesy Claude Williams, NCDC):

Timeseries of frequency of metadata event types for a subset of metadata classes across both USHCN and the broader COOP network. For completeness ASOS is described here 
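
For anyone wanting to build a similar tally from their own station histories, here is a minimal sketch of the counting step. It is not the code behind the figure above: the record layout, the station IDs and the event-type labels are all made up for illustration.

```python
from collections import Counter

# Hypothetical, made-up metadata events: (station_id, year, event_type).
# These are NOT real USHCN or COOP records.
events = [
    ("STN001", 1984, "MMTS"),
    ("STN002", 1984, "MMTS"),
    ("STN002", 1984, "TOB"),         # change in time of observation
    ("STN003", 1987, "MMTS"),
    ("STN001", 1995, "RELOCATION"),
]


def event_frequency(events):
    """Count documented events per (year, event_type) pair."""
    return Counter((year, kind) for _station, year, kind in events)


freq = event_frequency(events)
for (year, kind), n in sorted(freq.items()):
    print(year, kind, n)
```

Plotting the counts for each event type against year would give one line per metadata class, which is the general shape of the figure.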

So, from the above figure it's obvious that the MMTS transition started in 1982, with the bulk of the transition in both the USHCN and COOP occurring between then and 1990, but with a substantial number of such replacements occurring through at least 2000 (some of these may have been replacements of previously installed MMTS sensors). So, how far back can one infer anything from the modern siting of currently MMTS stations? At the earliest 1982, when the replacement program began, and possibly no further back than the early 2000s for some stations. Beyond that, careful forensics on a site-by-site basis would be required to ascertain whether the MMTS transition necessitated a change in measurement location. If it did, then modern siting would have no bearing on the pre-MMTS segment of the record. Given that the metadata generally records just the most significant change, it may be rare for the metadata record itself to note whether a change in measurement location accompanied the (more substantial) change in instrumentation.
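
The reasoning above can be sketched for a single station as follows. This is only an illustration under stated assumptions: the event labels, the choice of which event types count as a break in configuration, and the example history are all hypothetical rather than taken from any real metadata format.

```python
from datetime import date

# Assumption: these (hypothetical) event types break continuity with the
# present-day station configuration.
CONFIG_CHANGES = {"INSTRUMENT", "RELOCATION"}


def representative_since(events):
    """Return the date of the most recent documented configuration change.

    `events` is a list of (date, event_type) pairs. Returns None when no
    configuration change is documented, which does not mean none occurred.
    """
    dates = [when for when, kind in events if kind in CONFIG_CHANGES]
    return max(dates) if dates else None


# Made-up history for one station, for illustration only.
history = [
    (date(1952, 6, 1), "TOB"),          # time of observation change
    (date(1986, 9, 15), "INSTRUMENT"),  # e.g. Cotton Region Shelter to MMTS
    (date(2003, 4, 2), "INSTRUMENT"),   # e.g. replacement of the MMTS unit
]

print(representative_since(history))  # 2003-04-02
```

The returned date is only the earliest point from which modern siting could be informative. Undocumented relocations would make the true period shorter, which is why the site-by-site forensics are still needed.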

Bottom line: We need to know not just what the site looks like now but how it has changed over its history if we are to properly assess potential issues of representativity and homogeneity. Current siting tells us about today's measurements, not about yesterday's (or, for that matter, tomorrow's: MMTS has itself now started to be replaced by a newer - although similar - instrument) ... so we need contiguous metadata and not simply snapshots (although they are a valuable start, don't get me wrong) if we are to properly interpret records. Of course, outside the US it's rare to have access to more than lat, lon, elevation and name. That doesn't mean more doesn't exist, rather that it's not been shared, and certainly not in a common, machine-readable format, but that is another post for another time ...